Beyond One-Hot: an exploration of categorical variables
In machine learning, data are king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out....
Even Further Beyond One-Hot: Feature Hashing
In the previous post about categorical encoding we explored different methods for converting categorical variables into numeric features. In this post, we will explore another method: feature hashing....
Beyond One-Hot: Sklearn transformers and pip release
I've just released version 1.0.0 of category_encoders on PyPI; you can check out the source here: https://github.com/wdm0006/categorical_encoding In two previous posts (here and here), we discussed and...
Beyond One-Hot: incremental improvements in categorical encoding
The beyond-one-hot project has started to grow up. Last fall, I did a couple of posts comparing different methods of encoding categorical variables for machine learning problems. You can check them...
Category Encoders now on conda forge
My scikit-learn compatible library of categorical data encoders (category_encoders) is now published on conda forge! Conda, if you didn't know, is an open-source package manager for Python (and other...
Category Encoders accepted into scikit-learn-contrib
In the past I've posted a few times about a library I'm working on called category encoders. The idea of it is to provide a complete toolbox of scikit-learn compatible transformers for the encoding of...
BaseN Encoding and Grid Search in category_encoders
In the past I've posted about the various categorical encoding methods one can use for machine learning tasks, like one-hot, ordinal, or binary encoding. In my OSS package, category_encoders, I've...
Category Encoders v1.2.5 Release
This release was actually cut a couple of weeks ago, but I forgot to put a post here. It's been a release of mainly incremental changes, but also one of increased contributions from the community, so...