Browsing latest articles
Browse All 8 View Live

Beyond One-Hot: an exploration of categorical variables

In machine learning, data are king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out....




Even Further Beyond One-Hot: Feature Hashing

In the previous post about categorical encoding we explored different methods for converting categorical variables into numeric features.  In this post, we will explore another method: feature hashing....



Beyond One-Hot: Sklearn transformers and pip release

I've just released version 1.0.0 of category_encoders on PyPI; you can check out the source here: https://github.com/wdm0006/categorical_encoding. In two previous posts (here and here), we discussed and...


Beyond One-Hot: incremental improvements in categorical encoding

The beyond-one-hot project has started to grow up.  Last fall, I did a couple of posts comparing different methods of encoding categorical variables for machine learning problems.  You can check them...


Category Encoders now on conda forge

My scikit-learn compatible library of categorical data encoders (category_encoders) is now published on conda forge! Conda, if you didn't know, is an open-source package manager for Python (and other...



Category Encoders accepted into scikit-learn-contrib

In the past I've posted a few times about a library I'm working on called category encoders.  The idea of it is to provide a complete toolbox of scikit-learn compatible transformers for the encoding of...


BaseN Encoding and Grid Search in category_encoders

In the past I've posted about the various categorical encoding methods one can use for machine learning tasks, like one-hot, ordinal, or binary encoding. In my OSS package, category_encoders, I've...


Category Encoders v1.2.5 Release

This release was actually cut a couple of weeks ago, but I forgot to put a post here. It's been a release of mainly incremental changes, but also one of increased contributions from the community, so...

