Replies: 1 comment
-
Very interesting - thank you for sharing, we will take a closer look! If you want to provide an implementation, we're happy to accept PRs!
-
Hi,
Adding a target-statistic-based method for categorical encoding could be useful and efficient.
We wrote this paper in 2021.
In fraud detection for card payments, we have a feature named MCC (merchant category code).
It is a high-cardinality categorical feature with about 1000 categories. The most efficient way we found to transform it is CatBoost encoding, but other target-statistic-based methods can also work well. Since our study, all the students and PhD students we work with in our university research chair have been using this type of encoding whenever the cardinality is high.
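To make the idea concrete, here is a minimal sketch of an ordered target statistic, the general scheme behind CatBoost-style encoding. It is not the implementation from our study or from any library: the function name `ordered_target_statistic`, the smoothing weight `a`, the toy MCC values, and the choice to compute the prior from the whole toy sample are all assumptions made for illustration. Each row is encoded using only the targets of earlier rows with the same category, so a row never sees its own label.

```python
from collections import defaultdict


def ordered_target_statistic(categories, targets, prior, a=1.0):
    """Encode each row from the targets of EARLIER rows sharing its category.

    encoding = (sum of earlier targets + a * prior) / (earlier count + a)
    `prior` and `a` are illustrative smoothing parameters (assumptions).
    """
    sums = defaultdict(float)   # running sum of targets per category
    counts = defaultdict(int)   # running row count per category
    encoded = []
    for cat, y in zip(categories, targets):
        # Use only statistics accumulated before this row.
        encoded.append((sums[cat] + a * prior) / (counts[cat] + a))
        # Only now add the current row's target to the running statistics.
        sums[cat] += y
        counts[cat] += 1
    return encoded


# Toy data: hypothetical MCC values and fraud labels, not real transactions.
mcc = ["5411", "5812", "5411", "5411", "5812"]
is_fraud = [0, 1, 1, 0, 0]

# Assumption: the global fraud rate of the toy sample serves as the prior.
prior = sum(is_fraud) / len(is_fraud)
encoded = ordered_target_statistic(mcc, is_fraud, prior)
print(encoded)
```

The first occurrence of every category falls back to the prior, and later occurrences move toward that category's observed fraud rate. In practice the rows would usually be shuffled, or ordered by time, before encoding, and categories seen at prediction time would be encoded from the full training statistics.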