Miles-can

Reputation: 59

How to encode item features with high number of categories for recommendation

For the recommendation problem I am working on, there are around 50000 unique brands and three levels of product categories: level_1_cat (50 categories), level_2_cat (100 categories) and level_3_cat (1000 categories). All of these item features are represented as integers. So far I have tried binary encoding, label encoding and target encoding for my LightFM model. With binary encoding and label encoding, the results were worse than not using any item features. With target encoding, the results were similar to not using any item features. I am wondering what else I can try.

Upvotes: -1

Views: 50

Answers (1)

Lakshayknows

Reputation: 1

Try regularization. An L1 penalty tends to drive the weights of uninformative columns to zero, which effectively removes them, while an L2 penalty shrinks all weights without eliminating any. Both can help the model generalize better when dealing with high-cardinality features, so make sure the regularization strength is tuned to avoid overfitting.
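To illustrate the difference between the two penalties, here is a minimal sketch in plain Python. It is not LightFM code: the function names `l1_shrink` and `l2_shrink`, the example weights, and the penalty strength `0.1` are all made up for illustration. It applies one shrinkage (proximal) step of each penalty to the same weight vector.

```python
def l1_shrink(weights, lam):
    """Soft-thresholding: the shrinkage step for an L1 penalty lam * |w|."""
    out = []
    for w in weights:
        if w > lam:
            out.append(w - lam)
        elif w < -lam:
            out.append(w + lam)
        else:
            # Weights smaller in magnitude than lam are set exactly to zero.
            out.append(0.0)
    return out


def l2_shrink(weights, lam):
    """Shrinkage step for an L2 penalty (lam / 2) * w**2: uniform scaling."""
    return [w / (1.0 + lam) for w in weights]


# Hypothetical weights for four feature columns, two of them near zero.
weights = [2.0, -0.05, 0.5, 0.01]

l1_result = l1_shrink(weights, 0.1)  # small weights become exactly 0.0
l2_result = l2_shrink(weights, 0.1)  # every weight shrinks, none becomes 0.0

print("L1:", l1_result)
print("L2:", l2_result)
```

With L1 the two near-zero weights are dropped entirely, while L2 only scales everything down. For LightFM specifically, the `item_alpha` and `user_alpha` constructor arguments are L2 penalties on the item and user features, so tuning `item_alpha` would be the place to start.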

Upvotes: 0
