Reputation: 59
For the recommendation problem I am working on, there are around 50,000 unique brands and a three-level product category hierarchy: level_1_cat (50 categories), level_2_cat (100 categories) and level_3_cat (1,000 categories). All of these item features are represented as integers only. So far I have tried binary encoding, label encoding and target encoding for my LightFM model. With binary encoding and label encoding, the results were worse than not using any item features. With target encoding, the results were similar to not using any item features. I am wondering what else I can try.
Upvotes: -1
Views: 50
Reputation: 1
Try regularization. An L1 penalty can drive the weights of unnecessary columns to zero, which effectively removes them, while an L2 penalty shrinks the weights without removing columns. Either can help the model generalize better when dealing with high-cardinality features. Make sure the regularization strength is tuned appropriately to avoid overfitting.
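Here is a rough sketch of how that could look with LightFM, not a tested recipe for your data. As far as I know, LightFM only exposes L2 penalties, through the `item_alpha` and `user_alpha` constructor arguments, so L1 is not available there directly. The sketch also assumes each integer feature value gets its own indicator column (plus one identity column per item) rather than a single label-encoded column. The helper names `indicator_columns` and `fit_regularized` are made up for illustration, and the values for `no_components`, `epochs` and `item_alpha` are arbitrary starting points.

```python
def indicator_columns(items, cardinalities):
    """Map integer item features to one indicator column per distinct value.

    items: list of tuples such as (brand, level_1_cat, level_2_cat, level_3_cat),
           where every value is already a 0-based integer id.
    cardinalities: number of distinct values per feature, in the same order.
    Returns (rows, cols, n_features) describing a sparse items x features matrix.
    """
    n_items = len(items)
    # Column layout: an identity block (one column per item) comes first,
    # followed by one block of columns per feature.
    offsets = []
    start = n_items
    for size in cardinalities:
        offsets.append(start)
        start += size

    rows, cols = [], []
    for item_id, values in enumerate(items):
        rows.append(item_id)
        cols.append(item_id)  # per-item identity feature
        for offset, value in zip(offsets, values):
            rows.append(item_id)
            cols.append(offset + value)
    return rows, cols, start


def fit_regularized(interactions, rows, cols, n_items, n_features, item_alpha=1e-5):
    """Fit LightFM with an L2 penalty on the item feature embeddings."""
    # Imported here so the helper above can be used without these packages.
    from scipy.sparse import coo_matrix
    from lightfm import LightFM

    data = [1.0] * len(rows)
    item_features = coo_matrix(
        (data, (rows, cols)), shape=(n_items, n_features)
    ).tocsr()
    # item_alpha is the L2 penalty on item features; the value is an assumption
    # and should be tuned on a validation set.
    model = LightFM(no_components=32, loss="warp", item_alpha=item_alpha)
    model.fit(interactions, item_features=item_features, epochs=20)
    return model


# Toy example: 3 items described by (brand, level_1_cat),
# with 4 possible brands and 2 possible categories.
rows, cols, n_features = indicator_columns([(0, 1), (3, 0), (0, 0)], [4, 2])
```

`interactions` is expected to be a sparse users x items matrix. Trying a few values of `item_alpha` across several orders of magnitude and comparing against the no-features baseline on held-out data would show whether the penalty helps.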
Upvotes: 0