Reputation: 770
I have a dataset like:
   profile     category  target
0        1      [5, 10]       1
1        2          [1]       0
2        3   [23, 5000]       1
3        4  [700, 4500]       0
How should I handle the category feature? The table may have other additional features too. One-hot encoding would consume too much space because the number of rows is around 10 million. Any suggestion would be helpful.
Upvotes: 0
Views: 567
Reputation: 770
MultiLabelBinarizer is a solution for this kind of problem: it can give sparse output that is low in memory. You can convert the other features to a sparse matrix and then combine all the features to feed into the machine learning model.
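A minimal sketch of that idea, assuming the category column holds Python lists; scipy.sparse.hstack is used here to combine the sparse category indicators with the remaining features (just profile in this toy example):

import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.preprocessing import MultiLabelBinarizer

df = pd.DataFrame({
    "profile": [1, 2, 3, 4],
    "category": [[5, 10], [1], [23, 5000], [700, 4500]],
    "target": [1, 0, 1, 0],
})

# Sparse indicator matrix: one column per distinct category id
mlb = MultiLabelBinarizer(sparse_output=True)
X_cat = mlb.fit_transform(df["category"])          # scipy CSR matrix

# Convert the other features to sparse and stack them alongside
X_other = csr_matrix(df[["profile"]].values)
X = hstack([X_other, X_cat]).tocsr()

y = df["target"].values
print(X.shape, X.nnz)  # the feature matrix stays sparse / low in memory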
Upvotes: 0
Reputation: 6260
My idea would be to split this array into new columns:
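A minimal sketch of this step, assuming the category column holds Python lists (pandas pads the shorter lists with NaN):

import pandas as pd

df = pd.DataFrame({
    "profile": [1, 2, 3, 4],
    "category": [[5, 10], [1], [23, 5000], [700, 4500]],
    "target": [1, 0, 1, 0],
})

# Turn each list entry into its own column (named 0, 1, ...)
cat_cols = pd.DataFrame(df["category"].tolist(), index=df.index)
df = pd.concat([df[["profile"]], cat_cols, df[["target"]]], axis=1)
print(df)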
This would lead to the following dataframe:
   profile    0     1  target
0        1    5    10       1
1        2    1   NaN       0
2        3   23  5000       1
3        4  700  4500       0
In the next step you can adjust it so that the categories become features (filled with 1 if the profile has this category; see the sketch after the table below). This will lead to the following dataframe:
   profile  1  5  10  23  ...  target
0        1  0  1   1   0  ...       1
1        2  1  0   0   0  ...       0
2        3  0  0   0   1  ...       1
3        4  0  0   0   0  ...       0
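A minimal sketch of that adjustment, going straight from the original list-valued category column to one indicator column per category id (explode plus pivot_table is one way to do it; the column names are the category ids):

import pandas as pd

df = pd.DataFrame({
    "profile": [1, 2, 3, 4],
    "category": [[5, 10], [1], [23, 5000], [700, 4500]],
    "target": [1, 0, 1, 0],
})

# One row per (profile, category), then pivot to indicator columns
wide = (
    df.explode("category")
      .assign(value=1)
      .pivot_table(index="profile", columns="category", values="value", fill_value=0)
      .reset_index()
      .merge(df[["profile", "target"]], on="profile")
)
print(wide)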
You will have every category as a feature, which can help you (it is then similar to a text classification problem). Then you can apply techniques for dimensionality reduction like PCA.
With this approach you are respecting the category behavior and can reduce the dimensionality later on with some mathematical techniques.
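Continuing from the wide dataframe in the sketch above, the indicator columns can be kept sparse and then reduced; TruncatedSVD is used here as a sparse-friendly variant of that idea (plain PCA would need a dense matrix first):

from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD

# Keep the indicator block sparse; with ~10 million rows this saves a lot of memory
X = csr_matrix(wide.drop(columns=["profile", "target"]).to_numpy(dtype=float))

svd = TruncatedSVD(n_components=2, random_state=0)  # pick n_components as needed
X_reduced = svd.fit_transform(X)                    # dense (n_rows, n_components)
print(X_reduced.shape)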
Upvotes: 1