Reputation: 353
I have a huge dataframe (4 million rows and 25 columns). I am trying to investigate 2 categorical columns. One of them has around 5000 levels (app_id) and the other has 50 levels (app_category).
I have seen that for for each level in app_id there is a unique value of app_category. How do I code to prove that?
I have tried something like this:
app_id_unique = list(train['app_id'].unique())
for unique in app_id_unique:
train.loc[train['app_id'] == unique].app_category.nunique()
This code takes forever.
Upvotes: 1
Views: 44