Reputation: 563
There is a very similar post in the forum (POST), but I just can't figure out how to do it in my example.
My current code, with an explanation below:
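for i in dfuser.appid:
    print(i)
    # Rows of dfbinary whose appid equals i (this part works)
    d = dfbinary.loc[dfbinary['appid'] == i]
    print(d)
    # Problem line: .loc[i] selects the row with index label i, not appid i
    glist = dfbinary.columns[dfbinary.loc[i] == 1]
    print(glist)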
I have a dataframe listing users with their apps (dfuser), and another dataframe with the genres of all the apps (an app may have more than one genre). I want to see which genre is most popular for each user.
My code works except that glist does not look up the appid I want; it looks up the row whose index is i. For example, if i = 10, it picks the app at row 11 (index 10).
This is what it prints:
10
   appid  Accounting  Action  Adventure  Animation&Modeling  AudioProduction...
0   10.0         0.0     1.0        0.0                 0.0              0.0
[1 rows x 23 columns]
Index([u'Action'], dtype='object')
(And this just happens to be correct)
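To make the difference between the two lookups concrete, here is a minimal sketch on made-up data. The frame contents and the name genre_cols are assumptions for illustration only, and the syntax is Python 3. It selects the row by matching on the appid column with a boolean mask instead of by index label:

```python
import pandas as pd

# Hypothetical stand-in for dfbinary: one row per app, one 0/1 column per genre.
dfbinary = pd.DataFrame({
    "appid": [10, 20, 30],
    "Action": [1, 0, 1],
    "Adventure": [0, 1, 1],
})
genre_cols = dfbinary.columns.drop("appid")

i = 20  # an appid value, not a row label

# dfbinary.loc[i] would look for the index label 20, which does not exist here.
# A boolean mask selects the row(s) whose appid column equals i instead.
row = dfbinary.loc[dfbinary["appid"] == i, genre_cols]

# Keep only the genre columns flagged 1 for that app.
flagged = (row == 1).any(axis=0)
glist = flagged[flagged].index.tolist()
print(glist)
```

With the default integer index, the label and the appid only line up by coincidence, which would explain why a result can happen to look correct.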
Upvotes: 1
Views: 392
Reputation: 2831
Firstly, whenever you have an explicit loop with pandas, you are probably doing it wrong!
Use merge to combine the two dataframes on appid and keep only the user and genre columns; it works just like a SQL join. That gives you a table keyed on user/genre, and you can then call groupby("user").count(). No explicit loops.
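A rough sketch of that approach, on made-up data. The column name "user" and all the values are assumptions, since the real frames are not shown. Because the genres in the question are stored as 0/1 columns, this sketch sums the flags per user rather than calling count(), which would count rows regardless of the flag value:

```python
import pandas as pd

# Hypothetical data; the "user" column name and all values are assumptions.
dfuser = pd.DataFrame({
    "user": ["a", "a", "b", "b"],
    "appid": [10, 30, 20, 40],
})
dfbinary = pd.DataFrame({
    "appid": [10, 20, 30, 40],
    "Action": [1, 0, 1, 0],
    "Adventure": [0, 1, 1, 0],
    "Strategy": [0, 1, 0, 1],
})
genre_cols = list(dfbinary.columns.drop("appid"))

# SQL-style join: attach each app's genre flags to the user rows.
merged = dfuser.merge(dfbinary, on="appid", how="inner")

# Sum the 0/1 flags per user to get a user-by-genre count table.
genre_counts = merged.groupby("user")[genre_cols].sum()

# For each user, the genre with the highest count (first one wins on ties).
top_genre = genre_counts.idxmax(axis=1)
print(genre_counts)
print(top_genre)
```

If the genres were stored in long form (one row per app/genre pair), groupby(["user", "genre"]).size() would give the same counts after the merge.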
Upvotes: 1