Reputation: 979
I have data like this:
Users_id My_Fav Bro_Fav Friend_Fav
User0 BMW VW BMW
UserA VW Mercedes Honda
UserB Honda Honda VW
UserC Mercedes BMW Mercedes
UserD VW BMW BMW
I would like output for correlation between Columns and Brands and desired output would be like this:
My_Fav Bro_Fav Friend_Fav
BMW 1 2 2
VW 2 1 1
Honda 1 1 1
Mercedes 1 1 1
Upvotes: 1
Views: 42
Reputation: 863501
You can count columns values per columns and then sum
per index values, if necessary convert Users_id
column to index in first step:
#Users_id is column
df = df.set_index('Users_id').apply(pd.value_counts).sum(level=0)
#Users_id is index
#df = df.apply(pd.value_counts).sum(level=0)
print (df)
My_Fav Bro_Fav Friend_Fav
BMW 1 2 2
Honda 1 1 1
Mercedes 1 1 1
VW 2 1 1
Upvotes: 2
Reputation: 323366
IIUC melt
+ crosstab
s=df.melt('Users_id')
s=pd.crosstab(s.value,s.variable)
variable Bro_Fav Friend_Fav My_Fav
value
BMW 2 2 1
Honda 1 1 1
Mercedes 1 1 1
VW 1 1 2
Upvotes: 1