Reputation: 149
I'm attempting to consolidate channel lists from providers to show all channels as the row indices and values as the provider name. Currently organized as:
DIRECTV FUBO YOUTUBE TV
0 A&E A&E ABC
1 ABC ADULT SWIM AMC
2 ACCUWEATHER AMC ANIMAL PLANET
3 AMC ANIMAL PLANET BBC
4 ANIMAL PLANET BBC BBC WORLD
I would like the result to be (header row can be anything):
A&E DIRECTV FUBO
ABC DIRECTV YOUTUBE TV
ACCUWEATHER DIRECTV
ADULT SWIM FUBO
AMC DIRECTV FUBO YOUTUBE TV
ANIMAL PLANET DIRECTV FUBO YOUTUBE TV
BBC FUBO YOUTUBE TV
BBC WORLD YOUTUBE TV
I believe this should be accomplished very simply through a join, merge, or other, but I can't get it working. Any help would be appreciated.
Note: the above images are obviously just a small sample of the list, the index column should be every unique channel from all providers. Something along the lines of "df.values.unique()". Yes, I know that doesn't work.
Upvotes: 2
Views: 68
Reputation: 51155
IIUC, this is a melt
+ crosstab
problem
u = df.melt(
var_name='provider',
value_name='channel'
)
pd.crosstab(
index=u['channel'],
columns=u['provider'],
values=u['provider'],
aggfunc='first'
)
provider DIRECTV FUBO YOUTUBE TV
channel
A&E DIRECTV FUBO NaN
ABC DIRECTV NaN YOUTUBE TV
ACCUWEATHER DIRECTV NaN NaN
ADULT SWIM NaN FUBO NaN
AMC DIRECTV FUBO YOUTUBE TV
ANIMAL PLANET DIRECTV FUBO YOUTUBE TV
BBC NaN FUBO YOUTUBE TV
BBC WORLD NaN NaN YOUTUBE TV
If you would rather a simple 1
or 0
if a value exists, just omit the aggfunc
and values
argument from crosstab:
pd.crosstab(u['channel'], u['provider'])
provider DIRECTV FUBO YOUTUBE TV
channel
A&E 1 1 0
ABC 1 0 1
ACCUWEATHER 1 0 0
ADULT SWIM 0 1 0
AMC 1 1 1
ANIMAL PLANET 1 1 1
BBC 0 1 1
BBC WORLD 0 0 1
Upvotes: 3