Compare the values in two columns and extract the values of a third column in a dataframe

Question

df =

Location	teams	goals
1	A	5
1	B	6
2	A	7
2	B	5
2	C	6
3	B	7

Expected output df =

Location	A Team	A Team goals	B Team	B Team Goals	C Team	C Team goals
1	1	5	1	6	0	NA
2	1	7	1	5	1	6
3	0	NA	1	7	0	NA

Panda Kim · Accepted Answer

Example Code

data = {'Location': {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3},
        'teams': {0: 'A', 1: 'B', 2: 'A', 3: 'B', 4: 'C', 5: 'B'},
        'goals': {0: 5, 1: 6, 2: 7, 3: 5, 4: 6, 5: 7}}
df = pd.DataFrame(data)

First

aggregation with groupby

(df.groupby(['Location', 'teams'])['goals'].agg(['count', sum])
 .unstack().swaplevel(0, 1, axis=1).sort_index(axis=1))

output:

teams   A               B               C
        count   sum     count   sum     count   sum
Location                        
1       1.0     5.0     1.0     6.0     NaN     NaN
2       1.0     7.0     1.0     5.0     1.0     6.0
3       NaN     NaN     1.0     7.0     NaN     NaN

Second

Let's create idx to change columns

idx = pd.MultiIndex.from_product([df['teams'].unique(), ['Team', 'Team Goal']]).map(lambda x: ' '.join(x))

idx

Index(['A Team', 'A Team Goal', 'B Team', 'B Team Goal', 'C Team', 'C Team Goal'], dtype='object')

Last

change columns and reset_index (include First code)

(df.groupby(['Location', 'teams'])['goals'].agg(['count', sum])
 .unstack().swaplevel(0, 1, axis=1).sort_index(axis=1)
 .set_axis(idx, axis=1).reset_index())

output

    Location    A Team  A Team Goal B Team  B Team Goal C Team  C Team Goal
0   1           1.0     5.0         1.0     6.0         NaN     NaN
1   2           1.0     7.0         1.0     5.0         1.0     6.0
2   3           NaN     NaN         1.0     7.0         NaN     NaN

Compare the values in two columns and extract the values of a third column in a dataframe

Answers (2)

Related Questions