pandas stack find unique on combination of two columns

Question

I am trying to find unique values in Field1, Field2 along with the sum of corresponding values from Value1, Value2.

Please find below the sample input and output. I tried using stack but was unable to get the similar output.

Input

id  Field1  Value1  Field2  Value2
1     A       1       B       0
1     A       0     
2     A       1       D       1
3     C       1       A       0
4     E       0     
3     A       1       C       1
4     F       1

Output

id  Field   Value
1     A       1
1     B       0     
2     A       1
2     D       1
3     C       2
3     A       1
4     E       0
4     F       1

I am trying to remove rows if the column contains a specific value along with others.

For example in the above output, I will remove rows with A,B if they appear along with others and will print them if they appear standalone. Please see below the new output

Output2

id  Field   Value
1     A       1
1     B       0     
2     D       1
3     C       2
4     E       0
4     F       1

jezrael · Accepted Answer

Use wide_to_long with groupby and aggregate sum:

df = (pd.wide_to_long(df.reset_index(), stubnames=['Field', 'Value'], i='index', j='id')
       .reset_index(drop=True)
       .groupby(by=['id', 'Field'], as_index=False)['Value'].sum())

Alternative with lreshape, but still not in oficial docs:

df = (pd.lreshape(df, {'Value':['Value1','Value2'], 'Field':['Field1','Field2']})
        .groupby(['id', 'Field'], as_index=False)['Value'].sum())

print (df)
   id Field  Value
0   1     A    1.0
1   1     B    0.0
2   2     A    1.0
3   2     D    1.0
4   3     A    1.0
5   3     C    2.0
6   4     E    0.0
7   4     F    1.0

pandas stack find unique on combination of two columns

Input

Output

Output2

Answers (1)

Related Questions