Reputation: 223
I have this dataframe :
label | time_0 | value_0 | time_1 | value_1 | time_2 | value_2
value_0 column corresponds to time_0, value_1 to time_1...
time_i is unique, which means for all rows time_0 has the same value, same for time_1, time_2..
I want to achieve 2 things :
1- first group_by label, then time_i
The result should look like :
Label0 time_0(0) value_0(0)
value_0(1)
...
time_1(0) value_1(0)
value_1(1)
..
...
Label1 time_0(0) value_0(0)
value_0(1)
...
time_1(0) value_1(0)
value_1(1)
..
...
Label2 time_0(0) value_0(0)
value_0(1)
...
time_1(0) value_1(0)
value_1(1)
..
...
2- first group_by label, then time_i and sum all values_i
The result should look like
Label1 time_0(0) sum(value_0)
time_1(1) sum(value_1)
time_2(2) sum(value_2)
Label2 time_0(0) sum(value_0)
time_1(1) sum(value_1)
time_2(2) sum(value_2)
...
I tried different combinations with pd.merge, group_by, but without a success
here is an example with values
Upvotes: 1
Views: 386
Reputation: 862751
Use wide_to_long
with aggregate sum
:
df = (pd.wide_to_long(df.reset_index(),
stubnames=['value','time'],
i=['index','Label'], j='tmp', sep='_')
.groupby(['Label','time'])['value']
.sum()
.reset_index())
print (df)
Label time value
0 EUR/CHF 2020-12-04 -51.260248
1 EUR/CHF 2020-12-10 98.202053
2 USD/CHF 2020-12-04 0.134488
3 USD/CHF 2020-12-10 4.510396
4 USD/NOK 2020-12-04 0.395785
5 USD/NOK 2020-12-10 -0.801768
EDIT:
Use:
print (df)
Label veg__0_time veg__0_value veg__1_time veg__1_value
0 USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
1 USD/CHF 2020-12-04 -0.439058 2020-12-10 1.392752
2 USD/CHF 2020-12-04 -0.012020 2020-12-10 0.043742
3 USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
4 USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
5 USD/CHF 2020-12-04 -0.525791 2020-12-10 1.273146
6 USD/CHF 2020-12-04 1.306578 2020-12-10 1.115313
7 USD/CHF 2020-12-04 -0.195221 2020-12-10 0.685444
8 USD/NOK 2020-12-04 0.395785 2020-12-10 -0.801768
9 EUR/CHF 2020-12-04 -29.385792 2020-12-10 45.951600
10 EUR/CHF 2020-12-04 -21.874456 2020-12-10 52.250453
df = df.set_index('Label')
df.columns = df.columns.str.split('_', expand=True).droplevel([0,1])
print (df)
0 1
time value time value
Label
USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
USD/CHF 2020-12-04 -0.439058 2020-12-10 1.392752
USD/CHF 2020-12-04 -0.012020 2020-12-10 0.043742
USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
USD/CHF 2020-12-04 -0.525791 2020-12-10 1.273146
USD/CHF 2020-12-04 1.306578 2020-12-10 1.115313
USD/CHF 2020-12-04 -0.195221 2020-12-10 0.685444
USD/NOK 2020-12-04 0.395785 2020-12-10 -0.801768
EUR/CHF 2020-12-04 -29.385792 2020-12-10 45.951600
EUR/CHF 2020-12-04 -21.874456 2020-12-10 52.250453
df = df.stack(0).groupby(['Label','time'])['value'].sum().reset_index()
print (df)
Label time value
0 EUR/CHF 2020-12-04 -51.260248
1 EUR/CHF 2020-12-10 98.202053
2 USD/CHF 2020-12-04 0.134488
3 USD/CHF 2020-12-10 4.510396
4 USD/NOK 2020-12-04 0.395785
5 USD/NOK 2020-12-10 -0.801768
EDIT1: Solution with rename:
def f(x):
splitted = x.split('_')
if len(splitted) > 1:
return f'{splitted[-1]}_{splitted[-2]}'
else:
return x
df = df.rename(columns=f)
print (df)
Label time_0 value_0 time_1 value_1
0 USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
1 USD/CHF 2020-12-04 -0.439058 2020-12-10 1.392752
2 USD/CHF 2020-12-04 -0.012020 2020-12-10 0.043742
3 USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
4 USD/CHF 2020-12-04 0.000000 2020-12-10 0.000000
5 USD/CHF 2020-12-04 -0.525791 2020-12-10 1.273146
6 USD/CHF 2020-12-04 1.306578 2020-12-10 1.115313
7 USD/CHF 2020-12-04 -0.195221 2020-12-10 0.685444
8 USD/NOK 2020-12-04 0.395785 2020-12-10 -0.801768
9 EUR/CHF 2020-12-04 -29.385792 2020-12-10 45.951600
10 EUR/CHF 2020-12-04 -21.874456 2020-12-10 52.250453
df = (pd.wide_to_long(df.reset_index(),
stubnames=['value','time'],
i=['index','Label'], j='tmp', sep='_')
.groupby(['Label','time'])['value']
.sum()
.reset_index())
print (df)
Label time value
0 EUR/CHF 2020-12-04 -51.260248
1 EUR/CHF 2020-12-10 98.202053
2 USD/CHF 2020-12-04 0.134488
3 USD/CHF 2020-12-10 4.510396
4 USD/NOK 2020-12-04 0.395785
5 USD/NOK 2020-12-10 -0.801768
Upvotes: 1