Reputation: 990
I have a pandas dataframe like this -
Time 1 A 2 A 3 A 4 A 5 A 6 A 100 A
5 10 4 6 6 4 6 4
3 7 19 2 7 7 9 18
6 3 6 3 3 8 10 56
2 5 9 1 1 9 12 13
The Time
columns gives me the number of A
columns that I need to sum up.So that the output looks like this -
Time 1 A 2 A 3 A 4 A 5 A 6 A 100 A Total
5 10 4 6 6 4 6 4 30
3 7 19 2 7 7 9 18 28
6 3 6 3 3 8 10 56 33
2 5 9 1 1 9 12 13 14
In other words, when the value in Time
column is 3, it should sum up 1A, 2A and 3A
when the value in Time
column is 5, it should sum up 1A, 2A, 3A, 4A and 5A
Note: There are other columns also in between the As
. So I cant sum using simple indexing.
Highly appreciate any help in finding a solution.
Upvotes: 1
Views: 646
Reputation: 862591
Use numpy - idea is compare array created by np.arange
with length of columns with Time
columns converted to index with broadcasting to 2d mask, get matched values by numpy.where
and last sum
:
df1 = df.set_index('Time')
m = np.arange(len(df1.columns)) < df1.index.values[:, None]
df['new'] = np.where(m, df1.values, 0).sum(axis=1)
print (df)
Time 1 A 2 A 3 A 4 A 5 A 6 A 100 A new
0 5 10 4 6 6 4 6 4 30
1 3 7 19 2 7 7 9 18 28
2 6 3 6 3 3 8 10 56 33
3 2 5 9 1 1 9 12 13 14
Details:
print (df1)
1 A 2 A 3 A 4 A 5 A 6 A 100 A
Time
5 10 4 6 6 4 6 4
3 7 19 2 7 7 9 18
6 3 6 3 3 8 10 56
2 5 9 1 1 9 12 13
print (m)
[[ True True True True True False False]
[ True True True False False False False]
[ True True True True True True False]
[ True True False False False False False]]
print (np.where(m, df1.values, 0))
[[10 4 6 6 4 0 0]
[ 7 19 2 0 0 0 0]
[ 3 6 3 3 8 10 0]
[ 5 9 0 0 0 0 0]]
Upvotes: 3
Reputation: 3495
Try:
df['total'] = df.apply(lambda x: sum([x[i+1] for i in range(x['Time'])]), axis=1)
Upvotes: 1