Reputation: 411
I have a DataFrame which looks like this (with many additional columns):
      age1  age2  age3  age4  \
Id#
1001     5     6     2     8
1002     7     6     1     0
1003    10     9     7     5
1004     9    12     5     9
I am trying to write a loop that sums each column with the one before it and returns the result in a new DataFrame. I started out, simply, with this:
New = pd.DataFrame()
New[0] = SFH2.ix[:,0]
for x in SFH2:
    ls = [x,x+1]
    B = SFH2[ls].sum(axis=1)
    New[x] = B
print(New)
and the error I get is
    ls = [x,x+1]
TypeError: Can't convert 'int' object to str implicitly
I know that int and str are different objects, but how can I overcome this, or is there a different way to iterate through columns? Thanks!
Upvotes: 4
Views: 214
Reputation: 95948
It sounds like cumsum is what you are looking for:
In [5]: df
Out[5]:
      age1  age2  age3  age4
Id#
1001     5     6     2     8
1002     7     6     1     0
1003    10     9     7     5
1004     9    12     5     9
In [6]: df.cumsum(axis=1)
Out[6]:
      age1  age2  age3  age4
Id#
1001     5    11    13    21
1002     7    13    14    14
1003    10    19    26    31
1004     9    21    26    35
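For a reproducible check, here is a minimal sketch that rebuilds the sample frame from the question (the name df and the constructor call are assumptions, not the asker's actual code) and takes the running total across columns. It also shows, as far as I can tell, why the original loop failed: iterating over a DataFrame yields its column labels, which are strings here, so x+1 raises the TypeError.

```python
import pandas as pd

# Sample data copied from the question; how the frame is built is an assumption.
df = pd.DataFrame(
    {
        "age1": [5, 7, 10, 9],
        "age2": [6, 6, 9, 12],
        "age3": [2, 1, 7, 5],
        "age4": [8, 0, 5, 9],
    },
    index=pd.Index([1001, 1002, 1003, 1004], name="Id#"),
)

# Iterating over a DataFrame yields column labels (strings here),
# so an expression like x + 1 cannot work on them.
labels = [x for x in df]

# Running total across the columns of each row (axis=1).
running = df.cumsum(axis=1)
print(labels)
print(running)
```

If only some columns should be accumulated, selecting them first (for example df[["age1", "age2"]].cumsum(axis=1)) should work the same way.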
Upvotes: 2
Reputation: 862641
You can use add with a shifted DataFrame:
print (df.shift(-1,axis=1))
      age1  age2  age3  age4
Id#
1001   6.0   2.0   8.0   NaN
1002   6.0   1.0   0.0   NaN
1003   9.0   7.0   5.0   NaN
1004  12.0   5.0   9.0   NaN
print (df.add(df.shift(-1,axis=1), fill_value=0))
      age1  age2  age3  age4
Id#
1001  11.0   8.0  10.0   8.0
1002  13.0   7.0   1.0   0.0
1003  19.0  16.0  12.0   5.0
1004  21.0  17.0  14.0   9.0
If you need to shift by 1 instead (the default, so the argument can be omitted):
print (df.shift(axis=1))
      age1  age2  age3  age4
Id#
1001   NaN   5.0   6.0   2.0
1002   NaN   7.0   6.0   1.0
1003   NaN  10.0   9.0   7.0
1004   NaN   9.0  12.0   5.0
print (df.add(df.shift(axis=1), fill_value=0))
      age1  age2  age3  age4
Id#
1001   5.0  11.0   8.0  10.0
1002   7.0  13.0   7.0   1.0
1003  10.0  19.0  16.0  12.0
1004   9.0  21.0  17.0  14.0
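To relate this to the loop from the question, here is a sketch that does the same pairwise sum by column position with iloc and compares it with the shift-based result. It assumes the sample frame is rebuilt as df; the names shifted_sum and looped are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; how the frame is built is an assumption.
df = pd.DataFrame(
    {
        "age1": [5, 7, 10, 9],
        "age2": [6, 6, 9, 12],
        "age3": [2, 1, 7, 5],
        "age4": [8, 0, 5, 9],
    },
    index=pd.Index([1001, 1002, 1003, 1004], name="Id#"),
)

# Vectorised: add each column to the one on its left.
# fill_value=0 keeps the first column unchanged instead of producing NaN.
shifted_sum = df.add(df.shift(axis=1), fill_value=0)

# Explicit loop over column positions (integers), not column labels (strings).
looped = pd.DataFrame(index=df.index)
looped[df.columns[0]] = df.iloc[:, 0]
for i in range(1, df.shape[1]):
    looped[df.columns[i]] = df.iloc[:, i - 1] + df.iloc[:, i]

print(shifted_sum)
print(looped)
```

The loop keeps integer dtypes while the shifted version returns floats, but the values should agree.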
Upvotes: 2