Reputation: 8913
I have the following pandas DataFrame of strings:
   key  0  1-9  10-18  19-27  28-36  37-45  46-54  55-63  64-72  73-81  82-90  91-99  100
1    A  1    2      1      4      1      1      1      7      1      3      1      1    1
2    B  3    1      1      1      6      1      1      1      7      1      8      1    1
3    C  1    1      2      1      1      1      1      1      1      1      1      1    1
I would like to get the sum of the cells of a specific row, so for example for the first row (key A) the result should be 25 (1 + 2 + 1 + 4 + 1 + 1 + 1 + 7 + 1 + 3 + 1 + 1 + 1).
How would you approach such a problem?
Upvotes: 3
Views: 3811
Reputation: 862761
If the values in key are unique and you need to select by label:
Create an index from the key column with set_index, then select the row with DataFrame.loc:
# selecting a single label returns a Series
print(df.set_index('key').loc['A'])
0 1
1-9 2
10-18 1
19-27 4
28-36 1
37-45 1
46-54 1
55-63 7
64-72 1
73-81 3
82-90 1
91-99 1
100 1
Name: A, dtype: int64
out = df.set_index('key').loc['A'].sum()
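For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the question and applies the set_index / loc / sum chain. The variable names (columns, values, df_str) are illustrative only. The question describes the cells as strings while the output above shows int64, so the last two lines assume string cells and convert them with astype(int) before summing.

```python
import pandas as pd

# Rebuild the sample data from the question (integer cells, as in the answer's output).
columns = ["0", "1-9", "10-18", "19-27", "28-36", "37-45", "46-54",
           "55-63", "64-72", "73-81", "82-90", "91-99", "100"]
values = [
    [1, 2, 1, 4, 1, 1, 1, 7, 1, 3, 1, 1, 1],
    [3, 1, 1, 1, 6, 1, 1, 1, 7, 1, 8, 1, 1],
    [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
df = pd.DataFrame(values, columns=columns, index=[1, 2, 3])
df.insert(0, "key", ["A", "B", "C"])

# Move key into the index, pick the row labelled 'A' (a Series), then sum it.
out = df.set_index("key").loc["A"].sum()
print(out)  # 25

# Assumption: if the cells really are strings, convert them to integers first.
df_str = df.astype(str)
out_str = df_str.set_index("key").astype(int).loc["A"].sum()
print(out_str)  # 25
```

Without the conversion, summing string cells would concatenate them rather than add them.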
Or create the index first, then sum each row, and finally select the result with Series.at or Series.loc:
# sum(axis=1) returns a Series of row totals
print(df.set_index('key').sum(axis=1))
key
A 25
B 33
C 14
dtype: int64
out = df.set_index('key').sum(axis=1).at['A']
out = df.set_index('key').sum(axis=1)['A']
out = df.set_index('key').sum(axis=1).loc['A']
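A self-contained sketch of the sum-then-select variant, assuming the same sample data as in the question (the names columns, values and totals are illustrative). It is handy when the totals for several keys are needed, because the row sums are computed once.

```python
import pandas as pd

columns = ["0", "1-9", "10-18", "19-27", "28-36", "37-45", "46-54",
           "55-63", "64-72", "73-81", "82-90", "91-99", "100"]
values = [
    [1, 2, 1, 4, 1, 1, 1, 7, 1, 3, 1, 1, 1],
    [3, 1, 1, 1, 6, 1, 1, 1, 7, 1, 8, 1, 1],
    [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
df = pd.DataFrame(values, columns=columns, index=[1, 2, 3])
df.insert(0, "key", ["A", "B", "C"])

# One total per key: a Series indexed by key.
totals = df.set_index("key").sum(axis=1)

# Three equivalent ways to read the scalar for a unique label.
out_at = totals.at["A"]
out_getitem = totals["A"]
out_loc = totals.loc["A"]
print(out_at, out_getitem, out_loc)  # 25 25 25
```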
Or filter by boolean indexing first and then sum:
# filtering creates a one-row DataFrame
print(df[df['key'] == 'A'])
key 0 1-9 10-18 19-27 28-36 37-45 46-54 55-63 64-72 73-81 82-90 \
1 A 1 2 1 4 1 1 1 7 1 3 1
91-99 100
1 1 1
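# numeric_only=True skips the non-numeric key column (recent pandas versions no longer drop it silently)
out = df[df['key'] == 'A'].sum(axis=1, numeric_only=True).item()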
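The boolean-indexing route can be sketched as below, again on the question's sample data (names such as mask and one_row are illustrative). Because the key column stays in the frame here, this sketch passes numeric_only=True so that only the numeric columns are added up; that argument is my assumption about what is needed on current pandas versions.

```python
import pandas as pd

columns = ["0", "1-9", "10-18", "19-27", "28-36", "37-45", "46-54",
           "55-63", "64-72", "73-81", "82-90", "91-99", "100"]
values = [
    [1, 2, 1, 4, 1, 1, 1, 7, 1, 3, 1, 1, 1],
    [3, 1, 1, 1, 6, 1, 1, 1, 7, 1, 8, 1, 1],
    [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
df = pd.DataFrame(values, columns=columns, index=[1, 2, 3])
df.insert(0, "key", ["A", "B", "C"])

# Boolean mask: True only for the rows whose key equals 'A'.
mask = df["key"] == "A"
one_row = df[mask]  # still a DataFrame, with a single row

# Sum across the numeric columns, then unwrap the one-element Series.
out = one_row.sum(axis=1, numeric_only=True).item()
print(out)  # 25
```

Note that item() raises an error if the filter matches more than one row, which makes it a reasonable guard when the key is expected to be unique.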
If the values in key may be duplicated and you need to select by label:
print (df)
key 0 1-9 10-18 19-27 28-36 37-45 46-54 55-63 64-72 73-81 82-90 \
1 A 1 2 1 4 1 1 1 7 1 3 1
2 A 3 1 1 1 6 1 1 1 7 1 8
3 C 1 1 2 1 1 1 1 1 1 1 1
91-99 100
1 1 1
2 1 1
3 1 1
First, it is possible to convert the filtered rows to a NumPy array with values and then take the sum of the 2D array:
out = df.set_index('key').loc['A'].values.sum()
Or use a double sum: the first sum creates a Series and the second sum returns a scalar:
out = df.set_index('key').loc['A'].sum().sum()
out = df.set_index('key').sum(axis=1).loc['A'].sum()
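A runnable sketch of the duplicated-key case, using the second sample frame above where key is A, A, C (the names subset, total_numpy, total_double and per_row are illustrative). With a repeated label, loc returns a DataFrame rather than a Series, which is why a single sum is not enough.

```python
import pandas as pd

columns = ["0", "1-9", "10-18", "19-27", "28-36", "37-45", "46-54",
           "55-63", "64-72", "73-81", "82-90", "91-99", "100"]
values = [
    [1, 2, 1, 4, 1, 1, 1, 7, 1, 3, 1, 1, 1],
    [3, 1, 1, 1, 6, 1, 1, 1, 7, 1, 8, 1, 1],
    [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
df = pd.DataFrame(values, columns=columns, index=[1, 2, 3])
df.insert(0, "key", ["A", "A", "C"])  # 'A' appears twice

# A duplicated label selects every matching row: a 2 x 13 DataFrame.
subset = df.set_index("key").loc["A"]

# Option 1: sum the underlying 2D NumPy array in one step.
total_numpy = subset.to_numpy().sum()

# Option 2: column sums first (a Series), then the sum of those.
total_double = subset.sum().sum()

# Per-row totals for the duplicated key, if the rows should stay separate.
per_row = df.set_index("key").sum(axis=1).loc["A"]

print(total_numpy, total_double)  # 58 58
print(per_row.tolist())           # [25, 33]
```

The sketch uses to_numpy(), which returns the same array as the values attribute used above.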
If you need to select by position:
Use DataFrame.iloc, or Series.iat / Series.iloc:
out = df.set_index('key').iloc[0].sum()
out = df.set_index('key').sum(axis=1).iat[0]
out = df.set_index('key').sum(axis=1).iloc[0]
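The positional variants can be sketched the same way, assuming the original sample data with unique keys (indexed and the out_* names are illustrative). Position 0 is the first row regardless of the index labels.

```python
import pandas as pd

columns = ["0", "1-9", "10-18", "19-27", "28-36", "37-45", "46-54",
           "55-63", "64-72", "73-81", "82-90", "91-99", "100"]
values = [
    [1, 2, 1, 4, 1, 1, 1, 7, 1, 3, 1, 1, 1],
    [3, 1, 1, 1, 6, 1, 1, 1, 7, 1, 8, 1, 1],
    [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
df = pd.DataFrame(values, columns=columns, index=[1, 2, 3])
df.insert(0, "key", ["A", "B", "C"])

# Keep only the numeric columns by moving key into the index.
indexed = df.set_index("key")

# Select the first row by position, then sum it.
out_row = indexed.iloc[0].sum()

# Or sum every row first and pick the first total by position.
out_iat = indexed.sum(axis=1).iat[0]
out_iloc = indexed.sum(axis=1).iloc[0]

print(out_row, out_iat, out_iloc)  # 25 25 25
```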
Upvotes: 4