Reputation: 139
I am trying to sum the numbers that appear in each cell of one column of a pandas DataFrame. The DataFrame was created like this:
import pandas as pd

data = [['ID_123456', 'example=1(abc)'], ['ID_123457', 'example=1(def)'], ['ID_123458', 'example=1(try)'], ['ID_123459', 'example=1(try)'], ['ID_123460', 'example=1(try),2(test)'], ['ID_123461', 'example=1(try),2(test),9(yum)'], ['ID_123462', 'example=1(try)'], ['ID_123463', 'example=1(try),7(test)']]
df = pd.DataFrame(data, columns=['ID', 'occ'])
display(df)  # display() is available in Jupyter/IPython; use print(df) elsewhere
The table looks like this:
ID occ
ID_123456 example=1(abc)
ID_123457 example=1(def)
ID_123458 example=1(try)
ID_123459 example=1(try)
ID_123460 example=1(try),2(test)
ID_123461 example=1(try),2(test),9(yum)
ID_123462 example=1(try)
ID_123463 example=1(try),7(test)
The following question is related, but I was unable to run its command on my DataFrame:
Sum all integers in a PANDAS DataFrame "cell"
The command raises the error "string index out of range".
The output should look like this:
ID occ count
ID_123456 example=1(abc) 1
ID_123457 example=1(def) 1
ID_123458 example=1(try) 1
ID_123459 example=1(try) 1
ID_123460 example=1(try),2(test) 3
ID_123461 example=1(try),2(test),9(yum) 12
ID_123462 example=1(try) 1
ID_123463 example=1(try),7(test) 8
Upvotes: 1
Views: 893
Reputation: 862761
If you want to sum all the numbers in the occ column, use Series.str.extractall to pull out every run of digits, convert the matches to integers, and sum them per original row:
df['count'] = df['occ'].str.extractall(r'(\d+)')[0].astype(int).groupby(level=0).sum()
print(df)
ID occ count
0 ID_123456 example=1(abc) 1
1 ID_123457 example=1(def) 1
2 ID_123458 example=1(try) 1
3 ID_123459 example=1(try) 1
4 ID_123460 example=1(try),2(test) 3
5 ID_123461 example=1(try),2(test),9(yum) 12
6 ID_123462 example=1(try) 1
7 ID_123463 example=1(try),7(test) 8
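If some cells might contain no digits at all, extractall returns no rows for them and the assignment leaves NaN in count. Below is a minimal sketch of the same idea with the steps split out, assuming a recent pandas where the per-row sum is written as groupby(level=0).sum(). The row ID_000000 is a hypothetical addition, not part of the question's data, included only to show the no-match case; filling with 0 is likewise an assumption about what you would want there.

```python
import pandas as pd

data = [
    ["ID_123456", "example=1(abc)"],
    ["ID_123460", "example=1(try),2(test)"],
    ["ID_123461", "example=1(try),2(test),9(yum)"],
    ["ID_000000", "example=none"],  # hypothetical row with no digits
]
df = pd.DataFrame(data, columns=["ID", "occ"])

# extractall returns one row per regex match, indexed by
# (original row label, match number)
matches = df["occ"].str.extractall(r"(\d+)")[0].astype(int)

# Sum the matches that belong to each original row (index level 0)
totals = matches.groupby(level=0).sum()

# Rows without any digits are missing from totals; assume they count as 0
df["count"] = totals.reindex(df.index, fill_value=0)

print(df)
```

With this data the count column comes out as 1, 3, 12 and 0. Note that the pattern picks up every run of digits in the cell, so a digit inside the parentheses (for example 1(abc2)) would be added to the total as well.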
Upvotes: 2