Reputation: 197
I have a dataframe that looks like this:
col_A col_B
AA {"0":10,"5":13,"8":20}
BB {"0":2,"3":34,"5":40,"15":100}
CC {"2":5,"5":19,"15":200,"20":200,"30":340}
As you can see, col_B is some sort of dict. I need to get the last number in it. Basically, I need to achieve this:
col_A col_B col_C
AA {"0":10,"5":13,"8":20} 20
BB {"0":2,"3":34,"5":40,"15":100} 100
CC {"2":5,"5":19,"15":200,"20":200,"30":340} 340
Since col_B doesn't have a fixed length, I was trying to use str.count(',') to count the number of items and them str.split(':') to split the numbers. But this seems like too complex. I was wondering if there is an easier approach that I'm missing in here.
Upvotes: 1
Views: 51
Reputation: 150785
Try with str.extract
:
# all digits between `:` and `}`
# also, for any characters, not necessarily digits
# ':([^:]+)}'
df['col_C'] = df['col_B'].str.extract(':(\d+)}')
Output:
col_A col_B col_C
0 AA {"0":10,"5":13,"8":20} 20
1 BB {"0":2,"3":34,"5":40,"15":100} 100
2 CC {"2":5,"5":19,"15":200,"20":200,"30":340} 340
Upvotes: 2