dummyds
dummyds

Reputation: 197

How can I get a string column that look like a dictionary and get the last item of it?

I have a dataframe that looks like this:

col_A      col_B
AA         {"0":10,"5":13,"8":20}
BB         {"0":2,"3":34,"5":40,"15":100}
CC         {"2":5,"5":19,"15":200,"20":200,"30":340}

As you can see, col_B is some sort of dict. I need to get the last number in it. Basically, I need to achieve this:

col_A      col_B                                        col_C
AA         {"0":10,"5":13,"8":20}                       20
BB         {"0":2,"3":34,"5":40,"15":100}               100
CC         {"2":5,"5":19,"15":200,"20":200,"30":340}    340

Since col_B doesn't have a fixed length, I was trying to use str.count(',') to count the number of items and them str.split(':') to split the numbers. But this seems like too complex. I was wondering if there is an easier approach that I'm missing in here.

Upvotes: 1

Views: 51

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150785

Try with str.extract:

# all digits between `:` and `}`
# also, for any characters, not necessarily digits
# ':([^:]+)}'
df['col_C'] = df['col_B'].str.extract(':(\d+)}')

Output:

  col_A                                      col_B col_C
0    AA                     {"0":10,"5":13,"8":20}    20
1    BB             {"0":2,"3":34,"5":40,"15":100}   100
2    CC  {"2":5,"5":19,"15":200,"20":200,"30":340}   340

Upvotes: 2

Related Questions