Reputation: 23
I have basically the same problem as in this question: "How can I get a string column that look like a dictionary and get the last item of it?" However, instead of getting the last number, I need the index, which is not actually an index but a string (the last key).
As shown in data, 'col_B' contains a list of strings.
Any ideas?
import pandas as pd

data =\
{'col_A': ['AA', 'BB', 'CC'],
'col_B': ['{"0":10,"5":13,"8":20}', '{"0":2,"3":34,"5":40,"15":100}', '{"2":5,"5":19,"15":200,"20":200,"30":340}']}
df = pd.DataFrame(data)
col_A col_B
0 AA {"0":10,"5":13,"8":20}
1 BB {"0":2,"3":34,"5":40,"15":100}
2 CC {"2":5,"5":19,"15":200,"20":200,"30":340}
I need to find a way of extracting the numbers 8, 15 and 30.
Upvotes: 2
Views: 43
Reputation: 88276
You could parse the strings with literal_eval and index dict.keys to obtain the last key:
from ast import literal_eval
df['col_B'] = df.col_B.map(literal_eval)
df.col_B.map(lambda x: list(x.keys())[-1])
0 8
1 15
2 30
Name: col_B, dtype: object
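For reference, here is a self-contained sketch of the same idea that leaves col_B untouched, so the original strings stay available for other approaches. The column name last_key is made up for illustration and is not part of the question.

```python
from ast import literal_eval

import pandas as pd

df = pd.DataFrame({
    'col_A': ['AA', 'BB', 'CC'],
    'col_B': ['{"0":10,"5":13,"8":20}',
              '{"0":2,"3":34,"5":40,"15":100}',
              '{"2":5,"5":19,"15":200,"20":200,"30":340}'],
})

# Parse each string into a dict without overwriting col_B.
parsed = df['col_B'].map(literal_eval)

# Iterating a dict yields its keys; the last one is the last key
# written in the string (assumes Python 3.7+ insertion ordering).
# 'last_key' is a hypothetical column name chosen for this sketch.
df['last_key'] = parsed.map(lambda d: list(d)[-1])
```

The keys come back as strings, since they are quoted in the source text.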
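Though depending on the version of Python, the order of the dictionary may not be preserved: insertion order is only guaranteed from Python 3.7 onward (in CPython 3.6 it is an implementation detail). In such a case, perhaps a regex is safer. Note that it has to be applied to the original string column, i.e. before col_B is overwritten with the parsed dictionaries: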
df.col_B.str.extract(r'"(\d+)"\:\d+}$').squeeze()
0 8
1 15
2 30
Name: 0, dtype: object
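If it helps, two variations that do not rely on dictionary ordering are sketched below. The first assumes the strings are valid JSON and that the key you want is always the numerically largest one (true for the sample data, but an assumption in general). The second is the same regex with expand=False, which returns a Series directly, followed by a cast to int. The variable names are illustrative only.

```python
import json

import pandas as pd

col_B = pd.Series(
    ['{"0":10,"5":13,"8":20}',
     '{"0":2,"3":34,"5":40,"15":100}',
     '{"2":5,"5":19,"15":200,"20":200,"30":340}'],
    name='col_B',
)

# Variation 1: parse as JSON and pick the numerically largest key.
# Assumes the wanted key is always the largest one; keys stay strings.
largest_key = col_B.map(lambda s: max(json.loads(s), key=int))

# Variation 2: capture the key of the final "key":value pair.
# expand=False yields a Series, so no squeeze() is needed.
last_key = col_B.str.extract(r'"(\d+)":\d+}$', expand=False).astype(int)
```

With variation 2 the result is numeric, which may be more convenient if the keys are going to be compared or sorted later.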
Upvotes: 4