Reputation: 2507
I would like to extract every list of digits that appears in a column of my dataframe.
Using this as a sample:
':[{"id":836890 name:"Rob Rubnitz" scorecard:[40 35]} {"id":401538 name:"Steve Weisfeld" scorecard:[40 35]} {"id":799385 name:"Marc Werlinsky" scorecard:[40 35]}] '
I would like to extract [40 35] [40 35] [40 35] and store these as the values in the updated column.
This is what I tried:
data['col'].str.extract('scorecard:(?P<scorecards>.*?)}')
The problem is that this only extracts the first scorecard from each row of my column.
Upvotes: 2
Views: 51
Reputation: 294516
extractall
data['col'].str.extractall('scorecard:(?P<scorecards>.*?)}')
        scorecards
  match
0 0        [40 35]
  1        [40 35]
  2        [40 35]
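Since extractall returns one row per match, getting back to a single value per original row takes one more step. A minimal sketch, assuming the sample string from the question sits in a one-row frame and that the target column is called scorecards (that column name is my choice, not the asker's):

```python
import pandas as pd

# Rebuild the sample from the question as a one-row DataFrame.
data = pd.DataFrame({
    "col": [
        ':[{"id":836890 name:"Rob Rubnitz" scorecard:[40 35]} '
        '{"id":401538 name:"Steve Weisfeld" scorecard:[40 35]} '
        '{"id":799385 name:"Marc Werlinsky" scorecard:[40 35]}] '
    ]
})

# One row per match; the index is (original row label, match number).
found = data["col"].str.extractall(r"scorecard:(?P<scorecards>.*?)}")

# Collapse the matches of each original row into one space-separated string.
per_row = found["scorecards"].groupby(level=0).agg(" ".join)

# Assignment aligns on the original index; rows with no match become NaN.
data["scorecards"] = per_row
print(data["scorecards"].iloc[0])
```

Grouping on level 0 of the index is what ties each match back to the row it came from.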
findall
data['col'].str.findall('scorecard:(.*?)}')
0 [[40 35], [40 35], [40 35]]
Name: col, dtype: object
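If the goal is a single string such as [40 35] [40 35] [40 35] in the updated column, the lists from findall can be joined with str.join. A small sketch, assuming the same sample data as in the question; the new column name scorecards is only illustrative:

```python
import pandas as pd

# Rebuild the sample from the question as a one-row DataFrame.
data = pd.DataFrame({
    "col": [
        ':[{"id":836890 name:"Rob Rubnitz" scorecard:[40 35]} '
        '{"id":401538 name:"Steve Weisfeld" scorecard:[40 35]} '
        '{"id":799385 name:"Marc Werlinsky" scorecard:[40 35]}] '
    ]
})

# findall keeps one entry per row: a list of every captured group.
matches = data["col"].str.findall(r"scorecard:(.*?)}")

# str.join concatenates the strings inside each list with the separator.
data["scorecards"] = matches.str.join(" ")
print(data["scorecards"].iloc[0])
```

This keeps the original row count, which makes it easier to write the result straight back into the dataframe than the extractall route.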
Upvotes: 1