Reputation: 32226
How do I extract the value from a list? For example:
import pandas as pd

df = pd.DataFrame([[0, 4, 'Abc', 456, '[45.55%]'],
                   [2, 5.2, 'abc', 5, '[34.54%]'],
                   [0.2, 6, 'xyz', 65, '[12.21%]'],
                   [3, 4.1, 'Xbc', 23, '[99.12%]']],
                  columns=['start', 'end', 'name', 'body_mass', 'budget'])
I can chain str.replace calls as shown below, but I am looking for a better solution.
df.budget.str.replace('[', '').str.replace(']', '').str.replace('%', '').astype(float)
0 45.55
1 34.54
2 12.21
3 99.12
Name: budget, dtype: float64
There is only one item in each list, if that matters.
Upvotes: 1
Views: 65
Reputation: 32226
Using a regular expression:
df.budget.str.extract(r'(\d*\.?\d+)', expand=False).astype(float)
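As a self-contained sketch of the extraction above, the snippet below rebuilds only the budget column from the question's sample data. The variable names `budget` and `extracted` are illustrative. Passing `expand=False` is a choice made here so that a single capture group comes back as a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample column from the question: one bracketed percentage string per row.
budget = pd.Series(['[45.55%]', '[34.54%]', '[12.21%]', '[99.12%]'], name='budget')

# The raw string keeps the backslashes intact for the regex engine.
# The group captures optional leading digits, an optional dot, then digits.
extracted = budget.str.extract(r'(\d*\.?\d+)', expand=False).astype(float)

print(extracted.tolist())
```

Note that this pattern keeps only the first number it finds in each string, which is fine here because each list holds one item.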
Upvotes: 1
Reputation: 164843
This is a different way, using pd.Series.str.replace to remove %, ast.literal_eval to convert the string to a list, and operator.itemgetter to extract the first item.
from ast import literal_eval
from operator import itemgetter
df['budget'] = df['budget'].str.replace('%', '')\
.apply(literal_eval)\
.apply(itemgetter(0))
print(df['budget'])
0 45.55
1 34.54
2 12.21
3 99.12
Name: budget, dtype: float64
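To see what each step does, the same chain can be traced on a single cell using only the standard library. This is a minimal sketch: the names `raw`, `without_pct`, `as_list` and `value` are illustrative and do not come from the answer.

```python
from ast import literal_eval
from operator import itemgetter

raw = '[45.55%]'                      # one cell from the budget column
without_pct = raw.replace('%', '')    # '[45.55]' is now a valid Python literal
as_list = literal_eval(without_pct)   # safely parsed into a real list
value = itemgetter(0)(as_list)        # first (and only) item of that list

print(as_list, value)
```

literal_eval only accepts Python literals, so the % sign has to be removed first. Otherwise the string is not a valid literal and parsing raises an error.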
Alternative method using regular expression:
import re
pattern = '|'.join([re.escape(i) for i in ('%', '[', ']')])
df['budget'] = df['budget'].str.replace(pattern, '', regex=True)\
.astype(float)
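The pattern built above is an alternation of the three escaped characters. As a rough illustration of what it matches, here it is applied with the standard library's re.sub to the raw strings from the question. The list `values` and the name `cleaned` are illustrative.

```python
import re

# Escape each character so '[' and ']' are matched literally, then join with '|'.
pattern = '|'.join(re.escape(ch) for ch in ('%', '[', ']'))

values = ['[45.55%]', '[34.54%]', '[12.21%]', '[99.12%]']

# Delete every occurrence of any of the three characters, then convert to float.
cleaned = [float(re.sub(pattern, '', v)) for v in values]

print(cleaned)
```

In pandas, the pattern is only interpreted as a regular expression when str.replace runs in regex mode. The default for the `regex` argument has differed between pandas versions, so passing it explicitly is the safer choice.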
Upvotes: 1
Reputation: 2322
df['budget'] = df.budget.str.replace('[', '').str.replace(']', '').str.replace('%', '').astype(float)
This overwrites the budget column in your DataFrame with the converted values.
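Because the unwanted characters only appear at the ends of each value, the three chained replace calls could also be collapsed into a single str.strip call. This is an alternative that none of the answers here use. It is sketched on the question's sample data with illustrative variable names.

```python
import pandas as pd

# Sample column from the question: one bracketed percentage string per row.
budget = pd.Series(['[45.55%]', '[34.54%]', '[12.21%]', '[99.12%]'], name='budget')

# str.strip removes any of the listed characters from both ends of each string.
stripped = budget.str.strip('[]%').astype(float)

print(stripped.tolist())
```

This only works when the brackets and percent sign sit at the start or end of the string. Characters in the middle of a value are left untouched.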
Upvotes: 0