shantanuo

Reputation: 32226

Extract the only item from list in pandas

How do I extract the value from a single-item list that is stored as a string? For example:

import pandas as pd

df = pd.DataFrame([[0, 4, 'Abc', 456, '[45.55%]'],
                   [2, 5.2, 'abc', 5, '[34.54%]'],
                   [0.2, 6, 'xyz', 65, '[12.21%]'],
                   [3, 4.1, 'Xbc', 23, '[99.12%]']],
                  columns=['start', 'end', 'name', 'body_mass', 'budget'])

I can chain string replace calls as shown below, but I am looking for a better solution.

df.budget.str.replace('[', '').str.replace(']', '').str.replace('%', '').astype(float)

0    45.55
1    34.54
2    12.21
3    99.12
Name: budget, dtype: float64

Each list contains only one item, if that matters.

Upvotes: 1

Views: 65

Answers (3)

shantanuo

Reputation: 32226

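Using a regular expression (a raw string keeps the backslashes intact, and expand=False returns a Series instead of a one-column DataFrame):

df.budget.str.extract(r'(\d*\.?\d+)', expand=False).astype(float)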
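A minimal, self-contained sketch of the regular-expression approach, assuming only the budget column from the question. The name budget for the result is just for illustration.

```python
import pandas as pd

# Only the column of interest from the question's DataFrame.
df = pd.DataFrame({'budget': ['[45.55%]', '[34.54%]', '[12.21%]', '[99.12%]']})

# Capture the first run of digits with an optional decimal point.
# expand=False returns a Series rather than a one-column DataFrame.
budget = df['budget'].str.extract(r'(\d*\.?\d+)', expand=False).astype(float)

print(budget)
```

Note that this pattern does not capture a leading minus sign, so it assumes the percentages are never negative.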

Upvotes: 1

jpp

Reputation: 164843

Here is a different way: use pd.Series.str.replace to remove the %, ast.literal_eval to convert each string to a list, and operator.itemgetter to extract the first item.

from ast import literal_eval
from operator import itemgetter

df['budget'] = df['budget'].str.replace('%', '')\
                           .apply(literal_eval)\
                           .apply(itemgetter(0))

print(df['budget'])

0    45.55
1    34.54
2    12.21
3    99.12
Name: budget, dtype: float64

An alternative method using a regular expression:

import re

pattern = '|'.join([re.escape(i) for i in ('%', '[', ']')])

# regex=True is required on pandas 2.0+, where str.replace is literal by default
df['budget'] = df['budget'].str.replace(pattern, '', regex=True)\
                           .astype(float)
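For reference, a runnable sketch of this regular-expression replacement on a standalone Series. The variable names budget and cleaned are illustrative, and regex=True is passed explicitly on the assumption that a recent pandas version is in use, where the pattern would otherwise be treated as a literal string.

```python
import re

import pandas as pd

budget = pd.Series(['[45.55%]', '[34.54%]', '[12.21%]', '[99.12%]'], name='budget')

# Build an alternation of the escaped characters to drop: %, [ and ]
pattern = '|'.join(re.escape(ch) for ch in ('%', '[', ']'))

# Remove every match of the pattern, then convert the remainder to float.
cleaned = budget.str.replace(pattern, '', regex=True).astype(float)

print(cleaned)
```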

Upvotes: 1

Eliethesaiyan

Reputation: 2322

df['budget'] = df.budget.str.replace('[', '').str.replace(']', '').str.replace('%', '').astype(float)

This overwrites the whole budget column in your DataFrame with the converted float values.
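Since the unwanted characters only appear at the ends of each string in the example data, the three chained replace calls could probably be collapsed into a single str.strip call. This is a swapped-in variant rather than the method above, sketched under that assumption about the data:

```python
import pandas as pd

df = pd.DataFrame({'budget': ['[45.55%]', '[34.54%]', '[12.21%]', '[99.12%]']})

# str.strip removes any of the listed characters from both ends of each string.
# It would not touch a '[', ']' or '%' in the middle of a value.
stripped = df['budget'].str.strip('[]%').astype(float)

print(stripped)
```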

Upvotes: 0
