Harsh Patel

Reputation: 49

How to extract specific content using pandas

Consider the following data:

Non-removable Li-Po 2870 mAh battery
Non-removable Li-Po 5910 mAh battery (A3-A20-K1AY)
Non-removable Li-Po 1810 mAh battery (6.9 Wh)

I would like to extract the numeric mAh battery capacity from each row, like this:

2870
5910
1810

I tried using

def func(x):
  # Split array
  ar = x.split(' mAh')

but I don't understand what I need to return.

Upvotes: 0

Views: 61

Answers (2)

Jan

Reputation: 43169

It seems to always be the first number, so you might use

^\D*(\d+)

As in

df.column_in_question_here.str.extract(r'^\D*(\d+)')

See a demo on regex101.com for the expression.
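A minimal sketch of this pattern applied to the sample data from the question. The column names `battery` and `mah` are hypothetical, and `expand=False` with `astype(int)` are additions here so that the result is an integer Series rather than a one-column DataFrame of strings.

```python
import pandas as pd

# Sample rows from the question; the column name "battery" is an assumption.
df = pd.DataFrame({"battery": [
    "Non-removable Li-Po 2870 mAh battery",
    "Non-removable Li-Po 5910 mAh battery (A3-A20-K1AY)",
    "Non-removable Li-Po 1810 mAh battery (6.9 Wh)",
]})

# ^\D* skips any leading non-digit characters;
# (\d+) captures the first run of digits in the string.
df["mah"] = df["battery"].str.extract(r"^\D*(\d+)", expand=False).astype(int)

print(df["mah"].tolist())
```

This relies on the capacity being the first number in every row. If a row does not match at all, the extracted value is missing and `astype(int)` will raise.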

Upvotes: 0

meW

Reputation: 3967

Assuming the value always lies between "Li-Po" and "mAh", use str.extract:

import pandas as pd

df = pd.DataFrame({'col': ['Non-removable Li-Po 2870 mAh battery',
                           'Non-removable Li-Po 5910 mAh battery (A3-A20-K1AY)',
                           'Non-removable Li-Po 1810 mAh battery (6.9 Wh)']})
df.col.str.extract('Li-Po (.*) mAh')

      0
0  2870
1  5910
2  1810
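The `(.*)` group in the pattern above is greedy and depends on the literal text "Li-Po", so it may capture too much or nothing for other kinds of rows. As a hedged variation, the sketch below anchors on the unit instead and converts the result to numbers. The last two rows are made up for illustration and are not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({"col": [
    "Non-removable Li-Po 2870 mAh battery",
    "Non-removable Li-Po 5910 mAh battery (A3-A20-K1AY)",
    "Non-removable Li-Po 1810 mAh battery (6.9 Wh)",
    "Removable Li-Ion 1500 mAh battery",  # hypothetical row: different chemistry
    "No battery listed",                  # hypothetical row: no capacity present
]})

# (\d+) captures the digits that come directly before the "mAh" unit,
# regardless of what text precedes them.
extracted = df["col"].str.extract(r"(\d+)\s*mAh", expand=False)

# errors="coerce" turns rows without a match into NaN instead of raising.
capacity = pd.to_numeric(extracted, errors="coerce")

print(capacity.tolist())
```

Rows without a capacity come back as NaN, which can then be dropped or filled as needed.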

Upvotes: 2
