Reputation: 77

extract number from string in pandas dataframe column

I have a dataframe in the format below and am trying to use the extract function, but I keep getting the following error:

ValueError: If using all scalar values, you must pass an index

column1    column2
1         abc2150/abc2152/abc2154/abc215601/U215602


(df.column2.str
    .split('/', expand=True)
    .apply(lambda row: row.str.extract(r'(\d+)', expand=True))
    .apply(lambda x: '/'.join(x.dropna().astype(str)), axis=1))

I need the output in the below format.

column1    column2
1         2150/2152/2154/215601/215602

Please let me know how to fix it.

Thanks

Upvotes: 2

Answers (3)

wwnde

Reputation: 26676

Why not?

df['column2'] = df.column2.str.replace(r'[A-Za-z]+', '', regex=True)

Upvotes: 0

quest

Reputation: 3936

Here is what I would do:

import re

df.loc[:, "column2"] = df.column2.apply(lambda x: re.sub("[a-zA-Z]+", "", x))
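The same result can probably be reached from the other direction, which stays closer to the "extract" idea in the question: collect every run of digits per row and join them back with "/". This is only a minimal sketch, assuming the sample row from the question and that "/" is always the separator you want in the output; the frame is rebuilt here just so the snippet runs on its own.

```python
import pandas as pd

# Sample frame rebuilt from the question so the snippet is self-contained.
df = pd.DataFrame({
    "column1": [1],
    "column2": ["abc2150/abc2152/abc2154/abc215601/U215602"],
})

# str.findall returns a list of digit runs for each row;
# str.join then glues each list back together with "/".
df["column2"] = df["column2"].str.findall(r"\d+").str.join("/")

print(df)
```

Note that this rebuilds the string from the digits alone, so any original separator other than "/" would be replaced by "/" as well.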

Upvotes: -1

yatu

Reputation: 88305

You could instead use str.replace with a positive lookahead to remove all characters that precede the numerical part:

df.column2.str.replace(r'[a-zA-Z]+(?=\d+)', '', regex=True)

0    2150/2152/2154/215601/215602
Name: column2, dtype: object
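To see what the lookahead changes compared with simply stripping every letter, here is a small self-contained sketch. The second value in the series is hypothetical (it is not from the question) and only serves to show a letter that does not precede a digit. It also passes regex=True explicitly, since recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

s = pd.Series(
    [
        "abc2150/abc2152/abc2154/abc215601/U215602",  # row from the question
        "abc2150x/U215602",                           # hypothetical: trailing letter
    ],
    name="column2",
)

# Remove letters only when they are immediately followed by digits.
with_lookahead = s.str.replace(r"[a-zA-Z]+(?=\d+)", "", regex=True)

# Remove every run of letters, wherever it appears.
all_letters = s.str.replace(r"[a-zA-Z]+", "", regex=True)

print(with_lookahead.tolist())
print(all_letters.tolist())
```

On the question's data both patterns give the same string; they differ only when letters appear somewhere other than directly in front of a number, as with the "x" in the hypothetical row.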

Upvotes: 2
