Mitch

Reputation: 596

Set one column to whichever of several strings is contained in another column

I want to populate one column with whichever string, out of a list of candidates, is contained in another column (when one of them is present).

Right now I can do it by repeating a line of code for every string, but I'm looking for a more efficient way. I have about a dozen strings in total.

df.loc[df['column1'].str.contains('g/mL'),'units'] = 'g/mL'
df.loc[df['column1'].str.contains('mPa.s'),'units'] = 'mPa.s'
df.loc[df['column1'].str.contains('mN/m'),'units'] = 'mN/m'

I don't know how to make it check for all of them at once:

df.loc[df['column1'].str.contains('g/mL|mPa.s|mN/m'),'units'] = ...

and then set the column to whichever one was matched.

Upvotes: 0

Views: 87

Answers (3)

jezrael

Reputation: 862511

Use a loop with str.contains:

L = ['g/mL', 'mPa.s', 'mN/m']
for val in L:
    # regex=False matches val literally, so the '.' in 'mPa.s' is not a wildcard
    df.loc[df['column1'].str.contains(val, regex=False),'units'] = val
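If column1 can contain missing values, the boolean mask from str.contains will contain NaN and the .loc assignment can fail. A minimal sketch of the same loop that treats missing rows as "no match" via na=False; the sample data is made up for illustration, only the column names come from the question:

```python
import pandas as pd

# Hypothetical sample data, including a missing value and a near-miss ("mPaXs").
df = pd.DataFrame({"column1": ["1.2 g/mL", None, "3 mPaXs", "3 mPa.s"]})
df["units"] = None  # start with an empty units column

units = ["g/mL", "mPa.s", "mN/m"]
for unit in units:
    # regex=False: match the unit literally; na=False: missing rows count as no match
    mask = df["column1"].str.contains(unit, regex=False, na=False)
    df.loc[mask, "units"] = unit
```

Rows that contain none of the units keep their empty value, and "3 mPaXs" is not mistaken for mPa.s because the dot is matched literally.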

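Or use Series.str.extract with a pattern built from all possible values:

import re
L = ['g/mL', 'mPa.s', 'mN/m']
# re.escape makes the '.' in 'mPa.s' match a literal dot
df['units'] = df['column1'].str.extract('(' + '|'.join(map(re.escape, L)) + ')', expand=False)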
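For reference, here is a self-contained sketch of the extract approach. The sample rows are invented; the escaping and the longest-first ordering are my own precautions rather than anything required by the question. Sorting longest first only matters if one unit is a prefix of another (say 'g/mL' and a hypothetical 'g/mL2'), since regex alternation takes the first alternative that matches.

```python
import re
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "column1": [
        "density 1.2 g/mL",
        "viscosity 3 mPa.s",
        "tension 72 mN/m",
        "no unit here",
        "5 mPaXs",
    ]
})

units = ["g/mL", "mPa.s", "mN/m"]

# Escape each unit so regex metacharacters such as '.' are matched literally,
# and try longer units first so a longer unit wins over a shorter prefix.
alternatives = sorted(units, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(u) for u in alternatives) + ")"

# expand=False returns a Series (one capture group) instead of a DataFrame.
df["units"] = df["column1"].str.extract(pattern, expand=False)
```

Rows without any of the units end up as NaN in units, so they are easy to find afterwards with df['units'].isna().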

Upvotes: 1

Erfan

Reputation: 42886

Use str.extract:

import pandas as pd

# example dataframe
df = pd.DataFrame({'column1':['this is test g/mL', 'this is test2 mPa.s', 'this is test3 mN/m']})

               column1
0    this is test g/mL
1  this is test2 mPa.s
2   this is test3 mN/m

# the dot is escaped so that it only matches a literal '.'
df['units'] = df['column1'].str.extract(r'(g/mL|mPa\.s|mN/m)')

               column1  units
0    this is test g/mL   g/mL
1  this is test2 mPa.s  mPa.s
2   this is test3 mN/m   mN/m

Upvotes: 1

gosuto

Reputation: 5741

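Actually, according to the docs, str.contains treats the pattern as a regular expression (regex=True, which is already the default), so the alternation works as written for the mask. Just escape the dot so it matches literally:

df.loc[df['column1'].str.contains(r'g/mL|mPa\.s|mN/m', regex=True),'units'] = ...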
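One possible way to supply the right-hand side is to pull the matched text out with str.extract and assign it only where the mask is true, which leaves any existing values in units untouched. This is a sketch under my own assumptions (made-up sample data, and a pre-existing units column with placeholder values), not necessarily what the asker's data looks like:

```python
import re
import pandas as pd

# Hypothetical data: 'units' already holds values we do not want to overwrite
# unless a known unit is actually found in 'column1'.
df = pd.DataFrame({
    "column1": ["1.2 g/mL", "no unit", "72 mN/m"],
    "units": ["?", "unknown", "?"],
})

units = ["g/mL", "mPa.s", "mN/m"]
pattern = "|".join(map(re.escape, units))  # escaped alternation

# Boolean mask of rows containing any unit.
mask = df["column1"].str.contains(pattern, regex=True)

# The matched unit for each row (NaN where nothing matched).
found = df["column1"].str.extract(f"({pattern})", expand=False)

# Assign only on matching rows; pandas aligns the values by index.
df.loc[mask, "units"] = found[mask]
```

If units does not exist yet or can simply be overwritten, the mask is unnecessary and the str.extract result can be assigned directly.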

Upvotes: 0
