Reputation: 596
I want to populate one column with a string (one of about a dozen candidates) whenever that string appears in another column.
Right now I do it by repeating the same line of code for each string. I'm looking for a more efficient way.
df.loc[df['column1'].str.contains('g/mL'),'units'] = 'g/mL'
df.loc[df['column1'].str.contains('mPa.s'),'units'] = 'mPa.s'
df.loc[df['column1'].str.contains('mN/m'),'units'] = 'mN/m'
I don't know how to check for all of them at once, something like
df.loc[df['column1'].str.contains('g/mL|mPa.s|mN/m'),'units'] = ...
and then set 'units' to whichever string was matched.
Upvotes: 0
Views: 87
Reputation: 862511
Use a loop with str.contains:
L = ['g/mL', 'mPa.s', 'mN/m']
for val in L:
    # regex=False treats val as a literal, so the '.' in 'mPa.s' is not a wildcard
    df.loc[df['column1'].str.contains(val, regex=False), 'units'] = val
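A small sketch of why the literal-versus-regex distinction matters for the loop above. `str.contains` interprets its pattern as a regular expression by default, so the dot in 'mPa.s' matches any character. The sample strings here are made up for illustration.

```python
import pandas as pd

# Hypothetical sample values: the second one is not a real unit.
s = pd.Series(["3 mPa.s", "3 mPaXs"])

# Default regex=True: '.' matches any character, so both rows match.
as_regex = s.str.contains("mPa.s")

# regex=False: the pattern is a plain substring, so only the first row matches.
as_literal = s.str.contains("mPa.s", regex=False)

print(as_regex.tolist(), as_literal.tolist())
```

For the three units in the question the difference is unlikely to show up, but it can once the list grows.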
Or use Series.str.extract with a list of all possible values:
import re
L = ['g/mL', 'mPa.s', 'mN/m']
# re.escape keeps regex metacharacters such as '.' literal
df['units'] = df['column1'].str.extract('(' + '|'.join(map(re.escape, L)) + ')')
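For reference, a self-contained sketch of the extract approach that can be run as-is. The sample DataFrame is an assumption made for illustration; `expand=False` is used so that `str.extract` returns a Series rather than a one-column DataFrame.

```python
import re
import pandas as pd

# Assumed sample data, not from the original question.
df = pd.DataFrame({"column1": ["density 1.2 g/mL",
                               "viscosity 3 mPa.s",
                               "tension 72 mN/m"]})

units = ["g/mL", "mPa.s", "mN/m"]

# Build one alternation pattern with a single capture group,
# escaping each unit so that '.' is matched literally.
pattern = "(" + "|".join(re.escape(u) for u in units) + ")"

# extract returns the first match of the capture group in each row.
df["units"] = df["column1"].str.extract(pattern, expand=False)
print(df)
```

This does the whole assignment in one vectorised call, so adding another unit only means adding it to the list.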
Upvotes: 1
Reputation: 42886
Use str.extract:
import pandas as pd
# example dataframe
df = pd.DataFrame({'column1':['this is test g/mL', 'this is test2 mPa.s', 'this is test3 mN/m']})
               column1
0    this is test g/mL
1  this is test2 mPa.s
2   this is test3 mN/m
# the '.' is escaped so it only matches a literal dot
df['units'] = df['column1'].str.extract(r'(g/mL|mPa\.s|mN/m)')
               column1  units
0    this is test g/mL   g/mL
1  this is test2 mPa.s  mPa.s
2   this is test3 mN/m   mN/m
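One thing worth knowing about this approach: rows in which none of the units appear get a missing value in 'units'. A minimal sketch, assuming a made-up row with no unit and a hypothetical placeholder label 'unknown':

```python
import pandas as pd

# Assumed sample data: the second row contains none of the units.
df = pd.DataFrame({"column1": ["this is test g/mL", "no unit in this row"]})

df["units"] = df["column1"].str.extract(r"(g/mL|mPa\.s|mN/m)", expand=False)

# Rows without a match come back as missing values.
missing = df["units"].isna()

# Optionally replace them with a placeholder (the label is an arbitrary choice).
df["units"] = df["units"].fillna("unknown")
print(df)
```

Whether to fill the gaps or leave them missing depends on what happens to the column afterwards.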
Upvotes: 1