Reputation: 592
I have the following DataFrame:
import re
import pandas as pd
df = pd.DataFrame({'Id_email': [1, 2, 3, 4],
                   'Word': ['_ SENSOR 12', 'new_SEN041', 'engine', 'sens 12'],
                   'Date': ['2018-01-05', '2018-01-06', '2017-01-06', '2018-01-05']})
print(df)
I would like to scan the 'Word' column for variants of the word Sensor.
Where a variant is found, I want to set a new column 'Type' to Sensor_Type; where it is not found, I want to set 'Type' to Other for that row.
I tried to implement it as follows (this code is wrong):
df['Type'] = 'Other'
for i in range(0, len(df)):
    if (re.search('\\SEN\\b', df['Word'].iloc[i], re.IGNORECASE) or
            re.search('\\sen\\b', df['Word'].iloc[i], re.IGNORECASE)):
        df['Type'].iloc[i] == 'Sensor_Type'
    else:
        df['Type'].iloc[i] == 'Other'
My (wrong) output is as follows:
Id_email  Word         Date        Type
1         _ SENSOR 12  2018-01-05  Other
2         new_SEN041   2018-01-06  Other
3         engine       2017-01-06  Other
4         sens 12      2018-01-05  Other
But I would like the output to look like this:
Id_email  Word         Date        Type
1         _ SENSOR 12  2018-01-05  Sensor_Type
2         new_SEN041   2018-01-06  Sensor_Type
3         engine       2017-01-06  Other
4         sens 12      2018-01-05  Sensor_Type
Upvotes: 0
Views: 114
Reputation: 16172
import re
df['Type'] = df.apply(lambda x: 'Sensor_Type' if re.search(r'SEN|sen', x['Word']) else 'Other', axis=1)
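For what it's worth, the loop in the question appears to fail for two separate reasons. The string '\\SEN\\b' is the regex \SEN\b, meaning one non-whitespace character, then EN, then a word boundary, and none of the four sample strings satisfy that. On top of that, == compares rather than assigns, so 'Type' would never change even on a match. Below is a minimal sketch of a case-insensitive variant of the one-liner above, assuming a plain substring match on "sen" is what is wanted; the names words, broken, and pattern are only for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'Id_email': [1, 2, 3, 4],
                   'Word': ['_ SENSOR 12', 'new_SEN041', 'engine', 'sens 12'],
                   'Date': ['2018-01-05', '2018-01-06', '2017-01-06', '2018-01-05']})

words = df['Word'].tolist()

# The pattern from the question: \S + "EN" + word boundary.
# It matches none of the sample strings.
broken = [bool(re.search('\\SEN\\b', w, re.IGNORECASE)) for w in words]
print(broken)

# Assumed intent: any occurrence of "sen", regardless of case.
pattern = re.compile(r'sen', re.IGNORECASE)

# Assign with "=" (not "==") so the column is actually written.
df['Type'] = df['Word'].apply(
    lambda w: 'Sensor_Type' if pattern.search(w) else 'Other')
print(df)
```

Keep in mind that a bare "sen" also matches unrelated words such as "send", so the pattern may need tightening for real data.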
Upvotes: 1
Reputation: 28729
Use pandas' str.contains with case=False. This makes the match case-insensitive, so it finds sen, SEN, and any mixed-case variant.
import numpy as np
df.assign(Type=lambda x: np.where(x.Word.str.contains(r'SEN', case=False),
                                  'Sensor_Type', 'Other'))
   Id_email         Word        Date         Type
0         1  _ SENSOR 12  2018-01-05  Sensor_Type
1         2   new_SEN041  2018-01-06  Sensor_Type
2         3       engine  2017-01-06        Other
3         4      sens 12  2018-01-05  Sensor_Type
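Note that df.assign returns a new DataFrame and leaves df itself unchanged, so the result has to be stored somewhere. Here is a small self-contained sketch that writes the column back onto df instead, assuming that missing values in 'Word' should be labelled Other (na=False and the name mask are my additions, not part of the snippet above).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id_email': [1, 2, 3, 4],
                   'Word': ['_ SENSOR 12', 'new_SEN041', 'engine', 'sens 12'],
                   'Date': ['2018-01-05', '2018-01-06', '2017-01-06', '2018-01-05']})

# Boolean Series: True where 'Word' contains "sen" in any letter case.
# na=False treats missing values as non-matches (an assumption).
mask = df['Word'].str.contains('sen', case=False, na=False)

# Vectorized choice between the two labels, stored on df directly.
df['Type'] = np.where(mask, 'Sensor_Type', 'Other')
print(df)
```

This avoids both the explicit loop and the chained df['Type'].iloc[i] indexing from the question.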
Upvotes: 3