Reputation: 794
I have a data frame that looks like the following:
variable value
0 TrafficIntensity_end 217.0
1 TrafficIntensity_end+105 213.0
2 TrafficIntensity_end+120 204.0
3 TrafficIntensity_end+15 489.0
4 TrafficIntensity_end+30 479.0
5 TrafficIntensity_end+45 453.0
6 TrafficIntensity_end+60 387.0
7 TrafficIntensity_end+75 303.0
8 TrafficIntensity_end+90 221.0
9 pred_rf_end+15 545.0
10 pred_rf_end 244.0
11 pred_rf_end+30 448.0
12 pred_rf_end+45 408.0
13 pred_rf_end+60 363.0
14 pred_rf_end+75 305.0
15 pred_rf_end+90 199.0
16 pred_rf_end+105 181.0
17 pred_rf_end+120 163.0
I want to create a new column based on what the string in the 'variable'
column contains. I have the following code:
def classify(row):
    if row['variable'].str.contains('TrafficIntensity'):
        return 'Real Traffic Intensity'
    elif row['variable'].str.contains('pred_rf_end'):
        return 'Predicited Value'

a['category'] = a.apply(classify, axis=1)
However, this gives me the following error:
AttributeError: ("'str' object has no attribute 'str'", 'occurred at index 0')
Why does this happen and how can I fix it? Thanks!
Upvotes: 1
Views: 723
Reputation: 862771
Use numpy.select:
import numpy as np

m1 = a['variable'].str.contains('TrafficIntensity')
m2 = a['variable'].str.contains('pred_rf_end')
# Rows matching neither mask keep their original 'variable' value (the default).
a['category'] = np.select([m1, m2],
                          ['Real Traffic Intensity', 'Predicited Value'],
                          a['variable'])
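To make the numpy.select approach easy to try, here is a minimal self-contained sketch. The small frame below reuses a few rows from the question; the `other_column` row and its value are hypothetical, added only to show what the default does for non-matching rows.

```python
import numpy as np
import pandas as pd

# A few rows copied from the question, plus one hypothetical non-matching row.
a = pd.DataFrame({
    "variable": [
        "TrafficIntensity_end",
        "TrafficIntensity_end+15",
        "pred_rf_end",
        "pred_rf_end+15",
        "other_column",  # hypothetical, not in the original data
    ],
    "value": [217.0, 489.0, 244.0, 545.0, 0.0],
})

# Boolean masks computed on the whole column at once (vectorized).
m1 = a["variable"].str.contains("TrafficIntensity")
m2 = a["variable"].str.contains("pred_rf_end")

# Where m1 is True take the first label, where m2 is True take the second,
# otherwise fall back to the original 'variable' value.
a["category"] = np.select(
    [m1, m2],
    ["Real Traffic Intensity", "Predicited Value"],
    default=a["variable"],
)

categories = a["category"].tolist()
print(categories)
```

The conditions are checked in order, so if a row matched both masks it would get the first label.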
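The error happens because apply with axis=1 passes each row as a Series, so row['variable'] is already a plain Python string, which has no .str accessor. Your solution, changed to test the scalar with the in operator: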
def classify(x):
    if 'TrafficIntensity' in x:
        return 'Real Traffic Intensity'
    elif 'pred_rf_end' in x:
        return 'Predicited Value'
    else:
        return x

a['category'] = a['variable'].apply(classify)
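As a rough illustration of where the AttributeError comes from, the sketch below checks what a row-wise apply actually hands to the function. The two-row frame is a cut-down copy of the question's data, and the helper variable names are only for this example.

```python
import pandas as pd

# Cut-down copy of the question's frame (two rows are enough here).
a = pd.DataFrame({
    "variable": ["TrafficIntensity_end", "pred_rf_end+15"],
    "value": [217.0, 545.0],
})

# With axis=1 each row arrives as a Series, so row["variable"] is a single cell.
cell_types = a.apply(lambda row: type(row["variable"]).__name__, axis=1).tolist()

# The .str accessor lives on the column (a Series), not on an individual cell.
cell_has_accessor = hasattr(a["variable"].iloc[0], "str")
column_has_accessor = hasattr(a["variable"], "str")

print(cell_types, cell_has_accessor, column_has_accessor)
```

So inside a row-wise function, plain string operations such as `in` are the right tool, while `.str.contains` belongs on the whole column.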
Upvotes: 2