borisvanax

Reputation: 794

create new column in pandas raises AttributeError: ("'str' object has no attribute 'str'", 'occurred at index 0')

I have a data frame that looks like the following:

                  variable              value
0           TrafficIntensity_end        217.0
1           TrafficIntensity_end+105    213.0
2           TrafficIntensity_end+120    204.0
3           TrafficIntensity_end+15     489.0
4           TrafficIntensity_end+30     479.0
5           TrafficIntensity_end+45     453.0
6           TrafficIntensity_end+60     387.0
7           TrafficIntensity_end+75     303.0
8           TrafficIntensity_end+90     221.0
9           pred_rf_end+15              545.0
10          pred_rf_end                 244.0
11          pred_rf_end+30              448.0
12          pred_rf_end+45              408.0
13          pred_rf_end+60              363.0
14          pred_rf_end+75              305.0
15          pred_rf_end+90              199.0
16          pred_rf_end+105             181.0
17          pred_rf_end+120             163.0

I want to create a new column based on what the string in the 'variable' column contains. I have the following code:

def classify(row):
    if row['variable'].str.contains('TrafficIntensity'):
        return 'Real Traffic Intensity'
    elif row['variable'].str.contains('pred_rf_end'):
        return 'Predicited Value'

a['category'] = a.apply(classify, axis=1)

However, this gives me the following error:

AttributeError: ("'str' object has no attribute 'str'", 'occurred at index 0')

Why does this happen and how can I fix it? Thanks!

Upvotes: 1

Views: 723

Answers (1)

jezrael

Reputation: 862771

Use numpy.select:

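import numpy as np

# Boolean masks built from the same DataFrame (a) that receives the new column
m1 = a['variable'].str.contains('TrafficIntensity')
m2 = a['variable'].str.contains('pred_rf_end')

# Rows matching neither mask fall back to their original 'variable' value
a['category'] = np.select([m1, m2],
                          ['Real Traffic Intensity', 'Predicited Value'],
                          a['variable'])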
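
For reference, here is a minimal self-contained sketch of the numpy.select approach. The frame is a small stand-in for the one in the question, and the 'other_col' row is a hypothetical addition of mine, included only to show what the default argument does for rows that match neither mask. The labels are copied verbatim from the question.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's frame; 'other_col' is a made-up row
# that matches neither pattern, to show the fallback value.
a = pd.DataFrame({
    "variable": [
        "TrafficIntensity_end",
        "TrafficIntensity_end+15",
        "pred_rf_end",
        "pred_rf_end+30",
        "other_col",
    ],
    "value": [217.0, 489.0, 244.0, 448.0, 1.0],
})

# Column-level (vectorized) substring tests: each returns a boolean Series.
m1 = a["variable"].str.contains("TrafficIntensity")
m2 = a["variable"].str.contains("pred_rf_end")

# The first matching condition wins; unmatched rows take the default.
a["category"] = np.select(
    [m1, m2],
    ["Real Traffic Intensity", "Predicited Value"],
    default=a["variable"],
)

print(a)
```

Passing the original column as the default keeps unmatched rows readable. If you would rather flag them, a fixed string such as 'Unknown' should work as the default too.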

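Your solution fails because, with axis=1, row['variable'] is a single Python string rather than a Series, so it has no .str accessor. Test the scalar with the in operator instead: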

def classify(x):
    if 'TrafficIntensity' in x:
        return 'Real Traffic Intensity'
    elif 'pred_rf_end' in x:
        return 'Predicited Value'
    else:
        return x

a['category'] = a['variable'].apply(classify)
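
To see where the AttributeError comes from, the sketch below compares the column with a single cell of a row. It uses a two-row stand-in for the question's frame. The name first_row is mine, for illustration only.

```python
import pandas as pd

# Two-row stand-in for the frame in the question.
a = pd.DataFrame({
    "variable": ["TrafficIntensity_end", "pred_rf_end+15"],
    "value": [217.0, 545.0],
})

# The whole column is a Series, so the vectorized .str accessor exists.
print(type(a["variable"]))

# With apply(..., axis=1) the function receives one row at a time, and a
# single cell of that row is a plain Python str with no .str attribute.
first_row = a.iloc[0]
print(type(first_row["variable"]))
print(hasattr(first_row["variable"], "str"))


def classify(x):
    # x is a scalar string here, so the plain `in` operator is enough.
    if "TrafficIntensity" in x:
        return "Real Traffic Intensity"
    elif "pred_rf_end" in x:
        return "Predicited Value"
    else:
        return x


# Applying over the column passes each cell (a str) to classify.
a["category"] = a["variable"].apply(classify)
print(a)
```

So .str belongs on the column (a Series), while inside a row-wise or element-wise function you are already holding the string itself.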

Upvotes: 2
