caston1414

Reputation: 35

Creating a new column based on a condition

I have a dataframe in the form of:

 weekday           station   num_bikes  num_racks  hour
 no               Girwood   5           6         8
 yes              Girwood   6           5         12
 yes              Girwood   2           9         6
 no               Girwood   9           2         18
 yes              Fraser    0           14        16

I am trying to create a new column called rush hour based on the values of the hour and weekday columns. The code I have used is:

df.loc[(df['hour'] <7) , 'Rush_hour?'] = 'No'
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-am' 
df.loc[(df['hour']>=7) & (df['hour']<=10) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No' 
df.loc[(df['hour']>10) & (df['hour'] <15) , 'Rush_hour?'] = 'No' 
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'yes'), 'Rush hour?'] = ' Yes-pm' 
df.loc[(df['hour'] >=15) & (df['hour']<=18) & (df['weekday?'] == 'no'), 'Rush hour?'] = ' No' 
df.loc[(df['hour']>18) , 'Rush_hour?'] = 'No' 

When I run this code I get NaN. Can anyone suggest what is wrong with my code?

Upvotes: 0

Views: 123

Answers (2)

Nora_F

Reputation: 451

Do this:

# Initialize a list to hold one label per row:
aList = []

# Loop over all rows and check whatever you want with if-elif-else:
for i in range(len(df)):
    h = df['hour'].iloc[i]
    w = df['weekday'].iloc[i]  # column name as shown in the question's dataframe

    if h < 7:
        aList.append('No')
    elif 7 <= h <= 10 and w == 'yes':
        aList.append('Yes-am')
    else:
        aList.append('blah blah')
    # ....
# Create a new column and assign the list to it:
df['Rush hour'] = aList
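
As a rough sketch of how the remaining branches could be filled in, the rules from the question can go into one small function that is applied row by row. The helper name `rush_hour_label` is made up for illustration, and the sample frame assumes the column is called `weekday`, as in the question's printed dataframe.

```python
import pandas as pd


def rush_hour_label(hour, weekday):
    """Hypothetical helper: label a single row using the rules from the question."""
    if weekday == 'yes' and 7 <= hour <= 10:
        return 'Yes-am'
    if weekday == 'yes' and 15 <= hour <= 18:
        return 'Yes-pm'
    # Everything else (weekends and off-peak hours) is not rush hour.
    return 'No'


# Sample data copied from the question (only the columns the rules need).
df = pd.DataFrame({
    'weekday': ['no', 'yes', 'yes', 'no', 'yes'],
    'hour': [8, 12, 6, 18, 16],
})

# Build one label per row and assign the whole list as a new column.
df['Rush hour'] = [rush_hour_label(h, w) for h, w in zip(df['hour'], df['weekday'])]
print(df)
```

Because every row passes through the same function, no row can be left without a value.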

Upvotes: 1

Scott Boston

Reputation: 153460

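Your column names are inconsistent:

"Rush hour?" vs "Rush_hour?" and "weekday" vs "weekday?"

Each spelling of the target name creates its own column, and each of those columns is filled only for the rows its conditions match, so every other row is left as NaN.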

Try this:

df=df.rename(columns={'weekday':'weekday?'})

df.loc[(df['hour'] < 7), 'Rush hour?'] = 'No'
df.loc[(df['hour'] >= 7) & (df['hour'] <= 10) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-am'
df.loc[(df['hour'] >= 7) & (df['hour'] <= 10) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No'
df.loc[(df['hour'] > 10) & (df['hour'] < 15), 'Rush hour?'] = 'No'
df.loc[(df['hour'] >= 15) & (df['hour'] <= 18) & (df['weekday?'] == 'yes'), 'Rush hour?'] = 'Yes-pm'
df.loc[(df['hour'] >= 15) & (df['hour'] <= 18) & (df['weekday?'] == 'no'), 'Rush hour?'] = 'No'
df.loc[(df['hour'] > 18), 'Rush hour?'] = 'No'

df

Output:

  weekday?  station  num_bikes  num_racks  hour Rush hour?
0       no  Girwood          5          6     8         No
1      yes  Girwood          6          5    12         No
2      yes  Girwood          2          9     6         No
3       no  Girwood          9          2    18         No
4      yes   Fraser          0         14    16     Yes-pm

Assuming your logic is correct.
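
If the rules are as described, the same result could also be built in a single assignment with `numpy.select`, which avoids repeating the target column name and so rules out that kind of typo. This is only a sketch of an alternative: the variable names `is_weekday`, `am_rush` and `pm_rush` are made up here, and the frame keeps the original `weekday` column name from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'weekday': ['no', 'yes', 'yes', 'no', 'yes'],
    'station': ['Girwood', 'Girwood', 'Girwood', 'Girwood', 'Fraser'],
    'num_bikes': [5, 6, 2, 9, 0],
    'num_racks': [6, 5, 9, 2, 14],
    'hour': [8, 12, 6, 18, 16],
})

is_weekday = df['weekday'] == 'yes'
# Series.between is inclusive on both ends by default.
am_rush = is_weekday & df['hour'].between(7, 10)
pm_rush = is_weekday & df['hour'].between(15, 18)

# Rows matching neither condition fall through to the default, so no NaN is left.
df['Rush hour?'] = np.select([am_rush, pm_rush], ['Yes-am', 'Yes-pm'], default='No')
print(df)
```

Only the two "yes" cases need a condition here, since everything else maps to 'No'.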

Upvotes: 1
