Rob

Reputation: 405

Python pandas: appending information to row while looping through dataframe

I would like to know a better way to append information to a dataframe while looping over it. Specifically, I want to add columns to a dataframe conditionally. The code below technically works, but it is sloppy and, more importantly, information such as the data type of each cell is lost because everything is converted to a string. Any tips would be great.

import numpy as np
import pandas as pd

raw_data = {'first_name': ['John', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'age': [42, 20, 16, 24, '']}
df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'age'])
headers = df.columns.values
count = 0
for index, row in df.iterrows():
    count += 1
    if row['age'] > 18:
        adult = True
    else:
        adult = False
    headers = np.append(headers,'ADULT')
    vals = np.append(row.values,adult)
    if count == 1:
        print ','.join(headers.tolist())
        print str(vals.tolist()).replace('[','').replace(']','').replace("'","")
    else:
        print str(vals.tolist()).replace('[','').replace(']','').replace("'","")

Upvotes: 2

Views: 225

Answers (1)

sacuL

Reputation: 51425

This seems to give your desired outcome (at least, it's the same outcome as your loop):

df['ADULT'] = np.where(pd.to_numeric(df.age) > 18, True, False)

>>> df
  first_name last_name age  ADULT
0       John    Miller  42   True
1      Molly  Jacobson  20   True
2       Tina       Ali  16  False
3       Jake    Milner  24   True
4        Amy     Cooze      False

As pointed out by @Wen, this is much more straightforward:

df['ADULT'] = pd.to_numeric(df.age) > 18
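
For completeness, here is a minimal end-to-end sketch of the same idea that also produces the comma-separated output the original loop printed. It rebuilds the question's dataframe, adds the boolean column without a loop, and lets `to_csv` do the formatting. Passing `errors='coerce'` is my own assumption here, added so that the empty age is explicitly turned into NaN; the one-liner above does not use it.

```python
import pandas as pd

raw_data = {
    'first_name': ['John', 'Molly', 'Tina', 'Jake', 'Amy'],
    'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
    'age': [42, 20, 16, 24, ''],
}
df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'age'])

# The empty string makes 'age' an object column, so convert it first.
# errors='coerce' (an assumption, not in the answer above) maps anything
# unparseable to NaN, and NaN > 18 evaluates to False.
age = pd.to_numeric(df['age'], errors='coerce')

# Vectorized comparison: one boolean per row, stored as a real bool column.
df['ADULT'] = age > 18

# Header plus rows as comma-separated text, without manual string handling.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Because `ADULT` is stored as a boolean column rather than built up as text, `df.dtypes` still reports a proper type for it, which is the part the loop version loses.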

Upvotes: 2
