Reputation: 60093
I read CSV data into pandas as shown below; each student gets one score per day. I want to add an extra column, "all_attendance", as an extra score.
import pandas as pd
import numpy as np
data = np.array([['','day1','day2','day3','day4','day5'],
['larry',1,4,7,3,5],
['niko',2,-1,3,np.nan,4],
['tin',np.nan,5,5, 6,7]])
df = pd.DataFrame(data=data[1:,1:],
index=data[1:,0],
columns=data[0,1:])
print(df)
Output:
      day1 day2 day3 day4 day5
larry    1    4    7    3    5
niko     2   -1    3  nan    4
tin    nan    5    5    6    7
I want to get the result below: 1 if the student has a score for every day, 0 if any nan exists.
      day1 day2 day3 day4 day5  all_attendance
larry    1    4    7    3    5               1
niko     2   -1    3  nan    4               0
tin    nan    5    5    6    7               0
Upvotes: 1
Views: 106
Reputation: 4253
data = np.array([['','day1','day2','day3','day4','day5'],
['larry',1,4,7,3,5],
['niko',2,-1,3,np.nan,4],
['tin',np.nan,5,5, 6,7]])
df = pd.DataFrame(data=data[1:,1:],
index=data[1:,0],
columns=data[0,1:])
columns=df.columns
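# Replace every 'nan' string with 0 as a marker for a missing score
for key,item in df.iterrows():
    for column in columns:
        if item[column]=='nan':
            df.loc[key,column]=0
# Convert the string scores to integers (the result must be assigned back)
df = df.astype(int)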
print(df)
df['all_attendance']=0
for key,row in df.iterrows():
    found=0
    for value in row[columns]:
        if value==0:
            found=1
            break
    if found==1:
        df.loc[key,'all_attendance']=0
    else:
        df.loc[key,'all_attendance']=1
print(df)
output:
      day1 day2 day3 day4 day5  all_attendance
larry    1    4    7    3    5               1
niko     2   -1    3    0    4               0
tin      0    5    5    6    7               0
Upvotes: -1
Reputation: 75110
You can replace the string 'nan' with np.nan and then check whether every column in a row is notna, using df.all() on axis=1:
df['all_attendance'] = df.replace('nan',np.nan).notna().all(1).astype(int)
Or:
df['all_attendance'] = df.ne('nan').all(1).astype(int)
      day1 day2 day3 day4 day5  all_attendance
larry    1    4    7    3    5               1
niko     2   -1    3  nan    4               0
tin    nan    5    5    6    7               0
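To make the chained one-liner easier to follow, here is a minimal sketch that splits it into named steps. The intermediate names (cleaned, present, full_rows) are only illustrative and are not part of the original answer; the DataFrame is built exactly as in the question, so missing scores are the text 'nan'.

```python
import numpy as np
import pandas as pd

# Same construction as in the question: the mixed-type array is coerced to
# strings, so missing scores end up as the text 'nan'.
data = np.array([['', 'day1', 'day2', 'day3', 'day4', 'day5'],
                 ['larry', 1, 4, 7, 3, 5],
                 ['niko', 2, -1, 3, np.nan, 4],
                 ['tin', np.nan, 5, 5, 6, 7]])
df = pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:])

# Step 1: turn the 'nan' strings into real missing values.
cleaned = df.replace('nan', np.nan)
# Step 2: boolean frame, True where a score is present.
present = cleaned.notna()
# Step 3: boolean Series, True only for rows where every day has a score.
full_rows = present.all(axis=1)
# Step 4: store the flag as 1/0 in the new column.
df['all_attendance'] = full_rows.astype(int)
print(df)
```

Because the check runs on whole columns at once, it should scale better than a row-by-row loop on larger tables.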
Upvotes: 2
Reputation: 256
You can use an apply() function to achieve this result. Please see below:
def f(row):
    if 'nan' in row.values:
        return 0
    else:
        return 1
df['all_attendance'] = df.apply(f, axis=1)
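Note that the check above matches the text 'nan', which is what the question's np.array construction produces. If the data comes from pd.read_csv, missing cells are normally real NaN values instead, and a string comparison will not find them. The following is a sketch of the same apply() idea for that case; the scores frame and the attended_every_day helper are hypothetical names used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric version of the question's table, with real NaN values
# as you would typically get from pd.read_csv.
scores = pd.DataFrame(
    {'day1': [1, 2, np.nan],
     'day2': [4, -1, 5],
     'day3': [7, 3, 5],
     'day4': [3, np.nan, 6],
     'day5': [5, 4, 7]},
    index=['larry', 'niko', 'tin'])

def attended_every_day(row):
    """Return 1 if the row has no missing score, otherwise 0."""
    return int(row.notna().all())

# axis=1 passes each row to the function as a Series.
scores['all_attendance'] = scores.apply(attended_every_day, axis=1)
print(scores)
```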
Upvotes: 0