Reputation: 85
Is there a way, for the Less, Middle and Greater columns, to replace any value greater than 0 with 1? The top row would then become:
Conc Less Middle Greater
Date
2005-03-02 00:00 10.3 0.000000 1 1
This is the original DataFrame:
Conc Less Middle Greater
Date
2005-03-02 00:00 10.3 0.000000 0.083333 0.916667
2005-03-02 01:00 14.1 0.000000 0.750000 0.250000
2005-03-02 02:00 7.0 0.000000 0.833333 0.166667
2005-03-02 03:00 7.0 0.000000 1.000000 0.000000
2005-03-02 04:00 7.2 0.000000 1.000000 0.000000
2005-03-02 06:00 6.6 0.333333 0.666667 0.000000
2005-03-02 07:00 6.6 0.416667 0.583333 0.000000
I've tried:
df.loc[df['Less']>0:]=1
df.loc[df['Less']==0:]=0
but that shows up in red and prints False/True (in the correct places), followed by: dtype: bool, None, None)
I also tried looping through like:
for line in df['Less']:
    if df['Less'] > 0:
        df['Less'] = 1
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 1
Views: 1691
Reputation: 7903
To do them all at once:
import numpy as np
columns = ['Less', 'Middle', 'Greater']
df[columns] = np.where(df[columns] > 0, 1, 0)
Or to do it individually (note the double brackets for selecting columns):
df[['Less']] = np.where(df[['Less']] > 0, 1, 0)
df[['Middle']] = np.where(df[['Middle']] > 0, 1, 0)
df[['Greater']] = np.where(df[['Greater']] > 0, 1, 0)
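For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the all-at-once version. The construction of df is my own assumption about how the data is laid out (a datetime index named Date); only the last two lines come from the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the way the frame is built is assumed.
df = pd.DataFrame(
    {
        "Conc": [10.3, 14.1, 7.0, 7.0, 7.2, 6.6, 6.6],
        "Less": [0.0, 0.0, 0.0, 0.0, 0.0, 0.333333, 0.416667],
        "Middle": [0.083333, 0.75, 0.833333, 1.0, 1.0, 0.666667, 0.583333],
        "Greater": [0.916667, 0.25, 0.166667, 0.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(
        [
            "2005-03-02 00:00", "2005-03-02 01:00", "2005-03-02 02:00",
            "2005-03-02 03:00", "2005-03-02 04:00", "2005-03-02 06:00",
            "2005-03-02 07:00",
        ]
    ),
)
df.index.name = "Date"

columns = ["Less", "Middle", "Greater"]
# np.where returns a 2-D array: 1 where the value is positive, 0 elsewhere.
df[columns] = np.where(df[columns] > 0, 1, 0)
print(df)
```

Conc is left untouched because it is not in the columns list. Note that this also forces every non-positive value to 0, which matches the second line of the question's attempt.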
Upvotes: 0
Reputation: 394041
You can use loc with a boolean condition:
In [250]:
df.loc[df['Less'] > 0, 'Less'] = 1
df
Out[250]:
Conc Less Middle Greater
Date
2005-03-02 00:00:00 10.3 0.0 0.083333 0.916667
2005-03-02 01:00:00 14.1 0.0 0.750000 0.250000
2005-03-02 02:00:00 7.0 0.0 0.833333 0.166667
2005-03-02 03:00:00 7.0 0.0 1.000000 0.000000
2005-03-02 04:00:00 7.2 0.0 1.000000 0.000000
2005-03-02 06:00:00 6.6 1.0 0.666667 0.000000
2005-03-02 07:00:00 6.6 1.0 0.583333 0.000000
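This df.loc[df['Less']>0:] is not a valid way to index: the colon turns the indexer into a slice whose start is the boolean Series, which is why the message ends in "dtype: bool, None, None)". You want to use a comma and pass the column name, or a list of column names, of interest.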
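To cover all three columns with the same loc pattern, one option is a short loop over the column names. This is only a sketch: the small df below is a cut-down stand-in for the question's data, and the loop is my own extension of the single-column line shown above.

```python
import pandas as pd

# Cut-down stand-in for the question's frame (values taken from the question).
df = pd.DataFrame(
    {
        "Conc": [10.3, 7.0, 6.6],
        "Less": [0.0, 0.0, 0.333333],
        "Middle": [0.083333, 1.0, 0.666667],
        "Greater": [0.916667, 0.0, 0.0],
    }
)

columns = ["Less", "Middle", "Greater"]
for col in columns:
    # Row selector before the comma, column label after it.
    df.loc[df[col] > 0, col] = 1
print(df)
```

Iterating over three column names is cheap; the row-wise work inside each step is still vectorized by pandas.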
Your for loop version:
for line in df['Less']:
    if df['Less'] > 0:
        df['Less'] = 1
is invalid because if doesn't know how to interpret an array of boolean values, hence the error. If you wrote if (df['Less'] > 0).all() or if (df['Less'] > 0).any() it would run, but it still wouldn't make sense: you're iterating row by row yet testing the entire column on every iteration, which is wasteful.
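The error is easy to reproduce in isolation. The sketch below uses a small made-up Series (the name less and its values are assumptions for illustration) to show that a bare if on a boolean Series raises, while any() and all() each reduce it to a single value.

```python
import pandas as pd

# Hypothetical stand-in for df['Less'].
less = pd.Series([0.0, 0.0, 0.333333, 0.416667], name="Less")
mask = less > 0  # boolean Series, one entry per row

try:
    if mask:  # asks for the truth value of the whole Series
        pass
    raised = False
except ValueError:
    raised = True

any_positive = bool(mask.any())  # at least one value is greater than 0
all_positive = bool(mask.all())  # every value is greater than 0
print(raised, any_positive, all_positive)
```

Neither reduction is what the question needs, though: the per-row replacement is done by using the mask as an indexer, as in the loc line above.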
Upvotes: 2