Reputation: 11
Can someone help me out? I am getting
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
from the following code:
import pandas as pd
testdf = pd.read_csv('../../IBM.csv')
print testdf
print "------------"
testdf['NHigh'] = 0
print testdf
if testdf['Close'] > testdf['Open']:
    testdf['Nhigh'] = testdf['Close'] * testdf['High']
print "********"
print testdf
What I am trying to do is create a new column populated by values from two existing columns but only if a condition is true.
The dataframe holds stock data with the columns Open, High, Low, Close, etc., and I want to add a new column (NHigh) based on an operation between, say, Close and High, if Close is greater than High for that row.
Thanks if you can help....
Reputation: 862661
I think you can use loc and fillna:
print testdf
                       Open    High     Low   Close  Volume
Date_Time
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75
print testdf['Close'] > testdf['Open']
Date_Time
1997-02-03 09:04:00 True
1997-02-03 09:27:00 False
1997-02-03 09:28:00 True
1997-02-03 09:29:00 False
1997-02-03 09:30:00 False
1997-02-03 09:31:00 True
dtype: bool
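This output is also why the original `if` fails: the comparison gives one True/False per row, while `if` needs a single truth value. Here is a minimal sketch of that, assuming a small hand-built frame in place of IBM.csv (three rows copied from the output above, index dropped) and using Python 3 syntax rather than the Python 2 `print` used in the post:

```python
import pandas as pd

# Hypothetical stand-in for IBM.csv: three rows copied from the output above.
testdf = pd.DataFrame({
    'Open':  [3046.0, 3043.5, 3043.0],
    'High':  [3048.5, 3043.5, 3044.0],
    'Close': [3047.5, 3043.0, 3044.0],
})

# Element-wise comparison: a boolean Series with one entry per row.
mask = testdf['Close'] > testdf['Open']

error_message = None
try:
    if mask:  # `if` asks the whole Series for ONE truth value
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# any()/all() do collapse the Series to a single bool, but they answer a
# whole-column question, not the per-row question asked here.
print(mask.tolist())
print(bool(mask.any()), bool(mask.all()))
```

So the `a.any()` / `a.all()` hint in the error message silences the exception, but it does not give row-wise behaviour; for that the mask has to be used for selection.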
testdf.loc[testdf['Close'] > testdf['Open'],'Nhigh'] = testdf['Close'] * testdf['High']
testdf['Nhigh'] = testdf['Nhigh'].fillna(0)
print testdf
                       Open    High     Low   Close  Volume       Nhigh
Date_Time
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505  9290303.75
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56        0.00
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32  9265936.00
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63        0.00
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28        0.00
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75  9275070.25
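For anyone who wants to run the loc and fillna steps without the CSV, here is a self-contained sketch in Python 3. The three-row frame is a hypothetical stand-in built from the rows shown above, and the `mask` variable is only introduced here for readability:

```python
import pandas as pd

# Hypothetical stand-in for IBM.csv: three rows copied from the output above.
testdf = pd.DataFrame({
    'Open':  [3046.0, 3043.5, 3043.0],
    'High':  [3048.5, 3043.5, 3044.0],
    'Close': [3047.5, 3043.0, 3044.0],
})

# Rows where the condition holds.
mask = testdf['Close'] > testdf['Open']

# Write the product only into the selected rows. The right-hand Series is
# aligned on the index, so unselected rows are left as NaN.
testdf.loc[mask, 'Nhigh'] = testdf['Close'] * testdf['High']

# Replace the NaN left in the unselected rows with 0.
testdf['Nhigh'] = testdf['Nhigh'].fillna(0)

print(testdf)
```

The fillna step is needed because the column does not exist before the loc assignment, so every row outside the mask starts out as NaN.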
Another solution uses numpy.where:
import numpy as np
testdf['Nhigh'] = np.where(testdf['Close'] > testdf['Open'], testdf['Close'] * testdf['High'], 0)
print testdf
                       Open    High     Low   Close  Volume       Nhigh
Date_Time
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505  9290303.75
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56        0.00
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32  9265936.00
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63        0.00
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28        0.00
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75  9275070.25
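The numpy.where version can be tried the same way. This is a sketch in Python 3 on a hypothetical three-row frame copied from the rows above; the `Series.where` line at the end is an extra pandas-only variant that is not part of the answer, included only to show that it should give the same column:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for IBM.csv: three rows copied from the output above.
testdf = pd.DataFrame({
    'Open':  [3046.0, 3043.5, 3043.0],
    'High':  [3048.5, 3043.5, 3044.0],
    'Close': [3047.5, 3043.0, 3044.0],
})

mask = testdf['Close'] > testdf['Open']
product = testdf['Close'] * testdf['High']

# np.where(condition, value_if_true, value_if_false), evaluated per element.
testdf['Nhigh'] = np.where(mask, product, 0)

# Pandas-only variant: keep `product` where mask is True, otherwise use 0.
nhigh_alt = product.where(mask, 0)

print(testdf)
print(nhigh_alt.tolist())
```

Unlike the loc approach, this fills the whole column in one step, so no fillna is needed afterwards.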