Reputation: 5488
I'm using python 3.5.1 and Pandas 0.18.0 and trying to use this notebook to modify financial tick data as the exercises are of interest to me:
I'm having issues with some of the commands and wondered if this is due to the versions of python and pandas?
For example:
This is the file I am reading in with associated output:
data = pd.read_csv('test30dayes2tickforpython.csv',index_col=0, header=0,parse_dates={"Timestamp" : [0,1]})
data.dtypes
Out[80]:
Open float64
High float64
Low float64
Last float64
Volume int64
NumberOfTrades int64
BidVolume int64
AskVolume int64
dtype: object
When I try to then create another object like this:
ticks = data.ix[:, ['High','Volume']]
ticks
I get NaN values:
High Volume
Timestamp
2015-12-27 23:00:25.000 NaN NaN
2015-12-27 23:01:11.000 NaN NaN
But if I use the column reference instead of names it works:
ticks = data.ix[:, [1,4]]
ticks
High Volume
Timestamp
2015-12-27 23:00:25.000 2045.25 1
2015-12-27 23:01:11.000 2045.50 2
Why is this?
Also, the notebook shows another object created:
bars = ticks.Price.resample('1min', how='ohlc')
bars
When I try this I get this error:
bars = ticks.High.resample('60min', how='ohlc')
bars
1 bars = ticks.High.resample('60min', how='ohlc')
AttributeError: 'DataFrame' object has no attribute 'High'
It works if I don't call the High column:
bars = ticks.resample('60min', how='ohlc')
bars
FutureWarning: how in .resample() is deprecated the new syntax is .resample(...).ohlc()
High Volume
open high low close open high low close
Timestamp
2015-12-27 23:00:00 2045.25 2047.75 2045.25 2045.25 1.0 7.0 1.0 5.0
What is the correct command for this please?
I appreciate the notebook is probably not valid for the version of Python/Pandas Im using but as a newbie it is very useful for me so would like to get it working on my data.
Upvotes: 1
Views: 2242
Reputation: 862601
There is problem spaces
in column names.
print (data.columns)
Index(['Timestamp', ' Open', ' High', ' Low', ' Last', ' Volume',
' NumberOfTrades', ' BidVolume', ' AskVolume'],
dtype='object')
You can strip
this spaces:
data.columns = data.columns.str.strip()
print (data.columns)
Index(['Timestamp', 'Open', 'High', 'Low', 'Last', 'Volume', 'NumberOfTrades',
'BidVolume', 'AskVolume'],
dtype='object')
ticks = data.ix[:, ['High','Volume']]
print (ticks.head())
High Volume
0 2045.25 1
1 2045.50 2
2 2045.50 2
3 2045.50 2
4 2045.50 2
Now you can use:
print (ticks.Price.resample('1min', how='ohlc'))
If you dont want remove spaces, add space to column name:
print (ticks[' Price'].resample('1min', how='ohlc'))
But better is use Resampler.ohlc
, if pandas
version higher as 0.18.0
:
print (ticks.Price.resample('1min').ohlc())
Upvotes: 2