Reputation: 21
I have a .csv file of financial tick data with 3 columns corresponding to date, time & price. The files has no header.
i.e
01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
...
I want to resample into Daily OHLC using pandas so i can import it into my charting software in the correct format.
I have only gotten so far as opening the file using:
data = pd.read_csv('data.csv')
Can you help me convert the data in the fomat i have into OHLC with pandas resample. Thanks
Upvotes: 0
Views: 2400
Reputation: 333
If it's still actual there is simplest way to do that in Pandas:
data.resample('1D').apply('ohlc')
Upvotes: 3
Reputation: 682
With Python, but without pandas:
#!/usr/bin/env python
import datetime
from decimal import Decimal
class Tick(object):
pass
ticks = []
with open('data.csv') as f:
ticksTemp = []
lines = [x.strip('\n') for x in f.readlines()]
for line in lines:
columns = [x.strip() for x in line.split(',')]
if len(columns) != 3:
continue;
timeStr = columns[0] + '/' + columns[1]
time = datetime.datetime.strptime(timeStr, "%m/%d/%y/%H:%M:%S" )
price = columns[2]
tick = Tick()
tick.time = time
tick.price = Decimal(price)
ticksTemp.append(tick)
ticks = sorted(ticksTemp, key = lambda x: x.time, reverse=False)
lines = []
first = ticks[0]
last = ticks[-1]
time = first.time
o,h,l,c = first.price, first.price, first.price, first.price
def appendLine():
lines.append(time.strftime('%Y-%m-%d')+','+str(o)+ ','+str(h)+','+str(l)+','+str(c))
for tick in ticks:
if(tick.time.year != time.year or tick.time.day != time.day):
appendLine()
time = tick.time
o = tick.price
c = tick.price
if tick.price > h:
h = tick.price
if tick.price < l:
l = tick.price
if last != first:
appendLine()
with open('ohlc.csv', 'w') as f:
f.write('\n'.join(lines))
data.csv:
01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
ohlc.csv:
2014-01-17,55.2,55.3,55.2,55.3
2014-01-18,55.4,55.4,55.0,55.0
Upvotes: 0