Rbdm
Rbdm

Reputation: 39

Aggregating financial candlestick data

I have 1 minute OHLCV Candlestick data, and I need to aggregate it to create 15m Candlesticks. The database comes from MongoDB; this is a version in clean Python:

def get_candela(self,tf):
    c = dict()
    candel = dict()
    candele_finale = list()
    prov_c = list()
    db = database("price_data", "1min_OHLC_XBTUSD")
    col = database.get_collection(db,"1min_OHLC_XBTUSD")
    db_candela = col.find({}, sort = [('timestamp', pymongo.ASCENDING)]).limit(20)
    candele = list(db_candela)
    timestamp_calc = list()
    open_calc = list()
    max_calc = list()
    min_calc = list()
    close_calc = list()
    vol_calc = list()
    #for _ in range(len(candele)):
    for a in range(tf):
        if len(candele) ==  0:
            break
        prov_c.append(candele[a])
        c.append(prov_c)
        candele[:tf]=[]
    for b in range(len(c)):
        cndl = c[b]
    for d in range(tf):
        print(cndl)
        cnd = cndl[d]
        #print(len(cnd))
        timestamp_calc.append(cnd["timestamp"])
        open_calc.append(cnd["open"])
        max_calc.append(cnd["high"])
        min_calc.append(cnd["low"])
        close_calc.append(cnd["close"])
        vol_calc.append(cnd["volume"])
        index_close=len(close_calc)
        candel["timestamp"] = timestamp_calc[d]
        candel["open"] = open_calc[0]
        candel["high"] = max(max_calc)
        candel["low"] = min(min_calc)
        candel["close"] = close_calc[index_close-1]
        candel["volume"] = sum(vol_calc)
        #print(candel)
        candele_finale.append(candel)
        max_calc.clear()
        min_calc.clear()
        vol_calc.clear()
    return candele_finale

This returns an array with only the last candlestick created. And this is another version in pandas:

db = database("price_data", "1min_OHLC_XBTUSD")
 col = database.get_collection(db,"1min_OHLC_XBTUSD")
 db_candela = col.find({}, sort = [('timestamp', pymongo.ASCENDING)]).limit(20)
 prov_c = list()


    for item in db_candela:
            cc={"timestamp":item["timestamp"],"open":item["open"],"high":item["high"],"low":item["low"],"close":item["close"],"volume":item["volume"]}
            prov_c.append(cc)
            print(prov_c)
        data = pandas.DataFrame([prov_c], index=[pandas.to_datetime(cc["timestamp"])])
            #print(data)
        df = data.resample('5T').agg({'timestamp':'first','open':'first','high':'max', 'low':'min','close' : 'last','volume': 'sum'})
        #print(data.mean())
        #with pandas.option_context('display.max_rows', None, 'display.max_columns',None):  # more options can be specified also
        pprint(df)

This returns a dataframe with weird/random values.

Upvotes: 0

Views: 1252

Answers (1)

5zorro
5zorro

Reputation: 31

I had the same question today, and I answered it. Basically there is a pandas function called resample that does all of the work for you.

Here is my code:

import json
import pandas as pd

#load the raw data and clean it up

data_json = open('./testapiresults.json') #load json object
data_dict = json.load(data_json)          #convert to a dict object
df = pd.DataFrame.from_dict(data_dict)    #convert to panel data's dataframe object

df['datetime'] = pd.to_datetime(df['datetime'],unit='ms') #convert from unix time (ms from epoch) to human time
df = df.set_index(pd.DatetimeIndex(df['datetime'])) #set time series as index format for resample function

#resample to aggregate to a different frequency of data

candle_summary = pd.DataFrame()
candle_summary['open'] = df.open.resample('15Min').first()
candle_summary['high'] = df.high.resample('15Min').max()
candle_summary['low'] = df.low.resample('15Min').min()
candle_summary['close'] = df.close.resample('15Min').last()

candle_summary.head()

I had to export to csv and recalculate it in excel to doublecheck that it was calculating correctly, but this works. I have not figured out the pandas.DataFrame.resample('30min').ohlc() function, but that looks like it could make some really elegant code if I could figure out how to make it work without tons of errors.

Upvotes: 1

Related Questions