Frank

Reputation: 61

How to get raw data from Alpaca Trade API BarSet?

I want to convert the results from get_barset in the alpaca_trade_api to a dict so that I can access the raw values.

{
    'AAPL': [
        Bar({   'c': 204.25,'h': 205.08,'l': 202.9,'o': 203.35,'t': 1562299200,'v': 14933941}), 
        Bar({   'c': 200.01,'h': 201.4,'l': 198.41,'o': 200.81,'t': 1562558400,'v': 21987224})
    ]
}
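import alpaca_trade_api as tradeapi
import pandas as pd

# key_id, secret_key and base_url are your Alpaca credentials and endpoint
api = tradeapi.REST(key_id, secret_key, base_url)

bar = api.get_barset('AAPL', 'day', limit=2)
df = pd.DataFrame.from_dict(bar)
print(df)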

The print output for this process is below. I'm not sure how to get the raw data from each Bar.

                                           AAPL
 0  Bar({   'c': 204.25,\n    'h': 205.08,\n    'l...
 1  Bar({   'c': 200.01,\n    'h': 201.4,\n    'l'...

I want my data to look like this in the end:

               c       h       l       o       t           v
bar1    204.25  205.08  202.9   203.35  1562299200  14933941
bar2    200.01  201.4   198.41  200.81  1562558400  21987224

Any and all help would be appreciated. Sorry if something is missing; this is my first post. Thank you for the help!

Upvotes: 0

Views: 3498

Answers (2)

gmdev

Reputation: 3155

Alpaca is still fairly new and, unfortunately, its documentation is sparse. First, the result of get_barset (a method of alpaca_trade_api.REST) is a BarSet. BarSet has a .df attribute, but neither the BarSet itself nor its .df gives direct access to the raw values.

Diving into the source code for the API's BarSet, which can be found here, we can see two things:

  1. BarSet is a subclass of dict
  2. BarSet has a _raw attribute

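We can use this _raw attribute to access the raw data, which is a plain dict mapping each symbol to a list of bar dicts. Note that the leading underscore marks _raw as a private attribute by Python convention, so it may change between library versions.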

symbol = "AAPL"
bar_set = api.get_barset(symbol, "day", limit=5)._raw
print(bar_set)

Which outputs:

{
    'AAPL': [
        {'t': 1607317200, 'o': 122.31, 'h': 124.57, 'l': 122.25, 'c': 123.8, 'v': 72463180}, 
        {'t': 1607403600, 'o': 124.37, 'h': 124.98, 'l': 123.09, 'c': 124.33, 'v': 69695298}, 
        {'t': 1607490000, 'o': 124.53, 'h': 125.95, 'l': 121, 'c': 121.67, 'v': 99218318}, 
        {'t': 1607576400, 'o': 120.5, 'h': 123.87, 'l': 120.15, 'c': 123.22, 'v': 70011939}, 
        {'t': 1607662800, 'o': 122.43, 'h': 122.76, 'l': 120.55, 'c': 122.49, 'v': 75289233}
    ]
}
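
The 't' values in this raw output are Unix timestamps in seconds. If readable dates are needed, they can be converted with the standard library. A small sketch, using two of the rows above as sample data (the raw variable is a stand-in for the ._raw dict, not a live API call):

```python
from datetime import datetime, timezone

# Stand-in for api.get_barset(symbol, "day", limit=5)._raw (sample values from above).
raw = {
    "AAPL": [
        {"t": 1607317200, "o": 122.31, "h": 124.57, "l": 122.25, "c": 123.8, "v": 72463180},
        {"t": 1607403600, "o": 124.37, "h": 124.98, "l": 123.09, "c": 124.33, "v": 69695298},
    ]
}

# Replace each epoch-seconds 't' with a timezone-aware UTC datetime,
# leaving the other fields untouched.
bars = [
    {**bar, "t": datetime.fromtimestamp(bar["t"], tz=timezone.utc)}
    for bar in raw["AAPL"]
]

for bar in bars:
    print(bar["t"].isoformat(), bar["c"])
```

Converting to UTC is a choice made for this sketch; convert to the exchange's local time zone instead if you want dates aligned with the trading day.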

So, if you want to create a pandas.DataFrame with the raw values, you can do something like this:

import pandas as pd

symbol = "AAPL"
bar_set = api.get_barset(symbol, "day", limit=5)[symbol]._raw
df = pd.DataFrame(data=bar_set)

This gives us:

            t       o       h       l       c         v
0  1607317200  122.31  124.57  122.25  123.80  72463180
1  1607403600  124.37  124.98  123.09  124.33  69695298
2  1607490000  124.53  125.95  121.00  121.67  99218318
3  1607576400  120.50  123.87  120.15  123.22  70011939
4  1607662800  122.43  122.76  120.55  122.49  75289233
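
To match the layout requested in the question (columns c, h, l, o, t, v and rows labelled bar1, bar2, ...), the frame can be reordered and relabelled when it is built. A minimal sketch, using the two bars from the question as a stand-in for the raw list; the bar1/bar2 labels are the asker's naming, not something the API produces:

```python
import pandas as pd

# Stand-in for api.get_barset(symbol, "day", limit=2)[symbol]._raw,
# filled with the values shown in the question.
raw_bars = [
    {"c": 204.25, "h": 205.08, "l": 202.9, "o": 203.35, "t": 1562299200, "v": 14933941},
    {"c": 200.01, "h": 201.4, "l": 198.41, "o": 200.81, "t": 1562558400, "v": 21987224},
]

# Fix the column order explicitly rather than relying on dict key order.
df = pd.DataFrame(raw_bars, columns=["c", "h", "l", "o", "t", "v"])

# Label the rows bar1, bar2, ... as in the question's desired output.
df.index = [f"bar{i}" for i in range(1, len(df) + 1)]

print(df)
```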

Upvotes: 0

Frank

Reputation: 61

There are two ways to do this.

  1. This creates a pandas DataFrame directly:
import alpaca_trade_api as tradeapi #see https://alpaca.markets/
import pandas as pd

api = tradeapi.REST(key_id, secret_key, base_url)

symbol = 'AAPL'

bar= api.get_barset(symbol, 'day', limit=60).df
  2. This is a more brute-force approach, but it does the same thing:
import alpaca_trade_api as tradeapi #see https://alpaca.markets/
import pandas as pd

api = tradeapi.REST(key_id, secret_key, base_url)

symbol = 'AAPL'

bar= api.get_barset(symbol, 'day', limit=60)

c = []
h = []
l = []
o = []
t = []
v = []
idx = []

# Copy each field of every Bar into its own list
for i, b in enumerate(bar[symbol]):
    c.append(b.c)
    h.append(b.h)
    l.append(b.l)
    o.append(b.o)
    t.append(b.t)
    v.append(b.v)
    idx.append(i)
df_bar_t = pd.DataFrame(t, idx,columns = ['Datetime'])
df_bar_c = pd.DataFrame(c, idx, columns = ['Close'])
df_bar_h = pd.DataFrame(h, idx, columns = ['High'])
df_bar_l = pd.DataFrame(l, idx, columns = ['Low'])
df_bar_o = pd.DataFrame(o, idx, columns = ['Open'])
df_bar_v = pd.DataFrame(v, idx, columns = ['Volume'])

mdf1 = pd.merge(df_bar_t,df_bar_c, left_index = True, right_index = True)
mdf2 = pd.merge(mdf1,df_bar_h, left_index = True, right_index = True )
mdf3 = pd.merge(mdf2,df_bar_l, left_index = True, right_index = True )
mdf4 = pd.merge(mdf3,df_bar_o, left_index = True, right_index = True )
mdf4 = pd.merge(mdf4,df_bar_v, left_index = True, right_index = True )
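
The per-column frames and merges above can likely be collapsed into a single DataFrame constructor, since each Bar exposes its fields as attributes (the loop above relies on the same thing). A sketch under that assumption: SimpleNamespace objects stand in for Bar instances so the snippet runs without API credentials, and the integer 't' values are sample data (a real Bar may return a pandas Timestamp for .t instead).

```python
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-ins for the Bar objects in bar[symbol].
bars = [
    SimpleNamespace(c=204.25, h=205.08, l=202.9, o=203.35, t=1562299200, v=14933941),
    SimpleNamespace(c=200.01, h=201.4, l=198.41, o=200.81, t=1562558400, v=21987224),
]

# Build every column in one pass; the default RangeIndex (0, 1, ...)
# plays the role of the idx list.
df = pd.DataFrame(
    {
        "Datetime": [b.t for b in bars],
        "Close": [b.c for b in bars],
        "High": [b.h for b in bars],
        "Low": [b.l for b in bars],
        "Open": [b.o for b in bars],
        "Volume": [b.v for b in bars],
    }
)

print(df)
```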

Upvotes: 3
