Reputation: 350
I have a generator being returned from:
data = public_client.get_product_trades(product_id='BTC-USD', limit=10)
How do I turn the data into a pandas DataFrame?
The method's docstring reads:
"""{"Returns": [{
"time": "2014-11-07T22:19:28.578544Z",
"trade_id": 74,
"price": "10.00000000",
"size": "0.01000000",
"side": "buy"
}, {
"time": "2014-11-07T01:08:43.642366Z",
"trade_id": 73,
"price": "100.00000000",
"size": "0.01000000",
"side": "sell"
}]}"""
I have tried:
df = [x for x in data]
df = pd.DataFrame.from_records(df)
but it does not work; I get the error:
AttributeError: 'str' object has no attribute 'keys'
When I print the result of "x for x in data" I see a list of dicts, but the end looks strange. Could this be why?
print(list(data))
[{'time': '2020-12-30T13:04:14.385Z', 'trade_id': 116918468, 'price': '27853.82000000', 'size': '0.00171515', 'side': 'sell'},{'time': '2020-12-30T12:31:24.185Z', 'trade_id': 116915675, 'price': '27683.70000000', 'size': '0.01683711', 'side': 'sell'}, 'message']
It looks to be a list of dicts, but the last value is a single string, 'message'.
Upvotes: 2
Views: 2825
Reputation: 26251
Based on the updated question:
df = pd.DataFrame(list(data)[:-1])
Or, more cleanly:
df = pd.DataFrame([x for x in data if isinstance(x, dict)])
print(df)
time trade_id price size side
0 2020-12-30T13:04:14.385Z 116918468 27853.82000000 0.00171515 sell
1 2020-12-30T12:31:24.185Z 116915675 27683.70000000 0.01683711 sell
Oh, and BTW, you'll still need to change those strings into something usable...
So e.g.:
df['time'] = pd.to_datetime(df['time'])
for k in ['price', 'size']:
    df[k] = pd.to_numeric(df[k])
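Putting the steps above together, here is a minimal sketch. The function fake_trades is a hypothetical stand-in for the generator returned by public_client.get_product_trades(...), so that the example runs without network access. It yields the two dicts shown in the question followed by the stray 'message' string.

```python
import pandas as pd

def fake_trades():
    # Hypothetical stand-in for public_client.get_product_trades(...):
    # two trade dicts followed by a stray string, as seen in the question.
    yield {'time': '2020-12-30T13:04:14.385Z', 'trade_id': 116918468,
           'price': '27853.82000000', 'size': '0.00171515', 'side': 'sell'}
    yield {'time': '2020-12-30T12:31:24.185Z', 'trade_id': 116915675,
           'price': '27683.70000000', 'size': '0.01683711', 'side': 'sell'}
    yield 'message'

# Keep only the dict items; this drops the trailing string wherever it appears.
rows = [x for x in fake_trades() if isinstance(x, dict)]
df = pd.DataFrame(rows)

# Convert the string columns to usable dtypes.
df['time'] = pd.to_datetime(df['time'])
for k in ['price', 'size']:
    df[k] = pd.to_numeric(df[k])
```

Note that a generator can only be consumed once: if you call print(list(data)) first, data is exhausted and a later loop over it yields nothing. Store the list in a variable and reuse it instead.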
Upvotes: 3
Reputation: 5745
It's straightforward: just use the pd.DataFrame constructor:
#list_of_dicts = [{
# "time": "2014-11-07T22:19:28.578544Z",
# "trade_id": 74,
# "price": "10.00000000",
# "size": "0.01000000",
# "side": "buy"
# }, {
# "time": "2014-11-07T01:08:43.642366Z",
# "trade_id": 73,
# "price": "100.00000000",
# "size": "0.01000000",
# "side": "sell"
#}]
# or, if you take it from 'data': it is a generator, so materialize it
# first, then drop the trailing 'message' string
list_of_dicts = list(data)[:-1]
df = pd.DataFrame(list_of_dicts)
df
Out[4]:
time trade_id price size side
0 2014-11-07T22:19:28.578544Z 74 10.00000000 0.01000000 buy
1 2014-11-07T01:08:43.642366Z 73 100.00000000 0.01000000 sell
UPDATE
According to the question update, it seems you have JSON data that is still a string. If that is the case, parse it first:
import json
data = json.loads(data)
data = data['Returns']
pd.DataFrame(data)
time trade_id price size side
0 2014-11-07T22:19:28.578544Z 74 10.00000000 0.01000000 buy
1 2014-11-07T01:08:43.642366Z 73 100.00000000 0.01000000 sell
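For reference, a self-contained sketch of the JSON route. It assumes the payload really is a JSON string shaped like the docstring in the question; the variable raw below is illustrative and simply copies that docstring. In the question itself, data is a generator of dicts, so this applies only if you are handed the raw text.

```python
import json
import pandas as pd

# Illustrative JSON text, copied from the docstring shown in the question.
raw = '''{"Returns": [
  {"time": "2014-11-07T22:19:28.578544Z", "trade_id": 74,
   "price": "10.00000000", "size": "0.01000000", "side": "buy"},
  {"time": "2014-11-07T01:08:43.642366Z", "trade_id": 73,
   "price": "100.00000000", "size": "0.01000000", "side": "sell"}
]}'''

# Parse the text into Python objects, then take the list under "Returns".
records = json.loads(raw)['Returns']
df_json = pd.DataFrame(records)
```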
Upvotes: 0
Reputation: 1789
You could access the values of each dictionary in the list and build a DataFrame from them (although this is not particularly clean):
dict_of_data = [{
"time": "2014-11-07T22:19:28.578544Z",
"trade_id": 74,
"price": "10.00000000",
"size": "0.01000000",
"side": "buy"
}, {
"time": "2014-11-07T01:08:43.642366Z",
"trade_id": 73,
"price": "100.00000000",
"size": "0.01000000",
"side": "sell"
}]
import pandas as pd
# one row of values per dict, for any number of dicts
list_of_data = [list(d.values()) for d in dict_of_data]
pd.DataFrame(list_of_data, columns=list(dict_of_data[0].keys())).set_index('time')
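As a quick check, the sketch below compares this manual approach with passing the list of dicts straight to pd.DataFrame. The names records, df_direct, df_manual and same are illustrative. For dicts that all share the same keys in the same order, the two should produce the same table, which suggests the manual extraction of values is not needed here.

```python
import pandas as pd

# Sample records taken from the docstring in the question.
records = [
    {"time": "2014-11-07T22:19:28.578544Z", "trade_id": 74,
     "price": "10.00000000", "size": "0.01000000", "side": "buy"},
    {"time": "2014-11-07T01:08:43.642366Z", "trade_id": 73,
     "price": "100.00000000", "size": "0.01000000", "side": "sell"},
]

# Direct: pandas takes the column names from the dict keys.
df_direct = pd.DataFrame(records).set_index('time')

# Manual: extract the values row by row and supply the column names.
rows = [list(d.values()) for d in records]
df_manual = pd.DataFrame(rows, columns=list(records[0].keys())).set_index('time')

# Compare the contents record by record.
same = (df_direct.reset_index().to_dict('records')
        == df_manual.reset_index().to_dict('records'))
```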
Upvotes: 0