edlujan
edlujan

Reputation: 3

How to graph Binance API Orderbook with Pandas-matplotlib?

the data comes in 3 columns after (orderbook = pd.DataFrame(orderbook_data):

   timestamp                bids                      asks
 UNIX timestamp      [bidprice, bidvolume]   [askprice, askvolume]
        list has 100 values of each. timestamp is the same 

the problem is that I don't know how to access/index the values inside each row list [price, volume] of each column

I know that by running ---> bids = orderbook["bids"]

I get the list of 100 lists ---> [bidprice, bidvolume]

I'm looking to avoid doing a loop.... there has to be a way to just plot the data

I hope someone can undertand my problem. I just want to plot price on x and volume on y. The goal is to make it live

Upvotes: 0

Views: 1098

Answers (1)

Valdi_Bo
Valdi_Bo

Reputation: 30971

As you didn't present your input file, I prepared it on my own:

timestamp;bids
1579082401;[123.12, 300]
1579082461;[135.40, 220]
1579082736;[130.76, 20]
1579082801;[123.12, 180]

To read it I used:

orderbook = pd.read_csv('Input.csv', sep=';')
orderbook.timestamp = pd.to_datetime(orderbook.timestamp, unit='s')

Its content is:

             timestamp           bids
0  2020-01-15 10:00:01  [123.12, 300]
1  2020-01-15 10:01:13  [135.40, 220]
2  2020-01-15 10:05:36   [130.76, 20]
3  2020-01-15 10:06:41  [123.12, 180]

Now:

  • timestamp has been converted to native pandasonic type of datetime,
  • but bids is of object type (actually, a string).

and, as I suppose, this is the same when read from your input file.

And now the main task: The first step is to extract both numbers from bids, convert them to float and int and save in respective columns:

orderbook = orderbook.join(orderbook.bids.str.extract(
    r'\[(?P<bidprice>\d+\.\d+), (?P<bidvolume>\d+)]'))
orderbook.bidprice = orderbook.bidprice.astype(float)
orderbook.bidvolume = orderbook.bidvolume.astype(int)

Now orderbook contains:

            timestamp           bids  bidprice  bidvolume
0 2020-01-15 10:00:01  [123.12, 300]    123.12        300
1 2020-01-15 10:01:01  [135.40, 220]    135.40        220
2 2020-01-15 10:05:36   [130.76, 20]    130.76         20
3 2020-01-15 10:06:41  [123.12, 180]    123.12        180

and you can generate e.g. a scatter plot, calling:

orderbook.plot.scatter('bidprice', 'bidvolume');

or other plotting function.

Another possibility

Or maybe your orderbook_data is a dictionary? Something like:

orderbook_data = {
    'timestamp': [1579082401, 1579082461, 1579082736, 1579082801],
    'bids': [[123.12, 300], [135.40, 220], [130.76, 20], [123.12, 180]] }

In this case, when you create a DataFrame from it, the column types are initially:

  • timestamp - int64,
  • bids - also object, but this time each cell contains a plain pythonic list.

Then you can also convert timestamp column to datetime just like above.

But to split bids (a column of lists) into 2 separate columns, you should run:

orderbook[['bidprice', 'bidvolume']] = pd.DataFrame(orderbook.bids.tolist())

Then you have 2 new columns with respective components of the source column and you can create your graphics jus like above.

Upvotes: 1

Related Questions