bluesummers

Reputation: 12607

Create a dict from a histogram - Python

I am looking to create a json/dict from a histogram.

I load the data with pandas and plot it like this:

import pandas as pd

df = pd.read_csv(PATH_TO_CSV)
df.hist(log=True)

This produces the following plot: [Image: example histogram]

What would be the best way to get this as a dict? I'm not strict about how the dict should look, but I'm thinking of something like:

histogram = {
    'dropoff_latitude': {
        '30-35': 1800000,
        .....
    },
    'dropoff_longitude': {
        ....
    }
}

Upvotes: 0

Views: 395

Answers (1)

Zero

Reputation: 76917

Here's one way. histfun gets the bin edges and counts from np.histogram, and label builds the string representation of each bin.

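In [94]: import numpy as np

In [95]: def histfun(x):
    ...:     # Counts per bin and the bin edges (10 equal-width bins by default)
    ...:     hist, bins = np.histogram(x)
    ...:     # Format each edge with two decimals
    ...:     bbins = np.char.mod('%.2f', bins)
    ...:     # Join neighbouring edges into 'low-high' labels
    ...:     label = map('-'.join, zip(bbins[:-1], bbins[1:]))
    ...:     return dict(zip(label, hist))
    ...: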

In [96]: df.apply(histfun).to_dict()
Out[96]:
{'dropoff_latitude': {'30.00-35.00': 2,
  '35.00-40.00': 0,
  '40.00-45.00': 0,
  '45.00-50.00': 1,
  '50.00-55.00': 0,
  '55.00-60.00': 0,
  '60.00-65.00': 0,
  '65.00-70.00': 0,
  '70.00-75.00': 0,
  '75.00-80.00': 1},
 'dropoff_longitude': {'0.00-12.00': 2,
  '108.00-120.00': 1,
  '12.00-24.00': 0,
  '24.00-36.00': 0,
  '36.00-48.00': 0,
  '48.00-60.00': 0,
  '60.00-72.00': 1,
  '72.00-84.00': 0,
  '84.00-96.00': 0,
  '96.00-108.00': 0}}
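
Since the question mentions JSON, here is a minimal standalone sketch of the same idea that can be run as a script. The name hist_to_dict and its bins parameter are my own choices for illustration, not part of the answer above. It assumes every column is numeric, and it converts the counts with tolist() because json.dumps cannot serialise NumPy integer types directly.

```python
import json

import numpy as np
import pandas as pd


def hist_to_dict(values, bins=10):
    """Return {'low-high': count} for one column, using np.histogram."""
    counts, edges = np.histogram(values, bins=bins)
    # Format each edge with two decimals, then pair neighbours into labels.
    edge_text = ["%.2f" % edge for edge in edges]
    labels = ["-".join(pair) for pair in zip(edge_text[:-1], edge_text[1:])]
    # tolist() turns NumPy integers into plain Python ints for json.dumps.
    return dict(zip(labels, counts.tolist()))


# Same sample data as used in this answer.
df = pd.DataFrame({
    "dropoff_latitude": [30, 30, 45, 80],
    "dropoff_longitude": [120, 0, 0, 60],
})

# One histogram dict per column, keyed by column name.
histogram = {column: hist_to_dict(df[column]) for column in df.columns}
print(json.dumps(histogram, indent=2))
```

Building the outer dict with a comprehension avoids relying on how df.apply handles a function that returns a dict.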

Sample test data

In [97]: df
Out[97]:
   dropoff_latitude  dropoff_longitude
0                30                120
1                30                  0
2                45                  0
3                80                 60

Upvotes: 3
