g.humpkins
g.humpkins

Reputation: 341

Create histogram from dict of lists

I am trying to graph the frequency of the numbers 1, 2, and 3 that occur for certain keys in a dictionary (titled 'hat1' through 'hat10') and am having trouble converting my data (shown below) into a format that I might be able to graph.

data = {'hat9': [[1, 2, 3, 1, 2]], 'hat8': [[1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], 'hat1': [[1, 2, 3]], 'hat3': [[1, 2, 3, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1]], 'hat2': [[1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], 'hat5': [[1, 2, 3, 2, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 3, 3, 1, 3, 2, 3, 2, 3, 2, 3, 3, 3, 3, 2, 3, 1, 3, 3, 3, 3]], 'hat4': [[1, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 3, 1, 2, 1, 3, 2, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1]], 'hat7': [[1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]], 'hat6': [[1, 2, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 1, 1, 3]], 'hat10': [[1, 2, 3, 3, 3, 3, 3, 3, 1, 2, 2, 1, 2, 3, 3, 2, 3, 3, 3, 3, 3, 2, 1, 1, 3, 3, 1, 2, 2, 3, 3, 1, 3, 3, 3, 3, 3, 2, 3, 1, 3, 1, 3, 1, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 2, 1, 3, 2, 1, 3, 2, 3, 3, 1, 2, 1, 2, 3, 3, 1, 3, 2, 2, 1, 2, 3, 3, 1, 2, 3, 2, 3, 3, 1, 3, 3, 3, 3]]}

When I ran DataFrame.from_dict(data) I received output that looked like this:

In [100]: DataFrame.from_dict(data)
Out[100]: 
        hat1                                              hat10  \
0  [1, 2, 3]  [1, 2, 3, 3, 3, 3, 3, 3, 1, 2, 2, 1, 2, 3, 3, ...   

                                                hat2  \
0  [1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...   

                                                hat3  \
0  [1, 2, 3, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, ...   

                                                hat4  \
0  [1, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, ...   

                                                hat5  \
0  [1, 2, 3, 2, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, ...   

                                                hat6  \
0  [1, 2, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...   

                                            hat7  \
0  [1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]   

                                                hat8             hat9  
0  [1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...  [1, 2, 3, 1, 2]  

I was hoping someone might be able to help me get the data into a more workable format that can be converted into a graph relatively easily. Thanks for your help.

Upvotes: 0

Views: 978

Answers (2)

xnx
xnx

Reputation: 25518

If you want to create you histogram with Matplotlib, you don't really need to do much more than call its hist method with each hat you want to show. For example,

import pylab
pylab.hist(data['hat4'][0], bins=(1,2,3,4), align='left')

(You need to index at [0] because for some reason each of your dictionary values is a list of length 1, the single item itself being a list of data values).

If you need to aggregrate the hats in some way, you need to say how.

enter image description here

You can do the same with a pandas DataFrame if you prefer:

import pandas as pd
df = pd.DataFrame(data)
pylab.hist(df['hat4'], bins=(1,2,3,4), align='left')

Upvotes: 1

Abhinav Ramakrishnan
Abhinav Ramakrishnan

Reputation: 1090

Try this out:

data = {'hat9': [[1, 2, 3, 1, 2]], 'hat8': [[1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], 'hat1': [[1, 2, 3]], 'hat3': [[1, 2, 3, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1]], 'hat2': [[1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], 'hat5': [[1, 2, 3, 2, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 3, 3, 1, 3, 2, 3, 2, 3, 2, 3, 3, 3, 3, 2, 3, 1, 3, 3, 3, 3]], 'hat4': [[1, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 3, 1, 2, 1, 3, 2, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1]], 'hat7': [[1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]], 'hat6': [[1, 2, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 1, 1, 3]], 'hat10': [[1, 2, 3, 3, 3, 3, 3, 3, 1, 2, 2, 1, 2, 3, 3, 2, 3, 3, 3, 3, 3, 2, 1, 1, 3, 3, 1, 2, 2, 3, 3, 1, 3, 3, 3, 3, 3, 2, 3, 1, 3, 1, 3, 1, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 2, 1, 3, 2, 1, 3, 2, 3, 3, 1, 2, 1, 2, 3, 3, 1, 3, 2, 2, 1, 2, 3, 3, 1, 2, 3, 2, 3, 3, 1, 3, 3, 3, 3]]}


keys = []
values = []
for key,value in data.iteritems():
    keys.append(key)
    a = 0
    b = 0
    c = 0
    for x in value[0]:
        if x==1: a+=1;
        elif x ==2: b+=1;
        elif x==3: c+=1;
    values.append([a,b,c])

print keys
print values

Hopefully that helps. Keys is ['hat9', 'hat8', etc.,..] and values = [[freq of 1 in 'hats9', freq of 2 in 'hats9', freq of 3 in 'hats9'], [freq of 1 in 'hats8', freq of 2 in 'hats8', freq of 3 in 'hats8'],..] (a list of 3 item lists)

Upvotes: 1

Related Questions