Reputation: 57
I am new to data science with Python, NumPy, and pandas, so excuse me if my question is trivial.
Is there a way to print the value_counts of two or more columns, side by side, for a DataFrame built from a Python dictionary?
Right now I print the value_counts of each column I want separately, like below:
import pandas as pd
dict_h = {'a':['APPLE', 'DONUT', 'APPLE', 'APPLE', 'APPLE', 'DONUT', 'PEAR'],
'b':['PEAR', 'DONUT', 'DONUT', 'DONUT', 'APPLE', 'PEAR', 'DONUT'],
'c':['APPLE', 'APPLE', 'APPLE','DONUT','DONUT','DONUT','PEAR'],
'd':['PEAR', 'PEAR', 'PEAR','DONUT','DONUT','DONUT','PEAR']}
print('\n')
print('Original Dict:')
print(dict_h)
y = pd.DataFrame.from_dict(dict_h, orient='index')
print('\n')
print(y.describe())
print('\n')
print(y[0].value_counts())
print('\n')
print(y[1].value_counts())
That prints:
...
PEAR 2
APPLE 2
Name: 0, dtype: int64
DONUT 2
PEAR 1
APPLE 1
Name: 1, dtype: int64
but instead, I was hoping to print something like this:
         0    1
-------------
APPLE    2    1
PEAR     2    1
DONUT  NaN    2
-------------
Thank you. Drew
Upvotes: 2
Views: 181
Reputation: 22503
To get your desired output, use transpose and melt with crosstab:
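# Assumes df = pd.DataFrame(dict_h) (columns a-d, rows 0-6), so df.T has the same layout as y in the question
s = df.T.melt()
print(pd.crosstab(s["value"], s["variable"]))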
variable  0  1  2  3  4  5  6
value
APPLE     2  1  2  1  2  0  0
DONUT     0  2  1  3  2  3  1
PEAR      2  1  1  0  0  1  3
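The crosstab above counts all seven columns. If you only want columns 0 and 1, as in the question, something like the following sketch may work. It assumes the frame is built exactly as y in the question (orient='index'); the names counts and counts_nan are just illustrative. The second option is an alternative technique, not part of the approach above: it applies value_counts to each selected column, so a value missing from a column should show up as NaN rather than 0.

```python
import pandas as pd

dict_h = {'a': ['APPLE', 'DONUT', 'APPLE', 'APPLE', 'APPLE', 'DONUT', 'PEAR'],
          'b': ['PEAR', 'DONUT', 'DONUT', 'DONUT', 'APPLE', 'PEAR', 'DONUT'],
          'c': ['APPLE', 'APPLE', 'APPLE', 'DONUT', 'DONUT', 'DONUT', 'PEAR'],
          'd': ['PEAR', 'PEAR', 'PEAR', 'DONUT', 'DONUT', 'DONUT', 'PEAR']}

# Same frame as y in the question: rows a-d, columns 0-6.
y = pd.DataFrame.from_dict(dict_h, orient='index')

# Option 1: melt only the columns of interest, then cross-tabulate.
# Values that do not occur in a column are reported as 0.
s = y[[0, 1]].melt()
counts = pd.crosstab(s["value"], s["variable"])
print(counts)

# Option 2 (alternative): value_counts per selected column.
# Values that do not occur in a column are reported as NaN.
counts_nan = y[[0, 1]].apply(pd.Series.value_counts)
print(counts_nan)
```

Option 2 is closer to the layout asked for in the question (DONUT is NaN in column 0), while Option 1 keeps integer counts throughout.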
Upvotes: 2