machinery

Reputation: 6290

Count number of occurrences of values per column of a DataFrame

I have the following dataframe:

df = pd.DataFrame(np.array([[4, 1], [1, 1], [5, 1], [1, 3], [7, 8], [np.nan, 8]]), columns=['a', 'b'])

    a    b
0   4    1
1   1    1
2   5    1
3   1    3
4   7    8
5   NaN  8

Now I would like to do a value_counts() on the columns for values from 1 to 9 which should give me the following:

    a    b
1   2    3
2   0    0
3   0    1
4   1    0
5   1    0
6   0    0
7   1    0
8   0    2
9   0    0

That is, I just want to count the number of occurrences of the values 1 to 9 in each column. How can this be done? I would like the result in this format so that I can afterwards call df.plot(kind='bar', stacked=True) to get a stacked bar plot with the discrete values 1 to 9 on the x axis and the counts for a and b on the y axis.

Upvotes: 1

Views: 275

Answers (2)

Hello.World

Reputation: 740

Use pd.value_counts:

df.apply(pd.value_counts).reindex(range(10)).fillna(0)
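
For reference, here is a self-contained sketch of the same idea using the frame from the question. It assumes a recent pandas, where the top-level pd.value_counts is deprecated, so it calls the Series method inside the lambda instead. The name counts and the choice to reindex on 1 to 9 (rather than 0 to 9) are mine, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Frame from the question; the NaN forces both columns to float64.
df = pd.DataFrame(
    np.array([[4, 1], [1, 1], [5, 1], [1, 3], [7, 8], [np.nan, 8]]),
    columns=["a", "b"],
)

# Count each column separately; value_counts() skips NaN by default.
counts = df.apply(lambda col: col.value_counts())

# Restrict to the values 1..9, then turn "never seen" (NaN) into 0.
counts = counts.reindex(range(1, 10)).fillna(0).astype(int)

print(counts)
```

The resulting frame has one row per value and one column per original column, so counts.plot(kind='bar', stacked=True) from the question should apply to it directly.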


Upvotes: 5

cs95

Reputation: 402333

Use np.bincount on each column:

# np.bincount only accepts integers, so cast after dropping the NaN
df.apply(lambda x: np.bincount(x.dropna().astype(int), minlength=10))

   a  b
0  0  0
1  2  3
2  0  0
3  0  1
4  1  0
5  1  0
6  0  0
7  1  0
8  0  2
9  0  0

Alternatively, use a list comprehension instead of apply:

# One bincount per column; cast to int because NaN makes the columns float
pd.DataFrame([
        np.bincount(df[c].dropna().astype(int), minlength=10) for c in df
    ], index=df.columns).T

   a  b
0  0  0
1  2  3
2  0  0
3  0  1
4  1  0
5  1  0
6  0  0
7  1  0
8  0  2
9  0  0
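
A runnable sketch of the bincount approach, for anyone who wants to paste it as is. np.bincount returns an array whose position i holds the number of times i occurs, which is why minlength=10 produces rows 0 to 9. The helper name column_counts and its upper parameter are hypothetical, introduced only for illustration, and the final slice that drops row 0 is my addition to match the 1 to 9 range asked for in the question.

```python
import numpy as np
import pandas as pd

# Frame from the question; the NaN forces both columns to float64.
df = pd.DataFrame(
    np.array([[4, 1], [1, 1], [5, 1], [1, 3], [7, 8], [np.nan, 8]]),
    columns=["a", "b"],
)


def column_counts(col, upper=9):
    """Hypothetical helper: counts of the integers 0..upper in one column."""
    # bincount rejects floats, so drop NaN first and cast to int.
    values = col.dropna().astype(int).to_numpy()
    # minlength pads with zeros so every column yields at least upper + 1 bins.
    return np.bincount(values, minlength=upper + 1)


# One array of counts per column, assembled into a frame indexed 0..9.
counts = pd.DataFrame({name: column_counts(df[name]) for name in df.columns})

# Label-based slice (inclusive on both ends) keeps only the values 1..9.
counts = counts.loc[1:9]

print(counts)
```

Note that this only works for non-negative integer values; bincount raises on negative input, and the cast to int would silently truncate non-integer floats.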

Upvotes: 2
