Ali H. Kudeir

Reputation: 1054

Plot the count of a Pandas df column using Plotly

I have a DataFrame column that contains only categorical data (e.g., [a, b, c, a, a, d, c, b, ...]). I want to plot the count of each category as a bar chart using Plotly.

I calculated the counts with df.groupby('<col_name>')['<col_name>'].count(), but this returns a Series, so the result seems to hold only the count values (1-D).

How can I efficiently get both the counts and the corresponding categories in the resulting output?

I want to get that output as a DataFrame and plot the bar chart using Plotly:

import plotly.express as px

fig = px.bar(count_df, x="<col_name>", y="count", color="count", title="----------")
fig.show()

Upvotes: 4

Views: 5660

Answers (4)

vestland

Reputation: 61084

It doesn't get more efficient than this:

df['x'].plot(kind = 'hist')


And NO, this isn't matplotlib, but rather a figure constructed directly from a pandas dataframe using plotly as a backend. And yes, it's awesome!

Complete code:

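import random

import pandas as pd

# Use Plotly instead of matplotlib for pandas .plot() calls
pd.options.plotting.backend = "plotly"
random.seed(7)  # make the sample data reproducible

# 25 random categorical values drawn from 'a' to 'e'
df = pd.DataFrame({'x': [random.choice(list('abcde')) for _ in range(25)]})
fig = df['x'].plot(kind='hist')  # returns a Plotly figure
fig.layout.bargap = 0  # remove the gap between bars
fig.show()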

Upvotes: 4

Ali H. Kudeir

Reputation: 1054

Answering my own question.

I found a solution by converting the result of value_counts() (which returns a Series) to a pandas DataFrame. Ref: SO question and answers

import plotly.express as px

new_df = df['<col_name>'].value_counts().rename_axis('<col_name>').reset_index(name='counts')

fig = px.bar(new_df, x="<col_name>", y="counts", color="counts", title="----------")
fig.show()

Upvotes: 3

Andrea NR

Reputation: 1677

You can get it easily by using hist(), for example:

df['<col_name>'].hist()

This also shows the absolute frequency of each value.

Another way to do the same is:

df['<col_name>'].value_counts().plot(kind='bar')

Upvotes: 0

Khyber Thrall

Reputation: 51

You should be able to use the .index of the result for the x-axis values and the Series itself for the y-axis.

Also, I think df['col_name'].value_counts() is probably what you want to use here.

Upvotes: 1
