user12809368

Count frequency and plot

I need to plot the frequency of items by date. My CSV contains three columns: Date, Name & Surname, and Birthday. I am interested in plotting how many people were recorded on each date. My expected output would be:

         Date  Count
0   01/01/2018      9
1   01/02/2018     12
2   01/03/2018      6
3   01/04/2018      4
4   01/05/2018      5
..         ...    ...
..  02/27/2020    122
..  02/28/2020     84

I produced the table above as follows:

by_date = df.groupby(df['Date']).size().reset_index(name='Count')

Date is a column in my CSV file, but Count is not, which is why I am having difficulty drawing a line plot.

How can I plot the frequency as a list of numbers/column?

Answers (1)

Code Different

Reputation: 93161

Although not strictly required, you should convert the Date column to Timestamp values (datetime64) to make later analysis easier:

import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])
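
The dates in the question look like month/day/year (02/27/2020 cannot be day/month), so it may be safer to pass an explicit format rather than let pandas infer one. A minimal sketch, assuming that format; the sample rows are made up to stand in for the CSV:

```python
import pandas as pd

# Hypothetical sample rows standing in for the Date column of the CSV
df = pd.DataFrame({"Date": ["01/01/2018", "01/02/2018", "02/27/2020"]})

# Assumption: dates are month/day/year, as the 02/27/2020 row suggests
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Inspect the parsed months to confirm the first field was read as the month
print(df["Date"].dt.month.tolist())
```

If a value does not match the format, to_datetime raises an error by default, which is usually preferable to silently swapping day and month.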

Now, to your question. To count how many births there are per day, you can use value_counts:

births = df['Date'].value_counts()
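
Note that value_counts orders the result by frequency, not by date, so for a line plot you would probably want to sort by the index first. A rough sketch of that, using made-up dates in place of the real file:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: five records spread over three dates
df = pd.DataFrame({"Date": pd.to_datetime(
    ["2018-01-02", "2018-01-01", "2018-01-02", "2018-01-03", "2018-01-02"])})

# Count records per date, then put the dates in chronological order
counts = df["Date"].value_counts().sort_index()

# Series.plot uses the index (the dates) as x and the counts as y
ax = counts.plot(kind="line", marker="o")
ax.set_xlabel("Date")
ax.set_ylabel("Count")
ax.set_title("Records per date")
```

Call plt.show() afterwards when running this as a script. The by_date frame from the question should work the same way with by_date.plot(x='Date', y='Count'), since Count is a regular column of that frame even though it is not in the CSV.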

But you don't even have to do that for plotting a histogram! Use hist:

import matplotlib.dates as mdates

# Major ticks at each year, minor ticks at each month
year = mdates.YearLocator()
month = mdates.MonthLocator()
formatter = mdates.ConciseDateFormatter(year)

# Series.hist returns the matplotlib Axes it drew on
ax = df['Date'].hist()
ax.set_title('# of births')
ax.xaxis.set_major_locator(year)
ax.xaxis.set_minor_locator(month)
ax.xaxis.set_major_formatter(formatter)

Result (from random data):

[Figure: histogram of births]
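
One caveat: hist groups the dates into 10 equal-width bins by default, so the bars are not per-day or per-month counts. If you want counts for calendar months instead, one possible approach (a sketch on made-up dates, not the only way) is to convert each date to a monthly period before counting:

```python
import pandas as pd

# Hypothetical data spanning three months
df = pd.DataFrame({"Date": pd.to_datetime(
    ["2018-01-05", "2018-01-20", "2018-02-03",
     "2018-03-11", "2018-03-12", "2018-03-30"])})

# Map each date to its calendar month, count, and sort chronologically
monthly = df["Date"].dt.to_period("M").value_counts().sort_index()
print(monthly)
```

monthly.plot(kind='bar') would then draw one bar per month.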
