Python/Pandas aggregating by date

Question

I am trying to count and plot the number of data points I have for each area by day. So far I have:

But I would like to show the number of instances of each county per day, with the end goal of plotting them on a line graph, like:

Only I would want to plot each county on its own line, rather than the total which I have plotted above.

Update:

I have managed to get this from the answers provided:

Which is great and exactly what I was looking for. However, in hindsight, this looks a little messy and is not very descriptive even for the short period plotted, let alone for a couple of years' worth of data.

So I'm thinking of plotting each county individually on an 8-panel grid. But when I try to plot this for one county, I get boolean values, as below:

What would be the best way to plot only the True values?

Ami Tavory · Accepted Answer

You can try

df.county.groupby([df.date_stamp, df.county]).count().unstack().plot();

df.county...count() is the numerical series you want to plot.
groupby([df.date_stamp, df.county]) groups first by date_stamp, then by county (the order matters).
unstack will create a DataFrame whose index is the time stamp and whose columns are the counties.
plot(); will plot it (and the ; suppresses the unnecessary output).
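
To make the intermediate steps concrete, here is a minimal sketch on made-up data. Only the column names date_stamp and county come from the question; the dates and county names are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "date_stamp": pd.to_datetime([
        "2015-01-01", "2015-01-01", "2015-01-01",
        "2015-01-02", "2015-01-02", "2015-01-03",
    ]),
    "county": ["Kent", "Kent", "Essex", "Kent", "Essex", "Essex"],
})

# Count rows per (date, county) pair; the result has a two-level index.
per_day = df.county.groupby([df.date_stamp, df.county]).count()

# Move the county level into the columns:
# one row per date, one column per county.
counts = per_day.unstack()

print(counts)
# counts.plot() would then draw one line per county (requires matplotlib).
```

Date/county combinations with no rows come out as NaN after unstack. If zeros are preferable for plotting, unstack(fill_value=0) should fill them in.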

Edit

To plot it on separate plots, you could do something like

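import matplotlib.pyplot as plt

for county in df.county.unique():
    # df.county == county is a boolean mask; indexing df with it keeps only the True rows.
    this_county = df[df.county == county]
    this_county.county.groupby(this_county.date_stamp).count().plot()
    plt.title(county)
    plt.show()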
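
For the 8-panel grid mentioned in the question, one possible approach is to build the per-county counts once and draw each column on its own axes. This is a sketch, not the answer's method: the sample data, the 2 x 4 layout, and the figure size are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "date_stamp": pd.to_datetime([
        "2015-01-01", "2015-01-01", "2015-01-01",
        "2015-01-02", "2015-01-02", "2015-01-03",
    ]),
    "county": ["Kent", "Kent", "Essex", "Kent", "Essex", "Essex"],
})

# One row per date, one column per county; missing combinations become 0.
counts = df.groupby(["date_stamp", "county"]).size().unstack(fill_value=0)

# Assumed layout: 2 rows x 4 columns gives room for up to 8 counties.
fig, axes = plt.subplots(nrows=2, ncols=4, sharex=True, sharey=True,
                         figsize=(16, 6))
flat_axes = axes.ravel()

# Draw each county's daily counts on its own axes.
for ax, county in zip(flat_axes, counts.columns):
    counts[county].plot(ax=ax, title=county)

# Hide any panels left over when there are fewer than 8 counties.
for ax in flat_axes[len(counts.columns):]:
    ax.set_visible(False)

fig.tight_layout()
```

In a script, call plt.show() afterwards to display the figure. Sharing the y axis makes the panels directly comparable; drop sharey=True if the counties differ a lot in volume.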
