NewGuy

Reputation: 3413

How can I plot the number of rows that occurred per hour over a long period of time?

I have a large CSV file that looks like this:

ID,Time,Disposition,eventsID,Class,teamID
1,"2011-03-02 22:18:37",1,107,2,2
2,"2011-03-02 22:19:05",1,115,1,2
3,"2011-03-02 22:19:10",1,103,4,2
4,"2011-03-02 22:19:41",1,104,1,3
5,"2011-03-03 01:24:31",1,117,4,3

This data spans many months.

I want to plot the number of events (rows) that occur per Year-Month-Day-Hour (I don't need minute or second resolution).

I have this code:

import pandas as pd
df = pd.read_csv('rtd_log.csv')
times = pd.DatetimeIndex(df.Time)
count_per_day = df.groupby([times.year, times.month, times.day, times.hour]).count()

This makes count_per_day look like:

2011  3  2  22    4
         3  1     1

The last column is the count. How do I plot this count over time, using the count_per_day result?

Upvotes: 1

Views: 1497

Answers (2)

Anand S Kumar

Reputation: 90889

You can use DataFrame.plot() to plot the graph. Example -

count_per_day.plot()

Demo with the example data you added -

[image: resulting plot]
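For a single time axis rather than four index levels, one option is to group on the timestamp rounded down to the hour and plot that. This is only a minimal sketch: it assumes the sample rows from the question are inlined with io.StringIO instead of reading rtd_log.csv, and the names csv_text, hour_bucket and count_per_hour are illustrative, not from the question.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import pandas as pd

# Sample rows copied from the question (stand-in for rtd_log.csv)
csv_text = '''ID,Time,Disposition,eventsID,Class,teamID
1,"2011-03-02 22:18:37",1,107,2,2
2,"2011-03-02 22:19:05",1,115,1,2
3,"2011-03-02 22:19:10",1,103,4,2
4,"2011-03-02 22:19:41",1,104,1,3
5,"2011-03-03 01:24:31",1,117,4,3
'''
df = pd.read_csv(io.StringIO(csv_text))

# Round every timestamp down to the start of its hour
hour_bucket = pd.to_datetime(df["Time"]).dt.floor("h")

# size() gives one row count per hour as a Series indexed by timestamp
count_per_hour = df.groupby(hour_bucket).size()

# Line plot of events per hour; use ax.figure.savefig(...) or plt.show() to view it
ax = count_per_hour.plot()
ax.set_ylabel("events")
```

Note that hours with no rows are simply absent from count_per_hour here, so a line plot connects the neighbouring points directly.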

Upvotes: 1

Leb

Reputation: 15953

When you load your data, make sure you pass parse_dates=['Time'] to pd.read_csv() so pandas knows it's a datetime column.

Then what you'll need is to set that column as the index and apply df.resample()

df_time = df.set_index('Time').resample('1D').count()
# change '1D' to another frequency (e.g. hourly or monthly) to break it down differently

Then plot the result with whichever kind you prefer, bar or line.

df_time.plot(kind='bar')
# the datetime index is used for the x-axis by default
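Put together for the hourly breakdown the question asks about, the steps above might look like the following. It is a sketch under assumptions: the sample rows are inlined with io.StringIO in place of rtd_log.csv, the frequency string 'h' (hourly) is my choice, and events_per_hour is an illustrative name.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import pandas as pd

# Sample rows copied from the question (stand-in for rtd_log.csv)
csv_text = '''ID,Time,Disposition,eventsID,Class,teamID
1,"2011-03-02 22:18:37",1,107,2,2
2,"2011-03-02 22:19:05",1,115,1,2
3,"2011-03-02 22:19:10",1,103,4,2
4,"2011-03-02 22:19:41",1,104,1,3
5,"2011-03-03 01:24:31",1,117,4,3
'''

# parse_dates makes Time a datetime64 column at load time
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Time"])

# Index by time, bin into hours, and count the rows in each bin
events_per_hour = df.set_index("Time").resample("h").size()

# Bar plot with one bar per hour; use ax.figure.savefig(...) or plt.show() to view it
ax = events_per_hour.plot(kind="bar")
ax.set_ylabel("events")
```

Unlike a plain groupby, resample() emits every hour between the first and last timestamp, so empty hours show up as zero-height bars instead of being skipped.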

Upvotes: 1
