Bn.F76

Reputation: 1013

How to plot data per hour, grouped by days?

Background: from a large DataFrame I filtered out entries for year=2013, month=June, week of the 3rd - 9th (Monday to Sunday). Then, I grouped the data by day, hour, and user_type, and pivoted the table to get a DataFrame which looks like:

   Day  Hour  Casual  Registered  Casual_percentage
0  3    0     14      19          42.42
1  3    1     8       8           50.00
2  3    2     1       3           25.00
3  3    3     2       1           66.67
4  3    4     1       3           25.00
5  3    5     1       17          5.56
.  .    .     .       .           .

For each day I have 24 hours so for day 4 (Tuesday), the data starts like:

.  .    .     .       .           .  
21 3    21    32      88          26.67
22 3    22    26      64          28.89
23 3    23    23      30          43.40
24 4    0     10      11          47.62
25 4    1     1       5           16.67
26 4    2     1       1           50.00
.  .    .     .       .           .

How can I plot Casual and Registered variables per Hour, for each of the 7 Days? Would I need to create 7 different plots and align them in 1 figure?

Current code. I feel I'm way off. I also tried to create a second x-axis (for Days) using the documentation.

def make_patch_spines_invisible(ax):
    ax.set_frame_on(True)
    ax.patch.set_visible(False)
    for sp in ax.spines.values():
        sp.set_visible(False)

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel='Hours', ylabel='Total # of trips started')

ax1.plot(data.Hour, data.Casual, color='g')
ax1.plot(data.Hour, data.Registered, color='b')


"""This part is trying to create the 2nd x-axis (Days)"""
ax2 = ax1.twinx()
#offset the bottom spine
ax2.spines['bottom'].set_position(('axes', -.5))
make_patch_spines_invisible(ax2)
#show bottom spine
ax2.spines['bottom'].set_visible(True)
ax2.set_xlabel("Days")


plt.show()

Output: [image]

End goal: [image]

Upvotes: 1

Views: 10819

Answers (2)

Lante Dellarovere

Reputation: 1858

I think this would be easier if you worked with datetime objects rather than separate Day and Hour columns.
That way you can use matplotlib's date tick locators and formatters, together with major and minor ticks.

You didn't mention it, but I assume you can use pandas to handle the DataFrame.
I built a test DataFrame by copying the data you provided several times and trimming some of it (the details aren't important).
Here I rebuild the dates from the Day and Hour columns you provided, but I suggest working directly with the original dates (I suppose the original DataFrame has some kind of date-like field in it).

import pandas as pd
import matplotlib.pyplot as plt 
import matplotlib.dates as mdates

df = pd.read_csv("mydataframe.csv")
df["timestamp"] = "2013-06-" + df["Day"].astype(str).str.zfill(2) + "-" + df["Hour"].astype(str).str.zfill(2)
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d-%H")


fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel='', ylabel='Total # of trips started')
ax1.plot(df["timestamp"], df.Casual, color='g')
ax1.plot(df["timestamp"], df.Registered, color='b')

ax1.xaxis.set(
    major_locator=mdates.DayLocator(),
    major_formatter=mdates.DateFormatter("\n\n%A"),
    minor_locator=mdates.HourLocator((0, 12)),
    minor_formatter=mdates.DateFormatter("%H"),
)
plt.show()

Output: [image: formatted dataframe]
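
If the string concatenation above feels fragile, the same timestamps can probably be assembled from numeric columns instead. This is only a minimal sketch: it assumes the Day and Hour columns from the question, takes the fixed year and month (June 2013) from the question's background, and uses three rows copied from the question's table in place of the real CSV.

```python
import pandas as pd

# Three rows copied from the question's table (stand-in for the real data).
df = pd.DataFrame({
    "Day": [3, 3, 4],
    "Hour": [0, 23, 0],
    "Casual": [14, 23, 10],
    "Registered": [19, 30, 11],
})

# pd.to_datetime accepts a frame whose columns are named year/month/day/hour.
parts = pd.DataFrame({
    "year": 2013,   # assumption: every row is from June 2013
    "month": 6,
    "day": df["Day"],
    "hour": df["Hour"],
})
df["timestamp"] = pd.to_datetime(parts)
```

The resulting timestamp column can then be passed to ax1.plot exactly as in the code above.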

Upvotes: 5

Teuszie

Reputation: 74

Assuming your data is ordered by index (e.g., 0 - 23 is day 3, 24 - 47 is day 4, etc.), you can plot against the index values rather than the hours in your code:

ax1.plot(data.index.values, data.Casual, color='g')
ax1.plot(data.index.values, data.Registered, color='b')

This will yield a graph similar to what you're looking for as an end product (note I used fake data):

[image]
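
With index-based plotting the x-axis shows raw row numbers, so it may help to put one tick at the start of each day. The following is a rough sketch of that idea, not part of the original code: the data is randomly generated, and it assumes exactly 24 rows per day ordered by Day and then Hour with a default 0-based index.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fake data: 7 days x 24 hours, ordered by day and then hour.
rng = np.random.default_rng(0)
days = np.repeat(np.arange(3, 10), 24)
hours = np.tile(np.arange(24), 7)
data = pd.DataFrame({
    "Day": days,
    "Hour": hours,
    "Casual": rng.integers(0, 50, size=days.size),
    "Registered": rng.integers(0, 150, size=days.size),
})

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel="Day", ylabel="Total # of trips started")
ax1.plot(data.index.values, data.Casual, color="g", label="Casual")
ax1.plot(data.index.values, data.Registered, color="b", label="Registered")

# One tick at the first row of each day (index 0, 24, 48, ...),
# labeled with that row's day number.
day_starts = data.index[data["Hour"] == 0].to_numpy()
ax1.set_xticks(day_starts)
ax1.set_xticklabels(data.loc[day_starts, "Day"].astype(str).tolist())
ax1.legend()
```

Call plt.show() afterwards to display the figure. If some hours are missing for a day, the 24-rows-per-day assumption breaks and the datetime approach is likely the safer choice.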

Upvotes: 1
