Reputation: 1013
Background: from a large DataFrame, I kept only the entries for year=2013, month=June, and the week of the 3rd - 9th (Monday to Sunday). Then I grouped the data by day, hour, and user_type, and pivoted the table to get a DataFrame which looks like:
    Day  Hour  Casual  Registered  Casual_percentage
0     3     0      14          19              42.42
1     3     1       8           8              50.00
2     3     2       1           3              25.00
3     3     3       2           1              66.67
4     3     4       1           3              25.00
5     3     5       1          17               5.56
.     .     .       .           .                  .
For each day I have 24 hours, so the data for day 4 (Tuesday) starts like:
.     .     .       .           .                  .
21    3    21      32          88              26.67
22    3    22      26          64              28.89
23    3    23      23          30              43.40
24    4     0      10          11              47.62
25    4     1       1           5              16.67
26    4     2       1           1              50.00
.     .     .       .           .                  .
How can I plot the Casual and Registered variables per Hour, for each of the 7 Days? Would I need to create 7 different plots and align them in 1 figure?
Current code. I feel I'm way off. I also tried to create a second x-axis (for Days) using the documentation.
import matplotlib.pyplot as plt

def make_patch_spines_invisible(ax):
    ax.set_frame_on(True)
    ax.patch.set_visible(False)
    for sp in ax.spines.values():
        sp.set_visible(False)

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel='Hours', ylabel='Total # of trips started')
ax1.plot(data.Hour, data.Casual, color='g')
ax1.plot(data.Hour, data.Registered, color='b')

"""This part is trying to create the 2nd x-axis (Days)"""
ax2 = ax1.twinx()
# offset the bottom spine
ax2.spines['bottom'].set_position(('axes', -.5))
make_patch_spines_invisible(ax2)
# show bottom spine
ax2.spines['bottom'].set_visible(True)
ax2.set_xlabel("Days")
plt.show()
Upvotes: 1
Views: 10819
Reputation: 1858
I think this should be easier if you work on datetime objects rather than separate Day and Hour columns. This way, you'll be able to use date tick locators and formatters along with major and minor ticks.
Even if you didn't mention it, I assume you can use pandas to deal with dataframes.
I built a test dataframe by repeating the data you provided several times and trimming some of it (the details are not important).
Here I rebuilt the dates from the information you provided, but I suggest working directly with the original dates (I suppose the original dataframe has some kind of date-like field in it).
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

df = pd.read_csv("mydataframe.csv")

# Rebuild a timestamp such as "2013-06-03-00" from the Day and Hour columns
df["timestamp"] = "2013-06-" + df["Day"].astype(str).str.zfill(2) + "-" + df["Hour"].astype(str).str.zfill(2)
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d-%H")

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel='', ylabel='Total # of trips started')
ax1.plot(df["timestamp"], df.Casual, color='g')
ax1.plot(df["timestamp"], df.Registered, color='b')

# Major ticks: one per day, labelled with the weekday name two lines down
# Minor ticks: at hours 0 and 12, labelled with the hour
ax1.xaxis.set(
    major_locator=mdates.DayLocator(),
    major_formatter=mdates.DateFormatter("\n\n%A"),
    minor_locator=mdates.HourLocator((0, 12)),
    minor_formatter=mdates.DateFormatter("%H"),
)
plt.show()
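Output (image not reproduced here): one continuous plot for the whole week, with hours on the minor ticks and weekday names on the major ticks.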
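If the original dataframe really has no date-like field, the timestamp can also be built without string concatenation, because pd.to_datetime accepts a dataframe whose columns are named year, month, day and hour. A minimal sketch, assuming the year and month are fixed at June 2013 as in the question; the sample rows are copied from the question's table, and the `parts` name is only for illustration:

```python
import pandas as pd

# A few rows copied from the question's table (indices 0, 23, 24, 25).
df = pd.DataFrame({
    "Day": [3, 3, 4, 4],
    "Hour": [0, 23, 0, 1],
    "Casual": [14, 23, 10, 1],
    "Registered": [19, 30, 11, 5],
})

# Assumption: every row belongs to June 2013, so year and month are constants.
# The scalars are broadcast to the length of the Day and Hour columns.
parts = pd.DataFrame({
    "year": 2013,
    "month": 6,
    "day": df["Day"],
    "hour": df["Hour"],
})

# Assemble one datetime per row from the component columns.
df["timestamp"] = pd.to_datetime(parts)
```

The resulting `timestamp` column can be passed to `ax1.plot` exactly as in the code above.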
Upvotes: 5
Reputation: 74
Assuming your data is ordered by index (e.g., 0 - 23 is day 3, 24 - 47 is day 4, etc.) you can plot the index values rather than hours in your code:
ax1.plot(data.index.values, data.Casual, color='g')
ax1.plot(data.index.values, data.Registered, color='b')
This will yield a graph similar to what you're looking for as an end product (note I used fake data):
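To make an index-based x-axis easier to read, the ticks can be placed on the first row of each day and labelled with the day number. A rough sketch under the same ordering assumption (24 rows per day, sorted by day and then hour); the data below is synthetic, and `day_starts` is just an illustrative name:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's frame: days 3..9, 24 hours each,
# already sorted by day and then hour. The trip counts are random.
days = np.repeat(np.arange(3, 10), 24)
hours = np.tile(np.arange(24), 7)
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Day": days,
    "Hour": hours,
    "Casual": rng.integers(0, 50, size=days.size),
    "Registered": rng.integers(0, 100, size=days.size),
})

fig, ax1 = plt.subplots(figsize=(10, 5))
ax1.set(xlabel="Day of June 2013", ylabel="Total # of trips started")
ax1.plot(data.index.values, data.Casual, color="g", label="Casual")
ax1.plot(data.index.values, data.Registered, color="b", label="Registered")

# One tick at the first row of each day (Hour == 0), labelled with that day.
day_starts = data.index[data.Hour == 0].to_numpy()
ax1.set_xticks(day_starts)
ax1.set_xticklabels(data.loc[day_starts, "Day"].astype(str).tolist())
ax1.legend()
```

Call plt.show() afterwards to display the figure.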
Upvotes: 1