Reputation: 33
I'm still working on a pandas project with data from all the gym signups at my local school. I'm attempting to display a bar graph by (x = df.index.values, y = the count of values), the index consists of an array that looks like this
Time
10:00AM 44
10:00PM 45
11:00AM 98
12:00PM 93
1:00PM 58
2:00PM 95
3:00PM 148
4:00PM 118
5:00PM 196
6:00AM 11
6:00PM 169
7:00AM 8
7:00PM 108
8:00AM 49
8:00PM 100
9:00AM 90
9:00PM 75
Name: Name, dtype: int64
This data represents every occurrence of a sign-up on a Monday, I'm now trying to graph this and I use the ["Time"] column as the x value on a bar plot. As I expected, the graph doesn't sort the values by
"6:00AM","7:00AM", "8:00AM","9:00AM","10:00AM", "11:00AM", "12:00PM", "1:00PM","2:00PM", "3:00PM", "4:00PM", "5:00PM", "6:00PM", "7:00PM", "8:00PM", "9:00PM"
I attempted to use regex to find the values that end in "AM" or "PM" and placed them in opposite arrays, but that has lead me down a dark path.
Does anyone have a suggestion to sorting these values? Should I convert them into a datetime and then apply df.sort.values()?
Here is the code that has gotten me to display this graph:
import numpy as np
import pandas as pd
main_data = pd.read_csv('newoutput3.csv', delimiter=",", encoding='cp1252')
main_data['Date'] = pd.to_datetime(main_data['Date'])
main_data['dayOfWeek'] = main_data['Date'].dt.day_name()
main_data_dayOfWeek = main_data.groupby('dayOfWeek')
Monday = main_data_dayOfWeek.get_group('Monday')
## HERE IS THE GROUPING BY TIME!!
MondayTimes = Monday.groupby('Time')
##Displays the dataframe that I posted at the top
test = MondayTimes['Name'].count()
plt.bar(test.index.values, test, align='center', alpha=0.5)
plt.xticks(rotation=90)
plt.show()
Lastly, here is some an example of my original data!
I hope I provided as much information as possible, thanks for reading this and being apart of the project which i'm having so much fun developing!
Upvotes: 0
Views: 416
Reputation: 428
Edit: still keeping the method of sorting by a key, however, this doesn't need to split the list first, by basically converting the hour to 24h format.
import random
times = ["12:00AM", "6:00AM","7:00AM", "8:00AM","9:00AM","10:00AM", "11:00AM", "12:00PM", "1:00PM","2:00PM", "3:00PM", "4:00PM", "5:00PM", "6:00PM", "7:00PM", "8:00PM", "9:00PM"]
random.shuffle(times)
print(times)
def srt(item):
am_pm = 12 if (item.endswith("PM") and "12:00" not in item) else -12 if (item.endswith("AM") and "12:00" in item) else 0
return int(item.split(":")[0])+am_pm # 9:00PM -> ['9', '00PM'] -> 9
times.sort(key=srt)
print(times)
Upvotes: 0
Reputation: 2583
First, convert to 24-hour format then sort it ascending:
df =(
df.assign(
time = lambda x: pd.to_datetime(x['time']).dt.strftime('%H:%M:%S')
).sort_values('time')
)
Upvotes: 2