mlabenski
mlabenski

Reputation: 33

Sorting a dataframe index by ascending time values

I'm still working on a pandas project with data from all the gym signups at my local school. I'm attempting to display a bar graph by (x = df.index.values, y = the count of values), the index consists of an array that looks like this

Time
10:00AM     44
10:00PM     45
11:00AM     98
12:00PM     93
1:00PM      58
2:00PM      95
3:00PM     148
4:00PM     118
5:00PM     196
6:00AM      11
6:00PM     169
7:00AM       8
7:00PM     108
8:00AM      49
8:00PM     100
9:00AM      90
9:00PM      75
Name: Name, dtype: int64

This data represents every occurrence of a sign-up on a Monday, I'm now trying to graph this and I use the ["Time"] column as the x value on a bar plot. As I expected, the graph doesn't sort the values by

"6:00AM","7:00AM", "8:00AM","9:00AM","10:00AM", "11:00AM", "12:00PM", "1:00PM","2:00PM", "3:00PM", "4:00PM", "5:00PM", "6:00PM", "7:00PM", "8:00PM", "9:00PM"

I attempted to use regex to find the values that end in "AM" or "PM" and placed them in opposite arrays, but that has lead me down a dark path.

Does anyone have a suggestion to sorting these values? Should I convert them into a datetime and then apply df.sort.values()?

Here is the code that has gotten me to display this graph:

import numpy as np
import pandas as pd
main_data = pd.read_csv('newoutput3.csv', delimiter=",", encoding='cp1252')
main_data['Date'] = pd.to_datetime(main_data['Date'])
main_data['dayOfWeek'] = main_data['Date'].dt.day_name() 
main_data_dayOfWeek = main_data.groupby('dayOfWeek')
Monday = main_data_dayOfWeek.get_group('Monday')
## HERE IS THE GROUPING BY TIME!!
MondayTimes = Monday.groupby('Time')
##Displays the dataframe that I posted at the top
test = MondayTimes['Name'].count()
plt.bar(test.index.values, test, align='center', alpha=0.5)
plt.xticks(rotation=90)
plt.show()

Graph Here

Lastly, here is some an example of my original data!

enter image description here

I hope I provided as much information as possible, thanks for reading this and being apart of the project which i'm having so much fun developing!

Upvotes: 0

Views: 416

Answers (2)

Tyler Stoney
Tyler Stoney

Reputation: 428

Edit: still keeping the method of sorting by a key, however, this doesn't need to split the list first, by basically converting the hour to 24h format.

import random
times = ["12:00AM", "6:00AM","7:00AM", "8:00AM","9:00AM","10:00AM", "11:00AM", "12:00PM", "1:00PM","2:00PM", "3:00PM", "4:00PM", "5:00PM", "6:00PM", "7:00PM", "8:00PM", "9:00PM"]

random.shuffle(times)
print(times)

def srt(item):
    am_pm = 12 if (item.endswith("PM") and "12:00" not in item) else -12 if (item.endswith("AM") and "12:00" in item) else 0
    return int(item.split(":")[0])+am_pm # 9:00PM -> ['9', '00PM'] -> 9
times.sort(key=srt)

print(times)

Upvotes: 0

Mehdi Golzadeh
Mehdi Golzadeh

Reputation: 2583

First, convert to 24-hour format then sort it ascending:

df =(
    df.assign(
        time = lambda x: pd.to_datetime(x['time']).dt.strftime('%H:%M:%S')
    ).sort_values('time')
)

Upvotes: 2

Related Questions