Reputation: 309
I am trying to return the index of a DataFrame result. First I load a CSV file (CSV example below).
I wrote code to count the number of rows for each hour and return the maximum count, as below:
import pandas as pd
filename = 'mylist.csv'
df = pd.read_csv(filename)
df['Start Time'] = df['Start Time'].astype('datetime64[ns]')
df['hour'] = df['Start Time'].dt.hour
# find the most common hour (from 0 to 23)
popular_hour = df.groupby(['hour'])['hour'].count().max()
print('Most Frequent Start Hour:', popular_hour)
What I am trying to do is return the hour itself, not the count. I tried using the index
as below, but it doesn't work:
popular_hour = df.groupby(['hour'])['hour'].count().max().index.values
Upvotes: 1
Views: 118
Reputation: 863116
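I think you need Series.idxmax, which returns the index label of the maximal value of the Series returned by GroupBy.count. In your attempt, .max() returns a single number (the count), and a number has no .index.
Note: to convert to datetimes, it is better to use the parse_dates parameter of read_csv: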
df = pd.read_csv(filename, parse_dates=['Start Time','End Time'])
df['hour'] = df['Start Time'].dt.hour
popular_hour = df.groupby(['hour'])['hour'].count().idxmax()
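To make the difference between max() and idxmax() concrete, here is a minimal self-contained sketch. The CSV contents are made up for illustration (the real mylist.csv is not shown here), and io.StringIO stands in for the file on disk:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for mylist.csv (assumption, not the real file)
csv_text = """Start Time,End Time
2017-01-01 09:07:57,2017-01-01 09:20:53
2017-01-02 09:15:00,2017-01-02 09:40:00
2017-01-03 17:30:10,2017-01-03 17:45:00
2017-01-04 09:59:59,2017-01-04 10:10:00
2017-01-05 17:05:00,2017-01-05 17:25:00
"""

# Parse both time columns as datetimes while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['Start Time', 'End Time'])
df['hour'] = df['Start Time'].dt.hour

# Series indexed by hour, with the number of rows per hour as values
counts = df.groupby('hour')['hour'].count()

max_count = counts.max()        # the largest count (a plain number)
popular_hour = counts.idxmax()  # the index label (hour) where that count occurs

print('Highest count:', max_count)
print('Most Frequent Start Hour:', popular_hour)
```

With this sample data, hour 9 occurs three times and hour 17 twice, so max() gives the count and idxmax() gives the hour.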
Another idea is to use Series.value_counts. It sorts by count in descending order by default, so the first value is also the maximal one:
popular_hour = df['hour'].value_counts().idxmax()
This works the same as selecting the first index label:
popular_hour = df['hour'].value_counts().index[0]
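As a quick check of the two value_counts variants, here is a small sketch on a hand-made Series of hours (the values are invented for illustration). It also shows Series.mode, which is not part of the answer above but may be useful if several hours can tie for the highest count, since it returns all of them:

```python
import pandas as pd

# Hypothetical hours, standing in for df['hour']
hours = pd.Series([9, 9, 17, 9, 17, 8], name='hour')

counts = hours.value_counts()  # sorted by count, descending, by default

via_idxmax = counts.idxmax()   # label of the largest count
via_first = counts.index[0]    # first label after the default sort

print(via_idxmax, via_first)

# If ties matter, mode() returns every most-frequent value
tied = pd.Series([9, 9, 17, 17, 8])
print(tied.mode().tolist())
```

When there is a single most frequent hour, both variants return it. With ties, each returns only one of the tied hours, so mode() may be the safer choice if you need all of them.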
Upvotes: 1