Pipo

Reputation: 309

Get the index of a counted DataFrame

I am trying to get the index label of a grouped DataFrame result. First I load a CSV file.

I wrote the code below to count the rows for each hour and return the highest count:

import pandas as pd

filename = 'mylist.csv'

df = pd.read_csv(filename)

df['Start Time'] = df['Start Time'].astype('datetime64[ns]')

df['hour'] = df['Start Time'].dt.hour

# find the most common hour (from 0 to 23)
popular_hour = df.groupby(['hour'])['hour'].count().max()

print('Most Frequent Start Hour:', popular_hour)

What I want is to return the hour itself, not the count. I tried using the index as shown below, but it doesn't work:

popular_hour = df.groupby(['hour'])['hour'].count().max().index.values

Upvotes: 1

Views: 118

Answers (1)

jezrael

Reputation: 863116

I think you need Series.idxmax, which returns the index label of the maximal value of the Series returned by GroupBy.count:

Note: to convert the columns to datetimes, it is better to use the parse_dates parameter of read_csv.

df = pd.read_csv(filename, parse_dates=['Start Time','End Time'])

df['hour'] = df['Start Time'].dt.hour

popular_hour = df.groupby(['hour'])['hour'].count().idxmax()
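To make the difference between max and idxmax concrete, here is a minimal self-contained sketch. The CSV content is made up for illustration (the real mylist.csv is not shown in the question), and the names csv_text, counts and max_count are assumptions of this example only. It also suggests why the attempt in the question fails: max returns a plain number rather than a Series, so there is no index to read from it.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for mylist.csv (not from the question)
csv_text = """Start Time,End Time
2017-01-01 09:07:57,2017-01-01 09:20:53
2017-01-02 09:15:00,2017-01-02 09:40:00
2017-01-03 17:30:12,2017-01-03 17:45:00
2017-01-04 09:59:59,2017-01-04 10:10:00
2017-01-05 17:05:00,2017-01-05 17:25:00
"""

# parse_dates converts both columns to datetimes while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['Start Time', 'End Time'])
df['hour'] = df['Start Time'].dt.hour

# Series of row counts, indexed by hour
counts = df.groupby('hour')['hour'].count()

max_count = counts.max()        # the largest count (a scalar)
popular_hour = counts.idxmax()  # the hour (index label) that has that count

print('Most Frequent Start Hour:', popular_hour)
print('Number of starts in that hour:', max_count)
```

In this sample, hour 9 occurs three times and hour 17 twice, so idxmax gives the hour while max gives the count.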

Another idea is to use Series.value_counts. It sorts by count in descending order by default, so the first value is also the maximum:

popular_hour = df['hour'].value_counts().idxmax()

This works the same as selecting the first index label:

popular_hour = df['hour'].value_counts().index[0]
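As a quick check of the two value_counts variants, the sketch below builds a small Series of hours by hand (the values are invented for illustration, and hours, by_idxmax and by_first are names used only here). With a single clear winner both approaches should return the same hour; if several hours tie for the top count, which one comes first may depend on pandas internals, so treat that case with care.

```python
import pandas as pd

# Hypothetical hour values, as df['Start Time'].dt.hour would produce
hours = pd.Series([9, 9, 17, 9, 17, 8], name='hour')

# Counts per hour, sorted from most to least frequent by default
counts = hours.value_counts()

by_idxmax = counts.idxmax()   # label of the largest count
by_first = counts.index[0]    # first label after the descending sort

print(by_idxmax, by_first)
```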

Upvotes: 1
