Grace_radetsk

Reputation: 1

KeyError when trying to access column of a Pandas DataFrame

This is my first time using Stack Overflow, so please forgive me if my question doesn't follow the proper conventions.

I'm trying to write a function that finds the station with the most riders on the first day and returns the mean riders per day for that station, along with the overall mean ridership. However, when I ran the following code, a KeyError was raised, as shown below. Please advise what went wrong. Thank you very much!

import pandas as pd

def mean_riders_for_max_station(ridership_df):
    
    overall_mean = ridership_df.mean()

    max_station = ridership_df.iloc[0].argmax()  
    mean_for_max = ridership_df[max_station].mean() 
    return (overall_mean, mean_for_max)

ridership_df = pd.DataFrame(
    data=[[   0,    0,    2,    5,    0],
          [1478, 3877, 3674, 2328, 2539],
          [1613, 4088, 3991, 6461, 2691],
          [1560, 3392, 3826, 4787, 2613],
          [1608, 4802, 3932, 4477, 2705],
          [1576, 3933, 3909, 4979, 2685],
          [  95,  229,  255,  496,  201],
          [   2,    0,    1,   27,    0],
          [1438, 3785, 3589, 4174, 2215],
          [1342, 4043, 4009, 4665, 3033]],
    index=['05-01-11', '05-02-11', '05-03-11', '05-04-11', '05-05-11',
           '05-06-11', '05-07-11', '05-08-11', '05-09-11', '05-10-11'],
    columns=['R003', 'R004', 'R005', 'R006', 'R007']
)

print(mean_riders_for_max_station(ridership_df))

I received the following error message:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 3

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-23-60b53dc0106e> in <module>
     37 )
     38 
---> 39 mean_riders_for_max_station(ridership_df)

<ipython-input-23-60b53dc0106e> in mean_riders_for_max_station(ridership_df)
     17 
     18     max_station = ridership_df.iloc[0].argmax()   #difference between argmax() for an array (--returning a location)
---> 19     mean_for_max = ridership_df[max_station].mean() #and argmax() for a series: returning index (or column name of the dataframe)
     20     return (overall_mean, mean_for_max)
     21 

~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2900             if self.columns.nlevels > 1:
   2901                 return self._getitem_multilevel(key)
-> 2902             indexer = self.columns.get_loc(key)
   2903             if is_integer(indexer):
   2904                 indexer = [indexer]

~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898 
   2899         if tolerance is not None:

KeyError: 3

Upvotes: 0

Views: 2725

Answers (2)

Aswin Babu

Reputation: 348

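max_station will be 3, the integer position of the maximum in the first row, but ridership_df[max_station] raises a KeyError because there is no column named 3. The columns are labeled 'R003' through 'R007'.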
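To make the position/label mismatch concrete, here is a minimal sketch using only the first two rows of the question's data. The variable names `pos` and `label` are just for illustration. It assumes a pandas version where Series.argmax() returns an integer position, which matches the traceback in the question.

```python
import pandas as pd

# First two rows of the question's data, enough to show the mismatch.
df = pd.DataFrame(
    data=[[0, 0, 2, 5, 0],
          [1478, 3877, 3674, 2328, 2539]],
    index=['05-01-11', '05-02-11'],
    columns=['R003', 'R004', 'R005', 'R006', 'R007'],
)

# argmax() gives the integer position of the largest value in the first row.
pos = df.iloc[0].argmax()

# df[pos] would look for a column *labeled* 3 and raise KeyError.
# Either translate the position into a label ...
label = df.columns[pos]
mean_by_label = df[label].mean()

# ... or select the column by position with iloc.
mean_by_position = df.iloc[:, pos].mean()

print(pos, label, mean_by_label, mean_by_position)
```

Both lookups reach the same column, so either one would work as a patch if you want to keep argmax().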

Upvotes: 0

rwadman

Reputation: 81

In pandas 1.0 and later, the argmax() method of a Series returns the integer position of the maximum value, not its index label.

What you want is max_station = ridership_df.iloc[0].idxmax().

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.argmax.html
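For reference, this is roughly what the question's function looks like with idxmax() swapped in. It is a sketch of the one-line change, not a rewrite: overall_mean is left as in the question, so it is still a Series of per-column means rather than a single number. The variable `first_day_max` is only added for illustration.

```python
import pandas as pd

def mean_riders_for_max_station(ridership_df):
    # Left as in the question: per-column means, returned as a Series.
    overall_mean = ridership_df.mean()

    # idxmax() returns the index *label* of the maximum, which for a row
    # of the DataFrame is a column name such as 'R006'.
    max_station = ridership_df.iloc[0].idxmax()
    mean_for_max = ridership_df[max_station].mean()
    return (overall_mean, mean_for_max)

ridership_df = pd.DataFrame(
    data=[[   0,    0,    2,    5,    0],
          [1478, 3877, 3674, 2328, 2539],
          [1613, 4088, 3991, 6461, 2691],
          [1560, 3392, 3826, 4787, 2613],
          [1608, 4802, 3932, 4477, 2705],
          [1576, 3933, 3909, 4979, 2685],
          [  95,  229,  255,  496,  201],
          [   2,    0,    1,   27,    0],
          [1438, 3785, 3589, 4174, 2215],
          [1342, 4043, 4009, 4665, 3033]],
    index=['05-01-11', '05-02-11', '05-03-11', '05-04-11', '05-05-11',
           '05-06-11', '05-07-11', '05-08-11', '05-09-11', '05-10-11'],
    columns=['R003', 'R004', 'R005', 'R006', 'R007']
)

first_day_max = ridership_df.iloc[0].idxmax()  # column label, not a position
overall_mean, mean_for_max = mean_riders_for_max_station(ridership_df)
print(first_day_max)
print(mean_for_max)
```

With this data the first-day maximum falls in column 'R006', so the lookup on the next line succeeds instead of raising KeyError: 3.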

Upvotes: 1
