David
Reputation: 487

Pandas Groupby error

When I try to group a DataFrame by month and country, I get the following error: 'tuple' object has no attribute 'lower'

Here is the code I am using:

df = df.groupby(
    [pd.to_datetime(df.time).dt.strftime('%b %Y'), 'Country']
)['% Return'].mean().reset_index()

Example DataFrame:

time       Country  % Return
2017-07-30   br         3
2017-07-31   br         4
2017-08-01   br         5
2017-08-02   br         6
2017-08-03   br         7
2017-07-30   es         2
2017-07-31   es         3
2017-08-01   es         4
2017-08-02   es         5
2017-08-03   es         6

Desired output:

time        Country  % Return
2017-07-01    br        3.5
2017-08-01    br        6
2017-07-01    es        2.5
2017-08-01    es        5

I have used this same code for similar DataFrames, so I am not sure why it's not working this time.

Thanks in advance

Edit:

Python version: 2.6.6
pandas version: 0.22.0

Full error:

AttributeError                            Traceback (most recent call last)
<ipython-input-29-db720e55a304> in <module>()
      1 new_return_poster_df_g = new_return_poster_df_g.groupby(
----> 2     [pd.to_datetime(new_return_poster_df_g.time).dt.strftime('%b %Y'), 'Country']
      3 )['% Return Poster'].mean().reset_index(name='% Return Poster')

/var/local/ishbook.executor.daemon/lib/python-venvs/libraries/pandas==0.22.0/lib/python2.7/site-packages/pandas/core/tools/datetimes.pyc in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
    374         result = Series(values, index=arg.index, name=arg.name)
    375     elif isinstance(arg, (ABCDataFrame, MutableMapping)):
--> 376         result = _assemble_from_unit_mappings(arg, errors=errors)
    377     elif isinstance(arg, ABCIndexClass):
    378         result = _convert_listlike(arg, box, format, name=arg.name)

/var/local/ishbook.executor.daemon/lib/python-venvs/libraries/pandas==0.22.0/lib/python2.7/site-packages/pandas/core/tools/datetimes.pyc in _assemble_from_unit_mappings(arg, errors)
    444         return value
    445 
--> 446     unit = {k: f(k) for k in arg.keys()}
    447     unit_rev = {v: k for k, v in unit.items()}
    448 

/var/local/ishbook.executor.daemon/lib/python-venvs/libraries/pandas==0.22.0/lib/python2.7/site-packages/pandas/core/tools/datetimes.pyc in <dictcomp>((k,))
    444         return value
    445 
--> 446     unit = {k: f(k) for k in arg.keys()}
    447     unit_rev = {v: k for k, v in unit.items()}
    448 

/var/local/ishbook.executor.daemon/lib/python-venvs/libraries/pandas==0.22.0/lib/python2.7/site-packages/pandas/core/tools/datetimes.pyc in f(value)
    439 
    440         # m is case significant
--> 441         if value.lower() in _unit_map:
    442             return _unit_map[value.lower()]
    443 

AttributeError: 'tuple' object has no attribute 'lower'

Upvotes: 2

Views: 1217

Answers (1)

YOLO
Reputation: 21749

I think you can use pd.Grouper to get the desired output:

Step 1: Convert to datetime and set time as index

df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')
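
If Step 1 itself raises the same AttributeError as in the question, df['time'] is probably returning a DataFrame rather than a Series. The traceback goes through _assemble_from_unit_mappings, which is the path pd.to_datetime takes for DataFrame input, and fails while calling .lower() on a column key that is a tuple. One way this can happen is multi-level column labels. The sketch below is only a guess at the cause: the frame, the level names 'a' and 'x', and the flattening fix are all hypothetical, since the question does not show df.columns.

```python
import pandas as pd

# Hypothetical frame with three-level column labels (the real columns are not shown in the question)
cols = pd.MultiIndex.from_tuples(
    [("time", "a", "x"), ("Country", "a", "x"), ("% Return", "a", "x")]
)
df = pd.DataFrame(
    [["2017-07-30", "br", 3], ["2017-07-31", "br", 4]],
    columns=cols,
)

# Selecting the top-level label gives a DataFrame whose remaining column keys are tuples
print(type(df["time"]))
print(df.columns.nlevels)

# One possible fix under this assumption: keep only the first level of the column labels
flat = df.copy()
flat.columns = flat.columns.get_level_values(0)
flat["time"] = pd.to_datetime(flat["time"])  # now a plain Series of timestamps
```

Checking type(df['time']) and df.columns on the real data should confirm or rule this out before changing anything.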

Step 2: Group by time and country

df = df.groupby([pd.Grouper(freq='M'), 'Country'])['% Return'].mean().reset_index()

    time       Country  % Return
0   2017-07-31  br       3.5
1   2017-07-31  es       2.5
2   2017-08-31  br       6.0
3   2017-08-31  es       5.0
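
For reference, the two steps can be run end to end on the sample data from the question. This is a minimal sketch, assuming the value column is named '% Return' as in the question. It uses freq='MS' (month start) rather than 'M', so the labels come out as 2017-07-01 and 2017-08-01 like the desired output instead of month-end dates. The name monthly is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": ["2017-07-30", "2017-07-31", "2017-08-01", "2017-08-02", "2017-08-03"] * 2,
    "Country": ["br"] * 5 + ["es"] * 5,
    "% Return": [3, 4, 5, 6, 7, 2, 3, 4, 5, 6],
})

# Step 1: convert to datetime and set time as the index
df["time"] = pd.to_datetime(df["time"])
df = df.set_index("time")

# Step 2: group by month start and country, then average the returns
monthly = (
    df.groupby([pd.Grouper(freq="MS"), "Country"])["% Return"]
      .mean()
      .reset_index()
)
print(monthly)
```

Rows are ordered by month and then by country, which differs from the ordering in the desired output; a sort_values(['Country', 'time']) at the end would group them by country first.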

Upvotes: 1
