geds133

Reputation: 1485

How to group by a column and return a DataFrame instead of a groupby object

I have a DataFrame that looks like this:

Date        Yearly_cost
2009-01-01  230
2010-03-03  260
2009-01-01  320
2007-03-02  430

The DataFrame contains duplicate values for Date, each with a different value for Yearly_cost. I want to group by Date so that I get a consistent time series with all corresponding values for each day. However, I want the result to be a DataFrame rather than a groupby object.

The desired result would look like this:

Date        Yearly_cost
2007-03-02  430
2009-01-01  230, 320
2010-03-03  260

Any help would be appreciated.

Upvotes: 0

Views: 54

Answers (2)

U13-Forward

Reputation: 71560

To answer the revised question, use:

df.groupby('Date')['Yearly_cost'].apply(list).reset_index(name='Yearly_cost')

If you want single-element lists unwrapped to scalars (e.g. [430] becomes 430) while multi-element lists stay as lists, use:

df.groupby('Date')['Yearly_cost'].apply(list).apply(lambda x: x[0] if len(x) == 1 else x).reset_index(name='Yearly_cost')
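For reference, here is a minimal runnable sketch of the first one-liner applied to the sample data from the question. The variable name `grouped` is only illustrative, and the Date column is assumed to hold plain strings:

```python
import pandas as pd

# Sample data copied from the question (Date kept as strings here).
df = pd.DataFrame({
    "Date": ["2009-01-01", "2010-03-03", "2009-01-01", "2007-03-02"],
    "Yearly_cost": [230, 260, 320, 430],
})

# Collect every Yearly_cost value per Date into a list, then turn the
# resulting Series back into a regular DataFrame with Date as a column.
grouped = (
    df.groupby("Date")["Yearly_cost"]
      .apply(list)
      .reset_index(name="Yearly_cost")
)

print(type(grouped))
print(grouped)
```

groupby sorts the group keys by default, so the rows come back in ascending Date order.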

Upvotes: 1

VnC

Reputation: 2016

Say you have the following df:

import pandas as pd

df1 = pd.DataFrame({'Date': ['2009-01-01', '2009-01-01', '2010-03-03', '2010-03-03', '2004-04-03', '2007-03-02'],
                    'Yearly_cost': [230, 460, 260, 250, 320, 430]})
df1

        Date    Yearly_cost
0   2009-01-01  230
1   2009-01-01  460
2   2010-03-03  260
3   2010-03-03  250
4   2004-04-03  320
5   2007-03-02  430

You can do the following:

df1['Yearly_cost'] = df1.Yearly_cost.astype(str)
df1.groupby('Date')['Yearly_cost'].apply(','.join)

Result:

Date
2004-04-03        320
2007-03-02        430
2009-01-01    230,460
2010-03-03    260,250
Name: Yearly_cost, dtype: object

To turn the result into a DataFrame (with Date as the index):

df1['Yearly_cost'] = df1.Yearly_cost.astype(str)
combined = df1.groupby('Date')['Yearly_cost'].apply(','.join)
pd.DataFrame(combined)

And you'll get:

            Yearly_cost
Date    
2004-04-03  320
2007-03-02  430
2009-01-01  230,460
2010-03-03  260,250
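If you would rather keep Date as an ordinary column and leave the original Yearly_cost column numeric, one possible variant of the approach above is sketched below. It assumes the same `df1` as before; the use of `assign` and `reset_index` is my own choice here, not something the question requires:

```python
import pandas as pd

df1 = pd.DataFrame({
    "Date": ["2009-01-01", "2009-01-01", "2010-03-03",
             "2010-03-03", "2004-04-03", "2007-03-02"],
    "Yearly_cost": [230, 460, 260, 250, 320, 430],
})

# assign() returns a copy with the column cast to str, so df1 itself
# keeps its integer Yearly_cost column.
combined = (
    df1.assign(Yearly_cost=df1["Yearly_cost"].astype(str))
       .groupby("Date")["Yearly_cost"]
       .agg(",".join)   # join each group's values into one string
       .reset_index()   # move Date from the index back to a column
)

print(combined)
```

Note that the joined values are strings, so they need to be split and converted again before any numeric work.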

Upvotes: 1
