Nikhil Ratna Shakya

Reputation: 113

How do I specify a column header for pandas groupby result?

I need to group by two columns and then return the values of a third column in concatenated form. I have managed to do this, but the returned DataFrame has a column named 0. Just 0. Is there a way to specify what the result column will be called?

    all_columns_grouped = all_columns.groupby(['INDEX','URL'], as_index  = False)['VALUE'].apply(lambda x: ' '.join(x)).reset_index()

The resulting DataFrame has the headers

    INDEX | URL | 0

The results are in the 0 column. I have managed to rename the column using

    .rename(index=str, columns={0: "variant"})

but this seems very inelegant.

Is there any way to provide a header for the column directly? Thanks

Upvotes: 1

Views: 14034

Answers (2)

Alexander

Reputation: 109546

You can call agg on the selected column (VALUE in this case) with a dict that maps the desired column name to the function, which names the result column directly.

    import pandas as pd

    # Sample data (thanks @jezrael)
    all_columns = pd.DataFrame({'VALUE':['a','s','d','ss','t','y'],
                       'URL':[5,5,4,4,4,4],
                       'INDEX':list('aaabbb')})

    # Solution
    >>> all_columns.groupby(['INDEX','URL'], as_index=False)['VALUE'].agg(
    ...     {'variant': lambda x: ' '.join(x)})
      INDEX  URL variant
    0     a    4       d
    1     a    5     a s
    2     b    4  ss t y
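
Note for newer pandas: as far as I know, passing a renaming dict to agg on a single selected column was deprecated in pandas 0.20 and raises SpecificationError from 1.0 onward. Below is a minimal sketch of the same idea using named aggregation, assuming pandas 0.25 or later. The sample data and the name variant come from this thread; moving the column selection into the (column, function) tuple and adding reset_index() are my own adaptation, not part of the original answer.

```python
import pandas as pd

# Sample data from the thread
all_columns = pd.DataFrame({'VALUE': ['a', 's', 'd', 'ss', 't', 'y'],
                            'URL': [5, 5, 4, 4, 4, 4],
                            'INDEX': list('aaabbb')})

# Named aggregation: keyword = output column name,
# value = (input column, aggregation function)
all_columns_grouped = (
    all_columns.groupby(['INDEX', 'URL'])
    .agg(variant=('VALUE', lambda s: ' '.join(s)))
    .reset_index()  # turn the group keys back into ordinary columns
)

print(all_columns_grouped)
```

The keyword argument sets the header, so no rename step is needed afterwards.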

Upvotes: 2

jezrael

Reputation: 862791

The simplest solution is to remove as_index=False so that the groupby returns a Series, and then pass the name parameter to reset_index:

Sample:

    import pandas as pd

    all_columns = pd.DataFrame({'VALUE':['a','s','d','ss','t','y'],
                       'URL':[5,5,4,4,4,4],
                       'INDEX':list('aaabbb')})

    print(all_columns)
      INDEX  URL VALUE
    0     a    5     a
    1     a    5     s
    2     a    4     d
    3     b    4    ss
    4     b    4     t
    5     b    4     y

    # Without as_index=False the result is a Series indexed by (INDEX, URL);
    # reset_index(name=...) names the values column while restoring the keys
    all_columns_grouped = all_columns.groupby(['INDEX','URL'])['VALUE'] \
                                     .apply(' '.join) \
                                     .reset_index(name='variant')

    print(all_columns_grouped)
      INDEX  URL variant
    0     a    4       d
    1     a    5     a s
    2     b    4  ss t y
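
To make the intermediate step visible, here is a small sketch that splits the chain in two, using the same sample data. The variable names joined and result are only for illustration.

```python
import pandas as pd

all_columns = pd.DataFrame({'VALUE': ['a', 's', 'd', 'ss', 't', 'y'],
                            'URL': [5, 5, 4, 4, 4, 4],
                            'INDEX': list('aaabbb')})

# Step 1: with the default as_index=True, apply returns a Series
# whose index levels are the group keys
joined = all_columns.groupby(['INDEX', 'URL'])['VALUE'].apply(' '.join)
print(type(joined).__name__)
print(list(joined.index.names))

# Step 2: reset_index moves the keys into columns; name= sets the
# header of the column that holds the Series values
result = joined.reset_index(name='variant')
print(list(result.columns))
```

Because the name is supplied at the moment the Series becomes a DataFrame, there is no 0 column to rename later.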

Upvotes: 5
