Shashishekhar Hasabnis

Reputation: 1756

Pandas: how to combine values into comma-separated strings when pivoting rows?

I have the following pandas DataFrame:

import pandas as pd
df = pd.DataFrame({
    'code': ['eq150', 'eq150', 'eq152', 'eq151', 'eq151', 'eq150'],
    'reg': ['A', 'C', 'H', 'P', 'I', 'G'],
    'month': ['1', '2', '4', '2', '1', '1']
})
df

    code   reg  month
0   eq150   A    1
1   eq150   C    2
2   eq152   H    4
3   eq151   P    2
4   eq151   I    1
5   eq150   G    1

Expected output:

          1      2      3      4
eq150   A, G     C
eq152                          H
eq151     I      P

Upvotes: 1

Views: 575

Answers (2)

CHRD

Reputation: 1957

If you want the output to include the empty column for month 3 as well:

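# Every month from the smallest to the largest, as strings (df.month holds strings)
all_cols = [
    str(m)
    for m in range(df.month.astype(int).min(), df.month.astype(int).max() + 1)
]

# Months inside that range that never occur in the data
add_cols = list(set(all_cols) - set(df.month.unique()))

# One row per code, one column per month; several reg values are joined with a comma
df = df.pivot_table(
    index='code',
    columns='month',
    aggfunc=','.join
).reg.rename_axis(None).rename_axis(None, axis=1).fillna('')

# Add the missing months as empty columns
for col in add_cols:
    df[col] = ''

# Put the columns in month order
df = df[all_cols]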

df
        1   2   3   4
eq150   A,G C       
eq151   I   P       
eq152               H

Upvotes: 1

jezrael

Reputation: 863196

Use pivot_table with DataFrame.reindex to add the missing months:

df['month'] = df['month'].astype(int)
r = range(df['month'].min(), df['month'].max() + 1)

df1 = (df.pivot_table(index='code', 
                      columns='month', 
                      values='reg', 
                      aggfunc=','.join,
                      fill_value='')
          .reindex(r, fill_value='', axis=1))
print(df1)
month    1  2 3  4
code              
eq150  A,G  C     
eq151    I  P     
eq152            H
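
The same reshaping can also be sketched with groupby and unstack instead of pivot_table. This is only an illustration, not part of the answer above: the names `months`, `all_months` and `wide` are made up for the example, and the separator is `', '` (with a space) on the assumption that the output should match the `A, G` shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'code': ['eq150', 'eq150', 'eq152', 'eq151', 'eq151', 'eq150'],
    'reg': ['A', 'C', 'H', 'P', 'I', 'G'],
    'month': ['1', '2', '4', '2', '1', '1'],
})

# Months are stored as strings, so convert them before building the full range
months = df['month'].astype(int)
all_months = list(range(months.min(), months.max() + 1))

wide = (
    df.assign(month=months)
      .groupby(['code', 'month'])['reg']
      .agg(', '.join)                # join every reg in a (code, month) group
      .unstack(fill_value='')        # months become columns, gaps become ''
      .reindex(columns=all_months, fill_value='')  # add months with no data at all
)

print(wide)
```

Within each group the values are joined in their original row order, and the columns end up as integers rather than strings.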

Upvotes: 0
