Daniel
Daniel

Reputation: 49

How to create columns based on string

Item Date
Bread,Muffin 1/3
Jam,Cake 2/3
Jam 3/3

I would like to change to the output below using python / pandas

Date Bread Muffin Jam Cake
1/3 1 1
2/3 1 1
3/3 1

Upvotes: 1

Views: 48

Answers (1)

jezrael
jezrael

Reputation: 862761

If need 1 if value exist in column use Series.str.get_dummies:

df = df.join(df.pop('Item').str.get_dummies(',').replace(0, ''))
print (df)
  Date Bread Cake Jam Muffin
0  1/3     1               1
1  2/3          1   1       
2  3/3              1       

If need counts values use crosstab with DataFrame.explode od splitted values by ,:

df1 = df.assign(Item = df['Item'].str.split(',')).explode('Item')

df2 = pd.crosstab(df1['Date'], df1['Item']).replace(0,'').rename_axis(None, axis=1).reset_index()
print (df2)
  Date Bread Cake Jam Muffin
0  1/3     1               1
1  2/3          1   1       
2  3/3              1       

Difference in changed data:

print (df)
                 Item Date
0  Bread,Muffin,Bread  1/3 <- duplicated Bread
1            Jam,Cake  2/3
2                 Jam  3/3

df1 = df.assign(Item = df['Item'].str.split(',')).explode('Item')

df2 = pd.crosstab(df1['Date'], df1['Item']).replace(0,'').rename_axis(None, axis=1).reset_index()
print (df2)
  Date Bread Cake Jam Muffin
0  1/3     2               1
1  2/3          1   1       
2  3/3              1       

df0 = df.join(df.pop('Item').str.get_dummies(',').replace(0, ''))
print (df0)
  Date Bread Cake Jam Muffin
0  1/3     1               1
1  2/3          1   1       
2  3/3              1       

Upvotes: 4

Related Questions