OctavianWR

Reputation: 227

Pandas new column per days

I have this data frame:

user     day
A        1
A        4
B        2
B        4

I want to change the data frame into this:

user   day_1   day_2   day_3   day_4
A      1       0       0       1
B      0       1       0       1

So it should automatically generate day_3 even though no user has a row for day 3.

I have tried this code, but it doesn't work:

for index, row in grouped_user.iterrows():
    grouped_user["day_" + str(int(row.active_period))] = 1

Upvotes: 1

Views: 84

Answers (2)

jezrael

Reputation: 863751

Use get_dummies after converting the values to strings, then aggregate with max per user:

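# dtype=int keeps the indicators as 0/1 (newer pandas versions return booleans by default)
df1 = pd.get_dummies(df.astype(str), columns=['day'], dtype=int).groupby('user', as_index=False).max()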
print (df1)
  user  day_1  day_2  day_4
0    A      1      0      1
1    B      0      1      1

If the missing days are also needed, convert user to the index, use get_dummies, and add all possible days as columns with DataFrame.reindex:

days = [f'day_{x}' for x in range(df['day'].min(), df['day'].max() + 1)]
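# DataFrame.max(level=0) was removed in pandas 2.0; group on the index level instead
df1 = (pd.get_dummies(df.set_index('user').astype(str), dtype=int)
         .groupby(level=0)
         .max()
         .reindex(columns=days, fill_value=0)
         .reset_index())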
print (df1)
  user  day_1  day_2  day_3  day_4
0    A      1      0      0      1
1    B      0      1      0      1
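
To try this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question. The helper name add_day_columns is made up for illustration and is not part of pandas; it also assumes day holds integers, and it passes dtype=int so the indicators come out as 0/1 on any recent pandas version.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'user': ['A', 'A', 'B', 'B'], 'day': [1, 4, 2, 4]})

def add_day_columns(frame):
    """Hypothetical helper: one 0/1 column per day between the min and max day."""
    days = [f'day_{x}' for x in range(frame['day'].min(), frame['day'].max() + 1)]
    # One indicator column per day that actually occurs, indexed by user
    dummies = pd.get_dummies(frame.set_index('user')['day'], prefix='day', dtype=int)
    return (dummies.groupby(level=0).max()                # collapse to one row per user
                   .reindex(columns=days, fill_value=0)   # add days nobody used
                   .reset_index())

result = add_day_columns(df)
print(result)
```

Computing days from the min and max of the column is what makes day_3 appear; if the range should be fixed (for example always days 1 to 7), build the list from those bounds instead.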

Another solution with crosstab and DataFrame.clip:

df1 = (pd.crosstab(df['user'], df['day'])
         .clip(upper=1)
         .reindex(range(df['day'].min(), df['day'].max()+1), fill_value=0, axis=1)
         .add_prefix('day_')
         .rename_axis(None, axis=1)
         .reset_index())
print (df1)
  user  day_1  day_2  day_3  day_4
0    A      1      0      0      1
1    B      0      1      0      1
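
The clip step matters because crosstab counts rows rather than flagging them. A small sketch, assuming a user can appear more than once on the same day (the repeated row below is added for illustration and is not in the question's data):

```python
import pandas as pd

# Question's data plus a second (A, 4) row, added only to show the effect of clip
df = pd.DataFrame({'user': ['A', 'A', 'A', 'B', 'B'], 'day': [1, 4, 4, 2, 4]})

counts = pd.crosstab(df['user'], df['day'])  # number of rows per user/day pair
flags = counts.clip(upper=1)                 # cap every count at 1 -> presence flag

print(counts)
print(flags)
```

Without clip, user A would show 2 under day 4 instead of 1.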

Upvotes: 2

Mykola Zotko

Reputation: 17911

You can use the function pivot_table():

df.assign(vals=1).\
pivot_table(index='user', columns='day', values='vals', fill_value=0).\
reindex(range(df['day'].min(), df['day'].max()+1), fill_value=0, axis=1).\
add_prefix('day_')

Result:

day   day_1  day_2  day_3  day_4
user                            
A         1      0      0      1
B         0      1      0      1
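
The result above keeps user as the index and day as the name of the column axis. If a flat frame like the one in the question is wanted, a possible sketch is below. The aggfunc='max' and astype(int) calls are my additions to keep the values as integer 0/1 flags; they are not part of the answer as posted.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'user': ['A', 'A', 'B', 'B'], 'day': [1, 4, 2, 4]})

flat = (df.assign(vals=1)
          .pivot_table(index='user', columns='day', values='vals',
                       aggfunc='max', fill_value=0)
          .reindex(range(df['day'].min(), df['day'].max() + 1), fill_value=0, axis=1)
          .astype(int)                  # make sure the flags are integers
          .add_prefix('day_')
          .rename_axis(None, axis=1)    # drop the 'day' label on the column axis
          .reset_index())               # turn the user index back into a column

print(flat)
```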

Upvotes: 3
