Reputation: 325
I have the following dataframe, in the ID column we can have 2 feedbacks, good or bad. I cannot figure out how to systematically identify if a user is missing a feedback and if it is missing, add a new line with the missing feedback in the level 1 and add 0 to all values.
import pandas as pd
df = {'ID': ['Good','Good','Good', 'Bad', 'Bad', 'Bad', 'Good', 'Good', 'Bad'],
'USERS' : ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'C'],
'DATE_VIEW': ['16/05/2019','16/05/2019', '16/05/2019', '18/03/2020', '18/03/2020', '18/03/2020', '18/03/2020', '18/03/2020', '18/03/2020'],
'VALUES': [1, 3, 4, 5, 6, 7, 8, 1, 2]
}
df = pd.DataFrame(df)
df = pd.pivot_table(df, index=['USERS', 'ID'], columns='DATE_VIEW', values='VALUES', aggfunc='sum', fill_value=0)
this is the expected output:
Upvotes: 2
Views: 210
Reputation: 71689
unstack
and stack
df.unstack(fill_value=0).stack()
DATE_VIEW 16/05/2019 18/03/2020
USERS ID
A Bad 0 6
Good 5 8
B Bad 0 12
Good 3 1
C Bad 0 2
Good 0 0
Upvotes: 1
Reputation: 195438
Try:
idx = pd.MultiIndex.from_product(
[
df.index.get_level_values(0).unique(),
df.index.get_level_values(1).unique(),
],
names=["USERS", "ID"],
)
df = df.reindex(idx, fill_value=0)
print(df)
Prints:
DATE_VIEW 16/05/2019 18/03/2020
USERS ID
A Bad 0 6
Good 5 8
B Bad 0 12
Good 3 1
C Bad 0 2
Good 0 0
Upvotes: 3