Reputation: 33
I have data like this:
Column1 Column2 Column3
0 This Sushi is Awesome NaN NaN
1 NaN Id: 2261
2 NaN City: Tokyo
3 NaN Food: Positive
4 NaN Price: NaN
5 This food is really expensi... NaN NaN
6 NaN Id: 3`
7 NaN City: Osaka
8 NaN Food: Negative
9 NaN Price: Negative
I wrote the following code, but I got an error:
pivoted = data.pivot(index='Column1',columns='Column2', values='Column3')
ValueError: Index contains duplicate entries, cannot reshape
pivot_table also doesn't work.
I want output like this:
0 Id City Food Price
1 This Sushi is Awesome 2261 Tokyo Positive NaN
2 This food is really expensi... 3 Osaka Negative Negative
Upvotes: 1
Views: 84
Reputation: 863651
Preprocess the data before pivoting: flag the rows where Column1 is missing, forward fill Column1, strip the trailing : from Column2 with rstrip, and finally filter the rows with boolean indexing:
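# Mask of the key/value rows (Column1 is missing there)
m = df['Column1'].isnull()
# Propagate each review text down to its key/value rows
df['Column1'] = df['Column1'].ffill()
# Drop the trailing colon so the keys become clean column names
df['Column2'] = df['Column2'].str.rstrip(':')
# Pivot only the key/value rows; the text-only rows are filtered out by the mask
pivoted = df[m].pivot(index='Column1', columns='Column2', values='Column3')
print(pivoted)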
Column2 City Food Id Price
Column1
This Sushi is Awesome Tokyo Positive 2261 NaN
This food is really expensive Osaka Negative 3` Negative
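The steps above can be sketched end to end as follows. This is only an illustration under assumptions: the DataFrame is a hand-built reconstruction of the question's sample (the second text is spelled out as in the answer's output, and the Id of the second block is assumed to be the plain string "3"), and the final reindex and rename_axis calls are an optional addition, not part of the answer, to get the column order the question asked for. The original pivot fails because Column1 is NaN on every key/value row, so the same (index, column) pair occurs more than once.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data (not the asker's real file)
df = pd.DataFrame({
    "Column1": ["This Sushi is Awesome", np.nan, np.nan, np.nan, np.nan,
                "This food is really expensive", np.nan, np.nan, np.nan, np.nan],
    "Column2": [np.nan, "Id:", "City:", "Food:", "Price:",
                np.nan, "Id:", "City:", "Food:", "Price:"],
    "Column3": [np.nan, "2261", "Tokyo", "Positive", np.nan,
                np.nan, "3", "Osaka", "Negative", "Negative"],
})

# Rows that carry a key/value pair have no text in Column1
m = df["Column1"].isnull()
# Copy each review text down onto the rows that belong to it
df["Column1"] = df["Column1"].ffill()
# "Id:" -> "Id", "City:" -> "City", ...
df["Column2"] = df["Column2"].str.rstrip(":")

pivoted = (
    df[m]
    .pivot(index="Column1", columns="Column2", values="Column3")
    # Optional: pivot sorts the columns alphabetically, so restore the wanted order
    .reindex(columns=["Id", "City", "Food", "Price"])
    # Optional: drop the "Column1"/"Column2" axis labels from the printed header
    .rename_axis(index=None, columns=None)
)
print(pivoted)
```

If the review text should be a regular column rather than the index, calling reset_index() on the result is one way to do that.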
Upvotes: 1