Reputation: 79
I am using pivot_table to transform the following data from row-wise shape to column-wise shape based on the ITEMID
values:
import pandas as pd
df = pd.DataFrame({'id': [36, 36, 36, 36, 36],
'VALUE': [86, 21, 85, 19, 87],
'ITEMID': [220045, 220210, 220045, 220210, 220045],
'TIME': [pd.to_datetime('2134-05-12 13:00:00'),
pd.to_datetime('2134-05-12 13:00:00'),
pd.to_datetime('2134-05-12 14:00:00'),
pd.to_datetime('2134-05-12 14:00:00'),
pd.to_datetime('2134-05-12 15:00:00')]})
if __name__ == '__main__':
print(df.head())
df = df.pivot_table(index=['id', 'TIME'],
columns='ITEMID',
values='VALUE', aggfunc='sum').reset_index()
print(df)
but I am getting unexpected results:
|ITEMID|id |TIME |220045 |220210 |
|------|---|--------------------|--------|--------|
|0 |36 |2134-05-12 13:00:00 |86.0 |21.0 |
|1 |36 |2134-05-12 14:00:00 |85.0 |19.0 |
|2 |36 |2134-05-12 15:00:00 |87.0 |NaN |
whereas the columns should be like
|id |TIME |220045 |220210 |220045 |220210|220045 |
|---|-----|-------|-------|-------|------|-------|
I have an index like column for ITEMID
and some ITEMID
s are transformed into the column-wise shape. I am not sure what is causing this result.
Upvotes: 0
Views: 33
Reputation: 150745
Since the ITEMID
is passed as columns
to pivot_table
, the resulting dataframe has a non-null name for the columns. Try rename_axis
:
(df.pivot_table(index=['id', 'TIME'],
columns='ITEMID',
values='VALUE', aggfunc='sum').reset_index()
.rename_axis(columns=None))
Output:
id TIME 220045 220210
0 36 2134-05-12 13:00:00 86.0 21.0
1 36 2134-05-12 14:00:00 85.0 19.0
2 36 2134-05-12 15:00:00 87.0 NaN
Upvotes: 2