Reputation: 311
How to I convert the below multi-index dataframe:
open | high | low | close | volume | ||
---|---|---|---|---|---|---|
symbol | date | |||||
4005 | 2017-12-03 07:00:00 | 38.75 | 38.75 | 38.75 | 38.75 | 2518 |
4333 | 2018-01-04 09:58:00 | 12.06 | 12.06 | 12.06 | 12.06 | 2000 |
4338 | 2018-01-22 10:06:00 | 9.22 | 9.22 | 9.22 | 9.22 | 10 |
2018-01-22 11:23:00 | 9.22 | 9.22 | 9.22 | 9.22 | 1000 | |
4334 | 2017-12-14 08:41:00 | 9.28 | 9.28 | 9.28 | 9.28 | 2 |
4332 | 2018-01-02 10:19:00 | 10.30 | 10.30 | 10.30 | 10.30 | 100 |
2018-01-09 08:56:00 | 10.28 | 10.30 | 10.28 | 10.30 | 1500 |
To this table below with symbol as columns and "close" column as the value
4005 | 4333 | 4338 | 4334 | 4332 | |
---|---|---|---|---|---|
date | |||||
2017-12-03 07:00:00 | 38.75 | nan | nan | nan | nan |
2018-01-04 09:58:00 | nan | 12.06 | nan | nan | nan |
2018-01-22 10:06:00 | nan | nan | 9.22 | nan | nan |
2018-01-22 11:23:00 | nan | nan | 9.22 | nan | nan |
2017-12-14 08:41:00 | nan | nan | nan | 9.28 | nan |
2018-01-02 10:19:00 | nan | nan | nan | nan | 10.30 |
2018-01-09 08:56:00 | nan | nan | nan | nan | 10.30 |
Upvotes: 2
Views: 61
Reputation: 35626
Use pivot
Setup
import pandas as pd
df = pd.DataFrame({
'symbol': {0: 4005, 1: 4333, 2: 4338, 3: 4338, 4: 4334, 5: 4332, 6: 4332},
'date': {0: '2017-12-03 07:00:00', 1: '2018-01-04 09:58:00',
2: '2018-01-22 10:06:00', 3: '2018-01-22 11:23:00',
4: '2017-12-14 08:41:00', 5: '2018-01-02 10:19:00',
6: '2018-01-09 08:56:00'},
'open': {0: 38.75, 1: 12.06, 2: 9.22, 3: 9.22, 4: 9.28, 5: 10.3, 6: 10.28},
'high': {0: 38.75, 1: 12.06, 2: 9.22, 3: 9.22, 4: 9.28, 5: 10.3, 6: 10.3},
'low': {0: 38.75, 1: 12.06, 2: 9.22, 3: 9.22, 4: 9.28, 5: 10.3, 6: 10.28},
'close': {0: 38.75, 1: 12.06, 2: 9.22, 3: 9.22, 4: 9.28, 5: 10.3, 6: 10.3},
'volume': {0: 2518, 1: 2000, 2: 10, 3: 1000, 4: 2, 5: 100, 6: 1500}
})
# Recreate Multi-Index Data
df = df.set_index(['symbol', 'date'])
Create new_df
with pivoted data.
# Reset Index and Pivot
new_df = df \
.reset_index() \
.pivot(index='date', columns='symbol', values='close') \
.rename_axis(None, axis=1)
print(new_df)
4005 4332 4333 4334 4338
date
2017-12-03 07:00:00 38.75 NaN NaN NaN NaN
2017-12-14 08:41:00 NaN NaN NaN 9.28 NaN
2018-01-02 10:19:00 NaN 10.3 NaN NaN NaN
2018-01-04 09:58:00 NaN NaN 12.06 NaN NaN
2018-01-09 08:56:00 NaN 10.3 NaN NaN NaN
2018-01-22 10:06:00 NaN NaN NaN NaN 9.22
2018-01-22 11:23:00 NaN NaN NaN NaN 9.22
Upvotes: 1