Reputation: 25
I have a dataframe df
as shown below,A index,B Index and C Index
appear as headers
and each of them have sub header as the Last price
Input
A index B Index C Index
Date Last Price Date Last Price Date Last Price
1/10/2021 12 1/11/2021 46 2/9/2021 67
2/10/2021 13 2/11/2021 51 3/9/2021 70
3/10/2021 14 3/11/2021 62 4/9/2021 73
4/10/2021 15 4/11/2021 47 5/9/2021 76
5/10/2021 16 5/11/2021 51 6/9/2021 79
6/10/2021 17 6/11/2021 22 7/9/2021 82
7/10/2021 18 7/11/2021 29 8/9/2021 85
I want to transform the to the below dataframe.
Expected Output
Date Index Name Last Price
1/10/2021 A index 12
2/10/2021 A index 13
3/10/2021 A index 14
4/10/2021 A index 15
5/10/2021 A index 16
6/10/2021 A index 17
7/10/2021 A index 18
1/11/2021 B Index 46
2/11/2021 B Index 51
3/11/2021 B Index 62
4/11/2021 B Index 47
5/11/2021 B Index 51
6/11/2021 B Index 22
7/11/2021 B Index 29
2/9/2021 C Index 67
3/9/2021 C Index 70
4/9/2021 C Index 73
5/9/2021 C Index 76
6/9/2021 C Index 79
7/9/2021 C Index 82
8/9/2021 C Index 85
How can this be done in pandas dataframe?
Upvotes: 1
Views: 32
Reputation: 8219
The structure of your df is not clear from your output. It would be useful if you provided Python code that creates an example, or at the very lest the output of df.columns
. Now let us assume it is a 2-level multindex created as such:
columns = pd.MultiIndex.from_tuples([('A index','Date'), ('A index','Last Price'),('B index','Date'), ('B index','Last Price'),('C index','Date'), ('C index','Last Price')])
data = [
['1/10/2021', 12, '1/11/2021', 46, '2/9/2021', 67],
['2/10/2021', 13, '2/11/2021', 51, '3/9/2021', 70],
['3/10/2021', 14, '3/11/2021', 62, '4/9/2021', 73],
['4/10/2021', 15, '4/11/2021', 47, '5/9/2021', 76],
['5/10/2021', 16, '5/11/2021', 51, '6/9/2021', 79],
['6/10/2021', 17, '6/11/2021', 22, '7/9/2021', 82],
['7/10/2021', 18, '7/11/2021', 29, '8/9/2021', 85],
]
df = pd.DataFrame(columns = columns, data = data)
Then what you are trying to do is basically an application of .stack
with some re-arrangement after:
(df.stack(level = 0)
.reset_index(level=1)
.rename(columns = {'level_1':'Index Name'})
.sort_values(['Index Name','Date'])
)
this produces
Index Name Date Last Price
0 A index 1/10/2021 12
1 A index 2/10/2021 13
2 A index 3/10/2021 14
3 A index 4/10/2021 15
4 A index 5/10/2021 16
5 A index 6/10/2021 17
6 A index 7/10/2021 18
0 B index 1/11/2021 46
1 B index 2/11/2021 51
2 B index 3/11/2021 62
3 B index 4/11/2021 47
4 B index 5/11/2021 51
5 B index 6/11/2021 22
6 B index 7/11/2021 29
0 C index 2/9/2021 67
1 C index 3/9/2021 70
2 C index 4/9/2021 73
3 C index 5/9/2021 76
4 C index 6/9/2021 79
5 C index 7/9/2021 82
6 C index 8/9/2021 85
Upvotes: 2