Reputation: 33
I have CSV data that looks like this:
Using pandas in Python, I want to convert it into something like this:
The goal is to have the same column variables for each year, with the year as the index.
I have tried several ways of reshaping the DataFrame, such as pivot_table, melt,
and stack/unstack, but to no avail. Any help would be appreciated!
Upvotes: 1
Views: 275
Reputation: 862661
IIUC you need to stack the first level of the column MultiIndex (the years) into the rows:
df = df.stack(0)
Sample:
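import numpy as np
import pandas as pd
# Columns: years on level 0, variables C and D on level 1
mux = pd.MultiIndex.from_product([[2003, 2004], ['C', 'D']])
# Rows: two state levels
mux1 = pd.MultiIndex.from_product([[1, 2], ['A', 'B']], names=('State1', 'State2'))
np.random.seed(100)
df = pd.DataFrame(np.random.random((4, 4)), columns=mux, index=mux1)
print(df)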
                   2003                2004
                      C         D         C         D
State1 State2
1      A       0.543405  0.278369  0.424518  0.844776
       B       0.004719  0.121569  0.670749  0.825853
2      A       0.136707  0.575093  0.891322  0.209202
       B       0.185328  0.108377  0.219697  0.978624
print (df.stack(0).swaplevel(1,2).reset_index())
   State1  level_1 State2         C         D
0       1     2003      A  0.543405  0.278369
1       1     2004      A  0.424518  0.844776
2       1     2003      B  0.004719  0.121569
3       1     2004      B  0.670749  0.825853
4       2     2003      A  0.136707  0.575093
5       2     2004      A  0.891322  0.209202
6       2     2003      B  0.185328  0.108377
7       2     2004      B  0.219697  0.978624
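If you want the year as a named index instead of a `level_1` column, something along these lines might work. It is only a sketch on made-up data shaped like the sample above. The level names `Year` and `Variable` are my own assumptions, not taken from your CSV, so adjust them to your real data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the sample: years on column level 0,
# variables C and D on column level 1. The level names are assumed.
cols = pd.MultiIndex.from_product([[2003, 2004], ['C', 'D']],
                                  names=['Year', 'Variable'])
rows = pd.MultiIndex.from_product([[1, 2], ['A', 'B']],
                                  names=['State1', 'State2'])
np.random.seed(100)
df = pd.DataFrame(np.random.random((4, 4)), columns=cols, index=rows)

# Move the Year level from the columns into the row index, then turn
# the two state levels back into ordinary columns so that Year is the
# only index level left.
out = df.stack(level=0).reset_index(['State1', 'State2'])
print(out)
```

Because the column level is named before stacking, the new index level keeps the name `Year`, so no renaming is needed afterwards.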
Upvotes: 1