Reputation: 440
import pandas as pd
df = pd.DataFrame({
"col1": [11, 12, 13],
"col2": [21, 22, 23],
"col3": [31, 32, 33],
"col4": [41, 42, 43],
})
I have a Pandas data frame like df
above, and I would like to reshape df
to look like the following.
import pandas as pd
df = pd.DataFrame({
"col1": [11, 12, 13, 11, 12, 13, 11, 12, 13],
"col2": [21, 22, 23, 31, 32, 33, 41, 42, 43],
"indx": ["col2", "col2", "col2", "col3", "col3", "col3", "col4", "col4", "col4"]
})
I can slice up df
and get my desired data frame, but what would be the slick, Pythonic way to do it in Pandas?
EDIT
I'm realizing that my question is more complicated that I originall realized, but not too much more (I think). Again, I have a data frame.
import pandas as pd
df = pd.DataFrame({
"col1": [11, 12, 13],
"col2": [21, 22, 23],
"col3": [31, 32, 33],
"col4": [41, 42, 43],
"col5": [51, 52, 53],
"col6": [61, 62, 63]
})
I want to do something like melt
to get my data frame to be like this:
import pandas as pd
df = pd.DataFrame({
"col1": [11, 12, 13, 11, 12, 13, 11, 12, 13],
"colA": [21, 22, 23, 31, 32, 33, 41, 42, 43],
"indx": ["do", "do", "do", "re", "re", "re", "me", "me", "me"],
"col4": [41, 42, 43, 41, 42, 43, 41, 42, 43],
"col5": [51, 52, 53, 51, 52, 53, 51, 52, 53],
"col6": [61, 62, 63, 61, 62, 63, 61, 62, 63]
})
So I want to be able to set the strings to which the "indx" etc are set; I want to drag along several other columns the way that I drag along "col1", and I want to set the name of the new "col2" column header.
Thanks!
Upvotes: 1
Views: 736
Reputation: 28709
Create a duplicate of col4
, melt the data and replace the values :
(
df.assign(temp=df.col4)
.melt(id_vars=["col1", "col4", "col5", "col6"],
var_name="indx",
value_name="colA")
.replace({"col2": "do", "col3": "re", "temp": "me"})
.iloc[:, [0, -1, -2, 1, 2, 3]]
)
col1 colA indx col4 col5 col6
0 11 21 do 41 51 61
1 12 22 do 42 52 62
2 13 23 do 43 53 63
3 11 31 re 41 51 61
4 12 32 re 42 52 62
5 13 33 re 43 53 63
6 11 41 me 41 51 61
7 12 42 me 42 52 62
8 13 43 me 43 53 63
Upvotes: 1
Reputation: 25259
You need melt
, merge
and replace
d = {'col2': 'do', 'col3': 're', 'col4': 'me'}
df_final = (df.melt(['col1','col5','col6'], var_name="indx", value_name="colA")
.merge(df[['col1','col4']], how='left').replace(d))
Out[522]:
col1 col5 col6 indx colA col4
0 11 51 61 do 21 41
1 12 52 62 do 22 42
2 13 53 63 do 23 43
3 11 51 61 re 31 41
4 12 52 62 re 32 42
5 13 53 63 re 33 43
6 11 51 61 me 41 41
7 12 52 62 me 42 42
8 13 53 63 me 43 43
Or you may rename
columns before melt
d = {'col2': 'do', 'col3': 're', 'col4': 'me'}
df_final = (df.rename(d, axis=1)
.melt(['col1','col5','col6'], var_name="indx", value_name="colA")
.merge(df[['col1','col4']], how='left'))
Out[529]:
col1 col5 col6 indx colA col4
0 11 51 61 do 21 41
1 12 52 62 do 22 42
2 13 53 63 do 23 43
3 11 51 61 re 31 41
4 12 52 62 re 32 42
5 13 53 63 re 33 43
6 11 51 61 me 41 41
7 12 52 62 me 42 42
8 13 53 63 me 43 43
Upvotes: 2
Reputation: 6132
This is basically what you're looking for:
df.set_index('col1').unstack()
col1
col2 11 21
12 22
13 23
col3 11 31
12 32
13 33
col4 11 41
12 42
13 43
Your column names will be indexes which seems to be what you were looking for
Upvotes: 2