zitoun
zitoun

Reputation: 45

Reshape Dataframe from horizontal column to vertical

Is there any efficient way to reshape a dataframe from:

(A1, A2, A3, B1, B2, B3, C1, C2, C3, TT, YY and ZZ are columns)

A1 A2 A3 B1 B2 B3 C1 C2 C3 TT YY ZZ
11 22 33 44 55 66 77 88 99 23 24 25
11 22 33 44 55 66 77 88 99 23 24 25
11 22 33 44 55 66 77 88 99 23 24 25
11 22 33 44 55 66 77 88 99 23 24 25
11 22 33 44 55 66 77 88 99 23 24 25
11 22 33 44 55 66 77 88 99 23 24 25

TO:

HH JJ KK TT YY ZZ
11 22 33 23 24 25
11 22 33 23 24 25
11 22 33 23 24 25
11 22 33 23 24 25
11 22 33 23 24 25
11 22 33 23 24 25
44 55 66 23 24 25
44 55 66 23 24 25
44 55 66 23 24 25
44 55 66 23 24 25
44 55 66 23 24 25
44 55 66 23 24 25
77 88 99 23 24 25
77 88 99 23 24 25
77 88 99 23 24 25
77 88 99 23 24 25
77 88 99 23 24 25
77 88 99 23 24 25

HH, JJ and KK are new columns where I would make a vertical stack of column A, B, C and keeping in horizontal stack TT, YY and ZZ

A1 A2 A3 TT YY ZZ 
B1 B2 B3 TT YY ZZ
C1 C2 C3 TT YY ZZ

Thanks for your help

Upvotes: 3

Views: 84

Answers (2)

sammywemmy
sammywemmy

Reputation: 28649

a bit longer than the previous solution :

#extract columns ending with numbers
abc = df.filter(regex='\d$')

#sort columns into separate lists
from itertools import groupby
from operator import itemgetter
cols = sorted(abc.columns,key=itemgetter(0))
filtered_columns = [list(g) for k,g in groupby(cols,key=itemgetter(0))]

#iterate through the dataframe
#and stack them
abc_stack = pd.concat([abc.filter(col)
                       .set_axis(['HH','JJ','KK'],axis='columns')
                       for col in filtered_columns],
                      ignore_index=True)

#filter for columns ending with alphabets
tyz = df.filter(regex= '[A-Z]$')

#get the dataframe to be the same length as abc_stack
tyz_stack = pd.concat([tyz] * len(filtered_columns),ignore_index=True)

#combine both dataframes
res = pd.concat([abc_stack,tyz_stack], axis=1)

res

    HH  JJ  KK  TT  YY  ZZ
0   11  22  33  23  24  25
1   11  22  33  23  24  25
2   11  22  33  23  24  25
3   11  22  33  23  24  25
4   11  22  33  23  24  25
5   11  22  33  23  24  25
6   44  55  66  23  24  25
7   44  55  66  23  24  25
8   44  55  66  23  24  25
9   44  55  66  23  24  25
10  44  55  66  23  24  25
11  44  55  66  23  24  25
12  77  88  99  23  24  25
13  77  88  99  23  24  25
14  77  88  99  23  24  25
15  77  88  99  23  24  25
16  77  88  99  23  24  25
17  77  88  99  23  24  25

UPDATE : 2021-01-08

The reshaping process could be abstracted by using the pivot_longer function from pyjanitor; at the moment you have to install the latest development version from github:

The data you shared has patterns (some columns ends with 1, others with 2, and some end with 3), we can use these patterns to reshape the data;

# install latest dev version
# pip install git+https://github.com/ericmjl/pyjanitor.git
 import janitor 

    (df.pivot_longer(names_to=("HH", "JJ", "KK"),
                     names_pattern=("1$", "2$", "3$"),
                     index=("TT", "YY", "ZZ")
                     )
       .sort_index(axis="columns"))

Basically, what it does is look for columns that end with 1, aggregates them into one column ("TT") and does the same for 2 and 3.

Upvotes: 1

Naga kiran
Naga kiran

Reputation: 4607

You can use Column splitting and concatenation

df = pd.read_clipboard()
ColSets= [df.columns[i:i+3] for i in np.arange(0,len(df.columns)-3,3)]
LCols = df.columns[-3:]
NewDf = pd.concat([df[ColSet].join(df[LCols]).T.reset_index(drop=True).T for ColSet in ColSets])
NewDf.columns = ['HH', 'JJ', 'KK', 'TT', 'YY', 'ZZ']

Out:

HH  JJ  KK  TT  YY  ZZ
0   11  22  33  23  24  25
1   11  22  33  23  24  25
2   11  22  33  23  24  25
3   11  22  33  23  24  25
4   11  22  33  23  24  25
5   11  22  33  23  24  25
0   44  55  66  23  24  25
1   44  55  66  23  24  25
2   44  55  66  23  24  25
3   44  55  66  23  24  25
4   44  55  66  23  24  25
5   44  55  66  23  24  25
0   77  88  99  23  24  25
1   77  88  99  23  24  25
2   77  88  99  23  24  25
3   77  88  99  23  24  25
4   77  88  99  23  24  25
5   77  88  99  23  24  25

Upvotes: 1

Related Questions