GoBackess
GoBackess

Reputation: 414

how to columns into multiple rows in dataframe?

i have dataframe like below

A B C D E F G H G H  I  J  K
1 2 3 4 5 6 7 8 9 10 11 12 13

and i want result like this

A B C D E F G H 
1 2 3 4 5 6 7 8 
1 2 3 4 5 6 7 9
1 2 3 4 5 6 7 10
1 2 3 4 5 6 7 11
1 2 3 4 5 6 7 12
1 2 3 4 5 6 7 13

like a result column 'G~K' is under column 'H'

how can i do this?

Upvotes: 1

Views: 92

Answers (4)

Andy L.
Andy L.

Reputation: 25239

Your data has some duplicates in columns name, so melt will fail. However, you could change columns name and then apply melt

In [166]: df
Out[166]:
   A  B  C  D  E  F  G  H  G   H   I   J   K
0  1  2  3  4  5  6  7  8  9  10  11  12  13

Duplicates in column name 'G' and 'H'. Just change those to 'GG', 'HH'. Finally, apply melt

In [167]: df.columns = ('A','B','C','D','E','F','G','H','GG','HH','I','J','K')

In [168]: df
Out[168]:
   A  B  C  D  E  F  G  H  GG  HH   I   J   K
0  1  2  3  4  5  6  7  8   9  10  11  12  13

In [169]: df.melt(id_vars=df.columns.tolist()[0:7], value_name='H').drop('variable', 1)
Out[169]:
   A  B  C  D  E  F  G   H
0  1  2  3  4  5  6  7   8
1  1  2  3  4  5  6  7   9
2  1  2  3  4  5  6  7  10
3  1  2  3  4  5  6  7  11
4  1  2  3  4  5  6  7  12
5  1  2  3  4  5  6  7  13

Upvotes: 0

jhermann
jhermann

Reputation: 2101

letters = list("ABCDEFGHIJKLM")
df = pd.DataFrame([np.arange(1, len(letters) + 1)], columns=letters)
df = pd.concat([df.iloc[:, :7]] * (len(letters) - 7)).assign(H=df[letters[7:]].values[0])
df = df.reset_index(drop=True)
df

gives you

    A   B   C   D   E   F   G   H
0   1   2   3   4   5   6   7   8
1   1   2   3   4   5   6   7   9
2   1   2   3   4   5   6   7   10
3   1   2   3   4   5   6   7   11
4   1   2   3   4   5   6   7   12
5   1   2   3   4   5   6   7   13

Upvotes: 0

BENY
BENY

Reputation: 323266

You need to adjust your columns by using cummax , then after melt, we create additional key with cumcount, then just do reshape here, I am using unstack , you can using pivot , pivot_table

s=pd.Series(df.columns)
s[(s>='H').cummax()==1]='H'
df.columns=s
df=df.melt()

yourdf=df.set_index(['variable',df.groupby('variable').cumcount()]).\
          value.unstack(0).ffill()
yourdf
variable    A    B    C    D    E    F    G     H
0         1.0  2.0  3.0  4.0  5.0  6.0  7.0   8.0
1         1.0  2.0  3.0  4.0  5.0  6.0  7.0   9.0
2         1.0  2.0  3.0  4.0  5.0  6.0  7.0  10.0
3         1.0  2.0  3.0  4.0  5.0  6.0  7.0  11.0
4         1.0  2.0  3.0  4.0  5.0  6.0  7.0  12.0
5         1.0  2.0  3.0  4.0  5.0  6.0  7.0  13.0

Upvotes: 2

Aiden Zhao
Aiden Zhao

Reputation: 653

I hope this would give you some help

import pandas as pd

df = pd.DataFrame([list(range(1,14))])
df.columns = ('A','B','C','D','E','F','G','H','G','H','I','J','K')

print('starting data frame:')
print(df)


df1 = df.iloc[:,0:7]
df1 = df1.append([df1]*(len(df.iloc[:,7:].T)-1))
df1.insert(df1.shape[1],'H',list(df.iloc[:,7:].values[0]))

print('result:')
print(df1)

Upvotes: 0

Related Questions