Rtut
Rtut

Reputation: 1007

pandas dataframe convert columns as rows

I am looking for a way to convert my dataframe columns as rows. Following is my example dataframe:

mylist= [['xx'], [None, 'yy'], ['xx', None], ['xx',None],['xx','yy']]

pd.DataFrame(mylist,index=['A','B','C','D','E'],columns=['Col1','Col2'])

Input DataFrame:

-------------------
Ind | Col1 | Col2 |
-------------------
A   | xx   | None |
B   | None | yy   |
C   | xx   | None |
D   | xx   | None |
E   | xx   | yy   |
-------------------

I want to split my columns as separate rows in the dataframe. Below is how my desired output looks like. Can anyone suggest how to acheive the following.

Desired dataframe:

------------------------
Ind | Values | Columns |
------------------------
A   | xx     | Col1    |
B   | yy     | Col2    |
C   | xx     | Col1    |
D   | xx     | Col1    |
E   | xx     | Col1    |
E   | xx     | Col2    |
------------------------

Thanks,
Rtut

Upvotes: 2

Views: 4516

Answers (3)

Alexander
Alexander

Reputation: 109528

In the case of two values, it appears that you only want the first (e.g. the last row of your example).

You can use loc to first set the second value to None in the case both columns have values.

df.loc[(df.Col1.notnull()) & (df.Col2.notnull()), 'Col2'] = None

You can then melt your results.

>>> pd.melt(df.reset_index(), id_vars='index', var_name='Columns', value_name='Values'
            ).dropna().set_index('index')[['Values', 'Columns']]

      Values Columns
index               
A         xx    Col1
C         xx    Col1
D         xx    Col1
E         xx    Col1
B         yy    Col2

Upvotes: 1

akuiper
akuiper

Reputation: 214927

Another option is to use melt:

longDf = pd.melt(df.reset_index(), id_vars=['index'], var_name=['Columns'])
longDf[pd.notnull(longDf.value)]

  index Columns value
0   A   Col1    xx
2   C   Col1    xx
3   D   Col1    xx
4   E   Col1    xx
6   B   Col2    yy
9   E   Col2    yy

Upvotes: 3

mechanical_meat
mechanical_meat

Reputation: 169274

df = pd.DataFrame(mylist,index=['A','B','C','D','E'],columns=['Col1','Col2'])
# rotate df
stacked_df = pd.DataFrame(df.stack().reset_index())
# name columns
stacked_df.columns = ['Ind','Columns','Values']
# reorder columns
reordered_df = pd.DataFrame(stacked_df,columns=['Ind','Values','Columns'])

Results in:

>>> reordered_df
  Ind Values Columns
0   A     xx    Col1
1   B     yy    Col2
2   C     xx    Col1
3   D     xx    Col1
4   E     xx    Col1
5   E     yy    Col2

Upvotes: 2

Related Questions