Chris Macaluso

Reputation: 1482

Pandas - How can I stack columns based on datatype?

Suppose I have a DataFrame with only two data types, like the one below:

import pandas as pd

d = {'col1': [1, 2], 'col2': ['jack', 'bill'], 'col3': [4, 5], 'col4': ['megan', 'sarah']}
df = pd.DataFrame(data=d)
print(df)


   col1  col2  col3   col4
0     1  jack     4  megan
1     2  bill     5  sarah


print(df.dtypes)

col1     int64
col2    object
col3     int64
col4    object

Is there a way to stack these columns based only on data type? The end result would be:

   col1  col2
0     1  jack
1     2  bill
2     4  megan
3     5  sarah

It's not necessary for the final column names to remain the same.

Upvotes: 2

Views: 94

Answers (3)

rafaelc

Reputation: 59304

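If the number of columns differs between dtypes, you can pass the grouped lists to the default DataFrame constructor, which pads the shorter ones with missing values. Borrowing Quang's idea of groupby(axis=1) (note that the axis argument of DataFrame.groupby is deprecated as of pandas 2.1):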

pd.DataFrame(df.groupby(df.dtypes, axis=1).apply(lambda s: list(s.values.ravel())).tolist()).T
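
For readers on a pandas version where groupby(axis=1) is unavailable, here is a minimal sketch of the same idea without it. The three-column sample frame and the names `stacked` and `result` are illustrative assumptions, not part of the original answer. It builds one Series per dtype and lets pd.concat pad the shorter one with NaN.

```python
import pandas as pd

# Hypothetical sample with an uneven dtype split: two int columns, one string column.
df = pd.DataFrame({'col1': [1, 2], 'col2': ['jack', 'bill'], 'col3': [4, 5]})

# One flattened Series per dtype; ravel() walks row by row, as in the one-liner above.
stacked = {
    str(dtype): pd.Series(df.loc[:, df.dtypes == dtype].to_numpy().ravel())
    for dtype in df.dtypes.unique()
}

# concat aligns on the index, so the shorter Series is padded with NaN.
result = pd.concat(stacked, axis=1)
print(result)
```

The column labels are simply the dtype names, which is acceptable here because the question does not require the original names.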

Upvotes: 2

BENY

Reputation: 323396

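Why not give a for loop a chance? Note that ravel() flattens row by row, so the values come out interleaved (1, 4, 2, 5) rather than column by column: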

pd.DataFrame([ df.loc[:,df.dtypes==x].values.ravel() for x in df.dtypes.unique()]).T
Out[46]: 
   0      1
0  1   jack
1  4  megan
2  2   bill
3  5  sarah
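
If the column-by-column order from the question (1, 2, 4, 5) matters, one possible tweak is to flatten in Fortran order. This is a sketch under that assumption rather than part of the original answer; `result` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': ['jack', 'bill'],
                   'col3': [4, 5], 'col4': ['megan', 'sarah']})

# order='F' flattens each dtype block column by column instead of row by row.
result = pd.DataFrame(
    [df.loc[:, df.dtypes == x].to_numpy().ravel(order='F') for x in df.dtypes.unique()]
).T
print(result)
```

Because the rows passed to the constructor mix integers and strings, the resulting columns will likely be object dtype; converting the numeric column back afterwards may be needed.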

Upvotes: 3

Quang Hoang

Reputation: 150825

This works with your sample data; I'm not sure whether it works with general data:

(df.groupby(df.dtypes, axis=1)
   .apply(lambda x: (x.stack().reset_index(drop=True)))
)

Output

   int64 object
0      1   jack
1      4  megan
2      2   bill
3      5  sarah

Upvotes: 4
