Pandas - Split columns into rows while keeping indices

Question

I have the following simplified DataFrame:

import pandas as pd

df = pd.DataFrame([{'index_a':'a1', 'index_b':'b1', 'value_x':'x1', 'value_y':'y1'},
                   {'index_a':'a2', 'index_b':'b2', 'value_x':'x2', 'value_y':'y2'},
                   {'index_a':'a3', 'index_b':'b3', 'value_x':'x3', 'value_y':'y3'}])

It contains two indices and two value columns. For downstream usage, it does not make sense to have two value columns (they are from the same distribution). I therefore want to 'explode' these columns and make one large list. This is what should result:

pd.DataFrame([{'index_a':'a1', 'index_b':'b1', 'value':'x1'},
              {'index_a':'a1', 'index_b':'b1', 'value':'y1'},
              {'index_a':'a2', 'index_b':'b2', 'value':'x2'},
              {'index_a':'a2', 'index_b':'b2', 'value':'y2'},
              {'index_a':'a3', 'index_b':'b3', 'value':'x3'},
              {'index_a':'a3', 'index_b':'b3', 'value':'y3'}])

I tried isolating the values via .values and .ravel(), but neither yielded the desired result.

Thanks in advance. BBQuercus :)

anky · Accepted Answer

Use str.contains() on the column names to select the index columns, then pass them to df.melt() as id_vars. Drop the 'variable' column that melt() adds:

final = df.melt(id_vars=df.columns[df.columns.str.contains('index')]).drop(columns='variable')

  index_a index_b value
0      a1      b1    x1
1      a2      b2    x2
2      a3      b3    x3
3      a1      b1    y1
4      a2      b2    y2
5      a3      b3    y3
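
Note that melt() groups the rows by source column (all x values first, then all y values), whereas the question lists them interleaved per original row. If that order matters, one possible sketch is to keep the original row labels with ignore_index=False (available in pandas 1.1 and later) and then sort on them with a stable sort. The names id_cols and long_df are only illustrative and do not come from the answer above:

```python
import pandas as pd

df = pd.DataFrame([
    {'index_a': 'a1', 'index_b': 'b1', 'value_x': 'x1', 'value_y': 'y1'},
    {'index_a': 'a2', 'index_b': 'b2', 'value_x': 'x2', 'value_y': 'y2'},
    {'index_a': 'a3', 'index_b': 'b3', 'value_x': 'x3', 'value_y': 'y3'},
])

# Illustrative name: the columns to repeat on every output row.
id_cols = ['index_a', 'index_b']

long_df = (
    # ignore_index=False keeps the original row labels (0, 1, 2, 0, 1, 2).
    df.melt(id_vars=id_cols, value_name='value', ignore_index=False)
      # A stable sort on those labels interleaves x and y per original row.
      .sort_index(kind='mergesort')
      .drop(columns='variable')
      .reset_index(drop=True)
)

print(long_df)
```

The stable sort (mergesort) is what keeps x before y within each original row; the default sort algorithm does not guarantee that tie order.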
