Beginner
Beginner

Reputation: 749

Convert the comma separated cell contents to multiple cells in same columns

I have a excel file with the following kind of data.

Pink_Floyd,Beatles,Pearl_Jam,Porcupine_Tree 5.56

The data comprises of same kind of rows.The numeric value is in next cell. I want to convert it into the following format-

Pink_Floyd 5.56
Beatles    5.56
Pearl_Jam  5.56
Porcupine_Tree 5.56 

How to do it?

Upvotes: 2

Views: 39

Answers (1)

jezrael
jezrael

Reputation: 862671

Use:

df = pd.DataFrame({'A': ['Pink_Floyd,Beatles,Pearl_Jam,Porcupine_Tree', 'Beatles'], 
                   'B': [5.56, 10.0]})
print (df)
                                             A      B
0  Pink_Floyd,Beatles,Pearl_Jam,Porcupine_Tree   5.56
1                                      Beatles  10.00

s = (df.pop('A').str.split(',', expand=True)
       .stack()
       .reset_index(level=1, drop=True)
       .rename('A'))

df = df.join(s).reset_index(drop=True)[['A','B']]

Explanation:

  1. Extract column A by pop
  2. Then split to DataFrame
  3. Reshape by stack
  4. Then reset_index for remove first level of MultiIndex
  5. Change column name by rename
  6. Last join to original and if necessary change order of columns by list

Or create new DataFrame by constructor:

from itertools import chain

a = df['A'].str.split(',')

df = pd.DataFrame({
    'A' : list(chain.from_iterable(a.values.tolist())), 
    'B' : df['B'].values.repeat(a.str.len())
})

print (df)
                A      B
0      Pink_Floyd   5.56
1         Beatles   5.56
2       Pearl_Jam   5.56
3  Porcupine_Tree   5.56
4         Beatles  10.00

Upvotes: 1

Related Questions