Reputation: 749
I have a excel file with the following kind of data.
Pink_Floyd,Beatles,Pearl_Jam,Porcupine_Tree 5.56
The data comprises of same kind of rows.The numeric value is in next cell. I want to convert it into the following format-
Pink_Floyd 5.56
Beatles 5.56
Pearl_Jam 5.56
Porcupine_Tree 5.56
How to do it?
Upvotes: 2
Views: 39
Reputation: 862671
Use:
df = pd.DataFrame({'A': ['Pink_Floyd,Beatles,Pearl_Jam,Porcupine_Tree', 'Beatles'],
'B': [5.56, 10.0]})
print (df)
A B
0 Pink_Floyd,Beatles,Pearl_Jam,Porcupine_Tree 5.56
1 Beatles 10.00
s = (df.pop('A').str.split(',', expand=True)
.stack()
.reset_index(level=1, drop=True)
.rename('A'))
df = df.join(s).reset_index(drop=True)[['A','B']]
Explanation:
A
by pop
split
to DataFrame
stack
reset_index
for remove first level of MultiIndex
rename
join
to original and if necessary change order of columns by list
Or create new DataFrame
by constructor:
from itertools import chain
a = df['A'].str.split(',')
df = pd.DataFrame({
'A' : list(chain.from_iterable(a.values.tolist())),
'B' : df['B'].values.repeat(a.str.len())
})
print (df)
A B
0 Pink_Floyd 5.56
1 Beatles 5.56
2 Pearl_Jam 5.56
3 Porcupine_Tree 5.56
4 Beatles 10.00
Upvotes: 1