scott martin

Reputation: 1293

Pandas - Splitting data from single row into multiple rows

I am trying to extract data from a pandas DataFrame column whose content follows a specific pattern, so that each occurrence of the pattern becomes a new row. The data looks like this:

id: id_101
description: id_name1
id: id_102
description: id_name2
id: id_103
description: id_name3

All of the above content is stored in a single row. I want to convert it as shown below, with each occurrence in its own row:

 , id, description
0, id_101, id_name1
1, id_102, id_name2
2, id_103, id_name3

Upvotes: 1

Views: 503

Answers (1)

jezrael

Reputation: 862591

If the data always comes in pairs, first use Series.str.split, then DataFrame.pivot with a helper column created by GroupBy.cumcount:

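# Split each "key: value" entry into two columns (0 = key, 1 = value)
df = df['col'].str.split(': ', expand=True)
# Number the occurrences of each key so matching pairs share a counter
df['g'] = df.groupby(0)[1].cumcount()
# DataFrame.pivot requires keyword arguments in pandas 2.0+
df = df.pivot(index='g', columns=0, values=1).rename_axis(index=None, columns=None)
print(df)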
  description      id
0    id_name1  id_101
1    id_name2  id_102
2    id_name3  id_103

Alternatively, take the values after ': ', convert them to a NumPy array, and reshape it into a new DataFrame:

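import pandas as pd
# Keep only the value after ': ' from each entry
a = df['col'].str.split(': ').str[1].to_numpy()
# Each consecutive (id, description) pair becomes one row
df = pd.DataFrame(a.reshape(-1, 2), columns=['id','description'])
print(df)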
       id description
0  id_101    id_name1
1  id_102    id_name2
2  id_103    id_name3
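
Both snippets above expect each "key: value" entry to sit in its own row of df['col']. If, as the question describes, everything is stored in a single cell, the entries need to be separated first. Below is a minimal sketch of one way to do that, assuming the entries are separated by newline characters (the question does not say what the separator is) and that the column is named col; the sample text and the names lines, parts and result are made up for illustration.

```python
import pandas as pd

# Hypothetical input: a single row whose one cell holds every "key: value" entry,
# separated by newlines (the separator is an assumption, not stated in the question).
text = (
    "id: id_101\ndescription: id_name1\n"
    "id: id_102\ndescription: id_name2\n"
    "id: id_103\ndescription: id_name3"
)
df = pd.DataFrame({"col": [text]})

# Break the cell into a list of entries, then give each entry its own row.
lines = df["col"].str.split("\n").explode().reset_index(drop=True)

# Split each entry into its key and value (at the first ': ' only).
parts = lines.str.split(": ", n=1, expand=True)
parts.columns = ["key", "value"]

# Counter per key, so the n-th id lines up with the n-th description.
parts["g"] = parts.groupby("key").cumcount()

# Pivot to one column per key, then restore the id/description column order.
result = (
    parts.pivot(index="g", columns="key", values="value")
    .rename_axis(index=None, columns=None)[["id", "description"]]
)
print(result)
```

If the real separator is something else (for example a comma or a semicolon), only the argument to the first str.split call should need to change.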

Upvotes: 1
