Reputation: 1293
I am trying to extract data from a pandas DataFrame column that follows a specific pattern, so that each occurrence becomes a new row. The data looks like this:
id: id_101
description: id_name1
id: id_102
description: id_name2
id: id_103
description: id_name3
All of the above content is stored in a single row. I want to convert it as shown below, with each occurrence in its own row:
, id, description
0, id_101, id_name1
1, id_102, id_name2
2, id_103, id_name3
Upvotes: 1
Views: 503
Reputation: 862591
If the data always comes in pairs, first use Series.str.split, then DataFrame.pivot with a helper column created by GroupBy.cumcount:
df = df['col'].str.split(': ', expand=True)
df['g'] = df.groupby(0)[1].cumcount()
df = df.pivot(index='g', columns=0, values=1).rename_axis(index=None, columns=None)
print (df)
description id
0 id_name1 id_101
1 id_name2 id_102
2 id_name3 id_103
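For reference, here is a self-contained sketch of the split/cumcount/pivot approach. It assumes the sample data sits one "key: value" string per row in a column named 'col' (the question does not show the real layout, so the sample frame is an assumption). Passing keyword arguments to pivot avoids problems on pandas 2.0+, where positional arguments are no longer accepted.

```python
import pandas as pd

# Assumed sample data: one "key: value" string per row in a column named 'col'
df = pd.DataFrame({'col': ['id: id_101', 'description: id_name1',
                           'id: id_102', 'description: id_name2',
                           'id: id_103', 'description: id_name3']})

# Column 0 holds the key ('id' / 'description'), column 1 holds the value
parts = df['col'].str.split(': ', expand=True)

# Number each occurrence of a key: the n-th 'id' and n-th 'description' share g == n
parts['g'] = parts.groupby(0).cumcount()

out = (parts.pivot(index='g', columns=0, values=1)
            .rename_axis(index=None, columns=None)
            [['id', 'description']])  # pivot sorts columns, so restore the wanted order
print(out)
```

The final column selection is optional; it only puts id before description, as in the expected output from the question.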
Or take the values after ": ", convert them to a NumPy array, and reshape it into a new DataFrame (this relies on the rows strictly alternating id, description):
a = df['col'].str.split(': ').str[1].to_numpy()
df = pd.DataFrame(a.reshape(-1, 2), columns=['id','description'])
print (df)
id description
0 id_101 id_name1
1 id_102 id_name2
2 id_103 id_name3
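If the whole block really is stored in a single cell, as the question describes, the solutions above would first need the text split into lines. One possible alternative is Series.str.extractall with named groups. This is only a sketch: the column name 'col', the newline separator, and the regular expression are assumptions about the real data.

```python
import pandas as pd

# Assumption: all pairs live in ONE cell of a column named 'col', separated by newlines
text = ("id: id_101\ndescription: id_name1\n"
        "id: id_102\ndescription: id_name2\n"
        "id: id_103\ndescription: id_name3")
df = pd.DataFrame({'col': [text]})

# Each match becomes one row; the named groups become the column names.
# '.+' stops at the end of the line, so descriptions may contain spaces.
pattern = r'id:\s*(?P<id>\S+)\s+description:\s*(?P<description>.+)'

# extractall returns a MultiIndex (original row, match number); flatten it
out = df['col'].str.extractall(pattern).reset_index(drop=True)
print(out)
```

Unlike the reshape approach, a pair with a missing id or description is simply not matched here instead of shifting the following values.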
Upvotes: 1