Reputation: 531
I have this table where the type column shows 3 levels of information. I want to move the second and third levels into separate columns.
number type
10 type 1
10 bottom
10 up
10 1
10 2
10 3
20 type 2
20 bottom
20 up
20 1
20 2
20 3
The expected result is as below:
number type description detail
10 type 1 bottom bottom
10 type 1 up 1
10 type 1 up 2
10 type 1 up 3
20 type 2 bottom bottom
20 type 2 up 1
20 type 2 up 2
20 type 2 up 3
Is there any way to do this with Python?
Thanks in advance
Upvotes: 3
Views: 87
Reputation: 18406
You can use pandas str.extract with ffill:
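# Anchor each pattern so it matches the whole cell: an unanchored [0-9] also
# matches the digit inside "type 1" and only the first digit of values like "10".
df['type_new'] = df['type'].str.extract(r'^(type.*)$').ffill()
df['detail'] = df['type'].str.extract(r'^(bottom|\d+)$').ffill()
df['description'] = df['type'].str.extract(r'^(bottom|up)$').ffill()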
Finally use masking to get only the required rows, and rename columns if needed:
mask = df['type'].isin(df['detail'].values)
df = df[mask].reset_index(drop=True)[['number', 'type_new', 'description', 'detail']].rename(columns={'type_new':'type'})
OUTPUT:
number type description detail
0 10 type 1 bottom bottom
1 10 type 1 up 1
2 10 type 1 up 2
3 10 type 1 up 3
4 20 type 2 bottom bottom
5 20 type 2 up 1
6 20 type 2 up 2
7 20 type 2 up 3
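For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same str.extract + ffill idea. The sample frame is rebuilt from the table in the question, and it assumes the type column holds strings and that every group starts with its "type ..." row. The variable names (type_new, description, detail, result) are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; 'type' is assumed to contain strings.
df = pd.DataFrame({
    "number": [10] * 6 + [20] * 6,
    "type": ["type 1", "bottom", "up", "1", "2", "3",
             "type 2", "bottom", "up", "1", "2", "3"],
})

# Level 1: keep cells like "type 1" and carry them down to the rows below.
type_new = df["type"].str.extract(r"^(type.*)$", expand=False).ffill()

# Level 2: "bottom" or "up", carried down to the numeric rows under "up".
description = df["type"].str.extract(r"^(bottom|up)$", expand=False).ffill()

# Level 3: "bottom" itself or a number. Not forward-filled, so rows that are
# not a detail row stay NaN and can be filtered out directly.
detail = df["type"].str.extract(r"^(bottom|\d+)$", expand=False)

result = (
    df.assign(type=type_new, description=description, detail=detail)
      .loc[detail.notna(), ["number", "type", "description", "detail"]]
      .reset_index(drop=True)
)

print(result)
```

The one difference from the snippet above is that detail is left unfilled, so detail.notna() serves as the row mask and no isin lookup is needed. If a group could start without a "type ..." row, the forward fill would pull the label from the previous group, so that case would need a groupby on number first.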
Upvotes: 2