Reputation: 429
I have a DataFrame that looks like this:
|Index| Dimension |
|-----|-----------|
|0 |1@43X32X34 |
|1 |1@120X80X74|
|2 |2@26X26X32 |
|3 |1@120X80X81|
I want to extract the number from the Dimension column and split it into multiple column:
|Index| Amount|Length|Width|Height|
|-----|-------|------|-----|------|
|0 | 1| 43| 32| 34|
|1 | 1| 120| 80| 74|
|2 | 2| 26| 26| 32|
|3 | 1| 120| 80| 81|
How to do that using the Pandas module in Python? Thank you!
Upvotes: 5
Views: 942
Reputation: 28709
You can use pandas str split with expand=True, the delimiters are @ and X, so passing them in will ensure appropriate splits. You can then insert Index as the first column and rewrite the column names:
M = df.Dimension.str.split('[@X]',expand=True)
M.insert(0,'Index',df.Index)
M.columns = ['Index','Amount','Length','Width','Height']
Index Amount Length Width Height
0 0 1 43 32 34
1 1 1 120 80 74
2 2 2 26 26 32
3 3 1 120 80 81
Upvotes: 4
Reputation: 402922
You can use extractall
:
df2 = df.pop('Dimension').str.extractall('(\d+)')[0].unstack().astype(int)
df2.columns = ['Amount', 'L', 'W', 'H']
Assuming you only have the one "Dimension" column, you are finished. Otherwise, concatenate this back to df
:
pd.concat([df, df2], axis=1)
Index Amount L W H
0 0 1 43 32 34
1 1 1 120 80 74
2 2 2 26 26 32
3 3 1 120 80 81
Upvotes: 4