Doggy Face
Doggy Face

Reputation: 429

How to extract multiple numbers from Pandas Dataframe

I have a DataFrame that looks like this:

|Index| Dimension |
|-----|-----------|
|0    |1@43X32X34 |
|1    |1@120X80X74|
|2    |2@26X26X32 |
|3    |1@120X80X81|

I want to extract the number from the Dimension column and split it into multiple column:

|Index| Amount|Length|Width|Height|
|-----|-------|------|-----|------|
|0    |      1|    43|   32|    34|
|1    |      1|   120|   80|    74|
|2    |      2|    26|   26|    32|
|3    |      1|   120|   80|    81|

How to do that using the Pandas module in Python? Thank you!

Upvotes: 5

Views: 942

Answers (2)

sammywemmy
sammywemmy

Reputation: 28709

You can use pandas str split with expand=True, the delimiters are @ and X, so passing them in will ensure appropriate splits. You can then insert Index as the first column and rewrite the column names:

M = df.Dimension.str.split('[@X]',expand=True)
M.insert(0,'Index',df.Index)
M.columns = ['Index','Amount','Length','Width','Height']


   Index    Amount  Length  Width   Height
0   0       1         43    32      34
1   1       1        120    80      74
2   2       2         26    26      32
3   3       1        120    80      81

Upvotes: 4

cs95
cs95

Reputation: 402922

You can use extractall:

df2 = df.pop('Dimension').str.extractall('(\d+)')[0].unstack().astype(int) 
df2.columns = ['Amount', 'L', 'W', 'H']

Assuming you only have the one "Dimension" column, you are finished. Otherwise, concatenate this back to df:

pd.concat([df, df2], axis=1)

   Index  Amount    L   W   H
0      0       1   43  32  34
1      1       1  120  80  74
2      2       2   26  26  32
3      3       1  120  80  81

Upvotes: 4

Related Questions