Kun Fang
Kun Fang

Reputation: 25

How to split column to multiple columns with some features?

I have a column like this

col1  
(0,0):19,(1,1):155  
(0,0):13,(0,1):32,(1,1):34  
(0,0):56,(1,0):43,(1,1):38  
(0,0):67,(0,1):45,(1,0):98,(1,1):70

and I would like to split it into four columns:

(0,0)    (0,1)    (1,0)    (1,1)  
19        0        0       155  
13        32       0       34  
56        0        43      38  
67        45       98      70

I have try col.str.split(',\s*(?![^()]*\))',expand=True) but can only get:

col1    col2    col3    col4  
(0,0):19    (1,1):155    NAN    NAN  
(0,0):13    (0,1):32     (1,1):34    NAN   
(0,0):56    (1,0):43     (1,1):38    NAN  
(0,0):67    (0,1):45     (1,0):98   (1,1):70

I was wondering if there is a way get what I show above? Thanks a lot!
Kun

Upvotes: 2

Views: 52

Answers (2)

Quang Hoang
Quang Hoang

Reputation: 150765

My approach with double str.split and stack/unstack:

(df.col1.str.split(',\s*(?![^()]*\))', expand=True)
   .stack().str.split(':',expand=True)
   .set_index([0], append=True)
   .reset_index(level=1, drop=True)[1]
   .unstack(fill_value=0)
)

Output:

0 (0,0) (0,1) (1,0) (1,1)
0    19     0     0   155
1    13    32     0    34
2    56     0    43    38
3    67    45    98    70

Upvotes: 1

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626950

You may extract the digits after (0,0), (0,1), (1,0) and (1,1) and then drop the col1 column:

import pandas as pd
df = pd.DataFrame({'col1': ['(0,0):19,(1,1):155','(0,0):13,(0,1):32,(1,1):34','(0,0):56,(1,0):43,(1,1):38','(0,0):67,(0,1):45,(1,0):98,(1,1):70']})
df['(0,0)'] = df['col1'].str.extract(r'\(0,0\):(\d+)', expand=False).fillna(0)
df['(0,1)'] = df['col1'].str.extract(r'\(0,1\):(\d+)', expand=False).fillna(0)
df['(1,0)'] = df['col1'].str.extract(r'\(1,0\):(\d+)', expand=False).fillna(0)
df['(1,1)'] = df['col1'].str.extract(r'\(1,1\):(\d+)', expand=False).fillna(0)
df = df.drop(columns = ['col1'])

Output:

>>> df
  (0,0) (0,1) (1,0) (1,1)
0    19     0     0   155
1    13    32     0    34
2    56     0    43    38
3    67    45    98    70

Upvotes: 1

Related Questions