Reputation: 59
I want to replace part of a string in some columns of a DataFrame. For example, I want to remove prefixes such as b"SET and b"MULTISET from the df columns. How can I achieve that? The data and the required output are shown below.
import pandas as pd
# t holds the raw rows (not shown here)
columns = ['cust_id', 'cust_name', 'vehicle', 'details', 'bill']
df = pd.DataFrame(data=t, columns=columns)
df
   cust_id  cust_name                  vehicle                          details                                      bill
0  101      b"SET{'Tom','C'}"          b"MULTISET{'Toyota','Cruiser'}"  b"ROW('Street 1','12345678','NewYork, US')"  1200.00
1  102      b"SET{'Rachel','Green'}"   b"MULTISET{'Ford','se'}"         b"ROW('Street 2','12344444','Florida, US')"  2400.00
2  103      b"SET{'Chandler','Bing'}"  b"MULTISET{'Dodge','mpv'}"       b"ROW('Street 1','12345555','Georgia, US')"  601.10
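The list t is not shown in the question. Here is a minimal sketch of data that would reproduce the frame above, assuming the three text columns hold Python bytes objects (which is what the b"..." display suggests). The literal values are copied from the printed output; everything else is a guess.

```python
import pandas as pd

# Hypothetical reconstruction of `t`; the question does not show how it was built.
# Assumes the text columns are bytes objects, as the b"..." repr suggests.
t = [
    (101, b"SET{'Tom','C'}", b"MULTISET{'Toyota','Cruiser'}",
     b"ROW('Street 1','12345678','NewYork, US')", 1200.00),
    (102, b"SET{'Rachel','Green'}", b"MULTISET{'Ford','se'}",
     b"ROW('Street 2','12344444','Florida, US')", 2400.00),
    (103, b"SET{'Chandler','Bing'}", b"MULTISET{'Dodge','mpv'}",
     b"ROW('Street 1','12345555','Georgia, US')", 601.10),
]
columns = ['cust_id', 'cust_name', 'vehicle', 'details', 'bill']
df = pd.DataFrame(data=t, columns=columns)

# The three text columns show up with dtype "object" because they hold bytes.
print(df.dtypes)
```

If the values turn out to be ordinary str objects that merely start with the characters b", the decoding step is not needed and a different clean-up would apply.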
Required Output:
   cust_id  cust_name            vehicle               details                                bill
0  101      {'Tom','C'}          {'Toyota','Cruiser'}  ('Street 1','12345678','NewYork, US')  1200.00
1  102      {'Rachel','Green'}   {'Ford','se'}         ('Street 2','12344444','Florida, US')  2400.00
2  103      {'Chandler','Bing'}  {'Dodge','mpv'}       ('Street 1','12345555','Georgia, US')  601.10
Upvotes: 1
Views: 74
Reputation: 8302
Here is a possible solution:
Define the columns to clean: columns = ['cust_name', 'vehicle', 'details']
Use a regular expression to extract everything between {} or (): regex_ = r"([\{|\(].*[\}|\)])"
str.decode('ascii') converts the column values from bytes to strings.
columns = ['cust_name', 'vehicle', 'details']
regex_ = r"([\{|\(].*[\}|\)])"
for col in columns:
    df[col] = df[col].str.decode('ascii').str.extract(regex_)
cust_id cust_name ... details bill
0 101 {'Tom','C'} ... ('Street 1','12345678','NewYork, US') 1200.0
1 102 {'Rachel','Green'} ... ('Street 2','12344444','Florida, US') 2400.0
2 103 {'Chandler','Bing'} ... ('Street 1','12345555','Georgia, US') 601.1
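To see what the pattern captures, the same two steps (decode, then regex search) can be tried on single values with the standard re module. This is only an illustrative sketch: strip_prefix is a hypothetical helper, not part of pandas, and the sample values are taken from the question.

```python
import re

# Pattern from the answer: an opening { or (, anything, then a closing } or ).
# Inside [...] the | is a literal character, not alternation, so these classes
# also accept a pipe; [{(] and [})] would be the tighter spelling.
regex_ = r"([\{|\(].*[\}|\)])"


def strip_prefix(value: bytes) -> str:
    """Decode one bytes value and return its bracketed part (hypothetical helper)."""
    text = value.decode('ascii')
    match = re.search(regex_, text)
    # Fall back to the decoded text when no bracketed part is found.
    return match.group(1) if match else text


samples = [
    b"SET{'Tom','C'}",
    b"MULTISET{'Toyota','Cruiser'}",
    b"ROW('Street 1','12345678','NewYork, US')",
]
cleaned = [strip_prefix(s) for s in samples]
print(cleaned)
```

One difference from the helper above: with Series.str.extract, a row that does not match the pattern becomes NaN instead of keeping its decoded text.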
Upvotes: 1