Reputation: 159
Suppose I have a pandas data frame df:
userid    subcategory            timestamp                 smartexpenseid                                      companyid
20648196  SmartExpense Declined  2016-03-06T16:44:55.702Z  11771712||91164585||||                              9797
43124398  SmartExpense Declined  2016-03-06T17:09:06.033Z  11111111|249178181?CARRT?266298850196|93461910||||  63177
76764125  SmartExpense Declined  2016-03-06T19:44:19.078Z  137177|250155900?HOTEL?270593373724|92826286||||    199412
I want to split the smartexpenseid column into separate columns in the same data frame. A value such as 11111111|249178181?CARRT?266298850196|93461910|||| follows the layout CctKey|TripId?SegType?SegId|EreceiptId|PctKey|MeKey|RcKey|CapKey, and each of those fields should become its own column.
Can somebody please suggest the best way to do this in Python?
Upvotes: 0
Views: 341
Reputation: 2557
Try this regular expression:
(?<CctKey>\d+)\|(?<TripId>\d*)\??(?<SegType>[^?]*)\??(?<SegId>\d*)\|(?<EreceiptId>\d+)\|(?<PctKey>[^|]*)\|(?<MeKey>[^|]*)\|(?<RcKey>[^|]*)\|(?<CapKey>[^|\n\s]*)
Python's re module does not accept the (?<name>...) named-group syntax. Either write the groups as (?P<name>...) or remove the names:
(\d+)\|(\d*)\??([^?]*)\??(\d*)\|(\d+)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|\n\s]*)
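To tie this back to the data frame in the question, here is a minimal sketch that applies the pattern with pandas' Series.str.extract. It makes two assumptions that are not in the original answer: the groups are renamed to Python's (?P<name>...) form so that the extracted columns are labelled automatically, and SegType uses [^?|]* instead of [^?]* so that it cannot run across a field separator. The sample rows are copied from the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "userid": [20648196, 43124398, 76764125],
    "subcategory": ["SmartExpense Declined"] * 3,
    "timestamp": [
        "2016-03-06T16:44:55.702Z",
        "2016-03-06T17:09:06.033Z",
        "2016-03-06T19:44:19.078Z",
    ],
    "smartexpenseid": [
        "11771712||91164585||||",
        "11111111|249178181?CARRT?266298850196|93461910||||",
        "137177|250155900?HOTEL?270593373724|92826286||||",
    ],
    "companyid": [9797, 63177, 199412],
})

# Same structure as the pattern above, using Python's (?P<name>...) groups.
# Assumption: SegType is [^?|]* (not [^?]*) so it stops at a pipe.
pattern = (
    r"(?P<CctKey>\d+)\|"
    r"(?P<TripId>\d*)\??(?P<SegType>[^?|]*)\??(?P<SegId>\d*)\|"
    r"(?P<EreceiptId>\d+)\|"
    r"(?P<PctKey>[^|]*)\|"
    r"(?P<MeKey>[^|]*)\|"
    r"(?P<RcKey>[^|]*)\|"
    r"(?P<CapKey>[^|\s]*)"
)

# str.extract returns one column per named group; join them onto df.
parts = df["smartexpenseid"].str.extract(pattern)
df = df.join(parts)

print(df[["userid", "CctKey", "TripId", "SegType", "SegId", "EreceiptId"]])
```

The extracted values are strings, and fields that are empty in the input come back as empty strings, so convert them (for example with pd.to_numeric) if numeric columns are needed.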
Upvotes: 1