Reputation: 831
I have a string,
line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'
and I would like to replace the commas that appear between quotation marks with ":". The result I am looking for is
line = '12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,'
So far I have been able to match this pattern with,
import re
re.findall(r'("\d),(.+?")', line)
however, I guess I should be using
re.compile(...something..., line)
re.sub(':', line)
Does anyone know how to do this? Thanks, labjunky
Upvotes: 4
Views: 4799
Reputation: 627082
There is also a generic regex solution for replacing any pattern, fixed or not, inside double (or single) quotes: match the double- or single-quoted substrings with the corresponding pattern, and pass a callable as the replacement argument to re.sub, where you can manipulate the match:
Replacing commas in between double quotes with colons and removing the double quotes (the current OP scenario):
re.sub(r'"([^"]*)"', lambda x: x.group(1).replace(',', ':'), line)
# => 12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,
Replacing commas in between double quotes with colons and keeping the double quotes:
re.sub(r'"[^"]*"', lambda x: x.group(0).replace(',', ':'), line)
# => 12/08/2013,3,"9:25",42:51,"3:08","12:9","13:9",159,170,"3:19",437,
Replacing commas in between double and single quotes with colons and keeping the single/double quotes:
re.sub(r''''[^']*'|"[^"]*"''', lambda x: x.group(0).replace(',', ':'), '''0,1,"2,3",'4,5',''')
# => 0,1,"2:3",'4:5',
Also, if you need to handle escaped single and double quotes, consider using r"'[^\\']*(?:\\.[^\\']*)*'" (for single-quoted substrings), r'"[^\\"]*(?:\\.[^\\"]*)*"' (for double-quoted substrings), or, for both, r''''[^\\']*(?:\\.[^\\']*)*'|"[^\\"]*(?:\\.[^\\"]*)*"''' instead of the patterns above.
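To illustrate why the escape-aware pattern matters, here is a minimal sketch comparing it with the simple pattern. The sample string and the names DQ_SIMPLE, DQ_ESCAPED and to_colons are made up for the illustration; only the two patterns come from the answer above.

```python
import re

# The two double-quote patterns discussed above.
DQ_SIMPLE = r'"[^"]*"'
DQ_ESCAPED = r'"[^\\"]*(?:\\.[^\\"]*)*"'

# Hypothetical input: the first quoted field contains backslash-escaped quotes.
sample = r'1,"a,\"b,c\",d",2,"3,4"'

def to_colons(match):
    # Replace commas inside the matched quoted substring, keeping the quotes.
    return match.group(0).replace(',', ':')

naive = re.sub(DQ_SIMPLE, to_colons, sample)
aware = re.sub(DQ_ESCAPED, to_colons, sample)

print(naive)  # 1,"a:\"b,c\":d",2,"3:4"   (the comma in b,c is missed)
print(aware)  # 1,"a:\"b:c\":d",2,"3:4"
```

The simple pattern treats each escaped quote as the end of a field, so the comma between b and c is left alone; the escape-aware pattern consumes the whole field as one match.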
Upvotes: 0
Reputation: 98068
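import re
line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'
r = ""
# Splitting on a captured group keeps the quoted fields in the result list.
for t in re.split(r'("[^"]*")', line):
    if t.startswith('"'):
        # Quoted field: swap commas for colons, then strip the quotes.
        t = t.replace(",", ":")[1:-1]
    r += t
print(r)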
Prints:
12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,
Upvotes: 0
Reputation: 369274
>>> import re
>>> line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'
>>> re.sub(r'"(\d+),(\d+)"', r'\1:\2', line)
'12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,'
\1 and \2 refer to the matched groups.
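Since the question mentions re.compile, here is a small sketch of the same substitution with a precompiled pattern, which may be handy when many lines are processed. The name quoted_pair is only illustrative.

```python
import re

line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'

# Compile once and reuse; quoted_pair is an illustrative name.
quoted_pair = re.compile(r'"(\d+),(\d+)"')

# Pattern.sub(repl, string): \1 and \2 insert the first and second groups,
# so the surrounding quotes (which are outside the groups) are dropped.
fixed = quoted_pair.sub(r'\1:\2', line)
print(fixed)  # 12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,
```

Note that this pattern only touches quoted fields of the exact form digits, comma, digits; any other quoted field is left unchanged.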
Non-regex solution:
>>> ''.join(x if i % 2 == 0 else x.replace(',', ':')
...         for i, x in enumerate(line.split('"')))
'12/08/2013,3,9:25,42:51,3:08,12:9,13:9,159,170,3:19,437,'
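The non-regex version works because splitting on '"' puts the text outside quotes at even indexes and the text inside quotes at odd indexes. A rough sketch of that idea, assuming the quotes are balanced and never escaped (the function name colon_inside_quotes is made up here):

```python
line = '12/08/2013,3,"9,25",42:51,"3,08","12,9","13,9",159,170,"3,19",437,'

# Even indexes: text outside quotes. Odd indexes: text that was inside quotes.
pieces = line.split('"')
print(pieces[:4])  # ['12/08/2013,3,', '9,25', ',42:51,', '3,08']

def colon_inside_quotes(text):
    # Assumes balanced, unescaped double quotes; the quotes are dropped.
    return ''.join(
        part.replace(',', ':') if index % 2 else part
        for index, part in enumerate(text.split('"'))
    )

print(colon_inside_quotes(line))
```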
Upvotes: 8