Reputation: 559
I need to convert strings that come from user input. The use case is pretty simple: a single semicolon separates lines, and a doubled semicolon stands for one literal semicolon.
In theory, no big problem. I use Python, but I'm sure this is just as easy with regular expressions in other languages.
import re

def get_lines(text):
    """Return a list of lines (list of str)."""
    command_stacking = ";"
    delimiter = re.escape(command_stacking)
    re_del = re.compile("(?<!{s}){s}(?!{s})".format(s=delimiter), re.UNICODE)
    chunks = re_del.split(text)
    # Clean the double delimiters
    for i, chunk in enumerate(chunks):
        chunks[i] = chunk.replace(2 * command_stacking, command_stacking)
    return chunks
That seems to work:
>>> get_lines("first line;second line;third line with;;a semicolon")
['first line', 'second line', 'third line with;a semicolon']
>>>
But when there are three or four semicolons in a row, it doesn't behave as expected.
The regular expression ignores the multiple semicolons (as it should), but when replacing ;; by ;, a run of ;;; becomes ;; and a run of ;;;; also becomes ;;, and so on. It would be great if 2 were replaced by 1, 3 by 2, 4 by 3... that's something I could explain to my users.
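A minimal sketch of the behavior described above, assuming the same str.replace call as in get_lines. The helper name naive_unescape is hypothetical and only used for illustration. It shows that runs of 2 and 3 happen to come out as hoped, while a run of 4 collapses to 2 instead of 3:

```python
def naive_unescape(chunk):
    # Same call as in get_lines: str.replace consumes non-overlapping
    # pairs from left to right, so a run of n semicolons keeps (n + 1) // 2.
    return chunk.replace(";;", ";")

for n in range(2, 6):
    print(n, "->", len(naive_unescape(";" * n)))
# prints: 2 -> 1, 3 -> 2, 4 -> 2, 5 -> 3 (one pair per line)
```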
What would be the best solution to do that?
Thanks for your help,
Upvotes: 2
Views: 434
Reputation: 23743
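The repl argument of re.sub can be a function: it is called with each match object and returns the replacement string, so every run of two or more semicolons can be shortened by one.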
>>> import re
>>> s = 'a;;b;;;c;;;;d'
>>> pattern = ';{2,}'
>>> def f(m):
...     return m.group(0)[1:]
...
>>> re.sub(pattern, f, s)
'a;b;;c;;;d'
>>>
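One way this might be folded back into the question's get_lines is sketched below. It assumes the delimiter is a literal semicolon, and the names SPLIT_RE and RUN_RE are made up for this example. The split step is the one from the question; only the cleanup step changes.

```python
import re

SPLIT_RE = re.compile(r"(?<!;);(?!;)")  # a lone semicolon: the separator
RUN_RE = re.compile(r";{2,}")           # a run of two or more semicolons

def get_lines(text):
    """Split on lone semicolons, then drop one semicolon from each run."""
    chunks = SPLIT_RE.split(text)
    # The replacement function receives the match and returns the run minus one.
    return [RUN_RE.sub(lambda m: m.group(0)[1:], chunk) for chunk in chunks]

print(get_lines("a;;;b;c;;;;d"))
# prints ['a;;b', 'c;;;d']
```

With this version, 2 semicolons become 1, 3 become 2, 4 become 3, and so on.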
Upvotes: 1
Reputation: 26667
You can use re.split with lookarounds.
Example
>>> import re
>>> string = "first line;second line;third line with;;a semicolon"
>>> re.split(r'(?<!;);(?!;)', string)
['first line', 'second line', 'third line with;;a semicolon']
Regex
(?<!;) Negative lookbehind. Checks that the ; is not preceded by another ;
; Matches the ;
(?!;) Negative lookahead. Checks that the ; is not followed by another ;
>>> [x.replace(';;', ';') for x in re.split(r'(?<!;);(?!;)', string)]
['first line', 'second line', 'third line with;a semicolon']
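To see how the two lookarounds treat longer runs, here is a small sketch. The function name split_commands is hypothetical. Every semicolon inside a run has another semicolon on at least one side, so none of them match and the run stays inside its chunk:

```python
import re

def split_commands(text):
    # Only a semicolon with no semicolon neighbor on either side is a separator.
    return re.split(r"(?<!;);(?!;)", text)

print(split_commands("a;b;;c;;;d"))
# prints ['a', 'b;;c;;;d']
```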
Upvotes: 0
Reputation: 8917
Instead of the string replace method, use re.sub() with count=1:
import re
re.sub(';;', ';', 'foo;;;bar', count=1)
https://docs.python.org/2/library/re.html#re.sub
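A short sketch of what count=1 does, with a hypothetical helper name unescape_first. Note that count limits the number of replacements in the whole string, so only the first pair found is shortened; a chunk containing several separate runs would need a different approach:

```python
import re

def unescape_first(chunk):
    # count=1: replace only the first ';;' found, scanning left to right.
    return re.sub(";;", ";", chunk, count=1)

print(unescape_first("foo;;;bar"))   # prints foo;;bar
print(unescape_first("foo;;;;bar"))  # prints foo;;;bar
print(unescape_first("a;;b;;c"))     # prints a;b;;c (second run untouched)
```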
Upvotes: 1