Reputation: 10859
I have a text of the form:
können {konnte, gekonnt} Verb
And I want to get a match for all words in it that are not in parenthesis. That means:
können = 1st match, Verb = 2nd match
Unfortunately I still don't get the knock of regular expression. There is a lot of testing possibility but not much help for creation unless you want to read a book.
I will use them in Java or Python.
Upvotes: 1
Views: 62
Reputation: 11233
A regex SPLIT using this pattern will do the job:
(\s+|\s*{[^}]*\}\s*)
and ignore any empty value.
Upvotes: 1
Reputation: 80031
In Python you could do this:
import re
regex = re.compile(r'(?:\{.*?\})?([^{}]+)', re.UNICODE)
print 'Matches: %r' % regex.findall(u'können {konnte, gekonnt} Verb')
Result:
Matches: [u'können ', u' Verb']
Although I would recommend simply replacing everything between {
and }
like so:
import re
regex = re.compile(r'\{.*?\}', re.UNICODE)
print 'Output string: %r' % regex.sub('', u'können {konnte, gekonnt} Verb')
Result:
Output string: u'können Verb'
Upvotes: 1