Yiling Liu

Reputation: 686

How do I match the parts of a string that do not belong to a set of substrings with a regex?

For example:

I have a string which is:

k0 + k1 * x + k2 * x ** 2 + a + b + 3

I want to discard every k\d token, a, and b, and get the remaining pieces back as a list.

The return values should be:

[' + ', ' * x + ', ' * x ** 2 + ', ' + ', ' + 3']

I tried [^k\d+,a,b]+

but it does not treat k followed by one or more digits as a single unit. In fact, it removes all digits and + signs from the result.

Is there any way to solve this problem?

Simple Python code for testing:

import re

# this regex is the wrong one
str_format = re.compile(r'[^k\d+,a,b]+')
text = 'k0 + k1 * x + k2 * x ** 2 + b + a +3'
print(str_format.findall(text))

Upvotes: 1

Views: 167

Answers (1)

Wiktor Stribiżew

Reputation: 626845

A character class matches a single character, not a sequence of characters. In your case, you can get the result by splitting with a pattern like

k\d+|[ab]

which matches either k followed by one or more digits (as a sequence of characters), or a single a or b. Then remove the empty strings from the split result to get the final list:

import re
text = 'k0 + k1 * x + k2 * x ** 2 + b + a +3'
# filter(None, ...) drops the empty strings that re.split produces
print(list(filter(None, re.split(r'k\d+|[ab]', text))))

See the online Python demo
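
To make the difference concrete, here is a minimal sketch that runs the negated character class and the split approach on the same test string. The helper name split_keep_rest is hypothetical and used only for illustration. The last call rests on an assumption that is not part of the question: if the expression may contain longer names such as abs, wrapping the pattern in word boundaries keeps [ab] from matching inside them.

```python
import re

text = 'k0 + k1 * x + k2 * x ** 2 + b + a +3'

# The negated class works one character at a time: it excludes k, every
# digit, '+', ',', a and b individually, so only runs made of spaces,
# '*' and 'x' are matched. Digits and '+' never appear in the result.
print(re.findall(r'[^k\d+,a,b]+', text))


def split_keep_rest(s, pattern=r'k\d+|[ab]'):
    """Split s on whole tokens and drop the empty pieces (hypothetical helper)."""
    return [part for part in re.split(pattern, s) if part]


# Splitting on whole tokens keeps everything between them, including
# digits and '+' signs.
print(split_keep_rest(text))
# [' + ', ' * x + ', ' * x ** 2 + ', ' + ', ' +3']

# Assumption: identifiers longer than one letter may occur. With \b on both
# sides, the a and b inside "abs" are no longer treated as separators.
print(split_keep_rest('k0 + abs(x) + a', r'\b(?:k\d+|[ab])\b'))
# [' + abs(x) + ']
```

The list comprehension does the same job as filter(None, ...) in the code above; which one to use is a matter of taste.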

Upvotes: 1
