Reputation: 425
I'm trying to write a regular expression for complex numbers (just as an exercise to study the re module), but I can't get it to work. I want the regex to match strings of the form '12+18j', '-14+45j', '54', '-87j' and so on. My attempt:
import re
num = r'[+-]?(?:\d*.\d+|\d+)'
complex_pattern = rf'(?:(?P<real>{num})|(?P<imag>{num}j))|(?:(?P=real)(?P=imag))'
complex_pattern = re.compile(complex_pattern)
But it doesn't work the way I want:
m = complex_pattern.fullmatch('1+12j')
m.groupdict()
Out[166]: {'real': None, 'imag': '1+12j'}
The reason for this structure is that I want the input string to contain a real part, an imaginary part, or both, and I also want to be able to extract the real and imag groups from the match object. There is another approach I tried that seems to work, except that it also matches empty strings (''):
complex_pattern = rf'(?P<real>{num})+(?P<imag>{num}j)+'
complex_pattern = re.compile(complex_pattern)
I guess I could check for the empty string with a plain if, but I'm interested in a cleaner way, and I'd like to know why the first implementation doesn't work as expected.
Upvotes: 1
Views: 903
Reputation: 173
Just for completeness, I wanted to add a solution that also allows basic scientific notation and accepts either i or j as the imaginary unit. I'm posting it in case other people come here, as I did, looking for a regular expression that finds complex numbers. The key difference is that a number with no imaginary part is not returned as a match.
It deviates from the original question in that it has no matching groups, but that could be changed; see the commented-out line with cx_num_groups.
This expression does not include matching groups for the real and imaginary parts because it allows for numbers such as 2j.
def _complex_re_gen():
    '''
    Build the pattern piece by piece, because it is complicated.
    Returns a regex string that matches complex numbers.
    '''
    num = r'(?:[+\-]?(?:\d*\.)?\d+)'
    num_sci = r'(?:{num}(?:e[+\-]?\d+)?)'.format(num=num)
    cx_num = r'(?:{num_sci}?{num_sci}[ij])'.format(num_sci=num_sci)
    #cx_num_groups = cx_num = r'(?:(?P<real>{num_sci})?(?P<img>{num_sci}[ij])?)'.format(num_sci=num_sci)
    cx_match_wrapped = r"^(?:{cx_num}|\({cx_num}\))$".format(cx_num=cx_num)
    return cx_match_wrapped
With the following test strings, this regexp matches the ones whose comment says "match":
cmplx_tests = [
    '1 + 2j',     # no match
    '1e5-2e-2j',  # match
    'i2 +4j',     # no match
    '1.25',       # no match
    '-5-3.2i',    # match
    '64.2-3.9j',  # match
]
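The list above can be checked with a short script. This is only a sketch: the function body is copied from this answer, while cx_re and results are names made up for the example.

```python
import re

def _complex_re_gen():
    # Same building blocks as in the answer above.
    num = r'(?:[+\-]?(?:\d*\.)?\d+)'
    num_sci = r'(?:{num}(?:e[+\-]?\d+)?)'.format(num=num)
    cx_num = r'(?:{num_sci}?{num_sci}[ij])'.format(num_sci=num_sci)
    return r"^(?:{cx_num}|\({cx_num}\))$".format(cx_num=cx_num)

# cx_re and results are illustrative names, not part of the original answer.
cx_re = re.compile(_complex_re_gen())

cmplx_tests = ['1 + 2j', '1e5-2e-2j', 'i2 +4j', '1.25', '-5-3.2i', '64.2-3.9j']

# Map each test string to True (match) or False (no match).
results = {s: cx_re.match(s) is not None for s in cmplx_tests}

for s, matched in results.items():
    print(f'{s!r:>12} -> {"match" if matched else "no match"}')
```

Because the pattern is anchored with ^ and $, re.match is enough here; re.fullmatch would behave the same way for these inputs.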
I wrote this answer in part because I wanted to solve a problem from another post: parsing complex arrays inside parameter files.
Upvotes: 1
Reputation: 425
Even though I accepted Wiktor Stribiżew's answer and consider it really good, I have to add something I noticed. The last string in the texts list isn't grouped correctly (i.e. '-87j' -> real: -8; imag: 7j). To address this, I propose the following change to a simplified version of his answer:
import re
num = r'[+-]?(?:\d*\.\d+|\d+)'
pattern = rf'(?!$)(?P<real>{num}(?!\d))?(?P<imag>{num}j)?'
texts = ['1+12j', '12+18j','-14+45j','54','-87j']
for text in texts:
    match = re.fullmatch(pattern, text)
    if match:
        print(f'{text:>7} => {match.groupdict()}')
    else:
        print(f'{text:>7} did not match!')
Output:
  1+12j => {'real': '1', 'imag': '+12j'}
 12+18j => {'real': '12', 'imag': '+18j'}
-14+45j => {'real': '-14', 'imag': '+45j'}
     54 => {'real': '54', 'imag': None}
   -87j => {'real': None, 'imag': '-87j'}
The important difference here is the (?!\d) added to the 'real' group of the regex, which prevents strings like '-87j' from being split into '-8' and '7j'.
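The effect of the lookahead can be seen by running the same pattern with and without it. This is a minimal sketch: split_parts, without_guard, with_guard and stricter are illustrative names, and the stricter variant using (?![\d.]) is an assumption of this example, not part of the pattern above.

```python
import re

num = r'[+-]?(?:\d*\.\d+|\d+)'

# The same pattern without and with the (?!\d) lookahead after the real part.
without_guard = rf'(?!$)(?P<real>{num})?(?P<imag>{num}j)?'
with_guard = rf'(?!$)(?P<real>{num}(?!\d))?(?P<imag>{num}j)?'

# Assumption of this sketch: also forbid a '.' right after the real part,
# so that a decimal imaginary number is not split at the decimal point.
stricter = rf'(?!$)(?P<real>{num}(?![\d.]))?(?P<imag>{num}j)?'

def split_parts(pattern, text):
    """Return (real, imag) as captured by pattern, or None if text does not match."""
    m = re.fullmatch(pattern, text)
    return (m['real'], m['imag']) if m else None

print(split_parts(without_guard, '-87j'))  # ('-8', '7j')
print(split_parts(with_guard, '-87j'))     # (None, '-87j')
print(split_parts(with_guard, '1.5j'))     # ('1', '.5j')
print(split_parts(stricter, '1.5j'))       # (None, '1.5j')
```

The third call suggests that (?!\d) alone may still split a purely imaginary decimal such as '1.5j' at the dot; widening the lookahead to (?![\d.]) is one possible way around that.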
Upvotes: 0
Reputation: 1486
Does this work for what you want?
import re
words= '+122+6766j'
pattern = re.compile(r'((^[-+]?(?P<real>\d+))?[-+]?(?P<img>\d{2,}j?\w)?)')
pattern.fullmatch(words).groupdict()
Output
{'real': '122', 'img': '6766j'}
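To see how this behaves on the strings from the question, here is a quick check. It is only a sketch: samples and results are names invented for the example, and '12+1j' is an extra input added to probe the \d{2,} part.

```python
import re

pattern = re.compile(r'((^[-+]?(?P<real>\d+))?[-+]?(?P<img>\d{2,}j?\w)?)')

# The question's sample strings plus one extra case with a single-digit imaginary part.
samples = ['1+12j', '12+18j', '-14+45j', '54', '-87j', '12+1j']

results = {}
for text in samples:
    m = pattern.fullmatch(text)
    # Store the named groups, or None when the whole string does not match.
    results[text] = m.groupdict() if m else None
    print(f'{text:>7} => {results[text]}')
```

Two things stand out when running it: the signs sit outside the named groups, so '-14+45j' yields real '14' and img '45j', and an imaginary part with a single digit, as in '12+1j', does not match because of \d{2,}.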
Upvotes: 0
Reputation: 627327
I suggest using
import re
pattern = r'^(?!$)(?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))?(?:(?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j)))?$'
texts = ['1+12j', '12+18j','-14+45j','54','-87j']
for text in texts:
    match = re.fullmatch(pattern, text)
    if match:
        print(text, '=>', match.groupdict())
    else:
        print(f'{text} did not match!')
See the Python demo. Output:
1+12j => {'real': '1', 'sign1': '', 'number1': '1', 'imag': '+12j', 'sign2': '+', 'number2': '12j'}
12+18j => {'real': '12', 'sign1': '', 'number1': '12', 'imag': '+18j', 'sign2': '+', 'number2': '18j'}
-14+45j => {'real': '-14', 'sign1': '-', 'number1': '14', 'imag': '+45j', 'sign2': '+', 'number2': '45j'}
54 => {'real': '54', 'sign1': '', 'number1': '54', 'imag': None, 'sign2': None, 'number2': None}
-87j => {'real': '-8', 'sign1': '-', 'number1': '8', 'imag': '7j', 'sign2': '', 'number2': '7j'}
See the regex demo.
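As for why the first attempt in the question behaves the way it does, two things seem to be involved: the unescaped . in num matches any character (including +), and (?P=real) is a backreference to the text already captured, not a reuse of the subpattern. A minimal sketch, where loose, strict, question_pattern, escaped_pattern and backref are names made up for the illustration:

```python
import re

loose = r'[+-]?(?:\d*.\d+|\d+)'    # '.' unescaped, as in the question
strict = r'[+-]?(?:\d*\.\d+|\d+)'  # '.' escaped, matches only a literal dot

# The unescaped dot lets a single "number" swallow the '+' sign.
print(bool(re.fullmatch(loose, '1+12')))   # True
print(bool(re.fullmatch(strict, '1+12')))  # False

# The question's pattern, rebuilt with each variant of num.
template = r'(?:(?P<real>{n})|(?P<imag>{n}j))|(?:(?P=real)(?P=imag))'
question_pattern = re.compile(template.format(n=loose))
escaped_pattern = re.compile(template.format(n=strict))

print(question_pattern.fullmatch('1+12j').groupdict())  # {'real': None, 'imag': '1+12j'}
# With the dot escaped nothing matches: in the second branch neither group
# has captured anything yet, so the backreferences cannot succeed.
print(escaped_pattern.fullmatch('1+12j'))  # None

# A backreference repeats the captured text, not the pattern.
backref = re.compile(rf'(?P<real>{strict})(?P=real)')
print(bool(backref.fullmatch('1212')))  # True: '12' followed by '12' again
print(bool(backref.fullmatch('1234')))  # False
```

This is why the suggested pattern spells out the number subpattern twice instead of referring back to a group.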
Details
^ - start of string
(?!$) - no end of string should follow at this position (no empty input is allowed)
(?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))? - a "real" group:
    (?P<sign1>[+-]?) - an optional - or + sign captured into Group "sign1"
    (?P<number1>\d+(?:\.\d+)?) - one or more digits followed with an optional sequence of a . and one or more digits captured into Group "number1"
(?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j))? - an optional sequence captured into "imag" group:
    (?P<sign2>[+-]?) - an optional - or + sign captured into Group "sign2"
    (?P<number2>\d+(?:\.\d+)?j) - one or more digits followed with an optional sequence of a . and one or more digits and then a j char captured into Group "number2"
$ - end of string.
Upvotes: 2