shubham

Reputation: 83

How do I find a regular expression pattern that matches a name made of underscores and alphanumeric characters of any length, followed by = (equals) and a number?

I need a Python regular expression pattern that matches an optional name and equals sign followed by an integer in a function. Python names start with an alphabetic or underscore character and continue with alphabetic, numeric, or underscore characters. Integers have an optional sign followed by a non-zero number of digits. No spaces are allowed in the text.

Matches should provide the name and value in a dictionary. For example,

m = re.match(the_pattern, 'x=3')
m.groupdict()

will return

{'name': 'x', 'value': '3'}.

Some test cases

re.match(p4a,'_a_b_c_12_=12').groupdict() --> {'name': '_a_b_c_12_', 'value': '12'}

re.match(p4a,'x=-12345').groupdict() --> {'name': 'x', 'value': '-12345'}

import re

#pattern = r"([_]+$?[A-Za-z0-9_]+$[=][0-9])"
pattern = r"(([A-Za-z]|_)|[0-9]|[=]\d+$)"
if re.match(pattern, "LS8"):
    print("Match 1")

if re.match(pattern, "_a_b_c_12_=12"):
    text = "_a_b_c_12_=12"
    items = text.split('=')
    d = {'name': items[0], 'value': items[-1]}

    print("Match 2 ", d)

if re.match(pattern, "1ab"):
    print("Match 3")

I am getting output for all three cases (Match 1, Match 2 and Match 3), but I want only this output:

Match 2  {'name': '_a_b_c_12_', 'value': '12'}

Upvotes: 0

Views: 755

Answers (3)

Nick

Reputation: 147206

You can use this regex, which has named groups to capture the name and value parts of the input string as you have specified in the question:

(?:(?P<name>[_a-z][a-z0-9_]*)=)?(?P<value>[+-]?\d+)$

You can pass each of your strings to re.match to test them against the pattern, printing the item number and the match group dictionary when a match is found:

import re

pattern = r'(?:(?P<name>[_a-z][a-z0-9_]*)=)?(?P<value>[+-]?\d+)$'

for i, e in enumerate(['LS8', '_a_b_c_12_=12', '1ab', '-123', '4ab=5', 'a=3x']):
    m = re.match(pattern, e, re.I)
    if m:
        print('Match ' + str(i+1) + ' ', m.groupdict())

Output:

Match 2  {'value': '12', 'name': '_a_b_c_12_'}
Match 4  {'value': '-123', 'name': None}
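
One caveat worth noting: `$` also matches just before a trailing newline, so an input such as 'x=3\n' would still match the pattern above. If that matters, a minimal sketch is to drop the `$` and use `re.fullmatch` (Python 3.4+), which requires the whole string to match. The helper name `parse_assignment` is hypothetical and only used for illustration.

```python
import re

# Same pattern as above, minus the trailing `$`: fullmatch() already
# requires the entire string to be consumed, so no end anchor is needed.
PATTERN = re.compile(
    r'(?:(?P<name>[_a-z][a-z0-9_]*)=)?(?P<value>[+-]?\d+)', re.I)


def parse_assignment(text):
    """Return the name/value dict, or None if text does not match.

    Hypothetical helper, not part of the original answer.
    """
    m = PATTERN.fullmatch(text)
    return m.groupdict() if m else None


print(parse_assignment('x=3'))    # matches: name 'x', value '3'
print(parse_assignment('x=3\n'))  # None: the trailing newline is rejected
```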

Upvotes: 1

yyyyyyyan

Reputation: 420

The regex you're looking for is ^([A-Za-z_][A-Za-z_0-9]*)=([-+]?\d+)$.

  • ^ and $ ensure that the match starts at the beginning of the string and ends at the end of it.
  • [A-Za-z_] matches the first character of the Python identifier, which must be a letter (any case) or an underscore.
  • [A-Za-z_0-9]* matches the following characters of the identifier (if any, as the * specifies), which can be the same as the first character plus any digit.
  • = matches one equals sign.
  • [-+]?\d+ matches the end of the string: an optional sign followed by one or more digits.
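
The pattern above uses plain (numbered) groups, so m.groupdict() would return an empty dictionary. As a minimal sketch, assuming the name part is required (as it is in this pattern), the two groups can be named with (?P<name>...) and (?P<value>...) so that groupdict() returns the dictionary the question asks for:

```python
import re

# The regex from this answer, with its two groups named so that
# groupdict() is populated. The name part is NOT optional here.
pattern = re.compile(
    r'^(?P<name>[A-Za-z_][A-Za-z_0-9]*)=(?P<value>[-+]?\d+)$')

for text in ['_a_b_c_12_=12', 'x=-12345', 'LS8', '1ab']:
    m = pattern.match(text)
    # Prints the groupdict for the first two inputs and None for the rest.
    print(text, '->', m.groupdict() if m else None)
```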

Upvotes: 0

Shiman Guo

Reputation: 26

For the given test cases, the following code should work:

import re

pattern = r"(?:(?P<name>[A-Za-z_][A-Za-z0-9_]*)=)?(?P<value>-?\d+)$"

match1 = re.match(pattern, "LS8")
if match1:
    print("Match 1 ", match1.groupdict())

match2 = re.match(pattern, "_a_b_c_12_=12")
if match2:
    print("Match 2 ", match2.groupdict())

match3 = re.match(pattern, "1ab")
if match3:
    print("Match 3 ", match3.groupdict())

match4 = re.match(pattern, "123")
if match4:
    print("Match 4 ", match4.groupdict())

The output:

Match 2  {'name': '_a_b_c_12_', 'value': '12'}
Match 4  {'name': None, 'value': '123'}
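
Note that -? accepts only an optional minus sign. The question says "an optional sign" without spelling out whether a leading + counts, so treating + as valid is an assumption; if it should be allowed, [+-]? can be used instead. A small sketch comparing the two variants (the names `minus_only` and `either_sign` are illustrative):

```python
import re

# Pattern from this answer: optional minus sign only.
minus_only = re.compile(
    r"(?:(?P<name>[A-Za-z_][A-Za-z0-9_]*)=)?(?P<value>-?\d+)$")

# Variant that also accepts a leading plus sign (an assumption about the spec).
either_sign = re.compile(
    r"(?:(?P<name>[A-Za-z_][A-Za-z0-9_]*)=)?(?P<value>[+-]?\d+)$")

for text in ["x=-12345", "x=+5"]:
    print(text, bool(minus_only.match(text)), bool(either_sign.match(text)))

# x=-12345 True True
# x=+5 False True
```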

Upvotes: 1
