Bdfy

Reputation: 24661

How to convert a regexp from Perl to Python

In Perl:

 if ($test =~ /^id\:(.*)$/) {
     print $1;
 }

Upvotes: 4

Views: 2268

Answers (4)

Tagar

Reputation: 14911

I wrote this Perl-to-Python regex converter when I had to rewrite a large number of Perl regexes as calls to Python's re module. It only covers basic substitutions (s/.../.../), but it may still be helpful:

import re

def convert_re(perl_re, string_var='column_name',
               test_value=None, expected_test_result=None):
    '''
        Returns a Perl substitution regex (s/.../.../) converted to an
        equivalent call to Python's `re.sub`, as a string of Python code.
    '''

    match = re.match(r"(\w+)/(.+)/(.*)/(\w*)", perl_re)

    if not match:
        raise ValueError("Not a Perl regex? "+ perl_re)

    if not match.group(1)=='s':
        raise ValueError("This function is only for `s` Perl regexpes (substitutes), i.e s/a/b/")

    flags = match.group(4)

    if 'g' in flags:
        count=0     # all matches
        flags=flags.replace('g','') # remove g
    else:
        count=1     # one exact match only

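    # translate the remaining Perl modifiers into Python `re` flag constants
    # (only i, m, s and x are handled; anything else raises KeyError)
    flag_names = {'i': 're.I', 'm': 're.M', 's': 're.S', 'x': 're.X'}
    flags = '|'.join(flag_names[f] for f in flags)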

    # change any group references in the replacement like $2 to Python group references like \g<2>
    replacement=match.group(3)
    replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)

    python_code = "re.sub(r'{regexp}', r'{replacement}', {string}{count}{flags})".format(
                    regexp=match.group(2)
                ,   replacement=replacement
                ,   string=string_var
                ,   count=", count={}".format(count) if count else ''
                ,   flags=", flags={}".format(flags) if flags else ''
            )

    if test_value:
        print("Testing Perl regular expression {} with value '{}':".format(perl_re, test_value))
        print("(generated equivalent Python code: {} )".format(python_code))
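        # evaluate the generated code in its own namespace; this also works on
        # Python 3, where exec() cannot create local variables in a function
        namespace = {'re': re, string_var: test_value}
        test_result = eval(python_code, namespace)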
        assert test_result==expected_test_result, "produced={} expected={}".format(test_result, expected_test_result)
        print("Test OK.")

    return string_var+" = "+python_code

print(convert_re(r"s/^[ 0-9-]+//", test_value=' 2323 col', expected_test_result='col'))

print(convert_re(r"s/[+-]/_/g", test_value='a-few+words', expected_test_result='a_few_words'))
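For comparison, here is a hand-written sketch of what such a conversion boils down to for a substitution that uses capture groups. The sample pattern and input are my own illustrative assumptions, not output of the converter above; the only things it relies on are that Perl's $1 becomes \g<1> in the replacement, that a missing /g corresponds to count=1, and that /i corresponds to re.IGNORECASE.

```python
import re

# Perl:  $s =~ s/(\w+)@(\w+)/$2 at $1/;      (first match only)
# Perl:  $s =~ s/(\w+)@(\w+)/$2 at $1/gi;    (all matches, case-insensitive)

text = "bob@example, amy@test"  # illustrative sample input

# Without /g, Perl replaces only the first match, hence count=1.
first_only = re.sub(r"(\w+)@(\w+)", r"\g<2> at \g<1>", text, count=1)

# With /g every match is replaced; leaving count at its default of 0 means "all".
# The /i modifier maps to the re.IGNORECASE flag.
all_matches = re.sub(r"(\w+)@(\w+)", r"\g<2> at \g<1>", text, flags=re.IGNORECASE)

print(first_only)   # example at bob, amy@test
print(all_matches)  # example at bob, test at amy
```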

Upvotes: 0

Cameron

Reputation: 98776

In Python:

import re

test = 'id:foo'

match = re.search(r'^id:(.*)$', test)
if match:
    print(match.group(1))

In Python, regular expressions are available through the re module of the standard library.

The r before the string indicates that it is a raw string literal, meaning that backslashes are not treated specially (otherwise every backslash would need to be escaped with another backslash in order for a literal backslash to make its way into the regex string).
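A small sketch of the difference, in case it helps; the sample strings are just illustrative:

```python
import re

# Both literals denote the same regex text: a backslash, "d", "+".
raw = r"\d+"
escaped = "\\d+"
print(raw == escaped)  # True

# In a normal literal "\b" is a single backspace character; in a raw
# literal it stays as backslash + "b", which the regex engine reads
# as a word boundary.
print(len("\b"), len(r"\b"))  # 1 2

word = re.search(r"\bfoo\b", "a foo b")
print(word.group(0))  # foo
```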

I have used re.search here because it is the closest equivalent to Perl's =~ operator: it looks for a match anywhere in the string. There is another function, re.match, which only checks for a match starting at the beginning of the string (counter-intuitive to a Perl programmer's definition of "matching"). See the "search() vs. match()" section of the re documentation for full details of the differences between the two.
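To make the difference concrete, here is a minimal sketch with an unanchored pattern; the input string "user id:foo" is an assumption chosen so that the match does not start at position 0:

```python
import re

text = "user id:foo"

# re.match only tries at position 0, so it finds nothing here.
print(re.match(r"id:(.*)", text))  # None

# re.search scans forward through the string, like Perl's =~.
found = re.search(r"id:(.*)", text)
print(found.group(1))  # foo

# With a leading ^ (and no re.MULTILINE flag) both functions are anchored
# at the start of the string, so they agree on the pattern from the question.
anchored_search = re.search(r"^id:(.*)$", "id:foo")
anchored_match = re.match(r"^id:(.*)$", "id:foo")
print(anchored_search.group(1) == anchored_match.group(1))  # True
```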

Also note that there is no need to escape the : since it is not a special character in regular expressions.
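If you prefer to copy the Perl pattern verbatim, the escaped colon still works, since escaping a punctuation character is harmless in Python's re. A quick check:

```python
import re

test = "id:foo"

# Pattern copied from the Perl original, with the redundant backslash.
escaped = re.search(r"^id\:(.*)$", test)

# Same pattern without the escape.
plain = re.search(r"^id:(.*)$", test)

print(escaped.group(1), plain.group(1))  # foo foo
```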

Upvotes: 15

Novikov

Reputation: 4489

import re

match = re.match(r"^id:(.*)$", test)
if match:
    print(match.group(1))

Upvotes: 4

Machinarius

Reputation: 3731

Use a compiled regular expression object (a RegexObject), as described here: http://docs.python.org/library/re.html#regular-expression-objects
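A minimal sketch of that approach, assuming the same pattern as in the question. re.compile returns the pattern object, which can then be reused for many inputs; the names ID_RE and extract_id are made up for this example.

```python
import re

# Compile once, reuse many times.
ID_RE = re.compile(r"^id:(.*)$")

def extract_id(line):
    """Return the text after 'id:' or None if the line does not match."""
    m = ID_RE.search(line)
    return m.group(1) if m else None

for line in ["id:foo", "name:bar", "id:42"]:
    print(line, "->", extract_id(line))
```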

Upvotes: 0
