Sangeeth Saravanaraj

Reputation: 16597

Python: Data validation using regular expression

I am trying to use a Python regular expression to validate the value of a variable.

The validation rules are as follows:

Currently I am using the following snippet of code to do the validation:

import re
data = "asdsaq2323-asds"
if re.compile("[a-zA-Z0-9*]+").match(data).group() == data:
    print "match"
else:
    print "no match"

I feel there should be a better way of doing the above. I am looking for something like the following:

validate_func(pattern, data)
# returns data if the data passes the validation rules
# returns None if the data does not pass the validation rules
# should not return only the part of the data that matches the validation rules

Does such a built-in function exist?

Upvotes: 3

Views: 5181

Answers (3)

Andrew Clark

Reputation: 208435

To make sure the entire string matches your pattern, use beginning and end of string anchors in your regex. For example:

regex = re.compile(r'\A[a-zA-Z0-9*]+\Z')
if regex.match(data):
    print "match"
else:
    print "no match"

Making this a function:

def validate_func(regex, data):
    return data if regex.match(data) else None

Example:

>>> regex = re.compile(r'\A[a-zA-Z0-9*]+\Z')
>>> validate_func(regex, 'asdsaq2323-asds')
>>> validate_func(regex, 'asdsaq2323asds')
'asdsaq2323asds'

As a side note, I prefer \A and \Z over ^ and $ for validation like this, because the meaning of ^ and $ can change depending on the flags used, and $ will also match just before a line break character at the end of the string.
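The trailing line break difference described above can be seen in a small sketch. The input string here is illustrative and not taken from the question, and print() is written in the Python 3 form.

```python
import re

dollar = re.compile(r'^[a-zA-Z0-9*]+$')
strict = re.compile(r'\A[a-zA-Z0-9*]+\Z')

value = 'abc123\n'  # illustrative input ending in a line break

# $ can match just before a final newline, so this input is accepted.
dollar_accepts = dollar.match(value) is not None

# \Z matches only at the very end of the string, so this input is rejected.
strict_accepts = strict.match(value) is not None

print(dollar_accepts)
print(strict_accepts)
```

If values may arrive with a trailing newline (for example, lines read from a file), the \A...\Z form is the one that refuses them.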

Upvotes: 3

ruakh

Reputation: 183251

In a regex, the metacharacters ^ and $ mean "start-of-string" and "end-of-string" (respectively); so, rather than seeing what matches, and comparing it to the whole string, you can simply require that the regex match the whole string to begin with:

import re
data = "asdsaq2323-asds"
if re.compile("^[a-zA-Z0-9*]+$").match(data):
    print "match"
else:
    print "no match"

In addition, since you're only using the regex once (you compile it and immediately use it), you can use the convenience function re.match to handle that as a single step:

import re
data = "asdsaq2323-asds"
if re.match("^[a-zA-Z0-9*]+$", data):
    print "match"
else:
    print "no match"
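Conversely, if the same pattern is used to check many values, compiling it once and reusing the compiled object is a reasonable choice. The following is a minimal sketch under that assumption; the names PATTERN and samples and the sample values are illustrative, and print() is written in the Python 3 form.

```python
import re

# Compile once, then reuse the compiled pattern for every value.
PATTERN = re.compile(r'^[a-zA-Z0-9*]+$')

# Illustrative inputs: the first contains a hyphen, which the
# character class does not allow.
samples = ['asdsaq2323-asds', 'asdsaq2323asds', 'a*b']

# Pair each input with whether the whole string matched.
results = [(s, PATTERN.match(s) is not None) for s in samples]

for s, ok in results:
    print(s, ok)
```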

Upvotes: 6

AlwaysBTryin

Reputation: 1964

I think you're looking for

re.match('^[a-zA-Z0-9*]+$',data) and data

The extra "and data" is just to return data, but I'm not sure why you need that. Checking the re.match result against None is enough to check whether the string is valid.
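For readers on Python 3.4 or later, re.fullmatch covers the whole-string requirement without anchors. The sketch below assumes that version; validate_func is the name the question asks for, not a standard library function.

```python
import re

def validate_func(pattern, data):
    """Return data if the entire string matches pattern, else None."""
    # re.fullmatch (Python 3.4+) succeeds only when the whole string
    # matches, so the pattern needs no ^/$ or \A/\Z anchors.
    return data if re.fullmatch(pattern, data) else None

print(validate_func(r'[a-zA-Z0-9*]+', 'asdsaq2323asds'))
print(validate_func(r'[a-zA-Z0-9*]+', 'asdsaq2323-asds'))
```

Because the function returns either the original string or None, a caller can test the result with "is not None", as suggested above.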

Upvotes: 2
