SKTLZ

Reputation: 183

Regex as if string is a variable

I'm working on a script that determines whether a string would be a valid variable name. It's very basic, but I can't seem to figure out how to use regular expressions.

So basically I want:

A-Z
a-z
0-9
no whitespace anywhere
no special char except _

Is that possible? This is what I tried:

re.match("[a-zA-Z0-9_,/S]*$", char_s):

Upvotes: 0

Views: 86

Answers (4)

Veedrac

Reputation: 60127

The correct methods:

Python 2

import re
import keyword
import tokenize

re.match(tokenize.Name+"$", char_s) and not keyword.iskeyword(char_s)

Python 3

import keyword

char_s.isidentifier() and not keyword.iskeyword(char_s)

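Note that the Python 2 method fails silently on Python 3: it raises no error, but tokenize.Name is a looser pattern there (\w+), so names that start with a digit are accepted.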


When you see this kind of question, the first thing you should ask is "how does Python do it?", because almost all of the time Python exposes a method for it to the user.
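As a minimal sketch, the Python 3 check above can be wrapped in a helper. The function name is_valid_variable is hypothetical and not part of the answer; it only combines the two calls already shown.

```python
import keyword

def is_valid_variable(name):
    """Return True if name is a legal, non-reserved Python 3 identifier."""
    # str.isidentifier() checks the shape; keyword.iskeyword() rejects reserved words.
    return name.isidentifier() and not keyword.iskeyword(name)
```

Be aware that str.isidentifier() also accepts non-ASCII letters, which is wider than the A-Z, a-z, 0-9 and _ set asked for in the question.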

Upvotes: 1

Emil Davtyan

Reputation: 14089

Well, on top of the regular expressions mentioned, you need to make sure it is not one of the reserved keywords:

and       del       from      not       while    
as        elif      global    or        with     
assert    else      if        pass      yield    
break     except    import    print              
class     exec      in        raise              
continue  finally   is        return             
def       for       lambda    try

So something like this:

reserved = ["and", "del", "from", "not", "while", "as", "elif", "global", "or", "with", "assert", "else", "if", "pass", "yield", "break", "except", "import", "print", "class", "exec", "in", "raise", "continue", "finally", "is", "return", "def", "for", "lambda", "try"]

import re

def is_valid(name):  # renamed from `keyword` so it does not shadow the keyword module
    return (name not in reserved and
            re.match(r"^(?!\d)\w+$", name) is not None)  # pattern from p.s.w.g's answer

Or, as @nofinator suggests, you can (and probably should) just use keyword.iskeyword().
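A sketch of that variant, assuming Python 3 and an ASCII-only pattern. The names IDENTIFIER and is_valid are illustrative, and fullmatch is used instead of a trailing $ so that a string ending in a newline is not accepted by accident.

```python
import keyword
import re

# ASCII-only identifier shape: a letter or underscore first,
# then any number of letters, digits, or underscores.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def is_valid(name):
    # fullmatch requires the whole string to fit the pattern.
    return IDENTIFIER.fullmatch(name) is not None and not keyword.iskeyword(name)
```

Because keyword.iskeyword() reads the keyword list of the running interpreter, there is no hand-maintained list to keep in sync between Python versions.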

Upvotes: 3

p.s.w.g

Reputation: 149000

A pattern like this should work:

^[a-zA-Z_][a-zA-Z0-9_]*$

Or more simply:

^(?!\d)\w+$

In both cases, the pattern matches a string consisting of one or more letters, digits, or underscores, as long as it doesn't start with a digit.

The (?!…) in the second pattern is a negative look-ahead assertion. It ensures the first character is not a digit. More information can be found in the manual.
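To check that the two patterns agree, here is a small sketch that runs both over a few sample strings. The sample list and variable names are made up for illustration, and the re.ASCII flag is an addition so that \w and \d stay limited to [a-zA-Z0-9_] and [0-9] on Python 3.

```python
import re

explicit = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")
lookahead = re.compile(r"^(?!\d)\w+$", re.ASCII)

samples = ["foo", "_bar9", "9lives", "two words", "dash-ed", ""]

# Map each sample to (explicit pattern matched?, look-ahead pattern matched?).
results = {s: (bool(explicit.match(s)), bool(lookahead.match(s))) for s in samples}

for sample, (a, b) in results.items():
    print(repr(sample), a, b)
```

Only "foo" and "_bar9" are accepted, and both patterns give the same verdict on every sample.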

Upvotes: 4

John Kugelman

Reputation: 361585

re.match(r"^[^\W\d]\w*$", char_s):

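The \w (word) character class is equivalent to [a-zA-Z0-9_] for ASCII matching (on Python 3, str patterns need the re.ASCII flag for this; otherwise \w also matches non-ASCII letters and digits). Identifiers cannot start with a digit, so match [^\W\d] for the first character and \w* for the rest.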
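The double negative in [^\W\d] is easier to see in isolation: "not a non-word character and not a digit" leaves only letters and the underscore. The sketch below assumes Python 3 with re.ASCII; the variable names and test characters are illustrative.

```python
import re

# Word characters minus digits: ASCII letters and underscore.
first_char = re.compile(r"[^\W\d]", re.ASCII)

# Keep only the characters that are allowed in first position.
accepted = [c for c in "aZ_9 -" if first_char.fullmatch(c)]

# The full pattern from the answer, with the same flag.
identifier = re.compile(r"^[^\W\d]\w*$", re.ASCII)
starts_with_letter = identifier.match("x1") is not None
starts_with_digit = identifier.match("1x") is not None
```

Here accepted holds the letters and the underscore, while the digit, the space, and the hyphen are filtered out.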

Upvotes: 1
