Bastian

Reputation: 918

Regular expression: alphanumerics without pure numerics

I need to extract the names of variables from a function string.

A variable name matches [a-zA-Z0-9]+, but it must not be a numeric literal such as 1, 3.5, 1e4, 1e5, and so on.

Is there a smart way of doing this?

Here's a minimal (non-)working example in Python:

import re
pattern = r"[a-zA-Z0-9.]+"
function_string = "(A+B1)**2.5"
re.findall(pattern, function_string)

The above code returns:

A, B1 and 2.5.

My desired output is

A and B1.

And here's a nice way of testing the regular expressions: https://regex101.com/r/fv0DfR/1

Upvotes: 0

Views: 64

Answers (2)

Gurmanjot Singh

Reputation: 10360

Try this Regex:

\b(?!\d)[a-zA-Z0-9]+

Explanation:

  • \b - matches a word boundary
  • (?!\d) - negative lookahead that ensures the next character is not a digit, so a variable name cannot start with a digit. This also excludes tokens like 1e3
  • [a-zA-Z0-9]+ - matches 1+ letters or digits
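
As a quick check, here is a minimal sketch that runs this pattern against the string from the question. The second input string and the variable name `names` are illustrative only, not part of the original post.

```python
import re

# Word boundary, then a lookahead that rejects tokens starting with a digit.
pattern = r"\b(?!\d)[a-zA-Z0-9]+"

names = re.findall(pattern, "(A+B1)**2.5")
print(names)  # ['A', 'B1']

# Tokens that begin with a digit, such as 1e3 or 3.5, are skipped entirely.
print(re.findall(pattern, "x1 + 1e3 * 3.5"))  # ['x1']
```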

If you also want to match variable names that start with a digit but contain letters, you can use \b(?!\d+(?:[eE]\d+)?\b)[a-zA-Z0-9]+
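
A small sketch of how this second pattern behaves. The test string, including the digit-leading name `2x`, is made up for illustration.

```python
import re

# Reject only tokens that are entirely digits, optionally followed by an
# unsigned exponent (e.g. 7, 1e4); digit-leading names like 2x still match.
pattern = r"\b(?!\d+(?:[eE]\d+)?\b)[a-zA-Z0-9]+"

result = re.findall(pattern, "(A+B1)**2.5 + 2x*1e4 + 7")
print(result)  # ['A', 'B1', '2x']
```

Note that the lookahead only covers unsigned exponents: for an input such as 1e-4 the pattern would still pick up 1e, so signed exponents would need extra handling.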

Upvotes: 0

DirtyBit

Reputation: 16772

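import re

# A name starts with a letter or underscore, followed by up to 31 more
# letters, digits, or underscores; a digit can never start a match.
pattern = r'[a-zA-Z_][a-zA-Z0-9_]{0,31}'
function_string = "(A+B1)2.5"

print(re.findall(pattern, function_string))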

OUTPUT:

['A', 'B1']
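
One caveat worth checking: this pattern has no word boundary, so it can begin matching in the middle of a token such as 1e4. The sketch below compares it with a hypothetical variant that adds \b and drops the length limit. The variant and the names `unanchored` and `anchored` are assumptions for illustration, not part of the answer.

```python
import re

text = "(A+B1)**2.5 + 1e4"

# Pattern from the answer: no word boundary, so it can start mid-token.
unanchored = r"[a-zA-Z_][a-zA-Z0-9_]{0,31}"
print(re.findall(unanchored, text))  # ['A', 'B1', 'e4']

# Hypothetical variant: \b forces the match to begin at the start of a token.
anchored = r"\b[a-zA-Z_][a-zA-Z0-9_]*"
print(re.findall(anchored, text))  # ['A', 'B1']
```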

Upvotes: 1
