Reputation: 918
I need to extract the names of variables from a function string.
A variable name matches [a-zA-Z0-9]+, but it must not be a real number written like 1, 3.5, 1e4, 1e5, ...
Is there a smart way of doing this?
Here's a M(not)WE in Python:
import re
pattern = r"[a-zA-Z0-9.]+"
function_string = "(A+B1)**2.5"
re.findall(pattern, function_string)
The above code returns:
A, B1 and 2.5.
My desired output is
A and B1.
And here's a nice way of testing the regular expressions: https://regex101.com/r/fv0DfR/1
Upvotes: 0
Views: 64
Reputation: 10360
Try this Regex:
\b(?!\d)[a-zA-Z0-9]+
Explanation:
\b - matches a word boundary
(?!\d) - negative lookahead to make sure that the next character is not a digit. This ensures that the variable name does not start with a digit. It also excludes words like 1e3
[a-zA-Z0-9]+ - matches 1+ letters or digits
If you also want variables that start with a digit and are alphanumeric, you can use \b(?!\d+(?:[eE]\d+)?\b)[a-zA-Z0-9]+
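Both patterns can be checked quickly in Python. This is only a minimal sketch: the helper name find_variables, the constant names, and the sample strings other than the one from the question are assumptions made for illustration.

```python
import re

# Pattern 1: a name must not start with a digit.
NO_LEADING_DIGIT = re.compile(r"\b(?!\d)[a-zA-Z0-9]+")

# Pattern 2: a name may start with a digit, but tokens that are purely
# numeric (1, 25) or of the form 1e4 are rejected by the negative lookahead.
ALLOW_LEADING_DIGIT = re.compile(r"\b(?!\d+(?:[eE]\d+)?\b)[a-zA-Z0-9]+")


def find_variables(pattern, function_string):
    """Return every non-overlapping match of pattern in function_string."""
    return pattern.findall(function_string)


print(find_variables(NO_LEADING_DIGIT, "(A+B1)**2.5"))   # ['A', 'B1']
print(find_variables(NO_LEADING_DIGIT, "A*1e4+B1"))      # ['A', 'B1']
print(find_variables(ALLOW_LEADING_DIGIT, "2x+1e4*A"))   # ['2x', 'A']
```

The second pattern keeps 2x because the lookahead only rejects a token when a word boundary follows the digits (or the digits plus an exponent).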
Upvotes: 0
Reputation: 16772
import re
pattern = r'[a-zA-Z_][a-zA-Z0-9_]{0,31}'
function_string = "(A+B1)**2.5"
print(re.findall(pattern, function_string))
OUTPUT:
['A', 'B1']
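One caveat worth checking: because this pattern has no word boundary, it can start matching in the middle of a token, so the e4 in 1e4 would be reported as a name. The sketch below compares it with an assumed variant that adds a leading \b; the variable names unanchored and anchored and the test string are hypothetical and not part of the answer.

```python
import re

# The pattern from this answer, as written.
unanchored = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]{0,31}")

# Assumed variant: the same pattern with a leading word boundary, so a
# match can no longer begin inside a token such as 1e4.
anchored = re.compile(r"\b[a-zA-Z_][a-zA-Z0-9_]{0,31}")

function_string = "(A+B1)**1e4"

print(unanchored.findall(function_string))  # ['A', 'B1', 'e4']
print(anchored.findall(function_string))    # ['A', 'B1']
```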
Upvotes: 1