Reputation: 509
I'm parsing an XML file in which I get basic expressions (like id*10+2). What I am trying to do is evaluate each expression to get its value. To do so, I use the eval() method, which works very well.
The only problem is that the numbers are in fact hexadecimal. The eval() method would work if every hex number were prefixed with '0x', but I could not find a way to do that, nor could I find a similar question here. How can this be done in a clean way?
Upvotes: 3
Views: 2578
Reputation: 185988
One option is to use the parser module:
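import parser, token, re  # note: the parser module was removed in Python 3.10

def hexify(ast):
    # Node ids and token strings pass through unchanged; only lists are inspected.
    if not isinstance(ast, list):
        return ast
    if ast[0] in (token.NAME, token.NUMBER) and re.match('[0-9a-fA-F]+$', ast[1]):
        return [token.NUMBER, '0x' + ast[1]]
    # Build a real list: sequence2st cannot take a lazy map object on Python 3.
    return [hexify(node) for node in ast]

def hexified_eval(expr, *args):
    ast = parser.sequence2st(hexify(parser.expr(expr).tolist()))
    return eval(ast.compile(), *args)
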
>>> hexified_eval('id*10 + BABE', {'id':0xcafe})
879262
This is somewhat cleaner than a regex solution in that it only attempts to replace tokens that have been positively identified as either names or numbers (and look like hex numbers). It also correctly handles more general Python expressions such as id*10 + len('BABE') (it won't replace 'BABE' with '0xBABE').
OTOH, the regex solution is simpler and might cover all the cases you need to deal with anyway.
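On Python 3.10 and later the parser module no longer exists, so here is a rough sketch of the same idea using the ast module instead. This is a swapped-in technique, not the code above: the names Hexify and variables are made up for illustration, and it assumes Python 3.8+ (for ast.get_source_segment). It only works on expressions that Python itself can parse, so a bare 1F is still a syntax error before any rewriting happens.

```python
import ast
import re

HEX_RE = re.compile(r'[0-9a-fA-F]+')


class Hexify(ast.NodeTransformer):
    """Rewrite hex-looking names and numeric literals into integer constants."""

    def __init__(self, source, variables):
        self.source = source
        self.variables = variables

    def visit_Name(self, node):
        # A name such as BABE becomes 0xBABE unless it is a known variable.
        if node.id not in self.variables and HEX_RE.fullmatch(node.id):
            return ast.copy_location(ast.Constant(int(node.id, 16)), node)
        return node

    def visit_Constant(self, node):
        # Re-read the literal's original text as hex (10 -> 0x10, 1e5 -> 0x1e5).
        # String constants such as 'BABE' are left alone.
        text = ast.get_source_segment(self.source, node)
        if isinstance(node.value, (int, float)) and text and HEX_RE.fullmatch(text):
            return ast.copy_location(ast.Constant(int(text, 16)), node)
        return node


def hexified_eval(expr, variables=None):
    variables = dict(variables or {})  # copy: eval() adds __builtins__ to its globals
    tree = ast.parse(expr, mode='eval')
    tree = ast.fix_missing_locations(Hexify(expr, variables).visit(tree))
    return eval(compile(tree, '<expr>', 'eval'), variables)
```

With this sketch, hexified_eval('id*10 + BABE', {'id': 0xcafe}) treats 10 as 0x10, and the string inside len('BABE') is not touched. It still ends in eval, so the usual warnings about untrusted input apply.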
Upvotes: 0
Reputation: 76107
Be careful with eval! Never use it on untrusted input.
If it's just simple arithmetic, I'd use a custom parser (there are tons of examples out in the wild)... And using parser generators (flex/bison, antlr, etc.) is a skill that is useful and easily forgotten, so it could be a good chance to refresh or learn it.
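To give an idea of how small such a custom parser can be, here is a minimal hand-written recursive-descent sketch (no parser generator). Everything in it is an assumption for illustration: the function name evaluate, the supported operators (+, -, *, parentheses, unary minus), and the rule that any word that is not a known variable is read as a hex number.

```python
import re


def evaluate(expr, variables):
    """Evaluate +, -, * and parentheses; bare words are variables or hex numbers."""
    # Words (letters/digits/underscore) or single non-space symbols.
    tokens = re.findall(r'\w+|[^\w\s]', expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def atom():
        tok = advance()
        if tok is None:
            raise ValueError('unexpected end of expression')
        if tok == '(':
            value = expression()
            if advance() != ')':
                raise ValueError("expected ')'")
            return value
        if tok == '-':
            return -atom()
        if tok in variables:
            return variables[tok]
        try:
            return int(tok, 16)  # everything else must be a hex number
        except ValueError:
            raise ValueError(f'unknown token {tok!r}') from None

    def term():
        value = atom()
        while peek() == '*':
            advance()
            value *= atom()
        return value

    def expression():
        value = term()
        while peek() in ('+', '-'):
            if advance() == '+':
                value += term()
            else:
                value -= term()
        return value

    result = expression()
    if peek() is not None:
        raise ValueError(f'unexpected token {peek()!r}')
    return result
```

For example, evaluate('id*10+2', {'id': 3}) computes 3*0x10 + 2. Nothing is ever passed to eval, so input like __import__('os') is simply rejected with a ValueError.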
Upvotes: 0
Reputation: 10643
If you can split the expression into individual numbers, then I would suggest using the int function:
>>> int("CAFE", 16)
51966
Upvotes: 0
Reputation: 32310
Use the re module.
>>> import re
>>> re.sub(r'([\dA-F]+)', r'0x\1', 'id*A+2')
'id*0xA+0x2'
>>> eval(re.sub(r'([\dA-F]+)', r'0x\1', 'CAFE+BABE'))
99772
Be warned, though: if the input to eval is invalid, it won't work. There are also many risks in using eval.
If your hex numbers have lowercase letters, then you could use this:
>>> re.sub(r'(?<!i)([\da-fA-F]+)', r'0x\1', 'id*a+b')
'id*0xa+0xb'
This uses a negative lookbehind assertion to ensure that the letter i does not come immediately before the section it is trying to convert (preventing 'id' from turning into 'i0xd'). Replace i with I if the variable is Id.
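The lookbehind only protects names that contain the letter i right before a hex digit; a name like index would still be mangled. A possible variation, sketched here under the assumption that hex numbers always appear as whole words, is to match on word boundaries (\b) and skip any word listed as a variable. The names add_hex_prefix and variable_names are made up for this example.

```python
import re

# A hex number is a whole word made only of hex digits.
HEX_WORD = re.compile(r'\b[0-9A-Fa-f]+\b')


def add_hex_prefix(expr, variable_names=()):
    """Prefix whole-word hex numbers with 0x, leaving listed variables alone."""
    def prefix(match):
        word = match.group()
        return word if word in variable_names else '0x' + word
    return HEX_WORD.sub(prefix, expr)
```

Here add_hex_prefix('index*2') leaves index intact, and a variable whose name happens to consist only of hex digits (say add) can be protected by passing variable_names={'add'}. The result is still meant to be fed to eval, so the same risks apply.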
Upvotes: 4