Thibault Martin

Reputation: 509

Appending '0x' before the hex numbers in a string

I'm parsing an XML file that contains basic expressions (like id*10+2). I want to evaluate each expression to get its actual value, and eval() works very well for that.

The only catch is that the numbers are in fact hexadecimal. eval() would work if every hex number were prefixed with '0x', but I could not find a way to do that, nor could I find a similar question here. How can it be done cleanly?

Upvotes: 3

Views: 2578

Answers (4)

Marcelo Cantos

Reputation: 185988

One option is to use the parser module (deprecated since Python 3.9 and removed in Python 3.10):

import parser, token, re

def hexify(ast):
    if not isinstance(ast, list):
        return ast
    if ast[0] in (token.NAME, token.NUMBER) and re.match('[0-9a-fA-F]+$', ast[1]):
        return [token.NUMBER, '0x' + ast[1]]
    return [hexify(node) for node in ast]  # a real list; map() is lazy on Python 3

def hexified_eval(expr, *args):
    ast = parser.sequence2st(hexify(parser.expr(expr).tolist()))
    return eval(ast.compile(), *args)

>>> hexified_eval('id*10 + BABE', {'id':0xcafe})
879262

This is somewhat cleaner than a regex solution in that it only attempts to replace tokens that have been positively identified as either names or numbers (and look like hex numbers). It also correctly handles more general Python expressions such as id*10 + len('BABE') (it won't replace 'BABE' with '0xBABE').

OTOH, the regex solution is simpler and might cover all the cases you need to deal with anyway.
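On interpreters where the parser module is no longer available, roughly the same idea can be sketched with the ast module. This is only a sketch under a few assumptions: Python 3.8 or later, input that contains no already-prefixed literals such as 0x10, and no hex numbers that start with a digit and contain letters (1F is not a valid Python token, so it fails to parse). The Hexify class name is made up for the example.

```python
import ast
import re

HEX_RE = re.compile(r"[0-9a-fA-F]+$")


class Hexify(ast.NodeTransformer):
    """Reinterpret hex-looking names and integer literals as hex values."""

    def visit_Name(self, node):
        # A bare word such as BABE parses as a name; turn it into a number.
        if HEX_RE.match(node.id):
            new = ast.Constant(value=int(node.id, 16))
            return ast.copy_location(new, node)
        return node

    def visit_Constant(self, node):
        # An integer literal such as 10 was parsed as decimal; reread its
        # digits as hex. Strings and floats are left untouched.
        if type(node.value) is int:
            new = ast.Constant(value=int(str(node.value), 16))
            return ast.copy_location(new, node)
        return node


def hexified_eval(expr, *args):
    tree = Hexify().visit(ast.parse(expr, mode="eval"))
    code = compile(ast.fix_missing_locations(tree), "<expr>", "eval")
    return eval(code, *args)
```

As with the version above, string literals such as 'BABE' are left alone, and a number like 10 is read as 0x10. It still ends in eval, so the usual warnings about untrusted input apply.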

Upvotes: 0

fortran

Reputation: 76107

Be careful with eval! Never use it on untrusted input.

If it's just simple arithmetic, I'd use a custom parser (there are tons of examples out in the wild)... And using parser generators (flex/bison, antlr, etc.) is a skill that is useful and easily forgotten, so it could be a good chance to refresh or learn it.
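For simple arithmetic, a custom evaluator does not need a parser generator. Here is a minimal sketch that leans on Python's own ast module (3.8 or later) and only accepts integers, known variables and a few operators. The eval_hex_expr name, the operator whitelist and the rule that a word naming a known variable is never treated as a number are all assumptions made for this example.

```python
import ast
import operator
import re

# Operators the evaluator accepts; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def eval_hex_expr(expr, variables):
    """Evaluate an arithmetic expression whose bare numbers are hexadecimal."""

    def prefix(match):
        # Leave known variable names alone, prefix every other hex word.
        word = match.group()
        return word if word in variables else "0x" + word

    tree = ast.parse(re.sub(r"\b[0-9A-Fa-f]+\b", prefix, expr), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression element: %r" % node)

    return walk(tree)
```

Function calls, attribute access and anything else outside the whitelist raise ValueError instead of being executed, which is the main point of avoiding eval here.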

Upvotes: 0

aisbaa

Reputation: 10643

If you can parse the expression into individual numbers, then I would suggest using the int function:

>>> int("CAFE", 16)
51966
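One rough way to get at the individual numbers is to split the expression into words and operator characters, then try int(..., 16) on each piece. The parse_hex_tokens helper below is hypothetical, and it assumes that any word that is valid hex should be treated as a number.

```python
import re


def parse_hex_tokens(expr):
    """Split expr into tokens, converting hex words to integers."""
    tokens = []
    # \w+ grabs whole words; [^\w\s] grabs single operator characters.
    for tok in re.findall(r"\w+|[^\w\s]", expr):
        try:
            tokens.append(int(tok, 16))
        except ValueError:
            # Not a hex number: keep names and operators as strings.
            tokens.append(tok)
    return tokens
```

For example, parse_hex_tokens('id*10+CAFE') gives ['id', '*', 16, '+', 51966].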

Upvotes: 0

Volatility

Reputation: 32310

Use the re module.

>>> import re
>>> re.sub(r'([\dA-F]+)', r'0x\1', 'id*A+2')
'id*0xA+0x2'
>>> eval(re.sub(r'([\dA-F]+)', r'0x\1', 'CAFE+BABE'))
99772

Be warned, though: eval raises an exception on invalid input. There are also many risks in using eval.

If your hex numbers have lowercase letters, then you could use this:

>>> re.sub(r'(?<!i)([\da-fA-F]+)', r'0x\1', 'id*a+b')
'id*0xa+0xb'

This uses a negative lookbehind assertion to ensure that the letter i does not come directly before the section it is trying to convert (preventing 'id' from turning into 'i0xd'). Replace i with I if the variable is Id.
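The lookbehind only protects a d that directly follows an i. A variant that may generalise better is to match whole words with \b, so that hex letters inside any longer name are left alone. The add_hex_prefix helper is made up for this sketch, and it still assumes that no variable name consists solely of hex digits (a name like add or bad would be converted).

```python
import re

# A run of hex digits with a word boundary on both sides.
HEX_WORD = re.compile(r"\b[0-9A-Fa-f]+\b")


def add_hex_prefix(expr):
    """Prefix every standalone hex word in expr with 0x."""
    return HEX_WORD.sub(lambda m: "0x" + m.group(), expr)
```

With this, add_hex_prefix('id*a+b') gives 'id*0xa+0xb', and a name such as idea in 'idea*1f' is left untouched, giving 'idea*0x1f'.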

Upvotes: 4
