ccpizza

Reputation: 31841

Python: Parse JSON string with embedded JavaScript constants/variables

How can I parse a JSON string that has embedded JavaScript constants or variables?

For example, how would I parse a string like this one?

  {
    "menu": {
      "id": "file",
      "value": "File",
      "popup": {
        "menuitem": [
          {
            "value": "New",
            "onclick": Handlers.NEW
          },
          {
            "value": "Open",
            "onclick": Handlers.OPEN
          },
          {
            "value": "Custom",
            "onclick": "function(){doSomething(Handlers.OPEN);}"

          }
        ]
      }
    }
  }

Validators of course reject this as invalid JSON, yet it is a perfectly valid object literal when evaluated in a JavaScript context where the corresponding objects are defined.

The first thing that comes to mind is to pre-process the string before feeding it to the JSON parser. That is tricky, though, because the same tokens can also occur inside existing string values (as in the sample above), so a regex would have to reliably tell whether e.g. Handlers.NEW appears as a bare value or inside a quoted string.
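
To make the pitfall concrete, here is a minimal sketch of the naive substitution. The sample string and the variable names are made up for illustration; only re and json from the standard library are used.

```python
import json
import re

# Hypothetical sample: one bare reference and one inside a string value.
raw = '{"a": Handlers.NEW, "b": "doSomething(Handlers.OPEN);"}'

# Naive approach: wrap every Handlers.X occurrence in double quotes.
quoted = re.sub(r'Handlers\.\w+', lambda m: '"%s"' % m.group(0), raw)

try:
    json.loads(quoted)
    naive_ok = True
except json.JSONDecodeError:
    # The match inside the existing string was quoted too, breaking the JSON.
    naive_ok = False

print(quoted)
print(naive_ok)
```

The second occurrence sits inside a string value, so quoting it terminates that string early and the result no longer parses.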

Is there a clean way to handle this use case without having to do manual regex replacements?

Upvotes: 1

Views: 602

Answers (2)

L3viathan

Reputation: 27331

import ast

s = """{
"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {
        "value": "New",
        "onclick": Handlers.NEW
      },
      {
        "value": "Open",
        "onclick": Handlers.OPEN
      },
      {
        "value": "Custom",
        "onclick": "function(){doSomething(Handlers.OPEN);}"

      }
    ]
  }
}
}"""

def evaluate(obj):
    # Recursively convert an AST node into plain Python data.
    if isinstance(obj, ast.Module):
        return evaluate(obj.body[0])
    elif isinstance(obj, ast.Expr):
        return evaluate(obj.value)
    elif isinstance(obj, ast.Dict):
        return {evaluate(key): evaluate(value) for key, value in zip(obj.keys, obj.values)}
    elif isinstance(obj, ast.List):
        return [evaluate(element) for element in obj.elts]
    elif isinstance(obj, ast.Constant):
        # Strings and numbers; ast.Str and ast.Num are deprecated since Python 3.8.
        return obj.value
    elif isinstance(obj, ast.Attribute):
        # Handlers.NEW becomes the string "Handlers.NEW".
        return evaluate(obj.value) + "." + obj.attr
    elif isinstance(obj, ast.Name):
        return obj.id
    else:
        # Fail loudly instead of silently returning None.
        raise ValueError("Unsupported node type: %s" % type(obj).__name__)

x = evaluate(ast.parse(s))
print(x)

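This parses the string as Python source into an abstract syntax tree and then recursively builds a Python object out of it, converting names and attribute accesses such as Handlers.NEW into strings. It relies on the input also being syntactically valid Python, and JSON literals such as true, false and null are parsed as names, so they come back as strings too.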
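
As a possible extension of the same idea, the sketch below maps the JSON literals true, false and null to their Python values and dumps the result as valid JSON. The names to_python, JSON_NAMES and sample are hypothetical, and it assumes Python 3.8 or later, where literals are parsed as ast.Constant.

```python
import ast
import json

# JSON literals that Python parses as plain names.
JSON_NAMES = {"true": True, "false": False, "null": None}

def to_python(node):
    """Convert an AST node to plain data; JS references become strings."""
    if isinstance(node, ast.Expression):
        return to_python(node.body)
    if isinstance(node, ast.Dict):
        return {to_python(k): to_python(v) for k, v in zip(node.keys, node.values)}
    if isinstance(node, ast.List):
        return [to_python(e) for e in node.elts]
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        # Unknown names are kept as strings, JSON literals are translated.
        return JSON_NAMES.get(node.id, node.id)
    if isinstance(node, ast.Attribute):
        # Build the dotted path, e.g. "Handlers.NEW".
        base = node.value.id if isinstance(node.value, ast.Name) else to_python(node.value)
        return "%s.%s" % (base, node.attr)
    raise ValueError("unsupported node: " + type(node).__name__)

sample = '{"onclick": Handlers.NEW, "enabled": true, "items": [1, null]}'
result = to_python(ast.parse(sample, mode="eval"))
print(json.dumps(result))
```

Parsing with mode="eval" yields a single expression, so no Module or Expr unwrapping is needed. Anything this sketch does not cover, such as negative numbers, raises ValueError.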

Upvotes: 1

Daniel

Reputation: 42788

You can use the ast module:

import ast

data = """{
    "menu": {
      "id": "file",
      "value": "File",
      "popup": {
        "menuitem": [
          {
            "value": "New",
            "onclick": Handlers.NEW
          },
          {
            "value": "Open",
            "onclick": Handlers.OPEN
          },
          {
            "value": "Custom",
            "onclick": "function(){doSomething(Handlers.OPEN);}"

          }
        ]
      }
    }
  }"""

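def transform(item):
    # Convert dict, list and constant nodes; leave everything else untouched.
    if isinstance(item, ast.Dict):
        return dict(zip(map(transform, item.keys), map(transform, item.values)))
    elif isinstance(item, ast.List):
        return [transform(element) for element in item.elts]
    elif isinstance(item, ast.Constant):
        # ast.Str is deprecated since Python 3.8 in favor of ast.Constant.
        return item.value
    else:
        # e.g. Handlers.NEW stays an ast.Attribute node for the caller to handle.
        return item

print(transform(ast.parse(data).body[0].value))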
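
The transform function above leaves nodes it does not recognize, such as Handlers.NEW, as raw AST objects. One way to turn those into readable text afterwards is ast.unparse, which requires Python 3.9 or later. The following is only a sketch; render and the partial dictionary are hypothetical stand-ins.

```python
import ast

def render(value):
    """Replace leftover AST nodes with their source text (Python 3.9+)."""
    if isinstance(value, ast.AST):
        return ast.unparse(value)
    if isinstance(value, dict):
        return {key: render(item) for key, item in value.items()}
    if isinstance(value, list):
        return [render(item) for item in value]
    return value

# Hypothetical stand-in for transform() output: one raw AST node left in place.
node = ast.parse("Handlers.NEW", mode="eval").body
partial = {"value": "New", "onclick": node}
rendered = render(partial)
print(rendered)
```

Keeping the nodes until this last step means the caller can still distinguish a JavaScript reference from an ordinary string before deciding how to represent it.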

Upvotes: 2
