Devavrata

Reputation: 1795

How to read a Python dictionary string in Java

I want to read a Python dictionary string using Java. Example string:

{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}

This is not valid JSON. I want to convert it into proper JSON using Java code.

Upvotes: 2

Views: 2191

Answers (3)

user7610

Reputation: 28879

Hacky solution

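Do a string replace: ' -> ", True -> true, False -> false, and None -> null (and strip the u prefix from strings such as u'Shivam'), then parse the result as JSON. If you are lucky (and are willing to bet on remaining lucky in the future), this can actually work in practice.

For the same trick applied in the opposite direction (JSON-style text to Python literals), see rh-messaging/cli-rhea/blob/main/lib/formatter.js#L240-L249 (in JavaScript):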

static replaceWithPythonType(strMessage) {
    return strMessage.replace(/null/g, 'None').replace(/true/g, 'True').replace(/false/g, 'False').replace(/undefined/g, 'None').replace(/\{\}/g, 'None');
}
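
A minimal Java sketch of the forward direction (Python literal to JSON text) might look like the following. The class and method names are made up for illustration, and stripping the u prefix is an assumption added so that the question's example goes through. It inherits every weakness of the approach: an apostrophe or the word True inside a string value will be mangled.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating the replace-based hack; not a real parser.
final class HackyPythonToJson {
    // A "u" directly before a quote, unless it is the tail of a word or follows a quote.
    private static final Pattern UNICODE_PREFIX = Pattern.compile("(?<![\\w\"])u(?=\")");

    static String convert(String pythonLiteral) {
        // Swap quote characters; this breaks on apostrophes inside values.
        String s = pythonLiteral.replace('\'', '"');
        // Assumption: drop Python 2 unicode prefixes, e.g. u"Shivam" -> "Shivam".
        s = UNICODE_PREFIX.matcher(s).replaceAll("");
        // Word boundaries avoid touching identifiers such as "TrueNorth",
        // but a string value that contains the word True is still rewritten.
        s = s.replaceAll("\\bTrue\\b", "true");
        s = s.replaceAll("\\bFalse\\b", "false");
        s = s.replaceAll("\\bNone\\b", "null");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(convert("{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"));
    }
}
```

The output still has to be fed to a real JSON parser, which is where malformed results would surface.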

Skylark solution

Skylark (since renamed Starlark) is a subset (data-definition) language based on Python. There are parsers in Go, Java, Rust, C, and Lua listed on the project's page. The problem is that the Java artifacts aren't published anywhere, as discussed in Q: How do I include a Skylark configuration parser in my application?

Graal Python

Possibly this, https://github.com/oracle/graalpython/issues/96#issuecomment-1662566214

DIY Parsers

I was not able to find a parser specific to the Python literal notation. The ANTLR samples contain a Python grammar that could plausibly be cut down to work for you: https://github.com/antlr/grammars-v4/tree/master/python/python3
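
For a sense of scale, here is a rough hand-written sketch of such a cut-down parser in plain Java, with no ANTLR involved. It is an illustration under stated assumptions, not a complete implementation: the class name PyLiteralToJson is made up, and it only accepts dicts with string keys, lists, quoted strings (with an optional u prefix), numbers that are also valid JSON numbers, and True/False/None. Tuples, sets, bytes, triple-quoted strings and Python-only escapes such as \x41 are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical recursive-descent converter for a small subset of Python literals.
final class PyLiteralToJson {
    // Only number spellings that are valid in both Python and JSON.
    private static final Pattern NUMBER =
            Pattern.compile("-?(?:0|[1-9]\\d*)(?:\\.\\d+)?(?:[eE][+-]?\\d+)?");
    private final String src;
    private int pos = 0;

    private PyLiteralToJson(String src) { this.src = src; }

    static String convert(String pythonLiteral) {
        PyLiteralToJson p = new PyLiteralToJson(pythonLiteral);
        StringBuilder out = new StringBuilder();
        p.value(out);
        p.skipSpaces();
        if (p.pos != p.src.length()) throw p.error("trailing input");
        return out.toString();
    }

    private void value(StringBuilder out) {
        skipSpaces();
        char c = peek();
        if (c == '{') container(out, '{', '}', true);
        else if (c == '[') container(out, '[', ']', false);
        else if (startsString()) string(out);
        else if (keyword("True")) out.append("true");
        else if (keyword("False")) out.append("false");
        else if (keyword("None")) out.append("null");
        else number(out);
    }

    // Parses "{k: v, ...}" when isDict is true, otherwise "[v, ...]".
    private void container(StringBuilder out, char open, char close, boolean isDict) {
        pos++; // consume the opening bracket
        out.append(open);
        skipSpaces();
        boolean first = true;
        while (peek() != close) {
            if (!first) out.append(", ");
            first = false;
            if (isDict) {
                string(out); // JSON only allows string keys
                skipSpaces();
                if (peek() != ':') throw error("expected ':'");
                pos++;
                out.append(": ");
            }
            value(out);
            skipSpaces();
            if (peek() == ',') { pos++; skipSpaces(); } // separator or Python's trailing comma
            else if (peek() != close) throw error("expected ',' or '" + close + "'");
        }
        pos++; // consume the closing bracket
        out.append(close);
    }

    private void string(StringBuilder out) {
        if (peek() == 'u' || peek() == 'U') pos++; // Python 2 unicode prefix
        char quote = peek();
        if (quote != '\'' && quote != '"') throw error("expected a string");
        pos++;
        out.append('"');
        while (peek() != quote) {
            if (pos >= src.length()) throw error("unterminated string");
            char c = src.charAt(pos++);
            if (c == '\\') {
                if (pos >= src.length()) throw error("unterminated string");
                char e = src.charAt(pos++);
                if (e == '\'') out.append('\''); // \' is not a JSON escape
                else out.append('\\').append(e); // assumes a JSON-compatible escape
            } else if (c == '"') {
                out.append("\\\""); // a bare " inside '...' must be escaped in JSON
            } else {
                out.append(c);
            }
        }
        pos++; // consume the closing quote
        out.append('"');
    }

    private void number(StringBuilder out) {
        Matcher m = NUMBER.matcher(src).region(pos, src.length());
        if (!m.lookingAt()) throw error("unexpected token");
        out.append(m.group());
        pos = m.end();
    }

    private boolean startsString() {
        char c = peek();
        if (c == '\'' || c == '"') return true;
        char next = pos + 1 < src.length() ? src.charAt(pos + 1) : '\0';
        return (c == 'u' || c == 'U') && (next == '\'' || next == '"');
    }

    private boolean keyword(String word) {
        if (!src.startsWith(word, pos)) return false;
        pos += word.length();
        return true;
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at index " + pos);
    }
}
```

Calling PyLiteralToJson.convert on the question's example string should yield JSON text with double-quoted keys and a lowercase true. Unlike the replace-based hack, an apostrophe or the word True inside a string value is left alone, and input outside the supported subset fails with an exception instead of producing broken JSON.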

Upvotes: 0

zmo

Reputation: 24802

Well, the best way would be to pass it through a Python script that reads the data and outputs valid JSON:

>>> import ast, json
>>> json.dumps(ast.literal_eval("{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"))
'{"name": "Shivam", "otherInfo": [[0], [1]], "isMale": true}'

So you could create a script that only contains:

import sys, json, ast; print(json.dumps(ast.literal_eval(sys.argv[1])))

Then you can turn it into a Python one-liner like so:

python -c "import sys, ast, json ; print(json.dumps(ast.literal_eval(sys.argv[1])))" "{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"

You can run that from your shell, which means you can run it from within Java the same way:

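String PythonData = "{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}";

String[] cmd = {
    "python", "-c", "import sys, ast, json ; print(json.dumps(ast.literal_eval(sys.argv[1])))",
    PythonData
    };
// exec(String[]) passes each element as its own argument, so no shell quoting is involved
Process process = Runtime.getRuntime().exec(cmd);
// the JSON is printed on the process's standard output (readAllBytes needs Java 9+)
String json = new String(process.getInputStream().readAllBytes(), java.nio.charset.StandardCharsets.UTF_8).trim();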

and as output you'll have a proper JSON string.

This is the most reliable approach I can think of: because it uses Python's own parser, it safely handles any Python literal, and since ast.literal_eval only evaluates literals, it does not open a window for code injection.

But I wouldn't recommend using it, because you'd be spawning a Python process for each string you parse, which would be a performance killer.

As an improvement on top of that first approach, you could use Jython to run that Python code inside the JVM for somewhat better performance.

PythonInterpreter interpreter = new PythonInterpreter();
// statements (imports, assignments) need exec(); eval() only accepts expressions
interpreter.exec("import json, ast");
interpreter.exec("to_json = lambda d: json.dumps(ast.literal_eval(d))");
PyObject ToJson = interpreter.get("to_json");
PyObject result = ToJson.__call__(new PyString(PythonData));
String realResult = (String) result.__tojava__(String.class);

The above is untested (so it's likely to fail and spawn dragons 👹) and I'm pretty sure you can make it more elegant. It's loosely adapted from this answer. I'll leave it up to you as an exercise to see how you can include the Jython environment in your Java runtime ☺.


P.S.: Another solution would be to try to fix every pattern you can think of using one gigantic regex or several smaller ones. That might work on simpler cases, but I would advise against it: regex is the wrong tool for the job, because it isn't expressive enough and you'll never cover every case. It's only a good way to plant the seed of a bug that'll kill you at some point in the future.


P.S.2: Whenever you need to parse code from an external source, always make sure that the data is sanitized and safe. Never forget about Little Bobby Tables.

Upvotes: 5

GhostCat

Reputation: 140553

In addition to the other answer: it is straightforward to simply invoke that Python one-liner to "translate" a python-dict-string into a standard JSON string.

But starting a new process for each row in your database might quickly turn into a performance killer.

Thus there are two options that you should consider on top of that:

  • establish some small "python server" that keeps running; its only job is to do that translation for JVMs that can connect to it
  • you can look into Jython, meaning: simply enable your JVM to run Python code. In other words, instead of writing your own python-dict-string parser, you add "python powers" to your JVM and rely on existing components to do that translation for you.

Upvotes: 1
