Reputation: 1795
I want to read a Python dictionary string using Java. Example string:
{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}
This is not valid JSON. I want to convert it into proper JSON using Java code.
Upvotes: 2
Views: 2191
Reputation: 28879
Do a string replace of ' -> ", True -> true, False -> false, and None -> null, then parse the result as JSON. If you are lucky (and are willing to bet on remaining lucky in the future), this can actually work in practice.
See rh-messaging/cli-rhea/blob/main/lib/formatter.js#L240-L249 (in JavaScript), which applies the same trick in the opposite direction, from JSON to Python:
static replaceWithPythonType(strMessage) {
  return strMessage.replace(/null/g, 'None').replace(/true/g, 'True').replace(/false/g, 'False').replace(/undefined/g, 'None').replace(/\{\}/g, 'None');
}
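For the Java side, the replacements described above could look roughly like the sketch below. The class and method names are made up for illustration, and the extra step that strips the Python 2 `u'...'` prefix is my own addition, needed for the example string in the question. It will break as soon as a string value contains a quote character or one of the words True, False or None.

```java
// Hypothetical helper, not taken from any library.
class NaivePythonToJson {
    // Purely textual conversion: no real parsing happens here.
    static String toJson(String pythonLiteral) {
        return pythonLiteral
                .replaceAll("\\bu'", "'")            // u'Shivam' -> 'Shivam'
                .replace('\'', '"')                  // ' -> "
                .replaceAll("\\bTrue\\b", "true")
                .replaceAll("\\bFalse\\b", "false")
                .replaceAll("\\bNone\\b", "null");
    }

    public static void main(String[] args) {
        // Prints {"name": "Shivam", "otherInfo": [[0], [1]], "isMale": true}
        System.out.println(toJson("{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"));
    }
}
```

The result still has to be handed to whatever JSON parser you already use; that is where malformed output would show up.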
Skylark is a subset (data-definition) language based on Python. There are parsers in Go, Java, Rust, C, and Lua listed on the project's page. The problem is that the Java artifacts aren't published anywhere, as discussed in Q: How do I include a Skylark configuration parser in my application?
Possibly this: https://github.com/oracle/graalpython/issues/96#issuecomment-1662566214
I was not able to find a parser specific to the Python literal notation. The ANTLR samples contain a Python grammar that could plausibly be cut down to work for you: https://github.com/antlr/grammars-v4/tree/master/python/python3
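If the data only ever contains dicts with string keys, lists, strings, numbers, True, False and None, a small hand-written recursive-descent converter may be enough. The following is a minimal sketch under exactly those assumptions, with hypothetical names throughout. It does not handle tuples, sets, byte strings, triple-quoted strings, hex or complex numbers, and it passes most backslash escapes through unchanged, assuming they are already JSON-compatible.

```java
import java.math.BigDecimal;

// Hypothetical helper: converts a small subset of Python literal syntax to JSON text.
class PyLiteralToJson {
    private final String src;
    private final StringBuilder out = new StringBuilder();
    private int pos = 0;

    private PyLiteralToJson(String src) { this.src = src; }

    static String convert(String pythonLiteral) {
        PyLiteralToJson p = new PyLiteralToJson(pythonLiteral);
        p.value();
        p.skipSpaces();
        if (p.pos != p.src.length()) throw p.error("unexpected trailing input");
        return p.out.toString();
    }

    private void value() {
        skipSpaces();
        char c = peek();
        if (c == '{') container('{', '}', true);
        else if (c == '[') container('[', ']', false);
        else if (c == '\'' || c == '"' || c == 'u') string();
        else if (!keyword("True", "true") && !keyword("False", "false")
                && !keyword("None", "null")) number();
    }

    private void container(char open, char close, boolean isDict) {
        out.append(open);
        pos++;                                   // consume the opening bracket
        skipSpaces();
        boolean first = true;
        while (peek() != close) {
            if (!first) {
                expect(',');
                skipSpaces();
                if (peek() == close) break;      // Python allows a trailing comma
                out.append(", ");
            }
            first = false;
            if (isDict) {
                string();                        // JSON keys must be strings
                skipSpaces();
                expect(':');
                out.append(": ");
            }
            value();
            skipSpaces();
        }
        pos++;                                   // consume the closing bracket
        out.append(close);
    }

    private void string() {
        if (peek() == 'u') pos++;                // Python 2 unicode prefix
        char quote = peek();
        if (quote != '\'' && quote != '"') throw error("expected a string");
        pos++;
        out.append('"');
        while (peek() != quote) {
            char c = src.charAt(pos++);
            if (c == '\\') {
                char escaped = peek();
                pos++;
                if (escaped == '\'') out.append('\'');   // \' is not a JSON escape
                else out.append('\\').append(escaped);   // assumed JSON-compatible
            } else if (c == '"') {
                out.append("\\\"");              // bare " inside '...' must be escaped
            } else {
                out.append(c);
            }
        }
        pos++;                                   // consume the closing quote
        out.append('"');
    }

    private boolean keyword(String python, String json) {
        if (!src.startsWith(python, pos)) return false;
        pos += python.length();
        out.append(json);
        return true;
    }

    private void number() {
        int start = pos;
        while (pos < src.length() && "+-.eE0123456789".indexOf(src.charAt(pos)) >= 0) pos++;
        if (start == pos) throw error("unexpected character");
        // BigDecimal rejects garbage with a NumberFormatException.
        out.append(new BigDecimal(src.substring(start, pos)).toString());
    }

    private void expect(char c) {
        if (peek() != c) throw error("expected '" + c + "'");
        pos++;
    }

    private char peek() {
        if (pos >= src.length()) throw error("unexpected end of input");
        return src.charAt(pos);
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(convert("{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"));
    }
}
```

Unlike plain string replacement, this only touches quotes and keywords where the grammar expects them, so a value such as 'True story' survives intact. Anything outside the subset fails with an exception instead of producing silently wrong JSON.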
Upvotes: 0
Reputation: 24802
Well, the best way would be to pass it through a Python script that reads the data and outputs valid JSON:
>>> json.dumps(ast.literal_eval("{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"))
'{"name": "Shivam", "otherInfo": [[0], [1]], "isMale": true}'
So you could create a script that only contains:
import sys, json, ast; print(json.dumps(ast.literal_eval(sys.argv[1])))
Or you can turn it into a Python one-liner like so:
python -c "import sys, ast, json ; print(json.dumps(ast.literal_eval(sys.argv[1])))" "{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}"
You can run that from your shell, which means you can run it from within Java the same way:
String PythonData = "{'name': u'Shivam', 'otherInfo': [[0], [1]], 'isMale': True}";
String[] cmd = {
    "python", "-c", "import sys, ast, json ; print(json.dumps(ast.literal_eval(sys.argv[1])))",
    PythonData
};
Runtime.getRuntime().exec(cmd);
The process's standard output will then contain a proper JSON string.
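Note that the JSON only reaches your Java code if you read the child process's standard output. A rough sketch of that, assuming Java 9+ and a `python` executable on the PATH (the class and method names are hypothetical):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper wrapping the one-liner above.
class PythonJsonBridge {
    private static final String SCRIPT =
            "import sys, ast, json; print(json.dumps(ast.literal_eval(sys.argv[1])))";

    // The data is passed as a separate argument, so no shell quoting is involved.
    static List<String> buildCommand(String pythonData) {
        return List.of("python", "-c", SCRIPT, pythonData);
    }

    static String toJson(String pythonData) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pythonData))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show Python tracebacks
                .start();
        // Read everything the script printed before waiting for it to exit.
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (process.waitFor() != 0) {
            throw new IOException("python exited with a non-zero status");
        }
        return output.trim();
    }
}
```

Calling `PythonJsonBridge.toJson(...)` with the example string should then give back the same JSON shown in the interpreter session above, provided Python is installed.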
This solution is the most reliable way I can think of, as it's going to safely parse any Python literal (it uses Python's own parser to do so) without opening a window for code injection.
But I wouldn't recommend using it, because you'd be spawning a Python process for each string you parse, which would be a performance killer.
As an improvement on that first approach, you could use Jython to run the same Python code inside the JVM for a bit more performance.
PythonInterpreter interpreter = new PythonInterpreter();
// exec() runs statements (imports, assignments); eval() only accepts expressions
interpreter.exec("import json, ast");
interpreter.exec("to_json = lambda d: json.dumps(ast.literal_eval(d))");
PyObject toJson = interpreter.get("to_json");
PyObject result = toJson.__call__(new PyString(PythonData));
String realResult = (String) result.__tojava__(String.class);
The above is untested (so it's likely to fail and spawn dragons 👹) and I'm pretty sure you can make it more elegant. It's loosely adapted from this answer. I'll leave it up to you as an exercise to see how you can include the Jython environment in your Java runtime ☺.
P.S.: Another solution would be to try and fix every pattern you can think of using one gigantic regexp or several smaller ones. Even if that might work on simpler cases, I would advise against it, because regex is the wrong tool for the job: it isn't expressive enough and you'll never be comprehensive. It's only a good way to plant the seed of a bug that'll bite you at some point in the future.
P.S.2: Whenever you need to parse code from an external source, always make sure that the data is sanitized and safe. Never forget about Little Bobby Tables.
Upvotes: 5
Reputation: 140553
In addition to the other answer: it is straightforward to simply invoke that Python one-liner to "translate" a Python dict string into a standard JSON string.
But spawning a new process for each row in your database might quickly turn into a performance killer.
Thus there are two options that you should consider on top of that:
Upvotes: 1