Reputation: 24759
[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]
I have a long string of data that came from a JavaScript file, like the above. Is there a shortcut or library that would parse this into the appropriate data types?
As you can see, it's a list of lists that contain dictionaries, Boolean values, integers and null values.
I mean, I could do this by hand but I don't think I could do it very quickly or efficiently. There must be a better method.
Upvotes: 2
Views: 425
Reputation: 76765
I suggest you take a look at PyParsing.
http://pyparsing.wikispaces.com/
You could also take a look at the Python "scanf" library.
If you needed to solve this problem just using Python built-ins, I would recommend using a regular expression with capture groups.
EDIT: Hmm, I took another look at this. You did say it was from JavaScript... this looks to me like a legal JSON array. I tried using the json module (specifically, the function json.loads()) but I couldn't get it to parse.
But! Python syntax is close to JavaScript syntax. Replace a few things and eval() can parse this, or ast.literal_eval(). We need to replace true with True, false with False, and null with None before ast.literal_eval() will accept it.
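import ast
# Sample as posted at the time: Python-style True/False mixed with JSON-style true/null.
s = '[[{"date":"January 2004"},True,False,100,null,null,true],[{"date":"February 2004"},False,False,99,null,null,true]]'
# Map the JSON literals onto their Python equivalents.
# Note: this plain text replacement would also alter these words inside string values.
s1 = s.replace("true","True").replace("false","False").replace("null","None")
# literal_eval only accepts Python literals, so it cannot run arbitrary code.
x = ast.literal_eval(s1)
print(x)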
The above will print:
[[{'date': 'January 2004'}, True, False, 100, None, None, True], [{'date': 'February 2004'}, False, False, 99, None, None, True]]
Originally I showed defining variables (like true = True) and using eval() to parse this, but of course eval() is a potential security hole; so if you need to parse text that might come from a web page or any other untrusted source, it's worth the small amount of effort to import ast and use ast.literal_eval() instead.
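To illustrate the difference, here is a minimal sketch. The `untrusted` string is a made-up example of hostile input, not something from the question. `ast.literal_eval()` refuses it because a function call is not a literal, while plain literals parse normally.

```python
import ast

# Hypothetical hostile input: eval() would execute this call, literal_eval() will not.
untrusted = "__import__('os').getcwd()"

try:
    ast.literal_eval(untrusted)
    rejected = False
except (ValueError, SyntaxError):
    # literal_eval raises instead of running the expression.
    rejected = True

# Plain literals (lists, dicts, booleans, None, numbers, strings) are accepted.
safe = ast.literal_eval("[1, True, None, {'date': 'January 2004'}]")

print(rejected)
print(safe)
```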
EDIT: Okay, the json module can parse this; the problem was the use of True instead of true and False instead of false. Just use the str.replace() method to fix those, and then json.loads() can parse this.
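A rough sketch of that str.replace() fix, assuming input shaped like the earlier version of the question (the `raw` string here is a shortened, illustrative sample). Be aware that a plain text replacement would also rewrite "True" or "False" if they appeared inside a string value.

```python
import json

# Illustrative sample: Python-style True/False mixed with JSON-style null/true.
raw = '[[{"date":"January 2004"},True,False,100,null,null,true]]'

# Lowercase the capitalized booleans so the text becomes valid JSON.
fixed = raw.replace("True", "true").replace("False", "false")

data = json.loads(fixed)
print(data)
```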
I was just about to post a code fragment with the .replace() method calls, when the question got updated again, and the capitalized True and False became ordinary legal JSON ones.
So my final answer:
import json
s = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'
# The string is valid JSON as it stands, so no replacements are needed.
x = json.loads(s)
print(x)
Under Python 2 this prints (Python 3 shows the same values without the u prefixes):
[[{u'date': u'January 2004'}, True, False, 100, None, None, True], [{u'date': u'February 2004'}, False, False, 99, None, None, True]]
Upvotes: 2
Reputation: 25974
That's pretty close to valid JSON. The only invalid thing is that False should be false and True should be true. That could be a transcription error (...yep)
Use json:
import json
x = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'
json.loads(x)
Out[20]:
[[{'date': 'January 2004'}, True, False, 100, None, None, True],
[{'date': 'February 2004'}, False, False, 99, None, None, True]]
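Since the question asks about getting the appropriate data types, here is a small sketch that inspects what json.loads() produces for one row. The variable names are illustrative, and the sample is shortened to a single row.

```python
import json

s = '[[{"date":"January 2004"},true,false,100,null,null,true]]'
rows = json.loads(s)

# JSON objects become dicts, true/false become bool, numbers become int, null becomes None.
first = rows[0]
type_names = [type(value).__name__ for value in first]
print(type_names)
```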
Upvotes: 5