User

Reputation: 24759

Parse string to appropriate variables

[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]

I have a long string of data that came from a JavaScript file, like the above. Is there a shortcut or library that would parse this into the appropriate data types?

As you can see, it's a list of lists that contain dictionaries, Boolean values, integers and null values.

I mean, I could do this by hand but I don't think I could do it very quickly or efficiently. There must be a better method.

Upvotes: 2

Views: 425

Answers (2)

steveha

Reputation: 76765

I suggest you take a look at PyParsing.

http://pyparsing.wikispaces.com/

You could also take a look at the Python "scanf" library.

sscanf in Python

If you needed to solve this problem just using Python built-ins, I would recommend using a regular expression with capture groups.

EDIT: Hmm, I took another look at this. You did say it was from JavaScript... this looks to me like a valid JSON array. I tried the json module (specifically, the json.loads() function) but I couldn't get it to parse.

But! Python syntax is close to JavaScript syntax. Replace a few things and eval() can parse this, or ast.literal_eval(). We need to replace true with True, false with False, and null with None before ast.literal_eval() will accept it.

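import ast

s = '[[{"date":"January 2004"},True,False,100,null,null,true],[{"date":"February 2004"},False,False,99,null,null,true]]'
# Swap the JavaScript keywords for their Python equivalents.
s1 = s.replace("true","True").replace("false","False").replace("null","None")
# literal_eval() only accepts Python literals, so it is safer than eval().
x = ast.literal_eval(s1)
print(x)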

The above will print:

[[{'date': 'January 2004'}, True, False, 100, None, None, True], [{'date': 'February 2004'}, False, False, 99, None, None, True]]
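One caveat with the blanket str.replace() calls: they also rewrite matches inside quoted strings. Here is a minimal sketch of that, using a made-up input (the "note" field is my own illustration, not from the question):

```python
import ast
import json

# Hypothetical input: the word "true" also appears inside a string value.
s = '[{"note":"true story"},true,null]'

# Blanket replacement rewrites the keyword AND the text inside the quotes.
s1 = s.replace("true", "True").replace("false", "False").replace("null", "None")
via_replace = ast.literal_eval(s1)

# json.loads() only treats bare true/false/null as keywords.
via_json = json.loads(s)

print(via_replace)
print(via_json)
```

For the data in the question this doesn't matter, since none of the date strings contain those words, but it is worth keeping in mind for other inputs.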

Originally I showed defining variables (like true = True) and using eval() to parse this, but of course eval() is a potential security hole; so if you need to parse text that might come from a web page or any other untrusted source, it's worth the small amount of effort to import ast and use ast.literal_eval() instead.
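To illustrate the difference, here is a small sketch of ast.literal_eval() refusing something that eval() would happily run. The "hostile" string is just an example I made up:

```python
import ast

# A function call, not a literal. eval() would execute this.
hostile = "__import__('os').getcwd()"

try:
    ast.literal_eval(hostile)
    rejected = False
except ValueError:
    # literal_eval() raises ValueError for anything that is not a plain literal.
    rejected = True

print(rejected)

# Plain literals still parse normally.
print(ast.literal_eval("[True, None, 100]"))
```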

EDIT: Okay, the json module can parse this; the problem was the use of True instead of true and False instead of false. Just use the str.replace() method to fix those, and then json.loads() can parse this.

I was just about to post a code fragment with the .replace() method calls, when the question got updated again, and the capitalized True and False became ordinary legal JSON ones.

So my final answer:

s = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'

import json

x = json.loads(s)
print(x)

prints (on Python 2; Python 3 shows the strings without the u prefix):

[[{u'date': u'January 2004'}, True, False, 100, None, None, True], [{u'date': u'February 2004'}, False, False, 99, None, None, True]]
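Once parsed, the result is ordinary Python lists and dicts, so you can index into it directly. A rough sketch follows; the names month and value are my guesses, since the question doesn't say what each column means:

```python
import json

s = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'
rows = json.loads(s)

summary = []
for row in rows:
    month = row[0]["date"]  # first element is a dict with a "date" key
    value = row[3]          # fourth element is the integer
    summary.append((month, value))

print(summary)
```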

Upvotes: 2

roippi

Reputation: 25974

That's pretty close to valid JSON. The only invalid thing is that False should be false and True should be true. That could be a transcription error (...yep)
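If you want to see the failure for yourself, here is a quick sketch with a shortened version of the capitalized input (the string is abbreviated by me for illustration):

```python
import json

# Shortened example with Python-style True/False, which JSON does not allow.
bad = '[{"date":"January 2004"},True,False,100,null]'

try:
    json.loads(bad)
    error = None
except json.JSONDecodeError as exc:
    error = exc

print(error)

# Lower-casing the two keywords makes it valid JSON again.
fixed = json.loads(bad.replace("True", "true").replace("False", "false"))
print(fixed)
```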


Use json:

import json

x = '[[{"date":"January 2004"},true,false,100,null,null,true],[{"date":"February 2004"},false,false,99,null,null,true]]'

json.loads(x)
Out[20]: 
[[{'date': 'January 2004'}, True, False, 100, None, None, True],
 [{'date': 'February 2004'}, False, False, 99, None, None, True]]

Upvotes: 5
