user1451817

Reputation: 151

Parse defaultdict string

I have dumped several defaultdicts with a simple print statement, like this:

defaultdict(<type 'list'>, {'actual': [20000.0, 19484.0, 19420.0], 'gold': [20000.0, 19484.0, 19464.0]})

Is there a standard parser I could use to read them back in? I understand I should have used pickle, but the code that generated these defaultdicts is very slow and I'd like to avoid rerunning it.

Upvotes: 3

Views: 1768

Answers (3)

Amr

Reputation: 735

You can make your own subclass:

from collections import defaultdict

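class mydefdict(defaultdict):
    def __repr__(self):
        # Emit an eval()-able expression: "[].__class__" evaluates back to list.
        # Calling default_factory() assumes a factory is set (not None).
        return "mydefdict(%s, %s)" % (repr(self.default_factory()) + ".__class__", repr(dict(self)))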

and then use it like other types with eval:

>>> d = mydefdict(list)
>>> d['foo'] = [1,2,3]
>>> d['bar']
[]
>>> print d
mydefdict([].__class__, {'foo': [1, 2, 3], 'bar': []})
>>> reprstring = repr(d)
>>> d2 = eval(reprstring)
>>> d2
mydefdict([].__class__, {'foo': [1, 2, 3], 'bar': []})

Note that with this approach, eval creates a separate copy for each object reference in your structure, even if several of them originally referred to the same object.
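
To illustrate that caveat, here is a small sketch written for Python 3. It uses a hypothetical variant of the subclass, MyDefDict, whose repr prints the factory's __name__ (an assumption made for brevity, not the code above), and shows that two keys sharing one list come back as two independent lists:

```python
from collections import defaultdict


class MyDefDict(defaultdict):
    # Hypothetical variant: the repr uses the factory's name, so it only
    # round-trips for factories such as list or dict that eval() can
    # resolve by name.
    def __repr__(self):
        return "MyDefDict(%s, %r)" % (self.default_factory.__name__, dict(self))


shared = [1, 2, 3]
d = MyDefDict(list)
d["a"] = shared
d["b"] = shared  # both keys reference the same list object

# Rebuild from the printed form, passing the names the expression needs.
d2 = eval(repr(d), {"MyDefDict": MyDefDict, "list": list})

same_before = d["a"] is d["b"]    # one shared list in the original
same_after = d2["a"] is d2["b"]   # eval built two separate lists
```

The contents still compare equal; only the object identity is lost.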

Upvotes: 0

Andrew Clark

Reputation: 208415

If the type of your defaultdict is always <type 'list'>, you can use the following:

from collections import defaultdict

s = """
defaultdict(<type 'list'>, {'actual': [20000.0, 19484.0, 19420.0], 'gold': [20000.0, 19484.0, 19464.0]})
"""
data = eval(s.replace("<type 'list'>", 'list'))

People will tell you that eval() is unsafe and evil, but if someone were trying to inject harmful code into the data that you dumped, they could probably just as easily edit your source code. If the text files you are reading this data from are more accessible than your source code, then you might not want to use this method.

If there are multiple types for your defaultdicts, but they are all built-in types (or easy to translate between repr and the type name), then you could still use this method with multiple replacements, for example:

for rep, typ in ((repr(list), 'list'), (repr(dict), 'dict')):
    s = s.replace(rep, typ)
data = eval(s)
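
The same idea can be written with a single regular expression instead of one replacement per type. This is only a sketch, assuming every factory is a built-in type whose repr looks like <type 'list'> (Python 2) or <class 'list'> (Python 3). The helper name parse_defaultdict_repr and the list of allowed names are made up for illustration, and because it still calls eval() the trust caveat above applies unchanged:

```python
import re
from collections import defaultdict

# Matches "<type 'list'>" (Python 2) or "<class 'list'>" (Python 3) and
# captures the bare type name.
TYPE_REPR = re.compile(r"<(?:type|class) '(\w+)'>")


def parse_defaultdict_repr(text):
    """Rebuild a defaultdict from its printed form (trusted input only)."""
    expr = TYPE_REPR.sub(r"\1", text.strip())
    # Names the expression is expected to need. eval() still has access to
    # the builtins, so this is not a sandbox.
    namespace = {"defaultdict": defaultdict, "list": list, "dict": dict,
                 "int": int, "float": float, "set": set, "str": str}
    return eval(expr, namespace)


s = ("defaultdict(<type 'list'>, {'actual': [20000.0, 19484.0, 19420.0], "
     "'gold': [20000.0, 19484.0, 19464.0]})")
data = parse_defaultdict_repr(s)
```

The result is a real defaultdict, so looking up a missing key on data creates an empty list as usual.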

Upvotes: 6

georg

Reputation: 214949

Totally ugly, but works:

s = """
defaultdict(<type 'list'>, {'actual': [20000.0, 19484.0, 19420.0], 'gold': [20000.0, 19484.0, 19464.0]})
"""

import re, ast

# Drop everything before the first '{' and everything after the last '}'.
s = re.sub('^[^{]+', '', s)
s = re.sub('[^}]+$', '', s)

print(ast.literal_eval(s))

Note that this creates a plain dict, not a defaultdict.
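
If a defaultdict is needed, the plain dict can be wrapped again afterwards. A minimal sketch, assuming the factory is known to be list (as in the question), and using str.find and str.rfind rather than the regexes above to cut out the dict literal:

```python
import ast
from collections import defaultdict

s = """
defaultdict(<type 'list'>, {'actual': [20000.0, 19484.0, 19420.0], 'gold': [20000.0, 19484.0, 19464.0]})
"""

# Slice from the first '{' to the last '}' to isolate the dict literal.
start = s.find("{")
end = s.rfind("}") + 1
plain = ast.literal_eval(s[start:end])

# Wrap the parsed dict; the factory is assumed, not read from the string.
restored = defaultdict(list, plain)
```

Because ast.literal_eval only accepts literals, this variant does not execute arbitrary code from the dumped text.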

Upvotes: 2
