Omid

Reputation: 2667

Pickling a dictionary that uses defaultdict

I have this dictionary defined by:

import collections

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

Later on, I want to use pickle to dump the dictionary into a file:

f = open('dict.txt', 'wb')
pickle.dump(Nwords, f)

However, the code doesn't work and I receive an error. Apparently pickle can't handle a lambda, and I'm better off defining the model using a module-level function. I have already read the answers here.

Unfortunately, as I am not experienced with Python, I am not exactly sure how to do this. I tried:

def dd():
    return defaultdict(int)

def train(features):
##    model = defaultdict(lambda: 1)
    model = defaultdict(dd)
    for f in features:
        model[f] += 1
    return model 

I receive the error:

TypeError: unsupported operand type(s) for +=: 'collections.defaultdict' and 'int'

Besides that, defaultdict(int) would assign zero to the first occurrence of a key, whereas I want it to assign 1. Any ideas on how I can fix this?

Upvotes: 1

Views: 1520

Answers (1)

Martijn Pieters

Reputation: 1122262

Unfortunately, the answer there is correct for that question but subtly wrong for yours. Using a top-level function instead of a lambda is the right idea and does make pickle a lot happier, but the function should return the default value to be used, which in your case is not another defaultdict object.

Simply return the same value your lambda returns:

def dd():
    return 1

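Every time you access a key that doesn't yet exist in the defaultdict instance, dd is called and its return value becomes the value for that key. In the other post, dd returns another defaultdict instance (one that uses int as its default) because that matches the lambda shown in that question. In your code it puts a defaultdict where you need a number, which is why model[f] += 1 raises the TypeError.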
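Putting it together, here is a minimal sketch of the whole round trip. The sample feature list is made up for illustration, and it pickles to an in-memory bytes object with pickle.dumps rather than to your dict.txt file, just to keep the example self-contained:

```python
import collections
import pickle


def dd():
    # Module-level default factory. pickle stores a reference to it by name,
    # which it cannot do for a lambda.
    return 1


def train(features):
    model = collections.defaultdict(dd)
    for f in features:
        # A missing key starts at dd() == 1, then gets incremented.
        model[f] += 1
    return model


# Hypothetical sample data, not from the question.
model = train(["a", "b", "a"])

# Serialize to bytes and load it back.
data = pickle.dumps(model)
restored = pickle.loads(data)

print(dict(restored))      # {'a': 3, 'b': 2}
print(restored["unseen"])  # 1, the default factory survives the round trip
```

Note that dd has to be importable under the same name when you unpickle, because only the reference to the function is stored, not its code.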

Upvotes: 2
