Reputation: 2667
I have this dictionary defined by:
import collections

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model
Later on, I want to use pickle to dump the dictionary into a file:
with open('dict.txt', 'wb') as f:
    pickle.dump(Nwords, f)
However, the code doesn't work and I receive an error. Apparently pickle can't work with lambda, and I'm better off defining the model using a module-level function. I have already read the answers here.
Unfortunately, as I am not experienced with Python, I am not exactly sure how to do this. I tried:
from collections import defaultdict

def dd():
    return defaultdict(int)

def train(features):
    ## model = defaultdict(lambda: 1)
    model = defaultdict(dd)
    for f in features:
        model[f] += 1
    return model
I receive the error:
TypeError: unsupported operand type(s) for +=: 'collections.defaultdict' and 'int'
Other than that, defaultdict(int) would always give a key the value zero on its first occurrence, whereas I want it to start at 1. Any ideas on how I can fix this?
Upvotes: 1
Views: 1520
Reputation: 1122262
Unfortunately, the answer there is correct for that question but subtly wrong for yours. A top-level function instead of a lambda is the right idea and does make pickle a lot happier, but the function should return the default value itself, which in your case is not another defaultdict object.
Simply return the same value your lambda returns:
def dd():
    return 1
Every time you access a key that doesn't yet exist in the defaultdict instance, dd is called and its return value is stored as the value for that key. The factory in the other post returns another defaultdict instance, set to use int as its default, which matches the lambda shown in that question.
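Putting it together, here is a minimal sketch of the whole round trip. The sample word list and the in-memory pickle.dumps / pickle.loads calls are my own illustration rather than anything from your code; pickle.dump to an open file should behave the same way, provided dd can still be found under the same module-level name when you load the pickle.

```python
import pickle
from collections import defaultdict


def dd():
    # Module-level default factory: pickle records it by name,
    # so it must be defined at the top level of a module.
    return 1


def train(features):
    model = defaultdict(dd)
    for f in features:
        model[f] += 1
    return model


# Hypothetical sample input, just to exercise the code.
model = train(["a", "b", "a"])

# Serialise and restore in memory.
restored = pickle.loads(pickle.dumps(model))

print(dict(restored))   # {'a': 3, 'b': 2}
print(restored["zzz"])  # 1, the restored dict still uses dd for missing keys
```

Note that the first occurrence of a word ends up with the value 2 (the default of 1, plus the increment), exactly as with your original lambda.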
Upvotes: 2