Reputation: 662
I keep getting random errors like:
suspended generator _get_tasklet(context.py:329) raised ProtocolBufferDecodeError(corrupted)
or
suspended generator put(context.py:796) raised ValueError(Expecting , delimiter: line 1 column 440 (char 440))
or
suspended generator put(context.py:796) raised ValueError(Invalid \escape: line 1 column 18002 (char 18002))
or
suspended generator _get_tasklet(context.py:329) raised ProtocolBufferDecodeError(truncated)
Everything was working fine up until a couple of days ago, and I haven’t made any changes. When I restart my app, everything is fine for about five minutes until I get a
suspended generator _get_tasklet(context.py:329) raised ProtocolBufferDecodeError(corrupted)
After that point, I get one of the other errors on every database put or get. The table and code that cause the error are different every time. I have no idea where to begin, since the error is in a new place every time. These are just regular database puts and gets, like
ndbstate = NdbStateJ.get_by_id(self.screen_name)
or
ndbstate.put()
Google searches haven’t been able to point me in any particular direction. Any ideas? The error
Expecting , delimiter: line 1 column 440 (char 440)
might be because some of the field types in some of the tables are JSON. But why all of a sudden?
So maybe I'm not escaping properly somewhere, like by using r'{...}', but if there is a bad entry in there somewhere, how do I fix it if I can't query? And why does it break the whole table for all queries? And why is it random? It's not the same query every time.
Here’s an example of a table
class NdbStateJ(ndb.Model):
    last_id = ndb.IntegerProperty()
    last_search_id = ndb.IntegerProperty()
    last_geo_id = ndb.IntegerProperty()
    mytweet_num = ndb.IntegerProperty()
    mentions_processed = ndb.JsonProperty()
    previous_follower_responses = ndb.JsonProperty()
    my_tweets_tweeted = ndb.JsonProperty()
    responses_already_used = ndb.JsonProperty()
    num_followed_by_cyborg = ndb.IntegerProperty(default=0)
    num_did_not_follow_back = ndb.IntegerProperty(default=0)
    language_model_vector = ndb.FloatProperty(repeated=True)
    follow_wait_counter = ndb.IntegerProperty(default=0)
Here’s an example of creating a table
ndbstate = NdbStateJ(id=screen_name,
                     last_id=37397357946732541,
                     last_geo_id=37397357946732541,
                     last_search_id=0,
                     mytweet_num=0,
                     mentions_processed=[],
                     previous_follower_responses=[],
                     my_tweets_tweeted=[],
                     responses_already_used=[],
                     language_model_vector=[])
ndbstate.put()
Upvotes: 2
Views: 393
Reputation: 662
It was malformed JSON in the database causing the problem. I don't know why suddenly the problem started happening everywhere; maybe something changed on the Google side, or maybe I wasn't checking sufficiently, and new users were able to enter in malformed data. Who knows.
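As a minimal sketch of how malformed JSON produces the two ValueError messages from the question, the snippet below feeds two hand-made strings to json.loads. The sample strings are assumptions made up for illustration, not data from the actual datastore, and the exact wording and character offsets of the messages vary between Python versions.

```python
import json

# Hypothetical malformed values, similar to what free-form user text can produce.
samples = {
    "unescaped quote": '{"text": "she said "hi" to me"}',
    "stray backslash": r'{"text": "C:\dir"}',
}

messages = {}
for label, raw in samples.items():
    try:
        json.loads(raw)
    except ValueError as e:  # JSONDecodeError subclasses ValueError on Python 3
        messages[label] = str(e)

for label, message in messages.items():
    print(label + ": " + message)
```

The unescaped quote ends the string early, so the parser reports a missing delimiter; the stray backslash is reported as an invalid escape.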
To fix it, I took inspiration from https://stackoverflow.com/users/1011633/nizz responding to App Engine return JSON from JsonProperty, https://stackoverflow.com/users/1709587/mark-amery responding to How to escape special characters in building a JSON string?, and https://stackoverflow.com/users/1639625/tobias-k responding to How do I automatically fix an invalid JSON string?.
I replaced ndb.JsonProperty() with ExtendedJsonProperty, where the extended version looks similar to the code below.
import json
import logging
import re

from google.appengine.ext import ndb

logging.getLogger().setLevel(logging.DEBUG)


class ExtendedJsonProperty(ndb.BlobProperty):
    # Inspired by https://stackoverflow.com/questions/18576556/app-engine-return-json-from-jsonproperty

    def _to_base_type(self, value):
        logging.debug('Dumping value ' + str(value))
        try:
            return json.dumps(value)
        except Exception as e:
            logging.warning('trying to fix error dumping from database: ' + str(e))
            return fix_json(value, json.dumps)

    def _from_base_type(self, value):
        # originally return json.loads(value)
        logging.debug('Loading value ' + str(value))
        try:
            return json.loads(value)
        except Exception as e:
            logging.warning('trying to fix error loading from database: ' + str(e))
            return fix_json(value, json.loads)


def fix_json(s, json_fun):
    # Apply one repair per failed attempt, at most len(s) times.
    for _i in range(len(s)):
        try:
            result = json_fun(s)  # try to parse...
            return result
        except Exception as e:
            logging.debug('Exception for json loads: ' + str(e))
            if 'delimiter' in str(e):
                # E.g.: "Expecting , delimiter: line 34 column 54 (char 1158)"
                logging.debug('Escaping quote to fix.')
                s = escape_quote(s, e)
            elif 'escape' in str(e):
                # E.g.: "Invalid \escape: line 1 column 9 (char 9)"
                logging.debug('Removing invalid escape to fix.')
                s = remove_invalid_escape(s)
            else:
                break
    # Could not repair the value: fall back to an empty object.
    return json_fun('{}')


def remove_invalid_escape(value):
    # Inspired by https://stackoverflow.com/questions/19176024/how-to-escape-special-characters-in-building-a-json-string
    return re.sub(r'\\(?!["\\/bfnrt])', '', value)


def escape_quote(s, e):
    # Inspired by https://stackoverflow.com/questions/18514910/how-do-i-automatically-fix-an-invalid-json-string
    # "Expecting , delimiter: line 34 column 54 (char 1158)"
    # position of unexpected character after '"'
    unexp = int(re.findall(r'\(char (\d+)\)', str(e))[0])
    # position of unescaped '"' before that
    unesc = s.rfind(r'"', 0, unexp)
    s = s[:unesc] + r'\"' + s[unesc+1:]
    # position of corresponding closing '"' (+2 for inserted '\')
    closg = s.find(r'"', unesc + 2)
    if closg + 2 < len(s):
        logging.debug('%s %s', closg, len(s))
        s = s[:closg] + r'\"' + s[closg+1:]
    return s
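To try the repair logic without App Engine, here is a standalone sketch that copies the two helper functions and runs them on hand-made strings. The name repair_and_load and both sample strings are hypothetical, introduced only for this illustration; the loop mirrors fix_json but is specialised to json.loads.

```python
import json
import re


def remove_invalid_escape(value):
    # Drop any backslash that does not start one of the listed JSON escapes.
    return re.sub(r'\\(?!["\\/bfnrt])', '', value)


def escape_quote(s, e):
    # Character offset taken from the parser's error message.
    unexp = int(re.findall(r'\(char (\d+)\)', str(e))[0])
    # Escape the quote that ended the string too early...
    unesc = s.rfind('"', 0, unexp)
    s = s[:unesc] + r'\"' + s[unesc + 1:]
    # ...and the next quote after it, assumed to be its partner.
    closg = s.find('"', unesc + 2)
    if closg + 2 < len(s):
        s = s[:closg] + r'\"' + s[closg + 1:]
    return s


def repair_and_load(s):
    # Hypothetical helper: one repair per failed parse, at most len(s) tries.
    for _ in range(len(s)):
        try:
            return json.loads(s)
        except ValueError as e:
            if 'delimiter' in str(e):
                s = escape_quote(s, e)
            elif 'escape' in str(e):
                s = remove_invalid_escape(s)
            else:
                break
    return {}  # give up and fall back to an empty object


quoted = repair_and_load('{"text": "she said "hi" to me"}')
escaped = repair_and_load(r'{"text": "C:\dir"}')
print(quoted)
print(escaped)
```

The repairs are heuristic. The invalid-escape fix drops the backslash rather than recovering it, so the loaded value can differ from what was originally meant, and anything that cannot be repaired comes back as an empty object.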
Upvotes: 1