Reputation: 51
I'm writing a Python module for displaying and entering emoji in pygame. This means I'm often working with non-BMP Unicode characters, which the Python shell (IDLE) apparently doesn't like.
I've made a custom string-like object that makes emoji characters and sequences easier to handle by storing each emoji sequence as a single character. However, although I'd like str(self) to return the object's raw Unicode representation, this causes problems when the object is printed or, even worse, when it's included in an error message.
This is an example of what happens when a non-BMP character is included in an error message, running Python 3.7.3 on Windows 10.
>>> raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
File "D:\Python37\lib\idlelib\run.py", line 474, in runcode
exec(code, self.locals)
File "<pyshell#0>", line 1, in <module>
Traceback (most recent call last):
File "D:\Python37\lib\idlelib\run.py", line 474, in runcode
exec(code, self.locals)
File "<pyshell#0>", line 1, in <module>
ValueError:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Python37\lib\idlelib\run.py", line 144, in main
ret = method(*args, **kwargs)
File "D:\Python37\lib\idlelib\run.py", line 486, in runcode
print_exception()
File "D:\Python37\lib\idlelib\run.py", line 234, in print_exception
print_exc(typ, val, tb)
File "D:\Python37\lib\idlelib\run.py", line 232, in print_exc
print(line, end='', file=efile)
File "D:\Python37\lib\idlelib\run.py", line 362, in write
return self.shell.write(s, self.tags)
File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
return self.asyncreturn(seq)
File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
return self.decoderesponse(response)
File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Python37\lib\idlelib\run.py", line 158, in main
print_exception()
File "D:\Python37\lib\idlelib\run.py", line 234, in print_exception
print_exc(typ, val, tb)
File "D:\Python37\lib\idlelib\run.py", line 220, in print_exc
print_exc(type(context), context, context.__traceback__)
File "D:\Python37\lib\idlelib\run.py", line 232, in print_exc
print(line, end='', file=efile)
File "D:\Python37\lib\idlelib\run.py", line 362, in write
return self.shell.write(s, self.tags)
File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
return self.asyncreturn(seq)
File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
return self.decoderesponse(response)
File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\Python37\lib\idlelib\run.py", line 162, in main
traceback.print_exception(type, value, tb, file=sys.__stderr__)
File "D:\Python37\lib\traceback.py", line 105, in print_exception
print(line, file=file, end="")
File "D:\Python37\lib\idlelib\run.py", line 362, in write
return self.shell.write(s, self.tags)
File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
return self.asyncreturn(seq)
File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
return self.decoderesponse(response)
File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk
=============================== RESTART: Shell ===============================
As you can see, it looks like the shell gets into an infinite loop trying to deal with the error, then restarts the shell to prevent getting stuck. Is there any way I could a) make str work differently for the error handler or b) prevent the shell restart so the error displays properly?
Upvotes: 1
Views: 75
Reputation: 51
Taking ideas from snakecharmerb and these two questions, I've implemented some code that checks whether the module is being run in IDLE and, if so, whether the function is being called by the error handler. Tests appear to be working fine. The following checks for an IDLE environment:
import sys
IN_IDLE = False
for item in ['idlelib.__main__', 'idlelib.run', 'idlelib']:
    IN_IDLE = IN_IDLE or item in sys.modules
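The check above can also be wrapped in a function, which makes it easier to test. This is only a sketch: the name running_in_idle and its optional modules parameter are my own additions for illustration, not part of idlelib or of the module described here.

```python
import sys

# Module names the check above looks for in sys.modules.
IDLE_MODULE_NAMES = ('idlelib.__main__', 'idlelib.run', 'idlelib')


def running_in_idle(modules=None):
    """Return True if any IDLE module appears in the given module table.

    `modules` is a hypothetical parameter added so the check can be
    tested with a plain dict; by default it uses sys.modules.
    """
    if modules is None:
        modules = sys.modules
    return any(name in modules for name in IDLE_MODULE_NAMES)


IN_IDLE = running_in_idle()
```

Passing a plain dict such as {'idlelib.run': None} lets the function be exercised outside IDLE.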
And below is the new __str__ function:
def __str__(self):
    """ Return str(self). """
    if IN_IDLE:
        # Check for caller. If string is being printed, modify
        # output to be IDLE-friendly (no non-BMP characters)
        callername = sys._getframe(1).f_code.co_name
        if callername == '_some_str':
            rstr = ''
            for char in self.__raw:
                if ord(char) > 0xFFFF:
                    rstr += '\\U'+hex(ord(char))[2:].zfill(8)
                else:
                    rstr += repr(char)[1:-1]
            return rstr
        else:
            return self.__raw
    else:
        return self.__raw
Here self.__raw holds the raw text representation of the object. I'm caching it to improve efficiency since the objects are intended to be immutable.
Of course, while this does work around the issue, I feel that Python shouldn't restart the entire shell when this occurs. I will report it on bugs.python.org.
EDIT: Posted on bugs.python.org as issue 36698
Upvotes: 1