Reputation: 10010
Running Python 2.7 here. I am writing a quick and dirty little script to do some web scraping, and I want the Unicode handling to simply ignore all Unicode errors.
That is, I am totally fine if it just drops whatever characters it can't convert to ascii anywhere in the program. This is a throwaway script I just want to get done :-)
Is there some global "ignore" variable I can set?
Thanks! /YGA
Upvotes: 1
Views: 3050
Reputation: 32309
> I am totally fine if it just drops whatever characters it can't convert to ascii anywhere in the program
Then you want to explicitly create your Unicode objects by decoding the bytes with the ascii codec, and tell it to ignore errors:
# Python 2: decode as ASCII, dropping any byte that is not valid ASCII
text = unicode(input_bytes, encoding='ascii', errors='ignore')
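For a throwaway script, the same idea can be wrapped in two tiny helpers, one for bytes coming in and one for text going out. This is only a rough sketch: the names `to_ascii_text` and `to_ascii_bytes` and the sample bytes are made up for illustration, and it is written so it should run on both 2.7 and 3.

```python
def to_ascii_text(raw_bytes):
    """Decode bytes as ASCII, silently dropping bytes that are not ASCII."""
    return raw_bytes.decode('ascii', 'ignore')


def to_ascii_bytes(text):
    """Encode text as ASCII, silently dropping characters that are not ASCII."""
    return text.encode('ascii', 'ignore')


# Hypothetical scraped input: UTF-8 bytes containing one accented letter.
scraped = b'caf\xc3\xa9 menu'
clean_text = to_ascii_text(scraped)      # both bytes of the accented letter are dropped
clean_bytes = to_ascii_bytes(u'caf\xe9 menu')

print(clean_text)  # caf menu
```

Note that this only affects the places where you call the helpers. Implicit conversions elsewhere on 2.7 (for example, concatenating a `str` with a `unicode`) still use the default strict ascii handling, so route your scraped data through the helpers as early as possible.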
See the Unicode HOWTO for more on properly handling Unicode.
(And for writing new code, always choose Python 3 or later unless you have an excellent, well-founded reason to stay behind.)
Upvotes: 2