Reputation: 13339
mystr = 'aaaa'
myvar = u'My string %s' % str(mystr)
Can this be a problem in the future? I'm working with some in-house code that uses the email modules in Python and found some code like this. mystr will always contain only ASCII characters, since it comes from a list of predefined ASCII-only characters.
I didn't write the code, and having str(mystr) or mystr doesn't change the substance of the question.
With the first snippet, am I going to get a safe unicode object, or do I have to do
mystr = u'aaaa'
myvar = u'My string %s' % mystr
or
mystr = 'aaaa'
myvar = u'My string %s' % unicode(mystr)
?
(I know this is not the correct way of doing it, and I know I should handle the exceptions. I'm only asking whether the first snippet returns a valid unicode object, or whether Python messes up its internals or something when doing it.)
Upvotes: 2
Views: 629
Reputation: 176800
As long as the regular 8-bit string contains only ASCII characters, you're fine. This can be done to save processing time and / or memory space if you really only need ASCII.
Can it be a problem in the future? Yes, if you're taking input that may be in a non-ASCII character set and saving it in a byte string. It's also generally a good idea to be consistent: if the rest of your code works with Unicode, don't store text in 8-bit strings anywhere unless there is a good reason to.
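To make the "ASCII only" condition concrete, here is a minimal sketch. It assumes the implicit conversion in Python 2 behaves like an explicit ASCII decode (the default encoding there is normally ASCII). The helper name is_ascii is hypothetical, and the b'' and u'' prefixes are used only so that the snippet also runs on Python 3.

```python
def is_ascii(data):
    """Hypothetical helper: True if the byte string decodes cleanly as ASCII."""
    try:
        data.decode('ascii')
    except UnicodeDecodeError:
        return False
    return True

mystr = b'aaaa'  # the same bytes as the plain 'aaaa' literal in Python 2
non_ascii = u'\u0441\u0432\u044f\u0442'.encode('utf-8')  # a Cyrillic word as UTF-8 bytes

# Decoding explicitly spells out what the implicit conversion is assumed to do.
myvar = u'My string %s' % mystr.decode('ascii')
```

Under that assumption, is_ascii(mystr) is true, so the first snippet from the question produces an ordinary unicode object, while a string like non_ascii would not survive the conversion.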
Upvotes: 1
Reputation: 37633
Try putting actual Unicode symbols in the strings (like umlauts or Cyrillic) and watch all hell break loose. :)
s = 'свят' # world
v = u'здравей %s' % s # hello %s
Traceback (most recent call last):
File "<input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position 0: ordinal not in range(128)
The problem is that you will most likely write your application, and one bright, shiny day some Russian or German user will enter her name and suddenly get an Internal Server Error for having a non-ASCII symbol in it.
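One way to avoid that surprise is to decode byte strings explicitly at the point where they enter the program. The following is only a sketch: to_text is a hypothetical helper, and UTF-8 is an assumed input encoding that the original code may not share. Escapes stand in for the Cyrillic literals so the snippet needs no source encoding declaration.

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return text, decoding byte strings explicitly."""
    if isinstance(value, bytes):  # bytes is an alias of str in Python 2
        return value.decode(encoding)
    return value

# The same words as in the failing example, written with escapes.
name_bytes = u'\u0441\u0432\u044f\u0442'.encode('utf-8')
greeting = u'\u0437\u0434\u0440\u0430\u0432\u0435\u0439 %s' % to_text(name_bytes)
```

Because the decoding happens with a stated encoding, the formatting step only ever sees unicode objects and no implicit ASCII conversion takes place.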
I know... I'm asking about the situation in my example, using ascii only in
No, there will be no problem. And IMHO this is a fault in Python, because it is a bug waiting to bite. This should have been a fatal error but, I guess for historical reasons, it isn't.
Upvotes: 3