Reputation: 991
Python 3 itself and some third-party libraries seem to take different approaches to returning either strings or bytes. Is there a de facto way of handling these two types? To me, it seems natural to work with only one of the types in the code as much as possible (keeping bytes only at the boundaries), but I'm not sure whether that makes sense.
Upvotes: 0
Views: 75
Reputation: 991
I did find the official recommendation (https://docs.python.org/3/howto/unicode.html#tips-for-writing-unicode-aware-programs):
"Software should only work with Unicode strings internally, decoding the input data as soon as possible and encoding the output only at the end."
Upvotes: 0
Reputation: 121
If you want to manipulate text, use strings, as that is more straightforward, unless you have a specific use case where working on the binary data itself matters, such as converting between encodings, implementing ciphers, or handling raw binary files.
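As an illustration of the "different encodings" case, here is a minimal sketch of converting bytes from one encoding to another. The helper name `transcode` is made up for this example, and it assumes you already know the source encoding:

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one text encoding to another (hypothetical helper)."""
    # Decode to str using the known source encoding, then encode to the target.
    text = data.decode(source)
    return text.encode(target)


latin1_bytes = "café".encode("latin-1")
utf8_bytes = transcode(latin1_bytes, "latin-1", "utf-8")
```

Even here the conversion goes through `str` in the middle; the bytes only exist at the two ends.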
Upvotes: 0
Reputation: 650
I do not know of any "standard" way to handle strings and byte strings, but you could use functions like these to make sure everything you work with is a string (or bytes):
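# To always have strings:
def get_string(s):
    # isinstance() checks the real type; type(s) == 'bytes' is always False
    if isinstance(s, bytes):
        return s.decode()  # decodes as UTF-8 by default
    return s

# To always have bytes:
def get_bytes(s):
    if isinstance(s, str):
        return s.encode()  # encodes as UTF-8 by default
    return s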
Upvotes: 0
Reputation: 416
This question is somewhat vague; can you provide some more context? You might want to have a look at 1, which advocates the following for developing with Python 3:
Bytes on the outside, unicode on the inside, encode/decode at the edges
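To make that concrete, here is a minimal sketch of the pattern. The function `normalize_name` is a hypothetical example, and UTF-8 is assumed as the encoding at both edges; in real code the encoding depends on where the bytes come from.

```python
def normalize_name(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical example: bytes in, bytes out, str everywhere in between."""
    text = raw.decode(encoding)       # edge: decode incoming bytes right away
    cleaned = text.strip().title()    # inside: all text processing happens on str
    return cleaned.encode(encoding)   # edge: encode only when handing the data back


incoming = "  josé garcía \n".encode("utf-8")
outgoing = normalize_name(incoming)
```

The only places that need to know about the encoding are the first and last lines of the function; everything in between works on plain `str`.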
Upvotes: 1