Reputation: 43
I have a program running on my Arduino that takes serial input and saves it to a variable. Works a charm. With the Arduino application's built-in serial monitor, I have successfully sent and received bytes in the range 0-255.
Using pyserial, when I send any byte higher than 127 (or 0b01111111), pyserial returns 2, meaning that for values higher than 127, say 0b10000000, 2 bytes will be sent, not one.
I therefore believe my problem is with pyserial.
ser.write(chr(int('01000000', base=2)).encode('utf-8'))
works perfectly, and is received on the Arduino end correctly.
ser.write(chr(int('10000000', base=2)).encode('utf-8'))
returns 2, however, and shows on the Arduino as 0b11000010 and 0b10000000.
Upvotes: 4
Views: 3203
Reputation: 809
As NPE says, this is the encoding for UTF-8: a code point between 128 and 2047 (8 to 11 bits) inclusive is converted to two bytes. If the original 11 bits are abcdefghijk, then the UTF-8 version is 110abcde 10fghijk. In your example (left-padded with 0s to make 11 bits), 00010000000 would be converted to 11000010 10000000, or \xc2\x80, which is exactly what you are seeing. See the Wikipedia article on UTF-8 for more.
You can see this in Python with this code (I'm replacing int('10000000', base=2) with 128):
>>> unichr(128).encode('utf-8')
'\xc2\x80'
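To make the 110abcde 10fghijk layout concrete, here is a minimal sketch that builds the two bytes by hand and compares them with the built-in encoder. It assumes Python 3 (where chr replaces unichr and encode returns bytes), and encode_two_byte is a hypothetical helper name, not part of any library:

```python
def encode_two_byte(code_point):
    """Hand-rolled UTF-8 encoding for code points 128..2047 (hypothetical helper)."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("the two-byte UTF-8 form only covers 128..2047")
    # Lead byte 110abcde: marker bits plus the top 5 of the 11 bits.
    lead = 0b11000000 | (code_point >> 6)
    # Trail byte 10fghijk: marker bits plus the low 6 bits.
    trail = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, trail])


manual = encode_two_byte(128)
builtin = chr(128).encode('utf-8')  # Python 3 equivalent of unichr(128).encode('utf-8')
print(manual == builtin)
```

Both routes should give the same two bytes that show up on the Arduino side.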
The thing that confuses me is that you can use chr(int('10000000', base=2)).encode('utf-8'), or equivalently chr(128).encode('utf-8'). When I do this I get:
>>> chr(int('10000000', base=2)).encode('utf-8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
Have you changed the default encoding?
What you need is an encoding that uses exactly one byte for each value 0 - 255 and matches Unicode. So try using 'latin_1' instead:
>>> unichr(128).encode('latin_1')
'\x80'
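If you are on Python 3 (an assumption on my part, though it would explain why chr(128).encode('utf-8') works for you), you can also skip text encodings entirely and build the byte directly. A rough sketch, where to_single_byte is a hypothetical helper and ser stands for your already-open pyserial port:

```python
def to_single_byte(value):
    """Return exactly one byte for an integer 0..255 (hypothetical helper)."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte (0..255)")
    return bytes([value])


payload = to_single_byte(int('10000000', base=2))

# latin_1 maps code points 0..255 to the same single byte values,
# whereas utf-8 needs two bytes for anything above 127.
same_as_latin_1 = payload == chr(128).encode('latin_1')
utf8_length = len(chr(128).encode('utf-8'))
print(len(payload), same_as_latin_1, utf8_length)

# With an open pyserial port named ser (assumed, not created here):
# ser.write(payload)
```

Since payload is a single byte, ser.write(payload) should then report 1 rather than 2.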
Upvotes: 2