I've been working on a problem for a few hours and can't get it fixed. I'm sure it's just a small thing, but I can't figure out what I'm doing wrong.
My aim is to get data via JSON from the public transport company and show the next departure times of metro/tram lines on a display. Basically everything works, but as soon as the JSON contains an umlaut (like "ü") I get an error message. The interesting thing is: the sharp s (ß) works!
Here is the exact error message (it should be "Hütteldorf"):
UnicodeEncodeError('ascii', u'H\xfctteldorf', 1, 2, 'ordinal not in range(128)')
The part of the code:
...
apiurl = 'https://www.wienerlinien.at/ogd_realtime/monitor?rbl={rbl}&sender={apikey}'
...
for rbl in rbls:
    r = requests.get(url, timeout=10)
    ##r.encoding = 'utf-8';
    ##print(r.json())
    ##print(r.encoding)
    ##r.encoding = 'latin1'
    # Compare the response status; a bare "requests.codes.ok" is always true
    if r.status_code == requests.codes.ok:
        try:
            for monitor in r.json()['data']['monitors']:
                rbl.station = monitor['locationStop']['properties']['title'].encode('utf-8')
                for line in monitor['lines']:
                    #Decoding-Problem is here - ß works, ü doesn't
                    #UnicodeEncodeError('ascii', u'H\xfctteldorf', 1, 2, 'ordinal not in range(128)')
                    rbl.name = str(line['name'])
                    rbl.direction = str(line['towards'])
                    rbl.trafficjam = line['trafficjam'] #Boolean
...
I personally think I tried everything I found that is possible in Python3...encode, decode, ... Every time either the sharp s or the umlaut ü is failing.
Can someone give me a hint in the right direction? Thank you very much!
[Edit:] Here is the full source-code, which has a workaround (ü=ue):
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys, getopt, time
import requests
import smbus
# Define some device parameters
I2C_ADDR = 0x27 # I2C device address, if any error, change this address to 0x3f
LCD_WIDTH = 20 # Maximum characters per line
# Define some device constants
LCD_CHR = 1 # Mode - Sending data
LCD_CMD = 0 # Mode - Sending command
LCD_LINE_1 = 0x80 # LCD RAM address for the 1st line
LCD_LINE_2 = 0xC0 # LCD RAM address for the 2nd line
LCD_LINE_3 = 0x94 # LCD RAM address for the 3rd line
LCD_LINE_4 = 0xD4 # LCD RAM address for the 4th line
LCD_BACKLIGHT = 0x08 # On
#LCD_BACKLIGHT = 0x00 # Off
ENABLE = 0b00000100 # Enable bit
# Timing constants
E_PULSE = 0.0005
E_DELAY = 0.0005
#Open I2C interface
bus = smbus.SMBus(1) # Rev 2 Pi uses 1
class RBL:
    id = 0
    line = ''
    station = ''
    direction = ''
    time = -1

def replaceUmlaut(s):
    s = s.replace("Ä", "Ae") # A umlaut
    s = s.replace("Ö", "Oe") # O umlaut
    s = s.replace("Ü", "Ue") # U umlaut
    s = s.replace("ä", "ae") # a umlaut
    s = s.replace("ö", "oe") # o umlaut
    s = s.replace("ü", "ue") # u umlaut
    return s

def lcd_init():
    # Initialise display
    lcd_byte(0x33,LCD_CMD) # 110011 Initialise
    lcd_byte(0x32,LCD_CMD) # 110010 Initialise
    lcd_byte(0x06,LCD_CMD) # 000110 Cursor move direction
    lcd_byte(0x0C,LCD_CMD) # 001100 Display On,Cursor Off, Blink Off
    lcd_byte(0x28,LCD_CMD) # 101000 Data length, number of lines, font size
    lcd_byte(0x01,LCD_CMD) # 000001 Clear display
    time.sleep(E_DELAY)

def lcd_byte(bits, mode):
    # Send byte to data pins
    # bits = the data
    # mode = 1 for data
    #        0 for command
    bits_high = mode | (bits & 0xF0) | LCD_BACKLIGHT
    bits_low = mode | ((bits<<4) & 0xF0) | LCD_BACKLIGHT
    # High bits
    bus.write_byte(I2C_ADDR, bits_high)
    lcd_toggle_enable(bits_high)
    # Low bits
    bus.write_byte(I2C_ADDR, bits_low)
    lcd_toggle_enable(bits_low)

def lcd_toggle_enable(bits):
    # Toggle enable
    time.sleep(E_DELAY)
    bus.write_byte(I2C_ADDR, (bits | ENABLE))
    time.sleep(E_PULSE)
    bus.write_byte(I2C_ADDR,(bits & ~ENABLE))
    time.sleep(E_DELAY)

def lcd_string(message,line):
    # Send string to display
    message = message.ljust(LCD_WIDTH," ")
    lcd_byte(line, LCD_CMD)
    for i in range(LCD_WIDTH):
        lcd_byte(ord(message[i]),LCD_CHR)

def main(argv):
    apikey = False
    apiurl = 'https://www.wienerlinien.at/ogd_realtime/monitor?rbl={rbl}&sender={apikey}'
    #Time between updates
    st = 10
    # Initialise display
    lcd_init()
    lcd_string("Willkommen!",LCD_LINE_2)
    try:
        opts, args = getopt.getopt(argv, "hk:t:", ["help", "key=", "time="])
    except getopt.GetoptError:
        usage()
        sys.exit(2)
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage()
            sys.exit()
        elif opt in ("-k", "--key"):
            apikey = arg
        elif opt in ("-t", "--time"):
            try:
                tmpst = int(arg)
                if tmpst > 0:
                    st = tmpst
            except ValueError:
                usage()
                sys.exit(2)
    if apikey == False or len(args) < 1:
        usage()
        sys.exit()
    rbls = []
    for rbl in args:
        tmprbl = RBL()
        tmprbl.id = rbl
        rbls.append(tmprbl)
    x = 1
    while True:
        for rbl in rbls:
            url = apiurl.replace('{apikey}', apikey).replace('{rbl}', rbl.id)
            r = requests.get(url, timeout=10)
            r.encoding = 'utf-8'
            # Compare the response status; a bare "requests.codes.ok" is always true
            if r.status_code == requests.codes.ok:
                try:
                    for monitor in r.json()['data']['monitors']:
                        rbl.station = monitor['locationStop']['properties']['title']
                        for line in monitor['lines']:
                            rbl.name = replaceUmlaut(str(line['name'].encode('ascii','xmlcharrefreplace').decode('ascii')))
                            rbl.direction = replaceUmlaut(str(line['towards'].encode('ascii','xmlcharrefreplace').decode('ascii')))
                            rbl.trafficjam = line['trafficjam']
                            rbl.type = line['type']
                            rbl.time1 = line['departures']['departure'][0]['departureTime']['countdown']
                            rbl.time2 = line['departures']['departure'][1]['departureTime']['countdown']
                            rbl.time3 = line['departures']['departure'][2]['departureTime']['countdown']
                            lcdShow(rbl)
                            time.sleep(st)
                except Exception as e:
                    print("Fehler (Exc): " + repr(e))
                    print(r)
                    lcd_string("Fehler (Exc):",LCD_LINE_1)
                    lcd_string(repr(e),LCD_LINE_2)
                    lcd_string("",LCD_LINE_3)
                    lcd_string("",LCD_LINE_4)
            else:
                print('Fehler bei Kommunikation mit Server')
                lcd_string("Fehler:",LCD_LINE_1)
                lcd_string("Serverkomm.",LCD_LINE_2)
                lcd_string("",LCD_LINE_3)
                lcd_string("",LCD_LINE_4)

def lcdShow(rbl):
    lcdLine1 = rbl.name + ' ' + rbl.station
    lcdLine2 = rbl.direction
    lcdLine3 = "".ljust(LCD_WIDTH-9) + ' ' + '{:0>2d}'.format(rbl.time1) + ' ' + '{:0>2d}'.format(rbl.time2) + ' ' + '{:0>2d}'.format(rbl.time3)
    if not rbl.type == "ptMetro":
        if rbl.trafficjam:
            lcdLine4 = "Stau in Zufahrt"
        else:
            lcdLine4 = "kein Stau"
    else:
        lcdLine4 = ""
    lcd_string(lcdLine1,LCD_LINE_1)
    lcd_string(lcdLine2,LCD_LINE_2)
    lcd_string(lcdLine3,LCD_LINE_3)
    lcd_string(lcdLine4,LCD_LINE_4)
    #print(lcdLine1 + '\n' + lcdLine2+ '\n' + lcdLine3+ '\n' + lcdLine4)

def usage():
    print('usage: ' + __file__ + ' [-h] [-t time] -k apikey rbl [rbl ...]\n')
    print('arguments:')
    print(' -k, --key=\tAPI key')
    print(' rbl\t\tRBL number\n')
    print('optional arguments:')
    print(' -h, --help\tshow this help')
    print(' -t, --time=\ttime between station updates in seconds, default 10')

if __name__ == "__main__":
    main(sys.argv[1:])
Upvotes: 1
I personally think I tried everything I found that is possible in Python3...encode, decode, ... Every time either the sharp s or the umlaut ü is failing.
As noted in the comments, you appear to be running Python 2 based on the error messages you're seeing.
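A quick way to confirm which interpreter actually runs the script is to print its version from inside the script itself. This is only a minimal sketch; the helper name interpreter_summary is made up for illustration. On many older systems the shebang #!/usr/bin/python resolves to Python 2, which would be consistent with the u'...' prefix in the error message.

```python
import sys

def interpreter_summary():
    """Return a short description of the running interpreter (hypothetical helper)."""
    major, minor = sys.version_info[0], sys.version_info[1]
    return "Python {}.{} at {}".format(major, minor, sys.executable)

# Put this near the top of the script to see what is really executing it
print(interpreter_summary())
```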
Python 2 has two 'string' types: str, which contains raw bytes, and unicode, which contains Unicode characters. When you call .json() you get back a data structure containing unicode strings. So line['name'] is one such unicode string.
When you call str(line['name']) you are implicitly asking to encode the unicode string into a sequence of ASCII bytes. This fails as ASCII cannot represent these characters. Unfortunately I don't know why you're trying to do this here. Does rbl.name need to be a str? Where is it used? What encoding is it expected to be in by other code using it?
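The implicit conversion can be reproduced explicitly. The sketch below is written for Python 3, where the ASCII encoding step has to be spelled out; in Python 2, str(u'H\xfctteldorf') performs the same step behind the scenes. The station name is the one from the question; the variable names are only illustrative.

```python
name = "H\u00fctteldorf"  # "Hütteldorf", as a text (Unicode) string

error = None
try:
    name.encode("ascii")  # the step Python 2's str(unicode_value) does implicitly
except UnicodeEncodeError as exc:
    error = exc

# start/end mark the offending character, like the "1, 2" in the question's error
print(error.start, error.end, error.reason)
```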
In the comments, Jorropo suggests writing line['name'].decode("utf-8"), which you indicate also doesn't work. This is because it doesn't really make sense to decode a unicode string, but Python 2 will try anyway by first encoding it in ASCII (which fails) before attempting to decode it as UTF-8 as you requested.
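That two-step behaviour can be imitated in Python 3, where the hidden step has to be written out. The function py2_unicode_decode is hypothetical and only emulates the behaviour described above; it is not a real API, and "Karlsplatz" is just an ASCII-only example name.

```python
def py2_unicode_decode(text, encoding="utf-8"):
    """Emulate Python 2's unicode.decode(): encode to ASCII first, then decode."""
    raw = text.encode("ascii")   # hidden step; fails for non-ASCII characters
    return raw.decode(encoding)  # the step that was actually requested

# Pure ASCII text survives the round trip unchanged
print(py2_unicode_decode("Karlsplatz"))

error_name = None
try:
    py2_unicode_decode("H\u00fctteldorf")
except UnicodeEncodeError as exc:
    # An *encode* error, even though the caller asked for a decode
    error_name = type(exc).__name__
print(error_name)
```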
Your fix is going to depend on what you're doing with rbl.name. You might:
- Use rbl.name = line['name']. This requires that subsequent code expects a unicode string.
- Use rbl.name = line['name'].encode('utf-8'). This requires that subsequent code expects a sequence of UTF-8 bytes.
Either way, it's possible (or even probable) that something else will subsequently break when you try either of these, depending entirely on what assumptions the rest of the code makes about what rbl.name is supposed to be and how it's encoded.
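The two options differ in more than just their type. Here is a minimal sketch in Python 3 terms (str there corresponds to Python 2's unicode, and bytes to Python 2's str), using the station name from the question. The UTF-8 form is one byte longer than the text form, which matters for any code that pads or indexes by position, as lcd_string in the question does.

```python
text = "H\u00fctteldorf"      # option 1: keep the value as Unicode text
utf8 = text.encode("utf-8")   # option 2: a sequence of UTF-8 bytes

print(len(text))   # number of characters
print(len(utf8))   # number of bytes; the umlaut takes two bytes in UTF-8

# Decoding with the same encoding recovers the original text
assert utf8.decode("utf-8") == text
```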
As for why it works with u'Westbahnstraße' I couldn't say for sure. Can you provide a complete example including input data that demonstrates one working and the other not working?
Upvotes: 1