Why do I receive this error on parsing?

Question

I am reading in a text file and converting it into a Python dictionary.

The file looks like this, with a numeric label and a word or phrase on each line, separated by a tab:

20001   World Economies

20002   Politics

20004   Internet Law

20005   Philipines Elections

20006   Israel Politics

20007   Science

This is the code to read the file and create a dictionary:

def get_pair(line):
    key, sep, value = line.strip().partition("\t")
    return int(key), value


with open("mapped.txt") as fd:
    d = dict(get_pair(line) for line in fd)
print(d)

I receive {} when I print the contents of d. Additionally, I receive this error:

Traceback (most recent call last):
  File "predicter.py", line 23, in <module>
    d = dict(get_pair(line) for line in fd)
  File "predicter.py", line 23, in <genexpr>
    d = dict(get_pair(line) for line in fd)
  File "predicter.py", line 19, in get_pair
    return int(key), value
ValueError: invalid literal for int() with base 10: ''

What does this mean? I do have content inside the file, so I am not sure why it is not being read.

Martijn Pieters · Accepted Answer

It means key is empty, which in turn means you have a line with a tab at the start or an empty line:

>>> '\tScience'.partition('\t')
('', '\t', 'Science')
>>> ''.partition('\t')
('', '', '')

My guess is that it is the latter; you can skip such lines in your generator expression:

d = dict(get_pair(line) for line in fd if '\t' in line.strip())

Because line.strip() returns the line without leading and trailing whitespace, an empty line or a line with only a tab at the start becomes a string with no tab in it at all, so the filter drops it. This won't handle all cases, but you could also strip the value passed to get_pair():

d = dict(get_pair(line.strip()) for line in fd if '\t' in line.strip())
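
If you would rather see which lines were dropped than skip them silently, a slightly longer loop may help. The following is only a sketch: parse_mapping and the sample text are hypothetical names and data made up for illustration, and it assumes that blank lines can be ignored while any other line without a tab or without an integer key should be reported.

```python
import io


def parse_mapping(lines):
    """Build {int_key: label} from tab-separated lines.

    Returns the mapping plus a list of (line_number, text) for lines
    that could not be parsed. Blank lines are ignored silently.
    """
    mapping = {}
    skipped = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # empty or whitespace-only line
        key, sep, value = line.partition("\t")
        if not sep:
            # No tab left after stripping, e.g. the line was "\tScience"
            skipped.append((lineno, raw.rstrip("\n")))
            continue
        try:
            number = int(key)
        except ValueError:
            # The key is present but is not an integer
            skipped.append((lineno, raw.rstrip("\n")))
            continue
        mapping[number] = value.strip()
    return mapping, skipped


# Made-up sample: blank lines between entries and one line with a leading tab.
sample = io.StringIO("20001\tWorld Economies\n\n20002\tPolitics\n\n\tScience\n")
d, skipped = parse_mapping(sample)
print(d)
print(skipped)
```

With a real file you would pass the open file object instead of the io.StringIO sample, for example parse_mapping(fd) inside the with block.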
