Reputation: 3039
I am reading in a text file and converting it into a Python dictionary.
The file looks like this, with a label and a word separated by a tab on each line:
20001 World Economies
20002 Politics
20004 Internet Law
20005 Philipines Elections
20006 Israel Politics
20007 Science
This is the code to read the file and create a dictionary:
def get_pair(line):
    key, sep, value = line.strip().partition("\t")
    return int(key), value

with open("mapped.txt") as fd:
    d = dict(get_pair(line) for line in fd)

print(d)
I receive {} when I print the contents of d.
Additionally, I receive this error:
Traceback (most recent call last):
  File "predicter.py", line 23, in <module>
    d = dict(get_pair(line) for line in fd)
  File "predicter.py", line 23, in <genexpr>
    d = dict(get_pair(line) for line in fd)
  File "predicter.py", line 19, in get_pair
    return int(key), value
ValueError: invalid literal for int() with base 10: ''
What does this mean? I do have content inside the file, so I am not sure why it is not being read.
Upvotes: 1
Views: 33
Reputation: 1125248
It means key is empty, which in turn means you have a line with a \t tab at the start or an empty line:
>>> '\tScience'.partition('\t')
('', '\t', 'Science')
>>> ''.partition('\t')
('', '', '')
My guess is that it is the latter; you can skip such lines in your generator expression:
d = dict(get_pair(line) for line in fd if '\t' in line.strip())
Because line.strip() returns the line without leading and trailing whitespace, empty lines or lines with only a tab at the start result in a string without a tab in it altogether. This won't handle all cases, but you could also strip the value passed to get_pair():
d = dict(get_pair(line.strip()) for line in fd if '\t' in line.strip())
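If you also want to know which lines were rejected, or to survive keys that are not numbers, an explicit loop may be easier to debug than the generator expression. The following is only a sketch: parse_mapping and the skipped list are hypothetical names that are not in your code, and it assumes you would rather skip a malformed line than raise an error.

```python
def parse_mapping(lines):
    """Build {int_label: text} from tab-separated lines.

    Hypothetical helper: returns the dictionary plus a list of
    (line_number, raw_line) tuples for every line that was skipped.
    """
    mapping = {}
    skipped = []
    for lineno, line in enumerate(lines, start=1):
        key, sep, value = line.strip().partition("\t")
        if not sep:
            # No tab left after stripping: blank line or leading-tab line.
            skipped.append((lineno, line))
            continue
        try:
            label = int(key)
        except ValueError:
            # The part before the tab is not an integer.
            skipped.append((lineno, line))
            continue
        mapping[label] = value
    return mapping, skipped


# In-memory sample standing in for the contents of the file.
sample = [
    "20001\tWorld Economies\n",
    "\n",
    "\tScience\n",
    "abc\tPolitics\n",
    "20007\tScience\n",
]
mapping, skipped = parse_mapping(sample)
print(mapping)
print([lineno for lineno, _ in skipped])
```

An open file object can be passed in place of the list, since the function only iterates over lines. Printing repr() of each skipped line should make stray tabs or blank lines visible.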
Upvotes: 3