Python split function does not work as expected

Question

I am having a hard time describing my problem. It concerns this piece of code:

def txt_to_dict_ad():

    my_dict = {}    

    with open('database.txt', 'r') as file:
        for line in file:
            temp = list(line.strip().split('-'))
            my_dict[temp[1].strip('\n')] = temp[0]

    return my_dict

When I run it, for example to print the output of this function, I get an "index out of range" error on the line containing temp[1] and temp[0]. Why is that? And how can I avoid it?

The txt file contains Arabic and German vocabulary. Example data: Auto - سيارة

Heiko Oberdiek · Accepted Answer

If a line in database.txt does not contain a -, then temp is a list with only one element. The following line then tries to access the non-existent second element with temp[1], which raises the error.
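To see the effect in isolation, here is a minimal sketch. The sample lines are made up for illustration (only the first one matches the example data from the question); a blank line, such as an empty last line in the file, behaves the same way as a line without a hyphen.

```python
# Hypothetical sample lines; only the first one contains a hyphen.
good = 'Auto - سيارة\n'
bad = 'Haus\n'
blank = '\n'

good_parts = good.strip().split('-')
bad_parts = bad.strip().split('-')
blank_parts = blank.strip().split('-')

print(len(good_parts))   # 2: temp[0] and temp[1] both exist
print(len(bad_parts))    # 1: temp[1] would raise IndexError
print(len(blank_parts))  # 1: an empty string still yields one (empty) element
```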

For example, you can avoid the error by skipping lines without a -:

if '-' in line:
    temp = list(line.strip().split('-'))
    my_dict[temp[1].strip('\n')] = temp[0]

If you want to identify the lines without a hyphen:

with open('database.txt', 'r') as file:
    for i, line in enumerate(file, start=1):
        if '-' in line:
            temp = list(line.strip().split('-'))
            my_dict[temp[1].strip('\n')] = temp[0]
        else:
            print('Line {} is missing the hyphen.'.format(i))
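A variant that avoids indexing altogether could use str.partition, which always returns three parts, so a missing hyphen shows up as an empty separator instead of an exception. This is only a sketch: the function name lines_to_dict and the sample lines are assumptions for illustration, and unlike the code in the question it also strips the spaces around both words.

```python
def lines_to_dict(lines):
    """Build {arabic: german} from lines shaped like 'Auto - سيارة'.

    Hypothetical helper: takes any iterable of lines (a list or an open
    file) and returns the dict plus the numbers of the skipped lines.
    """
    result = {}
    skipped = []
    for number, line in enumerate(lines, start=1):
        # partition splits at the first '-' only and always returns 3 items;
        # sep is '' when the line contains no hyphen at all.
        german, sep, arabic = line.partition('-')
        if not sep:
            skipped.append(number)
            continue
        result[arabic.strip()] = german.strip()
    return result, skipped


# Made-up sample input: one valid entry, one blank line, one line without '-'.
sample = ['Auto - سيارة\n', '\n', 'Haus\n']
my_dict, skipped = lines_to_dict(sample)
print(my_dict)
print('Skipped lines:', skipped)
```

With a real file, the open file object can be passed in place of the list, for example inside with open('database.txt', 'r', encoding='utf-8') as file. Passing the encoding explicitly is a suggestion, not something the question requires, but it helps with Arabic text on systems whose default encoding is not UTF-8.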
