Thanx
Thanx

Reputation: 7653

Something wrong with output from list in Python

I want a Python program to import a list of words from a text file and print out the content of the text file as two lists. The data in the text file is on this form:

A Alfa
B Betta
C Charlie

I want a Python program to print out one list with A,B,C and one with Alfa, Betta, Charlie.

This is what I've written:

english2german = open('english2german.txt', 'r')
englist = []
gerlist = []

for i, line in enumerate(english2german):
    englist[i:], gerlist[i:] = line.split()

This is making two lists, but will only print out the first letter in each word. How can I make my code to print out the whole word?

Upvotes: 2

Views: 421

Answers (6)

Autoplectic
Autoplectic

Reputation: 7676

And even shorter than amo-ej1's answer, and likely faster:

In [1]: english2german = open('english2german.txt')
In [2]: eng, ger = zip(*( line.split() for line in english2german ))
In [3]: eng
Out[3]: ('A', 'B', 'C')
In [4]: ger
Out[4]: ('Alfa', 'Betta', 'Charlie')

If you're using Python 3.0 or from future_builtins import zip, this is memory-efficient too. Otherwise replace zip with izip from itertools if english2german is very long.

Upvotes: 6

dbr
dbr

Reputation: 169763

Slightly meta-answer(?) to Autoplectic's suggestion of using zip()

With 3 lines in the input file (from the supplied data in the question):

The zip() method takes an average of 0.404729390144 seconds, compared to 0.341339087486 with the simple for loop constructing two lists (the code from mipadi's currently accepted answer).

With 10,000 lines in the input file (random generated 3-12 character words. I reduced the timeit.repeat() values to 100 times, repeated twice):

zip() took an average of 1.43965339661 seconds, compared to 1.52318406105 with the for loop.

Both benchmarks were done using Python version 2.5.1

Hardly a huge difference.. Given how much more readable the simple for loop is, I would recommend using it.. The zip code might be a bit quicker with large files, but the difference is about 0.083 seconds with 10,000 lines..

Benchmarking code:

import timeit

# https://stackoverflow.com/questions/743248/something-wrong-with-output-from-list-in-python/743313#743313
code_zip = """english2german = open('english2german.txt')
eng, ger = zip(*( line.split() for line in english2german ))
"""

# https://stackoverflow.com/questions/743248/something-wrong-with-output-from-list-in-python/743268#743268
code_for = """english2german = open("english2german.txt")
englist = []
gerlist = []

for line in english2german:
    (e, g) = line.split()
    englist.append(e)
    gerlist.append(g)
"""

for code in [code_zip, code_for]:
    t = timeit.Timer(stmt = code)
    try:
        times = t.repeat(10, 10000)
    except:
        t.print_exc()
    else:
        print "Code:"
        print code
        print "Time:"
        print times
        print "Average:"
        print sum(times) / len(times)
        print "-" * 20

Upvotes: 1

ZeD
ZeD

Reputation: 571

just an addition: you're working with files. please close them :) or use the with construct:

with open('english2german.txt') as english2german:
  englist, gerlist = zip(*(line.split() for line in english2german))

Upvotes: 3

ibz
ibz

Reputation: 46859

The solutions already posted are OK if you have no spaces in any of the words (ie each line has a single space). If I understand correctly, you are trying to build a dictionary, so I would suggest you consider the fact that you can also have definitions of multiple word expressions. In that case, you'd better use some other character instead of a space to separate the definition from the word. Something like "|", which is impossible to appear in a word.

Then, you do something like this:

for line in english2german:
    (e, g) = line.split("|")
    englist.append(e)
    gerlist.append(g)

Upvotes: 1

amo-ej1
amo-ej1

Reputation: 3307

Like this you mean:

english2german = open('k.txt', 'r')
englist = []
gerlist = []

for i, line in enumerate(english2german):
    englist.append(line.split()[0])
    gerlist.append(line.split()[1])

print englist
print gerlist

which generates:

['A', 'B', 'C'] ['Alfa', 'Betta', 'Charlie']

Upvotes: 1

mipadi
mipadi

Reputation: 411390

You want something like this:

english2german = open("english2german.txt")
englist = []
gerlist = []

for line in english2german:
    (e, g) = line.split()
    englist.append(e)
    gerlist.append(g)

The problem with your code before is that englist[i:] is actually a slice of a list, not just a single index. A string is also iterable, so you were basically stuffing a single letter into several indices. In other words, something like gerlist[0:] = "alfa" actually results in gerlist = ['a', 'l', 'f', 'a'].

Upvotes: 6

Related Questions