Hayley van Waas

Reputation: 439

Splitting a multi-line text file into a single list?

I need some help figuring out how to split the words in a text file into a list. I can use something like this:

words = []
for line in open('text.txt'):
    line.split()
    words.append(line)

But if the file contains multiple lines of text, they are split into sublists, e.g.

this is the first line
this is the second line

Becomes:

[['this', 'is', 'the', 'first', 'line'], ['this', 'is', 'the', 'second', 'line']]

How do I make it so that they are in the same list? i.e.

[['this', 'is', 'the', 'first', 'line', 'this', 'is', 'the', 'second', 'line']]

thanks!

EDIT: This program will be opening multiple text files, so the words in each file need to be added to a sublist. So if a file has multiple lines, all the words from these lines should be stored together in a sublist. i.e. Each new file starts a new sublist.

Upvotes: 1

Views: 6719

Answers (3)

user2197172

Reputation: 77

Not sure why you want to keep the [[]] but:

words = [open('text.txt').read().split()]
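A slightly more defensive variant of the same idea, as a minimal sketch: it assumes you want the file closed promptly, and `read_words` is a hypothetical helper name that does not appear in the question. Calling `split()` with no arguments splits on any run of whitespace, newlines included, so the line boundaries disappear on their own.

```python
def read_words(path):
    """Return every whitespace-separated word in the file as one flat list."""
    # The with-block closes the file even if reading raises an exception.
    with open(path) as f:
        # split() with no argument treats newlines like any other whitespace.
        return f.read().split()
```

Wrapping the call as `[read_words('text.txt')]` keeps the outer list from the question.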

Upvotes: 1

abarnert

Reputation: 365577

Your code doesn't actually do what you say it does. line.split() just returns a list of words in the line, which you don't do anything with; it doesn't affect line in any way, so when you do words.append(line), you're just appending the original line, a single string.

So, first, you have to fix that:

words = []
for line in open('text.txt'):
    words.append(line.split())

Now, what you're doing is repeatedly appending a new list of words to an empty list. So of course you get a list of lists of words. This is because you're mixing up the append and extend methods of list. append takes any object, and adds that object as a new element of the list; extend takes any iterable, and adds each element of that iterable as separate new elements of the list.
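To illustrate the difference on a small list, here is a short sketch. The variable names `line_words`, `appended`, and `extended` are made up for the example.

```python
line_words = "this is the first line".split()

appended = []
appended.append(line_words)   # adds the whole list as ONE new element

extended = []
extended.extend(line_words)   # adds each word as its own element

print(appended)  # [['this', 'is', 'the', 'first', 'line']]
print(extended)  # ['this', 'is', 'the', 'first', 'line']
```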

And if you fix that too:

words = []
for line in open('text.txt'):
    words.extend(line.split())

… now you get what you wanted.
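For the multi-file case described in the question's EDIT, the two methods can be combined: `extend` within a file, `append` once per file. This is only a sketch under assumptions; `words_per_file` and its `paths` parameter are hypothetical names, and it assumes each file is small enough to process line by line without trouble.

```python
def words_per_file(paths):
    """Return one sublist of words per file, in the order the paths are given."""
    all_words = []
    for path in paths:
        file_words = []
        with open(path) as f:
            for line in f:
                # extend flattens the words of every line into this file's list
                file_words.extend(line.split())
        # append keeps each file's words together as a single sublist
        all_words.append(file_words)
    return all_words
```

Called with a single path, this gives the `[[...]]` shape shown in the question; each additional path adds one more sublist.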

Upvotes: 3

thefourtheye

Reputation: 239433

You can use a list comprehension to flatten the words from every line into one list:

with open("Input.txt") as input_file:
    result = [word for line in input_file for word in line.split()]

This is the same as writing

result = []
with open("Input.txt") as input_file:
    for line in input_file:
        for word in line.split():
            result.append(word)

Or you can use itertools.chain.from_iterable, like this

from itertools import chain
with open("Input.txt") as input_file:
    print(list(chain.from_iterable(line.split() for line in input_file)))

Upvotes: 3
