Reputation: 439
I need some help figuring out how to split the words in a text file into a list. I can use something like this:
words = []
for line in open('text.txt'):
    line.split()
    words.append(line)
But if the file contains multiple lines of text, they are split into sublists, e.g.
this is the first line
this is the second line
Becomes:
[['this', 'is', 'the', 'first', 'line'], ['this', 'is', 'the', 'second', 'line']]
How do I make it so that they are in the same list? i.e.
[['this', 'is', 'the', 'first', 'line', 'this', 'is', 'the', 'second', 'line']]
thanks!
EDIT: This program will be opening multiple text files, so the words in each file need to be added to a sublist. If a file has multiple lines, all the words from those lines should be stored together in one sublist, i.e. each new file starts a new sublist.
Upvotes: 1
Views: 6719
Reputation: 77
Not sure why you want to keep the [[]] but:
words = [open('text.txt').read().split()]
Upvotes: 1
Reputation: 365577
Your code doesn't actually do what you say it does. line.split() just returns a list of words in the line, which you don't do anything with; it doesn't affect line in any way, so when you do words.append(line), you're just appending the original line, a single string.
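A quick way to see this for yourself (a minimal sketch; the sample string is just an illustration):

```python
line = "this is the first line\n"

# split() builds and returns a NEW list; the string itself is left
# untouched, because Python strings are immutable.
parts = line.split()

print(parts)       # ['this', 'is', 'the', 'first', 'line']
print(repr(line))  # 'this is the first line\n'
```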
So, first, you have to fix that:
words = []
for line in open('text.txt'):
    words.append(line.split())
Now, what you're doing is repeatedly appending a new list of words to a list that starts out empty. So of course you get a list of lists of words. This is because you're mixing up the append and extend methods of list. append takes any object, and adds that object as a new element of the list; extend takes any iterable, and adds each element of that iterable as separate new elements of the list.
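To make the difference concrete, here is a small sketch with two made-up lists (the variable names are only for illustration):

```python
first = ['this', 'is']
second = ['the', 'first', 'line']

appended = list(first)    # copy, so the two results stay independent
appended.append(second)   # adds the whole list as ONE new element

extended = list(first)
extended.extend(second)   # adds each element of the list separately

print(appended)  # ['this', 'is', ['the', 'first', 'line']]
print(extended)  # ['this', 'is', 'the', 'first', 'line']
```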
And if you fix that too:
words = []
for line in open('text.txt'):
    words.extend(line.split())
… now you get what you wanted.
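For the multi-file case described in the question's edit, the two methods probably combine like this: extend within a file, append once per file. This is only a sketch under assumptions; words_per_file is a hypothetical helper name, and the demo writes two throwaway temp files (a.txt, b.txt) so that it can run on its own.

```python
import os
import tempfile


def words_per_file(paths):
    """Return one flat list of words per file (hypothetical helper)."""
    all_words = []
    for path in paths:
        words = []
        with open(path) as f:               # closes the file when done
            for line in f:
                words.extend(line.split())  # flatten lines within a file
        all_words.append(words)             # one sublist per file
    return all_words


# Demo data: two temporary files standing in for the real inputs.
tmp_dir = tempfile.mkdtemp()
samples = [
    ("a.txt", "this is the first line\nthis is the second line\n"),
    ("b.txt", "another file\n"),
]
paths = []
for name, text in samples:
    path = os.path.join(tmp_dir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

result = words_per_file(paths)
print(result)
```

With the sample files above, the first sublist holds all ten words of a.txt and the second holds the two words of b.txt.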
Upvotes: 3
Reputation: 239433
You can use a list comprehension like this to flatten the lines into a single list of words (input_file is the open file object):
[word for line in input_file for word in line.split()]
This is the same as writing
result = []
for line in input_file:
    for word in line.split():
        result.append(word)
Or you can use itertools.chain.from_iterable, like this
from itertools import chain
with open("Input.txt") as input_file:
    # print(...) with parentheses works in both Python 2 and Python 3
    print(list(chain.from_iterable(line.split() for line in input_file)))
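If you want to check that flattening with a nested list comprehension and with chain.from_iterable give the same result, here is a small self-contained sketch. It uses an in-memory list of lines instead of a real file, which is an assumption made only so the example runs anywhere.

```python
from itertools import chain

# Stand-in for the lines of a file (iterating a file yields lines like these).
lines = ["this is the first line\n", "this is the second line\n"]

# Nested comprehension: outer loop over lines, inner loop over words.
from_comprehension = [word for line in lines for word in line.split()]

# chain.from_iterable: lazily concatenates the per-line word lists.
from_chain = list(chain.from_iterable(line.split() for line in lines))

print(from_comprehension == from_chain)  # True
print(from_chain)
```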
Upvotes: 3