Alex T

Reputation: 3754

Remove commas and newlines from text file in python

I have a text file which looks like this:

ab initio
ab intestato
ab intra
a.C.
acanka, acance, acanek, acankach, acankami, acanką
Achab, Achaba, Achabem, Achabie, Achabowi

I would like to parse every comma-separated word into a list, so that the result looks like ['ab initio', 'ab intestato', 'ab intra', 'a.C.', 'acanka', ...]. Note that some words sit on their own lines and do not end with a comma. When I used list1.append(line.strip()), I got one string per line instead of separate words. Can someone give me some insight into this?

Full code below:

list1=[]
filepath="words.txt"
with open(filepath, encoding="utf8") as fp:  
   line = fp.readline()
   while line:
       list1.append(line.strip(','))
       line = fp.readline()

Upvotes: 1

Views: 9758

Answers (2)

Patrick Artner

Reputation: 51653

You can keep your code to build the list of lines and then apply:

cleaned = [x.strip() for y in list1 for x in y.split(',')]

This takes every line you stored in your list, splits it at each comma, strips the leftover whitespace and newline from each piece, and collects the pieces into a new list.

sberry's all-in-one solution, which uses no intermediate list, is faster.
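
As a quick illustration of the flattening step, here is a minimal sketch. The contents of list1 are hard-coded from the sample in the question (an assumption standing in for reading the file), and the call to .strip() on each piece is there to drop the spaces and newlines that would otherwise remain.

```python
# Lines as the question's loop would store them: line.strip(',') keeps the newline.
list1 = [
    "ab initio\n",
    "a.C.\n",
    "acanka, acance, acanek\n",
]

# The outer loop walks the lines, the inner loop walks the comma-separated pieces.
cleaned = [x.strip() for y in list1 for x in y.split(',')]
print(cleaned)
```

Lines without any comma simply come back as a single piece, so they end up in the result unchanged apart from the trimmed newline.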

Upvotes: 0

sberry

Reputation: 132018

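Very close, but I think you want split instead of strip, and extend instead of append. strip only trims characters from the ends of a string, while split breaks the string into a list; extend then adds each item of that list rather than the list itself.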

You can also iterate directly over the lines with a for loop.

list1=[]
filepath="words.txt"
with open(filepath, encoding="utf8") as fp:  
   for line in fp:
       list1.extend(line.strip().split(', '))
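
To make the difference between the two pairs of methods concrete, here is a small sketch on a single line taken from the question's sample. The helper split_words is a hypothetical name of mine, not part of the answer; it shows one possible variant that splits on bare commas and trims each piece, which may help if the spacing after commas is inconsistent or a line ends with a comma.

```python
line = "acanka, acance, acanek\n"

# strip only trims characters from the ends; split cuts the string into pieces.
stripped = line.strip()              # 'acanka, acance, acanek'
pieces = line.strip().split(', ')    # ['acanka', 'acance', 'acanek']

# append adds the whole list as one element; extend adds each piece separately.
with_append = []
with_append.append(pieces)           # [['acanka', 'acance', 'acanek']]
with_extend = []
with_extend.extend(pieces)           # ['acanka', 'acance', 'acanek']


def split_words(lines):
    """Hypothetical helper: split on bare commas, then trim each piece."""
    words = []
    for line in lines:
        # Skip empty pieces so blank lines and trailing commas add nothing.
        words.extend(w.strip() for w in line.split(',') if w.strip())
    return words


print(split_words(["ab initio\n", "a.C.\n", "acanka,acance , acanek,\n", "\n"]))
```

Since a file object yields its lines when iterated, split_words(fp) should work inside the with block in the same way as the loop above.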

Upvotes: 5
