Reputation: 361
I am trying to open a text file, remove certain words that end with a ], and then write the new contents to a new file. With the following code, new_content contains what I need, and a new file is created, but it's empty. I cannot figure out why. I've tried indenting differently and passing in an encoding type, with no luck. Any help is greatly appreciated.
import glob
import os
import nltk, re, pprint
from nltk import word_tokenize, sent_tokenize
import pandas
import string
import collections

path = "/pathtofiles"
for file in glob.glob(os.path.join(path, '*.txt')):
    if file.endswith(".txt"):
        f = open(file, 'r')
        flines = f.readlines()
        for line in flines:
            content = line.split()
            for word in content:
                if word.endswith(']'):
                    content.remove(word)
            new_content = ' '.join(content)
        f2 = open((file.rsplit( ".", 1 )[ 0 ] ) + "_preprocessed.txt", "w")
        f2.write(new_content)
        f.close
Upvotes: 0
Views: 343
Reputation: 1423
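This should work, @firefly. In your version, new_content is reassigned on every line, so only the last line of the file survives; removing items from content while looping over it can skip words; and f.close without parentheses never calls the method, while f2 is never closed at all, so the buffered output may never be flushed to disk. The version below builds a new list instead of removing in place, collects every line, and lets with-blocks close both files. Happy to answer questions if you have them.

import glob
import os

path = "/pathtofiles"
for file in glob.glob(os.path.join(path, '*.txt')):
    if file.endswith(".txt"):
        # Read every line; the with-block closes the input file.
        with open(file, 'r') as f:
            flines = f.readlines()
        new_content = []
        for line in flines:
            content = line.split()
            # Build a new list rather than removing items while iterating.
            new_content_line = []
            for word in content:
                if not word.endswith(']'):
                    new_content_line.append(word)
            new_content.append(' '.join(new_content_line))
        # Write inside a with-block so the output is flushed and closed.
        with open((file.rsplit(".", 1)[0]) + "_preprocessed.txt", "w") as f2:
            f2.write('\n'.join(new_content))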
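
If it helps, the same filtering can also be split into two small functions, which makes the word-removal step easy to test without touching any files. This is only a sketch: the names strip_bracket_words and preprocess_file are made up for illustration, and it assumes the files are small enough to read into memory and use the default encoding.

```python
from pathlib import Path


def strip_bracket_words(text):
    """Return text with every whitespace-separated word ending in ']' removed."""
    cleaned_lines = []
    for line in text.splitlines():
        # Keep only the words that do not end with ']'.
        kept = [word for word in line.split() if not word.endswith("]")]
        cleaned_lines.append(" ".join(kept))
    return "\n".join(cleaned_lines)


def preprocess_file(src):
    """Write a cleaned copy of src next to it and return the new path."""
    src = Path(src)
    # "name.txt" becomes "name_preprocessed.txt" in the same directory.
    dst = src.with_name(src.stem + "_preprocessed.txt")
    # write_text opens, writes, and closes the file in one call.
    dst.write_text(strip_bracket_words(src.read_text()))
    return dst
```

Calling preprocess_file on each path returned by glob.glob(os.path.join(path, '*.txt')) would reproduce the loop above. Skipping names that already end in _preprocessed.txt would avoid reprocessing the output files on a second run.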
Upvotes: 1