Reputation: 1
I have a collection of 20 text files in a folder. I am trying to create a dictionary from them and write that dictionary to a text file.
I wrote code that works for a single file in the directory when I enter its filename. However, it does not let me enter multiple text files at once, and if I run it on each file individually, the outputs overwrite each other. I tried switching the file input to import os and reading from my cwd, but I am running into errors with variables and am not sure what I am doing wrong.
fname = input('Enter File: ')
hand = open(fname)
di = dict()
for lin in hand:
    lin = lin.rstrip()
    wds = lin.split()
    for w in wds:
        di[w] = di.get(w, 0) + 1
print(di)
largest = -1
theword = None
for k, v in di.items():
    if v > largest:
        largest = v
        theword = k
print(theword, largest)
f = open("output.txt", "w")
f.write(str(di))
f.close()
I tried adding
import os
for filename in os.listdir(os.getcwd()):
    fname = ('*.txt')
    hand = open(fname)
to the top, but it errors out: '*.txt' is not recognized as the wildcard I expected, so fname is never assigned the file being read.
Upvotes: 0
Views: 7683
Reputation: 23463
import glob
# a list of all .txt files in the current directory
files = glob.glob("*.txt")
# the dictionary that will hold the file names (keys) and contents (values)
dic = {}
# loop over the files and open each one
for file in files:
    with open(file, 'r', encoding='utf-8') as read:
        # the key is the file name, the value is the content
        dic[file] = read.read()
    # for each file, append the name and the content to output.txt
    with open("output.txt", "a", encoding='utf-8') as output:
        output.write(file + "\n" + dic[file] + "\n\n")
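If the goal is a separate word count for each file rather than its raw text, the same kind of loop can store a count dictionary under each file name. This is only a minimal sketch: it assumes UTF-8 files and whitespace-separated words, and the name count_words_per_file is made up for illustration.

```python
import glob
import os
from collections import Counter


def count_words_per_file(pattern="*.txt"):
    """Return {file name: {word: count}} for every file matching pattern.

    Hypothetical helper for illustration only.
    """
    counts = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as handle:
            # split() with no arguments splits on any run of whitespace
            words = handle.read().split()
        counts[os.path.basename(path)] = dict(Counter(words))
    return counts


# Example use (reads every .txt file in the current directory):
#     for name, words in count_words_per_file().items():
#         print(name, words)
```

Keep in mind that an output.txt written to the same folder also matches *.txt, so on a second run it would be counted too unless it is written elsewhere or skipped.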
Upvotes: 0
Reputation: 5372
If you are using Python 3.4 or higher, your code can be greatly simplified by using pathlib.Path() and collections.Counter():
from pathlib import Path
from collections import Counter
counter = Counter()
directory = Path('dir')
out_file = Path('output.txt')
for file in directory.glob('*.txt'):
    with file.open('r', encoding='utf-8') as f:
        for l in f:
            counter.update(l.strip().split())
# show the ten most common words
print(counter.most_common(10))
with out_file.open('w', encoding='utf-8') as f:
    # write() needs a string, so convert the counts first
    f.write(str(dict(counter)))
If you are on Python 3.5 or higher, that code can be even simpler:
from pathlib import Path
from collections import Counter
counter = Counter()
directory = Path('dir')
out_file = Path('output.txt')
for file in directory.glob('*.txt'):
    counter.update(file.read_text(encoding='utf-8').split())
# show the ten most common words
print(counter.most_common(10))
# write_text() needs a string, so convert the counts first
out_file.write_text(str(dict(counter)), encoding='utf-8')
And here is a sample session:
>>> from pathlib import Path
>>> from collections import Counter
>>> counter = Counter()
>>> file = Path('t.txt')
>>> file.is_file()
True
>>> with file.open('r', encoding='utf-8') as f:
...     for l in f:
...         counter.update(l.strip().split())
...
>>> counter.most_common(5)
[('is', 10), ('better', 8), ('than', 8), ('to', 5), ('the', 5)]
>>>
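Converting the whole counter to a string with str() gives one long line in the output file. If a more readable file is wanted, one option is to write one word per line, most frequent first. This is a sketch only: the tab-separated layout is an assumption, and write_counts is a name invented here.

```python
from collections import Counter
from pathlib import Path


def write_counts(counter, out_path):
    """Write 'word<TAB>count' lines, most frequent first (hypothetical helper)."""
    lines = ["{}\t{}".format(word, count) for word, count in counter.most_common()]
    # Path.write_text() requires Python 3.5 or higher
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Example use:
#     write_counts(Counter("is better than is".split()), "output.txt")
```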
Upvotes: 1
Reputation: 414
You can loop through every .txt file in your directory and print its contents or store them in a dictionary or variable.
import os
for filename in os.listdir(os.getcwd()):
    name, file_extension = os.path.splitext(filename)
    if file_extension == '.txt':
        with open(filename) as hand:
            for line in hand:
                # each line already ends with a newline
                print(line, end='')
Upvotes: 1
Reputation: 249592
If you want to use wildcards, you need the glob module. But in your case it sounds like you just want all files in one directory, so:
import os
for filename in os.listdir('.'):  # . is cwd
    hand = open(filename)
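For comparison, both approaches can be limited to .txt files. The following is a minimal sketch, not a definitive implementation; the two helper names are made up for illustration.

```python
import glob
import os


def txt_files_with_glob(directory="."):
    """Let glob expand the *.txt wildcard; paths include the directory."""
    return sorted(glob.glob(os.path.join(directory, "*.txt")))


def txt_files_with_listdir(directory="."):
    """List every entry, then keep regular files whose name ends in .txt."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".txt")
        and os.path.isfile(os.path.join(directory, name))
    )
```

The two can give different results for names that start with a dot: glob's * does not match a leading dot, while os.listdir() returns such entries like any other.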
Upvotes: 0