Reputation: 1
I am splitting a large wordlist by word length. I couldn't find another approach, so I decided to write a Python script for it.
Say test.txt contains:
word
words
I want the script to create a new text file for each line length and write each line to the matching file:
4.txt
word
5.txt
words
Code:
import os
import sys
basefile = open(sys.argv[1],'rt')
print("Writing.....")
os.mkdir(str(os.path.splitext(sys.argv[1])[0]))
os.chdir(os.path.splitext(sys.argv[1])[0])
#print(basefile)
for line in basefile:
    cpyfile=open(str(len(line.strip()))+'.txt',mode = 'a',encoding = 'utf-8')
    cpyfile.write(line)
    cpyfile.close()
print("Done")
basefile.close()
It works for small files, but for larger files it fails after a while with an error such as
PermissionError: [Errno 13] Permission denied: '10.txt'
or
PermissionError: [Errno 13] Permission denied: '11.txt'
The file named in the error is completely random, and the lines written before the failure are all correct.
I have tried it on Windows using both PowerShell and Git Bash.
Any help is appreciated. Thanks.
Upvotes: 0
Views: 125
Reputation: 1380
I suspect you are running into the issue that Windows can refuse to open a file that another program currently has open. I'm not sure what the second program would be. Maybe a virus scanner? Your program works unaltered on Ubuntu using /usr/share/dict/american-english, so I think this may be a Windows thing.
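If some other program really is holding the file briefly, one way to check that guess is to retry the open a few times before giving up. This is only a sketch under that assumption: `open_with_retry`, `attempts`, and `delay` are hypothetical names I made up, not part of your script or of any library.

```python
import time


def open_with_retry(path, mode="a", encoding="utf-8", attempts=5, delay=0.1):
    """Open path, retrying a few times if PermissionError is raised.

    Hypothetical helper: the attempt count and delay are arbitrary guesses.
    """
    for attempt in range(attempts):
        try:
            return open(path, mode, encoding=encoding)
        except PermissionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(delay * (attempt + 1))  # wait a little longer each time
```

If the error disappears with a helper like this in place of the plain open() call, that would support the idea that something else is touching the files for a moment.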
In any case, I think you can solve this by keeping each output file open for the whole run instead of reopening it for every line.
import os
import sys
basefile = open(sys.argv[1], 'rt')
print("Writing.....")
os.mkdir(str(os.path.splitext(sys.argv[1])[0]))
os.chdir(os.path.splitext(sys.argv[1])[0])
# print(basefile)
files = {}
try:
    for line in basefile:
        cpyfilename = str(len(line.strip()))+'.txt'
        cpyfile = files.get(cpyfilename)
        if cpyfile is None:
            cpyfile = open(cpyfilename, mode='a', encoding='utf-8')
            files[cpyfilename] = cpyfile
        cpyfile.write(line)
finally:
    for cpyfile in files.values():
        # Not strictly necessary because the program is about to end and
        # auto-close the files.
        cpyfile.close()
print("Done")
basefile.close()
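The same keep-the-files-open idea can also be written with contextlib.ExitStack, which closes every registered file when the with block exits, even if an error occurs partway through. This is a minimal sketch, not a drop-in replacement: `split_by_length` and its parameters are names I invented, it assumes the input is UTF-8, it avoids os.chdir by joining paths, and it writes the stripped word plus a newline rather than the raw line.

```python
import os
from contextlib import ExitStack


def split_by_length(src_path, out_dir):
    """Append each line of src_path to out_dir/<stripped length>.txt.

    Hypothetical helper; returns the sorted list of lengths seen.
    """
    os.makedirs(out_dir, exist_ok=True)
    with ExitStack() as stack:
        # Assumption: the wordlist is UTF-8 encoded.
        src = stack.enter_context(open(src_path, "rt", encoding="utf-8"))
        handles = {}  # word length -> open output file
        for line in src:
            word = line.strip()
            length = len(word)
            out = handles.get(length)
            if out is None:
                # Open each output file once and keep it open until the end.
                out_path = os.path.join(out_dir, f"{length}.txt")
                out = stack.enter_context(
                    open(out_path, mode="a", encoding="utf-8")
                )
                handles[length] = out
            out.write(word + "\n")
    # Leaving the with block has closed the source and every output file.
    return sorted(handles)
```

It could be called as split_by_length(sys.argv[1], os.path.splitext(sys.argv[1])[0]) to mirror the directory naming used above.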
Upvotes: 1