Reputation: 2876
I have a set of text files, and I want to combine them in groups like this:
Inv030001.txt - should have all data from files starting with Inv030001
Inv030002.txt - should have all data from files starting with Inv030002
I tried the code below, but it's not working:
filenames = glob(textfile_dir+'*.txt')
for fname in filenames:
    filename = fname.split('\\')[-1]
    current_invoice_number = (filename.split('_')[0]).split('.')[0]
    prev_invoice_number = current_invoice_number
    with open(textfile_dir + current_invoice_number+'.txt', 'w') as outfile:
        for eachfile in fnmatch.filter(os.listdir(textfile_dir), '*[!'+current_invoice_number+'].txt'):
            current_invoice_number = (eachfile.split('_')[0]).split('.')[0]
            if(current_invoice_number == prev_invoice_number):
                with open(textfile_dir+eachfile) as infile:
                    for line in infile:
                        outfile.write(line)
                prev_invoice_number = current_invoice_number
            else:
                with open(textfile_dir+eachfile) as infile:
                    for line in infile:
                        outfile.write(line)
                prev_invoice_number = current_invoice_number
                #break;
Upvotes: 0
Views: 361
Reputation: 2876
Below is the working code, in case someone is looking for the same solution.
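import fileinput
import os
from collections import defaultdict
from glob import glob

# textfile_dir is the directory from the question, ending in a path separator
filenames = glob(textfile_dir+'*.txt')
dd = defaultdict(list)
for filename in filenames:
    name, ext = os.path.splitext(filename)
    # group by the part of the file name before the first underscore
    name = os.path.basename(name).split('_')[0]
    dd[name].append(filename)
for key, fnames in dd.items():
    with open(textfile_dir+key+'.txt', "w") as newfile:
        for line in fileinput.FileInput(fnames):
            newfile.write(line)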
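For anyone adapting this, here is a minimal sketch of the same grouping idea wrapped in a function and written with pathlib. The function name combine_invoice_files and the rule of skipping files without an underscore (so that a combined file left by an earlier run is not read back in) are my own assumptions, not something the original files are known to require:

```python
from collections import defaultdict
from pathlib import Path


def combine_invoice_files(textfile_dir):
    """Merge every '<invoice>_<suffix>.txt' file into '<invoice>.txt'."""
    directory = Path(textfile_dir)
    groups = defaultdict(list)
    # Sort so the parts are concatenated in a predictable (lexicographic) order.
    for path in sorted(directory.glob("*.txt")):
        # Assumption: part files contain an underscore; files without one
        # (such as a combined file from an earlier run) are skipped.
        if "_" not in path.stem:
            continue
        groups[path.stem.split("_")[0]].append(path)

    written = []
    for invoice, parts in groups.items():
        target = directory / f"{invoice}.txt"
        # 'w' overwrites the combined file, so reruns do not duplicate data.
        with target.open("w") as out:
            for part in parts:
                out.write(part.read_text())
        written.append(target)
    return written
```

Calling combine_invoice_files(textfile_dir) returns the list of combined files it wrote. If some of your source files have no underscore in their names, the skip rule would need to change.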
Upvotes: 0
Reputation: 5541
Does this answer your question? My version will append the data from "like" invoice numbers to a .txt file named with just the invoice number. In other words, anything that starts with "Inv030001" will have its contents appended to "Inv030001.txt". The idea is that you likely don't want to overwrite files and possibly destroy them if your write logic has a mistake.
I actually recreated your files to test this, and I did exactly what I suggested you do: I treated every part as a separate task and built it up to this. In doing that, the script became far less verbose and convoluted. I labeled all of my comments with "task" to pound it in that this is just a series of very simple things.
I also renamed your vars to what they actually are. For instance, filenames aren't filenames at all; they are entire paths.
import os
from glob import glob

# you'll have to change this path to yours
root = os.path.join(os.getcwd(), 'texts/')
# sorting this may be redundant
paths = sorted(glob(root+'*.txt'))
for path in paths:
    # task: get filename (os.path.basename works on any OS, unlike splitting on '\\')
    filename = os.path.basename(path)
    # task: get invoice number
    invnum = filename.split('_')[0]
    # task: open in and out files
    with open(f'{root}{invnum}.txt', 'a') as out_, open(path, 'r') as in_:
        # task: append in contents to out
        out_.write(in_.read())
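The "get filename" and "get invoice number" tasks are easy to check on their own before touching any files. A minimal sketch, assuming names shaped like Inv030001_<something>.txt; the helper name invoice_number is made up for this example:

```python
import os


def invoice_number(path):
    """Return the part of the file name before the first underscore."""
    # os.path.basename drops the directory part using the current OS's rules.
    filename = os.path.basename(path)
    # Remove the extension so a name with no underscore still comes out clean.
    stem, _ext = os.path.splitext(filename)
    return stem.split("_")[0]


print(invoice_number(os.path.join("texts", "Inv030001_1.txt")))
```

Stripping the extension first is a small design choice: it means a file named just Inv030001.txt maps to Inv030001 rather than to Inv030001.txt.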
Upvotes: 1
Reputation: 4453
Your code may have been more complicated than it needed to be. The idea is that for every file in the directory, you just add its contents (that is, append) to the invoice file.
from glob import glob
import os

# I changed this to os.sep since I'm on a Mac - hopefully works on Windows, too
textfile_dir = "invs" + os.sep
filenames = glob(textfile_dir+'*.txt')
for fname in filenames:
    filename = fname.split(os.sep)[-1]
    current_invoice_number = (filename.split('_')[0]).split('.')[0]
    with open(textfile_dir + current_invoice_number+'.txt', 'a') as outfile:
        with open(fname) as infile:
            for line in infile:
                outfile.write(line)
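Some room for improvement: note the 'a' (append) mode when we open the files for writing. Running the script a second time appends the same data again, so clear out any previously combined files first.
Upvotes: 0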