Reputation: 1097
I have a folder containing a set of text documents. I want to split each document into two or three smaller documents, each 45-70 KB in size.
How can I do it? I tried:
from itertools import chain

def split_file(filename, pattern, size):
    with open(filename, 'rb') as f:
        for index, line in enumerate(f, start=1):
            with open(pattern.format(index), 'wb') as out:
                n = 0
                for line in chain([line], f):
                    out.write(line)
                    n += len(line)
                    if n >= 450000 and n <= 700000:
                        break

if __name__ == '__main__':
    split_file('folderadress', 'part_{0:03d}.txt', 20000)
but it seems to me it's completely wrong.
Upvotes: 0
Views: 92
Reputation: 44354
This uses a different approach to yours. I have set the maximum size for each file to be 1000 bytes for testing purposes:
import glob
import os

dname = './gash'    # directory name
unit_size = 1000    # maximum file size in bytes

for fname in glob.iglob("%s/*" % dname):
    with open(fname, 'rb') as fo:
        data = True
        n = 1
        while data:
            # in binary mode, read returns b"" (falsy) on EOF
            data = fo.read(unit_size)
            if data:
                sub_fname = fname + str(n)
                with open(sub_fname, 'wb') as out:
                    out.write(data)
                n += 1
Note that this may split a line across two files; you do not say whether that would be a problem.
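If splitting a line across two files is a problem, one option is to cut only at line boundaries. The following is a minimal sketch of that idea, not a tested drop-in replacement: the function name `split_on_lines` and the `.partNNN` output naming are my own assumptions, and a part can still exceed `max_bytes` when a single line is longer than the limit.

```python
def split_on_lines(path, max_bytes):
    """Split `path` into numbered parts without breaking any line in two.

    Hypothetical helper for illustration. Returns the list of part file names.
    A part exceeds max_bytes only if a single line is longer than max_bytes.
    """
    parts = []
    out = None
    written = 0
    with open(path, 'rb') as src:
        for line in src:
            # Start a new part when this line would push a non-empty part over the limit.
            if out is None or (written and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                part_name = "%s.part%03d" % (path, len(parts) + 1)
                out = open(part_name, 'wb')
                parts.append(part_name)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

For the sizes in the question you might call it as `split_on_lines(fname, 70 * 1024)` inside the same `glob` loop; note that the last part can come out smaller than 45 KB, so the limit may need tuning per file.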
Upvotes: 2