Arsenal Fanatic

Reputation: 3813

Split a txt file into N lines each?

I would like to split a very large .txt file into equal parts, each containing N lines, and save the parts to a folder.

with open('eg.txt', 'r') as T:
    while True:
        next_n_lines = islice(T, 300)
        f = open("split" + str(x.pop()) + ".txt", "w")
        f.write(str(next_n_lines))
        f.close()

But this creates files that contain the text

"<itertools.islice object at 0x7f8fa94a4940>"

instead of the lines from the original file.

I would like to preserve the same structure and style maintained in the original txt file.

This code also does not terminate when it reaches the end of the file. If possible, I would like the code to stop writing files and quit when there is no data left to write.

Upvotes: 3

Views: 3500

Answers (2)

Padraic Cunningham

Reputation: 180540

You can use iter with islice to take n lines at a time, and enumerate to give each output file a unique name. f.writelines then writes each list of lines to a new file:

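from itertools import islice

with open('eg.txt') as T:
    # iter(callable, sentinel) keeps calling the lambda until it returns []
    for i, sli in enumerate(iter(lambda: list(islice(T, 300)), []), 1):
        with open("split_{}.txt".format(i), "w") as f:
            f.writelines(sli)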

Your code loops forever because it has no break condition. Using iter with an empty list as the sentinel means the loop ends once the file iterator has been exhausted.
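To see how the two-argument form iter(callable, sentinel) stops, here is a minimal sketch on an in-memory file. io.StringIO stands in for the real file and the chunk size of 2 is chosen only for illustration; neither comes from the question.

```python
from io import StringIO
from itertools import islice

# StringIO stands in for the real file object: 5 lines, chunk size 2.
fake_file = StringIO("a\nb\nc\nd\ne\n")

# Each call to the lambda consumes up to 2 lines from the file.
# Once the file is exhausted the lambda returns [], which equals the
# sentinel, so iter() stops.
chunks = list(iter(lambda: list(islice(fake_file, 2)), []))

print(chunks)
```

The last chunk is simply shorter than the others; nothing is padded or dropped.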

Also, if you want to write an islice object to a file, call writelines on it directly, i.e. f.writelines(next_n_lines), rather than writing str(next_n_lines).
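If you would rather keep the while loop from the question, one possible fix looks like the sketch below. The function name split_file and its parameters are hypothetical, introduced only for this example; the prefix may include a folder path if the parts should go into a specific directory.

```python
from itertools import islice


def split_file(src_path, lines_per_file, prefix="split_"):
    """Write consecutive chunks of src_path to prefix1.txt, prefix2.txt, ...

    Returns the list of paths written. (Hypothetical helper, for illustration.)
    """
    written = []
    with open(src_path) as src:
        part = 1
        while True:
            # Materialise the slice so we can test whether it is empty.
            next_n_lines = list(islice(src, lines_per_file))
            if not next_n_lines:
                break  # end of file: nothing left to write
            out_path = "{}{}.txt".format(prefix, part)
            with open(out_path, "w") as out:
                # writelines writes the lines unchanged, newlines included.
                out.writelines(next_n_lines)
            written.append(out_path)
            part += 1
    return written
```

For example, split_file('eg.txt', 300) would produce split_1.txt, split_2.txt and so on, with the final file holding whatever lines remain.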

Upvotes: 6

Kasravnd

Reputation: 107347

The problem is that itertools.islice returns an iterator, and you are writing its str to the file. For an islice object that is just the default object representation in Python (the type name and the object's memory address):

<itertools.islice object at 0x7f8fa94a4940>

As a more Pythonic way of slicing an iterator into equal parts, you can use the following grouper function, which is given in the itertools recipes in the Python documentation:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
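One thing to keep in mind is that zip_longest pads the final group with fillvalue when the input length is not a multiple of n. A small sketch of that behaviour, using a short list of lines and a group size of 2 purely for illustration:

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Five lines split into groups of two: the last group is padded with None.
lines = ["a\n", "b\n", "c\n", "d\n", "e\n"]
groups = list(grouper(lines, 2))

# Strip the padding before writing, otherwise writelines() would fail on None.
cleaned = [[line for line in group if line is not None] for group in groups]

print(groups)
print(cleaned)
```

So when the groups are written out, the None padding has to be filtered from the last one first.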

You can pass your file object as the iterable to this function, then loop over the result and write each partition to an output file:

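with open('eg.txt', 'r') as T:
    for i, partition in enumerate(grouper(T, 300), 1):
        # The last partition is padded with None (the fillvalue), so drop it.
        # Join or modify the lines here as you like before writing them out.
        lines = [line for line in partition if line is not None]
        with open("split_{}.txt".format(i), "w") as f:
            f.writelines(lines)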

Upvotes: 2
