john doe

Reputation: 2253

How to "join" two text files with python?

I have two text files like this. txt1:

Foo
Foo
Foo
Foo

txt2:

Bar
Bar
Bar
Bar

How can I concatenate them line by line into a new file, with one file's lines on the left and the other's on the right, like this:

Bar Foo
Bar Foo
Bar Foo
Bar Foo

I tried the following:

folder = ['/Users/user/Desktop/merge1.txt', '/Users/user/Desktop/merge2.txt']
with open('/Users/user/Desktop/merged.txt', 'w') as outfile:
    for file in folder:
        with open(file) as newfile:
            for line in newfile:
                outfile.write(line)

Upvotes: 4

Views: 8753

Answers (3)

thefourtheye

Reputation: 239473

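Use itertools.izip to combine the lines from both files, like this (izip exists only in Python 2; on Python 3 the built-in zip is already lazy and can be used in its place):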

from itertools import izip
with open('res.txt', 'w') as res, open('in1.txt') as f1, open('in2.txt') as f2:
    for line1, line2 in izip(f1, f2):
        res.write("{} {}\n".format(line1.rstrip(), line2.rstrip()))

Note: This solution writes lines only until the shorter file is exhausted. For example, if the second file contains 1000 lines and the first has only 2, then only two lines from each file are copied to the result. If you also want the remaining lines of the longer file, use itertools.izip_longest (itertools.zip_longest on Python 3), like this

from itertools import izip_longest
with open('res.txt', 'w') as res, open('in1.txt') as f1, open('in2.txt') as f2:
    for line1, line2 in izip_longest(f1, f2, fillvalue=""):
        res.write("{} {}\n".format(line1.rstrip(), line2.rstrip()))

In this case, the remaining lines of the longer file are still copied after the shorter file is exhausted, and the fillvalue stands in for the missing lines of the shorter file.
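
For Python 3, the two variants above could be folded into one helper along these lines. This is only a sketch: the function name join_files and the keep_longest flag are made up for illustration and are not part of itertools; the only library calls used are the built-in zip and itertools.zip_longest.

```python
from itertools import zip_longest


def join_files(left_path, right_path, out_path, keep_longest=False):
    """Write 'left right' pairs, one pair per line, to out_path.

    join_files and keep_longest are hypothetical names for this sketch.
    """
    with open(left_path) as left, open(right_path) as right, \
            open(out_path, "w") as out:
        if keep_longest:
            # Keep going until the longer file ends; pad the shorter with "".
            pairs = zip_longest(left, right, fillvalue="")
        else:
            # Stop as soon as either file runs out of lines.
            pairs = zip(left, right)
        for left_line, right_line in pairs:
            out.write("{} {}\n".format(left_line.rstrip(), right_line.rstrip()))
```

To get the "Bar Foo" layout from the question, pass the file holding the Bar lines as left_path and the file holding the Foo lines as right_path.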

Upvotes: 8

Fabrício Pereira

Reputation: 1640

Here is a script that solves this: https://gist.github.com/fabriciorsf/92c5fb1a7d9f001f777813a79e681d8b

#!/usr/bin/env python

'''
Merge/Join/Combine lines of multiple input files.
Write to the output file lines built from the corresponding lines of each input file, separated by a delimiter (a space by default).
TODO: implement parameters like those of https://github.com/coreutils/coreutils/blob/master/src/paste.c
'''

import sys
from contextlib import ExitStack
from itertools import zip_longest


def main(args):
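  if len(args) < 3:
    print(sys.argv[0] + ' <input-file-1> <input-file-2> [<input-file-n>...] <output-file>')
    sys.exit(1)  # too few arguments is an error, so exit non-zero
  # every argument but the last is an input file; the last is the output file
  mergeFiles(args[:-1], args[-1])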


def mergeFiles(inputFileNames, outputFileName, delimiterChar=" ", fillValue="-"):
  # ExitStack closes every input file on exit, however many were opened
  with ExitStack() as eStack:
    inputFiles = [eStack.enter_context(open(fileName, 'r', encoding='utf-8', errors='replace')) for fileName in inputFileNames]
    with open(outputFileName, 'w', encoding='utf-8', errors='replace') as outputFile:
      # zip_longest pads exhausted files with fillValue so no line is dropped
      for tupleOfLineFiles in zip_longest(*inputFiles, fillvalue=fillValue):
        outputFile.write(delimiterChar.join(map(str.strip, tupleOfLineFiles)) + "\n")

if __name__ == "__main__":
  main(sys.argv[1:])
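
To see what the fill value does when the inputs differ in length, here is a small in-memory sketch of the same idea. The helper merge_lines is a hypothetical name used only for this illustration, and the lists stand in for open files.

```python
from itertools import zip_longest


def merge_lines(*columns, delimiter=" ", fill="-"):
    """Join the N-th entry of every column; pad short columns with fill."""
    return [
        delimiter.join(entry.strip() for entry in row)
        for row in zip_longest(*columns, fillvalue=fill)
    ]


# Three "Bar" lines against a single "Foo" line: the missing entries become "-".
print(merge_lines(["Bar\n", "Bar\n", "Bar\n"], ["Foo\n"]))
# → ['Bar Foo', 'Bar -', 'Bar -']
```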

Upvotes: 1

Kasravnd

Reputation: 107287

You can use zip to pair up the lines of both files, then concatenate each pair and write it to your outfile:

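folder = ['/Users/user/Desktop/merge1.txt', '/Users/user/Desktop/merge2.txt']
with open('/Users/user/Desktop/merged.txt', 'w') as outfile:
    # Open both inputs at once; looping over folder and indexing each path
    # would give single characters, not file names.
    with open(folder[0]) as newfile, open(folder[1]) as newfile1:
        # zip pairs line N of the first file with line N of the second
        for line in zip(newfile, newfile1):
            # strip both halves so a missing final newline cannot fuse rows
            outfile.write(line[0].rstrip() + " " + line[1].rstrip() + "\n")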

Upvotes: 3
