JBWhitmore

Reputation: 12246

Reproduce the Unix cat command in Python

I am currently reproducing the following Unix command:

cat command.info fort.13 > command.fort.13

in Python with the following:

with open('command.fort.13', 'w') as outFile:
  with open('fort.13', 'r') as fort13, open('command.info', 'r') as com:
    for line in com.read().split('\n'):
      if line.strip() != '':
        print >>outFile, line
    for line in fort13.read().split('\n'):
      if line.strip() != '':
        print >>outFile, line

which works, but there has to be a better way. Any suggestions?

Edit (2016):

This question has started getting attention again after four years. I wrote up some thoughts in a longer Jupyter Notebook here.

The crux of the issue is that my question was really about the behavior of readlines, which I had not expected. The question could have been asked better, and the better answer to it would have been read().splitlines().
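To make the difference concrete, here is a small sketch comparing the three ways of getting lines. It is written for Python 3, and the sample text and variable names are made up for illustration:

```python
import io

text = "first\n\nsecond\n"

# readlines() keeps the trailing newline on every line.
with_newlines = io.StringIO(text).readlines()
print(with_newlines)      # ['first\n', '\n', 'second\n']

# split('\n') drops the newlines but leaves an empty string at the end
# when the text ends with a newline.
split_result = text.split('\n')
print(split_result)       # ['first', '', 'second', '']

# splitlines() drops the newlines and adds no trailing empty string.
splitlines_result = text.splitlines()
print(splitlines_result)  # ['first', '', 'second']
```

Note that none of the three removes the blank line in the middle; that still needs an explicit filter such as `if line.strip()`.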

Upvotes: 12

Views: 61277

Answers (6)

Jonathan Callen

Reputation: 11571

The easiest way might be simply to forget about the lines, and just read in the entire file, then write it to the output:

with open('command.fort.13', 'wb') as outFile:
    with open('command.info', 'rb') as com, open('fort.13', 'rb') as fort13:
        outFile.write(com.read())
        outFile.write(fort13.read())

As pointed out in a comment, this can cause high memory usage if either of the inputs is large (as it copies the entire file into memory first). If this might be an issue, the following will work just as well (by copying the input files in chunks):

import shutil
with open('command.fort.13', 'wb') as outFile:
    with open('command.info', 'rb') as com, open('fort.13', 'rb') as fort13:
        shutil.copyfileobj(com, outFile)
        shutil.copyfileobj(fort13, outFile)

Upvotes: 17

jfs

Reputation: 414179

#!/usr/bin/env python
import fileinput

for line in fileinput.input():
    print line,

Usage:

$ python cat.py command.info fort.13 > command.fort.13

Or to allow arbitrarily large lines:

#!/usr/bin/env python
import sys
from shutil import copyfileobj as copy

for filename in sys.argv[1:] or ["-"]:
    if filename == "-":
        copy(sys.stdin, sys.stdout)
    else:
        with open(filename, 'rb') as file:
            copy(file, sys.stdout)

The usage is the same.
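The scripts above are written for Python 2. A rough Python 3 port of the copyfileobj version might look like the sketch below. The `cat` helper and its `out` parameter are names chosen here for illustration, not part of the original answer. The main change is that binary data has to go to a binary stream, which for standard output is `sys.stdout.buffer`:

```python
import shutil
import sys


def cat(filenames, out):
    """Copy each named file, in order, to the binary stream `out`."""
    for filename in filenames:
        if filename == "-":
            # "-" means standard input, read as bytes.
            shutil.copyfileobj(sys.stdin.buffer, out)
        else:
            with open(filename, "rb") as infile:
                # Copies in chunks, so large files are not loaded into memory.
                shutil.copyfileobj(infile, out)
```

To use it as a script, call `cat(sys.argv[1:] or ["-"], sys.stdout.buffer)` at the bottom of the file and redirect the output as before.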

Or on Python 3.3 using os.sendfile():

#!/usr/bin/env python3.3
import os
import sys

output_fd = sys.stdout.buffer.fileno()
for filename in sys.argv[1:]:
    with open(filename, 'rb') as file:
        while os.sendfile(output_fd, file.fileno(), None, 1 << 30) != 0:
            pass

The above sendfile() call is written for Linux 2.6.33 or later, where the output descriptor may be any file rather than only a socket. In principle, sendfile() can be more efficient than a combination of read/write used by other approaches.

Upvotes: 8

Handyman5

Reputation: 129

List comprehensions are awesome for things like this:

with open('command.fort.13', 'w') as output:
  # Same order as the cat command: command.info first, then fort.13
  for f in ['command.info', 'fort.13']:
    with open(f) as infile:
      output.write(''.join([line for line in infile if line.strip()]))

Upvotes: 1

Ned Batchelder

Reputation: 375574

You can simplify this in a few ways:

with open('command.fort.13', 'w') as outFile:
  with open('fort.13', 'r') as fort13, open('command.info', 'r') as com:
    for line in com:
      if line.strip():
        # line already ends with a newline, so write it as-is
        outFile.write(line)
    for line in fort13:
      if line.strip():
        outFile.write(line)

More importantly, the shutil module has the copyfileobj function:

import shutil

with open('command.fort.13', 'w') as outFile:
  with open('command.info', 'r') as com:
    shutil.copyfileobj(com, outFile)
  with open('fort.13', 'r') as fort13:
    shutil.copyfileobj(fort13, outFile)

This doesn't skip the blank lines, but cat doesn't do that either, so I'm not sure you really want to.
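A quick way to see the difference is to run both approaches on the same in-memory text. This is only a sketch for Python 3; the sample text and variable names are invented for the example:

```python
import io
import shutil

source = "alpha\n\nbeta\n"

# copyfileobj copies the stream verbatim, blank line included, like cat.
verbatim = io.StringIO()
shutil.copyfileobj(io.StringIO(source), verbatim)

# Filtering on line.strip() drops the blank line.
filtered = io.StringIO()
for line in io.StringIO(source):
    if line.strip():
        filtered.write(line)

print(repr(verbatim.getvalue()))  # 'alpha\n\nbeta\n'
print(repr(filtered.getvalue()))  # 'alpha\nbeta\n'
```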

Upvotes: 1

kindall

Reputation: 184161

def cat(outfilename, *infilenames):
    with open(outfilename, 'w') as outfile:
        for infilename in infilenames:
            with open(infilename) as infile:
                for line in infile:
                    if line.strip():
                        outfile.write(line)

cat('command.fort.13', 'command.info', 'fort.13')

Upvotes: 8

Ignacio Vazquez-Abrams

Reputation: 798606

Iterating over a file yields lines.

for line in infile:
  outfile.write(line)

Upvotes: 1
