Why is os.lseek() slower than seek() on file-like objects?

Question

In Python, why is os.lseek() so much slower than the seek() method on file-like objects?

$ dd if=/dev/urandom of=test.bin bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes transferred in 0.063247 secs (16579072 bytes/sec)
$ python -m timeit -s 'import os; f = open("test.bin", "r")' 'for i in xrange(10000): f.seek(i, os.SEEK_SET)'
100 loops, best of 3: 2.62 msec per loop
$ python -m timeit -s 'import os; f = os.open("test.bin", os.O_RDONLY)' 'for i in xrange(10000): os.lseek(f, i, os.SEEK_SET)'
100 loops, best of 3: 4.23 msec per loop

The docs for os.open() say "This function is intended for low-level I/O." I would think that "low-level I/O" would be faster.

I'm using CPython 2.7.9 on Mac OS 10.10.5 on a MacBook Pro with a solid-state drive.

Bakuriu · Accepted Answer

Low-level doesn't necessarily mean faster. It simply means low-level. Given that Python is primarily intended for high-level usage, the high-level APIs are generally well optimized and avoid pitfalls that you'd have to handle yourself when writing the "equivalent" low-level code.

Now, os.open returns a file descriptor, an integer that is what actually gets passed to the system calls (which is why it's called low-level). You generally don't want to handle file descriptors directly; leave that to the interpreter.
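
To make the distinction concrete, here is a minimal sketch that opens the same file both ways. The scratch file and the variable names (fd, f, wrapped_fd and so on) are my own, chosen for illustration; only os.open, os.lseek, open and the file methods come from the standard library.

```python
import os
import tempfile

# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)  # low-level: a plain integer descriptor
f = open(path, "rb")             # high-level: a file object wrapping a descriptor

fd_is_int = isinstance(fd, int)
wrapped_fd = f.fileno()          # the descriptor hidden inside the file object

# Both calls move to offset 4; os.lseek returns the new offset directly.
new_pos = os.lseek(fd, 4, os.SEEK_SET)
f.seek(4, os.SEEK_SET)
obj_pos = f.tell()

# With a raw descriptor, cleanup is entirely up to you.
os.close(fd)
f.close()
os.remove(path)
```

Note that nothing ties the integer in fd to the file: it is just a number, and any other integer could be passed to os.lseek in its place.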

The open function returns a file object. The implementation of its seek method can be found in Objects/fileobject.c in the CPython 2.7 source, and it's pretty straightforward: it does some error checking and in the end it calls _portable_fseek:

Py_DECREF(off_index);
if (PyErr_Occurred())
    return NULL;

FILE_BEGIN_ALLOW_THREADS(f)

errno = 0;
ret = _portable_fseek(f->f_fp, offset, whence);
FILE_END_ALLOW_THREADS(f)

if (ret != 0) {
    PyErr_SetFromErrno(PyExc_IOError);
    clearerr(f->f_fp);
    return NULL;
}

f->f_skipnextlf = 0;
Py_INCREF(Py_None);
return Py_None;

_portable_fseek is defined in the same file, and its implementation is really just:

static int
_portable_fseek(FILE *fp, Py_off_t offset, int whence)
{
#if !defined(HAVE_LARGEFILE_SUPPORT)
    return fseek(fp, offset, whence);

#elif defined(HAVE_FSEEKO) && SIZEOF_OFF_T >= 8
    return fseeko(fp, offset, whence);

#elif defined(HAVE_FSEEK64)
    return fseek64(fp, offset, whence);

#elif defined(__BEOS__)
    return _fseek(fp, offset, whence);

#elif SIZEOF_FPOS_T >= 8
    /* lacking a 64-bit capable fseek(), use a 64-bit capable fsetpos()
       and fgetpos() to implement fseek()*/
    fpos_t pos;
    switch (whence) {
    case SEEK_END:
#ifdef MS_WINDOWS
        fflush(fp);
        if (_lseeki64(fileno(fp), 0, 2) == -1)
            return -1;
#else
        if (fseek(fp, 0, SEEK_END) != 0)
            return -1;
#endif
        /* fall through */
    case SEEK_CUR:
        if (fgetpos(fp, &pos) != 0)
            return -1;
        offset += pos;
        break;
    /* case SEEK_SET: break; */
    }
    return fsetpos(fp, &offset);
#else
#error "Large file support, but no way to fseek."
#endif
}

The os.lseek function is instead defined in Modules/posixmodule.c, and it's pretty much the same code except that it does this:

    if (!_PyVerify_fd(fd))
        return posix_error();
    Py_BEGIN_ALLOW_THREADS
#if defined(MS_WIN64) || defined(MS_WINDOWS)
    res = _lseeki64(fd, pos, how);
#else
    res = lseek(fd, pos, how);
#endif
    Py_END_ALLOW_THREADS

Note the call to _PyVerify_fd!

You could call os.lseek with any integer object and so the interpreter must verify that:

The integer is in the right range
It references an existing open file descriptor

When using the file object you can just assume that the file descriptor associated with the file object is valid and avoid the check.

Hence in this case the low-level function actually has to perform more error checking making the operation slower.
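
The difference in what each API can assume is visible from Python, too. The following is only a sketch of the observable behaviour, not of the internal check itself; stale_fd and the other names are illustrative. A closed descriptor is still a perfectly good integer, so os.lseek has to reject it at call time, while a file object tracks its own state.

```python
import errno
import os
import tempfile

# Scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

# Get a descriptor number, then close it: the integer survives,
# but it no longer refers to an open file.
stale_fd = os.open(path, os.O_RDONLY)
os.close(stale_fd)

try:
    os.lseek(stale_fd, 0, os.SEEK_SET)
    lseek_errno = None
except OSError as exc:
    lseek_errno = exc.errno  # "bad file descriptor"

# A file object knows it has been closed and refuses to seek
# without ever handing a descriptor to the OS.
f = open(path, "rb")
f.close()
try:
    f.seek(0)
    seek_error = None
except ValueError as exc:
    seek_error = type(exc).__name__

os.remove(path)
```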

There's also a third way to seek a file, which is to use the io library. The results are:

$ dd if=/dev/urandom of=test.bin bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.0851599 s, 12.3 MB/s
$ python2 -m timeit -s 'import io;import os; f=open("test.bin", "r")' 'for i in xrange(10000): f.seek(i, os.SEEK_SET)'
100 loops, best of 3: 5.72 msec per loop
$ python2 -m timeit -s 'import io;import os; f=os.open("test.bin", os.O_RDONLY)' 'for i in xrange(10000): os.lseek(f, i, os.SEEK_SET)'
100 loops, best of 3: 6.28 msec per loop
$ python2 -m timeit -s 'import io;import os; f=io.open("test.bin", "r")' 'for i in xrange(10000): f.seek(i, os.SEEK_SET)'
10 loops, best of 3: 63.8 msec per loop

They take 10 times longer than regular files! However, if you look at how they are implemented, you'll see that their implementation uses fairly high-level APIs and introduces quite a bit of overhead compared to the pure C versions.

Also note that on my machine there isn't a 2x difference between os.lseek and seek.
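
If you'd rather rerun the comparison from a script than from the shell, something like the following should work. It is a rough sketch using the timeit module; the helper names and the temporary file are my own, and the numbers will vary by machine and Python version. It opens the file in binary mode, unlike the commands above. On Python 3 the built-in open is io.open, so the first and third timings measure the same code path there; the three-way difference only shows up on Python 2.

```python
import io
import os
import tempfile
import timeit

# 1 MiB of random data, standing in for the test.bin created with dd.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1024 * 1024))
    path = tmp.name

N = 10000   # seeks per loop, as in the shell commands
LOOPS = 5   # loops per timing run

def seek_file_object(fobj):
    for i in range(N):
        fobj.seek(i, os.SEEK_SET)

def seek_descriptor(fd):
    for i in range(N):
        os.lseek(fd, i, os.SEEK_SET)

def best_per_loop(func):
    # Best of 3 runs, reported as seconds per loop of N seeks.
    return min(timeit.repeat(func, number=LOOPS, repeat=3)) / LOOPS

f = open(path, "rb")
fd = os.open(path, os.O_RDONLY)
g = io.open(path, "rb")
results = {}
try:
    results["open().seek"] = best_per_loop(lambda: seek_file_object(f))
    results["os.lseek"] = best_per_loop(lambda: seek_descriptor(fd))
    results["io.open().seek"] = best_per_loop(lambda: seek_file_object(g))
finally:
    f.close()
    os.close(fd)
    g.close()
    os.remove(path)

for name in sorted(results):
    print("%-16s %.2f msec per loop" % (name, results[name] * 1000))
```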
