Reputation: 2533
In Python, why is os.lseek() so much slower than the seek() method on file-like objects?
$ dd if=/dev/urandom of=test.bin bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes transferred in 0.063247 secs (16579072 bytes/sec)
$ python -m timeit -s 'import os; f = open("test.bin", "r")' 'for i in xrange(10000): f.seek(i, os.SEEK_SET)'
100 loops, best of 3: 2.62 msec per loop
$ python -m timeit -s 'import os; f = os.open("test.bin", os.O_RDONLY)' 'for i in xrange(10000): os.lseek(f, i, os.SEEK_SET)'
100 loops, best of 3: 4.23 msec per loop
The docs for os.open() say "This function is intended for low-level I/O." I would think that "low-level I/O" would be faster.
I'm using CPython 2.7.9 on Mac OS 10.10.5 on a MacBook Pro with a solid-state drive.
Upvotes: 3
Views: 2130
Reputation: 102009
Low-level doesn't necessarily mean faster. It simply means low-level. Given that Python is primarily intended for high-level usage, the high-level APIs are generally pretty well optimized and avoid pitfalls that you'd have to handle yourself when writing the "equivalent" low-level code.
Now, os.open returns a file descriptor, which is an integer; that integer is what is actually passed to the system calls (and that's why it's called low-level). You generally don't want to handle file descriptors directly; leave that to the interpreter.
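To make the distinction concrete, here is a minimal sketch showing that os.open hands back a plain integer, while a file object merely wraps a descriptor of its own. The scratch file and the variable names are made up for this illustration; they are not part of the question's setup.

```python
import os
import tempfile

# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

fd = os.open(path, os.O_RDONLY)   # low-level: a plain integer
f = open(path, "rb")              # high-level: an object wrapping its own descriptor

fd_is_int = isinstance(fd, int)
wrapped_fd_is_int = isinstance(f.fileno(), int)

# os.lseek returns the new absolute offset; os.read continues from there.
new_offset = os.lseek(fd, 4, os.SEEK_SET)
data = os.read(fd, 3)

# Descriptors opened with os.open must be closed explicitly with os.close.
f.close()
os.close(fd)
os.remove(path)
```

Nothing ties the integer in fd to the file other than the operating system's descriptor table, which is why every os-level call has to take it as an argument.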
The open function returns a file object. The implementation of its seek method can be found here, and it's pretty straightforward: it does some error checking and in the end it calls _portable_fseek:
Py_DECREF(off_index);
if (PyErr_Occurred())
    return NULL;
FILE_BEGIN_ALLOW_THREADS(f)
errno = 0;
ret = _portable_fseek(f->f_fp, offset, whence);
FILE_END_ALLOW_THREADS(f)
if (ret != 0) {
    PyErr_SetFromErrno(PyExc_IOError);
    clearerr(f->f_fp);
    return NULL;
}
f->f_skipnextlf = 0;
Py_INCREF(Py_None);
return Py_None;
Where _portable_fseek is defined here and its implementation is really just:
static int
_portable_fseek(FILE *fp, Py_off_t offset, int whence)
{
#if !defined(HAVE_LARGEFILE_SUPPORT)
    return fseek(fp, offset, whence);
#elif defined(HAVE_FSEEKO) && SIZEOF_OFF_T >= 8
    return fseeko(fp, offset, whence);
#elif defined(HAVE_FSEEK64)
    return fseek64(fp, offset, whence);
#elif defined(__BEOS__)
    return _fseek(fp, offset, whence);
#elif SIZEOF_FPOS_T >= 8
    /* lacking a 64-bit capable fseek(), use a 64-bit capable fsetpos()
       and fgetpos() to implement fseek()*/
    fpos_t pos;
    switch (whence) {
    case SEEK_END:
#ifdef MS_WINDOWS
        fflush(fp);
        if (_lseeki64(fileno(fp), 0, 2) == -1)
            return -1;
#else
        if (fseek(fp, 0, SEEK_END) != 0)
            return -1;
#endif
        /* fall through */
    case SEEK_CUR:
        if (fgetpos(fp, &pos) != 0)
            return -1;
        offset += pos;
        break;
    /* case SEEK_SET: break; */
    }
    return fsetpos(fp, &offset);
#else
#error "Large file support, but no way to fseek."
#endif
}
The os.lseek function is instead defined here, and it's pretty much the same code, except that it does this:
if (!_PyVerify_fd(fd))
    return posix_error();
Py_BEGIN_ALLOW_THREADS
#if defined(MS_WIN64) || defined(MS_WINDOWS)
res = _lseeki64(fd, pos, how);
#else
res = lseek(fd, pos, how);
#endif
Py_END_ALLOW_THREADS
Note the call to _PyVerify_fd!
You could call os.lseek with any integer object, so the interpreter must verify that the integer actually refers to a valid file descriptor.
When using the file object you can just assume that the file descriptor associated with the file object is valid and avoid the check.
Hence, in this case the low-level function actually has to perform more error checking, which makes the operation slower.
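The point that os.lseek accepts any integer can be seen from Python itself. The following is only a sketch of the visible behaviour, with a made-up scratch file. It does not show the C-level check, and whether a bad descriptor is rejected by the interpreter or by the operating system depends on the platform.

```python
import errno
import os
import tempfile

# Scratch file so the example does not depend on test.bin existing.
fd, path = tempfile.mkstemp()
os.close(fd)  # fd is now just a stale integer

# os.lseek cannot know in advance that the integer is stale; the call fails
# with EBADF ("bad file descriptor") only once it is attempted.
try:
    os.lseek(fd, 0, os.SEEK_SET)
    lseek_errno = None
except OSError as exc:
    lseek_errno = exc.errno

# A file object tracks its own state: seeking after close() raises
# ValueError without involving a file descriptor at all.
f = open(path, "rb")
f.close()
try:
    f.seek(0)
    closed_seek_error = None
except ValueError as exc:
    closed_seek_error = type(exc).__name__

os.remove(path)
```

In both cases the error is reported per call; the difference is only in who owns the knowledge that the handle is still usable.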
There's also a third way to seek a file, which is to use the io library. The results are:
$ dd if=/dev/urandom of=test.bin bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.0851599 s, 12.3 MB/s
$ python2 -m timeit -s 'import io;import os; f=open("test.bin", "r")' 'for i in xrange(10000): f.seek(i, os.SEEK_SET)'
100 loops, best of 3: 5.72 msec per loop
$ python2 -m timeit -s 'import io;import os; f=os.open("test.bin", os.O_RDONLY)' 'for i in xrange(10000): os.lseek(f, i, os.SEEK_SET)'
100 loops, best of 3: 6.28 msec per loop
$ python2 -m timeit -s 'import io;import os; f=io.open("test.bin", "r")' 'for i in xrange(10000): f.seek(i, os.SEEK_SET)'
10 loops, best of 3: 63.8 msec per loop
Files opened with io.open take 10 times longer than regular files! However, if you look at how they are implemented here, you'll see that their implementation uses fairly high-level APIs and introduces quite a bit of overhead compared to the pure C versions.
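One way to get a feel for that overhead is to look at what io.open actually returns in text mode: a stack of wrapper objects rather than a single thin handle. This is a small sketch with a throwaway temporary file; it only lists the layers and says nothing about how much time each one costs.

```python
import io
import os
import tempfile

# Empty scratch file; only the object types matter here.
fd, path = tempfile.mkstemp()
os.close(fd)

with io.open(path, "r") as text:
    # Text layer -> buffered layer -> raw layer that owns the descriptor.
    layers = [
        type(text).__name__,
        type(text.buffer).__name__,
        type(text.buffer.raw).__name__,
    ]

os.remove(path)
print(layers)
```

A seek on the outermost object has to pass through each of these layers before it reaches the descriptor.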
Also note that on my machine there isn't a 2x difference between os.lseek and seek.
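Since the numbers clearly vary between machines, it may be worth rerunning the comparison locally. The script below is a rough sketch for Python 3, where the built-in open is io.open, so the Python 2 file object measured above is not available and the results are not directly comparable. The helper names (make_test_file, time_seeks, run_benchmark) are invented for this example, and the lambdas add a small constant overhead to every variant.

```python
import os
import tempfile
import timeit


def make_test_file(size=1024 * 1024):
    """Write `size` random bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


def time_seeks(seek, n_seeks, repeat, number):
    """Return the best per-loop time in seconds for n_seeks calls of seek(i)."""
    def loop():
        for i in range(n_seeks):
            seek(i)
    return min(timeit.repeat(loop, repeat=repeat, number=number)) / number


def run_benchmark(n_seeks=10000, repeat=3, number=5):
    """Time os.lseek against seek() on binary and text file objects."""
    path = make_test_file()
    fd = os.open(path, os.O_RDONLY)
    binary = open(path, "rb")
    # latin-1 maps every byte to one character, so any offset is seekable.
    text = open(path, "r", encoding="latin-1")
    try:
        return {
            "os.lseek": time_seeks(
                lambda i: os.lseek(fd, i, os.SEEK_SET), n_seeks, repeat, number),
            "binary seek": time_seeks(
                lambda i: binary.seek(i, os.SEEK_SET), n_seeks, repeat, number),
            "text seek": time_seeks(
                lambda i: text.seek(i, os.SEEK_SET), n_seeks, repeat, number),
        }
    finally:
        text.close()
        binary.close()
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print("%-12s %.2f msec per loop" % (name, seconds * 1e3))
```

Taking the minimum over several repeats mirrors what the timeit command line reports as "best of 3".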
Upvotes: 5