Sayalic

Reputation: 7650

Why is using the Python mmap module much slower than calling POSIX mmap from C++?

C++ code:

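#include <cstdio>      // printf
#include <fcntl.h>     // open, O_* flags
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // S_IRUSR and the other mode bits
#include <sys/time.h>  // gettimeofday, timersub
#include <unistd.h>    // ftruncate, close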

using namespace std;
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)

int main() {
    timeval tv1, tv2, tv3, tve;
    gettimeofday(&tv1, 0);
    int size = 0x1000000;
    int fd = open("data", O_RDWR | O_CREAT | O_TRUNC, FILE_MODE);
    ftruncate(fd, size);
    char *data = (char *) mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    for(int i = 0; i < size; i++) {
        data[i] = 'S';
    }
    munmap(data, size);
    close(fd);
    gettimeofday(&tv2, 0);
    timersub(&tv2, &tv1, &tve);
    printf("Time elapsed: %ld.%06lds\n", (long int) tve.tv_sec, (long int) tve.tv_usec);
}

Python code:

import mmap
import time

t1 = time.time()
size = 0x1000000

f = open('data/data', 'w+')
f.truncate(size)
f.close()

file = open('data/data', 'r+b')
buffer = mmap.mmap(file.fileno(), 0)

for i in xrange(size):
    buffer[i] = 'S'

buffer.close()
file.close()
t2 = time.time()
print "Time elapsed: %.3fs" % (t2 - t1)

I think these two programs are essentially the same, since both C++ and Python end up calling the same system call (mmap).

But the Python version is much slower than the C++ one:

Python: Time elapsed: 1.981s
C++:    Time elapsed: 0.062143s

Could anyone please explain why Python's mmap is so much slower than the C++ version?


Environment:

C++:

$ c++ --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.5.0

Python:

$ python --version
Python 2.7.11 :: Anaconda 4.0.0 (x86_64)

Upvotes: 1

Views: 2830

Answers (2)

fish2000

Reputation: 4435

To elaborate on what @Daniel said – any Python operation has more overhead (in some cases way more, like orders of magnitude) than the comparable amount of code implementing a solution in C++.

The loop filling the buffer is indeed the culprit – but the mmap module itself also has a lot more housekeeping to do than you might think, even though it offers an interface whose semantics are, misleadingly, verrrry closely aligned with POSIX mmap(). You know how POSIX mmap() just tosses you a void* (which you just have to clean up with munmap() at some point)? Python’s mmap has to allocate a PyObject structure to babysit the void* – making it conform to Python’s buffer protocol by furnishing metadata and callbacks to the runtime, propagating and queueing reads and writes, maintaining GIL state, cleaning up its allocations no matter what errors occur…
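To make the buffer-protocol point a bit more concrete, here is a minimal sketch. It assumes Python 3 (the question uses Python 2.7) and uses an anonymous mapping (fileno -1) instead of the question's file, purely to stay self-contained. It only shows that the mmap object is a regular Python object that other objects such as memoryview can attach to; it does not measure the overhead.

```python
import mmap

# Anonymous 16-byte mapping; -1 means "no backing file" (assumption:
# chosen here only so the sketch needs no file on disk).
buf = mmap.mmap(-1, 16)

# memoryview() succeeds because the mmap object implements the buffer protocol.
view = memoryview(buf)
view[0:4] = b'SSSS'      # writes through the view land in the mapping

first = buf[0:4]         # read the same bytes back via the mmap object
nbytes = view.nbytes     # size metadata the mmap object reports to the view

# The exported view has to be released before the mapping can be closed.
view.release()
buf.close()

print(first, nbytes)
```

Each of those steps goes through Python-level object machinery, which is the kind of bookkeeping a bare void* from POSIX mmap() never has to do.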

All of that stuff takes time and memory, too. I personally don’t ever find myself using the mmap module, as it doesn’t give you a clear-cut advantage on any I/O problem, like out-of-the-box – you can just as easily use mmap to make things slower as you might make them faster.

By contrast, I often *do* find that using POSIX mmap() can be VERY advantageous when doing I/O from within a Python C/C++ extension (provided you’re minding the GIL state), precisely because coding around mmap() avoids all that Python internal-infrastructure stuff in the first place.

Upvotes: 3

Daniel

Reputation: 42758

It is not mmap that is slow, but filling an array with values one element at a time. Python is known to be slow at primitive operations like that. Use higher-level operations:

buffer[:] = 'S' * size
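For anyone who wants to compare the two approaches themselves, here is a rough timing sketch. It is not the asker's original program: it assumes Python 3 (so items are ints and slices are bytes), uses an anonymous mapping rather than a file, and fills 1 MiB instead of 16 MiB to keep the run short. The helper names fill_per_byte, fill_slice and timed are made up for this illustration.

```python
import mmap
import time

SIZE = 0x100000  # 1 MiB (assumption: smaller than the question's 0x1000000)

def fill_per_byte(buf, size):
    # One Python-level item assignment per byte, like the loop in the question.
    for i in range(size):
        buf[i] = 0x53  # ord('S'); in Python 3, mmap items are ints

def fill_slice(buf, size):
    # A single slice assignment; the byte copy happens inside the mmap module.
    buf[:] = b'S' * size

def timed(fill, size):
    # Time one fill strategy on a fresh anonymous mapping and verify the result.
    buf = mmap.mmap(-1, size)
    try:
        start = time.perf_counter()
        fill(buf, size)
        elapsed = time.perf_counter() - start
        ok = buf[:] == b'S' * size
    finally:
        buf.close()
    return elapsed, ok

loop_time, loop_ok = timed(fill_per_byte, SIZE)
slice_time, slice_ok = timed(fill_slice, SIZE)
print("per-byte loop: %.4fs, slice assignment: %.4fs" % (loop_time, slice_time))
```

The absolute numbers will depend on the machine and the Python version, so treat them as a relative comparison only.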

Upvotes: 7
