Why is using the Python mmap module much slower than calling POSIX mmap from C++?

Question

C++ code:

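#include <stdio.h>      // printf
#include <fcntl.h>      // open, O_* flags
#include <unistd.h>     // ftruncate, close
#include <sys/mman.h>   // mmap, munmap
#include <sys/stat.h>   // S_IRUSR and the other mode bits
#include <sys/time.h>   // gettimeofday, timersub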

using namespace std;
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)

int main() {
    timeval tv1, tv2, tv3, tve;
    gettimeofday(&tv1, 0);
    int size = 0x1000000;
    int fd = open("data", O_RDWR | O_CREAT | O_TRUNC, FILE_MODE);
    ftruncate(fd, size);
    char *data = (char *) mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    for(int i = 0; i < size; i++) {
        data[i] = 'S';
    }
    munmap(data, size);
    close(fd);
    gettimeofday(&tv2, 0);
    timersub(&tv2, &tv1, &tve);
    printf("Time elapsed: %ld.%06lds\n", (long int) tve.tv_sec, (long int) tve.tv_usec);
}

Python code:

import mmap
import time

t1 = time.time()
size = 0x1000000

f = open('data/data', 'w+')
f.truncate(size)
f.close()

file = open('data/data', 'r+b')
buffer = mmap.mmap(file.fileno(), 0)

for i in xrange(size):
    buffer[i] = 'S'

buffer.close()
file.close()
t2 = time.time()
print "Time elapsed: %.3fs" % (t2 - t1)

I think these two programs are essentially the same, since both C++ and Python end up calling the same system call (mmap).

But the Python version is much slower than C++'s:

Python: Time elapsed: 1.981s
C++:    Time elapsed: 0.062143s

Could anyone please explain why mmap in Python is so much slower than in C++?

Environment:

C++:

$ c++ --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.5.0

Python:

$ python --version
Python 2.7.11 :: Anaconda 4.0.0 (x86_64)

Daniel · Accepted Answer

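It is not mmap that is slower, but filling an array with values one element at a time. Python is known to be slow at primitive operations like this, because every iteration of the loop runs through the interpreter. Use a higher-level operation instead: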

buffer[:] = 'S' * size
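
To see the difference for yourself, here is a minimal sketch that times both approaches against the same kind of mapping. It is written for Python 3 rather than the Python 2.7 in the question, so the per-byte assignment takes an int and the slice takes bytes. The helper names (make_mapping, fill_loop, fill_slice, timed) and the smaller 1 MiB size are my own choices for illustration, not anything from the original code, and the actual timings will depend on your machine.

```python
import mmap
import os
import tempfile
import time

SIZE = 0x100000  # 1 MiB; smaller than the question's 16 MiB to keep the loop short


def make_mapping(path, size):
    """Create a file of `size` bytes and return (file object, mmap over it)."""
    f = open(path, "w+b")
    f.truncate(size)
    return f, mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file


def fill_loop(buf, size):
    # One interpreted item assignment per byte, as in the question.
    for i in range(size):
        buf[i] = ord("S")  # Python 3 expects an int here


def fill_slice(buf, size):
    # A single slice assignment; the copy happens inside the mmap module.
    buf[:] = b"S" * size


def timed(fill, size=SIZE):
    """Run `fill` on a fresh mapping; return (seconds, filled correctly?)."""
    with tempfile.TemporaryDirectory() as tmp:
        f, buf = make_mapping(os.path.join(tmp, "data"), size)
        start = time.perf_counter()
        fill(buf, size)
        elapsed = time.perf_counter() - start
        ok = buf[:] == b"S" * size
        buf.close()
        f.close()
    return elapsed, ok


if __name__ == "__main__":
    for fill in (fill_loop, fill_slice):
        elapsed, ok = timed(fill)
        print("%-10s %.4fs (filled correctly: %s)" % (fill.__name__, elapsed, ok))
```

Both functions write the same bytes through the same mapping, so any gap between the two timings comes from the per-element loop overhead and not from mmap itself.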
