Python implementation of C++ algorithm using L1 CPU cache

Question

I am looking to write a Python implementation of the sieve of Eratosthenes as a segmented sieve that takes the CPU's L1 cache size into account.

I have my own version on GitHub here: https://github.com/nick599/PythonMathsAlgorithms/blob/master/segmented_soe_v6.py , which does not take the L1 cache size of the CPU into account.

I found the following site - http://primesieve.org/segmented_sieve.html , which gives a C++ implementation that uses the L1 cache size. It is reportedly much faster than my algorithm (mine takes several minutes to generate the primes up to 10^7, and hangs at 10^8 because of memory usage).
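For reference, here is a minimal sketch of how I understand the segmented idea, written in plain Python. It is not a port of the C++ code on that page. The function name segmented_sieve and the default segment_size of 32 * 1024 are my own assumptions (the latter loosely modelled on a 32 KiB L1 data cache), and I do not know how much the cache size really matters in CPython, where interpreter overhead probably dominates.

```python
def segmented_sieve(limit, segment_size=32 * 1024):
    """Return a list of all primes <= limit using a segmented sieve.

    segment_size is the number of candidates handled per segment. The
    default is an assumption, not a measured optimum.
    """
    if limit < 2:
        return []

    # Integer square root of limit, corrected for float rounding.
    root = int(limit ** 0.5)
    while root * root > limit:
        root -= 1
    while (root + 1) * (root + 1) <= limit:
        root += 1

    # Step 1: ordinary sieve for the base primes up to sqrt(limit).
    base = bytearray([1]) * (root + 1)
    base[0] = 0
    base[1] = 0
    i = 2
    while i * i <= root:
        if base[i]:
            for j in range(i * i, root + 1, i):
                base[j] = 0
        i += 1
    base_primes = [n for n in range(2, root + 1) if base[n]]
    primes = list(base_primes)

    # Step 2: sieve the range (root, limit] one fixed-size segment at a
    # time, so only one small bytearray is alive at any moment.
    low = root + 1
    while low <= limit:
        high = min(low + segment_size - 1, limit)
        segment = bytearray([1]) * (high - low + 1)
        for p in base_primes:
            # First multiple of p inside [low, high]. Since low > p,
            # this is never p itself.
            start = ((low + p - 1) // p) * p
            for m in range(start, high + 1, p):
                segment[m - low] = 0
        primes.extend(low + k for k in range(len(segment)) if segment[k])
        low = high + 1

    return primes


if __name__ == "__main__":
    print(segmented_sieve(50))
```

Memory use for the sieve itself stays at roughly sqrt(limit) plus one segment, although the returned list of primes still grows with limit. On Python 2, range builds a full list, so xrange would be the better choice in the inner loops.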

I am developing on Linux Mint 17 with Python 2.74. Update: my CPU is an Intel i7.

I am fairly new to Python.

I want to know:

How could I start implementing a Python version of this C++ algorithm?
What would I need to consider?
Are there things in the C++ implementation that can't be coded in Python 2.74?
What about multithreading?
What about hyperthreading?
What about Python's GIL?

I am looking for answers that address the spirit of all my questions above. Hints and tips are welcome.
