fdesmond
fdesmond

Reputation: 35

Efficient matrix vector structure for improving cython code

I have to run some simulations over a system of SODEs. Since I need to use random graphs I thought it was a good idea to use python for generating the adjacent matrix for the graph and then C for the simulations. So I turned to cython.

I wrote my code following the hints of cython documentation for improving its speed as much as possible. But know I really don't know if my code is good or not. I run cython toast.pyx -a too, but I don't understand the problems.

But I let my code talk for me:

from __future__ import division
import scipy.stats as stat
import numpy as np
import networkx as net

#C part
from libc.math cimport sin
from libc.math cimport sqrt

#cimport cython
cimport numpy as np
cimport cython
cdef double tau = 2*np.pi  #http://tauday.com/

#graph
def graph(int N, double p):
    """
    It generates an adjacency matrix for a Erdos-Renyi graph G{N,p} (by default not directed).
    Note that this is an O(n^2) algorithm and it gives an array, not a (sparse) matrix.
    
    Remark: fast_gnp_random_graph(n, p, seed=None, directed=False) is O(n+m), where m is the expected number of edges m=p*n*(n-1)/2.
    Arguments:
        N : number of edges
        p : probability for edge creation
    """
    G=net.gnp_random_graph(N, p, seed=None, directed=False)
    G=net.adjacency_matrix(G, nodelist=None, weight='weight')
    G=G.toarray()
    return G


@cython.boundscheck(False)
@cython.wraparound(False)
#simulations
def simul(int N, double H, G, np.ndarray W, np.ndarray X, double d, double eps, double T, double dt, int kt_max):
    """
    For details view the general description of the package.
    Argumets:
        N : population size
        H : coupling strenght complete case
        G : adjiacenty matrix
        W : disorder
        X : initial condition
        d : diffusion term
        eps : 0 for the reversibily case, 1 for the non-rev case
        T : time of the simulation
        dt : increment time steps
        kt_max = (int) T/dt
    """
    cdef int kt
    #kt_max =  T/dt to check
    cdef np.ndarray xt = np.zeros([N,kt_max], dtype=np.float64)
    cdef double S1 = 0.0
    cdef double Stilde1 = 0.0
    cdef double xtmp, xtilde, x_diff, xi
    
    cdef np.ndarray bruit = d*sqrt(dt)*stat.norm.rvs(N)
    cdef int i, j, k

    for i in range(N): #setting initial conditions
        xt[i][0]=X[i]
        
    for kt in range(kt_max-1):
        for i in range(N):
            S1 = 0.0
            Stilde1= 0.0
            xi = xt[i][kt]
        
            for j in range(N): #computation of the sum with the adjiacenty matrix
                if G[i][j]==1:
                    x_diff = xt[j][kt] - xi
                    S2 = S2 + sin(x_diff)

            xtilde = xi + (eps*(W[i]) + (H/N)*S1)*dt + bruit[i]

            for j in range(N):
                if G[i][j]==1:
                    x_diff = xt[j][kt] - xtilde
                    Stilde2 = Stilde2 + sin(x_diff)
        
            #computation of xt[i]
            xtmp = xi + (eps*(W[i]) + (H/N)*(S1+Stilde1)*0.5)*dt + bruit
            xt[i][kt+1] = xtmp%tau

    return xt

Thank you very much!

Update

I changed the order of the variables definitions, np.array to double and xt[i][j] to xt[i,j] and the matrix to long long. The code is pretty faster now and the yellow part on the html file is just around the declaration. Thanks!

from __future__ import division
import scipy.stats as stat
import numpy as np
import networkx as net

#C part
from libc.math cimport sin
from libc.math cimport sqrt

#cimport cython
cimport numpy as np
cimport cython
cdef double tau = 2*np.pi  #http://tauday.com/

#graph
def graph(int N, double p):
    """
    It generates an adjacency matrix for a Erdos-Renyi graph G{N,p} (by default not directed).
    Note that this is an O(n^2) algorithm and it gives an array, not a (sparse) matrix.

    Remark: fast_gnp_random_graph(n, p, seed=None, directed=False) is O(n+m), where m is the expected number of edges m=p*n*(n-1)/2.
    Arguments:
        N : number of edges
        p : probability for edge creation
    """
    G=net.gnp_random_graph(N, p, seed=None, directed=False)
    G=net.adjacency_matrix(G, nodelist=None, weight='weight')
    G=G.toarray()
    return G


@cython.boundscheck(False)
@cython.wraparound(False)
#simulations
def simul(int N, double H, long long [:, ::1] G, double[:] W, double[:] X, double d, double eps, double T, double dt, int kt_max):
    """
    For details view the general description of the package.
    Argumets:
        N : population size
        H : coupling strenght complete case
        G : adjiacenty matrix
        W : disorder
        X : initial condition
        d : diffusion term
        eps : 0 for the reversibily case, 1 for the non-rev case
        T : time of the simulation
        dt : increment time steps
        kt_max = (int) T/dt
    """
    cdef int kt
    #kt_max =  T/dt to check
    cdef double S1 = 0.0
    cdef double Stilde1 = 0.0
    cdef double xtmp, xtilde, x_diff

    cdef double[:] bruit = d*sqrt(dt)*np.random.standard_normal(N)

    cdef double[:, ::1] xt = np.zeros((N, kt_max), dtype=np.float64)
    cdef double[:, ::1] yt = np.zeros((N, kt_max), dtype=np.float64)
    cdef int i, j, k

    for i in range(N): #setting initial conditions
        xt[i,0]=X[i]

    for kt in range(kt_max-1):
        for i in range(N):
            S1 = 0.0
            Stilde1= 0.0

            for j in range(N): #computation of the sum with the adjiacenty matrix
                if G[i,j]==1:
                    x_diff = xt[j,kt] - xt[i,kt]
                    S1 = S1 + sin(x_diff)

            xtilde = xt[i,kt] + (eps*(W[i]) + (H/N)*S1)*dt + bruit[i]

            for j in range(N):
                if G[i,j]==1:
                    x_diff = xt[j,kt] - xtilde
                    Stilde1 = Stilde1 + sin(x_diff)

            #computation of xt[i]
            xtmp = xt[i,kt] + (eps*(W[i]) + (H/N)*(S1+Stilde1)*0.5)*dt + bruit[i]
            xt[i,kt+1] = xtmp%tau

    return xt

Upvotes: 2

Views: 418

Answers (1)

ev-br
ev-br

Reputation: 26030

cython -a color codes the cython source. If you click on a line it shows the corrsponding C source. As a rule of thumb, you don't want anything yellow in your inner loops.

A couple of things jump out in your code:

  • x[j][i] creates a temporary array for x[j] on each invocation, so use x[j, i] instead.
  • instead of cdef ndarray x better either provide the dimensionality and dtype (cdef ndarray[ndim=2, dtype=float]) or --- preferably --- use the typed memoryview syntax: cdef double[:, :] x.

E.g., instead of

cdef np.ndarray xt = np.zeros([N,kt_max], dtype=np.float64)

better use

cdef double[:, ::1] xt = np.zeros((N, kt_max), dtype=np.float64)
  • Make sure you're accessing memory in the cache-friendly pattern. E.g., make sure your arrays are in C order (last dimension varies the fastest), declare the memoryviews as double[:, ::1] and iterate over the array with the last index varying the fastest.

EDIT: See http://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html for the typed memoryview syntax double[:, ::1] etc

Upvotes: 2

Related Questions