George_Washington

Reputation: 21

How come this code is not faster with Numba?

Why is this Monte Carlo simulation not faster with Numba's JIT? Removing @jit actually makes it run a bit faster. However, I thought these loops were exactly what Numba was good at...

import numpy as np
from numba import jit

T = 1000
ALPHA = 0.11
BETA = 0.22
GAMMA = 0.33

@jit(nopython=True, fastmath=True)
def sim(n):
  mu = np.array([0.0, 0.1]).reshape(2,1)
  rho = 0.1
  sigma = np.array([[1, rho*4],[rho*4, 4**2]])  # ** is power; ^ would be bitwise XOR
  A = np.linalg.cholesky(sigma)

  out = np.empty((n, 2))
  for i in range(n):
    # (a)
    u = np.random.randn(T)

    # X = np.random.multivariate_normal(mu, sigma, T)
    X = mu + A @ np.random.randn(2,T)
    X = np.concatenate((np.ones((T, 1)), X.T), axis=1)

    y = X @ np.array([ALPHA, BETA, GAMMA]) + u

    # (b)
    thetahat = np.linalg.solve(X.T @ X, X.T @ y)

    Xf = X[:,:2].copy()
    thetatilde = np.linalg.solve(Xf.T @ Xf, Xf.T @ y)

    out[i,:] = (thetahat[1], thetatilde[1])

  return out

n = 10**5
s = sim(n)
print(s)

Upvotes: 2

Views: 766

Answers (2)

Kenneth

Reputation: 67

In your case, you can use "eager compilation"; the documentation is at https://numba.pydata.org/numba-doc/latest/user/jit.html. Supplying an explicit signature makes Numba compile the function when it is defined rather than on the first call.

import time
import numpy as np
from numba import jit, float64, uint32

T = 1000
ALPHA = 0.11
BETA = 0.22
GAMMA = 0.33

# using the eager compilation
@jit(float64[:, :](uint32))
def sim(n):
  # the code below is not changed at all
  mu = np.array([0.0, 0.1]).reshape(2,1)
  rho = 0.1
  sigma = np.array([[1, rho*4],[rho*4, 4**2]])  # ** is power; ^ would be bitwise XOR
  A = np.linalg.cholesky(sigma)

  out = np.empty((n, 2))
  for i in range(n):
    # (a)
    u = np.random.randn(T)

    # X = np.random.multivariate_normal(mu, sigma, T)
    X = mu + A @ np.random.randn(2,T)
    X = np.concatenate((np.ones((T, 1)), X.T), axis=1)

    y = X @ np.array([ALPHA, BETA, GAMMA]) + u

    # (b)
    thetahat = np.linalg.solve(X.T @ X, X.T @ y)

    Xf = X[:,:2].copy()
    thetatilde = np.linalg.solve(Xf.T @ Xf, Xf.T @ y)

    out[i,:] = (thetahat[1], thetatilde[1])

  return out

n = 10**5
# calculate the elapsed time
start = time.time()
s = sim(n)
end = time.time()
print("Elapsed (with compilation) = %s" % (end - start))
print(s)

On my computer, the eager-compiled version above is about 7 s faster: the signature makes Numba compile sim at decoration time, so the timed call no longer pays the compilation cost.
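For illustration, here is a minimal sketch (add_lazy and add_eager are made-up names for this example) showing that a signature moves the compilation cost from the first call to the moment the function is defined:

import time
from numba import jit, float64

# Lazy: no signature, so compilation happens on the first call.
@jit(nopython=True)
def add_lazy(x, y):
  return x + y

# Eager: the explicit signature makes Numba compile right here,
# at decoration time, before any call is made.
@jit(float64(float64, float64), nopython=True)
def add_eager(x, y):
  return x + y

start = time.time()
add_lazy(1.0, 2.0)      # slow: this call includes compilation
print("first lazy call:  %.4fs" % (time.time() - start))

start = time.time()
add_eager(1.0, 2.0)     # fast: already compiled at decoration time
print("first eager call: %.4fs" % (time.time() - start))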

Upvotes: 0

Roim

Reputation: 3066

As the documentation states:

First, recall that Numba has to compile your function for the argument types given before it executes the machine code version of your function, this takes time. However, once the compilation has taken place Numba caches the machine code version of your function for the particular types of arguments presented. If it is called again with the same types, it can reuse the cached version instead of having to compile again.

A really common mistake when measuring performance is to not account for the above behaviour and to time code once with a simple timer that includes the time taken to compile your function in the execution time.

Put simply: on the first execution (in your case, the only one), Numba compiles the function to machine code, which takes time. Run it a second time and you will see the difference.

For example:

import numpy as np
from numba import jit
import time

T = 1000
ALPHA = 0.11
BETA = 0.22
GAMMA = 0.33

@jit(nopython=True, fastmath=True)
def sim(n):
  mu = np.array([0.0, 0.1]).reshape(2,1)
  rho = 0.1
  sigma = np.array([[1, rho*4],[rho*4, 4**2]])  # ** is power; ^ would be bitwise XOR
  A = np.linalg.cholesky(sigma)

  out = np.empty((n, 2))
  for i in range(n):
    # (a)
    u = np.random.randn(T)

    # X = np.random.multivariate_normal(mu, sigma, T)
    X = mu + A @ np.random.randn(2,T)
    X = np.concatenate((np.ones((T, 1)), X.T), axis=1)

    y = X @ np.array([ALPHA, BETA, GAMMA]) + u

    # (b)
    thetahat = np.linalg.solve(X.T @ X, X.T @ y)

    Xf = X[:,:2].copy()
    thetatilde = np.linalg.solve(Xf.T @ Xf, Xf.T @ y)

    out[i,:] = (thetahat[1], thetatilde[1])

  return out

n = 10**2
# first call: compiles the function, so the timing includes compilation
start = time.time()
sim(n)
end = time.time()
print("Elapsed (with compilation) = %s" % (end - start))

# second call: reuses the cached machine code
start = time.time()
s = sim(n)
end = time.time()
print("Elapsed (after compilation) = %s" % (end - start))
print(s)

Here I ran the simulation twice. The first run took about 5 seconds, but the second run took only 0.01 seconds: once the compiled version is cached, Numba does speed the loop up.

Numba pays off when you call a function many times, so the one-off compilation cost is amortized. For a single execution, the compilation overhead can easily outweigh the speedup.
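If the process is restarted between runs, the compiled code can also be persisted on disk with cache=True, so later runs skip recompilation. A minimal sketch (mean_of_squares is just an illustrative function):

import numpy as np
from numba import jit

# cache=True writes the compiled machine code to a file-based cache,
# so a fresh Python process reuses it instead of recompiling on the
# first call.
@jit(nopython=True, cache=True)
def mean_of_squares(x):
  total = 0.0
  for v in x:
    total += v * v
  return total / x.size

print(mean_of_squares(np.arange(10.0)))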

Upvotes: 1
