Reputation: 1
I am trying to write a finite difference Python code for 2D and 3D arrays using CuPy library. I want to use cupy.ElementwiseKernel() to speed up my code. Currently, I am facing problems to write the ElementwiseKernel for 2D array where I want to take the central difference similar to slicing. I have the following code where I want to keep the slicing inside the kernel itself.
device = 'CPU'
device_rank = 0
if device == 'CPU':
import numpy as ncp
from numpy import copy
else:
import cupy as ncp
from cupy import copy
dev = ncp.cuda.Device(device_rank)
dev.use()
import time
Nx = 16000
Ny = 16000
Lx = 2*ncp.pi
Ly = 2*ncp.pi
dx = Lx/Nx
dy = Ly/Ny
X = ncp.linspace(0,Lx,Nx+1)
Y = ncp.linspace(0,Ly,Ny+1)
X_mesh, Y_mesh = ncp.meshgrid(X,Y,indexing='ij')
temp = ncp.zeros([Nx+1,Ny+1])
def dfx(f, d_type):
# Central difference of array 'f' w.r.t. x
# For peridoic f
if d_type == 0:
if device == 'GPU':
temp1 = f[2:,:]
temp2 = f[:-2,:]
derv_kernel = ncp.ElementwiseKernel(
'T temp1, T temp2, T dx',
'T derv',
'''
derv = (temp1 - temp2)/(2*dx);
''',
'derv_kernel'
)
derv_kernel(temp1,temp2,dx,temp[1:-1,:])
elif device == 'CPU':
temp[1:-1,:] = (f[2:,:] - f[:-2,:])/(2*dx)
# Boundary terms when f is periodic in x
temp[0,:] = (f[1,:] - f[-2,:])/(2*dx)
temp[-1,:] = copy(temp[0,:])
return temp
A = ncp.sin(X_mesh)
B = dfx(A, 0)
print('Max error in derivative: ', ncp.max(ncp.cos(X_mesh)-B))
if device == 'GPU':
ncp.cuda.runtime.deviceSynchronize()
t1 = time.time()
for i in range(10):
dfx(A,0)
A += 2*ncp.pi
if device == 'GPU':
ncp.cuda.runtime.deviceSynchronize()
t2 = time.time()
print('time = ', t2-t1)
I also tried using raw T f
where we can take i
index but I am having problems to use it for 2D array where we need two indices. Any help would be great since I can generalize it for my solver. I am not acquainted with CUDA.
derv_kernel = ncp.ElementwiseKernel(
'raw T f, float64 dx',
'T derv',
'''
for(int i = 0; i < Nx+1; ++i){
for(int j = 0; j < Ny+1; ++i){
derv = (f[i + 1,j] - f[i - 1,j]) / (2*dx);
}
}
''',
'derv_kernel'
)
derv_kernel(f,dx,temp)
Upvotes: 0
Views: 50