stonecrusher

Reputation: 19

How to use numba.SmartArrays for vector addition?

I have written this code for vector addition using numba.SmartArray. This is my first time using SmartArray and I am not sure I am using it correctly. The code is not working and throws the errors below.

    import numpy as np
    from numba import SmartArray,cuda, jit, uint32


    li1=np.uint32([1,2,3,4])
    li=np.uint32([1,2,3,4])
    b=SmartArray(li,where="host",copy=True)
    a=SmartArray(li1,where="host",copy=True)
    c=np.uint32([1,1,1,1])
    print type(li)
    print type(a)

    @cuda.jit('void(uint32[:],uint32[:],uint32[:])',type="gpu")
    def additionG(c,a,b):
        idx=cuda.threadIdx.x+cuda.blockDim.x*cuda.blockIdx.x

        if idx< len(a):
            a[idx]=c[idx]+b[idx]

    dA=cuda.to_device(a)
    dB=cuda.to_device(b)
    dC=cuda.to_device(c)
    additionG[1, 128](c,a,b)

    print a.__array__()

Errors:

    <type 'numpy.ndarray'>
    <class 'numba.smartarray.SmartArray'>
    Traceback (most recent call last):
      File "C:\Users\hp-pc\My Documents\LiClipse Workspace\cuda\blowfishgpu_smart_arrays.py", line 20, in <module>
        dA=cuda.to_device(a)
      File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devices.py", line 257, in _require_cuda_context
        return fn(*args, **kws)
      File "C:\Anaconda\lib\site-packages\numba\cuda\api.py", line 55, in to_device
        to, new = devicearray.auto_device(obj, stream=stream, copy=copy)
      File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devicearray.py", line 403, in auto_device
        devobj.copy_to_device(obj, stream=stream)
      File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devicearray.py", line 148, in copy_to_device
        sz = min(_driver.host_memory_size(ary), self.alloc_size)
      File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\driver.py", line 1348, in host_memory_size
        s, e = host_memory_extents(obj)
      File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\driver.py", line 1333, in host_memory_extents
        return mviewbuf.memoryview_get_extents(obj)
    TypeError: expected a readable buffer object

Upvotes: 0

Views: 755

Answers (2)

stonecrusher

Reputation: 19

It's been a while since I posted this question, but I am still posting the answer so that someone may find it helpful in the future.

    import numpy as np
    from numba import SmartArray, cuda, jit, uint32, autojit

    li1 = np.uint32([6, 7, 8, 9])
    li = np.uint32([1, 2, 3, 4])

    a = SmartArray(li1, where='host', copy=True)
    b = SmartArray(li, where="host", copy=True)

    c = np.uint32([1, 1, 1, 1])

    def additionG(a, c):
        idx = cuda.threadIdx.x + cuda.blockDim.x * cuda.blockIdx.x

        if idx < len(c):
            a[idx] = a[idx] + c[idx]

        cuda.syncthreads()

    bpg = 1
    tpb = 128
    dC = cuda.to_device(c)
    cfunc = cuda.jit()(additionG)
    cfunc[bpg, tpb](a, dC)

    print a.__array__()

Upvotes: 2

talonmies

Reputation: 72349

It looks to me like `cuda.to_device` doesn't handle smart arrays, which would make sense, because smart arrays are supposed to do away with explicit copy management.

If my reading of the documentation is correct (I have never tried SmartArray before), you should just be able to change this

    dA=cuda.to_device(a)
    dB=cuda.to_device(b)
    dC=cuda.to_device(c)
    additionG[1, 128](c,a,b)

to just

    dC=cuda.to_device(c)
    additionG[1, 128](dC,a.gpu(),b.gpu())

The .gpu() method should return a GPU resident object that the kernel can understand and access.

Upvotes: 1
