Eddy

Reputation: 6861

How to allocate a shared memory array in CUDA Fortran?

I'm having trouble trying to declare a shared memory array within the kernel. Here's the code containing my kernel:

module my_kernels

  use cudafor
  implicit none

contains

  attributes(global) subroutine mykernel(N)

    ! Declare variables
    integer :: index
    integer, intent(in), value :: N
    real,shared,dimension(N) :: shared_array  

    ! Map threadID to index
    index = blockDim%x * (blockIdx%x-1) + threadIdx%x

    ! Set array element equal to index
    shared_array(index) = index

  end subroutine mykernel

end module my_kernels

And here's how I call my kernel:

program cuda

  use my_kernels
  implicit none  

  ! Set number of threads
  integer :: N = 9

  ! Invoke kernel with 3 blocks of 3 threads
  call mykernel<<<N/3,3>>>(N)

end program cuda

All of this I have in one file, test.cuf. When I try to compile test.cuf with pgf90, I get this error:

PGF90-S-0000-Internal compiler error. unexpected runtime function call       0 (test.cuf: 34)
PGF90-S-0000-Internal compiler error. unsupported procedure     349 (test.cuf: 34)
  0 inform,   0 warnings,   2 severes, 0 fatal for mykernel
/tmp/pgcudaforw5MgcaFALD9p.gpu(19): error: a value of type "int" cannot be assigned to an entity of type "float *"

/tmp/pgcudaforw5MgcaFALD9p.gpu(22): error: expected an expression

2 errors detected in the compilation of "/tmp/pgnvdl7MgHLY1VOV5.nv0".
PGF90-F-0000-Internal compiler error. pgnvd job exited with nonzero status code       0 (test.cuf: 34)
PGF90/x86-64 Linux 10.8-0: compilation aborted

In this case line 34 refers to end subroutine mykernel. The compiler error is not very helpful, and it took me a while to work out that the problem was with the shared array (I'm using this code as a simple example).

When I replace 'N' with '9' in the declaration of the shared array such that real,shared,dimension(N) :: shared_array is replaced with real,shared,dimension(9) :: shared_array, the error goes away.

My question is: why is this error occurring, and how do I set the dimension of a shared array with a variable (if indeed it's possible)?

Upvotes: 0

Views: 2008

Answers (2)

Vicentito

Reputation: 11

You can have more than one shared memory array, but their sizes must be known at compile time. In general, shared memory arrays should be of fixed size; passing the size in bytes at run time is the exception. I guess this is all due to the limited amount of shared memory on each SM (Streaming Multiprocessor). In my experience developing in both CUDA C and CUDA Fortran, it is better to keep all these parameters fixed and have the kernel repeat the work as many times as needed to cover all the input data. That way it is easier to control all the parameters that affect occupancy (how well you use the physical resources of the GPU).
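A minimal sketch of the fixed-size approach described above, assuming a compile-time constant block size and a loop in which each thread strides over the data. The names fixed_kernels, TILE, scale_kernel and tile_buf are hypothetical and not from the question; the doubling operation is only a placeholder for real work.

```fortran
module fixed_kernels

  use cudafor
  implicit none

  ! Hypothetical block size, fixed at compile time
  integer, parameter :: TILE = 128

contains

  attributes(global) subroutine scale_kernel(a, n)

    integer, value :: n
    real :: a(n)                     ! dummy arrays in a kernel live on the device
    real, shared :: tile_buf(TILE)   ! size is a compile-time constant
    integer :: i, stride

    ! Total number of threads in the grid
    stride = blockDim%x * gridDim%x
    i = blockDim%x * (blockIdx%x-1) + threadIdx%x

    ! Each thread repeats the work until all n elements are covered
    do while (i <= n)
      tile_buf(threadIdx%x) = a(i)        ! each thread uses only its own slot
      a(i) = 2.0 * tile_buf(threadIdx%x)
      i = i + stride
    end do

  end subroutine scale_kernel

end module fixed_kernels

program fixed_demo

  use fixed_kernels
  implicit none

  integer, parameter :: n = 1000
  real, device, allocatable :: a_d(:)
  real :: a_h(n)

  a_h = 1.0
  allocate(a_d(n))
  a_d = a_h

  ! 2 blocks of TILE threads; the loop in the kernel covers the rest of the data
  call scale_kernel<<<2, TILE>>>(a_d, n)

  a_h = a_d
  print *, all(a_h == 2.0)

end program fixed_demo
```

With this layout the launch configuration can stay the same whatever n is, and the shared memory used per block is known when the kernel is compiled.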

Upvotes: 1

mkcolg

Reputation: 26

Change "dimension(N)" to "dimension(*)" and then pass in the size of shared array (in bytes) as the third argument of your kernel launch.

Hope this helps,

Mat

% cat test.cuf 
module my_kernels

  use cudafor
  implicit none

  real, dimension(:), allocatable,device :: Ad
  real, dimension(:),allocatable :: Ah

contains

  attributes(global) subroutine mykernel(N)

    ! Declare variables
    integer :: index
    integer, intent(IN), value :: N
    real,shared,dimension(*) :: shared_array  

    ! Map threadID to index
    index = blockDim%x * (blockIdx%x-1) + threadIdx%x

    ! Set array element equal to index
    shared_array(index) = index

    Ad(index) = index

  end subroutine mykernel

end module my_kernels


program cuda

  use my_kernels
  implicit none  

  ! Set number of threads
  integer :: N = 9

   allocate(Ad(N), Ah(N))

  ! Invoke kernel with 3 blocks of 3 threads
  call mykernel<<<N/3,3,N*4>>>(N)

  Ah=Ad
  print *, Ah

end program cuda

% pgf90 test.cuf -V10.9 ; a.out
    1.000000        2.000000        3.000000        4.000000     
    5.000000        6.000000        7.000000        8.000000     
    9.000000 
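One point worth keeping in mind with the example above: shared memory is allocated per thread block, so the third launch parameter is the number of bytes each block receives. A variant that indexes the shared array by threadIdx%x only needs room for one block's threads. The following is a sketch under that assumption; per_block_kernels, per_block_kernel, buf and out are hypothetical names, and the factor 4 assumes the default 4-byte real.

```fortran
module per_block_kernels

  use cudafor
  implicit none

contains

  attributes(global) subroutine per_block_kernel(out, n)

    integer, value :: n
    real :: out(n)
    real, shared :: buf(*)   ! size comes from the third launch parameter
    integer :: gid

    ! Global index, used for device memory
    gid = blockDim%x * (blockIdx%x-1) + threadIdx%x

    if (gid <= n) then
      ! Local index, used for the block's own shared memory
      buf(threadIdx%x) = real(gid)
      out(gid) = buf(threadIdx%x)
    end if

  end subroutine per_block_kernel

end module per_block_kernels

program per_block_demo

  use per_block_kernels
  implicit none

  integer, parameter :: n = 9, threads = 3
  real, device :: out_d(n)
  real :: out_h(n)

  ! Shared memory per block: one 4-byte real per thread (assumes default real kind)
  call per_block_kernel<<<n/threads, threads, threads*4>>>(out_d, n)

  out_h = out_d
  print *, out_h

end program per_block_demo
```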

Upvotes: 1
