SadBoySquad

Reputation: 21

How to enable out of bounds memory checking on nvfortran compiler?

Description and source example

Below are two simple, crude test programs that deliberately access out-of-bounds memory, one in CPU code and one in GPU code. I kept the GPU example separate so that the CPU example can be tested with different compilers to compare their behavior.

CPU example

module sizes

    integer, save :: size1
    integer, save :: size2

end module sizes

module arrays

    real, allocatable, save :: testArray1(:, :)
    real, allocatable, save :: testArray2(:, :)

end module arrays

subroutine testMemoryAccess
    use sizes
    use arrays

    implicit none

    real :: value

    value = testArray1(size1+1, size2+1)
    print *, 'value', value

end subroutine testMemoryAccess

Program testMemoryAccessOutOfBounds
    use sizes
    use arrays

    implicit none

    ! set sizes for the example
    size1 = 5000
    size2 = 2500

    allocate (testArray1(size1, size2))
    allocate (testArray2(size2, size1))
    testArray1 = 1.d0
    testArray2 = 2.d0

    call testMemoryAccess

end program testMemoryAccessOutOfBounds

GPU example

module sizes

    integer, save :: size1
    integer, save :: size2

end module sizes

module sizesCuda

    integer, device, save :: size1
    integer, device, save :: size2

end module sizesCuda

module arrays

    real, allocatable, save :: testArray1(:, :)
    real, allocatable, save :: testArray2(:, :)

end module arrays

module arraysCuda

    real, allocatable, device, save :: testArray1(:, :)
    real, allocatable, device, save :: testArray2(:, :)

end module arraysCuda

module cudaKernels
    use cudafor
    use sizesCuda
    use arraysCuda

contains

    attributes(global) Subroutine testMemoryAccessCuda

        implicit none

        integer :: element

        real :: value

        element = (blockIdx%x - 1)*blockDim%x + threadIdx%x

        if (element.eq.1) then

            value = testArray1(size1+1, size2+1)
            print *, 'value', value

        end if

    end Subroutine testMemoryAccessCuda

end module cudaKernels

Program testMemoryAccessOutOfBounds
    use cudafor
    use cudaKernels
    use sizes
    use sizesCuda, size1_d => size1, size2_d => size2
    use arrays
    use arraysCuda, testArray1_d => testArray1, testArray2_d => testArray2

    implicit none

    integer :: istat

    ! set sizes for the example
    size1 = 5000
    size2 = 2500

    size1_d = size1
    size2_d = size2

    allocate (testArray1_d(size1, size2))
    allocate (testArray2_d(size2, size1))
    testArray1_d = 1.d0
    testArray2_d = 2.d0

    call testMemoryAccessCuda<<<64, 64>>>
    istat = cudadevicesynchronize()

end program testMemoryAccessOutOfBounds

When I build the program with nvfortran for debugging, nothing reports the out-of-bounds access, neither at compile time nor at run time. Looking at the available flags, both the -C and -Mbounds options appear to be meant for exactly this check. However, they do not seem to work as intended.

With ifort, the same program aborts at run time and prints the exact line where the out-of-bounds access occurred.

How can I accomplish this using nvfortran? I thought it was a CUDA-specific problem, but while preparing the examples for this question I found that nvfortran behaves the same way on CPU code. Thus, it is not CUDA-specific.

Compilers used:

nvfortran

nvfortran 23.5-0 64-bit target on x86-64 Linux -tp zen2
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

ifort

ifort (IFORT) 2021.10.0 20230609
Copyright (C) 1985-2023 Intel Corporation.  All rights reserved.

Steps:

nvfortran

I compile the examples as follows:

nvfortran -C -traceback -Mlarge_arrays -Mdclchk -cuda -gpu=cc86 testOutOfBounds.f90
nvfortran -C -traceback -Mlarge_arrays -Mdclchk -cuda -gpu=cc86 testOutOfBoundsCuda.f90

When running the CPU code, I get what looks like an uninitialized memory value:

value   1.5242136E-27

When running the GPU code, I get a zero value:

value    0.000000

ifort

I compile the CPU example as follows:

ifort -init=snan -C -fpe0 -g -traceback testOutOfBounds.f90

and I get:

forrtl: severe (408): fort: (2): Subscript #2 of the array TESTARRAY1 has value 2501 which is greater than the upper bound of 2500

Image              PC                Routine            Line        Source
a.out              00000000004043D4  testmemoryaccess_          23  testOutOfBounds.f90
a.out              0000000000404FD6  MAIN__                     43  testOutOfBounds.f90
a.out              000000000040418D  Unknown               Unknown  Unknown
libc.so.6          00007F65A9229D90  Unknown               Unknown  Unknown
libc.so.6          00007F65A9229E40  __libc_start_main     Unknown  Unknown
a.out              00000000004040A5  Unknown               Unknown  Unknown

which is the kind of report I expect to get.

Upvotes: 1

Views: 249

Answers (1)

Mat Colgrove

Reputation: 5646

Bounds checking isn't supported by nvfortran in device code and, as the following warning indicates, it is disabled when GPU-related flags are used:

% nvfortran -C -g -traceback -Mlarge_arrays -Mdclchk -cuda -gpu=cc86 test_bounds1.f90
nvfortran-Warning-CUDA Fortran or OpenACC GPU targets disables -Mbounds

The out-of-bounds error is found for CPU targets:

% nvfortran -C -g -traceback -Mlarge_arrays -Mdclchk  test_bounds1.f90; a.out
0: Subscript out of range for array testarray1 (test_bounds1.f90: 23)
    subscript=5001, lower bound=1, upper bound=5000, dimension=1
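Since -Mbounds is not available once the GPU flags are on, one possible workaround is to guard suspicious accesses by hand. The following is only a minimal sketch in plain Fortran, not something nvfortran provides: the module boundsGuard and the function inBounds are hypothetical names introduced for illustration, and the check assumes 1-based arrays whose extents are known (here size1 and size2 from the question). Placing an equivalent test inside a kernel before the array access is an assumption on my part and has not been verified here.

```fortran
module boundsGuard
    implicit none
contains

    ! Hypothetical helper (not part of any compiler runtime):
    ! returns .true. when (i, j) lies inside a 1-based n1 x n2 array.
    pure logical function inBounds(i, j, n1, n2)
        integer, intent(in) :: i, j, n1, n2
        inBounds = (i >= 1 .and. i <= n1 .and. j >= 1 .and. j <= n2)
    end function inBounds

end module boundsGuard

program testManualBoundsGuard
    use boundsGuard

    implicit none

    ! same extents as in the question
    integer, parameter :: size1 = 5000
    integer, parameter :: size2 = 2500

    real, allocatable :: testArray1(:, :)
    integer :: i, j

    allocate (testArray1(size1, size2))
    testArray1 = 1.0

    ! the same out-of-bounds subscripts used in the question
    i = size1 + 1
    j = size2 + 1

    ! check the subscripts before touching the array
    if (inBounds(i, j, size1, size2)) then
        print *, 'value', testArray1(i, j)
    else
        print *, 'out of bounds: subscripts', i, j, 'limits', size1, size2
    end if

end program testManualBoundsGuard
```

With the subscripts above, the program takes the else branch and reports them instead of reading past the array. A guard like this costs a branch per access, so it is probably best kept to debug builds.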

Upvotes: 2
