Peter Petrik

Reputation: 10185

Intrinsic assignment of scalar to array

From Fortran 2008 specification, 7.2.1.3.5

If expr is a scalar and the variable is an array, the expr is treated as if it were an array of the same shape as the variable with every element of the array equal to the scalar value of expr.

I have seen the following coding styles:

A:

integer, dimension(3) :: x
do i=1,3
  x(i) = 1
enddo

B:

integer, dimension(3) :: x
x(:) = 1

C:

integer, dimension(3) :: x
x = 1

What is considered best practice for assigning a scalar value to an array (performance & readability)?

Note: A smells like Fortran 77, and I have a feeling that C could confuse future readers.

Upvotes: 3

Views: 1188

Answers (1)

Scientist

Reputation: 1835

SHORT ANSWER:

In my experience, case (A) has always outperformed Fortran array syntax. Case (C) outperforms case (B) only in special cases where the compiler has to do extra work to interpret the (:); otherwise the two notations most often perform the same. But the best option so far seems to be DO CONCURRENT, IFF the compiler supports it (as the Intel Fortran Compiler 2015 does), and I have seen a performance gain with DO CONCURRENT only when the compiler optimization flag is on. Beware of the status of DO CONCURRENT in other Fortran compilers: they might not optimize it and may simply translate a DO CONCURRENT into a plain DO loop, in which case a performance loss can occur, depending on how the loop header is written (long answer below).
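Applied to the three-element example from the question, the DO CONCURRENT variant would look like the sketch below (it needs a compiler with Fortran 2008 support, such as the Intel compiler mentioned above):

```fortran
program scalar_fill
  implicit none
  integer :: i
  integer, dimension(3) :: x
  ! DO CONCURRENT asserts that the iterations are independent of one
  ! another, which leaves the compiler free to vectorize or reorder them.
  do concurrent (i = 1:3)
    x(i) = 1
  end do
  print *, x
end program scalar_fill
```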

LONG ANSWER:

I compared DO loop vs. array assignment vs. DO CONCURRENT this morning, for my own knowledge, with the following code:

program performance_test
implicit none
integer :: i,j,k, nloop = 10**4
integer, dimension(:,:), allocatable :: variable
real*8  :: tstart,tend
allocate(variable(nloop,nloop))

! array syntax
call cpu_time(tstart)
do k = 1,10
  variable = 0.0
end do
call cpu_time(tend)
write(*,*) tend-tstart

! do-loop, column-wise (innermost index varies fastest, matching
! Fortran's column-major storage)
call cpu_time(tstart)
do k = 1,10
  do j = 1,nloop
    do i = 1,nloop
      variable(i,j) = 0.0
    end do
  end do
end do
call cpu_time(tend)
write(*,*) tend-tstart

! do concurrent, column-wise
call cpu_time(tstart)
do k = 1,10
  do concurrent (j = 1:nloop, i = 1:nloop)
    variable(i,j) = 0.0
  end do
end do
call cpu_time(tend)
write(*,*) tend-tstart

! do-loop, row-wise (strided memory access)
call cpu_time(tstart)
do k = 1,10
  do i = 1,nloop
    do j = 1,nloop
      variable(i,j) = 0.0
    end do
  end do
end do
call cpu_time(tend)
write(*,*) tend-tstart

! do concurrent, row-wise
call cpu_time(tstart)
do k = 1,10
  do concurrent (i = 1:nloop, j = 1:nloop)
    variable(i,j) = 0.0
  end do
end do
call cpu_time(tend)
write(*,*) tend-tstart    

end program performance_test

The compiler is Intel Fortran 2015.
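For reference, a typical way to build and run this test with the Intel compiler would be the commands below (the flag spellings are the usual ifort ones; the source filename is assumed, so adjust both for your environment):

```shell
# Unoptimized build (first set of timings):
ifort -O0 performance_test.f90 -o perf_test && ./perf_test

# O2-optimized build (second set of timings):
ifort -O2 performance_test.f90 -o perf_test && ./perf_test
```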

Result:

DO CONCURRENT and the simple DO loop both win over Fortran array syntax (at least in this example, as far as I can tell), whether the compiler optimization flag is on or off, but ONLY when the loop is written with Fortran's column-wise order in mind. Technically speaking, I don't think one can talk about column-wise or row-wise order with DO CONCURRENT, as labeled in the results below. With no optimization flag, DO CONCURRENT behaves like a simple DO loop. With the optimization flag, the compiler itself will take care of the order of the loop.

My experience so far with DO CONCURRENT in the Intel Fortran Compiler 2015: in complex DO loops where the compiler cannot easily detect concurrency, it yields some performance gain over a simple DO loop; in other cases it is just as good as a DO loop, as it should be. The EXCEPTION is when it is combined with OpenMP directives, which, as of today, causes disaster, at least as far as I have experimented.

No compiler optimization:

array syntax               :  4.44602850000000
do-loop, column-wise       :  3.82202450000000
do concurrent, column-wise :  3.91562510000000
do-loop, row-wise          :  19.1413227000000
do concurrent, row-wise    :  19.2817236000000

O2 level optimization:

array syntax               :  0.218401400000000
do-loop, column-wise       :  0.187201200000000
do concurrent, column-wise :  0.171601100000000
do-loop, row-wise          :  0.187201200000000
do concurrent, row-wise    :  0.171601100000000

Upvotes: 2
