Two times looping 1D with different configurations

Question

Does anyone know why Program B is faster than Program A?

I used ifort-16 with the -fast optimization flag. The optimization report gives an estimated potential speedup of 10.09 for Program A but only 3.90 for Program B. In practice, however, Program B runs in 14 s while Program A takes 20 s.

!Program A
 DO J=1, 100000          !This is the part that differs
    !$OMP SIMD
    DO I=1, 100000
       IF(A(I)==J) THEN
          B(I)=J
       END IF
    END DO
    !$OMP END SIMD
 END DO

!Program B
 DO I=1, 100000          !This is the part that differs
    !$OMP SIMD
    DO J=1, 100000
       IF(A(I)==J) THEN
          B(I)=J
       END IF
    END DO
    !$OMP END SIMD
 END DO
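
For anyone who wants to reproduce the comparison, the two fragments above can be wrapped in a complete program along the following lines. This is only a minimal sketch: the declarations, the way A is filled, the separate result arrays B_A and B_B, and the use of SYSTEM_CLOCK for timing are all assumptions of mine, since the original declarations and data are not shown. The measured times will depend on the compiler, its flags, and the contents of A.

```fortran
PROGRAM loop_order_timing
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 100000       ! same trip count as in the question
  INTEGER :: A(N), B_A(N), B_B(N)        ! B_A / B_B: hypothetical names, one result per variant
  INTEGER :: I, J
  INTEGER(KIND=8) :: t0, t1, t2, rate

  ! Assumption: the real contents of A are not shown in the question,
  ! so A is filled with arbitrary values in the range 1..N.
  DO I = 1, N
     A(I) = MOD(7*I, N) + 1
  END DO
  B_A = 0
  B_B = 0

  CALL SYSTEM_CLOCK(t0, rate)

  ! Program A: the inner (SIMD) loop runs over the array index I
  DO J = 1, N
     !$OMP SIMD
     DO I = 1, N
        IF (A(I) == J) THEN
           B_A(I) = J
        END IF
     END DO
  END DO

  CALL SYSTEM_CLOCK(t1)

  ! Program B: the inner (SIMD) loop runs over the compared value J
  DO I = 1, N
     !$OMP SIMD
     DO J = 1, N
        IF (A(I) == J) THEN
           B_B(I) = J
        END IF
     END DO
  END DO

  CALL SYSTEM_CLOCK(t2)

  PRINT *, 'Program A time (s): ', REAL(t1 - t0, KIND=8) / REAL(rate, KIND=8)
  PRINT *, 'Program B time (s): ', REAL(t2 - t1, KIND=8) / REAL(rate, KIND=8)
  ! Using the results keeps the compiler from discarding either loop nest
  PRINT *, 'Results identical:  ', ALL(B_A == B_B)
END PROGRAM loop_order_timing
```

Both loop nests should leave the same values in the result array, so the final line is a quick sanity check that the two variants really are equivalent. N can be reduced for a faster trial run.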

Both programs were vectorized successfully. My intuition was that Program A would be faster, since (in my opinion) the two loops would be vectorized as follows:

!Program A
 IF(A(I)==J) THEN
    B(I)=J
 END IF

 IF(A(I+1)==J) THEN
    B(I+1)=J
 END IF

...

and

!Program B
 IF(A(I)==J) THEN
    B(I)=J
 END IF

 IF(A(I)==J+1) THEN
    B(I)=J+1
 END IF

...

I expected Program A to be more efficient, since the left-hand-side indices are computed directly. In fact, my expectation was wrong. Thanks in advance.
