Reputation: 163
I use Ubuntu to develop Fortran (2008+) programs with MPI. Things were pretty settled on earlier Ubuntu versions, but I am having difficulty compiling and running Fortran/MPI programs on Ubuntu 22.04, which I installed on a new PC very recently.
I first installed OpenMPI, but it wouldn't compile my programs at all, complaining that it couldn't find some include files related to mpi_f08. (I am sorry, but I can't recall the exact message, and I have since uninstalled OpenMPI.)
I had better luck with MPICH, though. It compiles my programs, but they crash during execution as soon as the first communication between processes should take place. A minimal example that demonstrates the issue is given below:
subroutine global_sum_real(phi_old)
  use mpi_f08
  implicit none
  real    :: phi_old
  real    :: phi_new
  integer :: error
  call mpi_allreduce(phi_old,        & ! send buffer
                     phi_new,        & ! recv buffer
                     1,              & ! length
                     mpi_real,       & ! datatype
                     mpi_sum,        & ! operation
                     mpi_comm_world, & ! communicator
                     error)
  phi_old = phi_new
end subroutine

program global_sum_mpi
  use mpi_f08
  implicit none
  real    :: suml
  integer :: error
  call mpi_init(error)
  suml = 1.0
  call global_sum_real(suml)
  print *, suml
  call mpi_finalize(error)
end program
I hope it is clear what is happening above. The main program (global_sum_mpi) initializes MPI and calls one subroutine (global_sum_real), which is essentially a wrapper around MPI_Allreduce. Very simple.
If I compile it with mpifort (mpifort for MPICH version 4.0 ... gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)) and try to run it in parallel, it crashes with the error:
Internal Error: Invalid type in descriptor
in the line which calls MPI_Allreduce. The funny thing is that if I change the module I use for MPI from:
use mpi_f08
to the plain:
use mpi
everything works as expected. This is not a route I would like to take, because I believe that mpi_f08 is more up to date with later Fortran standards, and I also need mpi_f08 for better compatibility with external PETSc libraries.
Any ideas on why use mpi_f08 is causing problems on the new Ubuntu installation?
Kind regards
Upvotes: 3
Views: 937
Reputation: 8395
The root cause is a bug in the gfortran provided by Ubuntu 22.04 (jammy).
Here is a sample program that crashes:
module mymod
  implicit none
  interface bar
    subroutine bar_f08ts (a) bind(C, name="sync")
      implicit none
      type(*), dimension(..) :: a
    end subroutine
  end interface
end module

module pub
  implicit none
  interface sub
    subroutine pub_f08ts(a)
      implicit none
      type (*), dimension(..) :: a
    end subroutine
  end interface
end module

subroutine pub_f08ts(a)
  use mymod
  implicit none
  type (*), dimension(..) :: a
  call bar(a)
end subroutine

subroutine bugsub(a)
  use pub
  implicit none
  real :: a
  call sub(a)
end subroutine

program bug
  implicit none
  real a
  a = 1
  call bugsub(a)
end program
$ gfortran test.f90
$ ./a.out
Internal Error: Invalid type in descriptor
Error termination. Backtrace:
#0 0x7f27b38c2ad0 in ???
#1 0x7f27b38c3649 in ???
#2 0x7f27b38c3e38 in ???
#3 0x7f27b3b058a4 in ???
#4 0x56281847220b in ???
#5 0x5628184721c4 in ???
#6 0x562818472264 in ???
#7 0x5628184722a0 in ???
#8 0x7f27b36a0d8f in __libc_start_call_main
at ../sysdeps/nptl/libc_start_call_main.h:58
#9 0x7f27b36a0e3f in __libc_start_main_impl
at ../csu/libc-start.c:392
#10 0x5628184720c4 in ???
#11 0xffffffffffffffff in ???
I was unable to reproduce this issue on a Red Hat box with various gfortran versions.
The right way to move forward is to report this to Ubuntu and wait for a fix.
Meanwhile, you can either use another distro or use Open MPI (which does not use the CFI_cdesc_t stuff, so the gfortran bug should not affect you).
I do not understand how to use the Ubuntu packages for openmpi (some are provided, but unless I missed something, the libraries and header files are not available), but you can build and install it from source in your home directory (no root access needed).
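As a rough sketch of the from-source route, assuming you have already downloaded an Open MPI release tarball (the names OMPI_TARBALL, PREFIX and build_openmpi are made up for illustration and are not part of Open MPI), the usual configure/make/install sequence into your home directory might look like this:

```shell
#!/bin/sh
# Sketch: build Open MPI from a release tarball into the home directory.
# OMPI_TARBALL, PREFIX and build_openmpi are illustrative names only.

PREFIX="$HOME/opt/openmpi"   # install location; no root access needed

build_openmpi() {
    tarball=$1
    src=$(mktemp -d) || return 1
    # Unpack the release, dropping its single top-level directory (GNU tar).
    tar xf "$tarball" -C "$src" --strip-components=1 || return 1
    (
        cd "$src" || exit 1
        # configure looks for a Fortran compiler (e.g. gfortran) on PATH.
        ./configure --prefix="$PREFIX" && make -j"$(nproc)" && make install
    )
}

if [ -n "${OMPI_TARBALL:-}" ]; then
    build_openmpi "$OMPI_TARBALL"
else
    echo "Set OMPI_TARBALL to the path of a downloaded Open MPI release tarball."
fi
```

Afterwards, put $PREFIX/bin at the front of PATH and $PREFIX/lib on LD_LIBRARY_PATH so that mpifort and mpirun resolve to this build rather than to MPICH. Check the configure summary to confirm that the Fortran bindings were actually enabled.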
ADDITIONAL INFO
The issue occurs because gfortran-11 uses the libgfortran from gfortran-12, and the two interact badly.
I reported this to the GNU folks at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108056
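To check whether your own machine has this compiler/runtime mismatch, something along these lines may help. It is only a sketch: it assumes a Debian/Ubuntu system where the runtime is packaged as libgfortran5, and major_of and check_gfortran_runtime are helper names invented here.

```shell
#!/bin/sh
# Sketch: compare the gfortran major version with the major version of the
# installed libgfortran5 runtime package (Debian/Ubuntu only).

# Print the leading major number of a version string,
# e.g. "12.1.0-2ubuntu1~22.04" -> "12".
major_of() {
    printf '%s\n' "${1%%[.-]*}"
}

check_gfortran_runtime() {
    # Both tools are required; report and stop where they are missing.
    command -v gfortran >/dev/null 2>&1 || { echo "gfortran not found"; return 0; }
    command -v dpkg-query >/dev/null 2>&1 || { echo "dpkg-query not found"; return 0; }

    compiler=$(major_of "$(gfortran -dumpversion)")
    # Assumption: the runtime package is named libgfortran5.
    runtime=$(major_of "$(dpkg-query -W -f='${Version}' libgfortran5 2>/dev/null)")

    echo "gfortran major version:    ${compiler:-unknown}"
    echo "libgfortran major version: ${runtime:-unknown}"
    if [ -n "$compiler" ] && [ -n "$runtime" ] && [ "$compiler" != "$runtime" ]; then
        echo "mismatch: compiler and runtime come from different GCC releases"
    fi
}

check_gfortran_runtime
```

A reported mismatch only tells you that the setup described above is present; it does not by itself prove that a given crash is caused by it.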
Upvotes: 5