Reputation: 23
I am trying to work on an earth system model and I am new to this. Currently, I am only trying to run a test case which is computationally less intensive.
My system is Ubuntu 20.04. I have built the required libraries in the following order - mpich, pnetcdf, zlib, hdf5, netcdf-c, netcdf-fortran, lapack and blas. The versions are as follows (my GCC and gfortran version is 9.4.0) mpich-3.3.1, pnetcdf-1.12.3, zlib-1.2.13, hdf5-1.10.5, netcdf-c-4.9.0, netcdf-fortran-4.6.0, LAPACK and BLAS 3.11. For building with Parallel I/O support I had followed the order Pnetcdf, then hdf5, then Netcdf-c and finally Netcdf-fortran while installing. All the libraries and packages were installed properly without any error and with the same compiler that I'd be using for the model.
The issue I am running into has to do with the linking of the libraries (pnetcdf, netcdf-c and netcdf-fortran), in particular their order, as indicated by the forum dedicated to the model. At the end of the model build, when it tries to create a single executable, it fails (collect2: error: ld returned 1 exit status). The following is the command that produces the errors:
mpif90 -o /home/ubuntuvm/projects/cesm/scratch/testrun11/bld/cesm.exe \
cime_comp_mod.o cime_driver.o component_mod.o component_type_mod.o \
cplcomp_exchange_mod.o map_glc2lnd_mod.o map_lnd2glc_mod.o \
map_lnd2rof_irrig_mod.o mrg_mod.o prep_aoflux_mod.o prep_atm_mod.o \
prep_glc_mod.o prep_ice_mod.o prep_lnd_mod.o prep_ocn_mod.o \
prep_rof_mod.o prep_wav_mod.o seq_diag_mct.o seq_domain_mct.o \
seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_io_mod.o \
seq_map_mod.o seq_map_type_mod.o seq_rest_mod.o t_driver_timers_mod.o \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -latm \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lice \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -llnd \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -locn \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lrof \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lglc \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lwav \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lesp \
-L../../gnu/mpich/nodebug/nothreads/mct/noesmf/c1a1l1i1o1r1g1w1e1/lib \
-lcsm_share -L../../gnu/mpich/nodebug/nothreads/lib -lpio -lgptl \
-lmct -lmpeu -L/home/ubuntuvm/CESM/lib -lnetcdff \
-Wl,-rpath=/home/ubuntuvm/CESM/lib -lnetcdf -lm -lnetcdf -lhdf5_hl \
-lhdf5 -lpnetcdf -ldl -lm -lz -Wl,-rpath=/home/ubuntuvm/CESM/lib \
-lpnetcdf -L/usr/local/lib -llapack -L/usr/local/lib -lblas \
-L/home/ubuntuvm/CESM/lib -lpnetcdf -L/home/ubuntuvm/CESM/lib
Below is part of the error output; libpio.a is a component library that is built before the command above:
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_copy_att':
nf_mod.F90:(.text+0x31): undefined reference to `nfmpi_copy_att'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_def_var_md':
nf_mod.F90:(.text+0x3b5): undefined reference to `nfmpi_def_var'
/usr/bin/ld: nf_mod.F90:(.text+0x4fe): undefined reference to `nfmpi_def_var'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_def_dim':
nf_mod.F90:(.text+0xab9): undefined reference to `nfmpi_def_dim'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_redef':
nf_mod.F90:(.text+0xeb9): undefined reference to `nfmpi_redef'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_enddef':
nf_mod.F90:(.text+0xff0): undefined reference to `nfmpi_enddef'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimlen':
nf_mod.F90:(.text+0x115c): undefined reference to `nfmpi_inq_dimlen'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimname':
nf_mod.F90:(.text+0x14c2): undefined reference to `nfmpi_inq_dimname'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimid':
nf_mod.F90:(.text+0x1821): undefined reference to `nfmpi_inq_dimid'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_inq_varnatts_vid':
nf_mod.F90:(.text+0x1c24): undefined reference to `nfmpi_inq_varnatts'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_inq_vardimid_vid':
nf_mod.F90:(.text+0x1fcc): undefined reference to `nfmpi_inq_vardimid'
The libraries are linked as follows
-L/home/ubuntuvm/CESM_Library/lib -lnetcdff -lnetcdf \
-Wl,-rpath=/home/ubuntuvm/CESM_Library/lib -lnetcdf -lm -lnetcdf -lhdf5_hl \
-lhdf5 -lpnetcdf -ldl -lm -lz -Wl,-rpath=/home/ubuntuvm/CESM_Library/lib -lpnetcdf
What am I doing wrong here? I would be grateful for any suggestions regarding the order of the libraries and would be happy to provide any other details that might be required.
Upvotes: 1
Views: 895
Reputation: 17179
While it is difficult to reproduce this specific error, it looks like the issue is with the PnetCDF library, which does not seem to contain some of the Fortran functions.
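One way to test this hypothesis is to look at the symbol table of the installed PnetCDF library. The following is a minimal sketch, not part of the original instructions: `defines_symbol` is a hypothetical helper, the library path is an assumption taken from the `-L` flag in the question, and the symbol names come from the linker errors. It accepts the name with or without the trailing underscore that gfortran usually appends.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the nm output on stdin contains a
# *defined* symbol with the given name (optionally followed by "_").
defines_symbol() {
    awk -v sym="$1" '
        # nm prints "<address> <type> <name>" for defined symbols and
        # "<type> <name>" (no address) for undefined ones.
        NF == 3 && $2 ~ /^[TtWw]$/ && ($3 == sym || $3 == (sym "_")) { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Assumed install location; adjust to your own prefix.
LIB="${PNETCDF_LIB:-$HOME/CESM/lib/libpnetcdf.a}"
if [ -f "$LIB" ]; then
    for s in nfmpi_def_var nfmpi_def_dim nfmpi_enddef; do
        if nm "$LIB" 2>/dev/null | defines_symbol "$s"; then
            echo "$s: defined"
        else
            echo "$s: MISSING"
        fi
    done
fi
```

If the symbols are reported as missing, the library was probably built without its Fortran interface and needs to be reconfigured and rebuilt.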
Here is a brief instruction for setting up CESM on Ubuntu with OpenMPI starting from scratch.
Caveats: This example is for Ubuntu 22.04 (Jammy) but should work on all recent Ubuntu versions. This is for GCC version 11.4. Switching from OpenMPI to MPICH might require certain changes.
It is based on a very detailed post by Yonash Mersha. If some details seem to be missing here, compare these instructions with that post.
Assume the hostname is ubuntu-jammy
apt-get install -y gcc g++ gfortran build-essential cmake git \
subversion python-is-python3 perl vim python3-pip libxml2-utils unzip libopenmpi-dev
Set the number of slots for the host (the 8 here matches the -j 8 option used for make below):
echo "ubuntu-jammy slots=8" >> /etc/openmpi/openmpi-default-hostfile
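If the hostname or core count differs from the example, the hostfile entry can be generated rather than typed. This is only a sketch: `hostfile_entry` is a hypothetical helper, and it assumes `hostname` and `nproc` (GNU coreutils) are available, as they are on a stock Ubuntu install.

```shell
#!/bin/sh
# Hypothetical helper: print one OpenMPI hostfile entry, "<host> slots=<n>".
hostfile_entry() {
    printf '%s slots=%s\n' "$1" "$2"
}

# nproc reports the number of processing units available to this process.
hostfile_entry "$(hostname)" "$(nproc)"
```

The printed line can then be appended to the hostfile as root, for example by piping it to `sudo tee -a` with the hostfile path.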
mkdir CESM
git clone -b release-cesm2.1.3 <CESM repository URL> my_cesm_sandbox
mkdir ~/.cime
export CIME_MODEL=cesm
mkdir -p cesm/inputdata
Edit the script that builds the SVN checkout command, around line 267, to add an option that ignores an unknown CA. This might introduce a vulnerability, so think twice.
Make the SVN command look like this:
cmd = ['svn', 'checkout', '--non-interactive', '--trust-server-cert-failures=unknown-ca', '--quiet', url, repo_dir_path]
Use ./manage_externals/checkout_externals -S to check that all the necessary code has been delivered.
Download and unpack zlib, hdf5, netcdf-c, netcdf-fortran, pnetcdf, lapack
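The build commands that follow refer to $ZLIB, $HDF5, $NETCDF, $PNETCDF and $LAPACK as install prefixes. The sketch below shows one way to define them. The layout (one directory per library under $HOME/CESM/lib) is an assumption inferred from the paths in the machine configuration further down, and `CESM_LIBS` is a hypothetical helper variable, so adjust both to your setup.

```shell
#!/bin/sh
# Assumed layout: one install prefix per library under $HOME/CESM/lib.
CESM_LIBS="${CESM_LIBS:-$HOME/CESM/lib}"   # hypothetical helper variable

export ZLIB="$CESM_LIBS/zlib"
export HDF5="$CESM_LIBS/hdf5"
export NETCDF="$CESM_LIBS/netcdf"      # used for both netcdf-c and netcdf-fortran
export PNETCDF="$CESM_LIBS/pnetcdf"
export LAPACK="$CESM_LIBS/lapack"

# Make the installed tools and shared libraries visible to later builds.
export PATH="$NETCDF/bin:$PNETCDF/bin:$HDF5/bin:$PATH"
export LD_LIBRARY_PATH="$NETCDF/lib:$PNETCDF/lib:$HDF5/lib:$ZLIB/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Sourcing this file in every new shell before building keeps the prefixes consistent across all of the configure steps.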
To build zlib, use the default settings. In the directory containing the zlib sources:
./configure --prefix=$ZLIB
make -j 8
make check
make install
Next, configure HDF5 with parallel support. In the directory containing the HDF5 sources:
CPPFLAGS="-I$ZLIB/include" LDFLAGS="-L$ZLIB/lib" \
CC=mpicc CXX=mpicxx ./configure --prefix=$HDF5 --with-zlib=$ZLIB --enable-hl --enable-fortran --enable-parallel
After configure check that parallel support is enabled:
Parallel HDF5: yes
Parallel Filtered Dataset Writes: yes
Large Parallel I/O: yes
High-level library: yes
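Reading the summary by eye is error-prone, so the check can be scripted. The following is a sketch under one assumption: the configure output was saved to a file, for example by piping `./configure ...` through `tee configure.log`. The file name and the `setting_is_yes` helper are hypothetical. The same function works for the NetCDF summaries later on, since they use the same "key: value" layout.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin has a line "<key>: yes".
setting_is_yes() {
    awk -v key="$1" '
        {
            n = index($0, ":")
            if (n == 0) next                    # not a "key: value" line
            k = substr($0, 1, n - 1)
            v = substr($0, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)      # trim the value
            if (k == key && v == "yes") found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Assumed log file, written with: ./configure ... | tee configure.log
SUMMARY="${SUMMARY:-configure.log}"
if [ -f "$SUMMARY" ]; then
    for key in "Parallel HDF5" "High-level library"; do
        if setting_is_yes "$key" < "$SUMMARY"; then
            echo "$key: yes"
        else
            echo "$key: NOT enabled"
        fi
    done
fi
```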
Now build, check, and install.
make -j8
make -j8 check
make install
Next, configure NetCDF-C. In the directory containing the netcdf-c sources:
CPPFLAGS="-I$HDF5/include -I$ZLIB/include" LDFLAGS="-L$HDF5/lib -L$ZLIB/lib" \
CC=mpicc CXX=mpicxx ./configure --prefix=$NETCDF --disable-dap --enable-parallel4
Make sure the correct options are enabled
# NetCDF C Configuration Summary
HDF5 Support: yes
NetCDF-4 API: yes
NC-4 Parallel Support: yes
Now build, check, and install.
make -j8
make -j8 check
make install
Next, configure NetCDF-Fortran. In the directory containing the netcdf-fortran sources:
CPPFLAGS="-I$NETCDF/include -I$HDF5/include -I$ZLIB/include" \
FFLAGS="-fallow-argument-mismatch -fallow-invalid-boz" \
CC=mpicc CXX=mpicxx FC=mpifort ./configure --prefix=$NETCDF
Check that Parallel options are configured as follows:
# NetCDF Fortran Configuration Summary
Parallel IO: yes
NetCDF4 Parallel IO: yes
PnetCDF Parallel IO: no
Now build, check, and install
make -j8
make -j8 check
make install
Next, configure PnetCDF with Fortran support. In the directory containing the pnetcdf sources:
CPPFLAGS="-I$NETCDF/include -I$HDF5/include -I$ZLIB/include" \
FFLAGS="-fallow-argument-mismatch -fallow-invalid-boz" \
LDFLAGS="-L$NETCDF/lib -L$HDF5/lib -L$ZLIB/lib" \
CC=mpicc CXX=mpicxx FC=mpifort ./configure --prefix=$PNETCDF --enable-shared --enable-fortran --enable-profiling --enable-large-file-test --with-netcdf4
make -j8
make -j8 tests
make check
make ptest
make ptests
make install
ptest runs on 4 MPI processes, so your system should have at least that many processors.
Do not worry if some of the ptests tests fail because your system does not have enough processors; some of them require 10 or more MPI processes.
Switch to the LAPACK source directory
make -j8 blaslib
make -j8 lapacklib
mkdir -p $LAPACK/lib
mv librefblas.a $LAPACK/lib/libblas.a
mv liblapack.a $LAPACK/lib/liblapack.a
<?xml version="1.0"?>
<config_machines version="2.0">
<machine MACH="ubuntu-jammy">
Example port to Ubuntu Jammy linux system with gcc, netcdf, pnetcdf and openmpi
<SUPPORTED_BY>[email protected]</SUPPORTED_BY>
<mpirun mpilib="default">
<arg name="ntasks"> -np {{ total_tasks }} </arg>
<module_system type="none" allow_error="true">
<env name="NETCDF">$ENV{HOME}/CESM/lib/netcdf</env>
<env name="PNETCDF">$ENV{HOME}/CESM/lib/pnetcdf</env>
<env name="OMP_STACKSIZE">256M</env>
<resource name="RLIMIT_STACK">-1</resource>
Validate the XML file
xmllint --noout --schema $HOME/CESM/my_cesm_sandbox/cime/config/xml_schemas/config_machines.xsd $HOME/.cime/config_machines.xml
<?xml version="1.0" encoding="UTF-8"?>
<config_compilers version="2.0">
<append compile_threaded="true"> -fopenmp </append>
<append> -fallow-argument-mismatch -fallow-invalid-boz</append>
<append>-L $ENV{HOME}/CESM/lib/netcdf/lib -lnetcdff -lnetcdf -lm</append>
<append>-L $ENV{HOME}/CESM/lib/lapack/lib -llapack -lblas</append>
Validate the XML file
xmllint --noout --schema $HOME/CESM/my_cesm_sandbox/cime/config/xml_schemas/config_compilers_v2.xsd $HOME/.cime/config_compilers.xml
cd $HOME/CESM/my_cesm_sandbox
cime/scripts/create_newcase --case mycase --compset X --res f19_g16
cd mycase
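From inside the case directory, the usual CIME workflow continues with the case.setup, case.build and case.submit scripts that create_newcase places there. The wrapper below is a sketch rather than part of the original instructions: `run_case_steps` is a hypothetical helper that runs the three scripts in order and stops at the first one that is missing or fails.

```shell
#!/bin/sh
# Hypothetical helper: run the standard CIME case scripts in order from the
# current directory, stopping at the first missing or failing step.
run_case_steps() {
    for step in case.setup case.build case.submit; do
        if [ ! -x "./$step" ]; then
            echo "missing ./$step; run this from the case directory" >&2
            return 1
        fi
        "./$step" || return 1
    done
}

# Usage, from inside the case directory:
#   run_case_steps
```

Running the steps one at a time by hand is just as valid and makes it easier to see which stage fails, for example the link step shown in the question.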
Upvotes: 1