Aatreyee

Reputation: 23

Linking PnetCDF, NetCDF-C and NetCDF-Fortran libraries for an earth system model

I am trying to work on an earth system model and I am new to this. Currently, I am only trying to run a test case which is computationally less intensive.

My system is Ubuntu 20.04, and my GCC and gfortran version is 9.4.0. I have built the required libraries in the following order: mpich-3.3.1, pnetcdf-1.12.3, zlib-1.2.13, hdf5-1.10.5, netcdf-c-4.9.0, netcdf-fortran-4.6.0, and LAPACK/BLAS 3.11. For parallel I/O support I installed PnetCDF first, then HDF5, then NetCDF-C, and finally NetCDF-Fortran. All the libraries and packages were installed without any error, using the same compiler that I would be using for the model.

The issue I am coming across now has to do with the linking of the libraries (pnetcdf, netcdf-c and netcdf-fortran), more particularly their order, as indicated on the forum dedicated to the model. At the end of the model build, when it tries to create a single executable, linking fails (collect2: error: ld returned 1 exit status). The following is the command that produces the errors:

mpif90 -o /home/ubuntuvm/projects/cesm/scratch/testrun11/bld/cesm.exe \
cime_comp_mod.o cime_driver.o component_mod.o component_type_mod.o \
cplcomp_exchange_mod.o map_glc2lnd_mod.o map_lnd2glc_mod.o \
map_lnd2rof_irrig_mod.o mrg_mod.o prep_aoflux_mod.o prep_atm_mod.o \
prep_glc_mod.o prep_ice_mod.o prep_lnd_mod.o prep_ocn_mod.o \
prep_rof_mod.o prep_wav_mod.o seq_diag_mct.o seq_domain_mct.o \
seq_flux_mct.o seq_frac_mct.o seq_hist_mod.o seq_io_mod.o \
seq_map_mod.o seq_map_type_mod.o seq_rest_mod.o t_driver_timers_mod.o \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -latm \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lice \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -llnd \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -locn \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lrof \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lglc \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lwav \
-L/home/ubuntuvm/projects/cesm/scratch/testrun11/bld/lib/ -lesp \
-L../../gnu/mpich/nodebug/nothreads/mct/noesmf/c1a1l1i1o1r1g1w1e1/lib \
-lcsm_share -L../../gnu/mpich/nodebug/nothreads/lib -lpio -lgptl \
-lmct -lmpeu  -L/home/ubuntuvm/CESM/lib -lnetcdff \
-Wl,-rpath=/home/ubuntuvm/CESM/lib -lnetcdf -lm -lnetcdf -lhdf5_hl \
-lhdf5 -lpnetcdf -ldl -lm -lz -Wl,-rpath=/home/ubuntuvm/CESM/lib \
-lpnetcdf -L/usr/local/lib -llapack -L/usr/local/lib -lblas \
-L/home/ubuntuvm/CESM/lib -lpnetcdf  -L/home/ubuntuvm/CESM/lib

Below is part of the error output; libpio.a is a component library that is built before the above command runs:

/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_copy_att':
nf_mod.F90:(.text+0x31): undefined reference to `nfmpi_copy_att'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_def_var_md':
nf_mod.F90:(.text+0x3b5): undefined reference to `nfmpi_def_var'
/usr/bin/ld: nf_mod.F90:(.text+0x4fe): undefined reference to `nfmpi_def_var'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_def_dim':
nf_mod.F90:(.text+0xab9): undefined reference to `nfmpi_def_dim'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_redef':
nf_mod.F90:(.text+0xeb9): undefined reference to `nfmpi_redef'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_enddef':
nf_mod.F90:(.text+0xff0): undefined reference to `nfmpi_enddef'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimlen':
nf_mod.F90:(.text+0x115c): undefined reference to `nfmpi_inq_dimlen'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimname':
nf_mod.F90:(.text+0x14c2): undefined reference to `nfmpi_inq_dimname'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_pio_inq_dimid':
nf_mod.F90:(.text+0x1821): undefined reference to `nfmpi_inq_dimid'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_inq_varnatts_vid':
nf_mod.F90:(.text+0x1c24): undefined reference to `nfmpi_inq_varnatts'
/usr/bin/ld: ../../gnu/mpich/nodebug/nothreads/lib/libpio.a(nf_mod.F90.o): in function `__nf_mod_MOD_inq_vardimid_vid':
nf_mod.F90:(.text+0x1fcc): undefined reference to `nfmpi_inq_vardimid'

The libraries are linked as follows

-L/home/ubuntuvm/CESM_Library/lib -lnetcdff -lnetcdf \
-Wl,-rpath=/home/ubuntuvm/CESM_Library/lib -lnetcdf -lm -lnetcdf -lhdf5_hl \
-lhdf5 -lpnetcdf -ldl -lm -lz -Wl,-rpath=/home/ubuntuvm/CESM_Library/lib -lpnetcdf

What am I doing wrong here? I would be grateful for any suggestions regarding the order of the libraries and would be happy to provide any other details that might be required.

Upvotes: 1

Views: 895

Answers (1)

Dima Chubarov

Reputation: 17179

While it is difficult to reproduce this specific error, it looks like the issue is with the PnetCDF library, which does not seem to contain the Fortran interface functions (the unresolved nfmpi_* symbols).

Here is a brief instruction for setting up CESM on Ubuntu with OpenMPI starting from scratch.

Caveats: this example is for Ubuntu 22.04 (Jammy) with GCC 11.4, but it should work on all recent Ubuntu versions. Switching from OpenMPI to MPICH might require some changes.

It is based on a very detailed post by Yonash Mersha; if some detail seems missing below, compare the instructions with that post.

Assume the hostname is ubuntu-jammy.

  1. Install the necessary system packages and libraries
apt-get install -y gcc g++ gfortran build-essential cmake git \
subversion python-is-python3 perl vim python3-pip libxml2-utils unzip libopenmpi-dev
  1. For OpenMPI tests, configure the number of slots on your machine. Assume you have 8 CPU cores; this number is also used many times below as the argument to make's -j option.
echo "ubuntu-jammy slots=8" >> /etc/openmpi/openmpi-default-hostfile
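Rather than hardcoding 8, the slot count can be derived from the machine itself. A small sketch (`nproc` is from coreutils; the hostfile path is the one from the step above):

```shell
# Derive the OpenMPI slot count from the actual core count.
slots=$(nproc)
echo "ubuntu-jammy slots=$slots"
# To apply it for real (needs root):
#   echo "ubuntu-jammy slots=$slots" >> /etc/openmpi/openmpi-default-hostfile
```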
  1. Download CESM and setup CIME
mkdir CESM
cd CESM
git clone -b release-cesm2.1.3 https://github.com/ESCOMP/CESM.git my_cesm_sandbox
mkdir ~/.cime
export CIME_MODEL=cesm
mkdir -p cesm/inputdata
  1. Some UCAR SVN servers use either self-signed certificates or certificates from an unknown CA, so edit my_cesm_sandbox/manage_externals/manic/repository_svn.py around line 267 to add an option that ignores the unknown CA. This might introduce a vulnerability, so think twice. Make the SVN command look like this:
cmd = ['svn', 'checkout', '--non-interactive', '--trust-server-cert-failures=unknown-ca', '--quiet', url, repo_dir_path]
  1. Checkout the model codes
./manage_externals/checkout_externals

Use ./manage_externals/checkout_externals -S to check that all the necessary codes have been delivered.

  1. Now build the IO libraries and LAPACK. We shall assume that the libraries for CESM will be installed under $HOME/CESM/lib.
CESM_LIB_DIR=$HOME/CESM/lib
ZLIB=$CESM_LIB_DIR/zlib
HDF5=$CESM_LIB_DIR/hdf5
NETCDF=$CESM_LIB_DIR/netcdf
PNETCDF=$CESM_LIB_DIR/pnetcdf
LAPACK=$CESM_LIB_DIR/lapack
export ZLIB HDF5 NETCDF PNETCDF LAPACK
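Optionally, the install prefixes can be created up front so each later `make install` has an existing target. A small sketch; the fallback assignment is only so the snippet runs standalone, and the `lapack` prefix is an assumption matching the LAPACK install step later on:

```shell
# Create one prefix per library under $CESM_LIB_DIR.
CESM_LIB_DIR=${CESM_LIB_DIR:-$HOME/CESM/lib}
for lib in zlib hdf5 netcdf pnetcdf lapack; do
  mkdir -p "$CESM_LIB_DIR/$lib"
done
ls "$CESM_LIB_DIR"
```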
  1. Download and unpack zlib, hdf5, netcdf-c, netcdf-fortran, pnetcdf, lapack

  2. To build zlib, use the default settings. In the directory containing the zlib sources:

./configure --prefix=$ZLIB
make -j 8
make check
make install
  1. Important: build HDF5 with parallel support. In the directory containing the HDF5 sources:
CPPFLAGS="-I$ZLIB/include" LDFLAGS="-L$ZLIB/lib" \
CC=mpicc CXX=mpicxx ./configure --prefix=$HDF5 --with-zlib=$ZLIB --enable-hl --enable-fortran --enable-parallel

After configure check that parallel support is enabled:

        SUMMARY OF THE HDF5 CONFIGURATION
        =================================
...
Features:
---------
                   Parallel HDF5: yes
Parallel Filtered Dataset Writes: yes
              Large Parallel I/O: yes
              High-level library: yes
...

Now build, check, and install.

make -j8
make -j8 check
make install
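After installation, the same summary can be re-checked at any time through the `h5pcc` compiler wrapper that a parallel HDF5 build installs (`-showconfig` prints the recorded build settings). The snippet is guarded so it is safe to run before the install exists; the `$HDF5` fallback matches the prefix set above.

```shell
# Re-check the Parallel HDF5 line after `make install`.
HDF5=${HDF5:-$HOME/CESM/lib/hdf5}
if [ -x "$HDF5/bin/h5pcc" ]; then
  "$HDF5/bin/h5pcc" -showconfig | grep -i 'parallel'
else
  echo "h5pcc not found under $HDF5/bin (HDF5 not installed yet?)"
fi
```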
  1. Build the NetCDF C bindings with parallel NetCDF-4 support. In the directory containing the netcdf-c sources:
CPPFLAGS="-I$HDF5/include -I$ZLIB/include" LDFLAGS="-L$HDF5/lib -L$ZLIB/lib" \
CC=mpicc CXX=mpicxx ./configure --prefix=$NETCDF --disable-dap --enable-parallel4

Make sure the correct options are enabled

# NetCDF C Configuration Summary
==============================
...
HDF5 Support:       yes
NetCDF-4 API:       yes
NC-4 Parallel Support:  yes
...

Now make, check, and install:

make -j8
make -j8 check
make install
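netcdf-c installs an `nc-config` tool that answers the same feature questions after the fact; `--has-parallel4` and `--has-pnetcdf` are the query flags as of netcdf-c 4.9. A guarded sketch:

```shell
# Query the installed netcdf-c for its parallel I/O features.
NETCDF=${NETCDF:-$HOME/CESM/lib/netcdf}
if [ -x "$NETCDF/bin/nc-config" ]; then
  echo "parallel4: $("$NETCDF/bin/nc-config" --has-parallel4)"
  echo "pnetcdf:   $("$NETCDF/bin/nc-config" --has-pnetcdf)"
else
  echo "nc-config not found under $NETCDF/bin (netcdf-c not installed yet?)"
fi
```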
  1. Build the NetCDF Fortran bindings. In the directory containing the netcdf-fortran sources:
CPPFLAGS="-I$NETCDF/include -I$HDF5/include -I$ZLIB/include" \
FFLAGS="-fallow-argument-mismatch -fallow-invalid-boz" \
LDFLAGS="-L$NETCDF/lib -L$HDF5/lib -L$ZLIB/lib" LD_LIBRARY_PATH="$NETCDF/lib:$LD_LIBRARY_PATH" \
CC=mpicc CXX=mpicxx FC=mpifort ./configure --prefix=$NETCDF

Check that Parallel options are configured as follows:

# NetCDF Fortran Configuration Summary
==============================
...
Parallel IO:                    yes
NetCDF4 Parallel IO:            yes
PnetCDF Parallel IO:            no
...

Now build, check, and install

make -j8 
make -j8 check
make install
  1. Build PnetCDF. Switch to the directory containing the pnetcdf sources:
CPPFLAGS="-I$NETCDF/include -I$HDF5/include -I$ZLIB/include" \
FFLAGS="-fallow-argument-mismatch -fallow-invalid-boz" \
LDFLAGS="-L$NETCDF/lib -L$HDF5/lib -L$ZLIB/lib" \
LD_LIBRARY_PATH="$NETCDF/lib:$LD_LIBRARY_PATH" \
CC=mpicc CXX=mpicxx FC=mpifort ./configure --prefix=$PNETCDF --enable-shared --enable-fortran --enable-profiling --enable-large-file-test --with-netcdf4

Run

make -j8
make -j8 tests
make check
make ptest
make ptests
make install

ptest runs on 4 processes, so your machine should have at least that many cores. Do not worry if some of the ptests fail because your system does not have enough processors; some of them require 10 or more MPI processes.
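The undefined `nfmpi_*` references in the question are exactly what a PnetCDF build without Fortran bindings produces, so after `make install` it is worth confirming the symbols are present in the archive. A sketch, guarded so it can run anywhere; `$PNETCDF` is the prefix exported above:

```shell
# Count the Fortran entry points in the installed PnetCDF archive.
PNETCDF=${PNETCDF:-$HOME/CESM/lib/pnetcdf}
if [ -f "$PNETCDF/lib/libpnetcdf.a" ]; then
  # A build with --enable-fortran defines many nfmpi_* symbols;
  # a count of 0 reproduces the linker errors from the question.
  nm -g --defined-only "$PNETCDF/lib/libpnetcdf.a" | grep -ci nfmpi_
else
  echo "libpnetcdf.a not found under $PNETCDF/lib (PnetCDF not installed yet?)"
fi
```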

  1. Build and install LAPACK

Switch to the LAPACK source directory

make -j8 blaslib
make -j8 lapacklib
mkdir -p $LAPACK/lib
mv librefblas.a $LAPACK/lib/libblas.a
mv liblapack.a $LAPACK/lib/liblapack.a
  1. Configure the machine. Copy the following text to ~/.cime/config_machines.xml:
<?xml version="1.0"?>
<config_machines version="2.0">
 <machine MACH="ubuntu-jammy">
    <DESC>
      Example port to Ubuntu Jammy linux system with gcc, netcdf, pnetcdf and openmpi
    </DESC>
    <NODENAME_REGEX>ubuntu-jammy</NODENAME_REGEX>
    <OS>LINUX</OS>
    <COMPILERS>gnu</COMPILERS>
    <MPILIBS>openmpi</MPILIBS>
    <PROJECT>none</PROJECT>
    <SAVE_TIMING_DIR> </SAVE_TIMING_DIR>
    <CIME_OUTPUT_ROOT>$ENV{HOME}/cesm/scratch</CIME_OUTPUT_ROOT>
    <DIN_LOC_ROOT>$ENV{HOME}/cesm/inputdata</DIN_LOC_ROOT>
    <DIN_LOC_ROOT_CLMFORC>$ENV{HOME}/cesm/inputdata/lmwg</DIN_LOC_ROOT_CLMFORC>
    <DOUT_S_ROOT>$ENV{HOME}/cesm/archive/$CASE</DOUT_S_ROOT>
    <BASELINE_ROOT>$ENV{HOME}/cesm/cesm_baselines</BASELINE_ROOT>
    <CCSM_CPRNC>$ENV{HOME}/cesm/tools/cime/tools/cprnc/cprnc</CCSM_CPRNC>
    <GMAKE>make</GMAKE>
    <GMAKE_J>8</GMAKE_J>
    <BATCH_SYSTEM>none</BATCH_SYSTEM>
    <SUPPORTED_BY>[email protected]</SUPPORTED_BY>
    <MAX_TASKS_PER_NODE>8</MAX_TASKS_PER_NODE>
    <MAX_MPITASKS_PER_NODE>8</MAX_MPITASKS_PER_NODE>
    <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
    <mpirun mpilib="default">
      <executable>mpiexec</executable>
      <arguments>
        <arg name="ntasks"> -np {{ total_tasks }} </arg>
      </arguments>
    </mpirun>
    <module_system type="none" allow_error="true">
    </module_system>
    <environment_variables>
      <env name="NETCDF">$ENV{HOME}/CESM/lib/netcdf</env>
      <env name="PNETCDF">$ENV{HOME}/CESM/lib/pnetcdf</env>
      <env name="OMP_STACKSIZE">256M</env>
    </environment_variables>
    <resource_limits>
      <resource name="RLIMIT_STACK">-1</resource>
    </resource_limits>
  </machine>
</config_machines>

Validate the XML file

xmllint --noout --schema $HOME/CESM/my_cesm_sandbox/cime/config/xml_schemas/config_machines.xsd $HOME/.cime/config_machines.xml
  1. Configure the compilers. Put the following text in ~/.cime/config_compilers.xml:
<?xml version="1.0" encoding="UTF-8"?>
<config_compilers version="2.0">

  <compiler>
        <LDFLAGS>
                <append compile_threaded="true"> -fopenmp </append>
        </LDFLAGS>
        <FFLAGS>
                <append>   -fallow-argument-mismatch -fallow-invalid-boz</append>
        </FFLAGS>
        <SFC>gfortran</SFC>
        <SCC>gcc</SCC>
        <SCXX>g++</SCXX>
        <MPIFC>mpifort</MPIFC>
        <MPICC>mpicc</MPICC>
        <MPICXX>mpicxx</MPICXX>
        <CXX_LINKER>FORTRAN</CXX_LINKER>
        <NETCDF_PATH>$ENV{HOME}/CESM/lib/netcdf</NETCDF_PATH>
        <PNETCDF_PATH>$ENV{HOME}/CESM/lib/pnetcdf</PNETCDF_PATH>
        <SLIBS>
                <append>-L $ENV{HOME}/CESM/lib/netcdf/lib -lnetcdff -lnetcdf -lm</append>
                <append>-L $ENV{HOME}/CESM/lib/lapack/lib -llapack -lblas</append>
        </SLIBS>
</compiler>

</config_compilers>

Validate the xml file

xmllint --noout --schema $HOME/CESM/my_cesm_sandbox/cime/config/xml_schemas/config_compilers_v2.xsd $HOME/.cime/config_compilers.xml
  1. Create a new case, set it up, and build it.
cd $HOME/CESM/my_cesm_sandbox
cime/scripts/create_newcase --case mycase --compset X --res f19_g16
cd mycase
./case.setup
./case.build

Upvotes: 1
