Documentation for the F90 version of SEOM2D

Mohamed Iskandarani

Abstract

This is an update to the first version of the documentation. It contains new information on setting up and running the new version of the two-dimensional Spectral Element Ocean Model (SEOM2D).

1  Introduction to SEOM2D-F90

The F90 version of the spectral element ocean model differs somewhat in design from the F77 version. It is meant to consolidate the serial/parallel versions and the implicit/explicit versions. This has required some acrobatics with the Makefile, I/O, and F90 modules. At the same time I have tried to use the module features of F90 to hide and isolate some of the implementation details, and to eliminate all common statements. Global variables are now accessed through modules. I will try to explain the layout and the features of the new version below.

The parallel version of the code follows a MIMD approach and relies on the SPMD (single program, multiple data) programming model, although I rely on a host/node paradigm to handle the I/O. The message passing library is MPI. Most communication routines are shielded from the computational ones by wrapper subroutines to enhance the portability of the code. The most cumbersome task was devising I/O subroutines that work for both the parallel and serial versions while localizing the data in the proper modules.

The CPP flags, which are still located in file Sflags.h, control the different options/configurations of the model. Most files with a .h suffix have been eliminated since no common statements are used anymore. The only included files are Sflags.h for the C preprocessor, and mpif.h which defines some variables for MPI. This latter file is INCLUDE'd into message.F and shuffle.F only, and must be compilable with an F90 compiler. All array dimensions are declared in dimns.F to localize the array declarations for the serial and parallel versions.

The executable is still built with the make facility, which invokes the C preprocessor and the Fortran compiler. The Makefile has become a little more complex since the dependency list changes with the options turned on or off in the Sflags.h file. Also, computer vendors have different policies concerning modules, with some (like CRAY) embedding the information in the .o object files and others (like IBM) writing new .mod files. The Makefile needs to be changed accordingly.

For parallel code development it is advantageous to shift all the computations to an elemental level. This in turn calls for defining all quantities at an elemental level and minimizing the translation of global quantities to local ones. I have shifted the definition of the wind, depth, and Coriolis parameter to an elemental basis, duplicating the values on inter-element boundaries. The relationship between the global and elemental bases is:

 do elm = 1,nelem
   do j = 1,npts
     do i = 1,npts
       ij = iglob(i,j,elm)
       depth_element(i,j,elm) = depth_global(ij)
     enddo
   enddo
 enddo
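The complementary gather operation — assembling elemental contributions back into a global vector by summing over the duplicated inter-element nodes (direct stiffness summation) — is the other half of this translation. A minimal sketch in Python (illustrative only; the model itself is F90, the function names are mine, and the 0-based indexing is NumPy's):

```python
import numpy as np

def scatter_to_elements(field_global, iglob):
    """Copy a global nodal field onto the elemental basis, duplicating
    values on inter-element boundaries (as in the Fortran loop above)."""
    # iglob(i, j, elm) holds 0-based global node numbers here
    return field_global[iglob]

def gather_to_global(field_elem, iglob, nodeglob):
    """Assemble elemental contributions into a global vector by summing
    the values of all local nodes that map to the same global node."""
    field_global = np.zeros(nodeglob)
    np.add.at(field_global, iglob, field_elem)  # unbuffered summation
    return field_global
```

Scattering a field of ones and gathering it back yields, at each global node, the number of elements sharing that node — a quick sanity check of the connectivity.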

2  The Task List

Before going into the details of the file formats and input required by the SFE code, we give a summary of the tasks required to run the code successfully.
  1. Velocity grid
    Design the spectral element mesh of the velocity grid.
    The grid generation part can be time consuming depending on the complexity of the domain. For rectangular or annulus-type domains, a regular structured-type grid can be generated quite quickly using simple codes and inputs. For more complex domains we recommend CUBIT, a software package for unstructured grid generation using quadrilateral and hexahedral elements in 2D and 3D, respectively. Kate Hedstrom has written a helpful tutorial on how to build SEOM grids with CUBIT.
  2. Pressure grid
    The code also requires a pressure mesh. Fortunately, the latter can be deduced from the velocity grid with the help of a little program written by Gabe Vecchi. If boundary conditions are imposed on the surface elevation, the boundary information needed can be obtained from the same program used to obtain the velocity boundary conditions.
  3. CPP flags
    The file Sflags.h must be edited to set the proper cpp flags.
  4. Array dimension
    Most of the storage is allocated at run time with F90's allocate statement. The only array dimension that needs to be declared at compile time is npts, which is the order of the interpolation polynomial plus one. It is declared in file dimns.F and must match the value given in the velocity grid file. This is the only array dimension that must be matched between the code and the grid; it is declared statically to help optimizing compilers do their cost-benefit analysis.
  5. Initialize data
    A subroutine must be provided to initialize the variables, set the boundary conditions (Dirichlet and Neumann), and optionally compute the analytic solution and the errors incurred. Templates of such subroutines have already been written for a number of different problems and can be used as models for new ones.
  6. Edit Makefile
    The file Makefile has to be edited to add the user-supplied subroutine file to the list of dependencies of target shallow. The different flags to pass to the compiler must also be set at this stage. Users must tailor Makefile to their computing environment and platform.
  7. Domain Decomposition
    Decompose the domain if the code is to be run in parallel. There is a default decomposition that divides and allocates the elements to the processors according to their numbers; its performance on regular grids is acceptable. Fancier partitioning algorithms are needed if the domain is irregular. Several partitioning algorithms are available, each with its strengths and weaknesses. Enrique Cuchitser has written one that does a decent job; another, fancier tool is METIS. The partitioning program needs to write out a file that lists the elements belonging to each partition; this file is then read by the decomposition subroutine. The default partition is used if the partition file name is specified as default_partition.
  8. Wind data
    The wind must be obtained from some source and interpolated onto the velocity grid. The spatial interpolation is rather easy because information is moved from a structured grid to a set of specified points. It is the time interpolation of the wind that is more worrisome; the trouble is one of I/O and storage. If hourly or daily winds are used it is probably a bad idea to store the whole wind data set; a scheme must be devised to keep only the relevant time slices in memory. I have written a couple of subroutines that take care of daily, monthly and annual mean winds. These subroutines can serve as templates if new ones are needed.
  9. Lat-Lon of Grids and Coriolis Parameter
    The Coriolis parameter is now computed on-line from the grid coordinates. If SPHERICAL is turned on in Sflags.h, f = 2Ω sin θ, where Ω is the Earth's rotation rate and θ is the latitude; otherwise f = β(y - y0) + f0, provided BETAPLANE, FCENTER and YCENTER are defined in Sflags.h.
  10. Set Numerical Parameters
    The numerical parameters such as the time step, the number of time steps, the gravity coefficient, etc., are set in the parameter file.
  11. Build and run the code
    The executable file is produced with the make utility. The serial code does not use any library and its executable is called "shallow". The parallel code needs the MPI library, and its executable is called "shallowp". Most parallel platforms provide an mpif90 wrapper that takes care of including mpif.h and linking the library; the details vary widely among computer platforms, so refer to the site-specific documentation for the appropriate information.

    Changing the dependency list is a little tricky since it now depends on the specific options listed in Sflags.h. The dependency list must be constructed from the *.f files, which must be built first. The command "make littlef" runs CPP on all *.F files to produce their corresponding *.f files. The dependency list is then produced with the command "make depend" (it requires sfmakedepend, a Perl 5 script written by Kate Hedstrom; it is available from ahab.rutgers.edu in directory ~ftp/pub/perl). sfmakedepend automates the dependency listing but complicates the build process for make, primarily because it builds a 2-level dependency: each object file is first assigned a module dependency list, then each module is assigned its own dependency list. It is simpler to eliminate the module dependency lists and to make the object files depend on other object files. This change is simple to effect: edit the Makefile and first remove all the *.mod targets; second, change all the suffixes in the dependency lists of the *.o targets to ".o".

    Building the executable requires the following commands:

    1. serial
           make shallow
           shallow < in > out &
          
    2. parallel
           make shallowp
           mpirun -np 4 shallowp < in > out &
          

    The structure of the in data file is presented in section 3.
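To make item 9 of the task list concrete, here is how the on-line Coriolis computation might look. This is a Python sketch for illustration only — the model is F90, the constant OMEGA and the function names are my assumptions, and the spherical branch is written with the standard form f = 2Ω sin θ, θ being the latitude:

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)

def coriolis_spherical(lat_rad):
    """Coriolis parameter on the sphere, f = 2*Omega*sin(latitude),
    used when SPHERICAL is turned on in Sflags.h."""
    return 2.0 * OMEGA * math.sin(lat_rad)

def coriolis_betaplane(y, y0, f0, beta):
    """Beta-plane approximation, f = beta*(y - y0) + f0, selected by
    defining BETAPLANE, FCENTER and YCENTER in Sflags.h."""
    return beta * (y - y0) + f0
```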

3  The in file

I usually redirect standard input to a file when I run the code. This file, which I will call in, contains the names of the files needed. Their order is the following:
  1. Parallel partition file. This file is needed only if the code is run in parallel. It instructs the code how to distribute the elements among the processors. A default partition is available and is activated by setting the partition file name to default_partition. It assigns the first K/P elements to the first processor, the second K/P elements to processor 2, etc., where K is the total number of elements and P is the number of processors.
  2. Velocity grid file. Its format is elaborated on below.
  3. Pressure grid file.
  4. Parameter data file. It contains the integration parameters and some other constants. Its contents are detailed in the following section.
  5. Bathymetry file. It is needed if the value given in the data file is negative.
  6. Bottom drag file. It is needed if the value given in the data file is negative and VARIABLE_BOTTOM_DRAG is defined in Sflags.h.
  7. Viscosity file. It is needed if the value given in the data file is negative and VARIABLE_VISCOSITY is defined in Sflags.h.
  8. Filtering file. It is needed only if FILTER is defined in Sflags.h and n_filt > 1, i.e. if the order of the filter varies with the element number. The filtering file assigns the filter order to each element.
  9. Diffusion file. It is needed only if an advection-diffusion equation is being solved (TRACERS defined in Sflags.h), the diffusion coefficient in the data file is negative, and VARIABLE_DIFFUSION is defined in Sflags.h. This file allows the use of a variable diffusion coefficient for the advection-diffusion equation.
  10. Restart file. This file is needed to restart a computation. It contains the right-hand sides at two previous time levels; this information is needed to jump-start the Adams-Bashforth integration scheme. The file is written automatically every movieinterv steps and should be used to restart a simulation.

4  The parallel partition file

It is necessary to break up the grid into subdomains when the model runs on a parallel machine. The partition file assigns the elements to specific processors. MPI probes for the number of processors being used (size) and sets NPROC = size-1, since processor numbering starts from 0. The partition file should contain the following information:
  1. The number of partitions, i.e. NPROC+1
  2. for each partition the following information is read
    1. jproc, nelem_in_proc: the processor number and the number of elements on this processor.
    2. A list of the element numbers belonging to this processor; this list must contain nelem_in_proc numbers.

The read statements for the above information are:
      read(7,*) idum
      do iproc = 0,NPROC
        read(7,*) idum, nelem_in_proc(iproc)
        read(7,*) (ielem(elm,iproc), elm=1,nelem_in_proc(iproc))
      enddo
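A partitioning program must therefore emit a file matching these read statements. A small Python sketch (illustrative only; the function name is mine) that writes the default block partition — K/P contiguous elements per processor, as described for default_partition — in this format:

```python
def write_partition(filename, nelem, nproc):
    """Write a partition file in the format read above: the number of
    partitions on the first line, then for each processor its number and
    element count on one line, followed by the (1-based) list of its
    elements.  Elements are assigned in contiguous blocks, mirroring the
    default K/P partition."""
    base, extra = divmod(nelem, nproc)
    lines = [str(nproc)]
    start = 1
    for p in range(nproc):
        count = base + (1 if p < extra else 0)  # spread the remainder
        lines.append(f"{p} {count}")
        lines.append(" ".join(str(e) for e in range(start, start + count)))
        start += count
    with open(filename, "w") as f:
        f.write("\n".join(lines) + "\n")
```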

5  The data file

The format of the data file has been changed. The changes were mostly prompted by the need to accommodate the I/O in the serial and parallel versions. Here is the new format, broken down by lines:
  1. The first line contains 2 items:
    1. timestep: The step size used in the integration.
    2. ntimestep: the number of integration steps. The total integration time is ntimestep × timestep.

  2. The second line of the data file contains 2 items:
    1. itimestep_ratio: the ratio between the implicit time step and the time step of the starter explicit RK4 scheme. The implicit time step is larger than the explicit stability limit allows, and one may need to take several small time steps to compute enough right-hand sides for the Adams-Bashforth scheme. If the explicit scheme is used, one can safely set itimestep_ratio to 1 and take only 2 RK4 steps. If the simulation is started from a state of rest, there may be no need to use the RK4 scheme at all.
    2. ntimeRK4: the number of 4th-order Runge-Kutta integration steps. At least two Runge-Kutta steps are needed before one can switch to the 3rd-order Adams-Bashforth scheme.

  3. movieinterv: the frequency at which data is saved to disk. Snapshots of the variables are taken every movieinterv time steps.
  4. init_file: an integer flag. If it is zero, the initial u, v and z fields are set in subroutine uvh_initial; otherwise they are read from a file that must be provided.
  5. initial_time: sets the initial time at the start of the computations; this is useful for restarting a computation.
  6. gravity: the gravity coefficient in the units appropriate to the problem.
  7. drag: the friction coefficient for bottom stresses.
  8. constant_depth: the depth of a basin with uniform bathymetry. If the depth is negative, a bathymetry file must be provided. This switch used to be effected through CPP flags and has been moved to the data file; this avoids recompiling the code without impacting its performance. A similar policy is in place for the viscous terms and the drag coefficients, although VARIABLE_VISCOSITY and VARIABLE_BOTTOM_DRAG must be defined in Sflags.h. If the values provided in the data file are positive or zero, they are taken to be constant throughout the grid. If they are negative, the data is read from a separate file.
  9. viscosity: the viscous coefficient for Laplacian friction.
  10. tidal_frequency, tidal_amplitude, tidal_phase: input the tidal frequency ω, tidal amplitude a and tidal phase φ of the equilibrium tide:
    ζe = 0.134 a (1 - 3 sin²θ) cos(ωt + φ)
    where θ is the latitude. These values must be provided only when TIDAL_FORCING is defined in Sflags.h.
  11. filtvis: must be supplied if the cpp flag FILTER is defined in Sflags.h. It is a weighting of the filtered field: set it to unity for full filtering, to zero for none.
  12. n_filt: must be supplied if the cpp flag FILTER is defined in Sflags.h. It gives the number of filter orders in the simulation. Variable filter orders might be required if one region needs heavier filtering than others. Usually a single filter order is used for all elements; if more than one is used, a file is required to associate each element with a filter order.
  13. filterorder(i), i=1,n_filt: must be supplied if the cpp flag FILTER is defined in Sflags.h. Lists the filter orders.
  14. diffusion: the diffusion coefficient for tracer diffusion. This value must be provided only when TRACERS is defined in Sflags.h. A negative value causes a spatially dependent diffusion coefficient to be read from a file, provided VARIABLE_DIFFUSION is defined in Sflags.h.
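As an aside, the equilibrium tide of item 10 is straightforward to evaluate. A Python sketch for illustration (the symbols follow item 10 — amplitude a, frequency ω, phase φ — the 0.134 coefficient is taken from the text, and θ is assumed to be the latitude in radians):

```python
import math

def equilibrium_tide(t, lat_rad, amplitude, omega, phase):
    """Equilibrium tide zeta_e = 0.134 a (1 - 3 sin^2(theta)) cos(omega t + phase),
    active when TIDAL_FORCING is defined in Sflags.h."""
    return (0.134 * amplitude
            * (1.0 - 3.0 * math.sin(lat_rad) ** 2)
            * math.cos(omega * t + phase))
```

Note that the latitudinal factor vanishes where sin²θ = 1/3, i.e. near 35° latitude, and changes sign poleward of that.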

6  The velocity grid

  1. The first line of the velocity grid file should contain 9 parameters:

    1. nodeglob: The total number of velocity nodes in the spectral grid. It must match the value given in file dimns.F.
    2. nelem: the total number of elements in the spectral grid. It must match the value given in file dimns.F.
    3. npts: the number of nodes per element per direction in the velocity grid. It must match the value given in file dimns.F.
    4. ndiricu: the total number of u-Dirichlet nodes.
    5. neumanelu: the total number of elements that contain Neumann (flux) edges where the u-flux is specified.
    6. ndiricv: the total number of v-Dirichlet nodes.
    7. neumanelv: the total number of elements that contain Neumann (flux) edges where the v-flux is specified.
    8. ndiriczt: the total number of z-Dirichlet nodes.
    9. nodeglbp: The total number of pressure nodes in the spectral grid. It must match the value given in file dimns.F.

  2. The x-y coordinates of the nodal grid points on the velocity grid. There should be nodeglob pairs of x-y coordinates listed in order of increasing global node number. The read statement is the following:
    read( ,*) (xglob(k), yglob(k), k=1,nodeglob).

  3. The connectivity of the elements. This information tells the program which nodes are contained in each element. There should be node = npts**2 nodes per element. The matrix holding this information is called iglob; iglob(i,j,n) gives the global node number of node (i,j) (1 ≤ i,j ≤ npts) in element number n (1 ≤ n ≤ nelem). (i,j) can be interpreted as the local node number (relative to element number n) of a specific node. The entire iglob array can be listed sequentially; it is neater, however, to separate the information pertaining to each element. The format to read the connectivity data is
    read( ,*) ( ( (iglob(i,j,n), i=1,npts), j=1,npts), n=1,nelem ).
  4. List of nodes where u-Dirichlet boundary conditions are applied. There should be ndiricu node numbers, where ndiricu is as supplied in the first line of the file. The node numbers can be listed sequentially; the read statement is
    read( ,*) (idiricu(i), i=1,ndiricu).

  5. List of nodes with v-Dirichlet conditions. There should be ndiricv of them as indicated in the first line of the file.

  6. List of z-Dirichlet nodes. There should be ndiriczt of them as indicated in the first line.

  7. The information for the Neumann conditions on the u variable. First the elements with u-flux edges should be listed by their number, followed by the number of the edge. There should be neumanelu pairs as indicated in the first line of the file. The edge code is 1 for the top edge, 2 for the left edge, 3 for the bottom edge and 4 for the right edge.

  8. The Neumann information for the v-flux nodes and elements. The format is the same as for the u-Neumann nodes.
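The reads described above translate almost mechanically into other languages. A Python sketch of a reader for the header, coordinate, and connectivity records (illustrative only: whitespace-separated ASCII is assumed, and the function name is mine):

```python
import numpy as np

def read_velocity_grid_header_and_mesh(filename):
    """Read the first three records of the velocity grid file: the 9
    header integers, the nodeglob (x, y) pairs, and the connectivity
    array iglob (1-based global node numbers, i fastest, then j, then
    element n, as in the Fortran read statement)."""
    tokens = open(filename).read().split()
    it = iter(tokens)
    (nodeglob, nelem, npts, ndiricu, neumanelu,
     ndiricv, neumanelv, ndiriczt, nodeglbp) = (int(next(it)) for _ in range(9))
    xy = np.array([float(next(it)) for _ in range(2 * nodeglob)]).reshape(nodeglob, 2)
    xglob, yglob = xy[:, 0], xy[:, 1]
    # token order is (((i), j), n); reshape to [n, j, i], then expose [i, j, n]
    iglob = np.array([int(next(it)) for _ in range(npts * npts * nelem)]
                     ).reshape(nelem, npts, npts).transpose(2, 1, 0)
    return (nodeglob, nelem, npts), xglob, yglob, iglob
```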

7  The pressure grid

  1. The first line of the pressure grid file should contain:

    1. nodeglbp: The total number of pressure nodes in the spectral grid. It must match the value given in file dimns.F.
    2. nelem: the total number of elements in the spectral grid. It must match the value given in file dimns.F and in the velocity grid file.
    3. nptp: the number of nodes per element per direction in the pressure grid. It must match the value given in file dimns.F, i.e. nptp = npts-2.

  2. The x-y coordinates of the nodal grid points on the pressure grid. There should be nodeglbp pairs of x-y coordinates listed in order of increasing global node number. The read statement is the following:
    read( ,*) (xglbp(k), yglbp(k), k=1,nodeglbp).

  3. The connectivity of the elements. There should be nodep=nptp**2 nodes per element. The matrix holding this information is called iglbp. The format to read this data is
    read( ,*) ( (iglbp(i,n), i=1,nodep), n=1,nelem ).

8  Bathymetry File

The depth at each node of the velocity grid should be provided in elemental binary form, i.e. npts*npts*nelem values. The read-statement is
real(r8) depth(npts,npts,nelem)
read(unit) depth
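A corresponding reader is easy to sketch. The Python version below is illustrative only: it assumes the file is a plain stream of 8-byte reals with no Fortran record markers (a true unformatted Fortran file wraps each record in length markers, which would have to be skipped):

```python
import numpy as np

def read_bathymetry(filename, npts, nelem):
    """Read npts*npts*nelem 8-byte reals and return them as
    depth(npts, npts, nelem) in Fortran (column-major) order,
    matching real(r8) depth(npts,npts,nelem) in the model."""
    data = np.fromfile(filename, dtype=np.float64, count=npts * npts * nelem)
    return data.reshape((npts, npts, nelem), order="F")
```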

9  Wind-Stress File

If wind forcing is present, the wind stress at all velocity nodes must be given. Two files, listing the wind-stress components in the x and y directions respectively, should be provided. The wind stress must be normalized by the fluid density. The read statement for the x- and y-components is of the form
read(unit) windsx
The previous input is adequate for steady winds. If the winds vary in time, special subroutines must be devised to handle the I/O and the wind interpolation. Templates of such subroutines exist, although in F77 form, for mean-monthly wind forcing.
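The two-slice strategy for time-varying winds (keep only the bracketing time levels in memory and roll them forward as the simulation advances) reduces to a linear interpolation in time. A Python sketch with illustrative names:

```python
def interpolate_wind(t, t_lo, t_hi, wind_lo, wind_hi):
    """Linearly interpolate the wind stress in time between two stored
    slices at times t_lo <= t <= t_hi.  Only these two slices need to
    be kept in memory; when t passes t_hi the caller rolls them forward
    (wind_lo <- wind_hi, then reads the next slice into wind_hi)."""
    w = (t - t_lo) / (t_hi - t_lo)
    return [(1.0 - w) * lo + w * hi for lo, hi in zip(wind_lo, wind_hi)]
```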

10  Spherical Coordinates

No longitude-latitude files are needed anymore; the original grid files ought to be in lat-lon coordinates.

11  Subroutine File

There are slight changes to the initialization subroutine. No subroutine is needed to read in or set the Dirichlet conditions anymore. It is assumed that the Dirichlet conditions are provided as part of the initialization and do not change in time.

The subroutine file contains source code that is compiled and linked with the main part of program shallow. The subroutines in the file initialize u, v, and z, compute the fluxes at the Neumann boundaries and, optionally, compute the error between the numerical solution and an analytical solution when one is available. The subroutines should be supplied by the user and depend on the problem under investigation. There is a template file trest.F that initializes all fields to zero for a simulation starting from rest. The file trossby.F is another template that illustrates the Analytic subroutine.

To write these subroutines the user needs to know the storage scheme used in the code. The x-component of the velocity at node number k is stored in uglob(k) and the y-component in vglob(k), where 1 ≤ k ≤ nodeglob; the free-surface elevation is stored in ztglob(k), where 1 ≤ k ≤ nodeglbp.

The subroutines in the initialization file are the following:

  1. uvh_initial Sets the initial values of the dependent variables.

  2. Analytic Computes the error between the numerical solution and the analytical solution when one is available. The maximum pointwise errors and the rms errors are computed and written to stdout. This subroutine accepts one argument in its argument list: time which refers to the time level.

  3. gtneumanu Computes the fluxes at the u-Neumann nodes.

  4. gtneumanv Computes the fluxes at the v-Neumann nodes at a specific time level.

12  The preprocessor file Sflags.h

The virtues of the C preprocessor cpp have already been spelled out. The flags can be defined with the simple #define command of cpp. There are comments in Sflags.h that explain the different flags. Note that comments in cpp have the style of C comments: all text between a /* and a */ is ignored by cpp. Kate tells me that cpp comment styles differ from one platform to another but that the C style is pretty common. Here is a list of the different flags that shallow understands.

13  The array dimensions

The only array dimension now declared in file dimns.F is npts, the number of points per element in each direction. It must match the number provided in the velocity grid file. All the other array dimensions are read from the velocity grid file, and the arrays are allocated at run time. The parameters in dimns.F are:

14  Modules

All header files have been removed and their functions taken over by the more versatile F90 modules. Modules can hold private and public (global) data and subroutines. The public and private attributes can be manipulated to encapsulate the code and data structures. Subroutines internal to a module have access to the global variables declared in that module through host association. If a subroutine not belonging to the module must have access to the module's global data, then that subroutine must USE the module in question; the global data must be declared public to be accessible by external subroutines.

The modules and their functionality and private data will be listed below. In the following k refers to a global node number while i and j refer to local node numbers.

  1. grid.F: Holds the subroutines needed to read the velocity and pressure grids. In addition, it performs bounds-checking. The global data declared here are the connectivity of the velocity and pressure grids:
    1. iglob(i,j,n): Global node number of local node number (i,j) in element n in the velocity grid.
    2. iglbp(i,j,n): Global node number of local node number (i,j) in element n in the pressure grid.

    and the coordinates of the collocation points for the velocity and pressure:

    1. xglob(k): x-coordinate of velocity node k.
    2. yglob(k): y-coordinate of velocity node k.
    3. xglbp(k): x-coordinate of pressure node k.
    4. yglbp(k): y-coordinate of pressure node k.

  2. dirichlet.F Handles the Dirichlet nodes. Its global data are:
    1. ndiricu number of u-Dirichlet nodes in sub-domain.
    2. ndiricv number of v-Dirichlet nodes in sub-domain.
    3. ndiriczt number of z-Dirichlet nodes in sub-domain.
    4. idiricu array to hold the sub-domain global node number of the u-Dirichlet nodes.
    5. idiricv array to hold the sub-domain global node number of the v-Dirichlet nodes.
    6. idiriczt array to hold the sub-domain global node number of the z-Dirichlet nodes.

  3. ab3_rk4.F Handles the time-stepping schemes. It receives or reads in the initial data and integrates the equations in time using either the starter method (RK4) or the AB3 scheme. If IMPLICIT is defined, the weighted Crank-Nicolson scheme is used. Its only global public data is timestep. Its private data holds the variables u, v and z.
  4. decompose.F Module to decompose the global spectral element grid. It runs on the host processor. Its only global public data is copy_tids, which holds the task id numbers of all processes. This module is also responsible for re-assembling the global solution from the node processes.
  5. dimns.F Module that holds the array dimensions as global data.
  6. element.F: Holds most of the spectral element computational routines. Among other things, it computes the mass matrices, derivatives, diffusion operator, and mass conservation. Its private data includes the metric terms of the mapping, and the elemental mass matrices. Its public global data are:
    1. xGLv: The Gauss-Lobatto roots of the velocity grid.

    2. Gweightv: The Gauss-Lobatto weights of the velocity grid.

    3. xGLp: The Gauss-Lobatto roots of the pressure grid.

    4. Gweightp: The Gauss-Lobatto weights of the pressure grid.

    5. massinv: The inverse of the global velocity mass matrix multiplied by the time step.

    6. maspinv: The inverse of the global pressure mass matrix multiplied by the time step.

  7. filter.F This module implements the erfc-log filter of John P. Boyd, and does not hold any global public data.
  8. inflow.F This module enforces the inflow/outflow boundary condition on the continuity equations when mass is injected or removed from the side of the basin. It does not have global public data.
  9. kinds.F This module declares the kind parameter used for floating-point precision:
    integer, parameter :: r8 = selected_real_kind(15, 307)
  10. legendre.F Computes quantities associated with the Gauss-Lobatto-Legendre Cardinal functions. It does not hold any global data.
  11. msource.F This module adds the source terms necessary for the addition or removal of mass in the domain. It holds the public global data qv and qp of the source terms on the velocity and pressure grids.
    1. qv source term on the velocity grid.
    2. qp source term on the pressure grid.

  12. neumann.F Handles Neumann boundary conditions on u and v.
  13. paramet.F: contains the physical constants of the problem:
    1. viscosity: viscosity in appropriate units.
    2. gravity: gravitational acceleration.
    3. Doverd: ratio of depth length scale to surface elevation length scale.
    4. friction_coeff: friction coefficient of bottom drag.

  14. period.F Has the data and subroutines to handle periodic boundary conditions. In parallel mode, the duplicate nodes are eliminated and the elements are made to be adjacent. In serial mode, the explicit assembly of the periodic nodes has to be performed. The present version of the code accepts periodicity in the x direction only. It has the public global data:
    1. nperiodv: number of periodic nodes in the velocity grid.
    2. nperiodp: number of periodic nodes in the pressure grid.
    3. iperiodv(i): is the global node number of the ith periodic node on the velocity grid. iperiodv(i) and iperiodv(i+1) duplicate the same node in space. Hence, nperiodv is always even.
    4. iperiodp(i): is the global node number of the ith periodic node on the pressure grid. iperiodp(i) and iperiodp(i+1) duplicate the same node in space. Hence, nperiodp is always even.

  15. rhs.F Computes the right hand sides of the equations prior to the integration step. If the code is run in implicit mode it also calculates the left hand side of the equations for the iterative solver. It does not have any global public data and holds the Coriolis parameter, the depth, gravity, and drag as its private data.
  16. shuffle.F Handles the domain decomposition and the communication of the node processes. It does not hold public global data.
  17. vpgrid.F: Holds all the subroutines needed to manipulate the data between the pressure and velocity grids. All its global data are private.
  18. winds.F
    1. windsx(i,j,n): x-component of wind-stress on node (i,j) of velocity grid in element n.
    2. windsy(i,j,n): y-component of wind-stress on node (i,j) of velocity grid in element n.


File translated from TEX by TTH, version 2.72.
On 27 Aug 2002, 10:13.