Topography tool chain - description and upgrades for high-res
Note: see @Walter Hannah 's original page on a remap based topo generation tool chain: https://e3sm.atlassian.net/wiki/spaces/IPD/pages/4449009672
Motivation
Upgrade workflow and datasets to support model resolutions finer than 3km
Upgrade serial codes to parallel versions that can support sub-km resolutions.
Separate out remap steps from SGH algorithms, to allow for more general SGH formulas
Summary of Recommendations
Upgrade PHI algorithms to compute topography directly from USGS datasets.
Upgrade VAR30 and VAR algorithms to use scalable cell_avg algorithm from MBTR
Implement all calculations in terms of algebraic expressions of remapped quantities.
Algorithms
The derivation of the algorithms used below for computing SGH and SGH30 for general grids with resolutions that can be finder then 3.4 km is written up in this tech report: Topo Tool Chain (overleaf document)
Datasets
Default dataset in current tool chain
30s (1km DEM mesh): 43.2K x 21.6k. file: 7GB
Provides TERR_USGS
GMTED2010_7.5_stitcH_s5P_OPER_REF_DEM_15_NCL_24-3.r172800x86400.nc
New ultra-high res data set created by @Jishi Zhang , see https://e3sm.atlassian.net/wiki/spaces/DOC/pages/4189520033
7.5s (250m DEM mesh). 86.4K x 172.8K = 13.9GB (points) = 111GB memory after converting to real*8
Provides higher resolution version of TERR_USGS
USGS-topo-cube3000.nc
This file contains data (TERR_3km, SGH30_3km) on the 3km “ne3000pg1” grid used internally by the topo generation process.
Current tools
homme_tool: Used to apply dycore specific smoothing for the np4 topography, and the derived pg2 topography.
MBDA: MOAB disk averager. A superfast and simple top-hat convolution operator. By default it will remap by, for each target grid cell, computing the average over all source grids points within a disk centered on the target grid cell with area equal to the target grid cell area. This results in a monotone cell-averaged type algorithm with good downsampling properties. This algorithm can map between any two grids, it only requires the coordinates of the grid centers.
Older tools used in E3SMv3
Mostly built using NCAR CESM tools described in: P.H. Lauritzen, J.T. Bacmeister, P.F. Callaghan, M. Taylor, NCAR_Topo (v1.0): NCAR global model topography generation software for unstructured grids, Geosci. Model Dev., 8, 3975-3986, 2015.
bin_to_cube.F90: used to compute TERR_3km and SGH_3km, using cell_avg(). bin_to_cube.F90 is limited to lat/lon source grids and cubed sphere target grids. The cell_avg() algorithm is simple and fast, but this serial utility currently requires over 512GB of memory to run (see https://e3sm.atlassian.net/wiki/spaces/DOC/pages/4189520033 ) and is estimated to take weeks to run.
cube_to_target.F90: Used to compute TERR_t, SGH and SGH30. cube_to_target.F90 is limited to cubed sphere source grids and arbitrary target grids. It uses an aave algorithm, making it extremely slow for high resolution grids.
Output
The final output of the atmospheric topography tool chain is a topo file with the following variables:
PHIS_d = g*H_d_np4 (on the np4 spectral element GLL grid)
PHIS = PHIS_d mapped to pg2 (E3SM finite volume phys grid) via dycore specific algorithm (homme_tool)
SGH30 = standard deviation between source topography and the 3.4 km topography
SGH = standard deviation between the target grid topography and the 3.4 km topography
Some new data for the new gravity wave drag schemes on pg2 grid (need more information, cube_to_target)
Algorithms
Intermediate Quantities to compute
H # Height filed on source grid
# notation:
1st letter represents the source data
2nd letter represents the grid it has been mapped to
_d_ denotes dycore smoothing has been applied
SQ suffice denotes source data was squared before mapping
# source -> target grid
H_np4 = MBDA(H) # Height field mapped to np4 target grid
H_pg2 = MBDA(H) # only needed for SGH30
HSQ_pg2 = MBDA(H^2) # only needed for SGH30
# source->cube3000 grid:
H_3 = MBDA(H)
HSQ_3 = MBDA(H^2) # only needed for SGH30
# cube3000->Target grid (only needed for SGH)
H3_np4 = MBDA(H_3)
H3_pg2 = MBDA(H_3) # needed for SGH
H3SQ_pg2 = MBDA(H_3^2) # needed for SGH
# dycore smoothing on target grid (also computes consistent np4->pg2 versions)
H_d_np4 = dycore_smoothing(H_np4) # needed for PHI_d
H_d_pg2 = dycore_smoothing(H_np4) # needed for SGH30, PHI
H3_d_pg2 = dycore_smoothing( H3_np4) ) # needed for SGHSimplified algorithm if target grid resolution << 3.3km
Skip all the source->target maps, dont compute VAR2, and replace H_np4 with H3_np4. SGH formula unchanged. SGH30=SQRT(VAR1)
Height field
We use the MBDA algorithm to map the height field directly to the target grid, then apply dycore specific smoothing. The dycore smoothing step will compute both H_d_np4 and H_d_pg2.
PHI_d = g*H_d_np4
PHI = g*H_d_pg2
SGH algorithm
SGH is computed from the variance between H_d_pg2 and H_3 within each target grid cell for scales up to 3.4 km. It should smoothly decay to zero as the resolution is
increased to 3.4 km and remain zero in regions where the target grid resolution is finer than 3.4 km. To ensure that we do not include contributions to SGH from components of H_d_pg2 which contain scales finer than H3, we compute SGH from a modified version of the model topography, H3_d_pg2 where the 3.4 km scales and finer are removed. This approach is the same as what’s done in E3SMv3, where H_d_pg2 is computed from the topography on the 3.4 km cube3000 grid so that it can never contain scales finder than 3.4 km.
VAR = MPDA((H_3-H3_d_pg2)^2) = H3SQ_pg2 + H3_d_pg2^2 - 2 * H3_d_pg2 * H3_pg2
SGH = sqrt(VAR)SGH30 algorithm
SGH30 represents the standard deviation between H and H_3 within each target grid cell. For regions of the target grid that are coarser than 3.4 km, this variance is given by
VAR1_3000 = MBDA( (HS-H3)^2) = HSQ_3 - H3^2 # compute variance on cube3000 grid.
VAR1 = MBDA(VAR1_3000) # map to physics grid Similar to the SGH calculation above, we need to modify this formula in regions where the target grid has resolutions finer than 3.4 km. If the target grid contains regions with resolution as fine or finer than the source grid, than the variance should be should be near zero in this regions, since the source topography will be well represented by H_d_pg2. For regions with resolution in the transition region between the source grid resolutions and 3.4 km, we assume SGH30 should transition smoothly between zero and the VAR1 value. To achieve this, we compute a second variance, VAR2, and then compute SGH30 from the minimum of VAR1 and VAR2:
VAR2 = MBDA( H^2 - H_d_pg2^2) = HSQ_pg2 + H_d_pg2^2 - 2 H_d_pg2 H_pg2 # Mapping from source grid to physics grid
VAR = min(VAR1,VAR2)
SGH30 = sqrt(VAR)
To see that this formula gives the desired results, consider three different regimes: (1) In a region where the target grid has resolution coarser than 3.4 km, VAR2 > VAR1 (since H_d is much smoother than H_3) and thus SGH30 will be VAR1. (2) In regions where the target grid has resolution equal to or finer than the source grid, H_d_pg2 ∼ H and VAR2 ∼ 0 so that SGH30 will be zero as desired. (3) In transition regions between (1) and (2), H_3 will be smoother than H_d so that 0 < VAR2 < VAR1, and thus SGH30 will smoothly transition from VAR1 down to zero.