Regridding E3SM Data with ncremap


This page is devoted to instruction in NCO’s regridding operator, ncremap. It describes steps necessary to create grids, and to regrid datasets between different grids with ncremap. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, and other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary for scientific insight, especially for the focused and variable-resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve this problem simply, including the minimal software installation required. Once the basic regridding vocabulary has been introduced, we solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats. The overall organization, including TBD sections (suggest others, or vote to prioritize them, below), is outlined in the section headings that follow.

Software Requirements

At a minimum, install a recent version of NCO on your executable $PATH with the corresponding library on your $LD_LIBRARY_PATH. NCO installation instructions are here. Unless you have reason to do otherwise, we recommend installing NCO through the Conda package (conda install -c conda-forge nco) or via activating the E3SM-Unified environment. The Conda NCO package automagically installs two other important regridding tools, the ESMF_RegridWeightGen (aka ERWG) executable and the TempestRemap (aka TR) executables. Execute ncremap --config to verify you have a working installation:

zender@aerosol:~$ ncremap --config
ncremap, the NCO regridder and grid, map, and weight-generator, version 4.9.3-alpha02 "Fuji Rolls"
...[Legal Stuff]...
Config: ncremap script located in directory /Users/zender/bin
Config: NCO binaries located in directory /Users/zender/bin, linked to netCDF library version 4.7.3
Config: No hardcoded machine-dependent path/module overrides. (If desired, turn-on NCO hardcoded paths at supported national labs with "export NCO_PATH_OVERRIDE=Yes").
Config: External (non-NCO) program availability:
Config: ESMF weight-generation command ESMF_RegridWeightGen version 7.1.0r found as /opt/local/bin/ESMF_RegridWeightGen
Config: MOAB-Tempest weight-generation command mbtempest not found
Config: MPAS depth coordinate addition command add_depth.py found as /Users/zender/bin/add_depth.py
Config: TempestRemap weight-generation command GenerateOfflineMap found as /usr/local/bin/GenerateOfflineMap

Only NCO is required for many operations including applying existing regridding weights (aka, regridding) and/or generating grids, maps, or conservative weights with the NCO algorithms. Generating new weights (and map-files) with ERWG or TR requires that you install those packages (both of which come with the NCO Conda package). MOAB-Tempest (MBTR) is only required to generate TR weights on the largest meshes. It is also available as a Conda package, and comes with the E3SM-Unified environment. Make sure ncremap reports a working configuration like the one above before proceeding further.

Prototypical Horizontal Regridding I: Use Existing Map-file

The horizontal regridding problem most commonly faced is converting output from a standard-resolution model simulation to equivalent data on a different horizontal grid for visualization or intercomparison with other data. The EAM v1 model low-resolution simulations are performed and output on the ne30np4 SE (spectral element) grid, aka the "source grid". The corresponding EAM v2 simulations were conducted on the ne30pg2 FV (finite volume) grid. E3SM source grids like these are called “unstructured” because they have only one horizontal dimension (i.e., 1D), which makes them difficult to visualize. The recommended 2D (latitude-longitude) grid for analysis (aka the "destination grid") of v1 simulations was the 129x256 Cap grid (the gridcells at the poles look like yarmulke caps); for v2 and later it is the 180x360 equi-angular grid used by CMIP. The single most important capability of a regridder is the intelligent application of weights that transform data on the input grid to the desired output grid. These weights are stored in a "map-file", a single file that contains all the necessary weights and grid information. While most regridding problems revolve around creating the appropriate map-file, this prototype problem assumes that has already been done, so the appropriate map-file (map.nc) already exists and ncremap can immediately transform the input dataset (dat_src.nc) to the output (regridded) dataset (dat_rgr.nc):

ncremap -m map.nc dat_src.nc dat_rgr.nc
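Under the hood, a map-file stores the weights as a sparse matrix, conventionally as triplets (row, col, S): destination-cell index, source-cell index, weight. Applying the weights is essentially a sparse matrix-vector multiply. A minimal sketch of the idea in Python (our own toy code with 0-based indices, not a map-file reader):

```python
# Toy sketch of what "applying a map-file" does: each sparse weight
# triplet (row, col, wgt) adds wgt * src[col] into dst[row]. Real
# map-files store these as netCDF variables (row, col, S) with 1-based
# indices; this illustration uses 0-based Python lists.
def apply_map(n_dst, triplets, src):
    dst = [0.0] * n_dst
    for row, col, wgt in triplets:
        dst[row] += wgt * src[col]
    return dst

# Two equal-area source cells merged into one destination cell:
# the conservative result is their area-weighted mean
print(apply_map(1, [(0, 0, 0.5), (0, 1, 0.5)], [10.0, 20.0]))  # -> [15.0]
```

This is why applying weights is cheap compared to generating them: regridding is a single pass over the sparse triplets.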

This solution is deceptively simple because it conceals the choices, paths, and options required to create the appropriate map.nc for all situations. We will discuss creating map.nc later after showing more powerful and parallel ways to solve the prototype problem. The solution above works for users savvy enough to know how to find appropriate pre-built map-files. Map-files used by the E3SM model are available at https://web.lcrc.anl.gov/public/e3sm/inputdata/cpl/gridmaps/ . Additional map-files useful in post-processing are available at https://web.lcrc.anl.gov/public/e3sm/diagnostics/maps/. Many commonly used maps and grids can also be found in my (@czender's) directories as ~zender/data/[maps,grids] at most DOE High Performance Computing (HPC) centers. Take a minute now to look at these locations.

Pre-built map-files use the (nearly) standardized naming convention map_srcgrd_to_dstgrd_algtyp.YYYYMMDD.nc, where srcgrd and dstgrd are the source and destination grid names, algtyp is a shorthand for the numerical regridding algorithm, and YYYYMMDD is the date the map was created. The source grid in the example above is called ne30np4; the destination is called fv129x256. A pre-built map for the v1 combination is map.nc = map_ne30np4_to_fv129x256_aave.20150901.nc, and for v2 it is map_ne30pg2_to_cmip6_180x360_aave.20200201.nc. What is aave? Weight generators can use about a dozen interpolation algorithms for regridding, and each has a shorthand name. For now, it is enough to know that the two most common algorithms are (first-order) conservative area-average regridding (aave) and bilinear interpolation (bilin or blin). Hence the following commands conservatively regrid dat_src.nc to dat_rgr.nc with first-order accuracy:

ncremap -m map_ne30np4_to_fv129x256_aave.20150901.nc dat_src.nc dat_rgr.nc # EAMv1
ncremap -m map_ne30pg2_to_cmip6_180x360_aave.20200201.nc dat_src.nc dat_rgr.nc # EAMv2
ncremap -m map_ne30pg2_to_cmip6_180x360_traave.20231201.nc dat_src.nc dat_rgr.nc # EAMv3
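The naming convention is regular enough to unpack mechanically, which is handy when sifting through large map-file directories. A small sketch (the regular expression is our own illustration, not part of NCO):

```python
import re

# Split a map-file name of the form map_srcgrd_to_dstgrd_algtyp.YYYYMMDD.nc
# into its components (illustrative only; NCO does not require map-files
# to follow this convention)
def parse_map_name(fname):
    m = re.match(r"map_(.+)_to_(.+)_([^_.]+)\.(\d{8})\.nc$", fname)
    if m is None:
        return None
    src_grd, dst_grd, alg_typ, date = m.groups()
    return {"src": src_grd, "dst": dst_grd, "alg": alg_typ, "date": date}

print(parse_map_name("map_ne30pg2_to_cmip6_180x360_aave.20200201.nc"))
# -> {'src': 'ne30pg2', 'dst': 'cmip6_180x360', 'alg': 'aave', 'date': '20200201'}
```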

Before looking into map-file generation in the next section, try a few ncremap features. For speed's sake, regrid only selected variables:

ncremap -v FSNT,AODVIS -m map.nc dat_src.nc dat_rgr.nc

To regrid multiple files with a single command, supply ncremap with the source and regridded directory names (drc_src, drc_rgr). It regrids every file in drc_src and places the output in drc_rgr:

ncremap -m map.nc -I drc_src -O drc_rgr

Or supply specific input filenames on the command-line, piped through standard input, or redirected from a file:

ncremap -m map.nc -O drc_rgr mdl*2005*nc
ls mdl*2005*nc | ncremap -m map.nc -O drc_rgr
ncremap -m map.nc -O drc_rgr < file_list.txt

When an output directory is not specified, ncremap writes to the current working directory. When the output and input directories are the same, ncremap appends a string (based on the destination grid resolution) to each input filename to avoid name collisions. Finally, be aware that multiple-file invocations of ncremap execute in parallel by default. Power users will want to tune this as described in the section on "Intermediate Regridding".

Prototypical Horizontal Regridding II: Create Map-file from Known Grid-files

The simplest regridding procedure applies an existing map-file to your data, as in the above example (public servers with pre-built map-files are also linked above). At most DOE High Performance Computing (HPC) centers these and others can also be found in my (@czender's) directory, ~zender/data/maps. If the desired map-file cannot be found, then you must create it. Creating a map-file requires a complete specification of both source and destination grids (meshes). The files that contain these grid specifications are called "grid-files". Many E3SM grid-files are publicly available within model-specific directories of the previous location, e.g., https://web.lcrc.anl.gov/public/e3sm/inputdata/ocn/mpas-o/oEC60to30v3/ . Many grids useful for post-processing are publicly served from https://web.lcrc.anl.gov/public/e3sm/diagnostics/grids/. At most DOE HPC centers these can also be found in my directory, ~zender/data/grids. Take a minute now to look there for the prototype problem grid-files, e.g., the FV 129x256, cmip6_180x360, and ne30pg2 grid-files.

You might find multiple grid-files that contain the string 129x256. Grid-file names are often ambiguous. The grid-file global metadata (ncks -M grid.nc) often displays the source of the grid. These metadata, and sometimes the actual data (fxm: link), are usually more complete and/or accurate in files with a YYYYMMDD-format date-stamp. For example, the metadata in file 129x256_SCRIP.20150901.nc clearly state it is an FV grid and not some other type of grid with 129x256 resolution. The metadata in 129x256_SCRIP.130510 tell the user nothing about the grid boundaries, and some of the data are flawed. When grids seem identical except for their date-stamp, use the grid with the later date-stamp. The curious can examine a grid-file (ncks -M -m grid.nc) and easily see it looks completely different from a typical model or observational data file. Grid-files and data-files are not interchangeable.

Multiple grid-files also contain the string ne30. These are either slightly different grids, or the same grids stored in different formats meant for different post-processing tools. The different spectral element (SE) and Finite Volume (FV) grid types are described, with figures, here (https://acme-climate.atlassian.net/wiki/spaces/Docs/pages/34113147/Atmosphere+Grids). As explained there, for E3SMv1 data many people will want the "dual-grid" with pentagons. The correct grid-file for this is ne30np4_pentagons.091226.nc. Do not be tempted by SE grid-files named with latlon. Datasets from E3SM v2 and v3 simulations are all on FV grids. EAM v2 and v3 grids are named in the format neXXXpg2. ELM and MPAS names take a wider variety of forms, many of which appear below.

All grid-files discussed so far are in SCRIP-format, named for the Spherical Coordinate Remapping and Interpolation Package (authored by @pjones). Other formats exist and are increasingly important, especially for SE grids. For now just know that these other formats are also usually stored as netCDF, and that some tools allow non-SCRIP formats to be used interchangeably with SCRIP.

Once armed with source and destination grid-files, one can generate their map-file with

ncremap -s grd_src.nc -g grd_dst.nc -m map.nc

Regrid a datafile at the same time as generating the map-file for archival:

ncremap -s grd_src.nc -g grd_dst.nc -m map.nc dat_src.nc dat_rgr.nc

Regrid a datafile without archiving the (internally generated) map-file:

ncremap -s grd_src.nc -g grd_dst.nc dat_src.nc dat_rgr.nc

For the prototype problem, the map-file generation procedure becomes

ncremap -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_nco.YYYYMMDD.nc # EAMv1
ncremap -s ne30pg2.nc -g cmip6_180x360_scrip.20181001.nc -m map_ne30pg2_to_cmip6_180x360_nco.YYYYMMDD.nc # EAMv2

The map-files above are named with alg_typ=nco because the ncremap default interpolation algorithm is the first-order conservative NCO algorithm (NB: before NCO 4.9.1 the default algorithm was ESMF bilin). To re-create the aave map in the first example, invoke ncremap with -a esmfaave (the newest v3 naming convention) or -a conserve (same algorithm, different name in v1, v2):

ncremap -a conserve -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_aave.YYYYMMDD.nc # EAMv1
ncremap -a conserve -s ne30pg2.nc -g cmip6_180x360_scrip.20181001.nc -m map_ne30pg2_to_cmip6_180x360_aave.YYYYMMDD.nc # EAMv2
ncremap -a esmfaave -s ne30pg2.nc -g cmip6_180x360_scrip.20181001.nc -m map_ne30pg2_to_cmip6_180x360_esmfaave.YYYYMMDD.nc # EAMv3

This takes a few minutes, so save custom-generated map-files for future use. Computing weights to generate map-files is much more computationally expensive and time-consuming than regridding, i.e., than applying the weights in the map-file to the data. We will gloss over most options that weight-generators can take into consideration, because their default values often work well. One option worth knowing now is -a. The invocation synonyms for -a are --alg_typ, --algorithm, and --regrid_algorithm. These are listed in the square brackets in the self-help message that ncremap prints when it is invoked without argument, or with --help:

-a alg_typ Algorithm for weight generation (default ncoaave) [alg_typ, algorithm, regrid_algorithm]
ESMF algorithms: esmfbilin,bilinear|esmfaave,aave,conserve|conserve2nd|nearestdtos|neareststod|patch
NCO algorithms: ncoaave,nco,nco_con|ncoidw,nco_dwe (inverse-distance-weighted interpolation/extrapolation)
Tempest (and MOAB-Tempest) algorithms: traave,fv2fv_flx|trbilin|trfv2|trintbilin|tempest|fv2fv|fv2fv_stt

At least one valid option argument for each supported interpolation type is shown separated by vertical bars. The arguments shown have multiple synonyms, separated by commas, that are equivalent. For example, -a esmfaave is equivalent to -a aave and --alg_typ=conserve. Use the longer option form for clarity and precision, and the shorter form for conciseness. The full list of synonyms, and the complete documentation, is at http://nco.sf.net/nco.html#alg_typ. The NCO algorithm ncoaave is the default (because it is always available). Commonly-used algorithms that invoke ERWG are esmfbilin and esmfaave. TR options are discussed below. As of E3SM v3, TR algorithms are preferred for all mappings because they are the most accurate. Peruse the list of options now, though defer a thorough investigation until you reach the "Intermediate Regridding" section.

Prototypical Horizontal Regridding III: Infer Grid-file from Data-file

Thus far we have explained how to apply a map-file to data, and how, if necessary, to generate a map-file from known grids. What if there is no map-file and the source or the destination grid-files (or both) are unavailable? Often, one knows the basic information about a grid (e.g., resolution) but lacks the grid-file that contains the complete information for that grid geometry. In such cases, one must create the grid-file via one of two methods. First, one can let ncremap attempt to infer the grid-file from a data file known to be on the desired grid. This procedure is called "inferral" and is fairly painless. Second, one can feed ncremap all the required parameters (for rectangular grids only) and it will generate a grid-file. This requires a precise specification of the grid geometry, and will be covered in the sub-section on "Manual Grid-file Generation".

Before we describe what the inferral procedure does, here is an example that demonstrates how easy it is. You can regrid an SE (or FV) dataset from our prototype example to the same grid as an evaluation dataset. Pick any 2D (e.g., MxN latitude-by-longitude) dataset to compare the E3SM simulations to. Inferral uses the grid information in the evaluation dataset, which is already on the desired destination grid, to create the (internally generated) destination grid-file. Supply ncremap with any dataset on the desired destination grid (dat_dst.nc) with -d (for "dataset") instead of -g (for "grid"):

ncremap -a bilin -s ne30np4_pentagons.091226.nc -d dat_dst.nc -m map_ne30np4_to_MxN_bilin.YYYYMMDD.nc # EAMv1
ncremap -a bilin -s ne30pg2.nc -d dat_dst.nc -m map_ne30pg2_to_MxN_bilin.YYYYMMDD.nc # EAMv2

This tells ncremap to infer the destination grid-file and to use it to generate the desired map-file, named with the supposed destination resolution (here, MxN degrees or gridcells). To archive the inferred destination grid for later use, supply a name for the grid with -g:

ncremap -s ne30np4_pentagons.091226.nc -d dat_dst.nc -g grd_dst.nc -m map_ne30np4_to_1x1_nco.YYYYMMDD.nc # Requires NCO >= 4.7.6
ncremap -s ne30pg2.nc -d dat_dst.nc -g grd_dst.nc -m map_ne30pg2_to_MxN_nco.YYYYMMDD.nc # EAMv2

One can infer a grid without having to regrid anything. Supply ncremap with a data-template file (dat.nc) and a grid-file name (grd.nc). Since there are no input files to regrid, ncremap exits after inferring the grid-file:

ncremap -d dat.nc -g grd.nc

Grid-inferral is easier to try than manual grid-generation, and will work if the data file contains the necessary information. The only data needed to construct a SCRIP grid-file are the vertices of each gridcell. The gridcell vertices define the gridcell edges, and these in turn define the gridcell area, which is equivalent to the all-important weight parameter necessary to regrid data. Of course, the gridcell vertices must be stored with recognizable names and/or metadata indicators. The Climate & Forecast (CF) metadata convention calls the gridcell vertices the "cell-bounds". Coordinates (like latitude and longitude) usually store cell-center values, and should, according to CF, have bounds attributes whose values point to variables (e.g., lat_bounds or lon_vertices) that contain the actual vertices. Relatively few datasets "in the wild" contain gridcell vertices, though the practice is, happily, on the rise. Formally, SE models have nodal points with weights, but no fixed perimeter or vertices are assigned to the nodes, so the lack of vertices in SE model output is a feature, not an oversight. The dual-grid (referenced above) addresses this by defining "pretend" gridcell vertices for each nodal point so that an SE dataset can be treated like an FV dataset. However, dual-grids are difficult to generate, and may not exist or be accurate for many SE grids. In that case, ncremap cannot infer the grid (because the vertices are unknown) and one needs to use a different package (such as TempestRemap, below) to construct the grid-file and the mapping weights.

Inferral works well on important categories of grids for which ncremap can guess the missing grid information. In the absence of gridcell vertex information, ncremap examines the location of and spacing between gridcell centers and can often determine from these what type of grid a data-file (not a grid-file!) is stored on. A data-file simply means the filetype preferred for creation/distribution of modeled/observed data. Hence ncremap has the (original and unique, so far as we know) ability to infer all useful rectangular grid-types from data-files that employ the grid. The key constraint here is "rectangular", meaning the grid must be orthogonal (though not necessarily regularly spaced) in latitude and longitude. This includes all uniform angle grids, FV grids, and Gaussian grids. For curvilinear grids (including most swath data), ncremap infers the underlying grid to be the set of curves that bisect the curves created by joining the gridcell centers. This often works well for curvilinear grids that do not cross a pole. Inferral works for unstructured (i.e., 1D) grids only when the cell-bounds are stored in the datafile as described above. Hence inferral will not work on raw output from SE models.
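For a single rectangular axis, the core of the inferral idea fits in a few lines: interior cell interfaces bisect adjacent cell centers, and the two end interfaces are extrapolated symmetrically. A simplified sketch of that idea (our own code, not NCO's actual implementation):

```python
# Infer 1-D cell interfaces (bounds) from cell-center coordinates when no
# explicit bounds variable exists: interior edges bisect adjacent centers,
# end edges mirror the distance from the end center to its interior edge
def infer_bounds(centers):
    n = len(centers)
    edges = [0.0] * (n + 1)
    for i in range(1, n):
        edges[i] = 0.5 * (centers[i - 1] + centers[i])
    edges[0] = centers[0] - (edges[1] - centers[0])
    edges[n] = centers[-1] + (centers[-1] - edges[n - 1])
    return edges

# A uniform 1-degree axis centered at 0.5, 1.5, 2.5 recovers integer edges
print(infer_bounds([0.5, 1.5, 2.5]))  # -> [0.0, 1.0, 2.0, 3.0]
```

Note that this works for non-uniform (e.g., Gaussian) spacing too, which is why inferral is not limited to equiangular grids.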

A few more examples will flesh-out how inferral can be used. First, ncremap can infer both source and destination grids in one command:

ncremap -d dat_dst.nc dat_src.nc dat_rgr.nc

Here the user provides only data files (no grid- or map-files) yet still obtains regridded output! The first positional (i.e., not immediately preceded by an option indicator) argument (dat_src.nc) is interpreted as the data to regrid, and the second positional argument (dat_rgr.nc) is the output name. The -d argument is the name of any dataset (dat_dst.nc) on the desired destination grid. ncremap infers the source grid from dat_src.nc, then infers the destination grid from dat_dst.nc, then generates weights (with the default algorithm since none is specified) and creates a map-file that it uses to regrid dat_src.nc to dat_rgr.nc. No grid-file or map-file names were specified (with -g or -m) so both grid-files and the map-file are generated internally in temporary locations and erased after use.

Second, this procedure, like most ncremap features, works for multiple input files:

ncremap -d dat_dst.nc -I drc_in -O drc_rgr
ncremap -d dat_dst.nc -I drc_in

Unless a map-file or source grid-file is explicitly provided (with -m or -s, respectively), ncremap infers a separate source grid-file (and computes a corresponding map-file) for each input file. This allows it to regrid lists of uniquely gridded data (e.g., satellite swaths, each on its own grid) to a common destination grid. When all source files are on the same grid (as is typical with models), turn off the expensive multiple inferral and map-generation procedures with the -M switch to save time:

ncremap -M -d dat_dst.nc -I drc_in -O drc_rgr
ncremap --mlt_map --dst_fl=dat_dst.nc --drc_in=drc_in --drc_rgr=drc_rgr # Long-options for clarity

Prototypical Horizontal Regridding IV: Manual Grid-file Generation

If a desired grid-file is unavailable, and no dataset on that grid is available (so inferral cannot be used), then one must manually create a new grid. Users create new grids for many reasons including dataset intercomparisons, regional studies, and fine-tuned graphics. NCO and ncremap support manual generation of the most common rectangular grids as SCRIP-format grid-files. Create a grid by supplying ncremap with a grid-file name and "grid-formula" (grd_sng) that contains, at a minimum, the grid-resolution. The grid-formula is a hash-separated string of name-value pairs each representing a grid parameter. All parameters except grid resolution have reasonable defaults, so a grid-formula can be as simple as "latlon=180,360":

ncremap -g grd.nc -G latlon=180,360

Congratulations! The new grid-file grd.nc is a valid source or destination grid for ncremap commands.

Grid-file generation documentation in the NCO Users Guide at http://nco.sf.net/nco.html#grid describes all the grid parameters and contains many examples. Note that the examples in this section use the grid-generation API for ncremap version 4.7.6 (August, 2018) and later. Earlier versions can use the ncks API explained in the Users Guide.

The most useful grid parameters (besides resolution) are latitude type (lat_typ), longitude type (lon_typ), title (ttl), and, for regional grids, the SNWE bounding box (snwe). The three supported varieties of global rectangular grids are Uniform/equiangular (lat_typ=uni), Cap/FV (lat_typ=cap), and Gaussian (lat_typ=gss). The four supported longitude types place the first (westernmost) gridcell centered at Greenwich (lon_typ=grn_ctr), its western edge at Greenwich (lon_typ=grn_wst), centered at the Dateline (lon_typ=180_ctr), or its western edge at the Dateline (lon_typ=180_wst). Grids are global, uniform, store latitudes from south-to-north, and have their first longitude centered at Greenwich by default. The grid-formula for this default is 'lat_typ=uni#lon_typ=grn_ctr#lat_drc=s2n'. Some examples (remember, this API requires NCO 4.7.6+):

ncremap -g grd.nc -G latlon=180,360 # 1x1 Uniform grid
ncremap -g grd.nc -G latlon=180,360#lon_typ=grn_wst # CMIP6 1x1 Uniform grid
ncremap -g grd.nc -G latlon=129,256#lat_typ=cap # 1.4x1.4 FV grid
ncremap -g grd.nc -G latlon=94,192#lat_typ=gss # T62 Gaussian grid
ncremap -g grd.nc -G latlon=721,1440#lat_drc=n2s#lat_typ=cap#lon_typ=grn_ctr # ECMWF ERA5 native grid
ncremap -g grd.nc -G latlon=1280,2560#lat_typ=gss#lon_typ=grn_ctr#lat_drc=n2s # ECMWF IFS F640 Full Gaussian
ncremap -g grd.nc -G latlon=360,720#lat_typ=uni#lon_typ=180_wst # "r05" ELM/MOSART 0.5x0.5 uniform grid
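The gridcell weights that such grid-files encode follow from the vertices alone: on the unit sphere, a latitude-longitude cell has exact area Δλ·(sin φN − sin φS). A quick consistency check of that formula (our own sketch, not NCO code) on a global 1x1-degree uniform grid:

```python
import math

# Exact spherical area of a latitude-longitude gridcell; for rectangular
# grids the SCRIP area/weight field follows from this formula alone
def cell_area(lat_s, lat_n, lon_w, lon_e, radius=1.0):
    dlon = math.radians(lon_e - lon_w)
    sin_n = math.sin(math.radians(lat_n))
    sin_s = math.sin(math.radians(lat_s))
    return radius**2 * dlon * (sin_n - sin_s)

# Consistency check: the 180x360 cells of a global 1-degree uniform grid
# sum to the area of the unit sphere, 4*pi steradians
total = 360 * sum(cell_area(-90 + j, -89 + j, 0, 1) for j in range(180))
print(abs(total - 4 * math.pi) < 1e-9)  # -> True
```

The sin(latitude) factor is why equiangular cells shrink toward the poles, and why Cap/FV grids treat the polar cells specially.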

Regional grids are a powerful tool in regional process analyses, and can be much smaller in size than global datasets. Regional grids are always uniform. Specify the rectangular bounding box, i.e., the outside edges of the region, in SNWE order:

ncremap -g grd.nc -G latlon=30,90#snwe=55.0,85.0,-90.0,0.0 # 1x1 Greenland grid
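As a sanity check on a regional grid-formula, the implied resolution is just the bounding-box extent divided by the cell counts (ordinary arithmetic, nothing ncremap-specific):

```python
# latlon=30,90 with snwe=55.0,85.0,-90.0,0.0 implies 1-degree cells:
# dlat = (n - s) / nlat, dlon = (e - w) / nlon
nlat, nlon = 30, 90
s, n, w, e = 55.0, 85.0, -90.0, 0.0
dlat, dlon = (n - s) / nlat, (e - w) / nlon
print(dlat, dlon)  # -> 1.0 1.0
```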

Prototypical Vertical Interpolation I: Terminology

ncremap can also access all of NCO’s vertical interpolation algorithms. The avoidance of the term “regridding” in this terminology is intentional. NCO can interpolate fields to new vertical levels yet its interpolation algorithms do not conserve mass or energy (unlike horizontal regridding algorithms that are often conservative). This is properly called interpolation not regridding, although in practice we often conflate the two because the interpolation places fields on a new vertical grid.

Prototypical Vertical Interpolation II: Vertical Grid Files

The documentation of vertical interpolation on this Confluence page is nascent. Please see the complete documentation in the NCO Users Guide.

Intermediate Regridding

The sections on Prototypical Horizontal Regridding were intended to be read sequentially and introduced the most frequently required ncremap features. The Intermediate and Advanced Horizontal regridding sections are a-la-carte descriptions of features most useful to particular component models, workflows, and data formats. Peruse these sections in any order.

Intermediate Regridding I: Treatment of Missing Data

The terminology of this section can be confusing. The most important point to clarify is that the treatments of missing data described here are independent of the regridding algorithm (and weight-generator) specified by alg_typ. alg_typ only defines the algorithm used to generate the weights contained in a map-file, not how to apply those weights. ncremap applies the map-file weights (i.e., regrids) to data-files that may contain fields with spatio-temporally missing values, such as AODVIS in EAM, all fields with ocean gridpoints in ELM output, most fields in MPAS-Seaice, and many fields in satellite-retrieved datasets. Unless the locations of a field's (like AODVIS) missing values were supplied as a mask to the weight-generator algorithm at map-creation time, the map-file weights cannot and will not automatically take into account that some of the source gridcells contain invalid (aka missing) values. This section describes three options for how the weight-applicator (not weight-generator) is to reconstruct values in destination gridcells affected by missing source gridcells. One option preserves the integral value of the input gridcells and is called "integral-preserving" or "conservative" because it is first-order conservative in the affected gridcells locally, as well as globally conservative. At the other extreme is the option to preserve (locally) the gridcell mean of the input values in the affected output gridcells. This option is called "mean-preserving" or "renormalized" and is not conservative. The third option allows the user to specify a sliding scale between the integral- and mean-preserving options, which is to say it is a more general form of renormalization. Once again, these weight-application approaches are independent of the weight-generation algorithm defined by alg_typ.

Conservative weight-generation is, for first-order accurate algorithms, a straightforward procedure of identifying gridcell overlap and apportioning values correctly from source to destination. The absence of valid values (or presence of missing values) forces a choice on how to construct destination gridcell values where some but not all contributing source cells are valid. ncremap supports three distinct approaches: "integral-preserving", "mean-preserving", and "sliding scale". The integral-preserving approach uses all valid data from the input grid on the output grid once and only once. Destination cells receive the weighted valid values of the source cells. This is a first-order conservative procedure, and will, in conjunction with weights from a conservative regridding algorithm, conserve and preserve the local and global integrals of the source and destination fields. The mean-preserving approach divides the destination value by the sum of the valid weights. This preserves and extends the mean of the valid input values throughout the entire destination gridcell. In other words, it extrapolates valid data to missing regions while preserving the mean of the valid data. Input and output integrals are unequal, so mean-preserving regridding is not conservative and does not preserve the integrals when missing data are involved. Furthermore, mean-preserving regridding only preserves the local (i.e., gridcell-level) mean of the input data, not the global mean. Both approaches, integral-preserving and mean-preserving (aka conservative and re-normalized), produce identical answers when no missing data map to the destination gridcell. Before explaining the nuance of the third approach ("sliding scale"), we demonstrate the symmetric ways of invoking integral- or mean-preserving regridding:

ncremap -m map.nc dat_src.nc dat_rgr.nc # Integral-preserving treatment of missing data is used by default
ncremap --preserve=integral -m map.nc dat_src.nc dat_rgr.nc # Explicitly specify integral-preserving treatment (requires NCO 4.9.2+)
ncremap --preserve=mean -m map.nc dat_src.nc dat_rgr.nc # Explicitly specify mean-preserving treatment (requires NCO 4.9.2+)

By default, ncremap implements the integral-preserving, conservative approach because it has useful properties, is simpler to understand, and requires no additional parameters. However, this approach will often produce unrealistic gridpoint values (e.g., ocean temperatures < 100 K, and that is not a typo) along coastlines or near data gaps where state variables are regridded to/from small fractions of a gridcell. The mean-preserving approach solves this problem, yet incurs others, like non-conservation. The mean-preserving approach (aka re-normalization) ensures that the output values are physically consistent and do not stick out like sore thumbs in a plot, although the integral of their value-times-area is not conserved.

The third option, the "sliding scale" approach, is slightly more complex and much more flexible than the all-or-nothing integral- or mean-preserving examples above. The sliding scale refers to the fraction of the destination gridcell that must be overlapped by valid source gridcells. Users can specify the renormalization threshold weight rnr_thr, the required valid fraction of a destination gridcell, with the renormalization threshold option "--rnr_thr=rnr_thr". The weight-application algorithm then requires that valid values overlap at least the fraction rnr_thr of each destination gridcell for that gridcell to meet the threshold for a non-missing destination value. When rnr_thr is exceeded, the mean valid value in the valid area is placed in the destination gridcell. If the valid area covers less than rnr_thr, then the destination gridcell is assigned the missing value. Valid values of rnr_thr range from zero to one. A threshold weight of rnr_thr = 0.0 indicates that any amount (no matter how small) of valid data should be represented by its mean value on the output grid. Keep in mind, though, that the actual valid area divides a sum (the area-weighted integral of the valid values), and values of zero or very near to zero in the divisor can lead to floating-point underflow and divide-by-zero errors. A threshold weight of rnr_thr = 1.0 indicates that the entire destination gridcell (to within machine precision) must be overlapped by valid data in order for that destination gridcell to be assigned a valid (non-missing) value. Threshold weights 0.0 < rnr_thr < 1.0 invoke the real power of the sliding scale. Remote sensing classification schemes applied to L2 data may require, for example, valid retrievals over at least 50% of source pixels before assigning a valid value to a destination gridcell. This is equivalent to rnr_thr = 0.5. And so,

ncremap --rnr_thr=0.25 -m map.nc dat_src.nc dat_rgr.nc # Sliding scale: Destination value preserves mean of input data, requires at least 25% valid input data
ncremap --rnr_thr=0.50 -m map.nc dat_src.nc dat_rgr.nc # Sliding scale: Destination value preserves mean of input data, requires at least 50% valid input data

These sliding scale examples specify that valid values must cover at least 25% and 50% of the destination gridcell to meet the threshold for a non-missing destination value. With actual valid destination areas of 25% or 50%, this approach would produce destination values greater than the conservative algorithm by factors of four and two, respectively. Careful readers may already have observed that the mean-preserving approach (--preserve=mean) is exactly equivalent to the sliding scale approach with a renormalization threshold weight rnr_thr = 0.0. The latter approach is just a more numerical way of expressing the former. On the other hand, no numerical threshold weight produces answers equivalent to the integral-preserving approach. However, it is convenient for power users to be able to invoke all three missing-data approaches via the rnr_thr option alone. Hence, setting the threshold weight to the string "none" (--rnr_thr=none) is exactly equivalent to specifying the integral-preserving approach. In summary, the --preserve option with its two valid arguments, integral and mean, which invoke the integral-preserving and mean-preserving algorithms, will suffice for most situations. The sliding-scale algorithm must be invoked via the --rnr_thr option with a numerical argument, although the string "none" will default to the integral-preserving algorithm.
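The three treatments can be sketched in a few lines of Python. This is an illustration of the algorithm described in the text, not NCO's implementation; regrid_cell and its arguments are hypothetical names, and weights are assumed normalized to destination-cell area fractions.

```python
def regrid_cell(contrib, rnr_thr):
    """Apply mapping weights to one destination gridcell.

    contrib : list of (weight, value) pairs, where weight is the fraction
              of the destination cell overlapped by the source cell, and
              value is None for missing data.
    rnr_thr : renormalization threshold in [0, 1], or None for the
              default integral-preserving treatment.
    Returns the destination value, or None for a missing value.
    """
    valid = [(w, v) for w, v in contrib if v is not None]
    valid_frac = sum(w for w, _ in valid)      # fraction of cell with valid data
    integral = sum(w * v for w, v in valid)    # area-weighted integral

    if rnr_thr is None:              # integral-preserving (conservative)
        return integral
    if valid_frac == 0.0 or valid_frac < rnr_thr:
        return None                  # below threshold -> missing value
    return integral / valid_frac     # renormalized (mean-preserving) value

# Destination cell half-covered by valid data (value 10.0), half missing:
cell = [(0.5, 10.0), (0.5, None)]
print(regrid_cell(cell, None))   # -> 5.0  (integral-preserving)
print(regrid_cell(cell, 0.25))   # -> 10.0 (sliding scale, threshold met)
print(regrid_cell(cell, 0.75))   # -> None (sliding scale, threshold not met)
```

With rnr_thr = 0.0 the function reduces to the mean-preserving treatment (--preserve=mean), and rnr_thr = None mimics --rnr_thr=none; note the factor-of-two difference between the conservative value (5.0) and the renormalized value (10.0) for a half-covered cell, as described above.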

In practice, it may make sense to use the default integral-preserving treatment with conservative weights, and the mean-preserving or sliding-scale treatment with other (non-conservative) weights such as those produced by bilinear interpolation or nearest-neighbor. Another consideration in selecting the weight-application algorithm for missing values is whether the fields to regrid are fluxes or state variables. For example, temperature (unlike heat) and concentrations (amount per unit volume) are not physically conserved quantities under areal-regridding so it often makes sense to interpolate them in a non-conservative fashion, to preserve their fine-scale structures. Few researchers can digest the unphysical values of temperature that the integral-preserving treatment produces in regions rife with missing values. On the other hand, mass and energy fluxes should be physically conserved under areal-regridding. Hence, one must consider both the type of field and its conservation properties when choosing a missing value treatment.

Intermediate Regridding II: TempestRemap

TempestRemap (TR) is the chief alternative to ERWG and NCO for regridding weight-generation. TR is replacing ERWG as the default online weight-generator in E3SMv2. Tempest algorithms, written by @paulullrich, have many numerical advantages described in papers and at Transition to TempestRemap for Atmosphere grids. Verify that ncremap can access your Tempest installation as described in the "Software Requirements" section above before trying the examples below. Once installed, TR can be as easy to use as ERWG or NCO with FV grid-files, e.g.,

ncremap -a fv2fv_flx -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_tempest.YYYYMMDD.nc

This command, the same as shown in the "Create Map-file from Known Grid-files" section above except using alg_typ='fv2fv_flx', is a jumping-off point for understanding Tempest features and quirks. First, simply note that the ncremap interfaces for the ERWG, NCO, and TR weight-generators are the same even though the underlying ERWG, NCO, and TR applications have different APIs. Second, Tempest accepts SCRIP-format grids (as shown) and Exodus-format grid-files, also stored in netCDF though typically with a '.g' suffix, e.g., ne30.g, as described at Transition to TempestRemap for Atmosphere grids. Exodus grid-files contain grid "connectivity" and other information required to optimally treat advanced grids like SE. Hence this also works:

ncremap -a se2fv_flx -s ne30.g -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_tempest.YYYYMMDD.nc

This produces subtly different weights because ne30.g encodes the SE ne30np4 grid-specification, not its dual-grid FV representation. TR generates superior weights when instructed to use an algorithm optimized for the type of remapped variable and the grid representation as described below. The above example employs the recommended algorithm to remap fluxes on the SE grid to the FV destination grid, i.e., se2fv_flx.

The exact options that NCO invokes for a specific TR algorithm like se2fv_flx can be discovered in multiple ways: First, all TR options that NCO employs are those recommended on the Transition to TempestRemap for Atmosphere grids page; Second, the NCO Users Guide documents TR options at http://nco.sf.net/nco.html#tr; Third, the options --dbg_lvl=1 or --dbg_lvl=2 cause ncremap to print the sub-commands it executes, including the TR commands with options. Experiment, intercompare, and find the algorithm that works best for your purposes. Advanced users may be interested in the quantitative evaluations of the quality of the weights in the map-file provided by the map-checker (ncks --chk_map map.nc) described below.

As mentioned above, the Tempest overlap-mesh generator expects to be sent the two grid-files in the order smaller-then-larger (ERWG and NCO have no corresponding restriction). For example, Tempest considers the global ocean to be a smaller domain than the global atmosphere since it covers less area (due to masked points). Hence ncremap must be told when the source grid-file covers a larger domain than the destination. Do this with the "--a2o" or "--l2s" switch (for "atmosphere-to-ocean" or "large-to-small", respectively):

ncremap --l2s -a se2fv_flx -s atm_se_grd.nc -g ocn_fv_grd.nc -m map.nc # Source larger than destination
ncremap -a fv2se_flx -s ocn_fv_grd.nc -g atm_se_grd.nc -m map.nc # Source smaller than destination
ncremap -a se2fv_flx -s atm_se_grd.nc -g atm_fv_grd.nc -m map.nc # Source same size as destination

As mentioned above, EAM v1 datasets are the only ones stored in an SE grid format. To accommodate the mixture of FV and SE grids needed for model evaluation, Transition to TempestRemap for Atmosphere grids describes eight specific global FV<->SE mappings that optimize different combinations of accuracy, conservation, and monotonicity desirable for remapping flux-variables (flx), state-variables (stt), and an alternative (alt) mapping. A plethora of boutique options and switches control the Tempest weight-generation algorithms for these cases. To simplify their invocation, ncremap names these algorithms fv2se_stt, fv2se_flx, fv2se_alt, fv2fv_flx, fv2fv_stt, se2fv_stt, se2fv_flx, se2fv_alt, and se2se. E3SM maps with these algorithms have adopted the suffixes "mono" (for fv2fv_flx, se2fv_flx, and fv2se_alt), "highorder" (fv2fv_stt, se2fv_stt, fv2se_stt), "intbilin" (se2fv_alt), and "monotr" (fv2se_flx). The relevant Tempest map-files can be generated with

ncremap -a fv2se_flx -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g -m map_oRRS30to10_to_ne30np4_monotr.20180901.nc
ncremap -a fv2se_stt -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g -m map_oRRS30to10_to_ne30np4_highorder.20180901.nc
ncremap -a fv2se_alt -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g -m map_oRRS30to10_to_ne30np4_mono.20180901.nc
ncremap --l2s -a se2fv_flx -s ne30.g -g ocean.RRS.30-10km_scrip_150722.nc -m map_ne30np4_to_oRRS30to10_mono.20180901.nc
ncremap --l2s -a se2fv_stt -s ne30.g -g ocean.RRS.30-10km_scrip_150722.nc -m map_ne30np4_to_oRRS30to10_highorder.20180901.nc
ncremap --l2s -a se2fv_alt -s ne30.g -g ocean.RRS.30-10km_scrip_150722.nc -m map_ne30np4_to_oRRS30to10_intbilin.20180901.nc
ncremap -a se2se -s ne30.g -g ne120.g -m map_ne30np4_to_ne120np4_se2se.20180901.nc
ncremap -a fv2fv_flx -s ocean.RRS.30-10km_scrip_150722.nc -g cmip6_180x360_scrip.20181001.nc -m map_oRRS30to10_to_180x360_mono.20180901.nc
ncremap -a fv2fv_stt -s ocean.RRS.30-10km_scrip_150722.nc -g cmip6_180x360_scrip.20181001.nc -m map_oRRS30to10_to_180x360_highorder.20180901.nc
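The algorithm-to-suffix convention enumerated above maps directly onto the map-file names in these commands. The following lookup and map_name helper are hypothetical illustrations of that naming pattern, not part of ncremap:

```python
# Suffixes adopted by E3SM for Tempest-generated map-files (per the text above)
alg_sfx = {
    "fv2fv_flx": "mono", "se2fv_flx": "mono", "fv2se_alt": "mono",
    "fv2fv_stt": "highorder", "se2fv_stt": "highorder", "fv2se_stt": "highorder",
    "se2fv_alt": "intbilin",
    "fv2se_flx": "monotr",
    "se2se": "se2se",
}

def map_name(src, dst, alg, date):
    """Construct a map-file name following the pattern in the commands above."""
    return f"map_{src}_to_{dst}_{alg_sfx[alg]}.{date}.nc"

print(map_name("oRRS30to10", "ne30np4", "fv2se_flx", "20180901"))
# -> map_oRRS30to10_to_ne30np4_monotr.20180901.nc
```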

These maps, applied to appropriate flux and state variables, should exactly reproduce the online remapping in the E3SM v1 coupler. However, explicitly generating all standard maps this way is not recommended because ncremap includes an MWF-mode (for "Make All Weight Files") described below. MWF-mode generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps for ERWG and Tempest. The E3SM v2/v3 configurations all use FV grids. Moreover, the mapfile naming convention changed in v3.

Intermediate Regridding III: MOAB/mbtempest support

Weight-generator computations (e.g., finding grid intersections) increase non-linearly with the grid size, so the largest grids are most efficiently computed with parallel algorithms. ERWG has long supported distributed computation in an MPI-environment, and NCO has always supported multi-threaded weight computation via OpenMP. A growing subset of TempestRemap algorithms have now been ported to the parallel MOAB (Mesh Oriented datABase) tool mbtempest (aka MBTR). ncremap can invoke the MOAB regridding toolchain as of NCO version 5.0.2 from September, 2021. The "spoiler" answer to "How do I get ncremap to invoke mbtempest?" is simple: Add the --mpi_nbr=mpi_nbr option to an ncremap command that already calls a TempestRemap algorithm. However, MOAB involves complex, MPI-enabled software, and support for it is continually changing and subject to important limitations. Read on to understand what to currently expect.

First, MOAB requires an MPI environment to perform well. Invoking MOAB (i.e., using --mpi_nbr=mpi_nbr with a TR algorithm name) in a non-MPI environment will result in an error. One can easily obtain an MPI-enabled MOAB environment with Conda. For example, install the Conda MPI version of the MOAB package with

conda install -c conda-forge moab=5.3.0=mpich_tempest esmf

Please ensure you have the latest version of ERWG, MOAB, and/or TempestRemap before reporting any related problems to NCO.

This section is intentionally placed after the TR section because mbtempest re-uses the TR algorithm names described in the previous section. For example, ncremap invokes TR to generate weights to remap fluxes from FV to FV grids when invoked with

ncremap -a fv2fv_flx --src_grd=src.g --dst_grd=dst.nc -m map.nc # Invoke TR

Add the --mpi_nbr option and ncremap will instead invoke the MOAB toolchain to compute weights for any TempestRemap algorithm (otherwise the TR toolchain would be used):

ncremap --mpi_nbr=8 -a fv2fv_flx --src_grd=src.g --dst_grd=dst.nc -m map.nc # Invoke MOAB

Although mapping weights generated by MOAB and TempestRemap use the same numerical algorithms, they are likely to produce slightly different weights due to round-off differences. MOAB is heavily parallelized and computes and adds terms together in an unpredictable order compared to the serial TempestRemap.
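A tiny Python example shows why reduction order matters in finite precision; nothing here is MOAB- or TR-specific, it merely demonstrates that floating-point addition is not associative:

```python
# Summing the same weight contributions in a different order (as a
# parallel reduction does) can change the low-order bits of the result.
terms = [1.0e16, 1.0, -1.0e16]

serial = (terms[0] + terms[1]) + terms[2]     # left-to-right: 1.0 is absorbed
reordered = (terms[0] + terms[2]) + terms[1]  # large terms cancel first

print(serial, reordered)  # -> 0.0 1.0
```

Real mapping weights differ by far less than this contrived example, so MOAB- and TR-generated maps agree to rounding precision rather than bit-for-bit.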

Transparently to the user, ncremap supplies the selected weight-generator with the recommended options, which can be quite complex. For the preceding example, ncremap either invokes TempestRemap's GenerateOfflineMap with the boutique options --in_type fv --in_np 1 --out_type fv --out_np 1 --correct_areas that E3SM recommends for conservative and monotone remapping of fluxes, or it invokes multiple components of the MOAB toolchain:

mbconvert -B -o PARALLEL=WRITE_PART -O PARALLEL=BCAST_DELETE -O PARTITION=TRIVIAL -O PARALLEL_RESOLVE_SHARED_ENTS "src.g" "src.h5m"
mbconvert -B -o PARALLEL=WRITE_PART -O PARALLEL=BCAST_DELETE -O PARTITION=TRIVIAL -O PARALLEL_RESOLVE_SHARED_ENTS "dst.nc" "dst.h5m"
mbpart 8 --zoltan RCB "src.h5m" "src_8p.h5m"
mbpart 8 --zoltan RCB --recompute_rcb_box --scale_sphere --project_on_sphere 2 "dst.h5m" "dst_8p.h5m"
mpirun -n 8 mbtempest --type 5 --weights --load "src_8p.h5m" --load "dst_8p.h5m" --method fv --order 1 --method fv --order 1 --file "map.nc"

The purpose of the ncremap front-end to MOAB is to hide this complexity from the user while preserving the familiar look and feel of invoking other weight-generators. Once again, the MOAB toolchain should produce a map-file identical (to rounding precision) to one produced by TR. When speed matters (i.e., large grids), and the algorithm is supported (i.e., fv2fv_flx), invoke MOAB, otherwise invoke TR.

This section comes before the parallelism section because mbtempest supports MPI-enabled parallelism that is distinct from the ncremap workflow parallelism described in the next section.

Caveat lector: As of September 2021, weights generated by MOAB (version 5.3.0 and earlier) are only trustworthy for the fv2fv_flx algorithm. The options for all other algorithms are implemented as indicated, though they should be invoked for testing purposes only. High-order and spectral-element map algorithms are not fully implemented and will produce unreliable results. MOAB anticipates supporting more TempestRemap algorithms in the future. Always use the map-checker to test maps before use, e.g., with ncks --chk_map map.nc.

Another limitation is that mbtempest (version 5.3.0) currently only generates map-files that regrid to rank 1 (unstructured) grids. mbtempest "unrolls" any rank 2 (e.g., lat x lon) destination grid into its rank 1 equivalent. Users seeking to regrid to rank 2 grids can manually alter the MOAB-generated map-file with something like:

ncks -O -x -v dst_grid_dims ~/map.nc ~/map_tmp.nc
ncap2 -O -s 'defdim("dst_grid_dims",2);dst_grid_dims($dst_grid_dims)={256,129};' ~/map_tmp.nc ~/map_fix.nc

Replace 256,129 above by the lon,lat sizes (Fortran order!) of your grid. The resulting map-file, map_fix.nc, will regrid almost as intended. The regridded output will have curvilinear coordinates (e.g., lat(lat,lon)) instead of rectangular coordinates (e.g., lat(lat)), and the coordinate bounds variables (e.g., lat_bnds) will not be correct and may cause plotting issues at the poles and Greenwich. Nevertheless, the weights are correct and of unsurpassed accuracy. MOAB anticipates supporting rank 2 map-files and TR high-order and spectral-element algorithms in the future.

Intermediate Regridding IV: Parallelism

ncremap can exploit three types of parallelism: multiple nodes in a cluster, multiple simultaneous file regriddings on a single node, and multiple simultaneously regridded variables within a single file. Each level of parallelism reduces the wallclock time to complete the regridding workflow at the expense of increased resource requirements. First we describe ways to control file-level parallelism during regridding, then we move to OpenMP threading of variables within the same file. At the end we sketch out how these methods may be combined.

File-level parallelism accelerates throughput when regridding multiple files in one ncremap invocation, and has no effect when only one file is to be regridded. Note that the ncclimo and ncremap semantics for selecting file-level parallelism are identical, though their defaults differ (Background mode for ncclimo and Serial mode for ncremap). Select the desired mode with the argument to --par_typ=par_typ. Explicitly select Background mode with par_typ values of bck, background, or Background. The values mpi or MPI select MPI mode, and srl, serial, Serial, nil, and none all select Serial mode (which disables file-level parallelism, though still allows intra-file OpenMP parallelism over variables).
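The synonym mapping just listed can be summarized in a small table. This sketch (parse_par_typ is a hypothetical name) only illustrates the documented synonyms, not ncremap's actual option parser:

```python
def parse_par_typ(arg):
    """Map documented --par_typ synonyms to a canonical mode name."""
    synonyms = {
        "Background": {"bck", "background", "Background"},
        "MPI": {"mpi", "MPI"},
        "Serial": {"srl", "serial", "Serial", "nil", "none"},
    }
    for mode, names in synonyms.items():
        if arg in names:
            return mode
    raise ValueError(f"unknown par_typ: {arg}")

print(parse_par_typ("bck"))   # -> Background
print(parse_par_typ("none"))  # -> Serial
```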

The default file-level parallelism for ncremap is Serial mode (i.e., no file-level parallelism), in which ncremap processes one input file at a time on a single node. Background and MPI modes implement true file-level parallelism on one node or multiple nodes, respectively. Typically both these parallel modes scale well with sufficient memory unless and until I/O contention becomes the bottleneck. In Background mode ncremap issues all commands to regrid the input file list as UNIX background processes on the local node. Nodes with multiple cores and sufficient RAM take advantage of this to simultaneously regrid multiple files. In MPI mode ncremap issues commands to regrid the input file list in round-robin fashion to all available compute nodes. Prior to NCO version 4.9.0 (released December, 2019), Background and MPI parallelism modes both regridded all the input files at one time and there was no way to limit the number of files being simultaneously regridded. Subsequent versions allow finer-grained parallelism by introducing the ability to limit the number of discrete workflow elements or "jobs" (i.e., file regriddings) to perform simultaneously within an ncremap invocation or "workflow".

# 1. (Default) Regrid files in serial order, on single node

ncremap --par=serial --map=map.nc file1.nc file2.nc … fileN.nc

# 2a. Regrid files in parallel on a single node (Suitable for "small" files, otherwise RAM constraints and I/O contention can limit efficacy)

ncremap --par=background --map=map.nc file1.nc file2.nc … fileN.nc

# 3a. Regrid files in parallel in round-robin order on multiple nodes (Suitable for large jobs with many beefy files)

ncremap --par=mpi --map=map.nc file1.nc file2.nc … fileN.nc

When in MPI mode ncremap automatically determines the number and names of available nodes from the array returned by scontrol show hostname ${SLURM_NODELIST} (in SLURM) or from the contents of ${PBS_NODEFILE} (in PBS). The node number (node_nbr) is also the argument given to sbatch -N node_nbr, salloc -N node_nbr, and srun -N node_nbr (in SLURM) or qsub -l nodes=node_nbr (in PBS) to create the environment in which ncremap is running. The ncremap option --mpi_nbr=mpi_nbr is only used to control the number of MPI tasks invoked by parallel weight-generators (mbtempest and ESMF_RegridWeightGen) when generating weights. Do not try to use mpi_nbr to control the number of nodes to which ncremap distributes files to regrid.

As of NCO version 4.9.0 (released December, 2019), the --job_nbr=job_nbr option specifies the maximum number of files to regrid simultaneously on all nodes being harnessed by the workflow. Thus job_nbr is an additional parameter to fine-tune file-level parallelism (it has no effect in Serial mode). In both parallel modes ncremap spawns processes in batches of job_nbr jobs, then waits for those processes to complete. Once a batch finishes, ncremap spawns the next batch. In Background mode, all jobs are spawned to the local node. In MPI mode, all jobs are spawned in round-robin fashion to all available nodes until job_nbr jobs are running.
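The batching behavior just described can be sketched as follows. plan_batches is a hypothetical illustration of the scheduling (batches of job_nbr jobs, assigned round-robin to nodes in MPI mode), not ncremap internals:

```python
def plan_batches(files, job_nbr, nodes):
    """Split files into batches of at most job_nbr jobs, assigning each
    job in a batch round-robin to the available nodes."""
    batches = []
    for beg in range(0, len(files), job_nbr):
        batch = files[beg:beg + job_nbr]
        batches.append([(f, nodes[i % len(nodes)]) for i, f in enumerate(batch)])
    return batches

files = [f"file{i}.nc" for i in range(1, 13)]   # 12 input files
nodes = ["node0", "node1", "node2", "node3"]    # 4 compute nodes
batches = plan_batches(files, 8, nodes)
print([len(b) for b in batches])  # -> [8, 4]
```

With job_nbr = 8 and 4 nodes, the first batch places two files on each node and the second batch places one, matching the MPI-mode example further below.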

If regridding consumes so much RAM (e.g., because variables are large and/or the number of threads is large) that a single node can perform only one regridding job at a time, then a reasonable value for job_nbr is the number of nodes, node_nbr. Often, however, nodes can regrid multiple files simultaneously. It can be more efficient to spawn multiple jobs per node than to increase the threading per job because I/O contention for write access to a single file prevents threading from scaling indefinitely.

By default job_nbr = 2 in Background mode, and job_nbr = node_nbr in MPI mode. This helps prevent users from overloading nodes with too many jobs. Subject to the availability of adequate RAM, expand the number of jobs per node by increasing job_nbr until performance plateaus. Remember that processes and threading are multiplicative in core use. Regridding four files on a node, each with four OpenMP threads (see below), will utilize sixteen cores on the node. In practice, I/O contention (i.e., waiting for access to the disk) causes throughput to saturate (and then decline) at somewhere between four and sixteen cores.

# 2b. Background parallelism default is 2 jobs (aka files) at a time

ncremap --par=background --job_nbr=1 --map=map.nc file1.nc file2.nc … fileN.nc # Equivalent to Serial mode (1 job at a time)

ncremap --par=background --job_nbr=2 --map=map.nc file1.nc file2.nc … fileN.nc # Default

ncremap --par=background --job_nbr=5 --map=map.nc file1.nc file2.nc … fileN.nc # More simultaneous when files are small or RAM is large

# 3b. MPI Parallelism: job_nbr defaults to node_nbr. Consider the case where node_nbr is 4 and file_nbr is 12

ncremap --par=mpi --map=map.nc file1.nc file2.nc … fileN.nc # (Default) job_nbr = node_nbr = 4: each node regrids one file at a time, 3 files in total over three batches

ncremap --par=mpi --job_nbr=1 --map=map.nc file1.nc file2.nc … fileN.nc # One file regridded at a time across all nodes, twelve batches in total

ncremap --par=mpi --job_nbr=8 --map=map.nc file1.nc file2.nc … fileN.nc # Each node gets 2 files in first round-robin. Once those complete, each node will get one file during the second round-robin

We have thus far demonstrated how to control file-level parallelism (with par_typ) and workflow-level parallelism (with job_nbr). The third level of parallelism mentioned above is that ncremap uses OpenMP shared-memory techniques (aka threading) to simultaneously regrid multiple variables within a single file. This capability is only available when NCO has been compiled and linked with the OpenMP library. This shared-memory parallelism over variables is efficient because it requires only a single copy of the regridding weights in physical memory for any number of regridding threads.

The --thr_nbr=thr_nbr option controls the number of threads. By default ncremap sets thr_nbr = 2. Users may override this default and select anywhere from one to sixteen threads. NCO prevents using more than sixteen threads because, in our experience, throughput saturates due to I/O contention. Even so, regridding multiple variables at high resolution may become memory-limited, meaning that insufficient RAM can limit the number of variables that the system can simultaneously regrid.

UPDATE on 12/3/2025 regarding OpenMP behavior on netCDF4 files: Don’t rely on it! Explicitly set thr_nbr = 1 when regridding netCDF4 files.
netCDF4 formats (except Zarr) use HDF5 as the back-end storage format and the HDF5 library is not thread-safe. For this reason ncremap is unreliable when operating with more than one thread on netCDF4 files. This is a longstanding limitation and is not expected to change soon, although a future version of NCO may automatically limit regridder operations to one thread on netCDF4 files. Fortunately, most E3SM files are in netCDF3 format and are not subject to this limitation. Moreover, many people regrid on login nodes that prevent using multiple threads and will not notice this limitation. That said, expect ncremap to crash when attempting to regrid a netCDF4 file unless thr_nbr = 1 which can be explicitly specified using the option --thr_nbr=1 with either ncremap or ncks. Another workaround is to first convert the netCDF4 file to netCDF3 (e.g., with ncks -5 in.nc out.nc) then regrid the netCDF3 file (with as many threads as you like), then convert back to netCDF4 if desired.

Why does regridding demand so much RAM? How can OOM errors occur even when the node has more memory than the disk-size of the file? The regridder only loads one copy of the mapfile weights into memory no matter how many threads are used, because by convention all variables to be regridded share the same regridding weights. Instead, it is the per-thread (i.e., per-variable) OpenMP memory demands that are considerable, with the memory required to regrid variables amounting to no less than about 5-7 times (for type NC_FLOAT) and 2.5-3.5 times (for type NC_DOUBLE) the size of the uncompressed variable, respectively. Memory requirements are so high because the regridder performs all arithmetic in double precision to retain the highest accuracy, and must allocate separate buffers to hold the input and output (regridded) variable, a tally array to count the number of missing values, and an array to sum the weights contributing to each output gridcell. The last two arrays are only necessary for variables with missing values. Missing values are those values that equal an explicitly specified _FillValue attribute, or equal the default netCDF _FillValue for the variable type (9.9692099683868690e+36 for floating point types, -2147483647 for four-byte integers). The input, output, and weight-sum arrays are always double precision, and the tally array is composed of four-byte integers. Given the high memory demands, one strategy to optimize thr_nbr for repetitious workflows is to keep doubling it (1, 2, 4, ...) until throughput stops improving. With sufficient memory, the NCO regridder scales well up to 8-16 threads.
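These per-variable demands can be tallied with back-of-envelope arithmetic that reproduces the upper ends of the ranges quoted above. regrid_ram_per_var is an illustrative calculation, not NCO's actual allocator:

```python
def regrid_ram_per_var(n_elem, in_bytes, has_missing=True):
    """Rough per-variable RAM for regridding, per the text: input and
    output buffers and the weight-sum array are double precision
    (8 bytes/element), the tally array is 4-byte integer, and the
    latter two are needed only for variables with missing values.
    Returns (bytes, ratio to uncompressed variable size)."""
    ram = 2 * 8 * n_elem          # input + output buffers (double)
    if has_missing:
        ram += 8 * n_elem         # weight-sum array (double)
        ram += 4 * n_elem         # tally array (4-byte int)
    return ram, ram / (in_bytes * n_elem)

n = 1_000_000
_, ratio_flt = regrid_ram_per_var(n, 4)  # NC_FLOAT: 4 bytes/element
_, ratio_dbl = regrid_ram_per_var(n, 8)  # NC_DOUBLE: 8 bytes/element
print(ratio_flt, ratio_dbl)  # -> 7.0 3.5
```

The ratios 7.0 and 3.5 match the upper ends of the 5-7x (NC_FLOAT) and 2.5-3.5x (NC_DOUBLE) ranges above; variables without missing values skip the tally and weight-sum arrays and need only about 4x and 2x, respectively.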

Here is a simple scaling analysis of regridding a full EAMv3 h0 monthly output file (160 MB, 250 variables) from the native atmosphere grid to the land grid with increasing OpenMP threads on a Chrysalis compute node:

ac.zender@chr-0134:~$ for thr_nbr in 1 2 4 8 16; do

time ncremap --thr_nbr=${thr_nbr} --map=${DATA}/maps/map_ne30pg2_to_r05_traave.20240701.nc ${DATA}/bm/eamv3_ne30pg2l80.nc ~/foo.nc

done

The wallclock times to regrid this file with 1, 2, 4, 8, and 16 threads were 0m8.632s, 0m4.819s, 0m4.145s, 0m3.691s, and 0m3.520s, respectively. Two threads (the default) are ~1.8x faster than one thread, four threads are 2.1x faster, eight threads are 2.3x faster, and sixteen threads are 2.5x faster. Throughput does not appreciably increase beyond sixteen threads. This saturation is due to I/O contention.
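The quoted speedup factors follow directly from the measured times:

```python
# Measured wallclock times (seconds) for the EAMv3 scaling test above
times = {1: 8.632, 2: 4.819, 4: 4.145, 8: 3.691, 16: 3.520}

# Speedup relative to the single-threaded run, rounded to one decimal
speedup = {thr: round(times[1] / t, 1) for thr, t in times.items()}
print(speedup)  # -> {1: 1.0, 2: 1.8, 4: 2.1, 8: 2.3, 16: 2.5}
```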

Another relevant scaling analysis is the performance of regridding EAMxx ne1024pg2 data of the decadal simulation at NERSC from the cubed-sphere native grid to a higher-resolution lat/lon grid. Here we used

map=/global/cfs/cdirs/e3sm/inputdata/atm/scream/init/map_ne1024pg2_to_refined_5760_by_11520.nc

fl_in=/global/cfs/cdirs/e3smdata/simulations/scream-decadal/decadal-production-run6/run/output.scream.decadal.6hourlyINST_ne1024pg2/output.scream.decadal.6hourlyINST_ne1024pg2.INSTANT.nhours_x6.1994-10-01-00000.nc

zender@nid005946:~$ for thr_nbr in 1 2 4 8 16; do
time ncremap --thr_nbr=${thr_nbr} --map=${map} ${fl_in} ${SCRATCH}/foo.${thr_nbr}.nc
done

and obtained wallclock times of 2m39s, 1m53s, 1m30s, 1m18s, and 1m9s. Two threads (the default) are ~1.4x faster than one thread, four threads are 1.8 times faster, eight threads are 2.0x faster and sixteen threads are 2.3x faster than one thread, respectively. The scaling rises more slowly than the EAM example above, yet reaches a similar value with 16 threads.

The previous two examples showed how OpenMP threading can more-or-less double the speed of I/O-intensive regridding. The --thr_nbr=thr_nbr option also works on weight-generation with the NCO conservative option. Weight-generation scales much differently than regridding because it involves a greater amount of arithmetic interspersed with I/O that can be highly contentious. Consider the case of upscaling the ne1024pg2 EAMxx physics grid to the r025 ELM land grid.