NetCDF output

SpeedyWeather.jl uses NetCDF to output the data of a simulation. The following describes the details of this and how to change the way in which the NetCDF output is written. There are many options to this available.

Creating NetCDFOutput

using SpeedyWeather
spectral_grid = SpectralGrid()
output = NetCDFOutput(spectral_grid)
NetCDFOutput{FullGaussianGrid{Float32}}
├ status: inactive/uninitialized
├ write restart file: true (if active)
├ interpolator: AnvilInterpolator{Float32, OctahedralGaussianGrid}
├ path: output.nc
├ frequency: 21600 seconds
└┐ variables:
 ├ v: meridional wind [m/s]
 ├ u: zonal wind [m/s]
 └ vor: relative vorticity [s^-1]

With NetCDFOutput(::SpectralGrid, ...) one creates a NetCDFOutput writer with several options, which are explained in the following. By default, the NetCDFOutput is created when constructing the model, i.e.

model = ShallowWaterModel(spectral_grid)
model.output
NetCDFOutput{FullGaussianGrid{Float32}}
├ status: inactive/uninitialized
├ write restart file: true (if active)
├ interpolator: AnvilInterpolator{Float32, OctahedralGaussianGrid}
├ path: output.nc
├ frequency: 21600 seconds
└┐ variables:
 ├ eta: interface displacement [m]
 ├ v: meridional wind [m/s]
 ├ u: zonal wind [m/s]
 └ vor: relative vorticity [s^-1]

The output writer is a component of every Model, i.e. BarotropicModel, ShallowWaterModel, PrimitiveDryModel and PrimitiveWetModel, and they only differ in their default output.variables (e.g. the primitive models would by default output temperature which does not exist in the 2D models BarotropicModel or ShallowWaterModel). But any NetCDFOutput can be passed onto the model constructor with the output keyword argument.

output = NetCDFOutput(spectral_grid, Barotropic)
model = ShallowWaterModel(spectral_grid, output=output)

Here, we created NetCDFOutput for the model class Barotropic (2nd positional argument, outputting only vorticity and velocity) but use it in the ShallowWaterModel. By default the NetCDFOutput is set to inactive, i.e. output.active is false. It is only turned on (and initialized) with run!(simulation, output=true). So you may change the NetCDFOutput as you like but only calling run!(simulation) will not trigger it as output=false is the default here.

Output frequency

If we want to increase the frequency of the output we can choose output_dt (default =Hour(6)) like so

output = NetCDFOutput(spectral_grid, ShallowWater, output_dt=Hour(1))
model = ShallowWaterModel(spectral_grid, output=output)
model.output
NetCDFOutput{FullGaussianGrid{Float32}}
├ status: inactive/uninitialized
├ write restart file: true (if active)
├ interpolator: AnvilInterpolator{Float32, OctahedralGaussianGrid}
├ path: output.nc
├ frequency: 3600 seconds
└┐ variables:
 ├ eta: interface displacement [m]
 ├ v: meridional wind [m/s]
 ├ u: zonal wind [m/s]
 └ vor: relative vorticity [s^-1]

which will now output every hour. It is important to pass on the new output writer output to the model constructor, otherwise it will not be part of your model and the default is used instead. Note that the choice of output_dt can affect the actual time step that is used for the model integration, which is explained in the following. Example, we run the model at a resolution of T42 and the time step is going to be

spectral_grid = SpectralGrid(trunc=42, nlayers=1)
time_stepping = Leapfrog(spectral_grid)
time_stepping.Δt_sec
1350.0f0

seconds. Depending on the output frequency (we chose output_dt = Hour(1) above) this will be slightly adjusted during model initialization:

output = NetCDFOutput(spectral_grid, ShallowWater, output_dt=Hour(1))
model = ShallowWaterModel(spectral_grid; time_stepping, output)
simulation = initialize!(model)
model.time_stepping.Δt_sec
1200.0f0

The shorter the output time step the more the model time step needs to be adjusted to match the desired output time step exactly. This is important so that for daily output at noon this does not slowly shift towards night over years of model integration. One can always disable this adjustment with

time_stepping = Leapfrog(spectral_grid, adjust_with_output=false)
time_stepping.Δt_sec
1339.535f0

and a little info will be printed to explain that even though you wanted output_dt = Hour(1) you will not actually get this upon initialization:

model = ShallowWaterModel(spectral_grid; time_stepping, output)
simulation = initialize!(model)
Simulation{ShallowWaterModel}
├ prognostic_variables::PrognosticVariables{...}
├ diagnostic_variables::DiagnosticVariables{...}
└ model::ShallowWaterModel{...}

The time axis of the NetCDF output will now look like

using NCDatasets
run!(simulation, period=Day(1), output=true)
id = model.output.id
ds = NCDataset("run_$id/output.nc")
ds["time"][:]
22-element Vector{DateTime}:
 2000-01-01T00:00:00
 2000-01-01T01:06:58.605
 2000-01-01T02:13:57.210
 2000-01-01T03:20:55.815
 2000-01-01T04:27:54.420
 2000-01-01T05:34:53.025
 2000-01-01T06:41:51.630
 2000-01-01T07:48:50.235
 2000-01-01T08:55:48.840
 2000-01-01T10:02:47.445
 ⋮
 2000-01-01T14:30:41.865
 2000-01-01T15:37:40.470
 2000-01-01T16:44:39.075
 2000-01-01T17:51:37.680
 2000-01-01T18:58:36.285
 2000-01-01T20:05:34.890
 2000-01-01T21:12:33.495
 2000-01-01T22:19:32.100
 2000-01-01T23:26:30.705

which is a bit ugly, that's why adjust_with_output=true is the default. In that case we would have

time_stepping = Leapfrog(spectral_grid, adjust_with_output=true)
output = NetCDFOutput(spectral_grid, ShallowWater, output_dt=Hour(1))
model = ShallowWaterModel(spectral_grid; time_stepping, output)
simulation = initialize!(model)
run!(simulation, period=Day(1), output=true)
id = model.output.id
ds = NCDataset("run_$id/output.nc")
ds["time"][:]
25-element Vector{DateTime}:
 2000-01-01T00:00:00
 2000-01-01T01:00:00
 2000-01-01T02:00:00
 2000-01-01T03:00:00
 2000-01-01T04:00:00
 2000-01-01T05:00:00
 2000-01-01T06:00:00
 2000-01-01T07:00:00
 2000-01-01T08:00:00
 2000-01-01T09:00:00
 ⋮
 2000-01-01T16:00:00
 2000-01-01T17:00:00
 2000-01-01T18:00:00
 2000-01-01T19:00:00
 2000-01-01T20:00:00
 2000-01-01T21:00:00
 2000-01-01T22:00:00
 2000-01-01T23:00:00
 2000-01-02T00:00:00

very neatly hourly output in the NetCDF file!

Output grid

Say we want to run the model at a given horizontal resolution but want to output on another resolution, the NetCDFOutput takes as argument output_Grid<:AbstractFullGrid and nlat_half::Int. So for example output_Grid=FullClenshawGrid and nlat_half=48 will always interpolate onto a regular 192x95 longitude-latitude grid of 1.875˚ resolution, regardless the grid and resolution used for the model integration.

my_output_writer = NetCDFOutput(spectral_grid, output_Grid=FullClenshawGrid, nlat_half=48)
NetCDFOutput{FullClenshawGrid{Float32}}
├ status: inactive/uninitialized
├ write restart file: true (if active)
├ interpolator: AnvilInterpolator{Float32, OctahedralGaussianGrid}
├ path: output.nc
├ frequency: 21600 seconds
└┐ variables:
 ├ v: meridional wind [m/s]
 ├ u: zonal wind [m/s]
 └ vor: relative vorticity [s^-1]

Note that by default the output is on the corresponding full type of the grid type used in the dynamical core so that interpolation only happens at most in the zonal direction as they share the location of the latitude rings. You can check this by

RingGrids.full_grid_type(OctahedralGaussianGrid)
FullGaussianGrid (alias for FullGaussianArray{T, 1, Array{T, 1}} where T)

So the corresponding full grid of an OctahedralGaussianGrid is the FullGaussianGrid and the same resolution nlat_half is chosen by default in the output writer (which you can change though as shown above). Overview of the corresponding full grids

GridCorresponding full grid
FullGaussianGridFullGaussianGrid
FullClenshawGridFullClenshawGrid
OctahadralGaussianGridFullGaussianGrid
OctahedralClensawhGridFullClenshawGrid
HEALPixGridFullHEALPixGrid
OctaHEALPixGridFullOctaHEALPixGrid

The grids FullHEALPixGrid, FullOctaHEALPixGrid share the same latitude rings as their reduced grids, but have always as many longitude points as they are at most around the equator. These grids are not tested in the dynamical core (but you may use them experimentally) and mostly designed for output purposes.

Output variables

One can easily add or remove variables from being output with the NetCDFOut writer. The following variables are predefined (note they are not exported so you have to prefix SpeedyWeather.)

subtypes(SpeedyWeather.AbstractOutputVariable)
20-element Vector{Any}:
 SpeedyWeather.AbstractRainRateOutputVariable
 SpeedyWeather.CloudTopOutput
 SpeedyWeather.ConvectivePrecipitationOutput
 SpeedyWeather.DivergenceOutput
 SpeedyWeather.HumidityOutput
 SpeedyWeather.InterfaceDisplacementOutput
 SpeedyWeather.LargeScalePrecipitationOutput
 SpeedyWeather.MeridionalVelocityOutput
 SpeedyWeather.NoOutputVariable
 SpeedyWeather.OrographyOutput
 SpeedyWeather.OutgoingLongwaveRadiationOutput
 SpeedyWeather.OutgoingShortwaveRadiationOutput
 SpeedyWeather.RandomPatternOutput
 SpeedyWeather.SeaSurfaceTemperatureOutput
 SpeedyWeather.SurfaceFluxHeatOutput
 SpeedyWeather.SurfaceFluxHumidOutput
 SpeedyWeather.SurfacePressureOutput
 SpeedyWeather.TemperatureOutput
 SpeedyWeather.VorticityOutput
 SpeedyWeather.ZonalVelocityOutput

"Defined" here means that every such type contains information about a variables (long) name, its units, dimensions, any missing values and compression options. For HumidityOutput for example we have

SpeedyWeather.HumidityOutput()
SpeedyWeather.HumidityOutput <: SpeedyWeather.AbstractOutputVariable
├ name::String = humid
├ unit::String = kg/kg
├ long_name::String = specific humidity
├ dims_xyzt::NTuple{4, Bool} = (true, true, true, true)
├ missing_value::Float64 = NaN
├ compression_level::Int64 = 3
├ shuffle::Bool = true
├ keepbits::Int64 = 7

You can choose name and unit as you like, e.g. SpeedyWeather.HumidityOutput(unit = "1") or change the compression options, e.g. SpeedyWeather.HumidityOutput(keepbits = 5) but more customisation is discussed in Customizing netCDF output.

We can add new output variables with add!

output = NetCDFOutput(spectral_grid)            # default variables
add!(output, SpeedyWeather.DivergenceOutput())  # output also divergence
output
NetCDFOutput{FullGaussianGrid{Float32}}
├ status: inactive/uninitialized
├ write restart file: true (if active)
├ interpolator: AnvilInterpolator{Float32, OctahedralGaussianGrid}
├ path: output.nc
├ frequency: 21600 seconds
└┐ variables:
 ├ v: meridional wind [m/s]
 ├ div: divergence [s^-1]
 ├ u: zonal wind [m/s]
 └ vor: relative vorticity [s^-1]

If you didn't create a NetCDFOutput separately, you can also apply this directly to model, either add!(model, SpeedyWeather.DivergenceOutput()) or add!(model.output, args...), which technically also just forwards to add!(model.output.variables, args...). output.variables is a dictionary were the variable names (as Symbols) are used as keys, so output.variables[:div] just returns the SpeedyWeather.DivergenceOutput() we have just created using :div as key. With those keys one can also delete! a variable from netCDF output

delete!(output, :div)
NetCDFOutput{FullGaussianGrid{Float32}}
├ status: inactive/uninitialized
├ write restart file: true (if active)
├ interpolator: AnvilInterpolator{Float32, OctahedralGaussianGrid}
├ path: output.nc
├ frequency: 21600 seconds
└┐ variables:
 ├ v: meridional wind [m/s]
 ├ u: zonal wind [m/s]
 └ vor: relative vorticity [s^-1]

If you change the name of an output variable, i.e. SpeedyWeather.DivergenceOutput(name="divergence") the key would change accordingly to :divergence.

Output path and identification

That's easy by passing on path="/my/favourite/path/" and the folder run_* with * the identification of the run (that's the id keyword, which can be manually set but is also automatically determined as a number counting up depending on which folders already exist) will be created within.

julia> path = pwd()
"/Users/milan"
julia> my_output_writer = NetCDFOutput(spectral_grid, path=path)

This folder must already exist. If you want to give your run a name/identification you can pass on id

julia> my_output_writer = NetCDFOutput(spectral_grid, id="diffusion_test");

which will be used instead of a 4 digit number like 0001, 0002 which is automatically determined if id is not provided. You will see the id of the run in the progress bar

Weather is speedy: run diffusion_test 100%|███████████████████████| Time: 0:00:12 (19.20 years/day)

and the run folder, here run_diffusion_test, is also named accordingly

shell> ls
...
run_diffusion_test
...

Further options

Further options are described in the NetCDFOutput docstring, (also accessible via julia>?NetCDFOutput for example). Note that some fields are actual options, but others are derived from the options you provided or are arrays/objects the output writer needs, but shouldn't be passed on by the user. The actual options are declared as [OPTION] in the following

@doc NetCDFOutput

Output writer for a netCDF file with (re-)gridded variables. Interpolates non-rectangular grids. Fields are

  • active::Bool

  • path::String: [OPTION] path to output folder, run_???? will be created within

  • id::String: [OPTION] run identification number/string

  • run_path::String

  • filename::String: [OPTION] name of the output netcdf file

  • write_restart::Bool: [OPTION] also write restart file if output==true?

  • pkg_version::VersionNumber

  • startdate::DateTime

  • output_dt::Second: [OPTION] output frequency, time step

  • variables::Dict{Symbol, SpeedyWeather.AbstractOutputVariable}: [OPTION] dictionary of variables to output, e.g. u, v, vor, div, pres, temp, humid

  • output_every_n_steps::Int64

  • timestep_counter::Int64

  • output_counter::Int64

  • netcdf_file::Union{Nothing, NCDatasets.NCDataset}

  • interpolator::Any

  • grid2D::Any

  • grid3D::Any

NetCDFOutput(
    S::SpectralGrid;
    ...
) -> NetCDFOutput{_A, _B, Interpolator} where {_A, _B, Interpolator<:(AnvilInterpolator{Float32})}
NetCDFOutput(
    S::SpectralGrid,
    Model::Type{<:AbstractModel};
    output_Grid,
    nlat_half,
    output_NF,
    output_dt,
    kwargs...
) -> NetCDFOutput{_A, _B, Interpolator} where {_A, _B, Interpolator<:(AnvilInterpolator{Float32})}

Constructor for NetCDFOutput based on S::SpectralGrid and optionally the Model type (e.g. ShallowWater, PrimitiveWet) as second positional argument. The output grid is optionally determined by keyword arguments output_Grid (its type, full grid required), nlat_half (resolution) and output_NF (number format). By default, uses the full grid equivalent of the grid and resolution used in SpectralGrid S.