
Benchmarks

Performance benchmarks for SpeedyWeather.jl across architectures. This page is auto-generated at doc-build time from SpeedyWeather/benchmark/assets/benchmark_results.json, which itself is updated by SpeedyWeather/benchmark/manual_benchmarking.jl. Running the benchmark script on a new architecture adds a column to the overview table, a series to each comparison figure, and a per-architecture section to this page.

All simulations are benchmarked over several seconds (wallclock time) without output. Benchmarking excludes initialization. Timings can vary by ±50% between runs, so treat the numbers as rough rather than precise.
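
As a rough illustration of the metric reported throughout this page, the sketch below converts a timed run into simulated years per wallclock day (SYPD). The function name `sypd`, the 365.25-day year, and the interpretation of Δt as seconds of model time are assumptions made for this example; they are not taken from the benchmark script.

```julia
# Assumed conventions: a simulated year is 365.25 days, Δt is in seconds.
const SECONDS_PER_YEAR = 365.25 * 24 * 3600
const SECONDS_PER_DAY = 24 * 3600

"""
    sypd(n_steps, Δt, wallclock)

Simulated years per wallclock day for `n_steps` time steps of length `Δt`
(seconds of model time) that took `wallclock` seconds to compute.
"""
function sypd(n_steps::Integer, Δt::Real, wallclock::Real)
    simulated_years = n_steps * Δt / SECONDS_PER_YEAR
    wallclock_days = wallclock / SECONDS_PER_DAY
    return simulated_years / wallclock_days
end

# Hypothetical measurement: 10_000 steps of Δt = 2400 s in 5 s of wallclock time
println(sypd(10_000, 2400, 5.0))
```

Because only the time-stepping loop is timed, a short run of a few seconds is enough to extrapolate to SYPD, but the ±50% run-to-run variation carries over directly to the result.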

Overview: PrimitiveWet resolution across architectures

Simulated years per wallclock day (SYPD) for the PrimitiveWetModel resolution sweep, one column per architecture. T is the spectral truncation and L the number of vertical layers. Each (T, L) configuration is reported for two spectral transforms: the standard Legendre transform combined with a fast Fourier transform (LT+FFT), and the single matrix transform (MT). The per-architecture tables further down list these as "default" and "matrix". Empty cells (—) mean the architecture has either not been benchmarked yet or did not run that specific configuration.

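| T | L | Transform | cpu-arm | cpu-x86 | gpu-nvidia |
|---|---|---|---|---|---|
| 31 | 8 | LT+FFT | 1484 | 959 | 1137 |
| 31 | 8 | MT | 484 | 71 | 1480 |
| 42 | 8 | LT+FFT | 662 | 413 | 656 |
| 42 | 8 | MT | 173 | 19 | 278 |
| 63 | 8 | LT+FFT | 188 | 115 | 380 |
| 63 | 8 | MT | 34 | 2.8 | 148 |
| 85 | 8 | LT+FFT | 65 | 54 | 213 |
| 85 | 8 | MT | 8.7 | 0.7 | 76 |
| 85 | 16 | LT+FFT | 42 | 28 | 242 |
| 85 | 16 | MT | 7.1 | 0.4 | 77 |
| 85 | 24 | LT+FFT | 23 | 17 | 231 |
| 85 | 24 | MT | 5.2 | 0.3 | 77 |
| 127 | 8 | LT+FFT | 18 | 13 | 64 |
| 127 | 8 | MT | 0.6 | 0.1 | 18 |
| 127 | 16 | LT+FFT | 10.0 | 7.4 | 71 |
| 127 | 16 | MT | 1.1 | 0.1 | 18 |
| 127 | 24 | LT+FFT | 6.2 | 4.5 | 67 |
| 127 | 24 | MT | 0.9 | 0.0 | 17 |
| 170 | 8 | LT+FFT | 7.0 | 5.1 | 33 |
| 170 | 16 | LT+FFT | 3.7 | 2.8 | 29 |
| 170 | 24 | LT+FFT | 2.2 | 1.7 | 27 |
| 255 | 8 | LT+FFT | 1.8 | 1.3 | 14 |
| 255 | 16 | LT+FFT | 0.9 | 0.7 | 11 |
| 255 | 24 | LT+FFT | 0.6 | 0.4 | 8.7 |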
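
To read the overview as relative performance rather than absolute SYPD, one option is to take ratios between columns. The sketch below copies three LT+FFT rows from the table above and computes the GPU-to-x86 ratio. The names `sypd_table` and `speedup` are hypothetical, and given the ±50% timing variation the ratios are only indicative.

```julia
# A few rows copied from the overview table: (T, L, transform) => SYPD per architecture
sypd_table = Dict(
    (31, 8, "LT+FFT")  => (cpu_arm = 1484.0, cpu_x86 = 959.0, gpu_nvidia = 1137.0),
    (85, 8, "LT+FFT")  => (cpu_arm = 65.0, cpu_x86 = 54.0, gpu_nvidia = 213.0),
    (255, 8, "LT+FFT") => (cpu_arm = 1.8, cpu_x86 = 1.3, gpu_nvidia = 14.0),
)

# Ratio of GPU throughput to x86 CPU throughput for one configuration
speedup(row) = row.gpu_nvidia / row.cpu_x86

# Print the ratio for each configuration, ordered by resolution
for key in sort(collect(keys(sypd_table)))
    T, L, transform = key
    println("T$T L$L $transform: GPU/x86 = ", round(speedup(sypd_table[key]); digits = 1))
end
```

For these rows the ratio grows with resolution, from roughly 1.2 at T31 to roughly 10.8 at T255. Note that both CPU runs report `Threads: 1 default` in their machine details.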

Architecture: cpu-arm

Created for SpeedyWeather.jl v0.20.1+DEV on Thu, 28 May 2026 11:48:39.

Machine details — cpu-arm

```julia
julia> versioninfo()
Julia Version 1.11.7
Commit f2b3dbda30a (2025-09-08 12:10 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin24.0.0)
  CPU: 8 × Apple M3
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, apple-m2)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
```

Models, default setups — cpu-arm

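| Model | T | L | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| BarotropicModel | 31 | 1 | false | 2400 | 47411 | 688.34 KB |
| ShallowWaterModel | 31 | 1 | false | 2400 | 26114 | 822.86 KB |
| PrimitiveDryModel | 31 | 8 | true | 2400 | 2677 | 4.16 MB |
| PrimitiveWetModel | 31 | 8 | true | 2400 | 1222 | 4.84 MB |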

Shallow water model, resolution — cpu-arm

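| Model | T | L | Rings | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| ShallowWaterModel | 31 | 1 | 48 | 2400 | 27856 | 822.86 KB |
| ShallowWaterModel | 42 | 1 | 64 | 1800 | 13085 | 1.44 MB |
| ShallowWaterModel | 63 | 1 | 96 | 1200 | 3655 | 3.25 MB |
| ShallowWaterModel | 85 | 1 | 128 | 900 | 1441 | 5.94 MB |
| ShallowWaterModel | 127 | 1 | 192 | 600 | 316 | 14.14 MB |
| ShallowWaterModel | 170 | 1 | 256 | 450 | 132 | 26.86 MB |
| ShallowWaterModel | 255 | 1 | 384 | 300 | 33 | 68.30 MB |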

Primitive wet model, resolution — cpu-arm

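| Model | T | L | Rings | Transform | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 8 | 48 | default | 2400 | 1484 | 4.84 MB |
| PrimitiveWetModel | 42 | 8 | 64 | default | 1800 | 662 | 8.18 MB |
| PrimitiveWetModel | 63 | 8 | 96 | default | 1200 | 188 | 17.42 MB |
| PrimitiveWetModel | 85 | 8 | 128 | default | 900 | 65 | 30.35 MB |
| PrimitiveWetModel | 127 | 8 | 192 | default | 600 | 18 | 67.03 MB |
| PrimitiveWetModel | 170 | 8 | 256 | default | 450 | 7.0 | 119.25 MB |
| PrimitiveWetModel | 255 | 8 | 384 | default | 300 | 1.8 | 272.27 MB |
| PrimitiveWetModel | 85 | 16 | 128 | default | 900 | 42 | 51.65 MB |
| PrimitiveWetModel | 127 | 16 | 192 | default | 600 | 10.0 | 113.22 MB |
| PrimitiveWetModel | 170 | 16 | 256 | default | 450 | 3.7 | 200.00 MB |
| PrimitiveWetModel | 255 | 16 | 384 | default | 300 | 0.9 | 450.63 MB |
| PrimitiveWetModel | 85 | 24 | 128 | default | 900 | 23 | 73.00 MB |
| PrimitiveWetModel | 127 | 24 | 192 | default | 600 | 6.2 | 159.48 MB |
| PrimitiveWetModel | 170 | 24 | 256 | default | 450 | 2.2 | 280.85 MB |
| PrimitiveWetModel | 255 | 24 | 384 | default | 300 | 0.6 | 629.13 MB |
| PrimitiveWetModel | 31 | 8 | 48 | matrix | 2400 | 484 | 47.07 MB |
| PrimitiveWetModel | 42 | 8 | 64 | matrix | 1800 | 173 | 132.12 MB |
| PrimitiveWetModel | 63 | 8 | 96 | matrix | 1200 | 34 | 579.07 MB |
| PrimitiveWetModel | 85 | 8 | 128 | matrix | 900 | 8.7 | 1.74 GB |
| PrimitiveWetModel | 127 | 8 | 192 | matrix | 600 | 0.6 | 8.17 GB |
| PrimitiveWetModel | 85 | 16 | 128 | matrix | 900 | 7.1 | 1.76 GB |
| PrimitiveWetModel | 127 | 16 | 192 | matrix | 600 | 1.1 | 8.22 GB |
| PrimitiveWetModel | 85 | 24 | 128 | matrix | 900 | 5.2 | 1.78 GB |
| PrimitiveWetModel | 127 | 24 | 192 | matrix | 600 | 0.9 | 8.26 GB |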

Primitive Equation, Float32 vs Float64 — cpu-arm

| Model | NF | T | L | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| PrimitiveWetModel | Float32 | 31 | 8 | 2400 | 1145 | 4.84 MB |
| PrimitiveWetModel | Float64 | 31 | 8 | 2400 | 1337 | 9.00 MB |

Grids — cpu-arm

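| Model | T | L | Grid | Rings | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 63 | 8 | FullGaussianGrid | 96 | 1200 | 76 | 25.36 MB |
| PrimitiveWetModel | 63 | 8 | FullClenshawGrid | 95 | 1200 | 96 | 25.14 MB |
| PrimitiveWetModel | 63 | 8 | OctahedralGaussianGrid | 96 | 1200 | 182 | 17.42 MB |
| PrimitiveWetModel | 63 | 8 | OctahedralClenshawGrid | 95 | 1200 | 122 | 17.18 MB |
| PrimitiveWetModel | 63 | 8 | HEALPixGrid | 95 | 1200 | 257 | 12.73 MB |
| PrimitiveWetModel | 63 | 8 | OctaHEALPixGrid | 95 | 1200 | 190 | 15.50 MB |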

Number of vertical layers — cpu-arm

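| Model | T | L | Δt | SYPD | Memory |
|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 4 | 2400 | 2184 | 3.11 MB |
| PrimitiveWetModel | 31 | 8 | 2400 | 1263 | 4.84 MB |
| PrimitiveWetModel | 31 | 12 | 2400 | 981 | 6.58 MB |
| PrimitiveWetModel | 31 | 16 | 2400 | 619 | 8.33 MB |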

PrimitiveDryModel: Physics or dynamics only — cpu-arm

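| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveDryModel | 31 | 8 | true | true | 2400 | 2697 | 4.16 MB |
| PrimitiveDryModel | 31 | 8 | true | false | 2400 | 4494 | 4.16 MB |
| PrimitiveDryModel | 31 | 8 | false | true | 2400 | 3389 | 4.16 MB |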

PrimitiveWetModel: Physics or dynamics only — cpu-arm

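| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 8 | true | true | 2400 | 1257 | 4.84 MB |
| PrimitiveWetModel | 31 | 8 | true | false | 2400 | 3354 | 4.84 MB |
| PrimitiveWetModel | 31 | 8 | false | true | 2400 | 1921 | 4.84 MB |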

Individual dynamics functions — cpu-arm

PrimitiveWetModel | Float32 | T31 L8 | OctahedralGaussianGrid | 48 Rings — cpu-arm

| Function | Time | Memory | Allocations |
|---|---|---|---|
| pressure_gradient_flux! | 40.041 μs | 100.28 KiB | 790 |
| linear_virtual_temperature! | 1.754 μs | 0 bytes | 0 |
| geopotential! | 6.308 μs | 5.61 KiB | 23 |
| vertical_integration! | 13.833 μs | 0 bytes | 0 |
| surface_pressure_tendency! | 14.791 μs | 24.66 KiB | 288 |
| vertical_velocity! | 48.125 μs | 384.31 KiB | 12 |
| linear_pressure_gradient! | 1.750 μs | 0 bytes | 0 |
| vertical_advection! | 98.083 μs | 8.62 KiB | 100 |
| vordiv_tendencies! | 287.333 μs | 259.66 KiB | 724 |
| temperature_tendency! | 313.875 μs | 381.95 KiB | 1027 |
| humidity_tendency! | 287.958 μs | 380.59 KiB | 1017 |
| bernoulli_potential! | 99.125 μs | 510.22 KiB | 345 |

Architecture: cpu-x86

Created for SpeedyWeather.jl v0.20.3 on Sat, 06 Jun 2026 21:45:16.

Machine details — cpu-x86

```julia
julia> versioninfo()
Julia Version 1.12.2
Commit ca9b6662be4 (2025-11-20 16:25 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 128 × AMD EPYC 9554 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, znver4)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 128 virtual cores)
Environment:
  LD_LIBRARY_PATH = /usr/local/lib:/usr/local/lib:
```

Models, default setups — cpu-x86

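| Model | T | L | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| BarotropicModel | 31 | 1 | false | 2400 | 28443 | 688.47 KB |
| ShallowWaterModel | 31 | 1 | false | 2400 | 16131 | 822.99 KB |
| PrimitiveDryModel | 31 | 8 | true | 2400 | 1046 | 4.16 MB |
| PrimitiveWetModel | 31 | 8 | true | 2400 | 888 | 4.84 MB |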

Shallow water model, resolution — cpu-x86

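| Model | T | L | Rings | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| ShallowWaterModel | 31 | 1 | 48 | 2400 | 15640 | 822.99 KB |
| ShallowWaterModel | 42 | 1 | 64 | 1800 | 7227 | 1.44 MB |
| ShallowWaterModel | 63 | 1 | 96 | 1200 | 2208 | 3.25 MB |
| ShallowWaterModel | 85 | 1 | 128 | 900 | 912 | 5.94 MB |
| ShallowWaterModel | 127 | 1 | 192 | 600 | 186 | 14.14 MB |
| ShallowWaterModel | 170 | 1 | 256 | 450 | 91 | 26.86 MB |
| ShallowWaterModel | 255 | 1 | 384 | 300 | 20 | 68.30 MB |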

Primitive wet model, resolution — cpu-x86

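| Model | T | L | Rings | Transform | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 8 | 48 | default | 2400 | 959 | 4.84 MB |
| PrimitiveWetModel | 42 | 8 | 64 | default | 1800 | 413 | 8.18 MB |
| PrimitiveWetModel | 63 | 8 | 96 | default | 1200 | 115 | 17.42 MB |
| PrimitiveWetModel | 85 | 8 | 128 | default | 900 | 54 | 30.35 MB |
| PrimitiveWetModel | 127 | 8 | 192 | default | 600 | 13 | 67.03 MB |
| PrimitiveWetModel | 170 | 8 | 256 | default | 450 | 5.1 | 119.25 MB |
| PrimitiveWetModel | 255 | 8 | 384 | default | 300 | 1.3 | 272.27 MB |
| PrimitiveWetModel | 85 | 16 | 128 | default | 900 | 28 | 51.65 MB |
| PrimitiveWetModel | 127 | 16 | 192 | default | 600 | 7.4 | 113.22 MB |
| PrimitiveWetModel | 170 | 16 | 256 | default | 450 | 2.8 | 200.00 MB |
| PrimitiveWetModel | 255 | 16 | 384 | default | 300 | 0.7 | 450.63 MB |
| PrimitiveWetModel | 85 | 24 | 128 | default | 900 | 17 | 73.00 MB |
| PrimitiveWetModel | 127 | 24 | 192 | default | 600 | 4.5 | 159.48 MB |
| PrimitiveWetModel | 170 | 24 | 256 | default | 450 | 1.7 | 280.85 MB |
| PrimitiveWetModel | 255 | 24 | 384 | default | 300 | 0.4 | 629.13 MB |
| PrimitiveWetModel | 31 | 8 | 48 | matrix | 2400 | 71 | 47.07 MB |
| PrimitiveWetModel | 42 | 8 | 64 | matrix | 1800 | 19 | 132.12 MB |
| PrimitiveWetModel | 63 | 8 | 96 | matrix | 1200 | 2.8 | 579.07 MB |
| PrimitiveWetModel | 85 | 8 | 128 | matrix | 900 | 0.7 | 1.74 GB |
| PrimitiveWetModel | 127 | 8 | 192 | matrix | 600 | 0.1 | 8.17 GB |
| PrimitiveWetModel | 85 | 16 | 128 | matrix | 900 | 0.4 | 1.76 GB |
| PrimitiveWetModel | 127 | 16 | 192 | matrix | 600 | 0.1 | 8.22 GB |
| PrimitiveWetModel | 85 | 24 | 128 | matrix | 900 | 0.3 | 1.78 GB |
| PrimitiveWetModel | 127 | 24 | 192 | matrix | 600 | 0.0 | 8.26 GB |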

Primitive Equation, Float32 vs Float64 — cpu-x86

| Model | NF | T | L | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| PrimitiveWetModel | Float32 | 31 | 8 | 2400 | 717 | 4.84 MB |
| PrimitiveWetModel | Float64 | 31 | 8 | 2400 | 787 | 9.00 MB |

Grids — cpu-x86

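| Model | T | L | Grid | Rings | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 63 | 8 | FullGaussianGrid | 96 | 1200 | 82 | 25.36 MB |
| PrimitiveWetModel | 63 | 8 | FullClenshawGrid | 95 | 1200 | 83 | 25.14 MB |
| PrimitiveWetModel | 63 | 8 | OctahedralGaussianGrid | 96 | 1200 | 115 | 17.42 MB |
| PrimitiveWetModel | 63 | 8 | OctahedralClenshawGrid | 95 | 1200 | 74 | 17.18 MB |
| PrimitiveWetModel | 63 | 8 | HEALPixGrid | 95 | 1200 | 188 | 12.73 MB |
| PrimitiveWetModel | 63 | 8 | OctaHEALPixGrid | 95 | 1200 | 75 | 15.50 MB |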

Number of vertical layers — cpu-x86

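| Model | T | L | Δt | SYPD | Memory |
|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 4 | 2400 | 1450 | 3.11 MB |
| PrimitiveWetModel | 31 | 8 | 2400 | 948 | 4.84 MB |
| PrimitiveWetModel | 31 | 12 | 2400 | 639 | 6.58 MB |
| PrimitiveWetModel | 31 | 16 | 2400 | 546 | 8.33 MB |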

PrimitiveDryModel: Physics or dynamics only — cpu-x86

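| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveDryModel | 31 | 8 | true | true | 2400 | 1477 | 4.16 MB |
| PrimitiveDryModel | 31 | 8 | true | false | 2400 | 2247 | 4.16 MB |
| PrimitiveDryModel | 31 | 8 | false | true | 2400 | 1899 | 4.16 MB |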

PrimitiveWetModel: Physics or dynamics only — cpu-x86

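| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 8 | true | true | 2400 | 960 | 4.84 MB |
| PrimitiveWetModel | 31 | 8 | true | false | 2400 | 1682 | 4.84 MB |
| PrimitiveWetModel | 31 | 8 | false | true | 2400 | 1211 | 4.84 MB |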

Individual dynamics functions — cpu-x86

PrimitiveWetModel | Float32 | T31 L8 | OctahedralGaussianGrid | 48 Rings — cpu-x86

| Function | Time | Memory | Allocations |
|---|---|---|---|
| pressure_gradient_flux! | 67.850 μs | 100.30 KiB | 789 |
| linear_virtual_temperature! | 3.405 μs | 0 bytes | 0 |
| geopotential! | 11.580 μs | 5.51 KiB | 22 |
| vertical_integration! | 14.010 μs | 0 bytes | 0 |
| surface_pressure_tendency! | 24.450 μs | 24.66 KiB | 288 |
| vertical_velocity! | 86.940 μs | 346.84 KiB | 12 |
| linear_pressure_gradient! | 3.048 μs | 0 bytes | 0 |
| vertical_advection! | 228.479 μs | 8.69 KiB | 96 |
| vordiv_tendencies! | 395.089 μs | 246.36 KiB | 721 |
| temperature_tendency! | 512.349 μs | 361.98 KiB | 1024 |
| humidity_tendency! | 483.629 μs | 360.62 KiB | 1014 |
| bernoulli_potential! | 172.839 μs | 416.55 KiB | 340 |

Architecture: gpu-nvidia

Created for SpeedyWeather.jl v0.20.3 on Sat, 06 Jun 2026 22:14:51.

Machine details — gpu-nvidia

```julia
julia> versioninfo()
Julia Version 1.12.2
Commit ca9b6662be4 (2025-11-20 16:25 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 128 × AMD EPYC 9554 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, znver4)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 128 virtual cores)
Environment:
  LD_LIBRARY_PATH = /usr/local/lib:/usr/local/lib:
```

```julia
julia> CUDA.versioninfo()
CUDA toolchain:
- runtime 13.2, artifact installation
- driver 580.126.9 for 13.3
- compiler 13.3, artifact installation

CUDA libraries:
- cuBLAS: 13.4.0
- cuSPARSE: 12.7.10
- cuSOLVER: 12.2.0
- cuFFT: 12.2.0
- cuRAND: 10.4.2
- CUPTI: 2026.1.1 (API 13.2.1)
- NVML: 13.0.0+580.126.9

Julia packages:
- CUDACore: 6.1.1
- GPUArrays: 11.5.5
- GPUCompiler: 1.17.1
- KernelAbstractions: 0.9.41
- CUDA_Driver_jll: 13.3.0+0
- CUDA_Compiler_jll: 0.4.4+0
- CUDA_Runtime_jll: 0.21.0+1

Toolchain:
- Julia: 1.12.2
- LLVM: 18.1.7

1 device:
  0: NVIDIA H100 80GB HBM3 (sm_90, 47.753 GiB / 79.647 GiB available)
```

Models, default setups — gpu-nvidia

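| Model | T | L | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| BarotropicModel | 31 | 1 | false | 2400 | 1701 | 432.95 KB |
| ShallowWaterModel | 31 | 1 | false | 2400 | 850 | 436.08 KB |
| PrimitiveDryModel | 31 | 8 | true | 2400 | 1505 | 579.99 KB |
| PrimitiveWetModel | 31 | 8 | true | 2400 | 1235 | 585.21 KB |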

Shallow water model, resolution — gpu-nvidia

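| Model | T | L | Rings | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| ShallowWaterModel | 31 | 1 | 48 | 2400 | 827 | 436.08 KB |
| ShallowWaterModel | 42 | 1 | 64 | 1800 | 478 | 741.88 KB |
| ShallowWaterModel | 63 | 1 | 96 | 1200 | 198 | 1.61 MB |
| ShallowWaterModel | 85 | 1 | 128 | 900 | 131 | 2.80 MB |
| ShallowWaterModel | 127 | 1 | 192 | 600 | 59 | 6.21 MB |
| ShallowWaterModel | 170 | 1 | 256 | 450 | 33 | 10.96 MB |
| ShallowWaterModel | 255 | 1 | 384 | 300 | 15 | 24.49 MB |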

Primitive wet model, resolution — gpu-nvidia

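| Model | T | L | Rings | Transform | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 8 | 48 | default | 2400 | 1137 | 585.21 KB |
| PrimitiveWetModel | 42 | 8 | 64 | default | 1800 | 656 | 991.36 KB |
| PrimitiveWetModel | 63 | 8 | 96 | default | 1200 | 380 | 2.14 MB |
| PrimitiveWetModel | 85 | 8 | 128 | default | 900 | 213 | 3.74 MB |
| PrimitiveWetModel | 127 | 8 | 192 | default | 600 | 64 | 8.30 MB |
| PrimitiveWetModel | 170 | 8 | 256 | default | 450 | 33 | 14.65 MB |
| PrimitiveWetModel | 255 | 8 | 384 | default | 300 | 14 | 32.77 MB |
| PrimitiveWetModel | 85 | 16 | 128 | default | 900 | 242 | 4.79 MB |
| PrimitiveWetModel | 127 | 16 | 192 | default | 600 | 71 | 10.65 MB |
| PrimitiveWetModel | 170 | 16 | 256 | default | 450 | 29 | 18.85 MB |
| PrimitiveWetModel | 255 | 16 | 384 | default | 300 | 11 | 42.21 MB |
| PrimitiveWetModel | 85 | 24 | 128 | default | 900 | 231 | 5.84 MB |
| PrimitiveWetModel | 127 | 24 | 192 | default | 600 | 67 | 13.01 MB |
| PrimitiveWetModel | 170 | 24 | 256 | default | 450 | 27 | 23.04 MB |
| PrimitiveWetModel | 255 | 24 | 384 | default | 300 | 8.7 | 51.64 MB |
| PrimitiveWetModel | 31 | 8 | 48 | matrix | 2400 | 1480 | 561.28 KB |
| PrimitiveWetModel | 42 | 8 | 64 | matrix | 1800 | 278 | 959.74 KB |
| PrimitiveWetModel | 63 | 8 | 96 | matrix | 1200 | 148 | 2.09 MB |
| PrimitiveWetModel | 85 | 8 | 128 | matrix | 900 | 76 | 3.68 MB |
| PrimitiveWetModel | 127 | 8 | 192 | matrix | 600 | 18 | 8.20 MB |
| PrimitiveWetModel | 85 | 16 | 128 | matrix | 900 | 77 | 4.73 MB |
| PrimitiveWetModel | 127 | 16 | 192 | matrix | 600 | 18 | 10.56 MB |
| PrimitiveWetModel | 85 | 24 | 128 | matrix | 900 | 77 | 5.78 MB |
| PrimitiveWetModel | 127 | 24 | 192 | matrix | 600 | 17 | 12.92 MB |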

Primitive Equation, Float32 vs Float64 — gpu-nvidia

| Model | NF | T | L | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|
| PrimitiveWetModel | Float32 | 31 | 8 | 2400 | 1097 | 585.21 KB |
| PrimitiveWetModel | Float64 | 31 | 8 | 2400 | 599 | 585.85 KB |

Grids — gpu-nvidia

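| Model | T | L | Grid | Rings | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 63 | 8 | FullGaussianGrid | 96 | 1200 | 261 | 2.23 MB |
| PrimitiveWetModel | 63 | 8 | FullClenshawGrid | 95 | 1200 | 386 | 2.21 MB |
| PrimitiveWetModel | 63 | 8 | OctahedralGaussianGrid | 96 | 1200 | 378 | 2.14 MB |
| PrimitiveWetModel | 63 | 8 | OctahedralClenshawGrid | 95 | 1200 | 379 | 2.12 MB |
| PrimitiveWetModel | 63 | 8 | HEALPixGrid | 95 | 1200 | 374 | 2.07 MB |
| PrimitiveWetModel | 63 | 8 | OctaHEALPixGrid | 95 | 1200 | 388 | 2.10 MB |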

Number of vertical layers — gpu-nvidia

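| Model | T | L | Δt | SYPD | Memory |
|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 4 | 2400 | 993 | 511.49 KB |
| PrimitiveWetModel | 31 | 8 | 2400 | 1116 | 585.21 KB |
| PrimitiveWetModel | 31 | 12 | 2400 | 1179 | 658.94 KB |
| PrimitiveWetModel | 31 | 16 | 2400 | 1111 | 732.67 KB |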

PrimitiveDryModel: Physics or dynamics only — gpu-nvidia

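| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveDryModel | 31 | 8 | true | true | 2400 | 1323 | 579.99 KB |
| PrimitiveDryModel | 31 | 8 | true | false | 2400 | 1353 | 579.99 KB |
| PrimitiveDryModel | 31 | 8 | false | true | 2400 | 3412 | 579.99 KB |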

PrimitiveWetModel: Physics or dynamics only — gpu-nvidia

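| Model | T | L | Dynamics | Physics | Δt | SYPD | Memory |
|---|---|---|---|---|---|---|---|
| PrimitiveWetModel | 31 | 8 | true | true | 2400 | 1125 | 585.21 KB |
| PrimitiveWetModel | 31 | 8 | true | false | 2400 | 1214 | 585.21 KB |
| PrimitiveWetModel | 31 | 8 | false | true | 2400 | 2735 | 585.21 KB |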

Individual dynamics functions — gpu-nvidia

PrimitiveWetModel | Float32 | T31 L8 | OctahedralGaussianGrid | 48 Rings — gpu-nvidia

| Function | Time | Memory | Allocations |
|---|---|---|---|
| pressure_gradient_flux! | 1.499 ms | 191.42 KiB | 6470 |
| linear_virtual_temperature! | 15.550 μs | 2.75 KiB | 72 |
| geopotential! | 36.020 μs | 6.72 KiB | 196 |
| vertical_integration! | 34.330 μs | 7.58 KiB | 170 |
| surface_pressure_tendency! | 617.227 μs | 95.48 KiB | 3097 |
| vertical_velocity! | 59.590 μs | 15.16 KiB | 403 |
| linear_pressure_gradient! | 14.400 μs | 2.28 KiB | 58 |
| vertical_advection! | 32.750 μs | 12.28 KiB | 190 |
| vordiv_tendencies! | 302.219 μs | 26.31 KiB | 450 |
| temperature_tendency! | 443.968 μs | 30.69 KiB | 724 |
| humidity_tendency! | 434.519 μs | 25.92 KiB | 578 |
| bernoulli_potential! | 176.190 μs | 20.30 KiB | 488 |