What is spesim? What isn't it?

Purpose

spesim is designed for teaching, methods testing, and exploratory research.

It helps you create a known spatial “truth” (individual locations, a sampling design, and derived site $\times$ species data) so you can:

demonstrate concepts (environmental filtering, distance–decay, $\beta$-diversity),
compare sampling schemes (random vs transect vs Voronoi, etc.),
stress-test metrics and workflows on controlled scenarios.

What the simulator generates

A typical run produces:

A domain: an sf polygon. The built-in domain is synthetic and uses arbitrary planar units.
Environmental gradients: gridded synthetic fields (temperature, elevation, rainfall) generated over the domain.
A community of individuals: sf point locations with species labels.
Quadrat samples: sf polygons representing the sampling design.
Derived data:
- site $\times$ species abundance matrix,
- per-quadrat mean environment,
- classic diagnostic summaries/plots (SAD, SAR, rarefaction, distance–decay).
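As a rough illustration of how a site $\times$ species abundance matrix follows from individual locations and a sampling design, here is a minimal base-R sketch. It does not use spesim or sf: the point data, the square quadrats, and every object name are made up for the example, and spesim's own implementation may differ.

```r
# Illustrative base-R sketch (not spesim's code): individuals are x/y points
# with species labels; quadrats are axis-aligned squares.
set.seed(1)
n <- 200
individuals <- data.frame(
  x = runif(n), y = runif(n),
  species = sample(c("A", "B", "C"), n, replace = TRUE, prob = c(0.6, 0.3, 0.1)),
  stringsAsFactors = FALSE
)

# Three hypothetical, non-overlapping 0.2 x 0.2 quadrats (lower-left corners).
quadrats <- data.frame(
  id = c("Q1", "Q2", "Q3"),
  xmin = c(0.1, 0.5, 0.7), ymin = c(0.1, 0.4, 0.7),
  size = 0.2,
  stringsAsFactors = FALSE
)

# Assign each individual to the quadrat that contains it (NA if none does).
individuals$quadrat <- NA_character_
for (i in seq_len(nrow(quadrats))) {
  q <- quadrats[i, ]
  inside <- individuals$x >= q$xmin & individuals$x < q$xmin + q$size &
    individuals$y >= q$ymin & individuals$y < q$ymin + q$size
  individuals$quadrat[inside] <- q$id
}

# Cross-tabulate: rows are sites (quadrats), columns are species.
site_by_species <- table(
  site = factor(individuals$quadrat, levels = quadrats$id),
  species = factor(individuals$species, levels = c("A", "B", "C"))
)
print(site_by_species)
```

Individuals that fall outside every quadrat are dropped by `table()`, which mirrors the fact that a sampling design only ever sees part of the simulated "truth".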

What the parameters mean

Species–abundance distribution (SAD)

spesim now supports multiple SAD generators via SAD_MODEL in the init file:

fisher (default): dominant species fraction + Fisher log-series tail.
geometric, brokenstick (teaching comparators).
zipf, zipf-mandelbrot (rank-based heavy tails).
lognormal, poisson-lognormal, poisson-gamma (sampling-model flavours).
zsm (neutral-theory SAD helper; currently a theta-only Ewens sampler).
custom: user supplies a numeric vector (probabilities or weights/counts) or a function.

This is primarily a teaching-oriented feature: it creates realistic-looking rank–abundance curves without requiring a fully mechanistic population model.
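To make the contrast between some of these shapes concrete, the sketch below computes textbook relative-abundance vectors for a geometric series, MacArthur's broken stick, and a Zipf tail in base R. These are the standard formulas, not necessarily how spesim parameterises its generators; `S`, `k`, `s`, and `N` are illustrative values chosen for the example.

```r
# Illustrative sketch only: textbook forms, not spesim's internal generators.
S <- 10  # number of species (hypothetical)

# Geometric series: each rank takes a fixed fraction k of what remains.
k <- 0.4
geometric <- k * (1 - k)^(seq_len(S) - 1)
geometric <- geometric / sum(geometric)  # renormalise the truncated series

# Broken stick: expected share of rank i is (1/S) * sum over j = i..S of 1/j.
brokenstick <- vapply(seq_len(S), function(i) sum(1 / (i:S)) / S, numeric(1))

# Zipf: relative abundance proportional to rank^(-s).
s <- 1.2
zipf <- seq_len(S)^(-s)
zipf <- zipf / sum(zipf)

# Turn one probability vector into integer abundances for N individuals.
set.seed(42)
N <- 500
abund <- as.vector(rmultinom(1, size = N, prob = geometric))

round(rbind(geometric, brokenstick, zipf), 3)
```

Plotting each vector against rank on a log scale is a quick way to see why the geometric series and the broken stick are useful as teaching comparators: the former drops off steeply, the latter is much more even.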

Model families

spesim now includes a high-level MODEL_FAMILY layer:

manual: use low-level knobs directly.
niche_filtering: point-process baseline + gradient filtering.
neutral_csr: neutral SAD baseline with CSR placement.
neutral_hubbell_like: sequential neutral recruitment with immigration/speciation and dispersal kernel.
hybrid: neutral recruitment with environmental sorting.

spesim also supports constrained linearised landscapes via DOMAIN_TYPE:

network: river-network style along-path coordinate.
coastline: alongshore coordinate with optional wrap (LINEAR_WRAP = TRUE).
optional external section/node covariates via ENV_COVARIATES_FILE.
bundled starter covariate files: system.file("extdata/doubs_like_nodes.csv", package = "spesim") and system.file("extdata/seaweed_coast_sections.csv", package = "spesim").
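Before pointing ENV_COVARIATES_FILE at one of the bundled starter files, it can help to look at what it contains. The sketch below uses only the two paths quoted above and base R; it assumes the files are ordinary comma-separated tables readable by `read.csv()`, and it makes no assumption about their column names.

```r
# Paths are taken from the text above. system.file() returns "" when the
# package or file is not available, so the guard keeps this safe to run.
nodes_path <- system.file("extdata/doubs_like_nodes.csv", package = "spesim")
coast_path <- system.file("extdata/seaweed_coast_sections.csv", package = "spesim")

for (path in c(nodes_path, coast_path)) {
  if (nzchar(path)) {
    covariates <- read.csv(path)  # assumption: plain comma-separated file
    cat(basename(path), ":", nrow(covariates), "rows\n")
    print(names(covariates))
  } else {
    message("Bundled file not found; is spesim installed?")
  }
}
```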

Environmental filtering

Species can be assigned to named gradients with an optimum and tolerance.

“Optimum” = where along the gradient a species does best.
“Tolerance” = how broad the response is.

Internally, this is implemented as a Gaussian response along a gradient normalised to the range 0–1.
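A minimal sketch of such a response curve is shown below. It assumes that tolerance plays the role of the Gaussian standard deviation and that normalisation is a simple min-max rescaling; both are assumptions made for illustration, and `normalise01()` and `gaussian_response()` are hypothetical helpers, not spesim functions.

```r
# Hypothetical helper: rescale a raw gradient to the 0-1 range.
normalise01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Gaussian response: 1 at the optimum, falling off with a width set by tolerance.
gaussian_response <- function(g, optimum, tolerance) {
  exp(-((g - optimum)^2) / (2 * tolerance^2))
}

temperature <- seq(5, 25, by = 5)  # made-up raw values
g <- normalise01(temperature)      # 0, 0.25, 0.5, 0.75, 1

# Same optimum, different tolerances.
specialist <- gaussian_response(g, optimum = 0.5, tolerance = 0.1)
generalist <- gaussian_response(g, optimum = 0.5, tolerance = 0.4)
round(rbind(g, specialist, generalist), 3)
```

Both species peak at the middle of the gradient, but the low-tolerance species is effectively absent from the ends while the high-tolerance species persists across most of the range.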

Spatial structure and interactions

spesim supports several ways to introduce spatial structure:

CSR baseline (Complete Spatial Randomness): homogeneous Poisson.
Clustering: tunable clustering behaviour (including fast engines).
Inhibition: “repulsion” style processes (Strauss/Geyer fast engines).
Neighbour effects: a directed coefficient matrix applied within a single global interaction radius (values < 1 suppress; > 1 facilitate; 1 neutral).

These are intended to create interpretable spatial patterns for teaching and experimentation.
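For reference, the CSR baseline can be written in a few lines of base R: draw a Poisson number of points, then place them independently and uniformly. The rectangular window, the intensity, and the grid used for the check are illustrative choices, not spesim defaults, and spesim's own engines are not shown here.

```r
set.seed(7)
intensity <- 100  # expected points per unit area (made up)
width <- 2
height <- 1       # rectangular window in arbitrary planar units

# Homogeneous Poisson process: Poisson count, then independent uniform locations.
n_points <- rpois(1, lambda = intensity * width * height)
csr <- data.frame(x = runif(n_points, 0, width),
                  y = runif(n_points, 0, height))

# Quick check: counts in equal-sized cells should have variance near the mean.
cells <- table(
  cut(csr$x, breaks = seq(0, width, by = 0.25), include.lowest = TRUE),
  cut(csr$y, breaks = seq(0, height, by = 0.25), include.lowest = TRUE)
)
counts <- as.vector(cells)
c(mean = mean(counts), variance = var(counts))
```

A variance-to-mean ratio well above 1 in the same check would suggest clustering, and a ratio well below 1 would suggest inhibition, which is one simple way to see the effect of the other options in the list.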

What spesim is not

spesim is not a full mechanistic ecological simulator. In particular, it still does not attempt to model:

demography (births/deaths),
temporal dynamics or succession,
detectability/observation error (unless you add it downstream),
parameter inference / likelihood-based point process modelling.

The "neutral_hubbell_like" and "hybrid" settings of MODEL_FAMILY do include an individual-level recruitment process with dispersal kernels, but the result is still a single-time synthetic snapshot rather than a calibrated dynamic ecosystem model.

If you need any of the capabilities listed above, consider pairing spesim with specialised tools (e.g. spatstat for point process inference/diagnostics, or domain-specific individual-based/metacommunity simulators).

Valid interpretations

Valid:

“If I impose stronger environmental filtering, does distance–decay increase?”
“How does a transect design compare to random quadrats for SAR shape?”
“Do my diversity metrics behave sensibly under known gradients?”
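One way to put a number on a question like the first one is to regress pairwise community dissimilarity on pairwise geographic distance and compare the slope across scenarios. The sketch below does this in base R with a hand-written Bray–Curtis function. The abundance matrix and coordinates are made-up stand-ins for simulator output, and the object names are hypothetical.

```r
# Made-up inputs standing in for simulator outputs: 4 sites, 3 species.
comm <- matrix(c(10, 5, 0,
                  8, 6, 1,
                  3, 7, 4,
                  0, 4, 9),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("Q", 1:4), c("A", "B", "C")))
xy <- cbind(x = c(0, 1, 2, 3), y = c(0, 0, 0, 0))  # site centroids

# Bray-Curtis dissimilarity between two abundance vectors.
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# All unordered pairs of sites, one pair per row.
site_pairs <- t(combn(nrow(comm), 2))
geo_dist <- sqrt(rowSums(
  (xy[site_pairs[, 1], , drop = FALSE] - xy[site_pairs[, 2], , drop = FALSE])^2
))
dissim <- apply(site_pairs, 1, function(p) bray_curtis(comm[p[1], ], comm[p[2], ]))

# Slope of dissimilarity on distance: a crude distance-decay summary.
fit <- lm(dissim ~ geo_dist)
coef(fit)[["geo_dist"]]
```

Because the pairs share sites, they are not independent observations, so the slope is best treated as a descriptive summary to compare between simulated scenarios rather than as a basis for the usual regression p-values.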

Not valid without extra work:

“This coefficient equals a real competition parameter.”
“The ‘temperature’ units correspond to a real landscape” (unless you supplied a real landscape with a known CRS/units).

Reproducibility checklist

Set (and record) SEED in your init file (or pass seed= to spesim_run()).
If you regenerate advanced panels or vegan-derived summaries separately, call set.seed(SEED) immediately beforehand.
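The second point is easy to demonstrate with base R alone. In the sketch below, `random_step()` is a hypothetical stand-in for any downstream step that draws random numbers, and the seed value simply reuses the one from the minimal example.

```r
SEED <- 77  # record this alongside your outputs

# Stand-in for any step that draws random numbers
# (e.g. a permutation-based summary).
random_step <- function(n = 5) sample(100, n)

# Reseeding immediately before each call reproduces the result exactly.
set.seed(SEED)
first <- random_step()
set.seed(SEED)
second <- random_step()
identical(first, second)

# Without a reseed, the call continues the RNG stream and is not
# guaranteed to repeat the earlier draw.
third <- random_step()
```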

Minimal example

library(spesim)

P <- load_config(system.file("examples/spesim_init_basic.txt", package = "spesim"))
res <- spesim_run(P, write_outputs = FALSE, seed = 77)

plot_spatial_sampling(res$domain, res$species_dist, res$quadrats, res$P)