spesim is intended for teaching and method testing, where you usually want runs to be interactive (seconds, not minutes).
This vignette gives conservative rules of thumb for keeping runtime and memory use reasonable.
What tends to dominate runtime
- Neutral-model simulation loop (when MODEL_FAMILY = "hybrid" or "neutral").
  - The birth–death–dispersal loop runs for NEUTRAL_MAX_STEPS iterations. The inner loop is written in R; cost scales with N_INDIVIDUALS and the number of steps to convergence. As of v0.5.2 the main per-step costs are the C++ point-in-polygon check (pip_cpp) and the R interpreter overhead of the loop body itself; the environmental acceptance and displacement sampling steps are now fully vectorised over the candidate batch.
- Spatial intersections (sf::st_intersects) between individuals and quadrats.
  - Roughly scales with the number of individuals × number of quadrats.
- Environmental grid resolution (the env_gradients raster-like table).
  - Higher resolution increases memory and slows operations that summarise environment per quadrat.
- Optional analyses (reports, panels, diversity calculations).
  - ADVANCED_ANALYSIS = TRUE is intentionally heavier.
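The scaling statements above can be turned into a back-of-the-envelope check before a run. The sketch below uses base R only and is not part of spesim: `rough_workload()` is a hypothetical helper, the pair count is a worst-case proxy rather than what a spatial index would actually evaluate, and the memory figure assumes a table with x and y columns plus `n_layers` numeric gradient columns at 8 bytes per value, which may not match how env_gradients is stored.

```r
# Hypothetical helper (not part of spesim): rough proxies for the two
# costs that grow multiplicatively with the parameters.
rough_workload <- function(n_individuals, n_quadrats, resolution, n_layers = 3) {
  # Worst-case number of individual-quadrat pairs an intersection may consider
  pairs <- n_individuals * n_quadrats
  # Cells in a square grid with `resolution` cells per side
  cells <- resolution^2
  # Assumed layout: x, y plus n_layers numeric columns, 8 bytes per value
  grid_mb <- cells * (2 + n_layers) * 8 / 1024^2
  list(pairs = pairs, cells = cells, grid_mb = grid_mb)
}

small <- rough_workload(n_individuals = 2000, n_quadrats = 30, resolution = 50)
large <- rough_workload(n_individuals = 20000, n_quadrats = 200, resolution = 300)

# How much more pairwise work and grid memory the larger setup implies
c(pair_ratio = large$pairs / small$pairs,
  grid_ratio = large$grid_mb / small$grid_mb)
```

Ratios like these are only a guide to which parameter to reduce first; actual timings depend on the machine and on the model family.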
Safe interactive ranges (rules of thumb)
These are not hard limits—just settings that tend to work well on a laptop.
Individuals
- Good default: N_INDIVIDUALS ≈ 1,000–5,000
- Still workable for method testing: up to ~20,000 (expect slower plotting/intersections)
Quadrat sampling
- Good default: N_QUADRATS ≈ 10–50
- Very large N_QUADRATS increases intersection work and can make some analyses noisy.
Environmental grid
SAMPLING_RESOLUTION is the number of grid cells per side for the environmental surface.
- Good default: 30–80
- Above ~150 you should expect larger memory use and slower quadrat-environment summaries.
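The three ranges above can be checked in one place before starting a run. This is a minimal sketch, not a spesim function: `check_interactive_ranges()` is a hypothetical helper, and it assumes the parameters live in a named list with the fields N_INDIVIDUALS, N_QUADRATS and SAMPLING_RESOLUTION. The thresholds are the rules of thumb from this section, not hard limits.

```r
# Hypothetical helper (not part of spesim). Assumes P is a named list with
# the fields N_INDIVIDUALS, N_QUADRATS and SAMPLING_RESOLUTION.
check_interactive_ranges <- function(P) {
  notes <- character(0)
  if (P$N_INDIVIDUALS > 20000) {
    notes <- c(notes, "N_INDIVIDUALS above ~20,000: expect slow plotting and intersections")
  } else if (P$N_INDIVIDUALS > 5000) {
    notes <- c(notes, "N_INDIVIDUALS above the 1,000-5,000 default range")
  }
  if (P$N_QUADRATS > 50) {
    notes <- c(notes, "N_QUADRATS above the 10-50 default range")
  }
  if (P$SAMPLING_RESOLUTION > 150) {
    notes <- c(notes, "SAMPLING_RESOLUTION above ~150: expect more memory use")
  } else if (P$SAMPLING_RESOLUTION > 80) {
    notes <- c(notes, "SAMPLING_RESOLUTION above the 30-80 default range")
  }
  notes  # empty character vector when nothing is flagged
}

# Illustrative parameter lists (values chosen for the example only)
P_demo <- list(N_INDIVIDUALS = 2000, N_QUADRATS = 30, SAMPLING_RESOLUTION = 50)
P_big  <- list(N_INDIVIDUALS = 30000, N_QUADRATS = 30, SAMPLING_RESOLUTION = 200)

check_interactive_ranges(P_demo)
check_interactive_ranges(P_big)
```

Returning notes instead of stopping keeps the check advisory, which matches the intent of the ranges.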
Advanced outputs
- For interactive work, start with:

    P$ADVANCED_ANALYSIS <- FALSE
    x <- spesim_method_test(P = P, make_plots = TRUE)

- Enable advanced reporting once your parameters are stable:

    P$ADVANCED_ANALYSIS <- TRUE
    res <- spesim_run(P, write_outputs = TRUE, seed = P$SEED)
    cat(generate_full_report(res, include_audit = TRUE), sep = "\n")

Practical tips
- When comparing sampling schemes, keep the truth fixed (same seed) and change only sampling parameters.
- Prefer running many small replicates over one enormous run.
- If a sampling scheme returns very few quadrats, some downstream summaries may be undefined or uninformative (distance–decay requires ≥2 sites, etc.).
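The first two tips can be illustrated with a toy example in base R. Nothing below calls spesim: `make_truth()` and `sample_quadrats()` are hypothetical stand-ins, and the "community" is just random points with species labels. The sketch shows the pattern of fixing the truth with one seed, then running many small replicates in which only the sampling parameter changes.

```r
# Toy stand-in for a simulated community (not spesim output): random points
# in the unit square, each with a species label.
make_truth <- function(n, n_species, seed) {
  set.seed(seed)
  data.frame(x = runif(n), y = runif(n),
             sp = sample.int(n_species, n, replace = TRUE))
}

# Species richness in n_quadrats randomly placed square quadrats of side `size`.
sample_quadrats <- function(truth, n_quadrats, size = 0.1) {
  x0 <- runif(n_quadrats, 0, 1 - size)
  y0 <- runif(n_quadrats, 0, 1 - size)
  vapply(seq_len(n_quadrats), function(i) {
    inside <- truth$x >= x0[i] & truth$x <= x0[i] + size &
              truth$y >= y0[i] & truth$y <= y0[i] + size
    length(unique(truth$sp[inside]))
  }, integer(1))
}

# The truth is generated once and reused by every sampling scheme.
truth <- make_truth(n = 1500, n_species = 15, seed = 42)

# Many small replicates per scheme; only n_quadrats differs between schemes.
set.seed(1)
rich_10 <- replicate(20, mean(sample_quadrats(truth, n_quadrats = 10)))
rich_40 <- replicate(20, mean(sample_quadrats(truth, n_quadrats = 40)))

# Spread of the replicate means under each scheme
c(sd_10 = sd(rich_10), sd_40 = sd(rich_40))
```

Because the truth is held fixed, any difference between the two sets of replicate means comes from the sampling scheme alone.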
If you need to go bigger
- Reduce plotting.
- Increase N_INDIVIDUALS gradually.
- Consider using the fast point-process engines when available.
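One way to increase the problem size gradually is to double it until a single run exceeds a time budget. The sketch below is an assumption-laden illustration in base R: `run_once()` is a hypothetical stand-in that only does work proportional to n, and `scale_up()` is not a spesim function. In practice the stand-in would be replaced by your own simulation call.

```r
# Stand-in for one simulation run (hypothetical; replace with your own call).
# It only does work proportional to n so the timing loop has something to measure.
run_once <- function(n) {
  pts <- matrix(runif(2 * n), ncol = 2)
  sum(sqrt(rowSums(pts^2)))
}

# Double n until a single run exceeds the time budget (in seconds)
# or the ceiling n_max is passed. Returns one row per size tried.
scale_up <- function(n_start, n_max, budget_s) {
  n <- n_start
  timings_log <- data.frame(n = integer(0), elapsed = numeric(0))
  while (n <= n_max) {
    elapsed <- system.time(run_once(n))[["elapsed"]]
    timings_log <- rbind(timings_log, data.frame(n = n, elapsed = elapsed))
    if (elapsed > budget_s) break  # stop growing once a run is too slow
    n <- n * 2L
  }
  timings_log
}

timings <- scale_up(n_start = 1000L, n_max = 16000L, budget_s = 5)
timings
```

The last row of the result shows the largest size tried, which gives a practical ceiling for interactive work on the current machine.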
Engine performance history
Internal profiling (N = 1,500, S = 15, hybrid model) shows the following cumulative improvements since v0.5.1:
| Release | Change | Approx. wall time |
|---|---|---|
| v0.5.1 | C++ PIP engine + vectorised domain checks | 17,690 ms |
| v0.5.2 | Vectorised env-acceptance + displacement sampling | 2,600 ms |
The residual cost is now dominated by the C++ PIP engine and the R interpreter overhead of the simulation loop itself, both of which are at or near their practical floor without a full Rcpp port of the inner loop.
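For reference, the relative improvement implied by the two wall times in the table can be computed directly. The figures are copied from the table; nothing else is assumed.

```r
# Wall times from the table above, in milliseconds
wall_ms <- c("v0.5.1" = 17690, "v0.5.2" = 2600)

# Speed-up factor and fractional reduction from v0.5.1 to v0.5.2
speedup   <- wall_ms[["v0.5.1"]] / wall_ms[["v0.5.2"]]
reduction <- 1 - wall_ms[["v0.5.2"]] / wall_ms[["v0.5.1"]]

round(speedup, 1)          # 6.8
round(100 * reduction, 1)  # 85.3
```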