Generates individual locations and species identities over an arbitrary polygon domain by combining:

  1. a spatial point process for baseline locations (clustered, inhibited, or Poisson),

  2. environmental filtering for gradient-responsive species, and

  3. local interspecific interactions within a neighbourhood radius.

The dominant species "A" and the pool of non-dominant species can use different point-process models. Environmental suitability is applied as a Gaussian preference around a per-species optimum, and local interactions are incorporated as a multiplicative modifier computed from nearby already-assigned individuals.

Usage

generate_heterogeneous_distribution(domain, P)

Arguments

domain

An sf polygon (or multipolygon) defining the sampling domain; must have a valid CRS (projected coordinates recommended).

P

A fully materialized parameter list, typically from load_config(), containing at least:

  • Community size: N_SPECIES, N_INDIVIDUALS, DOMINANT_FRACTION, FISHER_ALPHA, FISHER_X.

  • Model family: MODEL_FAMILY, one of "manual", "niche_filtering", "neutral_csr", "neutral_hubbell_like", or "hybrid".

  • Environment: SAMPLING_RESOLUTION, ENVIRONMENTAL_NOISE, and GRADIENT (tibble with species, gradient in {temperature, elevation, rainfall}, optimum in \([0,1]\), and tol > 0). If absent, species are treated as neutral.

  • Neutral/hybrid controls (if family is neutral/hybrid): NEUTRAL_M, NEUTRAL_NU, NEUTRAL_META_MODEL, DISPERSAL_KERNEL, DISPERSAL_SCALE, DISPERSAL_ALPHA, and HYBRID_ENV_WEIGHT.

  • Point-process selection (strings): SPATIAL_PROCESS_A and SPATIAL_PROCESS_OTHERS, each one of "poisson", "thomas", "strauss", or "geyer".

  • Thomas (A) params (if used): A_PARENT_INTENSITY (parents per area; optional), A_MEAN_OFFSPRING (mean children per parent), A_CLUSTER_SCALE (Gaussian sd of offspring displacement; map units).

  • Strauss/Geyer (others) params (if used):

    • Strauss (inhibition surrogate): OTHERS_R (interaction radius) and OTHERS_S (inhibition strength in \((0,1]\); smaller values yield stronger inhibition).

    • Geyer (saturation): OTHERS_R (interaction radius), OTHERS_GAMMA (interaction parameter; < 1 inhibition, > 1 clustering), and OTHERS_S (saturation count; positive integer).

    • OTHERS_BETA (baseline intensity/multiplier; optional; used by some engines).

    OTHERS_* quick reference. These parameters are shared across several point-process code paths, so the same name can feed different underlying samplers depending on SPATIAL_PROCESS_OTHERS:

    Parameter    | Meaning                                               | Used when               | Default                                         | Constraints
    OTHERS_R     | interaction radius                                    | Strauss, Geyer          | 1                                               | > 0 (map units)
    OTHERS_S     | Strauss: inhibition strength; Geyer: saturation count | Strauss, Geyer          | 2                                               | Strauss: (0,1]; Geyer: integer >= 0
    OTHERS_GAMMA | Geyer interaction parameter                           | Geyer                   | NA (falls back to OTHERS_S in some dispatchers) | > 0 (<1 inhibition, >1 clustering)
    OTHERS_BETA  | baseline intensity / multiplier                       | some engines / wrappers | NA                                              | >= 0 (units depend on engine)

    Note: in internal dispatchers, missing OTHERS_GAMMA may fall back to OTHERS_S for backwards compatibility in some Strauss/Geyer paths. Prefer setting OTHERS_GAMMA explicitly for Geyer.

  • Local interactions: INTERACTION_RADIUS (map units) and INTERACTION_MATRIX (S x S numeric, dimnames = species letters).
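As an illustration of the two structured entries in the list above, the following sketch builds a GRADIENT table and an INTERACTION_MATRIX by hand. All values are hypothetical, a plain data.frame stands in for the tibble, and the row = focal species, column = neighbour species orientation of the matrix is an assumption rather than something this page states.

```r
# Hypothetical values for illustration only; these are not package defaults.
species <- LETTERS[1:4]

# GRADIENT: one row per gradient-responsive species.
# A data.frame is used here in place of a tibble to keep the sketch base-R only.
GRADIENT <- data.frame(
  species  = c("B", "C"),
  gradient = c("temperature", "elevation"),
  optimum  = c(0.3, 0.8),    # documented range: [0, 1]
  tol      = c(0.10, 0.25),  # documented constraint: > 0
  stringsAsFactors = FALSE
)

# INTERACTION_MATRIX: S x S numeric with species letters as dimnames.
# An all-ones matrix means "no local interaction effect".
INTERACTION_MATRIX <- matrix(
  1,
  nrow = length(species), ncol = length(species),
  dimnames = list(species, species)
)
# Assumed orientation: row = focal species, column = neighbour species.
INTERACTION_MATRIX["B", "A"] <- 0.7  # B is penalized near A
INTERACTION_MATRIX["C", "B"] <- 1.3  # C is favoured near B

# Sanity checks mirroring the documented constraints.
stopifnot(
  all(GRADIENT$optimum >= 0 & GRADIENT$optimum <= 1),
  all(GRADIENT$tol > 0),
  all(GRADIENT$gradient %in% c("temperature", "elevation", "rainfall")),
  identical(rownames(INTERACTION_MATRIX), colnames(INTERACTION_MATRIX))
)
```

These objects would then be stored as P$GRADIENT and P$INTERACTION_MATRIX before calling generate_heterogeneous_distribution().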

Value

An sf POINT layer with a character column species and appended environmental columns (e.g. temperature_C, elevation_m, rainfall_mm). Rows correspond to simulated individuals retained after assignment.

Details

Abundances. Total individuals per species are generated with generate_sad() using P$SAD_MODEL (default: "fisher"). The built-in Fisher option allocates a fixed fraction of individuals to species "A" and distributes the remainder across B, C, ... according to a log-series; other SAD models generate the full abundance vector, including "A", directly.
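The Fisher-style split can be sketched as follows. This is not the code of generate_sad(); fisher_split_sketch() is a hypothetical helper, and it assumes that the log-series weight for the k-th non-dominant species is proportional to x^k / k, with x playing the role of FISHER_X. Rounding uses a largest-remainder rule so that the counts sum exactly to the requested total.

```r
# Hypothetical helper: NOT the package's generate_sad().
# Assumes log-series weights x^k / k for the k-th non-dominant species.
fisher_split_sketch <- function(n_species, n_individuals, dominant_fraction, x) {
  stopifnot(n_species >= 2, n_species <= 26, x > 0, x < 1)

  # Fixed share for the dominant species "A".
  n_a <- round(dominant_fraction * n_individuals)
  remainder <- n_individuals - n_a

  # Log-series relative weights for species B, C, ...
  ranks <- seq_len(n_species - 1)
  rel <- x^ranks / ranks
  raw <- remainder * rel / sum(rel)

  # Largest-remainder rounding so the total is preserved exactly.
  counts <- floor(raw)
  short <- remainder - sum(counts)
  if (short > 0) {
    top <- order(raw - counts, decreasing = TRUE)[seq_len(short)]
    counts[top] <- counts[top] + 1
  }

  setNames(c(n_a, counts), LETTERS[seq_len(n_species)])
}

abund <- fisher_split_sketch(
  n_species = 10, n_individuals = 1000,
  dominant_fraction = 0.3, x = 0.9
)
```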

Baseline locations (point processes). Locations are simulated using the process names in P. Supported values (case-insensitive):

"poisson"

Homogeneous Poisson process (Complete Spatial Randomness).

"thomas"

Thomas (Neyman-Scott) cluster process. A fast C++ implementation is used if available.

"strauss"

Strauss process for inhibition. A fast C++ MCMC implementation is used if available.

"geyer"

Geyer saturation process. A fast C++ MCMC implementation is used if available.
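To make the roles of A_PARENT_INTENSITY, A_MEAN_OFFSPRING, and A_CLUSTER_SCALE concrete, here is a minimal base-R sketch of a Thomas process on a rectangular window. simulate_thomas_sketch() is a hypothetical helper, not the package's C++ implementation: it ignores the polygon domain and does not correct for edge effects (parents outside the window are not simulated).

```r
# Hypothetical helper: a bare-bones Thomas process on a rectangle.
# Not the package's implementation; no polygon clipping, no edge correction.
simulate_thomas_sketch <- function(parent_intensity, mean_offspring, cluster_scale,
                                   xlim = c(0, 10), ylim = c(0, 10)) {
  area <- diff(xlim) * diff(ylim)

  # 1. Parents: homogeneous Poisson process (parents per unit area).
  n_parents <- rpois(1, parent_intensity * area)
  px <- runif(n_parents, xlim[1], xlim[2])
  py <- runif(n_parents, ylim[1], ylim[2])

  # 2. Offspring counts: Poisson with the given mean per parent.
  n_off <- rpois(n_parents, mean_offspring)
  parent_id <- rep(seq_len(n_parents), times = n_off)

  # 3. Offspring positions: isotropic Gaussian displacement around the parent.
  x <- px[parent_id] + rnorm(length(parent_id), sd = cluster_scale)
  y <- py[parent_id] + rnorm(length(parent_id), sd = cluster_scale)

  # 4. Keep only offspring that fall inside the window.
  keep <- x >= xlim[1] & x <= xlim[2] & y >= ylim[1] & y <= ylim[2]
  data.frame(x = x[keep], y = y[keep], parent = parent_id[keep])
}

set.seed(1)
pts_sketch <- simulate_thomas_sketch(
  parent_intensity = 0.2, mean_offspring = 10, cluster_scale = 0.8
)
```

Smaller cluster_scale values give tighter clumps around each parent; larger values approach the appearance of a Poisson pattern.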

Environmental filtering. For species listed in P$GRADIENT, the assignment probability for each species is proportional to \(\exp\{-(x - \mu)^2 / (2 \sigma^2)\}\), where \(x\) is the normalized environmental value at the point, \(\mu\) the species optimum, and \(\sigma\) the tolerance. Environmental values are attached by nearest neighbour join to the grid returned by create_environmental_gradients().
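The Gaussian preference above can be evaluated directly. In this sketch, gaussian_suitability() is a hypothetical helper name and the optima and tolerances are made-up values; the weights are normalized across species to give assignment probabilities at a single point.

```r
# Hypothetical helper implementing exp{-(x - mu)^2 / (2 sigma^2)}.
gaussian_suitability <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2))
}

# Normalized environmental value at one candidate point (illustrative).
x <- 0.4

# Made-up optima and tolerances for two gradient-responsive species.
optimum <- c(B = 0.3, C = 0.8)
tol     <- c(B = 0.10, C = 0.25)

# Unnormalized weights, then probabilities proportional to them.
w <- gaussian_suitability(x, optimum, tol)
probs <- w / sum(w)
```

Here species B, whose optimum lies closer to x relative to its tolerance, receives the larger probability.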

Local interactions. For each candidate point and focal species, the abundance-independent interaction modifier is the geometric mean of the corresponding coefficients in P$INTERACTION_MATRIX for neighbours found within P$INTERACTION_RADIUS (using up to 5 nearest already-assigned individuals). Coefficients > 1 favour co-occurrence; < 1 penalize it.
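A minimal sketch of the geometric-mean modifier described above. interaction_modifier() is a hypothetical helper, not the package's internal function; it assumes strictly positive coefficients, a row = focal species and column = neighbour species orientation, and a neutral value of 1 when no neighbour lies within the radius.

```r
# Hypothetical helper: geometric mean of interaction coefficients
# over the (up to k) nearest already-assigned neighbours within `radius`.
# Assumes M[focal, neighbour] orientation and strictly positive coefficients.
interaction_modifier <- function(focal, neighbour_species, neighbour_dist,
                                 M, radius, k = 5) {
  in_range <- neighbour_dist <= radius
  # Neutral modifier when interactions are disabled or nobody is nearby.
  if (radius <= 0 || !any(in_range)) return(1)

  # Order in-range neighbours by distance and keep at most k of them.
  ord <- order(neighbour_dist[in_range])
  nb <- head(neighbour_species[in_range][ord], k)

  # Geometric mean computed on the log scale.
  coefs <- M[focal, nb]
  exp(mean(log(coefs)))
}

sp <- c("A", "B", "C")
M <- matrix(1, 3, 3, dimnames = list(sp, sp))
M["B", "A"] <- 0.5
M["B", "C"] <- 2

# Two neighbours in range (A and C); the third (A at distance 3) is too far.
mod <- interaction_modifier(
  focal = "B",
  neighbour_species = c("A", "C", "A"),
  neighbour_dist = c(0.2, 0.4, 3.0),
  M = M, radius = 1
)
```

Because the geometric mean is used, one penalizing neighbour (0.5) and one favouring neighbour (2) cancel out exactly, which would not be the case with an arithmetic mean.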

Tie-breaking and robustness. If all weights for a step are non-finite or non-positive, a uniform assignment is used for that step to avoid dead ends. Only points with a non-empty species are returned.
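The uniform fallback can be sketched as below. safe_sample_species() is a hypothetical helper; the all-invalid branch follows the rule stated above, while zeroing out individual invalid weights when some valid ones remain is an assumption of this sketch.

```r
# Hypothetical helper: draw one species from a named weight vector,
# falling back to a uniform draw when no weight is usable.
safe_sample_species <- function(weights) {
  ok <- is.finite(weights) & weights > 0
  if (!any(ok)) {
    # Documented fallback: every weight is non-finite or non-positive.
    probs <- rep(1 / length(weights), length(weights))
  } else {
    # Assumption of this sketch: invalid entries get zero probability.
    w <- ifelse(ok, weights, 0)
    probs <- w / sum(w)
  }
  sample(names(weights), size = 1, prob = probs)
}

set.seed(42)
fallback_draw <- safe_sample_species(c(A = NaN, B = 0, C = -1))
regular_draw  <- safe_sample_species(c(A = 0, B = 2, C = NA))
```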

Notes

  • Use a projected CRS (e.g. metres) so that process radii and cluster scales are in meaningful linear units.

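  • When SPATIAL_PROCESS_OTHERS is inhibitory and the number of points requested (n) is large relative to the domain area and the interaction radius OTHERS_R, the sampler may hit feasibility limits because the requested points cannot all be placed at the required spacing. In that case, reduce the number of individuals, reduce OTHERS_R, or choose a different process.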

  • Setting INTERACTION_RADIUS = 0 or an all-ones matrix disables local interaction effects.

Examples

if (FALSE) { # \dontrun{
P <- load_config("simul_init.txt")
domain <- create_sampling_domain()

# Example: A clustered (Thomas), others mildly inhibited (Strauss)
P$SPATIAL_PROCESS_A <- "thomas"
P$A_PARENT_INTENSITY <- NA
P$A_MEAN_OFFSPRING <- 10
P$A_CLUSTER_SCALE <- 0.8

P$SPATIAL_PROCESS_OTHERS <- "strauss"
P$OTHERS_R <- 0.6
P$OTHERS_S <- 0.5

pts <- generate_heterogeneous_distribution(domain, P)
plot(sf::st_geometry(pts), pch = 16, cex = 0.4)
} # }