Create a generator specification for multivariate Gaussian variables, optionally transformed from ILR coordinates to compositional parts.

Usage

gen_mvn(
  vars,
  level = c("single", "level2", "multilevel"),
  mean = NULL,
  cov = NULL,
  fixed_intercept = NULL,
  random_cov = NULL,
  residual_cov = NULL,
  scale_fixed_intercept = NULL,
  residual_cor = NULL,
  compositional = FALSE,
  parts = NULL,
  total = 1,
  keep_ilr = TRUE,
  sbp = NULL
)

Arguments

vars

Character vector naming the generated variables. For compositional generators these are ILR coordinate names.

level

Simulation level. "single" generates one row-level vector per observation, "level2" generates one group-level vector and expands it to each row in the group, and "multilevel" uses a random-intercept model.

mean, cov

Direct mean vector and covariance matrix for "single" and "level2" generators. Defaults are zero means and identity covariance.

fixed_intercept

Location intercept vector for multilevel generators.

random_cov

Group-level random-intercept covariance matrix. When scale_fixed_intercept is supplied, this covariance spans the location and scale intercepts jointly.

residual_cov

Residual covariance matrix for multilevel MVN generators without a row-specific scale model.
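
As a rough illustration of what a random-intercept MVN model looks like, the base-R sketch below draws one group-level deviation per group, expands it to the rows of that group, and adds row-level residuals. It is not the package's implementation: the helper rmvn0(), the group sizes, and all parameter values are hypothetical choices made for this example only.

```r
set.seed(1)

# Hypothetical design: 3 groups with 4 rows each
n_groups <- 3
n_per_group <- 4
group <- rep(seq_len(n_groups), each = n_per_group)

# Hypothetical parameters mirroring the argument names above
fixed_intercept <- c(y1 = 0, y2 = 1)
random_cov <- diag(c(0.5, 0.5))
residual_cov <- matrix(c(1, 0.3, 0.3, 1), nrow = 2)

# Draw n zero-mean MVN rows using the Cholesky factor of sigma
# (illustrative helper, not part of the package)
rmvn0 <- function(n, sigma) {
  matrix(rnorm(n * ncol(sigma)), nrow = n) %*% chol(sigma)
}

u <- rmvn0(n_groups, random_cov)        # one random intercept vector per group
e <- rmvn0(length(group), residual_cov) # one residual vector per row

# y_ij = fixed_intercept + u_j + e_ij
y <- sweep(u[group, , drop = FALSE] + e, 2, fixed_intercept, "+")
colnames(y) <- names(fixed_intercept)
```

Dropping the residual term e leaves a vector that is constant within each group, which corresponds to the "level2" description above.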

scale_fixed_intercept

Optional log residual standard-deviation intercepts for multilevel generators.

residual_cor

Residual correlation matrix used with scale_fixed_intercept.
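
One common way to combine log standard-deviation intercepts with a correlation matrix is to exponentiate the intercepts and scale the correlation matrix on both sides. The sketch below assumes that construction; whether the package combines the two arguments in exactly this way is not stated here, and the numeric values are hypothetical.

```r
# Hypothetical values: residual SDs of 1 and 2 on the log scale
scale_fixed_intercept <- c(y1 = log(1), y2 = log(2))
residual_cor <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)

# Back-transform log SDs, then build the implied covariance: D %*% R %*% D
residual_sd <- exp(scale_fixed_intercept)
implied_cov <- diag(residual_sd) %*% residual_cor %*% diag(residual_sd)
```

Under this assumption the implied variances are the squared standard deviations, and cov2cor(implied_cov) recovers residual_cor.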

compositional

Logical; if TRUE, treat vars as ILR coordinates and back-transform to composition parts.

parts

Character vector naming the composition parts. Must have length(vars) + 1 entries unless supplied through sbp column names.

total

Positive scalar total for closed compositions.

keep_ilr

Logical; if TRUE, emit both ILR coordinates and parts. If FALSE, emit only parts.

sbp

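Optional sequential binary partition (SBP) matrix. Column names give the parts; each row defines one ILR balance, with entries of 1 and -1 marking the two groups of parts being contrasted and 0 marking parts not involved in that balance.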
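
A small sketch of how such a matrix could be laid out for four parts, with basic structural checks. The part names and the particular partition are hypothetical; the checks are generic SBP sanity checks, not the package's own validation.

```r
# Hypothetical four-part composition
part_names <- c("sleep", "activity", "sedentary", "standing")

# One row per balance, one column per part
sbp <- matrix(
  c(1,  1, -1, -1,   # {sleep, activity} vs {sedentary, standing}
    1, -1,  0,  0,   # sleep vs activity
    0,  0,  1, -1),  # sedentary vs standing
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, part_names)
)

# D parts need D - 1 balances, coded only with -1, 0, and 1
stopifnot(nrow(sbp) == ncol(sbp) - 1)
stopifnot(all(sbp %in% c(-1, 0, 1)))

# Every balance must contrast at least one part on each side
has_both_sides <- apply(sbp, 1, function(row) any(row == 1) && any(row == -1))
stopifnot(all(has_both_sides))

# Going by the Usage signature, this would be passed as, for example:
# gen_mvn(c("z1", "z2", "z3"), compositional = TRUE, sbp = sbp)
```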

Value

An mlsim_generator_spec object for use in simulate_data().

Details

Compositional MVN generators simulate ILR coordinates first, then use the SBP basis to transform the coordinates into positive composition parts that sum to total.
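
The back-transformation can be sketched in base R as follows. This is an illustrative reconstruction, not the package's code: sbp_to_basis() and ilr_to_parts() are hypothetical helper names, and the SBP shown is an assumed default that happens to be consistent with the example output in the Examples section.

```r
# Turn an SBP (rows = balances, columns = parts) into an orthonormal ILR basis
sbp_to_basis <- function(sbp) {
  basis <- matrix(0, nrow(sbp), ncol(sbp), dimnames = dimnames(sbp))
  for (i in seq_len(nrow(sbp))) {
    r <- sum(sbp[i, ] == 1)   # number of parts coded +1
    s <- sum(sbp[i, ] == -1)  # number of parts coded -1
    basis[i, sbp[i, ] == 1]  <-  sqrt(s / (r * (r + s)))
    basis[i, sbp[i, ] == -1] <- -sqrt(r / (s * (r + s)))
  }
  basis
}

# Map ILR coordinates (rows of z) to positive parts that sum to `total`
ilr_to_parts <- function(z, sbp, total = 1) {
  clr <- z %*% sbp_to_basis(sbp)  # centred log-ratio coordinates
  x <- exp(clr)
  total * x / rowSums(x)          # closure to the requested total
}

# Assumed SBP for three parts
sbp <- matrix(
  c(1, -1, -1,
    0,  1, -1),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("sleep", "activity", "sedentary"))
)

# First two ILR rows copied from the example output
z <- rbind(c(0.08025176, -0.8969145),
           c(-0.13242028, 0.1848492))
parts <- ilr_to_parts(z, sbp, total = 1)
```

With this basis, the resulting rows agree with the sleep, activity, and sedentary columns of the first two example rows up to rounding, and each row sums to total.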

Examples

sim <- simulate_data(
  n = 4,
  seed = 2,
  generators = list(
    z = gen_mvn(
      c("z1", "z2"),
      mean = c(0, 0),
      cov = diag(2),
      compositional = TRUE,
      parts = c("sleep", "activity", "sedentary")
    )
  )
)
sim$data
#>    obs_id          z1         z2     sleep  activity  sedentary
#>     <int>       <num>      <num>     <num>     <num>      <num>
#> 1:      1  0.08025176 -0.8969145 0.3135056 0.1507036 0.53579077
#> 2:      2 -0.13242028  0.1848492 0.2965364 0.3974460 0.30601766
#> 3:      3 -0.70795473  1.5878453 0.1100264 0.8047731 0.08520055
#> 4:      4  0.23969802 -1.1303757 0.3340632 0.1119962 0.55394063
rowSums(as.matrix(sim$data[, c("sleep", "activity", "sedentary"), with = FALSE]))
#> [1] 1 1 1 1