simulate
Simulate phenotypes from real genotype data and a user-defined genetic architecture.
Use this command to create controlled datasets for method validation.
Basic Syntax
gelex simulate -b genotypes -o sim_data
gelex simulate --bfile <genotype_prefix> [OPTIONS]
The only required input is the genotype prefix (--bfile).
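Before running the command, it can help to confirm that the genotype prefix resolves to a complete PLINK binary fileset (.bed, .bim, and .fam). The sketch below is a hypothetical helper; `check_bfile` is not part of gelex and only tests that the three files exist.

```shell
# check_bfile PREFIX: succeed only if PREFIX.bed, PREFIX.bim and PREFIX.fam all exist.
# Hypothetical helper for illustration; it does not validate file contents.
check_bfile() {
  local prefix="$1" ext status=0
  for ext in bed bim fam; do
    if [ ! -f "${prefix}.${ext}" ]; then
      echo "missing: ${prefix}.${ext}" >&2
      status=1
    fi
  done
  return "$status"
}
```

A typical use would be `check_bfile genotypes && gelex simulate -b genotypes -o sim_data`.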
Options
Quick Start Options
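| Option | Default | Description |
|---|---|---|
| `-b, --bfile` | required | PLINK binary prefix (`.bed/.bim/.fam`). |
| `-o, --out` | `sim.phen` | Output prefix/path root for simulation outputs. |
| `--h2` | `0.5` | Additive heritability proportion. |
| `--d2` | `0.0` | Dominance heritability proportion. |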
Effect Architecture
| Option | Default | Description |
|---|---|---|
| `--add-var` | `0.01` | Additive effect-class variances (one or more values). |
| `--add-prop` | `1.0` | Additive effect-class proportions; must match `--add-var` length and sum to 1. |
| `--dom-var` | `0.01` | Dominance effect-class variances. |
| `--dom-prop` | `1.0` | Dominance effect-class proportions; must match `--dom-var` length and sum to 1. |
| `--intercept` | `0.0` | Mean term added to simulated phenotypes. |
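To get a feel for what a set of class proportions implies, the following sketch prints the expected number of SNPs per effect class. It assumes each proportion is the share of SNPs assigned to that class, which this page does not spell out, and `expected_class_counts` is a hypothetical helper rather than a gelex command.

```shell
# expected_class_counts N_SNPS "PROPS": print the expected SNP count per effect class.
# Assumption: each proportion is the fraction of SNPs falling in that class.
expected_class_counts() {
  awk -v n="$1" -v props="$2" 'BEGIN {
    k = split(props, p, " ")
    for (i = 1; i <= k; i++) printf "class %d: %.1f\n", i, n * p[i]
  }'
}
```

For example, `expected_class_counts 10000 "0.90 0.05 0.03 0.02"` reports 9000.0, 500.0, 300.0, and 200.0 for the four classes. Since a PLINK .bim file has one line per variant, the SNP count can be taken from `wc -l < genotypes.bim`.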
Randomness
| Option | Default | Description |
|---|---|---|
| `--seed` | `42` | Random seed for reproducibility. |
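For benchmarking it is common to generate several replicates that differ only in the seed. The sketch below is a dry-run helper that prints one command per seed, using only the options documented above. The name `print_replicates` and the `_seed<N>` output suffix are illustrative choices, not gelex conventions.

```shell
# print_replicates BFILE OUT_ROOT SEED...: print one gelex simulate command per seed.
# Hypothetical helper: it only prints commands and does not run gelex itself.
print_replicates() {
  local bfile="$1" out_root="$2" seed
  shift 2
  for seed in "$@"; do
    printf 'gelex simulate -b %s --seed %s -o %s_seed%s\n' \
      "$bfile" "$seed" "$out_root" "$seed"
  done
}
```

After checking the printed commands, they can be executed by piping them to a shell, for instance `print_replicates genotypes sim 1 2 3 | sh`.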
Output Files
Simulation writes phenotype and causal-effect outputs using the --out root.
| File pattern | Contents | Notes |
|---|---|---|
| `<out>.phen` | Simulated phenotype table (FID, IID, phenotype) | Main output for downstream analysis |
| `<out>.causal` | Causal SNP effects and class assignments | Ground truth for benchmarking |
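A quick way to sanity-check a simulated phenotype table is to summarize its third column. The sketch below assumes the file is whitespace-delimited with the phenotype in column 3. Whether a header row is present is not stated here, so rows whose third field is not numeric are simply skipped. `phen_summary` is a hypothetical helper.

```shell
# phen_summary FILE: print the row count, mean and sample variance of column 3.
# Assumption: whitespace-delimited columns in the order FID, IID, phenotype.
phen_summary() {
  awk '
    # Use only rows whose third field looks numeric; this also skips a header row if present.
    $3 ~ /^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/ {
      n++; sum += $3; sumsq += $3 * $3
    }
    END {
      if (n < 2) { print "need at least two numeric rows" > "/dev/stderr"; exit 1 }
      mean = sum / n
      var = (sumsq - n * mean * mean) / (n - 1)
      printf "n=%d mean=%.4f var=%.4f\n", n, mean, var
    }
  ' "$1"
}
```

For a run with `-o sim_basic`, this would be called as `phen_summary sim_basic.phen`.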
Warnings and Notes
Warning
Keep --h2 + --d2 < 1 so that the residual variance stays positive.
Warning
--add-var and --add-prop must have the same number of entries, and
additive proportions must sum to 1.
Note
Dominance classes (--dom-var and --dom-prop) are only used when
--d2 is greater than 0.
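The constraints above can be checked before launching a run. The following is a minimal sketch with two hypothetical helpers (neither is part of gelex). The first tests that the two heritabilities sum to less than 1. The second tests that a variance list and a proportion list have the same length and that the proportions sum to 1, within a small tolerance chosen here for floating-point rounding.

```shell
# check_h2_d2 H2 D2: succeed only if H2 + D2 < 1.
check_h2_d2() {
  awk -v h2="$1" -v d2="$2" 'BEGIN { exit !(h2 + d2 < 1) }'
}

# check_classes "VARS" "PROPS": succeed only if both space-separated lists have
# the same number of entries and the proportions sum to 1 (tolerance 1e-8).
check_classes() {
  awk -v vars="$1" -v props="$2" 'BEGIN {
    nv = split(vars, v, " ")
    np = split(props, p, " ")
    sum = 0
    for (i = 1; i <= np; i++) sum += p[i]
    diff = sum - 1
    if (diff < 0) diff = -diff
    exit !(nv == np && diff < 1e-8)
  }'
}
```

For instance, `check_h2_d2 0.4 0.2 && check_classes "0 0.001 0.01" "0.85 0.10 0.05"` succeeds, while `check_classes "0 0.001" "0.85 0.10 0.05"` fails because the lists differ in length.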
Examples
gelex simulate \
  -b genotypes \
  -o sim_basic

Expected outputs: sim_basic.phen, sim_basic.causal.

gelex simulate \
  -b genotypes \
  --h2 0.3 \
  --d2 0.1 \
  --seed 2026 \
  -o sim_dom

gelex simulate \
  -b genotypes \
  --add-var 0 0.0001 0.001 0.01 \
  --add-prop 0.90 0.05 0.03 0.02 \
  --h2 0.5 \
  --seed 42 \
  -o sim_mix

gelex simulate \
  -b genotypes \
  --h2 0.4 \
  --d2 0.2 \
  --add-var 0 0.001 0.01 \
  --add-prop 0.85 0.10 0.05 \
  --dom-var 0 0.001 \
  --dom-prop 0.95 0.05 \
  --intercept 1.5 \
  --seed 42 \
  -o sim_arch