Risk Exposure
vivarium_public_health provides components for modeling how risk factor
exposures modify disease outcomes. This tutorial covers risk exposure -
how simulants are assigned exposure values. For how exposure modifies disease
rates, see the Risk Effects tutorial.
Overview
A risk model in vivarium_public_health has two primary components:
Exposure - the risk factor and how simulants are exposed to it.
Risk assigns each
simulant an exposure category (for dichotomous/polytomous risks) or a
continuous exposure value.
Effect - how exposure modifies a target rate or measure. A target is a
specific rate modified by the risk, identified by an entity and a measure
(e.g., cause.lung_cancer.incidence_rate). See the Risk Effects
tutorial for details on
RiskEffect and
NonLogLinearRiskEffect.
This tutorial focuses on the exposure side: how the
Risk component determines
who is exposed.
Common Setup
Every code example in this tutorial uses the imports and helpers shown below. To run any example in a standalone script, include all of these at the top:
from vivarium import InteractiveContext
from vivarium_public_health.risks import Risk, RiskEffect
from vivarium_public_health.disease import SI, SIS
from vivarium_public_health.population import BasePopulation
from vivarium_public_health._example_data import BASE_PLUGINS, make_base_config
# BASE_PLUGINS overrides the data plugin to use ExampleArtifactManager,
# which serves example data from memory instead of requiring a real HDF file.
base_plugins = BASE_PLUGINS
# make_base_config() returns a configuration with sensible defaults for
# time range, step size, and randomness key columns.
config = make_base_config()
The Artifact Data Format section shows the expected key names and column layouts for every data key so that you know exactly what to put in your own artifact.
Artifact Data Format
This section documents the key name and column layout that the
Risk component expects.
Risk components support the data_sources configuration pattern that lets
you override individual keys with a scalar, DataFrame, or callable without
rebuilding the artifact (see Data sources).
Data keys
The table below lists every data key used by the
Risk component.
Key |
Index columns |
Value columns |
Configurable? |
|---|---|---|---|
|
(scalar) |
A string (e.g., |
Yes - |
|
age, sex, year, parameter |
|
Yes - |
Artifact data shapes
The examples below use the data builders from the
_example_data module; a production artifact
has the same column layout but with real GBD values.
from vivarium_public_health._example_data import risk_exposure_dichotomous
# risk_factor.{name}.exposure - proportion per exposure category.
exposure = risk_exposure_dichotomous(0.6)
print(exposure.query("year_start == 1990 and parameter == 'exposed'").head(4).to_string(index=False))
age_start age_end sex year_start year_end parameter value
0.000000 0.019178 Male 1990 1991 exposed 0.6
0.000000 0.019178 Female 1990 1991 exposed 0.6
0.019178 0.076712 Male 1990 1991 exposed 0.6
0.019178 0.076712 Female 1990 1991 exposed 0.6
Data sources
Risk components support a data_sources configuration pattern that lets
you override individual data keys without rebuilding the artifact. You can
override any key with:
Scalar (int or float) - broadcast a constant value to all simulants.
DataFrame - use the DataFrame directly.
Callable - call the function at setup time to produce the data.
Artifact key (string) - load a different key from the artifact.
Risk declares a
configurable exposure data source:
# Default configuration (loaded from the artifact):
risk_factor.{name}:
data_sources:
exposure: "risk_factor.{name}.exposure"
distribution_type: "risk_factor.{name}.distribution"
This can be overridden with a scalar in the simulation configuration:
configuration:
risk_factor.{name}:
distribution_type: "dichotomous" # or load from artifact
data_sources:
exposure: 0.6 # scalar, DataFrame, callable, or artifact key
Risk
A Risk component assigns
each simulant an exposure value. In the simplest case - a dichotomous
risk - each simulant falls into one of two categories:
exposed or unexposed. The proportion exposed is determined by the exposure
data source. Each simulant’s propensity (a random value drawn at
initialization) determines whether they are in the exposed group.
Basic example
The simplest risk model sets everyone to exposed:
config = make_base_config()
config.update(
{
"population": {"population_size": 1_000},
"mortality": {"data_sources": {"all_cause_mortality_rate": 0}},
"risk_factor.test_risk": {
"distribution_type": "dichotomous",
"data_sources": {"exposure": 1.0},
},
},
layer="override",
)
sim = InteractiveContext(
components=[
BasePopulation(),
Risk("risk_factor.test_risk"),
],
configuration=config,
plugin_configuration=base_plugins,
)
pop = sim.get_population(["test_risk.exposure"])
# With exposure=1.0, all simulants are in the exposed category.
print(f"All exposed: {(pop['test_risk.exposure'] == 'exposed').all()}")
All exposed: True
Partial exposure
When the exposure proportion is less than 1, some simulants will be exposed and others unexposed. The split is determined by each simulant’s propensity draw:
config = make_base_config()
config.update(
{
"population": {"population_size": 10_000},
"mortality": {"data_sources": {"all_cause_mortality_rate": 0}},
"risk_factor.test_risk": {
"distribution_type": "dichotomous",
"data_sources": {"exposure": 0.4},
},
},
layer="override",
)
sim = InteractiveContext(
components=[
BasePopulation(),
Risk("risk_factor.test_risk"),
],
configuration=config,
plugin_configuration=base_plugins,
)
pop = sim.get_population(["test_risk.exposure"])
n_exposed = (pop["test_risk.exposure"] == "exposed").sum()
proportion = n_exposed / len(pop)
# Approximately 40% should be exposed.
print(f"Proportion exposed near 0.4: {np.isclose(proportion, 0.4, atol=0.02)}")
Proportion exposed near 0.4: True
Configuration Summary
Component |
Key configuration options |
Artifact data required |
|---|---|---|
|
|
|
Note
For how exposure modifies disease rates via relative risks and PAF, see the Risk Effects tutorial.
For more advanced use cases - including polytomous risks, coverage gaps, alternative risk factors, and parameterized effect distributions - see the Non-standard Risk Exposure and Effect Models tutorial.