Population Data Transformations
Provide tools for handling raw demographic data and transforming it into different distributions for sampling.
- vivarium_public_health.population.data_transformations.assign_demographic_proportions(population_data, include_sex)
Calculate conditional probabilities on the provided population data for sampling.
- Parameters:
population_data (DataFrame) – Table with columns ‘age’, ‘sex’, ‘year’, ‘location’, and ‘value’.
include_sex (str) – ‘Female’, ‘Male’, or ‘Both’. Sexes to include in the distribution.
- Return type:
DataFrame
- Returns:
Table with columns:
‘age’ : Midpoint of the age group
‘age_start’ : Lower bound of the age group
‘age_end’ : Upper bound of the age group
‘sex’ : ‘Male’ or ‘Female’
‘location’ : Location
‘year’ : Year
‘population’ : Total population estimate
‘P(sex, location | age, year)’ : Conditional probability of sex and location given age and year
‘P(sex, location, age | year)’ : Conditional probability of sex, location, and age given year
‘P(age | year, sex, location)’ : Conditional probability of age given year, sex, and location
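As a rough illustration of the three conditional probabilities listed above, the following sketch computes each one as a population share within its conditioning group using plain pandas. The toy data and the helper name `conditional_probability` are hypothetical, and this is not the library's implementation; filtering by include_sex and the age_start/age_end columns are omitted.

```python
import pandas as pd

# Hypothetical toy input with the documented input columns.
population_data = pd.DataFrame({
    "age": [2.5, 2.5, 7.5, 7.5],
    "sex": ["Female", "Male", "Female", "Male"],
    "year": [2020, 2020, 2020, 2020],
    "location": ["Kenya"] * 4,
    "value": [100.0, 300.0, 200.0, 400.0],
})


def conditional_probability(data, value_col, given):
    """Return each row's share of value_col within the group defined by `given`."""
    group_total = data.groupby(given)[value_col].transform("sum")
    return data[value_col] / group_total


# Assumption: 'value' holds the population count and becomes 'population'.
proportions = population_data.rename(columns={"value": "population"})
proportions["P(sex, location | age, year)"] = conditional_probability(
    proportions, "population", ["age", "year"]
)
proportions["P(sex, location, age | year)"] = conditional_probability(
    proportions, "population", ["year"]
)
proportions["P(age | year, sex, location)"] = conditional_probability(
    proportions, "population", ["year", "sex", "location"]
)
```

Each probability column sums to 1 within its conditioning group, which is a convenient sanity check on real data.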
- vivarium_public_health.population.data_transformations.rescale_binned_proportions(pop_data, age_start, age_end)
Reshape the distribution so that bin edges fall on age_start and age_end.
- Parameters:
pop_data (DataFrame) – Table with columns ‘age’, ‘age_start’, ‘age_end’, ‘sex’, ‘year’, ‘location’, ‘population’, ‘P(sex, location, age| year)’, ‘P(sex, location | age, year)’, ‘P(age | year, sex, location)’.
age_start (float) – The starting age for the rescaled bins.
age_end (float) – The terminal age for the rescaled bins.
- Return type:
DataFrame
- Returns:
Table with the same columns as pop_data where all bins outside the range (age_start, age_end) have been discarded. If age_start and age_end don’t fall cleanly on age boundaries, the bins in which they lie are clipped and the ‘population’, ‘P(sex, location, age| year)’, and ‘P(age | year, sex, location)’ values are rescaled to reflect their smaller representation.
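The clipping and rescaling described above can be sketched as follows. This is a minimal illustration, assuming population is spread uniformly within each bin so that a clipped bin keeps a share proportional to its remaining width. The function name `clip_and_rescale`, the `scaled_columns` argument, and the recomputed midpoint are hypothetical and not part of the library.

```python
import pandas as pd


def clip_and_rescale(bins, age_start, age_end, scaled_columns):
    """Drop bins outside (age_start, age_end) and shrink partially covered bins."""
    # Keep only bins that overlap the requested range.
    overlaps = (bins["age_end"] > age_start) & (bins["age_start"] < age_end)
    out = bins[overlaps].copy()

    original_width = out["age_end"] - out["age_start"]
    out["age_start"] = out["age_start"].clip(lower=age_start)
    out["age_end"] = out["age_end"].clip(upper=age_end)

    # Assumption: uniform density within a bin, so values scale with width.
    fraction_kept = (out["age_end"] - out["age_start"]) / original_width
    for column in scaled_columns:
        out[column] = out[column] * fraction_kept

    # Assumption: 'age' is the bin midpoint and is recomputed after clipping.
    out["age"] = (out["age_start"] + out["age_end"]) / 2
    return out.reset_index(drop=True)


bins = pd.DataFrame({
    "age": [2.5, 7.5, 12.5],
    "age_start": [0.0, 5.0, 10.0],
    "age_end": [5.0, 10.0, 15.0],
    "population": [500.0, 400.0, 300.0],
})
rescaled = clip_and_rescale(
    bins, age_start=2.0, age_end=10.0, scaled_columns=["population"]
)
```

Here the first bin is clipped from (0, 5) to (2, 5) and keeps three fifths of its population, the second bin is untouched, and the third is discarded.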
- class vivarium_public_health.population.data_transformations.AgeValues(current, young, old)
- current
Alias for field number 0
- old
Alias for field number 2
- young
Alias for field number 1
- class vivarium_public_health.population.data_transformations.EndpointValues(left, right)
- left
Alias for field number 0
- right
Alias for field number 1
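The "Alias for field number N" entries above are the standard documentation for named-tuple fields: each name is a readable alias for a tuple position. The sketch below rebuilds local stand-ins with the same field order using `collections.namedtuple` rather than importing the library classes; the numeric values are arbitrary placeholders.

```python
from collections import namedtuple

# Local stand-ins mirroring the documented field order (not the library objects).
AgeValues = namedtuple("AgeValues", ["current", "young", "old"])
EndpointValues = namedtuple("EndpointValues", ["left", "right"])

# Placeholder values chosen only to show how positions map to names.
ages = AgeValues(current=32.5, young=27.5, old=37.5)
endpoints = EndpointValues(left=30.0, right=35.0)

# Fields can be read by name or by position, and the tuple unpacks in order.
current, young, old = ages
left, right = endpoints
```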
- vivarium_public_health.population.data_transformations.smooth_ages(simulants, population_data, randomness)
Distribute simulants among ages within their assigned age bins.
- Parameters:
simulants (DataFrame) – Table with columns ‘age’, ‘sex’, and ‘location’.
population_data (DataFrame) – Table with columns ‘age’, ‘sex’, ‘year’, ‘location’, ‘population’, ‘P(sex, location, age| year)’, ‘P(sex, location | age, year)’, ‘P(age | year, sex, location)’.
randomness (RandomnessStream) – Source of random number generation within the vivarium common random number framework.
- Return type:
DataFrame
- Returns:
Table with the same columns as simulants, with ages smoothed out within the age bins.
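To make the idea of smoothing ages within a bin concrete, here is a deliberately simplified sketch that replaces each bin-midpoint age with a uniform draw from that bin. It is not the library's algorithm, and it swaps the RandomnessStream for a NumPy generator. The function name `smooth_ages_uniform` and the separate `bin_edges` table are hypothetical.

```python
import numpy as np
import pandas as pd


def smooth_ages_uniform(simulants, bin_edges, rng):
    """Replace each bin-midpoint age with a uniform draw from its bin."""
    # Look up the bin bounds for every simulant via its midpoint age.
    out = simulants.merge(bin_edges, on="age", how="left")
    draws = rng.uniform(size=len(out))
    out["age"] = out["age_start"] + draws * (out["age_end"] - out["age_start"])
    # Return only the columns the simulants table started with.
    return out[simulants.columns.tolist()]


simulants = pd.DataFrame({
    "age": [2.5, 2.5, 7.5],
    "sex": ["Female", "Male", "Male"],
    "location": ["Kenya", "Kenya", "Kenya"],
})
bin_edges = pd.DataFrame({
    "age": [2.5, 7.5],
    "age_start": [0.0, 5.0],
    "age_end": [5.0, 10.0],
})
smoothed = smooth_ages_uniform(simulants, bin_edges, np.random.default_rng(0))
```

A uniform draw ignores how population density changes across neighbouring bins, so treat this only as a picture of the input and output shapes.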
- vivarium_public_health.population.data_transformations.get_cause_deleted_mortality_rate(all_cause_mortality_rate, list_of_csmrs)
Compute the cause-deleted mortality rate by subtracting the individual cause-specific mortality rates (CSMRs) from the all-cause mortality rate.
- Parameters:
- Return type:
DataFrame
- Returns:
DataFrame with the same index columns and a death_due_to_other_causes column containing the residual mortality rate after subtracting all provided cause-specific rates.
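The subtraction described above can be sketched with plain pandas. The input types are not documented here, so this example assumes each rate is a Series indexed by the same demographic columns; the index names and rate values are made up for illustration, and this is not the library's implementation.

```python
import pandas as pd

# Assumption: rates are Series sharing one demographic index.
index = pd.MultiIndex.from_tuples(
    [("Female", 0.0, 5.0), ("Male", 0.0, 5.0)],
    names=["sex", "age_start", "age_end"],
)
all_cause_mortality_rate = pd.Series([0.010, 0.012], index=index)
list_of_csmrs = [
    pd.Series([0.002, 0.003], index=index),
    pd.Series([0.001, 0.001], index=index),
]

# Subtract each cause-specific rate; rows missing from a CSMR count as zero.
residual = all_cause_mortality_rate.copy()
for csmr in list_of_csmrs:
    residual = residual.sub(csmr, fill_value=0.0)

# Move the index levels back into columns and name the residual rate.
cause_deleted = residual.rename("death_due_to_other_causes").reset_index()
```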
- vivarium_public_health.population.data_transformations.load_population_structure(builder)
Load population structure data from the artifact and add derived columns.
- Parameters:
builder (Builder) – Access point for utilizing framework interfaces during setup.
- Return type:
DataFrame
- Returns:
DataFrame with all columns from the raw data plus age and location.
- vivarium_public_health.population.data_transformations.get_live_births_per_year(builder)
Compute the simulated number of live births per year.
Combines population structure data with live birth covariate data to produce a per-year birth rate scaled to the simulation’s initial population size. Handles time-dependent vs. fixed birth rates and population fractions according to the fertility configuration, and extends the series to cover the simulation end year if needed.
- Parameters:
builder (Builder) – Access point for utilizing framework interfaces during setup.
- Return type:
Series
- Returns:
A pandas.Series indexed by year containing the expected number of new simulant births per year.
- vivarium_public_health.population.data_transformations.rescale_final_age_bin(builder, population_data)
Clip and rescale the final age bin to match initialization_age_max.
When population.initialization_age_max is configured and falls within an existing age bin, that bin is truncated at initialization_age_max and its value is scaled proportionally to reflect the reduced width.
- Parameters:
builder (Builder) – Access point for utilizing framework interfaces during setup.
population_data (DataFrame) – DataFrame with columns age_start, age_end, and value.
- Return type:
DataFrame
- Returns:
A copy of population_data with the final age bin adjusted to end at initialization_age_max and its value rescaled accordingly. Returned unchanged if initialization_age_max is not set.
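The truncation described above can be sketched as follows. This is a minimal illustration that takes the maximum age as a plain argument instead of reading it from the builder's configuration. The function name `clip_final_bin` is hypothetical, and dropping bins that start at or above the maximum age is an assumption rather than documented behaviour.

```python
import pandas as pd


def clip_final_bin(population_data, age_max):
    """Truncate the bin containing age_max and scale its value by the width kept."""
    out = population_data.copy()
    if age_max is None:
        # Nothing configured: return an unchanged copy.
        return out

    # Assumption: bins starting at or beyond age_max are dropped entirely.
    out = out[out["age_start"] < age_max].copy()

    # The bin that extends past age_max is shortened and rescaled.
    straddles = out["age_end"] > age_max
    width = out.loc[straddles, "age_end"] - out.loc[straddles, "age_start"]
    kept = age_max - out.loc[straddles, "age_start"]
    out.loc[straddles, "value"] *= kept / width
    out.loc[straddles, "age_end"] = age_max
    return out


bins = pd.DataFrame({
    "age_start": [0.0, 5.0, 10.0],
    "age_end": [5.0, 10.0, 15.0],
    "value": [100.0, 200.0, 300.0],
})
clipped = clip_final_bin(bins, age_max=8.0)
```

With a maximum age of 8, the (5, 10) bin becomes (5, 8) and keeps three fifths of its value.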
- vivarium_public_health.population.data_transformations.validate_crude_birth_rate_data(builder, data_year_max)
Validate that birth rate data covers the simulation time period.
- Parameters:
- Raises:
ValueError – If the simulation end year exceeds data_year_max and extrapolation is not enabled.
- Return type: