Population Data Transformations
Provide tools for handling raw demographic data and transforming it into different distributions for sampling.
- vivarium_public_health.population.data_transformations.add_age_midpoint(data)
Add an age column as the midpoint of age_start and age_end.
- Parameters:
data (DataFrame) – A DataFrame with age_start and age_end columns.
- Return type:
DataFrame
- Returns:
The input DataFrame with an age column added in place.
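As a rough illustration of the midpoint calculation described above, the following sketch reimplements the idea with plain pandas. The function name add_age_midpoint_sketch and the sample data are hypothetical, and the library's actual implementation may differ in detail.

```python
import pandas as pd


def add_age_midpoint_sketch(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in illustrating the documented behavior."""
    # Midpoint of each age bin; the column is added to the input frame itself.
    data["age"] = (data["age_start"] + data["age_end"]) / 2
    return data


bins = pd.DataFrame({"age_start": [0.0, 5.0, 10.0], "age_end": [5.0, 10.0, 15.0]})
result = add_age_midpoint_sketch(bins)
print(result["age"].tolist())  # [2.5, 7.5, 12.5]
```

Because the column is assigned on the input frame, bins itself also carries the new age column afterwards, which is what "added in place" implies.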
- vivarium_public_health.population.data_transformations.assign_demographic_proportions(population_data, include_sex)
Calculate conditional probabilities on the provided population data for sampling.
- Parameters:
population_data (DataFrame) – Table with columns ‘age’, ‘sex’, ‘year’, ‘location’, and ‘value’.
include_sex (str) – ‘Female’, ‘Male’, or ‘Both’. Sexes to include in the distribution.
- Return type:
DataFrame
- Returns:
Table with columns:
‘age’ : Midpoint of the age group
‘age_start’ : Lower bound of the age group
‘age_end’ : Upper bound of the age group
‘sex’ : ‘Male’ or ‘Female’
‘location’ : Location
‘year’ : Year
‘population’ : Total population estimate
‘P(sex, location | age, year)’ : Conditional probability of sex and location given age and year
‘P(sex, location, age | year)’ : Conditional probability of sex, location, and age given year
‘P(age | year, sex, location)’ : Conditional probability of age given year, sex, and location
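The three probability columns differ only in which group each population value is normalized against. The sketch below shows that idea with a pandas groupby on invented data. The helper share_within is hypothetical, the include_sex filtering is omitted, and this is not presented as the library's implementation.

```python
import pandas as pd

# Toy population table; all values are invented for illustration.
pop = pd.DataFrame(
    {
        "age": [2.5, 2.5, 7.5, 7.5],
        "sex": ["Female", "Male", "Female", "Male"],
        "year": [2020] * 4,
        "location": ["Kenya"] * 4,
        "value": [100.0, 300.0, 200.0, 400.0],
    }
)


def share_within(df: pd.DataFrame, given: list) -> pd.Series:
    """Each row's value as a fraction of the total for its `given` group."""
    return df["value"] / df.groupby(given)["value"].transform("sum")


# Normalize within whatever appears to the right of the bar in each column name.
pop["P(sex, location | age, year)"] = share_within(pop, ["age", "year"])
pop["P(sex, location, age | year)"] = share_within(pop, ["year"])
pop["P(age | year, sex, location)"] = share_within(pop, ["year", "sex", "location"])
```

Each column sums to one within its conditioning group, so, for example, the ‘P(sex, location, age | year)’ values for a single year add up to 1.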
- vivarium_public_health.population.data_transformations.rescale_binned_proportions(pop_data, age_start, age_end)
Reshape the distribution so that bin edges fall on age_start and age_end.
- Parameters:
pop_data (DataFrame) – Table with columns ‘age’, ‘age_start’, ‘age_end’, ‘sex’, ‘year’, ‘location’, ‘population’, ‘P(sex, location, age| year)’, ‘P(sex, location | age, year)’, ‘P(age | year, sex, location)’.
age_start (float) – The starting age for the rescaled bins.
age_end (float) – The terminal age for the rescaled bins.
- Return type:
DataFrame
- Returns:
Table with the same columns as pop_data where all bins outside the range (age_start, age_end) have been discarded. If age_start and age_end don’t fall cleanly on age boundaries, the bins in which they lie are clipped and the ‘population’, ‘P(sex, location, age| year)’, and ‘P(age | year, sex, location)’ values are rescaled to reflect their smaller representation.
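To make the clipping arithmetic concrete, here is a minimal sketch for a single bin. It assumes values are spread uniformly within a bin, so a clipped bin keeps a share of its value proportional to the retained width. That uniformity assumption, the helper name clip_bin, and the sample numbers are illustrative and not taken from the library.

```python
def clip_bin(bin_start, bin_end, value, age_start, age_end):
    """Return (new_start, new_end, scaled_value), or None if nothing of the bin remains."""
    new_start = max(bin_start, age_start)
    new_end = min(bin_end, age_end)
    if new_end <= new_start:
        return None  # bin lies entirely outside the requested age range
    # Assumed: value is proportional to the width of the bin that is kept.
    fraction_kept = (new_end - new_start) / (bin_end - bin_start)
    return new_start, new_end, value * fraction_kept


# (age_start, age_end, population) for three invented bins.
bins = [(0.0, 5.0, 500.0), (5.0, 10.0, 400.0), (10.0, 15.0, 300.0)]
rescaled = [clip_bin(s, e, v, age_start=2.0, age_end=10.0) for s, e, v in bins]
rescaled = [b for b in rescaled if b is not None]
```

With age_start=2.0 the first bin is clipped to (2.0, 5.0) and keeps three fifths of its population, the second bin is untouched, and the third is discarded.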
- class vivarium_public_health.population.data_transformations.AgeValues(current, young, old)
- current
Alias for field number 0
- old
Alias for field number 2
- young
Alias for field number 1
- class vivarium_public_health.population.data_transformations.EndpointValues(left, right)
- left
Alias for field number 0
- right
Alias for field number 1
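The "Alias for field number" entries are the kind of documentation generated for named tuple fields. The sketch below rebuilds equivalent classes locally with collections.namedtuple purely to show how positional and named access line up. The sample values are invented and say nothing about how the library uses these classes.

```python
from collections import namedtuple

# Local stand-ins with the same field order as the documented signatures.
AgeValues = namedtuple("AgeValues", ["current", "young", "old"])
EndpointValues = namedtuple("EndpointValues", ["left", "right"])

ages = AgeValues(current=32.5, young=27.5, old=37.5)
endpoints = EndpointValues(left=30.0, right=35.0)

# Field numbers follow the constructor order, not the alphabetical listing.
young_by_index = ages[1]
left, right = endpoints  # named tuples unpack positionally
```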
- vivarium_public_health.population.data_transformations.smooth_ages(simulants, population_data, randomness)
Distribute simulants among ages within their assigned age bins.
- Parameters:
simulants (DataFrame) – Table with columns ‘age’, ‘sex’, and ‘location’.
population_data (DataFrame) – Table with columns ‘age’, ‘sex’, ‘year’, ‘location’, ‘population’, ‘P(sex, location, age| year)’, ‘P(sex, location | age, year)’, ‘P(age | year, sex, location)’.
randomness (RandomnessStream) – Source of random number generation within the vivarium common random number framework.
- Return type:
DataFrame
- Returns:
Table with the same columns as simulants, with ages smoothed out within the age bins.
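The general idea of spreading simulants across their bins can be sketched with a deliberately simple uniform draw, shown below. This is a simplification that may not match the library's smoothing method. The bin edges and simulant ages are invented, and Python's random.Random stands in for the RandomnessStream.

```python
import random

# Hypothetical bin edges keyed by the bin midpoint assigned to each simulant.
bin_edges = {2.5: (0.0, 5.0), 7.5: (5.0, 10.0)}
simulant_ages = [2.5, 2.5, 7.5, 7.5, 7.5]

rng = random.Random(12345)  # stand-in for the framework's RandomnessStream


def smooth_uniformly(ages, edges, rng):
    """Replace each bin midpoint with a uniform draw from that simulant's own bin."""
    smoothed = []
    for midpoint in ages:
        start, end = edges[midpoint]
        smoothed.append(start + rng.random() * (end - start))
    return smoothed


smoothed_ages = smooth_uniformly(simulant_ages, bin_edges, rng)
```

Every simulant stays inside the bin it was assigned to. Only its position within that bin changes.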
- vivarium_public_health.population.data_transformations.get_cause_deleted_mortality_rate(all_cause_mortality_rate, list_of_csmrs)
Compute the cause-deleted mortality rate by subtracting individual CSMRs.
- Parameters:
- Return type:
DataFrame
- Returns:
DataFrame with the same index columns and a death_due_to_other_causes column containing the residual mortality rate after subtracting all provided cause-specific rates.
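The subtraction itself can be sketched as follows. The parameter descriptions are not available here, so the input layout is an assumption: each table is taken to hold demographic index columns plus a value column with the rate. The index column names, the function name cause_deleted_sketch, and all rates are invented for illustration.

```python
import pandas as pd

index_cols = ["sex", "age_start", "age_end"]  # assumed demographic index columns

# Invented rates for illustration only.
acmr = pd.DataFrame(
    {"sex": ["Female", "Male"], "age_start": [0, 0], "age_end": [5, 5], "value": [0.010, 0.012]}
)
csmr_a = acmr.assign(value=[0.002, 0.003])
csmr_b = acmr.assign(value=[0.001, 0.001])


def cause_deleted_sketch(all_cause, csmrs):
    """Subtract each cause-specific rate from the all-cause rate, aligned on the index columns."""
    residual = all_cause.set_index(index_cols)["value"]
    for csmr in csmrs:
        residual = residual - csmr.set_index(index_cols)["value"]
    return residual.rename("death_due_to_other_causes").reset_index()


other_causes = cause_deleted_sketch(acmr, [csmr_a, csmr_b])
```

Setting the demographic columns as the index before subtracting means rows are matched by demographic group rather than by row position.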