Population Data Transformations

This module contains tools for handling raw demographic data and transforming it into different distributions for sampling.

vivarium_public_health.population.data_transformations.assign_demographic_proportions(population_data, include_sex)

Calculates conditional probabilities on the provided population data for sampling.

Parameters:
  • population_data (DataFrame) – Table with columns ‘age’, ‘sex’, ‘year’, ‘location’, and ‘value’

  • include_sex (str) – Sexes to include in the distribution: ‘Female’, ‘Male’, or ‘Both’.

Returns:

Table with columns:

  • ‘age’ – Midpoint of the age group.

  • ‘age_start’ – Lower bound of the age group.

  • ‘age_end’ – Upper bound of the age group.

  • ‘sex’ – ‘Male’ or ‘Female’.

  • ‘location’ – Location.

  • ‘year’ – Year.

  • ‘population’ – Total population estimate.

  • ‘P(sex, location | age, year)’ – Conditional probability of sex and location given age and year.

  • ‘P(sex, location, age | year)’ – Conditional probability of sex, location, and age given year.

  • ‘P(age | year, sex, location)’ – Conditional probability of age given year, sex, and location.

Return type:

pandas.DataFrame
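
The three conditional probabilities described above can be illustrated with plain pandas group sums. This is a minimal sketch, not the library's implementation: the function name demographic_proportions_sketch, the toy table, and the location name are hypothetical, and the sketch ignores include_sex and does not build the ‘age_start’ and ‘age_end’ columns.

```python
import pandas as pd


def demographic_proportions_sketch(population_data: pd.DataFrame) -> pd.DataFrame:
    """Add three conditional-probability columns (illustrative only)."""
    # Assumption: the 'value' column holds the population count per row.
    out = population_data.rename(columns={"value": "population"}).copy()
    pop = out["population"]

    # P(sex, location | age, year): share of each row within its (age, year) group.
    out["P(sex, location | age, year)"] = (
        pop / out.groupby(["age", "year"])["population"].transform("sum")
    )
    # P(sex, location, age | year): share of each row within its year.
    out["P(sex, location, age | year)"] = (
        pop / out.groupby("year")["population"].transform("sum")
    )
    # P(age | year, sex, location): share of each row within its
    # (year, sex, location) group.
    out["P(age | year, sex, location)"] = (
        pop / out.groupby(["year", "sex", "location"])["population"].transform("sum")
    )
    return out


# Hypothetical toy data: two age groups, two sexes, one year, one location.
toy = pd.DataFrame(
    {
        "age": [2.5, 2.5, 7.5, 7.5],
        "sex": ["Male", "Female", "Male", "Female"],
        "year": [2020, 2020, 2020, 2020],
        "location": ["Example", "Example", "Example", "Example"],
        "value": [100.0, 100.0, 300.0, 500.0],
    }
)
result = demographic_proportions_sketch(toy)
print(result)
```

Each probability column sums to 1 within the group named after the bar, which is a convenient sanity check before sampling from any of them.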

vivarium_public_health.population.data_transformations.rescale_binned_proportions(pop_data, age_start, age_end)

Reshapes the distribution so that the bin edges fall on age_start and age_end.

Parameters:
  • pop_data (DataFrame) – Table with columns ‘age’, ‘age_start’, ‘age_end’, ‘sex’, ‘year’, ‘location’, ‘population’, ‘P(sex, location, age | year)’, ‘P(sex, location | age, year)’, ‘P(age | year, sex, location)’

  • age_start (float) – The starting age for the rescaled bins.

  • age_end (float) – The terminal age for the rescaled bins.

Returns:

Table with the same columns as pop_data, where all bins outside the range (age_start, age_end) have been discarded. If age_start and age_end don’t fall cleanly on age boundaries, the bins in which they lie are clipped and the ‘population’, ‘P(sex, location, age | year)’, and ‘P(age | year, sex, location)’ values are rescaled to reflect their smaller representation.

Return type:

pandas.DataFrame
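
The clipping and rescaling described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the helper clip_bins_sketch and the toy table are hypothetical, and the sketch assumes that people are spread uniformly within a bin, so a clipped bin keeps a share of its value equal to the fraction of its width that remains.

```python
import pandas as pd

# Columns that the documentation says are rescaled when a bin is clipped.
SCALED_COLUMNS = [
    "population",
    "P(sex, location, age | year)",
    "P(age | year, sex, location)",
]


def clip_bins_sketch(pop_data: pd.DataFrame, age_start: float, age_end: float) -> pd.DataFrame:
    """Discard bins outside (age_start, age_end) and clip the edge bins."""
    # Keep only bins that overlap the requested age range.
    overlaps = (pop_data["age_end"] > age_start) & (pop_data["age_start"] < age_end)
    out = pop_data.loc[overlaps].copy()

    # Clip the bin edges and record what fraction of each bin survives.
    old_width = out["age_end"] - out["age_start"]
    out["age_start"] = out["age_start"].clip(lower=age_start)
    out["age_end"] = out["age_end"].clip(upper=age_end)
    scale = (out["age_end"] - out["age_start"]) / old_width

    # Assumption: uniform density within a bin, so values scale with width.
    for column in SCALED_COLUMNS:
        if column in out.columns:
            out[column] = out[column] * scale

    # Recompute the midpoint of each (possibly clipped) bin.
    out["age"] = (out["age_start"] + out["age_end"]) / 2
    return out


# Hypothetical toy data with four five-year bins.
toy = pd.DataFrame(
    {
        "age": [2.5, 7.5, 12.5, 17.5],
        "age_start": [0.0, 5.0, 10.0, 15.0],
        "age_end": [5.0, 10.0, 15.0, 20.0],
        "population": [100.0, 200.0, 300.0, 400.0],
    }
)
clipped = clip_bins_sketch(toy, age_start=2.0, age_end=12.0)
print(clipped)
```

With these inputs the 15 to 20 bin is dropped, the 0 to 5 bin keeps three fifths of its population, and the 10 to 15 bin keeps two fifths.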

class vivarium_public_health.population.data_transformations.AgeValues(current, young, old)
current

Alias for field number 0

old

Alias for field number 2

young

Alias for field number 1

class vivarium_public_health.population.data_transformations.EndpointValues(left, right)
left

Alias for field number 0

right

Alias for field number 1
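
The "Alias for field number N" wording is what generated documentation typically shows for named-tuple fields, so AgeValues and EndpointValues can be read positionally or by name. The sketch below assumes that they behave like collections.namedtuple types and re-creates equivalent tuples locally instead of importing them; the numeric values are arbitrary placeholders.

```python
from collections import namedtuple

# Local stand-ins with the documented field order (an assumption about
# how the real classes are defined).
AgeValues = namedtuple("AgeValues", ["current", "young", "old"])
EndpointValues = namedtuple("EndpointValues", ["left", "right"])

values = AgeValues(current=1.0, young=2.0, old=3.0)
endpoints = EndpointValues(left=4.0, right=5.0)

# Fields are reachable by name or by the documented position.
print(values.current, values[0])   # field number 0
print(values.young, values[1])     # field number 1
print(values.old, values[2])       # field number 2

# Named tuples also unpack like ordinary tuples.
left, right = endpoints
print(left, right)
```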

vivarium_public_health.population.data_transformations.smooth_ages(simulants, population_data, randomness)

Distributes simulants among ages within their assigned age bins.

Parameters:
  • simulants (DataFrame) – Table with columns ‘age’, ‘sex’, and ‘location’

  • population_data (DataFrame) – Table with columns ‘age’, ‘sex’, ‘year’, ‘location’, ‘population’, ‘P(sex, location, age | year)’, ‘P(sex, location | age, year)’, ‘P(age | year, sex, location)’

  • randomness (RandomnessStream) – Source of random number generation within the vivarium common random number framework.

Returns:

Table with the same columns as simulants, with ages smoothed out within the age bins.

Return type:

pandas.DataFrame
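
As a rough illustration of spreading simulants within their age bins, the sketch below replaces each bin midpoint with a uniform draw from that bin. It is a deliberate simplification and not the library's method: the helper smooth_ages_uniform_sketch is hypothetical, it uses a seeded NumPy generator in place of a RandomnessStream, and it assumes a bins table that carries ‘age_start’ and ‘age_end’ for each midpoint ‘age’.

```python
import numpy as np
import pandas as pd


def smooth_ages_uniform_sketch(
    simulants: pd.DataFrame, bins: pd.DataFrame, seed: int = 0
) -> pd.DataFrame:
    """Replace bin-midpoint ages with uniform draws inside each bin."""
    # Stand-in for the framework's randomness source (an assumption).
    rng = np.random.default_rng(seed)

    # Look up each simulant's bin edges from its assigned midpoint age.
    edges = bins.drop_duplicates("age").set_index("age")
    start = simulants["age"].map(edges["age_start"])
    end = simulants["age"].map(edges["age_end"])

    out = simulants.copy()
    # Uniform draw in [start, end) for every simulant.
    out["age"] = start + (end - start) * rng.random(len(simulants))
    return out


# Hypothetical toy data: two bins and five simulants at bin midpoints.
bins = pd.DataFrame(
    {"age": [2.5, 7.5], "age_start": [0.0, 5.0], "age_end": [5.0, 10.0]}
)
simulants = pd.DataFrame(
    {
        "age": [2.5, 2.5, 7.5, 7.5, 7.5],
        "sex": ["Male", "Female", "Male", "Female", "Male"],
        "location": ["Example"] * 5,
    }
)
smoothed = smooth_ages_uniform_sketch(simulants, bins, seed=42)
print(smoothed)
```

A uniform draw leaves a step at every bin edge; a method that also accounts for the neighboring bins would give a smoother age distribution.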

vivarium_public_health.population.data_transformations.get_cause_deleted_mortality_rate(all_cause_mortality_rate, list_of_csmrs)
vivarium_public_health.population.data_transformations.load_population_structure(builder)
vivarium_public_health.population.data_transformations.get_live_births_per_year(builder)
vivarium_public_health.population.data_transformations.rescale_final_age_bin(builder, population_data)
vivarium_public_health.population.data_transformations.validate_crude_birth_rate_data(builder, data_year_max)