Random Numbers in vivarium
This module contains classes and functions supporting common random numbers.
Vivarium has some peculiar needs around randomness. We need to be totally consistent between branches in a comparison. For example, if a simulant gets hit by a truck in the base case, it must be hit by that same truck in the counterfactual at exactly the same moment, unless the counterfactual explicitly deals with traffic accidents. That means the system can’t rely on standard global randomness sources, because small changes to the number of bits consumed or to the order in which randomness-consuming operations occur will cause the branches to diverge. The current approach is to generate hash-based seeds where the key is the simulation time, the simulant’s id, the draw number, and a unique id for the decision point which needs the randomness. These seeds are then used to generate numpy.random.RandomState objects that can be used to create pseudorandom numbers in a repeatable manner.
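The idea of hash-based seeds can be illustrated with a minimal sketch. The helper names seed_from_key and draw, the use of SHA-1, and the way the key parts are joined are assumptions made for illustration; they are not taken from the module's source.

```python
import hashlib

import numpy as np


def seed_from_key(*parts) -> int:
    """Hash the key parts into an integer that RandomState accepts as a seed."""
    key = "_".join(str(part) for part in parts)
    digest = hashlib.sha1(key.encode("utf8")).hexdigest()
    # RandomState seeds must fit in an unsigned 32-bit integer.
    return int(digest, 16) % 4294967295


def draw(decision_point, sim_time, simulant_id, draw_number) -> float:
    """Return one uniform draw that depends only on its key (hypothetical helper)."""
    seed = seed_from_key(decision_point, sim_time, simulant_id, draw_number)
    return np.random.RandomState(seed=seed).uniform()


# The same key gives the same number no matter what was drawn in between.
first = draw("hit_by_truck", "2005-07-01", 42, 0)
_ = draw("gets_disease", "2005-07-01", 42, 0)  # an unrelated draw
second = draw("hit_by_truck", "2005-07-01", 42, 0)
```

Because every draw is rebuilt from its own key, adding or removing other randomness-consuming operations does not change the value a given simulant receives at a given decision point.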

vivarium.framework.randomness.RESIDUAL_CHOICE
A probability placeholder to be used in an unnormalized array of weights to absorb leftover weight so that the array sums to unity. For example:
[0.2, 0.2, RESIDUAL_CHOICE] => [0.2, 0.2, 0.6]
Note
Currently this object is only used in the choice function of this module.
Type: object
For more information, see the Common Random Numbers concept note.
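The placeholder behaviour described above can be sketched as follows. RESIDUAL and fill_residual are hypothetical stand-ins written for this example; they are not the module's own objects, and the real validation may differ.

```python
RESIDUAL = object()  # hypothetical stand-in for RESIDUAL_CHOICE


def fill_residual(weights):
    """Replace a single RESIDUAL placeholder with the weight needed to sum to 1."""
    n_placeholders = sum(1 for w in weights if w is RESIDUAL)
    if n_placeholders > 1:
        raise ValueError("At most one residual placeholder is allowed per row.")
    if n_placeholders == 0:
        return list(weights)
    known_total = sum(w for w in weights if w is not RESIDUAL)
    if known_total > 1:
        raise ValueError("The explicit weights already sum to more than 1.")
    return [1 - known_total if w is RESIDUAL else w for w in weights]


filled = fill_residual([0.2, 0.2, RESIDUAL])
```

The two error branches mirror the conditions listed under Raises for the choice function: more than one placeholder in a row, or explicit weights that leave no room for a residual.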

exception vivarium.framework.randomness.RandomnessError
Exception raised for inconsistencies in random number and choice generation.

class vivarium.framework.randomness.IndexMap(map_size=1000000)
A key-index mapping with a simple vectorized hash and vectorized lookups.

TEN_DIGIT_MODULUS = 10000000000

update(new_keys)
Adds the new keys to the mapping.
Parameters: new_keys (Index) – The new index to hash.

hash_(keys, salt=0)
Hashes the given index into an integer index in the range [0, self.stride].
Parameters:
keys (Index) – The new index to hash.
salt (int) – An integer used to perturb the hash in a deterministic way. Useful in dealing with collisions.
Returns: A pandas series indexed by the given keys and whose values take on integers in the range [0, self.stride]. Duplicates may appear and should be dealt with by the calling code.
Return type: pd.Series
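One way calling code might deal with duplicate hash values is to rehash only the colliding keys with a new salt. The sketch below illustrates that pattern under stated assumptions: toy_hash is a deliberately simple stand-in and not the hash used by IndexMap, and assign_slots is a hypothetical helper.

```python
import pandas as pd


def toy_hash(keys: pd.Index, salt: int = 0, map_size: int = 10) -> pd.Series:
    """Toy stand-in for a salted hash: maps integer keys into [0, map_size)."""
    values = pd.Series(keys, index=keys, dtype="int64")
    return (values * 7 + salt * 3) % map_size


def assign_slots(keys: pd.Index, map_size: int = 10) -> pd.Series:
    """Give every key its own slot, rehashing collisions with a new salt."""
    slots = {}
    used = set()
    pending = pd.Index(keys)
    for salt in range(map_size):
        if len(pending) == 0:
            break
        for key, slot in toy_hash(pending, salt, map_size).items():
            if slot not in used:
                used.add(slot)
                slots[key] = slot
        # Keys that lost a collision are retried with the next salt.
        pending = pending[~pending.isin(list(slots))]
    if len(pending) > 0:
        raise ValueError("Could not place every key; try a larger map_size.")
    return pd.Series(slots)


slots = assign_slots(pd.Index([0, 10, 20, 3]))
```

Here the keys 0, 10, and 20 all hash to the same slot with salt 0, so two of them are rehashed in later rounds until each key holds a distinct slot.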

convert_to_ten_digit_int(column)
Converts a column of datetimes, integers, or floats into a column of 10 digit integers.
Parameters: column (Series) – A series of datetimes, integers, or floats.
Returns: A series of ten digit integers based on the input data.
Return type: pd.Series
Raises: RandomnessError – If the column contains data that is neither datetime-like nor numeric.

static digit(m, n)
Returns the nth digit of each number in m.
Return type: Union[int, Series]

static clip_to_seconds(m)
Clips UTC datetime in nanoseconds to seconds.
Return type: Union[int, Series]
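The two static helpers can be approximated with integer arithmetic. This is a sketch, not the library's code: it assumes that digits are counted from zero starting at the ones place, which the description above does not specify, and that datetimes are represented as nanoseconds since the epoch.

```python
import pandas as pd


def digit(m, n):
    """Return the nth decimal digit of m, counting from 0 at the ones place (assumed)."""
    return (m // 10 ** n) % 10


def clip_to_seconds(m):
    """Convert nanoseconds since the epoch to whole seconds."""
    return m // 1_000_000_000


# Timestamp.value is the number of nanoseconds since the epoch.
nanos = pd.Series(
    [
        pd.Timestamp("2005-07-01 00:00:01.5").value,
        pd.Timestamp("2005-07-01 00:00:02").value,
    ]
)
seconds = clip_to_seconds(nanos)
tens_digits = digit(pd.Series([1234, 56]), 1)
```

Both functions work unchanged on a scalar int or on a pandas Series, which matches the Union[int, Series] return type listed above.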


vivarium.framework.randomness.random(key, index, index_map=None)
Produces an indexed pandas.Series of uniformly distributed random numbers.
The index passed in typically corresponds to a subset of rows in a pandas.DataFrame for which a probabilistic draw needs to be made.
Parameters:
key (str) – A string used to create a seed for the random number generation.
index (Index) – The index used for the returned series.
index_map (Optional[IndexMap]) – A mapping between the provided index (which may contain ints, floats, datetimes or any arbitrary combination of them) and an integer index into the random number array.
Returns: A series of random numbers indexed by the provided index.
Return type: pd.Series
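The role of the mapping into "the random number array" can be sketched as follows: a full table of draws is generated from the key, and each simulant reads the entry at its own slot, so a simulant receives the same number whether it is queried alone or with the whole population. The names toy_random and slot_of, and the SHA-1 seed, are assumptions for illustration only.

```python
import hashlib

import numpy as np
import pandas as pd


def toy_random(key: str, index: pd.Index, slot_of: pd.Series, map_size: int) -> pd.Series:
    """Sketch of key-seeded draws looked up by a per-simulant slot."""
    seed = int(hashlib.sha1(key.encode("utf8")).hexdigest(), 16) % 4294967295
    table = np.random.RandomState(seed=seed).random_sample(map_size)
    slots = slot_of.loc[index].to_numpy()
    return pd.Series(table[slots], index=index)


population = pd.Index([101, 102, 103, 104])
# Hypothetical slot assignment; a real index map would compute this by hashing.
slot_of = pd.Series([3, 0, 7, 5], index=population)

everyone = toy_random("gets_disease_2005", population, slot_of, map_size=10)
subset = toy_random("gets_disease_2005", population[[1, 3]], slot_of, map_size=10)
```

The values in subset match the corresponding entries of everyone, which is the property that keeps branches of a comparison aligned when they query different subsets of simulants.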

vivarium.framework.randomness.get_hash(key)
Gets a hash of the provided key.
Parameters: key (str) – A string used to create a seed for the random number generator.
Returns: A hash of the provided key.
Return type: int

vivarium.framework.randomness.choice(key, index, choices, p=None, index_map=None)
Decides between a weighted or unweighted set of choices.
Given a set of choices with or without corresponding weights, returns an indexed set of decisions from those choices. This is simply a vectorized way to make decisions with some bookkeeping.
Parameters:
key (str) – A string used to create a seed for the random number generation.
index (pandas.Index) – An index whose length is the number of random draws made and which indexes the returned pandas.Series.
choices (Union[List[~T], Tuple, ndarray, Series]) – A set of options to choose from.
p (Union[List[~T], Tuple, ndarray, Series, None]) – The relative weights of the choices. Can be either a 1d array of the same length as choices or a 2d array with len(index) rows and len(choices) columns. In the 1d case, the same set of weights is used to decide among the choices for every item in the index. In the 2d case, each row in p contains a separate set of weights for the corresponding item in the index.
index_map (Optional[IndexMap]) – A mapping between the provided index (which may contain ints, floats, datetimes or any arbitrary combination of them) and an integer index into the random number array.
Returns: An indexed set of decisions from among the available choices.
Return type: pd.Series
Raises: RandomnessError – If any row in p contains RESIDUAL_CHOICE and the remaining weights in the row are not normalized or any row of p contains more than one reference to RESIDUAL_CHOICE.
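A vectorized weighted choice of this kind can be sketched by comparing each uniform draw against the cumulative weights of its row. The function weighted_choice below is a hypothetical illustration that takes the draws as an argument; it omits RESIDUAL_CHOICE handling and is not the module's implementation.

```python
import numpy as np
import pandas as pd


def weighted_choice(draws: pd.Series, choices, p=None) -> pd.Series:
    """Pick one choice per draw using 1d (shared) or 2d (per-row) weights."""
    choices = np.asarray(choices)
    n_choices = len(choices)
    weights = np.ones(n_choices) if p is None else np.asarray(p, dtype=float)
    if weights.ndim == 1:
        # Share the same weights across every row of the index.
        weights = np.broadcast_to(weights, (len(draws), n_choices))
    weights = weights / weights.sum(axis=1, keepdims=True)
    bins = np.cumsum(weights, axis=1)
    # Count how many cumulative bin edges each draw exceeds.
    picked = (draws.to_numpy()[:, None] > bins).sum(axis=1)
    picked = np.minimum(picked, n_choices - 1)  # guard against float rounding
    return pd.Series(choices[picked], index=draws.index)


draws = pd.Series([0.1, 0.4, 0.95], index=[10, 11, 12])
shared = weighted_choice(draws, ["a", "b", "c"], p=[0.2, 0.3, 0.5])
per_row = weighted_choice(
    draws, ["a", "b", "c"], p=[[1, 0, 0], [0, 0, 1], [0, 1, 0]]
)
```

In the 1d call every row uses the weights 0.2, 0.3, and 0.5; in the 2d call each row of p applies only to the item at the same position in the index.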

vivarium.framework.randomness.filter_for_probability(key, population, probability, index_map=None)
Decide an event outcome for each individual in a population from probabilities.
Given a population or its index and an array of associated probabilities for some event to happen, we create and return the subpopulation for whom the event occurred.
Parameters:
key (str) – A string used to create a seed for the random number generation.
population (Union[DataFrame, Series, Index]) – A view on the simulants for which we are determining the outcome of an event.
probability (Union[List[~T], Tuple, ndarray, Series]) – A 1d list of probabilities of the event under consideration occurring which corresponds (i.e. len(population) == len(probability)) to the population array passed in.
index_map (Optional[IndexMap]) – A mapping between the provided index (which may contain ints, floats, datetimes or any arbitrary combination of them) and an integer index into the random number array.
Returns: The subpopulation of the simulants for whom the event occurred. The return type will be the same as type(population).
Return type: pd.core.generic.PandasObject
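The filtering step itself amounts to comparing one uniform draw per simulant against that simulant's probability. In the sketch below, filter_by_draws is a hypothetical helper and the draws are supplied by hand so the outcome is easy to follow; in the module they would come from the common random number machinery.

```python
import numpy as np
import pandas as pd


def filter_by_draws(population, probability, draws):
    """Keep the rows whose draw falls below the event probability."""
    mask = np.asarray(draws) < np.asarray(probability)
    return population[mask]


people = pd.DataFrame({"age": [30, 45, 60]}, index=[5, 6, 7])
probability = [0.9, 0.1, 0.5]
draws = [0.5, 0.5, 0.4]

affected = filter_by_draws(people, probability, draws)
affected_index = filter_by_draws(people.index, probability, draws)
```

Passing a DataFrame returns a DataFrame and passing an Index returns an Index, which is the sense in which the return type follows type(population).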

class vivarium.framework.randomness.RandomnessStream(key, clock, seed, index_map=None, manager=None, for_initialization=False)
A stream for producing common random numbers.
RandomnessStream objects provide an interface to Vivarium’s common random number generation. They provide a number of methods for doing common simulation tasks that require random numbers like making decisions among a number of choices.

key
The name of the randomness stream.

clock
A way to get the current simulation time.

seed
An extra number used to seed the random number generation.
Notes
Should not be constructed by client code.
Simulation components get RandomnessStream objects by requesting them from the builder provided to them during the setup phase. For example:
    class CeamComponent:
        def setup(self, builder):
            self.randomness_stream = builder.randomness.get_stream('stream_name')
See also
engine.Builder

name

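get_draw(index, additional_key=None)
Get an indexed sequence of floats pulled from a uniform distribution over [0.0, 1.0).
Parameters:
index (Index) – An index whose length is the number of random draws made and which indexes the returned pandas.Series.
additional_key (Optional[Any]) – Any additional information used to seed random number generation.
Returns: A series of random numbers indexed by the provided pandas.Index.
Return type: pd.Series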

get_seed(additional_key=None)
Get a randomly generated seed for use with external randomness tools.
Parameters: additional_key (Optional[Any]) – Any additional information used to create the seed.
Returns: A seed for a random number generation that is linked to Vivarium’s common random number framework.
Return type: int

filter_for_rate(population, rate, additional_key=None)
Decide an event outcome for each individual in a population from rates.
Given a population or its index and an array of associated rates for some event to happen, we create and return the subpopulation for whom the event occurred.
Parameters:
population (Union[DataFrame, Series, Index]) – A view on the simulants for which we are determining the outcome of an event.
rate (Union[List[~T], Tuple, ndarray, Series]) – A 1d list of rates of the event under consideration occurring which corresponds (i.e. len(population) == len(rate)) to the population view passed in. The rates must be scaled to the simulation timestep size either manually or as a postprocessing step in a rate pipeline.
additional_key (Optional[Any]) – Any additional information used to create the seed.
Returns: The index of the simulants for whom the event occurred.
Return type: Index
See also
framework.values()
Value/rate pipeline management module.
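Filtering on rates requires turning each timestep-scaled rate into a probability first. The sketch below assumes the common exponential conversion p = 1 - exp(-rate) and a linear scaling of annual rates by the step length; both helper names and both formulas are assumptions for illustration, since this page does not state how the conversion is done.

```python
import numpy as np


def scale_rate(annual_rate, step_days):
    """Scale an annual rate to a timestep of the given length (assumed linear)."""
    return np.asarray(annual_rate, dtype=float) * step_days / 365.25


def rate_to_probability(rate):
    """Convert a timestep-scaled rate to a probability (assumed exponential form)."""
    return 1 - np.exp(-np.asarray(rate, dtype=float))


annual_rates = [0.1, 0.5]
step_probability = rate_to_probability(scale_rate(annual_rates, step_days=30.5))
```

Once the rates are expressed as probabilities, the decision for each simulant reduces to the same comparison used when filtering on probabilities directly.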

filter_for_probability(population, probability, additional_key=None)
Decide an event outcome for each individual in a population from probabilities.
Given a population or its index and an array of associated probabilities for some event to happen, we create and return the subpopulation for whom the event occurred.
Parameters:
population (Union[DataFrame, Series, Index]) – A view on the simulants for which we are determining the outcome of an event.
probability (Union[List[~T], Tuple, ndarray, Series]) – A 1d list of probabilities of the event under consideration occurring which corresponds (i.e. len(population) == len(probability)) to the population view passed in.
additional_key (Optional[Any]) – Any additional information used to create the seed.
Returns: The subpopulation of the simulants for whom the event occurred. The return type will be the same as type(population).
Return type: Index

choice(index, choices, p=None, additional_key=None)
Decides between a weighted or unweighted set of choices.
Given a set of choices with or without corresponding weights, returns an indexed set of decisions from those choices. This is simply a vectorized way to make decisions with some bookkeeping.
Parameters:
index (Index) – An index whose length is the number of random draws made and which indexes the returned pandas.Series.
choices (Union[List[~T], Tuple, ndarray, Series]) – A set of options to choose from.
p (Union[List[~T], Tuple, ndarray, Series, None]) – The relative weights of the choices. Can be either a 1d array of the same length as choices or a 2d array with len(index) rows and len(choices) columns. In the 1d case, the same set of weights is used to decide among the choices for every item in the index. In the 2d case, each row in p contains a separate set of weights for the corresponding item in the index.
additional_key (Optional[Any]) – Any additional information used to seed random number generation.
Returns: An indexed set of decisions from among the available choices.
Return type: pd.Series
Raises: RandomnessError – If any row in p contains RESIDUAL_CHOICE and the remaining weights in the row are not normalized or any row of p contains more than one reference to RESIDUAL_CHOICE.


class vivarium.framework.randomness.RandomnessManager
Access point for common random number generation.

configuration_defaults = {'randomness': {'additional_seed': None, 'key_columns': ['entrance_time'], 'map_size': 1000000, 'random_seed': 0}}

name

get_randomness_stream(decision_point, for_initialization=False)
Provides a new source of random numbers for the given decision point.
Parameters:
decision_point (str) – A unique identifier for a stream of random numbers. Typically represents a decision that needs to be made each time step like ‘moves_left’ or ‘gets_disease’.
for_initialization (bool) – A flag indicating whether this stream is used to generate key initialization information that will be used to identify simulants in the Common Random Number framework. These streams cannot be copied and should only be used to generate the state table columns specified in builder.configuration.randomness.key_columns.
Raises: RandomnessError – If another location in the simulation has already created a randomness stream with the same identifier.
Return type: RandomnessStream

register_simulants(simulants)
Adds new simulants to the randomness mapping.
Parameters: simulants (DataFrame) – A table with state data representing the new simulants. Each simulant should pass through this function exactly once.
Raises: RandomnessError – If the provided table does not contain all key columns specified in the configuration.
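The two error conditions documented for the manager, a repeated decision point and a missing key column, can be sketched with a small stand-in class. ToyManager and the local RandomnessError below are hypothetical and exist only for this example; the real manager returns stream objects and maintains an index map.

```python
import pandas as pd


class RandomnessError(Exception):
    """Local stand-in for vivarium.framework.randomness.RandomnessError."""


class ToyManager:
    """Hypothetical manager that only performs the documented checks."""

    def __init__(self, key_columns):
        self.key_columns = list(key_columns)
        self._decision_points = set()

    def get_randomness_stream(self, decision_point: str) -> str:
        if decision_point in self._decision_points:
            raise RandomnessError(f"Stream '{decision_point}' already exists.")
        self._decision_points.add(decision_point)
        return decision_point  # a real manager would return a stream object

    def register_simulants(self, simulants: pd.DataFrame) -> None:
        missing = [c for c in self.key_columns if c not in simulants.columns]
        if missing:
            raise RandomnessError(f"Missing key columns: {missing}")


manager = ToyManager(key_columns=["entrance_time"])
manager.get_randomness_stream("gets_disease")
manager.register_simulants(
    pd.DataFrame({"entrance_time": pd.to_datetime(["2005-07-01"])})
)
```

Requesting "gets_disease" a second time, or registering a table without the entrance_time column, raises the stand-in RandomnessError.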


class vivarium.framework.randomness.RandomnessInterface(manager)
get_stream(decision_point, for_initialization=False)
Provides a new source of random numbers for the given decision point.
vivarium provides a framework for Common Random Numbers which allows for variance reduction when modeling counterfactual scenarios. Users interested in causal analysis and comparisons between simulation scenarios should be careful to use randomness streams provided by the framework wherever randomness is employed.
Parameters:
decision_point (str) – A unique identifier for a stream of random numbers. Typically represents a decision that needs to be made each time step like ‘moves_left’ or ‘gets_disease’.
for_initialization (bool) – A flag indicating whether this stream is used to generate key initialization information that will be used to identify simulants in the Common Random Number framework. These streams cannot be copied and should only be used to generate the state table columns specified in builder.configuration.randomness.key_columns.
Returns: An entry point into the Common Random Number generation framework. The stream provides vectorized access to random numbers and a few other utilities.
Return type: RandomnessStream
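The variance reduction that common random numbers provide can be seen in a small numerical experiment. The sketch below uses plain NumPy and is independent of vivarium; the population size, probabilities, and number of replications are arbitrary values chosen for illustration.

```python
import numpy as np


def cases(draws, probability):
    """Count simulants whose draw falls below the event probability."""
    return int((draws < probability).sum())


n_simulants = 10_000
base_p, intervention_p = 0.10, 0.08

crn_diffs, independent_diffs = [], []
for seed in range(200):
    rng = np.random.RandomState(seed)
    shared = rng.uniform(size=n_simulants)
    other = rng.uniform(size=n_simulants)
    # Common random numbers: both scenarios reuse the same draws.
    crn_diffs.append(cases(shared, base_p) - cases(shared, intervention_p))
    # Independent draws: each scenario gets its own randomness.
    independent_diffs.append(cases(shared, base_p) - cases(other, intervention_p))

crn_spread = np.std(crn_diffs)
independent_spread = np.std(independent_diffs)
```

With shared draws, the estimated number of averted cases varies far less between replications, and it is never negative, because a simulant who avoids the event at the higher probability also avoids it at the lower one.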

register_simulants(simulants)
Registers simulants with the Common Random Number Framework.
Parameters: simulants (DataFrame) – A section of the state table with new simulants and at least the columns specified in builder.configuration.randomness.key_columns. This function should be called as soon as the key columns are generated.
