The Data Artifact

This module provides tools for interacting with data artifacts.

A data artifact is an archive on disk intended to package up all data relevant to a particular simulation. This module provides a class to wrap that archive file for convenient access and inspection.

exception vivarium.framework.artifact.artifact.ArtifactException[source]

Exception raised for inconsistent use of the data artifact.

class vivarium.framework.artifact.artifact.Artifact(path, filter_terms=None)[source]

An interface for interacting with vivarium artifacts.

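Parameters:
  • path – The path to the artifact file.

  • filter_terms (List[str]) – Filters that will be applied to the requested data on loads. Defaults to None.

property path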

The path to the artifact file.

property keys: List[str]

A list of all the keys contained within the artifact.

property filter_terms: List[str]

Filters that will be applied to the requested data on loads.

static create_hdf_with_keyspace(path)[source]

Creates the artifact HDF file and adds a node to track keys.

Parameters:

path (Path) – The path at which to create the artifact HDF file.

load(entity_key)[source]

Loads the data associated with provided entity_key.

Parameters:

entity_key (str) – The key associated with the expected data.

Returns:

The expected data: either a standard Python object, a pandas.DataFrame, or a pandas.Series.

Return type:

Any

Raises:

ArtifactException – If the provided key is not in the artifact.

write(entity_key, data)[source]

Writes data into the artifact and binds it to the provided key.

Parameters:
  • entity_key (str) – The key associated with the provided data.

  • data (Any) – The data to write. Accepted formats are pandas.Series, pandas.DataFrame, or standard Python types and containers.

Raises:

ArtifactException – If the provided key already exists in the artifact.

remove(entity_key)[source]

Removes data associated with the provided key from the artifact.

Parameters:

entity_key (str) – The key associated with the data to remove.

Raises:

ArtifactException – If the key is not present in the artifact.

replace(entity_key, data)[source]

Replaces the artifact data at the provided key with the new data.

Parameters:
  • entity_key (str) – The key for which the data should be overwritten.

  • data (Any) – The data to write. Accepted formats are pandas.Series, pandas.DataFrame, or standard Python types and containers.

Raises:

ArtifactException – If the provided key does not already exist in the artifact.
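The error rules documented for load, write, remove, and replace can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions: InMemoryArtifact and SketchArtifactException are hypothetical names, the data lives in a plain dict rather than an HDF file, and the key "population.structure" is made up for illustration. It is not vivarium's implementation; it mirrors only the documented contract about when an exception is raised.

```python
from typing import Any, Dict, List


class SketchArtifactException(Exception):
    """Hypothetical stand-in for ArtifactException."""


class InMemoryArtifact:
    """Hypothetical dict-backed stand-in mirroring the documented key rules."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    @property
    def keys(self) -> List[str]:
        return list(self._data)

    def load(self, entity_key: str) -> Any:
        # Loading a missing key is an error.
        if entity_key not in self._data:
            raise SketchArtifactException(f"{entity_key} is not in the artifact.")
        return self._data[entity_key]

    def write(self, entity_key: str, data: Any) -> None:
        # Writing to an existing key is an error; replace() is the way to overwrite.
        if entity_key in self._data:
            raise SketchArtifactException(f"{entity_key} already exists in the artifact.")
        self._data[entity_key] = data

    def remove(self, entity_key: str) -> None:
        # Removing a missing key is an error.
        if entity_key not in self._data:
            raise SketchArtifactException(f"{entity_key} is not in the artifact.")
        del self._data[entity_key]

    def replace(self, entity_key: str, data: Any) -> None:
        # Replacing requires the key to exist already.
        if entity_key not in self._data:
            raise SketchArtifactException(f"{entity_key} is not in the artifact.")
        self.remove(entity_key)
        self.write(entity_key, data)


artifact = InMemoryArtifact()
artifact.write("population.structure", [1, 2, 3])  # illustrative key name

try:
    artifact.write("population.structure", [4, 5, 6])
    duplicate_write_rejected = False
except SketchArtifactException:
    duplicate_write_rejected = True

artifact.replace("population.structure", [4, 5, 6])
replaced_value = artifact.load("population.structure")
artifact.remove("population.structure")
```

In this sketch, write never overwrites silently, so a caller has to choose replace explicitly when overwriting is intended.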

clear_cache()[source]

Clears the artifact’s cache.

The artifact will cache data in memory to improve performance for repeat access.
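The caching behavior described above can be sketched as simple memoization by key. The names CachingLoader and fake_disk_read are hypothetical and the "disk" is a plain function; this assumes nothing about how vivarium actually stores or evicts cached data. It only shows why a repeat load is cheap and what clearing the cache costs.

```python
from typing import Any, Callable, Dict, List


class CachingLoader:
    """Hypothetical loader that keeps loaded data in memory, keyed by entity key."""

    def __init__(self, read_from_disk: Callable[[str], Any]) -> None:
        self._read_from_disk = read_from_disk
        self._cache: Dict[str, Any] = {}

    def load(self, entity_key: str) -> Any:
        # Only go to the slow backing store on a cache miss.
        if entity_key not in self._cache:
            self._cache[entity_key] = self._read_from_disk(entity_key)
        return self._cache[entity_key]

    def clear_cache(self) -> None:
        # Drop everything held in memory; the next load reads from disk again.
        self._cache = {}


disk_reads: List[str] = []


def fake_disk_read(entity_key: str) -> str:
    """Stand-in for an expensive read; records each call."""
    disk_reads.append(entity_key)
    return f"data for {entity_key}"


loader = CachingLoader(fake_disk_read)
loader.load("some.entity.key")  # cache miss: reads from "disk"
loader.load("some.entity.key")  # cache hit: served from memory
reads_before_clear = len(disk_reads)

loader.clear_cache()
loader.load("some.entity.key")  # cache was cleared, so this reads again
reads_after_clear = len(disk_reads)
```

Clearing the cache trades memory for speed: it frees the data held in memory, and the next access to each key pays the read cost once more.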

class vivarium.framework.artifact.artifact.Keys(artifact_path)[source]

A convenience wrapper around the keyspace that makes it easier for Artifact to keep its keyspace current when an entity key is added or removed. A Keys object is initialized with the artifact_path when the Artifact is initialized.

Parameters:

artifact_path (Path) – The path to the artifact file.

keyspace_node = 'metadata.keyspace'

append(new_key)[source]

Called whenever the artifact gets a new key and new data. append removes the old keyspace and writes the updated keyspace.

Parameters:

new_key (str) – The key to add to the keyspace.

remove(removing_key)[source]

Called whenever the artifact removes a key and its data. remove deletes the key from the keyspace and writes the updated keyspace.

Parameters:

removing_key (str) – The key to remove from the keyspace.

to_list()[source]

A list of all the entity keys in the associated artifact.

Return type:

List[str]
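The keyspace bookkeeping described for append and remove can be sketched with a dict standing in for the artifact file. SketchKeys and the store dict are hypothetical; the real Keys class takes an artifact_path and works against the HDF file, and its initial keyspace contents are not assumed here. The sketch shows only the documented pattern: every change drops the stored keyspace and writes the updated one under keyspace_node.

```python
from typing import Dict, List


class SketchKeys:
    """Hypothetical stand-in for Keys, backed by a dict instead of an HDF file."""

    keyspace_node = "metadata.keyspace"

    def __init__(self, store: Dict[str, List[str]]) -> None:
        self._store = store
        # Start from whatever keyspace the store already holds (empty if none).
        self._keys: List[str] = list(store.get(self.keyspace_node, []))
        self._store[self.keyspace_node] = list(self._keys)

    def _rewrite_keyspace(self) -> None:
        # Remove the old keyspace, then write the updated one.
        self._store.pop(self.keyspace_node, None)
        self._store[self.keyspace_node] = list(self._keys)

    def append(self, new_key: str) -> None:
        self._keys.append(new_key)
        self._rewrite_keyspace()

    def remove(self, removing_key: str) -> None:
        self._keys.remove(removing_key)
        self._rewrite_keyspace()

    def to_list(self) -> List[str]:
        # Return a copy so callers cannot mutate the tracked keyspace.
        return list(self._keys)


store: Dict[str, List[str]] = {}
keys = SketchKeys(store)
keys.append("first.example.key")   # illustrative key names
keys.append("second.example.key")
keys.remove("first.example.key")
remaining = keys.to_list()
```

Rewriting the whole keyspace on each change keeps the stored copy and the in-memory list identical at all times, at the cost of one small write per change.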