HDF Interface

A convenience wrapper around the tables and pandas HDF interfaces.

Public Interface

The public interface consists of five functions:

HDF Public Interface

Function      Description
touch()       Creates an HDF file, wiping an existing file if necessary.
write()       Stores data at a key in an HDF file.
load()        Loads (potentially filtered) data from a key in an HDF file.
remove()      Clears data from a key in an HDF file.
get_keys()    Gets all available HDF keys from an HDF file.

Contracts

  • All functions in the public interface accept both pathlib.Path and normal Python str objects for paths.

  • All functions in the public interface accept only str objects as representations of the keys in the HDF file. The strings must be formatted as "type.name.measure" or "type.measure".
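The two contracts above can be illustrated with a minimal sketch. The helper names normalize_path and split_key are hypothetical and are not part of this module, and the example key strings are made up; the sketch only shows what accepting both path types and a dotted key format might look like. The library's own validation may be stricter.

```python
from pathlib import Path
from typing import Tuple, Union


def normalize_path(path: Union[str, Path]) -> Path:
    """Hypothetical helper: accept either a str or a Path and return a Path."""
    return Path(path)


def split_key(key: str) -> Tuple[str, str, str]:
    """Hypothetical helper: split "type.name.measure" or "type.measure".

    Returns (type, name, measure). For two-part keys this sketch
    assumes an empty string for the name.
    """
    if not isinstance(key, str):
        raise TypeError("Keys must be str objects.")
    parts = key.split(".")
    # Exactly two or three non-empty, dot-separated parts are allowed.
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"Improperly formatted key: {key!r}")
    if len(parts) == 2:
        return parts[0], "", parts[1]
    return parts[0], parts[1], parts[2]
```

Under these assumptions, split_key("cause.diarrhea.incidence") yields the three parts separately, while a string such as "incidence" or "cause..incidence" raises a ValueError.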

vivarium.framework.artifact.hdf.touch(path)

Creates an HDF file, wiping an existing file if necessary.

If the given path is valid for an HDF file, a new HDF file is created at that path.

Parameters:

path (str | Path) – The path to the HDF file.

Raises:

ValueError – If the given path is not valid for creating an HDF file.

vivarium.framework.artifact.hdf.write(path, entity_key, data)

Writes data to the HDF file at the given path to the given key.

Parameters:
  • path (str | Path) – The path to the HDF file to write to.

  • entity_key (str) – A string representation of the internal HDF path where we want to write the data. The key must be formatted as "type.name.measure" or "type.measure".

  • data (Any) – The data to write. If it is a pandas object, it will be written using a pandas.HDFStore or pandas.DataFrame.to_hdf(). If it is some other kind of Python object, it will first be encoded as JSON with json.dumps() and then written to the provided key.

Raises:

ValueError – If the path or entity_key are improperly formatted.
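The JSON step described for non-pandas data can be sketched with the standard library alone. The payload below is made up for illustration, and the sketch shows only the encoding with json.dumps() and what a round trip through JSON preserves; how the resulting string is stored in the file is not shown.

```python
import json

# Hypothetical non-pandas payload, e.g. metadata stored alongside the data.
metadata = {"locations": ["Kenya", "India"], "draws": 10}

# json.dumps() turns the object into a str that can be stored at a key.
encoded = json.dumps(metadata)

# json.loads() recovers an equal object from the encoded string.
decoded = json.loads(encoded)

# JSON has no tuple type, so tuples come back as lists after a round trip.
round_tripped_tuple = json.loads(json.dumps((1, 2)))
```

Objects that json.dumps() cannot serialize by default, such as sets, would need to be converted before being passed as data.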

vivarium.framework.artifact.hdf.load(path, entity_key, filter_terms, column_filters)

Loads data from an HDF file.

Parameters:
  • path (str | Path) – The path to the HDF file to load the data from.

  • entity_key (str) – A representation of the internal HDF path where the data is located.

  • filter_terms (List[str] | None) – An optional list of terms used to filter the rows in the data. The terms must be formatted in a way that is suitable for use with the where argument of pandas.read_hdf(). Only filters applying to existing columns in the data are used.

  • column_filters (List[str] | None) – An optional list of columns to load from the data.

Raises:

ValueError – If the path or entity_key are improperly formatted.

Returns:

The data stored at the given key in the HDF file.

Return type:

Any
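The rule that only filters applying to existing columns are used can be sketched as follows. The helper applicable_terms is hypothetical and deliberately simplified: it assumes each term starts with a column name, as in "draw == 0", and it is not the library's implementation.

```python
import re
from typing import List, Optional


def applicable_terms(
    filter_terms: Optional[List[str]], columns: List[str]
) -> List[str]:
    """Hypothetical helper: keep terms whose leading name is a known column."""
    if not filter_terms:
        return []
    kept = []
    for term in filter_terms:
        # Take the first identifier in the term as the column it refers to.
        match = re.match(r"\s*([A-Za-z_]\w*)", term)
        if match and match.group(1) in columns:
            kept.append(term)
    return kept
```

With columns ["draw", "value"], the terms ["draw == 0", "year >= 2000"] would reduce to ["draw == 0"] in this sketch, because no "year" column exists to filter on.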

vivarium.framework.artifact.hdf.remove(path, entity_key)

Removes a piece of data from an HDF file.

Parameters:
  • path (str | Path) – The path to the HDF file to remove the data from.

  • entity_key (str) – A representation of the internal HDF path where the data is located.

Raises:

ValueError – If the path or entity_key are improperly formatted.

vivarium.framework.artifact.hdf.get_keys(path)

Gets the key representations of all paths in an HDF file.

Parameters:

path (str | Path) – The path to the HDF file.

Returns:

A list of key representations of the internal paths in the HDF file.

Return type:

List[str]

class vivarium.framework.artifact.hdf.EntityKey(key)

A convenience wrapper that translates artifact keys.

This class provides several representations of the artifact keys that are useful when working with the pandas and tables HDF interfaces.

property type: str

The type of the entity represented by the key.

property name: str

The name of the entity represented by the key.

property measure: str

The measure associated with the data represented by the key.

property group_prefix: str

The HDF group prefix for the key.

property group_name: str

The HDF group name for the key.

property group: str

The full path to the group for this key.

property path: str

The full HDF path associated with this key.

with_measure(measure)

Replaces this key’s measure with the provided one.

Parameters:

measure (str) – The measure to replace this key’s measure with.

Returns:

A new EntityKey with the updated measure.

Return type:

EntityKey
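The key-related behaviour described above can be approximated with a small stand-in class. SketchKey is hypothetical and is not the library's EntityKey: it covers only type, name, measure and with_measure(), assumes an empty name for two-part keys, and omits the group and path properties because their exact format is not specified here.

```python
class SketchKey:
    """Hypothetical stand-in for EntityKey, for illustration only."""

    def __init__(self, key: str):
        parts = key.split(".")
        # Accept only "type.name.measure" or "type.measure".
        if len(parts) not in (2, 3) or not all(parts):
            raise ValueError(f"Improperly formatted key: {key!r}")
        self._parts = parts

    @property
    def type(self) -> str:
        return self._parts[0]

    @property
    def name(self) -> str:
        # Assumption of this sketch: two-part keys have an empty name.
        return self._parts[1] if len(self._parts) == 3 else ""

    @property
    def measure(self) -> str:
        return self._parts[-1]

    def with_measure(self, measure: str) -> "SketchKey":
        # Build a new key rather than modifying this one.
        return SketchKey(".".join(self._parts[:-1] + [measure]))

    def __str__(self) -> str:
        return ".".join(self._parts)
```

In this sketch, SketchKey("cause.diarrhea.incidence").with_measure("prevalence") produces a new key whose string form is "cause.diarrhea.prevalence", and the original key keeps its measure.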