jobarchitect.utils

Utilities for jobarchitect.

jobarchitect.utils.are_identifiers_in_dataset(dataset_path, identifiers)

Return True if all identifiers are in the supplied dataset. If the list of identifiers is empty, also return True.

Parameters:
  • dataset_path – path to dataset
  • identifiers – list of identifiers to test
Returns:

True if all identifiers in dataset, False otherwise.
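
The membership rule described above, including the empty-list case, can be illustrated with a minimal sketch. It assumes the dataset's identifiers are already available in memory; the name all_identifiers_present and its known_identifiers argument are hypothetical and not part of jobarchitect, which takes a dataset path instead.

```python
def all_identifiers_present(known_identifiers, identifiers):
    """Return True if every entry of identifiers is in known_identifiers.

    Hypothetical stand-in for the check described above: the real
    function reads the identifiers from the dataset at dataset_path.
    """
    # An empty set is a subset of any set, so an empty list yields True.
    return set(identifiers).issubset(known_identifiers)


known = {"abc123", "def456", "0a1b2c"}
all_present = all_identifiers_present(known, ["abc123", "0a1b2c"])
one_missing = all_identifiers_present(known, ["abc123", "ffffff"])
empty_case = all_identifiers_present(known, [])
```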

jobarchitect.utils.mkdir_parents(path)

Create the given directory path.

This includes all necessary parent directories. Does not raise an error if the directory already exists.

Parameters:
  • path – path to create

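
One way to get the behaviour described above (parents created, no error if the directory exists) is the standard library's os.makedirs with exist_ok=True. This is a sketch under that assumption, not necessarily how jobarchitect implements it; the name mkdir_parents_sketch is hypothetical.

```python
import os
import tempfile


def mkdir_parents_sketch(path):
    """Create path and any missing parent directories.

    exist_ok=True suppresses the error when the directory is already
    there. An error is still raised if path exists but is a file.
    """
    os.makedirs(path, exist_ok=True)


# Use a scratch directory so the example does not touch real data.
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "a", "b", "c")

mkdir_parents_sketch(target)  # creates a, a/b and a/b/c
mkdir_parents_sketch(target)  # second call is a no-op
```
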
jobarchitect.utils.output_path_from_hash(dataset_path, hash_str, output_root)

Return absolute output path for a dataset item.

That is, the absolute path to which output data should be written for the item identified by the given hash.

This function is not responsible for creating the directory.

Parameters:
  • dataset_path – path to input dataset
  • hash_str – dataset item identifier as a hash string
  • output_root – path to output root
Raises:

KeyError if hash string identifier is not in the dataset

Returns:

absolute output path for a dataset item specified by the identifier
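
The contract above (absolute path returned, KeyError for an unknown hash, no directory created) can be sketched as follows. Everything about how the path is derived is an assumption made for illustration: the sketch takes a hypothetical in-memory mapping relpath_by_hash from hash string to the item's relative path, and assumes the output path mirrors that relative path under output_root. The real function takes dataset_path and looks the item up in the dataset.

```python
import os


def output_path_from_hash_sketch(relpath_by_hash, hash_str, output_root):
    """Return an absolute output path for the item with the given hash.

    relpath_by_hash is a hypothetical dict standing in for the dataset.
    Mirroring the relative path under output_root is an assumption.
    """
    # A plain dict lookup raises KeyError for an unknown hash.
    relpath = relpath_by_hash[hash_str]
    # Only the path is computed; no directory is created here.
    return os.path.abspath(os.path.join(output_root, relpath))


items = {"abc123": os.path.join("images", "cat.png")}
out = output_path_from_hash_sketch(items, "abc123", "output")
```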

jobarchitect.utils.split_iterable(iterable, nchunks)

Return generator yielding lists derived from the iterable.

Parameters:
  • iterable – an iterable
  • nchunks – number of chunks the iterable should be split into
Returns:

generator yielding lists from the iterable
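
A generator with this shape can be sketched as below. The documentation above does not say how items are distributed across chunks, so the strided assignment used here (item i goes to chunk i modulo nchunks) is an assumption, and split_iterable_sketch is a hypothetical name.

```python
def split_iterable_sketch(iterable, nchunks):
    """Yield nchunks lists that together contain every item once.

    Assumption: items are dealt out in a strided fashion, so chunk
    sizes differ by at most one. The real function may chunk differently.
    """
    # Materialise first so generators and other one-shot iterables work.
    items = list(iterable)
    for start in range(nchunks):
        yield items[start::nchunks]


chunks = list(split_iterable_sketch(range(7), 3))
# chunks == [[0, 3, 6], [1, 4], [2, 5]]
```

With this strategy, asking for more chunks than there are items yields trailing empty lists rather than raising an error.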