A parent class to help set up pipeline tasks. It can be extended to meet the needs of different pipelines. The class is used to obtain a task object that:

  • defines job resource requirements

  • provides access to variables (by name or via a .var dictionary)

  • creates an outfolder based on the outfile name

class cellhub.tasks.setup.setup(infile, outfile, PARAMS, memory='4G', cpu=1, make_outdir=True, expose_var=True)

Bases: object

A class for routine setup of pipeline tasks.

  • infile – The task infile path or None

  • outfile – The task outfile path (typically ends with “.sentinel”)

  • memory – The total memory needed for execution of the task. If no unit is given, gigabytes are assumed. Recognised units are “M” for megabytes and “G” for gigabytes. 4 gigabytes can be requested by passing “4”, “4GB” or “4096M”. Default = “4G”.

  • cpu – The number of cpu cores required (used to populate job_threads)

  • make_outdir – True|False. Whether to create the outfolder based on the outfile name. Default = True.

  • expose_var – True|False. Whether the self.var dictionary should be created from self.__dict__. Default = True.
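
The memory convention described above (no unit means gigabytes, “M” and “G” are recognised) can be illustrated with a short sketch. The helper name memory_to_gigabytes and the choice to round megabytes up to a whole gigabyte are assumptions made for illustration; this is not the parsing code used by the class.

```python
import math
import re


def memory_to_gigabytes(memory="4G"):
    """Hypothetical helper: convert a memory request to whole gigabytes."""
    # Accept a number with an optional "M"/"G" unit and an optional trailing "B".
    match = re.fullmatch(r"\s*(\d+)\s*(?:([MG])B?)?\s*", str(memory).upper())
    if match is None:
        raise ValueError(f"Unrecognised memory request: {memory!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if unit == "M":
        # Assumption: round up so the request is never smaller than asked for.
        return math.ceil(amount / 1024)
    # No unit, or "G": the amount is already in gigabytes.
    return amount


for request in ("4", "4G", "4GB", "4096M"):
    print(request, "->", memory_to_gigabytes(request))
```

Under these assumptions, all four spellings resolve to the same 4 gigabyte request.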


The number of threads that will be requested


The amount of memory that will be requested per thread


A dictionary with keys “job_threads” and “job_memory” for populating keyword arguments, e.g. **t.resources.
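
As a rough sketch of how a total memory request might be split per thread and passed on as keyword arguments: build_resources and submit are hypothetical stand-ins, and the megabyte string format used for job_memory is an assumption rather than the format the class produces.

```python
import math


def build_resources(total_gb=4, cpu=1):
    """Hypothetical helper: split a total memory request across threads."""
    # Assumption: per-thread memory is the total divided by the thread count,
    # rounded up to a whole megabyte.
    per_thread_mb = math.ceil(total_gb * 1024 / cpu)
    return {"job_threads": cpu, "job_memory": f"{per_thread_mb}M"}


def submit(statement, job_threads=1, job_memory="4096M"):
    """Stand-in for a job runner that accepts resource keyword arguments."""
    return f"{statement} [threads={job_threads}, memory/thread={job_memory}]"


resources = build_resources(total_gb=8, cpu=4)
# The dictionary unpacks directly into the runner's keyword arguments.
print(submit("echo hello", **resources))
```

Keeping both values in one dictionary means a task can forward its requirements with a single ** unpacking instead of naming each argument.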


The os.path.basename of outfile


The os.path.dirname of outfile


If an infile path is given, the os.path.dirname of the infile.


If an infile path is given, the os.path.basename of the infile.


If the outfile path ends with “.sentinel”
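
The path-derived values described above can be reproduced with the standard library. The function describe_paths, its dictionary keys and the example paths are hypothetical and chosen only for illustration; they are not the attribute names of the class.

```python
import os


def describe_paths(infile, outfile):
    """Hypothetical helper: derive directory and file names from task paths."""
    info = {
        "out_name": os.path.basename(outfile),
        "out_dir": os.path.dirname(outfile),
        "is_sentinel": outfile.endswith(".sentinel"),
        # The infile values are only defined when an infile path is given.
        "in_name": None,
        "in_dir": None,
    }
    if infile is not None:
        info["in_name"] = os.path.basename(infile)
        info["in_dir"] = os.path.dirname(infile)
    return info


# Example paths are made up for this sketch.
paths = describe_paths("input.dir/sample_A/counts.h5",
                       "results.dir/sample_A/cluster.sentinel")
for key, value in paths.items():
    print(key, "=", value)
```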


Return an integer that represents the amount of memory needed by the task in gigabytes.

set_resources(PARAMS, memory='4G', cpu=1)

Calculate the resource requirements and return a dictionary that can be used to update the local variables.