pipeline_cellbender.py

Overview

This pipeline uses CellBender to remove ambient UMI counts.

Configuration

This pipeline is normally run in a separate directory (e.g. “cellhub_cellbender”) so that the downstream results can be compared with those from cellranger.

The pipeline requires a configured pipeline_cellbender.yml file. A default version can be obtained by executing:

cellhub cellbender config

Input

The location of the cellhub folder containing the cellranger results that will be used as the input for CellBender is specified in the “pipeline_cellbender.yml” configuration file. Typically the user will have two parallel “cellhub” instances, e.g.:

  1. “cellhub” <- containing a first cellhub run based on the Cellranger counts (counts registered with “cellhub cellranger make useCounts”).

  2. “cellhub_cellbender” <- containing a second cellhub run using CellBender to correct the Cellranger counts from the first run (counts registered with “cellhub cellbender make useCounts”).

Running the pipeline

It is recommended to run the cellbender task on a GPU queue.

On the University of Oxford’s BMRC cluster, this can be achieved with, for example:

cellhub cellbender make cellbender -v5 -p 200 --cluster-queue=short.qg --cluster-options "-l gpu=1,gputype=p100"
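
When the command is launched from a wrapper script rather than typed by hand, it can be convenient to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags shown in the command above; the helper name build_cellbender_command and its parameter names are hypothetical and are not part of cellhub.

```python
import shlex


def build_cellbender_command(queue="short.qg",
                             cluster_options="-l gpu=1,gputype=p100",
                             verbosity=5,
                             processes=200):
    """Assemble the example cellhub command as an argument list.

    Hypothetical helper: the function and parameter names are illustrative.
    Only the flags themselves are taken from the example command.
    """
    return [
        "cellhub", "cellbender", "make", "cellbender",
        f"-v{verbosity}",                    # value appended to -v
        "-p", str(processes),                # value passed to -p
        f"--cluster-queue={queue}",
        "--cluster-options", cluster_options,
    ]


if __name__ == "__main__":
    cmd = build_cellbender_command()
    # shlex.join quotes the cluster options so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # To actually launch the pipeline, one could use:
    #   subprocess.run(cmd, check=True)
```

Passing the arguments as a list keeps the cluster options together as a single argument without manual quoting.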

Pipeline output

The pipeline registers cleaned CellBender h5 files on the local cellhub API. Currently this format is not fully compatible with the 10x h5 format. To work around this, a custom loader is used; see the cellhub.tasks.cellbender module documentation for more details.

Code

cellhub.pipeline_cellbender.cellbender(infile, outfile)

This task will run the CellBender command. Please visit cellbender.readthedocs.io for further details.

cellhub.pipeline_cellbender.h5API(infile, outfile)

Put the h5 files on the API

Inputs:

The input cellbender.dir folder layout is:

unfiltered “outs”:

library_id/cellbender.h5

filtered “outs”:

library_id/cellbender_filtered.h5
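
The layout above can be checked before the API tasks are run. The following is a minimal sketch, assuming one sub-directory per library_id inside cellbender.dir; the function name find_cellbender_outputs is hypothetical and is not part of cellhub.

```python
from pathlib import Path


def find_cellbender_outputs(cellbender_dir):
    """Map each library_id to its unfiltered and filtered h5 paths.

    Hypothetical helper. Assumes the layout described above: one
    sub-directory per library_id holding cellbender.h5 and
    cellbender_filtered.h5.
    """
    outputs = {}
    for library_dir in sorted(Path(cellbender_dir).iterdir()):
        if not library_dir.is_dir():
            continue
        unfiltered = library_dir / "cellbender.h5"
        filtered = library_dir / "cellbender_filtered.h5"
        # Only report libraries where both files are present.
        if unfiltered.is_file() and filtered.is_file():
            outputs[library_dir.name] = {
                "unfiltered": unfiltered,
                "filtered": filtered,
            }
    return outputs
```

Libraries with a missing file are left out of the result, so comparing its keys with the expected library identifiers shows which runs are incomplete.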

cellhub.pipeline_cellbender.mtx(infile, outfile)

Convert cellbender h5 to mtx format

cellhub.pipeline_cellbender.mtxAPI(infile, outfile)

Put the mtx files on the API

Inputs:

The input cellbender.dir folder layout is:

unfiltered “outs”:

library_id/cellbender.h5

filtered “outs”:

library_id/cellbender_filtered.h5

cellhub.pipeline_cellbender.full()

Run the full pipeline.

cellhub.pipeline_cellbender.useCounts(infile, outfile)

Set the cellbender counts as the source for downstream analysis. This task is not run by default.