pipeline_emptydrops.py

Overview

This pipeline performs the following task:

Run EmptyDrops on the raw output of cellranger.

Usage

See Installation and Usage for general information on how to use CGAT pipelines.

Configuration

The pipeline requires a configured pipeline.yml file.

Default configuration files can be generated by executing:

python <srcdir>/pipeline_emptydrops.py config

Input files

The pipeline is run from the cellranger count output (raw_feature_bc_matrix folder).

The pipeline expects a tsv file containing a column named path and a column named sample_id.

‘path’ should contain the path to each sample's cellranger raw_feature_bc_matrix folder. ‘sample_id’ is the desired name for each sample (the output folder for that sample will be given this name).
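
As an illustration of the expected input table, the following minimal sketch parses a tab-separated samples table and checks that the two columns named above are present. The function name read_samples, the example paths, and the uniqueness check on sample_id are assumptions made for this sketch (the check is motivated only by the fact that output folders are named after sample_id); they are not part of the pipeline itself.

```python
import csv
import io

# Column names taken from the description above.
REQUIRED_COLUMNS = {"path", "sample_id"}


def read_samples(handle):
    """Parse a tab-separated samples table and check the required columns.

    Hypothetical helper for illustration; not part of pipeline_emptydrops.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing column(s): {sorted(missing)}")
    rows = list(reader)
    # Assumption: sample_id names the output folder, so duplicates would clash.
    ids = [row["sample_id"] for row in rows]
    if len(ids) != len(set(ids)):
        raise ValueError("sample_id values must be unique")
    return rows


# Placeholder content; the paths below are invented examples.
example = (
    "sample_id\tpath\n"
    "sampleA\t/data/sampleA/outs/raw_feature_bc_matrix\n"
    "sampleB\t/data/sampleB/outs/raw_feature_bc_matrix\n"
)

samples = read_samples(io.StringIO(example))
for row in samples:
    print(row["sample_id"], row["path"])
```

To check a real file, the same function can be given an open file handle instead of the in-memory example.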

Dependencies

This pipeline requires:

* cgat-core: https://github.com/cgat-developers/cgat-core
* R + packages

Pipeline output

The pipeline returns a list of the barcodes that pass EmptyDrops cell identification, and a table of barcode ranks covering all barcodes (this table can be used for knee plots).
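
To show what a barcode rank table contains, here is a minimal sketch that ranks barcodes by total count, with rank 1 given to the barcode with the highest total. The function barcode_ranks, the example barcodes, and the counts are all hypothetical; the layout and column names of the table the pipeline actually writes are not specified here, and the pipeline's own ranking may treat ties differently from the simple ordinal ranks used below.

```python
def barcode_ranks(totals):
    """Rank barcodes by decreasing total count (rank 1 = highest count).

    `totals` maps barcode -> total count. Ties keep their input order,
    so this gives ordinal ranks rather than averaged ranks.
    """
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, barcode, count)
        for rank, (barcode, count) in enumerate(ordered, start=1)
    ]


# Invented example data: two cell-like barcodes and two near-empty droplets.
totals = {"AAAC": 5200, "AAAG": 3, "AACC": 4100, "AACG": 1}

ranks = barcode_ranks(totals)
for rank, barcode, count in ranks:
    print(rank, barcode, count)
```

A knee plot draws count against rank with both axes on a log scale; the sharp drop between high-count and low-count barcodes is the "knee".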

Code

cellhub.pipeline_emptydrops.emptyDrops(infile, outfile): Run Rscript to run EmptyDrops on each library.

cellhub.pipeline_emptydrops.meanReads(infile, outfile): Calculate the mean reads per cell.

cellhub.pipeline_emptydrops.full(): Run the full pipeline.