This pipeline performs the following task:

  • run EmptyDrops on the raw output of cellranger


See Installation and Usage for general information on how to use CGAT pipelines.


The pipeline requires a configured pipeline.yml file.

Default configuration files can be generated by executing:

python <srcdir>/pipeline_emptydrops.py config

Input files

The pipeline is run from the cellranger count output (raw_feature_bc_matrix folder).

The pipeline expects a tsv file containing a column named path and a column named sample_id.

‘path’ should contain the path to the raw_feature_bc_matrix folder of each cellranger run. ‘sample_id’ is the desired name for each sample (the output folder for the sample is named after it).
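
As an illustration of the input table described above, the following minimal sketch writes and re-reads a tab-separated table with the two required columns, path and sample_id. The example paths, the sample names and the helper functions write_samples_tsv and read_samples_tsv are hypothetical and not part of the pipeline. The file name the pipeline expects is not stated here, so the sketch writes to an in-memory buffer.

```python
import csv
import io

# Column names taken from the description above.
REQUIRED_COLUMNS = ("path", "sample_id")


def write_samples_tsv(handle, samples):
    """Write (path, sample_id) rows as a tab-separated table with a header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(REQUIRED_COLUMNS)
    for path, sample_id in samples:
        writer.writerow([path, sample_id])


def read_samples_tsv(handle):
    """Read the table back and check that both required columns exist."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing column(s): " + ", ".join(missing))
    return [(row["path"], row["sample_id"]) for row in reader]


# Hypothetical example locations and sample names.
example_samples = [
    ("/data/run1/outs/raw_feature_bc_matrix", "sample_A"),
    ("/data/run2/outs/raw_feature_bc_matrix", "sample_B"),
]

buffer = io.StringIO()
write_samples_tsv(buffer, example_samples)
tsv_text = buffer.getvalue()
print(tsv_text)
```

Checking the header before starting a run is cheap and catches a misspelled column name early.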


This pipeline requires:

  • cgat-core: https://github.com/cgat-developers/cgat-core

  • R + packages

Pipeline output

The pipeline returns a list of barcodes that pass EmptyDrops cell identification and a table of barcode ranks that includes all barcodes (this table can be used for knee plots).
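
To show what a barcode rank table contains, the sketch below ranks barcodes by decreasing total count, which is the relationship a knee plot displays (rank on the x axis, total count on the y axis, both usually on a log scale). It is not the pipeline's implementation: the function barcode_ranks, the example barcodes and the counts are hypothetical, and ties are broken by barcode name for simplicity, which may differ from the tie handling in the pipeline's R code.

```python
def barcode_ranks(total_counts):
    """Return (barcode, total, rank) tuples ordered by decreasing total count.

    Rank 1 is the barcode with the highest total. Ties are broken
    alphabetically by barcode (a simplification for this sketch).
    """
    ordered = sorted(total_counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (barcode, total, rank)
        for rank, (barcode, total) in enumerate(ordered, start=1)
    ]


# Hypothetical total counts per barcode.
example_counts = {"AAAC": 5200, "AAAG": 12, "AACT": 4800, "AAGT": 3}

ranks = barcode_ranks(example_counts)
for barcode, total, rank in ranks:
    print(rank, barcode, total)
```

In a knee plot of such a table, cell-containing barcodes sit on the high plateau at low ranks, and the sharp drop in total count marks the transition to empty droplets.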


cellhub.pipeline_emptydrops.emptyDrops(infile, outfile)

Run Rscript to run EmptyDrops on each library

cellhub.pipeline_emptydrops.meanReads(infile, outfile)

Calculate the mean reads per cell


Run the full pipeline.