pipeline_ambient_rna.py

Overview

This pipeline performs the following steps:

  • Analyse the ambient RNA profile in each input (e.g. each channel’s or library’s raw cellranger matrices)

  • Compare ambient RNA profiles across inputs

Configuration

The pipeline requires a configured pipeline.yml file. Default configuration files can be generated by executing:

python <srcdir>/pipeline_ambient_rna.py config

Input files

A TSV file called ‘input_libraries.tsv’ is required. It must have the column names explained below and must not include row names. Add one row for each input channel/library to be analysed.

This file must have the following columns:

  • library_id - name used throughout. This could be the channel_pool id, e.g. A1

  • raw_path - path to the raw_matrix folder from cellranger count

  • exp_batch - experimental batch label. The column must be present; if not used, fill with “1”

  • channel_id - channel identifier. The column must be present; if not used, fill with “1”

  • seq_batch - sequencing batch label. The column must be present; if not used, fill with “1”

  • (optional) excludelist - path to a file with the cell_ids to exclude

You can add any other columns as required, for example pool_id
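The column requirements above can be checked before launching the pipeline. The following is a minimal sketch, not part of the pipeline itself: the function name check_input_libraries, the example paths and ids, and the uniqueness check on library_id are all assumptions made for illustration.

```python
import csv
import io

# Columns the documentation lists as required; 'excludelist' is optional.
REQUIRED_COLUMNS = ["library_id", "raw_path", "exp_batch", "channel_id", "seq_batch"]


def check_input_libraries(handle):
    """Return the parsed rows, raising ValueError on a malformed table.

    Hypothetical helper: the pipeline does not necessarily run these checks.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    rows = list(reader)
    ids = [row["library_id"] for row in rows]
    # Assumption: library_id should be unique because it is "used throughout".
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate library_id values")
    return rows


# Made-up example content; pool_id shows an extra, user-defined column.
EXAMPLE_TSV = (
    "library_id\traw_path\texp_batch\tchannel_id\tseq_batch\tpool_id\n"
    "A1\t/path/to/A1/raw_matrix\t1\t1\t1\tP1\n"
    "A2\t/path/to/A2/raw_matrix\t1\t1\t1\tP1\n"
)

rows = check_input_libraries(io.StringIO(EXAMPLE_TSV))
print([row["library_id"] for row in rows])  # ['A1', 'A2']
```

To check a real file, pass an open file handle instead of the in-memory example, for instance open("input_libraries.tsv", newline="").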

Dependencies

This pipeline requires:

  • cgat-core: https://github.com/cgat-developers/cgat-core

  • the R dependencies required by the R scripts

Pipeline output

The pipeline returns:

  • a per-input HTML report and tables, saved in the ‘profile_per_input.dir’ folder

  • an ambient RNA comparison across inputs, saved in the ‘profile_compare.dir’ folder

Code

cellhub.pipeline_ambient_rna.ambient_rna_per_input(infile, outfile)

Explore count and gene expression profiles of ambient RNA droplets per input.

  • The output is saved in profile_per_input.dir/<input_id>

  • The output consists of an HTML report and an ambient_genes.txt.gz file

  • See ambient_rna_per_library.R for more details of the output
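The per-input output layout described above can be used to check which inputs have finished. This is a rough sketch under two assumptions: that <input_id> corresponds to the library_id column of input_libraries.tsv, and that ambient_genes.txt.gz sits directly inside each input's folder. The helper names are hypothetical and not part of cellhub.

```python
import os
import tempfile


def per_input_outputs(library_ids, base_dir="profile_per_input.dir"):
    """Map each input id to the path where ambient_genes.txt.gz is expected."""
    return {
        lib: os.path.join(base_dir, lib, "ambient_genes.txt.gz")
        for lib in library_ids
    }


def missing_outputs(library_ids, base_dir="profile_per_input.dir"):
    """List the input ids whose ambient_genes.txt.gz file is not on disk."""
    expected = per_input_outputs(library_ids, base_dir)
    return [lib for lib, path in expected.items() if not os.path.isfile(path)]


# Demonstration in a throwaway directory: only input A1 has an output file.
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "profile_per_input.dir")
    os.makedirs(os.path.join(base, "A1"))
    open(os.path.join(base, "A1", "ambient_genes.txt.gz"), "wb").close()
    still_missing = missing_outputs(["A1", "A2"], base_dir=base)

print(still_missing)  # ['A2']
```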

cellhub.pipeline_ambient_rna.ambient_rna_compare(infiles, outfile)

Compare the expression of top ambient RNA genes across inputs.

  • The output is saved in profile_compare.dir

  • The output includes an HTML report and an ambient_rna_profile.tsv file

  • See ambient_rna_compare.R for more details of the output

cellhub.pipeline_ambient_rna.plot(infile, outfile)

Draw the pipeline flowchart.

cellhub.pipeline_ambient_rna.full()

Run the full pipeline.