pipeline_velocyto.py
Overview
This pipeline performs the following steps:
* sort the bam file by cell barcode
* estimate intronic and exonic reads using velocyto (on selected barcodes)
Usage
See Installation and Usage for general information on how to use CGAT pipelines.
Configuration
The pipeline requires a configured pipeline_velocyto.yml file.
Default configuration files can be generated by executing:
python <srcdir>/pipeline_velocyto.py config
Input files
The pipeline is run from bam files generated by cellranger count.
The pipeline expects a tsv file with one row per sample. The required columns are sample_id, barcodes and path:
* sample_id: the identifier of the sample
* barcodes: a list of cell barcodes to analyse. This can be the filtered barcodes from cellranger or a custom list, and the file may be gzipped.
* path: the path to the cellranger bam file for the sample
Any further metadata can be added to the file as additional columns.
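As an illustration of the table layout described above, the following minimal sketch parses a tab-separated sample table and checks that the three required columns are present. The function name `read_sample_table`, the example paths, the extra `genotype` column and the uniqueness check on sample_id are assumptions made for this example; they are not taken from the pipeline code.

```python
import csv
import io

# Column names required by the pipeline documentation.
REQUIRED_COLUMNS = ("sample_id", "barcodes", "path")


def read_sample_table(handle):
    """Parse a tab-separated sample table and check the required columns.

    Hypothetical helper for illustration; not part of cellhub.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))
    rows = list(reader)
    # Assumption: each sample_id should appear only once.
    sample_ids = [row["sample_id"] for row in rows]
    if len(set(sample_ids)) != len(sample_ids):
        raise ValueError("sample_id values must be unique")
    return rows


# Placeholder content: the paths and the "genotype" column are made up.
example = (
    "sample_id\tbarcodes\tpath\tgenotype\n"
    "sampleA\t/data/sampleA/barcodes.tsv.gz\t/data/sampleA/possorted_genome_bam.bam\twt\n"
    "sampleB\t/data/sampleB/barcodes.tsv.gz\t/data/sampleB/possorted_genome_bam.bam\tko\n"
)

samples = read_sample_table(io.StringIO(example))
for row in samples:
    print(row["sample_id"], row["barcodes"], row["path"])
```

For a file on disk, the same helper could be called as `read_sample_table(open("input_samples.tsv", newline=""))`.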
Dependencies
This pipeline requires:
* cgat-core: https://github.com/cgat-developers/cgat-core
* samtools
* velocyto
Pipeline output
The pipeline returns:
* a loom file with intronic and exonic reads for use in scvelo analysis
Code
- cellhub.pipeline_velocyto.checkInputs(outfile)
Check that input_samples.tsv exists and that each path given in the file is a valid directory.
- cellhub.pipeline_velocyto.genClusterJobs()
Generate cluster jobs for each sample
- cellhub.pipeline_velocyto.sortBam(infile, outfile)
Sort bam file by cell barcodes
- cellhub.pipeline_velocyto.runVelocyto(infile, outfile)
Run velocyto on barcode-sorted bam file. This task writes a loom file into the pipeline-run directory for each sample.
- cellhub.pipeline_velocyto.full()
Run the full pipeline.
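The sortBam and runVelocyto tasks listed above wrap external tools. The sketch below shows roughly what the equivalent command lines might look like, assuming `samtools sort -t CB` is used to sort by the cell barcode tag and `velocyto run` is called with a barcode file (`-b`), an output folder (`-o`) and a sample id (`-e`). The helper names, the thread count, the example paths and the `cellsorted_` output name (the prefix velocyto looks for next to the input bam) are assumptions for illustration; the actual statements built by the pipeline may differ.

```python
import os
import shlex


def sort_bam_command(bam, threads=4):
    """Build a samtools command that sorts a bam file by the CB tag.

    Hypothetical helper; returns the command and the output path.
    """
    directory, name = os.path.split(bam)
    # Assumption: write the result next to the input with the
    # "cellsorted_" prefix so that velocyto can reuse it.
    sorted_bam = os.path.join(directory, "cellsorted_" + name)
    cmd = ["samtools", "sort", "-t", "CB", "-@", str(threads),
           "-o", sorted_bam, bam]
    return cmd, sorted_bam


def velocyto_command(bam, barcodes, gtf, outdir, sample_id):
    """Build a velocyto run command restricted to the given barcodes."""
    return ["velocyto", "run",
            "-b", barcodes,    # barcodes used to filter the bam
            "-o", outdir,      # folder that receives the loom file
            "-e", sample_id,   # used as the output file name
            bam, gtf]


# Placeholder paths; none of these come from the pipeline itself.
bam = "/data/sampleA/possorted_genome_bam.bam"
sort_cmd, sorted_bam = sort_bam_command(bam, threads=2)
run_cmd = velocyto_command(bam, "/data/sampleA/barcodes.tsv",
                           "/refs/genes.gtf", "velocyto.dir/sampleA", "sampleA")

# Print shell-quoted commands; pass the lists to subprocess.run to execute.
print(shlex.join(sort_cmd))
print(shlex.join(run_cmd))
```

Building the commands as lists rather than strings keeps paths with spaces intact and makes the statements easy to inspect before submitting them to a cluster.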