nf-pySpade

The Nextflow pipeline for pySpade. Details about pySpade: https://github.com/Hon-lab/pySpade

Introduction

To run the pySpade pipeline, please prepare the following input files:

Mapped transcriptome matrix: provide the Cell Ranger path that contains "filtered_feature_bc_matrix.h5".
Mapped sgRNA matrix: the columns are cell IDs and the index (rows) holds the sgRNA names. Each entry indicates whether the sgRNA is present in the cell: a value of 1 or more is treated as present, and 0 as absent.
sgRNA dictionary: the reference file that maps each perturbation region to its targeting sgRNAs. Example of the first line (tab-separated): chr1:1234567-1235067 sg1;sg2;sg3;sg4;sg5
Positive control file: used by the fc function to check whether the positive controls show good repression/activation. Example of the first line (tab-separated): chr1:1234567-1235067 ACTB
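
Before launching the pipeline, it can help to confirm that the sgRNA dictionary parses as expected and that every sgRNA it names also appears in the sgRNA matrix index. The following is a minimal sketch using only the Python standard library. It is not part of pySpade or this pipeline; the function names `parse_sgrna_dict` and `missing_sgrnas` are hypothetical, and the file layout is assumed to be exactly the two tab-separated columns shown in the example line above.

```python
import csv
import io


def parse_sgrna_dict(handle):
    """Map each perturbation region to its list of sgRNA names.

    Assumes two tab-separated columns: region, then sgRNAs joined by ';'.
    """
    regions = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        region, sgrnas = row[0], row[1]
        regions[region] = [name for name in sgrnas.split(";") if name]
    return regions


def missing_sgrnas(regions, matrix_index):
    """Return sgRNA names used in the dictionary but absent from the matrix index."""
    known = set(matrix_index)
    used = {name for names in regions.values() for name in names}
    return sorted(used - known)


# Hypothetical inputs: one dictionary line and a matrix index lacking sg5.
example = io.StringIO("chr1:1234567-1235067\tsg1;sg2;sg3;sg4;sg5\n")
regions = parse_sgrna_dict(example)
print(regions)
print(missing_sgrnas(regions, ["sg1", "sg2", "sg3", "sg4"]))
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")`. A non-empty result from `missing_sgrnas` suggests that the dictionary and the matrix use different sgRNA names.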

The default parameters are shown here; please change them if needed.

output directory: current folder
FDR (false discovery rate): 0.1, i.e. a 10% false discovery rate.
fold change cutoff: 0.2. The fold change must exceed 20% in either direction (up-regulation or down-regulation) for a gene to be considered a hit.
expression level cutoff: 0.05. A gene must be expressed in more than 5% of cells to be considered a hit.
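
To illustrate how the three cutoffs combine, here is a minimal sketch of a hit filter. It is not pySpade's implementation: the function `is_hit` and its argument names are hypothetical, the fold change is assumed to be a perturbed/control ratio (so 1.0 means no change), and the FDR is assumed to be compared as "at or below the cutoff". pySpade may represent these quantities differently.

```python
def is_hit(fdr, fold_change, expressed_fraction,
           fdr_cutoff=0.1, fc_cutoff=0.2, expr_cutoff=0.05):
    """Return True when a gene passes all three default cutoffs.

    Assumptions (not taken from pySpade): fold_change is a ratio where
    1.0 means no change, and expressed_fraction is the fraction of cells
    in which the gene is detected.
    """
    passes_fdr = fdr <= fdr_cutoff
    # More than a 20% change in either direction (up or down).
    passes_fc = abs(fold_change - 1.0) > fc_cutoff
    passes_expr = expressed_fraction > expr_cutoff
    return passes_fdr and passes_fc and passes_expr


# Hypothetical genes: (FDR, fold change ratio, fraction of expressing cells).
print(is_hit(0.01, 0.70, 0.30))  # 30% repression, significant, well expressed
print(is_hit(0.01, 1.05, 0.30))  # only a 5% change
print(is_hit(0.50, 1.50, 0.30))  # fails the FDR cutoff
print(is_hit(0.01, 1.50, 0.01))  # expressed in too few cells
```

Only the first example passes all three filters; each of the others fails exactly one cutoff.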

The pipeline writes the following output files:

Manhattan_plots/filtered_df.csv: the global hits (trans-regulated genes) after FDR, fold change, and expression level filtering.
Manhattan_plots/filtered_local_df.csv: the local hits (cis-regulated genes) after FDR, fold change, and expression level filtering.
Manhattan plots: one Manhattan plot per perturbation region (PDF files).
