Add InterProScan to Pipeline and integrate in AMPcombi #428

Darcy220606 · 2024-12-09T15:30:00Z

PR checklist

This PR adds InterProScan to funcscan and integrates it with AMPcombi v2.0.1, which can optionally parse the InterProScan output when the corresponding flag is set.

This PR also closes issue #434

🚨 🚨 As InterProScan requires a large database, I have not added it to any of the CI tests, because downloading the database alone would take 4 hours!
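
For anyone who wants to try the new step locally, here is a minimal sketch of a params file built from the InterProScan defaults that the lint report on this PR lists. The helper names (`build_params`, `write_params_file`) are hypothetical and not part of the pipeline, and this thread does not show whether an extra switch is needed to enable the step, so treat it as a starting point only.

```python
import json

# Defaults copied from the nf-core lint output posted on this PR.
INTERPROSCAN_DEFAULTS = {
    "protein_annotation_tool": "InterProScan",
    "protein_annotation_interproscan_db_url": (
        "http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.67-99.0/"
        "interproscan-5.67-99.0-64-bit.tar.gz"
    ),
    "protein_annotation_interproscan_applications": (
        "PANTHER,ProSiteProfiles,ProSitePatterns,Pfam"
    ),
}


def build_params(applications=None):
    """Return the InterProScan params, optionally narrowing the applications list."""
    params = dict(INTERPROSCAN_DEFAULTS)
    if applications is not None:
        # The pipeline default is a comma-separated string, so mirror that format.
        params["protein_annotation_interproscan_applications"] = ",".join(applications)
    return params


def write_params_file(path, params):
    """Write params as JSON, a format accepted by Nextflow's -params-file option."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle, indent=2)


if __name__ == "__main__":
    # Hypothetical example: restrict the run to Pfam only.
    print(json.dumps(build_params(["Pfam"]), indent=2))
```

A file written this way could then be passed to `nextflow run nf-core/funcscan` with `-params-file`, alongside the usual `--input` and `--outdir`.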

~~👀 👀 👀 👀 👀 👀 Still TODO once AMPcombi 2.0.1 is updated in nf-core: DONE!!~~

github-actions · 2024-12-09T15:32:05Z

`nf-core pipelines lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 5d1f4b7

✅ 350 tests passed
❔   1 tests were ignored
❗  15 tests had warnings

❗ Test warnings:

files_exist - File not found: conf/igenomes.config
files_exist - File not found: conf/igenomes_ignored.config
pipeline_todos - TODO string in ro-crate-metadata.json: the "description" field embeds the unedited nf-core README template, including the placeholder "TODO nf-core: Complete this sentence with a 2-3 sentence summary of what types of data the pipeline ingests, a brief overview of the major pipeline sections and the types of output it produces."
local_component_structure - merge_taxonomy_combgc.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
local_component_structure - merge_taxonomy_hamronization.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
local_component_structure - merge_taxonomy_ampcombi.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
local_component_structure - combgc.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
local_component_structure - amp_database_download.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
local_component_structure - interproscan_download.nf in modules/local should be moved to a TOOL/SUBTOOL/main.nf structure
local_component_structure - annotation.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
local_component_structure - bgc.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
local_component_structure - amp.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
local_component_structure - taxa_class.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
local_component_structure - arg.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
local_component_structure - protein_annotation.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
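
The subworkflow warnings above all ask for the same change: `NAME.nf` becomes `NAME/main.nf`. A minimal sketch of that mapping follows; the function name is hypothetical, and it only covers the subworkflows. The flagged modules need a `TOOL/SUBTOOL` split, which is a naming decision the lint message does not settle.

```python
from pathlib import PurePosixPath

# Subworkflow files flagged by local_component_structure in the report above.
FLAGGED_SUBWORKFLOWS = [
    "subworkflows/local/annotation.nf",
    "subworkflows/local/bgc.nf",
    "subworkflows/local/amp.nf",
    "subworkflows/local/taxa_class.nf",
    "subworkflows/local/arg.nf",
    "subworkflows/local/protein_annotation.nf",
]


def nested_subworkflow_path(flat_path):
    """Map subworkflows/local/NAME.nf to subworkflows/local/NAME/main.nf."""
    path = PurePosixPath(flat_path)
    if path.suffix != ".nf":
        raise ValueError(f"not a Nextflow script: {flat_path}")
    # Drop the .nf suffix to get the new directory, then place main.nf inside it.
    return str(path.with_suffix("") / "main.nf")


if __name__ == "__main__":
    for old in FLAGGED_SUBWORKFLOWS:
        print(f"{old} -> {nested_subworkflow_path(old)}")
```

Any `include` statements that reference the old paths would need updating after such a move.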

❔ Tests ignored:

actions_ci - actions_ci

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-funcscan_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-funcscan_logo_light.png
files_exist - File found: docs/images/nf-core-funcscan_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: modules.json
files_exist - File found: ro-crate-metadata.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-funcscan_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/NfcoreTemplate.groovy
files_exist - File not found check: lib/Utils.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: lib/WorkflowMain.groovy
files_exist - File not found check: lib/WorkflowFuncscan.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Found nf-schema plugin
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: validation.help.enabled
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable found: validation.help.beforeText
nextflow_config - Config variable found: validation.help.afterText
nextflow_config - Config variable found: validation.help.command
nextflow_config - Config variable found: validation.summary.beforeText
nextflow_config - Config variable found: validation.summary.afterText
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config variable (correctly) not found: params.max_cpus
nextflow_config - Config variable (correctly) not found: params.max_memory
nextflow_config - Config variable (correctly) not found: params.max_time
nextflow_config - Config variable (correctly) not found: params.validationFailUnrecognisedParams
nextflow_config - Config variable (correctly) not found: params.validationLenientMode
nextflow_config - Config variable (correctly) not found: params.validationSchemaIgnoreParams
nextflow_config - Config variable (correctly) not found: params.validationShowHiddenParams
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 2.1.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.taxa_classification_tool= mmseqs2
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_compressed= 0
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_db_id= Kalamari
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_searchtype= 2
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_lcaranks= kingdom,phylum,class,order,family,genus,species
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_taxlineage= 1
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_sensitivity= 5.0
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_orffilters= 2.0
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_lcamode= 3
nextflow_config - Config default value correct: params.taxa_classification_mmseqs_taxonomy_votemode= 1
nextflow_config - Config default value correct: params.annotation_tool= pyrodigal
nextflow_config - Config default value correct: params.annotation_bakta_db_downloadtype= full
nextflow_config - Config default value correct: params.annotation_bakta_mincontiglen= 1
nextflow_config - Config default value correct: params.annotation_bakta_translationtable= 11
nextflow_config - Config default value correct: params.annotation_bakta_gram= ?
nextflow_config - Config default value correct: params.annotation_prokka_kingdom= Bacteria
nextflow_config - Config default value correct: params.annotation_prokka_gcode= 11
nextflow_config - Config default value correct: params.annotation_prokka_mincontiglen= 1
nextflow_config - Config default value correct: params.annotation_prokka_evalue= 1e-06
nextflow_config - Config default value correct: params.annotation_prokka_coverage= 80
nextflow_config - Config default value correct: params.annotation_prokka_compliant= true
nextflow_config - Config default value correct: params.annotation_prodigal_transtable= 11
nextflow_config - Config default value correct: params.annotation_pyrodigal_transtable= 11
nextflow_config - Config default value correct: params.protein_annotation_tool= InterProScan
nextflow_config - Config default value correct: params.protein_annotation_interproscan_db_url= http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.67-99.0/interproscan-5.67-99.0-64-bit.tar.gz
nextflow_config - Config default value correct: params.protein_annotation_interproscan_applications= PANTHER,ProSiteProfiles,ProSitePatterns,Pfam
nextflow_config - Config default value correct: params.amp_ampir_model= precursor
nextflow_config - Config default value correct: params.amp_ampir_minlength= 10
nextflow_config - Config default value correct: params.amp_ampcombi_db_id= DRAMP
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_cutoff= 0.6
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_aalength= 120
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_dbevalue= 5.0
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_hmmevalue= 0.06
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_windowstopcodon= 60
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_windowtransport= 11
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_ampir= .ampir.tsv
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_amplify= .amplify.tsv
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_macrel= .macrel.prediction
nextflow_config - Config default value correct: params.amp_ampcombi_parsetables_hmmsearch= .hmmer_hmmsearch.txt
nextflow_config - Config default value correct: params.amp_ampcombi_cluster_covmode= 0.0
nextflow_config - Config default value correct: params.amp_ampcombi_cluster_sensitivity= 4.0
nextflow_config - Config default value correct: params.amp_ampcombi_cluster_minmembers= 0
nextflow_config - Config default value correct: params.amp_ampcombi_cluster_mode= 1.0
nextflow_config - Config default value correct: params.amp_ampcombi_cluster_coverage= 0.8
nextflow_config - Config default value correct: params.amp_ampcombi_cluster_seqid= 0.4
nextflow_config - Config default value correct: params.arg_amrfinderplus_identmin= -1.0
nextflow_config - Config default value correct: params.arg_amrfinderplus_coveragemin= 0.5
nextflow_config - Config default value correct: params.arg_amrfinderplus_translationtable= 11
nextflow_config - Config default value correct: params.arg_deeparg_db_version= 2
nextflow_config - Config default value correct: params.arg_deeparg_model= LS
nextflow_config - Config default value correct: params.arg_deeparg_minprob= 0.8
nextflow_config - Config default value correct: params.arg_deeparg_alignmentevalue= 1e-10
nextflow_config - Config default value correct: params.arg_deeparg_alignmentidentity= 50
nextflow_config - Config default value correct: params.arg_deeparg_alignmentoverlap= 0.8
nextflow_config - Config default value correct: params.arg_deeparg_numalignmentsperentry= 1000
nextflow_config - Config default value correct: params.arg_fargene_hmmmodel= class_a,class_b_1_2,class_b_3,class_c,class_d_1,class_d_2,qnr,tet_efflux,tet_rpg,tet_enzyme
nextflow_config - Config default value correct: params.arg_fargene_minorflength= 90
nextflow_config - Config default value correct: params.arg_fargene_translationformat= pearson
nextflow_config - Config default value correct: params.arg_rgi_alignmenttool= BLAST
nextflow_config - Config default value correct: params.arg_rgi_data= NA
nextflow_config - Config default value correct: params.arg_rgi_split_prodigal_jobs= true
nextflow_config - Config default value correct: params.arg_abricate_db_id= ncbi
nextflow_config - Config default value correct: params.arg_abricate_minid= 80
nextflow_config - Config default value correct: params.arg_abricate_mincov= 80
nextflow_config - Config default value correct: params.arg_hamronization_summarizeformat= tsv
nextflow_config - Config default value correct: params.bgc_mincontiglength= 3000
nextflow_config - Config default value correct: params.bgc_antismash_contigminlength= 3000
nextflow_config - Config default value correct: params.bgc_antismash_hmmdetectionstrictness= relaxed
nextflow_config - Config default value correct: params.bgc_antismash_taxon= bacteria
nextflow_config - Config default value correct: params.bgc_deepbgc_score= 0.5
nextflow_config - Config default value correct: params.bgc_deepbgc_mergemaxproteingap= 0
nextflow_config - Config default value correct: params.bgc_deepbgc_mergemaxnuclgap= 0
nextflow_config - Config default value correct: params.bgc_deepbgc_minnucl= 1
nextflow_config - Config default value correct: params.bgc_deepbgc_minproteins= 1
nextflow_config - Config default value correct: params.bgc_deepbgc_mindomains= 1
nextflow_config - Config default value correct: params.bgc_deepbgc_minbiodomains= 0
nextflow_config - Config default value correct: params.bgc_deepbgc_classifierscore= 0.5
nextflow_config - Config default value correct: params.bgc_gecco_cds= 3
nextflow_config - Config default value correct: params.bgc_gecco_pfilter= 1e-09
nextflow_config - Config default value correct: params.bgc_gecco_threshold= 0.8
nextflow_config - Config default value correct: params.bgc_gecco_edgedistance= 0
nextflow_config - Config default value correct: params.custom_config_version= master
nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-funcscan_logo_light.png matches the template
files_unchanged - docs/images/nf-core-funcscan_logo_light.png matches the template
files_unchanged - docs/images/nf-core-funcscan_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 24.04.2, Config: 24.04.2
readme - README Zenodo placeholder was replaced with DOI.
plugin_includes - No wrong validation plugin imports have been found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (0 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: download_pipeline.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: template_version_comment.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: release-announcements.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
base_config - GUNZIP found in conf/base.config and Nextflow scripts.
base_config - UNTAR found in conf/base.config and Nextflow scripts.
base_config - PROKKA found in conf/base.config and Nextflow scripts.
base_config - PRODIGAL_GBK found in conf/base.config and Nextflow scripts.
base_config - BAKTA_BAKTA found in conf/base.config and Nextflow scripts.
base_config - ABRICATE_RUN found in conf/base.config and Nextflow scripts.
base_config - AMRFINDERPLUS_RUN found in conf/base.config and Nextflow scripts.
base_config - DEEPARG_DOWNLOADDATA found in conf/base.config and Nextflow scripts.
base_config - DEEPARG_PREDICT found in conf/base.config and Nextflow scripts.
base_config - FARGENE found in conf/base.config and Nextflow scripts.
base_config - RGI_MAIN found in conf/base.config and Nextflow scripts.
base_config - AMPIR found in conf/base.config and Nextflow scripts.
base_config - AMPLIFY_PREDICT found in conf/base.config and Nextflow scripts.
base_config - AMP_HMMER_HMMSEARCH found in conf/base.config and Nextflow scripts.
base_config - MACREL_CONTIGS found in conf/base.config and Nextflow scripts.
base_config - BGC_HMMER_HMMSEARCH found in conf/base.config and Nextflow scripts.
base_config - ANTISMASH_ANTISMASHLITE found in conf/base.config and Nextflow scripts.
base_config - ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES found in conf/base.config and Nextflow scripts.
base_config - DEEPBGC_DOWNLOAD found in conf/base.config and Nextflow scripts.
base_config - DEEPBGC_PIPELINE found in conf/base.config and Nextflow scripts.
base_config - GECCO_RUN found in conf/base.config and Nextflow scripts.
base_config - HAMRONIZATION_ABRICATE found in conf/base.config and Nextflow scripts.
base_config - HAMRONIZATION_AMRFINDERPLUS found in conf/base.config and Nextflow scripts.
base_config - HAMRONIZATION_DEEPARG found in conf/base.config and Nextflow scripts.
base_config - HAMRONIZATION_RGI found in conf/base.config and Nextflow scripts.
base_config - HAMRONIZATION_FARGENE found in conf/base.config and Nextflow scripts.
base_config - HAMRONIZATION_SUMMARIZE found in conf/base.config and Nextflow scripts.
base_config - ARGNORM_DEEPARG found in conf/base.config and Nextflow scripts.
base_config - ARGNORM_ABRICATE found in conf/base.config and Nextflow scripts.
base_config - ARGNORM_AMRFINDERPLUS found in conf/base.config and Nextflow scripts.
base_config - AMPCOMBI2_PARSETABLES found in conf/base.config and Nextflow scripts.
base_config - AMPCOMBI2_CLUSTER found in conf/base.config and Nextflow scripts.
base_config - INTERPROSCAN_DATABASE found in conf/base.config and Nextflow scripts.
modules_config - conf/modules.config found and not ignored.
modules_config - MULTIQC found in conf/modules.config and Nextflow scripts.
modules_config - GUNZIP found in conf/modules.config and Nextflow scripts.
modules_config - MMSEQS_DATABASES found in conf/modules.config and Nextflow scripts.
modules_config - MMSEQS_CREATEDB found in conf/modules.config and Nextflow scripts.
modules_config - MMSEQS_TAXONOMY found in conf/modules.config and Nextflow scripts.
modules_config - MMSEQS_CREATETSV found in conf/modules.config and Nextflow scripts.
modules_config - SEQKIT_SEQ_LENGTH found in conf/modules.config and Nextflow scripts.
modules_config - SEQKIT_SEQ_FILTER found in conf/modules.config and Nextflow scripts.
modules_config - INTERPROSCAN_DATABASE found in conf/modules.config and Nextflow scripts.
modules_config - INTERPROSCAN found in conf/modules.config and Nextflow scripts.
modules_config - PROKKA found in conf/modules.config and Nextflow scripts.
modules_config - BAKTA_BAKTADBDOWNLOAD found in conf/modules.config and Nextflow scripts.
modules_config - BAKTA_BAKTA found in conf/modules.config and Nextflow scripts.
modules_config - PRODIGAL found in conf/modules.config and Nextflow scripts.
modules_config - PYRODIGAL found in conf/modules.config and Nextflow scripts.
modules_config - ABRICATE_RUN found in conf/modules.config and Nextflow scripts.
modules_config - AMRFINDERPLUS_UPDATE found in conf/modules.config and Nextflow scripts.
modules_config - AMRFINDERPLUS_RUN found in conf/modules.config and Nextflow scripts.
modules_config - DEEPARG_DOWNLOADDATA found in conf/modules.config and Nextflow scripts.
modules_config - DEEPARG_PREDICT found in conf/modules.config and Nextflow scripts.
modules_config - FARGENE found in conf/modules.config and Nextflow scripts.
modules_config - UNTAR_CARD found in conf/modules.config and Nextflow scripts.
modules_config - RGI_CARDANNOTATION found in conf/modules.config and Nextflow scripts.
modules_config - RGI_MAIN found in conf/modules.config and Nextflow scripts.
modules_config - AMPIR found in conf/modules.config and Nextflow scripts.
modules_config - AMPLIFY_PREDICT found in conf/modules.config and Nextflow scripts.
modules_config - AMP_HMMER_HMMSEARCH found in conf/modules.config and Nextflow scripts.
modules_config - MACREL_CONTIGS found in conf/modules.config and Nextflow scripts.
modules_config - BGC_HMMER_HMMSEARCH found in conf/modules.config and Nextflow scripts.
modules_config - ANTISMASH_ANTISMASHLITE found in conf/modules.config and Nextflow scripts.
modules_config - ANTISMASH_ANTISMASHLITEDOWNLOADDATABASES found in conf/modules.config and Nextflow scripts.
modules_config - DEEPBGC_DOWNLOAD found in conf/modules.config and Nextflow scripts.
modules_config - DEEPBGC_PIPELINE found in conf/modules.config and Nextflow scripts.
modules_config - GECCO_RUN found in conf/modules.config and Nextflow scripts.
modules_config - HAMRONIZATION_ABRICATE found in conf/modules.config and Nextflow scripts.
modules_config - HAMRONIZATION_AMRFINDERPLUS found in conf/modules.config and Nextflow scripts.
modules_config - HAMRONIZATION_DEEPARG found in conf/modules.config and Nextflow scripts.
modules_config - HAMRONIZATION_RGI found in conf/modules.config and Nextflow scripts.
modules_config - HAMRONIZATION_FARGENE found in conf/modules.config and Nextflow scripts.
modules_config - HAMRONIZATION_SUMMARIZE found in conf/modules.config and Nextflow scripts.
modules_config - MERGE_TAXONOMY_HAMRONIZATION found in conf/modules.config and Nextflow scripts.
modules_config - ARG_TABIX_BGZIP found in conf/modules.config and Nextflow scripts.
modules_config - AMPCOMBI2_PARSETABLES found in conf/modules.config and Nextflow scripts.
modules_config - AMPCOMBI2_COMPLETE found in conf/modules.config and Nextflow scripts.
modules_config - AMPCOMBI2_CLUSTER found in conf/modules.config and Nextflow scripts.
modules_config - MERGE_TAXONOMY_AMPCOMBI found in conf/modules.config and Nextflow scripts.
modules_config - AMP_TABIX_BGZIP found in conf/modules.config and Nextflow scripts.
modules_config - COMBGC found in conf/modules.config and Nextflow scripts.
modules_config - ARGNORM_ABRICATE found in conf/modules.config and Nextflow scripts.
modules_config - ARGNORM_AMRFINDERPLUS found in conf/modules.config and Nextflow scripts.
modules_config - ARGNORM_DEEPARG found in conf/modules.config and Nextflow scripts.
modules_config - MERGE_TAXONOMY_COMBGC found in conf/modules.config and Nextflow scripts.
modules_config - BGC_TABIX_BGZIP found in conf/modules.config and Nextflow scripts.
modules_config - AMP_DATABASE_DOWNLOAD found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline
nfcore_yml - nf-core version in .nf-core.yml is set to the latest version: 3.2.0

Run details

nf-core/tools version 3.2.0
Run at 2025-02-02 17:34:28

nf-core-bot · 2025-01-06T07:53:57Z

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.1.0.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

Darcy220606 · 2025-01-08T07:30:11Z

@nf-core-bot fix linting

Darcy220606 · 2025-01-09T12:30:24Z

@nf-core-bot fix linting

Darcy220606 · 2025-01-09T13:36:37Z

Also fixes issue number #434

jfy133

The main issue is that I don't like the use of "function": we already use "function" in funcscan in a broad sense. Can you refine what exactly we are using InterProScan for, and then we can adjust the naming?

CHANGELOG.md

conf/base.config

conf/modules.config

docs/output.md

nextflow_schema.json

subworkflows/local/amp.nf

subworkflows/local/function.nf

workflows/funcscan.nf

jfy133

Almost there!

Also missing README update

CHANGELOG.md

docs/output.md

docs/usage.md

nextflow.config

jasmezz

Great work overall 💪

Now that you introduce the protein_annotation workflow, I wonder if we should rename the DNA-level annotation workflow (of pyrodigal, bakta etc.). Maybe to contig_annotation, cds_annotation, or orf_annotation?

jasmezz · 2025-02-04T11:21:52Z

CHANGELOG.md

-| MultiQC   | 1.24.0           | 1.27        |
-| Pyrodigal | 3.3.0            | 3.6.3       |
-| seqkit    | 2.8.1            | 2.9.0       |
+=======


Suggested change

=======

jasmezz · 2025-02-04T11:24:20Z

CHANGELOG.md

+| Tool | Previous version | New version |
+| ------------ | ---------------- | ----------- |
+| AMPcombi | 0.2.2 | 2.0.1 |
+| Bakta | 1.9.3 | 1.10.4 |
+| InterProScan | - | 5.59_91.0 |
+| Macrel | 1.2.0 | 1.4.0 |
+| MMseqs2 | 15.6f452 | 17.b804f |
+| MultiQC | 1.24.0 | 1.27 |
+| Pyrodigal | 3.3.0 | 3.6.3 |
+| seqkit | 2.8.1 | 2.9.0 |


Is this table formatting intended? I'm not sure we should write it like this (i.e. without padding the cells with spaces so that the columns line up).

jasmezz · 2025-02-04T11:28:48Z

CITATIONS.md

@@ -70,6 +70,14 @@

  > Eddy S. R. (2011). Accelerated Profile HMM Searches. PLoS computational biology, 7(10), e1002195. [DOI: 10.1371/journal.pcbi.1002195](https://doi.org/10.1371/journal.pcbi.1002195)

+- [InterPro](https://doi.org/10.1093/nar/gkaa977)
+
+  > Blum, M., Chang, H-Y., Chuguransky, S., Grego, T., Kandasaamy, S., Mitchell, A., Nuka, G., Paysan-Lafosse, T., Qureshi, M., Raj, S., Richardson, L., Salazar, G.A., Williams, L., Bork, P., Bridge, A., Gough, J., Haft, D.H., Letunic, I., Marchler-Bauer, A., Mi, H., Natale, D.A., Necci, M., Orengo, C.A., Pandurangan, A.P., Rivoire, C., Sigrist, C.A., Sillitoe, I., Thanki, N., Thomas, P.D., Tosatto, S.C.E, Wu, C.H., Bateman, A., Finn, R.D. (2021) The InterPro protein families and domains database: 20 years on, Nucleic Acids Research, 49(D1), D344–D354.[DOI: 10.1093/nar/gkaa977](https://doi.org/10.1093/nar/gkaa977).


Suggested change

-  > Blum, M., Chang, H-Y., Chuguransky, S., Grego, T., Kandasaamy, S., Mitchell, A., Nuka, G., Paysan-Lafosse, T., Qureshi, M., Raj, S., Richardson, L., Salazar, G.A., Williams, L., Bork, P., Bridge, A., Gough, J., Haft, D.H., Letunic, I., Marchler-Bauer, A., Mi, H., Natale, D.A., Necci, M., Orengo, C.A., Pandurangan, A.P., Rivoire, C., Sigrist, C.A., Sillitoe, I., Thanki, N., Thomas, P.D., Tosatto, S.C.E, Wu, C.H., Bateman, A., Finn, R.D. (2021) The InterPro protein families and domains database: 20 years on, Nucleic Acids Research, 49(D1), D344–D354.[DOI: 10.1093/nar/gkaa977](https://doi.org/10.1093/nar/gkaa977).
+  > Blum, M., Chang, H-Y., Chuguransky, S., Grego, T., Kandasaamy, S., Mitchell, A., Nuka, G., Paysan-Lafosse, T., Qureshi, M., Raj, S., Richardson, L., Salazar, G. A., Williams, L., Bork, P., Bridge, A., Gough, J., Haft, D. H., Letunic, I., Marchler-Bauer, A., Mi, H., Natale, D. A., Necci, M., Orengo, C. A., Pandurangan, A. P., Rivoire, C., Sigrist, C. A., Sillitoe, I., Thanki, N., Thomas, P. D., Tosatto, S. C. E, Wu, C. H., Bateman, A., Finn, R. D. (2021) The InterPro protein families and domains database: 20 years on. Nucleic Acids Research, 49(D1), D344–D354. [DOI: 10.1093/nar/gkaa977](https://doi.org/10.1093/nar/gkaa977)

jasmezz · 2025-02-04T11:31:00Z

CITATIONS.md

+
+- [InterProScan](https://doi.org/10.1093/bioinformatics/btu031)
+
+  > Jones, P., Binns, D., Chang, H-Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A.F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S-Y., Lopez, R., Hunter, S. (2014)InterProScan 5: genome-scale protein function classification, Bioinformatics, 30(9), 1236–1240. [DOI: 10.1093/bioinformatics/btu031](https://doi.org/10.1093/bioinformatics/btu031)


Suggested change

-  > Jones, P., Binns, D., Chang, H-Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A.F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S-Y., Lopez, R., Hunter, S. (2014)InterProScan 5: genome-scale protein function classification, Bioinformatics, 30(9), 1236–1240. [DOI: 10.1093/bioinformatics/btu031](https://doi.org/10.1093/bioinformatics/btu031)
+  > Jones, P., Binns, D., Chang, H-Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A. F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S-Y., Lopez, R., Hunter, S. (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics, 30(9), 1236–1240. [DOI: 10.1093/bioinformatics/btu031](https://doi.org/10.1093/bioinformatics/btu031)

jasmezz · 2025-02-04T11:35:49Z

conf/modules.config

+    withName: SEQKIT_SEQ_FILTER {
+        ext.prefix = { "${meta.id}_cleaned.faa" }
+        publishDir = [
+            path: { "${params.outdir}/protein_annotation/interproscan/" },


Are we sure we want the output in ${params.outdir}/protein_annotation/interproscan/ and not in ${params.outdir}/annotation/interproscan/? I'd prefer the latter, to have it all in one place regardless of DNA (pyrodigal etc.) or protein annotation (interproscan). I think it's more intuitive to search for any annotation results in a single folder.

If not, what do you think of renaming the annotation output folder to contig_annotation?

jasmezz · 2025-02-04T15:33:29Z

subworkflows/local/protein_annotation.nf

+                .first()
+        } else {
+            INTERPROSCAN_DATABASE ( params.protein_annotation_interproscan_db_url )
+            ch_versions  = ch_versions.mix( INTERPROSCAN_DATABASE.out.versions )


Suggested change

- ch_versions  = ch_versions.mix( INTERPROSCAN_DATABASE.out.versions )
+ ch_versions = ch_versions.mix( INTERPROSCAN_DATABASE.out.versions )

jasmezz · 2025-02-04T15:33:43Z

subworkflows/local/protein_annotation.nf

+        }
+
+        INTERPROSCAN( ch_faa_for_interproscan, ch_interproscan_db )
+        ch_versions  = ch_versions.mix( INTERPROSCAN.out.versions )


Suggested change

- ch_versions  = ch_versions.mix( INTERPROSCAN.out.versions )
+ ch_versions = ch_versions.mix( INTERPROSCAN.out.versions )

jasmezz · 2025-02-04T15:36:27Z

subworkflows/local/protein_annotation.nf

+        ch_versions  = ch_versions.mix( INTERPROSCAN.out.versions )
+        ch_interproscan_tsv = ch_interproscan_tsv.mix( INTERPROSCAN.out.tsv )
+
+        // Current INTERPROSCAN version 5.59_91.0 only includes 13 columns and not 15 which ampcombi expects, so we added them here


Isn't this something to solve upstream on the AMPcombi side? 😬 It's OK for now I guess, but it would be better to have this column-number check done by AMPcombi rather than at pipeline level.
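To make the workaround under discussion concrete, here is a minimal Python sketch of padding a 13-column TSV out to the 15 columns AMPcombi expects. It is not the code from this PR: the function names, the `"-"` filler value and the dropping of blank lines are assumptions made purely for illustration.

```python
def pad_tsv_line(line: str, expected_columns: int = 15, filler: str = "-") -> str:
    """Pad one tab-separated line with filler fields up to expected_columns.

    Lines that already have enough fields are returned unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < expected_columns:
        fields += [filler] * (expected_columns - len(fields))
    return "\t".join(fields)


def pad_tsv(text: str, expected_columns: int = 15, filler: str = "-") -> str:
    """Pad every non-blank line of a TSV document; blank lines are dropped."""
    padded = [
        pad_tsv_line(line, expected_columns, filler)
        for line in text.splitlines()
        if line.strip()
    ]
    # Keep a trailing newline only when there is any content at all.
    return "\n".join(padded) + ("\n" if padded else "")
```

Doing the same check inside AMPcombi, as suggested above, would keep the pipeline free of tool-specific column handling.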

jasmezz · 2025-02-04T15:38:28Z

workflows/funcscan.nf

+        PROTEIN_ANNOTATION (
+            ch_input_for_protein_annotation
+        )


Suggested change

- PROTEIN_ANNOTATION (
-     ch_input_for_protein_annotation
- )
+ PROTEIN_ANNOTATION ( ch_input_for_protein_annotation )

jasmezz · 2025-02-04T15:38:57Z

workflows/funcscan.nf

+
+        ch_interproscan_tsv = PROTEIN_ANNOTATION.out.tsv.map { meta, file ->
+            if (file == [] || file.isEmpty()) {
+                log.warn("[nf-core/funcscan] Protein annotation with INTERPROSCAN produced an empty TSV file. No protein annotation will be added for ${meta.id}.")


Suggested change

- log.warn("[nf-core/funcscan] Protein annotation with INTERPROSCAN produced an empty TSV file. No protein annotation will be added for ${meta.id}.")
+ log.warn("[nf-core/funcscan] Protein annotation with InterProScan produced an empty TSV file. No protein annotation will be added for sample ${meta.id}.")

Darcy220606 added 6 commits December 5, 2024 13:34

install interproscan in funcscan

86592d9

add module in modules.json

491f25d

start adding interproscan_amp functionality

bbd456e

add citation

5b5cb3e

update documentations

d8c5bf2

update dynamic optional parameter path

fd1ef46

Darcy220606 marked this pull request as draft December 9, 2024 15:32

remove dynamik ext.arg

b54f1ea

Base automatically changed from nf-core-template-merge-3.0.2 to dev December 18, 2024 12:51

Merge remote-tracking branch 'origin/dev' into add_interproscan_to_amp

fee3adb

remove comments and fix Interproscan subworkflow

8b44ed5

nf-core-bot and others added 4 commits January 8, 2025 07:30

[automated] Fix code linting

ed81b0b

fix linting issues

0340ba9

fix ampcombi when only one file passes parsetables

7e1f164

fix ampcombi2 wo interproscan

5be17ef

Darcy220606 added 2 commits January 9, 2025 13:35

clean up comments

b782b54

fix linting

e58b322

Darcy220606 marked this pull request as ready for review January 9, 2025 13:11

Darcy220606 mentioned this pull request Jan 9, 2025

WARN: Process NFCORE_FUNCSCAN:FUNCSCAN:AMP:AMP_DATABASE_DOWNLOAD publishDir path contains a variable with a null value #434

Closed

Darcy220606 linked an issue Jan 9, 2025 that may be closed by this pull request

WARN: Process NFCORE_FUNCSCAN:FUNCSCAN:AMP:AMP_DATABASE_DOWNLOAD publishDir path contains a variable with a null value #434

Closed

jfy133 reviewed Jan 10, 2025


Darcy220606 added 2 commits January 17, 2025 15:58

Add reviewer requests

310a0be

fix prettier

8c8aeaf

Darcy220606 requested a review from jfy133 January 22, 2025 08:50

jfy133 reviewed Jan 22, 2025


jasmezz mentioned this pull request Jan 31, 2025

AMP screening doesn't run when only a single tool is enabled #444

Open

Darcy220606 and others added 3 commits February 2, 2025 15:24

fix code review -docs

804da18

Merge branch 'dev' into add_interproscan_to_amp

186d351

fix prettier

f88fe9c

Darcy220606 requested review from jfy133 and jasmezz February 2, 2025 15:13

Darcy220606 linked an issue Feb 2, 2025 that may be closed by this pull request

AMP screening doesn't run when only a single tool is enabled #444

Open

Fix code review

5d1f4b7

jasmezz requested changes Feb 4, 2025




