Unable to run spacedust normally #3
Comments
I am running into the same problem.
Thanks for the reminder and sorry for the delay. Spacedust currently only accepts faa files generated with Prodigal, from which the correct coordinate information can be parsed. If you would still like to work with Prokka, in principle you could create the DB with the gff files generated by Prokka.
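To check whether an faa file carries Prodigal-style coordinates before building a database, something like the following Python sketch may help. It relies on the header layout Prodigal writes for protein FASTA output (`>contig_N # start # end # strand # ID=...`). How Spacedust itself parses these headers is not shown in this thread, so the regular expression and the function names `parse_prodigal_header` and `count_headers` are illustrative assumptions, not Spacedust code.

```python
import re

# Prodigal protein FASTA headers look like:
#   >contig_1 # 337 # 2799 # 1 # ID=1_1;partial=00;...
# Assumption: this regex mirrors that layout; it is not taken from Spacedust.
PRODIGAL_HEADER = re.compile(r"^>(\S+)_(\d+) # (\d+) # (\d+) # (-?1) # ")


def parse_prodigal_header(line):
    """Return (contig, gene_index, start, end, strand), or None if the
    header does not follow the Prodigal layout."""
    match = PRODIGAL_HEADER.match(line)
    if match is None:
        return None
    contig, index, start, end, strand = match.groups()
    return contig, int(index), int(start), int(end), int(strand)


def count_headers(lines):
    """Count FASTA headers with and without Prodigal-style coordinates."""
    with_coords = without_coords = 0
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        if parse_prodigal_header(line) is None:
            without_coords += 1
        else:
            with_coords += 1
    return with_coords, without_coords
```

For a file on disk, `count_headers(open("proteins.faa"))` (hypothetical file name) returns the two counts. A non-zero second value suggests headers, such as those written by Prokka or Bakta, that do not carry coordinates in this form.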
I can work with Prodigal as required. However, I have concerns about the accuracy of Prodigal's CDS predictions, as it lacks certain error-correction steps. Support for the formats generated by Prokka and Bakta would be incredibly valuable: these tools are widely used, and compatibility with them would likely expand Spacedust's usability and adoption significantly. Thank you for considering this potential enhancement!
+1 here, as I also tried to run spacedust on bakta-generated faa files. Would be super helpful if that could be updated in the future!
Expected Behavior
Test and obtain the expected gene cluster.
Current Behavior
Using CDS
When I use the gff file generated by prokka, it prompts "Not enough columns in GFF file":
./spacedust createsetdb *fna setDB tmpFolder --gff-dir gff.txt --gff-type CDS
When running the next command, an error occurs:
./spacedust clustersearch setDB setDB result.tsv tmpFolder
Using faa
There is no error in building the database, but an error also occurs when running:
./spacedust clustersearch setDB setDB result.tsv tmpFolder
A puzzling point
When I use the example data provided in the current repository, CDS still prompts "Not enough columns in GFF file", while faa runs to completion within a few minutes.
My gff and faa files were generated using prokka. The size of my genomes is about 4.5M. Despite using the same commands, my own data doesn't work properly.
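One way to narrow down the "Not enough columns in GFF file" message is to look for lines in the gff file that have fewer than the nine tab-separated columns that GFF3 defines for feature lines. The Python sketch below does only that. It skips comment lines and stops at a `##FASTA` directive, which Prokka appends to its gff output together with the sequences. Whether that section, or anything else this check finds, is what Spacedust actually trips over is an assumption that this thread does not confirm, and `find_short_gff_lines` is a hypothetical helper name.

```python
def find_short_gff_lines(lines, min_columns=9):
    """Return (line_number, column_count) for every feature line that has
    fewer than min_columns tab-separated fields.

    Assumption: only lines before a '##FASTA' directive are features;
    everything after it is embedded sequence and is ignored here.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded FASTA section starts here
        if not line.strip() or line.startswith("#"):
            continue  # blank line, comment, or directive
        columns = len(line.split("\t"))
        if columns < min_columns:
            problems.append((number, columns))
    return problems
```

Calling `find_short_gff_lines(open("genome.gff"))` (hypothetical file name) returns an empty list when every feature line has at least nine columns. Any reported line numbers are a reasonable place to start looking, for example for features separated by spaces instead of tabs.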
Your Environment
I ran the same commands separately on Ubuntu and CentOS. The example_data runs successfully, but the run fails when I try it with my own data.
spacedust Output (for bugs)
The output of the command
./spacedust clustersearch setDB setDB result.tsv tmpFolder