How to speed up protein screening #481
Replies: 7 comments
-
Hi Gabriel, the inference speed depends not only on the sequence length but also on other factors, such as the size of the MSA. Did you compare AlphaFold and AlphaPulldown run times on exactly the same input, including the same input sequence alignments and templates, the same number of recycles, and so on? Best,
-
Hi @gabrielpan147
-
Hello all, thank you for your reply! I haven't tested the official AlphaFold on the same input yet, and I agree this is what I need to test. I am also wondering, if we focus on AlphaPulldown, do you have any tips and suggestions for accelerating the inference speed?
-
Well, you can limit the MSA depth, reduce the number of recycles, or play with some other parameters in config.py, but it's always a trade-off between speed and quality, and I don't think there is a trick to accelerate inference without affecting accuracy; otherwise it would be well known by now :) We also recently found out that conversion to modelCIF can take a while for large PAEs, so maybe turn that off too.
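For concreteness, here is a minimal sketch of the kind of config.py adjustments meant above, using the AlphaFold model-config API that AlphaPulldown builds on. The attribute paths and the defaults they override (num_recycle, the MSA sizes, subbatch_size) are assumptions based on an AlphaFold 2.3-style config and can differ between versions, so check the config.py shipped with your installation before relying on them.

```python
# Sketch only: speed/accuracy knobs for a multimer model, assuming the
# AlphaFold 2.3-style config layout that AlphaPulldown builds on.
# Attribute paths and defaults may differ in your version.
from alphafold.model import config

cfg = config.model_config("model_2_multimer_v3")

# Fewer recycles finish faster, but large complexes may be less converged.
cfg.model.num_recycle = 3

# A shallower MSA is a big speed win but reduces sensitivity for weak or
# poorly conserved interactions.
cfg.model.embeddings_and_evoformer.num_msa = 252
cfg.model.embeddings_and_evoformer.num_extra_msa = 1024

# Larger sub-batches mean less chunking overhead at the cost of GPU memory;
# on an A100 you can usually afford more than the conservative default.
cfg.model.global_config.subbatch_size = 8
```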
-
Exactly as Dima said, unfortunately speed always sacrifices sensitivity and accuracy, so it depends on your biological question, e.g. whether you want to find as many interactions as possible in your system or you are fine with missing some. In addition to the parameters Dima listed above, you can also set --num_predictions_per_model=1 and only run one model (e.g. --model_names=model_2_multimer_v3); see the sketch below. @DimaMolod, by the way, since this question keeps coming back, maybe we could have a subpage in the docs listing all these settings for the fastest speed, with a warning about accuracy? (We should also occasionally test that AP is as fast as AF on the same input.)
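Something along these lines, as a speed-oriented sketch rather than a recommended protocol: the two flags singled out above come from this thread, while the remaining flag names and all paths are placeholders that may vary between AlphaPulldown versions, so check `run_multimer_jobs.py --help` on your install.

```bash
# Hypothetical speed-oriented pulldown run: one prediction from a single model.
# Paths are placeholders; flag availability depends on your AlphaPulldown version.
run_multimer_jobs.py \
  --mode=pulldown \
  --num_predictions_per_model=1 \
  --model_names=model_2_multimer_v3 \
  --num_cycle=3 \
  --protein_lists=baits.txt,candidates.txt \
  --monomer_objects_dir=/path/to/feature_pickles \
  --data_dir=/path/to/alphafold_databases \
  --output_path=/path/to/predictions
```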
-
Yes, I like the idea: maybe instead of the current exhaustive manual covering all possible functionality we should split it by use cases, e.g.
-
Yes, we need to restructure the front page along these lines; let's discuss that separately.
-
Hello,
Thank you for developing such a great tool! I am currently doing protein screening in pulldown mode. We have several A100 GPUs on a Slurm-based cluster. However, I found that the inference speed of the tool is slow: for an 827-residue input, the prediction time on a single A100 card was ~150 s, significantly slower than the ~60-90 s prediction time suggested by AlphaFold (also on an A100).
I just followed your installation tutorial, but I'm not sure I configured everything correctly. I'm wondering if there is a parameter such as "global_config.subbatch_size" to increase the batch size or otherwise speed things up? Could you give me some suggestions?
Thanks,
Gabriel