I saw that in the example code max_frames is set to 128, but with this setting I run into an out-of-memory error on an 80GB A800 GPU.
To reproduce the results in your paper's tables, how many frames should I use? I am planning to reproduce the performance of VideoLLaMA3 on long video datasets.
whu125 changed the title from "How much GPU memory is required for inference with the 7B Video model?" to "What is the max_frames for inference on long videos in the paper?" on Jan 24, 2025.
I found that the root cause of the problem was that I wasn't using the flash-attention library. However, when I run the LLaVA-NeXT code without flash-attention, the out-of-memory error does not occur. Does anyone know why?
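For reference, this is the usual way to enable flash-attention when loading a model through Hugging Face transformers; the model id below is a placeholder and whether VideoLLaMA3's own loading script needs trust_remote_code or a different Auto class is an assumption on my part, not something confirmed in this thread:

```python
import torch
from transformers import AutoModelForCausalLM

# Requires `pip install flash-attn`. Without it, transformers falls back to
# more memory-hungry attention kernels, which can OOM on long videos.
model = AutoModelForCausalLM.from_pretrained(
    "DAMO-NLP-SG/VideoLLaMA3-7B",          # placeholder model id, adjust as needed
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # use flash-attention kernels
    device_map="auto",
    trust_remote_code=True,                  # assumed, since the model ships custom code
)
```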
Sorry for the late reply.
For videos no longer than 3 minutes (180 seconds), we sample frames at 1 fps. For longer videos, we uniformly sample 180 frames.
We apply this frame-sampling strategy to all benchmarks; there is no separate strategy for long video datasets.
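In case it helps anyone reproducing the numbers, here is a minimal sketch of that sampling rule. The function name, rounding, and index handling are my own assumptions for illustration, not the repository's actual preprocessing code:

```python
import numpy as np

def sample_frame_indices(total_frames: int, video_fps: float, max_frames: int = 180) -> np.ndarray:
    """Sample at 1 fps for videos up to 180 s; otherwise take
    `max_frames` uniformly spaced frames across the whole video."""
    duration = total_frames / video_fps
    if duration <= 180:
        # roughly one frame per second of video
        indices = np.arange(0, total_frames, video_fps)
    else:
        # uniformly spread max_frames indices over the full video
        indices = np.linspace(0, total_frames - 1, max_frames)
    return np.clip(indices.round().astype(int), 0, total_frames - 1)
```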
Thank you for the excellent work.