
What is the max_frames for inference on long videos in the paper? #4

Open
whu125 opened this issue Jan 24, 2025 · 2 comments

Comments

@whu125

whu125 commented Jan 24, 2025

Thank you for the excellent work.

I saw that in the example code, max_frames is set to 128, but when I use this parameter, I encounter an out-of-memory error. I am using an 80GB A800 GPU.

To replicate the results in your paper's tables, how many frames should I use? I am planning to reproduce the performance of VideoLlama3 on long video datasets.

whu125 changed the title from "How much GPU memory is required for inference with the 7B Video model?" to "What is the max_frames for inference on long videos in the paper?" on Jan 24, 2025
whu125 (Author) commented Jan 27, 2025

`attn_implementation="eager"`

I found that the root cause of the problem was that I was not using the flash-attention library. However, when I run the LLaVA-NeXT code without flash-attention, the out-of-memory error does not occur. Does anyone know why?
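For context, flash attention is typically enabled at model load time in Hugging Face transformers; a minimal sketch (the checkpoint name and exact arguments are assumptions for illustration, not taken from this thread):

```python
import torch
from transformers import AutoModelForCausalLM

# Assumed checkpoint name; substitute the actual VideoLLaMA3 weights.
# attn_implementation="flash_attention_2" requires the flash-attn package;
# "eager" falls back to standard attention, which materializes the full
# attention matrix and can OOM on long frame sequences.
model = AutoModelForCausalLM.from_pretrained(
    "DAMO-NLP-SG/VideoLLaMA3-7B",
    torch_dtype=torch.bfloat16,              # halves activation memory vs fp32
    attn_implementation="flash_attention_2",
    device_map="auto",                       # needs the accelerate package
)
```

With eager attention, memory grows quadratically in sequence length, which is a plausible reason 128 frames can exceed 80 GB while flash attention fits.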

@lixin4ever
Collaborator


Sorry for the late reply.

For videos no longer than 3 minutes (180 seconds), we sample frames at 1 fps. For longer videos, we uniformly sample 180 frames.

We apply this frame sampling strategy to all benchmarks; there is no separate strategy for long video datasets.
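The sampling rule above can be sketched as a small helper (a hypothetical function, not the repo's actual code; assumes the total frame count and native fps are known):

```python
def sample_frame_indices(total_frames: int, native_fps: float,
                         max_frames: int = 180) -> list[int]:
    """Sampling rule described above: 1 fps for videos up to 180 s,
    otherwise max_frames uniformly spaced frames."""
    duration_s = total_frames / native_fps
    if duration_s <= 180:
        # 1 fps: one frame per second of video
        return [round(i * native_fps) for i in range(int(duration_s))]
    # longer video: max_frames indices spread uniformly over all frames
    return [round(i * (total_frames - 1) / (max_frames - 1))
            for i in range(max_frames)]
```

For example, a 10-minute clip decoded at 30 fps (18,000 frames) yields 180 uniformly spaced indices, while a 60-second clip yields 60 indices at 1 fps.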
