I have 4 GPUs with 32 GB of memory each, but my prompts are long, so inference needs about 40 GB and I get an out-of-memory error. Is there any way I can use 2 of the GPUs together for inference? I am using the Qwen-72B-Chat model.
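For context, something like the sketch below is the kind of setup I am asking about; it assumes the Hugging Face transformers loading path with accelerate installed, and the model id, flags, and prompt are my guesses rather than something I have confirmed works here:

```python
# Minimal sketch (assumptions: "Qwen/Qwen-72B-Chat" Hub id, transformers + accelerate installed).
# device_map="auto" lets accelerate shard the model's layers across all GPUs
# that are visible to the process.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-72B-Chat"  # assumed model id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to cut weight memory
    device_map="auto",           # split layers across the visible GPUs
    trust_remote_code=True,
)

prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If only 2 of the 4 GPUs should be used, I assume the process could be restricted with something like `CUDA_VISIBLE_DEVICES=0,1` before launching the script, so that `device_map="auto"` only sees those two devices.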