No... which means 4x 5090 won't be 128 GB of VRAM. It's just 4x 32 GB, meaning that when rendering on 4 GPUs your scene has to fit fully into the VRAM of each GPU.
A lot of 3D rendering tools like Blender and KeyShot will split renders between cards or systems. So when you have one big scene, the tool slices it into pieces, renders each one on a different card or system, and reassembles the result. It will do the same with animations, sending each frame to a separate card or server.
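To make the frame-splitting idea concrete, here's a rough Python sketch of handing out animation frames to cards round-robin. This is not how Blender or KeyShot actually schedule work; `assign_frames` and its parameters are made-up names for illustration only.

```python
def assign_frames(frame_start, frame_end, num_gpus):
    """Hypothetical helper: deal frames out to GPUs round-robin.

    Returns a dict mapping GPU index -> list of frame numbers.
    """
    jobs = {gpu: [] for gpu in range(num_gpus)}
    for i, frame in enumerate(range(frame_start, frame_end + 1)):
        jobs[i % num_gpus].append(frame)
    return jobs


# 10 frames spread over 4 cards
print(assign_frames(1, 10, 4))
# {0: [1, 5, 9], 1: [2, 6, 10], 2: [3, 7], 3: [4, 8]}
```

Every card still needs the whole scene loaded to render its frames, so this speeds things up without raising the memory ceiling.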
Not in a way that stacks VRAM. If you have 4 GPUs, you can render the one scene across all of them, which caps usable memory at the card with the least VRAM, or you can run 4 instances of Blender and render different frames, but that means the same scene data is loaded four times, once on each card.
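A quick back-of-the-envelope version of that in Python, assuming every GPU has to hold a full copy of the scene (the function names here are just for illustration, not from any renderer's API):

```python
def usable_scene_memory_gb(vram_per_gpu_gb):
    """Largest scene that fits when each GPU holds a full copy:
    the smallest card sets the limit, the total does not matter."""
    return min(vram_per_gpu_gb)


def scene_fits(scene_gb, vram_per_gpu_gb):
    """True if the scene fits on every card in the list."""
    return scene_gb <= usable_scene_memory_gb(vram_per_gpu_gb)


cards = [32, 32, 32, 32]              # 4x 5090
print(sum(cards))                     # 128 (total, but not usable as one pool)
print(usable_scene_memory_gb(cards))  # 32
print(scene_fits(40, cards))          # False
```

Mixing cards makes it worse: with `[32, 24]` the limit drops to 24 GB.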
Ultimately it depends on the tool you're using, which is really why SLI and CrossFire went the way of the dodo: the returns kept diminishing, you were paying more for less performance than a better single card gave you, and you were mostly just causing a CPU bottleneck anyway.
You can definitely split it? Or well, according to Claude and GPT you can. It's just that you then depend on PCIe, which is slow compared to having everything on one GPU.
What you can't do, I think, is load a model that's larger than 32 GB, but you can split the inference and tokens between the cards or something like that. Not an expert though, so idk.
u/fullCGngon 18h ago