I'm only interested in how the sphere gets into the GPU: directly, avoiding RAM, or not?
That's a very complicated question, because today's computers have so many optimizations that I don't believe we can really generalize anymore.
Let's start with data format. The sphere is first transformed into a list of triangles, typically a triangle strip: https://en.wikipedia.org/wiki/Triangle_strip. For a sphere, picture bands of small flat triangles wrapped around the surface.
If you look closely at curves in video games, you can see the triangles on the edges.
Okay, so the CPU first needs to make a list of triangles to pass to the GPU. It's simply a list of every point in the strip, like "(0,0,0), (1,0,0), (1,1,0), (1,1,1)", and so on. A big list like that. This is called "vertex information".
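To make that "big list" concrete, here is a minimal sketch in Python of how a CPU-side program might build the vertex list for one band of a sphere as a triangle strip, and how a strip expands into triangles. The function names and the band/segment layout are my own assumptions for illustration; real engines and modeling tools lay out their vertex data in many different ways.

```python
import math

def sphere_band_strip(lat0, lat1, segments, radius=1.0):
    """Build one triangle strip covering the band of a sphere between
    latitudes lat0 and lat1 (in radians). Hypothetical helper."""
    strip = []
    for i in range(segments + 1):
        lon = 2.0 * math.pi * i / segments
        # Alternate between the lower and upper edge of the band.
        for lat in (lat0, lat1):
            x = radius * math.cos(lat) * math.cos(lon)
            y = radius * math.cos(lat) * math.sin(lon)
            z = radius * math.sin(lat)
            strip.append((x, y, z))
    return strip

def strip_to_triangles(strip):
    """Every run of 3 consecutive strip vertices forms one triangle.
    (A real GPU also flips the winding of every other triangle.)"""
    return [(strip[i], strip[i + 1], strip[i + 2])
            for i in range(len(strip) - 2)]

band = sphere_band_strip(0.0, math.pi / 8, segments=16)
triangles = strip_to_triangles(band)
print(len(band), "vertices ->", len(triangles), "triangles")
# prints: 34 vertices -> 32 triangles
```

The point of the strip format is visible in the numbers: 32 triangles would need 96 vertices as separate triangles, but the strip only needs 34.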
A "Geometry Shader" (or a tessellation shader) is a GPU program that may make more triangles from data passed in from the CPU, while a "Vertex Shader" transforms the vertices it is given but cannot create new ones ("shaders" run on the GPU instead of the CPU). GPU code is highly customizable and can do pretty much anything these days, so it's really hard to generalize about exactly what format CPUs even pass to GPUs anymore. But... let's assume a triangle strip.
So... where am I going with this?
I guess what I'm saying is... anything can happen, so long as someone wrote the code for it. Code runs on the GPU, code runs on the CPU. At best, I can tell you what happens in a specific situation. Let's assume we have a simple triangle strip. Let's assume we have a mip-mapped texture mapped to those triangles. Let's assume the CPU is storing the data in DDR4 (not necessarily true! It could be in L3 cache, it could be on the hard drive, it could be on the internet and the CPU doesn't even have the data yet).
It all depends on how a particular game engine works, and what optimizations are enabled, whether the GPU is an iGPU (shares internal bus with the CPU), or a dGPU (must be external over PCIe), device drivers... every piece of the puzzle is changing constantly because everyone involved is trying to make the whole process faster.
If it goes directly into VRAM, avoiding RAM, then it means: if you have 10 gigs of VRAM and only 1 gig of RAM, it will be OK, because the object will fit entirely into VRAM and never touch RAM.
Is the above true for 3ds Max, Maya, Houdini and Blender? (They work the same way, I'm sure; at least the steps of object creation are the same.) It's something that should be common to host software like the ones I've mentioned.
The CPU cannot use GPU RAM as if it were its own memory, but it can pass data to the GPU fairly efficiently (roughly 15GB/s is the speed of PCIe 3.0 with 16 lanes). So even if the CPU doesn't have 10GB of RAM, it can certainly be programmed to pass 10GB of generated data to the GPU.
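As a rough sketch of what "programmed to pass generated data" could look like: the CPU generates the geometry one small chunk at a time and hands each chunk off, so it never holds the whole data set in RAM at once. This is Python for readability, and upload_to_gpu is a hypothetical stand-in, not a real API; in a real program it would be whatever buffer-upload call the graphics API provides, and the driver may still stage each chunk through a bit of system RAM on the way. The bandwidth constant is just the rough 15GB/s figure from above.

```python
import struct

PCIE3_X16_BYTES_PER_SEC = 15e9  # rough figure, assumption taken from the text

def transfer_seconds(num_bytes, bandwidth=PCIE3_X16_BYTES_PER_SEC):
    """Best-case time to push num_bytes over the bus."""
    return num_bytes / bandwidth

def generate_chunks(total_vertices, chunk_vertices):
    """Yield packed vertex data one chunk at a time, so only a single
    chunk lives in CPU memory at any moment."""
    for start in range(0, total_vertices, chunk_vertices):
        count = min(chunk_vertices, total_vertices - start)
        floats = []
        for i in range(start, start + count):
            floats.extend((float(i), 0.0, 0.0))  # placeholder geometry
        # 3 little-endian 32-bit floats per vertex = 12 bytes per vertex
        yield struct.pack(f"<{len(floats)}f", *floats)

def upload_to_gpu(chunk):
    """Hypothetical stand-in for a real upload call; it only counts bytes."""
    return len(chunk)

sent = sum(upload_to_gpu(c) for c in generate_chunks(1000, 256))
print(sent, "bytes sent")
# prints: 12000 bytes sent
print(round(transfer_seconds(10e9), 2), "seconds for 10 GB")
# prints: 0.67 seconds for 10 GB
```

So in principle a machine with very little RAM could still fill a large VRAM, as long as the program was written to stream like this. Whether a given application actually does that is a different question.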
Does that happen commonly? No. Most computers have more DDR4 RAM than VRAM. DDR4 is way, way cheaper after all. 16GB of DDR4 is like $80, while GPUs with 10GB or more of GDDR6 VRAM cost hundreds of dollars. So most code probably assumes that the data is in DDR4.
But what about 3ds Max, Maya, Houdini and Blender? As offline renderers, they're designed to support scenes dozens of GBs in size, in excess of a computer's DDR4 capacity. Unlike a video game (which has speed requirements: usually 60fps, 30fps, or some other similar FPS requirement), offline renderers can spend days on a single animation. So... they can probably swap data that doesn't fit out to the hard drive or something.
You're asking a lot of questions, but things are very complicated and not really simple to answer.
----------
EDIT: Okay, so let's choose Blender. Because Blender is easy. Why? Because Cycles (Blender's renderer) can run as a CPU-only renderer. No GPU discussion happens at all. Bam. Simple answer to a simple question.
