A little background…
Since the invention of 3D rendering, the CPU has been responsible for rendering out images and animations from 3D and video applications, using graphics APIs such as OpenGL or DirectX to communicate with the graphics card.
With only a few exceptions, render engines use ‘bucket rendering’. This splits the image into buckets (squares), which are processed on the CPU until the full image is complete. These buckets can be distributed over a render farm to speed up rendering but, on a single workstation, the number of buckets that can be active at once depends on the number of CPU cores you have available.
An example of an 8 core CPU rendering 8 buckets simultaneously.
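To make the idea concrete, here is a minimal Python sketch of bucket rendering. It is only an illustration under our own assumptions: `make_buckets`, `render_bucket` and `render_image` are hypothetical names, and the "renderer" simply shades each pixel by its coordinates rather than tracing any rays.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def make_buckets(width, height, size):
    """Split a width x height image into square buckets (edge buckets may be smaller)."""
    return [
        (x, y, min(size, width - x), min(size, height - y))
        for y in range(0, height, size)
        for x in range(0, width, size)
    ]

def render_bucket(bucket):
    """Hypothetical stand-in for a real renderer: shade each pixel by its coordinates."""
    x0, y0, w, h = bucket
    pixels = [[(x + y) % 256 for x in range(x0, x0 + w)] for y in range(y0, y0 + h)]
    return bucket, pixels

def render_image(width, height, size, workers=None):
    """Render every bucket with a pool of workers and stitch the results together."""
    workers = workers or os.cpu_count() or 1  # default: one active bucket per core
    image = [[0] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for (x0, y0, w, h), pixels in pool.map(render_bucket, make_buckets(width, height, size)):
            for row, line in enumerate(pixels):
                image[y0 + row][x0:x0 + w] = line
    return image

image = render_image(100, 70, 32, workers=8)  # at most 8 buckets in flight at once
```

The `workers` argument plays the role of the core count: no more than that many buckets are in flight at any moment. Python threads are used here only to keep the sketch short; a production renderer would use native threads or separate processes to get true parallelism.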
Inspired by this limitation, manufacturers of both hardware and software have been working towards developing a method that will offload the job of rendering from the CPU to the GPU. Although a number of render engines with this capability are currently in development, almost all will utilise the CUDA technology of NVIDIA graphics cards.
As mentioned earlier, the total number of active buckets is determined by the number of cores available. If we consider an entry-level NVIDIA GeForce graphics card such as the 8800GT, which has 112 cores, you start to see how GPU rendering can and will have a massive impact on render times, as more buckets can be rendered simultaneously.
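As a rough back-of-the-envelope illustration, the sketch below counts how many "rounds" of buckets are needed when each core handles one bucket at a time. The frame size and bucket size are hypothetical values chosen for the example, and the model is deliberately idealised: it treats a GPU core as equivalent to a CPU core, which is not the case in practice, so it shows only the effect of core count and not real render times.

```python
import math

def bucket_waves(total_buckets, cores):
    """Rounds needed if each core renders one bucket at a time (idealised)."""
    return math.ceil(total_buckets / cores)

# Hypothetical frame: 1920x1080 pixels split into 64-pixel square buckets.
buckets = math.ceil(1920 / 64) * math.ceil(1080 / 64)  # 30 columns x 17 rows

cpu_waves = bucket_waves(buckets, 8)    # 8-core CPU
gpu_waves = bucket_waves(buckets, 112)  # 112-core GeForce 8800GT

print(buckets, cpu_waves, gpu_waves)
```

Under these assumptions the 8-core CPU needs 64 rounds to cover the 510 buckets, while the 112-core card needs only 5.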
As you would expect, the better the graphics card, the faster the render. This means that the high-end GeForce and Quadro graphics cards could render up to 60 times faster than a standard quad-core CPU. This will speed up the artist’s workflow by showing the effect of any alteration to the scene immediately. This could be as simple as changing a light or material parameter, or introducing a new object. It can also allow the artist to pan and zoom the camera around the scene without the need to wait while the frame buffer re-renders. Instead, the viewport follows the user’s actions while working on the scene and automatically (and progressively) generates a photorealistic preview.
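One common way to picture a progressive preview is as a running average: each pass adds another noisy estimate of every pixel, and the displayed value settles as the passes accumulate. The sketch below shows this for a single pixel. It is a generic illustration under our own assumptions (the function name, the noise model and the numbers are hypothetical), not a description of how any particular render engine is implemented.

```python
import random

def progressive_preview(true_value, passes, noise=0.5, seed=0):
    """Yield the running average of noisy samples for one pixel after each pass."""
    rng = random.Random(seed)
    mean = 0.0
    for n in range(1, passes + 1):
        sample = true_value + rng.uniform(-noise, noise)  # one noisy estimate
        mean += (sample - mean) / n                       # incremental average
        yield mean

# Hypothetical pixel whose converged brightness is 0.6.
estimates = list(progressive_preview(true_value=0.6, passes=2000))
early_error = abs(estimates[0] - 0.6)   # error after the first pass
late_error = abs(estimates[-1] - 0.6)   # error after all passes
```

The early estimates are grainy, and the error shrinks as more passes are averaged in. The more passes the hardware can complete per second, the sooner the preview becomes usable.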
Whilst NVIDIA’s CUDA engine is clearly the leader in this field – effectively locking all GPU processing tasks to NVIDIA hardware – there are others on the horizon. Apple have been working with the Khronos Group on OpenCL, a standards-based method for general purpose GPU computing.
Because OpenCL democratises GPU processing, any program on Macintosh, Windows and Linux platforms will be able to compute 3D data on any graphics card, regardless of manufacturer. Not only is OpenCL a genuine competitor, it is likely to supersede CUDA as the API of choice, allowing programs such as Maxon’s Cinema4D and Autodesk’s Maya to render on the GPU.
Another worthy mention is Microsoft DirectX 11’s compute shader feature, which is shipping with Windows 7. This feature enables post-processing effects, such as depth of field and motion blur, to be carried out by the GPU. Although locked to the Windows platform, it can be used on both AMD and NVIDIA graphics cards.
In order to use the GPU cores for rendering, we have had to wait for software companies to catch up with the developments at NVIDIA. There are two clear leaders in the race to get a fully supported GPU renderer on the shelves: Mental Ray’s iRay and Chaos Group’s V-Ray RT.
iRay will hopefully be available to all customers who upgrade to future releases of Mental Ray, either as a standalone renderer or from within applications that include the software (such as Autodesk 3ds Max, Autodesk Maya and Autodesk Softimage).
Although the iRay demo is impressive, indoor scenes or scenes with a large amount of bounced light seem to take significantly longer than other images to fully render. Even after a few seconds the image looks like poor television reception and is not at all production quality. These results were obtained using four GPUs; what type we don’t know, but most likely it would have been a Tesla S1070 (a platform iRay was designed to run on).
Incredibly, those pioneers over at Mental Images have also found the time to develop mental mill and, in conjunction with NVIDIA, the RealityServer. mental mill enables artists to create shaders and graphs for GPU and CPU rendering through an intuitive GUI with realtime visual feedback. The NVIDIA RealityServer delivers the power of thousands of cores that allow for realtime rendering over the web, perfect for product designers and architects who can easily visualise their clients’ projects with a laptop or even an iPhone!
The NVIDIA RealityServer platform is a powerful combination of NVIDIA Tesla GPUs, RealityServer software and iRay. Later, we will consider the NVIDIA Tesla GPUs in more depth and explore how they too are shaping the future of GPU rendering.
The other viable option for realtime rendering is V-Ray RT. Whilst V-Ray RT is currently CPU based, Chaos Group have already developed it into a fully interactive GPU accelerated renderer, which will hopefully be available as a service pack upgrade this year. A beta version of this was showcased last year at the industry event SIGGRAPH and was considered the major highlight of the show.
V-Ray has long been at the forefront of photorealistic rendering and is well known for being the fastest and easiest to use. In contrast to the iRay demo, it appears that V-Ray RT will yield faster results whilst using a mid- to high-range graphics card. In the video, they use an NVIDIA GeForce GTX 285, which is available for just £399 ex VAT. Once V-Ray RT goes fully GPU based, users should expect renderings to be completed 10 to 20 times faster than with its CPU counterpart.
So which is better?
iRay
Pros:
- Available as a future release of Mental Ray
- Web interface
- mental mill
Cons:
- Very expensive hardware
- Slower than V-Ray RT
V-Ray RT
Pros:
- Faster than iRay
- Cheaper hardware
Cons:
- No web interface
- No definite release date
- CPU version currently does not support meshes
If money is no object and you require a method of interacting with your 3D scene over the web, perhaps whilst in front of clients, then iRay is for you.
However, if you are prepared to wait a bit for its release, GPU-based V-Ray RT will offer you quicker and cheaper results and will fit seamlessly into current workflow methods. It is worth mentioning that both solutions are scalable, meaning that you can add multiple graphics cards to a workstation or distribute the task over a network. Be aware that each graphics card will almost certainly need a PCIe 2.0 x16 slot to work fully, so check your motherboard before you upgrade.
The only other GPU rendering solution worth mentioning is OctaneRender, developed by Refractive Software. A limited feature demo is available for the Windows platform.
OctaneRender isn’t locked to a particular program; you simply import a Wavefront ‘obj’ file and then start applying shaders and adding lights to the scene whilst viewing your changes in realtime. The upside of this is that almost all 3D applications can export to it, but it does require a significant change in current workflow techniques and is unlikely to surpass the complex and now standard practices of Mental Ray and V-Ray.
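For readers unfamiliar with the format, a Wavefront ‘obj’ file is plain text: lines starting with `v` list vertex positions and lines starting with `f` list faces by vertex index. The sketch below reads just that geometry. It is a minimal illustration with a hypothetical `parse_obj` helper, not OctaneRender’s importer, and it ignores materials, normals, texture coordinates and relative (negative) indices.

```python
def parse_obj(text):
    """Read vertex positions and faces from Wavefront OBJ text (geometry only)."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if parts[0] == "v":
            vertices.append(tuple(float(c) for c in parts[1:4]))
        elif parts[0] == "f":
            # Face entries look like "v", "v/vt", "v//vn" or "v/vt/vn";
            # keep only the vertex index and convert from 1-based to 0-based.
            # Relative (negative) indices are not handled in this sketch.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:]))
    return vertices, faces

# A single triangle as it might appear in an exported file.
sample = """# a single triangle
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
f 1 2 3
"""
vertices, faces = parse_obj(sample)
```

Because the format is this simple, almost any 3D package can write it, which is what makes a standalone, application-independent renderer practical.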
NVIDIA Tesla technology
Right, you’ve heard us mention the Tesla a few times already, so it’s about time we explain why it is at the heart of this GPU revolution.
The Tesla S1070 is the world’s first 4-teraflop processor. This is achieved by using four 1-teraflop GPUs, each with 240 processor cores, to give a total of 960 cores, all in 1U of rack space! This number of cores will reduce render times from hours to minutes or even seconds.
Needless to say, there is also a workstation equivalent. The C1060 takes one of the 1-teraflop GPUs used in the S1070, with 4GB of GDDR3 memory, and uses a regular PCIe 2.0 bus so that it can be installed immediately in existing workstations.
This breakthrough finally provides an affordable solution for individuals and small companies, who can now have the processing power of 60 quad-core processors (which would previously take up the space of a small room!) located neatly alongside a regular graphics card used for video display.
So, together with a render engine such as V-Ray RT or iRay and a CUDA enabled graphics card, individuals will soon have access to realtime photorealistic rendering power at a fraction of the cost of a render farm. I’m sure you will agree this is a massive, game-changing development.
Back in the real world
Aside from all the facts and demos, if you ever needed proof that the burden of rendering has fallen on the shoulders of the GPU, then consider the hugely successful and brilliant film ‘Avatar’.
At last, film and special effects companies such as WETA now have the necessary hardware to produce stunningly beautiful and lushly detailed scenes with an extensive cast of virtual characters set in computer generated environments.
Of course this has been done before; in fact, the last breakthrough in this field was made on another of WETA’s creations, ‘Lord of the Rings’. However, those 3D effects were merged into the real world footage, whereas ‘Avatar’ is total fantasy; everything exists only in a 3D virtual model.
WETA were required, for the first time in the history of CG visual effects, to model, animate, shade, light and render billions rather than millions of polygons in a single scene. The computational power required to process the ‘Avatar’ shots was higher than anything they had attempted previously, so they turned to NVIDIA, masters of the GPU.
Step forward the Tesla S1070, which, along with new custom-designed software, PantaRay, allowed WETA to process their shots 25 times faster than any CPU-based server.
One scene in particular exemplifies the advantages of PantaRay and GPU-based servers. If you’ve got a copy, pay close attention to the shots where a huge flock of purple creatures and enemy helicopters are flying amongst tree-covered mountains. Those sorts of scenes were pre-computed in a day and a half, where previously it would have taken a week with traditional CPU-based servers.
The increased rendering speed allowed for minute detail of vegetation and near perfect colour separation between distances, creating a more beautiful shot.
So as you can see, GPU computing is both the present and future of 3D rendering. If you would like any more information regarding CUDA-enabled graphics cards and servers, as well as rendering programs, please don’t hesitate to get in touch.