FAQ: GPU Features in Detail
The graphics card computes too
For the original German article, see here.
Modern software demands more and more of desktop computers and notebooks. Enormous video files should convert from one format to another in a snap, and uncompressed music should shrink to small MP3 files in no time at all. A computer game should look as realistic as possible and deliver a fluid, fully detailed picture without delays, whatever is happening on screen. Research centers at universities and companies also need high performance from processors and other components for elaborate simulations. But even the newest multi-core processors are often so heavily loaded that frustrating wait times result. Alternatively, you have to accept visible cuts in display quality so that the computer can still manage fluid motion. For a computer game that means: numerous details that make the picture look realistic, such as mirror images, reflections or smoke, have to be dropped or drastically simplified before they reach the monitor. Sometimes the frame rate drops as well, so that movement no longer looks fluid.
To avoid that and to support the processor in its work, some of the computational tasks can be handed to the graphics processor. It is distinctly better suited to some computations and often sits idle or underused. In addition, a graphics processor is specialized for running thousands of operations in parallel and therefore completes such tasks especially quickly. For this distribution of tasks to work smoothly, a coordinator is needed: it splits the tasks, distributes them and makes sure that the partial results add up to a consistent whole. This is where the technologies covered here come in: the CUDA platform with its integrated physics engine PhysX, and OpenCL.
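The coordinator role described above (split the work, distribute it, merge the results consistently) can be sketched on the CPU in a few lines. This is only an illustration of the pattern, not CUDA or OpenCL code: the function names `split`, `brighten` and `coordinate` and the brightness task are made up for the example, and a thread pool stands in for the graphics processor.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are more parts than items
            chunks.append(data[start:end])
        start = end
    return chunks


def brighten(chunk):
    """Hypothetical per-pixel task: raise brightness by 10, capped at 255."""
    return [min(p + 10, 255) for p in chunk]


def coordinate(data, workers=4):
    """Split the work, hand the chunks to workers, merge results in order."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(brighten, chunks)  # map keeps the chunk order
    return [p for chunk in results for p in chunk]


pixels = list(range(0, 256, 5))
# The distributed result must match what a single worker would produce.
assert coordinate(pixels) == brighten(pixels)
```

The important property is that every pixel can be processed without looking at any other pixel. Work of that shape is what a graphics processor handles well; tasks whose steps depend on each other gain far less from being split.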
CUDA stands for Compute Unified Device Architecture and is a technology developed by Nvidia. It allows parallel computational tasks that would otherwise run on the main processor to be outsourced to the graphics processor. The result is a double speed gain. On the one hand, the main processor has time for other tasks. On the other hand, the graphics processor can complete parallel tasks distinctly more quickly than the main processor. There is one prerequisite: the software must be written to use CUDA. Otherwise no division of labor can take place.
The speed gain from using CUDA with optimized software can be immense. Depending on the application, computations can run about 10 to 200 times faster. For example, with the help of CUDA, wait times for scientific simulations can be reduced from 20 minutes to 30 seconds. Computations that take 30 to 40 seconds without CUDA sometimes run in real time with it.
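As a quick check of these figures, the speedup factor is simply the old run time divided by the new one. The helper names below are chosen for this example only.

```python
def speedup(seconds_before, seconds_after):
    """How many times faster the accelerated run is."""
    return seconds_before / seconds_after


def time_after(seconds_before, factor):
    """Run time that remains after a given speedup factor."""
    return seconds_before / factor


# The simulation example: 20 minutes shrink to 30 seconds.
print(speedup(20 * 60, 30))  # 40.0

# A 40-second computation at both ends of the quoted 10x to 200x range.
print(time_after(40, 10))    # 4.0
print(time_after(40, 200))   # 0.2
```

The simulation example therefore corresponds to a factor of 40, well inside the quoted range. At the upper end of that range a 40-second job takes a fraction of a second, which is why such computations can feel as if they run in real time.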
CUDA was originally developed solely for scientific software with high demands on computer-assisted visualization and simulation, but it is increasingly used in private settings as well, primarily for video games. One of the first programs that supports scientific research and also runs on personal computers is the SETI@home project of the University of California, Berkeley, which searches for signs of extraterrestrial intelligence.
The minimum requirement for using CUDA is an Nvidia graphics card from the GeForce 8 series. Numerous Quadro graphics cards and others from the Tesla line also support CUDA. Notebook versions of the respective graphics chips support the technology as well. You can find a complete list of supported GeForce graphics cards at Nvidia. You can find a list of compatible games and other software that benefit from CUDA here.
When PhysX was integrated into CUDA, the technology found its way onto private PCs. PhysX is a physics engine from Nvidia (originally developed by Ageia, which Nvidia acquired in 2008) that is also designed to relieve the main processor and accelerate computation. A physics engine calculates the movement of physical objects such as bodies and clothing, and also skin, hair and similar things, making realistic depictions possible. Simulating liquids and gases such as water, oil, lava, smoke, steam, fog and fire also counts as physics work and is especially costly.
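To give an idea of what a physics engine computes in every frame, here is a minimal sketch of one simulation step for particles falling under gravity and bouncing off the floor. It does not use the PhysX API. The `Particle` class, the `step` and `simulate` functions and the bounce factor of 0.5 are assumptions made for this illustration.

```python
from dataclasses import dataclass

GRAVITY = -9.81  # m/s^2, acting along the vertical axis


@dataclass
class Particle:
    height: float    # metres above the floor
    velocity: float  # metres per second, positive is upward


def step(p, dt):
    """Advance one particle by dt seconds (semi-implicit Euler)."""
    velocity = p.velocity + GRAVITY * dt
    height = p.height + velocity * dt
    if height < 0.0:  # hit the floor: bounce back and lose energy
        height = -height
        velocity = -0.5 * velocity
    return Particle(height, velocity)


def simulate(particles, dt, steps):
    """Run the whole particle set forward by `steps` time steps."""
    for _ in range(steps):
        # Each particle is updated independently of all the others.
        particles = [step(p, dt) for p in particles]
    return particles


drops = [Particle(height=h, velocity=0.0) for h in (1.0, 2.0, 3.0)]
drops = simulate(drops, dt=1 / 60, steps=60)  # one second at 60 frames per second
```

Because no particle depends on any other within a step, thousands of them can be updated at the same time. That independence is what makes effects like smoke or water a good fit for the graphics processor.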
PhysX moves part of these elaborate calculations from the main processor to the graphics processor. The effects are shorter calculation times, faster program execution and, for computer games, a higher frame rate and/or higher display quality. The additional computational power also makes more graphical effects possible. When the main processor and the graphics processor form an especially powerful and efficient team, PhysX delivers extremely high frame rates together with a realistic picture.
Hardware-accelerated PhysX only works with graphics processors from Nvidia. All modern graphics cards from the GeForce series support the technology, as long as they have at least 32 processing cores and 256 MB of graphics memory. By now the PhysX engine has also made its way to game consoles such as the Nintendo Wii, the Sony PlayStation 3 and the Microsoft Xbox 360. You can find a complete list of all the Nvidia graphics processors that support PhysX here.
If two or more graphics cards are installed in the PC, one card can be dedicated entirely to the physics engine. Especially practical: the cards do not need to have identical graphics processors. The PhysX engine is a component of Nvidia's CUDA technology.
A substantial number of modern computer games support PhysX, including the popular Borderlands 2 and Deep Black Reloaded. A complete list of all the games that support PhysX can be found on the website PhysXInfo.com.
OpenCL stands for Open Computing Language. It is an alternative to CUDA that also allows graphics processors from AMD and Intel to take work from the main processor and speed up computations. OpenCL was originally developed by Apple and later developed further together with AMD, IBM, Intel and Nvidia. The Khronos Group, an industry consortium that manages multimedia standards, standardized OpenCL and is responsible for its further development. OpenCL is an open standard that anyone can use without paying license fees. It is relatively young and first reached the market in 2009 with Apple's Mac OS X 10.6 (Snow Leopard).
Like CUDA, OpenCL distributes computationally intensive work and sends all operations that can be processed in parallel to the graphics processor. OpenCL can also directly access data from the programming interfaces OpenGL and DirectX and process it faster with the help of the graphics processor.
The list of supported graphics chips is relatively long. You can find an overview here. Unfortunately the list of programs that support OpenCL is so far much shorter. Some of the best-known examples are the video software TotalMedia Theatre 5.2 from ArcSoft, vReveal from MotionDSP (also video processing) and numerous graphics programs from Adobe, including Photoshop, Premiere and Flash. The free image editor GIMP, some of the filters of the VLC media player and the compression program WinZip also benefit from OpenCL acceleration. One free scientific package that supports OpenCL is the linear algebra library ViennaCL.