1. Introduction
Over the past two years, artificial intelligence (AI) has swept through the global tech industry. Although the technology is not yet fully mature, it has already begun to integrate deeply into our daily lives and to reshape them.
When discussing AI, many people wonder: why does GPU computing, led by NVIDIA, dominate AI rather than the familiar CPUs from Intel and AMD?
This is a “fascinating” question, and the confusion behind it largely stems from two biases in how many people think about processors: an incomplete understanding of what CPU performance actually measures, and a shallow, outdated view of what GPUs do. I will address each in turn.
2. Limitations of CPU Performance
When processors (mainly consumer-grade ones) come up, many people immediately think of benchmark scores, citing models such as the Core i9-14900KS and Ryzen 9 9950X, which currently top the charts. Many subconsciously assume these processors are the best in every field of computing.
This perception is one-sided and incorrect. Although everything a processor does is “computation,” there are many kinds of computation, such as basic arithmetic, squaring, cubing, and square roots, and hardware that is fast at one kind is not necessarily fast at another.
Consumer-grade processors are designed primarily for general-purpose computing. They excel at complex logic control and serial tasks, efficiently executing a wide mix of instruction types, much of it integer work.
AI computation, in contrast, centers on model training and inference, which consist overwhelmingly of matrix and tensor operations. These stress a processor’s floating-point throughput, an area where CPUs lag well behind GPUs. Built with large arrays of processing units for high-density floating-point work such as image and video processing, GPUs far outperform CPUs on these tasks.
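To make “matrix and tensor operations” concrete, here is a minimal sketch of the kind of kernel that dominates AI workloads: a naive single-precision matrix multiply in CUDA. It is purely illustrative; real frameworks call heavily tuned libraries such as cuBLAS rather than hand-rolled kernels.

```cuda
#include <cuda_runtime.h>

// Naive single-precision matrix multiply: C = A * B, all N x N matrices
// stored row-major. One GPU thread computes one output element, so N*N
// multiply-accumulate chains run in parallel: exactly the dense
// floating-point pattern that dominates AI training and inference.
__global__ void matmul(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < N; ++k)
            acc += A[row * N + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}
```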
In short, although both chips perform “computation,” the specific kinds of computation that applications require and stress vary significantly, and AI in particular depends on precisely the kind of computation CPUs are weakest at. This distinction must be understood objectively and correctly.
3. Shallow Understanding of GPUs
Upon reading this subheading, some may feel defensive, believing they are seasoned gamers or GPU enthusiasts with a deep understanding of graphics cards.
However, while my words may sound harsh, they reflect reality. Many people think of GPUs primarily in relation to gaming—graphics quality, ray tracing, average frame rates, video encoding/decoding, and rendering. The term “graphics card” implies tasks related to “display.”
I must emphasize that this perception is outdated and limited, having not evolved since the early days of GPUs. Although correct in the past, this view became obsolete after NVIDIA introduced the CUDA architecture in 2006.
CUDA is a parallel computing platform and programming model that bundles a compiler, APIs, libraries, and tools, and it unlocked the general-purpose computing potential of GPUs. It enables developers to harness GPU parallelism to solve complex problems. As a result, the traditional graphics work GPUs were named for has become just one subset of the broader CUDA ecosystem.
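The programming model is easiest to see in code. Below is a minimal, self-contained CUDA program, essentially the “hello world” of GPU computing: allocate memory the GPU can see, launch a kernel across roughly a million threads, and read back the result. The sizes and launch parameters are arbitrary choices for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each of the n threads adds one pair of elements.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                    // about one million elements
    const size_t bytes = n * sizeof(float);

    // Unified memory keeps the sketch short: one pointer valid on CPU and GPU.
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();                  // wait for the GPU to finish

    printf("c[0] = %.1f\n", c[0]);            // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```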
Thus, one should not be too quick to complain about the modest performance increases of the RTX 40 series. Graphics performance is only one dimension of each generation’s improvements, and many of the gains are invisible in conventional gaming scenarios, especially with DLSS turned off.
In other words, referring to GPUs merely as “graphics cards” is increasingly inaccurate and misleading. Today’s GPUs have evolved into general-purpose processors, and their original gaming, video encoding, and rendering functions represent only a portion of their capabilities.
As a side note, AMD has finally recognized the limitations of its previous strategy and the advantage of NVIDIA’s unified CUDA approach. It recently decided to merge CDNA, its architecture for data-center compute, with RDNA, its consumer GPU architecture, into a unified UDNA architecture, a move that highlights how far behind it has been in both understanding and decision-making.
4. Advantages of the CUDA Architecture and the GPUs Built on It
① High Floating-Point Performance
GPU processing units are purpose-built for high-density floating-point operations, so they execute floating-point calculations far more efficiently than CPUs do.
② High Core Count
GPUs typically feature thousands (on recent flagships, over ten thousand) of small processing cores that can execute enormous numbers of parallel tasks simultaneously. CPUs, by contrast, have only a handful to a few dozen cores, so they become the bottleneck in large-scale parallel computing, giving GPUs a substantial advantage.
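These counts are easy to inspect. The short CUDA program below, a sketch using only the standard runtime API, queries device 0 for its SM count and per-SM thread capacity; note that the runtime does not report CUDA cores per SM directly, since that figure varies by architecture.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // properties of device 0

    printf("GPU: %s\n", prop.name);
    printf("Streaming multiprocessors (SMs): %d\n", prop.multiProcessorCount);
    printf("Max resident threads per SM:     %d\n", prop.maxThreadsPerMultiProcessor);
    // CUDA cores per SM depend on the architecture (for example, 128 on
    // many recent consumer parts) and are not exposed by this struct.
    return 0;
}
```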
③ Strong Parallel Computing Performance
The core principle of CUDA is to decompose a complex task into simpler sub-tasks and execute them in parallel across many GPU cores. Each streaming multiprocessor (SM) within a GPU contains a number of CUDA cores and can keep many threads in flight at once. With this approach, developers can partition a deep-learning training job into smaller sub-tasks distributed across the GPU’s cores, processing them simultaneously and significantly accelerating training.
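A common idiom for expressing this decomposition is the grid-stride loop, sketched below: the kernel is launched with a fixed grid, and each thread strides through the data by the total thread count, so a task of any size is split evenly across the GPU’s cores. The launch configuration in the trailing comment is an arbitrary illustrative choice.

```cuda
#include <cuda_runtime.h>

// Grid-stride loop: each thread starts at its global index and then jumps
// ahead by the total number of threads in the grid, so the whole array is
// covered no matter how large n is. The big task (scaling n values) is
// thereby decomposed into per-thread sub-tasks that the SMs run in parallel.
__global__ void scale(float* data, float factor, int n) {
    int stride = blockDim.x * gridDim.x;  // total threads in the grid
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        data[i] *= factor;
}

// A typical launch, for example: scale<<<numSMs * 4, 256>>>(d_data, 2.0f, n);
```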
④ High Memory Bandwidth
GPU memory bandwidth is typically far higher than a CPU’s, allowing faster data access and transfer when handling large datasets. Deep-learning training reads and writes large volumes of data constantly, so high memory bandwidth is crucial to training efficiency.
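To see what “high memory bandwidth” means in practice, here is a rough measurement sketch: it times a 1 GiB device-to-device copy with CUDA events and derives an effective bandwidth figure. The buffer size is an arbitrary choice, and a plain copy is only a crude proxy for real workloads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1ull << 30;  // 1 GiB test buffer
    float *src, *dst;
    cudaMalloc(&src, bytes);
    cudaMalloc(&dst, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // The copy reads and writes every byte, so count the traffic twice.
    double gbps = 2.0 * bytes / (ms / 1000.0) / 1e9;
    printf("Effective bandwidth: %.1f GB/s\n", gbps);

    cudaFree(src); cudaFree(dst);
    cudaEventDestroy(start); cudaEventDestroy(stop);
    return 0;
}
```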
5. Summary
In summary, the reliance of AI computing on GPUs rather than CPUs primarily stems from GPUs’ significant advantages in parallel computing capability, floating-point performance, and memory bandwidth. These strengths allow GPUs to provide higher computational efficiency and performance when processing large datasets and complex tasks.
Because NVIDIA introduced CUDA early and has invested in it heavily over the years, the architecture enjoys broad industry support, and competitors such as Intel and AMD will find that lead difficult to challenge or surpass in the short term. The landscape could still shift in the future, however.