HBM and Advanced Packaging: The Hidden AI Power Race

The explosive growth of large AI models is reshaping the global semiconductor landscape. From training Transformer models with hundreds of billions of parameters to real-time inference for generative AI applications, demand for computing power is growing exponentially. According to IDC, China's intelligent computing capacity is expected to reach 1,037.3 EFLOPS by 2025, a 43% year-over-year increase. However, the traditional "memory wall," the performance bottleneck caused by the gap between data transfer speed and computation speed, has become a critical constraint on both performance and energy efficiency.

Against this backdrop, High Bandwidth Memory (HBM) and advanced packaging technologies have emerged as key breakthroughs. Through 3D stacking and Through-Silicon Via (TSV) technology, HBM delivers over 1TB/s of bandwidth per stack, roughly five times that of conventional GDDR6. Advanced packaging technologies such as TSMC's CoWoS and Intel's EMIB integrate CPUs, GPUs, and NPUs into "super chips," breaking past the limits of die size and power delivery. Together, these two technologies form the "invisible battlefield" of the AI computing revolution, a contest that involves not only technology but also geopolitics and supply chain control.

In traditional computing architectures, the physical separation between compute and memory units means data movement can account for over 60% of total energy consumption, creating the "memory wall." Large models such as GPT-4 must move enormous volumes of weight and activation data for every inference pass; with older GDDR5-class memory, transfer latency would drastically cut effective compute utilization. This is why NVIDIA adopted HBM3 in its H100: 3.35TB/s of bandwidth, more than three times the roughly 1TB/s of top GDDR6X-based cards, with data transfer latency cut to the nanosecond scale.
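To make the wall concrete, here is a minimal back-of-envelope sketch in Python. The model size, precision, and the assumption that every weight is streamed once per generated token are illustrative, not measurements of any particular system:

```python
# Back-of-envelope: per-token latency floor when weight streaming dominates.
# Hypothetical example: a 70B-parameter model in FP16 (2 bytes per parameter)
# whose weights must all be read from memory for each generated token.

def memory_bound_latency_ms(params_billion: float, bytes_per_param: float,
                            bandwidth_tb_s: float) -> float:
    """Lower bound on per-token latency, assuming transfer time dominates."""
    bytes_moved = params_billion * 1e9 * bytes_per_param
    return bytes_moved / (bandwidth_tb_s * 1e12) * 1e3  # seconds -> ms

for name, bw in [("HBM3-class, 3.35 TB/s", 3.35),
                 ("GDDR6X-class, ~1.0 TB/s", 1.0)]:
    print(f"{name}: >= {memory_bound_latency_ms(70, 2, bw):.0f} ms per token")
```

Even at full HBM3 bandwidth, weight traffic alone imposes a hard latency floor per token, which is why bandwidth rather than peak FLOPs governs this regime.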

Since the first HBM product launched in 2014, the technology has advanced through successive generations: HBM, HBM2, HBM2E, HBM3, and most recently the enhanced HBM3E. Per-stack capacity has grown from 1GB to 24GB, bandwidth from 128GB/s to 1.2TB/s, and per-pin data rate from 1Gbps to 9.2Gbps.
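These figures are connected by simple interface arithmetic: each HBM stack exposes a 1024-bit interface, so per-stack bandwidth equals pin count times per-pin data rate divided by eight. The sketch below reproduces the generational numbers; the intermediate data rates are representative values, not official maxima for every product:

```python
# Per-stack HBM bandwidth = interface width (bits) * per-pin rate (Gbps) / 8.
HBM_PINS = 1024  # 1024-bit interface, constant across HBM generations

def stack_bandwidth_gb_s(data_rate_gbps: float, pins: int = HBM_PINS) -> float:
    return pins * data_rate_gbps / 8

for gen, rate in [("HBM", 1.0), ("HBM2", 2.4), ("HBM2E", 3.6),
                  ("HBM3", 6.4), ("HBM3E", 9.2)]:
    print(f"{gen:6s} {rate:>4} Gbps/pin -> {stack_bandwidth_gb_s(rate):6.1f} GB/s per stack")
```

At 9.2Gbps per pin, 1024 pins yield about 1,178GB/s, matching the roughly 1.2TB/s figure quoted above.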

HBM vertically integrates 8 to 12 DRAM dies using TSV technology. SK Hynix's latest HBM3E, for example, is built on a 1b-nm process with advanced MR-MUF packaging, achieving 24GB per stack and over 1TB/s of bandwidth: in effect, a 12-lane data highway on a fingernail-sized footprint. This advance rests on both DRAM process improvements and ultra-precise interposer wiring in 2.5D packaging; TSMC's CoWoS supports over 100,000 microbumps per square centimeter, shrinking the processor-to-memory distance to the micron scale.
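A quick geometry check shows what that microbump density implies; the square-grid assumption below is a simplification for illustration:

```python
# Implied microbump pitch for >100,000 bumps per square centimeter,
# assuming a uniform square grid: pitch = sqrt(area per bump).
import math

bumps_per_cm2 = 100_000
pitch_um = math.sqrt(1e8 / bumps_per_cm2)  # 1 cm^2 = 1e8 square microns
print(f"Implied bump pitch: ~{pitch_um:.0f} um")  # ~32 um
```

A pitch in the tens of microns is exactly the "micron-level" spacing the interposer wiring has to serve.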

AI's rapid growth is driving explosive demand for HBM. As AI spreads across cloud and e-commerce services, smart manufacturing, finance, healthcare, and autonomous driving, demand for AI servers and high-end GPUs has surged. According to TrendForce, 2023 AI server shipments (including GPU-, FPGA-, and ASIC-based systems) reached nearly 1.2 million units, up 38.4% year-over-year and accounting for 9% of total server shipments, a share expected to reach 15% by 2026 (a 22% CAGR from 2022 to 2026). In AI servers, HBM is nearly universal, especially on the training side with GPUs such as NVIDIA's A100/H100. As AIGC models grow more complex, inference also increasingly relies on HBM-equipped high-end GPUs.

When HBM and GPUs are integrated through advanced packaging, the determinant of effective computing power shifts from raw transistor count to bandwidth per watt. NVIDIA's H200 is a good example: it shares the H100's GPU architecture, yet HBM3E integration boosts LLM inference speed by up to 1.9x. In the AI era, the combination of memory bandwidth and energy efficiency is becoming more important than peak floating-point throughput.
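A roofline-style check makes the point: a kernel is memory-bound whenever its arithmetic intensity (FLOPs per byte moved) sits below the ratio of peak compute to peak bandwidth. The figures below are assumed, H100-class ballpark numbers rather than official specifications:

```python
# Roofline test: memory-bound if arithmetic intensity < peak FLOPs / peak B/W.
def is_memory_bound(intensity_flops_per_byte: float,
                    peak_tflops: float, peak_tb_s: float) -> bool:
    ridge = (peak_tflops * 1e12) / (peak_tb_s * 1e12)  # FLOPs/byte at the ridge
    return intensity_flops_per_byte < ridge

# Assumed ballpark: ~990 TFLOPS FP16 compute, 3.35 TB/s HBM3 bandwidth.
# An LLM decode step is GEMV-like: roughly 2 FLOPs per byte of weights read.
print(is_memory_bound(2.0, 990.0, 3.35))  # True -> firmly memory-bound
```

With a ridge point near 300 FLOPs per byte and decode intensity around 2, extra HBM bandwidth converts almost linearly into extra tokens per second, consistent with the H200's gains.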

The global HBM market is highly concentrated: in 2022, SK Hynix held about 50%, Samsung about 40%, and Micron about 10%. SK Hynix, an early mover, supplies NVIDIA with HBM3 and maintains its lead. However, Samsung expanded HBM3 supply to NVIDIA in 2024, and Micron has begun mass-producing HBM3E. All three are ramping capacity: SK Hynix is expanding its M15X plant in Cheongju with a $14.6 billion investment scheduled for completion in November 2025; Samsung's dedicated HBM4 line is in the pilot production phase; and Micron is building HBM test and production lines in the U.S. and exploring Malaysian manufacturing, with possible expansion at Virginia's Dominion Fab6.

Advanced packaging plays an equally critical role in boosting AI performance. As Moore’s Law slows, traditional scaling becomes more expensive and technically challenging due to issues like quantum tunneling and low yield rates. Advanced packaging has emerged as a cost-effective path forward.

Unlike traditional packaging — which focuses on electrical connection and chip protection using wire bonding — advanced packaging emphasizes integration density, miniaturization, and interconnect performance. It leverages bump bonding and includes techniques like Flip Chip, Wafer-Level Packaging (WLP), Fan-In/Fan-Out, 2.5D/3D packaging, hybrid bonding, and chiplets. These methods improve integration, shorten interconnect paths, increase I/O count, enhance thermal performance, and boost design and production efficiency.

For example, an AI training chip's HBM is itself a 3D package (DRAM dies stacked by TSV on a base logic die), and that stack sits beside the processor on a silicon interposer in a 2.5D package. 2.5D/3D packaging is now mainstream for AI and HPC precisely because it relieves the memory wall. In 2.5D, chips are placed side by side on an interposer, which is ideal for pairing logic dies with HBM; in 3D, chips are stacked vertically, which suits high-performance logic and SoCs.

TSMC's CoWoS (Chip-on-Wafer-on-Substrate) is the benchmark 2.5D/3D platform: dies are first attached to an interposer wafer (chip-on-wafer), and the assembly is then mounted on a substrate, saving area, lowering power, and raising performance. Most AI chipmakers use CoWoS, and its capacity has been persistently short.

According to Yole, the global advanced packaging market will grow from $44.3B in 2022 to $78.6B by 2028 (CAGR: 10.6%), far outpacing traditional packaging. TSMC leads due to its advanced process expertise. Its 3DFabric integrates 3D stacking and advanced packaging, adopted by NVIDIA, AMD, and others.

Samsung, Intel, and ASE also hold significant market shares. Chinese firms are catching up: JCET and Tongfu Microelectronics have built CoWoS-like 2.5D capabilities. JCET's XDFOI™ 2.5D line has entered stable production with 4nm multi-chip products, and Tongfu collaborates with AMD on advanced packaging. Shenghe Jingwei has also progressed quickly, offering mass production of multi-chip packaging solutions using TSV substrates, fan-out, and large-size interposers for AI, data centers, and smartphones.

The rise of advanced packaging has fueled demand for semiconductor equipment and materials. Because it borrows front-end wafer processes (lithography, etching, deposition, polishing), it extends equipment demand from the traditional backend into wafer-level processing. Key tools include lithography machines, etchers, film deposition systems, and CMP; key materials include plating solutions, CMP slurry, specialty chemicals, photoresists, bonding adhesives, and sputtering targets. Chinese equipment makers are making progress across bonding, lithography, etching, deposition, and wafer thinning. Although high-end substrates, epoxy molding compound (EMC), photosensitive polyimide (PSPI), and bonding adhesives still rely on imports, domestic firms are accelerating development, with some products in validation or pilot production.

By enabling heterogeneous integration and miniaturized interconnects, advanced packaging is reshaping chip design paradigms. TSMC's CoWoS uses silicon interposers to connect multiple dies; NVIDIA's H100 relies on it to integrate HBM3 at 3.35TB/s of bandwidth. Intel's EMIB instead embeds small silicon bridges in the substrate for chiplet-to-chiplet links. AMD's MI300 uses TSMC's 3D SoIC stacking together with 2.5D packaging to combine CPU, GPU, and HBM in one package, roughly tripling compute density.

TSMC's CoWoS monthly capacity surpassed 40,000 wafers in 2024 but still falls short of demand; NVIDIA's B100 orders alone reportedly pre-booked 60% of it. Samsung is investing $20B in a packaging megafab built around its H-Cube technology, capable of housing 1200mm² interposers, about three times the traditional limit. A single CoWoS line costs over $3B, roughly the price of three conventional OSAT plants.

Packaging value rose from 7% of chip cost in the 28nm era to 25% in the 3nm era. TSMC’s CoWoS pushed gross margins to 52%, forcing traditional OSAT leader ASE to shift focus to advanced packaging. The deeper impact is a shift in supply chain control — chip designers must now coordinate packaging plans with TSMC 18 months in advance. AMD’s MI300X was delayed due to packaging resource shortages.

HBM and advanced packaging are not isolated technologies — they complement each other to advance AI computing. HBM resolves bandwidth and capacity limits, enabling fast, large-scale data access. Advanced packaging enhances interconnects and integration, improving data transfer and system performance.

In AI servers, tight integration of HBM and AI chips via advanced packaging greatly enhances system performance. For example, 2.5D/3D packaging shortens the data path between HBM and GPU, reducing signal delay and speeding up AI processing. This synergy is especially critical for training and inference, significantly improving throughput and responsiveness.
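Flight-time arithmetic gives a feel for the gain; the trace lengths and signal velocity below are illustrative assumptions:

```python
# One-way signal flight time over assumed link lengths. Signals in package
# and board dielectrics travel at roughly half the speed of light in vacuum.
C = 3e8          # speed of light, m/s
V = 0.5 * C      # assumed propagation velocity in dielectric

for name, dist_m in [("board-level trace (GDDR-style), ~5 cm", 0.05),
                     ("2.5D interposer link (HBM), ~1 mm", 0.001)]:
    print(f"{name}: {dist_m / V * 1e12:.0f} ps")
```

Beyond raw flight time, shorter traces need less drive power and allow a much wider, slower bus, which is where most of HBM's energy-per-bit advantage comes from.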

From a supply chain perspective, HBM and advanced packaging drive coordinated upstream and downstream development. Equipment and materials makers like Applied Materials, Tokyo Electron, NAURA, and AMEC must develop new tools to meet HBM and packaging demands. Midstream players like NVIDIA, AMD, Samsung, and TSMC integrate these technologies into competitive products. Downstream AI firms like Google, Microsoft, Baidu, and Alibaba use them to build more powerful AI applications.

In the hidden battlefield of the AI computing revolution, HBM and advanced packaging are pivotal forces. Their development will not only elevate AI capabilities but also reshape the entire semiconductor industry. For companies, seizing the opportunities these technologies create, by increasing R&D investment and strengthening technical and market positions, will be key to gaining a competitive edge.
