The AI Semiconductor Supply Chain and Market: Who Actually Makes the Chips (2026)
Introduction
There is no shortage of talk about what AI models can do, yet surprisingly little about how the chips that actually run those models get made. A single GPU or AI accelerator is not something one company builds end to end. It is the product of a long, tangled value chain involving dozens of firms.
A design house draws the circuits, EDA tools verify the design, IP firms supply core blocks, a foundry etches them into silicon, a packaging house binds many chips and memory into one package, and equipment makers supply the machines for every step. Block any one stage and the chip does not ship.
In this article we dissect the AI chip supply chain stage by stage as of 2026, identify where the bottlenecks sit, examine how geopolitics and market structure intertwine, and lay out the investment-cycle debate that keeps surfacing. This is not a stock recommendation — it is a map for understanding the industry's structure.
The AI Chip Value Chain at a Glance
Let us first sketch the whole flow in a simplified diagram.
[design/architecture] → [EDA tools] → [IP blocks]
↓
[foundry fab] → [HBM memory] → [advanced packaging (CoWoS, etc.)]
↓
[test/validation] → [board/system integration] → [cloud/server deployment]
supporting every step from the side: [semiconductor equipment (EUV, etc.)]
Each stage is highly specialized, and many are concentrated in a handful of companies. That concentration delivers efficiency while creating bottlenecks and geopolitical risk. Let us walk through it stage by stage.
Design, EDA, and IP
A chip starts with design. For AI accelerators, companies like NVIDIA, AMD, Google, and Amazon define the architecture and draw the circuits. Yet even the design cannot be done alone.
- **EDA (Electronic Design Automation) tools**: Humans cannot place billions of transistors by hand, so automation software handles placement, routing, and verification. This market is effectively an oligopoly of Synopsys, Cadence, and Siemens EDA.
- **IP (intellectual property) blocks**: Not every circuit is designed from scratch. Proven blocks — CPU cores (e.g. Arm), interfaces, memory controllers — are licensed and reused. Arm's instruction set and core IP are growing in share even in data center chips.
What distinguishes this stage is that it runs on software and licenses. There is no physical factory here, but if access to the tools or IP is blocked, the design itself cannot proceed.
Foundries: TSMC, Samsung, Intel
Once design is done, the foundry is where the blueprint is actually etched into a silicon wafer. The companies that can mass-produce products demanding leading-edge processes (3nm, 2nm class) for AI chips can be counted on one hand.
- **TSMC (Taiwan)**: Holds an overwhelming position in leading-edge logic foundry. Most high-performance AI accelerators are made on TSMC's latest process. It is the single most important dependency point in the AI chip supply chain.
- **Samsung Foundry (South Korea)**: Another front-runner with its own advanced process, an early mover on next-generation transistor structures like gate-all-around (GAA).
- **Intel Foundry (United States)**: Having started out manufacturing its own chips, it is expanding into a foundry business that takes external customers, aiming for a comeback in advanced process and packaging.
The reason advanced foundries are concentrated in a few hands is simple. Building one cutting-edge fab costs tens of billions of dollars, and process know-how takes decades to accumulate. This barrier to entry is precisely what makes the supply chain fragile. The higher the dependence on one region or one company, the more the whole industry can be shaken by a natural disaster or geopolitical shock.
ASML and EUV
For a foundry to draw fine circuits, it needs equipment that prints patterns onto the wafer using light of a correspondingly short wavelength. Enter EUV (extreme ultraviolet) lithography, which exposes wafers with 13.5 nm light.
The company that makes EUV machines is, effectively, ASML of the Netherlands alone. The machines are essential to advanced processes, and they are expensive: a single EUV system is commonly reported to cost well over a hundred million dollars, and the latest High-NA models are reported at several hundred million dollars each. The machines are so complex to build that supply is limited.
The EUV dependency chain:
advanced AI chip ←depends on─ latest foundry process
latest process ←depends on─ EUV lithography machine
EUV machine ←monopoly── ASML (effectively single supply)
Because of this single dependency, the production and export of EUV equipment governs the entire capacity to make leading-edge chips. Each machine has a long lead time and can become subject to export controls, making it a key lever of geopolitics.
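The dependency chain above can be treated as a small graph. The following is a minimal sketch, not an industry dataset: the stage names, the `SUPPLIERS` and `DEPENDS_ON` maps, and both helper functions are hypothetical and only mirror the firms named in this article. It walks upstream from a product and flags every stage that has exactly one supplier.

```python
# Hypothetical, simplified maps: supplier names follow the article,
# the structure is an illustration rather than a complete picture.
SUPPLIERS = {
    "eda_tools": ["Synopsys", "Cadence", "Siemens EDA"],
    "leading_edge_foundry": ["TSMC", "Samsung Foundry", "Intel Foundry"],
    "euv_lithography": ["ASML"],
    "hbm": ["SK hynix", "Samsung", "Micron"],
}

# For each stage, the upstream stages it cannot work without.
DEPENDS_ON = {
    "advanced_ai_chip": ["leading_edge_foundry", "hbm", "eda_tools"],
    "leading_edge_foundry": ["euv_lithography"],
}


def upstream_stages(stage, depends_on):
    """Return every stage reachable upstream of `stage` (depth-first walk)."""
    seen = []
    stack = list(depends_on.get(stage, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(depends_on.get(current, []))
    return seen


def single_source_stages(stage, depends_on, suppliers):
    """Upstream stages that are served by exactly one supplier."""
    return sorted(
        s for s in upstream_stages(stage, depends_on)
        if len(suppliers.get(s, [])) == 1
    )


print(single_source_stages("advanced_ai_chip", DEPENDS_ON, SUPPLIERS))
# ['euv_lithography']
```

Even though the chip never touches an EUV machine directly, the walk surfaces it as the one single-source stage, which is the point the chain above makes.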
HBM: SK hynix, Samsung, Micron
For an AI accelerator, fast compute units are not the end of the story. Without the memory bandwidth to feed enormous model weights quickly, the compute units starve. The answer to this is HBM (High Bandwidth Memory).
HBM stacks multiple DRAM dies vertically and sits right next to the accelerator, providing enormous bandwidth. The HBM in a single AI chip accounts for a substantial share of the chip's cost, and that share grows with each generation.
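To make "the compute units starve" concrete, here is a rough back-of-the-envelope sketch. The function names are hypothetical. The 1024-bit interface at 6.4 Gb/s per pin resembles an HBM3-class stack, while the stack count and model size are illustrative assumptions rather than the specification of any particular accelerator. The bound assumes that generating each token requires reading all model weights from memory once, which ignores batching and caching.

```python
def stack_bandwidth_gb_s(interface_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one HBM stack in GB/s: bus width x per-pin rate / 8."""
    return interface_bits * pin_rate_gbps / 8


def max_decode_tokens_per_s(total_bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s if every token must stream all weights once."""
    return total_bandwidth_gb_s / weights_gb


# Assumed HBM3-class figures: 1024-bit interface, 6.4 Gb/s per pin.
per_stack = stack_bandwidth_gb_s(1024, 6.4)
print(round(per_stack, 1))  # 819.2 GB/s per stack

# Hypothetical accelerator with 8 stacks serving a 70B-parameter model
# stored at 2 bytes per parameter (about 140 GB of weights).
total = per_stack * 8
print(round(max_decode_tokens_per_s(total, 140.0), 1))
```

Under these assumptions the ceiling is set by memory bandwidth alone, no matter how fast the compute units are, which is why each HBM generation matters so much for accelerator performance.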
The companies that can mass-produce HBM are also few.
- **SK hynix (South Korea)**: Has built a leading position in HBM for high-performance AI accelerators.
- **Samsung (South Korea)**: A heavyweight across memory broadly, deeply engaged in the HBM race.
- **Micron (United States)**: The third major supplier, growing its share alongside expanding AI demand.
HBM is not a mere component but a decisive variable in AI chip performance. As of 2026 the transition to the next-generation HBM4 is underway, and memory companies are collaborating with accelerator firms at a level approaching co-design.
The CoWoS and Advanced Packaging Bottleneck
To integrate an accelerator die and several HBM stacks into one package, you need advanced packaging that connects them precisely. The representative technology is TSMC's CoWoS (Chip-on-Wafer-on-Substrate).
This process — placing multiple chips and memory on one interposer and wiring them at ultra-high density — is effectively the final assembly step of an AI chip. And in recent years it has been one of the most notorious bottlenecks.
The structure of the packaging bottleneck:
even if you can make plenty of GPU dies,
+ if the CoWoS capacity to attach HBM stacks is short,
→ the shipment of finished accelerators is constrained.
In other words, no matter how many compute dies you stamp out, if the packaging capacity to bind them with HBM falls short, the final product does not ship. Foundries and packaging houses have aggressively expanded CoWoS-class advanced packaging capacity, but keeping pace with AI demand growth has always been a challenge. Ask "where is the jam in the supply chain?" and the recent answer has often been advanced packaging.
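The bottleneck logic above is just a minimum over stage capacities. The sketch below is a simplified model with made-up numbers: it assumes one compute die per package, and the function names and the quantities in `plan` are hypothetical.

```python
def capacity_by_stage(good_dies, hbm_stacks, cowos_slots, stacks_per_package):
    """Express each stage's output in units of finished accelerators."""
    return {
        "compute dies": good_dies,
        "HBM": hbm_stacks // stacks_per_package,
        "advanced packaging": cowos_slots,
    }


def shippable_accelerators(good_dies, hbm_stacks, cowos_slots, stacks_per_package):
    """Finished units are capped by the scarcest stage."""
    caps = capacity_by_stage(good_dies, hbm_stacks, cowos_slots, stacks_per_package)
    return min(caps.values())


def binding_stage(good_dies, hbm_stacks, cowos_slots, stacks_per_package):
    """Name of the stage that currently limits shipments."""
    caps = capacity_by_stage(good_dies, hbm_stacks, cowos_slots, stacks_per_package)
    return min(caps, key=caps.get)


# Illustrative quantities only.
plan = dict(good_dies=100_000, hbm_stacks=640_000, cowos_slots=60_000,
            stacks_per_package=8)
print(shippable_accelerators(**plan), binding_stage(**plan))
# 60000 advanced packaging

# Doubling die output changes nothing while packaging is the constraint.
more_dies = dict(plan, good_dies=200_000)
print(shippable_accelerators(**more_dies))  # 60000
```

Only adding capacity at the binding stage raises shipments, and once it is relieved the next-scarcest stage (HBM in this example) takes its place.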
The Cost Structure of a Single Chip
Looking at how the price of a single AI accelerator is built up gives you a feel for how each supply chain stage translates into cost. The exact figures vary by product and point in time, but the rough proportions are consistent.
Major items composing the cost of a finished accelerator (rough share):
compute die (foundry manufacturing) : large share
HBM memory stacks : a growing share
advanced packaging (CoWoS, etc.) : a meaningful share
test / yield loss : hidden cost
amortized R&D / software : reflected in price
What stands out here is the share of HBM. As generations advance, each accelerator carries more and faster HBM, so memory's portion of the total cost grows. That is, what governs accelerator pricing is not the compute die alone but the price and supply of the memory attached beside it.
Yield is also a hidden variable. The more advanced the process, the higher the defect probability, and the larger the die, the greater the risk that a single defect throws away the entire die. So designs that split a chip into small pieces (chiplets) to raise yield are increasingly common.
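The relationship between die size and yield can be illustrated with the Poisson yield model, a common textbook approximation in which yield falls exponentially with die area times defect density. This is a sketch under stated assumptions, not a model any foundry is claimed to use: the defect density and die areas are made up, dies are assumed to be tested before assembly, and packaging yield and the extra area needed for chiplet-to-chiplet links are ignored.

```python
import math


def poisson_yield(area_cm2: float, defect_density_per_cm2: float) -> float:
    """Fraction of dies with zero defects under the Poisson yield model."""
    return math.exp(-area_cm2 * defect_density_per_cm2)


def silicon_per_good_product(die_area_cm2, dies_per_product, defect_density_per_cm2):
    """Average wafer area (cm^2) consumed per working product,
    assuming only known-good dies go into assembly."""
    y = poisson_yield(die_area_cm2, defect_density_per_cm2)
    return dies_per_product * die_area_cm2 / y


D0 = 0.1  # hypothetical defects per cm^2

# One 8 cm^2 monolithic die versus four 2 cm^2 chiplets (same total area).
monolithic = silicon_per_good_product(8.0, 1, D0)
chiplets = silicon_per_good_product(2.0, 4, D0)

print(f"monolithic yield: {poisson_yield(8.0, D0):.3f}")
print(f"chiplet yield:    {poisson_yield(2.0, D0):.3f}")
print(f"silicon per good product: {monolithic:.1f} vs {chiplets:.1f} cm^2")
```

With these illustrative inputs the monolithic design discards more than half of its dies, while each small chiplet yields above 80 percent, so far less silicon is wasted per working product.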
Chiplets and the Evolution of Packaging
In the past, the standard was to put all functions into one giant monolithic die. But because larger dies suffer lower yield and sharply higher cost, the mainstream approach today is to split a chip into several small chiplets and bind them back together at the packaging stage.
monolithic: [────── one big die ──────] low yield, low flexibility
chiplet : [small die][small die][small die] combined by packaging
→ each chiplet can be made on the optimal process
→ a defect is confined to a small chiplet, improving yield
→ the same chiplet can be reused across products
This trend raises the importance of packaging a notch further, because the packaging technology that connects chiplets at high bandwidth becomes a core variable of chip performance itself. This is why advanced packaging like the CoWoS seen earlier is treated not as mere assembly but as part of the design.
Interconnect: From Chip to System
A giant model cannot run on a single accelerator. Hundreds or thousands of accelerators must be bound to act like one giant system, and the interconnect that links them governs whole-system performance.
hierarchical connection structure:
inside a chip : ultra-high-bandwidth links between dies/chiplets
inside a server : high-speed links between accelerators (e.g. NVLink)
between servers : rack/cluster network (e.g. UALink, Ethernet-based)
Competition over interconnect standards is also an important axis of the supply chain. Whether you are locked into one company's proprietary standard or choose an open standard directly affects cost and dependence. As of 2026, proprietary interconnects like NVLink and open camps like UALink are competing, and the contest is about ecosystem leadership as much as it is about technology.
Geopolitics and Export Controls
The AI chip supply chain is treated as a national security issue as much as a technical one. As advanced chips are perceived as central to military and economic competitiveness, major nations intervene deeply through export controls and industrial policy.
The key points of contention:
- **Advanced chip export controls**: Exports of certain high-performance AI accelerators and their manufacturing equipment to certain countries are restricted. This fragments the market and spawns spec-altered products designed to skirt the controls.
- **Equipment export controls**: When the export of advanced manufacturing equipment, including EUV, is restricted, it becomes hard for a given region to achieve self-sufficiency in leading-edge processes.
- **Production concentration risk**: Advanced foundry, packaging, and HBM are heavily concentrated in East Asia, so the United States, Europe, and others are using subsidy policies to attract domestic production facilities.
As a result, a supply chain once concentrated in one place purely for efficiency is being reshaped toward diversification with security and resilience in mind. That said, the entry barrier for advanced processes is so high that reducing dependence quickly is realistically constrained.
The Rise of In-House Cloud Silicon
For a long time the data center AI chip market was dominated by NVIDIA GPUs. But the trend of large cloud providers designing and deploying their own ASICs (custom chips) is growing fast.
- **Google**: Has long run the TPU series, and as of 2026 operates the 6th-generation Trillium and the inference-specialized 7th-generation Ironwood.
- **Amazon**: Has grown its own chips in the Trainium (training) and Inferentia (inference) lines.
- **Microsoft**: Is pushing its own AI accelerator line and expanding its use on internal workloads.
The reasons they build their own chips are clear: to control enormous inference costs, reduce dependence on a single supplier, and gain efficiency optimized for their own workloads. Industry projections estimate that the share of inference ASICs will expand from roughly 15 percent in 2024 to roughly 40 percent in 2026. And 2026 is cited as the year inference-related capital expenditure first overtakes training-related capex. As inference workloads grow in weight, the appeal of efficient custom chips rises.
NVIDIA's Dominance and the Challengers
That said, it is too early to call NVIDIA's position shaken. As of 2026, NVIDIA is estimated to hold roughly 75 to 80 percent of the AI accelerator market. Its strength lies not just in the chips but in its software ecosystem (CUDA), networking (NVLink), and a product roadmap that refreshes quickly every generation.
Looking at NVIDIA's 2026 roadmap, the Blackwell generation carries a second-generation Transformer Engine, and the next-generation Vera Rubin aims to sharply raise performance-per-watt on an HBM4 basis. This rapid refresh cadence builds a moat that challengers find hard to cross.
The landscape of challengers:
| Camp | Representative players | Differentiation |
| --- | --- | --- |
| General-purpose GPU rivals | AMD (MI350X, etc.) | Price/performance and an open software stack |
| In-house cloud ASICs | Google, Amazon, Microsoft | Internal workload optimization, cost control |
| Inference-specialized startups | Groq, SambaNova, etc. | Low-latency, high-throughput inference |
| Wafer-scale | Cerebras | A giant single chip that avoids communication bottlenecks |
Rather than overturning NVIDIA's overall position at a stroke, these players pursue a strategy of gradually eroding share in specific workloads (especially inference) and price-sensitive markets. As the market's weight shifts from training-centric to inference-centric, the opportunity for these challengers grows.
Pricing and Supply Constraints
The AI chip market is not purely a performance race. It is also a market where supply constraints govern price and availability. The CoWoS packaging capacity, HBM output, and advanced foundry slots seen above all act as limiting factors.
These constraints produce several consequences.
- High-performance accelerators often see demand exceed supply, so securing them becomes a competitive advantage in itself.
- Large cloud providers lock up capacity in advance through long-term prepurchase agreements.
- Supply shortages strengthen the motivation to develop in-house chips and adopt alternative accelerators.
In short, the price of AI chips is not set by design excellence alone but jointly by the capacity at the supply chain's bottleneck points. This is why understanding the supply chain structure is, in effect, understanding the market.
The Investment-Cycle Debate
One of the hottest debates as of 2026 is whether today's enormous AI infrastructure investment is sustainable. As the scale of capital expenditure poured into data centers, accelerators, and power infrastructure reaches historically notable levels, two views clash.
- **The optimists**: AI generates sufficient returns through productivity and new services, and inference demand grows long-term beyond training demand. Current investment is therefore a rational act of preparing ahead for future demand.
- **The cautious**: If actual revenue growth falls short of what is needed to recoup the investment, the result can be overcapacity and falling prices. There is also a risk that supply and demand flip if a bottleneck at one stage clears and supply surges all at once.
Which side is right is a question time will answer, and this article does not aim to declare a verdict. What is clear is that the direction of this debate directly affects the capacity-expansion decisions and pricing at each supply chain stage described above. When demand forecasts shift, the expansion plans of foundries, packaging, and memory all sway together.
Implications for Developers and Companies
Understanding this supply chain structure is practically useful.
- **Accelerator availability fluctuates**: Do not assume a specific chip will always be sufficiently supplied; keeping alternative accelerators and portability in mind is a resilient strategy.
- **Inference efficiency matters more and more**: As the market's weight shifts toward inference, the value of model compression, quantization, and serving optimization rises.
- **Supply chain diversification is a cost-vs-stability trade-off**: Weigh the risk of being locked to a single supplier against the cost of diversifying.
- **Geopolitical variables seep into technical decisions**: Export controls and subsidy policies affect which chips you can use and where.
At the company level, rather than simply chasing "the fastest chip," you need a balanced view that considers total cost of ownership, supply stability, and the software ecosystem together.
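As a rough illustration of why "the fastest chip" and "the cheapest tokens" can diverge, the sketch below computes a simplified cost per million tokens. Everything here is hypothetical: the function name, both chip profiles, and the prices are invented for illustration. The model counts only purchase price and electricity, assumes the chip draws full power for its whole lifetime, and ignores cooling, networking, staffing, and software porting costs.

```python
def cost_per_million_tokens(purchase_price_usd, lifetime_years, power_kw,
                            electricity_usd_per_kwh, tokens_per_second,
                            utilization):
    """Simplified TCO: (hardware + energy) divided by tokens served."""
    hours = lifetime_years * 365 * 24
    energy_cost = power_kw * hours * electricity_usd_per_kwh
    total_cost = purchase_price_usd + energy_cost
    total_tokens = tokens_per_second * utilization * hours * 3600
    return total_cost / total_tokens * 1_000_000


# Shared, made-up operating assumptions.
common = dict(lifetime_years=4, electricity_usd_per_kwh=0.10, utilization=0.6)

# Two invented accelerator profiles.
fast_chip = cost_per_million_tokens(
    purchase_price_usd=30_000, power_kw=1.0, tokens_per_second=10_000, **common)
efficient_chip = cost_per_million_tokens(
    purchase_price_usd=12_000, power_kw=0.5, tokens_per_second=5_000, **common)

print(f"fast chip:      ${fast_chip:.4f} per million tokens")
print(f"efficient chip: ${efficient_chip:.4f} per million tokens")
```

In this made-up comparison the slower chip serves tokens more cheaply, which is the kind of trade-off a total-cost-of-ownership view is meant to expose. Supply stability and ecosystem maturity would need to be weighed on top of it.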
Conclusion
A single AI chip is not the work of one company but the product of a long, intricate chain linking design, EDA, IP, foundry, HBM, packaging, and equipment. Each link of this chain is concentrated in a few firms, delivering efficiency while creating bottlenecks and geopolitical risk.
The landscape of 2026 shows in-house cloud silicon and inference-specialized challengers gradually widening their territory amid NVIDIA's strong dominance, inference investment overtaking training investment for the first time, and advanced packaging and HBM acting as core bottlenecks. Over all of this loom geopolitics and the investment-cycle debate as major variables.
The answer to "who makes the chips" is, in the end, not a single name but a network of many interdependent names. Simply holding a map of that network in your head helps you judge, amid a flood of news, which changes truly matter.