Chaos and Order

Introduction

In 2026, the biggest topic in AI hardware is no longer "how fast can we compute." The real bottleneck has shifted to "how cheaply and quickly can we move data." The floating-point throughput of a single GPU has grown by a factor of dozens over the past decade, but the memory bandwidth feeding that GPU and the interconnect linking chip to chip have not kept pace.

This gap is commonly called the "memory wall." And as of 2026, one of the most ambitious approaches to crossing the memory wall is moving data with light — that is, photonic computing and optical interconnects.

This post starts with why electrical interconnects have hit their limits, then covers the basic building blocks of silicon photonics, the physical advantages of optical interconnects, real-world cases like Lightmatter's Passage and DARPA's photonic programs, and finally academia's photonic tensor core research in 2026 and the commercialization challenges of co-packaged optics (CPO). At the end, I also summarize what this shift means for developers who work with GPUs and CUDA.

The Limits of Electrical Interconnects

Why the Memory Wall Appears

If we simplify the structure of a modern AI accelerator, it is "a giant compute unit plus high-bandwidth memory (HBM) attached beside it." The problem is that the growth rate of the compute unit's throughput and the growth rate of the bandwidth pulling data from memory differ.

Figure: Compute vs memory bandwidth (conceptual growth curves). Axes: performance over time. Compute (FLOPs) climbs steeply while memory bandwidth rises slowly; the widening gap between the two curves is the memory wall.

The compute unit ends up idling while it waits for data, and this "starved compute unit" phenomenon is the essence of the memory wall. NVIDIA's Blackwell generation (as of GTC 2026) raised compute efficiency with its 2nd-generation Transformer Engine, but the structure in which HBM bandwidth and inter-chip links ultimately set the ceiling on overall performance remains unchanged.
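The "starved compute unit" can be made concrete with a roofline-style estimate. The sketch below is only an illustration: `PEAK_FLOPS` and `MEM_BW` are hypothetical round numbers, not the specification of any real accelerator, and the traffic model assumes each matrix crosses the memory interface exactly once.

```python
def matmul_intensity(m: int, n: int, k: int, bytes_per_elem: int = 2) -> float:
    """Arithmetic intensity (FLOP per byte) of C[m,n] = A[m,k] @ B[k,n].

    Assumes each matrix crosses the memory interface exactly once.
    """
    flops = 2 * m * n * k
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved


def attainable_flops(peak_flops: float, mem_bw: float, intensity: float) -> float:
    """Roofline bound: the lower of peak compute and bandwidth * intensity."""
    return min(peak_flops, mem_bw * intensity)


# Hypothetical accelerator, NOT a real product spec.
PEAK_FLOPS = 1.0e15   # 1 PFLOP/s
MEM_BW = 4.0e12       # 4 TB/s

if __name__ == "__main__":
    for shape in [(1, 8192, 8192), (4096, 8192, 8192)]:
        ai = matmul_intensity(*shape)
        frac = attainable_flops(PEAK_FLOPS, MEM_BW, ai) / PEAK_FLOPS
        print(f"{shape}: {ai:.1f} FLOP/B, at most {frac:.1%} of peak")
```

With these assumed numbers the compute unit needs about 250 FLOP per byte to stay busy. A batch-1 matrix-vector product delivers under 1 FLOP per byte, so it can use well under 1 percent of peak, while the large-batch case is compute-bound.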

Data-Movement Energy Dominates Compute Energy

The more fundamental problem is energy. On modern process nodes, an arithmetic operation such as a floating-point addition costs on the order of a picojoule (pJ) or less, while moving its operands to the other side of the chip, or to the chip next door, costs tens to hundreds of picojoules. In other words, there is a paradox: "moving data is tens to hundreds of times more expensive than computing on it."

Compute vs data-movement energy (rough relative comparison)

| Operation | Energy (relative) |
| --- | --- |
| 32-bit integer add | very small (baseline 1) |
| 32-bit SRAM read | about 5 |
| short on-chip wire move | about tens |
| long on-chip wire move | about hundreds |
| off-chip transfer | about 1000 or more |

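The farther data must move, the more steeply the energy cost climbs: each step outward in the table, from local SRAM to long on-chip wires to off-chip links, adds roughly an order of magnitude. A substantial portion of a data center's power budget is spent not on actual computation but on shuttling data back and forth.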
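To see how quickly movement swamps compute, here is a small sketch that reuses the rough relative costs from the table above. The `reuse` parameter (how many operations each fetched operand feeds) is an assumed simplification, not something the table specifies.

```python
# Relative energy costs taken from the table above (integer add = 1).
REL_ENERGY = {"add": 1.0, "sram_read": 5.0, "off_chip": 1000.0}


def movement_share(operands_per_op: float, source: str, reuse: float = 1.0) -> float:
    """Fraction of total energy spent moving data rather than computing.

    reuse = how many operations each fetched operand feeds (assumed model).
    """
    compute = REL_ENERGY["add"]
    movement = operands_per_op * REL_ENERGY[source] / reuse
    return movement / (compute + movement)


if __name__ == "__main__":
    print(movement_share(2, "sram_read"))             # operands from local SRAM
    print(movement_share(2, "off_chip"))              # operands from off-chip
    print(movement_share(2, "off_chip", reuse=1000))  # heavy on-chip reuse
```

Even with every off-chip operand reused a thousand times, movement still accounts for about two thirds of the energy in this toy model, which is why reuse and locality matter so much.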

The Physical Limits of Copper: Reach and Loss

When you send an electrical signal over copper wiring, the signal attenuates sharply as frequency rises. Compensating for this requires stronger drivers, more complex equalizers, and more power. As a result, high-speed electrical links suffer from a triple burden:

- Reach: beyond a few tens of centimeters, signal integrity becomes hard to maintain.

- Loss: insertion loss grows as frequency increases.

- Crosstalk: the denser the wiring, the worse the interference between adjacent channels.

The reason wafer-scale chips like the Cerebras WSE-3 emerged can be understood in this context. The WSE-3 packs about 4 trillion transistors, about 900,000 cores, and about 44GB of on-chip SRAM onto a single wafer, pushing on-chip bandwidth up to about 21 PB/s. The idea is: "if sending data off-chip is the most expensive thing, then put everything inside one chip." But there is a limit even to a single wafer, and the moment you connect multiple wafers or multiple systems, the interconnect problem returns. This is exactly where light enters.

Silicon Photonics Basics

Silicon photonics is the technology of building components that manipulate light on a silicon chip, using methods similar to existing CMOS semiconductor processes. Let's look at a few core components.

Waveguide

A waveguide is the road that light travels. It uses the refractive-index difference between silicon and silicon oxide to confine light within a narrow channel and carry it. It corresponds to wiring in an electrical circuit, but two optical paths can cross with little mutual interference, and loss stays relatively low even as the signaling rate rises.

Modulator

This is the component that loads an electrical signal onto light. It converts digital bits (0 and 1) into changes in the intensity or phase of light. Representative examples are the Mach-Zehnder modulator (MZM) and the microring modulator.

Photodetector

The opposite of a modulator: it converts arriving light back into an electrical signal. It is usually built by integrating germanium (Ge) into silicon. It corresponds to the "receiving end" of an optical link.

Mach-Zehnder Interferometer (MZI)

If you split light into two branches, change the phase on one side, and recombine them, constructive and destructive interference occurs depending on the phase difference between the two beams. This principle lets you switch light or multiply it by a weight. The MZI is the core building block of the optical matrix multiplication discussed later.

Mach-Zehnder Interferometer (MZI) concept

                phase shifter
    input ---+--[ theta ]--+--- output1 (bright/dark)
             |             |
             +-------------+--- output2
          splitter      combiner

    the phase difference theta sets the intensity ratio of the two outputs
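The dependence of the two outputs on theta can be checked numerically. This is a minimal sketch of an ideal, lossless MZI built from two 50:50 couplers; the coupler phase convention (a factor of `1j` on the cross path) is an assumption, and which port counts as "output1" depends on that convention.

```python
import cmath
import math


def mzi_outputs(theta: float) -> tuple[float, float]:
    """Output powers of an ideal MZI for unit power into the upper input."""
    s = 1 / math.sqrt(2)
    # First 50:50 coupler acting on the input amplitudes (1, 0).
    upper, lower = s * 1.0, s * 1j
    # Phase shifter on the upper arm.
    upper *= cmath.exp(1j * theta)
    # Second 50:50 coupler recombines the arms.
    out1 = s * (upper + 1j * lower)
    out2 = s * (1j * upper + lower)
    return abs(out1) ** 2, abs(out2) ** 2


if __name__ == "__main__":
    for theta in (0.0, math.pi / 2, math.pi):
        p1, p2 = mzi_outputs(theta)
        print(f"theta={theta:.2f}  output1={p1:.3f}  output2={p2:.3f}")
```

Under this convention the powers follow sin^2(theta/2) and cos^2(theta/2): theta = 0 sends everything to one port, theta = pi to the other, and intermediate values split the light. That continuous split is what lets an MZI act as a switch or as an analog weight.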

Microring Resonator

A small ring-shaped waveguide that acts as a filter, resonating and trapping only light of a specific wavelength. It is small, so integration density is high, and it is used as a modulator or wavelength filter. However, it is extremely sensitive to temperature, making it the chief culprit behind the thermal-stability problem discussed later.

Wavelength-Division Multiplexing (WDM)

A technology that carries several beams of light of different wavelengths (colors) over a single waveguide at the same time. One electrical wire carries one signal, but one optical waveguide can carry many wavelengths at once. This is the key to sharply raising the bandwidth density of optical interconnects.

WDM: many wavelengths over one waveguide simultaneously

    lambda1 --+
    lambda2 --+
    lambda3 --+--[ multiplexer ]== one waveguide ==[ demultiplexer ]--+-- lambda1
    lambda4 --+                                                       +-- lambda2
                                                                      +-- lambda3
                                                                      +-- lambda4

    (one copper strand = one signal)
    (one optical waveguide = many signals)

Advantages of Optical Interconnects

Moving data with light brings the following physical advantages over electrical signaling.

High Bandwidth Density

Thanks to WDM, a single physical channel can carry multiple wavelengths, so bandwidth per unit area and per unit edge (beachfront) is far higher than with electrical signaling. Chip beachfront is a finite resource, but optics can push far more bits out of the same edge.
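A back-of-the-envelope sketch of the beachfront argument: escape bandwidth is the product of edge length, lanes per millimeter, wavelengths per lane, and rate per channel. All the numbers in the example are hypothetical placeholders chosen only to show the scaling.

```python
def edge_bandwidth_gbps(edge_mm: float, lanes_per_mm: float,
                        gbps_per_channel: float,
                        wavelengths_per_lane: int = 1) -> float:
    """Aggregate escape bandwidth across one chip edge, in Gb/s."""
    return edge_mm * lanes_per_mm * wavelengths_per_lane * gbps_per_channel


if __name__ == "__main__":
    # Hypothetical: 10 mm of edge, 4 lanes per mm, 100 Gb/s per channel.
    electrical = edge_bandwidth_gbps(10, 4, 100)  # one signal per lane
    optical = edge_bandwidth_gbps(10, 4, 100, wavelengths_per_lane=8)
    print(electrical, optical, optical / electrical)
```

With the lane count and per-channel rate held equal, the gain is simply the number of wavelengths per waveguide. Real designs differ in lane pitch and channel rate as well, so treat this as the shape of the argument rather than a prediction.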

Low Latency

An optical link needs far less of the heavy equalization or retransmission machinery that adds latency to high-speed electrical links, so latency rises only gently as distance grows.

Low Crosstalk

Beams of light of different wavelengths barely interfere even within the same waveguide. Signal leakage into adjacent channels, as happens with electrical wiring, is far less common.

Energy Cost Insensitive to Distance

Electrical links see their energy surge as distance grows, but optical links are dominated by the cost of modulation and detection, so once converted to light, they are relatively insensitive to distance. The per-bit energy that optical I/O targets drops well below the few-pJ region.

The table below is a rough comparison of the character of electrical and optical interconnects.

| Item | Electrical interconnect (copper) | Optical interconnect (photonics) |
| --- | --- | --- |
| Reach | short (tens of cm) | long (meters to tens of meters or more) |
| Bandwidth density | limited | high (using WDM) |
| Energy vs distance | surges | relatively insensitive |
| Crosstalk | large | small |
| Maturity | very mature | developing |
| Thermal/packaging difficulty | low | high (lasers, ring stabilization) |

Lightmatter Passage — A 3D Photonic Interposer

Lightmatter is one of the most closely watched companies in the photonic interconnect space. Their Passage is a "photonic interposer" laid beneath the chips.

A traditional interposer is a packaging substrate that links multiple chiplets with electrical wiring. The idea of Passage is to embed a layer of optical waveguides into the interposer itself, so the compute chips placed on top of it communicate with one another using light.

3D photonic interposer concept

    [ compute chip A ]   [ compute chip B ]   [ compute chip C ]
            |                    |                    |
    ===optical I/O==========optical I/O==========optical I/O====   <- photonic interposer layer
    ||              waveguide + WDM routing mesh              ||
    ============================================================

    (chips linked by light instead of electrical wiring)

This makes the chips behave as if placed on one large fabric, allowing the bandwidth limit of the chip edge to be bypassed with light. It is advantageous for binding multiple GPUs or accelerators into a single giant logical compute resource.

Companies moving in a similar direction include Ayar Labs, which provides chiplet-form optical I/O ("optical I/O chiplet") to add optical links beside an existing SoC, and Celestial AI, which is pursuing a fabric called Photonic Fabric that links memory and compute with light. The approaches differ slightly, but the common goal is "crossing the memory wall by converting data movement to light."

DARPA Photonic Programs — Connecting Wafer-Scale Nodes

The U.S. DARPA has invested in photonics for a long time. As of 2026, a particularly interesting direction is research that connects the wafer-scale compute nodes seen earlier with light.

A wafer-scale chip has enormous bandwidth within a single chip, but the moment you bind multiple wafers or multiple systems together, it again hits the limits of electrical interconnects. DARPA's photonic programs aim to solve this "node-to-node" connection with light, so that multiple giant chips operate as a single system.

The core technical challenges are as follows.

- Coupling technology to efficiently put light in and out at the wafer edge

- Laser sources that stably supply many wavelengths

- A switching fabric that routes thousands of optical channels at once

- Reliability and thermal stability that hold up even in military and space environments

This kind of national-scale R&D investment helps build the foundational technology that the commercial ecosystem later draws on.

Photonic Tensor Cores and Photonic In-Memory Research

So far we have talked about light as an interconnect that "moves data," but a more radical direction is "doing the computation itself with light." In 2026, research on this topic appears regularly on arXiv and in journals such as Nature Photonics.

Multiplying Matrices with Light

The core operation of deep learning is, ultimately, matrix multiplication. If you arrange the MZIs seen earlier in a mesh, light passing through that mesh undergoes a linear transformation, which is exactly a matrix multiply. The phase-shifter settings determine the entries of the matrix, so they play the role of the weights.

MZI-mesh-based optical matrix multiply concept

    input vector (encoded as light amplitude)

    x1 --+
    x2 --+   +--[MZI]--[MZI]--+
    x3 --+-->|   [MZI]--[MZI]  |--> output vector y = W . x
    x4 --+   +--[MZI]--[MZI]--+

    (phase shifter settings = weight matrix W)

Light traverses the mesh in a single pass, so in theory the matrix multiply completes in the time it takes light to cross the chip. The appeal is that multiply-accumulate (MAC) can be performed almost passively, at very low energy. Such structures are commonly called "photonic tensor cores."
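The idea that a mesh of MZIs applies a matrix to light can be sketched in a few lines. This is an idealized, lossless toy: the 2x2 block convention, the six-MZI rectangular layout on four modes, and the random phase settings in `LAYERS` are all assumptions made for illustration. Note that a lossless mesh like this realizes a unitary matrix; as commonly described in the literature, a general weight matrix is obtained by combining two such meshes with a layer of per-channel attenuation or gain.

```python
import cmath
import math
import random


def mzi_block(theta: float, phi: float):
    """2x2 transfer matrix of one ideal MZI: external phase phi on the upper
    input, then coupler, internal phase theta, coupler."""
    et, ep = cmath.exp(1j * theta), cmath.exp(1j * phi)
    return [
        [0.5 * (et - 1) * ep, 0.5j * (et + 1)],
        [0.5j * (et + 1) * ep, 0.5 * (1 - et)],
    ]


def apply_mesh(x, layers):
    """Propagate complex amplitudes x through (mode, theta, phi) MZIs,
    each acting on the adjacent mode pair (mode, mode + 1)."""
    y = list(x)
    for mode, theta, phi in layers:
        u = mzi_block(theta, phi)
        a, b = y[mode], y[mode + 1]
        y[mode] = u[0][0] * a + u[0][1] * b
        y[mode + 1] = u[1][0] * a + u[1][1] * b
    return y


def power(v) -> float:
    """Total optical power carried by an amplitude vector."""
    return sum(abs(c) ** 2 for c in v)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Rectangular arrangement on 4 modes: 6 MZIs = N(N-1)/2 for N = 4.
LAYERS = [(mode, rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi))
          for mode in (0, 2, 1, 0, 2, 1)]

if __name__ == "__main__":
    x = [0.5, 0.5, 0.5, 0.5]
    y = apply_mesh(x, LAYERS)
    print("input power:", power(x), "output power:", power(y))
```

The output power equals the input power, which is the signature of a unitary transform, and the mapping is linear in the input amplitudes. "Programming the weights" here means choosing the theta and phi values.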

Photonic In-Memory Computing

Another direction is "photonic in-memory" research, which physically stores weights in optical elements (such as phase-change materials or microrings) and performs multiplication on the spot by passing light through them. It is an attempt to eliminate the very process of moving data from memory to the compute unit, taking direct aim at the memory wall problem.

In academia, optical memory using phase-change materials, MZI-mesh-based optical neural networks, and parallel optical computation using frequency combs are recurring themes. (Rather than citing a specific arXiv number, it is safer to remember these by research trend and keyword.)

That said, much of this research is still at the laboratory stage, and there are many challenges to solve: precision (noise in analog computation), reconfiguration speed, handling nonlinear functions, and integration with digital systems. In the short term, the more realistic option is optical interconnects that "still compute with electricity, but convert only the chip-to-chip communication to light."
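The precision challenge can be quantified with a standard rule of thumb: analog noise of a given RMS level is comparable to the quantization noise of an ideal converter whose step size is that RMS value times sqrt(12). The sketch below applies it; the additive Gaussian read-out noise in `noisy_dot` is an assumed model, not a description of any particular photonic device.

```python
import math
import random


def effective_bits(full_scale: float, noise_rms: float) -> float:
    """Bits of an ideal quantizer whose quantization noise (LSB / sqrt(12))
    equals the given analog noise over the same full-scale range."""
    return math.log2(full_scale / (noise_rms * math.sqrt(12)))


def noisy_dot(w, x, noise_rms: float, rng: random.Random) -> float:
    """Dot product with additive Gaussian read-out noise (assumed model)."""
    exact = sum(wi * xi for wi, xi in zip(w, x))
    return exact + rng.gauss(0.0, noise_rms)


if __name__ == "__main__":
    rng = random.Random(0)
    for noise in (1e-2, 1e-3, 1e-4):
        sample = noisy_dot([0.3, -0.2], [0.5, 0.5], noise, rng)
        print(f"noise_rms={noise:g}  ~{effective_bits(1.0, noise):.1f} bits"
              f"  sample={sample:.5f}")
```

Read this way, an analog optical MAC behaves like a low-precision digital one, which is why experience with low-precision training and inference transfers.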

Co-Packaged Optics (CPO)

The closest form in which optical interconnects enter actual products is co-packaged optics, or CPO.

Traditionally, optical modules (optics) were plugged into the edge of a switch or accelerator board as separate components (pluggable transceivers). CPO integrates this optical engine right next to the switch ASIC or GPU package, on the same substrate. Instead of electrical signals running a long way over copper, they are converted to light directly inside the package.

Pluggable optics vs Co-Packaged Optics

    [Pluggable]
      ASIC --long copper trace-- board edge - [optical module]
      (longer copper section = more loss/power)

    [CPO]
      +--------------- package ---------------+
      | ASIC -short link- [optical engine]    |=== output directly to fiber
      +---------------------------------------+
      (minimized copper section, lower per-bit energy)

The benefits of CPO are clear. The copper section shortens, so per-bit energy drops and bandwidth density rises. Major switch vendors have begun releasing CPO-based products, and it is drawing particular attention in the scale-out networks of AI clusters.

NVIDIA's next-generation roadmap (the Vera Rubin generation in late 2026, adopting HBM4, targeting about 10x performance per watt) also reveals a trend toward pulling chip-to-chip and node-to-node connections into the optical domain. 2026 is projected to be the year inference capex first overtakes training capex, and because inference is deployed in a distributed manner at scale, node-to-node communication efficiency translates directly into cost. With NVIDIA holding about 75 to 80 percent of the accelerator market, their interconnect choices are likely to determine the industry standard.

Commercialization Challenges

Everyone knows light is good, so why don't all chips communicate with light yet? Commercialization faces formidable barriers.

Yield

Optical components demand nanometer-scale precision. Because characteristics change even with a slight difference in waveguide width, it is hard to produce consistent quality in mass production. Low yield directly means higher cost.

Thermal Stability of Microrings

We said earlier that microrings are extremely sensitive to temperature. Even a few degrees of change in chip temperature shifts the resonant wavelength, and the ring fails to function. Correcting for this requires heaters and feedback control, and that control circuitry consumes power in turn. Beware the paradox of "spending more power to stabilize the ring while trying to save data-movement energy."

Laser Integration

Silicon is an indirect-bandgap material that does not emit light efficiently, so the light source (laser) must be integrated separately. Bonding III-V materials such as indium phosphide (InP) onto silicon or bringing in an external laser is finicky, expensive, and hard to manage for reliability.

Packaging Cost

Fiber alignment, minimizing coupling loss, and integrating the optical engine all demand precise packaging processes. One reason CPO is attractive yet slow to spread is precisely this packaging cost and the difficulty of serviceability (repair and replacement). If a single optical component fails, it can affect the entire expensive package.

The table below summarizes the commercialization challenges and their current directions of response.

| Challenge | Cause | Direction of response |
| --- | --- | --- |
| Low yield | nanometer precision required | process maturity, design margin |
| Ring thermal instability | resonance shift with temperature | heater/feedback control, athermal design |
| Laser integration | silicon's emission limits | III-V bonding, external source |
| Packaging cost | precise fiber alignment | CPO standardization, auto-alignment process |

Outlook

As of 2026, optical interconnects are at an inflection point, moving "from the lab to the data center." In the short term, optical interconnects and CPO that convert chip-to-chip and node-to-node communication to light are likely to take hold before full-blown optical computation like photonic tensor cores.

To summarize the trend roughly:

- Phase 1 (in progress): adopt CPO in switches and accelerators, move from pluggable to co-packaged

- Phase 2: expand intra-package and inter-package optical communication with optical interposers and optical I/O chiplets

- Phase 3: disaggregated architectures that link memory and compute with light

- Long term: photonic tensor cores and photonic in-memory complement electrical compute for specific workloads

The core driving force does not change. AI models keep growing, and data-movement energy keeps dominating total cost. If that cost can be lowered with light, that path will eventually be adopted.

Implications for Developers

For developers who write CUDA kernels and work with GPUs, what does this shift mean?

First, the importance of "data locality" actually grows. Even if optical interconnects make chip-to-chip communication cheap, modulation and detection still cost something. Optimizing algorithms and memory-access patterns to reduce unnecessary data movement remains valid in the optical era too.

Second, designs that presume disaggregated architectures will increase. When memory and compute are loosely bound with light, "where to place which data and how to distribute it" determines performance. The habit of being conscious of communication patterns in distributed training and inference becomes more important.

Third, the abstraction layer will look familiar for the time being. Optical interconnects are mostly abstracted at the hardware and driver level, so application code does not change much. However, the eye for reading the "communication-to-compute ratio" in profiling tools becomes increasingly important.

Fourth, a sense for precision and noise will matter. If analog optical computation like photonic tensor cores becomes widespread, knowing how to design models that are robust to quantization and noise becomes a new competitive edge. If you are already familiar with low-precision (FP8, FP4) training, that intuition carries straight over.

Translated into a more concrete checklist:

- Measure data movement first. Understand where and how many bytes flow across the whole workload, not just per kernel.

- Quantify the communication-to-compute ratio with a profiler. Once communication time exceeds compute time (a ratio above 1), the interconnect sets the step time even with perfect overlap.

- Check the topology affinity of collective operations (all-reduce, all-gather, and so on). On an optical fabric, the map of which node talks cheaply to which can change.

- Prepare a data-placement strategy that assumes disaggregation. Consider tiering: keep frequently used weights close, rarely used ones far.

- Get comfortable with low-precision (FP8, FP4) and noise-robust training and inference techniques. When analog optical compute arrives, they become assets immediately.

- Always check for overlap opportunities. If communication and compute can be overlapped and hidden, the felt cost of interconnect latency drops sharply.
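The ratio and overlap items in the list above can be sketched as a tiny step-time model. The function names and the single `overlap_fraction` knob are illustrative assumptions; a real profiler reports per-kernel and per-collective timelines rather than two scalars.

```python
def comm_to_compute_ratio(compute_s: float, comm_s: float) -> float:
    """Communication time divided by compute time for one step."""
    return comm_s / compute_s


def step_time(compute_s: float, comm_s: float, overlap_fraction: float = 0.0) -> float:
    """Step time when a fraction of communication is hidden behind compute."""
    if not 0.0 <= overlap_fraction <= 1.0:
        raise ValueError("overlap_fraction must be in [0, 1]")
    # Communication can only hide behind compute that actually exists.
    hidden = min(comm_s * overlap_fraction, compute_s)
    return compute_s + comm_s - hidden


if __name__ == "__main__":
    print(step_time(10.0, 4.0))        # no overlap: costs add up
    print(step_time(10.0, 4.0, 1.0))   # fully hidden behind compute
    print(step_time(10.0, 20.0, 1.0))  # ratio 2: interconnect sets the pace
```

With full overlap the step time is the larger of the two terms, so once the ratio passes 1 no amount of scheduling hides the interconnect.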

Developer checklist

[ ] Is this an algorithm that reduces data movement?

[ ] Have you profiled the communication-to-compute ratio?

[ ] Have you checked the topology affinity of collectives?

[ ] Are you conscious of communication patterns in distributed deployment?

[ ] Have you designed data placement assuming disaggregation?

[ ] Have you exploited overlap of communication and compute?

[ ] Have you considered designs robust to low precision and noise?

Electrical SerDes vs Optical Links — Energy Per Bit, Worked Out

Instead of the abstract claim that "optics is more efficient," the picture sharpens considerably once you actually work out energy per bit (pJ/bit). Energy per bit is the total energy spent to transmit one bit, divided by the number of bits transmitted. This metric matters because a data center's power budget is, in effect, determined by "total bits times energy per bit."

First, consider an electrical SerDes (serializer/deserializer). A modern high-speed SerDes burns power in the transmit driver, the receive-side equalizer, and the clock recovery circuit (CDR). The longer or lossier the channel, the more equalization costs, so even the same SerDes spends more energy per bit the farther it has to drive a signal across the board.

An optical link has a different energy makeup. The laser source, modulator drive, photodetector plus transimpedance amplifier (TIA), and — if microrings are used — the ring thermal-stabilization heater are the main consumers. The key point is that once converted to light, there is almost no additional cost as a function of distance.

Figure: Energy per bit (pJ/bit) versus distance (near, mid, far, very far), rough comparison. Electrical (copper) links sit near 3 pJ/bit at short reach and climb to roughly 8 pJ/bit at mid reach and 12 pJ/bit for long copper SerDes with heavy equalization; the optical link stays near 1 pJ/bit across the whole range.

Two things to read off the figure. First, at short reach the gap between electrical and optical is not large. Second, as distance grows, electrical climbs steeply while optical stays nearly flat. So optical I/O first earns its economics in the medium-to-long reach that "crosses the board."
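One way to reason about where optics starts to pay off is a crossover model: electrical energy per bit grows with distance, optical energy per bit is a flat conversion cost. The linear form and every default value below are hypothetical assumptions for illustration, not measurements.

```python
def electrical_pj_per_bit(distance_cm: float, base_pj: float = 0.5,
                          slope_pj_per_cm: float = 0.1) -> float:
    """Assumed linear model: fixed driver cost plus equalization cost per cm."""
    return base_pj + slope_pj_per_cm * distance_cm


def optical_pj_per_bit(distance_cm: float, conversion_pj: float = 1.0) -> float:
    """Assumed flat model: conversion cost only, independent of distance."""
    return conversion_pj


def crossover_cm(base_pj: float = 0.5, slope_pj_per_cm: float = 0.1,
                 conversion_pj: float = 1.0) -> float:
    """Distance beyond which the optical link is cheaper per bit."""
    return max(0.0, (conversion_pj - base_pj) / slope_pj_per_cm)


if __name__ == "__main__":
    print("crossover at", crossover_cm(), "cm")
    for d in (1, 10, 50):
        print(d, "cm:", electrical_pj_per_bit(d), "vs", optical_pj_per_bit(d))
```

Lowering the optical conversion cost or steepening the electrical slope both pull the crossover toward shorter distances.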

The table below summarizes the approximate level of energy per bit by representative segment. Exact numbers vary by process, generation, and implementation, so it is best to take these as an order-of-magnitude sense.

| --- | --- | --- | --- |

There is a trap here. When computing an optical link's energy per bit, you must not omit the laser's "wall-plug efficiency." The efficiency with which a laser converts electrical energy into light is not 100 percent, and this loss takes a non-negligible share of the optical link's energy budget. In other words, the optimism of "the modulator alone is nearly free" is dangerous; an honest comparison has to look at the whole system, including the laser and thermal stabilization.
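The wall-plug point is easy to see in a whole-link budget. The sketch below divides total electrical power by bit rate; since 1 mW divided by 1 Gb/s is 1 pJ/bit, the units work out directly. All component powers in the example are hypothetical.

```python
def optical_link_pj_per_bit(laser_optical_mw: float, wall_plug_eff: float,
                            modulator_mw: float, receiver_mw: float,
                            heater_mw: float, gbps: float) -> float:
    """Whole-link energy per bit; 1 mW / (1 Gb/s) = 1 pJ/bit."""
    # Electrical power drawn by the laser to emit the required optical power.
    laser_electrical_mw = laser_optical_mw / wall_plug_eff
    total_mw = laser_electrical_mw + modulator_mw + receiver_mw + heater_mw
    return total_mw / gbps


if __name__ == "__main__":
    # Hypothetical budget for one 32 Gb/s wavelength channel.
    honest = optical_link_pj_per_bit(1.0, 0.10, 5.0, 5.0, 2.0, 32.0)
    naive = optical_link_pj_per_bit(1.0, 1.00, 5.0, 5.0, 2.0, 32.0)
    print(f"with 10% wall-plug efficiency: {honest:.4f} pJ/bit")
    print(f"pretending the laser is lossless: {naive:.4f} pJ/bit")
```

In this made-up budget the laser goes from the smallest contributor to the largest once its efficiency is included, which is exactly the omission the paragraph above warns about.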

Microrings and Thermal Tuning, In More Depth

Earlier I only noted that microrings are extremely sensitive to temperature, but this problem is a central obstacle to commercializing optical interconnects, so it is worth a deeper look.

A microring resonator resonates and traps only light for which a whole number of wavelengths fits into the ring's optical path length (effective refractive index times circumference). But silicon's refractive index rises with temperature. As temperature rises, the optical path lengthens as if the ring's circumference had grown, and the resonant wavelength shifts toward the longer end. Even a few degrees of change in chip temperature can push the resonance out of the communication channel, and a perfectly good ring starts dropping the signal.
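A rough estimate shows why "a few degrees" is enough. To first order, the resonance moves by the wavelength divided by the group index, times the change in effective index. The default values below are typical textbook figures that I am assuming for illustration (a silicon thermo-optic coefficient of about 1.8e-4 per kelvin, a group index of about 4.2, and a quality factor of 10,000); real devices vary, and the effective-index change is taken to equal the bulk silicon value.

```python
def resonance_shift_nm(delta_t_k: float, wavelength_nm: float = 1550.0,
                       group_index: float = 4.2,
                       dn_dt_per_k: float = 1.8e-4) -> float:
    """First-order thermal shift of the resonant wavelength, in nm."""
    return wavelength_nm / group_index * dn_dt_per_k * delta_t_k


def linewidth_nm(wavelength_nm: float = 1550.0, q_factor: float = 10_000) -> float:
    """Resonance full width at half maximum for a given quality factor."""
    return wavelength_nm / q_factor


def kelvin_per_linewidth(wavelength_nm: float = 1550.0, group_index: float = 4.2,
                         dn_dt_per_k: float = 1.8e-4,
                         q_factor: float = 10_000) -> float:
    """Temperature change that moves the resonance by one full linewidth."""
    per_kelvin = resonance_shift_nm(1.0, wavelength_nm, group_index, dn_dt_per_k)
    return linewidth_nm(wavelength_nm, q_factor) / per_kelvin


if __name__ == "__main__":
    print(f"shift per kelvin: {resonance_shift_nm(1.0):.3f} nm")
    print(f"linewidth:        {linewidth_nm():.3f} nm")
    print(f"kelvin to detune by one linewidth: {kelvin_per_linewidth():.1f}")
```

Under these assumptions a change of only two to three kelvin moves the resonance by its own width, and a sharper (higher-Q) ring is proportionally more fragile.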

Figure: Microring + ring heater + wavelength-lock control loop. An input waveguide carries many wavelengths (L1 to L4). A microring resonant at L2, with a thin-film heater on top, extracts L2 to the drop port. A photodetector tap feeds a controller; if the resonance drifts, the controller adjusts the heater current to re-lock the wavelength.

There are two broad ways to handle this.

The first is active tuning. You place a thin-film heater on top of the ring and, while continuously monitoring where the resonance sits with a photodetector, finely adjust the heater current so that the resonant wavelength stays exactly on the target channel. This is called "wavelength locking." The downside is clear: the heater consumes power and is itself another heat source.

The second is athermal design. You add a compensating material whose refractive index changes in the opposite direction with temperature (for example, a specific polymer overcladding), trying to cancel silicon's temperature change at the material level. This can reduce heater power, but the process is finicky and the compensation range is limited.

On top of this, when hundreds to thousands of rings gather on a chip, a vexing problem called "thermal crosstalk" arises. When you turn on a heater to warm ring A, that heat spreads to neighboring ring B and shakes B's resonance too. Then B's heater reacts, and that heat in turn affects A — a mutual interference. This is why control algorithms for large ring arrays evolve beyond simple per-ring feedback into cooperative control that accounts for mutual interference.
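The difference between per-ring feedback and cooperative control can be illustrated with a two-ring toy model. Everything here is an assumption made for the sketch: the linear thermal model, the strong coupling value, the targets, and the unit gain. It is not a description of any real controller.

```python
def simulate_lock(steps: int, coupling: float, cooperative: bool,
                  targets=(5.0, 4.5), gain: float = 1.0):
    """Two heated rings with thermal crosstalk; returns max |error| per step.

    Toy model (assumed): temperature rise of ring i = p[i] + coupling * p[j],
    in kelvin per unit heater power. Heaters can only heat, so p >= 0.
    """
    p = [0.0, 0.0]
    history = []
    for _ in range(steps):
        temps = [p[0] + coupling * p[1], coupling * p[0] + p[1]]
        err = [targets[0] - temps[0], targets[1] - temps[1]]
        history.append(max(abs(e) for e in err))
        if cooperative:
            # Undo the known coupling matrix [[1, c], [c, 1]] before updating.
            det = 1.0 - coupling ** 2
            step = [(err[0] - coupling * err[1]) / det,
                    (err[1] - coupling * err[0]) / det]
        else:
            # Independent per-ring integral feedback ignores the neighbor.
            step = err
        p = [max(0.0, pi + gain * si) for pi, si in zip(p, step)]
    return history


if __name__ == "__main__":
    independent = simulate_lock(21, coupling=0.8, cooperative=False)
    cooperative = simulate_lock(21, coupling=0.8, cooperative=True)
    for t in (0, 1, 2, 5, 20):
        print(t, round(independent[t], 4), round(cooperative[t], 4))
```

In this toy, the independent loops overshoot because each heater also warms its neighbor, and the error rings down slowly, while the controller that accounts for the coupling locks both rings after a single update. Real arrays have many rings, unknown and drifting coupling, and measurement noise, which is where the calibration burden comes from.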

| Thermal stabilization method | Advantage | Disadvantage |
| --- | --- | --- |
| Active heater plus wavelength lock | precise, wide correction range | extra power, heat, control complexity |
| Athermal overcladding | reduces heater power | process difficulty, limited range |
| Cooperative control (array) | mitigates thermal crosstalk | algorithm and calibration burden |

In the end, the microring's strength — small, fast, high integration density — and its weakness — temperature fragility — are two sides of the same coin. That is why some designs choose temperature-insensitive Mach-Zehnder modulators over microrings from the start. The choice between the two is an engineering trade-off balancing area, power, and thermal budget.

Ayar Labs and Celestial AI — Comparing Two Approaches

Companies productizing optical interconnects head toward the same goal, but place different bets on which point in the system to convert to light.

Ayar Labs focuses on "optical I/O chiplets." You attach a dedicated optical I/O chiplet next to an existing SoC or accelerator, supply a multi-wavelength laser ("comb laser") from outside, and push data out as light at the package edge. The core message is "extend chip-edge bandwidth with light without major changes to existing chip designs." A strength is that a standardized chiplet interface lets it attach to a variety of SoCs.

Celestial AI goes a step further and aims at "optical memory disaggregation." With a fabric called Photonic Fabric, it loosely links compute chips and memory pools with light, so memory need not sit right next to the compute unit but can be accessed over light in a distant, large-capacity memory pool. The idea is to extend memory with light rather than being trapped by HBM capacity for large models.

| Item | Ayar Labs | Celestial AI |
| --- | --- | --- |
| Core product | optical I/O chiplet | Photonic Fabric |
| Primary goal | extend chip-edge bandwidth | memory disaggregation |
| Point converted to light | chip-to-chip I/O | compute-to-memory path |
| Laser supply | external multi-wavelength source | fabric-integrated source |
| Integration method | chiplet attached beside existing SoC | compute-memory fabric link |
| Appeal | minimal design change | bypass the memory-capacity wall |

The two approaches are less competitors than complements attacking different layers of the stack. In the short term, one can picture Ayar Labs-style "optical chip I/O" taking hold first, with Celestial AI-style "memory disaggregation" maturing on top of it. Add that the Lightmatter Passage seen earlier is yet another bet — converting the entire interposer layer to light — and you have three different heights of attack on the same memory-wall problem.

Optical vs Electrical Interconnects — When to Use Which

Optics does not beat electrical signaling everywhere. Where to use which interconnect is a function of distance, bandwidth demand, power budget, and cost and reliability. A practical set of decision criteria:

- Within the die (a few mm): electrical wins overwhelmingly. The conversion cost of going optical exceeds the distance savings.

- Within the package (a few cm): electrical is still the default, but once bandwidth density hits its limit, an optical interposer becomes a candidate.

- Crossing the board (tens of cm): the point where optical economics begin in earnest. This is the core territory CPO targets.

- Within and across racks (meters): optics clearly wins. Copper's distance-driven loss and power climb steeply.

- Data-center scale-out (tens of meters or more): optics is effectively the only realistic choice.
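The criteria above amount to a lookup along the distance axis. Here is a minimal sketch; the exact numeric cut-offs (1 cm, 10 cm, 1 m, 10 m) are my own reading of "a few mm", "a few cm", "tens of cm", and "meters", so treat them as hypothetical thresholds rather than hard limits.

```python
def suggest_interconnect(distance_m: float) -> str:
    """Rule-of-thumb interconnect choice by link distance (assumed cut-offs)."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m < 0.01:
        return "electrical (within the die)"
    if distance_m < 0.10:
        return "electrical by default; optical interposer if bandwidth density saturates"
    if distance_m < 1.0:
        return "contested: CPO territory"
    if distance_m < 10.0:
        return "optical (within and across racks)"
    return "optical (data-center scale-out)"


if __name__ == "__main__":
    for d in (0.003, 0.03, 0.3, 3.0, 30.0):
        print(f"{d:>6} m -> {suggest_interconnect(d)}")
```

A real decision would also weigh bandwidth demand, power budget, cost, and serviceability, which a single distance threshold cannot capture.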

Interconnect choice by distance (concept)

        mm         cm      tens of cm       m       tens of m
    |---------|----------|-------------|----------|---------->
    [ elec.  ][  elec.  ][elec.<->opt. ][  opt.  ][   opt.   ]
                          the line CPO contests

In short, "optical vs electrical" is not an either/or but a boundary-line skirmish along the distance axis. And that boundary line creeps toward the short end every year. Where once optics was used only in tens-of-meters cables, it has now come down to tens of centimeters on the board, and the next step is inside the package.

Conclusion

The memory wall is not a problem of compute speed but a problem of data movement, and the heart of data movement is, ultimately, energy and distance. Copper is excellent, but it cannot match the bandwidth density and distance-insensitive energy characteristics that light possesses.

Photonics in 2026 is not yet a finished form. The real-world walls of yield, thermal stability, laser integration, and packaging cost clearly exist. But Lightmatter Passage, Ayar Labs' optical I/O, Celestial AI's Photonic Fabric, DARPA's wafer-scale connection research, and academia's photonic tensor core trend all point in the same direction. AI's next leap will come not from faster transistors but from cheaper and faster data movement, and a leading candidate for that path is light.

The era of crossing the memory wall with light is beginning in earnest.

References

- [NVIDIA](https://www.nvidia.com/) — next-generation accelerator roadmap including Blackwell and Vera Rubin

- [Lightmatter](https://lightmatter.co/) — Passage 3D photonic interposer

- [Ayar Labs](https://ayarlabs.com/) — optical I/O chiplet

- [Celestial AI](https://www.celestial.ai/) — Photonic Fabric

- [DARPA](https://www.darpa.mil/) — photonics and wafer-scale connection research

- [Cerebras](https://www.cerebras.ai/) — WSE-3 wafer-scale engine

- [arXiv](https://arxiv.org/) — latest papers on photonic tensor cores and photonic in-memory

- [Nature Photonics](https://www.nature.com/) — academic trends in photonic computing

- [IEEE Spectrum](https://spectrum.ieee.org/) — industry coverage of silicon photonics and CPO

- [SemiAnalysis](https://www.semianalysis.com/) — AI hardware and interconnect market analysis