Realtime Web in 2026 — WebSocket vs SSE vs WebTransport vs WebRTC (and Why LLM Streaming Picked SSE)
Author: Youngju Kim (@fjvbn20031)
Prologue — The Trap in the Word "Realtime"
You get a request: "add a realtime feature." The first word in your head is WebSocket. You spin up a WS server, sticky-session it behind your router, bolt on token auth. Six months later, another team in the same company is streaming LLM tokens — and, weirdly, they're using SSE. Someone asks: "Why not WebSocket?"
The answer isn't simple. "Realtime" is a spectrum, not a single problem.
- A chat line to 10,000 people worldwide within 50 ms — bidirectional, low latency.
- LLM tokens to one user over a minute — one-way stream.
- Game coordinates 60 times a second, where a missed packet is fine because the next one catches up — many-to-many, unreliable datagrams.
- Video and audio for a call, plus a data side-channel — P2P media.
- A notification to a phone every five minutes — that's push, not realtime.
Different problems want different protocols. This post compares the five candidates of 2026 — WebSocket, Server-Sent Events (SSE), WebTransport, WebRTC DataChannel, and Long Polling — plus a side note on why HTTP/2 Server Push died, and the real reason almost all LLM streaming converged on SSE.
TL;DR. One-way streaming: SSE. Bidirectional low latency: WebSocket. Game-style unreliable multi-stream: WebTransport. P2P or media: WebRTC. And — honestly — plain polling is often enough.
1. The Landscape — Five Protocols in 2026
| Protocol | Shape |
|---|---|
| WebSocket | Bidirectional, TCP, framed |
| SSE (EventSource) | Server-to-client one-way, over HTTP |
| WebTransport | Bidirectional, QUIC/HTTP3, multiplexed |
| WebRTC DataChannel | P2P, UDP/SCTP, optionally unreliable |
| Long Polling | HTTP polling, response held open |
One-sentence summaries:
- WebSocket — Bidirectional frames over a single TCP connection. Standardized 2011, most ubiquitous, also the most well-known footguns.
- Server-Sent Events — A long-lived HTTP response that streams text. text/event-stream MIME type, automatic reconnect built in. One-way, server to client.
- WebTransport — Bidirectional over QUIC. Reliable streams plus unreliable datagrams over one connection. No head-of-line blocking. Available in Chrome stable.
- WebRTC DataChannel — A direct connection between two peers. P2P, optionally unreliable. The hardest of the bunch to operate.
- Long Polling — Old friend. Holds the response open and replies when an event fires. Works everywhere.
And the dead one: HTTP/2 Server Push. Chrome disabled it in 2022 and the spec is removing it. Cache behavior was subtle, real-world wins were small.
2. WebSocket — Still the Workhorse, Still the Footguns
WebSocket was standardized as RFC 6455 in 2011. The core idea is simple: start with an HTTP/1.1 Upgrade handshake, then flip the TCP connection to a custom framing protocol (text, binary, ping, close).
Mental Model
Client Server
│ GET /ws HTTP/1.1 │
│ Upgrade: websocket │
│ ──────────────────────────────▶ │
│ │
│ ◀────────────────────────────── │
│ HTTP/1.1 101 Switching │
│ │
│ ◀══════ frame ══════▶ │ (bidirectional)
│ ◀══════ frame ══════▶ │
│ │
│ Close 0x88 │
Once the Upgrade handshake completes, that TCP connection is no longer HTTP. That's why routers, load balancers, proxies, and CDNs all behave a bit strangely with WS.
Strengths
- Truly bidirectional. Either side can send at any time.
- Low overhead. One handshake, then 2 to 14 bytes of frame overhead per message.
- Binary safe. Drop in game protocols or binary RPC unchanged.
- Supported in essentially every browser, language, and SDK.
Footguns
- It is not HTTP/2. WS itself is an HTTP/1.1 upgrade. RFC 8441 enables WebSocket over HTTP/2, but server support is uneven.
- Sticky sessions. A WS connection is pinned to one backend — you need a separate Pub/Sub fan-out layer.
- Head-of-line blocking. Single TCP — one large message blocks the rest.
- No automatic reconnect. You write that yourself — exponential backoff, retransmit queue, sequence numbers.
- Proxy-incompatible. Some corporate proxies kill the WS handshake.
- Bypasses HTTP caching. Not a GET, so no CDN cache benefit.
A Tiny Chat Server (Node-ish + ws)
// server.ts — minimal WS echo + broadcast
import { WebSocketServer, WebSocket } from 'ws'

const wss = new WebSocketServer({ port: 8080 })
const clients = new Set<WebSocket>()

wss.on('connection', (ws) => {
  clients.add(ws)
  ws.send(JSON.stringify({ type: 'welcome', count: clients.size }))
  ws.on('message', (raw) => {
    const msg = JSON.parse(raw.toString())
    for (const c of clients) {
      if (c !== ws && c.readyState === WebSocket.OPEN) {
        c.send(JSON.stringify({ ...msg, ts: Date.now() }))
      }
    }
  })
  ws.on('close', () => clients.delete(ws))
})
Client side:
const ws = new WebSocket('wss://example.com/ws')
ws.onopen = () => ws.send(JSON.stringify({ type: 'hello', name: 'Jun' }))
ws.onmessage = (e) => console.log('recv:', JSON.parse(e.data))
ws.onclose = () => {
  // You write the reconnect logic — backoff, jitter, token refresh, ...
}
That's not the whole thing. Production grows into hundreds of lines: auth, heartbeats, reconnect, message queue, graceful drain during rollouts.
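The reconnect logic that comment gestures at is usually exponential backoff with full jitter. A minimal sketch — `backoffDelay` and `connect` are hypothetical names, and the token refresh and send queue are left out:

```typescript
// Exponential backoff with full jitter: delay before the nth reconnect,
// doubling from baseMs and capped at capMs.
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt)
  return Math.random() * ceiling // full jitter avoids synchronized stampedes
}

// Hypothetical reconnect wrapper around the browser WebSocket API.
function connect(url: string, attempt = 0): void {
  const ws = new WebSocket(url)
  ws.onopen = () => { attempt = 0 } // healthy again, reset the counter
  ws.onclose = () => {
    setTimeout(() => connect(url, attempt + 1), backoffDelay(attempt))
  }
}
```

The full jitter matters as much as the doubling: if every client waits exactly the same delay, a server restart produces a synchronized reconnect wave.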
3. Server-Sent Events — The Surprise Winner of LLM Streaming
SSE entered the HTML5 spec in 2009. It looked so simple that for years people dismissed it — until the 2024-2026 LLM streaming boom made it the de facto standard.
Mental Model
Client Server
│ GET /stream HTTP/1.1 │
│ Accept: text/event-stream │
│ ──────────────────────────────────▶ │
│ │
│ ◀────────────────────────────────── │
│ HTTP/1.1 200 OK │
│ Content-Type: text/event-stream │
│ │
│ ◀── data: {"token":"hello"}\n\n │
│ ◀── data: {"token":" world"}\n\n │
│ ◀── data: [DONE]\n\n │
│ │
│ (the response never closes) │
The connection is just an HTTP response. The server keeps it open and emits one data: ...\n\n line per event. The browser's EventSource object parses and reconnects for you.
Strengths
- Simple. It's HTTP. Same headers, methods, caching rules.
- Automatic reconnect. EventSource reconnects on drop and sends Last-Event-ID to resume.
- Firewall friendly. Plain HTTP, traverses every corporate proxy.
- HTTP/2 and HTTP/3 for free. Multiplexing comes along for the ride.
- One-way = simpler servers. Maps naturally to Pub/Sub fan-out.
- CDN- and gzip-compatible. It's a response, not an upgrade.
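The resume mechanism is just a field in the wire format: each event can carry an id: line, and on reconnect the browser sends the last id it saw back in the Last-Event-ID request header. A sketch of the framing — `sseEvent` is a hypothetical helper name:

```typescript
// One SSE event with an explicit id. On reconnect, EventSource sends the last
// id it saw in the Last-Event-ID request header so the server can resume.
function sseEvent(id: number, data: unknown): string {
  return `id: ${id}\ndata: ${JSON.stringify(data)}\n\n`
}

// sseEvent(7, { token: 'hi' }) === 'id: 7\ndata: {"token":"hi"}\n\n'
```

On the browser side, `new EventSource('/stream')` plus an onmessage handler is the whole client; `e.lastEventId` exposes the id of each event.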
Weaknesses
- Server-to-client only. Client-to-server needs a separate POST. (That's exactly the LLM API pattern.)
- Text only. UTF-8 text — for binary you base64-encode, which is wasteful.
- Connection cap. HTTP/1.1: 6 per origin. HTTP/2: effectively unlimited.
- Idle timeouts. Some proxies (default nginx) terminate idle responses — you need periodic comments as keep-alives.
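The keep-alive trick for those idle timeouts is a comment line — anything starting with a colon is ignored by EventSource. A minimal Node sketch, assuming a plain node:http server (endpoint and intervals are illustrative):

```typescript
import { createServer } from 'node:http'

// Minimal Node SSE endpoint with a comment-line heartbeat every 15 s so idle
// proxies (nginx defaults, some corporate gear) don't kill the response.
const server = createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache, no-transform',
  })
  res.write(': connected\n\n') // SSE comment line, invisible to EventSource
  const heartbeat = setInterval(() => res.write(': keep-alive\n\n'), 15_000)
  req.on('close', () => clearInterval(heartbeat))
})
// server.listen(3000)
```

The 15-second interval is a guess at typical proxy budgets; tune it below the shortest idle timeout in your path (nginx defaults to 60 s).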
Why LLM Streaming Picked SSE
OpenAI, Anthropic, Google Gemini, Cohere — nearly every LLM token streaming API ships SSE. Sources: OpenAI Streaming docs https://platform.openai.com/docs/api-reference/streaming, Anthropic Messages Streaming https://docs.anthropic.com/en/api/messages-streaming. Seven reasons:
- The traffic is fundamentally one-way. The prompt is one POST; the answer is a stream. WS's bidirectionality isn't needed.
- Retries fit HTTP semantics. Normal 429/5xx policies just work.
- Auth, rate limiting, and billing slot into standard HTTP middleware. With WS you build all that yourself.
- Firewall and proxy compatibility. WS can be blocked in enterprise networks.
- CDN friendly. Cloudflare and Vercel Edge Functions treat SSE as a first-class shape.
- Easy to test. curl -N and you're done.
- Free multiplexing on HTTP/2 and HTTP/3. Concurrent streams share one connection.
If you use WebSocket, every one of those seven becomes something you have to build.
30-Line Token Streamer (Bun/Node)
// app/api/chat/route.ts — 30-line SSE token streamer
export const runtime = 'edge'

export async function POST(req: Request) {
  const { prompt } = await req.json()
  const stream = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder()
      const send = (data: unknown) => {
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(data)}\n\n`))
      }
      // Fake LLM: emit a token every 50 ms
      const tokens = ['Hello', ',', ' SSE', ' demo', ' from', ' an', ' edge', ' function', '.']
      for (const t of tokens) {
        send({ token: t })
        await new Promise((r) => setTimeout(r, 50))
      }
      send('[DONE]')
      controller.close()
    },
  })
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      Connection: 'keep-alive',
    },
  })
}
The client — since EventSource can't POST — uses fetch streaming or an SSE parser like eventsource-parser.
const res = await fetch('/api/chat', {
  method: 'POST',
  body: JSON.stringify({ prompt: 'hi' }),
  headers: { 'Content-Type': 'application/json' },
})
const reader = res.body!.getReader()
const decoder = new TextDecoder()
let buf = ''
while (true) {
  const { done, value } = await reader.read()
  if (done) break
  buf += decoder.decode(value, { stream: true })
  const events = buf.split('\n\n')
  buf = events.pop() ?? '' // keep any partial event for the next chunk
  for (const line of events) {
    if (line.startsWith('data: ')) console.log(line.slice(6))
  }
}
That's the entire shape of ChatGPT- and Claude-style token streaming. Forty lines, end to end.
4. WebTransport — The HTTP/3 Era's New Workhorse
WebTransport is specified by the IETF WebTrans working group as WebTransport over HTTP/3. It puts bidirectional streams plus datagrams over QUIC on one connection. Source: https://developer.mozilla.org/en-US/docs/Web/API/WebTransport_API.
Mental Model
Browser ──────── QUIC (over UDP) ──────── Server
│
├── Bidirectional reliable stream #1
├── Bidirectional reliable stream #2
├── Unidirectional reliable stream #3
└── Unreliable datagrams (UDP-like)
Many streams on one QUIC connection. Packet loss on one stream doesn't block the others — no head-of-line blocking. WebSocket's worst weakness disappears.
Browser Support, 2026
- Chrome / Edge — Stable since Chrome 97 (2022). https://chromestatus.com/feature/4854541929873408
- Firefox — Stable since Firefox 114 (2023). Default-on rollout still in progress.
- Safari — Partial as of May 2026: experimental in Safari Technology Preview. Not yet enabled by default on stable.
Server implementations: aioquic (Python), quinn (Rust), msquic (Microsoft), Cloudflare Workers WebTransport API. Source: https://github.com/aiortc/aioquic.
Strengths
- No head-of-line blocking. Decisive for games and multi-stream content.
- Reliable and unreliable in one connection. Datagrams are ideal for game coordinates and side tracks.
- Connection migration. A QUIC property — IP changes (Wi-Fi to cellular) without dropping the session.
- TLS 1.3 mandatory. Security baked in.
Weaknesses
- Safari incomplete. If you target every user, you need a fallback.
- Ops tooling immature. Debugging, logging, and metrics aren't as fleshed out as WS.
- UDP blocking. Some enterprise and carrier networks drop UDP; fallback required.
- Certificate requirements. Real domain cert, or pin a self-signed cert via serverCertificateHashes.
Datagram Demo
// Send game coordinates at 30 fps over unreliable datagrams
// (player and other are stand-ins for local and remote game state)
const player = { x: 0, y: 0 }
const other = { x: 0, y: 0 }

const transport = new WebTransport('https://game.example.com/wt')
await transport.ready
const writer = transport.datagrams.writable.getWriter()
const reader = transport.datagrams.readable.getReader()

setInterval(async () => {
  const buf = new ArrayBuffer(8)
  const view = new DataView(buf)
  view.setFloat32(0, player.x)
  view.setFloat32(4, player.y)
  await writer.write(new Uint8Array(buf))
}, 33)

;(async () => {
  while (true) {
    const { value, done } = await reader.read()
    if (done) break
    const view = new DataView(value.buffer, value.byteOffset, value.byteLength)
    other.x = view.getFloat32(0)
    other.y = view.getFloat32(4)
  }
})()
Try this with WebSocket and one slow coordinate packet blocks the rest. WebTransport just drops the lost one and moves on.
5. WebRTC DataChannel — King of P2P, Hell to Operate
WebRTC was built for video calls. RTCDataChannel is a sibling API that lets you push arbitrary data on the same P2P connection as the media.
Mental Model
Peer A Peer B
│ (signaling server) │
│ ───── SDP offer ────────▶ │
│ ◀──── SDP answer ───────── │
│ ───── ICE candidates ───▶ │
│ ◀──── ICE candidates ───── │
│ │
│ ╔════ P2P SCTP/UDP ════╗ │
│ ║ DataChannel ║ │
│ ║ Video/Audio ║ │
│ ╚══════════════════════╝ │
The key thing: signaling is separate. WebRTC needs an out-of-band channel to negotiate IPs, ports, and media capabilities — a signaling server, usually built on top of WebSocket or SSE.
Strengths
- Real P2P. Latency is lowest, server bandwidth is zero.
- Unreliable mode. ordered: false, maxRetransmits: 0 for UDP-like behavior.
- NAT traversal. STUN and TURN infrastructure connects peers behind NAT.
- Bundled with media. Add a data channel to an existing video call cheaply.
Weaknesses
- Hardest ops surface. Signaling server + STUN + TURN — three pieces of infra.
- TURN is not P2P. When TURN takes over, you're really running a relay, and bandwidth bills explode.
- Slow to connect. Hundreds of ms to seconds of ICE negotiation.
- Subtle browser differences. Chrome, Firefox, and Safari diverge at the edges.
When to Use It
- Video and voice with side data (chat, file transfer, game inputs).
- True P2P games — saves server cost, minimizes latency.
- File sharing (WebTorrent-style).
- Metaverse and VR — frequent positional and expression data.
Nobody uses WebRTC for LLM streaming. The knife is sharp but narrow, and operating it is expensive.
6. Long Polling — Old Friend, Refuses to Die
Long Polling is the evolved form of HTTP/1.1-era polling. The client sends a GET; instead of responding immediately, the server holds the response open until an event fires. Then it responds, and the client immediately reissues the GET.
Mental Model
Client Server
│ GET /events │
│ ──────────────────────────▶ │
│ │ (wait for event, held open)
│ │
│ │ event fires
│ ◀────── 200 OK [event] ───── │
│ GET /events?since=... │
│ ──────────────────────────▶ │
│ │ (waits again)
Simple, and there's nowhere it doesn't run.
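The client loop can be sketched in a few lines. Here `fetchEvents` is injectable so the transport can be stubbed; the `/events?since=` endpoint shape and the event fields are illustrative, not a standard:

```typescript
// One long-poll step: fetch events after `since`, handle them, return the
// advanced cursor for the next GET.
type PollEvent = { id: number; data: string }

async function pollOnce(
  since: number,
  fetchEvents: (since: number) => Promise<PollEvent[]>,
  handle: (e: PollEvent) => void,
): Promise<number> {
  const events = await fetchEvents(since)
  for (const e of events) {
    handle(e)
    since = e.id // advance the cursor past each delivered event
  }
  return since
}

// Production loop (sketch):
//   let since = 0
//   while (true) since = await pollOnce(since,
//     (s) => fetch(`/events?since=${s}`).then((r) => r.json()), render)
```

The cursor is what makes long polling resumable: the server holds the GET only until it has events newer than `since`.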
Strengths
- Works everywhere. Even HTTP/1.0. Every proxy is fine with it.
- Standard HTTP for auth, CORS, caching, and CDN.
- Stateless. One request, one response — the server can stay stateless.
- Excellent fallback. When WS, SSE, and WT all fail, polling keeps the lights on.
Weaknesses
- Header overhead. Every event pays for HTTP headers.
- Latency jitter. There's a gap between response and the next GET.
- Timeout tuning. Too long and proxies kill it; too short and you're really polling.
Where It Lives in 2026
Mobile SDK fallbacks. Internal LAN chat. Notification systems. Anywhere SSE can't go (some embedded browsers). Don't dismiss it because it isn't trendy.
7. The Dead — A Short Life for HTTP/2 Server Push
HTTP/2 Server Push arrived in 2015 with a clean idea: the server pre-pushes resources before the client requests them. Pages would feel faster.
Reality disappointed. Chrome telemetry (2020-2021) showed most pushes were wasted because resources were already cached. Cache behavior was subtle, implementations were inconsistent. Chrome disabled it in 2022. Source: https://developer.chrome.com/blog/removing-push. The spec is removing it.
Replacements. For resource hinting: 103 Early Hints plus link rel=preload. For messages: SSE.
8. Decision Matrix
| Use case | First pick | Alternative | Notes |
|---|---|---|---|
| LLM token streaming | SSE | WS | One-way server-to-client, retry- and CDN-friendly |
| Chat (1:1, 1:many) | WebSocket | SSE + POST | Bidirectional, low latency |
| Realtime multiplayer game | WebTransport | WebRTC | Unreliable datagrams, no HoL blocking |
| Dashboards and live metrics | SSE | WS | One-way, auto reconnect |
| Collaborative editor (CRDT) | WebSocket | WT | Bidirectional + message ordering |
| Video call | WebRTC | — | Media plus data on one channel |
| P2P file transfer | WebRTC | WS relay | Save server bandwidth |
| Notifications (low frequency) | Long Polling | SSE | Minimal server cost and complexity |
| Stock and crypto tickers (broadcast) | SSE | WS | One-way, simple fan-out |
| IoT commands (bidirectional) | WebSocket | WT | Small payloads, both directions |
| Last-resort fallback | Long Polling | — | Works in any network |
The table is a starting point, not a verdict. Every decision has tradeoffs, and your team's ops capability, existing infra, and users' network shape the answer.
9. An Honest Decision Framework
Ask these five questions in order.
Q1. Is the traffic truly bidirectional, or actually one-way?
LLM answers are one-way. The prompt is one POST; the answer is a token stream. Chat is mostly one-way too — sending is a single POST, receiving is the stream.
Truly bidirectional means messages flow both ways concurrently — CRDT updates in a collaborative editor, input and state in a multiplayer game, IoT command and response.
If it's one-way, look at SSE first. Cost, ops, and complexity are dramatically lower.
Q2. Reliable, or is unreliable acceptable?
Chat messages, payment events, state updates — reliability is mandatory. TCP, or QUIC reliable streams.
Game coordinates, VOIP, real-time sensors — unreliable is fine. A dropped packet doesn't matter; the next one catches up. UDP, or QUIC datagrams.
If you need unreliable, you need WebTransport or WebRTC. WS has no unreliable mode.
Q3. Do you really need P2P?
Only if you really mean it — no server in the data path. Video calls, file transfer, true P2P games. Everything else is dramatically simpler in client-server.
If you decide on P2P, accept the STUN and TURN infrastructure and ops. And if TURN takes over the relay, the P2P benefit is gone.
Q4. What infra do you already run?
If your company already runs Kafka, Pub/Sub, and nginx SSE well — don't bolt WS onto the new feature. Ops has to know both stacks.
If WS is already well-operated — adding SSE for a new product is fine if the problems are different.
Q5. What's the users' network?
Enterprise and carrier networks block all sorts of things — UDP, WS, idle responses. If your users live in varied networks, fallback is mandatory.
Try WT ──fail──▶ Try WS ──fail──▶ Try SSE ──fail──▶ Long Polling
Most production realtime stacks have some version of this fallback chain.
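The chain itself is a small piece of code. A sketch, assuming each transport is wrapped in a factory that rejects on failure or timeout — `connectWithFallback` and the `Transport` shape are hypothetical:

```typescript
// Try each transport factory in order; first success wins, last error
// surfaces if everything fails.
type Transport = { close(): void }
type Factory = () => Promise<Transport>

async function connectWithFallback(factories: Factory[]): Promise<Transport> {
  let lastErr: unknown = new Error('no transport factories given')
  for (const make of factories) {
    try {
      return await make() // connected — stop falling back
    } catch (err) {
      lastErr = err // remember why, then try the next transport
    }
  }
  throw lastErr
}
```

A real chain would pass factories that open WebTransport, WebSocket, SSE, and long polling in that order, each with its own connect timeout.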
10. Who Uses What in the Wild
- ChatGPT, Claude, Gemini token streaming — SSE. Sources: OpenAI https://platform.openai.com/docs/api-reference/streaming, Anthropic https://docs.anthropic.com/en/api/messages-streaming.
- Slack messaging — WebSocket historically. Around 2025 mobile shifted to HTTP polling plus push notifications for battery.
- Discord — WebSocket for messages, WebRTC for voice and video. Source: https://discord.com/developers/docs/topics/gateway.
- Figma multi-user editing — WebSocket plus a custom CRDT.
- Google Docs — Long Polling originally, WebSocket later.
- Cloudflare Stream / Workers — WebTransport support, picking up gaming and live video loads.
- Zoom — WebRTC media plus custom signaling.
- GitHub Live Updates — SSE.
- Twitch IRC (chat) — TCP plus a custom protocol; the web client tunnels IRC over WebSocket.
- Stock trading platforms — Mostly WebSocket (broadcast and order entry) or a custom binary TCP for desktop.
The pattern is clear: one-way streaming is SSE, bidirectional chat is WS, media is WebRTC, and gaming is gradually moving to WT.
11. Operational Notes — Where It Actually Hurts
Sticky sessions
WS and WT pin a client to one backend. Configure sticky sessions on your ALB, Envoy, or HAProxy. SSE is per-request — a response stays on one instance, but a reconnect can go anywhere as long as the server is stateless.
Graceful shutdown
If you drop 10,000 WS connections at deploy time, all 10,000 clients reconnect at once — thundering herd. The pattern:
- Reject new connections (fail health check).
- Send a "going down" message to existing connections.
- Let clients reconnect with jitter.
- Force-close after 30 to 60 seconds.
SSE has the same problem. Long Polling is luckier because each request is short.
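The drain steps can be sketched over a set of WS-like connections. `Conn` is a hypothetical minimal interface, and the grace window mirrors the 30 to 60 second budget above:

```typescript
// Graceful drain: announce shutdown, wait out a grace window, force-close
// whatever is left.
type Conn = { send(msg: string): void; close(): void }

async function drain(conns: Set<Conn>, graceMs = 30_000): Promise<void> {
  // 1) announce shutdown so clients can reconnect elsewhere with jitter
  for (const c of conns) c.send(JSON.stringify({ type: 'going-down' }))
  // 2) wait out the grace window while clients migrate
  await new Promise((r) => setTimeout(r, graceMs))
  // 3) force-close stragglers and drop the references
  for (const c of conns) c.close()
  conns.clear()
}
```

Failing the health check before calling this (step 1 in the list above) is what keeps new connections from landing on the dying instance mid-drain.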
Auth
The WS Upgrade handshake is HTTP, so cookies, headers, and tokens all work. Watch out for libraries that authenticate after the handshake — that's a window of unauthenticated connections. Auth at the handshake.
WT uses certs plus headers. SSE is straight HTTP — the simplest of all.
WebRTC authenticates at the signaling server. The P2P connection itself is encrypted with DTLS-SRTP.
Cost
- WS and SSE — server compute scales with connection count; bandwidth scales with message volume.
- WT — similar, with a modest QUIC CPU overhead in 2026.
- WebRTC — server cost is signaling, STUN, and TURN. TURN usage is the bill bomb.
- Long Polling — CPU and memory per held request plus HTTP header overhead per event.
At large scale (100k+ connections), memory per connection dominates. Node WS is roughly 30 to 50 KB per connection, Go and Rust are 5 to 15 KB. Language choice drives ops cost.
Monitoring
- WS — connection count, message throughput, average connection lifetime, reconnect rate.
- SSE — active responses, response lifetime, encoding errors.
- WT — active sessions, per-stream throughput, datagram loss rate.
- WebRTC — ICE success rate, TURN fallback rate, media RTT.
- Long Polling — RPS, hold-time distribution.
OpenTelemetry coverage varies — partial for WS, complete for SSE, in progress for WT as of 2026.
12. A Minimal WT Datagram Server (Node, without aiortc)
The browser demo is above; the server side most commonly uses Rust wtransport or Python aioquic. On Node in May 2026, @fails-components/webtransport is the most mature package.
// server.ts — minimal WebTransport datagram echo
import { Http3Server } from '@fails-components/webtransport'
import { readFileSync } from 'node:fs'

const server = new Http3Server({
  port: 4433,
  host: '0.0.0.0',
  secret: 'demo',
  cert: readFileSync('./cert.pem'),
  privKey: readFileSync('./key.pem'),
})
server.startServer()

const stream = await server.sessionStream('/wt')
const reader = stream.getReader()
while (true) {
  const { value: session, done } = await reader.read()
  if (done) break
  await session.ready
  const dgReader = session.datagrams.readable.getReader()
  const dgWriter = session.datagrams.writable.getWriter()
  ;(async () => {
    while (true) {
      const { value, done } = await dgReader.read()
      if (done) break
      await dgWriter.write(value) // echo the datagram back
    }
  })()
}
You need a demo cert (or pin a self-signed cert via serverCertificateHashes in the browser), and Chrome flags enabled for local testing.
13. Ten Anti-Patterns
- Reflexively reaching for WebSocket for any realtime requirement. If it's one-way, SSE is simpler and cheaper.
- Authenticating after the handshake. A window of unauthenticated connections.
- Skipping automatic reconnect. Disconnects on mobile are constant.
- No heartbeat. Idle proxies drop the connection at 30 seconds and you don't know.
- No message sequence ID. After reconnect, you can't dedupe or replay.
- Redeploying without graceful shutdown. Thundering herd, self-inflicted DDoS.
- Giant single messages. Trigger your own head-of-line blocking.
- Using WebRTC for client-server. WS or WT is dramatically simpler.
- Reviving HTTP/2 Server Push. It is dead. Stop.
- No fallback chain. 30 percent of users might be behind a corporate proxy.
14. Predictions for 2026 and Beyond
- WebTransport — The moment Safari ships stable enable-by-default (estimated 2026 to 2027), WT becomes the default candidate for game and media loads.
- HTTP/3 plus SSE — Multiplexing and 0-RTT reconnect make SSE even stronger.
- MoQ (Media over QUIC) — The IETF MoQ working group is building a live media standard. Source: https://datatracker.ietf.org/wg/moq/about/.
- First-class SSE in edge runtimes — Vercel, Cloudflare, and Deno Deploy prioritize SSE.
- WebSocket over HTTP/3 — RFC 9220 — adoption is slow but real.
- AI agent workflows — Token streaming over SSE, with tool-call results split out as separate SSE event types, is becoming the canonical pattern.
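That last pattern is plain SSE with named events. A sketch of the framing — `agentEvent` is a hypothetical helper, and the event names are illustrative:

```typescript
// Tokens and tool results as distinct SSE event types on one stream; the
// client routes on them with addEventListener('token', ...) and so on.
function agentEvent(type: 'token' | 'tool_result', data: unknown): string {
  return `event: ${type}\ndata: ${JSON.stringify(data)}\n\n`
}
```

Because each event carries its own type, the client never has to sniff payload shapes to tell a token apart from a tool result.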
The post-LLM realtime web isn't a single-protocol era. It's an era of picking the right tool per use case. The 2010s, when WS could solve everything, are over.
Epilogue — Protocols Are Tools
One sentence: the moment you see "realtime," ask first whether the traffic is one-way or bidirectional, then whether it needs to be reliable, then whether it must be P2P. Three questions and four or five lines later, you have an answer.
LLM token streaming picked SSE not because SSE is "the best" but because SSE is the simplest answer to that specific problem. By the same logic, collaborative editing is simplest on WebSocket, game coordinates are most natural on WebTransport.
Love the problem, not the tool.
14-Item Checklist
- Is the traffic truly bidirectional, or really one-way?
- Is reliability mandatory, or is unreliable acceptable?
- Is P2P actually needed?
- Did you model the users' network (enterprise, mobile, international)?
- Is there a fallback chain?
- Do you authenticate at the handshake?
- Do you have auto reconnect with backoff and jitter?
- Is there a heartbeat or idle ping?
- Are there message sequence IDs and resume semantics?
- Is there a graceful shutdown procedure?
- Are sticky sessions and session affinity configured at the router?
- Did you measure memory per connection?
- Are monitoring and SLOs defined?
- If it's LLM, did you consider SSE first?
Ten Anti-Patterns (Recap)
- WebSocket for one-way traffic.
- Late auth.
- No reconnect.
- No heartbeat.
- No sequence IDs.
- No graceful shutdown.
- Giant single messages.
- WebRTC for non-P2P.
- Reviving HTTP/2 Server Push.
- No fallback chain.
Next Up
Possible follow-ups: SSE on edge runtimes — comparing Vercel, Cloudflare, and Deno LLM streaming patterns, MoQ (Media over QUIC) — the next-generation live media protocol, CRDT plus WebSocket — operational notes for collaborative editing.
"The next decade of the realtime web is not one protocol — it's picking the right tool. One-way: SSE. Bidirectional: WebSocket. Unreliable multi-stream: WebTransport. P2P: WebRTC. And when none of that works — Long Polling is still alive."
— Realtime Web in 2026, end.
References
- WebSocket: RFC 6455 — https://datatracker.ietf.org/doc/html/rfc6455
- WebSocket over HTTP/2: RFC 8441 — https://datatracker.ietf.org/doc/html/rfc8441
- WebSocket over HTTP/3: RFC 9220 — https://datatracker.ietf.org/doc/html/rfc9220
- SSE — HTML Living Standard: https://html.spec.whatwg.org/multipage/server-sent-events.html
- EventSource MDN: https://developer.mozilla.org/en-US/docs/Web/API/EventSource
- WebTransport over HTTP/3 (IETF draft) — https://datatracker.ietf.org/doc/draft-ietf-webtrans-http3/
- WebTransport MDN: https://developer.mozilla.org/en-US/docs/Web/API/WebTransport_API
- WebTransport Chrome status: https://chromestatus.com/feature/4854541929873408
- WebRTC MDN: https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API
- HTTP/2 Push removal: https://developer.chrome.com/blog/removing-push
- OpenAI Streaming API: https://platform.openai.com/docs/api-reference/streaming
- Anthropic Messages Streaming: https://docs.anthropic.com/en/api/messages-streaming
- Discord Gateway: https://discord.com/developers/docs/topics/gateway
- aioquic Python QUIC/HTTP3: https://github.com/aiortc/aioquic
- Cloudflare WebTransport: https://blog.cloudflare.com/webtransport-now-supported-in-workers/
- IETF MoQ Working Group: https://datatracker.ietf.org/wg/moq/about/