A Deep-Dive Guide to the New Era of Edge Computing — Cloudflare Workers, Durable Objects, D1, Vercel Edge, Deno Deploy, Fastly, Fly.io, Turso, and Region-aware Architecture (2025)
TL;DR — Edge computing, which started with Cloudflare Workers in 2017, has become the default infrastructure of most modern SaaS in 2024-2025. The UX bar of "50ms even for users on the other side of the planet" is the product of V8 Isolates plus WASM runtimes distributed across 300+ PoPs (Points of Presence). Edge is not an evolution of CDNs — it is a completely different architectural paradigm. It has matured from stateless request handling to Durable Objects (stateful), D1/Turso (edge SQLite), region-aware writes, and Smart Placement. This article maps the full terrain: the philosophies and trade-offs of Cloudflare Workers, Vercel Edge, Deno Deploy, Fastly, and Fly.io; edge databases (D1, Turso libSQL, PlanetScale, Neon); the cold-start race (V8 Isolate vs Container vs Firecracker vs WASM); and the 2025 edge AI stack.
From CDN to Edge Compute — 30 Years of Evolution
1st-Generation CDN (1998-) — Static Cache
Akamai was spun out of MIT in 1998. Static files like images, CSS, and JS were replicated to regional caches. Bandwidth savings plus reduced latency.
2nd-Generation CDN (2010-) — Dynamic Content
CloudFront (AWS, 2008), Cloudflare (2010). HTTP cache-control, purge APIs, signed URLs. Even dynamic pages could be partially cached.
3rd Generation — Edge Compute (2017-)
Cloudflare Workers (Sept 2017). "Run code on CDN nodes." Requests no longer have to reach origin; they are handled at the edge.
// The original Workers example
addEventListener('fetch', event => {
event.respondWith(new Response('Hello from the edge!'))
})
Impact: Latency drops from 100ms to 5ms — 1/20th. Origin load shrinks dramatically.
4th Generation — Stateful Edge (2021-)
Cloudflare Durable Objects (2021) and Fly.io (2020) popularized "state at the edge." Edge had been stateless until then, but consistency-guaranteed state became possible at the edge.
5th Generation — Edge AI (2024-)
Cloudflare Workers AI, Vercel AI SDK, WebLLM. LLM inference at the edge. "AI near the user's device" is now reality.
Why Edge?
1. The Physics of Latency
Light travels at 300,000 km/s in vacuum and ~200,000 km/s in fiber. The Seoul-to-New York great-circle distance is roughly 11,000 km, giving a theoretical minimum round-trip of 110ms. In practice, routing inefficiencies push it to 150-200ms.
TCP handshake (1.5 RTT) plus TLS 1.3 handshake (1 RTT) plus the HTTP request totals at least 3.5 RTT — 500ms+ for Seoul-New York. If the edge responds from a Seoul PoP, it's 5ms.
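As a sanity check, the handshake arithmetic above can be scripted (the RTT values are illustrative, not measurements):

```javascript
// Rough first-byte latency budget for a cross-ocean HTTPS request.
// Counts: TCP handshake (~1.5 RTT incl. final ACK), TLS 1.3 (1 RTT),
// HTTP request/response (1 RTT) — the 3.5 RTT figure used above.
function firstByteLatencyMs(rttMs) {
  const TCP_RTT = 1.5;
  const TLS13_RTT = 1.0;
  const HTTP_RTT = 1.0;
  return (TCP_RTT + TLS13_RTT + HTTP_RTT) * rttMs;
}

// Seoul -> New York origin (~150ms RTT) vs a Seoul PoP (~1.5ms RTT)
console.log(firstByteLatencyMs(150)); // 525
console.log(firstByteLatencyMs(1.5)); // 5.25
```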
2. Origin Protection
DDoS, traffic spikes, crawlers — the edge absorbs them. Origin stays focused on origin work.
3. Regulatory Compliance
GDPR, data residency — European users' data is processed only at European edges.
4. Cost Structure
Egress bandwidth costs roughly 0.09 USD/GB on AWS versus 0 USD on Cloudflare Workers. For heavy request volumes, the gap compounds to tens or hundreds of times.
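To make the gap concrete, here is the arithmetic at an assumed 50 TB/month of egress (the rates are the illustrative figures above, not a current price sheet):

```javascript
// Illustrative monthly egress bill at the per-GB rates quoted above.
// The volume and rates are assumptions for the comparison only.
function monthlyEgressUSD(gbPerMonth, usdPerGB) {
  return gbPerMonth * usdPerGB;
}

const gb = 50_000; // 50 TB/month of egress
console.log(monthlyEgressUSD(gb, 0.09)); // ~4500 USD on an AWS-style meter
console.log(monthlyEgressUSD(gb, 0));    // 0 USD on a zero-egress edge
```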
Cloudflare Workers — The V8 Isolate Originator
Architecture
Workers run on V8 Isolates. Not containers, not VMs — isolated JS contexts inside a single V8 process.
Cloudflare Node (300+ PoPs globally)
├─ V8 Runtime (one process)
│ ├─ Isolate A (tenant 1, Worker X)
│ ├─ Isolate B (tenant 1, Worker Y)
│ ├─ Isolate C (tenant 2, Worker Z)
│ ...thousands
Benefits:
- 5ms cold starts (vs 100ms+ for containers)
- 3MB per isolate (vs 100MB+ for containers)
- Tens of thousands of tenants per node
Constraints:
- CPU time limits (Free 10ms, Paid 50ms, up to 30s)
- No eval, no native add-ons
- WASM is supported
Core APIs
export default {
  async fetch(request, env, ctx) {
    // KV Store
    const value = await env.MY_KV.get("key")

    // D1 Database (SQLite)
    const { results } = await env.DB.prepare("SELECT * FROM users").all()

    // R2 (S3-compatible)
    const object = await env.MY_BUCKET.get("file.jpg")

    // Queues
    await env.MY_QUEUE.send({ event: "user_signup" })

    // AI
    const ai = new Ai(env.AI)
    const response = await ai.run("@cf/meta/llama-3-8b-instruct", {
      messages: [{ role: "user", content: "Hello" }]
    })

    return Response.json({ result: response })
  }
}
Durable Objects — Edge Actors
The core of stateful edge compute. Each Durable Object:
- Exists as a single global instance
- Is auto-placed in a specific region (Smart Placement)
- Persists internal state (transactional SQLite since 2024)
- Can maintain WebSocket connections
export class ChatRoom {
  constructor(state, env) {
    this.state = state
    this.sessions = []
  }

  async fetch(request) {
    const pair = new WebSocketPair()
    this.sessions.push(pair[1])
    pair[1].accept()
    pair[1].addEventListener('message', e => {
      for (const session of this.sessions) {
        session.send(e.data) // broadcast
      }
    })
    return new Response(null, { status: 101, webSocket: pair[0] })
  }
}
Typical uses: real-time chat, game matchmaking, collaborative editing (Google Docs-style), order processing.
D1 — Edge SQLite
Launched in 2022, GA in 2024. Replicates SQLite databases to each Cloudflare PoP. Primary lives in one region; read replicas are worldwide.
const { results } = await env.DB
  .prepare("SELECT * FROM users WHERE id = ?")
  .bind(userId)
  .all()

await env.DB
  .prepare("INSERT INTO orders (user_id, total) VALUES (?, ?)")
  .bind(userId, 99.99)
  .run()
Limit: 10GB per DB (as of 2024), writes are routed to the primary region.
R2 — S3-compatible Object Storage
Zero egress fees. S3-compatible API. BYO domain support added in 2024.
Workers AI
Launched in 2023. Open-source model inference on Cloudflare-operated GPU pools.
const ai = new Ai(env.AI)

// Llama 3
const response = await ai.run("@cf/meta/llama-3-8b-instruct", {
  messages: [{ role: "user", content: "What is edge computing?" }]
})

// Stable Diffusion
const image = await ai.run("@cf/stabilityai/stable-diffusion-xl-base-1.0", {
  prompt: "A cat coding on a laptop"
})

// Embedding
const vector = await ai.run("@cf/baai/bge-base-en-v1.5", {
  text: "Hello world"
})
Vectorize
GA in 2024. Cloudflare's vector database. Workers AI plus Vectorize yields edge RAG.
Vercel Edge Functions + Middleware
Launched in 2022. Aimed at framework developers, Next.js-centric.
Edge Runtime
Built on V8 isolates like Cloudflare Workers. Supports some Node.js APIs (AsyncLocalStorage, Buffer).
// Next.js Edge API Route
export const config = { runtime: 'edge' }

export default async function handler(req) {
  const country = req.geo?.country
  return new Response(`Hello from ${country}`)
}
Middleware
// middleware.ts
import { NextResponse } from 'next/server'

export function middleware(request) {
  const country = request.geo?.country
  if (country === 'KR') {
    return NextResponse.rewrite(new URL('/kr', request.url))
  }
}

export const config = { matcher: '/((?!api|_next).*)' }
Primarily used for A/B testing, geolocation routing, and bot protection.
Vercel Edge Config
Read-only KV with ~20ms global propagation. Ideal for feature flags and A/B bucket definitions.
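Feature flags stored in Edge Config still need deterministic bucket assignment on the compute side. A common sketch (the hashing scheme here is an assumption for illustration, not Vercel's implementation) hashes the user id into a stable bucket:

```javascript
// Deterministic A/B bucketing: hash a user id into [0, 100) and compare
// against a rollout percentage (the kind of value you'd keep in Edge Config).
function bucketOf(userId) {
  // FNV-1a 32-bit hash: stable across deploys, no dependencies.
  let h = 0x811c9dc5;
  for (const ch of userId) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h % 100;
}

function inExperiment(userId, rolloutPercent) {
  return bucketOf(userId) < rolloutPercent;
}

// The same user always lands in the same bucket:
console.log(bucketOf("user-42") === bucketOf("user-42")); // true
```

The key property is stickiness: no session storage is needed, because the hash alone decides the bucket on every edge node identically.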
Vercel AI SDK
Launched in 2023. Standardizes streamText, generateObject, and tool calling.
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'

export async function POST(req) {
  const { messages } = await req.json()
  const result = await streamText({
    model: openai('gpt-4-turbo'),
    messages,
  })
  return result.toDataStreamResponse()
}
Deno Deploy — V8 plus TypeScript Native
Deno was built by Ryan Dahl, creator of Node.js. Deno Deploy shipped in 2021.
Characteristics
- Built on V8 isolates
- TypeScript native — transpilation is automatic
- Web-standard APIs — standard fetch/Request/Response instead of Node.js compat
- npm compatibility (added in 2023)
Deno.serve((req) => {
  return new Response("Hello from Deno Deploy")
})
Netlify Edge Functions
Uses Deno Deploy's runtime internally. Netlify integration.
Fastly Compute@Edge — 100% WASM
GA in 2020. Where Cloudflare is V8-based, Fastly is Wasmtime-based.
Characteristics
- Language-agnostic — Rust, Go, JavaScript, AssemblyScript
- Deterministic cold starts — no GC, claims 35μs
- Relatively expensive, enterprise-targeted
// Rust on Fastly
use fastly::{Error, Request, Response};
#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
Ok(Response::from_body("Hello from WASM edge"))
}
KV Store, Config Store, Secret Store
Edge storage primitives similar to Cloudflare's.
Fly.io — Regional VM Orchestration
Launched in 2020. "Run actual containers at the edge."
Architecture
- Firecracker MicroVMs (the same virtualization technology that underpins AWS Lambda)
- 35+ regions
- Regions are selected via fly.toml
# fly.toml
app = "my-app"
primary_region = "nrt" # Tokyo
[build]
image = "my-app:latest"
[[services]]
internal_port = 8080
protocol = "tcp"
[[services.ports]]
port = 443
handlers = ["tls", "http"]
[http_service]
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
vs Cloudflare Workers
| Axis | Cloudflare Workers | Fly.io |
|---|---|---|
| Runtime | V8 Isolate | Firecracker VM |
| Language | JS + WASM | Anything |
| Cold start | 5ms | hundreds of ms to seconds |
| Statelessness | stateless by default | always-running possible |
| DB | D1, KV | Postgres, SQLite full-stack |
| WebSocket | Durable Objects | plain TCP |
| Pricing | request-based | VM-time based |
Selection criteria:
- JS/TS web APIs, stateless → Cloudflare
- Python/Rust/Go long-running processes, Postgres required → Fly.io
- Phoenix LiveView, Discord-style → Fly.io (several public case studies)
Phoenix LiveView + Fly.io
Fly.io employs Phoenix's creator (Chris McCord) and is an official partner of the Elixir/Phoenix ecosystem. LiveView depends on persistent WebSocket connections, which makes Fly.io a natural fit.
AWS's Answer — Lambda@Edge, CloudFront Functions
CloudFront Functions (2021)
- JavaScript only
- 1ms cold starts
- 1ms max execution time
- Header rewrites, simple redirects only
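A minimal viewer-request handler in the CloudFront Functions style illustrates the "header rewrites and simple redirects" scope (the event shape follows AWS's documented format; domain names are placeholders):

```javascript
// CloudFront Functions style viewer-request handler: redirect the naked
// domain to www, otherwise tag the request with a header. Note the event
// structure: request.headers is a map of { value } objects.
function handler(event) {
  var request = event.request;
  var host = request.headers.host.value;

  if (host === 'example.com') {
    return {
      statusCode: 301,
      statusDescription: 'Moved Permanently',
      headers: { location: { value: 'https://www.example.com' + request.uri } },
    };
  }

  request.headers['x-edge'] = { value: 'cf-function' };
  return request;
}

// Local smoke test with a mocked event:
const out = handler({ request: { uri: '/a', headers: { host: { value: 'example.com' } } } });
console.log(out.statusCode); // 301
```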
Lambda@Edge (2017)
- Node.js, Python
- 5-second timeout on viewer triggers (origin triggers allow up to 30 seconds)
- More powerful but has cold starts
- Only 13 regions (vs Cloudflare's 300+)
Limitation: Targeted at customers who must stay inside the AWS ecosystem. As a general-purpose edge platform, it trails Cloudflare/Vercel/Fly.
Edge Database Competition
The real bottleneck at the edge is data. Stateless compute is easy; stateful consistency is a fight against physics.
Turso — libSQL (Chiselstrike, 2022)
- SQLite fork "libSQL" — built-in remote replication
- Read replicas per region, writes go to primary
- Claims 1ms read latency
turso db create my-db --location fra
turso db replicate my-db hnd # Add a Tokyo replica
import { createClient } from '@libsql/client'

const db = createClient({
  url: process.env.TURSO_URL,
  authToken: process.env.TURSO_TOKEN
})

const result = await db.execute({
  sql: "SELECT * FROM users WHERE id = ?",
  args: [userId]
})
Innovations: Git-like branching; Embedded Replicas (embed the DB inside the app for local queries).
Cloudflare D1
Covered above. Has a 10GB limit.
PlanetScale
MySQL-based. Vitess sharding. The "merge a branch into main" dev workflow.
pscale branch create my-db feature-x
pscale deploy-request create my-db feature-x
PostgreSQL shipped in 2024 (an expansion from its MySQL-only roots).
Neon
Serverless PostgreSQL. Storage-compute separation.
- Branching — like git
- Scale to zero — compute 0 when idle
- Fast cold start — 100ms
DB of the Year in 2024.
Xata
PostgreSQL + typesafe client + full-text search.
Supabase Edge Functions
Supabase's edge runtime. Deno-based. Best-in-class integration with Supabase's Postgres.
EdgeDB
Graph queries, strong typing. A new query language sitting on top of PostgreSQL.
Designing Region-Aware Architectures
"Distribute compute across the edge" is easy; "keep data consistent" is hard.
Read Local, Write Global
- Reads go to the nearest replica
- Writes route to the primary region (accept the extra latency)
User (Seoul)
↓ read (5ms)
Seoul replica ←─── replication ──── Primary (Frankfurt)
↑ write (250ms)
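The routing rule in the diagram can be sketched as a function (the latency numbers are the illustrative ones above):

```javascript
// Read-local / write-global routing sketch. Latencies are illustrative.
const LATENCY_MS = {
  // user region -> { nearest-replica read, primary (Frankfurt) write }
  seoul:     { read: 5, write: 250 },
  frankfurt: { read: 5, write: 5 },
  newyork:   { read: 5, write: 90 },
};

function route(region, op) {
  const l = LATENCY_MS[region];
  return op === 'read'
    ? { target: `${region}-replica`, latencyMs: l.read }
    : { target: 'primary-frankfurt', latencyMs: l.write };
}

console.log(route('seoul', 'read'));  // { target: 'seoul-replica', latencyMs: 5 }
console.log(route('seoul', 'write')); // { target: 'primary-frankfurt', latencyMs: 250 }
```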
Smart Placement (Cloudflare)
A contrarian idea: place the Worker near the origin DB. Users hit the CDN cache for low latency while Workers co-locate with the DB.
# wrangler.toml
[placement]
mode = "smart"
Single-Leader with Leader Election
- CockroachDB, Spanner style
- Raft/Paxos consensus
- Leader migrates by region
Geo-partitioning
Pin user data to regions. Korean users' data goes to the Seoul DB. Essential for GDPR compliance.
-- CockroachDB
ALTER TABLE users CONFIGURE ZONE USING
constraints = '{"+region=seoul": 1}';
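The same pinning can be enforced at the application layer. A hypothetical sketch (the user-to-region map and the naming are made up for illustration):

```javascript
// Application-level geo-partitioning: pin each user's data to a home
// region and route writes there, so residency guarantees (GDPR-style)
// hold even when the request arrives at a distant edge.
const HOME_REGION = new Map([
  ['user-kr-1', 'seoul'],
  ['user-de-1', 'frankfurt'],
]);

function dbFor(userId, requestRegion) {
  const home = HOME_REGION.get(userId);
  if (!home) throw new Error(`no home region for ${userId}`);
  // Reads may hit a nearby replica; writes must go to the home region.
  return { write: `${home}-primary`, read: `${requestRegion}-replica` };
}

console.log(dbFor('user-kr-1', 'frankfurt').write); // seoul-primary
```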
Eventual Consistency
DynamoDB Global Tables, Cloudflare KV. Free to read in any zone, writes are eventual.
Put (us-east-1): key=x, value=v1
↓ (tens of seconds)
Get (ap-northeast-1): key=x, value=v1 (or previous)
Suitable for data without per-second accuracy requirements — news feeds, social timelines.
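A toy model makes the read-after-write gap visible (replication here is flushed manually; real systems replicate asynchronously within seconds):

```javascript
// Eventually consistent KV sketch: writes land on one region and
// replicate asynchronously; reads elsewhere may return stale values.
class EventualKV {
  constructor(regions) {
    this.stores = new Map(regions.map(r => [r, new Map()]));
    this.pending = [];
  }
  put(region, key, value) {
    this.stores.get(region).set(key, value);
    // queue async replication to every other region
    for (const [r, store] of this.stores) {
      if (r !== region) this.pending.push(() => store.set(key, value));
    }
  }
  get(region, key) {
    return this.stores.get(region).get(key); // may be stale or missing
  }
  flushReplication() {
    for (const apply of this.pending) apply();
    this.pending = [];
  }
}

const kv = new EventualKV(['us-east-1', 'ap-northeast-1']);
kv.put('us-east-1', 'x', 'v1');
console.log(kv.get('ap-northeast-1', 'x')); // undefined (not yet replicated)
kv.flushReplication();
console.log(kv.get('ap-northeast-1', 'x')); // 'v1'
```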
CRDT — Conflict-free Replication
Conflict-free Replicated Data Types. Implemented by Riak, Redis CRDT, Automerge, and Y.js.
- Figma, Linear, Notion use them
- Offline editing plus automatic merging
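The simplest CRDT, a grow-only counter (G-Counter), shows why merges never conflict: each replica owns a slot, and merge takes the element-wise max. A minimal sketch:

```javascript
// G-Counter CRDT: per-replica increment slots; merge is element-wise max,
// so concurrent offline updates always combine without conflicts.
function gcounterNew() { return {}; }
function gcounterInc(c, replicaId, n = 1) {
  return { ...c, [replicaId]: (c[replicaId] ?? 0) + n };
}
function gcounterMerge(a, b) {
  const out = { ...a };
  for (const [id, n] of Object.entries(b)) out[id] = Math.max(out[id] ?? 0, n);
  return out;
}
function gcounterValue(c) {
  return Object.values(c).reduce((s, n) => s + n, 0);
}

// Two replicas increment while offline, then merge:
const a = gcounterInc(gcounterNew(), 'seoul', 2);
const b = gcounterInc(gcounterNew(), 'frankfurt', 3);
const merged = gcounterMerge(a, b);
console.log(gcounterValue(merged)); // 5
```

Richer types (LWW registers, sequence CRDTs as in Automerge and Y.js) follow the same principle: a merge function that is commutative, associative, and idempotent.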
The Cold-Start Race — 2025 Numbers
| Technology | Cold start | Memory | Notes |
|---|---|---|---|
| Cloudflare Workers | 5ms | 3MB | V8 isolate |
| Fastly Compute | 35μs-1ms | MB | Wasmtime |
| Deno Deploy | 10ms | MB | V8 + TS |
| Vercel Edge Functions | 10ms | MB | V8 isolate |
| AWS Lambda (SnapStart) | 100-300ms | MB | Firecracker VM snapshot |
| AWS Lambda@Edge | 100ms-1s | MB | Node.js/Python |
| Fly.io (Firecracker) | 200ms-1s | MB-GB | MicroVM |
| Fly Machines 2.0 | 100ms (hibernated) | MB | VM checkpoint |
| Google Cloud Run | 500ms-2s | MB | Container |
| AWS ECS Fargate | 10-30s | GB | Container |
Firecracker (open-sourced by AWS in 2018): boots a KVM microVM in 125ms. The foundation of Lambda and Fly.io.
Isolates vs microVMs: Isolates are faster but have weaker security boundaries (no hardware isolation). MicroVMs deliver KVM isolation and are still fast (hundreds of ms).
Edge AI — The 2024-2025 Explosion
Cloudflare Workers AI
- 40+ models including Llama 3, Mistral, Stable Diffusion
- Usage-based pricing, no direct GPU management required
Vercel AI SDK + v0
- UI-generation AI (v0.dev)
- Streaming text, integrated with RSC
Supabase AI
- pgvector plus Edge Functions
WebLLM (MLC AI)
- Runs Llama/Mistral in the browser
- Uses WebGPU
Transformers.js (HuggingFace)
- BERT, Whisper, SAM in the browser
Ollama Cloud (2024)
- Cloud extension of local-first LLMs
Shared pattern: prompts at the edge, heavy inference at a GPU region → hybrid.
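The hybrid pattern boils down to a routing decision. A sketch with assumed thresholds and placeholder model names:

```javascript
// Hybrid inference routing: small, latency-sensitive prompts go to an
// edge-hosted model; heavy ones to a GPU region. The token budget and
// model names are illustrative assumptions, not platform defaults.
function routeInference({ promptTokens, needsLargeModel }) {
  const EDGE_TOKEN_BUDGET = 2000;
  if (!needsLargeModel && promptTokens <= EDGE_TOKEN_BUDGET) {
    return { tier: 'edge', model: 'small-instruct-8b' };
  }
  return { tier: 'gpu-region', model: 'large-instruct-70b' };
}

console.log(routeInference({ promptTokens: 300, needsLargeModel: false }).tier);  // edge
console.log(routeInference({ promptTokens: 9000, needsLargeModel: false }).tier); // gpu-region
```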
Edge Security — Zero Trust
DDoS Defense
Cloudflare reported mitigating a DDoS attack of 259 million requests per second in 2024. Traffic at that scale is exactly what the edge exists to absorb.
Rate Limiting
// Cloudflare Workers
export default {
  async fetch(req, env) {
    const { success } = await env.RATE_LIMITER.limit({ key: req.headers.get("CF-Connecting-IP") })
    if (!success) return new Response("Too many", { status: 429 })
    // ...
  }
}
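The RATE_LIMITER binding is opaque, but the underlying idea is simple. A fixed-window counter sketch (one of several algorithms such limiters use; the limit and window here are arbitrary):

```javascript
// Fixed-window rate limiter: count hits per key within a time window and
// reject once the count exceeds the limit. Roughly what a per-key limiter
// binding does under the hood; limits and window size are assumptions.
class FixedWindowLimiter {
  constructor({ limit, windowMs }) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.counts = new Map(); // key -> { windowStart, count }
  }
  limitCheck(key, now = Date.now()) {
    const slot = this.counts.get(key);
    if (!slot || now - slot.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, count: 1 }); // new window
      return { success: true };
    }
    slot.count += 1;
    return { success: slot.count <= this.limit };
  }
}

const limiter = new FixedWindowLimiter({ limit: 2, windowMs: 60_000 });
const ip = '203.0.113.7';
console.log(limiter.limitCheck(ip, 0).success); // true
console.log(limiter.limitCheck(ip, 1).success); // true
console.log(limiter.limitCheck(ip, 2).success); // false (3rd hit in window)
```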
mTLS, Zero Trust
Cloudflare Access, Tailscale, Zscaler — VPN replacements. Per-app authentication performed at the edge.
WAF (Web Application Firewall)
Cloudflare WAF auto-blocks the OWASP Top 10. Runs at the edge to protect origin.
Bot Management
AI-powered bot detection. Filters Claude, GPT-4 scrapers.
Real-World Adoption Guide — Which Edge, When
Scenario 1 — Mostly Static plus Some Dynamic
- Recommendation: Vercel, Netlify
- Next.js/Nuxt/SvelteKit plus ISR plus Edge Middleware
Scenario 2 — API plus WebSocket plus Real-time
- Recommendation: Cloudflare Workers plus Durable Objects
- Chat, game lobbies, collaborative editors
Scenario 3 — Global SaaS plus Postgres
- Recommendation: Fly.io plus Neon/Supabase, or Vercel plus Neon
- Traditional web apps, region-aware
Scenario 4 — Enterprise plus Compliance
- Recommendation: Fastly with dedicated regions / AWS Lambda@Edge
- Finance, healthcare
Scenario 5 — Edge AI
- Recommendation: Cloudflare Workers AI plus Vectorize
- Or Vercel AI SDK plus OpenAI
Scenario 6 — IoT/Low Latency
- Recommendation: AWS IoT plus Wavelength (5G edge)
- Cloudflare Workers
2025 Edge Trends
1. Edge-plus-AI Integration
Every major platform has added an AI runtime. Workers AI, Vercel AI, Fastly AI (announced 2024).
2. Stateful Edge Matures
Durable Objects, Turso Embedded Replicas, Fly Machines auto-hibernate.
3. Region-aware Frameworks
Next.js App Router's runtime: 'edge', Remix, and Astro make edge deployment easy.
4. Edge Database Wars
D1 vs Turso vs Neon vs PlanetScale. SQLite/libSQL is surging.
5. MicroVM Improvements
Ongoing Firecracker snapshot and resume work keeps pushing microVM start times from hundreds of milliseconds toward the ~100ms range.
6. Edge Security Standardization
WAF plus Zero Trust plus Bot Management bundles.
7. Edge FinOps
Request-based pricing flips — at high volume, containers become cheaper. A cost-optimization trade-off.
Adoption Checklist (2025)
- Clear use case — static/dynamic/stateful/AI
- Runtime choice — Workers (V8), Fastly (WASM), Fly (VM)
- Latency benchmark — measure from real user regions
- Data architecture — region-aware, read replicas, write routing
- Observability — OTLP export, Cloudflare Logs, Datadog
- Security — WAF, rate limiting, Zero Trust
- Cost model — per-request vs VM-hour
- Minimize vendor lock-in — build on Web Standard APIs
- Failure modes — origin fallback, multi-edge provider
- Cold-start budget — set p99 targets
- DB strategy — choose among Turso/D1/Neon
- CI/CD — Wrangler, Vercel CLI, flyctl
10 Common Anti-Patterns
- Forcing heavy compute to the edge — CPU time exceeded
- Hiding state under a stateless premise — consistency bugs
- Routing every request through the edge — some are better at origin
- No latency measurement — an edge deploy users don't feel
- Ignoring regulation — GDPR residency violations
- Storing 10GB+ in an edge DB — exceeding D1's limit
- Assuming WebSocket scalability — requires Durable Object design
- No origin fallback — an edge outage takes the whole site down
- No cold-start budget — p99 degrades
- Single-vendor lock-in — avoid by sticking to Web Standard APIs
Next Article Preview — "The Evolution of Modern CI/CD" — GitHub Actions, GitLab CI, Dagger, Nx, Turborepo, Remote Cache, Hermetic Builds
As edge deployment got faster, CI/CD pipelines underwent their own revolution. In 2024-2025, CI/CD left the "30-minute build" era behind: Remote Cache, Hermetic Builds, Dagger, Distributed Test normalized "5-minute builds, 10-minute tests."
The next article covers:
- CI/CD history — Jenkins → CircleCI → GitHub Actions → Dagger
- GitHub Actions in depth — matrix, reusable workflows, composite actions
- GitLab CI vs Jenkins vs Buildkite vs CircleCI
- Monorepo builds — Nx, Turborepo, Bazel, Rush, pnpm workspace
- Remote Cache plus Distributed Build — Bazel, Turborepo Remote Cache, Nx Cloud
- Hermetic Build — the philosophy of reproducibility
- Dagger — "CI/CD as code" programmable pipelines
- Container registries — GitHub Packages, ECR, Harbor
- Supply chain security — SLSA, sigstore, cosign, SBOM
- Test parallelization — Jest/Vitest sharding, Playwright shards
- Deployment strategies — Blue/Green, Canary, Progressive Delivery (Flagger)
- Platform engineering — Backstage plus CI/CD integration
We'll track how "fast CI/CD" is not just a tech concern but a lever that directly governs team productivity and deploy frequency, and why "monorepo-scale organizations" treat CI/CD as a product.