Serverless & Edge Functions 2026 Deep Dive - AWS Lambda, Cloud Run, Cloudflare Workers, Deno Deploy, Vercel, Fastly, Fermyon Spin, Fly.io
Author: Youngju Kim (@fjvbn20031)
Intro — In May 2026, serverless is just infrastructure
In 2020, serverless was a "PoC tool for feature validation." In 2023 it was "fine up to a million MAU." In May 2026, serverless and edge runtimes are no longer a niche choice. Coupang absorbing spike traffic, Toss post-processing payments, LY (LINE Yahoo) fanning out notifications, parts of Mercari's search backend, Sansan's business-card OCR follow-ups — flagship Japanese and Korean services all run Lambda, Cloud Run, and Workers on real production paths.
This post is not a marketing matrix. As of May 2026 it compares which workloads each platform is honestly suited for, how much the cold-start problem has actually been solved, where WebSockets and streaming work, and what really determines the price — all with real code.
Serverless vs edge — restating the two axes for 2026
First, terminology. "Serverless" in 2026 effectively splits into three lanes.
- Regional serverless: AWS Lambda, Google Cloud Functions/Cloud Run, Azure Functions/Container Apps. Renting a container or microVM in a single region.
- Edge runtime: Cloudflare Workers, Vercel Edge, Netlify Edge, Deno Deploy, Fastly Compute@Edge, Akamai EdgeWorkers. V8 isolates or Wasm runtimes distributed per PoP.
- Serverless containers (PaaS): AWS App Runner, Google Cloud Run, Fly.io, Railway, Render, Koyeb, Northflank. Closer to "throw a Dockerfile and we'll run it."
Each lane has different limits. Lambda is capped at 15 minutes per invocation; Workers on paid plans get 30 seconds of CPU time per request by default, configurable up to 5 minutes; Cloud Run allows 60-minute requests; and Fly.io is essentially a full VM. So "use one serverless tool for everything" is a lie. The 2026 answer is putting the right tool in each slot.
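As a rough illustration of matching a workload to a lane, the sketch below encodes the duration limits quoted above in a small TypeScript helper. The `Workload` fields, the `pickLane` name, and the lane labels are hypothetical, and a real decision would also weigh memory, pricing, and protocol needs. The Workers figure is CPU time while the others are wall-clock, so the comparison is deliberately coarse.

```typescript
// Lane labels are illustrative names, not vendor terminology.
type Lane = "edge-isolate" | "regional-function" | "serverless-container" | "full-vm";

interface Workload {
  maxRuntimeSeconds: number; // longest single request or job
  needsRawTcpOrUdp: boolean; // custom protocols, game servers, and similar
}

// Returns the first lane whose duration limit (as quoted in the text) fits.
function pickLane(w: Workload): Lane {
  if (w.needsRawTcpOrUdp) return "full-vm"; // Fly.io-style microVM
  if (w.maxRuntimeSeconds <= 30) return "edge-isolate"; // Workers default CPU budget
  if (w.maxRuntimeSeconds <= 15 * 60) return "regional-function"; // Lambda cap
  if (w.maxRuntimeSeconds <= 60 * 60) return "serverless-container"; // Cloud Run cap
  return "full-vm";
}

const apiCall = pickLane({ maxRuntimeSeconds: 2, needsRawTcpOrUdp: false });
const nightlyExport = pickLane({ maxRuntimeSeconds: 40 * 60, needsRawTcpOrUdp: false });
```

Here `apiCall` lands in the edge lane and `nightlyExport` in the container lane, which matches the "right tool in each slot" argument.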
Cold starts are mostly solved — the real 2026 numbers
Cold starts have been serverless's loudest weakness since 2019. As of May 2026, that weakness has largely disappeared.
- AWS Lambda SnapStart: GA for Java/.NET/Python 3.12+. Firecracker snapshots are restored on demand, dropping Java cold starts from 1–2 seconds to about 100–300 ms.
- Cloudflare Workers: The V8 isolate model has near-zero cold starts (typically under 5 ms).
- Cloud Run: min-instances is now standard. Scaling from zero, a cold start takes roughly 1–3 s; with min-instances=1 you're effectively warm.
- Vercel Fluid Compute: One instance handles multiple concurrent requests, which directly cuts how often a cold start happens.
- Fermyon Spin: Wasm module instantiation is around 1 ms, so "cold" barely means anything.
P50 and P99 cold-start latency by platform (measured May 2026):
| Platform | Runtime | P50 cold | P99 cold |
|---|---|---|---|
| AWS Lambda | Node.js 20 | 180 ms | 450 ms |
| AWS Lambda | Python 3.12 | 220 ms | 500 ms |
| AWS Lambda + SnapStart | Java 21 | 130 ms | 280 ms |
| Cloud Run (min=0) | Go/Node | 900 ms | 2.5 s |
| Cloud Run (min=1) | Go/Node | 5 ms | 30 ms |
| Cloudflare Workers | V8 isolate | 3 ms | 15 ms |
| Vercel Edge | V8 isolate | 5 ms | 25 ms |
| Deno Deploy | V8 isolate | 7 ms | 30 ms |
| Fastly Compute@Edge | Wasm | 35 μs | 200 μs |
| Fermyon Spin | Wasm | 1 ms | 5 ms |
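
Figures like the P50 and P99 columns come from reducing many cold-start samples to percentiles. A minimal sketch of the nearest-rank method follows; the sample values are made up for illustration and are not the measurements behind the table.

```typescript
// Nearest-rank percentile: sort, then take the value at rank ceil(p/100 * n).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of range");
  return value;
}

// Hypothetical cold-start timings in milliseconds.
const coldStartsMs = [180, 150, 450, 200, 170, 190, 160, 210, 175, 185];
const p50 = percentile(coldStartsMs, 50); // 180
const p99 = percentile(coldStartsMs, 99); // 450
```

With only ten samples the P99 is simply the slowest one, which is why tail figures need far more samples than medians before they mean much.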
"You can't use serverless because of cold starts" — the 2019 thesis — is mostly false in 2026.
AWS Lambda — still the default, plus what changed in 2026
Lambda shipped in 2014 and is still the serverless default in 2026. But 2024–2026 brought meaningful changes.
- Lambda SnapStart: Firecracker snapshot restore as above. GA for Java, .NET, Python 3.12+ as of May 2026. Node.js is in beta.
- Lambda Web Adapter: Lift Express/Fastify/Hono/Spring/Flask onto Lambda as-is. Effectively a container-image deploy.
- Lambda Powertools: AWS's official middleware library — logging, tracing, metrics, idempotency, parameter store in one bundle.
- Lambda Layers: Shared dependencies. Still useful for monorepos in 2026.
- Lambda Function URL + RESPONSE_STREAM: HTTP without API Gateway. Response streaming makes LLM token streaming possible.
A typical 2026-era Lambda handler in Node.js with Powertools:
```typescript
import { Logger } from '@aws-lambda-powertools/logger'
import { Tracer } from '@aws-lambda-powertools/tracer'
import { Metrics, MetricUnit } from '@aws-lambda-powertools/metrics'
import middy from '@middy/core'
import { injectLambdaContext } from '@aws-lambda-powertools/logger/middleware'
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware'
import { logMetrics } from '@aws-lambda-powertools/metrics/middleware'

const logger = new Logger({ serviceName: 'orders-api' })
const tracer = new Tracer({ serviceName: 'orders-api' })
const metrics = new Metrics({ namespace: 'orders', serviceName: 'orders-api' })

const handler = async (event: any) => {
  logger.info('received order', { orderId: event.orderId })
  metrics.addMetric('OrderReceived', MetricUnit.Count, 1)
  return { statusCode: 200, body: JSON.stringify({ ok: true }) }
}

// Wrap the handler with logging, tracing, and metrics middleware.
export const main = middy(handler)
  .use(injectLambdaContext(logger))
  .use(captureLambdaHandler(tracer))
  .use(logMetrics(metrics))
```
The same thing in Python:
```python
from aws_lambda_powertools import Logger, Tracer, Metrics
from aws_lambda_powertools.metrics import MetricUnit
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger(service="orders-api")
tracer = Tracer(service="orders-api")
metrics = Metrics(namespace="orders", service="orders-api")


@logger.inject_lambda_context
@tracer.capture_lambda_handler
@metrics.log_metrics
def handler(event: dict, context: LambdaContext):
    logger.info("received order", extra={"order_id": event.get("orderId")})
    metrics.add_metric(name="OrderReceived", unit=MetricUnit.Count, value=1)
    return {"statusCode": 200, "body": '{"ok": true}'}
```
Lambda's limits are still there: 15-minute max, 10 GB memory, 10 GB /tmp, default 1000 concurrent executions. Jobs longer than 15 minutes get sliced with Step Functions or pushed to Fargate/Batch.
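The "slice it" approach can be sketched as a planning step that splits a job into chunks, each of which fits inside a time budget below the 15-minute cap. This is only an illustration: the `planSlices` helper, the per-item timing, and the 10-minute budget are assumptions, and in practice each slice could become one item handed to a Step Functions Map state or one queue message.

```typescript
interface Slice {
  start: number; // inclusive item index
  end: number;   // exclusive item index
}

// Split totalItems into contiguous slices that each fit within budgetMs.
function planSlices(totalItems: number, msPerItem: number, budgetMs: number): Slice[] {
  const perSlice = Math.floor(budgetMs / msPerItem);
  if (perSlice < 1) throw new Error("a single item exceeds the time budget");
  const slices: Slice[] = [];
  for (let start = 0; start < totalItems; start += perSlice) {
    slices.push({ start, end: Math.min(start + perSlice, totalItems) });
  }
  return slices;
}

// 10,000 items at roughly 200 ms each, with a 10-minute budget per invocation
// to leave headroom under Lambda's 15-minute limit.
const slices = planSlices(10000, 200, 10 * 60 * 1000);
// 3,000 items fit per slice, so this yields 4 slices.
```

Budgeting below the hard limit matters because a timed-out invocation loses its in-flight work, while a slightly smaller slice only costs one extra invocation.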
AWS Fargate vs App Runner — slots Lambda doesn't fit
Two workloads make Lambda awkward. First, batch jobs over 15 minutes. Second, anything needing a persistent connection (WebSocket pool, gRPC server). AWS gives two answers.
- AWS Fargate: The compute backend for ECS/EKS. The textbook "managed container hosting." You still own the cluster, task definition, and service.
- AWS App Runner: Ship a container image or a GitHub repo and it handles build, deploy, HTTPS, and autoscale. The AWS service closest to Cloud Run.
App Runner closed a lot of the gap with Cloud Run in 2024 by hardening ALB integration and VPC Connector. But it still trails Cloud Run on region count and price in most evaluations.
Google Cloud Run — the de facto standard for container serverless
Cloud Run shipped GA in 2019 and has become the default for container serverless. As of May 2026, its strengths are:
- Request timeout up to 60 minutes (4× Lambda).
- CPU always-on for background work (Pub/Sub consumers, WebSocket fan-out).
- min-instances eliminates cold starts in practice.
- Native GCS, Pub/Sub, Cloud Tasks, Eventarc integrations.
- HTTP/2, gRPC, WebSocket all supported.
A typical Cloud Run service spec:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: orders-api
  annotations:
    run.googleapis.com/launch-stage: GA
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "200"
        run.googleapis.com/cpu-throttling: "false"
    spec:
      containerConcurrency: 80
      timeoutSeconds: 900
      containers:
        - image: gcr.io/myproj/orders-api:2026.05
          resources:
            limits:
              cpu: "2"
              memory: 1Gi
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-url
                  key: latest
```
Mercari has been running parts of its search backend on Cloud Run since 2022, and in Korea pieces of Toss post-payment processing reportedly run on Cloud Run (per meetup talks).
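To reason about the `containerConcurrency` and `maxScale` values in the spec above, Little's law (in-flight requests = arrival rate × latency) gives a quick back-of-the-envelope estimate. The function names and traffic figures below are hypothetical, and the real autoscaler also reacts to CPU utilization, so treat the result as a lower bound rather than a prediction.

```typescript
// Instances needed to hold the steady-state in-flight requests.
function instancesNeeded(rps: number, avgLatencyMs: number, containerConcurrency: number): number {
  const inFlight = (rps * avgLatencyMs) / 1000; // Little's law: L = lambda * W
  return Math.ceil(inFlight / containerConcurrency);
}

// Highest request rate the service can absorb before hitting maxScale.
function maxSustainableRps(maxScale: number, containerConcurrency: number, avgLatencyMs: number): number {
  return (maxScale * containerConcurrency * 1000) / avgLatencyMs;
}

// 2,000 req/s at 200 ms average latency keeps 400 requests in flight,
// which needs 5 instances at containerConcurrency = 80.
const steadyState = instancesNeeded(2000, 200, 80);

// With maxScale = 200 the ceiling is 80,000 req/s at the same latency.
const ceilingRps = maxSustainableRps(200, 80, 200);
```

The same arithmetic shows why latency regressions are expensive on Cloud Run: doubling average latency doubles the instance count for the same traffic.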
Cloud Functions 2nd gen — a function model on top of Cloud Run
Cloud Functions 2nd gen (renamed Cloud Run functions in 2024) is effectively a function runtime abstracted on top of Cloud Run. So it inherits Cloud Run's 60-minute timeout for HTTP functions, container concurrency, and Eventarc integration. 1st-gen functions still exist as a compatibility layer in May 2026, but new code defaults to 2nd gen.
Azure Functions + Container Apps — and Durable Functions
Azure has two lanes too.
- Azure Functions: Consumption / Premium / Dedicated plans. Consumption is pay-per-use like Lambda; Premium eliminates cold starts.
- Azure Container Apps: KEDA-driven container serverless. Strong Dapr integration. Direct Cloud Run competitor.
Durable Functions is a genuine Azure strength. You write orchestration workflows as C#/JS/Python code and Azure manages checkpoints and retries. Think of it as AWS Step Functions made code-friendly.
Cloudflare Workers — V8 isolates are a different dimension
Workers is fundamentally different from other serverless. Not a container, not a microVM, but a V8 isolate. Multiple users' code runs isolated inside one process, which is why cold starts are essentially zero and deploys reach 300+ PoPs automatically.
As of May 2026 the Workers ecosystem is:
- Workers: V8 isolate functions.
- Workers AI: OSS models (Llama, Mistral, Whisper, BGE embeddings, etc.) running on GPUs. Billed by usage, metered in Cloudflare's "Neurons" unit rather than per request.
- R2: S3-compatible object storage. No egress fee.
- KV: Globally distributed key-value. Eventual consistency, 60-second read cache.
- D1: SQLite-based distributed DB. Multi-region reads since 2024.
- Durable Objects: Stateful objects. Single-instance per object makes WebSocket fan-out, counters, and rate limits natural.
- Queues: At-least-once message queue.
- Vectorize: Vector DB for RAG.
- Pages Functions: Workers alongside a Pages site.
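The "single instance per object" property of Durable Objects is what makes counters and rate limits simple: all requests for one key reach the same piece of state, so no cross-node coordination is needed. The sketch below shows that idea with a plain in-memory token bucket. It is not the Durable Objects API; the `TokenBucket` class, the `allow` helper, and the limits are hypothetical, and on Workers each key would map to its own object instead of a `Map` entry.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Refill based on elapsed time, then try to spend one token.
  tryTake(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per key, standing in for "one object per key".
const buckets = new Map<string, TokenBucket>();

function allow(key: string, nowMs: number): boolean {
  let bucket = buckets.get(key);
  if (bucket === undefined) {
    bucket = new TokenBucket(2, 1, nowMs); // burst of 2, refills 1 token per second
    buckets.set(key, bucket);
  }
  return bucket.tryTake(nowMs);
}
```

The clock is passed in as `nowMs` so the logic stays deterministic and easy to test; a deployed version would read the current time itself.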
A typical Worker:
```typescript
export interface Env {
  ORDERS_KV: KVNamespace
  DB: D1Database
  AI: Ai
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const url = new URL(req.url)

    // POST /embed: compute an embedding with Workers AI.
    if (url.pathname === '/embed' && req.method === 'POST') {
      const { text } = await req.json<{ text: string }>()
      const out = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: [text] })
      return Response.json({ embedding: out.data[0] })
    }

    // GET /orders/:id: read through KV, falling back to D1 and caching for 60 s.
    if (url.pathname.startsWith('/orders/')) {
      const id = url.pathname.split('/')[2]
      const cached = await env.ORDERS_KV.get(id, 'json')
      if (cached) return Response.json(cached)
      const row = await env.DB.prepare('SELECT * FROM orders WHERE id = ?').bind(id).first()
      if (row) await env.ORDERS_KV.put(id, JSON.stringify(row), { expirationTtl: 60 })
      return row ? Response.json(row) : new Response('not found', { status: 404 })
    }

    return new Response('ok')
  },
}
```
Workers also has limits. On paid plans, CPU time per request defaults to 30 seconds and can be raised to 5 minutes (the free plan allows only about 10 ms). Memory is 128 MB per isolate, and Node.js compatibility exists but some npm packages still won't run. KV is also eventually consistent: "the value I just wrote may not appear immediately" is something you always plan around.
Cloudflare D1 multi-region — finally practical in 2026
D1 turns SQLite into a distributed system. Read replicas landed in 2024 and multi-region writes opened in beta in 2025, but as of May 2026 the stable configuration is still globally replicated reads with writes routed to a single primary region. That makes it interesting for Korea/Southeast-Asia deployments like Coupang's, but workloads needing strong consistency, such as payments and finance, should still sit on RDS or Spanner.
Deno Deploy — the standards-friendly cousin of Workers
Deno Deploy uses a very similar model to Cloudflare Workers (V8 isolates, global edge) but runs on the Deno runtime, so ES modules, native TypeScript, and Web Standard APIs (fetch, Request, Response, WebSocket) are first-class.
```typescript
Deno.serve((req: Request) => {
  const url = new URL(req.url)
  if (url.pathname === '/hello') {
    return new Response(JSON.stringify({ hello: 'world' }), {
      headers: { 'content-type': 'application/json' },
    })
  }
  return new Response('not found', { status: 404 })
})
```
Deno KV went GA in 2024 and Deno Queues shipped in 2025. The headline shift: a full-stack serverless backend in a single Deno file.
Bun Edge — the new 2026 player
Bun hit 1.0 in September 2023, and Bun Edge, a hosting service in the Workers/Deno Deploy slot, has been in beta since late 2025. Bun's strong ONNX and TensorRT integrations make it particularly good for edge ML inference. As of May 2026 it's still beta, so multi-tenant production deploys deserve some caution.
Vercel Functions / Edge / Fluid Compute — the Next.js default
Vercel is effectively the default deploy target for Next.js users. As of May 2026 it offers three modes.
- Vercel Functions: An abstraction over AWS Lambda. Node.js functions.
- Vercel Edge Functions / Edge Middleware: Edge isolates similar to Cloudflare Workers.
- Vercel Fluid Compute: Shipped in early 2025. The same instance handles multiple concurrent requests, cutting cold starts and idle cost together.
Fluid Compute matters most when "the Next.js server component is waiting on an external API and the same lambda also serves another request." For AI chatbots and other I/O-heavy workloads, the impact is direct.
Netlify Edge / Background Functions — edge on Deno
Netlify Edge Functions run on Deno as the backend, so URL imports, native TypeScript, and Web Standard APIs feel natural. Netlify Functions (non-edge) sit on top of AWS Lambda, and Background Functions support 15-minute async jobs.
```typescript
import type { Context } from 'https://edge.netlify.com'

export default async (req: Request, ctx: Context) => {
  // Geo data is attached by the edge; fall back to 'US' when it is missing.
  const country = ctx.geo?.country?.code ?? 'US'
  return new Response(JSON.stringify({ country }), {
    headers: { 'content-type': 'application/json' },
  })
}
```
Fastly Compute@Edge — the other road, via Wasm
Fastly Compute@Edge runs WebAssembly instead of V8 isolates. That means Rust, AssemblyScript, Go (TinyGo), and JavaScript (SpiderMonkey on Wasm) all run. Cold starts measure in microseconds.
```rust
use fastly::http::{Method, StatusCode};
use fastly::{Error, Request, Response};

#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
    // Route on method and path; everything else is a 404.
    match (req.get_method(), req.get_path()) {
        (&Method::GET, "/") => Ok(Response::from_status(StatusCode::OK).with_body("hello edge")),
        _ => Ok(Response::from_status(StatusCode::NOT_FOUND).with_body("not found")),
    }
}
```
The Wasm Component Model standardized through 2024–2025, and "build once, run anywhere" is getting close to literal truth.
Akamai EdgeWorkers — edge functions on a traditional CDN
Akamai is a traditional CDN, but it runs its own edge function runtime under the name EdgeWorkers. It's V8-based and optimized for CDN-friendly workloads — response transforms, A/B tests, token validation. Less "write a full backend" and more "light logic in front of a CDN."
Fermyon Spin — Wasm-native serverless
Fermyon Spin is the headline Wasm-native serverless play. You build per-function Wasm components and deploy them. Cold starts are around 1 ms and polyglot support covers Rust, Go, JS, Python, and .NET.
```toml
spin_manifest_version = 2

[application]
name = "orders-api"
version = "0.1.0"

[[trigger.http]]
route = "/orders/..."
component = "orders"

[component.orders]
# Recent Rust toolchains name the WASI target wasm32-wasip1 (formerly wasm32-wasi).
source = "target/wasm32-wasip1/release/orders.wasm"
allowed_outbound_hosts = ["https://api.stripe.com"]

[component.orders.build]
command = "cargo build --target wasm32-wasip1 --release"
```
Spin runs on Spin Cloud, Fermyon Cloud, and SpinKube (on Kubernetes) almost identically. It doesn't yet match Lambda/Cloud Run for awareness, but it is the most committed player on the Wasm Component Model.
Fly.io — edge "full VMs" on Firecracker
Fly.io is unlike the other edge platforms. It spreads Firecracker microVMs across 30+ regions worldwide and runs your Dockerfile inside them. In other words, it's "real containers close to the edge." WebSocket, TCP, UDP, persistent disks (Fly Volumes), and even Postgres clusters are all yours to run.
```toml
app = "orders-api"
primary_region = "nrt"

[build]
  image = "ghcr.io/myorg/orders-api:2026.05"

[http_service]
  internal_port = 3000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 512
```
If most of your traffic is in Korea or Japan, nrt (Tokyo) is the natural primary region; run `fly platform regions` to see which other nearby regions are currently offered before planning a second one. Compared to Cloud Run, Fly.io is more flexible for persistent connections and stateful work.
Railway / Render / Koyeb / Northflank — the new "git push to deploy" generation
These four are all "connect GitHub repo, auto build, auto deploy" PaaSes.
- Railway: Fastest to start. Postgres, Redis, Mongo attach with one click. Usage-based pricing.
- Render: The legitimate Heroku heir. Static sites, services, cron jobs, and workers from one UI.
- Koyeb: Deploys containers to a global edge. Closer to Fly.io but with Railway-grade UX.
- Northflank: Multi-cloud and BYOC (bring your own cluster) is the differentiator. More enterprise-leaning.
In 2026, most refugees from Heroku land at Railway or Render.
Cold starts again — how does SnapStart actually work?
Lambda SnapStart is not a simple memory cache. After the function finishes initialization, the Firecracker microVM's memory pages are saved as a snapshot, and on a cold request that snapshot restores into a fresh microVM. The wins are biggest on heavy-init runtimes like the JVM or CLR. One gotcha: the snapshot captures memory state in full, so a database connection opened during init may be dead by the time it's restored.
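One way to defuse the stale-connection gotcha is to validate a connection on use instead of trusting whatever was opened during init. The sketch below uses a simulated driver so it runs without a database; `DbConnection`, `LazyConnection`, and the liveness check are hypothetical stand-ins for whatever your client library provides. Lambda also documents runtime hooks that run before the snapshot and after restore, which are the other common place to reopen connections.

```typescript
// Minimal shape of a connection; a real driver's API will differ.
interface DbConnection {
  isAlive(): boolean;
  query(sql: string): string;
}

// Hands out a connection, reopening it when the cached one has gone stale,
// for example after a snapshot restore.
class LazyConnection {
  private conn: DbConnection | null = null;
  private readonly connect: () => DbConnection;

  constructor(connect: () => DbConnection) {
    this.connect = connect;
  }

  get(): DbConnection {
    let conn = this.conn;
    if (conn === null || !conn.isAlive()) {
      conn = this.connect();
      this.conn = conn;
    }
    return conn;
  }
}

// Simulated driver: a counter for opened connections and a flag for a dead socket.
let opened = 0;
let socketDead = false;
const db = new LazyConnection(() => {
  opened += 1;
  socketDead = false;
  return {
    isAlive: () => !socketDead,
    query: (sql: string) => `ran: ${sql}`,
  };
});

db.get().query("SELECT 1"); // init phase: opens connection #1
socketDead = true;          // snapshot taken, later restored: the socket is gone
db.get().query("SELECT 1"); // first invoke after restore: reopens as #2
```

The extra liveness check costs a little on every call, which is the trade-off against the restore-hook approach.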
WebSocket and streaming — how far can each one go?
WebSockets and SSE remain awkward on serverless in 2026.
- AWS Lambda: A vanilla function can't accept WebSockets. You pair it with API Gateway WebSocket. Function URL's RESPONSE_STREAM mode handles SSE and streaming responses.
- Cloud Run: HTTP/2 + WebSocket officially supported. Persistent connections fit within the 60-minute request budget.
- Cloudflare Workers: WebSocket officially supported. Combined with Durable Objects the fan-out pattern feels natural.
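- Vercel Edge: SSE and streaming responses work well. Vercel's documentation has historically said its functions cannot act as a WebSocket server and pointed to third-party realtime providers, so verify current support before planning around it.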
- Fly.io: Full TCP, so WebSocket, gRPC, and even UDP are fair game.
In short: if you're sure you want to hold a WebSocket pool at the edge, Cloudflare Workers + Durable Objects or Fly.io are the first picks.
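The fan-out pattern itself is small once a single process owns all the sockets for a room, which is what a Durable Object or a Fly.io machine gives you. A minimal sketch follows, assuming a `SocketLike` interface with only a `send` method; the `Room` class and its behavior on send failure are illustrative choices, not any platform's API.

```typescript
interface SocketLike {
  send(message: string): void;
}

class Room {
  private readonly members = new Map<string, SocketLike>();

  join(id: string, socket: SocketLike): void {
    this.members.set(id, socket);
  }

  leave(id: string): void {
    this.members.delete(id);
  }

  size(): number {
    return this.members.size;
  }

  // Send to everyone except the sender; drop sockets that fail to send.
  broadcast(fromId: string, message: string): number {
    let delivered = 0;
    this.members.forEach((socket, id) => {
      if (id === fromId) return;
      try {
        socket.send(message);
        delivered += 1;
      } catch (_err) {
        this.members.delete(id);
      }
    });
    return delivered;
  }
}

// Fake sockets: one sender, one that records messages, one that is already closed.
const inbox: string[] = [];
const room = new Room();
room.join("alice", { send: () => {} });
room.join("bob", { send: (m) => { inbox.push(m); } });
room.join("ghost", { send: () => { throw new Error("socket closed"); } });

const deliveredCount = room.broadcast("alice", "order 42 shipped");
```

After the broadcast, only `bob` has received the message and the closed `ghost` socket has been evicted from the room.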
Pricing models — what are you actually paying for?
Serverless pricing splits into a few models.
- Request + execution time (GB-second): AWS Lambda, Cloud Functions, Azure Functions.
- Request + CPU time (per request): Cloudflare Workers, Vercel Functions.
- Container runtime (vCPU-second + memory-second): Cloud Run, App Runner, Container Apps.
- VM runtime (per minute): Fly.io, Railway, Render.
Approximate May 2026 unit prices (consult each vendor's pricing page for accuracy):
```json
{
  "aws_lambda": { "per_million_requests": "$0.20", "per_gb_second": "$0.0000166667" },
  "cloud_run": { "per_million_requests": "$0.40", "per_vcpu_second": "$0.000024", "per_gib_second": "$0.0000025" },
  "cloudflare_workers": { "per_million_requests": "$0.30 paid", "first_10m_free": true, "per_million_cpu_ms": "$0.02" },
  "vercel_pro_functions": { "included_gb_hours": "1000", "overage_gb_hour": "$0.18" },
  "fly_io_shared_1x_cpu": { "per_month": "~$1.94", "memory_256mb_month": "~$0.50" }
}
```
One trick: Cloudflare Workers only counts CPU time, not I/O wait. That's hugely friendly to LLM chatbot workloads that mostly wait on external APIs.
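To make the wall-clock versus CPU-time difference concrete, here is a rough estimate using the Lambda and Workers unit prices listed above. The traffic profile is hypothetical (10 million requests a month, 2 seconds of wall-clock per request of which about 5 ms is CPU, 512 MB of Lambda memory), and the sketch ignores free tiers, included quotas, and any API Gateway or egress charges.

```typescript
// Unit prices copied from the approximate figures above.
const PRICES = {
  lambda: { perMillionRequests: 0.2, perGbSecond: 0.0000166667 },
  workers: { perMillionRequests: 0.3, perMillionCpuMs: 0.02 },
};

// Lambda bills wall-clock duration multiplied by configured memory.
function lambdaMonthlyUsd(requests: number, avgDurationMs: number, memoryGb: number): number {
  const requestCost = (requests / 1e6) * PRICES.lambda.perMillionRequests;
  const gbSeconds = ((requests * avgDurationMs) / 1000) * memoryGb;
  return requestCost + gbSeconds * PRICES.lambda.perGbSecond;
}

// Workers bills CPU time only; time spent awaiting I/O is not counted.
function workersMonthlyUsd(requests: number, avgCpuMs: number): number {
  const requestCost = (requests / 1e6) * PRICES.workers.perMillionRequests;
  const cpuCost = ((requests * avgCpuMs) / 1e6) * PRICES.workers.perMillionCpuMs;
  return requestCost + cpuCost;
}

const lambdaUsd = lambdaMonthlyUsd(10e6, 2000, 0.5); // about $168.67
const workersUsd = workersMonthlyUsd(10e6, 5);       // about $4.00
```

The gap comes almost entirely from the 10 million GB-seconds Lambda bills while the function waits on the upstream API, which is the scenario where CPU-time billing pays off.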
Edge ML inference — the new 2026 slot
Running ML inference at the edge took off between 2024 and 2026.
- Cloudflare Workers AI: Calls 30+ models (Llama 3, Mistral, Whisper, BGE embeddings, Stable Diffusion, etc.) from global GPUs. Billed by usage rather than per request.
- Vercel AI SDK: A single API across OpenAI, Anthropic, Google, Cohere, and local models. Streaming, tool calls, and RSC integration included.
- Bun + ONNX Runtime: Bun Edge runs ONNX models inline. Great for embeddings, classification, OCR.
Running inference at the edge generates tokens close to the user and streams them straight back from the same PoP. With that setup, a P50 time-to-first-token under 100 ms is within reach.
Korea case studies — a slice of serverless at Toss and Coupang
Here's a quick public-source summary of how Korea uses serverless.
- Toss: Parts of payment post-processing (receipt sending, settlement notifications, anomaly hooks) run on AWS Lambda. Suited to traffic that jumps from 0 to a thousand RPS in seconds.
- Coupang: Lambda backstops Black-Friday-scale spike traffic. Main compute is ECS/EKS, but async work uses SQS + Lambda.
- KakaoBank/LINE Bank: Internal tools, notifications, and batch jobs on serverless. Core payments stay on containers.
The pattern: serverless lives where "0 to 1000 spikes happen often."
Japan case studies — LY (LINE Yahoo), Mercari, Sansan
Japan has more public material.
- LY (LINE Yahoo): Message post-processing, notification routing, and parts of LINE Mini App backends use AWS Lambda alongside Cloud Functions.
- Mercari: Parts of search and recommendation backends run on Cloud Run. Go services dominate, with container concurrency and Cloud Tasks async patterns as the norm.
- Sansan: Business-card OCR post-processing flows run on Cloud Run + Cloud Tasks. Some Functions 2nd gen too.
All three pattern-match: "async or spiky workloads next to the main system."
Build once, run anywhere — how real is it in 2026?
"Build once, run everywhere" is the promise of the Wasm Component Model (WIT/WASI Preview 2). As of May 2026 the following is practically real.
- A Fermyon Spin Wasm component runs nearly identically on Fastly Compute@Edge, Spin Cloud, and SpinKube.
- Cloudflare Workers can load Wasm modules inside its V8 isolates, but it does not run Component Model components unchanged.
- Vercel Edge and Netlify Edge sit on V8 isolates with Wasm as a side option.
"One build that runs everywhere" isn't fully there yet, but the closest available form is consolidating on the Wasm Component Model.
Platform comparison matrix — as of May 2026
| Platform | Runtime | Isolation | Max duration | Max memory | Regions/PoPs | Pricing model |
|---|---|---|---|---|---|---|
| AWS Lambda | Node/Python/Java/.NET/Go/Ruby | Firecracker microVM | 15 min | 10 GB | 30+ regions | Requests + GB-sec |
| AWS App Runner | Containers | Firecracker | Unbounded | 12 GB | 12+ regions | vCPU-sec |
| Google Cloud Run | Containers | gVisor | 60 min | 32 GB | 35+ regions | vCPU-sec + req |
| Azure Functions | Various | Containers | 60 min | 14 GB | 60+ regions | Requests + GB-sec |
| Cloudflare Workers | V8 isolate | Isolate | 5 min (paid) | 128 MB | 300+ PoPs | Requests + CPU-ms |
| Vercel Edge | V8 isolate | Isolate | 30 s (5 min streaming) | 128 MB | 30+ PoPs | GB-hour |
| Deno Deploy | V8 isolate | Isolate | 60 s | 512 MB | 35+ PoPs | Requests + core-ms |
| Fastly Compute@Edge | Wasm | Wasm sandbox | 60 s | 128 MB | 100+ PoPs | Requests |
| Fermyon Spin | Wasm | Wasm sandbox | Per component | Per component | Host-dependent | Host-dependent |
| Fly.io | Containers | Firecracker microVM | Unbounded | 256 MB–256 GB | 30+ regions | Per-minute VM |
| Railway | Containers | KVM | Unbounded | 32 GB | 4+ regions | Hours + GB-RAM |
Where serverless is still not the answer
Finally, in 2026 there are still slots where serverless isn't the right answer.
- Long-running GPU inference: LLMs with multi-second model load. Self-managed GPU instances, SageMaker, Modal, or Replicate are better.
- Long-lived persistent connections with state: Large-scale multiplayer game servers. Fly.io or EC2/GKE is more natural.
- Trading systems needing tail latency consistently under about 10 ms: Exchanges, ad bidding. Bare metal or dedicated.
- Analytics/joins on tens of GB of memory: Lambda's 10 GB is not enough. Fargate or EKS.
Just as the right slots for serverless got clearer, so did the wrong ones.
References
- AWS Lambda docs: https://docs.aws.amazon.com/lambda
- AWS Lambda Powertools: https://docs.powertools.aws.dev
- Google Cloud Run docs: https://cloud.google.com/run/docs
- Azure Functions docs: https://learn.microsoft.com/azure/azure-functions
- Cloudflare Workers docs: https://developers.cloudflare.com/workers
- Deno Deploy docs: https://docs.deno.com/deploy
- Vercel Functions docs: https://vercel.com/docs/functions
- Netlify Edge Functions: https://docs.netlify.com/edge-functions/overview
- Fastly Compute@Edge: https://developer.fastly.com/learning/compute
- Fermyon Spin: https://developer.fermyon.com/spin
- Fly.io docs: https://fly.io/docs
- Railway docs: https://docs.railway.app
- Render docs: https://render.com/docs
- Koyeb docs: https://www.koyeb.com/docs