
File Upload & Storage Tools 2026 — Uppy / UploadThing / Pinata IPFS / Cloudflare R2 / Backblaze B2 / Tigris / MinIO / tus protocol Deep Dive

"Receiving and storing a file looks simple until a single 1GB video fails after thirty retries — at that moment every abstraction collapses. In 2026 file upload is no longer about multipart/form-data." — Transloadit Uppy team, December 2024 interview

Almost every product needs file upload, yet it remains one of the most frequently broken features in production. A user uploading a 1GB video over LTE has the connection drop at 99% and starts over from zero. A hundred users upload PDFs concurrently and your Node process OOMs. And someone, always, tries to drop a .exe or a ZIP bomb on you.

As of May 2026 the file upload and object storage ecosystem has split into four clear categories — Client libraries, Managed SaaS, Self-hosted OSS, and Decentralized storage — with the tus protocol settling in as the de facto standard for resumable transport. This essay walks through Uppy, UploadThing, Filestack, Cloudinary, ImageKit, AWS S3 Presigned URLs, Cloudflare R2, Backblaze B2, Tigris, Wasabi, Vercel Blob, Bunny Storage, MinIO, SeaweedFS, Garage, Pinata, Filebase, Storj DCS, Arweave, tus, ClamAV, and Sublime Security in one pass.

1. The 2026 File Upload Map — Four Categories

The infrastructure breaks down by role into four boxes.

| Category | Representative products | Role |
| --- | --- | --- |
| Client library | Uppy, UploadThing, Filestack JS, Cloudinary Widget, ImageKit JS | Handle chunking, retries, and UI in the browser/mobile |
| Managed SaaS | UploadThing, Filestack, Cloudinary, ImageKit, Vercel Blob, Bunny | Single API that bundles upload + CDN + transforms |
| S3-compatible / Self-hosted | AWS S3, Cloudflare R2, Backblaze B2, Wasabi, Tigris, MinIO, SeaweedFS, Garage | Object storage infrastructure |
| Decentralized | Pinata, Filebase, Storj DCS, Arweave | Storage on top of IPFS, Sia, Filecoin, or Arweave |

This split matters because where you store and how you transport are independent decisions. Uppy on the client can ship to S3 Presigned URLs, Cloudflare R2, MinIO, or even Pinata IPFS on the back end. Full-stack SaaS like UploadThing bundle both, but underneath they remain a Transport + Storage composition.

The second branching point is egress pricing. AWS S3 charges roughly 0.09 USD per GB of outbound traffic, while Cloudflare R2 and Backblaze B2 charge essentially nothing. For video- or image-heavy workloads pushing content through CDNs all day, this gap routinely turns into thousands of dollars per month.

The other 2026 keyword is resumable. The tus protocol went through an IETF Internet-Draft in 2024 and is now treated as the standard. Uppy, UploadThing, Vercel Blob, and Cloudflare R2 all support tus as a first-class transport, meaning a 1GB file at 99% that loses Wi-Fi can pick up exactly where it left off.

2. Uppy

Uppy (uppy.io) is the modular open-source uploader released by Transloadit in 2017. As of May 2026 it has crossed 30k GitHub stars and is effectively the default in the client-side uploader category.

The core philosophy is a plugin architecture. The core ships under 1MB and you attach only the features you need.

// Uppy + Dashboard + tus + webcam + Instagram
// (This example uploads via Tus, so the unused @uppy/aws-s3 import is dropped;
// swap Tus for @uppy/aws-s3 to upload straight to S3 instead.)
import Uppy from '@uppy/core'
import Dashboard from '@uppy/dashboard'
import Tus from '@uppy/tus'
import Webcam from '@uppy/webcam'
import Instagram from '@uppy/instagram'

const uppy = new Uppy({
  restrictions: {
    maxFileSize: 1024 * 1024 * 1024, // 1GB
    maxNumberOfFiles: 10,
    allowedFileTypes: ['image/*', 'video/*', '.pdf', '.zip'],
  },
  autoProceed: false,
})
  .use(Dashboard, { inline: true, target: '#drag-drop' })
  .use(Webcam, { target: Dashboard })
  .use(Instagram, { companionUrl: 'https://companion.acme.dev' })
  .use(Tus, {
    endpoint: 'https://uploads.acme.dev/files/',
    chunkSize: 5 * 1024 * 1024, // 5MB chunks
    retryDelays: [0, 1000, 3000, 5000],
  })

That single block gives you drag-and-drop UI, progress bars, pause and resume, chunked transfer, retries, webcam capture, and Instagram import. Add @uppy/google-drive, @uppy/dropbox, or @uppy/onedrive for cloud imports.

The server-side counterpart is Companion (@uppy/companion), which holds OAuth tokens and streams from the user's Google Drive or Dropbox directly to S3 server-to-server. That avoids the wasteful "browser downloads, then re-uploads" pattern for large remote files.

Uppy 4.0, released in May 2025, was a TypeScript-first rewrite with official adapters for React, Vue, and Svelte. In November 2025 the Transloadit managed transform back end integration (@uppy/transloadit) tightened further, letting you trigger video transcoding, image resizing, and virus scanning on the back of an upload as a single pipeline.

Uppy is the right pick when a team already owns its back end and storage but wants client-side UX quickly. It sits in the middle ground where buying UploadThing whole feels heavy, but hand-rolling multipart upload feels primitive.

3. UploadThing — Vercel-friendly, TypeScript-first

UploadThing (uploadthing.com) is a TypeScript-first upload SaaS created in 2023 by Theo Browne of t3.gg. It was accepted into Y Combinator in 2024 and raised a Series A in 2025, and it is the fastest-growing upload product in the Vercel ecosystem.

Its differentiator is zero-config Next.js integration and a type-safe file router.

// app/api/uploadthing/core.ts
import { createUploadthing, type FileRouter } from 'uploadthing/next'
import { auth } from '@/auth'
import { db } from '@/db' // hypothetical data-access module (for example a Prisma client)

const f = createUploadthing()

export const ourFileRouter = {
  imageUploader: f({ image: { maxFileSize: '4MB', maxFileCount: 4 } })
    .middleware(async ({ req }) => {
      const session = await auth()
      if (!session?.user) throw new Error('Unauthorized')
      return { userId: session.user.id }
    })
    .onUploadComplete(async ({ metadata, file }) => {
      await db.media.create({
        data: { userId: metadata.userId, url: file.url, key: file.key },
      })
    }),

  pdfUploader: f({ pdf: { maxFileSize: '32MB' } })
    .middleware(async () => ({ userId: 'pdf-user' }))
    .onUploadComplete(async ({ file }) => {
      console.log('PDF uploaded:', file.url)
    }),
} satisfies FileRouter

export type OurFileRouter = typeof ourFileRouter

On the client, <UploadButton endpoint="imageUploader" /> is essentially the whole integration. The router type flows into the component, so a typo in the endpoint name fails at compile time rather than at runtime.

UploadThing was originally built on AWS S3 and Cloudflare, but starting in 2025 it supports BYOS (Bring Your Own Storage). Pick R2, S3, or Vercel Blob as the back end while keeping the transformation pipeline.

The UTApi, added in January 2026, lets the server manipulate files directly — deletion, metadata updates, signed URLs all in a single SDK call.

import { UTApi } from 'uploadthing/server'

const utapi = new UTApi()
await utapi.deleteFiles(['key1', 'key2'])
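await utapi.getFileUrls(['key1']) // resolves file keys to their URLs
await utapi.getSignedURL('key1', { expiresIn: 60 * 60 }) // temporary signed URL for a private file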

UploadThing fits solo developers and small teams on Next.js + Vercel + Auth.js best. The promise mirrors Stripe's "drop in a checkout in five minutes" — drop in an upload flow in five minutes.

4. Filestack / Cloudinary / ImageKit — Managed Media

The managed media category bundles upload + CDN + transformation as one product. Filestack and Cloudinary have been competing since 2012, and ImageKit joined them in 2017.

Filestack (filestack.com) is the oldest of the three, founded in 2012 and acquired by Idera in 2019. Its strength is the Picker widget that imports directly from more than 40 external sources (Facebook, Instagram, Box, Dropbox, OneDrive, etc.) and the Transformation Engine that handles images, documents, and videos through one API.

<script src="//static.filestackapi.com/filestack-js/3.x.x/filestack.min.js"></script>
<script>
  const client = filestack.init('YOUR_API_KEY')
  client.picker({
    accept: ['image/*', 'application/pdf'],
    fromSources: ['local_file_system', 'instagram', 'gmail', 'webcam'],
    maxFiles: 5,
    transformations: { crop: { aspectRatio: 1 / 1, force: true } },
  }).open()
</script>

Cloudinary (cloudinary.com) is the largest managed media vendor in 2026, founded in Israel in 2012. The core is URL-based image and video transformation. A URL like https://res.cloudinary.com/demo/image/upload/w_300,h_300,c_fill,f_auto,q_auto/sample.jpg will be encoded automatically as WebP or AVIF, cropped, cached, and served from CDN.

The upload widget (cloudinary.openUploadWidget) supports 14 external sources, webcam, and screenshot capture, and since 2024 includes AI-driven auto-tagging, content-aware cropping (c_auto), and background removal.

ImageKit (imagekit.io) is a younger challenger from India, founded in 2017, growing fast on price and performance. It offers a Cloudinary-like URL transformation API at roughly one-third the price, and crucially supports BYOS: connect an existing AWS S3, R2, or Backblaze bucket as the origin. Both direct upload (imagekit.upload(...)) and server-side token issuing are supported.

Side-by-side:

| Product | Strength | Weakness | Price |
| --- | --- | --- | --- |
| Filestack | Most external sources, document transforms | Pricey, heavy | From 99 USD/mo |
| Cloudinary | Deep image/video transforms, AI features | Pricing complexity, learning curve | Free 25 credits, then usage |
| ImageKit | BYOS, cost efficiency | Slightly thinner transforms | Free 20GB, then usage |

5. AWS S3 Presigned URLs — The DIY Classic

The oldest and still most common pattern is the AWS S3 Presigned URL. The flow has barely changed since S3 launched in 2006: the server hands the client a short-lived signed PUT URL, and the browser PUTs directly to S3.

// Server: Next.js Route Handler
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'

const s3 = new S3Client({ region: 'us-east-1' })

export async function POST(req: Request) {
  const { filename, contentType } = await req.json()
  const key = `uploads/${Date.now()}-${filename}`
  const command = new PutObjectCommand({
    Bucket: 'acme-uploads',
    Key: key,
    ContentType: contentType,
  })
  const url = await getSignedUrl(s3, command, { expiresIn: 60 * 5 })
  return Response.json({ url, key })
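}

// Client
async function upload(file: File) {
  const res = await fetch('/api/presign', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ filename: file.name, contentType: file.type }),
  })
  if (!res.ok) throw new Error(`Presign failed: ${res.status}`)
  const { url, key } = await res.json()
  // The Content-Type sent here must match the one signed on the server
  const put = await fetch(url, {
    method: 'PUT',
    headers: { 'Content-Type': file.type },
    body: file,
  })
  if (!put.ok) throw new Error(`Upload failed: ${put.status}`)
  return key
}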

The strength is that the server is bypassed for bytes. A 5GB upload consumes zero bandwidth or memory on your Next.js server because the entire payload streams directly to S3. Authentication, validation, and logging happen only at the signing step.

The weakness is that you implement retries and progress yourself. The fetch API does not expose upload progress, so you typically reach for XMLHttpRequest or axios with onUploadProgress, and files larger than 5GB force Multipart Upload. In practice almost everyone uses this pattern via Uppy's @uppy/aws-s3 plugin, which absorbed the former @uppy/aws-s3-multipart in Uppy 4.
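To make the multipart requirement concrete, here is a minimal sketch of how a client might split a file into parts. It assumes the documented S3 limits (every part except the last is at least 5 MiB, and an upload has at most 10,000 parts). The helper name `planParts` and the 8 MiB default are illustrative choices, not part of any SDK.

```typescript
// S3 multipart limits: every part except the last must be at least 5 MiB,
// and a single upload may contain at most 10,000 parts.
const MIN_PART_SIZE = 5 * 1024 * 1024;
const MAX_PARTS = 10_000;

interface PartRange {
  partNumber: number; // 1-based, as the S3 API expects
  start: number; // inclusive byte offset
  end: number; // exclusive byte offset
}

// planParts is a hypothetical helper name used for illustration only.
function planParts(fileSize: number, preferredPartSize = 8 * 1024 * 1024): PartRange[] {
  if (!Number.isInteger(fileSize) || fileSize <= 0) {
    throw new RangeError("fileSize must be a positive integer");
  }
  // Grow the part size when the preferred size would need more than 10,000 parts.
  const minForPartCount = Math.ceil(fileSize / MAX_PARTS);
  const partSize = Math.max(MIN_PART_SIZE, preferredPartSize, minForPartCount);

  const parts: PartRange[] = [];
  for (let start = 0, n = 1; start < fileSize; start += partSize, n++) {
    parts.push({ partNumber: n, start, end: Math.min(start + partSize, fileSize) });
  }
  return parts;
}
```

Each range would then map to `file.slice(start, end)` and be sent with its own presigned UploadPart URL. A 20 MiB file with the 8 MiB default yields three parts, the last one 4 MiB long.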

The exact same pattern works on Cloudflare R2, Backblaze B2, Wasabi, MinIO, and Tigris. "Follows the S3 API" is the single line that standardized the entire object storage market.

6. Cloudflare R2 — No Egress Fees

Cloudflare R2 (developers.cloudflare.com/r2) is the S3-compatible object store that went GA in 2022 with a single marketing message that shook the market: zero egress charges.

Serving 1TB of images per month from AWS S3 costs about 90 USD in transfer fees; the same workload from R2 costs 0 USD. Storage is also 0.015 USD per GB versus 0.023 USD on S3 — 35% cheaper at rest.

// R2 reuses the S3 SDK as-is
import { S3Client } from '@aws-sdk/client-s3'

const ACCOUNT_ID = process.env.R2_ACCOUNT_ID! // assumed env var holding the Cloudflare account ID
const r2 = new S3Client({
  region: 'auto',
  endpoint: `https://${ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY!,
    secretAccessKey: process.env.R2_SECRET_KEY!,
  },
})

The real strength of R2 is deep integration with Cloudflare Workers, Pages, and the global CDN. Binding an R2 bucket inside a Worker lets you call env.MY_BUCKET.put(key, body) without any SDK, all running on the same global network.

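// Cloudflare Workers + R2
// R2Bucket comes from @cloudflare/workers-types; MY_BUCKET is the binding name.
interface Env {
  MY_BUCKET: R2Bucket
}

export default {
  async fetch(req: Request, env: Env) {
    const key = new URL(req.url).pathname.slice(1)
    if (req.method === 'PUT') {
      // No auth check here: protect this route before exposing it publicly
      await env.MY_BUCKET.put(key, req.body, {
        httpMetadata: { contentType: req.headers.get('content-type') ?? undefined },
      })
      return new Response('ok')
    }
    const obj = await env.MY_BUCKET.get(key)
    if (!obj) return new Response('not found', { status: 404 })
    const headers = new Headers()
    obj.writeHttpMetadata(headers) // copies the stored content-type and related headers
    return new Response(obj.body, { headers })
  },
}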

The 2024 addition of R2 Custom Domain + Cache Reserve auto-caches R2 objects in Cloudflare's CDN, so after the first hit even R2 GET costs disappear. The 2025 R2 Data Catalog (Iceberg compatible) extends R2 into a lakehouse back end.

The weakness is that R2 is not 100% S3-compatible. Object Lambda, S3 Select, S3 Inventory, and similar advanced features are missing. For the 90% case of PUT/GET/LIST plus Multipart and Presigned URLs, it is functionally identical.

7. Backblaze B2 / Wasabi / Tigris — S3 Compatible + Affordable

The S3-compatible value tier existed long before R2, and each player holds a different position.

Backblaze B2 (backblaze.com/b2) is the oldest S3 alternative, launched in 2015. At 0.006 USD per GB it is cheaper than R2, and thanks to the Bandwidth Alliance with Cloudflare, traffic flowing from B2 into Cloudflare CDN is free. As of May 2026 B2 holds the largest share of the media backup and archive market.

# B2 CLI
b2 authorize-account $KEY_ID $APP_KEY
b2 upload-file my-bucket video.mp4 videos/2026/may.mp4
b2 download-file-by-name my-bucket videos/2026/may.mp4 ./local.mp4

Wasabi (wasabi.com) is an S3-compatible storage service launched in 2017, priced at 0.0068 USD per GB with free egress. Built by AWS S3 alumni, its compatibility is unusually high and the pricing model is a single "Hot Storage" tier. The downside is a 90-day minimum retention on objects, making it inefficient for short-lived transactional data.

Tigris (tigrisdata.com) is a newer S3-compatible service launched in 2024 and backed by Fly.io. Its differentiator is automatic global replication: PUT an object once and it propagates lazily to the nearest regions worldwide, so the first GET is fast wherever it lands. Inside Fly.io apps routing costs are zero.

// Tigris uses the S3 SDK unchanged
import { S3Client } from '@aws-sdk/client-s3'

const tigris = new S3Client({
  region: 'auto',
  endpoint: 'https://fly.storage.tigris.dev',
  credentials: { accessKeyId: KEY, secretAccessKey: SECRET },
})

Pricing comparison (May 2026, standard hot tier):

| Product | Storage (per GB-month) | Egress (per GB) | Notes |
| --- | --- | --- | --- |
| AWS S3 Standard | 0.023 USD | 0.09 USD | The baseline |
| Cloudflare R2 | 0.015 USD | 0 USD | Workers + CDN integrated |
| Backblaze B2 | 0.006 USD | 0 USD (via Cloudflare) | Cheapest storage |
| Wasabi | 0.0068 USD | 0 USD | 90-day minimum retention |
| Tigris | 0.02 USD | 0 USD | Auto global replication |
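
To see how these list prices play out for a given workload, a rough calculator can help. The sketch below copies the per-GB figures from the comparison above and deliberately ignores request fees, free tiers, and minimum-retention charges, so treat the output as an order-of-magnitude estimate only. The names `PRICING` and `monthlyCost` are illustrative.

```typescript
interface Pricing {
  storagePerGb: number; // USD per GB stored per month
  egressPerGb: number; // USD per GB served
}

// Per-GB list prices copied from the pricing comparison in this section.
const PRICING: Record<string, Pricing> = {
  "AWS S3 Standard": { storagePerGb: 0.023, egressPerGb: 0.09 },
  "Cloudflare R2": { storagePerGb: 0.015, egressPerGb: 0 },
  "Backblaze B2": { storagePerGb: 0.006, egressPerGb: 0 },
  "Wasabi": { storagePerGb: 0.0068, egressPerGb: 0 },
  "Tigris": { storagePerGb: 0.02, egressPerGb: 0 },
};

// Storage plus egress only; request fees and free tiers are not modeled.
function monthlyCost(p: Pricing, storedGb: number, egressGb: number): number {
  return p.storagePerGb * storedGb + p.egressPerGb * egressGb;
}

// Print a small comparison for 1,000 GB stored and 1,000 GB served per month.
for (const [name, pricing] of Object.entries(PRICING)) {
  console.log(`${name}: ${monthlyCost(pricing, 1000, 1000).toFixed(2)} USD`);
}
```

For 1,000 GB stored and 1,000 GB served, this works out to roughly 113 USD on S3 Standard versus 15 USD on R2, and almost all of the difference is egress.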

8. Vercel Blob / Bunny Storage — Platform-Built-in

Platform-native storage is growing quickly alongside the third-party world.

Vercel Blob (vercel.com/storage/blob) went GA in 2024 as Vercel's S3-compatible storage layered on Cloudflare R2 under the hood. The strength is SDK-one-liner uploads from Vercel functions plus automatic caching by the Vercel Edge CDN.

// app/api/upload/route.ts
import { put } from '@vercel/blob'

export async function POST(req: Request) {
  const file = (await req.formData()).get('file') as File
  const blob = await put(`uploads/${file.name}`, file, {
    access: 'public',
    addRandomSuffix: true,
  })
  return Response.json(blob)
}

For large files the client-direct upload pattern is also supported: @vercel/blob/client calls the server only for a token, then PUTs the body straight into Vercel Blob. That is the canonical way to bypass the 4.5MB body limit on Vercel Functions.

Bunny Storage (bunny.net/storage) is the object storage arm of the Slovenia-based Bunny.net CDN, integrating deeply with Bunny CDN. At 0.01 USD per GB with 11-region replication and automatic Pull Zone linkage, it has rapidly become the media back end for legacy CMSs like WordPress and Drupal.

# Bunny Storage API
# ($FILE_PATH rather than $PATH, which is the shell's executable search path)
curl -X PUT "https://storage.bunnycdn.com/$STORAGE_ZONE/$FILE_PATH/file.jpg" \
  -H "AccessKey: $ACCESS_KEY" \
  --data-binary @file.jpg

The shared advantage of platform-built-ins is integration latency: Vercel Blob is fastest from Vercel Edge, Bunny Storage is fastest from Bunny CDN. The shared weakness is lock-in: leave the platform and you lose the compatible SDK.

9. MinIO — Self-hosted S3-Compatible OSS

When a company wants object storage inside its own data center, the de facto standard is MinIO (min.io). Written in Go and released in 2015, it now has over 48k GitHub stars.

Its strengths are the most complete S3 API implementation and a single binary install.

# Single node (dev)
docker run -p 9000:9000 -p 9001:9001 \
  -e "MINIO_ROOT_USER=admin" \
  -e "MINIO_ROOT_PASSWORD=secret123" \
  -v /data:/data \
  quay.io/minio/minio server /data --console-address ":9001"

# Distributed cluster (4 nodes x 4 drives = one 16-drive erasure set)
minio server http://node{1...4}/data{1...4} \
  --console-address ":9001"

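MinIO uses erasure coding so partial node or disk failures do not lose data. A 16-drive erasure set at MinIO's default parity of EC:4 tolerates the loss of any 4 drives, and raising parity to the maximum of EC:8 lets reads survive 8 simultaneous drive failures at the cost of half the raw capacity. That resilience is why MinIO is strong in internal backup, ML data lakes, and CI artifact storage.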
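
The trade-off between failure tolerance and usable capacity is simple arithmetic. The sketch below assumes a Reed-Solomon-style layout in which a set of N drives holds N - M data shards and M parity shards, and it caps parity at N/2 to mirror MinIO's limit. `erasureStats` is a hypothetical helper, not a MinIO API.

```typescript
interface ErasureStats {
  dataShards: number;
  parityShards: number;
  tolerableFailures: number; // drives that can be lost while objects stay readable
  usableFraction: number; // share of raw capacity left for data
}

// erasureStats is an illustrative helper name, not part of MinIO.
function erasureStats(totalDrives: number, parityShards: number): ErasureStats {
  if (!Number.isInteger(totalDrives) || !Number.isInteger(parityShards)) {
    throw new RangeError("drive and parity counts must be integers");
  }
  if (totalDrives < 2 || parityShards < 0 || parityShards > totalDrives / 2) {
    throw new RangeError("parity must be between 0 and half the drive count");
  }
  const dataShards = totalDrives - parityShards;
  return {
    dataShards,
    parityShards,
    tolerableFailures: parityShards, // each parity shard covers one lost drive
    usableFraction: dataShards / totalDrives,
  };
}

console.log(erasureStats(16, 4)); // 12 data + 4 parity shards
console.log(erasureStats(16, 8)); // 8 data + 8 parity shards
```

On 16 drives, a parity of 4 keeps 75% of raw capacity and tolerates 4 lost drives, while a parity of 8 tolerates 8 lost drives but leaves only 50% usable.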

Since 2024 MinIO has rebranded part of its product as AIStor, an AI/ML-focused storage with direct integrations into PyTorch, TensorFlow, and NVIDIA NIM. The pitch targets petabyte-scale training data on top of the same S3-compatible API.

There is a commercial license caveat. In April 2025 MinIO moved certain features from AGPL v3 to a MinIO Commercial License. Embedding MinIO in your own SaaS may now require a commercial license, so OSS purists have started looking elsewhere — namely SeaweedFS and Garage covered next.

10. SeaweedFS / Garage (Deuxfleurs) — Other OSS Options

SeaweedFS (github.com/seaweedfs/seaweedfs) was first released in 2014. Written in Go, it is a distributed object store inspired by Facebook's Haystack paper (2010), designed to store billions of small files efficiently.

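# SeaweedFS master + volume + filer + S3 gateway
weed master -mdir /data/master &
weed volume -dir /data/volumes -port 8080 -mserver localhost:9333 &
weed filer -master localhost:9333 &   # the S3 gateway talks to the filer (default port 8888)
weed s3 -port 8333 -filer localhost:8888 &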

Highlights:

  • Strong with small files (minimal metadata overhead)
  • POSIX-like API through Filer
  • S3, WebDAV, FUSE, and HDFS gateways simultaneously exposed
  • Apache 2.0 license

Garage (garagehq.deuxfleurs.fr) is a younger S3-compatible OSS started in 2020 by the French collective Deuxfleurs. The defining trait is being optimized for low-power, low-resource environments. You can run a distributed cluster across Raspberry Pi boards on consumer broadband.

What makes Garage distinct:

  • Written entirely in Rust with very low memory usage
  • Native multi-region, multi-data-center awareness
  • AGPL v3
  • Tuned for small clusters of 3 to 10 nodes

# Garage cluster (per node)
garage server &
# After exchanging node IDs
garage layout assign -z eu-west -c 1T <node-id>
garage layout apply --version 1

MinIO targets enterprise and petabyte scale; Garage targets community-run, free-software, small infrastructure. It is a common choice for Fediverse (Mastodon) and Nextcloud operators.

11. Pinata — IPFS Hosting

From here we enter decentralized storage. Pinata (pinata.cloud) is a US-based IPFS pinning service founded in 2018, effectively the standard for NFT and Web3 storage.

IPFS (InterPlanetary File System) is a P2P protocol that identifies files by their content hash (CID). The same file has the same CID anywhere on the network, and it can be fetched from any peer that holds a copy. The catch is that someone must keep the content pinned so it is not garbage-collected, and Pinata sells that as a service.
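
The idea of content addressing can be illustrated with a plain hash. The sketch below is a simplification: it uses the hex SHA-256 of the bytes as the identifier, whereas a real IPFS CID wraps the digest in multihash and multibase encodings and, for larger files, hashes a DAG of chunks rather than the raw bytes. `contentId` is a hypothetical helper name.

```typescript
import { createHash } from "node:crypto";

// Simplified content identifier: the hex SHA-256 digest of the bytes.
// This is NOT a real IPFS CID, only an illustration of the principle.
function contentId(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

const encoder = new TextEncoder();
const first = contentId(encoder.encode("hello ipfs"));
const second = contentId(encoder.encode("hello ipfs"));
const changed = contentId(encoder.encode("hello ipfs!"));

console.log(first === second); // true: identical bytes give an identical ID
console.log(first === changed); // false: any change to the bytes changes the ID
```

Because the identifier is derived from the bytes themselves, a client can verify what it fetched without trusting the peer that served it.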

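// The IpfsHash field used below matches the pinata-web3 SDK, which exports
// PinataSDK as a named export; the newer `pinata` package returns `cid` instead.
import { PinataSDK } from 'pinata-web3'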

const pinata = new PinataSDK({
  pinataJwt: process.env.PINATA_JWT!,
  pinataGateway: 'https://gateway.pinata.cloud',
})

// Upload file + IPFS pin
const result = await pinata.upload.file(file)
console.log(result.IpfsHash) // "QmXxx..."

// User-facing gateway URL
const url = `https://gateway.pinata.cloud/ipfs/${result.IpfsHash}`

The 2024 Pinata Files API added a new mode where storage lives in Pinata's Private Network rather than the public IPFS DHT. That mode is as fast as conventional object storage, and you can promote any file to public IPFS on demand. Pinata is gradually shifting from "Web3-only SaaS" to "general object storage with an IPFS option."

In NFT workflows the canonical pattern is to pin OpenSea-compatible metadata to IPFS and embed the CID in the smart contract. That gives token metadata a permanence and censorship-resistance guarantee.
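
As a rough illustration of that pattern, the sketch below builds a metadata document whose `image` field points at a pinned file by CID. The field names (`name`, `description`, `image`, `attributes` with `trait_type` and `value`) follow the widely used OpenSea metadata convention. The helper `buildMetadata` and the placeholder CID are assumptions for illustration.

```typescript
interface NftAttribute {
  trait_type: string;
  value: string | number;
}

interface NftMetadata {
  name: string;
  description: string;
  image: string; // ipfs:// URI pointing at the pinned media file
  attributes: NftAttribute[];
}

// buildMetadata is a hypothetical helper, not part of any SDK.
function buildMetadata(
  name: string,
  description: string,
  imageCid: string,
  attributes: NftAttribute[] = [],
): NftMetadata {
  if (imageCid.trim() === "") throw new Error("imageCid must not be empty");
  return { name, description, image: `ipfs://${imageCid}`, attributes };
}

// "QmExampleCid" is a placeholder, not a real CID.
const metadata = buildMetadata("Sample #1", "An example token", "QmExampleCid", [
  { trait_type: "Background", value: "Blue" },
]);
console.log(JSON.stringify(metadata, null, 2));
```

The resulting JSON would itself be pinned, and the CID of that JSON (not of the image) is what the contract typically stores as the token URI.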

12. Filebase / Storj DCS / Arweave — Decentralized + Permanent

There are several other back ends beyond Pinata in the decentralized category.

Filebase (filebase.com) is a US company founded in 2019 that ships an S3-compatible API on top of IPFS, Sia, Skynet, and Filecoin. Users only need to know the S3 SDK; the bucket configuration picks the underlying network.

# Filebase is S3 SDK as-is
aws s3 cp ./photo.jpg s3://my-bucket/photo.jpg \
  --endpoint-url https://s3.filebase.com \
  --profile filebase

At roughly 0.0059 USD per GB it is affordable, and every object automatically gets an IPFS CID.

Storj DCS (storj.io) launched mainnet in 2018 and splits each object into 80 Reed-Solomon erasure pieces distributed across more than 15,000 nodes worldwide. Any 29 pieces reconstruct the file, and client-side encryption ensures no single node can read the data.

uplink cp video.mp4 sj://my-bucket/video.mp4

Pricing is around 0.004 USD per GB at rest and 0.007 USD per GB egress, comparable to R2. Government, research, and media customers wary of trusting AWS or Azure favor it.

Arweave (arweave.org) is in its own category. Pay once, store for at least 200 years — a permanent storage blockchain. Instead of the conventional monthly billing model, you pay a single upfront fee at upload time.

# Using the Bundlr/Irys gateway
npx @irys/sdk fund 100000 -t arweave -w wallet.json
npx @irys/sdk upload file.jpg -t arweave -w wallet.json
# returns an Arweave Transaction ID

Cost is roughly 5 to 10 USD per GB, expensive in absolute terms but bundled with a permanence guarantee. Legal evidence, journalism backups, scientific data, and game assets are the natural fit. Since 2025 Solana and Ethereum back ends support the same pattern through Irys/Bundlr, expanding the "blockchain permanence" market.

13. The tus Protocol — Resumable Uploads

tus.io is a resumable upload protocol Transloadit started in 2014. In 2026 it is the de facto standard, and an IETF Internet-Draft based on it (Resumable Uploads for HTTP) is in active development. Uppy, UploadThing, Vimeo, Cloudflare R2, Vercel Blob, GitLab, ownCloud, and most major services support it.

The core idea is dead simple: HEAD to ask the current offset, PATCH to continue from there.

# 1) Create the upload
POST /files HTTP/1.1
Upload-Length: 1073741824
Upload-Metadata: filename dmlkZW8ubXA0
Tus-Resumable: 1.0.0
→ 201 Created
  Location: /files/abc123

# 2) Ask current offset after a network drop
HEAD /files/abc123 HTTP/1.1
Tus-Resumable: 1.0.0
→ 200 OK
  Upload-Offset: 524288000
  Upload-Length: 1073741824

# 3) Continue from there
PATCH /files/abc123 HTTP/1.1
Upload-Offset: 524288000
Content-Type: application/offset+octet-stream
Tus-Resumable: 1.0.0
[binary chunk]
→ 204 No Content

This minimal protocol is what lets a 1GB upload that died at 99% resume at exactly 99% on the next Wi-Fi attempt. For mobile and satellite networks where drops are routine, this is non-negotiable.
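
The HEAD-then-PATCH loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used by tus-js-client: `resumeUpload`, `parseOffset`, and the `FetchLike` type are hypothetical, the upload is assumed to already exist at `uploadUrl`, and retries and the creation step are left out.

```typescript
// Minimal fetch-like signature so the sketch can run against a fake server.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: Uint8Array },
) => Promise<{ status: number; headers: { get(name: string): string | null } }>;

const TUS_VERSION = "1.0.0";

function parseOffset(value: string | null): number {
  if (value === null || !/^\d+$/.test(value)) {
    throw new Error("missing or invalid Upload-Offset header");
  }
  return Number(value);
}

// resumeUpload is a hypothetical helper; it returns the final confirmed offset.
async function resumeUpload(
  fetchFn: FetchLike,
  uploadUrl: string,
  data: Uint8Array,
  chunkSize: number,
): Promise<number> {
  // Ask the server how many bytes it already holds.
  const head = await fetchFn(uploadUrl, {
    method: "HEAD",
    headers: { "Tus-Resumable": TUS_VERSION },
  });
  let offset = parseOffset(head.headers.get("Upload-Offset"));

  // Send only the missing bytes, one chunk per PATCH.
  while (offset < data.length) {
    const chunk = data.subarray(offset, Math.min(offset + chunkSize, data.length));
    const res = await fetchFn(uploadUrl, {
      method: "PATCH",
      headers: {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Offset": String(offset),
        "Content-Type": "application/offset+octet-stream",
      },
      body: chunk,
    });
    if (res.status !== 204) throw new Error(`unexpected status ${res.status}`);
    const next = parseOffset(res.headers.get("Upload-Offset"));
    if (next <= offset) throw new Error("server offset did not advance");
    offset = next;
  }
  return offset;
}
```

The client never guesses the offset: it trusts the value the server reports after every request, which is what makes a resume after a crash or refresh safe.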

Several server implementations exist.

# tusd (official Go server)
docker run -p 1080:1080 \
  -v /data:/srv/tusd-data \
  tusproject/tusd:latest \
  -upload-dir=/srv/tusd-data

# Node tus server
npm i @tus/server @tus/file-store

// server.ts
import { Server } from '@tus/server'
import { FileStore } from '@tus/file-store'

const tus = new Server({
  path: '/files',
  datastore: new FileStore({ directory: './uploads' }),
})
// Mount on a Next.js route handler

On the client tus-js-client or Uppy's @uppy/tus plugin are the standard choices. Both ship with automatic retries, exponential backoff, and localStorage-based fingerprinting so a refresh keeps the same upload session.
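
The retry behavior can be pictured as a loop over a list of delays, in the spirit of the `retryDelays` option shown earlier. The sketch below is an illustration, not the code inside tus-js-client or Uppy: `withRetries` and `backoffDelays` are hypothetical helpers, and the injectable `sleep` parameter exists only to keep the example easy to exercise.

```typescript
// One initial attempt plus one retry per configured delay.
async function withRetries<T>(
  attempt: () => Promise<T>,
  retryDelays: number[],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= retryDelays.length; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt; no wait after the final failure.
      if (i < retryDelays.length) await sleep(retryDelays[i]);
    }
  }
  throw lastError;
}

// Exponential schedule: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelays(retries: number, baseMs: number, maxMs: number): number[] {
  return Array.from({ length: retries }, (_, i) => Math.min(baseMs * 2 ** i, maxMs));
}
```

In a resumable client the `attempt` callback would first re-query the server offset and then continue from there, so every retry only re-sends bytes the server has not confirmed.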

A 2025 draft of tus 2.0 introduced HTTP/3 with 0-RTT, parallel chunking, and signed metadata, with formal publication targeted for the second half of 2026.

14. ClamAV / Sublime Security — Malware Scanning

Re-exposing an uploaded file directly as a public download is how users end up reporting "the PDF I downloaded from your site was ransomware." In 2026 malware scanning on upload is essentially required.

ClamAV (clamav.net) is the long-running open-source, signature-based antivirus engine, first released in 2002.

# Run ClamAV daemon in Docker
docker run -d --name clamav \
  -p 3310:3310 \
  -v clam_db:/var/lib/clamav \
  clamav/clamav:stable

# Scan a file
clamdscan --multiscan /uploads/file.pdf
# /uploads/file.pdf: OK
# or: /uploads/file.pdf: Eicar-Test-Signature FOUND

// Talking to the TCP ClamAV daemon from Node
// (the clamscan package exposes NodeClam as its default export)
import NodeClam from 'clamscan'

const clam = await new NodeClam().init({
  clamdscan: { host: 'clamav', port: 3310 },
})

const { isInfected, viruses } = await clam.scanFile('/uploads/file.pdf')
if (isInfected) {
  await deleteFromStorage(key) // hypothetical helper that removes the object from the bucket
  throw new Error(`Malware detected: ${viruses.join(', ')}`)
}

ClamAV is fast but signature-based, so it misses novel variants and zero-day malware. Most teams pair it with a managed solution.

Sublime Security (sublimesecurity.com) is a US content-security SaaS founded in 2020, originally for email but expanded into upload pipelines in 2024 after a Series B round. It applies heuristic plus LLM-based checks to email bodies, uploaded documents, and HTML payloads.

// Conceptual Sublime Detection API
const res = await fetch('https://api.sublime.security/v1/scan', {
  method: 'POST',
  headers: { Authorization: `Bearer ${SUBLIME_KEY}` },
  body: file,
})
const verdict = await res.json()
// { score: 0.92, label: 'phishing', signals: ['embedded-macro', 'suspicious-url'] }

Other options:

  • VirusTotal (virustotal.com) — Run by Google, scans through 70+ engines in parallel, free API plus paid Premium
  • Cloudmersive Virus Scan API — Managed REST API
  • MetaDefender Cloud (OPSWAT) — 30+ engines concurrently, the multi-scanning reference
  • AWS GuardDuty Malware Protection for S3 — GA in 2024, auto-scans objects on PUT into an S3 bucket

The recommended upload pipeline shape is:

  1. Client PUTs via Presigned URL or tus into a quarantine bucket on S3/R2
  2. An event (S3 ObjectCreated, R2 Queue, Vercel Blob webhook) fires
  3. ClamAV/GuardDuty/Sublime scans
  4. Clean files move to the public bucket; infected ones are deleted or moved to a separate isolation area

This is the pattern that lets you accept ZIP, TAR, and WebAssembly modules — formats that can hide malicious payloads — without exposing users to them.
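
The routing decision at the end of that pipeline can be sketched as one small function. Everything here is an assumption for illustration: the bucket names are placeholders, `ObjectStore` stands in for whatever S3 or R2 client is in use, and `scan` stands in for the ClamAV, GuardDuty, or Sublime call. The one design choice worth noting is that a scanner failure leaves the file in quarantine rather than publishing it.

```typescript
type Verdict = { clean: true } | { clean: false; signatures: string[] };

// Stand-in for a real storage client; only the operation the sketch needs.
interface ObjectStore {
  move(fromBucket: string, toBucket: string, key: string): Promise<void>;
}

// Placeholder bucket names.
const QUARANTINE = "uploads-quarantine";
const PUBLIC = "uploads-public";
const INFECTED = "uploads-infected";

// Returns the bucket the object ends up in.
async function promoteOrIsolate(
  store: ObjectStore,
  key: string,
  scan: (key: string) => Promise<Verdict>,
): Promise<string> {
  let verdict: Verdict;
  try {
    verdict = await scan(key);
  } catch {
    // Fail closed: a scanner outage must never publish an unscanned file.
    return QUARANTINE;
  }
  const target = verdict.clean ? PUBLIC : INFECTED;
  await store.move(QUARANTINE, target, key);
  return target;
}
```

In practice this function would run inside the handler for the storage event, with the unscanned object still unreachable from any public URL.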

15. Korea / Japan — Toss, Kakao, Mercari, SmartHR

Toss — Korea's leading fintech as of 2024, receiving millions of KYC documents, ID photos, and contracts per day. Synthesizing publicly available Toss Tech Blog posts, the pattern looks like this:

  • Client side: an in-house React Native module that strips EXIF and resizes client-side
  • Transport: HTTPS PUT plus a proprietary chunked protocol similar to tus
  • First-stage storage: an AWS S3 quarantine bucket (uploads-quarantine)
  • Scanning: ClamAV plus an in-house LLM heuristic that decides whether an ID photo is a valid KYC document
  • Second-stage storage: AWS S3 with KMS encryption, IAM Role per microservice
  • CDN: CloudFront with Signed URL

The point Toss emphasizes is the SLA — every scan must finish within 30 seconds so the user can proceed. They use synchronous + fast polling rather than an asynchronous queue.

Kakao — KakaoTalk video messages, KakaoStory, and KakaoTV move petabytes of media daily. According to the Kakao Tech blog, Kakao runs its in-house KakaoCloud Object Storage (released 2022, based on a SeaweedFS variant with proprietary metadata) alongside AWS S3 and GCS in a multi-cloud setup. Domestic Korean traffic stays on KakaoCloud; global traffic uses S3 plus CloudFront to optimize egress cost.

Mercari — Japan's largest C2C marketplace, ingesting tens of millions of photos a day. The Mercari engineering blog describes a base of GCP Cloud Storage plus Cloud CDN, with image resizing handled by an in-house imageflux-compatible layer. Since 2024 every upload runs through AI auto-categorization and the same pipeline blocks inappropriate content (NSFW, suspected counterfeit) automatically.

SmartHR — A major Japanese HR SaaS handling labor contracts and payslips, legal documents subject to seven-year retention. They use AWS S3 with KMS encryption and Object Lock (WORM) to meet the regulatory requirement, with single-use signed URLs issued on each user download. Toss in Korea and SmartHR in Japan share very similar file infrastructure precisely because their regulatory environments rhyme.

The shared traits across these four are (1) an S3-compatible API at the core, (2) native client modules that handle chunking, retries, and EXIF, and (3) synchronous server-side scanning that gates the next step in the user flow.

16. Who Should Pick What

The recommendation matrix by scale and scenario:

| Scenario | First pick | Second pick | Reasoning |
| --- | --- | --- | --- |
| Solo Next.js side project | UploadThing | Vercel Blob | Zero-config, free tier |
| Next.js startup, own auth | Uppy + R2 Presigned URL | Uppy + S3 | Egress savings, no lock-in |
| Korean fintech startup | Uppy + AWS S3 KMS + ClamAV | Uppy + KakaoCloud | Regulatory, KMS integration |
| Media-heavy (image/video transforms) | Cloudinary or ImageKit | Filestack | URL transforms, AI auto-tagging |
| Global SaaS with heavy egress | Cloudflare R2 + Uppy tus | Backblaze B2 + CDN | Zero egress |
| Petabyte in-house data center | MinIO | SeaweedFS | S3 compatible, erasure coding |
| Home or small OSS cluster | Garage | SeaweedFS | Low power, multi-region |
| NFT or Web3 metadata | Pinata IPFS | Filebase | OpenSea compatible, permanence |
| Legal, scientific, journalism archive | Arweave (via Irys) | Storj DCS | Pay-once, censorship resistant |
| Enterprise + multi-cloud abstraction | MinIO + S3 + R2 fan-out | Tigris (global replication) | Avoid vendor lock-in |
| Large files (>1GB) | tus + Uppy + R2 | tus + tusd + S3 multipart | Resumable is essential |
| Strict virus gating required | ClamAV + S3 quarantine pattern | Sublime Security + GuardDuty | Layered defense |

Principles:

  1. Use Uppy on the client — Unless you are a solo prototype, almost always reach for Uppy. UI, progress, retries, tus, chunking, and external sources come along for free.
  2. Pick S3-compatible storage — To avoid lock-in, choose a back end that speaks the S3 API. R2, B2, Wasabi, MinIO, and Tigris all share the same SDK.
  3. Heavy egress means R2 or B2 — If your business is serving media, zero-egress pricing is the deciding factor.
  4. Large files demand tus — Anything over 1GB without tus is wasteful.
  5. Sensitive files follow quarantine -> scan -> publish — Never expose freshly uploaded files on a public URL. Land them in a quarantine bucket, scan, then promote.
  6. Strip EXIF on the client — Photo GPS and device metadata should never reach the server. piexifjs or exif-stripper-js are enough.
  7. Decentralized only when permanence is real — IPFS and Arweave are great tools, but they are overkill for everyday SaaS. Use them only when the answer to "must this still exist in 100 years?" is genuinely yes.
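
For principle 6, the mechanics of stripping EXIF from a JPEG are small enough to sketch without a library. EXIF (and XMP) metadata lives in APP1 segments near the start of the file, so the function below walks the segment headers and drops every APP1 segment before the compressed image data begins. It is a simplified illustration: it handles JPEG only, assumes no padding bytes between segments, and `stripExifFromJpeg` is a hypothetical helper rather than the API of piexifjs.

```typescript
// Removes APP1 segments (where EXIF and XMP live) from a JPEG byte stream.
function stripExifFromJpeg(jpeg: Uint8Array): Uint8Array {
  if (jpeg.length < 4 || jpeg[0] !== 0xff || jpeg[1] !== 0xd8) {
    throw new Error("not a JPEG: missing SOI marker");
  }
  const kept: Uint8Array[] = [jpeg.subarray(0, 2)]; // keep the SOI marker
  let pos = 2;
  while (pos + 4 <= jpeg.length) {
    if (jpeg[pos] !== 0xff) throw new Error("malformed JPEG: expected a marker");
    const marker = jpeg[pos + 1];
    // Start of Scan: the rest is compressed image data, copied verbatim below.
    if (marker === 0xda) break;
    // Big-endian segment length; it counts its own two bytes but not the marker.
    const segmentLength = (jpeg[pos + 2] << 8) | jpeg[pos + 3];
    const end = pos + 2 + segmentLength;
    if (segmentLength < 2 || end > jpeg.length) {
      throw new Error("malformed JPEG: bad segment length");
    }
    if (marker !== 0xe1) kept.push(jpeg.subarray(pos, end)); // drop APP1 only
    pos = end;
  }
  kept.push(jpeg.subarray(pos)); // SOS header, image data, and EOI

  const out = new Uint8Array(kept.reduce((total, part) => total + part.length, 0));
  let offset = 0;
  for (const part of kept) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

In a browser this would run on `new Uint8Array(await file.arrayBuffer())` before the upload starts, so GPS coordinates and device details never leave the device.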

File upload sits next to checkout in terms of direct user exposure. A user whose 1GB video died at 99% will stop trusting the product, and a service that lets one malware sample through has a hard time recovering reputation. The 2026 tools reduce both problems to single-line APIs — take advantage of them.
