File Upload & Storage Tools 2026: Uppy / UploadThing / Pinata IPFS / Cloudflare R2 / Backblaze B2 / Tigris / MinIO / tus protocol Deep Dive
> "Receiving and storing a file looks simple until a single 1GB video fails after thirty retries; at that moment every abstraction collapses. In 2026 file upload is no longer about multipart/form-data." (Transloadit Uppy team, December 2024 interview)
Almost every product needs file upload, yet it remains one of the most frequently broken features in production. A user uploading a 1GB video over LTE has the connection drop at 99% and starts over from zero. A hundred users upload PDFs concurrently and your Node process OOMs. And someone, always, tries to drop a `.exe` or a ZIP bomb on you.
As of May 2026 the file upload and object storage ecosystem has split into four clear categories — **Client libraries, Managed SaaS, Self-hosted OSS, and Decentralized storage** — with the tus protocol settling in as the de facto standard for resumable transport. This essay walks through Uppy, UploadThing, Filestack, Cloudinary, ImageKit, AWS S3 Presigned URLs, Cloudflare R2, Backblaze B2, Tigris, Wasabi, Vercel Blob, Bunny Storage, MinIO, SeaweedFS, Garage, Pinata, Filebase, Storj DCS, Arweave, tus, ClamAV, and Sublime Security in one pass.
1. The 2026 File Upload Map — Four Categories
The infrastructure breaks down by role into four boxes.
| Category | Representative products | Role |
|---|---|---|
| Client library | Uppy, UploadThing, Filestack JS, Cloudinary Widget, ImageKit JS | Handle chunking, retries, and UI in the browser/mobile |
| Managed SaaS | UploadThing, Filestack, Cloudinary, ImageKit, Vercel Blob, Bunny | Single API that bundles upload + CDN + transforms |
| S3-compatible / Self-hosted | AWS S3, Cloudflare R2, Backblaze B2, Wasabi, Tigris, MinIO, SeaweedFS, Garage | Object storage infrastructure |
| Decentralized | Pinata, Filebase, Storj DCS, Arweave | Storage on top of IPFS, Sia, Filecoin, or Arweave |
This split matters because **where you store** and **how you transport** are independent decisions. Uppy on the client can ship to S3 Presigned URLs, Cloudflare R2, MinIO, or even Pinata IPFS on the back end. Full-stack SaaS like UploadThing bundle both, but underneath they remain a Transport + Storage composition.
The second branching point is **egress pricing**. AWS S3 charges roughly 0.09 USD per GB of outbound traffic, while Cloudflare R2 and Backblaze B2 charge essentially nothing. For video- or image-heavy workloads pushing content through CDNs all day, this gap routinely turns into thousands of dollars per month.
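The other 2026 keyword is **resumable**. The tus protocol is the basis of the IETF "Resumable Uploads for HTTP" Internet-Draft and is widely treated as the de facto standard. Uppy ships tus as a first-class transport, and services such as Vimeo, Cloudflare Stream, and Supabase Storage accept tus uploads, meaning a 1GB file at 99% that loses Wi-Fi can pick up exactly where it left off. Plain S3-compatible stores such as R2 do not speak tus natively; you put a tus server such as tusd in front of them, or use S3 multipart upload instead.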
2. Uppy (Transloadit) — Most Popular OSS Uploader
Uppy (`uppy.io`) is the modular open-source uploader released by Transloadit in 2017. As of May 2026 it has crossed 30k GitHub stars and is effectively the default in the client-side uploader category.
The core philosophy is a **plugin architecture**. The core ships under 1MB and you attach only the features you need.
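// Uppy + Dashboard + tus + webcam + Instagram
import Uppy from '@uppy/core'
import Dashboard from '@uppy/dashboard'
import Webcam from '@uppy/webcam'
import Instagram from '@uppy/instagram'
import Tus from '@uppy/tus'

const uppy = new Uppy({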
restrictions: {
maxFileSize: 1024 * 1024 * 1024, // 1GB
maxNumberOfFiles: 10,
allowedFileTypes: ['image/*', 'video/*', '.pdf', '.zip'],
},
autoProceed: false,
})
.use(Dashboard, { inline: true, target: '#drag-drop' })
.use(Webcam, { target: Dashboard })
.use(Instagram, { companionUrl: 'https://companion.acme.dev' })
.use(Tus, {
endpoint: 'https://uploads.acme.dev/files/',
chunkSize: 5 * 1024 * 1024, // 5MB chunks
retryDelays: [0, 1000, 3000, 5000],
})
That single block gives you drag-and-drop UI, progress bars, pause and resume, chunked transfer, retries, webcam capture, and Instagram import. Add `@uppy/google-drive`, `@uppy/dropbox`, or `@uppy/onedrive` for cloud imports.
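To make the `retryDelays` option above concrete, here is a minimal sketch of a retry loop driven by a delay schedule. It is not Uppy's or tus-js-client's implementation; `nextDelay` and `withRetries` are hypothetical names, and the assumption is that each array entry is the wait before one retry, so four entries allow one initial attempt plus four retries.

```typescript
// Hypothetical helpers (not part of Uppy): retry a task using a delay schedule
// shaped like the `retryDelays` array in the config above.

// failedAttempts = how many attempts have failed so far (1 after the first failure).
// Returns the wait in milliseconds before the next retry, or null when the schedule is exhausted.
function nextDelay(failedAttempts: number, schedule: readonly number[]): number | null {
  const delay = schedule[failedAttempts - 1];
  return delay === undefined ? null : delay;
}

async function withRetries<T>(
  task: () => Promise<T>,
  schedule: readonly number[] = [0, 1000, 3000, 5000],
  // Injectable sleep so the loop can be exercised without real waiting.
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let failures = 0;
  for (;;) {
    try {
      return await task();
    } catch (error) {
      failures += 1;
      const delay = nextDelay(failures, schedule);
      if (delay === null) throw error; // schedule exhausted: surface the last error
      await sleep(delay);
    }
  }
}
```

A real uploader would also decide which errors are worth retrying (a 403 usually is not), which this sketch leaves out.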
The server-side counterpart is **Companion** (`@uppy/companion`), which holds OAuth tokens and streams from the user's Google Drive or Dropbox directly to S3 server-to-server. That avoids the wasteful "browser downloads, then re-uploads" pattern for large remote files.
**Uppy 4.0**, released in May 2025, was a TypeScript-first rewrite with official adapters for React, Vue, and Svelte. In November 2025 the Transloadit managed transform back end integration (`@uppy/transloadit`) tightened further, letting you trigger video transcoding, image resizing, and virus scanning on the back of an upload as a single pipeline.
Uppy is the right pick when a team **already owns its back end and storage** but wants client-side UX quickly. It sits in the middle ground where buying UploadThing whole feels heavy, but hand-rolling multipart upload feels primitive.
3. UploadThing — Vercel-friendly, TypeScript-first
UploadThing (`uploadthing.com`) is a TypeScript-first upload SaaS created in 2023 by Theo Browne of t3.gg; it was accepted into Y Combinator in 2024 and raised a Series A in 2025. It is the fastest-growing upload product in the Vercel ecosystem.
Its differentiator is **zero-config Next.js integration** and a **type-safe file router**.
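// app/api/uploadthing/core.ts
import { createUploadthing, type FileRouter } from 'uploadthing/next'
// `auth` and `db` are app-specific helpers (for example Auth.js and Prisma) imported from your own modules
const f = createUploadthing()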
export const ourFileRouter = {
imageUploader: f({ image: { maxFileSize: '4MB', maxFileCount: 4 } })
.middleware(async ({ req }) => {
const session = await auth()
if (!session?.user) throw new Error('Unauthorized')
return { userId: session.user.id }
})
.onUploadComplete(async ({ metadata, file }) => {
await db.media.create({
data: { userId: metadata.userId, url: file.url, key: file.key },
})
}),
pdfUploader: f({ pdf: { maxFileSize: '32MB' } })
.middleware(async () => ({ userId: 'pdf-user' }))
.onUploadComplete(async ({ file }) => {
console.log('PDF uploaded:', file.url)
}),
} satisfies FileRouter
export type OurFileRouter = typeof ourFileRouter
On the client `<UploadButton endpoint="imageUploader" />` is essentially the whole integration. The router type flows into the component, so a typo in the endpoint name fails TypeScript compilation rather than runtime.
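The compile-time check described above can be illustrated without the library. The following is a stripped-down sketch, not UploadThing's actual types: `fileRouter` and `routeConfig` are hypothetical, and the only point is that constraining a parameter to `keyof typeof router` turns a misspelled endpoint into a type error.

```typescript
// Hypothetical miniature router; the real UploadThing types are richer than this.
const fileRouter = {
  imageUploader: { maxFileSize: "4MB", maxFileCount: 4 },
  pdfUploader: { maxFileSize: "32MB", maxFileCount: 1 },
} as const;

type MiniFileRouter = typeof fileRouter;
type EndpointName = keyof MiniFileRouter; // "imageUploader" | "pdfUploader"

// Only keys of the router are accepted, so a typo never reaches runtime.
function routeConfig<K extends EndpointName>(endpoint: K): MiniFileRouter[K] {
  return fileRouter[endpoint];
}

const imageConfig = routeConfig("imageUploader"); // compiles
// routeConfig("imageUploadr");                   // would not compile: not a key of the router
```

The same idea scales to component props: a component typed against `EndpointName` rejects unknown endpoints in the editor.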
UploadThing was originally built on AWS S3 and Cloudflare, but starting in 2025 it supports BYOS (Bring Your Own Storage). Pick R2, S3, or Vercel Blob as the back end while keeping the transformation pipeline.
The **UTApi** server SDK lets the server manipulate files directly: deletion, metadata updates, and URL lookups are each a single SDK call.
import { UTApi } from 'uploadthing/server'

const utapi = new UTApi()
await utapi.deleteFiles(['key1', 'key2'])
await utapi.getFileUrls(['key1']) // resolves file keys to their URLs
UploadThing fits **solo developers and small teams on Next.js + Vercel + Auth.js** best. The promise mirrors Stripe's "drop in a checkout in five minutes" — drop in an upload flow in five minutes.
4. Filestack / Cloudinary / ImageKit — Managed Media
The managed media category bundles upload + CDN + transformation as one product. Three companies have been competing for nearly thirteen years.
**Filestack** (`filestack.com`) is the oldest of the three, founded in 2012 and acquired by Idera in 2019. Its strength is the Picker widget that imports directly from more than 40 external sources (Facebook, Instagram, Box, Dropbox, OneDrive, etc.) and the Transformation Engine that handles images, documents, and videos through one API.
const client = filestack.init('YOUR_API_KEY')
client.picker({
accept: ['image/*', 'application/pdf'],
fromSources: ['local_file_system', 'instagram', 'gmail', 'webcam'],
maxFiles: 5,
transformations: { crop: { aspectRatio: 1 / 1, force: true } },
}).open()
**Cloudinary** (`cloudinary.com`) is the largest managed media vendor in 2026, founded in Israel in 2012. The core is URL-based image and video transformation. A URL like `https://res.cloudinary.com/demo/image/upload/w_300,h_300,c_fill,f_auto,q_auto/sample.jpg` will be encoded automatically as WebP or AVIF, cropped, cached, and served from CDN.
The upload widget (`cloudinary.openUploadWidget`) supports 14 external sources, webcam, and screenshot capture, and since 2024 includes AI-driven auto-tagging, content-aware cropping (`c_auto`), and background removal.
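Because the transformation lives in the URL path, building such URLs is plain string assembly. The sketch below reproduces the example URL from the previous paragraph; `cloudinaryUrl` and its option names are hypothetical, and only the five parameters shown in that URL (`w_`, `h_`, `c_`, `f_`, `q_`) are handled.

```typescript
// Hypothetical URL builder covering only the parameters used in the example above.
interface Transform {
  width?: number;
  height?: number;
  crop?: string;    // e.g. "fill"
  format?: string;  // e.g. "auto"
  quality?: string; // e.g. "auto"
}

function cloudinaryUrl(cloudName: string, publicId: string, t: Transform = {}): string {
  const parts = [
    t.width !== undefined ? `w_${t.width}` : null,
    t.height !== undefined ? `h_${t.height}` : null,
    t.crop ? `c_${t.crop}` : null,
    t.format ? `f_${t.format}` : null,
    t.quality ? `q_${t.quality}` : null,
  ].filter((part): part is string => part !== null);
  // With no transformations the original asset path is returned unchanged.
  const transformation = parts.length > 0 ? `${parts.join(",")}/` : "";
  return `https://res.cloudinary.com/${cloudName}/image/upload/${transformation}${publicId}`;
}

const thumbUrl = cloudinaryUrl("demo", "sample.jpg", {
  width: 300, height: 300, crop: "fill", format: "auto", quality: "auto",
});
```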
**ImageKit** (`imagekit.io`) is a younger challenger from India, founded in 2017, growing fast on price and performance. It offers a Cloudinary-like URL transformation API at roughly one-third the price, and crucially supports BYOS: connect an existing AWS S3, R2, or Backblaze bucket as the origin. Both direct upload (`imagekit.upload(...)`) and server-side token issuing are supported.
Side-by-side:
| Product | Strength | Weakness | Price |
|---|---|---|---|
| Filestack | Most external sources, document transforms | Pricey, heavy | From 99 USD/mo |
| Cloudinary | Deep image/video transforms, AI features | Pricing complexity, learning curve | Free 25 credits, then usage |
| ImageKit | BYOS, cost efficiency | Slightly thinner transforms | Free 20GB, then usage |
5. AWS S3 Presigned URLs — The DIY Classic
The oldest and still most common pattern is the **AWS S3 Presigned URL**. The flow is essentially unchanged from S3's early days (S3 launched in 2006): the server hands the client a short-lived signed PUT URL, and the browser PUTs directly to S3.
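// Server: Next.js Route Handler
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'

const s3 = new S3Client({ region: 'us-east-1' })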
export async function POST(req: Request) {
const { filename, contentType } = await req.json()
const key = `uploads/${Date.now()}-${filename}`
const command = new PutObjectCommand({
Bucket: 'acme-uploads',
Key: key,
ContentType: contentType,
})
const url = await getSignedUrl(s3, command, { expiresIn: 60 * 5 })
return Response.json({ url, key })
}
// Client
async function upload(file: File) {
const res = await fetch('/api/presign', {
method: 'POST',
body: JSON.stringify({ filename: file.name, contentType: file.type }),
})
const { url, key } = await res.json()
await fetch(url, {
method: 'PUT',
headers: { 'Content-Type': file.type },
body: file,
})
return key
}
The strength is that **the server is bypassed for bytes**. A 5GB upload consumes zero bandwidth or memory on your Next.js server because the entire payload streams directly to S3. Authentication, validation, and logging happen only at the signing step.
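Since the signing step is the only place the server sees the request, that is where validation has to live. Here is a minimal sketch of such a check; the allow-list, the 50 MiB limit, and the names `validatePresignRequest` and `PresignRequest` are illustrative assumptions, not part of any SDK. It also replaces the raw client filename in the key with a sanitized base name and a random prefix.

```typescript
import { randomUUID } from "node:crypto";

// Illustrative policy values; tune them for your product.
const ALLOWED_TYPES = new Set(["image/jpeg", "image/png", "application/pdf"]);
const MAX_BYTES = 50 * 1024 * 1024; // 50 MiB

interface PresignRequest { filename: string; contentType: string; size: number }

type PresignDecision =
  | { ok: true; key: string }
  | { ok: false; reason: string };

function validatePresignRequest(req: PresignRequest): PresignDecision {
  if (!ALLOWED_TYPES.has(req.contentType)) {
    return { ok: false, reason: "content type not allowed" };
  }
  if (!Number.isInteger(req.size) || req.size <= 0 || req.size > MAX_BYTES) {
    return { ok: false, reason: "size out of range" };
  }
  // Keep only the base name and a conservative character set,
  // so the client cannot steer the object key into another prefix.
  const base = req.filename.split(/[\\/]/).pop() ?? "";
  const safe = base.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "").slice(0, 100) || "file";
  return { ok: true, key: `uploads/${randomUUID()}-${safe}` };
}
```

Note that the declared size and type come from the client, so this check is only a first gate; the storage side still needs its own limits and the scanning step described later.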
The weakness is that **you implement retries and progress yourself**. The fetch API does not expose upload progress, so you typically reach for XMLHttpRequest or axios with `onUploadProgress`, and files larger than 5GB force **Multipart Upload**. In practice almost everyone uses this pattern via Uppy's `@uppy/aws-s3` plugin, which absorbed the older `@uppy/aws-s3-multipart` package in Uppy 4.
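For a sense of what Multipart Upload involves on the client, the sketch below splits a file into byte ranges that respect the documented S3 limits (5 MiB minimum for every part except the last, 5 GiB maximum per part, at most 10,000 parts). `planParts` is a hypothetical helper, and the 8 MiB default part size is an arbitrary assumption; each range would then be sent with its own presigned UploadPart URL.

```typescript
const MiB = 1024 * 1024;
const MIN_PART = 5 * MiB;        // S3 minimum for every part except the last
const MAX_PART = 5 * 1024 * MiB; // S3 maximum part size (5 GiB)
const MAX_PARTS = 10_000;        // S3 maximum number of parts per upload

interface Part { partNumber: number; start: number; end: number } // end is exclusive

function planParts(fileSize: number, preferredPartSize: number = 8 * MiB): Part[] {
  if (!Number.isInteger(fileSize) || fileSize <= 0) {
    throw new RangeError("fileSize must be a positive integer");
  }
  // Grow the part size when the file would otherwise need more than 10,000 parts.
  const partSize = Math.max(MIN_PART, preferredPartSize, Math.ceil(fileSize / MAX_PARTS));
  if (partSize > MAX_PART) {
    throw new RangeError("file too large for a single multipart upload");
  }
  const parts: Part[] = [];
  for (let start = 0, n = 1; start < fileSize; start += partSize, n += 1) {
    parts.push({ partNumber: n, start, end: Math.min(start + partSize, fileSize) });
  }
  return parts;
}
```

In the browser each range maps to `file.slice(part.start, part.end)`, and a failed part can be retried on its own without resending the rest.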
The exact same pattern works on Cloudflare R2, Backblaze B2, Wasabi, MinIO, and Tigris. "Follows the S3 API" is the single line that standardized the entire object storage market.
6. Cloudflare R2 — No Egress Fees
Cloudflare R2 (`developers.cloudflare.com/r2`) is the S3-compatible object store that went GA in 2022 with a single marketing message that shook the market: **zero egress charges**.
Serving 1TB of images per month from AWS S3 costs about 90 USD in transfer fees; the same workload from R2 costs 0 USD. Storage is also 0.015 USD per GB versus 0.023 USD on S3 — 35% cheaper at rest.
// R2 reuses the S3 SDK as-is
const r2 = new S3Client({
region: 'auto',
endpoint: `https://${ACCOUNT_ID}.r2.cloudflarestorage.com`,
credentials: {
accessKeyId: process.env.R2_ACCESS_KEY!,
secretAccessKey: process.env.R2_SECRET_KEY!,
},
})
The real strength of R2 is **deep integration with Cloudflare Workers, Pages, and the global CDN**. Binding an R2 bucket inside a Worker lets you call `env.MY_BUCKET.put(key, body)` without any SDK, all running on the same global network.
// Cloudflare Workers + R2
export default {
async fetch(req: Request, env: Env) {
const key = new URL(req.url).pathname.slice(1)
if (req.method === 'PUT') {
await env.MY_BUCKET.put(key, req.body, {
httpMetadata: { contentType: req.headers.get('content-type') ?? '' },
})
return new Response('ok')
}
const obj = await env.MY_BUCKET.get(key)
if (!obj) return new Response('not found', { status: 404 })
return new Response(obj.body, { headers: { 'content-type': obj.httpMetadata?.contentType ?? '' } })
},
}
The 2024 addition of **R2 Custom Domain + Cache Reserve** auto-caches R2 objects in Cloudflare's CDN, so after the first hit even R2 GET costs disappear. The 2025 **R2 Data Catalog** (Iceberg compatible) extends R2 into a lakehouse back end.
The weakness is that **R2 is not 100% S3-compatible**. Object Lambda, S3 Select, S3 Inventory, and similar advanced features are missing. For the 90% case of PUT/GET/LIST plus Multipart and Presigned URLs, it is functionally identical.
7. Backblaze B2 / Wasabi / Tigris — S3 Compatible + Affordable
The S3-compatible value tier existed long before R2, and each player holds a different position.
**Backblaze B2** (`backblaze.com/b2`) is the oldest S3 alternative, launched in 2015. At 0.006 USD per GB it is cheaper than R2, and thanks to the **Bandwidth Alliance with Cloudflare**, traffic flowing from B2 into Cloudflare CDN is free. As of May 2026 B2 holds the largest share of the media backup and archive market.
# B2 CLI
b2 authorize-account $KEY_ID $APP_KEY
b2 upload-file my-bucket video.mp4 videos/2026/may.mp4
b2 download-file-by-name my-bucket videos/2026/may.mp4 ./local.mp4
**Wasabi** (`wasabi.com`) is an S3-compatible storage service launched in 2017 with a 0.0068 USD per GB price and free egress (subject to its fair-use policy). Founded by the team behind Carbonite, it advertises close S3 API compatibility, and the pricing model is a single "Hot Storage" tier. The downside is a **90-day minimum retention** on objects, making it inefficient for short-lived transactional data.
**Tigris** (`tigrisdata.com`) is a newer S3-compatible service launched in 2024 and backed by Fly.io. Its differentiator is **automatic global replication**: PUT an object once and it propagates lazily to the nearest regions worldwide, so the first GET is fast wherever it lands. Inside Fly.io apps routing costs are zero.
// Tigris uses the S3 SDK unchanged
const tigris = new S3Client({
region: 'auto',
endpoint: 'https://fly.storage.tigris.dev',
credentials: { accessKeyId: KEY, secretAccessKey: SECRET },
})
Pricing comparison (May 2026, standard hot tier):
| Product | Storage (GB/mo) | Egress (GB) | Notes |
|---|---|---|---|
| AWS S3 Standard | 0.023 USD | 0.09 USD | The baseline |
| Cloudflare R2 | 0.015 USD | 0 USD | Workers + CDN integrated |
| Backblaze B2 | 0.006 USD | 0 USD (via Cloudflare) | Cheapest storage |
| Wasabi | 0.0068 USD | 0 USD | 90-day minimum retention |
| Tigris | 0.02 USD | 0 USD | Auto global replication |
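To see how the table translates into a monthly bill, here is a small calculator using exactly the per-GB figures above. `monthlyCost` is a hypothetical helper; it deliberately ignores request fees, free tiers, minimum-retention charges, and the fact that B2's free egress applies to traffic routed through Cloudflare.

```typescript
interface Pricing { storagePerGB: number; egressPerGB: number }

// Figures copied from the table above (USD, standard hot tier).
const PRICING: Record<string, Pricing> = {
  "AWS S3 Standard": { storagePerGB: 0.023, egressPerGB: 0.09 },
  "Cloudflare R2": { storagePerGB: 0.015, egressPerGB: 0 },
  "Backblaze B2": { storagePerGB: 0.006, egressPerGB: 0 },
  "Wasabi": { storagePerGB: 0.0068, egressPerGB: 0 },
  "Tigris": { storagePerGB: 0.02, egressPerGB: 0 },
};

function monthlyCost(provider: string, storedGB: number, egressGB: number): number {
  const p = PRICING[provider];
  if (!p) throw new Error(`unknown provider: ${provider}`);
  const usd = storedGB * p.storagePerGB + egressGB * p.egressPerGB;
  return Math.round(usd * 100) / 100; // round to cents
}

// 1,000 GB stored and 1,000 GB served per month:
const s3Bill = monthlyCost("AWS S3 Standard", 1000, 1000); // 23 storage + 90 egress
const r2Bill = monthlyCost("Cloudflare R2", 1000, 1000);   // 15 storage, no egress
```

At this volume egress is roughly 80% of the S3 bill, which is the gap the zero-egress providers compete on.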
8. Vercel Blob / Bunny Storage — Platform-Built-in
Platform-native storage is growing quickly alongside the third-party world.
**Vercel Blob** (`vercel.com/storage/blob`) went GA in 2024 as Vercel's managed object storage, reportedly layered on Cloudflare R2 under the hood and exposed through its own SDK rather than the S3 API. The strength is **SDK-one-liner uploads from Vercel functions** plus automatic caching by the Vercel Edge CDN.
// app/api/upload/route.ts
import { put } from '@vercel/blob'

export async function POST(req: Request) {
const file = (await req.formData()).get('file') as File
const blob = await put(`uploads/${file.name}`, file, {
access: 'public',
addRandomSuffix: true,
})
return Response.json(blob)
}
For large files the client-direct upload pattern is also supported: `@vercel/blob/client` calls the server only for a token, then PUTs the body straight into Vercel Blob. That is the canonical way to bypass the 4.5MB body limit on Vercel Functions.
**Bunny Storage** (`bunny.net/storage`) is the object storage arm of the Slovenia-based Bunny.net CDN, integrating deeply with Bunny CDN. At 0.01 USD per GB with 11-region replication and automatic Pull Zone linkage, it has rapidly become the media back end for legacy CMSs like WordPress and Drupal.
# Bunny Storage API
curl -X PUT https://storage.bunnycdn.com/$STORAGE_ZONE/$PATH/file.jpg \
  -H "AccessKey: $ACCESS_KEY" \
  --data-binary @file.jpg
The shared advantage of platform-built-ins is **integration latency**: Vercel Blob is fastest from Vercel Edge, Bunny Storage is fastest from Bunny CDN. The shared weakness is **lock-in**: leave the platform and you lose the compatible SDK.
9. MinIO — Self-hosted S3-Compatible OSS
When a company wants object storage inside its own data center, the de facto standard is **MinIO** (`min.io`). Written in Go and released in 2015, it now has over 48k GitHub stars.
Its strengths are **the most complete S3 API implementation** and a **single binary install**.
# Single node (dev)
docker run -p 9000:9000 -p 9001:9001 \
  -e "MINIO_ROOT_USER=admin" \
  -e "MINIO_ROOT_PASSWORD=secret123" \
  -v /data:/data \
  quay.io/minio/minio server /data --console-address ":9001"
# Distributed cluster (4 nodes, 4 disks each, 16 drives in one erasure set)
minio server http://node{1...4}/data{1...4} \
  --console-address ":9001"
MinIO uses **erasure coding** so partial node or disk failures do not lose data. A 16-disk erasure set with the default parity of EC:4 tolerates 4 simultaneous disk failures, and raising parity to the maximum of EC:8 tolerates up to 8 at the cost of half the raw capacity, which is why MinIO is strong in internal backup, ML data lakes, and CI artifact storage.
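The trade-off behind erasure coding is simple arithmetic: with N drives and P parity shards per stripe, data survives the loss of any P drives, and the usable share of raw capacity is (N - P) / N. The sketch below is generic Reed-Solomon bookkeeping under that assumption, not MinIO's own code, and it ignores MinIO's read and write quorum rules; `erasureProfile` is a hypothetical name.

```typescript
interface ErasureProfile {
  tolerableFailures: number; // drives that can be lost without losing data
  usableFraction: number;    // usable capacity / raw capacity
  usableTB: number;
}

function erasureProfile(drives: number, parity: number, driveTB: number): ErasureProfile {
  if (!Number.isInteger(drives) || !Number.isInteger(parity) || drives < 2) {
    throw new RangeError("drives and parity must be integers, with at least 2 drives");
  }
  if (parity < 0 || parity > drives / 2) {
    throw new RangeError("parity must be between 0 and half the drive count");
  }
  const dataShards = drives - parity;
  return {
    tolerableFailures: parity,
    usableFraction: dataShards / drives,
    usableTB: dataShards * driveTB,
  };
}

// Sixteen 4 TB drives at two parity settings:
const balanced = erasureProfile(16, 4, 4); // tolerates 4 failures, 75% usable, 48 TB
const maximal = erasureProfile(16, 8, 4);  // tolerates 8 failures, 50% usable, 32 TB
```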
Since 2024 MinIO has rebranded part of its product as **AIStor**, an AI/ML-focused storage with direct integrations into PyTorch, TensorFlow, and NVIDIA NIM. The pitch targets petabyte-scale training data on top of the same S3-compatible API.
There is a commercial license caveat. In April 2025 MinIO moved certain features from AGPL v3 to a **MinIO Commercial License**. Embedding MinIO in your own SaaS may now require a commercial license, so OSS purists have started looking elsewhere — namely SeaweedFS and Garage covered next.
10. SeaweedFS / Garage (Deuxfleurs) — Other OSS Options
**SeaweedFS** (`github.com/seaweedfs/seaweedfs`) was first released in 2014. Written in Go, it is a distributed object store inspired by Facebook's Haystack paper (2010), designed to store billions of small files efficiently.
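# SeaweedFS master + volume + filer + S3 gateway
weed master -mdir /data/master &
weed volume -dir /data/volumes -port 8080 -mserver localhost:9333 &
# The S3 gateway talks to a filer, so start one (default port 8888)
weed filer -master localhost:9333 &
weed s3 -port 8333 -filer localhost:8888 &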
Highlights:
- Strong with small files (minimal metadata overhead)
- POSIX-like API through Filer
- S3, WebDAV, FUSE, and HDFS gateways simultaneously exposed
- Apache 2.0 license
**Garage** (`garagehq.deuxfleurs.fr`) is a younger S3-compatible OSS started in 2020 by the French collective Deuxfleurs. The defining trait is being **optimized for low-power, low-resource environments**. You can run a distributed cluster across Raspberry Pi boards on consumer broadband.
What makes Garage distinct:
- Written entirely in Rust with very low memory usage
- Native multi-region, multi-data-center awareness
- AGPL v3
- Tuned for small clusters of 3 to 10 nodes
# Garage cluster (per node)
garage server &
# After exchanging node IDs
garage layout assign -z eu-west -c 1T <node-id>
garage layout apply --version 1
MinIO targets enterprise and petabyte scale; Garage targets **community-run, free-software, small infrastructure**. It is a common choice for Fediverse (Mastodon) and NextCloud operators.
11. Pinata — IPFS Hosting
From here we enter **decentralized storage**. **Pinata** (`pinata.cloud`) is a US-based IPFS pinning service founded in 2018, effectively the standard for NFT and Web3 storage.
IPFS (InterPlanetary File System) is a P2P protocol that identifies files by their content hash (CID). The same file has the same CID anywhere on the network, and once pinned can be fetched from any peer. The catch is that **someone must keep the pin alive**, and Pinata sells that as a service.
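The idea of addressing a file by its content hash can be shown in a few lines. This sketch uses a bare SHA-256 hex digest as the address; a real IPFS CID additionally wraps the digest in multihash and multibase encoding and chunks large files, so the strings here are not CIDs. `contentAddress` and `ContentStore` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Address = SHA-256 of the bytes. Not a real CID; see the caveat above.
function contentAddress(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// A toy content-addressed store: the key is derived from the value,
// so storing identical bytes twice keeps a single entry.
class ContentStore {
  private blobs = new Map<string, Uint8Array>();

  put(data: Uint8Array): string {
    const address = contentAddress(data);
    this.blobs.set(address, data);
    return address;
  }

  get(address: string): Uint8Array | undefined {
    return this.blobs.get(address);
  }

  get size(): number {
    return this.blobs.size;
  }
}
```

Because the address is derived from the bytes, any peer can verify a download by re-hashing it, which is what makes fetching from untrusted peers safe.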
const pinata = new PinataSDK({
pinataJwt: process.env.PINATA_JWT!,
pinataGateway: 'https://gateway.pinata.cloud',
})
// Upload file + IPFS pin
const result = await pinata.upload.file(file)
console.log(result.IpfsHash) // "QmXxx..."
// User-facing gateway URL
const url = `https://gateway.pinata.cloud/ipfs/${result.IpfsHash}`
The 2024 **Pinata Files API** added a new mode where storage lives in Pinata's Private Network rather than the public IPFS DHT. That mode is as fast as conventional object storage, and you can promote any file to public IPFS on demand. Pinata is gradually shifting from "Web3-only SaaS" to "general object storage with an IPFS option."
In NFT workflows the canonical pattern is to **pin OpenSea-compatible metadata to IPFS** and embed the CID in the smart contract. That gives token metadata a permanence and censorship-resistance guarantee.
12. Filebase / Storj DCS / Arweave — Decentralized + Permanent
There are several other back ends beyond Pinata in the decentralized category.
**Filebase** (`filebase.com`) is a US company founded in 2019 that ships an **S3-compatible API on top of IPFS, Sia, Skynet, and Filecoin**. Users only need to know the S3 SDK; the bucket configuration picks the underlying network.
# Filebase works with the AWS CLI and S3 SDKs as-is
aws s3 cp ./photo.jpg s3://my-bucket/photo.jpg \
  --endpoint-url https://s3.filebase.com \
  --profile filebase
At roughly 0.0059 USD per GB it is affordable, and every object automatically gets an IPFS CID.
**Storj DCS** (`storj.io`) launched mainnet in 2018 and splits each object into 80 Reed-Solomon erasure pieces distributed across more than 15,000 nodes worldwide. Any 29 pieces reconstruct the file, and client-side encryption ensures no single node can read the data.
uplink cp video.mp4 sj://my-bucket/video.mp4
Pricing is around 0.004 USD per GB at rest and 0.007 USD per GB egress, comparable to R2. Government, research, and media customers wary of trusting AWS or Azure favor it.
**Arweave** (`arweave.org`) is in its own category. **Pay once, store for at least 200 years** — a permanent storage blockchain. Instead of the conventional monthly billing model, you pay a single upfront fee at upload time.
# Using the Bundlr/Irys gateway
npx @irys/sdk fund 100000 -t arweave -w wallet.json
npx @irys/sdk upload file.jpg -t arweave -w wallet.json
# returns an Arweave transaction ID
Cost is roughly 5 to 10 USD per GB, expensive in absolute terms but bundled with a **permanence** guarantee. Legal evidence, journalism backups, scientific data, and game assets are the natural fit. Since 2025 Solana and Ethereum back ends support the same pattern through Irys/Bundlr, expanding the "blockchain permanence" market.
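Whether pay-once beats monthly billing is a break-even question. The sketch below compares the one-time per-GB price quoted above with a monthly per-GB price; `breakEvenMonths` is a hypothetical helper, and it ignores egress, request fees, and future price changes on either side.

```typescript
// Months of monthly billing that add up to the one-time price.
function breakEvenMonths(oneTimePerGB: number, monthlyPerGB: number): number {
  if (oneTimePerGB <= 0 || monthlyPerGB <= 0) {
    throw new RangeError("prices must be positive");
  }
  return oneTimePerGB / monthlyPerGB;
}

// Arweave at the low end of the quoted range versus S3 Standard storage:
const months = breakEvenMonths(5, 0.023); // about 217 months
const years = months / 12;                // about 18 years
```

At 10 USD per GB the break-even doubles to about 36 years, so on storage cost alone the permanent option only pays off for data meant to outlive a typical product.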
13. The tus Protocol — Resumable Uploads
`tus.io` is a **resumable upload protocol** that Transloadit started in 2014. In 2026 its design is being standardized through an active IETF Internet-Draft, and tus itself is the de facto standard. Uppy, Vimeo, Cloudflare Stream, Supabase Storage, ownCloud, and many other services support it.
The core idea is dead simple: **HEAD to ask the current offset, PATCH to continue from there**.
1) Create the upload
POST /files HTTP/1.1
Upload-Length: 1073741824
Upload-Metadata: filename dmlkZW8ubXA0
Tus-Resumable: 1.0.0

→ 201 Created
Location: /files/abc123
Tus-Resumable: 1.0.0

2) Ask for the current offset after a network drop
HEAD /files/abc123 HTTP/1.1
Tus-Resumable: 1.0.0

→ 200 OK
Upload-Offset: 524288000
Upload-Length: 1073741824
Tus-Resumable: 1.0.0

3) Continue from there
PATCH /files/abc123 HTTP/1.1
Upload-Offset: 524288000
Content-Type: application/offset+octet-stream
Tus-Resumable: 1.0.0

[binary chunk]

→ 204 No Content
Upload-Offset: <previous offset + bytes received>
Tus-Resumable: 1.0.0
This minimal protocol is what lets a 1GB upload that died at 99% resume at exactly 99% on the next Wi-Fi attempt. For mobile and satellite networks where drops are routine, this is non-negotiable.
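The offset handshake is easy to model in memory. The sketch below is not a tus server: `UploadResource` is a hypothetical stand-in for one upload URL that only tracks how many bytes have arrived, and the 400 status for overshooting the declared length is an assumption of this sketch. What it does follow from the protocol is the `key base64(value)` metadata encoding and the 409 Conflict on an offset mismatch.

```typescript
// Encode an Upload-Metadata header value: comma-separated "key base64(value)" pairs.
function encodeUploadMetadata(meta: Record<string, string>): string {
  return Object.entries(meta)
    .map(([key, value]) => `${key} ${Buffer.from(value, "utf8").toString("base64")}`)
    .join(",");
}

interface OffsetResponse { status: number; uploadOffset: number }

// Hypothetical in-memory stand-in for one upload resource on a tus server.
class UploadResource {
  private received = 0;
  private readonly length: number;

  constructor(length: number) {
    this.length = length;
  }

  // HEAD: report how many bytes the server has stored so far.
  head(): OffsetResponse {
    return { status: 200, uploadOffset: this.received };
  }

  // PATCH: accept bytes only when the client's offset matches the server's.
  patch(uploadOffset: number, chunkBytes: number): OffsetResponse {
    if (uploadOffset !== this.received) {
      return { status: 409, uploadOffset: this.received }; // Conflict: client must HEAD and retry
    }
    if (this.received + chunkBytes > this.length) {
      return { status: 400, uploadOffset: this.received }; // status choice is an assumption of this sketch
    }
    this.received += chunkBytes;
    return { status: 204, uploadOffset: this.received };
  }
}

// A client that lost its connection asks for the offset, then sends only the remainder.
function resume(resource: UploadResource, totalBytes: number): OffsetResponse {
  const { uploadOffset } = resource.head();
  return resource.patch(uploadOffset, totalBytes - uploadOffset);
}
```

The server's offset is the single source of truth, which is why a client with stale state gets a 409 instead of silently corrupting the file.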
Several server implementations exist.
# tusd (official Go server)
docker run -p 1080:1080 \
  -v /data:/srv/tusd-data \
  tusproject/tusd:latest \
  -upload-dir=/srv/tusd-data
# Node tus server
npm i @tus/server @tus/file-store
// server.ts
import { Server } from '@tus/server'
import { FileStore } from '@tus/file-store'

const tus = new Server({
  path: '/files',
  datastore: new FileStore({ directory: './uploads' }),
})
// Mount on a Next.js route handler
On the client `tus-js-client` or Uppy's `@uppy/tus` plugin are the standard choices. Both ship with automatic retries, exponential backoff, and localStorage-based fingerprinting so a refresh keeps the same upload session.
A 2025 draft of tus 2.0 introduced **HTTP/3 with 0-RTT**, **parallel chunking**, and **signed metadata**, with formal publication targeted for the second half of 2026.
14. ClamAV / Sublime Security — Malware Scanning
Re-exposing an uploaded file directly as a public download is how users end up reporting "the PDF I downloaded from your site was ransomware." In 2026 **malware scanning on upload is essentially required**.
**ClamAV** (`clamav.net`) is the long-running open-source antivirus engine, first released in the early 2000s and signature-based.
# Run the ClamAV daemon in Docker
docker run -d --name clamav \
  -p 3310:3310 \
  -v clam_db:/var/lib/clamav \
  clamav/clamav:stable
# Scan a file
clamdscan --multiscan /uploads/file.pdf
# /uploads/file.pdf: OK
# or: /uploads/file.pdf: Eicar-Test-Signature FOUND
// Talking to the TCP ClamAV daemon from Node (npm package `clamscan`)
import NodeClam from 'clamscan'

const clam = await new NodeClam().init({
clamdscan: { host: 'clamav', port: 3310 },
})
const { isInfected, viruses } = await clam.scanFile('/uploads/file.pdf')
if (isInfected) {
await deleteFromStorage(key)
throw new Error(`Malware detected: ${viruses.join(', ')}`)
}
ClamAV is fast but **signature-based**, so it misses novel variants and zero-day malware. Most teams pair it with a managed solution.
**Sublime Security** (`sublimesecurity.com`) is a US content-security SaaS founded in 2020, originally for email but expanded into upload pipelines in 2024 after a Series B round. It applies heuristic plus LLM-based checks to email bodies, uploaded documents, and HTML payloads.
// Conceptual Sublime Detection API
const res = await fetch('https://api.sublime.security/v1/scan', {
method: 'POST',
headers: { Authorization: `Bearer ${SUBLIME_KEY}` },
body: file,
})
const verdict = await res.json()
// { score: 0.92, label: 'phishing', signals: ['embedded-macro', 'suspicious-url'] }
Other options:
- **VirusTotal** (`virustotal.com`) — Run by Google, scans through 70+ engines in parallel, free API plus paid Premium
- **Cloudmersive Virus Scan API** — Managed REST API
- **MetaDefender Cloud** (OPSWAT) — 30+ engines concurrently, the multi-scanning reference
- **AWS GuardDuty Malware Protection for S3** — GA in 2024, auto-scans objects on PUT into an S3 bucket
The recommended upload pipeline shape is:
1. Client PUTs via Presigned URL or tus into a **quarantine bucket** on S3/R2
2. An event (S3 ObjectCreated, R2 Queue, Vercel Blob webhook) fires
3. ClamAV/GuardDuty/Sublime scans
4. Clean files move to the **public bucket**; infected ones stay isolated, where they are deleted or held for review
This is the pattern that lets you accept ZIP, TAR, and WebAssembly modules — formats that can hide malicious payloads — without exposing users to them.
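The routing decision in step 4 can be sketched as a pure function. The bucket names, the `infected/` prefix, and `placeAfterScan` itself are hypothetical; the one design choice worth copying is that anything other than an explicit clean verdict stays private, so a crashed or timed-out scanner never publishes a file.

```typescript
type Verdict = "clean" | "infected" | "scan-failed";

interface Placement { bucket: string; key: string; publiclyReadable: boolean }

// Hypothetical bucket names.
const QUARANTINE_BUCKET = "uploads-quarantine";
const PUBLIC_BUCKET = "uploads-public";

function placeAfterScan(key: string, verdict: Verdict): Placement {
  switch (verdict) {
    case "clean":
      // Only an explicit clean verdict promotes the object.
      return { bucket: PUBLIC_BUCKET, key, publiclyReadable: true };
    case "infected":
      // Keep the sample isolated under a separate prefix for review or deletion.
      return { bucket: QUARANTINE_BUCKET, key: `infected/${key}`, publiclyReadable: false };
    default:
      // Fail closed: scanner errors and timeouts leave the object where it landed.
      return { bucket: QUARANTINE_BUCKET, key, publiclyReadable: false };
  }
}
```

An event handler for the storage notification would call the scanner, pass its verdict to this function, and then perform the copy and delete.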
15. Korea / Japan — Toss, Kakao, Mercari
**Toss** — Korea's leading fintech as of 2024, receiving millions of KYC documents, ID photos, and contracts per day. Synthesizing publicly available Toss Tech Blog posts, the pattern looks like this:
- Client side: an in-house React Native module that strips EXIF and resizes client-side
- Transport: HTTPS PUT plus a proprietary chunked protocol similar to tus
- First-stage storage: an AWS S3 quarantine bucket (`uploads-quarantine`)
- Scanning: ClamAV plus an in-house LLM heuristic that decides whether an ID photo is a valid KYC document
- Second-stage storage: AWS S3 with KMS encryption, IAM Role per microservice
- CDN: CloudFront with Signed URL
The point Toss emphasizes is the SLA — **every scan must finish within 30 seconds** so the user can proceed. They use synchronous + fast polling rather than an asynchronous queue.
**Kakao** — KakaoTalk video messages, KakaoStory, and KakaoTV move petabytes of media daily. According to the Kakao Tech blog, Kakao runs its in-house **KakaoCloud Object Storage** (released 2022, based on a SeaweedFS variant with proprietary metadata) alongside AWS S3 and GCS in a multi-cloud setup. Domestic Korean traffic stays on KakaoCloud; global traffic uses S3 plus CloudFront to optimize egress cost.
**Mercari** — Japan's largest C2C marketplace, ingesting tens of millions of photos a day. The Mercari engineering blog describes a base of GCP Cloud Storage plus Cloud CDN, with image resizing handled by an in-house `imageflux`-compatible layer. Since 2024 every upload runs through **AI auto-categorization** and the same pipeline blocks inappropriate content (NSFW, suspected counterfeit) automatically.
**SmartHR** — A major Japanese HR SaaS handling labor contracts and payslips, legal documents subject to seven-year retention. They use AWS S3 with KMS encryption and Object Lock (WORM) to meet the regulatory requirement, with single-use signed URLs issued on each user download. Toss in Korea and SmartHR in Japan share very similar file infrastructure precisely because their regulatory environments rhyme.
The shared traits of all four are (1) an S3-compatible API at the core, (2) native client modules that handle chunking, retries, and EXIF, and (3) synchronous server-side scanning that gates the next step in the user flow.
16. Who Should Pick What
The recommendation matrix by scale and scenario:
| Scenario | First pick | Second pick | Reasoning |
|---|---|---|---|
| Solo Next.js side project | UploadThing | Vercel Blob | Zero-config, free tier |
| Next.js startup, own auth | Uppy + R2 Presigned URL | Uppy + S3 | Egress savings, no lock-in |
| Korean fintech startup | Uppy + AWS S3 KMS + ClamAV | Uppy + KakaoCloud | Regulatory, KMS integration |
| Media-heavy (image/video transforms) | Cloudinary or ImageKit | Filestack | URL transforms, AI auto-tagging |
| Global SaaS with heavy egress | Cloudflare R2 + Uppy tus | Backblaze B2 + CDN | Zero egress |
| Petabyte in-house data center | MinIO | SeaweedFS | S3 compatible, erasure coding |
| Home or small OSS cluster | Garage | SeaweedFS | Low power, multi-region |
| NFT or Web3 metadata | Pinata IPFS | Filebase | OpenSea compatible, permanence |
| Legal, scientific, journalism archive | Arweave (via Irys) | Storj DCS | Pay-once, censorship resistant |
| Enterprise + multi-cloud abstraction | MinIO + S3 + R2 fan-out | Tigris (global replication) | Avoid vendor lock-in |
| Large files (>1GB) | tus + Uppy + R2 | tus + tusd + S3 multipart | Resumable is essential |
| Strict virus gating required | ClamAV + S3 quarantine pattern | Sublime Security + GuardDuty | Layered defense |
**Principles**:
1. **Use Uppy on the client** — Unless you are a solo prototype, almost always reach for Uppy. UI, progress, retries, tus, chunking, and external sources come along for free.
2. **Pick S3-compatible storage** — To avoid lock-in, choose a back end that speaks the S3 API. R2, B2, Wasabi, MinIO, and Tigris all share the same SDK.
3. **Heavy egress means R2 or B2** — If your business is serving media, zero-egress pricing is the deciding factor.
4. **Large files demand tus** — Anything over 1GB without tus is wasteful.
5. **Sensitive files follow quarantine -> scan -> publish** — Never expose freshly uploaded files on a public URL. Land them in a quarantine bucket, scan, then promote.
6. **Strip EXIF on the client** — Photo GPS and device metadata should never reach the server. `piexifjs` or `exif-stripper-js` are enough.
7. **Decentralized only when permanence is real** — IPFS and Arweave are great tools, but they are overkill for everyday SaaS. Use them only when the answer to "must this still exist in 100 years?" is genuinely yes.
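Principle 6 is less work than it sounds for JPEGs, because EXIF lives in APP1 segments that can be dropped without touching the image data. The following is a minimal sketch under simplifying assumptions: `stripApp1` is a hypothetical helper, it handles only well-formed baseline segment layouts, and it removes every APP1 segment (EXIF and XMP alike). It does not cover PNG, HEIC, or video metadata.

```typescript
// Remove all APP1 (0xFFE1) segments from a JPEG byte array.
function stripApp1(jpeg: Uint8Array): Uint8Array {
  if (jpeg.length < 4 || jpeg[0] !== 0xff || jpeg[1] !== 0xd8) {
    throw new Error("not a JPEG");
  }
  const kept: Uint8Array[] = [jpeg.subarray(0, 2)]; // SOI marker
  let pos = 2;
  while (pos + 4 <= jpeg.length) {
    if (jpeg[pos] !== 0xff) throw new Error("malformed JPEG segment");
    const marker = jpeg[pos + 1];
    if (marker === 0xda) break; // start of scan: everything after is image data
    const length = (jpeg[pos + 2] << 8) | jpeg[pos + 3]; // includes the two length bytes
    const end = pos + 2 + length;
    if (length < 2 || end > jpeg.length) throw new Error("malformed JPEG segment length");
    if (marker !== 0xe1) kept.push(jpeg.subarray(pos, end)); // drop APP1, keep the rest
    pos = end;
  }
  kept.push(jpeg.subarray(pos)); // SOS segment, compressed data, EOI

  const out = new Uint8Array(kept.reduce((total, part) => total + part.length, 0));
  let offset = 0;
  for (const part of kept) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

In the browser the input would come from `new Uint8Array(await file.arrayBuffer())`, and the output can be wrapped in a new `Blob` before it is handed to the uploader.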
File upload sits next to checkout in terms of direct user exposure. A user whose 1GB video died at 99% will stop trusting the product, and a service that lets one malware sample through has a hard time recovering reputation. The 2026 tools reduce both problems to single-line APIs — take advantage of them.
References
- Uppy by Transloadit: https://uppy.io
- Uppy GitHub: https://github.com/transloadit/uppy
- UploadThing: https://uploadthing.com
- UploadThing Docs: https://docs.uploadthing.com
- Filestack: https://www.filestack.com
- Cloudinary: https://cloudinary.com
- ImageKit: https://imagekit.io
- AWS S3 Presigned URLs: https://docs.aws.amazon.com/AmazonS3/latest/userguide/PresignedUrlUploadObject.html
- Cloudflare R2: https://developers.cloudflare.com/r2/
- Backblaze B2: https://www.backblaze.com/cloud-storage
- Wasabi: https://wasabi.com
- Tigris: https://www.tigrisdata.com
- Vercel Blob: https://vercel.com/docs/storage/vercel-blob
- Bunny Storage: https://bunny.net/storage/
- MinIO: https://min.io
- SeaweedFS: https://github.com/seaweedfs/seaweedfs
- Garage (Deuxfleurs): https://garagehq.deuxfleurs.fr
- Pinata: https://www.pinata.cloud
- Filebase: https://filebase.com
- Storj DCS: https://www.storj.io
- Arweave: https://www.arweave.org
- tus protocol: https://tus.io
- tus IETF Internet-Draft: https://datatracker.ietf.org/doc/draft-tus-httpbis-resumable-uploads-protocol/
- ClamAV: https://www.clamav.net
- Sublime Security: https://sublimesecurity.com
- VirusTotal: https://www.virustotal.com
- AWS GuardDuty Malware Protection for S3: https://docs.aws.amazon.com/guardduty/latest/ug/malware-protection-s3.html
- Standard Webhooks: https://www.standardwebhooks.com (related Svix ecosystem)
- Toss Tech Blog: https://toss.tech
- Kakao Tech: https://tech.kakao.com
- Mercari Engineering: https://engineering.mercari.com
- SmartHR Tech Blog: https://tech.smarthr.jp