Decentralized Storage 2026 Complete Guide - Filecoin, Arweave, Storj, IPFS, Walrus (Sui), Shadow Drive, Greenfield, EigenDA Deep Dive
Author: Youngju Kim (@fjvbn20031)
Prologue — "Where Do You Put It?" Became a Question Again
In the cloud era, "where do you put it" stopped being a question. You threw bytes at S3 and cached them through CloudFront if needed. Done. But since 2024 the picture has shifted fast.
- November 2024: Walrus on Sui launched mainnet, introducing a new category: "BLOBs not on chain, but vouched for by a chain."
- 2024: EigenDA went live on top of EigenLayer restaking, dropping data-availability costs for Ethereum rollups to single-digit cents.
- 2025: Web3.Storage rebranded as Storacha and made its identity clear: "the hot S3-compatible gateway on top of Filecoin."
- 2025: In the Arweave ecosystem, Bundlr renamed itself Irys and declared itself "Data Layer 1."
- Early 2026: Filecoin's FVM (Filecoin Virtual Machine) made data DAOs and compute-over-data workloads a routine pattern.
This article walks the 2026 decentralized storage stack end-to-end in one sitting. What content addressing is, what Filecoin, Arweave, Storj, Walrus, Greenfield, EigenDA, and Celestia each do well, and a matrix for deciding "where should I put my data?" between them.
Chapter 1 · Content Addressing - Where Everything Starts
To understand decentralized storage, you have to hold one idea in your hand: content addressing. The cloud points to a "location." Example: s3://my-bucket/photos/cat.jpg. If someone swaps the file underneath, the URL is unchanged.
Content addressing flips this. The hash of the data is the address.
[Bytes] -- SHA-256 / Blake3 --> [Hash] --> [CID]
   |                                         |
   v                                         v
"11MB cat photo"              bafybeigdyrzt5sfp7udm7hu76uh7y26nf...
A CID (Content Identifier) is self-verifying. Re-hashing the bytes you received catches any lie instantly. So in content-addressed land, it does not matter who delivered the data. The unit of trust moves from "host" to "content."
Remember in one line: "Location addressing trusts the host; content addressing trusts the bits."
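The self-verification idea can be sketched in a few lines. This is a minimal sketch, assuming Node.js and its built-in crypto module; the helper names contentAddress and verify are hypothetical, and the "address" here is a plain hex SHA-256 digest rather than a real multihash-encoded CID.

```javascript
const { createHash } = require('node:crypto')

// Hypothetical helper: derive an address from the content itself.
// A real CID also encodes the hash function and codec; this is only the digest.
function contentAddress(bytes) {
  return createHash('sha256').update(bytes).digest('hex')
}

// Re-hash whatever an untrusted peer delivered and compare with the address we asked for.
function verify(address, bytes) {
  return contentAddress(bytes) === address
}

const original = Buffer.from('cat photo bytes')
const tampered = Buffer.from('dog photo bytes')
const address = contentAddress(original)

console.log(verify(address, original)) // true
console.log(verify(address, tampered)) // false
```

The check never asks who sent the bytes, which is the whole point: trust attaches to the digest, not to the host.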
Chapter 2 · IPFS and libp2p - The Foundation of the Distributed File System
IPFS (InterPlanetary File System) is the protocol that puts content addressing on top of a P2P network. libp2p is the modular networking layer beneath it, shared by Ethereum, Filecoin, Polkadot, Lodestar, and countless other projects.
The 2026 IPFS implementation landscape:
- Kubo (Go) — Protocol Labs' reference implementation. CLI and node daemon. Stable but heavy.
- Helia (JS/TS) — modular client that runs in browsers and Node.js. Between 2024 and 2026, Helia became the de-facto default for web IPFS.
- Iroh (Rust) — Number 0's next-generation P2P data transfer. IPFS-compatible but explicitly aimed at "smaller and faster."
// Helia example - fetch a file by CID in the browser
import { createHelia } from 'helia'
import { unixfs } from '@helia/unixfs'
import { CID } from 'multiformats/cid'

const helia = await createHelia()
const fs = unixfs(helia)

// fs.cat expects a CID object, so parse the string form first
const cid = CID.parse('bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi')
const decoder = new TextDecoder()
for await (const chunk of fs.cat(cid)) {
  console.log(decoder.decode(chunk, { stream: true }))
}
IPFS's weakness is obvious. Publishing is one-shot, but without a pin the data gets garbage-collected. In other words, IPFS is a "transport protocol," and persistence must be bought separately at a different layer. That is why Filecoin, Arweave, and Pinata exist in the next chapters.
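Why an unpinned CID disappears is easier to see in a toy model. The sketch below is not Kubo's or Helia's actual garbage collector; ToyNode and its methods are hypothetical names, and a real node also protects blocks reachable from pinned roots. It only shows the rule "blocks outside the pin set are fair game."

```javascript
// Toy model of a single node: a block store plus a pin set.
class ToyNode {
  constructor() {
    this.blocks = new Map() // cid -> bytes
    this.pins = new Set()   // cids protected from garbage collection
  }

  add(cid, bytes) {
    this.blocks.set(cid, bytes)
  }

  pin(cid) {
    this.pins.add(cid)
  }

  // Drop every block that is not pinned and return the removed CIDs.
  gc() {
    const removed = []
    for (const cid of [...this.blocks.keys()]) {
      if (!this.pins.has(cid)) {
        this.blocks.delete(cid)
        removed.push(cid)
      }
    }
    return removed
  }
}

const node = new ToyNode()
node.add('cid-a', 'pinned data')
node.add('cid-b', 'unpinned data')
node.pin('cid-a')

const removed = node.gc()
console.log(removed)                  // [ 'cid-b' ]
console.log(node.blocks.has('cid-a')) // true
```

A pinning service is, in this picture, somebody else's node that keeps your CID in its pin set for as long as you pay.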
Chapter 3 · Filecoin - The Standard for Decentralized Cold Storage
Protocol Labs' Filecoin attacks IPFS's "missing persistence" head-on. Storage providers (SPs) prove cryptographically that they hold the data, and they get paid in tokens for doing so. Two proofs do the heavy lifting.
- PoRep (Proof of Replication) — a one-shot proof that "this SP actually created a unique replica of this data on disk." The sealing process.
- PoSt (Proof of Spacetime) — a repeated proof that "the replica is still alive," submitted on a fixed cadence. By default each 24-hour proving period is split into 30-minute challenge windows.
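The cadence above (24-hour windows, 30-minute challenges) is plain arithmetic, and a small sketch makes the schedule tangible. The constant and function names are hypothetical, and this models only the timetable, not the proof itself.

```javascript
// Figures taken from the text: a 24-hour period cut into 30-minute challenge windows.
const PROVING_PERIOD_MINUTES = 24 * 60
const CHALLENGE_WINDOW_MINUTES = 30
const WINDOWS_PER_PERIOD = PROVING_PERIOD_MINUTES / CHALLENGE_WINDOW_MINUTES

// Which challenge window does a given minute fall into?
// The offset wraps around at the end of each 24-hour period.
function windowIndex(minutesSinceStart) {
  const offset = minutesSinceStart % PROVING_PERIOD_MINUTES
  return Math.floor(offset / CHALLENGE_WINDOW_MINUTES)
}

console.log(WINDOWS_PER_PERIOD) // 48
console.log(windowIndex(29))    // 0
console.log(windowIndex(30))    // 1
console.log(windowIndex(1439))  // 47
console.log(windowIndex(1440))  // 0
```

An SP that misses its window does not get a second chance within that period, which is why "still alive" is checked daily rather than once.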
In 2026 Filecoin is no longer just cold storage.
- FVM (Filecoin Virtual Machine) — an EVM-compatible VM that launched in 2023. Smart contracts now interact with storage deals and data directly.
- Data DAOs — a pattern where dataset governance is owned by a DAO. CIDgravity, Lighthouse, and Banyan provide the foundation.
- Compute over Data — Bacalhau and friends run code where the data already lives. Combined with AI datasets, this is potent.
Prices float like a compute market. As of May 2026 the average price per GiB-month is well under 0.0001 FIL, but the real cost varies by SP, and retrieval is billed separately.
The token is written $FIL. Quoted prices follow the market rate.
Chapter 4 · Arweave - The Strange Bet on "200-Year Permanent Storage"
Arweave's bet is simple. Pay once, stored forever. How? It collects 200 years of storage cost up front into an endowment fund, and pays miners out of that fund's yield. It rides on a Moore-style assumption that disk price falls every year.
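The reason a one-time payment can cover a very long horizon is that a steadily shrinking yearly cost adds up to a finite sum. The sketch below illustrates that geometric-series argument only; the 0.5% yearly decline and the unit cost are illustrative assumptions, not Arweave's actual parameters, and the function names are hypothetical.

```javascript
// Total cost of storing one unit of data for `years`, when the yearly cost
// shrinks by `annualDecline` (a fraction, e.g. 0.005 for 0.5%) each year.
function totalStorageCost(firstYearCost, annualDecline, years) {
  let total = 0
  let cost = firstYearCost
  for (let year = 0; year < years; year++) {
    total += cost
    cost *= 1 - annualDecline
  }
  return total
}

// Upper bound for an unlimited horizon: the geometric series converges to
// firstYearCost / annualDecline.
function foreverBound(firstYearCost, annualDecline) {
  return firstYearCost / annualDecline
}

// Easy case to check by hand: 1 + 0.5 + 0.25
console.log(totalStorageCost(1, 0.5, 3)) // 1.75

// Assumed 0.5% yearly decline over 200 years stays below the "forever" bound.
const twoHundredYears = totalStorageCost(1, 0.005, 200)
console.log(twoHundredYears < foreverBound(1, 0.005)) // true
```

If the decline assumption fails, the sum no longer converges on budget, which is why the guarantee is economic rather than mathematical.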
Architecturally it's a blockweave, a modified blockchain. Where a normal chain demands you verify the "most recent block," blockweave uses PoA (Proof of Access) which demands you verify an arbitrary past block. That gives miners an incentive to keep old data around.
The ecosystem is not just a chain:
- AR — Arweave's native token. Permanent-storage settlement is in AR.
- Bundlr to Irys — an L2 that lets you pay for permanent storage with many tokens. In 2025 Bundlr rebranded as Irys and started using the phrase "programmable datachain."
- Othent, ArConnect — permanent-storage SaaS and a wallet extension.
- ArDrive — permanent cloud drive UX.
The use cases are obvious. NFT metadata losing its IPFS pin, journalism and archival, publications that need censorship resistance. One caveat: the "permanent" guarantee depends on the funding assumption of the endowment fund. It is not a 100% mathematical guarantee but an economic guarantee.
The token is written $AR, and the price should be looked up separately.
Chapter 5 · Storj DCS - S3-Compatible Decentralized Cloud
Storj bet in a different direction. Make it indistinguishable from S3 in the user-experience sense, with a decentralized erasure-coded backend.
Three core concepts:
- Satellite — a trusted coordinator that handles metadata, billing, and SLAs. Initially run by Storj, but anyone can run their own.
- Storage Node — a node renting out actual disk space. It gets paid.
- Erasure Coding — files are split into 80 pieces, and any 29 of them are enough to reconstruct. More than half of the pieces (up to 51 of 80) can disappear and the data still lives.
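The 29-of-80 figures translate directly into two numbers worth knowing: how many pieces can vanish, and how much extra disk the scheme costs. A minimal sketch, with erasureStats as a hypothetical helper name:

```javascript
// k = pieces needed to reconstruct, n = pieces stored in total.
function erasureStats(k, n) {
  return {
    tolerableLosses: n - k, // pieces that may disappear without data loss
    expansionFactor: n / k, // stored bytes per original byte
  }
}

const stats = erasureStats(29, 80)
console.log(stats.tolerableLosses)            // 51
console.log(stats.expansionFactor.toFixed(2)) // 2.76
```

For comparison, keeping three full replicas costs 3x the disk and survives only two losses, so the erasure-coded layout is both cheaper and far more tolerant.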
# uplink CLI - Storj is both S3-compatible and has its own CLI
uplink cp ./video.mp4 sj://my-bucket/video.mp4
uplink share --readonly sj://my-bucket/video.mp4
# Or s3cmd / aws-cli
aws s3 cp ./video.mp4 s3://my-bucket/ \
--endpoint-url https://gateway.storjshare.io
As of May 2026, pricing is around 4 USD/TB-month for storage and 7 USD/TB for egress, a fraction of AWS S3 Standard list prices (about 23 USD/TB-month for storage and roughly 0.09 USD/GB, or about 90 USD/TB, for egress). The cheap egress is the decisive bit. Storj wins on "video workloads and backups."
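To see why egress dominates, it helps to run the numbers for a concrete workload. The sketch below uses the Storj figures quoted above; the S3 rates (23 USD/TB-month storage, about 90 USD/TB egress) are rounded list-price assumptions, and monthlyCost is a hypothetical helper.

```javascript
// Monthly bill in USD for a given amount stored and downloaded.
function monthlyCost(storedTB, egressTB, rates) {
  return storedTB * rates.storagePerTB + egressTB * rates.egressPerTB
}

const storj = { storagePerTB: 4, egressPerTB: 7 }
const s3Standard = { storagePerTB: 23, egressPerTB: 90 } // assumption: rounded list prices

// Example workload: 10 TB stored, 5 TB downloaded per month.
const storjBill = monthlyCost(10, 5, storj)
const s3Bill = monthlyCost(10, 5, s3Standard)
console.log(storjBill) // 75
console.log(s3Bill)    // 680
```

In this example most of the S3 bill (450 of 680 USD) is egress, which is why download-heavy workloads feel the difference first.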
The token is $STORJ, an ERC-20. SNOs (storage node operators) get paid in it. End users can pay with either card or crypto.
Chapter 6 · Sia and Skynet - The Oldest P2P Cloud
Sia has been running since 2015 — the veteran of decentralized storage. Its model is Renter-Host contracts, a direct matching market.
- A renter strikes a three-month contract with a host, who locks up collateral and stores the data.
- The host must submit a storage proof at regular intervals to be paid.
- Skynet was the CDN-like gateway layer built on top, but it shut down in 2022. Today's role is filled by Sia itself and successor projects from SkynetLabs (such as Filebase's Renterd-Hosted offering).
In 2026 Sia's position is clear: "a solid P2P backbone plus an S3-compatible gateway via Filebase or Renterd-Hosted." Its security model is the most conservative, and pricing competes with Storj. The downside is that the DApp ecosystem around it is thinner than Filecoin or Arweave's.
Chapter 7 · Walrus (Sui) - The 2024 Dark Horse
Built by Mysten Labs on top of Sui, Walrus launched mainnet in November 2024 and got attention immediately. Its core trick is RaptorQ-based erasure coding: data survives the loss of a third of the nodes, and recovery is cheaper by a single-digit factor than with traditional 2D Reed-Solomon.
The one-line design. "BLOB metadata is on-chain on Sui; the actual bytes are RaptorQ-encoded shards spread across off-chain nodes."
[User] -> [Publisher API] -> [Encode BLOB into 1024 RaptorQ shards]
                                                |
                                                v
                                     [Storage Node Cluster]
                                       |     |     |     |
                                       v     v     v     v
                                     shard shard shard  ...
                                                |
[User retrieval] <--- [Aggregator API] <--------+
                                                |
[Sui on-chain: metadata, payment, liveness proofs]
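The "encode into shards, lose some, still recover" step can be shown with the simplest possible erasure code. The sketch below is a 2-of-3 XOR parity scheme, a deliberately tiny stand-in and not the RaptorQ encoding described above; encode, decode, and xor are hypothetical helper names, and it tolerates exactly one missing shard.

```javascript
// XOR two equal-length buffers byte by byte.
function xor(a, b) {
  const out = Buffer.alloc(a.length)
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i]
  return out
}

// Split a blob into two data shards (zero-padded to equal size) plus one parity shard.
function encode(blob) {
  const half = Math.ceil(blob.length / 2)
  const a = Buffer.alloc(half)
  const b = Buffer.alloc(half)
  blob.copy(a, 0, 0, half)
  blob.copy(b, 0, half)
  return { shards: [a, b, xor(a, b)], length: blob.length }
}

// Rebuild the blob from any two of the three shards; the missing one is passed as null.
function decode(shards, length) {
  let [a, b, parity] = shards
  if (!a) a = xor(b, parity)
  if (!b) b = xor(a, parity)
  return Buffer.concat([a, b]).subarray(0, length)
}

const { shards, length } = encode(Buffer.from('hello walrus'))
// Simulate a storage node going offline with the first shard.
const recovered = decode([null, shards[1], shards[2]], length)
console.log(recovered.toString()) // hello walrus
```

Production codes generalize the same idea to many shards and many simultaneous losses; the on-chain part only needs to record which blob the shards belong to.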
Highlights:
- High availability — designed for 99.999% availability. If Filecoin is "cold storage," Walrus is "web-friendly hot storage."
- Sui integration — Move smart contracts handle BLOB objects directly. NFTs, games, live video, and AI datasets glue in naturally.
- Sites — Walrus Sites lets you upload a static site as BLOBs mapped to Sui objects for censorship-resistant hosting.
As of 2026, Walrus has rapidly taken the seat for "the most natural BLOB store to attach to a Web3 dApp." Adoption is strongest in NFT metadata, SocialFi, and game assets.
Chapter 8 · Shadow Drive (Solana / GenesysGo)
Shadow Drive is a GenesysGo-built storage layer on Solana. It pairs Solana's fast transactions with NFT collection issuance where the metadata is handled inside the same chain.
Highlights:
- SHDW token for payment, written SHDW.
- Immutable/mutable modes — within the same drive, you choose whether files are fixed or updatable.
- Solana RPC integration — infra providers like Helius and Triton recommend Shadow Drive as the default backend.
The downside is plain. Outside the Solana ecosystem, adoption is low, and Solana's own downtime issues weigh on the reliability story.
Chapter 9 · BNB Greenfield - A Storage Bet from an Exchange Coin
BNB Chain launched Greenfield in 2023 as decentralized object storage running inside the BNB ecosystem. What's interesting is that object permissions tie directly to smart contracts on BNB Smart Chain.
- Twenty-six SPs (Storage Providers) at launch, a deliberately bounded set, expanding gradually.
- S3-compatible API — Greenfield SDK plus an S3 gateway.
- Cross-chain messages — things that happen on Greenfield reflect into BNB Smart Chain contracts immediately.
Greenfield's pitch is "integrated price/performance inside the BNB ecosystem." External adoption is still limited, but BNB games and SocialFi projects pick it up as a modular component.
Chapter 10 · EigenDA and the Rise of Modular DA
This is a different category. DA (Data Availability) is not about "long-term storage." It is the guarantee that "the data a rollup posted is retrievable by anyone for a short window (say, two weeks)." Not the same as permanent storage.
Posting data as calldata on Ethereum mainnet is expensive. Modular DA layers exist to solve that.
- EigenDA — DA on top of EigenLayer restaking. Launched in 2024. Pricing is 10x-100x cheaper than mainnet calldata. Mantle and Celo are notable adopters.
- Celestia — the original modular DA. Mainnet in 2023. Mantle, Dymension, and Manta Pacific are headline users.
- Avail — a Polygon spin-out. KZG-proof-based.
- NEAR DA — DA running on NEAR Protocol. Used by parts of Aurora and StarkNet.
[L2 transactions] -> [L2 sequencer batches and compresses]
                                       |
                                       v
                           [Submit BLOB to DA layer]
                                       |
                                       v
                   [Only the commitment goes to L1 Ethereum]
                                       |
                                       v
    [Validator / user pulls original from DA layer -> verifies via KZG/RS]
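The "only the commitment goes to L1" step can be illustrated with a hash-based commitment. The sketch below builds a Merkle root over a batch, which is a simplified stand-in for the KZG commitments mentioned in the diagram; merkleRoot is a hypothetical helper and the batch contents are made up.

```javascript
const { createHash } = require('node:crypto')

const sha256 = (buf) => createHash('sha256').update(buf).digest()

// Fold a list of chunks into a single 32-byte commitment.
function merkleRoot(chunks) {
  if (chunks.length === 0) throw new Error('need at least one chunk')
  let level = chunks.map((chunk) => sha256(chunk))
  while (level.length > 1) {
    const next = []
    for (let i = 0; i < level.length; i += 2) {
      const right = level[i + 1] ?? level[i] // duplicate the last node on odd levels
      next.push(sha256(Buffer.concat([level[i], right])))
    }
    level = next
  }
  return level[0].toString('hex')
}

// The sequencer posts the full batch to the DA layer and only this root to L1.
const batch = ['tx-1', 'tx-2', 'tx-3'].map((tx) => Buffer.from(tx))
const commitment = merkleRoot(batch)

// Anyone who later pulls the batch from the DA layer can recompute and compare.
const pulled = ['tx-1', 'tx-2', 'tx-3'].map((tx) => Buffer.from(tx))
const forged = ['tx-1', 'tx-X', 'tx-3'].map((tx) => Buffer.from(tx))
console.log(merkleRoot(pulled) === commitment) // true
console.log(merkleRoot(forged) === commitment) // false
```

The L1 footprint stays at 32 bytes no matter how large the batch is, which is where the cost saving comes from.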
The 2026 pattern is clear: "new L2s pick EigenDA as their DA" has become the dominant default. The price gap is decisive.
Chapter 11 · Cardano Hydra Data, Aleph.im - Other Bets
- Cardano Hydra Data — a DA solution on Cardano. It rides on the Hydra Head L2 to handle data streaming and BLOBs. EUTXO-model-friendly.
- Aleph.im — a hybrid of compute and storage. Indexing, dApp backends, serverless functions all bundled into one infra token. From 2025 onward it has emphasized AI inference workloads.
Chapter 12 · Hot Gateways - Pinata, Storacha, NFT.Storage
For developers who never touch decentralized storage directly, the practical interface is the gateway/SaaS layer.
- Pinata — the veteran of IPFS pinning SaaS. JWT auth, gateway domains, and group policies make it a full stack. A large share of NFT projects sit here.
- Storacha (formerly Web3.Storage) — the Protocol Labs successor. UCAN-based (delegatable capability) auth, backed by Filecoin. Its identity as "hot S3-compatible on top of Filecoin" is now clear.
- NFT.Storage — free pinning for small NFT metadata. Policy changes between 2024 and 2025 narrowed the free tier, so check pricing before relying on it for a business.
- Quicknode IPFS — IPFS pinning plus gateway from the RPC infra company Quicknode.
- Filebase — multi-backend SaaS bundling Sia + Storj + IPFS behind an S3-compatible API.
// Storacha example - UCAN-based upload
import { create } from '@storacha/client'
const client = await create()
// login resolves once the email link is confirmed and returns the account
const account = await client.login('me@example.com')
// Passing the account provisions the new space under it
const space = await client.createSpace('my-app', { account })
await client.setCurrentSpace(space.did())
const file = new File(['hello world'], 'hello.txt')
const cid = await client.uploadFile(file)
console.log('CID:', cid.toString())
Remember in one line: "Keep the infrastructure cold (Filecoin/Arweave) and use a hot gateway (Storacha/Pinata/Walrus)."
Chapter 13 · Data DAOs - When Data Itself Becomes Governance
The most interesting pattern to grow out of Filecoin is the Data DAO. The storage, access, and licensing of a dataset are subject to a DAO's decisions.
- CIDgravity, Lighthouse, Banyan — Data DAO infrastructure.
- Numbers Protocol — photo and video provenance plus DAO.
- DIMO — an automotive data DAO. Users share vehicle telemetry and receive tokens.
- Hivemapper — drivers share dashcam footage to build a map. HONEY token.
Vana and Ocean Protocol slot in here.
- Ocean Protocol — a marketplace where datasets and algorithms are tokenized as NFTs. Compute-to-Data ships the algorithm to the data instead of the other way around.
- Vana — the "AI data DAO" infrastructure rising through 2024-2025. Personal data is tokenized and consumed in LLM training.
The 2026 pattern is plain. The most active area is DAO tokenization of ownership and rights over AI training data.
Chapter 14 · Decentralized Databases - Ceramic, OrbitDB, Tableland
Storage is BLOBs, databases are queries. Two different problems.
- Ceramic — stream-based decentralized DB. ComposeDB cemented a GraphQL interface around 2025.
- OrbitDB — a P2P DB running on top of IPFS. Various data types like Key-Value, Docstore, and Eventlog.
- GunDB — a graph-based real-time P2P DB. A long-lived library.
- Tableland — "on-chain SQL." A table is itself an EVM NFT. Reads go through a validator network, writes via EVM transaction.
- Polybase / Borph — newer zk-based decentralized DB attempts between 2024-2025.
-- Tableland - create an on-chain table
CREATE TABLE my_users_5_42 (
id INTEGER PRIMARY KEY,
username TEXT NOT NULL,
joined_at INTEGER
);
-- Insert data - EVM transaction
INSERT INTO my_users_5_42 (id, username, joined_at)
VALUES (1, 'alice', 1716000000);
-- Read - free, via the Tableland Gateway
SELECT * FROM my_users_5_42 WHERE id = 1;
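The table name in the SQL above is not arbitrary. Assuming the prefix_chainId_tableId naming pattern that the example follows, a small parser shows what each part means; parseTableName is a hypothetical helper, not part of any Tableland SDK.

```javascript
// Split a table name of the form <prefix>_<chainId>_<tableId> into its parts.
function parseTableName(name) {
  const match = /^(.*)_(\d+)_(\d+)$/.exec(name)
  if (!match) throw new Error(`not a prefix_chainId_tableId name: ${name}`)
  return {
    prefix: match[1],
    chainId: Number(match[2]),
    tableId: Number(match[3]),
  }
}

const parsed = parseTableName('my_users_5_42')
console.log(parsed) // { prefix: 'my_users', chainId: 5, tableId: 42 }
```

Reading the name this way explains the "a table is itself an EVM NFT" remark: the last number identifies the token on the chain named by the middle number.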
One-line picker. For user profiles and SocialFi use Ceramic; for queryable NFT metadata use Tableland; for P2P collaboration apps use OrbitDB/GunDB.
Chapter 15 · CRDT and Sync - Y.js, Automerge, Replicache
Some apps face the problem that "multiple devices edit the same data concurrently and conflicts must resolve automatically." The data structure built to solve it is a CRDT (Conflict-free Replicated Data Type).
- Y.js — the de-facto standard CRDT library in 2026, widely used as the sync layer in collaborative editors. It binds to WebSocket, WebRTC, Hyperswarm, or any other transport.
- Automerge — the CRDT from the Ink and Switch lab. v2's Rust core was the inflection point for serious performance.
- Replicache / Reflect — Rocicorp's "client-first sync engine." Not a CRDT, but solves a similar problem.
CRDTs themselves are not decentralized. But the combination of CRDT + libp2p + content addressing makes server-less collaboration apps real. The new pattern of 2025-2026.
// Y.js plus libp2p for a P2P collaborative document
// Note: the provider import is illustrative; check the libp2p provider package
// you use for its actual export name and constructor arguments
import * as Y from 'yjs'
import { LibP2pProvider } from 'y-libp2p'
const doc = new Y.Doc()
const provider = new LibP2pProvider('shared-room', doc)
const ytext = doc.getText('content')
// One side appends text
ytext.insert(0, 'Hello, decentralized world')
// Other peers sync automatically - no server
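What "conflicts resolve automatically" means is easiest to see on one of the simplest CRDTs, a grow-only counter. The sketch below is not how Y.js works internally (Y.js implements a sequence CRDT for text); increment, merge, and value are hypothetical helpers that only demonstrate the property that merge order does not matter.

```javascript
// A G-Counter is a map from replica id to that replica's own count.
function increment(counter, replicaId, by = 1) {
  return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + by }
}

// Merge takes the per-replica maximum, so it is commutative and idempotent.
function merge(a, b) {
  const out = { ...a }
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] ?? 0, n)
  }
  return out
}

function value(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0)
}

// Two devices edit offline, then exchange state in either order.
let laptop = increment({}, 'laptop')
laptop = increment(laptop, 'laptop')
const phone = increment({}, 'phone')

const laptopView = merge(laptop, phone)
const phoneView = merge(phone, laptop)
console.log(value(laptopView)) // 3
console.log(value(phoneView))  // 3
```

Because both views converge without a coordinator, the state can travel over any transport, including a P2P one.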
Chapter 16 · Distributed Keys - Lit Protocol, Threshold (formerly NuCypher)
Once you put data on decentralized storage, the question becomes "who can read it." The answer is distributed key management.
- Lit Protocol — MPC-based threshold signing and decryption. "Keys release when conditions are met" — access control conditions. As of V8 in 2025, behavior stabilized.
- Threshold Network (the NuCypher plus Keep merger) — Proxy Re-Encryption (PRE) and tBTC (threshold Bitcoin bridge). Data access is expressed as delegable rights.
// Lit Protocol - condition: "only a wallet holding at least one token from this contract can decrypt"
// litClient is assumed to be an already-connected Lit client instance
const accessControlConditions = [
  {
    contractAddress: '0x...',
    standardContractType: 'ERC721',
    chain: 'ethereum',
    method: 'balanceOf',
    parameters: [':userAddress'],
    returnValueTest: { comparator: '>=', value: '1' }
  }
]
const { ciphertext, dataToEncryptHash } = await litClient.encrypt({
  accessControlConditions,
  // encode the string to bytes (Uint8Array) before encrypting
  dataToEncrypt: new TextEncoder().encode('secret payload')
})
// The ciphertext is safe to upload to IPFS/Walrus
Use cases: NFT-gated content, token-gated video, documents visible only to DAO members.
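Underneath any key-management network sits ordinary client-side encryption: encrypt locally, upload only ciphertext. The sketch below shows that half with AES-256-GCM from Node's built-in crypto module. It does not use the Lit SDK; in a Lit or Threshold setup the key (here just a local random value) is what the network would release when conditions are met. The encrypt and decrypt helper names are hypothetical.

```javascript
const { randomBytes, createCipheriv, createDecipheriv } = require('node:crypto')

// Encrypt a UTF-8 string; the returned object is what you would upload.
function encrypt(plaintext, key) {
  const iv = randomBytes(12) // fresh nonce for every message
  const cipher = createCipheriv('aes-256-gcm', key, iv)
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()])
  return { iv, ciphertext, tag: cipher.getAuthTag() }
}

// Decrypt and authenticate; throws if the key is wrong or the data was altered.
function decrypt({ iv, ciphertext, tag }, key) {
  const decipher = createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8')
}

const key = randomBytes(32)
const sealed = encrypt('secret payload', key)
console.log(decrypt(sealed, key)) // secret payload
```

Storage nodes only ever see the sealed object, so the access question reduces to who is allowed to obtain the key.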
Chapter 17 · NFT Metadata - The Most Common Traps and Best Practice
Three traps NFT issuers routinely fall into:
- Putting metadata at an HTTPS URL — if the operator disappears, so does the NFT's image. A common 2021-2022 accident.
- An IPFS CID with no pin — once Pinata's free tier runs over, GC happens. Without Filecoin backing, the future is uncertain.
- Mutable metadata JSON — if the image is on IPFS but the metadata JSON is at a mutable URL, swapping out the JSON is a scam vector.
Best practice as of 2026:
- Image — IPFS CID plus Filecoin/Storacha pin plus an Arweave second backup.
- Metadata JSON — IPFS CID, with Arweave as a single backup, immutable.
- Contract tokenURI — start with ar://... or ipfs://.... Never hardcode an HTTPS gateway URL.
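The tokenURI rule is easy to enforce in a mint script or CI check. A minimal sketch, where isContentAddressedURI and the sample URIs are hypothetical:

```javascript
// Schemes that point at content rather than at a host.
const ALLOWED_SCHEMES = ['ipfs://', 'ar://']

// True when the URI uses an allowed scheme and has something after it.
function isContentAddressedURI(uri) {
  return ALLOWED_SCHEMES.some(
    (scheme) => uri.startsWith(scheme) && uri.length > scheme.length
  )
}

console.log(isContentAddressedURI('ipfs://bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi/1.json')) // true
console.log(isContentAddressedURI('ar://some-transaction-id'))                   // true
console.log(isContentAddressedURI('https://ipfs.io/ipfs/bafybeigdyrzt5sfp7udm')) // false
```

Rejecting the HTTPS form at mint time is cheap; fixing it after the contract is deployed usually is not.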
Walrus and Filecoin both work well for NFT metadata, but the billing and access patterns differ. At collection-launch time Walrus is strong for hot access; for long-term retention Filecoin gives better price/performance.
Chapter 18 · The Current Position of Korean and Japanese Projects
Korea:
- Kaia (Klaytn plus Finschia, post-2024 merger) — an EVM-compatible chain bridging the LINE messenger and the Kakao ecosystem. No native storage module, but plenty of dApps integrate IPFS and Filecoin.
- BORA — a gaming chain under Kakao Games. The standard for NFT metadata is IPFS plus Arweave dual backup.
- ICON — a first-generation Korean home-grown chain. Native IPC (inter-blockchain communication) plus IPFS integration.
Japan:
- Astar Network — a Polkadot parachain. Integration with the Polkadot ecosystem's IPFS work and Filecoin collaborations.
- Soneium — Sony's 2024 announcement of an OP Stack L2. Sony Music and game asset NFTs are core use cases. EigenDA and Celestia are discussed as modular DA options.
- Oasys — a gaming-focused L1. Mainnet plus Verse Layer structure. Combination cases with game-friendly storage like Walrus and Shadow Drive.
The regional pattern is consistent. EVM-compatible chains plus IPFS/Filecoin backing are the baseline for both Korea and Japan, while game/entertainment use cases gradually pull in newer players like Walrus and EigenDA.
Chapter 19 · The Decision Matrix - "Where Should My Data Live?"
The one-page summary of this article: what to use, and when.
| Scenario | Primary | Secondary backup | Notes |
|---|---|---|---|
| NFT image (permanent) | Arweave | Filecoin + IPFS | prefer ar:// |
| NFT metadata JSON | IPFS + Storacha | Arweave | immutable CID |
| Video / podcast (hot) | Storj | Storacha | egress is decisive |
| Backup (cold) | Filecoin | Sia (Filebase) | gas-efficient |
| L2 rollup DA | EigenDA | Celestia / Avail | price-first |
| Game assets (low latency) | Walrus | Shadow Drive | hot access |
| AI training data + DAO | Filecoin + Vana | Ocean Protocol | Compute-to-Data |
| Censorship-resistant publishing | Arweave | IPFS + Filecoin | immutable |
| User profile (Web3 SocialFi) | Ceramic | Lens / Farcaster Hub | GraphQL |
| Collaborative docs (P2P) | Y.js + libp2p | OrbitDB | server-less |
| Token-gated content | Walrus + Lit | IPFS + Lit | access control |
Remember in one line: "Do not use one backend only. Hot and cold, permanent and ephemeral, DA and BLOB are different tools."
Chapter 20 · Operational Traps - Real Footguns
- CID migration — moving from CIDv0 (starts with Qm) to CIDv1 (starts with b) changes the CID for the same content. If it is hardcoded into a contract's tokenURI, it is painful. Use CIDv1 from the beginning.
- Gateway dependence — public gateways like https://ipfs.io/ipfs/... are free but have no SLA. For production, run your own gateway or use a dedicated gateway from Pinata or Storacha.
- Name resolution (IPNS, DNSLink) — IPFS's mutable name system. IPNS is slow. DNSLink is practical but requires DNS control. ENS contenthash is the most elegant.
- Egress bombing — when traffic spikes, Pinata/Storacha bills blow up. Putting a CDN cache (Cloudflare Workers, Fastly) in front is the standard pattern.
- GDPR versus permanent storage — if you store personal data on Arweave, you cannot honor the "right to erasure." Client-side encryption is in practice the only workaround.
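Two of the footguns above, CID versions and gateway dependence, can be handled with small helpers. The sketch below tells CIDv0 from base32 CIDv1 by string shape only (it does not decode or validate the multihash, and CIDv1 in other bases is not recognized) and keeps the gateway host configurable instead of hardcoded. The function names and gateway.example.com are hypothetical.

```javascript
// CIDv0: 46 base58btc characters starting with "Qm".
// CIDv1 (default base32 form): "b" followed by lowercase base32 characters.
function cidVersion(cid) {
  if (/^Qm[1-9A-HJ-NP-Za-km-z]{44}$/.test(cid)) return 0
  if (/^b[a-z2-7]+$/.test(cid)) return 1
  return null
}

// Turn a stored ipfs:// URI into a fetchable URL on whichever gateway is configured.
function toGatewayURL(uri, gateway) {
  if (!uri.startsWith('ipfs://')) throw new Error('expected an ipfs:// URI')
  const base = gateway.replace(/\/+$/, '') // drop trailing slashes
  return `${base}/ipfs/${uri.slice('ipfs://'.length)}`
}

const v0 = 'QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG'
const v1 = 'bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi'
console.log(cidVersion(v0)) // 0
console.log(cidVersion(v1)) // 1

const url = toGatewayURL(`ipfs://${v1}/meta.json`, 'https://gateway.example.com/')
console.log(url) // https://gateway.example.com/ipfs/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi/meta.json
```

Storing the ipfs:// form and resolving the gateway at read time means a gateway outage or migration is a config change, not a contract change.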
Chapter 21 · The Future - What Comes After 2026
- AI data DAOs — Vana, Filecoin, and Ocean are converging here fast. Tokenization of personal data and royalties paid on training-data usage will become routine.
- Spread of new BLOB designs like Walrus — the model gets replicated to chains other than Sui. Expect more new projects with similar architecture in H2 2026.
- DA standardization — once EigenDA, Celestia, and Avail agree on a compatibility layer, switching DA from an L2 perspective costs nearly zero.
- Default client-side encryption — when Lit Protocol and Threshold integrate deeper into SDKs, uploading without encryption will feel odd by default.
- CRDT plus IPFS going SaaS — server-less collaboration tools enter the consumer mass market.
Remember in one line: "The cloud is not going away. Some slice of it just shifts into 'P2P data locked with my keys.'"
Epilogue - "Storage" Was Too Simple a Word
At first glance, decentralized storage looks like "an S3 replacement." Look closely and it becomes clear: S3 does one thing well, while decentralized storage broke storage into separate problems — retention, retrieval, proof, licensing, access control, sync, and DA.
This is not a defect — it means the layers got separated. Each layer has different trade-offs, so different tools live there. And in the 2026 matrix, almost every seat is now filled.
- Arweave for permanent storage.
- Filecoin for cold retention.
- Storj and Storacha for hot S3-compatible.
- Walrus for dApp-friendly BLOBs.
- EigenDA and Celestia for DA.
- Lit and Threshold for distributed keys.
- Y.js and OrbitDB for collaborative sync.
The next five-year game is "how do we glue these layers smoothly together." And the ticket to play that game starts with one idea: "content addressing and self-verification."