Rust Tokio Runtime Deep Dive: Future, Waker, Work-Stealing, Pin, and Zero-Cost Async, a Complete Guide (2025)

TL;DR

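Rust compiles each async fn into a state machine struct at compile time. No per-task runtime stack is needed, which is what makes it a zero-cost abstraction.
The core of the Future trait is poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>. It returns Ready when the computation is complete; otherwise it stores the waker and returns Pending.
A Waker is a callback meaning "poll this task again". When an I/O event arrives or a timer expires, the waker is called and the runtime reschedules the task.
Tokio has three layers: a reactor (mio = epoll/kqueue/IOCP), a scheduler (work-stealing), and drivers (I/O, timer). #[tokio::main] sets all of it up.
Work-stealing scheduler: as in Go, each worker has a local run queue. Tokio adds a LIFO slot (the most recently scheduled task runs first) to improve cache locality.
Pin is a type for handling self-referential structs safely. The state machine generated from an async fn is often self-referential.
Cooperative scheduling: a Tokio task yields only at an .await. CPU-bound work must call tokio::task::yield_now().await periodically or be moved to spawn_blocking.
Function coloring: a sync function cannot .await an async fn, and async code must not call blocking sync code carelessly. This is the philosophical cost of Rust async.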

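1. Why Rust async is different

1.1 Two approaches

Stackful (goroutine style): each task owns its own stack and can suspend/resume at any point inside a function. Go, Erlang, Java virtual threads.

Stackless (state machine style): each task is a state machine struct. Suspend points are fixed at compile time. There is no per-task runtime stack. Rust, C#, JavaScript, Python asyncio.

Rust chose the latter. Reasons:

It has to work on embedded targets without a runtime.
Zero-cost: you do not pay for what you do not use.
No dynamically allocated stacks → memory usage is deterministic.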

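1.2 Zero-cost abstraction

Rust compiles each async fn into a single struct and represents its suspend points as enum variants. No heap allocation happens unless you Box the Future yourself. When compiler optimizations kick in, the resulting machine code is close to a hand-written C event loop.

1.3 The price

Function coloring: sync and async code do not mix freely.
Pin complexity: a consequence of self-referential structs.
Runtime dependency: a Future does nothing until it is polled, so an executor such as Tokio is required.
Harder debugging: stack traces point into state machine internals.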

2. Future trait and state machine

2.1 Future is simple

pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

pub enum Poll<T> {
    Ready(T),
    Pending,
}

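A Future is "a computation that will eventually produce a value".
poll() means "make as much progress as you can right now".
Ready(value): done. Pending: not done yet; the waker will signal when to poll again.

2.2 async fn produces a Future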

async fn hello() -> u32 { 42 }
// ≈
fn hello() -> impl Future<Output = u32> { async move { 42 } }

The compiler turns async move { 42 } into a state machine struct.

2.3 Anatomy of a state machine

async fn example() -> u32 {
    let x = read_value().await;   // suspend point 1
    let y = compute(x).await;     // suspend point 2
    x + y
}

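Conceptually (pseudocode: the real sub-future types are anonymous, and impl Future cannot actually be written in a field position):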

enum ExampleState {
    Start,
    Awaiting1 { fut: impl Future<Output = u32> },
    Awaiting2 { x: u32, fut: impl Future<Output = u32> },
    Done,
}

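Observations:

Each suspend point becomes an enum variant.
Every call to poll() advances from the current state as far as it can.
.await is roughly equivalent to match poll() { Pending => return Pending, Ready(v) => v }, with the sub-future polled again on the next call.
Local variables that live across a suspend point are stored as fields of the state machine.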
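As a rough illustration of the observations above, the state machine can be written by hand. This is a minimal sketch, not the compiler's actual output: ReadyOnSecondPoll, read_value, compute, NoopWake, and run are hypothetical helpers invented for the example, and the busy-polling driver only stands in for a real executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical leaf future: Pending on the first poll, Ready(value) on the second.
struct ReadyOnSecondPoll { value: u32, polled: bool }

impl Future for ReadyOnSecondPoll {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.polled {
            Poll::Ready(self.value)
        } else {
            self.polled = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

fn read_value() -> ReadyOnSecondPoll { ReadyOnSecondPoll { value: 40, polled: false } }
fn compute(x: u32) -> ReadyOnSecondPoll { ReadyOnSecondPoll { value: x / 20, polled: false } }

// Hand-written counterpart of:
//   async fn example() -> u32 { let x = read_value().await; let y = compute(x).await; x + y }
enum Example {
    Start,
    Awaiting1 { fut: ReadyOnSecondPoll },
    Awaiting2 { x: u32, fut: ReadyOnSecondPoll }, // x lives across suspend point 2
    Done,
}

impl Future for Example {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // get_mut is allowed because this hand-written enum is Unpin.
        let this = self.get_mut();
        loop {
            match &mut *this {
                Example::Start => *this = Example::Awaiting1 { fut: read_value() },
                Example::Awaiting1 { fut } => match Pin::new(fut).poll(cx) {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(x) => *this = Example::Awaiting2 { x, fut: compute(x) },
                },
                Example::Awaiting2 { x, fut } => match Pin::new(fut).poll(cx) {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(y) => {
                        let result = *x + y;
                        *this = Example::Done;
                        return Poll::Ready(result);
                    }
                },
                Example::Done => panic!("polled after completion"),
            }
        }
    }
}

// Waker that does nothing; enough for a driver that re-polls unconditionally.
struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

// Polls `fut` until it completes and returns (output, number of poll calls).
fn run<F: Future>(fut: F) -> (F::Output, u32) {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return (v, polls);
        }
    }
}
```

Calling run(Example::Start) drives the machine through all four states. Each Pending returned by a leaf future ends one poll call, which is why the poll count exceeds one.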

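2.4 State machine size

Struct size ≈ the largest set of local variables that are alive at the same time across any single suspend point (plus a discriminant), not the sum over all suspend points.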

async fn big() {
    let buf = [0u8; 1024 * 1024];  // 1 MiB array kept alive across the .await
    do_io().await;
    println!("{}", buf[0]);
}

This Future embeds the full 1 MiB. This is the notorious "async fn size blowup" problem. Mitigations: put the future on the heap with Box::pin, or hold large data indirectly through Vec/Box.
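The effect can be measured with std::mem::size_of_val. The sketch below is illustrative only: inline_buf, heap_buf, and future_sizes are hypothetical names, and the buffer is 64 KiB rather than 1 MiB simply to keep the unpolled future comfortably within a small thread stack.

```rust
use std::future::ready;
use std::mem::size_of_val;

// The array is used after the .await, so it must be stored inside the future.
async fn inline_buf() -> u8 {
    let buf = [7u8; 64 * 1024];
    ready(()).await;
    buf[0]
}

// Same logic, but the bytes live on the heap; the future stores only the Vec handle.
async fn heap_buf() -> u8 {
    let buf = vec![7u8; 64 * 1024];
    ready(()).await;
    buf[0]
}

// Returns (size of the inline-buffer future, size of the heap-buffer future) in bytes.
// Neither future is polled; creating it only builds the state machine value.
fn future_sizes() -> (usize, usize) {
    let a = inline_buf();
    let b = heap_buf();
    (size_of_val(&a), size_of_val(&b))
}
```

The first number is at least the size of the array, while the second is on the order of a few machine words. Exact values depend on the compiler version.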

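3. Waker: how tasks are resumed

3.1 Why a Waker is needed

When a Future returns Pending, when should the runtime poll it again? Busy-looping and periodic polling are both wasteful. The answer is that the Future itself signals when it is ready to make progress. That signal is the Waker.

3.2 Mechanism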

pub struct Waker { waker: RawWaker }
impl Waker {
    pub fn wake(self);
    pub fn wake_by_ref(&self);
    pub fn clone(&self) -> Waker;
}

Every poll() call receives a Context, from which the waker is obtained:

fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
    if self.io_ready() {
        Poll::Ready(value)
    } else {
        self.register_io(cx.waker().clone());
        Poll::Pending
    }
}

When the I/O becomes ready, the driver calls waker.wake() and the runtime reschedules the task.
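The store-then-wake handshake can be shown end to end with the standard library alone. In this sketch a background thread plays the role of the driver. TimerFuture, Shared, ThreadWaker, and block_on are hypothetical names made up for the example, and a real runtime would use a timer wheel rather than one thread per timer.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

// State shared between the future and the thread that acts as the "driver".
struct Shared { done: bool, waker: Option<Waker> }

struct TimerFuture { shared: Arc<Mutex<Shared>> }

impl TimerFuture {
    fn new(delay: Duration) -> Self {
        let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
        let for_thread = Arc::clone(&shared);
        thread::spawn(move || {
            thread::sleep(delay);
            let mut s = for_thread.lock().unwrap();
            s.done = true;
            // Notify whoever polled last that progress is now possible.
            if let Some(w) = s.waker.take() { w.wake(); }
        });
        TimerFuture { shared }
    }
}

impl Future for TimerFuture {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut s = self.shared.lock().unwrap();
        if s.done {
            Poll::Ready(())
        } else {
            // Store the current waker before returning Pending.
            s.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

// Waker that unparks the thread blocked in block_on.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) { self.0.unpark(); }
}

// Minimal single-future executor: sleep until woken, then poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => return v,
            Poll::Pending => thread::park(), // spurious wakeups just cause an extra poll
        }
    }
}
```

block_on(TimerFuture::new(Duration::from_millis(20))) returns after the delay without busy-polling: the executor thread stays parked until the waker fires.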

3.3 RawWaker vtable

pub struct RawWakerVTable {
    clone: unsafe fn(*const ()) -> RawWaker,
    wake: unsafe fn(*const ()),
    wake_by_ref: unsafe fn(*const ()),
    drop: unsafe fn(*const ()),
}

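Dynamic dispatch through a vtable keeps Waker independent of any particular runtime. Tokio, async-std, smol, and embedded executors each supply their own implementation.

3.4 Cost of a Waker

In many runtimes clone() amounts to an Arc::clone (an atomic reference-count increment), and wake() pushes the task onto a run queue (a no-op if the task is already scheduled). Redundant clones can also be skipped: Waker::will_wake lets a future check whether the waker it already stored would wake the same task, and Tokio applies checks of this kind to avoid re-registering an unchanged waker.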

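4. Pin and self-referential structs

4.1 The problem

async fn example() {
    let data = [1, 2, 3];   // stored inline in the state machine
    let ptr = &data[0];     // points into the state machine itself
    some_future().await;
    println!("{}", *ptr);
}

The generated state machine holds both data and a pointer into data. If the struct were moved in memory, ptr would keep pointing at the old location and dangle. (With a Vec the elements live on the heap, but the borrow still refers to a field of the state machine, so the compiler treats the future as self-referential either way.)

4.2 What Pin does

Pin is a type-level promise that "this value will not move from now on". Because Future::poll takes Pin<&mut Self>, the Future cannot be moved once polling has started, which makes the self-reference sound.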

4.3 Ways to pin

let fut = Box::pin(my_async_fn());       // pinned on the heap
let fut = std::pin::pin!(my_async_fn()); // pinned on the stack (Rust 1.68+)
tokio::spawn(my_async_fn());             // pass the future by value; the runtime heap-allocates and pins the task
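The two explicit pinning styles can be exercised by polling a future by hand. A minimal sketch using only the standard library; NoopWake, poll_once, pinned_on_heap, and pinned_on_stack are hypothetical helper names, and my_async_fn is given a trivial body for the example.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

async fn my_async_fn() -> u32 { 42 }

// poll requires Pin<&mut F>; an unpinned `&mut F` is not accepted for !Unpin futures.
fn poll_once<F: Future + ?Sized>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}

// Heap pinning: the Pin<Box<..>> handle can be moved freely; the future behind it stays put.
fn pinned_on_heap() -> Poll<u32> {
    let mut fut: Pin<Box<dyn Future<Output = u32>>> = Box::pin(my_async_fn());
    poll_once(fut.as_mut())
}

// Stack pinning: no allocation, but the pinned future cannot outlive this stack frame.
fn pinned_on_stack() -> Poll<u32> {
    let mut fut = pin!(my_async_fn());
    poll_once(fut.as_mut())
}
```

Both functions complete on the first poll because my_async_fn never suspends. The difference is only where the state machine lives and how long it can be kept around.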

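4.4 Unpin

Most types are Unpin (an auto trait), and for them Pin has no practical effect. The state machine generated from an async fn is !Unpin (because it may be self-referential). You do not need to pin it before tokio::spawn, which takes ownership and pins the task internally. Explicit pinning is needed when you poll a future yourself or await it by reference, for example &mut fut inside select!.

4.5 Limits of Pin

A common criticism is that "to understand async Rust you must understand Pin" has become a barrier to entry. pin-project and the pin! macro help, but the essential complexity remains.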

5. Tokio runtime architecture

5.1 Three layers on top of the OS

User async code (async fn, .await)
Scheduler (task, worker, queue, park/unpark)
Driver (I/O = mio, time = timer wheel)
OS (epoll / kqueue / IOCP)

5.2 `#[tokio::main]`

After expansion (roughly):

fn main() {
    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .unwrap()
        .block_on(async { do_something().await })
}

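5.3 Runtime flavors

Multi-threaded (default): N workers with work-stealing; spawned tasks must be Send + 'static.

Current-thread: a single worker. The future passed to block_on does not need to be Send, but tokio::spawn still requires Send; !Send tasks go through LocalSet and spawn_local. Suited to CLI tools, tests, and resource-constrained environments.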

6. Tokio scheduler internals

6.1 Worker and Task

struct Worker {
    local: LocalQueue<Task>,     // capacity 256
    lifo_slot: Option<Task>,     // most recently scheduled task
    remote: Arc<Mutex<RemoteQueue<Task>>>,
    park: Parker,
}

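6.2 LIFO slot: a cache optimization

The idea: a task that was just made runnable by the currently running task (typically the receiver of a message it just sent) should run next on the same worker, while the relevant data is still hot in that worker's cache. This is especially effective for message passing. Because LIFO order is unfair, Tokio limits how many times in a row the LIFO slot is used and then falls back to the FIFO queue.

6.3 Selection order

loop {
    // 1. LIFO slot
    // 2. local queue
    // 3. global queue (checked every so many ticks)
    // 4. poll the I/O driver
    // 5. steal from other workers
    // 6. park
}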

6.4 Work-stealing

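The thief takes half of the victim's local queue, runs the first stolen task itself, and pushes the rest onto its own local queue. This is the same "steal half" strategy that Go uses.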
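The per-worker part of the selection order and the "steal half" rule can be modelled in a few lines. This is a toy model under stated assumptions, not Tokio's implementation: the real local queue is a fixed-size lock-free ring buffer, tasks here are plain u32 ids, and rounding the half upward is a choice made for this sketch.

```rust
use std::collections::VecDeque;

// Toy worker: task ids stand in for real tasks.
#[derive(Default)]
struct Worker {
    lifo_slot: Option<u32>,
    local: VecDeque<u32>,
}

impl Worker {
    // Steps 1 and 2 of the selection order: LIFO slot first, then the local FIFO queue.
    fn next_local(&mut self) -> Option<u32> {
        self.lifo_slot.take().or_else(|| self.local.pop_front())
    }

    // Steal half of the victim's local queue (rounded up), oldest tasks first.
    // The first stolen task is returned to run immediately; the rest are queued locally.
    fn steal_from(&mut self, victim: &mut Worker) -> Option<u32> {
        let n = victim.local.len() - victim.local.len() / 2;
        let mut stolen = victim.local.drain(..n);
        let first = stolen.next()?; // empty victim: nothing to steal
        self.local.extend(stolen);
        Some(first)
    }
}
```

With a victim holding tasks 1 to 5, the thief runs task 1 at once, keeps 2 and 3 in its own queue, and leaves 4 and 5 behind. Taking a batch rather than a single task means the thief does not have to come back immediately.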

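6.5 Task lifecycle

Spawn → Poll → Pending (waker registered) → Wake (event) → Poll again → Ready (result delivered through the JoinHandle).

6.6 JoinHandle

A JoinHandle is itself a Future. It shares a reference-counted allocation with the task, where the result is stored, and that memory is freed only after both sides are done with it. Dropping a JoinHandle detaches the task rather than cancelling it.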

7. I/O driver: mio and epoll

7.1 mio

Platform abstraction: epoll on Linux, kqueue on BSD/macOS, IOCP on Windows.

7.2 Non-blocking I/O

As a rule every socket/fd is non-blocking. The flow of a TCP connect:

socket(), fcntl(O_NONBLOCK)
connect() returns immediately with EINPROGRESS
register for WRITE readiness with epoll
poll → Pending
epoll_wait → writable → waker is called
poll completes

7.3 PollEvented

fn poll_read(self: Pin<&mut Self>, cx: &mut Context, buf: &mut [u8])
    -> Poll<io::Result<usize>>
{
    match self.io.read(buf) {
        Ok(n) => Poll::Ready(Ok(n)),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
            self.register_readiness(cx, Readiness::READ);
            Poll::Pending
        }
        Err(e) => Poll::Ready(Err(e)),
    }
}

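7.4 When does the driver run?

There is no dedicated I/O thread. A worker that has run out of tasks calls epoll_wait itself. The design favors cache locality and power efficiency.

8. Timer driver: hashed timer wheel

Tokio (like Netty and the Linux kernel) uses a hierarchical timer wheel:

Level 0: 64 slots x 1 ms
Level 1: 64 slots x 64 ms
Level 2: 64 slots x ~4 s
Level 3: 64 slots x ~4 min
Level 4: 64 slots x ~4 h
Level 5: 64 slots x ~12 days

Insert is O(1) and expiry is amortized O(1), so even millions of timers stay cheap. When a lower level completes a full rotation, timers "cascade" down from the next higher level and are redistributed into the finer slots.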
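Placing a timer in such a wheel comes down to bit arithmetic, which is where the O(1) insert comes from. The sketch below assumes millisecond ticks and 64 slots (6 bits) per level as in the table. level_for and slot_for are hypothetical names, and the code is a simplified model rather than Tokio's source.

```rust
const SLOT_BITS: u32 = 6;                     // 64 slots per level
const SLOT_MASK: u64 = (1 << SLOT_BITS) - 1;
const NUM_LEVELS: u32 = 6;                    // assumption: six levels, as in the table above

// Picks the level from the most significant bit in which `now` and `deadline` differ.
// Deadlines that differ only in the low 6 bits land on level 0.
fn level_for(now_ms: u64, deadline_ms: u64) -> u32 {
    let masked = (now_ms ^ deadline_ms) | SLOT_MASK;
    let significant = 63 - masked.leading_zeros();
    (significant / SLOT_BITS).min(NUM_LEVELS - 1)
}

// Slot index inside the chosen level: the 6-bit digit of the deadline at that level.
fn slot_for(deadline_ms: u64, level: u32) -> u64 {
    (deadline_ms >> (level * SLOT_BITS)) & SLOT_MASK
}
```

For example, starting at time 0 a 100 ms deadline lands in level 1 (64 ms slots), slot 1, and a 5000 ms deadline lands in level 2 (about 4 s slots), slot 1. When the wheel reaches such a slot, the timers in it are re-inserted with the same two functions and fall into a lower level, which is the cascade described above.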

9. spawn vs spawn_blocking vs yield_now

9.1 `tokio::spawn`

Registers an async task with the runtime. The task must be Send + 'static. A task that never reaches an .await monopolizes its worker.

9.2 The CPU-bound trap

tokio::spawn(async {
    for i in 0..1_000_000_000 { hash(i); }
});

With no .await, that worker cannot run anything else, and with N workers throughput drops to roughly (N-1)/N of normal. Fixes:

// yield manually
for i in 0..1_000_000_000 {
    hash(i);
    if i % 1000 == 0 { tokio::task::yield_now().await; }
}

// isolate the work on the dedicated blocking pool
tokio::task::spawn_blocking(|| expensive_computation()).await.unwrap();

The default cap for spawn_blocking is 512 threads.
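What a manual yield does can be seen with a hand-rolled stand-in built on the standard library. YieldNow below is a hypothetical simplification, not tokio::task::yield_now itself; busy_loop, NoopWake, and count_polls are likewise invented for the sketch. Every Pending it returns ends one poll call, and that gap is where a scheduler could run other tasks.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Cooperative yield: Pending once (after asking to be rescheduled), then Ready.
struct YieldNow { yielded: bool }

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref(); // reschedule immediately
        Poll::Pending
    }
}

// CPU-bound loop that gives the executor a chance to run something else every `yield_every` steps.
async fn busy_loop(iterations: u64, yield_every: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..iterations {
        acc = acc.wrapping_add(i); // placeholder for real CPU work
        if (i + 1) % yield_every == 0 {
            YieldNow { yielded: false }.await;
        }
    }
    acc
}

struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

// Drives `fut` to completion and returns (output, number of poll calls).
fn count_polls<F: Future>(fut: F) -> (F::Output, u32) {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return (v, polls);
        }
    }
}
```

With 10,000 iterations and a yield every 1,000, the loop is split into 11 poll calls. With no yields it runs as a single uninterrupted poll, which is exactly the worker-monopolizing case.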

9.3 Which to use

Short CPU work (< 100 µs): run it inline.
Long CPU work: spawn_blocking.
Blocking I/O (files, synchronous DB drivers): spawn_blocking.
Parallel CPU work: rayon combined with spawn_blocking, or yield_now.

10. The function coloring problem

10.1 Symptoms

fn sync_fn() {}
async fn async_fn() {}

async fn caller1() { sync_fn(); async_fn().await; } // OK

fn caller2() {
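    async_fn();        // only returns a Future; nothing runs
    async_fn().await;  // compile error: .await is only allowed in async code
}

Running async code from sync code requires a runtime:

fn caller3() {
    // Handle::current() would panic here, because this thread is not inside a runtime.
    let rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(async_fn());
}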
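The laziness that makes a bare async_fn() call a no-op can be checked without any runtime crate. In this sketch block_on is a deliberately naive, hypothetical bridge that spins until the future is Ready (a real runtime parks the thread instead), and the flag parameter exists only so the effect is observable.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The body runs only when the returned future is polled.
async fn async_fn(ran: &AtomicBool) {
    ran.store(true, Ordering::SeqCst);
}

struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

// Naive sync-to-async bridge: poll in a loop until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
        std::hint::spin_loop();
    }
}

// Returns whether the body had run (before, after) the future was driven.
fn caller() -> (bool, bool) {
    let ran = AtomicBool::new(false);
    let fut = async_fn(&ran);              // builds the state machine, runs nothing
    let before = ran.load(Ordering::SeqCst);
    block_on(fut);                         // polling is what executes the body
    let after = ran.load(Ordering::SeqCst);
    (before, after)
}
```

caller() reports that the flag is untouched after the call and set only after block_on. This is why a sync function needs some executor at the boundary.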

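10.2 Bob Nystrom's observation

The 2015 essay "What Color is Your Function?" described this trap mainly through JavaScript and noted that async/await languages such as C# share it. Python and Rust async fall into the same category. Libraries have to pick either sync or async, and the ecosystem splits.

10.3 How Go avoids it

In Go every function is implicitly async. The runtime is always present, go func() turns any function into a goroutine, and the scheduler takes care of blocking I/O. The cost is the always-present runtime (~2 MB of binary, effectively unusable on small embedded targets).

10.4 Mitigations

Provide sync wrappers (such as reqwest::blocking), standardize on one runtime, or separate async and sync into layers and bridge only at the boundary.

10.5 The danger of `block_on`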

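tokio::spawn(async {
    let rt = tokio::runtime::Handle::current();
    rt.block_on(another_async()); // panics: blocking the current thread is not allowed inside the runtime
});

A blocked worker is tied up and cannot run other tasks, which is why Tokio rejects block_on from within async context, and why blocking by other means risks deadlock. The escape hatch is tokio::task::block_in_place (multi-thread runtime only): it turns the current worker thread into a blocking thread and hands its queued tasks over to another worker, so block_on can be called inside it.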

11. Channel

11.1 mpsc

let (tx, mut rx) = tokio::sync::mpsc::channel::<String>(32);
tokio::spawn(async move {
    for i in 0..10 { tx.send(format!("msg {}", i)).await.unwrap(); }
});
while let Some(msg) = rx.recv().await { println!("{}", msg); }

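Bounded, multi-producer single-consumer. When the buffer is full, send().await waits (it suspends the task, it does not block the thread) until space is available.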

11.2 oneshot

let (tx, rx) = tokio::sync::oneshot::channel::<u32>();
tokio::spawn(async move { tx.send(42).unwrap(); });
let value = rx.await.unwrap();

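Passes a single value exactly once. A good fit for request-response.

11.3 broadcast

Fan-out. Every subscriber receives the same value. A receiver that falls too far behind gets a Lagged error and skips the oldest values.

11.4 watch

Keeps only the latest value. Suited to configuration-change notifications.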

11.5 `select!`

tokio::select! {
    val = rx.recv() => { /* ... */ }
    _ = tokio::time::sleep(Duration::from_secs(1)) => { /* timeout */ }
    _ = shutdown.recv() => { /* stop */ }
}

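Runs the branch whose future becomes Ready first and drops the others.

11.6 Cancellation safety

The intermediate state of a dropped Future is lost. AsyncReadExt::read_exact is not cancel-safe (bytes already read are lost). mpsc::Receiver::recv is cancel-safe (the message stays in the queue).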
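The difference can be reproduced with two toy futures over a shared in-memory queue. This is a simplified model, not Tokio's read_exact or recv: Source, ReadExact, RecvOne, and the helper functions are hypothetical, and polling once and then dropping the future imitates what select! does to a losing branch.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Source = Rc<RefCell<VecDeque<u8>>>;

// Not cancel-safe: bytes already taken from the source live only inside this future.
struct ReadExact { src: Source, buf: Vec<u8>, want: usize }

impl Future for ReadExact {
    type Output = Vec<u8>;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Vec<u8>> {
        let this = &mut *self;
        while this.buf.len() < this.want {
            let next = this.src.borrow_mut().pop_front();
            match next {
                Some(b) => this.buf.push(b),
                None => return Poll::Pending, // a real future would register the waker here
            }
        }
        Poll::Ready(std::mem::take(&mut this.buf))
    }
}

// Cancel-safe: removes an item from the source only at the moment it completes.
struct RecvOne { src: Source }

impl Future for RecvOne {
    type Output = u8;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
        let next = self.src.borrow_mut().pop_front();
        match next {
            Some(b) => Poll::Ready(b),
            None => Poll::Pending,
        }
    }
}

struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

// Three bytes are available but four are wanted; returns what is left after cancellation.
fn cancelled_read_exact() -> Vec<u8> {
    let src: Source = Rc::new(RefCell::new(VecDeque::from(vec![1, 2, 3])));
    let mut fut = ReadExact { src: Rc::clone(&src), buf: Vec::new(), want: 4 };
    assert!(poll_once(&mut fut).is_pending());
    drop(fut); // the partially filled buffer goes with it
    let left: Vec<u8> = src.borrow().iter().copied().collect();
    left
}

// Nothing is available yet; a message that arrives after cancellation is still there.
fn cancelled_recv() -> Vec<u8> {
    let src: Source = Rc::new(RefCell::new(VecDeque::new()));
    let mut fut = RecvOne { src: Rc::clone(&src) };
    assert!(poll_once(&mut fut).is_pending());
    drop(fut);
    src.borrow_mut().push_back(9);
    let left: Vec<u8> = src.borrow().iter().copied().collect();
    left
}
```

cancelled_read_exact returns an empty vector: the three bytes were consumed and then discarded with the future. cancelled_recv returns the late message intact, because the toy receiver never held any state of its own.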

12. Performance tuning

12.1 tokio-console

tokio = { version = "1", features = ["full", "tracing"] }
console-subscriber = "0.4"

fn main() { console_subscriber::init(); /* ... */ }

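Run the tokio-console CLI to inspect every task, its wake frequency, mean poll time, and blocking tasks in a terminal UI. The application must be built with RUSTFLAGS="--cfg tokio_unstable" for the instrumentation to be emitted.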

12.2 tracing

#[tracing::instrument]
async fn handle_request(req: Request) -> Response { /* ... */ }

Structured logging designed for async. Spans are preserved across .await points.

12.3 Flamegraph

cargo flamegraph --bin myapp

12.4 Configuration

Builder::new_multi_thread()
    .worker_threads(8)
    .max_blocking_threads(1024)
    .build()

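Box large Futures. To wait until many Futures have all finished, use join_all (or JoinSet) rather than a hand-written select! loop.

12.5 Common anti-patterns

Holding a std::sync::Mutex guard across an .await → drop the guard before awaiting, or use tokio::sync::Mutex.
Long CPU work inside tokio::spawn → spawn_blocking or rayon.
Hand-rolling Arc<Mutex<Vec<Task>>> → use tokio::task::JoinSet.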

13. Tokio vs Go

Item	Rust Tokio	Go
Model	Stackless state machine	Stackful goroutine
Stack	None (struct fields)	2 KB initial, grows dynamically
Per-task overhead	Size of the Future (often under a few KB; MB if large buffers are held inline)	2 KB
Scheduler	Work-stealing + LIFO slot	Work-stealing + runnext
Preemption	Cooperative only	Asynchronous (SIGURG, 1.14+)
CPU-bound	Can stall a worker	Preempted automatically
Runtime	Tens of KB	~2 MB
Embedded	Yes (no_std runtimes)	Practically no
Learning curve	Steep (Pin, lifetimes)	Gentle

Go chose simplicity; Rust chose zero-cost and embedded applicability.

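14. Ecosystem

Runtime: Tokio (the de facto standard), async-std, smol, embassy (no_std), glommio (io_uring, thread-per-core).

Library: hyper/axum, reqwest, sqlx, sea-orm, tonic, tracing/opentelemetry, tower.

async fn in traits: for a long time this required the #[async_trait] macro, which converts methods to boxed Futures. Native support was stabilized in Rust 1.75 (December 2023). dyn Trait support is still in progress.

15. Updates as of 2025

Native async fn in traits since 1.75.
RTN (Return Type Notation), which lets you put bounds on the Future type a trait method returns, is still unstable (nightly only).
Polonius, the next-generation borrow checker expected to reduce false positives in async code, is still in development.
The Rust 2024 edition reserves the gen keyword; gen blocks (sync generators) themselves remain unstable.
Growing interest in io_uring-based thread-per-core runtimes (glommio).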

16. Production scenarios

16.1 "It gets slow after the first 100 requests"

Connection leak: check TIME_WAIT/CLOSE_WAIT with netstat.
Hot mutex: use tokio-console to find tasks stuck at lock().await.
Worker saturation: use top -H to find workers pinned at 100% and move the CPU work to spawn_blocking.

16.2 "spawn_blocking pool exhaustion"

The default cap is 512. Raise max_blocking_threads, switch to an async library, or throttle with a work queue.

16.3 "Only 20% CPU used, yet throughput will not increase"

I/O wait is probably dominant. Trace syscalls with strace / perf trace. If a lot of time is spent in futex, suspect lock contention.

17. Learning roadmap

The Async Book, the Tokio tutorial, Jon Gjengset's "Crust of Rust" videos.
"Rust for Rustaceans", "Zero to Production in Rust", "Asynchronous Programming in Rust".
Tokio source: tokio/src/runtime/scheduler/multi_thread/.

18. Cheat sheet

Future::poll(Pin<&mut Self>, &mut Context) -> Poll<Output>
async fn       -> state machine struct
.await         -> match poll { Pending => return; .. }
Waker          -> callback meaning "poll me again"
Pin            -> type-level guarantee of "will not move again"
Tokio          -> worker + LIFO + local + global queue
                  + I/O driver polling + work-stealing
Spawning       -> spawn / spawn_blocking / block_in_place
Channels       -> mpsc, oneshot, broadcast, watch
Avoid          -> holding a std Mutex across an await,
                  long CPU work inside spawn,
                  non-cancel-safe branches in select!
Tools          -> tokio-console, tracing, cargo flamegraph

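19. Quiz

Q1. What does an async fn become at compile time?

A compiler-generated state machine struct. Each .await point becomes an enum variant, and locals that live across a suspend point become fields. Future::poll is implemented as one large match; no per-task runtime stack is needed, hence zero-cost.

Q2. What is the role of a Waker?

A callback that tells the runtime "poll this task again". When a future returns Pending, it keeps the waker from the current Context by registering it with the I/O driver, timer, or channel. When the event occurs, waker.wake() is called and the runtime puts the task back on a run queue.

Q3. Why is Pin needed?

Because the state machine of an async fn often contains self-references (pointers to its own fields). If the struct moved, those pointers would refer to the old location and dangle. Pin<&mut T> provides the type-level guarantee of "will not move again", which keeps self-references sound.

Q4. What happens if you run a CPU-bound loop inside tokio::spawn?

The task monopolizes its worker. Tokio is cooperative, so a task yields only at an .await. Throughput drops to roughly (N-1)/N. Fix it with a periodic tokio::task::yield_now().await or by moving the work to spawn_blocking.

Q5. What is the essential difference between a goroutine and a Future?

Go is stackful (a 2 KB dynamically growing stack, suspend anywhere, automatic preemption). Rust is stackless (state machine struct, suspend only at .await, cooperative only). Rust pays with function coloring and Pin complexity; Go pays with an always-present runtime.

Q6. What is cancellation safety in select!?

select! drops the Futures that did not win. Any intermediate state in a dropped Future (such as bytes already read) is lost. A cancel-safe API loses nothing in that situation. mpsc::Receiver::recv is cancel-safe; AsyncReadExt::read_exact is not.

Q7. How does Go avoid function coloring?

By making every function implicitly async. The runtime is always present and can run any function as a goroutine, and the scheduler handles blocking I/O automatically through the netpoller. The cost is the ~2 MB always-present runtime, which effectively rules out small embedded targets. Rust chose the opposite trade-off.

"Go Runtime & GMP Scheduler Deep Dive": the stackful side, for contrast.
"Python GIL and CPython Internals": yet another choice.
"eBPF Deep Dive": the kernel programming revolution.
"io_uring and Asynchronous I/O Models": the future of Tokio.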