I/O Systems

One of the core roles of an operating system is to manage various I/O devices and provide a consistent interface to applications. This article examines everything from I/O hardware architecture to the kernel I/O subsystem.

1. I/O Hardware

Ports, Buses, and Controllers

┌─────────────────────────────────────────────┐
│                   CPU                        │
│              ┌──────────┐                    │
│              │ Memory   │                    │
│              └────┬─────┘                    │
│                   │ System Bus               │
│         ┌─────────┼─────────┐                │
│         │         │         │                │
│    ┌────┴───┐ ┌───┴────┐ ┌─┴──────┐         │
│    │ SATA   │ │ USB    │ │ PCIe   │  ← Controllers
│    │Controller│ │Controller│ │Controller│     │
│    └────┬───┘ └───┬────┘ └─┬──────┘         │
│         │         │         │                │
│     ┌───┴──┐  ┌───┴──┐  ┌──┴───┐            │
│     │ HDD  │  │ Mouse│  │ GPU  │  ← Devices │
│     └──────┘  └──────┘  └──────┘            │
└─────────────────────────────────────────────┘

Port: Connection point to a device (e.g., USB port, SATA port)
Bus: A shared communication path that carries signals (e.g., PCIe bus)
Controller: Electronic circuitry that manages ports, buses, and devices

Device Controller Registers

Each device controller typically has the following registers:

Register	Role
data-in	Data for the host to read
data-out	Data for the host to write
status	Device state (busy, ready, error)
command	Commands issued by the host
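
The register set above can be modeled in C as a struct of volatile fields. This is only a minimal sketch: the name `device_regs_t`, the one-byte field widths, and the `STATUS_*` bit positions are assumptions for illustration, since each real controller defines its own layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits; real devices define their own positions. */
#define STATUS_BUSY  (1u << 0)
#define STATUS_READY (1u << 1)
#define STATUS_ERROR (1u << 2)

/* Hypothetical register block. `volatile` tells the compiler that the
   values may change outside the program's control. */
typedef struct {
    volatile uint8_t data_in;   /* read by the host             */
    volatile uint8_t data_out;  /* written by the host          */
    volatile uint8_t status;    /* busy / ready / error flags   */
    volatile uint8_t command;   /* command issued by the host   */
} device_regs_t;

/* True while the device reports that it is still working. */
static bool device_is_busy(const device_regs_t *regs) {
    return (regs->status & STATUS_BUSY) != 0;
}

/* True when the device reports a failed operation. */
static bool device_has_error(const device_regs_t *regs) {
    return (regs->status & STATUS_ERROR) != 0;
}
```

On real hardware a pointer to such a struct would refer to a memory-mapped address supplied by the platform; here it can point at an ordinary variable for experimentation.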

2. Polling

The CPU repeatedly checks the device's status register to detect I/O completion.

// Polling-based I/O example (pseudocode)
void polling_write(char data) {
    // 1. Busy-wait until the device is ready
    while (read_status_register() & BUSY_BIT)
        ; // spin

    // 2. Write the data to the data register
    write_data_register(data);

    // 3. Issue the write command through the command register
    write_command_register(WRITE_COMMAND);

    // 4. Busy-wait again until the device finishes
    while (read_status_register() & BUSY_BIT)
        ;
}

Advantages: Simple implementation, efficient for short I/O operations
Disadvantages: Wastes CPU cycles (busy waiting), inefficient for long I/O operations

3. Interrupts

The device signals the CPU when an I/O operation completes. The CPU suspends its current work, runs a handler for the interrupt, and then resumes where it left off.

Interrupt Handling Flow

┌──────┐      1. I/O Request        ┌──────────┐
│ CPU  │ ──────────────────────→    │ Device   │
│      │                            │Controller│
│      │  Performing other work     │          │
│      │                            │ I/O in   │
│      │                            │ progress │
│      │  ←── 2. Interrupt signal ──│ Done!    │
│      │                            └──────────┘
│      │  3. Save current state
│      │  4. Execute interrupt handler
│      │  5. Restore state, resume work
└──────┘
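
The flow above can be imitated in user space with POSIX signals. This is an analogy only, not how a kernel handles hardware interrupts: `SIGUSR1` stands in for the device's interrupt line, and the names `io_done`, `on_io_complete`, `install_io_handler`, and `simulate_device_interrupt` are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t io_done = 0;

/* Plays the role of the interrupt handler (step 4 in the diagram). */
static void on_io_complete(int signo) {
    (void)signo;
    io_done = 1;
}

/* Registers the handler for SIGUSR1. Returns 0 on success, -1 on failure. */
static int install_io_handler(void) {
    struct sigaction sa;
    sa.sa_handler = on_io_complete;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Simulates the device raising its completion interrupt (step 2). */
static int simulate_device_interrupt(void) {
    return raise(SIGUSR1);
}
```

The main program would call `install_io_handler()` once, carry on with other work, and check `io_done` when convenient. Saving and restoring the interrupted state (steps 3 and 5) is done by the system, not by this code.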

Interrupt Vector Table

// Interrupt vector table concept (pseudo code)
typedef void (*interrupt_handler_t)(void);

interrupt_handler_t interrupt_vector[256];

// Register handlers during initialization
void init_interrupts() {
    interrupt_vector[0]  = divide_error_handler;
    interrupt_vector[1]  = debug_handler;
    interrupt_vector[14] = page_fault_handler;
    interrupt_vector[32] = timer_handler;
    interrupt_vector[33] = keyboard_handler;
    interrupt_vector[46] = disk_handler;
    // ...
}

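// Dispatch when an interrupt occurs; ignore invalid or unregistered vectors
void dispatch_interrupt(int vector_num) {
    if (vector_num < 0 || vector_num >= 256)
        return;
    if (interrupt_vector[vector_num])
        interrupt_vector[vector_num]();
}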

Interrupt Priority

Interrupts are assigned priority levels so that urgent ones are handled first and can preempt the handling of less urgent ones. A typical ordering looks like this:

High Priority
  │  ┌────────────────────────┐
  │  │ NMI (Non-Maskable)     │ ← Hardware errors
  │  │ Timer Interrupt        │ ← Scheduling
  │  │ Disk Interrupt         │ ← I/O completion
  │  │ Network Interrupt      │ ← Packet arrival
  │  │ Keyboard/Mouse         │ ← User input
  ▼  └────────────────────────┘
Low Priority
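
One way to picture priority handling is a bitmask of pending interrupts from which the most urgent one is picked first. The sketch below assumes the ordering shown in the diagram; the `irq_line` enum and `next_irq` are hypothetical names, and real interrupt controllers do this selection in hardware.

```c
#include <stdint.h>

/* Hypothetical interrupt lines. A lower value means a higher priority,
   matching the diagram from top to bottom. */
enum irq_line {
    IRQ_NMI = 0,
    IRQ_TIMER,
    IRQ_DISK,
    IRQ_NETWORK,
    IRQ_KEYBOARD,
    IRQ_COUNT
};

/* Returns the highest-priority pending line, or -1 if none is pending.
   Bit n of `pending` is set when line n has raised an interrupt. */
static int next_irq(uint32_t pending) {
    for (int line = 0; line < IRQ_COUNT; line++) {
        if (pending & (1u << line))
            return line;
    }
    return -1;
}
```

For example, if the keyboard and the disk both have interrupts pending, `next_irq` returns `IRQ_DISK`, so the disk is serviced first.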

4. DMA (Direct Memory Access)

DMA transfers data directly between a device and memory without the CPU moving each byte itself. It matters most for bulk data transfers, where copying through the CPU would be expensive.

DMA Operation Process

┌──────┐                    ┌──────────┐
│ CPU  │  1. DMA request     │ DMA      │
│      │    setup            │ Controller│
│      │ ─────────────────→ │          │
│      │                    │ 2. Bus    │
│      │  Free to do other  │  control  │
│      │  work              │  acquired │
│      │                    │          │
│      │                    │ 3. Device ↔│
│      │                    │  Memory   │
│      │                    │  direct   │
│      │                    │  transfer │
│      │  ←── 4. Completion │ Transfer  │
│      │      interrupt     │ complete  │
└──────┘                    └──────────┘

// DMA transfer setup (pseudocode)
void setup_dma_transfer(
    void *buffer,        // memory buffer address
    int   device_id,     // target device
    int   byte_count,    // number of bytes to transfer
    int   direction      // READ or WRITE
) {
    dma_controller.address  = buffer;
    dma_controller.count    = byte_count;
    dma_controller.device   = device_id;
    dma_controller.command  = direction;

    // Start the DMA transfer; the CPU is free to do other work
    dma_controller.start    = 1;
}
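
To make the division of labor concrete, the following sketch models one DMA channel in software. It is a simulation under stated assumptions, not driver code: `dma_channel_t`, `dma_start`, and `dma_step` are hypothetical names, plain arrays stand in for device and memory, and the `done` flag stands in for the completion interrupt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical software model of one DMA channel. */
typedef struct {
    void       *dst;        /* destination in "memory"             */
    const void *src;        /* source inside the "device"          */
    size_t      remaining;  /* bytes still to transfer             */
    int         done;       /* stands in for the completion interrupt */
} dma_channel_t;

/* Step 1: the CPU programs the channel and returns immediately. */
static void dma_start(dma_channel_t *ch, void *dst, const void *src, size_t n) {
    ch->dst = dst;
    ch->src = src;
    ch->remaining = n;
    ch->done = (n == 0);
}

/* Steps 2-3: the controller moves up to `burst` bytes per bus grant.
   Step 4: it flags completion once nothing remains. */
static void dma_step(dma_channel_t *ch, size_t burst) {
    size_t n = ch->remaining < burst ? ch->remaining : burst;
    memcpy(ch->dst, ch->src, n);
    ch->dst = (char *)ch->dst + n;
    ch->src = (const char *)ch->src + n;
    ch->remaining -= n;
    if (ch->remaining == 0)
        ch->done = 1;
}
```

In this model the caller invokes `dma_step` repeatedly to play the controller's part; on real hardware those steps happen without any CPU instructions being executed for the copy.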

5. Application I/O Interface

The operating system abstracts various devices into several categories to provide a consistent interface.

Device Type Interfaces

┌───────────────────────────────────────┐
│        Application                    │
│    read()  write()  ioctl()          │
├───────────────────────────────────────┤
│        Kernel I/O Subsystem           │
├──────┬──────┬──────┬──────┬──────────┤
│ Block│ Char │Network│Clock │ Other    │
│Device│Device│Socket │Timer │          │
├──────┼──────┼──────┼──────┼──────────┤
│ Disk │Keybd │ NIC  │ RTC  │          │
│ SSD  │Mouse │      │ PIT  │          │
└──────┴──────┴──────┴──────┴──────────┘

Device Type	Characteristics	Key Operations	Examples
Block Device	Fixed-size block access	read, write, seek	Disk, SSD
Character Device	Byte stream	get, put	Keyboard, serial
Network Device	Socket interface	send, receive	NIC
Clock/Timer	Time measurement, alerts	get_time, set_timer	RTC, HPET
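
On Unix-like systems, block and character devices appear as special files, so a program can ask the kernel which category a path belongs to. The sketch below uses the POSIX `stat()` call and the `S_ISBLK`, `S_ISCHR`, and `S_ISSOCK` macros; the `device_kind_t` enum and `device_kind` function are names invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Hypothetical classification used only in this example. */
typedef enum {
    DEV_UNKNOWN,  /* stat() failed, e.g. the path does not exist */
    DEV_BLOCK,
    DEV_CHAR,
    DEV_SOCKET,
    DEV_OTHER     /* regular file, directory, and so on */
} device_kind_t;

/* Classifies a path by the file type the kernel reports for it. */
static device_kind_t device_kind(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0)
        return DEV_UNKNOWN;
    if (S_ISBLK(st.st_mode))
        return DEV_BLOCK;
    if (S_ISCHR(st.st_mode))
        return DEV_CHAR;
    if (S_ISSOCK(st.st_mode))
        return DEV_SOCKET;
    return DEV_OTHER;
}
```

On a typical Linux system `device_kind("/dev/null")` yields `DEV_CHAR`. Network interfaces usually have no device file at all and are reached through the socket interface instead.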

Blocking vs Non-blocking I/O

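// Blocking I/O: the process waits until the read completes
ssize_t bytes = read(fd, buffer, size);
// This line runs only after read() has returned

// Non-blocking I/O: returns immediately
int flags = fcntl(fd, F_GETFL, 0);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);  // keep the flags already set
bytes = read(fd, buffer, size);
if (bytes == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
    // No data is available yet; try again later
}

// Asynchronous I/O: submit the request, then learn of its completion later
struct aiocb cb;
memset(&cb, 0, sizeof cb);  // zero the control block before use
cb.aio_fildes = fd;
cb.aio_buf    = buffer;
cb.aio_nbytes = size;
cb.aio_offset = 0;          // file offset to read from
aio_read(&cb);  // returns immediately
// Check for completion later
while (aio_error(&cb) == EINPROGRESS)
    do_other_work();
bytes = aio_return(&cb);    // result of the completed read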
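
The non-blocking case can be tried without any special hardware by using a pipe. This is a small sketch: `set_nonblocking`, `try_read`, and the `TRY_READ_AGAIN` sentinel are helper names made up for the example, while `fcntl`, `read`, and the `EAGAIN`/`EWOULDBLOCK` error codes are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical sentinel: no data yet, try again later. */
#define TRY_READ_AGAIN (-2)

/* Adds O_NONBLOCK while preserving the flags already set on fd.
   Returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns the byte count (0 means end of file), -1 on a real error,
   or TRY_READ_AGAIN when the read would have blocked. */
static ssize_t try_read(int fd, void *buf, size_t size) {
    ssize_t n = read(fd, buf, size);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return TRY_READ_AGAIN;
    return n;
}
```

With the read end of an empty pipe set to non-blocking, `try_read` returns `TRY_READ_AGAIN` right away; after the writer puts two bytes into the pipe, the same call returns 2.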

6. Kernel I/O Subsystem

The kernel provides several services for efficient I/O management.

I/O Scheduling

Optimizes the execution order of multiple I/O requests.

Request Queue:  [Disk Read A] [Disk Write B] [Disk Read C]
                    ↓ I/O Scheduler (reorders)
Execution Order: [Disk Read A] [Disk Read C] [Disk Write B]
                 → Minimizes disk head movement
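
A very reduced form of this reordering is to sort pending requests by block number so that the head sweeps across the disk in one direction. The sketch below shows only that idea; `io_request_t` and `schedule_requests` are hypothetical names, and real schedulers also weigh fairness and how long each request has waited.

```c
#include <stdlib.h>

/* Hypothetical pending request. */
typedef struct {
    unsigned long block;  /* starting block number on the disk */
    char          id;     /* label used to identify the request */
} io_request_t;

/* qsort comparator: ascending block number. */
static int by_block(const void *a, const void *b) {
    const io_request_t *x = a;
    const io_request_t *y = b;
    return (x->block > y->block) - (x->block < y->block);
}

/* Reorders the queue so the head moves in a single ascending sweep. */
static void schedule_requests(io_request_t *queue, size_t n) {
    qsort(queue, n, sizeof queue[0], by_block);
}
```

Given requests A at block 900, B at block 100, and C at block 500, the scheduled order becomes B, C, A, which avoids seeking to the far end of the disk and back.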

Buffering

Uses temporary storage to smooth out speed differences during data transfer.

Producer (Device)                   Consumer (Process)
    │                                  │
    │  ┌──────────┐                    │
    ├→ │ Buffer 1 │ (filling)          │
    │  └──────────┘                    │
    │  ┌──────────┐                    │
    │  │ Buffer 2 │ ────────────────→  ├→ Processing
    │  └──────────┘  (draining)        │
    │                                  │
    │  Double buffering: fill and      │
    │  drain simultaneously            │
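
Double buffering can be sketched with two fixed-size arrays that swap roles whenever the one being filled becomes full. The version below is single-threaded and leaves out all synchronization, which a real producer and consumer running concurrently would need; `double_buffer_t`, `db_init`, `db_put`, `db_drain_buffer`, and `BUF_SIZE` are names chosen for this example.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 4

typedef struct {
    char   data[2][BUF_SIZE];
    size_t used;  /* bytes stored in the buffer being filled */
    int    fill;  /* index of the buffer being filled; the other is drained */
} double_buffer_t;

static void db_init(double_buffer_t *db) {
    memset(db, 0, sizeof *db);
}

/* Producer side: stores one byte. Returns 1 when the fill buffer became
   full and the two buffers swapped roles, otherwise 0. */
static int db_put(double_buffer_t *db, char c) {
    db->data[db->fill][db->used++] = c;
    if (db->used == BUF_SIZE) {
        db->fill ^= 1;
        db->used = 0;
        return 1;
    }
    return 0;
}

/* Consumer side: the buffer that is not currently being filled. */
static const char *db_drain_buffer(const double_buffer_t *db) {
    return db->data[db->fill ^ 1];
}
```

After four calls to `db_put`, the consumer can process the full buffer returned by `db_drain_buffer` while later bytes go into the other one.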

Caching

Keeps copies of frequently accessed data in faster storage.

Application → Check cache → Hit? → Return from cache
                  │
                  └→ Miss → Read from disk → Store in cache → Return
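
The hit/miss flow above can be sketched as a tiny direct-mapped block cache. Everything here is illustrative: `block_cache_t`, `cache_read`, the slot count, and especially `read_from_disk`, which only fabricates data in place of a real disk read.

```c
#include <stdbool.h>
#include <string.h>

#define CACHE_SLOTS 4
#define BLOCK_SIZE  16

typedef struct {
    bool          valid;
    unsigned long block_no;
    char          data[BLOCK_SIZE];
} cache_slot_t;

typedef struct {
    cache_slot_t  slots[CACHE_SLOTS];
    unsigned long hits;
    unsigned long misses;
} block_cache_t;

/* Hypothetical stand-in for a slow disk read: fills the block with a
   letter derived from the block number. */
static void read_from_disk(unsigned long block_no, char *out) {
    memset(out, (int)('A' + block_no % 26), BLOCK_SIZE);
}

/* Returns the block's data, reading from "disk" only on a miss. */
static const char *cache_read(block_cache_t *c, unsigned long block_no) {
    cache_slot_t *s = &c->slots[block_no % CACHE_SLOTS];
    if (s->valid && s->block_no == block_no) {
        c->hits++;                      /* hit: return the cached copy */
        return s->data;
    }
    c->misses++;                        /* miss: fetch and store in cache */
    read_from_disk(block_no, s->data);
    s->block_no = block_no;
    s->valid = true;
    return s->data;
}
```

Because each block maps to exactly one slot, two blocks that share a slot (such as 1 and 5 with four slots) evict each other. Real caches use smarter placement and replacement policies.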

Spooling

An output queuing mechanism for devices that can handle only one job at a time (e.g., printers).

Process A ─→ ┌────────────┐
Process B ─→ │ Spool Queue│ ─→ Printer (one at a time)
Process C ─→ │ (disk)     │
              └────────────┘
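
At its core a spool is a first-in, first-out queue of jobs. The sketch below keeps job IDs in a small in-memory ring buffer for brevity, whereas a real spooler stores the job data on disk as the diagram indicates; `spool_queue_t`, `spool_submit`, and `spool_next` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define SPOOL_CAPACITY 8

typedef struct {
    int    jobs[SPOOL_CAPACITY];
    size_t head;   /* index of the next job to hand to the device */
    size_t count;  /* number of queued jobs */
} spool_queue_t;

/* Called by any process: queues a job. Returns false if the spool is full. */
static bool spool_submit(spool_queue_t *q, int job_id) {
    if (q->count == SPOOL_CAPACITY)
        return false;
    q->jobs[(q->head + q->count) % SPOOL_CAPACITY] = job_id;
    q->count++;
    return true;
}

/* Called by the spooler: takes the oldest job. Returns false if none. */
static bool spool_next(spool_queue_t *q, int *job_id) {
    if (q->count == 0)
        return false;
    *job_id = q->jobs[q->head];
    q->head = (q->head + 1) % SPOOL_CAPACITY;
    q->count--;
    return true;
}
```

Processes submit jobs in any interleaving, and the device receives them one at a time in submission order.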

7. I/O Performance

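I/O is often a major bottleneck in overall system performance, because devices are far slower than the CPU and each request involves system calls, interrupts, and data copies.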

Performance Improvement Principles

┌────────────────────────────────────────┐
│      I/O Performance Optimization      │
│                                        │
│  1. Reduce context switch count        │
│  2. Reduce data copy count             │
│     (Zero-copy technique)              │
│  3. Reduce interrupt frequency         │
│     (Interrupt coalescing)             │
│  4. Use DMA to reduce CPU load         │
│  5. Appropriate mix of polling         │
│     and interrupts                     │
│  6. Offload functionality to hardware  │
│     (Hardware Offloading)              │
└────────────────────────────────────────┘

Zero-copy Transfer Example

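// Traditional approach: 4 data copies
// Disk → kernel buffer → user buffer → socket buffer → NIC

// Zero-copy using sendfile() (Linux)
#include <sys/sendfile.h>

// Transfer directly from the file to the socket (copies stay inside the kernel)
ssize_t sent = sendfile(socket_fd, file_fd, &offset, count);
// sendfile() may send fewer than count bytes; check the return value
// Disk → kernel buffer → NIC (no copy through user space)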

8. Summary

Polling: Simple but wastes CPU. Suitable for short I/O
Interrupts: CPU-efficient but incur handling overhead. Used for most I/O
DMA: Essential for bulk data transfer. Minimizes CPU load
Kernel I/O Subsystem: Ensures performance and compatibility through scheduling, buffering, caching, and spooling
Performance Optimization: Various techniques including zero-copy, interrupt coalescing, and hardware offloading

Quiz: I/O Systems

Q1. What is the difference between polling and interrupt-driven I/O, and when is each suitable?

A1. With polling, the CPU repeatedly checks the device status; for very short I/O operations this can cost less than the overhead of taking an interrupt. With interrupts, the device notifies the CPU upon completion, so the CPU can do other work while waiting, which makes interrupts the better fit for most I/O operations.

Q2. How does DMA benefit CPU performance?

A2. The DMA controller handles data transfer between memory and devices directly, so the CPU does not have to wait for the transfer to complete and can perform other computations. This significantly reduces CPU utilization, especially during large disk I/O or network transfers.

Q3. What is the difference between buffering and caching?

A3. Buffering is temporary storage to smooth out speed differences between producers and consumers; data is removed from the buffer once consumed. Caching maintains copies of frequently accessed data in faster storage for reuse.