[OS Concepts] 12. I/O Systems

I/O Systems

One of the core roles of an operating system is to manage various I/O devices and provide a consistent interface to applications. This article examines everything from I/O hardware architecture to the kernel I/O subsystem.

1. I/O Hardware

Ports, Buses, and Controllers

┌────────────────────────────────────────────────────────┐
│                   CPU                                  │
│               ┌──────────┐                             │
│               │  Memory  │                             │
│               └────┬─────┘                             │
│                    │ System bus                        │
│       ┌────────────┼────────────┐                      │
│       │            │            │                      │
│  ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐                │
│  │   SATA   │ │   USB    │ │   PCIe   │  ← Controllers │
│  │Controller│ │Controller│ │Controller│                │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘                │
│       │            │            │                      │
│    ┌──┴───┐     ┌──┴───┐     ┌──┴───┐                  │
│    │ HDD  │     │Mouse │     │ GPU  │  ← Devices       │
│    └──────┘     └──────┘     └──────┘                  │
└────────────────────────────────────────────────────────┘

Port: Connection point to a device (e.g., USB port, SATA port)
Bus: A shared communication path that carries signals (e.g., PCIe bus)
Controller: Electronic circuitry that manages ports, buses, and devices

Device Controller Registers

Each device controller typically has the following registers:

Register	Role
data-in	Data for the host to read
data-out	Data for the host to write
status	Device state (busy, ready, error)
command	Commands issued by the host
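
The register set above can be modeled in C as a small struct. This is only a sketch: the field widths, the bit positions, and the names `device_regs_t`, `STATUS_*`, and `device_is_ready` are assumptions made for illustration and do not describe any real controller's layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits; real controllers define their own layout. */
#define STATUS_BUSY  (1u << 0)
#define STATUS_READY (1u << 1)
#define STATUS_ERROR (1u << 2)

/* Illustrative register block. On real hardware this would overlay a
 * memory-mapped or port-mapped address range, hence `volatile`. */
typedef struct {
    volatile uint8_t data_in;   /* read by the host to get input      */
    volatile uint8_t data_out;  /* written by the host to send output */
    volatile uint8_t status;    /* busy / ready / error bits          */
    volatile uint8_t command;   /* command written by the host        */
} device_regs_t;

/* True when the device reports ready, is not busy, and has no error. */
static bool device_is_ready(const device_regs_t *regs) {
    assert(regs != NULL);
    uint8_t s = regs->status;
    return (s & STATUS_READY) && !(s & (STATUS_BUSY | STATUS_ERROR));
}
```

A driver would typically check something like `device_is_ready()` before writing to `data_out` and `command`.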

2. Polling

The CPU repeatedly checks the device's status register to detect I/O completion.

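// Polling-based I/O example (pseudo code)
// read_status_register(), write_data_register(), and write_command_register()
// stand for device-specific register accesses.
void polling_write(char data) {
    // 1. Busy-wait until the device is ready
    while (read_status_register() & BUSY_BIT)
        ; // busy waiting

    // 2. Write the data to the data register
    write_data_register(data);

    // 3. Set the write command in the command register
    write_command_register(WRITE_COMMAND);

    // 4. Wait again until the operation completes
    while (read_status_register() & BUSY_BIT)
        ;
}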

Advantages: Simple implementation, efficient for short I/O operations
Disadvantages: Wastes CPU cycles (busy waiting), inefficient for long I/O operations
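
Because an unbounded busy-wait can hang forever if the device never becomes ready, polling loops are often given an upper bound. The sketch below shows that idea against a simulated device; `sim_device_t`, `sim_read_status`, `wait_not_busy`, and `polling_write_bounded` are hypothetical names, and the "device" is just a counter that stays busy for a fixed number of status reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUSY_BIT 0x01u

/* Simulated device: reports busy for `busy_reads_left` status reads. */
typedef struct {
    unsigned busy_reads_left;
    uint8_t  last_written;
    bool     wrote;
} sim_device_t;

static uint8_t sim_read_status(sim_device_t *dev) {
    if (dev->busy_reads_left > 0) {
        dev->busy_reads_left--;
        return BUSY_BIT;
    }
    return 0;
}

/* Spin until the busy bit clears or `max_spins` reads have been made.
 * Returns true if the device became ready in time. */
static bool wait_not_busy(sim_device_t *dev, unsigned max_spins) {
    for (unsigned i = 0; i < max_spins; i++) {
        if (!(sim_read_status(dev) & BUSY_BIT))
            return true;
    }
    return false;
}

/* Polling write with a timeout instead of an unbounded busy-wait. */
static bool polling_write_bounded(sim_device_t *dev, uint8_t data,
                                  unsigned max_spins) {
    assert(dev != NULL);
    if (!wait_not_busy(dev, max_spins))
        return false;              /* give up: device never became ready  */
    dev->last_written = data;      /* stands in for the data-out register */
    dev->wrote = true;
    return true;
}
```

With `busy_reads_left = 3` and `max_spins = 10` the write succeeds on the fourth status read; with a device that stays busy longer than `max_spins`, the function returns `false` instead of spinning indefinitely.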

3. Interrupts

The device signals the CPU when an I/O operation completes. Meanwhile the CPU does other work and runs a handler only when the interrupt arrives.

Interrupt Handling Flow

┌──────┐      1. I/O Request        ┌──────────┐
│ CPU  │ ──────────────────────→    │ Device   │
│      │                            │Controller│
│      │  Performing other work     │          │
│      │                            │ I/O in   │
│      │                            │ progress │
│      │  ←── 2. Interrupt signal ──│ Done!    │
│      │                            └──────────┘
│      │  3. Save current state
│      │  4. Execute interrupt handler
│      │  5. Restore state, resume work
└──────┘

Interrupt Vector Table

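// Interrupt vector table concept (pseudo code)
// The handler functions are assumed to be defined elsewhere in the kernel.
#define NUM_VECTORS 256

typedef void (*interrupt_handler_t)(void);

interrupt_handler_t interrupt_vector[NUM_VECTORS];

// Register handlers during initialization
// (the numbers follow the common x86 layout: vectors 0-31 are CPU exceptions)
void init_interrupts(void) {
    interrupt_vector[0]  = divide_error_handler;
    interrupt_vector[1]  = debug_handler;
    interrupt_vector[14] = page_fault_handler;
    interrupt_vector[32] = timer_handler;
    interrupt_vector[33] = keyboard_handler;
    interrupt_vector[46] = disk_handler;
    // ...
}

// Dispatch when an interrupt occurs
void dispatch_interrupt(int vector_num) {
    // Ignore out-of-range vectors and vectors with no registered handler
    if (vector_num < 0 || vector_num >= NUM_VECTORS)
        return;
    if (interrupt_vector[vector_num] != NULL)
        interrupt_vector[vector_num]();
}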

Interrupt Priority

The operating system assigns priorities to interrupts so that important ones are processed first.

High Priority
  │  ┌────────────────────────┐
  │  │ NMI (Non-Maskable)     │ ← Hardware errors
  │  │ Timer Interrupt        │ ← Scheduling
  │  │ Disk Interrupt         │ ← I/O completion
  │  │ Network Interrupt      │ ← Packet arrival
  │  │ Keyboard/Mouse         │ ← User input
  ▼  └────────────────────────┘
Low Priority
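
One way to picture priority handling is a pair of bitmasks: one for pending interrupts and one for masked (temporarily disabled) interrupts. The sketch below picks the highest-priority deliverable interrupt. The IRQ numbering, the `IRQ_*` names, and `next_interrupt` are assumptions for illustration; real interrupt controllers implement this selection in hardware with their own numbering.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IRQ lines, numbered so that a lower number means a
 * higher priority, mirroring the ordering in the diagram above. */
enum { IRQ_NMI = 0, IRQ_TIMER, IRQ_DISK, IRQ_NETWORK, IRQ_KEYBOARD, IRQ_COUNT };

/* Returns the highest-priority pending IRQ that is not masked,
 * or -1 if none is deliverable. The NMI ignores the mask. */
static int next_interrupt(uint32_t pending, uint32_t masked) {
    assert(IRQ_COUNT <= 32);
    uint32_t deliverable = pending & (~masked | (1u << IRQ_NMI));
    for (int irq = 0; irq < IRQ_COUNT; irq++) {
        if (deliverable & (1u << irq))
            return irq;
    }
    return -1;
}
```

If both the disk and keyboard interrupts are pending, the disk interrupt is chosen first; masking the disk line lets the keyboard interrupt through, while the NMI is delivered even when every bit in the mask is set.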

4. DMA (Direct Memory Access)

DMA transfers bulk data directly between a device and memory. The CPU only sets up the transfer and handles the completion interrupt; it does not copy the data itself.

DMA Operation Process

┌──────┐                       ┌───────────────────┐
│ CPU  │  1. Set up DMA        │ DMA Controller    │
│      │     request           │                   │
│      │ ────────────────────→ │                   │
│      │                       │ 2. Acquire bus    │
│      │  Free to do other     │    control        │
│      │  work                 │                   │
│      │                       │ 3. Transfer       │
│      │                       │    directly:      │
│      │                       │    device ↔ memory│
│      │  ←── 4. Completion    │                   │
│      │      interrupt        │ Transfer complete │
└──────┘                       └───────────────────┘

// DMA transfer setup (pseudo code)
// dma_controller stands for the controller's memory-mapped registers.
void setup_dma_transfer(
    void *buffer,        // memory buffer address
    int   device_id,     // target device
    int   byte_count,    // number of bytes to transfer
    int   direction      // READ or WRITE
) {
    dma_controller.address  = buffer;
    dma_controller.count    = byte_count;
    dma_controller.device   = device_id;
    dma_controller.command  = direction;

    // Start the DMA transfer; the CPU is free to do other work
    dma_controller.start    = 1;
}
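
The four steps in the diagram can be walked through with a small simulation. This is a sketch under heavy assumptions: `sim_dma_t`, `sim_dma_start`, `sim_dma_run`, and `mark_done` are hypothetical names, the "device" is an ordinary memory region, and the transfer runs when `sim_dma_run` is called rather than concurrently with the CPU as it would in hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*dma_done_fn)(void *ctx);

/* Simulated DMA controller state (all names are illustrative). */
typedef struct {
    void       *mem;        /* memory buffer address           */
    const void *dev_data;   /* stands in for the device side   */
    size_t      count;      /* bytes to transfer               */
    dma_done_fn on_done;    /* models the completion interrupt */
    void       *ctx;
    bool        active;
} sim_dma_t;

/* Step 1: the CPU programs the controller and returns immediately. */
static void sim_dma_start(sim_dma_t *dma, void *mem, const void *dev_data,
                          size_t count, dma_done_fn on_done, void *ctx) {
    assert(dma != NULL && !dma->active);
    dma->mem = mem;
    dma->dev_data = dev_data;
    dma->count = count;
    dma->on_done = on_done;
    dma->ctx = ctx;
    dma->active = true;
}

/* Steps 2-4: the controller moves the data itself, then signals
 * completion. In hardware this would overlap with CPU execution. */
static void sim_dma_run(sim_dma_t *dma) {
    if (!dma->active)
        return;
    memcpy(dma->mem, dma->dev_data, dma->count);
    dma->active = false;
    if (dma->on_done)
        dma->on_done(dma->ctx);
}

/* Example completion handler: sets a flag the CPU side can check. */
static void mark_done(void *ctx) { *(bool *)ctx = true; }
```

After `sim_dma_start()` returns, the completion flag is still false; it becomes true only once `sim_dma_run()` has copied the data and invoked the handler, which mirrors the CPU learning about completion through an interrupt.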

5. Application I/O Interface

The operating system abstracts various devices into several categories to provide a consistent interface.

Device Type Interfaces

┌────────────────────────────────────────────┐
│                Application                 │
│          read()  write()  ioctl()          │
├────────────────────────────────────────────┤
│            Kernel I/O Subsystem            │
├────────┬────────┬────────┬────────┬────────┤
│ Block  │ Char   │Network │ Clock/ │ Other  │
│ device │ device │ socket │ timer  │        │
├────────┼────────┼────────┼────────┼────────┤
│ Disk   │Keyboard│ NIC    │ RTC    │        │
│ SSD    │ Mouse  │        │ PIT    │        │
└────────┴────────┴────────┴────────┴────────┘

Device Type	Characteristics	Key Operations	Examples
Block Device	Fixed-size block access	read, write, seek	Disk, SSD
Character Device	Byte stream	get, put	Keyboard, serial port
Network Device	Socket interface	send, receive	NIC
Clock/Timer	Time measurement, alerts	get_time, set_timer	RTC, HPET
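
A common way to give different device types one interface in C is a table of function pointers that each driver fills in. The sketch below shows the pattern with a toy character device. The names `device_ops_t`, `device_t`, `chr_ops`, `dev_read`, and `dev_write` are hypothetical and are not taken from any particular kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical uniform interface: every device type fills in the same slots. */
typedef struct device device_t;
typedef struct {
    long (*read)(device_t *dev, void *buf, size_t len);
    long (*write)(device_t *dev, const void *buf, size_t len);
} device_ops_t;

struct device {
    const device_ops_t *ops;
    char   storage[64];   /* backing store for this toy device */
    size_t used;
};

/* Toy character device: write appends bytes, read drains from the front. */
static long chr_write(device_t *dev, const void *buf, size_t len) {
    size_t room = sizeof dev->storage - dev->used;
    if (len > room)
        len = room;
    memcpy(dev->storage + dev->used, buf, len);
    dev->used += len;
    return (long)len;
}

static long chr_read(device_t *dev, void *buf, size_t len) {
    if (len > dev->used)
        len = dev->used;
    memcpy(buf, dev->storage, len);
    memmove(dev->storage, dev->storage + len, dev->used - len);
    dev->used -= len;
    return (long)len;
}

static const device_ops_t chr_ops = { .read = chr_read, .write = chr_write };

/* Generic entry points: callers never need to know the device type. */
static long dev_write(device_t *dev, const void *buf, size_t len) {
    assert(dev && dev->ops && dev->ops->write);
    return dev->ops->write(dev, buf, len);
}

static long dev_read(device_t *dev, void *buf, size_t len) {
    assert(dev && dev->ops && dev->ops->read);
    return dev->ops->read(dev, buf, len);
}
```

A block device or network device would supply its own operations table, and code that calls `dev_read()` or `dev_write()` would not need to change.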

Blocking vs Non-blocking I/O

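// Assumes <unistd.h>, <fcntl.h>, <errno.h>, <string.h>, and <aio.h> are included,
// and that fd, buffer, and size are already set up.

// Blocking I/O: the process waits until the read completes
ssize_t bytes = read(fd, buffer, size);
// This line runs only after read() has returned

// Non-blocking I/O: returns immediately
// Add O_NONBLOCK without clobbering the other file status flags
int flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);
bytes = read(fd, buffer, size);
if (bytes == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
    // No data is available yet; retry later
}

// Asynchronous I/O: submit the request, get notified on completion
struct aiocb cb;
memset(&cb, 0, sizeof cb);  // unused fields must be zeroed
cb.aio_fildes = fd;
cb.aio_buf    = buffer;
cb.aio_nbytes = size;
cb.aio_offset = 0;          // file offset to read from
aio_read(&cb);  // returns immediately
// Check for completion later
while (aio_error(&cb) == EINPROGRESS)
    do_other_work();
ssize_t result = aio_return(&cb);  // final status; call once after completion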
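
The non-blocking case is easy to observe with a pipe, since an empty pipe has nothing to read. The following sketch uses only standard POSIX calls (`fcntl`, `read`); the helper names `set_nonblocking` and `read_would_block` are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Sets O_NONBLOCK while preserving the descriptor's other status flags. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Attempts one read and stores its result in *out. Returns true if the
 * read failed only because no data is available yet. */
static bool read_would_block(int fd, char *buf, size_t len, ssize_t *out) {
    assert(buf != NULL && out != NULL);
    *out = read(fd, buf, len);
    return *out == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

On the read end of an empty pipe made non-blocking with `set_nonblocking()`, `read_would_block()` returns true; after something is written to the other end, the same call returns false and `*out` holds the number of bytes read.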

6. Kernel I/O Subsystem

The kernel provides several services for efficient I/O management.

I/O Scheduling

Optimizes the execution order of multiple I/O requests.

Request Queue:  [Disk Read A] [Disk Write B] [Disk Read C]
                    ↓ I/O Scheduler (reorders)
Execution Order: [Disk Read A] [Disk Read C] [Disk Write B]
                 → Minimizes disk head movement (assuming A and C lie on nearby tracks)
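
The effect of reordering can be quantified by summing the head movement for a given service order. The text does not name a specific algorithm, so the sketch below compares plain arrival order (FCFS) with shortest-seek-time-first (SSTF) as one well-known example. The function names and the sample request queue used in the note are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_REQUESTS 64

/* Total head movement when requests are served in arrival order (FCFS). */
static long fcfs_distance(int head, const int *req, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += labs((long)req[i] - head);
        head = req[i];
    }
    return total;
}

/* Total head movement under SSTF: always serve the pending request
 * closest to the current head position. */
static long sstf_distance(int head, const int *req, size_t n) {
    assert(n <= MAX_REQUESTS);
    bool served[MAX_REQUESTS] = { false };
    long total = 0;
    for (size_t done = 0; done < n; done++) {
        size_t best = n;
        for (size_t i = 0; i < n; i++) {
            if (served[i])
                continue;
            if (best == n ||
                labs((long)req[i] - head) < labs((long)req[best] - head))
                best = i;
        }
        served[best] = true;
        total += labs((long)req[best] - head);
        head = req[best];
    }
    return total;
}
```

For a head starting at cylinder 53 and the queue 98, 183, 37, 122, 14, 124, 65, 67, FCFS moves the head 640 cylinders in total while SSTF moves it 236. SSTF can starve distant requests, which is one reason real schedulers use other policies.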

Buffering

Uses temporary storage to smooth out speed differences during data transfer.

Producer (Device)                   Consumer (Process)
    │                                  │
    │  ┌──────────┐                    │
    ├→ │ Buffer 1 │ (filling)          │
    │  └──────────┘                    │
    │  ┌──────────┐                    │
    │  │ Buffer 2 │ ────────────────→  ├→ Processing
    │  └──────────┘  (draining)        │
    │                                  │
    │  Double buffering: fill and      │
    │  drain simultaneously            │
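
The role swap at the heart of double buffering can be sketched with two fixed-size arrays and an index that says which one is being filled. This single-threaded sketch only shows the bookkeeping; `double_buffer_t`, `db_put`, and `db_swap` are hypothetical names, and a real implementation would fill and drain concurrently with proper synchronization.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 4

/* Two buffers: the producer fills one while the consumer drains the other. */
typedef struct {
    char   data[2][BUF_SIZE];
    size_t len[2];
    int    fill;            /* index of the buffer currently being filled */
} double_buffer_t;

/* Producer side: append one byte. Returns 0 when the fill buffer is full,
 * meaning the buffers must be swapped before producing more. */
static int db_put(double_buffer_t *db, char c) {
    assert(db != NULL);
    if (db->len[db->fill] == BUF_SIZE)
        return 0;
    db->data[db->fill][db->len[db->fill]++] = c;
    return 1;
}

/* Swap roles: the filled buffer becomes the drain buffer and the other
 * buffer is reset for filling. Returns the index of the drain buffer. */
static int db_swap(double_buffer_t *db) {
    int drain = db->fill;
    db->fill = 1 - db->fill;
    db->len[db->fill] = 0;
    return drain;
}
```

Once the first buffer is full, `db_swap()` hands it to the consumer and the producer can keep writing into the second buffer without waiting for the first to be emptied.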

Caching

Keeps copies of frequently accessed data in faster storage.

Application → Check cache → Hit? → Return from cache
                  │
                  └→ Miss → Read from disk → Store in cache → Return
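
The hit/miss flow above can be sketched as a tiny read-through block cache. The direct-mapped placement, the sizes, and the names `block_cache_t`, `cache_read`, and `disk` are assumptions chosen to keep the example short; real buffer caches use more elaborate lookup and replacement policies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE  16
#define CACHE_SLOTS 4
#define DISK_BLOCKS 32

/* Simulated slow backing store. */
static char disk[DISK_BLOCKS][BLOCK_SIZE];

typedef struct {
    bool valid;
    int  block_no;
    char data[BLOCK_SIZE];
} cache_slot_t;

typedef struct {
    cache_slot_t slot[CACHE_SLOTS];
    unsigned hits, misses;
} block_cache_t;

/* Read-through lookup, direct-mapped: block N can only live in slot
 * N % CACHE_SLOTS, so a conflicting block evicts the previous occupant. */
static const char *cache_read(block_cache_t *c, int block_no) {
    assert(c != NULL && block_no >= 0 && block_no < DISK_BLOCKS);
    cache_slot_t *s = &c->slot[block_no % CACHE_SLOTS];
    if (s->valid && s->block_no == block_no) {
        c->hits++;                          /* hit: return cached copy */
    } else {
        c->misses++;                        /* miss: read from "disk"  */
        memcpy(s->data, disk[block_no], BLOCK_SIZE);
        s->block_no = block_no;
        s->valid = true;
    }
    return s->data;
}
```

Reading the same block twice yields one miss followed by one hit. Reading two blocks that map to the same slot (for example 5 and 9 with four slots) makes them evict each other.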

Spooling

An output queuing mechanism for devices that can handle only one job at a time (e.g., printers).

Process A ─→ ┌────────────┐
Process B ─→ │ Spool Queue│ ─→ Printer (one at a time)
Process C ─→ │ (disk)     │
             └────────────┘
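
At its core the spool is a first-in, first-out queue of jobs. The sketch below keeps the queue in memory as a fixed-size ring, whereas the spool in the diagram lives on disk. `spool_t`, `spool_submit`, and `spool_next` are hypothetical names, and jobs are reduced to integer IDs.

```c
#include <assert.h>
#include <stddef.h>

#define SPOOL_CAPACITY 8

/* FIFO spool: many processes submit jobs, one printer takes them in order. */
typedef struct {
    int    job_id[SPOOL_CAPACITY];
    size_t head;    /* index of the next job to print */
    size_t count;   /* number of queued jobs          */
} spool_t;

/* Returns 1 on success, 0 if the spool is full. */
static int spool_submit(spool_t *s, int job_id) {
    assert(s != NULL);
    if (s->count == SPOOL_CAPACITY)
        return 0;
    s->job_id[(s->head + s->count) % SPOOL_CAPACITY] = job_id;
    s->count++;
    return 1;
}

/* Returns the next job for the printer, or -1 if the spool is empty. */
static int spool_next(spool_t *s) {
    assert(s != NULL);
    if (s->count == 0)
        return -1;
    int job = s->job_id[s->head];
    s->head = (s->head + 1) % SPOOL_CAPACITY;
    s->count--;
    return job;
}
```

Jobs come out in the order they were submitted, regardless of which process submitted them, so the printer only ever sees one job at a time.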

7. I/O Performance

I/O is a major bottleneck in overall system performance.

Performance Improvement Principles

┌────────────────────────────────────────┐
│      I/O Performance Optimization      │
│                                        │
│  1. Reduce context switch count        │
│  2. Reduce data copy count             │
│     (Zero-copy technique)              │
│  3. Reduce interrupt frequency         │
│     (Interrupt coalescing)             │
│  4. Use DMA to reduce CPU load         │
│  5. Appropriate mix of polling         │
│     and interrupts                     │
│  6. Offload functionality to hardware  │
│     (Hardware Offloading)              │
└────────────────────────────────────────┘

Zero-copy Transfer Example

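// Traditional approach (read() followed by write()): 4 data copies
// disk → kernel buffer → user buffer → socket buffer → NIC

// Zero-copy with sendfile() (Linux)
#include <sys/sendfile.h>

// Transfer from file to socket without copying through user space
ssize_t sent = sendfile(socket_fd, file_fd, &offset, count);
// disk → kernel buffer → NIC (no user-space copy; skipping the socket-buffer
// copy as well depends on NIC scatter-gather DMA support)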
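
`sendfile()` may transfer fewer bytes than requested, so callers usually wrap it in a loop. The sketch below shows one way to do that on Linux. The helper name `send_whole_file` is an assumption for this example, and it always starts at offset 0 of the input file.

```c
#include <assert.h>
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/* Sends up to `count` bytes from in_fd (starting at offset 0) to out_fd,
 * retrying after short transfers. Returns bytes sent, or -1 on error. */
static ssize_t send_whole_file(int out_fd, int in_fd, size_t count) {
    assert(out_fd >= 0 && in_fd >= 0);
    off_t  offset = 0;
    size_t remaining = count;
    while (remaining > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, remaining);
        if (n == -1) {
            if (errno == EINTR)
                continue;               /* interrupted: retry */
            return -1;
        }
        if (n == 0)
            break;                      /* reached end of input file */
        remaining -= (size_t)n;
    }
    return (ssize_t)(count - remaining);
}
```

Because the offset is passed by pointer, the kernel updates it after each call and the file position of `in_fd` itself is left unchanged. For a non-blocking socket, a real caller would also need to handle `EAGAIN` by waiting until the socket is writable.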

8. Summary

Polling: Simple but wastes CPU. Suitable for short I/O
Interrupts: CPU-efficient, but each interrupt carries overhead. Used for most I/O
DMA: Essential for bulk data transfer. Minimizes CPU load
Kernel I/O Subsystem: Ensures performance and compatibility through scheduling, buffering, caching, and spooling
Performance Optimization: Various techniques including zero-copy, interrupt coalescing, and hardware offloading

Quiz: I/O Systems

Q1. What is the difference between polling and interrupt-driven I/O, and when is each suitable?

A1. In polling, the CPU repeatedly checks the device status; for very short I/O operations this can cost less than the overhead of taking an interrupt. In interrupt-driven I/O, the device notifies the CPU on completion, so the CPU can do other work while waiting, which makes interrupts the better fit for most I/O.

Q2. How does DMA benefit CPU performance?

A2. The DMA controller handles data transfer between memory and devices directly, so the CPU does not have to wait for the transfer to complete and can perform other computations. This significantly reduces CPU utilization, especially during large disk I/O or network transfers.

Q3. What is the difference between buffering and caching?

A3. Buffering is temporary storage to smooth out speed differences between producers and consumers; data is removed from the buffer once consumed. Caching maintains copies of frequently accessed data in faster storage for reuse.