Split View: [운영체제] 10. 가상 메모리: 요구 페이징부터 스래싱까지

[운영체제] 10. 가상 메모리: 요구 페이징부터 스래싱까지

가상 메모리 개념

가상 메모리(Virtual Memory)는 프로세스 전체가 물리 메모리에 올라와 있지 않아도 실행할 수 있게 해주는 기술이다. 프로그래머에게 실제 물리 메모리보다 훨씬 큰 주소 공간을 제공한다.

[가상 메모리의 동작]

가상 주소 공간 (프로세스 관점):      물리 메모리:
+-----------+ 0xFFFF...           +-----------+
|   스택     |                     |   OS      |
|    |      |                     +-----------+
|    v      |                     | 페이지 A   | <-- 가상 페이지 0
|           |                     +-----------+
|           |                     | 페이지 C   | <-- 가상 페이지 5
|    ^      |                     +-----------+
|    |      |                     | 페이지 D   | <-- 가상 페이지 2
|   힙      |                     +-----------+
+-----------+                     |   빈 공간  |
|  데이터    |                     +-----------+
+-----------+
|  코드     |                     디스크 (스왑 영역):
+-----------+ 0x0000...           +-----------+
                                  | 페이지 B   | <-- 가상 페이지 1
4GB 가상 공간 중 일부만            | 페이지 E   | <-- 가상 페이지 3
물리 메모리에 올라가 있음           +-----------+

가상 메모리의 장점은 다음과 같다.

프로그램이 물리 메모리보다 클 수 있음
각 프로세스의 주소 공간이 격리됨
여러 프로세스가 페이지를 공유할 수 있음 (공유 라이브러리, fork 시 COW)
프로세스 생성이 효율적

요구 페이징 (Demand Paging)

프로그램 시작 시 모든 페이지를 메모리에 올리는 대신, 실제로 필요할 때만(on demand) 페이지를 로드한다.

[요구 페이징 개념]

페이지 테이블:
+-------+------+
| 프레임 | 유효  |
+-------+------+
|   3   |  v   |  <- 메모리에 있음
|   -   |  i   |  <- 디스크에 있음 (아직 로드 안 됨)
|   7   |  v   |  <- 메모리에 있음
|   -   |  i   |  <- 디스크에 있음
+-------+------+

v = valid (메모리에 있음)
i = invalid (메모리에 없음, 디스크에 있음)

페이지 폴트 (Page Fault)

유효하지 않은 페이지에 접근하면 페이지 폴트가 발생한다.

[페이지 폴트 처리 과정]

1. CPU가 페이지 테이블 참조
2. 유효 비트가 i (invalid) -> 트랩 발생
      |
      v
3. OS가 페이지 폴트 확인
   - 잘못된 참조(주소 범위 밖)? -> 프로세스 종료
   - 유효하지만 메모리에 없음? -> 계속 진행
      |
      v
4. 빈 프레임 찾기
      |
      v
5. 디스크에서 페이지를 빈 프레임으로 읽기 (I/O)
      |
      v
6. I/O 완료 시:
   - 페이지 테이블 갱신 (프레임 번호, 유효 비트 = v)
   - 프로세스 재시작 (폴트를 일으킨 명령어부터)

요구 페이징의 성능

[유효 접근 시간(EAT)]

p = 페이지 폴트 확률
ma = 메모리 접근 시간 (200ns)
페이지 폴트 처리 시간 = 8ms (8,000,000ns)

EAT = (1 - p) * ma + p * 페이지 폴트 시간
    = (1 - p) * 200 + p * 8,000,000

p = 0.001 (1000번에 1번 폴트) 이면:
EAT = 0.999 * 200 + 0.001 * 8,000,000
    = 199.8 + 8,000
    = 8,199.8ns

성능 저하 40배!

10% 이내 성능 저하를 위해서는:
220 > 200 + 7,999,800 * p
p < 0.0000025
즉, 400,000번 접근에 1번 이하의 페이지 폴트 필요

쓰기 시 복사 (Copy-on-Write, COW)

fork() 시 부모의 페이지를 즉시 복사하지 않고, 두 프로세스가 같은 페이지를 공유한다. 어느 한쪽이 페이지를 수정하려 할 때만 해당 페이지를 복사한다.

[Copy-on-Write 동작]

fork() 직후:
부모 페이지 테이블:          자식 페이지 테이블:
페이지0 -> 프레임 A (COW)   페이지0 -> 프레임 A (COW)
페이지1 -> 프레임 B (COW)   페이지1 -> 프레임 B (COW)
페이지2 -> 프레임 C (COW)   페이지2 -> 프레임 C (COW)

자식이 페이지1을 수정하면:
부모 페이지 테이블:          자식 페이지 테이블:
페이지0 -> 프레임 A (COW)   페이지0 -> 프레임 A (COW)
페이지1 -> 프레임 B         페이지1 -> 프레임 D (복사본)
페이지2 -> 프레임 C (COW)   페이지2 -> 프레임 C (COW)

수정된 페이지만 복사 -> 불필요한 복사 제거

// COW가 적용되는 fork + exec 패턴
#include <unistd.h>
#include <sys/wait.h>

int main() {
    pid_t pid = fork();
    // fork 시점: 부모의 페이지를 COW로 공유

    if (pid == 0) {
        // exec() 호출 시 자식의 주소 공간이 완전히 교체됨
        // COW 덕분에 fork에서 불필요한 페이지 복사 없음
        execlp("ls", "ls", "-la", NULL);
    } else {
        wait(NULL);
    }

    return 0;
}

페이지 교체 (Page Replacement)

물리 메모리가 가득 찼을 때 새 페이지를 올리려면, 기존 페이지 중 하나를 디스크로 내보내야(교체해야) 한다.

[페이지 교체 과정]

1. 디스크에서 페이지 위치 확인
2. 빈 프레임 찾기
   - 빈 프레임이 있으면 사용
   - 없으면 교체 알고리즘으로 희생 페이지 선택
     - 희생 페이지가 수정되었으면(dirty bit) 디스크에 기록
     - 수정되지 않았으면 그냥 덮어쓰기
3. 새 페이지를 빈 프레임에 로드
4. 페이지 테이블 갱신
5. 프로세스 재시작

1. FIFO 페이지 교체

가장 먼저 올라온 페이지를 가장 먼저 교체한다.

[FIFO 예시] 프레임 3개, 참조열: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

참조  7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
     [7][ ][ ] -> 폴트
     [7][0][ ] -> 폴트
     [7][0][1] -> 폴트
     [2][0][1] -> 폴트 (7 교체)
     [2][0][1] -> 히트
     [2][3][1] -> 폴트 (0 교체)
     [2][3][0] -> 폴트 (1 교체)
     [4][3][0] -> 폴트 (2 교체)
     [4][2][0] -> 폴트 (3 교체)
     [4][2][3] -> 폴트 (0 교체)
     [0][2][3] -> 폴트 (4 교체)
     [0][2][3] -> 히트
     [0][2][3] -> 히트
     [0][1][3] -> 폴트 (2 교체)
     [0][1][2] -> 폴트 (3 교체)
     [0][1][2] -> 히트
     [0][1][2] -> 히트
     [7][1][2] -> 폴트 (0 교체)
     [7][0][2] -> 폴트 (1 교체)
     [7][0][1] -> 폴트 (2 교체)

총 페이지 폴트: 15

벨래디의 이상현상(Belady's Anomaly): FIFO에서는 프레임 수를 늘려도 페이지 폴트가 증가할 수 있다.

2. 최적 페이지 교체 (OPT)

앞으로 가장 오랫동안 사용되지 않을 페이지를 교체한다. 이론적 최적이지만 미래를 알 수 없으므로 실제 구현이 불가능하다. 다른 알고리즘의 비교 기준으로 사용한다.

[OPT 예시] 프레임 3개, 참조열: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

참조  7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
     [7][ ][ ] -> 폴트
     [7][0][ ] -> 폴트
     [7][0][1] -> 폴트
     [2][0][1] -> 폴트 (7: 가장 나중에 사용)
     [2][0][1] -> 히트
     [2][0][3] -> 폴트 (1: 가장 나중에 사용)
     [2][0][3] -> 히트
     [2][4][3] -> 폴트 (0을 곧 사용하므로 다른 것 교체)
     [2][4][3] -> 히트
     [2][4][3] -> 히트
     [2][0][3] -> 폴트
     [2][0][3] -> 히트
     [2][0][3] -> 히트
     [2][0][1] -> 폴트
     [2][0][1] -> 히트
     [2][0][1] -> 히트
     [2][0][1] -> 히트
     [7][0][1] -> 폴트
     [7][0][1] -> 히트
     [7][0][1] -> 히트

총 페이지 폴트: 9 (최적)

3. LRU 페이지 교체 (Least Recently Used)

가장 오랫동안 사용되지 않은 페이지를 교체한다. OPT의 근사치로, 과거 정보를 미래의 근사로 사용한다.

[LRU 예시] 프레임 3개, 참조열: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

참조  7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
     [7][ ][ ] -> 폴트
     [7][0][ ] -> 폴트
     [7][0][1] -> 폴트
     [2][0][1] -> 폴트 (7: 가장 오래 전 사용)
     [2][0][1] -> 히트
     [2][0][3] -> 폴트 (1: 가장 오래 전 사용)
     [2][0][3] -> 히트 (0이 최근 사용)
     [4][0][3] -> 폴트 (2: 가장 오래 전 사용)
     [4][0][2] -> 폴트 (3: 가장 오래 전 사용)
     [4][3][2] -> 폴트 (0: 가장 오래 전 사용)
     [0][3][2] -> 폴트 (4: 가장 오래 전 사용)
     [0][3][2] -> 히트
     [0][3][2] -> 히트
     [1][3][2] -> 폴트 (0: 가장 오래 전 사용)
     [1][3][2] -> 히트
     [1][0][2] -> 폴트 (3: 가장 오래 전 사용)
     [1][0][2] -> 히트
     [1][0][7] -> 폴트 (2: 가장 오래 전 사용)
     [1][0][7] -> 히트
     [1][0][7] -> 히트

총 페이지 폴트: 12

LRU 구현 방법은 다음과 같다.

카운터 방식: 각 페이지 테이블 항목에 접근 시간 기록. 교체 시 가장 오래된 것 선택
스택 방식: 페이지 접근 시 스택 맨 위로 이동. 맨 아래가 LRU 페이지

두 방법 모두 하드웨어 지원이 필요하며, 순수 LRU 구현은 비용이 높다.

4. 클럭 알고리즘 (Second-Chance / Clock)

LRU의 근사 알고리즘으로, 참조 비트(reference bit)를 사용한다.

[클럭 알고리즘]

프레임들이 원형 큐를 이룸
각 프레임에 참조 비트(ref)가 있음

         포인터
           |
           v
    [A, ref=1] -> [B, ref=0] -> [C, ref=1] -> [D, ref=0]
         ^                                          |
         |                                          |
         +------------------------------------------+

교체가 필요할 때:
1. 포인터가 가리키는 프레임 확인
2. ref=0이면: 이 페이지를 교체
3. ref=1이면: ref를 0으로 리셋하고, 포인터를 다음으로 이동
4. 2-3을 반복

동작 과정:
포인터 -> [A, ref=1]: ref=0으로 변경, 다음으로
포인터 -> [B, ref=0]: 이 페이지를 교체!

참조 비트는 하드웨어가 페이지 접근 시 자동으로 1로 설정

// 클럭 알고리즘 구현 (개념적)
#define NUM_FRAMES 64

struct frame {
    int page_number;
    int reference_bit;
    int dirty_bit;
};

struct frame frames[NUM_FRAMES];
int clock_hand = 0;

int clock_replace() {
    while (1) {
        if (frames[clock_hand].reference_bit == 0) {
            // 교체 대상 발견
            int victim = clock_hand;
            clock_hand = (clock_hand + 1) % NUM_FRAMES;
            return victim;
        }

        // 두 번째 기회 부여: 참조 비트 초기화
        frames[clock_hand].reference_bit = 0;
        clock_hand = (clock_hand + 1) % NUM_FRAMES;
    }
}

알고리즘 비교

[페이지 교체 알고리즘 비교]

알고리즘    성능      벨래디 이상   구현 비용   비고
--------   ------    ----------   --------   ----
FIFO       나쁨      O            낮음       가장 단순
OPT        최적      X            불가능     이론적 기준
LRU        좋음      X            높음       하드웨어 지원 필요
클럭       양호      X            중간       LRU 근사, 실용적

프레임 할당 (Frame Allocation)

여러 프로세스가 실행될 때, 각 프로세스에 몇 개의 프레임을 할당할지 결정해야 한다.

할당 전략

[프레임 할당 방식]

1. 균등 할당: 모든 프로세스에 동일한 수의 프레임
   100 프레임, 5 프로세스 -> 각 20 프레임

2. 비례 할당: 프로세스 크기에 비례하여 할당
   총 프레임 = 62
   P1 크기 = 10페이지 -> 10/137 * 62 = 4.5 -> 5 프레임
   P2 크기 = 127페이지 -> 127/137 * 62 = 57.5 -> 57 프레임

전역 교체 vs 지역 교체

전역 교체(Global Replacement): 모든 프레임 중에서 희생 페이지 선택. 처리량이 높지만 예측 불가능
지역 교체(Local Replacement): 해당 프로세스의 프레임 내에서만 교체. 예측 가능하지만 비효율적일 수 있음

스래싱 (Thrashing)

프로세스가 실행보다 페이지 교체에 더 많은 시간을 보내는 현상이다.

[스래싱 발생 과정]

1. 프로세스에 프레임이 부족
2. 페이지 폴트 빈번 발생
3. CPU 이용률 저하 (I/O 대기)
4. OS가 "CPU가 놀고 있다"고 판단
5. 다중 프로그래밍 정도 증가 (프로세스 추가)
6. 기존 프로세스의 프레임이 더 줄어듦
7. 더 많은 페이지 폴트 -> 악순환!

CPU 이용률
  ^
  |        /\
  |       /  \
  |      /    \
  |     /      \------> 스래싱!
  |    /
  |   /
  +--+---+---+---+----> 다중 프로그래밍 정도
     1   2   3   4

워킹 셋 모델 (Working-Set Model)

특정 시점에서 프로세스가 활발히 사용하는 페이지의 집합을 워킹 셋이라 한다.

[워킹 셋 개념]

참조열: ... 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 ...
                                            |<-- 윈도우 크기 (delta) -->|

시점 t1의 워킹 셋(delta=10):
WS(t1) = 페이지 1, 2, 3, 4 (윈도우 내 고유 페이지)

워킹 셋 크기 = 활발히 사용 중인 페이지 수

[워킹 셋 기반 프레임 할당]

총 필요 프레임 D = 모든 프로세스의 워킹 셋 크기 합

if D > 가용 프레임 수:
    스래싱 발생 가능
    -> 일부 프로세스를 일시 중단(swap out)

if D <= 가용 프레임 수:
    각 프로세스에 워킹 셋 크기만큼 프레임 할당
    -> 스래싱 방지

페이지 폴트 빈도 (Page-Fault Frequency)

워킹 셋보다 직접적인 방법으로, 페이지 폴트 비율을 직접 모니터링한다.

[PFF 기반 프레임 조절]

페이지 폴트율
  ^
  |
  |---- 상한선 ----  폴트율이 이 위면: 프레임 추가 할당
  |
  |
  |---- 하한선 ----  폴트율이 이 아래면: 프레임 회수
  |
  +--+---+---+---+----> 할당된 프레임 수

상한선 초과: 프로세스에 프레임을 더 할당
하한선 미달: 프로세스에서 프레임을 회수
추가 할당할 프레임이 없으면: 프로세스 일시 중단

메모리 매핑 파일 (Memory-Mapped Files)

파일 I/O를 메모리 접근으로 처리하는 기법이다. 파일의 일부 또는 전체를 프로세스의 가상 주소 공간에 매핑한다.

#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main() {
    // 파일 열기
    int fd = open("data.txt", O_RDWR);
    struct stat sb;
    fstat(fd, &sb);

    // 파일을 메모리에 매핑
    char *mapped = mmap(NULL, sb.st_size,
                        PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);

    if (mapped == MAP_FAILED) {
        perror("mmap 실패");
        return 1;
    }

    // 파일 내용을 메모리처럼 접근
    printf("파일 내용: %.50s\n", mapped);

    // 메모리에 쓰면 파일도 변경됨 (MAP_SHARED)
    memcpy(mapped, "수정된 내용", 15);

    // 변경 사항을 디스크에 동기화
    msync(mapped, sb.st_size, MS_SYNC);

    // 매핑 해제
    munmap(mapped, sb.st_size);
    close(fd);

    return 0;
}

[메모리 매핑 파일 동작]

프로세스 가상 주소 공간:       물리 메모리:         디스크:
+------------------+         +-----------+       +-------+
|                  |         |           |       |       |
| mmap 영역 -------|-------->| 파일 페이지 |<----->| 파일  |
|                  |         |           |       |       |
+------------------+         +-----------+       +-------+

1. mmap()으로 파일을 주소 공간에 매핑
2. 해당 영역 접근 시 페이지 폴트 -> 파일에서 로드
3. 메모리 쓰기 시 dirty 페이지 표시
4. 주기적으로 또는 munmap 시 디스크에 기록

장점:
- read/write 시스템 콜 오버헤드 없음
- 여러 프로세스가 같은 파일을 공유 가능
- 페이지 캐시를 자동으로 활용

프로세스 간 공유 메모리로서의 mmap

// 프로세스 간 공유 메모리 (MAP_SHARED + MAP_ANONYMOUS)
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/wait.h>

int main() {
    // 익명 공유 매핑 생성
    int *shared = mmap(NULL, sizeof(int),
                       PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS,
                       -1, 0);

    *shared = 0;

    pid_t pid = fork();

    if (pid == 0) {
        // 자식 프로세스
        *shared = 42;
        printf("자식: shared = %d\n", *shared);
    } else {
        // 부모 프로세스
        wait(NULL);
        printf("부모: shared = %d\n", *shared);
        // 출력: 부모: shared = 42
        munmap(shared, sizeof(int));
    }

    return 0;
}

정리

[가상 메모리 핵심 개념 요약]

+-------------------+--------------------+
| 요구 페이징        | 필요할 때만 로드     |
| COW               | fork 시 복사 최소화  |
+-------------------+--------------------+
| FIFO              | 단순하지만 성능 낮음  |
| OPT               | 이론적 최적 (구현 불가)|
| LRU               | 실용적이고 성능 좋음  |
| 클럭              | LRU 근사, 대부분 사용 |
+-------------------+--------------------+
| 스래싱             | 프레임 부족으로 폴트  |
|                   | 폭증하는 현상        |
| 워킹 셋           | 활발한 페이지 집합    |
|                   | 추적으로 스래싱 방지  |
+-------------------+--------------------+
| 메모리 매핑 파일    | 파일 I/O를 메모리    |
|                   | 접근으로 처리        |
+-------------------+--------------------+

가상 메모리는 물리 메모리의 한계를 극복하고 프로세스에게 독립적인 주소 공간을 제공하는 핵심 기술이다. 요구 페이징으로 필요한 페이지만 로드하고, LRU나 클럭 알고리즘으로 효율적인 페이지 교체를 수행하며, 워킹 셋 모델로 스래싱을 방지한다. 메모리 매핑 파일은 효율적인 파일 I/O와 프로세스 간 통신을 가능하게 한다.

[OS Concepts] 10. Virtual Memory: From Demand Paging to Thrashing

Virtual Memory Concepts

Virtual memory is a technology that allows a process to execute even when it is not entirely loaded in physical memory. It provides programmers with an address space much larger than actual physical memory.

[How Virtual Memory Works]

Virtual Address Space (Process view):      Physical Memory:
+-----------+ 0xFFFF...                   +-----------+
|   Stack    |                             |   OS      |
|    |      |                             +-----------+
|    v      |                             | Page A    | <-- Virtual page 0
|           |                             +-----------+
|           |                             | Page C    | <-- Virtual page 5
|    ^      |                             +-----------+
|    |      |                             | Page D    | <-- Virtual page 2
|   Heap    |                             +-----------+
+-----------+                             | Free      |
|  Data     |                             +-----------+
+-----------+
|  Code     |                             Disk (Swap area):
+-----------+ 0x0000...                   +-----------+
                                          | Page B    | <-- Virtual page 1
Only part of 4GB virtual space            | Page E    | <-- Virtual page 3
is in physical memory                     +-----------+

The advantages of virtual memory are as follows.

Programs can be larger than physical memory
Each process's address space is isolated
Multiple processes can share pages (shared libraries, COW during fork)
Process creation is efficient

Demand Paging

Instead of loading all pages into memory when a program starts, pages are loaded only when actually needed (on demand).

[Demand Paging Concept]

Page Table:
+-------+------+
| Frame | Valid |
+-------+------+
|   3   |  v   |  <- In memory
|   -   |  i   |  <- On disk (not yet loaded)
|   7   |  v   |  <- In memory
|   -   |  i   |  <- On disk
+-------+------+

v = valid (in memory)
i = invalid (not in memory, on disk)

Page Fault

A page fault occurs when an invalid page is accessed.

[Page Fault Handling Process]

1. CPU references page table
2. Valid bit is i (invalid) -> Trap raised
      |
      v
3. OS confirms page fault
   - Invalid reference (out of address range)? -> Terminate process
   - Valid but not in memory? -> Continue
      |
      v
4. Find a free frame
      |
      v
5. Read page from disk into free frame (I/O)
      |
      v
6. When I/O completes:
   - Update page table (frame number, valid bit = v)
   - Restart process (from the instruction that caused the fault)

Demand Paging Performance

[Effective Access Time (EAT)]

p = page fault probability
ma = memory access time (200ns)
Page fault handling time = 8ms (8,000,000ns)

EAT = (1 - p) * ma + p * page fault time
    = (1 - p) * 200 + p * 8,000,000

If p = 0.001 (1 fault per 1000 accesses):
EAT = 0.999 * 200 + 0.001 * 8,000,000
    = 199.8 + 8,000
    = 8,199.8ns

40x performance degradation!

For less than 10% degradation:
220 > 200 + 7,999,800 * p
p < 0.0000025
i.e., fewer than 1 page fault per 400,000 accesses needed

Copy-on-Write (COW)

During fork(), instead of immediately copying the parent's pages, both processes share the same pages. Only when either side attempts to modify a page is that page copied.

[Copy-on-Write Operation]

Right after fork():
Parent page table:           Child page table:
Page 0 -> Frame A (COW)    Page 0 -> Frame A (COW)
Page 1 -> Frame B (COW)    Page 1 -> Frame B (COW)
Page 2 -> Frame C (COW)    Page 2 -> Frame C (COW)

When child modifies Page 1:
Parent page table:           Child page table:
Page 0 -> Frame A (COW)    Page 0 -> Frame A (COW)
Page 1 -> Frame B          Page 1 -> Frame D (copy)
Page 2 -> Frame C (COW)    Page 2 -> Frame C (COW)

Only the modified page is copied -> Eliminates unnecessary copies

// fork + exec pattern where COW is applied
#include <unistd.h>
#include <sys/wait.h>

int main() {
    pid_t pid = fork();
    // At fork: parent's pages shared as COW

    if (pid == 0) {
        // When exec() is called, child's address space is completely replaced
        // Thanks to COW, no unnecessary page copying during fork
        execlp("ls", "ls", "-la", NULL);
    } else {
        wait(NULL);
    }

    return 0;
}

Page Replacement

When physical memory is full and a new page needs to be loaded, an existing page must be evicted (replaced) to disk.

[Page Replacement Process]

1. Locate page on disk
2. Find a free frame
   - If free frame exists, use it
   - If not, select victim page via replacement algorithm
     - If victim was modified (dirty bit), write to disk
     - If unmodified, just overwrite
3. Load new page into free frame
4. Update page table
5. Restart process

1. FIFO Page Replacement

The page that arrived first is replaced first.

[FIFO Example] 3 frames, reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

Ref   7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
     [7][ ][ ] -> fault
     [7][0][ ] -> fault
     [7][0][1] -> fault
     [2][0][1] -> fault (7 replaced)
     [2][0][1] -> hit
     [2][3][1] -> fault (0 replaced)
     [2][3][0] -> fault (1 replaced)
     [4][3][0] -> fault (2 replaced)
     [4][2][0] -> fault (3 replaced)
     [4][2][3] -> fault (0 replaced)
     [0][2][3] -> fault (4 replaced)
     [0][2][3] -> hit
     [0][2][3] -> hit
     [0][1][3] -> fault (2 replaced)
     [0][1][2] -> fault (3 replaced)
     [0][1][2] -> hit
     [0][1][2] -> hit
     [7][1][2] -> fault (0 replaced)
     [7][0][2] -> fault (1 replaced)
     [7][0][1] -> fault (2 replaced)

Total page faults: 15

Belady's Anomaly: With FIFO, increasing the number of frames can actually increase page faults.

2. Optimal Page Replacement (OPT)

Replaces the page that will not be used for the longest time in the future. Theoretically optimal but impossible to implement since the future is unknown. Used as a benchmark for other algorithms.

[OPT Example] 3 frames, reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

Ref   7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
     [7][ ][ ] -> fault
     [7][0][ ] -> fault
     [7][0][1] -> fault
     [2][0][1] -> fault (7: used latest in future)
     [2][0][1] -> hit
     [2][0][3] -> fault (1: used latest in future)
     [2][0][3] -> hit
     [2][4][3] -> fault (0 used soon, replace other)
     [2][4][3] -> hit
     [2][4][3] -> hit
     [2][0][3] -> fault
     [2][0][3] -> hit
     [2][0][3] -> hit
     [2][0][1] -> fault
     [2][0][1] -> hit
     [2][0][1] -> hit
     [2][0][1] -> hit
     [7][0][1] -> fault
     [7][0][1] -> hit
     [7][0][1] -> hit

Total page faults: 9 (optimal)

3. LRU Page Replacement (Least Recently Used)

Replaces the page that has not been used for the longest time. An approximation of OPT, using past information as an approximation of the future.

[LRU Example] 3 frames, reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1

Ref   7  0  1  2  0  3  0  4  2  3  0  3  2  1  2  0  1  7  0  1
     [7][ ][ ] -> fault
     [7][0][ ] -> fault
     [7][0][1] -> fault
     [2][0][1] -> fault (7: least recently used)
     [2][0][1] -> hit
     [2][0][3] -> fault (1: least recently used)
     [2][0][3] -> hit (0 recently used)
     [4][0][3] -> fault (2: least recently used)
     [4][0][2] -> fault (3: least recently used)
     [4][3][2] -> fault (0: least recently used)
     [0][3][2] -> fault (4: least recently used)
     [0][3][2] -> hit
     [0][3][2] -> hit
     [1][3][2] -> fault (0: least recently used)
     [1][3][2] -> hit
     [1][0][2] -> fault (3: least recently used)
     [1][0][2] -> hit
     [1][0][7] -> fault (2: least recently used)
     [1][0][7] -> hit
     [1][0][7] -> hit

Total page faults: 12

LRU implementation methods.

Counter method: Record access time in each page table entry. Select the oldest one for replacement
Stack method: Move page to top of stack on access. Bottom of stack is the LRU page

Both methods require hardware support, and pure LRU implementation is costly.

4. Clock Algorithm (Second-Chance / Clock)

An approximation of LRU that uses a reference bit.

[Clock Algorithm]

Frames form a circular queue
Each frame has a reference bit (ref)

         Pointer
           |
           v
    [A, ref=1] -> [B, ref=0] -> [C, ref=1] -> [D, ref=0]
         ^                                          |
         |                                          |
         +------------------------------------------+

When replacement is needed:
1. Check frame pointed to
2. If ref=0: Replace this page
3. If ref=1: Reset ref to 0, move pointer to next
4. Repeat 2-3

Operation:
Pointer -> [A, ref=1]: Change ref to 0, move next
Pointer -> [B, ref=0]: Replace this page!

Reference bit is automatically set to 1 by hardware on page access

// Clock algorithm implementation (conceptual)
#define NUM_FRAMES 64

struct frame {
    int page_number;
    int reference_bit;
    int dirty_bit;
};

struct frame frames[NUM_FRAMES];
int clock_hand = 0;

int clock_replace() {
    while (1) {
        if (frames[clock_hand].reference_bit == 0) {
            // Found replacement target
            int victim = clock_hand;
            clock_hand = (clock_hand + 1) % NUM_FRAMES;
            return victim;
        }

        // Give second chance: reset reference bit
        frames[clock_hand].reference_bit = 0;
        clock_hand = (clock_hand + 1) % NUM_FRAMES;
    }
}

Algorithm Comparison

[Page Replacement Algorithm Comparison]

Algorithm    Perf      Belady     Impl Cost   Notes
--------    ------    --------   ---------   -----
FIFO        Poor      O          Low         Simplest
OPT         Optimal   X          Impossible  Theoretical benchmark
LRU         Good      X          High        Hardware support needed
Clock       Good      X          Medium      LRU approximation, practical

Frame Allocation

When multiple processes are running, the system must decide how many frames to allocate to each process.

Allocation Strategies

[Frame Allocation Methods]

1. Equal Allocation: Same number of frames for all processes
   100 frames, 5 processes -> 20 frames each

2. Proportional Allocation: Allocate proportional to process size
   Total frames = 62
   P1 size = 10 pages -> 10/137 * 62 = 4.5 -> 5 frames
   P2 size = 127 pages -> 127/137 * 62 = 57.5 -> 57 frames

Global vs Local Replacement

Global Replacement: Select victim from all frames. Higher throughput but unpredictable
Local Replacement: Replace only within the process's own frames. Predictable but potentially inefficient

Thrashing

A phenomenon where a process spends more time on page replacement than actual execution.

[How Thrashing Occurs]

1. Process has insufficient frames
2. Page faults occur frequently
3. CPU utilization drops (I/O waiting)
4. OS judges "CPU is idle"
5. Increases degree of multiprogramming (adds processes)
6. Existing processes get even fewer frames
7. More page faults -> Vicious cycle!

CPU Utilization
  ^
  |        /\
  |       /  \
  |      /    \
  |     /      \------> Thrashing!
  |    /
  |   /
  +--+---+---+---+----> Degree of Multiprogramming
     1   2   3   4

Working-Set Model

The set of pages actively used by a process at a specific point in time is called the working set.

[Working Set Concept]

Reference string: ... 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 ...
                                                  |<-- window size (delta) -->|

Working set at time t1 (delta=10):
WS(t1) = pages 1, 2, 3, 4 (unique pages within window)

Working set size = number of actively used pages

[Working Set Based Frame Allocation]

Total needed frames D = sum of all processes' working set sizes

if D > available frames:
    Thrashing possible
    -> Suspend some processes (swap out)

if D <= available frames:
    Allocate frames equal to working set size for each process
    -> Prevent thrashing

Page-Fault Frequency (PFF)

A more direct method than working sets that directly monitors the page fault rate.

[PFF-based Frame Adjustment]

Page Fault Rate
  ^
  |
  |---- Upper bound ----  If rate above this: Allocate more frames
  |
  |
  |---- Lower bound ----  If rate below this: Reclaim frames
  |
  +--+---+---+---+----> Allocated Frames

Above upper bound: Allocate more frames to the process
Below lower bound: Reclaim frames from the process
No more frames to allocate: Suspend the process

Memory-Mapped Files

A technique that handles file I/O as memory access. Maps part or all of a file to a process's virtual address space.

#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main() {
    // Open file
    int fd = open("data.txt", O_RDWR);
    struct stat sb;
    fstat(fd, &sb);

    // Map file to memory
    char *mapped = mmap(NULL, sb.st_size,
                        PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);

    if (mapped == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }

    // Access file contents like memory
    printf("File contents: %.50s\n", mapped);

    // Writing to memory also changes the file (MAP_SHARED)
    memcpy(mapped, "Modified content", 16);

    // Sync changes to disk
    msync(mapped, sb.st_size, MS_SYNC);

    // Unmap
    munmap(mapped, sb.st_size);
    close(fd);

    return 0;
}

[Memory-Mapped File Operation]

Process Virtual Address Space:     Physical Memory:       Disk:
+------------------+              +-----------+         +-------+
|                  |              |           |         |       |
| mmap region -----|------------->| File pages|<------->| File  |
|                  |              |           |         |       |
+------------------+              +-----------+         +-------+

1. mmap() maps file to address space
2. Page fault on access -> Load from file
3. Memory write marks page as dirty
4. Periodically or on munmap, write to disk

Advantages:
- No read/write system call overhead
- Multiple processes can share the same file
- Automatically utilizes page cache

mmap as Shared Memory Between Processes

// Shared memory between processes (MAP_SHARED + MAP_ANONYMOUS)
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/wait.h>

int main() {
    // Create anonymous shared mapping
    int *shared = mmap(NULL, sizeof(int),
                       PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS,
                       -1, 0);

    *shared = 0;

    pid_t pid = fork();

    if (pid == 0) {
        // Child process
        *shared = 42;
        printf("Child: shared = %d\n", *shared);
    } else {
        // Parent process
        wait(NULL);
        printf("Parent: shared = %d\n", *shared);
        // Output: Parent: shared = 42
        munmap(shared, sizeof(int));
    }

    return 0;
}

Summary

[Virtual Memory Key Concepts Summary]

+-------------------+--------------------+
| Demand Paging     | Load only when needed|
| COW               | Minimize copies at fork|
+-------------------+--------------------+
| FIFO              | Simple but low perf |
| OPT               | Theoretically optimal (unimplementable)|
| LRU               | Practical, good perf|
| Clock             | LRU approx, widely used|
+-------------------+--------------------+
| Thrashing         | Fault explosion from|
|                   | frame shortage      |
| Working Set       | Track active page   |
|                   | set to prevent thrashing|
+-------------------+--------------------+
| Memory-mapped File| Handle file I/O as  |
|                   | memory access       |
+-------------------+--------------------+

Virtual memory is a core technology that overcomes the limitations of physical memory and provides each process with an independent address space. Demand paging loads only necessary pages, LRU and clock algorithms perform efficient page replacement, and the working set model prevents thrashing. Memory-mapped files enable efficient file I/O and inter-process communication.