Split View: [운영체제] 09. 메인 메모리 관리

[운영체제] 09. 메인 메모리 관리

배경

기본 개념

CPU가 직접 접근할 수 있는 저장소는 레지스터와 메인 메모리뿐이다. 디스크의 데이터는 반드시 메모리에 올라와야 CPU가 처리할 수 있다.

메모리 접근에는 시간이 걸리므로, CPU와 메모리 사이에 캐시를 두어 속도 차이를 완화한다.

CPU <-> 레지스터 (~1ns) <-> 캐시 (~10ns) <-> 메인 메모리 (~100ns)

주소 바인딩 (Address Binding)

프로그램에서 사용하는 주소를 실제 메모리 주소로 변환하는 과정이다. 바인딩 시점에 따라 세 가지로 나뉜다.

[주소 바인딩 시점]

1. 컴파일 시간 바인딩
   - 프로세스가 메모리의 어디에 올라갈지 컴파일 시 결정
   - 위치가 바뀌면 재컴파일 필요
   - 예: MS-DOS의 .COM 프로그램

2. 로드 시간 바인딩
   - 프로그램을 메모리에 올릴 때 주소 결정
   - 재배치 가능 코드(relocatable code) 사용
   - 로드 후 주소 변경 불가

3. 실행 시간 바인딩 (현대 OS)
   - 실행 중에 주소를 동적으로 변환
   - MMU(Memory Management Unit) 하드웨어 필요
   - 프로세스를 메모리 내에서 이동 가능

논리적 주소와 물리적 주소

논리적 주소(Logical Address): CPU가 생성하는 주소. 가상 주소라고도 함
물리적 주소(Physical Address): 실제 메모리 하드웨어의 주소

[MMU를 통한 주소 변환]

CPU --[논리적 주소: 346]--> MMU --[물리적 주소: 14346]--> 메모리
                            |
                      재배치 레지스터
                      (기준값: 14000)

물리적 주소 = 논리적 주소 + 재배치 레지스터 값
346 + 14000 = 14346

사용자 프로그램은 논리적 주소만 다루며, 물리적 주소를 직접 볼 수 없다.

동적 로딩과 동적 링킹

동적 로딩: 루틴이 호출될 때만 메모리에 적재한다. 사용하지 않는 루틴은 메모리를 차지하지 않으므로 메모리 이용률이 향상된다.

동적 링킹: 라이브러리를 실행 시간에 링크한다.

[정적 링킹 vs 동적 링킹]

정적 링킹:
  프로그램 A: [코드 + libc 복사본]   -- 50MB
  프로그램 B: [코드 + libc 복사본]   -- 50MB
  총 메모리: 100MB

동적 링킹 (공유 라이브러리):
  프로그램 A: [코드 + libc 참조]     -- 10MB
  프로그램 B: [코드 + libc 참조]     -- 10MB
  libc.so:   [공유 라이브러리]       -- 40MB
  총 메모리: 60MB

Linux: .so 파일 (Shared Object)
Windows: .dll 파일 (Dynamic-Link Library)

연속 메모리 할당

가장 단순한 메모리 관리 방식으로, 각 프로세스를 연속된 메모리 영역에 배치한다.

메모리 보호

[기준 레지스터와 한계 레지스터]

     기준(base)              한계(limit)
        |                       |
        v                       v
+-------+=======================+-------+
| OS    | 프로세스 P의 영역      | 기타  |
+-------+=======================+-------+
  0     300040                 420940

CPU가 생성한 주소 addr에 대해:
  if (addr >= base && addr < base + limit)
      접근 허용 -> 물리 메모리 접근
  else
      트랩 발생 -> OS가 오류 처리

메모리 할당 전략

빈 공간(hole) 목록에서 프로세스에게 메모리를 할당하는 방법이다.

[메모리 할당 예시]

초기 상태:
|---OS---|--P1--|-------빈공간-------|--P3--|---빈공간---|

할당 전략:
1. 최초 적합(First Fit): 첫 번째로 충분한 빈 공간에 할당
   장점: 빠름

2. 최적 적합(Best Fit): 가장 작은 충분한 빈 공간에 할당
   장점: 작은 남은 공간 생성
   단점: 전체 탐색 필요, 매우 작은 조각 생성

3. 최악 적합(Worst Fit): 가장 큰 빈 공간에 할당
   장점: 남은 공간이 커서 재활용 가능
   단점: 전체 탐색 필요

단편화 (Fragmentation)

[외부 단편화 vs 내부 단편화]

외부 단편화:
|P1|  빈  |P2| 빈 |P3|  빈  |P4|
    100KB     50KB     80KB

총 빈 공간: 230KB인데, 200KB 프로세스를 올릴 수 없음!
(연속 공간이 아니므로)

해결: 압축(Compaction) - 프로세스를 한쪽으로 이동
|P1|P2|P3|P4|-----230KB 빈 공간-----|

내부 단편화:
메모리를 고정 크기 블록으로 할당할 때
프로세스가 블록보다 작으면 내부에 남는 공간 발생

프로세스 크기: 18,462 바이트
할당 블록 크기: 20,000 바이트
내부 단편화: 1,538 바이트 낭비

페이징 (Paging)

페이징은 논리적 주소 공간을 비연속적으로 할당하여 외부 단편화를 완전히 제거하는 기법이다.

기본 개념

페이지(Page): 논리적 메모리를 고정 크기 블록으로 나눈 것
프레임(Frame): 물리적 메모리를 같은 크기의 블록으로 나눈 것
페이지 테이블(Page Table): 페이지를 프레임에 매핑하는 테이블

[페이징 주소 변환]

논리적 주소 = 페이지 번호(p) + 페이지 오프셋(d)

예: 페이지 크기 4KB (2^12), 논리 주소 32비트
  상위 20비트 = 페이지 번호 (최대 2^20 = 1M 페이지)
  하위 12비트 = 오프셋 (0 ~ 4095)

+--------+--------+
| p (20) | d (12) |       논리적 주소
+--------+--------+
     |
     v
[페이지 테이블]
 p -> f (프레임 번호)
     |
     v
+--------+--------+
| f (20) | d (12) |       물리적 주소
+--------+--------+

[페이징 예시]

논리 메모리 (4페이지):        물리 메모리 (8프레임):
+------+                     +------+
|페이지0| ----+              |      | 프레임0
+------+     |              +------+
|페이지1| --+ |              |페이지2| 프레임1
+------+   | |              +------+
|페이지2| -+| |              |      | 프레임2
+------+  || |              +------+
|페이지3|  || +------------>|페이지0| 프레임3
+------+  ||                +------+
          |+--------------->|페이지1| 프레임4
          |                 +------+
          +---------------->|페이지2| 프레임5 (X)
                            +------+
                            |페이지3| 프레임6
                            +------+
                            |      | 프레임7
                            +------+

페이지 테이블:
  페이지0 -> 프레임3
  페이지1 -> 프레임4
  페이지2 -> 프레임1
  페이지3 -> 프레임6

TLB (Translation Lookaside Buffer)

매번 페이지 테이블을 메모리에서 조회하면 메모리 접근이 두 배로 늘어난다. TLB는 페이지 테이블의 캐시 역할을 한다.

[TLB를 통한 주소 변환]

CPU --논리 주소(p, d)--> TLB 검색
                          |
              +-----------+-----------+
              |                       |
          TLB 히트                 TLB 미스
          (빠름, ~1ns)             (느림)
              |                       |
          프레임 f 획득          페이지 테이블 조회
              |                  프레임 f 획득
              |                  TLB에 등록
              |                       |
              +----------+------------+
                         |
                  물리 주소 (f, d)로
                  메모리 접근

유효 접근 시간(EAT) 계산:
  TLB 히트율 = 99% (일반적)
  메모리 접근 시간 = 100ns
  TLB 접근 시간 = 10ns

  EAT = 0.99 * (10 + 100) + 0.01 * (10 + 100 + 100)
      = 0.99 * 110 + 0.01 * 210
      = 108.9 + 2.1
      = 111ns

  TLB 없이: 200ns (메모리 2번 접근)
  TLB 있으면: 111ns (44.5% 개선)

페이지 테이블 구조

프로세스의 주소 공간이 크면(예: 64비트) 페이지 테이블 자체가 매우 커진다.

계층적 페이지 테이블 (Multi-level Page Table)

[2단계 페이지 테이블]

32비트 주소, 4KB 페이지:
+--------+--------+--------+
| p1(10) | p2(10) | d(12)  |
+--------+--------+--------+

외부 페이지 테이블 (1단계)
+---+
| 0 |---> 2단계 테이블 A
+---+      +---+
| 1 |--+   | 0 |--> 프레임 번호
+---+  |   +---+
| 2 |  |   | 1 |--> 프레임 번호
+---+  |   +---+
       |   | ...|
       |   +---+
       |
       +-> 2단계 테이블 B
           +---+
           | 0 |--> 프레임 번호
           +---+
           | ...|
           +---+

장점: 사용하지 않는 영역의 2단계 테이블은 생성하지 않음
-> 메모리 절약

해시 페이지 테이블

[해시 페이지 테이블]

논리 페이지 번호 p를 해시 함수에 입력
-> 해시 테이블의 슬롯으로 매핑
-> 체인(연결 리스트)에서 p를 검색
-> 대응하는 프레임 번호 반환

64비트 주소 공간처럼 매우 큰 경우에 유용

역 페이지 테이블 (Inverted Page Table)

[역 페이지 테이블]

일반 페이지 테이블: 프로세스당 하나 (논리 -> 물리)
역 페이지 테이블:   시스템에 하나 (물리 프레임 -> 프로세스, 페이지)

물리 메모리의 각 프레임에 대해 하나의 항목:
프레임 0: (PID=5, 페이지=3)
프레임 1: (PID=2, 페이지=7)
프레임 2: (비어 있음)
프레임 3: (PID=5, 페이지=0)
...

장점: 물리 메모리 크기에 비례하는 테이블 크기
단점: 주소 변환 시 전체 테이블 검색 필요 (해시로 해결)

스와핑 (Swapping)

프로세스 전체 또는 일부를 디스크로 내보내 메모리를 확보하는 기법이다.

[표준 스와핑]

메인 메모리                    백킹 스토어(디스크)
+---------+                   +---------+
|   OS    |                   |         |
+---------+                   |  P2의   |
|   P1    |  <-- swap in ---  |  이미지  |
+---------+                   |         |
|   P3    |  --- swap out --> |         |
+---------+                   +---------+
|   빈공간  |
+---------+

현대 시스템에서는 프로세스 전체가 아닌
페이지 단위로 스와핑 (페이지 스와핑)

[페이지 단위 스와핑]

프로세스의 특정 페이지만 디스크로 이동:
- 오랫동안 사용하지 않은 페이지를 swap out
- 필요할 때 다시 swap in
- 가상 메모리의 기반 기술

메모리 보호

페이징 환경에서의 메모리 보호는 페이지 테이블에 보호 비트를 추가하여 구현한다.

[페이지 테이블 항목의 구조]

+-------+-----+-----+-----+-------+
| 프레임 | 유효 | 읽기 | 쓰기 | 실행  |
| 번호   | 비트 | 가능 | 가능 | 가능  |
+-------+-----+-----+-----+-------+

유효 비트(Valid bit):
  1 = 이 페이지가 프로세스의 논리 주소 공간에 속함
  0 = 유효하지 않은 페이지 (접근 시 트랩 발생)

보호 비트:
  읽기 전용 페이지에 쓰기 시도 -> 하드웨어 트랩
  코드 페이지에 대해 실행만 허용, 쓰기 불허

// 메모리 보호 예시: mprotect 시스템 콜
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

void segfault_handler(int sig) {
    printf("세그먼트 폴트 발생! 보호된 메모리 접근 시도\n");
    exit(1);
}

int main() {
    signal(SIGSEGV, segfault_handler);

    // 페이지 크기 단위로 메모리 할당
    size_t page_size = getpagesize();  // 보통 4096
    void *ptr = aligned_alloc(page_size, page_size);

    // 데이터 쓰기
    *(int *)ptr = 42;
    printf("값: %d\n", *(int *)ptr);

    // 메모리를 읽기 전용으로 보호
    mprotect(ptr, page_size, PROT_READ);

    // 쓰기 시도 -> 세그먼트 폴트!
    *(int *)ptr = 100;

    free(ptr);
    return 0;
}

정리

메인 메모리 관리는 프로세스에게 효율적이고 안전한 메모리 공간을 제공하기 위한 핵심 기능이다. MMU가 논리적 주소를 물리적 주소로 변환하고, 페이징은 외부 단편화를 제거하며, TLB가 주소 변환 성능을 보장한다. 계층적 페이지 테이블과 역 페이지 테이블은 대용량 주소 공간을 효율적으로 관리하고, 보호 비트를 통해 프로세스 간 메모리 격리를 보장한다.

[OS Concepts] 09. Main Memory Management

Background

Basic Concepts

The only storage directly accessible by the CPU is registers and main memory. Data on disk must be loaded into memory before the CPU can process it.

Memory access takes time, so a cache is placed between the CPU and memory to bridge the speed gap.

CPU <-> Registers (~1ns) <-> Cache (~10ns) <-> Main Memory (~100ns)

Address Binding

The process of translating addresses used by a program into actual memory addresses. There are three types based on binding time.

[Address Binding Timing]

1. Compile-time Binding
   - Memory location determined at compile time
   - Recompilation needed if location changes
   - Example: MS-DOS .COM programs

2. Load-time Binding
   - Address determined when loading program into memory
   - Uses relocatable code
   - Address cannot change after loading

3. Execution-time Binding (Modern OS)
   - Addresses dynamically translated during execution
   - Requires MMU (Memory Management Unit) hardware
   - Processes can be moved within memory

Logical and Physical Addresses

Logical Address: Address generated by the CPU. Also called virtual address
Physical Address: Actual address of memory hardware

[Address Translation via MMU]

CPU --[Logical addr: 346]--> MMU --[Physical addr: 14346]--> Memory
                              |
                        Relocation Register
                        (base value: 14000)

Physical address = Logical address + Relocation register value
346 + 14000 = 14346

User programs deal only with logical addresses and cannot see physical addresses directly.

Dynamic Loading and Dynamic Linking

Dynamic Loading: Loads routines into memory only when they are called. Unused routines do not occupy memory, improving memory utilization.

Dynamic Linking: Links libraries at runtime.

[Static Linking vs Dynamic Linking]

Static Linking:
  Program A: [code + libc copy]   -- 50MB
  Program B: [code + libc copy]   -- 50MB
  Total memory: 100MB

Dynamic Linking (Shared Libraries):
  Program A: [code + libc ref]    -- 10MB
  Program B: [code + libc ref]    -- 10MB
  libc.so:   [shared library]     -- 40MB
  Total memory: 60MB

Linux: .so files (Shared Object)
Windows: .dll files (Dynamic-Link Library)

Contiguous Memory Allocation

The simplest memory management approach, placing each process in a contiguous memory region.

Memory Protection

[Base Register and Limit Register]

     Base                   Limit
        |                       |
        v                       v
+-------+=======================+-------+
| OS    | Process P's region    | Other |
+-------+=======================+-------+
  0     300040                 420940

For address addr generated by CPU:
  if (addr >= base && addr < base + limit)
      Access allowed -> Physical memory access
  else
      Trap raised -> OS handles error

Memory Allocation Strategies

Methods for allocating memory to processes from the free space (hole) list.

[Memory Allocation Example]

Initial state:
|---OS---|--P1--|-------free-------|--P3--|---free---|

Allocation strategies:
1. First Fit: Allocate in the first sufficiently large hole
   Pros: Fast

2. Best Fit: Allocate in the smallest sufficient hole
   Pros: Creates small remaining space
   Cons: Full search needed, creates very small fragments

3. Worst Fit: Allocate in the largest hole
   Pros: Remaining space is large enough to reuse
   Cons: Full search needed

Fragmentation

[External vs Internal Fragmentation]

External Fragmentation:
|P1|  free  |P2| free |P3|  free  |P4|
    100KB     50KB     80KB

Total free space: 230KB, but cannot load a 200KB process!
(Not contiguous)

Solution: Compaction - Move processes to one end
|P1|P2|P3|P4|-----230KB free-----|

Internal Fragmentation:
When memory is allocated in fixed-size blocks
If process is smaller than block, space is wasted inside

Process size: 18,462 bytes
Block size: 20,000 bytes
Internal fragmentation: 1,538 bytes wasted

Paging

Paging is a technique that allocates logical address space non-contiguously, completely eliminating external fragmentation.

Basic Concepts

Page: Fixed-size blocks of logical memory
Frame: Same-size blocks of physical memory
Page Table: Table mapping pages to frames

[Paging Address Translation]

Logical address = Page number(p) + Page offset(d)

Example: Page size 4KB (2^12), 32-bit logical address
  Upper 20 bits = Page number (max 2^20 = 1M pages)
  Lower 12 bits = Offset (0 ~ 4095)

+--------+--------+
| p (20) | d (12) |       Logical address
+--------+--------+
     |
     v
[Page Table]
 p -> f (frame number)
     |
     v
+--------+--------+
| f (20) | d (12) |       Physical address
+--------+--------+

[Paging Example]

Logical Memory (4 pages):        Physical Memory (8 frames):
+------+                     +------+
|Page 0| ----+              |      | Frame 0
+------+     |              +------+
|Page 1| --+ |              |Page 2| Frame 1
+------+   | |              +------+
|Page 2| -+| |              |      | Frame 2
+------+  || |              +------+
|Page 3|  || +------------>|Page 0| Frame 3
+------+  ||                +------+
          |+--------------->|Page 1| Frame 4
          |                 +------+
          +---------------->|Page 2| Frame 5 (X)
                            +------+
                            |Page 3| Frame 6
                            +------+
                            |      | Frame 7
                            +------+

Page Table:
  Page 0 -> Frame 3
  Page 1 -> Frame 4
  Page 2 -> Frame 1
  Page 3 -> Frame 6

TLB (Translation Lookaside Buffer)

Looking up the page table in memory every time doubles memory accesses. The TLB serves as a cache for the page table.

[Address Translation via TLB]

CPU --logical addr(p, d)--> TLB lookup
                              |
              +---------------+---------------+
              |                               |
          TLB Hit                         TLB Miss
          (fast, ~1ns)                    (slow)
              |                               |
          Get frame f                  Page table lookup
              |                        Get frame f
              |                        Add to TLB
              |                               |
              +---------------+---------------+
                              |
                   Access memory with
                   physical addr (f, d)

Effective Access Time (EAT) calculation:
  TLB hit rate = 99% (typical)
  Memory access time = 100ns
  TLB access time = 10ns

  EAT = 0.99 * (10 + 100) + 0.01 * (10 + 100 + 100)
      = 0.99 * 110 + 0.01 * 210
      = 108.9 + 2.1
      = 111ns

  Without TLB: 200ns (2 memory accesses)
  With TLB: 111ns (44.5% improvement)

Page Table Structures

When a process's address space is large (e.g., 64-bit), the page table itself becomes very large.

Hierarchical Page Table (Multi-level Page Table)

[Two-level Page Table]

32-bit address, 4KB pages:
+--------+--------+--------+
| p1(10) | p2(10) | d(12)  |
+--------+--------+--------+

Outer Page Table (Level 1)
+---+
| 0 |---> Level 2 Table A
+---+      +---+
| 1 |--+   | 0 |--> Frame number
+---+  |   +---+
| 2 |  |   | 1 |--> Frame number
+---+  |   +---+
       |   | ...|
       |   +---+
       |
       +-> Level 2 Table B
           +---+
           | 0 |--> Frame number
           +---+
           | ...|
           +---+

Advantage: Level 2 tables for unused regions are not created
-> Memory savings

Hashed Page Table

[Hashed Page Table]

Input logical page number p to hash function
-> Maps to hash table slot
-> Search chain (linked list) for p
-> Return corresponding frame number

Useful for very large address spaces like 64-bit

Inverted Page Table

[Inverted Page Table]

Regular page table: One per process (logical -> physical)
Inverted page table: One for the system (physical frame -> process, page)

One entry for each physical memory frame:
Frame 0: (PID=5, Page=3)
Frame 1: (PID=2, Page=7)
Frame 2: (empty)
Frame 3: (PID=5, Page=0)
...

Advantage: Table size proportional to physical memory
Disadvantage: Full table search needed for translation (solved with hashing)

Swapping

A technique that moves all or part of a process to disk to free memory.

[Standard Swapping]

Main Memory                    Backing Store (Disk)
+---------+                   +---------+
|   OS    |                   |         |
+---------+                   |  P2's   |
|   P1    |  <-- swap in ---  |  image  |
+---------+                   |         |
|   P3    |  --- swap out --> |         |
+---------+                   +---------+
| free    |
+---------+

Modern systems swap at page granularity
rather than swapping entire processes

[Page-level Swapping]

Only specific pages of a process are moved to disk:
- Swap out pages not used for a long time
- Swap in again when needed
- Foundation technology for virtual memory

Memory Protection

Memory protection in a paging environment is implemented by adding protection bits to page table entries.

[Page Table Entry Structure]

+-------+-----+-----+-----+-------+
| Frame | Valid| Read| Write| Exec  |
| Number| Bit | OK  | OK   | OK    |
+-------+-----+-----+-----+-------+

Valid bit:
  1 = This page belongs to the process's logical address space
  0 = Invalid page (trap raised on access)

Protection bits:
  Write attempt on read-only page -> Hardware trap
  Code pages: Allow execute only, deny writes

// Memory protection example: mprotect system call
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

void segfault_handler(int sig) {
    printf("Segmentation fault! Attempted access to protected memory\n");
    exit(1);
}

int main() {
    signal(SIGSEGV, segfault_handler);

    // Allocate memory in page-size units
    size_t page_size = getpagesize();  // Usually 4096
    void *ptr = aligned_alloc(page_size, page_size);

    // Write data
    *(int *)ptr = 42;
    printf("Value: %d\n", *(int *)ptr);

    // Protect memory as read-only
    mprotect(ptr, page_size, PROT_READ);

    // Write attempt -> Segmentation fault!
    *(int *)ptr = 100;

    free(ptr);
    return 0;
}

Summary

Main memory management is a core function for providing efficient and safe memory space to processes. The MMU translates logical addresses to physical addresses, paging eliminates external fragmentation, and the TLB ensures address translation performance. Hierarchical page tables and inverted page tables efficiently manage large address spaces, and protection bits ensure memory isolation between processes.