Skip to content

필사 모드: Digital Logic Design Complete Guide: From Boolean Algebra to CPU Design

English
0%
정확도 0%
💡 왼쪽 원문을 읽으면서 오른쪽에 따라 써보세요. Tab 키로 힌트를 받을 수 있습니다.
원문 렌더가 준비되기 전까지 텍스트 가이드로 표시합니다.

1. Introduction to Digital Systems

Analog vs Digital

Analog signals carry continuous values while digital signals use only discrete states — 0 and 1. Modern electronics overwhelmingly favors digital systems for several key reasons:

- **Noise immunity**: Only two levels to distinguish, making the system highly resistant to electrical noise.

- **Perfect reproduction**: Copying digital data introduces no signal degradation.

- **Logical processing**: Boolean algebra provides a rigorous mathematical foundation.

- **Storage**: Flip-flops and memory cells store state reliably and indefinitely.

Number System Conversions

Digital systems regularly use binary (base-2), octal (base-8), and hexadecimal (base-16) representations.

Decimal to Binary — repeated division by 2:

45 ÷ 2 = 22 remainder 1

22 ÷ 2 = 11 remainder 0

11 ÷ 2 = 5 remainder 1

5 ÷ 2 = 2 remainder 1

2 ÷ 2 = 1 remainder 0

1 ÷ 2 = 0 remainder 1

Result: 101101 (binary)

| Decimal | Binary | Octal | Hex |

| ------- | ------ | ----- | --- |

| 0 | 0000 | 0 | 0 |

| 5 | 0101 | 5 | 5 |

| 10 | 1010 | 12 | A |

| 15 | 1111 | 17 | F |

| 16 | 10000 | 20 | 10 |

Signed Number Representation

**Two's complement** is the universal standard for representing negative integers in hardware.

+5 → 0000 0101

Computing -5:

One's complement: 1111 1010

Add 1: 1111 1011 ← -5 in two's complement

Range for an 8-bit two's complement number: -128 to +127.

BCD and Gray Code

- **BCD (Binary Coded Decimal)**: Each decimal digit is encoded as 4 bits. Example: 29 = 0010 1001.

- **Gray code**: Adjacent codewords differ by exactly one bit, minimizing transition errors in mechanical encoders and analog-to-digital converters.

2. Boolean Algebra

Fundamental Operations

Three primitive operations form the foundation of Boolean algebra:

- **AND** (logical product): A · B — output is 1 only when both inputs are 1.

- **OR** (logical sum): A + B — output is 1 when at least one input is 1.

- **NOT** (complement): A' — inverts the input.

Boolean Laws

Commutative: A + B = B + A, A · B = B · A

Associative: A + (B+C) = (A+B) + C

Distributive: A · (B+C) = AB + AC

Identity: A + 0 = A, A · 1 = A

Complement: A + A' = 1, A · A' = 0

Idempotent: A + A = A, A · A = A

**De Morgan's Theorems** — essential for NAND/NOR-based implementations:

(A · B)' = A' + B'

(A + B)' = A' · B'

Canonical Forms

- **SOP (Sum of Products)**: Sum of minterms. Example: F = AB' + A'B + AB

- **POS (Product of Sums)**: Product of maxterms. Example: F = (A+B)(A'+C)

Any Boolean function can be expressed in either canonical form, making these forms the starting point for logic minimization.

3. Logic Gates

Truth Tables for All Basic Gates

AND Gate: OR Gate: NOT Gate:

A B A·B A B A+B A A'

0 0 0 0 0 0 0 1

0 1 0 0 1 1 1 0

1 0 0 1 0 1

1 1 1 1 1 1

NAND Gate: NOR Gate:

A B (A·B)' A B (A+B)'

0 0 1 0 0 1

0 1 1 0 1 0

1 0 1 1 0 0

1 1 0 1 1 0

XOR Gate: XNOR Gate:

A B A⊕B A B (A⊕B)'

0 0 0 0 0 1

0 1 1 0 1 0

1 0 1 1 0 0

1 1 0 1 1 1

Universal Gates (NAND and NOR)

NAND alone can implement any Boolean function:

NOT A = A NAND A

A AND B = (A NAND B) NAND (A NAND B)

A OR B = (A NAND A) NAND (B NAND B)

This universality is commercially significant — an entire chip family can be fabricated from a single cell type, simplifying the manufacturing process. NOR gates are equally universal.

CMOS Implementation Overview

Modern digital ICs use CMOS (Complementary Metal-Oxide-Semiconductor) technology:

- **NMOS transistor**: Turns ON when gate is HIGH (pull-down network).

- **PMOS transistor**: Turns ON when gate is LOW (pull-up network).

- CMOS inverter: series PMOS + NMOS between VDD and GND.

- Static power consumption is nearly zero, which explains CMOS dominance in low-power design.

4. Karnaugh Maps

The Karnaugh map (K-map) provides a visual, systematic method for minimizing Boolean expressions by identifying groups of adjacent 1s.

4-Variable K-map Example

CD

AB 00 01 11 10

00 [ 1 0 1 1 ]

01 [ 1 1 1 0 ]

11 [ 0 1 1 0 ]

10 [ 0 0 1 1 ]

Grouping Rules

1. Group sizes must be powers of 2: 1, 2, 4, 8, or 16.

2. Groups must be rectangular (including squares).

3. The map wraps around on all edges (toroidal topology).

4. Use the fewest, largest possible groups.

5. Don't-care conditions (X) may be included in groups to enlarge them.

Prime Implicants

A **prime implicant** is a group that cannot be further expanded. An **essential prime implicant** covers at least one minterm not covered by any other prime implicant — it must appear in the minimal SOP expression.

5. Combinational Logic Circuits

A combinational circuit's output depends solely on current inputs — there is no internal state or memory.

Half Adder and Full Adder

Half Adder:

Sum S = A ⊕ B

Carry C = A · B

Full Adder:

Sum S = A ⊕ B ⊕ Cin

Carry Cout = AB + Cin(A ⊕ B)

Cascading full adders creates a **Ripple Carry Adder** — simple but slow because carry propagates serially. The **Carry Lookahead Adder (CLA)** computes carries in parallel for high-speed arithmetic.

Multiplexer (MUX)

A 2-to-n-input MUX selects one of 2^n data inputs using n select lines.

4-to-1 MUX (select lines S1, S0):

S1=0, S0=0 → Y = I0

S1=0, S0=1 → Y = I1

S1=1, S0=0 → Y = I2

S1=1, S0=1 → Y = I3

A 2^n-to-1 MUX can implement any n-variable Boolean function directly.

Encoders and Decoders

- **Encoder**: Converts 2^n input lines to an n-bit binary code (e.g., priority encoder handles multiple simultaneous inputs).

- **Decoder**: Converts an n-bit code to up to 2^n output lines (e.g., 3-to-8 decoder asserts one of eight outputs).

A decoder combined with OR gates directly implements any sum-of-minterms (SOP) expression.

6. Sequential Logic Circuits

Sequential circuits depend on both current inputs and internal state (memory). They are the building blocks of registers, counters, and processors.

Latches

The **SR latch** is the simplest storage element, built from cross-coupled NAND or NOR gates.

SR Latch truth table:

S R Q(next)

0 0 Q (no change)

1 0 1 (Set)

0 1 0 (Reset)

1 1 Undefined (forbidden)

The **D latch** eliminates the forbidden state by tying S = D and R = D'. When Enable is HIGH, Q tracks D; when Enable is LOW, Q retains its value.

Flip-Flops

Flip-flops are **edge-triggered** — state changes only on the rising or falling clock edge, not during the entire clock period. This strict timing discipline is essential for synchronous design.

D flip-flop: Q(next) = D

JK flip-flop:

J=0, K=0 → Q (hold)

J=1, K=0 → Q = 1 (Set)

J=0, K=1 → Q = 0 (Reset)

J=1, K=1 → Q' (Toggle)

T flip-flop:

T=0 → Q (hold)

T=1 → Q' (Toggle)

The JK flip-flop has no undefined input combination. The T flip-flop is the natural choice for counter design.

7. Registers and Counters

Shift Registers

Shift registers chain multiple flip-flops so data moves (shifts) through them with each clock cycle.

- **SISO** (Serial In, Serial Out): Data enters and leaves serially.

- **SIPO** (Serial In, Parallel Out): Serial data is loaded, then all bits are read simultaneously.

- **PISO** (Parallel In, Serial Out): Parallel data is loaded, then transmitted bit by bit.

- **PIPO** (Parallel In, Parallel Out): Complete parallel load and parallel read.

Applications: serial-to-parallel conversion, CRC generation, pseudo-random number generators (LFSR), and fast multiply/divide by powers of 2.

Counters

3-bit Asynchronous (Ripple) Counter:

Three T flip-flops chained in series.

0 → 1 → 2 → 3 → 4 → 5 → 6 → 7 → 0 ...

Drawback: ripple delay accumulates with each stage.

3-bit Synchronous Counter:

All flip-flops share a common clock.

All outputs change simultaneously → no ripple delay.

A **Modulo-N counter** cycles through 0 to N-1. Example: a BCD counter (modulo-10) cycles 0 through 9 and is widely used in digital clocks.

8. Finite State Machines (FSM)

An FSM is the abstract model underlying all sequential logic systems. It consists of a finite set of states, transition conditions, and outputs.

Moore vs Mealy Machines

- **Moore machine**: Output depends only on the current state. Output labels appear inside state bubbles.

- **Mealy machine**: Output depends on current state AND inputs. Output labels appear on transitions.

Mealy machines generally require fewer states than equivalent Moore machines.

Vending Machine FSM Example

States: S0 (0 cents), S1 (10 cents), S2 (20 cents), S3 (dispense)

Inputs: D (dime = 10c), Q (quarter = 25c)

Transitions:

S0 --D--> S1

S0 --Q--> S2 (with 5c change output)

S1 --D--> S2

S1 --Q--> S3 (dispense, 5c change)

S2 --D--> S3 (dispense)

S2 --Q--> S3 (dispense, 15c change)

FSM Design Procedure

1. Analyze the problem and define all states.

2. Draw the state diagram.

3. Construct the state table (next-state and output).

4. Assign binary codes to states (state encoding).

5. Derive next-state and output Boolean equations.

6. Minimize with Karnaugh maps.

7. Implement with flip-flops and combinational gates.

9. Verilog HDL Fundamentals

Verilog HDL (Hardware Description Language) allows designers to describe digital circuits textually, enabling simulation and automated synthesis to physical gates.

Module Structure

module module_name (

input wire clk,

input wire reset,

input wire [3:0] data_in,

output reg [3:0] data_out

);

wire [3:0] temp;

// Combinational logic — assign

assign temp = data_in & 4'b1111;

// Sequential logic — always

always @(posedge clk or posedge reset) begin

if (reset)

data_out <= 4'b0000;

else

data_out <= temp;

end

endmodule

D Flip-Flop

module d_ff (

input wire clk,

input wire reset,

input wire d,

output reg q

);

always @(posedge clk or posedge reset) begin

if (reset)

q <= 1'b0;

else

q <= d;

end

endmodule

4-bit Synchronous Counter

module counter_4bit (

input wire clk,

input wire reset,

output reg [3:0] count

);

always @(posedge clk or posedge reset) begin

if (reset)

count <= 4'b0000;

else

count <= count + 1'b1;

end

endmodule

Testbench

module tb_counter;

reg clk, reset;

wire [3:0] count;

counter_4bit uut (

.clk(clk),

.reset(reset),

.count(count)

);

// 10 ns clock period

initial clk = 0;

always #5 clk = ~clk;

initial begin

reset = 1;

#20 reset = 0;

#200 $finish;

end

initial begin

$monitor("t=%0t reset=%b count=%b", $time, reset, count);

end

endmodule

Verilog Data Types Summary

| Type | Description | Typical Use |

| --------- | ------------------------------------- | ----------------------------------- |

| wire | Combinational net, holds driven value | assign statements, port connections |

| reg | Can store a value between updates | Inside always blocks |

| integer | 32-bit signed integer | Loop variables, simulation |

| parameter | Named constant | Parameterizing module widths |

10. FPGA Fundamentals

FPGA vs ASIC vs Microprocessor

| Attribute | FPGA | ASIC | MCU / CPU |

| ----------------------- | ---------------------------- | --------------- | --------------- |

| Reconfigurable | Yes | No | Via software |

| Peak performance | Medium–High | Highest | General purpose |

| NRE cost | None | Very high | Low |

| Unit cost (high volume) | High | Very low | Low |

| Parallelism | Excellent | Excellent | Limited |

| Best for | Prototyping, moderate volume | Mass production | Control tasks |

FPGA Internal Architecture

- **CLB (Configurable Logic Block)**: Contains LUTs, flip-flops, and MUXes.

- **LUT (Look-Up Table)**: An n-input LUT implements any n-variable Boolean function by storing the truth table directly in SRAM.

- **IOB (I/O Block)**: Interface between the FPGA fabric and external pins; supports multiple voltage standards and I/O protocols.

- **Block RAM (BRAM)**: Dedicated on-chip memory blocks for storing data arrays.

- **DSP Blocks**: Hardened fast multiply-accumulate units optimized for signal processing.

- **PLL / Clock Management**: On-chip phase-locked loops for frequency synthesis and clock distribution.

FPGA Design Flow

1. Design Entry — write HDL (Verilog/VHDL) or use a schematic editor

2. Synthesis — convert HDL to a technology-independent gate-level netlist

3. Implementation

├── Technology Mapping — map netlist to FPGA primitives (LUTs, FFs)

├── Placement (Place) — assign primitives to physical FPGA resources

└── Routing (Route) — connect placed elements with on-chip wires

4. Timing Analysis — verify all setup/hold constraints are met

5. Bitstream Generation — produce the programming file

6. Device Programming — download bitstream to FPGA via JTAG

Leading FPGA Vendors

- **AMD-Xilinx**: Spartan (cost-optimized), Artix, Kintex, Virtex (high-performance), Zynq (ARM + FPGA SoC).

- **Intel (Altera)**: Cyclone (low cost), Arria (mid-range), Stratix (high-end).

- **Lattice Semiconductor**: iCE40 (ultra-low power, open-source toolchain available).

11. Basic CPU Design

ALU (Arithmetic Logic Unit)

The ALU is the computational heart of any processor.

ALU Inputs:

A, B — n-bit operands

OpCode — selects the operation

ALU Outputs:

Result — n-bit result

Zero — asserted when Result = 0

Carry — asserted on unsigned overflow

Overflow — asserted on signed overflow

Negative — asserted when Result < 0

Example operations:

000 → A + B (addition)

001 → A - B (subtraction)

010 → A AND B

011 → A OR B

100 → A XOR B

101 → NOT A

110 → A << 1 (logical left shift)

111 → A >> 1 (logical right shift)

Register File

A register file provides fast, on-chip storage for operands and results. A typical 32-register file has two read ports and one write port.

module register_file (

input wire clk,

input wire we,

input wire [4:0] rs1, rs2, rd,

input wire [31:0] write_data,

output wire [31:0] read_data1,

output wire [31:0] read_data2

);

reg [31:0] regs [0:31];

assign read_data1 = regs[rs1];

assign read_data2 = regs[rs2];

always @(posedge clk) begin

if (we && rd != 0)

regs[rd] <= write_data;

end

endmodule

Simple CPU Datapath

IF — Instruction Fetch: read instruction from memory at PC

ID — Instruction Decode: read registers, generate control signals

EX — Execute: ALU performs computation

MEM — Memory Access: load/store data memory

WB — Write Back: write result to register file

Pipelining

A 5-stage pipeline overlaps execution of multiple instructions to increase throughput:

Cycle: 1 2 3 4 5 6 7 8 9

Instr 1: IF ID EX ME WB

Instr 2: IF ID EX ME WB

Instr 3: IF ID EX ME WB

Instr 4: IF ID EX ME WB

Instr 5: IF ID EX ME WB

**Pipeline hazards** disrupt ideal throughput:

- **Structural**: Two instructions need the same hardware resource simultaneously.

- **Data (RAW/WAR/WAW)**: An instruction reads a value not yet written by a previous instruction. Resolved with forwarding, stalls, or compiler scheduling.

- **Control**: Branch outcome unknown until late in the pipeline. Resolved with branch prediction, delayed branching, or speculative execution.

12. Connecting to AI Hardware

GPU Digital Logic Architecture

A modern GPU contains thousands of small ALU cores (CUDA cores, SIMD lanes) operating in parallel. This massive parallelism maps perfectly onto the matrix multiplications that dominate deep learning workloads.

- **NVIDIA Tensor Cores**: Perform mixed-precision (FP16 × FP16 accumulate to FP32) matrix operations in a single clock cycle.

- **SIMT execution model**: Thousands of threads execute the same instruction on different data simultaneously.

TPU and NPU

- **TPU (Tensor Processing Unit)**: Google's custom ASIC built around a systolic array of MAC (Multiply-Accumulate) units. Designed exclusively for neural network inference and training.

- **NPU (Neural Processing Unit)**: Edge-device AI accelerators. Examples: Apple Neural Engine, Qualcomm Hexagon DSP, Samsung Exynos NPU.

FPGA-Based AI Acceleration

FPGAs occupy a useful niche in AI acceleration:

- **Low latency**: Suitable for real-time inference (autonomous driving, industrial vision).

- **Flexibility**: Reconfigure the bitstream when the model changes without replacing hardware.

- **Custom precision**: Implement 4-bit or 8-bit arithmetic to balance accuracy and resource usage.

- Real-world example: Microsoft Project Brainwave deploys FPGAs in Azure data centers for DNN inference.

Neuromorphic Computing

Neuromorphic chips model the brain's neuron–synapse architecture directly in silicon.

- **Intel Loihi 2**: Implements Spiking Neural Networks (SNNs) with on-chip learning.

- **IBM TrueNorth**: 1 million neurons and 256 million synapses on a single chip with milliwatt power consumption.

- Event-driven, asynchronous computation enables ultra-low power operation — a compelling advantage for always-on edge AI.

13. Learning Roadmap

[Foundations]

Boolean Algebra → Logic Gates → Karnaugh Maps

[Combinational Circuits]

Adders, MUX, Decoder, Encoder

[Sequential Circuits]

Latches → Flip-Flops → Registers → Counters

[System-Level Design]

FSM → Datapath → Control Unit

[HDL and FPGA]

Verilog → Simulation → FPGA Implementation

[Processor Architecture]

ALU → Pipelining → Memory Hierarchy

[AI Hardware]

GPU Architecture → TPU/NPU → FPGA Acceleration

Quiz

**Answer**: Because NAND alone can implement NOT, AND, and OR, making it possible to build any Boolean function from NAND gates only.

**Explanation**:

- NOT A = A NAND A

- A AND B = (A NAND B) NAND (A NAND B)

- A OR B = (A NAND A) NAND (B NAND B)

Since NOT, AND, and OR form a functionally complete set, any logic circuit can be realized with NAND gates exclusively. NOR gates share the same property.

**Answer**: -128 (binary: 10000000). The range -128 to +127 is asymmetric because zero is treated as positive, consuming one code from the positive half.

**Explanation**: With 8 bits there are 256 possible patterns. One pattern (00000000) represents zero. The two's complement of 10000000 would require 9 bits to represent as +128, so 10000000 maps exclusively to -128. This asymmetry is an inherent property of two's complement representation.

**Answer**: A D latch is level-triggered — it passes D to Q continuously while Enable is HIGH. A D flip-flop is edge-triggered — it samples D only at the clock edge and holds the value until the next edge.

**Explanation**: Level-triggered behavior makes latches difficult to use in synchronous designs because the output can change multiple times while the enable is asserted. Edge-triggered flip-flops provide strict single-update-per-cycle timing, which is the foundation of clocked synchronous design methodology.

**Answer**: A Don't-care (X) is an input combination that either cannot occur or whose output value is irrelevant to the system. The designer may assign it 0 or 1 to maximize grouping and minimize the final expression.

**Explanation**: For example, in a circuit that processes BCD digits (0–9), input combinations 1010 through 1111 (10–15) never appear during normal operation. Marking these six minterms as Don't-cares allows larger groups in the K-map, reducing the number of gates required.

**Answer**: A data hazard occurs when an instruction needs a value that has not yet been written back by an earlier instruction still in the pipeline (a Read-After-Write dependency).

**Resolution techniques**:

- **Forwarding (Bypassing)**: Wire the ALU output directly to the next instruction's ALU input, skipping the register file write-back. Resolves most RAW hazards without stalling.

- **Stalling (Pipeline bubble)**: Insert no-operation cycles until the dependent value is available. Simple but reduces throughput.

- **Compiler scheduling**: Reorder independent instructions to fill the slots between a producer and its consumer, avoiding stalls without hardware forwarding.

References

- **Morris M. Mano & Michael D. Ciletti**, _Digital Design: With an Introduction to the Verilog HDL, VHDL, and SystemVerilog_, Pearson

- **Thomas L. Floyd**, _Digital Fundamentals_, Pearson

- **Frank Vahid**, _Digital Design_, Wiley

- [AMD-Xilinx FPGA Design Resources](https://www.xilinx.com/support/documentation.html)

- [Intel FPGA (Altera) Documentation](https://www.intel.com/content/www/us/en/programmable/documentation/lit-index.html)

- [HDLBits — Verilog Practice Problems](https://hdlbits.01xz.net/)

- [nandland — FPGA/VHDL/Verilog Tutorials](https://www.nandland.com/)

현재 단락 (1/366)

Analog signals carry continuous values while digital signals use only discrete states — 0 and 1. Mod...

작성 글자: 0원문 글자: 16,857작성 단락: 0/366