Mastering Software Architecture & Design Patterns: From SOLID to Clean Architecture and AI System Design
- Author: Youngju Kim (@fjvbn20031)
- Introduction
- 1. SOLID Principles
- 2. GoF Design Patterns
- 3. Architecture Patterns
- 4. Microservices Patterns
- 5. Clean Code Principles
- 6. AI System Architecture
- 7. Testing Strategy
- Quiz
- Conclusion
Introduction
Software architecture is the skeleton of a system. Well-designed architecture is flexible to change, easy to test, and produces code that the entire team can understand. In the AI era, new design challenges have emerged: LLM services, RAG pipelines, and Agent systems. This guide walks through everything from classical design principles to modern AI system architecture.
1. SOLID Principles
SOLID is a set of five object-oriented design principles compiled by Robert C. Martin. They form the foundation for building maintainable and extensible software.
1.1 Single Responsibility Principle (SRP)
A class should have one, and only one, reason to change.
# BAD: multiple responsibilities in one class
class UserManager:
    def create_user(self, data): ...
    def send_welcome_email(self, user): ...
    def save_to_database(self, user): ...

# GOOD: each responsibility in its own class
class UserRepository:
    def save(self, user): ...

class EmailService:
    def send_welcome(self, user): ...

class UserFactory:
    def create(self, data): ...
1.2 Open/Closed Principle (OCP)
Software entities should be open for extension but closed for modification.
from abc import ABC, abstractmethod

class Discount(ABC):
    @abstractmethod
    def apply(self, price: float) -> float: ...

class NoDiscount(Discount):
    def apply(self, price: float) -> float:
        return price

class PercentDiscount(Discount):
    def __init__(self, percent: float):
        self.percent = percent

    def apply(self, price: float) -> float:
        return price * (1 - self.percent / 100)

class VIPDiscount(Discount):
    def apply(self, price: float) -> float:
        return price * 0.7

# New discount policies can be added without modifying existing code
class Order:
    def __init__(self, discount: Discount):
        self.discount = discount

    def final_price(self, base: float) -> float:
        return self.discount.apply(base)
1.3 Liskov Substitution Principle (LSP)
Subtypes must be substitutable for their base types without altering the correctness of the program.
class Bird:
    def fly(self) -> str:
        return "Flying"

# LSP violation: Penguin inherits Bird but cannot fly
class Penguin(Bird):
    def fly(self):
        raise NotImplementedError("Penguins cannot fly")

# GOOD: separate interfaces based on capabilities
class FlyingBird(ABC):
    @abstractmethod
    def fly(self) -> str: ...

class SwimmingBird(ABC):
    @abstractmethod
    def swim(self) -> str: ...

class Eagle(FlyingBird):
    def fly(self) -> str:
        return "Eagle soars through the sky"

class Penguin(SwimmingBird):
    def swim(self) -> str:
        return "Penguin swims gracefully"
1.4 Interface Segregation Principle (ISP)
Clients should not be forced to depend on interfaces they do not use.
# BAD: one bloated interface
class Machine(ABC):
    @abstractmethod
    def print(self): ...

    @abstractmethod
    def scan(self): ...

    @abstractmethod
    def fax(self): ...

# GOOD: split into smaller, focused interfaces
class Printable(ABC):
    @abstractmethod
    def print(self): ...

class Scannable(ABC):
    @abstractmethod
    def scan(self): ...

class MultiFunctionPrinter(Printable, Scannable):
    def print(self): print("Printing...")
    def scan(self): print("Scanning...")

class SimplePrinter(Printable):
    def print(self): print("Simple print...")
1.5 Dependency Inversion Principle (DIP)
High-level modules should not depend on low-level modules. Both should depend on abstractions.
# BAD: high-level depends directly on low-level
class MySQLDatabase:
    def query(self, sql: str): ...

class UserService:
    def __init__(self):
        self.db = MySQLDatabase()  # depends on a concrete class

# GOOD: depend on abstractions (dependency injection)
class DatabasePort(ABC):
    @abstractmethod
    def find_user(self, user_id: str) -> dict: ...

class MySQLAdapter(DatabasePort):
    def find_user(self, user_id: str) -> dict:
        # MySQL implementation
        return {}

class MongoAdapter(DatabasePort):
    def find_user(self, user_id: str) -> dict:
        # MongoDB implementation
        return {}

class UserService:
    def __init__(self, db: DatabasePort):
        self.db = db  # depends on the abstraction

# Usage
service = UserService(db=MySQLAdapter())
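Because the service only sees the port, a test can hand it an in-memory stand-in instead of a real database. The sketch below is one way that might look; InMemoryAdapter and the display_name method are hypothetical names added for illustration and are not part of the example above.

```python
from abc import ABC, abstractmethod


class DatabasePort(ABC):
    @abstractmethod
    def find_user(self, user_id: str) -> dict: ...


class InMemoryAdapter(DatabasePort):
    """Hypothetical test double: keeps users in a plain dict."""

    def __init__(self, users: dict[str, dict]):
        self._users = users

    def find_user(self, user_id: str) -> dict:
        return self._users.get(user_id, {})


class UserService:
    def __init__(self, db: DatabasePort):
        self.db = db

    def display_name(self, user_id: str) -> str:
        # Hypothetical method, added only to give the service some behavior.
        user = self.db.find_user(user_id)
        return user.get("name", "unknown")


service = UserService(db=InMemoryAdapter({"u1": {"name": "Ada"}}))
print(service.display_name("u1"))  # Ada
print(service.display_name("u2"))  # unknown
```

No patching or mocking library is needed here: swapping the adapter is enough, which is the practical payoff of depending on the abstraction.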
2. GoF Design Patterns
2.1 Factory Pattern
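Encapsulates object creation behind a single entry point, so callers depend on the LLMProvider abstraction rather than on concrete provider classes. The registry-based class below is a simple factory, a lightweight relative of the GoF Factory Method and Abstract Factory patterns.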
class LLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OpenAIProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"OpenAI response: {prompt}"

class AnthropicProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        return f"Anthropic response: {prompt}"

class LLMFactory:
    _registry = {
        "openai": OpenAIProvider,
        "anthropic": AnthropicProvider,
    }

    @classmethod
    def create(cls, provider: str) -> LLMProvider:
        klass = cls._registry.get(provider)
        if not klass:
            raise ValueError(f"Unknown provider: {provider}")
        return klass()
2.2 Singleton Pattern
Ensures a class has only one instance and provides a global access point to it.
import threading

class ConfigManager:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    # Finish initializing before publishing the instance, so another
                    # thread never sees a ConfigManager without _config
                    instance = super().__new__(cls)
                    instance._config = {}
                    cls._instance = instance
        return cls._instance

    def set(self, key: str, value):
        self._config[key] = value

    def get(self, key: str):
        return self._config.get(key)
2.3 Observer Pattern
Automatically notifies multiple subscribers when an object's state changes.
class EventBus:
    def __init__(self):
        self._subscribers: dict[str, list] = {}

    def subscribe(self, event: str, handler):
        self._subscribers.setdefault(event, []).append(handler)

    def publish(self, event: str, data=None):
        for handler in self._subscribers.get(event, []):
            handler(data)

# Usage
bus = EventBus()
bus.subscribe("user.created", lambda d: print(f"Send welcome email: {d}"))
bus.subscribe("user.created", lambda d: print(f"Log analytics event: {d}"))
bus.publish("user.created", {"id": "u1", "email": "user@example.com"})
2.4 Strategy Pattern
Encapsulates algorithms and makes them interchangeable at runtime.
class SortStrategy(ABC):
    @abstractmethod
    def sort(self, data: list) -> list: ...

class QuickSort(SortStrategy):
    def sort(self, data: list) -> list:
        # Placeholder: delegates to the built-in sorted() rather than implementing quicksort
        return sorted(data)

class MergeSort(SortStrategy):
    def sort(self, data: list) -> list:
        # Placeholder: also delegates to sorted(); a real strategy would differ here
        return sorted(data)

class DataProcessor:
    def __init__(self, strategy: SortStrategy):
        self._strategy = strategy

    def set_strategy(self, strategy: SortStrategy):
        self._strategy = strategy

    def process(self, data: list) -> list:
        return self._strategy.sort(data)
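In Python, a strategy does not have to be a class: since functions are first-class values, a plain callable can play the same role. The following is a minimal sketch of that variant; CallableDataProcessor, ascending, and descending are hypothetical names chosen for illustration.

```python
from typing import Callable

# A strategy is any function that takes a list and returns a sorted list.
SortFn = Callable[[list], list]


def ascending(data: list) -> list:
    return sorted(data)


def descending(data: list) -> list:
    return sorted(data, reverse=True)


class CallableDataProcessor:
    def __init__(self, strategy: SortFn):
        self._strategy = strategy

    def set_strategy(self, strategy: SortFn) -> None:
        # Swap the algorithm at runtime without touching process().
        self._strategy = strategy

    def process(self, data: list) -> list:
        return self._strategy(data)


processor = CallableDataProcessor(ascending)
first = processor.process([3, 1, 2])   # [1, 2, 3]
processor.set_strategy(descending)
second = processor.process([3, 1, 2])  # [3, 2, 1]
print(first, second)
```

The class-based form is still worth its extra lines when a strategy carries configuration or state; the callable form suits stateless algorithms.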
2.5 Decorator Pattern
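Adds new responsibilities to objects dynamically by wrapping them. The example below uses Python function decorators, which apply the same wrapping idea to callables rather than to objects sharing an interface.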
import time
import functools

def retry(max_attempts: int = 3):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the last error
                    time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
        return wrapper
    return decorator

def timed(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # monotonic clock, suited to measuring durations
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter() - start:.2f}s")
        return result
    return wrapper

# timed is applied first, so each retry attempt is timed individually
@retry(max_attempts=3)
@timed
def call_llm_api(prompt: str) -> str:
    return "response"
2.6 Command Pattern
Encapsulates requests as objects, enabling undo/redo and queuing.
class Command(ABC):
    @abstractmethod
    def execute(self): ...

    @abstractmethod
    def undo(self): ...

class CreatePostCommand(Command):
    def __init__(self, repo, post_data: dict):
        self.repo = repo
        self.post_data = post_data
        self.created_id = None

    def execute(self):
        self.created_id = self.repo.create(self.post_data)

    def undo(self):
        # Compare against None so a falsy ID such as 0 can still be undone
        if self.created_id is not None:
            self.repo.delete(self.created_id)
            self.created_id = None

class CommandHistory:
    def __init__(self):
        self._history: list[Command] = []

    def execute(self, cmd: Command):
        cmd.execute()
        self._history.append(cmd)

    def undo_last(self):
        if self._history:
            self._history.pop().undo()
3. Architecture Patterns
3.1 Clean Architecture
Dependencies always point inward (toward the domain).
Outer Layer → Interface Adapters → Use Cases → Domain Entities
# Domain Entity (innermost layer)
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Article:
    id: str
    title: str
    content: str
    author_id: str
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def publish(self):
        if not self.title or not self.content:
            raise ValueError("Title and content are required")

# Use Case (orchestrates domain logic)
class CreateArticleUseCase:
    def __init__(self, repo: "ArticleRepository", event_bus: EventBus):
        self.repo = repo
        self.event_bus = event_bus

    def execute(self, title: str, content: str, author_id: str) -> Article:
        article = Article(
            id=str(uuid.uuid4()),
            title=title,
            content=content,
            author_id=author_id,
        )
        article.publish()
        self.repo.save(article)
        self.event_bus.publish("article.created", {"id": article.id})
        return article

# Interface Adapter (translates external requests into use case calls)
class ArticleController:
    def __init__(self, use_case: CreateArticleUseCase):
        self.use_case = use_case

    def handle_create(self, request: dict) -> dict:
        article = self.use_case.execute(
            title=request["title"],
            content=request["content"],
            author_id=request["author_id"],
        )
        return {"id": article.id, "title": article.title}
3.2 Hexagonal Architecture (Ports and Adapters)
# Port (interface definition)
class ArticleRepository(ABC):
    @abstractmethod
    def save(self, article: Article): ...

    @abstractmethod
    def find_by_id(self, id: str) -> Article: ...

class NotificationPort(ABC):
    @abstractmethod
    def notify(self, message: str): ...

# Adapter (connects to external systems)
class SQLiteArticleRepository(ArticleRepository):
    def save(self, article: Article):
        pass  # SQLite save logic

    def find_by_id(self, id: str) -> Article:
        pass  # SQLite query logic

class SlackNotificationAdapter(NotificationPort):
    def notify(self, message: str):
        pass  # Slack API call
3.3 CQRS and Event Sourcing
from datetime import datetime, timezone

# Command Side
@dataclass
class CreateOrderCommand:
    order_id: str
    user_id: str
    items: list[dict]

# Query Side (separate read model)
@dataclass
class OrderSummaryView:
    order_id: str
    total_price: float
    item_count: int

# Event Sourcing
@dataclass
class OrderCreatedEvent:
    order_id: str
    user_id: str
    items: list[dict]
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class OrderAggregate:
    def __init__(self):
        self.events: list = []
        self.state = {}

    def create(self, cmd: CreateOrderCommand):
        event = OrderCreatedEvent(
            order_id=cmd.order_id,
            user_id=cmd.user_id,
            items=cmd.items,
        )
        self._apply(event)
        self.events.append(event)

    def _apply(self, event: OrderCreatedEvent):
        # State is derived only from events, so it can be rebuilt by replaying them
        self.state["id"] = event.order_id
        self.state["items"] = event.items
        self.state["total"] = sum(
            i.get("price", 0) * i.get("qty", 1) for i in event.items
        )
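The example above defines a read model (OrderSummaryView) but does not show how it gets populated. One common approach is a projection that consumes events and maintains query-friendly views. The sketch below assumes that approach; OrderSummaryProjection is a hypothetical name, and the event is reduced to the fields the projection needs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderCreatedEvent:
    order_id: str
    user_id: str
    items: list[dict]


@dataclass
class OrderSummaryView:
    order_id: str
    total_price: float
    item_count: int


class OrderSummaryProjection:
    """Hypothetical read-side projection: folds events into views keyed by order ID."""

    def __init__(self) -> None:
        self._views: dict[str, OrderSummaryView] = {}

    def handle(self, event: OrderCreatedEvent) -> None:
        # Precompute what queries need, so reads are a plain dictionary lookup.
        total = sum(i.get("price", 0) * i.get("qty", 1) for i in event.items)
        count = sum(i.get("qty", 1) for i in event.items)
        self._views[event.order_id] = OrderSummaryView(event.order_id, total, count)

    def get(self, order_id: str) -> Optional[OrderSummaryView]:
        return self._views.get(order_id)


projection = OrderSummaryProjection()
projection.handle(
    OrderCreatedEvent(
        order_id="o1",
        user_id="u1",
        items=[{"price": 10.0, "qty": 2}, {"price": 5.0}],
    )
)
summary = projection.get("o1")
print(summary)
```

In a real system the projection would usually be fed asynchronously from an event stream, which is where the eventual-consistency delay between write and read sides comes from.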
4. Microservices Patterns
4.1 API Gateway Pattern
from fastapi import FastAPI, HTTPException
import httpx

app = FastAPI(title="API Gateway")

SERVICE_MAP = {
    "users": "http://user-service:8001",
    "orders": "http://order-service:8002",
    "products": "http://product-service:8003",
}

@app.get("/api/users/{user_id}")
async def proxy_user(user_id: str):
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{SERVICE_MAP['users']}/users/{user_id}")
        if resp.status_code == 404:
            raise HTTPException(status_code=404, detail="User not found")
        return resp.json()
4.2 Saga Pattern (Distributed Transactions)
The Saga pattern ensures data consistency across microservices using a series of local transactions and compensating transactions.
class OrderSaga:
    def __init__(self, order_service, payment_service, inventory_service):
        self.order_svc = order_service
        self.payment_svc = payment_service
        self.inventory_svc = inventory_service

    async def execute(self, order_data: dict):
        order_id = None
        payment_id = None
        try:
            # Step 1: Create order
            order_id = await self.order_svc.create(order_data)
            # Step 2: Process payment
            payment_id = await self.payment_svc.charge(order_data["amount"])
            # Step 3: Reserve inventory
            await self.inventory_svc.reserve(order_data["items"])
            return {"status": "success", "order_id": order_id}
        except Exception:
            # Compensating transactions: undo completed steps in reverse order
            if payment_id is not None:
                await self.payment_svc.refund(payment_id)
            if order_id is not None:
                await self.order_svc.cancel(order_id)
            raise
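The hand-written compensation logic above grows with every new step. One way to generalize it is to pair each action with its compensation and keep a list of compensations for the steps that completed. The sketch below illustrates that idea under simplifying assumptions: run_saga, record, and fail are hypothetical helpers, and real services are replaced by functions that append to a log.

```python
import asyncio
from typing import Awaitable, Callable

Step = Callable[[], Awaitable[None]]


async def run_saga(steps: list[tuple[Step, Step]]) -> None:
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed: list[Step] = []
    try:
        for action, compensate in steps:
            await action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            await compensate()
        raise


log: list[str] = []


def record(message: str) -> Step:
    # Stand-in for a service call: just records that it ran.
    async def step() -> None:
        log.append(message)
    return step


async def fail() -> None:
    raise RuntimeError("inventory unavailable")


async def main() -> None:
    try:
        await run_saga([
            (record("create order"), record("cancel order")),
            (record("charge payment"), record("refund payment")),
            (fail, record("release inventory")),
        ])
    except RuntimeError:
        pass  # the saga re-raises after compensating


asyncio.run(main())
print(log)
# ['create order', 'charge payment', 'refund payment', 'cancel order']
```

The failed step's own compensation is not run, because that step never completed. A production saga would also need to handle compensations that fail themselves, which this sketch does not attempt.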
5. Clean Code Principles
5.1 Meaningful Naming
# BAD
def calc(d, r):
    return d * (1 - r / 100)

# GOOD
def calculate_discounted_price(original_price: float, discount_rate_percent: float) -> float:
    return original_price * (1 - discount_rate_percent / 100)
5.2 Function Design — Small and Single Purpose
# BAD: too many responsibilities
def process_user_registration(email, password, name, send_email=True):
    if "@" not in email:
        raise ValueError("Invalid email format")
    hashed = hash(password)  # built-in hash() is not a password hash
    user = {"email": email, "password": hashed, "name": name}
    db.save(user)
    if send_email:
        mailer.send(email, "Welcome!")
    return user

# GOOD: separated functions
import hashlib
import os

def validate_email(email: str) -> None:
    if "@" not in email:
        raise ValueError("Invalid email format")

def hash_password(raw: str) -> str:
    # Salted key derivation instead of a bare SHA-256 digest;
    # the iteration count is an illustrative choice
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", raw.encode(), salt, 600_000)
    return f"{salt.hex()}${digest.hex()}"

# Persistence and the welcome email are left to the caller
def register_user(email: str, password: str, name: str) -> dict:
    validate_email(email)
    return {"email": email, "password": hash_password(password), "name": name}
5.3 Code Smells and Refactoring
Common code smells and their solutions:
- Long Method: Extract into smaller functions (Extract Method)
- Large Class: Split by responsibility (Extract Class)
- Feature Envy: Move method closer to the data it uses (Move Method)
- Magic Numbers: Replace with named constants (Replace Magic Number with Symbolic Constant)
- Duplicate Code: Extract to a shared function (Extract Function)
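Two of the refactorings in the list can be shown on a single small function. The sketch below applies Replace Magic Number with Symbolic Constant and Extract Method to a shipping calculation; the function, the constants, and their values are all made up for illustration.

```python
# Before: unexplained numbers and a condition whose meaning must be guessed.
def shipping_cost_before(weight_kg: float) -> float:
    if weight_kg > 20:
        return weight_kg * 1.5 + 10
    return weight_kg * 1.5


# After: named constants plus a small extracted predicate.
HEAVY_PARCEL_THRESHOLD_KG = 20
RATE_PER_KG = 1.5
HEAVY_PARCEL_SURCHARGE = 10.0


def is_heavy(weight_kg: float) -> bool:
    return weight_kg > HEAVY_PARCEL_THRESHOLD_KG


def shipping_cost(weight_kg: float) -> float:
    cost = weight_kg * RATE_PER_KG
    if is_heavy(weight_kg):
        cost += HEAVY_PARCEL_SURCHARGE
    return cost


# Behavior is unchanged; only readability improves.
for weight in (5, 20, 30):
    assert shipping_cost(weight) == shipping_cost_before(weight)
print(shipping_cost(30))  # 55.0
```

Checking the new function against the old one, as the loop does, is a cheap way to confirm that a refactoring preserved behavior.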
6. AI System Architecture
6.1 RAG (Retrieval-Augmented Generation) Architecture
from dataclasses import dataclass

@dataclass
class RAGConfig:
    embedding_model: str = "text-embedding-3-small"
    llm_model: str = "gpt-4o"
    top_k: int = 5
    chunk_size: int = 512

class RAGPipeline:
    def __init__(self, config: RAGConfig, vector_store, llm_client):
        self.config = config
        self.vector_store = vector_store
        self.llm = llm_client

    def retrieve(self, query: str) -> list[str]:
        # 1. Embed the query
        query_vector = self.llm.embed(query)
        # 2. Search for similar documents
        docs = self.vector_store.search(query_vector, top_k=self.config.top_k)
        return [d["content"] for d in docs]

    def generate(self, query: str, context: list[str]) -> str:
        context_text = "\n\n".join(context)
        prompt = f"""Answer the question using the following context.
Context:
{context_text}
Question: {query}
Answer:"""
        return self.llm.complete(prompt)

    def query(self, user_question: str) -> str:
        context = self.retrieve(user_question)
        return self.generate(user_question, context)
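The configuration above carries a chunk_size, but the pipeline only shows the query path, not how documents are split before indexing. The sketch below is one simple possibility: fixed-size chunks with a small overlap. It assumes chunk_size counts characters (the original may well mean tokens), and both chunk_text and the overlap parameter are hypothetical additions.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly.

    Assumption: sizes are measured in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

The overlap keeps a sentence that straddles a boundary at least partly visible in both neighboring chunks. Splitting on sentence or paragraph boundaries is often preferable in practice, at the cost of more code.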
6.2 Agent System Design
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    func: Callable[[str], str]

class ReActAgent:
    """AI Agent using the Reasoning + Acting pattern"""
    def __init__(self, llm, tools: list[Tool]):
        self.llm = llm
        self.tools = {t.name: t for t in tools}

    def _build_system_prompt(self) -> str:
        tool_desc = "\n".join(
            f"- {t.name}: {t.description}" for t in self.tools.values()
        )
        return f"""You are an AI Agent that solves tasks using tools.
Available tools:
{tool_desc}
Format:
Thought: [analyze the current situation]
Action: [tool name to use]
Action Input: [input for the tool]
Observation: [result of the tool execution]
... (repeat)
Final Answer: [final answer]"""

    def _extract(self, response: str, label: str) -> str:
        # Return the text after the first line starting with label, or ""
        for line in response.split("\n"):
            if line.startswith(label):
                return line[len(label):].strip()
        return ""

    def run(self, task: str, max_steps: int = 10) -> str:
        # Send the system prompt so the model knows the tools and the format
        messages = [
            {"role": "system", "content": self._build_system_prompt()},
            {"role": "user", "content": task},
        ]
        for _ in range(max_steps):
            response = self.llm.chat(messages)
            if "Final Answer:" in response:
                return response.split("Final Answer:")[-1].strip()
            tool = self.tools.get(self._extract(response, "Action:"))
            if tool:
                # Pass only the Action Input to the tool, not the whole response
                observation = tool.func(self._extract(response, "Action Input:"))
            else:
                # Report the problem back so the loop does not repeat the same state
                observation = "Unknown or missing action; choose one of the available tools."
            messages.append({"role": "assistant", "content": response})
            messages.append({"role": "user", "content": f"Observation: {observation}"})
        return "Max steps exceeded"
7. Testing Strategy
7.1 The Test Pyramid
      [E2E Tests]           <- slow and costly, keep few
  [Integration Tests]       <- verify service contracts
     [Unit Tests]           <- fast and cheap, keep many
7.2 TDD Example (Red-Green-Refactor)
import pytest

# 1. RED: write a failing test first
def test_calculate_discounted_price_basic():
    assert calculate_discounted_price(100.0, 20.0) == 80.0

def test_calculate_discounted_price_zero_discount():
    assert calculate_discounted_price(100.0, 0.0) == 100.0

def test_calculate_discounted_price_full_discount():
    assert calculate_discounted_price(100.0, 100.0) == 0.0

def test_calculate_discounted_price_invalid_rate():
    with pytest.raises(ValueError):
        calculate_discounted_price(100.0, -10.0)

# 2. GREEN: minimal implementation to pass the tests
def calculate_discounted_price(price: float, discount_rate: float) -> float:
    if discount_rate < 0 or discount_rate > 100:
        raise ValueError("Discount rate must be between 0 and 100")
    return price * (1 - discount_rate / 100)

# 3. REFACTOR: improve quality while keeping tests green
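The refactor step is left without code above. As one possible illustration, the validation could be pulled into a helper so the pricing function reads as a single calculation. This is only a sketch of what a refactor might look like; _validate_rate and the two constants are hypothetical names, and the behavior is meant to stay identical so the same tests keep passing.

```python
MIN_RATE_PERCENT = 0.0
MAX_RATE_PERCENT = 100.0


def _validate_rate(discount_rate: float) -> None:
    # Extracted from the pricing function; same rule, same exception type.
    if not MIN_RATE_PERCENT <= discount_rate <= MAX_RATE_PERCENT:
        raise ValueError(
            f"Discount rate must be between {MIN_RATE_PERCENT:g} and {MAX_RATE_PERCENT:g}"
        )


def calculate_discounted_price(price: float, discount_rate: float) -> float:
    _validate_rate(discount_rate)
    return price * (1 - discount_rate / 100)


# The original expectations still hold after the refactor.
assert calculate_discounted_price(100.0, 20.0) == 80.0
assert calculate_discounted_price(100.0, 0.0) == 100.0
assert calculate_discounted_price(100.0, 100.0) == 0.0
```

The point of the step is that nothing observable changes: if a test turns red here, the refactor altered behavior and should be reverted or fixed.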
7.3 Mocking Strategy
import pytest
from unittest.mock import MagicMock

# Assumes a UserService that takes repo and email_svc and exposes register()
class TestUserService:
    def test_create_user_sends_email(self):
        mock_repo = MagicMock()
        mock_email = MagicMock()
        mock_repo.save.return_value = {"id": "u1"}
        service = UserService(repo=mock_repo, email_svc=mock_email)
        service.register("test@example.com", "John Doe")
        mock_repo.save.assert_called_once()
        mock_email.send_welcome.assert_called_once_with("test@example.com")

    def test_create_user_handles_db_error(self):
        mock_repo = MagicMock()
        mock_repo.save.side_effect = Exception("DB connection failed")
        mock_email = MagicMock()
        service = UserService(repo=mock_repo, email_svc=mock_email)
        with pytest.raises(Exception):
            service.register("test@example.com", "John Doe")
        # Checked after the with block: no email may be sent when saving fails
        mock_email.send_welcome.assert_not_called()
Quiz
Q1. Why should high-level modules not directly depend on low-level modules in the Dependency Inversion Principle?
Answer: Because changes in low-level modules propagate up to high-level modules, increasing the cost of change across the entire system.
Explanation: When business logic (high-level) depends directly on infrastructure (low-level like a database or external API), swapping MySQL for PostgreSQL forces changes in the business logic code. Depending on an abstraction (interface) means only the low-level adapter needs to change while the high-level code stays untouched. This also makes it easy to inject mocks during testing.
Q2. What is the difference between the Observer pattern and the Pub/Sub pattern?
Answer: In Observer, Subject and Observer have a direct reference relationship. In Pub/Sub, a message broker (event bus) sits between Publisher and Subscriber, fully decoupling them.
Explanation: In the Observer pattern, the Subject maintains its list of Observers directly and calls them itself, so the two typically live in the same process. In Pub/Sub, a broker such as Kafka or RabbitMQ lets Publisher and Subscriber communicate without knowing each other. That makes Pub/Sub the better fit for asynchronous communication between microservices.
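For contrast with the EventBus from section 2.3, where publishers only know the bus and an event name, here is a minimal sketch of the classic Observer shape in which the Subject holds its Observers directly. Thermometer and Display are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class Observer(ABC):
    @abstractmethod
    def update(self, temperature: float) -> None: ...


class Thermometer:
    """Subject: owns the observer list and calls each observer itself."""

    def __init__(self) -> None:
        self._observers: list[Observer] = []
        self._temperature = 0.0

    def attach(self, observer: Observer) -> None:
        self._observers.append(observer)

    def detach(self, observer: Observer) -> None:
        self._observers.remove(observer)

    def set_temperature(self, value: float) -> None:
        self._temperature = value
        # Iterate over a copy so observers may detach during notification.
        for observer in list(self._observers):
            observer.update(value)


class Display(Observer):
    def __init__(self) -> None:
        self.readings: list[float] = []

    def update(self, temperature: float) -> None:
        self.readings.append(temperature)


sensor = Thermometer()
display = Display()
sensor.attach(display)
sensor.set_temperature(21.5)
sensor.detach(display)
sensor.set_temperature(30.0)  # no longer observed
print(display.readings)  # [21.5]
```

Here the Subject has a direct object reference to every Observer, which is exactly the coupling that a broker removes in Pub/Sub.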
Q3. What are the benefits and complexity trade-offs of separating Commands and Queries in CQRS?
Answer: Read and write workloads can scale independently and read models can be optimized, but the downsides are eventual consistency delays and increased code complexity.
Explanation: Commands (writes) must protect consistency, while Queries (reads) are tuned for performance. Separation allows multiple read replicas or denormalized read models. However, when read models are updated asynchronously, as is common when CQRS is combined with event sourcing, they lag behind the write side, and the extra moving parts significantly increase overall system complexity.
Q4. When is the Saga pattern necessary in a microservices architecture?
Answer: When a business transaction spans multiple microservices and you need data consistency in a distributed environment without using Two-Phase Commit (2PC).
Explanation: When order, payment, and shipping are separate services, they cannot share a single database transaction. Saga handles each step as a local transaction and rolls back completed steps using compensating transactions if a failure occurs. It can be implemented as Choreography (event-driven) or Orchestration (central coordinator).
Q5. Why should you write only the minimum code necessary during the Green phase of TDD's Red-Green-Refactor cycle?
Answer: To verify that the test genuinely validates behavior and to prevent over-engineering (YAGNI violations), by writing only enough code to pass the currently failing test.
Explanation: Enforcing minimal implementation confirms that tests are working as specifications. If you implement everything upfront, you cannot be sure that tests pass for the right reasons. During the Refactor phase, tests serve as a safety net so refactoring can proceed safely without breaking behavior.
Conclusion
Software architecture is not a one-time lesson. SOLID principles apply from the smallest function design, and Clean Architecture forms the backbone of systems a team will maintain for years. These principles apply equally to AI systems: RAG pipelines designed with ports and adapters make swapping vector databases painless, and Agent systems that use the Command pattern for tool execution become highly extensible.
Good architecture makes change fearless. Apply the patterns you learned today to your real projects, one at a time.