Authors
- Youngju Kim (@fjvbn20031)
Table of Contents
- Complexity Analysis
- Core Data Structures
- Graph Algorithms
- Dynamic Programming
- Advanced Algorithms
- AI/ML Algorithm Connections
- Coding Interview Patterns
- Quiz: Test Your Knowledge
1. Complexity Analysis
Big-O, Omega, and Theta Notation
The core of algorithm analysis is expressing mathematically how time and space resources grow with input size n.
| Notation | Meaning | Description |
|---|---|---|
| O(f(n)) | Upper bound | Grows no faster than a constant multiple of f(n) |
| Ω(f(n)) | Lower bound | Grows no slower than a constant multiple of f(n) |
| Θ(f(n)) | Tight bound | Bounded above and below by constant multiples of f(n) |
Common complexity classes (from slowest to fastest growth):
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(n!)
Recurrence Relations and the Master Theorem
Divide-and-conquer algorithms are expressed as recurrences, solvable with the Master Theorem.
T(n) = aT(n/b) + f(n)

where a = number of subproblems, b = factor by which the input size shrinks at each level, and f(n) = cost of splitting the input and merging the results.
Three cases of the Master Theorem:
- f(n) = O(n^(log_b(a) - ε)) → T(n) = Θ(n^log_b(a))
- f(n) = Θ(n^log_b(a)) → T(n) = Θ(n^log_b(a) · log n)
- f(n) = Ω(n^(log_b(a) + ε)) → T(n) = Θ(f(n))
Example: Merge sort T(n) = 2T(n/2) + O(n) → Case 2 → Θ(n log n)
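As an illustration, the merge sort recurrence maps directly onto code. The sketch below (the function name `merge_sort` is ours, not from any library) makes the two recursive halves and the linear merge step explicit:

```python
def merge_sort(arr):
    # T(n) = 2T(n/2) + O(n): two recursive halves plus a linear merge
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # O(n) merge step: repeatedly take the smaller front element
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```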
Amortized Analysis
Analyzes the average cost per operation over a sequence of operations, even when individual operations can be expensive.
Dynamic Array example:
- Doubling capacity on overflow → that push is O(n)
- Total cost of n pushes: O(1) + O(1) + ... + O(n) ≈ O(2n) = O(n)
- Amortized cost: O(1) per operation
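The doubling argument can be checked empirically. This toy counter (not a real array implementation; the function name is ours) tallies how many element copies n pushes trigger:

```python
# Count element copies performed by a doubling dynamic array over n pushes.
def doubling_push_cost(n):
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:     # overflow: double capacity, copy everything
            copies += size
            capacity *= 2
        size += 1
    return copies

for n in (10, 100, 1000):
    # Total copies stay below 2n, so the amortized cost per push is O(1)
    print(n, doubling_push_cost(n), doubling_push_cost(n) / n)
```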
2. Core Data Structures
Complexity Summary
| Data Structure | Access | Search | Insert | Delete | Space |
|---|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) | O(n) |
| Linked List | O(n) | O(n) | O(1) | O(1) | O(n) |
| Stack | O(n) | O(n) | O(1) | O(1) | O(n) |
| Queue | O(n) | O(n) | O(1) | O(1) | O(n) |
| Hash Table | - | O(1)* | O(1)* | O(1)* | O(n) |
| BST (balanced) | O(log n) | O(log n) | O(log n) | O(log n) | O(n) |
| Heap | O(1) (min/max) | O(n) | O(log n) | O(log n) | O(n) |
*Average case
Red-Black Tree
A self-balancing BST that automatically maintains balance through rotations and recoloring.
5 properties:
- Every node is either red or black
- The root is black
- All leaves (NIL) are black
- Both children of every red node are black
- All paths from root to any leaf have the same number of black nodes
```python
# Red-Black Tree Rotation - Left Rotation concept
#
#     x                  y
#    / \       →        / \
#   a   y              x   c
#      / \            / \
#     b   c          a   b
#
def left_rotate(tree, x):
    y = x.right              # y is x's right child
    x.right = y.left         # move y's left subtree to x's right
    if y.left != tree.NIL:
        y.left.parent = x
    y.parent = x.parent      # link x's parent to y
    if x.parent == tree.NIL:
        tree.root = y
    elif x == x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x               # put x on y's left
    x.parent = y
```
Heap and Priority Queue
A heap is a complete binary tree, typically stored as an array, in which every parent satisfies the heap property (parent ≤ children in a min-heap, parent ≥ children in a max-heap).
```python
import heapq

# Using a min-heap for Dijkstra's algorithm
def dijkstra(graph, start):
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    # Store (distance, node) tuples in the heap
    heap = [(0, start)]
    visited = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node]:
            new_dist = dist + weight
            if new_dist < distances[neighbor]:
                distances[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return distances

# Example usage
graph = {
    'A': [('B', 1), ('C', 4)],
    'B': [('C', 2), ('D', 5)],
    'C': [('D', 1)],
    'D': []
}
print(dijkstra(graph, 'A'))
# {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```
3. Graph Algorithms
BFS vs DFS Comparison
| Property | BFS | DFS |
|---|---|---|
| Data structure | Queue | Stack / Recursion |
| Shortest path | Yes (unweighted) | No |
| Auxiliary space | O(V) (queue can hold a full level) | O(V) (recursion/stack depth) |
| Use cases | Shortest path, level traversal | Cycle detection, topological sort, components |
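The table above can be condensed into two short traversals. This is a minimal sketch (function names `bfs`/`dfs` and the sample graph are ours) showing the queue-vs-recursion contrast:

```python
from collections import deque

def bfs(graph, start):
    # Queue-based: visits nodes level by level (shortest path in edge count)
    visited, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                queue.append(nb)
    return order

def dfs(graph, node, visited=None, order=None):
    # Recursion-based: follows one branch to its end before backtracking
    if visited is None:
        visited, order = set(), []
    visited.add(node)
    order.append(node)
    for nb in graph[node]:
        if nb not in visited:
            dfs(graph, nb, visited, order)
    return order

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(bfs(graph, 'A'))   # ['A', 'B', 'C', 'D']
print(dfs(graph, 'A'))   # ['A', 'B', 'D', 'C']
```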
Shortest Path Algorithm Comparison
| Algorithm | Negative edges | Detect negative cycles | Time complexity | Scope |
|---|---|---|---|---|
| Dijkstra | No | No | O((V+E) log V) | Single source |
| Bellman-Ford | Yes | Yes | O(VE) | Single source |
| Floyd-Warshall | Yes | Yes | O(V³) | All pairs |
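Since Bellman-Ford is the row in the table that handles negative edges, a minimal sketch may help (the function name and edge-list format `(u, v, w)` are our assumptions):

```python
def bellman_ford(n, edges, start):
    # edges: list of (u, v, w). Returns (distances, has_negative_cycle).
    INF = float('inf')
    dist = [INF] * n
    dist[start] = 0
    # Relax every edge V-1 times: after pass i, all shortest paths
    # using at most i edges are correct
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more pass: any further improvement implies a negative cycle
    has_neg_cycle = any(
        dist[u] != INF and dist[u] + w < dist[v] for u, v, w in edges
    )
    return dist, has_neg_cycle

dist, neg = bellman_ford(4, [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 2)], 0)
print(dist, neg)  # [0, 4, 1, 3] False
```

Note how the negative edge (1, 2, -3) makes the path 0→1→2 (cost 1) cheaper than the direct edge 0→2 (cost 5), exactly the case Dijkstra would get wrong.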
Floyd-Warshall Implementation
```python
def floyd_warshall(n, edges):
    INF = float('inf')
    # dist[i][j] = shortest distance from i to j
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lighter of parallel edges
    for k in range(n):          # intermediate node
        for i in range(n):      # source node
            for j in range(n):  # destination node
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```
MST: Kruskal vs Prim
Kruskal (edge-based, uses Union-Find):
- Sort all edges by weight ascending, then greedily add edges that don't form cycles
- Time complexity: O(E log E)
- Better for sparse graphs
Prim (vertex-based, uses min-heap):
- Grow MST from a start node by repeatedly adding the minimum-weight edge crossing the cut
- Time complexity: O(E log V)
- Better for dense graphs
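Kruskal's greedy loop is short enough to sketch in full. This version (function name ours) inlines a compact find/union pair for cycle detection and returns the total MST weight, assuming a connected graph:

```python
def kruskal(n, edges):
    # edges: list of (w, u, v). Returns total MST weight of a connected graph.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(edges):  # sort edges by weight ascending
        ru, rv = find(u), find(v)
        if ru != rv:               # adding this edge creates no cycle
            parent[ru] = rv
            total += w
    return total

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal(4, edges))  # 7: picks weights 1, 2, 4 and skips the cycle edge 3
```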
Topological Sort
Orders nodes in a DAG (Directed Acyclic Graph) respecting dependencies.
```python
from collections import deque

def topological_sort(n, edges):
    indegree = [0] * n
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)
        indegree[v] += 1
    queue = deque([i for i in range(n) if indegree[i] == 0])
    result = []
    while queue:
        node = queue.popleft()
        result.append(node)
        for neighbor in graph[node]:
            indegree[neighbor] -= 1
            if indegree[neighbor] == 0:
                queue.append(neighbor)
    return result if len(result) == n else []  # empty if cycle detected
```
4. Dynamic Programming
Two Approaches to DP
DP applies when a problem has both of the following properties:
- Optimal substructure: the optimal solution can be constructed from optimal solutions to its subproblems.
- Overlapping subproblems: the same subproblems are solved multiple times.
| Approach | Direction | Advantages | Disadvantages |
|---|---|---|---|
| Memoization (Top-Down) | Recursion + cache | Only computes needed subproblems | Recursive call overhead |
| Tabulation (Bottom-Up) | Iteration | No stack overflow risk, cache-friendly | May compute unnecessary subproblems |
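Fibonacci is the classic way to see both rows of the table side by side; a minimal sketch (function names ours):

```python
from functools import lru_cache

# Top-down (memoization): recursion plus a cache;
# only subproblems actually reached are computed
@lru_cache(maxsize=None)
def fib_memo(n):
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

# Bottom-up (tabulation): fill the table iteratively;
# no recursion depth limit
def fib_tab(n):
    if n < 2:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

print(fib_memo(30), fib_tab(30))  # 832040 832040
```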
LCS (Longest Common Subsequence)
```python
def lcs(s1, s2):
    m, n = len(s1), len(s2)
    # dp[i][j] = LCS length of s1[:i] and s2[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    # 2D table visualization (s1="ABCD", s2="ACBD"):
    #        ""  A  C  B  D
    #    ""   0  0  0  0  0
    #    A    0  1  1  1  1
    #    B    0  1  1  2  2
    #    C    0  1  2  2  2
    #    D    0  1  2  2  3  <- LCS length = 3 ("ABD" or "ACD")
    return dp[m][n]
```
0/1 Knapsack
```python
def knapsack(weights, values, capacity):
    n = len(weights)
    # dp[i][w] = max value using first i items, with capacity w
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            # Don't include item i
            dp[i][w] = dp[i-1][w]
            # Include item i
            if weights[i-1] <= w:
                dp[i][w] = max(dp[i][w],
                               dp[i-1][w - weights[i-1]] + values[i-1])
    return dp[n][capacity]
```
5. Advanced Algorithms
Segment Tree
Handles range queries (sum, min, max) in O(log n) per operation.
```python
class SegmentTree:
    def __init__(self, arr):
        self.n = len(arr)
        self.tree = [0] * (4 * self.n)
        self.build(arr, 0, 0, self.n - 1)

    def build(self, arr, node, start, end):
        if start == end:
            self.tree[node] = arr[start]
        else:
            mid = (start + end) // 2
            self.build(arr, 2*node+1, start, mid)
            self.build(arr, 2*node+2, mid+1, end)
            self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]

    def query(self, node, start, end, l, r):
        if r < start or end < l:
            return 0  # out of range
        if l <= start and end <= r:
            return self.tree[node]  # fully inside range
        mid = (start + end) // 2
        left = self.query(2*node+1, start, mid, l, r)
        right = self.query(2*node+2, mid+1, end, l, r)
        return left + right

    def update(self, node, start, end, idx, val):
        if start == end:
            self.tree[node] = val
        else:
            mid = (start + end) // 2
            if idx <= mid:
                self.update(2*node+1, start, mid, idx, val)
            else:
                self.update(2*node+2, mid+1, end, idx, val)
            self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]

# Example: SegmentTree([1, 3, 5, 7]).query(0, 0, 3, 1, 3) returns 3+5+7 = 15
```
Fenwick Tree (Binary Indexed Tree)
Simpler to implement than Segment Tree, with better memory efficiency.
```python
class FenwickTree:
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed internally

    def update(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)  # add the lowest set bit

    def query(self, i):
        # prefix sum over [1..i]
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)  # remove the lowest set bit
        return total

    def range_query(self, l, r):
        return self.query(r) - self.query(l - 1)
```
| Structure | Implementation | Range Query | Point Update | Range Update | Memory |
|---|---|---|---|---|---|
| Segment Tree | Complex | O(log n) | O(log n) | O(log n) | O(4n) |
| Fenwick Tree | Simple | O(log n) | O(log n) | Limited | O(n) |
Union-Find (Path Compression + Rank)
```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in the same set
        # Union by rank
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

# Path compression + rank -> effectively O(α(n)) ≈ O(1) (inverse Ackermann)
```
KMP String Matching
```python
def kmp_search(text, pattern):
    def build_lps(pattern):
        # lps[i] = length of the longest proper prefix of pattern[:i+1]
        # that is also a suffix of it
        lps = [0] * len(pattern)
        length, i = 0, 1
        while i < len(pattern):
            if pattern[i] == pattern[length]:
                length += 1
                lps[i] = length
                i += 1
            elif length:
                length = lps[length - 1]
            else:
                lps[i] = 0
                i += 1
        return lps

    lps = build_lps(pattern)
    i = j = 0
    results = []
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1
            j += 1
            if j == len(pattern):
                results.append(i - j)
                j = lps[j - 1]
        elif j:
            j = lps[j - 1]  # fall back in the pattern, keep i
        else:
            i += 1          # no prefix to fall back to; advance in the text

    return results

# Example: kmp_search("abcabcab", "abcab") returns [0, 3]
```
6. AI/ML Algorithm Connections
k-d Tree for k-NN Search
The k-d tree partitions k-dimensional space to enable efficient nearest-neighbor queries.
```python
class KDNode:
    def __init__(self, point, left=None, right=None):
        self.point = point
        self.left = left
        self.right = right

def build_kd_tree(points, depth=0):
    if not points:
        return None
    k = len(points[0])
    axis = depth % k
    sorted_points = sorted(points, key=lambda p: p[axis])
    mid = len(sorted_points) // 2
    return KDNode(
        point=sorted_points[mid],
        left=build_kd_tree(sorted_points[:mid], depth + 1),
        right=build_kd_tree(sorted_points[mid+1:], depth + 1)
    )

def nearest_neighbor(root, query, depth=0, best=None):
    if root is None:
        return best
    k = len(query)
    axis = depth % k
    dist = sum((q - p) ** 2 for q, p in zip(query, root.point)) ** 0.5
    if best is None or dist < best[0]:
        best = (dist, root.point)
    # Compare query to splitting hyperplane
    diff = query[axis] - root.point[axis]
    close, away = (root.left, root.right) if diff <= 0 else (root.right, root.left)
    best = nearest_neighbor(close, query, depth + 1, best)
    # Check if we need to explore the other side
    if abs(diff) < best[0]:
        best = nearest_neighbor(away, query, depth + 1, best)
    return best
```
k-d tree average complexity: O(log n), worst case: O(n) — performance degrades in high dimensions ("curse of dimensionality")
LSH (Locality-Sensitive Hashing)
An approximate nearest neighbor (ANN) method for finding similar high-dimensional vectors (e.g., embeddings) quickly.
- Core idea: Similar vectors hash to the same bucket with high probability
- Cosine similarity LSH: Hash by random hyperplane projections
- Applications: Large-scale embedding search (Faiss, ScaNN), near-duplicate detection
- Time complexity: Query O(1) ~ O(log n), index build O(nk)
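The random-hyperplane variant is easy to sketch; the following is a toy illustration (all function names and the sample vectors are ours), not a production index like Faiss:

```python
import random

def random_hyperplanes(dim, n_bits, seed=0):
    # n_bits random Gaussian vectors, one hyperplane per hash bit
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def lsh_signature(vec, planes):
    # One bit per hyperplane: which side of the plane the vector lies on.
    # Vectors with high cosine similarity tend to agree on most bits,
    # so they tend to land in the same bucket.
    bits = 0
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

planes = random_hyperplanes(dim=4, n_bits=16)
a = [1.0, 0.9, 0.1, 0.0]
b = [0.9, 1.0, 0.0, 0.1]   # nearly parallel to a -> signatures mostly agree
print(format(lsh_signature(a, planes), '016b'))
print(format(lsh_signature(b, planes), '016b'))
```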
Beam Search in NLP Decoding
An improvement over greedy search: maintains top k candidates (beam width) at each step.
Beam width k=2, keeping the top 2 tokens per step:

```
Step 1: [A(0.6), B(0.4)]
Step 2: [AA(0.36), AB(0.24), BA(0.28), BB(0.16)] -> top 2: [AA, BA]
Step 3: [AAX, AAY, BAX, BAY] -> keep top 2 ...
```
Tradeoff: Larger beam width improves output quality but memory/time scales as O(k·n).
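The core loop can be sketched in a few lines. This toy version (function name ours) assumes, unlike a real language model, that the token distribution at each step is independent of the prefix:

```python
import math

def beam_search(step_probs, k=2):
    # step_probs: one dict {token: prob} per decoding step
    beams = [("", 0.0)]  # (sequence, cumulative log-probability)
    for probs in step_probs:
        candidates = [
            (seq + tok, lp + math.log(p))
            for seq, lp in beams
            for tok, p in probs.items()
        ]
        # Keep only the k highest-scoring partial sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams

steps = [{"A": 0.6, "B": 0.4}, {"A": 0.6, "B": 0.4}]
for seq, lp in beam_search(steps, k=2):
    print(seq, round(math.exp(lp), 2))  # AA 0.36, then AB 0.24
```

Summing log-probabilities instead of multiplying raw probabilities is the standard trick to avoid floating-point underflow on long sequences.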
A* Algorithm in Robotics / Path Planning
Combines Dijkstra with a heuristic function h(n) for optimal and efficient pathfinding.
```
f(n) = g(n) + h(n)
```
- g(n): actual cost from start to the current node
- h(n): estimated cost from the current node to the goal (admissible heuristic)
- Admissibility condition: h(n) never overestimates the true remaining cost (h(n) ≤ h*(n)) → guarantees an optimal path
- Robotics applications: Grid-map motion planning, ROS Navigation Stack
- Time complexity: O(E log V) — same as Dijkstra, but dramatically reduces search space in practice
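A minimal grid-map sketch of the above (the function name, the `'#'`-wall grid encoding, and the Manhattan heuristic choice are our assumptions):

```python
import heapq

def a_star(grid, start, goal):
    # 4-directional grid; grid is a list of strings, '#' = wall.
    # Returns the path cost from start to goal, or -1 if unreachable.
    rows, cols = len(grid), len(grid[0])

    def h(pos):
        # Manhattan distance: admissible for unit-cost 4-directional moves
        return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # (f = g + h, g, node)
    best_g = {start: 0}
    while open_heap:
        f, g, pos = heapq.heappop(open_heap)
        if pos == goal:
            return g
        if g > best_g.get(pos, float('inf')):
            continue  # stale heap entry
        r, c = pos
        for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != '#':
                ng = g + 1
                if ng < best_g.get((nr, nc), float('inf')):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return -1

grid = ["....",
        ".##.",
        "...."]
print(a_star(grid, (0, 0), (2, 3)))  # 5
```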
7. Coding Interview Patterns
Two Pointers
```python
# Find two numbers in a sorted array that sum to target
def two_sum_sorted(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:
        current = arr[left] + arr[right]
        if current == target:
            return (left, right)
        elif current < target:
            left += 1
        else:
            right -= 1
    return None
# Time O(n), Space O(1)
```
Sliding Window
```python
# Maximum sum of subarray of length k
def max_subarray_sum(arr, k):
    window_sum = sum(arr[:k])
    max_sum = window_sum
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]
        max_sum = max(max_sum, window_sum)
    return max_sum
# Time O(n), Space O(1)
```
Backtracking
```python
# N-Queens problem
def solve_n_queens(n):
    results = []
    board = [-1] * n

    def is_valid(row, col):
        for r in range(row):
            if board[r] == col:
                return False
            if abs(board[r] - col) == abs(r - row):
                return False
        return True

    def backtrack(row):
        if row == n:
            results.append(board[:])
            return
        for col in range(n):
            if is_valid(row, col):
                board[row] = col
                backtrack(row + 1)
                board[row] = -1  # undo choice

    backtrack(0)
    return results
```
Pattern Selection Guide
| Problem Type | Recommended Pattern |
|---|---|
| Sorted array, pair/triple combinations | Two Pointers |
| Contiguous subarray or substring | Sliding Window |
| Combinations, permutations, counting | Backtracking |
| Shortest path, level traversal | BFS |
| Cycles, topology, connectivity | DFS |
| Range queries and updates | Segment Tree / Fenwick Tree |
| Merging sets, connectivity checks | Union-Find |
| Optimization with overlapping subproblems | Dynamic Programming |
8. Quiz: Test Your Knowledge
Q1. Under what circumstances does a hash table's average O(1) lookup degrade to O(n)?
Answer: When all keys hash to the same bucket — a worst-case hash collision scenario.
Explanation: A hash table maps keys to buckets using a hash function. In the ideal case with no collisions, lookup is O(1). However, with a poor hash function or adversarial inputs, all n keys can land in the same bucket. With chaining, this bucket becomes a linked list that must be traversed linearly, giving O(n). Solutions include rehashing when the load factor exceeds a threshold, universal hashing (randomized hash functions), or open addressing strategies like Robin Hood hashing.
Q2. Why does Dijkstra's algorithm fail with negative-weight edges?
Answer: Dijkstra relies on the greedy assumption that once a node is settled, its distance is final. Negative edges violate this assumption.
Explanation: Dijkstra marks a node as "finalized" when it is popped from the min-heap, assuming no future path can improve it. With negative edges, a path through a later-visited node B could reduce the distance to an already-finalized node A (even without a negative cycle). The algorithm's invariant breaks down. Use Bellman-Ford for graphs with negative edges — it relaxes all edges V-1 times and can detect negative cycles.
Q3. What is the space complexity difference between memoization and tabulation in DP?
Answer: Memoization stores only the subproblems that are actually called (advantageous when many subproblems are skipped); tabulation always fills the entire table O(n) but can be space-optimized with a sliding window.
Explanation: Top-down memoization uses a cache (dict/array) and only computes subproblems encountered during recursion. If only k of n total subproblems are needed, it uses O(k) space. Bottom-up tabulation fills the entire dp table, always using O(n) space. However, for problems where dp[i] depends only on dp[i-1] (like Fibonacci), the table can be reduced to O(1) by keeping only the previous values.
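The sliding-window space optimization mentioned above fits in a few lines; a minimal Fibonacci sketch (function name ours):

```python
def fib_const_space(n):
    # dp[i] depends only on dp[i-1] and dp[i-2],
    # so keep just two running values instead of the whole table
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev

print(fib_const_space(10))  # 55
```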
Q4. Why is a k-d tree efficient for k-NN search on average, and what are its limitations?
Answer: It halves the search space at each level, giving average O(log n). However, the "curse of dimensionality" causes performance to approach O(n) as d grows large.
Explanation: A k-d tree splits data along one axis per level. During nearest-neighbor search, it prunes entire subtrees when the distance to the splitting hyperplane exceeds the current best distance. This pruning is highly effective in low dimensions (roughly d ≤ 20), averaging O(log n). But in high dimensions, the distance to the hyperplane is almost always less than the current best, so pruning rarely triggers and nearly all points are visited. For high-dimensional vector search, approximate nearest neighbor methods like LSH or HNSW are preferred.
Q5. What are the implementation complexity and use-case differences between Segment Tree and Fenwick Tree (BIT)?
Answer: Segment Tree is more complex to implement but general-purpose (range updates, arbitrary aggregation); Fenwick Tree is simpler and faster but limited primarily to prefix sums and point updates.
Explanation: The Fenwick Tree (BIT) uses bitwise operations on the lowest set bit, making it very concise and cache-friendly. It excels at prefix sum queries and point updates. The Segment Tree requires more code but supports range minimum/maximum queries, lazy propagation for range updates, and complex aggregation functions (GCD, XOR, etc.). A practical rule in competitive programming: "use BIT for range sums only; use Segment Tree for everything else."
Summary
Algorithms and data structures form the bedrock of AI engineering.
- Complexity analysis is essential for predicting bottlenecks and making scaling decisions in system design.
- Graph algorithms are directly applied in knowledge graphs, recommendation systems, and path optimization.
- Dynamic programming connects deeply to sequence modeling and the Bellman equation in reinforcement learning.
- k-d trees, LSH, and Beam Search are core algorithms powering efficient inference in modern ML systems.
Combine consistent problem-solving practice with hands-on implementation to master both theory and real-world application.