N+1 문제 완전 해부 — ORM의 조용한 성능 킬러

들어가며
N+1 문제란?
N+1의 종류
select_related vs prefetch_related
ORM별 해결법
N+1 탐지 방법
GraphQL의 N+1 (DataLoader)
정리: N+1 해결 의사결정 트리

들어가며

코드 리뷰에서 가장 자주 나오는 피드백 중 하나: "N+1 쿼리 발생하고 있어요."

N+1 문제는 ORM을 쓰는 거의 모든 프로젝트에서 발생합니다. 그리고 조용히 성능을 죽입니다. 로컬에서는 안 느끼다가, 데이터가 1만 건만 넘어가면 갑자기 API 응답이 10초가 됩니다.

N+1 문제란?

시나리오

블로그 시스템에서 게시글 목록과 작성자를 보여주려 합니다.

# 모델 정의 (Django)
class Author(models.Model):
    name = models.CharField(max_length=100)

class Post(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    created_at = models.DateTimeField(auto_now_add=True)

# ❌ N+1 문제 발생!
posts = Post.objects.all()[:100]  # 1번 쿼리: 게시글 100개 조회

for post in posts:
    print(f"{post.title} by {post.author.name}")
    # ↑ 게시글마다 author를 조회! 100번 추가 쿼리!

실제 발생하는 SQL

-- 1번째 쿼리 (게시글 목록)
SELECT id, title, author_id, created_at FROM posts LIMIT 100;

-- 2번째 ~ 101번째 쿼리 (작성자 각각 조회!)
SELECT id, name FROM authors WHERE id = 1;
SELECT id, name FROM authors WHERE id = 2;
SELECT id, name FROM authors WHERE id = 3;
SELECT id, name FROM authors WHERE id = 1;  -- 같은 작성자도 또 조회!
SELECT id, name FROM authors WHERE id = 4;
...
-- 총 101번의 쿼리! (1 + N = 1 + 100)

성능 영향

# 벤치마크
import time

# N+1 (101 queries)
start = time.time()
posts = Post.objects.all()[:1000]
for post in posts:
    _ = post.author.name
print(f"N+1: {time.time() - start:.2f}s")
# → N+1: 3.47s (1001 queries)

# 최적화 후 (2 queries)
start = time.time()
posts = Post.objects.select_related('author').all()[:1000]
for post in posts:
    _ = post.author.name
print(f"Optimized: {time.time() - start:.2f}s")
# → Optimized: 0.05s (2 queries)

# 69배 빠름!!

데이터 수  │  N+1 쿼리 수  │  응답 시간    │  최적화 후
──────────┼──────────────┼─────────────┼──────────
10        │  11          │  0.05s      │  0.01s
100       │  101         │  0.35s      │  0.02s
1,000     │  1,001       │  3.5s       │  0.05s
10,000    │  10,001      │  35s        │  0.1s
100,000   │  100,001     │  5분+ 💀    │  0.3s

N+1의 종류

1. ForeignKey N+1 (1:N의 "1" 쪽)

# Post → Author (Many-to-One)
# 게시글에서 작성자 조회

# ❌ N+1
for post in Post.objects.all():
    print(post.author.name)  # 매번 SELECT

# ✅ select_related (SQL JOIN)
for post in Post.objects.select_related('author'):
    print(post.author.name)  # JOIN으로 한 번에!

생성되는 SQL:

-- select_related → INNER JOIN
SELECT posts.*, authors.name
FROM posts
INNER JOIN authors ON posts.author_id = authors.id
LIMIT 100;
-- 1번의 쿼리!

2. Reverse ForeignKey N+1 (1:N의 "N" 쪽)

# Author → Posts (One-to-Many)
# 작성자별 게시글 목록 조회

# ❌ N+1
authors = Author.objects.all()[:50]
for author in authors:
    posts = author.post_set.all()  # 매번 SELECT
    print(f"{author.name}: {posts.count()} posts")

# ✅ prefetch_related (별도 쿼리 + Python 매핑)
authors = Author.objects.prefetch_related('post_set').all()[:50]
for author in authors:
    posts = author.post_set.all()  # 캐시된 결과 사용!
    print(f"{author.name}: {len(posts)} posts")

생성되는 SQL:

-- prefetch_related → 2번의 쿼리
SELECT id, name FROM authors LIMIT 50;
SELECT id, title, author_id FROM posts WHERE author_id IN (1, 2, 3, ..., 50);
-- Python에서 매핑!

3. ManyToMany N+1

class Post(models.Model):
    tags = models.ManyToManyField('Tag')

# ❌ N+1
for post in Post.objects.all():
    tag_names = [t.name for t in post.tags.all()]  # 매번 SELECT

# ✅ prefetch_related
for post in Post.objects.prefetch_related('tags'):
    tag_names = [t.name for t in post.tags.all()]  # 캐시!

4. 중첩 N+1 (가장 위험!)

# Post → Author → Company (3단계 중첩)

# ❌ N+1+1 (게시글 N + 작성자 N + 회사 N)
for post in Post.objects.all():
    print(f"{post.title} by {post.author.name} at {post.author.company.name}")
    # 게시글마다: author SELECT + company SELECT

# ✅ select_related 체이닝
for post in Post.objects.select_related('author__company'):
    print(f"{post.title} by {post.author.name} at {post.author.company.name}")

-- 3중 JOIN, 1번의 쿼리!
SELECT posts.*, authors.*, companies.*
FROM posts
JOIN authors ON posts.author_id = authors.id
JOIN companies ON authors.company_id = companies.id;

select_related vs prefetch_related

select_related:
  ├── SQL JOIN으로 한 번에 가져옴
  ├── ForeignKey / OneToOneField에만 사용
  ├── 1번의 쿼리 (JOIN)
  └── 데이터가 적을 때 효율적

prefetch_related:
  ├── 별도 쿼리 + Python에서 매핑
  ├── ManyToMany / Reverse FK에 사용
  ├── 2번의 쿼리 (메인 + IN)
  └── 데이터가 많거나 복잡한 관계에서 효율적

# 둘 다 같이 사용 가능!
posts = Post.objects.select_related(
    'author',           # FK → JOIN
    'author__company',  # FK → JOIN
).prefetch_related(
    'tags',             # M2M → 별도 쿼리
    'comments',         # Reverse FK → 별도 쿼리
).all()[:100]

# 총 4번의 쿼리로 모든 관계 로드!

ORM별 해결법

Django

# select_related (FK, O2O)
Post.objects.select_related('author', 'category')

# prefetch_related (M2M, Reverse FK)
Post.objects.prefetch_related('tags', 'comments')

# Prefetch 객체 (필터링 + 최적화)
from django.db.models import Prefetch

Post.objects.prefetch_related(
    Prefetch(
        'comments',
        queryset=Comment.objects.filter(is_approved=True).select_related('user'),
        to_attr='approved_comments'  # post.approved_comments로 접근
    )
)

# annotate로 COUNT 최적화 (서브쿼리 대신 집계)
from django.db.models import Count
Author.objects.annotate(post_count=Count('post')).filter(post_count__gte=5)

SQLAlchemy (Python)

from sqlalchemy.orm import joinedload, selectinload, subqueryload

# joinedload = select_related (JOIN)
session.query(Post).options(joinedload(Post.author)).all()

# selectinload = prefetch_related (SELECT ... IN)
session.query(Author).options(selectinload(Author.posts)).all()

# 중첩
session.query(Post).options(
    joinedload(Post.author).joinedload(Author.company),
    selectinload(Post.tags)
).all()

# 관계 로딩 전략 (모델에서 설정)
class Post(Base):
    author = relationship("Author", lazy="joined")     # 항상 JOIN
    tags = relationship("Tag", lazy="selectin")         # 항상 IN 쿼리
    comments = relationship("Comment", lazy="dynamic")  # 명시적 쿼리만

JPA / Hibernate (Java/Kotlin)

// ❌ Lazy Loading → N+1 기본 발생!
@Entity
public class Post {
    @ManyToOne(fetch = FetchType.LAZY)  // 기본값이 LAZY
    private Author author;
}

// ✅ 해결 1: JPQL JOIN FETCH
@Query("SELECT p FROM Post p JOIN FETCH p.author")
List<Post> findAllWithAuthor();

// ✅ 해결 2: EntityGraph
@EntityGraph(attributePaths = {"author", "tags"})
List<Post> findAll();

// ✅ 해결 3: Batch Size (Hibernate)
@BatchSize(size = 100)  // IN 절로 100개씩 묶어서 조회
@OneToMany(mappedBy = "author")
private List<Post> posts;

// application.yml
// spring.jpa.properties.hibernate.default_batch_fetch_size: 100

Prisma (TypeScript/Node.js)

// ❌ N+1
const posts = await prisma.post.findMany()
for (const post of posts) {
  const author = await prisma.author.findUnique({
    where: { id: post.authorId },
  })
}

// ✅ include (Eager Loading)
const posts = await prisma.post.findMany({
  include: {
    author: true,
    tags: true,
    comments: {
      where: { approved: true },
      include: { user: true },
    },
  },
})

ActiveRecord (Ruby on Rails)

# ❌ N+1
Post.all.each { |post| puts post.author.name }

# ✅ includes (자동으로 최적 전략 선택)
Post.includes(:author).each { |post| puts post.author.name }

# ✅ eager_load (항상 JOIN)
Post.eager_load(:author, :tags).all

# ✅ preload (항상 별도 쿼리)
Post.preload(:comments).all

# Bullet gem으로 자동 탐지!
# Gemfile: gem 'bullet'

N+1 탐지 방법

# settings.py
INSTALLED_APPS = ['debug_toolbar']
MIDDLEWARE = ['debug_toolbar.middleware.DebugToolbarMiddleware']

# → 브라우저에서 SQL 쿼리 수 + 중복 확인!

2. 로그에서 쿼리 수 카운팅

# Django: 쿼리 수 자동 로깅 미들웨어
from django.db import connection, reset_queries
import logging

logger = logging.getLogger(__name__)

class QueryCountMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        reset_queries()
        response = self.get_response(request)
        query_count = len(connection.queries)

        if query_count > 10:  # 임계값
            logger.warning(
                f"⚠️ {request.path}: {query_count} queries! "
                f"Possible N+1 problem."
            )
            # 중복 쿼리 찾기
            queries = [q['sql'] for q in connection.queries]
            from collections import Counter
            dupes = {q: c for q, c in Counter(queries).items() if c > 1}
            if dupes:
                logger.warning(f"Duplicate queries: {dupes}")

        return response

3. pytest로 자동 탐지

import pytest
from django.test.utils import override_settings

@pytest.mark.django_db
def test_no_n_plus_1_on_post_list(client, django_assert_max_num_queries):
    # 게시글 100개 생성
    author = Author.objects.create(name="Test")
    for i in range(100):
        Post.objects.create(title=f"Post {i}", author=author)

    # 쿼리 수 제한 (100개 게시글인데 10번 이하여야 함)
    with django_assert_max_num_queries(10):
        response = client.get('/api/posts/')
        assert response.status_code == 200

4. SQLAlchemy 이벤트 리스너

from sqlalchemy import event
import warnings

@event.listens_for(Engine, "before_cursor_execute")
def receive_before_cursor_execute(conn, cursor, statement, *args):
    if not hasattr(conn, '_query_count'):
        conn._query_count = 0
    conn._query_count += 1
    if conn._query_count > 50:
        warnings.warn(f"Over 50 queries in one request! ({conn._query_count})")

GraphQL의 N+1 (DataLoader)

# GraphQL은 N+1의 온상!
# resolver가 필드별로 호출되므로

# ❌ N+1 발생
class PostType:
    def resolve_author(post, info):
        return Author.objects.get(id=post.author_id)  # 매번 SELECT!

# ✅ DataLoader로 배칭
from promise import Promise
from promise.dataloader import DataLoader

class AuthorLoader(DataLoader):
    def batch_load_fn(self, author_ids):
        # 한 번에 모두 조회!
        authors = {a.id: a for a in Author.objects.filter(id__in=author_ids)}
        return Promise.resolve([authors.get(id) for id in author_ids])

# Request마다 새 loader 생성
author_loader = AuthorLoader()

class PostType:
    def resolve_author(post, info):
        return author_loader.load(post.author_id)  # 배칭!

정리: N+1 해결 의사결정 트리

관계 타입이 뭔가?
├── ForeignKey / OneToOne
│   └── select_related (JOIN) ✅
│
├── ManyToMany / Reverse FK
│   └── prefetch_related (IN 쿼리) ✅
│
├── 중첩 관계 (A → B → C)
│   └── select_related 체이닝 ('b__c') ✅
│
├── 조건부 프리페치
│   └── Prefetch 객체 + queryset 필터 ✅
│
└── COUNT/SUM만 필요
    └── annotate + 집계 함수 ✅ (프리페치보다 효율적)

📝 퀴즈 — N+1 문제 (클릭해서 확인!)

Q1. N+1 문제에서 N과 1은 각각 무엇인가? ||1: 메인 엔티티 목록을 가져오는 첫 번째 쿼리. N: 각 엔티티의 연관 데이터를 개별 조회하는 N번의 추가 쿼리. 총 1+N번의 쿼리 발생||

Q2. select_related와 prefetch_related의 차이는? ||select_related: SQL JOIN으로 한 번에 조회 (FK/O2O용). prefetch_related: 별도 쿼리 후 Python에서 매핑 (M2M/Reverse FK용)||

Q3. 게시글 1,000개를 N+1로 조회하면 몇 번의 쿼리가 발생하나? ||1,001번 (목록 1번 + 작성자 1,000번). 최적화하면 2번 (select_related) 또는 2번 (prefetch_related)||

Q4. JPA에서 Lazy Loading이 기본인 이유와 문제점은? ||이유: 불필요한 연관 데이터를 미리 로딩하지 않아 메모리 절약. 문제점: 루프에서 접근 시 N+1 발생. JOIN FETCH 또는 EntityGraph로 해결||

Q5. GraphQL에서 N+1이 특히 심각한 이유와 해결책은? ||이유: resolver가 필드별로 독립 호출되어 각 Post의 author를 개별 조회. 해결: DataLoader로 같은 턴의 요청을 모아 배치 처리 (IN 쿼리 1번)||

Q6. 중첩 N+1 (Post → Author → Company)에서 총 쿼리 수는? ||최악의 경우 1 + N + N = 2N+1. 게시글 100개면 201번 쿼리. select_related('author__company')로 1번의 JOIN 쿼리로 해결||

Q7. annotate가 prefetch_related보다 효율적인 경우는? ||연관 데이터의 전체가 아닌 COUNT, SUM 같은 집계만 필요할 때. prefetch_related는 모든 데이터를 가져와 Python에서 처리하지만, annotate는 DB에서 집계 후 결과만 반환||

Q8. N+1을 자동으로 탐지하는 방법 3가지는? ||1) Django Debug Toolbar (브라우저에서 쿼리 수 확인) 2) 쿼리 카운팅 미들웨어 (임계값 초과 시 경고) 3) pytest django_assert_max_num_queries (테스트에서 쿼리 수 제한)||