Spring Boot Embedded Tomcat Complete Guide: Configuration, Optimization, and Production Tuning

Table of Contents

  1. Embedded Tomcat Architecture
  2. Core Configuration Options
  3. SSL/TLS Configuration
  4. Running HTTP and HTTPS Simultaneously
  5. Thread Pool Tuning
  6. Switching to Undertow
  7. Custom ErrorPage and FilterRegistration
  8. Graceful Shutdown
  9. Quiz

1. Embedded Tomcat Architecture

Spring Boot bundles Embedded Tomcat by default, enabling you to package your application as a self-contained executable JAR without a separate application server installation. Adding spring-boot-starter-web automatically includes Tomcat.

spring-boot-starter-web Dependency Tree

spring-boot-starter-web
└── spring-boot-starter-tomcat
    ├── tomcat-embed-core
    ├── tomcat-embed-el
    └── tomcat-embed-websocket

Embedded Server Comparison

Feature               Tomcat       Undertow     Jetty
Bundled by default    Yes          No           No
Memory footprint      Medium       Low          Low
HTTP/2 support        Yes          Yes          Yes
WebSocket             Yes          Yes          Yes
Throughput            High         Very High    High
Maturity/Stability    Very High    High         High
Community             Very Large   Medium       Large

Tomcat is the most battle-tested servlet container and the default choice for most enterprise Spring Boot applications. Undertow shines in microservices requiring low memory usage and high concurrency.


2. Core Configuration Options

Nearly all Embedded Tomcat settings are controllable via application.yml.

Complete Configuration Example

server:
  port: 8080
  servlet:
    context-path: /api
    session:
      timeout: 30m
      cookie:
        http-only: true
        secure: true
  tomcat:
    threads:
      max: 200 # Maximum worker threads (default: 200)
      min-spare: 10 # Minimum idle threads (default: 10)
    max-connections: 8192 # Maximum simultaneous connections
    accept-count: 100 # Queue size when max-connections is reached
    connection-timeout: 20000 # Connection timeout in milliseconds
    max-http-form-post-size: 2MB
    max-swallow-size: 2MB
    uri-encoding: UTF-8
    accesslog:
      enabled: true
      directory: logs
      prefix: access_log
      suffix: .txt
      pattern: combined
      rotate: true
  shutdown: graceful
  http2:
    enabled: true
  compression:
    enabled: true
    mime-types: text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/json
    min-response-size: 1024
  error:
    include-message: always
    include-binding-errors: always

Key Parameters Explained

threads.max (Maximum Threads)

  • Default: 200
  • The maximum number of requests processed concurrently
  • Tune based on CPU core count and application I/O characteristics

max-connections

  • Default: 8192
  • Maximum number of TCP connections maintained simultaneously by the NIO connector
  • Should be set much higher than threads.max to efficiently manage Keep-Alive connections

accept-count

  • Default: 100
  • OS-level socket backlog queue size when max-connections is exhausted
  • Connection attempts beyond this limit may be refused or time out, depending on the operating system

connection-timeout

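  • Default: Spring Boot sets none, so Tomcat's connector default applies (60 seconds according to the Tomcat documentation; 20000 ms is the value in the server.xml shipped with standalone Tomcat)
  • Time the connector waits, after accepting a connection, for the client to send the request line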
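The way threads.max, max-connections, and accept-count combine can be sketched as simple arithmetic. The following is a rough illustration in plain Java, not Tomcat code; the class and method names are hypothetical, and the result is an approximation because the OS backlog size is only a hint to the kernel.

```java
/** Rough capacity arithmetic for a Tomcat NIO connector (illustrative only). */
class ConnectorCapacity {

    /** Requests that can be in service at once: bounded by worker threads and open connections. */
    static int concurrentRequests(int threadsMax, int maxConnections) {
        return Math.min(threadsMax, maxConnections);
    }

    /** Connections that can stay open while idle (Keep-Alive) or waiting for a worker thread. */
    static int waitingConnections(int threadsMax, int maxConnections) {
        return Math.max(0, maxConnections - threadsMax);
    }

    /** Approximate number of connections before new attempts are refused or time out. */
    static int totalBeforeRejection(int maxConnections, int acceptCount) {
        return maxConnections + acceptCount;
    }

    public static void main(String[] args) {
        // Values taken from the configuration example above.
        int threadsMax = 200;
        int maxConnections = 8192;
        int acceptCount = 100;

        System.out.println("in service:       " + concurrentRequests(threadsMax, maxConnections));
        System.out.println("open but waiting: " + waitingConnections(threadsMax, maxConnections));
        System.out.println("before rejection: " + totalBeforeRejection(maxConnections, acceptCount));
    }
}
```

With the values from the configuration example this yields 200 requests in service, 7992 connections open but waiting, and roughly 8292 connections before new attempts start failing.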

3. SSL/TLS Configuration

Generating a Self-Signed Certificate for Testing

keytool -genkeypair -alias tomcat -keyalg RSA -keysize 2048 \
  -storetype PKCS12 -keystore keystore.p12 -validity 3650 \
  -storepass changeit -dname "CN=localhost, OU=Dev, O=Example, L=New York, S=NY, C=US"

application.yml SSL Configuration

server:
  port: 8443
  ssl:
    key-store: classpath:keystore.p12
    key-store-password: changeit
    key-store-type: PKCS12
    key-alias: tomcat
    enabled: true
    protocol: TLS
    enabled-protocols: TLSv1.2,TLSv1.3
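    # TLS 1.3 suites first, followed by ECDHE suites so that TLS 1.2 clients can still negotiate
    ciphers: TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256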
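A misspelled or unsupported cipher suite name in the ciphers list is easy to miss. As a minimal sketch using only the JDK, the check below compares a configured list against what the running JVM supports. The class name CipherSuiteCheck and its methods are hypothetical helpers, not part of Spring Boot or Tomcat.

```java
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.net.ssl.SSLContext;

/** Checks configured cipher suite names against the suites the current JVM supports. */
class CipherSuiteCheck {

    /** Returns the configured names that are absent from the supported set, in order. */
    static List<String> unsupported(List<String> configured, Set<String> supported) {
        List<String> missing = new ArrayList<>();
        for (String name : configured) {
            if (!supported.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Cipher suites supported by the JVM's default SSLContext. */
    static Set<String> jvmSupportedSuites() throws NoSuchAlgorithmException {
        String[] suites = SSLContext.getDefault().getSupportedSSLParameters().getCipherSuites();
        return new HashSet<>(Arrays.asList(suites));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // The TLS 1.3 suites from the configuration above.
        List<String> configured = Arrays.asList(
                "TLS_AES_256_GCM_SHA384",
                "TLS_CHACHA20_POLY1305_SHA256",
                "TLS_AES_128_GCM_SHA256");
        System.out.println("Unsupported on this JVM: " + unsupported(configured, jvmSupportedSuites()));
    }
}
```

The result depends on the JDK version and security providers in use, so it is worth running on the same runtime image that serves production traffic.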

Integrating Let's Encrypt Certificates

Convert Let's Encrypt PEM certificates to PKCS12 format for use with Tomcat.

# Convert PEM to PKCS12
openssl pkcs12 -export \
  -in /etc/letsencrypt/live/yourdomain.com/fullchain.pem \
  -inkey /etc/letsencrypt/live/yourdomain.com/privkey.pem \
  -out keystore.p12 \
  -name tomcat \
  -passout pass:yourpassword

In production, inject the keystore password via environment variables for security.

server:
  ssl:
    key-store: file:/etc/ssl/keystore.p12
    key-store-password: ${SSL_KEYSTORE_PASSWORD}

4. Running HTTP and HTTPS Simultaneously

Spring Boot supports a single port by default, but you can customize TomcatServletWebServerFactory to open both HTTP and HTTPS ports.

Opening Both Ports Simultaneously

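import org.apache.catalina.connector.Connector;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Package names are those of Spring Boot 2.x/3.x.
@Configuration
public class TomcatConfig {

    // The HTTPS connector comes from the server.ssl.* properties; this adds a plain HTTP one.
    @Bean
    public TomcatServletWebServerFactory tomcatServletWebServerFactory() {
        TomcatServletWebServerFactory factory = new TomcatServletWebServerFactory();
        factory.addAdditionalTomcatConnectors(createHttpConnector());
        return factory;
    }

    private Connector createHttpConnector() {
        Connector connector = new Connector(TomcatServletWebServerFactory.DEFAULT_PROTOCOL);
        connector.setScheme("http");
        connector.setPort(8080);
        connector.setSecure(false);
        // Port used when a request on this connector must be redirected to a secure channel
        connector.setRedirectPort(8443);
        return connector;
    }
}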

HTTP to HTTPS Automatic Redirect

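import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            // Redirect every plain HTTP request to HTTPS.
            // Spring Security's default port mapping sends 8080 to 8443 and 80 to 443.
            .requiresChannel(channel ->
                channel.anyRequest().requiresSecure()
            );
        return http.build();
    }
}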

5. Thread Pool Tuning

Sizing Formulas

I/O Bound Applications (Database, External API calls)

Optimal thread count = CPU cores x (1 + Wait time / Processing time)

CPU Bound Applications (Image processing, Encryption)

Optimal thread count = CPU cores + 1

Most web applications are I/O bound, so starting with 10-20x CPU core count is reasonable. Use load testing tools like JMeter or Gatling to find the optimal value for your workload.
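The two sizing formulas above can be turned into a small calculator. This is a minimal sketch in plain Java; the class name ThreadPoolSizing and the sample wait and processing times are assumptions for illustration, and real values should come from profiling or load tests.

```java
/** Starting-point estimates for server.tomcat.threads.max based on the formulas above. */
class ThreadPoolSizing {

    /** I/O bound: cores x (1 + wait time / processing time), rounded up. */
    static int ioBound(int cores, double waitMillis, double processingMillis) {
        if (cores <= 0 || waitMillis < 0 || processingMillis <= 0) {
            throw new IllegalArgumentException("cores and processing time must be positive, wait time non-negative");
        }
        return (int) Math.ceil(cores * (1 + waitMillis / processingMillis));
    }

    /** CPU bound: cores + 1. */
    static int cpuBound(int cores) {
        if (cores <= 0) {
            throw new IllegalArgumentException("cores must be positive");
        }
        return cores + 1;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumed profile: each request waits 90 ms on I/O and uses 10 ms of CPU.
        System.out.println("I/O bound estimate: " + ioBound(cores, 90, 10));
        System.out.println("CPU bound estimate: " + cpuBound(cores));
    }
}
```

For example, 8 cores with 90 ms of waiting per 10 ms of processing gives 8 x (1 + 9) = 80 threads, which sits at the low end of the 10-20x range mentioned above.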

Applying Java 21 Virtual Threads

On Java 21 with Spring Boot 3.2 or later, Virtual Threads significantly reduce the need for manual thread pool tuning.

spring:
  threads:
    virtual:
      enabled: true

Or configure directly in code:

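import java.util.concurrent.Executors;
import org.springframework.boot.web.embedded.tomcat.TomcatProtocolHandlerCustomizer;
import org.springframework.context.annotation.Bean;

// Declare inside a @Configuration class; each request then runs on its own virtual thread.
@Bean
public TomcatProtocolHandlerCustomizer<?> protocolHandlerVirtualThreadExecutorCustomizer() {
    return protocolHandler -> {
        protocolHandler.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
    };
}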

Monitoring Threads with Actuator

management:
  endpoints:
    web:
      exposure:
        include: metrics,health,threaddump

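Monitor thread usage in real time via /actuator/metrics/tomcat.threads.busy and /actuator/metrics/tomcat.threads.config.max. These Tomcat thread metrics are registered only when Tomcat's MBean registry is enabled (server.tomcat.mbeanregistry.enabled=true), which is off by default.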


6. Switching to Undertow

Dependency Change

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-tomcat</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-undertow</artifactId>
</dependency>

Undertow Configuration Options

server:
  undertow:
    threads:
      worker: 200 # Worker threads (default: 8 x I/O threads)
      io: 4 # I/O threads (default: derived from the number of available processors)
    buffer-size: 16384 # Buffer size in bytes
    direct-buffers: true
    max-http-post-size: 10MB
    accesslog:
      enabled: true
      dir: logs
      pattern: combined

Tomcat vs Undertow Performance

Undertow's non-blocking I/O architecture can deliver lower memory consumption and higher throughput than Tomcat in high-concurrency environments. The difference tends to be most pronounced in WebSocket-heavy or server-push workloads. For standard REST APIs, the performance gap between the two servers is relatively small, so benchmark with your own workload before switching.


7. Custom ErrorPage and FilterRegistration

Customizing TomcatServletWebServerFactory

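import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.ErrorPage;
import org.springframework.boot.web.servlet.server.ConfigurableServletWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.HttpStatus;

// Package names are those of Spring Boot 2.x/3.x.
@Configuration
public class WebServerConfig {

    @Bean
    public ConfigurableServletWebServerFactory webServerFactory() {
        TomcatServletWebServerFactory factory = new TomcatServletWebServerFactory();

        // Register custom error pages (by status code and by exception type)
        factory.addErrorPages(
            new ErrorPage(HttpStatus.NOT_FOUND, "/error/404"),
            new ErrorPage(HttpStatus.INTERNAL_SERVER_ERROR, "/error/500"),
            new ErrorPage(Exception.class, "/error/general")
        );

        // Customize Tomcat context
        factory.addContextCustomizers(context -> {
            context.setSessionTimeout(30); // minutes
            context.setUseHttpOnly(true);
        });

        return factory;
    }
}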

Registering Custom Filters

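import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;

// RequestLoggingFilter is an application-defined servlet Filter (not shown here).
@Bean
public FilterRegistrationBean<RequestLoggingFilter> loggingFilter() {
    FilterRegistrationBean<RequestLoggingFilter> registrationBean = new FilterRegistrationBean<>();
    registrationBean.setFilter(new RequestLoggingFilter());
    // URL patterns are matched relative to server.servlet.context-path
    registrationBean.addUrlPatterns("/api/*");
    // Lower values run earlier in the filter chain
    registrationBean.setOrder(1);
    return registrationBean;
}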

Proxy/Load Balancer Configuration

When running behind a reverse proxy (nginx, AWS ALB), configure the RemoteIP valve to correctly resolve client IP addresses and protocol.

server:
  tomcat:
    remoteip:
      remote-ip-header: x-forwarded-for
      protocol-header: x-forwarded-proto

8. Graceful Shutdown

Configuration

server:
  shutdown: graceful

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s

With graceful shutdown enabled, on receiving SIGTERM, Tomcat stops accepting new requests and waits up to 30 seconds for in-flight requests to complete before shutting down.
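The stop-accepting-then-drain behavior described above follows the same pattern as shutting down a JDK ExecutorService. The sketch below is an analogy in plain Java, not Spring Boot's or Tomcat's implementation; GracefulShutdownSketch and simulatedRequest are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/** Analogy for graceful shutdown: reject new work, drain in-flight work, give up after a timeout. */
class GracefulShutdownSketch {

    /** Returns true if all in-flight tasks finished within the timeout. */
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown(); // stop accepting new tasks; running tasks continue
        if (pool.awaitTermination(timeout, unit)) {
            return true;
        }
        pool.shutdownNow(); // timeout elapsed: interrupt whatever is still running
        return false;
    }

    /** Stands in for a request that takes the given time to complete. */
    static Callable<Void> simulatedRequest(long millis) {
        return () -> {
            Thread.sleep(millis);
            return null;
        };
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(simulatedRequest(200)); // an in-flight request

        boolean drained = shutdownGracefully(pool, 2, TimeUnit.SECONDS);
        System.out.println("drained in time: " + drained);

        try {
            pool.submit(simulatedRequest(10)); // arrives after shutdown started
        } catch (RejectedExecutionException e) {
            System.out.println("new work rejected after shutdown");
        }
    }
}
```

The timeout passed to shutdownGracefully plays the role of timeout-per-shutdown-phase: work that has not finished by then is cut off rather than waited on indefinitely.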

Kubernetes Integration

# kubernetes deployment.yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60 # Kubernetes-level termination wait
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                command: ['/bin/sh', '-c', 'sleep 5'] # Wait for load balancer removal

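Set Kubernetes terminationGracePeriodSeconds at least 5-10 seconds longer than the preStop delay plus Spring Boot's timeout-per-shutdown-phase. The grace period starts counting when the preStop hook begins, so both must fit inside it for a clean shutdown sequence.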
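The timing relationship can be checked with a few lines of arithmetic. This is a minimal sketch in plain Java, assuming a single shutdown phase that uses its full timeout; ShutdownBudget and its method names are hypothetical.

```java
import java.time.Duration;

/** Checks that a Kubernetes grace period leaves room for preStop and Spring Boot's shutdown timeout. */
class ShutdownBudget {

    /** Smallest grace period that covers the preStop delay, the shutdown phase timeout, and a buffer. */
    static Duration minimumGracePeriod(Duration preStop, Duration shutdownPhaseTimeout, Duration buffer) {
        return preStop.plus(shutdownPhaseTimeout).plus(buffer);
    }

    /** True when terminationGracePeriodSeconds is at least the minimum computed above. */
    static boolean fits(Duration gracePeriod, Duration preStop, Duration shutdownPhaseTimeout, Duration buffer) {
        return gracePeriod.compareTo(minimumGracePeriod(preStop, shutdownPhaseTimeout, buffer)) >= 0;
    }

    public static void main(String[] args) {
        Duration gracePeriod = Duration.ofSeconds(60); // terminationGracePeriodSeconds
        Duration preStop = Duration.ofSeconds(5);      // preStop sleep
        Duration phase = Duration.ofSeconds(30);       // timeout-per-shutdown-phase
        Duration buffer = Duration.ofSeconds(10);      // assumed safety margin

        System.out.println("minimum grace period: " + minimumGracePeriod(preStop, phase, buffer).getSeconds() + "s");
        System.out.println("configured value fits: " + fits(gracePeriod, preStop, phase, buffer));
    }
}
```

With the values used in this section the minimum is 45 seconds, so the configured 60 seconds leaves comfortable headroom. If the application has several shutdown phases that each approach the timeout, the budget needs to grow accordingly.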

Health Probe Integration

management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true

Configure the readiness probe so that the pod reports OUT_OF_SERVICE and is removed from the load balancer before the actual shutdown begins.


9. Quiz

Q1. Which setting controls the maximum number of simultaneous connections in Spring Boot Tomcat?

Answer: server.tomcat.max-connections

Explanation: max-connections controls the maximum number of connections the Tomcat NIO connector maintains simultaneously. The default is 8192. This value should be set much higher than threads.max to efficiently handle Keep-Alive connections. accept-count defines the OS-level socket backlog queue size when max-connections is exhausted.

Q2. What happens after Graceful Shutdown receives a termination signal?

Answer: New requests are rejected, and the server waits up to the configured timeout-per-shutdown-phase duration for in-flight requests to complete before shutting down.

Explanation: With server.shutdown=graceful, upon receiving SIGTERM, Tomcat stops accepting new connections and waits for active requests to finish. spring.lifecycle.timeout-per-shutdown-phase defines the maximum wait time. In Kubernetes environments, terminationGracePeriodSeconds must be set higher than Spring Boot's timeout to ensure the full graceful shutdown sequence completes.

Q3. How do you apply Java 21 Virtual Threads to Spring Boot Tomcat?

Answer: Set spring.threads.virtual.enabled=true, or configure a TomcatProtocolHandlerCustomizer bean with Executors.newVirtualThreadPerTaskExecutor() as the executor.

Explanation: Virtual Threads, introduced as a stable feature in JDK 21, are lightweight threads that do not block OS threads during blocking I/O operations, making them well-suited for high-concurrency request handling. Spring Boot 3.2+ enables Virtual Threads simply via the spring.threads.virtual.enabled=true property.

Q4. What must you do when replacing Tomcat with Undertow?

Answer: Exclude spring-boot-starter-tomcat from spring-boot-starter-web, then add spring-boot-starter-undertow.

Explanation: Spring Boot's auto-configuration detects the embedded server implementations on the classpath. When both Tomcat and Undertow are present, it prefers Tomcat, so simply adding Undertow has no effect. Excluding spring-boot-starter-tomcat leaves Undertow as the only candidate, which makes the replacement work correctly.

Q5. What is the difference between server.tomcat.threads.max and server.tomcat.max-connections?

Answer: threads.max is the maximum number of threads that actively process requests. max-connections is the maximum number of TCP connections maintained simultaneously.

Explanation: Thanks to HTTP/1.1 Keep-Alive, a single TCP connection can handle multiple sequential requests. Therefore, max-connections is typically set much higher than threads.max. For example, with 200 threads and 8192 connections, Tomcat maintains up to 8192 TCP connections while processing up to 200 requests concurrently. accept-count is the additional OS-level queue for incoming connections once max-connections is reached.