Spring Boot Embedded Tomcat Complete Guide: Configuration, Optimization, and Production Tuning
Author: Youngju Kim (@fjvbn20031)
Table of Contents
- Embedded Tomcat Architecture
- Core Configuration Options
- SSL/TLS Configuration
- Running HTTP and HTTPS Simultaneously
- Thread Pool Tuning
- Switching to Undertow
- Custom ErrorPage and FilterRegistration
- Graceful Shutdown
- Quiz
1. Embedded Tomcat Architecture
Spring Boot bundles Embedded Tomcat by default, enabling you to package your application as a self-contained executable JAR without a separate application server installation. Adding spring-boot-starter-web automatically includes Tomcat.
spring-boot-starter-web Dependency Tree
spring-boot-starter-web
└── spring-boot-starter-tomcat
    ├── tomcat-embed-core
    ├── tomcat-embed-el
    └── tomcat-embed-websocket
Embedded Server Comparison
| Feature | Tomcat | Undertow | Jetty |
|---|---|---|---|
| Bundled by default | Yes | No | No |
| Memory footprint | Medium | Low | Low |
| HTTP/2 support | Yes | Yes | Yes |
| WebSocket | Yes | Yes | Yes |
| Throughput | High | Very High | High |
| Maturity/Stability | Very High | High | High |
| Community | Very Large | Medium | Large |
Tomcat is the most battle-tested servlet container and the default choice for most enterprise Spring Boot applications. Undertow shines in microservices requiring low memory usage and high concurrency.
2. Core Configuration Options
Nearly all Embedded Tomcat settings are controllable via application.yml.
Complete Configuration Example
server:
  port: 8080
  servlet:
    context-path: /api
    session:
      timeout: 30m
      cookie:
        http-only: true
        secure: true
  tomcat:
    threads:
      max: 200 # Maximum worker threads (default: 200)
      min-spare: 10 # Minimum idle threads (default: 10)
    max-connections: 8192 # Maximum simultaneous connections
    accept-count: 100 # Queue size when max-connections is reached
    connection-timeout: 20000 # Connection timeout in milliseconds
    max-http-form-post-size: 2MB
    max-swallow-size: 2MB
    uri-encoding: UTF-8
    accesslog:
      enabled: true
      directory: logs
      prefix: access_log
      suffix: .txt
      pattern: combined
      rotate: true
  shutdown: graceful
  http2:
    enabled: true
  compression:
    enabled: true
    mime-types: text/html,text/xml,text/plain,text/css,text/javascript,application/javascript,application/json
    min-response-size: 1024
  error:
    include-message: always
    include-binding-errors: always
Key Parameters Explained
threads.max (Maximum Threads)
- Default: 200
- The maximum number of requests processed concurrently
- Tune based on CPU core count and application I/O characteristics
max-connections
- Default: 8192
- Maximum number of TCP connections maintained simultaneously by the NIO connector
- Should be set much higher than threads.max to efficiently manage Keep-Alive connections
accept-count
- Default: 100
- OS-level socket backlog queue size when max-connections is exhausted
- Connection attempts beyond this limit are refused or time out
connection-timeout
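- Default: 60000 ms (60 seconds) when the property is left unset on the embedded connector; 20000 ms is the value set by the server.xml shipped with standalone Tomcat
- How long the connector waits, after accepting a connection, for the client to send the request line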
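To see how threads.max, max-connections, and accept-count interact, here is a deliberately simplified plain-Java model. It is not Tomcat code: the `ConnectorCapacity` class and its `admit` method are hypothetical names for illustration, and the model assumes every connection carries an active request (idle Keep-Alive connections would hold a connection slot without occupying a thread).

```java
public final class ConnectorCapacity {

    /** How a burst of simultaneous connection attempts is split across the limits. */
    public static final class Admission {
        public final int processing; // being served by a worker thread
        public final int waiting;    // accepted by the connector, waiting for a free thread
        public final int backlog;    // parked in the OS accept queue
        public final int rejected;   // beyond every limit: refused or timed out

        Admission(int processing, int waiting, int backlog, int rejected) {
            this.processing = processing;
            this.waiting = waiting;
            this.backlog = backlog;
            this.rejected = rejected;
        }
    }

    public static Admission admit(int attempts, int maxThreads, int maxConnections, int acceptCount) {
        int accepted = Math.min(attempts, maxConnections);        // capped by max-connections
        int processing = Math.min(accepted, maxThreads);          // capped by threads.max
        int waiting = accepted - processing;
        int backlog = Math.min(attempts - accepted, acceptCount); // capped by accept-count
        int rejected = attempts - accepted - backlog;
        return new Admission(processing, waiting, backlog, rejected);
    }

    public static void main(String[] args) {
        // Values taken from the configuration example: 200 threads, 8192 connections, backlog of 100
        Admission a = admit(9000, 200, 8192, 100);
        System.out.printf("processing=%d waiting=%d backlog=%d rejected=%d%n",
                a.processing, a.waiting, a.backlog, a.rejected);
    }
}
```

With 9000 simultaneous attempts against the limits from the configuration example, the model puts 200 on worker threads, holds 7992 open connections waiting for a thread, parks 100 in the backlog, and turns away 708.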
3. SSL/TLS Configuration
Generating a Self-Signed Certificate for Testing
keytool -genkeypair -alias tomcat -keyalg RSA -keysize 2048 \
  -storetype PKCS12 -keystore keystore.p12 -validity 3650 \
  -storepass changeit -dname "CN=localhost, OU=Dev, O=Example, L=New York, S=NY, C=US"
application.yml SSL Configuration
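server:
  port: 8443
  ssl:
    key-store: classpath:keystore.p12
    key-store-password: changeit
    key-store-type: PKCS12
    key-alias: tomcat
    enabled: true
    protocol: TLS
    enabled-protocols: TLSv1.2,TLSv1.3
    # The suites below are TLS 1.3 suites only; add TLS 1.2 suites
    # (for example TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) if TLSv1.2 clients must connect
    ciphers: TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_AES_128_GCM_SHA256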
Integrating Let's Encrypt Certificates
Convert Let's Encrypt PEM certificates to PKCS12 format for use with Tomcat.
# Convert PEM to PKCS12
openssl pkcs12 -export \
  -in /etc/letsencrypt/live/yourdomain.com/fullchain.pem \
  -inkey /etc/letsencrypt/live/yourdomain.com/privkey.pem \
  -out keystore.p12 \
  -name tomcat \
  -passout pass:yourpassword
In production, inject the keystore password via environment variables for security.
server:
  ssl:
    key-store: file:/etc/ssl/keystore.p12
    key-store-password: ${SSL_KEYSTORE_PASSWORD}
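Before deploying, it can help to confirm that the keystore actually opens with the injected password and contains the expected alias. The following is a minimal pre-flight sketch using only the JDK's `KeyStore` API; the `KeystoreCheck` class name is hypothetical, and the path and alias simply mirror the examples in this section.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

public final class KeystoreCheck {

    /** Returns true if the PKCS12 file opens with the password and holds a key entry under the alias. */
    public static boolean hasKeyEntry(Path keystore, char[] password, String alias)
            throws IOException, GeneralSecurityException {
        KeyStore ks = KeyStore.getInstance("PKCS12");
        try (InputStream in = Files.newInputStream(keystore)) {
            ks.load(in, password); // fails with an IOException on a wrong password or corrupt file
        }
        return ks.isKeyEntry(alias);
    }

    public static void main(String[] args) throws Exception {
        String password = System.getenv("SSL_KEYSTORE_PASSWORD");
        if (password == null) {
            System.err.println("SSL_KEYSTORE_PASSWORD is not set");
            System.exit(1);
        }
        Path path = Paths.get(args.length > 0 ? args[0] : "/etc/ssl/keystore.p12");
        boolean ok = hasKeyEntry(path, password.toCharArray(), "tomcat");
        System.out.println(ok ? "keystore OK" : "alias 'tomcat' not found");
    }
}
```

Running a check like this in a container entrypoint or CI step surfaces a mismatched password before Tomcat fails at startup.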
4. Running HTTP and HTTPS Simultaneously
Spring Boot supports a single port by default, but you can customize TomcatServletWebServerFactory to open both HTTP and HTTPS ports.
Opening Both Ports Simultaneously
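import org.apache.catalina.connector.Connector;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TomcatConfig {

    // Only one ServletWebServerFactory bean may exist in the application context
    @Bean
    public TomcatServletWebServerFactory tomcatServletWebServerFactory() {
        TomcatServletWebServerFactory factory = new TomcatServletWebServerFactory();
        // The primary (HTTPS) connector comes from server.port and server.ssl; add plain HTTP next to it
        factory.addAdditionalTomcatConnectors(createHttpConnector());
        return factory;
    }

    private Connector createHttpConnector() {
        Connector connector = new Connector(TomcatServletWebServerFactory.DEFAULT_PROTOCOL);
        connector.setScheme("http");
        connector.setPort(8080);
        connector.setSecure(false);
        connector.setRedirectPort(8443); // target port when a request must be upgraded to HTTPS
        return connector;
    }
}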
HTTP to HTTPS Automatic Redirect
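import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            // Redirect every request that arrives over plain HTTP to HTTPS
            .requiresChannel(channel ->
                channel.anyRequest().requiresSecure()
            );
        return http.build();
    }
}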
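Conceptually, the redirect rewrites the scheme and swaps the HTTP port for its HTTPS counterpart. The plain-Java sketch below illustrates that rewrite only; it is not Spring Security's implementation, and the `HttpsRedirect` class and its port mapping (80 to 443, 8080 to 8443) are assumptions chosen to match the connector ports used in this section.

```java
import java.net.URI;
import java.net.URISyntaxException;

public final class HttpsRedirect {

    /** Hypothetical HTTP-to-HTTPS port mapping; returns -1 when no mapping is known. */
    static int httpsPortFor(int httpPort) {
        switch (httpPort) {
            case 80:   return 443;
            case 8080: return 8443;
            default:   return -1;
        }
    }

    /** Returns the HTTPS equivalent of an HTTP URL; non-HTTP URLs are returned unchanged. */
    public static URI toHttps(URI request) {
        if (!"http".equalsIgnoreCase(request.getScheme())) {
            return request;
        }
        int httpPort = request.getPort() == -1 ? 80 : request.getPort();
        int httpsPort = httpsPortFor(httpPort);
        if (httpsPort == -1) {
            throw new IllegalArgumentException("No HTTPS port mapped for " + httpPort);
        }
        int shownPort = httpsPort == 443 ? -1 : httpsPort; // omit the default HTTPS port
        try {
            return new URI("https", request.getUserInfo(), request.getHost(), shownPort,
                    request.getPath(), request.getQuery(), request.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHttps(URI.create("http://localhost:8080/api/users?id=1")));
    }
}
```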
5. Thread Pool Tuning
Sizing Formulas
I/O Bound Applications (Database, External API calls)
Optimal thread count = CPU cores x (1 + Wait time / Processing time)
CPU Bound Applications (Image processing, Encryption)
Optimal thread count = CPU cores + 1
Most web applications are I/O bound, so starting with 10-20x CPU core count is reasonable. Use load testing tools like JMeter or Gatling to find the optimal value for your workload.
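The two formulas translate directly into code. This is a small sketch for producing a starting value, not a substitute for load testing; the `ThreadPoolSizing` class and the sample timings (90 ms waiting, 10 ms processing) are made up for illustration.

```java
public final class ThreadPoolSizing {

    /** I/O-bound estimate: cores x (1 + wait time / processing time), rounded up. */
    public static int ioBound(int cpuCores, double waitMillis, double processingMillis) {
        if (cpuCores <= 0 || waitMillis < 0 || processingMillis <= 0) {
            throw new IllegalArgumentException("cores and processing time must be positive, wait time non-negative");
        }
        return (int) Math.ceil(cpuCores * (1 + waitMillis / processingMillis));
    }

    /** CPU-bound estimate: cores + 1. */
    public static int cpuBound(int cpuCores) {
        if (cpuCores <= 0) {
            throw new IllegalArgumentException("cores must be positive");
        }
        return cpuCores + 1;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Hypothetical request profile: 90 ms waiting on a database, 10 ms of CPU work
        System.out.println("I/O-bound estimate: " + ioBound(cores, 90, 10));
        System.out.println("CPU-bound estimate: " + cpuBound(cores));
    }
}
```

For an 8-core machine with that profile, the I/O-bound estimate is 8 x (1 + 90 / 10) = 80 threads, which sits inside the 10-20x range mentioned above.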
Applying Java 21 Virtual Threads
On Java 21 with Spring Boot 3.2 or later, virtual threads significantly reduce the need for manual thread pool tuning.
spring:
  threads:
    virtual:
      enabled: true
Or configure directly in code:
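import java.util.concurrent.Executors;
import org.springframework.boot.web.embedded.tomcat.TomcatProtocolHandlerCustomizer;
import org.springframework.context.annotation.Bean;

// Declare inside a @Configuration class; run each request on its own virtual thread
@Bean
public TomcatProtocolHandlerCustomizer<?> protocolHandlerVirtualThreadExecutorCustomizer() {
    return protocolHandler -> {
        protocolHandler.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
    };
}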
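To get a feel for why a per-task virtual thread executor copes well with blocking work, the standalone sketch below submits many tasks that each block briefly. It requires JDK 21 and involves no Spring or Tomcat code; the `VirtualThreadDemo` class, the task count, and the sleep duration are arbitrary choices for illustration.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public final class VirtualThreadDemo {

    /** Runs taskCount blocking tasks, one virtual thread each, and returns how many completed. */
    public static int runBlockingTasks(int taskCount, Duration blockTime) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(blockTime); // stands in for a blocking database or HTTP call
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000, Duration.ofMillis(100));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks completed in " + elapsedMs + " ms");
    }
}
```

A fixed pool of 200 platform threads would need at least 50 rounds of 100 ms to work through the same 10,000 tasks, whereas the virtual threads all sleep concurrently.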
Monitoring Threads with Actuator
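management:
  endpoints:
    web:
      exposure:
        include: metrics,health,threaddump
Monitor thread usage in real time via /actuator/metrics/tomcat.threads.busy and /actuator/metrics/tomcat.threads.config.max. These Tomcat meters are registered only when Tomcat's MBean registry is enabled (server.tomcat.mbeanregistry.enabled=true), which is off by default.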
6. Switching to Undertow
Dependency Change
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-tomcat</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-undertow</artifactId>
</dependency>
Undertow Configuration Options
server:
  undertow:
    threads:
      worker: 200 # Worker threads (default: 8 x I/O threads)
      io: 4 # I/O threads (default: derived from the number of available processors)
    buffer-size: 16384 # Buffer size in bytes
    direct-buffers: true
    max-http-post-size: 10MB
    accesslog:
      enabled: true
      dir: logs
      pattern: combined
Tomcat vs Undertow Performance
Undertow's lightweight, non-blocking architecture can deliver lower memory consumption and higher throughput in high-concurrency environments, and the difference tends to be most visible in WebSocket-heavy or server-push workloads. For standard REST APIs the gap between the two servers is usually small, so benchmark your own workload before switching.
7. Custom ErrorPage and FilterRegistration
Customizing TomcatServletWebServerFactory
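import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.ErrorPage;
import org.springframework.boot.web.servlet.server.ConfigurableServletWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.HttpStatus;

@Configuration
public class WebServerConfig {

    // Only one ServletWebServerFactory bean may exist; merge this with any other factory customization
    @Bean
    public ConfigurableServletWebServerFactory webServerFactory() {
        TomcatServletWebServerFactory factory = new TomcatServletWebServerFactory();

        // Register custom error pages
        factory.addErrorPages(
            new ErrorPage(HttpStatus.NOT_FOUND, "/error/404"),
            new ErrorPage(HttpStatus.INTERNAL_SERVER_ERROR, "/error/500"),
            new ErrorPage(Exception.class, "/error/general")
        );

        // Customize Tomcat context
        factory.addContextCustomizers(context -> {
            context.setSessionTimeout(30); // minutes
            context.setUseHttpOnly(true);
        });
        return factory;
    }
}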
Registering Custom Filters
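import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;

// RequestLoggingFilter is your own jakarta.servlet.Filter implementation (not shown here)
@Bean
public FilterRegistrationBean<RequestLoggingFilter> loggingFilter() {
    FilterRegistrationBean<RequestLoggingFilter> registrationBean = new FilterRegistrationBean<>();
    registrationBean.setFilter(new RequestLoggingFilter());
    registrationBean.addUrlPatterns("/api/*"); // matched relative to the context path
    registrationBean.setOrder(1); // lower values run earlier in the filter chain
    return registrationBean;
}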
Proxy/Load Balancer Configuration
When running behind a reverse proxy (nginx, AWS ALB), configure the RemoteIP valve to correctly resolve client IP addresses and protocol.
server:
  tomcat:
    remoteip:
      remote-ip-header: x-forwarded-for
      protocol-header: x-forwarded-proto
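The idea behind resolving the client address from X-Forwarded-For is to walk the header from right to left and skip hops that belong to your own proxies. The sketch below is a simplified illustration of that idea and not the valve's actual code: Tomcat matches proxies with configurable patterns, whereas this hypothetical `ForwardedForResolver` uses an exact-match set of trusted addresses for brevity.

```java
import java.util.Set;

public final class ForwardedForResolver {

    /**
     * Resolves the client IP from the direct peer address and the X-Forwarded-For header.
     * The header is only honored when the direct peer is a trusted proxy.
     */
    public static String resolveClientIp(String remoteAddr, String xForwardedFor, Set<String> trustedProxies) {
        if (xForwardedFor == null || xForwardedFor.trim().isEmpty() || !trustedProxies.contains(remoteAddr)) {
            return remoteAddr; // no header, or the peer could have forged it
        }
        String[] hops = xForwardedFor.split(",");
        // The rightmost entries were appended by the nearest proxies; skip the trusted ones
        for (int i = hops.length - 1; i >= 0; i--) {
            String hop = hops[i].trim();
            if (!hop.isEmpty() && !trustedProxies.contains(hop)) {
                return hop;
            }
        }
        return remoteAddr; // every hop was a trusted proxy
    }

    public static void main(String[] args) {
        Set<String> trusted = new java.util.HashSet<>(java.util.Arrays.asList("10.0.0.5", "10.0.0.9"));
        System.out.println(resolveClientIp("10.0.0.5", "203.0.113.7, 10.0.0.9", trusted));
    }
}
```

Walking from the right matters because a client can prepend arbitrary values to the header; only the entries added by your own proxies are trustworthy.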
8. Graceful Shutdown
Configuration
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
With graceful shutdown enabled, on receiving SIGTERM, Tomcat stops accepting new requests and waits up to 30 seconds for in-flight requests to complete before shutting down.
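The same stop-accepting-then-drain pattern can be shown with a plain `ExecutorService`. This is an analogy in JDK-only code rather than what Spring Boot or Tomcat executes internally; the `GracefulShutdownSketch` class and the simulated 200 ms request are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

public final class GracefulShutdownSketch {

    /** Stops intake, waits for in-flight work, and returns true if everything finished in time. */
    public static boolean shutdownGracefully(ExecutorService workers, long timeout, TimeUnit unit) {
        workers.shutdown(); // reject new tasks; tasks already running continue
        try {
            if (workers.awaitTermination(timeout, unit)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        workers.shutdownNow(); // timeout elapsed or interrupted: cancel what is still running
        return false;
    }

    public static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        workers.submit(() -> sleepQuietly(200)); // simulated in-flight request
        boolean clean = shutdownGracefully(workers, 30, TimeUnit.SECONDS);
        System.out.println("clean shutdown: " + clean);
        try {
            workers.submit(() -> sleepQuietly(1));
        } catch (RejectedExecutionException e) {
            System.out.println("new work rejected after shutdown");
        }
    }
}
```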
Kubernetes Integration
# kubernetes deployment.yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60 # Kubernetes-level termination wait
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                command: ['/bin/sh', '-c', 'sleep 5'] # Wait for load balancer removal
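Kubernetes starts counting terminationGracePeriodSeconds as soon as termination begins, so the preStop sleep is part of the budget. Set terminationGracePeriodSeconds at least 5-10 seconds longer than the preStop delay plus Spring Boot's timeout-per-shutdown-phase (here 5 s + 30 s, comfortably within 60 s) to ensure a clean shutdown sequence.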
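A quick arithmetic check of that budget, counting the preStop delay as well because the hook runs inside the grace period. The `ShutdownBudget` class and its parameter names are hypothetical; the values in `main` are the ones used in the examples above.

```java
import java.time.Duration;

public final class ShutdownBudget {

    /** True if the grace period covers the preStop delay, the shutdown phase timeout, and a safety margin. */
    public static boolean fitsGracePeriod(Duration preStop, Duration shutdownPhaseTimeout,
                                          Duration margin, Duration terminationGracePeriod) {
        Duration needed = preStop.plus(shutdownPhaseTimeout).plus(margin);
        return terminationGracePeriod.compareTo(needed) >= 0;
    }

    public static void main(String[] args) {
        boolean ok = fitsGracePeriod(
                Duration.ofSeconds(5),   // preStop sleep
                Duration.ofSeconds(30),  // spring.lifecycle.timeout-per-shutdown-phase
                Duration.ofSeconds(10),  // safety margin
                Duration.ofSeconds(60)); // terminationGracePeriodSeconds
        System.out.println("grace period sufficient: " + ok);
    }
}
```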
Health Probe Integration
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
Point the Kubernetes readiness probe at the readiness health group so that it reports OUT_OF_SERVICE as soon as shutdown starts, removing the pod from the load balancer before in-flight requests are drained and the application stops.
9. Quiz
Q1. Which setting controls the maximum number of simultaneous connections in Spring Boot Tomcat?
Answer: server.tomcat.max-connections
Explanation: max-connections controls the maximum number of connections the Tomcat NIO connector maintains simultaneously. The default is 8192. This value should be set much higher than threads.max to efficiently handle Keep-Alive connections. accept-count defines the OS-level socket backlog queue size when max-connections is exhausted.
Q2. What happens after Graceful Shutdown receives a termination signal?
Answer: New requests are rejected, and the server waits up to the configured timeout-per-shutdown-phase duration for in-flight requests to complete before shutting down.
Explanation: With server.shutdown=graceful, upon receiving SIGTERM, Tomcat stops accepting new connections and waits for active requests to finish. spring.lifecycle.timeout-per-shutdown-phase defines the maximum wait time. In Kubernetes environments, terminationGracePeriodSeconds must be set higher than Spring Boot's timeout to ensure the full graceful shutdown sequence completes.
Q3. How do you apply Java 21 Virtual Threads to Spring Boot Tomcat?
Answer: Set spring.threads.virtual.enabled=true, or configure a TomcatProtocolHandlerCustomizer bean with Executors.newVirtualThreadPerTaskExecutor() as the executor.
Explanation: Virtual Threads, introduced as a stable feature in JDK 21, are lightweight threads that do not block OS threads during blocking I/O operations, making them well-suited for high-concurrency request handling. Spring Boot 3.2+ enables Virtual Threads simply via the spring.threads.virtual.enabled=true property.
Q4. What must you do when replacing Tomcat with Undertow?
Answer: Exclude spring-boot-starter-tomcat from spring-boot-starter-web, then add spring-boot-starter-undertow.
Explanation: Spring Boot's auto-configuration picks the embedded server from what is on the classpath. If Tomcat and Undertow are both present, Tomcat takes precedence and Undertow is never used, so you must explicitly exclude spring-boot-starter-tomcat for the replacement to take effect.
Q5. What is the difference between server.tomcat.threads.max and server.tomcat.max-connections?
Answer: threads.max is the maximum number of threads that actively process requests. max-connections is the maximum number of TCP connections maintained simultaneously.
Explanation: Thanks to HTTP/1.1 Keep-Alive, a single TCP connection can handle multiple sequential requests. Therefore, max-connections is typically set much higher than threads.max. For example, with 200 threads and 8192 connections, Tomcat maintains up to 8192 TCP connections while processing up to 200 requests concurrently. accept-count is the additional OS-level queue for requests when all connections are occupied.