Designing the Coexistence of Legacy WebSSO and Modern IAM: Architecture Patterns for the Hybrid Transition Era

Introduction — The Transition Is Not an Event but an Era

The previous two articles covered the architecture of SiteMinder and the migration strategy to Keycloak. This article is about the longest stretch in between: the **hybrid coexistence period**.

Let us start by admitting one truth from the field. In an organization with hundreds of apps, the coexistence of legacy WebSSO and modern IAM is measured not in quarters but in **years**. Two to five years is typical, and longer is common. That means coexistence is not a "temporary state to endure" but **an architecture that must itself be designed, operated, and monitored**. From the vantage point of 2026 this is even more true — new requirements such as passkey rollouts, Zero Trust policies, and AI agent identity management keep arriving during the transition, without waiting for it to finish.

This article is about not merely surviving the coexistence period, but **designing your way through it well**: four core patterns, the session and logout synchronization problem, security risks, and the criteria for declaring "we are done."

A Map of the Coexistence Terrain

First, the big picture of a typical hybrid environment.

```text
                    User (browser)
                           |
          +----------------+----------------+
          |                                 |
          v                                 v
+--------------------+            +--------------------+
| Modern IAM zone    |   bridge   | Legacy WebSSO zone |
| (Keycloak)         | <========> | (SiteMinder)       |
|                    | SAML/OIDC  |                    |
| - OIDC apps        |            | - SM_USER apps     |
| - SPA + API        |            | - agent-protected  |
| - passkeys         |            | - FCC login        |
+---------+----------+            +---------+----------+
          |                                 |
          v                                 v
+--------------------+            +--------------------+
| Reverse proxy layer|            | Shared User Store  |
| (oauth2-proxy etc.)|----------->| (AD/LDAP)          |
| hosts unmodified   |            | referenced by both |
| legacy apps        |            |                    |
+--------------------+            +--------------------+
```

Three design principles fall out of this picture.

1. **Keep the user store single wherever possible**: the moment the source of truth for accounts/passwords splits in two, the difficulty of the coexistence period doubles.

2. **Move the source of truth for authentication to one side, in stages**: minimize the period in which both IdPs present their own login screens; build the delegation structure early.

3. **Declare every bridge as temporary**: record a creation date and a target decommission date alongside every piece of bridge infrastructure.
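
Principle 3 is easier to enforce when the dates live in a machine-readable registry. The sketch below is one possible shape, not a prescribed tool: the `Bridge` fields, the bridge names, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bridge:
    """One piece of temporary coexistence infrastructure (hypothetical schema)."""
    name: str
    created: date
    decommission_target: date


def overdue_bridges(bridges: list[Bridge], today: date) -> list[str]:
    """Return the names of bridges whose target decommission date has passed."""
    return [b.name for b in bridges if b.decommission_target < today]


# Illustrative entries only; names and dates are invented for the example.
REGISTRY = [
    Bridge("saml-federation-sm-kc", date(2024, 1, 15), date(2026, 6, 30)),
    Bridge("oauth2-proxy-farm", date(2024, 9, 1), date(2027, 3, 31)),
]

print(overdue_bridges(REGISTRY, date(2026, 7, 1)))  # ['saml-federation-sm-kc']
```

Running a check like this in CI or a weekly report keeps "temporary" from silently becoming permanent.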

Pattern 1: The Standard Protocol Bridge (SAML/OIDC Federation)

The most orthodox pattern. Establish **bidirectional trust** using a standard protocol both systems can speak — realistically, [SAML 2.0](https://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf).

- **Legacy as IdP**: early transition. Keep the existing login experience; develop only new apps via Keycloak (as broker).

- **Modern as IdP**: mid-to-late transition. Move login itself to Keycloak and demote the legacy zone to a SAML SP.

The decisive advantage of this pattern is that **both sides use only standard, vendor-supported features**. With almost no custom code, operational risk stays low.

The authentication flow in the mid-to-late phase (modern as IdP):

```text
User -> accesses a legacy app
  -> SM Agent: no SMSESSION -> SiteMinder federation SP initiates
  -> SAML AuthnRequest -> Keycloak
  -> Keycloak: immediate if SSO session exists, else login (passkey OK)
  -> SAML Response (assertion) -> SiteMinder
  -> SiteMinder: validates assertion -> issues SMSESSION
  -> User: enters the legacy app (SM_USER headers unchanged)
```

Operational cautions:

- **Document the attribute mapping contract**: which attributes (employee number, groups, email) are carried in the assertion, and in what format, is the API contract between the two systems. A careless change on one side breaks every app on the other.

- **Certificate lifecycle management**: SAML signing certificate expiry is the classic cause of total outages in hybrid environments. Automate alerts at 90/30/7 days before expiry.

- **Clock skew**: NotBefore/NotOnOrAfter validation of assertions presumes NTP synchronization on both sides.
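
Two of the cautions above reduce to simple date arithmetic. The following is a minimal sketch, assuming the certificate's expiry and the assertion's validity window have already been parsed into timezone-aware `datetime` values. The function names and the 60-second default skew tolerance are illustrative assumptions, not values from either product.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ALERT_DAYS = (90, 30, 7)  # the alert thresholds named in the text


def expiry_alert_tier(not_after: datetime, now: datetime) -> Optional[int]:
    """Return the tightest threshold (in days) already crossed, or None."""
    remaining = not_after - now
    crossed = [d for d in ALERT_DAYS if remaining <= timedelta(days=d)]
    return min(crossed) if crossed else None


def assertion_time_valid(not_before: datetime, not_on_or_after: datetime,
                         now: datetime,
                         skew: timedelta = timedelta(seconds=60)) -> bool:
    """Check NotBefore <= now < NotOnOrAfter, tolerating `skew` on both ends."""
    return (not_before - skew) <= now < (not_on_or_after + skew)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(expiry_alert_tier(now + timedelta(days=20), now))  # 30
```

The skew tolerance only masks small drift; it is not a substitute for NTP on both sides.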

Pattern 2: Reverse-Proxy Header Injection — Protecting Legacy Apps Without Modification

This pattern moves unmodifiable header-based apps under the modern IdP. The configuration was covered in the previous article; here is the minimal setup again, with the coexistence-specific points.

```nginx
# nginx: the authentication proxy in front of a legacy app
server {
    listen 443 ssl;
    server_name legacy-crm.example.com;

    location /oauth2/ {
        proxy_pass http://oauth2-proxy:4180;
    }

    location / {
        auth_request /oauth2/auth;
        error_page 401 = /oauth2/start?rd=$scheme://$host$request_uri;

        auth_request_set $user   $upstream_http_x_auth_request_preferred_username;
        auth_request_set $groups $upstream_http_x_auth_request_groups;

        # Translate and inject exactly the header names the legacy app expects
        proxy_set_header SM_USER       $user;
        proxy_set_header SM_USERGROUPS $groups;

        proxy_pass http://legacy-crm.internal:8080;
    }
}
```

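```toml
# Core oauth2-proxy settings (Keycloak OIDC), written in oauth2-proxy.cfg syntax.
# Client secret and cookie secret are omitted here.
provider = "keycloak-oidc"
client_id = "legacy-crm-proxy"
oidc_issuer_url = "https://keycloak.example.com/realms/enterprise"
cookie_secure = true
cookie_samesite = "lax"
# Shared sessions for multiple instances
session_store_type = "redis"
redis_connection_url = "redis://redis.internal:6379"
skip_provider_button = true
# Emit the X-Auth-Request-* response headers that the nginx config reads
set_xauthrequest = true
```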

From a coexistence standpoint, the real meaning of this pattern is that **an app's zone membership becomes a routing decision**. If DNS points at the legacy gateway, the app belongs to the legacy zone; if it points at the proxy, it belongs to the modern zone. The app code is identical either way, so both cutover and rollback are a single routing change.

Three cautions:

- **Proxy session store**: multi-instance proxies must use a shared session store such as Redis; otherwise load balancing produces random re-logins.

- **WebSockets and long-lived connections**: if the legacy app uses WebSockets or SSE, validate the timeout/upgrade settings on the auth_request path separately.

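- **Large uploads**: the auth_request subrequest checks authentication without the request body, but body-size limits and request buffering on the proxied path (in nginx, client_max_body_size and proxy_request_buffering) can still break large uploads, so test them explicitly.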

Pattern 3: The Identity Orchestration Layer

Instead of hand-building bridges and proxies per app, this pattern puts an **abstraction layer that manages the whole identity flow by policy**. [Strata Maverics](https://www.strata.io/) is the canonical example, and cloud vendors offer similar capabilities.

```text
+------------------------------------------+
|       Identity Orchestration Layer       |
|  Per-app policy: "this app -> Keycloak,  |
|  that app -> SM"; session translation /  |
|  header injection / logout fan-out       |
+-----+--------------------+---------------+
      |                    |
      v                    v
+-----------+        +-----------+
| Keycloak  |        | SiteMinder|
+-----------+        +-----------+
```

The advantages are clear:

- Per-app IdP switching becomes a **configuration change**, accelerating wave cutovers.

- It is powerful in multi-legacy-IdP environments (for example, both SiteMinder and Oracle Access Manager present after an acquisition).

- Hard problems like logout fan-out and session lifetime alignment are absorbed as product features.

So are the disadvantages:

- It costs licenses and creates **a new single point of failure through which all identity-path traffic flows** — HA design is mandatory.

- You acquire a dependency on the orchestration layer itself. Include an exit design from day one so the layer can be removed after the transition completes.

Organizations with strong engineering capacity can achieve the same effect by self-building the pattern 1 + 2 combination. In short, this pattern is a **procurement and capability decision**, not a technology one.

Pattern 4: Strangler Fig — Gradual App-by-App Migration

The [Strangler Fig](https://martinfowler.com/bliki/StranglerFigApplication.html) is Martin Fowler's classic legacy modernization pattern: like the fig tree that slowly envelops and replaces its host, the new system erodes the old one app by app. Applied to the IAM coexistence period:

```text
Time t0: [SM: 200 apps]                    [KC: 0]
Time t1: [SM: 180 apps] --Wave 1 (20)----> [KC: 20]
Time t2: [SM: 120 apps] --Waves 2-3------> [KC: 80]
Time t3: [SM:  30 apps] --Waves 4-6------> [KC: 170]
Time t4: [SM: 0. Decommissioned]           [KC: 200]
```

Three core rules:

1. **New apps go to the new system, no exceptions**: enforce a governance policy (an architecture fitness function) banning new entries into the legacy zone. Without it, you get permanent coexistence instead of strangulation.

2. **Migrate in user-journey bundles**: grouping the apps a user traverses within one business flow into the same wave reduces bridge crossings and improves perceived quality.

3. **Make progress a public metric**: dashboards showing remaining app counts and the share of legacy-routed authentications keep organizational momentum alive.

Session Lifetimes and Logout Synchronization — The Hardest Problem of Coexistence

Mismatched Session Layers

In a hybrid environment, a single user's "logged-in state" exists in at least three layers.

| Layer | Session | Lifetime control |

| --- | --- | --- |

| Modern IdP | Keycloak SSO session | SSO Session Idle/Max |

| Bridge/proxy | oauth2-proxy cookie, SAML session | Cookie expiry, token refresh |

| Legacy | SMSESSION | realm idle/max timeout |

When these layers' lifetimes diverge, two symptoms appear:

- **Ghost sessions**: the upper layer (IdP) is dead but a lower layer (proxy cookie, SMSESSION) is alive — "I logged out but I can still get in," a security problem.

- **Re-authentication storms / redirect loops**: the lower layer dies first and the upper refresh path tangles, putting users into infinite redirects — an availability problem.

The practical recommendation converges on a simple rule: **lifetimes must be equal or shorter as you go down.** Make the IdP session the longest, the bridge session shorter or equal, and the legacy session the shortest. Then expiry always resolves safely by climbing back up the chain and silently re-issuing.
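
The "equal or shorter as you go down" rule is easy to check automatically. Here is a minimal sketch, assuming each layer's maximum lifetime has been collected into a list ordered from the IdP down to the legacy session. The layer names and the second values are illustrative, not recommended settings.

```python
def lifetime_violations(layers: list[tuple[str, int]]) -> list[str]:
    """Find layers that outlive the layer above them.

    `layers` is ordered top (IdP) to bottom (legacy); values are maximum
    session lifetimes in seconds.
    """
    problems = []
    for (upper, upper_ttl), (lower, lower_ttl) in zip(layers, layers[1:]):
        if lower_ttl > upper_ttl:
            problems.append(f"{lower} ({lower_ttl}s) outlives {upper} ({upper_ttl}s)")
    return problems


aligned = [("keycloak", 36000), ("oauth2-proxy", 28800), ("smsession", 7200)]
inverted = [("keycloak", 3600), ("oauth2-proxy", 28800), ("smsession", 7200)]

print(lifetime_violations(aligned))   # []
print(lifetime_violations(inverted))  # ['oauth2-proxy (28800s) outlives keycloak (3600s)']
```

A check like this fits a configuration-drift test, since the three values are usually owned by different teams.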

Logout Synchronization

Logout is harder. Wherever the user logs out, **sessions in every layer must terminate**. The standard tools:

- OIDC side: [OpenID Connect RP-Initiated Logout](https://openid.net/specs/openid-connect-rpinitiated-1_0.html), [Back-Channel Logout](https://openid.net/specs/openid-connect-backchannel-1_0.html)

- SAML side: SAML Single Logout (SLO) — implementation variance in the wild makes it unreliable.

- Legacy side: the SiteMinder logout URL (invalidating the SMSESSION cookie)

The realistic solution is a **central logout orchestration endpoint**. Point every app's logout button at it, and execute each layer's logout in sequence there.

```text
GET /global-logout
  1. Call the SiteMinder logout (invalidate SMSESSION)
  2. Delete the oauth2-proxy session (/oauth2/sign_out)
  3. Redirect to Keycloak RP-Initiated Logout
     (id_token_hint + post_logout_redirect_uri)
  4. "You have been logged out" landing page
```

The usual arrangement is hybrid: reinforce the segments where back-channel is possible (between Keycloak and the proxy) with back-channel logout, and handle the segments where it is not (SMSESSION) with front-channel chaining. **Whichever patterns you choose, full-stack logout completeness must be in your test scenarios.**
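
The front-channel chain behind such an endpoint can be sketched as an ordered list of URLs. This is an illustration under stated assumptions rather than a finished service: the SiteMinder logout URL and the host names are hypothetical, and the Keycloak path assumes the standard realm layout (`<issuer>/protocol/openid-connect/logout`).

```python
from urllib.parse import urlencode


def keycloak_logout_url(issuer: str, id_token: str, landing: str) -> str:
    """Build an RP-Initiated Logout URL for a Keycloak realm issuer."""
    query = urlencode({
        "id_token_hint": id_token,
        "post_logout_redirect_uri": landing,
    })
    return f"{issuer}/protocol/openid-connect/logout?{query}"


def global_logout_chain(sm_logout: str, proxy_base: str, issuer: str,
                        id_token: str, landing: str) -> list[str]:
    """Return the URLs to visit in order: legacy first, IdP last."""
    return [
        sm_logout,                                       # 1. invalidate SMSESSION
        f"{proxy_base}/oauth2/sign_out",                 # 2. clear the proxy session
        keycloak_logout_url(issuer, id_token, landing),  # 3. end the SSO session
    ]                                                    # 4. Keycloak redirects to `landing`
```

The IdP goes last because its `post_logout_redirect_uri` is what finally sends the user to the landing page. If it ran first, the remaining steps would need their own redirect plumbing.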

User Experience — Preventing Double Logins

A UX failure during coexistence translates directly into a helpdesk flood. The checklist:

- **One login screen only**: end the state in which users randomly encounter two different login screens as quickly as possible. This is why consolidating the authentication source of truth via the bridge is the top priority.

- **Deep link preservation**: verify that the originally requested URL (RelayState, the rd parameter) survives every bridged flow. "I logged in and got dumped on the home page" sounds trivial but generates the most complaints.

- **Graceful session expiry**: configure the proxy so AJAX requests hitting an expired session receive a 401 JSON rather than a 302 HTML. Otherwise you get the classic bug of a login form rendered in the middle of an SPA view.

- **Cutover announcements**: simply notifying an app's users that "the login screen changes today" on wave cutover day cuts ticket volume by more than half.
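
The graceful-expiry item comes down to one decision: is the request a page navigation or a programmatic call? The sketch below shows that decision in isolation. The detection heuristic (an `Accept` header containing `application/json`, or `X-Requested-With: XMLHttpRequest`) is an assumption for illustration, so check what your proxy actually supports before relying on it.

```python
def expired_session_response(headers: dict[str, str],
                             login_url: str) -> tuple[int, dict[str, str]]:
    """Pick 401 JSON for programmatic requests, 302 to login for navigations."""
    lowered = {k.lower(): v for k, v in headers.items()}
    wants_json = "application/json" in lowered.get("accept", "")
    is_xhr = lowered.get("x-requested-with", "") == "XMLHttpRequest"
    if wants_json or is_xhr:
        # The SPA can catch the 401 and trigger a top-level re-login itself.
        return 401, {"Content-Type": "application/json"}
    return 302, {"Location": login_url}


print(expired_session_response({"Accept": "application/json"}, "/oauth2/start"))
print(expired_session_response({"Accept": "text/html"}, "/oauth2/start"))
```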

Security Risks — The Attack Surface Grows During Coexistence

The coexistence period inherently expands the attack surface to **the union of both systems plus the bridges**.

Header Spoofing

The chronic risk of the proxy pattern. The defense checklist:

- [ ] Legacy apps reachable only via the proxy/gateway (network policy)

- [ ] Proxy unconditionally overwrites inbound SM_USER-style headers

- [ ] mTLS or a shared-secret header on the gateway-to-app segment

- [ ] Regular scans that direct app ports (8080 etc.) are not open, even on the intranet

- [ ] Header forgery attempts automated as a standing test scenario (T7)

The last item matters most. Networks change frequently during coexistence, and direct-access paths that were blocked at the start have a way of quietly reopening.
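
The overwrite rule and its forgery test can be illustrated with a pure function. This is a sketch, not the proxy's actual implementation: the `sm_`/`sm-` prefix list is an assumed naming convention. The hyphen variant is included because CGI-style environments fold hyphens and underscores into the same variable name.

```python
IDENTITY_PREFIXES = ("sm_", "sm-")  # assumed naming of identity headers


def sanitize_headers(inbound: dict[str, str], user: str, groups: str) -> dict[str, str]:
    """Drop every client-supplied identity header, then set trusted values."""
    clean = {k: v for k, v in inbound.items()
             if not k.lower().startswith(IDENTITY_PREFIXES)}
    clean["SM_USER"] = user
    clean["SM_USERGROUPS"] = groups
    return clean


# A forged request: the client tries to pass itself off as "admin".
forged = {"SM_USER": "admin", "sm-user": "admin", "Accept": "text/html"}
result = sanitize_headers(forged, user="alice", groups="staff")
print(result)  # {'Accept': 'text/html', 'SM_USER': 'alice', 'SM_USERGROUPS': 'staff'}
```

The standing test scenario then amounts to sending the forged request through the real gateway and asserting that the app sees only the trusted values.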

Exploiting Session Mismatch

Ghost sessions are both a UX and a security problem. If a leaver's account is disabled at the IdP but access continues for a while via a live SMSESSION or proxy cookie, that is an audit finding. Mitigations:

- Keep legacy/bridge session lifetimes short (the "shorter as you go down" rule above)

- Receive account-disable events via hooks and actively destroy related sessions (Keycloak Event Listener SPI plus SiteMinder Session Store integration)

- For high-risk apps, enable per-request re-verification of user status during session validation

Securing the Bridges Themselves

- SAML bridges: missing signature validation, weak digest algorithms, and unvalidated audiences are the recurring vulnerabilities. Review assertion encryption and signing algorithms.

- Proxies: verify cookie secret rotation, access control on the Redis session store, and configurations that keep tokens out of logs. Use [RFC 9700](https://datatracker.ietf.org/doc/html/rfc9700) recommendations as your baseline.

Monitoring and Completion Criteria

Key Metrics for the Coexistence Period

| Metric | Meaning | Target direction |

| --- | --- | --- |

| Legacy-routed authentication share | Share of all authentications handled by SiteMinder | Monotonic decrease |

| Bridged authentication share | Share of cross-zone authentications via the bridge | Rises mid-way, then falls |

| Authentication success rate (each side) | Availability baseline | Maintain 99.9% or higher |

| Login p95 latency | Watch for delay added by bridge hops | Within baseline + 1 second |

| Unmigrated app count | Strangler progress | Track against the wave plan |

| Ghost session detections | Logout incompleteness | Converge to 0 |

Log consolidation matters too. Feed SiteMinder audit logs and Keycloak events into the same SIEM so that **both sides' authentication history can be viewed as a single per-user timeline** — without this, incident investigation during coexistence is impractical.

```text
# Conceptual config for shipping Keycloak events to a SIEM (e.g. syslog listener)
spi-events-listener-jboss-logging-success-level: info
spi-events-listener-jboss-logging-error-level: warn
# In production, a custom EventListener SPI shipping to Kafka/SIEM is typical
```
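
Once both feeds land in one place, the per-user timeline is a merge of two time-sorted streams. This is a minimal sketch, assuming events from both sides have been normalized to tuples with UTC ISO-8601 timestamps in the same format and that each stream is already sorted. The tuple layout and the sample events are hypothetical.

```python
import heapq
from typing import Iterable, Iterator

Event = tuple[str, str, str, str]  # (UTC ISO-8601 timestamp, source, user, action)


def user_timeline(siteminder: Iterable[Event], keycloak: Iterable[Event],
                  user: str) -> Iterator[Event]:
    """Merge two time-sorted event streams and keep one user's events."""
    merged = heapq.merge(siteminder, keycloak, key=lambda e: e[0])
    return (e for e in merged if e[2] == user)


SM_EVENTS = [
    ("2026-01-05T09:00:00Z", "siteminder", "alice", "login"),
    ("2026-01-05T09:30:00Z", "siteminder", "bob", "login"),
]
KC_EVENTS = [
    ("2026-01-05T08:59:00Z", "keycloak", "alice", "LOGIN"),
    ("2026-01-05T10:00:00Z", "keycloak", "alice", "LOGOUT"),
]

for event in user_timeline(SM_EVENTS, KC_EVENTS, "alice"):
    print(event)
```

In practice the SIEM's query language does this join. The sketch only shows why a shared timestamp format and a shared user key are the prerequisites.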

Criteria for Declaring the Transition Complete

Declaring "done" needs objective criteria. Recommended:

1. The legacy-routed authentication share reaches zero and stays there for N weeks (e.g. 8) — covering month-end/quarter-end periodic workloads

2. Zero unmigrated apps — or formal retirement decisions completed for the remainder

3. Zero DNS/routing entries depending on the legacy system

4. An approved removal plan for the bridge/orchestration layer — the coexistence infrastructure must come down too for the migration to be truly complete

5. SiteMinder policy archive preserved and license termination processed

6. Audit log retention obligations transferred (for regulated industries, a confirmed plan for retaining years of legacy audit logs)
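
Criterion 1 can be evaluated mechanically from weekly authentication counts. Here is a minimal sketch, assuming one count of legacy-routed authentications per week, oldest first. The function names are illustrative, and the 8-week default simply mirrors the example value above.

```python
def legacy_share(legacy_auths: int, total_auths: int) -> float:
    """Share of all authentications handled by the legacy system."""
    return legacy_auths / total_auths if total_auths else 0.0


def zero_streak_met(weekly_legacy_auths: list[int], required_weeks: int = 8) -> bool:
    """True if the most recent `required_weeks` entries are all zero."""
    if len(weekly_legacy_auths) < required_weeks:
        return False
    return all(n == 0 for n in weekly_legacy_auths[-required_weeks:])


print(legacy_share(30, 200))                         # 0.15
print(zero_streak_met([5, 0, 0, 0, 0, 0, 0, 0, 0]))  # True
print(zero_streak_met([0, 0, 0, 0, 0, 0, 3, 0]))     # False
```

The last call shows why the window matters: a single month-end batch resets the streak.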

Case Scenario — A Fictional Bank Intranet With 200 Apps

Finally, a fictional case that puts the patterns together. The setting: a mid-sized bank, 200 intranet apps, SiteMinder 12.8 (EOS looming), 12,000 employees, a 30-month target.

**Inventory results**: 25 standard-protocol apps (A), 60 modifiable header apps (B), 105 unmodifiable header apps (C), 10 SDK-dependent apps (D).

**Architecture decisions**:

- Build a Keycloak HA cluster; connect the existing AD via User Federation (the shared foundation for all patterns)

- Phase 1 (months 0-6): Keycloak as broker, SiteMinder as IdP (pattern 1, legacy as IdP). Enforce a new-app freeze policy: all new apps are Keycloak OIDC only (pattern 4, rule 1)

- Phase 2 (months 6-12): invert the login source of truth. Keycloak becomes the IdP and SiteMinder is demoted to a SAML SP (pattern 1, modern as IdP). From this point every employee can register passkeys

- Phase 3 (months 6-24, in parallel): the 25 A-group apps migrate immediately by endpoint swap. The 60 B-group apps migrate gradually by folding OIDC conversion into regular releases. The 105 C-group apps move to an oauth2-proxy farm (pattern 2), 15-20 per wave

- Phase 4 (months 24-30): individual redesign of the 10 D-group apps — 3 rebuilt, 5 found proxy-compatible, 2 retired. SiteMinder decommissioned
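
The wave arithmetic behind Phase 3 is worth spelling out. The sketch below uses only the numbers given in the scenario. The even spacing of waves across the phase is a simplifying assumption.

```python
import math


def waves_needed(apps: int, per_wave: int) -> int:
    """Number of waves required to move `apps` at `per_wave` apps per wave."""
    return math.ceil(apps / per_wave)


INVENTORY = {"A": 25, "B": 60, "C": 105, "D": 10}
PHASE3_MONTHS = 24 - 6  # Phase 3 runs from month 6 to month 24

fewest = waves_needed(INVENTORY["C"], 20)  # 20 apps per wave
most = waves_needed(INVENTORY["C"], 15)    # 15 apps per wave

print(sum(INVENTORY.values()))  # 200
print(fewest, most)             # 6 7
print(round(PHASE3_MONTHS / most, 1), round(PHASE3_MONTHS / fewest, 1))  # 2.6 3.0
```

So the 105 C-group apps need six or seven waves, which works out to roughly one wave every two and a half to three months if spread evenly across the phase.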

**Problems encountered** (the ones you will meet in reality too):

- In Wave 3, one C-group app was parsing the caret-delimited SM_USERGROUPS together with LDAP DN formats, forcing a complete group-mapping rework — the lesson: the inventory must include header *value formats*, not just names

- A month-end closing batch was discovered impersonating header authentication with a service account — non-human identities were split into a separate track and converted to the client credentials flow

- A SAML certificate expiry caused a 47-minute total login outage during Phase 2 — certificate expiry alerts and a dual-key rollover procedure were introduced afterward

- Four apps remained at month 30 — the owner-department chargeback governance kicked in and they were cleared within three months

The decommission completed at month 33, and during the coexistence period the bank still rolled out passkeys (month 14) and applied Zero Trust network policies in parallel. The point of this scenario: **because the coexistence architecture was properly designed, modernization did not have to wait for the migration to finish.**

Pattern Comparison Summary and Selection Guide

The four patterns combine rather than compete, but to aid decision-making, here they are in one table.

| Aspect | Pattern 1 bridge | Pattern 2 proxy | Pattern 3 orchestration | Pattern 4 strangler |

| --- | --- | --- | --- | --- |

| Problem solved | SSO between IdPs | Hosting unmodified apps | Absorbing transition complexity | Direction/governance of migration |

| Build difficulty | Low (standard features) | Medium (per-app config) | Low (purchase) | Organizational, not technical |

| Operational risk | Certificates/attribute contract | Proxy SPOF, session store | New SPOF, vendor lock-in | Loss of momentum |

| Cost | Mostly labor | Infrastructure + labor | Licenses | Governance cost |

| When to apply | Throughout coexistence | During C-group migration | Tight schedule / multi-IdP | Throughout coexistence |

A simple selection flow:

```text
Start
 ├─ Need SSO between the two IdPs? ─────────→ Always pattern 1 (foundation)
 ├─ Any unmodifiable header apps? ── yes ─┐
 │      ├─ Strong engineering capacity  → self-build pattern 2
 │      └─ Limited capacity / multi-IdP → evaluate pattern 3
 └─ More than ~30 apps? ── yes ────────────→ Pattern 4 governance is mandatory
```
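
The same flow can be written as a small function, which is handy when assessing several business units at once. This is a sketch of the flow above, not an authoritative rule set. The parameter names are hypothetical, and one tie-break is an added assumption: when an organization has strong engineering capacity but also multiple legacy IdPs, the function sends it to the pattern 3 evaluation.

```python
def recommend_patterns(unmodifiable_header_apps: bool, strong_engineering: bool,
                       multi_legacy_idp: bool, app_count: int) -> list[str]:
    """Encode the selection flow; returns the patterns to apply or evaluate."""
    patterns = ["pattern 1"]  # SSO between the two IdPs is always the foundation
    if unmodifiable_header_apps:
        if strong_engineering and not multi_legacy_idp:
            patterns.append("pattern 2 (self-build)")
        else:
            patterns.append("pattern 3 (evaluate)")
    if app_count > 30:
        patterns.append("pattern 4")
    return patterns


# The bank from the case scenario: 105 unmodifiable apps, one legacy IdP, 200 apps.
print(recommend_patterns(True, True, False, 200))
# ['pattern 1', 'pattern 2 (self-build)', 'pattern 4']
```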

Troubleshooting — Incidents You Will Actually Meet During Coexistence

Finally, the symptoms most frequently reported while operating hybrid environments, with first-response steps.

Symptom 1: Intermittent Re-Login Demands on One App Only

- **Common causes**: multi-instance proxies running without a shared session store (a session miss every time the load balancer routes to a different instance), or SMSESSION validation failures right after an Agent Key rollover.

- **First checks**: the number of proxy instances and the session_store_type setting on that app's path; on the legacy side, correlate the Policy Server's key rollover times with the incident times.

Symptom 2: Redirect Loops

- **Common causes**: inverted session lifetimes across layers (the proxy cookie alive while the IdP session has expired), or a mismatch between the redirect_uri and the cookie domain.

- **First checks**: capture the 302 chain with browser dev tools and identify which layer is bouncing to which. Reproducing with curl:

```shell
# Trace the redirect chain (preserving cookies, max 10 hops)
curl -s -o /dev/null -L --max-redirs 10 \
  -b cookies.txt -c cookies.txt \
  -w '%{num_redirects} redirects, final: %{url_effective}\n' \
  https://legacy-crm.example.com/

# Inspect each hop's Location header
# (-L follows the chain; -D - dumps the headers of every response along the way)
curl -s -L --max-redirs 10 \
  -b cookies.txt -c cookies.txt \
  -D - -o /dev/null https://legacy-crm.example.com/ | grep -i '^location'
```

Symptom 3: Logged Out, but Other Apps Still Accessible

- **Common causes**: a layer missing from the logout orchestration chain (especially the SMSESSION invalidation step), or clients not receiving back-channel logout.

- **First checks**: after a global logout, dump cookie state layer by layer. Whichever of SMSESSION, the proxy cookie, or the Keycloak session cookie survives tells you the missing link.

Symptom 4: Groups/Permissions Appear Empty After Cutover

- **Common causes**: group claim format mismatch (caret-delimited string vs JSON array), a missing claim (groups not in scope), or LDAP group mapper sync lag.

- **First checks**: decode the token itself to confirm the claim is present; if it is, examine the proxy's header translation; if it is not, examine the Keycloak mapper configuration.

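```shell
# Inspect the access token's claims (using jq).
# The password grant needs Direct Access Grants enabled on the client; use a test realm only.
TOKEN=$(curl -s -X POST \
  -d "client_id=debug-cli" -d "grant_type=password" \
  -d "username=testuser" -d "password=..." \
  https://keycloak.example.com/realms/enterprise/protocol/openid-connect/token \
  | jq -r .access_token)

# JWT segments are base64url without padding: map the alphabet and re-pad before decoding
PAYLOAD=$(echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+')
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
echo "$PAYLOAD" | base64 -d | jq '.groups'
```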
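
The two halves of this diagnosis, "is the claim in the token?" and "is it translated into the format the app expects?", can also be checked offline. The sketch below assumes the groups claim is a JSON array and that the legacy app expects a caret-delimited SM_USERGROUPS value, as in the case scenario. The function names are illustrative, and the decode skips signature verification, so it is for debugging only.

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    segment = token.split(".")[1]
    padded = segment + "=" * (-len(segment) % 4)  # base64url omits padding
    return json.loads(base64.urlsafe_b64decode(padded))


def to_sm_usergroups(groups: list[str]) -> str:
    """Join a JSON-array groups claim into a caret-delimited header value."""
    return "^".join(groups)


# Build a throwaway unsigned token just to exercise the decoder.
claims = {"preferred_username": "testuser", "groups": ["crm-users", "finance"]}
body = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
token = f"header.{body}.signature"

print(to_sm_usergroups(jwt_payload(token)["groups"]))  # crm-users^finance
```

If the decoded payload has no `groups` key at all, the problem is upstream in the Keycloak mapper or scope. If it has one but the app still sees nothing, the translation step is the suspect.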

Symptom 5: Authentication Failures Only at Month-End/Quarter-End

- **Common causes**: batch/service accounts using human authentication paths (header mimicry, screen scraping) that were missed in the migration.

- **First response**: create a dedicated census track for non-human identities and move them to the client credentials flow. The bank in the case scenario hit exactly this problem.

Closing

The hybrid coexistence period is not a by-product of migration; it is an architecture that deserves design in its own right. In summary:

- Use **pattern 1 (protocol bridge)** to move the authentication source of truth to the modern side quickly,

- Use **pattern 2 (proxy header injection)** to host unmodifiable legacy apps without code changes,

- Treat **pattern 3 (orchestration)** as a capability-versus-schedule trade-off,

- And enforce the direction of strangulation with the governance of **pattern 4 (strangler fig)**.

Then make these four things standing operational checks — session lifetime alignment, logout orchestration, header spoofing defense, and completion criteria — and you can pass through a multi-year transition without incidents, and without pausing modernization.

References

- [SAML 2.0 Core specification (OASIS)](https://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf)

- [OpenID Connect Core 1.0](https://openid.net/specs/openid-connect-core-1_0.html)

- [OpenID Connect Back-Channel Logout 1.0](https://openid.net/specs/openid-connect-backchannel-1_0.html)

- [OpenID Connect RP-Initiated Logout 1.0](https://openid.net/specs/openid-connect-rpinitiated-1_0.html)

- [Keycloak official documentation](https://www.keycloak.org/documentation)

- [Keycloak release notes](https://www.keycloak.org/docs/latest/release_notes/index.html)

- [Broadcom SiteMinder Tech Docs](https://techdocs.broadcom.com/siteminder)

- [oauth2-proxy official documentation](https://oauth2-proxy.github.io/oauth2-proxy/)

- [Martin Fowler — Strangler Fig Application](https://martinfowler.com/bliki/StranglerFigApplication.html)

- [RFC 6749 — The OAuth 2.0 Authorization Framework](https://datatracker.ietf.org/doc/html/rfc6749)

- [RFC 8693 — OAuth 2.0 Token Exchange](https://datatracker.ietf.org/doc/html/rfc8693)

- [RFC 9700 — Best Current Practice for OAuth 2.0 Security](https://datatracker.ietf.org/doc/html/rfc9700)

- [Strata — Identity Orchestration](https://www.strata.io/)

- [W3C WebAuthn Level 3](https://www.w3.org/TR/webauthn-3/)
