Financial-Grade Kubernetes Platforms — Making Regulation and Cloud Native Coexist

Introduction
The Reality of Kubernetes in Finance — Understanding the Regulatory Environment
Multi-Tenancy — Split by Namespace or by Cluster?
- How Far Should Core Banking Go?
The Security Hardening Stack — Gates as Code
Audit Trails — Every Change Is Recorded and Approved
- Long-Term Retention of API Audit Logs
- Connecting GitOps to Electronic Approval
Availability Design — DR Speaks in Tiers
- The RTO/RPO Tier System
- DR Drills as a Platform Feature
Identity and Access Control — People Are the Biggest Variable
- HR System Integration
- The PIM (Privileged Identity Management) Pattern
Network Policy — Start from Default Deny
Observability and Automated Compliance Reporting
Legacy Integration — The Boundary Between MCA/ESB and the Service Mesh
Adoption Roadmap — From Analytics to External Interfaces
Checklist
Pitfalls and Anti-Patterns
Closing
References

Introduction

"Kubernetes makes deployments faster, right?" When you hear this in a financial institution's infrastructure team, you cannot help but smile. Technically true — but adopting Kubernetes in a regulated financial environment is never just about spinning up a cluster. Electronic finance supervision regulations, network segregation requirements, cloud usage reporting procedures, audit trail obligations, mandatory disaster recovery drills — only when all of these regulatory requirements are satisfied on top of a cloud native architecture does the platform become something you are actually allowed to run.

This article distills the experience of designing and operating Kubernetes platforms in financial environments into an architecture where regulation and cloud native coexist. The focus is on what differs from generic Kubernetes best practices, and how to close that gap with engineering.

One thing must be stated up front: this is a technical blog post, not legal advice or an authoritative regulatory interpretation. Any real adoption must go through your institution's compliance and information security organizations, and where necessary, formal interpretation by the financial supervisory authority.

The Reality of Kubernetes in Finance — Understanding the Regulatory Environment

The Electronic Finance Supervision Regulation as the Starting Point

In Korean finance, every infrastructure design starts from the Regulation on Supervision of Electronic Financial Transactions. It defines the personnel and physical requirements financial companies must meet to ensure the safety of electronic financial transactions. From a Kubernetes platform perspective, the most relevant themes are:

Segregation of internal and external networks for information processing systems (network segregation)
Protection measures and access control for computerized data
Procedures and reporting for outsourcing information processing (including cloud usage)
Construction of disaster recovery centers and execution of DR drills
Management and change control of the core ledger (core banking data)

Since 2023, network segregation rules have been gradually rationalized, and under the roadmap for improving network segregation in the financial sector announced in 2024, the permitted scope of SaaS and generative AI usage has widened. Conservative controls over core systems such as core banking, however, remain in force. In other words, the situation is neither "everything is allowed now" nor "nothing is possible": controls are differentiated by system criticality tier.

Network Segregation and Cluster Separation Architecture

Apply network segregation to Kubernetes and the first question you hit is "how many clusters do we split into?" A typical financial institution layout looks like this:

                        [ Internet ]
                            |
                   +--------+--------+
                   | External firewall|
                   +--------+--------+
                            |
  ..........................|.......................... DMZ zone
  :                +--------+--------+                :
  :                |   DMZ cluster   |                :
  :                |  - channel GW   |                :
  :                |  - WAF/proxy    |                :
  :                |  - external-IF  |                :
  :                |    adapters     |                :
  :                +--------+--------+                :
  ......................... | ..........................
                   +--------+--------+
                   | Internal FW/IPS |   <- one-way control, minimal ports
                   +--------+--------+
                            |
  ..........................|.......................... Internal network
  :   +----------------+   |   +------------------+   :
  :   | Internal       +---+---+  Analytics (MIS)  |  :
  :   | cluster        |       |  cluster          |  :
  :   | - channel svcs |       |  - batch/analytics|  :
  :   | - shared svcs  |       |  - data pipelines |  :
  :   +-------+--------+       +------------------+   :
  :           |                                       :
  :   +-------+--------+      +-------------------+   :
  :   | Core banking   |      |  Ops/management   |   :
  :   | traditional or |      |  - CI/CD, registry|   :
  :   | dedicated      |      |  - monitoring/logs|   :
  :   | cluster        |      +-------------------+   :
  :   +----------------+                              :
  .....................................................

The core principles:

Physically separate the DMZ cluster from internal-network clusters. Sharing one cluster between DMZ and internal zones via namespaces is, under the conservative reading, contrary to the intent of network segregation.
Traffic from DMZ to the internal network is whitelisted at the firewall per destination and port. Kubernetes NetworkPolicy is a supplementary control, not a replacement for the firewall.
Keep the ops/management zone (CI/CD, registry, monitoring) separate, with one-directional access into each cluster. The default pattern is clusters pulling images from the registry, not the registry pushing in.
Do not forget dev/prod network separation. Developers access only the development clusters directly; production changes flow exclusively through the GitOps pipeline.

The Cloud Usage Regulatory Flow — Criticality Assessment and Reporting

To run Kubernetes on public cloud (including managed services like EKS, AKS, GKE), you must follow the cloud computing service usage procedure under the supervision regulation. The practical flow is roughly:

[1] Identify the target business functions
     |
[2] Criticality assessment
     |  - processing of unique identifiers / personal credit information?
     |  - impact on safety and reliability of electronic financial transactions
     |
     +--> not critical --> [3a] internal procedure + usage inventory management
     |
     +--> critical -------> [3b] business continuity plan + safety measures
                              |
                          [4] Information Protection Committee deliberation
                              |
                          [5] Prior report to the FSS (critical workloads,
                              7 business days before usage)
                              |
                          [6] Contract signing + go-live
                              |
                          [7] Ongoing: annual safety checks, reassessment on change

For platform engineers, the key insight is that the artifacts of this procedure become design inputs for the platform. Workloads classified as critical require stronger isolation (dedicated clusters or node pools), longer audit log retention, and higher DR tiers. Encode the assessment results as namespace labels and your policy engine can apply differentiated controls automatically.

apiVersion: v1
kind: Namespace
metadata:
  name: payment-core
  labels:
    bank.example.com/criticality: "critical"      # criticality assessment result
    bank.example.com/data-class: "pci-personal"   # data classification
    bank.example.com/dr-tier: "tier-1"            # DR tier
    bank.example.com/owner-dept: "payments-dev"   # owning department
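
Labels only drive differentiated controls if every namespace actually carries them. As a minimal sketch, assuming Kyverno is the policy engine, the following policy rejects namespaces that lack the assessment labels. The policy name and the list of excluded system namespaces are assumptions for illustration; the label keys are the ones used in the manifest above.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-assessment-labels        # hypothetical policy name
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: namespace-must-carry-assessment-labels
      match:
        any:
          - resources:
              kinds: ["Namespace"]
      exclude:
        any:
          - resources:
              # assumed list of platform namespaces exempt from the rule
              names: ["kube-*", "default", "kyverno"]
      validate:
        message: "Namespaces must carry criticality, dr-tier, and owner-dept labels."
        pattern:
          metadata:
            labels:
              # "?*" means the label must exist with a non-empty value
              bank.example.com/criticality: "?*"
              bank.example.com/dr-tier: "?*"
              bank.example.com/owner-dept: "?*"
```

Restricting the labels to an approved set of values (for example only known DR tiers) would be a natural next step, but the exact vocabulary depends on the institution's own assessment scheme.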

Multi-Tenancy — Split by Namespace or by Cluster?

A classic Kubernetes multi-tenancy question, but in finance the decision criteria are sharper. What determines the isolation level is not technical preference but data classification and regulatory boundaries.

Isolation criterion	Namespace separation suffices	Cluster separation required
Network zone	Same network zone	Between DMZ and internal network
Data classification	Same class (e.g. general analytics)	Systems that process personal credit data vs systems that do not
Organizational boundary	Teams within one division	Subsidiaries, external outsourced operators
Failure blast radius	Some channel latency tolerable	Core banking — bank-wide outage
Change control	Same approval chain	Different approvers and inspection cycles
Kernel/node requirements	Standard node image	Dedicated HSM integration, special kernel modules

How Far Should Core Banking Go?

The hottest topic. Can core banking — ledger processing, lending and deposit transactions, end-of-day (EOD) batch — run on Kubernetes? In 2026, the realistic answer is "from the periphery, in stages."

Stage 1: read-only core banking APIs. Containerize balance inquiry and transaction history APIs that never mutate the ledger. Failures have limited impact since they are read-only.
Stage 2: auxiliary services around the core. Split out pre/post-processing of ledger transactions: fee calculation, limit validation, notifications.
Stage 3: separate ledgers for new products. Design the ledger of a new digital product as a standalone micro-ledger built cloud native from day one. A path validated by internet-only banks.
Last: migrating the existing ledger. EOD batch time windows, transactional consistency, and supervisory reporting must all be revalidated — treat it as a multi-year program.

Even where namespace isolation is chosen, the basic skeleton is mandatory: ResourceQuota and LimitRange for resource boundaries, NetworkPolicy for communication boundaries, RBAC for permission boundaries.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: payment-core-quota
  namespace: payment-core
spec:
  hard:
    requests.cpu: "64"
    requests.memory: 256Gi
    limits.cpu: "96"
    pods: "300"
    services.loadbalancers: "0"   # only the platform team creates LBs
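
The quota above only accepts pods whose containers declare the quota-tracked CPU and memory values; a LimitRange in the same namespace supplies defaults so that workloads omitting them are not rejected outright. The following is a minimal sketch. The object name and every number are illustrative assumptions, not recommended values.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: payment-core-defaults   # hypothetical name
  namespace: payment-core
spec:
  limits:
    - type: Container
      defaultRequest:           # applied when a container omits requests
        cpu: 250m
        memory: 256Mi
      default:                  # applied when a container omits limits
        cpu: "1"
        memory: 1Gi
      max:                      # upper bound for any single container
        cpu: "8"
        memory: 16Gi
```

Keeping the defaults small pushes teams to declare real values, while the max entry stops a single container from consuming a large share of the namespace quota.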

The Security Hardening Stack — Gates as Code

The defining trait of financial security inspections is "evidence." You must show not only that a control exists, but that it operated. Managing policy as code solves both at once.

Pod Security Admission as the Baseline

PodSecurityPolicy was removed in Kubernetes 1.25, so the baseline is Pod Security Admission (PSA). Production namespaces enforce the restricted profile by default.

apiVersion: v1
kind: Namespace
metadata:
  name: payment-core
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
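
For orientation, a pod that passes the restricted profile looks roughly like the sketch below. The pod name, image reference, and user ID are hypothetical; the securityContext fields are the ones the restricted profile checks.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-example              # hypothetical name
  namespace: payment-core
spec:
  securityContext:
    runAsNonRoot: true                  # restricted: containers must not run as root
    runAsUser: 10001                    # assumed non-root UID
    seccompProfile:
      type: RuntimeDefault              # restricted: seccomp must be set explicitly
  containers:
    - name: app
      image: registry.bank.internal/apps/example:1.0.0   # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false # restricted: must be false
        capabilities:
          drop: ["ALL"]                 # restricted: all capabilities dropped
```

Publishing a template like this alongside the namespace labels tends to cut down on rejected deployments when enforcement is first switched on.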

Financial Policy Gates with Kyverno

Organization-specific policies beyond PSA are implemented with Kyverno. The three representative ones — enforcing the air-gapped registry, verifying image signatures, and proving vulnerability scan passes — can be tied into a single flow.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: financial-image-controls
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    # Rule 1: block images not from the internal air-gapped registry
    - name: restrict-registry
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Only images from registry.bank.internal are allowed."
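        pattern:
          spec:
            # init and ephemeral containers are checked too, otherwise they bypass the gate
            =(ephemeralContainers):
              - image: "registry.bank.internal/*"
            =(initContainers):
              - image: "registry.bank.internal/*"
            containers:
              - image: "registry.bank.internal/*"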
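    # Rule 2: require an explicit tag or digest (an untagged image resolves to latest)
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must carry an explicit tag or digest."
        pattern:
          spec:
            containers:
              - image: "*:*"
    # Rule 3: forbid the latest tag, which makes changes untraceable
    - name: disallow-latest-tag
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must use a version tag or digest, not latest."
        pattern:
          spec:
            containers:
              - image: "!*:latest"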

Image signature verification combines Sigstore Cosign with Kyverno verifyImages. The CI pipeline signs after build; the cluster verifies at admission. The result is provable evidence that "no image that bypassed the approved pipeline can even start."

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signature
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: require-cosign-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
              namespaceSelector:
                matchLabels:
                  bank.example.com/criticality: critical
      verifyImages:
        - imageReferences:
            - "registry.bank.internal/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...
                      -----END PUBLIC KEY-----
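          # verify that a vulnerability scan attestation exists and names the scanner;
          # this condition alone does not prove zero Critical findings, which needs a
          # further condition on the scan result fields of the attestation predicate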
          attestations:
            - type: https://cosign.sigstore.dev/attestation/vuln/v1
              conditions:
                - all:
                    - key: "{{ scanner }}"
                      operator: Equals
                      value: "trivy"

Operating an Air-Gapped Registry

In an internet-blocked environment you must internalize the image supply chain itself.

[ External internet zone (restricted allowlist) ]
   docker.io / ghcr.io / quay.io
        |
        v  (1) periodic sync by a dedicated mirroring host
[ Quarantine zone ]
   staging registry --> (2) full Trivy/Grype scan
        |               (3) sign on policy pass (cosign sign)
        v  (4) only approved images enter
[ Internal network ]
   Harbor (registry.bank.internal)
        |- proxy-cache project: base images
        |- apps project: internally built artifacts
        +- replication: DR-site registry

Harbor gives you project-level RBAC, scanner integration, tag immutability, and retention policies in one package, which makes it a common choice for air-gapped financial registries.

Audit Trails — Every Change Is Recorded and Approved

Long-Term Retention of API Audit Logs

The Kubernetes API server audit log is the primary evidence of "who changed what, when." Aligned with the record retention intent of the supervision regulation, design for at least one year of retention — up to five years depending on institutional policy.

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # record secret and configmap access at Metadata level only, so payloads never enter the log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # record workload changes including request bodies
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "apps"
        resources: ["deployments", "statefulsets", "daemonsets"]
      - group: ""
        resources: ["pods", "services"]
  # minimize read-only noise (storage cost control)
  - level: None
    verbs: ["get", "list", "watch"]
    users: ["system:kube-proxy", "system:apiserver"]
  - level: Metadata
    omitStages: ["RequestReceived"]

Tamper resistance is the heart of the retention pipeline. Never leave audit logs on the node: ship them immediately via Fluent Bit or similar to an external log system, then archive daily to object storage with WORM (Write Once Read Many) properties. Hash chains or object-lock features let you prove integrity.
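
On a self-managed control plane, the audit policy is wired in through kube-apiserver flags. The fragment below is a sketch of the relevant part of a static Pod manifest; the file paths and rotation numbers are assumptions, and managed services (EKS, AKS, GKE) expose audit logs through their own logging integration instead.

```yaml
# fragment of a kube-apiserver static Pod manifest (self-managed control plane)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit/policy.yaml   # assumed path
        - --audit-log-path=/var/log/kubernetes/audit/audit.log    # assumed path
        - --audit-log-maxage=7        # days kept locally; the node is only a buffer
        - --audit-log-maxbackup=10    # rotated files kept locally
        - --audit-log-maxsize=100     # megabytes per file before rotation
```

The local retention is deliberately short: long-term retention belongs to the external WORM store, and the node-side file only needs to survive until the log shipper has forwarded it.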

Connecting GitOps to Electronic Approval

Change management in a financial institution can be summarized as "no change without sign-off." GitOps pairs with this requirement surprisingly well: since Git is the single source of truth, wiring the approval system into Git merges automates change control.

Developer            Git repository        e-Approval system        ArgoCD
  |                      |                       |                    |
  |--- open PR --------->|                       |                    |
  |                      |--- webhook: draft --->|                    |
  |                      |    approval request   |                    |
  |                      |                       |--- approval chain  |
  |                      |                       |   (lead/security/  |
  |                      |                       |    change board)   |
  |                      |<-- approval callback -|                    |
  |                      |  (merge unlock)       |                    |
  |--- merge ----------->|                       |                    |
  |                      |<------------------ sync (poll/webhook) ----|
  |                      |                       |                    |
  |                      |  apply to prod cluster + attach the apply  |
  |                      |  result to the approval document (evidence)|

Three implementation points:

Protected production branches: merging requires a status check that confirms the approval state.
Emergency change (break-glass) path: predefine an act-first-approve-later route for incidents, with automatic post-hoc approval drafting and notification on use.
Drift detection: Argo CD reports any live state that differs from Git as OutOfSync, selfHeal reverts it, and an alert on that event surfaces "changes that are not in Git" immediately, which doubles as evidence for the unauthorized-change-detection control.
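
The drift-detection point can be sketched as an Argo CD Application with automated sync. The repository URL, path, project, and application name are hypothetical; the syncPolicy fields are standard Argo CD settings.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-core                    # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.bank.internal/platform/payment-core-deploy.git  # hypothetical
    targetRevision: main                # the protected production branch
    path: overlays/prod                 # hypothetical path
  destination:
    server: https://kubernetes.default.svc
    namespace: payment-core
  syncPolicy:
    automated:
      prune: true                       # remove resources deleted from Git
      selfHeal: true                    # revert manual changes made in the cluster
```

With selfHeal on, a manual kubectl edit in production is overwritten by the Git state, so the break-glass path needs an explicit way to pause sync for the affected application during an incident.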

Availability Design — DR Speaks in Tiers

The RTO/RPO Tier System

Financial DR is not one-size-fits-all; it is tiered. A typical tier table aligned with criticality assessment:

DR tier	Examples	RTO target	RPO target	Strategy
Tier 1	Core banking, payments	Minutes	Zero (no loss)	Active-active or sync-replicated hot standby
Tier 2	Channels, authentication	Within 30 min	Minutes	Warm standby, async replication + auto failover
Tier 3	Analytics, internal apps	Within 4 hours	Within 1 hour	Cold standby, automated restore from backup
Tier 4	Dev/test	Within 24 hours	24 hours	Backups only, rebuild

Kubernetes-level implementation splits by layer:

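Cluster topology: Tier 1 and 2 require multi-AZ node distribution as a baseline. topologySpreadConstraints spreads replicas across zones so that enough capacity survives a single-AZ failure, while PodDisruptionBudget caps voluntary disruptions such as node drains and upgrades. A PDB does not protect against the zone failure itself.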
Cross-region DR: run independent clusters in the primary and DR sites (or regions) and apply identical manifests to both via GitOps. The essence of cloud native DR is not "recovering a cluster" but "reproducing the same state anywhere."
Stateful data: the hard part is data. Let databases rely on storage/DB-layer replication; keeping mostly stateless workloads on Kubernetes is the safe initial strategy.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: channel-api
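spec:
  replicas: 6
  selector:                  # required by apps/v1; must match the template labels
    matchLabels:
      app: channel-api
  template:
    metadata:
      labels:
        app: channel-api     # the spread constraint and the PDB select on this label
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: channel-api
      containers:
        - name: channel-api
          image: registry.bank.internal/apps/channel-api:1.4.2   # hypothetical image reference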
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: channel-api-pdb
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: channel-api

DR Drills as a Platform Feature

Financial institutions are obligated to run regular DR drills (typically at least annually, more often for core systems). Make this a repeatable platform capability, not a manual event.

Script the DNS/GSLB failover, DR cluster scale-out, and data consistency verification, and execute the identical procedure every drill.
Automatically collect drill results (failover duration, data loss, failed items) and generate the report. That report becomes evidence for supervisory reporting and internal audit.
Offset cost by using the DR cluster for analytics batch or test workloads during normal operation, but only workloads that can be evicted instantly when a drill or a real failover begins.

Identity and Access Control — People Are the Biggest Variable

HR System Integration

A perennial audit finding in finance: "accounts of departed and transferred staff not revoked." Kubernetes access is no exception, so the HR system must be the source of truth for entitlements.

HR system ---> IdP (Keycloak/AD) ---> OIDC ---> kube-apiserver
     |                |                              |
     |  hire/transfer |  group membership            |  RBAC group bindings
     |  /leave events |  auto-refresh                |
     +--------------->|  (dept code = IdP group)     |
                      +--- auto-lock accounts idle 90d

Static tokens and client certificates embedded in kubeconfig files are effectively irrevocable (Kubernetes cannot revoke an issued client certificate before it expires), so ban them and standardize on short-lived OIDC tokens. Bind RBAC to IdP groups rather than individuals so entitlements follow personnel moves automatically.
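
On a self-managed control plane, OIDC is enabled through kube-apiserver flags. The fragment below is a sketch; the issuer URL, client ID, and claim names are assumptions that depend on how the IdP is configured. Managed services configure OIDC through their own identity-provider settings.

```yaml
# fragment of a kube-apiserver static Pod manifest (self-managed control plane)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --oidc-issuer-url=https://idp.bank.internal/realms/bank   # hypothetical issuer
        - --oidc-client-id=kubernetes                               # hypothetical client ID
        - --oidc-username-claim=preferred_username                  # assumed claim name
        - --oidc-username-prefix=idp:
        - --oidc-groups-claim=groups                                # assumed claim name
        - --oidc-groups-prefix=idp:   # an IdP group platform-ops becomes idp:platform-ops
```

The prefixes keep IdP-sourced identities from colliding with built-in names such as system:masters, and they make the origin of a subject visible in both RBAC bindings and audit logs.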

The PIM (Privileged Identity Management) Pattern

The rule for production clusters: zero standing admins. Apply Just-in-Time elevation — request when needed, receive time-boxed privileges on approval.

# steady state: operators belong only to a read-only group
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ops-readonly
subjects:
  - kind: Group
    name: idp:platform-ops
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io

[Elevation flow]
Operator --> PIM portal: reason + target namespace + duration (max 4h)
         --> approval by a second person --> automation creates a temporary RoleBinding
         --> controller deletes the RoleBinding at expiry
         --> all kubectl commands during elevation are tagged and reviewed in audit logs
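
The temporary RoleBinding that the automation creates might look like the sketch below. Kubernetes has no built-in expiry for RoleBindings, so the expiry is carried in an annotation that a cleanup controller is assumed to read; the annotation keys, the binding name, the user, the timestamp, and the ticket ID are all hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jit-elevation-req-1234                     # hypothetical, one binding per request
  namespace: payment-core                          # elevation is scoped to one namespace
  annotations:
    pim.bank.example.com/expires-at: "2026-01-15T06:00:00Z"   # hypothetical key and value
    pim.bank.example.com/approved-by: "idp:second.approver"   # hypothetical key and value
    pim.bank.example.com/request-id: "req-1234"               # hypothetical key and value
subjects:
  - kind: User
    name: idp:jane.doe                             # hypothetical operator identity
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                                       # built-in role, granted only in this namespace
  apiGroup: rbac.authorization.k8s.io
```

Binding the built-in edit ClusterRole through a namespaced RoleBinding keeps the elevation scoped to the requested namespace, and the request ID gives reviewers a key for correlating audit log entries with the approval record.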

If session recording is required, place an access proxy such as Teleport in front so that even kubectl exec sessions leave recorded evidence.

Network Policy — Start from Default Deny

If network segregation is the macro boundary, NetworkPolicy is the micro boundary inside the cluster. The principle is simple: lay down default deny, then whitelist only required flows.

# Step 1: namespace default deny (Ingress + Egress)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payment-core
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
---
# Step 2: allow DNS (the first thing that breaks under egress deny)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payment-core
spec:
  podSelector: {}
  policyTypes: ["Egress"]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
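
Once default deny and DNS are in place, each required flow is added as its own named policy. The following sketch allows a single flow into payment-core; the policy name and the app: payment-api label are hypothetical, while the channel-api namespace and the transfer-service label follow the names used elsewhere in this article.

```yaml
# Step 3 (sketch): allow one named flow into payment-core on TCP 8443
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-transfer-service     # hypothetical name
  namespace: payment-core
spec:
  podSelector:
    matchLabels:
      app: payment-api                  # hypothetical label on the receiving pods
  policyTypes: ["Ingress"]
  ingress:
    - from:
        # namespaceSelector and podSelector in the same entry are ANDed together
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: channel-api
          podSelector:
            matchLabels:
              app: transfer-service
      ports:
        - protocol: TCP
          port: 8443
```

If the calling namespace is also under egress default deny, it needs a matching egress rule; one policy per flow keeps each allowance individually reviewable in the change-approval process.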

For CNI, Cilium and Calico are the two leaders. Cilium excels at L7 policy (control per HTTP method and path) and Hubble-based flow visibility — flow logs answer the audit question "what did this pod actually talk to?" Calico brings a mature policy model, BGP integration, and policy recommendation in its enterprise edition. Either way, the recommended sequence is to collect flows in monitoring mode first, draft the allowlist from observed traffic, then switch to enforce mode.

Egress toward core banking should be especially narrow. With Cilium, FQDN policies can open only specific interface domains.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-core-banking-if
  namespace: channel-api
spec:
  endpointSelector:
    matchLabels:
      app: transfer-service
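  egress:
    # toFQDNs depends on Cilium's DNS proxy observing the lookup, so the same
    # policy must allow DNS egress with an L7 dns rule for that name
    # (assumes the cluster DNS pods carry the k8s-app=kube-dns label)
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchName: "mca-gateway.core.bank.internal"
    - toFQDNs:
        - matchName: "mca-gateway.core.bank.internal"
      toPorts:
        - ports:
            - port: "8443"
              protocol: TCP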

Observability and Automated Compliance Reporting

The observability stack (Prometheus, Loki, Tempo, Grafana) is broadly the same as in other industries, but finance adds two requirements.

Reporting automation. The metrics that feed supervisory reports and internal inspections (availability, incident counts, change counts, vulnerability remediation rate) must be produced as scheduled deliverables, not just dashboards. Generating monthly PDF/CSV reports via Grafana reporting or batch jobs and auto-drafting them into the approval system removes a recurring manual task.
Continuous compliance checks. Run CIS Kubernetes Benchmark checks (kube-bench), runtime anomaly detection (Falco), and policy violation status (Kyverno PolicyReport) on a schedule and store results as time series. Shift from "we check once a year for the inspection" to "continuous checks, submit history at inspection time."

[Evidence pipeline]
kube-bench (weekly) --+
Falco events (24/7) --+--> log pipeline --> WORM storage (long-term)
PolicyReport (24/7) --+         |
audit log (24/7) -----+         +--> monthly compliance report job --> e-approval draft
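
The monthly compliance report job in the pipeline above can be scheduled as an ordinary CronJob. This is only a skeleton under stated assumptions: the namespace, image, and command-line arguments are hypothetical, and the report generator itself is whatever tooling the institution builds or buys.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: monthly-compliance-report
  namespace: platform-compliance        # hypothetical namespace
spec:
  schedule: "0 6 1 * *"                 # 06:00 on the first day of each month
  concurrencyPolicy: Forbid             # never run two report jobs in parallel
  successfulJobsHistoryLimit: 12        # keep a year of job records as run evidence
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: report
              image: registry.bank.internal/platform/compliance-report:1.0.0  # hypothetical
              # hypothetical arguments understood by the hypothetical report tool
              args: ["--period=previous-month", "--output=pdf,csv"]
```

Running the report as a cluster workload means the report job itself is subject to the same policy gates, audit logging, and GitOps change control as everything else.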

Legacy Integration — The Boundary Between MCA/ESB and the Service Mesh

Financial institutions already have an integration layer: MCA (Multi Channel Architecture) or an ESB (Enterprise Service Bus). In most cases, new services on Kubernetes cannot bypass this legacy integration layer when talking to core banking.

+--------------------------- internal cluster ---------------------------+
|                                                                        |
|  [channel svc A] --\                                                   |
|  [channel svc B] ---+--> [mesh egress gateway] ---+                    |
|  [channel svc C] --/      (mTLS termination,      |                    |
|                            traffic control)       |                    |
+---------------------------------------------------|--------------------+
                                                    v
                                      [adapter service (protocol translation)]
                                       REST/gRPC <-> fixed-length messages
                                                    |
                                             [MCA / ESB]
                                                    |
                                       +------------+-----------+
                                       | Core banking (ledger,  |
                                       | lending, deposits)     |
                                       | External IF (clearing, |
                                       | VAN networks)          |
                                       +------------------------+

Design principles:

The mesh ends at the cluster. Apply Istio/Linkerd mTLS and traffic policy only inside the cluster and up to the egress gateway; respect the existing message-level security regime (link encryption, message integrity checks) on the MCA/ESB segment. Attempts to extend the mesh into legacy rarely pay off.
Concentrate translation in adapter services. Legacy peculiarities — fixed-length messages, EBCDIC conversion, transaction-code mapping in message headers — belong in one adapter. If channel services learn the message format, legacy leaks into the cluster.
Timeouts and circuit breakers follow core banking conventions. Design mesh retry policies so they never violate the core banking response contract (e.g. 3-second response, send a reversal message on timeout). Indiscriminate automatic retries lead directly to duplicate-transaction incidents. Verify idempotency keys and reversal (compensation) message design first.
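
With Istio, the retry principle can be sketched as a VirtualService that disables automatic retries and caps the request at the core banking response contract. The resource name and adapter host are hypothetical, and the 3-second value simply mirrors the example contract mentioned above.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: core-adapter-no-retry           # hypothetical name
  namespace: channel-api
spec:
  hosts:
    - core-adapter.integration.svc.cluster.local   # hypothetical adapter service
  http:
    - route:
        - destination:
            host: core-adapter.integration.svc.cluster.local
      timeout: 3s                       # matches the example 3-second response contract
      retries:
        attempts: 0                     # disable mesh-level automatic retries
```

Istio retries failed requests by default, so leaving the retries block out is not neutral; setting attempts to 0 makes the timeout-then-reversal logic in the application the only retry authority.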

Adoption Roadmap — From Analytics to External Interfaces

Big-bang migration is not an option in finance. Expand in stages from the lowest-risk areas.

Stage	Target	Duration (example)	Goals and validation items
0	Platform build-out	3 to 6 months	Cluster standards, registry, GitOps, policy gates, observability
1	Analytics (MIS)	6 months	Migrate batch/analytics, validate procedures and evidence
2	Internal business apps	6 months	Internal portals/APIs, prove HR-integrated access control
3	Channel systems	6 to 12 months	Mobile/internet banking APIs, zero-downtime deploys, Tier 2 DR
4	External interfaces	6 to 12 months	Clearing/VAN adapters, DMZ cluster, security review
5	Core banking periphery	Ongoing	Read APIs, then auxiliary services, then new ledgers

The trick is to define each stage's exit criteria not as "number of workloads" but as operational capability provable by evidence. Stage 1 should not exit on "50 analytics batches migrated" but on "one failure drill, one DR failover, and one audit-log-based change-trace demonstration completed."

Checklist

A checklist for adoption and operations reviews.

Regulation and governance

Has the cloud criticality assessment been performed and encoded as namespace labels?
For critical workloads, were committee deliberation and prior FSS reporting completed?
Are change approval and GitOps merges systematically linked?
Are break-glass procedures and post-hoc approval paths defined?

Architecture

Are DMZ/internal clusters physically separated with minimal firewall whitelists?
Is there a documented cluster/namespace split standard based on data class and org boundaries?
Are per-tier RTO/RPO defined, with multi-AZ spread and PDBs applied?
Are DR drills scripted and repeatable?

Security

Is PSA restricted enforced on all production namespaces?
Are images outside the air-gapped registry blocked by policy?
Are image signatures and scan attestations verified at admission?
Is NetworkPolicy default deny applied across all namespaces?
Is the number of standing admins zero, and does JIT elevation work in practice?

Audit and observability

Are audit logs retained beyond the policy period in tamper-resistant storage?
Is cluster access automatically revoked when staff leave or transfer?
Do compliance checks (kube-bench, policy reports) run continuously with history?
Is supervisory-report metric production automated?

Pitfalls and Anti-Patterns

Overusing "regulation forbids it." Checking the actual text often reveals it is possible with procedure. Read the regulation and official interpretations directly; do not mistake vague convention for regulation.
Namespace maximalism. Cramming workloads from different network zones into one cluster means redrawing the whole architecture at inspection time.
Workloads before policy gates. If you wedge policies in after migration starts, existing workloads land in mass violation and enforce mode becomes impossible. Gates first.
Leaving audit logs on nodes. Node failure or compromise destroys the evidence with it. Real-time external shipping is the baseline.
DR that exists only on paper. A DR you have never actually failed over to in a drill is no DR at all. Invest in drill automation.
Stretching the mesh over legacy. Respect existing controls on the MCA/ESB segment and draw the boundary at the adapter — that is the pragmatic answer.
Neglected retry policies. When mesh/client auto-retries collide with core banking idempotency design, you get duplicate transfers — the most expensive bug in financial systems.

Closing

The essence of a financial-grade Kubernetes platform is "translating regulatory requirements into code." Network segregation becomes cluster topology; change control becomes GitOps wired to approvals; access control becomes HR-integrated OIDC with JIT elevation; audit readiness becomes a WORM evidence pipeline. In organizations where this translation is done well, regulation stops being the enemy of speed — it becomes the guardrail within which you can actually move faster.

Regulation and cloud native can coexist. But that coexistence does not happen by itself; it becomes possible only when the platform team can read the regulation document and the YAML at the same time.