- Published on
AI Governance & Responsible AI: EU AI Act, XAI, Bias Detection, and AI Safety Techniques
- Authors

- Name
- Youngju Kim
- @fjvbn20031
Table of Contents
- AI Governance Frameworks Overview
- EU AI Act: Risk Classification System
- NIST AI RMF & ISO 42001
- Responsible AI Development Principles
- Bias Detection & Mitigation
- Explainable AI (XAI)
- AI Safety Techniques
- Data Privacy Technologies
- AI Regulatory Practice
- Quiz
AI Governance Frameworks Overview
As AI systems are deployed across society, the need for governance frameworks has grown rapidly. AI Governance refers to the totality of policies, procedures, and technologies that manage risks throughout the development, deployment, and operation of AI systems — ensuring alignment with societal values and legal requirements.
Key global frameworks:
| Framework | Authority | Core Characteristics |
|---|---|---|
| EU AI Act | European Union | Legally binding, risk-based approach |
| NIST AI RMF | US NIST | Voluntary guidance, risk management |
| ISO 42001 | ISO/IEC | Certifiable AI management system |
| G7 AI Principles | G7 Nations | International cooperation, non-binding |
| UNESCO AI Ethics Recommendation | UNESCO | Human rights-centered, global scope |
EU AI Act: Risk Classification System
The EU AI Act, which entered into force in 2024, is the world's first comprehensive AI legislation. It adopts a risk-based approach, classifying AI systems into four risk tiers.
Risk Tier Classification
1. Unacceptable Risk — Prohibited
- Real-time remote biometric identification in public spaces (e.g., CCTV facial recognition)
- Social scoring systems
- Subliminal manipulation techniques targeting vulnerable groups
- Predictive policing at the individual level
2. High-Risk — Strict Obligations
- Medical diagnosis assistance and medical device software
- Autonomous vehicles and critical infrastructure
- Recruitment and personnel evaluation systems
- Credit scoring and insurance underwriting
- Judiciary and law enforcement support tools
- Educational assessment systems
3. Limited Risk — Transparency Obligations
- Chatbots: must disclose that the user is interacting with AI
- Deepfake content: must be labeled as synthetic
- Emotion recognition systems: must disclose usage
4. Minimal Risk — Self-Regulation
- Spam filters, game AI
- AI-based inventory management, etc.
EU AI Act Risk Classifier Implementation
from dataclasses import dataclass
from enum import Enum
from typing import List
class RiskLevel(Enum):
UNACCEPTABLE = "Unacceptable (Prohibited)"
HIGH = "High-Risk (Strict Regulation)"
LIMITED = "Limited Risk (Transparency Obligations)"
MINIMAL = "Minimal Risk (Self-Regulation)"
@dataclass
class AISystemProfile:
name: str
uses_biometric: bool
is_real_time: bool
public_space: bool
domain: str # healthcare, hiring, credit, education, judiciary, infrastructure
interacts_with_humans: bool
generates_synthetic_content: bool
def classify_eu_ai_act_risk(system: AISystemProfile) -> tuple[RiskLevel, List[str]]:
"""
EU AI Act risk classifier.
Returns (RiskLevel, list_of_applicable_obligations)
"""
obligations = []
# Step 1: Check for unacceptable risk
if (system.uses_biometric and system.is_real_time and system.public_space):
return RiskLevel.UNACCEPTABLE, ["Cease operations immediately", "Legally prohibited"]
# Step 2: Check for high-risk domains
HIGH_RISK_DOMAINS = {
"healthcare", "hiring", "credit",
"education", "judiciary", "critical_infrastructure"
}
if system.domain in HIGH_RISK_DOMAINS:
obligations = [
"Mandatory Conformity Assessment",
"Technical documentation obligation",
"Human oversight mechanisms required",
"Transparency and logging requirements",
"Bias testing and data governance",
"Registration in EU database",
]
return RiskLevel.HIGH, obligations
# Step 3: Limited risk
if system.interacts_with_humans or system.generates_synthetic_content:
obligations = [
"Disclose AI system status to users",
"Watermark or label synthetic content",
]
return RiskLevel.LIMITED, obligations
# Step 4: Minimal risk
return RiskLevel.MINIMAL, ["Voluntary code of conduct recommended"]
# Usage example
credit_scoring_system = AISystemProfile(
name="Automated Credit Scoring AI",
uses_biometric=False,
is_real_time=False,
public_space=False,
domain="credit",
interacts_with_humans=False,
generates_synthetic_content=False,
)
risk_level, obligations = classify_eu_ai_act_risk(credit_scoring_system)
print(f"System: {credit_scoring_system.name}")
print(f"Risk Level: {risk_level.value}")
print("Obligations:")
for ob in obligations:
print(f" - {ob}")
NIST AI RMF & ISO 42001
NIST AI Risk Management Framework
The NIST AI RMF (2023) is structured around four core functions:
- GOVERN: Establish AI risk management culture and policies
- MAP: Identify and categorize AI risk context
- MEASURE: Analyze, evaluate, and quantify risks
- MANAGE: Respond to risks based on priority
ISO/IEC 42001: AI Management System
ISO 42001 is a management system standard for organizations to develop and deploy AI responsibly. Like ISO 9001 (quality) or ISO 27001 (security), it can be certified by third parties.
Core requirements:
- Establish AI policies and objectives
- Clarify leadership responsibilities
- Assess risks and opportunities
- Conduct AI impact assessments
- Perform internal audits and continuous improvement
Responsible AI Development Principles
The FATE Framework
Fairness: Treat similar people similarly. Do not disadvantage particular groups.
Accountability: Clarify responsibility for decisions. "Who is accountable for this decision?"
Transparency: Disclose how AI systems work, what data they were trained on, and their limitations.
Explainability: Explain the reasoning behind individual predictions in human-understandable terms.
G7 Hiroshima AI Principles (2023)
- Rule of law and respect for human rights
- Transparency and explainability
- Fairness and non-discrimination
- Human oversight and control
- Privacy protection
- Cybersecurity
- Information sharing and incident reporting
Bias Detection & Mitigation
AI model bias originates from historical inequalities in training data, feature selection errors, labeling mistakes, and feedback loops.
Key Fairness Metrics
Demographic Parity (Statistical Parity): The positive prediction rate must be equal across protected groups. P(Y_hat=1 | A=0) = P(Y_hat=1 | A=1)
Equal Opportunity: The true positive rate (TPR) must be equal across protected groups. P(Y_hat=1 | Y=1, A=0) = P(Y_hat=1 | Y=1, A=1)
Calibration: Predicted probabilities must match actual positive rates (per group).
Individual Fairness: Similar individuals should be treated similarly.
Bias Detection with AIF360
import numpy as np
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric
from aif360.algorithms.preprocessing import Reweighing
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
# 1. Prepare data (loan approval scenario)
np.random.seed(42)
n = 1000
data = pd.DataFrame({
'income': np.random.normal(50000, 20000, n).clip(10000, 150000),
'credit_score': np.random.normal(680, 100, n).clip(300, 850),
'age': np.random.randint(20, 70, n),
'gender': np.random.choice([0, 1], n, p=[0.5, 0.5]), # 0=female, 1=male
'loan_approved': np.zeros(n, dtype=int)
})
# Inject artificial bias: males have higher approval probability
prob = 0.3 + 0.2 * data['gender'] + 0.3 * (data['credit_score'] > 700).astype(int)
data['loan_approved'] = (np.random.random(n) < prob).astype(int)
# 2. Create AIF360 dataset
aif_dataset = BinaryLabelDataset(
df=data,
label_names=['loan_approved'],
protected_attribute_names=['gender'],
favorable_label=1,
unfavorable_label=0,
)
# 3. Measure bias
privileged_groups = [{'gender': 1}] # male
unprivileged_groups = [{'gender': 0}] # female
dataset_metric = BinaryLabelDatasetMetric(
aif_dataset,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups,
)
print("=== Original Data Bias Analysis ===")
print(f"Disparate Impact: {dataset_metric.disparate_impact():.4f}")
print(f"Statistical Parity Difference: {dataset_metric.statistical_parity_difference():.4f}")
# Disparate Impact < 0.8 → 80% rule violation (bias detected)
# 4. Preprocessing bias mitigation with Reweighing
rw = Reweighing(
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups,
)
dataset_reweighed = rw.fit_transform(aif_dataset)
metric_reweighed = BinaryLabelDatasetMetric(
dataset_reweighed,
unprivileged_groups=unprivileged_groups,
privileged_groups=privileged_groups,
)
print("\n=== After Reweighing ===")
print(f"Disparate Impact: {metric_reweighed.disparate_impact():.4f}")
print(f"Statistical Parity Difference: {metric_reweighed.statistical_parity_difference():.4f}")
Post-processing Mitigation with Fairlearn
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import MetricFrame, selection_rate, demographic_parity_difference
from sklearn.ensemble import GradientBoostingClassifier
# Train model
X = data[['income', 'credit_score', 'age']].values
y = data['loan_approved'].values
sensitive = data['gender'].values
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
base_model = GradientBoostingClassifier(n_estimators=100, random_state=42)
base_model.fit(X_scaled, y)
# ThresholdOptimizer: optimize decision thresholds per group
postprocess_est = ThresholdOptimizer(
estimator=base_model,
constraints="demographic_parity",
predict_method="predict_proba",
objective="balanced_accuracy_score",
)
postprocess_est.fit(X_scaled, y, sensitive_features=sensitive)
y_pred_fair = postprocess_est.predict(X_scaled, sensitive_features=sensitive)
# Measure fairness metrics
mf = MetricFrame(
metrics={"selection_rate": selection_rate},
y_true=y,
y_pred=y_pred_fair,
sensitive_features=sensitive,
)
print("\n=== Fairlearn Post-processing Results ===")
print(f"Selection rate by group:\n{mf.by_group}")
print(f"Demographic Parity Difference: {demographic_parity_difference(y, y_pred_fair, sensitive_features=sensitive):.4f}")
Explainable AI (XAI)
SHAP: SHapley Additive exPlanations
SHAP leverages Shapley values from cooperative game theory to quantify each feature's contribution to a prediction. It computes the average marginal contribution of a feature across all possible feature subsets.
import shap
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
# Train model
X_train, y_train = make_classification(
n_samples=500, n_features=8, n_informative=5, random_state=42
)
feature_names = [
'income', 'credit_score', 'age', 'debt_ratio',
'employment_years', 'num_accounts', 'late_payments', 'loan_amount'
]
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)
# SHAP TreeExplainer (tree-specific, fast)
explainer = shap.TreeExplainer(rf_model)
shap_values = explainer.shap_values(X_train)
# Individual prediction explanation (Waterfall Plot)
sample_idx = 0
shap.waterfall_plot(
shap.Explanation(
values=shap_values[1][sample_idx],
base_values=explainer.expected_value[1],
data=X_train[sample_idx],
feature_names=feature_names,
)
)
# Global importance (Summary Plot)
shap.summary_plot(shap_values[1], X_train, feature_names=feature_names)
# SHAP interaction effects
shap_interaction = explainer.shap_interaction_values(X_train[:100])
print(f"Income-CreditScore interaction SHAP: {shap_interaction[0, 0, 1]:.4f}")
LIME: Local Interpretable Model-agnostic Explanations
import lime
import lime.lime_tabular
import numpy as np
# Create LIME explainer
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
training_data=X_train,
feature_names=feature_names,
class_names=['Rejected', 'Approved'],
mode='classification',
discretize_continuous=True,
)
# Explain individual sample
explanation = lime_explainer.explain_instance(
data_row=X_train[0],
predict_fn=rf_model.predict_proba,
num_features=6,
num_samples=1000,
)
print("=== LIME Explanation (Sample #0) ===")
for feature, weight in explanation.as_list():
direction = "increases" if weight > 0 else "decreases"
print(f" {feature}: {weight:+.4f} ({direction} approval probability)")
explanation.show_in_notebook(show_table=True)
Generating a Model Card
import json
from datetime import datetime
def generate_model_card(
model_name: str,
version: str,
intended_use: str,
out_of_scope_uses: list,
training_data: dict,
evaluation_results: dict,
fairness_analysis: dict,
limitations: list,
ethical_considerations: list,
) -> dict:
"""Standard model card generator (based on Mitchell et al. 2019)."""
model_card = {
"model_details": {
"name": model_name,
"version": version,
"date": datetime.now().strftime("%Y-%m-%d"),
"type": "Binary Classifier",
"paper": "https://arxiv.org/abs/1810.03993",
},
"intended_use": {
"primary_uses": intended_use,
"primary_users": ["Credit officers", "Financial regulators"],
"out_of_scope_uses": out_of_scope_uses,
},
"factors": {
"relevant_factors": ["gender", "age_group", "income_bracket"],
"evaluation_factors": ["demographic_parity", "equal_opportunity"],
},
"metrics": {
"performance_measures": evaluation_results,
"decision_thresholds": {"default": 0.5, "high_precision": 0.7},
},
"training_data": training_data,
"fairness_analysis": fairness_analysis,
"limitations": limitations,
"ethical_considerations": ethical_considerations,
"caveats_recommendations": [
"Regular drift monitoring recommended",
"Quarterly bias re-evaluation required",
"Human review required for high-stakes decisions",
],
}
return model_card
card = generate_model_card(
model_name="Personal Loan Approval Model v2.1",
version="2.1.0",
intended_use="Automated initial screening for personal loan applications",
out_of_scope_uses=["Corporate loan assessment", "Insurance pricing", "Employment decisions"],
training_data={"size": 50000, "period": "2020-2024", "source": "Internal loan history"},
evaluation_results={"accuracy": 0.84, "AUC": 0.91, "F1": 0.82},
fairness_analysis={
"demographic_parity_diff": 0.03,
"equal_opportunity_diff": 0.02,
"disparate_impact": 0.96,
},
limitations=["Pre-2020 data not included", "Rural region underrepresentation"],
ethical_considerations=["Final decisions must be reviewed by human officers", "Mandatory disclosure of rejection reasons"],
)
print(json.dumps(card, indent=2))
AI Safety Techniques
Constitutional AI (Anthropic)
Constitutional AI trains models to critique and revise their own responses according to a set of explicit principles (the "constitution").
How it works:
- Model generates a potentially harmful response
- Model performs self-critique based on constitutional principles
- Model revises the response to comply with principles
- Revised responses are used to train via RLHF
RLHF (Reinforcement Learning from Human Feedback)
1. SFT (Supervised Fine-Tuning): Fine-tune base model on high-quality demonstration data
2. Reward Modeling: Train reward model on human preference pairs (preferred vs. rejected)
3. RL Optimization: Maximize reward with PPO algorithm (with KL divergence constraint)
Jailbreak Defense Techniques
- Input filtering: Detect and block harmful patterns before processing
- Prompt injection defense: Isolate system prompts from user inputs
- Output monitoring: Real-time safety checks on generated text
- Red teaming: Expert adversarial teams systematically probe for vulnerabilities
AI Watermarking
Text watermarking inserts statistically detectable patterns into LLM-generated text.
import hashlib
import random
def green_red_watermark(text: str, key: str, gamma: float = 0.25) -> dict:
"""
Green/red list watermarking based on Kirchenbauer et al. (2023).
Uses the previous token as a seed to classify tokens as green or red,
preferring green tokens during generation to embed a watermark.
"""
words = text.split()
green_count = 0
total = len(words)
for i, word in enumerate(words):
prev_token = words[i - 1] if i > 0 else "<s>"
seed = int(hashlib.sha256(f"{key}{prev_token}".encode()).hexdigest(), 16) % (2**32)
random.seed(seed)
is_green = random.random() > (1 - gamma)
if is_green:
green_count += 1
z_score = (green_count - gamma * total) / ((gamma * (1 - gamma) * total) ** 0.5 + 1e-9)
return {
"green_token_ratio": green_count / total,
"z_score": z_score,
"is_watermarked": z_score > 4.0,
}
Data Privacy Technologies
Differential Privacy
Differential privacy adds noise to databases to statistically conceal whether any individual record is included. Smaller epsilon values provide stronger privacy guarantees.
import torch
import torch.nn as nn
from opacus import PrivacyEngine
from torch.utils.data import DataLoader, TensorDataset
# Define model
class SimpleNet(nn.Module):
def __init__(self):
super().__init__()
self.fc = nn.Sequential(
nn.Linear(10, 64),
nn.ReLU(),
nn.Linear(64, 2),
)
def forward(self, x):
return self.fc(x)
# Synthetic data
X = torch.randn(1000, 10)
y = torch.randint(0, 2, (1000,))
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=64, shuffle=True)
model = SimpleNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Apply Opacus PrivacyEngine
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private_with_epsilon(
module=model,
optimizer=optimizer,
data_loader=loader,
target_epsilon=1.0, # epsilon: smaller = stronger privacy
target_delta=1e-5, # delta: probability of epsilon violation
max_grad_norm=1.0, # gradient clipping threshold
epochs=10,
)
# Training loop
criterion = nn.CrossEntropyLoss()
for epoch in range(3):
for batch_X, batch_y in loader:
optimizer.zero_grad()
outputs = model(batch_X)
loss = criterion(outputs, batch_y)
loss.backward()
optimizer.step()
epsilon = privacy_engine.get_epsilon(delta=1e-5)
print(f"Training complete: epsilon = {epsilon:.2f}, delta = 1e-5")
print(f"Privacy guarantee: individual data contribution bounded by e^{epsilon:.2f}")
Federated Learning
Federated learning avoids sending raw data to a central server — clients share only locally trained model weights (gradients).
import copy
import numpy as np
def federated_averaging(global_model_weights, client_updates, client_data_sizes):
"""
FedAvg algorithm: weighted average aggregation based on data sizes.
"""
total_data = sum(client_data_sizes)
averaged_weights = {}
for key in global_model_weights.keys():
weighted_sum = sum(
client_updates[i][key] * (client_data_sizes[i] / total_data)
for i in range(len(client_updates))
)
averaged_weights[key] = weighted_sum
return averaged_weights
# GDPR AI compliance checklist
gdpr_ai_checklist = {
"Data Minimization": "Collect only the minimum data necessary for model training",
"Purpose Limitation": "Prohibit use of training data beyond its stated purpose",
"Data Subject Rights": "Guarantee the right to explanation for automated decisions (Article 22)",
"Profiling Restrictions": "Human review required for significant automated profiling decisions",
"Data Portability": "Right to receive personal data in a portable format",
"Right to Erasure": "Remove the influence of personal data from models (Machine Unlearning)",
}
for right, description in gdpr_ai_checklist.items():
print(f"[GDPR] {right}: {description}")
AI Regulatory Practice
Model Audit Process
- Define audit scope: Clarify the model, time period, and use case under review
- Document review: Examine training data provenance, model cards, system cards
- Technical testing: Bias measurement, robustness testing, adversarial attack simulation
- Stakeholder interviews: Operations team, affected group representatives, regulators
- Audit report: Document findings, risk ratings, and recommended actions
Composing an AI Ethics Committee
An effective AI ethics committee should include:
| Role | Required Competency |
|---|---|
| AI/ML technical expert | Understand how models work |
| Legal/compliance officer | Interpret regulatory requirements |
| Ethicist/philosopher | Mediate value conflicts |
| Domain expert | Provide application context |
| Affected group representative | Reflect real-world impacts |
| Cybersecurity expert | Assess security risks |
Risk Register Template
from dataclasses import dataclass, field
from typing import List
from enum import IntEnum
class Severity(IntEnum):
LOW = 1
MEDIUM = 2
HIGH = 3
CRITICAL = 4
class Likelihood(IntEnum):
RARE = 1
UNLIKELY = 2
POSSIBLE = 3
LIKELY = 4
@dataclass
class AIRisk:
risk_id: str
description: str
severity: Severity
likelihood: Likelihood
affected_groups: List[str]
mitigation: str
owner: str
residual_risk: str = "TBD"
@property
def risk_score(self) -> int:
return self.severity * self.likelihood
@property
def risk_level(self) -> str:
score = self.risk_score
if score >= 12:
return "CRITICAL"
elif score >= 8:
return "HIGH"
elif score >= 4:
return "MEDIUM"
return "LOW"
# Example risk register
risks = [
AIRisk(
risk_id="RISK-001",
description="Gender bias in credit model leading to discriminatory loan rejections",
severity=Severity.HIGH,
likelihood=Likelihood.POSSIBLE,
affected_groups=["Women", "Non-binary individuals"],
mitigation="Reweighing + quarterly disparate impact monitoring",
owner="AI Ethics Team",
),
AIRisk(
risk_id="RISK-002",
description="Inability to explain model decisions violating GDPR Article 22",
severity=Severity.CRITICAL,
likelihood=Likelihood.LIKELY,
affected_groups=["All loan applicants"],
mitigation="Build SHAP-based decision explanation system",
owner="Compliance Team",
),
]
print("=== AI Risk Register ===")
for risk in risks:
print(f"\n[{risk.risk_id}] {risk.description}")
print(f" Risk Level: {risk.risk_level} (Score: {risk.risk_score})")
print(f" Mitigation: {risk.mitigation}")
Quiz
Q1. Under the EU AI Act, what conditions classify a biometric system as High-Risk?
Answer: Real-time + public space + remote biometric identification — when all three conditions are met simultaneously, the system falls under Unacceptable Risk and is prohibited. Limited exceptions exist, such as law enforcement searching for missing children. Non-real-time or post-hoc biometric analysis, or biometric systems used in judiciary and border control contexts, are classified as High-Risk and subject to strict obligations including conformity assessments.
Explanation: EU AI Act Annex III explicitly lists remote biometric identification systems used in law enforcement, judiciary, and border management as high-risk AI. Real-time remote biometric identification in public spaces is principally prohibited under Article 5.
Q2. What is the difference between demographic parity and equal opportunity, and what trade-offs arise?
Answer: Demographic parity requires equal positive prediction rates across protected groups: P(Y_hat=1 | A=0) = P(Y_hat=1 | A=1). Equal opportunity requires equal true positive rates (TPR) for positive-outcome individuals: P(Y_hat=1 | Y=1, A=0) = P(Y_hat=1 | Y=1, A=1).
Explanation: Chouldechova's (2017) impossibility theorem shows that when base rates differ across groups, it is mathematically impossible to simultaneously satisfy demographic parity, equal opportunity, and predictive parity. Organizations must explicitly choose which fairness criterion to prioritize based on application context and the nature of potential harms.
Q3. What game-theoretic principle does SHAP use to calculate feature importance?
Answer: SHAP is based on Shapley values from cooperative game theory. Each feature is treated as a "player" and the model's prediction as the "payoff." It computes the average marginal contribution of each feature across all possible feature subsets (coalitions).
Explanation: Shapley values are the unique attribution method satisfying four axioms: efficiency (SHAP values sum to prediction minus expected value), symmetry, linearity, and the dummy feature property. Unlike LIME, SHAP guarantees global consistency. TreeSHAP computes values in O(TLD^2) time for tree-based models.
Q4. Why does a smaller epsilon value in differential privacy provide stronger privacy protection?
Answer: Epsilon defines an upper bound: "Including or excluding one data point can change the output distribution by at most e^epsilon." As epsilon approaches 0, the output distribution becomes nearly identical regardless of whether any individual record is included, preventing individual information from being inferred.
Explanation: Epsilon = 0 means perfect privacy (fully random output); large epsilon is practical but offers weaker protection. In practice, epsilon below 1 is considered strong privacy, and below 10 is considered practical privacy. Libraries like Opacus (PyTorch) and TensorFlow Privacy automatically compute the required noise scale and track epsilon.
Q5. What are the essential sections of a Model Card?
Answer: Model details (name, version, type), intended use and out-of-scope uses, evaluation factors (protected attributes), performance metrics (accuracy, AUC, etc.), training data description, fairness analysis results, limitations and caveats, and ethical considerations.
Explanation: Model cards, proposed by Mitchell et al. (2019), have become a transparency standard. Major organizations including Google and Hugging Face publish model cards with model releases. EU AI Act high-risk AI requires technical documentation under Annex IV that is substantially equivalent to a model card.