AI Governance & Responsible AI: EU AI Act, XAI, Bias Detection, and AI Safety Techniques

AI Governance Frameworks Overview
EU AI Act: Risk Classification System
NIST AI RMF & ISO 42001
Responsible AI Development Principles
Bias Detection & Mitigation
Explainable AI (XAI)
AI Safety Techniques
Data Privacy Technologies
AI Regulatory Practice
Quiz

AI Governance Frameworks Overview

As AI systems are deployed across society, the need for governance frameworks has grown rapidly. AI Governance refers to the totality of policies, procedures, and technologies that manage risks throughout the development, deployment, and operation of AI systems — ensuring alignment with societal values and legal requirements.

Key global frameworks:

Framework	Authority	Core Characteristics
EU AI Act	European Union	Legally binding, risk-based approach
NIST AI RMF	US NIST	Voluntary guidance, risk management
ISO 42001	ISO/IEC	Certifiable AI management system
G7 AI Principles	G7 Nations	International cooperation, non-binding
UNESCO AI Ethics Recommendation	UNESCO	Human rights-centered, global scope

EU AI Act: Risk Classification System

The EU AI Act, which entered into force in 2024, is the world's first comprehensive AI legislation. It adopts a risk-based approach, classifying AI systems into four risk tiers.

Risk Tier Classification

1. Unacceptable Risk — Prohibited

Real-time remote biometric identification in public spaces (e.g., CCTV facial recognition)
Social scoring systems
Subliminal manipulation techniques targeting vulnerable groups
Predictive policing at the individual level

2. High-Risk — Strict Obligations

Medical diagnosis assistance and medical device software
Autonomous vehicles and critical infrastructure
Recruitment and personnel evaluation systems
Credit scoring and insurance underwriting
Judiciary and law enforcement support tools
Educational assessment systems

3. Limited Risk — Transparency Obligations

Chatbots: must disclose that the user is interacting with AI
Deepfake content: must be labeled as synthetic
Emotion recognition systems: must disclose usage

4. Minimal Risk — Self-Regulation

Spam filters, game AI
AI-based inventory management, etc.

EU AI Act Risk Classifier Implementation

from dataclasses import dataclass
from enum import Enum
from typing import List

class RiskLevel(Enum):
    UNACCEPTABLE = "Unacceptable (Prohibited)"
    HIGH = "High-Risk (Strict Regulation)"
    LIMITED = "Limited Risk (Transparency Obligations)"
    MINIMAL = "Minimal Risk (Self-Regulation)"

@dataclass
class AISystemProfile:
    name: str
    uses_biometric: bool
    is_real_time: bool
    public_space: bool
    domain: str  # healthcare, hiring, credit, education, judiciary, infrastructure
    interacts_with_humans: bool
    generates_synthetic_content: bool

def classify_eu_ai_act_risk(system: AISystemProfile) -> tuple[RiskLevel, List[str]]:
    """
    EU AI Act risk classifier.
    Returns (RiskLevel, list_of_applicable_obligations)
    """
    obligations = []

    # Step 1: Check for unacceptable risk
    if (system.uses_biometric and system.is_real_time and system.public_space):
        return RiskLevel.UNACCEPTABLE, ["Cease operations immediately", "Legally prohibited"]

    # Step 2: Check for high-risk domains
    HIGH_RISK_DOMAINS = {
        "healthcare", "hiring", "credit",
        "education", "judiciary", "critical_infrastructure"
    }
    if system.domain in HIGH_RISK_DOMAINS:
        obligations = [
            "Mandatory Conformity Assessment",
            "Technical documentation obligation",
            "Human oversight mechanisms required",
            "Transparency and logging requirements",
            "Bias testing and data governance",
            "Registration in EU database",
        ]
        return RiskLevel.HIGH, obligations

    # Step 3: Limited risk
    if system.interacts_with_humans or system.generates_synthetic_content:
        obligations = [
            "Disclose AI system status to users",
            "Watermark or label synthetic content",
        ]
        return RiskLevel.LIMITED, obligations

    # Step 4: Minimal risk
    return RiskLevel.MINIMAL, ["Voluntary code of conduct recommended"]


# Usage example
credit_scoring_system = AISystemProfile(
    name="Automated Credit Scoring AI",
    uses_biometric=False,
    is_real_time=False,
    public_space=False,
    domain="credit",
    interacts_with_humans=False,
    generates_synthetic_content=False,
)

risk_level, obligations = classify_eu_ai_act_risk(credit_scoring_system)
print(f"System: {credit_scoring_system.name}")
print(f"Risk Level: {risk_level.value}")
print("Obligations:")
for ob in obligations:
    print(f"  - {ob}")

NIST AI RMF & ISO 42001

NIST AI Risk Management Framework

The NIST AI RMF (2023) is structured around four core functions:

GOVERN: Establish AI risk management culture and policies
MAP: Identify and categorize AI risk context
MEASURE: Analyze, evaluate, and quantify risks
MANAGE: Respond to risks based on priority

ISO/IEC 42001: AI Management System

ISO 42001 is a management system standard for organizations to develop and deploy AI responsibly. Like ISO 9001 (quality) or ISO 27001 (security), it can be certified by third parties.

Core requirements:

Establish AI policies and objectives
Clarify leadership responsibilities
Assess risks and opportunities
Conduct AI impact assessments
Perform internal audits and continuous improvement

Responsible AI Development Principles

The FATE Framework

Fairness: Treat similar people similarly. Do not disadvantage particular groups.

Accountability: Clarify responsibility for decisions. "Who is accountable for this decision?"

Transparency: Disclose how AI systems work, what data they were trained on, and their limitations.

Explainability: Explain the reasoning behind individual predictions in human-understandable terms.

G7 Hiroshima AI Principles (2023)

Rule of law and respect for human rights
Transparency and explainability
Fairness and non-discrimination
Human oversight and control
Privacy protection
Cybersecurity
Information sharing and incident reporting

Bias Detection & Mitigation

AI model bias originates from historical inequalities in training data, feature selection errors, labeling mistakes, and feedback loops.

Key Fairness Metrics

Demographic Parity (Statistical Parity): The positive prediction rate must be equal across protected groups. P(Y_hat=1 | A=0) = P(Y_hat=1 | A=1)

Equal Opportunity: The true positive rate (TPR) must be equal across protected groups. P(Y_hat=1 | Y=1, A=0) = P(Y_hat=1 | Y=1, A=1)

Calibration: Predicted probabilities must match actual positive rates (per group).

Individual Fairness: Similar individuals should be treated similarly.

Bias Detection with AIF360

import numpy as np
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric
from aif360.algorithms.preprocessing import Reweighing
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# 1. Prepare data (loan approval scenario)
np.random.seed(42)
n = 1000
data = pd.DataFrame({
    'income': np.random.normal(50000, 20000, n).clip(10000, 150000),
    'credit_score': np.random.normal(680, 100, n).clip(300, 850),
    'age': np.random.randint(20, 70, n),
    'gender': np.random.choice([0, 1], n, p=[0.5, 0.5]),  # 0=female, 1=male
    'loan_approved': np.zeros(n, dtype=int)
})
# Inject artificial bias: males have higher approval probability
prob = 0.3 + 0.2 * data['gender'] + 0.3 * (data['credit_score'] > 700).astype(int)
data['loan_approved'] = (np.random.random(n) < prob).astype(int)

# 2. Create AIF360 dataset
aif_dataset = BinaryLabelDataset(
    df=data,
    label_names=['loan_approved'],
    protected_attribute_names=['gender'],
    favorable_label=1,
    unfavorable_label=0,
)

# 3. Measure bias
privileged_groups = [{'gender': 1}]    # male
unprivileged_groups = [{'gender': 0}]  # female

dataset_metric = BinaryLabelDatasetMetric(
    aif_dataset,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)

print("=== Original Data Bias Analysis ===")
print(f"Disparate Impact: {dataset_metric.disparate_impact():.4f}")
print(f"Statistical Parity Difference: {dataset_metric.statistical_parity_difference():.4f}")
# Disparate Impact < 0.8 → 80% rule violation (bias detected)

# 4. Preprocessing bias mitigation with Reweighing
rw = Reweighing(
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)
dataset_reweighed = rw.fit_transform(aif_dataset)

metric_reweighed = BinaryLabelDatasetMetric(
    dataset_reweighed,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)
print("\n=== After Reweighing ===")
print(f"Disparate Impact: {metric_reweighed.disparate_impact():.4f}")
print(f"Statistical Parity Difference: {metric_reweighed.statistical_parity_difference():.4f}")

Post-processing Mitigation with Fairlearn

from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import MetricFrame, selection_rate, demographic_parity_difference
from sklearn.ensemble import GradientBoostingClassifier

# Train model
X = data[['income', 'credit_score', 'age']].values
y = data['loan_approved'].values
sensitive = data['gender'].values

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

base_model = GradientBoostingClassifier(n_estimators=100, random_state=42)
base_model.fit(X_scaled, y)

# ThresholdOptimizer: optimize decision thresholds per group
postprocess_est = ThresholdOptimizer(
    estimator=base_model,
    constraints="demographic_parity",
    predict_method="predict_proba",
    objective="balanced_accuracy_score",
)
postprocess_est.fit(X_scaled, y, sensitive_features=sensitive)

y_pred_fair = postprocess_est.predict(X_scaled, sensitive_features=sensitive)

# Measure fairness metrics
mf = MetricFrame(
    metrics={"selection_rate": selection_rate},
    y_true=y,
    y_pred=y_pred_fair,
    sensitive_features=sensitive,
)
print("\n=== Fairlearn Post-processing Results ===")
print(f"Selection rate by group:\n{mf.by_group}")
print(f"Demographic Parity Difference: {demographic_parity_difference(y, y_pred_fair, sensitive_features=sensitive):.4f}")

Explainable AI (XAI)

SHAP: SHapley Additive exPlanations

SHAP leverages Shapley values from cooperative game theory to quantify each feature's contribution to a prediction. It computes the average marginal contribution of a feature across all possible feature subsets.

import shap
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Train model
X_train, y_train = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=42
)
feature_names = [
    'income', 'credit_score', 'age', 'debt_ratio',
    'employment_years', 'num_accounts', 'late_payments', 'loan_amount'
]

rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

# SHAP TreeExplainer (tree-specific, fast)
explainer = shap.TreeExplainer(rf_model)
shap_values = explainer.shap_values(X_train)

# Individual prediction explanation (Waterfall Plot)
sample_idx = 0
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[1][sample_idx],
        base_values=explainer.expected_value[1],
        data=X_train[sample_idx],
        feature_names=feature_names,
    )
)

# Global importance (Summary Plot)
shap.summary_plot(shap_values[1], X_train, feature_names=feature_names)

# SHAP interaction effects
shap_interaction = explainer.shap_interaction_values(X_train[:100])
print(f"Income-CreditScore interaction SHAP: {shap_interaction[0, 0, 1]:.4f}")

LIME: Local Interpretable Model-agnostic Explanations

import lime
import lime.lime_tabular
import numpy as np

# Create LIME explainer
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train,
    feature_names=feature_names,
    class_names=['Rejected', 'Approved'],
    mode='classification',
    discretize_continuous=True,
)

# Explain individual sample
explanation = lime_explainer.explain_instance(
    data_row=X_train[0],
    predict_fn=rf_model.predict_proba,
    num_features=6,
    num_samples=1000,
)

print("=== LIME Explanation (Sample #0) ===")
for feature, weight in explanation.as_list():
    direction = "increases" if weight > 0 else "decreases"
    print(f"  {feature}: {weight:+.4f} ({direction} approval probability)")

explanation.show_in_notebook(show_table=True)

Generating a Model Card

import json
from datetime import datetime

def generate_model_card(
    model_name: str,
    version: str,
    intended_use: str,
    out_of_scope_uses: list,
    training_data: dict,
    evaluation_results: dict,
    fairness_analysis: dict,
    limitations: list,
    ethical_considerations: list,
) -> dict:
    """Standard model card generator (based on Mitchell et al. 2019)."""
    model_card = {
        "model_details": {
            "name": model_name,
            "version": version,
            "date": datetime.now().strftime("%Y-%m-%d"),
            "type": "Binary Classifier",
            "paper": "https://arxiv.org/abs/1810.03993",
        },
        "intended_use": {
            "primary_uses": intended_use,
            "primary_users": ["Credit officers", "Financial regulators"],
            "out_of_scope_uses": out_of_scope_uses,
        },
        "factors": {
            "relevant_factors": ["gender", "age_group", "income_bracket"],
            "evaluation_factors": ["demographic_parity", "equal_opportunity"],
        },
        "metrics": {
            "performance_measures": evaluation_results,
            "decision_thresholds": {"default": 0.5, "high_precision": 0.7},
        },
        "training_data": training_data,
        "fairness_analysis": fairness_analysis,
        "limitations": limitations,
        "ethical_considerations": ethical_considerations,
        "caveats_recommendations": [
            "Regular drift monitoring recommended",
            "Quarterly bias re-evaluation required",
            "Human review required for high-stakes decisions",
        ],
    }
    return model_card

card = generate_model_card(
    model_name="Personal Loan Approval Model v2.1",
    version="2.1.0",
    intended_use="Automated initial screening for personal loan applications",
    out_of_scope_uses=["Corporate loan assessment", "Insurance pricing", "Employment decisions"],
    training_data={"size": 50000, "period": "2020-2024", "source": "Internal loan history"},
    evaluation_results={"accuracy": 0.84, "AUC": 0.91, "F1": 0.82},
    fairness_analysis={
        "demographic_parity_diff": 0.03,
        "equal_opportunity_diff": 0.02,
        "disparate_impact": 0.96,
    },
    limitations=["Pre-2020 data not included", "Rural region underrepresentation"],
    ethical_considerations=["Final decisions must be reviewed by human officers", "Mandatory disclosure of rejection reasons"],
)
print(json.dumps(card, indent=2))

AI Safety Techniques

Constitutional AI (Anthropic)

Constitutional AI trains models to critique and revise their own responses according to a set of explicit principles (the "constitution").

How it works:

Model generates a potentially harmful response
Model performs self-critique based on constitutional principles
Model revises the response to comply with principles
Revised responses are used to train via RLHF

RLHF (Reinforcement Learning from Human Feedback)

1. SFT (Supervised Fine-Tuning): Fine-tune base model on high-quality demonstration data
2. Reward Modeling: Train reward model on human preference pairs (preferred vs. rejected)
3. RL Optimization: Maximize reward with PPO algorithm (with KL divergence constraint)

Jailbreak Defense Techniques

Input filtering: Detect and block harmful patterns before processing
Prompt injection defense: Isolate system prompts from user inputs
Output monitoring: Real-time safety checks on generated text
Red teaming: Expert adversarial teams systematically probe for vulnerabilities

AI Watermarking

Text watermarking inserts statistically detectable patterns into LLM-generated text.

import hashlib
import random

def green_red_watermark(text: str, key: str, gamma: float = 0.25) -> dict:
    """
    Green/red list watermarking based on Kirchenbauer et al. (2023).
    Uses the previous token as a seed to classify tokens as green or red,
    preferring green tokens during generation to embed a watermark.
    """
    words = text.split()
    green_count = 0
    total = len(words)

    for i, word in enumerate(words):
        prev_token = words[i - 1] if i > 0 else "<s>"
        seed = int(hashlib.sha256(f"{key}{prev_token}".encode()).hexdigest(), 16) % (2**32)
        random.seed(seed)
        is_green = random.random() > (1 - gamma)
        if is_green:
            green_count += 1

    z_score = (green_count - gamma * total) / ((gamma * (1 - gamma) * total) ** 0.5 + 1e-9)
    return {
        "green_token_ratio": green_count / total,
        "z_score": z_score,
        "is_watermarked": z_score > 4.0,
    }

Data Privacy Technologies

Differential Privacy

Differential privacy adds noise to databases to statistically conceal whether any individual record is included. Smaller epsilon values provide stronger privacy guarantees.

import torch
import torch.nn as nn
from opacus import PrivacyEngine
from torch.utils.data import DataLoader, TensorDataset

# Define model
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 2),
        )
    def forward(self, x):
        return self.fc(x)

# Synthetic data
X = torch.randn(1000, 10)
y = torch.randint(0, 2, (1000,))
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=64, shuffle=True)

model = SimpleNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Apply Opacus PrivacyEngine
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    target_epsilon=1.0,   # epsilon: smaller = stronger privacy
    target_delta=1e-5,    # delta: probability of epsilon violation
    max_grad_norm=1.0,    # gradient clipping threshold
    epochs=10,
)

# Training loop
criterion = nn.CrossEntropyLoss()
for epoch in range(3):
    for batch_X, batch_y in loader:
        optimizer.zero_grad()
        outputs = model(batch_X)
        loss = criterion(outputs, batch_y)
        loss.backward()
        optimizer.step()

epsilon = privacy_engine.get_epsilon(delta=1e-5)
print(f"Training complete: epsilon = {epsilon:.2f}, delta = 1e-5")
print(f"Privacy guarantee: individual data contribution bounded by e^{epsilon:.2f}")

Federated Learning

Federated learning avoids sending raw data to a central server — clients share only locally trained model weights (gradients).

import copy
import numpy as np

def federated_averaging(global_model_weights, client_updates, client_data_sizes):
    """
    FedAvg algorithm: weighted average aggregation based on data sizes.
    """
    total_data = sum(client_data_sizes)
    averaged_weights = {}

    for key in global_model_weights.keys():
        weighted_sum = sum(
            client_updates[i][key] * (client_data_sizes[i] / total_data)
            for i in range(len(client_updates))
        )
        averaged_weights[key] = weighted_sum

    return averaged_weights

# GDPR AI compliance checklist
gdpr_ai_checklist = {
    "Data Minimization": "Collect only the minimum data necessary for model training",
    "Purpose Limitation": "Prohibit use of training data beyond its stated purpose",
    "Data Subject Rights": "Guarantee the right to explanation for automated decisions (Article 22)",
    "Profiling Restrictions": "Human review required for significant automated profiling decisions",
    "Data Portability": "Right to receive personal data in a portable format",
    "Right to Erasure": "Remove the influence of personal data from models (Machine Unlearning)",
}

for right, description in gdpr_ai_checklist.items():
    print(f"[GDPR] {right}: {description}")

AI Regulatory Practice

Model Audit Process

Define audit scope: Clarify the model, time period, and use case under review
Document review: Examine training data provenance, model cards, system cards
Technical testing: Bias measurement, robustness testing, adversarial attack simulation
Stakeholder interviews: Operations team, affected group representatives, regulators
Audit report: Document findings, risk ratings, and recommended actions

Composing an AI Ethics Committee

An effective AI ethics committee should include:

Role	Required Competency
AI/ML technical expert	Understand how models work
Legal/compliance officer	Interpret regulatory requirements
Ethicist/philosopher	Mediate value conflicts
Domain expert	Provide application context
Affected group representative	Reflect real-world impacts
Cybersecurity expert	Assess security risks

Risk Register Template

from dataclasses import dataclass, field
from typing import List
from enum import IntEnum

class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

class Likelihood(IntEnum):
    RARE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4

@dataclass
class AIRisk:
    risk_id: str
    description: str
    severity: Severity
    likelihood: Likelihood
    affected_groups: List[str]
    mitigation: str
    owner: str
    residual_risk: str = "TBD"

    @property
    def risk_score(self) -> int:
        return self.severity * self.likelihood

    @property
    def risk_level(self) -> str:
        score = self.risk_score
        if score >= 12:
            return "CRITICAL"
        elif score >= 8:
            return "HIGH"
        elif score >= 4:
            return "MEDIUM"
        return "LOW"

# Example risk register
risks = [
    AIRisk(
        risk_id="RISK-001",
        description="Gender bias in credit model leading to discriminatory loan rejections",
        severity=Severity.HIGH,
        likelihood=Likelihood.POSSIBLE,
        affected_groups=["Women", "Non-binary individuals"],
        mitigation="Reweighing + quarterly disparate impact monitoring",
        owner="AI Ethics Team",
    ),
    AIRisk(
        risk_id="RISK-002",
        description="Inability to explain model decisions violating GDPR Article 22",
        severity=Severity.CRITICAL,
        likelihood=Likelihood.LIKELY,
        affected_groups=["All loan applicants"],
        mitigation="Build SHAP-based decision explanation system",
        owner="Compliance Team",
    ),
]

print("=== AI Risk Register ===")
for risk in risks:
    print(f"\n[{risk.risk_id}] {risk.description}")
    print(f"  Risk Level: {risk.risk_level} (Score: {risk.risk_score})")
    print(f"  Mitigation: {risk.mitigation}")

Quiz

Q1. Under the EU AI Act, what conditions classify a biometric system as High-Risk?

Answer: Real-time + public space + remote biometric identification — when all three conditions are met simultaneously, the system falls under Unacceptable Risk and is prohibited. Limited exceptions exist, such as law enforcement searching for missing children. Non-real-time or post-hoc biometric analysis, or biometric systems used in judiciary and border control contexts, are classified as High-Risk and subject to strict obligations including conformity assessments.

Explanation: EU AI Act Annex III explicitly lists remote biometric identification systems used in law enforcement, judiciary, and border management as high-risk AI. Real-time remote biometric identification in public spaces is principally prohibited under Article 5.

Q2. What is the difference between demographic parity and equal opportunity, and what trade-offs arise?

Answer: Demographic parity requires equal positive prediction rates across protected groups: P(Y_hat=1 | A=0) = P(Y_hat=1 | A=1). Equal opportunity requires equal true positive rates (TPR) for positive-outcome individuals: P(Y_hat=1 | Y=1, A=0) = P(Y_hat=1 | Y=1, A=1).

Explanation: Chouldechova's (2017) impossibility theorem shows that when base rates differ across groups, it is mathematically impossible to simultaneously satisfy demographic parity, equal opportunity, and predictive parity. Organizations must explicitly choose which fairness criterion to prioritize based on application context and the nature of potential harms.

Q3. What game-theoretic principle does SHAP use to calculate feature importance?

Answer: SHAP is based on Shapley values from cooperative game theory. Each feature is treated as a "player" and the model's prediction as the "payoff." It computes the average marginal contribution of each feature across all possible feature subsets (coalitions).

Explanation: Shapley values are the unique attribution method satisfying four axioms: efficiency (SHAP values sum to prediction minus expected value), symmetry, linearity, and the dummy feature property. Unlike LIME, SHAP guarantees global consistency. TreeSHAP computes values in O(TLD^2) time for tree-based models.

Q4. Why does a smaller epsilon value in differential privacy provide stronger privacy protection?

Answer: Epsilon defines an upper bound: "Including or excluding one data point can change the output distribution by at most e^epsilon." As epsilon approaches 0, the output distribution becomes nearly identical regardless of whether any individual record is included, preventing individual information from being inferred.

Explanation: Epsilon = 0 means perfect privacy (fully random output); large epsilon is practical but offers weaker protection. In practice, epsilon below 1 is considered strong privacy, and below 10 is considered practical privacy. Libraries like Opacus (PyTorch) and TensorFlow Privacy automatically compute the required noise scale and track epsilon.

Q5. What are the essential sections of a Model Card?

Answer: Model details (name, version, type), intended use and out-of-scope uses, evaluation factors (protected attributes), performance metrics (accuracy, AUC, etc.), training data description, fairness analysis results, limitations and caveats, and ethical considerations.

Explanation: Model cards, proposed by Mitchell et al. (2019), have become a transparency standard. Major organizations including Google and Hugging Face publish model cards with model releases. EU AI Act high-risk AI requires technical documentation under Annex IV that is substantially equivalent to a model card.

Table of Contents