Chaos and Order


Introduction — by May 2026, the MLOps stack has standardized

Up through 2023, the MLOps stack was a "pick whatever works for your company" wilderness. By May 2026, much of that ambiguity has cleared. **MLflow and Weights & Biases** split experiment tracking. **Kubeflow Pipelines and Flyte** own Kubernetes-native orchestration. **Metaflow and ZenML** own Python-first workflows. **BentoML 1.4, KServe, and Triton** dominate serving. **DVC and lakeFS** own data versioning.

This article is not a marketing matrix. It is an honest, by-the-layer comparison of "what goes where in production in 2026" — covering MLflow 3.0's changes, the Kubeflow 1.10 lineup, Flyte's commercial trajectory via Union.ai, ZenML Cloud, and BentoML 1.4's LLM serving mode, with real API shapes.

The 2026 MLOps stack — seven layers

Let us start with the big picture. The standard 2026 MLOps stack splits into seven layers.

1. **Experiment tracking** — runs, parameters, metrics, artifacts

2. **Pipeline orchestration** — DAG execution, distribution, caching

3. **Model registry** — model versions, stages, metadata

4. **Model serving** — online, batch, streaming inference

5. **Feature store** — train/inference consistency for features

6. **Data + experiment versioning** — datasets and code in sync

7. **ML monitoring** — drift, performance decay, data quality

The era when one or two tools per layer sufficed is over. Today, **even within a layer there is a split between LLM-specific and classical-ML tracks**. We go layer by layer below.

Experiment tracking — MLflow 3.0 vs Weights & Biases

Two tools cover the vast majority of experiment tracking.

- **MLflow 3.0**: OSS by Databricks, now under the Linux Foundation. 3.0 went GA in Q1 2026. GenAI tracing, evaluation, and prompt registry are now first-class citizens.

- **Weights & Biases**: SaaS-first but with an open SDK. Dominates on UX and visualization. W&B Models, W&B Weave (LLM tracing), and W&B Launch ship as one bundle.

The alternatives still matter.

- **Comet ML**: a SaaS that integrates ML + LLM experiments + production monitoring.

- **Neptune.ai**: repositioned as a metadata store for foundation model training.

- **Aim**: Apache 2.0 OSS, self-hosted, with a snappy UI.

- **TensorBoard, PyTorch Lightning Logger**: limited as standalone trackers; usually combined with the above.

A typical MLflow 3.0 flow looks like this.

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    mlflow.set_tracking_uri("http://mlflow.internal:5000")
    mlflow.set_experiment("iris-rf-2026")

    X, y = load_iris(return_X_y=True)

    with mlflow.start_run() as run:
        # Log the hyperparameter, fit the model, then log training accuracy.
        mlflow.log_param("n_estimators", 200)
        model = RandomForestClassifier(n_estimators=200)
        model.fit(X, y)
        mlflow.log_metric("train_acc", model.score(X, y))
        # Log the model and register it as "iris-rf" in one call.
        # MLflow 3 prefers name="model"; artifact_path is the older spelling.
        mlflow.sklearn.log_model(model, artifact_path="model", registered_model_name="iris-rf")

    print(run.info.run_id)

The same flow in W&B is just as terse.

    import wandb
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    # config stores the hyperparameters alongside the run.
    wandb.init(project="iris-rf-2026", config={"n_estimators": 200})

    X, y = load_iris(return_X_y=True)
    model = RandomForestClassifier(n_estimators=200).fit(X, y)

    wandb.log({"train_acc": model.score(X, y)})
    wandb.finish()

Both tools have auto-instrumentation for scikit-learn, PyTorch, XGBoost, and others, so you get baseline metrics even without explicit logging. The difference is **storage model and governance**. MLflow defaults to self-hosted under the permissive Apache 2.0 license; W&B is SaaS-first but has by far the smoother collaboration UX.
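To make the auto-instrumentation idea concrete, here is a minimal sketch of the general technique: wrapping a model's `fit` method so that parameters and a training score are recorded without any explicit logging call. The `autolog` decorator, the `RUNS` list, and `MajorityClassifier` are hypothetical names invented for this illustration; this is not how either tool is implemented internally.

```python
import functools

RUNS = []  # hypothetical in-memory stand-in for a tracking backend


def autolog(cls):
    """Patch cls.fit so each call records constructor params and a train score."""
    original_fit = cls.fit

    @functools.wraps(original_fit)
    def patched_fit(self, X, y):
        params = dict(vars(self))  # attributes set in __init__, captured before fitting
        result = original_fit(self, X, y)
        RUNS.append({
            "model": cls.__name__,
            "params": params,
            "metrics": {"train_score": self.score(X, y)},
        })
        return result

    cls.fit = patched_fit
    return cls


@autolog
class MajorityClassifier:
    """Toy estimator that always predicts the most common training label."""

    def __init__(self, default_label=0):
        self.default_label = default_label

    def fit(self, X, y):
        self.majority_ = max(set(y), key=y.count) if y else self.default_label
        return self

    def score(self, X, y):
        return sum(label == self.majority_ for label in y) / len(y)


# No logging call appears here, yet a run record is produced.
MajorityClassifier().fit([[1], [2], [3]], [0, 0, 1])
```

The user code never mentions the tracker, which is the appeal of auto-instrumentation; the trade-off is that what gets recorded is decided by the wrapper, not by you.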

Pipeline orchestration — K8s-native vs Python-first

ML pipelines differ from generic ETL in that they must handle GPU scheduling, caching, and reproducibility together. As of May 2026, four tools split the market.

- **Kubeflow Pipelines + Kubeflow 1.10**: CNCF incubation project. The canonical K8s-native ML platform. Components are containers.

- **Metaflow**: built by Netflix, commercialized by Outerbounds. Python decorator-first. Deeply integrated with AWS Batch and Step Functions.

- **Flyte**: built by Lyft, commercialized by Union.ai. Under LF AI & Data. K8s-native + type-safe.

- **ZenML**: framework-agnostic, positioned as an "abstraction layer". MLflow, W&B, Kubeflow, Airflow are pluggable backends.

Adjacent tools — **Prefect, Dagster, Airflow** — also see ML use, but they are general data orchestrators rather than ML-specific, so we cover them separately in iter72/iter53.

DSL ergonomics rank roughly as **ZenML > Metaflow > Flyte > Kubeflow Pipelines**. The benchmark is whether a single Python file is enough.

Kubeflow 1.10 — a full ML platform on Kubernetes

Kubeflow is not a single tool but a **full ML platform on top of Kubernetes**. As of May 2026, the 1.10 lineup has these core components.

- **Kubeflow Pipelines (KFP)**: DAG pipeline SDK and UI.

- **Katib**: distributed hyperparameter tuning.

- **Training Operator**: PyTorchJob, TFJob, MPIJob, PaddleJob CRDs for distributed training.

- **KServe**: model serving (formerly KFServing). Split into a sibling project but integrated.

- **Notebook Controller**: JupyterHub-style notebook instances.

- **Spark Operator, Volcano**: batch scheduling.

A KFP 2.x DSL example.

    from kfp import compiler, dsl
    from kfp.dsl import Dataset, Input, Output

    # Each component runs in its own container, so imports live inside the
    # function body and files move between steps as artifacts, not local paths.
    @dsl.component(base_image="python:3.12", packages_to_install=["pandas"])
    def preprocess(input_path: str, clean_data: Output[Dataset]):
        import pandas as pd
        df = pd.read_csv(input_path)
        df.dropna().to_csv(clean_data.path, index=False)

    @dsl.component(base_image="python:3.12", packages_to_install=["pandas", "scikit-learn"])
    def train(clean_data: Input[Dataset]) -> float:
        import pandas as pd
        from sklearn.ensemble import RandomForestClassifier
        df = pd.read_csv(clean_data.path)
        X, y = df.drop("label", axis=1), df["label"]
        return float(RandomForestClassifier().fit(X, y).score(X, y))

    @dsl.pipeline(name="iris-pipeline")
    def pipeline(input_path: str = "/data/iris.csv"):
        pre = preprocess(input_path=input_path)
        # Wire the dataset artifact produced by preprocess into train.
        train(clean_data=pre.outputs["clean_data"])

    compiler.Compiler().compile(pipeline, "iris.yaml")

Kubeflow rewards teams who can run Kubernetes, but its onboarding cost is the highest of the four. That is why many teams adopt it via a managed offering (Vertex AI Pipelines, SageMaker MLOps, Azure ML).

Metaflow — workflows that start with one Python decorator

Metaflow is the workflow library Netflix designed with data scientists in mind. Outerbounds offers commercial hosting; the core is Apache 2.0.

The abstractions are minimal.

- **FlowSpec**: the workflow class.

- **`@step`**: the step decorator.

- **`self.next(...)`**: explicit branching.

- **`@batch`, `@kubernetes`, `@resources`**: per-step execution-environment decorators; GPUs are requested through arguments such as `@resources(gpu=1)`.

- **`@retry`, `@timeout`, `@catch`**: reliability decorators.

A typical Metaflow flow.

    from metaflow import FlowSpec, step, batch, retry

    class IrisFlow(FlowSpec):

        @step
        def start(self):
            from sklearn.datasets import load_iris
            # Attributes assigned to self are persisted as artifacts between steps.
            self.X, self.y = load_iris(return_X_y=True)
            self.next(self.train)

        @batch(cpu=4, memory=16000)
        @retry(times=2)
        @step
        def train(self):
            from sklearn.ensemble import RandomForestClassifier
            self.model = RandomForestClassifier(n_estimators=200).fit(self.X, self.y)
            self.acc = self.model.score(self.X, self.y)
            self.next(self.end)

        @step
        def end(self):
            print(f"acc={self.acc:.3f}")

    if __name__ == "__main__":
        IrisFlow()

Metaflow's strength is **local-first**. It runs in your notebook, and `--with batch` is all you need to scale to AWS Batch. Automatic artifact storage (S3) plus automatic tracking are built in, which is why many teams skip a separate MLflow.

Flyte — Kubernetes-native, type-safe workflows

Flyte is the K8s-native workflow tool built by Lyft and commercialized by Union.ai. It is a graduated project under LF AI & Data. Its biggest selling points are **type safety and caching**.

- **Python type annotations** double as input/output schemas.

- **Automatic caching**: identical inputs hit cache. Significant cost savings.

- **K8s-native**: Pod, Deployment, and GPU/TPU scheduling are first-class.

- **Multi-language**: Python is primary; Java/Scala SDKs exist.

A Flyte example.

    import pandas as pd
    from flytekit import task, workflow, Resources

    # cache=True plus cache_version lets identical inputs reuse the stored output.
    @task(cache=True, cache_version="1.0", requests=Resources(cpu="2", mem="4Gi"))
    def preprocess(input_path: str) -> str:
        df = pd.read_csv(input_path).dropna()
        # A plain local path only works when both tasks share a filesystem
        # (for example, local execution).
        out = "/tmp/clean.csv"
        df.to_csv(out, index=False)
        return out

    @task(requests=Resources(cpu="4", mem="16Gi", gpu="1"))
    def train(data_path: str) -> float:
        from sklearn.ensemble import RandomForestClassifier
        df = pd.read_csv(data_path)
        X, y = df.drop("label", axis=1), df["label"]
        return RandomForestClassifier().fit(X, y).score(X, y)

    @workflow
    def iris_wf(input_path: str = "/data/iris.csv") -> float:
        clean = preprocess(input_path=input_path)
        return train(data_path=clean)

Flyte's edge is being **K8s-friendly with caching that actually works**. Re-running a cached task with the same cache version and the same inputs returns the stored result instead of executing again, which saves time and money.
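The caching behavior can be sketched in plain Python. The example below is a simplified illustration of the idea, not Flyte's implementation: it assumes a cache key built from the task name, a cache version, and the keyword inputs, and `cached_task`, `_CACHE`, and `CALLS` are hypothetical names. Note that the key here is derived from the input values (a path string), not from the contents of the file at that path.

```python
import functools
import hashlib
import json

_CACHE = {}  # hypothetical in-memory cache; a real backend would persist results
CALLS = []   # records which task bodies actually executed


def cached_task(cache_version):
    """Memoize a task on (task name, cache version, keyword inputs)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(**kwargs):
            payload = json.dumps(
                {"task": fn.__name__, "version": cache_version, "inputs": kwargs},
                sort_keys=True,
            )
            key = hashlib.sha256(payload.encode()).hexdigest()
            if key not in _CACHE:
                CALLS.append(fn.__name__)  # cache miss: run the body
                _CACHE[key] = fn(**kwargs)
            return _CACHE[key]
        return wrapper
    return decorator


@cached_task(cache_version="1.0")
def preprocess(input_path: str) -> str:
    return f"{input_path}.clean"


first = preprocess(input_path="/data/iris.csv")   # miss, body runs
second = preprocess(input_path="/data/iris.csv")  # hit, body skipped
third = preprocess(input_path="/data/other.csv")  # different input, miss
```

Bumping `cache_version` changes every key for that task, which is how a code change is expected to invalidate old results in this sketch.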

ZenML — a framework-agnostic abstraction layer

ZenML does not compete with the tools above; it sits **on top of them as an abstraction layer**. ZenML Pipelines can use Kubeflow, Airflow, Tekton, Vertex, SageMaker, or AWS Step Functions as the backend.

Core abstractions.

- **`@step`, `@pipeline`**: Python decorators for steps and pipelines.

- **Stack**: an environment composed of orchestrator + artifact_store + container_registry + experiment_tracker.

- **Component**: a swappable implementation per layer — MLflow, W&B, Neptune.ai, Kubeflow, Vertex, etc.

- **Materializer**: a serializer for user-defined types.

ZenML code.

    from typing import Tuple

    import pandas as pd
    from zenml import pipeline, step

    @step
    def load() -> pd.DataFrame:
        return pd.read_csv("/data/iris.csv")

    @step
    def split(df: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame]:
        # First 120 rows for training, the remainder held out.
        return df.iloc[:120], df.iloc[120:]

    @step
    def train(train_df: pd.DataFrame) -> float:
        from sklearn.ensemble import RandomForestClassifier
        X, y = train_df.drop("label", axis=1), train_df["label"]
        return RandomForestClassifier().fit(X, y).score(X, y)

    @pipeline
    def iris_pipeline():
        df = load()
        tr, _ = split(df)
        train(tr)

    # Runs on whichever stack is currently active.
    iris_pipeline()

ZenML's value is **deferring vendor lock-in**. Going local to SageMaker to Kubeflow requires almost no code change. The cost is the abstraction tax — using a backend's unique features 100% eventually requires calling that backend's SDK directly.
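The stack idea, where pipeline code stays fixed while the orchestrator behind it is swapped, can be sketched with a small interface. This is a conceptual illustration only: `Orchestrator`, `LocalOrchestrator`, `FakeKubeflowOrchestrator`, `Stack`, and `run_pipeline` are hypothetical names and do not correspond to ZenML classes.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


class Orchestrator(Protocol):
    """Anything that can execute an ordered list of steps."""

    def run(self, steps: List[Callable[[], str]]) -> List[str]: ...


class LocalOrchestrator:
    def run(self, steps):
        return [f"local:{step()}" for step in steps]


class FakeKubeflowOrchestrator:
    """Stand-in only; a real backend would submit the steps to a cluster."""

    def run(self, steps):
        return [f"kubeflow:{step()}" for step in steps]


@dataclass
class Stack:
    orchestrator: Orchestrator
    artifact_store: str


def load() -> str:
    return "load"


def train() -> str:
    return "train"


def run_pipeline(stack: Stack) -> List[str]:
    # The pipeline definition never names a concrete backend.
    return stack.orchestrator.run([load, train])


local_result = run_pipeline(Stack(LocalOrchestrator(), "/tmp/artifacts"))
cluster_result = run_pipeline(Stack(FakeKubeflowOrchestrator(), "s3://artifacts"))
```

The abstraction tax shows up here too: anything a backend can do that is not expressible through `run` is out of reach without bypassing the interface.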

Model registry — MLflow Registry vs BentoML Model Store vs Hugging Face Hub

A trained model has to live somewhere with versions, stages (Staging/Production), and metadata. Three candidates show up most often as of May 2026.

- **MLflow Model Registry**: bundled with MLflow and closest to a standard. URI pattern `models:/name/Production`; recent MLflow releases deprecate stages in favor of aliases, written `models:/name@alias`.

- **BentoML Model Store**: paired with BentoML serving. Flow is `bentoml.transformers.save_model()`.

- **Hugging Face Hub**: public/private repos for model sharing. The de facto standard for the transformers/diffusion ecosystem.

Enterprises usually combine **MLflow Model Registry + their own S3**, open-model and LLM teams use a **private Hugging Face Hub repo**, and BentoML-centric ops teams use the **BentoML Model Store** as-is.

Loading from MLflow Registry.

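    import mlflow.pyfunc

    # Stage-based URI; with aliases this would look like "models:/iris-rf@<alias>".
    model_uri = "models:/iris-rf/Production"
    model = mlflow.pyfunc.load_model(model_uri)
    predictions = model.predict([[5.1, 3.5, 1.4, 0.2]])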
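To show what a registry does when it resolves a `models:/` URI, here is a minimal in-memory sketch. It assumes three selector forms (a version number, a stage name, or an `@alias`) and picks the newest version in a stage; `Registry`, `ModelVersion`, and the `s3://models/...` paths are hypothetical and this is not MLflow's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str
    stage: str = "None"


@dataclass
class Registry:
    versions: Dict[str, List[ModelVersion]] = field(default_factory=dict)
    aliases: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def register(self, name: str, artifact_uri: str) -> ModelVersion:
        existing = self.versions.setdefault(name, [])
        mv = ModelVersion(version=len(existing) + 1, artifact_uri=artifact_uri)
        existing.append(mv)
        return mv

    def _by_version(self, name: str, version: int) -> ModelVersion:
        for mv in self.versions[name]:
            if mv.version == version:
                return mv
        raise LookupError(f"{name} has no version {version}")

    def resolve(self, uri: str) -> ModelVersion:
        prefix = "models:/"
        if not uri.startswith(prefix):
            raise ValueError(f"not a registry URI: {uri}")
        ref = uri[len(prefix):]
        if "@" in ref:  # alias form: models:/name@alias
            name, alias = ref.split("@", 1)
            return self._by_version(name, self.aliases[name][alias])
        name, selector = ref.split("/", 1)
        if selector.isdigit():  # explicit version: models:/name/3
            return self._by_version(name, int(selector))
        # Stage form: newest version currently in that stage.
        matches = [mv for mv in self.versions[name] if mv.stage == selector]
        if not matches:
            raise LookupError(f"{name} has no version in stage {selector}")
        return max(matches, key=lambda mv: mv.version)


registry = Registry()
registry.register("iris-rf", "s3://models/iris-rf/1")
registry.register("iris-rf", "s3://models/iris-rf/2").stage = "Production"
registry.aliases["iris-rf"] = {"champion": 1}
```

Callers only ever hold the URI, so promoting a new version means moving a stage or alias, with no change to serving code.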

Model serving — BentoML 1.4, KServe, Triton, Ray Serve split the work

No single serving tool covers every case. The split depends on scenario.

- **BentoML 1.4 + Yatai**: Python-friendly serving framework. Packages model + business logic as a single "Bento". 1.4 in Q1 2026 made LLM serving mode (vLLM, TGI backends) GA.

- **Seldon Core 2**: K8s-native serving. Mesh-based multi-model routing.

- **KServe**: formerly KFServing. Often paired with Kubeflow but standalone-capable. Serverless/autoscale + standard inference protocol.

- **TorchServe**: built by Meta + AWS. The PyTorch serving standard.

- **TensorFlow Serving**: serves TF SavedModels. C++ core, very stable.

- **NVIDIA Triton Inference Server**: GPU-optimal serving. Integrates TensorRT, ONNX, PyTorch, TF, vLLM backends.

- **Ray Serve**: Python serving atop Ray clusters. Managed by Anyscale.

A BentoML 1.4 LLM serving example.

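    import bentoml
    from transformers import AutoTokenizer

    MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"

    @bentoml.service(resources={"gpu": 1, "memory": "16Gi"})
    class LlamaService:
        def __init__(self) -> None:
            # Import vLLM lazily so it is only needed inside the serving process.
            from vllm import LLM
            self.llm = LLM(model=MODEL_ID)
            # Loaded for completeness; this minimal example does not use it.
            self.tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

        @bentoml.api
        def generate(self, prompt: str, max_tokens: int = 256) -> str:
            from vllm import SamplingParams
            # vLLM expects a SamplingParams object rather than a plain dict.
            params = SamplingParams(max_tokens=max_tokens)
            outputs = self.llm.generate([prompt], sampling_params=params)
            return outputs[0].outputs[0].text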

For classical (non-LLM) models, TF Serving, TorchServe, and Triton are faster and more stable. LLM inference plugs an engine (vLLM, SGLang, TGI, TensorRT-LLM — see the iter69 inference-engine deep dive) into BentoML or Triton as the standard pattern.

Feature stores — Feast 0.40, Hopsworks, Featureform, Tecton

The feature store layer keeps the features used in training identical to those served at inference.

- **Feast 0.40**: Apache 2.0 OSS. K8s or local deployment. Online stores: Redis, DynamoDB, Bigtable.

- **Hopsworks**: open core. Both self-hosted and SaaS. Feature store + notebooks + serving as one.

- **Featureform**: a virtualization layer that abstracts existing data warehouses into feature stores.

- **Tecton**: commercial SaaS. Founded by the team behind Uber's Michelangelo platform and a long-time core contributor to Feast. Common in enterprise.

A Feast 0.40 definition example.

    from datetime import timedelta

    from feast import Entity, FeatureView, Field, FileSource
    from feast.types import Float32, Int64

    # The entity is the join key used to look features up.
    user = Entity(name="user", join_keys=["user_id"])

    user_stats_source = FileSource(
        path="s3://feast-data/user_stats.parquet",
        timestamp_field="event_ts",
    )

    user_stats_view = FeatureView(
        name="user_stats",
        entities=[user],
        ttl=timedelta(days=7),  # feature rows older than this are not served
        schema=[
            Field(name="purchase_count_7d", dtype=Int64),
            Field(name="purchase_value_7d", dtype=Float32),
        ],
        source=user_stats_source,
    )

It is tempting to put feature stores off forever. But the moment a "train-serve skew" incident happens, they become mandatory.
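Train-serve skew is easiest to see through a point-in-time lookup: for a given entity and timestamp, training must only see the feature value that would have been available at that moment. The sketch below illustrates that rule with a TTL window, reusing the `user_id`, `event_ts`, and `purchase_count_7d` names from the definition above. The `point_in_time_lookup` function and the sample rows are hypothetical, and this is a simplification rather than Feast's actual join logic.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def point_in_time_lookup(
    feature_rows: List[dict], user_id: int, as_of: datetime, ttl: timedelta
) -> Optional[dict]:
    """Return the latest row for user_id with as_of - ttl <= event_ts <= as_of."""
    candidates = [
        row for row in feature_rows
        if row["user_id"] == user_id and as_of - ttl <= row["event_ts"] <= as_of
    ]
    if not candidates:
        return None  # nothing fresh enough: no feature value is returned
    return max(candidates, key=lambda row: row["event_ts"])


ROWS = [
    {"user_id": 1, "event_ts": datetime(2026, 1, 1), "purchase_count_7d": 3},
    {"user_id": 1, "event_ts": datetime(2026, 1, 5), "purchase_count_7d": 7},
    {"user_id": 1, "event_ts": datetime(2026, 1, 20), "purchase_count_7d": 12},
]
TTL = timedelta(days=7)

early = point_in_time_lookup(ROWS, 1, datetime(2026, 1, 4), TTL)   # sees Jan 1 row
fresh = point_in_time_lookup(ROWS, 1, datetime(2026, 1, 6), TTL)   # sees Jan 5 row
stale = point_in_time_lookup(ROWS, 1, datetime(2026, 1, 15), TTL)  # Jan 5 row expired
```

A training job that joined on "latest value" instead would hand the Jan 20 count to every earlier label, which is exactly the leakage a feature store is meant to prevent.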

Data + experiment versioning — DVC, lakeFS, Pachyderm, Quilt

Git handles code; data is another story. Four tools split the data-versioning market.

- **DVC (Data Version Control)**: data and model versioning on top of Git. `.dvc` metafiles track large files while real data lives on S3/GCS/Azure.

- **lakeFS**: a Git model for the data lake. Branches, merges, and commits work at petabyte scale on S3.

- **Pachyderm**: data versioning + data lineage + automatic pipeline triggers.

- **Quilt Data**: a data-package catalog.

A DVC flow.

    git init
    dvc init
    dvc remote add -d storage s3://my-bucket/dvc
    dvc add data/raw/train.csv
    git add data/raw/train.csv.dvc .gitignore
    git commit -m "track training data"
    dvc push   # data to S3
    git push   # metadata to Git

DVC's other strength is **DVC Pipelines**. Define stages and dependencies in `dvc.yaml` and it re-runs only the changed stages. With little infrastructure you get a reproducible ML workflow.
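The "re-run only what changed" behavior comes down to comparing dependency hashes against a recorded snapshot. The sketch below illustrates that decision in plain Python under simplifying assumptions: stages are listed in dependency order, file contents are held in a dict, and any stage downstream of a re-run stage is also re-run. The stage names, `snapshot`, and `stages_to_rerun` are hypothetical and this is not DVC's code.

```python
import hashlib
from typing import Dict, List


def content_hash(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def snapshot(stages: Dict[str, dict], files: Dict[str, bytes]) -> Dict[str, dict]:
    """Record the hash of every dependency, like a lock file after a run."""
    return {
        name: {dep: content_hash(files[dep]) for dep in spec["deps"]}
        for name, spec in stages.items()
    }


def stages_to_rerun(stages, files, lock) -> List[str]:
    dirty_outputs = set()  # outputs of stages already marked for re-run
    rerun = []
    for name, spec in stages.items():  # assumes dependency order
        recorded = lock.get(name, {})
        changed = any(
            dep in dirty_outputs or content_hash(files[dep]) != recorded.get(dep)
            for dep in spec["deps"]
        )
        if changed:
            rerun.append(name)
            dirty_outputs.update(spec["outs"])
    return rerun


STAGES = {
    "prepare": {"deps": ["data/raw.csv"], "outs": ["data/clean.csv"]},
    "train": {"deps": ["data/clean.csv", "params.yaml"], "outs": ["model.pkl"]},
    "report": {"deps": ["docs/template.md"], "outs": ["report.html"]},
}
files = {
    "data/raw.csv": b"a,b\n1,2\n",
    "data/clean.csv": b"a,b\n1,2\n",
    "params.yaml": b"n_estimators: 200\n",
    "docs/template.md": b"# Report\n",
}
lock = snapshot(STAGES, files)

nothing_changed = stages_to_rerun(STAGES, files, lock)
files["data/raw.csv"] = b"a,b\n1,2\n3,4\n"  # new raw data arrives
after_data_change = stages_to_rerun(STAGES, files, lock)
```

Here the `report` stage is skipped because none of its dependencies moved, which is where the time savings come from on larger pipelines.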

ML monitoring — Evidently, Arize, WhyLabs, Fiddler

Once deployed, models degrade as data shifts (data/concept drift). Monitoring catches that.

- **Evidently AI**: Apache 2.0 OSS. Reports and dashboards generated from Python. K8s self-hostable.

- **Arize AI**: commercial SaaS. Unified ML + LLM observability. Phoenix is the OSS spin-off.

- **WhyLabs**: data quality + drift + anomaly detection. Has a free tier.

- **Fiddler AI**: explainability-first. Enterprise compliance track.

An Evidently example.

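    import pandas as pd
    # These import paths match Evidently's older Report API; newer releases
    # reorganized the modules, so check the version you have installed.
    from evidently.report import Report
    from evidently.metric_preset import DataDriftPreset

    ref = pd.read_csv("reference.csv")  # data the model was trained on
    cur = pd.read_csv("current.csv")    # recent production data

    report = Report(metrics=[DataDriftPreset()])
    report.run(reference_data=ref, current_data=cur)
    report.save_html("drift.html")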

Monitoring is easy to set up but often gets switched off in production because the alerts are too noisy. Threshold tuning and dashboard customization end up being the harder problems.
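To make the threshold problem concrete, here is a minimal sketch of one widely used drift statistic, the Population Stability Index (PSI), computed over equal-width bins taken from the reference data. PSI is chosen here for illustration and is not necessarily what any particular tool uses by default; the bin count, the `eps` floor, and the `ALERT_THRESHOLD` of 0.2 are assumptions you would tune.

```python
import math
from typing import List, Sequence

ALERT_THRESHOLD = 0.2  # hypothetical cut-off; tune per feature to control noise


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 4, eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins from the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width) if width else 0
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [0.1 * i for i in range(100)]   # values spread over 0.0 to 9.9
shifted = [v + 5.0 for v in reference]      # same shape, moved upward

no_drift_score = psi(reference, reference)
drift_score = psi(reference, shifted)
should_alert = drift_score > ALERT_THRESHOLD
```

The statistic itself is a few lines; the operational work is choosing bins and thresholds per feature so that alerts fire on real shifts rather than on ordinary variation.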

Notebooks & IDEs — Jupyter, Marimo, Hex, Deepnote, Quarto

ML workflows still start in notebooks.

- **Jupyter / JupyterLab 4 / JupyterHub**: the de facto standard. Multi-user means JupyterHub. Covered in depth in the iter27 notebooks deep dive.

- **Marimo**: reactive notebooks. Cell execution order is determined by the dependency graph. Strong reproducibility.

- **Hex, Deepnote**: managed collaborative notebooks.

- **Quarto**: turns notebooks + Markdown into reports, books, and sites.

Marimo's distinguishing feature is clear: inter-cell dependencies are extracted from code analysis, eliminating an entire class of order-dependent bugs. Think of it as the notebook a data scientist hands off without needing a README.
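The dependency-extraction idea can be sketched with the standard library: parse each cell, collect the names it defines and the names it reads, then order cells topologically. This is a deliberately simplified illustration (it ignores imports, function definitions, and attribute mutation) and is not Marimo's implementation; the cell names and `execution_order` are hypothetical.

```python
import ast
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple


def cell_names(source: str) -> Tuple[Set[str], Set[str]]:
    """Return (names defined, names read from elsewhere) for one cell."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            else:
                refs.add(node.id)
    return defs, refs - defs


def execution_order(cells: Dict[str, str]) -> List[str]:
    analysed = {cid: cell_names(src) for cid, src in cells.items()}
    # Map each variable to the cell that defines it.
    owner = {name: cid for cid, (defs, _) in analysed.items() for name in defs}
    # Each cell depends on the cells that define the names it reads;
    # names with no owner (builtins such as sum or len) are ignored.
    graph = {
        cid: {owner[name] for name in refs if name in owner}
        for cid, (_, refs) in analysed.items()
    }
    return list(TopologicalSorter(graph).static_order())


# Cells are deliberately listed out of order, as they might sit in a notebook.
CELLS = {
    "plot": "summary = total / count",
    "load": "values = [1, 2, 3]",
    "stats": "total = sum(values)\ncount = len(values)",
}
order = execution_order(CELLS)
```

Because the order is derived from the code rather than from where cells sit on the page, moving a cell cannot silently change the result.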

Vector DB integration — Pinecone, Weaviate, Milvus, Qdrant

RAG and embedding workflows demand a vector DB. iter62 covers vector DBs in depth; from the MLOps angle, the key is **integration with model serving**. BentoML, Ray Serve, and KServe all standardize the pattern of packaging vector DB clients alongside models.

Combined with LangChain/LlamaIndex-style LLM frameworks, vector DBs stop being "separate infrastructure" and become "first-class components of the ML service".

End-to-end "MLflow-style" managed platforms

If self-hosting is too much, you go managed.

- **Databricks Lakehouse Platform**: by MLflow's authors. Notebooks + Spark + MLflow + Unity Catalog in one.

- **Vertex AI**: Google. AutoML, Pipelines, Endpoints, Feature Store unified.

- **SageMaker**: AWS. Studio, Pipelines, Model Registry, Endpoints unified.

- **Azure ML**: Microsoft. Workspace + Designer + Endpoints + Responsible AI.

Each managed platform pushes its own SDK while still offering standard entry points compatible with MLflow, Kubeflow Pipelines, PyTorch, and TensorFlow. The trade-off is **zero infra burden vs vendor lock-in**.

LLM-specific MLOps (LLMOps) — LangSmith, Langfuse, Arize Phoenix, Helicone

A subcategory called "LLMOps" formed across 2024-2025. It overlaps with classical MLOps but differs on several axes.

- **Unit of trace**: instead of metrics, the primary data is the **prompt/response trace**.

- **Evaluation**: ground truth is fuzzy, so LLM-as-judge, rules, and user feedback combine.

- **Prompt registry**: prompt versions are managed separately from model versions.

- **Cost tracking**: per-token cost dominates the cost picture, so it is tracked per trace.
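The trace-centric view and per-token costing can be sketched together: each trace carries its prompt version and token counts, and cost is aggregated per prompt version. All names here (`Trace`, `PRICES`, `small-model`, `large-model`, the prompt versions) and the prices are hypothetical placeholders, not real vendor pricing.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical prices in USD per one million tokens.
PRICES = {
    "small-model": {"input": 0.5, "output": 1.5},
    "large-model": {"input": 5.0, "output": 15.0},
}


@dataclass
class Trace:
    prompt_version: str  # versioned separately from the model
    model: str
    input_tokens: int
    output_tokens: int


def trace_cost(trace: Trace) -> float:
    price = PRICES[trace.model]
    return (
        trace.input_tokens * price["input"] + trace.output_tokens * price["output"]
    ) / 1_000_000


def cost_by_prompt_version(traces: List[Trace]) -> Dict[str, float]:
    totals: Dict[str, float] = {}
    for trace in traces:
        totals[trace.prompt_version] = totals.get(trace.prompt_version, 0.0) + trace_cost(trace)
    return totals


TRACES = [
    Trace("summarize-v1", "large-model", input_tokens=1000, output_tokens=200),
    Trace("summarize-v2", "small-model", input_tokens=1000, output_tokens=200),
    Trace("summarize-v2", "small-model", input_tokens=2000, output_tokens=400),
]
totals = cost_by_prompt_version(TRACES)
```

Grouping by prompt version rather than by model is the LLMOps-specific twist: a prompt change can move cost as much as a model change does.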

Key tools.

- **LangSmith**: standard within the LangChain world. SaaS + self-host.

- **Langfuse**: open-source self-host track. Tracing, evaluation, prompt management.

- **Arize Phoenix**: Arize's OSS spin-off. LLM tracing and evaluation.

- **Helicone**: OpenAI/Anthropic API gateway + traces.

Covered more deeply in iter77 (LLM observability) and iter83 (fine-tuning).

Korean MLOps — Naver Cloud, Kakao Enterprise, NCSOFT, Hyundai Motor Group

The Korean MLOps ecosystem clusters along these axes.

- **Naver Cloud ML platform**: HyperCLOVA X-based fine-tuning + an in-house MLOps toolset.

- **Kakao Enterprise**: the ML track on Kakao i Cloud. Mixes Kubeflow with in-house tools.

- **NCSOFT AI Center**: game-AI and character-AI pipelines built in-house.

- **Hyundai Motor Group AI Lab**: autonomous-driving data + model automation, mixing Flyte and Kubeflow.

- **Korean ML engineer communities**: MLOps Korea, Pseudo Lab, Modulabs tracks run regular meetups.

Big companies typically build internal platforms on top of MLflow/Kubeflow; startups often go straight to SageMaker or Vertex AI.

Japanese MLOps — PFN, Mercari, Cybozu, Recruit

Japan's MLOps ecosystem has these standout patterns.

- **PFN (Preferred Networks)**: authors of Optuna (HPO), which they pair with their in-house workflow tool Allgo.

- **Cybozu**: ML pipelines on kintone data. Argo Workflows plus in-house tools.

- **Recruit AI Lab**: ML pipelines in ads and recruiting. Vertex AI + MLflow.

- **Mercari ML platform**: runs an in-house ML platform called Merlin. KServe + Argo + Feast.

- **NTT Communications - PFN partnership**: large-scale training on telecom data.

In Japan, **Argo Workflows** (classical) and **Kubeflow + KServe** see high adoption. Qiita and Zenn have seen a rapid rise in Flyte/Metaflow tutorials in Japanese throughout 2026.

CNCF AI & Data subgroup — the standardization current

In late 2024 the CNCF officially launched the AI & Data subgroup. As of May 2026, these projects sit under it.

- **Kubeflow Pipelines, Kubeflow Training Operator**: ML pipelines + distributed training.

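- **Flyte**: workflow. As noted earlier, Flyte itself is hosted under LF AI & Data rather than the CNCF, but it is tracked alongside these projects.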

- **KServe**: serving (incubation).

- **Volcano**: batch scheduler.

- **Spark Operator**: Spark on K8s.

- **Argo Workflows**: classical but frequently used for ML workflows.

As CNCF becomes the standardization hub, the **ML-on-K8s** interface is converging. Managed platforms are increasingly adopting the KServe v1/v2 inference protocols.
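For readers who have not seen the v2 inference protocol, the sketch below builds a request body and parses a response using only the standard library. The field names (`inputs`, `name`, `shape`, `datatype`, `data`, `outputs`) and the `/v2/models/<name>/infer` path follow the protocol; the tensor name `input-0`, the model name, and the sample response are assumptions, since real tensor names depend on the deployed model.

```python
import json
from typing import Dict, List


def infer_path(model_name: str) -> str:
    """HTTP path for an inference call under the v2 protocol."""
    return f"/v2/models/{model_name}/infer"


def build_v2_request(rows: List[List[float]], input_name: str = "input-0") -> dict:
    """Pack a 2-D batch of floats into a single FP32 input tensor."""
    n_cols = len(rows[0])
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same number of columns")
    return {
        "inputs": [
            {
                "name": input_name,  # assumption: depends on the served model
                "shape": [len(rows), n_cols],
                "datatype": "FP32",
                "data": [float(v) for row in rows for v in row],  # flattened row-major
            }
        ]
    }


def parse_v2_response(body: str) -> Dict[str, list]:
    """Map each output tensor name to its data."""
    return {out["name"]: out["data"] for out in json.loads(body)["outputs"]}


request_body = build_v2_request([[5.1, 3.5, 1.4, 0.2]])
request_json = json.dumps(request_body)

# Hand-written sample response for illustration, not captured from a server.
SAMPLE_RESPONSE = json.dumps({
    "model_name": "iris-rf",
    "outputs": [{"name": "output-0", "shape": [1], "datatype": "INT64", "data": [0]}],
})
predictions = parse_v2_response(SAMPLE_RESPONSE)
```

Because the body is plain JSON with a fixed shape, the same client code can target any server that implements the protocol, which is the practical meaning of the convergence described above.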

Composition patterns — how real production stacks are wired

Real companies rarely use one tool end to end. Common composition patterns:

- **Startup minimum**: MLflow (tracking + registry) + Metaflow (workflow) + BentoML (serving) + Evidently (monitoring).

- **K8s enterprise**: Kubeflow (pipelines) + MLflow (tracking) + KServe (serving) + Feast (features) + DVC (data).

- **AWS-managed first**: SageMaker Pipelines + SageMaker Model Registry + SageMaker Endpoints + W&B (SaaS tracking) + Evidently (monitor).

- **GCP-managed first**: Vertex AI Pipelines + Vertex Endpoints + W&B + Featureform.

- **LLM-first startup**: Hugging Face Hub (models) + Langfuse (tracing) + BentoML 1.4 (serving) + ZenML (workflow abstraction).

The key is **"what is first-class and what is secondary"**. If K8s is first-class, you go Kubeflow + KServe; if Python workflows are, Metaflow + BentoML; if LLMs are, HF Hub + Langfuse + vLLM.

Adoption roadmap — zero to production

If you are introducing MLOps from scratch, the safest order is:

1. **Experiment tracking** first. MLflow or W&B. Done within a week.

2. **Model registry** in the same tracking tool. A separate tool is not needed yet.

3. **Model serving** with at least BentoML or FastAPI + ONNX/TorchScript. Managed (SageMaker Endpoints, Vertex Endpoints) is fine too.

4. **Pipeline orchestration** is cron + Makefile when you have 1-2 data scientists. Add Metaflow/ZenML when you have three or more or you need daily training.

5. **Data versioning** kicks in once datasets exceed gigabytes and change often: DVC or lakeFS.

6. **Feature store** comes after the first train-serve consistency incident. Do not install it on day one.

7. **Monitoring** starts once there is at least one production model with real users.

Installing all seven layers on day one almost always fails. Adopting **one layer at a time** is the established wisdom.

Closing — May 2026, "MLOps remains a mosaic"

We opened by saying the stack has standardized, but it is also true that **MLOps remains a mosaic**. No single company runs all seven layers under one vendor. OSS combos like MLflow + Metaflow + BentoML + DVC + Evidently are still the most common.

The biggest shift is **the split between LLMOps and classical MLOps**. Even within one company, the two tracks now run separate tool stacks. LLMOps coalesces around LangSmith/Langfuse/Phoenix; classical solidifies around MLflow/Kubeflow.

Do not over-optimize the tool selection. Any combination that gives you **"four-way versioning of code, data, models, and results"** solves 90% of the problem. The rest is priorities.
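As a closing illustration, four-way versioning can be as small as one manifest per run that pins the code commit, a data hash, a model hash, and the resulting metrics. The sketch below is one possible shape under stated assumptions; `RunManifest`, `build_manifest`, `manifest_id`, and the sample commit string are hypothetical and not tied to any tool in this article.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class RunManifest:
    code_commit: str           # e.g. a Git commit hash
    data_sha256: str           # hash of the training data
    model_sha256: str          # hash of the serialized model
    metrics: Dict[str, float]  # the results of this run


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_manifest(code_commit: str, data: bytes, model: bytes,
                   metrics: Dict[str, float]) -> RunManifest:
    return RunManifest(code_commit, sha256_hex(data), sha256_hex(model), metrics)


def manifest_id(manifest: RunManifest) -> str:
    """Short, deterministic identifier derived from all four components."""
    payload = json.dumps(asdict(manifest), sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]


run_a = build_manifest("abc1234", b"a,b\n1,2\n", b"model-bytes", {"train_acc": 0.97})
run_b = build_manifest("abc1234", b"a,b\n1,2\n", b"model-bytes", {"train_acc": 0.97})
run_c = build_manifest("abc1234", b"a,b\n9,9\n", b"model-bytes", {"train_acc": 0.97})
```

Any tool combination that lets you reconstruct a record like this for every production model is doing the essential job, whatever the logos on the stack diagram.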

References

- MLflow official docs: https://mlflow.org/docs/latest/index.html

- Kubeflow official docs: https://www.kubeflow.org/docs/

- Metaflow official docs: https://docs.metaflow.org/

- Flyte official docs: https://docs.flyte.org/

- ZenML official docs: https://docs.zenml.io/

- BentoML official docs: https://docs.bentoml.com/

- ClearML official docs: https://clear.ml/docs/latest/docs/

- DVC official docs: https://dvc.org/doc

- Weights & Biases official docs: https://docs.wandb.ai/

- Comet ML official docs: https://www.comet.com/docs/v2/

- Neptune.ai official docs: https://docs.neptune.ai/

- Aim official docs: https://aimstack.readthedocs.io/

- Feast official docs: https://docs.feast.dev/

- Hopsworks official docs: https://docs.hopsworks.ai/

- Featureform official docs: https://docs.featureform.com/

- lakeFS official docs: https://docs.lakefs.io/

- Pachyderm official docs: https://docs.pachyderm.com/

- Evidently AI official docs: https://docs.evidentlyai.com/

- Arize AI official docs: https://docs.arize.com/arize

- WhyLabs official docs: https://docs.whylabs.ai/

- Seldon Core official docs: https://docs.seldon.io/projects/seldon-core/en/latest/

- KServe official docs: https://kserve.github.io/website/

- NVIDIA Triton official docs: https://docs.nvidia.com/deeplearning/triton-inference-server/

- Ray Serve official docs: https://docs.ray.io/en/latest/serve/index.html

- Langfuse official docs: https://langfuse.com/docs

- LangSmith official docs: https://docs.smith.langchain.com/

- CNCF AI & Data Working Group: https://github.com/cncf/toc/blob/main/tags/tag-ai.md