Open Source ML Platforms & MLOps 2026 Deep Dive - Kubeflow, Metaflow, Flyte, ZenML, MLflow, BentoML, ClearML, DVC, Weights & Biases
Introduction — by May 2026, the MLOps stack has standardized
Up through 2023, the MLOps stack was a "pick whatever works for your company" wilderness. By May 2026, much of that ambiguity has cleared. **MLflow and Weights & Biases** split experiment tracking. **Kubeflow Pipelines and Flyte** own Kubernetes-native orchestration. **Metaflow and ZenML** own Python-first workflows. **BentoML 1.4, KServe, and Triton** dominate serving. **DVC and lakeFS** own data versioning.
This article is not a marketing matrix. It is an honest, by-the-layer comparison of "what goes where in production in 2026" — covering MLflow 3.0's changes, the Kubeflow 1.10 lineup, Flyte's commercial trajectory via Union.ai, ZenML Cloud, and BentoML 1.4's LLM serving mode, with real API shapes.
The 2026 MLOps stack — seven layers
Let us start with the big picture. The standard 2026 MLOps stack splits into seven layers.
1. **Experiment tracking** — runs, parameters, metrics, artifacts
2. **Pipeline orchestration** — DAG execution, distribution, caching
3. **Model registry** — model versions, stages, metadata
4. **Model serving** — online, batch, streaming inference
5. **Feature store** — train/inference consistency for features
6. **Data + experiment versioning** — datasets and code in sync
7. **ML monitoring** — drift, performance decay, data quality
The era when one or two tools per layer sufficed is over. Today, **even within a layer there is a split between LLM-specific and classical-ML tracks**. We go layer by layer below.
Experiment tracking — MLflow 3.0 vs Weights & Biases
Two tools account for the bulk of experiment tracking.
- **MLflow 3.0**: OSS created by Databricks and hosted by the Linux Foundation. 3.0 shipped in mid-2025. GenAI tracing, evaluation, and prompt registry are now first-class citizens.
- **Weights & Biases**: SaaS-first but with an open SDK. Dominates on UX and visualization. W&B Models, W&B Weave (LLM tracing), and W&B Launch ship as one bundle.
The alternatives still matter.
- **Comet ML**: a SaaS that integrates ML + LLM experiments + production monitoring.
- **Neptune.ai**: repositioned as a metadata store for foundation model training.
- **Aim**: Apache 2.0 OSS, self-hosted, with a snappy UI.
- **TensorBoard, PyTorch Lightning Logger**: limited as standalone trackers; usually combined with the above.
A typical MLflow 3.0 flow looks like this.
```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Point the client at the tracking server and pick (or create) the experiment.
mlflow.set_tracking_uri("http://mlflow.internal:5000")
mlflow.set_experiment("iris-rf-2026")

X, y = load_iris(return_X_y=True)

with mlflow.start_run() as run:
    mlflow.log_param("n_estimators", 200)
    model = RandomForestClassifier(n_estimators=200)
    model.fit(X, y)
    mlflow.log_metric("train_acc", model.score(X, y))
    # MLflow 3 takes `name=`; on 2.x pass `artifact_path="model"` instead.
    mlflow.sklearn.log_model(model, name="model", registered_model_name="iris-rf")

print(run.info.run_id)
```
The same flow in W&B is just as terse.
```python
import wandb
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Hyperparameters go in `config`; metrics are logged as dicts.
wandb.init(project="iris-rf-2026", config={"n_estimators": 200})

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=200).fit(X, y)
wandb.log({"train_acc": model.score(X, y)})
wandb.finish()
```
Both tools have auto-instrumentation for scikit-learn, PyTorch, XGBoost, and others, so you get baseline metrics even without explicit logging. The difference is **storage model and governance**. MLflow defaults to self-hosted under the permissive Apache 2.0 license; W&B is SaaS-first but has by far the smoother collaboration UX.
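Whatever the vendor, the record a tracker keeps per run has roughly the same shape: an ID, the parameters, and a history for each metric. The following is a minimal standard-library sketch of that data model. `TinyTracker` and its one-JSON-file-per-run layout are hypothetical and are not how MLflow or W&B store data.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


class TinyTracker:
    """Toy experiment tracker: one JSON file per run (hypothetical layout)."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def start_run(self, experiment: str) -> dict:
        # A run is just a record with an ID, parameters, and metric history.
        return {
            "run_id": uuid.uuid4().hex,
            "experiment": experiment,
            "start_time": time.time(),
            "params": {},
            "metrics": {},
        }

    def log_param(self, run: dict, key: str, value) -> None:
        run["params"][key] = value

    def log_metric(self, run: dict, key: str, value: float, step: int = 0) -> None:
        # Metrics keep their full history so curves can be plotted later.
        run["metrics"].setdefault(key, []).append({"step": step, "value": value})

    def end_run(self, run: dict) -> Path:
        path = self.root / f"{run['run_id']}.json"
        path.write_text(json.dumps(run, indent=2))
        return path


tracker = TinyTracker(tempfile.mkdtemp())
run = tracker.start_run("iris-rf-2026")
tracker.log_param(run, "n_estimators", 200)
tracker.log_metric(run, "train_acc", 0.97, step=0)
tracker.log_metric(run, "train_acc", 0.99, step=1)
saved = json.loads(tracker.end_run(run).read_text())
print(saved["params"], [m["value"] for m in saved["metrics"]["train_acc"]])
```

Real trackers add a server, access control, and artifact storage on top, which is where the governance differences show up.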
Pipeline orchestration — K8s-native vs Python-first
ML pipelines differ from generic ETL in that they must handle GPU scheduling, caching, and reproducibility together. As of May 2026, four tools split the market.
- **Kubeflow Pipelines + Kubeflow 1.10**: CNCF incubation project. The canonical K8s-native ML platform. Components are containers.
- **Metaflow**: built by Netflix, commercialized by Outerbounds. Python decorator-first. Deeply integrated with AWS Batch and Step Functions.
- **Flyte**: built by Lyft, commercialized by Union.ai. Under LF AI & Data. K8s-native + type-safe.
- **ZenML**: framework-agnostic, positioned as an "abstraction layer". MLflow, W&B, Kubeflow, Airflow are pluggable backends.
Adjacent tools — **Prefect, Dagster, Airflow** — also see ML use, but they are general data orchestrators rather than ML-specific, so we cover them separately in iter72/iter53.
DSL ergonomics rank roughly as **ZenML > Metaflow > Flyte > Kubeflow Pipelines**. The benchmark is whether a single Python file is enough.
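Underneath the differing DSLs, all four tools do the same basic job: resolve a dependency graph and run steps in a valid order. A minimal sketch of that core with the standard library's `graphlib` might look like this. The step names are hypothetical, and a real orchestrator would launch containers or pods instead of appending to a list.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical pipeline).
dag = {
    "load": set(),
    "preprocess": {"load"},
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}


def run_pipeline(dag: dict[str, set[str]]) -> list[str]:
    """Execute steps in dependency order and return the order used."""
    executed = []
    for step in TopologicalSorter(dag).static_order():
        # A real orchestrator would launch a container or pod here.
        executed.append(step)
    return executed


order = run_pipeline(dag)
print(order)
```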
Kubeflow 1.10 — a full ML platform on Kubernetes
Kubeflow is not a single tool but a **full ML platform on top of Kubernetes**. As of May 2026, the 1.10 lineup has these core components.
- **Kubeflow Pipelines (KFP)**: DAG pipeline SDK and UI.
- **Katib**: distributed hyperparameter tuning.
- **Training Operator**: PyTorchJob, TFJob, MPIJob, PaddleJob CRDs for distributed training.
- **KServe**: model serving (formerly KFServing). Split into a sibling project but integrated.
- **Notebook Controller**: JupyterHub-style notebook instances.
- **Spark Operator, Volcano**: batch scheduling.
A KFP 2.x DSL example.
```python
from kfp import compiler, dsl
from kfp.dsl import Dataset, Input, Output

@dsl.component(base_image="python:3.12", packages_to_install=["pandas"])
def preprocess(input_path: str, clean_data: Output[Dataset]):
    # Components run in their own container, so imports live inside the function.
    import pandas as pd
    df = pd.read_csv(input_path)
    # Write to the artifact path KFP provides so downstream steps can read it.
    df.dropna().to_csv(clean_data.path, index=False)

@dsl.component(base_image="python:3.12", packages_to_install=["pandas", "scikit-learn"])
def train(clean_data: Input[Dataset]) -> float:
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    df = pd.read_csv(clean_data.path)
    X, y = df.drop("label", axis=1), df["label"]
    return float(RandomForestClassifier().fit(X, y).score(X, y))

@dsl.pipeline(name="iris-pipeline")
def pipeline(input_path: str = "/data/iris.csv"):
    pre = preprocess(input_path=input_path)
    # Pass the Dataset artifact, not a hard-coded local path, between steps.
    train(clean_data=pre.outputs["clean_data"])

compiler.Compiler().compile(pipeline, "iris.yaml")
```
Kubeflow rewards teams who can run Kubernetes, but its onboarding cost is the highest of the four. That is why many teams turn to a managed service instead: Vertex AI Pipelines runs KFP-compiled pipelines directly, while SageMaker and Azure ML offer their own pipeline services.
Metaflow — workflows that start with one Python decorator
Metaflow is the workflow library Netflix designed with data scientists in mind. Outerbounds offers commercial hosting; the core is Apache 2.0.
The abstractions are minimal.
- **FlowSpec**: the workflow class.
- **`@step`**: the step decorator.
- **`self.next(...)`**: explicit branching.
- **`@batch`, `@kubernetes`, `@resources`**: per-step execution-environment and resource decorators (GPUs are requested through arguments such as `gpu=1`).
- **`@retry`, `@timeout`, `@catch`**: reliability decorators.
A typical Metaflow flow.
```python
from metaflow import FlowSpec, step, batch, retry


class IrisFlow(FlowSpec):
    @step
    def start(self):
        from sklearn.datasets import load_iris
        self.X, self.y = load_iris(return_X_y=True)
        self.next(self.train)

    @batch(cpu=4, memory=16000)
    @retry(times=2)
    @step
    def train(self):
        from sklearn.ensemble import RandomForestClassifier
        self.model = RandomForestClassifier(n_estimators=200).fit(self.X, self.y)
        self.acc = self.model.score(self.X, self.y)
        self.next(self.end)

    @step
    def end(self):
        print(f"acc={self.acc:.3f}")


if __name__ == "__main__":
    IrisFlow()
```
Metaflow's strength is **local-first**. It runs in your notebook, and `--with batch` is all you need to scale to AWS Batch. Automatic artifact storage (S3) plus automatic tracking are built in, which is why many teams skip a separate MLflow.
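The reliability decorators are easier to reason about once you see how little machinery they need. The following is a plain-Python sketch of the idea behind a retry decorator, not Metaflow's implementation. The `retry` function, its `delay_s` parameter, and `flaky_step` are hypothetical names for illustration.

```python
import functools
import time


def retry(times: int = 3, delay_s: float = 0.0):
    """Re-run the wrapped function up to `times` extra attempts on failure."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(times + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == times:
                        raise  # out of retries: surface the last error
                    time.sleep(delay_s)
        return wrapper
    return decorator


calls = {"n": 0}


@retry(times=2)
def flaky_step() -> str:
    # Fails twice, then succeeds, to simulate a transient error.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = flaky_step()
print(result, calls["n"])
```

With `times=2` the step gets three attempts in total, which is why the third call succeeds here.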
Flyte — Kubernetes-native, type-safe workflows
Flyte is the K8s-native workflow tool built by Lyft and commercialized by Union.ai. It is a graduated project under LF AI & Data. Its biggest selling points are **type safety and caching**.
- **Python type annotations** double as input/output schemas.
- **Built-in caching**: once a task opts in with `cache=True`, identical inputs under the same `cache_version` hit the cache. Significant cost savings.
- **K8s-native**: Pod, Deployment, and GPU/TPU scheduling are first-class.
- **Multi-language**: Python is primary; Java/Scala SDKs exist.
A Flyte example.
```python
import pandas as pd
from flytekit import Resources, task, workflow
from flytekit.types.file import FlyteFile


@task(cache=True, cache_version="1.0", requests=Resources(cpu="2", mem="4Gi"))
def preprocess(input_path: str) -> FlyteFile:
    df = pd.read_csv(input_path).dropna()
    out = "/tmp/clean.csv"
    df.to_csv(out, index=False)
    # Returning a FlyteFile uploads the file to blob storage, so the next task
    # (which may run in a different pod) can read it.
    return FlyteFile(out)


@task(requests=Resources(cpu="4", mem="16Gi", gpu="1"))
def train(data: FlyteFile) -> float:
    from sklearn.ensemble import RandomForestClassifier
    df = pd.read_csv(data.download())  # fetch the file into this task's pod
    X, y = df.drop("label", axis=1), df["label"]
    return float(RandomForestClassifier().fit(X, y).score(X, y))


@workflow
def iris_wf(input_path: str = "/data/iris.csv") -> float:
    clean = preprocess(input_path=input_path)
    return train(data=clean)
```
Flyte's edge is being **K8s-friendly with caching that actually works**. Running the same code on the same data hits cache automatically, saving time and money.
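The caching idea itself is simple: derive a key from the task name, a version string, and the inputs, then reuse the stored output when the key repeats. The sketch below shows that idea in plain Python under stated assumptions. It is not Flyte's implementation, and `cached_task`, `_cache`, and `stats` are hypothetical names.

```python
import hashlib
import json

_cache: dict[str, object] = {}
stats = {"hits": 0, "misses": 0}


def cached_task(cache_version: str):
    def decorator(fn):
        def wrapper(**kwargs):
            # Key = task name + version + canonical JSON of the inputs.
            payload = json.dumps([fn.__name__, cache_version, kwargs], sort_keys=True)
            key = hashlib.sha256(payload.encode()).hexdigest()
            if key in _cache:
                stats["hits"] += 1
                return _cache[key]
            stats["misses"] += 1
            _cache[key] = fn(**kwargs)
            return _cache[key]
        return wrapper
    return decorator


@cached_task(cache_version="1.0")
def preprocess(input_path: str) -> str:
    return input_path.replace(".csv", ".clean.csv")


first = preprocess(input_path="/data/iris.csv")
second = preprocess(input_path="/data/iris.csv")  # same inputs: cache hit
third = preprocess(input_path="/data/other.csv")  # new input: cache miss
print(stats)
```

Bumping `cache_version` changes every key, which is the usual way to invalidate results after the task's logic changes.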
ZenML — a framework-agnostic abstraction layer
ZenML does not compete with the tools above; it sits **on top of them as an abstraction layer**. ZenML Pipelines can use Kubeflow, Airflow, Tekton, Vertex, SageMaker, or AWS Step Functions as the backend.
Core abstractions.
- **`@step`, `@pipeline`**: Python decorators for steps and pipelines.
- **Stack**: an environment composed of orchestrator + artifact_store + container_registry + experiment_tracker.
- **Component**: a swappable implementation per layer — MLflow, W&B, Neptune.ai, Kubeflow, Vertex, etc.
- **Materializer**: a serializer for user-defined types.
ZenML code.
```python
from typing import Tuple

import pandas as pd
from zenml import pipeline, step


@step
def load() -> pd.DataFrame:
    return pd.read_csv("/data/iris.csv")


@step
def split(df: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame]:
    # First 120 rows for training, the rest held out.
    return df.iloc[:120], df.iloc[120:]


@step
def train(train_df: pd.DataFrame) -> float:
    from sklearn.ensemble import RandomForestClassifier
    X, y = train_df.drop("label", axis=1), train_df["label"]
    return float(RandomForestClassifier().fit(X, y).score(X, y))


@pipeline
def iris_pipeline():
    df = load()
    tr, _ = split(df)
    train(tr)


if __name__ == "__main__":
    iris_pipeline()  # runs on whichever stack is currently active
```
ZenML's value is **deferring vendor lock-in**. Moving from local execution to SageMaker to Kubeflow requires almost no code change. The cost is the abstraction tax: to use every backend-specific feature, you eventually have to call that backend's SDK directly.
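The stack idea boils down to programming against an interface and swapping the implementation behind it. A minimal sketch of that pattern in plain Python follows. It illustrates the concept only and is not ZenML's API: `Orchestrator`, `LocalOrchestrator`, `FakeRemoteOrchestrator`, and `Stack` are hypothetical names, and the bucket URI is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Orchestrator(Protocol):
    def run(self, steps: list[Callable[[], str]]) -> list[str]: ...


class LocalOrchestrator:
    def run(self, steps):
        # Run every step in-process, in order.
        return [f"local:{step()}" for step in steps]


class FakeRemoteOrchestrator:
    """Stand-in for a cluster backend; a real one would submit jobs."""

    def run(self, steps):
        return [f"remote:{step()}" for step in steps]


@dataclass
class Stack:
    orchestrator: Orchestrator
    artifact_store: str  # e.g. a local path or a bucket URI


def load() -> str:
    return "load"


def train() -> str:
    return "train"


pipeline_steps = [load, train]
local = Stack(LocalOrchestrator(), "/tmp/artifacts")
remote = Stack(FakeRemoteOrchestrator(), "s3://example-bucket/artifacts")

# The pipeline definition is identical; only the stack changes.
local_result = local.orchestrator.run(pipeline_steps)
remote_result = remote.orchestrator.run(pipeline_steps)
print(local_result, remote_result)
```

The abstraction tax is visible even here: anything a backend can do that is not expressible through `run` is out of reach without bypassing the interface.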
Model registry — MLflow Registry vs BentoML Model Store vs Hugging Face Hub
A trained model has to live somewhere with versions, stages (Staging/Production), and metadata. Three candidates show up most often as of May 2026.
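- **MLflow Model Registry**: bundled with MLflow and closest to a standard. URI pattern `models:/name/Production`; stages have been deprecated since MLflow 2.9 in favor of aliases, addressed as `models:/name@alias`.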
- **BentoML Model Store**: paired with BentoML serving. Flow is `bentoml.transformers.save_model()`.
- **Hugging Face Hub**: public/private repos for model sharing. The de facto standard for the transformers/diffusion ecosystem.
Enterprises usually combine **MLflow Model Registry + their own S3**, open-model and LLM teams use a **private Hugging Face Hub repo**, and BentoML-centric ops teams use the **BentoML Model Store** as-is.
Loading from MLflow Registry.
```python
import mlflow

model_uri = "models:/iris-rf/Production"
model = mlflow.pyfunc.load_model(model_uri)
predictions = model.predict([[5.1, 3.5, 1.4, 0.2]])
```
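What makes a registry useful is the indirection: callers ask for a label such as `Production`, and the registry decides which immutable version that label points to. The following in-memory sketch shows that resolution step under stated assumptions. `ModelRegistry` and its methods are hypothetical, the bucket URIs are placeholders, and this is not MLflow's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """In-memory toy registry: versions are append-only, labels are movable."""

    versions: dict[str, list[str]] = field(default_factory=dict)
    labels: dict[tuple[str, str], int] = field(default_factory=dict)

    def register(self, name: str, artifact_uri: str) -> int:
        self.versions.setdefault(name, []).append(artifact_uri)
        return len(self.versions[name])  # versions start at 1

    def set_label(self, name: str, label: str, version: int) -> None:
        if not 1 <= version <= len(self.versions.get(name, [])):
            raise ValueError(f"unknown version {version} for {name}")
        self.labels[(name, label)] = version

    def resolve(self, uri: str) -> str:
        # Accepts "models:/<name>/<version-or-label>" or "models:/<name>@<label>".
        body = uri.removeprefix("models:/")
        if "@" in body:
            name, label = body.split("@", 1)
        else:
            name, label = body.rsplit("/", 1)
        version = int(label) if label.isdigit() else self.labels[(name, label)]
        return self.versions[name][version - 1]


registry = ModelRegistry()
v1 = registry.register("iris-rf", "s3://example-bucket/iris-rf/1")
v2 = registry.register("iris-rf", "s3://example-bucket/iris-rf/2")

registry.set_label("iris-rf", "Production", v1)
before = registry.resolve("models:/iris-rf/Production")
registry.set_label("iris-rf", "Production", v2)  # promote without touching callers
after = registry.resolve("models:/iris-rf/Production")
print(before, after)
```

Promotion and rollback become a single label move, and serving code that loads `models:/iris-rf/Production` never changes.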
Model serving — BentoML 1.4, KServe, Triton, Ray Serve split the work
No single serving tool covers every case. The split depends on scenario.
- **BentoML 1.4 + Yatai**: Python-friendly serving framework. Packages model + business logic as a single "Bento". 1.4 in Q1 2026 made LLM serving mode (vLLM, TGI backends) GA.
- **Seldon Core 2**: K8s-native serving. Mesh-based multi-model routing.
- **KServe**: formerly KFServing. Often paired with Kubeflow but standalone-capable. Serverless/autoscale + standard inference protocol.
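- **TorchServe**: built by Meta + AWS. Long the default PyTorch server, though the upstream repository has moved to limited maintenance, so check its status before adopting it for new work.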
- **TensorFlow Serving**: serves TF SavedModels. C++ core, very stable.
- **NVIDIA Triton Inference Server**: GPU-optimal serving. Integrates TensorRT, ONNX, PyTorch, TF, vLLM backends.
- **Ray Serve**: Python serving atop Ray clusters. Managed by Anyscale.
A BentoML 1.4 LLM serving example.
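```python
import bentoml
from vllm import LLM, SamplingParams

MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"

# Illustrative resources: a 70B model needs far more GPU memory than a single
# typical GPU provides, so size this (or quantize) for your hardware.
@bentoml.service(resources={"gpu": 1, "memory": "16Gi"})
class LlamaService:
    def __init__(self) -> None:
        # vLLM loads the matching tokenizer itself; no separate tokenizer needed.
        self.llm = LLM(model=MODEL_ID)

    @bentoml.api
    def generate(self, prompt: str, max_tokens: int = 256) -> str:
        # vLLM expects a SamplingParams object, not a plain dict.
        params = SamplingParams(max_tokens=max_tokens)
        outputs = self.llm.generate([prompt], sampling_params=params)
        return outputs[0].outputs[0].text
```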
For classical (non-LLM) models, TF Serving, TorchServe, and Triton are faster and more stable. LLM inference plugs an engine (vLLM, SGLang, TGI, TensorRT-LLM — see the iter69 inference-engine deep dive) into BentoML or Triton as the standard pattern.
Feature stores — Feast 0.40, Hopsworks, Featureform, Tecton
The feature store layer keeps the features used in training identical to those served at inference.
- **Feast 0.40**: Apache 2.0 OSS. K8s or local deployment. Online stores: Redis, DynamoDB, Bigtable.
- **Hopsworks**: open core. Both self-hosted and SaaS. Feature store + notebooks + serving as one.
- **Featureform**: a virtualization layer that abstracts existing data warehouses into feature stores.
- **Tecton**: commercial SaaS. Founded by the creators of Uber's Michelangelo platform and long a core contributor to Feast. Common in enterprise.
A Feast 0.40 definition example.
```python
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64

user = Entity(name="user", join_keys=["user_id"])

user_stats_source = FileSource(
    path="s3://feast-data/user_stats.parquet",
    timestamp_field="event_ts",
)

user_stats_view = FeatureView(
    name="user_stats",
    entities=[user],
    ttl=timedelta(days=7),
    schema=[
        Field(name="purchase_count_7d", dtype=Int64),
        Field(name="purchase_value_7d", dtype=Float32),
    ],
    source=user_stats_source,
)
```
It is tempting to put feature stores off forever. But the moment a "train-serve skew" incident happens, they become mandatory.
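Train-serve skew usually comes from training on feature values that would not have been available, or would already have expired, at prediction time. A point-in-time lookup with a TTL is the core guard against that. The sketch below shows the rule in plain Python. The rows and the `point_in_time_lookup` helper are hypothetical, and real feature stores do this as a join over many entities.

```python
from datetime import datetime, timedelta

# (user_id, event timestamp, feature value); hypothetical rows
feature_log = [
    (1, datetime(2026, 5, 1), 3),
    (1, datetime(2026, 5, 6), 5),
    (1, datetime(2026, 5, 20), 9),
]


def point_in_time_lookup(rows, user_id, as_of, ttl):
    """Return the latest value at or before `as_of`, if still within `ttl`."""
    candidates = [
        (ts, value)
        for uid, ts, value in rows
        if uid == user_id and ts <= as_of and as_of - ts <= ttl
    ]
    # max() compares timestamps first, so this picks the most recent row.
    return max(candidates)[1] if candidates else None


ttl = timedelta(days=7)
at_training_time = point_in_time_lookup(feature_log, 1, datetime(2026, 5, 8), ttl)
stale = point_in_time_lookup(feature_log, 1, datetime(2026, 5, 19), ttl)
print(at_training_time, stale)
```

The May 20 row is never visible to a May 8 training example, and the May 19 lookup returns nothing because every earlier row is older than the TTL.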
Data + experiment versioning — DVC, lakeFS, Pachyderm, Quilt
Git handles code; data is another story. Four tools split the data-versioning market.
- **DVC (Data Version Control)**: data and model versioning on top of Git. `.dvc` metafiles track large files while real data lives on S3/GCS/Azure.
- **lakeFS**: a Git model for the data lake. Branches, merges, and commits work at petabyte scale on S3.
- **Pachyderm**: data versioning + data lineage + automatic pipeline triggers.
- **Quilt Data**: a data-package catalog.
A DVC flow.
```bash
git init
dvc init
dvc remote add -d storage s3://my-bucket/dvc
dvc add data/raw/train.csv   # writes train.csv.dvc and data/raw/.gitignore
git add data/raw/train.csv.dvc data/raw/.gitignore .dvc/config
git commit -m "track training data"
dvc push   # data to S3
git push   # metadata to Git
```
DVC's other strength is **DVC Pipelines**. Define stages and dependencies in `dvc.yaml` and it re-runs only the changed stages. With little infrastructure you get a reproducible ML workflow.
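The "re-run only what changed" behavior rests on comparing content hashes of a stage's dependencies against the hashes recorded at the last run. The following is a minimal sketch of that check, not DVC's implementation. `stage_is_stale`, `update_lock`, and the in-memory `lock` dict are hypothetical, and SHA-256 is used here purely for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stage_is_stale(deps: list[Path], lock: dict[str, str]) -> bool:
    """A stage must re-run if any dependency hash differs from the lock."""
    return any(lock.get(str(p)) != file_hash(p) for p in deps)


def update_lock(deps: list[Path], lock: dict[str, str]) -> None:
    for p in deps:
        lock[str(p)] = file_hash(p)


workdir = Path(tempfile.mkdtemp())
data = workdir / "train.csv"
data.write_text("a,b\n1,2\n")

lock: dict[str, str] = {}
first_check = stage_is_stale([data], lock)   # never run: stale
update_lock([data], lock)
second_check = stage_is_stale([data], lock)  # unchanged: skip
data.write_text("a,b\n1,2\n3,4\n")
third_check = stage_is_stale([data], lock)   # content changed: re-run
print(first_check, second_check, third_check)
```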
ML monitoring — Evidently, Arize, WhyLabs, Fiddler
Once deployed, models degrade as data shifts (data/concept drift). Monitoring catches that.
- **Evidently AI**: Apache 2.0 OSS. Reports and dashboards generated from Python. K8s self-hostable.
- **Arize AI**: commercial SaaS. Unified ML + LLM observability. Phoenix is the OSS spin-off.
- **WhyLabs**: data quality + drift + anomaly detection. Has a free tier.
- **Fiddler AI**: explainability-first. Enterprise compliance track.
An Evidently example.
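```python
import pandas as pd

# These import paths match the Report API used here; newer Evidently releases
# reorganized the modules, so check the docs for your installed version.
from evidently.metric_preset import DataDriftPreset
from evidently.report import Report

ref = pd.read_csv("reference.csv")  # data the model was trained on
cur = pd.read_csv("current.csv")    # recent production data

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=ref, current_data=cur)
report.save_html("drift.html")
```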
Monitoring is "easy to set up, often turned off in production due to noise". Threshold tuning and dashboard customization end up being the harder problems.
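To see why threshold tuning is the hard part, it helps to compute a drift score by hand. The sketch below implements the Population Stability Index (PSI), one common drift statistic, in plain Python. It is not taken from any of the tools above. The bin count, the epsilon for empty bins, and the sample data are assumptions.

```python
import math


def psi(reference: list[float], current: list[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so out-of-range current values land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small epsilon avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


reference = [i / 100 for i in range(100)]
same = psi(reference, reference)
shifted = psi(reference, [x + 0.5 for x in reference])
print(same, shifted > 0.2)
```

A common rule of thumb treats a PSI above roughly 0.2 as a meaningful shift, but the bin count and epsilon move the score noticeably, which is exactly the tuning burden described above.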
Notebooks & IDEs — Jupyter, Marimo, Hex, Deepnote, Quarto
ML workflows still start in notebooks.
- **Jupyter / JupyterLab 4 / JupyterHub**: the de facto standard. Multi-user means JupyterHub. Covered in depth in the iter27 notebooks deep dive.
- **Marimo**: reactive notebooks. Cell execution order is determined by the dependency graph. Strong reproducibility.
- **Hex, Deepnote**: managed collaborative notebooks.
- **Quarto**: turns notebooks + Markdown into reports, books, and sites.
Marimo's distinguishing feature is clear: inter-cell dependencies are extracted from code analysis, eliminating an entire class of order-dependent bugs. Think of it as the notebook a data scientist hands off without needing a README.
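The dependency-extraction idea can be sketched with the standard library's `ast` module: find which names each cell assigns and reads, then order cells so that definitions run before uses. This is a simplified illustration, not Marimo's implementation. The cell names and contents are hypothetical, and it ignores scoping, imports, and attribute mutation.

```python
import ast
from graphlib import TopologicalSorter

# Cells deliberately listed out of order, as they might appear in a messy notebook.
cells = {
    "plot": "print(total)",
    "load": "xs = [1, 2, 3]",
    "sum": "total = sum(xs)",
}


def names(code: str) -> tuple[set[str], set[str]]:
    """Return (names assigned, names read) in a cell's source."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Name):
            (defined if isinstance(node.ctx, ast.Store) else used).add(node.id)
    return defined, used


def execution_order(cells: dict[str, str]) -> list[str]:
    info = {cell: names(code) for cell, code in cells.items()}
    owner = {name: cell for cell, (defined, _) in info.items() for name in defined}
    # A cell depends on whichever other cell defines a name it reads.
    graph = {
        cell: {owner[n] for n in used if n in owner and owner[n] != cell}
        for cell, (_, used) in info.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(cells)
print(order)
```

Because the order comes from the code rather than from the sequence in which cells were typed or run, the hidden-state bugs of a top-to-bottom notebook cannot occur.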
Vector DB integration — Pinecone, Weaviate, Milvus, Qdrant
RAG and embedding workflows demand a vector DB. iter62 covers vector DBs in depth; from the MLOps angle, the key is **integration with model serving**. BentoML, Ray Serve, and KServe all standardize the pattern of packaging vector DB clients alongside models.
Combined with LangChain/LlamaIndex-style LLM frameworks, vector DBs stop being "separate infrastructure" and become "first-class components of the ML service".
End-to-end "MLflow-style" managed platforms
If self-hosting is too much, you go managed.
- **Databricks Lakehouse Platform**: by MLflow's authors. Notebooks + Spark + MLflow + Unity Catalog in one.
- **Vertex AI**: Google. AutoML, Pipelines, Endpoints, Feature Store unified.
- **SageMaker**: AWS. Studio, Pipelines, Model Registry, Endpoints unified.
- **Azure ML**: Microsoft. Workspace + Designer + Endpoints + Responsible AI.
Each managed platform pushes its own SDK while still offering standard entry points compatible with MLflow, Kubeflow Pipelines, PyTorch, and TensorFlow. The trade-off is **zero infra burden vs vendor lock-in**.
LLM-specific MLOps (LLMOps) — LangSmith, Langfuse, Arize Phoenix, Helicone
A subcategory called "LLMOps" formed across 2024-2025. It overlaps with classical MLOps but differs on several axes.
- **Unit of trace**: instead of metrics, the primary data is the **prompt/response trace**.
- **Evaluation**: ground truth is fuzzy, so LLM-as-judge, rules, and user feedback combine.
- **Prompt registry**: prompt versions are managed separately from model versions.
- **Cost tracking**: per-token cost becomes the central cost metric.
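Two of the axes above, trace as the unit of data and per-token cost, can be combined in one small record. The following sketch attributes cost to prompt versions. `LLMTrace`, the model name, and the prices are all hypothetical, and real per-token prices vary by provider and model.

```python
from dataclasses import dataclass

# Hypothetical USD prices per 1M tokens; real prices vary by provider and model.
PRICES = {"example-model": {"input": 3.00, "output": 15.00}}


@dataclass
class LLMTrace:
    prompt_version: str
    model: str
    input_tokens: int
    output_tokens: int

    def cost_usd(self) -> float:
        p = PRICES[self.model]
        total = self.input_tokens * p["input"] + self.output_tokens * p["output"]
        return total / 1_000_000


traces = [
    LLMTrace("summarize-v1", "example-model", 1_000, 200),
    LLMTrace("summarize-v2", "example-model", 400, 200),
]

# Attribute cost to the prompt version, not just the model.
cost_by_prompt = {t.prompt_version: round(t.cost_usd(), 6) for t in traces}
print(cost_by_prompt)
```

Keying cost by prompt version is what makes a prompt registry pay off: a shorter prompt shows up directly as a lower cost per call.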
Key tools.
- **LangSmith**: standard within the LangChain world. SaaS + self-host.
- **Langfuse**: open-source self-host track. Tracing, evaluation, prompt management.
- **Arize Phoenix**: Arize's OSS spin-off. LLM tracing and evaluation.
- **Helicone**: OpenAI/Anthropic API gateway + traces.
Covered more deeply in iter77 (LLM observability) and iter83 (fine-tuning).
Korean MLOps — Naver Cloud, Kakao Enterprise, NCSOFT, Hyundai Motor Group
The Korean MLOps ecosystem clusters along these axes.
- **Naver Cloud ML platform**: HyperCLOVA X-based fine-tuning + an in-house MLOps toolset.
- **Kakao Enterprise**: the ML track on Kakao i Cloud. Mixes Kubeflow with in-house tools.
- **NCSOFT AI Center**: game-AI and character-AI pipelines built in-house.
- **Hyundai Motor Group AI Lab**: autonomous-driving data + model automation, mixing Flyte and Kubeflow.
- **Korean ML engineer communities**: MLOps Korea, Pseudo Lab, Modulabs tracks run regular meetups.
Big companies typically build internal platforms on top of MLflow/Kubeflow; startups often go straight to SageMaker or Vertex AI.
Japanese MLOps — PFN, Mercari, Cybozu, Recruit
Japan's MLOps ecosystem has these standout patterns.
- **PFN (Preferred Networks)**: authors of Optuna (HPO). Pair it with their in-house workflow Allgo.
- **Cybozu**: ML pipelines on kintone data. Argo Workflows plus in-house tools.
- **Recruit AI Lab**: ML pipelines in ads and recruiting. Vertex AI + MLflow.
- **Mercari ML platform**: runs an in-house ML platform called Merlin. KServe + Argo + Feast.
- **NTT Communications - PFN partnership**: large-scale training on telecom data.
In Japan, **Argo Workflows** (classical) and **Kubeflow + KServe** see high adoption. Qiita and Zenn have seen a rapid rise in Flyte/Metaflow tutorials in Japanese throughout 2026.
CNCF AI & Data subgroup — the standardization current
In late 2024 the CNCF officially launched the AI & Data subgroup. As of May 2026, these projects sit under it.
- **Kubeflow Pipelines, Kubeflow Training Operator**: ML pipelines + distributed training.
- **Flyte**: workflow. Governed by LF AI & Data (graduated) rather than CNCF, but usually grouped with these projects in practice.
- **KServe**: serving (incubation).
- **Volcano**: batch scheduler.
- **Spark Operator**: Spark on K8s.
- **Argo Workflows**: classical but frequently used for ML workflows.
As CNCF becomes the standardization hub, the **ML-on-K8s** interface is converging. Managed platforms are increasingly adopting the KServe v1/v2 inference protocols.
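For a sense of what converging on the inference protocol means in practice, here is a sketch that builds a request body in the V2 (Open Inference Protocol) shape: a list of named tensors, each with a shape, a datatype, and data. The tensor name `input-0` and the model name in the path are assumptions that depend on the deployed model.

```python
import json


def build_v2_request(name: str, rows: list[list[float]]) -> dict:
    """Build a V2 inference request body for a 2-D float tensor."""
    n_cols = len(rows[0])
    if any(len(r) != n_cols for r in rows):
        raise ValueError("all rows must have the same length")
    return {
        "inputs": [
            {
                "name": name,
                "shape": [len(rows), n_cols],
                "datatype": "FP32",
                # Tensor data is sent flattened in row-major order.
                "data": [x for row in rows for x in row],
            }
        ]
    }


payload = build_v2_request("input-0", [[5.1, 3.5, 1.4, 0.2]])
url_path = "/v2/models/iris-rf/infer"  # hypothetical model name
body = json.dumps(payload)
print(url_path, body)
```

Because the body does not depend on the serving runtime, the same client code can target any server that implements the protocol.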
Composition patterns — how real production stacks are wired
Real companies rarely use one tool end to end. Common composition patterns:
- **Startup minimum**: MLflow (tracking + registry) + Metaflow (workflow) + BentoML (serving) + Evidently (monitoring).
- **K8s enterprise**: Kubeflow (pipelines) + MLflow (tracking) + KServe (serving) + Feast (features) + DVC (data).
- **AWS-managed first**: SageMaker Pipelines + SageMaker Model Registry + SageMaker Endpoints + W&B (SaaS tracking) + Evidently (monitor).
- **GCP-managed first**: Vertex AI Pipelines + Vertex Endpoints + W&B + Featureform.
- **LLM-first startup**: Hugging Face Hub (models) + Langfuse (tracing) + BentoML 1.4 (serving) + ZenML (workflow abstraction).
The key is **"what is first-class and what is secondary"**. If K8s is first-class, you go Kubeflow + KServe; if Python workflows are, Metaflow + BentoML; if LLMs are, HF Hub + Langfuse + vLLM.
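One lightweight way to review a proposed stack is to write it down as a mapping from layer to tool and check which of the seven layers are still uncovered. The sketch below does this for the startup-minimum pattern. The layer keys and the `uncovered_layers` helper are hypothetical names chosen for illustration.

```python
LAYERS = [
    "experiment_tracking", "orchestration", "model_registry", "serving",
    "feature_store", "data_versioning", "monitoring",
]

# The "startup minimum" pattern, keyed by layer.
startup_minimum = {
    "experiment_tracking": "MLflow",
    "model_registry": "MLflow",
    "orchestration": "Metaflow",
    "serving": "BentoML",
    "monitoring": "Evidently",
}


def uncovered_layers(stack: dict[str, str]) -> list[str]:
    """List the layers a stack leaves empty, in canonical layer order."""
    unknown = set(stack) - set(LAYERS)
    if unknown:
        raise KeyError(f"unknown layers: {sorted(unknown)}")
    return [layer for layer in LAYERS if layer not in stack]


gaps = uncovered_layers(startup_minimum)
print(gaps)
```

An uncovered layer is not necessarily a problem; it is a deliberate statement of what is secondary for now.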
Adoption roadmap — zero to production
If you are introducing MLOps from scratch, the safest order is:
1. **Experiment tracking** first. MLflow or W&B. Done within a week.
2. **Model registry** in the same tracking tool. A separate tool is not needed yet.
3. **Model serving** with at least BentoML or FastAPI + ONNX/TorchScript. Managed (SageMaker Endpoints, Vertex Endpoints) is fine too.
4. **Pipeline orchestration** is cron + Makefile when you have 1-2 data scientists. Add Metaflow/ZenML when you have three or more or you need daily training.
5. **Data versioning** kicks in once datasets exceed gigabytes and change often: DVC or lakeFS.
6. **Feature store** comes after the first train-serve consistency incident. Do not install it on day one.
7. **Monitoring** starts once there is at least one production model with real users.
Installing all seven on day one fails 90% of the time. Adopting **one layer at a time** is the established wisdom.
Closing — May 2026, "MLOps remains a mosaic"
We opened with "the standard has crystallized" — but it is also true that **MLOps remains a mosaic**. No one company runs all seven layers under a single vendor. OSS combos like MLflow + Metaflow + BentoML + DVC + Evidently are still the most common.
The biggest shift is **the split between LLMOps and classical MLOps**. Even within one company, the two tracks now run separate tool stacks. LLMOps coalesces around LangSmith/Langfuse/Phoenix; classical solidifies around MLflow/Kubeflow.
Do not over-optimize the tool selection. Any combination that gives you **"four-way versioning of code, data, models, and results"** solves 90% of the problem. The rest is priorities.
References
- MLflow official docs: https://mlflow.org/docs/latest/index.html
- Kubeflow official docs: https://www.kubeflow.org/docs/
- Metaflow official docs: https://docs.metaflow.org/
- Flyte official docs: https://docs.flyte.org/
- ZenML official docs: https://docs.zenml.io/
- BentoML official docs: https://docs.bentoml.com/
- ClearML official docs: https://clear.ml/docs/latest/docs/
- DVC official docs: https://dvc.org/doc
- Weights & Biases official docs: https://docs.wandb.ai/
- Comet ML official docs: https://www.comet.com/docs/v2/
- Neptune.ai official docs: https://docs.neptune.ai/
- Aim official docs: https://aimstack.readthedocs.io/
- Feast official docs: https://docs.feast.dev/
- Hopsworks official docs: https://docs.hopsworks.ai/
- Featureform official docs: https://docs.featureform.com/
- lakeFS official docs: https://docs.lakefs.io/
- Pachyderm official docs: https://docs.pachyderm.com/
- Evidently AI official docs: https://docs.evidentlyai.com/
- Arize AI official docs: https://docs.arize.com/arize
- WhyLabs official docs: https://docs.whylabs.ai/
- Seldon Core official docs: https://docs.seldon.io/projects/seldon-core/en/latest/
- KServe official docs: https://kserve.github.io/website/
- NVIDIA Triton official docs: https://docs.nvidia.com/deeplearning/triton-inference-server/
- Ray Serve official docs: https://docs.ray.io/en/latest/serve/index.html
- Langfuse official docs: https://langfuse.com/docs
- LangSmith official docs: https://docs.smith.langchain.com/
- CNCF AI & Data Working Group: https://github.com/cncf/toc/blob/main/tags/tag-ai.md