Language Learning Quiz

Based on: DPO (Direct Preference Optimization) Paper Deep Dive: Aligning LLMs Without RLHF

What does "Direct Preference Optimization" mean?