Chaos and Order

💡 왼쪽 원문을 읽으면서 오른쪽에 따라 써보세요. Tab 키로 힌트를 받을 수 있습니다.

원문 렌더가 준비되기 전까지 텍스트 가이드로 표시합니다.

Prologue — Why the Translation Market Is Shaking Again

Neural Machine Translation (NMT) replaced statistical MT in 2014, and the Transformer became the standard in 2017. After that the market settled — Google, DeepL, and Microsoft formed the big three, and enterprises trusted TMS platforms like Lilt, Smartling, Phrase, and Crowdin for their workflows. Then GPT-4 arrived in 2023 and the landscape shifted again. For pure sentence-level translation DeepL is still excellent, but when you need **full-document context, tone control, glossary enforcement, and code-snippet preservation**, LLMs have a decisive edge. Claude 3.5 Sonnet in 2024 and GPT-4o in 2025 matched DeepL on BLEU, and on COMET and MQM (human-aligned metrics) they sometimes win in specific domains.

The real question in 2026 is no longer "DeepL or Google?" but **"which tool fits which slot in my workflow?"** For 100 million characters of product documentation DeepL Pro plus a TMS is still the fastest and cheapest combination. For 100 marketing copies that need to share one brand voice, a one-line LLM prompt is more accurate. For medical or legal work that demands domain adaptation, the human-in-the-loop adaptive MT of Lilt or ModernMT is the answer. For real-time voice interpretation, specialized services like KUDO and Interprefy live in their own corner.

This post lines up nearly 30 translation and localization tools side by side. Seven comparison axes — **translation quality, domain adaptation, context length, language coverage, price, privacy and data residency, TMS integration**. Then we walk through the framework layer (i18next, react-intl, Next.js i18n), the format layer (ICU MessageFormat, TMX, TBX, XLIFF), and the quality-evaluation layer (COMET, BLEURT, MQM). Read once, and you should have the full 2026 localization map.

Price and feature numbers shift every six months. Every figure here is **accurate as of 2026-05-18**, and the post focuses on decision frames rather than exact prices. The frame — "what kind of tool goes where" — should still hold a year from now even if the numbers move.

Chapter 1 · The Evolution of MT — SMT, NMT, and the LLM Era

The history of machine translation splits cleanly into three eras.

**Generation 1 — Statistical MT (SMT, around 2007)**

Google Translate used this approach for about ten years from 2007. It learned word, phrase, and sentence alignments from giant parallel corpora and stitched output together probabilistically. Short sentences and everyday expressions worked, but the weak spots were **broken word order and tangled meaning in long sentences**. Pairs where the word order diverged from English — Korean, Japanese, Arabic — were especially bad.

**Generation 2 — Neural MT (NMT, 2014 onward)**

The starting gun was the 2014 Seq2Seq paper from Sutskever and colleagues. In 2016 Google swapped its entire engine over to NMT (GNMT), and BLEU scores jumped almost ten points overnight. After the 2017 Transformer paper, encoder-decoder models became the default, and DeepL launched in August of the same year — immediately taking the top spot in many pairs. The strength of NMT is **sentence-level fluency**, but weaknesses remain: short context windows hurt paragraph and document-level consistency, and domain adaptation requires separate training.

**Generation 3 — LLM-based Translation (2023 onward)**

One of the first surprise capabilities people discovered after GPT-4 launched in early 2023 was translation. With no specific translation training, GPT-4 matched DeepL and Google NMT shoulder to shoulder. Claude 3.5 Sonnet in 2024 and Claude 4 and GPT-4o in 2025 narrowed the gap further. LLM translation has three strengths: (a) **long context** — they look at whole documents at 100k to 1M tokens; (b) **prompt-driven domain and tone control** — no fine-tuning needed, just "in legal tone" or "as marketing copy"; (c) **natural preservation of code, tables, and structure**.

LLM translation has two weaknesses — cost (5 to 50 times the per-character price of NMT) and latency (NMT under 100ms, LLMs several seconds). So the practical 2026 workflow is often a **hybrid: NMT for volume, LLM for tone, consistency, and domain polish**.

The three generations coexist — SMT still survives in some embedded devices, NMT is the workhorse for volume workloads, and LLMs fill the slots where quality is decisive. None of them is going to replace the others.

Chapter 2 · DeepL — Champion of European, Korean, and Japanese

Founded in Cologne in 2017, DeepL is the decisive winner of the NMT era. At launch it beat Google, Microsoft, and Amazon on BLEU simultaneously, and the gap was largest on European pairs like English-German, English-French, and English-Spanish. Korean and Japanese joined in 2020, and as of May 2026 DeepL supports 32 languages. Four product lines: DeepL Translator (free), DeepL Pro (paid), DeepL API, and the writing assistant DeepL Write.

**Pricing — DeepL Pro 2026**

- Starter: 9 USD per month, 500,000 characters

- Advanced: 33 USD per month, unlimited characters plus CAT integration

- Ultimate: 60 USD per month, unlimited plus priority processing

- API Free: 500,000 characters per month free

- API Pro: usage-based, from 25 USD per million characters

**DeepL strengths**

The consensus is that DeepL's Korean and Japanese fluency sits a notch above Google and Microsoft. Lexical consistency, particle and connective handling, and honorific recognition are particularly strong. Bullet, Markdown, and HTML tag preservation is reliable, which makes DeepL easy to slot into a documentation workflow.

**DeepL Write**

A separate product. Writing and tone assistance for English and German — similar to Grammarly, but with deeper tone-transformation options (friendly, business, academic). Korean entered beta in 2025 and went GA in 2026.

**Limits — an honest read**

Context windows are short. You must chunk to the sentence level for safe results, and paragraph- and document-level consistency are weak. Glossary enforcement requires the **Glossary** feature, which is missing from the free tier and only available in some Pro plans. Unlike LLMs, DeepL does not accept free-form instructions like "use this tone" or "translate this term as X."

Chapter 3 · Google Translate and the Cloud Translation API — The Broadest Coverage

Google Translate launched on SMT in 2006, switched to NMT in 2016, and in 2020 moved to a zero-shot multilingual model that supports **more than 130 languages**. No competitor matches that breadth. On major pairs like English-Spanish it ties with DeepL, but on low-resource languages like Swahili or Uzbek, Google is far ahead.

**Product lineup**

- Google Translate web and app: free, consumer-facing

- Cloud Translation API v3: enterprise NMT API

- Cloud Translation Advanced (v3): glossary, custom models, batch translation, AutoML

- Google Translate camera and voice: mobile app features

**Pricing — Cloud Translation 2026**

- NMT default: 20 USD per million characters

- AutoML Translation: 80 USD per million characters (custom models)

- Free tier: 500,000 characters per month

**Google Translate App — the mobile leader**

Point the camera at text for real-time OCR and translation. Conversation mode handles two speakers talking in their own languages, two-way translation in a single microphone. Lens is especially strong on free-form text — menus, signs, packaging.

**Where Cloud Translation API fits**

Multilingual chat, UGC translation, user-typed search query translation — anywhere you have **high volume, real-time, and a wide variety of domains**. Because DeepL only supports 32 languages, Google is the natural first pick when you need to enter Vietnamese, Indonesian, or Thai markets.

**Limits**

Korean and Japanese fluency is reportedly a step behind DeepL. Glossary integration exists but fine-tuning is harder than in DeepL.

Chapter 4 · Microsoft Translator and Amazon Translate — Cloud-Integrated Players

**Microsoft Translator (Azure AI Translator)**

The translation API of the Azure ecosystem. Supports more than 100 languages, NMT v3 underneath. First-class integration with Office 365, Teams, and SharePoint. Pricing starts at 10 USD per million characters — slightly cheaper than Google. The **Custom Translator** feature lets you upload internal corpora to build a domain-adapted model. Heavy use in medical, legal, and finance. Data residency options are strong, which makes EU, UK, and Japan data-sovereignty requirements easier to meet.

**Amazon Translate**

The translation API on the AWS side. 75 languages. Pricing is 15 USD per million characters, with Active Custom Translation at 60. The strength is **AWS ecosystem integration** — batch translation from S3, Lambda triggers, pipelines with Comprehend for entity extraction. Standalone translation quality is generally a touch behind Google, DeepL, and Microsoft, but if all your infrastructure is already in AWS, the zero data-movement cost is decisive.

**Selection criteria**

- Microsoft 365 / Azure-heavy enterprise — Microsoft Translator

- AWS-heavy infrastructure — Amazon Translate

- Multi-cloud or cloud-agnostic — DeepL, Google, or direct LLM calls

Neither tops the absolute NMT quality charts, but for teams already inside those clouds the integration cost is overwhelmingly low.

Chapter 5 · ModernMT — The Original Adaptive NMT

Made by Italy's Translated.com, ModernMT is the tool that established the **adaptive MT** concept in the market. What is adaptive? A translator's post-edits feed back into the model immediately, so within the same session and project the output gradually shifts toward that translator's style and terminology. A normal NMT freezes after training, but ModernMT **keeps learning during use**.

**The core mechanism**

Translation memory (TM) and glossary terms are injected into the model at inference time. Rather than fine-tuning, think of it as retrieval-augmented inference. The model itself does not change, but the context becomes richer. As a result domain adaptation costs collapse to near zero.

**Product line**

- ModernMT Enterprise: on-prem deployment available

- ModernMT Translate: SaaS API

- Lara: the next-generation model announced in 2024, combining LLM with adaptive

**Lara — the 2026 flagship**

LLM-based like GPT-4, Claude, and Gemini, but automatically pulls from active TMs and glossaries. User evaluations show significantly improved consistency over standard NMT.

**Pricing**

No public price sheet, enterprise quotes only. The general range is 30 to 60 USD per million characters, more for on-prem. A natural fit for medical, legal, and technical documentation where the domain is narrow and consistency is decisive.

Chapter 6 · Lilt — The Apex of Human-in-the-Loop Adaptation

California-based Lilt shares ModernMT's adaptive philosophy but pushes one step further. As a translator types each key, the model **predicts the next word** and offers it; the translator accepts or rejects. Lilt calls this "Contextual AI." The reported result: keystrokes drop nearly in half, and per-hour throughput roughly doubles.

**Business model**

Lilt is not a simple API — it sells a **translation service plus platform** package. They run their own translator pool, customers upload content, and the human-in-the-loop adaptive workflow handles it. Pricing is high — typical enterprise contracts run **1,500 USD per month and up**. Per-word pricing is market-average, but the platform fee is on top.

**Who uses it**

Intel, Canva, Airbnb, Asics show up in the customer list. The common thread is **high content volume, decisive quality requirements, and a desire to outsource internal LSP operations**.

**vs ModernMT**

ModernMT sells the tool; Lilt sells the tool plus the service. If you have in-house translators and just want better tools, ModernMT is the natural pick. If you also want to outsource the translators, Lilt is the natural pick.

**Limits**

The price gate is high — small businesses and individuals cannot use it. Strength is concentrated on English-anchored pairs, so on Korean and Japanese pairs DeepL or LLMs are often more fluent in our experience.

Chapter 7 · Smartling — Enterprise Localization Platform

New York's Smartling launched in 2009 and is a veteran of the space. It positions itself as a **localization platform rather than a translation tool**. Rather than translating a sentence, it automates **the entire journey from content creation to deployment**. CMS integration, automatic content extraction, TM and glossary management, translator workflow, QA, deployment — all in one place.

**Core feature — Global Delivery Network**

This is Smartling's signature. A CDN-style translation proxy: route an English website URL through a multilingual domain, and Smartling automatically extracts, translates, caches, and serves the content. You get a multilingual site without modifying site code.

**Pricing**

No public pricing — enterprise quotes only. Typically **tens of thousands of USD per year and up**. ROI calculations are tricky, so SMB entry is hard.

**Who uses it**

British Airways, Uber, Slack, Affirm. Companies with high global content volume where marketing, product, and legal content all need to flow through one platform.

**Translation engine**

Smartling does not build its own engine. It integrates DeepL, Google, Microsoft, and ModernMT, plus LLMs since 2024. The real value is in workflow, memory, and governance.

Chapter 8 · Phrase (formerly Memsource) — The TMS Champion

Czech-born Memsource acquired Germany's Phrase in 2021 and unified the brand. Top-of-market TMS share as of 2026. Provides **translation memory, glossary, project management, translator collaboration, and machine translation integration** as a SaaS.

**Phrase's slot**

If DeepL and Google are "engines," Phrase is **the layer that bundles those engines into a workflow**. First-pass through DeepL, then translator post-edit in the Phrase editor, then results pile up in the TM. The next time a similar sentence appears, TM pulls it automatically.

**Product line**

- Phrase TMS (formerly Memsource): enterprise translation management

- Phrase Strings: i18n key management for developers

- Phrase Analytics: translation KPI tracking

- Phrase Custom AI: domain-specific LLM translation (launched 2024)

**Pricing**

- Phrase TMS Team: 27 USD per user per month

- Business: 65 USD per user per month

- Enterprise: quote-based

**Phrase Strings**

A separate product specialized for i18n key management. Centralized handling of JSON, YAML, gettext, iOS, and Android files. Competes with Lokalise and POEditor. Priced by key count, with a free tier up to 200 keys.

**Limits**

The UI is complex — newcomers struggle to find where to do what. The learning curve is steep, no way around it. Once internalized though, it offers the richest feature set.

Chapter 9 · Crowdin — Collaboration-First Localization

Ukrainian-born Crowdin is the most loved TMS among open-source projects and startups. **GitHub integration runs deep**, and TM, glossary, MT integration, and translator community management all live in one place.

**Crowdin differentiators**

- **First-class GitHub, GitLab, and Bitbucket integration** — when a PR merges the source strings auto-update, and completed translations come back as auto-generated PRs

- **Crowdsourced translation** — easy to recruit and manage volunteer translators for OSS

- **In-Context Editor** — translators preview translations on the actual site or app while working

**Pricing**

- Free: 60,000 characters, 1 project (with a separate free plan for open source)

- Pro: from 50 USD per month, 30 projects

- Team: from 200 USD per month

- Enterprise: quote-based

**Free for open source**

Crowdin offers a free plan to verified OSS projects. Discord, Minecraft, Telegram, Privacy Badger, Hugging Face, and many more use Crowdin.

**Phrase vs Crowdin**

Phrase is enterprise governance and compliance first; Crowdin is **developer-friendly and community-friendly**. For a startup or an OSS project, Crowdin is the more natural starting point.

Chapter 10 · Lokalise, Transifex, POEditor, Weblate — The TMS Pack

**Lokalise**

Israel-born, strong on mobile and web i18n. **The iOS, Android, and Flutter SDKs are excellent**, and there is a Figma plug-in for designers. Pricing starts at 120 USD per month (Start), 230 (Essential), Pro and Business by quote. Direct competition with Phrase Strings and Crowdin.

**Transifex**

California-born, once the OSS default. Current pricing is 70 USD per month (Starter), 200 (Growth), Premium by quote. Drupal, Mozilla, and Coursera are among the customers. It has lost some ground to Phrase, Crowdin, and Lokalise in recent years but is still in the game.

**POEditor**

Romanian-born, **the cheapest** option. Free up to 1,000 strings, then 14.99 USD per month (Start), 49.99 (Pro), 119.99 (Business). Popular with small startups and indie developers.

**Weblate**

Czech-born, **open source (GPLv3)**. Self-hostable. Deep Phabricator and GitLab integration. Used by Fedora, openSUSE, MariaDB, LibreOffice. The hosted version starts at 19 USD per month (Basic).

**Selection guide — one line each**

- Enterprise governance plus mixed content — Phrase TMS

- OSS, startup, GitHub workflow — Crowdin

- Mobile app i18n plus Figma — Lokalise

- Low cost plus simple — POEditor

- Self-host plus free plus OSS — Weblate

Chapter 11 · CAT Tools — OmegaT, memoQ, SDL Trados, Wordfast

**CAT (Computer-Assisted Translation)** looks similar to TMS but is different. TMS is a cloud collaboration platform; CAT is a **desktop tool** the translator runs on their own machine. The professional translator's working environment.

**SDL Trados Studio (now RWS Trados Studio)**

The absolute leader, with roughly 70 percent share among professional translators. Windows desktop. Pricing starts at 695 EUR (Freelance, perpetual license). **Translation memory plus glossary plus MT integration**, all deeply integrated. The learning curve is steep, but once internalized, productivity rises decisively.

**memoQ**

Hungarian-born. Trados's biggest competitor. Both cloud (memoQ cloud) and desktop. Pricing starts at 770 USD (translator pro, perpetual license). Generally considered simpler than Trados UI-wise, with better collaboration features.

**Wordfast**

French-born. The budget alternative to Trados. Wordfast Pro (540 USD perpetual as of 2024), Wordfast Anywhere (cloud, free). Popular with small LSPs and freelancers.

**OmegaT**

**Open source (GPL)** desktop CAT. Java-based, runs on Windows, macOS, and Linux. Free, with TM, glossary, and MT integration all supported. The UI is rough but practical. A natural pick for startups without an in-house LSP, students, and activists.

**CAT vs LLM**

The real 2026 question — will CAT tools disappear? The short answer is no. For precise professional work, TM utilization, and secure (offline) environments, CAT is still absolute. But the first pass on volume workloads is moving to NMT and LLMs, and CAT's slot has narrowed to **a precision tool for human-in-the-loop post-editing**.

Chapter 12 · LLM-Based Translation — Claude, GPT, Gemini, DeepSeek

After GPT-4 emerged in 2023, LLM translation matured quickly. As of May 2026 the main models are these.

**Claude 3.7 and Claude 4 (Anthropic)**

Strong on long context (200k tokens, 1M in beta) and tone and nuance handling. Very high marks on Korean and Japanese fluency. Often picked for legal and medical domains where accuracy is decisive.

**GPT-4o, GPT-4.5, GPT-5 (OpenAI)**

Strong on response speed and multilingual breadth. Handles more than 130 languages reliably, and combined with function calling makes it easy to build translation as part of a tool pipeline.

**Gemini 1.5 Pro and 2.0 (Google)**

The differentiator is the **1 to 2 million token context window**. Drop a whole book in and translate it consistently. PDF, image, and video inputs are well handled.

**DeepSeek V3 and R1 (DeepSeek)**

Chinese-born, very cheap. Around 0.14 USD per million tokens — one-thirtieth of GPT-4. Especially strong on English-Chinese.

**LLM translation prompt patterns — in practice**

- Put domain, tone, banned terms, and glossary into the system prompt

- User message carries the source plus "output only the translation"

- For long documents, chunk and include the last paragraph of the previous chunk as context

- Run a separate QA pass with another LLM on the output for extra reliability

**LLM translation limits — honest read**

- **Cost** — 5 to 50 times higher per character than DeepL or Google NMT. Heavy at volume.

- **Latency** — several seconds to tens of seconds compared to under 100ms for NMT.

- **Reproducibility** — hard to guarantee identical output for identical input even with `temperature=0`.

- **Data residency** — routed through US or Chinese clouds; EU GDPR and Korea's PIPA require careful review.

Most production 2026 workflows are **NMT for volume, LLM for quality** hybrids.

Chapter 13 · Open Source NMT — NLLB, M2M-100, MADLAD, OPUS-MT, Argos

Where commercial products fall short — offline, data residency, embedded — open source NMT shines.

**NLLB-200 (Meta, 2022)**

**No Language Left Behind.** 200 languages, including low-resource ones like Kinyarwanda, Quechua, and Sindhi. Model sizes from 600M up to 54B parameters. All open on Hugging Face. Often the only choice when low-resource language support is non-negotiable.

**M2M-100 (Meta, 2020)**

**Many-to-Many** — 100 languages translated directly between each pair without pivoting through English. Quality on pairs like Korean to Swahili is decisively better without the English detour.

**MADLAD-400 (Google, 2023)**

**400 languages** — the most in the market. Up to 10.7B parameters. Apache 2.0 licensed, commercial use is fine.

**Helsinki-NLP / OPUS-MT**

Small models trained on the OPUS corpus by Helsinki University. More than 1,000 language pair models open on Hugging Face. Each model is 100 to 300 MB, light enough for **edge devices, embedded systems, and offline** workloads.

**Argos Translate**

An open source tool that wraps OPUS-MT into a **desktop GUI, CLI, and Python library**. Fully offline, free. Attractive where personal data absolutely cannot leave the device — legal, medical, government.

**Selection guide**

- Low-resource languages — NLLB-200, MADLAD-400

- Direct (no English pivot) — M2M-100

- Edge or embedded — OPUS-MT

- Desktop offline — Argos Translate

The open source NMT limit is clear — quality is a notch below DeepL and Google. But **free plus offline plus zero data residency** is decisive in some slots.

Chapter 14 · Voice Translation — Google Live, Apple Live Translation, Galaxy AI

Voice translation is a combination of two modules — (a) speech to text (ASR, Automatic Speech Recognition), (b) text to text translation. End-to-end models that fuse both have started appearing since 2024.

**Google Translate Live and Conversation**

The voice conversation mode in the Google Translate app. Two people speak in their own languages, and translations flow both ways from a single microphone. Two-speaker auto-detection. Free. The most mature option and the broadest language support.

**Apple Live Translation (iOS 18.4, 2025)**

Officially shipped with iOS 18.4 in April 2025. Combined with AirPods Pro 2 and 3, you hear **real-time interpretation in your ears**. Started with eight languages including English, Spanish, French, German, and Chinese. As of May 2026 the count is 19. Korean and Japanese are supported.

**Samsung Galaxy AI Live Translate**

Launched with the Galaxy S24 in 2024. Real-time interpretation during phone calls. Works even when the other side is not a Galaxy device. Started with 13 languages, expanded to 17 by 2026.

**Pixel Recorder Transcribe and Translate**

The recorder app on Google Pixel phones. Transcribe a meeting and translate in the same place. The strength is **on-device** processing — data never leaves the phone.

**OpenAI Whisper translate**

Whisper is an ASR model, but its `translate` mode also translates the input language to English. English-only output is a constraint. Open source and self-hostable. Often used for workflows like Korean speech to English text.

**Limits — honest read**

Voice translation is intrinsically harder than text translation. Speaker intent, tone, casual expressions, noise, and pronunciation variants all come in as ASR errors. Works well in formal presentation environments but limits remain in casual conversation, dialects, and crowded settings.

Chapter 15 · Real-Time Interpretation — KUDO, Interprefy, AppTek

**KUDO**

A virtual interpretation platform launched in 2017. Plugs into Zoom, Teams, and Webex. Both human interpreter pools and AI interpretation. In 2024 they shipped AI Speaker — one speaker's voice translated and delivered to listeners in another language. UN agencies, international organizations, and global conferences are core customers.

**Interprefy**

Swiss-born. Direct KUDO competitor. Strong on virtual plus on-site hybrid interpretation. Davos Forum and FIFA among the events that use it.

**AppTek**

US-born, the deeper-tech player. Connects ASR, translation, and TTS into an end-to-end speech interpretation stack. US government, military, and global media are customers. Deployable in your own data center.

**Translate.com / Mirror Pro**

General consumer interpretation apps. Good for hotels, tourism, and daily conversation. Pricing starts at 14.99 USD per month.

**Where real-time interpretation fits**

This market has totally different use cases from text translation. **Conferences, video meetings, multinational sessions** where immediacy is decisive. Accuracy and latency trade off — as of 2026 human interpreters still win on accuracy, AI interpretation wins on cost and scalability. They coexist.

Chapter 16 · Korean Translation — Papago, Kakao i, Genie Talk

Korean is hard for general NMT because of word order, particles, and honorifics. That is why Korean companies build their own engines.

**Naver Papago**

Naver's NMT. Strong on Korean-English, Korean-Japanese, Korean-Chinese. Free web, app, and API. Camera, voice, and conversation modes all supported. Dominant share in schools, daily use, and tourism. The API is integrated into Naver Cloud Platform.

**Kakao i Translation**

Kakao's translation engine. Integrated with KakaoTalk bots and Kakao i speakers. Competes with Papago on Korean-English. Less prominent as a standalone product, more as a feature inside the Kakao ecosystem.

**Genie Talk (Hancom, based on SYSTRAN)**

Based on SYSTRAN's NMT. Often selected by government, military, and public institutions. Strong on Korean to English, Japanese, and Chinese.

**Korean vs DeepL and LLMs**

- Daily conversation — Papago, Kakao

- Business documents — DeepL Pro

- Domain, tone, consistency — Claude, GPT (prompt adaptation)

- Government, security — Genie Talk (self-hosted) or internal LLM

Korean has strong market specificity, so global tools alone leave clear gaps.

Chapter 17 · Japanese Translation — Mirai Translator, DeepL JP, T-4OO

Japanese, with Korean-like word order plus kanji-hiragana-katakana mixing, is another NMT puzzle. Japanese vendors also build their own engines.

**Mirai Translator (NTT-AT)**

The number-one Japanese enterprise NMT, from NTT Advanced Technology. Strong on-prem deployment options. Japanese conglomerates, government, and finance are core customers.

**DeepL Japan**

DeepL has stellar reviews for Japanese fluency. They run a Tokyo office and have shipped Japan-specific features like **DeepL Voice**.

**T-4OO (Rozetta)**

Rozetta's Japanese NMT. Strong on medical, legal, and financial domains.

**Google, Microsoft, Amazon vs Japanese engines**

On public BLEU evaluations the Japanese-to-English pair usually goes to DeepL, but on East Asian pairs like Japanese-to-Chinese the Japanese engines like Mirai and T-4OO often win on domain accuracy.

**Selection guide**

- Daily and school — Google, DeepL free

- General business — DeepL Pro

- Japanese in-house integration — Mirai Translator (on-prem)

- Medical, legal, finance — T-4OO

A hallmark of the Japanese market is **strong demand for on-prem and data residency**. There are more slots than in Korea where SaaS in the cloud is not allowed, which makes on-prem options a deciding factor.

Chapter 18 · i18n Frameworks — i18next, react-intl, Next.js i18n

After the tools — **the framework layer**. How do you plug the translated strings into your code?

**i18next**

The de facto JavaScript standard. Alive since 2011. Two-million-plus downloads per week. Adapters for React, Vue, Angular, and Svelte. JSON or YAML key files plus a `t('key')` call is the basic pattern. ICU MessageFormat, plurals, gender, and context all supported.

**react-intl (FormatJS)**

A React-only library originally from Yahoo, now under FormatJS. First-class ICU MessageFormat support. Lighter than i18next but bound to React.

**Next.js i18n Routing**

Next.js built-in routing-level i18n. With App Router, middleware plus `[locale]` segments do the routing. The content itself goes through i18next, react-intl, or next-intl.

**next-intl**

An i18n library specialized for Next.js App Router. Pairs naturally with server components. Since 2024 next-intl has been trending toward becoming the default in Next.js projects.

**Vue: vue-i18n and nuxt-i18n**

The i18n libraries of the Vue and Nuxt ecosystem. Similar API to i18next.

**Mobile: iOS Localizable.strings, Android strings.xml**

The platform standards. iOS exports `.xliff`, Android uses `strings.xml`. Crowdin, Lokalise, and Phrase Strings all support them directly.

**Selection guide — simplified**

- General React, Vue, Angular — i18next

- React-only and ICU-heavy — react-intl

- Next.js App Router — next-intl

- Vue or Nuxt — vue-i18n or nuxt-i18n

- Mobile — platform standard plus Lokalise

Chapter 19 · ICU MessageFormat — Plurals, Gender, and Context

ICU (International Components for Unicode) MessageFormat is the standard for expressing **plurals, gender, and selects** inside a message. Almost every serious i18n library supports it.

**Basic shape**

Plural examples — "1 message" vs "5 messages" — have only singular and plural in English, but languages like Polish, Russian, and Arabic have four to six forms. ICU abstracts these as categories — `one`, `few`, `many`, `other`.

**Example in an i18next key file**

You list a single message in two English forms (singular and plural) and four Polish forms (`one`, `few`, `many`, `other`). The library takes the count value at runtime and picks the right form automatically.

**Gender handling**

German, French, and Spanish require gender agreement on nouns and adjectives. The ICU `select` syntax handles it — "he went" vs "she went" inside one key, chosen by context at runtime.

**Date, number, and currency formats**

Syntax like a long-form date placeholder or a currency-typed number placeholder lets the library format per locale automatically. Korean renders "2026년 5월 18일" and English renders "May 18, 2026" off the same key.

**ICU limits**

The syntax is complex, and translators editing it directly tend to make mistakes. So Phrase Strings, Lokalise, and Crowdin display ICU messages through **visual editors** — the translator fills in variants in a card-style UI, and the library generates ICU syntax automatically.

**Why it matters**

Treating "5 messages" as a simple string makes "1 messages" feel awkward. Without ICU, serious i18n accumulates grammatical awkwardness over time and erodes UX.

Chapter 20 · TMX, TBX, XLIFF — The Industry Standard Formats

Three standard formats move data between tools in translation workflows.

**TMX (Translation Memory eXchange)**

The translation memory exchange format. XML-based. Whether moving Trados memory to memoQ or Phrase memory to Lilt, TMX is the bridge. Standardized by LISA (Localization Industry Standards Association) in 1998. As of 2026, version 1.4 is most widely used.

**TBX (TermBase eXchange)**

The glossary exchange format. ISO 30042. Move an internal glossary from one tool to another. Where TMX is sentence-level, TBX is term-level.

**XLIFF (XML Localization Interchange File Format)**

**The format for moving the translation work itself.** OASIS standard. Source plus target plus metadata plus workflow state, all in one file. Trados, memoQ, Phrase, and Crowdin all import and export. iOS's `.xliff` export uses the same standard.

**Why these formats matter**

They prevent vendor lock-in. The reason a company can move from Trados to memoQ five years later. And internal data — translation memories — is a company asset. The asset must survive a tool change.

**Reality — honest read**

Even with standards, each tool's implementation differs. Pure-clean TMX export and import across tools is rare, and partial metadata loss is the norm. Large migrations require validation up front.

Chapter 21 · Quality Estimation — BLEU, chrF, COMET, BLEURT, MQM

How do you measure translation quality? Both automatic and human metrics exist.

**BLEU (BiLingual Evaluation Understudy)**

The first standard, from IBM in 2002. Measures n-gram overlap. Fast and reproducible, but **penalizes translations that are semantically equivalent but lexically different**. Still the most-cited metric in 2026, but its limits are well known.

**chrF**

Character-level F-score. Eases BLEU's word dependency. Better suited to languages where word boundaries are blurry, such as Korean and Japanese.

**TER (Translation Edit Rate)**

The number of edits a translator needs to apply to the output. Lower is better. Directly estimates post-editing cost, so the industry uses it frequently.

**COMET**

A neural-network-based evaluation metric Unbabel published in 2020. Takes source, reference, and candidate and correlates strongly with human ratings. Closer to humans than BLEU.

**BLEURT**

Google's BERT-based evaluator. Same philosophy as COMET.

**MQM (Multidimensional Quality Metrics)**

The industry standard for human evaluation. Classifies errors by type (accuracy, fluency, style, technical) with weighted severity. ASTM F2575 standard. Large LSPs treat it as table stakes.

**In practice — how do they combine?**

- Model development — BLEU and chrF (fast and reproducible)

- Model selection — COMET (close to human)

- Production QA — MQM (human evaluation, expensive)

- Post-editing efficiency — TER

After LLMs arrived, **using an LLM as the evaluator** has become common too. Show two translations to Claude or GPT and have it pick the better one. Fast and close to human, but evaluator-LLM bias creeps in.

Chapter 22 · Domain Adaptation — Legal, Medical, Technical, Marketing

Even with the same NMT model, quality varies a lot by domain. Four patterns for domain adaptation.

**1) Glossary / terminology (lightest)**

Every major NMT supports it. Pass "this term must translate to this" mappings to the NMT and they apply in the output. DeepL Pro, Google Cloud Translation, and Microsoft Translator all support glossaries first class.

**2) Custom Translator models (medium)**

Microsoft Custom Translator, Google AutoML Translation, Amazon Active Custom Translation. Upload an in-house corpus (10,000 parallel sentences and up) and build a private model. Training cost and time, but clearly higher domain accuracy than the default NMT.

**3) Adaptive NMT (Lilt, ModernMT)**

The translator's post-edits feed back into the model immediately, adapting within the session. Domain adaptation cost is close to zero — the strength. But it needs human-in-the-loop, so it does not fit pure volume workloads.

**4) LLM prompt adaptation (most flexible)**

Put domain, tone, banned terms, and glossary all into the LLM system prompt. Instant adaptation with no separate training. The downsides are cost and latency.

**Domain-specific recipes**

- **Legal** — Custom Translator or LLM (accuracy first)

- **Medical** — Custom Translator plus human post-editing (life-critical, fail-safe required)

- **Technical manuals** — DeepL Pro plus Phrase TMS plus glossary (lots of repetition, TM is a big asset)

- **Marketing** — LLM (tone and creativity decisive)

- **UI / microcopy** — LLM plus human post-edit (short but precise)

- **Government and law** — on-prem NMT plus human (data residency plus accuracy)

Chapter 23 · Pricing and ROI — Cost Comparison for One Million Words

A hypothetical: translate one million English words (about five million characters) into Korean. Cost estimates by option.

**Option A — DeepL Pro API**

5M characters times 25 USD per million = **125 USD**. The cheapest. But tone consistency and domain adaptation are weak.

**Option B — Google Cloud Translation**

5M characters times 20 USD per million = **100 USD**. Slightly cheaper than DeepL, but Korean quality is reportedly a notch lower.

**Option C — Claude (LLM)**

Roughly 1.3M tokens in and 1.3M tokens out = 3.25 USD plus 19.5 USD = around **23 USD**. Very cheap as of May 2026. Tone and domain adaptation possible, but latency is high and consistency QA is needed separately.

**Option D — GPT-4o (LLM)**

1.3M in at 2.5 USD per M plus 1.3M out at 10 USD per M = around **16 USD**. Cheaper and faster.

**Option E — Phrase TMS plus DeepL Pro plus human post-edit**

DeepL 125 USD plus Phrase fees plus human post-edit 5,000 to 15,000 USD = **5,000 to 15,000 USD**. Two orders of magnitude more expensive, but quality is decisively higher.

**Option F — Lilt (human-in-the-loop adaptive)**

Platform plus translators = **0.15 to 0.25 USD per word** times one million = **150,000 to 250,000 USD**. The most expensive, but the most precise.

**Conclusion — the cost vs quality curve**

- One-off internal use — LLM direct (20 USD)

- Volume and structured — DeepL Pro, Google (100 to 125 USD)

- Precise external publishing — DeepL plus TMS plus human (5,000 to 15,000 USD)

- Mission-critical (legal, medical, brand) — Lilt or specialist LSP (150,000+ USD)

**Hidden costs**

Tool selection cost matters less than **TM migration, workflow redesign, and translator training**. Looking at five-year cumulative cost instead of one-off, an automation-heavy tool often ends up cheaper despite a higher initial sticker.

Chapter 24 · Privacy and Data Residency — GDPR, K-PIPA, On-Prem

Data residency is one of the axes getting heavier in enterprise translation.

**GDPR (EU)**

EU citizens' data must stay in the EU. NMT routed through US clouds can be handled with SCCs (Standard Contractual Clauses) or the Data Privacy Framework, but **text containing personally identifiable information** needs additional review.

**Korea PIPA**

The 2024 revision tightened cross-border transfer regulation. A standard similar to the EU's. Medical and finance are especially strict.

**Japan APPI**

Similar to the EU. One reason on-prem options are strongly preferred in the Japanese market.

**Data residency by option**

- **DeepL** — EU hosting, with US data-center options. On-prem (DeepL Enterprise) available

- **Microsoft Azure Translator** — 60+ regions, strong residency options

- **Google Cloud Translation** — region selection available

- **Amazon Translate** — region selection available

- **OpenAI, Anthropic, Google AI** — limited residency options, enterprise agreements required

- **In-house NMT (NLLB, OPUS-MT)** — fully in-house, zero data exfiltration

- **Argos Translate** — desktop offline, zero exfiltration

**Where Enterprise LLMs fit**

Since 2025 Anthropic, OpenAI, and Google have formalized enterprise agreements that guarantee **no training on customer data plus zero retention**. That opens the door for Korean, Japanese, and EU enterprises to adopt LLMs.

**Reality — how to decide**

- General content (marketing, blogs) — cloud NMT/LLM freely

- Contains personal data — on-prem or zero-retention enterprise LLM

- Medical, finance, legal, government — on-prem NMT or in-house LLM first

- Military, intelligence — Argos and NLLB deployed in-house

Chapter 25 · Decision Tree — What to Use Where

Closing with a chapter that condenses the decision guide.

**Scenario 1 — Individual and student**

Free tiers are enough. Papago, Google Translate, DeepL Free plus ChatGPT/Claude free. Cost zero.

**Scenario 2 — Indie developer or small SaaS i18n**

- Tool — POEditor or Crowdin free plan

- Engine — DeepL Pro Starter (9 USD) or LLM direct

- Framework — next-intl / i18next

- Cost — 10 to 50 USD per month

**Scenario 3 — Startup or mid-size SaaS**

- Tool — Crowdin Pro or Phrase Strings

- Engine — DeepL Pro plus LLM post-edit

- Framework — next-intl plus Phrase Strings integration

- Cost — 100 to 500 USD per month

**Scenario 4 — Enterprise, content-heavy global**

- Tool — Phrase TMS or Smartling

- Engine — DeepL Pro, Google, Microsoft plus human post-edit

- Adaptation — glossary plus Custom Translator

- Cost — 100,000+ USD per year

**Scenario 5 — Domain precision (medical, legal, technical)**

- Tool — Lilt or in-house TMS

- Engine — ModernMT/Lara or Custom Translator

- People — LSP or in-house expert translators

- Cost — tens to hundreds of thousands USD per year

**Scenario 6 — Government, military, security (on-prem)**

- Tool — in-house workflow

- Engine — NLLB / OPUS-MT / Mirai on-prem / DeepL Enterprise on-prem

- LLM — in-house hosted Llama or DeepSeek

- Cost — own infrastructure

**Scenario 7 — Real-time voice and interpretation**

- Daily — Google Translate Conversation, Apple Live Translation

- Business — Galaxy Live Translate, Whisper

- Conferences and meetings — KUDO, Interprefy

- Government — AppTek in-house

**Principle**

No single tool does everything well. **Hybrids win.** NMT for volume, LLM for quality, TMS for workflow, humans for the final mile.

Chapter 26 · Epilogue — The 2026 Translator, the 2030 Translator

Closing with a note on people. Right after NMT arrived in 2014, there was a panic that translators would disappear. They did not. Right after LLMs arrived in 2023, the same panic returned. They will not disappear. But the shape of the job is clearly changing.

**The 2016 translator**

Wrote every sentence by hand. Used CAT tools to leverage memory, but NMT was secondary. 250 to 400 words per hour was standard.

**The 2026 translator**

MT does the first pass. The translator does **MTPE (Machine Translation Post-Editing)**. Throughput is two to three times higher at 500 to 1,000 words per hour. Per-word rates dropped 30 to 50 percent. Net effect: hourly earnings are similar or slightly higher.

**The 2030 translator — projection**

Time spent on direct translation approaches zero. **Language consulting, tone QA, domain expertise, and cultural adaptation** become the main work. The volume work shrinks, but the absolute jobs are expected to hold because the volume of content itself is exploding — per the LSP industry associations' projections.

**Creative translation (poetry, literature, marketing)** stays human. AI moves meaning but not culture, rhythm, or implication. When Han Kang won the Nobel Prize in Literature, Deborah Smith's translation was central — AI cannot do that.

**Developer-side conclusion**

- General content — automate to drive cost down

- Domain precision — tool plus human hybrid

- Culture and creative — delegate to humans

- Compliance — on-prem

- Do not treat any one tool as gospel. The landscape changes every six months.

Translation is no longer a battle between single models. It is a **battle of workflows**. Which tool you place where decides both quality and cost. May this post be the starting point of that decision.

References — 2026 Translation and Localization

**Machine translation foundations**

- [Google Neural Machine Translation paper (2016)](https://arxiv.org/abs/1609.08144)

- [Attention Is All You Need — Transformer paper (2017)](https://arxiv.org/abs/1706.03762)

- [No Language Left Behind — NLLB paper (Meta, 2022)](https://arxiv.org/abs/2207.04672)

- [MADLAD-400 paper (Google, 2023)](https://arxiv.org/abs/2309.04662)

**Commercial tool homepages**

- [DeepL](https://www.deepl.com/)

- [Google Cloud Translation](https://cloud.google.com/translate)

- [Microsoft Azure AI Translator](https://azure.microsoft.com/en-us/products/ai-services/ai-translator)

- [Amazon Translate](https://aws.amazon.com/translate/)

- [ModernMT](https://www.modernmt.com/)

- [Lilt](https://lilt.com/)

- [Smartling](https://www.smartling.com/)

- [Phrase](https://phrase.com/)

- [Crowdin](https://crowdin.com/)

- [Lokalise](https://lokalise.com/)

- [Transifex](https://www.transifex.com/)

**Open source NMT**

- [Hugging Face NLLB-200](https://huggingface.co/facebook/nllb-200-distilled-600M)

- [Helsinki-NLP OPUS-MT](https://huggingface.co/Helsinki-NLP)

- [Argos Translate](https://www.argosopentech.com/)

**LLM translation guides**

- [OpenAI Translation Best Practices](https://platform.openai.com/docs/guides/prompt-engineering)

- [Anthropic Claude Translation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering)

- [Google Gemini Multilingual](https://ai.google.dev/gemini-api/docs/models)

**i18n frameworks**

- [i18next documentation](https://www.i18next.com/)

- [FormatJS / react-intl](https://formatjs.io/)

- [next-intl](https://next-intl-docs.vercel.app/)

- [ICU MessageFormat](https://unicode-org.github.io/icu/userguide/format_parse/messages/)

**Standard formats**

- [TMX standard (LISA)](https://www.gala-global.org/lisa-oscar-standards)

- [XLIFF OASIS standard](https://www.oasis-open.org/committees/xliff/)

- [TBX ISO 30042](https://www.iso.org/standard/62510.html)

**Quality estimation metrics**

- [COMET](https://github.com/Unbabel/COMET)

- [BLEU original paper (Papineni et al., 2002)](https://aclanthology.org/P02-1040/)

- [MQM guidelines](https://themqm.org/)

**Industry trends**

- [GALA — Globalization and Localization Association](https://www.gala-global.org/)

- [TAUS Industry Reports](https://www.taus.net/)

- [Slator Language Industry News](https://slator.com/)