Split View: AI 시대 개발자 영어: AGENTS.md, 시스템 프롬프트, RFC — 에이전트가 읽는 영어 쓰기의 기술

AI 시대 개발자 영어: AGENTS.md, 시스템 프롬프트, RFC — 에이전트가 읽는 영어 쓰기의 기술

"당신이 2026년에 쓰는 가장 중요한 영어는, 사람이 아니라 에이전트가 읽는 영어다."

프롤로그 — 영어의 청중이 바뀌었다

5년 전, 개발자가 영어를 잘 써야 하는 이유는 분명했습니다. 코드 리뷰에서 동료에게 설명하기 위해. 글로벌 이슈 트래커에서 외국 동료와 일하기 위해. 컨퍼런스 발표를 영어로 하기 위해. 면접에서 자기 경험을 영어로 설명하기 위해. 그 영어의 청중은 모두 사람이었습니다.

2026년 현재, 영어의 가장 중요한 청중이 바뀌었습니다. 당신이 한 주에 가장 많이 쓰는 영어는 동료에게 보내는 슬랙 메시지가 아니라 — 십중팔구 — 에이전트가 읽는 영어입니다. AGENTS.md에 적은 빌드 명령어. 클로드 코드 세션의 시스템 프롬프트. 디빈에게 던지는 이슈 설명. 코덱스에게 주는 PR 컨텍스트. RFC에 적은 비기능 요구사항 — RFC는 이제 사람이 먼저 읽고 에이전트가 그것을 실행 단위로 쪼개는 문서가 되었습니다.

이 글은 이전의 영어 글들과 명백히 다른 각도를 가집니다. 코드 리뷰 영어, 스탠드업 영어, 연봉 협상 영어, FDE 면접 영어는 모두 사람에게 말하는 법이었습니다. 이 글은 에이전트에게 쓰는 영어 — 그리고 사람과 에이전트가 동시에 읽는 영어를 다룹니다. 그 둘은 미묘하지만 결정적으로 다릅니다.

이 글의 명제

2026년 개발자가 쓰는 영어 중 단위 시간당 가장 큰 레버리지를 가진 영어는, 에이전트가 읽는 영어다.

세 가지 이유 때문입니다.

첫째, 재사용성. 동료에게 보내는 슬랙은 한 사람이 한 번 읽습니다. AGENTS.md는 모든 팀원의 모든 에이전트 세션이 매번 읽습니다. 한 줄의 품질 차이가 수천 번 반복됩니다.

둘째, 무자비함. 사람은 모호한 문장을 맥락으로 보정합니다. 에이전트는 보정하지 않고 — 또는 잘못 보정합니다. 약한 영어가 약한 행동으로 직결됩니다.

셋째, 레버리지. 잘 쓴 시스템 프롬프트 한 줄이 100번의 코드 리뷰 코멘트를 대체합니다. 잘 쓴 RFC 하나가 30개의 잘 쪼개진 작업으로 바뀝니다.

이 글에서 다룰 내용:

장	주제
1장	청중이 바뀌었다 — 사람 vs 에이전트의 읽기 차이
2장	에이전트는 문자 그대로 읽는다(literal) — 헷지의 함정
3장	AGENTS.md / CLAUDE.md — 컨텍스트 파일 쓰기
4장	시스템 프롬프트 쓰기 — 프로덕션 에이전트를 위한 영어
5장	RFC와 디자인 독 — 사람과 에이전트가 동시에 읽는 문서
6장	이슈 / 티켓 — 에이전트가 행동할 수 있는 설명
7장	커밋 메시지와 PR 설명 — AI 우선 워크플로우
8장	프롬프트 라이팅을 craft로 — 좋은 영어의 일반 원칙
9장	정중함과 직설성의 트레이드오프
에필로그	체크리스트 + 안티패턴 + 다음 글 예고

1장 · 청중이 바뀌었다 — 사람과 에이전트의 읽기 차이

영어를 잘 쓴다는 것은 청중을 이해한다는 뜻입니다. 그리고 사람 독자와 에이전트 독자는 명백히 다른 청중입니다.

1.1 사람은 문맥으로 보정한다

당신이 동료에게 이렇게 씁니다.

We probably want to use the new auth flow for this, unless there's a reason
to stay on the old one. Let me know what you think.

동료는 이 문장에서 다섯 가지를 동시에 읽습니다. (1) 새 auth flow를 쓰자는 제안. (2) 약한 확신("probably"). (3) 반대 의견에 열려 있음("unless"). (4) 결정 권한을 일부 위임("let me know"). (5) 톤이 친근함("we"). 이 모든 것은 동료가 회사의 정치, 당신의 직급, 이전 대화, 그리고 영어 화자의 헷지 관습을 알고 있기 때문에 가능합니다.

같은 문장을 에이전트에게 주면, 에이전트는 종종 이렇게 해석합니다 — "새 auth flow를 사용하라는 지시. 다만 'unless'와 'probably'가 있으니 결정을 보류하고 사용자에게 질문하자." 또는 정반대로 — "새 auth flow를 그냥 적용." 어느 쪽이든 사람 독자가 자연스럽게 도달하는 "함께 결정하자"라는 의도와는 거리가 멉니다.

1.2 에이전트는 토큰을 가중치로 본다

에이전트는 문장을 의미 단위로 읽지 않습니다. 토큰의 가중치 합으로 읽습니다. 그래서 사람에게는 사소한 단어 — "should", "must", "may", "prefer", "always", "never" — 가 에이전트의 행동에 큰 차이를 만듭니다.

영어 단어	사람의 해석	에이전트의 해석
"should"	"그렇게 해주면 좋겠다" (약한 권장)	"거의 항상 그렇게 한다" (강한 가중치)
"must"	"반드시 그렇게 한다"	"반드시 그렇게 한다" (해석 일치)
"prefer"	"선호하지만 예외 있음"	"기본값으로 한다"
"avoid"	"되도록 피한다"	"강한 제약, 거의 안 한다"
"never"	"절대 안 한다"	"절대 안 한다" (해석 일치)
"consider"	"고려해봐라, 결정은 너에게"	모호함 — 종종 무시되거나 과도하게 적용됨

핵심은 "consider"와 "perhaps"가 사람에게는 따뜻한 권장이지만 에이전트에게는 노이즈라는 점입니다. 에이전트에게는 "Use X" 또는 "Do not use X" 둘 중 하나가 거의 항상 더 낫습니다.

1.3 사람은 한 번 읽고 에이전트는 매번 읽는다

AGENTS.md는 매 세션, 매 작업, 매 호출마다 다시 읽힙니다. 동료가 README를 한 번 읽고 머리에 보관하는 것과 다릅니다. 그래서 AGENTS.md의 줄 하나는 일회성 비용이 아니라 반복 비용입니다. 노이즈 한 줄이 1000번 반복되면 1000번의 가중치를 만듭니다.

이것이 Anthropic의 공식 권장사항 중 하나가 "CLAUDE.md의 모든 줄에 대해 — '이게 없으면 클로드가 실수했을까?' 를 물어라"인 이유입니다. 만약 답이 "아니오"라면, 그 줄은 노이즈입니다.

2장 · 에이전트는 문자 그대로 읽는다 — 헷지의 함정

영어 화자가 자연스럽게 쓰는 표현 중 가장 많이 에이전트를 망가뜨리는 것은 헷지(hedging) 입니다. "kind of", "sort of", "I think", "perhaps", "it might be that", "would probably want to". 사람에게는 정중함이지만 에이전트에게는 신호 잡음비를 떨어뜨립니다.

2.1 약한 영어 vs 강한 영어 — before / after

before — 헷지가 가득한 시스템 프롬프트의 한 부분:

You should probably try to use TypeScript for new files, although JavaScript
might be acceptable in some cases. It would be nice if you could add tests,
but if you don't have time, that's okay too. Please try to follow the
existing code style as much as possible.

에이전트의 행동:

TypeScript를 쓸지 JavaScript를 쓸지 매번 사용자에게 묻는다
테스트를 안 쓴다 ("if you don't have time" 이 사실상 면죄부)
코드 스타일은 "as much as possible" — 어떤 의미인지 모름

after — 같은 의도, 강한 영어:

Write all new files in TypeScript. JavaScript is only allowed when modifying
an existing .js file in place — do not convert .js to .ts as part of an
unrelated change.

Write tests for any new function exported from a module. Tests live next to
the source as <name>.test.ts.

Follow the rules in eslint.config.js. If a rule conflicts with your output,
fix your output — never disable the rule.

에이전트의 행동:

TypeScript로 새 파일을 쓴다 (예외 조건이 명확)
모든 export된 함수에 테스트를 쓴다
린트 규칙을 따른다 — 충돌 시 행동도 정의됨

핵심 변화 세 가지:

모호어 제거: "probably", "might", "would be nice" 제거
조건 명시화: 어떤 경우에 예외인지 정확히 적음
충돌 시 행동 정의: "If X conflicts with Y, do Z"

2.2 부정의 함정 — 에이전트는 부정을 자주 무시한다

긴 부정문은 에이전트에게 어렵습니다.

before:

Do not delete files unless you are sure they are not used by any other
module that you have not yet read.

이중 부정과 조건의 중첩으로 인해 에이전트는 종종 이 문장을 "delete files"로 단순화합니다.

after:

Never delete a file in this session. If you believe a file is unused, list
it in your final summary under "Files I would delete" — let the user delete it.

부정을 단일 행동 지시로 바꾸고, 대안 행동을 정확히 정의했습니다. 에이전트는 부정 + 대체 행동이 함께 있을 때 부정을 더 잘 지킵니다.

2.3 "If unsure, ask" 함정

"If unsure, ask the user" 는 잘 쓰면 좋지만, 잘못 쓰면 에이전트를 끝없는 질문 기계로 만듭니다. "unsure"의 기준이 없기 때문입니다.

before:

If you are unsure about anything, ask the user before proceeding.

after:

Proceed without asking when: (a) the change is a typo or comment-only fix,
(b) the test suite passes locally, (c) you can produce a unit test that
demonstrates correctness.

Ask the user when: (a) the change touches authentication or billing code,
(b) you cannot find a relevant test, (c) the change requires a new
dependency.

"unsure"의 기준을 두 개의 화이트리스트로 바꾼 결과, 에이전트는 더 자율적이고 동시에 더 안전해집니다.

3장 · AGENTS.md / CLAUDE.md — 컨텍스트 파일 쓰기

AGENTS.md는 2026년 가장 중요한 새 문서 포맷입니다. OpenAI Codex가 제안하고, Linux Foundation 산하 Agentic AI Foundation이 표준으로 가져갔고, 60,000개 이상의 오픈소스 프로젝트가 채택했습니다. CLAUDE.md는 Anthropic Claude Code의 동일한 개념입니다. 둘 다 단순한 마크다운이지만, 좋은 것과 나쁜 것의 차이가 큽니다.

3.1 좋은 AGENTS.md의 4가지 섹션

좋은 AGENTS.md는 보통 이 네 가지를 가집니다.

# AGENTS.md

## What this repo is
(에이전트가 처음 본 코드베이스에서 길을 잃지 않게 — 2-4 줄)

## How to build, test, run
(터미널 명령어 — 정확한 명령어, 정확한 디렉터리)

## Conventions
(코드 스타일, 네이밍, 파일 구조 — 코드만 봐서는 알 수 없는 것)

## Boundaries
(에이전트가 절대 건드리지 말아야 할 것들)

이 구조를 따른 OpenAI Codex 자체 리포지토리의 AGENTS.md는 짧고 강력합니다. 사람이 30초에 읽고, 에이전트가 매번 정확히 인용합니다.

3.2 약한 AGENTS.md vs 강한 AGENTS.md

before — 흔히 보는 약한 AGENTS.md:

# Our Project

This is a Node.js project. We try to follow best practices and write clean code.

## Getting started

You can install dependencies with npm. To run the tests, use the test command.
We use TypeScript and prefer functional patterns when it makes sense.

Please be careful when making changes and try to add tests when appropriate.
If you need to deploy, talk to the platform team.

문제점:

"best practices", "clean code" — 의미 없음
"you can install with npm" — 정확한 명령어 없음
"when it makes sense" — 기준 없음
"when appropriate" — 기준 없음

after — 강한 AGENTS.md:

# AGENTS.md

## What this repo is
A Next.js 15 app for the customer billing portal. The backend is a separate
service at `api.example.com` — this repo only contains the frontend.

## Setup, build, test
- Install: `pnpm install` (do not use npm or yarn — lockfile is pnpm)
- Dev server: `pnpm dev` (port 3000)
- Tests: `pnpm test --filter <package>` for a single package, never `pnpm test` for the whole monorepo (takes 20+ minutes)
- Typecheck: `pnpm typecheck` — must pass before any commit

## Conventions
- Components: PascalCase files in `src/components/`. One component per file.
- Hooks: `useFoo.ts` in `src/hooks/`. Always return an object, never an array.
- API calls: use `src/lib/api/client.ts` — never call `fetch` directly.
- Styling: Tailwind only. Do not add new CSS files.

## Boundaries
- Never modify `src/generated/` — it is regenerated from OpenAPI on CI.
- Never edit `pnpm-lock.yaml` by hand — use `pnpm install`.
- Never add a dependency without asking — the bundle is already at 280 KB and we are tracking it.
- Do not touch `src/billing/legacy/` — it is being deprecated; new features go in `src/billing/v2/`.

차이의 핵심 — 모든 문장이 (a) 정확한 명령어, (b) 정확한 경로, (c) "왜"의 짧은 설명을 동반합니다. "왜"가 있으면 에이전트가 예외 상황에서 일반화할 수 있습니다.

3.3 CLAUDE.md 특유의 패턴

CLAUDE.md는 Claude Code 세션마다 자동으로 컨텍스트에 로드됩니다. 이 점이 다음 두 가지 패턴을 만들어냅니다.

패턴 1 — 빈번한 실수의 영구 차단. 에이전트가 같은 실수를 반복할 때, 그것을 CLAUDE.md에 추가합니다.

## MDX blog rules

Files in `data/blog/` are MDX, pre-rendered at build time. Patterns that
break the build:

- Bare `{identifier}` in prose → JSX expression → ReferenceError.
  Use backticks or rephrase.
- `<` followed by a digit → MDX tries to parse a JSX tag.
  Use a space: `< 100` not `<100`.
- A fence opening with extra backticks at the end of the line
  breaks parsing.

When fixing one variant (.mdx), check all 3 (.mdx / .en.mdx / .ja.mdx).

패턴 2 — "고통의 출처(painful)" 명시화. 에이전트는 어떤 작업이 시간이 오래 걸리는지 모릅니다. 알려줘야 합니다.

## Slow operations — avoid in agent sessions
- `pnpm install` from cold cache: 4 minutes
- `pnpm test` whole monorepo: 22 minutes
- `next build`: 6 minutes
- E2E tests: 18 minutes — never run unless explicitly asked

Prefer: `pnpm test --filter <pkg>`, `pnpm typecheck`, `pnpm lint`.

이 한 섹션이 있으면 에이전트는 "테스트 한 번 돌려볼게요" 하고 22분을 잡아먹는 행동을 하지 않습니다.

4장 · 시스템 프롬프트 — 프로덕션 에이전트를 위한 영어

AGENTS.md가 "코드베이스 컨텍스트"라면, 시스템 프롬프트는 "에이전트의 역할 정의"입니다. 프로덕션에서 동작하는 에이전트라면 이 둘이 명확히 분리되어 있어야 합니다.

4.1 시스템 프롬프트의 5가지 섹션 (Anthropic 권장)

Anthropic의 프롬프트 엔지니어링 가이드를 따른 구조:

1. Role and persona       — 너는 누구인가
2. Context                — 어떤 환경에서 동작하는가
3. Capabilities and tools — 무엇을 할 수 있는가
4. Rules and constraints  — 무엇은 절대 안 되는가
5. Output format          — 결과를 어떻게 돌려주는가

예시 — 고객 지원 에이전트의 시스템 프롬프트 골격:

ROLE
You are the support assistant for Example Corp. You handle questions about
billing, account access, and subscription changes. You do not handle
refunds — those are escalated to a human agent.

CONTEXT
You have access to the user's account via the get_account tool. You do not
have access to payment methods, internal Slack, or engineering tickets. The
current date is in the user message.

CAPABILITIES
- Look up account state with get_account
- Trigger a password reset with send_password_reset
- Open a refund ticket with create_refund_ticket (this hands off to a human)

RULES
- Never reveal another user's data.
- Never tell the user about internal tools or model names.
- If the user asks for a refund, do not promise it — create a ticket and tell
  them a human will follow up within one business day.
- If you are unsure whether an action is allowed, do not do it. Ask.

OUTPUT
Respond in plain English, 2-4 short paragraphs at most. End with a single
sentence stating what action you took or will take next.

이 구조의 강점 — 각 섹션이 단일 의도를 가집니다. 새 도구가 추가되면 CAPABILITIES만 수정합니다. 새 금기가 추가되면 RULES만 수정합니다. 유지보수가 가능한 영어입니다.

4.2 시스템 프롬프트의 안티패턴

안티패턴 1 — 예시 폭주. 사람이 처음 시스템 프롬프트를 쓰면 흔히 10-20개의 예시를 우겨 넣습니다. Anthropic의 권장은 다릅니다 — "다양하고 정전(canonical)인 예시 3-5개"가 50개의 모순된 예시보다 낫습니다. 에이전트는 예시에서 일반화할 수 있는 능력이 있고, 너무 많은 예시는 과적합을 일으킵니다.

안티패턴 2 — 톤 형용사 나열. "Be helpful, professional, friendly, concise, accurate, thoughtful, careful." 일곱 개의 형용사는 일곱 개 모두를 약하게 만듭니다. 두 개를 골라 — "Direct and concise. No filler" — 명확하게.

안티패턴 3 — 메타 지시. "Think carefully before answering." 이 지시는 사람에게는 의미 있지만 에이전트에게는 종종 노이즈입니다. "Think step by step about X" 같이 구체적인 추론 단계를 지시하거나, 아예 빼는 편이 낫습니다.

4.3 사용자 메시지 vs 시스템 메시지

영어를 어디에 쓸지가 중요합니다. 세션마다 일정한 규칙은 시스템 프롬프트에. 한 번의 요청에 특화된 정보는 사용자 메시지에.

before — 모든 것을 시스템 프롬프트에:

You are a code reviewer. The user is reviewing PR #1234 which adds a new
billing flow. The PR description is: "..."

문제 — 다음 PR을 검토할 때 시스템 프롬프트를 다시 작성해야 합니다.

after — 분리:

SYSTEM
You are a code reviewer for this monorepo. Focus on correctness, security,
and consistency with the patterns in src/lib/. Do not comment on style — a
linter handles that.

USER
Review PR #1234. The diff is below. The author claims it adds a new
billing flow.
[diff]

시스템 프롬프트는 재사용 가능하고, 사용자 메시지는 매번 다릅니다. 깨끗한 분리입니다.

5장 · RFC와 디자인 독 — 사람과 에이전트가 동시에 읽는 문서

RFC와 디자인 독은 영어 글쓰기의 정점입니다. 그리고 2026년에는 이 문서들의 청중에 에이전트가 추가되었습니다. 에이전트는 RFC를 받아 작업 단위로 쪼개고, 일부 작업을 직접 실행하기도 합니다.

5.1 사람만 읽던 RFC의 약점

전통적인 RFC는 종종 이런 단점이 있습니다.

"어떻게(how)" 를 모호하게 둠 — 구현자의 자율성을 존중하기 위해서
결정의 "왜(why)" 를 본문에서 흩뿌림 — 사람은 종합해서 읽음
비기능 요구사항을 자유롭게 적음 — "should be fast enough"
대안(alternatives)을 짧게 언급 — "we considered X but decided not to"

이 모든 것이 사람에게는 괜찮지만, 에이전트는 이런 RFC에서 명확한 작업 단위를 추출하지 못합니다.

5.2 에이전트도 읽는 RFC의 새 구조

1. TL;DR (1 paragraph)         — 사람이 30초에 읽음
2. Problem                     — 왜 지금 하는가
3. Goals / non-goals           — 명시적 경계
4. Proposal                    — 무엇을 만들 것인가
5. Detailed design             — 어떻게 만들 것인가
6. Alternatives considered     — 왜 다른 길을 안 갔는가
7. Open questions              — 결정되지 않은 것
8. Work breakdown (NEW)        — 작업 단위 (에이전트가 행동 가능)
9. Acceptance criteria (NEW)   — 완료의 정의
10. Non-functional requirements
    (latency, throughput, error budget — 숫자)

전통적인 RFC 대비 두 가지가 추가됩니다 — Work breakdown과 Acceptance criteria. 둘 다 에이전트가 직접 행동의 출발점으로 삼을 수 있는 형태로 적어야 합니다.

before — 모호한 work breakdown:

We'll need to update the API, the frontend, and add some tests.

after — 행동 가능한 work breakdown:

1. Add POST /v2/billing/preview endpoint
   - Request: BillingPreviewRequest (new schema)
   - Response: BillingPreviewResponse
   - Backed by services.billing.preview()
   - File: services/billing/handlers/preview.ts (new)
   - Estimated diff: ~150 lines

2. Add usePreview hook
   - File: src/hooks/usePreview.ts (new)
   - Wraps the new endpoint with TanStack Query
   - Estimated diff: ~40 lines

3. Wire the hook into BillingScreen
   - File: src/screens/BillingScreen.tsx (modify)
   - Loading and error states required
   - Estimated diff: ~80 lines

4. Tests
   - Unit: services/billing/preview.test.ts
   - Integration: e2e/billing/preview.spec.ts

차이는 단순합니다 — 각 항목이 (a) 동사로 시작하고, (b) 정확한 파일을 가리키고, (c) 대략적 크기를 알립니다. 사람도 더 잘 검토하고, 에이전트는 이대로 행동할 수 있습니다.

5.3 비기능 요구사항은 숫자로

이것은 에이전트 시대 이전부터의 좋은 관습이지만, 에이전트가 들어오면서 더 중요해졌습니다.

before:

The new endpoint should be fast. It should handle our peak traffic.

after:

- p99 latency under 200 ms at 500 req per second.
- Error rate under 0.5 percent over any 5 minute window.
- The endpoint must not increase the upstream Postgres CPU above 60 percent
  during peak — current peak is 38 percent.
- If the endpoint fails for more than 30 seconds, fall back to the existing
  billing summary endpoint (graceful degradation).

숫자가 있으면 사람도 검토 가능하고, 에이전트가 부하 테스트를 만들 때 정확한 목표를 가질 수 있습니다.

6장 · 이슈 / 티켓 — 에이전트가 행동할 수 있는 설명

Devin, Codex Cloud, GitHub Copilot Workspace 같은 티켓 기반 에이전트가 2026년 표준이 되면서, 이슈를 쓰는 방식이 곧 에이전트의 일을 정하는 방식이 되었습니다.

6.1 사람용 이슈 vs 에이전트용 이슈

사람 개발자가 받는 이슈는 보통 이렇습니다.

Title: Login form looks broken on mobile

Description: Hey, when I try to log in on my phone, the form looks weird.
The button is cut off and I can't really submit. Can you take a look?

사람 개발자는 이 이슈를 받으면 — 폰을 꺼내서 직접 재현하고, 디자이너에게 의도를 묻고, 비슷한 이슈를 검색하고, 결국 일을 시작합니다. 에이전트는 이 이슈만 가지고는 거의 아무것도 할 수 없습니다.

같은 문제를 에이전트가 행동할 수 있는 형태로:

Title: LoginForm submit button truncated on mobile viewport

Reproduction
1. Open http://localhost:3000/login in Chrome
2. Open DevTools and select "iPhone 14 Pro" viewport (393x852)
3. Observe that the submit button extends past the right edge of the form
4. Expected: submit button is fully visible and aligned with the form's right edge

Suspected cause
The form container has `max-width: 400px` but at viewports under 400px
plus the form's padding, content overflows. See `src/components/LoginForm.tsx`
line ~45.

Acceptance criteria
- Submit button is fully visible on all viewports >= 320px wide
- No visual regression on desktop (>= 1024px)
- Existing Playwright login test still passes

Out of scope
- Do not redesign the form
- Do not change the button text or color
- Do not touch the OAuth button — it is being redone in #1240

길어 보이지만 — 에이전트는 이 이슈를 받자마자 정확히 시작할 수 있고, 사람이 검토할 때도 무엇이 됐고 무엇이 빠졌는지 명확합니다. "Out of scope" 가 특히 중요합니다. 에이전트의 가장 흔한 실수는 "거리에서 청소하다가 옆집 마당까지 청소" 입니다. 경계를 명시하면 막을 수 있습니다.

6.2 이슈 작성 템플릿

팀 전체에서 쓸 만한 템플릿:

## Context
(1-3 sentences — why this matters now)

## Reproduction or scenario
(numbered steps — for bugs; concrete user story for features)

## Acceptance criteria
- ...
- ...

## Out of scope
- ...

## Related
- linked issues, ADRs, RFCs

Out of scope는 새 추가입니다. 사람 개발자에게는 암묵적이었던 것을 에이전트에게는 명시해야 합니다.

7장 · 커밋 메시지와 PR 설명 — AI 우선 워크플로우

커밋 메시지와 PR 설명은 영어 글쓰기 중에서도 가장 자주 쓰는 것이고, 에이전트가 자주 읽고 자주 쓰는 것입니다.

7.1 좋은 커밋 메시지 (Conventional Commits + 왜)

before:

fix bug

after:

fix(billing): prevent double-charge when retry after partial failure

The retry path was not idempotent because we generated a new idempotency
key on every attempt. We now reuse the key for the whole transaction
attempt, scoped to the user and order id.

Repro: tests/billing/retry.test.ts::"survives partial gateway failure"

세 요소 — (1) type(scope): summary, (2) 단락의 "왜", (3) 재현 또는 테스트 위치. 사람도 읽기 좋고, 에이전트가 향후 같은 영역을 다시 만질 때 매우 유용합니다.

7.2 AI가 쓴 PR 설명 vs 사람이 쓴 PR 설명

AI가 자동 생성하는 PR 설명에는 흔한 약점이 있습니다.

변경 줄을 줄줄이 나열만 함 ("Updated A, B, C")
"왜" 가 거의 없음
모든 변경을 동등하게 다룸 — 핵심과 부수의 구분이 없음

좋은 PR 설명은 사람이 짧게 적든 에이전트가 길게 적든, 다음을 가집니다.

## What
(1-2 paragraphs — what changes, in business terms)

## Why
(왜 지금, 왜 이 방식)

## How (only if non-obvious)
(설계 결정 중 코드만 봐서 모를 것)

## Risks
(무엇이 깨질 수 있는가, 어떻게 봤는가)

## Test plan
(어떻게 검증했는가 — 실행한 명령어, 본 화면, 본 로그)

에이전트에게 PR 설명을 생성시킬 때 — 이 다섯 섹션을 명시적으로 요구하면 결과가 극적으로 좋아집니다.

7.3 Co-authored-by 와 정직함

AI가 함께 쓴 코드를 정직하게 적는 것은 윤리적으로도 실용적으로도 중요합니다. Co-Authored-By: Claude 같은 표기는 향후 그 커밋을 다시 볼 때 "이건 AI가 처음 썼고 내가 검토했다" 라는 신호로 작용합니다. 6개월 뒤 같은 영역을 디버깅할 때 이 표기가 큰 가치를 가집니다.

8장 · 프롬프트 라이팅을 craft로

여기까지 봤다면 한 가지가 드러났을 것입니다. 에이전트를 위한 영어 쓰기는 결국 잘 쓴 기술 글쓰기와 같습니다. 좋은 RFC를 쓰는 사람은 좋은 시스템 프롬프트를 쓸 가능성이 높고, 그 반대도 마찬가지입니다.

8.1 좋은 영어의 일반 원칙 — 에이전트 시대 버전

원칙 1 — 한 단락에 한 가지 의도. 사람도 좋아하고 에이전트는 더 좋아합니다. 한 단락에 명령과 예외와 톤이 섞이면 모두에게 어렵습니다.

원칙 2 — 명시적 컨텍스트. "as discussed yesterday" 는 동료에게는 명시적이지만 에이전트에게는 빈 공간입니다. 컨텍스트를 매번 적거나 — 더 좋게 — 영구 문서에 남기십시오.

원칙 3 — 이름 붙은 제약. "this should be fast" 가 아니라 "p99 under 200 ms". "secure" 가 아니라 "no plaintext credentials in logs". 이름이 붙은 제약은 검증 가능합니다.

원칙 4 — 충돌의 행동 정의. 잘 쓴 영어는 항상 "X와 Y가 충돌하면 Z 하라"를 갖습니다. 충돌이 명시되지 않으면 에이전트는 둘 중 하나를 임의로 고르고 — 그 선택은 종종 틀립니다.

원칙 5 — 구체적 예시. 추상적인 규칙 하나보다 정전(canonical) 예시 셋이 더 잘 동작합니다. 다만 — 예시는 일관되어야 합니다.

8.2 글을 짧게 만드는 기술

AGENTS.md와 시스템 프롬프트는 짧을수록 좋습니다. 짧게 만드는 다섯 가지 기술:

수식어 삭제: "very", "really", "quite", "somewhat" — 거의 항상 빼도 됩니다
명사화 풀기: "perform a review of" 가 아니라 "review"
수동태 → 능동태: "Tests are to be written" 가 아니라 "Write tests"
불릿 한 줄 한 명령: 두 명령이 섞이면 둘로 쪼개세요
반복 제거: 같은 규칙이 두 번 나오면, 한 곳을 지우거나 한쪽이 다른 측면을 다루도록 다시 쓰세요

좋은 시스템 프롬프트는 좋은 시처럼 — 단어 하나를 빼도 의미가 약해지는 상태입니다.

8.3 정전(canonical) 예시 큐레이션

Anthropic은 "다양하고 정전인 예시 셋" 을 권장합니다. "정전" 이라는 단어가 중요합니다. 예시는 단지 동작하는 것이 아니라 — "우리 팀이 이렇게 일한다" 의 표본이어야 합니다. 어색한 예시는 어색한 행동을 일반화시킵니다.

좋은 예시의 체크리스트:

변경의 의도가 명확하다
코드 스타일이 우리 팀의 평균과 일치한다
에지 케이스를 한 가지 다룬다 (없으면 너무 단순하고 — 너무 많으면 노이즈)
일반화 가능한 패턴이다

9장 · 정중함과 직설성의 트레이드오프

영어권 화자에게 직설적인 톤은 종종 무례하게 느껴집니다. "Do this" 보다 "Could you do this" 가 자연스럽습니다. 하지만 에이전트에게 쓰는 영어에서는 그 정중함이 비용이 됩니다.

9.1 두 청중을 동시에 만족시키기

AGENTS.md는 사람과 에이전트가 동시에 읽습니다. 너무 직설적이면 사람이 거부감을 느끼고 ("이 팀은 차갑네"), 너무 정중하면 에이전트에게 노이즈가 됩니다. 절충은 가능합니다.

before — 차가운 톤:

Do not delete files. Do not modify generated code. Do not add dependencies.

after — 명확하지만 인간적인 톤:

Working with this repo:
- Files are precious — we prefer marking unused files in a summary over deleting them.
- Generated code lives in `src/generated/` — it is regenerated by CI, do not edit by hand.
- Dependencies are tracked carefully — please ask before adding a new one. The bundle is at 280 KB and growing.

차이 — 명령은 동일하지만 (1) 짧은 "왜" 를 곁들이고, (2) "we" 와 "please" 같은 사람 톤을 유지합니다. 에이전트는 명령을 정확히 따르고, 사람은 텍스트를 읽기 좋게 받아들입니다.

9.2 자동화된 메시지의 톤

CI가 자동으로 다는 PR 코멘트, 스탠드업 봇이 보내는 메시지 — 이 자동화 메시지의 영어가 팀 문화를 정의합니다. "Failed" 와 "This pull request did not pass the required checks" 는 동일한 정보지만 후자가 인간적입니다. 에이전트 시대일수록 자동 메시지의 영어에 시간을 들이는 게 가치 있습니다.

9.3 약한 영어가 정중하다는 미신

"Could you maybe consider possibly trying to..." 같은 다중 헷지는 정중한 게 아닙니다 — 자신감 없는 것입니다. 영어 쓰기 코치들이 오랫동안 가르쳐온 것이 에이전트 시대에 다시 강조됩니다: 명확함이 최고의 정중함입니다. 에이전트도, 사람도, 받는 사람이 시간을 덜 쓰게 해주는 것이 진짜 배려입니다.

에필로그 — 에이전트가 읽는 영어를 잘 쓰기 위해

체크리스트 — AGENTS.md / 시스템 프롬프트를 쓰기 전에

청중이 누구인지 적었다 — 사람만, 에이전트만, 또는 둘 다
모든 줄에 "이 줄이 없으면 에이전트가 실수했을까?" 를 물었다
헷지(probably, perhaps, kind of, somewhat)를 제거했다
비기능 요구사항을 숫자로 적었다 (latency, throughput, error rate)
"X와 Y가 충돌하면 Z" 형태의 충돌 정의가 있다
"Out of scope" / "Boundaries" 섹션이 있다
예시는 셋 이하, 모두 우리 팀 표준에 부합한다
짧다 — 다른 사람이 30초에 끝까지 읽는다
정중함을 짧은 "왜" 로 표현했지 헷지로 표현하지 않았다

안티패턴 모음

안티패턴	대신 이렇게
"should probably" / "kind of"	"Do X" 또는 "Do not do X"
"if unsure, ask" 만	화이트리스트 + 블랙리스트로 기준 정의
형용사 나열 (helpful, professional, ...)	두 가지를 골라 정확히 정의
모든 정보를 시스템 프롬프트에	세션 상수는 시스템, 한 번뿐인 정보는 사용자 메시지에
RFC의 모호한 work breakdown	동사 + 정확한 파일 + 크기 추정
"fast enough", "secure", "scalable"	숫자로 표현
사람용 이슈를 그대로 에이전트에게	재현 / 수락 기준 / out of scope 추가
자동 메시지의 차가운 톤	짧은 "왜" 로 인간 톤 유지
50개의 예시	3-5개의 정전 예시
부정문 폭주 ("don't X unless Y unless Z")	"Never X. If you think Y, list it for the user."

한 줄 요약

2026년의 좋은 개발자 영어는, 사람과 에이전트가 동시에 30초에 끝까지 읽고 같은 행동에 도달하게 만드는 영어다.

다음 글 예고

이 글은 영어 — 즉 한 가지 자연어 — 가 어떻게 에이전트와 사람을 동시에 향하는지를 다뤘습니다. 다음 글에서는 그 영어가 도구를 부르는 인터페이스, MCP (Model Context Protocol) 와 도구 설계의 craft 를 다룹니다. 좋은 도구 설명서가 어떻게 잘 쓴 함수 시그니처와 닮았는지, 도구 설명에서 약한 영어가 왜 에이전트의 실패를 만드는지 — 이 글의 자연스러운 후속편입니다.

영어를 잘 쓰는 개발자가 되고 싶다면, 가장 빠른 길은 — 동료에게 보내는 영어를 한 단계 줄이고, 에이전트가 읽는 영어를 한 단계 키우는 것입니다. 거기서 일의 형태가 바뀝니다.

참고 / References

Developer English in the AI Era: AGENTS.md, System Prompts, RFCs — Writing the English Agents Read

"The most important English you write in 2026 is not read by a human. It is read by an agent."

Prologue — The audience changed

Five years ago, the reasons a developer needed strong English were obvious. To explain a tricky change in code review. To work with engineers in other timezones over an issue tracker. To give a conference talk. To answer behavioral questions in interviews. The audience was always a human.

In 2026, the most important audience for your English has shifted. The English you write the most each week is probably not a Slack message to a colleague — it is English that an agent will read. The build commands in AGENTS.md. The system prompt for your Claude Code session. The description on a Devin ticket. The PR context you hand to Codex. The RFC where you spell out non-functional requirements — and the RFC, in 2026, is the document a human writes first and an agent then breaks into executable work.

This post takes a different angle from earlier English posts on this blog. Code review English, standup English, salary-negotiation English, FDE-interview English — those were all about speaking to humans. This one is about writing English for agents — and writing English that humans and agents both read. Those two audiences are not the same, and the differences are subtle but decisive.

The thesis

The single highest-leverage form of English a developer writes in 2026, measured per minute of effort, is English that agents read.

Three reasons.

First, reuse. A Slack message is read by one person, once. AGENTS.md is read by every teammate's every agent session, every time. A one-line quality difference compounds thousands of times.

Second, literalness. Humans patch over vague writing using context. Agents do not patch — or they patch in the wrong direction. Weak English maps directly to weak behavior.

Third, leverage. One well-written line in a system prompt replaces a hundred review comments. One well-written RFC produces thirty well-shaped tasks.

What this post covers:

Chapter	Topic
1	The audience changed — humans vs agents as readers
2	Agents are literal — the hedging trap
3	AGENTS.md / CLAUDE.md — writing context files
4	System prompts — English for production agents
5	RFCs and design docs — written for humans and agents
6	Issues and tickets — descriptions an agent can act on
7	Commit messages and PR descriptions — AI-first workflow
8	Prompt writing as a craft — general principles
9	Politeness vs directness — the trade-off
Epilogue	Checklist, anti-patterns, what comes next

1. The audience changed — humans vs agents as readers

Writing good English means understanding your audience. Human readers and agent readers are clearly different audiences.

1.1 Humans patch the gaps; agents do not

You write a sentence to a colleague:

We probably want to use the new auth flow for this, unless there's a reason
to stay on the old one. Let me know what you think.

Your colleague reads five things at once. (1) A proposal to use the new flow. (2) Soft conviction (probably). (3) Openness to disagreement (unless). (4) Partial delegation of the decision (let me know). (5) A friendly tone (we). They reach all five because they know company politics, your role, your previous conversations, and the conventions of hedged English.

Feed the same sentence to an agent. The agent often picks one of two readings — either "Use the new auth flow" or "Defer, ask the user, because of probably and unless." Neither matches the human reading of "Let us decide this together."

1.2 Agents read tokens as weights

An agent does not read sentences in semantic chunks. It reads them as a weighted sum of tokens. The small words humans treat as ornament — should, must, may, prefer, always, never, consider — push the agent's behavior in measurable ways.

English word	Human reading	Agent reading
"should"	"I would like you to" (soft)	"almost always do it" (strong)
"must"	absolute	absolute (agrees)
"prefer"	"this is the default, exceptions exist"	"do this by default"
"avoid"	"try not to"	"strong constraint, almost never"
"never"	absolute	absolute (agrees)
"consider"	"think about it, decide for yourself"	ambiguous — often ignored or over-applied

The takeaway — consider and perhaps feel like polite suggestions to a human reader, but to an agent they are mostly noise. For agents, Use X or Do not use X is almost always better.

1.3 Humans read once; agents read every time

AGENTS.md is reloaded every session, every task, every call. That is unlike a README, which a colleague reads once and stores in their head. A line of AGENTS.md is not a one-time cost — it is a recurring cost. One noisy line, multiplied by a thousand sessions, becomes a thousand weights.

This is why one of Anthropic's official recommendations is: for every line in CLAUDE.md, ask, "would Claude have made a mistake without this?" If the answer is no, the line is noise.

2. Agents are literal — the hedging trap

Native English speakers naturally hedge. kind of, sort of, I think, perhaps, it might be that, would probably want to. To a human, this is politeness. To an agent, it lowers signal-to-noise ratio.

2.1 Weak English vs strong English — before and after

Before — a system prompt riddled with hedges:

You should probably try to use TypeScript for new files, although JavaScript
might be acceptable in some cases. It would be nice if you could add tests,
but if you don't have time, that's okay too. Please try to follow the
existing code style as much as possible.

How the agent behaves:

Asks the user every time whether to use TypeScript or JavaScript
Skips tests, because "if you don't have time" is effectively a get-out clause
Follows code style "as much as possible" — undefined threshold

After — same intent, strong English:

Write all new files in TypeScript. JavaScript is only allowed when modifying
an existing .js file in place — do not convert .js to .ts as part of an
unrelated change.

Write tests for any new function exported from a module. Tests live next to
the source as <name>.test.ts.

Follow the rules in eslint.config.js. If a rule conflicts with your output,
fix your output — never disable the rule.

How the agent behaves now:

Writes new files in TypeScript (exceptions are explicit)
Writes tests for every exported function
Follows lint rules — conflict behavior is defined

Three changes did the work:

Remove hedges: drop probably, might, would be nice
Make exceptions explicit: spell out the conditions where a rule does not apply
Define conflict behavior: "if X conflicts with Y, do Z"

2.2 The negation trap — agents often miss negations

Long negative sentences are hard for agents.

Before:

Do not delete files unless you are sure they are not used by any other
module that you have not yet read.

Double negation plus stacked conditions, and the agent often collapses this into "delete files."

After:

Never delete a file in this session. If you believe a file is unused, list
it in your final summary under "Files I would delete" — let the user delete it.

The negation is now a single positive action plus an alternative. Agents follow negations far more reliably when there is an explicit alternative right next to them.

2.3 The "if unsure, ask" trap

If unsure, ask the user works when used carefully, but written naively it turns the agent into a question machine, because "unsure" has no threshold.

Before:

If you are unsure about anything, ask the user before proceeding.

After:

Proceed without asking when: (a) the change is a typo or comment-only fix,
(b) the test suite passes locally, (c) you can produce a unit test that
demonstrates correctness.

Ask the user when: (a) the change touches authentication or billing code,
(b) you cannot find a relevant test, (c) the change requires a new
dependency.

By replacing "unsure" with a whitelist and a blacklist, the agent becomes both more autonomous and safer.

3. AGENTS.md / CLAUDE.md — writing context files

AGENTS.md is the most important new document format of 2026. It started as a convention at OpenAI Codex, was adopted across the agent ecosystem, and is now stewarded as an open standard by the Agentic AI Foundation under the Linux Foundation. More than 60,000 open-source projects have adopted it. CLAUDE.md is the equivalent for Claude Code. Both are plain Markdown, but the gap between a good one and a bad one is large.

3.1 The four sections of a good AGENTS.md

A good AGENTS.md usually has four sections.

# AGENTS.md

## What this repo is
(So the agent does not get lost in unfamiliar code — 2 to 4 lines)

## How to build, test, run
(Terminal commands — the exact command, the exact directory)

## Conventions
(Code style, naming, file layout — things you cannot infer from the code)

## Boundaries
(What the agent must never touch)

The AGENTS.md in the OpenAI Codex repository itself follows this shape, and is striking for how short and powerful it is — a human reads it in thirty seconds, and the agent quotes from it accurately every time.

3.2 A weak AGENTS.md vs a strong AGENTS.md

Before — a common weak AGENTS.md:

# Our Project

This is a Node.js project. We try to follow best practices and write clean code.

## Getting started

You can install dependencies with npm. To run the tests, use the test command.
We use TypeScript and prefer functional patterns when it makes sense.

Please be careful when making changes and try to add tests when appropriate.
If you need to deploy, talk to the platform team.

Why this is weak:

"best practices", "clean code" — empty signifiers
"you can install with npm" — no exact command
"when it makes sense" — no threshold
"when appropriate" — no threshold

After — a strong AGENTS.md:

# AGENTS.md

## What this repo is
A Next.js 15 app for the customer billing portal. The backend is a separate
service at `api.example.com` — this repo contains only the frontend.

## Setup, build, test
- Install: `pnpm install` (do not use npm or yarn — the lockfile is pnpm)
- Dev server: `pnpm dev` (port 3000)
- Tests: `pnpm test --filter <package>` for a single package, never `pnpm test` for the whole monorepo (it takes 20 plus minutes)
- Typecheck: `pnpm typecheck` — must pass before any commit

## Conventions
- Components: PascalCase files in `src/components/`. One component per file.
- Hooks: `useFoo.ts` in `src/hooks/`. Always return an object, never an array.
- API calls: use `src/lib/api/client.ts` — never call `fetch` directly.
- Styling: Tailwind only. Do not add new CSS files.

## Boundaries
- Never modify `src/generated/` — it is regenerated from OpenAPI in CI.
- Never edit `pnpm-lock.yaml` by hand — use `pnpm install`.
- Never add a dependency without asking — the bundle is already 280 KB and we are tracking it.
- Do not touch `src/billing/legacy/` — it is being deprecated; new features go in `src/billing/v2/`.

The pattern is simple — every line gives (a) the exact command or path, (b) a short why. The "why" matters because it lets the agent generalize to edge cases the rule did not cover.

3.3 Patterns specific to CLAUDE.md

CLAUDE.md is automatically loaded into every Claude Code session. Two patterns follow from that.

Pattern 1 — permanently block a recurring mistake. When the agent makes the same mistake twice, add the lesson to CLAUDE.md.

## MDX blog rules

Files in `data/blog/` are MDX, pre-rendered at build time. Patterns that
break the build:

- A bare `{identifier}` in prose becomes a JSX expression, which throws a
  ReferenceError. Wrap in backticks or rephrase.
- A `<` followed by a digit makes MDX try to parse a JSX tag. Add a space:
  `< 100`, not `<100`.
- A code fence that opens with stray content after the language identifier
  breaks parsing.

When fixing one variant (.mdx), check all three (.mdx / .en.mdx / .ja.mdx).

Pattern 2 — surface the painful operations. Agents do not know what is slow. Tell them.

## Slow operations — avoid in agent sessions
- `pnpm install` from a cold cache: 4 minutes
- `pnpm test` for the whole monorepo: 22 minutes
- `next build`: 6 minutes
- E2E tests: 18 minutes — never run them unless explicitly asked

Prefer: `pnpm test --filter <pkg>`, `pnpm typecheck`, `pnpm lint`.

This one section alone prevents the agent from casually running tests and burning twenty-two minutes of its session.

4. System prompts — English for production agents

If AGENTS.md is "context about this codebase," a system prompt is "definition of the agent." For a production agent, the two must be cleanly separated.

4.1 The five sections of a system prompt (Anthropic's pattern)

Following Anthropic's prompt engineering guide:

1. Role and persona       — who you are
2. Context                — the environment you operate in
3. Capabilities and tools — what you can do
4. Rules and constraints  — what you absolutely cannot do
5. Output format          — how to return results

Example — skeleton for a customer support agent:

ROLE
You are the support assistant for Example Corp. You handle questions about
billing, account access, and subscription changes. You do not handle
refunds — those are escalated to a human agent.

CONTEXT
You have access to the user's account through the get_account tool. You do
not have access to payment methods, internal Slack, or engineering tickets.
The current date is in the user message.

CAPABILITIES
- Look up account state with get_account
- Trigger a password reset with send_password_reset
- Open a refund ticket with create_refund_ticket (this hands off to a human)

RULES
- Never reveal another user's data.
- Never tell the user about internal tools or model names.
- If the user asks for a refund, do not promise it — create a ticket and tell
  them a human will follow up within one business day.
- If you are unsure whether an action is allowed, do not do it. Ask.

OUTPUT
Reply in plain English, at most two to four short paragraphs. End with a
single sentence stating what action you took or will take next.

The strength of this structure — every section has a single intent. Add a tool? Only CAPABILITIES changes. Add a restriction? Only RULES changes. The prompt is maintainable English.

4.2 Anti-patterns in system prompts

Anti-pattern 1 — example flood. First-time authors cram in ten or twenty examples. Anthropic recommends the opposite — "three to five diverse, canonical examples" beat fifty inconsistent ones. Too many examples cause overfitting; the agent loses the ability to generalize.

Anti-pattern 2 — adjective parade. "Be helpful, professional, friendly, concise, accurate, thoughtful, careful." Seven adjectives weaken each other. Pick two — "Direct and concise. No filler." — and define them precisely.

Anti-pattern 3 — meta-instructions. "Think carefully before answering." Meaningful to a human, mostly noise to an agent. Either prescribe the actual reasoning steps ("Think step by step about X") or leave it out.

4.3 System message vs user message

Where you put the English matters. Rules that hold for the whole session live in the system prompt. Information specific to one request lives in the user message.

Before — everything in the system prompt:

You are a code reviewer. The user is reviewing PR #1234, which adds a new
billing flow. The PR description is: "..."

The problem — the next PR needs a brand new system prompt.

After — separated:

SYSTEM
You are a code reviewer for this monorepo. Focus on correctness, security,
and consistency with the patterns in src/lib/. Do not comment on style — a
linter handles that.

USER
Review PR #1234. The diff is below. The author claims it adds a new billing
flow.
[diff]

System prompt: reusable. User message: changes every time. Clean separation.

5. RFCs and design docs — written for humans and agents

RFCs and design docs are the high-water mark of technical English. In 2026, their audience now includes agents, which take an RFC, split it into work items, and sometimes execute the safe ones themselves.

5.1 Where traditional RFCs fall short

A traditional RFC often has these traits:

Leaves the how vague — out of respect for the implementer's autonomy
Scatters the why across the body — humans synthesize as they read
States non-functional requirements loosely — "should be fast enough"
Mentions alternatives briefly — "we considered X but decided not to"

All of this is fine for a human, but an agent cannot extract clear, actionable work units from this kind of RFC.

5.2 A new structure for RFCs read by agents too

1. TL;DR (1 paragraph)         — human reads it in 30 seconds
2. Problem                     — why we are doing this now
3. Goals / non-goals           — explicit boundaries
4. Proposal                    — what we will build
5. Detailed design             — how we will build it
6. Alternatives considered     — why we did not take other paths
7. Open questions              — what is still undecided
8. Work breakdown (NEW)        — actionable work units
9. Acceptance criteria (NEW)   — definition of done
10. Non-functional requirements (latency, throughput, error budget — in numbers)

Two new sections compared to a traditional RFC — Work breakdown and Acceptance criteria. Both should be written so an agent can use them as a starting point for action.

Before — a vague work breakdown:

We will need to update the API, the frontend, and add some tests.

After — an actionable work breakdown:

1. Add POST /v2/billing/preview endpoint
   - Request: BillingPreviewRequest (new schema)
   - Response: BillingPreviewResponse
   - Backed by services.billing.preview()
   - File: services/billing/handlers/preview.ts (new)
   - Estimated diff: ~150 lines

2. Add usePreview hook
   - File: src/hooks/usePreview.ts (new)
   - Wraps the new endpoint with TanStack Query
   - Estimated diff: ~40 lines

3. Wire the hook into BillingScreen
   - File: src/screens/BillingScreen.tsx (modify)
   - Loading and error states required
   - Estimated diff: ~80 lines

4. Tests
   - Unit: services/billing/preview.test.ts
   - Integration: e2e/billing/preview.spec.ts

The difference is simple — each item (a) starts with a verb, (b) points at a specific file, (c) sizes itself. Humans review better, and agents can act on it directly.

5.3 Non-functional requirements in numbers

This was always good practice, but it matters more in the agent era.

Before:

The new endpoint should be fast. It should handle our peak traffic.

After:

- p99 latency under 200 ms at 500 requests per second.
- Error rate under 0.5 percent over any 5 minute window.
- The endpoint must not push upstream Postgres CPU above 60 percent during
  peak — current peak is 38 percent.
- If the endpoint fails for more than 30 seconds, fall back to the existing
  billing summary endpoint (graceful degradation).

Numbers let humans review, and they give an agent a concrete target to test against.

6. Issues and tickets — descriptions an agent can act on

Ticket-based agents like Devin, Codex Cloud, and GitHub Copilot Workspace are now standard. The way you write a ticket is now the way you shape an agent's work.

6.1 A ticket for a human vs a ticket for an agent

A typical ticket aimed at a human developer:

Title: Login form looks broken on mobile

Description: Hey, when I try to log in on my phone, the form looks weird.
The button is cut off and I can't really submit. Can you take a look?

A human developer picks up this ticket — pulls out their phone, reproduces it, asks the designer about intent, searches for related issues, and starts work. An agent given this ticket alone can do almost nothing.

The same problem, rewritten for an agent:

Title: LoginForm submit button truncated on mobile viewport

Reproduction
1. Open http://localhost:3000/login in Chrome
2. Open DevTools and select the "iPhone 14 Pro" viewport (393x852)
3. Observe that the submit button extends past the right edge of the form
4. Expected: the submit button is fully visible and aligned to the form's right edge

Suspected cause
The form container has `max-width: 400px`, but at viewports under 400px
plus the form's padding, content overflows. See `src/components/LoginForm.tsx`
around line 45.

Acceptance criteria
- The submit button is fully visible at all viewports 320 px wide and above
- No visual regression on desktop (1024 px and above)
- The existing Playwright login test still passes

Out of scope
- Do not redesign the form
- Do not change the button text or color
- Do not touch the OAuth button — it is being redone in #1240

It looks longer, but the agent can start immediately, and when a human reviews the result they can see exactly what was and was not in scope. The Out of scope section is the secret weapon. The most common agent mistake is "cleaned the street and then also cleaned the neighbor's yard." Spelling out the edges prevents it.

6.2 A ticket template

A template usable across a team:

## Context
(1-3 sentences — why this matters now)

## Reproduction or scenario
(numbered steps for bugs; a concrete user story for features)

## Acceptance criteria
- ...
- ...

## Out of scope
- ...

## Related
- linked issues, ADRs, RFCs

Out of scope is the new addition. What used to be implicit for a human developer must be explicit for an agent.

7. Commit messages and PR descriptions — AI-first workflow

Commit messages and PR descriptions are the English we write most often, and they are the English agents read most often — and increasingly, write most often.

7.1 A good commit message (Conventional Commits plus the why)

Before:

fix bug

After:

fix(billing): prevent double-charge when retry follows a partial failure

The retry path was not idempotent because we generated a new idempotency
key on every attempt. We now reuse the key for the whole transaction
attempt, scoped to the user and order id.

Repro: tests/billing/retry.test.ts::"survives partial gateway failure"

Three elements — (1) type(scope): summary, (2) a paragraph of why, (3) a reproduction or test pointer. Humans like it, and the next agent that touches this area thanks you.

7.2 AI-written PR descriptions vs human-written ones

PR descriptions auto-generated by AI have a few common weaknesses.

They list every changed line in sequence ("Updated A, B, C")
They almost never explain why
They treat all changes as equal — no separation of headline vs detail

A good PR description, whether a human writes it briefly or an agent writes it at length, has:

## What
(1-2 paragraphs — what changes, in business terms)

## Why
(why now, why this approach)

## How (only if non-obvious)
(design choices that the code does not reveal)

## Risks
(what could break, how you looked)

## Test plan
(how you verified — commands run, screens seen, logs read)

When you ask an agent to generate a PR description, explicitly require these five sections. The result is dramatically better.

7.3 Co-authored-by and honesty

Marking AI-assisted code honestly is both ethical and practical. Co-Authored-By: Claude on a commit becomes a signal months later — "this was first drafted by an agent, then reviewed by me." When you debug the same area in six months, that signal carries real value.

8. Prompt writing as a craft

By now one pattern should be clear. Writing English for agents is, at its core, the same as writing good technical English. A person who writes clean RFCs writes clean system prompts, and vice versa.

8.1 General principles for good English — agent-era version

Principle 1 — one intent per paragraph. Humans appreciate it; agents need it. Mix a command with an exception and a tone in one paragraph and everyone struggles.

Principle 2 — explicit context. "As discussed yesterday" is explicit to a colleague and blank to an agent. Either restate the context every time, or — better — put it in a permanent document.

Principle 3 — named constraints. Not "this should be fast" — write "p99 under 200 ms". Not "secure" — write "no plaintext credentials in logs." Named constraints are verifiable.

Principle 4 — define conflict behavior. Good English always includes "if X conflicts with Y, do Z." If the conflict is not addressed, the agent picks one arbitrarily, and the pick is often wrong.

Principle 5 — concrete examples. Three canonical examples outperform one abstract rule. But — your examples must be internally consistent.

8.2 How to make writing shorter

AGENTS.md and system prompts are stronger when shorter. Five techniques.

Drop intensifiers: "very", "really", "quite", "somewhat" — almost always cut them
Un-nominalize: not "perform a review of" — just "review"
Active over passive: not "tests are to be written" — "write tests"
One command per bullet: if a bullet contains two commands, split it
Remove repetition: if the same rule appears twice, delete one or rewrite the second to address a different angle

A good system prompt is like a good poem — pulling any single word weakens the meaning.

8.3 Curating canonical examples

Anthropic recommends "three diverse, canonical examples." The word canonical is doing work. An example is not just something that compiles — it should represent "how our team works." Off-tone examples generalize into off-tone behavior.

Checklist for a canonical example:

The intent of the change is clear
The code style matches your team's median
It addresses one edge case (no edge case is too simple; many is noise)
It is a generalizable pattern

9. Politeness vs directness — the trade-off

To native English speakers, directness can read as rudeness. "Do this" feels harsh next to "Could you do this." But in writing meant for agents, that politeness becomes cost.

9.1 Serving both audiences

AGENTS.md is read by both humans and agents. Too direct, and humans feel the team is cold. Too polite, and the agent treats it as noise. There is a middle.

Before — cold tone:

Do not delete files. Do not modify generated code. Do not add dependencies.

After — clear but warm:

Working with this repo:
- Files are precious — we prefer marking unused files in a summary over deleting them.
- Generated code lives in `src/generated/` — it is regenerated by CI, so do not edit by hand.
- Dependencies are tracked carefully — please ask before adding a new one. The bundle is at 280 KB and growing.

Same commands, but (1) a short why for each, and (2) "we" and "please" preserve a human register. The agent still follows the commands precisely, and humans enjoy reading it.

9.2 The tone of automated messages

Auto-generated PR comments. Standup-bot messages. The English in these automated messages defines team culture more than people realize. "Failed" and "This pull request did not pass the required checks" are the same information, but the second one is human. In the agent era, time spent on the English in automated messages is unusually well spent.

9.3 The myth that weak English is polite

"Could you maybe consider possibly trying to..." is not polite — it is uncertain. Writing coaches have been teaching this for decades, and the agent era is bringing it back: clarity is the highest form of courtesy. Saving the reader — human or agent — from having to guess is the real consideration.

Epilogue — How to write the English agents will read

Checklist — before you ship an AGENTS.md or system prompt

I named the audience — humans, agents, or both
For each line, I asked, "would the agent have made a mistake without this?"
I removed hedges (probably, perhaps, kind of, somewhat)
I wrote non-functional requirements in numbers (latency, throughput, error rate)
Every important rule has a conflict definition: "if X conflicts with Y, do Z"
There is an Out of scope or Boundaries section
There are three or fewer examples, all matching team standards
The document is short — a reader finishes it in 30 seconds
Politeness comes from a short why, not from hedging

Anti-patterns and what to do instead

Anti-pattern	Do this instead
"should probably" / "kind of"	`Do X` or `Do not do X`
"if unsure, ask" alone	Define the threshold as a whitelist + blacklist
Adjective parade (helpful, professional, careful)	Pick two and define them precisely
Everything in the system prompt	Session-constant in system, per-request in user message
Vague RFC work breakdown	Verb + exact file + size estimate
"fast enough", "secure", "scalable"	Numbers
Human-style ticket given to an agent	Add reproduction, acceptance criteria, out of scope
Cold tone in automated messages	Short `why` to keep it human
Fifty examples	Three to five canonical ones
Nested negations ("don't X unless Y unless Z")	"Never X. If you think Y applies, list it for the user."

One-line summary

Good developer English in 2026 lets a human and an agent both read it in 30 seconds and arrive at the same action.

What is next

This post focused on one natural language — English — facing both human and agent readers. The next post moves down one layer to the interface that English calls into: tools, MCP (Model Context Protocol), and the craft of writing tool descriptions. A well-written tool description reads like a well-typed function signature, and the same weak English that breaks system prompts also breaks tools. That post is the natural sequel.

If you want to become a stronger English writer as a developer, the fastest path is to spend less time on English aimed at colleagues and more time on the English that agents read. That is where the shape of your work begins to change.