Ramping Up on an Unfamiliar Codebase Fast: The Newcomer's Craft

"The fastest way to understand a codebase is not to read it from the top. It is to make it run, then follow one path all the way through."

Prologue — We become the newcomer over and over

Say the word "ramp-up" and most people picture a junior developer's first day. But reality is different. Across a whole career, we become the newcomer repeatedly.

You change jobs and stand before a new company's codebase. You transfer teams and inherit a service you have never seen, in the very same company. Someone leaves and you inherit their five-year-old payment system. You clone someone else's repo to contribute to open source. You drop into another team for a week to fix one of their bugs.

All of these situations share one thing. You are standing in front of code that already exists, that you did not write, for which you have no map in your head.

Here is the key fact: ramping up is not a talent. It is a learnable skill. Some people land their first PR on an unfamiliar repo in three days; others are still "getting up to speed" after three weeks. The difference is not raw intelligence — it is whether you know the method.

This post is about that method. Not "how should the team design onboarding" — that is the manager's question — but "how do I, the individual, get productive fast". The two are different. Designing onboarding is the team's job; ramping up is, in the end, something you do. Even with no good onboarding doc — and usually there is none — the skill of ramping up fast is in your own hands.

What this post covers:

| Chapter | Topic |
| --- | --- |
| 1 | Get it running first — the highest-leverage move |
| 2 | Find the entry points — where does a request begin |
| 3 | Read the shape before the details |
| 4 | Follow one real request end-to-end (the vertical slice) |
| 5 | Read tests as documentation |
| 6 | Use your tools — grep, go-to-definition, git as archaeology |
| 7 | Make a small change early |
| 8 | Build a mental map and write it down |
| 9 | What to ask, and whom |
| 10 | Ramping up alongside AI agents |
| Epilogue | Checklist + anti-patterns + next-post teaser |

Chapter 1 · Get it running first

When you receive an unfamiliar repo, the first instinct is usually "let's read the code." Wrong instinct. The highest-leverage first move is to make it run locally.

1.1 A working environment beats a week of reading

Reading code alone gives you only a static picture in your head. A running environment puts a living system in your hands. You can change a value, add a log line, attach a debugger, and actually try "what happens if I delete this?" Understanding that would take three days of reading can take an hour once the system runs.

And the act of setting up the environment is itself learning. How the build runs, what services it depends on, what each environment variable turns on and off — you learn by hand the things the README never tells you.

1.2 Day one's goal is a "green screen"

Make day one's goal explicit. It is not to understand a feature; it is for the app to start and the tests to pass.

git clone <repository-url>
cd project
# follow the README setup steps verbatim
make setup        # or npm install / poetry install / ...
make dev          # bring the app up
make test         # see whether the tests run

Getting stuck along the way is normal. The point where you get stuck is your first discovery. A missing environment variable, a dependency that will not install, an undocumented prerequisite — write these down. As Chapters 7 and 9 discuss, this is good material for fixing the onboarding doc later, and it also signals how worn-in this team's setup is.

1.3 The order of moves when stuck

| Order | Action | Time box |
| --- | --- | --- |
| 1 | Re-read README, CONTRIBUTING, docs/ | 15 min |
| 2 | Search the exact error message inside the repo | 10 min |
| 3 | Check recent setup-related commits with git log | 10 min |
| 4 | Read the CI config file (the answer key lives there) | 15 min |
| 5 | Search the team chat for the same error | 10 min |
| 6 | Ask a person (with a well-formed question) | |

Step 4 is the key. The CI config (.github/workflows/, .gitlab-ci.yml, and the like) is the answer key to "how do you build this project in a clean environment." A human's README goes stale, but CI runs every time, so it does not lie.
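To make "CI is the answer key" concrete, here is a minimal sketch that pulls the shell commands out of a workflow file so you can replay them locally. The workflow text is an invented sample, and the line-based parsing is an assumption made for brevity: it is not a YAML parser and only understands `run: cmd` and `run: |` blocks.

```typescript
// Hypothetical workflow text; real files live under .github/workflows/.
const sampleWorkflow = [
  "jobs:",
  "  test:",
  "    runs-on: ubuntu-latest",
  "    steps:",
  "      - uses: actions/checkout@v4",
  "      - name: Install",
  "        run: npm ci",
  "      - name: Test",
  "        run: |",
  "          npm run build",
  "          npm test",
].join("\n");

// Collects shell commands from `run:` keys, in file order.
// Line-based sketch only: handles `run: cmd` and `run: |` blocks.
function extractRunCommands(workflowText: string): string[] {
  const lines = workflowText.split("\n");
  const commands: string[] = [];
  let i = 0;
  while (i < lines.length) {
    const match = /^(\s*)(?:- )?run:\s*(.*)$/.exec(lines[i]);
    i++;
    if (!match) continue;
    const keyIndent = match[1].length;
    const inline = match[2].trim();
    if (inline !== "|" && inline !== "") {
      commands.push(inline);
      continue;
    }
    // Block form: take the following lines that are indented deeper than `run:`.
    while (i < lines.length) {
      const body = lines[i];
      const indent = body.length - body.replace(/^\s+/, "").length;
      if (body.trim() !== "" && indent <= keyIndent) break;
      if (body.trim() !== "") commands.push(body.trim());
      i++;
    }
  }
  return commands;
}

const ciCommands = extractRunCommands(sampleWorkflow);
// ciCommands is ["npm ci", "npm run build", "npm test"]
```

Running those commands in the same order on your machine is usually the shortest route to the "green screen".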

See a "green screen" on day one and from that day every act of learning becomes an experiment. Miss it, and every act of learning stays a guess.


Chapter 2 · Find the entry points

Once the environment runs, the next question is "where does this system begin." A codebase always has an entry point. Find it, and the rest branches out from there.

2.1 Kinds of entry points

| System type | Entry point |
| --- | --- |
| Web backend | main, server bootstrap, route registration |
| Web frontend | App root component, router config |
| CLI tool | main function, argument parser |
| Library | The public API, the exports in the index file |
| Batch / worker | Scheduler registration, queue consumer |

Most frameworks fix where the entry point lives. The scripts in package.json, the targets in a Makefile, the Procfile, the CMD in a Dockerfile — these tell you "what this project starts with."
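As a small illustration of reading "what this project starts with" from package.json, the sketch below lists the declared scripts. The JSON content and the commands in it are invented for the example; a real repo will have its own.

```typescript
// Hypothetical package.json content, for illustration only.
const samplePackageJson = `{
  "name": "example-service",
  "scripts": {
    "dev": "tsx watch src/main.ts",
    "build": "tsc -p .",
    "test": "vitest run"
  }
}`;

interface ScriptEntry {
  name: string;
  command: string;
}

// Returns the declared scripts in file order; a missing "scripts" field yields [].
function listScripts(packageJsonText: string): ScriptEntry[] {
  const parsed: { scripts?: Record<string, string> } = JSON.parse(packageJsonText);
  const scripts = parsed.scripts || {};
  return Object.keys(scripts).map((name) => ({ name, command: scripts[name] }));
}

const scriptEntries = listScripts(samplePackageJson);
// The "dev" entry points at src/main.ts, which is where reading should start.
```

The command behind `dev` or `start` typically names the runtime entry file, and the one behind `build` names the build entry point.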

2.2 The request lifecycle

For a web service, the first picture to draw is the request lifecycle: from one HTTP request coming in to a response going out, what stages does it pass through.

The request lifecycle (typical web backend)

  HTTP request
  [Router]  ── maps URL to handler
  [Middleware]  ── auth, logging, parsing, rate limit
  [Handler / controller]  ── per-request entry function
  [Service / domain layer]  ── business logic
  [Data access layer]  ── DB, cache, external APIs
  [Response serialization]  ── result into JSON, etc.
  HTTP response

This picture applies to nearly every web backend. Only the names and boundaries differ per framework. Hold this picture and fill in "which directory is each layer in, in this repo?" — that becomes the skeleton of your codebase map.
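The layers above can be sketched in a few dozen lines without any framework. Everything here is hypothetical (the route, the in-memory table, the header check); the point is only to show which responsibility sits in which layer, so you know what to look for in a real repo.

```typescript
// All names are hypothetical; no real framework is implied.
interface HttpRequest { method: string; path: string; headers: Record<string, string>; }
interface HttpResponse { status: number; body: string; }
type Handler = (req: HttpRequest) => HttpResponse;
type Middleware = (req: HttpRequest, next: Handler) => HttpResponse;

// [Data access layer]: an in-memory stand-in for a database table.
const itemsTable: Record<string, { id: string; name: string }> = {
  "1": { id: "1", name: "keyboard" },
};

// [Service / domain layer]: business logic, unaware of HTTP.
function getItem(id: string): { id: string; name: string } | undefined {
  return itemsTable[id];
}

// [Handler / controller] plus [Response serialization].
const getItemHandler: Handler = (req) => {
  const id = req.path.split("/").pop() || "";
  const item = getItem(id);
  if (!item) return { status: 404, body: JSON.stringify({ error: "not found" }) };
  return { status: 200, body: JSON.stringify(item) };
};

// [Middleware]: rejects requests that carry no authorization header.
const requireAuth: Middleware = (req, next) =>
  req.headers["authorization"]
    ? next(req)
    : { status: 401, body: JSON.stringify({ error: "unauthorized" }) };

// [Router]: maps method + path prefix to a handler wrapped in its middleware.
const routeTable: { method: string; prefix: string; handler: Handler }[] = [
  { method: "GET", prefix: "/items/", handler: (req) => requireAuth(req, getItemHandler) },
];

function handle(req: HttpRequest): HttpResponse {
  for (const route of routeTable) {
    if (route.method === req.method && req.path.indexOf(route.prefix) === 0) {
      return route.handler(req);
    }
  }
  return { status: 404, body: JSON.stringify({ error: "no route" }) };
}

const okResponse = handle({ method: "GET", path: "/items/1", headers: { authorization: "token" } });
const deniedResponse = handle({ method: "GET", path: "/items/1", headers: {} });
// okResponse.status is 200, deniedResponse.status is 401
```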

2.3 The build is an entry point too

As important as the runtime entry point is the build entry point: how source becomes a deployable artifact. Skim the build config files (vite.config, webpack.config, tsconfig, Makefile) once and you will see "what form this code actually runs in." Is it transpiled, is it bundled, what environment does it target.


Chapter 3 · Read the shape before the details

Finding the entry point does not mean you dive straight into a function body. First you look at the shape. See the forest, then walk to the trees.

3.1 Directory structure equals the team's mental model

Directory structure is not mere folder housekeeping. It is a fossil of how this team carves up the system in their thinking.

src/
  routes/        ← HTTP boundary (entry point)
  services/      ← business logic
  models/        ← data shapes
  lib/           ← shared utilities
  jobs/          ← background work
  config/        ← configuration

This structure alone tells you "this team thinks in layers." Conversely, if it is split by feature, like features/checkout/ and features/search/, the message is "they think feature-first." Five minutes looking at the directory structure beats an hour reading random files.

3.2 The dependency graph — what leans on what

Next you look at dependencies. There are two layers.

  • External dependencies: package.json, requirements.txt, go.mod — what frameworks and libraries does it stand on. Flag anything you are not familiar with.
  • Internal dependencies: how do modules import each other. Which module is a "hub" (imported from many places), which is a "leaf" (imports nothing else in the repo), and which sits at the top (imported by no one, usually an entry point).

The hub module is the heart of the system. Reading from there gets you the most context at once.
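Finding hub candidates amounts to counting importers. The sketch below does that over a hand-written import graph; the module names are hypothetical, and in practice you would feed it the output of whatever import-listing tool your ecosystem offers.

```typescript
// Hypothetical import graph: module path -> modules it imports.
const importsOf: Record<string, string[]> = {
  "routes/auth": ["services/auth"],
  "routes/orders": ["services/orders", "services/auth"],
  "services/auth": ["db/users", "lib/crypto"],
  "services/orders": ["db/orders", "db/users"],
  "db/users": [],
  "db/orders": [],
  "lib/crypto": [],
};

// Counts, for each module, how many other modules import it.
function importerCounts(graph: Record<string, string[]>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const mod of Object.keys(graph)) counts[mod] = 0;
  for (const mod of Object.keys(graph)) {
    for (const dep of graph[mod]) counts[dep] = (counts[dep] || 0) + 1;
  }
  return counts;
}

// Modules sorted from most-imported (hub candidates) to least; ties broken by name.
function rankByImporters(graph: Record<string, string[]>): string[] {
  const counts = importerCounts(graph);
  return Object.keys(counts).sort((a, b) => {
    if (counts[a] !== counts[b]) return counts[b] - counts[a];
    return a < b ? -1 : a > b ? 1 : 0;
  });
}

const ranking = rankByImporters(importsOf);
// ranking starts with "db/users" and "services/auth" (two importers each)
// and ends with the two route modules, which nothing imports.
```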

3.3 The data model tells the truth

Finally, and most importantly — look at the data model. Schema definitions, ORM models, migration files, type definitions.

Code can lie. Comments go stale. But the data model shows what the system actually deals with. Grasp the core entities like User, Order, Subscription and their relationships, and you have understood half the business. Because what the code does, in the end, is create, read, change, and delete this data.

Read the shape first and, when you later read a detail, you immediately know "where this belongs." Read details with no shape, and every file floats free.


Chapter 4 · Follow one real request end-to-end

Once you have seen the shape, now you follow one real request all the way through. This is called the vertical slice technique.

4.1 What a vertical slice is

Do not try to read the whole codebase horizontally (layer by layer). Instead, pick one feature and follow the path it cuts through every layer of the system. From the very top (request entry) to the very bottom (the DB), and back out to the response.

Example: "What happens when a user logs in?"

Vertical slice: a login request

  POST /login                    ← starts in routes/auth.ts
  authMiddleware (skipped here)  ← middleware/ — ah, the login path bypasses it
  calls loginHandler             ← controllers/auth.ts
  AuthService.login()            ← services/auth.ts — core logic here
     ├─▶ UserRepo.findByEmail()  ← db/users.ts — the query lives here
     ├─▶ password.verify()       ← lib/crypto.ts
     └─▶ Session.create()        ← services/session.ts
  Set-Cookie + 200 response      ← where does serialization happen?

Follow this one slice and you experience, in a single pass, how routing, middleware, controller, service, repository, and utilities connect. And that pattern repeats almost verbatim for other features. Following one deeply beats skimming ten shallowly.

4.2 Tools for following a slice

  • Put a debugger breakpoint at the entry point and send one real request. Step through the stack one frame at a time and the call order shows itself.
  • Add logs: drop a temporary log into each layer and send a request. The order they print in is the flow. (Remove them when done.)
  • Chain "go to definition": from the entry function to the function it calls, to the function that one calls, keep jumping.

4.3 Choosing a good first slice

Do not pick just any feature; pick one that is simple but representative. Login, fetching a single item, a simple form submission — something that passes through the system's main layers but has few edge cases. Save the complex ones like "payment settlement" or "report generation" for second and third.


Chapter 5 · Read tests as documentation

Tests are not only a tool for checking pass or fail. They are the most honest documentation.

5.1 Why tests are documentation

  • Comments go stale, but tests go red when they go stale. As long as CI passes, the tests reflect current behavior for every case they cover.
  • Tests are real-world usage examples of "how do you call this function." The input shape, the expected output, the error cases are all right there.
  • Test names often state intent. it("does not apply the discount when the coupon is expired") — that one line is a business rule.
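Here is a sketch of what such a test looks like and why it reads as documentation. The `priceAfterCoupon` function and its fields are hypothetical, and `rule` is a tiny stand-in for a test runner's `it` so the example runs on its own.

```typescript
// Hypothetical domain code.
interface Coupon { percentOff: number; expiresAt: Date; }

function priceAfterCoupon(priceCents: number, coupon: Coupon, now: Date): number {
  if (coupon.expiresAt.getTime() <= now.getTime()) return priceCents; // expired: no discount
  return Math.round(priceCents * (1 - coupon.percentOff / 100));
}

// Minimal stand-in for a test runner: runs the body and records the name if it passes.
const passedRules: string[] = [];
function rule(name: string, body: () => void): void {
  body(); // throws if an expectation fails
  passedRules.push(name);
}
function expectEqual(actual: number, expected: number): void {
  if (actual !== expected) throw new Error("expected " + expected + ", got " + actual);
}

const today = new Date("2026-01-15T00:00:00Z");

rule("does not apply the discount when the coupon is expired", () => {
  const expired: Coupon = { percentOff: 20, expiresAt: new Date("2026-01-01T00:00:00Z") };
  expectEqual(priceAfterCoupon(10000, expired, today), 10000);
});

rule("applies the discount when the coupon is still valid", () => {
  const valid: Coupon = { percentOff: 20, expiresAt: new Date("2026-02-01T00:00:00Z") };
  expectEqual(priceAfterCoupon(10000, valid, today), 8000);
});
```

Reading just the two names tells you the business rule; reading the bodies tells you the input shape and the exact expected values.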

5.2 The order in which to read tests

1. Look at the list of test files
   → a map of "what is worth testing in this system"

2. Read the unit tests for core entities
   → how User, Order, and the like are intended to behave

3. Read the integration / E2E tests
   → how the layers are intended to work together
   → often the Chapter 4 "vertical slice" written out as code

4. Look at test fixtures / factories
   → what valid data actually looks like

Step 3 is especially good. A well-written integration test shows "the correct way to use this feature" as code. If the slice you traced by hand in Chapter 4 is already written as an integration test, that is a verified map.

5.3 If there are no tests

Plenty of repos are thin on tests. When that is the case, do two things. First, read what tests exist even more carefully — the fewer there are, the more likely each one marks "the part the team fears most." Second, as Chapter 7 discusses, adding the first test yourself makes a good first contribution.


Chapter 6 · Use your tools

Digging through files with the naked eye in an unfamiliar codebase is inefficient. There are tools. Used well, they speed up exploration by an order of magnitude.

6.1 Search — grep / ripgrep

String search is the most basic and the most powerful.

# every place a function/symbol is defined and used
rg "createOrder"

# where routes get registered
rg "router\.(get|post|put|delete)"

# where environment variables are read
rg "process\.env\.|os\.environ"

# TODO/FIXME — things the team knowingly deferred
rg "TODO|FIXME|HACK|XXX"

# the source of a specific error message
rg "User not found"

ripgrep respects .gitignore and is fast. Ninety percent of "where does this string come from?" questions get answered by a single line of rg.
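When you want the distinct variable names rather than every matching line, a few lines of code on top of the same pattern will do. This is a sketch under simple assumptions: it only recognizes `process.env.NAME` and `os.environ["NAME"]` with upper-case names, and the sample source lines are invented.

```typescript
// Extracts distinct names read via process.env.NAME or os.environ["NAME"].
function findEnvVarNames(source: string): string[] {
  const pattern = /process\.env\.([A-Z_][A-Z0-9_]*)|os\.environ\[["']([A-Z_][A-Z0-9_]*)["']\]/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const found = match[1] || match[2];
    if (names.indexOf(found) === -1) names.push(found);
  }
  return names.sort();
}

// Hypothetical source lines, standing in for files in the repo.
const sampleSource = [
  "const url = process.env.DB_URL;",
  "if (process.env.LEGACY_MODE) { enableLegacy(); }",
  'secret = os.environ["API_KEY"]',
  "retry(process.env.DB_URL);",
].join("\n");

const envNames = findEnvVarNames(sampleSource);
// envNames is ["API_KEY", "DB_URL", "LEGACY_MODE"]
```

The resulting list doubles as a checklist for your local environment file.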

6.2 Go to definition / find references / call hierarchy

These are features your IDE or LSP gives you.

| Feature | Question it answers |
| --- | --- |
| Go to definition | Where is this defined? |
| Find references | Who uses this? |
| Call hierarchy | Through what path does code reach this function? |
| Type info | What is this variable's shape? |

The call hierarchy in particular is powerful for tracing the Chapter 4 vertical slice backwards. It shows at once "which HTTP endpoints does this DB query function ultimately get called from."
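Under the hood, that backwards trace is a walk over the call graph in reverse. The sketch below shows the idea on a hand-written graph; the function and route names are hypothetical, and a real IDE builds the graph for you.

```typescript
// Hypothetical call graph: caller -> callees. Handlers are keyed by their route.
const callsOf: Record<string, string[]> = {
  "POST /login": ["AuthService.login"],
  "GET /orders": ["OrderService.list"],
  "POST /orders": ["OrderService.create"],
  "AuthService.login": ["UserRepo.findByEmail"],
  "OrderService.create": ["UserRepo.findByEmail", "OrderRepo.insert"],
  "OrderService.list": ["OrderRepo.findByUser"],
};

// Walks the graph backwards from `target` and returns every route that can reach it.
function routesReaching(target: string, graph: Record<string, string[]>): string[] {
  const seen: Record<string, boolean> = {};
  const queue: string[] = [target];
  const reached: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const caller of Object.keys(graph)) {
      if (graph[caller].indexOf(current) === -1 || seen[caller]) continue;
      seen[caller] = true;
      if (/^(GET|POST|PUT|DELETE) /.test(caller)) {
        reached.push(caller); // an HTTP endpoint: stop climbing here
      } else {
        queue.push(caller); // an inner function: keep climbing
      }
    }
  }
  return reached.sort();
}

const endpoints = routesReaching("UserRepo.findByEmail", callsOf);
// endpoints is ["POST /login", "POST /orders"]
```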

6.3 git as archaeology

git history is the record of "why this code came to be this way." Reading code is seeing the present; reading git is seeing the past.

# the commit that last changed each line (hash, author, date); run git show <hash> for its message
git blame path/to/file.ts

# the change history of this file
git log --oneline -- path/to/file.ts

# how this file came to be this way: each commit's message plus its full diff
git log -p -- path/to/file.ts

# find the commit where a specific string was added/removed
git log -S "featureFlag" --oneline

# the change history of one function only
git log -L :functionName:path/to/file.ts

Use git blame to find the commit behind some strange code, then read its commit message or the linked PR/issue, and "why it was written this way" appears. Half of code that looks strange is due to lost context — a past bug fix, an external API's constraint, an urgent hotfix. git brings that context back.

Code tells you "what," git history tells you "why," tests tell you "how to use it." Read all three together.


Chapter 7 · Make a small change early

Reading alone does not finish your ramp-up. At some point you have to make a change yourself. And as early as possible.

7.1 Why a small change matters

Push one small change all the way through and you prove the entire development loop runs. Edit code, check locally, test, commit, PR, review, CI, merge. Experience that loop running without a snag once, and the real work that follows is far less scary.

And a small change gets feedback fast. You create the chance for a reviewer to say "our team doesn't do it that way" early. Hearing that on a typo fix is a hundred times better than hearing it after building a big feature for two weeks.

7.2 Candidates for a good first change

| Candidate | Why it is good |
| --- | --- |
| Typo / doc fix | Runs the whole loop with no risk |
| Add a missing test | Evidence you understood the code deeply |
| Improve an error message | Small but helps real usage |
| A small refactor (renaming) | Tool-use practice + safe |
| A good first issue label | Something the team picked as "for beginners" |

If there is a good first issue label or its equivalent, start there. If not, a problem you found while setting up in Chapter 1 (a stale README line, say) is a perfect first PR.

7.3 What to care about in your first PR

  • Small: a reviewer can read the whole thing in five minutes.
  • Follow team conventions: look at existing commit messages, existing PR descriptions, existing code style first, then imitate them.
  • State the "why": one paragraph in the PR description on what you changed and why.
  • Check CI yourself: do not wait for a reviewer to point out a failure; be the first to see whether CI is green.

One small PR merged in your first week takes you far further than a first week of only reading.


Chapter 8 · Build a mental map and write it down

Everything you have gathered so far — entry points, the shape, slices, tests, context dug out of git — leaks away if you keep it only in your head. You have to get it out and write it down.

8.1 Your own notes

It need not be grand. One Markdown file, or one wiki page, is enough. What to write:

My ramp-up notes — <service name>

## One-line summary
This service does ___.

## Entry points
- Runtime: src/main.ts
- Build: vite.config.ts
- Main routes: src/routes/

## Core data model
- User ─< Order ─< OrderItem
- Subscription (1:1 with User)

## Slices I have traced
- Login: routes/auth -> services/auth -> db/users
- (next: order creation)

## Still unknown / questions
- [ ] Where does cache invalidation happen?
- [ ] What is the `LEGACY_MODE` flag for?
- [ ] Why is payment split into a separate service?

## Things that confused me (for future me)
- The difference between `utils/` and `lib/`: no agreed rule, just historical

8.2 One diagram

Sometimes a picture is faster than prose. No grand UML is needed; boxes and arrows are enough.

Mental map (unfamiliar codebase, week 1)

   ┌──────────┐     ┌──────────┐     ┌──────────┐
   │  routes/ │────▶│ services/│────▶│   db/    │
   │ (entry)  │     │ (logic)  │     │(persist) │
   └──────────┘     └────┬─────┘     └──────────┘
                         │
                         ▼
                    ┌──────────┐     ┌──────────┐
                    │  jobs/ ? │     │ external │
                    │ (async)  │────▶│  APIs ?  │
                    └──────────┘     └──────────┘

   ?  = places not yet entered: inside jobs/, external retry logic

The act of drawing this picture is itself learning. As you draw, "wait, I cannot draw an arrow here = I do not know this yet" surfaces. Mark the unknown spots with a question mark, and that becomes your list of where to read next.

8.3 Keep the question list alive

The "still unknown" list is a core ramp-up tool. Write things down as they occur to you; erase them when you find the answer. The rate at which this list shrinks is the rate of your ramp-up. And this list is the material for Chapter 9 — when you ask a person.
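If you tick questions off instead of erasing them, the shrink rate becomes something you can count. The sketch below assumes the notes use Markdown task-list items (`- [ ]` for open, `- [x]` for answered); that convention is an assumption of this example, not a requirement of the method.

```typescript
// Counts Markdown task-list items: "- [ ]" is open, "- [x]" is answered.
function questionProgress(notes: string): { open: number; answered: number } {
  let open = 0;
  let answered = 0;
  for (const line of notes.split("\n")) {
    if (/^\s*- \[ \]/.test(line)) open++;
    else if (/^\s*- \[[xX]\]/.test(line)) answered++;
  }
  return { open, answered };
}

// Sample notes: one question already answered, two still open.
const sampleNotes = [
  "## Still unknown / questions",
  "- [x] Where does cache invalidation happen?",
  "- [ ] What is the `LEGACY_MODE` flag for?",
  "- [ ] Why is payment split into a separate service?",
].join("\n");

const progress = questionProgress(sampleNotes);
// progress is { open: 2, answered: 1 }
```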

A map in your head scatters like fog. A written map stays, and you can even hand it to the next person who comes in.


Chapter 9 · What to ask, and whom

Ramping up is not done alone. At some point you have to ask a person. But what, whom, and how you ask decides the outcome.

9.1 Before asking: read the chat history first

Before you ask, invest just five minutes. Search the team chat (Slack and the like), the issue tracker, the PR comments. Your question is probably not new. Someone has already asked, someone has already answered.

  • Error message → search the chat for it verbatim
  • "Why X" → search the related PR/issue
  • An architecture question → search docs/, the wiki, design documents

Do this first, and when you go to a person you can say "I searched but couldn't find it." That one line builds trust.

9.2 A good question vs a bad question

| Bad question | Good question |
| --- | --- |
| "How does this work?" | "I traced the login slice; I saw the session gets created in services/session.ts. But where is expiry handled? I couldn't find it with rg." |
| "My environment setup isn't working" | "make setup fails at step 3 with DB_URL missing. The README doesn't mention it — what value do people usually use locally?" |
| "I don't get this part" | "I'm curious why the LEGACY_MODE flag exists. I traced git blame back to a commit two years ago but couldn't see the context." |

What good questions share: they show how far you got yourself. This saves the other person's time and at the same time makes the answer more precise — because they can answer surgically, "ah, if you got that far you only need to know this."

9.3 Whom to ask

  • Setup / environment problems → someone who joined recently. They went through the same pain just now, so it is fresh.
  • "Why did this come to be" → the person git blame points to, or a long-time contributor to that area.
  • Architecture / the big picture → the team lead or a senior. But gather your questions and bring them in one batch.
  • Stuck work → your mentor or buddy, if one is assigned. If not, your code reviewer.

Do not repeatedly ask the same kind of question to the same person. Gathering questions, bundling them, and bringing them in an organized form is how you respect the other person's time.

9.4 Do not let answers you receive drift away

When someone gives you an answer, write it into your Chapter 8 notes. So you do not ask the same thing twice, and so you can hand it to the next person who comes in. Better yet, put that answer into the README or wiki as a PR — that is one more "good first change" from Chapter 7.

A question is not a weakness. An unprepared question is the weakness. Search for five minutes, show how far you got, record the answer you receive — and a question becomes the fastest learning tool.


Chapter 10 · Ramping up alongside AI agents

Ramping up in 2026 has gained one more tool: the AI coding agent. Used well it is powerful; trusted wrongly it is dangerous.

10.1 What agents are good at

On an unfamiliar codebase, an AI agent gives you a fast guided tour.

| Request | Why the agent is good at it |
| --- | --- |
| "Explain what this module does" | Reads hundreds of lines fast and summarizes |
| "Trace where this function is called" | Does the grep + definition-tracing for you |
| "Follow the login request flow end-to-end" | The Chapter 4 vertical slice, fast |
| "What is the intent of this directory structure?" | Pattern recognition |
| "Show me how to use this unfamiliar library, using an example inside this repo" | Extracts an example from within context |

"Explain" and "trace" in particular are the agent's strengths. They cut exploration that would take a human an hour down to a few minutes.

10.2 But — do not blindly trust the tour

The problem is that the agent's explanation can sound plausible and still be wrong. The agent fills in parts it has not seen with the most plausible default — that is, it hallucinates. And a hallucination sounds just as confident as a real explanation.

So the principle is one: take the agent's tour as a hypothesis, and verify with the code.

The loop of ramping up alongside an agent

  1. Ask the agent
     "Explain the payment flow"
  2. Take the answer as a hypothesis (not as fact)
  3. Verify with the code
     - open the files the agent named, yourself
     - find the functions the agent named with rg
     - if in doubt, confirm with the debugger / logs
  4. If right, write it in your notes / if wrong, ask again

10.3 Ask verifiable questions

When you ask the agent, it is better to ask in a form that is easy to verify.

  • Bad: "Is this system safe?" (unverifiable, high hallucination risk)
  • Good: "List every place createOrder is called, with file paths" (you can cross-check with rg directly)
  • Good: "Point out the lines in this file that make external network calls" (open those lines yourself and confirm)

A specific, location-bearing, cross-checkable question — this is how to use the agent safely. It is exactly the same principle as asking a person a good question in Chapter 9.
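Cross-checking a location-bearing answer is mechanical enough to sketch. Assume the agent listed some file paths and you ran your own search (for example `rg -l "createOrder"`); the function below sorts the two lists into what is confirmed, what the agent claimed without support, and what it missed. The paths are invented for the example.

```typescript
interface CrossCheck {
  confirmed: string[];     // named by the agent and found by your search
  unverified: string[];    // named by the agent, not found: possible hallucination
  missedByAgent: string[]; // found by your search, never mentioned by the agent
}

function crossCheck(claimed: string[], found: string[]): CrossCheck {
  const confirmed = claimed.filter((p) => found.indexOf(p) !== -1).sort();
  const unverified = claimed.filter((p) => found.indexOf(p) === -1).sort();
  const missedByAgent = found.filter((p) => claimed.indexOf(p) === -1).sort();
  return { confirmed, unverified, missedByAgent };
}

// Hypothetical inputs: the agent's answer and the paths your own search returned.
const agentSaid = ["src/routes/orders.ts", "src/jobs/reorder.ts", "src/services/billing.ts"];
const searchFound = ["src/routes/orders.ts", "src/jobs/reorder.ts", "src/services/checkout.ts"];

const verdict = crossCheck(agentSaid, searchFound);
// verdict.confirmed     is ["src/jobs/reorder.ts", "src/routes/orders.ts"]
// verdict.unverified    is ["src/services/billing.ts"]
// verdict.missedByAgent is ["src/services/checkout.ts"]
```

Only the confirmed entries go into your notes as fact; the other two lists are the next things to open by hand.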

10.4 The agent does not replace people

The agent is strong at questions about code. But context not written in the code — like "why was this decision made" — it does not know. Vanished meetings, organizational constraints, future plans — that, you still have to ask a person. The agent does not replace Chapter 9; it sharpens the questions you bring to Chapter 9. What can be answered with code, to the agent; what only people can answer, to the people.

The agent is an excellent tour guide. But a tourist who only copies down what the tour guide says never memorizes the route. Ask, verify, and walk it yourself.


Epilogue — Ramping up is a recurring skill

Standing before an unfamiliar codebase is not a one-time event in a career. When you change jobs, when you transfer teams, when you inherit a service, when you contribute to open source — it keeps recurring.

So ramping up is a skill you learn once and use for life. And the core of that skill is simple. Do not try to read it all from the top. Get it running, find the entry points, see the shape, follow one path all the way through, prove the loop with a small change, write down the map, ask good questions, use AI as a guide but verify it. People who ramp up fast are not smarter — they are people who know this order.

The fast ramp-up checklist

  1. It runs — the app starts locally and the tests pass
  2. You know the entry points — you know where the runtime and build entry points are
  3. You have seen the shape — you skimmed the directory structure, dependencies, data model
  4. You followed a slice — you traced one real request end to end
  5. You read the tests — you confirmed core behavior through tests
  6. You use your tools — rg, go-to-definition, call hierarchy, git blame are worn-in
  7. You made a small change — your first PR is merged (or in review)
  8. You wrote down the map — you have your own notes, a diagram, a question list
  9. You asked well — search first, show how far you got, record the answer
  10. You use AI with verification — you take the agent's tour as a hypothesis and confirm with code

Anti-patterns to avoid

  • Only reading — read code without running it and learning stays a guess
  • Reading it all from the top — read file after file with no shape in mind and each one floats free
  • Reading horizontally — do not read layers separately; read one slice vertically
  • Staying stuck alone for too long — set a time box, and when you cross it, ask with an organized question
  • Asking unprepared — do not ask without having searched and without showing how far you got
  • Deferring the change — do not wait for "once I understand enough"; make a small change early
  • Blindly trusting the AI tour — do not take the agent's plausible explanation as fact
  • A map only in your head — if you do not write it down it scatters, and you cannot hand it to the next person

Next-post teaser

The next post is "Living with legacy code — how to change scary code safely." If this post was "how do you understand unfamiliar code," the next is "when that code you now understand is scary and has no tests, how do you touch it." It covers characterization tests, finding seams, small safe refactors, and the mindset for facing legacy.

Every expert was once the newcomer on that codebase too. The difference is the time it took to ramp up, and that time shrinks with method. Practice the order in this post until it is second nature, and the next unfamiliar repo becomes less unfamiliar.
