LSP & Tree-sitter Ecosystem 2026 — ast-grep / Biome / Helix / Zed / Neovim Treesitter / Per-Language LSPs Deep Dive

Prologue — The two standards of code tooling

The editor wars of the 2010s were simple. "Which app you use" decided almost everything. VS Code vs JetBrains vs Vim vs Emacs. Each one re-implemented its own completion, go-to-definition, refactoring, and syntax highlighting; every new language meant editors x languages worth of work.

The 2026 picture is completely different. The true center of gravity of editors and IDEs has shifted to two protocols/libraries.

| Standard | What it standardized | Who built it |
| --- | --- | --- |
| LSP (Language Server Protocol) | Code "intelligence": completion, go-to-def, rename, diagnostics, format | Microsoft (2016) |
| Tree-sitter | Code "structure": incremental parsing, highlighting, structure-aware ops | Max Brunsfeld (GitHub, 2018+) |

The lesson of the last ten years is clear. Hooking into the standards is far cheaper than building a new editor. That is why new editors like Helix and Zed bake in LSP + Tree-sitter from day one. Neovim shipped an LSP client in the core starting 0.5; VS Code was the reference LSP client from the start.

And a new ecosystem is growing quickly on top of these two standards: ast-grep (Tree-sitter based structural search/rewrite), BiomeJS (a JS toolchain written in Rust), Marksman (Markdown LSP), Semgrep / CodeQL (security and policy search), Comby (language-neutral rewriting). This post surveys all of it in one place.


1. LSP — Microsoft's standard

1.1 Why LSP was needed

Before 2015: every editor needed its own plugin per language.

| Editor | Language | Result |
| --- | --- | --- |
| VS Code | TypeScript | Plugin A |
| Vim | TypeScript | Plugin B (re-implemented) |
| Emacs | TypeScript | Plugin C (re-implemented) |
| Atom | TypeScript | Plugin D (re-implemented) |

With M languages and N editors, you needed M x N plugins. Each plugin re-implemented features like "go to definition" or "rename," with wildly varying quality.

Microsoft's insight: split the code analysis logic into a separate process (the language server), and have the editor and server talk over standard JSON-RPC messages. The math collapses to M + N.
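The arithmetic is easy to check. A minimal sketch, with the counts of 20 languages and 7 editors chosen purely for illustration (they are not from any survey):

```python
def integrations_without_lsp(languages: int, editors: int) -> int:
    """One plugin per (language, editor) pair: M x N."""
    return languages * editors


def integrations_with_lsp(languages: int, editors: int) -> int:
    """One server per language plus one client per editor: M + N."""
    return languages + editors


if __name__ == "__main__":
    m, n = 20, 7  # hypothetical counts, for illustration only
    print(integrations_without_lsp(m, n))  # 140
    print(integrations_with_lsp(m, n))     # 27
```

The gap widens with every language or editor added, which is why the split into client and server paid off so quickly.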

1.2 Core LSP messages

| Message | What it asks |
| --- | --- |
| textDocument/completion | Completion candidates at the cursor |
| textDocument/definition | Where a symbol is defined |
| textDocument/references | All places that reference a symbol |
| textDocument/hover | Type and docs for a symbol under the cursor |
| textDocument/rename | Bulk-rename a symbol |
| textDocument/formatting | Format the document |
| textDocument/codeAction | Quick fixes and refactorings |
| textDocument/publishDiagnostics | Server -> client (errors and warnings) |

Messages are JSON-RPC 2.0, each framed with a Content-Length header. The usual transport is the server's stdin/stdout. TCP sockets are supported, but stdio is the norm.
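To make the wire format concrete, here is a small sketch that builds a textDocument/definition request and frames it the way the LSP base protocol does (a Content-Length header, a blank line, then the JSON body). The function names and the file URI are hypothetical; only the message shape comes from the protocol.

```python
import json


def build_definition_request(request_id: int, uri: str, line: int, character: int) -> dict:
    """Build a textDocument/definition request; LSP positions are zero-based."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }


def frame(message: dict) -> bytes:
    """Serialize a message and prefix it with the Content-Length header."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def unframe(data: bytes) -> dict:
    """Parse one framed message (assumes `data` holds exactly one message)."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical file and cursor position.
    request = build_definition_request(1, "file:///tmp/example.rs", 10, 4)
    print(frame(request).decode("utf-8"))
```

An editor writes bytes like these to the server's stdin and reads framed responses from its stdout; everything else in LSP is a matter of which method and params go in the body.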

1.3 One-line diagram

       ┌──────────────┐           ┌─────────────────────┐
       │   Editor     │  JSON     │   Language Server   │
       │  (client)    │◀─────────▶│   rust-analyzer,    │
       │  VS Code,    │   RPC     │   etc.              │
       │  Helix, Zed, │           │                     │
       │  Neovim ...  │           │  - Parsing          │
       └──────────────┘           │  - Type inference   │
                                  │  - Indexing         │
                                  └─────────────────────┘

The editor side gets thinner. The server side gets thicker. And once you write the server well, every editor benefits.

1.4 LSP's standing in 2026

  • Every serious editor ships an LSP client. VS Code, JetBrains, Helix, Zed, Neovim, Emacs (eglot), Sublime.
  • Every active language has its own LSP server. rust-analyzer, gopls, pyright, typescript-language-server, clangd, and so on.
  • No vendor lock-in. It is now strange to say "this language is only well supported in VS Code."

2. Tree-sitter — Max Brunsfeld's incremental parser

2.1 The limits of regex highlighting

Until the mid-2010s, almost every editor implemented syntax highlighting with regular expressions. TextMate grammars (.tmLanguage) were the de facto standard.

Problems:

  • They lie. Regex is not a real parser. Nested strings, complex interpolation, and macros break it.
  • They are slow. Big files re-match on every keystroke.
  • They are fragile. When code briefly breaks, highlighting collapses entirely.

2.2 What Tree-sitter answered

Tree-sitter, built by Max Brunsfeld while he worked on Atom at GitHub, solves all of these at once.

  1. Incremental. A keystroke re-parses only the changed region, not the entire tree.
  2. Error recovery. When code is temporarily broken, it parses as much as it can and leaves error nodes.
  3. Generalized LR (GLR) — handles ambiguous grammars.
  4. Language-neutral. Grammars are written in a small JavaScript DSL (grammar.js); parsers are generated as C code.
  5. Fast. A real parser, but fast enough for live highlighting.
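Incremental parsing works because the parser is told which byte range changed: Tree-sitter's edit description carries a start_byte, an old_end_byte, and a new_end_byte (plus row/column points, omitted here). The sketch below only computes such a range by trimming the common prefix and suffix of two versions of a buffer. It is not Tree-sitter itself, and the names ByteEdit and changed_range are hypothetical. A real editor already knows the edit from the keystroke and does not need to diff.

```python
from typing import NamedTuple


class ByteEdit(NamedTuple):
    start_byte: int    # first byte that differs
    old_end_byte: int  # end of the replaced span in the old text
    new_end_byte: int  # end of the replacement span in the new text


def changed_range(old: bytes, new: bytes) -> ByteEdit:
    """Smallest single edit that turns `old` into `new`."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while (
        suffix < limit - prefix
        and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]
    ):
        suffix += 1
    return ByteEdit(prefix, len(old) - suffix, len(new) - suffix)


if __name__ == "__main__":
    before = b"fn main() { let x = 1; }"
    after = b"fn main() { let x = 42; }"
    print(changed_range(before, after))
```

Only the syntax nodes that overlap this range need to be re-parsed; the rest of the old tree can be reused, which is what keeps per-keystroke parsing cheap.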

2.3 Uses

  • Syntax highlighting: AST-based, accurate.
  • Code folding: structural, not regex.
  • Structural selection: "select this function", "expand selection by expression."
  • Search and rewriting: query nodes, not text (ast-grep).
  • Indexing beyond highlight: function/class/import extraction.

2.4 Who uses it

| Tool | Tree-sitter use |
| --- | --- |
| Neovim | nvim-treesitter: highlight, fold, structural text objects |
| Helix | Built-in. Highlight, indent, structural motions all on Tree-sitter |
| Zed | Built-in. Highlight, outline, structural search |
| GitHub | Code search, highlight, symbol extraction |
| ast-grep | Structural search/rewrite engine |
| Difftastic | Structure-aware diff |

2.5 Grammar distribution

tree-sitter-rust, tree-sitter-python, tree-sitter-typescript, and so on — almost every popular language ships its grammar as a separate npm/crates package. Supporting a new language means writing a grammar; the highlight queries (.scm) are short.


3. ast-grep (sg) — structural search and rewrite

3.1 Why grep falls short and what ast-grep answers

grep matches text. "All callers of console.log" also catches comments, strings, and docs containing console.log. And "only calls where the first argument is an object" is effectively impossible with regex.

ast-grep (sg) is different. It matches patterns on the AST parsed by Tree-sitter. Patterns look like code in the same language, with $VAR for metavariables.
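The difference between text matching and tree matching can be shown with the standard library alone. In this sketch Python's built-in ast module stands in for Tree-sitter and print plays the role of console.log; both substitutions are assumptions made so the example runs without extra dependencies.

```python
import ast
import re

SOURCE = '''
# print("this is a comment")
message = "call print(x) later"
print(message)
'''


def regex_hits(source: str) -> int:
    """Count textual occurrences of 'print(' the way grep would."""
    return len(re.findall(r"print\(", source))


def ast_hits(source: str) -> int:
    """Count real call expressions whose callee is the name `print`."""
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )


if __name__ == "__main__":
    print(regex_hits(SOURCE))  # 3: the comment, the string, and the call
    print(ast_hits(SOURCE))    # 1: only the actual call
```

The regex reports three hits because it cannot tell a comment or a string from code. The tree walk reports one, because comments never become nodes and the string is a constant, not a call.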

3.2 Who built it, what is new

  • Herrington Darkholme wrote it in Rust.
  • In 2024 it raised an external (seed) round and expanded as the "ast-grep company" — targeting enterprise code migration and policy search.
  • The Rust + Tree-sitter combo is fast and works in any Tree-sitter language.

3.3 Pattern examples

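# Every call to console.log with exactly one argument
sg --pattern 'console.log($A)' --lang typescript

# Calls with any number of arguments
sg --pattern 'console.log($$$ARGS)' --lang typescript

# Only calls whose single argument is an object literal
sg --pattern 'console.log({ $$$ })' --lang typescript

# Rewrite: console.log(x) -> logger.debug(x); --update-all applies every change without prompting
sg --pattern 'console.log($A)' --rewrite 'logger.debug($A)' --lang typescript --update-all

$A matches exactly one AST node, so console.log($A) skips calls with zero or several arguments. $$$ matches zero or more nodes, and a named form such as $$$ARGS can be reused in a rewrite.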
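What a metavariable rewrite does can be imitated on Python code with the standard ast module. This is a sketch of the idea only, not how ast-grep is implemented: print stands in for console.log, the logger name is hypothetical, and the class and function names are made up for the example.

```python
import ast


class PrintToLogger(ast.NodeTransformer):
    """Rewrite print(<one argument>) into logger.debug(<same argument>)."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite nested calls first
        is_print = isinstance(node.func, ast.Name) and node.func.id == "print"
        # Like the single metavariable $A: exactly one positional argument.
        if is_print and len(node.args) == 1 and not node.keywords:
            node.func = ast.Attribute(
                value=ast.Name(id="logger", ctx=ast.Load()),
                attr="debug",
                ctx=ast.Load(),
            )
        return node


def rewrite(source: str) -> str:
    tree = PrintToLogger().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


if __name__ == "__main__":
    print(rewrite('print(x)\nprint(a, b)\nnote = "print(x)"'))
```

The one-argument call is rewritten, the two-argument call is left alone, and the string is untouched. One difference worth knowing: ast.unparse regenerates the whole file and drops comments and original formatting, so a tool meant for real refactoring has to edit only the matched text and leave the rest of the file as it was.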

3.4 sgconfig.yml — codebase policy

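ast-grep can store team policies as YAML rule files. A project-level sgconfig.yml lists the rule directories (ruleDirs), and running sg scan in CI automatically checks rules like "do not use this pattern." A single rule file looks like this: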

id: no-direct-fetch
language: typescript
rule:
  pattern: fetch($URL)
message: "Use apiClient.get instead of fetch"
severity: warning

3.5 What it is best for

  • Large-scale refactoring. "Migrate this API call pattern to the new SDK." A week by hand, minutes with ast-grep.
  • Codebase policy. "Ban console.log in this module", "no fetch inside useEffect."
  • Huge codebase exploration. "Every use of this pattern" — far more accurate than grep.

4. BiomeJS — replacing ESLint + Prettier

4.1 Accumulated pain in the JS toolchain

Since the mid-2010s, two tools every JS dev used:

  • ESLint — code quality linter (written in JS)
  • Prettier — formatter (written in JS)

Both are excellent, but:

  • Slow. 30s for lint, 10s for format on a huge monorepo, every day.
  • Complex config. ESLint config is a maze of plugins, presets, and rules.
  • They run separately. ESLint --fix and Prettier conflict; you need yet another plugin.
  • No type info by default. Core rules see only the AST; type-aware rules need typescript-eslint and make linting slower still.

4.2 Biome's approach

Biome (split off from the original Rome project) bundles:

  • Written in Rust. Often an order of magnitude or more faster than ESLint + Prettier on the same code; the exact factor depends on the codebase and the rules enabled.
  • One binary — lint + format + import organize + code actions in a single tool.
  • Nearly zero config — sensible defaults. One biome.json.
  • Built-in LSP server — editor integration comes for free.

4.3 One-line difference

# Traditional
eslint . --fix && prettier --write .

# Biome (--write replaced the older --apply flag, which Biome v2 no longer accepts)
biome check --write .

4.4 Limits

  • TypeScript-specific rules are still richer in ESLint (closing fast as of 2026).
  • Custom rules: ESLint is more mature. Biome v2 expanded its plugin system.
  • Non-standard syntax like Vue/Svelte: supported, but sometimes shallower than ESLint.

Still, the default for new JS/TS projects is rapidly moving to Biome.


5. Marksman — Markdown LSP

5.1 Does Markdown really need an LSP?

It seems odd at first — isn't Markdown just text? But for any wiki, note system, blog, or documentation site that takes Markdown seriously, you need:

  • Go to definition between files ([foo](./other.md) jumps and opens).
  • Completion for headings, anchors, images.
  • Diagnostics for broken links.
  • Backlink tracking.
  • Bulk-rename of every reference when a heading changes.

That is exactly what LSP is for.

5.2 Marksman's place

Marksman (written in F#) is a dedicated Markdown LSP server. It works in Helix, Zed, Neovim, and VS Code.

  • Supports both [[wiki-link]] and [text](path.md).
  • Heading completion: type [# and get a list of candidates.
  • Surfaces broken links.
  • Renaming a heading propagates to every reference.
  • Workspace symbols: every heading is searchable.
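The broken-link diagnostic is the easiest of these features to picture. The sketch below is not Marksman's implementation (Marksman is written in F# and works on a parsed document); it is a simplified regex scan, with hypothetical names, that resolves relative .md links against a set of known files.

```python
import posixpath
import re

# [text](target) or [text](target#anchor); the anchor is ignored here.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s#]+)(?:#[^)]*)?\)")


def broken_links(doc_path: str, text: str, known_files: set[str]) -> list[tuple[int, str]]:
    """Return (line_number, target) for relative .md links that match no known file."""
    base = posixpath.dirname(doc_path)
    problems = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for target in LINK.findall(line):
            if "://" in target or not target.endswith(".md"):
                continue  # skip external URLs and non-Markdown targets
            resolved = posixpath.normpath(posixpath.join(base, target))
            if resolved not in known_files:
                problems.append((line_number, target))
    return problems


if __name__ == "__main__":
    text = "See [foo](./other.md).\nAlso [gone](missing.md#intro)."
    print(broken_links("notes/index.md", text, {"notes/other.md"}))
```

A language server would publish each finding as a diagnostic with a line and column range, so the editor can underline the link like a compile error.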

5.3 Neighbors

  • zk-lsp — Zettelkasten-style notes.
  • Obsidian — has its own indexer, but when an outside editor opens the same vault, Marksman is the standard.

6. Helix editor — built-in LSP + Tree-sitter

6.1 Helix's design choices

Helix is a modal editor in Rust. A descendant of Vim/Kakoune, with decisive differences.

  • Built-in LSP client. No plugins. Register the server in languages.toml and you are done.
  • Built-in Tree-sitter. Highlight, indent, structural motions, text objects all Tree-sitter-based.
  • Selection-first editing. Flips Vim's verb-object into object-verb — the selection is always visible first.
  • Almost no plugin system (by design). A heavy core means you rarely need plugins.

6.2 What makes it attractive

| Item | Vim/Neovim | Helix |
| --- | --- | --- |
| LSP integration | Plugin (nvim-lspconfig) | Built-in |
| Tree-sitter | Plugin (nvim-treesitter) | Built-in |
| Configuration | Dozens to hundreds of lines | Near zero |
| First-use experience | Steep | Works immediately |
| Extensibility | Unlimited (Lua) | Limited |

"IDE-grade environment in an hour" is Helix's promise. The trade-off is clear — deep customization still belongs to Neovim.

6.3 languages.toml example

[[language]]
name = "rust"
language-servers = ["rust-analyzer"]
auto-format = true

[[language]]
name = "python"
language-servers = ["basedpyright"]

That is nearly all of it.


7. Zed editor — Tree-sitter + LSP + real-time collaboration

7.1 Zed's roots and ambition

Zed was built from scratch by people who created Atom and Electron (Nathan Sobo and others). At its core it is a native editor written in Rust, with collaborative editing and AI integration built in.

  • GPU-accelerated rendering. Very fast as an editor.
  • Built-in Tree-sitter and LSP. Same philosophy as Helix.
  • Real-time collab. Google Docs-style multi-cursor, with voice and screen sharing integrated.
  • AI integration. Chat, inline assist, and agent integration are built-in.
  • Extensions via WebAssembly.

7.2 Who fits

  • Anyone tired of VS Code's weight (Electron, RAM, startup time).
  • Pair-programming teams that use product-grade collaborative editing every day.
  • Anyone who wants AI features in the editor core, not bolted on.

7.3 Trade-offs

  • The extension ecosystem is not as deep as VS Code's.
  • Some monorepo workflows (specific debuggers and test runners) are still richer on VS Code.

8. Neovim integration — nvim-treesitter / lsp-zero / nvim-lspconfig

8.1 Neovim's place

Neovim forked from Vim and brought a Lua runtime, a built-in LSP client (0.5+), and Tree-sitter support (experimental since 0.5) into the core. The result is an editor that allows endless customization.

Key plugins:

| Plugin | Role |
| --- | --- |
| nvim-treesitter | Tree-sitter integration: highlight, indent, text objects |
| nvim-lspconfig | Collected configurations for well-known LSP servers |
| lsp-zero | nvim-lspconfig + mason + cmp pre-integrated ("LSP in one line") |
| mason.nvim | Installer for LSP servers, formatters, linters, debuggers |
| nvim-cmp | Completion UI engine |
| none-ls / null-ls | Exposes non-LSP tools (eslint, prettier, ...) as if they were LSPs |
| telescope.nvim | Fuzzy finder (files, symbols, LSP results) |

8.2 The value of lsp-zero

Neovim's LSP setup is powerful but initially steep. 1) Register the server with lspconfig, 2) install via mason, 3) wire up cmp for completion, 4) set keymaps. lsp-zero does all four with sensible defaults.

local lsp_zero = require('lsp-zero')
lsp_zero.on_attach(function(client, bufnr)
  lsp_zero.default_keymaps({buffer = bufnr})
end)

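require('mason').setup({})
require('mason-lspconfig').setup({
  -- 'ts_ls' is the current nvim-lspconfig name for the server formerly called 'tsserver'
  ensure_installed = { 'rust_analyzer', 'gopls', 'basedpyright', 'ts_ls' },
  -- handlers is the mason-lspconfig v1 API (assumed here together with lsp-zero v3)
  handlers = { lsp_zero.default_setup },
})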

8.3 nvim-treesitter

require('nvim-treesitter.configs').setup({
  ensure_installed = { 'rust', 'go', 'python', 'typescript', 'tsx', 'lua' },
  highlight = { enable = true },
  indent = { enable = true },
})

Turning this on brings a level of accuracy regex highlighting cannot match.


9. Per-language LSP catalog

9.1 Rust — rust-analyzer

  • Officially recommended Rust LSP.
  • Macro expansion, trait inference, lifetime hints shown inline.
  • Eats a non-trivial amount of memory and CPU, but earns it.

9.2 Go — gopls

  • Maintained by the Go team. The de facto standard.
  • Fast formatting integrated with gofmt and goimports.
  • Stabilized quickly after generics arrived.

9.3 Python — pyright / basedpyright / jedi / pylyzer

| Server | Notes |
| --- | --- |
| pyright | A fast type checker by Microsoft, wrapped as LSP |
| basedpyright | A pyright fork. OSS-friendly with stricter defaults |
| jedi-language-server | jedi-based. Strong on code with limited type annotations |
| pylyzer | A fast Rust-written static analyzer + LSP. Early, but promising |

In 2026: the recommendation for new codebases is basedpyright. For legacy untyped code, jedi.

9.4 TypeScript / JavaScript

  • typescript-language-server (long-time standard) — wraps tsserver as LSP.
  • vtsls — an emerging wrapper that behaves more like VS Code's TS extension. In 2026, many Neovim and Helix users move to vtsls.
  • Format and lint are quickly being eaten by BiomeJS.

9.5 C/C++ — clangd

  • The LLVM camp's standard. With a good compile_commands.json, it works on huge codebases.
  • Indexing can be slow, but once built, responses are quick.

9.6 Java — jdtls

  • Exposes Eclipse JDT as LSP. The de facto Java LSP.
  • High memory usage. Deep Maven/Gradle integration.

9.7 Others

| Language | LSP |
| --- | --- |
| Ruby | Solargraph (traditional), ruby-lsp (newer, Shopify) |
| Elixir | elixir-ls, next-ls |
| Haskell | hls (haskell-language-server) |
| Nim | nimlsp |
| OCaml | ocaml-lsp |
| Lua | sumneko-lua / lua-language-server (essential for Neovim configs) |
| Zig | zls |
| Kotlin | kotlin-language-server |
| Swift | sourcekit-lsp |
| Erlang | erlang_ls |
| Bash | bash-language-server |
| YAML | yaml-language-server (Red Hat) |
| JSON | vscode-json-languageserver |
| Terraform | terraform-ls |
| Markdown | Marksman |

9.8 A pattern

  • LSPs built by the language's own team or core community (gopls, rust-analyzer, hls, ocaml-lsp, sourcekit-lsp) are almost always the deepest and most accurate.
  • LSPs built by companies that depend on the language (pyright from Microsoft, ruby-lsp from Shopify) often become standards.
  • Languages with static types see dramatically better LSP quality — type information is the raw material of LSP.

10. Structural search — Comby / Semgrep / CodeQL

Three tools in ast-grep's neighborhood, each in a slightly different niche.

10.1 Comby

  • Language-neutral structural matching. A small custom parser that recognizes "balanced structure" (brackets, braces, quotes).
  • Supporting a new language is cheap — you do not need a real grammar, just the language's token shape.
  • Excellent for small, fast, one-off rewrites.

comby 'foo(:[x])' 'bar(:[x])' file.py
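The "balanced structure" idea behind a hole like :[x] can be sketched in a few lines. This is an illustration of the technique under simplifying assumptions, not Comby's implementation: it only balances parentheses, it ignores parentheses inside string literals (which Comby does account for), and the function name is hypothetical.

```python
def rewrite_calls(source: str, old_name: str, new_name: str) -> str:
    """Rewrite old_name(<balanced text>) to new_name(<same text>)."""
    out = []
    i = 0
    opener = old_name + "("
    while i < len(source):
        start = source.find(opener, i)
        if start == -1:
            out.append(source[i:])
            break
        # Skip matches that are the tail of a longer identifier, e.g. "xfoo(".
        if start > 0 and (source[start - 1].isalnum() or source[start - 1] == "_"):
            out.append(source[i:start + 1])
            i = start + 1
            continue
        depth = 1
        j = start + len(opener)
        while j < len(source) and depth > 0:
            if source[j] == "(":
                depth += 1
            elif source[j] == ")":
                depth -= 1
            j += 1
        if depth != 0:  # unbalanced: leave the rest untouched
            out.append(source[i:])
            break
        hole = source[start + len(opener):j - 1]  # the text :[x] would capture
        out.append(source[i:start] + new_name + "(" + hole + ")")
        i = j
    return "".join(out)


if __name__ == "__main__":
    print(rewrite_calls("foo(a, (b + c)) + xfoo(1) + foo(g(2))", "foo", "bar"))
```

No grammar is involved: the matcher only needs to know which characters open and close a nested region, which is why supporting a new language is so cheap.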

10.2 Semgrep

  • Originally security-focused. Now a general policy search engine.
  • Patterns look like code in the language itself, with metavariables like $X.execute($Y).
  • Huge ruleset (thousands of security rules). A standard to wire into CI.
  • Ideal for company-wide code policy — "ban this API call", "verify this argument pattern."

10.3 CodeQL

  • Owned by GitHub (Microsoft). "Treat code as a database, query with SQL-like syntax."
  • Not just pattern matching — also data-flow analysis.
  • Very powerful but with a steep learning curve; usually a security-team tool.
  • The default engine behind GitHub Code Scanning.
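Why data flow matters is easiest to see on a tiny case where no single-expression pattern would match. The sketch below is a toy taint tracker over straight-line Python code, written with the standard ast module. It is not CodeQL or its QL language, the function names are hypothetical, and it handles only simple `name = expr` assignments in order, with no branches, loops, or function calls between modules.

```python
import ast


def _is_call_to(node: ast.AST, name: str) -> bool:
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


def tainted_sinks(source: str, source_call: str = "input", sink_call: str = "eval") -> list[int]:
    """Line numbers where a value derived from source_call() reaches sink_call()."""
    tainted: set[str] = set()
    findings: list[int] = []

    def is_tainted(expr: ast.AST) -> bool:
        return any(
            (isinstance(n, ast.Name) and n.id in tainted) or _is_call_to(n, source_call)
            for n in ast.walk(expr)
        )

    for stmt in ast.parse(source).body:
        # Check sinks before updating state for this statement's assignment.
        for node in ast.walk(stmt):
            if _is_call_to(node, sink_call) and any(is_tainted(a) for a in node.args):
                findings.append(node.lineno)
        if (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        ):
            name = stmt.targets[0].id
            if is_tainted(stmt.value):
                tainted.add(name)
            else:
                tainted.discard(name)  # reassigned to a clean value
    return findings


if __name__ == "__main__":
    code = 'raw = input()\nexpr = raw.strip()\nsafe = "1 + 1"\neval(expr)\neval(safe)\n'
    print(tainted_sinks(code))
```

A pattern such as eval(input()) finds nothing in that snippet, because the user input passes through two variables before it reaches eval. Following values across assignments is the part that pattern matchers leave to data-flow tools.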

10.4 Side-by-side

| Tool | Paradigm | Strength | Barrier |
| --- | --- | --- | --- |
| ast-grep | Tree-sitter AST match/rewrite | Fast, intuitive, every Tree-sitter language | Low |
| Comby | Balanced-structure match/rewrite | Cheap new-language support, one-off | Low |
| Semgrep | AST patterns + policy rulesets | Big security rulesets, CI-friendly | Medium |
| CodeQL | Data-flow query language | Most powerful analysis, taint tracking | High |

Picking guide:

  • Codify once, the team keeps checking -> Semgrep.
  • Ad hoc large refactoring -> ast-grep.
  • A spot fix in a place or two -> Comby or ast-grep.
  • Deep security analysis -> CodeQL.

11. Tabnine vs language-specific completion

11.1 Two streams

Completion is a blend of two things.

| Kind | Source | Examples |
| --- | --- | --- |
| Language-based | LSP servers. Types and symbol indexes | pyright, rust-analyzer |
| Probabilistic | LLMs / local models | Tabnine, Copilot, Codeium, Cursor Tab |

11.2 Tabnine's place

  • Originally known for local GPT-2-based completion.
  • In 2026, focused on enterprise self-hosting and code-learning isolation. Offers "models trained only on your company's internal code."
  • In a market dominated by Copilot, it staked out the "privacy / on-prem" position.

11.3 LSP and LLM completion should run together

  • LSP completion knows exact types and symbols. No hallucination.
  • LLM completion is strong at long context and pattern generalization. Will guess names it does not know.
  • A good editor (Cursor, Zed, VS Code + Copilot, Neovim + lsp + ai plugins) shows both streams at once and lets the user pick.

Completion = LSP + LLM hybrid is the 2026 default.
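One way to picture the hybrid is a merge step that keeps both streams but labels where each candidate came from. This is a sketch under loud assumptions: no editor named above is known to work exactly this way, real LLM completions are usually multi-token inline suggestions and not bare labels, and every name here is hypothetical.

```python
def merge_completions(
    lsp_items: list[str],
    llm_items: list[str],
    known_symbols: set[str],
) -> list[tuple[str, str]]:
    """Return (label, source) pairs: LSP items first, then LLM items not already listed.

    LLM suggestions whose label is not a known symbol are tagged "llm-unverified".
    """
    merged = [(label, "lsp") for label in dict.fromkeys(lsp_items)]
    seen = {label for label, _ in merged}
    for label in dict.fromkeys(llm_items):  # dict.fromkeys dedupes, keeps order
        if label in seen:
            continue
        source = "llm" if label in known_symbols else "llm-unverified"
        merged.append((label, source))
        seen.add(label)
    return merged


if __name__ == "__main__":
    symbols = {"get_user", "get_users", "get_user_by_id"}
    print(merge_completions(
        ["get_user", "get_users"],
        ["get_users", "get_user_by_id", "fetch_user"],
        symbols,
    ))
```

The point of the tag is the one made above: the LSP stream cannot hallucinate, so anything the model proposes that the symbol index has never seen deserves a visible marker.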


12. The field in Korea and Japan

12.1 Korea — Toss's use of LSP and internal tooling

Toss's blog and platform team frequently mention LSP and Tree-sitter-based tools.

  • On a huge monorepo, structure-aware grep (ast-grep) for API migration and deprecation hunts.
  • vtsls + Biome for fast frontend completion and formatting.
  • Internal ESLint/Biome rules maintained for the in-house design system and SDKs.
  • Security teams running Semgrep rulesets as codebase policy.

The point is editor freedom. Some pick IntelliJ, some Cursor, some Neovim. As long as they sit on standards — LSP, Tree-sitter, Biome, ast-grep — the team's rulesets apply uniformly to all of them.

12.2 Japan — Mercari's Tree-sitter usage

Mercari (メルカリ) frequently mentions Tree-sitter-based tools in its engineering blog.

  • Code search and symbol indexing — precise function/symbol extraction on a huge monorepo.
  • ast-grep / Semgrep rulesets in GitHub Actions as internal code policy.
  • Go + gopls and Rust + rust-analyzer are the in-house standard LSP combo on the backend (payments, search, and so on).
  • On mobile, kotlin-language-server / sourcekit-lsp wired into the build infrastructure.

Other Japanese companies — DeNA, CyberAgent, SmartHR, LINE — show a similar picture. LSP and Tree-sitter have settled in as "obvious infrastructure" in 2026.


13. Who should pick what

13.1 Scenarios

"I want to keep using VS Code."

  • VS Code + the language's official LSP extension (rust-analyzer, gopls, basedpyright, ...).
  • Completion: Copilot, or a move to Cursor (itself a VS Code fork).
  • Lint / format: Biome (JS/TS), or per-language standards (rustfmt, gofmt, ruff).

"VS Code is too heavy. I want a fast native editor."

  • Zed. Built-in LSP, Tree-sitter, collab, AI. The lightest path away.
  • Helix. Even lighter and more keyboard-centric, if you are comfortable with modal editing.

"I want endless customization and a keyboard-first workflow."

  • Neovim + lsp-zero + nvim-treesitter + telescope + nvim-cmp.
  • Steep setup, but once in place, the strongest workflow.

"Large monorepo refactors and policy enforcement."

  • Code search and rewrite: ast-grep.
  • Security and policy: Semgrep.
  • Deep analysis: CodeQL.
  • Quick one-off rewrite: Comby.

"Markdown-heavy work — notes, docs, blogs."

  • Any editor + Marksman as the LSP.
  • Obsidian users can mix in an outside editor + Marksman.

13.2 What everyone should share

| Item | Recommendation |
| --- | --- |
| Completion | Both LSP (language-based) and LLM (probabilistic) |
| Highlighting | Tree-sitter |
| Formatting | The language's standard formatter (rustfmt, gofmt, ruff format, biome) |
| Linting | Per-language standard + Semgrep for company policy |
| Search | ast-grep when precision matters |

14. Pitfalls and anti-patterns

14.1 Common pitfalls

  • Running too many LSP servers. In Neovim, attaching two servers that do the same job to one file (for example typescript-language-server and vtsls) duplicates diagnostics and wastes memory. Pick one per language.
  • Underestimating index cost on a huge monorepo. Ten-minute first indexes are common. Care about CI caching and server restart timing.
  • Formatter conflicts. Running Prettier, Biome, ESLint --fix, and the language's official formatter all together makes your code oscillate on every save. Only one tool gets formatting authority.
  • Tree-sitter grammar version drift. Different editors with different grammar versions can render the same file differently. Not a big deal, but confusing.

14.2 Anti-patterns

  • Refactoring with regex. Past five occurrences, ast-grep / Comby / Semgrep is almost always faster and safer.
  • Hand-organizing imports. Use LSP's organize imports or Biome.
  • Exploring a huge codebase without LSP. Without go-to-def and references, the time you lose is enormous.
  • Checks only in the editor, none in CI. Lint, type check, and policy checks must also live in CI.

15. Closing — living on top of standards

The 2026 code-tooling ecosystem stands on two standards.

  • LSP — code intelligence.
  • Tree-sitter — code structure.

Because of these two:

  • The cost of switching editors collapsed.
  • The cost of supporting a new language collapsed.
  • The cost of expressing company policy as code collapsed.

The ecosystem that grew on top — ast-grep, Biome, Marksman, Helix, Zed, Neovim's lsp-zero/treesitter chain, and the well-built per-language LSPs — is all a direct descendant of these two axes.

One thing to remember when picking: if the tool sits on standards like LSP and Tree-sitter, you can hardly pick wrong. You can move anytime, and your team's rulesets follow you.


References