LSP & Tree-sitter Ecosystem 2026 — ast-grep / Biome / Helix / Zed / Neovim Treesitter / Per-Language LSPs Deep Dive

Prologue — The two standards of code tooling

The editor wars of the 2010s were simple. "Which app you use" decided almost everything. VS Code vs JetBrains vs Vim vs Emacs. Each one re-implemented its own completion, go-to-definition, refactoring, and syntax highlighting; every new language meant editors × languages worth of work.

The 2026 picture is completely different. The true center of gravity of editors and IDEs has shifted to two protocols/libraries.

Standard	What it standardized	Who built it
LSP (Language Server Protocol)	Code "intelligence" — completion, go-to-def, rename, diagnostics, format	Microsoft (2016)
Tree-sitter	Code "structure" — incremental parsing, highlighting, structure-aware ops	Max Brunsfeld (GitHub, 2018+)

The lesson of the last ten years is clear. Hooking into the standards is far cheaper than building a new editor. That is why new editors like Helix and Zed bake in LSP + Tree-sitter from day one. Neovim shipped an LSP client in the core starting 0.5; VS Code was the reference LSP client from the start.

And a new ecosystem is growing fast on top of these two standards: ast-grep (Tree-sitter based structural search/rewrite), BiomeJS (a JS toolchain rewritten in Rust), Marksman (Markdown LSP), Semgrep / CodeQL (security and policy search), Comby (language-neutral rewriting). This post covers all of it in one place.

1. LSP — Microsoft's standard

1.1 Why LSP was needed

Before 2015: every editor needed its own plugin per language.

Editor	Language	Result
VS Code	TypeScript	Plugin A
Vim	TypeScript	Plugin B (re-implemented)
Emacs	TypeScript	Plugin C (re-implemented)
Atom	TypeScript	Plugin D (re-implemented)

With M languages and N editors, you needed M × N plugins. Each plugin re-implemented features like "go to definition" or "rename," with wildly varying quality.

Microsoft's insight: split the code analysis logic into a separate process (the language server), and have the editor and server talk over standard JSON-RPC messages. The math collapses to M + N.
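The arithmetic behind that collapse is easy to check. The sketch below uses illustrative counts (15 languages, 7 editors) that are assumptions for the example, not figures from any survey.

```python
def plugins_without_lsp(languages: int, editors: int) -> int:
    """Every editor needs its own plugin for every language: M x N."""
    return languages * editors


def components_with_lsp(languages: int, editors: int) -> int:
    """One server per language plus one client per editor: M + N."""
    return languages + editors


# Illustrative numbers only.
languages, editors = 15, 7
print(plugins_without_lsp(languages, editors))   # 105
print(components_with_lsp(languages, editors))   # 22
```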

1.2 Core LSP messages

Message	What it asks
`textDocument/completion`	Completion candidates at the cursor
`textDocument/definition`	Where a symbol is defined
`textDocument/references`	All places that reference a symbol
`textDocument/hover`	Type and docs for a symbol under the cursor
`textDocument/rename`	Bulk-rename a symbol
`textDocument/formatting`	Format the document
`textDocument/codeAction`	Quick fixes and refactorings
`textDocument/publishDiagnostics`	Server -> client notification carrying errors and warnings (no reply expected)

The default transport is JSON-RPC 2.0 over stdin/stdout. TCP/socket is supported, but stdio is the norm.
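On the wire, each JSON-RPC message is preceded by a `Content-Length` header and a blank line. The sketch below frames and unframes a `textDocument/definition` request under that scheme. The helper names `encode_message` and `decode_message` and the file URI are made up for illustration; a real client would also handle partial reads from the pipe.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload the way LSP expects it on stdio."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> tuple[dict, bytes]:
    """Parse one framed message; return (payload, leftover bytes)."""
    header, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length")
    if len(rest) < length:
        raise ValueError("incomplete body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# A request carries an "id"; a notification such as
# textDocument/publishDiagnostics would omit it.
definition_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.rs"},  # hypothetical file
        "position": {"line": 10, "character": 4},           # zero-based
    },
}

wire = encode_message(definition_request)
decoded, remaining = decode_message(wire)
```

Because the framing is this small, an LSP client can be written in almost any language that can spawn a process and read its stdout.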

1.3 One-line diagram

       ┌──────────────┐           ┌─────────────────────┐
       │   Editor     │  JSON     │   Language Server   │
       │  (client)    │◀────────▶│   rust-analyzer,    │
       │  VS Code,    │   RPC     │   etc.              │
       │  Helix, Zed, │           │                     │
       │  Neovim ...  │           │  - Parsing          │
       └──────────────┘           │  - Type inference   │
                                  │  - Indexing         │
                                  └─────────────────────┘

The editor side gets thinner. The server side gets thicker. And once you write the server well, every editor benefits.

1.4 LSP's standing in 2026

Every serious editor ships an LSP client. VS Code, JetBrains, Helix, Zed, Neovim, Emacs (eglot), Sublime.
Every active language has its own LSP server. rust-analyzer, gopls, pyright, typescript-language-server, clangd, and so on.
No vendor lock-in. It is now unusual for a language to be well supported only in VS Code.

2. Tree-sitter — Max Brunsfeld's incremental parser

2.1 The limits of regex highlighting

Until the mid-2010s, almost every editor implemented syntax highlighting with regular expressions. TextMate grammars (.tmLanguage) were the de facto standard.

Problems:

They lie. Regex is not a real parser. Nested strings, complex interpolation, and macros break it.
They are slow. Big files re-match on every keystroke.
They are fragile. When code briefly breaks, highlighting collapses entirely.

2.2 What Tree-sitter answered

Tree-sitter, created by Max Brunsfeld while he worked on Atom at GitHub, addresses all three problems at once.

Incremental. A keystroke re-parses only the changed region, not the entire tree.
Error recovery. When code is temporarily broken, it parses as much as it can and leaves error nodes.
Generalized LR (GLR) — handles ambiguous grammars.
Language-neutral. Grammars are written in a small DSL; parsers are generated in C.
Fast. A real parser, but fast enough for live highlighting.
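Incremental parsing works because the editor tells the parser which byte range changed, so unchanged subtrees can be reused. Tree-sitter's edit description includes `start_byte`, `old_end_byte`, and `new_end_byte` (plus row/column points, omitted here). The sketch below derives those three numbers from two versions of a buffer. The `ByteEdit` class and `compute_edit` function are hypothetical helpers; a real editor already knows the edit from the keystroke and does not need to diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteEdit:
    start_byte: int     # first byte that differs
    old_end_byte: int   # end of the replaced span in the old text
    new_end_byte: int   # end of the replacement span in the new text


def compute_edit(old: bytes, new: bytes) -> ByteEdit:
    """Find the smallest changed span via common prefix and suffix."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    return ByteEdit(prefix, len(old) - suffix, len(new) - suffix)


old_source = b"fn main() { let x = 1; }"
new_source = b"fn main() { let x = 42; }"
edit = compute_edit(old_source, new_source)
print(edit)  # ByteEdit(start_byte=20, old_end_byte=21, new_end_byte=22)
```

Only the one-byte span holding the literal changed, so everything outside it is a candidate for reuse on the next parse.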

2.3 Uses

Syntax highlighting: AST-based, accurate.
Code folding: structural, not regex.
Structural selection: "select this function", "expand selection by expression."
Search and rewriting: query nodes, not text (ast-grep, Comby).
Indexing beyond highlight: function/class/import extraction.

2.4 Who uses it

Tool	Tree-sitter use
Neovim	nvim-treesitter — highlight, fold, structural text objects
Helix	Built-in. Highlight, indent, and structural motions all run on Tree-sitter
Zed	Built-in. Highlight, outline, structural search
GitHub	Code search, highlight, symbol extraction
ast-grep	Structural search/rewrite engine
Difftastic	Structure-aware diff

2.5 Grammar distribution

tree-sitter-rust, tree-sitter-python, tree-sitter-typescript, and so on — almost every popular language ships its grammar as a separate npm/crates package. Supporting a new language means writing a grammar; the highlight queries (.scm) are short.

3. ast-grep (sg) — structural search and rewrite

3.1 Why grep falls short and what ast-grep answers

grep matches text. "All callers of console.log" also catches comments, strings, and docs containing console.log. And "only calls where the first argument is an object" is effectively impossible with regex.

ast-grep (sg) is different. It matches patterns on the AST parsed by Tree-sitter. Patterns look like code in the same language, with $VAR for metavariables.
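The gap between text matching and tree matching can be seen with nothing but a parser. The sketch below is an analogue, not ast-grep itself: it uses Python's standard `ast` module instead of Tree-sitter and searches for `print` calls instead of `console.log`. The function names are invented for the example.

```python
import ast

SOURCE = '''
# print(debug) is mentioned in this comment
message = "call print(x) later"
print(message)
print({"level": "info"})
'''


def text_matches(source: str, needle: str) -> list[int]:
    """grep-style: every line containing the text, including comments."""
    return [n for n, line in enumerate(source.splitlines(), start=1)
            if needle in line]


def call_matches(source: str, func_name: str) -> list[int]:
    """Tree-style: only real call nodes whose callee is func_name."""
    tree = ast.parse(source)
    return sorted(
        node.lineno for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == func_name
    )


def dict_arg_call_matches(source: str, func_name: str) -> list[int]:
    """Calls whose first argument is a dict literal."""
    tree = ast.parse(source)
    return sorted(
        node.lineno for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == func_name
        and node.args
        and isinstance(node.args[0], ast.Dict)
    )


print(text_matches(SOURCE, "print("))          # [2, 3, 4, 5]
print(call_matches(SOURCE, "print"))           # [4, 5]
print(dict_arg_call_matches(SOURCE, "print"))  # [5]
```

The text search reports the comment and the string literal; the tree search reports only the two actual calls, and the "first argument is an object" question becomes a one-line node check.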

3.2 Who built it, what is new

Herrington Darkholme wrote it in Rust.
In 2024 it raised an external (seed) round and expanded as the "ast-grep company" — targeting enterprise code migration and policy search.
The Rust + Tree-sitter combo is fast and works in any Tree-sitter language.

3.3 Pattern examples

# Every call to console.log with any argument
sg --pattern 'console.log($A)' --lang typescript

# Only calls whose first arg is an object literal
sg --pattern 'console.log({ $$$ })' --lang typescript

# Rewrite: console.log(x) -> logger.debug(x)
sg --pattern 'console.log($A)' --rewrite 'logger.debug($A)' --lang typescript --update-all

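$A matches a single AST node (here, one argument expression); $$$ matches zero or more nodes. On systems where `sg` is already taken by another command, the same binary is available as `ast-grep`.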
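A structural rewrite follows the same idea: find the node, swap part of it, print the tree back out. The sketch below mimics the `console.log($A)` to `logger.debug($A)` rewrite using Python's standard `ast.NodeTransformer` on `print` calls. It is an analogue under stated assumptions, not how ast-grep is implemented; in particular `ast.unparse` discards comments and original formatting, which a source-preserving tool would keep. The class and function names are made up.

```python
import ast


class PrintToLogger(ast.NodeTransformer):
    """Rewrite print(<one arg>) into logger.debug(<same arg>)."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite nested calls first
        is_single_arg_print = (
            isinstance(node.func, ast.Name)
            and node.func.id == "print"
            and len(node.args) == 1
            and not node.keywords
        )
        if is_single_arg_print:
            node.func = ast.Attribute(
                value=ast.Name(id="logger", ctx=ast.Load()),
                attr="debug",
                ctx=ast.Load(),
            )
        return node


def rewrite(source: str) -> str:
    tree = PrintToLogger().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


before = 'note = "print(x) stays"\nprint(note)\nprint(a, b)\n'
after = rewrite(before)
print(after)
```

The string literal containing `print(x)` is left alone, the single-argument call is rewritten, and the two-argument call does not match the pattern, which is the behavior a text substitution cannot guarantee.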

3.4 sgconfig.yml — codebase policy

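ast-grep can store team policies as YAML rule files. A project-level sgconfig.yml lists the rule directories (ruleDirs), and running `sg scan` in CI then automatically checks rules like "do not use this pattern." A single rule file looks like this: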

id: no-direct-fetch
language: typescript
rule:
  pattern: fetch($URL)
message: "Use apiClient.get instead of fetch"
severity: warning

3.5 What it is best for

Large-scale refactoring. "Migrate this API call pattern to the new SDK." A week by hand, minutes with ast-grep.
Codebase policy. "Ban console.log in this module", "no fetch inside useEffect."
Huge codebase exploration. "Every use of this pattern" — far more accurate than grep.

4. BiomeJS — replacing ESLint + Prettier

4.1 Accumulated pain in the JS toolchain

Since the mid-2010s, two tools every JS dev used:

ESLint — code quality linter (written in JS)
Prettier — formatter (written in JS)

Both are excellent, but:

Slow. On a huge monorepo, think 30s to lint and 10s to format, repeated many times a day.
Complex config. ESLint config is a maze of plugins, presets, and rules.
They run separately. ESLint --fix and Prettier conflict; you need yet another plugin.
Limited type info. Core ESLint rules work at the AST level only; type-aware linting needs typescript-eslint and makes runs slower still.

4.2 Biome's approach

Biome (split off from the original Rome project) bundles:

Written in Rust, and much faster. Biome's own benchmarks report its formatter at roughly 35x the speed of Prettier; the gain on any given repository will vary.
One binary — lint + format + import organize + code actions in a single tool.
Nearly zero config — sensible defaults. One biome.json.
Built-in LSP server — editor integration comes for free.

4.3 One-line difference

# Traditional
eslint . --fix && prettier --write .

# Biome (--write replaced the older --apply flag, which Biome v2 no longer accepts)
biome check --write .

4.4 Limits

TypeScript-specific rules are still richer in ESLint (closing fast as of 2026).
Custom rules: ESLint is more mature. Biome v2 introduced a plugin system based on GritQL patterns, which is still young.
Non-standard syntax like Vue/Svelte: supported, but sometimes shallower than ESLint.

Still, the default for new JS/TS projects is rapidly moving to Biome.

5. Marksman — Markdown LSP

5.1 Does Markdown really need an LSP?

It seems odd at first — isn't Markdown just text? But for any wiki, note system, blog, or documentation site that takes Markdown seriously, you need:

Go to definition between files ([foo](./other.md) jumps and opens).
Completion for headings, anchors, images.
Diagnostics for broken links.
Backlink tracking.
Bulk-rename of every reference when a heading changes.

That is exactly what LSP is for.

5.2 Marksman's place

Marksman (written in F#) is a dedicated Markdown LSP server. It works in Helix, Zed, Neovim, and VS Code.

Supports both [[wiki-link]] and [text](path.md).
Heading completion: type [# and get a list of candidates.
Surfaces broken links.
Renaming a heading propagates to every reference.
Workspace symbols: every heading is searchable.
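The broken-link diagnostic is a good example of why this is LSP-shaped work: it needs a view of the whole workspace, not one file. The sketch below is a simplified stand-in, not Marksman's implementation. It handles only inline `[text](path.md)` links, ignores anchors and wiki-links, and takes the workspace as an in-memory dict; `broken_links` and the sample file names are hypothetical.

```python
import posixpath
import re

# [text](target) with an optional #anchor; external URLs are skipped later.
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s#]+)(?:#[^)\s]*)?\)")


def broken_links(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (file, line number, target) for links to missing files."""
    problems = []
    for path, text in sorted(files.items()):
        base = posixpath.dirname(path)
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in LINK.finditer(line):
                target = match.group(2)
                if "://" in target:
                    continue  # external link, not checked here
                resolved = posixpath.normpath(posixpath.join(base, target))
                if resolved not in files:
                    problems.append((path, lineno, target))
    return problems


workspace = {
    "notes/index.md": "# Index\nSee [setup](./setup.md) and [faq](faq.md#install).\n",
    "notes/setup.md": "# Setup\nBack to [index](index.md). Spec: [LSP](https://example.com/lsp).\n",
}
result = broken_links(workspace)
print(result)  # [('notes/index.md', 2, 'faq.md')]
```

A language server would publish each tuple as a diagnostic on the offending line, which is how the editor ends up underlining the dead link.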

5.3 Neighbors

zk-lsp — Zettelkasten-style notes.
Obsidian — has its own indexer, but when an outside editor opens the same vault, Marksman is the standard.

6. Helix editor — built-in LSP + Tree-sitter

6.1 Helix's design choices

Helix is a modal editor in Rust. A descendant of Vim/Kakoune, with decisive differences.

Built-in LSP client. No plugins. Register the server in languages.toml and you are done.
Built-in Tree-sitter. Highlight, indent, structural motions, and text objects are all Tree-sitter-based.
Selection-first editing. Flips Vim's verb-object into object-verb — the selection is always visible first.
Almost no plugin system (by design). A heavy core means you rarely need plugins.

6.2 What makes it attractive

Item	Vim/Neovim	Helix
LSP integration	Plugin (nvim-lspconfig)	Built-in
Tree-sitter	Plugin (nvim-treesitter)	Built-in
Configuration	Dozens to hundreds of lines	Near zero
First-use experience	Steep	Works immediately
Extensibility	Unlimited (Lua)	Limited

"IDE-grade environment in an hour" is Helix's promise. The trade-off is clear — deep customization still belongs to Neovim.

6.3 `languages.toml` example

[[language]]
name = "rust"
language-servers = ["rust-analyzer"]
auto-format = true

[[language]]
name = "python"
language-servers = ["basedpyright"]

That is nearly all of it.

7. Zed editor — Tree-sitter + LSP + real-time collaboration

7.1 Zed's roots and ambition

Zed was rebuilt from scratch by a group of Atom and Electron co-creators (Nathan Sobo and others). At its core it is a native editor written in Rust, with collaborative editing and AI integration built in.

GPU-accelerated rendering. Very fast as an editor.
Built-in Tree-sitter and LSP. Same philosophy as Helix.
Real-time collab. Google Docs-style multi-cursor, with voice and screen sharing integrated.
AI integration. Chat, inline assist, and agent integration are built-in.
Extensions via WebAssembly.

7.2 Who fits

Anyone tired of VS Code's weight (Electron, RAM, startup time).
Pair-programming teams that use product-grade collaborative editing every day.
Anyone who wants AI features in the editor core, not bolted on.

7.3 Trade-offs

The extension ecosystem is not as deep as VS Code's.
Some monorepo workflows (specific debuggers and test runners) are still richer on VS Code.

8. Neovim integration — nvim-treesitter / lsp-zero / nvim-lspconfig

8.1 Neovim's place

Neovim forked from Vim and brought a built-in LSP client (0.5+), a Lua runtime, and Tree-sitter support (experimental from 0.5, maturing in later releases) into the core. The result is an editor that allows endless customization.

Key plugins:

Plugin	Role
nvim-treesitter	Tree-sitter integration — highlight, indent, text objects
nvim-lspconfig	Collected configurations for well-known LSP servers
lsp-zero	nvim-lspconfig + mason + cmp pre-integrated — "LSP in one line"
mason.nvim	Installer for LSP servers, formatters, linters, debuggers
nvim-cmp	Completion UI engine
none-ls (community fork of the archived null-ls)	Exposes non-LSP tools (eslint, prettier, ...) as if they were LSPs
telescope.nvim	Fuzzy finder (files, symbols, LSP results)

8.2 The value of lsp-zero

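Neovim's LSP setup is powerful but initially steep. You have to 1) register the server with lspconfig, 2) install it via mason, 3) wire up cmp for completion, and 4) set keymaps. lsp-zero does all four with sensible defaults. On Neovim 0.11 and later, the built-in vim.lsp.config() and vim.lsp.enable() functions cover much of step 1 without a plugin.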

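-- Assumes lsp-zero v3.x with mason-lspconfig v1.x; newer releases changed parts of this API.
local lsp_zero = require('lsp-zero')

-- Apply lsp-zero's default keymaps whenever a server attaches to a buffer.
lsp_zero.on_attach(function(client, bufnr)
  lsp_zero.default_keymaps({buffer = bufnr})
end)

require('mason').setup({})
require('mason-lspconfig').setup({
  -- 'ts_ls' is the nvim-lspconfig name that replaced 'tsserver'.
  ensure_installed = { 'rust_analyzer', 'gopls', 'basedpyright', 'ts_ls' },
  -- Set up every installed server with lsp-zero's defaults.
  handlers = { lsp_zero.default_setup },
})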

8.3 nvim-treesitter

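-- Assumes the nvim-treesitter master branch; the rewritten main branch uses a different setup API.
require('nvim-treesitter.configs').setup({
  -- Parsers to install automatically.
  ensure_installed = { 'rust', 'go', 'python', 'typescript', 'tsx', 'lua' },
  highlight = { enable = true },
  indent = { enable = true },
})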

Turning this on brings a level of accuracy regex highlighting cannot match.

9. Per-language LSP catalog

9.1 Rust — rust-analyzer

Officially recommended Rust LSP.
Macro expansion, trait inference, lifetime hints shown inline.
Eats a non-trivial amount of memory and CPU, but earns it.

9.2 Go — gopls

Maintained by the Go team. The de facto standard.
Fast formatting integrated with gofmt and goimports.
Stabilized quickly after generics arrived.

9.3 Python — pyright / basedpyright / jedi / pylyzer

Server	Notes
pyright	A fast type checker by Microsoft, wrapped as LSP
basedpyright	A pyright fork. OSS-friendly with stricter defaults
jedi-language-server	jedi-based. Strong on code with limited type annotations
pylyzer	A fast Rust-written static analyzer + LSP. Early, but promising

In 2026: the recommendation for new codebases is basedpyright. For legacy untyped code, jedi.

9.4 TypeScript / JavaScript

typescript-language-server (long-time standard) — wraps tsserver as LSP.
vtsls — an emerging wrapper that behaves more like VS Code's TS extension. In 2026, many Neovim and Helix users move to vtsls.
Format and lint are quickly being eaten by BiomeJS.

9.5 C/C++ — clangd

The LLVM camp's standard. With a good compile_commands.json, it works on huge codebases.
Indexing can be slow, but once built, responses are quick.

9.6 Java — jdtls

Exposes Eclipse JDT as LSP. The de facto Java LSP.
High memory usage. Deep Maven/Gradle integration.

9.7 Others

Language	LSP
Ruby	Solargraph (traditional), ruby-lsp (newer, Shopify)
Elixir	elixir-ls, next-ls
Haskell	hls (haskell-language-server)
Nim	nimlsp
OCaml	ocaml-lsp
Lua	sumneko-lua / lua-language-server (essential for Neovim configs)
Zig	zls
Kotlin	kotlin-language-server
Swift	sourcekit-lsp
Erlang	erlang_ls
Bash	bash-language-server
YAML	yaml-language-server (Red Hat)
JSON	vscode-json-languageserver
Terraform	terraform-ls
Markdown	Marksman

9.8 A pattern

LSPs built within the language's own project (gopls, rust-analyzer, sourcekit-lsp, hls, ocaml-lsp) are almost always the deepest and most accurate.
LSPs built by companies outside the core language team (pyright from Microsoft, ruby-lsp from Shopify) often become standards too.
Languages with static types see dramatically better LSP quality — type information is the raw material of LSP.

10. Structural search — Comby / Semgrep / CodeQL

Three tools in ast-grep's neighborhood, each in a slightly different niche.

10.1 Comby

Language-neutral structural matching. A small custom parser that recognizes "balanced structure" (brackets, braces, quotes).
Supporting a new language is cheap — you do not need a real grammar, just the language's token shape.
Excellent for small, fast, one-off rewrites.

comby 'foo(:[x])' 'bar(:[x])' file.py
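What makes the `:[x]` hole different from a regex group is that it respects balanced delimiters. The sketch below imitates that one idea for the `foo(:[x])` to `bar(:[x])` rewrite. It is a toy, not Comby's matcher: it ignores string literals and comments, and it does not rewrite matches nested inside a hole. `find_balanced_end` and `rewrite_calls` are hypothetical names.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def find_balanced_end(text: str, open_index: int) -> int:
    """Index of the delimiter that closes the one at open_index, or -1."""
    stack = []
    for i in range(open_index, len(text)):
        ch = text[i]
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            if not stack or stack.pop() != ch:
                return -1  # mismatched delimiter
            if not stack:
                return i
    return -1


def rewrite_calls(text: str, old_name: str, new_name: str) -> str:
    """Rewrite old_name(<balanced hole>) to new_name(<same hole>)."""
    out, pos, needle = [], 0, old_name + "("
    while True:
        start = text.find(needle, pos)
        if start == -1:
            out.append(text[pos:])
            return "".join(out)
        open_index = start + len(old_name)
        part_of_longer_name = start > 0 and (
            text[start - 1].isalnum() or text[start - 1] == "_")
        close_index = -1 if part_of_longer_name else find_balanced_end(text, open_index)
        if close_index == -1:
            # Not a match: copy through the "(" and keep scanning.
            out.append(text[pos:open_index + 1])
            pos = open_index + 1
            continue
        hole = text[open_index + 1:close_index]
        out.append(text[pos:start] + f"{new_name}({hole})")
        pos = close_index + 1


rewritten = rewrite_calls("y = foo(g(1, 2), [3]) + barfoo(4)", "foo", "bar")
print(rewritten)  # y = bar(g(1, 2), [3]) + barfoo(4)
```

A lazy regex such as `foo\((.*?)\)` would stop at the first closing parenthesis and capture `g(1, 2`; tracking delimiter depth is what lets the hole swallow the whole argument list.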

10.2 Semgrep

Originally security-focused. Now a general policy search engine.
Patterns look like code in the language itself, with metavariables like $X.execute($Y).
Huge ruleset (thousands of security rules). A standard to wire into CI.
Ideal for company-wide code policy — "ban this API call", "verify this argument pattern."

10.3 CodeQL

Owned by GitHub (Microsoft). "Treat code as a database, query with SQL-like syntax."
Not just pattern matching — also data-flow analysis.
Very powerful but with a steep learning curve; usually a security-team tool.
The default engine behind GitHub Code Scanning.
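The difference between pattern matching and data-flow analysis shows up even in a toy. The sketch below is not CodeQL or its query language; it is a deliberately small taint tracker over straight-line, top-level Python statements, using the standard `ast` module. The source and sink sets and the function names are assumptions chosen for the example.

```python
import ast

SOURCES = {"input"}        # hypothetical: calls that return untrusted data
SINKS = {"eval", "exec"}   # hypothetical: calls that must not receive it


def is_call_to(node: ast.AST, names: set[str]) -> bool:
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in names)


def is_tainted(expr: ast.AST, tainted: set[str]) -> bool:
    """True if expr reads a source call or a tainted variable."""
    for node in ast.walk(expr):
        if is_call_to(node, SOURCES):
            return True
        if isinstance(node, ast.Name) and node.id in tainted:
            return True
    return False


def find_tainted_sinks(source: str) -> list[int]:
    """Line numbers where untrusted data reaches a sink call."""
    tainted: set[str] = set()
    findings = []
    for stmt in ast.parse(source).body:  # no branches or functions handled
        for node in ast.walk(stmt):
            if is_call_to(node, SINKS) and any(
                    is_tainted(arg, tainted) for arg in node.args):
                findings.append(node.lineno)
        if isinstance(stmt, ast.Assign):
            dirty = is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if dirty:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
    return findings


PROGRAM = '''user = input()
cmd = "echo " + user
safe = "echo hi"
eval(cmd)
eval(safe)
'''
print(find_tainted_sinks(PROGRAM))  # [4]
```

To a pure pattern matcher, `eval(cmd)` and `eval(safe)` are the same shape. Only by following the assignments does the analysis learn that one of them carries user input, and a production engine does this across functions, files, and branches.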

10.4 Side-by-side

Tool	Paradigm	Strength	Barrier
ast-grep	Tree-sitter AST match/rewrite	Fast, intuitive, works with any language that has a Tree-sitter grammar	Low
Comby	Balanced-structure match/rewrite	Cheap new-language support, one-off	Low
Semgrep	AST patterns + policy rulesets	Big security rulesets, CI-friendly	Medium
CodeQL	Data-flow query language	Most powerful analysis, taint tracking	High

Picking guide:

Codify once, the team keeps checking -> Semgrep.
Ad hoc large refactoring -> ast-grep.
A spot fix in a place or two -> Comby or ast-grep.
Deep security analysis -> CodeQL.

11. Tabnine vs language-specific completion

11.1 Two streams

Completion is a blend of two things.

Kind	Source	Examples
Language-based	LSP servers. Types and symbol indexes	pyright, rust-analyzer
Probabilistic	LLMs / local models	Tabnine, Copilot, Codeium, Cursor Tab

11.2 Tabnine's place

Originally known for local GPT-2-based completion.
In 2026, focused on enterprise self-hosting and keeping each customer's code isolated from shared training. Offers "models trained only on your company's internal code."
In a market dominated by Copilot, it staked out the "privacy / on-prem" position.

11.3 LSP and LLM completion should run together

LSP completion knows exact types and symbols. No hallucination.
LLM completion is strong at long context and pattern generalization. Will guess names it does not know.
A good editor (Cursor, Zed, VS Code + Copilot, Neovim + lsp + ai plugins) shows both streams at once and lets the user pick.

Completion = LSP + LLM hybrid is the 2026 default.
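One way to picture the hybrid is a merge step that keeps LSP results first and flags LLM guesses the symbol index cannot confirm. The sketch below is a hypothetical ranking policy, not the algorithm of any editor named here; `Candidate`, `merge_completions`, and the sample symbol names are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    label: str
    source: str     # "lsp" or "llm"
    verified: bool  # True when the symbol index knows this name


def merge_completions(lsp_items: list[str],
                      llm_items: list[str],
                      known_symbols: set[str]) -> list[Candidate]:
    """LSP results first, then verified LLM guesses, then unverified ones."""
    merged = [Candidate(label, "lsp", True) for label in lsp_items]
    seen = set(lsp_items)
    verified_llm, unverified_llm = [], []
    for label in llm_items:
        if label in seen:
            continue  # drop duplicates of what the LSP already offered
        seen.add(label)
        candidate = Candidate(label, "llm", label in known_symbols)
        bucket = verified_llm if candidate.verified else unverified_llm
        bucket.append(candidate)
    return merged + verified_llm + unverified_llm


merged = merge_completions(
    lsp_items=["read_to_string", "read_dir"],
    llm_items=["read_file_fast", "read_to_string", "read_link"],
    known_symbols={"read_to_string", "read_dir", "read_link"},
)
labels = [c.label for c in merged]
print(labels)  # ['read_to_string', 'read_dir', 'read_link', 'read_file_fast']
```

Here `read_file_fast` is the kind of plausible but nonexistent name an LLM may produce; keeping it last and marked unverified lets the user still pick it while seeing that the index does not know it.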

12. The field in Korea and Japan

12.1 Korea — Toss's use of LSP and internal tooling

Toss's blog and platform team frequently mention LSP and Tree-sitter-based tools.

On a huge monorepo, syntax-aware grep (ast-grep) for API migration and deprecation hunts.
vtsls + Biome for fast frontend completion and formatting.
Internal ESLint/Biome rules maintained for the in-house design system and SDKs.
Security teams running Semgrep rulesets as codebase policy.

The point is editor freedom. Some pick IntelliJ, some Cursor, some Neovim. As long as they sit on standards — LSP, Tree-sitter, Biome, ast-grep — the team's rulesets apply uniformly to all of them.

12.2 Japan — Mercari's Tree-sitter usage

Mercari (メルカリ) frequently mentions Tree-sitter-based tools in its engineering blog.

Code search and symbol indexing — precise function/symbol extraction on a huge monorepo.
ast-grep / Semgrep rulesets in GitHub Actions as internal code policy.
Go + gopls and Rust + rust-analyzer are the in-house standard LSP combo on the backend (payments, search, and so on).
On mobile, kotlin-language-server / sourcekit-lsp wired into the build infrastructure.

Other Japanese companies — DeNA, CyberAgent, SmartHR, LINE — show a similar picture. LSP and Tree-sitter have settled in as "obvious infrastructure" in 2026.

13. Who should pick what

13.1 Scenarios

"I want to keep using VS Code."

VS Code + the language's official LSP extension (rust-analyzer, gopls, basedpyright, ...).
Completion: Copilot, or switch to Cursor (itself a VS Code fork).
Lint / format: Biome (JS/TS), or per-language standards (rustfmt, gofmt, ruff).

"VS Code is too heavy. I want a fast native editor."

Zed. Built-in LSP, Tree-sitter, collab, AI. The lightest path away.
Helix. Even lighter and more keyboard-centric, if you are comfortable with modal editing.

"I want endless customization and a keyboard-first workflow."

Neovim + lsp-zero + nvim-treesitter + telescope + nvim-cmp.
Steep setup, but once in place, the strongest workflow.

"Large monorepo refactors and policy enforcement."

Code search and rewrite: ast-grep.
Security and policy: Semgrep.
Deep analysis: CodeQL.
Quick one-off rewrite: Comby.

"Markdown-heavy work — notes, docs, blogs."

Any editor + Marksman as the LSP.
Obsidian users can mix in an outside editor + Marksman.

Item	Recommendation
Completion	Both LSP (language-based) and LLM (probabilistic)
Highlighting	Tree-sitter
Formatting	The language's official formatter (rustfmt, gofmt, ruff format, biome)
Linting	Per-language standard + Semgrep for company policy
Search	ast-grep when precision matters

14. Pitfalls and anti-patterns

14.1 Common pitfalls

Running too many LSP servers. In Neovim, attaching two servers for the same language to one file (for example typescript-language-server and vtsls) duplicates diagnostics and doubles memory use. Pick one per language.
Underestimating index cost on a huge monorepo. Ten-minute first indexes are common. Plan for CI caching and for when servers get restarted.
Formatter conflicts. Running Prettier, Biome, ESLint --fix, and the language's official formatter all together makes your code oscillate on every save. Only one tool gets formatting authority.
Tree-sitter grammar version drift. Different editors with different grammar versions can render the same file differently. Not a big deal, but confusing.

14.2 Anti-patterns

Refactoring with regex. Past five occurrences, ast-grep / Comby / Semgrep is almost always faster and safer.
Hand-organizing imports. Use LSP's organize imports or Biome.
Exploring a huge codebase without LSP. Without go-to-def and references, the time you lose is enormous.
Checks that run only in the editor, none in CI. LSP diagnostics help locally, but lint, type check, and policy checks must also run in CI.

15. Closing — living on top of standards

The 2026 code-tooling ecosystem stands on two standards.

LSP — code intelligence.
Tree-sitter — code structure.

Because of these two:

The cost of switching editors collapsed.
The cost of supporting a new language collapsed.
The cost of expressing company policy as code collapsed.

The ecosystem that grew on top — ast-grep, Biome, Marksman, Helix, Zed, Neovim's lsp-zero/treesitter chain, and the well-built per-language LSPs — is all a direct descendant of these two axes.

One thing to remember when picking: if the tool sits on standards like LSP and Tree-sitter, you can hardly pick wrong. You can move anytime, and your team's rulesets follow you.

References

LSP specification — https://microsoft.github.io/language-server-protocol/
Tree-sitter — https://tree-sitter.github.io/tree-sitter/
ast-grep (sg) — https://ast-grep.github.io/
BiomeJS — https://biomejs.dev/
Marksman — https://github.com/artempyanykh/marksman
Helix editor — https://helix-editor.com/
Zed editor — https://zed.dev/
Neovim — https://neovim.io/
nvim-treesitter — https://github.com/nvim-treesitter/nvim-treesitter
nvim-lspconfig — https://github.com/neovim/nvim-lspconfig
lsp-zero.nvim — https://github.com/VonHeikemen/lsp-zero.nvim
mason.nvim — https://github.com/williamboman/mason.nvim
rust-analyzer — https://rust-analyzer.github.io/
gopls — https://pkg.go.dev/golang.org/x/tools/gopls
pyright — https://github.com/microsoft/pyright
basedpyright — https://github.com/DetachHead/basedpyright
jedi-language-server — https://github.com/pappasam/jedi-language-server
pylyzer — https://github.com/mtshiba/pylyzer
typescript-language-server — https://github.com/typescript-language-server/typescript-language-server
vtsls — https://github.com/yioneko/vtsls
clangd — https://clangd.llvm.org/
jdtls — https://github.com/eclipse-jdtls/eclipse.jdt.ls
Solargraph — https://solargraph.org/
ruby-lsp — https://github.com/Shopify/ruby-lsp
elixir-ls — https://github.com/elixir-lsp/elixir-ls
haskell-language-server — https://github.com/haskell/haskell-language-server
nimlsp — https://github.com/PMunch/nimlsp
ocaml-lsp — https://github.com/ocaml/ocaml-lsp
sumneko/lua-language-server — https://github.com/LuaLS/lua-language-server
zls — https://github.com/zigtools/zls
Comby — https://comby.dev/
Semgrep — https://semgrep.dev/
CodeQL — https://codeql.github.com/
GitLens (VS Code) — https://github.com/gitkraken/vscode-gitlens
Difftastic — https://github.com/Wilfred/difftastic
Tabnine — https://www.tabnine.com/
Toss Tech Blog — https://toss.tech/
Mercari Engineering Blog — https://engineering.mercari.com/