Mirror of https://github.com/clangd/clangd.git (synced 2025-04-15 21:32:11 +00:00)
Commit 47d552ead0 (parent b2088bba17): Copy detailed design docs from clangd-www
design/code-walkthrough.md (Normal file, 335 lines added)
@@ -0,0 +1,335 @@
# clangd code walkthrough

This describes the critical parts of clangd and roughly what they do.
It may get out of date from time to time; if it has, please
[file a bug](https://github.com/llvm/clangd-www/issues)!
It mostly starts at the outside and works its way inward.

The clangd code lives in the [llvm-project] repository under
[clang-tools-extra/clangd][clangd]. We'll also mention some dependencies in
other parts of llvm-project (such as clang).

Links below point to the [woboq] code browser. It has nice cross-referenced navigation, but may show
a slightly outdated version of the code.

## Starting up and managing files

### Entry point and JSON-RPC

The `clangd` binary itself has its [main()] entrypoint in `tool/ClangdMain.cpp`.
This mostly parses flags and hooks `ClangdLSPServer` up to its dependencies.

One vital dependency is [JSONTransport] which speaks the JSON-RPC protocol
over stdin/stdout. LSP is layered on top of this [Transport] abstraction.
(There's also an Apple XPC transport in the [xpc/] directory.)
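
To make the wire format concrete, here is a minimal sketch of the LSP
base-protocol framing: a `Content-Length` header, a blank line, then the JSON
payload. This is not clangd's `JSONTransport`; the names `frameMessage` and
`readMessage` are made up for illustration, and only the `Content-Length`
header is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Wraps a JSON-RPC payload in the LSP base-protocol framing:
// a Content-Length header, a blank line, then the payload bytes.
std::string frameMessage(const std::string &Body) {
  return "Content-Length: " + std::to_string(Body.size()) + "\r\n\r\n" + Body;
}

// Extracts one framed payload from the front of Buffer and removes it.
// Returns std::nullopt if Buffer does not yet hold a complete message.
// (Simplified: assumes Content-Length is the only header.)
std::optional<std::string> readMessage(std::string &Buffer) {
  const std::string Prefix = "Content-Length: ";
  const std::size_t HeaderEnd = Buffer.find("\r\n\r\n");
  if (HeaderEnd == std::string::npos ||
      Buffer.compare(0, Prefix.size(), Prefix) != 0)
    return std::nullopt;
  const std::size_t Length = std::stoul(
      Buffer.substr(Prefix.size(), HeaderEnd - Prefix.size()));
  const std::size_t BodyStart = HeaderEnd + 4; // skip "\r\n\r\n"
  if (Buffer.size() < BodyStart + Length)
    return std::nullopt; // payload not fully received yet
  std::string Body = Buffer.substr(BodyStart, Length);
  Buffer.erase(0, BodyStart + Length);
  return Body;
}
```

A transport would append incoming bytes to a buffer and call something like
`readMessage` in a loop, handing each payload to the layer above.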

We call `ClangdLSPServer::run()` to start the loop, and it synchronously
processes messages until the client disconnects. Calls to the large
non-threadsafe singletons (`ClangdLSPServer`, `ClangdServer`, `TUScheduler`)
all happen on the main thread.
See [threads and request handling].

### Language Server Protocol

[ClangdLSPServer] handles the LSP protocol details. Incoming requests are routed
to some method on this class using a lookup table, and then implemented
by dispatching them to the contained `ClangdServer`.
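
The lookup-table idea can be sketched as follows. This is an illustration
only: `MiniDispatcher`, `bind` and `dispatch` are hypothetical names, and the
handlers take and return plain strings rather than the structs from
`Protocol.h`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the real server: maps LSP method names to handlers.
class MiniDispatcher {
public:
  using Handler = std::function<std::string(const std::string &Params)>;

  void bind(const std::string &Method, Handler H) {
    Handlers[Method] = std::move(H);
  }

  // Looks up the handler for Method; unknown methods yield an error string.
  std::string dispatch(const std::string &Method,
                       const std::string &Params) const {
    auto It = Handlers.find(Method);
    if (It == Handlers.end())
      return "error: method not found: " + Method;
    return It->second(Params);
  }

private:
  std::map<std::string, Handler> Handlers;
};

// Builds a dispatcher with two illustrative handlers.
MiniDispatcher makeDispatcher() {
  MiniDispatcher D;
  D.bind("textDocument/hover",
         [](const std::string &P) { return "hover(" + P + ")"; });
  D.bind("textDocument/definition",
         [](const std::string &P) { return "definition(" + P + ")"; });
  return D;
}
```

Keeping the table separate from the handlers means adding a request type is a
one-line registration rather than another branch in a large switch.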

The incoming JSON requests are mapped onto structs defined in [Protocol.h].
In the simplest cases these are just forwarded to the appropriate method on
`ClangdServer` - we use the LSP structs as vocabulary types for most things.

In other cases there's some gap between LSP and what seems to be a sensible C++
API, so `ClangdLSPServer` has some real work to do.

### ClangdServer and TUScheduler

The [ClangdServer] class is best thought of as the C++ API to clangd.
Features tend to be implemented as stateless, synchronous functions ("give me
hover information from this AST at offset 25"). ClangdServer exposes them as
stateful, asynchronous functions ("compute hover information for the latest
version of Foo.cpp at offset 25, call back when done") which is the LSP model.

[TUScheduler] is responsible for keeping track of the latest version of each
file, building and caching ASTs and preambles as inputs, and providing threads
to run requests on in an appropriate sequence. (More details in
[threads and request handling].)
It also pushes certain events to `ClangdServer` via [ParsingCallbacks], to
allow emitting diagnostics and indexing ASTs.

`ClangdServer` is fairly mechanical for the most part. The features are
implemented in various other files, and the scheduling and AST building is
done by `TUScheduler`, so largely it just binds these together.
`TUScheduler` doesn't know about particular features (except diagnostics).

### Compile commands

Like other clang-based tools, clangd uses clang's command-line syntax as its
interface to configure parse options (like include directories).
The arguments are obtained from a [tooling::CompilationDatabase], typically
built by reading `compile_commands.json` from a nearby directory.
[GlobalCompilationDatabase] is responsible for finding and caching such
databases, and for providing "fallback" commands when none are found.
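
As a rough sketch of the "look up, else fall back" behaviour: the types
`Command` and `Database` and the function `commandFor` below are hypothetical,
and the fallback shown (a bare `clang <file>` invocation) is an assumption
made for illustration, not necessarily what clangd produces.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Command {
  std::string Directory;         // working directory for the compile
  std::vector<std::string> Args; // argv, starting with the compiler name
};

// Hypothetical in-memory database keyed by absolute file path.
using Database = std::map<std::string, Command>;

// Returns the recorded command for File, or a bare fallback when none is known.
Command commandFor(const Database &DB, const std::string &File) {
  auto It = DB.find(File);
  if (It != DB.end())
    return It->second;
  // Assumed fallback: compile the file with no project-specific flags.
  // The directory is left empty in this sketch.
  return Command{"", {"clang", File}};
}
```

The point of the fallback is that features keep working, with degraded
accuracy, for files the build system doesn't know about.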

Various heuristic tweaks are applied to these commands to make them more likely
to work, particularly on Mac. These live in [CommandMangler].

## Features

### Diagnostics

During parsing, clang emits diagnostics through a `DiagnosticConsumer` callback.
clangd's [StoreDiags] implementation converts them into [Diag] objects,
which capture the relationships between diagnostics, fixes, and notes.
These are exposed in `ParsedAST`.
(clang-tidy diagnostics are generated separately in `buildAST()`, but also end
up captured by `StoreDiags` and exposed in the same way.)

[IncludeFixer] attempts to add automatic fixes to certain diagnostics by using
the index to find headers that should be included.

`TUScheduler` has logic to determine when a `ParsedAST`'s diagnostics are
"correct enough" to emit, and when a new build is needed for this purpose.
It then triggers `ClangdServer::onMainAST`, which calls
`ClangdLSPServer::onDiagnosticsReady`, which sends them to the client.

### AST-based features

Most clangd requests are handled by inspecting a `ParsedAST`, and maybe the
index. Examples are [locateSymbolAt()] (go-to-definition) and [getHover()].

These features are spread across various files, but are easy to find from their
callsites in `ClangdServer`.

### Code completion (and signature help)

Code completion does not follow the usual pattern for AST-based features.
Instead there's a dedicated parse of the current file with a callback when the
completion point is reached.
The core completion logic is implemented in clang's [SemaCodeComplete.cpp] and
has access to information not present in the AST, such as name-lookup structures
and parser state.

[CodeComplete.h] is mostly concerned with running clang in this mode,
combining clang's results with index-based results, applying ranking, and
converting to LSP's data model.

The ranking is mostly implemented in [Quality.h], and the name-matching
is done by [FuzzyMatcher].
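
For intuition about name-matching, the simplest form of fuzzy matching is a
case-insensitive subsequence test, sketched below. This is deliberately much
simpler than `FuzzyMatcher`, which also scores matches; the function name
`isSubsequenceMatch` is made up for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if every character of Pattern appears in Word, in order,
// ignoring case. This is plain subsequence matching with no scoring.
bool isSubsequenceMatch(const std::string &Pattern, const std::string &Word) {
  std::size_t Matched = 0; // number of pattern characters consumed so far
  for (char C : Word) {
    if (Matched < Pattern.size() &&
        std::tolower(static_cast<unsigned char>(C)) ==
            std::tolower(static_cast<unsigned char>(Pattern[Matched])))
      ++Matched;
  }
  return Matched == Pattern.size();
}
```

With this test, typing `fb` would keep `FooBar` as a candidate, while `bf`
would not. A real matcher additionally has to decide which of several
matching candidates is the best one.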

### Code actions

Most code actions are provided by `Tweak`s. These are small plugins that
implement the [Tweak] interface. They live in the [refactor/tweaks/] directory
and are registered through the linker. Given a selection, they can (quickly)
determine whether they apply there and (maybe slowly) generate the actual edits.
The LSP code-actions flow is built out of these primitives.
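
The two-phase shape (a quick applicability check, then a possibly slow edit)
can be sketched like this. `MiniTweak`, `Selection`, `Parenthesize` and
`runIfApplicable` are simplified, hypothetical stand-ins; the real interface
works on AST selections and returns structured edits.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, much-simplified selection: just the selected text.
struct Selection {
  std::string Text;
};

// Simplified two-phase interface: a cheap applicability check, then the edit.
class MiniTweak {
public:
  virtual ~MiniTweak() = default;
  virtual bool prepare(const Selection &Sel) = 0;      // must be fast
  virtual std::string apply(const Selection &Sel) = 0; // may be slow
};

// Example tweak: wrap the selected expression in parentheses.
class Parenthesize : public MiniTweak {
public:
  bool prepare(const Selection &Sel) override { return !Sel.Text.empty(); }
  std::string apply(const Selection &Sel) override {
    return "(" + Sel.Text + ")";
  }
};

// Mirrors the code-action flow: the tweak is offered only if prepare()
// accepts, and the edit is computed only when it is actually run.
std::optional<std::string> runIfApplicable(MiniTweak &T, const Selection &Sel) {
  if (!T.prepare(Sel))
    return std::nullopt;
  return T.apply(Sel);
}
```

Splitting the work this way lets the server list available actions cheaply on
every selection change, and pay for edit generation only when the user picks
one.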

## Feature infrastructure

### Parsing and ASTs

The representation of a parsed file in clangd is [ParsedAST].
As the name suggests this is mostly used to access Clang's AST
(`clang::ASTContext`), but extends it by:

- recording and exposing information gathered from callbacks (e.g. diagnostics)
- encapsulating the other objects (e.g. SourceManager and Preprocessor) and
  keeping them alive with the correct lifetime

`ParsedAST::build()` is where we run the clang parser.
Some low-level bits (creating `CompilerInstance`) are in [Compiler.h]
instead, and are reused when we run clang without retaining an AST (code
completion, indexing, preambles).

The [PreambleData] structure similarly extends Clang's `PrecompiledPreamble`
class with extra recorded information. It contains the AST of included headers
and is only rebuilt when those headers change. The preamble is large, so it's
kept on disk by default and parts are deserialized on demand.

### Abstractions over clang AST

Several tasks come up in various features and we have reusable solutions:

- [SelectionTree] identifies the AST nodes corresponding to a point or range
  in the source code.
  Used in go-to-definition, code actions, and many other features.
- [targetDecl()] identifies the declaration an AST node refers to.
  Used e.g. in go-to-definition.
- [findExplicitReferences()] traverses a chunk of AST and lists declarations
  referenced. Used e.g. in find-references and rename. Should be used for
  indexing, one day.

### Index

Operations that need information outside the current file/AST make use of
[the clangd index], which is in the [index/] directory.

[SymbolIndex] is the index interface exposed to consuming features, and
describes the data and queries that implementations should provide (`Symbol`,
`Ref`, etc.).
It has several implementations used as building-blocks:

- [MemIndex] is a simple in-memory implementation that's cheap to construct.
- [Dex] is a more complex one with a scalable fuzzyFind search.
- [MergedIndex] combines indexes, merging results.
- [IndexClient] is a client for a remote index service over gRPC.

[SymbolCollector] extracts indexable data from a translation unit.
[index/Serialization.h] defines a binary format to store/load index data.

These building blocks are used to provide clangd's index data:

- [BackgroundIndex] runs `SymbolCollector` over project files in background
  threads, periodically combining the results into an exposed `Dex` index.
  Index data is also written to disk, and files are only reindexed when that
  data is stale.
- [FileIndex] stores the index information from all opened files and their
  preambles, running SymbolCollector on ASTs after they are rebuilt. It is a
  `MergedIndex` of a `Dex` of the preambles and a `MemIndex` of the main-file
  symbols. This is also known as the "dynamic index".
- The "static index" is configured in `main` and may be a
  simple index file (generated by [indexer/IndexerMain.cpp]) loaded into `Dex`,
  a remote index client ([IndexClient]), or nothing at all.

## Dependencies

### Clang libraries

Clang code structure is a huge topic, but the most important pieces for clangd:

- AST: [clang/AST/] defines the data structures that represent
  parsed C++ code, such as the [Decl], [Stmt], and [Type] hierarchies.
  [RecursiveASTVisitor] is the generic mechanism to walk an AST.
- Preambles: [PrecompiledPreamble] wraps a serialized partial AST that can be
  lazy-loaded from disk; clangd relies *heavily* on this optimization.
- Preprocessor: at the token level, the [Preprocessor] handles directives and
  macro expansion, and we use [PPCallbacks] hooks to listen for events.

### Clang-tools libraries

clangd shares code with other tools derived from the Clang compiler; these
libraries live outside clangd.

- [syntax::TokenBuffer] captures token-boundary and preprocessing information
  that clang itself doesn't preserve.
- [clang/Index/] implements an indexing-oriented traversal of Clang ASTs, which
  is used in clangd's index.
- [clang/Format/] is the clang-format logic used to satisfy formatting requests
  and also to format newly-inserted code.
- [tooling::CompilationDatabase] is the foundation for clangd's integration
  with build systems.

### General support libraries

Like most LLVM code, clangd heavily uses [llvm/ADT/] and [llvm/Support/] to
supplement the standard library. We try to avoid other LLVM dependencies.

clangd has its own [support/] library, conceptually similar to `llvm/Support`.
It contains libraries that are general-purpose, but not a good fit for llvm as a
whole (too opinionated, or focused on multithreading). The most prominent:

- [ThreadsafeFS] addresses the problems with llvm's FileSystem abstraction for
  multithreaded programs.
- [Context] is used to pass certain "ambient" data around within the current
  thread, and to automatically propagate it when scheduling on another thread.
  (It is related to dynamically-scoped variables, and thread-local storage).
  It's used for certain settings like overriding LSP encoding, for tracking
  actions across threads, request cancellation and more.
- [support/Logger.h] provides a concise, threadsafe logging API and lets
  embedders handle logs.
- [support/Trace.h] allows instrumentation of clangd's implementation with
  events and metrics for performance analysis etc.
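
The "ambient data" idea behind `Context` can be illustrated with a
thread-local value, an RAII scope that sets and restores it, and a helper that
captures the current value so a task scheduled elsewhere sees it again. This
is only an analogy under simplifying assumptions: the names
`CurrentRequestID`, `WithRequestID` and `bindContext` are invented here, and
the real `Context` is considerably more general.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical ambient value: the ID of the request being processed.
thread_local std::string CurrentRequestID;

// RAII scope: sets the ambient value, restores the previous one on exit.
class WithRequestID {
public:
  explicit WithRequestID(std::string ID)
      : Saved(std::move(CurrentRequestID)) {
    CurrentRequestID = std::move(ID);
  }
  ~WithRequestID() { CurrentRequestID = std::move(Saved); }
  WithRequestID(const WithRequestID &) = delete;
  WithRequestID &operator=(const WithRequestID &) = delete;

private:
  std::string Saved;
};

// Captures the ambient value at scheduling time and reinstalls it around the
// task whenever (and on whichever thread) the returned task is run.
template <typename Fn> std::function<void()> bindContext(Fn F) {
  std::string Captured = CurrentRequestID;
  return [Captured, F]() mutable {
    WithRequestID Scope(Captured);
    F();
  };
}
```

Code deep inside the task can then read `CurrentRequestID` without it being
threaded through every function signature, which is the convenience that
ambient data buys.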

## Testing

Most of the tests are in the [unittests/] directory (despite being a mix of unit
and integration tests). Test files are mostly named after the file they're
testing, and use the [googletest] framework.

Some helpers are widely shared between tests:

- [TestTU] lets tests tersely specify code for a test case, and can prepare
  `ParsedAST` and other structures needed for testing features on that code.
- [Annotations] recognizes code examples with marked points and ranges.
  This is used e.g. to specify tests for "go to definition".
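
To show what "marked points and ranges" means in practice, here is a small
sketch of a parser for annotated code, where `^` marks a point and `[[...]]`
marks a range. It is a simplified illustration rather than the [Annotations]
implementation: the `Annotated` struct and `parseAnnotations` function are
hypothetical, and named markers are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Annotated {
  std::string Code;                // the text with all markers stripped
  std::vector<std::size_t> Points; // offsets into Code where ^ appeared
  // Half-open [begin, end) offsets into Code for each [[...]] range.
  std::vector<std::pair<std::size_t, std::size_t>> Ranges;
};

// Parses ^ point markers and [[ ]] range markers (ranges may nest).
Annotated parseAnnotations(const std::string &Text) {
  Annotated Result;
  std::vector<std::size_t> OpenRanges; // begin offsets of unclosed [[
  for (std::size_t I = 0; I < Text.size(); ++I) {
    if (Text[I] == '^') {
      Result.Points.push_back(Result.Code.size());
    } else if (Text.compare(I, 2, "[[") == 0) {
      OpenRanges.push_back(Result.Code.size());
      ++I; // skip the second '['
    } else if (Text.compare(I, 2, "]]") == 0 && !OpenRanges.empty()) {
      Result.Ranges.emplace_back(OpenRanges.back(), Result.Code.size());
      OpenRanges.pop_back();
      ++I; // skip the second ']'
    } else {
      Result.Code.push_back(Text[I]);
    }
  }
  return Result;
}
```

A go-to-definition test can then put `^` where the cursor is and `[[...]]`
around the expected target, and compare the feature's result against the
parsed offsets.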

clangd has a small number of black-box tests in [test/]. These use LLVM [lit]
and [FileCheck] to drive the clangd binary and verify output. They smoke-test
clangd as an LSP server, and test a few hard-to-isolate features.

[FileCheck]: https://llvm.org/docs/CommandGuide/FileCheck.html
[googletest]: https://github.com/google/googletest
[lit]: https://llvm.org/docs/CommandGuide/lit.html
[llvm-project]: https://github.com/llvm/llvm-project
[woboq]: https://code.woboq.org/llvm/clang-tools-extra/clangd/

[threads and request handling]: threads
[the clangd index]: index

[clangd]: https://code.woboq.org/llvm/clang-tools-extra/clangd/
[index/]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/
[test/]: https://code.woboq.org/llvm/clang-tools-extra/clangd/test/
[refactor/tweaks/]: https://code.woboq.org/llvm/clang-tools-extra/clangd/refactor/tweaks/
[support/]: https://code.woboq.org/llvm/clang-tools-extra/clangd/support/
[unittests/]: https://code.woboq.org/llvm/clang-tools-extra/clangd/unittests/
[xpc/]: https://code.woboq.org/llvm/clang-tools-extra/clangd/xpc/

[llvm/ADT/]: https://code.woboq.org/llvm/llvm/include/llvm/ADT/
[llvm/Support/]: https://code.woboq.org/llvm/llvm/include/llvm/Support/
[clang/AST/]: https://code.woboq.org/llvm/clang/include/clang/AST/
[clang/Index/]: https://code.woboq.org/llvm/clang/include/clang/Index/
[clang/Format/]: https://code.woboq.org/llvm/clang/include/clang/Format/

[CodeComplete.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/CodeComplete.h.html
[Compiler.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Compiler.h.html
[Protocol.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Protocol.h.html
[Quality.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Quality.h.html
[index/Serialization.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/Serialization.h.html
[indexer/IndexerMain.cpp]: https://code.woboq.org/llvm/clang-tools-extra/clangd/indexer/IndexerMain.cpp.html
[support/Logger.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/support/Logger.h.html
[support/Trace.h]: https://code.woboq.org/llvm/clang-tools-extra/clangd/support/Trace.h.html

[SemaCodeComplete.cpp]: https://code.woboq.org/llvm/clang/lib/Sema/SemaCodeComplete.cpp.html

[Annotations]: https://code.woboq.org/llvm/clang-tools-extra/clangd/unittests/Annotations.h.html#clang::clangd::Annotations
[BackgroundIndex]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/Background.h.html#clang::clangd::BackgroundIndex
[ClangdLSPServer]: https://code.woboq.org/llvm/clang-tools-extra/clangd/ClangdLSPServer.h.html#clang::clangd::ClangdLSPServer
[ClangdServer]: https://code.woboq.org/llvm/clang-tools-extra/clangd/ClangdServer.h.html#clang::clangd::ClangdServer
[CommandMangler]: https://code.woboq.org/llvm/clang-tools-extra/clangd/CompileCommands.h.html#clang::clangd::CommandMangler
[Context]: https://code.woboq.org/llvm/clang-tools-extra/clangd/support/Context.h.html#clang::clangd::Context
[Dex]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/dex/Dex.h.html#clang::clangd::dex::Dex
[Diag]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Diagnostics.h.html#clang::clangd::Diag
[FileIndex]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/FileIndex.h.html#clang::clangd::FileIndex
[FuzzyMatcher]: https://code.woboq.org/llvm/clang-tools-extra/clangd/FuzzyMatch.h.html#clang::clangd::FuzzyMatcher
[GlobalCompilationDatabase]: https://code.woboq.org/llvm/clang-tools-extra/clangd/GlobalCompilationDatabase.h.html#clang::clangd::GlobalCompilationDatabase
[IncludeFixer]: https://code.woboq.org/llvm/clang-tools-extra/clangd/IncludeFixer.h.html#clang::clangd::IncludeFixer
[IndexClient]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/remote/Client.cpp.html#clang::clangd::remote::(anonymousnamespace)::IndexClient
[JSONTransport]: https://code.woboq.org/llvm/clang-tools-extra/clangd/JSONTransport.cpp.html#clang::clangd::(anonymousnamespace)::JSONTransport
[MemIndex]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/MemIndex.h.html#clang::clangd::MemIndex
[MergedIndex]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/Merge.h.html#clang::clangd::MergedIndex
[ParsedAST]: https://code.woboq.org/llvm/clang-tools-extra/clangd/ParsedAST.h.html#clang::clangd::ParsedAST
[ParsingCallbacks]: https://code.woboq.org/llvm/clang-tools-extra/clangd/TUScheduler.h.html#clang::clangd::ParsingCallbacks
[PreambleData]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Preamble.h.html#clang::clangd::PreambleData
[SelectionTree]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Selection.h.html#clang::clangd::SelectionTree
[StoreDiags]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Diagnostics.h.html#clang::clangd::StoreDiags
[SymbolCollector]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/SymbolCollector.h.html#clang::clangd::SymbolCollector
[SymbolIndex]: https://code.woboq.org/llvm/clang-tools-extra/clangd/index/Index.h.html#clang::clangd::SymbolIndex
[TUScheduler]: https://code.woboq.org/llvm/clang-tools-extra/clangd/TUScheduler.h.html#clang::clangd::TUScheduler
[TestTU]: https://code.woboq.org/llvm/clang-tools-extra/clangd/unittests/TestTU.h.html#clang::clangd::TestTU
[ThreadsafeFS]: https://code.woboq.org/llvm/clang-tools-extra/clangd/support/ThreadsafeFS.h.html#clang::clangd::ThreadsafeFS
[Transport]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Transport.h.html#clang::clangd::Transport
[Tweak]: https://code.woboq.org/llvm/clang-tools-extra/clangd/refactor/Tweak.h.html#clang::clangd::Tweak
[findExplicitReferences()]: https://code.woboq.org/llvm/clang-tools-extra/clangd/FindTarget.h.html
[getHover()]: https://code.woboq.org/llvm/clang-tools-extra/clangd/Hover.h.html
[locateSymbolAt()]: https://code.woboq.org/llvm/clang-tools-extra/clangd/XRefs.h.html
[main()]: https://code.woboq.org/llvm/clang-tools-extra/clangd/tool/ClangdMain.cpp.html#main
[targetDecl()]: https://code.woboq.org/llvm/clang-tools-extra/clangd/FindTarget.h.html

[Decl]: https://code.woboq.org/llvm/clang/include/clang/AST/DeclBase.h.html#clang::Decl
[Stmt]: https://code.woboq.org/llvm/clang/include/clang/AST/Stmt.h.html#clang::Stmt
[Type]: https://code.woboq.org/llvm/clang/include/clang/AST/Type.h.html#clang::Type
[Preprocessor]: https://code.woboq.org/llvm/clang/include/clang/Lex/Preprocessor.h.html#clang::Preprocessor
[PPCallbacks]: https://code.woboq.org/llvm/clang/include/clang/Lex/PPCallbacks.h.html#clang::PPCallbacks
[RecursiveASTVisitor]: https://code.woboq.org/llvm/clang/include/clang/AST/RecursiveASTVisitor.h.html#clang::RecursiveASTVisitor
[PrecompiledPreamble]: https://code.woboq.org/llvm/clang/include/clang/Frontend/PrecompiledPreamble.h.html#clang::PrecompiledPreamble
[syntax::TokenBuffer]: https://code.woboq.org/llvm/clang/include/clang/Tooling/Syntax/Tokens.h.html#clang::syntax::TokenBuffer
[tooling::CompilationDatabase]: https://code.woboq.org/llvm/clang/include/clang/Tooling/CompilationDatabase.h.html#clang::tooling::CompilationDatabase

design/index.md (Normal file, 94 lines added)
@@ -0,0 +1,94 @@
# The clangd index

The index stores information about the whole codebase. It's used to provide LSP
features where the AST of the current file doesn't have the information we need.

## Exposed data

- `Symbol`s are the primary objects managed by the index. A function, variable,
  class, or macro is a Symbol, and each one has an opaque `SymbolID`.
  Two declarations of the same thing will produce the same `SymbolID` and thus
  be merged into one `Symbol`.

  Symbols have names, declaration/definition locations, documentation, and a
  bunch of attributes used for code completion.

  They can be looked up by ID, or fuzzy-searched by name.

- `Ref`s are uses of a symbol in code, such as a call to a function.
  They are edges between a `Symbol` and a location in some file.

  They can be looked up by SymbolID.

- `Relation`s describe related symbols, such as a class that inherits another.
  They are edges between two `Symbol`s, labeled with a relation kind.

  They are looked up using one of the `Symbol`s and the kind.
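
The data model above can be sketched with a few plain structs and maps. This
is an illustration, not the real index types: `TinyIndex` and its members are
hypothetical, `SymbolID` is a plain string here rather than an opaque hash,
and locations are stored as simple strings.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using SymbolID = std::string; // opaque in clangd; a plain string in this sketch

struct Symbol {
  SymbolID ID;
  std::string Name;
  std::string Declaration; // location of a declaration, empty if unknown
  std::string Definition;  // location of the definition, empty if unknown
};

struct Ref {
  std::string Location; // where the symbol is used
};

enum class RelationKind { BaseOf };

struct TinyIndex {
  std::map<SymbolID, Symbol> Symbols;        // looked up by ID
  std::map<SymbolID, std::vector<Ref>> Refs; // looked up by SymbolID
  // Looked up by (subject symbol, relation kind).
  std::map<std::pair<SymbolID, RelationKind>, std::vector<SymbolID>> Relations;

  // Sightings that share a SymbolID collapse into one Symbol; a later
  // sighting only fills in locations that are still missing.
  void addSymbol(const Symbol &S) {
    auto Result = Symbols.emplace(S.ID, S);
    if (Result.second)
      return; // first sighting, stored as-is
    Symbol &Existing = Result.first->second;
    if (Existing.Declaration.empty())
      Existing.Declaration = S.Declaration;
    if (Existing.Definition.empty())
      Existing.Definition = S.Definition;
  }

  void addRef(const SymbolID &ID, Ref R) { Refs[ID].push_back(std::move(R)); }

  void addRelation(const SymbolID &Subject, RelationKind Kind,
                   const SymbolID &Object) {
    Relations[std::make_pair(Subject, Kind)].push_back(Object);
  }
};
```

Seeing a declaration in a header and the definition in a source file therefore
yields one entry that knows both locations, which is what go-to-definition
needs.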

## Implementations

`SymbolIndex` is an interface, and clangd maintains several instances.
These are stitched together using `MergedIndex`, which layers one index on top
of another. Code implementing features sees only a single combined index.
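
Layering can be sketched as an index that asks the top layer first and falls
back to the layer beneath. The classes `MiniIndex`, `MapIndex` and
`LayeredIndex` are hypothetical, and the rule shown (the top layer simply
wins) is a simplification assumed for this example; a real merge may combine
information from both layers.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical minimal index interface: look up a symbol's name by ID.
class MiniIndex {
public:
  virtual ~MiniIndex() = default;
  virtual std::optional<std::string> lookup(const std::string &ID) const = 0;
};

// A trivial in-memory index backed by a map.
class MapIndex : public MiniIndex {
public:
  explicit MapIndex(std::map<std::string, std::string> Init)
      : Data(std::move(Init)) {}
  std::optional<std::string> lookup(const std::string &ID) const override {
    auto It = Data.find(ID);
    if (It == Data.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::map<std::string, std::string> Data;
};

// Layers Top over Bottom: Top's answer wins, Bottom fills the gaps.
class LayeredIndex : public MiniIndex {
public:
  LayeredIndex(const MiniIndex &Top, const MiniIndex &Bottom)
      : Top(Top), Bottom(Bottom) {}
  std::optional<std::string> lookup(const std::string &ID) const override {
    if (auto Hit = Top.lookup(ID))
      return Hit;
    return Bottom.lookup(ID);
  }

private:
  const MiniIndex &Top;
  const MiniIndex &Bottom;
};
```

Because `LayeredIndex` is itself a `MiniIndex`, layers can be stacked
repeatedly, and callers never need to know how many there are.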

### `FileIndex` ("dynamic index")

This is the top layer, and includes symbols from the files that have been opened
and the headers they include. This is used:

- to provide code completions for symbols at global scope in header files.
  (This is more efficient than deserializing big parts of the preamble.)
- to ensure cross-references for the files you're working on are available, even
  if the background index hasn't finished yet
- to ensure locations of definitions/references aren't stale despite actively
  editing the file

The `FileIndex` class stores data from each file separately. When a file is
parsed, the TUScheduler invokes a callback which adds the AST to the index.
(In fact, there is a separate storage and callback for expensive-and-rare
preamble rebuilds vs cheap-and-frequent main-file rebuilds.)

### `BackgroundIndex`

As the name suggests, this parses all files in the project in the background
to build a complete index. This is used:

- to ensure full coverage of the codebase
- to capture references inside template instantiations, which are disabled
  elsewhere in clangd for performance reasons

The `BackgroundIndex` maintains a thread-pool, and when a compilation database
is found, the compile command for each source file is placed on a queue.

Before indexing each file, the index checks for a cached `*.idx` file on disk.
After indexing, it writes this file. This avoids reindexing on startup if
nothing changed since last time.
These files are located in `.cache/clangd/index/` next to `compile_commands.json`.
For headers with no CDB, such as the standard library, the files are stored in
`clangd/index` under the user's cache directory (`$XDG_CACHE_HOME`,
`DARWIN_USER_CACHE_DIR`, or `%LocalAppData%`).

### Static index

The (optional) static index is built outside clangd. It would typically cover
the whole codebase. This is used:

- to avoid waiting for the background index to build
- to allow the background index to be disabled for large projects, saving
  CPU/RAM/battery

With the `-index-file` option, clangd will load an index produced by the
`clangd-indexer` tool.

### Remote index

For large codebases (e.g. LLVM and Chromium) a global index can take a long
time to build (multiple hours even on very powerful machines for Chrome-sized
projects) and induces a large memory overhead (multiple GB of RAM) to serve
within clangd.

A remote index allows serving the index on a separate machine and connecting
to it from your device. This means you don't have to build the index yourself
anymore and clangd will use significantly less memory. Hence developers can
work from less powerful machines, while still using clangd to its fullest.
For more details, see [remote index](https://clangd.llvm.org/remote-index.html).

design/threads.md (Normal file, 68 lines added)
@@ -0,0 +1,68 @@
# Threads and request handling

Our main goals in handling incoming requests are:

- respond to requests as quickly as possible (don't block on unrelated work)
- provide diagnostics as soon as possible once the user stops typing
- handle requests using the expected version of the file
- use a predictable and bounded amount of CPU and RAM

Some constraints are provided by clang:

- Clang supports parsing the preamble (initial `#include`s) separately for
  better incremental performance.
- Initial build of the preamble can be extremely slow - tens of seconds.
  Incremental AST builds (with an up-to-date preamble) are fast (sub-second).
- Once built, the preamble is threadsafe (it's just immutable bytes).
  However ASTs are not threadsafe, even for read-only operations.

## Life of a request

An LSP message like `textDocument/definition` or `textDocument/didChange` is
decoded by `ClangdLSPServer` and dispatched to the appropriate function on
`ClangdServer`. This happens on the main thread, as `ClangdServer` is not
threadsafe.
Therefore, its methods should not block - that would block incoming messages
which could be independent (code completion in a different file) or relevant
(cancelling a slow request).

Instead, they determine the affected file and place the action on `TUScheduler`.
This class maintains a set of `ASTWorker`s, each responsible for one file.
The `ASTWorker` has a queue of operations, and a thread consuming them:

- throwing away operations that are obsolete:
  - reads that have been cancelled
  - writes immediately followed by writes (e.g. two consecutive keystrokes)
- executing the first operation that is still valid:
  - writes: rebuilding the AST (and preamble if needed), publishing diagnostics
  - reads: passing the AST to the action callback

This ensures there's only one AST and one preamble per open file, operations on
one file don't block another, and that reads see exactly the writes issued
before them.
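
The queue-pruning rules above can be sketched as a single-threaded simulation.
The `Operation` struct and `drain` function are hypothetical; the sketch
covers only the two "obsolete" cases listed above and ignores threading
entirely.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical queued operation for one file.
struct Operation {
  enum Kind { Read, Write } Type;
  std::string Name;
  bool Cancelled = false; // only meaningful for reads
};

// Drains the queue in order, skipping obsolete operations, and returns the
// names of the operations that would actually run.
std::vector<std::string> drain(std::deque<Operation> Queue) {
  std::vector<std::string> Executed;
  while (!Queue.empty()) {
    Operation Op = Queue.front();
    Queue.pop_front();
    if (Op.Type == Operation::Read && Op.Cancelled)
      continue; // cancelled read: nobody is waiting for the result
    if (Op.Type == Operation::Write && !Queue.empty() &&
        Queue.front().Type == Operation::Write)
      continue; // superseded by the write immediately behind it
    Executed.push_back(Op.Name);
  }
  return Executed;
}
```

Note that a write followed by a read is never dropped, since that read must
observe it; only a write with another write directly behind it is skipped.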

## Debouncing

For files that rebuild relatively slowly, starting to build as soon as we see
the first change isn't ideal.

Suppose the user quickly types `foo();`. Building after the `f` is typed means:

- we'll always see a diagnostic "unknown identifier `f`", which is annoying.
- we'll never see the correct diagnostics until after 2 rebuilds

To address this, writes are debounced: rebuilding doesn't start until either a
read is received or a short deadline expires (user stopped typing).
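
The debounce rule boils down to a small predicate, sketched below. The
function `shouldBuildNow` and its parameters are hypothetical names chosen for
this example, and the delay is whatever the scheduler is configured with.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Decides whether a pending write should be built now. The build starts when
// a read is waiting behind the write, or when the debounce delay has elapsed
// since the write arrived.
bool shouldBuildNow(Clock::time_point WriteArrived, Clock::time_point Now,
                    Clock::duration DebounceDelay, bool ReadPending) {
  return ReadPending || Now - WriteArrived >= DebounceDelay;
}
```

A worker would evaluate this whenever it wakes up, and otherwise sleep until
either the deadline passes or a new operation is queued.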

## Code completion

Unlike typical requests like go-to-definition, code completion does not use
the pre-built AST. Clang doesn't do a great job of preserving the AST around
incomplete code, and completion can make good use of transient parser state.
We run a fresh parse with clang's completion API enabled, which injects a
"code completion" token into the token stream, and invokes a callback once the
parser hits it.

As this doesn't reuse the AST, it can run on a separate thread rather than the
ASTWorker. It does use the preamble, but we don't wait for it to be up-to-date.
Since completion is extremely time sensitive, it just uses whichever preamble
is immediately available.