Convert Context Pilot from a binary to a server #32

Open
krshrimali opened this issue Feb 6, 2024 · 0 comments

Context

So far, the VS Code and Neovim plugins have used context-pilot as a binary. The flow has been:

  1. Call the context-pilot binary from the plugin side.
  2. Capture its stdout, parse it, and show the result on the frontend.

See: https://github.com/krshrimali/context-pilot.nvim/blob/26f154bda102d48e2bc84602fe46724d5305277b/lua/context_gpt.lua#L115 for reference

While this works for smaller workspaces and files (in terms of LoC), it would not scale to a huge repository, where each file could have a large number of lines. Imagine a file with 1000 lines of code for which a user wants context: calling the context-pilot binary would probably take around 30 seconds or more, which could slow down the whole process.

To fix this, ideally I'd like this to work like an LSP server: when an editor is launched, indexing should start for the whole workspace (excluding files that match patterns in the .gitignore file, of course).

Algorithm (proposed)

Here is what I'm thinking so far:

  • Whenever an editor is launched, we'll need the "workspace path" from the plugin side. This lets us start the server for that particular workspace path and begin indexing immediately.
  • When the editor is "activated", we'll start indexing all the files in the workspace (excluding files ignored by .gitignore).
  • Indexing progress should be shown to the user in the editor, for example "50% completed". This information should be live, to give the user confidence about what's going on in the background.
  • If the user requests context while indexing is still in progress, another thread should be spawned to compute the information independently and return it to the user (this is where the existing binary workflow would run).
  • Whenever the editor is closed, indexing will stop and the DB dump will be released/deleted (for now it will act more like a cache than a DB stored in a permanent file for later use).
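
A minimal sketch of the flow above, assuming an in-memory cache that lives only as long as the server process. Every name here (`IndexState`, `start_indexing`, `query`, `compute_context`) is hypothetical and not taken from the repository, `compute_context` is a stand-in for the real algorithm, and .gitignore filtering is left out:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical state shared by the indexer thread and request handlers.
pub struct IndexState {
    total: usize,
    done: AtomicUsize,
    cache: Mutex<HashMap<String, String>>,
}

impl IndexState {
    pub fn new(total: usize) -> Self {
        IndexState {
            total,
            done: AtomicUsize::new(0),
            cache: Mutex::new(HashMap::new()),
        }
    }

    /// Percentage of files indexed so far, for a live progress display.
    pub fn progress_percent(&self) -> usize {
        if self.total == 0 {
            return 100;
        }
        self.done.load(Ordering::SeqCst) * 100 / self.total
    }
}

/// Stand-in for whatever algorithm computes context for one file (assumption).
fn compute_context(path: &str) -> String {
    format!("context for {}", path)
}

/// Index every file on a background thread, updating the progress counter.
pub fn start_indexing(state: Arc<IndexState>, files: Vec<String>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for file in files {
            let ctx = compute_context(&file);
            state.cache.lock().unwrap().insert(file, ctx);
            state.done.fetch_add(1, Ordering::SeqCst);
        }
    })
}

/// Answer a request from the cache when possible; otherwise compute the
/// result on a separate thread, independently of the indexer.
pub fn query(state: &IndexState, path: &str) -> String {
    let cached = state.cache.lock().unwrap().get(path).cloned();
    if let Some(hit) = cached {
        return hit;
    }
    let owned = path.to_string();
    thread::spawn(move || compute_context(&owned)).join().unwrap()
}
```

Because the cache is owned by `IndexState`, dropping the state when the editor closes releases it, which matches the "cache for now" behaviour described in the last bullet. The plugin would poll something like `progress_percent()` to render the live indicator.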

The server implementation should be independent (at least for discussion purposes) of the algorithm for "getting context". The idea is for the server to be algo-agnostic, as I do plan to change this into an LLM mode soon.
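
One way to keep the server algo-agnostic is to put the algorithm behind a trait, so the server only ever talks to the interface. This is a sketch under that assumption: `ContextProvider`, `Server`, and both stub providers are hypothetical names, and the stubs return placeholder strings rather than real context:

```rust
/// Hypothetical interface between the server and any context algorithm.
pub trait ContextProvider: Send + Sync {
    /// Short label for the algorithm, used when formatting responses.
    fn name(&self) -> &str;
    /// Context entries for one file; the meaning is up to the algorithm.
    fn context_for(&self, file_path: &str) -> Vec<String>;
}

/// Stub standing in for the current (non-LLM) algorithm.
pub struct CurrentAlgoStub;

impl ContextProvider for CurrentAlgoStub {
    fn name(&self) -> &str {
        "current"
    }
    fn context_for(&self, file_path: &str) -> Vec<String> {
        vec![format!("placeholder result for {}", file_path)]
    }
}

/// Stub standing in for a future LLM-based mode.
pub struct LlmStub;

impl ContextProvider for LlmStub {
    fn name(&self) -> &str {
        "llm"
    }
    fn context_for(&self, file_path: &str) -> Vec<String> {
        vec![format!("LLM summary of {}", file_path)]
    }
}

/// The server holds a boxed provider and never depends on a concrete type.
pub struct Server {
    provider: Box<dyn ContextProvider>,
}

impl Server {
    pub fn new(provider: Box<dyn ContextProvider>) -> Self {
        Server { provider }
    }

    /// Format the provider's entries into a single response line.
    pub fn handle_request(&self, file_path: &str) -> String {
        let items = self.provider.context_for(file_path);
        format!("[{}] {}", self.provider.name(), items.join("; "))
    }
}
```

Switching modes is then a matter of constructing the server with `Box::new(LlmStub)` instead of `Box::new(CurrentAlgoStub)`; request handling, indexing, and caching stay untouched.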

Implementation Details

Challenges:

  • I'm still deciding whether it's better to have the DB act as a cache that is discarded once the editor is closed, or to keep it until the user requests a refresh (or auto-refresh it after some interval X).
  • How will the plugin (Lua/TypeScript) start the server on each call? I honestly haven't written a lot of async code before, and I'm wondering whether I can just spawn a command like I'm doing right now. I have to check this.
  • Race conditions: imagine two Neovim instances with the same workspace path. The DB cache will have to be synced between all editors running on that workspace path so that duplicate indexing doesn't happen. Another interesting case is when an editor has already been open for a day and the user opens a new editor with the same workspace path. Ideally the cache should be updated at that point, so we'll have to check the time elapsed since the cache was built, set an auto-refresh, and give users control over manual refreshes as well.
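
A sketch of how the shared-cache and staleness cases could be handled, assuming a single server process that all editor instances for a workspace connect to (if each editor started its own process, a lock file or similar cross-process mechanism would be needed instead). `Registry`, `CacheEntry`, `Action`, and `max_age` are hypothetical names:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// What the server should do when an editor attaches to a workspace.
#[derive(Debug, PartialEq)]
pub enum Action {
    Indexed,   // no cache existed: index from scratch
    Reused,    // fresh cache exists: no duplicate indexing
    Refreshed, // cache is older than max_age: re-index
}

struct CacheEntry {
    indexed_at: Instant,
    clients: usize, // number of editor instances using this workspace
}

/// Hypothetical per-workspace registry held by one server process.
pub struct Registry {
    max_age: Duration,
    entries: HashMap<PathBuf, CacheEntry>,
}

impl Registry {
    pub fn new(max_age: Duration) -> Self {
        Registry { max_age, entries: HashMap::new() }
    }

    /// Called when an editor opens `workspace` at time `now`.
    pub fn attach(&mut self, workspace: &Path, now: Instant) -> Action {
        if let Some(entry) = self.entries.get_mut(workspace) {
            entry.clients += 1;
            if now.saturating_duration_since(entry.indexed_at) > self.max_age {
                entry.indexed_at = now;
                return Action::Refreshed;
            }
            return Action::Reused;
        }
        let entry = CacheEntry { indexed_at: now, clients: 1 };
        self.entries.insert(workspace.to_path_buf(), entry);
        Action::Indexed
    }

    /// Called when an editor closes. Returns true if this was the last
    /// client and the cache for the workspace was dropped.
    pub fn detach(&mut self, workspace: &Path) -> bool {
        if let Some(entry) = self.entries.get_mut(workspace) {
            entry.clients = entry.clients.saturating_sub(1);
            if entry.clients == 0 {
                self.entries.remove(workspace);
                return true;
            }
        }
        false
    }
}
```

Passing `now` in explicitly keeps the staleness rule easy to test. A manual refresh would be one more method that resets `indexed_at` and triggers re-indexing.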

There will definitely be more challenges as we move along; this is just a start.

If anyone is interested, this is what the ongoing work on the server implementation looks like: https://github.com/krshrimali/context-pilot-rs/blob/kush/pr/server-phase-1/src/server.rs. PR #31 tracks all changes.
