-
Notifications
You must be signed in to change notification settings - Fork 349
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Browse the repository at this point in the history
**Problem (why)** Currently the daily cost and token usage on models is very high. We want to find some ways to reduce them. **Solution (context)** Prompt caching can significantly reduce token costs. Each cache hit reduces costs by 90%, while each cache miss increases costs by 25%. After some [initial analysis](https://docs.google.com/document/d/1y-pOAUgwksyMx-Uq1rutzWYd6qb13iqikTSaAn6UdKM/edit?tab=t.0#heading=h.202nr339v40e), we decide to start with implementing [prompt caching](https://linear.app/sourcegraph/issue/CODY-4000/determine-effort-for-prompt-caching-in-sonnet-35-and-haiku-35) for Claude models. **Implementation (what)** - Adding the header `cache_control: ephemeral`, which creates a cache with a 5 min TTL. - Server Side Implementation in this [PR](sourcegraph/sourcegraph#3198) **Anthropic Docs (context)** - https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching - https://docs.anthropic.com/en/api/messages ## Test plan - Tested locally and cache is being added
- Loading branch information
Showing
6 changed files
with
90 additions
and
45 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters