-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
langchain[minor],docs[minor]: Add MatryoshkaRetriever
#4458
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
Does this need to be in experimental? |
…to brace/addaptive-retrieval
No, can move --just seemed right at the time but ig it goes both ways. |
docs/core_docs/docs/modules/experimental/retrievers/matryoshka_retrieval.mdx
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,128 @@ | |||
import { CallbackManagerForRetrieverRun } from "@langchain/core/callbacks/manager"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should live in langchain
and not community since it's more of an algorithm?
constructor(fields: MatryoshkaRetrievalFields<Store>) { | ||
super({ | ||
...fields, | ||
vectorStore: fields.smallStore, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why override this name?
* | ||
* The MatryoshkaRetrieval retriever performs two semantic searches. The first uses the smaller | ||
* store which returns many results, but is less accurate. Then, using the documents returned | ||
* from the smaller store, the larger store is used to perform a more accurate search over the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you go into a bit more detail as to how this works? What does "more accurate search" mean?
query, | ||
this.largeK, | ||
(doc: DocumentInterface) => | ||
smallResultIds.includes(doc.metadata[this.idKey]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filter will not always be a function will it?
const smallResultIds = smallResults.map( | ||
(result) => result.metadata[this.idKey] | ||
); | ||
const largeResults = await this.largeStore.similaritySearch( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait lmao this is just parent document isn't it?
MatryoshkaRetrieval
to lc experimental
MatryoshkaRetrieval
to lc experimentalMatryoshkaRetrieval
@@ -3646,6 +3650,15 @@ | |||
"import": "./retrievers/vespa.js", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there! I noticed that this PR introduces a new dependency for "./retrievers/matryoshka_retrieval" in the package.json file, potentially impacting the project's dependencies. This comment is to flag the change for maintainers to review the type of dependency change (peer/dev/hard). Keep up the great work!
MatryoshkaRetrieval
MatryoshkaRetriever
cc @nfcampos