Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community[patch]: Adds support for passing schemaName to pgvector #4543

Merged
merged 20 commits into from
Feb 29, 2024
Merged

community[patch]: Adds support for passing schemaName to pgvector #4543

merged 20 commits into from
Feb 29, 2024

Conversation

ZuesYousif
Copy link
Contributor

@ZuesYousif ZuesYousif commented Feb 27, 2024

Adds schemaName option to PGVectorStoreArgs
Adds extensionSchemaName option to PGVectorStoreArgs

Copy link

vercel bot commented Feb 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchainjs-api-refs 🛑 Canceled (Inspect) Feb 29, 2024 2:28am
langchainjs-docs 🛑 Canceled (Inspect) Feb 29, 2024 2:28am

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. auto:improvement Medium size change to existing code to handle new use-cases labels Feb 27, 2024
FOREIGN KEY (collection_id)
REFERENCES ${this.collectionTableName}(uuid)
ON DELETE CASCADE;
`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to run yarn format!

@@ -521,7 +561,23 @@ export class PGVectorStore extends VectorStore {
FOREIGN KEY (collection_id)
REFERENCES ${this.collectionTableName}(uuid)
ON DELETE CASCADE;
`);
` : `
CREATE TABLE IF NOT EXISTS ${this.schemaName}.${this.collectionTableName} (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be in quotes to account for capitalization?

@jacoblee93
Copy link
Collaborator

Looks good to me overall - main question is should these new/old queries be in quotes though to account for capitalizations?

@jacoblee93 jacoblee93 added the question Further information is requested label Feb 28, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Feb 28, 2024
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. size:L This PR changes 100-499 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. size:XXL This PR changes 1000+ lines, ignoring generated files. labels Feb 28, 2024
@ZuesYousif
Copy link
Contributor Author

Looks good to me overall - main question is should these new/old queries be in quotes though to account for capitalizations?

Good catch there, I've added some string wrapping around a few calls where it seems appropriate

@ZuesYousif
Copy link
Contributor Author

I found a bug where the operator <=> from pgvector was out of scope when pgvector is nested under a schema. To fix this I pushed a small change that sets the schema in a query prior to using the operator. I tried passing it under the same pool.query call however it said only one query is allowed & after testing the changes under a it's own call it seems to be working now.

@MJDeligan
Copy link
Contributor

Does setting the search_path only persist for that specific connection or does it change at the db level?

@ZuesYousif
Copy link
Contributor Author

Does setting the search_path only persist for that specific connection or does it change at the db level?

It looks like it only lasts for the duration of that connection. When I manually query the db with SELECT CURRENT_SCHEMA, CURRENT_SCHEMA(); after running this the db still returns public

@MJDeligan
Copy link
Contributor

Gotcha! Is that for the pool in general or is this tied to the client the pool creates to run the query? In the first case, we may run into issues if we ever allow passing a pool instance, instead of creating a new one (as proposed in #4409 and present in PostgresChatHistory) among other unforeseen consequences for users. In the latter case, we could get unlucky and the pool uses a different client for each query. Have you seen this ? Might be applicable here?.

@ZuesYousif
Copy link
Contributor Author

Gotcha! Is that for the pool in general or is this tied to the client the pool creates to run the query? In the first case, we may run into issues if we ever allow passing a pool instance, instead of creating a new one (as proposed in #4409 and present in PostgresChatHistory) among other unforeseen consequences for users. In the latter case, we could get unlucky and the pool uses a different client for each query. Have you seen this ? Might be applicable here?.

I haven't seen that yet, thanks that helps a lot in this case. I pushed a refactor that opts for the OPERATOR function instead which seems to fix the initial scope bug.

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Feb 29, 2024
@jacoblee93 jacoblee93 changed the title Adds support for passing schemaName to pgvector community[patch]: Adds support for passing schemaName to pgvector Feb 29, 2024
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Feb 29, 2024
@jacoblee93
Copy link
Collaborator

Thank you both!

@jacoblee93 jacoblee93 self-assigned this Feb 29, 2024
@jacoblee93 jacoblee93 added lgtm PRs that are ready to be merged as-is and removed question Further information is requested labels Feb 29, 2024
@vercel vercel bot temporarily deployed to Preview – langchainjs-docs February 29, 2024 02:28 Inactive
@vercel vercel bot temporarily deployed to Preview – langchainjs-api-refs February 29, 2024 02:28 Inactive
@jacoblee93 jacoblee93 merged commit ddd224c into langchain-ai:main Feb 29, 2024
14 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto:improvement Medium size change to existing code to handle new use-cases lgtm PRs that are ready to be merged as-is size:M This PR changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants