Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Opengpts #361

Merged
merged 18 commits into from
Jan 29, 2025
Merged

Conversation

lgesuellip
Copy link
Contributor

@lgesuellip lgesuellip commented Oct 24, 2024

Hi Team,

As a member of the Pampa Team, I’ve been working on this PR to upgrade OpenGPTs to the latest version of Langchain dependencies. This update ensures compatibility with Pydantic 2 and resolves issues with related packages.

Code changes

  • Migration to Pydantic 2

    • Migrated the codebase to use Pydantic 2.
  • Langchain Dependency Upgrades

    • Updated all langchain dependencies to their latest versions (like langchain, langchain-core, langgraph, etc).
    • Removed langchain-robocorp as it is currently incompatible with Pydantic 2.
    • Updated the unstructured dependency to resolve issues related to nltk and its associated packages, such as punkt.
  • Code Adaptations

    • Refactored the checkpoint logic to use the AsyncPostgresSaver from the langgraph implementation for improved compatibility and performance.
  • Updated schemas to work with Pydantic 2's BaseModel.

  • Fixed bugs using GPT-4o.

Testing

  • Updated test cases to ensure compatibility with the new codebase.
  • Refactored and adapted tests to validate changes in schemas, checkpoint handling, and dependencies.

Related issues

#352

Looking forward to your feedback
Thank you Team!

backend/app/tools.py Outdated Show resolved Hide resolved
@lgesuellip
Copy link
Contributor Author

Hey team @eyurtsev @nfcampos , I just pushed all the changes.
I look forward to your review and feedback.
Thank you!

Copy link
Contributor

@eyurtsev eyurtsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing two issues:

  1. (MAJOR) w/ migration
  2. (minor) not 100% sure but UI doesn't seem to load the full screen for creating a new bot --- requiring clicking through on a tab. This could be something associated w/ data returned to the UI through one of the endpoints

backend/app/tools.py Show resolved Hide resolved
@@ -265,7 +264,7 @@ class ConfigurableRetrieval(RunnableBinding):
llm_type: LLMType
system_message: str = DEFAULT_SYSTEM_MESSAGE
assistant_id: Optional[str] = None
thread_id: Optional[str] = None
thread_id: Optional[str] = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not a None default?

Copy link
Contributor Author

@lgesuellip lgesuellip Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a lot of time debugging this part. The error indicates a conflict in the configuration specifications for thread_id.

The error occurs during validation in the following code:

@router.get("/config_schema")
async def config_schema() -> dict:
    """Return the config schema of the runnable."""
    return agent.config_schema().model_json_schema()

The issue seems to arise because there are two conflicting ConfigurableFieldSpec definitions for thread_id:
1. Definition 1: ConfigurableFieldSpec with annotation=typing.Optional[str] and default=None.
2. Definition 2: ConfigurableFieldSpec with annotation=<class 'str'> and default=''.

So, I decided to set the default to '', and it works. However, I would prefer to keep it as None. Do you know what might be causing the problem? The assistant_id is similar, but I don’t encounter this issue with it.

@@ -135,7 +132,7 @@ class ConfigurableAgent(RunnableBinding):
retrieval_description: str = RETRIEVAL_DESCRIPTION
interrupt_before_action: bool = False
assistant_id: Optional[str] = None
thread_id: Optional[str] = None
thread_id: Optional[str] = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not a None?

ALTER TABLE checkpoints
ADD COLUMN IF NOT EXISTS thread_ts TIMESTAMPTZ,
ADD COLUMN IF NOT EXISTS parent_ts TIMESTAMPTZ;
-- Drop existing checkpoints-related tables if they exist
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this migration fails for anyone that's run migrations 1 through 4 already. the migration state is kept of in the database.

Should this run as step 5 so it'll run at the end?

So if you try to run opengpts with the previous version, and then apply PR on top and run things -- the migrations will not work.


W/ current approach it seems like any old threads are no longer usable from the app. (I'm assuming not super easy to recover b/c of the pickle serde that was used).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right! I think the same thing happens in LangGraph when people decide to use the new checkpointer, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new checkpointers can be versioned as far as I understand

https://github.com/langchain-ai/langgraph/blob/main/libs/checkpoint-postgres/langgraph/checkpoint/postgres/base.py#L27

So at least going forward there's a way to carry out schema migrations automatically.

But yeah going from the pickle checkpointer -> new checkpointer was a breaking change. I'm OK if we don't worry about this, don't think this affects that many users.

I'd just prefer if we didn't wipe out any potential sql tables that users may want to recover data from

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Eugene,
I've implemented the changes we discussed previously:

  1. Migration Changes

    • Up:
      • Updated migration 5 to rename the existing checkpoints table (preserving old data) instead of modifying it.
      • The checkpoint_blobs and checkpoint_writes tables will now be created at runtime.
    • Down:
      • The migration properly reverts these changes by restoring the original table name.
        Note: This is a breaking change – old checkpoints won’t be accessible in the new system.
  2. Runtime Setup

    • Added an ensure_setup() call in the lifespan event, which will call the async_postgres_saver.setup() method.
    • This ensures that the new checkpoint tables (checkpoints, writes, and blobs) are properly created during application startup.

I’ve tested the migration path from the old version to the new one, and it works as expected. The old data is preserved in the renamed table, while the new checkpoint system operates seamlessly with its updated table structure.

backend/migrations/000002_checkpoints_update_schema.up.sql Outdated Show resolved Hide resolved
backend/app/schema.py Show resolved Hide resolved
if isinstance(value, list) and all(isinstance(v, BaseMessage) for v in value):
loaded["channel_values"][key] = [v.__class__(**v.__dict__) for v in value]
return loaded
class AsyncPostgresCheckpoint(BasePostgresSaver):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered initializing on app start up and calling .setup() to set up the migration, and then avoiding doing the wrapping of the checkpointer?

It'll help keep the checkpoints in sync and remove some extra code here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Eugene,

Checkpoint:

I’ve run into a few challenges while implementing langgraph’s checkpoint:

  1. Global Checkpointer Initialization:
    I defined the checkpointer in the lifespan and declare it as a global in agent.py. However, I encountered the error “Checkpointer not initialized” because the global instance wasn’t properly initialized before being accessed.
    This issue occurs because the checkpointer depends on the application startup completing before it can be used.

  2. Singleton Pattern Issues:
    I tried using a singleton pattern for AsyncPostgresSaver to manage the global instance, initializing it during the lifespan. However, the initialization of AsyncPostgresSaver requires an async event loop, which isn’t always available—such as during testing—resulting in the error: “no running event loop.”

  3. Current Implementation:
    I implemented a solution inspired by the current approach in OpenGPTs, adapted to use the new checkpointer in LangGraph:

  • Singleton with Lazy Initialization: Created a BasePostgresSaver class with a singleton pattern that assigns the instance before initialization.
  • Async Setup Method: Moved the connection pool creation to an async setup() method, ensuring it initializes during the lifespan of the application when an asynchronous loop is available.

I’m open to trying a different approach if you have any suggestions or recommendations!

Migration:

Finally, I decided to run the migrations before using the app to stay consistent with OpenGPTs and ensure the queries are ready. I am using the same .sql.

What do you think?

@lgesuellip
Copy link
Contributor Author

Fix state handling inconsistency between different agent types:

The issue:

  • tool_agent and chatbot use a simple list of messages as their state
  • retrieval agent uses a dict with both messages and additional metadata (msg_count)
  • This was causing conflicts when updating state between different agent types

Changes:

  • Adapted update_thread_state to detect agent type from current state:
    • For retrieval agent: preserve additional state data while updating messages
    • For other agents: maintain simple message list state

@lgesuellip
Copy link
Contributor Author

@eyurtsev I've added the changes we discussed.
I'd love your feedback.

Copy link
Contributor

@eyurtsev eyurtsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! @lgesuellip changes look good

@eyurtsev
Copy link
Contributor

probably need to bump poetry version used on CI. i'm taking a look

backend/pyproject.toml Outdated Show resolved Hide resolved
@eyurtsev
Copy link
Contributor

Removed the package-mode instead -- there are some other breaking changes in poetry 2.0, so just using 1.5.1 instead

@eyurtsev eyurtsev merged commit 2cf3bf7 into langchain-ai:main Jan 29, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants