pod.TerminationGracePeriodSeconds setting #15599
Comments
Hi @V2arK, this was added years ago to guarantee that connections are not dropped during autoscaling. The Knative autoscaler continuously makes decisions about the deployment scale, and that may interrupt connections during pod shutdown.
Hi @skonto, in my use case I just want to terminate the pods ASAP (maybe 3~5 seconds) when I trigger the termination, without changing the timeout for requests (e.g., an LLM that takes minutes to produce its response).
Hi @V2arK,
How do you trigger that? Are you removing the Knative service? Could you just drop the connections when a SIGTERM is received, either in the LLM Python runtime or on the client side? If you interrupt the connections, draining will happen pretty fast, as QP will not wait for them to finish.
Hi @skonto, In our workflow, we use ArgoCD to deploy a However, we don't want to change the timeout for in-flight requests; we want it to remain as long as possible.
Describe the feature
Right now `TerminationGracePeriodSeconds` is set to `rev.Spec.TimeoutSeconds`:
https://github.com/knative/serving/blob/main/pkg/reconciler/revision/resources/deploy.go#L304
However, `rev.Spec.TimeoutSeconds` also specifies the timeout for in-flight requests. I think these two values should be separated, because in my project I want to terminate the deployment without a graceful exit, but I want the timeout for in-flight requests to be as long as possible.
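To make the request concrete, here is a minimal sketch of how the two values could be decoupled. It assumes a hypothetical per-revision annotation (`example.dev/termination-grace-period-seconds`) and a hypothetical helper `terminationGracePeriod`; neither is an existing Knative API, and this is not how `deploy.go` is written today. The idea is only that the grace period falls back to the request timeout unless an explicit override is given.

```go
package main

import (
	"fmt"
	"strconv"
)

// gracePeriodAnnotation is a HYPOTHETICAL annotation key used only for this
// sketch; Knative does not define it.
const gracePeriodAnnotation = "example.dev/termination-grace-period-seconds"

// terminationGracePeriod returns the pod termination grace period in seconds.
// It uses the annotation override when it parses as a non-negative integer,
// and otherwise falls back to the request timeout, which mirrors the current
// behavior described in this issue.
func terminationGracePeriod(annotations map[string]string, timeoutSeconds int64) int64 {
	if raw, ok := annotations[gracePeriodAnnotation]; ok {
		if v, err := strconv.ParseInt(raw, 10, 64); err == nil && v >= 0 {
			return v
		}
	}
	return timeoutSeconds
}

func main() {
	// Long request timeout (10 minutes) with a short, explicit grace period.
	override := map[string]string{gracePeriodAnnotation: "5"}
	fmt.Println(terminationGracePeriod(override, 600)) // 5

	// No override: the grace period follows the request timeout.
	fmt.Println(terminationGracePeriod(nil, 600)) // 600
}
```

With something like this, `rev.Spec.TimeoutSeconds` could stay large for long-running responses while pods are still torn down within a few seconds. Whether an annotation, a new spec field, or a cluster-wide default is the right surface is left open.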