pod.TerminationGracePeriodSeconds setting #15599

V2arK · 2024-10-29T20:05:09Z

Describe the feature

Right now, TerminationGracePeriodSeconds is set to rev.Spec.TimeoutSeconds:

https://github.com/knative/serving/blob/main/pkg/reconciler/revision/resources/deploy.go#L304

However, rev.Spec.TimeoutSeconds also specifies the timeout for in-flight requests.

I think these two values should be separated. In my project, I want to terminate the deployment without a graceful exit, but I want the timeout for in-flight requests to be as long as possible.
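To make the request concrete, here is a minimal sketch of the separation I have in mind. The types are simplified stand-ins, not the real Knative API: the `TerminationGracePeriodSeconds` field on the revision spec and the `gracePeriod` helper are hypothetical names used only for illustration. When the new field is unset, the sketch falls back to `TimeoutSeconds`, which matches the current behavior described above.

```go
package main

import "fmt"

// RevisionSpec is a simplified, hypothetical stand-in for the revision spec.
// Only TimeoutSeconds corresponds to the setting discussed in this issue.
type RevisionSpec struct {
	// TimeoutSeconds bounds how long an in-flight request may take.
	TimeoutSeconds *int64
	// TerminationGracePeriodSeconds is a HYPOTHETICAL new field that would
	// control only how long the pod gets before it is killed.
	TerminationGracePeriodSeconds *int64
}

// gracePeriod (hypothetical helper) picks the value to put on the pod spec:
// the dedicated field if set, otherwise TimeoutSeconds as happens today.
func gracePeriod(spec RevisionSpec) *int64 {
	if spec.TerminationGracePeriodSeconds != nil {
		return spec.TerminationGracePeriodSeconds
	}
	return spec.TimeoutSeconds
}

// ptr returns a pointer to v, for building optional fields.
func ptr(v int64) *int64 { return &v }

func main() {
	// Current behavior: grace period follows the request timeout.
	timeoutOnly := RevisionSpec{TimeoutSeconds: ptr(600)}
	// Requested behavior: long request timeout, short grace period.
	separated := RevisionSpec{
		TimeoutSeconds:                ptr(600),
		TerminationGracePeriodSeconds: ptr(5),
	}

	fmt.Println(*gracePeriod(timeoutOnly)) // 600
	fmt.Println(*gracePeriod(separated))   // 5
}
```

The fallback keeps existing revisions unchanged, so only users who opt in would get a shorter grace period.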


skonto · 2024-10-30T10:36:31Z

> I want to terminate deployment without graceful exit, but I want the timeout for in-flight request to be as long as possible.

Hi @V2arK, this was added years ago to guarantee that connections are not dropped during autoscaling. The Knative autoscaler continuously makes decisions about the deployment scale, and those decisions can interrupt connections during pod shutdown.
Could you elaborate on your use case? Do you not care about failing requests?

V2arK · 2024-10-31T18:46:56Z

Hi @skonto, in my use case I just want to terminate the pods ASAP (within maybe 3~5 seconds) when I trigger the termination, without changing the timeout for requests (e.g., an LLM that takes minutes to produce a response).

skonto · 2024-11-27T11:57:49Z

Hi @V2arK

> when I trigger the termination

How do you trigger that? Are you removing the Knative service? Could you just drop the connections when a SIGTERM is received, either in the LLM Python runtime or on the client side? If you interrupt the connections, draining will happen pretty fast, as QP (queue-proxy) will not wait for them to finish.

V2arK · 2024-12-17T17:10:24Z

Hi @skonto,
Sorry for the late reply; your message got buried among other communications, and I missed it.

In our workflow, we use ArgoCD to deploy a serving.knative.dev/v1 Helm chart. When we signal a termination or deletion of a deployment via ArgoCD, we aim to delete the entire deployment quickly. Ideally, we would like Kubernetes to send a SIGKILL right away (so we can save as much dime as we can :) ), but the delay before the SIGKILL is currently controlled by rev.Spec.TimeoutSeconds.

However, we don't want to change the timeout for in-flight requests; we want it to remain as long as possible.

V2arK added the kind/feature label (Well-understood/specified features, ready for coding.) on Oct 29, 2024

skonto mentioned this issue Dec 13, 2024

How to make queue-proxy wait for long requests to finish? #15649

Closed
