Autoscaler slow memory leak #15624

Open
DavidR91 opened this issue Nov 21, 2024 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

DavidR91 commented Nov 21, 2024

What version of Knative?

1.16.0

Expected Behavior

The autoscaler is able to GC and otherwise keep memory in check, avoiding an OOM kill.

Actual Behavior

There is a visible leak in the autoscaler in our environment, showing up roughly every 10 hours. This creates a constant upward trend in memory use.

Although there is some attempt to GC and reduce usage as the memory limit is approached, it is never quite enough, and the pod does eventually OOM and restart.

[image: graph of the autoscaler's memory usage showing the constant upward trend and eventual OOM]

About our environment:

  • GKE Kubernetes 1.30.6
  • knative 1.16, istio 1.23.3 and net-istio 1.16
  • The autoscaler is given a request and limit of 2 CPU and 2Gi of memory (Guaranteed QoS)
  • The autoscaler is configured in HA mode: we have it scaled so there are 3 replicas running at all times
    • Notably, when the primary autoscaler OOMs we see a significant spike in request errors, because the restart seems to negatively affect the activator; this is the main reason we want to solve this
    • The leak only seems to affect the primary/leader
  • We typically have about 200-300 different knative services. Most of them have on average ~3 revisions at any one time
    • The graph above was captured while the cluster was almost entirely idle; for most of that period there were no service pods running at all
  • We've set GOMEMLIMIT to 1.7GiB to see if this helps keep it under control, but it has no real effect (the pod stays alive longer but still eventually OOMs); see the sketch after this list
  • Nothing in particular happens in our environment on a 10-hour cycle (we have jobs and service creation/deletion occurring on 24-hour cycles, typically at 8 AM and midnight)
  • The same issue was observed in knative 1.9.2
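
For reference, the GOMEMLIMIT experiment above is just an extra environment variable on the autoscaler Deployment; roughly the following (an illustrative command, not our exact manifest):

# Set a soft Go runtime memory limit on the autoscaler (illustrative; 1740MiB is roughly 1.7GiB)
kubectl -n knative-serving set env deployment/autoscaler GOMEMLIMIT=1740MiB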

Steps to Reproduce the Problem

What would be useful to repro/diagnose this? Would the minimum be a debug-level log from the autoscaler over the ~10 hours where the issue occurs?

DavidR91 added the kind/bug label on Nov 21, 2024

skonto commented Nov 26, 2024

Hi @DavidR91,

Could you show more about the pod status (kubectl describe pod ...)? What is the behavior of the istio sidecar?
In the past there was a similar issue that came from the istio side.

What would be useful to repro/diagnose this?

Could you provide the logs of the autoscaler?
Could you take a heap dump during the time that the issue occurs?

You can enable profiling as follows.

On one terminal:

cat <<EOF | kubectl apply -f -
apiVersion: v1
data:
  profiling.enable: "true"
kind: ConfigMap
metadata:
  name: config-observability
  namespace: knative-serving
EOF

kubectl port-forward <pod-name> -n knative-serving 8008:8008

On another terminal:
$ go tool pprof http://localhost:8008/debug/pprof/heap
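
If you prefer to keep the raw profiles around for attaching here or for diffing later, you can also fetch the endpoint directly while the port-forward is running; the file name below is arbitrary:

# Save a heap snapshot to a file for later comparison (arbitrary file name)
curl -s -o autoscaler-heap-001.pb.gz http://localhost:8008/debug/pprof/heap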

DavidR91 commented Nov 26, 2024

In the past there was #8761 that came from the istio side.

We are only using istio's gateways; we don't use the sidecar or any sidecar injection at all (we just have VirtualServices pointing at the knative gateway with a rewritten authority etc. for each service), so I don't think that issue is connected.
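
To illustrate the pattern (this is a sketch only; the names, hosts and gateway/service references below are the net-istio defaults plus made-up values, not our exact manifests), each service has something along these lines:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-service            # hypothetical service name
  namespace: my-namespace
spec:
  gateways:
    - knative-serving/knative-ingress-gateway   # the knative gateway installed by net-istio
  hosts:
    - my-service.example.com                    # made-up external host
  http:
    - rewrite:
        authority: my-service.my-namespace.svc.cluster.local   # the "rewritten authority"
      route:
        - destination:
            host: knative-local-gateway.istio-system.svc.cluster.local
            port:
              number: 80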

Getting debug logs is a bit more work, so I will follow up with those, but I have managed to enable profiling and get the pprof dumps.

I've attached two dumps taken only a few minutes apart; by the later one, memory use had grown by 1-2%. These were taken while the system was under load and the autoscaler was already at ~93% of its memory limit, so it was very close to OOMing.

pprof.autoscaler.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
pprof.autoscaler.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz

and a PNG version of the first dump for convenience:
[image: pprof graph rendering of the first heap dump]

I notice there is a lot of exporter/metric stuff here, and we do have Knative configured to send metrics to OTel via opencensus; is it enough of a presence in these dumps to suggest that OTel integration is the cause?
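
For context, the metrics side of that integration is just the standard config-observability opencensus backend, roughly as below (the collector address is illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-observability
  namespace: knative-serving
data:
  metrics.backend-destination: opencensus
  metrics.opencensus-address: "otel-collector.observability.svc:55678"   # illustrative collector address
  metrics.request-metrics-backend-destination: opencensus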

skonto commented Nov 27, 2024

is it enough of a presence in these dumps to suggest that OTel integration is the cause?

It does not seem so, even though it uses a lot of the allocated memory. I did a diff (go tool pprof -base prof1 prof2) of the profiles you posted. Here is the output:

[image: go tool pprof diff output between the two heap dumps]
Same if you pass inuse_objects:

[image: the same diff with inuse_objects]

The biggest increase, ~40MB, is in the streamwatcher. Could you take multiple snapshots and check the diff during no-load periods as well? Maybe it is related to kubernetes/kubernetes#103789 (comment)? Do you have a lot of pods coming up during load times (the autoscaler has a filtered informer for service pods)?
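
Something along these lines is enough to capture a snapshot every 30 minutes while the port-forward is up (the interval and file naming are arbitrary), and any pair can then be compared with -base as above:

# Capture a heap snapshot every 30 minutes via the port-forwarded profiling endpoint
while true; do
  curl -s -o "heap-$(date +%Y%m%d-%H%M).pb.gz" http://localhost:8008/debug/pprof/heap
  sleep 1800
done

# Compare any two snapshots
go tool pprof -base heap-<earlier>.pb.gz heap-<later>.pb.gz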

skonto commented Nov 27, 2024

By the way, the default resync period is ~10h; see https://github.com/knative/serving/blob/main/vendor/knative.dev/pkg/controller/controller.go#L54.
Is your cluster a large one? Is it slow?
