failing e2e test jobs after ControlPlaneKubeletLocalMode enabled by default #3154
Comments
@chrischdi and the dedicated fg=false job also started failing, oddly: edit: actually this one is clearer. this needs update:
|
external ca calls a kinder action setup-external-ca. it needs to be updated because it uses a naive approach to generate the same kubelet.conf on both workers and CP nodes. without that, the kubelet.conf will point to a non-existing local apiserver on worker nodes; instead it should point to the lb. i don't think there is a bigger issue here, i.e. we don't need to patch k/k. edit: hmm, but --control-plane-endpoint=%s is already the lb IP according to the kinder source, yet the file ends up with 172.17.0.5, which is the worker IP, and there is no apiserver there at port 6443. |
tested locally.
so that's a regression. we need to think about how the kubelet local mode will continue to respect the user provided clusterconfiguration.controlplaneendpoint or flag. i will send a revert PR until we fix all these issues. edit: here it is: |
this issue seems to be that runKubeletWaitBootstrapPhase assumes there is a real kubelet running. perhaps we should wrap the waiting
|
I'm planning to take a look at this next week. /assign |
Trying to iterate on the three issues which I call:
1. kinder dry-run
That is easily fixable and needs to be done in k/k. With the feature gate disabled and when using dry-run, we:
With the feature gate enabled we directly run
Example fix: chrischdi/kubernetes@65839db
2. kinder external-ca
I'm not sure if this is a regression or the intended outcome of the feature gate instead.
$ kubeadm init phase certs ca
$ kubeadm init phase kubeconfig all --control-plane-endpoint=foo.bar --v=5
$ cat /etc/kubernetes/controller-manager.conf | grep server
server: https://172.17.0.3:6443
$ cat /etc/kubernetes/scheduler.conf | grep server
server: https://172.17.0.3:6443
$ cat /etc/kubernetes/admin.conf | grep server
server: https://foo.bar:6443
3. kinder fg-disabled is failing
I'm now taking a look into this. |
makes sense.
historically the kcm and scheduler have been hardcoded to point to the local ip. it seems to me this breaking change is inevitable, but it should be mentioned in the release note of the graduation pr. one place where this breaks is kinder, like i mentioned earlier. so for the external ca workflow to pass this must be fixed here: |
Updated: kubernetes/kubernetes#129956
And added: #3157
1. kinder dry-run
Fixed in: via:
Testing at https://github.com/neolit123/kubeadm-test/actions/runs/13131235897
2. kinder external-ca
Fixed in: via:
Note: in case of ControlPlaneKubeletLocalMode one way out is to set
3. kinder fg-disabled is failing
Fixed in: via:
Testing at https://github.com/neolit123/kubeadm-test/actions/runs/13130880191 |
Changes merged and jobs are still green 🎉 xref: |
thanks for the fixes |
i suspect it's
because the other PR after is cosmetic (klog change)
failing jobs:
https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kubeadm-kinder-dryrun-latest/1884502182036770816/build-log.txt
https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kubeadm-kinder-external-ca-latest/1884261091186315264/build-log.txt
both cases need investigation. in one case it seems it's not reaching the kubelet and in the other the apiserver.
these don't seem like flakes as they failed consistently N times. these jobs are a bit uncommon, i.e. they do custom actions like dry-run/external ca.
the regular job is green:
cc @chrischdi