[RFC] Add build-in TLS support #3368

pjbgf · 2022-12-04T09:34:20Z

Based on the discussion instigated by @rashedkvm.

More context can be found on the recordings from Sept 8th Flux Dev Meeting.

pjbgf · 2022-12-04T09:45:37Z

rfcs/0000-build-in-tls-support/README.md

+The TLS support could be expanded to the Notification controller webhooks endpoint
+to enable secure in-cluster communications.


The main issue I see with supporting this is the hybrid scenario in which we need to support in-cluster and external (via ingress). In such scenarios, the ingress would need to trust the CA used by Flux, which can be problematic from a security stand-point.

rashedkvm · 2022-12-04T23:16:56Z

Thank you @pjbgf. Will review shortly.

rashedkvm

@pjbgf thank you for restarting the encrypted communication discussion for Flux components. Am I understanding this correctly that this feature should not be used in a service mesh setup?

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>

pjbgf · 2022-12-05T10:44:47Z

@pjbgf thank you for restarting the encrypted communication discussion for Flux components. Am I understanding this correctly that this feature should not be used in a service mesh setup?

@rashedkvm The use of CNI or Service Mesh are alternatives, which could still be used regardless of the built-in support. However, I cannot see why users would use a Service Mesh to protect Flux inter-component comms when the built-in TLS is enabled, as IMO they are mutually exclusive. Although I could see cases in which some deployments may have CNI encryption + Flux built-in TLS.

stefanprodan · 2022-12-05T11:00:00Z

rfcs/0000-build-in-tls-support/README.md

+Controllers will watch the TLS secret for changes, once detected it will trigger a
+pod restart to refresh the certificates being used.


We can use https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/certwatcher/certwatcher.go and reload the cert without a pod restart. Especially for source-controller, a restart is very expensive and should be avoided.

I like the idea. This would cause all mid-flight requests to be dropped, which is a minor impact compared to rebuilding the storage. Besides, the mid-fight requests would be automatically retried within seconds anyway.

Yes, definitely that will be a considerable saving over the expensive restarts.

rashedkvm · 2022-12-05T21:21:33Z

@rashedkvm The use of CNI or Service Mesh are alternatives, which could still be used regardless of the built-in support. However, I cannot see why users would use a Service Mesh to protect Flux inter-component comms when the built-in TLS is enabled, as IMO they are mutually exclusive.

that is my understanding too. Thank you for the explanation!

phoban01 · 2022-12-06T12:16:10Z

rfcs/0000-build-in-tls-support/README.md

+- Change how CA trust is established across Flux containers.
+- Define TLS rotation mechanisms.
+
+## Proposal


Would the proposed architecture be compatible with SPIFFE/SPIRE? I can envisage a scenario where I would like to manage identity using SPIFFE SVIDs delivered via SPIRE rather than via Kubernetes TLS secrets. In practice I think this would mean adding a unix domain socket flag to the controllers with an accompanying listener to receive SVID bundles from the SPIRE agent.

Great question @phoban01. I see the built-in TLS support as orthogonal to the support of SPIFFE/SPIRE. The idea here is that users would be able to provision Flux with secure comms without requiring any additional components.

On that point, some service meshes should be able to implement that (mTLS based on SPIFFE/SPIRE) transparently on top of Flux, without any changes required to the source code. However, if there are good use cases for built-in support, that should be a RFC on its own right.

Spire support is nice, but will require for the Spire itself to be managed outside Flux. I think using native PKI is a good way of avoiding this as mentioned in the Alternatives section.

alukyan · 2023-06-06T21:28:43Z

rfcs/0000-build-in-tls-support/README.md

+## Proposal
+
+Controllers hosting web servers will introduce a new flag to configure the
+[TLS secret]: `--tls-secret`. The Secret type must be `kubernetes.io/tls`.


Why take dependency on a specific K8s secret hosting approach directly in the components? Can you please use a file reference instead. Then it will be easier to support other secret hosting approaches such as cloud key vaults.
For your information in our enterprise environments hosting any kind of secrets as K8s secrets is prohibited.

How are you using Flux then? All our APIs rely on Kubernetes Secrets, you can’t connect to Git/OCI/Helm repositories unless you store the auth in a secret.

hosting any kind of secrets as K8s secrets is prohibited

So you're using Secrets Store CSI Driver and mounting them directly into containers? I see why this might be a reasonable policy but as Stefan said, Flux doesn't work without Secrets at the moment and likely never will. Otherwise you'd have to mount all the Secrets any user might ever need into the Flux controller containers.

@makkes - correct, we use Secret CSI drivers, or you can create a wrapper/starter script to launch the binaries and this starter script pulls the secrets from key vaults and places them on tempfs volumes.

@stefanprodan - based on the documentation I see that Flux can use cloud identities to access artifact storages. We can still use K8s secrets as a config store for not secret material - such as public keys, application IDs, etc. But we cannot use K8s secrets for storing secret material like private keys.
Most Kubernetes components consume secrets as file references - then if someone wants to use K8s secrets to store secrets they can always mount them as files in the volume.

Consuming secret material directly from K8s secret resources is a gap that must be fixed for Flux to be used in different deployments that rely on consuming secrets as files.

Consuming secrets as files breaks Flux multi-tenancy. We rely on Kubernetes RBAC to isolate tenant, reading files from the container FS is not subject to RBAC while reading secrets from the API is. Have you considered enabling encryption at rest for etcd?

I might be confused here. But wouldn't the TLS k8s secret also mounted as a file in the pod. The certwatcher project suggested allows file based certs only.
https://github.com/kubernetes-sigs/controller-runtime/blob/a784ee78d04bdf216006adfbc313328a0aded88a/pkg/certwatcher/certwatcher.go#L36

Thus not sure how giving an option for a TLS on file-based key inside the pod can be less secure as even with secrets we need to do the same.

I understand going default with secret to not add any additional dependencies for the default secure path. But if the customer is willing to do the work to use a Secret CSI drivers to mount a file directly, shouldn't that be allowed.

The Secrets aren't mounted into the Flux Pods but read into memory through the Kubernetes API at runtime.

We could add a flag for pointing to a cert on disk, then it's up to the Flux admin to figure out the mounts and so on. With this flag, we'll read the cert at startup only, the Flux admin can bump some pod annotation in Git to restart the controllers to reload the cert. This flag e.g. --tls-file should be mutually exclusive with --tls-secret, if both are set, the controllers will panic and crash loop.

Thanks , this should resolve the cases where admin want to mount the cert on their own.
A bit of query, why not use use the certwatcher on the --tls-file also to avoid the restart ?

Why not use use the certwatcher on the --tls-file also to avoid the restart ?

Ah I haven't looked into that, if it supports it, then sure.

pjbgf added area/rfc Feature request proposals in the RFC format area/security Security related issues and pull requests labels Dec 4, 2022

pjbgf requested a review from rashedkvm December 4, 2022 09:34

pjbgf mentioned this pull request Dec 4, 2022

Enable storage server to serve https fluxcd/source-controller#967

Open

pjbgf commented Dec 4, 2022

View reviewed changes

pjbgf changed the title ~~rfc: Add build-in TLS support draft~~ RFC-NNNN: Add build-in TLS support draft Dec 4, 2022

pjbgf changed the title ~~RFC-NNNN: Add build-in TLS support draft~~ RFC-NNNN: Add build-in TLS support Dec 4, 2022

rashedkvm reviewed Dec 5, 2022

View reviewed changes

rfc: Add build-in TLS support draft

9a71003

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>

pjbgf force-pushed the rfc-build-in-tls-support branch from e06a13b to 9a71003 Compare December 5, 2022 10:38

stefanprodan reviewed Dec 5, 2022

View reviewed changes

pjbgf marked this pull request as ready for review December 6, 2022 10:48

phoban01 reviewed Dec 6, 2022

View reviewed changes

pjbgf changed the title ~~RFC-NNNN: Add build-in TLS support~~ [RFC] Add build-in TLS support Dec 7, 2022

stefanprodan marked this pull request as draft December 9, 2022 09:35

pjbgf assigned stefanprodan Dec 13, 2022

makkes mentioned this pull request Apr 15, 2023

[Security Pals] Flux multi-tenancy proposal cncf/tag-security#896

Closed

15 tasks

alukyan reviewed Jun 6, 2023

View reviewed changes

pjbgf closed this by deleting the head repository Jan 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Add build-in TLS support #3368

[RFC] Add build-in TLS support #3368

pjbgf commented Dec 4, 2022 •

edited

Loading

pjbgf Dec 4, 2022

rashedkvm commented Dec 4, 2022

rashedkvm left a comment

pjbgf commented Dec 5, 2022

stefanprodan Dec 5, 2022

pjbgf Dec 5, 2022

rashedkvm Dec 5, 2022

rashedkvm commented Dec 5, 2022 •

edited

Loading

phoban01 Dec 6, 2022

pjbgf Dec 13, 2022

tburko Feb 7, 2023

alukyan Jun 6, 2023

stefanprodan Jun 7, 2023

makkes Jun 7, 2023

alukyan Jun 7, 2023

stefanprodan Jun 7, 2023

anagg929 Jun 22, 2023

makkes Jun 22, 2023 •

edited

Loading

stefanprodan Jun 22, 2023

anagg929 Jun 22, 2023 •

edited

Loading

stefanprodan Jun 22, 2023

		The TLS support could be expanded to the Notification controller webhooks endpoint
		to enable secure in-cluster communications.

		Controllers will watch the TLS secret for changes, once detected it will trigger a
		pod restart to refresh the certificates being used.

[RFC] Add build-in TLS support #3368

[RFC] Add build-in TLS support #3368

Conversation

pjbgf commented Dec 4, 2022 • edited Loading

Choose a reason for hiding this comment

rashedkvm commented Dec 4, 2022

rashedkvm left a comment

Choose a reason for hiding this comment

pjbgf commented Dec 5, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rashedkvm commented Dec 5, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

makkes Jun 22, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

anagg929 Jun 22, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pjbgf commented Dec 4, 2022 •

edited

Loading

rashedkvm commented Dec 5, 2022 •

edited

Loading

makkes Jun 22, 2023 •

edited

Loading

anagg929 Jun 22, 2023 •

edited

Loading