feat: New ScalingSet CRD to deploy isolated interceptors+scalers #1014

Open
wants to merge 10 commits into
base: main

Conversation

JorTurFer
Member

Provide a description of what has been changed

Checklist

Fixes #241

@JorTurFer JorTurFer requested a review from a team as a code owner May 5, 2024 00:27
@JorTurFer JorTurFer force-pushed the isolated-components branch 2 times, most recently from 4e793d6 to 6e1700a Compare May 12, 2024 19:53
Member

@wozniakjan wozniakjan left a comment

I like this approach very much. Here are a couple of minor nitpicks from the first pass; I will take a deeper look later today.

operator/apis/http/v1alpha1/httpscalingset_types.go Outdated (5 resolved threads)
Member

@zroubalik zroubalik left a comment

First pass of review

}

// HTTPScalingSetStatus defines the observed state of HTTPScalingSet
type HTTPScalingSetStatus struct{}
Member

We should provide some status about the resource, at least a Ready condition.
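As a rough illustration of what a non-empty status could carry, here is a minimal sketch. The `Condition` type is a local stand-in for `metav1.Condition` (only a subset of its fields), and the `Conditions` field, `SetCondition`, `IsReady`, and the reason strings are all assumptions made for this example, not the PR's actual schema. A real implementation would more likely use `metav1.Condition` together with the helpers in `k8s.io/apimachinery/pkg/api/meta`.

```go
package main

import "fmt"

// Condition is a local stand-in for metav1.Condition so the example has no
// external dependencies; only a subset of its fields is mirrored here.
type Condition struct {
	Type    string
	Status  string // "True", "False" or "Unknown"
	Reason  string
	Message string
}

// HTTPScalingSetStatus sketches a non-empty status. The Conditions field is
// an assumption for illustration, not the PR's actual schema.
type HTTPScalingSetStatus struct {
	Conditions []Condition
}

// SetCondition replaces the condition with the same Type, or appends it.
func (s *HTTPScalingSetStatus) SetCondition(c Condition) {
	for i := range s.Conditions {
		if s.Conditions[i].Type == c.Type {
			s.Conditions[i] = c
			return
		}
	}
	s.Conditions = append(s.Conditions, c)
}

// IsReady reports whether a Ready condition with status "True" is present.
func (s HTTPScalingSetStatus) IsReady() bool {
	for _, c := range s.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	var status HTTPScalingSetStatus
	// Hypothetical reasons: first the components are pending, then available.
	status.SetCondition(Condition{Type: "Ready", Status: "False", Reason: "DeploymentsPending"})
	status.SetCondition(Condition{Type: "Ready", Status: "True", Reason: "ComponentsAvailable"})
	fmt.Println(len(status.Conditions), status.IsReady())
}
```

Keeping one entry per condition type means consumers can look up Ready without scanning a history of transitions.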

ctx context.Context,
logger logr.Logger,
cl client.Client,
httpss metav1.Object,
Member

We should consider using duck types instead of metav1.Object here, to add some type safety.

Member

@wozniakjan also has an idea on this.

Member Author

WDYM? We do almost the same in KEDA with ScaledObjects and ScaledJobs. Would it be enough if I validated the type and returned an error if the type isn't correct?

Member

I mentioned to @zroubalik earlier that maybe we could explore type aliasing or trivial type definitions. In this case, both ScalingSets as well as ClusterScalingSets seem to have exactly matching structure so maybe we could do something along the lines of either

type ClusterScalingSet ScalingSet

or

type ClusterScalingSet = ScalingSet

I used the first one successfully with controller-runtime v1 some eons ago and haven't tried it since, but it could be a good fit for this use case unless controller-runtime has become stricter regarding types.

The implementation went something like this: https://go.dev/play/p/sxZiO_BDw4z
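For readers who cannot open the playground link, the difference between the two declarations can be sketched as follows. This is not the linked snippet; the struct field, the `Describe` method, and the `Defined...`/`Alias...` names are made up so that the behaviour is observable.

```go
package main

import "fmt"

// ScalingSet is a stand-in for the namespaced type; the field and method are
// hypothetical and exist only to make the example observable.
type ScalingSet struct {
	Replicas int
}

// Describe is declared on ScalingSet only.
func (s ScalingSet) Describe() string {
	return fmt.Sprintf("replicas=%d", s.Replicas)
}

// DefinedClusterScalingSet is a new, distinct type with the same underlying
// struct. It does not inherit Describe, and values need explicit conversion.
type DefinedClusterScalingSet ScalingSet

// AliasClusterScalingSet is just another name for ScalingSet: same type,
// same method set, no conversion needed.
type AliasClusterScalingSet = ScalingSet

// typeName reports the dynamic type of v as fmt's %T verb prints it.
func typeName(v any) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	base := ScalingSet{Replicas: 3}

	defined := DefinedClusterScalingSet(base) // explicit conversion required
	var alias AliasClusterScalingSet = base   // plain assignment works

	fmt.Println(typeName(defined)) // the defined type has its own name
	fmt.Println(typeName(alias))   // the alias reports the original type
	fmt.Println(alias.Describe())

	// defined.Describe() would not compile: methods are not carried over,
	// so the value is converted back before calling the method.
	fmt.Println(ScalingSet(defined).Describe())
}
```

Because an alias is the same type at runtime, code that distinguishes kinds by Go type could not tell the two apart, which may matter when each kind has to be registered separately. Whether controller-runtime or kubebuilder accepts either form is not something this sketch verifies.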

Member Author

I didn't know that this works with kubebuilder, but I can give it a try because yes, both types are exactly the same, just cluster-scoped or namespaced.

pkg/util/scalingset_validatior.go Outdated (resolved)
@JorTurFer
Member Author

I appreciate your feedback and I'll review it, but don't worry about reviewing the PR for the moment: it's still a WIP and there are a few pending items, like propagating the status info to the CRD, documentation, more test coverage, etc.

You can review it if you want, but there can be multiple changes until the final version :)

@JorTurFer JorTurFer marked this pull request as draft May 16, 2024 09:05
JorTurFer and others added 5 commits June 24, 2024 15:44
Signed-off-by: Jorge Turrado <jorge_turrado@hotmail.es>
Signed-off-by: Jorge Turrado <jorge_turrado@hotmail.es>
Signed-off-by: Jorge Turrado <jorge_turrado@hotmail.es>
Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
@JorTurFer
Member Author

I'm removing the changes applied to my local branch

Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
@JorTurFer JorTurFer force-pushed the isolated-components branch from b58f3f4 to 6286ab0 Compare June 24, 2024 13:57
Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
@JorTurFer JorTurFer marked this pull request as ready for review June 24, 2024 18:23
@JorTurFer JorTurFer changed the title WIP - feat: New ScalingSet CRD to deploy isolated interceptors+scalers feat: New ScalingSet CRD to deploy isolated interceptors+scalers Jun 24, 2024
Signed-off-by: Jorge Turrado <jorge.turrado@scrm.lidl>
@JorTurFer
Member Author

I've been talking about this with @zroubalik and, so as not to block you with the code refactor, we agreed to merge this PR as it is (it actually works). I'll open a follow-up PR in a few days/weeks (before the release for sure) adding the pending stuff to complete this PR (including, but not limited to):

  • Documentation
  • CRD Status propagation
  • Autoscaling
  • Support for global configurations (timeouts, TLS, watched namespaces, etc...)

@kahirokunn
Contributor

This feature is great!
I can't wait to start using it!

@kahirokunn
Contributor

@wozniakjan Hello!
Is there anything else you need to do to merge this PR?
If you need help, I'd be happy to help.

@kahirokunn
Contributor

I have prepared a helm chart 👍
kedacore/charts#703

@wozniakjan
Member

> @wozniakjan Hello!
> Is there anything else you need to do to merge this PR?

@kahirokunn I haven't been monitoring the progress here too closely but it seems there are a few pending items still - #1014 (comment)

@kahirokunn
Contributor

> #1014 (comment)

I’d like to provide an update and propose a way forward based on the current situation:
1. Current Status
  • I have thoroughly tested this PR in my environment, and I can confirm that it works very well as it stands.
  • While there are some pending items mentioned in #1014, these do not block the core functionality of this PR, and the PR is in a functional state that meets our immediate needs.
2. Proposal
  • Based on the above, I propose we merge this PR in its current form.
  • For the remaining tasks, I suggest we take a collaborative approach: we can divide the pending work among the team (myself included) and address these items in a follow-up PR before the release.
3. Benefits
  • This approach ensures we make progress without blocking the current work while maintaining our focus on completing the remaining tasks in a timely manner.

Please let me know your thoughts on this proposal, or if there are any concerns about moving forward with this plan!

Comment on lines +7 to +30
- apiGroups:
  - ""
  resources:
  - services
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
Member

I understand why this broad RBAC is added, but it effectively means the add-on can do anything to all deployments in the cluster. Is there any chance the scaling sets could be made optional?

I can imagine a dual deployment workflow configured on the helm chart level

  1. static - a single interceptor+scaler without scalingsets enabled (the current v0.8.0 workflow)
  2. dynamic - enable scalingsets CRD and this broad RBAC

Member

^^ THIS

Contributor

I'd like to share an updated proposal for your consideration. It works on the assumption that ScalingSets would always be enabled, and I've included example RBAC and Helm configurations to illustrate the approach. Would this work for you? This approach allows you to choose between narrower scoping (for multiple specific namespaces) and granting permissions cluster-wide.

The key idea is to always enable the CRD, but only grant the operator the ability to manage Deployments in namespaces where it is explicitly allowed. You do this by customizing Roles and RoleBindings (or, if desired, ClusterRoles and ClusterRoleBindings) according to your deployment preferences.


Example: Narrow RBAC for Specific Namespaces

Below is a simplified way to configure the operator’s ServiceAccount to manage Deployments in specific namespaces. In your Helm chart, provide a list of namespaces in which you want to manage ScalingSets. The example YAML would look like:

1. Values file (helm/keda-add-ons-http/values.yaml)

# Which namespaces you want the operator to access for scaling sets
scalingSets:
  targetNamespaces:
    - "my-namespace"
    - "another-namespace"

2. Role and RoleBinding templates

Below are sample templates that you can place under templates/ in your helm chart. They create a Role and a RoleBinding for each namespace listed in scalingSets.targetNamespaces.

Role template (helm/keda-add-ons-http/templates/role.yaml):

{{- range .Values.scalingSets.targetNamespaces }}
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: keda-operator-scalingsets-{{ . }}
  namespace: {{ . }}
rules:
  - apiGroups:
    - ""
    resources:
    - services
    verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
  - apiGroups:
    - apps
    resources:
    - deployments
    verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
---
{{- end }}

RoleBinding template (helm/keda-add-ons-http/templates/rolebinding.yaml):

{{- range .Values.scalingSets.targetNamespaces }}
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: keda-operator-scalingsets-{{ . }}
  namespace: {{ . }}
subjects:
  - kind: ServiceAccount
    name: keda-http-operator
    # Inside "range", "." is the current namespace string, so the root context is reached via "$".
    namespace: {{ $.Release.Namespace }}
roleRef:
  kind: Role
  name: keda-operator-scalingsets-{{ . }}
  apiGroup: rbac.authorization.k8s.io
---
{{- end }}

3. Installation command

When installing, you can override the list of namespaces by passing one “--set” flag per entry with array index notation (quoted so the shell does not treat the brackets as a glob pattern). For example:

helm install http-add-on kedacore/keda-add-ons-http \
  --set 'scalingSets.targetNamespaces[0]=my-namespace' \
  --set 'scalingSets.targetNamespaces[1]=another-namespace'

This ensures that the KEDA HTTP operator (which is always aware of the ScalingSet CRD) only has permissions to manage Deployments in the specified namespaces.


Example: Cluster-Wide Management

If you prefer a simpler, cluster-wide experience (i.e., wanting the operator to manage Deployments in all namespaces), you can replace the per-namespace Role/RoleBinding with a ClusterRole/ClusterRoleBinding. Here’s a quick example:

1. Values file (helm/keda-add-ons-http/values.yaml)

# "*" is used here to indicate cluster-wide usage; no individual namespaces needed
scalingSets:
  targetNamespaces: ["*"]

2. ClusterRole and ClusterRoleBinding templates

ClusterRole template (helm/keda-add-ons-http/templates/clusterrole.yaml):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: keda-operator-scalingsets-cluster
rules:
  - apiGroups:
    - ""
    resources:
    - services
    verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
  - apiGroups:
    - apps
    resources:
    - deployments
    verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch

ClusterRoleBinding template (helm/keda-add-ons-http/templates/clusterrolebinding.yaml):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: keda-operator-scalingsets-cluster
subjects:
  - kind: ServiceAccount
    name: keda-http-operator
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: keda-operator-scalingsets-cluster
  apiGroup: rbac.authorization.k8s.io

3. Installation command

helm install http-add-on kedacore/keda-add-ons-http

Running this will grant the operator cluster-wide permissions to manage Deployments in any namespace.


Putting It All Together

  1. The operator always lists all ScalingSets in the cluster, but the actual permission to create/update/delete deployments depends on RBAC scoping.
  2. For a minimal-scoped approach, create Roles/RoleBindings for each namespace of interest.
  3. For a broader approach, use a single ClusterRole/ClusterRoleBinding.
  4. In both scenarios, the CRD remains enabled, and no branching logic is required in the operator.

This way, you can fulfill the feature requirement for ScalingSets without forcing all users to grant cluster-wide permissions by default. It centralizes the choice (and the user experience) in Helm chart configuration.
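The choice described above can be sketched as a small decision function. It is only an illustration of the proposal's logic: the `Grant` type and `planGrants` are hypothetical names, and treating `"*"` as the cluster-wide marker follows the values example in this comment rather than any existing chart behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// Grant describes one RBAC object the chart would render (hypothetical type).
type Grant struct {
	Kind      string // "Role" or "ClusterRole"
	Name      string
	Namespace string // empty for cluster-scoped grants
}

// planGrants returns a single ClusterRole when "*" is listed, otherwise one
// Role per distinct namespace, sorted by name for stable output.
func planGrants(targetNamespaces []string) []Grant {
	seen := map[string]bool{}
	for _, ns := range targetNamespaces {
		if ns == "*" {
			return []Grant{{Kind: "ClusterRole", Name: "keda-operator-scalingsets-cluster"}}
		}
		seen[ns] = true
	}

	names := make([]string, 0, len(seen))
	for ns := range seen {
		names = append(names, ns)
	}
	sort.Strings(names)

	grants := make([]Grant, 0, len(names))
	for _, ns := range names {
		grants = append(grants, Grant{
			Kind:      "Role",
			Name:      "keda-operator-scalingsets-" + ns,
			Namespace: ns,
		})
	}
	return grants
}

func main() {
	for _, g := range planGrants([]string{"my-namespace", "another-namespace"}) {
		fmt.Printf("%s %s in %q\n", g.Kind, g.Name, g.Namespace)
	}
	for _, g := range planGrants([]string{"*"}) {
		fmt.Printf("%s %s\n", g.Kind, g.Name)
	}
}
```

In a chart, the same branch would have to live in the templates, since a per-namespace Role cannot be created in a namespace literally named `*`.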


Successfully merging this pull request may close these issues.

Enable dedicated interceptor/scalers with the same operator
4 participants