-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RHCLOUD-37466 - Implement zookie into api #352
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
The latest Buf updates on your PR. Results from workflow buf-pull-request / build (pull_request).
|
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
… requests Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good @lennysgarage. It has everything I was expecting. Let's wait for a couple of more reviews on this one.
@@ -36,5 +37,6 @@ message CheckResponse { | |||
// e.g. ALLOWED_CONDITIONAL = 3; | |||
} | |||
Allowed allowed = 1; | |||
Zookie checked_at = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any value to having checked_at
, deleted_at
, etc., rather than just zookie
? Just wondering why you named them like this?
I'm wondering whether we want to give the impression that zookies are like timestamps, or to stay completely opaque as to what they are. I don't know the answer, just thinking out loud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great question, it would not make much of a difference if it were all just zookie
. The reasoning is due to the fact that I was following how authzed had setup their zedtoken. With looked_up_at
for lookup responses https://buf.build/authzed/api/docs/main:authzed.api.v1#authzed.api.v1.LookupResourcesResponse and checked_at
for https://buf.build/authzed/api/docs/main:authzed.api.v1#authzed.api.v1.CheckPermissionResponse etc.
Considering it is just an opaque string, I think having names that sound like timestamps can be misleading. I think it would be good to just have a single name like zookie
for requests and response fields and won't have users thinking they can grab some extra timestamp value from what it is. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting that spicedb do it like this. :) I'm on the fence with this one, so whichever way you like is good with me. If nobody else comments, feel free to resolve one way or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether we want to give the impression that zookies are like timestamps, or to stay completely opaque as to what they are. I don't know the answer, just thinking out loud.
This – we should be very opaque about them IMO, because we can't guarantee they'll always be timestamps.
See my comment below on the "Zookie" name itself – I think we need a little less specific of a field name for these because across other FGA projects, the token we'd use isn't actually a logical timestamp, so there is arguably a better name for this. Maybe simply "consistency_token" or "read_token" (similar to what you had originally).
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
Two thoughts around usecases for fullyconsistent alongside zookies:
|
Thanks for raising these points and with this change, no we won't be able to do that. We don't expose the consistency level in the API, so like you said we're either always fully consistent or minimize latency/zookie. I guess we would need to implement a
ReadRelationships can be passed a zookie in the request as well. Since we're not exposing the consistency to the user we don't have a way for them to determine what they're going to do next (unless we modify the api like above). If the main usecase is for directly following up with a WriteRelationships call then it would be appropriate to change to full consistency always. |
I think this is a good point. We have assumed the existence of fully consistent checks in our candidate solutions, so I think we need them. Even if we were to suggest to notifications that they move from The simplest change for right now might be to just add boolean fully_consistent to check (partially exposing consistency). What do you/others think? (Other options include creating a new fully consistent check endpoint that could, for example, be locked down to inventory-api -- non-inventory uses maybe never get offered fully consistent anything? -- or merge the "fully consistent check" into the write endpoints, since a lot of the time we're using them to check access for a write. But my vote would be to keep things simple and just do the above.)
If we add a fully consistent check (above), we'd have the option of always performing that and using the zookie for read and write. Yes, it's an extra hop, but we can functionally do everything and can optimise in the future when we'll have a better idea of our requirements. |
Big agree on adding a |
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
The one thing I'll say on a bool versus a separate endpoint (like the ContentChangeCheck endpoint from Zanzibar) is it may be misleading to callers in a way that trends toward misuse- it's framing the request around how fresh you want the results to be (which is always the freshest possible, right?) rather than what you intend to do with them (authorize a view request vs authorizing something else.) We'll almost have to make that distinction in inventory because what it does with those two requests is pretty different, but I think it makes the API a little more self-evident here as well. (Put another way- I think it'd feel more natural to tell people to call CheckForUpdate or ContentChangeCheck when authorizing a change and Check or even CheckForRead/ReadCheck/etc to authorize reading data vs telling them to set fullyconsistent to true when making changes and false otherwise, though at the end of the day the two approaches are effectively the same.) |
Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>
@@ -49,3 +49,8 @@ message ObjectType { | |||
string namespace = 1 [(buf.validate.field).string.min_len = 1]; | |||
string name = 2 [(buf.validate.field).string.min_len = 1]; | |||
} | |||
|
|||
// The Zookie is used to provide consistency between write and read requests. | |||
message Zookie { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about exposing the "Zookie" name (and the exact concept) itself directly in the API.
What I mean by this is that we will probably benefit from making a bit more generic concept that fits better with other Zanzibar/FGA implementations, but still provides the same or similar API semantics.
e.g. something like maybe Revision
or ReadToken
or ConsistencyToken
... etc. We need to convey, "Here's your response and a token. You can use this token later to get data at least as recent as from this request. We promise nothing else about what this token means."
This value needn't be a logical timestamp as in a Zookie. It could simply be an opaque string that means "for this request, use full consistency," which would be compatible with OpenFGA, for example. OSO Cloud's "consistency token" is similar as well – it just guarantees reading-your-writes, but otherwise doesn't refer to a particular snapshot (in their case looks like a particular shard). For completeness, Permify, SpiceDB, and likely Warrant (now WorkOS FGA) use more by-the-book zookies.
The guarantees may not all be rigorously the same with different implementations, so I don't want to go down that rabbit hole. But for now we have to remain open to working with different implementations, even if there is the caveat that they have slightly different semantics. (In these examples though, for what we need, I think they would all work.)
This point is true regardless of whether we expose this in protos or not – even in our internal abstraction, we should remain as agnostic of the "Zanzibars" as possible until we absolutely have to commit to limiting that.
@@ -29,11 +29,13 @@ message LookupResourcesRequest { | |||
string relation = 2 [(buf.validate.field).string.min_len = 1]; | |||
SubjectReference subject = 3 [(buf.validate.field).required = true]; | |||
optional RequestPagination pagination = 4; | |||
optional Zookie zookie = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want a little more indirection here, although if this becomes an internal API this is not a big deal because we can refactor easily. But as a proto, I'd probably want more like a Consistency
message, of which one option is "use this token to get me at least as recent as whatever that token is bound to," leaving room for other options in the future without having to update every request where the token is appropriate input. (This is similar to how SpiceDB does it in this case, for good reason I think.)
Zookie zookie = 4; | ||
bool fully_consistent = 5; // defaults to false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment below on a "Consistency" option – however in the case of check, I think we should go the Zanzibar route here and have an explicitly different RPC for a fully consistent check. This is easier to use correctly, and more obvious if you misuse it.
Offering "fully_consistent" as an option is an attractive footgun that gets you (from a client perspective) full consistency (yay) at the cost of killing our scalability (ouch), from our perspective.
So for a regular check, I'd offer an optional consistency option, of which fully consistent is not an option. And then for fully consistent checks, I'd offer a "ContentChangeCheck" (or similarly named) which is what you should use when you are well... changing content! (See internal design doc)
(Technically maybe another, later option is that we actually keep one endpoint, but automatically detect the consistency level to use based on schema. E.g. you can annotate a relation as "this one allows writes/content changes" and we use that. A bit more advanced though, so I'd probably save that for later if at all.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, agree on the points listed here. Added in the latest commit another rpc for full consistency check updates CheckForUpdate
that remove the need of a consistency
option bool.
PR Template:
Describe your changes
NEW RPC: CheckForUpdate
Check
but always fully consistentExample output of running a create & check immediately:
Ticket reference (if applicable)
Fixes # https://issues.redhat.com/browse/RHCLOUD-37466
Checklist
Are the agreed upon acceptance criteria fulfilled?
Was the 4-eye-principle applied? (async PR review, pairing, ensembling)
Do your changes have passing automated tests and sufficient observability?
Are the work steps you introduced repeatable by others, either through automation or documentation?
The Changes were automatically built, tested, and - if needed, behind a feature flag - deployed to our production environment. (Please check this when the new deployment is done and you could verify it.)
Are the agreed upon coding/architectural practices applied?
Are security needs fullfilled? (e.g. no internal URL)
Is the corresponding Ticket in the right state? (should be on "review" now, put to done when this change made it to production)
For changes to the public API / code dependencies: Was the whole team (or a sufficient amount of ppl) able to review?