RHCLOUD-37466 - Implement zookie into api #352

lennysgarage · 2025-01-29T16:00:27Z

PR Template:

Describe your changes

NEW RPC: CheckForUpdate

Full consistency Check. Same behaviour as Check but always fully consistent
Still returns a zookie

Adds a zookie to each create/del/check/lookup requests as an optional field
Adds a zookie to each create/del/check/lookup response
- Zookie is always returned from a read/write request. (Even if full consistency is enabled.)
Tests updated to at least work with zookies. 1 test case so far to verify read after write consistency with zookie (TestSpiceDbRepository_CheckPermissionZookie)

Example output of running a create & check immediately:

CreateTuples:
{ 
  "createdAt": { 
    "token": "GgYKBENNY0c=" 
  } 
}
Check (passing zookie in request):
{
  "allowed": "ALLOWED_TRUE",
  "checkedAt": {
    "token": "GgYKBENNY0c="
  }
}

Ticket reference (if applicable)

Fixes # https://issues.redhat.com/browse/RHCLOUD-37466

Checklist

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

github-actions · 2025-01-29T16:00:41Z

The latest Buf updates on your PR. Results from workflow buf-pull-request / build (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`⏩ skipped`	`✅ passed`	`✅ passed`	Jan 31, 2025, 6:36 PM

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

… requests Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

merlante

Looks really good @lennysgarage. It has everything I was expecting. Let's wait for a couple of more reviews on this one.

internal/service/lookup_test.go

merlante · 2025-01-30T10:23:30Z

api/kessel/relations/v1beta1/check.proto

@@ -36,5 +37,6 @@ message CheckResponse {
 		// e.g.  ALLOWED_CONDITIONAL = 3;
 	}
 	Allowed allowed = 1;
+	Zookie checked_at = 2;


Is there any value to having checked_at, deleted_at, etc., rather than just zookie? Just wondering why you named them like this?

I'm wondering whether we want to give the impression that zookies are like timestamps, or to stay completely opaque as to what they are. I don't know the answer, just thinking out loud.

This is a great question, it would not make much of a difference if it were all just zookie. The reasoning is due to the fact that I was following how authzed had setup their zedtoken. With looked_up_at for lookup responses https://buf.build/authzed/api/docs/main:authzed.api.v1#authzed.api.v1.LookupResourcesResponse and checked_at for https://buf.build/authzed/api/docs/main:authzed.api.v1#authzed.api.v1.CheckPermissionResponse etc.

Considering it is just an opaque string, I think having names that sound like timestamps can be misleading. I think it would be good to just have a single name like zookie for requests and response fields and won't have users thinking they can grab some extra timestamp value from what it is. WDYT?

Interesting that spicedb do it like this. :) I'm on the fence with this one, so whichever way you like is good with me. If nobody else comments, feel free to resolve one way or the other.

I'm wondering whether we want to give the impression that zookies are like timestamps, or to stay completely opaque as to what they are. I don't know the answer, just thinking out loud.

This – we should be very opaque about them IMO, because we can't guarantee they'll always be timestamps.

See my comment below on the "Zookie" name itself – I think we need a little less specific of a field name for these because across other FGA projects, the token we'd use isn't actually a logical timestamp, so there is arguably a better name for this. Maybe simply "consistency_token" or "read_token" (similar to what you had originally).

deploy/kessel-relations.yaml

api/kessel/relations/v1beta1/common.proto

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

wscalf · 2025-01-30T21:04:03Z

Two thoughts around usecases for fullyconsistent alongside zookies:

Moving more toward Zanzibar..compliance? Normalness?..we'll need to be able to do fully consistent checks for non-view operations. Can we do that with this API change? It seems like it's either all fully consistent (by configuration) or minimize latency/zookie depending on whether or not one's passed.
ReadRelationships may need to be fully consistent more often than not (or even always.) Rationale: the main usecase for directly reading relationships (vs subjects/resources) is if a client intends to follow it with a WriteRelationships call (+ precondition?) to make some change relative to the current state. A stale read would be a problem there. (Though open to there being other usecases. Also not sure anything calls ReadRelationships atm, or that anything outside the management fabric should.)

lennysgarage · 2025-01-30T21:44:48Z

Two thoughts around usecases for fullyconsistent alongside zookies:

Moving more toward Zanzibar..compliance? Normalness?..we'll need to be able to do fully consistent checks for non-view operations. Can we do that with this API change? It seems like it's either all fully consistent (by configuration) or minimize latency/zookie depending on whether or not one's passed.

Thanks for raising these points and with this change, no we won't be able to do that. We don't expose the consistency level in the API, so like you said we're either always fully consistent or minimize latency/zookie. I guess we would need to implement a consistency level similar to spicedb to determine when we want fully consistent checks versus ones with a zookie/no zookie.

ReadRelationships may need to be fully consistent more often than not (or even always.) Rationale: the main usecase for directly reading relationships (vs subjects/resources) is if a client intends to follow it with a WriteRelationships call (+ precondition?) to make some change relative to the current state. A stale read would be a problem there. (Though open to there being other usecases. Also not sure anything calls ReadRelationships atm, or that anything outside the management fabric should.)

ReadRelationships can be passed a zookie in the request as well. Since we're not exposing the consistency to the user we don't have a way for them to determine what they're going to do next (unless we modify the api like above). If the main usecase is for directly following up with a WriteRelationships call then it would be appropriate to change to full consistency always.

merlante · 2025-01-31T10:27:44Z

Two thoughts around usecases for fullyconsistent alongside zookies:

Moving more toward Zanzibar..compliance? Normalness?..we'll need to be able to do fully consistent checks for non-view operations. Can we do that with this API change? It seems like it's either all fully consistent (by configuration) or minimize latency/zookie depending on whether or not one's passed.

Thanks for raising these points and with this change, no we won't be able to do that. We don't expose the consistency level in the API, so like you said we're either always fully consistent or minimize latency/zookie. I guess we would need to implement a consistency level similar to spicedb to determine when we want fully consistent checks versus ones with a zookie/no zookie.

I think this is a good point. We have assumed the existence of fully consistent checks in our candidate solutions, so I think we need them. Even if we were to suggest to notifications that they move from fully consistent to zookies/at least as fresh (with them managing zookies ad hoc for now), we couldn't actually support their use case fully with the current implementation.

The simplest change for right now might be to just add boolean fully_consistent to check (partially exposing consistency). What do you/others think?

(Other options include creating a new fully consistent check endpoint that could, for example, be locked down to inventory-api -- non-inventory uses maybe never get offered fully consistent anything? -- or merge the "fully consistent check" into the write endpoints, since a lot of the time we're using them to check access for a write. But my vote would be to keep things simple and just do the above.)

ReadRelationships may need to be fully consistent more often than not (or even always.) Rationale: the main usecase for directly reading relationships (vs subjects/resources) is if a client intends to follow it with a WriteRelationships call (+ precondition?) to make some change relative to the current state. A stale read would be a problem there. (Though open to there being other usecases. Also not sure anything calls ReadRelationships atm, or that anything outside the management fabric should.)

ReadRelationships can be passed a zookie in the request as well. Since we're not exposing the consistency to the user we don't have a way for them to determine what they're going to do next (unless we modify the api like above). If the main usecase is for directly following up with a WriteRelationships call then it would be appropriate to change to full consistency always.

If we add a fully consistent check (above), we'd have the option of always performing that and using the zookie for read and write. Yes, it's an extra hop, but we can functionally do everything and can optimise in the future when we'll have a better idea of our requirements.

lennysgarage · 2025-01-31T16:05:43Z

Two thoughts around usecases for fullyconsistent alongside zookies:

Moving more toward Zanzibar..compliance? Normalness?..we'll need to be able to do fully consistent checks for non-view operations. Can we do that with this API change? It seems like it's either all fully consistent (by configuration) or minimize latency/zookie depending on whether or not one's passed.

Thanks for raising these points and with this change, no we won't be able to do that. We don't expose the consistency level in the API, so like you said we're either always fully consistent or minimize latency/zookie. I guess we would need to implement a consistency level similar to spicedb to determine when we want fully consistent checks versus ones with a zookie/no zookie.

I think this is a good point. We have assumed the existence of fully consistent checks in our candidate solutions, so I think we need them. Even if we were to suggest to notifications that they move from fully consistent to zookies/at least as fresh (with them managing zookies ad hoc for now), we couldn't actually support their use case fully with the current implementation.

The simplest change for right now might be to just add boolean fully_consistent to check (partially exposing consistency). What do you/others think?

(Other options include creating a new fully consistent check endpoint that could, for example, be locked down to inventory-api -- non-inventory uses maybe never get offered fully consistent anything? -- or merge the "fully consistent check" into the write endpoints, since a lot of the time we're using them to check access for a write. But my vote would be to keep things simple and just do the above.)

Big agree on adding a fully_consistent bool to the check request that can be utilized to handle fully consistent checks when needed.

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

wscalf · 2025-01-31T16:41:50Z

The one thing I'll say on a bool versus a separate endpoint (like the ContentChangeCheck endpoint from Zanzibar) is it may be misleading to callers in a way that trends toward misuse- it's framing the request around how fresh you want the results to be (which is always the freshest possible, right?) rather than what you intend to do with them (authorize a view request vs authorizing something else.)

We'll almost have to make that distinction in inventory because what it does with those two requests is pretty different, but I think it makes the API a little more self-evident here as well. (Put another way- I think it'd feel more natural to tell people to call CheckForUpdate or ContentChangeCheck when authorizing a change and Check or even CheckForRead/ReadCheck/etc to authorize reading data vs telling them to set fullyconsistent to true when making changes and false otherwise, though at the end of the day the two approaches are effectively the same.)

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

alechenninger · 2025-01-31T18:58:17Z

api/kessel/relations/v1beta1/common.proto

@@ -49,3 +49,8 @@ message ObjectType {
 	string namespace = 1 [(buf.validate.field).string.min_len = 1];
 	string name = 2 [(buf.validate.field).string.min_len = 1];
 }
+
+// The Zookie is used to provide consistency between write and read requests.
+message Zookie {


I'm not sure about exposing the "Zookie" name (and the exact concept) itself directly in the API.

What I mean by this is that we will probably benefit from making a bit more generic concept that fits better with other Zanzibar/FGA implementations, but still provides the same or similar API semantics.

e.g. something like maybe Revision or ReadToken or ConsistencyToken ... etc. We need to convey, "Here's your response and a token. You can use this token later to get data at least as recent as from this request. We promise nothing else about what this token means."

This value needn't be a logical timestamp as in a Zookie. It could simply be an opaque string that means "for this request, use full consistency," which would be compatible with OpenFGA, for example. OSO Cloud's "consistency token" is similar as well – it just guarantees reading-your-writes, but otherwise doesn't refer to a particular snapshot (in their case looks like a particular shard). For completeness, Permify, SpiceDB, and likely Warrant (now WorkOS FGA) use more by-the-book zookies.

The guarantees may not all be rigorously the same with different implementations, so I don't want to go down that rabbit hole. But for now we have to remain open to working with different implementations, even if there is the caveat that they have slightly different semantics. (In these examples though, for what we need, I think they would all work.)

This point is true regardless of whether we expose this in protos or not – even in our internal abstraction, we should remain as agnostic of the "Zanzibars" as possible until we absolutely have to commit to limiting that.

alechenninger · 2025-01-31T19:03:28Z

api/kessel/relations/v1beta1/lookup.proto

@@ -29,11 +29,13 @@ message LookupResourcesRequest {
    string relation = 2 [(buf.validate.field).string.min_len = 1];
    SubjectReference subject = 3 [(buf.validate.field).required = true];
    optional RequestPagination pagination = 4;
+    optional Zookie zookie = 5;


We might want a little more indirection here, although if this becomes an internal API this is not a big deal because we can refactor easily. But as a proto, I'd probably want more like a Consistency message, of which one option is "use this token to get me at least as recent as whatever that token is bound to," leaving room for other options in the future without having to update every request where the token is appropriate input. (This is similar to how SpiceDB does it in this case, for good reason I think.)

alechenninger · 2025-01-31T19:09:51Z

api/kessel/relations/v1beta1/check.proto

+	Zookie zookie = 4;
+	bool fully_consistent = 5; // defaults to false


See my comment below on a "Consistency" option – however in the case of check, I think we should go the Zanzibar route here and have an explicitly different RPC for a fully consistent check. This is easier to use correctly, and more obvious if you misuse it.

Offering "fully_consistent" as an option is an attractive footgun that gets you (from a client perspective) full consistency (yay) at the cost of killing our scalability (ouch), from our perspective.

So for a regular check, I'd offer an optional consistency option, of which fully consistent is not an option. And then for fully consistent checks, I'd offer a "ContentChangeCheck" (or similarly named) which is what you should use when you are well... changing content! (See internal design doc)

(Technically maybe another, later option is that we actually keep one endpoint, but automatically detect the consistency level to use based on schema. E.g. you can annotate a relation as "this one allows writes/content changes" and we use that. A bit more advanced though, so I'd probably save that for later if at all.)

Yup, agree on the points listed here. Added in the latest commit another rpc for full consistency check updates CheckForUpdate that remove the need of a consistency option bool.

lennysgarage added 3 commits January 28, 2025 16:22

Add zookie proto def to api

cb090f2

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Disable fully consistent mode

5bc00f9

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Expose zookie in API + update tests

04229bb

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

lennysgarage added 2 commits January 29, 2025 11:10

Update test to use same zookie...

aebe3ee

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Add basic test cases that ensure zookie's handle read after write for…

df51211

… requests Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

merlante reviewed Jan 30, 2025

View reviewed changes

Rajagopalan-Ranganathan reviewed Jan 30, 2025

View reviewed changes

api/kessel/relations/v1beta1/common.proto Show resolved Hide resolved

lennysgarage added 4 commits January 30, 2025 09:30

Keep ephemeral env fullyConsistent

01e911a

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Add a new enemy problem-like test

4215432

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Compare returned zookie.

1c6b17e

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Check for full consistency

e9c6135

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Full consistency field for check requests

21a9c9e

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

Implement a CheckForUpdate rpc for full consistency checks

31e1718

Signed-off-by: Jonathan Marcantonio <jmarcant@redhat.com>

alechenninger reviewed Jan 31, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RHCLOUD-37466 - Implement zookie into api #352

RHCLOUD-37466 - Implement zookie into api #352

lennysgarage commented Jan 29, 2025 •

edited

Loading

github-actions bot commented Jan 29, 2025 •

edited

Loading

merlante left a comment

merlante Jan 30, 2025

lennysgarage Jan 30, 2025

merlante Jan 31, 2025

alechenninger Jan 31, 2025

wscalf commented Jan 30, 2025

lennysgarage commented Jan 30, 2025

merlante commented Jan 31, 2025

lennysgarage commented Jan 31, 2025

wscalf commented Jan 31, 2025

alechenninger Jan 31, 2025

alechenninger Jan 31, 2025

alechenninger Jan 31, 2025 •

edited

Loading

lennysgarage Jan 31, 2025

		Zookie zookie = 4;
		bool fully_consistent = 5; // defaults to false

RHCLOUD-37466 - Implement zookie into api #352

Are you sure you want to change the base?

RHCLOUD-37466 - Implement zookie into api #352

Conversation

lennysgarage commented Jan 29, 2025 • edited Loading

PR Template:

Describe your changes

Ticket reference (if applicable)

Checklist

github-actions bot commented Jan 29, 2025 • edited Loading

merlante left a comment

Choose a reason for hiding this comment

merlante Jan 30, 2025

Choose a reason for hiding this comment

lennysgarage Jan 30, 2025

Choose a reason for hiding this comment

merlante Jan 31, 2025

Choose a reason for hiding this comment

alechenninger Jan 31, 2025

Choose a reason for hiding this comment

wscalf commented Jan 30, 2025

lennysgarage commented Jan 30, 2025

merlante commented Jan 31, 2025

lennysgarage commented Jan 31, 2025

wscalf commented Jan 31, 2025

alechenninger Jan 31, 2025

Choose a reason for hiding this comment

alechenninger Jan 31, 2025

Choose a reason for hiding this comment

alechenninger Jan 31, 2025 • edited Loading

Choose a reason for hiding this comment

lennysgarage Jan 31, 2025

Choose a reason for hiding this comment

lennysgarage commented Jan 29, 2025 •

edited

Loading

github-actions bot commented Jan 29, 2025 •

edited

Loading

alechenninger Jan 31, 2025 •

edited

Loading