Cluster state refresher #2429
Conversation
Force-pushed from fae0c62 to 0e90b2e
Force-pushed from 835d7da to 1bf7793
Force-pushed from 3f976c4 to 9490ed2
Summary: deprecate the old cluster state that included information about partition state. A new ClusterState object is introduced that only holds liveness information (a sketch of such a liveness-only type follows these summaries).
Summary: Types used by nodes to share cluster state
Summary: Simple ping mechanism to collect and maintain a local view of cluster liveness state
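For illustration only, here is a rough sketch of what a liveness-only ClusterState could look like; the field and type names are assumptions made for the sketch, not the actual definitions introduced in this stack.

use std::collections::BTreeMap;

// Stand-ins for the real identifier and timestamp types (PlainNodeId,
// MillisSinceEpoch) so the sketch is self-contained.
type PlainNodeId = u32;
type MillisSinceEpoch = u64;

/// Liveness of a single node as observed locally.
#[derive(Debug, Clone)]
enum NodeState {
    /// The node was seen recently.
    Alive { last_seen: MillisSinceEpoch },
    /// The node has not been seen within the failure-detection window.
    Dead { last_seen: Option<MillisSinceEpoch> },
}

/// Cluster-wide view holding only liveness information,
/// with no partition-processor state.
#[derive(Debug, Clone, Default)]
struct ClusterState {
    nodes: BTreeMap<PlainNodeId, NodeState>,
}

impl ClusterState {
    fn empty() -> Self {
        Self::default()
    }
}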
networking: Networking<T>,
nodes: BTreeMap<PlainNodeId, NodeTracker>,
heartbeat_interval: Duration,
cluster_state_watch_tx: watch::Sender<Arc<ClusterState>>,
Do you imagine situations where users of ClusterState would be interested in waiting for a state change of the entire ClusterState, or is it more likely that they'd be interested in a certain node?
And with respect to the latter, how can this be achieved with this design?
It's actually a pretty common pattern here to wait for a certain state of the entire cluster or of a single node. While this can be accomplished by waiting on changes of the cluster state in a state machine, we could maybe make the experience a little better by providing a wrapper on top of the watch that makes it easier to wait on more complex conditions.
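As a rough sketch of such a wrapper, assuming tokio's watch channel and placeholder ClusterState/NodeState types (ClusterStateWatch, wait_for_node, and the fields below are assumptions, not part of this PR): callers could wait either on an arbitrary predicate over the whole state or on a single node reaching a target state, both built on watch::Receiver::wait_for.

use std::collections::BTreeMap;
use std::sync::Arc;
use tokio::sync::watch;

// Placeholder types standing in for the real ClusterState / PlainNodeId.
type PlainNodeId = u32;

#[derive(Debug, Clone, PartialEq)]
enum NodeState {
    Alive,
    Dead,
}

#[derive(Debug, Default)]
struct ClusterState {
    nodes: BTreeMap<PlainNodeId, NodeState>,
}

/// Thin wrapper over the watch receiver exposing higher-level waits.
struct ClusterStateWatch {
    rx: watch::Receiver<Arc<ClusterState>>,
}

impl ClusterStateWatch {
    /// Wait until an arbitrary predicate over the whole cluster state holds.
    async fn wait_for(
        &mut self,
        predicate: impl FnMut(&Arc<ClusterState>) -> bool,
    ) -> Result<Arc<ClusterState>, watch::error::RecvError> {
        let state = self.rx.wait_for(predicate).await?;
        // Clone the Arc so the watch read borrow is released immediately.
        Ok(Arc::clone(&*state))
    }

    /// Wait until a specific node reaches the given state.
    async fn wait_for_node(
        &mut self,
        node: PlainNodeId,
        target: NodeState,
    ) -> Result<Arc<ClusterState>, watch::error::RecvError> {
        self.wait_for(move |cs| cs.nodes.get(&node) == Some(&target))
            .await
    }
}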
Do you have an example use-case?
    last_attempt_at: Option<MillisSinceEpoch>,
}

pub struct ClusterStateRefresher<T> {
It seems that this is what we traditionally call a "FailureDetector"; perhaps we can call it that to avoid confusion?
Sounds better :)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NodePong {}
What do you imagine will be in the response, and do we actually need a response at all?
I see your point :) Pongs are definitely not needed as their own message type and can be dropped. The failure detector would still treat it as a ping anyway.
_ = &mut cancelled => {
    break;
}
todo: this is possibly a good place to inform peers that you are shutting down?
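If we went that way, a best-effort goodbye on shutdown might look roughly like the sketch below. NodeGoingAway, Networking, and send_best_effort are purely hypothetical stand-ins, not messages or APIs that exist in this PR; it would be called from the cancellation branch before the break.

// Hypothetical stand-ins so the sketch is self-contained; the real code
// would go through Networking<T> and the actual message types.
type PlainNodeId = u32;

#[derive(Debug, Clone)]
struct NodeGoingAway; // hypothetical "goodbye" message, not part of this PR

struct Networking;

impl Networking {
    async fn send_best_effort(&self, _peer: PlainNodeId, _msg: NodeGoingAway) {
        // Placeholder: fire-and-forget send, errors ignored.
    }
}

/// Best-effort: notify peers that this node is shutting down so they can
/// mark it dead sooner instead of waiting out the failure-detection window.
async fn announce_shutdown(networking: &Networking, peers: &[PlainNodeId]) {
    for peer in peers {
        networking.send_best_effort(*peer, NodeGoingAway).await;
    }
}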
tokio::select! {
    result = cluster_state_refresher.run() => {
        result
    }
    _ = cancelled => {
        Ok(())
    }
Doesn't the refresher itself monitor the cancellation token?
yes, true. this is a mistake
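For reference, a minimal sketch of the intended shape, assuming a tokio_util CancellationToken in place of the project's own cancellation primitive: the refresher selects on cancellation inside its own run loop, so the caller can simply await run() without wrapping it in another select.

use std::time::Duration;
use tokio_util::sync::CancellationToken;

// Minimal stand-in; the real refresher also carries networking, node
// trackers, and the cluster-state watch sender.
struct ClusterStateRefresher {
    heartbeat_interval: Duration,
    cancellation: CancellationToken,
}

impl ClusterStateRefresher {
    async fn run(self) -> Result<(), Box<dyn std::error::Error>> {
        let mut heartbeat = tokio::time::interval(self.heartbeat_interval);
        loop {
            tokio::select! {
                // The refresher watches cancellation itself, so the caller
                // does not need an outer select around run().
                _ = self.cancellation.cancelled() => break,
                _ = heartbeat.tick() => {
                    // send pings / refresh the liveness view here
                }
            }
        }
        Ok(())
    }
}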
networking,
nodes: BTreeMap::default(),
heartbeat_interval: config.common.heartbeat_interval.into(),
cluster_state_watch_tx: watch::Sender::new(Arc::new(ClusterState::empty())),
Is there a reason why ClusterState is wrapped in an Arc?
Values borrowed from a watch hold a read lock, which means you should only borrow for very short periods of time and never across await points. Hence I think using an Arc is safer: we can cheaply clone the cluster state and release the borrow. Then you can pass this state snapshot around or use it across await points.
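A small illustration of that pattern, assuming tokio's watch channel (the ClusterState here is just a placeholder): the borrow is held only long enough to clone the Arc, and the cheap clone is what crosses await points.

use std::sync::Arc;
use tokio::sync::watch;

#[derive(Debug, Default)]
struct ClusterState {} // placeholder for the liveness view

async fn use_snapshot(mut rx: watch::Receiver<Arc<ClusterState>>) {
    // borrow_and_update() holds the watch's internal read lock, so keep the
    // borrow only long enough to clone the Arc...
    let snapshot: Arc<ClusterState> = Arc::clone(&*rx.borrow_and_update());
    // ...then the Arc clone (a refcount bump, not a deep copy) can safely be
    // held across await points or handed to other tasks.
    work_with(snapshot).await;
}

async fn work_with(_state: Arc<ClusterState>) {
    // placeholder for work that spans await points
}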
async fn on_pong(&mut self, mut msg: Incoming<NodePong>) -> Result<(), ShutdownError> {
    msg.follow_from_sender();

    trace!("Handling pong response");

    let tracker = self.nodes.entry(msg.peer().as_plain()).or_default();
    tracker.seen = Some(SeenState::new(msg.peer()));

    Ok(())
}
I'm really not sure if pong is needed. Nodes ping other nodes, and each node makes its own view based on the pings it has received.
Yes, makes sense!
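A rough sketch of the ping-only variant, with placeholder types loosely mirroring the snippet above (NodeTracker here tracks only a last-seen instant; the real fields may differ): every incoming ping is recorded as liveness evidence for its sender, and no pong message or on_pong handler is needed.

use std::collections::BTreeMap;
use std::time::{Duration, Instant};

// Stand-ins for the real identifier/tracker types.
type PlainNodeId = u32;

#[derive(Debug, Default)]
struct NodeTracker {
    last_seen: Option<Instant>,
}

#[derive(Debug, Default)]
struct FailureDetector {
    nodes: BTreeMap<PlainNodeId, NodeTracker>,
}

impl FailureDetector {
    /// Record an incoming ping as liveness evidence for its sender.
    fn on_ping(&mut self, peer: PlainNodeId) {
        let tracker = self.nodes.entry(peer).or_default();
        tracker.last_seen = Some(Instant::now());
    }

    /// A node counts as alive if it pinged us within the window.
    fn is_alive(&self, peer: PlainNodeId, window: Duration) -> bool {
        self.nodes
            .get(&peer)
            .and_then(|t| t.last_seen)
            .map(|seen| seen.elapsed() <= window)
            .unwrap_or(false)
    }
}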
pub struct BaseRole {
    processor_manager_handle: Option<ProcessorsManagerHandle>,
    incoming_node_state: MessageStream<GetNodeState>,
    processors_state_request_stream: MessageStream<GetPartitionsProcessorsState>,
Should this move to PPM?
If so, do we still need BaseRole or should we remove it?
It will eventually be deleted, but right now it still handles the GetPartitionsProcessorsState messages needed for CC operation.
Cluster state refresher
Summary: Simple ping mechanism to collect and maintain a local view of cluster liveness state
Stack created with Sapling. Best reviewed with ReviewStack.
ClusterState types #2434