Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: #1314 partition watcher doesn't reacts on partition number changing #1365

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

arxon31
Copy link

@arxon31 arxon31 commented Jan 19, 2025

Suggestion to fix issue #1314

Steps to reproduce the issue:

  1. Create topic
  2. Simultaneously connect to this topic with consumer group
  3. At some time, if Kafka didn't assigned partitions to topic (so topic is exists, but has zero partitions), you won't receive topic to consumer group because of empty assignment. So Kafka after that assigning partitions to topic
  4. At the start of partitionWatcher it scrapes assigned number of partitions and starts loop where monitors number of partitions assigned to topic

After all we have empty consumer group, and partitionWatcher monitoring for changing partition number with non-zero start number of partitions. So we have to increase partitions to receive topic into consumer group.

Solution:
We need to assign partitions number to topic in assignTopicPartitions function. So we run partitionWatcher with assigned in assignTopicPartition function number of partitions and don't scrape it inside partitionWatcher

P.S. In this PR I also changed partitionWatcher. Now it runs only if consumer is leader of consumer group. Kafka says this is correct

@arxon31 arxon31 changed the title Fix: consumer's group partition watcher doesn't sees partition number changing Fix: #1314 partition watcher doesn't reacts on partition number changing Jan 19, 2025
@arxon31
Copy link
Author

arxon31 commented Jan 27, 2025

@erikdw @petedannemann
Sorry for pinging, but I noticed that you reviewed the last few PRs that were merged.
Could you please consider my proposal to improve the work of the consumer group?

}

for _, topic := range config.Topics {
cg.partitionsPerTopic[topic] = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this being set to zero instead of the actual value?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petedannemann
This code can be removed. I forgot that I fill this map in the assignTopicPartitions function and only after that I run partitionWatcher

Should I remove this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah if it's not necessary let's remove this

if cg.config.WatchPartitionChanges {
for _, topic := range cg.config.Topics {
gen.partitionWatcher(cg.config.PartitionWatchInterval, topic)
if cg.config.WatchPartitionChanges && iAmLeader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide where you found this in the Kafka docs or code base? I guess this makes sense as the leader will trigger a consumer balance, which will let follows pick up partition changes

Copy link
Author

@arxon31 arxon31 Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petedannemann
So it's my mistake, when I said that kafka says that this is correct. I misunderstood Jun Rao from this video. He did not say who exactly should run the partition watcher mechanism. But it still seems to me that partitionWatcher should only be launched by the leader. This way we create less load on the cluster and if leader disconnects broker also triggers rebalance for group and leader election

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried about this causing some unintended regression. Could we break that change out into a separate PR and let us think it through for a bit?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants