feat(controller:diskless): enable immediate partition reassignment #537
giuseppelillo merged 2 commits into main
Pull request overview
This PR enables immediate partition reassignment for diskless topics in Kafka's controller. Since diskless topics store data in object storage rather than on local disks, replica syncing is unnecessary, and reassignment can complete in a single step rather than going through the staged addingReplicas/removingReplicas process.
Changes:
- Skip the staged reassignment process for diskless topics by applying target replicas directly, with ISR containing only active brokers
- Exclude diskless topics from preferred-leader imbalance tracking and periodic leader balancing
- Add new tests covering fenced broker rejection, partial ISR with fenced brokers, and leader balancing skip behavior
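To make the difference between the two paths concrete, here is a minimal sketch of how the replica sets could be derived in each case. The class and method names are hypothetical and are not taken from `ReplicationControlManager`; the ordering of the merged list (target replicas first) is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the controller's actual code.
final class ReassignmentPlanSketch {

    // Staged path: brokers in the target set that are not yet replicas.
    static List<Integer> addingReplicas(List<Integer> current, List<Integer> target) {
        List<Integer> adding = new ArrayList<>(target);
        adding.removeAll(current);
        return adding;
    }

    // Staged path: current replicas that are absent from the target set.
    static List<Integer> removingReplicas(List<Integer> current, List<Integer> target) {
        List<Integer> removing = new ArrayList<>(current);
        removing.removeAll(target);
        return removing;
    }

    // Staged path: the partition temporarily holds the union of both sets
    // (assumed order: target replicas first, then remaining current ones).
    static List<Integer> stagedReplicas(List<Integer> current, List<Integer> target) {
        Set<Integer> merged = new LinkedHashSet<>(target);
        merged.addAll(current);
        return new ArrayList<>(merged);
    }

    // Diskless path: the target replicas are applied directly, so there is
    // no intermediate union and nothing left in adding/removing.
    static List<Integer> immediateReplicas(List<Integer> target) {
        return new ArrayList<>(target);
    }
}
```

For a move from [1,2,3] to [4,5,6], the staged helpers yield a six-broker intermediate replica list plus non-empty adding and removing lists, while the diskless helper yields [4,5,6] in one step.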
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java | Core logic: immediate reassignment for diskless, skip imbalance tracking, add isDisklessTopic helper |
| metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java | Updated existing tests to expect immediate completion; added tests for fenced broker edge cases and leader balancing skip |
For diskless topics, partition reassignment now completes immediately without the staged process (addingReplicas/removingReplicas). Since data is stored in object storage rather than local disk, there is nothing to sync between replicas.
Changes:
- Skip setting targetRemoving/targetAdding for diskless topics
- Use target replicas directly instead of merged replica list
- Update tests to expect immediate completion (no ongoing reassignment)
270533b to 31a563a
Pull request overview
This PR enables immediate partition reassignment for diskless topics in the Kafka controller. Since diskless topics store data in object storage rather than local disk, there's no need for the staged replica sync process (addingReplicas/removingReplicas). The PR also excludes diskless topics from the controller's preferred leader balancing, as a metadata transformer handles leader routing for these topics.
Changes:
- Skip staged reassignment for diskless topics: apply target replicas and ISR directly, rejecting reassignments where all target replicas are fenced
- Exclude diskless topics from imbalancedPartitions tracking and periodic leader balancing
- Extract isDisklessTopic() helper method and refactor existing inline checks to use it
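The exclusion from imbalance tracking can be sketched as follows. This is a simplified, hypothetical model: it keys topics by name and partitions by a `topic-partition` string, whereas the real controller has its own identifiers and data structures. Only the names `isDisklessTopic` and `imbalancedPartitions` come from the PR description; everything else is assumed for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of preferred-leader imbalance tracking.
final class ImbalanceTrackingSketch {
    private final Set<String> disklessTopics = new HashSet<>();
    private final Set<String> imbalancedPartitions = new HashSet<>();

    // Assumed setup hook so the sketch can be exercised on its own.
    void markDiskless(String topic) {
        disklessTopics.add(topic);
    }

    boolean isDisklessTopic(String topic) {
        return disklessTopics.contains(topic);
    }

    // Called whenever a partition's leader changes. Diskless partitions are
    // never recorded, so periodic leader balancing has nothing to act on.
    void updateImbalance(String topic, int partition, int leader, int preferredLeader) {
        String key = topic + "-" + partition;
        if (isDisklessTopic(topic) || leader == preferredLeader) {
            imbalancedPartitions.remove(key);
        } else {
            imbalancedPartitions.add(key);
        }
    }

    Set<String> imbalancedPartitions() {
        return new HashSet<>(imbalancedPartitions);
    }
}
```

With this shape, a classic partition whose leader differs from its preferred leader is tracked, while a diskless partition in the same state is ignored.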
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java | Core logic: immediate diskless reassignment, leader balancing skip, isDisklessTopic helper, imbalanced partition exclusion |
| metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java | Updated existing tests for immediate completion; added tests for fenced broker rejection, partial ISR, and leader balancing skip |
I wonder whether these changes cause a momentary offline state for the reassigned partitions. For instance, if we reassign a partition's replicas from [1,2,3] to [4,5,6], taking the union of the two sets was useful because it ensured that the partition doesn't go offline.
You state that since everything is in object storage we don't need to sync, which is true, but it may take some time to warm up the cache and work through the cache misses on fetch requests. This may also put extra load on the brokers and object storage. So my questions are:
- is it possible that partitions become offline in some cases during the reassignment and should we worry about it?
- should we worry about the performance impact of cache misses during reassignment and somehow pre-warm the target set instead?
Yes, but currently there is no proactive cache warm-up. It happens on-demand as new requests are received.
Partitions won't go offline during reassignment. The controller rejects the reassignment if none of the target brokers are active (unfenced and not in controlled shutdown). As long as at least one target broker is active, it will be elected leader immediately and be able to serve requests, even with an empty cache.
Yes, there will be a transient increase in fetch latency and object storage reads until the new leader's cache is populated. This happens on demand; there is no proactive cache warm-up today. I've added some additional comments to clarify these points.
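The rejection rule described in this reply can be sketched roughly as below. All names are hypothetical, the exception type is a stand-in for whatever error the controller actually returns, and picking the first active target broker as leader is an assumption; the reply only says that an active target broker is elected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of the immediate diskless reassignment outcome.
final class DisklessReassignmentSketch {

    static final class Result {
        final List<Integer> replicas;
        final List<Integer> isr;
        final int leader;

        Result(List<Integer> replicas, List<Integer> isr, int leader) {
            this.replicas = replicas;
            this.isr = isr;
            this.leader = leader;
        }
    }

    // activeBrokers: brokers that are unfenced and not in controlled shutdown.
    static Result reassign(List<Integer> targetReplicas, Set<Integer> activeBrokers) {
        // The ISR contains only the active members of the target set.
        List<Integer> isr = targetReplicas.stream()
                .filter(activeBrokers::contains)
                .collect(Collectors.toList());
        if (isr.isEmpty()) {
            // Assumed exception type; the real controller reports its own error.
            throw new IllegalArgumentException(
                    "No active broker in target replicas " + targetReplicas);
        }
        // Assumption: the first active target replica becomes leader.
        return new Result(new ArrayList<>(targetReplicas), isr, isr.get(0));
    }
}
```

For example, reassigning to [4,5,6] while broker 4 is fenced would give replicas [4,5,6], ISR [5,6], and leader 5 under these assumptions, whereas a target set with no active broker is rejected before any state changes.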
) * feat(controller:diskless): enable immediate partition reassignment
For diskless topics, partition reassignment now completes immediately without the staged process (addingReplicas/removingReplicas). Since data is stored in object storage rather than local disk, there is nothing to sync between replicas.
Changes:
- Skip setting targetRemoving/targetAdding for diskless topics
- Use target replicas directly instead of merged replica list
- Update tests to expect immediate completion (no ongoing reassignment)
(cherry picked from commit 1982f0e)
# Conflicts:
# metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
For diskless topics, partition reassignment now completes immediately without the staged process (addingReplicas/removingReplicas). Since data is stored in object storage rather than local disk, there is nothing to sync between replicas.
Changes: