
feat(controller:diskless): enable immediate partition reassignment #537

Merged

giuseppelillo merged 2 commits into main from jeqo/pod-2122-enable-partition-reassign on Mar 16, 2026

feat(controller:diskless): enable immediate partition reassignment#537
giuseppelillo merged 2 commits intomainfrom
jeqo/pod-2122-enable-partition-reassign

Conversation

@jeqo
Contributor

@jeqo jeqo commented Mar 16, 2026

For diskless topics, partition reassignment now completes immediately without the staged process (addingReplicas/removingReplicas). Since data is stored in object storage rather than local disk, there is nothing to sync between replicas.

Changes:

  • Skip setting targetRemoving/targetAdding for diskless topics
  • Use target replicas directly instead of merged replica list
  • Update tests to expect immediate completion (no ongoing reassignment)
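The one-step path described above can be sketched as follows. This is a minimal illustration, not the actual ReplicationControlManager code: the `Plan` record, the `plan` method, and the `diskless` flag are hypothetical names used only to contrast the two paths.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: models the difference between the staged
// (classic) reassignment plan and the immediate diskless plan.
class ReassignmentSketch {
    // Hypothetical holder for a reassignment plan.
    record Plan(List<Integer> replicas, List<Integer> adding, List<Integer> removing) {}

    static Plan plan(List<Integer> current, List<Integer> target, boolean diskless) {
        if (diskless) {
            // Data lives in object storage: nothing to sync between replicas,
            // so apply the target replica set in one step with no
            // adding/removing stages.
            return new Plan(target, List.of(), List.of());
        }
        // Classic staged path: the merged replica list is target plus the
        // remaining current replicas, with adding = target minus current and
        // removing = current minus target.
        List<Integer> adding = target.stream().filter(b -> !current.contains(b)).toList();
        List<Integer> removing = current.stream().filter(b -> !target.contains(b)).toList();
        List<Integer> merged = new ArrayList<>(target);
        for (Integer b : current) {
            if (!merged.contains(b)) merged.add(b);
        }
        return new Plan(merged, adding, removing);
    }
}
```

For example, reassigning [1,2,3] to [4,5,6] on a diskless topic yields replicas [4,5,6] immediately with empty adding/removing lists, while the classic path would carry the merged list until the replica sync completes.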

Contributor

Copilot AI left a comment


Pull request overview

This PR enables immediate partition reassignment for diskless topics in Kafka's controller. Since diskless topics store data in object storage rather than on local disks, replica syncing is unnecessary, and reassignment can complete in a single step rather than going through the staged addingReplicas/removingReplicas process.

Changes:

  • Skip the staged reassignment process for diskless topics by applying target replicas directly, with ISR containing only active brokers
  • Exclude diskless topics from preferred-leader imbalance tracking and periodic leader balancing
  • Add new tests covering fenced broker rejection, partial ISR with fenced brokers, and leader balancing skip behavior

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files reviewed:
  • metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java - Core logic: immediate reassignment for diskless, skip imbalance tracking, add isDisklessTopic helper
  • metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java - Updated existing tests to expect immediate completion; added tests for fenced broker edge cases and leader balancing skip


@jeqo jeqo force-pushed the jeqo/pod-2122-enable-partition-reassign branch from 270533b to 31a563a Compare March 16, 2026 13:02
@jeqo jeqo requested a review from Copilot March 16, 2026 13:02
Contributor

Copilot AI left a comment


Pull request overview

This PR enables immediate partition reassignment for diskless topics in the Kafka controller. Since diskless topics store data in object storage rather than local disk, there's no need for the staged replica sync process (addingReplicas/removingReplicas). The PR also excludes diskless topics from the controller's preferred leader balancing, as a metadata transformer handles leader routing for these topics.

Changes:

  • Skip staged reassignment for diskless topics: apply target replicas and ISR directly, rejecting reassignments where all target replicas are fenced
  • Exclude diskless topics from imbalancedPartitions tracking and periodic leader balancing
  • Extract isDisklessTopic() helper method and refactor existing inline checks to use it
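The leader-balancing exclusion can be pictured with a minimal sketch. The class and method names here are assumptions for illustration, not the real controller internals; only the behavior (skip diskless topics when tracking preferred-leader imbalance) comes from the PR.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of excluding diskless topics from preferred-leader
// imbalance tracking; all names are hypothetical.
class ImbalanceTrackerSketch {
    private final Set<String> imbalanced = new HashSet<>();
    private final Set<String> disklessTopics;

    ImbalanceTrackerSketch(Set<String> disklessTopics) {
        this.disklessTopics = disklessTopics;
    }

    boolean isDisklessTopic(String topic) {
        return disklessTopics.contains(topic);
    }

    // Track a partition whose current leader differs from its preferred
    // leader, unless the topic is diskless: per the PR description, leader
    // routing for diskless topics is handled by a metadata transformer, so
    // the periodic balancer should skip them entirely.
    void trackIfImbalanced(String topic, int leader, int preferredLeader) {
        if (isDisklessTopic(topic)) return;
        if (leader != preferredLeader) imbalanced.add(topic);
    }

    Set<String> imbalanced() {
        return imbalanced;
    }
}
```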

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Files reviewed:
  • metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java - Core logic: immediate diskless reassignment, leader balancing skip, isDisklessTopic helper, imbalanced partition exclusion
  • metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java - Updated existing tests for immediate completion; added tests for fenced broker rejection, partial ISR, and leader balancing skip


@jeqo jeqo marked this pull request as ready for review March 16, 2026 13:57
Contributor

@viktorsomogyi viktorsomogyi left a comment


I wonder whether these changes could cause a momentary offline state for the reassigned partitions. For instance, when reassigning a partition's replicas from [1,2,3] to [4,5,6], taking the union of the two sets was useful because it ensured the partition never went offline.
You state that since everything is in object storage we don't actually need to sync, which is true, but it may take some time to warm up the cache and handle cache misses on fetch requests. This may also put extra load on the brokers and on object storage. So my questions are:

  • is it possible that partitions become offline in some cases during the reassignment and should we worry about it?
  • should we worry about the performance impact of cache misses during reassignment and somehow pre-warm the target set instead?

@jeqo
Contributor Author

jeqo commented Mar 16, 2026

it may take some time to warm up the cache and deal with the cache misses for fetch requests. This may also cause some extra load on the brokers and object storage.

Yes, but currently there is no proactive cache warm-up. It happens on-demand as new requests are received.

is it possible that partitions become offline in some cases during the reassignment and should we worry about it?

Partitions won't go offline during reassignment. The controller rejects the reassignment if none of the target brokers are active (unfenced and not in controlled shutdown). As long as at least one target broker is active, it will be elected leader immediately and can serve requests even with an empty cache.
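A sketch of that acceptance rule, under the assumption that "active" means unfenced and not in controlled shutdown. The class and method names are illustrative, not the actual controller code:

```java
import java.util.List;
import java.util.Optional;
import java.util.function.IntPredicate;

// Illustrative sketch: the ISR for a diskless reassignment is the active
// subset of the target replicas; an empty subset means the reassignment
// is rejected rather than leaving the partition offline.
class DisklessIsrSketch {
    static Optional<List<Integer>> isrFor(List<Integer> target, IntPredicate isActive) {
        List<Integer> isr = target.stream().filter(isActive::test).toList();
        // No active target broker: reject instead of producing an empty ISR.
        return isr.isEmpty() ? Optional.empty() : Optional.of(isr);
    }
}
```

With brokers 4 and 6 active but 5 fenced, reassigning to [4,5,6] is accepted with ISR [4,6]; reassigning to [5] alone is rejected.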

should we worry about the performance impact of cache misses during reassignment and somehow pre-warm the target set instead?

Yes, there will be a transient increase in fetch latency and object storage reads until the new leader's cache is populated; the cache fills on demand, as there is no proactive warm-up today.
We can look into pre-warming the cache when working on the Diskless log implementation and assess then whether we can use metadata to signal cache readiness.

I've added some additional comments to clarify these points.

@jeqo jeqo requested a review from viktorsomogyi March 16, 2026 15:26
Contributor

@viktorsomogyi viktorsomogyi left a comment


Thanks for the explanation, it's good to go then on my side.

@giuseppelillo giuseppelillo merged commit 1982f0e into main Mar 16, 2026
6 checks passed
@giuseppelillo giuseppelillo deleted the jeqo/pod-2122-enable-partition-reassign branch March 16, 2026 16:14
AnatolyPopov pushed a commit that referenced this pull request Mar 23, 2026
* feat(controller:diskless): enable immediate partition reassignment

(cherry picked from commit 1982f0e)

# Conflicts:
#	metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
jeqo added a commit that referenced this pull request Mar 23, 2026
* feat(controller:diskless): enable immediate partition reassignment
jeqo added a commit that referenced this pull request Mar 23, 2026
* feat(controller:diskless): enable immediate partition reassignment

(cherry picked from commit 1982f0e)

# Conflicts:
#	metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java