feat(diskless): add managed replicas routing to metadata transformer#504
feat(diskless): add managed replicas routing to metadata transformer#504giuseppelillo merged 5 commits intomainfrom
Conversation
6ddf9d8 to
742bb4f
Compare
deb173b to
2d516d0
Compare
742bb4f to
4de2bb4
Compare
2d516d0 to
8763a1f
Compare
4de2bb4 to
919d6a7
Compare
8763a1f to
064cb93
Compare
0380c0b to
90fa55a
Compare
… List Change return type from Iterable<Node> to List<Node> for simpler downstream usage. The underlying KRaftMetadataCache already returns a List, so this removes unnecessary abstraction.
064cb93 to
22ee4b1
Compare
There was a problem hiding this comment.
Pull request overview
Adds AZ-aware routing for diskless topics with managed replicas (RF > 1), with different fallback behavior depending on whether remote (tiered) storage is enabled, and introduces new transformer observability metrics.
Changes:
- Refactors diskless metadata transformation to support managed-replica routing with tiered vs diskless-only priority rules.
- Extends
MetadataViewto exposeisRemoteStorageEnabled(topicName)and returnsList<Node>for alive brokers. - Adds new metrics (
fallback-total,offline-replicas-routed-around,cross-az-routing-total) and expands unit tests.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| storage/inkless/src/main/java/io/aiven/inkless/metadata/InklessTopicMetadataTransformer.java | Implements managed-replica routing logic and integrates new metrics. |
| storage/inkless/src/main/java/io/aiven/inkless/metadata/ClientAzAwarenessMetrics.java | Adds new meters and recording APIs for transformer routing behavior. |
| storage/inkless/src/main/java/io/aiven/inkless/control_plane/MetadataView.java | Updates API for alive brokers list and adds remote-storage flag method. |
| storage/inkless/src/test/java/io/aiven/inkless/metadata/InklessTopicMetadataTransformerTest.java | Updates legacy RF=1 assumptions and adds managed-replica routing tests (cluster metadata). |
| storage/inkless/src/test/java/io/aiven/inkless/metadata/ClientAzAwarenessMetricsTest.java | Adds tests for newly introduced meters. |
| core/src/main/scala/kafka/server/metadata/InklessMetadataView.scala | Implements isRemoteStorageEnabled and updates alive brokers return type. |
| core/src/test/scala/kafka/server/metadata/InklessMetadataViewTest.scala | Adds unit coverage for isRemoteStorageEnabled. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Add MetadataView.isRemoteStorageEnabled() to check topic tiering config, needed for managed replicas routing decisions.
22ee4b1 to
0a3563d
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
storage/inkless/src/main/java/io/aiven/inkless/metadata/InklessTopicMetadataTransformer.java:399
selectLeaderLegacyclaims null client AZ is handled without a special check, butbrokersInAZ(listenerName, clientAZ)will return brokers whoserack()is null whenclientAZis null (becauseObjects.equals(bm.rack(), null)is true). That makes routing prefer rack-less brokers instead of selecting from all alive brokers when the client AZ is unknown. Consider short-circuitingbrokersInAZ/this call to return an empty list whenclientAZis null so the logic truly falls back toallAliveBrokers.
final String clientAZ = ClientAZExtractor.getClientAZ(clientId);
// This gracefully handles the null client AZ, no need for a special check.
final List<Node> brokersInClientAZ = brokersInAZ(listenerName, clientAZ);
// Fall back on all brokers if no broker in the client AZ.
final List<Node> brokersToPickFrom = brokersInClientAZ.isEmpty()
? allAliveBrokers(listenerName)
: brokersInClientAZ;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
0a3563d to
adf1424
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
02dd3c1 to
1e40a71
Compare
153b6c2 to
5691af2
Compare
5691af2 to
1468d2b
Compare
1468d2b to
2881e84
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Add support for managed replicas (RF > 1) in diskless topic metadata
transformation:
- For managed replicas, routing priority depends on remote storage:
- Tiered (remote storage enabled): same-AZ replica > cross-AZ replica
> unavailable
- Diskless-only: same-AZ replica > same-AZ any broker > cross-AZ
replica > cross-AZ any broker
- For unmanaged replicas (RF=1), preserve legacy hash-based selection
- Lazy-init alive broker snapshot to avoid unnecessary calls when no
diskless topics are in the response
Add metrics to track routing behavior and remove per-partition logging
in favor of metrics for sustained observability:
- fallback-total: Count of fallbacks to non-replica brokers
(diskless-only)
- offline-replicas-routed-around: Routing decisions when some replicas
are offline
- cross-az-routing-total: Requests routed to a different AZ than the
client
Update existing tests to use RF=1 to preserve original unmanaged
behavior.
Existing logging removed to rely on introduced metrics for Diskless
partitions.
2881e84 to
ddc2889
Compare
| metricsGroup.removeMetric(FALLBACK_TOTAL); | ||
| metricsGroup.removeMetric(OFFLINE_REPLICAS_ROUTED_AROUND); | ||
| metricsGroup.removeMetric(CROSS_AZ_ROUTING_TOTAL); | ||
| clientAzHitRatesPerAz.keySet().forEach(az -> metricsGroup.removeMetric(CLIENT_AZ_HIT_RATE, Map.of("client-az", az))); |
There was a problem hiding this comment.
Why is this renamed from "az" to "client-az" ?
There was a problem hiding this comment.
It's a left-over bug -- when the tag was updated to "client-az" on creation it was left with the old value here. Adding a constant to fix this.
| continue; | ||
| } | ||
|
|
||
| if (aliveNodes == null) { |
There was a problem hiding this comment.
As I see this part is reused below in transformDescribeTopicResponse. Moving it to a helper method might reduce the code the duplication.
There was a problem hiding this comment.
Doing that in a fixup commit, PTAL
…504) * refactor(metadata): change MetadataView.getAliveBrokerNodes to return List Change return type from Iterable<Node> to List<Node> for simpler downstream usage. The underlying KRaftMetadataCache already returns a List, so this removes unnecessary abstraction. * feat(metadata): add isRemoteStorageEnabled to MetadataView Add MetadataView.isRemoteStorageEnabled() to check topic tiering config, needed for managed replicas routing decisions. * feat(metadata): add managed replicas routing and transformer metrics Add support for managed replicas (RF > 1) in diskless topic metadata transformation: - For managed replicas, routing priority depends on remote storage: - Tiered (remote storage enabled): same-AZ replica > cross-AZ replica > unavailable - Diskless-only: same-AZ replica > same-AZ any broker > cross-AZ replica > cross-AZ any broker - For unmanaged replicas (RF=1), preserve legacy hash-based selection - Lazy-init alive broker snapshot to avoid unnecessary calls when no diskless topics are in the response Add metrics to track routing behavior and remove per-partition logging in favor of metrics for sustained observability: - fallback-total: Count of fallbacks to non-replica brokers (diskless-only) - offline-replicas-routed-around: Routing decisions when some replicas are offline - cross-az-routing-total: Requests routed to a different AZ than the client Update existing tests to use RF=1 to preserve original unmanaged behavior. Existing logging removed to rely on introduced metrics for Diskless partitions. * fixup! feat(metadata): add managed replicas routing and transformer metrics * fixup! feat(metadata): add managed replicas routing and transformer metrics (cherry picked from commit 039ecf3)
…504) * refactor(metadata): change MetadataView.getAliveBrokerNodes to return List Change return type from Iterable<Node> to List<Node> for simpler downstream usage. The underlying KRaftMetadataCache already returns a List, so this removes unnecessary abstraction. * feat(metadata): add isRemoteStorageEnabled to MetadataView Add MetadataView.isRemoteStorageEnabled() to check topic tiering config, needed for managed replicas routing decisions. * feat(metadata): add managed replicas routing and transformer metrics Add support for managed replicas (RF > 1) in diskless topic metadata transformation: - For managed replicas, routing priority depends on remote storage: - Tiered (remote storage enabled): same-AZ replica > cross-AZ replica > unavailable - Diskless-only: same-AZ replica > same-AZ any broker > cross-AZ replica > cross-AZ any broker - For unmanaged replicas (RF=1), preserve legacy hash-based selection - Lazy-init alive broker snapshot to avoid unnecessary calls when no diskless topics are in the response Add metrics to track routing behavior and remove per-partition logging in favor of metrics for sustained observability: - fallback-total: Count of fallbacks to non-replica brokers (diskless-only) - offline-replicas-routed-around: Routing decisions when some replicas are offline - cross-az-routing-total: Requests routed to a different AZ than the client Update existing tests to use RF=1 to preserve original unmanaged behavior. Existing logging removed to rely on introduced metrics for Diskless partitions. * fixup! feat(metadata): add managed replicas routing and transformer metrics * fixup! feat(metadata): add managed replicas routing and transformer metrics
…504) * refactor(metadata): change MetadataView.getAliveBrokerNodes to return List Change return type from Iterable<Node> to List<Node> for simpler downstream usage. The underlying KRaftMetadataCache already returns a List, so this removes unnecessary abstraction. * feat(metadata): add isRemoteStorageEnabled to MetadataView Add MetadataView.isRemoteStorageEnabled() to check topic tiering config, needed for managed replicas routing decisions. * feat(metadata): add managed replicas routing and transformer metrics Add support for managed replicas (RF > 1) in diskless topic metadata transformation: - For managed replicas, routing priority depends on remote storage: - Tiered (remote storage enabled): same-AZ replica > cross-AZ replica > unavailable - Diskless-only: same-AZ replica > same-AZ any broker > cross-AZ replica > cross-AZ any broker - For unmanaged replicas (RF=1), preserve legacy hash-based selection - Lazy-init alive broker snapshot to avoid unnecessary calls when no diskless topics are in the response Add metrics to track routing behavior and remove per-partition logging in favor of metrics for sustained observability: - fallback-total: Count of fallbacks to non-replica brokers (diskless-only) - offline-replicas-routed-around: Routing decisions when some replicas are offline - cross-az-routing-total: Requests routed to a different AZ than the client Update existing tests to use RF=1 to preserve original unmanaged behavior. Existing logging removed to rely on introduced metrics for Diskless partitions. * fixup! feat(metadata): add managed replicas routing and transformer metrics * fixup! feat(metadata): add managed replicas routing and transformer metrics (cherry picked from commit 039ecf3)
Adds AZ-aware routing support for diskless topics with managed replicas (RF > 1). The transformer now routes metadata requests based on replica assignments and remote storage configuration.
Changes
MetadataView.getAliveBrokerNodes()to returnList<Node>for consistent APIisRemoteStorageEnabledcheck to determine routing constraints:LEADER_NOT_AVAILABLE) when partition is unavailable; clear toNONEonly when a routable leader is foundfallback-total: Count of fallbacks to non-replica brokers (diskless-only)offline-replicas-routed-around: Routing decisions when some replicas are offlinecross-az-routing-total: Requests routed to different AZ than clientRouting Priority
For unmanaged replicas (RF = 1), the legacy hash-based selection from all brokers is preserved.
Test plan
leaderEpoch,errorCode,offlineReplicas,isrNodeson all managed testsClientAzAwarenessMetrics)