docs(inkless:design): add diskless-managed rf feature design #478
jeqo wants to merge 27 commits into design/ts-unification
Pull request overview
This PR adds a comprehensive design document for implementing managed replication factor (RF) for diskless topics in the Inkless system. The document proposes transitioning from the current RF=1 with faked metadata to RF=rack_count with real KRaft-managed replicas, enabling bidirectional topic migration and standard RLM integration.
Changes:
- Adds new DISKLESS_MANAGED_RF.md design document with detailed approach comparison, cost analysis, and implementation plan
- Updates DESIGN.md to reference the new design document for Stream 4: Multi-Replica Model for Diskless Topics
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md | New comprehensive design document proposing RF=rack_count approach with transformer-first availability, including motivation, design details, implementation path (6-8 weeks), and rejected alternatives |
| docs/inkless/ts-unification/DESIGN.md | Adds cross-references to the new DISKLESS_MANAGED_RF.md document in the Multi-Replica Model section and References section |
| Trade-off | Accepted | Rationale |
|-----------|----------|-----------|
| KRaft metadata may show offline brokers | Yes | Availability is not blocked; eventual consistency is acceptable |
> KRaft metadata may show offline brokers
Internally or to the clients?
To admin operations/metrics (e.g. topic describe), not to clients.
I'll clarify this
Ensure cross-az is the last resort when routing requests, and that replica placement is fully decoupled.
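One way to picture the routing preference asked for here is the sketch below. It is only an illustration under stated assumptions: the `Broker` class, the `route` method, and the tie-breaking order are hypothetical and are not taken from the Inkless transformer. The candidate list is every known broker rather than a partition's replica set, which is one possible reading of "replica placement is fully decoupled".

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: class names, method names, and tie-breaking are assumptions,
// not the Inkless transformer code.
class AzAwareRoutingSketch {

    /** Minimal broker view: id, rack (AZ), and liveness. */
    static final class Broker {
        final int id;
        final String rack;
        final boolean alive;

        Broker(int id, String rack, boolean alive) {
            this.id = id;
            this.rack = rack;
            this.alive = alive;
        }
    }

    /**
     * Prefers the first alive broker in the client's AZ. Only when none exists does it
     * fall back to the first alive broker in any AZ (the cross-AZ last resort).
     * The candidates are all brokers, so the choice ignores replica assignment.
     */
    static Optional<Broker> route(String clientAz, List<Broker> brokers) {
        Optional<Broker> sameAz = brokers.stream()
            .filter(b -> b.alive)
            .filter(b -> clientAz != null && clientAz.equals(b.rack))
            .findFirst();
        if (sameAz.isPresent()) {
            return sameAz;
        }
        // Last resort: accept cross-AZ traffic rather than fail the request.
        return brokers.stream().filter(b -> b.alive).findFirst();
    }

    public static void main(String[] args) {
        List<Broker> brokers = List.of(
            new Broker(0, "az-a", false),
            new Broker(1, "az-b", true),
            new Broker(2, "az-a", true));
        route("az-a", brokers).ifPresent(b -> System.out.println("az-a client -> broker " + b.id));
        route("az-c", brokers).ifPresent(b -> System.out.println("az-c client -> broker " + b.id));
    }
}
```

In this sketch a client in `az-a` skips the offline local broker and still stays in its own AZ, while a client in an AZ with no alive brokers is served cross-AZ instead of being rejected.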
Moving it to draft to keep it as a live doc while this feature is developed.
Add diskless.managed.rf.enable config (default: false) to control whether diskless topics use managed replicas with RF=rack_count or legacy RF=1. This config only affects topic creation. When enabled, new diskless topics will be created with one replica per rack using standard KRaft placement.

Part of Phase 1: Diskless Managed Replicas (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)
When diskless.managed.rf.enable=true, new diskless topics are created with RF=rack_count using standard KRaft replica placement instead of legacy RF=1.

Changes:
- Compute RF from rack cardinality via rackCardinality()
- Use standard replicaPlacer.place() for rack-aware assignment
- Allow manual replica assignments when managed replicas enabled
- Add checkstyle suppression for extended createTopic method

Phase 1 limitations:
- Add Partitions inherits RF from existing partitions (Phase 3)
- Transformer not updated, uses legacy routing (Phase 2)
- Integration tests deferred to Phase 2

(See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)
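The RF=rack_count rule described in this commit can be sketched roughly as follows. This is a simplified illustration, not the controller code: the commit states that the real change uses rackCardinality() and the standard replicaPlacer.place(), whereas the class below, its method signatures, the round-robin choice within a rack, and the fallback to RF=1 when no racks are known are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names, signatures, and fallback behavior are assumptions,
// not the Inkless controller code.
class ManagedRfSketch {

    /** Groups broker ids by rack in sorted order; brokers with no rack (null) are skipped. */
    static TreeMap<String, List<Integer>> brokersByRack(Map<Integer, String> brokerRacks) {
        TreeMap<String, List<Integer>> byRack = new TreeMap<>();
        // Iterate in broker-id order so the result is deterministic.
        for (Map.Entry<Integer, String> e : new TreeMap<>(brokerRacks).entrySet()) {
            if (e.getValue() != null) {
                byRack.computeIfAbsent(e.getValue(), r -> new ArrayList<>()).add(e.getKey());
            }
        }
        return byRack;
    }

    /** Number of distinct racks among the given brokers. */
    static int rackCardinality(Map<Integer, String> brokerRacks) {
        return brokersByRack(brokerRacks).size();
    }

    /** RF=rack_count when the flag is on; RF=1 when it is off or no racks are known (assumption). */
    static int replicationFactor(boolean managedRfEnabled, Map<Integer, String> brokerRacks) {
        if (!managedRfEnabled) {
            return 1;
        }
        return Math.max(1, rackCardinality(brokerRacks));
    }

    /** Picks one broker per rack, rotating within each rack by partition index (assumes partition >= 0). */
    static List<Integer> placeOnePerRack(int partition, Map<Integer, String> brokerRacks) {
        List<Integer> replicas = new ArrayList<>();
        for (List<Integer> brokers : brokersByRack(brokerRacks).values()) {
            replicas.add(brokers.get(partition % brokers.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        Map<Integer, String> racks = Map.of(0, "az-a", 1, "az-a", 2, "az-b", 3, "az-c");
        System.out.println("RF = " + replicationFactor(true, racks));
        System.out.println("partition 0 -> " + placeOnePerRack(0, racks));
        System.out.println("partition 1 -> " + placeOnePerRack(1, racks));
    }
}
```

With four brokers spread over three racks, the sketch yields RF=3 and one replica in each rack, which matches the "one replica per rack" behavior the commit describes for newly created topics.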
Replace placeholder metrics with actual implemented metric names:
- Controller metrics using KafkaController MBean naming
- Transformer metrics using ClientAzAwarenessMetrics MBean
- Add note explaining logging-based approach for placement quality to avoid high-cardinality metric concerns

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add diskless.managed.rf.enable config (default: false) to control whether diskless topics use managed replicas with RF=rack_count or legacy RF=1. This config only affects topic creation. When enabled, new diskless topics will be created with one replica per rack using standard KRaft placement.

Part of Phase 1: Diskless Managed Replicas (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

# Conflicts:
#   core/src/main/scala/kafka/server/ControllerServer.scala
#   core/src/main/scala/kafka/server/KafkaConfig.scala
#   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
#   server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java
When diskless.managed.rf.enable=true, new diskless topics are created with RF=rack_count using standard KRaft replica placement instead of legacy RF=1.

Changes:
- Compute RF from rack cardinality via rackCardinality()
- Use standard replicaPlacer.place() for rack-aware assignment
- Allow manual replica assignments when managed replicas enabled
- Add checkstyle suppression for extended createTopic method

Phase 1 limitations:
- Add Partitions inherits RF from existing partitions (Phase 3)
- Transformer not updated, uses legacy routing (Phase 2)
- Integration tests deferred to Phase 2

(See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

# Conflicts:
#   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
#   metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
#   metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java
Remove ~1010 lines of inline Java/SQL code and detailed tiering pipeline diagrams. Replace with bullet-point descriptions and summary tables. Keep all architecture diagrams, state machines, and read-path flow diagrams.

Key changes:
- Update goals: RF=rack_count decided, tiering pipeline in scope
- Replace Java code blocks in sections 4.1-4.8 with bullets
- Compact tiering pipeline section (4.7) to overview + tables
- Replace metrics/admin code with summary tables
- Remove write path and tiering pipeline data flow diagrams
- Fix stale references (PROJECT_PLAN.md, config typo)
- Remove broken architecture.md link

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…cs (#492)

* test(metadata:diskless): improve test coverage for broker fencing and unregister scenarios

  - Add _noRacks and _withRacks test variants for consistent coverage
  - Fix tests that assumed broker 0 was always the leader
  - Get actual leader from partition registration before fencing/unregistering
  - Use dynamic assertions based on actual partition state
  - Improve assertion error messages for clarity

* feat(controller:diskless): add server config for managed replicas

  Add diskless.managed.rf.enable config (default: false) to control whether diskless topics use managed replicas with RF=rack_count or legacy RF=1. This config only affects topic creation. When enabled, new diskless topics will be created with one replica per rack using standard KRaft placement.

  Part of Phase 1: Diskless Managed Replicas (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

  # Conflicts:
  #   core/src/main/scala/kafka/server/ControllerServer.scala
  #   core/src/main/scala/kafka/server/KafkaConfig.scala
  #   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
  #   server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java

* feat(metadata:diskless): implement managed replicas for diskless topics

  When diskless.managed.rf.enable=true, new diskless topics are created with RF=rack_count using standard KRaft replica placement instead of legacy RF=1.

  Changes:
  - Compute RF from rack cardinality via rackCardinality()
  - Use standard replicaPlacer.place() for rack-aware assignment
  - Allow manual replica assignments when managed replicas enabled
  - Add checkstyle suppression for extended createTopic method

  Phase 1 limitations:
  - Add Partitions inherits RF from existing partitions (Phase 3)
  - Transformer not updated, uses legacy routing (Phase 2)
  - Integration tests deferred to Phase 2

  (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

  # Conflicts:
  #   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
  #   metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
  #   metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java

* fixup! feat(controller:diskless): add server config for managed replicas
* fixup! feat(metadata:diskless): implement managed replicas for diskless topics
Add diskless.managed.rf.enable config (default: false) to control whether diskless topics use managed replicas with RF=rack_count or legacy RF=1. This config only affects topic creation. When enabled, new diskless topics will be created with one replica per rack using standard KRaft placement.

Part of Phase 1: Diskless Managed Replicas (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

# Conflicts:
#   server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java
…cs (#492)

* test(metadata:diskless): improve test coverage for broker fencing and unregister scenarios

  - Add _noRacks and _withRacks test variants for consistent coverage
  - Fix tests that assumed broker 0 was always the leader
  - Get actual leader from partition registration before fencing/unregistering
  - Use dynamic assertions based on actual partition state
  - Improve assertion error messages for clarity

* feat(controller:diskless): add server config for managed replicas

  Add diskless.managed.rf.enable config (default: false) to control whether diskless topics use managed replicas with RF=rack_count or legacy RF=1. This config only affects topic creation. When enabled, new diskless topics will be created with one replica per rack using standard KRaft placement.

  Part of Phase 1: Diskless Managed Replicas (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

  # Conflicts:
  #   core/src/main/scala/kafka/server/ControllerServer.scala
  #   core/src/main/scala/kafka/server/KafkaConfig.scala
  #   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
  #   server-common/src/main/java/org/apache/kafka/server/config/ServerConfigs.java

* feat(metadata:diskless): implement managed replicas for diskless topics

  When diskless.managed.rf.enable=true, new diskless topics are created with RF=rack_count using standard KRaft replica placement instead of legacy RF=1.

  Changes:
  - Compute RF from rack cardinality via rackCardinality()
  - Use standard replicaPlacer.place() for rack-aware assignment
  - Allow manual replica assignments when managed replicas enabled
  - Add checkstyle suppression for extended createTopic method

  Phase 1 limitations:
  - Add Partitions inherits RF from existing partitions (Phase 3)
  - Transformer not updated, uses legacy routing (Phase 2)
  - Integration tests deferred to Phase 2

  (See #478 docs/inkless/ts-unification/DISKLESS_MANAGED_RF.md)

  # Conflicts:
  #   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
  #   metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
  #   metadata/src/test/java/org/apache/kafka/controller/ReplicationControlManagerTest.java

* fixup! feat(controller:diskless): add server config for managed replicas
* fixup! feat(metadata:diskless): implement managed replicas for diskless topics

(cherry picked from commit 09ba4d1)
Closing as feature has been implemented and documented.