refactor(markdown-parser): extract magic numbers to named constants#9228
Conversation
|
Replace semantic spec-derived numeric literals across the markdown parser with named constants that reference their CommonMark origins: - MAX_BLOCK_PREFIX_INDENT (3): max 0-3 spaces before block markers (CommonMark block constructs) - TAB_STOP_SPACES (4): tab stop width (§2.2) - INDENT_CODE_BLOCK_SPACES (4): indented code block threshold (§4.4) Promotes existing crate-private constants to pub(crate) and applies them consistently across 12 files.
188d4f2 to
38e1497
Compare
WalkthroughThis PR refactors the markdown parser to centralise indentation and tab-width configuration by replacing hard-coded literals (primarily 3 and 4 spaces) with crate-level constants. The changes introduce MAX_BLOCK_PREFIX_INDENT and TAB_STOP_SPACES across the lexer, parser, and syntax modules, affecting indentation threshold checks and tab expansion calculations in header, list, fenced code block, HTML block, and other parsing logic. No public API changes; functional behaviour is preserved. Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
crates/biome_markdown_parser/src/syntax/mod.rs (1)
536-538: UseTAB_STOP_SPACESin indent prefix helper for full consistency.
consume_indent_prefixstill hardcodes4at Line 537, which undercuts the centralisation done elsewhere.♻️ Suggested patch
- } else if text == "\t" { - consumed += 4; + } else if text == "\t" { + consumed += TAB_STOP_SPACES;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/mod.rs` around lines 536 - 538, The indent helper consume_indent_prefix currently hardcodes adding 4 when encountering a tab; replace that literal with the centralized constant TAB_STOP_SPACES so the function uses the same tab width as the rest of the codebase (i.e., change the branch that checks text == "\t" to add TAB_STOP_SPACES to consumed and otherwise leave logic unchanged). Ensure you reference consume_indent_prefix and TAB_STOP_SPACES so the change is consistent and compiles.crates/biome_markdown_parser/src/parser.rs (1)
468-474: UseTAB_STOP_SPACESincount_leading_indentas well.Line [473] still hardcodes
4; swapping to the shared constant keeps tab semantics single-sourced.♻️ Suggested patch
fn count_leading_indent(text: &str) -> usize { let mut count = 0usize; for c in text.chars() { match c { ' ' => count += 1, - '\t' => count += 4, + '\t' => count += TAB_STOP_SPACES, _ => break, } } count }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/parser.rs` around lines 468 - 474, In count_leading_indent, replace the hardcoded tab width (currently adding 4 for '\t') with the shared constant TAB_STOP_SPACES so tab semantics are single-sourced; update the match arm for '\t' to add TAB_STOP_SPACES and ensure the function can see the constant (use the existing TAB_STOP_SPACES in scope or import it) so behavior stays consistent across the crate.crates/biome_markdown_parser/src/to_html.rs (1)
1151-1151: Minor: Literal3could useMAX_BLOCK_PREFIX_INDENTfor consistency.Line 1151 still uses
.min(3)to clamp fence indentation. This is semantically the same asMAX_BLOCK_PREFIX_INDENT. Consider replacing it for consistency with the rest of the refactor.let fence_indent = fence_leading_indent.saturating_sub(container_indent).min(MAX_BLOCK_PREFIX_INDENT);Not blocking—feel free to defer if you deliberately left derived/clamping values as literals.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/to_html.rs` at line 1151, Replace the magic literal 3 used to clamp the fence indentation with the canonical constant MAX_BLOCK_PREFIX_INDENT for consistency: in the computation of fence_indent (which uses fence_leading_indent.saturating_sub(container_indent).min(...)), change the .min(3) call to .min(MAX_BLOCK_PREFIX_INDENT) so the code uses the shared constant rather than a hard-coded value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/biome_markdown_parser/src/parser.rs`:
- Around line 468-474: In count_leading_indent, replace the hardcoded tab width
(currently adding 4 for '\t') with the shared constant TAB_STOP_SPACES so tab
semantics are single-sourced; update the match arm for '\t' to add
TAB_STOP_SPACES and ensure the function can see the constant (use the existing
TAB_STOP_SPACES in scope or import it) so behavior stays consistent across the
crate.
In `@crates/biome_markdown_parser/src/syntax/mod.rs`:
- Around line 536-538: The indent helper consume_indent_prefix currently
hardcodes adding 4 when encountering a tab; replace that literal with the
centralized constant TAB_STOP_SPACES so the function uses the same tab width as
the rest of the codebase (i.e., change the branch that checks text == "\t" to
add TAB_STOP_SPACES to consumed and otherwise leave logic unchanged). Ensure you
reference consume_indent_prefix and TAB_STOP_SPACES so the change is consistent
and compiles.
In `@crates/biome_markdown_parser/src/to_html.rs`:
- Line 1151: Replace the magic literal 3 used to clamp the fence indentation
with the canonical constant MAX_BLOCK_PREFIX_INDENT for consistency: in the
computation of fence_indent (which uses
fence_leading_indent.saturating_sub(container_indent).min(...)), change the
.min(3) call to .min(MAX_BLOCK_PREFIX_INDENT) so the code uses the shared
constant rather than a hard-coded value.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
crates/biome_markdown_parser/src/lexer/mod.rscrates/biome_markdown_parser/src/parser.rscrates/biome_markdown_parser/src/syntax/fenced_code_block.rscrates/biome_markdown_parser/src/syntax/header.rscrates/biome_markdown_parser/src/syntax/html_block.rscrates/biome_markdown_parser/src/syntax/inline/code_span.rscrates/biome_markdown_parser/src/syntax/link_block.rscrates/biome_markdown_parser/src/syntax/list.rscrates/biome_markdown_parser/src/syntax/mod.rscrates/biome_markdown_parser/src/syntax/thematic_break_block.rscrates/biome_markdown_parser/src/to_html.rs
Note
AI Assistance Disclosure: This PR was developed with assistance from Claude Code.
Summary
MAX_BLOCK_PREFIX_INDENT(3) constant for the CommonMark 0-3 space block prefix limit.INDENT_CODE_BLOCK_SPACESandTAB_STOP_SPACESfrom crate-private topub(crate).3and4literals with named constants across 12 files in the markdown parser.quote.rs,list.rs,header.rs,fenced_code_block.rs,thematic_break_block.rs,html_block.rs,link_block.rs,inline/code_span.rs,mod.rs,parser.rs,lexer/mod.rs,to_html.rs.Structural/non-semantic literals (e.g. string offset arithmetic in
inline/html.rs, percent-encoding slice widths into_html.rs) are intentionally left as-is.No user-facing behavior change. All 66 parser tests pass.
Test Plan
cargo test -p biome_markdown_parser— 66 tests passcargo check -p biome_markdown_parser— compiles cleanlyDocs
N/A — internal refactor only.