Skip to content

refactor(markdown-parser): extract magic numbers to named constants#9228

Merged
ematipico merged 2 commits intobiomejs:mainfrom
jfmcdowell:refactor/md-parser-named-constants
Feb 27, 2026
Merged

refactor(markdown-parser): extract magic numbers to named constants#9228
ematipico merged 2 commits intobiomejs:mainfrom
jfmcdowell:refactor/md-parser-named-constants

Conversation

@jfmcdowell
Copy link
Copy Markdown
Contributor

@jfmcdowell jfmcdowell commented Feb 24, 2026

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

  • Add MAX_BLOCK_PREFIX_INDENT (3) constant for the CommonMark 0-3 space block prefix limit.
  • Promote INDENT_CODE_BLOCK_SPACES and TAB_STOP_SPACES from crate-private to pub(crate).
  • Replace bare 3 and 4 literals with named constants across 12 files in the markdown parser.
  • Affected modules: quote.rs, list.rs, header.rs, fenced_code_block.rs, thematic_break_block.rs, html_block.rs, link_block.rs, inline/code_span.rs, mod.rs, parser.rs, lexer/mod.rs, to_html.rs.

Structural/non-semantic literals (e.g. string offset arithmetic in inline/html.rs, percent-encoding slice widths in to_html.rs) are intentionally left as-is.

No user-facing behavior change. All 66 parser tests pass.

Test Plan

  • cargo test -p biome_markdown_parser — 66 tests pass
  • cargo check -p biome_markdown_parser — compiles cleanly

Docs

N/A — internal refactor only.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Feb 24, 2026

⚠️ No Changeset found

Latest commit: 36fe595

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions github-actions Bot added A-Parser Area: parser A-Formatter Area: formatter A-Tooling Area: internal tools labels Feb 24, 2026
Replace semantic spec-derived numeric literals across the markdown
parser with named constants that reference their CommonMark origins:

- MAX_BLOCK_PREFIX_INDENT (3): max 0-3 spaces before block markers (CommonMark block constructs)
- TAB_STOP_SPACES (4): tab stop width (§2.2)
- INDENT_CODE_BLOCK_SPACES (4): indented code block threshold (§4.4)

Promotes existing crate-private constants to pub(crate) and applies
them consistently across 12 files.
@jfmcdowell jfmcdowell force-pushed the refactor/md-parser-named-constants branch from 188d4f2 to 38e1497 Compare February 27, 2026 14:25
@github-actions github-actions Bot removed A-Formatter Area: formatter A-Tooling Area: internal tools labels Feb 27, 2026
@jfmcdowell jfmcdowell marked this pull request as ready for review February 27, 2026 15:07
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 27, 2026

Walkthrough

This PR refactors the markdown parser to centralise indentation and tab-width configuration by replacing hard-coded literals (primarily 3 and 4 spaces) with crate-level constants. The changes introduce MAX_BLOCK_PREFIX_INDENT and TAB_STOP_SPACES across the lexer, parser, and syntax modules, affecting indentation threshold checks and tab expansion calculations in header, list, fenced code block, HTML block, and other parsing logic. No public API changes; functional behaviour is preserved.

Possibly related PRs

  • PR #8525: Introduced the markdown parser indentation and tab constants that this PR now leverages across multiple parser modules.
  • PR #9224: Modifies the same indentation constants in the syntax module, directly related through constant definitions and usage patterns.

Suggested labels

A-Tooling

Suggested reviewers

  • dyc3
  • ematipico
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: extracting magic numbers to named constants across the markdown parser.
Description check ✅ Passed The description is detailed and directly related to the changeset, explaining the constants introduced, files affected, and test coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
crates/biome_markdown_parser/src/syntax/mod.rs (1)

536-538: Use TAB_STOP_SPACES in indent prefix helper for full consistency.

consume_indent_prefix still hardcodes 4 at Line 537, which undercuts the centralisation done elsewhere.

♻️ Suggested patch
-        } else if text == "\t" {
-            consumed += 4;
+        } else if text == "\t" {
+            consumed += TAB_STOP_SPACES;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/mod.rs` around lines 536 - 538, The
indent helper consume_indent_prefix currently hardcodes adding 4 when
encountering a tab; replace that literal with the centralized constant
TAB_STOP_SPACES so the function uses the same tab width as the rest of the
codebase (i.e., change the branch that checks text == "\t" to add
TAB_STOP_SPACES to consumed and otherwise leave logic unchanged). Ensure you
reference consume_indent_prefix and TAB_STOP_SPACES so the change is consistent
and compiles.
crates/biome_markdown_parser/src/parser.rs (1)

468-474: Use TAB_STOP_SPACES in count_leading_indent as well.

Line [473] still hardcodes 4; swapping to the shared constant keeps tab semantics single-sourced.

♻️ Suggested patch
 fn count_leading_indent(text: &str) -> usize {
     let mut count = 0usize;
     for c in text.chars() {
         match c {
             ' ' => count += 1,
-            '\t' => count += 4,
+            '\t' => count += TAB_STOP_SPACES,
             _ => break,
         }
     }
     count
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/parser.rs` around lines 468 - 474, In
count_leading_indent, replace the hardcoded tab width (currently adding 4 for
'\t') with the shared constant TAB_STOP_SPACES so tab semantics are
single-sourced; update the match arm for '\t' to add TAB_STOP_SPACES and ensure
the function can see the constant (use the existing TAB_STOP_SPACES in scope or
import it) so behavior stays consistent across the crate.
crates/biome_markdown_parser/src/to_html.rs (1)

1151-1151: Minor: Literal 3 could use MAX_BLOCK_PREFIX_INDENT for consistency.

Line 1151 still uses .min(3) to clamp fence indentation. This is semantically the same as MAX_BLOCK_PREFIX_INDENT. Consider replacing it for consistency with the rest of the refactor.

let fence_indent = fence_leading_indent.saturating_sub(container_indent).min(MAX_BLOCK_PREFIX_INDENT);

Not blocking—feel free to defer if you deliberately left derived/clamping values as literals.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/to_html.rs` at line 1151, Replace the magic
literal 3 used to clamp the fence indentation with the canonical constant
MAX_BLOCK_PREFIX_INDENT for consistency: in the computation of fence_indent
(which uses fence_leading_indent.saturating_sub(container_indent).min(...)),
change the .min(3) call to .min(MAX_BLOCK_PREFIX_INDENT) so the code uses the
shared constant rather than a hard-coded value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/biome_markdown_parser/src/parser.rs`:
- Around line 468-474: In count_leading_indent, replace the hardcoded tab width
(currently adding 4 for '\t') with the shared constant TAB_STOP_SPACES so tab
semantics are single-sourced; update the match arm for '\t' to add
TAB_STOP_SPACES and ensure the function can see the constant (use the existing
TAB_STOP_SPACES in scope or import it) so behavior stays consistent across the
crate.

In `@crates/biome_markdown_parser/src/syntax/mod.rs`:
- Around line 536-538: The indent helper consume_indent_prefix currently
hardcodes adding 4 when encountering a tab; replace that literal with the
centralized constant TAB_STOP_SPACES so the function uses the same tab width as
the rest of the codebase (i.e., change the branch that checks text == "\t" to
add TAB_STOP_SPACES to consumed and otherwise leave logic unchanged). Ensure you
reference consume_indent_prefix and TAB_STOP_SPACES so the change is consistent
and compiles.

In `@crates/biome_markdown_parser/src/to_html.rs`:
- Line 1151: Replace the magic literal 3 used to clamp the fence indentation
with the canonical constant MAX_BLOCK_PREFIX_INDENT for consistency: in the
computation of fence_indent (which uses
fence_leading_indent.saturating_sub(container_indent).min(...)), change the
.min(3) call to .min(MAX_BLOCK_PREFIX_INDENT) so the code uses the shared
constant rather than a hard-coded value.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce67318 and 36fe595.

📒 Files selected for processing (11)
  • crates/biome_markdown_parser/src/lexer/mod.rs
  • crates/biome_markdown_parser/src/parser.rs
  • crates/biome_markdown_parser/src/syntax/fenced_code_block.rs
  • crates/biome_markdown_parser/src/syntax/header.rs
  • crates/biome_markdown_parser/src/syntax/html_block.rs
  • crates/biome_markdown_parser/src/syntax/inline/code_span.rs
  • crates/biome_markdown_parser/src/syntax/link_block.rs
  • crates/biome_markdown_parser/src/syntax/list.rs
  • crates/biome_markdown_parser/src/syntax/mod.rs
  • crates/biome_markdown_parser/src/syntax/thematic_break_block.rs
  • crates/biome_markdown_parser/src/to_html.rs

Copy link
Copy Markdown
Member

@ematipico ematipico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easy and painless 💯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Parser Area: parser

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants