
refactor(markdown-parser): decompose inline_list_source_len into scan helpers #9468

Merged
ematipico merged 1 commit into biomejs:main from jfmcdowell:refactor/inline-list-predicates
Mar 21, 2026

Conversation

@jfmcdowell
Contributor

@jfmcdowell jfmcdowell commented Mar 13, 2026

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

Decompose inline_list_source_len into focused scan helpers so that the NEWLINE handling and the list-indent stripping logic can each be read on their own. Behavior is unchanged.

  • Extract scan_newline_in_inline_list: post-newline break logic (blank line detection, quote prefix classification, setext/thematic checks, partial quote prefix, at_paragraph_break, and list-indent gated re-checks).
  • Extract scan_list_indent: whitespace token consumption up to the required list-item indent budget, with tab-stop-aware column counting.
  • Reduce inline_list_source_len to a flat loop with simple break conditions.
  • Add // #region / // #endregion grouping for the scan helpers.
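
The shape of this decomposition can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Tok` and `NewlineScan` types, the `_sketch` function names, and the one-byte newline are assumptions made for illustration, and the newline helper models only blank-line detection rather than the quote-prefix, setext/thematic, and indent checks listed above. It is not the code in this PR.

```rust
// Hypothetical, simplified token kinds; the real parser's tokens are not shown here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Text(usize), // non-newline token carrying its byte length
    Newline,     // assumed to be one byte long
}

// Hypothetical result type: does the inline list end at this newline?
#[derive(Debug, PartialEq)]
enum NewlineScan {
    Break,
    Continue,
}

// Stand-in for the extracted newline helper. Only one rule is modeled:
// a newline followed by another newline (or by nothing) ends the list.
fn scan_newline_sketch(after_newline: &[Tok]) -> NewlineScan {
    match after_newline.first() {
        None | Some(Tok::Newline) => NewlineScan::Break,
        Some(Tok::Text(_)) => NewlineScan::Continue,
    }
}

// The caller reduced to a flat loop: add token lengths and stop as soon
// as the helper reports a break.
fn inline_list_source_len_sketch(tokens: &[Tok]) -> usize {
    let mut len = 0;
    for (i, tok) in tokens.iter().enumerate() {
        match tok {
            Tok::Text(n) => len += *n,
            Tok::Newline => {
                if scan_newline_sketch(&tokens[i + 1..]) == NewlineScan::Break {
                    break;
                }
                len += 1;
            }
        }
    }
    len
}
```

With the branching moved behind a helper, the loop body only accumulates lengths and checks one break condition per token kind.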

Test Plan

  • just test-crate biome_markdown_parser
  • just test-markdown-conformance (652/652, 100%)
  • just f & just l

Docs

N/A — internal parser refactor with no user-facing behavior change.

…an helpers

Extract the NEWLINE handling block and list-indent stripping loop from
`inline_list_source_len` (cognitive complexity 108) into focused helpers:

- `scan_newline_in_inline_list`: post-newline break logic (quote prefix,
  setext, thematic, paragraph break, indent handling)
- `scan_list_indent`: whitespace token consumption up to required indent

No behavior change — pure structural extraction.
@changeset-bot

changeset-bot Bot commented Mar 13, 2026

⚠️ No Changeset found

Latest commit: 35f2599

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@github-actions github-actions Bot added the A-Parser (Area: parser) and L-Markdown (Language: Markdown) labels Mar 13, 2026
@jfmcdowell jfmcdowell marked this pull request as ready for review March 13, 2026 01:23
@coderabbitai
Contributor

coderabbitai Bot commented Mar 13, 2026

Walkthrough

This pull request refactors the inline list length scanning logic within the markdown parser. The changes extract newline handling and list-item indentation processing into two new helper functions: scan_newline_in_inline_list and scan_list_indent. These helpers encapsulate newline processing, quote prefix handling, setext/thematic break detection, and indentation consumption, removing substantial inlined branching logic from the main loop. The existing functionality and behaviour for boundary and type detection remain unchanged; the refactoring focuses on improving code modularity and reducing in-place complexity.

Possibly related PRs

Suggested reviewers

  • dyc3
  • ematipico
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title accurately describes the primary change: refactoring the inline_list_source_len function by decomposing it into helper functions.
Description check | ✅ Passed | The description clearly explains the refactoring goals, the specific helpers extracted, and the testing approach, all directly related to the changeset.

Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
crates/biome_markdown_parser/src/syntax/mod.rs (1)

1315-1342: Tab calculation is consistent with existing patterns, but docstring may overstate.

The implementation at lines 1329–1332 treats tabs as a fixed 4 columns, which matches consume_indent_prefix (line 537). However, this isn't truly "tab-stop-aware" as the docstring suggests—proper tab-stop calculation would be TAB_STOP_SPACES - (columns % TAB_STOP_SPACES) (see line 682).

Since all conformance tests pass and this matches existing patterns, the behaviour is likely fine for prescan purposes. Consider adjusting the docstring to reflect the simplified calculation, or leave as-is if the approximation is intentional.
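
To make the difference concrete, here is a small sketch that puts the two calculations side by side. The function names are hypothetical, and `TAB_STOP_SPACES = 4` is assumed from the "fixed 4 columns" wording above; the second function is the formula quoted in this comment.

```rust
// Assumed value: the review describes tabs as "a fixed 4 columns".
const TAB_STOP_SPACES: usize = 4;

// Fixed-width approximation: a tab always counts as TAB_STOP_SPACES columns,
// regardless of the column it starts at.
fn fixed_tab_width(_column: usize) -> usize {
    TAB_STOP_SPACES
}

// Tab-stop-aware width: a tab advances to the next multiple of
// TAB_STOP_SPACES, so its width depends on the starting column.
fn tab_stop_width(column: usize) -> usize {
    TAB_STOP_SPACES - (column % TAB_STOP_SPACES)
}
```

The two agree whenever the tab starts on a tab stop (columns 0, 4, 8, and so on). They diverge otherwise: at column 2 the fixed approximation counts 4 columns, while the tab-stop formula counts 2.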

📝 Optional: clarify docstring
-/// Strip list-item indent tokens during inline list length scanning.
-///
-/// Consumes whitespace tokens up to `required_indent` columns, adding their
-/// byte lengths to `len`. Stops when the required indent is reached, a
-/// non-whitespace token is encountered, or consuming the next token would
-/// exceed the indent budget.
+/// Strip list-item indent tokens during inline list length scanning.
+///
+/// Consumes whitespace tokens up to `required_indent` columns, adding their
+/// byte lengths to `len`. Tabs are treated as a fixed 4-column width for
+/// simplicity (consistent with `consume_indent_prefix`). Stops when the
+/// required indent is reached, a non-whitespace token is encountered, or
+/// consuming the next token would exceed the indent budget.
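
The stopping rules in the suggested docstring can be sketched as follows. This is an illustration under stated assumptions, not the function in `mod.rs`: the `Token` enum, the `_sketch` name, the returned token count, and the one-character-per-whitespace-token model are all hypothetical.

```rust
// Assumed value, matching the fixed 4-column tab width described in the review.
const TAB_STOP_SPACES: usize = 4;

// Hypothetical token kinds; each whitespace token is assumed to be one byte.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Space,
    Tab,
    Text,
}

// Consumes leading whitespace tokens up to `required_indent` columns,
// adds their byte lengths to `len`, and returns how many tokens were consumed.
fn scan_list_indent_sketch(tokens: &[Token], required_indent: usize, len: &mut usize) -> usize {
    let mut columns = 0;
    let mut consumed = 0;
    for token in tokens {
        // Stop once the required indent has been reached.
        if columns >= required_indent {
            break;
        }
        let width = match token {
            Token::Space => 1,
            Token::Tab => TAB_STOP_SPACES, // fixed-width approximation
            Token::Text => break,          // stop at a non-whitespace token
        };
        // Stop if this token would exceed the indent budget.
        if columns + width > required_indent {
            break;
        }
        columns += width;
        *len += 1; // a space or a tab is one byte
        consumed += 1;
    }
    consumed
}
```

In this sketch a tab is never consumed when fewer than four columns of budget remain, which is where a fixed-width count and a tab-stop-aware count could disagree.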
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/mod.rs` around lines 1315 - 1342, The
docstring for scan_list_indent overclaims "tab-stop-aware" behavior while the
implementation treats tabs as fixed TAB_STOP_SPACES (like consume_indent_prefix)
rather than computing true tab-stop width; update the docstring in
scan_list_indent to state it uses a simplified fixed-width tab approximation
(tabs counted as TAB_STOP_SPACES) or explicitly note it is an approximation for
prescan purposes referencing TAB_STOP_SPACES, rather than implying the full
tab-stop calculation used elsewhere (see consume_indent_prefix/line 682
behavior).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: be875783-1953-4d68-904b-e669f0b3c35a

📥 Commits

Reviewing files that changed from the base of the PR and between c8918d6 and 35f2599.

📒 Files selected for processing (1)
  • crates/biome_markdown_parser/src/syntax/mod.rs

@ematipico ematipico merged commit db303ae into biomejs:main Mar 21, 2026
14 checks passed

Labels

A-Parser (Area: parser), L-Markdown (Language: Markdown)


2 participants