refactor(markdown-parser): decompose inline_list_source_len into scan helpers #9468
…an helpers
Extract the NEWLINE handling block and list-indent stripping loop from `inline_list_source_len` (cognitive complexity 108) into focused helpers:
- `scan_newline_in_inline_list`: post-newline break logic (quote prefix, setext, thematic, paragraph break, indent handling)
- `scan_list_indent`: whitespace token consumption up to required indent
No behavior change; this is a pure structural extraction.
Walkthrough
This pull request refactors the inline list length scanning logic within the markdown parser. The changes extract newline handling and list-item indentation processing into two new helper functions: `scan_newline_in_inline_list` and `scan_list_indent`.
🧹 Nitpick comments (1)
crates/biome_markdown_parser/src/syntax/mod.rs (1)
1315-1342: Tab calculation is consistent with existing patterns, but docstring may overstate.
The implementation at lines 1329–1332 treats tabs as a fixed 4 columns, which matches `consume_indent_prefix` (line 537). However, this isn't truly "tab-stop-aware" as the docstring suggests: proper tab-stop calculation would be `TAB_STOP_SPACES - (columns % TAB_STOP_SPACES)` (see line 682).
Since all conformance tests pass and this matches existing patterns, the behaviour is likely fine for prescan purposes. Consider adjusting the docstring to reflect the simplified calculation, or leave as-is if the approximation is intentional.
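To make the distinction concrete, here is a minimal sketch of the two tab treatments discussed above. It is not the parser's code: the function names are hypothetical, and `TAB_STOP_SPACES = 4` is an assumption inferred from the "fixed 4 columns" wording.

```rust
/// Assumed value: the review describes tabs as a fixed 4 columns.
const TAB_STOP_SPACES: usize = 4;

/// Fixed-width approximation: a tab always adds TAB_STOP_SPACES columns,
/// regardless of the column it starts at.
fn fixed_tab_width(_column: usize) -> usize {
    TAB_STOP_SPACES
}

/// Tab-stop-aware width: a tab advances to the next multiple of
/// TAB_STOP_SPACES, so its width depends on the starting column.
fn tab_stop_width(column: usize) -> usize {
    TAB_STOP_SPACES - (column % TAB_STOP_SPACES)
}

/// Column reached after the leading spaces and tabs of `prefix`,
/// using the supplied tab-width rule (hypothetical helper).
fn columns_after(prefix: &str, tab_width: fn(usize) -> usize) -> usize {
    let mut column = 0;
    for ch in prefix.chars() {
        match ch {
            ' ' => column += 1,
            '\t' => column += tab_width(column),
            _ => break, // stop at the first non-whitespace character
        }
    }
    column
}
```

The two rules agree when a tab starts at a tab stop and diverge otherwise: for two spaces followed by a tab, the fixed rule yields 6 columns while the tab-stop rule yields 4.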
📝 Optional: clarify docstring
-/// Strip list-item indent tokens during inline list length scanning.
-///
-/// Consumes whitespace tokens up to `required_indent` columns, adding their
-/// byte lengths to `len`. Stops when the required indent is reached, a
-/// non-whitespace token is encountered, or consuming the next token would
-/// exceed the indent budget.
+/// Strip list-item indent tokens during inline list length scanning.
+///
+/// Consumes whitespace tokens up to `required_indent` columns, adding their
+/// byte lengths to `len`. Tabs are treated as a fixed 4-column width for
+/// simplicity (consistent with `consume_indent_prefix`). Stops when the
+/// required indent is reached, a non-whitespace token is encountered, or
+/// consuming the next token would exceed the indent budget.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/mod.rs` around lines 1315-1342: the docstring for `scan_list_indent` overclaims "tab-stop-aware" behavior, while the implementation treats tabs as a fixed `TAB_STOP_SPACES` width (like `consume_indent_prefix`) rather than computing the true tab-stop width. Update the docstring in `scan_list_indent` to state that it uses a simplified fixed-width tab approximation (tabs counted as `TAB_STOP_SPACES`), or explicitly note that it is an approximation for prescan purposes, rather than implying the full tab-stop calculation used elsewhere (see line 682).
Note
AI Assistance Disclosure: This PR was developed with assistance from Claude Code.
Summary
Decompose `inline_list_source_len` into focused scan helpers, so the NEWLINE handling and list-indent stripping logic are independently readable without changing behavior.
- `scan_newline_in_inline_list`: post-newline break logic (blank line detection, quote prefix classification, setext/thematic checks, partial quote prefix, `at_paragraph_break`, and list-indent gated re-checks).
- `scan_list_indent`: whitespace token consumption up to the required list-item indent budget, with tab-stop-aware column counting.
- `inline_list_source_len`: now a flat loop with simple break conditions.
- `// #region` / `// #endregion` grouping for the scan helpers.
Test Plan
- `just test-crate biome_markdown_parser`
- `just test-markdown-conformance` (652/652, 100%)
- `just f` & `just l`
Docs
N/A — internal parser refactor with no user-facing behavior change.
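For readers unfamiliar with the indent-budget idea behind `scan_list_indent`, the sketch below illustrates the three stop conditions from its docstring: required indent reached, non-whitespace token seen, or the next token would exceed the budget. It is an illustration under stated assumptions, not the code from this PR: the `Tok` enum, the one-byte-per-token simplification, the fixed `TAB_STOP_SPACES = 4` width, and the function name `scan_list_indent_sketch` are all hypothetical.

```rust
/// Hypothetical token kinds; the real parser's token types are not shown here.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tok {
    Space, // assumed: 1 byte, 1 column
    Tab,   // assumed: 1 byte, counted as TAB_STOP_SPACES columns
    Other, // any non-whitespace token
}

/// Assumed fixed tab width, as described in the review comment.
const TAB_STOP_SPACES: usize = 4;

/// Consume leading whitespace tokens without exceeding `required_indent`
/// columns. Returns (tokens consumed, bytes consumed).
fn scan_list_indent_sketch(tokens: &[Tok], required_indent: usize) -> (usize, usize) {
    let mut columns = 0;
    let mut bytes = 0;
    let mut consumed = 0;
    for tok in tokens {
        // Stop condition 1: the required indent has been reached.
        if columns >= required_indent {
            break;
        }
        let width = match tok {
            Tok::Space => 1,
            Tok::Tab => TAB_STOP_SPACES,
            // Stop condition 2: a non-whitespace token ends the indent.
            Tok::Other => break,
        };
        // Stop condition 3: this token would exceed the indent budget.
        if columns + width > required_indent {
            break;
        }
        columns += width;
        bytes += 1; // each whitespace token is one byte in this sketch
        consumed += 1;
    }
    (consumed, bytes)
}
```

With a budget of 4 columns, a space followed by a tab consumes only the space, because the tab would push the count to 5 under the fixed-width assumption.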