Skip to content

feat: add reporting as a top level primitive#790

Merged
stack72 merged 1 commit intomainfrom
adam/reports
Mar 20, 2026
Merged

feat: add reporting as a top level primitive#790
stack72 merged 1 commit intomainfrom
adam/reports

Conversation

@adamhjk
Copy link
Copy Markdown
Contributor

@adamhjk adamhjk commented Mar 20, 2026

Summary

  • Add a reports system: user-defined TypeScript extensions in extensions/reports/ that analyze model/workflow execution results and produce markdown + JSON output, automatically persisted as versioned data artifacts
  • Three report scopes (method, model, workflow) with a three-level control model: model-type defaults, definition YAML reports.require/reports.skip, and CLI flags (--report, --skip-report, --report-label, --skip-report-label, --skip-reports)
  • New CLI commands: report describe, report get, report search (with interactive fuzzy-search TUI)
  • Integrate reports into model method runs and workflow runs, extension push/pull, and add a swamp-report skill
  • Add markdown-to-terminal renderer for rich report output

Test Plan

  • Unit tests for all new domain modules (report registry, execution service, data handles, report selection)
  • Unit tests for libswamp generators (describe, get, search) and presentation renderers
  • Integration tests updated for architecture boundary and DDD layer rules
  • All 3474 tests pass, deno check, deno lint, deno fmt clean
  • Binary compiles successfully via deno run compile

🤖 Generated with Claude Code

Add a reports system that lets users define TypeScript extensions which
analyze model and workflow execution results and produce markdown and JSON
output.

Reports are user-defined TypeScript files in `extensions/reports/` that
implement a `ReportDefinition` interface with a name, description, scope,
optional labels, and an async `execute()` function. They run automatically
after model method runs or workflow runs, and their output is persisted as
versioned data artifacts.

Key capabilities:

- Three report scopes: method (single method execution), model (model-level
  aggregation), and workflow (full workflow run with all step results)
- Three-level control model for report selection: model-type defaults in the
  model definition, definition YAML overrides (`reports.require`/`reports.skip`
  with optional method scoping), and CLI flags (`--report`, `--skip-report`,
  `--report-label`, `--skip-report-label`, `--skip-reports`)
- Report results stored as data artifacts with 30-day lifetime and 5-version
  retention, tagged with report metadata for queryability
- User report loader that discovers, bundles, validates, and caches report
  extensions with mtime-based invalidation

CLI commands:

- `swamp report describe <name>` — show report definition metadata
- `swamp report get <name>` — retrieve a stored report's content
- `swamp report search [query]` — interactive fuzzy search across all stored
  reports with filters for model, workflow, scope, and label

Also adds a markdown-to-terminal renderer (via marked + marked-terminal) for
rich report output, integrates report extensions into the extension push/pull
system, and adds a `swamp-report` skill for Claude-assisted report authoring.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: add reporting as a top level primitive

This is a well-structured, comprehensive feature addition. The domain modeling follows DDD principles effectively, the three-level control model is well-designed, and the code quality is high overall. All new files have AGPLv3 headers, use named exports, and the import boundaries are respected.

No Blocking Issues

All the issues found are suggestions — nothing that must block merge.

Suggestions (non-blocking)

1. Dynamic imports should be static imports (src/libswamp/reports/get.ts:116-118, src/libswamp/reports/search.ts:110-112)

Both files import ModelType as a type at the top (import type { ModelType }), then use a dynamic await import("../../domain/models/model_type.ts") to get the runtime class for ModelType.create("workflow"). This should be a static value import instead:

// Change: import type { ModelType } from "../../domain/models/model_type.ts";
// To:     import { ModelType } from "../../domain/models/model_type.ts";

Then use ModelType.create("workflow") directly instead of the dynamic import. The dynamic import is an unnecessary runtime cost and a code smell.

2. Missing test for UserReportLoader (src/domain/reports/user_report_loader.ts)

This is a complex module (330 lines) handling file discovery, bundling with caching, mtime-based invalidation, and dynamic imports. The existing user_model_loader.ts has a test file — this should too. Key things to test: file discovery filtering, Zod validation of exports, duplicate name rejection, and cache invalidation logic.

3. DDD: MethodReportContext and ModelReportContext are identical (src/domain/reports/report_context.ts:39-74)

Both interfaces carry the exact same fields (only the scope discriminant differs: "method" vs "model"). Consider whether the model scope truly needs a separate context type now, or if a single InstanceReportContext with scope: "method" | "model" would reduce duplication. This is a minor DDD observation — the current design works fine with the discriminated union.

4. buildReportDataHandles hardcodes kind inference (src/domain/reports/report_data_handles.ts:51-53)

kind: (d.tags["type"] === "file" ? "file" : "resource") as | "resource" | "file",

This checks d.tags["type"] but report data is persisted with tags.type = "report", which would always evaluate to "resource". If this is intentional for reports, it's fine — but worth verifying the behavior matches intent.

5. Redundant workflow ID lookups in search results (src/libswamp/reports/search.ts:171-184)

For workflow-scoped reports, findWorkflowById is called twice in the same loop iteration — once for modelName and again for workflowName. A single call could populate both.

DDD Assessment

The domain modeling is well-done:

  • Value Objects: ReportScope, ReportSelection, ReportRef — correctly immutable, equality by value
  • Domain Service: filterReports, executeReports — stateless operations spanning reports and models
  • Registry: ReportRegistry — clean Map-backed registry with duplicate protection
  • Bounded Context: Reports are properly decoupled from models — models reference reports by name, not by value
  • Application Service: libswamp generators (reportGet, reportSearch, reportDescribe) correctly orchestrate domain objects from the application layer

The three-level control model (model-type defaults → definition YAML → CLI flags) with "skip always wins" and "required immune to CLI skip" is a clear, well-documented invariant.

Other Observations

  • All new CLI commands support both "log" and "json" output modes ✓
  • All imports from libswamp go through mod.ts
  • sanitizeReportNameForData properly handles path traversal risks ✓
  • assertSafePath used for bundle cache writes ✓
  • Test coverage for execution service, registry, data handles, renderers, and libswamp generators is comprehensive ✓
  • Integration tests updated for architecture boundary and DDD layer rules ✓

Good work — this is a clean, well-designed feature.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: add reporting as a top level primitive

Thorough review of all ~7000 lines of additions. This is a well-designed, well-executed feature that adds a reports system as a first-class primitive. The DDD building blocks are appropriate, the code is clean, and the architecture follows the project's established patterns.

No Blocking Issues Found

Suggestions (non-blocking)

  1. Missing test for user_report_loader.ts — This is a 330-line class with significant logic (bundle caching with mtime checks, file discovery, validation, CJS/ESM interop). The analogous user_model_loader.ts has tests. A test file would improve coverage of the caching and validation paths. Not blocking because the execution service and registry tests cover the downstream behavior, but worth considering as a follow-up.

  2. report_context.ts imports from infrastructure (line 23: UnifiedDataRepository) — This is a type-only import so it's safe, and the pattern is intentional for DDD pragmatism. Just noting for awareness that the domain layer has a type-level dependency on infrastructure.

  3. Non-null assertions in report_execution_service.ts (lines 160, 171: filterOptions.skipReportLabels!) — These are safe because of the condition guards above, but extracting to a local variable would be slightly cleaner:

    const skipLabels = filterOptions.skipReportLabels;
    if (skipLabels && report.labels?.some((l) => skipLabels.includes(l))) { ... }
  4. importBundle double-processes cached bundle (user_report_loader.ts lines 253-270) — rewritten is computed on line 253 but then the method re-reads the cached file and re-applies fixCjsEsmInterop(rewriteZodImports(...)). If the file URL import succeeds, the pre-computed rewritten is unused. Minor inefficiency, not a bug.

What looks good

  • DDD: Report is modeled as a Value Object (immutable result), ReportRegistry as a Repository, ReportExecutionService as a Domain Service, UserReportLoader as an Application Service — all appropriate choices
  • Import boundaries: All CLI commands and presentation renderers import from src/libswamp/mod.ts — no violations of the barrel export rule
  • License headers: Present on all new files
  • Output modes: All CLI commands support both "log" and "json" modes
  • Security: Path traversal prevention via sanitizeReportNameForData and assertSafePath, input validation via Zod schemas, report name pattern enforcement
  • Test coverage: Unit tests for registry, execution service, data handles, all libswamp generators (describe/get/search), and presentation renderers
  • Consistency: The globalThis as any pattern for the report registry matches the existing modelRegistry pattern
  • Filtering logic: The three-level control model (model-type defaults → definition YAML → CLI flags) is well-documented and thoroughly tested

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/libswamp/reports/get.ts:245-250 — JSON artifact version mismatch with markdown artifact

    When fetching the paired JSON content, the code uses data.version (the markdown artifact's version) to look up ${data.name}-json. However, the markdown and JSON artifacts are persisted independently via separate DefaultDataWriter instances in persistReportData() (report_execution_service.ts:226-249). Each DataWriter.writeText() call auto-increments the version counter for its own data name. If the JSON write succeeds but a previous run's markdown write had a different version number (e.g., due to partial failure mid-persist where markdown was written but JSON wasn't), the versions could diverge. The getContent call with data.version on the JSON name would then return null or stale content.

    Breaking scenario: Run a report, markdown persists as v3, JSON persists as v3 — fine. Now imagine a partial failure where only the markdown artifact is written (e.g., disk space issue between the two writes). Next successful run: markdown is v4, JSON is v3. reportGet finds markdown v4, then asks for {name}-json at version 4 — gets null, returns {}.

    Suggested fix: Instead of assuming version parity, look up the JSON artifact by name independently (find latest version for that data name) rather than pinning to the markdown's version. Or persist both artifacts atomically.

  2. src/domain/reports/report_execution_service.ts:222-223 — Unsanitized varySuffix in data name

    The baseName is constructed as report-${sanitized}-${varySuffix} but varySuffix is NOT passed through sanitizeReportNameForData() or any other sanitizer. The varySuffix comes from ctx.forEachVariable.value in execution_service.ts:585-593, which is user-controlled (YAML workflow forEach values). While Data.create validation would reject path traversal characters, the resulting error message would be a confusing internal Zod validation error rather than a clear user-facing message about the vary suffix.

    Breaking scenario: A forEach array contains the string "prod/us-east-1". This becomes varySuffix = "prod/us-east-1", then baseName = "report-myorg-cost-report-prod/us-east-1", which fails Data name validation with a cryptic "path traversal" error when the user just wanted to iterate over regions.

    Suggested fix: Apply the same sanitization to varySuffix:

    const sanitizedVarySuffix = varySuffix ? sanitizeReportNameForData(varySuffix) : undefined;
  3. src/libswamp/models/run.ts:511skipAllReports check is redundant/inconsistent with inner guard

    Line 511: if (!input.skipAllReports && reportRegistry.getAll().length > 0) — but then lines 546-548 pass skipAllReports: input.skipAllReports into filterOptions. If skipAllReports is true, the outer guard short-circuits the entire report block, meaning filterReports never gets to apply its logic that allows required reports through even when skipAllReports is set (lines 128-133 of report_execution_service.ts). So --skip-reports actually skips all reports including required ones, contradicting the documented three-level control model where "required reports are immune to CLI skip flags."

    Breaking scenario: Definition YAML has reports.require: ["@myorg/compliance"] and user runs with --skip-reports. The compliance report is skipped entirely, violating the invariant that required reports always run.

    Suggested fix: Remove !input.skipAllReports from the outer guard (or only check reportRegistry.getAll().length > 0), and let filterReports handle skip-all logic internally, which already preserves required reports.

  4. src/libswamp/workflows/run.ts:468-486 — Same skipAllReports bypass for workflow-scope reports

    The executePostRunReports function at line 527 checks reportRegistry.getAll().length === 0 but doesn't check skipAllReports. However, the calling code at line 468 passes filterOptions that includes skipAllReports to executeReports, which does handle it correctly with the required-report escape hatch. But in models/run.ts:511, the outer guard prevents reaching this code at all. This inconsistency means model runs and workflow runs behave differently for --skip-reports with required reports.

Low

  1. src/libswamp/reports/get.ts:116-119 — Dynamic import of ModelType inside generator

    The reportGet and reportSearch generators both use await import("../../domain/models/model_type.ts") to get ModelType when handling workflow scoping. This is a dynamic import inside a hot path that could be a static import at the top of the file (like run.ts:57 does). While functionally correct, it's inconsistent with the codebase pattern and adds unnecessary overhead on each invocation.

  2. src/domain/reports/report_data_handles.ts:51 — Tag-based kind detection may not match actual data kind

    kind is determined by d.tags["type"] === "file", but tags are user-controlled metadata (set during data persistence). If a report's tag type is "report" (as set in persistReportData), this would always yield "resource", even though report data is persisted as files (kind: "file" in the output spec). The tags.type check doesn't match the actual data kind — it checks whether the tag named type equals "file", but report data always has tags.type = "report".

    This means buildReportDataHandles would categorize all report artifacts as "resource" kind even though they're files. Impact is limited since this is used for report context and display, not for actual file operations.

  3. src/presentation/markdown_renderer.ts:34marked.parse can return Promise but cast to string is unchecked

    The code checks typeof result === "string" and falls back to raw markdown. If marked-terminal introduces async behavior in a future version, this would silently degrade to returning unrendered markdown. The comment documents this assumption, so it's fine for now.

Verdict

PASS — The code is well-structured with comprehensive test coverage. The medium findings (particularly #2 unsanitized varySuffix and #3 skipAllReports bypassing required reports) deserve attention but are not data-loss or security-critical since lower layers catch path traversal and the skip-reports scenario is an edge case in the control model. No critical or high severity issues found.

@stack72 stack72 merged commit 08c5e03 into main Mar 20, 2026
6 checks passed
@stack72 stack72 deleted the adam/reports branch March 20, 2026 14:29
stack72 added a commit that referenced this pull request Mar 20, 2026
…arch

Reports were added as a top-level primitive in #790, but several extension
system paths were not updated to handle them consistently with other
extension types (models, vaults, drivers, datastores).

Bug fixes:
- Report bundles were compiled without bundleOptions (denoConfigPath/
  packageJsonDir), causing reports using bare specifiers or import maps
  to fail during `extension push`
- Report entry points were excluded from the bare specifier detection
  that determines whether package.json resolution is needed
- `validateSourceCompleteness` in extension pull skipped the reports
  directory, so missing report dependencies went unwarned
- `extension search --content-type reports` was rejected by the CLI's
  hardcoded content type validation list

Consistency fixes:
- Manifest validation error message now includes "report" as a valid
  extension type
- Collective validation error messages and docstrings now mention reports
- Content metadata debug log now includes report count

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants