Skip to content

feat: optimize CLI output for AI agent consumption#944

Merged
stack72 merged 1 commit intomainfrom
feat/937-optimize-output-modes
Mar 30, 2026
Merged

feat: optimize CLI output for AI agent consumption#944
stack72 merged 1 commit intomainfrom
feat/937-optimize-output-modes

Conversation

@keeb
Copy link
Copy Markdown
Contributor

@keeb keeb commented Mar 30, 2026

Summary

Optimizes CLI output for both human and AI agent consumption by reworking the
built-in @swamp/method-summary report and summarise --log output.

Method-summary report changes

Terminal (markdown) — optimized for humans:

  • Added narrative summary line at top ("search on anime-source succeeded, producing 24 resources")
  • Added git sha of the swamp build in the header for traceability
  • Kept arguments section (globalArgs + methodArgs, redacted for sensitive fields)
  • Restored data output table with syntax-highlighted retrieval commands
  • Retrieval commands are now version-pinned (--version N) so reports are reliable historical records
  • Removed schema dump from terminal output (was noisy, not useful for humans)

JSON (via report get --json) — optimized for agents:

  • Added narrative, output schema (field names/types from model output specs), git sha
  • Kept redacted globalArgs/methodArgs for debugging
  • Data pointers include version numbers
  • Schema eliminates the need for a separate model type describe call

Summarise --log changes

  • Failed method executions now show last error: "..." inline instead of just "✗ 3"
  • Failed workflow runs show the error from the first failed step
  • Actionable error messages visible immediately without falling back to --json

Skill documentation changes

  • Removed --json from 77 execution command examples across 13 skill files
  • Added output mode guidance to swamp-model and swamp-report skills
  • Established consistent rule: execution = default output, retrieval = --json, mutation = --json

Closes #937, closes #921

Test Plan

  • All 3719 tests pass
  • deno check, deno lint, deno fmt clean
  • Binary compiles with git sha stamping
  • Verified terminal report format against swamp-media (narrative + args + table with version-pinned commands)
  • Verified terminal report format against proxmox-manager (no leaked auth ticket, git sha in header)
  • Verified JSON report includes schema, narrative, args, and versioned pointers
  • Verified summarise --log shows error messages in both repos
  • Verified data get retrieval path unchanged

🤖 Generated with Claude Code

@keeb keeb marked this pull request as draft March 30, 2026 19:57
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/domain/reports/builtin/method_summary_report.ts:83formatSchemaFields casts prop.type to string | undefined, but JSON Schema type can be an array (e.g., ["string", "null"] for optional/nullable fields from z.string().optional() or z.string().nullable()). When zodToJsonSchema produces { "type": ["string", "null"] }, the field renders as name: string,null instead of something readable.

    Breaking input: Any Zod schema with .nullable() or .optional() fields (common for optional resource attributes).

    Suggested fix: const type = Array.isArray(prop.type) ? prop.type.filter(t => t !== "null").join(" | ") : prop.type;

  2. src/domain/reports/builtin/method_summary_report.ts:150-153 — The retrieval command embeds definitionName and handles[0].name directly into a backtick-quoted markdown string without escaping. If a data handle name contains backticks or markdown special characters, the markdown output will be malformed.

    Breaking input: A data handle named `foo` or name with | pipe.

    Likelihood: Low in practice since data names are likely validated elsewhere, but worth noting.

  3. JSON output is a breaking change for external consumers — The report JSON shape removes modelId, globalArgs, methodArgs, and retrievalCommand from dataProduced. Any external tooling or agent parsing the @swamp/method-summary JSON output (via --json on report commands) will break. The PR description acknowledges this is intentional, but there's no migration path or version bump.

Low

  1. src/presentation/renderers/summarise.ts:96-98[...method.runs].reverse().find(...) copies the entire runs array just to find the last matching element. For very large run histories this is wasteful. A findLast or backward loop would be more efficient. Not a correctness issue.

  2. src/domain/reports/builtin/method_summary_report.ts:40-42 — Failed narrative omits the trailing period while succeeded narratives include it. Inconsistent but cosmetic.

Verdict

PASS — The code is well-structured, tested, and the logic is correct. The schema type array handling (Medium #1) is the most likely real-world issue but won't cause crashes, just slightly garbled field type display. The JSON shape change (Medium #3) is intentional per the PR description. No blocking issues.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

No blocking issues found. This is a well-structured PR that improves AI agent consumption of CLI output.

What looks good

  1. Report rewritemethod_summary_report.ts is cleanly decomposed into focused helper functions (buildNarrative, formatSchemaFields, renderSchemaMarkdown, renderPointersMarkdown). The narrative + schema + pointers format is much more useful than echoing back args.

  2. DDD alignmentOutputSpecInfo is a proper value object in the domain layer. buildOutputSpecs() in run.ts correctly lives in the application/libswamp layer, translating domain ModelDefinition specs into report context. Report context interfaces are well-defined discriminated unions.

  3. Import boundariessummarise.ts (presentation) imports from ../../libswamp/mod.ts ✓. run.ts (libswamp-internal) imports from domain paths ✓. No boundary violations.

  4. Test coverage — 8 test cases covering: succeeded/failed status, error messages, empty data handles, output spec schema rendering, multiple grouped handles, and JSON output structure validation. Tests follow the "functionName: describes behavior" naming convention.

  5. Error surfacing in summarise — The compact-mode last error: display for failed methods and workflow steps is a good UX improvement. Verbose mode already showed per-run errors, and the new compact fallback finds the most recent failure cleanly.

  6. Skill file consistency--json correctly removed from execution commands (model method run, workflow run) while retained on retrieval (data get, data list), mutation (model create), and inspection commands.

Suggestions

  1. formatSchemaFields only renders top-level type for each property. For schemas with oneOf, $ref, or array items, the output would show unknown. This is fine for now but could be noted as a known limitation if agents rely on it heavily.

  2. Minor: the buildNarrative failed case omits a trailing period (failed${reason}) while the success cases include one (succeeded with no data output.). Cosmetic only.

🤖 Generated with Claude Code

github-actions[bot]
github-actions Bot previously approved these changes Mar 30, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. summarise compact error indentation is hardcoded (src/presentation/renderers/summarise.ts:101): The method error line uses 13 fixed spaces, while the workflow equivalent uses .padEnd(maxLabel + 2) for dynamic alignment. If method names are short or long, the error line may not visually hang under the right column. Minor cosmetic inconsistency.

  2. Method summary report JSON loses modelId (src/domain/reports/builtin/method_summary_report.ts): The old JSON shape included modelId (the UUID) alongside modelName. The new shape drops it. Any script doing report get --json | jq .modelId will silently get null. modelName covers the identity use case for most cases, but worth noting the UUID is no longer available from this report.

  3. Method summary report: single Retrieve: example per spec group (src/domain/reports/builtin/method_summary_report.ts): The inline markdown shows one retrieval command using handles[0].name even when a group has multiple handles. The full name list is shown above it, so users can construct the rest, but a hint like (and N more) on the example would make this clearer for agents.

Verdict

PASS. The error message surfacing in summarise --log is a clear improvement — agents and users can now see why something failed without --verbose. The method summary report rewrite produces a much more useful narrative+schema+pointers format. The skill file changes consistently apply the --json-on-retrieval convention. No UX regressions.

@keeb keeb force-pushed the feat/937-optimize-output-modes branch from 17cb87d to d832cf2 Compare March 30, 2026 20:46
@keeb keeb marked this pull request as ready for review March 30, 2026 20:46
@keeb keeb force-pushed the feat/937-optimize-output-modes branch 2 times, most recently from 9155c83 to b8b2f64 Compare March 30, 2026 21:29
Rework the built-in @swamp/method-summary report to serve both humans
and AI agents through the same artifact:

Terminal (markdown): narrative summary, git sha in header, redacted
arguments, data output table with version-pinned retrieval commands.

JSON (report get --json): narrative, output schema with field types,
redacted arguments, versioned data pointers, git sha. Eliminates the
need for separate model type describe calls.

Add error messages to summarise --log output for failed method
executions and workflow runs.

Remove --json from 87 execution command examples across 19 skill
files. Establish consistent rule: execution = default output,
retrieval = --json, mutation = --json.

Bake git sha at compile time so reports always identify the exact
swamp build.

Closes #937, closes #921

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@keeb keeb force-pushed the feat/937-optimize-output-modes branch from b8b2f64 to d7b73bb Compare March 30, 2026 21:31
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

None.

Suggestions

  1. Removed redactSensitiveArgs test coverage — The PR removed the redactSensitiveArgs redacts sensitive fields in markdown and JSON and without redactSensitiveArgs, args render as-is tests. The redaction logic is still exercised (the function is called in the report), but there are no dedicated tests asserting that sensitive fields are properly masked. Consider adding a focused test back to prevent future regressions in credential redaction.

  2. No test coverage for summarise renderer error display — The new compact-mode error display logic in src/presentation/renderers/summarise.ts (lines 94–103 and 166–183) has no unit tests. The existing file had no tests either, so this is pre-existing, but since the PR adds user-facing error surfacing logic, this would be a good time to add a test file.

  3. Minor: renderPointersMarkdown unreachable branch — The function has an early return for dataHandles.length === 0 producing ["## Data Output", "", "No data output."], but the caller at line 156 already gates on dataHandles.length > 0. The defensive check isn't harmful but the "No data output" path is dead code within the current report.

Notes

  • Import boundary: All CLI/presentation imports use libswamp/mod.ts correctly. Internal libswamp code (src/libswamp/models/run.ts) imports from internal paths as allowed.
  • DDD: OutputSpecInfo is a well-formed value object. buildOutputSpecs appropriately lives in the application/libswamp layer. buildNarrative and renderPointersMarkdown are pure domain functions — clean separation.
  • No any types in production code — test files use as any for mock stubs with deno-lint-ignore, which is the established pattern.
  • License headers: All modified .ts files have the AGPLv3 header.
  • Compile script: Git SHA stamping with proper restore-on-finally is correct. The stderr: "null" in getGitSha gracefully handles non-git environments.
  • Skill doc updates: Consistent removal of --json from execution commands across 13 skill files aligns with the new output mode guidance.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. Arguments section always renders empty {} blobs (method_summary_report.ts lines ~146–155)
    The new code unconditionally renders both **Global Arguments** and **Method Arguments** even when both are {}. The old code showed "No arguments." for empty args and omitted whichever section was absent. For models that take no args, users now see two json\n{}\n blocks after the narrative — noisy and not useful. The deleted tests ("both args empty shows no arguments", "only globalArgs present omits method section") validated the old conditional logic; the new code has no equivalent guard.

  2. retrievalCommand removed from JSON dataProduced (method_summary_report.ts line ~175)
    The old JSON included a ready-to-use retrievalCommand: "swamp data get ...". Scripts now have to construct swamp data get <modelName> <name> --version <version> from the three separate fields. The data is present and version-pinning is a real improvement, but the convenience regression is worth noting for any existing consumers. Consider adding a retrievalCommand field back (version-pinned) so both old and new consumers work.

  3. modelId removed from JSON (method_summary_report.ts line ~163)
    Any script filtering by model ID will silently get no field. Low-impact since modelName covers most lookups, but worth documenting as a breaking JSON shape change.

  4. last error indent hardcoded for methods, dynamic for workflows (summarise.ts line 101 vs 177)
    Method error lines use a hardcoded 13-space indent; workflow uses "".padEnd(maxLabel + 2) to align under the workflow name. On a model with a long name like my-very-long-model-name, the 13-space indent won't align under the method row. Minor visual inconsistency.

Verdict

PASS — No blocking issues. The narrative summary, version-pinned retrieval commands, inline error messages in summarise --log, and schema in JSON are all solid UX improvements. The suggestions above are polish items that don't block merge.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/domain/reports/builtin/method_summary_report.ts:86 — Markdown table breakage if handle.name or definitionName contain pipe characters.
    The retrieval command is interpolated directly into a markdown table cell:
    const cmd = `swamp data get ${definitionName} ${handle.name} --version ${handle.version}`;
    lines.push(`| **${handle.name}** | ${handle.kind} | \`${cmd}\` |`);
    If a data handle name contains | (e.g., foo|bar), the markdown table row will be malformed. In practice model/data names are likely validated to exclude such characters upstream, so this is unlikely to trigger — but the report code itself has no guard.

Low

  1. src/domain/reports/builtin/method_summary_report.ts:77-79 — Dead code path in renderPointersMarkdown.
    The function handles dataHandles.length === 0, but the sole call site (line 156) already guards with if (dataHandles.length > 0). The empty-array branch is unreachable. Not a bug, just dead code.

  2. src/domain/reports/builtin/method_summary_report.ts:41 vs :45 — Inconsistent trailing punctuation in narrative.
    The failed narrative has no trailing period ("...failed: reason"), while the success narratives end with a period ("...succeeded, producing 2 resources (items)."). This won't break anything, but is a minor inconsistency in the agent-facing output.

  3. scripts/compile.ts:49stderr: "null" silences git errors.
    If git rev-parse fails for a reason other than "not a git repo" (e.g., corrupt index), the error is silently swallowed and the sha falls back to empty string. This is the intended design (graceful degradation), but means compile-time git issues won't surface in build logs.

Verdict

PASS — The changes are structurally sound. The compile script correctly saves and restores the version file around git sha stamping. The report logic correctly handles all code paths (succeeded/failed, with/without data, with/without output specs). The redactSensitiveArgs handling is correct since globalArgs and methodArgs are required (non-optional) on MethodReportContext. The summarise.ts changes correctly copy arrays before reversing to avoid mutation. No critical or high issues found.

@stack72 stack72 merged commit 9af30ac into main Mar 30, 2026
10 checks passed
@stack72 stack72 deleted the feat/937-optimize-output-modes branch March 30, 2026 21:38
keeb added a commit that referenced this pull request Mar 30, 2026
PR #944 accidentally dropped the modelId (UUID) field from the
method-summary report JSON. Scripts filtering by model ID silently
got no field. This restores it alongside modelName.

Fixes #954

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
keeb added a commit that referenced this pull request Mar 30, 2026
Restores the version-pinned retrievalCommand field in each dataProduced
entry of the method-summary report JSON output, fixing the backwards
compatibility regression from PR #944.

Closes #953

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
keeb added a commit that referenced this pull request Mar 30, 2026
…r display

Add focused tests that were identified as missing in PR #944 code review:

- buildRedactSensitiveArgs: test redaction of sensitive global/method args
  via executeReports, verify no-op for models without sensitive fields, and
  verify workflow scope skips redaction
- methodSummaryReport: test that redactSensitiveArgs callback properly masks
  sensitive fields in both markdown and JSON output, and that args render
  as-is when no callback is provided
- summarise renderer: test compact-mode "last error:" display for failed
  methods and workflow steps, verify verbose mode shows per-run detail
  instead, and cover JSON output and error handling

Closes #956

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
keeb added a commit that referenced this pull request Mar 30, 2026
…r display

Add focused tests that were identified as missing in PR #944 code review:

- buildRedactSensitiveArgs: test redaction of sensitive global/method args
  via executeReports, verify no-op for models without sensitive fields, and
  verify workflow scope skips redaction
- methodSummaryReport: test that redactSensitiveArgs callback properly masks
  sensitive fields in both markdown and JSON output, and that args render
  as-is when no callback is provided
- summarise renderer: test compact-mode "last error:" display for failed
  methods and workflow steps, verify verbose mode shows per-run detail
  instead, and cover JSON output and error handling

Closes #956

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize CLI output modes for AI operator consumption Provide human-readable CLI output that doesn't require --json for every command

2 participants