feat: optimize CLI output for AI agent consumption#944
Conversation
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
-
src/domain/reports/builtin/method_summary_report.ts:83—formatSchemaFieldscastsprop.typetostring | undefined, but JSON Schematypecan be an array (e.g.,["string", "null"]for optional/nullable fields fromz.string().optional()orz.string().nullable()). WhenzodToJsonSchemaproduces{ "type": ["string", "null"] }, the field renders asname: string,nullinstead of something readable.Breaking input: Any Zod schema with
.nullable()or.optional()fields (common for optional resource attributes).Suggested fix:
const type = Array.isArray(prop.type) ? prop.type.filter(t => t !== "null").join(" | ") : prop.type; -
src/domain/reports/builtin/method_summary_report.ts:150-153— The retrieval command embedsdefinitionNameandhandles[0].namedirectly into a backtick-quoted markdown string without escaping. If a data handle name contains backticks or markdown special characters, the markdown output will be malformed.Breaking input: A data handle named
`foo`orname with | pipe.Likelihood: Low in practice since data names are likely validated elsewhere, but worth noting.
-
JSON output is a breaking change for external consumers — The report JSON shape removes
modelId,globalArgs,methodArgs, andretrievalCommandfromdataProduced. Any external tooling or agent parsing the@swamp/method-summaryJSON output (via--jsonon report commands) will break. The PR description acknowledges this is intentional, but there's no migration path or version bump.
Low
-
src/presentation/renderers/summarise.ts:96-98—[...method.runs].reverse().find(...)copies the entire runs array just to find the last matching element. For very large run histories this is wasteful. AfindLastor backward loop would be more efficient. Not a correctness issue. -
src/domain/reports/builtin/method_summary_report.ts:40-42— Failed narrative omits the trailing period while succeeded narratives include it. Inconsistent but cosmetic.
Verdict
PASS — The code is well-structured, tested, and the logic is correct. The schema type array handling (Medium #1) is the most likely real-world issue but won't cause crashes, just slightly garbled field type display. The JSON shape change (Medium #3) is intentional per the PR description. No blocking issues.
There was a problem hiding this comment.
Code Review
No blocking issues found. This is a well-structured PR that improves AI agent consumption of CLI output.
What looks good
-
Report rewrite —
method_summary_report.tsis cleanly decomposed into focused helper functions (buildNarrative,formatSchemaFields,renderSchemaMarkdown,renderPointersMarkdown). The narrative + schema + pointers format is much more useful than echoing back args. -
DDD alignment —
OutputSpecInfois a proper value object in the domain layer.buildOutputSpecs()inrun.tscorrectly lives in the application/libswamp layer, translating domainModelDefinitionspecs into report context. Report context interfaces are well-defined discriminated unions. -
Import boundaries —
summarise.ts(presentation) imports from../../libswamp/mod.ts✓.run.ts(libswamp-internal) imports from domain paths ✓. No boundary violations. -
Test coverage — 8 test cases covering: succeeded/failed status, error messages, empty data handles, output spec schema rendering, multiple grouped handles, and JSON output structure validation. Tests follow the
"functionName: describes behavior"naming convention. -
Error surfacing in summarise — The compact-mode
last error:display for failed methods and workflow steps is a good UX improvement. Verbose mode already showed per-run errors, and the new compact fallback finds the most recent failure cleanly. -
Skill file consistency —
--jsoncorrectly removed from execution commands (model method run,workflow run) while retained on retrieval (data get,data list), mutation (model create), and inspection commands.
Suggestions
-
formatSchemaFieldsonly renders top-leveltypefor each property. For schemas withoneOf,$ref, or array items, the output would showunknown. This is fine for now but could be noted as a known limitation if agents rely on it heavily. -
Minor: the
buildNarrativefailed case omits a trailing period (failed${reason}) while the success cases include one (succeeded with no data output.). Cosmetic only.
🤖 Generated with Claude Code
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
-
summarisecompact error indentation is hardcoded (src/presentation/renderers/summarise.ts:101): The method error line uses 13 fixed spaces, while the workflow equivalent uses.padEnd(maxLabel + 2)for dynamic alignment. If method names are short or long, the error line may not visually hang under the right column. Minor cosmetic inconsistency. -
Method summary report JSON loses
modelId(src/domain/reports/builtin/method_summary_report.ts): The old JSON shape includedmodelId(the UUID) alongsidemodelName. The new shape drops it. Any script doingreport get --json | jq .modelIdwill silently get null.modelNamecovers the identity use case for most cases, but worth noting the UUID is no longer available from this report. -
Method summary report: single
Retrieve:example per spec group (src/domain/reports/builtin/method_summary_report.ts): The inline markdown shows one retrieval command usinghandles[0].nameeven when a group has multiple handles. The full name list is shown above it, so users can construct the rest, but a hint like(and N more)on the example would make this clearer for agents.
Verdict
PASS. The error message surfacing in summarise --log is a clear improvement — agents and users can now see why something failed without --verbose. The method summary report rewrite produces a much more useful narrative+schema+pointers format. The skill file changes consistently apply the --json-on-retrieval convention. No UX regressions.
17cb87d to
d832cf2
Compare
9155c83 to
b8b2f64
Compare
Rework the built-in @swamp/method-summary report to serve both humans and AI agents through the same artifact: Terminal (markdown): narrative summary, git sha in header, redacted arguments, data output table with version-pinned retrieval commands. JSON (report get --json): narrative, output schema with field types, redacted arguments, versioned data pointers, git sha. Eliminates the need for separate model type describe calls. Add error messages to summarise --log output for failed method executions and workflow runs. Remove --json from 87 execution command examples across 19 skill files. Establish consistent rule: execution = default output, retrieval = --json, mutation = --json. Bake git sha at compile time so reports always identify the exact swamp build. Closes #937, closes #921 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
b8b2f64 to
d7b73bb
Compare
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
Removed
redactSensitiveArgstest coverage — The PR removed theredactSensitiveArgs redacts sensitive fields in markdown and JSONandwithout redactSensitiveArgs, args render as-istests. The redaction logic is still exercised (the function is called in the report), but there are no dedicated tests asserting that sensitive fields are properly masked. Consider adding a focused test back to prevent future regressions in credential redaction. -
No test coverage for summarise renderer error display — The new compact-mode error display logic in
src/presentation/renderers/summarise.ts(lines 94–103 and 166–183) has no unit tests. The existing file had no tests either, so this is pre-existing, but since the PR adds user-facing error surfacing logic, this would be a good time to add a test file. -
Minor:
renderPointersMarkdownunreachable branch — The function has an early return fordataHandles.length === 0producing["## Data Output", "", "No data output."], but the caller at line 156 already gates ondataHandles.length > 0. The defensive check isn't harmful but the "No data output" path is dead code within the current report.
Notes
- Import boundary: All CLI/presentation imports use
libswamp/mod.tscorrectly. Internal libswamp code (src/libswamp/models/run.ts) imports from internal paths as allowed. - DDD:
OutputSpecInfois a well-formed value object.buildOutputSpecsappropriately lives in the application/libswamp layer.buildNarrativeandrenderPointersMarkdownare pure domain functions — clean separation. - No
anytypes in production code — test files useas anyfor mock stubs withdeno-lint-ignore, which is the established pattern. - License headers: All modified
.tsfiles have the AGPLv3 header. - Compile script: Git SHA stamping with proper restore-on-finally is correct. The
stderr: "null"ingetGitShagracefully handles non-git environments. - Skill doc updates: Consistent removal of
--jsonfrom execution commands across 13 skill files aligns with the new output mode guidance.
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
-
Arguments section always renders empty
{}blobs (method_summary_report.tslines ~146–155)
The new code unconditionally renders both**Global Arguments**and**Method Arguments**even when both are{}. The old code showed "No arguments." for empty args and omitted whichever section was absent. For models that take no args, users now see twojson\n{}\nblocks after the narrative — noisy and not useful. The deleted tests ("both args empty shows no arguments", "only globalArgs present omits method section") validated the old conditional logic; the new code has no equivalent guard. -
retrievalCommandremoved from JSONdataProduced(method_summary_report.tsline ~175)
The old JSON included a ready-to-useretrievalCommand: "swamp data get ...". Scripts now have to constructswamp data get <modelName> <name> --version <version>from the three separate fields. The data is present and version-pinning is a real improvement, but the convenience regression is worth noting for any existing consumers. Consider adding aretrievalCommandfield back (version-pinned) so both old and new consumers work. -
modelIdremoved from JSON (method_summary_report.tsline ~163)
Any script filtering by model ID will silently get no field. Low-impact sincemodelNamecovers most lookups, but worth documenting as a breaking JSON shape change. -
last errorindent hardcoded for methods, dynamic for workflows (summarise.tsline 101 vs 177)
Method error lines use a hardcoded 13-space indent; workflow uses"".padEnd(maxLabel + 2)to align under the workflow name. On a model with a long name likemy-very-long-model-name, the 13-space indent won't align under the method row. Minor visual inconsistency.
Verdict
PASS — No blocking issues. The narrative summary, version-pinned retrieval commands, inline error messages in summarise --log, and schema in JSON are all solid UX improvements. The suggestions above are polish items that don't block merge.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
src/domain/reports/builtin/method_summary_report.ts:86— Markdown table breakage ifhandle.nameordefinitionNamecontain pipe characters.
The retrieval command is interpolated directly into a markdown table cell:If a data handle name containsconst cmd = `swamp data get ${definitionName} ${handle.name} --version ${handle.version}`; lines.push(`| **${handle.name}** | ${handle.kind} | \`${cmd}\` |`);
|(e.g.,foo|bar), the markdown table row will be malformed. In practice model/data names are likely validated to exclude such characters upstream, so this is unlikely to trigger — but the report code itself has no guard.
Low
-
src/domain/reports/builtin/method_summary_report.ts:77-79— Dead code path inrenderPointersMarkdown.
The function handlesdataHandles.length === 0, but the sole call site (line 156) already guards withif (dataHandles.length > 0). The empty-array branch is unreachable. Not a bug, just dead code. -
src/domain/reports/builtin/method_summary_report.ts:41vs:45— Inconsistent trailing punctuation in narrative.
The failed narrative has no trailing period ("...failed: reason"), while the success narratives end with a period ("...succeeded, producing 2 resources (items)."). This won't break anything, but is a minor inconsistency in the agent-facing output. -
scripts/compile.ts:49—stderr: "null"silences git errors.
Ifgit rev-parsefails for a reason other than "not a git repo" (e.g., corrupt index), the error is silently swallowed and the sha falls back to empty string. This is the intended design (graceful degradation), but means compile-time git issues won't surface in build logs.
Verdict
PASS — The changes are structurally sound. The compile script correctly saves and restores the version file around git sha stamping. The report logic correctly handles all code paths (succeeded/failed, with/without data, with/without output specs). The redactSensitiveArgs handling is correct since globalArgs and methodArgs are required (non-optional) on MethodReportContext. The summarise.ts changes correctly copy arrays before reversing to avoid mutation. No critical or high issues found.
PR #944 accidentally dropped the modelId (UUID) field from the method-summary report JSON. Scripts filtering by model ID silently got no field. This restores it alongside modelName. Fixes #954 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Restores the version-pinned retrievalCommand field in each dataProduced entry of the method-summary report JSON output, fixing the backwards compatibility regression from PR #944. Closes #953 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…r display Add focused tests that were identified as missing in PR #944 code review: - buildRedactSensitiveArgs: test redaction of sensitive global/method args via executeReports, verify no-op for models without sensitive fields, and verify workflow scope skips redaction - methodSummaryReport: test that redactSensitiveArgs callback properly masks sensitive fields in both markdown and JSON output, and that args render as-is when no callback is provided - summarise renderer: test compact-mode "last error:" display for failed methods and workflow steps, verify verbose mode shows per-run detail instead, and cover JSON output and error handling Closes #956 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…r display Add focused tests that were identified as missing in PR #944 code review: - buildRedactSensitiveArgs: test redaction of sensitive global/method args via executeReports, verify no-op for models without sensitive fields, and verify workflow scope skips redaction - methodSummaryReport: test that redactSensitiveArgs callback properly masks sensitive fields in both markdown and JSON output, and that args render as-is when no callback is provided - summarise renderer: test compact-mode "last error:" display for failed methods and workflow steps, verify verbose mode shows per-run detail instead, and cover JSON output and error handling Closes #956 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Summary
Optimizes CLI output for both human and AI agent consumption by reworking the
built-in
@swamp/method-summaryreport andsummarise --logoutput.Method-summary report changes
Terminal (markdown) — optimized for humans:
--version N) so reports are reliable historical recordsJSON (via
report get --json) — optimized for agents:model type describecallSummarise
--logchangeslast error: "..."inline instead of just "✗ 3"--jsonSkill documentation changes
--jsonfrom 77 execution command examples across 13 skill files--json, mutation =--jsonCloses #937, closes #921
Test Plan
deno check,deno lint,deno fmtcleansummarise --logshows error messages in both reposdata getretrieval path unchanged🤖 Generated with Claude Code