Skip to content

update cli docs#184

Merged
anubra266 merged 1 commit intomainfrom
update-cli-docs
Sep 13, 2025
Merged

update cli docs#184
anubra266 merged 1 commit intomainfrom
update-cli-docs

Conversation

@anubra266
Copy link
Copy Markdown
Contributor

No description provided.

@vercel
Copy link
Copy Markdown

vercel Bot commented Sep 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
agents-docs Building Building Preview Comment Sep 13, 2025 8:25pm
agents-manage-api Building Building Preview Comment Sep 13, 2025 8:25pm
agents-manage-ui Building Building Preview Comment Sep 13, 2025 8:25pm
agents-run-api Building Building Preview Comment Sep 13, 2025 8:25pm

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 13, 2025

Claude finished @anubra266's task —— View job


📋 TypeScript Staff Engineer PR Review

Review Tasks:

  • Read and analyze the modified CLI documentation file
  • Evaluate changes against system architecture and documentation standards
  • Assess customer impact and usability considerations
  • Provide structured feedback with actionable recommendations

Analysis Summary

This PR significantly refactors the CLI documentation to shift from a graph-centric workflow to a project-centric workflow. The changes are substantial (+139/-101 lines) and represent a fundamental change in how users interact with the Inkeep Agent Framework.


🔍 Key Changes Identified

1. Workflow Paradigm Shift

  • Before: inkeep push <graph-path> - pushing individual graph files
  • After: inkeep push - pushing entire projects with multi-agent systems
  • Before: inkeep pull <graph-id> - pulling individual graphs
  • After: inkeep pull - pulling and updating all project files

2. New Environment Management

  • Added --env flag for environment-specific credentials
  • New environment file structure: environments/<environment>.env.ts
  • Support for development/production credential separation

3. Enhanced File Management

  • Project-wide file discovery and updates
  • Smart categorization (index, graphs, agents, tools)
  • Context-aware TypeScript generation per file type

💡 Strengths

Excellent Documentation Structure

  • Clear command syntax and examples
  • Comprehensive troubleshooting section
  • Well-organized workflow examples

Strong User Experience Focus

  • Step-by-step examples for common workflows
  • Environment-specific configuration guidance
  • Helpful error scenarios and solutions

⚠️ Critical Issues & Recommendations

1. Breaking Changes Not Clearly Communicated

Severity: HIGH

The documentation presents this as current functionality, but these appear to be breaking changes to the CLI interface:

  • Old: inkeep push src/agents/my-graph.ts
  • New: inkeep push (no file argument)
  • Old: inkeep pull existing-graph-id
  • New: inkeep pull (no graph ID)

Recommendations:

  • Add a "Migration Guide" section explaining the breaking changes
  • Include version information indicating when this new API was introduced
  • Consider deprecation warnings for the old syntax
  • Provide clear migration steps from graph-based to project-based workflow

2. Inconsistent Command Examples

Severity: MEDIUM

Several troubleshooting examples still reference the old API:

# Line 533 - Still uses old syntax
inkeep pull graph-id --max-retries 5

# Line 539 - Still uses old syntax  
inkeep pull graph-id --json

Recommendation: Update all examples to use the new project-centric API consistently.

3. Missing Error Handling Documentation

Severity: MEDIUM

The new project-wide update functionality introduces significant complexity, but error handling is underspecified:

  • What happens if LLM generation fails for some files but not others?
  • How are partial failures handled in project-wide operations?
  • What's the rollback strategy if TypeScript generation corrupts files?

Recommendations:

  • Document partial failure scenarios
  • Add backup/restore guidance for project-wide operations
  • Include validation steps users should take after inkeep pull

4. Credential Security Considerations

Severity: MEDIUM-HIGH

The new environment file structure stores credentials, but security guidance is minimal:

Current Example:

// environments/development.env.ts - This will be committed to git!
export const development = registerEnvironmentSettings({
  credentials: {
    "api-key-dev": {
      retrievalParams: {
        key: "API_KEY_DEV", // References env var - good
      },
    },
  },
});

Recommendations:

  • Add explicit guidance about NOT committing actual secrets
  • Show how credentials reference environment variables
  • Include .gitignore recommendations for credential files
  • Add section on credential rotation and management

5. File Discovery Algorithm Unclear

Severity: MEDIUM

The documentation states: "Recursively finds all .ts files in your project (excluding environments/ and node_modules/)"

Concerns:

  • No mention of other standard exclusions (.git/, dist/, build/, etc.)
  • What about symlinks or nested projects?
  • Performance implications for large codebases?
  • Risk of modifying unrelated TypeScript files?

Recommendations:

  • Define the complete exclusion list
  • Add option to specify custom exclusions
  • Include dry-run mode: inkeep pull --dry-run
  • Show example of what files would be modified before making changes

🏗️ System Architecture Considerations

1. Atomic Operations

The shift to project-wide operations raises atomicity concerns. If inkeep pull modifies 20 TypeScript files and fails on file 15, what's the state?

Recommendation: Document transaction semantics and recovery procedures.

2. Version Control Integration

Project-wide modifications have significant VCS implications:

Recommendations:

  • Add git integration suggestions (branch creation, automatic commits)
  • Include diff review workflows before pushing changes
  • Consider --commit flag to automatically create commits with descriptive messages

3. Scalability Concerns

For large projects with many agent files, the LLM-based generation could be:

  • Expensive (multiple LLM calls)
  • Slow (sequential processing)
  • Error-prone (cascading failures)

Recommendations:

  • Document expected performance characteristics
  • Consider parallel processing for independent files
  • Add progress indicators for long-running operations

🎯 Customer Experience Issues

1. Learning Curve Steepness

The transition from simple graph-based commands to complex project management represents a significant complexity increase.

Recommendations:

  • Add a "Quick Start" section showing the simplest possible workflow
  • Provide a comparison table: "Old way vs New way"
  • Include video tutorials or interactive examples

2. Discoverability Problems

Key functionality is buried in long explanations:

Issues:

  • The --json flag (important for debugging) is mentioned casually
  • Environment credential loading is explained but not prominently featured
  • Model configuration is hidden deep in the pull command documentation

Recommendations:

  • Add a "Common Flags" reference section
  • Create visual hierarchy with better use of callouts/admonitions
  • Include a "CLI Cheat Sheet" section

🔧 Technical Documentation Issues

1. Type Safety Missing

The environment file example doesn't show TypeScript types clearly:

Current:

export const development = registerEnvironmentSettings({
  credentials: { /* ... */ }
});

Better:

import type { EnvironmentSettings } from "@inkeep/agents-sdk";

export const development: EnvironmentSettings = registerEnvironmentSettings({
  credentials: { /* ... */ }
});

2. Import Statements Inconsistent

Different sections show different import patterns:

  • Line 398: import { project } from "@inkeep/agents-sdk";
  • Line 411: import { agent, agentGraph } from "@inkeep/agents-sdk";
  • Line 93: import { CredentialStoreType } from "@inkeep/agents-core";

Recommendation: Standardize import patterns and show complete, working examples.


✅ Final Recommendations

Immediate Actions (Pre-merge)

  1. Fix inconsistent command examples in troubleshooting section
  2. Add breaking changes notice and migration guide
  3. Include security warnings for credential management
  4. Add dry-run capability documentation

Short-term Improvements

  1. Create interactive tutorials for the new workflow
  2. Add comprehensive error code reference
  3. Include performance characteristics documentation
  4. Create CLI reference card/cheat sheet

Long-term Considerations

  1. Consider backward compatibility for the old graph-based API
  2. Implement transaction-safe file operations
  3. Add integration with popular git workflows
  4. Create visual project structure examples

This documentation update represents a significant evolution of the Inkeep CLI toward a more sophisticated project management approach. While the changes are well-intentioned and likely reflect important product improvements, the documentation needs additional work to ensure smooth user adoption and minimize confusion during the transition.


@anubra266 anubra266 merged commit 13a6009 into main Sep 13, 2025
2 of 7 checks passed
@anubra266 anubra266 deleted the update-cli-docs branch September 13, 2025 20:25
inkeep-oss-sync Bot pushed a commit that referenced this pull request Apr 22, 2026
…ded (#194)

* fix(ci): always reset copybara/sync on every mirror run

Closes #188

Drop the "leave branch in place if open PR is < STALE_PR_HOURS" branch in
the mirror sync reconcile step. Letting Copybara "append" to an existing
copybara/sync was never safe: the Copybara config uses fetch=main, so
every run baselines off inkeep/agents main's last GitOrigin-RevId. When
a new push lands on agents-private main while a prior sync PR is still
open, Copybara rebuilds the older origin change from main's HEAD (new
SHA due to timestamps) and the non-force push to copybara/sync is
rejected as non-fast-forward. This is the failure mode that blew up
the release cascade in #188 (Version Packages #185 merged while #3166
was still open 9 minutes after being created).

Every mirror run now closes any open sync PR and deletes copybara/sync
before Copybara runs, so each run pushes a fresh history. The concurrency
group already serializes runs and every new run includes all accumulated
changes since the last imported revision, so no information is lost.
PR churn (one inkeep/agents sync PR per agents-private main push) is the
cost, and it is much cheaper than a stuck release cascade.

CI_RUNBOOK gets a new entry for this specific failure string so future
red runs route to the fix without a re-investigation.

* fix(ci): harden release cascade against silent strandings

Bundled on top of the copybara/sync reset in this PR so the whole
release path (mirror sync -> npm publish -> GH Release -> Vercel
prod deploy -> scheduler restart) can run end-to-end with no human
intervention. Each fix closes a distinct silent-stranding mode.

1. public-mirror-sync.yml Create-PR guard
   - Reconcile now always deletes copybara/sync before Copybara runs,
     which introduced a regression: when Copybara exits 4 (no changes
     to sync, eg. workflow_dispatch with an idle main), the branch is
     gone and the next `gh pr create --head copybara/sync` would fail.
     Add an explicit branch-existence check; short-circuit cleanly.
   - Add explicit --state open to the gh pr list call. Defaults to open
     but being explicit prevents a future refactor from reintroducing
     the PR #184 bug class.
   - Replace the PR number extraction `grep -o '[0-9]*$'` on the PR URL
     with gh pr view --json number. gh's stdout format is not a contract.

2. private-agents-ui-version-packages.yml publish detection
   - Was parsing `Publishing "X" at "Y"` via grep/sed on the changesets
     log, which is the exact fragility PR #174 removed from public
     release.yml. If changesets v2 changes format, published=false is
     written despite a successful publish, the widget-release dispatch
     is skipped, and agents-docs changelog silently desyncs.
   - Use the stable "packages published successfully" presence marker
     and read the version from package.json (authoritative for a fixed
     release group).

3. public/agents/.github/workflows/release.yml catch-all + dispatch retry
   - `Notify agents-private (failure)` was gated on
     `steps.detect.outputs.has_changesets == 'false'`. If the workflow
     failed before the detect step ran (install, build, token gen),
     has_changesets is unset and the condition evaluated false -> no
     dispatch, no tracking issue on agents-private, red run sitting
     invisibly in the Actions tab. Drop the has_changesets gate.
   - Replace peter-evans/repository-dispatch with a bash retry loop
     (3 attempts, 5/10s backoff). The action has no built-in retry, so
     a transient 5xx or rate-limit during the post-publish dispatch
     loses the signal permanently: npm publishes, but no GH Release is
     created and no Vercel prod deploy fires. Retry + explicit error
     on exhausted attempts so the stranding is loud, not silent.

4. public-agents-vercel-production.yml concurrency + failure tracker
   - Add concurrency: vercel-production-deploy. DB migrations are not
     idempotent; two parallel deploys (eg. a release published while a
     manual re-dispatch is in flight) would race on migrate-databases
     and leave schema in a half-applied state.
   - Add notify-on-failure job (mirrors the tracking-issue pattern
     from public-mirror-sync.yml). At this point npm has published,
     the GH Release exists, but prod runtime is stale. Needs to be
     loud: auto-open a "Vercel production deploy failing" issue so
     the half-shipped state is visible instead of buried in the
     Actions tab.

CI_RUNBOOK.md: reword the release/publish failure entries to match
the new retry/tracking behavior, and add a new entry covering the
post-publish deploy failure case.

Intentionally out of scope: the auto-format.yml + Dependabot
`pnpm install --frozen-lockfile` race. Not a release-cascade issue,
will go in a separate PR.

* docs(runbook): bold Historical marker for consistency

GitOrigin-RevId: 04ff8b544833e109b57f75ded3236730d7fb10eb
github-merge-queue Bot pushed a commit that referenced this pull request Apr 22, 2026
* Version Packages (agents) (#185)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
GitOrigin-RevId: 7263142a67ac9ce9c9873a68a5673bfb436dbc1c

* chore(copilot-app): remove redundant lockfile, install from monorepo root (#186)

* chore(copilot-app): remove redundant lockfile, install from monorepo root

copilot-app is a workspace member (pnpm-workspace.yaml line 18), so the root
lockfile already resolves its dependencies. The second lockfile only existed
because vercel.json used pnpm install --ignore-workspace --frozen-lockfile,
which severs workspace context and therefore needed a local lockfile.

Two install boundaries for the same app meant root pnpm.overrides did not
apply to the Vercel install, so CI and Vercel could silently resolve to
different dependency trees. PR #167's description originally said "Vercel
to install + build from the monorepo root via pnpm --filter copilot-app...",
but the committed vercel.json drifted to --ignore-workspace. This aligns the
implementation with the stated plan.

- Delete private/copilot-app/pnpm-lock.yaml
- Change private/copilot-app/vercel.json installCommand to install from the
  monorepo root with a workspace filter
- Drop the copilot-app entry from scripts/check-monorepo-traps.mjs and simplify
  the DUAL_LOCKFILE_ROOTS comment (every remaining entry is a true workspace
  boundary, so the ignoreWorkspace workaround is no longer needed for any of
  them)

* docs(private): update lockfile section after copilot-app cleanup

* chore: add install:all convenience script for dual-lockfile installs

* chore: include create-agents-template in install:all

* fix(copilot-app): drop redundant cd ../.. from vercel installCommand

* docs: point dual-lockfile guidance at pnpm install:all

This PR introduces the install:all script; update every doc that teaches
the old cd-and-install-twice pattern to reference the shorthand instead.

- AGENTS.md (root) Dual lockfiles section: replaces the two-step pnpm
  install invocation with a single install:all, and lists all three
  lockfile scopes (root, public/agents, public/agents/create-agents-template)
  so readers understand what the shorthand covers.
- CI_RUNBOOK ERR_PNPM_OUTDATED_LOCKFILE: same substitution plus the third
  lockfile in the git add line.
- public/agents/AGENTS.md pnpm-lock.yaml Resolution Strategy: adds a
  When changing dependencies callout pointing at install:all, so readers
  inside the public/agents subtree know they have a root shortcut for
  the whole-monorepo regeneration.

* chore(check-monorepo-traps): drop dead ignoreWorkspace flag

Every DUAL_LOCKFILE_ROOTS entry is now a true workspace boundary that
installs without --ignore-workspace. The flag had exactly one live
consumer (private/copilot-app) which this PR removes. Simplify the data
structure to an array of path strings and drop the now-unused flag
branches in the install command and regen hint.

Also: the regen hint gains a pointer at the install:all shorthand, since
that's the recommended path for a whole-monorepo resync.

* docs: comprehensive command cheatsheet + check:structural aggregate

The problem: every time a new shorthand is added (install:all, check:*)
it lands in code but stays invisible in docs. People default to the raw
cd-and-install form, which is how we drift. The cheatsheet is the fix
for the drift-by-ignorance path.

Changes:
- Adds check:structural to root package.json - one command for the full
  structural guard set (boundaries + monorepo-traps + release-groups
  validate). Complements the existing pre-push hook which only runs
  check:monorepo-traps.
- Rewrites AGENTS.md 'Command routing' section as 'Command cheatsheet'
  with a scenario-driven quick-lookup table at top, then grouped by
  intent: install/lockfiles, build+dev+lint+typecheck+test, structural
  guards, changesets+releases, mirror/Copybara, parity, database.
- Documents the suffix convention (:agents, :agents-ui, :chat-to-edit,
  :inkeep-cloud-mcp, :copilot, :ext; no suffix = fan-out) so people
  can guess commands instead of memorizing.
- Every command gets a one-line description of what it does and when
  to reach for it.

* fix(check-monorepo-traps): guard the create-agents-template lockfile too

Docs introduced in this PR call out three lockfiles (root, public/agents,
public/agents/create-agents-template) and point at install:all as the
shorthand that regenerates them. The check only validated two — the
starter-kit lockfile could drift silently and slip past the pre-push
hook, surfacing for end users later when they cloned the starter.

Add public/agents/create-agents-template to DUAL_LOCKFILE_ROOTS and
update the comment to reflect the actual install-boundary taxonomy
(monorepo / Copybara+Vercel / standalone starter). install:all and the
check now cover the same set.

* ci: gate publish on check:structural (defense-in-depth)

Required checks on the source PR already run check:structural, and both
version-packages workflows check out origin/main before doing anything.
In practice, publish always runs against a validated main state.

But 'in practice' isn't the same as 'structurally'. A workflow_dispatch
run against main, an admin bypass of branch protection, or a future
change that loosens merge requirements could let a misconfigured main
reach the publish step without re-validation. Today's agents-ui release
already surfaced one post-publish pipefail bug that shouldn't have been
possible if we trusted the pipeline - this gate is the same intuition
applied upstream.

Adds 'Validate structural invariants' step between Install and the
release machinery in both private-agents-ui-version-packages.yml and
public-agents-version-packages.yml. Runs pnpm check:structural, which
aggregates check:boundaries + check:monorepo-traps + release-groups:validate
(including the workspace-isolation guard introduced in #191). Fails
hard on any structural misconfig, refusing to publish.

Cost: ~30-60s per publish run. Cheaper than a bad release.
GitOrigin-RevId: 684d52e5ab7734f592479b61e972cdfe5fc3ae23

* fix(ci): harden release cascade so copybara + npm publish run unattended (#194)

* fix(ci): always reset copybara/sync on every mirror run

Closes #188

Drop the "leave branch in place if open PR is < STALE_PR_HOURS" branch in
the mirror sync reconcile step. Letting Copybara "append" to an existing
copybara/sync was never safe: the Copybara config uses fetch=main, so
every run baselines off inkeep/agents main's last GitOrigin-RevId. When
a new push lands on agents-private main while a prior sync PR is still
open, Copybara rebuilds the older origin change from main's HEAD (new
SHA due to timestamps) and the non-force push to copybara/sync is
rejected as non-fast-forward. This is the failure mode that blew up
the release cascade in #188 (Version Packages #185 merged while #3166
was still open 9 minutes after being created).

Every mirror run now closes any open sync PR and deletes copybara/sync
before Copybara runs, so each run pushes a fresh history. The concurrency
group already serializes runs and every new run includes all accumulated
changes since the last imported revision, so no information is lost.
PR churn (one inkeep/agents sync PR per agents-private main push) is the
cost, and it is much cheaper than a stuck release cascade.

CI_RUNBOOK gets a new entry for this specific failure string so future
red runs route to the fix without a re-investigation.

* fix(ci): harden release cascade against silent strandings

Bundled on top of the copybara/sync reset in this PR so the whole
release path (mirror sync -> npm publish -> GH Release -> Vercel
prod deploy -> scheduler restart) can run end-to-end with no human
intervention. Each fix closes a distinct silent-stranding mode.

1. public-mirror-sync.yml Create-PR guard
   - Reconcile now always deletes copybara/sync before Copybara runs,
     which introduced a regression: when Copybara exits 4 (no changes
     to sync, eg. workflow_dispatch with an idle main), the branch is
     gone and the next `gh pr create --head copybara/sync` would fail.
     Add an explicit branch-existence check; short-circuit cleanly.
   - Add explicit --state open to the gh pr list call. Defaults to open
     but being explicit prevents a future refactor from reintroducing
     the PR #184 bug class.
   - Replace the PR number extraction `grep -o '[0-9]*$'` on the PR URL
     with gh pr view --json number. gh's stdout format is not a contract.

2. private-agents-ui-version-packages.yml publish detection
   - Was parsing `Publishing "X" at "Y"` via grep/sed on the changesets
     log, which is the exact fragility PR #174 removed from public
     release.yml. If changesets v2 changes format, published=false is
     written despite a successful publish, the widget-release dispatch
     is skipped, and agents-docs changelog silently desyncs.
   - Use the stable "packages published successfully" presence marker
     and read the version from package.json (authoritative for a fixed
     release group).

3. public/agents/.github/workflows/release.yml catch-all + dispatch retry
   - `Notify agents-private (failure)` was gated on
     `steps.detect.outputs.has_changesets == 'false'`. If the workflow
     failed before the detect step ran (install, build, token gen),
     has_changesets is unset and the condition evaluated false -> no
     dispatch, no tracking issue on agents-private, red run sitting
     invisibly in the Actions tab. Drop the has_changesets gate.
   - Replace peter-evans/repository-dispatch with a bash retry loop
     (3 attempts, 5/10s backoff). The action has no built-in retry, so
     a transient 5xx or rate-limit during the post-publish dispatch
     loses the signal permanently: npm publishes, but no GH Release is
     created and no Vercel prod deploy fires. Retry + explicit error
     on exhausted attempts so the stranding is loud, not silent.

4. public-agents-vercel-production.yml concurrency + failure tracker
   - Add concurrency: vercel-production-deploy. DB migrations are not
     idempotent; two parallel deploys (eg. a release published while a
     manual re-dispatch is in flight) would race on migrate-databases
     and leave schema in a half-applied state.
   - Add notify-on-failure job (mirrors the tracking-issue pattern
     from public-mirror-sync.yml). At this point npm has published,
     the GH Release exists, but prod runtime is stale. Needs to be
     loud: auto-open a "Vercel production deploy failing" issue so
     the half-shipped state is visible instead of buried in the
     Actions tab.

CI_RUNBOOK.md: reword the release/publish failure entries to match
the new retry/tracking behavior, and add a new entry covering the
post-publish deploy failure case.

Intentionally out of scope: the auto-format.yml + Dependabot
`pnpm install --frozen-lockfile` race. Not a release-cascade issue,
will go in a separate PR.

* docs(runbook): bold Historical marker for consistency

GitOrigin-RevId: 04ff8b544833e109b57f75ded3236730d7fb10eb

---------

Co-authored-by: Varun Varahabhotla <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant