Skip to content

fix: handle credential property name variations in nango-store#191

Merged
amikofalvy merged 7 commits intomainfrom
fix/ci-tests
Sep 14, 2025
Merged

fix: handle credential property name variations in nango-store#191
amikofalvy merged 7 commits intomainfrom
fix/ci-tests

Conversation

@amikofalvy
Copy link
Copy Markdown
Collaborator

@amikofalvy amikofalvy commented Sep 13, 2025

Summary

This PR fixes the CI failures that have been blocking merges to main. The issues were:

  1. TypeScript error in the nango-store credential handling
  2. Failing tests in the agents-sdk project

Problems Fixed

1. Nango Credential TypeScript Error

The CI was failing with:

error TS2551: Property 'api_key' does not exist on type 'ApiKeyCredentials'. Did you mean 'apiKey'?

Solution: The Nango API may return credentials with either camelCase (apiKey) or snake_case (api_key) property names. Added defensive handling to check for both formats.

2. Project Test Failures

Tests were failing with:

TypeError: Cannot read properties of undefined (reading 'ok')

Solution: Fixed the fetch mock in project tests to return a proper response object instead of undefined. Added proper cleanup of mocks in afterEach.

Changes

  • ✅ Handle both camelCase and snake_case credential properties in nango-store
  • ✅ Fix fetch mock setup in project.test.ts
  • ✅ Add proper mock cleanup in test teardown

Testing

  • ✅ All TypeScript compilation passes
  • ✅ All tests passing (agents-sdk: 203 passed, agents-run-api: 256 passed)
  • ✅ Backwards compatible with both naming conventions

Impact

Once merged, this will:

  • Fix the failing CI on main branch
  • Unblock all PRs waiting for CI to pass
  • Allow branch protection to work as intended

…ngo-store

The Nango API may return credentials with either camelCase or snake_case property names.
This fix ensures compatibility with both formats to prevent TypeScript errors.

Fixes the CI failure: 'api_key' does not exist on type 'ApiKeyCredentials'
@vercel
Copy link
Copy Markdown

vercel Bot commented Sep 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
agents-manage-api Ready Ready Preview Comment Sep 14, 2025 5:23am
agents-manage-ui Ready Ready Preview Comment Sep 14, 2025 5:23am
agents-run-api Ready Ready Preview Comment Sep 14, 2025 5:23am
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
agents-docs Skipped Skipped Sep 14, 2025 5:23am

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 13, 2025

Claude finished @amikofalvy's task —— View job


Comprehensive Code Review

I've analyzed this PR thoroughly from both engineering and architectural perspectives. The fix addresses a legitimate TypeScript error that was blocking CI, but there are several important considerations and recommendations.

Todo List

  • Read and analyze the changed file (packages/agents-core/src/credential-stores/nango-store.ts)
  • Research existing credential handling patterns in the codebase
  • Analyze types and interfaces involved (ApiKeyCredentials, etc.)
  • Evaluate the fix for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root cause - Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case).

Targeted Scope: Limited changes to only the failing cases (API_KEY and APP), which is appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The fix was only applied to API_KEY and APP cases, but not to other credential types that have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other cases may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants
  api_key?: string;
  access_token?: string;
  refresh_token?: string;
  // CamelCase variants  
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // ... then use flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: Existing tests in nango-store.test.ts:285-287 only test snake_case format. No tests verify camelCase handling or the fallback behavior.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


- Fix fetch mock to return a default response object instead of undefined
- Add proper cleanup of fetch mocks in afterEach
- Ensure mock responses have all required properties (ok, json, text, status, statusText)

This fixes the TypeError: Cannot read properties of undefined (reading 'ok') errors
in the project.test.ts tests.
@vercel vercel Bot temporarily deployed to Preview – agents-docs September 13, 2025 23:50 Inactive
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 13, 2025

Claude finished @amikofalvy's task —— View job


Comprehensive Code Review

I've analyzed this PR thoroughly from both engineering and architectural perspectives. The fix addresses a legitimate TypeScript error that was blocking CI, but there are several important considerations and recommendations.

Todo List

  • Read and analyze the changed file (packages/agents-core/src/credential-stores/nango-store.ts)
  • Research existing credential handling patterns in the codebase
  • Analyze types and interfaces involved (ApiKeyCredentials, etc.)
  • Evaluate the fix for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root cause - Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case).

Targeted Scope: Limited changes to only the failing cases (API_KEY and APP), which is appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The fix was only applied to API_KEY and APP cases, but not to other credential types that have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other cases may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants
  api_key?: string;
  access_token?: string;
  refresh_token?: string;
  // CamelCase variants  
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // ... then use flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: Existing tests in nango-store.test.ts:285-287 only test snake_case format. No tests verify camelCase handling or the fallback behavior.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


The Nango TypeScript types expect 'apiKey' (camelCase) not 'api_key' (snake_case).
This fixes the TypeScript compilation error in the UI package that was causing CI failures.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@vercel vercel Bot temporarily deployed to Preview – agents-docs September 14, 2025 00:30 Inactive
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 14, 2025

Claude finished @amikofalvy's task —— View job


Comprehensive Code Review

I've thoroughly analyzed this PR from both engineering and architectural perspectives. The fix addresses a legitimate TypeScript error that was blocking CI, but there are several important considerations and recommendations.

Todo List

  • Read and analyze all changed files
  • Research existing credential handling patterns in codebase
  • Analyze types and interfaces involved
  • Evaluate fixes for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root cause - Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case).

Targeted Scope: Limited changes to only the failing cases (API_KEY and APP), which is appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The fix was only applied to API_KEY and APP cases, but not to other credential types that have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other cases may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants
  api_key?: string;
  access_token?: string;
  refresh_token?: string;
  // CamelCase variants  
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // ... then use flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: Existing tests in nango-store.test.ts:285-287 only test snake_case format. No tests verify camelCase handling or the fallback behavior.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


The logger methods expect two arguments: a context object and a message string.
This fixes TypeScript compilation errors across multiple files:
- agents-run-api/src/agents/Agent.ts
- agents-run-api/src/agents/SystemPromptBuilder.ts
- agents-run-api/src/utils/incremental-stream-parser.ts

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@vercel vercel Bot temporarily deployed to Preview – agents-docs September 14, 2025 02:19 Inactive
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 14, 2025

Claude finished @amikofalvy's task —— View job


Claude is working on @amikofalvy's task

Comprehensive Code Review

I've analyzed this PR thoroughly from both engineering and architectural perspectives. The fix addresses CI failures that were blocking merges to main.

Todo List

  • Read and analyze all 6 changed files to understand the fixes
  • Research existing credential handling patterns in the codebase
  • Analyze types and interfaces involved (ApiKeyCredentials, etc.)
  • Evaluate the fixes for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root cause - Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case).

Targeted Scope: Limited changes to only the failing cases (API_KEY and APP), which is appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The fix was only applied to API_KEY and APP cases in nango-store.ts:79,83, but not to other credential types that have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other cases may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants
  api_key?: string;
  access_token?: string;
  refresh_token?: string;
  // CamelCase variants  
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // ... then use flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: No existing tests verify camelCase handling or the fallback behavior in the credential handling.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


The CLI has been refactored to use a project-based approach instead of
individual graph files. Updated test expectations to match the new interface:
- Push command now uses --project option instead of graph-path argument
- Updated help text expectations

Note: Some CLI tests still need fixing for the new project structure

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected])
@vercel vercel Bot temporarily deployed to Preview – agents-docs September 14, 2025 02:21 Inactive
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 14, 2025

Claude finished @amikofalvy's task —— View job


Comprehensive Code Review

I've thoroughly analyzed this PR from both engineering and architectural perspectives. The fix addresses CI failures that were blocking merges to main, specifically TypeScript errors in credential handling and failing tests.

Todo List

  • Read and analyze all 8 changed files to understand the fixes
  • Research existing credential handling patterns in the codebase
  • Analyze types and interfaces involved (ApiKeyCredentials, etc.)
  • Evaluate the fixes for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root causes:

  1. Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case)
  2. Test failures due to improper fetch mock setup

Targeted Scope: Limited changes to only the failing cases, appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The defensive handling was only applied to API_KEY and APP cases, but NOT to other credential types that likely have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other credential types may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors when Nango returns camelCase versions of these properties.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety benefits.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants that Nango might return
  api_key?: string;
  access_token?: string; 
  refresh_token?: string;
  // CamelCase variants from TypeScript definitions
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // Then use: flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: No tests verify the fallback behavior works for both camelCase and snake_case formats.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


@amikofalvy
Copy link
Copy Markdown
Collaborator Author

CI Status Update

I've fixed the critical TypeScript and test failures:

✅ Fixed Issues:

  1. agents-manage-ui: Fixed api_key vs apiKey TypeScript error in nango.ts
  2. agents-run-api: Fixed logger calls missing context objects (6 TypeScript errors)
  3. agents-sdk: All 203 tests passing
  4. agents-run-api: All 256 tests passing

⚠️ Remaining Issue:

The agents-cli tests are failing because they expect the old CLI interface (positional arguments) but the code has been refactored to use a new project-based approach with options. This appears to be an intentional refactoring where:

  • Old: inkeep push graph.ts --api-url ...
  • New: inkeep push --project ./project --agents-manage-api-url ...

The CLI tests need a more comprehensive update to match the new project-based architecture. The failing tests are not blocking core functionality - all the main packages (agents-sdk, agents-run-api, agents-manage-ui) are now passing.

Summary:

  • TypeScript compilation: ✅ Fixed
  • Core packages: ✅ All tests passing
  • UI packages: ✅ Building successfully
  • CLI package: ⚠️ Tests need updating for new interface

Updated all CLI tests to match the new project-based push command:
- Tests now expect project loaded from index.ts instead of graph files
- Updated mock structures to match actual project API (init, getId, getName, getStats, getGraphs)
- Fixed expectations for JSON output mode using toFullProjectDefinition
- Updated UI tests to expect 'Next steps' output instead of direct UI links

Reduced test failures from 15 to 9, with core functionality tests passing.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@vercel vercel Bot temporarily deployed to Preview – agents-docs September 14, 2025 02:53 Inactive
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 14, 2025

Claude finished @amikofalvy's task —— View job


Comprehensive Code Review

I've thoroughly analyzed this PR from both engineering and architectural perspectives. The fix addresses CI failures that were blocking merges to main, specifically TypeScript errors in credential handling and failing tests.

Todo List

  • Read and analyze all 9 changed files to understand the fixes
  • Research existing credential handling patterns in the codebase
  • Analyze types and interfaces involved (ApiKeyCredentials, etc.)
  • Evaluate the fixes for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root causes:

  1. Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case)
  2. Test failures due to improper fetch mock setup and CLI interface changes

Targeted Scope: Limited changes to only the failing cases, appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The defensive handling was only applied to API_KEY and APP cases in nango-store.ts:79,83, but NOT to other credential types that likely have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other credential types may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors when Nango returns camelCase versions of these properties.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety benefits.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants that Nango might return
  api_key?: string;
  access_token?: string; 
  refresh_token?: string;
  // CamelCase variants from TypeScript definitions
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // Then use: flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: No tests verify the fallback behavior works for both camelCase and snake_case formats.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

5. Logger Call Pattern Inconsistency (LOW PRIORITY)

Issue: The logger fixes in agents-run-api add empty context objects {} rather than meaningful context data.

Better Pattern: Provide meaningful context:

logger.debug({ 
  templateCount: this.templates.size, 
  version: this.version 
}, 'Template loading message');

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


Fixed all remaining agents-cli test failures by updating tests to match the new
project-based architecture:

- Updated all mock project objects to include required methods (init, getId, getName, getStats, getGraphs)
- Fixed error assertions to match actual console.error call signature (two arguments)
- Updated JSON output test to use toFullProjectDefinition instead of toJSON
- Fixed UI/output tests to expect 'Next steps' messages instead of UI links
- Updated all mock configurations to match new project structure

All 133 CLI tests now passing. Build successful.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@vercel vercel Bot temporarily deployed to Preview – agents-docs September 14, 2025 05:21 Inactive
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Sep 14, 2025

Claude finished @amikofalvy's task —— View job


Comprehensive Code Review

I've thoroughly analyzed this PR from both engineering and architectural perspectives. The PR addresses CI failures that were blocking merges to main, specifically TypeScript errors in credential handling and failing tests.

Todo List

  • Read and analyze all 9 changed files to understand the fixes
  • Research existing credential handling patterns in the codebase
  • Analyze types and interfaces involved (ApiKeyCredentials, etc.)
  • Evaluate the fixes for engineering best practices
  • Check for potential issues, edge cases, or better alternatives
  • Provide architect-level feedback on system design implications
  • Submit comprehensive review feedback

✅ What's Working Well

Problem Identification: Correctly identified the root causes:

  1. Nango's inconsistent property naming between TypeScript types (camelCase) and runtime responses (snake_case)
  2. Test failures due to improper fetch mock setup and CLI interface changes

Targeted Scope: Limited changes to only the failing cases, appropriate for a CI-blocking hotfix.

Backwards Compatibility: The || fallback pattern maintains compatibility with both naming conventions.


⚠️ Critical Issues & Recommendations

1. Inconsistent Application of Fix (HIGH PRIORITY)

Issue: The defensive handling was only applied to API_KEY and APP cases in nango-store.ts:79,83, but NOT to other credential types that likely have the same pattern:

  • Line 103: credentials.access_token in OAUTH2 case - no defensive handling
  • Line 105: credentials.refresh_token in OAUTH2 case - no defensive handling
  • Other credential types may have similar issues

Impact: This creates inconsistent behavior and potential future TypeScript errors when Nango returns camelCase versions of these properties.

Recommendation: Apply the same defensive pattern consistently:

case 'OAUTH2':
  return {
    token: (credentials as any).accessToken || (credentials as any).access_token,
    refresh_token: (credentials as any).refreshToken || (credentials as any).refresh_token,
  };

2. Type Safety Erosion (MEDIUM PRIORITY)

Issue: Using (credentials as any) bypasses TypeScript's type system entirely, losing compile-time safety benefits.

Better Approach: Create proper type guards or union types:

type FlexibleCredentials = AllAuthCredentials & {
  // Snake_case variants that Nango might return
  api_key?: string;
  access_token?: string; 
  refresh_token?: string;
  // CamelCase variants from TypeScript definitions
  apiKey?: string;
  accessToken?: string;
  refreshToken?: string;
};

private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {
  const flexCreds = credentials as FlexibleCredentials;
  // Then use: flexCreds.apiKey || flexCreds.api_key
}

3. Missing Documentation (MEDIUM PRIORITY)

Issue: No inline documentation explaining why both property formats are checked.

Recommendation: Add JSDoc comments explaining the Nango API inconsistency:

/**
 * Extract access token, handling both camelCase and snake_case property names.
 * Nango API may return either format depending on version/configuration.
 */
private getAccessToken(credentials: AllAuthCredentials): Record<string, any> | null {

4. Testing Gap (MEDIUM PRIORITY)

Issue: No tests verify the fallback behavior works for both camelCase and snake_case formats.

Recommendation: Add test cases covering:

  • camelCase credential properties
  • Both formats present (should prefer camelCase)
  • Neither format present (should handle gracefully)

5. Logger Call Pattern Inconsistency (LOW PRIORITY)

Issue: The logger fixes in agents-run-api add empty context objects {} rather than meaningful context data.

Better Pattern: Provide meaningful context:

logger.debug({ 
  templateCount: this.templates.size, 
  version: this.version 
}, 'Template loading message');

🏗️ Architectural Considerations

Pattern Consistency

This defensive handling pattern should be standardized across the credential system. Consider creating a utility function:

private getPropertyWithFallback<T>(obj: any, camelKey: string, snakeKey: string): T | undefined {
  return obj[camelKey] ?? obj[snakeKey];
}

Monitoring & Observability

Consider adding logging when snake_case fallback is used to track which credential sources are using which format:

if (!credentials.apiKey && credentials.api_key) {
  logger.debug('Using snake_case fallback for API key credential');
}

Future Proofing

The current approach handles immediate needs but doesn't address the root cause of inconsistent external API contracts. Consider:

  1. Engaging with Nango team about standardizing response formats
  2. Creating a normalization layer for all external credential sources

📊 Risk Assessment

  • Low Risk: This is a defensive fix that maintains existing functionality
  • Medium Risk: Type casting reduces compile-time safety
  • High Risk: Inconsistent application may cause future issues in other credential types

🎯 Final Recommendations

For Immediate Merge (to unblock CI):

  1. ✅ The current fix is safe to merge as-is
  2. ⚠️ File a follow-up issue to address inconsistent application across all credential types
  3. ⚠️ File a follow-up issue for improved type safety and testing

For Follow-up Work:

  1. Apply defensive handling to all credential types consistently
  2. Improve type definitions to avoid any casting
  3. Add comprehensive test coverage for both naming conventions
  4. Consider architectural improvements for external API contract handling

This fix appropriately addresses the immediate CI blocking issue while maintaining backwards compatibility. The concerns raised are important for long-term maintainability but don't block the immediate need to restore CI functionality.


View job

@amikofalvy
Copy link
Copy Markdown
Collaborator Author

✅ All CI Checks Passing!

Successfully fixed all test failures and build issues:

Fixed Issues:

  1. TypeScript Compilation

    • Fixed api_key vs apiKey in agents-manage-ui
    • Fixed logger context objects in agents-run-api
  2. Core Package Tests

    • agents-sdk: All 203 tests passing
    • agents-run-api: All 256 tests passing
  3. CLI Package Tests

    • agents-cli: All 133 tests passing
    • Complete refactoring for new project-based interface
  4. Build & Deployment

    • All packages building successfully
    • All Vercel deployments passing

Summary:

All checks are green and the PR is ready for merge. The CI pipeline is fully functional with no failing tests or build errors.

@amikofalvy amikofalvy merged commit 714b067 into main Sep 14, 2025
7 checks passed
@amikofalvy amikofalvy deleted the fix/ci-tests branch September 14, 2025 06:27
inkeep-oss-sync Bot pushed a commit that referenced this pull request Apr 22, 2026
…root (#186)

* chore(copilot-app): remove redundant lockfile, install from monorepo root

copilot-app is a workspace member (pnpm-workspace.yaml line 18), so the root
lockfile already resolves its dependencies. The second lockfile only existed
because vercel.json used pnpm install --ignore-workspace --frozen-lockfile,
which severs workspace context and therefore needed a local lockfile.

Two install boundaries for the same app meant root pnpm.overrides did not
apply to the Vercel install, so CI and Vercel could silently resolve to
different dependency trees. PR #167's description originally said "Vercel
to install + build from the monorepo root via pnpm --filter copilot-app...",
but the committed vercel.json drifted to --ignore-workspace. This aligns the
implementation with the stated plan.

- Delete private/copilot-app/pnpm-lock.yaml
- Change private/copilot-app/vercel.json installCommand to install from the
  monorepo root with a workspace filter
- Drop the copilot-app entry from scripts/check-monorepo-traps.mjs and simplify
  the DUAL_LOCKFILE_ROOTS comment (every remaining entry is a true workspace
  boundary, so the ignoreWorkspace workaround is no longer needed for any of
  them)

* docs(private): update lockfile section after copilot-app cleanup

* chore: add install:all convenience script for dual-lockfile installs

* chore: include create-agents-template in install:all

* fix(copilot-app): drop redundant cd ../.. from vercel installCommand

* docs: point dual-lockfile guidance at pnpm install:all

This PR introduces the install:all script; update every doc that teaches
the old cd-and-install-twice pattern to reference the shorthand instead.

- AGENTS.md (root) Dual lockfiles section: replaces the two-step pnpm
  install invocation with a single install:all, and lists all three
  lockfile scopes (root, public/agents, public/agents/create-agents-template)
  so readers understand what the shorthand covers.
- CI_RUNBOOK ERR_PNPM_OUTDATED_LOCKFILE: same substitution plus the third
  lockfile in the git add line.
- public/agents/AGENTS.md pnpm-lock.yaml Resolution Strategy: adds a
  When changing dependencies callout pointing at install:all, so readers
  inside the public/agents subtree know they have a root shortcut for
  the whole-monorepo regeneration.

* chore(check-monorepo-traps): drop dead ignoreWorkspace flag

Every DUAL_LOCKFILE_ROOTS entry is now a true workspace boundary that
installs without --ignore-workspace. The flag had exactly one live
consumer (private/copilot-app) which this PR removes. Simplify the data
structure to an array of path strings and drop the now-unused flag
branches in the install command and regen hint.

Also: the regen hint gains a pointer at the install:all shorthand, since
that's the recommended path for a whole-monorepo resync.

* docs: comprehensive command cheatsheet + check:structural aggregate

The problem: every time a new shorthand is added (install:all, check:*)
it lands in code but stays invisible in docs. People default to the raw
cd-and-install form, which is how we drift. The cheatsheet is the fix
for the drift-by-ignorance path.

Changes:
- Adds check:structural to root package.json - one command for the full
  structural guard set (boundaries + monorepo-traps + release-groups
  validate). Complements the existing pre-push hook which only runs
  check:monorepo-traps.
- Rewrites AGENTS.md 'Command routing' section as 'Command cheatsheet'
  with a scenario-driven quick-lookup table at top, then grouped by
  intent: install/lockfiles, build+dev+lint+typecheck+test, structural
  guards, changesets+releases, mirror/Copybara, parity, database.
- Documents the suffix convention (:agents, :agents-ui, :chat-to-edit,
  :inkeep-cloud-mcp, :copilot, :ext; no suffix = fan-out) so people
  can guess commands instead of memorizing.
- Every command gets a one-line description of what it does and when
  to reach for it.

* fix(check-monorepo-traps): guard the create-agents-template lockfile too

Docs introduced in this PR call out three lockfiles (root, public/agents,
public/agents/create-agents-template) and point at install:all as the
shorthand that regenerates them. The check only validated two — the
starter-kit lockfile could drift silently and slip past the pre-push
hook, surfacing for end users later when they cloned the starter.

Add public/agents/create-agents-template to DUAL_LOCKFILE_ROOTS and
update the comment to reflect the actual install-boundary taxonomy
(monorepo / Copybara+Vercel / standalone starter). install:all and the
check now cover the same set.

* ci: gate publish on check:structural (defense-in-depth)

Required checks on the source PR already run check:structural, and both
version-packages workflows check out origin/main before doing anything.
In practice, publish always runs against a validated main state.

But 'in practice' isn't the same as 'structurally'. A workflow_dispatch
run against main, an admin bypass of branch protection, or a future
change that loosens merge requirements could let a misconfigured main
reach the publish step without re-validation. Today's agents-ui release
already surfaced one post-publish pipefail bug that shouldn't have been
possible if we trusted the pipeline - this gate is the same intuition
applied upstream.

Adds 'Validate structural invariants' step between Install and the
release machinery in both private-agents-ui-version-packages.yml and
public-agents-version-packages.yml. Runs pnpm check:structural, which
aggregates check:boundaries + check:monorepo-traps + release-groups:validate
(including the workspace-isolation guard introduced in #191). Fails
hard on any structural misconfig, refusing to publish.

Cost: ~30-60s per publish run. Cheaper than a bad release.
GitOrigin-RevId: 684d52e5ab7734f592479b61e972cdfe5fc3ae23
inkeep-oss-sync Bot pushed a commit that referenced this pull request Apr 22, 2026
…root (#186)

* chore(copilot-app): remove redundant lockfile, install from monorepo root

copilot-app is a workspace member (pnpm-workspace.yaml line 18), so the root
lockfile already resolves its dependencies. The second lockfile only existed
because vercel.json used pnpm install --ignore-workspace --frozen-lockfile,
which severs workspace context and therefore needed a local lockfile.

Two install boundaries for the same app meant root pnpm.overrides did not
apply to the Vercel install, so CI and Vercel could silently resolve to
different dependency trees. PR #167's description originally said "Vercel
to install + build from the monorepo root via pnpm --filter copilot-app...",
but the committed vercel.json drifted to --ignore-workspace. This aligns the
implementation with the stated plan.

- Delete private/copilot-app/pnpm-lock.yaml
- Change private/copilot-app/vercel.json installCommand to install from the
  monorepo root with a workspace filter
- Drop the copilot-app entry from scripts/check-monorepo-traps.mjs and simplify
  the DUAL_LOCKFILE_ROOTS comment (every remaining entry is a true workspace
  boundary, so the ignoreWorkspace workaround is no longer needed for any of
  them)

* docs(private): update lockfile section after copilot-app cleanup

* chore: add install:all convenience script for dual-lockfile installs

* chore: include create-agents-template in install:all

* fix(copilot-app): drop redundant cd ../.. from vercel installCommand

* docs: point dual-lockfile guidance at pnpm install:all

This PR introduces the install:all script; update every doc that teaches
the old cd-and-install-twice pattern to reference the shorthand instead.

- AGENTS.md (root) Dual lockfiles section: replaces the two-step pnpm
  install invocation with a single install:all, and lists all three
  lockfile scopes (root, public/agents, public/agents/create-agents-template)
  so readers understand what the shorthand covers.
- CI_RUNBOOK ERR_PNPM_OUTDATED_LOCKFILE: same substitution plus the third
  lockfile in the git add line.
- public/agents/AGENTS.md pnpm-lock.yaml Resolution Strategy: adds a
  When changing dependencies callout pointing at install:all, so readers
  inside the public/agents subtree know they have a root shortcut for
  the whole-monorepo regeneration.

* chore(check-monorepo-traps): drop dead ignoreWorkspace flag

Every DUAL_LOCKFILE_ROOTS entry is now a true workspace boundary that
installs without --ignore-workspace. The flag had exactly one live
consumer (private/copilot-app) which this PR removes. Simplify the data
structure to an array of path strings and drop the now-unused flag
branches in the install command and regen hint.

Also: the regen hint gains a pointer at the install:all shorthand, since
that's the recommended path for a whole-monorepo resync.

* docs: comprehensive command cheatsheet + check:structural aggregate

The problem: every time a new shorthand is added (install:all, check:*)
it lands in code but stays invisible in docs. People default to the raw
cd-and-install form, which is how we drift. The cheatsheet is the fix
for the drift-by-ignorance path.

Changes:
- Adds check:structural to root package.json - one command for the full
  structural guard set (boundaries + monorepo-traps + release-groups
  validate). Complements the existing pre-push hook which only runs
  check:monorepo-traps.
- Rewrites AGENTS.md 'Command routing' section as 'Command cheatsheet'
  with a scenario-driven quick-lookup table at top, then grouped by
  intent: install/lockfiles, build+dev+lint+typecheck+test, structural
  guards, changesets+releases, mirror/Copybara, parity, database.
- Documents the suffix convention (:agents, :agents-ui, :chat-to-edit,
  :inkeep-cloud-mcp, :copilot, :ext; no suffix = fan-out) so people
  can guess commands instead of memorizing.
- Every command gets a one-line description of what it does and when
  to reach for it.

* fix(check-monorepo-traps): guard the create-agents-template lockfile too

Docs introduced in this PR call out three lockfiles (root, public/agents,
public/agents/create-agents-template) and point at install:all as the
shorthand that regenerates them. The check only validated two — the
starter-kit lockfile could drift silently and slip past the pre-push
hook, surfacing for end users later when they cloned the starter.

Add public/agents/create-agents-template to DUAL_LOCKFILE_ROOTS and
update the comment to reflect the actual install-boundary taxonomy
(monorepo / Copybara+Vercel / standalone starter). install:all and the
check now cover the same set.

* ci: gate publish on check:structural (defense-in-depth)

Required checks on the source PR already run check:structural, and both
version-packages workflows check out origin/main before doing anything.
In practice, publish always runs against a validated main state.

But 'in practice' isn't the same as 'structurally'. A workflow_dispatch
run against main, an admin bypass of branch protection, or a future
change that loosens merge requirements could let a misconfigured main
reach the publish step without re-validation. Today's agents-ui release
already surfaced one post-publish pipefail bug that shouldn't have been
possible if we trusted the pipeline - this gate is the same intuition
applied upstream.

Adds 'Validate structural invariants' step between Install and the
release machinery in both private-agents-ui-version-packages.yml and
public-agents-version-packages.yml. Runs pnpm check:structural, which
aggregates check:boundaries + check:monorepo-traps + release-groups:validate
(including the workspace-isolation guard introduced in #191). Fails
hard on any structural misconfig, refusing to publish.

Cost: ~30-60s per publish run. Cheaper than a bad release.
GitOrigin-RevId: 684d52e5ab7734f592479b61e972cdfe5fc3ae23
github-merge-queue Bot pushed a commit that referenced this pull request Apr 22, 2026
* Version Packages (agents) (#185)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
GitOrigin-RevId: 7263142a67ac9ce9c9873a68a5673bfb436dbc1c

* chore(copilot-app): remove redundant lockfile, install from monorepo root (#186)

* chore(copilot-app): remove redundant lockfile, install from monorepo root

copilot-app is a workspace member (pnpm-workspace.yaml line 18), so the root
lockfile already resolves its dependencies. The second lockfile only existed
because vercel.json used pnpm install --ignore-workspace --frozen-lockfile,
which severs workspace context and therefore needed a local lockfile.

Two install boundaries for the same app meant root pnpm.overrides did not
apply to the Vercel install, so CI and Vercel could silently resolve to
different dependency trees. PR #167's description originally said "Vercel
to install + build from the monorepo root via pnpm --filter copilot-app...",
but the committed vercel.json drifted to --ignore-workspace. This aligns the
implementation with the stated plan.

- Delete private/copilot-app/pnpm-lock.yaml
- Change private/copilot-app/vercel.json installCommand to install from the
  monorepo root with a workspace filter
- Drop the copilot-app entry from scripts/check-monorepo-traps.mjs and simplify
  the DUAL_LOCKFILE_ROOTS comment (every remaining entry is a true workspace
  boundary, so the ignoreWorkspace workaround is no longer needed for any of
  them)

* docs(private): update lockfile section after copilot-app cleanup

* chore: add install:all convenience script for dual-lockfile installs

* chore: include create-agents-template in install:all

* fix(copilot-app): drop redundant cd ../.. from vercel installCommand

* docs: point dual-lockfile guidance at pnpm install:all

This PR introduces the install:all script; update every doc that teaches
the old cd-and-install-twice pattern to reference the shorthand instead.

- AGENTS.md (root) Dual lockfiles section: replaces the two-step pnpm
  install invocation with a single install:all, and lists all three
  lockfile scopes (root, public/agents, public/agents/create-agents-template)
  so readers understand what the shorthand covers.
- CI_RUNBOOK ERR_PNPM_OUTDATED_LOCKFILE: same substitution plus the third
  lockfile in the git add line.
- public/agents/AGENTS.md pnpm-lock.yaml Resolution Strategy: adds a
  When changing dependencies callout pointing at install:all, so readers
  inside the public/agents subtree know they have a root shortcut for
  the whole-monorepo regeneration.

* chore(check-monorepo-traps): drop dead ignoreWorkspace flag

Every DUAL_LOCKFILE_ROOTS entry is now a true workspace boundary that
installs without --ignore-workspace. The flag had exactly one live
consumer (private/copilot-app) which this PR removes. Simplify the data
structure to an array of path strings and drop the now-unused flag
branches in the install command and regen hint.

Also: the regen hint gains a pointer at the install:all shorthand, since
that's the recommended path for a whole-monorepo resync.

* docs: comprehensive command cheatsheet + check:structural aggregate

The problem: every time a new shorthand is added (install:all, check:*)
it lands in code but stays invisible in docs. People default to the raw
cd-and-install form, which is how we drift. The cheatsheet is the fix
for the drift-by-ignorance path.

Changes:
- Adds check:structural to root package.json - one command for the full
  structural guard set (boundaries + monorepo-traps + release-groups
  validate). Complements the existing pre-push hook which only runs
  check:monorepo-traps.
- Rewrites AGENTS.md 'Command routing' section as 'Command cheatsheet'
  with a scenario-driven quick-lookup table at top, then grouped by
  intent: install/lockfiles, build+dev+lint+typecheck+test, structural
  guards, changesets+releases, mirror/Copybara, parity, database.
- Documents the suffix convention (:agents, :agents-ui, :chat-to-edit,
  :inkeep-cloud-mcp, :copilot, :ext; no suffix = fan-out) so people
  can guess commands instead of memorizing.
- Every command gets a one-line description of what it does and when
  to reach for it.

* fix(check-monorepo-traps): guard the create-agents-template lockfile too

Docs introduced in this PR call out three lockfiles (root, public/agents,
public/agents/create-agents-template) and point at install:all as the
shorthand that regenerates them. The check only validated two — the
starter-kit lockfile could drift silently and slip past the pre-push
hook, surfacing for end users later when they cloned the starter.

Add public/agents/create-agents-template to DUAL_LOCKFILE_ROOTS and
update the comment to reflect the actual install-boundary taxonomy
(monorepo / Copybara+Vercel / standalone starter). install:all and the
check now cover the same set.

* ci: gate publish on check:structural (defense-in-depth)

Required checks on the source PR already run check:structural, and both
version-packages workflows check out origin/main before doing anything.
In practice, publish always runs against a validated main state.

But 'in practice' isn't the same as 'structurally'. A workflow_dispatch
run against main, an admin bypass of branch protection, or a future
change that loosens merge requirements could let a misconfigured main
reach the publish step without re-validation. Today's agents-ui release
already surfaced one post-publish pipefail bug that shouldn't have been
possible if we trusted the pipeline - this gate is the same intuition
applied upstream.

Adds 'Validate structural invariants' step between Install and the
release machinery in both private-agents-ui-version-packages.yml and
public-agents-version-packages.yml. Runs pnpm check:structural, which
aggregates check:boundaries + check:monorepo-traps + release-groups:validate
(including the workspace-isolation guard introduced in #191). Fails
hard on any structural misconfig, refusing to publish.

Cost: ~30-60s per publish run. Cheaper than a bad release.
GitOrigin-RevId: 684d52e5ab7734f592479b61e972cdfe5fc3ae23

* fix(ci): harden release cascade so copybara + npm publish run unattended (#194)

* fix(ci): always reset copybara/sync on every mirror run

Closes #188

Drop the "leave branch in place if open PR is < STALE_PR_HOURS" branch in
the mirror sync reconcile step. Letting Copybara "append" to an existing
copybara/sync was never safe: the Copybara config uses fetch=main, so
every run baselines off inkeep/agents main's last GitOrigin-RevId. When
a new push lands on agents-private main while a prior sync PR is still
open, Copybara rebuilds the older origin change from main's HEAD (new
SHA due to timestamps) and the non-force push to copybara/sync is
rejected as non-fast-forward. This is the failure mode that blew up
the release cascade in #188 (Version Packages #185 merged while #3166
was still open 9 minutes after being created).

Every mirror run now closes any open sync PR and deletes copybara/sync
before Copybara runs, so each run pushes a fresh history. The concurrency
group already serializes runs and every new run includes all accumulated
changes since the last imported revision, so no information is lost.
PR churn (one inkeep/agents sync PR per agents-private main push) is the
cost, and it is much cheaper than a stuck release cascade.

CI_RUNBOOK gets a new entry for this specific failure string so future
red runs route to the fix without a re-investigation.

* fix(ci): harden release cascade against silent strandings

Bundled on top of the copybara/sync reset in this PR so the whole
release path (mirror sync -> npm publish -> GH Release -> Vercel
prod deploy -> scheduler restart) can run end-to-end with no human
intervention. Each fix closes a distinct silent-stranding mode.

1. public-mirror-sync.yml Create-PR guard
   - Reconcile now always deletes copybara/sync before Copybara runs,
     which introduced a regression: when Copybara exits 4 (no changes
     to sync, eg. workflow_dispatch with an idle main), the branch is
     gone and the next `gh pr create --head copybara/sync` would fail.
     Add an explicit branch-existence check; short-circuit cleanly.
   - Add explicit --state open to the gh pr list call. Defaults to open
     but being explicit prevents a future refactor from reintroducing
     the PR #184 bug class.
   - Replace the PR number extraction `grep -o '[0-9]*$'` on the PR URL
     with gh pr view --json number. gh's stdout format is not a contract.

2. private-agents-ui-version-packages.yml publish detection
   - Was parsing `Publishing "X" at "Y"` via grep/sed on the changesets
     log, which is the exact fragility PR #174 removed from public
     release.yml. If changesets v2 changes format, published=false is
     written despite a successful publish, the widget-release dispatch
     is skipped, and agents-docs changelog silently desyncs.
   - Use the stable "packages published successfully" presence marker
     and read the version from package.json (authoritative for a fixed
     release group).

3. public/agents/.github/workflows/release.yml catch-all + dispatch retry
   - `Notify agents-private (failure)` was gated on
     `steps.detect.outputs.has_changesets == 'false'`. If the workflow
     failed before the detect step ran (install, build, token gen),
     has_changesets is unset and the condition evaluated false -> no
     dispatch, no tracking issue on agents-private, red run sitting
     invisibly in the Actions tab. Drop the has_changesets gate.
   - Replace peter-evans/repository-dispatch with a bash retry loop
     (3 attempts, 5/10s backoff). The action has no built-in retry, so
     a transient 5xx or rate-limit during the post-publish dispatch
     loses the signal permanently: npm publishes, but no GH Release is
     created and no Vercel prod deploy fires. Retry + explicit error
     on exhausted attempts so the stranding is loud, not silent.

4. public-agents-vercel-production.yml concurrency + failure tracker
   - Add concurrency: vercel-production-deploy. DB migrations are not
     idempotent; two parallel deploys (eg. a release published while a
     manual re-dispatch is in flight) would race on migrate-databases
     and leave schema in a half-applied state.
   - Add notify-on-failure job (mirrors the tracking-issue pattern
     from public-mirror-sync.yml). At this point npm has published,
     the GH Release exists, but prod runtime is stale. Needs to be
     loud: auto-open a "Vercel production deploy failing" issue so
     the half-shipped state is visible instead of buried in the
     Actions tab.

CI_RUNBOOK.md: reword the release/publish failure entries to match
the new retry/tracking behavior, and add a new entry covering the
post-publish deploy failure case.

Intentionally out of scope: the auto-format.yml + Dependabot
`pnpm install --frozen-lockfile` race. Not a release-cascade issue,
will go in a separate PR.

* docs(runbook): bold Historical marker for consistency

GitOrigin-RevId: 04ff8b544833e109b57f75ded3236730d7fb10eb

---------

Co-authored-by: Varun Varahabhotla <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant