fix(agents-core): upgrade dolt and remove custom jsonb#3096
fix(agents-core): upgrade dolt and remove custom jsonb#3096miles-kt-inkeep merged 3 commits intomainfrom
Conversation
|
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
TL;DR — Upgrades Doltgres from 0.55.x to 0.56.1 and removes the custom Key changes
Summary | 6 files | 3 commits | base: Native
|
There was a problem hiding this comment.
Two issues — one build-breaking, one potential data-corruption risk.
-
The test file
packages/agents-core/src/__tests__/db/dolt-safe-jsonb.test.tsimports from the deleted module and will break the build/test suite. It needs to be deleted alongsidedolt-safe-jsonb.ts. -
The old
DoltSafeJsonbclass encoded backslashes as U+E000 on write and decoded them back on read. If any rows in production were written whiledolt-safe-jsonb.tswas active, those rows now contain literal U+E000 characters in their JSONB values. Removing the decode-on-read path means those values will be returned with U+E000 instead of backslashes. Before merging, confirm that either (a) a data migration has been run to replace U+E000 →\in all affected rows, or (b) no production data was written through this codepath.
Claude Opus | 𝕏
| return new DoltSafeJsonbBuilder(name as any) as unknown as ReturnType< | ||
| typeof import('drizzle-orm/pg-core').jsonb | ||
| >; | ||
| } |
There was a problem hiding this comment.
This file is deleted, but packages/agents-core/src/__tests__/db/dolt-safe-jsonb.test.ts still imports encodeBackslashes and decodeBackslashes from it. That test file needs to be deleted too — otherwise the build/test suite will break.
Also, scripts/fix-doltgres-corrupt-jsonb.mjs has comments referencing dolt-safe-jsonb.ts (lines 13, 24, 213) and still encodes values with U+E000. If the workaround is no longer needed, that script's encoding logic should also be updated or the script removed entirely.
| foreignKey, | ||
| index, | ||
| integer, | ||
| jsonb, |
There was a problem hiding this comment.
Switching to Drizzle's stock jsonb removes the mapFromDriverValue override that decoded U+E000 → \ on read. Any rows written while dolt-safe-jsonb.ts was active will still contain U+E000 placeholders in the database. Without the decode step, those values will now surface U+E000 to application code.
Has a data migration been run to replace U+E000 back to \ in all affected JSONB columns? If not, this change will silently corrupt reads for those rows.
There was a problem hiding this comment.
PR Review Summary
(4) Total Issues | Risk: High
🔴❗ Critical (2) ❗🔴
🔴 1) dolt-safe-jsonb.test.ts Orphaned test file imports deleted module
Issue: The test file packages/agents-core/src/__tests__/db/dolt-safe-jsonb.test.ts imports decodeBackslashes and encodeBackslashes from ../../db/manage/dolt-safe-jsonb, which was deleted in this PR.
Why: This will cause TypeScript compilation errors and test suite failures when pnpm check or pnpm test is run. CI will fail with "Cannot find module" error.
Fix: Delete the orphaned test file since the module it tests no longer exists:
rm packages/agents-core/src/__tests__/db/dolt-safe-jsonb.test.tsRefs:
🔴 2) Data migration Existing U+E000-encoded data will be read incorrectly
Issue: The deleted dolt-safe-jsonb.ts module encoded backslashes as U+E000 on write and decoded them back to backslashes on read. With this module removed, any existing data containing U+E000 characters will no longer be decoded.
Why: Any JSONB data containing backslashes that was written while the custom encoder was active will now be read incorrectly. The U+E000 placeholder character will appear in strings instead of backslashes. This affects all tables with JSONB columns that stored user content with backslashes (file paths, regex patterns, escape sequences in prompts, tool configs, etc.).
The migration script scripts/fix-doltgres-corrupt-jsonb.mjs also encodes data using U+E000, confirming there is data in production that was intentionally encoded this way.
Before (with custom jsonb):
Write: "hello\world" → stored as "hello\uE000world"
Read: "hello\uE000world" → returned as "hello\world"
After (with standard drizzle jsonb):
Read: "hello\uE000world" → returned as "hello\uE000world" ❌
Fix: Before merging this PR:
- Run a data migration to convert all U+E000 characters back to backslashes in existing JSONB data across all Dolt branches, OR
- Verify with the team that no data exists with U+E000 encoding (unlikely given the fix script exists), OR
- Keep the decode-only portion of the custom jsonb to handle legacy data while writing new data without encoding
Refs:
- fix-doltgres-corrupt-jsonb.mjs — migration script that also uses U+E000 encoding
🟠⚠️ Major (2) 🟠⚠️
🟠 1) scripts/fix-doltgres-corrupt-jsonb.mjs Migration script still encodes with U+E000
Issue: The fix script fix-doltgres-corrupt-jsonb.mjs contains an encodeBackslashes function (lines 213-225) that encodes data using U+E000. After this PR merges, if this script is run, it will write U+E000 characters that are never decoded.
Why: Running this script after the PR is merged would cause permanent data corruption from the user's perspective — U+E000 characters would remain in the data forever.
Fix: Update the fix script to either:
- Remove the
encodeBackslashescall since Dolt 0.56.1 no longer needs the workaround - Add a warning/error if run after Dolt 0.56.1 upgrade
- Archive/deprecate the script with a note that it's only for pre-0.56.1 databases
Refs:
🟠 2) Deployment Deployment order dependency for self-hosted users
Issue: The Doltgres upgrade and removal of the encoding workaround should be coordinated with a data migration. Self-hosted deployments need clear upgrade instructions.
Why: The upgrade sequence matters:
- Upgrade Dolt to 0.56.1
- Run data migration to decode U+E000 characters back to backslashes
- Deploy code without the encoder
If steps are done out of order, data could be double-encoded or not decoded properly.
Fix: Document the required deployment order in release notes or changelog. Consider adding a one-time migration that runs automatically on startup to decode U+E000 characters.
💭 Consider (1) 💭
💭 1) .github/workflows/cypress.yml:40 Inconsistent Dolt version pinning across CI workflows
Issue: The Cypress workflow uses pinned version 0.56.1 while ci.yml uses latest.
Why: The inconsistency means the two CI workflows may test against different Dolt versions. Pinned versions are more deterministic, while latest automatically gets fixes but may introduce unexpected changes.
Fix: Consider aligning the version strategy — either pin both to 0.56.1 for reproducibility or use latest for both.
Refs:
- cypress.yml:40
- ci.yml:274 (pre-existing - uses
latest)
🚫 REQUEST CHANGES
Summary: This PR has a build-breaking issue (orphaned test file) that will cause CI to fail, and a potential data corruption concern if there is existing data encoded with U+E000 characters. The orphaned test file must be deleted. The data migration concern needs clarification — please confirm whether a migration was run to decode existing U+E000 data before this change, or verify that no such data exists.
Discarded (0)
No findings were discarded.
Reviewers (3)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-standards |
1 | 1 | 0 | 0 | 0 | 0 | 0 |
pr-review-devops |
3 | 0 | 1 | 0 | 0 | 0 | 2 |
pr-review-breaking-changes |
4 | 3 | 0 | 0 | 0 | 0 | 1 |
| Total | 8 | 4 | 1 | 0 | 0 | 0 | 3 |
Note: Orphaned test file finding was flagged by multiple reviewers (deduplicated). Version inconsistency finding was marked pre-existing by devops reviewer.
There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
✅ Prior Issues Resolved
The PR has been updated since the last review and has addressed the previously raised critical and major issues:
| Prior Issue | Status | Resolution |
|---|---|---|
🔴 Orphaned test file (dolt-safe-jsonb.test.ts) |
✅ Resolved | Deleted in commit e6aed96b7 |
| 🟠 Migration script still encodes U+E000 | ✅ Resolved | Script deleted in commit e6aed96b7 |
🕐 Pending Recommendations (2)
These items were raised in the prior review and still require clarification:
-
🔴 Data migration concern — The custom
DoltSafeJsonbencoder wrote data with U+E000 placeholders. If any production data contains U+E000 characters, it will now be returned as-is instead of decoded to backslashes. Please confirm: (a) a migration was run to decode U+E000 →\in affected rows, or (b) no production data was written through this codepath. -
🟠 Deployment order documentation — Self-hosted deployments may need guidance on upgrade sequencing (Dolt upgrade → data migration → code deploy). Consider documenting in release notes if applicable.
💭 Consider (1) 💭
💭 1) cypress.yml:40 / ci.yml:274 Dolt version pinning inconsistency
Issue: The Cypress workflow pins to 0.56.1 while ci.yml, docker-compose.dbs.yml, and docker-compose.isolated.yml use latest.
Why: Inconsistent pinning may cause CI workflows to test against different Dolt versions. This is pre-existing and not introduced by this PR.
Fix: Consider aligning version strategy across all compose/workflow files for reproducibility.
Refs:
- cypress.yml:40 — pinned
0.56.1 - ci.yml:274 — uses
latest
💡 APPROVE WITH SUGGESTIONS
Summary: The code changes look good! The previously identified critical issues (orphaned test file, migration script) have been properly resolved. The remaining concerns are about data migration confirmation and deployment documentation — please clarify whether production data contains U+E000-encoded values that need migration before this change goes live.
Discarded (0)
No findings were discarded.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
orchestrator |
5 | 0 | 1 | 0 | 0 | 2 | 0 |
| Total | 5 | 0 | 1 | 0 | 0 | 2 | 0 |
Note: Delta review — sub-reviewers were not dispatched as changes since last review were file deletions addressing prior feedback.
Preview URLsUse these stable preview aliases for testing this PR:
These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find. Raw Vercel deployment URLs
|
Ito Test Report ❌10 test cases ran. 5 failed, 1 additional finding, 4 passed. Overall, the unified run failed with 6 of 10 tests failing (4 passing), indicating significant reliability issues despite some core safety checks working as expected. The most critical findings were multiple High-severity server-side JSONB persistence defects where valid escaped/backslash-heavy payloads cause 500s or can corrupt MCP tools, Data Components, and agent context-config records (including rapid multi-submit scenarios that poison subsequent list/get/delete paths), plus a Medium-severity API-validation gap that accepts dangerous transformation strings rejected by the UI, while unauthorized deep links, mobile JSON form usability, invalid-control-character blocking, and edit-flow refresh/back-forward resilience all passed. ❌ Failed (5)
|
| Category | Summary | Screenshot |
|---|---|---|
| Adversarial | 🟠 Dangerous transformation strings are accepted and saved instead of being rejected. |
🟠 Block dangerous transformation patterns in MCP tool overrides
- What failed: The API accepts and persists dangerous transformation strings with HTTP 200 instead of rejecting them.
- Impact: Invalid or unsafe transformation expressions can be stored and later cause unstable behavior when transformations are executed. This also creates inconsistent behavior between UI validation and direct API usage.
- Steps to reproduce:
- Open an existing MCP tool override update flow or call the update API directly.
- Send transformation values containing patterns like
eval(,function(,${...}, or__proto__. - Observe HTTP 200 and persisted unsafe transformation data instead of validation rejection.
- Stub / mock context: Unsafe and benign transformation payloads were submitted directly through local API flows to verify persistence behavior, which bypasses front-end form guards by design. The run used localhost auth bypass settings so only local non-production services were exercised.
- Code analysis: Server-side tool schemas allow any string/object for
transformationwith no dangerous-pattern guard, while UI form validation has explicit dangerous-pattern blocking; API clients can bypass UI checks and still persist unsafe values. - Why this is likely a bug: The product enforces dangerous-pattern blocking in the UI but not in API validation, so unsafe transformations are accepted in production data paths.
Relevant code:
packages/agents-core/src/validation/schemas.ts (lines 1232-1244)
toolOverrides: z
.record(
z.string(),
z.object({
displayName: z.string().optional(),
description: z.string().optional(),
schema: z.any().optional(),
transformation: z
.union([
z.string(), // JMESPath expression
z.record(z.string(), z.string()), // object mapping
])
.optional(),packages/agents-core/src/validation/schemas.ts (lines 2129-2133)
export const ToolUpdateSchema = ToolInsertSchema.partial();
export const ToolApiSelectSchema = createApiSchema(ToolSelectSchema).openapi('Tool');
export const ToolApiInsertSchema = createApiInsertSchema(ToolInsertSchema).openapi('ToolCreate');
export const ToolApiUpdateSchema = createApiUpdateSchema(ToolUpdateSchema).openapi('ToolUpdate');agents-manage-ui/src/components/mcp-servers/form/validation.ts (lines 22-30)
const DANGEROUS_PATTERNS = [
/\$\{.*\}/, // Template injection
/eval\s*\(/i, // Eval calls
/function\s*\(/i, // Function definitions
/constructor/i, // Constructor access
/prototype/i, // Prototype manipulation
/__proto__/i, // Proto access
];Commit: 9a0baf6
Tell us how we did: Give Ito Feedback
No description provided.