Conversation
… add driver/datastore tests - Validate all extracted file paths are within the temp directory after tar extraction, throwing a UserError if any entry escapes via path traversal sequences - Add 8 unit tests for extractDriverFromSource and extractDatastoreFromSource covering type/name/description extraction, configSchema field parsing, files without the relevant export, and exports missing the type field Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…traversal Use `tar -tzf` to list archive entries and reject any containing ".." or starting with "/" before extracting, eliminating the TOCTOU window where malicious entries could escape tmpDir during extraction. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
There was a problem hiding this comment.
Review Summary
This PR properly addresses the two issues from the prior review. The changes are well-implemented and ready to merge.
No Blocking Issues
The code is clean, follows project conventions, and the security fix is correctly placed.
Analysis
Security Fix (extension_pull.ts:593-615)
The path traversal guard is well-implemented:
- Checks happen BEFORE extraction (the correct time)
- Rejects paths containing
..(path traversal sequences) - Rejects absolute paths starting with
/ - Clear error message identifies the unsafe path
- Appropriate placement in the CLI layer (input validation at application boundary per DDD principles)
The refactoring to reuse archiveEntries for debug logging is a nice simplification that avoids redundant filesystem operations.
Test Coverage (extension_content_extractor_test.ts)
The 8 new tests for driver/datastore extraction are well-written:
- Follow existing test patterns in the file
- Proper cleanup with try/finally blocks
- Cover positive cases (successful extraction) and negative cases (missing export, missing type)
- Symmetric coverage for both drivers and datastores
Code Quality
- TypeScript strict mode: No violations
- Named exports: Properly used
- License headers: Present
- Test file naming: Correctly placed next to source
Suggestions (non-blocking)
-
Future consideration: A unit test for the path traversal check itself would strengthen confidence in this security fix. This could be done by:
- Creating a test archive with malicious paths, or
- Extracting the validation logic into a separate testable function
-
The
entry.includes("..")check is conservative and would reject rare legitimate filenames containing... This is the right tradeoff for security (false positives > false negatives).
LGTM - approve for merge.
There was a problem hiding this comment.
Adversarial Review
Critical
-
Symlink path traversal bypass (
src/cli/commands/extension_pull.ts:609-614)What's wrong: The path traversal check uses
tar -tzfwhich only outputs entry names, not symlink targets. A malicious archive can contain a symlink likeescape_link -> ../../../etc/passwdwhere:tar -tzfoutputs:escape_link- The check
entry.includes("..")tests"escape_link".includes("..")→ false - The check passes, but extraction creates a symlink pointing outside tmpDir
Breaking example:
# Create malicious archive ln -s ../../../etc/passwd escape_link tar -czf malicious.tar.gz escape_link # What PR's check sees: tar -tzf malicious.tar.gz # Output: "escape_link" (no ".." in this string!) # What verbose listing shows: tar -tzvf malicious.tar.gz # Output: "lrwxrwxrwx ... escape_link -> ../../../etc/passwd"
The archive can also include files like
escape_link/inject.txtto write through the symlink after it's extracted.Suggested fix: Either:
- Use
tar -tzvfand parse the-> targetpart to check symlink targets - Add
--no-same-permissionsand extract with Deno's built-in tar/untar that can validate symlink targets - Use tar's
--excludefor symlinks if symlinks aren't needed:tar -xzf archive.tar.gz --exclude='*' --include='*.ts' ...(filter to expected types) - Post-extraction: scan extracted directory for symlinks and reject if any exist or point outside
Medium
-
Debug logging semantic change (
src/cli/commands/extension_pull.ts:634-636)What changed: The original code logged files AFTER extraction by scanning the disk:
const allExtractedFiles = await listFiles(extractDir); for (const f of allExtractedFiles) { logger.debug`Archive contains: ${relative(extractDir, f)}`; }
The new code logs archive entries BEFORE extraction:
for (const entry of archiveEntries) { logger.debug`Archive contains: ${entry}`; }
This changes behavior when:
- Archive contains directories (now logged, previously skipped by
listFileswhich only returns files) - Archive contains symlinks (now logged as entry name, previously logged as resolved path or not at all)
- Extraction partially fails (new code logs intended entries, old code logged actual results)
Not a bug, but could affect debugging. Consider updating the log message to
Archive entry:for clarity. - Archive contains directories (now logged, previously skipped by
Verdict
FAIL - The symlink bypass is a critical security vulnerability that completely defeats the path traversal mitigation. The check must validate symlink targets, not just entry names.
Add post-extraction recursive scan that rejects any symlink whose resolved target (computed via readLink + resolve) falls outside tmpDir, closing the bypass where tar -tzf only shows entry names and not symlink targets. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
There was a problem hiding this comment.
Approved
This PR properly addresses the security concerns from the previous review.
What was reviewed:
- Path traversal protection: Well-implemented defense-in-depth with pre-extraction validation (
tar -tzflisting) and post-extraction symlink scanning - Test coverage: 8 new unit tests for driver/datastore extraction following established patterns
- TypeScript compliance: No new
anytypes, proper error handling withUserError - DDD patterns: Validation logic appropriately placed in CLI layer, tests co-located
Suggestions (non-blocking):
-
Duplicate JSDoc at lines 381-388 in
extension_pull.ts— the comment block forvalidateNoSymlinkEscapeappears twice:/** * Recursively validates that no symlink under `path` resolves... */ /** * Recursively validates that no symlink under `path` resolves... */
-
Minor: The check
entry.includes("..")is conservative but could reject legitimate filenames likechangelog..md. Consider checking for..as a path segment instead (e.g., split by/and check for..elements). The current approach is acceptable for security-sensitive code.
The security implementation is solid — the two-layer approach catches both direct path traversal in entry names and symlink-based escapes.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found. The security fix is architecturally sound.
Medium
-
Overly broad path traversal check (src/cli/commands/extension_pull.ts:640)
The check
entry.includes("..")will cause false positives for legitimate filenames containing consecutive dots.Breaking example: An archive containing a valid file named
config..json,v1..2.patch, orREADME..backupwould be rejected with "Archive contains unsafe path" even though these are safe filenames.Suggested fix: Check path segments individually:
const segments = entry.split("/"); if (segments.some(seg => seg === "..") || entry.startsWith("/")) {
-
Duplicate docstring (src/cli/commands/extension_pull.ts:381-388)
The same docstring block appears twice in succession - appears to be a copy-paste error. Not a functional issue but should be cleaned up.
Low
-
Symlink self-reference edge case (src/cli/commands/extension_pull.ts:399)
The check
!resolvedTarget.startsWith(resolvedTmpDir + "/")would reject a symlink that points toresolvedTmpDiritself (without trailing slash). This is a false positive scenario, not a security bypass. A symlink likefoo -> .at the tmpDir root would be rejected even though it doesn't escape. -
Hardlinks not validated
The post-extraction validation only checks symlinks (
stat.isSymlink). Tar archives can contain hardlinks which would bypass this check. However, the impact is minimal because:- Hardlinks can't point outside the filesystem
- Hardlinks can't link to directories
- The pre-extraction `.." check would still catch path traversal in hardlink names
Verdict
PASS - The security fix provides effective defense-in-depth against path traversal attacks. The pre-extraction entry name check catches malicious paths before extraction, and the post-extraction symlink check provides an additional layer of protection. The copyDir function already skips symlinks during the copy phase, making this truly defense-in-depth.
The overly broad `.." check (Medium #1) could cause operational issues with unusual but valid filenames, but is not a security vulnerability. The duplicate docstring should be cleaned up.
The test additions are well-structured and follow existing patterns.
Summary
Addresses two issues raised in PR review:
Path traversal during archive extraction (
src/cli/commands/extension_pull.ts): Aftertarextracts an extension archive to a temp directory, each extracted file path is now resolved and validated to confirm it starts within the temp directory. If any entry contains path traversal sequences (e.g.../../.bashrc) that would escape the temp dir, aUserErroris thrown before any further processing occurs.Missing test coverage for driver/datastore content extraction (
src/domain/extensions/extension_content_extractor_test.ts): Added 8 unit tests forextractDriverFromSourceandextractDatastoreFromSourcefollowing the same patterns as the existing vault tests:configSchemafields (inlinez.object)driver/datastore)typefieldTest Plan
deno run test)deno checkpasses (no type errors)deno lintpasses (no lint errors)deno fmt --checkpasses (no formatting issues)