Skip to content

feat: scheduled execution, webhook triggers, and trigger schema for swamp serve#1069

Merged
stack72 merged 8 commits intomainfrom
feat/scheduled-workflow-execution
Apr 2, 2026
Merged

feat: scheduled execution, webhook triggers, and trigger schema for swamp serve#1069
stack72 merged 8 commits intomainfrom
feat/scheduled-workflow-execution

Conversation

@keeb
Copy link
Copy Markdown
Contributor

@keeb keeb commented Apr 2, 2026

Summary

  • Scheduled workflow execution: swamp serve now runs workflows on cron schedules defined in workflow YAML via trigger: { schedule: "<cron>" }. Includes filesystem watcher for live-reload of schedule changes, overlap prevention, and serialized execution to avoid lock contention.
  • HTTP webhook endpoints: New --webhook '<route>:<workflow>:<secret>' flag registers POST endpoints that verify GitHub HMAC-SHA256 signatures and queue workflow runs. Designed for CI triggers (e.g. GitHub push webhook → run extension-test-suite).
  • Unified execution path: Extracted executeWorkflowWithLocks() as the single code path for WebSocket, scheduled, and webhook-triggered workflow runs.
  • Trigger schema: Refactored workflow schema from top-level schedule: to nested trigger: { schedule: } to prepare for additional trigger types (webhook filters, event-based triggers).

Test Plan

  • 4084 tests pass (including 15 new webhook tests, 3 scheduled execution tests, 5 scheduler tests, 6 workflow trigger schema tests)
  • Manually tested swamp serve with 3 scheduled workflows in swamp-media — all registered correctly with new trigger: schema
  • deno check, deno lint, deno fmt all pass

Closes #1041

🤖 Generated with Claude Code

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

  1. Import boundary violation (src/cli/commands/serve.ts:25-27): ScheduledExecutionService is imported directly from ../../libswamp/workflows/scheduled_execution.ts. Per CLAUDE.md, CLI commands must import from src/libswamp/mod.ts — never from internal module paths. The type is already re-exported from src/libswamp/mod.ts, so this is a one-line fix:

    import { ScheduledExecutionService } from "../../libswamp/mod.ts";
  2. Fire-and-forget promises (src/serve/webhook.ts:~line 281, src/libswamp/workflows/scheduled_execution.ts:~line 218): this.processQueue() is called without await or error handling in both WebhookService.handleRequest() and ScheduledExecutionService.handleFire(). CLAUDE.md requires: "No fire-and-forget promises. Every promise must be awaited or explicitly handled." Since these are intentionally async (returning immediately to the caller), the minimum fix is to add .catch() error handlers so failures don't silently disappear:

    this.processQueue().catch((err) => {
      logger.error("Queue processing failed: {error}", {
        error: err instanceof Error ? err.message : String(err),
      });
    });
  3. SKILL.md schema mismatch (.claude/skills/swamp-workflow/SKILL.md): The example YAML shows schedule: "0 3,12 * * *" as a top-level field, but the actual implementation uses trigger: { schedule: "0 3,12 * * *" }. This will mislead users into writing invalid workflow YAML. The example should be updated to match the schema:

    trigger:
      schedule: "0 3,12 * * *"
  4. Unhandled async in signal handlers (src/cli/commands/serve.ts:~line 285): shutdown() is now async but the signal listeners discard the returned promise (() => shutdown()). If shutdown fails (e.g. watcher.stop() throws), the error is silently lost. Add a .catch():

    Deno.addSignalListener("SIGINT", () => shutdown().catch((e) => logger.error("Shutdown error: {e}", { e })));

Suggestions

  1. listEndpoints() exposes secrets: WebhookService.listEndpoints() returns the full WebhookEndpoint including the secret field. While the health endpoint only maps route and workflow, the method signature returns secrets to any caller. Consider a separate type without the secret for the public API, or map to { route, workflowIdOrName } within the method.

  2. DDD: Good layering overall. WorkflowScheduler as a domain service (pure timer lifecycle), ScheduledExecutionService as an application service in libswamp (orchestrating scheduler + watcher + execution), and WebhookService in the serve layer are all well-placed per DDD principles. The executeWorkflowWithLocks extraction into a shared code path is a clean refactor.

  3. Watcher test coverage is thin (src/libswamp/workflows/watcher_test.ts): Only tests the workflowsDir helper function. Consider adding tests for scanExisting() behavior (e.g., filtering workflows without schedules) using a mock repo, similar to the scheduled_execution_test.ts pattern.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. src/serve/webhook.ts:55-78parseWebhookFlag throws plain Error, not UserError
    When a user passes a malformed --webhook flag (e.g., missing the secret segment), the error propagates through renderError and prints a stack trace, while UserError instances print only the message. The error messages themselves are clear ("Invalid --webhook format: expected '::', got '...'"), but the stack trace is noise for a validation error. Consider throw new UserError(...) here for consistency with the rest of the CLI.

  2. src/cli/commands/serve.ts:102-133schedule_unregistered event silently dropped in log mode
    The switch statement in the scheduled execution event handler covers schedule_registered, schedule_fired, schedule_skipped, schedule_completed, and schedule_failed — but not schedule_unregistered. In JSON mode the event is emitted correctly; in log mode it's silently dropped. Users who rely on log output won't see when a workflow's schedule is live-reloaded away. A one-liner logger.info("Unregistered schedule for workflow {name}", ...) would close the gap.

  3. src/cli/commands/serve.ts:53-56 — example uses single quotes, preventing env-var expansion
    The webhook example is swamp serve --webhook '/hooks/github:my-workflow:$WEBHOOK_SECRET'. With single quotes in a shell, $WEBHOOK_SECRET is not expanded — users who copy this verbatim would pass a literal $WEBHOOK_SECRET as the secret. The example would be more accurate with double quotes: "swamp serve --webhook \"/hooks/github:my-workflow:$WEBHOOK_SECRET\"".

  4. src/cli/commands/serve.ts:211-217onListen JSON output includes schedulingEnabled but not webhook count
    The startup JSON event has schedulingEnabled but doesn't surface whether webhooks are active. Since both features are configured at startup, it would be consistent to include "webhooksRegistered": N or similar so scripts can verify their configuration was picked up.

Verdict

PASS — no blocking issues. The new --no-schedule and --webhook flags are well-named and documented, log/JSON output is present for all event kinds (aside from the silent schedule_unregistered gap), and the health endpoint enrichment is additive and backward-compatible.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. src/serve/webhook.ts:55-78parseWebhookFlag throws plain Error, not UserError
    When a user passes a malformed --webhook flag (e.g., missing the secret segment), the error propagates through renderError and prints a stack trace, while UserError instances print only the message. The error messages themselves are clear (Invalid --webhook format: expected '<route>:<workflow>:<secret>', got '...'), but the stack trace is noise for a validation error. Consider throw new UserError(...) here for consistency with the rest of the CLI.

  2. src/cli/commands/serve.ts:102-133schedule_unregistered event silently dropped in log mode
    The switch statement covers schedule_registered, schedule_fired, schedule_skipped, schedule_completed, and schedule_failed — but not schedule_unregistered. In JSON mode the event is emitted correctly; in log mode it's silently dropped. Users relying on log output won't see when a workflow's schedule is live-reloaded away.

  3. src/cli/commands/serve.ts:53-56 — webhook example uses single quotes, preventing env-var expansion
    The example is swamp serve --webhook '/hooks/github:my-workflow:$WEBHOOK_SECRET'. With single quotes, $WEBHOOK_SECRET is not expanded by the shell — users who copy this verbatim pass a literal string. Double quotes would make it work as intended.

  4. src/cli/commands/serve.ts:211-217onListen JSON output includes schedulingEnabled but not webhook count
    The startup JSON event surfaces scheduling state but not whether webhooks are active. Since both are configured at startup, including something like webhooksRegistered: N would let scripts verify their configuration.

Verdict

PASS — no blocking issues. The new --no-schedule and --webhook flags are well-named and documented, log/JSON output is present for all major event kinds, and the health endpoint enrichment is additive and backward-compatible.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

  1. Unawaited async processQueue() — fire-and-forget promisesrc/serve/webhook.ts:271 and src/libswamp/workflows/scheduled_execution.ts:254

    Both WebhookService.handleRequest() and ScheduledExecutionService.handleFire() call this.processQueue() without await. processQueue is an async method that returns a Promise<void>. The comment on line 270 of webhook.ts even says "serialized, awaited — not fire-and-forget" but it is factually fire-and-forget.

    How it breaks: The project's CLAUDE.md explicitly states: "No fire-and-forget promises. Every promise must be awaited or explicitly handled — unhandled promises race with Deno.exit and silently lose data." During graceful shutdown, stop() aborts controllers and clears the queue, but if processQueue is mid-await on executeWorkflow, the shutdown path calls ac.abort() on the HTTP server while the unawaited processQueue promise is still running. The workflow execution may not complete and its result/error is silently lost. In webhook.ts specifically, the response is returned to the caller while processQueue runs detached — if the process exits (e.g., SIGTERM during CI), queued workflow runs vanish.

    Suggested fix: For the webhook case, the fire-and-forget is somewhat intentional (respond immediately, process async). Store the promise and drain it in stop():

    private processingPromise: Promise<void> = Promise.resolve();
    // In handleRequest:
    this.processingPromise = this.processQueue();
    // In stop:
    await this.processingPromise;

    For the scheduled execution case, same pattern — store and await in stop().

  2. Unawaited shutdown() in signal listenerssrc/cli/commands/serve.ts:292-293

    Deno.addSignalListener("SIGINT", () => shutdown());
    Deno.addSignalListener("SIGTERM", () => shutdown());

    shutdown is async but the returned promise is discarded. If shutdown throws (e.g., webhookService.stop() or scheduledExecution.stop() throws), it becomes an unhandled rejection. Additionally, there is no guard against double invocation — if SIGINT fires while a SIGTERM-triggered shutdown is in progress, both run concurrently through webhookService.stop() and scheduledExecution.stop(), which could produce confusing double-abort behavior.

    Suggested fix: Add a shuttingDown guard and capture the promise:

    let shuttingDown = false;
    const shutdown = async () => {
      if (shuttingDown) return;
      shuttingDown = true;
      // ...
    };
  3. Unawaited watchLoop() promisesrc/libswamp/workflows/watcher.ts:69-70

    start() calls this.watchLoop() (async) without await and does not store the promise. If watchLoop throws an error that isn't BadResource, it becomes an unhandled promise rejection.

    Suggested fix: Store the promise on an instance field so stop() can await it, or add a .catch() handler.

Medium

  1. No request body size limit on webhook endpointsrc/serve/webhook.ts:227

    req.arrayBuffer() reads the entire request body into memory with no size limit, before signature verification. An attacker who knows the webhook route (not secret — the route is a CLI flag, not a credential) can send a multi-gigabyte POST body to OOM the process. The route must start with / but is otherwise predictable.

    Suggested fix: Check Content-Length header and reject bodies over a reasonable limit (e.g., 10MB) before calling arrayBuffer(), or use a streaming reader with a byte counter.

  2. Unawaited handleChange() from setTimeout callbacksrc/libswamp/workflows/watcher.ts:115

    The debounce setTimeout callback calls this.handleChange(path, kind) which is async. The returned promise is not awaited or caught. If workflowRepo.findById throws an unexpected error not caught by the inner try/catch (e.g., the repo is disposed during shutdown), it's an unhandled rejection. The inner try/catch (line 148) catches errors from the happy path but the match regex and filename.split before the try block could theoretically throw on exotic inputs (unlikely but unhandled).

  3. Unbounded run queue in both servicessrc/serve/webhook.ts:254 and src/libswamp/workflows/scheduled_execution.ts:253

    Both runQueue arrays grow without bound. Under sustained webhook traffic or many rapidly-firing cron schedules, the queue grows indefinitely. No backpressure is applied — the webhook handler always returns 200/queued regardless of queue depth.

    Suggested fix: Add a max queue depth (e.g., 100) and return HTTP 503 when full for webhooks, or skip with a log for scheduled execution.

Low

  1. Health endpoint exposes schedule details without authenticationsrc/cli/commands/serve.ts:244-266

    The /health endpoint returns workflow IDs, cron expressions, next run times, and running state to any unauthenticated GET request. The serve command already warns about non-loopback binding (line 72-76), but even on loopback this leaks internal workflow topology to any local process.

  2. listEndpoints() returns endpoints including secrets in the typesrc/serve/webhook.ts:199

    The return type is ReadonlyArray<WebhookEndpoint> which includes the secret field. The health endpoint in serve.ts only uses route and workflowIdOrName, so secrets don't leak externally, but future callers of listEndpoints() would get secrets. Consider returning a type without secret.

Verdict

FAIL — The unawaited promises in processQueue(), shutdown(), and watchLoop() directly violate the project's explicit "no fire-and-forget promises" rule in CLAUDE.md. While the serialization logic is correct under normal operation, during shutdown these orphaned promises race with Deno.exit and can silently lose in-flight workflow run results. Fix the promise handling, add the shutdown guard, and consider the body size limit.

keeb and others added 5 commits April 2, 2026 13:24
Workflows can now declare an optional `schedule` field with a cron expression.
When `swamp serve` starts, it scans all workflows and registers cron entries
for any with schedules. A filesystem watcher monitors the workflows directory
for live reload — adding, changing, or removing a schedule takes effect without
restart.

- Add `croner` dependency for cron expression parsing and scheduling
- Add optional `schedule` field to WorkflowSchema and Workflow aggregate
- Create WorkflowScheduler domain service (pure timer lifecycle)
- Create WorkflowWatcher in libswamp (Deno.watchFs with debouncing)
- Create ScheduledExecutionService in libswamp (orchestrates scheduler,
  watcher, and workflowRun with overlap prevention)
- Integrate into `swamp serve` with `--no-schedule` flag and enhanced
  `/health` endpoint reporting scheduled workflows and next fire times
- Update design/workflow.md and swamp-workflow skill

Closes #1009

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Extract executeWorkflowWithLocks into serve/deps.ts as the single code
path for workflow execution with model lock acquisition. Both the
WebSocket handler (connection.ts) and ScheduledExecutionService now call
this shared function, eliminating the duplicated lock acquisition and
preventing execution path divergence (same class of issue as #947).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
When multiple workflows fire at the same cron tick, they now queue and
execute one at a time instead of concurrently. Before scheduling,
each workflow ran as a separate process via systemd timers — serializing
preserves that behavior and avoids lock contention on shared models.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add webhook support to swamp serve via --webhook flag that accepts
HTTP POST requests, verifies GitHub HMAC-SHA256 signatures, and
queues workflow runs through the shared executeWorkflowWithLocks path.

Refactor workflow schema from top-level `schedule:` to nested
`trigger: { schedule: }` to prepare for additional trigger types.

Closes #1041

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Store processQueue() promise in both WebhookService and
  ScheduledExecutionService; drain it in stop() instead of polling
- Add shuttingDown guard to prevent double shutdown on rapid signals
- Catch shutdown() promise rejection in signal listeners
- Store watchLoop() promise with .catch() handler; await in stop()
- Catch unawaited handleChange() in debounce setTimeout callback
- Add 10MB request body size limit on webhook endpoints (413 on exceed)
- Add max queue depth of 100 with HTTP 503 backpressure for webhooks
- Return WebhookEndpointInfo (without secret) from listEndpoints()

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@keeb keeb force-pushed the feat/scheduled-workflow-execution branch from df56233 to a2ba5cc Compare April 2, 2026 20:29
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

  1. Inverted dependency: libswamp imports from serve layer (src/libswamp/workflows/scheduled_execution.ts:38)

    import { executeWorkflowWithLocks } from "../../serve/deps.ts";

    libswamp is the library layer — it must not depend on serve (a consumer/presentation layer). This creates a circular dependency: serve → libswamp → serve. The executeWorkflowWithLocks function should either live in libswamp (since it orchestrates libswamp operations) or ScheduledExecutionService should accept it as an injected dependency. The cleanest fix is to move executeWorkflowWithLocks into libswamp (e.g. src/libswamp/workflows/execute.ts) and have serve/deps.ts re-export it, or inject it via the ScheduledExecutionDeps interface.

  2. CLI command imports from internal libswamp path (src/cli/commands/serve.ts:25-26)

    import {
      ScheduledExecutionService,
    } from "../../libswamp/workflows/scheduled_execution.ts";

    Per CLAUDE.md: "CLI commands and presentation renderers must import libswamp types and functions from src/libswamp/mod.ts — never from internal module paths." This should be:

    import { ScheduledExecutionService } from "../../libswamp/mod.ts";

    The types are already re-exported from src/libswamp/mod.ts, so this is just an import path fix.

  3. SKILL.md documents schedule: as a top-level field but code uses trigger: { schedule: } (.claude/skills/swamp-workflow/SKILL.md)
    The skill documentation shows:

    schedule: "0 3,12 * * *"

    But the actual schema uses trigger: { schedule: "0 3,12 * * *" }. Users following the skill docs will create invalid workflows. The SKILL.md example should match the implemented schema.

Suggestions

  1. WorkflowWatcher and ScheduledExecutionService share nearly identical queue/processing patterns with WebhookService — the processQueue / processing / processingPromise pattern is duplicated. Consider extracting a small SerializedRunner helper, though this is fine to defer.

  2. Content-Length check is easily bypassed (src/serve/webhook.ts): The Content-Length header check at line ~2092 can be spoofed (smaller value, larger body). The second check after req.arrayBuffer() at line ~2111 catches this, but by then the body is already in memory. Consider using a streaming read with a byte counter to truly enforce the limit, or just drop the first Content-Length check since the second one is the real guard.

  3. verifySignature copies body into a new ArrayBuffer unnecessarily (src/serve/webhook.ts:~1965-1966):

    const bodyBuffer = new ArrayBuffer(body.byteLength);
    new Uint8Array(bodyBuffer).set(body);

    crypto.subtle.sign accepts BufferSource, so you can pass body directly — no copy needed.

  4. The WorkflowScheduler re-exports from libswamp/mod.ts cross the domain boundary (src/libswamp/mod.ts):

    export { ... } from "../domain/workflows/workflow_scheduler.ts";

    This is a domain type being re-exported from the application layer barrel, which is fine for the types/interfaces, but the class itself could be considered an internal concern of ScheduledExecutionService. Not blocking since the pattern is already used elsewhere in the codebase.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

  1. Documentation shows wrong schema — both .claude/skills/swamp-workflow/SKILL.md and design/workflow.md show the old top-level schedule: syntax in their YAML examples:

    name: anime-downloader
    schedule: "0 3,12 * * *"   # ← this is WRONG

    But the actual implementation in src/domain/workflows/workflow.ts (and confirmed by the new tests) uses the nested trigger: schema:

    name: anime-downloader
    trigger:
      schedule: "0 3,12 * * *"  # ← correct

    Any user following either doc will write YAML that fails schema validation. Both examples need to be updated to use trigger: { schedule: ... }.

  2. parseWebhookFlag throws plain Error, not UserErrorsrc/serve/webhook.ts throws new Error(...) for invalid --webhook flag values. When this propagates from webhookFlags.map(parseWebhookFlag) in serve.ts, the user sees a full stack trace instead of a clean error message. All other user-facing validation errors in the CLI use UserError from src/domain/errors.ts to suppress the stack trace. parseWebhookFlag should throw UserError for the same clean UX.

Suggestions

  1. The /health JSON response includes scheduling.schedules[].workflowId but omits the workflow name. Scripts polling the health endpoint would need a secondary swamp workflow get call to make the IDs human-readable. Adding workflowName alongside workflowId would make the health output self-contained.

  2. The --webhook flag description says <route>:<workflow>:<secret> but the example uses $WEBHOOK_SECRET as the secret (a shell variable reference, not a literal value). A short clarifying note like "secret may be a literal value or shell-expanded variable" would prevent confusion.

Verdict

NEEDS CHANGES — documentation examples show the deprecated top-level schedule: field instead of the new trigger: { schedule: } schema, which will cause immediate user-facing failures.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

No critical or high severity findings.

Medium

  1. processingPromise overwrite silently drops the real processing promisesrc/serve/webhook.ts:327 and src/libswamp/workflows/scheduled_execution.ts:254

    Both WebhookService and ScheduledExecutionService use the same pattern:

    this.processingPromise = this.processQueue();
    

    where processQueue() returns immediately (resolved promise) if this.processing is already true. When a second item is queued while the first is still processing, this.processingPromise gets overwritten with a resolved Promise<void>, and the reference to the original in-flight processing promise is lost.

    stop() then awaits this.processingPromise — which is the resolved one, not the still-running one. This means stop() returns before the actual processing loop finishes.

    Breaking example: During swamp serve shutdown, stop() returns instantly, ac.abort() fires, server.finished resolves, and repoContext.catalogStore?.close() runs while processQueue is still mid-execution in executeWorkflowWithLocks, potentially accessing the closed catalog store.

    Suggested fix: Store the original promise and never overwrite it:

    if (!this.processing) {
      this.processingPromise = this.processQueue();
    }

    Or guard the assignment: only assign this.processingPromise when the returned promise is not already resolved (i.e. when processQueue actually started processing).

  2. Content-Length check is advisory-only; full body is always read into memorysrc/serve/webhook.ts:240-258

    The early Content-Length check (line 240-253) is ineffective against a malicious sender. An attacker can omit the Content-Length header (defaulting to parseInt("0", 10) = 0, passing the check) or lie about it, then send a very large chunked body. req.arrayBuffer() on line 258 reads the entire body into memory before the second size check on line 260 can reject it.

    Breaking example: POST /hooks/github with no Content-Length header and a 500MB chunked body → server allocates 500MB before rejecting at line 260.

    Suggested fix: Use a streaming read with a byte budget instead of req.arrayBuffer(), or use Deno's built-in request body size limits if available.

Low

  1. Missing schedule_unregistered case in serve.ts event handlersrc/cli/commands/serve.ts:102-134

    The event handler switch in log mode handles schedule_registered, schedule_fired, schedule_skipped, schedule_completed, and schedule_failed — but not schedule_unregistered. When a workflow's schedule is removed via live-reload, no log line is emitted in non-JSON mode (JSON mode correctly emits it via the catch-all JSON.stringify(event) on line 100).

  2. No execution timeout on webhook-triggered workflow runssrc/serve/webhook.ts:364

    The AbortController created for each webhook workflow execution has no timeout. A webhook-triggered workflow that hangs will block the serial queue indefinitely, preventing all subsequent queued webhook runs from executing. Scheduled execution has the same issue (src/libswamp/workflows/scheduled_execution.ts:275). Consider wiring AbortSignal.timeout() or a configurable max duration.

  3. Debounce timer callbacks can race with stop()src/libswamp/workflows/watcher.ts:119-128

    If a debounce timer fires (invoking handleChange) at the exact instant stop() is called, stop() clears the timers map but the already-fired handleChange callback is in-flight. It will call this.workflowRepo.findById() on a potentially-closing repository. The catch block on line 121 handles this gracefully (logs a warning), so the impact is cosmetic.

Verdict

PASS — The code is well-structured with good separation of concerns, proper HMAC verification, overlap prevention, and serialized execution. The processingPromise overwrite (Medium #1) is the most notable issue but its practical impact is limited to a narrow shutdown race since the abort signals and queue clearing provide defense-in-depth. No blocking issues found.

- Fix processingPromise overwrite: guard assignment with `if (!this.processing)`
  so stop() always awaits the real in-flight promise
- Use streaming body read with byte budget instead of req.arrayBuffer() to
  enforce 10MB limit without full allocation on oversized requests
- Throw UserError instead of Error in parseWebhookFlag for clean CLI output
- Update design/workflow.md and swamp-workflow SKILL.md to document
  trigger: { schedule: } instead of the old top-level schedule: field
- Add missing schedule_unregistered case to serve.ts log-mode event handler

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

  1. Import boundary violation in src/cli/commands/serve.ts:25-27: The CLI command imports ScheduledExecutionService directly from ../../libswamp/workflows/scheduled_execution.ts instead of from ../../libswamp/mod.ts. Per CLAUDE.md: "CLI commands and presentation renderers must import libswamp types and functions from src/libswamp/mod.ts — never from internal module paths." The type is already re-exported from mod.ts, so the fix is just changing the import path.

  2. Inverted layer dependency in src/libswamp/workflows/scheduled_execution.ts:39: A libswamp module imports executeWorkflowWithLocks from ../../serve/deps.ts. This creates a dependency from a lower-level library (libswamp) upward into the serve layer, which inverts the intended dependency direction (serve → libswamp, not libswamp → serve). The ScheduledExecutionService either needs to live in the src/serve/ layer (since it depends on serve infrastructure), or executeWorkflowWithLocks needs to be passed in as a dependency (e.g., via the ScheduledExecutionDeps interface) so libswamp remains independent of the serve layer.

Suggestions

  1. Consider extracting duplicate queue+serialize pattern: Both ScheduledExecutionService and WebhookService implement nearly identical run-queue + processing patterns (runQueue, processing, processingPromise, processQueue(), executeWorkflow()). A shared SerializedExecutionQueue could reduce this duplication — but not blocking since both implementations are correct as-is.

  2. Watcher test coverage is minimal (src/libswamp/workflows/watcher_test.ts): The only test covers the workflowsDir utility. The WorkflowWatcher class itself (debouncing, filename parsing, handling create/modify/remove events, scanExisting) has no unit tests. Consider at minimum testing scanExisting with a mock repo and the filename-matching logic in handleChange.

  3. DDD observation: Good separation of concerns — WorkflowScheduler as a pure domain service with no I/O, ScheduledExecutionService as an application service orchestrating the domain. The Workflow aggregate properly encapsulates the trigger/schedule concept through a getter. The event-driven communication pattern via typed discriminated unions is clean.

github-actions[bot]
github-actions Bot previously approved these changes Apr 2, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. --webhook help text doesn't mention repeatable flag (src/cli/commands/serve.ts): The description says "Register a webhook endpoint: ::" but doesn't hint that the flag is repeatable ({ collect: true }). Consider: "Register a webhook endpoint (repeatable): ::".

  2. --no-schedule description doesn't signal it's disabling a default (src/cli/commands/serve.ts): The flag says "Disable scheduled workflow execution" but swamp serve --help users won't know scheduling is on by default. Consider: "Disable scheduled workflow execution (enabled by default)".

  3. webhook_queued log-mode message drops the route (src/cli/commands/serve.ts): webhook_received logs route+workflow, but webhook_queued only logs the workflow name. Minor inconsistency — including route makes it easier to correlate logs when multiple webhooks are configured.

  4. Secret appears in ps aux: The --webhook flag embeds the secret inline. The example already uses $WEBHOOK_SECRET which is the right nudge, but a parenthetical in the description like "use env var for secret" would make this explicit.

Verdict

PASS — new --webhook and --no-schedule flags are well-documented, error messages are clear and actionable, JSON and log-mode events are consistent and complete, and the health endpoint additions are additive/backwards-compatible.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/serve/webhook.ts:281-295 — Body read before signature header check enables unauthenticated resource consumption.
    The body is read (up to 10MB) before checking whether the X-Hub-Signature-256 header is present. An attacker who discovers a webhook route can force the server to allocate up to 10MB per request with no authentication at all — just send large POST bodies without the signature header.

    // Current order:
    const signatureHeader = req.headers.get("x-hub-signature-256") ?? "";
    const body = await readBodyWithLimit(req, MAX_WEBHOOK_BODY_BYTES); // ← 10MB read
    // ...
    if (!signatureHeader) { return 401; }  // ← checked AFTER the 10MB allocation
    

    Breaking input: curl -X POST http://host:9090/hooks/github -d @/dev/urandom --max-time 5 (repeated in parallel).
    Suggested fix: Move the signatureHeader empty-check before readBodyWithLimit:

    const signatureHeader = req.headers.get("x-hub-signature-256") ?? "";
    if (!signatureHeader) {
      this.emit({ kind: "webhook_rejected", route: endpoint.route, reason: "Missing header" });
      return Response.json({ error: "Missing X-Hub-Signature-256 header" }, { status: 401 });
    }
    const body = await readBodyWithLimit(req, MAX_WEBHOOK_BODY_BYTES);
  2. src/serve/webhook.ts:157 — reader.cancel() not awaited before releaseLock() in finally block.
    ReadableStreamDefaultReader.cancel() returns a Promise. Not awaiting it means the stream may not have finished cancellation when releaseLock() runs in the finally block. While most runtimes handle this gracefully, the Streams spec says calling releaseLock() on a reader with a pending read request throws a TypeError.
    Breaking scenario: A very slow sender triggers the size limit; the pending reader.read() inside cancel() hasn't resolved when releaseLock() fires.
    Suggested fix: await reader.cancel();

  3. CLI --webhook flag passes secret as a command-line argument (src/cli/commands/serve.ts:49-52).
    The secret is visible in ps aux, /proc/*/cmdline, and shell history. The example suggests $WEBHOOK_SECRET (env var expansion by the shell), which mitigates somewhat, but is easy to misuse with a literal string. Not blocking since the example demonstrates env var usage, but worth documenting that literal secrets in the flag value are a security concern.

Low

  1. src/serve/webhook.ts:123-124 — Early return on hex length mismatch before constant-time loop. The comment says "constant-time comparison" but the length check at line 123 short-circuits. In practice this is harmless since expectedHex is always 64 characters (SHA-256), so a length mismatch only reveals that the attacker provided a malformed hex string — not a secret-dependent timing difference.

  2. src/cli/commands/serve.ts:301-314 — Signal listeners are never removed. Deno.addSignalListener adds listeners that hold closures over webhookService, scheduledExecution, and other resources. These are never cleaned up after server.finished resolves. Harmless in practice since serve runs for the lifetime of the process, but would leak if the command were ever called more than once in the same process.

  3. src/libswamp/workflows/watcher.ts:135 — path.split("/").pop() doesn't handle Windows paths. If the repo ever runs on Windows, backslash separators would cause filename to contain the full path. Not an issue for the stated target platform, but fragile.

Verdict

PASS — The code is well-structured with proper serialization, overlap prevention, abort propagation, and HMAC verification. The body-before-auth-check issue (Medium #1) is worth fixing but does not cause data loss or authentication bypass — it's a bounded resource consumption vector that requires knowledge of the route path. No critical or high severity findings.

- Fix CLI import: use libswamp/mod.ts barrel for ScheduledExecutionService
- Fix layer inversion: inject executeWorkflow as a callback via
  ScheduledExecutionDeps instead of importing from serve/deps.ts
- Move signature header check before body read to prevent unauthenticated
  resource consumption on webhook endpoints

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. Log mode silent on --no-schedule: When the user passes --no-schedule, log mode emits nothing to confirm scheduling is disabled. JSON mode includes schedulingEnabled: false in the listening event (good), but log mode users get no acknowledgment. A single logger.info("Scheduled execution disabled (--no-schedule)") in the disabled branch would make the flag's effect visible.

  2. JSON field name inconsistency in webhook events: The webhook registration output (emitted directly in serve.ts) uses the key workflow:

    {"kind":"webhook_registered","route":"...","workflow":"my-workflow"}

    But all WebhookEvent variants use workflowName. Scripts parsing JSON output would need to handle two different keys depending on event kind. Consider using workflowName consistently, or at minimum document the difference.

  3. webhook_queued log message drops route context: webhook_received logs "Webhook received on {route} for workflow {workflow}" but webhook_queued logs only "Webhook queued workflow {workflow}" — the route disappears. Minor, but the route is useful context when multiple webhooks are configured.

Verdict

PASS — new flags are well-documented with clear help text and an example, error messages are actionable, and both log/JSON output modes are covered for all new event types. The suggestions above are cosmetic and do not block merge.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

  1. Fire-and-forget promise in readBodyWithLimit (src/serve/webhook.ts:157): reader.cancel() returns a Promise that is neither awaited nor .catch()-handled. This violates the project's "No fire-and-forget promises" rule in CLAUDE.md. Should be await reader.cancel();.

Suggestions

  1. Thin watcher tests (src/libswamp/workflows/watcher_test.ts): Only the workflowsDir helper is tested. Consider adding a test for scanExisting behavior using a mock WorkflowRepository — similar to how scheduled_execution_test.ts tests ScheduledExecutionService with mocks. Not blocking since the watcher is indirectly exercised through the scheduled execution service tests, but direct coverage would catch regressions in debounce/change-handling logic.

  2. Unnecessary buffer copy in verifySignature (src/serve/webhook.ts:115-116): The body Uint8Array from readBodyWithLimit already owns its backing ArrayBuffer, so the copy to a new ArrayBuffer is redundant. You can pass body.buffer directly (with byteOffset/byteLength if needed) or just pass bodycrypto.subtle.sign accepts BufferSource which includes Uint8Array.

Notes

  • DDD: Clean layering — WorkflowScheduler as a pure domain service, ScheduledExecutionService as an application service in libswamp, WebhookService in the serve layer. The executeWorkflowWithLocks extraction as a shared code path is a good call.
  • Import boundary: All CLI imports correctly go through src/libswamp/mod.ts. ✅
  • Schema: The optional trigger field with backward compatibility (existing workflows without trigger continue to work) is well-handled with tests.
  • Security: HMAC verification with constant-time comparison, body size limits, queue depth limits, and signature-before-body-read ordering are all solid.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. Unawaited reader.cancel() promise — src/serve/webhook.ts:157
    readBodyWithLimit calls reader.cancel() without awaiting it, then returns null, triggering the finally block's reader.releaseLock(). The project convention (CLAUDE.md: "No fire-and-forget promises") is violated. While the Streams spec tolerates releaseLock() after an unawaited cancel(), the dangling promise could mask stream errors on slow/large uploads.
    Suggested fix: await reader.cancel(); before return null;.

  2. No cross-service serialization for the same workflow — src/serve/webhook.ts + src/libswamp/workflows/scheduled_execution.ts
    ScheduledExecutionService and WebhookService each serialize their own queues, but both can concurrently trigger the same workflow through executeWorkflowWithLocks. If a scheduled trigger and a webhook fire at the same time for the same workflow, two lock-acquisition + execution attempts will race. The lock layer may handle this, but the overlap-prevention logic in ScheduledExecutionService.handleFire (checking this.running) does not protect against a webhook-triggered run of the same workflow — or vice versa.
    Impact: Possible lock contention or duplicate concurrent runs of the same workflow.
    Suggested fix: Consider a shared "workflow is executing" guard or document this as an accepted limitation.

Low

  1. Watcher remove events use filename as workflow name — src/libswamp/workflows/watcher.ts:131
    When a workflow file is deleted, handleChange passes the raw filename (e.g. workflow-abc123.yaml) as the workflowName to onChange. The ScheduledExecutionService falls back to this via this.workflowNames.get(workflowId) ?? workflowName, so schedule_unregistered events and log messages may show a filename instead of the human-readable workflow name. Cosmetic only — unregistration works correctly by ID.

  2. Watcher does not detect workflows directory creation after startup — src/libswamp/workflows/watcher.ts:50-56
    If the workflows/ directory does not exist when start() is called, the watcher logs and returns without setting up a watch. If the directory is created later, no schedules will be detected until the service is restarted. This matches the documented "no catch-up" behavior but could surprise users who create their first workflow while serve is already running.

  3. Unnecessary ArrayBuffer copy in verifySignaturesrc/serve/webhook.ts:115-116
    The body is copied from Uint8Array into a fresh ArrayBuffer before passing to crypto.subtle.sign, but the API accepts BufferSource directly (which Uint8Array implements). The extra allocation is harmless but wasteful for large payloads.

Verdict

PASS — The code is well-structured with clear separation of concerns. The extracted executeWorkflowWithLocks unifies the execution path cleanly. Webhook signature verification is correctly implemented with constant-time comparison. Queue processing is correctly serialized within each service using JS's single-threaded event loop guarantees. Shutdown logic properly drains queues and aborts in-flight runs. The medium findings are worth addressing but don't block the merge.

- Await reader.cancel() in readBodyWithLimit to satisfy no-fire-and-forget rule
- Use body.buffer directly in verifySignature instead of copying to new ArrayBuffer

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. Log-mode startup doesn't confirm --no-schedule (src/cli/commands/serve.ts): In JSON mode, the server-ready event includes schedulingEnabled: false when --no-schedule is passed, so a monitoring script knows scheduling is off. In log mode the startup line is just WebSocket API server listening on {host}:{port} — no mention of scheduling being disabled. Consider adding a log-mode line like logger.info("Scheduled execution disabled (--no-schedule)") when !enableSchedule, so operators have visible confirmation.

  2. webhook_queued log message drops the route (src/cli/commands/serve.ts): The other webhook log messages include the route (webhook_received, webhook_rejected), but webhook_queued only logs the workflow name: "Webhook queued workflow {workflow}". Minor inconsistency — the route is useful context in multi-webhook setups.

  3. Single-quoted example may surprise shell users (src/cli/commands/serve.ts): The example swamp serve --webhook '/hooks/github:my-workflow:$WEBHOOK_SECRET' uses single quotes, which prevents shell expansion of $WEBHOOK_SECRET — the literal string $WEBHOOK_SECRET is passed as the secret. Double-quoting is the idiomatic pattern here: "'/hooks/github:my-workflow:$WEBHOOK_SECRET'" or just --webhook "/hooks/github:my-workflow:$WEBHOOK_SECRET".

Verdict

PASS — no blocking issues. New flags are clearly documented, error messages are actionable, JSON mode covers all new events, and the health endpoint is updated consistently.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-structured PR with clean DDD layering. The architecture is solid:

  • WorkflowScheduler (domain service) — pure scheduling logic, no I/O, testable
  • ScheduledExecutionService (application service in libswamp) — orchestrates domain objects, emits typed events
  • WorkflowWatcher (infrastructure) — filesystem observation with proper debouncing
  • WebhookService (serve layer) — HTTP concerns correctly isolated from domain
  • executeWorkflowWithLocks — good extraction of shared code path for WebSocket, scheduled, and webhook triggers

Blocking Issues

None.

Suggestions

  1. watcher_test.ts is thin — it only tests the workflowsDir path helper. The WorkflowWatcher class has meaningful logic (debouncing, file change → schedule change propagation, scanExisting) that could benefit from direct unit tests with a mock WorkflowRepository. Not blocking since the ScheduledExecutionService tests cover scanExisting indirectly and the watcher is hard to unit test without filesystem setup.

  2. Queue depth unbounded for scheduled executionWebhookService has a MAX_QUEUE_DEPTH = 100 with backpressure (returns 503 when full), but ScheduledExecutionService.runQueue has no cap. In practice this is unlikely to matter since overlap prevention limits growth, but a defensive cap would be consistent.

Security looks good: HMAC-SHA256 with constant-time comparison, body size limits before signature verification, signature checked before reading body, no secret leakage in health endpoint or logs.

All new files have license headers, named exports, no any types, tests co-located with source, and libswamp import boundary is respected (serve.ts imports ScheduledExecutionService from libswamp/mod.ts).

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. src/serve/webhook.ts:120body.buffer is incorrect for Uint8Array views in a public security function.

    verifySignature is exported and accepts body: Uint8Array, but computes the HMAC over body.buffer (the underlying ArrayBuffer), not the view itself. If a caller passes a subarray (e.g., largerBuffer.subarray(10, 50)), body.buffer is the entire underlying ArrayBuffer, causing the HMAC to be computed over wrong data — the signature check silently produces an incorrect result.

    The comment on line 115–116 acknowledges the internal call site is safe (freshly-allocated buffer), but the function is a public export used through src/serve/mod.ts. A future caller using a view would get silent verification failure or, worse, a false positive if an attacker controls the surrounding buffer.

    Breaking example:

    const full = new Uint8Array([0, 0, ...payload, 0, 0]);
    const view = full.subarray(2, 2 + payload.length);
    await verifySignature(view, sig, secret); // HMAC computed over `full`, not `view`

    Fix: crypto.subtle.sign accepts BufferSource (which includes Uint8Array). Just pass body directly:

    const signature = await crypto.subtle.sign("HMAC", key, body);
  2. src/serve/connection.ts:217 — WebSocket disconnect no longer breaks the workflow event iteration loop.

    Previously, handleWorkflowRun used for await (const event of workflowRun(...)) with if (socket.readyState !== WebSocket.OPEN) break;, which called the async generator's return() and could propagate cancellation into the workflow pipeline. The refactored code now just skips sending in the onEvent callback, but executeWorkflowWithLocks continues iterating the full event stream to completion.

    For long-running workflows, this means the server keeps processing (holding locks, consuming CPU) for a disconnected client that will never see the results. The controller.abort() is only triggered by an explicit cancel message, not by socket closure.

    Breaking example: Client starts a 30-minute workflow, disconnects immediately. Old behavior: generator return propagates, iteration stops, locks release. New behavior: workflow runs to completion, burning resources for 30 minutes with no consumer.

    Suggested fix: Check socket state in onEvent and call controller.abort() when the socket is no longer open, or set up a socket close listener that aborts the controller.

Low

  1. src/serve/webhook.ts:127-128 — Length check before constant-time comparison leaks length information. The early return false on receivedHex.length !== expectedHex.length exits before the constant-time XOR loop. Since both values are hex-encoded SHA-256 (always 64 chars), this is unexploitable in practice — a well-formed sha256= header always produces a 64-char hex string, and the expected value is always 64 chars. But it contradicts the stated intent of constant-time comparison. If the codebase later adds support for other hash algorithms with different lengths, this would become a timing oracle.

  2. src/libswamp/workflows/scheduled_execution.ts — Serialized queue means one slow workflow blocks all scheduled runs. The processQueue pattern executes workflows one at a time. If a workflow takes 2 hours, all other scheduled workflows queued behind it miss their windows. This is documented as intentional ("serializing preserves systemd-timer behavior"), but worth noting that it means a misbehaving workflow can starve the entire schedule.

Verdict

PASS — The code is well-structured with good test coverage, proper resource cleanup, and sensible security defaults (HMAC verification, body size limits, queue backpressure). The body.buffer issue (Medium #1) is latent — not exploitable via current call sites — and the WebSocket behavior change (Medium #2) is a resource efficiency concern, not a correctness bug. Neither blocks merge, but both are worth addressing.

@stack72 stack72 merged commit 2797cc6 into main Apr 2, 2026
10 checks passed
@stack72 stack72 deleted the feat/scheduled-workflow-execution branch April 2, 2026 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Add HTTP webhook endpoints to swamp serve for triggering workflows

2 participants