Workflows

A Workflow is a durable container task — one Docker image plus an optional cron schedule. Fire it on schedule or on demand, watch it run to completion, retry on failure, and keep a permanent record of every run with stdout/stderr captured. Reach for it when a job lives next to your stack but does not belong inside any one long-running service.

Workflow vs cron service vs long-running service

Workflow

Reusable container task. Re-image without redeploying a service. Retries are first-class; run history persists per fire.

Cron service

Built from your repo on every deploy. Runs your source code with your build pipeline. Pick this when the schedule belongs to the app, not to the operator.

Long-running service

An HTTP server or worker process that never exits. Use a service when work is continuous; use a workflow when work is bursty and finite.

Schema

A workflow row stores a name, an image reference, a command, an env map, a schedule, a per-run timeout, and a retry budget, scoped to a project and an environment. Every field maps 1:1 to the body of POST /api/projects/:teamId/:projectId/workflows.

createWorkflowSchema (packages/shared/src/schemas/workflow.ts)

{
  projectId: number;
  environmentId?: number;     // omit → project's Production env
  name: string;               // 1–100 chars, shown in UI + run history
  image: string;              // Docker registry reference, e.g. "curlimages/curl:latest"
  command?: string[] | null;  // up to 64 args; null → use image CMD
  env?: Record<string, string>;
  schedule?: string | null;   // standard cron (5- or 6-field), null = manual-only
  timeoutSec?: number;        // 1–86400 (24h), default 3600
  maxRetries?: number;        // 0–10, default 0
}

The Docker image must be reachable from the worker. Public images, whether on Docker Hub (curlimages/curl:latest) or on another public registry (ghcr.io/...), work out of the box; private registries are configured in the Workflows tab under registry credentials.

Create a workflow

curl

curl -X POST https://hoststack.dev/api/projects/${TEAM_ID}/${PROJECT_ID}/workflows \
  -H "Authorization: Bearer hs_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": '${PROJECT_ID}',
    "name": "prod-monitor",
    "image": "curlimages/curl:latest",
    "command": ["-fsS", "https://api.example.com/health"],
    "schedule": "*/5 * * * *",
    "timeoutSec": 60,
    "maxRetries": 3
  }'
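
The same request can be assembled from TypeScript. This is a minimal sketch, not an official client: buildCreateWorkflowRequest and the HttpRequest shape are hypothetical names, the team ID is assumed to be a string, and "my-team" and 42 are placeholder values. The body type simply mirrors createWorkflowSchema. The helper builds the request without sending it, so the URL and payload can be inspected first.

```typescript
// Mirrors createWorkflowSchema; field names come from the schema above.
interface CreateWorkflowBody {
  projectId: number;
  environmentId?: number;
  name: string;
  image: string;
  command?: string[] | null;
  env?: Record<string, string>;
  schedule?: string | null;
  timeoutSec?: number;
  maxRetries?: number;
}

// Hypothetical shape for an unsent request.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the create call without performing it.
function buildCreateWorkflowRequest(
  teamId: string, // assumption: team IDs are strings
  apiKey: string,
  body: CreateWorkflowBody,
): HttpRequest {
  return {
    url: `https://hoststack.dev/api/projects/${teamId}/${body.projectId}/workflows`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

// Same payload as the curl example; team ID and project ID are placeholders.
const createReq = buildCreateWorkflowRequest("my-team", "hs_live_...", {
  projectId: 42,
  name: "prod-monitor",
  image: "curlimages/curl:latest",
  command: ["-fsS", "https://api.example.com/health"],
  schedule: "*/5 * * * *",
  timeoutSec: 60,
  maxRetries: 3,
});
```

To send it, pass the pieces to any HTTP client, for example fetch(createReq.url, { method: createReq.method, headers: createReq.headers, body: createReq.body }).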

Run lifecycle

Every fire — scheduled, manual, or retry — creates a run row. Status transitions:

queued — accepted, waiting for a worker slot. Scheduler bumps long-pending runs into the failed state if no worker picks them up within the timeout window.
running — container started. stdout/stderr stream to the run's log buffer in real time.
succeeded — container exit code 0 inside the timeout. Done.
failed — container exit code ≠ 0 (after exhausting maxRetries). Fires the workflow.failed event to any subscribed notification channels.
cancelled — operator clicked Cancel on an in-flight run, or the API was called with POST /api/workflows/:teamId/workflows/runs/:runId/cancel.
timed_out — exceeded timeoutSec without exiting. Container is SIGKILL'd; same retry rules as failed.
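
The statuses above can be modelled as a small state machine. The sketch below is one reading of the descriptions, not the platform's implementation: the transition table is inferred (in particular, allowing a queued run to be cancelled is an assumption), and canTransition, isTerminal, and hasRetryBudget are hypothetical helper names.

```typescript
type RunStatus =
  | "queued"
  | "running"
  | "succeeded"
  | "failed"
  | "cancelled"
  | "timed_out";

// Assumed transition table, inferred from the status descriptions.
const NEXT_STATUSES: Record<RunStatus, readonly RunStatus[]> = {
  // queued -> failed covers runs no worker picked up in time;
  // queued -> cancelled is an assumption.
  queued: ["running", "failed", "cancelled"],
  running: ["succeeded", "failed", "cancelled", "timed_out"],
  succeeded: [],
  failed: [],
  cancelled: [],
  timed_out: [],
};

// True when the table allows moving from one status to another.
function canTransition(from: RunStatus, to: RunStatus): boolean {
  return NEXT_STATUSES[from].indexOf(to) !== -1;
}

// A status with no outgoing transitions is terminal.
function isTerminal(status: RunStatus): boolean {
  return NEXT_STATUSES[status].length === 0;
}

// A non-zero exit or timeout is retried while the budget (maxRetries) lasts.
function hasRetryBudget(retriesUsed: number, maxRetries: number): boolean {
  return retriesUsed < maxRetries;
}
```

With the default maxRetries of 0, hasRetryBudget(0, 0) is false, so the first non-zero exit is final.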

Trigger a run

Manual fires are useful in CI ("kick a backfill after migration X applied") and for sanity-checking a scheduled workflow without waiting for the next tick. Per-trigger env merges on top of the workflow's stored env.

POST /api/workflows/:teamId/workflows/:workflowId/trigger

curl -X POST \
  https://hoststack.dev/api/workflows/${TEAM_ID}/workflows/${WORKFLOW_PUBLIC_ID}/trigger \
  -H "Authorization: Bearer hs_live_..." \
  -H "Content-Type: application/json" \
  -d '{"env": {"DRY_RUN": "true"}}'
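
The env merge can be pictured as an object spread in which per-trigger keys win. This is a sketch of that reading of "merges on top of", not the server's actual code; mergeTriggerEnv is a hypothetical name and the stored values are made up for illustration.

```typescript
// Per-trigger entries override stored entries with the same key;
// stored entries without an override are kept. Neither input is mutated.
function mergeTriggerEnv(
  stored: Record<string, string>,
  perTrigger: Record<string, string> = {},
): Record<string, string> {
  return { ...stored, ...perTrigger };
}

// Illustrative values only.
const storedEnv = { TARGET_URL: "https://api.example.com/health", DRY_RUN: "false" };
const effectiveEnv = mergeTriggerEnv(storedEnv, { DRY_RUN: "true" });
// effectiveEnv.DRY_RUN is "true"; effectiveEnv.TARGET_URL is unchanged.
```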

HTTP surface

All workflow routes accept either a session cookie or an hs_live_* / hs_test_* API key bearer token. Trigger + retry mutations require an API key with full scope.

GET /api/projects/:teamId/:projectId/workflows — list the project's workflows.
POST /api/projects/:teamId/:projectId/workflows — create.
GET /api/workflows/:teamId/workflows/:workflowId — get one.
PATCH /api/workflows/:teamId/workflows/:workflowId — partial update (any subset of name, image, command, env, schedule, timeoutSec, maxRetries).
DELETE /api/workflows/:teamId/workflows/:workflowId — remove. In-flight runs continue; new fires stop immediately.
POST /api/workflows/:teamId/workflows/:workflowId/trigger — manual fire (rate-limited: 30/min/team).
GET /api/workflows/:teamId/workflows/:workflowId/runs — list runs for one workflow.
GET /api/workflows/:teamId/workflows/runs/:runId — single run.
GET /api/workflows/:teamId/workflows/runs/:runId/logs — captured stdout/stderr.
POST /api/workflows/:teamId/workflows/runs/:runId/cancel — stop an in-flight run.
POST /api/workflows/:teamId/workflows/runs/:runId/retry — re-fire a terminal run with the same env (rate-limited: 30/min/team).
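
Because list and create live under /api/projects while everything else lives under /api/workflows, a few path builders help avoid mixing the two prefixes. The helpers below are hypothetical and only encode the route table above; the ID type (string or number) is an assumption.

```typescript
type Id = string | number; // assumption: IDs may be numeric or opaque strings

// Encode one path segment so unusual IDs cannot break the URL.
function seg(id: Id): string {
  return encodeURIComponent(String(id));
}

// GET (list) and POST (create) share this path.
function projectWorkflowsPath(teamId: Id, projectId: Id): string {
  return `/api/projects/${seg(teamId)}/${seg(projectId)}/workflows`;
}

// GET / PATCH / DELETE on the base path; "trigger" and "runs" are sub-resources.
function workflowPath(teamId: Id, workflowId: Id, sub?: "trigger" | "runs"): string {
  const base = `/api/workflows/${seg(teamId)}/workflows/${seg(workflowId)}`;
  return sub ? `${base}/${sub}` : base;
}

// GET on the base path; "logs", "cancel", and "retry" are sub-resources.
function runPath(teamId: Id, runId: Id, sub?: "logs" | "cancel" | "retry"): string {
  const base = `/api/workflows/${seg(teamId)}/workflows/runs/${seg(runId)}`;
  return sub ? `${base}/${sub}` : base;
}
```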

Limits

timeoutSec cap is 24 h (86 400 s). Default is 1 h.
maxRetries cap is 10. Default is 0 (no retry).
command is capped at 64 args; each arg is ≤1024 chars.
env values are ≤4096 chars per entry. Long secrets belong in an env group.
Manual trigger + retry are rate-limited to 30 requests / minute / team.
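
These caps are easy to check client-side before sending a create or PATCH request. The function below is a rough sketch, not the server's validator: checkWorkflowLimits is a hypothetical name, and measuring "chars" with JavaScript's string length (UTF-16 code units) is an assumption about how the server counts.

```typescript
interface WorkflowLimitInput {
  command?: string[] | null;
  env?: Record<string, string>;
  timeoutSec?: number;
  maxRetries?: number;
}

// Returns a list of human-readable problems; an empty list means within limits.
function checkWorkflowLimits(input: WorkflowLimitInput): string[] {
  const problems: string[] = [];
  const timeoutSec = input.timeoutSec ?? 3600; // documented default: 1 h
  const maxRetries = input.maxRetries ?? 0; // documented default: no retry

  // Negated range checks also reject NaN.
  if (!(timeoutSec >= 1 && timeoutSec <= 86400)) {
    problems.push("timeoutSec must be between 1 and 86400");
  }
  if (!(maxRetries >= 0 && maxRetries <= 10)) {
    problems.push("maxRetries must be between 0 and 10");
  }

  const command = input.command ?? []; // null or omitted means "use image CMD"
  if (command.length > 64) {
    problems.push("command allows at most 64 args");
  }
  command.forEach((arg, i) => {
    if (arg.length > 1024) problems.push(`command[${i}] exceeds 1024 chars`);
  });

  const env = input.env ?? {};
  Object.keys(env).forEach((key) => {
    const value = env[key] ?? "";
    if (value.length > 4096) problems.push(`env.${key} exceeds 4096 chars`);
  });

  return problems;
}
```

The 30 requests / minute / team rate limit is not covered here; it depends on server-side state that a local check cannot see.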

Failure → notification → re-run

A failed or timed-out workflow run fires the workflow.failed event. Subscribe a Slack, Discord, or email notification channel to it to get pinged with the workflow name, exit code, and a deep link to the run detail page. (It's pre-selected as a critical event when you create a channel.) From the run detail page, "Retry" re-fires the same args + env without redeploying anything.
