runfabric.yml Reference

Canonical config reference for the current release train. Aligned with upstream RUNFABRIC_YML_REFERENCE. RunFabric uses a canonical provider/function model (provider, optional backend/state, and functions). JSON Schema: schemas/runfabric.schema.json.

Minimum Example

service: hello-api
provider:
  name: aws-lambda
  runtime: nodejs20.x
functions:
  - name: api
    entry: src/index.ts
    triggers:
      - type: http
        method: GET
        path: /hello

Top-Level Fields

Multi-cloud (providerOverrides)

When you want one runfabric.yml to target multiple providers (e.g. AWS and GCP), define providerOverrides and pass --provider <key> on deploy, plan, and remove:

service: my-api
provider:
  name: aws-lambda
  runtime: nodejs
  region: us-east-1

providerOverrides:
  aws:
    name: aws-lambda
    runtime: nodejs
    region: us-east-1
    backend: # optional: per-provider state backend (when using --provider aws)
      kind: s3
      s3Bucket: my-aws-bucket
  gcp:
    name: gcp-functions
    runtime: nodejs
    region: us-central1
    source: external # optional: prefer external plugin over built-in
    version: 1.2.3 # optional: pin external plugin version
    backend: # optional: e.g. gcs for GCP
      kind: gcs

# ... functions, triggers, etc.

Then run e.g. runfabric deploy --provider aws --stage prod or runfabric deploy --provider gcp --stage prod. Without --provider, the top-level provider block is used. When a provider override includes backend, that backend is used for state (receipts, locks) when --provider <key> is set. Invoke, logs, metrics, and traces also accept --provider for multi-cloud.

Auto-install missing extensions (plugins)

If your provider.name refers to an external provider plugin that is not installed on disk, lifecycle commands (plan/deploy/invoke/logs/etc.) will fail with “provider … not registered”.

To let RunFabric auto-install the missing provider from the registry, enable:

extensions:
  autoInstallExtensions: true

You can force external provider resolution and pin a plugin version directly in provider config:

provider:
  name: vercel
  runtime: nodejs
  source: external
  version: 1.2.3

Rules:

Behavior:

You can also ensure other plugin kinds are installed (best-effort) when auto-install is enabled:

extensions:
  autoInstallExtensions: true
  runtimePlugin: nodejs # kind=runtime
  simulatorPlugin: local # kind=simulator
  routerPlugin: cloudflare # kind=router (router command backend)
  secretManagerPlugin: vault-secret-manager # kind=secret-manager (secret manager references)
  secretManagerPluginVersion: 1.0.0 # optional pin
  router:
    autoApply:
      enabled: true
      stages: [staging, prod]
      enforceStageRollout: true
    approvalEnvByStage:
      staging: RUNFABRIC_DNS_SYNC_DEV_APPROVED
      prod: RUNFABRIC_DNS_SYNC_STAGING_APPROVED
    requireReason: true
    reasonEnv: RUNFABRIC_DNS_SYNC_REASON
    mutationPolicy:
      enabled: true
      approvalEnv: RUNFABRIC_DNS_SYNC_RISK_APPROVED
      riskyResources: [lb_monitor, lb_pool, load_balancer]
      maxMutationsWithoutApproval: 3
    credentialPolicy:
      enabled: true
      requireAttestation: true
      attestationEnv: RUNFABRIC_ROUTER_TOKEN_ATTESTED
      issuedAtEnv: RUNFABRIC_ROUTER_TOKEN_ISSUED_AT
      expiresAtEnv: RUNFABRIC_ROUTER_TOKEN_EXPIRES_AT
      maxTTLSeconds: 3600
      minRemainingSeconds: 120
    credentials:
      zoneIDEnv: RUNFABRIC_ROUTER_ZONE_ID
      accountIDEnv: RUNFABRIC_ROUTER_ACCOUNT_ID
      apiTokenEnv: RUNFABRIC_ROUTER_API_TOKEN
      apiTokenFileEnv: RUNFABRIC_ROUTER_API_TOKEN_FILE
      apiTokenSecretRef: router_api_token

Router commands ship with a built-in cloudflare backend. External kind=router plugins are also supported through extension discovery/dispatch. When omitted, extensions.routerPlugin defaults to cloudflare. Additional built-ins: route53, ns1, azure-traffic-manager (provider API reconcilers).
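For example, a sketch of selecting one of the other built-in router backends (assuming only the plugin key needs to change):

```yaml
# Use the route53 built-in instead of the default cloudflare backend.
extensions:
  autoInstallExtensions: true
  routerPlugin: route53
```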

Router policy keys under extensions.router:

Real deploy and unsafe defaults

Real deploy is opt-in: set RUNFABRIC_REAL_DEPLOY=1 or provider-specific RUNFABRIC_<PROVIDER>_REAL_DEPLOY=1 (e.g. RUNFABRIC_CLOUDFLARE_REAL_DEPLOY=1). When real deploy is enabled:

First-class layers

Define layers once and reference them by name from functions. Use ref for the provider-specific layer identifier:

layers:
  node-deps:
    ref: "arn:aws:lambda:us-east-1:123456789012:layer:node-deps:1"
    name: node-deps
    version: "1"
  custom:
    ref: "${env:LAMBDA_LAYER_ARN}"
    version: "${env:LAYER_VERSION}" # optional: set from CI (e.g. package-lock hash)

functions:
  - name: api
    entry: src/handler.default
    layers: ["node-deps", "custom"]

Each function entry’s layers list can use logical names (keys in top-level layers) or literal provider-specific layer refs. For AWS, ARNs continue to work.

Versioning on dependency change: Use version with an env var (e.g. version: "${env:LAYER_VERSION}") and set that in CI from a hash of package-lock.json or requirements.txt so layer refs or versions track dependency changes. Resolve runs after env is set, so the same config works across environments.
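One way to set that variable in CI — a sketch, assuming a Node project and a POSIX shell with sha256sum available:

```shell
# Stand-in lockfile so the snippet is self-contained; in CI the real
# package-lock.json already exists in the checkout.
printf '{"name":"demo"}' > package-lock.json

# First 12 hex chars of the lockfile hash become the layer version,
# so the layer ref only changes when dependencies change.
LAYER_VERSION="$(sha256sum package-lock.json | cut -c1-12)"
export LAYER_VERSION
echo "LAYER_VERSION=$LAYER_VERSION"
```

Any stable hash of the dependency manifest works the same way; for Python, substitute requirements.txt.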

Other providers: Layers are applied by AWS Lambda today. Other providers (GCP, Azure, etc.) preserve the layers config but do not apply it; use provider-specific mechanisms (e.g. build env, separate artifacts) where needed.

Dynamic Env Bindings

String values can resolve environment variables using ${env:VAR_NAME}, or ${env:VAR_NAME,default} to fall back to a default when the variable is unset.

Example:

service: ${env:RUNFABRIC_SERVICE_NAME,my-service}
provider:
  name: aws-lambda
  runtime: nodejs20.x
  region: ${env:AWS_REGION,us-east-1}
backend:
  kind: s3
  s3Bucket: ${env:RUNFABRIC_STATE_S3_BUCKET}
functions:
  - name: api
    entry: src/index.ts
    triggers:
      - type: http
        method: GET
        path: /hello

If ${env:VAR_NAME} is used without a default and the variable is missing, config parsing fails with an explicit error.

Secret References

String values can also resolve ${secret:KEY} placeholders. Resolution order:

  1. secrets.KEY from top-level config.
  2. Environment variable KEY.

Top-level secrets entries support secret://OTHER_KEY indirection and secret manager references:

extensions:
  secretManagerPlugin: vault-secret-manager

secrets:
  db_url: secret://DATABASE_URL
  jwt_private_key: vault://apps/team/prod/jwt-private-key

functions:
  - name: api
    entry: src/handler.default
    env:
      DATABASE_URL: "${secret:db_url}"
      JWT_PRIVATE_KEY: "${secret:jwt_private_key}"

Secret manager references (aws-sm://..., gcp-sm://..., azure-kv://..., vault://...) are resolved via extensions.secretManagerPlugin.

Production stages (prod, production, live) reject static literal secrets.* values. Use ${env:VAR}, secret://KEY, or secret manager references instead.

If a ${secret:KEY} reference cannot be resolved, config resolution fails with an explicit error.
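Because of the fallback order, a function can reference a secret that exists only in the process environment — a sketch, assuming API_TOKEN is exported at deploy time:

```yaml
# No secrets.API_TOKEN entry: ${secret:API_TOKEN} falls through to the
# API_TOKEN environment variable (resolution step 2).
functions:
  - name: api
    entry: src/handler.default
    env:
      API_TOKEN: "${secret:API_TOKEN}"
```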

MCP Integrations and Policies

MCP configuration is provider-neutral and configured under integrations.mcp plus policies.mcp.

Register MCP servers

integrations:
  mcp:
    servers:
      crm:
        url: https://mcp.internal/crm
      kb:
        url: https://mcp.internal/kb

Enforce MCP allow/deny policy

policies:
  mcp:
    defaultDeny: true
    allow:
      servers: ["crm", "kb"]
      tools: ["crm.lookup*", "kb.search*"]
      resources: ["kb.kb://*"]
      prompts: ["crm.reply*"]
    deny:
      tools: ["crm.delete*"]

Policy semantics:

Provider-specific MCP policy rules

policies:
  mcp:
    providers:
      aws-lambda:
        requiredRegion: us-east-1
        denyCrossRegion: true
        denyRegions: ["eu-*"]
        requiredAuth: iam
        models:
          default: anthropic.claude-3-sonnet-20240229-v1:0
          ai-eval: anthropic.claude-3-haiku-20240307-v1:0

Supported keys under policies.mcp.providers.<provider>:

Environment-based overrides are also supported:

Deploy Policy

Single-function deploy: use runfabric deploy --function <name>, runfabric deploy fn <name>, runfabric deploy function <name>, or runfabric deploy-function <name>.

deploy:
  rollbackOnFailure: true # optional
  strategy: all-at-once # optional: all-at-once (default), canary, blue-green
  canaryPercent: 10 # 0-100 when strategy: canary (provider-specific traffic shift)
  canaryIntervalMinutes: 5 # minutes before full shift when strategy: canary (optional)
  healthCheck: # optional post-deploy HTTP GET
    enabled: true
    url: "" # empty = use deployed URL from receipt (ServiceURL, url, ApiUrl)
  scaling: # optional provider-level defaults (overridden per function)
    reservedConcurrency: 10
    provisionedConcurrency: 0

Per-function scaling (and layers) in functions:

functions:
  - name: api
    entry: src/handler.default
    layers: ["node-deps"] # refs to top-level layers.* or literal provider-specific layer refs

Stage override:

stages:
  prod:
    deploy:
      rollbackOnFailure: true

Behavior precedence for rollback-on-failure:

  1. CLI flag (deploy --rollback-on-failure or --no-rollback-on-failure)
  2. runfabric.yml deploy policy (deploy.rollbackOnFailure)
  3. Env toggle (RUNFABRIC_ROLLBACK_ON_FAILURE)

Trigger Types

HTTP

- type: http
  method: GET
  path: /hello

Cron

- type: cron
  schedule: "*/5 * * * *"
  timezone: UTC # optional

Queue

- type: queue
  queue: arn:aws:sqs:us-east-1:123456789012:jobs
  batchSize: 10 # optional
  maximumBatchingWindowSeconds: 5 # optional
  maximumConcurrency: 2 # optional
  enabled: true # optional
  functionResponseType: ReportBatchItemFailures # optional

Storage

- type: storage
  bucket: uploads
  events:
    - s3:ObjectCreated:*
  prefix: incoming/ # optional
  suffix: .jpg # optional
  existingBucket: true # optional

EventBridge / PubSub / Kafka / RabbitMQ

- type: eventbridge
  pattern:
    source:
      - app.source
  bus: default # optional

- type: pubsub
  topic: jobs
  subscription: jobs-sub # optional

- type: kafka
  brokers:
    - kafka:9092
  topic: events
  groupId: runfabric

- type: rabbitmq
  queue: jobs
  exchange: app-exchange # optional
  routingKey: app.jobs # optional
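Since triggers is a list, a single function can combine several trigger types — a sketch with hypothetical names and ARN:

```yaml
functions:
  - name: worker
    entry: src/worker.ts
    triggers:
      - type: cron
        schedule: "0 * * * *" # hourly sweep
      - type: queue
        queue: arn:aws:sqs:us-east-1:123456789012:jobs
        batchSize: 10
```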

Function Overrides

functions:
  - name: api
    entry: src/api.ts
    runtime: nodejs # optional override
    triggers:
      - type: http
        method: POST
        path: /api
    env:
      FEATURE_FLAG: "1"

AWS Extension Example

extensions:
  aws-lambda:
    region: us-east-1
    stage: dev
    roleArn: arn:aws:iam::123456789012:role/runfabric-lambda-role # required for internal AWS real deploy
    functionName: my-service-dev # optional override
    runtime: nodejs20.x # optional runtime override for internal AWS real deploy
    iam:
      role:
        statements:
          - effect: Allow
            actions:
              - s3:GetObject
            resources:
              - arn:aws:s3:::uploads/*

Kubernetes Extension Example

extensions:
  kubernetes:
    namespace: runfabric
    context: dev-cluster
    deploymentName: hello-api
    serviceName: hello-api
    ingressHost: api.dev.example.com

State Backends

backend:
  kind: local # local|postgres|sqlite|s3|dynamodb|gcs|azblob
  s3Bucket: my-state-bucket # when kind=s3
  s3Prefix: runfabric/state # when kind=s3
  lockTable: runfabric-locks # when kind=s3 or kind=dynamodb
  gcsBucket: my-state-bucket # when kind=gcs
  gcsPrefix: runfabric/state # when kind=gcs
  azblobContainer: runfabric-state # when kind=azblob
  azblobPrefix: runfabric/state # when kind=azblob
  postgresConnectionStringEnv: RUNFABRIC_STATE_POSTGRES_URL # when kind=postgres
  postgresTable: runfabric_receipts # when kind=postgres
  sqlitePath: .runfabric/state.db # when kind=sqlite
  receiptTable: runfabric-receipts # when kind=dynamodb

Backend-specific options:

DB-backed deploy state (receipts): Set backend.kind to postgres, sqlite, or dynamodb (and the corresponding backend.* options) to store and fetch deploy receipts from a database. See STATE_BACKENDS.md.

Detailed backend behavior: STATE_BACKENDS.md.
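For instance, a sketch of a Postgres-backed state configuration using only the keys listed above:

```yaml
backend:
  kind: postgres
  postgresConnectionStringEnv: RUNFABRIC_STATE_POSTGRES_URL # env var holding the DSN
  postgresTable: runfabric_receipts
```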

Logs

Optional local log file source (unified with provider logs). When logs.path is set (or defaults to .runfabric/logs), runfabric invoke logs appends lines from files under that directory.

Example:

logs:
  path: .runfabric/logs # default; directory relative to project root

Provider logs (e.g. CloudWatch for AWS) are fetched first; local file lines are appended to the same result.

Build order

Optional ordering of build steps or hook modules. When you have multiple hooks (see PLUGINS.md), build.order defines the execution order. Values can use ${env:VAR}.

build:
  order: ["deps", "compile", "bundle"]

hooks:
  - ./hooks/deps.mjs
  - ./hooks/compile.mjs
  - ./hooks/bundle.mjs

Alerts

Optional alerting configuration. URLs support ${env:VAR}. Delivery is integration-specific; the config is available for tooling or future runtime hooks.

alerts:
  webhook: "${env:ALERT_WEBHOOK_URL}"
  slack: "${env:SLACK_WEBHOOK_URL}"
  onError: true
  onTimeout: true

App and org

Optional grouping for dashboards or multi-service UIs:

app: my-app
org: my-org
service: my-api
# ...

Add-ons (RunFabric Addons, Phase 15)

Add-ons are optional integrations (e.g. Sentry, Datadog) declared under addons. (Note that the provider and runtime fields elsewhere in the config resolve to RunFabric Plugin IDs such as aws-lambda and nodejs; use runfabric extensions extension list to see built-in plugins.) Each entry can specify:

Example:

secrets:
  sentry_dsn: "${env:SENTRY_DSN}"

addons:
  sentry:
    version: "1"
    options:
      tracesSampleRate: 1.0
    secrets:
      SENTRY_DSN: sentry_dsn # uses secrets.sentry_dsn → ${env:SENTRY_DSN}
  datadog:
    secrets:
      DD_API_KEY: "${env:DD_API_KEY}"

Use runfabric extensions addons list to see the built-in catalog; if addonCatalogUrl is set, the CLI fetches and merges entries from that URL. Validation ensures addon secret keys (env var names) are non-empty.

Per-function addons: In each function entry under functions, set addons to a list of addon keys (e.g. ["sentry"]). Only those addons’ secrets are injected into that function. If addons is omitted or empty, all top-level addons apply.
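A sketch of the per-function form (function names hypothetical): only api receives the Sentry secret, while worker, with addons omitted, receives all top-level addons' secrets:

```yaml
functions:
  - name: api
    entry: src/api.ts
    addons: ["sentry"] # only SENTRY_DSN is injected here
  - name: worker
    entry: src/worker.ts # addons omitted: all top-level addons apply
```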

Runtime fabric

When you want active-active deploy (same service in multiple regions or providers) with health checks and optional failover/latency routing, add a fabric block. It requires providerOverrides; each entry in fabric.targets is a provider key to deploy to.

Example:

providerOverrides:
  aws-us:
    name: aws-lambda
    runtime: nodejs
    region: us-east-1
  aws-eu:
    name: aws-lambda
    runtime: nodejs
    region: eu-west-1

fabric:
  targets: [aws-us, aws-eu]
  routing: latency

Then run runfabric router deploy (deploys to both), runfabric router status (performs an HTTP GET against each endpoint and reports healthy/fail), and runfabric router endpoints (lists URLs for use with Route53 or other DNS/LB).

Managed resource binding

Declare database and cache resources so that DATABASE_URL, REDIS_URL, and similar connection strings are injected into every function’s environment at deploy. Values come from the process environment or from a literal/${env:VAR} expression.

Each entry under resources must have:

Optional provisioning (RDS, ElastiCache): set provision: true to have the engine call the provider’s provision callback to obtain a connection string (e.g. from RDS or ElastiCache). The config layer supports this via ResourceProvisionFn; if the provider does not implement it or returns an error, binding falls back to connectionStringEnv or connectionString. The AWS provider implements lookup for existing RDS and ElastiCache resources. Supported spec fields when provision: true:

Per-function resource refs: In each function entry under functions, set resources to a list of resource keys (e.g. ["db"]). Only those resources’ env vars are injected into that function. If resources is omitted or empty, all top-level resources are injected (current default).

Example:

resources:
  db:
    type: database
    envVar: DATABASE_URL
    connectionStringEnv: DATABASE_URL # value from process env at deploy
  cache:
    type: cache
    envVar: REDIS_URL
    connectionString: "${env:REDIS_URL}" # or literal redis://localhost:6379

At deploy, each function’s environment is merged with these bindings (then with compose SERVICE_*_URL and other extraEnv). If a function sets resources: [key1, ...], only those resources’ env vars are injected; otherwise all resources apply. When provision: true is set, the engine calls the provider’s Provisioner; if it returns not-implemented or error, the existing connectionStringEnv/connectionString path is used.
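The per-function filter looks the same as for addons — a sketch against the resources example above:

```yaml
functions:
  - name: api
    entry: src/api.ts
    resources: ["db"] # only DATABASE_URL is injected; REDIS_URL is not
```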

Validation

See platform/core/model/config/validate.go: provider name/runtime required, at least one function, backend kind constraints, and event/authorizer rules.

Integrations and policies

Use integrations and policies for workflow/runtime extension settings without changing core config fields.

Example (MCP + policy blocks):

integrations:
  mcp:
    enabled: true
    server: runfabric-mcp
    transport: stdio
  approvals:
    provider: slack
    channel: ${env:APPROVALS_CHANNEL,#ops-approvals}

policies:
  workflow:
    maxRunSeconds: 1800
    denyModelFamilies: ["experimental"]
  deploy:
    requireRollbackOnFailure: true

Validation expectations:

Workflow step kinds

Workflow steps support a typed kind for AI/human-in-the-loop flows:

Minimal typed example:

workflows:
  - name: release-flow
    steps:
      - id: gather-context
        kind: ai-retrieval
        input:
          query: "Summarize deploy risks for this commit"
          model: gpt-4.1
      - id: generate-plan
        kind: ai-structured
        input:
          prompt: "Create a release plan"
          schema:
            type: object
            properties:
              actions:
                type: array
                items: { type: string }
      - id: approve
        kind: human-approval
        input:
          approvalRequest: "Review generated release actions"
      - id: deploy
        kind: code

Step requirements:

Human approval lifecycle

For kind: human-approval, workflow execution follows:

Operational flow:

Approval inputs typically include the approval.inputKey payload from prior steps, reviewer identity, and optional justification captured by the integration.

Provider-native orchestration extensions

Provider orchestration adapters are configured under extensions.

GCP Cloud Workflows

Use extensions.gcp-functions.cloudWorkflows for workflow sync, invoke, and inspect:

extensions:
  gcp-functions:
    cloudWorkflows:
      - name: order-flow
        definitionPath: workflows/order-flow.yaml
        bindings:
          createOrder: createOrder

Supported fields per item:

Azure Durable Functions

Use extensions.azure-functions.durableFunctions for durable orchestration routing:

extensions:
  azure-functions:
    durableFunctions:
      - name: order-flow
        orchestrator: OrderFlowOrchestrator
        taskHub: order-hub
        storageConnectionSetting: AzureWebJobsStorage

Supported fields per item:

Durable declarations are now applied through explicit Azure management-plane app settings updates during orchestration sync/remove. RunFabric writes and removes managed keys under RUNFABRIC_DURABLE_<NAME>_* so durable lifecycle state is explicit and reversible.

Schema files

| File | Purpose |
| --- | --- |
| schemas/runfabric.schema.json | Full schema for the current config contract. |
| schemas/resource.schema.json | Resource definition schema (binding + optional provisioning fields). |
| schemas/workflow.schema.json | Workflow definition schema (name, steps, kind/input/model/timeout/retry shape). |
| schemas/secrets.schema.json | Secrets map shape. |