CI/CD Pipelines¶

All automation for this repository lives in .github/workflows. There are ten workflow files: three that deploy, one PR gate, one teardown job, one DNS-cleanup job (currently disabled), one no-op deployment-freeze placeholder, one docs publisher, and two Claude-bot integrations. The deployment workflows are thin GitHub Actions wrappers around the PowerShell scripts in .scripts/: the heavy lifting (azd, slot swaps, DB-migration handshake, DNS/TLS) happens there, not in YAML.

What CI/CD does here in one sentence

Every push to main bumps the version and rolls the change through both Azure paths — an ephemeral Aspire / Container Apps environment and the App Service dev slots — while a GitHub release drives a gated staging → production slot swap; pull requests are gated by a build/test/format job.

For the deployment architecture these pipelines drive (the "two parallel Azure paths", the Aspire app graph, the App Service slots), read the Deployment Overview first. This page is the per-workflow reference: triggers, jobs, the secrets/inputs each one consumes, and how they relate.

Workflow catalogue¶

| File | Name | Trigger | Purpose |
|---|---|---|---|
| `azure-dev.yml` | Deploy Aspire Environment | push `main`, manual | `azd up` → ephemeral Container Apps env + DNS/TLS |
| `azure-dev-app-services.yml` | Deploy Dev App Services | push `main`, manual | version bump + publish + App Service dev slots + swap |
| `release-deploy-app-services.yml` | Deploy Release App Services | GitHub release published | release version + `stage` Aspire + prod-staging + approval + prod swap |
| `pr-validation.yml` | PR Validation | PR → `main`/`develop` | .NET build/test/format + frontend build/format gate |
| `teardown.yml` | Teardown Environments | daily cron `0 2 * * *`, manual | delete aged `rg-ABP-*` resource groups by tag |
| `cleanup-dns.yml.inactive` | Daily DNS Cleanup | (disabled by the `.inactive` suffix) | would prune dangling CNAMEs daily |
| `deployment-freeze.yml` | Deployment Freeze | manual | placeholder no-op (does nothing yet) |
| `docs.yml` | Deploy Documentation | push `main` (`docs/**`), manual | `mkdocs build --strict` → GitHub Pages |
| `claude.yml` | Claude Code | `@claude` mentions on issues/PRs | interactive Claude bot |
| `claude-code-review.yml` | Claude Code Review | PR opened/synchronize | automated `/code-review` on PRs |

How they relate¶

flowchart TD
    PR["Pull request → main / develop"] --> PRV["pr-validation.yml<br/>build · test · csharpier · prettier"]
    PR --> CR["claude-code-review.yml<br/>/code-review"]

    Push["Push to main"] --> AD["azure-dev.yml<br/>azd up → rg-ABP-main (ACA)"]
    Push --> ADAS["azure-dev-app-services.yml"]
    Push -. "docs/** only" .-> Docs["docs.yml → GitHub Pages"]

    subgraph ADAS_jobs["azure-dev-app-services.yml"]
        direction TB
        U["update: UpdateVersion.ps1 -incVersion build"] --> D["deploy: PublishApp → HubTest → HubProd → swap → HubTest"]
    end

    Rel["GitHub release published"] --> RUP["update: UpdateVersion.ps1 -version tag"]
    RUP --> RA["deployAspire → stage.cargonerds.dev"]
    RUP --> RAS["deployAppService → Prod-Staging slot"]
    RAS --> AP["approve: environment 'production' gate"]
    AP --> SW["swap: SwapAppServiceSlots → production"]

    Cron["Daily 02:00 UTC"] --> TD["teardown.yml<br/>delete aged rg-ABP-*"]

    Comment["@claude on issue/PR"] --> CL["claude.yml"]

Shared conventions¶

A few things are wired identically across the deploying workflows; they are described once here and not repeated per workflow.

OIDC Azure login¶

Every Azure-touching job authenticates via federated OIDC — no stored Azure password is used for the az CLI. This requires the job-level permission id-token: write and the azure/login@v2 action:

OIDC login block (repeated in every deploy/teardown job)

permissions:
  contents: read
  id-token: write

# …
      - name: Azure Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

The Aspire workflows additionally run azd auth login with secrets.AZURE_CLIENT_SECRET because azd (unlike the az CLI) does not consume the OIDC token from azure/login.

.NET 10 + ABP CLI toolchain¶

The deploy jobs set up .NET 10 and the ABP CLI (Volo.Abp.Cli) before building. abp install-libs restores the client-side libraries ABP needs at build time; the AppHost itself uses ABP's .NET Aspire integration:

Toolchain setup (Aspire / App Service deploy jobs)

      - name: Setup .NET
        uses: actions/setup-dotnet@v5
        with:
          dotnet-version: |
            10.x.x
      - name: Install ABP CLI
        run: |
          dotnet tool install -g Volo.Abp.Cli
          abp install-libs
        shell: bash

azure-dev.yml and azure-dev-app-services.yml also run dotnet workload restore (Aspire workloads). PR validation uses a slimmer setup (dotnet-version: '10.0.x', actions/setup-dotnet@v4) and does not install the ABP CLI.

GitHub environments and secrets¶

Almost every deploy/teardown job declares environment: name: stage; the approve gate in the release workflow declares environment: name: production. These are GitHub deployment environments, not the azd environment or the Aspire "Spark environment"; environment-scoped secrets and required reviewers attach to them. The comments in the release workflow call this out explicitly:

    environment:
      name: stage #env name for github env not the azd env

| Secret | Used by | What for |
|---|---|---|
| `AZURE_CLIENT_ID` / `AZURE_TENANT_ID` / `AZURE_SUBSCRIPTION_ID` | all deploy/teardown jobs | OIDC `azure/login` |
| `AZURE_CLIENT_SECRET` | Aspire jobs only | `azd auth login` |
| `GIT_ADMIN_TOKEN` | `update` jobs (version bump) | checkout with write creds so the bump commit can be pushed |
| `GITHUB_TOKEN` | `docs.yml`, `claude*.yml` | release/PR API reads (docs), bot identity |
| `CLAUDE_CODE_OAUTH_TOKEN` | `claude.yml`, `claude-code-review.yml` | authenticate the Claude bot |

Where the values come from

These are GitHub repository/environment secrets, not Azure Key Vault entries and not appsettings keys. Runtime application configuration (connection strings, OpenIddict, etc.) is a separate concern — see Configuration Reference and appsettings.

Deploy workflows¶

`azure-dev.yml` — Deploy Aspire Environment¶

Deploys the Aspire app graph to an ephemeral per-branch Azure Container Apps environment. See Azure Container Apps for the target details.

Triggers

on:
  push:
    branches: [ main ]
  workflow_dispatch:
    inputs:
      environment_name:   # optional override (defaults to branch name)
      spark_environment:  # choice: Default | Test (default Default)

Jobs (run in sequence via needs)

resolve-environment (ubuntu) — derives the env name from the branch (or the manual environment_name override) by lowercasing and replacing every non-alphanumeric run with a single hyphen. It also computes two flags later consumed by teardown tagging:
- PROTECTED_BRANCHES="main|production|staging": a protected branch name cannot be supplied as a manual override (the step fails with exit 1), and protected branches are never torn down automatically.
- Outputs env_name, is_protected_branch, is_manual_deployment.
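
The name derivation described above can be sketched in POSIX shell. This is a minimal illustration, not the workflow's actual step: the function names are hypothetical, and it assumes the protected-name check is applied to the normalized override and that leading or trailing hyphens are left in place (the source does not say either way).

```shell
#!/bin/sh
# Sketch of the resolve-environment logic. All function names are hypothetical.
PROTECTED_BRANCHES="main|production|staging"

# Lowercase the input, then collapse every run of non-alphanumeric
# characters into a single hyphen.
normalize_env_name() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g'
}

# Succeeds when the whole name equals one of the protected branch names.
is_protected() {
  printf '%s\n' "$1" | grep -Eqx "$PROTECTED_BRANCHES"
}

# $1 = branch name, $2 = optional manual override (empty when not supplied).
resolve_env_name() {
  if [ -n "$2" ]; then
    candidate=$(normalize_env_name "$2")
    # Assumption: the check runs on the normalized override.
    if is_protected "$candidate"; then
      echo "error: '$2' is a protected name and cannot be used as an override" >&2
      return 1
    fi
  else
    candidate=$(normalize_env_name "$1")
  fi
  printf '%s\n' "$candidate"
}
```

For example, `resolve_env_name 'Feature/My_Branch' ''` prints `feature-my-branch`, while `resolve_env_name feature/x main` fails with a non-zero status.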

deploy (ubuntu, env stage) — installs azd + .NET 10 + ABP CLI, OIDC login, azd auth login, then runs the wrapper script:

      - name: Deploy Aspire
        working-directory: ./.scripts
        run: >
          ./DeployToAzureContainerApps.ps1
          -EnvironmentName '${{ needs.resolve-environment.outputs.env_name }}'
          -SkipEnvSetup $true
          -SparkEnvironment '${{ inputs.spark_environment || 'Default' }}'

configure (ubuntu, env stage) — runs SetupDnsAndCertificates.ps1 against rg-ABP-<env>, then tags the resource group with deployment metadata. autoTeardownEnabled is set to true only when the deploy is a manual run of a non-protected branch:

AUTO_TEARDOWN="false"
if [ "…is_manual_deployment" == "true" ] && [ "…is_protected_branch" == "false" ]; then
  AUTO_TEARDOWN="true"
fi
az group update --name "rg-ABP-<env>" --tags \
  deployedAt=… branch=… isProtectedBranch=… isManualDeployment=… \
  autoTeardownEnabled="$AUTO_TEARDOWN" commitSha=… workflowRunId=…

These RG tags are the contract read by teardown.yml (see below) to decide what may be deleted.

Resource-group naming

The script prefixes the azd env with ABP-, so EnvironmentName main becomes azd env ABP-main, resource group rg-ABP-main, and domain main.cargonerds.dev. Region for new envs is germanywestcentral.
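
As a quick reference, the naming rule can be written as three small shell helpers. The function names are hypothetical; only the ABP- prefix, the rg- prefix and the cargonerds.dev domain come from the text above.

```shell
#!/bin/sh
# Derive the azd env, resource group and domain from an environment name.
# Function names are illustrative, not part of the deploy script.
azd_env_name()   { printf 'ABP-%s\n' "$1"; }
resource_group() { printf 'rg-ABP-%s\n' "$1"; }
env_domain()     { printf '%s.cargonerds.dev\n' "$1"; }
```

Calling `resource_group main` prints `rg-ABP-main`, which is the name teardown.yml later matches with its rg-ABP-* filter.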

AZD_UP_CONCURRENCY=1 is mandatory

DeployToAzureContainerApps.ps1 sets $env:AZD_UP_CONCURRENCY = "1". Without it, parallel dotnet publish runs collide — a regression since azure-dev-cli 1.25.0. The script comment documents this; do not remove it.

`azure-dev-app-services.yml` — Deploy Dev App Services¶

The primary dev App Service pipeline. See Azure App Service for the slot model.

Triggers: push to main, or manual workflow_dispatch.

Jobs

update (windows, env stage, permissions: contents: write) — bumps the fourth (build) part of the version in common.props and pushes the bump commit, then verifies the push landed on origin:
```
      - name: Update version
        run: powershell -ExecutionPolicy Bypass -File '.scripts/UpdateVersion.ps1' -incVersion build -commit
```
Checkout uses token: ${{ secrets.GIT_ADMIN_TOKEN }} with persist-credentials: true so the script can push. The job exposes the pushed SHA as output ref (a retry loop polls git ls-remote origin until the tip matches before continuing).
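
A retry loop of this kind might look like the following shell sketch. It is an assumption about the shape of the loop, not a copy of the workflow step (which runs on Windows): the helper names, the attempt count and the delay are all hypothetical.

```shell
#!/bin/sh
# $1 = expected SHA, $2 = max attempts, $3 = seconds between attempts,
# remaining arguments = command that prints the current remote tip.
wait_for_tip() {
  expected=$1; attempts=$2; delay=$3; shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    tip=$("$@") || tip=""
    if [ "$tip" = "$expected" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "remote tip never matched $expected" >&2
  return 1
}

# Prints the SHA that origin's main branch currently points at.
remote_main_tip() {
  git ls-remote origin refs/heads/main | cut -f1
}
```

A call such as `wait_for_tip "$(git rev-parse HEAD)" 10 3 remote_main_tip` would block until origin reports the freshly pushed commit, or fail after ten attempts.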

Self-bump rerun guard

The bump push to main would otherwise re-trigger this same workflow. The update job is guarded so it skips its own commit:
```
if: >-
  github.event_name != 'push' ||
  !startsWith(github.event.head_commit.message, 'chore: bump version to ')
```
UpdateVersion.ps1 always commits with the message chore: bump version to <v> (common.props), which is exactly what this prefix match suppresses.
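
To check a commit message against the guard locally, the same condition can be mirrored in shell. This is only an illustrative equivalent of the `if:` expression; the function name is hypothetical.

```shell
#!/bin/sh
# Mirrors the workflow guard: run unless this is a push whose head commit
# message starts with the version-bump prefix.
# $1 = event name, $2 = head commit message
should_run_update() {
  if [ "$1" != "push" ]; then
    return 0
  fi
  case "$2" in
    "chore: bump version to "*) return 1 ;;
    *) return 0 ;;
  esac
}
```

`should_run_update push "chore: bump version to 1.2.3.4 (common.props)"` returns non-zero (the job is skipped), while any other push message, or a manual dispatch, returns zero.
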
deploy (ubuntu, env stage, needs: update) — checks out the exact needs.update.outputs.ref, sets up the toolchain, logs in via OIDC, then runs five script steps in order:

      - name: Publish Code
        run: ./PublishApp.ps1
      - name: Deploy To Dev-HubTest
        run: ./DeployToAppServices.ps1 -AzureEnvironment Dev-HubTest
      - name: Deploy To Dev-HubProd
        run: ./DeployToAppServices.ps1 -AzureEnvironment Dev-HubProd
      - name: Swap Environments for HubTest to production slot
        run: ./SwapAppServiceSlots.ps1 -AzureEnvironment Dev-HubTest
      - name: Deploy To Dev-HubTest after swap with production slot
        run: ./DeployToAppServices.ps1 -AzureEnvironment Dev-HubTest -SkipDbMigration

So the flow is: publish all service zips → deploy to HubTest (with DB migration) → deploy to HubProd (with DB migration) → swap HubTest into the production slot → redeploy HubTest.

Why HubTest is deployed twice

The slot swap moves HubTest's content into the live production slot, leaving the test slot holding the old production bits. The final step refills the test slot with the new build, this time with -SkipDbMigration (the migration already ran on the first HubTest deploy, so re-running it is unnecessary).
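
The effect of the swap can be traced with a toy two-slot model. This is a deliberately simplified sketch: it ignores the Dev-HubProd deploy and models only a test slot and a production slot, and the function name and version strings are made up for illustration.

```shell
#!/bin/sh
# $1 = build currently live in both slots, $2 = new build being rolled out
simulate_dev_pipeline() {
  test_slot=$1
  prod_slot=$1
  new_build=$2

  test_slot=$new_build                                   # Deploy To Dev-HubTest
  tmp=$test_slot; test_slot=$prod_slot; prod_slot=$tmp   # slot swap
  echo "after swap:     test=$test_slot prod=$prod_slot"

  test_slot=$new_build                                   # redeploy with -SkipDbMigration
  echo "after redeploy: test=$test_slot prod=$prod_slot"
}
```

Running `simulate_dev_pipeline 1.0.0.1 1.0.0.2` shows the test slot holding the old 1.0.0.1 build right after the swap, and both slots on 1.0.0.2 only after the final redeploy.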

`release-deploy-app-services.yml` — Deploy Release App Services¶

The production release pipeline. Triggered when a GitHub release is published (on: release: types: [published]).

Jobs

| Job | Env | `needs` | Action |
|---|---|---|---|
| `update` | `stage` | (none) | `UpdateVersion.ps1 -version ${{ github.event.release.tag_name }} -commit`; sets the exact release version |
| `deployAspire` | `stage` | `update` | `DeployToAzureContainerApps.ps1 -EnvironmentName 'stage' -SkipEnvSetup $true -SparkEnvironment 'Test'` → `stage.cargonerds.dev` |
| `deployAppService` | `stage` | `update` | `PublishApp.ps1` then `DeployToAppServices.ps1 -AzureEnvironment Prod-Staging` (the `staging` slot of prod) |
| `approve` | `production` | `deployAppService` | manual gate: the job body just `echo`s; the `production` environment's required reviewers block it |
| `swap` | `stage` | `approve` | `SwapAppServiceSlots.ps1 -AzureEnvironment Prod-Staging` (staging → production) |

deployAspire and deployAppService both check out ${{ github.event.release.target_commitish }} (the branch or commit the release was created from, not the tag itself). The release flow is therefore: set version → deploy to stage Aspire + prod-staging slot in parallel → wait for human approval → swap staging into production.

Rollback

SwapAppServiceSlots.ps1 performs a warm-up swap --action preview, health-polls each service, then swap --action swap. If a preview gets stuck the script prints the swap --action reset commands to cancel it — that is the documented rollback path. See Azure App Service.
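
The three swap actions map onto `az webapp deployment slot swap --action preview|swap|reset`. The sketch below strings them together under stated assumptions: RESOURCE_GROUP, APP_NAME and SOURCE_SLOT are placeholder variables, the health check is whatever command the caller passes in, and the sketch resets automatically on a failed health check, whereas the real script only prints the reset commands. It is not the contents of SwapAppServiceSlots.ps1.

```shell
#!/bin/sh
# AZ can be overridden (for example AZ=echo) to print commands instead of running them.
AZ=${AZ:-az}

# $1 = preview | swap | reset
swap_action() {
  "$AZ" webapp deployment slot swap \
    --resource-group "$RESOURCE_GROUP" --name "$APP_NAME" \
    --slot "$SOURCE_SLOT" --target-slot production --action "$1"
}

# Arguments = health-check command to run between preview and swap.
swap_with_preview() {
  swap_action preview || return 1
  if "$@"; then
    swap_action swap
  else
    echo "health check failed, cancelling the pending swap" >&2
    swap_action reset
    return 1
  fi
}
```

With `AZ=echo` set, `swap_with_preview true` prints the preview and swap commands without touching Azure, which is a cheap way to review the arguments before a real run.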

flowchart LR
    R(["release published"]) --> U["update<br/>set version = tag"]
    U --> DA["deployAspire<br/>stage.cargonerds.dev"]
    U --> DS["deployAppService<br/>Prod-Staging slot"]
    DS --> AP{{"approve<br/>environment: production"}}
    AP -->|reviewer approves| SW["swap<br/>staging → production"]

PR validation¶

`pr-validation.yml` — PR Validation¶

The merge gate. Runs on pull_request targeting main or develop. Two independent jobs run in parallel:

build-dotnet (ubuntu) — dotnet restore Cargonerds.All.sln → dotnet build … -c Release → dotnet tool restore → dotnet csharpier check . (formatting) → a "build produced no files" worktree check → dotnet test Cargonerds.All.sln -c Release --no-build.
build-frontend (ubuntu, Node 22) — npm ci in frontend/ → npm run build in frontend/realtime → npx prettier --check . (formatting).

Aggregate-then-fail pattern

Every step uses continue-on-error: true and records its outcome; a final if: always() step collects the failures and exits non-zero with a list. This means one PR run reports all problems at once (build and format and tests) instead of stopping at the first failure. The --ignore-exit-code 8 on dotnet test tolerates the "no tests found" exit code so empty test projects do not fail the gate.
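
The same aggregate-then-fail idea, reduced to a shell sketch for illustration. In the workflow it is expressed with continue-on-error and a final if: always() step; the helper names here are hypothetical.

```shell
#!/bin/sh
failures=""

# Run a named step; on failure remember its name instead of stopping.
# $1 = step name, remaining arguments = command to run
run_step() {
  name=$1; shift
  if ! "$@"; then
    failures="$failures $name"
  fi
}

# Final gate: succeed only when no step failed, otherwise list them all.
report() {
  if [ -n "$failures" ]; then
    echo "failed steps:$failures" >&2
    return 1
  fi
  echo "all steps passed"
}
```

A caller would chain `run_step build ...`, `run_step format ...`, `run_step test ...` and finish with `report`, so a formatting failure does not hide a later test failure.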

The CSharpier check mirrors local formatting expectations; see AGENTS.md and the Development Workflow page.

Operational workflows¶

`teardown.yml` — Teardown Environments¶

Deletes aged ephemeral environments. Trigger: daily cron 0 2 * * * or manual workflow_dispatch.

Inputs (manual mode): mode (scheduled | manual), environment_name, max_age_days (default 7), dry_run, and confirm_deletion (must equal DELETE for a real manual delete).

The single teardown job (env stage) is entirely inline PowerShell:

Scheduled mode lists every rg-ABP-* resource group and reads the tags written by azure-dev.yml:
- skips the RG unless autoTeardownEnabled == 'true' (protected branches, automatic deployments, and untagged RGs are all kept),
- then deletes only when age >= max_age_days (default 7 days, computed from the deployedAt tag).
Manual mode targets rg-ABP-<environment_name> and refuses unless dry_run is on or confirm_deletion == 'DELETE'.
Before deleting, it scans the RG for Key Vaults and, after az group delete, purges the soft-deleted vaults (az keyvault purge) so their names are free for the next deploy.
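
The two decisions above can be condensed into a pair of predicates. The real job is inline PowerShell; this shell sketch swaps in epoch seconds for the deployedAt tag to keep the arithmetic portable, and it assumes a resource group with a missing deployedAt is kept. Function names are hypothetical.

```shell
#!/bin/sh
# Scheduled mode.
# $1 = autoTeardownEnabled tag, $2 = deployedAt (epoch seconds),
# $3 = now (epoch seconds), $4 = max_age_days
should_delete_scheduled() {
  [ "$1" = "true" ] || return 1   # protected, automatic and untagged RGs are kept
  [ -n "$2" ] || return 1         # assumption: no deployedAt tag means keep
  age_days=$(( ($3 - $2) / 86400 ))
  [ "$age_days" -ge "$4" ]
}

# Manual mode: allowed only as a dry run or with the explicit confirmation.
# $1 = dry_run, $2 = confirm_deletion
manual_mode_allowed() {
  [ "$1" = "true" ] || [ "$2" = "DELETE" ]
}
```

With the default of 7 days, a group tagged autoTeardownEnabled=true and deployed 8 days ago is deleted, one deployed 6 days ago is kept, and any group without the tag is never touched.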

Missing custom teardown script

The job probes for ./.scripts/Teardown.ps1 and would prefer it, but that file does not exist in the repo — so the inline az group delete fallback always runs. (The local-developer equivalent is .scripts/TeardownAzureContainerApps.ps1, which is not what this workflow invokes.)

`cleanup-dns.yml.inactive` — Daily DNS Cleanup (disabled)¶

A daily (0 2 * * *) job that would run CleanupDnsEntries.ps1 against the cargonerds.dev zone in rg-cargonerds-applications-core-infrastructure to prune dangling CNAMEs.

This workflow does not run

The filename ends in .yml.inactive, so GitHub Actions ignores it (only *.yml / *.yaml are loaded). The DNS-cleanup logic only executes ad hoc or as part of a teardown. To re-enable it, rename the file to cleanup-dns.yml.
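
A small shell check makes the extension rule concrete. The function name is hypothetical; the rule itself (only .yml and .yaml files are loaded) is the one stated above.

```shell
#!/bin/sh
# Succeeds when a file name has an extension GitHub Actions loads as a workflow.
is_loaded_workflow() {
  case "$1" in
    *.yml|*.yaml) return 0 ;;
    *) return 1 ;;
  esac
}
```

`is_loaded_workflow cleanup-dns.yml.inactive` fails, while `is_loaded_workflow cleanup-dns.yml` succeeds, which is why the rename alone re-enables the job.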

`deployment-freeze.yml` — Deployment Freeze (placeholder)¶

Manual-only (workflow_dispatch). The single job (named Placehlder — sic) just echoes a message and performs no action:

      - name: "Freeze Deployments"
        run: |
          echo "This workflow is a placeholder to prevent deployments during freeze periods. It does not perform any actions."

Note

Despite holding permissions: actions: write, it does not currently disable or cancel any other workflow. Treat it as a stub.

`docs.yml` — Deploy Documentation¶

Builds and publishes this documentation site to GitHub Pages.

Triggers: push to main touching docs/** or .github/workflows/docs.yml, plus manual dispatch. Concurrency group pages (no cancel-in-progress).

Jobs

build (working-directory: docs) — Python 3.12 + pip install -r requirements-docs.txt, then:
```
      - name: Build documentation
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: mkdocs build --strict
```
--strict makes any warning (e.g. a broken internal link) fail the build. GITHUB_TOKEN lets gen_releases.py read the repo's releases and merged PRs via the authenticated API (avoiding the anonymous rate limit) when generating the Releases page. The built docs/site is uploaded via actions/upload-pages-artifact.
deploy (needs: build, env github-pages) — actions/deploy-pages@v4 publishes the artifact.

--strict will fail on bad links

Because the build is strict, a relative cross-link to a page that does not exist (or an over-deep ../../../ path that escapes the docs root) breaks the whole publish. Keep cross-links relative to the docs/docs/ tree, and link repo-root files such as common.props or AGENTS.md to their https://github.com/Cargonerds/CargonerdsApp/blob/main/<file> URL rather than with a relative path.
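
Two helpers illustrate the linking rules. Both names are hypothetical, and the depth check is a rough heuristic that only counts leading ../ segments; it is not what mkdocs itself does when validating links.

```shell
#!/bin/sh
# Absolute URL for a file that lives at the repository root.
repo_file_url() {
  printf 'https://github.com/Cargonerds/CargonerdsApp/blob/main/%s\n' "$1"
}

# Succeeds when a relative link climbs above the docs root.
# $1 = relative link, $2 = number of directories the linking page sits below the docs root
escapes_docs_root() {
  link=$1
  ups=0
  while :; do
    case "$link" in
      ../*) ups=$((ups + 1)); link=${link#../} ;;
      *) break ;;
    esac
  done
  [ "$ups" -gt "$2" ]
}
```

For a page one directory below the docs root, `escapes_docs_root ../../../common.props 1` succeeds, signalling that the link should be replaced with the output of `repo_file_url common.props`.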

Claude bot workflows¶

These two are developer-assist integrations via anthropics/claude-code-action@v1 and are not part of any deployment.

claude.yml — Claude Code. Fires on issue comments, PR review comments, PR reviews, and issue open/assign, but the job's if only runs when the body/title contains @claude. It authenticates with secrets.CLAUDE_CODE_OAUTH_TOKEN and is granted actions: read so it can read CI results on PRs.

claude-code-review.yml — Claude Code Review. Fires on every PR opened / synchronize / ready_for_review / reopened and runs the code-review plugin automatically:

        plugin_marketplaces: 'https://github.com/anthropics/claude-code.git'
        plugins: 'code-review@claude-code-plugins'
        prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}'

Gotchas summary¶

Things to know before editing these workflows

azure-dev-app-services.yml self-triggers. Its own version-bump push to main re-runs the workflow; the startsWith(head_commit.message, 'chore: bump version to ') guard is what prevents an infinite loop. Keep UpdateVersion.ps1's commit message and that guard in sync.
HubTest is deployed twice in the dev pipeline — the second deploy is intentional (refills the test slot after the swap) and uses -SkipDbMigration.
teardown.yml reads RG tags written by azure-dev.yml. The autoTeardownEnabled/deployedAt/branch tags are a cross-workflow contract; changing the tag names in one place breaks teardown decisions.
cleanup-dns.yml.inactive is disabled by filename. Renaming back to .yml re-activates a daily DNS-cleanup run.
deployment-freeze.yml is a no-op. It blocks nothing today.
teardown.yml references ./.scripts/Teardown.ps1, which does not exist — the inline az group delete path always runs.
docs.yml builds with --strict — broken links fail the publish, not just warn.
azd needs AZURE_CLIENT_SECRET even though az uses OIDC; the Aspire jobs run azd auth login separately.

Related pages¶

Deployment Overview — the two Azure paths these pipelines drive.
Azure Container Apps — the ephemeral ACA target (azure-dev.yml).
Azure App Service — the slot model, publish + swap + DB-migration handshake.
Helm & Kubernetes and Local Deployment — local-only paths, not used by CI.
Configuration Reference and appsettings — runtime config (distinct from the GitHub secrets above).
Development Workflow — the local CSharpier/Prettier expectations the PR gate enforces.
