Pearl GitHub CI/CD Guide
This page documents the working GitHub Actions CI/CD system for Pearl. It covers all seven workflows (create-environment, deploy-production, deploy-test, cleanup-test-env, build, deploy, windows-build),
how they connect to Azure infrastructure, and how the deployment and validation pipeline operates in practice.
Region: UK West | Deploy method: Azure VM Run Commands (v2) | Auth: OIDC Federated Identity | Runners: GitHub-hosted windows-2022
What The System Does Now
The CI/CD system is fully operational with seven workflows covering environment provisioning, automated builds, production deployment, test environment deployment, and automated cleanup. All authentication uses OIDC federated identity (no stored secrets).
| Area | Current working state | How it works |
|---|---|---|
| Infrastructure (Terraform) | Remote-state Terraform provisions environments (UK West): hub/spoke networking, NSGs, NAT Gateway, Azure Firewall, VMs, Key Vault, storage, and VNet peering to the externally-managed SQL MI. Target environment selectable via dropdown. | create-environment.yaml with plan/apply/destroy modes and target_environment dropdown. Publishes environment metadata artifact. |
| Build pipeline | GitHub-hosted windows-2022 runners build web and worker tiers using encrypted vendor dependencies (AES-256 .zop bundle) from Azure Blob Storage. | build.yaml downloads → validates SHA256 → retrieves password from Key Vault → decrypts → NuGet restore → MSBuild → packages versioned artifacts + release manifest. |
| Production deployment | Manual-only production deployment to web and worker VMs using Azure VM Run Commands (v2). Includes optional infrastructure refresh via Terraform. | deploy-production.yaml (workflow_dispatch only) → builds → uploads artifacts to Blob Storage → deploys via Run Command to both VMs → validates. |
| Test environment deployment | Automatic deployment on feature/env branch pushes. Creates isolated IIS sites with host-header routing on the shared test VM. | deploy-test.yaml triggers on push → derives env name from branch → builds web tier → deploys via Run Command with DeploymentType=test-environment. |
| Cleanup | Automated lifecycle management: removes test environments when branches are deleted, runs daily orphan scan at 03:00 UTC. | cleanup-test-env.yaml triggers on delete event, schedule, or manual dispatch → removes IIS site, app pool, and files via Run Command. |
| Validation | Structured per-VM validation: Web gets HTTP health + IIS checks. Worker gets Windows service status + SQL TCP connectivity with retry logic (3×15s attempts, 5s delay). | Invoke-RemoteValidation.ps1 runs via Run Command, outputs base64-encoded structured results parsed by the workflow. |
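To illustrate the Validation row, here is a minimal sketch of how a workflow step might decode a base64-encoded result and pick out failures. It assumes the decoded payload is UTF-8 JSON with a `checks` list whose entries carry `name` and `passed` fields. That schema and both function names are hypothetical and are not taken from Invoke-RemoteValidation.ps1.

```python
import base64
import json


def parse_validation_output(encoded: str) -> dict:
    """Decode a base64 string into a dict (assumes UTF-8 JSON inside)."""
    raw = base64.b64decode(encoded.strip(), validate=True)
    return json.loads(raw.decode("utf-8"))


def failed_checks(result: dict) -> list:
    """Return the names of checks that did not pass (hypothetical schema)."""
    return [c["name"] for c in result.get("checks", []) if not c.get("passed")]
```

A workflow step built this way would fail the job whenever `failed_checks` returns a non-empty list.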
Operational status: Fully working
All seven workflows are operational. GitHub-hosted runners handle builds, Azure VM Run Commands (v2) handle deployment, and OIDC federated identity handles authentication. Production deploys are manual-only; test environments deploy automatically on branch push and are cleaned up on branch deletion.
Current Delivery Model
The CI/CD system uses GitHub-hosted Windows runners for builds, Azure VM Run Commands (v2) for deployment, and OIDC federated identity for authentication. Production deployments are manual-only, while test environments deploy automatically on feature branch pushes.
- GitHub-hosted Windows runners for all build jobs (windows-2022).
- OIDC federated identity: no stored service principal secrets.
- Build once, deploy by version with SHA-256 validated artifacts and release manifests.
- Production: manual dispatch only, with no automatic deployments to production.
- Test environments: automatic. Deploy on branch push, clean up on branch delete.
- IIS host-header routing for isolated test environments on the shared test VM.
- Daily orphan cleanup at 03:00 UTC removes stale test environments.
End-To-End CI/CD Diagram
This diagram shows all seven GitHub Actions workflows, how they connect to each other, how build artifacts flow into deployment, and how the test environment lifecycle is managed automatically.
How to read the future branches
The build VM and slot-aware rollout are shown as later branches on purpose. They are part of the long-term direction, but they are not the primary execution path for the first CI/CD implementation.
Workflow Order
All seven workflows and how they relate. The primary automated path is highlighted.
- create-environment: Manual dispatch. Provision or destroy infrastructure via Terraform (plan/apply/destroy modes). Publishes environment-metadata.json artifact. Has target_environment dropdown (currently: test).
- deploy-test: Auto on push to feature/*, env/* branches. Derives environment name from branch, builds from that branch, deploys to isolated IIS site on shared test VM. Comments URL on PR.
- deploy-production: Manual dispatch only (push-on-master disabled). Ensures infrastructure via Terraform, builds from master, deploys web+worker to production VMs. Records deployment history.
- cleanup-test-env: Auto on branch delete + daily 03:00 UTC + manual dispatch. Removes IIS site, app pool, and files for test environments. Daily run detects orphans (no matching branch) and cleans them.
- build: Manual dispatch. Standalone build: downloads encrypted vendor DLLs, SHA256 verification, MSBuild web+worker, packages versioned artifacts. No deployment.
- deploy: Manual dispatch. Deploy a specific version or roll back to the previous known-good release. Supports both deploy and rollback modes with post-deployment validation.
- windows-build: Manual dispatch (legacy). Validation-only build using older Build-WebTier/Build-WorkerTier scripts. Does not produce uploaded artifacts.
create-environment.yaml
This workflow owns environment lifecycle. It should be the only workflow that creates or destroys the Terraform-managed estate.
Main job
Run Terraform plan and apply for the target environment and publish environment metadata after success.
Outputs
Expose current VM names, addresses, storage details, and key environment identifiers to later workflows.
Destroy control
Support manual destroy only, with explicit confirmation and approval before running Terraform destroy.
build.yaml
This workflow should convert the existing manual build logic into a repeatable hosted-runner build without changing the actual build contract.
| Build responsibility | Expected result |
|---|---|
| Dependency restore | Encrypted vendor bundle is downloaded from private Blob storage, validated by SHA256, and decrypted safely on the runner. |
| Web build | Artifacts equivalent to the current web build script output are created and packaged. |
| Worker build | Artifacts equivalent to the current worker build script output are created and packaged. |
| Release contract | Versioned artifacts, checksums, and a release manifest are uploaded for deployment. |
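The checksum step in the dependency restore row can be sketched as follows. This is an illustrative Python version only; the real step runs inside build.yaml on the Windows runner. The function names are hypothetical, and the expected digest is assumed to arrive as a hex string from workflow configuration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream the file in chunks so large bundles are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path: Path, expected_sha256: str) -> None:
    """Raise before any decryption is attempted if the digest does not match."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.strip().lower()):
        raise ValueError(f"SHA256 mismatch for {path.name}: got {actual}")
```

Failing before decryption means a corrupted or tampered download never reaches the point where the Key Vault password is used.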
Build rule
The build output must stay machine-neutral and environment-neutral. That means no VM-specific paths, no environment-specific hostnames or IP addresses baked into the package, and no assumption that the build machine is also the runtime machine. Machine targeting belongs to deploy, not to build.
deploy.yaml (Manual Deploy/Rollback)
Manual utility workflow for deploying specific versions or rolling back. Uses Azure VM Run Commands (v2) for remote execution without SSH/WinRM.
- Mode: deploy. Deploy a specific version (by artifact_version or build_run_id) to web+worker VMs.
- Mode: rollback. Load rollback-reference.json, find previous successful deployment, redeploy that version.
- Upload release artifacts to Azure Blob Storage and generate SAS URLs for VM access.
- Deploy web, then worker: sequential deployment with post-deploy validation on each tier.
- Post-deploy validation via Invoke-RemoteValidation.ps1 checks health endpoints and service status.
- Record history: updates deployment-history.json; supports future rollback.
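The rollback mode's "find previous successful deployment" step might look roughly like the sketch below. It assumes the history is a list ordered oldest to newest and that each entry has `version` and `status` keys. Those field names, the `"succeeded"` value, and the function name are hypothetical, since the guide does not show the actual shape of rollback-reference.json or deployment-history.json.

```python
def previous_known_good(history: list, current_version: str) -> str:
    """Pick the newest successful version that differs from the current one.

    `history` is assumed to be ordered oldest-to-newest, with hypothetical
    `version` and `status` keys on each entry.
    """
    for entry in reversed(history):
        succeeded = entry.get("status") == "succeeded"
        if succeeded and entry.get("version") != current_version:
            return entry["version"]
    # Fail explicitly rather than silently redeploying the current version.
    raise LookupError("no previous known-good deployment to roll back to")
```

Raising when no candidate exists matches the validation gate that rollback must fail explicitly when no known-good artifact is available.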
How Azure VM Run Commands work
The workflow creates a Run Command resource on the target VM via the Azure Resource Manager (ARM) API. The VM agent executes the script and reports stdout/stderr, so no inbound network ports are needed. Scripts are uploaded to Blob Storage and referenced by SAS URL. Timeout: 1200s per deployment.
deploy-test.yaml (Auto Test Deploy)
Trigger: Push to feature/* or env/* branches. Automatically builds and deploys to an isolated test environment.
- Derive environment name from branch: feature/auth-refactor → auth-refactor (sanitized, max 20 chars).
- Build on windows-2022 runner from the pushed branch code: full MSBuild producing web artifact.
- Stage to Azure Blob Storage: web ZIP + deployment scripts uploaded with SAS URLs.
- Deploy via Run Command v2: extracts files, creates IIS app pool + website, binds host header *:80:{env}.pearl.test.
- Comment on PR: posts environment URL on associated pull request.
- Auto-starts VM if stopped (waits up to 5 min for readiness).
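The name derivation and host-header binding from the steps above can be sketched in a few lines. The prefix stripping, the 20-character limit, and the binding format come from this guide. The exact sanitization rules (lowercasing, collapsing other characters to hyphens) are an assumption, and the function names are hypothetical.

```python
import re

MAX_ENV_NAME_LENGTH = 20  # from the guide: "max 20 chars"


def derive_env_name(branch: str) -> str:
    # Drop the leading "feature/" or "env/" prefix if present.
    name = re.sub(r"^(feature|env)/", "", branch.strip())
    # Assumed sanitization: lowercase, runs of other characters become "-".
    name = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    # Truncate, then strip again in case the cut landed on a hyphen.
    name = name[:MAX_ENV_NAME_LENGTH].strip("-")
    if not name:
        raise ValueError(f"cannot derive an environment name from {branch!r}")
    return name


def iis_binding(env_name: str) -> str:
    # Host-header binding in the form given above: *:80:{env}.pearl.test
    return f"*:80:{env_name}.pearl.test"
```

Because truncation can make two long branch names collide on the same environment name, a real implementation may want to detect or disambiguate that case.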
Concurrency
Per-environment: deploy-test-{env-name}. A new push to the same branch cancels the in-progress deploy. Different branches deploy independently in parallel.
deploy-production.yaml (Production Deploy)
Trigger: Manual dispatch only. Push-on-master is disabled. Deploys both web and worker tiers to production VMs.
- Ensure infrastructure: Terraform plan+apply (idempotent). Skip with skip_infrastructure: true.
- Build from master: full MSBuild producing web + worker + release manifest.
- Validate environment metadata against JSON schema.
- Deploy web to web VM via Azure Run Command v2.
- Deploy worker to worker VM via Azure Run Command v2.
- Record deployment history to blob storage for rollback support.
Concurrency
deploy-production: only one production deploy runs at a time. No cancellation (runs to completion).
cleanup-test-env.yaml (Cleanup)
Removes test environments when they're no longer needed. Three triggers cover all cases.
- Branch delete: auto-triggers on branch deletion. Derives env name and runs Cleanup-TestEnvironment.ps1 via Run Command.
- Manual dispatch: provide env_name to clean up a specific environment.
- Daily schedule (03:00 UTC): lists all pearl-* IIS sites on VM, compares with active branches via GitHub API, removes orphans.
- Safety guards: never touches non-pearl sites, idempotent (exits successfully if env doesn't exist), skips if VM is stopped.
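The daily orphan scan is essentially a set comparison, sketched below. It assumes each test site is named `pearl-` followed by the environment name and that the active environment names have already been derived from the branch list. That naming rule and the function name are assumptions, not confirmed details of Cleanup-TestEnvironment.ps1.

```python
SITE_PREFIX = "pearl-"  # the guide only states that sites match pearl-*


def find_orphans(iis_sites, active_env_names):
    """Return pearl-* sites whose environment has no matching active branch."""
    active = set(active_env_names)
    orphans = []
    for site in iis_sites:
        if not site.startswith(SITE_PREFIX):
            continue  # safety guard: never touch non-pearl sites
        env_name = site[len(SITE_PREFIX):]
        if env_name not in active:
            orphans.append(site)
    return sorted(orphans)
```

For example, with sites `pearl-auth-refactor`, `pearl-old`, and `Default Web Site`, and only `auth-refactor` active, the function selects `pearl-old` alone.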
Deployment Strategy
The current strategy is intentionally simple: automate the environment that already exists, prove the workflow contract, then expand later.
| Phase | Runtime shape | Operational model |
|---|---|---|
| Now (Active) | One web VM and one worker VM (UK West region) | Build once on GitHub-hosted windows-2022, deploy by version via Azure Run Commands, validate with structured checks (HTTP + service + SQL with retry), rollback by redeploying previous artifact |
| Next | Same runtime shape plus optional self-hosted build runner | Reuse the same workflows with runner-label changes only where needed |
| Later | Slot-aware runtime topology | Blue/green deployment, promotion controls, and routing-based rollback |
Do not start with slotting
The first CI/CD rollout should prove hosted-runner build, environment metadata, deployment automation, and rollback on the current topology before a slot-aware infrastructure refactor is introduced.
Database restore later
Do not add a standalone database-refresh workflow in this phase. If restoration is needed later, call the restore tool script as part of the broader operational flow after the main create-environment, build, and deploy pipeline is already implemented.
Blob ownership split
Keep bootstrap installer blobs and CI build dependency bundles as separate logical artifact classes, even if they share a storage account. Bootstrap blobs serve VM preparation. CI dependency bundles serve build.yaml and should be versioned and validated independently.
Implementation Flow
The AI backlog works best when it follows dependency order instead of workflow file order. This keeps later tasks from guessing fields, paths, or validation names.
Lock the JSON shapes for environment metadata, release manifests, validation results, deployment history, and rollback references before writing the workflows.
Make Terraform emit the current VM names, IPs, health URLs, IIS identifiers, worker service names, storage locations, and SQL details needed by deploy.
Separate CI dependency bundles from bootstrap installers, then add checksum, decryption, and extraction support for the Windows runner.
Publish the agreed metadata artifact in a stable shape so downstream workflows can consume it directly.
Produce versioned, machine-neutral artifacts and release manifests using the hosted runner path.
Write the remote deployment scripts first, then wire deploy.yaml to use metadata-driven validation and explicit rollback behavior.
Variables And Secrets You Should Expect
These are the main environment values needed for the GitHub Actions path based on the current repository inputs and the near-term workflow contract.
Azure identity and Terraform
Subscription, tenant, client identity, backend state settings, SQL admin credentials, VM admin credentials, and Terraform environment values.
Dependency security
Blob account and container details, dependency bundle name and SHA256, and AES decryption material stored as protected secrets.
Environment metadata
Current VM names, addresses, health URLs, IIS site and app-pool names, worker service names, storage outputs, SQL details, and release version information passed across workflows.
Metadata contract first
The environment metadata artifact should be treated as a stable schema, not an ad hoc set of outputs. Deploy validation should read its health URLs, IIS identifiers, and worker service names from that contract instead of hardcoding them.
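One way to picture the contract is as a required-field check that runs before deploy reads anything from the artifact. The sketch below uses only the standard library and is not the JSON schema validation the workflows perform. The field names are hypothetical placeholders for the health URLs, IIS identifiers, and worker service names mentioned above.

```python
# Hypothetical field names; the real schema is not shown in this guide.
REQUIRED_FIELDS = {
    "web_health_url": str,
    "iis_site_name": str,
    "worker_service_names": list,
}


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; empty means the shape is OK."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in metadata:
            problems.append(f"missing field: {field}")
        elif not isinstance(metadata[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems
```

Collecting every problem instead of stopping at the first one gives a single, complete error message when the Terraform outputs drift from the contract.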
Validation Gates
These are the minimum checks the automated path should pass before the deployment can be considered successful.
- Web health endpoint from environment metadata responds after web deployment.
- Smoke URLs load with the expected status and content behavior.
- Configured IIS site and app pools are started after deployment.
- Worker services from environment metadata are running and stable after restart.
- Database connectivity succeeds after the worker deployment.
- Artifact and dependency checksums match the release manifest.
- The deployed version is recorded for rollback reference.
- Rollback fails explicitly if the previous known-good artifact is missing or corrupt.
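The database connectivity gate uses retry logic elsewhere described as 3×15s attempts with a 5s delay. A minimal sketch of that pattern follows, reading those figures as three attempts, a 15-second timeout per attempt, and a 5-second pause between attempts. That reading is an assumption, as is the function name. The `connect` and `sleep` parameters exist only so the logic can be exercised without a real network.

```python
import socket
import time


def tcp_reachable(host, port, attempts=3, timeout=15.0, delay=5.0,
                  connect=socket.create_connection, sleep=time.sleep) -> bool:
    """Try a TCP connection up to `attempts` times, pausing between failures."""
    for attempt in range(1, attempts + 1):
        try:
            # The connection is closed immediately; only reachability matters.
            with connect((host, port), timeout=timeout):
                return True
        except OSError:
            if attempt < attempts:
                sleep(delay)  # no pause after the final failed attempt
    return False
```

With these defaults, a fully unreachable target takes at most about 55 seconds to be reported as failed (three 15-second timeouts plus two 5-second pauses).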
Recommended Rollout Phases
The safest implementation path is phased. That keeps the current environment contract stable before the automation becomes more ambitious.
Align Terraform outputs and metadata to the existing web, worker, and build VM topology.
Convert the current Windows build scripts into a versioned GitHub Actions build workflow.
Deploy web first, validate, deploy worker next, validate again, and record rollback-ready release history.
Keep the same workflow shape and only change runner selection when the build VM is ready to act as a self-hosted runner.
Add slot-aware Terraform, routing, inactive-target deployment, and traffic-based rollback only after the first CI/CD contract is already stable.
Delivery Checklist
This is the website summary of what must exist before the first CI/CD implementation can be called complete.
- The shared metadata schema is defined before workflow implementation starts.
- Terraform outputs are usable by workflows on the current topology.
- create-environment.yaml can plan, apply, and destroy safely.
- build.yaml produces versioned web and worker artifacts with checksums.
- deploy.yaml updates the current web and worker VMs with validation gates.
- Rollback can redeploy the previous known-good version.
- Self-hosted parity and slotting remain future phases, not hidden scope in phase 1.