Pearl Test Environment Central Runbook
Use this page when you need the plain-English, step-by-step operating path. It explains what to do first, what each VM proves, what Terraform has already prepared for you, and how to avoid validating the wrong thing in the wrong order.
When To Use This Runbook
This page is for the person who needs the flow, not just the files. If you are asking what comes first, what can be skipped, or what a successful environment actually looks like, start here.
Use this page if your question sounds like this
What should I do next? Do I need the build VM first? What does the worker VM prove? Why does Terraform success not automatically mean the application works?
- Use the home page when you need orientation and navigation.
- Use this runbook when you need the practical execution order.
- Use the Terraform guide when the issue is provisioning, bootstrap, storage, secrets, or apply behavior.
- Use the detailed reference page when you need the mirrored commands, paths, logs, service names, or script locations without going back to the markdown docs.
What Each VM Is Responsible For
The most common confusion is treating all three VMs as if they prove the same thing. They do not. Each one answers a different operational question.
Web VM
The web VM proves the interactive application tier works: IIS, deployed web content, host bindings, Solr, Memcached, and internal routing.
Worker VM
The worker VM proves the background-processing tier works: Queue Processor, System Checker, AI Spooler, Totem, service hosting, and writable logs.
Build VM
The build VM proves repeatable delivery is possible from inside the test estate: source restore, compilation, packaging, and artifact output.
Important separation
A runtime-ready environment and a proven build host are related, but they are not the same milestone. You can prove the web and worker VMs before the build VM if artifacts are produced elsewhere.
The Tested Order To Follow
This order reflects the tested behavior of the current environment and avoids false failures caused by validating a later phase before an earlier dependency is ready.
- Make sure Terraform apply has completed and the role bootstrap has run on the web, worker, and build VMs. This gives you the platform and guest preparation, not the final application deployment.
- Build artifacts on a trusted Windows build machine if the build VM is not yet your proven build host. Do not confuse placeholder folders on the VM with real deployed application output.
- On the web VM, prove that IIS, content, host bindings, Solr, and Memcached work before you expect worker-side features to behave consistently.
- On the worker VM, prove that services start, Totem responds, logs are writable, and background processors are running from the deployed worker payload.
- Run the shared database and functional checks. This is where you confirm the environment is operational rather than only provisioned: SQL connectivity, session state reachability, queue job execution, Totem flow, system checker status, and AI spooler connectivity.
- After the environment already works, prove the build VM can build and package the same deliverables so it becomes the repeatable delivery machine going forward.
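The ordering above can be treated as a simple gate: a phase may start only once every earlier phase is complete. The following is a minimal sketch of that idea. The phase names are illustrative labels for the six steps, not identifiers used by the Pearl tooling.

```python
from typing import Optional, Set

# Illustrative labels for the six steps above (assumed names, not Pearl identifiers).
PHASES = [
    "provision_and_bootstrap",
    "prepare_artifacts",
    "validate_web",
    "validate_worker",
    "shared_functional_checks",
    "validate_build_vm",
]


def next_phase(completed: Set[str]) -> Optional[str]:
    """Return the first phase not yet completed, or None when all are done."""
    for phase in PHASES:
        if phase not in completed:
            return phase
    return None


def can_start(phase: str, completed: Set[str]) -> bool:
    """A phase may start only when every earlier phase is complete."""
    earlier = PHASES[: PHASES.index(phase)]
    return all(p in completed for p in earlier)
```

For example, `can_start("validate_worker", {"provision_and_bootstrap", "prepare_artifacts"})` is false because the web phase has not been proven yet.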
Before You Touch The VM Validation Guides
The fastest way to waste time is to run validation against half-prepared machines. Complete these checks first.
- Confirm the VM access method: use the tested Bastion and private-connectivity route the team now documents.
- Confirm the expected application drive is `F:` for the prepared application paths on the current test environment.
- Confirm your deployment payloads are real and not just empty bootstrap-created directories.
- Confirm environment settings point to test resources, especially SQL Managed Instance and other service endpoints.
- Confirm you know your purpose, runtime proof or build-host proof, because the checklist order changes based on that.
Why this matters
Terraform and bootstrap create the landing zone. They do not guarantee that the web application, worker services, or build outputs have already been deployed into that landing zone.
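One way to tell a placeholder directory from a deployed payload is to look for at least one non-empty file under it. This is a rough sketch only: the threshold and the idea that "non-empty files" means "real payload" are assumptions, and a stronger check would compare against a known file manifest.

```python
from pathlib import Path


def looks_like_real_payload(root: str, min_files: int = 1) -> bool:
    """Return True if root contains at least min_files non-empty files.

    An empty tree, or one holding only zero-byte files, is treated as a
    bootstrap-created placeholder. The threshold is an assumption.
    """
    base = Path(root)
    if not base.is_dir():
        return False
    count = 0
    for path in base.rglob("*"):
        if path.is_file() and path.stat().st_size > 0:
            count += 1
            if count >= min_files:
                return True
    return False
```

Point it at whichever prepared application path you are about to validate; the actual folder names are in the detailed reference page.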
Web VM Validation Path
Start runtime validation here. The worker tier depends on parts of the web tier, so proving the web VM first reduces noise in later troubleshooting.
| Check area | What success means | What failure usually means |
|---|---|---|
| Application folders | Real web payload is present in the prepared F: paths. | Deployment never happened or copied to the wrong location. |
| IIS sites and bindings | Expected sites load and internal hostnames resolve correctly. | Binding mismatch, host file issue, or missing content. |
| Supporting services | Solr and Memcached are reachable and behaving as expected. | Bootstrap gap, service startup failure, or local firewall issue. |
| Connection settings | Config points to the test SQL MI and other test endpoints. | Old environment settings or incomplete transform work. |
Exit condition for the web phase
You can reach the application through the intended internal path, the correct content is deployed, and the supporting services used by the site are healthy enough to support worker-side testing.
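The "supporting services" row can be spot-checked with a plain TCP connection test. The sketch below assumes the common default ports (8983 for Solr, 11211 for Memcached); the ports configured on the web VM may differ, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults: Solr commonly listens on 8983 and Memcached on 11211.
# Replace with the ports actually configured on the web VM.
SUPPORTING_SERVICES = {"solr": 8983, "memcached": 11211}


def check_supporting_services(host: str = "127.0.0.1") -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {
        name: tcp_reachable(host, port)
        for name, port in SUPPORTING_SERVICES.items()
    }
```

Run it on the web VM itself first, then from the worker VM with the web VM's internal address, to separate a stopped service from a routing or firewall problem.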
Worker VM Validation Path
The worker VM proves background execution. This is where service hosting, Totem behavior, log write permissions, and downstream reachability become important. After VNet peering recreation, Azure VirtualNetwork service tag propagation to the SQL MI NSG can take 15 to 60 seconds; automated validation now includes retry logic to handle this.
- Queue Processor service is present and can run from the deployed worker payload.
- System Checker service can start and record status as expected.
- AI Spooler service can start and reach its configured endpoint path.
- Totem endpoint path can register, respond to ping, and participate in notification flow.
- Local log paths are writable by the service identity and the logs actually change during runtime activity.
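The propagation delay mentioned above is the kind of transient failure that a bounded retry absorbs. The following is a generic sketch of such a retry loop, not the logic used by the team's automated validation. The 90-second timeout and 5-second interval are assumptions chosen to sit just above the 60-second upper bound.

```python
import time
from typing import Callable


def retry_until(check: Callable[[], bool],
                timeout_s: float = 90.0,
                interval_s: float = 5.0,
                sleep: Callable[[float], None] = time.sleep,
                clock: Callable[[], float] = time.monotonic) -> bool:
    """Call check() until it returns True or timeout_s elapses.

    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Any reachability probe that returns a boolean can be passed as `check`, for example a TCP connection attempt against the SQL MI endpoint.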
Do not validate an empty worker
If services exist only as placeholders or point to incomplete binaries, the result tells you almost nothing. Validate only after the real worker payload is deployed and configured.
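Once the real payload is in place, the log check from the list above has two parts: the directory accepts writes, and the log file actually moves while the service is busy. A minimal sketch of both follows. Note the assumption: the write probe runs as whoever executes the script, so it only reflects the service identity if run under that account.

```python
import os
from typing import Tuple


def snapshot(path: str) -> Tuple[int, int]:
    """Return (size_bytes, mtime_ns) for a log file."""
    st = os.stat(path)
    return st.st_size, st.st_mtime_ns


def log_changed(path: str, before: Tuple[int, int]) -> bool:
    """True if the file's size or modification time differs from the snapshot."""
    return snapshot(path) != before


def dir_writable(directory: str) -> bool:
    """Try to create and delete a probe file in directory."""
    probe = os.path.join(directory, ".write_probe.tmp")
    try:
        with open(probe, "w") as fh:
            fh.write("probe")
        os.remove(probe)
        return True
    except OSError:
        return False
```

Take a snapshot, trigger some runtime activity such as a queue job, then call `log_changed` with the earlier snapshot.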
Build VM Validation Path
This is the final proof that future builds can happen inside the test environment itself. It is not the first thing you need when the immediate goal is runtime validation.
| Build VM area | What current tested guidance expects |
|---|---|
| Tools and package cache | Blob-backed installer prerequisites are used for the tested bootstrap path, with the package cache prepared on F:\vs-package-cache. |
| Source and restore tooling | Git, NuGet, PowerShell modules, and related tooling are present for source retrieval and restore steps. |
| Compilation path | MSBuild and ASP.NET compilation steps can produce the expected web and worker outputs. |
| Artifact trust | The build output is good enough to be promoted into the same runtime validation flow already proven elsewhere. |
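A quick first pass over the "source and restore tooling" and "compilation path" rows is to confirm the tools resolve on PATH. This sketch assumes the executables are named `git`, `nuget`, and `msbuild` and that they are on PATH for the current session; on a real build VM, MSBuild in particular may only be available from a developer command prompt.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names are assumptions; adjust to what the build VM actually installs.
EXPECTED_TOOLS = ["git", "nuget", "msbuild"]


def missing_tools(tools: Iterable[str] = EXPECTED_TOOLS,
                  which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the tools that cannot be resolved on PATH, in input order."""
    return [tool for tool in tools if which(tool) is None]
```

An empty result only shows the tools are present. It does not prove that restore, compilation, or packaging succeed.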
Simple decision rule
If your only question is whether the environment works, validate web and worker first. If your question is whether the test environment can also build itself consistently, add the build VM proof after runtime success.
Decision Guide
Use this section when you are uncertain which path applies to your situation.
Skip directly to web deployment and runtime validation. The build VM can wait until later.
Use the Terraform guide first, then return here for the runtime order.
Complete runtime proof first if possible, then use the build VM path to replace your external trusted build host.
GitHub CI/CD Path
The repository has seven GitHub Actions workflows covering environment provisioning, building, production deployment, test environment deployment, and automated cleanup. All deployment uses Azure VM Run Commands (v2); no inbound ports are required. Authentication uses OIDC federated identity.
- Create or confirm infrastructure through `create-environment.yaml` (Terraform plan/apply/destroy with target environment dropdown).
- Deploy to production via `deploy-production.yaml` (manual dispatch only, includes build + approval gate + deploy to both VMs).
- Deploy test environments automatically via `deploy-test.yaml` (triggers on push to feature/env branches, creates isolated IIS site).
- Clean up test environments via `cleanup-test-env.yaml` (triggers on branch delete + daily orphan scan at 03:00 UTC).
- Standalone build via `build.yaml` on GitHub-hosted `windows-2022` runners with encrypted vendor dependencies (`.zop` AES-256).
- Manual deploy/rollback via `deploy.yaml` for targeted VM deployments using existing artifacts.
- After destroy/rebuild cycles the firewall public IP changes; retrieve it with `terraform output firewall_public_ip` and update external access rules.
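If the firewall IP lookup after a rebuild is scripted, it helps to validate the value before pushing it into external access rules. The sketch below wraps the `terraform output` command named above. It assumes it runs from the initialized Terraform working directory, and it uses the `-raw` flag available in recent Terraform releases to get the bare value. The helper names are hypothetical.

```python
import ipaddress
import subprocess
from typing import Callable, List


def _run_terraform(args: List[str]) -> str:
    """Run terraform with the given arguments and return its stdout."""
    result = subprocess.run(["terraform", *args], capture_output=True,
                            text=True, check=True)
    return result.stdout


def firewall_public_ip(run: Callable[[List[str]], str] = _run_terraform) -> str:
    """Read the firewall_public_ip output and validate it as an IP address."""
    raw = run(["output", "-raw", "firewall_public_ip"]).strip()
    # ip_address raises ValueError if the output is empty or malformed.
    return str(ipaddress.ip_address(raw))
```

The `run` parameter exists so the parsing can be exercised without Terraform installed; in normal use, call `firewall_public_ip()` with no arguments.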
Best companion pages
Use the GitHub CI/CD Guide for workflow details, the Multi-Environment Guide for the test environment system, or the Deployment Manual for the complete reference.
Common Failure Patterns
These are the patterns that create the most confusion because they look like application defects but are really environment-state issues.
- IIS looks healthy but the application is wrong: this usually means placeholder directories exist but real web content was not deployed.
- Worker services start and stop unexpectedly: this usually means an incomplete worker payload, bad config, or missing dependencies rather than a Windows-service hosting problem alone.
- Database tests fail from one tier only: this usually means mismatched connection strings or tier-specific reachability issues.
- SCP assumptions fail: Bastion/RDP tunneling is not the same thing as SSH support inside the guest VM.
- Build tools mismatch: this often comes from old documentation assuming a different installer path than the current Blob-backed bootstrap prerequisite layout.
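For the "one tier only" database pattern, comparing the connection strings from the web and worker configs key by key often finds the mismatch faster than reading them side by side. The sketch below is a deliberately simple parser: it assumes `Key=Value;Key=Value` strings without quoted semicolons, and it reports only key names so that secrets are not echoed. The host names in the usage example are hypothetical.

```python
from typing import Dict, List


def parse_connection_string(value: str) -> Dict[str, str]:
    """Split a 'Key=Value;Key=Value' string into a dict with lowercased keys."""
    parts: Dict[str, str] = {}
    for segment in value.split(";"):
        if "=" not in segment:
            continue  # skip empty or malformed segments
        key, _, val = segment.partition("=")
        parts[key.strip().lower()] = val.strip()
    return parts


def differing_keys(web: str, worker: str) -> List[str]:
    """Return the sorted key names whose values differ or exist on one tier only."""
    a = parse_connection_string(web)
    b = parse_connection_string(worker)
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
```

For example, comparing `"Server=sqlmi-test.example;Database=Pearl;Encrypt=True"` with `"server=sqlmi-old.example; Database=Pearl"` reports `encrypt` and `server` as the keys to inspect.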
When To Open The Detailed Reference Library
The runbook tells you the right order. The detailed reference page carries the deeper technical mirror of the internal docs: exact paths, expected service names, bootstrap logs, access methods, and sample command blocks.
Web deep checks
Expected IIS paths, app names, host aliases, Solr and Memcached notes, and web-tier command examples.
Worker deep checks
Expected worker folders, service names, binary names, log paths, Totem checks, and worker-tier validation commands.
Build deep checks
Build-tool locations, package cache paths, bootstrap logs, artifact directories, and build smoke-test guidance.
Best usage pattern
Stay on this runbook for flow decisions, then open the detailed reference page beside it when you need the deeper commands and proof points during execution.
How To Reuse This Runbook For A New Project
This page should remain useful even after Pearl. The reusable pattern is to separate platform readiness, deployment readiness, runtime readiness, and repeatable-build readiness.
- Platform readiness means cloud resources and baseline VM preparation exist.
- Deployment readiness means real application payloads and settings are available.
- Runtime readiness means the app actually works on the provisioned estate.
- Build readiness means the estate can produce its own trustworthy artifacts.
Portable principle
For any future project, build the operator guide around proofs instead of around technologies. That makes the documentation easier for non-technical readers and more durable when implementation details change.
Final Checklist
If you only remember one page from this site, remember this sequence.
- Provision and bootstrap the test environment.
- Prepare trusted web and worker artifacts.
- Validate the web VM first.
- Validate the worker VM second.
- Run shared database and functional checks third.
- Validate the build VM after runtime success.
- Capture repeatable lessons and turn them into automation.