
Platform Runbook

This is the authoritative command reference for operating the CTMS platform. All operational commands are documented here — other pages link back to this runbook for the canonical versions.

Compose Command Shorthand

Throughout this page, DC refers to the canonical production compose command:

DC="docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.production"

All commands assume you are in the /opt/ctms-deployment (or ctms.devops) directory.
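Because $DC is expanded unquoted, it relies on shell word splitting. An optional convenience (an assumption, not shipped with the bundle) is to wrap the same flags in a shell function, which avoids quoting pitfalls and works in both bash and zsh:

```shell
# Hypothetical helper: same flags as the canonical DC string, as a function.
dc() {
  docker compose -f docker-compose.yml -f docker-compose.prod.yml \
    --env-file .env.production "$@"
}

# Usage: dc up -d        dc logs -f zynexa        dc ps
```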


Quick Reference

Task                           Command
Start core services            $DC up -d
Start all services             $DC --profile all up -d
Stop all services              docker compose --profile all down
View service status            docker compose --profile all ps
Health check                   curl per service (see Health Checks)
View all logs                  docker compose --profile all logs -f
Force-recreate all services    $DC up -d --force-recreate
Pull latest images             docker compose pull
Pull + restart (update)        ./zynctl.sh update
Seed demo users                ./zynctl.sh seed-users
Refresh Frappe token           ./zynctl.sh refresh-token
Fix server after snapshot      SERVER_HOST=<new-ip> ./zynctl.sh post-snapshot

zynctl.sh Commands (Bundle Deployments)

If you deployed via zynctl.sh full-deploy, these shorthand commands are available:

Command                       Description
./zynctl.sh status            Show running containers
./zynctl.sh health            Check all service endpoints
./zynctl.sh seed-users        Seed/re-seed demo users (idempotent)
./zynctl.sh refresh-token     Re-generate Frappe API token + patch .env.production + recreate KrakenD
./zynctl.sh logs [service]    View container logs
./zynctl.sh stop              Stop all stacks (reverse order)
./zynctl.sh restart           Restart CTMS core services
./zynctl.sh update            Pull latest images + restart
./zynctl.sh resume-deploy     Resume deployment from Step 7 (Supabase + Frappe already running)
./zynctl.sh post-snapshot     Fix IPs, tokens, and ports after creating a server from a Hetzner snapshot
./zynctl.sh env-check         Validate .env.production for placeholders
./zynctl.sh info              Show version, config, and system info
./zynctl.sh destroy           Remove everything including data (irreversible)

1. Service Management (ctms.devops)

Starting Services

Task                                                Command
Core only (Caddy, KrakenD, Zynexa, Sublink, ODM)    $DC up -d
Core + Observability                                $DC --profile core --profile observability up -d
Core + Analytics                                    $DC --profile core --profile analytics up -d
All services                                        $DC --profile all up -d

Stopping & Restarting

Task                                                 Command
Stop all                                             docker compose --profile all down
Restart all (quick process bounce)                   $DC restart
Force-recreate all (picks up env + image changes)    $DC up -d --force-recreate
Pull latest images + restart                         ./zynctl.sh update
Update to new bundle version                         Extract new bundle → SERVER_HOST=<ip> ./zynctl.sh deploy
Stop + remove volumes (destructive)                  docker compose --profile all down -v --remove-orphans

Instance-Specific Deployments

# Deploy for a specific instance
docker compose -f docker-compose.yml -f docker-compose.prod.yml \
--env-file .env.<instance>.prod --profile all up -d

# Examples
docker compose -f docker-compose.yml -f docker-compose.prod.yml \
--env-file .env.zynomi.prod --profile all up -d

Individual Service Management

restart vs up -d — Know the Difference

docker compose restart <service> only restarts the container process — it does NOT re-read .env.production or compose file changes. If you changed an environment variable (e.g., CUBEJS_DEV_MODE), you must use up -d to recreate the container with the new config:

# ❌ WRONG — env changes are NOT picked up
docker compose restart cube

# ✅ CORRECT — recreates container with latest env
docker compose -f docker-compose.yml -f docker-compose.prod.yml \
--env-file .env.production up -d cube

Rule of thumb: Use restart only for a quick process bounce. Use up -d after any config change.
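One way to confirm a change was actually applied is to compare the value in .env.production with what the running container sees. This helper is a sketch (the service and variable names are examples from this page, not part of the bundle):

```shell
# Hypothetical check: is the value in the env file what the container runs with?
# A mismatch means the container was only restarted, not recreated.
check_env() {
  local service="$1" var="$2" env_file="${3:-.env.production}"
  local expected actual
  expected=$(grep "^${var}=" "$env_file" | cut -d= -f2-)
  actual=$(docker compose -f docker-compose.yml -f docker-compose.prod.yml \
    --env-file "$env_file" exec -T "$service" printenv "$var")
  if [ "$expected" = "$actual" ]; then
    echo "OK: $var applied in $service"
  else
    echo "STALE: file has '$expected', container has '$actual'. Run: up -d $service"
  fi
}

# Usage: check_env cube CUBEJS_DEV_MODE
```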

The canonical compose command for production (IP-based, before DNS) is:

DC="docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.production"

Task                     Start / Recreate (picks up env changes)                          Quick Restart (no env reload)
API Gateway              $DC up -d api-gateway                                            $DC restart api-gateway
Zynexa                   $DC up -d zynexa                                                 $DC restart zynexa
Sublink                  $DC up -d sublink                                                $DC restart sublink
Caddy (reload config)    $DC exec caddy caddy reload --config /etc/caddy/Caddyfile       (config reload; no restart needed)
OpenObserve + OTEL       $DC --profile observability up -d openobserve otel-collector    $DC restart openobserve otel-collector
Cube.dev                 $DC --profile analytics up -d cube                               $DC restart cube
MCP Server               $DC up -d mcp-server                                             $DC restart mcp-server
ODM API                  $DC up -d odm-api                                                $DC restart odm-api

Common Scenarios

# Set the canonical compose command
DC="docker compose -f docker-compose.yml -f docker-compose.prod.yml --env-file .env.production"

# Scenario: Changed CUBEJS_DEV_MODE in .env.production
# Only 'cube' needs recreating — cubestore is unused when DEV_MODE=true
$DC --profile analytics up -d cube

# Scenario: Changed NEXTAUTH_SECRET or NEXTAUTH_URL
$DC up -d zynexa

# Scenario: Changed FRAPPE_API_TOKEN
$DC up -d zynexa mcp-server odm-api

# Scenario: Changed EC2_PUBLIC_IP (affects RUNTIME_* vars in prod overlay)
$DC --profile all up -d

# Scenario: Quick restart after OOM or crash (no config changes)
$DC restart zynexa

2. Logs

Task              Command
All services      docker compose --profile all logs -f
Caddy             docker compose logs -f caddy
API Gateway       docker compose logs -f api-gateway
Zynexa            docker compose logs -f zynexa
Sublink           docker compose logs -f sublink
OpenObserve       docker compose logs -f openobserve
OTEL Collector    docker compose logs -f otel-collector
Cube.dev          docker compose logs -f cube

Useful Log Filters

# Last 100 lines of a service
docker compose logs --tail 100 zynexa

# Logs since a specific time
docker compose logs --since 1h zynexa

# Logs for multiple services at once
docker compose logs -f zynexa api-gateway caddy
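When triaging, it often helps to surface only error-level lines. A minimal sketch; the pattern and service name are starting-point assumptions, not an exhaustive match list:

```shell
# Keep the error pattern in a variable so the same filter can be reused
# across services and scripts.
ERR_PATTERN='error|fatal|panic|traceback'

# Show only matching lines from the last 500 log lines of one service.
docker compose logs --tail 500 zynexa 2>&1 | grep -iE "$ERR_PATTERN" || true
```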

3. Shell Access

Container      Command
Caddy          docker compose exec caddy sh
API Gateway    docker compose exec api-gateway sh
Zynexa         docker compose exec zynexa sh
Sublink        docker compose exec sublink sh
OpenObserve    docker compose exec openobserve sh
Cube.dev       docker compose exec cube sh

4. Health Checks

Manual Health Checks

Service        Command                                     Expected
Caddy          curl -s https://zynexa.localhost            200
API Gateway    curl -s https://api.localhost/__health      {"status":"ok"}
Zynexa         curl -s http://localhost:3000/api/health    200
Sublink        curl -s http://localhost:3001               200
Cube.dev       curl -s http://localhost:4000/readyz        {"health":"HEALTH"}
OpenObserve    curl -s http://localhost:5080/healthz       200
ODM API        curl -s http://localhost:8000/health        200
MCP Server     curl -s http://localhost:8006/health        200
Grafana        curl -s http://localhost:3100/api/health    {"database":"ok"}
Elementary     curl -s http://localhost:3200/health        200

Docker Health Status

# Check container health status
docker compose --profile all ps --format "table {{.Name}}\t{{.Status}}"

# Check a specific service
docker inspect --format='{{.State.Health.Status}}' ctms-zynexa
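Scripts that depend on a service being up (post-deploy smoke tests, pipelines) can poll the same health status. A hypothetical helper, not part of zynctl.sh; the container name is an example from this stack:

```shell
# Block until a container reports healthy, or give up after a timeout (seconds).
wait_healthy() {
  local name="$1" timeout="${2:-120}" elapsed=0 status
  while [ "$elapsed" -lt "$timeout" ]; do
    status=$(docker inspect --format='{{.State.Health.Status}}' "$name" 2>/dev/null)
    [ "$status" = "healthy" ] && return 0
    sleep 5
    elapsed=$((elapsed + 5))
  done
  echo "timed out waiting for $name (last status: ${status:-unknown})" >&2
  return 1
}

# Usage: wait_healthy ctms-zynexa 180 && echo "ready"
```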

5. CTMS Init & Provisioning

Task                     Command
Run all 5 stages         docker compose --env-file .env.production --profile init up ctms-init
Force-pull before run    $DC --profile init pull ctms-init && $DC --profile init up ctms-init
Run specific stages      CTMS_INIT_STAGES=3,4 docker compose --env-file .env.production --profile init run --rm ctms-init
Dry run                  CTMS_INIT_DRY_RUN=true docker compose --env-file .env.production --profile init run --rm ctms-init
Seed Supabase tables     docker compose --env-file .env.production --profile init run --rm ctms-supabase-seed

Stage 4 — Item Groups

Since ctms-init v1.11 (bundle v2.30), Stage 4 seeds "Laboratory" and "Drug" Item Groups before creating Items. This prevents 417 EXPECTATION FAILED errors when the ERPNext setup wizard didn't fully complete its fixture install. Total Stage 4 records: 142 (was 140).

Deep-Dive

For detailed stage descriptions, env variables, selective runs, and log analysis, see Platform Provisioning Commands.

Demo Data Seeding (Optional)

Task                          Command
Seed 4 demo users             ./zynctl.sh seed-users (or $DC --profile init run --rm ctms-user-seed)
Seed 20 synthetic patients    $DC --profile init run --rm ctms-patient-seed
Dry-run patient seed          DRY_RUN=true $DC --profile init run --rm ctms-patient-seed

6. Data Lakehouse Pipeline

The data lakehouse pipeline extracts clinical data from Frappe, loads it into a dedicated PostgreSQL analytics database, and transforms it through medallion architecture layers using dbt.

Zynexa Semantic Layer Architecture

Frappe (REST API) → Ingester (dlt, loads Bronze) → Lakehouse DB (PostgreSQL) ← dbt (Bronze → Silver → Gold; 197 tests)

Prerequisites

Before running the pipeline, ensure:

  1. Frappe is running and populated with clinical data (Studies, Sites, Subjects, Practitioners, etc.)
  2. Lakehouse DB is started and healthy
  3. FRAPPE_API_TOKEN and FRAPPE_URL are correctly set in .env.production

Without sample data in Frappe, the pipeline will produce empty tables and dashboards will have nothing to display.
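The three prerequisites above can be checked in one pass. A hedged sketch: the variable names match .env.production, and /api/method/ping is the standard Frappe liveness endpoint, but confirm both against your deployment:

```shell
# Hypothetical pre-flight check before running the pipeline.
preflight() {
  local env_file="${1:-.env.production}"
  local token url
  token=$(grep '^FRAPPE_API_TOKEN=' "$env_file" | cut -d= -f2-)
  url=$(grep '^FRAPPE_URL=' "$env_file" | cut -d= -f2-)
  [ -n "$token" ] || { echo "FRAPPE_API_TOKEN is empty" >&2; return 1; }
  [ -n "$url" ]   || { echo "FRAPPE_URL is empty" >&2; return 1; }
  # Ping Frappe with the token; -f makes curl fail on 4xx/5xx responses.
  curl -sf -H "Authorization: token $token" "$url/api/method/ping" >/dev/null \
    || { echo "Frappe not reachable or token invalid" >&2; return 1; }
  echo "pre-flight OK"
}

# Usage: preflight .env.production && echo "safe to run the pipeline"
```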

Start Lakehouse DB

docker compose --env-file .env.production --profile lakehouse up -d lakehouse-db

Ingestion (Bronze Layer)

The ingester uses dlt (data load tool) to extract ~46 Frappe DocTypes via the REST API and load them into the bronze schema as raw tables.

# Run Ingester: Bronze (~46 DocTypes)
docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-ingester

dbt Transformation (Silver + Gold)

dbt transforms the raw bronze data through two additional layers:

Layer     Schema    Tables    Purpose
Bronze    bronze    ~63       Raw Frappe data (1:1 mirror of API responses)
Silver    silver    ~7        Cleaned, deduplicated, joined clinical entities
Gold      gold      ~28       Aggregated, analytics-ready tables for Cube.dev dashboards

# Run dbt: Silver + Gold (bronze ~63, silver ~7, gold ~28 tables)
docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt daily

The daily command runs: dbt deps → dbt build → dbt run --select elementary.

Expected output: PASS=197 WARN=5 ERROR=0

Verify Table Counts

docker exec ctms-lakehouse-db psql -U ctms_user -d ctms_dlh -c "
SELECT schemaname AS schema, COUNT(*) AS tables
FROM pg_tables
WHERE schemaname IN ('bronze', 'silver', 'gold')
GROUP BY schemaname ORDER BY schemaname;
"

Full Refresh

Use when schema changes require rebuilding all models from scratch:

docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt dbt-full-refresh

Schedule with Cron

Automate daily pipeline runs (e.g., at 2 AM):

0 2 * * * cd /opt/ctms-deployment && \
docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-ingester && \
docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt daily \
>> /var/log/ctms-pipeline.log 2>&1
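If ingestion ever runs long (or you schedule more frequently), consecutive cron fires can overlap. A variant using flock from util-linux (an assumption: flock must be installed) skips a run while the previous one still holds the lock:

```shell
# Crontab entry (single line): -n makes flock give up immediately instead of
# queueing behind a still-running pipeline.
0 2 * * * cd /opt/ctms-deployment && flock -n /tmp/ctms-pipeline.lock -c "docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-ingester && docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt daily" >> /var/log/ctms-pipeline.log 2>&1
```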

Purge Lakehouse Schemas (Reset)

Destructive Operation

This drops all pipeline data. Re-run the ingester + dbt after purging.

docker exec ctms-lakehouse-db psql -U ctms_user -d ctms_dlh -c "
DROP SCHEMA IF EXISTS raw CASCADE;
DROP SCHEMA IF EXISTS raw_staging CASCADE;
DROP SCHEMA IF EXISTS bronze CASCADE;
DROP SCHEMA IF EXISTS bronze_staging CASCADE;
DROP SCHEMA IF EXISTS silver CASCADE;
DROP SCHEMA IF EXISTS gold CASCADE;
"

Docker Commands Quick Reference

Task                      Command
Start Lakehouse DB        docker compose --env-file .env.production --profile lakehouse up -d lakehouse-db
Run ingestion (Bronze)    docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-ingester
Run dbt daily             docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt daily
Run dbt full refresh      docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt dbt-full-refresh
Purge all schemas         See "Purge Lakehouse Schemas" above
Restart Cube.dev cache    $DC --profile analytics restart cube

Deep-Dive Documentation

For detailed configuration and troubleshooting of each stage, see the per-stage deep-dive pages.


7. Vendor Stacks (Self-Hosted)

Supabase

Task      Command
Start     cd supabase && docker compose up -d
Stop      cd supabase && docker compose down
Status    cd supabase && docker compose ps
Logs      cd supabase && docker compose logs -f

Frappe

Task                    Command
Start                   cd frappe-marley-health && docker compose up -d
Stop                    cd frappe-marley-health && docker compose down
Status                  cd frappe-marley-health && docker compose ps
Get API token           docker logs frappe-marley-health-setup-1 2>&1 | grep FRAPPE_API_TOKEN
Regenerate API token    docker exec -w /home/frappe/frappe-bench frappe-marley-health-backend-1 bash -c 'source env/bin/activate && python3 /setup/frappe-generate-token.py'
Frappe shell (bench)    docker exec -it frappe-marley-health-backend-1 bash

⚠️ After Restarting Frappe Services

Restarting the Frappe Docker stack (e.g., cd frappe-marley-health && docker compose restart or docker compose up -d) can cause the setup init container to re-run and regenerate the Administrator api_secret. The api_key remains unchanged, so it looks identical in the tabUser table — but the secret silently rotates, breaking all KrakenD → Frappe authentication with 401 AuthenticationError.

After every Frappe restart, run:

cd /opt/ctms-deployment
./zynctl.sh refresh-token

This single command will:

  1. ✅ Test the current token against Frappe
  2. 🔄 Regenerate if invalid (via the built-in helper script)
  3. 📝 Patch .env.production with the new token
  4. 🐳 Force-recreate the KrakenD API gateway container
  5. ✅ Verify both direct Frappe auth and the gateway

Get the Frappe API Token

To retrieve the current token from the production environment file:

TOKEN=$(grep "^FRAPPE_API_TOKEN=" /opt/ctms-deployment/.env.production | cut -d= -f2)
echo "Token: ${TOKEN}"

This prints the full key:secret pair needed for Frappe API authentication. If empty, run ./zynctl.sh refresh-token to regenerate it.
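To verify the extracted token actually authenticates, you can call Frappe's standard get_logged_user endpoint directly. This sketch assumes Frappe is published on localhost:8080 (as in the URLs Reference section); adjust if you go through the gateway instead:

```shell
# Read the token from the env file and probe Frappe with it.
TOKEN=$(grep "^FRAPPE_API_TOKEN=" /opt/ctms-deployment/.env.production 2>/dev/null | cut -d= -f2-)
curl -s -H "Authorization: token ${TOKEN}" \
  "http://localhost:8080/api/method/frappe.auth.get_logged_user" \
  || echo "request failed (is Frappe running?)"
# A valid token should return a JSON body naming the user; an invalid or
# silently rotated secret returns a 401 AuthenticationError instead.
```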


8. Web Application (ctms-web)

Development

Task                       Command
Install dependencies       bun install
Start dev server           bun run dev
Build for production       bun run build
Start production server    bun run start
Lint code                  bun run lint

API Generation

Task                         Command
Generate entity APIs         bun run generate:apis
Generate OpenAPI spec        bun run openapi:generate
Validate OpenAPI spec        bun run openapi:validate
Export Postman collection    bun run openapi:postman

9. Common Workflows

Full Platform Start (On-Prem)

cd ctms.devops

# 1. Start Supabase (creates ctms-network)
cd supabase && docker compose up -d && cd ..

# 2. Seed CTMS tables into Supabase
docker compose --env-file .env.local --profile init run --rm ctms-supabase-seed

# 3. Start Frappe (setup auto-completes wizard + generates API token)
cd frappe-marley-health && docker compose up -d && cd ..

# 4. Retrieve the generated API token → update .env.local
docker logs frappe-marley-health-setup-1 2>&1 | grep FRAPPE_API_TOKEN

# 5. Provision Frappe with CTMS data model (5 stages)
docker compose --env-file .env.local --profile init up ctms-init

# 6. Start CTMS core services
docker compose --env-file .env.local up -d

# 7. (Optional) Start analytics + observability
docker compose --env-file .env.local --profile all up -d

Daily Data Refresh

cd ctms.devops

# 1. Extract data from Frappe API → Lakehouse
docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-ingester

# 2. Transform through dbt layers (Bronze → Silver → Gold)
docker compose --env-file .env.production --profile lakehouse run --rm lakehouse-dbt daily

# 3. Clear Cube.dev analytics cache
docker compose --env-file .env.production --profile analytics restart cube

Full Platform Stop (On-Prem, Reverse Order)

cd ctms.devops

# CTMS core
docker compose --profile all down

# Frappe
cd frappe-marley-health && docker compose down && cd ..

# Supabase (last — owns ctms-network)
cd supabase && docker compose down && cd ..

Clone Server from Snapshot

When creating a new server from a Hetzner snapshot or AWS AMI, run:

SERVER_HOST=<new-server-ip> ./zynctl.sh post-snapshot

This fixes IP addresses, refreshes the Frappe API token, and force-recreates all services. See the full guide: Recipe: Post-Snapshot / AMI Setup.


10. URLs Reference

Core Services

Service        URL
Zynexa         https://zynexa.localhost
Sublink        https://sublink.localhost
API Gateway    https://api.localhost
ODM API        https://odm.localhost

Core + Observability

Service        URL
OpenObserve    https://observe.localhost

Core + Analytics

Service                URL
Cube.dev Playground    https://cube.localhost
MCP Server             https://mcp.localhost
Grafana                http://localhost:3100

Core + Lakehouse

Service               URL
Elementary Reports    http://localhost:3200

Vendor Stacks (Self-Hosted)

Service             URL
Supabase Studio     http://localhost:8000
Frappe Dashboard    http://localhost:8080/app

11. Environment Files

File                    Purpose
.env.example            Template (commit-safe, checked into Git)
.env.production         Default production configuration
.env.<instance>.prod    Instance-specific production (e.g., .env.zynomi.prod)
.env.local              Local development (gitignored)

12. GitHub Releases API

The CTMS bundle is distributed via GitHub Releases. These two API calls let you programmatically discover the latest version and download the bundle — useful for CI/CD pipelines, automated server provisioning, or upgrade scripts.

When to use these
  • Automated deployments: A CI/CD pipeline checks for the latest release, downloads the asset, and deploys.
  • Upgrade scripts: Compare the running VERSION file against the latest tag to decide whether to upgrade.
  • Air-gapped transfers: Download the bundle on a machine with internet, then SCP to a private server.
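The upgrade-script case above can be sketched as a simple version gate: strip the bundle-v prefix from the tag (format taken from the example response later in this section) and compare it with the deployed VERSION using version sort. Fetching the tag is covered below; here it is passed in as an argument:

```shell
# Hypothetical gate: return 0 only when the latest tag is newer than the
# currently deployed version string.
needs_upgrade() {
  local current="$1" latest_tag="$2"
  local latest="${latest_tag#bundle-v}"
  # sort -V picks the highest version; upgrade only if latest is the higher one
  # and differs from what is already running.
  [ "$current" != "$latest" ] && \
    [ "$(printf '%s\n%s\n' "$current" "$latest" | sort -V | tail -n1)" = "$latest" ]
}

if needs_upgrade "2.30.20260101" "bundle-v2.31.20260308"; then
  echo "upgrade available"
fi
```

In a real script, the first argument would come from the bundle's VERSION file and the second from the tag_name field of the latest-release API call.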

Discover the Latest Release

Returns the tag name, release title, and all downloadable assets (with their id for the next step).

GITHUB_TOKEN="<your-github-token>"

curl -sL \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/zynomilabs/ctms.devops/releases/latest" \
| jq '{
tag_name,
name,
assets: [.assets[] | {name, id, size, browser_download_url}]
}'

Example response:

{
"tag_name": "bundle-v2.31.20260308",
"name": "CTMS Install Bundle v2.31.20260308",
"assets": [
{
"name": "zynctl-bundle-2.31.20260308.tar.gz",
"id": 369193224,
"size": 71931,
"browser_download_url": "https://github.com/zynomilabs/ctms.devops/releases/download/bundle-v2.31.20260308/zynctl-bundle-2.31.20260308.tar.gz"
},
{
"name": "zynctl-bundle-2.31.20260308.tar.gz.sha256",
"id": 369193223,
"size": 101,
"browser_download_url": "https://github.com/zynomilabs/ctms.devops/releases/download/bundle-v2.31.20260308/zynctl-bundle-2.31.20260308.tar.gz.sha256"
}
]
}

Key fields:

  • tag_name — the version tag (e.g. bundle-v2.31.20260308)
  • assets[].id — the asset ID needed to download the file via API
  • assets[].size — file size in bytes (useful for progress bars or integrity checks)

Download a Release Asset

Use the id from the previous response to download the bundle. The Accept: application/octet-stream header tells the GitHub API to return the raw binary.

GITHUB_TOKEN="<your-github-token>"
ASSET_ID=369193224 # from the 'id' field above

curl -sL \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/octet-stream" \
"https://api.github.com/repos/zynomilabs/ctms.devops/releases/assets/$ASSET_ID" \
-o /tmp/zynctl-bundle-latest.tar.gz

ls -lh /tmp/zynctl-bundle-latest.tar.gz
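Each release also publishes a .sha256 asset (see the example response above). After downloading both files under their original names (the checksum file references the bundle's exact filename), verify integrity; this assumes the .sha256 asset uses the standard "hash  filename" format that sha256sum -c consumes:

```shell
# Verify the downloaded bundle against its published checksum file.
cd /tmp
sha256sum -c zynctl-bundle-2.31.20260308.tar.gz.sha256 \
  || echo "checksum mismatch or files missing"
```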

Full Example: Auto-Download Latest Bundle

Combine both calls to discover and download in one script:

#!/usr/bin/env bash
# Download the latest CTMS bundle from GitHub Releases
set -euo pipefail

GITHUB_TOKEN="<your-github-token>"
REPO="zynomilabs/ctms.devops"
OUT_DIR="/tmp"

# 1. Get latest release metadata
RELEASE=$(curl -sL \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/repos/$REPO/releases/latest")

TAG=$(echo "$RELEASE" | jq -r '.tag_name')
ASSET_ID=$(echo "$RELEASE" | jq -r '.assets[] | select(.name | endswith(".tar.gz") and (endswith(".sha256") | not)) | .id')
ASSET_NAME=$(echo "$RELEASE" | jq -r '.assets[] | select(.name | endswith(".tar.gz") and (endswith(".sha256") | not)) | .name')

echo "Latest release: $TAG"
echo "Asset: $ASSET_NAME (ID: $ASSET_ID)"

# 2. Download the bundle
curl -sL \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/octet-stream" \
"https://api.github.com/repos/$REPO/releases/assets/$ASSET_ID" \
-o "$OUT_DIR/$ASSET_NAME"

echo "Downloaded: $OUT_DIR/$ASSET_NAME ($(du -h "$OUT_DIR/$ASSET_NAME" | cut -f1))"

Authentication

The GitHub API requires a Personal Access Token with repo scope for private repositories. For public repos, the token is optional but avoids rate limits (60 req/hr unauthenticated vs 5,000 req/hr authenticated).