Environment Matrix

PRM-42 staging vs production separation inventory for growqr-backend.

No refactor was performed in this pass.

Current Environment Model

The backend currently uses config.nodeEnv plus many individual env vars. There is no explicit, first-class environment setting with values such as development | staging | production | demo.

Important consequence: local/dev defaults can leak into staging or production unless deployment env vars override every sensitive value.

Current Config Inventory

| Area | Config/env | Current default | Production concern |
| --- | --- | --- | --- |
| Runtime | PORT, LOG_LEVEL, NODE_ENV | 4000, info, development | NODE_ENV is too broad for staging/demo behavior. |
| Database | DATABASE_URL | hardcoded fallback DSN in config.ts | Production should fail fast instead of falling back. |
| Auth | CLERK_SECRET_KEY, CLERK_PUBLISHABLE_KEY | empty | Secret key absence changes auth behavior; publishable key appears underused. |
| Service auth | SERVICE_TOKEN, A2A_ALLOWED_KEY | empty / dev-a2a-key | Dev token fallback must not be accepted in production. |
| Redis events | GROW_EVENTS_REDIS_URL, REDIS_URL, stream/group/consumer names | disabled unless set | Staging/prod need explicit stream, group, and replay policy. |
| Legacy Redis | INTERVIEW_REDIS_URL, ROLEPLAY_REDIS_URL, RESUME_REDIS_URL | fallback to event Redis | Legacy observation should be explicitly enabled per environment. |
| LLM | LLM_PROVIDER, LLM_API_KEY, OPENCODE_API_KEY, LLM_BASE_URL, GROW_AGENT_MODEL, LLM_MODEL | opencode, https://opencode.ai/zen/v1, kimi-k2.6 | Staging/prod should pin provider/model and require API key where features are enabled. |
| Rivet | RIVET_ENDPOINT, RIVET_CLIENT_ENDPOINT | localhost/127.0.0.1 | Docker compose overrides endpoint; production needs internal and public separation. |
| Product services | INTERVIEW_SERVICE_URL, ROLEPLAY_SERVICE_URL, QSCORE_SERVICE_URL, RESUME_SERVICE_URL, USER_SERVICE_URL, MATCHMAKING_SERVICE_URL, SOCIAL_BRANDING_SERVICE_URL | localhost ports | Production should require service URLs or feature-disable explicitly. |
| Public URLs | INTERVIEW_PUBLIC_URL, ROLEPLAY_PUBLIC_URL, RESUME_PUBLIC_URL, WORKFLOWS_DASHBOARD_URL, FRONTEND_ORIGIN | localhost/frontend fallback | Public and internal service URLs need separate semantics. |
| Gitea | GITEA_PUBLIC_URL, GITEA_INTERNAL_URL, GITEA_ADMIN_USER, GITEA_ADMIN_PASSWORD, GITEA_ADMIN_TOKEN, GITEA_ORG_NAME | localhost, growqr-admin, growqr-admin-dev, empty token | Admin password fallback is dev-only. Production should require token/secret. |
| OpenCode | OPENCODE_IMAGE, OPENCODE_IMAGE_VERSION, MIGRATION_VERSION, PROMPT_VERSION, USER_CONTAINER_HOST, USER_DATA_ROOT, USER_PORT_RANGE_* | dev image/version, local paths/ports | Needs staging/prod image tags and storage policy. |
| CORS/admin | FRONTEND_ORIGIN, ADMIN_USER_IDS | localhost / empty | Empty admin list currently allows /workflows/admin/ops to all authenticated users. |
| Agent limits | MAX_AGENT_TOKENS, PROJECTION_AGENT_MODEL, CONVERSATION_ACTOR_MODEL | 4096 / agent model | Model overrides should be pinned by environment. |

Environment-Dependent Code Paths

| File | Behavior |
| --- | --- |
| src/config.ts | Central env parsing with dev defaults for database, tokens, local service URLs, Gitea, OpenCode, Rivet, frontend, and ports. |
| src/auth/clerk.ts | In non-production, A2A_ALLOWED_KEY is accepted as an auth fallback. Clerk client is only created when CLERK_SECRET_KEY exists. |
| src/index.ts | Proxies /api/rivet only when process.env.RIVET_ENDPOINT is set. Starts Redis consumer opportunistically. CORS uses FRONTEND_ORIGIN. |
| src/events/redis-consumer.ts | Canonical consumer disabled if no Redis URL. Legacy observers enabled by legacy Redis URLs. |
| src/events/projectors/projection-agent.ts | Falls back if no LLM API key; model can be overridden by PROJECTION_AGENT_MODEL. |
| src/actors/conversation/agent.ts | Requires LLM key for streaming; model can be overridden by CONVERSATION_ACTOR_MODEL. |
| src/routes/events.ts | Service ingest auth allows no service token in non-production. |
| src/routes/home.ts | Exposes demo seeding route. |
| src/home/seed-demo-home.ts | Demo notifications and executable direct script behavior. |
| src/services/service-agents.ts | Synthetic/demo fallbacks for some unavailable services and Q Score estimate behavior. |
| src/docker/manager.ts | Uses Gitea/OpenCode image/version/host/path/port config and mutates Docker runtime. |
| scripts/rivet-actors.ts | Uses dev Rivet namespace/token defaults. |
| docker-compose.yml | Dev compose defaults for Postgres, Gitea, Rivet, backend, services, frontend origins, and OpenCode image. |
| docker/opencode/* | Dev-oriented OpenCode image/template behavior. |

Hardcoded URL and Default Hotspots

  • http://localhost:* defaults in src/config.ts, .env.example, README.md, and docker-compose.yml.
  • http://127.0.0.1:* defaults for Rivet client, Gitea, and user container host.
  • http://host.docker.internal:* compose service defaults.
  • OpenCode base image ghcr.io/anomalyco/opencode:latest in docker/opencode/Dockerfile.
  • Dev image tag growqr/opencode:dev.
  • Gitea admin defaults growqr-admin / growqr-admin-dev.
  • A2A fallback dev-a2a-key.
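One way to catch these URL hotspots at startup is a small check that flags any configured value pointing at a loopback or Docker-host address. The sketch below is hypothetical: isLocalDefaultUrl and findLocalUrlVars are illustrative names, not existing backend helpers, and the hostname list covers only the patterns listed above.

```typescript
// Hypothetical helper names; nothing here exists in the reviewed source.
const LOCAL_HOSTNAMES = new Set(["localhost", "host.docker.internal", "[::1]"]);

// Returns true when the value parses as an absolute URL whose host is
// loopback or the Docker host alias.
function isLocalDefaultUrl(value: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(value).hostname.toLowerCase();
  } catch {
    // Not an absolute URL (for example a token or a port number).
    return false;
  }
  return LOCAL_HOSTNAMES.has(hostname) || /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(hostname);
}

// Lists the names of env vars whose values look like local-only URLs.
function findLocalUrlVars(env: Record<string, string | undefined>): string[] {
  return Object.keys(env).filter((name) => {
    const value = env[name];
    return value !== undefined && isLocalDefaultUrl(value);
  });
}
```

A startup check could call findLocalUrlVars on the resolved config and refuse to boot outside development when the result is non-empty. Non-URL defaults such as dev-a2a-key or growqr-admin-dev would need a separate deny-list.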

Clerk / JWKS Assumptions

The code uses the Clerk SDK with CLERK_SECRET_KEY; there is no explicit JWKS URL configuration in the reviewed backend source. Service-to-service auth is token-based, with dev fallback behavior. The target production setup should document which of the following mechanisms apply:

  • Clerk session token verification for user requests.
  • SERVICE_TOKEN for service-to-backend event ingestion.
  • Separate internal A2A key for legacy product service calls.
  • Optional JWKS validation if services send JWTs instead of opaque service tokens.

Target Config Model

Introduce:

type RuntimeEnvironment = "development" | "test" | "staging" | "demo" | "production";

Recommended top-level config shape:

config.environment
config.isProduction
config.isStaging
config.isDemo
config.features.demoDataEnabled
config.features.legacyRedisObserversEnabled
config.features.opencodeProvisioningEnabled
config.features.serviceProxyEnabled
config.urls.internal.*
config.urls.public.*
config.auth.*
config.retry.*
config.events.*
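As a minimal sketch of how config.environment and the boolean flags could be derived, assuming a new APP_ENV variable (the name is not final, and parseRuntimeEnvironment and environmentFlags are hypothetical helpers):

```typescript
// Hypothetical sketch: APP_ENV and both helper names are assumptions.
type RuntimeEnvironment = "development" | "test" | "staging" | "demo" | "production";

const RUNTIME_ENVIRONMENTS: readonly RuntimeEnvironment[] = [
  "development",
  "test",
  "staging",
  "demo",
  "production",
];

type Env = Record<string, string | undefined>;

// Resolves the runtime environment from APP_ENV. Unknown values throw
// instead of silently falling back to development.
function parseRuntimeEnvironment(env: Env): RuntimeEnvironment {
  const raw = (env["APP_ENV"] ?? "").trim().toLowerCase();
  if (raw === "") {
    // Assumed policy: an unset APP_ENV is only acceptable for local work.
    if (env["NODE_ENV"] === "production") {
      throw new Error("APP_ENV must be set when NODE_ENV=production");
    }
    return "development";
  }
  const match = RUNTIME_ENVIRONMENTS.find((candidate) => candidate === raw);
  if (match === undefined) {
    throw new Error(`Unknown APP_ENV: ${raw}`);
  }
  return match;
}

// Derives the convenience booleans from the single source of truth.
function environmentFlags(environment: RuntimeEnvironment) {
  return {
    environment,
    isProduction: environment === "production",
    isStaging: environment === "staging",
    isDemo: environment === "demo",
  };
}
```

Taking the env map as a parameter rather than reading process.env directly keeps the parser easy to test with fixed inputs.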

Rules:

  • Production must fail fast for missing DATABASE_URL, CLERK_SECRET_KEY, SERVICE_TOKEN, FRONTEND_ORIGIN, Gitea credentials/token, and any enabled service URL.
  • Staging may use staging service URLs and demo data only when DEMO_DATA_ENABLED=true.
  • Development may keep local defaults.
  • Demo behavior should be impossible in production unless an explicit, audited flag is set and the route remains auth/admin-gated.
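The production fail-fast rule could be sketched roughly as follows. The function names are hypothetical, and GITEA_ADMIN_TOKEN stands in for "Gitea credentials/token", which the rule leaves open; enabled service URLs are passed in by the caller because the feature-flag mechanism is not yet defined.

```typescript
// Hypothetical sketch; helper names and the exact Gitea variable are assumptions.
type Env = Record<string, string | undefined>;

const REQUIRED_IN_PRODUCTION = [
  "DATABASE_URL",
  "CLERK_SECRET_KEY",
  "SERVICE_TOKEN",
  "FRONTEND_ORIGIN",
  "GITEA_ADMIN_TOKEN",
] as const;

// Returns the names of required variables that are unset or blank.
function missingProductionVars(env: Env, enabledServiceUrlVars: readonly string[] = []): string[] {
  const required: string[] = [...REQUIRED_IN_PRODUCTION, ...enabledServiceUrlVars];
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws one error listing every missing variable, so a deploy fails
// before serving traffic with dev defaults.
function assertProductionConfig(env: Env, enabledServiceUrlVars: readonly string[] = []): void {
  const missing = missingProductionVars(env, enabledServiceUrlVars);
  if (missing.length > 0) {
    throw new Error(`Missing required production env vars: ${missing.join(", ")}`);
  }
}
```

Reporting all missing names in a single error avoids fixing one variable per failed deploy.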

What Should Move to src/staging

Proposed src/staging candidates:

  • home/seed-demo-home.ts
  • /home/seed-demo route handler
  • demo notification factories
  • demo Q Score formulas/fallback constants in service-agent behavior, if not product-approved
  • local-only service session scaffolding helpers
  • any future seeders/backfills used only for demos

Suggested layout:

src/staging/
  demo-home.ts
  demo-qscore.ts
  seed-routes.ts
  guards.ts

src/staging/guards.ts should expose requireStagingOrDemo(config) and fail closed in production.
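A minimal sketch of such a guard is shown below. The config shape is a reduced, assumed subset of the target model, and the choice to let development through without the flag follows the "Allowed" entry in the environment matrix rather than any existing code.

```typescript
// Hypothetical sketch of src/staging/guards.ts; the config shape is assumed.
type RuntimeEnvironment = "development" | "test" | "staging" | "demo" | "production";

interface GuardConfig {
  environment: RuntimeEnvironment;
  features: { demoDataEnabled: boolean };
}

// Throws unless demo/staging-only code is permitted to run.
// Written as an allow-list so production, test, and any future
// environment are rejected by default (fail closed).
function requireStagingOrDemo(config: GuardConfig): void {
  const { environment } = config;
  if (environment === "development") {
    return; // Local development keeps demo tooling available.
  }
  if ((environment === "staging" || environment === "demo") && config.features.demoDataEnabled) {
    return;
  }
  throw new Error(`Demo/staging code is not permitted in environment "${environment}"`);
}
```

The admin check named in the matrix would sit in route middleware on top of this guard; it is left out here because it depends on the request's authenticated user.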

Target Environment Matrix

| Behavior | Development | Staging | Demo | Production |
| --- | --- | --- | --- | --- |
| Localhost defaults | Allowed | Not allowed | Not allowed unless local demo | Not allowed |
| Demo seed endpoints | Allowed | Explicit flag + admin | Enabled by flag + admin | Disabled |
| Service token fallback | Allowed | Not allowed | Not allowed | Not allowed |
| Legacy Redis observers | Optional | Explicit flag | Explicit flag | Disable unless migration requires |
| Redis canonical events | Optional | Required for event demos | Required | Required |
| OpenCode image | :dev ok | pinned staging tag | pinned demo tag | pinned release tag |
| Admin ops route | Authenticated maybe ok | ADMIN_USER_IDS required | ADMIN_USER_IDS required | ADMIN_USER_IDS required |
| Missing Clerk secret | Allowed only for local mock if implemented | Fail | Fail | Fail |
| Gitea admin password default | Allowed | Fail | Fail | Fail |
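Part of this matrix could be encoded as typed data so that code paths consult one table instead of scattering environment checks. The sketch below covers only the rows that reduce cleanly to booleans or a small enum; the type and field names are hypothetical, and the "unless local demo" and "maybe ok" qualifiers are flattened to their stricter or literal reading as noted in the comments.

```typescript
// Hypothetical encoding of selected matrix rows; all names are assumptions.
type MatrixEnvironment = "development" | "staging" | "demo" | "production";
type DemoSeedPolicy = "allowed" | "flag-and-admin" | "disabled";

interface EnvironmentPolicy {
  localhostDefaultsAllowed: boolean;
  demoSeedEndpoints: DemoSeedPolicy;
  serviceTokenFallbackAllowed: boolean;
  adminUserIdsRequired: boolean;
  giteaDefaultPasswordAllowed: boolean;
}

const ENVIRONMENT_POLICY: Record<MatrixEnvironment, EnvironmentPolicy> = {
  development: {
    localhostDefaultsAllowed: true,
    demoSeedEndpoints: "allowed",
    serviceTokenFallbackAllowed: true,
    adminUserIdsRequired: false, // matrix says "Authenticated maybe ok"
    giteaDefaultPasswordAllowed: true,
  },
  staging: {
    localhostDefaultsAllowed: false,
    demoSeedEndpoints: "flag-and-admin",
    serviceTokenFallbackAllowed: false,
    adminUserIdsRequired: true,
    giteaDefaultPasswordAllowed: false,
  },
  demo: {
    localhostDefaultsAllowed: false, // "unless local demo" is not modeled here
    demoSeedEndpoints: "flag-and-admin",
    serviceTokenFallbackAllowed: false,
    adminUserIdsRequired: true,
    giteaDefaultPasswordAllowed: false,
  },
  production: {
    localhostDefaultsAllowed: false,
    demoSeedEndpoints: "disabled",
    serviceTokenFallbackAllowed: false,
    adminUserIdsRequired: true,
    giteaDefaultPasswordAllowed: false,
  },
};

// Looks up the policy row for one environment.
function policyFor(environment: MatrixEnvironment): EnvironmentPolicy {
  return ENVIRONMENT_POLICY[environment];
}
```

Because the record is keyed by the environment union, adding a new environment to the type produces a compile error until its policy row is filled in.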

Priority Recommendations

  1. Add APP_ENV or GROWQR_ENV and derive config.environment; stop relying on NODE_ENV for product behavior.
  2. Fail fast in staging/production for missing secrets and localhost/default service URLs.
  3. Move demo seed code into src/staging and guard routes with DEMO_DATA_ENABLED plus admin check.
  4. Require ADMIN_USER_IDS before enabling /workflows/admin/ops outside development.
  5. Split public URLs and internal URLs in config names consistently across frontend, services, Gitea, Rivet, and OpenCode.
  6. Add a deployment checklist that records every required env var per environment.
  7. Make legacy Redis observers an explicit feature flag and set a removal date.
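Recommendation 4 could look roughly like the following. This is a sketch under assumptions: ADMIN_USER_IDS is treated as a comma-separated list, which the reviewed config may or may not match, and parseAdminUserIds and canAccessAdminOps are hypothetical names.

```typescript
// Hypothetical sketch; the comma-separated format is an assumption.
type RuntimeEnvironment = "development" | "test" | "staging" | "demo" | "production";

// Splits the raw env value into a set of trimmed, non-empty user IDs.
function parseAdminUserIds(raw: string | undefined): Set<string> {
  const ids = (raw ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id !== "");
  return new Set(ids);
}

// Decides whether a user may reach /workflows/admin/ops.
// An empty admin list grants access only in development; everywhere
// else it denies access instead of allowing all authenticated users.
function canAccessAdminOps(
  environment: RuntimeEnvironment,
  adminUserIds: ReadonlySet<string>,
  userId: string,
): boolean {
  if (adminUserIds.size > 0) {
    return adminUserIds.has(userId);
  }
  return environment === "development";
}
```

This inverts the current behavior described in the inventory, where an empty list opens the route to every authenticated user.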