Shipping VibeGuard: Multi-Worker Celery + Playwright on a Home Server

VibeGuard is the newest product I've been building — an automated QA platform that crawls a target web app, plans test cases using an LLM, executes them in a real browser via Playwright, and produces reports. You point it at a URL, describe what matters, and it runs the scenarios for you. This week we shipped the first production version to vibeguard.intisaas.digital.

The deployment is non-trivial because there are several moving parts: a FastAPI backend, a queue of long-running browser jobs, a scheduler for periodic tasks, a shared artifact directory for screenshots, and an LLM layer on top. This post walks through the stack and how we glued it together on a home server behind a Cloudflare Tunnel.

This is the third post in the series on running production-grade apps from a $300 home server. Earlier posts covered the self-hosted AI stack and the Telegram-driven development pipeline. This one is about the first real SaaS I've put behind that infrastructure.

1. What VibeGuard Does

The idea is simple. Writing end-to-end tests for a web app is slow, brittle, and almost always the first thing that gets skipped when a deadline looms. VibeGuard does the writing for you. You give it a target URL and a plain-English description of what the app is supposed to do. It crawls the app, discovers the pages and interactive elements, plans a set of realistic test cases against that map, and then runs them in a real browser.

The pipeline inside the backend looks like this:

Discovery — crawl the target app, build a page map with links, forms, and buttons
Planner — LLM generates a test plan in plain English from the page map plus the user's intent
Prompt parser — converts the plan into structured actions (click, type, assert, wait)
Executor — Playwright runs the actions, takes screenshots at every step, records results
Report — per-job HTML/PDF report with screenshots, action logs, and any failures
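
To make the prompt parser stage concrete, here is a minimal sketch of what "structured actions" could look like. The `Action` dataclass, its field names, and the line-oriented plan format (`click <selector>`, `type <selector> <text>`, `wait <seconds>`) are all assumptions for illustration, not VibeGuard's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

# The four action kinds named in the pipeline description.
ALLOWED_KINDS = {"click", "type", "assert", "wait"}


@dataclass(frozen=True)
class Action:
    """One executor step. Field names are hypothetical, not VibeGuard's schema."""
    kind: str
    target: Optional[str] = None  # CSS selector, where the kind needs one
    value: Optional[str] = None   # text to type, text to assert, or seconds to wait


def parse_step(line: str) -> Action:
    """Parse one plan line such as 'type #email user@example.com'."""
    parts = line.strip().split(maxsplit=2)
    if not parts or parts[0] not in ALLOWED_KINDS:
        raise ValueError(f"unknown action: {line!r}")
    kind = parts[0]
    if kind == "wait":
        # 'wait 2' carries only a duration, no selector
        return Action(kind, value=parts[1] if len(parts) > 1 else None)
    target = parts[1] if len(parts) > 1 else None
    value = parts[2] if len(parts) > 2 else None
    return Action(kind, target=target, value=value)
```

Rejecting unknown kinds at parse time means a malformed plan fails before a browser is ever launched, which is cheaper than discovering it mid-run.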

Jobs are long-running — anywhere from 30 seconds to several minutes depending on how many scenarios are in the plan — so they can't be handled synchronously in the API request. They have to go on a queue. That's where Celery comes in.

2. The Stack

The full production stack looks like this:

FastAPI — async Python API for auth, jobs, credits, reports, payments
SQLAlchemy + PostgreSQL 16 — relational store for users, jobs, reports, and billing
Celery + Redis — background job queue for the crawler, planner, and executor
Playwright — headless Chromium for the actual browser automation
litellm — provider-agnostic LLM client so we can route between Claude, GPT, and local models
WeasyPrint — HTML-to-PDF for the final test reports
Resend — transactional email for magic-link auth and report notifications
Vue 3 + TypeScript + Vite + Tailwind v4 + Pinia — the SPA frontend

Nothing exotic. What makes the deployment interesting is how many of these pieces need to coexist without stepping on each other.

3. Six Containers, One Compose File

The production compose file defines six containers. Here's the topology:

vibeguard-api (FastAPI)
      ↓ enqueues jobs via Redis
vibeguard-redis  ←  vibeguard-beat (scheduler)
      ↓
vibeguard-worker-1    vibeguard-worker-2    (Playwright + Celery)
      ↓ writes screenshots + results
vibeguard-db (Postgres 16)    vibeguard-screenshots (volume)

The pared-down compose file:

services:
  vibeguard-api:
    build: { context: ., dockerfile: Dockerfile.prod }
    container_name: vibeguard-api
    ports:
      - "127.0.0.1:8600:8300"
    volumes:
      - vibeguard-screenshots:/app/screenshots
    env_file: backend/.env
    depends_on: [vibeguard-redis, vibeguard-db]
    restart: unless-stopped

  vibeguard-worker-1:
    build: { context: ., dockerfile: Dockerfile.worker }
    container_name: vibeguard-worker-1
    volumes:
      - vibeguard-screenshots:/app/screenshots
    env_file: backend/.env
    depends_on: [vibeguard-redis, vibeguard-db]
    restart: unless-stopped
    command: celery -A app.worker.celery_app worker
             --loglevel=info -P solo -n worker1@%h

  # worker-2 identical but -n worker2@%h

  vibeguard-beat:
    build: { context: ., dockerfile: Dockerfile.worker }
    container_name: vibeguard-beat
    env_file: backend/.env
    depends_on: [vibeguard-redis, vibeguard-db]
    restart: unless-stopped
    command: celery -A app.worker.celery_app beat --loglevel=info

  vibeguard-redis:
    image: redis:7-alpine
    container_name: vibeguard-redis
    volumes: [vibeguard-redis-data:/data]
    restart: unless-stopped

  vibeguard-db:
    image: postgres:16-alpine
    container_name: vibeguard-db
    environment:
      POSTGRES_DB: vibeguard
      POSTGRES_USER: vibeguard
      POSTGRES_PASSWORD: vibeguard
    volumes: [vibeguard-db-data:/var/lib/postgresql/data]
    restart: unless-stopped

volumes:
  vibeguard-db-data:
  vibeguard-redis-data:
  vibeguard-screenshots:

Two separate Dockerfiles keep things clean: Dockerfile.prod builds a lean FastAPI image; Dockerfile.worker is heavier because it bundles Playwright and the Chromium binary.

4. Why Two Workers and a Beat

Two workers instead of one because jobs are slow. A single Playwright-driven executor easily takes a minute or more, and if you only have one worker, a second job sits idle behind it. Two gives us a small amount of parallelism without needing to scale horizontally yet.

The beat container runs celery beat, Celery's scheduler. It fires periodic tasks on a schedule. In our case that is a task called promote_queued_jobs, which runs every ten seconds and moves jobs from the queued state to the scheduled state once they're ready to be picked up by a worker. Think of it like Laravel's scheduler: it wakes up on an interval and triggers whatever is due.
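
For reference, a beat schedule is just a mapping from entry names to a task name and an interval. The sketch below uses the plain-dict form that Celery's `beat_schedule` setting accepts; the dotted task path is a hypothetical name, and the ten-second interval is the one quoted later in this post.

```python
# Plain-dict form of a Celery beat schedule.
# The task path is an assumed module layout, not taken from the real codebase.
BEAT_SCHEDULE = {
    "promote-queued-jobs": {
        "task": "app.worker.tasks.promote_queued_jobs",  # hypothetical dotted path
        "schedule": 10.0,  # seconds between runs
    },
}

# In the worker module this would be attached to the Celery app, for example:
#   celery_app.conf.beat_schedule = BEAT_SCHEDULE
```

Only the beat container reads this schedule. The workers just consume whatever beat puts on the queue, which is why beat must run as exactly one instance.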

5. Solo Pool, Not Prefork

Celery's default pool is prefork, which forks child processes to run tasks in parallel. That's great for CPU-bound or simple I/O tasks. It is not great for Playwright. Playwright spawns Chromium as a subprocess and uses asyncio internally to talk to it over a pipe. Prefork's forking model plus Playwright's subprocess management plus asyncio event loops is a recipe for zombie browsers and cryptic hangs.

The fix is to run each worker with -P solo, which runs a single task at a time in the main process. No forking. Each worker container handles one job at a time; parallelism comes from running multiple worker containers. Hence worker-1 and worker-2.

celery -A app.worker.celery_app worker \
    --loglevel=info -P solo -n worker1@%h

The tradeoff: we lose concurrency inside a single container. The upside: no more "why did Chromium just disappear" debugging sessions at 2am.
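
Since each solo worker is its own container, more parallelism means more service blocks. A small sketch of generating them, mirroring the fields of the vibeguard-worker-1 block in the compose file above; `worker_service` and `worker_services` are hypothetical helpers, not part of the real deployment.

```python
def worker_service(index: int) -> dict:
    """Return the compose service mapping for worker number `index`.

    Mirrors the vibeguard-worker-1 block from the compose file.
    """
    return {
        "build": {"context": ".", "dockerfile": "Dockerfile.worker"},
        "container_name": f"vibeguard-worker-{index}",
        "volumes": ["vibeguard-screenshots:/app/screenshots"],
        "env_file": "backend/.env",
        "depends_on": ["vibeguard-redis", "vibeguard-db"],
        "restart": "unless-stopped",
        # -P solo: one task at a time, no forking.
        # -n: gives each worker a unique node name.
        "command": (
            "celery -A app.worker.celery_app worker "
            f"--loglevel=info -P solo -n worker{index}@%h"
        ),
    }


def worker_services(count: int) -> dict:
    """Build service entries vibeguard-worker-1 .. vibeguard-worker-<count>."""
    return {f"vibeguard-worker-{i}": worker_service(i) for i in range(1, count + 1)}
```

The resulting mapping could be serialised with a YAML library and merged into the compose file, which keeps every worker definition identical apart from its index.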

6. The Shared Screenshots Volume

The workers generate a lot of artifacts — every test step produces a screenshot, every job produces a report. Those files need to live somewhere both the workers (who write them) and the API (which serves them to the browser) can see.

A Docker named volume, vibeguard-screenshots, solves this. It's mounted at /app/screenshots inside the API container and both worker containers. Workers write; the API reads; everyone stays happy.
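
For the shared mount to work, writers and readers have to agree on a layout. A minimal sketch of what the worker side might do, assuming one directory per job and zero-padded step numbers; the layout and the `screenshot_path` helper are illustrative assumptions.

```python
from pathlib import Path

SCREENSHOT_ROOT = Path("/app/screenshots")  # mount point from the compose file


def screenshot_path(job_id: str, step: int, root: Path = SCREENSHOT_ROOT) -> Path:
    """Return <root>/<job_id>/step-NNN.png, creating the job directory if needed."""
    job_dir = root / job_id
    job_dir.mkdir(parents=True, exist_ok=True)
    return job_dir / f"step-{step:03d}.png"
```

Zero-padding keeps the files in execution order when the API lists a job directory alphabetically.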

One thing worth noting: we use a named volume, not a bind mount. Named volumes survive docker compose down, but docker compose down -v deletes them. For production data this matters: keep -v out of any redeploy command, or a routine redeploy will wipe every stored screenshot.

7. No Public Ports — Cloudflare Tunnel

The API binds to 127.0.0.1:8600, not 0.0.0.0. Nothing on the server is reachable from the public internet directly. Everything routes through a Cloudflare Tunnel, which is the same pattern I use for every service on this machine.

The tunnel config for vibeguard is a single ingress rule:

ingress:
  - hostname: vibeguard.intisaas.digital
    service: http://localhost:8600
  # ...other hostnames
  - service: http_status:404

What this gets us:

TLS for free — Cloudflare terminates HTTPS at the edge, so the origin only needs to speak HTTP
No open ports — the server's firewall has no inbound rules for 80 or 443
DDoS protection — Cloudflare's edge absorbs traffic before it reaches the server
DNS managed as code — the hostname is just a CNAME to the tunnel, so rolling back is instant

The only tricky part is the Vue frontend's asset base URL. Vite builds the SPA assuming it's served at the root, which works because the FastAPI app mounts the built dist/ folder as static files at / and serves index.html for any unknown route (classic SPA fallback).
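
The SPA fallback boils down to one decision per request: serve the file if it exists in dist/, otherwise serve index.html. Here is a framework-free sketch of that decision. `resolve_spa_path` is a hypothetical helper and not the app's actual route handler.

```python
from pathlib import Path


def resolve_spa_path(dist: Path, request_path: str) -> Path:
    """Map a request path to a file in dist/, falling back to index.html."""
    dist = dist.resolve()
    candidate = (dist / request_path.lstrip("/")).resolve()
    # Serve the file only if it exists and stays inside dist/ (no ../ escapes).
    if candidate.is_file() and dist in candidate.parents:
        return candidate
    # Unknown routes such as /jobs/42 are handled by the Vue router client-side.
    return dist / "index.html"
```

The containment check matters because the same handler sees arbitrary user-supplied paths. Without it, a request containing `..` segments could read files outside the build directory.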

8. Adding VibeGuard to the Health Watchdog

I already run a custom health watchdog every 15 minutes. It checks disk, memory, load, critical Docker containers, systemd services, and the webhook listener on port 5679. If something goes from healthy to broken (or back), I get a Telegram alert. The key property is that it only alerts on state changes — no notification spam when something stays down.

Adding VibeGuard meant appending all six containers to the critical list in ~/scripts/watchdog.sh:

CRITICAL_CONTAINERS=(
    # ...other containers
    "vibeguard-api"
    "vibeguard-beat"
    "vibeguard-worker-1"
    "vibeguard-worker-2"
    "vibeguard-db"
    "vibeguard-redis"
    # ...
)

Now if any of the six stops or goes unhealthy, I get a Telegram ping within 15 minutes. Recovery notifications come on the next check once the state flips back. Good for liveness. Not good for catching application-level errors, which brings us to the log watchdog.
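
The state-change rule is the part worth copying. The real watchdog is a shell script, so the following is only a Python sketch of the logic, assuming a JSON state file and plain status strings such as "healthy" and "down" (both are assumptions).

```python
import json
from pathlib import Path


def state_changes(previous: dict, current: dict) -> list:
    """Return (name, old, new) for every check whose status differs from last run."""
    changes = []
    for name, status in sorted(current.items()):
        old = previous.get(name, "healthy")  # unseen checks are assumed healthy
        if old != status:
            changes.append((name, old, status))
    return changes


def run_check(state_file: Path, current: dict) -> list:
    """Compare against the saved state, persist the new state, return transitions."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    changes = state_changes(previous, current)
    state_file.write_text(json.dumps(current))
    return changes
```

An alert goes out only when `run_check` returns a non-empty list, so a container that stays down produces one message when it fails and one when it recovers.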

9. A Dedicated Log Watchdog

Liveness monitoring only tells you if a container is alive. It doesn't tell you that your Celery workers are throwing sqlalchemy.exc.ProgrammingError every ten seconds because a migration didn't run. For that you need a log watchdog.

I wrote a small Python script, vibeguard-log-watchdog.py, that runs every 10 minutes from cron. It does four things:

Pulls new logs — uses docker logs --timestamps --since <cursor> for each vibeguard container, saves the last-seen timestamp to a state file per container
Parses error blocks — regex-matches Celery ERROR/CRITICAL lines, Python tracebacks, SQLAlchemy/psycopg2 exceptions, and HTTP 5xx responses; collects continuation lines until the next log entry starts
Deduplicates and summarises — collapses identical errors with a (xN) prefix, then pipes up to 15 unique blocks to claude -p with a prompt asking for a 3-6 sentence summary
Sends a Telegram alert — one message per run, only if there are errors
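
The cursor handling and the deduplication in the steps above can be sketched in a few lines. The function names are hypothetical. The sketch assumes that each line from docker logs --timestamps begins with an RFC 3339 timestamp followed by a space, and that the (xN) prefix is applied only when a block occurs more than once.

```python
from collections import Counter

MAX_BLOCKS = 15  # cap on unique blocks passed to the summariser


def docker_logs_cmd(container: str, cursor: str) -> list:
    """Build the docker logs call that fetches entries since `cursor`."""
    return ["docker", "logs", "--timestamps", "--since", cursor, container]


def next_cursor(lines: list, cursor: str) -> str:
    """Return the timestamp of the last line, or the old cursor if nothing is new."""
    if not lines:
        return cursor
    # The timestamp is everything before the first space on the line.
    return lines[-1].split(" ", 1)[0]


def collapse(blocks: list) -> list:
    """Collapse identical error blocks into '(xN) block', most frequent first."""
    out = []
    for block, n in Counter(blocks).most_common(MAX_BLOCKS):
        out.append(f"(x{n}) {block}" if n > 1 else block)
    return out
```

One caveat to check in a real script: a --since value equal to the last-seen timestamp may return that boundary line again, so the reader may need to skip lines at or before the cursor.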

The error pattern is the important part:

import re

# Any line matching one of these alternatives starts an error block.
ERROR_MARKERS = re.compile(
    r"(ERROR/|CRITICAL/|ERROR:|CRITICAL:|"
    r"Traceback \(most recent call last\)|"
    r"Exception|sqlalchemy\.exc\.|"
    r"psycopg2\.errors\.|celery\.exceptions\.|"
    r'HTTP/1\.[01]"\s*5\d\d)'
)

This catches both the Celery-flavoured log format ([2026-04-15 08:30:01,403: ERROR/MainProcess]) and the uvicorn-flavoured one (ERROR: ...). The continuation-line collector walks forward until it sees the next [timestamp: or INFO:, which is enough to capture a full Python traceback without accidentally pulling in unrelated log lines.
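
A minimal sketch of that continuation-line collector follows. It reuses the ERROR_MARKERS pattern verbatim; the `ENTRY_START` regex and the `collect_error_blocks` name are assumptions, and the sketch expects the docker timestamp prefix to have been stripped from each line already.

```python
import re

ERROR_MARKERS = re.compile(
    r"(ERROR/|CRITICAL/|ERROR:|CRITICAL:|"
    r"Traceback \(most recent call last\)|"
    r"Exception|sqlalchemy\.exc\.|"
    r"psycopg2\.errors\.|celery\.exceptions\.|"
    r'HTTP/1\.[01]"\s*5\d\d)'
)

# A new log entry starts with a Celery "[YYYY-MM-DD HH:MM:SS,mmm: " prefix
# or a uvicorn "INFO:" prefix (assumed shapes, based on the formats above).
ENTRY_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} [\d:,]+: |^INFO:")


def collect_error_blocks(lines: list) -> list:
    """Group each error line with its continuation lines into one block."""
    blocks = []
    current = None
    for line in lines:
        # The start of a new log entry closes any block that is still open.
        if ENTRY_START.match(line) and current is not None:
            blocks.append("\n".join(current))
            current = None
        if current is not None:
            current.append(line)      # continuation, e.g. a traceback frame
        elif ERROR_MARKERS.search(line):
            current = [line]          # this line opens a new error block
    if current is not None:
        blocks.append("\n".join(current))
    return blocks
```

Note that an entry-start line is itself tested against ERROR_MARKERS after closing the previous block, so two consecutive Celery ERROR entries come out as two separate blocks.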

The cron entry offsets the run by 5 minutes so it doesn't compete with the other log watchdog (AutoKira's Laravel log watcher) for the claude binary:

# Vibeguard log watchdog - every 10 min (offset 5 min)
# Note: a crontab entry must be a single line; cron does not support backslash continuation
5-59/10 * * * * /usr/bin/python3 /home/khadam/scripts/vibeguard-log-watchdog.py >> /home/khadam/scripts/logs/vibeguard-watchdog.log 2>&1
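
To see why the offset avoids collisions, expand the minute fields. This quick check assumes the other watchdog runs on a plain */10 schedule, which is not stated above.

```python
def cron_minutes(start: int, stop: int, step: int) -> list:
    """Expand a cron minute field of the form start-stop/step."""
    return list(range(start, stop + 1, step))


vibeguard = cron_minutes(5, 59, 10)  # 5-59/10
other = cron_minutes(0, 59, 10)      # */10, assumed schedule of the other watchdog

# The two schedules never share a minute.
overlap = set(vibeguard) & set(other)
```

With these fields the VibeGuard watchdog fires at minutes 5, 15, 25, 35, 45, and 55, always five minutes away from the nearest run of the other one.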

10. The First Real Catch

I set the log watchdog up and then let it run for a few minutes as a sanity check. The parser immediately flagged an old error sitting in the worker logs from earlier in the day:

[vibeguard-worker-1] [ERROR/MainProcess] promote_queued_jobs failed
Traceback (most recent call last):
  File ".../sqlalchemy/engine/base.py", line 1967, in _exec_single_context
psycopg2.errors.UndefinedTable: relation "qa_jobs" does not exist
LINE 2: FROM qa_jobs
             ^
[SQL: SELECT DISTINCT qa_jobs.user_id AS qa_jobs_user_id
FROM qa_jobs WHERE qa_jobs.status = %(status_1)s]

A missing table. The Celery beat scheduler was firing promote_queued_jobs every ten seconds, the task was trying to query a qa_jobs relation that didn't exist yet in production, and nothing was failing loudly enough to notice — because the API was still healthy, the frontend still loaded, and the health watchdog saw green lights. The only signal was buried in the worker stdout.

That's exactly the kind of silent failure a log watchdog is built to catch. The fix is a migration, but the important thing is that the problem surfaced minutes after I finished writing the watchdog — not days later when a user opened a support ticket.

11. The Deployment Checklist

For anyone shipping a similar stack, here's what actually matters:

Separate Dockerfiles for API vs worker — the worker image is much heavier because of Playwright and Chromium, no need to bloat the API container
Named volumes for artifacts — shared between API and workers, survives redeploys
Celery solo pool — one task per worker, scale horizontally, avoid prefork's fork-plus-asyncio-plus-subprocess hell
Bind API to 127.0.0.1 — never 0.0.0.0, let Cloudflare Tunnel handle exposure
Health watchdog — every container in the critical list, state-change alerts only
Log watchdog — dedicated script per app, claude -p for the summary layer, Telegram for delivery
Cron offsets — when multiple log watchdogs call claude, stagger them so they don't compete

Total infrastructure cost: still $0/month. Same repurposed desktop running OKHalal, AutoKira, HRMS, and now VibeGuard. The only paid component is the LLM traffic for the planner and the log summariser, and both are minimal at current volume.

12. What's Next

The deployment is live and the monitoring is in place. What's still on the list:

The qa_jobs migration — fix the missing table the log watchdog just surfaced
More test plan types — the current planner handles click/type/assert flows well; visual regression and API-level assertions are the obvious next additions
Scaling workers — adding worker-3 and worker-4 is a one-line compose edit when load grows
Screenshot cleanup cron — the shared volume will fill up eventually; a weekly prune of reports older than 30 days is next
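
The cleanup job could be as small as the sketch below. It assumes that file modification time is a good enough proxy for report age, and the `prune_old_files` helper is hypothetical since the real script has not been written yet.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_DAYS = 30


def prune_old_files(root: Path, max_age_days: int = MAX_AGE_DAYS,
                    now: Optional[float] = None) -> list:
    """Delete files under `root` whose mtime is older than the cutoff.

    Returns the deleted paths. Empty job directories are left in place.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # sorted() materialises the listing before anything is deleted.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Called as `prune_old_files(Path("/app/screenshots"))` from a weekly cron entry, this would keep the volume bounded. Returning the deleted paths makes it easy to log what was removed.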

The broader takeaway: shipping a real multi-container SaaS on a home server is genuinely straightforward once you have the building blocks — Cloudflare Tunnel for routing, a named volume for artifacts, Celery solo pool for browser jobs, a health watchdog for liveness, and a log watchdog for application errors. No Kubernetes, no cloud bill, no ops team. Just Docker Compose, cron, and a few hundred lines of Python glue.

Curious about VibeGuard or want to try it? The app is live at vibeguard.intisaas.digital. Questions or feedback — reach out via the contact section.