A technical deep-dive into how the Observatory is built, how data flows through it, and the decisions behind the architecture.
The Observatory is a Python/Flask application running inside Docker on a single VPS. The pipeline is a set of daily-scheduled crawlers that write into a normalised SQLite database; the web layer reads from that same database and renders pages server-side. There is no message queue, no microservices, no external data store — simplicity is a feature.
| Layer | Technology | Why |
|---|---|---|
| Web framework | Flask + Jinja2 | Minimal, well-understood, easy to deploy as a single process. |
| Database | SQLite | Read-heavy workload with a single writer (the pipeline). SQLite's file-based model makes backups trivial. |
| Frontend | Tabler UI + HTMX + Chart.js | Server-rendered HTML with islands of interactivity via HTMX. No build step. |
| Crawlers | Python (requests + BeautifulSoup) | Each source has a dedicated crawler module. eBay uses the official Browse API. |
| Scheduling | cron (inside container) | Simple, no external dependencies. |
| Auth | Flask-Login + bcrypt | Session-based auth with hashed passwords. No OAuth dependencies. |
| Deployment | Docker + Nginx reverse proxy | Single-compose stack; Nginx handles SSL termination and static files. |
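
The single-writer model in the Database row can be made explicit on the web side by opening the file read-only. The sketch below is an illustration, not the Observatory's actual code: the helper name `open_read_only` is hypothetical, and it relies only on the standard `sqlite3` module's support for SQLite URI filenames.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open the SQLite file so that this connection cannot write to it.

    Hypothetical helper: the web layer only reads, so a read-only
    connection turns an accidental write into an immediate error.
    """
    # SQLite URI filenames accept a mode=ro query parameter.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Any write attempted through such a connection raises `sqlite3.OperationalError`, which leaves the pipeline as the only process able to modify the database.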
Data flows through four stages:
1. Each source-specific crawler fetches raw listing HTML or JSON and writes it to raw_html.
2. Parser modules extract structured fields (title, price, currency, condition, URL, image) from the raw HTML.
3. Prices are converted to EUR, condition codes are unified, and titles are translated to English via AI for cross-source matching.
4. Normalised observations are written to staging_observations. Status is set to active once all checks pass.
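
The normalisation and status steps above can be sketched roughly as follows. Everything named here is an assumption for illustration: the exchange rates, the condition mapping, the `ParsedListing` and `Observation` types, and the `pending` status for rows that fail a check are all hypothetical, and the AI title translation is left out.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical fixed rates (EUR per one unit of currency); a real
# pipeline would load current rates from somewhere.
EUR_RATES = {"EUR": Decimal("1"), "USD": Decimal("0.90"), "GBP": Decimal("1.15")}

# Hypothetical mapping from source-specific labels to unified condition codes.
CONDITION_CODES = {
    "new": "new",
    "brand new": "new",
    "used": "used",
    "pre-owned": "used",
    "for parts": "parts",
}


@dataclass
class ParsedListing:
    title: str
    price: Decimal
    currency: str
    condition: str
    url: str


@dataclass
class Observation:
    title: str
    price_eur: Decimal
    condition: str
    url: str
    status: str


def normalise(listing: ParsedListing) -> Observation:
    """Convert the price to EUR, unify the condition, and set the status."""
    rate = EUR_RATES.get(listing.currency.upper())
    condition = CONDITION_CODES.get(listing.condition.strip().lower())

    # The row becomes active only if every check passes.
    checks_pass = rate is not None and condition is not None and listing.price > 0

    if rate is not None:
        price_eur = (listing.price * rate).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
    else:
        price_eur = Decimal("0.00")

    return Observation(
        title=listing.title,
        price_eur=price_eur,
        condition=condition if condition is not None else "unknown",
        url=listing.url,
        status="active" if checks_pass else "pending",
    )
```

Using `Decimal` rather than floats keeps the converted prices exact to the cent, which matters when observations from different sources are compared.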
VACUUM keeps the file compact. Backups are a single cp.
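
A minimal sketch of that maintenance routine, assuming it runs while the pipeline (the single writer) is idle. The function name `compact_and_backup` and both paths are hypothetical; `shutil.copy2` stands in for the `cp` mentioned above.

```python
import shutil
import sqlite3
from pathlib import Path


def compact_and_backup(db_path: Path, backup_path: Path) -> None:
    """Run VACUUM on the database, then copy the file as a backup."""
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()

    # Plain file copy, the Python equivalent of `cp`. Assumes no write
    # is in progress while the copy runs.
    shutil.copy2(db_path, backup_path)
```

If a copy ever had to be taken while the writer is active, the standard library's `sqlite3.Connection.backup` method would be a safer alternative to a raw file copy.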
The /img/<obs_id> route fetches images server-side (no Referer header) and caches them in memory for 10 minutes.
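
The caching behaviour can be illustrated with a small standard-library sketch. This is not the Observatory's route handler: `ImageCache`, `fetch_image`, `get_image`, and the User-Agent string are hypothetical names, and the Flask wiring is omitted so the example stays self-contained.

```python
import time
import urllib.request
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 10 * 60  # the 10-minute window described above


class ImageCache:
    """In-memory cache whose entries expire after a fixed time-to-live."""

    def __init__(
        self,
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[int, Tuple[float, bytes]] = {}

    def get(self, obs_id: int) -> Optional[bytes]:
        entry = self._entries.get(obs_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[obs_id]
            return None
        return data

    def put(self, obs_id: int, data: bytes) -> None:
        self._entries[obs_id] = (self._clock() + self._ttl, data)


def fetch_image(url: str, timeout: float = 10.0) -> bytes:
    """Fetch image bytes server-side; no Referer header is set."""
    # urllib only sends the headers given here plus its defaults,
    # none of which is Referer.
    request = urllib.request.Request(
        url, headers={"User-Agent": "observatory-image-proxy"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def get_image(
    obs_id: int,
    image_url: str,
    cache: ImageCache,
    fetch: Callable[[str], bytes] = fetch_image,
) -> bytes:
    """Return cached bytes if still fresh, otherwise fetch and cache them."""
    cached = cache.get(obs_id)
    if cached is not None:
        return cached
    data = fetch(image_url)
    cache.put(obs_id, data)
    return data
```

A route handler would call `get_image` with the URL stored for the observation and return the bytes with the appropriate content type. Because the cache lives in process memory, it is emptied whenever the application restarts.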
Want to dig deeper or discuss the architecture? contact@intellisynthprices.com