# Docker and Podman Setup Guide
This guide explains how to run Abstracts Explorer using containers with Podman (recommended) or Docker.
**Note:** The container images are production-optimized and use pre-built static vendor files (CSS/JS libraries). Node.js is **not required** for production containers - it's only needed for local development if you want to rebuild vendor files.
## Rootless Containers with Podman Quadlets (recommended systemd-native setup)
The `systemd/` directory contains Podman
[quadlet](https://docs.podman.io/en/latest/markdown/podman-systemd.unit.5.html)
files that integrate the container stack directly with systemd. Quadlets are
the modern, daemonless alternative to podman-compose: each container, volume,
and network is a native systemd unit, giving you the full `systemctl` and
`journalctl` experience with no compose wrapper needed.
Two TLS variants are provided:
| Variant | Reverse proxy | Certificate source |
| ------------------- | -------------------- | -------------------------------------------------------- |
| **caddy** (default) | Caddy | Automatic Let's Encrypt (recommended for public domains) |
| **nginx** | nginx (unprivileged) | Your own certificate files (`cert.pem` / `key.pem`) |
### Architecture
```{mermaid}
flowchart TD
Internet["Internet"]
socket["systemd socket
:80 / :443 (root)"]
proxy["systemd-socket-proxyd"]
rproxy["nginx or Caddy
(rootless Podman)"]
app["abstracts-explorer:5000"]
pg["PostgreSQL"]
chroma["ChromaDB"]
Internet --> socket
socket --> proxy
proxy -->|"127.0.0.1:8080"| rproxy
rproxy --> app
app --> pg
app --> chroma
```
Privileged ports 80/443 are held by a system-level systemd socket.
All containers run rootless under your normal user account — no `sysctl`
changes, no Docker daemon, no root containers.
Sensitive values (API tokens, database password) are stored as
[Podman secrets](https://docs.podman.io/en/latest/markdown/podman-secret-create.1.html)
and injected at runtime — they never appear in plain text in unit files or
environment files.
Container logs are automatically deleted after 7 days to comply with GDPR.
The install script sets up a daily systemd user timer (`abstracts-log-cleanup.timer`)
that vacuums journal entries older than 7 days.
### Automated install
An install script automates the full setup:
```bash
# Caddy variant (automatic Let's Encrypt — default):
curl -fsSL https://raw.githubusercontent.com/thawn/abstracts-explorer/main/scripts/install-podman.sh | bash
# nginx variant (existing SSL certificate):
curl -fsSL https://raw.githubusercontent.com/thawn/abstracts-explorer/main/scripts/install-podman.sh \
| bash -s -- --variant nginx
```
The script:
1. Downloads all quadlet, configuration, and environment files
2. Installs the system socket units (requires `sudo`)
3. Enables lingering for your user
4. Generates a secure database password and stores it as a Podman secret
(skipped if the secret already exists)
5. Prints the remaining manual steps
After the script finishes, you only need to:
1. **Create the LLM backend API token secret:**
```bash
printf '%s' 'YOUR_BLABLADOR_TOKEN' | podman secret create llm-backend-auth-token -
```
2. **Edit the environment file** (`~/abstracts-explorer/abstracts-explorer.env`) to
adjust model names, LLM backend URL, and other settings.
3. **Set up TLS:**
- **Caddy:** edit `~/abstracts-explorer/caddy/Caddyfile` — replace
`abstracts.example.com` with your domain and `your@email.com` with
your email address.
- **nginx:** place your certificate and key in `~/abstracts-explorer/certs/`
```bash
cp /path/to/cert.pem ~/abstracts-explorer/certs/
cp /path/to/key.pem ~/abstracts-explorer/certs/
```
4. **Start all services:**
```bash
systemctl --user daemon-reload
systemctl --user start abstracts-postgres abstracts-chromadb abstracts-explorer
# Caddy variant:
systemctl --user start abstracts-caddy
# nginx variant:
systemctl --user start abstracts-nginx
```
### Configuration
All non-secret settings live in a single environment file:
```
~/abstracts-explorer/abstracts-explorer.env
```
Edit this file to change the LLM backend URL, model names, log level, RAG
parameters, and other options. Changes take effect after restarting the
abstracts-explorer service:
```bash
systemctl --user restart abstracts-explorer
```
### Managing secrets
| Secret | Required | Purpose |
| ------------------------ | -------- | --------------------------------------------------------------- |
| `postgres-password` | Yes | PostgreSQL database password (auto-generated by install script) |
| `llm-backend-auth-token` | Yes | Blablador or other LLM backend API key |
| `github-token` | No | GitHub token for the OCI registry feature |
Create or update a secret:
```bash
printf '%s' 'NEW_VALUE' | podman secret create --replace SECRET_NAME -
systemctl --user restart abstracts-explorer # pick up the new value
```
**GitHub token (optional):** To enable OCI registry downloads, create the
secret and then uncomment the `Secret=github-token` line in
`~/.config/containers/systemd/abstracts-explorer.container`:
```bash
printf '%s' 'YOUR_GITHUB_TOKEN' | podman secret create github-token -
# Then edit ~/.config/containers/systemd/abstracts-explorer.container
# and uncomment: Secret=github-token,type=env,target=GITHUB_TOKEN
systemctl --user daemon-reload
systemctl --user restart abstracts-explorer
```
### Checking status and logs
```bash
# Status of all services
systemctl --user status 'abstracts-*'
# Follow logs for a specific container
journalctl --user -u abstracts-explorer -f
# Follow logs for all containers
journalctl --user -u 'abstracts-*' -f
```
Log entries older than 7 days are vacuumed automatically by the
`abstracts-log-cleanup.timer` user timer installed by the install script.
To check the timer's status:
```bash
systemctl --user status abstracts-log-cleanup.timer
```
### Updating containers
An update script pulls the latest images and restarts all services:
```bash
~/abstracts-explorer/update-podman.sh
```
Or download and run it directly:
```bash
curl -fsSL https://raw.githubusercontent.com/thawn/abstracts-explorer/main/scripts/update-podman.sh | bash
```
### Download data
The final step is to populate the database. There are three options:
#### Get pre-computed data from the registry
After the services are running, use the CLI to [download pre-computed data from the registry](registry.md):
```bash
podman exec abstracts-explorer \
abstracts-explorer registry download
```
#### Run the full pipeline inside the container
Alternatively, you can run the full [download and embedding pipeline](cli_reference.md) inside the container:
```bash
podman exec abstracts-explorer \
abstracts-explorer download
podman exec abstracts-explorer \
abstracts-explorer create-embeddings
podman exec abstracts-explorer \
abstracts-explorer clustering pre-compute
```
#### Migrate existing data
If you have an existing Docker Compose deployment and want to migrate to the
Podman quadlet setup, execute the migration script in the directory of your existing `docker-compose.yml`:
```bash
curl -fsSL https://raw.githubusercontent.com/thawn/abstracts-explorer/main/scripts/migrate-to-podman.sh | bash
```
The script copies data from `./data`, `./chroma`, and the `./postgres` data
directories into the new Podman named volumes and sets the correct ownership and
permissions. It creates a timestamped backup of the existing data before
migrating.
## Available Images
Pre-built container images are available from:
- **GitHub Container Registry**: `ghcr.io/thawn/abstracts-explorer:latest`
- **Docker Hub** (releases only): `thawn/abstracts-explorer:latest`
Available tags (following container best practices):
- `latest` - Latest stable release only (never points to branch builds)
- `develop` - Latest development branch build
- `v*.*.*` - Specific version releases (e.g., `v0.1.0`)
- `v*.*` - Major.minor version (e.g., `v0.1`)
- `v*` - Major version (e.g., `v0`)
- `sha-*` - Specific commit SHA for traceability (e.g., `sha-5f8567d`)
- `pr-*` - Pull request builds for testing (e.g., `pr-40`)
## Security hardening
Both nginx and caddy configurations include the following security hardening out of the box:
| Setting | Value / Behaviour |
| ----------------------------------- | --------------------------------------------------------------------------------------- |
| **TLS protocol** | TLS 1.3 only (TLS 1.2 disabled; see comments in config to re-enable for legacy clients) |
| **TLS 1.2 ciphers** (if re-enabled) | ECDHE + AES-GCM + ChaCha20-Poly1305 only; weak/export ciphers excluded |
| **SSL session tickets** | Disabled (`ssl_session_tickets off`) to preserve forward secrecy |
| **SSL session cache** | Shared 10 MB cache, 1 day timeout |
| **OCSP stapling** | Enabled — reduces handshake latency and supports revocation checking |
| **Server version** | Hidden (`server_tokens off`) |
| **HSTS** | `max-age=63072000; includeSubDomains` (2 years) |
| **X-Content-Type-Options** | `nosniff` |
| **X-Frame-Options** | `SAMEORIGIN` |
| **Referrer-Policy** | `strict-origin-when-cross-origin` |
| **X-XSS-Protection** | `1; mode=block` |
| **X-Powered-By** | Stripped from upstream responses |
> **OCSP stapling note (Option 1 — existing cert):** OCSP stapling requires a certificate
> issued by a public CA and the full certificate chain in `cert.pem`. If you are using a
> self-signed certificate, remove the `ssl_stapling`, `ssl_stapling_verify`,
> `ssl_trusted_certificate`, `resolver`, and `resolver_timeout` lines from
> `nginx/nginx.conf`.
## Testing Pull Requests
To test changes from a pull request before they're merged:
1. **Find the PR number** (e.g., PR #40)
2. **Update ~/.config/containers/systemd/abstracts-explorer.service** to use the PR image:
```ini
[Service]
# Change this line to test the PR image
Image=ghcr.io/thawn/abstracts-explorer:pr-40
^^^^^
```
## Data Persistence
All data is stored in named volumes:
- `abstracts-data` - Application data directory
- `abstracts-chromadb-data` - ChromaDB vector embeddings
- `abstracts-postgres-data` - PostgreSQL database
### Backup
```bash
# Stop services to ensure data consistency
systemctl --user stop abstracts-explorer abstracts-postgres abstracts-chromadb
# Backup PostgreSQL database
podman exec abstracts-postgres pg_dump -U abstracts abstracts > backup.sql
# Backup ChromaDB data
cp -a ~/local/share/containers/storage/volumes/abstracts-chromadb-data/_data/chroma ./chroma-backup
# Restart services
systemctl --user start abstracts-postgres abstracts-chromadb abstracts-explorer
```
### Restore
```bash
# Stop services before restoring
systemctl --user stop abstracts-explorer abstracts-postgres abstracts-chromadb
# Restore PostgreSQL database
cat backup.sql | podman-compose exec -T abstracts-postgres psql -U abstracts
# Restore ChromaDB data
cp -a ./chroma-backup/* ~/local/share/containers/storage/volumes/abstracts-chromadb-data/_data/chroma/
# Restart services
systemctl --user start abstracts-postgres abstracts-chromadb abstracts-explorer
```
## Traditional Docker Compose Setup (deprecated)
This setup uses a single `docker-compose.yml` file to run all services. It is provided mainly as a reference and for users who prefer the Docker Compose workflow, but the Podman quadlet setup is recommended for better integration with systemd and improved security.
### 1. Create .env File
First create a `.env` file with your [blablador token](https://sdlaml.pages.jsc.fz-juelich.de/ai/guides/blablador_api_access/):
```bash
LLM_BACKEND_AUTH_TOKEN=your_blablador_token_here
```
### 2. Download the compose file
```bash
curl -L https://github.com/thawn/abstracts-explorer/raw/main/docker-compose.yml -o docker-compose.yml
```
> **HTTPS certificates required** — the default `docker-compose.yml` uses nginx with
> your own certificate files. See [HTTPS / SSL Setup](#https--ssl-setup) below for
> how to place your certificate (Option 1) or use Let's Encrypt (Option 2).
### 3. Start Services
```bash
# Podman
podman-compose up -d
```
### HTTPS / SSL Setup
Both Docker Compose files include an **nginx reverse proxy** that handles SSL termination.
Waitress (the application server) continues to serve plain HTTP on port 5000 inside the
container network while nginx exposes the service securely on port 443.
Choose the approach that matches your situation:
| Approach | Compose file | When to use |
| ------------------------ | -------------------------------- | ------------------------------------------------------------------------------------ |
| **Let's Encrypt** | `docker-compose.letsencrypt.yml` | You need a free, automatically renewed certificate for a public domain |
| **Existing certificate** | `docker-compose.yml` | You already have a valid certificate (e.g. from your institution or a wildcard cert) |
Fo the detailed setup instructions, refer to popular guides on setting up nginx with Let's Encrypt or your existing certificate. The nginx configuration files in the `nginx/` directory are pre-configured for secure TLS settings and can be used as a reference for your own setup.
## Further Reading
- [Podman Documentation](https://docs.podman.io/)
- [Main README](https://github.com/thawn/abstracts-explorer/blob/main/README.md)
- [Configuration Guide](configuration.md)
## Support
- 🐛 [Report issues](https://github.com/thawn/abstracts-explorer/issues)
- 💬 [Discussions](https://github.com/thawn/abstracts-explorer/discussions)