Docker and Podman Setup Guide
This guide explains how to run Abstracts Explorer using containers with Podman (recommended) or Docker.
Note: The container images are production-optimized and use pre-built static vendor files (CSS/JS libraries). Node.js is not required for production containers - it’s only needed for local development if you want to rebuild vendor files.
Available Images
Pre-built container images are available from:
GitHub Container Registry:
ghcr.io/thawn/abstracts-explorer:latest
Docker Hub (releases only):
thawn/abstracts-explorer:latest
Available tags (following container best practices):
latest - Latest stable release only (never points to branch builds)
main - Latest main branch build
develop - Latest develop branch build
v*.*.* - Specific version releases (e.g., v0.1.0)
v*.* - Major.minor version (e.g., v0.1)
v* - Major version (e.g., v0)
sha-* - Specific commit SHA for traceability (e.g., sha-5f8567d)
pr-* - Pull request builds for testing (e.g., pr-40)
Quick Start
1. Create .env File
First, create a .env file containing your Blablador token:
LLM_BACKEND_AUTH_TOKEN=your_blablador_token_here
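Because the .env file holds a secret, it may be worth creating it with owner-only permissions. The following is a minimal sketch of one way to do that; the helper name write_env_file is hypothetical and not part of Abstracts Explorer.

```shell
# Hypothetical helper: write the token to an env file readable only by its owner.
# Usage: write_env_file <token> [path]   (path defaults to ./.env)
write_env_file() {
    token="$1"
    env_path="${2:-.env}"
    if [ -z "$token" ]; then
        echo "error: token must not be empty" >&2
        return 1
    fi
    # Use a subshell so the stricter umask does not leak into the caller.
    (
        umask 077
        printf 'LLM_BACKEND_AUTH_TOKEN=%s\n' "$token" > "$env_path"
    ) || return 1
    # Also tighten permissions in case the file already existed.
    chmod 600 "$env_path"
}
```

For example, calling write_env_file your_blablador_token_here in the directory that will hold docker-compose.yml produces the same file as the manual step above.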
2. Download the compose file
curl -L https://github.com/thawn/abstracts-explorer/raw/main/docker-compose.yml -o docker-compose.yml
HTTPS certificates required: the default docker-compose.yml uses nginx with your own certificate files. See HTTPS / SSL Setup below for how to place your certificate (Option 1) or use Let’s Encrypt (Option 2).
3. Start Services
# Podman
podman-compose up -d
# Docker
docker compose up -d
4. Download Data
# Podman
podman-compose exec abstracts-explorer \
abstracts-explorer download --year 2025 --plugin neurips
# Docker: Replace 'podman-compose' with 'docker compose'
5. Generate Embeddings (Optional)
podman-compose exec abstracts-explorer \
abstracts-explorer create-embeddings
6. Access the Web UI
Open https://localhost in your browser (HTTP on port 80 is automatically redirected to HTTPS).
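The services can take a moment to become ready after startup. As a rough sketch, the loop below polls the /health endpoint (the same one this guide uses when verifying pull request builds) until it answers. The function name wait_for_health and its default attempt count and delay are assumptions chosen for illustration.

```shell
# Hypothetical helper: poll the health endpoint until it responds or attempts run out.
# Usage: wait_for_health [url] [attempts] [delay_seconds]
wait_for_health() {
    health_url="${1:-https://localhost/health}"
    attempts="${2:-30}"
    delay="${3:-2}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -k skips certificate verification (self-signed/test certificates only),
        # -s silences progress output, -f turns HTTP errors into a non-zero exit.
        if curl -ksf -o /dev/null "$health_url"; then
            echo "healthy after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "not healthy after $attempts attempts" >&2
    return 1
}
```

With a CA-signed certificate, drop the -k flag so that certificate verification stays enabled.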
HTTPS / SSL Setup
Both Docker Compose files include an nginx reverse proxy that handles SSL termination. Waitress (the application server) continues to serve plain HTTP on port 5000 inside the container network while nginx exposes the service securely on port 443.
Choose the approach that matches your situation:
| Approach | Compose file | When to use |
|---|---|---|
| Existing certificate | docker-compose.yml | You already have a valid certificate (e.g. from your institution or a wildcard cert) |
| Let’s Encrypt | docker-compose.letsencrypt.yml | You need a free, automatically renewed certificate for a public domain |
Option 1: Existing Certificate
Compose file:
docker-compose.yml
Use this when you already have a valid SSL certificate (e.g. issued by your institution, a wildcard cert, or any other CA).
Certificate files
Before starting the services place your SSL certificate and private key in a
certs/ directory next to docker-compose.yml:
certs/
├── cert.pem ← your certificate (or full chain)
└── key.pem ← your private key
The files are mounted into the nginx container as read-only at /etc/nginx/certs/.
Note: The nginx configuration (nginx/nginx.conf) references these paths. If your certificate files have different names, update the ssl_certificate and ssl_certificate_key directives in nginx/nginx.conf accordingly.
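Since nginx needs both files in place before it starts, a small pre-flight check can catch a missing certificate early. This is only a sketch under the default file names described above; check_certs is a hypothetical helper, not something shipped with the project.

```shell
# Hypothetical pre-flight check for the Option 1 certificate layout.
# Usage: check_certs [dir]   (dir defaults to ./certs)
check_certs() {
    cert_dir="${1:-certs}"
    status=0
    for f in cert.pem key.pem; do
        # -s is true only when the file exists and is not empty.
        if [ ! -s "$cert_dir/$f" ]; then
            echo "missing or empty: $cert_dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example: only start the stack when both files are present.
# check_certs && docker compose up -d
```

If you renamed the certificate files, adjust the names in the loop to match your nginx/nginx.conf.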
Changing the server name
By default nginx uses server_name _; (match any hostname). To restrict it to a
specific domain, edit nginx/nginx.conf and replace _ with your domain:
server_name abstracts.example.com;
Start the stack
docker compose up -d
Option 2: Let’s Encrypt Certificate
Compose file:
docker-compose.letsencrypt.yml
Use this when you need a free, automatically renewed certificate from Let’s Encrypt.
Requirements
A public domain that points to this server (Let’s Encrypt cannot issue certificates for localhost or private IP addresses).
Ports 80 and 443 must be reachable from the internet so Let’s Encrypt can verify domain ownership.
Step 1 — Replace the placeholder domain
Edit both docker-compose.letsencrypt.yml and nginx/nginx.letsencrypt.conf,
replacing every occurrence of abstracts.example.com with your real domain.
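The replacement can be done by hand or scripted. As a sketch, the hypothetical helper set_domain below uses sed to swap the placeholder in whichever files you pass it and keeps a .bak copy of each original. It assumes the domain contains no characters that are special to sed, which holds for ordinary host names.

```shell
# Hypothetical helper: replace the placeholder domain in the given files.
# Usage: set_domain <domain> <file>...
set_domain() {
    domain="$1"
    if [ -z "$domain" ] || [ "$#" -lt 2 ]; then
        echo "usage: set_domain <domain> <file>..." >&2
        return 1
    fi
    shift
    for file in "$@"; do
        # -i.bak edits in place and keeps a backup; this spelling works
        # with both GNU sed and BSD sed.
        sed -i.bak "s/abstracts\.example\.com/$domain/g" "$file" || return 1
    done
}

# Example (your.domain.example is a placeholder for your real domain):
# set_domain your.domain.example docker-compose.letsencrypt.yml nginx/nginx.letsencrypt.conf
```

The Certbot commands in Step 2 also contain the placeholder domain and need the same substitution when you run them.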
Step 2 — Obtain the initial certificate
Run Certbot once in standalone mode before starting the stack (nginx must not be running yet so that Certbot can bind to port 80):
# Docker
docker run --rm \
-p 80:80 \
-v letsencrypt-certs:/etc/letsencrypt \
certbot/certbot certonly --standalone \
--domain abstracts.example.com \
--email your@email.com \
--agree-tos --non-interactive
# Podman
podman run --rm \
-p 80:80 \
-v letsencrypt-certs:/etc/letsencrypt \
certbot/certbot certonly --standalone \
--domain abstracts.example.com \
--email your@email.com \
--agree-tos --non-interactive
Step 3 — Start the stack
docker compose -f docker-compose.letsencrypt.yml up -d
Automatic renewal
The certbot service in docker-compose.letsencrypt.yml checks for renewal every
12 hours. Let’s Encrypt certificates expire after 90 days; renewal is attempted
automatically when fewer than 30 days remain.
After a successful renewal, nginx must be reloaded to activate the new certificate (certbot and nginx run in separate containers, so this cannot happen automatically). Run this command once after renewal:
docker compose -f docker-compose.letsencrypt.yml exec nginx nginx -s reload
To automate the reload, add a daily host cron job (replace docker with podman
if using Podman):
# Add via: crontab -e
0 3 * * * docker exec abstracts-nginx nginx -s reload
To force an immediate renewal:
docker compose -f docker-compose.letsencrypt.yml exec certbot \
certbot renew --webroot -w /var/www/certbot --force-renewal
HTTP → HTTPS redirect
In both setups port 80 is redirected to HTTPS and port 5000 is not exposed to the host; all traffic must go through nginx on port 443.
Security hardening
Both nginx configurations include the following security hardening out of the box:
| Setting | Value / Behaviour |
|---|---|
| TLS protocol | TLS 1.3 only (TLS 1.2 disabled; see comments in config to re-enable for legacy clients) |
| TLS 1.2 ciphers (if re-enabled) | ECDHE + AES-GCM + ChaCha20-Poly1305 only; weak/export ciphers excluded |
| SSL session tickets | Disabled |
| SSL session cache | Shared 10 MB cache, 1 day timeout |
| OCSP stapling | Enabled; reduces handshake latency and supports revocation checking |
| Server version | Hidden |
| HSTS | |
| X-Content-Type-Options | |
| X-Frame-Options | |
| Referrer-Policy | |
| X-XSS-Protection | |
| X-Powered-By | Stripped from upstream responses |
OCSP stapling note (Option 1, existing cert): OCSP stapling requires a certificate issued by a public CA and the full certificate chain in cert.pem. If you are using a self-signed certificate, remove the ssl_stapling, ssl_stapling_verify, ssl_trusted_certificate, resolver, and resolver_timeout lines from nginx/nginx.conf.
Testing Pull Requests
To test changes from a pull request before they’re merged:
Find the PR number (e.g., PR #40)
Update docker-compose.yml to use the PR image:
services:
abstracts-explorer:
image: ghcr.io/thawn/abstracts-explorer:pr-40 # Replace 40 with your PR number
Pull and start services:
docker compose pull
docker compose up -d
Verify the setup:
# Check service health
docker compose ps
# View logs
docker compose logs abstracts-explorer
# Access web UI
# - If you have a CA-signed certificate omit the -k flag
# - -k skips certificate verification (only use for self-signed/test certificates)
curl -k https://localhost/health
Note: PR images are automatically built and pushed when commits are made to pull requests. They’re tagged with pr-<number> for easy testing.
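Instead of editing docker-compose.yml directly, the image can also be swapped through a separate override file, which keeps the main file unchanged and is easy to delete afterwards. The sketch below writes such a file. The helper name write_pr_override is hypothetical, and it relies on Docker Compose merging docker-compose.override.yml automatically when no -f flag is given; with podman-compose you may need to pass both files explicitly with -f.

```shell
# Hypothetical helper: write a compose override that selects a PR image.
# Usage: write_pr_override <pr_number> [path]
write_pr_override() {
    pr_number="$1"
    override_path="${2:-docker-compose.override.yml}"
    case "$pr_number" in
        ''|*[!0-9]*)
            echo "error: PR number must contain only digits" >&2
            return 1
            ;;
    esac
    cat > "$override_path" <<EOF
services:
  abstracts-explorer:
    image: ghcr.io/thawn/abstracts-explorer:pr-$pr_number
EOF
}

# Example:
# write_pr_override 40 && docker compose pull && docker compose up -d
```

Deleting the override file and running docker compose up -d again returns the stack to the image named in docker-compose.yml.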
Prerequisites
Podman (Recommended)
Podman is a daemonless container engine that can run containers without root privileges, which reduces the attack surface compared with a daemon running as root.
Linux:
# Debian/Ubuntu
sudo apt-get install podman podman-compose
# Fedora/RHEL
sudo dnf install podman podman-compose
macOS:
brew install podman podman-compose
podman machine init
podman machine start
Windows: Download from Podman Desktop
Docker (Alternative)
Linux: curl -fsSL https://get.docker.com -o get-docker.sh && sudo sh get-docker.sh
macOS/Windows: Install Docker Desktop
Configuration
Environment Variables
Configure via docker-compose.yml or mount a custom .env file.
Option 1: Edit docker-compose.yml
services:
abstracts-explorer:
environment:
- LLM_BACKEND_URL=http://host.docker.internal:1234
- CHAT_MODEL=your-chat-model
Option 2: Mount .env File
cp .env.docker .env
# Edit .env with your settings
Uncomment in docker-compose.yml:
volumes:
- ./.env:/app/.env:ro
Key Settings
| Variable | Description | Default |
|---|---|---|
| | PostgreSQL connection URL | |
| | ChromaDB location (URL or path) | |
| | LLM backend URL | |
| | Chat model name | |
| | Embedding model name | |
| | ChromaDB collection | |
| | PAT for registry upload/download | (empty) |
| | Default OCI repository for registry commands | (empty) |
Note: The setup uses PostgreSQL and ChromaDB by default. For local development with SQLite, set PAPER_DB=abstracts.db instead of the PostgreSQL URL. For local ChromaDB, set EMBEDDING_DB=chroma_db instead of the HTTP URL.
Connecting to Host LM Studio
Podman on Linux:
environment:
- LLM_BACKEND_URL=http://host.containers.internal:1234
Docker (Mac/Windows) or Podman (Mac):
environment:
- LLM_BACKEND_URL=http://host.docker.internal:1234
Alternative (Linux): Use host network
services:
abstracts-explorer:
network_mode: host
environment:
- LLM_BACKEND_URL=http://localhost:1234
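The three cases above can be summarized as a small lookup. The following sketch is illustrative only: llm_backend_url is a hypothetical helper, port 1234 is taken from the examples above, and the Docker-on-Linux branch assumes you have set network_mode: host as shown in the alternative.

```shell
# Hypothetical helper: print an LLM_BACKEND_URL for a given engine and OS.
# Usage: llm_backend_url <podman|docker> <linux|mac|windows> [port]
llm_backend_url() {
    engine="$1"
    os="$2"
    port="${3:-1234}"
    case "$engine:$os" in
        podman:linux)
            host="host.containers.internal" ;;
        docker:linux)
            # Assumes network_mode: host is set for the service.
            host="localhost" ;;
        podman:mac|docker:mac|docker:windows)
            host="host.docker.internal" ;;
        *)
            # Combinations this guide does not cover.
            echo "unsupported combination: $engine on $os" >&2
            return 1 ;;
    esac
    echo "http://$host:$port"
}
```

For instance, llm_backend_url podman linux prints http://host.containers.internal:1234, which matches the first example in this section.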
Services
The Docker Compose setup includes four services that work together:
Nginx Reverse Proxy (nginx)
Ports: 80 (HTTP → HTTPS redirect), 443 (HTTPS)
Purpose: SSL termination and reverse proxy to the application
Config: ./nginx/nginx.conf (mounted read-only)
Certs: ./certs/ directory (mounted read-only)
Main Application (abstracts-explorer)
Port: 5000 (internal only, not exposed to host — access via nginx)
Volumes: abstracts-data
Purpose: Web UI and CLI tools
Image: ghcr.io/thawn/abstracts-explorer:latest
ChromaDB
Port: 8000 (internal only, not exposed to host)
Purpose: Vector database for semantic search embeddings
Health Check: TCP check on port 8000
Data: Persisted in chromadb-data volume
PostgreSQL
Port: 5432 (internal only, not exposed to host)
Purpose: Relational database for paper metadata
Health Check: pg_isready command
Data: Persisted in postgres-data volume
Credentials: Set in docker-compose.yml (change for production!)
Security Note: Database ports (5432, 8000) and the application port (5000) are not exposed to the host system. Only the nginx ports (80, 443) are accessible from outside the container network. All inter-service communication happens via Docker’s internal network.
Common Commands
View Logs
podman-compose logs -f abstracts-explorer
Execute CLI Commands
podman-compose exec abstracts-explorer abstracts-explorer search "neural networks"
Interactive Shell
podman-compose exec -it abstracts-explorer /bin/bash
Stop Services
podman-compose down
# Remove volumes (deletes data)
podman-compose down -v
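The commands in this section are written for podman-compose; with Docker, substitute docker compose. If you switch between machines, a small helper can pick whichever is installed. This is a sketch with a hypothetical function name, pick_compose, and it only checks that the binaries exist, not that the Docker Compose plugin is installed.

```shell
# Hypothetical helper: print the compose command available on this machine.
# Prefers podman-compose (the engine this guide recommends), then docker compose.
pick_compose() {
    if command -v podman-compose >/dev/null 2>&1; then
        echo "podman-compose"
    elif command -v docker >/dev/null 2>&1; then
        echo "docker compose"
    else
        echo "error: neither podman-compose nor docker found" >&2
        return 1
    fi
}

# Example: leave $COMPOSE unquoted so "docker compose" splits into two words.
# COMPOSE="$(pick_compose)" && $COMPOSE logs -f abstracts-explorer
```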
Data Persistence
All data is stored in named volumes:
abstracts-data - Application data directory
chromadb-data - ChromaDB vector embeddings
postgres-data - PostgreSQL database
Backup
# Backup PostgreSQL database
podman-compose exec postgres pg_dump -U abstracts abstracts > backup.sql
# Backup ChromaDB data
podman-compose exec abstracts-explorer \
tar czf /tmp/chroma-backup.tar.gz /app/chroma_db
podman cp abstracts-chromadb:/chroma/chroma ./chroma-backup
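For recurring backups it can help to write each PostgreSQL dump to a timestamped file so that earlier dumps are not overwritten. The sketch below wraps the pg_dump command from above. The function name backup_postgres, the COMPOSE variable, and the file naming scheme are assumptions made for illustration; the user and database name abstracts are taken from the command above.

```shell
# Hypothetical helper: dump PostgreSQL to a timestamped file.
# COMPOSE holds the compose command; set COMPOSE="docker compose" for Docker.
COMPOSE="${COMPOSE:-podman-compose}"

# Usage: backup_postgres [output_dir]   (prints the path of the dump on success)
backup_postgres() {
    out_dir="${1:-.}"
    out_file="$out_dir/backup-$(date +%Y%m%d-%H%M%S).sql"
    mkdir -p "$out_dir" || return 1
    # -T disables TTY allocation so terminal handling cannot alter the dump.
    # $COMPOSE is intentionally unquoted so "docker compose" splits into two words.
    if $COMPOSE exec -T postgres pg_dump -U abstracts abstracts > "$out_file"; then
        echo "$out_file"
    else
        # Do not leave a partial dump behind.
        rm -f "$out_file"
        return 1
    fi
}
```

A file produced this way can be restored with the psql command shown in the Restore section by substituting its name for backup.sql.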
Restore
# Restore PostgreSQL database
cat backup.sql | podman-compose exec -T postgres psql -U abstracts
# Restore ChromaDB data
podman cp ./chroma-backup abstracts-chromadb:/chroma/chroma
podman-compose restart chromadb
Troubleshooting
Container Won’t Start
Check logs: podman-compose logs abstracts-explorer
Verify ports 80 and 443 are available: lsof -i :80 -i :443
Rebuild: podman-compose build --no-cache && podman-compose up -d
Existing cert setup: ensure ./certs/cert.pem and ./certs/key.pem exist before starting nginx
Existing cert + self-signed cert: remove the ssl_stapling*, ssl_trusted_certificate, resolver, and resolver_timeout lines from nginx/nginx.conf; OCSP stapling is not available for self-signed certificates
Let’s Encrypt setup: ensure you ran the Certbot standalone command (Step 2) before starting the stack
Cannot Connect to LM Studio
Ensure LM Studio server is running with models loaded
Verify URL: podman-compose exec abstracts-explorer curl -v http://host.docker.internal:1234/v1/models
For Linux, try host.containers.internal or network_mode: host
Permission Errors (Podman)
podman unshare chown 1000:1000 /path/to/volume
Database Locked
PostgreSQL is now the default for Docker Compose (no locking issues)
For SQLite mode, ensure only one process accesses the database
ChromaDB Health Check Fails
The health check uses TCP port checking (bash built-in)
If failing, check logs: podman-compose logs chromadb
Verify ChromaDB container started: podman-compose ps
Cannot Access Databases from Host
Database ports (5432, 8000) are intentionally not exposed for security
The application port (5000) is also internal — use https://localhost instead
Access via application container: podman-compose exec abstracts-explorer psql
For debugging, temporarily add port mappings to docker-compose.yml
Out of Memory
Increase container memory limits:
services:
abstracts-explorer:
deploy:
resources:
limits:
memory: 4G
Production Deployment
Change default passwords in your compose file
Use external secrets for tokens
HTTPS is enabled by default via the built-in nginx reverse proxy — see Option 1 for an existing cert or Option 2 for Let’s Encrypt
Set resource limits for memory and CPU
Configure monitoring and health checks
Use specific image tags instead of latest (e.g., v1.0.0 or sha-5f8567d for precise version control)
Example Production Settings
services:
abstracts-explorer:
deploy:
resources:
limits:
cpus: '2'
memory: 4G
reservations:
cpus: '1'
memory: 2G
restart: unless-stopped