Docker and Podman Setup Guide
This guide explains how to run Abstracts Explorer using containers with Podman (recommended) or Docker.
Note: The container images are production-optimized and use pre-built static vendor files (CSS/JS libraries). Node.js is not required for production containers; it is only needed for local development if you want to rebuild vendor files.
Available Images
Pre-built container images are available from:
GitHub Container Registry: ghcr.io/thawn/abstracts-explorer:latest
Docker Hub (releases only): thawn/abstracts-explorer:latest
Available tags (following container best practices):
latest - Latest stable release only (never points to branch builds)
main - Latest main branch build
develop - Latest develop branch build
v*.*.* - Specific version releases (e.g., v0.1.0)
v*.* - Major.minor version (e.g., v0.1)
v* - Major version (e.g., v0)
sha-* - Specific commit SHA for traceability (e.g., sha-5f8567d)
pr-* - Pull request builds for testing (e.g., pr-40)
Quick Start
1. Create .env File
First, create a .env file containing your Blablador token:
LLM_BACKEND_AUTH_TOKEN=your_blablador_token_here
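Because the .env file holds a secret, it may be worth creating it with owner-only permissions. The following is a minimal sketch, not part of the project: write_env_file is a hypothetical helper name, and the only thing taken from this guide is the LLM_BACKEND_AUTH_TOKEN variable.

```shell
# Hypothetical helper (not shipped with Abstracts Explorer):
# write the token to an env file that only the current user can read.
write_env_file() {
  token="$1"
  path="${2:-.env}"
  if [ -z "$token" ]; then
    echo "usage: write_env_file <token> [path]" >&2
    return 1
  fi
  # Create the file under a restrictive umask in a subshell, so the
  # caller's umask is left untouched and the token is never world-readable.
  (umask 077 && printf 'LLM_BACKEND_AUTH_TOKEN=%s\n' "$token" > "$path") || return 1
  # Also tighten permissions in case the file already existed.
  chmod 600 "$path"
}
```

Calling write_env_file "your_blablador_token_here" in the directory that will hold docker-compose.yml produces the same file content as the line above.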
2. Download the compose file
curl -L https://github.com/thawn/abstracts-explorer/raw/main/docker-compose.yml -o docker-compose.yml
3. Start Services
# Podman
podman-compose up -d
# Docker
docker compose up -d
4. Download Data
# Podman
podman-compose exec abstracts-explorer \
abstracts-explorer download --year 2025 --plugin neurips
# Docker: Replace 'podman-compose' with 'docker compose'
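Since every command in this guide differs only in the compose tool name, a small wrapper can pick whichever tool is installed. This is a sketch under the assumption that either podman-compose or the docker compose plugin is on the PATH; detect_compose is a hypothetical name, not something the project provides.

```shell
# Hypothetical helper: print the compose command available on this machine,
# preferring Podman as the guide recommends.
detect_compose() {
  if command -v podman-compose >/dev/null 2>&1; then
    echo "podman-compose"
  elif command -v docker >/dev/null 2>&1 && docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  else
    echo "no compose tool found (install podman-compose or Docker)" >&2
    return 1
  fi
}

# Usage (left commented out so sourcing this snippet has no side effects):
# COMPOSE=$(detect_compose) || exit 1
# $COMPOSE exec abstracts-explorer abstracts-explorer download --year 2025 --plugin neurips
```

$COMPOSE is deliberately left unquoted in the usage line so that "docker compose" splits into two words; this relies on POSIX sh or bash word splitting.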
5. Generate Embeddings (Optional)
podman-compose exec abstracts-explorer \
abstracts-explorer create-embeddings
6. Access the Web UI
Open http://localhost:5000 in your browser.
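The services can take a moment to come up after "up -d", so the page may not load immediately. One way to wait is to poll the /health endpoint that this guide uses for verification. The sketch below assumes that endpoint returns a successful HTTP status once the app is ready; wait_for_health and its defaults (30 attempts, 2 seconds apart) are illustrative choices.

```shell
# Hypothetical helper: poll a URL until it answers successfully or we give up.
wait_for_health() {
  url="${1:-http://localhost:5000/health}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -sS silences progress but keeps errors.
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts: $url" >&2
  return 1
}

# Usage:
# wait_for_health && echo "Open http://localhost:5000 in your browser"
```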
Testing Pull Requests
To test changes from a pull request before they’re merged:
Find the PR number (e.g., PR #40)
Update docker-compose.yml to use the PR image:
services:
abstracts-explorer:
image: ghcr.io/thawn/abstracts-explorer:pr-40 # Replace 40 with your PR number
Pull and start services:
docker compose pull
docker compose up -d
Verify the setup:
# Check service health
docker compose ps
# View logs
docker compose logs abstracts-explorer
# Query the web UI health endpoint
curl http://localhost:5000/health
Note: PR images are automatically built and pushed when commits are made to pull requests. They’re tagged with pr-<number> for easy testing.
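To avoid typos when switching between PR builds, the image reference can be derived from the PR number. This is a small sketch; pr_image is a hypothetical helper, and the registry path and pr-<number> tag format are the ones listed in this guide.

```shell
# Hypothetical helper: build the image reference for a pull request build.
pr_image() {
  pr="$1"
  case "$pr" in
    # Reject empty input and anything that is not purely digits (e.g. "#40").
    ''|*[!0-9]*)
      echo "PR number must contain digits only: '$pr'" >&2
      return 1
      ;;
  esac
  echo "ghcr.io/thawn/abstracts-explorer:pr-$pr"
}

# Usage:
# podman pull "$(pr_image 40)"
```

The printed reference is the value to place in the image: field of docker-compose.yml.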
Prerequisites
Podman (Recommended)
Podman is a daemonless container engine that’s more secure and doesn’t require root privileges.
Linux:
# Debian/Ubuntu
sudo apt-get install podman podman-compose
# Fedora/RHEL
sudo dnf install podman podman-compose
macOS:
brew install podman podman-compose
podman machine init
podman machine start
Windows: Download from Podman Desktop
Docker (Alternative)
Linux: curl -fsSL https://get.docker.com -o get-docker.sh && sudo sh get-docker.sh
macOS/Windows: Install Docker Desktop
Configuration
Environment Variables
Configure via docker-compose.yml or mount a custom .env file.
Option 1: Edit docker-compose.yml
services:
abstracts-explorer:
environment:
- LLM_BACKEND_URL=http://host.docker.internal:1234
- CHAT_MODEL=your-chat-model
Option 2: Mount .env File
cp .env.docker .env
# Edit .env with your settings
Uncomment in docker-compose.yml:
volumes:
- ./.env:/app/.env:ro
Key Settings
| Variable | Description | Default |
|---|---|---|
| PAPER_DB | PostgreSQL connection URL | |
| EMBEDDING_DB | ChromaDB location (URL or path) | |
| LLM_BACKEND_URL | LLM backend URL | |
| CHAT_MODEL | Chat model name | |
| | Embedding model name | |
| | ChromaDB collection | |
Note: The setup uses PostgreSQL and ChromaDB by default. For local development with SQLite, set PAPER_DB=abstracts.db instead of the PostgreSQL URL. For local ChromaDB, set EMBEDDING_DB=chroma_db instead of the HTTP URL.
Connecting to Host LM Studio
Podman on Linux:
environment:
- LLM_BACKEND_URL=http://host.containers.internal:1234
Docker (Mac/Windows) or Podman (Mac):
environment:
- LLM_BACKEND_URL=http://host.docker.internal:1234
Alternative (Linux): Use host network
services:
abstracts-explorer:
network_mode: host
environment:
- LLM_BACKEND_URL=http://localhost:1234
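The host alias rules above can be captured in a small lookup. This sketch encodes only the combinations this guide names; llm_backend_url is a hypothetical helper, port 1234 is the LM Studio port used in the examples, and any other combination is reported as unknown rather than guessed.

```shell
# Hypothetical helper: print the LLM_BACKEND_URL value for an engine/OS pair.
llm_backend_url() {
  engine="$1"              # "podman" or "docker"
  os="${2:-$(uname -s)}"   # e.g. "Linux" or "Darwin"; defaults to this machine
  port="${3:-1234}"
  case "$engine:$os" in
    podman:Linux)
      host="host.containers.internal" ;;
    podman:Darwin|docker:Darwin|docker:Windows*|docker:MINGW*)
      host="host.docker.internal" ;;
    *)
      echo "no host alias listed for $engine on $os; consider network_mode: host" >&2
      return 1
      ;;
  esac
  echo "http://$host:$port"
}

# Usage:
# llm_backend_url podman Linux
```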
Services
The Docker Compose setup includes three services that work together:
Main Application (abstracts-explorer)
Port: 5000 (exposed to host)
Volumes: abstracts-data
Purpose: Web UI and CLI tools
Image: ghcr.io/thawn/abstracts-explorer:latest
ChromaDB
Port: 8000 (internal only, not exposed to host)
Purpose: Vector database for semantic search embeddings
Health Check: TCP check on port 8000
Data: Persisted in the chromadb-data volume
PostgreSQL
Port: 5432 (internal only, not exposed to host)
Purpose: Relational database for paper metadata
Health Check: pg_isready command
Data: Persisted in the postgres-data volume
Credentials: Set in docker-compose.yml (change for production!)
Security Note: Database ports (5432, 8000) are not exposed to the host system. Only the web UI port (5000) is accessible from outside the container network. All inter-service communication happens via Docker’s internal network.
Common Commands
View Logs
podman-compose logs -f abstracts-explorer
Execute CLI Commands
podman-compose exec abstracts-explorer abstracts-explorer search "neural networks"
Interactive Shell
podman-compose exec -it abstracts-explorer /bin/bash
Stop Services
podman-compose down
# Remove volumes (deletes data)
podman-compose down -v
Data Persistence
All data is stored in named volumes:
abstracts-data - Application data directory
chromadb-data - ChromaDB vector embeddings
postgres-data - PostgreSQL database
Backup
# Backup PostgreSQL database
podman-compose exec postgres pg_dump -U abstracts abstracts > backup.sql
# Backup ChromaDB data
podman-compose exec abstracts-explorer \
tar czf /tmp/chroma-backup.tar.gz /app/chroma_db
podman cp abstracts-chromadb:/chroma/chroma ./chroma-backup
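A plain redirect leaves an empty or truncated backup.sql behind if pg_dump fails. The sketch below wraps the PostgreSQL backup command so that the target file only appears after a successful, non-empty dump. backup_postgres and the COMPOSE variable are hypothetical names; the service, user, and database names are the ones used in the commands above, and -T is the same flag the restore command uses.

```shell
# Compose command to use; override with COMPOSE="docker compose" if needed.
COMPOSE="${COMPOSE:-podman-compose}"

# Hypothetical helper: dump the database to a timestamped file, atomically.
backup_postgres() {
  out="${1:-backup-$(date +%Y%m%d-%H%M%S).sql}"
  partial="$out.partial"
  # -T disables TTY allocation so the dump is written to stdout unmodified.
  if $COMPOSE exec -T postgres pg_dump -U abstracts abstracts > "$partial" \
      && [ -s "$partial" ]; then
    mv "$partial" "$out"
    echo "wrote $out"
  else
    rm -f "$partial"
    echo "backup failed; no file written" >&2
    return 1
  fi
}

# Usage:
# backup_postgres            # timestamped file name
# backup_postgres backup.sql # explicit file name
```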
Restore
# Restore PostgreSQL database
cat backup.sql | podman-compose exec -T postgres psql -U abstracts
# Restore ChromaDB data
podman cp ./chroma-backup abstracts-chromadb:/chroma/chroma
podman-compose restart chromadb
Troubleshooting
Container Won’t Start
Check logs: podman-compose logs abstracts-explorer
Verify port 5000 is available: lsof -i :5000
Rebuild: podman-compose build --no-cache && podman-compose up -d
Cannot Connect to LM Studio
Ensure LM Studio server is running with models loaded
Verify URL: podman-compose exec abstracts-explorer curl -v http://host.docker.internal:1234/v1/models
For Linux, try host.containers.internal or network_mode: host
Permission Errors (Podman)
podman unshare chown 1000:1000 /path/to/volume
Database Locked
PostgreSQL is now the default for Docker Compose (no locking issues)
For SQLite mode, ensure only one process accesses the database
ChromaDB Health Check Fails
The health check uses TCP port checking (bash built-in)
If failing, check logs: podman-compose logs chromadb
Verify ChromaDB container started: podman-compose ps
Cannot Access Databases from Host
Database ports (5432, 8000) are intentionally not exposed for security
Access via application container: podman-compose exec abstracts-explorer psql
For debugging, temporarily add port mappings to docker-compose.yml
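Instead of editing docker-compose.yml directly, temporary port mappings can live in a separate overlay file that is easy to delete afterwards. This is a sketch with several assumptions: the service names postgres and chromadb are taken from the commands elsewhere in this guide, debug-ports.yml and write_debug_ports are hypothetical names, and the ports are bound to 127.0.0.1 so they stay unreachable from other machines.

```shell
# Hypothetical helper: write a compose overlay that publishes the database
# ports on the loopback interface only.
write_debug_ports() {
  path="${1:-debug-ports.yml}"
  cat > "$path" <<'EOF'
# Temporary overlay for debugging. Delete this file when finished.
services:
  postgres:
    ports:
      - "127.0.0.1:5432:5432"
  chromadb:
    ports:
      - "127.0.0.1:8000:8000"
EOF
}

# Usage:
# write_debug_ports
# podman-compose -f docker-compose.yml -f debug-ports.yml up -d
```

Restarting with only docker-compose.yml removes the mappings again.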
Out of Memory
Increase container memory limits:
services:
abstracts-explorer:
deploy:
resources:
limits:
memory: 4G
Production Deployment
Change default passwords in docker-compose.yml
Use external secrets for tokens
Enable HTTPS with a reverse proxy (nginx, traefik)
Set resource limits for memory and CPU
Configure monitoring and health checks
Use specific image tags instead of latest (e.g., v1.0.0 or sha-5f8567d for precise version control)
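The tag recommendation can be checked mechanically, for example in a deployment script. The sketch below treats full version tags (v*.*.*) and commit tags (sha-*) as pinned and everything else (latest, main, develop, v*.*, v*, pr-*) as moving, following the tag list in this guide. is_pinned_tag is a hypothetical helper, and the glob match is intentionally loose rather than a strict version parser.

```shell
# Hypothetical helper: succeed only for tags that name one specific build.
is_pinned_tag() {
  case "$1" in
    v[0-9]*.[0-9]*.[0-9]*) return 0 ;;  # full version, e.g. v0.1.0
    sha-?*)                return 0 ;;  # commit build, e.g. sha-5f8567d
    *)                     return 1 ;;  # latest, main, develop, v0.1, v0, pr-40, ...
  esac
}

# Usage:
# is_pinned_tag "v1.0.0" || echo "warning: image tag is not pinned" >&2
```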
Example Production Settings
services:
abstracts-explorer:
deploy:
resources:
limits:
cpus: '2'
memory: 4G
reservations:
cpus: '1'
memory: 2G
restart: unless-stopped