Configuration

Abstracts Explorer uses a flexible configuration system that supports environment variables, .env files, and command-line arguments.

Configuration Priority

Settings are loaded in the following priority order (later overrides earlier):

  1. Built-in defaults

  2. .env file in the current directory

  3. Environment variables

  4. Command-line arguments (when applicable)

Configuration File

Create a .env file in your project directory:

# Copy the example file
cp .env.example .env

# Edit with your preferences
nano .env

Available Settings

Chat/Language Model Settings

  • CHAT_MODEL: The LLM model to use for RAG chat (default: gemma-3-4b-it-qat)

  • CHAT_TEMPERATURE: Temperature for LLM responses, 0.0-2.0 (default: 0.7)

  • CHAT_MAX_TOKENS: Maximum tokens in LLM responses (default: 1000)

Embedding Model Settings

  • EMBEDDING_MODEL: The embedding model to use (default: text-embedding-qwen3-embedding-4b)

LLM Backend Configuration

  • LLM_BACKEND_URL: URL of the LLM backend server (default: http://localhost:1234)

  • LLM_BACKEND_AUTH_TOKEN: Optional authentication token for LLM backend (default: empty)

Data Directory

  • DATA_DIR: Base directory for data files (default: data)

Database Configuration

Paper Database

  • PAPER_DB: Database connection for papers. Can be either:

    • PostgreSQL URL: postgresql://user:password@host:port/database

    • SQLite file path: abstracts.db (relative to DATA_DIR) or /absolute/path/to/abstracts.db

    • Default: abstracts.db

The configuration automatically detects the database type based on the format:

  • URLs starting with postgresql://, sqlite://, or other database schemes are treated as database URLs

  • Other values are treated as SQLite file paths (relative to DATA_DIR unless absolute)

Embedding Database

  • EMBEDDING_DB: ChromaDB location. Can be either:

    • HTTP URL: http://chromadb:8000 for remote ChromaDB service (Docker deployments)

    • File path: chroma_db (relative to DATA_DIR) or /absolute/path/to/chroma_db for local ChromaDB

    • Default: chroma_db

The configuration automatically detects the type based on the format:

  • URLs starting with http:// or https:// are treated as remote ChromaDB services

  • Other values are treated as local file paths (relative to DATA_DIR unless absolute)

RAG Settings

  • COLLECTION_NAME: ChromaDB collection name (default: papers)

  • MAX_CONTEXT_PAPERS: Number of papers to include in RAG context (default: 5)

Example Configurations

Local Development (SQLite)

# .env file for local development

# Base directory for data files
DATA_DIR=data

CHAT_MODEL=diffbot-small-xl-2508
CHAT_TEMPERATURE=0.7
CHAT_MAX_TOKENS=1000

EMBEDDING_MODEL=text-embedding-qwen3-embedding-4b

LLM_BACKEND_URL=http://localhost:1234
LLM_BACKEND_AUTH_TOKEN=

# SQLite database (relative to DATA_DIR - will resolve to data/abstracts.db)
PAPER_DB=abstracts.db

# Local ChromaDB (relative to DATA_DIR - will resolve to data/chroma_db)
EMBEDDING_DB=chroma_db

COLLECTION_NAME=papers
MAX_CONTEXT_PAPERS=5

Production/Docker (PostgreSQL)

# .env file for production with PostgreSQL

DATA_DIR=data

CHAT_MODEL=diffbot-small-xl-2508
CHAT_TEMPERATURE=0.7
CHAT_MAX_TOKENS=1000

EMBEDDING_MODEL=text-embedding-qwen3-embedding-4b

LLM_BACKEND_URL=http://localhost:1234
LLM_BACKEND_AUTH_TOKEN=

# PostgreSQL database URL
PAPER_DB=postgresql://abstracts:password@postgres:5432/abstracts

# Remote ChromaDB HTTP service
EMBEDDING_DB=http://chromadb:8000

COLLECTION_NAME=papers
MAX_CONTEXT_PAPERS=5

Alternative: Absolute Paths

# Using absolute paths for both databases
PAPER_DB=/var/data/abstracts.db
EMBEDDING_DB=/var/data/chroma_db

Using Configuration in Code

from abstracts_explorer.config import get_config

# Get the singleton configuration instance
config = get_config()

# Access configuration values
print(f"Chat model: {config.chat_model}")
print(f"Backend URL: {config.llm_backend_url}")
print(f"Database URL: {config.database_url}")  # SQLAlchemy-compatible URL

# ChromaDB location (automatically detected as URL or path)
print(f"ChromaDB: {config.embedding_db}")

Environment Variables

You can also set configuration via environment variables:

export CHAT_MODEL=llama-3.2-3b-instruct
export LLM_BACKEND_URL=http://localhost:8080
abstracts-explorer chat

Security Best Practices

  • Never commit .env files to version control

  • Use .env.example as a template without sensitive data

  • Keep authentication tokens secure

  • Use environment variables in production environments