Config Module
The config module provides configuration management for Abstracts Explorer.
Overview
The configuration system supports:
Environment variables
.env file loading
Type conversion (string, int, float)
Singleton pattern for global configuration
Priority-based configuration loading
Class Reference
Configuration management for neurips-abstracts package.
This module loads configuration from environment variables and .env files. It uses only the standard library (no python-dotenv dependency required).
- abstracts_explorer.config.load_env_file(env_path=None)[source]
Load environment variables from a .env file.
Uses a simple parser that handles basic .env file format without requiring external dependencies.
- Parameters:
env_path (Path, optional) – Path to .env file. If None, looks for .env in current directory and parent directories up to the package root.
- Returns:
Dictionary of environment variables loaded from file.
- Return type:
Examples
>>> env_vars = load_env_file(Path(".env"))
>>> print(env_vars.get("CHAT_MODEL"))
- class abstracts_explorer.config.Config(env_path=None)[source]
Bases: object
Configuration manager for neurips-abstracts package.
Loads configuration from environment variables with fallback to defaults. Automatically loads from .env file if present.
- embedding_db
ChromaDB configuration - can be either a URL (e.g., “http://chromadb:8000”) or a file path (e.g., “chroma_db” or “/path/to/chroma_db”).
- Type:
- query_similarity_threshold
Similarity threshold for determining when to retrieve new papers (0.0-1.0).
- Type:
- database_url
SQLAlchemy database URL (supports SQLite, PostgreSQL, etc.). Automatically constructed from PAPER_DB config variable.
- Type:
- log_level
Logging level from environment (WARNING, INFO, DEBUG). Empty string if not set. Used by setup_logging() to set the default log level when verbosity flags are not used.
- Type:
Examples
>>> config = Config()
>>> print(config.chat_model)
diffbot-small-xl-2508
>>> config.llm_backend_url
'http://localhost:1234'
>>> # Using DATABASE_URL for PostgreSQL
>>> config.database_url
'postgresql://user:password@localhost/abstracts'
- __init__(env_path=None)[source]
Initialize configuration.
- Parameters:
env_path (Path, optional) – Path to .env file. If None, searches for .env automatically.
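The embedding_db attribute above accepts either a URL or a file path. How Config tells the two apart is not documented here, so the following is only a minimal sketch of one possible check; the helper name is_remote_chroma is hypothetical and not part of the package.

```python
from urllib.parse import urlparse


def is_remote_chroma(embedding_db: str) -> bool:
    """Hypothetical helper: treat http(s) URLs with a host as remote servers."""
    parsed = urlparse(embedding_db)
    # A plain path such as "chroma_db" has no http/https scheme and no host.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for value in ("http://chromadb:8000", "chroma_db", "/path/to/chroma_db"):
    kind = "remote URL" if is_remote_chroma(value) else "local path"
    print(f"{value}: {kind}")
```

Checking the scheme rather than searching for a colon keeps Windows-style paths with drive letters on the local side.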
- abstracts_explorer.config.get_config(reload=False, env_path=None)[source]
Get global configuration instance.
- Parameters:
reload (bool, optional) – Force reload configuration from environment, by default False
env_path (Path, optional) – Path to .env file. If provided, loads configuration from this file. Useful for testing to ensure consistent configuration.
- Returns:
Global configuration instance.
- Return type:
Examples
>>> config = get_config()
>>> print(config.chat_model)
>>> # In tests, use .env.tests for consistent values
>>> config = get_config(reload=True, env_path=Path(".env.tests"))
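To illustrate the singleton behaviour that get_config describes, here is a minimal sketch of a module-level cache with a reload flag. SketchConfig and get_config_sketch are hypothetical stand-ins, and the assumption that env_path is only consulted when the instance is (re)built may not match the real implementation.

```python
from pathlib import Path
from typing import Optional


class SketchConfig:
    """Hypothetical stand-in for Config; it only records its env_path."""

    def __init__(self, env_path: Optional[Path] = None) -> None:
        self.env_path = env_path


# Module-level cache holding the single shared instance.
_config: Optional[SketchConfig] = None


def get_config_sketch(reload: bool = False, env_path: Optional[Path] = None) -> SketchConfig:
    """Return the cached instance, building a new one on first use or on reload."""
    global _config
    if _config is None or reload:
        # Assumption: env_path is only used when the instance is (re)built.
        _config = SketchConfig(env_path=env_path)
    return _config


first = get_config_sketch()
second = get_config_sketch()
print(first is second)  # True

third = get_config_sketch(reload=True, env_path=Path(".env.tests"))
print(third is first)  # False
```

Callers that share the cached instance see the same values, which is why tests that change the environment usually need to request a reload.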
Usage Examples
Getting Configuration
from abstracts_explorer.config import get_config
# Get singleton instance
config = get_config()
# Access configuration values
print(f"Chat model: {config.chat_model}")
print(f"Backend URL: {config.llm_backend_url}")
print(f"Database URL: {config.database_url}")
Configuration Values
# Chat/LLM settings
config.chat_model # str
config.chat_temperature # float
config.chat_max_tokens # int
config.llm_backend_url # str
config.llm_backend_auth_token # str
# Embedding settings
config.embedding_model # str
config.embedding_db_path # str (for local ChromaDB)
config.embedding_db_url # str (for remote ChromaDB)
config.collection_name # str
# Database settings
config.database_url # str (SQLAlchemy-compatible URL)
# RAG settings
config.max_context_papers # int
Custom .env File
import os
from pathlib import Path
from abstracts_explorer.config import load_env_file
# Load from specific file
env_vars = load_env_file(Path("/path/to/custom.env"))
# Use with os.environ
for key, value in env_vars.items():
    os.environ[key] = value
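Note that copying every value into os.environ replaces variables that are already set in the process. If the intent is to keep existing environment variables ahead of file values, a variant could skip keys that are already present. The helper below is a sketch with a hypothetical name, not part of the package.

```python
import os
from typing import Dict, List


def apply_env_file_values(env_vars: Dict[str, str]) -> List[str]:
    """Copy values into os.environ without replacing variables already set.

    Returns the keys that were actually added.
    """
    added = []
    for key, value in env_vars.items():
        # Existing environment variables win over values from the file.
        if key not in os.environ:
            os.environ[key] = value
            added.append(key)
    return added
```

Passing the dictionary returned by load_env_file to this helper would leave any variable exported in the shell untouched.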
Configuration Priority
Settings are loaded in order (later overrides earlier):
Built-in defaults - Hardcoded in Config class
.env file - In current directory
Environment variables - System environment
CLI arguments - Command-line overrides (when applicable)
Example Priority
# 1. Default in code
chat_model = "gemma-3-4b-it-qat"
# 2. .env file (overrides default)
CHAT_MODEL=llama-3.2-3b-instruct
# 3. Environment variable (overrides .env)
export CHAT_MODEL=diffbot-small-xl-2508
# 4. CLI argument (overrides all)
abstracts-explorer chat --model custom-model
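The layering above can be sketched with collections.ChainMap, which searches its mappings from left to right. The function resolve_setting is a hypothetical illustration of the priority order, not the way Config is known to implement it.

```python
from collections import ChainMap
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    defaults: Mapping[str, str],
    env_file: Mapping[str, str],
    environ: Mapping[str, str],
    cli: Optional[Mapping[str, str]] = None,
) -> str:
    """Look the key up from the highest-priority source down to the defaults."""
    # ChainMap returns the first match, so the highest priority goes first.
    layers = ChainMap(dict(cli or {}), dict(environ), dict(env_file), dict(defaults))
    return layers[key]


defaults = {"CHAT_MODEL": "gemma-3-4b-it-qat"}
env_file = {"CHAT_MODEL": "llama-3.2-3b-instruct"}
environ = {"CHAT_MODEL": "diffbot-small-xl-2508"}
cli = {"CHAT_MODEL": "custom-model"}

print(resolve_setting("CHAT_MODEL", defaults, {}, {}))            # gemma-3-4b-it-qat
print(resolve_setting("CHAT_MODEL", defaults, env_file, {}))      # llama-3.2-3b-instruct
print(resolve_setting("CHAT_MODEL", defaults, env_file, environ)) # diffbot-small-xl-2508
print(resolve_setting("CHAT_MODEL", defaults, env_file, environ, cli))  # custom-model
```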
.env File Format
The .env file uses simple KEY=VALUE format:
# Comments start with #
CHAT_MODEL=gemma-3-4b-it-qat
# Quotes are optional
LLM_BACKEND_URL=http://localhost:1234
# Empty values allowed
LLM_BACKEND_AUTH_TOKEN=
# No spaces around =
CHAT_TEMPERATURE=0.7
Supported Features
Comments (#)
Empty lines (ignored)
Quoted values ("value" or 'value')
Unquoted values
Empty values
Not Supported
Variable expansion ($VAR)
Multi-line values
Export statements (export VAR=value)
Inline comments (VAR=value # comment)
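A parser covering only the supported features listed above can be very small. The following sketch shows one way to write it; parse_env_text is a hypothetical name, and the real load_env_file may differ in details such as how it treats malformed lines.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping comments and blank lines."""
    result: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        # Assumption: lines without "=" are silently ignored.
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        result[key] = value
    return result


sample = """
# Comments start with #
CHAT_MODEL=gemma-3-4b-it-qat
LLM_BACKEND_URL="http://localhost:1234"
LLM_BACKEND_AUTH_TOKEN=
"""
print(parse_env_text(sample))
```

Because the sketch performs no expansion and does not strip inline comments, a line such as VAR=value # comment keeps the trailing text as part of the value, which is consistent with the unsupported features listed above.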
Type Conversion
The Config class automatically converts types:
# String values
config.chat_model # str: "gemma-3-4b-it-qat"
config.llm_backend_url # str: "http://localhost:1234"
# Integer values
config.chat_max_tokens # int: 1000
config.max_context_papers # int: 5
# Float values
config.chat_temperature # float: 0.7
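Environment variables and .env values are always strings, so some conversion step is needed to produce the int and float values shown above. The sketch below shows one possible approach; read_typed is a hypothetical helper, and falling back to the default on an invalid value is an assumption rather than documented behaviour.

```python
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


def read_typed(source: Mapping[str, str], key: str, cast: Callable[[str], T], default: T) -> T:
    """Return source[key] converted with cast, or default if missing, empty, or invalid."""
    raw = source.get(key, "")
    if raw == "":
        return default
    try:
        return cast(raw)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return default


values = {"CHAT_MAX_TOKENS": "1000", "CHAT_TEMPERATURE": "0.7"}
print(read_typed(values, "CHAT_MAX_TOKENS", int, 1000))     # 1000
print(read_typed(values, "CHAT_TEMPERATURE", float, 0.7))   # 0.7
print(read_typed(values, "MAX_CONTEXT_PAPERS", int, 5))     # 5
```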
Default Values
Default values when not configured:
DATA_DIR = "data"
CHAT_MODEL = "diffbot-small-xl-2508"
CHAT_TEMPERATURE = 0.7
CHAT_MAX_TOKENS = 1000
EMBEDDING_MODEL = "text-embedding-qwen3-embedding-4b"
EMBEDDING_DB_PATH = "chroma_db"
EMBEDDING_DB_URL = ""
LLM_BACKEND_URL = "http://localhost:1234"
LLM_BACKEND_AUTH_TOKEN = ""
PAPER_DB = "abstracts.db" # Converted to database_url internally
COLLECTION_NAME = "papers"
MAX_CONTEXT_PAPERS = 5
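The comment on PAPER_DB says the value is converted to database_url internally. The exact rule is not shown here, so the following is only a sketch of one plausible conversion: values that already look like URLs pass through, and anything else is treated as a SQLite file path. The helper name to_database_url is hypothetical, and the real code may also resolve relative paths against DATA_DIR.

```python
def to_database_url(paper_db: str) -> str:
    """Return paper_db unchanged if it looks like a URL, else build a SQLite URL."""
    if "://" in paper_db:
        return paper_db
    # SQLAlchemy SQLite URLs use three slashes followed by the file path.
    return f"sqlite:///{paper_db}"


print(to_database_url("abstracts.db"))
print(to_database_url("postgresql://user:password@localhost/abstracts"))
```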
Configuration in Tests
Tests read their configuration from the environment:
import pytest
from abstracts_explorer.config import get_config

def test_with_config():
    config = get_config()
    # Tests use configured values
    assert config.chat_model is not None
    assert config.llm_backend_url is not None
Overriding in Tests
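import os
import pytest
from abstracts_explorer.config import get_config

@pytest.fixture
def custom_config():
    # Save original
    original = os.environ.get('CHAT_MODEL')
    # Override
    os.environ['CHAT_MODEL'] = 'test-model'
    yield
    # Restore (compare against None so an empty original value is kept)
    if original is not None:
        os.environ['CHAT_MODEL'] = original
    else:
        os.environ.pop('CHAT_MODEL', None)

def test_with_custom_config(custom_config):
    # Reload so the global instance picks up the overridden variable
    config = get_config(reload=True)
    assert config.chat_model == 'test-model'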
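The save-and-restore steps can also be delegated to the standard library: unittest.mock.patch.dict restores os.environ when its block exits. In the sketch below, read_chat_model is a hypothetical stand-in for reading the value through the config module, so that the example runs without the package installed.

```python
import os
from unittest import mock


def read_chat_model() -> str:
    """Hypothetical stand-in for get_config(reload=True).chat_model."""
    return os.environ.get("CHAT_MODEL", "default-model")


def chat_model_under_override(value: str) -> str:
    """Read the chat model while CHAT_MODEL is temporarily set to value."""
    # patch.dict restores the previous state of os.environ on exit,
    # including removing the key if it was not set before.
    with mock.patch.dict(os.environ, {"CHAT_MODEL": value}):
        return read_chat_model()


print(chat_model_under_override("test-model"))  # test-model
```

With pytest, the built-in monkeypatch fixture and its setenv method offer the same automatic cleanup.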
Security Best Practices
Do Not Commit Secrets
# .gitignore
.env
*.env
!.env.example
Use Environment Variables in Production
# Production environment
export LLM_BACKEND_AUTH_TOKEN="secret-token"
export LLM_BACKEND_URL="https://production-api.example.com"
Provide Template
# .env.example (commit this)
CHAT_MODEL=gemma-3-4b-it-qat
LLM_BACKEND_URL=http://localhost:1234
LLM_BACKEND_AUTH_TOKEN=
# Users copy and customize
cp .env.example .env
Best Practices
Use .env for development - Easy local configuration
Use environment variables in production - Secure and flexible
Document all settings - Keep .env.example up to date
Validate configuration - Check required settings exist
Use defaults wisely - Provide sensible defaults
Don’t commit secrets - Use .gitignore properly
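As an illustration of the "Validate configuration" item above, a startup check could report which required settings are missing or empty. The helper missing_settings and the choice of required keys below are hypothetical; which settings are actually mandatory depends on the deployment.

```python
from typing import Iterable, List, Mapping


def missing_settings(settings: Mapping[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]


# Hypothetical example values; a real check could pass os.environ instead.
settings = {
    "LLM_BACKEND_URL": "http://localhost:1234",
    "LLM_BACKEND_AUTH_TOKEN": "",
}
required = ["LLM_BACKEND_URL", "LLM_BACKEND_AUTH_TOKEN", "CHAT_MODEL"]

print(missing_settings(settings, required))  # ['LLM_BACKEND_AUTH_TOKEN', 'CHAT_MODEL']
```

Failing early with the list of missing names tends to be easier to debug than an error raised later by the first request that needs the setting.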