Database Module

The database module provides an interface for storing and retrieving conference papers. It supports both SQLite and PostgreSQL backends via SQLAlchemy.

Overview

The DatabaseManager class handles all database operations, including:

  • Creating and managing the database schema

  • Adding and searching papers

  • Caching clustering results and embedding metadata

  • Storing donated validation data and evaluation Q/A pairs

Class Reference

This module provides functionality to load JSON data into a SQL database. Supports both SQLite and PostgreSQL backends via SQLAlchemy.

exception abstracts_explorer.database.DatabaseError[source]

Bases: Exception

Exception raised for database operations.

exception abstracts_explorer.database.EmbeddingModelConflictError(local_model, remote_model)[source]

Bases: DatabaseError

Raised when the embedding model in imported data differs from the local database.

local_model

Embedding model currently in the local database.

Type:

str

remote_model

Embedding model in the data being imported.

Type:

str

__init__(local_model, remote_model)[source]
abstracts_explorer.database.normalize_model_name(name)[source]

Normalize an embedding model name for comparison.

Strips a leading alias- prefix (case-insensitive) so that, e.g., alias-qwen3-embeddings-8b is considered identical to qwen3-embeddings-8b.

Parameters:

name (str) – Embedding model name.

Returns:

Normalized model name.

Return type:

str
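
The stripping behavior described above can be sketched as a one-line re-implementation (illustrative only, not the module's actual source):

```python
import re

def normalize_model_name(name: str) -> str:
    # Strip a leading "alias-" prefix, case-insensitively.
    return re.sub(r"^alias-", "", name, flags=re.IGNORECASE)
```

With this, `normalize_model_name("alias-qwen3-embeddings-8b")` and `normalize_model_name("qwen3-embeddings-8b")` compare equal.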

class abstracts_explorer.database.DatabaseManager[source]

Bases: object

Manager for SQL database operations using SQLAlchemy.

Supports SQLite and PostgreSQL backends through SQLAlchemy connection URLs. Database configuration is read from the config file (PAPER_DB variable).

database_url

SQLAlchemy database URL from configuration.

Type:

str

engine

SQLAlchemy engine instance.

Type:

Engine or None

SessionLocal

SQLAlchemy session factory.

Type:

sessionmaker or None

_session

Active database session if connected.

Type:

Session or None

Examples

>>> # Database configuration comes from config file
>>> db = DatabaseManager()
>>> db.connect()
>>> db.create_tables()
>>> db.close()
__init__()[source]

Initialize the DatabaseManager.

Reads database configuration from the config file.

connect()[source]

Connect to the database.

Creates the database file if it doesn’t exist (SQLite only).

Raises:

DatabaseError – If connection fails.

Return type:

None

close()[source]

Close the database connection.

Does nothing if not connected.

Return type:

None

__enter__()[source]

Context manager entry.

__exit__(exc_type, exc_val, exc_tb)[source]

Context manager exit.
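
Together, __enter__ and __exit__ let the manager be used in a with block. A minimal sketch of the protocol, assuming __enter__ connects and __exit__ closes (the class below is an illustrative stand-in, not DatabaseManager itself):

```python
class ManagedConnection:
    # Illustrative stand-in for DatabaseManager's context-manager protocol.
    def __init__(self):
        self.connected = False

    def connect(self):
        self.connected = True

    def close(self):
        self.connected = False

    def __enter__(self):
        # Open the connection and hand the manager back to the with block.
        self.connect()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Always close, even if the body raised; do not suppress exceptions.
        self.close()
        return False
```

This guarantees close() runs whether or not the body of the with block raises.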

create_tables()[source]

Create database tables for papers and embeddings metadata.

Creates the following tables:

  • papers – Main table for paper information with lightweight ML4PS schema

  • embeddings_metadata – Metadata about embeddings (model used, creation date)

  • clustering_cache – Cache for clustering results

This method is idempotent - it can be called multiple times without error. Tables are only created if they don’t already exist.

Raises:

DatabaseError – If table creation fails.

Return type:

None

add_paper(paper)[source]

Add a single paper to the database.

Parameters:

paper (LightweightPaper) – Validated paper object to insert.

Returns:

The UID of the inserted paper, or None if paper was skipped (duplicate).

Return type:

str or None

Raises:

DatabaseError – If insertion fails.

Examples

>>> from abstracts_explorer.plugin import LightweightPaper
>>> db = DatabaseManager()
>>> with db:
...     db.create_tables()
...     paper = LightweightPaper(
...         title="Test Paper",
...         authors=["John Doe"],
...         abstract="Test abstract",
...         session="Session 1",
...         poster_position="P1",
...         year=2025,
...         conference="NeurIPS"
...     )
...     paper_uid = db.add_paper(paper)
>>> print(f"Inserted paper with UID: {paper_uid}")
add_papers(papers)[source]

Add multiple papers to the database in a batch.

Parameters:

papers (list of LightweightPaper) – List of validated paper objects to insert.

Returns:

Number of papers successfully inserted (excludes duplicates).

Return type:

int

Raises:

DatabaseError – If batch insertion fails.

Examples

>>> from abstracts_explorer.plugin import LightweightPaper
>>> db = DatabaseManager()
>>> with db:
...     db.create_tables()
...     papers = [
...         LightweightPaper(
...             title="Paper 1",
...             authors=["Author 1"],
...             abstract="Abstract 1",
...             session="Session 1",
...             poster_position="P1",
...             year=2025,
...             conference="NeurIPS"
...         ),
...         LightweightPaper(
...             title="Paper 2",
...             authors=["Author 2"],
...             abstract="Abstract 2",
...             session="Session 2",
...             poster_position="P2",
...             year=2025,
...             conference="NeurIPS"
...         )
...     ]
...     count = db.add_papers(papers)
>>> print(f"Inserted {count} papers")
donate_validation_data(paper_priorities)[source]

Store donated paper rating data for validation purposes.

This method accepts anonymized paper ratings from users and stores them in the validation_data table for improving the service.

Parameters:

paper_priorities (Dict[str, Dict[str, Any]]) – Dictionary mapping paper UIDs to priority data. Each priority data dict must contain:

  • priority (int) – Rating value

  • searchTerm (str, optional) – Search term associated with the rating

Returns:

Number of papers successfully donated

Return type:

int

Raises:
  • ValueError – If paper_priorities is empty or contains invalid data format

  • DatabaseError – If database operation fails

Examples

>>> db = DatabaseManager()
>>> with db:
...     priorities = {
...         "abc123": {"priority": 5, "searchTerm": "machine learning"},
...         "def456": {"priority": 4, "searchTerm": "deep learning"}
...     }
...     count = db.donate_validation_data(priorities)
...     print(f"Donated {count} papers")
donate_chat_transcript(rating, transcript)[source]

Store a donated chat transcript with thumbs up/down feedback.

This method accepts an anonymized chat transcript and a rating from the user and stores it for improving the chat system.

Parameters:
  • rating (str) – User feedback rating, must be ‘up’ or ‘down’.

  • transcript (List[Dict[str, str]]) – List of message dicts, each with ‘role’ and ‘text’ keys.

Returns:

ID of the stored donation entry.

Return type:

int

Raises:
  • ValueError – If rating is invalid or transcript is empty/malformed.

  • DatabaseError – If database operation fails.

Examples

>>> db = DatabaseManager()
>>> with db:
...     transcript = [
...         {"role": "user", "text": "What papers discuss transformers?"},
...         {"role": "assistant", "text": "Here are some relevant papers..."}
...     ]
...     donation_id = db.donate_chat_transcript("up", transcript)
query(sql, parameters=())[source]

Execute a SQL query and return results.

Note: This method provides backward compatibility with raw SQL queries. For new code, prefer using SQLAlchemy ORM methods.

Parameters:
  • sql (str) – SQL query to execute (use named parameters like :param1, :param2).

  • parameters (dict or tuple, optional) – Query parameters; pass a mapping when using named placeholders.

Returns:

Query results as list of dictionaries.

Return type:

list of dict

Raises:

DatabaseError – If query execution fails.

Examples

>>> db = DatabaseManager()
>>> with db:
...     results = db.query("SELECT * FROM papers WHERE session = :session", {"session": "Poster"})
>>> for row in results:
...     print(row['title'])
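
Named placeholders keep queries safe from SQL injection. The same :name placeholder style is also understood by the standard-library sqlite3 module, which the self-contained illustration below uses in place of the SQLAlchemy-backed query() method:

```python
import sqlite3

# In-memory database with a minimal papers table for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT, session TEXT)")
conn.execute("INSERT INTO papers VALUES ('A Paper', 'Poster')")

# Named placeholder bound from a mapping, never string formatting.
rows = conn.execute(
    "SELECT title FROM papers WHERE session = :session",
    {"session": "Poster"},
).fetchall()
```
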
get_paper_count()[source]

Get the total number of papers in the database.

Returns:

Number of papers.

Return type:

int

Raises:

DatabaseError – If query fails.

search_papers(keyword=None, session=None, sessions=None, year=None, years=None, conference=None, conferences=None, limit=100)[source]

Search for papers by various criteria (lightweight schema).

Parameters:
  • keyword (str, optional) – Keyword to search in title, abstract, or keywords fields.

  • session (str, optional) – Single session to filter by (deprecated, use sessions instead).

  • sessions (list[str], optional) – List of sessions to filter by (matches ANY).

  • year (int, optional) – Single year to filter by (deprecated, use years instead).

  • years (list[int], optional) – List of years to filter by (matches ANY).

  • conference (str, optional) – Single conference to filter by (deprecated, use conferences instead).

  • conferences (list[str], optional) – List of conferences to filter by (matches ANY).

  • limit (int, default=100) – Maximum number of results to return.

Returns:

Matching papers as dictionaries.

Return type:

list of dict

Raises:

DatabaseError – If search fails.

Examples

>>> db = DatabaseManager()
>>> with db:
...     papers = db.search_papers(keyword="neural network", limit=10)
>>> for paper in papers:
...     print(paper['title'])
>>> # Search with multiple sessions
>>> papers = db.search_papers(sessions=["Session 1", "Session 2"])
>>> # Search with years
>>> papers = db.search_papers(years=[2024, 2025])
search_papers_keyword(query, limit=10, sessions=None, years=None, conferences=None)[source]

Perform keyword-based search with filtering and author parsing.

This is a convenience method that wraps search_papers and formats the results for web API consumption, including author parsing.

Parameters:
  • query (str) – Keyword to search in title, abstract, or keywords fields

  • limit (int, optional) – Maximum number of results, by default 10

  • sessions (list of str, optional) – Filter by paper sessions

  • years (list of int, optional) – Filter by publication years

  • conferences (list of str, optional) – Filter by conference names

Returns:

List of paper dictionaries with parsed authors

Return type:

list of dict

Examples

>>> papers = db.search_papers_keyword(
...     "neural networks",
...     limit=5,
...     years=[2024, 2025]
... )
get_stats(year=None, conference=None)[source]

Get database statistics, optionally filtered by year and conference.

Parameters:
  • year (int, optional) – Filter by specific year

  • conference (str, optional) – Filter by specific conference

Returns:

Statistics dictionary with:

  • total_papers (int) – Number of papers matching filters

  • year (int or None) – Filter year if provided

  • conference (str or None) – Filter conference if provided

Return type:

dict

Examples

>>> stats = db.get_stats()
>>> print(f"Total papers: {stats['total_papers']}")
>>> stats_2024 = db.get_stats(year=2024)
>>> print(f"Papers in 2024: {stats_2024['total_papers']}")
search_authors_in_papers(name=None, limit=100)[source]

Search for authors by name within the papers’ authors field.

Parameters:
  • name (str, optional) – Name to search for (partial match).

  • limit (int, default=100) – Maximum number of results to return.

Returns:

Unique authors found in papers with fields: name.

Return type:

list of dict

Raises:

DatabaseError – If search fails.

Examples

>>> db = DatabaseManager()
>>> with db:
...     authors = db.search_authors_in_papers(name="Huang")
>>> for author in authors:
...     print(author['name'])
get_author_count()[source]

Get the approximate number of unique authors in the database.

Note: This provides an estimate by counting unique author names across all papers. The actual count may vary.

Returns:

Approximate number of unique authors.

Return type:

int

Raises:

DatabaseError – If query fails.
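
The estimate described above can be sketched as follows, assuming (hypothetically) that each paper's authors field is stored as a JSON-encoded list of names; the actual storage format may differ:

```python
import json

def approximate_author_count(author_fields):
    # author_fields: one string per paper, each assumed here to be a
    # JSON-encoded list of author names, e.g. '["A. One", "B. Two"]'.
    unique = set()
    for field in author_fields:
        unique.update(name.strip() for name in json.loads(field))
    return len(unique)
```

Because the same person may appear with differently formatted names across papers, the result is an estimate rather than an exact count.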

get_years_for_conference(conference)[source]

Get distinct years available for a specific conference.

Parameters:

conference (str) – Conference name to query.

Returns:

Sorted list of distinct years for the conference.

Return type:

list of int

Raises:

DatabaseError – If query fails.

resolve_conference_name(conference)[source]

Resolve a conference name to the canonical form stored in the database.

Performs a case-insensitive match against conference names already present in the database. If no database match is found, falls back to a case-insensitive match against the conference_name attribute of every registered downloader plugin. If neither lookup succeeds, the original conference string is returned unchanged.

This is the single authoritative place where conference-name normalization must happen. CLI commands should call this method once at the entry point of each command and then work with the returned canonical name for all subsequent operations.

Parameters:

conference (str) – Conference name as supplied by the caller. May differ in case or spelling from the form stored in the database (e.g. ml4ps@neurips vs. ML4PS@Neurips).

Returns:

The conference name exactly as it appears in the database (first match), or exactly as defined by the first matching plugin, or the input string if no match is found.

Return type:

str

Examples

>>> with DatabaseManager() as db:
...     db.create_tables()
...     canonical = db.resolve_conference_name("ml4ps@neurips")
...     # Returns "ML4PS@Neurips" if that form is stored in the DB
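
The lookup order can be sketched as a pure function (illustrative only; stored_names and plugin_names stand in for the database query and the plugin registry):

```python
def resolve(conference, stored_names, plugin_names):
    # 1. Case-insensitive match against names already in the database.
    for name in stored_names:
        if name.lower() == conference.lower():
            return name
    # 2. Fall back to downloader plugins' conference_name attributes.
    for name in plugin_names:
        if name.lower() == conference.lower():
            return name
    # 3. No match anywhere: return the input unchanged.
    return conference
```
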
get_filter_options(year=None, conference=None)[source]

Get distinct values for filterable fields (lightweight schema).

Returns a dictionary with lists of distinct values for session, year, and conference fields that can be used to populate filter dropdowns. Optionally filters by year and/or conference.

Parameters:
  • year (int, optional) – Filter results to only show options for this year

  • conference (str, optional) – Filter results to only show options for this conference

Returns:

Dictionary with keys ‘sessions’, ‘years’, ‘conferences’ containing lists of distinct non-null values sorted alphabetically (or numerically for years).

Return type:

dict

Raises:

DatabaseError – If query fails.

Examples

>>> db = DatabaseManager()
>>> with db:
...     filters = db.get_filter_options()
>>> print(filters['sessions'])
['Session 1', 'Session 2', ...]
>>> print(filters['years'])
[2023, 2024, 2025]
>>> # Get filters for specific year
>>> filters = db.get_filter_options(year=2025)
get_embedding_model()[source]

Get the embedding model used for the current embeddings.

Returns:

Name of the embedding model, or None if not set.

Return type:

str or None

Raises:

DatabaseError – If query fails.

Examples

>>> db = DatabaseManager()
>>> with db:
...     model = db.get_embedding_model()
>>> print(model)
'text-embedding-qwen3-embedding-4b'
set_embedding_model(model_name)[source]

Set the embedding model used for embeddings.

This stores or updates the embedding model metadata. If a record exists, it updates the model and timestamp. Otherwise, it creates a new record.

Parameters:

model_name (str) – Name of the embedding model.

Raises:

DatabaseError – If update fails.

Return type:

None

Examples

>>> db = DatabaseManager()
>>> with db:
...     db.set_embedding_model("text-embedding-qwen3-embedding-4b")
get_clustering_cache(embedding_model, clustering_method, n_clusters=None, clustering_params=None, reduction_method=None, n_components=None)[source]

Get cached clustering results matching the parameters.

When reduction_method and n_components are provided, only entries that match exactly (including the reduction method) are returned. When they are omitted (None), the reduction method is ignored and the most recent entry matching the clustering parameters is returned.

Parameters:
  • embedding_model (str) – Name of the embedding model.

  • clustering_method (str) – Clustering algorithm used.

  • n_clusters (int, optional) – Number of clusters (for kmeans/agglomerative).

  • clustering_params (dict, optional) – Additional clustering parameters (e.g., distance_threshold, eps).

  • reduction_method (str, optional) – Dimensionality reduction method. When provided, the query requires an exact match on this column.

  • n_components (int, optional) – Number of components after reduction. When provided, the query requires an exact match on this column.

Returns:

Cached clustering results as dictionary, or None if not found.

Return type:

dict or None

Raises:

DatabaseError – If query fails.

save_clustering_cache(embedding_model, reduction_method, n_components, clustering_method, results, n_clusters=None, clustering_params=None)[source]

Save clustering results to cache.

The full results including visualization coordinates are stored. The reduction_method and n_components are stored so that an exact-match lookup can return cached points directly. When only the reduction method changes, the clustering results are reused and only the reduction is re-applied.

Parameters:
  • embedding_model (str) – Name of the embedding model.

  • reduction_method (str) – Dimensionality reduction method.

  • n_components (int) – Number of components after reduction.

  • clustering_method (str) – Clustering algorithm used.

  • results (dict) – Clustering results to cache (full results including points).

  • n_clusters (int, optional) – Number of clusters (for kmeans/agglomerative).

  • clustering_params (dict, optional) – Additional clustering parameters.

Raises:

DatabaseError – If save fails.

Return type:

None
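
The exact-match semantics of the cache lookup can be illustrated with an in-memory model keyed on the clustering parameters (a simplified sketch, not the actual clustering_cache table layout):

```python
import json

class ClusteringCacheModel:
    # Simplified in-memory model of the clustering cache's lookup rules.
    def __init__(self):
        self._entries = {}

    @staticmethod
    def _key(model, method, n_clusters, params, reduction, n_components):
        # Serialize params deterministically so equal dicts produce equal keys.
        return (model, method, n_clusters,
                json.dumps(params or {}, sort_keys=True),
                reduction, n_components)

    def save(self, model, reduction, n_components, method, results,
             n_clusters=None, params=None):
        self._entries[self._key(model, method, n_clusters, params,
                                reduction, n_components)] = results

    def get(self, model, method, n_clusters=None, params=None,
            reduction=None, n_components=None):
        if reduction is None and n_components is None:
            # Reduction omitted: ignore it and match on clustering
            # parameters only.
            target = (model, method, n_clusters,
                      json.dumps(params or {}, sort_keys=True))
            for key, value in self._entries.items():
                if key[:4] == target:
                    return value
            return None
        # Reduction given: require an exact match on every column.
        return self._entries.get(self._key(model, method, n_clusters,
                                           params, reduction, n_components))
```
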

clear_clustering_cache(embedding_model=None)[source]

Clear clustering cache, optionally filtered by embedding model.

This is useful when embeddings change or cache becomes stale.

Parameters:

embedding_model (str, optional) – If provided, only clear cache for this embedding model. If None, clear all cache entries.

Returns:

Number of cache entries deleted.

Return type:

int

Raises:

DatabaseError – If deletion fails.

get_hierarchical_label_cache(embedding_model, linkage='ward')[source]

Get cached hierarchical labels for agglomerative clustering.

Hierarchical labels are independent of the number of clusters and the distance threshold, so they are reused for all agglomerative clustering settings that share the same embedding model and linkage.

Parameters:
  • embedding_model (str) – Name of the embedding model.

  • linkage (str, optional) – Agglomerative linkage method (default: "ward").

Returns:

Mapping of {node_id: label} (integer keys), or None if no entry is found.

Return type:

dict or None

Raises:

DatabaseError – If query fails.

save_hierarchical_label_cache(embedding_model, labels, linkage='ward')[source]

Save hierarchical cluster labels to cache.

Parameters:
  • embedding_model (str) – Name of the embedding model.

  • labels (dict) – Mapping of {node_id: label} to store.

  • linkage (str, optional) – Agglomerative linkage method (default: "ward").

Raises:

DatabaseError – If save fails.

Return type:

None

add_eval_qa_pair(conversation_id, turn_number, query, expected_answer, tool_name=None, source_info=None)[source]

Insert a single evaluation Q/A pair.

Parameters:
  • conversation_id (str) – Identifier grouping turns in a conversation.

  • turn_number (int) – Position within the conversation (0 = first).

  • query (str) – The user query text.

  • expected_answer (str) – The expected/reference answer.

  • tool_name (str, optional) – MCP tool expected to be invoked.

  • source_info (str, optional) – JSON metadata about how the pair was generated.

Returns:

Primary key of the inserted row.

Return type:

int

Raises:

DatabaseError – If insertion fails.

get_eval_qa_pairs(verified_only=False, tool_name=None, conversation_id=None, limit=None, offset=0)[source]

Retrieve evaluation Q/A pairs with optional filters.

Parameters:
  • verified_only (bool) – If True, return only pairs with verified == 1.

  • tool_name (str, optional) – Filter by expected MCP tool name.

  • conversation_id (str, optional) – Filter by conversation.

  • limit (int, optional) – Maximum number of pairs to return.

  • offset (int) – Number of rows to skip (for pagination).

Returns:

Matching Q/A pairs as dictionaries.

Return type:

list of dict

Raises:

DatabaseError – If query fails.

get_eval_qa_pair_count(verified_only=False)[source]

Count evaluation Q/A pairs.

Parameters:

verified_only (bool) – If True, count only verified pairs.

Returns:

Number of matching pairs.

Return type:

int

Raises:

DatabaseError – If query fails.

update_eval_qa_pair(pair_id, **fields)[source]

Update fields on an existing Q/A pair.

Parameters:
  • pair_id (int) – Primary key of the pair to update.

  • **fields – Keyword arguments mapping column names to new values. Supported keys: query, expected_answer, tool_name, verified, source_info.

Returns:

True if a row was updated, False if the pair was not found.

Return type:

bool

Raises:

DatabaseError – If update fails.

delete_eval_qa_pair(pair_id)[source]

Delete an evaluation Q/A pair by ID.

Parameters:

pair_id (int) – Primary key of the pair to delete.

Returns:

True if a row was deleted, False if pair was not found.

Return type:

bool

Raises:

DatabaseError – If deletion fails.

delete_verified_eval_qa_pairs()[source]

Delete all verified (accepted) evaluation Q/A pairs.

Returns:

Number of pairs deleted.

Return type:

int

Raises:

DatabaseError – If deletion fails.

delete_eval_results(run_id=None)[source]

Delete stored evaluation results, optionally filtered to a single run.

Parameters:

run_id (str, optional) – If supplied, only results for this run are deleted. If None, all stored results are deleted.

Returns:

Number of rows deleted.

Return type:

int

Raises:

DatabaseError – If deletion fails.

add_eval_result(run_id, qa_pair_id, actual_answer=None, actual_tool_name=None, answer_score=None, tool_correct=None, latency_ms=None, error=None, judge_reasoning=None)[source]

Insert a single evaluation result.

Parameters:
  • run_id (str) – Identifier for the evaluation run.

  • qa_pair_id (int) – ID of the evaluated Q/A pair.

  • actual_answer (str, optional) – Answer produced by the RAG system.

  • actual_tool_name (str, optional) – MCP tool actually invoked.

  • answer_score (float, optional) – LLM-judged quality score (1–5).

  • tool_correct (int, optional) – 1 if the correct tool was used, 0 otherwise.

  • latency_ms (int, optional) – Query latency in milliseconds.

  • error (str, optional) – Error message if the query failed.

  • judge_reasoning (str, optional) – LLM judge’s reasoning for the score.

Returns:

Primary key of the inserted row.

Return type:

int

Raises:

DatabaseError – If insertion fails.

get_eval_results(run_id=None, limit=None, offset=0)[source]

Retrieve evaluation results with optional run filter.

Parameters:
  • run_id (str, optional) – Filter by evaluation run. If None, return results from all runs.

  • limit (int, optional) – Maximum number of results to return.

  • offset (int) – Number of rows to skip.

Returns:

Evaluation results as dictionaries.

Return type:

list of dict

Raises:

DatabaseError – If query fails.

get_eval_run_ids()[source]

Return distinct evaluation run IDs ordered by run time, oldest first.

The ordering is determined by the minimum created_at timestamp of all results in each run, so the most recent run appears last.

Returns:

Distinct run IDs ordered chronologically (oldest to newest).

Return type:

list of str

Raises:

DatabaseError – If query fails.

get_eval_run_summary(run_id)[source]

Compute summary statistics for an evaluation run.

Parameters:

run_id (str) – The evaluation run identifier.

Returns:

Dictionary with keys:

  • total : int – number of evaluated pairs

  • avg_score : float or None – mean answer quality score

  • tool_accuracy : float or None – fraction of correct tool selections

  • avg_latency_ms : float or None – mean latency

  • error_count : int – number of queries that produced errors

  • run_date : datetime or None – timestamp of the first result in the run

Return type:

dict

Raises:

DatabaseError – If query fails.
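
The summary statistics above map naturally onto a single SQL aggregation. A self-contained sketch against an in-memory SQLite table (column names follow the add_eval_result() parameters; the table layout here is illustrative, not the module's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE eval_results (
        run_id TEXT, answer_score REAL, tool_correct INTEGER,
        latency_ms INTEGER, error TEXT, created_at TEXT
    )
""")
conn.executemany(
    "INSERT INTO eval_results VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("run-1", 4.0, 1, 120, None, "2025-01-01T10:00:00"),
        ("run-1", 2.0, 0, 200, "timeout", "2025-01-01T10:01:00"),
    ],
)

# One aggregation produces every field of the summary dictionary.
row = conn.execute("""
    SELECT COUNT(*)                                           AS total,
           AVG(answer_score)                                  AS avg_score,
           AVG(tool_correct)                                  AS tool_accuracy,
           AVG(latency_ms)                                     AS avg_latency_ms,
           SUM(CASE WHEN error IS NOT NULL THEN 1 ELSE 0 END) AS error_count,
           MIN(created_at)                                    AS run_date
    FROM eval_results
    WHERE run_id = ?
""", ("run-1",)).fetchone()

summary = dict(zip(
    ["total", "avg_score", "tool_accuracy", "avg_latency_ms",
     "error_count", "run_date"], row))
```
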

export_papers_to_sqlite(output_path, conference, year)[source]

Export papers for a given conference and year to a standalone SQLite file.

The exported file includes hierarchical label cache and embeddings metadata rows. Clustering cache is not included — it is exported separately via export_clustering_cache_to_json().

Parameters:
  • output_path (Path) – Destination path for the SQLite file.

  • conference (str) – Conference name to export.

  • year (int) – Year to export.

Returns:

Number of papers exported.

Return type:

int

Raises:

DatabaseError – If the export fails or no papers are found.

import_papers_from_sqlite(sqlite_path, conference, year)[source]

Import papers for a given conference and year from a SQLite file.

Existing papers for the given conference/year are replaced (not merged). Clustering cache and hierarchical label cache entries that match the conference and year are replaced. Embeddings metadata is validated for consistency (the embedding model must match).

Parameters:
  • sqlite_path (Path) – Path to the source SQLite file.

  • conference (str) – Conference name being imported.

  • year (int) – Year being imported.

Returns:

Number of papers imported.

Return type:

int

Raises:

DatabaseError – If the import fails or the embedding model is inconsistent.

export_clustering_cache_to_json(conference, year)[source]

Export clustering cache entries matching conference and year as JSON.

Parameters:
  • conference (str) – Conference name to match.

  • year (int) – Year to match.

Returns:

A JSON-serialisable dictionary with an entries list. Each entry contains all ClusteringCache columns except id (auto-generated on import).

Return type:

dict

Raises:

DatabaseError – If the export fails.

import_clustering_cache_from_json(data, conference, year, overwrite_embedding_model=None)[source]

Import clustering cache entries from a JSON dictionary.

Existing clustering cache entries matching conference and year are deleted before importing the new entries.

Parameters:
  • data (dict) – Dictionary previously returned by export_clustering_cache_to_json().

  • conference (str) – Conference name for scoping the delete.

  • year (int) – Year for scoping the delete.

  • overwrite_embedding_model (str, optional) – When provided, the embedding_model field of every imported entry is replaced with this value. Use this when importing an artifact whose embedding model differs from the locally configured one (i.e. when --ignore-embedding-model-mismatch was passed) so that get_clustering_cache() can find the entries using the local model name.

Returns:

Number of cache entries imported.

Return type:

int

Raises:

DatabaseError – If the import fails.

Usage Examples

Basic Operations

from abstracts_explorer.database import DatabaseManager
from abstracts_explorer.plugin import LightweightPaper

# Database configuration comes from the config file (PAPER_DB variable)
db = DatabaseManager()

with db:
    db.create_tables()

    # Add a single paper
    paper = LightweightPaper(
        title="Example Paper",
        authors=["John Doe", "Jane Smith"],
        abstract="This is an example abstract.",
        session="Session 1",
        poster_position="P1",
        year=2025,
        conference="NeurIPS"
    )
    paper_uid = db.add_paper(paper)  # returns None if the paper is a duplicate

Searching Papers

with db:
    # Keyword search in title, abstract, or keywords
    results = db.search_papers(keyword="transformer")

    # Filter by sessions, years, or conferences (each matches ANY)
    results = db.search_papers(sessions=["Session 1"], years=[2024, 2025])

    # Search for authors within the papers' authors field
    authors = db.search_authors_in_papers(name="Doe")

Retrieving Data

with db:
    # Total number of papers
    total = db.get_paper_count()

    # Statistics, optionally filtered by year and conference
    stats = db.get_stats(year=2025)

    # Distinct values for filter dropdowns
    filters = db.get_filter_options()

Database Schema

create_tables() creates three tables: papers, embeddings_metadata, and clustering_cache.

papers Table

The papers table follows the lightweight ML4PS schema; its columns mirror the LightweightPaper fields (types shown are indicative for the SQLite backend):

Column            Type      Description
uid               TEXT      Unique paper identifier (primary key)
title             TEXT      Paper title
authors           TEXT      Author names, stored on the paper row
abstract          TEXT      Paper abstract
session           TEXT      Conference session
poster_position   TEXT      Poster position
year              INTEGER   Conference year
conference        TEXT      Conference name

There is no separate authors table: author data lives in the papers' authors field and is queried via search_authors_in_papers().

Other Tables

embeddings_metadata stores the embedding model name and creation date (see get_embedding_model() and set_embedding_model()). clustering_cache stores cached clustering results (see get_clustering_cache() and save_clustering_cache()).

Error Handling

The database module wraps failures in DatabaseError (with EmbeddingModelConflictError raised for embedding-model mismatches during import). Wrap operations in try-except blocks:

from abstracts_explorer.database import DatabaseError

try:
    with db:
        db.add_paper(paper)
except DatabaseError as e:
    print(f"Database error: {e}")

Note that duplicates do not raise: add_paper() returns None when a paper is skipped as a duplicate.