
Bookshelf — Technical Overview

Purpose

Photo-based book cataloger. Hierarchy: Room → Cabinet → Shelf → Book. AI plugins identify spine text; archive plugins supply bibliographic metadata.

Stack

  • Server: FastAPI + SQLite (no ORM), Python 3.11+, Poetry (poetry run serve)
  • Frontend: Vanilla JS SPA — static/index.html + static/css/ + static/js/; no build step
  • AI: OpenAI-compatible API (OpenRouter, OpenAI, etc.) via openai library
  • Images: Stored uncompressed in data/images/; Pillow used server-side for crops and AI prep

Directory Layout

src/
  app.py                        # FastAPI app, exception handlers
  api.py                        # All routes (APIRouter)
  db.py                         # All SQL; connection() / transaction() context managers
  files.py                      # Image file I/O; DATA_DIR, IMAGES_DIR
  config.py                     # Config loading and typed AppConfig
  models.py                     # Typed dataclasses / mashumaro decoders
  errors.py                     # Domain exceptions (NotFoundError, BadRequestError subtypes)
  log_thread.py                 # Thread-safe logging context (ContextVar + event-loop bridge for executor threads)
  logic/
    __init__.py                 # dispatch_plugin() orchestrator + re-exports
    boundaries.py               # Boundary math, shelf/spine crop sources, boundary detector runner
    identification.py           # Status computation, text recognizer, book identifier runners
    archive.py                  # Archive searcher runner (sync + background)
    batch.py                    # Batch queue consumer (run_batch_consumer); queue persisted in batch_queue DB table
    ai_log.py                   # AI request ring buffer + WebSocket pub-sub (log_start/log_finish/notify_entity_update); persisted to ai_log table
    images.py                   # crop_save, prep_img_b64, serve_crop
  migrate.py                    # DB migration; run_migration() called at startup
  plugins/
    __init__.py                 # Registry: load_plugins(), get_plugin(), get_manifest(), get_all_text_recognizers(), get_all_book_identifiers(), get_all_archive_searchers()
    rate_limiter.py             # Thread-safe per-domain rate limiter
    ai_compat/                  # AI plugin implementations
    archives/                   # Archive plugin implementations
scripts/
  presubmit.py                  # Poetry console entry points: fmt, presubmit
static/
  index.html                    # HTML shell + CSS/JS imports (load order matters)
  css/                          # base, layout, tree, forms, overlays
  js/                           # state → helpers → api → canvas-boundary → tree-render →
                                #   detail-render → canvas-crop → editing → photo → events → init
config/
  credentials.default.yaml      # API endpoints and keys (override in credentials.user.yaml)
  models.default.yaml           # Model selection and prompts per AI function
  functions.default.yaml        # Plugin definitions and per-plugin settings
  ui.default.yaml               # UI display settings
  *.user.yaml                   # Gitignored overrides — create these with real values
data/                           # Runtime: books.db + images/ (gitignored)
tests/
  *.py                          # Python tests (pytest)
  js/pure-functions.test.js     # JS tests (node:test)
docs/
  overview.md                   # This file
  contributing.md               # Documentation and contribution standards

Layer Architecture

Unidirectional: api → logic → db / files. No layer may import from a layer above it.

  • api: HTTP parsing, entity existence checks via db.connection(), calls logic, returns HTTP responses. Owns HTTPException and status codes.
  • logic: Business operations, no HTTP/FastAPI imports. Raises domain exceptions from errors.py for expected failures.
  • db / files: SQL and file I/O only. Returns typed dataclasses or None. Never raises domain exceptions.

Configuration System

Config loaded from config/*.default.yaml merged with config/*.user.yaml. Deep merge: dicts recursive, lists replaced. Typed via mashumaro BasicDecoder[AppConfig].
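
The merge rule (dicts recurse, lists replace) can be sketched as follows; the helper name is illustrative, not the actual loader code:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base: dicts merge key-by-key,
    everything else (including lists) is replaced wholesale."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # lists and scalars replace, never concatenate
    return merged

defaults = {"ui": {"spine_padding_pct": 2, "tags": ["a", "b"]}}
user = {"ui": {"tags": ["c"]}}
merged = deep_merge(defaults, user)  # → {"ui": {"spine_padding_pct": 2, "tags": ["c"]}}
```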

Categories:

  • credentials: base_url + api_key per endpoint; no model or prompt
  • models: credentials ref + model string + optional extra_body + prompt
  • functions: plugin definitions; dict key = plugin_id (unique across all categories)
  • ui: frontend display settings (boundary_grab_px, spine_padding_pct, ai_log_max_entries)

Minimal setup — create config/credentials.user.yaml:

credentials:
  openrouter:
    api_key: "sk-or-your-actual-key"

Plugin System

Categories

Each category maps input → output → DB field:

  • boundary_detectors (target=shelves): cabinet image → {boundaries:[…], confidence:N} → cabinets.ai_shelf_boundaries
  • boundary_detectors (target=books): shelf image → {boundaries:[…]} → shelves.ai_book_boundaries
  • text_recognizers: spine image → {raw_text, title, author, …} → books.raw_text + candidates
  • book_identifiers: raw_text + archive results + optional images → [{title, author, …, score, sources}, …] → books.ai_blocks + books.ai_*
  • archive_searchers: query string → [{source, title, author, …}, …] → books.candidates

Identification pipeline (POST /api/books/{id}/identify)

Single endpoint runs the full pipeline in sequence:

  1. VLM text recognizer reads the spine image → raw_text and structured fields.
  2. All archive searchers run in parallel with title+author and title-only queries.
  3. Archive results are deduplicated by normalized full-field match (case-insensitive, punctuation removed, spaces collapsed).
  4. Main identifier model receives raw_text, deduplicated archive results, and (if is_vlm: true) spine + title-page images. Returns ranked IdentifyBlock list.
  5. ai_blocks stored persistently in the DB (never cleared; overwritten each pipeline run). Top block updates ai_* fields if score ≥ confidence_threshold.
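
A minimal sketch of the kind of normalization step 3 describes (case-insensitive, punctuation removed, spaces collapsed); the function names and field list are illustrative, not the actual implementation:

```python
import re
import string

def normalize(value: str) -> str:
    """Case-fold, strip punctuation, collapse whitespace runs to one space."""
    value = value.lower()
    value = value.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", value).strip()

def dedup_key(record: dict) -> tuple:
    """Archive results whose normalized fields all match count as duplicates."""
    fields = ("title", "author", "year", "isbn", "publisher")
    return tuple(normalize(str(record.get(f, ""))) for f in fields)
```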

In the functions config, add is_vlm: true to book_identifier entries whose models accept images.

Universal plugin endpoint

POST /api/{entity_type}/{entity_id}/plugin/{plugin_id}

Routes to the correct runner via dispatch_plugin() in logic/__init__.py.
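
A hypothetical sketch of category-based dispatch; the registry shape and Plugin fields are assumptions, the real orchestrator lives in logic/__init__.py:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Plugin:
    plugin_id: str
    category: str
    run: Callable[[str, int], dict]  # (entity_type, entity_id) -> result

# Illustrative registry; the real one is built by load_plugins() in plugins/__init__.py.
_REGISTRY: dict[str, Plugin] = {}

def dispatch_plugin(entity_type: str, entity_id: int, plugin_id: str) -> dict:
    """Look up the plugin and hand off to its runner; a missing id would
    raise a domain error (NotFoundError) in the real code."""
    return _REGISTRY[plugin_id].run(entity_type, entity_id)

_REGISTRY["demo_detector"] = Plugin(
    "demo_detector", "boundary_detectors", lambda et, eid: {"boundaries": [0.5]}
)
dispatch_plugin("cabinets", 1, "demo_detector")  # → {"boundaries": [0.5]}
```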

AI Plugin Configuration

  • credentials file: connection only — base_url, api_key.
  • models file: credentials ref, model string, prompt text, optional extra_body.
  • functions file: per-plugin settings — model, max_image_px (default 1600), confidence_threshold (default 0.8), auto_queue, rate_limit_seconds, timeout.
  • OUTPUT_FORMAT is a hardcoded class constant in each plugin — not user-configurable; injected into the prompt as ${OUTPUT_FORMAT} by AIClient.
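
The ${OUTPUT_FORMAT} injection can be illustrated with stdlib string.Template (a sketch, not the actual AIClient code; the prompt text and format string are made up):

```python
from string import Template

# Hypothetical class constant; each real plugin hardcodes its own OUTPUT_FORMAT.
OUTPUT_FORMAT = '{"raw_text": str, "title": str, "author": str}'

prompt = Template("Read the spine image. Respond strictly as JSON: ${OUTPUT_FORMAT}")
rendered = prompt.safe_substitute(OUTPUT_FORMAT=OUTPUT_FORMAT)
```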

Archive plugins

All implement search(query: str) -> list[CandidateRecord]. Use shared RATE_LIMITER singleton for per-domain throttling.
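
A hedged skeleton of a plugin under this contract; the CandidateRecord fields are simplified and the archive domain is a placeholder:

```python
from dataclasses import dataclass

@dataclass
class CandidateRecord:  # simplified stand-in for the models.py dataclass
    source: str
    title: str
    author: str

class ExampleArchive:
    """Hypothetical archive searcher; real implementations live in plugins/archives/."""
    DOMAIN = "archive.example.org"

    def search(self, query: str) -> list[CandidateRecord]:
        # RATE_LIMITER.wait(self.DOMAIN)  # shared per-domain throttle from rate_limiter.py
        # ... perform the HTTP request and parse the response here ...
        return [CandidateRecord(source=self.DOMAIN, title=query, author="")]
```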

Auto-queue

  • After text_recognizer completes → fires all archive_searchers with auto_queue: true in background thread pool.
  • POST /api/batch → adds all unidentified books to the batch_queue DB table; starts run_batch_consumer() if not already running. Calling again while running adds newly-unidentified books to the live queue.
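
Because book_id is the primary key of batch_queue, deduplication falls out of plain SQLite; a minimal sketch (helper name and connection handling are illustrative):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batch_queue (book_id INTEGER PRIMARY KEY, added_at REAL)")

def add_to_queue(book_ids: list[int]) -> int:
    """Insert books, silently skipping ones already queued; return the count added."""
    before = conn.execute("SELECT COUNT(*) FROM batch_queue").fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO batch_queue (book_id, added_at) VALUES (?, ?)",
        [(b, time.time()) for b in book_ids],
    )
    after = conn.execute("SELECT COUNT(*) FROM batch_queue").fetchone()[0]
    return after - before

add_to_queue([1, 2, 3])  # → 3
add_to_queue([2, 3, 4])  # → 1 (only book 4 is new)
```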

Database Schema (key fields)

  • cabinets: shelf_boundaries (JSON […]), ai_shelf_boundaries (JSON {pluginId:[…]})
  • shelves: book_boundaries, ai_book_boundaries (same format), photo_filename (optional override)
  • books: raw_text, ai_title/author/year/isbn/publisher, candidates (JSON [{source,…}]), ai_blocks (JSON [{title,author,year,isbn,publisher,score,sources}]), identification_status
  • batch_queue: book_id (PK), added_at; persistent batch queue consumed in FIFO order by run_batch_consumer()

ai_blocks are persistent: set by the identification pipeline, shown in the book detail panel as clickable cards. Hidden by default for user_approved books.

DB Migration (src/migrate.py)

run_migration() is called at startup (after init_db()). Migrations:

  • _migrate_v1: adds the ai_blocks column if absent; clears stale AI fields (runs once only, not on every startup).
  • _migrate_v2: creates the batch_queue table if absent.

identification_status: unidentified → ai_identified → user_approved.

Boundary System

N interior boundaries → N+1 segments. full = [0] + boundaries + [1]. Segment K spans full[K]..full[K+1].

  • User boundaries: shelf_boundaries / book_boundaries (editable via canvas drag)
  • AI suggestions: ai_shelf_boundaries / ai_book_boundaries (JSON object {pluginId: [fractions]})
  • Shelf K image = cabinet photo cropped to (0, y_start, 1, y_end) unless shelf has override photo
  • Book K spine = shelf image cropped to (x_start, *, x_end, *) with composed crop if cabinet-based
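
The segment math above, as a runnable sketch (function name illustrative):

```python
def segments(boundaries: list[float]) -> list[tuple[float, float]]:
    """N interior boundaries (fractions in (0, 1)) → N+1 (start, end) spans."""
    full = [0.0] + sorted(boundaries) + [1.0]
    return list(zip(full, full[1:]))  # segment K spans full[K]..full[K+1]

segments([0.25, 0.6])  # → [(0.0, 0.25), (0.25, 0.6), (0.6, 1.0)]
```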

Frontend JS

No ES modules, no bundler. All files use global scope; load order in index.html is the dependency order. State lives in state.js (S, _plugins, _bnd, _photoQueue, _aiLog, _aiLogWs, etc.). Events delegated via #app in events.js.

connectAiLogWs() subscribes to /ws/ai-log on startup. Message types:

  • snapshot — full log on connect (_aiLog initialized)
  • update — single log entry added or updated (spinner count in header updated)
  • entity_update — entity data changed (tree node updated via walkTree; detail panel or full render depending on selection)
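
Server-side, the snapshot/update split can be sketched with a bounded deque (a sketch only; the real ai_log.py also persists entries and broadcasts over the WebSocket):

```python
from collections import deque
from itertools import count

AI_LOG_MAX_ENTRIES = 200  # mirrors the ui setting ai_log_max_entries; value illustrative
_ai_log: deque[dict] = deque(maxlen=AI_LOG_MAX_ENTRIES)
_ids = count(1)

def log_start(label: str) -> dict:
    """Append a running entry; the deque drops the oldest entry once full."""
    entry = {"id": next(_ids), "label": label, "status": "running"}
    _ai_log.append(entry)
    # the real code also pushes {"type": "update", ...} to /ws/ai-log subscribers
    return entry

def snapshot() -> list[dict]:
    """What a newly connected client receives as the snapshot payload."""
    return list(_ai_log)
```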

Tooling

poetry run serve       # start uvicorn on :8000
poetry run fmt         # black (in-place)
poetry run presubmit   # black --check + flake8 + pyright + pytest + JS tests
npm install            # install ESLint + Prettier (requires network; enables JS lint/fmt in presubmit)
npm run lint           # ESLint on static/js/
npm run fmt            # Prettier on static/js/

Line length: 120. Pyright strict mode. Pytest fixtures that yield are annotated as returning Iterator[T]. Test fixtures monkeypatch db.DB_PATH / files.DATA_DIR / files.IMAGES_DIR.

Key API Endpoints

GET    /api/config                                       # UI config + plugin manifest
GET    /api/tree                                         # full nested tree
POST   /api/{entity_type}/{entity_id}/plugin/{plugin_id} # universal plugin runner
PATCH  /api/cabinets/{id}/boundaries                     # update shelf boundary list
PATCH  /api/shelves/{id}/boundaries                      # update book boundary list
GET    /api/shelves/{id}/image                           # shelf image (override or cabinet crop)
GET    /api/books/{id}/spine                             # book spine crop
POST   /api/books/{id}/identify                          # full identification pipeline (VLM → archives → main model)
POST   /api/books/{id}/process                           # full auto-queue pipeline (single book)
POST   /api/batch / GET /api/batch/status                # batch processing
WS     /ws/batch                                         # batch progress push (replaces polling)
WS     /ws/ai-log                                        # AI request log: snapshot + update per request + entity_update on book changes
POST   /api/books/{id}/dismiss-field                     # dismiss a candidate suggestion
PATCH  /api/{kind}/reorder                               # drag-to-reorder
POST   /api/cabinets/{id}/crop / POST /api/shelves/{id}/crop  # permanent crop