# Local GPU Job Queue
⚠️ **ARCHIVED (2026-02-01):** Queue system deprecated. All LLM inference now runs through the Claude CLI (Max plan); the GPU is retained for embeddings only.
## Historical Overview
A SQLite-backed job queue that scheduled local Ollama inference with GPU acceleration, since superseded by the Claude CLI.
## Current Architecture
All LLM processing now routes through:
- **Claude CLI (Max plan)** - Unlimited usage, handles all LLM tasks
## GPU Still Used For
- **Embeddings** - sentence-transformers/all-MiniLM-L6-v2 (CUDA accelerated)
- **Vector operations** - ChromaDB similarity search
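The embedding path above reduces to encoding text into dense vectors (in this stack, via sentence-transformers' `model.encode(...)`) and ranking candidates by cosine similarity, which is what the ChromaDB search computes. A minimal NumPy sketch of the similarity step; the function name and toy vectors are illustrative, not actual project code:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dim "embeddings"; all-MiniLM-L6-v2 actually produces 384-dim vectors.
query = np.array([0.1, 0.8, 0.3, 0.0])
doc_a = np.array([0.1, 0.7, 0.4, 0.1])   # close to the query in direction
doc_b = np.array([0.9, 0.0, 0.0, 0.2])   # points elsewhere

# The more similar document scores higher, so it ranks first in a search.
print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```

ChromaDB handles this ranking internally over the stored vectors; the sketch only shows what the similarity metric rewards.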
## See Also
- Knowledge RAG System - Current architecture
- PatentBot Changelog - Migration details