Knowledge RAG System Setup

Date: 2026-02-01
Status: ✅ Operational
LLM Provider: Claude CLI (Max plan - unlimited)

Architecture

Paperless-ngx (OCR/Index) → Document Export → ChromaDB (GPU Embeddings) → Query API
                                                                            ↓
                                                                      Claude CLI
                                                                   (Max plan LLM)

Pipeline Changes (2026-02-01)

Removed: Direct Anthropic API (credits-based)
Added: Claude CLI (Max plan - unlimited) as LLM provider

Benefits:

  • Unlimited usage via Max subscription
  • Consistent quality across all domains
  • No API key/credits management
  • Unified CLI for all knowledge systems (PatentBot, RAG, domains)
  • GPU dedicated to embeddings only

Components

1. Paperless-ngx (Document Management)

  • Location: Docker container paperless
  • Web UI: http://localhost:8000
  • Consume folder: /home/shdwdev/docker-services/paperless/consume/
  • API Token: 291bc34a562298e93b720ae08e0e02988b0dfe00

Docker compose: /home/shdwdev/docker-services/paperless/docker-compose.yml

Key features:

  • OCR processing with Tesseract
  • Full-text search
  • Tag-based organization
  • REST API for document retrieval
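
The REST API can be exercised from Python with nothing beyond the standard library. A minimal sketch, assuming token auth against the `/api/documents/` endpoint (helper names are illustrative, not from knowledge_rag.py):

```python
import json
import os
import urllib.request


def paperless_request(path, base_url=None, token=None):
    """Build an authenticated request for the Paperless-ngx REST API."""
    base_url = base_url or os.environ.get("PAPERLESS_URL", "http://localhost:8000")
    token = token or os.environ.get("PAPERLESS_TOKEN", "")
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={"Authorization": f"Token {token}"},
    )


def list_documents():
    """Fetch the first page of indexed documents."""
    req = paperless_request("/api/documents/")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]
```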

2. Knowledge RAG System

  • Location: /home/shdwdev/projects/knowledge-rag/
  • Python venv: ./venv/ (5.2GB with PyTorch CUDA)
  • ChromaDB: ./chroma_db/ (4.5GB+ with 721K+ chunks)
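
Documents are split into chunks before embedding. The actual splitter lives in knowledge_rag.py; the sketch below is a generic overlapping word-window version (window and overlap sizes are illustrative assumptions):

```python
def chunk_text(text, size=200, overlap=40):
    """Split text into overlapping word windows for embedding.

    size/overlap are counted in words here; production pipelines
    often count model tokens instead.
    """
    words = text.split()
    if not words:
        return []
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks
```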

Environment (.env):

PAPERLESS_URL=http://localhost:8000
PAPERLESS_TOKEN=291bc34a562298e93b720ae08e0e02988b0dfe00
CHROMA_PATH=./chroma_db

3. Claude CLI (LLM Provider)

  • Provider: Claude CLI (Max plan subscription)
  • Location: ~/.local/bin/claude
  • Use Cases: RAG queries, entity extraction, gap detection, reasoning
  • Note: No API key needed - uses Max subscription auth
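
Routing a RAG query through the CLI amounts to a plain subprocess call. A sketch, assuming the CLI's non-interactive print mode (`claude -p`); the prompt builder is illustrative, not the one in knowledge_rag.py:

```python
import subprocess


def build_rag_prompt(question, chunks):
    """Assemble a grounded prompt from retrieved chunks."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, 1))
    return (
        "Answer using only the context below. Cite chunk numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def ask_claude(prompt):
    """Send the prompt to the Claude CLI in non-interactive mode."""
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```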

4. GPU Acceleration (Embeddings Only)

  • GPU: NVIDIA GeForce RTX 2060 SUPER (8GB VRAM)
  • CUDA: 13.0
  • Driver: 580.119.02
  • PyTorch: 2.5.1+cu121
  • Embedding Model: sentence-transformers/all-MiniLM-L6-v2 (CUDA)
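
The embedding step is a batched encode on the CUDA device; retrieval then ranks chunks by cosine similarity. A sketch (the `embed` call assumes sentence-transformers is installed; the pure `cosine` helper works anywhere):

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def embed(texts, device="cuda"):
    """Encode texts with the model named above, on the GPU by default."""
    # Lazy import so cosine() is usable without the CUDA stack installed.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", device=device)
    return model.encode(texts, batch_size=64, show_progress_bar=False)
```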

Usage

Sync Documents from Paperless

cd /home/shdwdev/projects/knowledge-rag
source venv/bin/activate
export $(cat .env | xargs)
python3 knowledge_rag.py sync

List Indexed Documents

python3 knowledge_rag.py list

Search Knowledge Base

python3 knowledge_rag.py search "OSINT social media techniques"

RAG Query

python3 knowledge_rag.py query "What are the best OSINT techniques for social media?"

Python API

from knowledge_rag import KnowledgeBase
 
kb = KnowledgeBase()
kb.sync_from_paperless()  # Sync new documents
 
# Search only (no LLM)
results = kb.search("Greek mystery rites", n_results=5)
 
# Full RAG with Claude
answer = kb.query("What herbs were used in ancient Greek rituals?")
print(answer["answer"])
print(answer["sources"])

Current Corpus (as of 2026-02-01)

Total chunks: 721,275+

Corpus            Chunks
science          554,335
engineering       87,746
tech              43,067
math              26,311
knowledge_base     7,127
greek              1,135
esoteric             846
quantum_biology      555
biohacking           153

Performance

Operation                 GPU (RTX 2060)
Embed 1000 docs           ~2 minutes
Search query              ~200ms
RAG query (Claude CLI)    ~2-5 seconds
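
These latencies can be reproduced with a small timing wrapper (helper name is illustrative):

```python
import time


def timed(fn, *args, **kwargs):
    """Return a call's result and its wall-clock latency in milliseconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms
```

Usage: `results, ms = timed(kb.search, "Greek mystery rites", n_results=5)`.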

Maintenance

Re-sync after new Paperless documents

python3 knowledge_rag.py sync

(Only indexes new/modified documents)

Clear and rebuild index

rm -rf ./chroma_db
python3 knowledge_rag.py sync

Troubleshooting

GPU not detected

python3 -c "import torch; print(torch.cuda.is_available())"
# Should print: True

Claude CLI not found

which claude  # Should show ~/.local/bin/claude
claude --version  # Verify working

If missing, install via: npm install -g @anthropic-ai/claude-code

Paperless API 403 Forbidden

Regenerate token:

docker exec paperless python3 manage.py shell -c "
from django.contrib.auth.models import User
from rest_framework.authtoken.models import Token
user = User.objects.filter(is_superuser=True).first()
Token.objects.filter(user=user).delete()
token = Token.objects.create(user=user)
print(token.key)"