PatentBot

An AI-powered system that ingests engineering/scientific literature, builds domain expertise, and identifies opportunities for novel patents by finding gaps, combinations, and unexplored intersections in the knowledge space.

Vision

Turn a book library into a patent factory.

Many patents come from combining existing ideas in novel ways. PatentBot:

Ingests technical literature (PDFs, papers, textbooks)
Builds structured knowledge graphs per domain
Identifies “white space” — unexplored combinations and gaps
Generates patent-worthy invention disclosures

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     KNOWLEDGE LAYER                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ Engineering  │    │   Physics/   │    │  Materials   │  │
│  │   Corpus     │    │   Quantum    │    │   Science    │  │
│  │  (1000+ PDFs)│    │   Corpus     │    │   Corpus     │  │
│  └──────┬───────┘    └──────┬───────┘    └──────┬───────┘  │
│         │                   │                   │          │
│         └───────────────────┼───────────────────┘          │
│                             ▼                              │
│                    ┌────────────────┐                      │
│                    │  RAG Pipeline  │                      │
│                    │  (Embeddings)  │                      │
│                    └────────┬───────┘                      │
│                             │                              │
└─────────────────────────────┼──────────────────────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     ANALYSIS LAYER                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │  Knowledge   │    │    Gap       │    │   Prior Art  │  │
│  │    Graph     │◄──►│  Detection   │◄──►│    Search    │  │
│  │  Extractor   │    │   Engine     │    │    Agent     │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│                             │                              │
└─────────────────────────────┼──────────────────────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     OUTPUT LAYER                            │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────────────────────────────────────────────┐  │
│  │              Patent Disclosure Generator              │  │
│  │                                                       │  │
│  │  • Title & Abstract                                   │  │
│  │  • Problem Statement                                  │  │
│  │  • Novel Solution                                     │  │
│  │  • Claims (Independent + Dependent)                   │  │
│  │  • Prior Art References                               │  │
│  │  • Figures/Diagrams (conceptual)                      │  │
│  └──────────────────────────────────────────────────────┘  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Core Components

1. Corpus Ingestion

Input: PDF/EPUB technical books, papers, patents
Processing: OCR → chunking → embeddings
Storage: Vector DB (local ChromaDB or Qdrant)
Metadata: Domain tags, publication date, citation graph
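
The chunking step above can be sketched roughly as follows. This is a minimal illustration, assuming text has already been extracted (OCR is out of scope here); the `Chunk` type, the `chunk_text` helper, and the 200-word window with 40-word overlap are hypothetical choices for illustration, not settings taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrievable unit of text plus a subset of the metadata listed above."""
    text: str
    source: str   # e.g. file name of the PDF/EPUB (hypothetical field)
    domain: str   # domain tag, e.g. "engineering"
    index: int    # position of the chunk within its source


def chunk_text(text: str, source: str, domain: str,
               size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(words), step)):
        window = words[start:start + size]
        chunks.append(Chunk(" ".join(window), source, domain, i))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each resulting `Chunk.text` would then be passed to the embedding model and stored in the vector DB together with its metadata; overlapping windows are one common way to avoid cutting a statement in half at a chunk boundary.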

2. Knowledge Graph Extraction

Extract entities: materials, processes, properties, applications
Map relationships: “X enables Y”, “A improves B”, “C replaces D”
Build domain ontology automatically from corpus
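
As a rough illustration of the relationship mapping above, the sketch below pulls (head, relation, tail) triples out of plain sentences with a fixed pattern. It is an assumption-laden toy: the function names are hypothetical, only the three example verbs are recognized, and a real extractor would more likely use an LLM or an NLP parser.

```python
import re
from collections import defaultdict

# Relation verbs taken from the examples above (assumption: one relation per sentence).
RELATION = re.compile(
    r"(?P<head>[A-Za-z][\w\- ]*?)\s+"
    r"(?P<rel>enables|improves|replaces)\s+"
    r"(?P<tail>[A-Za-z][\w\- ]*)"
)


def extract_triples(text):
    """Return (head, relation, tail) tuples, lower-cased, one per matching sentence."""
    triples = []
    for sentence in re.split(r"[.;\n]+", text):
        match = RELATION.search(sentence.strip())
        if match:
            triples.append((match["head"].strip().lower(),
                            match["rel"],
                            match["tail"].strip().lower()))
    return triples


def build_graph(triples):
    """Group triples into an adjacency map: head -> {(relation, tail), ...}."""
    graph = defaultdict(set)
    for head, rel, tail in triples:
        graph[head].add((rel, tail))
    return dict(graph)
```

The adjacency map is small enough to load into NetworkX or Neo4j later; keeping the relation label on each edge preserves the difference between "enables" and "replaces".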

3. Gap Detection Engine

Combination gaps: A+B exists, B+C exists, but A+C doesn’t
Property gaps: Material X has properties P,Q but no one’s tested R
Application gaps: Technique T used in domain D1 but not D2
Temporal gaps: Old idea + new capability = new invention
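
The combination-gap rule above (A+B exists, B+C exists, but A+C does not) amounts to finding open triads in a graph of known combinations. A minimal sketch, assuming known combinations arrive as unordered pairs of concept names; `combination_gaps` is a hypothetical helper, not the project's algorithm.

```python
from itertools import combinations


def combination_gaps(known_pairs):
    """Map each missing pair (A, C) to the set of bridging concepts B.

    A pair counts as a gap when A+B and B+C are known for some B,
    but A+C itself is not among the known pairs.
    """
    known = {frozenset(pair) for pair in known_pairs}
    neighbours = {}
    for a, b in known_pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    gaps = {}
    for bridge, partners in neighbours.items():
        # Every two partners of the same bridge form a candidate A+C pair.
        for a, c in combinations(sorted(partners), 2):
            if frozenset((a, c)) not in known:
                gaps.setdefault((a, c), set()).add(bridge)
    return gaps
```

Recording the bridging concepts alongside each gap gives a reviewer the reason the pair was proposed; pairs with several bridges could plausibly be ranked higher, though that ranking is an assumption rather than something the design specifies.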

4. Prior Art Agent

Search USPTO, Google Patents, arXiv
Validate novelty before generating disclosure
Find closest prior art for differentiation
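
To make the novelty check concrete, here is a small sketch that ranks prior-art texts by word overlap with a candidate idea. It assumes the prior-art abstracts have already been fetched; Jaccard similarity stands in for whatever similarity measure the agent would actually use (embeddings are the more likely choice), and the function names and the 0.5 threshold are hypothetical.

```python
import re


def tokens(text):
    """Lower-cased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the two texts' token sets (0.0 when both are empty)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def closest_prior_art(candidate, prior_art, top_k=3):
    """Return up to top_k (score, reference_id) pairs, most similar first."""
    scored = [(jaccard(candidate, text), ref) for ref, text in prior_art.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


def looks_novel(candidate, prior_art, threshold=0.5):
    """True when no prior-art text reaches the (assumed) similarity threshold."""
    ranked = closest_prior_art(candidate, prior_art, top_k=1)
    return not ranked or ranked[0][0] < threshold
```

The ranked list doubles as the "closest prior art for differentiation" input: the top entries are the references a disclosure would need to distinguish itself from. A low overlap score is only a screening signal, not evidence of legal novelty.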

5. Disclosure Generator

LLM-powered patent drafting
Structured output: claims, abstract, description
Human review workflow
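
One way to picture the structured output and the review gate is a pair of small data classes. This is a sketch under assumptions: the `Claim` and `Disclosure` types, their fields, and the status strings are hypothetical, and the checks cover only basic claim structure, not legal sufficiency.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Claim:
    number: int
    text: str
    depends_on: Optional[int] = None  # None marks an independent claim


@dataclass
class Disclosure:
    title: str
    abstract: str
    description: str
    claims: list[Claim] = field(default_factory=list)
    prior_art_refs: list[str] = field(default_factory=list)
    status: str = "draft"  # changed only through approve()

    def validate(self) -> list[str]:
        """Return a list of structural problems (empty when the draft is well-formed)."""
        problems = []
        numbers = {c.number for c in self.claims}
        if not any(c.depends_on is None for c in self.claims):
            problems.append("no independent claim")
        for c in self.claims:
            # A dependent claim must refer back to an earlier, existing claim.
            if c.depends_on is not None and (
                c.depends_on not in numbers or c.depends_on >= c.number
            ):
                problems.append(f"claim {c.number} depends on invalid claim {c.depends_on}")
        if not self.prior_art_refs:
            problems.append("no prior art references")
        return problems

    def approve(self, reviewer: str) -> None:
        """Human sign-off; refuses drafts that fail validation."""
        problems = self.validate()
        if problems:
            raise ValueError("; ".join(problems))
        self.status = f"approved by {reviewer}"
```

Asking the LLM to fill a fixed structure like this, then validating it before a person sees it, keeps malformed drafts out of the review queue.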

Sub-Projects

🏺 Pharmakon Mining

Cross-axis patent discovery from Ancient Greek pharmacology. The sub-project mines pharmaceutical knowledge up to 2,500 years old (theriac, the Greek Magical Papyri (PGM), Dioscorides) and maps it to modern science to find novel patentable formulations.

Target Domains (Phase 1)

Domain	Corpus Size	Patent Potential
Engineering	1,000+ books	High — mechanical, electrical, systems
Quantum Tech	150+ books	Very High — emerging field
Materials Science	TBD	High — nanotechnology, composites
Biohacking/Biotech	~50 books	Medium — some regulatory hurdles
Ancient Greek	1,135 chunks	Very High — unexplored pharmakon

Tech Stack

Component	Technology
Embeddings	sentence-transformers/all-MiniLM-L6-v2 (CUDA)
Vector Store	ChromaDB (721K+ chunks)
Knowledge Graph	Neo4j or NetworkX
LLM	Claude CLI (Max plan, subject to its usage limits)
Prior Art Search	USPTO API, Google Patents (no official search API; e.g. via its BigQuery public dataset)
Orchestration	Python + custom pipeline

Development Phases

Phase 1: Foundation (MVP)

Set up RAG pipeline with engineering corpus
Basic entity extraction (materials, processes)
Simple gap detection: “What combinations don’t exist?”
Manual patent drafting from insights

Phase 2: Intelligence

Knowledge graph construction
Prior art search integration
Automated novelty scoring
Disclosure template generation

Phase 3: Automation

Full disclosure generator
Patent attorney review workflow
Filing assistance
Portfolio management

Success Metrics

Disclosures generated: Target 10+/month after Phase 2
Novelty rate: >50% pass prior art check
Filing rate: 1-2 provisional patents/quarter
Time to disclosure: <4 hours from gap identification
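
The targets above can be checked mechanically once disclosures are logged. A minimal sketch, assuming one record per disclosure for a given month; the `DisclosureRecord` type is hypothetical, and reading the time target as "every disclosure under 4 hours" (rather than an average) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DisclosureRecord:
    passed_prior_art: bool       # result of the prior art check
    hours_to_disclosure: float   # time from gap identification to finished draft


def novelty_rate(records):
    """Fraction of disclosures that passed the prior art check (0.0 if none logged)."""
    if not records:
        return 0.0
    return sum(r.passed_prior_art for r in records) / len(records)


def meets_targets(records, min_per_month=10, min_novelty=0.5, max_hours=4.0):
    """Compare one month of records against the targets listed above."""
    return {
        "volume": len(records) >= min_per_month,
        "novelty": novelty_rate(records) > min_novelty,
        "speed": all(r.hours_to_disclosure < max_hours for r in records),
    }
```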

Revenue Model

Own patents: file provisional applications, then license or sell
Service: generate disclosures for clients ($500-$2,000 per disclosure)
SaaS: patent discovery platform for R&D teams

Risks & Mitigations

Risk	Mitigation
Patent quality	Human expert review before filing
Prior art miss	Multiple search sources, conservative claims
Corpus gaps	Continuous expansion, paper scraping
LLM hallucination	RAG grounding, citation requirements
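
The "citation requirements" mitigation could be enforced with a simple post-generation check like the one below. It is a sketch under assumptions: the `[chunk:ID]` marker format and the `ungrounded_sentences` helper are hypothetical, and the check only confirms that a citation is present and points at a retrieved chunk, not that the chunk actually supports the sentence.

```python
import re

# Hypothetical inline citation marker, e.g. "[chunk:c42]".
CITATION = re.compile(r"\[chunk:(\w+)\]")


def ungrounded_sentences(draft, retrieved_ids):
    """Return (sentence, reason) pairs for sentences lacking a valid citation."""
    problems = []
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        if not sentence:
            continue
        cited = CITATION.findall(sentence)
        if not cited:
            problems.append((sentence, "no citation"))
        elif any(chunk_id not in retrieved_ids for chunk_id in cited):
            problems.append((sentence, "cites unknown chunk"))
    return problems
```

Flagged sentences can be sent back for regeneration or routed to the human expert review listed in the first row.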

Related Projects

Knowledge RAG System: Foundation
Book Library: Source corpus
NTS: Potential service offering

Next Steps

Build engineering corpus in RAG system
Test entity extraction on 10 sample books
Prototype gap detection algorithm
Generate first 5 invention concepts manually
Validate novelty with prior art search

“The best way to predict the future is to invent it.” — Alan Kay

1 item under this folder.

Feb 02, 2026
🏺 Pharmakon Mining - Ancient Greek Patent Discovery
