Medical Corpus Staging
Staging medical texts for the cross-domain lexicon and Pharmakon Miner pipeline. Critical nexus for ancient→modern pharmacological mapping.
📊 Corpus Overview
| Source | Files | Est. Chunks | Priority |
| --- | --- | --- | --- |
| /Medicine/ | 171 | ~200k | 🔴 Critical |
| /Botany/ | 43 | ~50k | 🔴 Critical |
| /Biohacking/ | 7 | ~10k | 🟡 High |
| /History/Ancient/ (pharma) | 10+ | ~15k | 🔴 Critical |
| /Science/Biology-* | TBD | ~50k | 🟡 High |
Total estimated: ~325k chunks → medical_corpus
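The total can be checked by summing the per-source estimates from the overview. This is a minimal sketch: `ESTIMATED_CHUNKS` is an illustrative name, not an existing pipeline constant, and the values are the rough estimates above, in thousands.

```python
# Per-source chunk estimates (in thousands), copied from the corpus overview.
# ESTIMATED_CHUNKS is an illustrative name, not part of the pipeline.
ESTIMATED_CHUNKS = {
    "/Medicine/": 200,
    "/Botany/": 50,
    "/Biohacking/": 10,
    "/History/Ancient/ (pharma)": 15,
    "/Science/Biology-*": 50,
}

# Sum the estimates to cross-check the stated total.
total_k = sum(ESTIMATED_CHUNKS.values())
print(f"Total estimated: ~{total_k}k chunks")
```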
🗂️ Source Breakdown
Medicine (171 files)
Location: /mnt/storage/books_organized/Medicine/
Pharmacology & Drug Science
| Text | Focus |
| --- | --- |
| Comprehensive Toxicology 4ed | Toxicology reference |
| Lehne’s Pharmacology for Nursing Care 12ed | Drug mechanisms |
| Drug Metabolism and Pharmacokinetics | ADME processes |
| Cobert’s Manual of Drug Safety | Pharmacovigilance |
| Neuroimmune Pharmacology 3ed | Neuro-drug interactions |
Biochemistry & Physiology
| Text | Focus |
| --- | --- |
| Vander’s Human Physiology 16ed | Systems physiology |
| Paul’s Fundamental Immunology 8ed | Immune mechanisms |
| Textbook of Biochemistry (Lal) | Biochem foundations |
| Ross & Wilson Anatomy & Physiology 14ed | Structure-function |
| Gray’s Anatomy for Students 5ed | Anatomical reference |
Clinical Medicine
| Text | Focus |
| --- | --- |
| Oxford Handbook of Clinical Medicine 11ed | Clinical reference |
| CURRENT Medical Diagnosis & Treatment | Diagnostic guide |
| Harrison’s (if present) | Internal medicine |
| Miller’s Anesthesia 10ed (2-vol) | Anesthesiology |
Specialized
| Text | Focus |
| --- | --- |
| Photodynamic Therapy in Dermatology | Light therapy |
| Dark Matters - Circadian Rhythms | Chronobiology |
| Regenerative Medicine handbooks | Stem cells, tissue eng |
| Brain-Gut Connection | Gut-brain axis |
Botany (43 files)
Location: /mnt/storage/books_organized/Botany/
Herbal Medicine
| Text | Focus |
| --- | --- |
| Alchemy of Herbal Medicine Vol 1-2 | Traditional herbalism |
| Native American Herbal Remedies Encyclopedia | Indigenous medicine |
| Little Encyclopedia of Herbal Medicine | Quick reference |
| Herbal Medicines Survival Guide | Practical applications |
| Encyclopedia of Rare Drug Plants | Uncommon botanicals |
Ethnobotany & Psychedelics
| Text | Focus |
| --- | --- |
| Psilocybin Mushroom guides (10+) | Cultivation, identification |
| Hallucinogenic Plants Field Guide | Psychoactive botany |
| DMT Entities Illustrated Guide | Phenomenology |
| Ayurveda (DK) | Indian traditional medicine |
Plant Science
| Text | Focus |
| --- | --- |
| Chemistry of Plant-derived Natural Products | Phytochemistry |
| Encyclopedia of Cultivated Plants | Botanical reference |
| Plant Genetic Resources textbook | Genetics |
Biohacking (7 files)
Location: /mnt/storage/books_organized/Biohacking/
Biohacker’s Handbook - Sleep
Grindhouse Wetware
Kill Zombie Cells
Biohacking tech/kits guides
History/Ancient - Pharma-relevant
Location: /mnt/storage/books_organized/History/Ancient/
| Text | Focus |
| --- | --- |
| The Chemical Muse (Hillman) | Drugs in antiquity |
| Mushrooms, Myth & Mithras (Ruck) | Entheogens in religion |
| Greek Magical Papyri | Ritual pharmacology |
| Hermetica | Hermetic medicine |
🔗 Pharmakon Bridge Points
Greek→Modern Mapping Targets
| Ancient Concept | Modern Mapping | Corpus Source |
| --- | --- | --- |
| φάρμακον (drug/poison) | Pharmacology, toxicology | Medicine, Botany |
| θηριακή (theriac) | Polypharmacy, mithridatism | History/Ancient |
| κυκεών (kykeon) | Ergot alkaloids, entheogens | Botany (psilocybin) |
| βοτάνη (herb) | Phytochemistry, botanicals | Botany |
| χυμός (humor) | Biochemistry, homeostasis | Medicine |
| θεραπεία (therapy) | Clinical medicine | Medicine |
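The mapping targets above could be held as structured records so they can be filtered by corpus source. This is a sketch only: `BridgePoint`, `BRIDGE_POINTS` and `bridges_for_source` are hypothetical names, not part of the existing pipeline, and the rows are copied from the mapping targets (with "Botany (psilocybin)" reduced to "Botany").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgePoint:
    """One Greek→Modern mapping target (hypothetical record type)."""
    greek: str
    gloss: str
    modern: tuple
    corpus_sources: tuple


# Rows copied from the mapping targets in this section.
BRIDGE_POINTS = [
    BridgePoint("φάρμακον", "drug/poison", ("Pharmacology", "toxicology"), ("Medicine", "Botany")),
    BridgePoint("θηριακή", "theriac", ("Polypharmacy", "mithridatism"), ("History/Ancient",)),
    BridgePoint("κυκεών", "kykeon", ("Ergot alkaloids", "entheogens"), ("Botany",)),
    BridgePoint("βοτάνη", "herb", ("Phytochemistry", "botanicals"), ("Botany",)),
    BridgePoint("χυμός", "humor", ("Biochemistry", "homeostasis"), ("Medicine",)),
    BridgePoint("θεραπεία", "therapy", ("Clinical medicine",), ("Medicine",)),
]


def bridges_for_source(source):
    """Return the Greek terms whose mapping draws on the given corpus source."""
    return [bp.greek for bp in BRIDGE_POINTS if source in bp.corpus_sources]


print(bridges_for_source("Botany"))
```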
Cross-Domain Connections
greek_corpus ──────┬──────► medical_corpus
                   │
                   ├──────► science_corpus (mechanisms)
                   │
                   └──────► physicists_corpus (quantum)
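The diagram can also be written as an adjacency map, which is easier to query from code. This is an illustrative sketch: `CORPUS_LINKS` and `linked_corpora` are hypothetical names, and the edge labels are the parenthetical notes from the diagram.

```python
# Outgoing links from each corpus, with the diagram's parenthetical note
# as the edge label (None where the diagram gives no note).
# CORPUS_LINKS is an illustrative structure, not an existing pipeline object.
CORPUS_LINKS = {
    "greek_corpus": {
        "medical_corpus": None,
        "science_corpus": "mechanisms",
        "physicists_corpus": "quantum",
    },
}


def linked_corpora(source):
    """Return the corpora that `source` links to, sorted by name."""
    return sorted(CORPUS_LINKS.get(source, {}))


print(linked_corpora("greek_corpus"))
```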
Key bridges:
Ancient plant remedies → Modern phytochemistry
Humoral theory → Biochemical homeostasis
Ritual medicine → Psychopharmacology
Theriac compounds → Polypharmacy principles
🛠️ Pipeline Integration
Ingest Command
cd ~/projects/knowledge-rag
source venv/bin/activate

# Create medical corpus
python patentbot.py ingest \
    --corpus medical_corpus \
    --source /mnt/storage/books_organized/Medicine/ \
    --source /mnt/storage/books_organized/Botany/ \
    --source /mnt/storage/books_organized/Biohacking/
# Entity types for the medical corpus
MEDICAL_ENTITY_TYPES = [
    "drugs",       # Pharmaceuticals, compounds
    "mechanisms",  # Pathways, receptors, enzymes
    "conditions",  # Diseases, syndromes
    "plants",      # Botanical sources
    "compounds",   # Chemical structures
    "therapies",   # Treatment modalities
    "anatomy",     # Body systems, organs
]
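One possible use of the type list is to validate and group extracted entities before they are stored. The sketch below assumes entities arrive as `(entity_type, text)` pairs. That shape and the `group_entities` helper are assumptions for illustration, not the pipeline's actual interface.

```python
from collections import defaultdict

# Repeated here so the example runs on its own.
MEDICAL_ENTITY_TYPES = [
    "drugs", "mechanisms", "conditions", "plants",
    "compounds", "therapies", "anatomy",
]


def group_entities(entities):
    """Group (entity_type, text) pairs by type, rejecting unknown types.

    Hypothetical helper: the pair format is an assumption.
    """
    allowed = set(MEDICAL_ENTITY_TYPES)
    grouped = defaultdict(list)
    for entity_type, text in entities:
        if entity_type not in allowed:
            raise ValueError(f"unknown entity type: {entity_type!r}")
        grouped[entity_type].append(text)
    return dict(grouped)


example = [
    ("drugs", "atropine"),
    ("plants", "Atropa belladonna"),
    ("drugs", "morphine"),
]
print(group_entities(example))
```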
Pharmakon Enrichment
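def enrich_medical_chunk(chunk, greek_lexicon):
    """Add Greek term mappings to a medical chunk's metadata."""
    # extract_medical_terms is assumed to be defined elsewhere in the pipeline.
    for term in extract_medical_terms(chunk):
        # Look up the Greek equivalent of each extracted term
        if greek_match := greek_lexicon.get(term):
            # Key by term so that a later match does not overwrite an earlier one
            chunk.metadata.setdefault("greek_mapping", {})[term] = {
                "term": greek_match["greek"],
                "transliteration": greek_match["translit"],
                "ancient_usage": greek_match["context"],
            }
    return chunk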
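The enrichment function reads three keys from each lexicon entry (`greek`, `translit`, `context`) and expects chunks to carry a `metadata` dict. The sketch below shows inputs of that shape. The `Chunk` class, the two sample entries and the word-matching `extract_medical_terms` are stand-ins written for illustration; the real extractor and lexicon are not shown in this document.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Stand-in for the pipeline's chunk type (assumed shape)."""
    text: str
    metadata: dict = field(default_factory=dict)


# Sample lexicon: modern term -> Greek entry, with the keys the
# enrichment function reads. Entries are illustrative only.
GREEK_LEXICON = {
    "drug": {"greek": "φάρμακον", "translit": "pharmakon", "context": "drug/poison"},
    "herb": {"greek": "βοτάνη", "translit": "botanē", "context": "herb"},
}


def extract_medical_terms(chunk, vocabulary=GREEK_LEXICON):
    """Naive stand-in: return vocabulary terms that occur as whole words."""
    words = set(re.findall(r"[a-z]+", chunk.text.lower()))
    return sorted(words & set(vocabulary))


chunk = Chunk("The herb was prepared as a drug.")
print(extract_medical_terms(chunk))
```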
📋 Staging Checklist
Phase 1: Inventory
Phase 2: Organization
Phase 3: Ingest
Phase 4: Integration
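The inventory phase could start from a file count per source directory, compared against the counts in the corpus overview. This is a sketch under assumptions: `count_files` and `inventory` are hypothetical helpers, and every regular file under a directory is counted, whatever its format.

```python
from pathlib import Path

# Base path and expected file counts, taken from the source breakdown.
BASE = Path("/mnt/storage/books_organized")
EXPECTED_FILES = {"Medicine": 171, "Botany": 43, "Biohacking": 7}


def count_files(directory):
    """Count regular files under `directory`, recursing into subfolders."""
    return sum(1 for p in Path(directory).rglob("*") if p.is_file())


def inventory(base=BASE, expected=EXPECTED_FILES):
    """Map each source name to (files found, files expected).

    `found` is None when the directory does not exist.
    """
    report = {}
    for name, want in expected.items():
        path = Path(base) / name
        found = count_files(path) if path.is_dir() else None
        report[name] = (found, want)
    return report


if __name__ == "__main__":
    for name, (found, want) in inventory().items():
        print(f"{name}: found={found} expected={want}")
```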
📊 Target Metrics
| Metric | Target |
| --- | --- |
| Total chunks | ~325k |
| Unique drug entities | 5,000+ |
| Plant compounds mapped | 1,000+ |
| Greek→Modern bridges | 500+ |
| Mechanism coverage | 2,000+ pathways |
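Once ingest has run, the targets can be checked programmatically. The sketch below treats every target as a minimum, which is an assumption: the total chunk figure in particular is approximate. The key names and the `unmet_targets` helper are illustrative.

```python
# Targets from the metrics above, treated as minimums (an assumption).
TARGETS = {
    "total_chunks": 325_000,
    "unique_drug_entities": 5_000,
    "plant_compounds_mapped": 1_000,
    "greek_modern_bridges": 500,
    "mechanism_pathways": 2_000,
}


def unmet_targets(actual, targets=TARGETS):
    """Return {metric: (actual, target)} for each metric below its target.

    Metrics missing from `actual` count as 0.
    """
    return {
        name: (actual.get(name, 0), target)
        for name, target in targets.items()
        if actual.get(name, 0) < target
    }


# Made-up counts, for illustration only.
sample_counts = {
    "total_chunks": 330_000,
    "unique_drug_entities": 4_200,
    "plant_compounds_mapped": 1_000,
    "greek_modern_bridges": 120,
}
print(unmet_targets(sample_counts))
```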
Medical corpus: the modern anchor for ancient pharmacological knowledge.