Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
The original RAG paper I used as the baseline idea: retrieve evidence, then generate against it.
OpenIntelligence
Papers and references I found while working on OpenIntelligence. Most of this started as curiosity about how far an Apple Intelligence-capable device could go as a local document intelligence system. These are most of the papers that shaped the RAG engine.
The map for the main RAG pieces: retrieval, generation, augmentation, routing, and evaluation.
General background on how retrieval changes answer generation and grounding.
The ranking paper behind fusing keyword and vector results without pretending one signal always wins.
A useful reference for local-first hybrid retrieval, FTS/vector fusion, and ranking diagnostics.
A reference point for keeping retrieval local, embeddable, and small enough for edge-style constraints.
The paper that pushed me to care about where retrieved chunks land inside a tight context window.
The HyDE paper behind generating a better search target before retrieving against the real corpus.
Useful for the retrieval-needed and self-checking ideas around grounded answers.
Background for trying multiple retrieved evidence paths and then verifying the better answer.
Background reading for recursive retrieval, planning, and agent-style query execution.
A checklist source for relevance, source quality, confidence, and weak-retrieval warnings.
Helpful for sanity-checking tradeoffs around retrieval quality, grounding, efficiency, and robustness.
Title links open the paper PDFs directly. The smaller links go to the source/abstract pages.
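
To make the keyword/vector fusion entry above concrete, here is a minimal sketch. It assumes the technique is reciprocal rank fusion (RRF), which the list does not name, and it is not OpenIntelligence's actual code. The function name, the document IDs, and the constant `k = 60` (a commonly used default) are all illustrative assumptions. Each document scores `1 / (k + rank)` in every list it appears in, and the scores are summed, so neither the keyword ranking nor the vector ranking dominates on its own.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (assumed default); larger values flatten
       the gap between top-ranked and lower-ranked results.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the best hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (FTS) search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because only ranks are used, the keyword and vector scores never need to be put on a common scale, which is the main reason a rank-based fusion is attractive when the two retrievers produce incomparable scores.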