AgenticRAG Documentation
Vectorless, reasoning-based RAG for Python. No vector DB, no chunking, no embeddings — just pure LLM reasoning over your documents.
Installation
pip install agentic-rag-core
Optional extras:
pip install agentic-rag-core[web] # Web UI (FastAPI server)
pip install agentic-rag-core[gcs] # Google Cloud Storage
pip install agentic-rag-core[neo4j] # Neo4j graph backend
pip install agentic-rag-core[all] # Everything
Quick Start
Step 1: Get a free API key from console.groq.com
Step 2: Set your API key in a .env file:
GROQ_API_KEY=gsk_your_key_here
Step 3: Ask questions about any PDF in a few lines of code:
from agenticrag import Forest
forest = Forest(verbose=True)
forest.add("report.pdf")
result = forest.ask("What was the net income?")
print(result.text)
Web UI
AgenticRAG includes a browser-based chat interface:
pip install agentic-rag-core[web]
python -m agenticrag serve
The server opens at http://localhost:8000, where you can upload PDFs, chat with documents, and switch between Groq, Gemini, and local LLMs.
API Reference
Classes
| Class | What it does | When to use |
|---|---|---|
| Forest | Multi-document knowledge base | Most common — use for everything |
| PageIndex | Single-document index | When you only have one document |
| ForestResult | Result from Forest.ask() | Access .text, .sources, .confidence |
Forest Methods
| Method | Description |
|---|---|
| .add("file.pdf") | Add a single document |
| .add_directory("./docs/") | Add all PDFs from a folder |
| .add_directory_batch("./docs/") | Fast batch add (100+ docs) |
| .ask("question") | Ask a question across all docs |
| .documents() | List all indexed documents |
| .remove(doc_id) | Remove a document |
| .clear_history() | Reset conversation memory |
| .info() | Forest status summary |
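As a rough illustration of how `.ask()` and `.clear_history()` fit together, the sketch below asks several unrelated questions and resets conversation memory before each one. It assumes only the two methods listed in the table and the `.text` field of the result; `DemoForest` and `ask_independently` are hypothetical names introduced so the example runs without the package installed, and a real `Forest` would be passed in their place.

```python
class DemoForest:
    """Hypothetical stand-in for Forest; not part of agenticrag."""

    def __init__(self):
        self.history = []

    def ask(self, question):
        self.history.append(question)
        # Mimic a result object that exposes .text, as ForestResult does.
        return type("Result", (), {"text": f"answer to: {question}"})()

    def clear_history(self):
        self.history.clear()


def ask_independently(forest, questions):
    """Ask each question after resetting conversation memory, so that
    earlier questions cannot influence later answers."""
    answers = {}
    for question in questions:
        forest.clear_history()
        answers[question] = forest.ask(question).text
    return answers


demo = DemoForest()
answers = ask_independently(
    demo, ["What was the net income?", "Who signed the report?"]
)
for question, text in answers.items():
    print(question, "->", text)
```

For follow-up questions that should build on earlier answers, skipping the `clear_history()` call is presumably the right choice.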
ForestResult Fields
| Field | Type | Description |
|---|---|---|
| .text | str | The final verified answer |
| .confidence | float | 0.0 to 1.0 confidence score |
| .sources | list | Which documents/pages were used |
| .reasoning_trace | list | Step-by-step agent pipeline trace |
| .was_rewritten | bool | Whether the Critic modified the answer |
| .hallucinations | list | Hallucinations that were caught |
| .elapsed_seconds | float | Total time taken |
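One way to consume these fields is to flag weak answers before showing them to a user. The sketch below is an assumption-laden illustration: `ResultStub` is a hypothetical dataclass that only mirrors the field names and types in the table, `summarize` is a helper invented for this example, and the 0.7 threshold is arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class ResultStub:
    """Hypothetical stand-in mirroring the ForestResult fields listed above."""
    text: str
    confidence: float
    sources: list = field(default_factory=list)
    reasoning_trace: list = field(default_factory=list)
    was_rewritten: bool = False
    hallucinations: list = field(default_factory=list)
    elapsed_seconds: float = 0.0


def summarize(result, min_confidence=0.7):
    """Build a short report: the answer, warnings, then one line per source."""
    lines = [result.text]
    if result.confidence < min_confidence:
        lines.append(f"(low confidence: {result.confidence:.2f})")
    if result.was_rewritten:
        lines.append(
            f"(answer rewritten; {len(result.hallucinations)} hallucination(s) caught)"
        )
    for source in result.sources:
        lines.append(f"source: {source}")
    return "\n".join(lines)


stub = ResultStub(
    text="Example answer",
    confidence=0.5,
    sources=["report.pdf"],
    was_rewritten=True,
    hallucinations=["unsupported claim"],
)
print(summarize(stub))
```

Because `summarize` reads attributes by name only, the same function should work on a real result from `forest.ask()`, provided the fields behave as the table describes.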
Supported Models
Cloud (Groq — Free API)
from agenticrag import Forest, GroqModel
forest = Forest(model=GroqModel.GPT_OSS_20B) # Fast, recommended
forest = Forest(model=GroqModel.GPT_OSS_120B) # Largest, best reasoning
forest = Forest(model=GroqModel.LLAMA4_SCOUT) # Llama 4 Scout
forest = Forest(model=GroqModel.QWEN3_32B) # Qwen 3 32B
Local (Ollama, 100% Free)
ollama pull qwen3:4b # 2.5 GB — recommended
from agenticrag import Forest, LocalModel
forest = Forest(
    model=LocalModel.QWEN3_4B,
    base_url="http://localhost:11434/v1",
)

| Model | Size | VRAM | Best For |
|---|---|---|---|
| QWEN3_4B | 2.5 GB | ≤5 GB | Low-VRAM, fastest |
| QWEN3_8B | 5.2 GB | ≤8 GB | Best quality/size ratio |
| QWEN3_14B | 9.3 GB | ≤12 GB | Higher quality |
| QWEN3_30B | 19 GB | ≤24 GB | Strong reasoning |
| LLAMA3_2_3B | 2.0 GB | ≤4 GB | Ultra-lightweight |
| MISTRAL | 4.1 GB | ≤6 GB | General purpose |
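The table can also be read programmatically. The sketch below copies its rows into a list and picks a model name for a given VRAM budget. It rests on two assumptions: the VRAM column is treated as the amount of memory the model needs, and "largest download that still fits" is used as a stand-in for "best", which is only a heuristic. `LOCAL_MODELS` and `pick_local_model` are hypothetical names, not part of the library.

```python
# (model name, download size in GB, VRAM ceiling in GB), copied from the table.
LOCAL_MODELS = [
    ("QWEN3_4B", 2.5, 5),
    ("QWEN3_8B", 5.2, 8),
    ("QWEN3_14B", 9.3, 12),
    ("QWEN3_30B", 19.0, 24),
    ("LLAMA3_2_3B", 2.0, 4),
    ("MISTRAL", 4.1, 6),
]


def pick_local_model(vram_gb):
    """Return the name of the largest listed model whose VRAM ceiling fits
    within vram_gb, or None if nothing fits."""
    fitting = [m for m in LOCAL_MODELS if m[2] <= vram_gb]
    if not fitting:
        return None
    # Heuristic (an assumption): prefer the biggest model that fits.
    return max(fitting, key=lambda m: m[1])[0]


print(pick_local_model(8))
```

The returned string matches a `LocalModel` member name from the table, so it could be resolved with `getattr(LocalModel, name)` if `LocalModel` exposes those names as attributes.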
How It Works
AgenticRAG uses a multi-agent pipeline that works like a team of AI researchers, in four stages:
1. Examine the document graph to find which documents might contain the answer.
2. Search those documents in parallel using tree-based reasoning.
3. Combine evidence from multiple documents into a coherent answer.
4. Check every claim against the source text and remove unverified claims.
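The four stages can be sketched as a toy pipeline. This is not the library's implementation: plain keyword matching stands in for LLM reasoning, a dictionary of page strings stands in for the document graph and tree index, and every name and the sample corpus below are hypothetical. It only shows the control flow of route, parallel search, synthesize, and verify.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus: document name -> list of page texts (illustrative data only).
DOCS = {
    "report.pdf": ["Revenue grew in 2023.", "Net income was 4.2 million."],
    "memo.pdf": ["The office moves in June."],
}


def keywords(text):
    """Lowercased words longer than three characters, punctuation stripped."""
    return {w.strip("?.,").lower() for w in text.split() if len(w) > 3}


def route(question, docs):
    """Stage 1: select documents that mention any question keyword."""
    kws = keywords(question)
    return [name for name, pages in docs.items()
            if any(k in page.lower() for page in pages for k in kws)]


def search(name, question, docs):
    """Stage 2: return (doc, page number, text) for each matching page."""
    kws = keywords(question)
    return [(name, i + 1, page) for i, page in enumerate(docs[name])
            if any(k in page.lower() for k in kws)]


def synthesize(evidence):
    """Stage 3: turn evidence into candidate claims (one per hit here)."""
    return [text for _, _, text in evidence]


def verify(claims, evidence):
    """Stage 4: keep only claims found verbatim in the retrieved text."""
    source = " ".join(text for _, _, text in evidence)
    return [claim for claim in claims if claim in source]


def answer(question, docs=DOCS):
    selected = route(question, docs)
    # Search the selected documents concurrently, one task per document.
    with ThreadPoolExecutor() as pool:
        groups = list(pool.map(lambda n: search(n, question, docs), selected))
    evidence = [hit for group in groups for hit in group]
    return " ".join(verify(synthesize(evidence), evidence)), evidence


text, evidence = answer("What was the net income?")
print(text, evidence)
```

In the real pipeline each stage would presumably be an LLM call, with the final stage corresponding to the check that sets `.was_rewritten` and `.hallucinations` on the result.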