AgenticRAG Documentation

Vectorless, reasoning-based RAG for Python. No vector DB, no chunking, no embeddings — just pure LLM reasoning over your documents.


Installation

Terminal

pip install agentic-rag-core

Optional extras:

pip install "agentic-rag-core[web]"    # Web UI (FastAPI server)
pip install "agentic-rag-core[gcs]"    # Google Cloud Storage
pip install "agentic-rag-core[neo4j]"  # Neo4j graph backend
pip install "agentic-rag-core[all]"    # Everything

Quick Start

Step 1: Get a free API key from console.groq.com

Step 2: Set your API key in a .env file:

GROQ_API_KEY=gsk_your_key_here
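
How AgenticRAG itself reads the .env file is not described here. As a rough, standard-library-only sketch of what loading such a file amounts to, the hypothetical helper `load_env_file` below (not part of AgenticRAG) parses KEY=VALUE lines and exports them without overwriting variables already set in the shell:

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ.

    Hypothetical helper for illustration; not part of AgenticRAG.
    Returns a dict of the pairs found in the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip("'\"")
        loaded[key] = value
        # Variables already set in the environment take precedence.
        os.environ.setdefault(key, value)
    return loaded
```

Calling `load_env_file()` before creating a `Forest` and then checking `os.environ.get("GROQ_API_KEY")` is one way to fail early with a clear message when the key is missing.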

Step 3: Ask questions about any PDF in a few lines of code:

Python

from agenticrag import Forest

forest = Forest(verbose=True)
forest.add("report.pdf")

result = forest.ask("What was the net income?")
print(result.text)

Web UI

AgenticRAG includes a browser-based chat interface:

pip install "agentic-rag-core[web]"
python -m agenticrag serve

The server opens at http://localhost:8000, where you can upload PDFs, chat with your documents, and switch between Groq, Gemini, and local LLMs.

API Reference

Classes

Class	What it does	When to use
Forest	Multi-document knowledge base	Most common — use for everything
PageIndex	Single-document index	When you only have one document
ForestResult	Result from Forest.ask()	Access .text, .sources, .confidence

Forest Methods

Method	Description
.add("file.pdf")	Add a single document
.add_directory("./docs/")	Add all PDFs from a folder
.add_directory_batch("./docs/")	Fast batch add (100+ docs)
.ask("question")	Ask a question across all docs
.documents()	List all indexed documents
.remove(doc_id)	Remove a document
.clear_history()	Reset conversation memory
.info()	Forest status summary
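
As a rough illustration of how these methods might combine in a script, the sketch below indexes a folder and then asks several unrelated questions, resetting conversation memory between them. It relies only on the method names in the table above and on the `.text` field of the result; the helper `ask_each` is hypothetical, and because the return types of `.documents()` and `.info()` are not specified here, the sketch does not use them.

```python
def ask_each(forest, folder, questions):
    """Index every PDF in `folder`, then ask each question independently.

    `forest` is any object exposing the Forest methods listed above.
    Hypothetical helper for illustration; not part of AgenticRAG.
    """
    forest.add_directory(folder)
    answers = {}
    for question in questions:
        # Reset conversation memory so earlier questions do not
        # influence unrelated ones.
        forest.clear_history()
        result = forest.ask(question)
        answers[question] = result.text
    return answers
```

With the library installed, a call might look like `ask_each(Forest(), "./docs/", ["What was the net income?"])`.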

ForestResult Fields

Field	Type	Description
.text	str	The final verified answer
.confidence	float	0.0 to 1.0 confidence score
.sources	list	Which documents/pages were used
.reasoning_trace	list	Step-by-step agent pipeline trace
.was_rewritten	bool	Whether the Critic modified the answer
.hallucinations	list	Hallucinations that were caught
.elapsed_seconds	float	Total time taken
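
The fields above could be turned into a short report along the following lines. This is a minimal sketch: `summarize_result` is a hypothetical helper, the 0.7 threshold is an arbitrary illustrative value, and since the element types of `.sources` and `.hallucinations` are not specified here, the code only counts them.

```python
def summarize_result(result, min_confidence=0.7):
    """Build a short, human-readable report from a ForestResult-like object.

    Uses only the fields documented above. The default threshold is an
    arbitrary illustrative value, not a library recommendation.
    """
    lines = [result.text, f"Confidence: {result.confidence:.2f}"]
    if result.confidence < min_confidence:
        lines.append("Warning: low confidence; verify against the sources.")
    if result.was_rewritten:
        # The Critic changed the draft answer; report how many
        # unsupported claims it caught.
        lines.append(
            "The Critic rewrote the answer "
            f"({len(result.hallucinations)} unsupported claim(s) caught)."
        )
    lines.append(f"Sources used: {len(result.sources)}")
    lines.append(f"Elapsed: {result.elapsed_seconds:.1f} s")
    return "\n".join(lines)
```

For example, `print(summarize_result(forest.ask("What was the net income?")))` would print the answer followed by its confidence, any warnings, and timing.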

Supported Models

Cloud (Groq — Free API)

from agenticrag import Forest, GroqModel

# Pick one of the following:
forest = Forest(model=GroqModel.GPT_OSS_20B)    # Fast, recommended
forest = Forest(model=GroqModel.GPT_OSS_120B)   # Largest, best reasoning
forest = Forest(model=GroqModel.LLAMA4_SCOUT)   # Llama 4 Scout
forest = Forest(model=GroqModel.QWEN3_32B)      # Qwen 3 32B

Local (Ollama — 100% Free)

ollama pull qwen3:4b   # 2.5 GB — recommended

from agenticrag import Forest, LocalModel

forest = Forest(
    model=LocalModel.QWEN3_4B,
    base_url="http://localhost:11434/v1",
)

Model	Size	VRAM	Best For
QWEN3_4B	2.5 GB	≤5 GB	Low-VRAM, fastest
QWEN3_8B	5.2 GB	≤8 GB	Best quality/size ratio
QWEN3_14B	9.3 GB	≤12 GB	Higher quality
QWEN3_30B	19 GB	≤24 GB	Strong reasoning
LLAMA3_2_3B	2.0 GB	≤4 GB	Ultra-lightweight
MISTRAL	4.1 GB	≤6 GB	General purpose
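
To make the table easier to apply, here is a small sketch that picks a model name for a given VRAM budget. The figures are copied from the table above; `pick_local_model` is a hypothetical helper, and using download size as a rough proxy for capability is an assumption, not a rule of the library.

```python
# Figures copied from the table above:
# model name -> (download size in GB, VRAM requirement in GB)
LOCAL_MODELS = {
    "QWEN3_4B": (2.5, 5),
    "QWEN3_8B": (5.2, 8),
    "QWEN3_14B": (9.3, 12),
    "QWEN3_30B": (19.0, 24),
    "LLAMA3_2_3B": (2.0, 4),
    "MISTRAL": (4.1, 6),
}


def pick_local_model(available_vram_gb):
    """Return the name of the largest listed model that fits, or None.

    Hypothetical helper: treats download size as a rough proxy for
    capability, which is an assumption rather than a library rule.
    """
    fitting = [
        (size, name)
        for name, (size, vram) in LOCAL_MODELS.items()
        if vram <= available_vram_gb
    ]
    if not fitting:
        return None
    # max() compares tuples by size first, so the largest model wins.
    return max(fitting)[1]
```

The returned name could then be looked up with `getattr(LocalModel, name)`, assuming `LocalModel` exposes a member for each row of the table, as `LocalModel.QWEN3_4B` suggests.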

How It Works

AgenticRAG uses a multi-agent pipeline — like a team of AI researchers working together:

1. Planner: Examines the document graph to find which documents might have the answer.

2. Hunters: Search the selected documents in parallel using tree-based reasoning.

3. Synthesizer: Combines evidence from multiple docs into a coherent answer.

4. Critic: Checks every claim against the source text and removes unverified claims.

Result: a verified answer.
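
The shape of this pipeline can be sketched in plain Python. This is a toy illustration only, not AgenticRAG's implementation: simple keyword overlap stands in for LLM reasoning, a dictionary stands in for the document graph, and every name below (`DOCS`, `planner`, `hunter`, `synthesizer`, `critic`, `ask`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus standing in for indexed documents (hypothetical data).
DOCS = {
    "report.pdf": ["Net income was 5 million dollars.", "Revenue grew 10 percent."],
    "memo.pdf": ["The office moves in June."],
}


def words(text):
    """Lowercased words of a text with basic punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}


def planner(question):
    """Select documents with a line sharing at least two words with the question."""
    q = words(question)
    return [
        name
        for name, lines in DOCS.items()
        if any(len(q & words(line)) >= 2 for line in lines)
    ]


def hunter(name, question):
    """Return the best-matching line of one document as evidence."""
    q = words(question)
    best = max(DOCS[name], key=lambda line: len(q & words(line)))
    return name, best


def synthesizer(evidence):
    """Collect the evidence lines into a list of candidate claims."""
    return [line for _, line in evidence]


def critic(claims, evidence):
    """Keep only claims that appear verbatim in the consulted documents."""
    source_text = {line for name, _ in evidence for line in DOCS[name]}
    return [claim for claim in claims if claim in source_text]


def ask(question):
    selected = planner(question)
    # Hunters run concurrently, one per selected document.
    with ThreadPoolExecutor() as pool:
        evidence = list(pool.map(lambda name: hunter(name, question), selected))
    claims = synthesizer(evidence)
    return " ".join(critic(claims, evidence))
```

Calling `ask("What was the net income?")` selects only `report.pdf` and returns its matching line. Passing an unsupported claim such as `"Profit doubled."` to `critic` alongside real evidence shows the filtering step: the claim is dropped because it appears in no source line.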
