Overview
Retrieval Augmented Generation (RAG) is now the standard way to improve LLM accuracy and relevance. But building production-grade RAG systems requires far more than connecting an LLM to a vector database. In this book, you’ll learn RAG from first principles by creating a complete portfolio of end-to-end applications. You’ll build each component of the pipeline yourself, giving you full control over every part of the stack.
Written by former Google research scientist Hamza Farooq, this hands-on guide takes you from LLM and transformer fundamentals through keyword search and semantic retrieval to production RAG systems. You’ll build a hotel search engine with semantic ranking, implement semantic caching for cost-effective production deployments, develop autonomous AI agents powered by RAG context, and deploy optimized open-source LLMs. Through under-the-hood experience, you’ll master embeddings, chunking, reranking, vector databases, evaluation frameworks, fine-tuning, and more.
What's inside
• Design and implement efficient search algorithms for LLM applications
• Master deep customization techniques for every RAG pipeline component
• Apply model fine-tuning techniques for task-specific and domain adaptation
• Deploy quantized versions of open-source LLMs using vLLM and Ollama
About the reader
For Python developers with NLP basics who are ready to move beyond framework abstractions and build RAG systems optimized for their specific constraints.
About the author
Hamza Farooq is the founder and CEO of Traversaal.ai and a seasoned AI expert. His experience includes roles as a research scientist at Google and as a distinguished adjunct professor at leading institutions such as Stanford, UCLA, and the University of Minnesota.