Learn how tokenization works in large language and embedding models and how the tokenizer can affect the quality of your search.
Instructor: Kacper Łukawski
Explore how different tokenization techniques including Byte-Pair Encoding, WordPiece, and Unigram are trained and work.
Understand how to measure the quality of your retrieval and how to optimize your search by adjusting HNSW parameters and vector quantizations.
In Retrieval Optimization: From Tokenization to Vector Quantization, taught by Kacper Łukawski, Developer Relations Lead at Qdrant, you'll learn how tokenization works and how to optimize vector search in your large-scale, customer-facing RAG applications. You'll explore the technical details of how vector search works and how to tune it for better performance.
This course focuses on optimizing the first step in your RAG and search pipeline. You'll see how different tokenization techniques like Byte-Pair Encoding, WordPiece, and Unigram work and how they affect search relevancy. You'll also learn how to address common challenges such as terminology mismatches and truncated chunks in embedding models.
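To give a flavor of what "how Byte-Pair Encoding works" means in practice: BPE starts from individual characters and repeatedly merges the most frequent adjacent pair into a new token. The snippet below is a minimal pure-Python sketch of that training loop on a toy three-word corpus; it is an illustration under simplified assumptions, not the course's own notebooks or any library's actual implementation.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a toy corpus of (symbols -> frequency)."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the chosen pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: each word split into characters, mapped to its frequency.
corpus = {tuple("lower"): 2, tuple("lowest"): 1, tuple("newer"): 3}
for _ in range(3):  # learn three merges
    pair = most_frequent_pair(corpus)
    corpus = merge_pair(corpus, pair)
print(corpus)
```

After a few merges, frequent character sequences like "we" and "wer" become single tokens, which is exactly why rare or domain-specific terms (seen less often during tokenizer training) end up split into many small pieces and can hurt search relevancy.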
To optimize your search, you first need to be able to measure its quality, so you'll learn several quality metrics for this purpose. Most vector databases use Hierarchical Navigable Small World (HNSW) graphs for approximate nearest-neighbor search; you'll see how to balance HNSW parameters for higher speed and maximum relevance. Finally, you'll use different vector quantization techniques to reduce memory usage and improve search speed.
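As a taste of the quantization topic: scalar quantization maps each float32 dimension to a single byte, cutting vector memory roughly 4x at the cost of a small, bounded rounding error. The sketch below is a simplified illustration, not Qdrant's implementation; the value range [-1.0, 1.0] is an assumption chosen for the example.

```python
import random

def quantize(vec, lo, hi):
    """Map each float in [lo, hi] to an 8-bit integer bucket in [0, 255]."""
    scale = 255.0 / (hi - lo)
    return [round((x - lo) * scale) for x in vec]

def dequantize(q, lo, hi):
    """Map 8-bit buckets back to approximate float values."""
    scale = (hi - lo) / 255.0
    return [lo + v * scale for v in q]

random.seed(0)
vec = [random.uniform(-1.0, 1.0) for _ in range(8)]
q = quantize(vec, -1.0, 1.0)        # one byte per dimension instead of four
approx = dequantize(q, -1.0, 1.0)
max_err = max(abs(a - b) for a, b in zip(vec, approx))
# Rounding error is bounded by half a bucket width: (hi - lo) / 255 / 2.
assert max_err <= (2.0 / 255) / 2 + 1e-9
```

The course goes further (e.g., trading a little relevance for memory and speed, and combining quantization with HNSW tuning), but the core idea is this trade-off between precision and footprint.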
What you'll do, in detail:
By the end of this course, you'll have a solid understanding of how tokenization works and how to optimize vector search in your RAG systems.
Anyone with basic Python knowledge who wants to learn to build effective customer-facing RAG applications!
Introduction
Embedding models
Role of the tokenizers
Practical implications of the tokenization
Measuring Search Relevance
Optimizing HNSW search
Vector quantization
Conclusion
Quiz
Graded · Quiz · 10 mins

Course access is free for a limited time during the DeepLearning.AI learning platform beta!