Retrieval-Augmented Generation (RAG) Examples and Use Cases
by Dr. Phil Winder, CEO
Retrieval-augmented generation (RAG) has emerged as a useful way to inject information into natural language systems that include LLMs. This webinar provides an overview of the many use cases with which you might enhance your products in 2024.
- Understand what RAG is and how it is used
- Discover the wide range of use cases where RAG helps power applications
- Gain insights into advanced LLM architectures and techniques to boost your AI applications
This webinar is perfect for those who are interested in, or are already using, RAG to improve their LLM-powered systems. This webinar won’t show any code, but the discussion will be technical. And the introductory recap will be brief, so I recommend that you first become familiar with RAG by reading one of our previous introductory articles.
Download Slides
The following sections are expanded notes that I wrote for the presentation.
Examples of RAG Use By Industry
Below is a list of links that I used to demonstrate that RAG is being used across a diverse range of industries.
- Finance https://arxiv.org/abs/2311.11944
- Legal https://arxiv.org/abs/2309.11325
- Medicine https://arxiv.org/html/2402.13178v2
- Healthcare https://arxiv.org/abs/2403.09226
- Technology https://arxiv.org/html/2404.00657v1
- Agriculture https://arxiv.org/abs/2401.08406
- Pharmaceuticals https://arxiv.org/html/2402.01717v1
- Telecommunications https://arxiv.org/abs/2404.15939
- Energy https://arxiv.org/abs/2406.06566
- Science https://arxiv.org/abs/2403.15729
- Education https://arxiv.org/abs/2406.07796
- Construction https://arxiv.org/abs/2402.09939
- Sport https://arxiv.org/abs/2406.01280
- Real Estate https://arxiv.org/abs/2310.01429
RAG Use Cases
RAG use cases depend slightly on the modality of the data, but there are a few that occur across all modalities.
Text
Text-based modalities represent the vast majority of RAG use cases. They include:
- QA: Generating answers to user questions based upon a repository of textual sources (a minimal sketch follows this list).
- Summarization: Distilling the essential information from longer texts.
- Fact Verification: Determining whether a given claim can be supported by facts in the text.
- Commonsense Reasoning: The capability of machines to infer in a human-like manner, drawing upon external knowledge.
- Human-Machine Conversation: A language model using external data sources to maintain a human-like conversation in pursuit of a goal.
- Translation: An automated process of translating text from a source language to a target language.
- Extraction: Identifying and categorizing specific events and entities within a text and associating them via relevant relationships.
- Recommendations: Using RAG to make recommendations in natural language.
- Generation: Using retrieval over an external database, such as a blog, to support generative tasks.
- Classification: Classifying items based upon how other items in a repository have been classified.
- Search: Combining interpretation with retrieval to find items in a repository of information.
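To make the text-based use cases concrete, here is a minimal retrieve-then-generate sketch for QA. It assumes the openai Python client (v1+); the in-memory corpus, model names, and prompt are illustrative placeholders rather than recommendations.

```python
# Minimal retrieve-then-generate sketch for text QA.
# Assumes the openai Python client (v1+); corpus, models, and prompt are illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI()

corpus = [
    "RAG injects retrieved documents into the prompt of an LLM.",
    "Vector databases store embeddings so that similar text can be found quickly.",
]

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(corpus)

def answer(question: str, top_k: int = 1) -> str:
    # Retrieve: rank documents by cosine similarity to the question.
    q = embed([question])[0]
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = "\n".join(corpus[i] for i in np.argsort(scores)[::-1][:top_k])
    # Generate: ask the model to answer using only the retrieved context.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content

print(answer("What does RAG do?"))
```

The same retrieve-then-generate loop underpins most of the other text use cases; only the prompt and the repository change.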
Code
Code-related use cases are subtly different from text-based approaches:
- Generation: Convert natural language descriptions into code implementations.
- Summarization: Convert code into natural language descriptions.
- Completion: Predict the next bit of code (a minimal sketch follows this list).
- Automatic Program Repair: Query-based RAG is often used in automatic program repair to help generative models fix buggy code.
- Analysis: Generating and running new code to perform a deeper analysis.
- Testing & Security: Using language models and RAG to perform functional and security testing, like fuzzing and prompt injection.
Databases
Databases are a natural source of data, so it has become common to interface language models with them:
- Translation: Conversion of natural language into database-specific query languages like SQL (a minimal sketch follows this list).
- Graph-Based QA: Leveraging graph databases, such as knowledge bases, to answer a question, often by generating database-specific queries like SPARQL.
- Table-Based QA: Using relational databases or spreadsheets to answer questions based upon structured knowledge. Often includes generating code to answer more complicated queries.
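To show what the translation use case might look like, here is a minimal text-to-SQL sketch in which a retrieved schema fragment acts as the external knowledge. The schema, model name, and prompt are illustrative; a real system would retrieve the relevant tables from a catalogue rather than hard-coding them.

```python
# Minimal text-to-SQL sketch: supply the (retrieved) schema, then generate a query.
from openai import OpenAI

client = OpenAI()

# In a real system this fragment would be retrieved from a catalogue of schemas,
# ranked by relevance to the user's question.
schema = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    total_gbp REAL,
    created_at TEXT
);
"""

def to_sql(question: str) -> str:
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "Translate the question into a single SQLite query. "
                           "Use only the tables and columns in the schema.",
            },
            {"role": "user", "content": f"Schema:\n{schema}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content

print(to_sql("What was the total order value last month?"))
```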
Media
These tasks generally apply to image, video, and audio modalities. They could even include more esoteric data formats like 3D data:
- Generation: Generation usually refers to a model creating new media. But in a RAG context, it refers to the use of external repositories of media to help guide or seed the generation.
- Captioning: Captioning is the process of generating a textual description of media, for example, adding subtitles or image captions. In a RAG context, this refers to leveraging external text or media to improve the captioning process.
- QA: Asking questions about the media and answering them via retrieval. For example, ‘who wrote this song?’.
Science
Generative techniques are also being applied to the sciences. Where there is a repository of information, it makes sense to include RAG in the process:
- Drug discovery: Generate molecules that fulfill requirements.
- Medical Applications: Improve the generation of a language model by retrieving information from biomedical domain-specific databases.
- Maths: Streamlining problem-solving, boosting innovation, and refining educational strategies.
Key Challenges
Looking broadly across the use cases, a variety of challenges are reported.
One of the most important is the increase in complexity. The combination of a pseudo-non-deterministic generator with a finicky retrieval mechanism can make it hard to build robust systems.
This often manifests in non-functional performance metrics like latency and time-to-first-token. The addition of a retrieval step makes an already noticeable lag from the language model even worse.
If retrievers perform poorly, then they become a source of noise for the generator. However, some experiments show that noise can help performance on average, possibly by providing counterfactual anchors.
Retrievers often operate with a different domain and modality compared to the generator. This presents a domain gap which may adversely affect results. For example, a generator might not be able to understand results provided by the retriever if they are in a different language.
Increased context sizes and multimodality are difficult to integrate and add to the complexity.
Failure Points
Various papers talk about the challenges of making RAG work well. But it is only in real-world use that the various failure modes of RAG become apparent.
- Non-functional performance can be poor, but caching helps (a minimal caching sketch follows this list).
- Missing content is the greatest source of poor results.
- Ranking can be challenging, especially when content is similar.
- Formatting and general data cleanliness can cause issues for queries that end up retrieving this data.
- Bad prompts can add to the ranking problem if they lack specificity or clarity.
- Solutions are likely to be suboptimal because it is hard to test end-to-end and to iterate over hyperparameters like chunk sizes and embedding strategies. This is akin to feature engineering.
- Testing in general is hard due to the complex interaction between the generator and the retriever.
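As a small example of the caching point above, wrapping the whole retrieve-and-generate call in an in-memory cache hides most of the added latency for repeated questions. This is a sketch only: answer() stands in for a full RAG pipeline such as the QA sketch earlier, and a production system would also need eviction and invalidation when the document store changes.

```python
# Sketch of caching RAG responses to mitigate the latency added by retrieval.
import time
from functools import lru_cache

def answer(question: str) -> str:
    # Stand-in for a full retrieve-and-generate pipeline.
    time.sleep(2)  # simulate retrieval + generation latency
    return "..."

@lru_cache(maxsize=1024)
def cached_answer(question: str) -> str:
    # Identical questions skip both the retriever and the generator entirely.
    return answer(question)

cached_answer("What does RAG do?")  # ~2 s: the full pipeline runs
cached_answer("What does RAG do?")  # ~0 s: served from the in-memory cache
```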
Further Reading
Two really great surveys providing deeper links to applications of RAG: