RAG Systems: A 2024 Enterprise AI Brief

AI & Enterprise Use Cases

2024-01-12

RAG investment brief for allocators: why retrieval infrastructure became the 2024 enterprise AI control point and how to assess a team's pipeline.

Frequently Asked Questions

Retrieval-augmented generation (RAG) is an architecture that pairs a language model with an external document store. Before the model answers, a retrieval step pulls the most relevant passages from that store and feeds them into the prompt, so the model answers from current, owned data rather than only from what it memorised in training. The term was introduced in 2020 by Lewis et al. at Facebook AI Research (now Meta AI), whose paper combined a model's parametric memory with a non-parametric document index.
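The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the document store, the `retrieve` and `build_prompt` helpers, and the word-overlap scoring are all assumptions made for the example. Real pipelines typically score passages with learned embeddings and a vector index, and the final prompt would be sent to a language model rather than left as a string.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, store, k=2):
    """Return the k passages most similar to the question.

    Word-overlap similarity stands in for an embedding model here.
    """
    q = Counter(tokenize(question))
    ranked = sorted(store, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical document store: three passages an enterprise might own.
STORE = [
    "The travel policy caps hotel rates at 250 dollars per night.",
    "Expense reports are due within 30 days of the trip.",
    "The office cafeteria opens at 8 am.",
]

question = "What is the hotel rate cap in the travel policy?"
passages = retrieve(question, STORE, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

The point of the sketch is the ordering: retrieval runs first, and the model only ever sees the passages that retrieval selected.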

RAG lets an enterprise ground a general model in its own documents without the cost and brittleness of retraining. By late 2023 it was already the most common customisation approach, used by 31% of AI adopters surveyed, ahead of fine-tuning at 19%. Because retrieval reads from the document store at query time, answers reflect document changes as soon as the index is refreshed, which fits enterprises whose knowledge changes faster than any training cycle.

Check retrieval quality first, because a model can only answer as well as the passages it is given. Confirm three things: that the team measures retrieval accuracy and not only model output, that it controls how documents are chunked and embedded, and that access permissions on source data carry through to retrieval. A weak retrieval layer caps the value of any model placed on top of it.
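Two of these checks can be made concrete with a short sketch. The first function computes recall@k, one common way to measure retrieval accuracy separately from model output. The second filters passages by the caller's permissions before anything reaches the prompt. The document IDs, group names, and the `allowed_groups` field are hypothetical, chosen only for illustration; a real team would use its own labelled evaluation set and its own access-control model.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant passages that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)

def filter_by_permission(passages, user_groups):
    """Keep only passages that at least one of the user's groups may read."""
    return [p for p in passages if p["allowed_groups"] & user_groups]

# Hypothetical labelled evaluation set: for each test question, the ranked
# IDs the retriever returned and the IDs a human marked as relevant.
eval_set = [
    {"retrieved": ["d1", "d7", "d3"], "relevant": ["d1", "d3"]},
    {"retrieved": ["d4", "d2", "d9"], "relevant": ["d5"]},
]
mean_recall = sum(
    recall_at_k(e["retrieved"], e["relevant"], k=3) for e in eval_set
) / len(eval_set)
print(f"mean recall@3: {mean_recall:.2f}")

# Hypothetical passages carrying the access groups of their source documents.
passages = [
    {"id": "d1", "allowed_groups": {"finance"}},
    {"id": "d2", "allowed_groups": {"hr"}},
    {"id": "d3", "allowed_groups": {"finance", "legal"}},
]
visible = filter_by_permission(passages, user_groups={"legal"})
print([p["id"] for p in visible])
```

A team that can produce a number like the first, and can show a filter like the second applied before prompt assembly, is measuring and controlling the retrieval layer rather than judging only the final answers.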