AI Education

Why you should only use AI platforms powered by RAG

Written by James Smith | Apr 15, 2024 1:27:57 AM
Retrieval-Augmented Generation, or RAG, is a remarkable advancement in AI. It essentially gives Large Language Models (LLMs) the ability to fetch and incorporate factual data from external sources.
 
To fully appreciate the power of RAG, you need to understand how LLMs work in the first place. LLMs are not chock-full of facts, ready to spit them out at your request. Sure, they have been trained on — more or less — the entire Internet, but this training has been for the purpose of teaching them natural language.
 
Put simply, when an LLM responds to a question, it assembles a sentence by predicting the most likely next word, based on what you have asked it and the context you have offered.
 
RAG essentially assigns AI the role of academic researcher, encouraging it to cite actual sources when generating text. The sources it cites depend on the data set it has been fed.

So… without RAG, is ChatGPT just firing from the hip?

To be blunt, yes.
 
Don’t get me wrong: ChatGPT’s ‘guess’ is more likely to be right than yours or mine. The data it has used to learn natural language is so expansive that, for the most part, the text it streams in response to your question is likely to be factually accurate. It is just at risk of hallucinating incorrect information if, say, its training data is inaccurate.
 
Think of a judge making a ruling in court. They will not make a decision at random — they will cite past cases, and evidence from the case in question, to ensure their verdict is accurate and based in fact. This is what RAG empowers LLMs to do.

Newer AI models — like GPT-4 — have the ability to use RAG when prompted to do so.

For example, I asked GPT-4: “What month has the warmest weather in Melbourne?” It responded: “Melbourne, Australia, experiences its warmest weather during the summer months, which are December, January, and February. January is typically the warmest month of the year in Melbourne.”
 
I then asked it again, but included language that would trigger RAG: “Browse the internet and tell me which month has the warmest weather in Melbourne, and cite the sources you use.” It responded: “January is typically the warmest month in Melbourne with an average maximum temperature of 26°C, making it the warmest month of the year. This information is supported by data from weather-and-climate.com and also aligns with the details provided by climatestotravel.com and australia.com.” Its response included links to the stated websites.
 
Both answers are correct — the second just has more weight behind it. And if your requests are more niche, more specific, ensuring responses are grounded in external sources is critical.

Understanding RAG is key to harnessing the full power of AI.

So, how does RAG actually work?
 
Imagine you're a school librarian, using a super-smart robot to help you find information. Let’s pretend this robot is powered by RAG, and you ask it a question like, “Who was the Football Team Captain for the school in 2023?”
 
The pre-trained AI model itself does not know the answer to this question. But, if you have given this robot access to your school’s online library as its ‘data set’, it can use RAG to find out.
 
First, the robot will listen to your question carefully, noting down the keywords "Football Team Captain, 2023" that effectively capture your search intent.
 
With these keywords, our robot goes on a mission. It uses them to quickly search through the enormous, up-to-date online library. This is the “retrieval” part of RAG — it runs through the library's virtual aisles, finding books or articles that talk about the football team in 2023.
 
Out of all the books and articles, the robot might find a few that are exactly about the "Football Team in 2023." It quickly reads through them to find the name of the captain.

But the robot is smart. It doesn't just take the name and restate it for us. It also reads a bit about the captain, perhaps their performance in the 2023 season, or how the rest of the team fared. It tries to understand the context of the question it has been asked.
 
Our robot will then combine this information with anything else it already “knows” about football, whether from the school’s dataset or from the data the model was originally trained on. This is the “generation” component.
 
Finally, the robot returns to your desk and hands you a note, which states, “The school’s 2023 Football Team Captain was Liam Smith. Under his leadership, the school team won the premiership for the first time in ten years.”
 
Through this process, RAG helped the robot to not just repeat facts but to understand your question, find the latest information, understand the context, and then answer in a way that's informative and easy for you to understand.
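The librarian-robot workflow can be sketched in a few lines of Python. This is a toy illustration, not a real implementation: the "library", the keyword scorer, and the answer template are all invented for the example, and a production RAG system would use embedding-based semantic search for retrieval and an LLM for the generation step.

```python
# Toy sketch of the retrieve-then-generate loop described above.
# The documents and scoring logic are invented for illustration.

LIBRARY = [
    {"title": "Football Team 2023 season report",
     "text": "Captain Liam Smith led the 2023 football team to the premiership."},
    {"title": "Chess Club newsletter",
     "text": "The chess club met every Tuesday in 2023."},
]

def retrieve(question, library):
    """Retrieval: pick the document sharing the most words with the question."""
    keywords = set(question.lower().replace("?", "").split())

    def overlap(doc):
        words = set((doc["title"] + " " + doc["text"]).lower().split())
        return len(keywords & words)

    best = max(library, key=overlap)
    return best if overlap(best) > 0 else None

def generate(question, doc):
    """Generation: a real system would pass the question plus the retrieved
    passage to an LLM; here we simply stitch the pieces together."""
    if doc is None:
        return "I could not find anything relevant in the library."
    return f"{doc['text']} (Source: {doc['title']})"

question = "Who was the football team captain in 2023?"
answer = generate(question, retrieve(question, LIBRARY))
```

Even in this simplified form, the two halves of RAG are visible: `retrieve` grounds the answer in the library, and `generate` composes the response and names its source.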

RAG reshapes the capabilities of AI systems and sets new benchmarks for accuracy, reliability and user trust in artificial intelligence.

The two key benefits of RAG are verifiability and precision.
 
RAG gives AI models the ability to cite their sources. This bolsters the trustworthiness and authenticity of AI-generated responses, and gives users the ability to verify those sources. In the classroom, responses structured in this way allow teachers to educate their students on the value of citation and of cross-referencing information.
 
Reliance on external sources significantly improves the precision of AI-generated text, reducing the likelihood of AI hallucinations and general errors. It advances AI from being a tool with exceptional natural language skills, to one that also has the ability to read and interpret any data it has been provided.

There are, of course, limitations to this technology.

RAG may not achieve 100% retrieval accuracy, meaning it may not always retrieve every single piece of relevant knowledge. Unless users have an understanding of the topic themselves, this can be difficult to confirm or clarify — which could lead to incomplete information being reproduced.
 
For example, a legal firm may use RAG to expedite case research. If the RAG system fails to retrieve all the rules applicable to a given client, it could produce inaccurate advice.
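The retrieval gap is easy to see with a small sketch. In this invented example (the documents, query, and matching rule are all hypothetical), two documents state the same obligation, but a naive keyword search surfaces only the one that happens to share the query's wording. Semantic search narrows this gap, but does not close it.

```python
# Invented illustration of the retrieval gap: a naive keyword matcher
# finds one document but misses a second that states the same rule in
# different words.

documents = [
    "Clients must file a disclosure statement within 30 days.",
    "Reporting paperwork is due no later than one month after engagement.",
]

def keyword_retrieve(query, docs):
    """Return every document sharing at least two words with the query."""
    terms = set(query.lower().split())
    return [d for d in docs
            if len(terms & set(d.lower().rstrip(".").split())) >= 2]

# Both documents describe the same obligation, yet only the first is found.
hits = keyword_retrieve("disclosure statement deadline days", documents)
```

A reader without knowledge of the underlying material would have no way of knowing that the second source was missed.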
 
As with anything that relies on external data, there is also the limitation of the quality of that data — the output can only be as good as the information it has been given. If inaccurate, outdated data has been fed in, inaccurate information will come out. Ensuring the retrieval store is populated with recent, quality data is essential.

The applications for this form of AI are vast and significant.

Ultimately, RAG enhances AI’s ability to operate as an accurate research assistant and writing tool. The possibilities are endless, provided it is used as just that — an assistant, a tool, that keeps a ‘human in the loop’.