Microsoft announced an update to GraphRAG that improves AI search engines’ ability to provide specific and comprehensive answers while using fewer resources. This update speeds up LLM processing and increases accuracy.
RAG (Retrieval Augmented Generation) combines a large language model (LLM) with a search index (or database) to generate responses to search queries. The search index grounds the language model with fresh and relevant data. This reduces the possibility of an AI search engine providing outdated or hallucinated answers.
GraphRAG improves on RAG by building a knowledge graph from the search index and then generating summaries from it, referred to as community reports.
Step 1: Indexing Engine
The indexing engine segments the search index into thematic communities formed around related topics. These communities are connected by entities (e.g., people, places, or concepts) and the relationships between them, forming a hierarchical knowledge graph. The LLM then creates a summary for each community, referred to as a Community Report. Each level of this hierarchy represents a summarization.
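To make the idea of a community hierarchy concrete, here is a minimal Python sketch. It is not Microsoft's implementation: the Community class, the write_reports function, and the stub_summarize helper are hypothetical names, and the stub stands in for the LLM call that would write each Community Report. It also assumes, for illustration only, that a parent report is written with its children's reports in view.

```python
from dataclasses import dataclass, field


@dataclass
class Community:
    """One node in a hypothetical community hierarchy."""
    title: str
    entities: list[str]
    children: list["Community"] = field(default_factory=list)
    report: str = ""


def write_reports(node: Community, summarize) -> None:
    """Fill in a report for every community, children before parents."""
    for child in node.children:
        write_reports(child, summarize)
    child_reports = [child.report for child in node.children]
    node.report = summarize(node.title, node.entities, child_reports)


def stub_summarize(title, entities, child_reports):
    # Stand-in for the LLM call; a real system would prompt a model here.
    parts = [f"{title}: {', '.join(entities)}"]
    if child_reports:
        parts.append(f"covers {len(child_reports)} sub-communities")
    return "; ".join(parts)


# A tiny made-up hierarchy: one broad community with two narrower ones.
root = Community(
    title="Renewable energy",
    entities=["solar", "wind"],
    children=[
        Community("Solar power", ["photovoltaics", "inverters"]),
        Community("Wind power", ["turbines", "offshore farms"]),
    ],
)
write_reports(root, stub_summarize)
print(root.report)
```

The point of the sketch is only the shape of the data: every community, at every level, ends up with its own summary.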
There’s a misconception that GraphRAG uses knowledge graphs. While that’s partially true, it leaves out the most important part: GraphRAG creates knowledge graphs from unstructured data like web pages in the Indexing Engine step. This process of transforming raw data into structured knowledge is what sets GraphRAG apart from RAG, which relies on retrieving and summarizing information without building a hierarchical graph.
Step 2: Query Step
In the second step, GraphRAG uses the knowledge graph it created to provide context to the LLM so that it can more accurately answer a question.
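As a rough illustration of what "providing context" can mean in practice, the sketch below packs community reports into a prompt for the LLM. The build_prompt function, the prompt wording, and the character budget are all assumptions made for this example and are not taken from GraphRAG.

```python
def build_prompt(question: str, reports: list[str], max_chars: int = 2000) -> str:
    """Pack community reports into a prompt until the character budget is used."""
    context = []
    used = 0
    for report in reports:
        if used + len(report) > max_chars:
            break  # stop before exceeding the (hypothetical) budget
        context.append(f"- {report}")
        used += len(report)
    return (
        "Answer the question using only the context below.\n"
        "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    )


prompt = build_prompt(
    "Which energy themes appear?",
    ["Solar power: panels", "Wind power: turbines"],
    max_chars=25,
)
print(prompt)
```

With the small budget used here, only the first report fits, which hints at why the number of reports sent to the LLM matters for cost.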
Microsoft explains that Retrieval Augmented Generation (RAG) struggles to retrieve information that’s based on a topic because it’s only looking at semantic relationships.
GraphRAG outperforms RAG by first transforming all documents in its search index into a knowledge graph that hierarchically organizes topics and subtopics (themes) into increasingly specific layers. While RAG relies on semantic relationships to find answers, GraphRAG uses thematic similarity, enabling it to locate answers even when semantically related keywords are absent in the document.
This is how the original GraphRAG announcement explains it:
“Baseline RAG struggles with queries that require aggregation of information across the dataset to compose an answer. Queries such as “What are the top 5 themes in the data?” perform terribly because baseline RAG relies on a vector search of semantically similar text content within the dataset. There is nothing in the query to direct it to the correct information.
However, with GraphRAG we can answer such questions, because the structure of the LLM-generated knowledge graph tells us about the structure (and thus themes) of the dataset as a whole. This allows the private dataset to be organized into meaningful semantic clusters that are pre-summarized. The LLM uses these clusters to summarize these themes when responding to a user query.”
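The limitation described in the quote can be seen in a toy version of baseline retrieval. The sketch below is an assumption-laden simplification: it replaces a real embedding model with a bag-of-words vector, and the embed, cosine, and top_k functions and the sample chunks are invented for illustration. Real embeddings capture more than word overlap, but the failure mode is similar: a question about the dataset as a whole does not point at any particular chunk.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().replace("?", "").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


chunks = [
    "solar panel prices fell again",
    "offshore wind farms expanded",
    "battery storage costs dropped",
]

# A specific query lands on the matching chunk.
specific = top_k("solar panel prices", chunks, k=1)
# A dataset-wide query matches nothing: every chunk scores zero.
broad = top_k("What are the top 5 themes in the data?", chunks, k=3)
```

In this toy setup the specific query retrieves the solar chunk, while the "top 5 themes" query gives every chunk the same score of zero, so the retriever has no basis for choosing among them.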
To recap, GraphRAG creates a knowledge graph from the search index. A “community” refers to a group of related segments or documents clustered based on topical similarity, and a “community report” is the summary generated by the LLM for each community.
The original version of GraphRAG was inefficient because it processed all community reports, including irrelevant lower-level summaries, regardless of their relevance to the search query. Microsoft describes this as a “static” approach since it lacks dynamic filtering.
The updated GraphRAG introduces “dynamic community selection,” which evaluates the relevance of each community report. Irrelevant reports and their sub-communities are removed, improving efficiency and precision by focusing only on relevant information.
Microsoft explains:
“Here, we introduce dynamic community selection to the global search algorithm, which leverages the knowledge graph structure of the indexed dataset. Starting from the root of the knowledge graph, we use an LLM to rate how relevant a community report is in answering the user question. If the report is deemed irrelevant, we simply remove it and its nodes (or sub-communities) from the search process. On the other hand, if the report is deemed relevant, we then traverse down its child nodes and repeat the operation. Finally, only relevant reports are passed to the map-reduce operation to generate the response to the user.”
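The traversal Microsoft describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GraphRAG code: the Community class, select_reports, and keyword_rater are hypothetical names, the keyword check stands in for the LLM relevance rating, and the sample hierarchy is made up. The sketch keeps every report rated relevant, including parents; which reports a real system forwards to map-reduce may differ.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Community:
    report: str
    children: list["Community"] = field(default_factory=list)


def select_reports(
    root: Community,
    question: str,
    is_relevant: Callable[[str, str], bool],
) -> tuple[list[str], int]:
    """Walk down from the root, pruning irrelevant subtrees.

    Returns the relevant reports and how many reports were rated.
    """
    selected: list[str] = []
    rated = 0
    stack = [root]
    while stack:
        node = stack.pop()
        rated += 1
        if not is_relevant(question, node.report):
            continue  # drop this report and never visit its sub-communities
        selected.append(node.report)
        stack.extend(reversed(node.children))  # visit children left to right
    return selected, rated


def keyword_rater(question: str, report: str) -> bool:
    # Stand-in for the LLM rating: any shared word counts as relevant.
    def words(text: str) -> set[str]:
        return set(text.lower().replace(",", " ").split())
    return bool(words(question) & words(report))


tree = Community("energy overview covering solar, wind, finance", [
    Community("solar power panels and inverters", [
        Community("solar panel manufacturing"),
        Community("solar inverter standards"),
    ]),
    Community("wind power turbines", [Community("offshore wind turbines")]),
    Community("finance and company earnings", [Community("quarterly earnings calls")]),
])

reports, rated = select_reports(tree, "solar", keyword_rater)
```

In this example the tree holds eight reports, but only six are rated and four are kept, because the wind and finance branches are cut off at their top-level report. A static approach would have processed all eight.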
Microsoft tested the new version of GraphRAG and concluded that it resulted in a 77% reduction in computational costs, specifically the token cost when processed by the LLM. Tokens are the basic units of text that are processed by LLMs. The improved GraphRAG is able to use a smaller LLM, further reducing costs without compromising the quality of the results.
Dynamic community selection also improves the quality of search results: responses are more specific, more relevant, and better supported by source material.
Read Microsoft’s announcement:
GraphRAG: Improving global search via dynamic community selection