We survey, evaluate, and open-source state-of-the-art (SOTA) Retrieval-Augmented Generation (RAG) algorithms for LLM customization and reasoning. We provide a comprehensive evaluation of each component of GraphRAG, including graph construction (time), knowledge retrieval (time), answer generation (accuracy), and rationale generation (reasoning). Our goal is to clarify how graph-structured knowledge enhances LLMs' reasoning capabilities compared to traditional RAG approaches.
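To make the measurement protocol concrete, the sketch below times graph construction once per corpus and retrieval once per question, then scores answers and rationales separately. The `build_graph`, `retrieve`, and `generate` methods and the `scorer` object are hypothetical placeholders for illustration, not the benchmark's actual API:

```python
import time

def evaluate_rag_system(system, corpus, questions, scorer):
    """Component-wise evaluation sketch: time each GraphRAG stage and score its outputs."""
    # 1. Graph construction: a one-off cost per corpus ("Time cost" in the tables).
    t0 = time.perf_counter()
    graph = system.build_graph(corpus)  # hypothetical API
    construction_time = time.perf_counter() - t0

    retrieval_times, acc, reasoning = [], [], []
    for q in questions:
        # 2. Knowledge retrieval: measured per question ("Retrieval time").
        t0 = time.perf_counter()
        context = system.retrieve(graph, q["question"])  # hypothetical API
        retrieval_times.append(time.perf_counter() - t0)

        # 3 & 4. Answer and rationale generation, scored independently.
        answer, rationale = system.generate(q["question"], context)  # hypothetical API
        acc.append(scorer.accuracy(answer, q["gold_answer"]))
        reasoning.append(scorer.reasoning(rationale, q["gold_rationale"]))

    return {
        "construction_time_s": construction_time,
        "avg_retrieval_time_s": sum(retrieval_times) / len(retrieval_times),
        "accuracy": 100 * sum(acc) / len(acc),
        "reasoning": 100 * sum(reasoning) / len(reasoning),
    }
```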

Read our survey for a fuller introduction to RAG for LLM customization and reasoning.

Answer-generation accuracy (%) on GraphRAG-Bench:

| Model | Average | TF | MC | MS | FB | OE | Token cost | Time cost | Organization | Retrieval time | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-4o-mini | 70.68 | 75.95 | 81.11 | 76.68 | 74.29 | 52.23 | - | - | - | - | - |
| TF-IDF | 71.71 | 84.17 | 77.88 | 72.52 | 75.71 | 50.18 | - | - | - | - | - |
| BM-25 | 71.66 | 84.49 | 78.80 | 71.17 | 74.28 | 50.00 | - | - | - | - | - |
| RAPTOR | 73.58 | 82.28 | 80.65 | 77.48 | 76.67 | 54.83 | 10,142,221 | 347m27s | - | 0.02s | 2025-01-31 |
| HippoRAG | 72.64 | 81.65 | 80.18 | 74.32 | 70.48 | 56.13 | 33,006,198 | 162m26s | 89.58% | 2.44s | 2024-12-19 |
| GraphRAG (Microsoft) | 72.50 | 80.70 | 81.57 | 77.48 | 75.24 | 52.42 | 79,929,698 | 216m17s | 72.51% | 44.87s | 2025-02-19 |
| GFM-RAG | 72.10 | 82.59 | 80.65 | 72.07 | 72.38 | 52.79 | 32,766,094 | 95m24s | 89.97% | 1.96s | 2025-02-03 |
| KGP | 71.86 | 82.28 | 79.26 | 74.77 | 74.29 | 51.49 | 15,271,633 | 292m2s | 46.03% | 89.38s | 2023-12-25 |
| ToG | 71.71 | 79.75 | 78.80 | 78.38 | 70.48 | 54.28 | 33,008,230 | 105m15s | 89.95% | 70.53s | 2024-03-24 |
| LightRAG | 71.22 | 82.59 | 78.80 | 73.42 | 65.24 | 53.16 | 83,909,073 | 240m6s | 69.71% | 13.95s | 2025-04-28 |
| G-Retriever | 69.84 | 78.80 | 77.42 | 71.62 | 70.95 | 52.04 | 32,948,161 | 103m55s | 89.95% | 23.77s | 2024-05-27 |
| DALK | 69.30 | 77.22 | 78.34 | 71.62 | 70.00 | 51.49 | 33,007,324 | 84m41s | 89.49% | 26.80s | 2024-10-17 |
Rationale-generation reasoning score (%) on GraphRAG-Bench:

| Model | Average | TF | MC | MS | FB | OE | Token cost | Time cost | Organization | Retrieval time | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-4o-mini | 39.78 | 53.40 | 50.92 | 39.19 | 53.33 | 9.76 | - | - | - | - | - |
| TF-IDF | 42.38 | 61.23 | 49.19 | 43.02 | 52.61 | 10.50 | - | - | - | - | - |
| BM-25 | 44.15 | 62.18 | 53.11 | 42.79 | 56.42 | 11.52 | - | - | - | - | - |
| RAPTOR | 45.53 | 62.90 | 52.07 | 49.10 | 57.86 | 13.57 | 10,142,221 | 347m27s | - | 0.02s | 2025-01-31 |
| HippoRAG | 44.55 | 63.61 | 52.30 | 47.52 | 50.48 | 12.36 | 33,006,198 | 162m26s | 89.58% | 2.44s | 2024-12-19 |
| GraphRAG (Microsoft) | 43.30 | 60.13 | 52.42 | 45.72 | 55.24 | 10.50 | 79,929,698 | 216m17s | 72.51% | 44.87s | 2025-02-19 |
| GFM-RAG | 44.30 | 63.69 | 52.07 | 45.50 | 54.76 | 10.69 | 32,766,094 | 95m24s | 89.97% | 1.96s | 2025-02-03 |
| KGP | 42.22 | 60.68 | 52.07 | 44.37 | 49.29 | 8.92 | 15,271,633 | 292m2s | 46.03% | 89.38s | 2023-12-25 |
| ToG | 44.01 | 62.26 | 51.73 | 45.72 | 53.10 | 12.08 | 33,008,230 | 105m15s | 89.95% | 70.53s | 2024-03-24 |
| LightRAG | 43.81 | 63.45 | 52.30 | 49.10 | 47.86 | 10.13 | 83,909,073 | 240m6s | 69.71% | 13.95s | 2025-04-28 |
| G-Retriever | 43.66 | 60.21 | 53.46 | 48.20 | 55.00 | 10.04 | 32,948,161 | 103m55s | 89.95% | 23.77s | 2024-05-27 |
| DALK | 42.12 | 58.23 | 50.35 | 46.40 | 55.24 | 9.67 | 33,007,324 | 84m41s | 89.49% | 26.80s | 2024-10-17 |
Combined view for the base model (accuracy average, reasoning average, and per-type reasoning scores):

| Model | Average | Reasoning | TF | MC | MS | FB | OE |
|---|---|---|---|---|---|---|---|
| GPT-4o-mini | 70.68 | 39.78 | 53.40 | 50.92 | 39.19 | 53.33 | 9.76 |

GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation.
GraphRAG-Bench contains 1,018 college-level questions spanning 16 disciplines (e.g., computer vision, computer networks, human-computer interaction, and AI ethics). The questions test conceptual understanding (e.g., "Given [theorem] A and B, prove [conclusion] C"), complex algorithmic programming (e.g., coding with interlinked function calls), and mathematical computation (e.g., "Given [Input], [Conv1], [MaxPool], [FC], calculate the output volume dimensions").
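For the mathematical-computation category, the worked arithmetic behind such a question looks like the sketch below; the layer parameters are invented for illustration and are not taken from the benchmark:

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a conv/pool layer: floor((W - K + 2P) / S) + 1."""
    return (size - kernel + 2 * padding) // stride + 1

# Hypothetical question: Input 32x32x3 -> Conv1 (16 filters, 5x5, stride 1, pad 2)
# -> MaxPool (2x2, stride 2) -> FC. What volume does the FC layer see?
h = w = conv2d_out(32, kernel=5, stride=1, padding=2)  # 32: padding preserves size
h = w = conv2d_out(h, kernel=2, stride=2)              # 16 after 2x2 max-pooling
channels = 16                                          # one channel per Conv1 filter
print(f"FC input volume: {h}x{w}x{channels} = {h * w * channels} values")  # 16x16x16 = 4096
```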
GraphRAG-Bench contains five diverse question types to thoroughly evaluate different aspects of reasoning: true-or-false (TF), multiple-choice (MC), multi-select (MS), fill-in-the-blank (FB), and open-ended (OE).
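One natural way the leaderboard's Average column could aggregate the five per-type scores is a micro-average weighted by how many of the 1,018 questions each type contributes. The per-type counts below are placeholders, not the benchmark's published split:

```python
# Placeholder per-type question counts summing to 1,018 (the real split is in the paper).
counts = {"TF": 158, "MC": 217, "MS": 222, "FB": 210, "OE": 211}
scores = {"TF": 75.95, "MC": 81.11, "MS": 76.68, "FB": 74.29, "OE": 52.23}  # GPT-4o-mini accuracy row

# Micro-average: each question type is weighted by its number of questions.
average = sum(scores[t] * counts[t] for t in counts) / sum(counts.values())
print(f"count-weighted average: {average:.2f}")
```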

If you find this website helpful, please consider citing our papers:

@article{zhang2025survey,
  title={A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models},
  author={Zhang, Qinggang and Chen, Shengyuan and Bei, Yuanchen and Yuan, Zheng and Zhou, Huachi and Hong, Zijin and Dong, Junnan and Chen, Hao and Chang, Yi and Huang, Xiao},
  journal={arXiv preprint arXiv:2501.13958},
  year={2025}
}
@article{xiao2025graphrag,
  title={GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation},
  author={Xiao, Yilin and Dong, Junnan and Zhou, Chuang and Dong, Su and Zhang, Qianwen and Yin, Di and Sun, Xing and Huang, Xiao},
  journal={arXiv preprint arXiv:2506.02404},
  year={2025}
}