Structured Knowledge Representation through Contextual Pages for Retrieval-Augmented Generation
Authors
Abstract
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by incorporating external knowledge. Recent work has introduced iterative knowledge accumulation into RAG, progressively gathering and refining query-related knowledge to build more comprehensive knowledge representations. However, these iterative processes often lack a coherent organizational structure, which limits the comprehensiveness and cohesion of the resulting representations. To address this, we propose PAGER, a page-driven autonomous knowledge representation framework for RAG. PAGER first prompts an LLM to construct a structured cognitive outline for a given question, consisting of multiple slots that each represent a distinct knowledge aspect. It then iteratively retrieves and refines relevant documents to populate each slot, ultimately assembling a coherent page that serves as contextual input for answer generation. Experiments across multiple knowledge-intensive benchmarks and backbone models show that PAGER consistently outperforms all RAG baselines. Further analyses demonstrate that PAGER constructs higher-quality, information-dense knowledge representations, better mitigates knowledge conflicts, and enables LLMs to leverage external knowledge more effectively. All code is available at https://github.com/OpenBMB/PAGER.
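The three-step pipeline described in the abstract (outline construction, slot population, page-guided generation) can be sketched in a few lines of Python. This is a minimal illustration of the idea under stated assumptions, not the released implementation: `call_llm` and `retrieve` are hypothetical placeholders for an LLM API and a document retriever, and all prompts are invented for illustration.

```python
# Minimal sketch of the PAGER pipeline as described in the abstract.
# `call_llm` and `retrieve` are hypothetical stand-ins, NOT the released code.

from typing import Dict, List

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., any chat-completion API)."""
    raise NotImplementedError

def retrieve(query: str, k: int = 5) -> List[str]:
    """Placeholder for a retriever over an external document corpus."""
    raise NotImplementedError

def build_outline(question: str) -> List[str]:
    # Step 1: prompt the LLM for a structured cognitive outline whose
    # slots each name a distinct knowledge aspect of the question.
    prompt = (
        f"Question: {question}\n"
        "List the distinct knowledge aspects needed to answer it, one per line."
    )
    return [s.strip() for s in call_llm(prompt).splitlines() if s.strip()]

def populate_slot(question: str, slot: str, rounds: int = 2) -> str:
    # Step 2: iteratively retrieve documents for the slot and refine
    # the slot's content over several rounds.
    content = ""
    for _ in range(rounds):
        docs = retrieve(f"{question} {slot}")
        prompt = (
            f"Slot: {slot}\nCurrent draft: {content}\n"
            "Documents:\n" + "\n".join(docs) +
            "\nRefine the draft so it covers this slot accurately."
        )
        content = call_llm(prompt)
    return content

def answer_with_page(question: str) -> str:
    # Step 3: assemble the populated slots into a coherent page and use
    # it as the contextual input for answer generation.
    page: Dict[str, str] = {s: populate_slot(question, s) for s in build_outline(question)}
    page_text = "\n\n".join(f"## {slot}\n{body}" for slot, body in page.items())
    return call_llm(f"Context page:\n{page_text}\n\nQuestion: {question}\nAnswer:")
```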
Paper Information
- arXiv ID: 2601.09402v1
- Published:
- Categories: cs.CL