Innovative practice of archival data development workflow in the AGI era: a case study of scientist archives project
DOI: https://doi.org/10.47989/ir30iConf47335
Keywords: Archival Workflow Innovation, Archival Data Development, Archival Data Mining, Large Language Model Application in Archival Work
Abstract
Introduction. The advent of large language models (LLMs) presents a transformative opportunity for archival science, offering advanced capabilities such as intelligent information processing and semantic search. These capabilities address critical challenges posed by the exponential growth of archival materials and the increasing demand for efficient data analysis. Traditional archival workflows, which often rely on manual description and optical character recognition (OCR), struggle with the complexities of unstructured digital data, especially in digitized historical archives and manuscripts.
Method. This paper explores a novel archival workflow through a case study of the scientist archives project. The workflow integrates human-machine collaboration and draws on open-source archival databases, IIIF-supported OCR environments, and LLM-based techniques, including classic retrieval-augmented generation (Classic RAG) and graph-based retrieval-augmented generation (GraphRAG).
Analysis. The scientist archives project is analysed as a case in which AGI-era technologies are used to mine and manage archives.
Results. By embracing these technologies, the proposed approach seeks to revolutionize archival management, enhancing both efficiency and the depth of content revelation in the AGI era.
Conclusions. This study contributes to the ongoing discourse on the intelligent transformation of archival practices, providing a roadmap for future archival data mining and management.
License
Copyright (c) 2025 Yaming Fu, Jie Song, Xinran Zhang, Jingyun Bi

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.