Systematically modeling and extracting bibliographic metadata of power grid standard documents with LLMs

Authors

  • Guowei Chen, State Grid Fujian Electric Power Co., Ltd., Fuzhou, China
  • Wei Xie, State Grid Fujian Electric Power Research Institute, Fuzhou, China
  • Yanan Liu, Yingda Media Investment Group Company Ltd., Beijing, China
  • Xiaoqun Yuan, School of Information Management, Wuhan University; Key Laboratory of Semantic Publishing and Knowledge Service of the National Press and Publication Administration, Wuhan University
  • Liang Zhao, School of Information Management, Wuhan University; Key Laboratory of Semantic Publishing and Knowledge Service of the National Press and Publication Administration, Wuhan University

DOI:

https://doi.org/10.47989/ir30iConf47233

Keywords:

Power Grid Standards, Bibliographic Metadata Extraction, Large Language Models, Trustworthiness Estimation

Abstract

Introduction. This study addresses the need for systematic representation and extraction of bibliographic metadata from power grid standard documents, which is essential for operational efficiency and knowledge management in the power industry.

Method. We developed a two-stage methodology that uses large language models (LLMs) to extract bibliographic metadata. The first stage constructs State Grid-oriented instructions for the LLM, and the second stage applies a trustworthiness estimation to ensure the reliability of the extracted metadata.
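The two stages can be illustrated with a minimal Python sketch. The abstract does not specify how instructions are worded or how trustworthiness is estimated, so everything below is an assumption made for illustration: the prompt wordings, the `build_instructions` and `extract_with_trust` names, the `fake_llm` stand-in (no real model is called), the placeholder standard number, and the use of agreement across instruction variants as a trust score.

```python
from collections import Counter
from typing import Callable, Dict, List


def build_instructions(field: str) -> List[str]:
    """Stage 1 (assumed form): several phrasings asking for the same field."""
    return [
        f"Extract the {field} from the power grid standard document below. "
        "Reply with the value only.",
        f"What is the {field} of the following standard? Reply with the value only.",
        f"Read the cover page text and return only its {field}.",
    ]


def extract_with_trust(
    text: str,
    field: str,
    llm: Callable[[str], str],
    threshold: float = 0.6,  # hypothetical acceptance threshold
) -> Dict[str, object]:
    """Stage 2 (assumed form): score an answer by agreement across instructions."""
    answers = [llm(prompt + "\n\n" + text).strip() for prompt in build_instructions(field)]
    value, votes = Counter(answers).most_common(1)[0]
    trust = votes / len(answers)
    return {"field": field, "value": value, "trust": trust, "accepted": trust >= threshold}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: one phrasing yields a different answer."""
    if prompt.startswith("What is"):
        return "unknown"
    return " Q/GDW 0000-0000 "  # placeholder value, not a real standard number


result = extract_with_trust("(document text here)", "standard number", fake_llm)
print(result)
```

In this sketch two of the three instruction variants agree, so the value is kept with a trust score of about 0.67. A real pipeline would replace `fake_llm` with an actual model call and could substitute any other reliability signal for the agreement ratio.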

Analysis. Experiments were conducted on 96 State Grid PDF samples to test the accuracy of metadata extraction. The performance of different LLMs was evaluated under both single-instruction and multiple-instruction settings.
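A comparison of this kind could be scored as in the sketch below. The abstract does not define the accuracy metric, so field-level exact match (case-insensitive) is an assumption, and the `field_accuracy` helper, the field names, and the toy records are hypothetical. The numbers it produces are not the paper's results.

```python
from typing import Dict, List

Record = Dict[str, str]


def field_accuracy(predictions: List[Record], gold: List[Record]) -> float:
    """Share of gold fields whose predicted value matches exactly (case-insensitive)."""
    total = 0
    correct = 0
    for pred, ref in zip(predictions, gold):
        for field, expected in ref.items():
            total += 1
            if pred.get(field, "").strip().lower() == expected.strip().lower():
                correct += 1
    return correct / total if total else 0.0


# Toy data for illustration only.
gold = [
    {"title": "Standard A", "year": "2019"},
    {"title": "Standard B", "year": "2021"},
]
single_instruction = [
    {"title": "Standard A", "year": "2018"},
    {"title": "Standard X", "year": "2021"},
]
multiple_instructions = [
    {"title": "standard a", "year": "2019"},
    {"title": "Standard B", "year": "2020"},
]

print("single:", field_accuracy(single_instruction, gold))
print("multiple:", field_accuracy(multiple_instructions, gold))
```

Exact match is a strict criterion. A looser comparison, for example one that normalizes punctuation in titles, would be a reasonable alternative depending on how the reference metadata were prepared.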

Results. The results showed over 70% accuracy across all models, with GPT-4 achieving the highest accuracy of 84%. Multiple instructions outperformed single instructions, highlighting the effectiveness of our approach.

Conclusions. This study demonstrates the promise of LLMs for data management in the power grid field, with the trustworthiness estimation mechanism significantly enhancing the reliability of the extracted data.

Published

2025-03-11

How to Cite

Chen, G., Xie, W., Liu, Y., Yuan, X., & Zhao, L. (2025). Systematically modeling and extracting bibliographic metadata of power grid standard documents with LLMs. Information Research an International Electronic Journal, 30(iConf), 654–665. https://doi.org/10.47989/ir30iConf47233

Section

Peer-reviewed papers
