Systematically modeling and extracting bibliographic metadata of power grid standard documents with LLMs
DOI: https://doi.org/10.47989/ir30iConf47233
Keywords: Power Grid Standards, Bibliographic Metadata Extraction, Large Language Models, Trustworthiness Estimation
Abstract
Introduction. This study addresses the critical need for systematic bibliographic metadata representation and extraction from power grid standard documents, essential for operational efficiency and knowledge management in the power industry.
Method. We developed a two-stage methodology that uses large language models (LLMs) to extract bibliographic metadata. The first stage constructs state grid-oriented instructions for the LLM, and the second stage applies trustworthiness estimation to ensure the reliability of the extracted metadata.
Analysis. Experiments were conducted using 96 state grid PDF samples to test the accuracy of metadata extraction. The performance of different LLMs was evaluated using single and multiple instructions.
Results. The results showed over 70% accuracy across all models, with GPT-4 achieving the highest accuracy of 84%. Multiple instructions outperformed single instructions, highlighting the effectiveness of our approach.
Conclusions. This study demonstrates the promising potential of LLMs for data management in the power grid field, with the trustworthiness estimation mechanism significantly enhancing the reliability of the extracted data.
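The two-stage idea described under Method can be illustrated with a minimal sketch. The abstract does not say how trustworthiness is estimated, so the sketch assumes one simple possibility: ask the same question with several instruction phrasings and treat the share of agreeing answers as a trust score. The instruction templates, the `ask_llm` callable, the `min_agreement` threshold, and the sample values are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical instruction templates (stage 1). The paper's actual
# state grid-oriented instructions are not given in the abstract.
INSTRUCTIONS: List[str] = [
    "Extract the {field} of this power grid standard document.",
    "What is the {field} stated on the cover page of this standard?",
    "Return only the {field} from the following standard text.",
]


def extract_field(
    text: str,
    field: str,
    ask_llm: Callable[[str, str], str],
    min_agreement: float = 0.5,
) -> Tuple[Optional[str], float]:
    """Query the model once per instruction and score agreement (stage 2).

    Returns (value, score). The value is None when the most common answer
    is supported by fewer than `min_agreement` of the instructions.
    """
    answers = [ask_llm(t.format(field=field), text).strip() for t in INSTRUCTIONS]
    answers = [a for a in answers if a]  # drop empty responses
    if not answers:
        return None, 0.0
    value, votes = Counter(answers).most_common(1)[0]
    score = votes / len(INSTRUCTIONS)  # assumed trust score: agreement ratio
    return (value if score >= min_agreement else None), score


def fake_llm(instruction: str, text: str) -> str:
    """Stand-in for a real model call; returns made-up placeholder answers."""
    if instruction.startswith("Return only"):
        return "STD-0002"  # one phrasing disagrees
    return " STD-0001 "  # two phrasings agree (whitespace is stripped)


value, score = extract_field("placeholder document text", "standard number", fake_llm)
print(value, round(score, 2))
```

In practice `fake_llm` would be replaced by a call to whichever model is being evaluated, and fields whose score falls below the threshold could be routed to manual review rather than discarded.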
License
Copyright (c) 2025 Guowei Chen, Wei Xie, Yanan Liu, Xiaoqun Yuan, Liang Zhao

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.