<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="research-article" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">IR</journal-id>
<journal-title-group>
<journal-title>Information Research</journal-title>
</journal-title-group>
<issn pub-type="epub">1368-1613</issn>
<publisher>
<publisher-name>University of Bor&#x00E5;s</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">ir30iConf47233</article-id>
<article-id pub-id-type="doi">10.47989/ir30iConf47233</article-id>
<article-categories>
<subj-group xml:lang="en">
<subject>Research article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Systematically modeling and extracting bibliographic metadata of power grid standard documents with LLMs</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name><surname>Chen</surname><given-names>Guowei</given-names></name>
<xref ref-type="aff" rid="aff0001"/></contrib>
<contrib contrib-type="author"><name><surname>Xie</surname><given-names>Wei</given-names></name>
<xref ref-type="aff" rid="aff0002"/></contrib>
<contrib contrib-type="author"><name><surname>Liu</surname><given-names>Yanan</given-names></name>
<xref ref-type="aff" rid="aff0003"/></contrib>
<contrib contrib-type="author"><name><surname>Yuan</surname><given-names>Xiaoqun</given-names></name>
<xref ref-type="aff" rid="aff0004"/></contrib>
<contrib contrib-type="author"><name><surname>Zhao</surname><given-names>Liang</given-names></name>
<xref ref-type="aff" rid="aff0005"/></contrib>
<aff id="aff0001"><bold>Guowei Chen</bold> works at State Grid Fujian Electric Power Co., Ltd., Fuzhou, China. His research interests are technology management and knowledge services, and he can be contacted at <email xlink:href="chen_guowei@fj.sgcc.com.cn">chen_guowei@fj.sgcc.com.cn</email>.</aff>
<aff id="aff0002"><bold>Wei Xie</bold> works at State Grid Fujian Electric Power Research Institute, Fuzhou, China. His research interest is artificial intelligence in electric power, and he can be contacted at <email xlink:href="xiewei3896@163.com">xiewei3896@163.com</email>.</aff>
<aff id="aff0003"><bold>Yanan Liu</bold> is an Editor at Yingda Media Investment Group Co., Ltd. Her research interest is digitalization of electric power standards, and she can be contacted at <email xlink:href="743113090@qq.com">743113090@qq.com</email>.</aff>
<aff id="aff0004"><bold>Xiaoqun Yuan</bold> is Associate Professor in the School of Information Management, Wuhan University, China. He received his Ph.D. from Huazhong University of Science and Technology. His research interests are information resource management and knowledge services, and he can be contacted at <email xlink:href="yuan20030308@whu.edu.cn">yuan20030308@whu.edu.cn</email>.</aff>
<aff id="aff0005"><bold>Liang Zhao</bold> is Associate Professor in the School of Information Management, Wuhan University, China. She received her Ph.D. from Tsinghua University. Her research interests are data mining and knowledge services. She is the corresponding author of the paper and can be contacted at <email xlink:href="liangzhao@whu.edu.cn">liangzhao@whu.edu.cn</email>.
</contrib-group>
<pub-date pub-type="epub"><day>06</day><month>05</month><year>2025</year></pub-date>
<pub-date pub-type="collection"><year>2025</year></pub-date>
<volume>30</volume>
<issue>i</issue>
<fpage>654</fpage>
<lpage>665</lpage>
<permissions>
<copyright-year>2025</copyright-year>
<copyright-holder>&#x00A9; 2025 The Author(s).</copyright-holder>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by-nc/4.0/">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</ext-link>), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract xml:lang="en">
<title>Abstract</title>
<p><bold>Introduction.</bold> This study addresses the critical need for systematic bibliographic metadata representation and extraction from power grid standard documents, essential for operational efficiency and knowledge management in the power industry.</p>
<p><bold>Method.</bold> We developed a two-stage methodology utilizing large language models (LLMs) for extracting bibliographic metadata. The first stage involves constructing state grid-oriented instructions for the LLM, and the second stage includes a trustworthiness estimation to ensure the reliability of the extracted metadata.</p>
<p><bold>Analysis.</bold> Experiments were conducted using 96 state grid PDF samples to test the accuracy of metadata extraction. The performance of different LLMs was evaluated using single and multiple instructions.</p>
<p><bold>Results.</bold> The results showed over 70% accuracy across all models, with GPT-4 achieving the highest accuracy of 84%. Multiple instructions outperformed single instructions, highlighting the effectiveness of our approach.</p>
<p><bold>Conclusion(s).</bold> This study demonstrates the promising potential of LLMs for data management in the power grid field, with the trustworthiness estimation mechanism significantly enhancing the reliability of the extracted data.</p>
</abstract>
</article-meta>
</front>
<body>
<sec id="sec1">
<title>Introduction</title>
<p>Power grid standard documents provide power grid enterprises with a set of standardized operation and management processes for actual production (<xref rid="R5" ref-type="bibr">China Power, 2024</xref>). Through unified regulations, standardization helps optimize resource allocation, risk control and decision-making support, thus enhancing overall operational efficiency and service quality (<xref rid="R10" ref-type="bibr">Gal &#x0026; Rubinfeld, 2019</xref>; Reyes et al., 2023).</p>
<p>In the era of burgeoning data and information systems, metadata plays a pivotal role in organizing, navigating, and understanding the massive body of standard documentation resources (<xref rid="R28" ref-type="bibr">Zeng &#x0026; Qin, 2020</xref>; <xref rid="R20" ref-type="bibr">Riley, 2017</xref>; <xref rid="R1" ref-type="bibr">Baca, 2016</xref>). In particular, bibliographic metadata of standard documents, which includes information such as titles, authors, and publication details, is crucial for effective knowledge management and retrieval (Liu et al., 2022; Reyes et al., 2023).</p>
<p>However, systematically and effectively extracting structured bibliographic metadata from a variety of unstructured standard documents (e.g., in PDF format) is a non-trivial task. Existing methods of knowledge extraction from PDFs (<xref rid="R3" ref-type="bibr">B&#x00FC;chter et al., 2020</xref>) typically use OCR techniques to convert these documents into editable text (He, 2020; Karthick et al., 2019; Bartz et al., 2017), followed by manual rule-based systems (Proctor et al., 2019), e.g., regular expressions (<xref rid="R4" ref-type="bibr">Chapman et al., 2017</xref>), or the construction of machine learning models (Fanni et al., 2023; <xref rid="R13" ref-type="bibr">Kang et al., 2020</xref>; Chowdhary &#x0026; <xref rid="R6" ref-type="bibr">Chowdhary, 2020</xref>) to recognize the required information. Such methods generally incur high costs and require model readjustment whenever the set of standard documents changes, lacking a convenient and universal framework.</p>
<p>Obviously, two major challenges need addressing. Conceptually, a unified and structured representation framework of bibliographic metadata specifically for power grid standard documents is still lacking. Technically, a unified and low-cost knowledge extraction pipeline needs to be developed.</p>
<p>The advent of large language models (LLMs) has marked a significant leap forward in the field of knowledge extraction, especially in areas where traditional data labelling is limited or costly. Efforts have been made to harness the abilities of LLMs for generative information extraction tasks (e.g., <xref rid="R26" ref-type="bibr">Xu et al., 2023</xref>; Sivarajkumar et al., 2023). Researchers have been investigating the usage of LLMs in both zero-shot and in-context learning settings to tackle the problem of extracting procedures from unstructured text in an incremental question-answering fashion (<xref rid="R7" ref-type="bibr">Dagdelen et al., 2024</xref>; <xref rid="R12" ref-type="bibr">Kabongo &#x0026; D&#x2019;Souza, 2024</xref>). For example, Papaluca et al. (<xref rid="R17" ref-type="bibr">2023</xref>) developed a dynamic pipeline for knowledge graph triplet extraction in zero-shot and few-shot scenarios by using contextual data from a knowledge base. Xue et al. (<xref rid="R27" ref-type="bibr">2024</xref>) introduced AutoRE, a method for relationship extraction that goes beyond traditional sentence-level analysis to document-level analysis. Through simple prompt-based strategies, Shao et al. (2023) explored using pre-trained LLMs for extracting astronomical knowledge entities from astrophysical journal articles. Zhang et al. (<xref rid="R29" ref-type="bibr">2024</xref>) proposed a zero-shot learning framework to enable LLMs to assimilate knowledge even without direct training on specific datasets.</p>
<p>Inspired by the promising capabilities of LLMs, in this study we propose a comprehensive pipeline for both modeling and extracting the bibliographic metadata of power grid standard documents, marking a pivotal step from conceptual representation to application realization within the domain of power grid standard documentation. As shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, we adopt a concise and efficient pipeline to realize metadata extraction from power grid standard documents, which consists of two modules addressing the two major challenges, respectively. Firstly, we conceptually construct a unified and structured framework of bibliographic metadata specifically for power grid standard documents, a step that is the basis of the entire process and involves determining what key information should be extracted from the documents. Secondly, a two-stage methodology utilizing LLMs for extracting bibliographic metadata is designed, including instruction design upon LLMs and trustworthiness estimation to refine the reliability.</p>
<fig id="F1">
<label>Figure 1.</label>
<caption><p>Research framework</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="images\c56-fig1.jpg"><alt-text>none</alt-text></graphic>
</fig>
</sec>
<sec id="sec2">
<title>Modeling bibliographic metadata of power grid standard documents</title>
<sec id="sec2_1">
<title>Conceptualization of bibliographic metadata</title>
<p>It is necessary to construct a unified and structured conceptualization that categorizes metadata into distinct yet interconnected components. According to China&#x2019;s national standard GB/T 22373-2021 &#x2018;Standard Document Metadata&#x2019; (<ext-link ext-link-type="uri" xlink:href="http://c.gb688.cn/bzgk/gb/showGb?type=online&#x0026;hcno=174FB61306FB900BFD86FF84C3BD12F7">http://c.gb688.cn/bzgk/gb/showGb?type=online&#x0026;hcno=174FB61306FB900BFD86FF84C3BD12F7</ext-link>), which is designed to standardize the metadata of general standard documents (Liu et al., 2022), we summarize and design the conceptual framework of bibliographic metadata by incorporating attributes specific to the power grid field, capturing the essence of the metadata systematically and hierarchically. In total, 24 bibliographic metadata items are conceptualized, organized into four categories: basic information, semantic information, correlated information, and property information. <xref ref-type="fig" rid="F2">Figure 2</xref> illustrates the hierarchy for clarity.</p>
<fig id="F2">
<label>Figure 2.</label>
<caption><p>Hierarchical conceptualization of bibliographic metadata</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="images\c56-fig2.jpg"><alt-text>none</alt-text></graphic>
</fig>
</sec>
<sec id="sec2_2">
<title>Detailed representations of metadata items</title>
<p>In this section, we will detail the representation of each metadata item in the standard documents according to the above hierarchical conceptualization.</p>
<sec id="sec2_2_1">
<title>Basic information</title>
<p><bold>Standard number:</bold> a character type that consists of the standard code, followed by a space, a sequential number, a hyphen, and a four-digit publication year. This is located on the front page of the standard document.</p>
<p><bold>International standard classification number (ICS):</bold> a character type used for international classification, providing a standardized reference for the document&#x2019;s subject matter.</p>
<p><bold>Chinese standard classification number (CCS):</bold> a character type that categorizes the standard within the Chinese regulatory framework.</p>
<p><bold>Standard level:</bold> indicates the hierarchy of the standard, such as GB (National Standard), DL (Industry Standard), T (Group Standard), Q (Enterprise Standard), and DB (Local Standard).</p>
<p><bold>Release date:</bold> presented in YYYY-MM-DD format, indicates when the standard was officially published.</p>
<p><bold>Implementation or trial date:</bold> also in YYYY-MM-DD format, specifies when the standard becomes effective or enters a trial phase.</p>
<p><bold>Text language:</bold> identifies the language of the document, such as Chinese or English.</p>
<p><bold>Standard category:</bold> classifies the standard into types such as terminology, symbols, classification, testing, specifications, codes of practice, and guidelines.</p>
<p><bold>Uniform book number:</bold> a character type that serves as a unique identifier for the publication.</p>
</sec>
<sec id="sec2_2_2">
<title>Semantic information</title>
<p>This category captures the content-related aspects of the standard, providing insights into its purpose and scope.</p>
<p><bold>Chinese standard name:</bold> the official title of the standard in Chinese, presented as free text.</p>
<p><bold>English standard name:</bold> the official title of the standard in English, also presented as free text.</p>
<p><bold>Main revisions or changes:</bold> a free text description of significant revisions or changes made to the standard, with different entries separated by semicolons.</p>
<p><bold>Scope:</bold> defines the applicability of the standard, detailing what is covered and the contexts in which it is relevant.</p>
</sec>
<sec id="sec2_2_3">
<title>Correlated information</title>
<p>This category includes references and relationships that connect the standard to other documents and frameworks.</p>
<p><bold>Drafting rules:</bold> refers to the standards that guided the drafting process, indicated by their standard numbers. This information is typically found in the preface of the document.</p>
<p><bold>Replaced standards:</bold> lists the numbers and names of standards that this document replaces, with multiple entries separated by semicolons.</p>
<p><bold>Normative reference documents:</bold> free text that includes standard numbers and descriptions of documents that are considered normative, with multiple items separated by semicolons.</p>
<p><bold>Cited literature:</bold> free text that may include standard numbers and names of cited documents, with multiple entries separated by semicolons. This is typically found on the last page and copyright page of the document.</p>
</sec>
<sec id="sec2_2_5">
<title>Property information</title>
<p>This category encompasses details regarding ownership of and responsibility for the standard.</p>
<p><bold>Publishing organization:</bold> the entity responsible for publishing the standard, ensuring its formal distribution and recognition, typically presented as free text.</p>
<p><bold>Submitted by:</bold> the organization that submitted the standard for review and publication, which may include government bodies or industry groups, with multiple entities separated by commas.</p>
<p><bold>Subordinate unit:</bold> any subsidiary or subordinate units under the main organization that contributed to the standard, listed in free text.</p>
<p><bold>Drafting unit:</bold> the specific group or department within an organization tasked with drafting the standard document, presented as free text.</p>
<p><bold>Implementing unit:</bold> the entity responsible for the enforcement and practical application of the standard, detailed in free text.</p>
<p><bold>Drafter(s):</bold> individuals involved in drafting the standard, with multiple names separated by commas to acknowledge all contributors.</p>
<p><bold>Publisher:</bold> the organization or entity that physically produces and disseminates the final version of the standard document, typically indicated as free text.</p>
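<p>For illustration, the 24-item conceptualization above can be collected into a single data structure. The following Python sketch uses field names of our own choosing (they are illustrative, not prescribed by GB/T 22373-2021); multi-valued items are modeled as lists, matching the semicolon- and comma-separated representations described above.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PowerGridStandardMetadata:
    """Illustrative container for the 24 bibliographic metadata items,
    grouped by the four conceptual categories."""
    # Basic information (9 items)
    standard_number: Optional[str] = None
    ics_number: Optional[str] = None
    ccs_number: Optional[str] = None
    standard_level: Optional[str] = None                 # GB, DL, T, Q, or DB
    release_date: Optional[str] = None                   # YYYY-MM-DD
    implementation_or_trial_date: Optional[str] = None   # YYYY-MM-DD
    text_language: Optional[str] = None
    standard_category: Optional[str] = None
    uniform_book_number: Optional[str] = None
    # Semantic information (4 items)
    chinese_standard_name: Optional[str] = None
    english_standard_name: Optional[str] = None
    main_revisions_or_changes: List[str] = field(default_factory=list)
    scope: Optional[str] = None
    # Correlated information (4 items)
    drafting_rules: List[str] = field(default_factory=list)
    replaced_standards: List[str] = field(default_factory=list)
    normative_reference_documents: List[str] = field(default_factory=list)
    cited_literature: List[str] = field(default_factory=list)
    # Property information (7 items)
    publishing_organization: Optional[str] = None
    submitted_by: List[str] = field(default_factory=list)
    subordinate_unit: Optional[str] = None
    drafting_unit: Optional[str] = None
    implementing_unit: Optional[str] = None
    drafters: List[str] = field(default_factory=list)
    publisher: Optional[str] = None
```

<p>All fields default to empty so that a partially extracted record remains well-formed while missing items are filled in.</p>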
</sec>
</sec>
</sec>
<sec id="sec3">
<title>Prompt-based extracting bibliographic metadata by LLMs</title>
<p>Leveraging the abilities of LLMs, we design a two-stage prompt strategy to extract the bibliographic metadata from the standard documents in a simple-yet-powerful, zero-code manner.</p>
<sec id="sec3_1">
<title>Stage 1: state grid-oriented instruction construction</title>
<p>The first stage of our method is the <italic>&#x2018;State Grid-Oriented Instruction Construction&#x2019;</italic>, where we meticulously design and implement a set of instructions aimed at guiding the large language model (LLM) in extracting metadata from power grid standard documents. This stage is crucial as it sets the groundwork for how the LLM interacts with the documents and identifies the necessary metadata elements. Through these instructions, we effectively prompt the LLM to perform the task of metadata extraction, ensuring that it focuses on the relevant details and follows a structured approach to gather the required information.</p>
<sec id="sec3_1_1">
<title>Identity instruction</title>
<p>This step prompts the LLM to act as a metadata extraction expert and then tells it what to do next. The instruction is as follows: <italic>now you are a metadata extraction expert. I will provide the PDF to be extracted and the extracted metadata items. Please help me extract the relevant content.</italic></p>
</sec>
<sec id="sec3_1_2">
<title>Metadata instruction</title>
<p>Two alternative prompting strategies are designed to extract the metadata.</p>
<p><bold>Multiple instructions (MI)</bold> extract items based on the different locations of metadata within the standard document.</p>
<list list-type="order">
<list-item><p>Prompts for metadata items in the homepage of standard files:</p>
<p><italic>please output the following content in sequence: Standard number, International standard classification number, Chinese standard classification number, Standard level (GB-national standard, DL-industry standard, T-group standard, Q-enterprise standard, DB-local standard), Chinese standard name, English standard name, Release date, Implementation or trial date, Release organization, Text language, Standard category.</italic></p></list-item>
<list-item><p>For metadata items in the preface of standard files:</p>
<p><italic>please output the following content in order: Drafting rules, Replacement standards, Main revisions or changes, Proposing unit, Responsible unit, Drafting unit, Implementing unit, Drafter.</italic></p></list-item>
<list-item><p>For metadata items in the body part:</p>
<p><italic>please output the following content in order: Specified scope, Applicable scope, Inapplicable scope, Normative reference documents (multiple items separated by ;).</italic></p></list-item>
<list-item><p>For metadata items in the copyright page of standard files:</p>
<p><italic>please output the following content in order: References (multiple items separated by ;), Publisher, Uniform book number.</italic></p></list-item>
</list>
<p><bold>Single instruction (SI)</bold> simply extracts all metadata in one prompt, without location hints, as follows:</p>
<p><italic>output the following content in sequence: Standard number, International standard classification number, Chinese standard classification number, Standard level (GB-national standard, DL-industry standard, T-group standard, Q-enterprise standard, DB-local standard), Chinese standard name, English standard name, Release date, Implementation or trial date, Release organization, Text language, Standard category, Drafting rules, Replacement standards, Main revisions or changes, Proposing unit, Responsible unit, Drafting unit, Implementing unit, Drafter, Specified scope, Applicable scope, Inapplicable scope, Normative reference documents (multiple items separated by ;), References (multiple items separated by ;), Publisher, Uniform book number.</italic></p>
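<p>As a sketch of how the two instruction strategies are assembled, the following Python snippet builds the conversation sent for one document. The message format follows common chat-style LLM APIs; the instruction strings are abbreviated paraphrases of the prompts above, and no particular vendor API is assumed.</p>

```python
# Identity instruction: prompts the LLM to act as a metadata extraction expert.
IDENTITY_INSTRUCTION = (
    "Now you are a metadata extraction expert. I will provide the PDF to be "
    "extracted and the extracted metadata items. Please help me extract the "
    "relevant content."
)

# One metadata instruction per document region (homepage, preface, body,
# copyright page); the item lists are abbreviated here for illustration.
MI_INSTRUCTIONS = [
    "Please output the following content in sequence: Standard number, ...",
    "Please output the following content in order: Drafting rules, ...",
    "Please output the following content in order: Specified scope, ...",
    "Please output the following content in order: References, ...",
]

def build_conversation(metadata_instructions):
    """Assemble the chat messages for one document: the identity
    instruction first, then one user turn per metadata instruction."""
    messages = [{"role": "system", "content": IDENTITY_INSTRUCTION}]
    for instruction in metadata_instructions:
        messages.append({"role": "user", "content": instruction})
    return messages

# MI sends four location-specific turns; SI collapses them into one turn.
mi_messages = build_conversation(MI_INSTRUCTIONS)
si_messages = build_conversation([" ".join(MI_INSTRUCTIONS)])
```

<p>Under MI the model answers region by region, while under SI it must locate every item from a single combined request.</p>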
</sec>
<sec id="sec3_2">
<title>Stage 2: trustworthiness estimation (TE)</title>
<p>Following the metadata extraction facilitated by the State Grid-Oriented Instructions, the second stage of our method is the <italic>&#x2018;Trustworthiness Estimation.&#x2019;</italic> In this phase, we evaluate the credibility of the metadata extracted by the LLM to enhance the reliability of the model&#x2019;s results. This involves assessing the consistency and accuracy of the extracted data, ensuring that the final output is not only comprehensive but also trustworthy. By implementing a trustworthiness estimation mechanism, we aim to mitigate the inherent variability in LLM responses and select the most reliable metadata, thereby ensuring the highest quality of information for our users.</p>
<p>Randomness is a common phenomenon with LLMs: if you prompt an LLM with the same instruction multiple times, it may give different answers. To overcome this challenge, we propose a trustworthiness estimation mechanism to make our instruction method more reliable.</p>
<p>We use the instructions to prompt the LLM k times. For the i-th unique metadata answer, we define its trustworthiness as:</p>
<disp-formula><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="block"><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:msub><mml:mi>h</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>k</mml:mi></mml:mfrac></mml:math></disp-formula>
<p>where h<sub>i</sub> denotes the number of times this answer appears among the k responses. Lastly, we choose the most trustworthy answer, indexed by i*, as the final answer, where</p>
<disp-formula><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="block"><mml:msup><mml:mi>i</mml:mi><mml:mo>&#x2217;</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:munder><mml:mo>argmax</mml:mo><mml:mi>i</mml:mi></mml:munder><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:math></disp-formula>
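<p>The trustworthiness estimation amounts to majority voting over the k repeated responses. A minimal Python sketch, assuming answers are compared as exact strings (the paper does not specify the comparison rule, so exact matching is an assumption of this illustration):</p>

```python
from collections import Counter

def estimate_trustworthiness(answers):
    """Given the k answers obtained by prompting the LLM k times with the
    same instruction, return (best_answer, t), where t = h / k and h is
    the number of times the chosen answer appeared."""
    k = len(answers)
    counts = Counter(answers)
    best_answer, h = counts.most_common(1)[0]  # most frequent unique answer
    return best_answer, h / k

# e.g. three runs, two of which agree on the standard number
answers = ["GB/T 22373-2021", "GB/T 22373-2021", "GB 22373"]
best, t = estimate_trustworthiness(answers)
```

<p>Here the majority answer is selected with trustworthiness t = 2/3; larger k trades extra API calls for more reliable estimates.</p>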
</sec>
</sec>
</sec>
<sec id="sec4">
<title>Experimental study and result</title>
<sec id="sec4_1">
<title>Experiment setting</title>
<p><bold>Dataset</bold>: we conducted the experiment on 96 state grid PDF samples. All 96 samples are treated as testing data, as the LLMs do not need to be trained or fine-tuned.</p>
<p><bold>Evaluation metric</bold>: we use accuracy as the evaluation metric. For each sample, accuracy is calculated as the ratio of correctly extracted metadata items to the total number of items. The final result is the average accuracy over all testing samples.</p>
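<p>Concretely, the metric can be sketched as follows, assuming each sample is a mapping from metadata item to value and correctness means an exact match with the gold value (an assumption of this illustration):</p>

```python
def sample_accuracy(extracted, gold):
    """Per-sample accuracy: the fraction of metadata items whose
    extracted value matches the gold value."""
    right = sum(extracted.get(item) == value for item, value in gold.items())
    return right / len(gold)

def average_accuracy(pairs):
    """Final score: mean per-sample accuracy over all (extracted, gold)
    pairs in the test set."""
    return sum(sample_accuracy(e, g) for e, g in pairs) / len(pairs)
```
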
<p><bold>Baselines</bold>: four backbone LLMs are evaluated: GPT-4, Kimi, GLM, and LLaMA (7B).</p>
<p><bold>Settings</bold>: we directly use the official API of each LLM to conduct our experiment.</p>
</sec>
<sec id="sec4_2">
<title>Result</title>
<sec id="sec4_2_1">
<title>Main results</title>
<p>Our extensive testing of various LLMs using both single and multiple instruction methods (with TE by default) has yielded promising results, as shown in <xref ref-type="table" rid="T1">Table 1</xref>. All models achieved an average accuracy of over 70%, validating the effectiveness of our instructional approach in extracting metadata. Notably, the GPT-4 model demonstrated exceptional performance with the highest accuracy of 84% using multiple instructions, affirming its robustness as a backbone model for the task. This superior performance is likely due to GPT-4&#x2019;s advanced natural language processing capabilities, which allow it to better comprehend and respond to complex instructions.</p>
<p>Moreover, a consistent trend emerges, revealing that multiple instructions consistently outperform single instruction across all models. This trend suggests that the additional context provided by multiple instructions enables the models to generate more precise and detailed responses, thereby enhancing the overall accuracy of metadata extraction. </p>
<table-wrap id="T1">
<label>Table 1.</label>
<caption><p>Collation of experimental results</p></caption>
<table>
<thead>
<tr>
<th align="left" valign="top"></th>
<th align="center" valign="top"><bold>LLaMA</bold></th>
<th align="center" valign="top"><bold>GLM</bold></th>
<th align="center" valign="top"><bold>Kimi</bold></th>
<th align="center" valign="top"><bold>GPT-4</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" valign="top">SI without TE</td>
<td align="center" valign="top">0.68</td>
<td align="center" valign="top">0.72</td>
<td align="center" valign="top">0.69</td>
<td align="center" valign="top">0.73</td>
</tr>
<tr>
<td align="center" valign="top">SI</td>
<td align="center" valign="top">0.72</td>
<td align="center" valign="top">0.75</td>
<td align="center" valign="top">0.74</td>
<td align="center" valign="top">0.77</td>
</tr>
<tr>
<td align="center" valign="top">MI without TE</td>
<td align="center" valign="top">0.73</td>
<td align="center" valign="top">0.77</td>
<td align="center" valign="top">0.76</td>
<td align="center" valign="top">0.82</td>
</tr>
<tr>
<td align="center" valign="top">MI</td>
<td align="center" valign="top">0.79</td>
<td align="center" valign="top">0.82</td>
<td align="center" valign="top">0.82</td>
<td align="center" valign="top">0.84</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="sec4_2_2">
<title>Comparisons between multiple and single instructions</title>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> gives a case study comparing the differences between the two instruction strategies. This visual representation vividly captures the nuances in the responses generated by each method. The single instruction, while efficient for basic tasks, tends to yield only abbreviated metadata, providing a quick snapshot but lacking in depth. Conversely, multiple instructions, which are more elaborate and structured, result in a richer output that includes both the abbreviated forms and the full names of metadata elements. This comprehensive approach not only enhances the granularity of the data but also significantly improves the contextual understanding of the metadata, thereby facilitating more informed decision-making processes.</p>
<fig id="F3">
<label>Figure 3.</label>
<caption><p>Comparison of metadata extraction responses: single vs. multiple instructions</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="images\c56-fig3.jpg"><alt-text>none</alt-text></graphic>
</fig>
<p>This observation underscores a critical point that the depth and breadth of instructions have a direct impact on the quality of metadata extraction. By leveraging multiple instructions, we are essentially equipping the LLMs with a more detailed roadmap, enabling them to navigate the complexities of document structures and extract metadata that is not just accurate but also contextually rich.</p>
</sec>
<sec id="sec4_2_3">
<title>Ablation results</title>
<p>To further validate the significance of our trustworthiness estimation (TE) mechanism, we conducted an ablation study. The results, as indicated in <xref ref-type="table" rid="T1">Table 1</xref>, reveal a notable decrease in performance when the TE mechanism is not utilized. This decline underscores the pivotal role of the TE mechanism in bolstering the reliability of the LLMs. By incorporating TE, we effectively mitigate the inherent variability in LLM responses, ensuring that the extracted metadata is not only accurate but also consistent across different instances.</p>
</sec>
</sec>
</sec>
<sec id="sec5">
<title>Discussion and conclusion</title>
<p>Our study presents a comprehensive framework tailored for the power system industry, focusing on the systematic extraction of bibliographic metadata from standard documents with LLMs, contributing a significant advancement in document management and knowledge extraction in vertical fields such as the power grid.</p>
<sec id="sec5_1">
<title>Theoretical and practical implications</title>
<p>The theoretical underpinning of this research is profound, marking a pioneering effort in applying LLMs to metadata extraction within the power industry. We propose the first trial of a comprehensive and unified conceptualization of bibliographic metadata for power grid standard documents. On the other hand, the exploration of the simple-yet-powerful prompt-based strategy in a zero-code manner in our study transcends the conventional boundaries of AI application, offering a fresh lens through which to view the integration of artificial intelligence with industry-specific processes. By situating our study within the operational context of power systems, we contribute to the broader discourse on AI&#x2019;s role in enhancing operational efficiency and data integrity.</p>
<p>From a practical standpoint, this study underscores the transformative potential of LLMs in streamlining operational workflows and bolstering data management practices within the power industry (Schilling-Wilhelmi et al.,2024; De Santis et al.,2024). Compared with traditional knowledge extraction methods with high cost and complicated realization, such as regular expressions (RE; <xref rid="R4" ref-type="bibr">Chapman et al., 2017</xref>) and machine learning models (Fanni et al.,2023; <xref rid="R13" ref-type="bibr">Kang et al., 2020</xref>; Chowdhary &#x0026; <xref rid="R6" ref-type="bibr">Chowdhary, 2020</xref>), we demonstrate a universal, convenient but effective pipeline with LLMs. The ability of LLMs to parse through complex documents and extract critical metadata with high accuracy not only enhances the speed of information retrieval but also mitigates the risk of human error. In addition, it also provides a new but more user-friendly interaction mode with natural language, instead of complicated RE grammar rules and machine learning algorithms traditionally leveraged in metadata extraction.</p>
</sec>
<sec id="sec5_2">
<title>Limitations and future work</title>
<p>Of course, there are also limitations to our study. One limitation is the modest size of the dataset involved in the experiments, which, while representative of standard documents in the power industry, may not encapsulate the full diversity of documents encountered in real-world applications. In future work, we will continue to investigate the feasibility and robustness of our model by expanding the dataset to encompass a more comprehensive and heterogeneous collection of documents. Additionally, as we only explored a simple zero-shot prompt strategy to interact with LLMs, the model&#x2019;s performance, while basically satisfactory, may not fully account for the complexities and variability in language use across different documents and contents. Future work will investigate few-shot prompts to help LLMs better understand the tasks and also explore the integration of retrieval-augmented generation (RAG) to further improve accuracy in metadata extraction.</p>
</sec>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>This work was supported by State Grid Corporation of China (Comprehensive Construction Technology of Standard Text Resources and Key Elements, 5400-202318585A-3-2-ZN), and Key Laboratory of Semantic Publishing and Knowledge Service of the National Press and Publication Administration (Wuhan University).</p>
</ack>
<ref-list>
<title>References</title>
<ref id="R1"><element-citation publication-type="book"><person-group person-group-type="editor"><name><surname>Baca</surname><given-names>M.</given-names></name></person-group><year>2016</year><source>Introduction to metadata</source><publisher-name>Getty Publications</publisher-name></element-citation></ref>
<ref id="R2"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Bartz</surname><given-names>C.</given-names></name><name><surname>Yang</surname><given-names>H.</given-names></name><name><surname>Meinel</surname><given-names>C.</given-names></name></person-group><year>2017</year><article-title>STN-OCR: A single neural network for text detection and text recognition</article-title><source>arXiv preprint arXiv:1707.08831</source><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.1707.08831">10.48550/arXiv.1707.08831</ext-link></element-citation></ref>
<ref id="R3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>B&#x00FC;chter</surname><given-names>R. B.</given-names></name><name><surname>Weise</surname><given-names>A.</given-names></name><name><surname>Pieper</surname><given-names>D.</given-names></name></person-group><year>2020</year><article-title>Development, testing and use of data extraction forms in systematic reviews: a review of methodological guidance</article-title><source>BMC Medical Research Methodology</source><volume>20</volume><fpage>1</fpage><lpage>14</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1186/s12874-020-01143-3">10.1186/s12874-020-01143-3</ext-link></element-citation></ref>
<ref id="R4"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Chapman</surname><given-names>C.</given-names></name><name><surname>Wang</surname><given-names>P.</given-names></name><name><surname>Stolee</surname><given-names>K. T.</given-names></name></person-group><year>2017</year><article-title>Exploring regular expression comprehension</article-title><source>2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)</source><fpage>405</fpage><lpage>416</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5555/3155562.3155616">10.5555/3155562.3155616</ext-link></element-citation></ref>
<ref id="R5"><element-citation publication-type="other"><person-group person-group-type="author"><collab>China Power</collab></person-group><year>2024</year><comment>July 19</comment><article-title>State Grid lays a solid foundation for the construction of a new type of power system</article-title></element-citation></ref>
<ref id="R6"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Chowdhary</surname><given-names>K.</given-names></name><name><surname>Chowdhary</surname><given-names>K. R.</given-names></name></person-group><year>2020</year><article-title>Natural language processing</article-title><source>Fundamentals of artificial intelligence</source><fpage>603</fpage><lpage>649</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-81-322-3972-7">10.1007/978-81-322-3972-7</ext-link></element-citation></ref>
<ref id="R7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dagdelen</surname><given-names>J.</given-names></name><name><surname>Dunn</surname><given-names>A.</given-names></name><name><surname>Lee</surname><given-names>S.</given-names></name><name><surname>Walker</surname><given-names>N.</given-names></name><name><surname>Rosen</surname><given-names>A. S.</given-names></name><name><surname>Ceder</surname><given-names>G.</given-names></name><name><surname>Jain</surname><given-names>A.</given-names></name></person-group><year>2024</year><article-title>Structured information extraction from scientific text with large language models</article-title><source>Nature Communications</source><volume>15</volume><issue>1</issue><fpage>1418</fpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1038/s41467-024-45563-x">10.1038/s41467-024-45563-x</ext-link></element-citation></ref>
<ref id="R8"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>De Santis</surname><given-names>A.</given-names></name><name><surname>Balduini</surname><given-names>M.</given-names></name><name><surname>De Santis</surname><given-names>F.</given-names></name><name><surname>Proia</surname><given-names>A.</given-names></name><name><surname>Leo</surname><given-names>A.</given-names></name><name><surname>Brambilla</surname><given-names>M.</given-names></name><name><surname>Della Valle</surname><given-names>E.</given-names></name></person-group><year>2024</year><article-title>Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data</article-title><source>arXiv preprint arXiv:2408.01700</source><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-3-031-77847-6">10.1007/978-3-031-77847-6</ext-link></element-citation></ref>
<ref id="R9"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Fanni</surname><given-names>S. C.</given-names></name><name><surname>Febi</surname><given-names>M.</given-names></name><name><surname>Aghakhanyan</surname><given-names>G.</given-names></name><name><surname>Neri</surname><given-names>E.</given-names></name></person-group><year>2023</year><chapter-title>Natural language processing</chapter-title><source>Introduction to Artificial Intelligence</source><fpage>87</fpage><lpage>99</lpage><publisher-loc>Cham</publisher-loc><publisher-name>Springer International Publishing</publisher-name><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-3-031-25928-9">10.1007/978-3-031-25928-9</ext-link></element-citation></ref>
<ref id="R10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gal</surname><given-names>M. S.</given-names></name><name><surname>Rubinfeld</surname><given-names>D. L.</given-names></name></person-group><year>2019</year><article-title>Data standardization</article-title><source>NYUL Rev</source><volume>94</volume><fpage>737</fpage></element-citation></ref>
<ref id="R11"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>He</surname><given-names>Y.</given-names></name></person-group><year>2020</year><article-title>Research on text detection and recognition based on OCR recognition technology</article-title><source>2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE)</source><fpage>132</fpage><lpage>140</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICISCAE51034.2020.9236870">10.1109/ICISCAE51034.2020.9236870</ext-link></element-citation></ref>
<ref id="R12"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Kabongo</surname><given-names>S.</given-names></name><name><surname>D&#x2019;Souza</surname><given-names>J.</given-names></name></person-group><year>2024</year><article-title>Instruction Finetuning for Leaderboard Generation from Empirical AI Research</article-title><source>arXiv preprint arXiv:2408.10141</source><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.2408.10141">10.48550/arXiv.2408.10141</ext-link></element-citation></ref>
<ref id="R13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kang</surname><given-names>Y.</given-names></name><name><surname>Cai</surname><given-names>Z.</given-names></name><name><surname>Tan</surname><given-names>C. W.</given-names></name><name><surname>Huang</surname><given-names>Q.</given-names></name><name><surname>Liu</surname><given-names>H.</given-names></name></person-group><year>2020</year><article-title>Natural language processing (NLP) in management research: A literature review</article-title><source>Journal of Management Analytics</source><volume>7</volume><issue>2</issue><fpage>139</fpage><lpage>172</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1080/23270012.2020.1756939">10.1080/23270012.2020.1756939</ext-link></element-citation></ref>
<ref id="R14"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Kapoor</surname><given-names>A.</given-names></name><name><surname>Gulli</surname><given-names>A.</given-names></name><name><surname>Pal</surname><given-names>S.</given-names></name><name><surname>Chollet</surname><given-names>F.</given-names></name></person-group><year>2022</year><source>Deep Learning with TensorFlow and Keras: Build and deploy supervised, unsupervised, deep, and reinforcement learning models</source><publisher-name>Packt Publishing Ltd</publisher-name></element-citation></ref>
<ref id="R15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Karthick</surname><given-names>K.</given-names></name><name><surname>Ravindrakumar</surname><given-names>K. B.</given-names></name><name><surname>Francis</surname><given-names>R.</given-names></name><name><surname>Ilankannan</surname><given-names>S.</given-names></name></person-group><year>2019</year><article-title>Steps involved in text recognition and recent research in OCR; a study</article-title><source>International Journal of Recent Technology and Engineering</source><volume>8</volume><issue>1</issue><fpage>2277</fpage><lpage>3878</lpage></element-citation></ref>
<ref id="R16"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Liu</surname><given-names>J.</given-names></name><name><surname>Kong</surname><given-names>Y.</given-names></name><name><surname>Peng</surname><given-names>G.</given-names></name></person-group><year>2022</year><comment>June</comment><chapter-title>Interpreting the Development of Information Security Industry from Standards</chapter-title><source>International Conference on Human-Computer Interaction</source><fpage>372</fpage><lpage>391</lpage><publisher-loc>Cham</publisher-loc><publisher-name>Springer International Publishing</publisher-name><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-3-031-05463-1">10.1007/978-3-031-05463-1</ext-link></element-citation></ref>
<ref id="R17"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Papaluca</surname><given-names>A.</given-names></name><name><surname>Krefl</surname><given-names>D.</given-names></name><name><surname>Rodriguez</surname><given-names>S. M.</given-names></name><name><surname>Lensky</surname><given-names>A.</given-names></name><name><surname>Suominen</surname><given-names>H.</given-names></name></person-group><year>2023</year><article-title>Zero- and few-shots knowledge graph triplet extraction with large language models</article-title><source>arXiv preprint arXiv:2312.01954</source><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.2312.01954">10.48550/arXiv.2312.01954</ext-link></element-citation></ref>
<ref id="R18"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Proctor</surname><given-names>M.</given-names></name><name><surname>Fusco</surname><given-names>M.</given-names></name><name><surname>Vacchi</surname><given-names>E.</given-names></name><name><surname>Sottara</surname><given-names>D.</given-names></name></person-group><year>2019</year><article-title>Rule modularity and execution control enhancements for a Java-based rule engine</article-title><source>2019 IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)</source><fpage>89</fpage><lpage>96</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/AIKE.2019.00023">10.1109/AIKE.2019.00023</ext-link></element-citation></ref>
<ref id="R19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reyes</surname><given-names>J.</given-names></name><name><surname>Mula</surname><given-names>J.</given-names></name><name><surname>D&#x00ED;az-Madro&#x00F1;ero</surname><given-names>M.</given-names></name></person-group><year>2023</year><article-title>Development of a conceptual model for lean supply chain planning in industry 4.0: multidimensional analysis for operations management</article-title><source>Production Planning &#x0026; Control</source><volume>34</volume><issue>12</issue><fpage>1209</fpage><lpage>1224</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1080/09537287.2021.1993373">10.1080/09537287.2021.1993373</ext-link></element-citation></ref>
<ref id="R20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Riley</surname><given-names>J.</given-names></name></person-group><year>2017</year><article-title>Understanding metadata</article-title><source>Washington DC, United States: National Information Standards Organization</source><ext-link ext-link-type="uri" xlink:href="http://www.niso.org/publications/press/UnderstandingMetadata.pdf">http://www.niso.org/publications/press/UnderstandingMetadata.pdf</ext-link><volume>23</volume><fpage>7</fpage><lpage>10</lpage></element-citation></ref>
<ref id="R21"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Schilling-Wilhelmi</surname><given-names>M.</given-names></name><name><surname>R&#x00ED;os-Garc&#x00ED;a</surname><given-names>M.</given-names></name><name><surname>Shabih</surname><given-names>S.</given-names></name><name><surname>Gil</surname><given-names>M. V.</given-names></name><name><surname>Miret</surname><given-names>S.</given-names></name><name><surname>Koch</surname><given-names>C. T.</given-names></name><name><surname>Jablonka</surname><given-names>K. M.</given-names></name></person-group><year>2024</year><article-title>From Text to Insight: Large Language Models for Materials Science Data Extraction</article-title><source>arXiv preprint arXiv:2407.16867</source><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.2407.16867">10.48550/arXiv.2407.16867</ext-link></element-citation></ref>
<ref id="R22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shao</surname><given-names>W.</given-names></name><name><surname>Zhang</surname><given-names>R.</given-names></name><name><surname>Ji</surname><given-names>P.</given-names></name><name><surname>Fan</surname><given-names>D.</given-names></name><name><surname>Hu</surname><given-names>Y.</given-names></name><name><surname>Yan</surname><given-names>X.</given-names></name><name><surname>Chen</surname><given-names>L.</given-names></name></person-group><year>2024</year><article-title>Astronomical knowledge entity extraction in astrophysics journal articles via large language models</article-title><source>Research in Astronomy and Astrophysics</source><volume>24</volume><issue>6</issue><fpage>065012</fpage></element-citation></ref>
<ref id="R23"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Singh</surname><given-names>P.</given-names></name><name><surname>Manure</surname><given-names>A.</given-names></name></person-group><year>2019</year><source>Learn TensorFlow 2.0: Implement Machine Learning and Deep Learning Models with Python</source><publisher-name>Apress</publisher-name></element-citation></ref>
<ref id="R24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sivarajkumar</surname><given-names>S.</given-names></name><name><surname>Kelley</surname><given-names>M.</given-names></name><name><surname>Samolyk-Mazzanti</surname><given-names>A.</given-names></name><name><surname>Visweswaran</surname><given-names>S.</given-names></name><name><surname>Wang</surname><given-names>Y.</given-names></name></person-group><year>2024</year><article-title>An empirical evaluation of prompting strategies for large language models in zero-shot clinical natural language processing: algorithm development and validation study</article-title><source>JMIR Medical Informatics</source><volume>12</volume><fpage>e55318</fpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2196/55318">10.2196/55318</ext-link></element-citation></ref>
<ref id="R25"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Stevens</surname><given-names>E.</given-names></name><name><surname>Antiga</surname><given-names>L.</given-names></name><name><surname>Viehmann</surname><given-names>T.</given-names></name></person-group><year>2020</year><source>Deep learning with PyTorch</source><publisher-name>Manning Publications</publisher-name></element-citation></ref>
<ref id="R26"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Xu</surname><given-names>D.</given-names></name><name><surname>Chen</surname><given-names>W.</given-names></name><name><surname>Peng</surname><given-names>W.</given-names></name><name><surname>Zhang</surname><given-names>C.</given-names></name><name><surname>Xu</surname><given-names>T.</given-names></name><name><surname>Zhao</surname><given-names>X.</given-names></name><name><surname>Chen</surname><given-names>E.</given-names></name></person-group><year>2023</year><article-title>Large language models for generative information extraction: a survey</article-title><source>arXiv preprint arXiv:2312.17617</source><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s11704-024-40555-y">10.1007/s11704-024-40555-y</ext-link></element-citation></ref>
<ref id="R27"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Xue</surname><given-names>L.</given-names></name><name><surname>Zhang</surname><given-names>D.</given-names></name><name><surname>Dong</surname><given-names>Y.</given-names></name><name><surname>Tang</surname><given-names>J.</given-names></name></person-group><year>2024</year><article-title>AutoRE: Document-Level Relation Extraction with Large Language Models</article-title><source>Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics</source><comment>(Volume 3: System Demonstrations)</comment><fpage>211</fpage><lpage>220</lpage><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.2403.14888">10.48550/arXiv.2403.14888</ext-link></element-citation></ref>
<ref id="R28"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Zeng</surname><given-names>M. L.</given-names></name><name><surname>Qin</surname><given-names>J.</given-names></name></person-group><year>2020</year><source>Metadata</source><publisher-name>American Library Association</publisher-name></element-citation></ref>
<ref id="R29"><element-citation publication-type="other"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>J.</given-names></name><name><surname>Ullah</surname><given-names>N.</given-names></name><name><surname>Babbar</surname><given-names>R.</given-names></name></person-group><year>2024</year><article-title>Zero-shot learning over large output spaces: utilizing indirect knowledge extraction from large language models</article-title><source>arXiv preprint arXiv:2406.09288</source></element-citation></ref>
</ref-list>
</back>
</article>