OpenAlex in focus: Metadata quality of publication type and language fields in an open peer review corpus

DOI:

https://doi.org/10.47989/ir31iConf64207

Keywords:

OpenAlex, Metadata quality, Publication type classification, Language metadata, Crossref

Abstract

Introduction. OpenAlex is widely used as a free bibliographic database for bibliometric and scholarly communication research. Despite its openness and coverage, its metadata contains inconsistencies that require systematic cleaning. This study examines metadata quality in OpenAlex, focusing on publication type and language.

Method. Publications on open peer review were retrieved from OpenAlex. After filtering and deduplication, 6,640 records were checked manually. The document type and language fields were cross-verified against publisher sources, and Crossref publication types were collected for comparison.

Analysis. Manual classification was harmonised across categories to ensure comparability. The main focus was to evaluate the agreement between OpenAlex and manual classifications of type and language, and to assess the consistency of Crossref publication types with both.
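As an illustration of the kind of agreement check described above, the following is a minimal sketch and not the authors' actual procedure. The field names (`openalex_type`, `manual_type`), the case-folding normalisation, and the toy records are all hypothetical; the study's harmonisation rules are not given in the abstract.

```python
from collections import Counter


def normalise(label):
    """Assumed harmonisation step: trim whitespace and ignore case."""
    return label.strip().lower()


def agreement(records, field_a, field_b):
    """Return the share of records whose two labels match,
    plus a Counter of the mismatched (label_a, label_b) pairs."""
    if not records:
        raise ValueError("records must not be empty")
    matches = 0
    mismatches = Counter()
    for rec in records:
        a = normalise(rec[field_a])
        b = normalise(rec[field_b])
        if a == b:
            matches += 1
        else:
            mismatches[(a, b)] += 1
    return matches / len(records), mismatches


# Hypothetical toy records, not data from the study.
records = [
    {"openalex_type": "article", "manual_type": "Article"},
    {"openalex_type": "article", "manual_type": "Editorial"},
    {"openalex_type": "article", "manual_type": "Letter"},
    {"openalex_type": "review", "manual_type": "Review"},
]

rate, mismatches = agreement(records, "openalex_type", "manual_type")
print(f"agreement: {rate:.0%}")
for (source_label, manual_label), n in mismatches.most_common():
    print(f"{source_label} -> {manual_label}: {n}")
```

The same function could be applied to a Crossref type column or a language column, provided the labels are first mapped onto a shared set of categories.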

Results. Of the 6,640 records, 2,878 (43%) showed publication type discrepancies, with the ‘Article’ label most often misapplied. Crossref types aligned more closely with OpenAlex in broad categories but diverged from the manual verification. In addition, 222 records (3.3%) had language mismatches, often an English label wrongly assigned to a non-English work.
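The reported shares can be reproduced from the counts given above. This short sketch uses only the three figures stated in the abstract; the variable and function names are illustrative.

```python
# Counts as reported in the abstract.
TOTAL_RECORDS = 6640
TYPE_DISCREPANCIES = 2878
LANGUAGE_MISMATCHES = 222


def share(count, total):
    """Percentage of `total` represented by `count`."""
    return 100 * count / total


type_share = share(TYPE_DISCREPANCIES, TOTAL_RECORDS)
language_share = share(LANGUAGE_MISMATCHES, TOTAL_RECORDS)

print(f"publication type discrepancies: {type_share:.1f}%")
print(f"language mismatches: {language_share:.1f}%")
```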

Conclusion(s). OpenAlex is a valuable infrastructure, yet its metadata for publication type and language shows notable inconsistencies. Researchers should apply systematic cleaning and validation before using OpenAlex or similar databases.

Published

2026-03-20

How to Cite

Doğan, G., & Sezen, A. N. (2026). OpenAlex in focus: Metadata quality of publication type and language fields in an open peer review corpus. Information Research an International Electronic Journal, 31(iConf), 1011–1020. https://doi.org/10.47989/ir31iConf64207

Section

Conference proceedings
