Information Research

Vol. 31 No. 1 2026

Information need in metaverse recordings - A field study

Patrick Steinert, Jan Mischkies, Stefan Wagenpfeil, Ingo Frommholz, Matthias L. Hemmje

DOI: https://doi.org/10.47989/ir31163020

Abstract

Introduction. Metaverse Recordings (MVRs) represent an emerging and underexplored media type within the field of Multimedia Information Retrieval (MMIR). This paper presents findings from a field study aimed at understanding the user information needs and search behaviours specific to MVR retrieval.

Method. By conducting and analysing expert interviews, the study identifies application scenarios and the challenges of retrieving content from an MVR collection.

Analysis. The collected responses were analysed to reveal patterns in user needs and behaviours. Key focus areas included interest in time-series data generated during graphical rendering and data from related input-output devices, identified as crucial components for MVR retrieval.

Results. The findings outline existing application scenarios for MVRs, validate the need to capture specific data types in MVR retrieval, and emphasise the importance of user-tailored MVR retrieval systems. The study also identifies use cases and requirements essential for the design of effective MVR retrieval systems.

Conclusions. This research contributes to a foundational understanding of MVR retrieval and supports future efforts in system design and research within MMIR, enhancing the comprehension of information search behaviours in the context of the emerging multimedia type MVR.

Introduction

The rate of multimedia creation is accelerating. Digital cameras are ubiquitous, and social media has driven media generation on an immense scale, most recently boosting short-form video content. Furthermore, the COVID-19 crisis gave remote communication technologies a push, visible in the increased use of video conferencing and virtual conferences. Another trend has re-emerged in recent years: the idea of a persistent virtual space where people meet and live together, the metaverse.

The growing usage of platforms (KZero Worldwide, 2024) like Roblox (Wikipedia, 2023b) or Minecraft (Mojang, 2023) shows that people are heavily using virtual worlds. Trend reports assume even higher usage in the future (Gartner Inc., 2022). It is likely that people will create recordings of experiences in the virtual world, as they do in the real world. Early versions of this can be seen in YouTube videos (Bestie Let’s play, 2022) created for entertainment purposes.

Multimedia Information Retrieval (MMIR) (Rüger, 2010) is the field in computer science which addresses indexing and retrieval of multimedia content. The metaverse is built on virtual worlds, which are basically computer-generated multimedia. Therefore, this study examines the integration of Metaverse Recordings (MVRs) in MMIR.

Steinert et al. (2023; 2024b) have outlined the differences between metaverse content and other media types, such as format, structure and content. The analysis of the differences revealed a lack of MMIR support for metaverse content. The further integration of MVR in MMIR should be grounded in user demands. There is a noted gap in existing literature regarding the information needs specifically related to MVR retrieval. Understanding these information needs is essential for developing effective MVR retrieval systems.

MVR as an emerging multimedia type introduces challenges for integration in MMIR, related to the capture, organisation, and retrieval of content generated in virtual environments. One open question is whether user sessions, which constitute indirectly recorded metaverse content, are recorded in the field, and if so for which applications. Another significant challenge concerns the formats of data available in metaverse environments and how they align with user interests. Unlike traditional media recordings, MVRs can capture not only video and audio but also complex data formats such as movement patterns, eye-tracking information, and biosensor data. The potential for data capture in the metaverse is considerable, yet it remains unclear how these rich data formats align with users’ needs and interests. For example, while systems may be capable of reconstructing virtual scenes with mathematical precision, it is unclear whether users find such detailed data useful or necessary for their tasks. A further challenge lies in understanding users’ information needs and how they search for and retrieve MVRs. Little is known about the information searching behaviour specific to MVRs, and existing search systems are not yet tailored to the unique attributes of virtual worlds. Traditional search filters, such as date ranges, location, and event types, may not fully capture the complexity of user needs in the metaverse. Moreover, it is unclear how users express their information needs when searching for MVRs, as past queries and behaviours have not yet been documented.

The lack of understanding of the technical capabilities and user interests shows a critical research gap. Understanding which data types users value and how they search for MVRs is crucial for integrating MVR in MMIR and developing effective MVR retrieval systems. This paper presents a field study conducted with a small expert group. Based on interviews, we describe application scenarios and search behaviours of users, and how MMIR can support them.

The following sections present an overview of the metaverse and related technologies, information retrieval (IR), and MMIR in the section for state of the art and related work. The section on study design describes the field study design. The results section presents the results of the interviews. Finally, the conclusion and outlook section summarises the presented work and discusses future work.

State of the art and related work

In this section, the state of the art and related work for MVR retrieval are summarised.

Metaverse

The metaverse is an umbrella term for a set of digital technologies and use cases. Mystakidis (2022) defines the metaverse as a persistent, multiuser environment that blends physical reality with digital virtuality. It uses technologies like virtual reality (VR) (Alsop, 2023) and augmented reality (AR) (Wikipedia, 2023a) for multisensory interactions with virtual spaces, objects, and people. The metaverse connects immersive, social environments on multiuser platforms, allowing real-time communication and interaction with digital content. The idea of a metaverse originates in Stephenson’s (1992) novel Snow Crash, where it was described as a network of virtual worlds between which avatars could move; today the term also includes social VR platforms, online games, and AR collaboration spaces (Anderson & Rainie, 2022). Such application domains are described next.

Metaverse application domains

In the field of metaverse applications, several use cases that potentially produce MVRs, such as recorded user sessions, are imaginable or already visible.

Gunkel et al. (2018) studied metaverse experiences with users. Based on a survey, they report user interest in the domains of video conferencing, education, gaming, watching movies, and similar entertainment activities. The audience research company GWI researched use cases for the metaverse and lists watching TV and playing games as consumer applications (Morris, 2022). Hence, there is strong support for entertainment applications. An example of MVRs in this application domain is YouTube videos created from the metaverse (Bestie Let’s play, 2022).

Within the metaverse-related literature, the risk of cybercrime is mentioned. The International Criminal Police Organization (INTERPOL, 2024b) describes the potential harms and the concepts to counter them. Examples are crimes against children or sexual offenses and assault. The activities in question can be tracked by recording the relevant transaction data (INTERPOL, 2024a). While not conclusively demonstrated in the existing literature, it is conceivable that MVRs could be used for this purpose.

In addition to consumer-focused use cases, the industrial metaverse is relevant. It covers the use of metaverse technologies mainly for simulations of industrial processes (Zheng et al., 2022). Although not definitively described in current research, the review of such simulations is conceivable as a use case for MVRs. Such simulations can address scenarios that do not yet exist, and thus serve research and development use cases, or real scenarios mirrored in the virtual world. The latter is described as a digital twin. An example is the simulation of mining operations (Stothard et al., 2024).

Since many people record their personal life experiences digitally and share them online (Austin, 2023), it is conceivable that people will record their experiences in the metaverse as well. In this paper, this is termed personal use.

In conclusion, a list of potential application domains is compiled, shown in Table 1. The application domains of education and videoconferencing & collaboration are grounded in the research of Gunkel et al. (2018). Under the entertainment application domain, this paper groups applications such as gaming, watching TV, and listening to music. Law enforcement is anchored in the literature on the risks of the metaverse. Industrial metaverse is grounded in the literature about the metaverse. The personal use application domain is anchored in the described behaviour of people creating and sharing experiences on the Internet.

ID | Domain | Example
AD 2.1 | Education | VR training
AD 2.2 | Videoconferencing / Collaboration | Online business meetings
AD 2.3 | Entertainment / Video Gaming | Watching movies, playing games
AD 2.4 | Law Enforcement | Investigating crimes
AD 2.5 | Industrial Metaverse | Simulating car driving scenarios
AD 2.6 | Personal Use | Recording personal memories

Table 1. Metaverse application domains.

Next, the technical opportunities of MVRs are described.

Options to produce MVRs

In the realm of metaverse, there are several technical recording methodologies for user sessions to consider. These can be broadly categorised into three types, each with distinct characteristics and applications. Figure 1 illustrates the rendering process and the three types. This process takes scene raw data (SRD) and peripheral data (PD), also known as auxiliary data, as input. PD is derived from user devices, hence represents user behaviour. SRD includes information used by the computer to construct the virtual scene. Both inputs are used to create perceivable rendering output, in the form of 2D/3D image stream(s), audio streams, and actuator commands. The rendering output is recordable multimedia, which is defined as multimedia content objects (MMCOs). In summary, the three major categories of recordable data are MMCO, SRD, and PD.

Figure 1. Rendering process.

The first group, MMCO, shown as recordable multimedia in Figure 1, can be created by directly recording sessions as videos within metaverse applications, such as in Horizon Worlds. This approach captures both audio and video outputs of the virtual environment. The 256 metaverse records dataset (Steinert et al., 2024a) contains examples of such MVRs. An alternative within this category is the use of screen recorders to capture the audio-visual output of the rendered scenes.

The second group, SRD, visualised as input in Figure 1, comprises the rendering inputs used to create the virtual scene. These can be captured in two primary ways. The first is capturing scene graphs, which detail the objects present in a scene. The second is using network-transmitted information to record inputs from other players, including avatar positions and actions. This raw data, when combined with additional audio-visual elements like textures and colours, offers a comprehensive view of the scene. Tools exist to capture rendering raw data, such as Nvidia Ansel (Güngör, 2016) or RenderDoc (Karlsson, 2018).

The third group of recordable data is PD, shown as input in Figure 1. Recording PD provides supplementary information that enriches the primary recording. This data is either hard to reconstruct later, such as emotions detected from facial expressions as opposed to audio recordings, or impossible to reconstruct, such as the heart rate in specific situations.

As of now, it is unclear which recording method is superior. Audio-visual data is convenient for playback and compatible with existing tools, but extracting detailed information for analysis is challenging. On the other hand, the metaverse's technical nature allows for the utilisation of logs and interaction data, e.g. avatar presence and hand movements captured from controller data, to gain immediate insights without the need for extensive video analysis. Capturing and organising virtual world experiences for retrieval poses several unresolved challenges. These include how to represent rendered scenes in a way that preserves both technical structure and user interaction, how to map and synchronise multimedia content over time, and how to manage composite content from multiple sources. A core issue is establishing reliable connections between SRD, PD, and their rendered output, MMCO. Addressing these problems is essential for enabling effective retrieval and reuse of metaverse recordings.

Steinert et al. (2023) see value in the different types for different use cases. Hence, they described a model of the combination of data, visualised in Figure 2. If the data is combined in any form, they define this as composite multimedia content object (CMMCO). If there is a common time in the data, such as a timestamp or frame number, they define this as time-mapped CMMCO (TCMMCO). The 111 recordings dataset (Steinert, 2025) presents MVRs containing MMCO, SRD, and PD, hence CMMCO data. The data is recorded with timestamps; hence the MVRs can be considered TCMMCO.

Figure 2. Classification of MVR components (Steinert, 2023).
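The classification into MMCO, SRD, PD, CMMCO, and TCMMCO can be illustrated with a minimal data-structure sketch. The class and field names below are hypothetical and not taken from Steinert et al. (2023). The sketch further assumes that "combined in any form" means more than one component, and that "a common time" means every component carries timestamps.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """One recorded part of an MVR (hypothetical structure)."""
    kind: str                                 # "MMCO", "SRD", or "PD"
    name: str                                 # e.g. "screen_video", "scene_graph"
    timestamps: Optional[List[float]] = None  # seconds; None if no time reference


@dataclass
class MetaverseRecording:
    components: List[Component] = field(default_factory=list)

    def is_cmmco(self) -> bool:
        # Assumption: composite means more than one component is combined.
        return len(self.components) > 1

    def is_tcmmco(self) -> bool:
        # Assumption: time-mapped means every component has timestamps.
        return self.is_cmmco() and all(c.timestamps for c in self.components)


recording = MetaverseRecording([
    Component("MMCO", "screen_video", [0.0, 0.04, 0.08]),
    Component("SRD", "scene_graph", [0.0, 0.04, 0.08]),
    Component("PD", "heart_rate", [0.0, 1.0]),
])
print(recording.is_cmmco(), recording.is_tcmmco())
```

Under these assumptions, a recording with video, scene graph, and heart-rate data that all carry timestamps would be treated as a TCMMCO, while dropping the timestamps from any one component would leave it a CMMCO only.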

Information retrieval

Starting with the user, there is the need to search, find, and retrieve information. This happens when the user has a knowledge gap, which Belkin et al. (1993) describe as an anomalous state of knowledge (ASK). When the user has all information to complete a task, it is the normal state. If information is missing, it is ASK, which is an information need. This results in information-seeking behaviour, which is described by many scientific research articles.

Multimedia is any combination of media formats. In the case of MMIR, it refers to the different types of media that can be retrieved in a system. These can be any perceivable media, such as images or audio, or biometric sensor data.

The yearly Lifelog Search Challenge (Gurrin, 2025) addresses a use case comparable to what can be expected with metaverse content. The challenges address user experience questions for user interfaces that query multimedia originating from cameras worn by people to capture images of entire days of their lives. Corresponding data, such as biometrics and location, is gathered as well (Gurrin, 2021). The related literature provides example search terms used in the challenges but lacks specific descriptions of information search behaviour that can be related to MVR retrieval.

Summary

The metaverse is a term used to describe a set of technologies and use cases, including fully virtual environments and real-world environments with digital extensions. Different use cases that produce MVRs are conceivable or already observable. The production of MVRs leads to larger collections, which MMIR can make retrievable. Missing in the literature is an understanding of the information need and information searching behaviour required to integrate MVRs in MMIR. A field study can provide important findings here.

Study design

To create an understanding of the information need for MVR retrieval, a field study was conducted. This study employs a qualitative design to explore participants' experiences and perceptions through in-depth interviews. The field study was carried out as a series of interviews with experts in the field of the metaverse.

Expert selection

Following the application domains described in Table 1, relevant experts for the user groups listed in Table 2 were identified through a targeted search within the professional network of the authors, the research group at university, and the Immersive Collaboration Hub. Additional contacts were made through subject-specific communities, including VR vendors, consultants, meetup groups, and registered associations. Recruitment focused on individuals with demonstrable practical experience in metaverse or VR applications within the specified domains, for example, through direct involvement in user workshops or active participation in related projects. Twelve experts in the selected application domains were contacted. Six responded and agreed to a recorded interview. The resulting group of participants consists of six individuals (five German, one Italian), all male, with higher education backgrounds. Their professional expertise spans product design, programming, IT consulting, and documentary filmmaking; this offered a diverse and practice-oriented perspective on the study topics. The six people work for one or multiple companies or organisations and are active in an international but mostly European market. It proved challenging to identify individuals utilising the metaverse technologies for personal purposes, particularly those engaged in recording and subsequent retrieval of these experiences.

It was notable that the experts identified for one application domain were in fact able to provide information on other application domains, or at the least, on existing application scenarios in other domains, during the discussion.

ID | Domain | Potential MVR Retriever User Group
AD 2.1 | Education | (1) Organisers of VR Training
AD 2.2 | Videoconferencing / Collaboration | (2) Participants of Virtual Team Meetings, (3) Marketing Staff of Collaboration Platforms
AD 2.3 | Entertainment / Video Gaming | (4) Content Creators
AD 2.4 | Law Enforcement | (5) Law Enforcement Staff
AD 2.5 | Industrial Metaverse | (6) Quality Assurance Staff at Telecommunications Companies, (7) Market Research Staff
AD 2.6 | Personal Use | (8) Personal users of VR Headsets

Table 2. Potential MVR retriever user groups by domain.

The experts who responded were invited to a video conference call. During the recorded call, two interviewers were present: one led the interview, while the second took a mostly silent role and noted the responses. Both interviewers are experts in the field of multimedia retrieval. To structure the interviews, a questionnaire was created, which is explained next.

Questionnaire

The expert interviews were conducted as guided conversations, using a structured questionnaire to ensure consistency and depth of investigation. This questionnaire, shown in Table 7 and Table 8 in the Appendix, was divided into two main parts: application scenario validation and use context detail questions. The questionnaire served as a flexible guide, allowing interviewers to adapt their inquiries based on each expert's unique insights and experiences. This methodology facilitated a comprehensive examination of metaverse recording applications while maintaining the ability to delve into emerging or unexpected areas of interest.

In the first section, the interviewee is asked to name existing application scenarios (F1.1) and conceivable application scenarios (F1.2). Beforehand, we constructed application scenarios for which the interviewee was asked to assess whether they exist, are conceivable or not relevant (F1.3). If in this step an existing use case was identified, this use case was used to determine the information search behaviour.

The application scenario validation section aimed to explore known cases of metaverse recordings and assess the realism of previously hypothesised scenarios. Table 7 addresses the identification of application scenarios of metaverse recordings and MVR retrieval. After a brief explanation of MMIR in general and MVR retrieval in particular, interviewees were asked to describe familiar applications (F1.1, F1.2) and evaluate the feasibility of proposed application scenarios (F1.3). The use context detail questions focused on gathering in-depth information about a specific application scenario, preferably one described by the interviewee during the initial discussion. This approach allowed for a more targeted exploration of practical implications and potential challenges.

To understand the information search behaviour in an application scenario, the questionnaire asks for the information available at the start of a search (F3). Further, it is asked how the search query is constructed (F11), what is expected to be retrieved (F4, F5, F9), and how the result is presented (F15). Further questions check whether the retrieval is static or dynamic, and which kind of devices are likely used.

Results

After the interviews had taken place, they were analysed. In a first step, the application scenarios were summarised.

Analysis of the interviews

Based on the responses, several findings can be summarised.

Application domains and application scenarios

During the expert interviews, various application scenarios for MVR retrieval were identified by experts (F1.1-F1.3). Some of the prepared application scenarios were also validated as existing. Some experts contributed scenarios beyond their own domain. Table 3 lists these scenarios, showing who proposed and classified each one. Scenarios labelled as existing are already in use today, indicating real-world demand (market pull), while conceivable scenarios could become relevant in the future. Scenarios known before the survey are shaded in grey, and the original survey data can be found in the appendix. Expert F was unable to name an AS or classify one as existing or conceivable.

ID | Domain | Scenario Description | Expert | Rating
AS 1 | AD 2.1 Education | A professor of medicine searches the recordings for instances where specific surgical tools were used. | Expert A | Conceivable
AS 2 | AD 2.1 Education | Recordings from VR glasses are reviewed to edit instructional videos. | Expert C | Existing
AS 3 | AD 2.1 Education | Screen recordings of Spatial.io were recorded to organise training and teaching sessions | Expert A | Existing
AS 4 | AD 2.1 Education | Spatial videos are recorded and can be relived afterwards for training purposes. | Expert A | Conceivable
AS 5 | AD 2.1 Education | Welding training is carried out with the help of VR. Trainers can see in replays whether the weld seam has been drawn accurately. | Expert A | Conceivable
AS 6 | AD 2.2 Videoconferencing / Collaboration | An employee of a company is looking for a certain Excel graphic that could be contained in the recordings of meetings from the past few months. | Expert B | Conceivable
AS 7 | AD 2.2 Videoconferencing / Collaboration | Marketing employees of collaboration platforms search recordings for marketing purposes. | Expert B | Conceivable
AS 8 | AD 2.3 Entertainment / Video Gaming | Content creators search Metaverse Recordings looking for content to edit. | Expert C | Existing
AS 9 | AD 2.3 Entertainment / Video Gaming | Recordings of VR concerts are searched to create highlight reels or relive them with friends, for example. | Expert C | Conceivable
AS 10 | AD 2.3 Entertainment / Video Gaming | Parts of virtual experiences (rooms) are made available again to other companies (Business-to-Business) and may also be searched. | Expert B | Existing
AS 11 | AD 2.4 Law Enforcement | The investigators were informed of a sexual assault in a metaverse room. The investigators search the operators’ log files, which also contain short screen recordings, for an avatar wearing a red t-shirt. | Expert A | Conceivable
AS 12 | AD 2.2 Videoconferencing / Collaboration | XRSI meetings were held in VR-Space and published on YouTube. | Expert A | Existing
AS 13 | AD 2.4 Law Enforcement | Capture spatial videos of crime scenes to revisit and search them later in VR. | Expert A | Conceivable
AS 14 | AD 2.5 Industrial Metaverse / Research and Development | Behavioural researchers search metaverse recordings to better understand behaviours. | Expert D | Conceivable
AS 15 | AD 2.5 Industrial Metaverse / Research and Development | Market research employees analyse recordings of virtual supermarket visits. | Expert C | Existing
AS 16 | AD 2.5 Industrial Metaverse / Research and Development | Industrial products are presented in a digital experience. For sales discussions, specific products are searched within the experience. | Expert B | Conceivable
AS 17 | AD 2.5 Industrial Metaverse / Research and Development | Telecommunications technicians work in the field with AR glasses during line switching. This is recorded, reviewed by quality assurance employees, and specifically searched. | Expert D | Conceivable
AS 18 | AD 2.5 Industrial Metaverse / Research and Development | For example, certain movements could be searched to create training data for robots. | Expert D | Conceivable
AS 19 | AD 2.6 Personal Use | A person has recorded metaverse sessions and would like to experience them again. | Expert E | Conceivable
AS 20 | AD 2.6 Personal Use | Search spatial videos (e.g., from a smartphone) to experience them again with VR glasses. | Expert D | Existing
AS 21 | AD 2.6 Personal Use | Private gaming experiences (specifically golf) should be filterable, so good shots can be shared or rewatched for game optimisation. | Expert E | Existing

Table 3. Application scenarios validated by metaverse experts. Application scenarios highlighted in grey were proposed by the authors, others by the expert.

Table 4 summarises the ratings, where eight application scenarios were existing, and thirteen were conceivable (F1.1 - F1.3). All application domains except law enforcement have existing application scenarios, based on the knowledge of the experts. Overall, it can be confirmed that application scenarios for MVR retrieval exist, based on the current insights provided by the experts.

ID | Application Domain | Number of Cases | Rated Existing | Rated Conceivable
AD 2.1 | Education | 5 | 2 | 3
AD 2.2 | Videoconferencing / Collaboration | 3 | 1 | 2
AD 2.3 | Entertainment / Video Gaming | 3 | 2 | 1
AD 2.4 | Law Enforcement | 2 | 0 | 2
AD 2.5 | Industrial Metaverse | 5 | 1 | 4
AD 2.6 | Personal Use | 3 | 2 | 1
Total | | 21 | 8 | 13

Table 4. Rating of the application scenarios.
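The aggregation from Table 3 to Table 4 can be reproduced with a short tally. The sketch below transcribes the scenario ID, domain ID, and rating of each row in Table 3; the variable and function names are illustrative only.

```python
from collections import Counter

# (scenario ID, application domain, rating) transcribed from Table 3.
SCENARIOS = [
    (1, "AD 2.1", "Conceivable"), (2, "AD 2.1", "Existing"),
    (3, "AD 2.1", "Existing"), (4, "AD 2.1", "Conceivable"),
    (5, "AD 2.1", "Conceivable"), (6, "AD 2.2", "Conceivable"),
    (7, "AD 2.2", "Conceivable"), (8, "AD 2.3", "Existing"),
    (9, "AD 2.3", "Conceivable"), (10, "AD 2.3", "Existing"),
    (11, "AD 2.4", "Conceivable"), (12, "AD 2.2", "Existing"),
    (13, "AD 2.4", "Conceivable"), (14, "AD 2.5", "Conceivable"),
    (15, "AD 2.5", "Existing"), (16, "AD 2.5", "Conceivable"),
    (17, "AD 2.5", "Conceivable"), (18, "AD 2.5", "Conceivable"),
    (19, "AD 2.6", "Conceivable"), (20, "AD 2.6", "Existing"),
    (21, "AD 2.6", "Existing"),
]


def tally(scenarios):
    """Count ratings per domain: {domain: Counter({rating: n})}."""
    counts = {}
    for _, domain, rating in scenarios:
        counts.setdefault(domain, Counter())[rating] += 1
    return counts


counts = tally(SCENARIOS)
for domain in sorted(counts):
    c = counts[domain]
    print(domain, sum(c.values()), c["Existing"], c["Conceivable"])
```

Summing over all domains gives 8 existing and 13 conceivable scenarios, which matches the totals in Table 4.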

File types

Questions F5 and F9 asked in particular which kinds of data are recorded and relevant to retrieve. Not only the audio-video recordings are relevant, but also the sensor data and rendering data. Several application scenarios (AS1, partly AS7, AS8, AS17, AS11, AS15, AS21) can involve recording the 3D scene in such a way that it can be played back afterwards with the ability to look around. Several described application scenarios include multiple data types, summarised in Table 5. This supports the hypothesis that, in addition to MMCO, SRD and PD are relevant to record and should be supported in retrieval. It was not asked directly in the interviews, but the answers suggest that the MVR components are analysed and played back in a time-linked manner.

Mention | Type/Format | Classification in MVR taxonomy
Screen recording | Video 2D | MMCO
Engine data | Video 3D | SRD
Spatial video | Video 3D | MMCO
Multiple Videos | Multiple Videos | MMCO
3D Videos | Video 3D | SRD
Augmented video | Video 2D/3D | SRD/MMCO
Image | Image | MMCO
Audio | Audio | MMCO
Background processes | Text | SRD
Language | Text | SRD/Metadata
Session owner | Text | SRD
IP addresses | Text | SRD
Location data | Text | SRD (virtual), PD (real)
Sensor data | Various Formats | PD
Metadata (e.g. user) | Text | SRD
Chat | Text | MMCO
Document | Text Document | MMCO

Table 5. MVR-data types mentioned in the responses to questions F5 and F9.

Search types and query input methods

Search types and query input methods play a crucial role in IR systems, and thus in MVR retrieval systems. Table 6 summarises how often each input method for MVR searches was mentioned in the responses to F3 and F11. Keywords were mentioned most frequently (five occurrences), followed by natural language inputs, including voice commands (four occurrences), and image-based methods such as screenshots and sketches (three occurrences). Less frequently mentioned input methods include timestamps and metadata (two occurrences), audio inputs (two occurrences), and more specialised methods like location in-world, SPARQL queries, filters (such as participant groups), and interfaces with media asset management (MAM) systems, each mentioned once. While the data may not be exhaustive, it offers an indication of the input methods currently recognised within the scope of the expert responses.

This wide range of input methods underscores the need for flexible search functions in MVR retrieval systems. Indexing features must accommodate content from MMCO, SRD, and PD, ensuring that MVR indexing supports a comprehensive content analysis across all data types. Importantly, user interest often focuses on specific sequences rather than entire files, necessitating search methods that can efficiently target relevant segments. The type of search input varies significantly based on the application scenario, further highlighting the importance of adaptable and versatile retrieval systems to meet diverse user needs.

Input Method | Frequency
Keywords | 5
Natural Language (including voice) | 4
Image-based (screenshot, sketch, snipping) | 3
Timestamp/Metadata | 2
Audio | 2
Location in-world | 1
SPARQL | 1
Filters (like participant groups) | 1
Interface with MAM system | 1

Table 6. Frequency of mentioned input methods for MVR search.
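The observation that users search for specific sequences rather than entire files, most often by keyword, can be sketched as a segment-level index. This is a minimal illustration, not a system described by the experts: the `Segment` structure, the index terms, and the recording IDs are hypothetical, and the terms are assumed to come from some prior analysis of MMCO, SRD, or PD.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Segment:
    recording_id: str
    start: float      # seconds from the start of the recording
    end: float
    terms: Set[str]   # hypothetical index terms from MMCO, SRD, or PD analysis


def search_segments(index: List[Segment], query: str) -> List[Segment]:
    """Return segments whose terms contain every keyword of the query."""
    keywords = {word.lower() for word in query.split()}
    return [seg for seg in index if keywords <= seg.terms]


index = [
    Segment("mvr-001", 0.0, 12.5, {"avatar", "lobby"}),
    Segment("mvr-001", 12.5, 40.0, {"avatar", "red", "t-shirt"}),
    Segment("mvr-002", 5.0, 9.0, {"golf", "shot"}),
]
hits = search_segments(index, "red avatar")
for seg in hits:
    print(seg.recording_id, seg.start, seg.end)
```

Because each hit carries its own start and end time, a result list can point the user directly to the relevant part of a recording instead of the whole file.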

Further findings

The answers to the questions on search results (F4) confirm that a specific section of the MVRs is more relevant as a result than the entire MVR.

In terms of integration, the responses suggest a requirement for dynamic processing of MVRs, with potential for embedded processing capabilities. Regarding device preferences, computers and laptops emerge as primary platforms for MVR interaction, with growing potential for VR/AR devices. Smartphones and tablets appear to play a minimal role in this context. These observations have implications for the design and implementation of future MVR systems, emphasising the need for desktop-oriented interfaces adaptable to emerging AR/VR technologies.

The study aims to find general requirements for information search behaviour for metaverse recordings. However, two existing application scenarios in particular provided deeper insight: AS15, the analysis of supermarket visits, and AS2, the editing of instructional videos. In the development of supermarket concepts, increasing sales and improving the experience are relevant goals, which strongly depend on customer behaviour. To assess this behaviour and identify changes in the layout and arrangement of supermarket shelves, position data and gaze direction are recorded, the expert explained. Implementing this in VR simplifies the process. In addition to the video feed, other data, such as gaze direction and position, are captured (Adhanom et al., 2023; Meißner et al., 2019). These data fall into the category of PD. In subsequent analysis, the video data is overlaid with position and gaze direction to enhance the analytical process. IR can support this process by identifying similar or distinctly different behaviours, which can further improve the analysis. Also, the order in which products were looked at can be derived from the MVRs, either by manual inspection or by specialised content analysis. The boundary between a general search system and an analysis system is fluid in this case, which reinforces the argument that an MVR retrieval system should be seamlessly integrable, providing access to the PD and features from the MVR content analysis, as well as the query and retrieval capabilities.
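Overlaying video with gaze data requires a time mapping between the two streams. The following sketch pairs each video frame with the nearest gaze sample by timestamp. It is an illustration under assumptions, not the procedure used by the expert: the millisecond timestamps, the gaze target labels, and the function names are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


def nearest_sample(times: List[int], t: int) -> int:
    """Index of the sample time closest to t; times must be sorted and non-empty."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Pick whichever neighbour is closer to t (ties go to the earlier sample).
    return i if times[i] - t < t - times[i - 1] else i - 1


def overlay(frame_times: List[int],
            gaze: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Pair each video frame time with the nearest gaze target."""
    gaze_times = [t for t, _ in gaze]
    return [(ft, gaze[nearest_sample(gaze_times, ft)][1]) for ft in frame_times]


# Hypothetical data: frame times and gaze samples in milliseconds.
frames = [0, 500, 1000, 1500]
gaze = [(120, "shelf_a"), (900, "shelf_b"), (1600, "checkout")]
paired = overlay(frames, gaze)
print(paired)
```

The same pairing would apply to position data or other PD streams, provided they share a clock with the video.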

In the application scenario of instructional videos, the editor must construct the final video from elements of recorded VR videos, also called footage. To find relevant segments, the editor must scan through or search in the MVRs. This application scenario is at least similar to real-world video editing. The IR process can support the editor by showing similar segments as alternatives to the current segment. The utility of such a function increases with the quality of the semantic understanding of the segments, since recognising and searching for the instructional action requires a deeper understanding. The expert's explanation of the process described known challenges from the field of video retrieval, for example, object detection and boundary detection.
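
The idea of offering similar segments as alternatives can be sketched as a nearest-neighbour lookup over per-segment feature vectors. The Python example below is a minimal sketch under stated assumptions: each segment is assumed to have already been described by a numeric feature vector from some content analysis, cosine similarity is chosen only for illustration, and the names `cosine` and `similar_segments` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_segments(current_id: str,
                     features: Dict[str, Sequence[float]],
                     k: int = 3) -> List[Tuple[str, float]]:
    """Return the k segments most similar to the current one, best first.

    `features` maps a segment identifier to its (assumed precomputed) vector.
    """
    query = features[current_id]
    scored = [(seg_id, cosine(query, vec))
              for seg_id, vec in features.items()
              if seg_id != current_id]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

As noted above, the usefulness of such a ranking depends on how well the feature vectors capture the instructional action, which this sketch does not address.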

In conclusion, the results show a broad range of applications for MVR retrieval, with a correspondingly broad set of requirements. This leads to several challenges for MVR retrieval: MVR query construction, MVR content analysis, and MVR retrieval integration.

Discussion of limitations

This study has a few limitations that should be considered when interpreting its results. Firstly, the small sample size of only six experts from diverse application domains limits the generalisability of the findings. Additionally, the narrow geographic and cultural representation of the participants may not capture the full spectrum of perspectives on MVRs. Another significant limitation is the potential ambiguity in understanding user needs, as the study relied heavily on expert opinions rather than direct observations of existing systems. This approach was necessitated by the current inaccessibility of some key platforms, such as AltspaceVR (Wikipedia, 2024), which are no longer publicly visible. Consequently, this constraint also led to a limited exploration of data types within actual metaverse environments. These limitations underscore the need for future research to expand the sample size, broaden cultural representation, and incorporate direct observations of metaverse systems, when possible, to provide a more comprehensive understanding of MVR retrieval needs and challenges. Nevertheless, for future research on MVR retrieval, this study's results provide fundamental findings, insights into information needs, and indications of which MVR retrieval system features are most relevant to users.

Summary

The results of this study provide significant insights into the creation, usage, and information needs associated with MVRs. Firstly, our six-expert panel validated eight existing and thirteen conceivable MVR retrieval scenarios across six domains, providing compelling evidence that MVRs are actively being created and utilised in various field contexts. A central finding is the users' focus on locating specific segments within MVRs, rather than entire recordings, and the need to identify similar segments or their immediate temporal context (F4). The study clearly demonstrates that search methodologies and the relative importance of different data types vary considerably across the identified application scenarios (such as AS15 and AS2), reinforcing the demand for highly flexible, adaptable, and potentially user-tailored MVR retrieval systems capable of diverse query inputs (see Table 6). Crucially, the analysis of data types in Table 5 validates the practical relevance of all components of the proposed MVR taxonomy (MMCO, SRD, and PD) in real-world MVR contexts. This finding underscores the necessity for MVR retrieval systems to handle these distinct yet interconnected data streams. However, the relative importance and utilisation of these data types fluctuate significantly across different application scenarios, highlighting the complex and multifaceted nature of MVR data and the diverse requirements for effective information retrieval in metaverse environments.

Conclusions and outlook

Based on the results of this study, it is now feasible to model the context of use for MVR retrieval and develop a corresponding system design that can be both implemented and evaluated. The findings from the field study allowed for the validation and further detailing of application scenarios. Moreover, the information needs of users are better understood, as well as their information-searching behaviour in relation to MVR retrieval. The study confirms that time-series data, including SRD and PD, are not only technically feasible to capture, but are also highly relevant to user needs.

These results serve as a foundational basis for future research into MVR retrieval systems. For instance, they contribute to defining the context of use for such systems, identifying user stereotypes, and deriving specific use cases and other requirements for designing effective MVR retrieval systems. These contributions provide a structured approach to further development in this domain, ensuring that future systems align more closely with user needs and behaviour.

Acknowledgements

The authors thank the experts for their time, valuable insights, and contributions to the study.

About the authors

Patrick Steinert is a Ph.D. student and lecturer at the faculty of mathematics and computer science, University of Hagen, Hagen, Germany. Further, he works as a Technology Consultant for software solutions in media technologies at Qvest Digital AG, Bonn, Germany. His research interests include virtual worlds and multimedia technologies. He can be contacted at psteinert@acm.org.

Jan Mischkies is a graduate student at the faculty of mathematics and computer science, University of Hagen, Germany. He can be contacted at janmischkies@web.de.

Stefan Wagenpfeil is a Professor for Software Engineering and IT-Management at the Private University of Applied Sciences, Göttingen, Germany. He received his Ph.D. from the University of Hagen, and his research interests include Software Engineering, Multimedia Information Retrieval, Virtual Reality and Gamification. He can be contacted at s.wagenpfeil@pfh.de.

Ingo Frommholz is Professor and Head of the School of Applied Data Science at Modul University Vienna, Austria, and Adjunct Professor at the Bern University of Applied Sciences, Berne, Switzerland. With a general interest in data science and artificial intelligence (AI), his focus is on human-centric Information Retrieval, generative AI and its application in disciplines such as digital libraries. He can be contacted at ifrommholz@acm.org.

Matthias L. Hemmje is a full professor at the University in Hagen, Hagen, Germany. He is involved in research on Virtual Information and Knowledge Environments with special focus on distributed collaborative Digital Libraries, Multimedia Archives, Information Retrieval, Filtering, Linking, Enrichment, Personalization, and Information Visualization. He can be contacted at matthias.hemmje@fernuni-hagen.de.

References

Adhanom, I.B., MacNeilage, P. & Folmer, E. (2023). Eye tracking in virtual reality: A broad review of applications and challenges. Virtual Reality, 27, 1481–1505. https://doi.org/10.1007/s10055-022-00738-z

Alsop, T. (2023, August 31). Virtual reality (VR)—Statistics & facts. Statista. https://www.statista.com/topics/2532/virtual-reality-vr/ (Archived at https://web.archive.org/web/20250920142626/https://www.statista.com/topics/2532/virtual-reality-vr/)

Anderson, J., & Rainie, L. (2022, June 30). The metaverse in 2040. Pew Research Center. https://www.pewresearch.org/internet/2022/06/30/the-metaverse-in-2040/ (Archived at https://web.archive.org/web/20250921093518/https://www.pewresearch.org/internet/2022/06/30/the-metaverse-in-2040/)

Austin, D. (2023, April 20). 2023 Internet minute infographic. eDiscovery Today by Doug Austin. https://ediscoverytoday.com/2023/04/20/2023-internet-minute-infographic-by-ediscovery-today-and-ltmg-ediscovery-trends/ (Archived at https://web.archive.org/web/20250823180818/https://ediscoverytoday.com/2023/04/20/2023-internet-minute-infographic-by-ediscovery-today-and-ltmg-ediscovery-trends/)

Belkin, N. J., Marchetti, P. G., & Cool, C. (1993). BRAQUE: Design of an interface to support user interaction in information retrieval. Information Processing & Management, 29(3), 325–344. https://doi.org/10.1016/0306-4573(93)90059-M

Bestie Let’s Play (Director). (2022, October 16). Wir verbringen einen Herbsttag mit der Großfamilie!!/Roblox Bloxburg Family Roleplay Deutsch [Video recording]. https://www.youtube.com/watch?v=sslXNBKeqf0

Gartner Inc. (2022, August 30). Metaverse, web3 and crypto: Separating blockchain hype from reality [Interview]. Gartner. https://www.gartner.com/en/newsroom/press-releases/2022-08-30-metaverse-web3-and-crypto-separating-blockchain-hype-from-reality (Archived at https://web.archive.org/web/20250806055732/https://www.gartner.com/en/newsroom/press-releases/2022-08-30-metaverse-web3-and-crypto-separating-blockchain-hype-from-reality)

Güngör, A. (2016, May 17). Video: NVIDIA Ansel architecture explained. Technopat Sosyal. https://www.technopat.net/sosyal/konu/video-nvidia-ansel-architecture-explained.329263/ (Archived at https://web.archive.org/web/20251104185726/https://www.technopat.net/sosyal/konu/video-nvidia-ansel-architecture-explained.329263/)

Gunkel, S., Stokking, H., Prins, M., Niamut, O., Siahaan, E., & Cesar, P. (2018). Experiencing virtual reality together: Social VR use case study. Proceedings of the 2018 ACM International Conference on Interactive Experiences for TV and Online Video, 233–238. https://doi.org/10.1145/3210825.3213566

Gurrin, C. (2021). Personal data matters: New opportunities from lifelogs. 2021 16th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), 1–3. https://doi.org/10.1109/iSAI-NLP54397.2021.9678155

Gurrin, C., Zhou, L., Healy, G., Tran, A., Rossetto, L., Bailer, W., Dang-Nguyen, D., Hodges, S., Jónsson, B., Tran, M., & Schöffmann, K. (2025, June). Introduction to the 8th Annual Lifelog Search Challenge, LSC'25. In Proceedings of the 2025 International Conference on Multimedia Retrieval (pp. 2143-2144). https://doi.org/10.1145/3731715.3734579

The International Criminal Police Organization (INTERPOL). (2024a). Metaverse—A law enforcement perspective [Whitepaper]. https://www.interpol.int/News-and-Events/News/2024/Grooming-radicalization-and-cyber-attacks-INTERPOL-warns-of-Metacrime (Archived at https://web.archive.org/web/20250611211724/https://www.interpol.int/News-and-Events/News/2024/Grooming-radicalization-and-cyber-attacks-INTERPOL-warns-of-Metacrime)

The International Criminal Police Organization (INTERPOL). (2024b, January 18). Grooming, radicalization and cyber-attacks: INTERPOL warns of ‘Metacrime’. https://www.interpol.int/en/News-and-Events/News/2024/Grooming-radicalization-and-cyber-attacks-INTERPOL-warns-of-Metacrime (Archived at https://web.archive.org/web/20250906030108/https://www.interpol.int/en/News-and-Events/News/2024/Grooming-radicalization-and-cyber-attacks-INTERPOL-warns-of-Metacrime)

Karlsson, B. (2018). RenderDoc. https://renderdoc.org/ (Archived at https://web.archive.org/web/20251010074022/https://renderdoc.org/)

KZero Worldwide. (2024, February 6). Exploring the Q1 24’ metaverse radar chart: Key findings unveiled - KZero Worldswide. https://kzero.io/2024/02/06/2633/ (Archived at https://web.archive.org/web/20240206134622/https://kzero.io/2024/02/06/2633/)

Meißner, M., Pfeiffer, J., Pfeiffer, T., & Oppewal, H. (2019). Combining virtual reality and mobile eye tracking to provide a naturalistic experimental environment for shopper research. Journal of Business Research, 100, 445-458. https://doi.org/10.1016/j.jbusres.2017.09.028

Mojang. (2023, January 31). Minecraft official website. Minecraft.net. https://www.minecraft.net/de-de (Archived at https://web.archive.org/web/20251103172205/https://www.minecraft.net/de-de)

Morris, T. (2022, May 3). Understanding just what’s happening with the metaverse? GWI. https://blog.gwi.com/chart-of-the-week/metaverse-predictions/ (Archived at https://web.archive.org/web/20250807010715/https://www.gwi.com/blog/metaverse-predictions)

Mystakidis, S. (2022). Metaverse. Encyclopedia, 2(1), 486–497. https://doi.org/10.3390/encyclopedia2010031

Rüger, S. (2010). What is multimedia information retrieval? In S. Rüger, Multimedia Information Retrieval (pp. 1–12). Springer International Publishing. https://doi.org/10.1007/978-3-031-02269-2_1

Steinert, P., Wagenpfeil, S., Frommholz, I., & Hemmje, M. L. (2023). Towards the integration of metaverse and multimedia information retrieval. IEEE International Conference on Metrology for eXtended Reality, Milano, Italy, pp. 581-586. https://doi.org/10.1109/MetroXRAINE58569.2023.10405728

Steinert, P., Wagenpfeil, S., Frommholz, I., & Hemmje, M. L. (2024a, October). 256 metaverse records dataset. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 4256-4263). https://doi.org/10.1145/3664647.3681711

Steinert, P., Wagenpfeil, S., Frommholz, I., & Hemmje, M. L. (2024b, February). Integration of Metaverse Recordings in Multimedia Information Retrieval. In Proceedings of the 2024 13th International Conference on Software and Computer Applications (pp. 137-145). https://doi.org/10.1145/3651781.365180

Steinert, P., Wagenpfeil, S., Frommholz, I., & Hemmje, M. L. (2025). Uses of metaverse recordings in multimedia information retrieval. Multimedia, 1(1), 2. https://doi.org/10.3390/multimedia1010002

Stephenson, N. (1992). Snow crash. Bantam Books.

Stothard, P., Ryan, P., Kurata, T., & Stapleton, D. (2024). Towards a mining metaverse. Mining Technology: Transactions of the Institutions of Mining and Metallurgy, 25726668241242232. https://doi.org/10.1177/25726668241242232

Wikipedia. (2023a). Augmented reality. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Augmented_reality&oldid=1142793928 (Archived at https://web.archive.org/web/20240903103104/https://en.wikipedia.org/w/index.php?title=Augmented_reality&oldid=1142793928)

Wikipedia. (2023b). Roblox. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Roblox&oldid=1177660840 (Archived at https://web.archive.org/web/20241005054004/https://en.wikipedia.org/w/index.php?title=Roblox&oldid=1177660840)

Wikipedia. (2024). AltspaceVR. In Wikipedia. https://en.wikipedia.org/w/index.php?title=AltspaceVR&oldid=1248210194 (Archived at https://web.archive.org/web/20251104192245/https://en.wikipedia.org/w/index.php?title=AltspaceVR&oldid=1248210194)

Zheng, Z., Li, T., Li, B., Chai, X., Song, W., Chen, N., Zhou, Y., Lin, Y., & Li, R. (2022). Industrial metaverse: Connotation, features, technologies, applications and challenges. In W. Fan, L. Zhang, N. Li, & X. Song (Eds.), Methods and Applications for Modeling and Simulation of Complex Systems (pp. 239–263). Springer Nature Singapore. https://doi.org/10.1007/978-981-19-9198-1_19

Authors contributing to Information Research agree to publish their articles under a Creative Commons CC BY-NC 4.0 license, which gives third parties the right to copy and redistribute the material in any medium or format. It also gives third parties the right to remix, transform and build upon the material for any purpose, except commercial, on the condition that clear acknowledgment is given to the author(s) of the work, that a link to the license is provided and that it is made clear if changes have been made to the work. This must be done in a reasonable manner, and must not imply that the licensor endorses the use of the work by third parties. The author(s) retain copyright to the work. You can also read more at: https://publicera.kb.se/ir/openaccess

Appendix

Questionnaire

Table 7 contains the questions of part 1.

Identifier Question
F1.1 Can you identify application scenarios of MVR retrieval within your application domain? (Goal: Existing MVR retrieval)
F1.2 Can you imagine further scenarios beyond the identified application scenarios? (Goal: Conceivable MVR retrieval)
F2 To what extent do you consider the following application scenario conceivable? (Goal: Validation of modeled MVR retrieval scenarios)

Table 7. Expert panel questionnaire part 1 - Questions on application scenarios.

Table 8 contains the questions of part 2.

Identifier Question / Response Ideas
F3 What kind of information do users most likely have at the start of their search to begin the search process? Response ideas: Keywords; Example image; Example audio; Timestamp; Filename; Other
F11 Which components should a user interface for MVR retrieval include for querying? Response ideas: Keywords; Text field for natural language; Query language (SPARQL); Image input; Sketch; Audio input
F4 What type of search result might users be particularly interested in for MVR retrieval? Response ideas: A single relevant file; A collection of relevant files; A specific position within a file; Multiple positions within one or more files; Other
F5 What type of media might users be particularly interested in for MVR retrieval? Response ideas: Chat; Audio; Image; Video; Spatial video (3D video); Document; Combinations of the above; Other
F9 What kind of file types do the collections to be searched consist of? Response ideas: Screen recordings; Video raw files (engine data); Metadata/Log files; Sensor data; Other
F15 What components or combinations thereof should a user interface have for presenting results? Response ideas: Displaying one file at a time (details); Display all files as a ranked list sorted by relevance (ranked list); Option to filter results based on metadata (e.g., file type, creation date, etc.); Ability to open files (e.g., play back a video file to verify a result); Display additional files based on another similarity metric (recommendations); Other
F6 Is MVR retrieval a static or dynamic process?
F8 What type of application are MVR retrieval systems likely to be? (Standalone or Embedded)
F7 On which technical devices is the use of MVR retrieval most likely? Response ideas: Desktop PC; Laptop; Smartphone; Gaming console; Other

Table 8. Expert panel questionnaire part 2 - Questionnaire for application context gathering.

Expert responses

Expert responses can be provided upon request.