User Experience and Immersion of Interactive Omnidirectional Videos in CAVE Systems and Head-Mounted Displays

Omnidirectional video (ODV) is a medium that offers the viewer a 360-degree panoramic video view of the recorded setting. In recent years, various novel platforms for presenting such content have emerged. Many of these applications aim to offer an immersive and interactive experience for the user, but there has been little research on how immersive these solutions actually are. For this study, two interactive ODV (iODV) applications were evaluated: a CAVE system and a head-mounted display (HMD) application. We compared the users' expectations and experiences and the level of immersion between these systems. Both indoor and outdoor recorded environments were included. First, the results indicate that the users' experiences with these applications greatly exceeded their expectations. Second, the HMD application was found to be more immersive than the CAVE system. Based on the findings of this study, both systems seem to have great potential for presenting ODV content and for offering the user an immersive experience with both indoor and outdoor content.


Introduction
Omnidirectional videos (ODV) have been making their way into the mainstream in recent years. These videos are typically recorded with a set of cameras that together cover 360 degrees of the recorded scenery. ODV content has been utilized in several interactive applications, including capturing events such as mountain climbing 1 and musical concerts 2. As the full contents of these videos cannot be viewed as-is due to the limitations of the human field of view, they pose two main design challenges: presenting the content and interacting with it.
There are several different methods for ODV playback. Often the playback medium is some kind of virtual reality (VR) application, ranging from CAVEs (Cave Automatic Virtual Environment) [24] to HMDs [18], but ODV content can also be played with web-based applications (such as the 360-degree video support of YouTube and Facebook) and tablets [33]. In addition to the growing consumer markets, VR applications are used in many domains. For example, they have been found to be a promising tool for treating different kinds of phobias, such as acrophobia [6] and agoraphobia [21]. ODVs also have potential in industrial use, where they could replace, for example, 3-dimensional models or content recorded with a single camera, which are often used for demonstration or training purposes. While numerous interesting solutions and applications exist, a thorough understanding of omnidirectional video as a medium and its possibilities in different application domains is yet to be achieved. Our study focuses on iODVs: applications that utilize ODV and offer interaction beyond looking around the scene. This interaction could be, for example, activating UI elements for more information on different objects in the scene, or transitioning from one ODV scene to another.
One of the most important features of virtual reality applications, including the ones that utilize ODVs, is immersion. For example, in a study by Slater, Alberto and Usoh [27], the results indicated that individuals with a higher sense of immersion achieved better performance overall. The term itself has many definitions in the scientific community, but it is commonly referred to as the feeling of "being there". Our study looked into the differences in the feeling of immersion in two different interactive applications displaying omnidirectional video content: a CAVE system and an HMD application. Both mediums have been studied thoroughly in different contexts, but in our study we wanted to explore these applications further in the context of user experience and immersion. As they are both used extensively, e.g. in industrial use, the results from our study can help in designing future applications. Comparing two different methods of displaying interactive content can be very useful for future designs in this domain. In the two applications we implemented, the user could interact with the environment by activating either exits that took the user to another video or hotspots that offered the user contextual information about the environment. In addition to measuring the sense of immersion, we evaluated the user experience of both applications in order to validate them and to measure the differences in both expectations and experiences between the two systems. The user experience metrics measured the participants' opinions on, for example, the usefulness, pleasantness and clarity of the application. In addition, we compared the different video content types to see whether they differ in user experience or in the feeling of immersion.
Our main research questions for this study were:
- What are the differences in the user experience between CAVE and HMD applications?
- How immersive are interactive CAVE and HMD applications utilizing omnidirectional videos, and are there differences in the level of immersion between these two mediums?
Our findings suggest that the users' experiences greatly exceeded their expectations, especially with the HMD application. The user experience results were very positive in general, and both applications received high scores on the 7-point Likert scale on pleasantness, clarity and performance. One explanation for the contrast between expectations and experiences with the HMD application may lie in its "black box" nature, which offers the user barely any cues on the method of interaction or the overall experience. In the case of CAVE systems, their large size and futuristic look might increase the users' expectations. The positive feedback the HMD application received is also interesting when considering its technical limitations in the presentation of the content: our HMD application offered a relatively limited field of view of 60 degrees, which is much narrower than that of the human eye, whereas the CAVE system had no physical limitations on its field of view. Interestingly, none of the users reported this as a limitation.
Regarding immersion, our results indicate that ODV is a very immersive medium. Overall, the HMD application was considered more immersive than the CAVE system with both indoor and outdoor video content. For this difference, we have three explanations: a) the HMD obscures the outside world completely from the user, thus allowing them to better focus on the content, b) the sense of depth created by the stereoscopic effect (separate viewports for both eyes), and c) the viewport on the display is based on head orientation, allowing the user to look around naturally.
The motivation for this study stems from the extensive use of CAVE systems in various fields, e.g. in industry. We argue that HMD systems offer many unique and new application areas requiring immersion, and our results seem to support this argument. The benefits of HMDs come from their portability, as they are often small and mobile, and their scalability, as they are less dependent on specific equipment or physical setup. Omnidirectional content could prove useful, for example, in situations where several people manipulate large objects (such as skylifts) at the same time, as it can show relevant information in multiple directions. CAVE systems also have their uses, for example in situations where the information needs to be presented to multiple persons at the same time.
In the following, we first analyze and summarize the related work in this field of research, which is followed by a comprehensive description of both applications and their differences. Next, we introduce the methodology used in this study and then report the results of the evaluation along with the discussion of the main findings. We conclude the paper by discussing how these results could be used in designing more immersive interactive ODV applications that offer a better user experience.

Immersion in Virtual Environments
The term immersion has many definitions in the scientific community, and there is clearly some discrepancy on what the term actually means. There are no prior evaluations of immersion in interactive ODV applications, and therefore the related work presented below is based on studies of immersion in VR applications. Immersion is an important aspect of virtual reality applications, as it is believed to affect the user's behavior with and in these applications [31]. According to Slater [26], the level of immersion depends only on the system's rendering software and display technology. By this definition, immersion is objective and measurable. What some researchers refer to as immersion, Slater defines as presence. According to Slater, presence is "an individual and context-dependent user response" [26], as in the experience of 'being there'. In short, immersion is defined as the objective level of sensory fidelity the system provides, whereas presence refers to the user's subjective experience of and response to the system. Using Slater's definitions makes the level of immersion easier to measure, but restricts the evaluation so that it can be made only on the technological level. This includes only technical aspects such as field of view (FOV), field of regard (FOR), display size and resolution, and the use of stereoscopy. There are several evaluation methods for measuring immersion or presence (depending on the definition used), for example the ones by Witmer & Singer [32], Schubert, Friedmann & Regenbrecht [25] and Usoh et al. [30].
Immersion has also been studied extensively in the context of video games, and Brown and Cairns [5] attempted to resolve the disparity in the use of the term. They conducted a qualitative study amongst gamers and talked to them about their experience of playing video games. The study resulted in a grounded theory where immersion was used to describe a person's "degree of involvement with the game". This finding supported the idea that immersion is a cognitive phenomenon. The theory also identified barriers that could limit the degree of the user's involvement, which was described on three levels: engagement, engrossment and total immersion.
As the related work shows, immersion can be defined in several ways, depending on many factors such as the emphasis on technology, the research domain and the method of evaluation. In VR-related studies, Slater's [26] division into immersion and presence is more prevalent, whereas in video game related studies the term immersion is used more often. In this paper, immersion is referred to as a perceptual phenomenon that is dependent on the individual and the context.

Omnidirectional Videos
A lot of scientific research has been done to enable the use of omnidirectional video. There exists a large variety of algorithms and devices to capture, construct, project, compress, display and automatically analyze omnidirectional video content.
Application domains where omnidirectional video has received wider interest include remote operation and telepresence applications [4][20][8], some of which include automatic situation tracking based on the omnidirectional imagery and directional audio. Another application field identifiable in the literature is the remote operation of unmanned machines and vehicles, for example drones, by using omnidirectional video. Applications where omnidirectional video is used by consumers [17][13][3] provide immersive experiences of cultural content, e.g., in museums [15][14][19] and theatre [9]. Other application domains include education, e.g., teaching sign language [12], and health care, e.g., relieving stress during medical care [10], and therapy [23]. There has been little research on using ODV in industry, for example for demonstration or training purposes.
From the human-computer interaction perspective, augmenting omnidirectional video with interactive content [2] and UI elements [22] is a crucial feature in many applications. Another field is multisensory augmentation of video content, e.g., simulated wind [22], to further immerse the viewer and improve the sense of presence. Interaction studies have also looked at gesture-based interaction [34][24] and second screen interfaces [33] for interacting with omnidirectional video content. For example, Benko and Wilson [1] present the Pinch-the-Sky Dome, which projects a full 360-degree view of omnidirectional data onto the inner side of a dome-shaped structure. The view is controlled using mid-air gestures from anywhere inside the space, and it supports several simultaneous users. They found that mid-air gestures could enhance immersion in an omnidirectional context.

iODV Applications
In this section, we introduce the iODV applications that were built for this study. Both applications used the same ODV content, with each video lasting 60 seconds. When a video finishes, it starts again from the beginning. Both applications have two types of user interface elements: exits and hotspots. When activated, an exit takes the user to another video that is linked to that particular exit element, and a hotspot provides contextual information about the environment. First, we introduce the video production procedure used for content creation, and then explain the basic features and interaction techniques of both applications. Finally, we compare the main differences between these two applications.

Video Production
The videos used in this study were recorded with six GoPro 4 cameras attached to a Freedom360 mount on top of a tripod. The resulting six videos from each shot were converted into 4k omnidirectional videos by using AutoPano Video Pro 2 and AutoPano Giga 4 software. Panoramic images and videos are usually divided into either cylindrical (limited vertical field of view, VFOV) or spherical (360° x 180°) views.
For this study, we produced a total of six videos, three of which were shot indoors and three in an outdoor environment. Each video was roughly one minute long. The indoor videos were recorded in an industrial hall used for the repair and maintenance of skylifts. Each video contains some movement, such as people walking around and working, and a forklift driving around the hall. Two of the indoor videos were recorded from the top of a ladder to offer a better view of the surroundings. The outdoor videos were recorded in downtown Tampere, Finland. These videos were recorded during quiet hours, but nonetheless contained a relatively large amount of movement, i.e., people walking on the streets.

cCAVE
For our first experiment, we implemented a multimodal CAVE application, circular CAVE (cCAVE), where the user can explore omnidirectional videos via eight displays set in the form of an octagon. A cylindrical view with a horizontal FOV of 360 degrees and a vertical FOV of 150 degrees was used in the application. In this system, the user is located at the center of the octagon, sitting on a rotating chair (see Figure 1). The chair has a rotation sensor that sends the chair's rotation to the computer. This sensor data is used to update user interface elements on one of the displays, e.g. when the chair is pointing at specific coordinates. The application was developed with the Vizard virtual reality software. The omnidirectional video content is displayed on a 3-dimensional cylinder that is divided between the displays so that each monitor covers 45 degrees of the content.
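The division of the cylinder into eight 45-degree sectors can be illustrated with a short sketch. This is not the Vizard implementation used in cCAVE; the function names and the convention that sector 0 starts at a yaw of 0 degrees are assumptions made for illustration only.

```python
from typing import Tuple

NUM_DISPLAYS = 8                     # octagon of monitors (from the text)
SECTOR_DEG = 360.0 / NUM_DISPLAYS    # 45 degrees of content per monitor


def display_for_yaw(yaw_deg: float) -> int:
    """Return the index (0-7) of the monitor whose sector contains yaw_deg.

    Assumption: sector 0 spans [0, 45) degrees and indices grow with yaw.
    """
    # Wrap the angle into [0, 360) and guard against float edge cases.
    return int((yaw_deg % 360.0) // SECTOR_DEG) % NUM_DISPLAYS


def sector_bounds(index: int) -> Tuple[float, float]:
    """Return the (start, end) yaw range in degrees covered by a monitor."""
    start = (index % NUM_DISPLAYS) * SECTOR_DEG
    return start, start + SECTOR_DEG
```

With this convention, a chair rotation reported as -10 degrees wraps to 350 degrees and falls on the last monitor, which is the kind of lookup needed when a UI element has to be drawn on the display the chair is facing.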
Each interface element (the exits and hotspots mentioned earlier) has a coordinate range in which it is shown on the screen, i.e. it appears when the rotating chair is pointed within this range. The interface elements are triggered by dwelling, i.e. by focusing an element in the center of the view (by turning the chair towards it) and waiting for five seconds. Dwelling is a relatively common technique for selecting targets with e.g. gaze and mid-air gestures, and it is utilized by a number of applications (e.g. [16]). Before the hotspots are activated, they are presented on the screen as blue circles with an exclamation mark inside. Exits are presented as green arrows. During the activation period, the element is scaled up in order to visualize that it is being selected. Users can cancel the activation process by turning away from the element. Similarly, a hotspot dialog is closed by turning away from it.
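The dwell-based activation described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the code of either application: the class name, the element identifiers and the per-frame `update` interface are hypothetical, and only the five-second threshold and the cancel-on-turning-away behavior come from the text.

```python
from typing import Optional


class DwellSelector:
    """Activates an element once it has stayed in focus for dwell_time_s."""

    def __init__(self, dwell_time_s: float = 5.0) -> None:
        self.dwell_time_s = dwell_time_s
        self._target: Optional[str] = None   # element currently in focus
        self._elapsed_s = 0.0                # time spent on that element
        self._fired = False                  # avoid re-activating while focused

    def update(self, focused: Optional[str], dt_s: float) -> Optional[str]:
        """Advance by dt_s seconds; return the element id when it activates."""
        if focused != self._target:
            # Turning towards another element, or away, cancels the activation.
            self._target = focused
            self._elapsed_s = 0.0
            self._fired = False
            return None
        if focused is None or self._fired:
            return None
        self._elapsed_s += dt_s
        if self._elapsed_s >= self.dwell_time_s:
            self._fired = True
            return focused
        return None

    @property
    def progress(self) -> float:
        """Fraction of the dwell time completed, e.g. to scale the icon up."""
        return min(self._elapsed_s / self.dwell_time_s, 1.0)
```

The `progress` value could drive the scale-up animation mentioned in the text, and passing `None` as the focused element models the user turning away.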
We used a set of eight Eyevis Eye-LCD 4000 M/W monitors. Each monitor has a screen diagonal of 40 inches with full HD resolution, and the monitors were raised 77 cm from the ground. They were 91 cm high, 53 cm wide and 13 cm thick. The bezel between two monitors was 28 mm (14 mm per monitor). The monitors were set up so that they covered an area of 360 degrees around the user. The rotating chair's seat height was adjusted to 50 cm, and the distance from the user's head to the monitors was approximately 60 cm. The outer walls of the cCAVE installation were 175 cm wide and 192 cm high. The total resolution for the application was 4320 x 3840 pixels. The monitors were connected to an AMD HD 7870 display adapter with a 1 GHz processor and 2 gigabytes of GDDR5 memory.
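As a rough plausibility check of the reported dimensions, the distance from the center of a regular octagon to the middle of each side can be computed from the monitor width. The sketch below assumes that the 53 cm monitor width forms one side of a regular octagon and that each portrait-oriented full HD monitor shows 1080 pixels horizontally; both are our reading of the figures above, not statements from the original setup description.

```python
import math


def octagon_apothem_cm(side_cm: float, sides: int = 8) -> float:
    """Distance from the center of a regular polygon to the middle of a side."""
    return side_cm / (2.0 * math.tan(math.pi / sides))


def pixels_per_degree(horizontal_px: int, sector_deg: float) -> float:
    """Average horizontal pixel density across one monitor's sector."""
    return horizontal_px / sector_deg


apothem = octagon_apothem_cm(53.0)        # roughly 64 cm
density = pixels_per_degree(1080, 45.0)   # 24.0 pixels per degree
```

An apothem of roughly 64 cm is in line with the reported head-to-monitor distance of approximately 60 cm, given that the user's head is not exactly at the center of the octagon.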

Amaze360
Amaze360 is an iODV application for HMDs that allows the user to freely observe omnidirectional videos by simply turning their head in the desired direction. The screen is divided into two separate viewports in order to create a stereoscopic effect, thus creating a sense of depth. This effect is achieved with the spherical presentation of the video content, as the video content itself is not stereoscopic. The video content used by the application has a 360-degree horizontal and 180-degree vertical field of view, and the video is projected on a virtual sphere. The viewport's field of view is 60 degrees.
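To make the relation between head orientation and the 360 x 180 degree video concrete, the following sketch maps a viewing direction to a pixel position in an equirectangular frame. It is an illustration only: Amaze360 itself is a Unity application, the equirectangular layout and the 3840 x 1920 frame size are assumptions, and the angle conventions (yaw 0 at the horizontal center, positive pitch upwards) are chosen for this example.

```python
from typing import Tuple


def direction_to_equirect(
    yaw_deg: float, pitch_deg: float, width_px: int, height_px: int
) -> Tuple[float, float]:
    """Map a view direction to (x, y) in an equirectangular video frame.

    Assumptions: yaw 0 is the horizontal center of the frame and wraps at
    +/-180 degrees; pitch +90 is the top row and -90 the bottom row.
    """
    u = (yaw_deg / 360.0 + 0.5) % 1.0    # horizontal position in [0, 1)
    v = 0.5 - pitch_deg / 180.0          # vertical position in [0, 1]
    return u * width_px, v * height_px


def viewport_width_px(fov_deg: float, width_px: int) -> float:
    """Horizontal extent of the frame covered by a viewport of fov_deg."""
    return fov_deg / 360.0 * width_px
```

Under these assumptions, a 60-degree viewport covers one sixth of the frame width at the horizon, which gives a sense of how small a part of the content is visible at any one time.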
Interface elements (exits and hotspots) in Amaze360 are also triggered by dwelling, but with slight differences. These elements are activated by focusing on an element in the center of the view (by turning the head towards it) and waiting for two seconds. The hotspot and exit icons in Amaze360 are similar to the ones used in cCAVE (a blue circle with an exclamation mark inside for the hotspots, and green arrows for the exits). The entire setup and a screenshot of the Amaze360 application with hotspot activation can be seen in Figure 2.
Amaze360 is a C# application built on the Unity platform, and it utilizes the Oculus Mobile SDK 1.0.0.0 for iODV features. The application also uses the Easy Movie Texture plugin to enable smooth video playback on mobile devices. It is run on a Samsung Note 4 and utilizes the Samsung Gear headset.

Differences between the applications
Even though the two applications are intended for the same purpose, there are obvious differences ranging from physical setup and display devices to interaction mechanics. These differences further affected some design choices for both applications. A general overview of the features and differences can be seen in Table 1. The primary difference between the two applications is in how content is presented: cCAVE shows the ODV on multiple monitors, whereas Amaze360 uses a stereoscopic presentation on a mobile device. In other words, cCAVE always physically displays the full 360-degree view of the content. Therefore, the user sees the content with the full field of view of the human eye. Amaze360, on the other hand, is limited to a 60-degree sector of the content at any given time.
Another major difference is in how the applications are interacted with, i.e. how hotspots and exits are activated. The cCAVE system utilizes the rotation of the chair, and therefore only uses the X axis (the chair's rotation relative to the screens) for activating UI elements. Amaze360 relies on head orientation, and hence uses both the X and Y axes. For an illustration of these differences, see Figure 3. Due to the difference in how UI elements are activated, the applications also vary in how contextual information is presented. In cCAVE, textual content is shown (when a hotspot is activated) at the bottom of the screen. This design choice was made so that the textual content would not obscure the object it is referring to. In Amaze360, textual information was presented on top of the corresponding hotspot (see Figure 4). This was due to the interaction method: as the user activates hotspots by turning their head towards them, it makes sense that the information is displayed in the same position so that the user does not need to adjust their head once more. Furthermore, this allows closing activated hotspots by turning the head away from them, similar to closing hotspots in cCAVE by rotating the chair to another position. Finally, the activation time for UI elements also differed between the applications because of conclusions made during pilot testing: a short activation time sometimes caused accidental activations in the CAVE system, whereas with Amaze360 these were not as prevalent. This was caused by the slower interaction with the chair: turning one's head is much faster and more precise than turning on a chair. The pilot tests verified that the Amaze360 application could have a significantly shorter activation time (2 seconds) for the UI elements than the CAVE system (5 seconds).
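The one-axis and two-axis activation schemes can be contrasted with a small focus test. The sketch below is a simplified illustration, not code from either application: the tolerance value and function names are hypothetical, and the two-axis test uses a simple per-axis comparison rather than a true angular distance on the sphere.

```python
def angle_diff_deg(a_deg: float, b_deg: float) -> float:
    """Signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def in_focus_yaw_only(
    chair_yaw: float, target_yaw: float, tolerance_deg: float
) -> bool:
    """cCAVE-style test: only the horizontal rotation (X axis) matters."""
    return abs(angle_diff_deg(chair_yaw, target_yaw)) <= tolerance_deg


def in_focus_yaw_pitch(
    head_yaw: float,
    head_pitch: float,
    target_yaw: float,
    target_pitch: float,
    tolerance_deg: float,
) -> bool:
    """HMD-style test: both horizontal (X) and vertical (Y) angles matter."""
    yaw_ok = abs(angle_diff_deg(head_yaw, target_yaw)) <= tolerance_deg
    pitch_ok = abs(head_pitch - target_pitch) <= tolerance_deg
    return yaw_ok and pitch_ok
```

With a hypothetical tolerance of 15 degrees, a user facing the right horizontal direction but looking 40 degrees above the element would trigger the yaw-only test but not the yaw-and-pitch test, which illustrates why the two systems can behave differently for the same target.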

Experiment of CAVE and HMD
For this study, we conducted two separate experiments which evaluated the user experience, level of immersion and spatial abilities in immersive virtual environments that utilize omnidirectional videos. Experiment 1 was conducted with the CAVE system and Experiment 2 was conducted with an HMD and the Amaze360 application.

Participants
A total of 34 participants took part in the study, with 17 participants in each experiment. The cCAVE was evaluated by 8 females and 9 males aged 30.9 on average (SD = 5.46) and the Amaze360 system also by 8 females and 9 males with an average age of 30.7 (SD = 5.43). They were recruited from around a university campus and were compensated with a movie ticket for their participation. All participants were naïve with respect to interacting with omnidirectional videos, in that they had not used CAVE, HMD or other types of applications that utilize this type of video.

Procedure
In the evaluation scenario, the participants were asked to explore the virtual environments that consist of omnidirectional videos. Both indoor and outdoor environments were presented to the user as separate scenarios (one could not move from indoor locations to outdoor locations, or vice versa). They could move from one location to another after they had spent thirty seconds in one location. The time limitation was set in order to encourage exploration and looking around the scenery instead of just moving quickly from one scene to another. Each location also contained two hotspots which, when activated, offered contextual information about the object they were referring to. Both indoor and outdoor video content consisted of three different locations, and the last location led the user back to the first one, which made it possible for the participant to explore the locations indefinitely.
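The scenario structure described above (three locations linked in a cycle, with exits enabled only after thirty seconds) can be sketched as follows. The class and the location names are hypothetical and serve only to illustrate the logic; they are not taken from the applications.

```python
from typing import List

MIN_STAY_S = 30.0   # time a user must spend in a location before exiting


class Scenario:
    """A cycle of locations, e.g. the three indoor or three outdoor videos."""

    def __init__(self, locations: List[str]) -> None:
        self.locations = list(locations)
        self.index = 0
        self.time_in_location_s = 0.0

    @property
    def current(self) -> str:
        return self.locations[self.index]

    def tick(self, dt_s: float) -> None:
        """Accumulate the time spent in the current location."""
        self.time_in_location_s += dt_s

    def exit_enabled(self) -> bool:
        return self.time_in_location_s >= MIN_STAY_S

    def take_exit(self) -> bool:
        """Move to the next location if allowed; the last wraps to the first."""
        if not self.exit_enabled():
            return False
        self.index = (self.index + 1) % len(self.locations)
        self.time_in_location_s = 0.0
        return True
```

Because the index wraps around, a participant can keep moving through the locations for as long as they wish, which matches the open-ended exploration used in the experiments.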
No specific tasks were given to the participants because we wanted to emphasize the explorative nature of the experiment. This way the participants could concentrate solely on experiencing the virtual environment. The users could use the system under evaluation for as long as they wanted. They informed the researcher when they were finished with each scenario (indoor and outdoor). Participants used each system (both indoor and outdoor scenarios combined) for approximately 10 minutes on average.
In Experiment 1, the participants used the cCAVE system in a laboratory setting while sitting on the rotating chair. In Experiment 2, they used the Amaze360 application, also in a laboratory setting, while standing and wearing the HMD. Both locations were approximately the same size. For both experiments, conditions were balanced so that half of the users started using the system in indoor locations and the remaining half in outdoor locations. A researcher was present during the procedure for support in case of a technical fault or other disturbance, but did not otherwise intervene in the evaluation.

Data collection and analysis
We gathered general information from all participants, including age, gender, and experience level with iODV applications. For the user experience evaluation, we used the SUXES [29] method. It is an evaluation method for collecting subjective user feedback on multimodal systems. In this method, the participants fill out a subjective feedback form about their expectations and experiences of using the system. The form consisted of 9 user experience related claims to which the participants responded on a 7-point Likert scale, where 1 = "Totally disagree", 4 = "Neither agree nor disagree" and 7 = "Totally agree".
Participants filled out the expectations form after they had been informed of the procedure and had been shown the basics of the system, but before they personally experienced the system. Then, after they had used the application, they reported their experiences on a similar form. In addition, after the experiment, participants answered a question regarding their level of immersion during the use of the system ("While using the system, I felt like I was actually standing on the streets/industrial hall"). The same 7-point Likert scale was used for the questions regarding immersion. We decided to disregard the existing evaluation methods for measuring immersion for practical reasons: our custom-made questionnaire allows us to compare the results with the UX results for different modalities using the SUXES method [29]. Finally, we logged basic interactions with timestamps in both systems, such as start and end times of the application, activations of hotspots, and movements from one video to another. We also considered adding the Santa Barbara Sense-of-Direction questionnaire [11] to the evaluation, but decided against it as the evaluation itself was not about measuring spatial ability.
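The kind of timestamped interaction log described above can be sketched with a few lines of code. The event names (`app_start`, `hotspot_activated`, `scene_change`, `app_end`) and the data layout are hypothetical; the original logging format is not described in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    timestamp_s: float   # seconds since the start of the session
    kind: str            # hypothetical event name, e.g. "hotspot_activated"
    target: str = ""     # element or location the event refers to


@dataclass
class InteractionLog:
    events: List[Event] = field(default_factory=list)

    def record(self, timestamp_s: float, kind: str, target: str = "") -> None:
        self.events.append(Event(timestamp_s, kind, target))

    def count(self, kind: str) -> int:
        """Number of logged events of a given kind."""
        return sum(1 for e in self.events if e.kind == kind)

    def session_length_s(self) -> Optional[float]:
        """Time between the first start event and the last end event."""
        starts = [e.timestamp_s for e in self.events if e.kind == "app_start"]
        ends = [e.timestamp_s for e in self.events if e.kind == "app_end"]
        if not starts or not ends:
            return None
        return ends[-1] - starts[0]
```

A log of this shape is sufficient for deriving per-participant use times and activation counts of the kind reported in the results.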

Results
The main research interests in this study were the feeling of immersion and the user experience with the two applications. In addition, we report the results from logged interaction data. For all results, a Bonferroni-corrected independent t-test was conducted to compare the results between the two systems. Here, we treat the disagree-agree Likert-type scale as equidistant, which is why the t-test was used for analyzing the results. For the statistical analysis, an average UX score of both indoor and outdoor video content was used.
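For readers unfamiliar with the Bonferroni correction, the idea is to divide the significance level by the number of comparisons made. The sketch below is a generic illustration; the number of comparisons and the p-values in the usage example are hypothetical and are not taken from this study.

```python
from typing import List


def bonferroni_alpha(alpha: float, num_comparisons: int) -> float:
    """Per-comparison significance level after Bonferroni correction."""
    if num_comparisons < 1:
        raise ValueError("num_comparisons must be at least 1")
    return alpha / num_comparisons


def bonferroni_significant(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag each p-value as significant under the corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values from three comparisons (not data from the study).
flags = bonferroni_significant([0.001, 0.02, 0.04])
```

With three comparisons and an overall level of 0.05, each individual test is compared against roughly 0.0167, so only the first hypothetical p-value would count as significant.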

Expectations versus Experiences
When comparing the UX results of the two experiments, statistically significant differences were discovered between the expectations and the actual user experience for both applications, especially with the HMD. For average UX ratings on all statements in both systems, see Figure 5. In almost all metrics the actual user experience exceeded the expectations, especially so with the HMD. Both systems were considered very easy to learn by the participants. All participants except for two in the first experiment and one in the second one agreed (scored either 5, 6 or 7 on the Likert scale) with the statement that the system is useful (Experiment 1, M = 5.29, SD = 1.047 and Experiment 2, M = 5.82, SD = .883).
Regarding the user experience, the questionnaire results on both applications were generally positive. With cCAVE, 88 % of the users gave positive feedback (scored either 5, 6 or 7 on the Likert scale) on the system's usefulness. 82 % of the users thought that the system was pleasant to use, and 100 % of the users felt that the use of the system is easy to learn (where 2.9 % ranked it at 5, 29.5 % ranked it at 6 and 67.6 % ranked it at 7 on the Likert scale). The HMD application received even more positive results, where 94 % of the users thought that the system is useful and pleasant to use. As with cCAVE, all of the HMD users felt that the system is easy to learn.
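The percentages above follow from counting the responses of 5, 6 or 7 on the 7-point scale. The following sketch shows the computation; the individual response values are hypothetical and were chosen only so that 15 of 17 answers are positive, which matches the reported 88 % for the usefulness of cCAVE.

```python
from typing import List


def positive_share(responses: List[int], threshold: int = 5) -> float:
    """Share of Likert responses at or above the threshold (5, 6 or 7)."""
    if not responses:
        raise ValueError("responses must not be empty")
    positive = sum(1 for r in responses if r >= threshold)
    return positive / len(responses)


# Hypothetical responses from 17 participants: 15 positive, 2 not positive.
example = [7, 6, 6, 5, 5, 5, 6, 7, 5, 6, 5, 6, 5, 5, 6, 4, 3]
share_percent = round(100 * positive_share(example))
```

The same function applied to 16 positive answers out of 17 gives the 94 % reported for the HMD application.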

Immersion and System Interaction
The main interest in addition to the user experience was the feeling of immersion experienced during use. Between the two applications, statistically significant differences were observed with both indoor video content (Experiment 1, M = 5.18, SD = 1.629 and Experiment 2, M = 6.18, SD = .883); t(32) = -2.225, p < 0.05, and outdoor video content (Experiment 1, M = 5.18, SD = 1.510 and Experiment 2, M = 6.29, SD = .686); t(32) = -2.779, p < 0.05. The immersion level of participants for both applications with indoor and outdoor videos can be seen in Figure 6.
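The reported t statistics can be approximately reproduced from the summary statistics with a pooled-variance independent t-test. The sketch below assumes equal variances and 17 participants per group, which is consistent with the reported 32 degrees of freedom; because it uses the rounded means and standard deviations, the result can differ slightly from the published values.

```python
import math
from typing import Tuple


def pooled_t_from_stats(
    mean1: float, sd1: float, n1: int, mean2: float, sd2: float, n2: int
) -> Tuple[float, int]:
    """Independent-samples t statistic (equal variances) and its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Indoor immersion scores as reported above (cCAVE vs. Amaze360).
t_indoor, df_indoor = pooled_t_from_stats(5.18, 1.629, 17, 6.18, 0.883, 17)
```

For the indoor content this yields a t value of about -2.23 with 32 degrees of freedom, in line with the reported t(32) = -2.225.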

Fig. 6. Average immersion level of the participants in both applications with indoor and outdoor video content
Based on interaction log data, some statistically significant differences in the application use times were observed. cCAVE was used for longer periods of time (in seconds) in total (both outdoor and indoor scenarios combined) than the HMD application (Experiment 1, M = 884.47, SD = 357.91 and Experiment 2, M = 561.41, SD = 214.52); t(32) = 3.193, p < 0.05. Participants also used the CAVE system for longer periods with the indoor video content (Experiment 1, M = 502.82, SD = 303.77 and Experiment 2, M = 260.76, SD = 92.96); t(32) = 3.142, p < 0.05. There was no observed effect with outdoor video content. The total times spent with both indoor and outdoor video content can be seen in Figure 7.

Expectations Versus Experiences
The most interesting finding regarding the user experience was that in almost all metrics the actual user experience exceeded the expectations, especially so in the second experiment with the HMD. The interaction method in the cCAVE experiment may be the reason for the difference in expectations on pleasantness: sitting and interacting with a chair can be expected to be more comfortable for users than standing up while wearing the HMD. In addition, it might be difficult to make any estimates of the pleasantness and clarity of a "black box" HMD device, which offers the user no cues on the method of interaction. The cCAVE setup might be more impressive and futuristic looking than HMD devices in general. Another factor to consider is the physical setup of the two applications: cCAVE is a large installation built on a metallic rig with eight monitors, whereas the HMD application uses the smaller Samsung Gear headset and a basic Samsung Note 4 mobile device. The difference in system size itself might suggest that the cCAVE is more powerful than the compact HMD device. In addition, a desktop computer can be presumed to have better performance than a smaller mobile device, which might imply to some participants that the system itself is also better in terms of graphics and performance.
The HMD application was considered to be faster than cCAVE, which can be at least partly explained by the interaction method: head turning used with the HMD is much faster to perform than rotating the chair in cCAVE. As mentioned earlier, both systems were regarded as easy to learn, but for the HMD this metric was significantly higher. The intuitive method where the viewport is rotated based on the user's head orientation offers the user an efficient way to start interacting with the virtual world immediately after they put on the device. The UI elements draw the user's attention, and when the user concentrates on these elements, they are activated and animated, which again hints to the user that something is happening.
The implications of these results are that both CAVE systems and HMD applications utilizing ODVs are regarded as both useful and easy to learn. Both of the applications had very simple interaction methods which were based on dwell time. This seems to be a meaningful way of interacting with these types of systems, especially when the interaction is kept simple. Nevertheless, more research is required in order to understand the relationship between complex UI elements and different interaction methods.
Overall, the positive feedback on both applications validates their use in this study. The applications were also very robust and had no technical faults during the evaluations, which might have also affected the participants' feedback on the user experience. The actual user experience was much more positive than the users' expectations with both systems, but especially so with the HMD application.

Immersion and User Interaction
Our evaluation suggests that the Amaze360 application is more immersive than the cCAVE with both video content types. There are many possible explanations for this result. First, the headset blocks all other visual stimuli from view, showing only the contents of the application to the user, whereas in the cCAVE the user can still observe objects outside the screens, including the bezels of the monitors. Second, the HMD provides a stereoscopic effect (coming from the spherical projection, not the video itself), which creates an illusion of depth. This is not provided in cCAVE. Third, since interaction with the Amaze360 is based on head orientation, it does not require any external devices, which might enhance the feeling of immersion even further. In the first experiment the aim was to make the interaction as simple as possible with the use of the rotating chair, but it is still not as natural as interaction with the headset. In future implementations, body tracking and gaze tracking could be combined to produce an interaction solution similar to that of the HMD application.
Despite the unique advantages of the HMD application, the positive feedback for this application is interesting when considering the current limitations of the technology. For instance, Amaze360 offers a relatively limited 60-degree field of view, which is much smaller than that of the human eye, whereas cCAVE had no such physical limitations. However, none of the users reported this as an issue. Some cCAVE users had trouble finding the textual content at the bottom of the screen even though they were informed about its location beforehand, during the introduction. Participants had no trouble finding or activating the hotspots with the Amaze360 application, but three participants noted that the hotspot text box obscures the actual object behind it. One solution for this could be a semi-transparent text box that does not hide the content. These findings indicate that the optimal location for the contextual information is somewhere around the center of the screen, where the user is most often looking, but that it should not dominate the viewport. It should also be located close to the actual UI element activating it, so that the user finds it quickly.
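To make the 60-degree figure concrete, the short Python sketch below checks whether a hotspot lies inside a viewport of that width for a given head orientation. It assumes that the 60 degrees refer to the horizontal field of view and considers yaw only. The function names and example angles are hypothetical and do not come from either application.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest signed difference a - b in degrees, in the range [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def in_viewport(hotspot_yaw: float, head_yaw: float, fov_deg: float = 60.0) -> bool:
    """True if the hotspot falls within the horizontal field of view."""
    return abs(angular_difference(hotspot_yaw, head_yaw)) <= fov_deg / 2.0


# With a 60-degree viewport, one sixth of the 360-degree panorama is visible at a time.
visible_fraction = 60.0 / 360.0

# The wrap-around at 0/360 degrees is handled by angular_difference:
# a hotspot at 350 degrees is 20 degrees to the left of a head yaw of 10 degrees.
hotspot_visible = in_viewport(hotspot_yaw=350.0, head_yaw=10.0)
```

Under these assumptions, a hotspot that is more than 30 degrees away from the viewing direction is outside the viewport until the user turns their head, which is one reason for placing contextual information close to the element that activates it.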
One participant using cCAVE remarked that reflections on the monitors broke the feeling of immersion, as the participant could see the monitors behind him reflected in the monitors in front of him. Four Amaze360 users reported that they felt dizzy during the indoor scenes, which were filmed on a ladder. This finding and its connection to acrophobia could be an interesting topic of research, and has also been looked into by Coelho et al. [6]. None of the participants reported any motion sickness effects in either application. Three participants using cCAVE and one user using the Amaze360 stated that the resetting of the omnidirectional video content back to the beginning (due to looping videos) broke the immersion somewhat. In addition, some video production errors that caused distortions broke the feeling of immersion for one cCAVE user. These distortions can be eliminated with careful planning of the recording and editing phases of the ODV content. The biggest hurdles in the post-production phase are color level differences between the cameras, stitching errors where the content from adjacent cameras does not overlap properly, and camera equipment being visible in the recording. Also, if the content needs to loop, some attention should be paid to how smoothly the end of the content loops back to the beginning. These problems will likely diminish as ODV recording and production technologies advance. When comparing the results between outdoor and indoor video content, there was no significant difference in the feeling of immersion.
The difference in use times with the indoor video content is also an interesting finding. As the same effect was not observed in outdoor environments, one explanation for the difference between the systems could lie in the claustrophobic nature of the indoor environment and the limited field of view used in the HMD application. Another explanation could be the filming location of the indoor video content. Two out of three of these videos were recorded from higher ground, i.e., from a ladder. Four HMD users said that they felt dizzy during these scenes, which might have affected the total time spent with the indoor videos.
We also note that CAVE systems are diverse and may vary significantly between setups. The cCAVE system was unique but also relatively limited with regard to the rotating chair. It would be interesting to research immersion further with CAVE systems, in particular with larger installations inside which users could walk freely. There are also factors that should be taken into account in future evaluations. For example, the participants' spatial abilities could be evaluated with the Santa Barbara Sense-of-Direction Scale [11] before they use the application.

Implications for iODV Applications
In the past, CAVE systems have been used extensively in many areas such as industry [28][7]. However, we argue that HMD systems open up many unique, new application areas for two reasons. First, due to their small size and easy physical setup, HMDs are easily portable. Second, they are more scalable and adjustable, i.e., less dependent on specific equipment and a specific physical setup. These features could make HMDs a valuable asset in many situations. For instance, we recorded the indoor video content used in our experiments in a skylift maintenance hall. However, maintenance on skylifts is often conducted in the field. Field technicians could carry HMDs with them and access informative content on the spot in case they needed additional guidance on, e.g., how to conduct a specific maintenance procedure on a skylift model unfamiliar to them. We believe omnidirectional video content could prove useful in such situations, as a potentially complicated procedure may be difficult to fully document (and view) with a regular camera, especially if the procedure involves large objects.

Conclusion
In this paper, we investigated the user experience and level of immersion in interactive omnidirectional video (iODV) applications. We conducted a comparative study between two applications: a CAVE system, cCAVE, and a head-mounted display application, Amaze360. We collected and analyzed interaction logs and questionnaire data to gain insight into the similarities and differences between these two systems and into the feeling of immersion and user experience in iODV applications in general.
Our main findings suggest that, with regard to user experience in interactive ODV applications, the experiences exceed the users' expectations. These differences were especially evident with the HMD system, as the users' expectations were exceeded in many aspects such as pleasantness, clarity and performance of the system. Both the CAVE and the HMD applications were considered very easy to learn. Some of the differences in user experience between these two iODV applications can be explained by the different user interaction methods. The head orientation-based interaction used with the HMD is much faster to use than the rotating chair of the CAVE system.
Another interesting takeaway from our study is that ODV is a very immersive medium. Overall, the HMD application was considered to be more immersive than the CAVE system. This effect was observed with both indoor and outdoor video content. We primarily attribute the immersiveness of the HMD application to a) the head-mount, which effectively blocks outside visual stimuli and allows concentration on the content, b) the stereoscopic view, which creates a sense of depth, and c) the viewport being based on head orientation, which allows the user to look around naturally.
As interactive ODV applications are becoming more available in the consumer market, further research on the possibilities of this medium is necessary. For future work, it would be meaningful to study the feeling of immersion with video content recorded at different heights (skyscraper versus a cave) and with different types of background movement (crowded street versus peaceful forest), as these properties were not within the scope of this study. The effect of a moving camera (e.g., a roller coaster or a racing car) on immersion should also be evaluated. This could provide more insight into what kind of ODV content offers the most immersive experience to the user.

Fig. 1. The cCAVE system. The rotating chair used for interaction is at the center of the system. Eight monitors (only 6 shown in the image) each show 45 degrees of the omnidirectional video content. The two monitors in front are attached to the doors that are opened for entrance and closed during use.

Fig. 2. Top: Amaze360 physical setup. Bottom: Amaze360 application view. The video is shown as a stereoscopic presentation. The activated hotspot is shown at the center of the screen.

Fig. 3. The hotspot activation sectors illustrated in both applications. The gray coordinate area represents the coordinate range of hotspots in cCAVE, and the circular area represents the X- and Y-coordinate range used in Amaze360.

Fig. 4. Hotspot locations in the two applications. The HMD hotspot location is shown with a white dotted line and the CAVE system hotspot location with a black dotted line.

Fig. 5. Average UX ratings for expectations and experiences on both systems. Arrows indicate the direction of the change between expectations and experiences. The statements in bold had statistically significant differences between the applications in expectations, and those marked with an asterisk had statistically significant differences in experience.

Fig. 7. Total mean time spent on task with indoor and outdoor video content.

Table 1. Differences between the two applications