Wireless Smartphone Mirroring in Video Calls

Abstract. While screen mirroring is an integral part of many video mediated collaborations, current systems are limited in their ability to include ad hoc screen mirroring from personal devices of collocated participants on each end of a video call. In this paper we introduce a system that addresses this limitation by enabling lightweight multi-user wireless smartphone mirroring within a video call. The system enables multiple smartphones to share both digital content as well as physical artefacts when mirroring the live view from the smartphone camera feed. We present a study of the system in use for a distributed design task. The findings explore how shared access to screen mirroring facilitates a fluid switching of floor control in the meeting and smooth interleaving of individual, subgroup and full group shared activities. Further, the findings highlight the importance of smartphone mobility in enabling access to screen mirroring from the sites of individual work and sites of various physical artefacts and the significance of this for the dynamics of a video mediated collaboration.


Introduction
Since the early days of video mediated communication research, there has been widespread recognition of the importance of sharing the viewing of information artefacts between distributed participants as the basis for ongoing discussion and collaboration [e.g. 2]. These shared artefacts whether physical or digital provide a common ground [1] that can be drawn attention to in the context of collaborative work. Significant research efforts within the domain of computer-supported cooperative work (CSCW) have looked to develop ways in which these artefacts can be shared and simultaneously viewed by distributed collaborators. Drawing on these efforts, many commercially available video calling solutions (e.g. Skype, Lync, Google Hangouts, and WebEx) offer some form of screen mirroring capability to support shared viewing of digital documents across distributed sites. In this paper the term mirroring refers to the direct duplication of one screen to another, while sharing is used to describe the act of visually sharing activities and digital or physical artefacts among collaborators.
While screen mirroring capabilities offer important value in video mediated collaborations, the current set-ups are not without their limitations. For example, it is common in many everyday conferencing situations for one site to use a single host computer to connect to another host computer at a remote site. Multiple collocated participants are often present at either site, but when the call is hosted on a single computer, access to screen mirroring capabilities can be restricted to the person driving the host computer. While it is possible to swap control to another participant, this process is cumbersome and ultimately inhibits more casual and ad hoc artefact sharing [4,8]. Related to this concern, these collocated participants may be working with a broader ecosystem of mobile devices such as laptops, tablets and smartphones as well as a variety of physical information surfaces like whiteboards and paper. Such personal devices have significance in that they may contain information arising from individual work performed prior to the video call as well as supporting any individual or subgroup work being performed in the meeting in parallel to the shared aspects of the work, both of these being potential contributions to subsequent shared activities in the meeting. While there are possibilities for sharing from some of these devices, the mechanisms are again cumbersome and inhibitory. For example, it is possible for these devices to actively join the video meeting as an additional participant and then proceed to mirror the screen from there. But this is sufficiently effortful to be a barrier. As argued by Mueller-Tomfelde and O'Hara [5], such effort also incurs certain social consequences. Going through these processes entails "taking the floor" in a strong way in which the contributions of the sharing and talk need to be sufficient to justify the interruption. In this sense, these cumbersome processes create a social barrier to more casual sharing.
Furthermore, there are various physical constraints on the sharing opportunities within such set-ups. Elements of sharing may involve fixed wired connections such as those connecting to any shared display used within the room. Again while not insurmountable, such requirements do present certain constraints on the sharing from wherever in the room. As McGill et al. [4] highlight in relation to purely collocated settings, wireless screen mirroring can liberate collaborators from such constraints in ways that can benefit the participation dynamic of the collocated settings. Additional concerns here arise in relation to opportunities for sharing physical artefacts whether these are personally created and assembled paper documents or larger vertical surfaces such as flipcharts and whiteboards within a conference room. While there exist bespoke camera based set-ups that support aspects of this kind of sharing, what is more typical is to appropriate the existing single camera set-up already in use for the video sharing. Such cameras are either fixed to the host machine or alternatively to the front of the meeting room. In either instance such fixity of the camera offers significant limitations on the ability to use them for the sharing of physical artefacts. Smaller documents can potentially be brought to within the camera frame which is burdensome, but the fixed cameras cannot be brought to the site of their production and use at the table. For larger physical information surfaces, such as wall-based whiteboards and flip charts there is little or no opportunity to usefully bring them into the frame of the primary video camera and little opportunity to bring cameras to their location in the meeting room.
In this paper, we present work that is motivated by the key arguments and limitations of current screen mirroring capabilities discussed above. The work looks to support more ad hoc and casual screen mirroring opportunities within video calls for multiple collocated participants at either end of a remote collaboration. The aim here is to achieve this by facilitating mirroring from participants' personal devices that are not explicitly connected as host devices within the call. Furthermore, it aims to enable this to be done from anywhere in the room by using wireless rather than wired screen mirroring mechanisms. While our aims and motivations apply to a broader range of personal devices and artefacts within the meeting room ecosystem, we focus in the first instance on wireless smartphone screen mirroring. One of the key reasons for this is to enable us to further exploit the specific mobile camera capabilities of smartphones with a view to opening up access for all in the room to participate in distributed sharing of the various physical information artefacts located around the room. Before presenting the system, we briefly discuss some related work to further ground the arguments underpinning the system. After presenting the system, we highlight some initial findings from a study of the system in use for a distributed collaborative activity.

Related Work
The sharing of screens, documents, and artefacts within collocated and distributed settings has been well documented in the CSCW literature and a comprehensive review of the various systems and nuances of particular approaches cannot be given due justice here. A good review of the key arguments can be found in the work of Tee et al. [8]. Of particular significance to our concerns here are some of the themes arising from research into Multi Display Groupware. Such efforts look to augment elements of single display groupware based collaboration with additional display devices such as personal tablets or mobile devices (e.g. [6,10]). Of particular relevance here is the combination of personal and shared displays that acknowledge and bring together strands of individual and small subgroup work with the larger shared activities of the group as a whole (e.g. [9]).
With the emergence of commercial screen mirroring technologies in smartphones such as Miracast, Airplay and Chromecast, we are beginning to see some explorations of their use among small collocated groups of collaborators. A recent study [4] showed how multiple collocated users of these technologies self-managed the mirroring of their phones to a main display. While they used wired connections in their study, they highlighted the ways in which participation was better shared among the collaborators, reducing dominance by a single person who might otherwise control the mirrored display. In our work we extend these ideas to include remote settings as well as wireless techniques for smartphone screen mirroring.
In addition to such multi-display and screen mirroring work, additional work looks to consider the potential for incorporating mobile phone camera capabilities into video call and media space set-ups. Neustaedter and Judge, for example, developed the peek-a-boo [7] concept that exploited the mobility of the camera phone to link in with a fixed media space display in the home [cf. 3].
A detailed technical description is beyond the scope of this paper, but the following provides an overview of the technologies used and the main components of the implementation. The system we developed draws together capabilities from Microsoft's Lync communication suite and the Lumia Beamer screen mirroring application. Lync supports video conferencing, instant messaging and screen sharing functionalities. While a mobile Lync client is available, it does not support distributed screen mirroring capabilities. Lumia Beamer is an application available on Nokia Lumia phones that enables users to mirror the screen of their phone to another display through a regular web browser. By using the phone to scan a QR code presented in the browser, the application mirrors the screen of the phone in the web session. In addition, switching to the inbuilt smartphone camera application while mirroring can effectively make the phone function as a wireless handheld web cam.
The prototype is implemented as a desktop application using .NET 4.5 and Windows Presentation Foundation. Using the Lync SDK we developed a bespoke Lync client into which we have wrapped key elements of the Lumia Beamer functionality.
The key difference from the standard Lync desktop client is in the interface of an active video call. When a call is answered, our application intercepts the conversation and presents a full screen mode containing three primary elements: the video conversation, a QR code for connecting the Beamer application, and an area for displaying the mirrored smartphone screen (see Fig. 1). To synchronise the mirrored view on both clients connected to a call, we use MQTT, a lightweight machine-to-machine communication protocol with a publish/subscribe mechanism. Whenever a Lync call is accepted, each client creates an ID unique to the conversation, based on an MD5 hash of a sorted list of the Lync IDs of connected users. Each client connects to an MQTT message broker and uses the created ID as the subscription topic. When a QR code is scanned, the client publishes the session URL to the conversation topic, and all clients with the same ID receive the message and redirect their mirrored view to that session URL. Each time a Beamer mirroring session has been initiated, the client on which the QR code was scanned generates and presents a new unique QR code. By doing this, participants on each side of the video call always have access to start a new Beamer mirroring session.
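The synchronisation scheme described above can be illustrated with a minimal Python sketch. It is not the prototype's code (which is a .NET application built on the Lync SDK): the in-memory broker stands in for a real MQTT message broker, and the class names, the comma separator used before hashing, the UTF-8 encoding, and the example URLs and Lync IDs are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def conversation_id(lync_ids):
    """Derive a topic name shared by every client in the same call.

    Sorting makes the result independent of the order in which each
    client lists the participants. The separator and the UTF-8
    encoding are assumptions of this sketch, not taken from the paper.
    """
    joined = ",".join(sorted(lync_ids))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


class InMemoryBroker:
    """Hypothetical stand-in for an MQTT broker (topic -> callbacks)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Deliver the payload to every subscriber of this topic.
        for callback in list(self._subscribers[topic]):
            callback(payload)


class CallClient:
    """Hypothetical client holding the URL of the currently mirrored session."""

    def __init__(self, broker, participants):
        self.broker = broker
        self.topic = conversation_id(participants)
        self.mirrored_url = None
        broker.subscribe(self.topic, self._on_session_url)

    def _on_session_url(self, url):
        # Redirect the mirrored view; a newer session replaces the old one.
        self.mirrored_url = url

    def qr_code_scanned(self, session_url):
        # Announce the new mirroring session to all clients in the call.
        self.broker.publish(self.topic, session_url)
```

In this sketch, two clients that list the same participants in a different order still derive the same topic, so a QR code scanned at either site updates the mirrored view on both. A later scan simply replaces the earlier session URL, which mirrors the override behaviour discussed later in the paper.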

Study
In order to evaluate the system in action we conducted a study of its use in a distributed collaborative task. The study consisted of 20 participants in total, organised into five different sessions with four participants in each. The participants were employees at the same company recruited on a volunteer basis. Of these, 16 were male and 4 female (average age = 29 years, SD = 4.10).
A session consisted of an introduction to the task and system, a 20 min collaborative design session, followed by a 20 min group interview. The task chosen for the design session was for participants to collaborate on a t-shirt design representing the company they work for. The task was chosen to encourage several participants to create, share and discuss both physical and digital artefacts. Participants were divided across 2 conference rooms with 3 collocated people in one room and a single person in the other. This configuration was chosen as we found the 2 × 2 configuration option to be limiting for exploring aspects of collocated interaction. As such we chose 3 people at one end to better represent these concerns in each session.
The rooms used were standard conference rooms with a large display on the front wall on which the application was presented (see Fig. 2). Each room had various vertical whiteboard surfaces on the walls. We also provided both rooms with coloured pencils, felt-tip pens, post-its, and A4 paper, both blank and with pre-printed t-shirt templates. During the introduction each participant was given a Nokia Lumia phone configured to have access from the start screen to the Beamer application, Internet Explorer, Office, camera, photo gallery, and calendar. Browser history and photo gallery were cleared between sessions. Each session was video recorded using a dedicated video camera that was positioned to capture all of the collocated participants in the room and the shared display on the front wall on which the remote video of the single participant room was visible. The single person room was not video recorded but observed, to document events that might not be so apparent through the video conversation view.
Fig. 2. Room set-up for the study
The subsequent group interview sessions were used to elicit general opinions about the system as well as elaborations on specific behaviours of interest identified by the researcher observing the sessions. These interviews were video recorded and transcribed for later analysis. Over and above the in situ observations of the researcher, video recordings of the task sessions were subsequently revisited to allow a more detailed, reflective and systematic analysis of the unfolding collaborative action of the participants. Findings are primarily based on the observations and video recordings, but interview data were used to gain a deeper understanding of particularly interesting events throughout the analysis.

Findings
Within the sessions, it was observed that multiple people in both locations took the opportunity to share content from their devices. The content shared included images sourced from the web, photographs taken of drawings on paper documents, photographs of whiteboards, photographs of objects captured outside the meeting room and live video images of paper documents and objects as they were being worked on or discussed and pointed to in real time.

Coordinated Organisation
Of significance here was the coordinated organisation of individual and shared aspects of the task. Individual work here took place in parallel to shared discussions happening around the shared display. For example, individuals were observed using their smartphones to search for images while not mirrored to the shared display. Once the images were located they would then mirror their display to the shared surface in order to take the floor. Likewise, design ideas were explored on paper documents in preparation for subsequent sharing via the main screen. Such preparation would often happen in parallel with another participant sharing and presenting. Some participants would prefer to utilise the live video to share design ideas with the option of quickly switching between different documents laid out on the table, by simply moving the camera around. Others would take photos as their work progressed and later share them by mirroring the photo gallery application of the smartphone. In any case, distributing the activities across multiple devices and artefacts enabled a more fluid interleaving of individual, subgroup and full group sharing activities. The preparation work meant that objects of sharing were immediately available to facilitate the social mechanisms and timings by which new ideas could be introduced to take the floor in the discussion.

Sharing Physical Artefacts
As well as sharing products of particular individual and subgroup work, what was also noteworthy was the real time sharing of work being done on paper documents and whiteboards. Here the video capabilities of the phones were used to reveal work as it was being performed. This involved some collaborative efforts with one person holding the mirrored phone to video the mark-up, gesticulation and talk around the paper in situ such that it could be shared across the two locations.
A critical feature of these interactions was the nature of mobility enabled by the wireless sharing of these images. This played out in a number of important ways. First of all, we saw how it allowed people to perform sharing activities from wherever they were seated, allowing them to fluidly shift from individual to shared activities in the context of their locally assembled artefacts. Second, we saw how this mobility enabled movement around and beyond the room. For example, one participant left the meeting room to capture a photo of artwork situated in the atrium of the building. Another participant moved from their seat to the whiteboard in the room and proceeded to share a live image of the content from their camera phone while talking about it. Finally, the micro mobility of the phone was exploited to achieve the fine-grained framing requirements of specific features of the work process and artefacts that were deemed useful to be shared across sites. In essence, mobility allowed participants to accommodate features of the environment that impacted on the spatial organisation of the work.

Negotiating Control
As a final point, we saw how participants were able to successfully negotiate among themselves the fluid transfer of control over the shared display. With the always-present availability of the QR code to control mirroring, participants were observed to vocalise their intention to share just prior to initiating the mirroring process. It was apparent in the timing and nature of these socially mediated requests that participants exhibited sufficient awareness of the ongoing work of the collaborating parties across sites to achieve such transitions relatively smoothly. Because there were no explicit mechanisms in the application to control the flow of screen mirroring or indicate who was currently sharing content, situations would occasionally occur where, for instance, multiple participants would try to share content simultaneously. However, keeping the negotiation of control within the social interaction, rather than making it an explicit function of the application, was observed to be a strength: an explicit control mechanism could easily have complicated frequent switching between participants.

Discussion
In this paper, we have presented a system to enable wireless screen mirroring from smartphones to shared displays and across distributed settings. Key here is the integration of these mirroring capabilities within a video conferencing application that lends mirroring access mechanisms both across sites and to collocated participants within a site. What we see is how this extends the ecosystem of devices from which ad hoc wireless screen mirroring can be achieved within a video call in ways that exploit their unique affordances. Of note here is the lightweight way in which the camera ecosystem of the video call can be extended through the camera capabilities of the smartphone.
The mobility of these devices means that real time capture opportunities are available in flexible sites around the distributed locations. As we saw, this enabled the capture functionality to be moved to the sites of interest, allowing physical artefacts and the work around them to be incorporated in the distributed screen mirroring. In extending the mirroring capabilities across multiple personal devices, individuals had an additional resource through which to take control of the floor in the conversation. We saw how this facilitated parallel streams of individual and shared working and the fluid interleaving of these activities. Individuals were able to engage in their own preparatory activities with both digital and physical resources before introducing them into a more shared context for discussion. Directly mirroring personal devices such as smartphones naturally introduces privacy issues. These are outside the scope of this paper but are an interesting topic for future work. Finally, in contrast to some screen mirroring technologies that require an existing mirror connection to be first disconnected, our mechanism enabled participants to override any existing connection. This meant the opportunity to share was always available and negotiable through lightweight social mediation.