Mouse Tracking Measures and Movement Patterns with Application for Online Surveys

There is growing interest in the field of human-computer interaction in the use of mouse movement data to infer, e.g., users' interests, preferences and personality. Previous work has defined various patterns of mouse movement behavior. However, there is a paucity of mouse tracking measures and defined movement patterns for use in the specific context of data collection with online surveys. The present study aimed to define and visualize patterns of mouse movements while the user provided responses in a survey (with questions to be answered using a 5-point Likert response scale). The study produced a wide range of different patterns, including new patterns, and showed that these can easily be distinguished. The identified patterns may, in conjunction with machine learning algorithms, be used for further investigation toward, e.g., the recognition of the user's state of mind or for user studies.


Introduction
A multitude of human factors influences human-computer interaction (HCI) (e.g., [18]). The influence on HCI of individually stable patterns of thinking, feeling and behavior is of longstanding interest (e.g., [9,21]), as this often reflects underlying interests, preferences and personality. Making decisions is a complex cognitive and affective process [8,13]. Understanding user behavior in the context of decision-making has increasingly attracted attention in HCI research [10,19].
Pointer tracking refers to the recording of users' mouse cursor positions, used, for example, to capture mouse movement trajectories for further analysis. Acquiring mouse cursor positions has the advantage of being cheap and easy to implement, and the mouse is already an integral part of computer use.
The present study aimed to identify patterns of mouse movements while the users give input in an online survey. These mouse movement patterns are potentially relevant as a means to understanding the user, such as in terms of the user's patterns of decision uncertainty.
Given the relative paucity of mouse tracking measures and mouse movement patterns in the literature, we present a new set of mouse behavior patterns that could potentially be combined with machine learning algorithms as a means to capturing information [14] about stable patterns of thinking, feeling and behavior of the user.

Related work
Eye tracking systems have been used in HCI research since the mid-1970s [22]. The data structure is similar to that of mouse movements (x and y positions on the screen over time). In fact, a wide range of eye movement behaviors have been associated with mouse movement behaviors. There are also multimodal data acquisition devices available, such as Tobii and SensoMotoric Instruments (SMI) systems, that allow concurrent measurement of eye and mouse movement behavior.
For instance, Tobii permits eye tracking and analysis of eye sampling behavior while the user observes and interacts with web pages [4]. This system also enables concurrent acquisition of video, sound, key-strokes and mouse clicks. Analyses include a range of measures such as mouse movement velocity, and can visualize results using various methods, such as heat maps. The analyses of different modalities may also be combined in order to assess, for example, the time from the first fixation to a particular target until the user clicks on the same target (or the number of clicks on the target).
SMI [2] also provides behavioral and gaze analysis software for research in the fields of reading research, psychology, cognitive neuroscience, marketing research and usability testing. While this system only processes eye and head tracking data, it has the advantage of allowing the analysis of several subjects simultaneously. This permits analysis, for example, of the hit ratio, that is, the relative number of subjects in the sample that fixated at least once on the target.
Although eye tracking systems have a comparatively long history, the field of mouse tracking has developed several interesting approaches for mouse movement analysis. Much of this work relates to usability testing of web pages in order to improve the user experience [1,3,5,6,7,15], while other studies extract data from the mouse coordinates, such as path distance, time measures and mouse clicks, in order to study the user's behavior rather than the web design itself.
For instance, Revett et al. and Hugo et al. [11,23] propose the biometric identification of the user based only on mouse or pointer movements. Another approach, led by Khan et al. [17], related the mouse behavior patterns with personality. In Pimenta et al. mental fatigue has been detected by means of mouse movements [20], while Hibbel et al. related movements to emotions [12,26].
Other measures and movement patterns have also been used in behavior studies. In 2006, Arroyo et al. described mouse behaviors in the context of websites, reporting user behavior that consists of a long pause next to text or a blank space, followed by a fast and direct movement towards a link [6]. Arroyo et al. also examined hesitation patterns and random movements, while Huang et al. compared click and hover distributions, unclicked hovers and abandonments [15].
Seelye et al. used the deviation of the movement in relation to a straight path and the time between the two targets to distinguish older adults with and without mild cognitive impairment (MCI). They found that more curved or looped mouse movements and less consistency over time are more closely correlated with MCI subjects [24].
Yamauchi et al. focused on two trajectory measures of the mouse cursor to detect user emotions. They defined attraction as the area under the curve from the starting position to the end position and zigzag as the number of direction changes during the movement. A statistical model built with these trajectory measures could predict approximately 10%-20% of the variance of positive affect and attentiveness ratings [25].
Arapakis et al. used a large number of measures to predict user engagement, as indicated by, for example, attention and usefulness [5]. The set of measures included the most common distance and time measures and also measures related to the target, for instance, the number of movements toward and away from the target, or the number of hovers over the target compared with around the target.
More recently, Katerina et al. used a wide set of measures, including mouse and keyboard measures [16]. Their objective was to examine the relationship between the measures extracted from mouse and keystroke data and end-user behavioral measures. Two examples of the mouse movement measures examined are the number of long mouse pauses and the number of clicks at the end of direct mouse movements. From keystroke dynamics, one example of a measure is the time elapsed between key press and key release.
To the best of our knowledge, no previous studies have reported mouse movements during data collection using online surveys.

Participants and procedure
N=119 volunteers recruited via a pool of test participants and students of the University of Zurich and of the ETH Zurich participated in this study. The participants were between 20 and 52 years old (M=25.4; SD=5.4; 18 male). All participants were native or fluent speakers of Standard German. Written informed consent was obtained before participation, according to the guidelines of the Declaration of Helsinki.

Data Acquisition Architecture
In this study, the data resulted from the interaction of the user with the web browser while completing an online survey, which was programmed to send the data via AJAX to a server machine, where they are recorded as a file in a database.
The results of the survey are also saved in the database, in this case via the Survey Management System using PHP. Therefore, if needed, these results can be accessed as well.

Data Collection
The pointer movement is recorded by a server, which creates a report file with the relevant recorded data: frame number; event type (0 when a movement is registered and 1 when the mouse button is pressed down); question number, if hovered; answer number, if hovered; x and y screen position (in pixels); and time stamp. The name of the file includes the IP address, the survey ID and the step of the questionnaire.
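The record layout listed above can be sketched as a small data structure. This is only an illustration: the source does not specify the file format, so the comma-separated layout, the field order, the use of an empty field for "not hovered", the millisecond time unit and the names `PointerSample` and `parse_line` are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PointerSample:
    frame: int               # frame number
    event: int               # 0 = movement, 1 = mouse button pressed down
    question: Optional[int]  # question number, None if no question is hovered
    answer: Optional[int]    # answer number, None if no answer is hovered
    x: int                   # screen position in pixels
    y: int
    t: float                 # time stamp (unit assumed to be milliseconds)


def _optional_int(field: str) -> Optional[int]:
    """Map an empty field to None (assumed encoding of 'not hovered')."""
    field = field.strip()
    return int(field) if field else None


def parse_line(line: str) -> PointerSample:
    """Parse one record, assuming seven comma-separated fields."""
    fields = line.strip().split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    return PointerSample(
        frame=int(fields[0]),
        event=int(fields[1]),
        question=_optional_int(fields[2]),
        answer=_optional_int(fields[3]),
        x=int(fields[4]),
        y=int(fields[5]),
        t=float(fields[6]),
    )
```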
The online survey is constructed using a freely available web-based survey software tool. The online survey presents a sequence of statements, and the answers are given on a 5-point Likert-type scale. The results from the survey can be exported to a CSV file.

Data cleaning
To ensure correct formatting and processing of data from the server file, a validation procedure is applied as a first step. This validation procedure ignores data acquired with touch screen devices, reorders the data by time, joins different files from the same questionnaire and detects how many samples are lost.
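Part of this validation procedure could look as follows. The sketch covers only joining, reordering by time and counting lost samples; filtering of touch screen data is omitted because the source does not describe how such data are identified. It assumes that frame numbers increase by one per recorded sample across the files of one questionnaire, so that gaps in the frame numbers indicate lost samples. The function name `clean` and the tuple layout are hypothetical.

```python
from itertools import chain


def clean(files):
    """Join, reorder and check the samples of one questionnaire.

    files: list of sample lists; each sample is a tuple (frame, t, x, y).
    Returns the time-ordered samples and the number of missing frames.
    """
    # Join the different files and reorder all samples by time stamp.
    samples = sorted(chain.from_iterable(files), key=lambda s: s[1])
    frames = {s[0] for s in samples}
    if not frames:
        return samples, 0
    # Assumption: frames are numbered consecutively, so every missing
    # number between the first and last frame is a lost sample.
    expected = max(frames) - min(frames) + 1
    return samples, expected - len(frames)
```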

Behavioral Patterns Description
The data acquired with LimeSurvey contain information about the mouse position, with and without scroll, in pixels. These data are first interpolated so that the time interval between samples is equal and subsequent measures are computed from evenly spaced samples. With the pre-processed mouse position and the other information delivered by LimeSurvey, several measures from the temporal, spatial and contextual domains can be derived.
In this study, these measures are essential for computing several of the behavioral patterns described below.
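The interpolation step can be illustrated with a minimal sketch. Linear interpolation is an assumption here, since the source does not state which interpolation method was used. The function name `resample` and the step parameter `dt` are likewise hypothetical.

```python
def resample(t, v, dt):
    """Linearly interpolate values v, sampled at increasing times t,
    onto a uniform time grid with step dt starting at t[0]."""
    if len(t) != len(v) or len(t) < 2:
        raise ValueError("need at least two samples of equal length")
    out_t, out_v = [], []
    i = 0
    n_steps = int((t[-1] - t[0]) / dt)
    for k in range(n_steps + 1):
        tk = t[0] + k * dt
        # Advance to the original interval [t[i], t[i+1]] that contains tk.
        while i < len(t) - 2 and t[i + 1] < tk:
            i += 1
        span = t[i + 1] - t[i]
        w = (tk - t[i]) / span if span > 0 else 0.0
        out_t.append(tk)
        out_v.append(v[i] + w * (v[i + 1] - v[i]))
    return out_t, out_v
```

The same function would be applied separately to the x and y coordinates, using the same time grid for both.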

Overview Pattern
One behavior found in some subjects when answering surveys is getting an overall idea of the number of questions, the length of the survey or the types of questions. At the beginning of the survey, these subjects scroll and move the cursor over a wide area toward the bottom of the survey to get an overview of it. Figure 2 shows the mouse y coordinate over time, which makes this behavior easy to observe. The first questions are at the top of the plot (small y values) and, moving forward through the next questions, y increases. At the beginning of the questionnaire, this subject goes to the end of the survey and then comes back to the first questions. This behavior also occurs after one minute and after two minutes of interaction, but never reaching as far as the first time.

Fast Decision Pattern
While some people take a long time to answer the questions, others are very fast. Both behaviors can be found in the data; we refer to them as Fast Decision Patterns, and they are represented in Figure 3. Both plots show the question over which the mouse is located over time and, as can be observed, the subject at the top is much faster than the subject at the bottom, taking one and a half minutes less to answer the same questionnaire.
The work of Arroyo et al. [6] analyzed fast movements towards a target.

Revisit Pattern
A typical behavior found in the survey context is revisiting prior questions some time after having answered them. In Figure 4, the user revisited a prior answer (moving from question 14 to question 3), which was at the top of the survey. Interestingly, after answering question 3 for the first time, this subject responded to question 4 and came back to question 3, changing the previously selected option three times. The revisit occurred around three minutes after these changes. The analysis done by SMI [2] considers a similar metric with eye movements for group evaluation: the average number of glances into the target.
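One possible way to count revisits is sketched below. This is not a metric defined in the study. It assumes a time-ordered trace of the question number under the cursor at each sample (`None` when no question is hovered) and counts every re-entry into a question that was visited earlier. The function name `count_revisits` is hypothetical.

```python
def count_revisits(question_trace):
    """Count, per question, how often the cursor re-enters it after
    having moved on to a different question."""
    visited = set()
    revisits = {}
    prev = None
    for q in question_trace:
        # Ignore samples outside any question and repeated samples
        # within the question currently being visited.
        if q is None or q == prev:
            continue
        if q in visited:
            revisits[q] = revisits.get(q, 0) + 1
        visited.add(q)
        prev = q
    return revisits
```

With this definition, both the immediate return from question 4 to question 3 and the later return from question 14 to question 3 would count as revisits of question 3.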

Skips Pattern
When answering the survey, some subjects did not follow the natural order of the questions. In fact, some subjects skipped questions and answered in an unnatural order. Figure 5 shows the questions answered over time. The user does not take a linear approach to completing the survey: after answering question 2, the subject answers from question 14 backwards through the previous questions. Once back at question 3, the subject goes again to the end and answers from question 18 down to question 15.
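As a rough illustration, deviations from the natural order could be quantified by counting the answering steps that do not lead to the immediately following question. This measure and the name `out_of_order_steps` are assumptions made for the sketch, not part of the study.

```python
def out_of_order_steps(answer_order):
    """Number of consecutive answers that do not follow the natural
    order, i.e. where the next answered question is not previous + 1."""
    return sum(
        1 for a, b in zip(answer_order, answer_order[1:]) if b != a + 1
    )
```

A subject who answers strictly in sequence obtains 0, while the backward answering described above increases the count at every step.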

Hover Pattern
In the context of the survey, a typical behavior found in certain users is hovering over multiple available options before selecting their final answer. In Figure 6, two different users are compared in their survey completion. The flow chart indicates how each user behaved by showing the options on which they kept their mouse. Each blue circle is a selectable option for the corresponding question over which the user hovered. The size of the circle is proportional to the time spent on that option.
As can be seen in Figure 6, the user on the right (b) has more hovered areas (the highlighted areas) than the user on the left (a).
Although Tobii [4] is an eye tracking system, it considers the number of fixations before fixating on the target, which is similar to what we are suggesting here. Previous studies also include hover patterns in mouse movement analysis. Katerina et al. [16] considered the number of mouse hovers that turned into mouse clicks, and Arapakis et al. [5] compared hovering over the area of interest with hovering over other areas. Huang et al. [15] also analyzed hover distributions and clicks to determine the number of search results hovered before the user clicks.

Hover Reading Pattern
During the completion of questionnaires, each question has some text on the left side, which can be read in several ways. We found two distinct patterns: some people move the mouse to the text area while reading the question, while others just move the mouse around the answers area. One example of each behavior is shown in Figures 7 and 8. In the first, it is evident that for each item the subject hovers over the text of the question before choosing an answer. This is not the case in the second, where the subject only moves the mouse around the answers.
Computing this behavior is straightforward: the survey software has a setting in which the width of the question text can be defined. Knowing that, the x mouse coordinates can be associated with the question or answers area.
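A minimal sketch of this association is given below. It assumes that the x coordinates are expressed relative to the left edge of the survey and that the question text occupies the columns from 0 up to the configured text width. The function name `text_area_share` and the parameter `text_width_px` are hypothetical.

```python
def text_area_share(xs, text_width_px):
    """Fraction of samples whose x coordinate lies in the question
    text column (assumed to span 0 <= x < text_width_px)."""
    if not xs:
        return 0.0
    in_text = sum(1 for x in xs if 0 <= x < text_width_px)
    return in_text / len(xs)
```

A high share would correspond to the first pattern (hovering over the question text), and a share near zero to the second (moving only around the answers).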

Inter Item Pattern
The distance and time taken between choosing an answer and entering the next question can differ from person to person. Time and distance are highly correlated and define the same kind of behaviors. However, some more specific patterns can be highlighted; for instance, a subject can take more time because they were moving slower, or because they were moving a lot, even if quickly. Therefore, it is important to consider these measures separately. Figure 9 presents four possible behaviors. Given that the color intensity depends on the velocity (more intense for higher speeds), a) and b) present short inter-item distances, with b) being much faster than a), while c) and d) present long inter-item distances, with d) being much faster than c). Although a) and c) have very different distances, the speed of movement is similar. The same is true of b) and d).
Fig. 9. Representation of the mouse movement in the survey context considering only the inter-item interval. The color of the line corresponds to the velocity of the movement (more intense color for higher velocity). In a) there is an example of short distance but low speed, in b) short distance and high speed, in c) long distance and low speed and in d) long distance and high speed.
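The three inter-item quantities discussed above (distance, time and speed) can be computed from the samples of one inter-item interval as sketched below. The tuple layout `(t, x, y)` and the function name `inter_item_measures` are assumptions. Mean speed is taken as path length divided by duration.

```python
import math


def inter_item_measures(samples):
    """Path length, duration and mean speed of one inter-item interval.

    samples: time-ordered (t, x, y) tuples from the answer click to the
    entry into the next question.
    """
    if len(samples) < 2:
        return 0.0, 0.0, 0.0
    distance = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(samples, samples[1:])
    )
    duration = samples[-1][0] - samples[0][0]
    speed = distance / duration if duration > 0 else 0.0
    return distance, duration, speed
```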

Long Pauses Pattern
Long pauses correspond to the mouse remaining at the same place (same x and y coordinates) for a long period of time. This can be observed in Figure 10, in which orange circles represent long pauses while answering the survey questions. The longer the pause, the larger the circle.
Multiple studies have considered the number and duration of long pauses [6,16,24].
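A simple way to detect such pauses is sketched below. The source gives no threshold for what counts as "long", so `min_duration` is left as a parameter. The optional pixel tolerance `radius` and the function name `long_pauses` are likewise assumptions.

```python
import math


def long_pauses(samples, min_duration, radius=0.0):
    """Find runs of samples that stay within `radius` pixels of the
    run's first sample for at least `min_duration` time units.

    samples: time-ordered (t, x, y) tuples.
    Returns a list of (start_time, duration, x, y) tuples.
    """
    pauses = []
    start = 0
    for i in range(1, len(samples) + 1):
        moved = i == len(samples) or math.hypot(
            samples[i][1] - samples[start][1],
            samples[i][2] - samples[start][2],
        ) > radius
        if moved:
            # The run of resting samples ends at index i - 1.
            duration = samples[i - 1][0] - samples[start][0]
            if duration >= min_duration:
                t0, x0, y0 = samples[start]
                pauses.append((t0, duration, x0, y0))
            start = i
    return pauses
```

The returned duration could be mapped to the circle size in a visualization such as the one described above.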

Straight and Curvy Pattern
Straight patterns are characterized by a direct or straight line toward a target. This pattern indicates that a target has been spotted and the subject decided to move the cursor toward it. The opposite behavior is the curvy pattern, characterized by more curved movements. Comparing Figure 11 with Figure 12, a large difference in the way the subjects move the mouse can be seen. The studies by Katerina et al. [16] and Seelye et al. [24] took these patterns into consideration, comparing more straight and more curved movements.
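One common way to express how straight a movement is, used here only as an illustrative assumption and not as the measure of the cited studies, is the ratio between the straight-line distance from start to end and the length of the path actually travelled. The function name `straightness` is hypothetical.

```python
import math


def straightness(points):
    """Ratio of start-to-end distance to travelled path length.

    points: ordered (x, y) tuples of one movement. Values close to 1
    indicate a straight movement; lower values a more curved one.
    """
    path = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    if path == 0:
        return 1.0  # no movement: treated as trivially straight
    (xs, ys), (xe, ye) = points[0], points[-1]
    return math.hypot(xe - xs, ye - ys) / path
```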

<-turn Pattern
While making a decision, the mouse movement sometimes nearly inverts its direction; this pattern has been called the <-turn. Figure 13 presents this behavior. Yamauchi et al. [25] analyzed a similar pattern considering direction changes.
Fig. 13. Representation of the <-turn pattern. The mouse movement is shown in blue and the mouse click in red.
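A near inversion of direction could be detected from the angle between consecutive movement vectors, as in the sketch below. The threshold of 150 degrees is an arbitrary assumption, as is the function name `count_sharp_turns`; the source does not define a computation for this pattern.

```python
import math


def count_sharp_turns(points, min_angle_deg=150.0):
    """Count points at which the movement direction changes by at
    least min_angle_deg degrees (180 would be a full reversal)."""
    count = 0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        if (ax == 0 and ay == 0) or (bx == 0 and by == 0):
            continue  # no movement, direction undefined
        # Angle between the incoming and outgoing movement vectors.
        angle = abs(math.degrees(
            math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        ))
        if angle >= min_angle_deg:
            count += 1
    return count
```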

Random Movements
While some movements are spontaneous and have an inner purpose, others might just be unconscious and have no specific intention. The latter are described as random movement patterns and are characterized by a large number of movements confined to a non-interest area for a short time, as shown in Figure 14. This behavior was briefly described by Arroyo et al. [6]; however, they do not present a visualization example or a way to compute random movements.

Loop Pattern
Within the category of random movements, one pattern is characterized by a turn of more than 360°, which can be defined as a loop and is shown in Figure 15. This behavior was previously considered by Seelye et al. [24], who calculated the number of looped mouse movements.
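Following the definition of a loop as a turn of more than 360 degrees, a movement segment could be checked by accumulating the signed turning angle along the path. This is only a sketch under that reading of the definition. The function names `max_cumulative_turn_deg` and `has_loop` are hypothetical, and the net turning is measured from the start of the given segment.

```python
import math


def max_cumulative_turn_deg(points):
    """Largest absolute net turning angle (in degrees) reached along
    the path, measured from the start of the segment."""
    total = 0.0
    peak = 0.0
    prev = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # skip repeated positions
        if prev is not None:
            # Signed angle between the previous and current direction.
            total += math.atan2(
                prev[0] * dy - prev[1] * dx, prev[0] * dx + prev[1] * dy
            )
            peak = max(peak, abs(total))
        prev = (dx, dy)
    return math.degrees(peak)


def has_loop(points):
    """True if the net turning exceeds a full revolution."""
    return max_cumulative_turn_deg(points) > 360.0
```

Because left and right turns cancel in the signed sum, a zigzag movement does not accumulate turning, whereas a circular movement does.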

Conclusion
This study demonstrates the use of mouse tracking measures and movement patterns in the specific context of online survey-based data collection. The survey consisted of several questions, each to be answered using a 5-point Likert response scale. Using only the mouse movements data, we show that it is possible to extract a wide range of different behaviors. The results also show the behavior patterns can easily be distinguished by mere visual inspection.
Although some of the behavioral patterns have already been reported in other studies (e.g., [6,16,24]), none were used in the context of surveys. Given that this is a completely different task situation with different task requirements, the proposed patterns require a different interpretation. This work also delivered new patterns of movement that were not reported in the previous literature, thereby contributing to the current state of the art.
It is possible to group several of these patterns according to their potential explanation. There are patterns that might be associated with personality traits, decision confidence or decision difficulty, but this awaits further investigation. For example, the overview, fast decision, skips, straight and curvy, inter-item interval and long pauses patterns could indicate personal characteristics: some users would follow the questions in an orderly and sequential manner, while others would first get an overall picture of the survey questions and then answer (overview pattern). Fast decisions could be related to confidence, and decision difficulty could be associated with the hover pattern and the <-turn.
Concerning the hover reading pattern, users who move the mouse to the question text while reading it may be less goal-oriented than those who just move the cursor directly to the next question. Whether this is so requires further investigation. If this is the case, it is also possible that the first group of users would show a higher correlation between mouse and eye movements.

Future Work
As a first step after this work, it would be interesting to create metrics that express each of the extracted patterns. Consistent with other studies, we plan to apply machine learning techniques to infer personality and states of mind from mouse movement data.
The recognition of these patterns in more complex contexts could be applied to improve the usability of websites and to adapt design and content according to user preferences.
Another application area is the clinical and ergonomics field, for example to recognize mental fatigue or even mental diseases by studying the cognitive state of the subject, given that a user's state of mind could be directly associated with a conjunction of behaviors.