Evaluating a Serious Game for Cognitive Stimulation and Assessment with Older Adults: The Sorting Sheep Game

Active Ageing, the preservation of the potential for achievement and development in old age, is a societal concern, and neuropsychological assessments are part of psychological clinical studies. Although these approaches are based on stable principles, Serious Games with designs based on neuropsychological tests can enable data collection and analysis useful for medical evaluation. This study developed and tested a game (Sorting Sheep) to be applied in cognitive stimulation with older adults while providing, at the same time, useful information about the performance of the player. Meaningful correlations were found between aspects of game performance and the standard MoCA test when the game is calibrated for a population with cognitive conditions.


Introduction
Increased longevity of the population is a worldwide phenomenon with profound implications for society. The ageing process, which is only partially related to chronological age, is a heterogeneous group of health and functional states experienced by older people and determined by a range of genetic, biologic, social, environmental, psychologic and cultural factors [17]. Although it shows significant diversity, the ageing process is characterized by a gradual impairment in many body systems and an increased risk of disease. The Central Nervous System is affected by changes in neurotransmitter levels and neuronal function, brain atrophy, and reductions in oxygenation and cerebral blood flow, among others [1]. Some deterioration in memory, information processing speed, inductive reasoning and numeric abilities, as well as impairment in motor and visual-perceptive functions, is commonly found in older persons. Yet age-associated cognitive changes are not irreversible and can be improved with adequate training [17]. Thus, cognitive stimulation is a component of "active aging", a term adopted by the World Health Organization (WHO) to promote a better quality of life and improved autonomy and independence of older people. There is evidence that regular engagement in physical and cognitive activity of moderate intensity can delay functional decline and the onset of chronic disorders in older subjects [17]. This not only stimulates neuronal plasticity [7] but also makes use of the "cognitive reserve", as additional brain regions are recruited during the task to compensate for the reduced functional capacity [2,17]. Optimizing cognitive function is an important objective since cognitive decline is associated with adverse outcomes in mental and physical health as well as in longevity [6,16].
Cognitive stimulation and monitoring of cognitive performance can be implemented with Serious Games. These belong to a class of games that simulate practical daily-life situations for professional training in critical conditions as well as for educational purposes, targeting a diversity of users [18]. The increased use of Serious Games in immersive environments and the adoption of non-conventional devices have strengthened the relation with Digital Games. The possibility of generating virtual scenarios can increase the motivation of users during the learning process.
Previous research, such as Tong et al. [15], set out to demonstrate the feasibility of cognitive assessment in elderly populations based on mobile platform games. The study examined the viability of the game in an emergency department of the hospital of the University of Toronto [14]. The results of the players were correlated with a series of standard evaluations (MMSE, MoCA and CAM). The authors stated that this was the first time a serious game had been used for cognitive assessment in an elderly population, followed by a full battery of conventional cognitive assessment methods to correlate the results. The research developed by Boletsis and McCallum [3] aims to design and develop a serious game for the cognitive health screening of the elderly, that is, to evaluate the Smartkuber game and document its development and design. The study follows a mixed methodological approach, using the In-Game Experience Questionnaire to assess players' gameplay experience and a correlation study to examine the relationship between the Smartkuber and MoCA scores (the study sample was thirteen older adults). The study shows that Smartkuber is a promising tool for cognitive health screening, providing a fun and motivating gameplay experience for older players.
The study by Manera et al. [8] aims to examine the acceptability of Kitchen and Cooking, a serious mobile platform game designed for the elderly population and developed in the context of VERVE (EU project available at http://www.verveconsortium.eu/). In this game, a list of activities is employed to evaluate and stimulate executive functions (such as planning skills). Kitchen and Cooking was used by a sample of 21 elderly people (with and without cognitive pathology) for a month. The authors concluded that Kitchen and Cooking is suited to the elderly population with or without cognitive pathology. The investigation by Robert et al. [11] analyzes the feasibility, advantages and disadvantages of using Serious Games with patients with Alzheimer's disease in order to provide practical recommendations for the development and use of Serious Games in these populations. The methodology adopted by the authors was not clear, but they concluded that Serious Games can offer very useful tools for professionals involved in the care of patients suffering from Alzheimer's.
The aim of this study was to develop and test a serious game (Sorting Sheep) for the cognitive stimulation of older adults that provides, at the same time, useful information about the cognitive performance of the player which could be used by healthcare professionals. For this purpose we aimed to establish a correspondence between performance in the game and performance on a simple standard cognitive assessment test. Thus, our main question is: which gameplay indicators and metrics could be used as a good proxy for (a correlate of) the evaluation obtained with a rapid screening instrument such as MoCA?
In the second section of this study, we present the serious game proposal and its functioning, with a comparison to a traditional method of cognitive assessment. Next we present the methodology and details of the samples used in the research. The analysis of results and the reception of the game proposal are outlined in the fourth section. The fifth section discusses the points relevant to the research. Finally, we present the conclusions and future work.

The Sorting Sheep Design Proposal
Since a significant proportion of older subjects have mobility problems, this game was developed for mobile touch technology such as tablets. Thus, the user can have access to this game anywhere, including hospitals, nursing homes or their own residence [12,13]. In this game (Fig. 1) the objective is to separate all sheep according to their colour into the corresponding zone or field. Thus, black sheep should remain on the right-hand side whereas white sheep must be kept on the left-hand side. To achieve this goal, the player uses a finger to open the Gate (touch) and move it (drag), allowing the sheep to cross to the correct side. The game has the following mechanics: 1) the scenario consists of two fields separated by a gate, and sheep can move across the fields through the gate; 2) white and black sheep are randomly generated in the fields; 3) sheep move randomly across the fields without any intervention from the user; 4) after the goal of separating black from white sheep is achieved, the difficulty level rises with the generation of one additional sheep of each colour. Sorting Sheep was culturally calibrated to the target population (with a rural scenario and familiar animals), which is important to obtain reliable assessments. A set of gameplay logging variables is recorded, as described in Table 1.
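The four mechanics listed above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the implementation used in the study: the names `Sheep`, `spawn_flock`, `is_sorted` and `next_level` are hypothetical, and the number of sheep at the first level is left as a parameter because the text does not state it.

```python
import random
from dataclasses import dataclass


@dataclass
class Sheep:
    colour: str  # "white" or "black"
    side: str    # "left" or "right" field


# Target field for each colour, as described in the text:
# white sheep on the left-hand side, black sheep on the right-hand side.
TARGET_SIDE = {"white": "left", "black": "right"}


def spawn_flock(per_colour, rng):
    """Generate `per_colour` sheep of each colour in randomly chosen fields."""
    return [
        Sheep(colour, rng.choice(["left", "right"]))
        for colour in ("white", "black")
        for _ in range(per_colour)
    ]


def is_sorted(flock):
    """A level is complete when every sheep is in the field of its colour."""
    return all(sheep.side == TARGET_SIDE[sheep.colour] for sheep in flock)


def next_level(level, per_colour):
    """Completing a level adds one additional sheep of each colour."""
    return level + 1, per_colour + 1


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    flock = spawn_flock(per_colour=1, rng=rng)  # starting size is an assumption
    print(len(flock), is_sorted(flock))
```

The random movement of the sheep and the touch-and-drag gate are left out on purpose; the sketch only captures the sorting goal and the difficulty progression.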

The Montreal Cognitive Assessment (MoCA)
This is a brief screening test for cognitive function which can discriminate between normal cognition and cognitive impairment in older adults [9]. This test has a one-page protocol and takes 10 to 20 minutes to apply. It has no adaptations for education level or sensorial impairment, and it lacks a ludic component. It is not particularly sensitive in assessing some cognitive domains. Executive functions are assessed by part B of the Trail Making Test (the subject has to link letters and numbers in alternate order); Phonemic Verbal Fluency (also included in Language); and Verbal Abstraction (also included in Abstraction). Visuo-spatial assessment consists of the Clock Drawing Test (circle, numbers and hands) and copying a cube. "Attention, Concentration and Working Memory" are assessed with the digit span test in direct and reverse orders. A Sustained Attention task (target detection) consists of identifying the letter "A" during the pronunciation of a series of letters. Finally, in the Serial Sevens subtractions the subject has to consecutively subtract 7 starting from 100. Language tasks consist of naming 3 animals, repetition of 2 sentences with complex syntax, and phonemic verbal fluency, in which the subject is asked to generate as many words starting with "p" as possible (excluding proper names). For verbal abstraction the subject must think of and verbalize the similarity between two objects (e.g. banana and orange being fruits). Delayed memory recall is tested 5 minutes after retention of 5 words (short term memory). The subject has two trials to recall the words after completing other tasks (attention, language and abstraction). The Orientation domain is assessed by questioning the subject about the time (e.g. date, month) and location. Each task has a score as follows: Executive Function/Visuo-spatial (5 points); Naming (3 points); Attention (6 points); Abstraction (2 points); Memory (5 points) and Orientation (6 points).
The sum of the individual scores provides a total score (max. 30 points) which can be compared with standardized values according to age and educational level. From Table 2 we would expect, as our initial conjecture, to find an empirical relation between performance in the current Sorting Sheep game and a cognitive assessment, especially in the visuo-spatial, attention-concentration and orientation domains.
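As a worked example of the scoring rule, the total can be computed as a checked sum of the domain scores. The sketch below is hypothetical: the function name `moca_total` and the dictionary keys are illustrative. The domain maxima are those listed above, except for Language, which the list does not give a value for; it is assumed here to be worth 3 points, as on the standard MoCA form, so that the maximum reaches the stated 30 points.

```python
# Maximum points per domain. All values come from the list in the text,
# except "Language", which is an assumption based on the standard MoCA form.
MOCA_MAX = {
    "Executive/Visuo-spatial": 5,
    "Naming": 3,
    "Attention": 6,
    "Language": 3,  # assumption: not stated in the text
    "Abstraction": 2,
    "Memory": 5,
    "Orientation": 6,
}


def moca_total(scores):
    """Sum the domain scores after checking that each one is within range."""
    missing = set(MOCA_MAX) - set(scores)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    for domain, value in scores.items():
        if domain not in MOCA_MAX:
            raise ValueError(f"unknown domain: {domain}")
        if not 0 <= value <= MOCA_MAX[domain]:
            raise ValueError(f"{domain} score out of range: {value}")
    return sum(scores.values())


if __name__ == "__main__":
    # Hypothetical subject scoring the maximum in every domain.
    print(moca_total(dict(MOCA_MAX)))
```

The comparison of the total with normative values by age and educational level is not modelled here, since the normative tables are not reproduced in the text.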

Procedure and Population
The study was approved in advance by the University Hospital Ethics Commission. Concurrent validity for cognitive domains was tested against the Montreal Cognitive Assessment (MoCA). The game was tested with two groups with distinct profiles. Group 1 was composed of subjects with high cognitive performance attending a "Cultural Academy" (Academia de Convívio e Cultura da Casa Cor de Rosa) and a "Senior University" (Universidade Sênior Nova Acrópole). Group 2 consisted of subjects with cognitive impairment recruited in the Old Age Psychiatric Unit of Centro Hospital Universitário de Coimbra. All subjects with age ≥ 50 years were eligible to enter the study. The final sample consisted of 55 subjects in Group 1 (Gr.1) and 54 subjects in Group 2 (Gr.2). Although a high proportion of participants in Gr.1 used internet-based mobile services, the majority were not familiar with digital games on mobile devices. Gr.2 was characterized by low internet usage and no knowledge of digital games (Table 3).
After explaining the purpose and details of the study to each subject and obtaining informed consent, the research was conducted with the following steps: 1. Baseline cognitive assessment (MoCA); 2. Demonstration of the game (rules and how to play); 3. Game play during 10 minutes; 4. Post-test questionnaire.

Results

Cognitive Assessment and Game Performance
The two groups differed with respect to baseline cognitive performance in all assessed cognitive domains (Table 4). The proportion of subjects completing each level correlated negatively with level difficulty; game performance was higher in Gr.1 (Table 5). In Group 1 the variable Highest Level correlated positively with "Language" and MoCA total score, suggesting that comprehension of the game rules and global cognitive function are important in achieving higher game performance (Table 6) (C: correlation; S: significance; N: sample). Language correlated negatively with the time spent in level 1, with the presence of 3 extreme outliers (Figure 2). In Gr.1, Memory, Language and Total MoCA score correlated with Completed Level 2, 3 and 4.
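A correlation between a game variable such as Highest Level and a MoCA score can be computed as in the sketch below. The text does not say which coefficient was used, so the choice of Spearman's rank correlation here is an assumption, made because level counts and test scores are ordinal and because rank-based measures are less sensitive to extreme outliers. The function names and the example values are hypothetical and are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical toy values, for illustration only.
    highest_level = [1, 2, 2, 3, 4]
    moca_score = [14, 19, 17, 24, 28]
    print(round(spearman(highest_level, moca_score), 3))
```

A significance value (the S column) would additionally require a test statistic or a permutation test, which is left out of this sketch.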
There was a break in the linearity of the time spent by the player when completing the levels. This was caused by the random and independent movement of 6 sheep changing speed and direction every 20 seconds. In level 3 one or more sheep do not reach the gate before their speed or direction changes, inducing an increased player-independent waiting time. This game design problem explains why the correlation between the time measure and completed level disappeared in level 4, i.e., Time on Level 3 ("Player 9"), Time on Level 4 ("Player 18" and "Player 32") and Time on Level 5 ("Player 19"). These are players with an absence of cognitive impairment according to the MoCA normative data of Freitas et al. [4]. In Gr.2 (Table 6) Maximum Level correlated with the performance in most cognitive domains assessed by MoCA except "abstraction". The strength of these correlations increases in higher levels, suggesting that cognitive performance is increasingly important as the difficulty increases. In Level 1 there was no correlation, suggesting that this level is a stage of learning. "Orientation" is recruited during this level, as the learning process includes spatial orientation of the player within the game. The decreased strength of correlations with cognitive performance observed in Level 4 might be explained by the loss of game linearity with excessive waiting time (Fig. 3).
As in Gr.1, subjects in Gr.2 with higher cognitive performance spent less time completing the game stages.

Feedback from Users
Overall, participants could understand the rules and how to play the game (100% in Gr.1 vs. 92.6% in Gr.2). The majority of participants in Gr.1 and Gr.2 (87.3% and 72.2% respectively) considered the game a valid instrument to be used daily at home and did not find it difficult to identify the animals, gate and game buttons (100% vs. 94.4%). For 52.7% of players in Gr.1 and 61.1% in Gr.2 the sound during the game did not affect their performance, while other participants (34.5% vs. 25.9%) reported increased motivation with sound and some (5.5% vs. 13%) considered that a familiar sound was an empathic element in the game. Increasing the speed of the sheep (Gr.1) and replacing sheep with horses (Gr.2) were suggested, although the majority of participants (89.1% vs. 98.1%) did not suggest any changes. Most subjects preferred to play with a touch screen pen (63.6% vs. 88.9%).

Discussion
For Gr.1, the Sorting Sheep game did not correlate directly with the cognitive domains anticipated in the game design (Table 2). However, some domains do not work in isolation but together, for example Language with Attention or Executive Function with Orientation.
Analyzing the results of the game for Gr.2, we can conclude that the game was better calibrated for this group: the correlations predicted in the game development process were found, corresponding to the expectations of the game design (Table 2).
In the context of cognitive assessment, Table 9 shows the performance profile of Gr.1 in the game and makes it possible to evaluate the performance of the sample players. The table was assembled with "Highest Level", the most efficient variable for evaluating player performance, together with profile variables such as "Age" (an important variable for indicating cognitive decline) and "Education level" of the player (the second important variable for indicating cognitive decline) [4]. Similarly, Table 10 shows the Gr.2 profile in the game.
The results indicate that the game environment did not present usability difficulties for the target population. There was encouragement and evaluation of the players, in particular the Gr.2 players. As a tool to help health professionals interpret players' results, normative tables of player performance have been constructed (Table 9 and Table 10). The results of the investigation also allowed us to conclude that the game presents a challenge better calibrated for Gr.2 (Hospital) than for Gr.1 (Cultural Association/Senior University). Gr.2 contains players with cognitive limitations (some with already diagnosed cognitive pathology), whereas Gr.1 contains people without diagnosed cognitive pathologies. Evidence for this calibration with Gr.2 is the number of correlations between the MoCA and some of the gameplay variables, which was considerably higher in Gr.2 than in Gr.1. When we put both sample groups together, still not a random sample, we gain a perspective from which we can notice a promising trend line correlating performance in the MoCA test with in-game performance (Highest Level Achieved), thus justifying the need for and usefulness of further investment in testing this approach (Figure 4).
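A trend line of this kind can be obtained with an ordinary least-squares fit. The sketch below is only an illustration: the text does not state how the line in Figure 4 was fitted, so least squares is an assumption, the function name `fit_line` is hypothetical, and the example values are invented toy numbers rather than data from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical toy values: MoCA total score vs. highest level achieved.
    moca_score = [12, 16, 20, 24, 28]
    highest_level = [1, 1, 2, 3, 4]
    slope, intercept = fit_line(moca_score, highest_level)
    print(f"highest_level ~ {slope:.3f} * moca + {intercept:.3f}")
```

A positive slope would correspond to the trend described above, with higher MoCA scores associated with higher levels achieved.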

Conclusions
The objective of this research was to design and test a serious game to be used for cognitive stimulation and assessment with older adults (with and without cognitive impairment) using a mobile device.
The results suggest that this game is better calibrated to users with the Group 2 profile (clinical sample). This is supported by the large number of correlations between MoCA variables (Executive Function, Attention, Orientation, Total Score) and game variables (Highest Level) in this sub-sample. In contrast, a ceiling effect occurred in Group 1, in which a higher level of difficulty was not associated with the performance in several cognitive domains. Thus, with healthy subjects, game performance was, in addition to age, only dependent on the capacity to understand the rules. Additionally, the time for level completion did not correlate with level difficulty. Thus, time was not useful to determine the player performance.
From the general analysis of the game and results, it is possible to conclude that the game in its current conceptual form seems adequate to be used as an evaluation instrument, when compared with a screening test, with an elderly population (with or without cognitive pathology). However, as a device for cognitive exercise, it can be quite demotivating for cognitively healthy people because, as mentioned, the current game design does not appear to be challenging for them.
Thus, this research provided a case study in the development of Serious Games for cognitive screening which can be used independently and repeatedly by players as cognitive exercise. This reinforces the idea that games are relevant tools that can be used as a stimulus supplement (exercises) and for assessment of the players, and that they may offer useful information to the professionals involved in the care of the patients.

Table 1. Variables of the Game Sorting Sheep.
Time Nx (x = 1 to 4): Time in seconds to complete the respective level. If a level is not completed this variable is empty.
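The Time Nx variable can be represented as an optional value per level, as in the following sketch. The class `GameLog`, its field names and the `highest_level` property are hypothetical and only illustrate the rule that an uncompleted level leaves the variable empty; deriving Highest Level from the completed levels is an assumption, since the full variable definitions are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GameLog:
    player_id: str
    # Time Nx: seconds to complete level x (x = 1 to 4).
    # None stands for "empty", i.e. the level was not completed.
    time_per_level: Dict[int, Optional[float]] = field(
        default_factory=lambda: {x: None for x in range(1, 5)}
    )

    def complete_level(self, level: int, seconds: float) -> None:
        """Record the completion time of a level."""
        if level not in self.time_per_level:
            raise ValueError(f"unknown level: {level}")
        if seconds < 0:
            raise ValueError("time must be non-negative")
        self.time_per_level[level] = seconds

    @property
    def highest_level(self) -> int:
        """Highest completed level, or 0 if none (assumed definition)."""
        done = [lvl for lvl, t in self.time_per_level.items() if t is not None]
        return max(done, default=0)


if __name__ == "__main__":
    log = GameLog("player-01")  # hypothetical identifier
    log.complete_level(1, 42.0)
    log.complete_level(2, 75.5)
    print(log.highest_level, log.time_per_level[3])
```

Keeping uncompleted levels as empty values rather than zero avoids treating them as very fast completions in any later analysis of times.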

Table 2 presents the expected overlap of cognitive competences, in a comparison between the cognitive domains assessed by MoCA and the demands of the Sorting Sheep game, from the perspective of a trained Psychiatrist. The level of overlap is classified as strong (+++), moderate (++), weak (+) and none (-). We wanted to gather empirical data to corroborate, or not, this initial assessment.

Table 5. Performance in Game Group 1 and 2.

Table 8. Reception of the Game by Groups 1 and 2.

Table 9. Evaluation of the players of the sample Gr.1.

Table 10. Evaluation of the players of the sample Gr.2.