Big Data Analytics as Input for Problem Definition and Idea Generation in Technological Design

Big data analytics enables organizations to process massive amounts of data in less time and with more understanding than ever before. Many uses have been found for these tools and techniques, especially in decision making. However, few applications have been found in the first stages of innovation, namely problem definition and idea generation. This paper discusses how big data analytics can be utilized in those stages. It includes an example of application in problem definition and proposes a case study implementation in a higher education setting for idea generation.


Introduction
The current economy's fast-paced product development cycle has led companies to decrease the time spent in all stages of new product development. Even before this change, companies spent proportionally little time on idea generation compared to the time spent on technical development and testing. Little by little, companies are realizing the need for and the power of good ideas, and are asking employees to dedicate more time and resources to the first stages of the new product development process, namely the identification of the opportunity or problem statement, information gathering, and idea generation.
To create new ideas, the individual must form new combinations of knowledge he or she already possesses ([1], [2]). However, it has been found that participants will gravitate towards known solutions [3] and that popular ideas are constantly recombined [1]. To produce a radical result, the ideator needs to make highly varying ("wild") combinations [1]. It is therefore necessary to find ways to promote wild combinations.
In previous literature, authors have discussed options to manage ideas in a product development process, designing collaboration platforms and software to facilitate the documentation and exchange of ideas. But with new information technologies, it is possible to benefit from the wealth of data we are able to collect and process. Data can enable organizations to find insights related to their processes, clients and market.
This article discusses the use of big data analytics for problem definition and idea generation. It includes a case where big data analytics was used to identify problems and a proposed use of readily available analytics tools to facilitate idea generation.

Idea generation sessions
Idea generation is the fundamental step of the innovation process and, more importantly, it "is central to engineering design" [4]. Participants from different domains or areas of expertise can work together during idea generation (ideation) sessions, to exchange and create knowledge, usually for a specific aim.
The purpose of ideation sessions is to set an environment and implement creativity techniques that will help participants produce, express and combine ideas. Another advantage of idea generation sessions is that the ideas of others sometimes trigger the creation of related or new ideas [5].
Ideation sessions are an interesting example to explore creativity support systems because of their unique characteristics: a defined purpose, limited time, multidisciplinary teams and willingness to create knowledge [6]. While there is not one generally agreed process for idea generation sessions, Shneiderman et al. [7] propose the following phases, found in recent literature and commonly accepted for new product development cycles (Figure 1).
There are many opportunities to improve the process of idea generation: sharing more ideas, providing feedback and decreasing the time it takes for the team to develop ideas into concepts.
Based on the process for idea generation sessions by Shneiderman et al. [7] and the examples found in the extant literature, we categorized the use of information and identified how big data analytics can be used as a tool to help teams. It can be used in four phases of the process: to identify areas of opportunity (need identification), as input for inspiration (information gathering), to identify unrelated ideas to combine in new concepts, and to obtain insight from a large number of ideas from a crowdsourcing effort (evaluation). For this work, the focus lies on the first stages, highlighted in Figure 1.

Big data analytics
People collaborate in many different ways: by sending emails with attachments, sharing documents on the cloud, talking over the phone, and exchanging messages. Information systems allow those communications to occur and document the exchanges. All the data generated and collected in an organization is a source of untapped knowledge that can lead to inventive designs of new products and services if analyzed using powerful tools.
Big data is characterized not only by the speed of generation (velocity), but also the different types of data that must be analyzed (variety) and the massive amount of data being collected (volume) (Gartner's Laney, 2001, in [9]). To those characteristics, more recent authors have appended the dimensions of veracity [10], meaning how reliable information is, and value [10], which considers the impact the data can have on the organization when analyzed.
Big data analytics enables organizations to analyze their data in a way that was not possible before, by bringing together different sources of information and finding trends that are only visible with large amounts of data. This will make it easier to visualize the gaps in a domain [8]. The use of big data analytics will depend on the availability of the tools required to perform the analysis, and the characteristics (e.g. duration, number of participants, access to external sources of data) and aim of the idea generation session.

Problem definition / need identification
Müller et al. [8] created software to support the identification of unexplored research areas through data attributes and visualizations. They propose that information (data) can be used to guide researchers to new, unexplored paths. They theorize that data can be examined iteratively for "divergent and convergent thinking" to generate new hypotheses [8].
In this same spirit, data from various sources can be collected and exploited to find areas of opportunity for an organization. It is possible to find new applications or markets for the products and services, or even expertise already possessed. For researchers, it can signal new areas to explore. For artists and creators, it can find previously unthinkable combinations. Figure 2 depicts the flow of information to use big data analytics for problem or need identification.

Idea generation
Information inputs can help bolster the creativity of participants to generate ideas given that "creative thinking involves a process of iterative activation of 'cues'" [11]; furthermore, the likelihood of creating new knowledge from recombination is greater as the number of external inspirations increases (Cohen & Levinthal, 1989, in [9]). Several works discuss the use of information as input for creativity:
-In [11] to support music composers through cues and suggestions.
-In [12] to support brainstorming by recommending computer generated "ideas" (extractions from three databases).
-In [13] to support the generation of alternative ideas using data prompting.
-In [14] to complement the idea generation process by using aggregated data.
The examples listed demonstrate that there is an interest in enhancing idea generation through the use of information. However, the risk is that the material selected to form the knowledge base will already be biased towards a known solution. By using big data analytics, the information can reveal trends and connections that were previously unseen. This effect can potentially be amplified when extracting data from unrelated or complementary knowledge domains to promote new combinations. Figure 3 depicts the flow of information to use big data analytics for idea generation.

Application in a higher education setting
Big data analytics in the context of a complete new product development process can be used as a support for participants to identify problems and generate ideas. To test this hypothesis, the authors designed three case studies to be performed sequentially. This will make it possible to study the impact of its use throughout the whole creative process. The three case studies will take place in several higher education environments:
-Problem definition: big data analytics will be used to define the challenges to be solved in subsequent activities. This case study has been completed and is presented in section 4.1.
-Information gathering: the authors will build a knowledge base to be provided to participants of an innovation competition. This proposed case study is presented in section 4.2.
-Idea generation: during a month-long intensive master course on innovation, the authors will provide students with access to big data analytics tools. This proposed case study is presented in section 4.3.
Evaluation criteria. The creative process is measured by different metrics depending on the authors, for example:
-Applicability of concepts [15]
-Complexity level of concepts [15]
-Detail of concepts [16]
-Novelty of concepts [4], [15], [16]
-Number of characters of a conclusion [17]
-Number of chats [17], [18]
-Number of comments [15]
-Number of ideas [4], [15], [16], [17], [19], [20], [21], [22]
-Number of ideas evaluated [15]
-Number of ideas shared [19]
-Number of participants [18]
-Number of record cards / sticky notes [18], [23]
-Number of whiteboard events [23]
-Perceived team cohesiveness / effort [19]
-Quality of concepts / Ideas accepted [4], [16], [20], [22]
-Time [17], [18], [19], [23]
-Variety of concepts [4], [16]
Given that there is currently no method to objectively measure the quality of an idea, this criterion will not be considered. Other metrics, such as the number of characters in a description, do not seem relevant to assess the impact of big data analytics for problem definition, information gathering or idea generation. It is also assumed that the concepts will be applicable to the problem at hand. Consequently, the focus will be on these four metrics: comments (feedback from the participants), complexity, ideas shared, and variety of ideas (to be assessed by domain experts).
We believe that the use of big data analytics as input for creativity will provide participants with hints to novel associations that may result in ideas of greater complexity that differ from current or competing solutions. We expect to obtain positive feedback from participants regarding the use of data as input for the session and as a support to merge concepts and find innovative solutions.
The following sections describe the application performed to support one of the organizations in finding the challenges to propose to an innovation competition, a proposal to apply big data analytics for information gathering for participants of the innovation competition, and the proposed approach to support idea generation for solution design during a summer school.

Problem definition / need identification
In order to apply big data analytics for problem definition, the researchers worked with one of the organizations that will propose challenges for both the competition and the summer school. Their objective is to work with challenges related to river water quality and conservation. Since the problems to be solved were not defined, a creativity session was held to identify areas of opportunity.
The session took place on the 30th of March, 2016 at the École de technologie supérieure in Montreal. The whole community was invited to participate through the weekly bulletin board; 18 participants registered, and 15 attended the session.
Input data. As discussed before, there is an enormous wealth of external and publicly available data that can be utilized. However, there is a difficulty in selecting relevant and valuable data, and cleaning the information to make it usable for the purpose. In this case, because the aim was to identify problems related to rivers, freshwater and water conservation which can potentially be solved by a technological solution or a data analysis solution, the data selected to be used as input are patents. Patents offer the advantage of having pre-defined sections, describing a problem and the solution.
To perform the data analysis, patents from Patbase which include keywords such as "freshwater" and "data analysis" + "river" were extracted.
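The keyword selection described above can be pictured with a minimal sketch. It does not reproduce the Patbase query interface, which is not described here. The `PatentRecord` type, its fields, and the reading of the query as ("freshwater") OR ("data analysis" AND "river") are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PatentRecord:
    """Hypothetical minimal patent record; a real export carries many more fields."""
    title: str
    abstract: str


def matches_query(record: PatentRecord) -> bool:
    """Assumed reading of the query: "freshwater", or "data analysis" together with "river"."""
    text = f"{record.title} {record.abstract}".lower()
    return "freshwater" in text or ("data analysis" in text and "river" in text)


def select_patents(records: list[PatentRecord]) -> list[PatentRecord]:
    """Keep only the records that satisfy the assumed keyword query."""
    return [record for record in records if matches_query(record)]
```

A plain substring match is the simplest possible choice; a patent database would normally apply its own indexing and query syntax.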
Work session. It is important to set objectives and to provide participants with a sense of progress. To ensure that the purpose of the work session was achieved, a series of activities was planned for participants to follow (Table 1). After a brief welcome and explanation of the purpose of the session, participants were first asked to identify all the elements of the problem (stakeholders, inputs, outputs). The second step was to relate all the elements and identify which ones cause problems. The boards pictured in Figure 4 demonstrate the different approaches teams took to identify the elements of river issues. Each group of participants proceeded to identify key issues (examples can be seen in Figure 5). The purpose of this activity was to clearly state the key issues.
For the following phase of the work session, the teams were provided with access to a big data analytics tool pre-loaded with freshwater and river related patents. The software used to analyze the data is IPMetrix [24], from the company TecKnowMetrix, which provides semantic analysis and cartographies of the information. In this session, the purpose was to use big data analytics as an information input to trigger new relations. Participants had time to explore the different concepts in the visualizations and selected various concepts to combine with their previously identified issues. The objective was to provide participants with new concepts that could work as prompts to open new fields of possibility, by considering the materials, measures, technologies or concepts in the mapped domain.
Table 2 is a comparison of the results from each group of participants before and after the access to external data. Participants mentioned that they were able to identify links because of their previous knowledge, reinforcing the notion that the use of data as input can trigger the exploration of different directions.
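The step in which participants combined concepts from the visualizations with their previously identified issues can be sketched as a simple pairing exercise. This is an illustration only, not a feature of IPMetrix: the function name `build_prompts` and the wording of the prompt are hypothetical, and the issues and concepts are supplied by the caller.

```python
from itertools import product


def build_prompts(issues: list[str], concepts: list[str]) -> list[str]:
    """Pair every identified issue with every selected concept as a question prompt."""
    return [
        f"How could '{concept}' help address '{issue}'?"
        for issue, concept in product(issues, concepts)
    ]
```

Pairing every issue with every concept deliberately includes unlikely combinations, in line with the aim of prompting new relations rather than filtering them in advance.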

Information gathering
For the competition, a worldwide event called "Les 24 heures de l'innovation", organizations propose a challenge, and participants have 24 hours to work on a solution. At the end of the 24 hours, the best solutions are awarded a prize. The competition takes place in over 40 sites in 20 countries around the world, at the same time. The main site of the competition is the École de technologie supérieure in Montreal. All students will be given access to the information gathered, to give them insights into the challenges proposed.
Measuring the impact. The objective of providing participants with data is to improve the novelty and originality of the solutions proposed. To measure the effect, a comparison will be made in the evaluation grid scores for innovativeness given to the winning solutions from this edition (compared to previous editions).
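The comparison between editions could be computed, in its simplest form, as a difference of mean innovativeness scores. The sketch below assumes that the evaluation grid yields one numeric innovativeness score per winning solution; the function name and the choice of the mean as summary statistic are assumptions, not the authors' stated method.

```python
from statistics import mean


def innovativeness_change(previous_scores: list[float], current_scores: list[float]) -> float:
    """Difference between the mean innovativeness score of this edition and of previous ones.

    A positive value means the current edition scored higher on average.
    """
    if not previous_scores or not current_scores:
        raise ValueError("both editions need at least one score")
    return mean(current_scores) - mean(previous_scores)
```

With only a few winning solutions per edition, such a difference is indicative at best and would need to be read alongside the qualitative feedback.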

Idea generation
The next experiment will take place in the month of July, during the "ÉTS Internationals Summer School on innovation and technological design". In total, fifty engineering students will take part in the course, where they learn about the innovation process and creativity techniques, and work on a team project solving one of the challenges. The objective is to arrive at a functional prototype. Students will be placed in one of the 6 project teams. The teams will select one challenge to solve during the course, and will be guided by professors on the technical side and on the creativity and innovation approach.
The students will have the possibility to implement different idea generation tools and techniques. For each, they will have a workshop where they will use the tool or technique and apply it to the problem they are trying to solve. The ultimate goal of providing participants with creative tools and techniques is to arrive at an original solution for the challenge (problem) to be solved.
In addition to the aforementioned tools and techniques, one course will be taught where students will learn to explore data to find hints for solutions.
Tracking the results. Because students will employ different tools and techniques, we need to compare the results of the application of each. To do so, students will be required to carry an "idea journal". In this journal, each group must document the ideas generated during each workshop. Ideas can be documented using brain-maps, lists, drawings, sketches or photographs to represent the work achieved with the tool / technique.
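One way to compare techniques from the journals is to record each documented idea with the team and the technique that produced it, then count ideas per technique. The record layout below (`JournalEntry` and its fields) is a hypothetical structure for illustration; the journals described above may equally be kept on paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    """Hypothetical digital record of one idea documented in an idea journal."""
    team: str
    technique: str
    idea: str


def ideas_per_technique(entries: list[JournalEntry]) -> Counter:
    """Count how many documented ideas each tool or technique produced."""
    return Counter(entry.technique for entry in entries)
```

The count covers only the number of ideas; complexity and variety would still require assessment by domain experts.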

Discussion and conclusion
Solving problems and creating good ideas for new products, services and technologies are too important to rely only on the human capacity to create and collaborate. Great inventions are built on the vast knowledge that was created before us. However, we live in an age where there is too much information for humans to absorb. There is a latent need to make sense of all the data generated every day. In this data lie clues for exciting combinations and hints to better solutions.
The purpose of using big data analytics in an idea generation context is precisely that of taking advantage of the wealth of knowledge available through the application of information technologies. The data by itself does not generate value; it is the participants making sense of it and making new connections who can create value. This paper presented an example of how including data in a problem-identification process can spark new combinations to explore different directions. Future work is expected to provide insights into the use of big data analytics for idea generation with the purpose of designing novel solutions.