Distribution Visualization for User Behavior Analysis on LTE Network

To provide high-quality communication services seamlessly, mobile network operators (MNOs) strive to respond promptly when communication quality degrades. However, MNOs have difficulty detecting degradation that occurs without any error messages or nonconformity. As a first step of this study, we implemented a Self-Organizing Map (SOM)-based visualization system that analyzes users' behavior in the evolved packet core (EPC) based on state transitions estimated from captured LTE C-Plane signals. We present a case study that analyzes actual LTE signals with the implemented system and demonstrates that unexpected characteristics of users' behavior can be seen intuitively from the results.


Introduction
Mobile network operators (MNOs) are responsible for providing high-quality communication services, so monitoring communication quality is very important to them. In particular, MNOs strive to detect degradation of communication quality immediately when an incident occurs. The existing approaches are generally either log-based or conformance-based. In log-based approaches, a system monitors the messages and system logs of equipment in the LTE network [1] and detects hardware errors and link errors. In conformance-based approaches, on the other hand, a system detects unfamiliar message sequences by referring to the specifications of the 3GPP standard. However, communication quality can also degrade without any error messages or nonconformity. For instance, ping-pong handover is a common phenomenon in mobile networks that causes inefficient network performance and degrades communication quality [2,3]. When a User Equipment (UE) that is connected to one evolved Node B (eNB) moves close to the boundary between multiple eNBs, it is handed over from that eNB to another and then often immediately connects back to the former eNB. If the UE stays around the boundary, it is sometimes handed over repeatedly between these eNBs. In this situation, there are no evolved packet core (EPC) equipment errors. Nevertheless, the phenomenon generates unnecessary control messages in the EPC and degrades communication quality. In such a case, MNOs can hardly detect the degradation unless customers report the problem to them.
As a first step toward detecting degradation that occurs without any errors, MNOs have to know how users behave in the EPC. To preserve generality, the analysis of users' behavior should be exhaustive and comprehensive. However, since the EPC signals exchanged over the various interfaces between function nodes are a mixture of different protocols and IDs, it is difficult to trace users' behavior sequentially.
In this paper, we report a preliminary implementation of a system that captures and analyzes C-Plane signals in the EPC, quantifies users' behavior, and visualizes the distribution of users' behavior. We then present a case study with actual C-Plane signals and a typical example of a cluster of users' behavior in a degraded situation.

Related Works
There exist several studies on the analysis of users' behavior in mobile networks. In [4], the authors analyze signaling storms based on the radio resource control (RRC) protocol. To detect anomalous and malicious user behavior that causes signaling storms, they first model and analyze the patterns of signals in the RRC protocol and then identify the specific patterns. In [5], the authors focus on retrieving radio access information from the S1-MME and S11 interfaces. As an example, they summarize how the duration of radio access bearer establishment changes over time.
To the best of our knowledge, no existing study analyzes or visualizes users' behavior in the EPC. Therefore, as a first step of our research, we visualize users' behavior based on C-Plane signals in the EPC.

Implementation
Figure 1 LTE architecture.
Figure 2 System architecture.
In order to analyze users' behavior in the EPC, we implement a distribution visualization system in an actual LTE network, which is standardized by 3GPP [6]. Figure 1 briefly depicts the LTE architecture with regard to C-Plane signals. In our implementation, we focus on the signals through the S1-MME, S10 and S11 interfaces. They are a mixture of the S1 application protocol (S1AP) and the evolved general packet radio service (GPRS) tunneling protocol for the control plane (GTPv2-C). Figure 2 shows the architecture of our implementation. First, the capture server captures signals. Second, the signal analyzer extracts users' state transitions from the capture files. Third, the statistics monitor quantifies users' behavior based on these state transitions. Finally, the distribution visualizer draws the distribution of users' behavior using a self-organizing map (SOM).

Capture of C-Plane signals and signal analysis
The process in our implementation starts with the capture of the signals. The implemented system groups the signals by user and constructs a signal sequence for each user. After that, it extracts specific patterns from the signal sequences. Note that the implemented system does not identify specific users; it can only distinguish users by a temporarily assigned identifier. Since the temporary identifier is valid only for a certain duration, the implemented system can trace a user's behavior for a short time but cannot follow any specific user for a long time, e.g. several hours or longer. Based on the signal sequences, the system constructs a state transition graph. The state transition graph consists of five elements: the input, the states, the transitions between states, the initial state and the final state. The input is specific patterns of signaling messages extracted from S1AP and GTPv2-C signals. The states are defined according to the 3GPP standard, and the next state is determined by the combination of the current state and the input. In the system, the initial and final states of the transitions are ignored since, in actual LTE networks, the initial state should always be the same state and the final state cannot be defined naturally. Tables 1 and 2 show the list of states and examples of state transitions, respectively.

Table 2 Examples of state transition
Current state | Next state | Message | Procedure code | Direction | Outcome
S1 HO in progress | S1 HO succeeded | HANDOVER NOTIFY | 2 | UL |
S1 HO initiated | S1 HO allocation | HANDOVER REQUEST | 1 | DL |
S1 HO initiated | S1 HO cancelled | HANDOVER CANCEL ACKNOWLEDGE | 4 | DL |
S1 HO initiated | S1 HO preparation failed | HANDOVER PREPARATION FAILURE | 0 | DL | UNSUCCESS
S1 HO cancelled | S1 HO initiated | HANDOVER REQUIRED | 0 | UL |
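The grouping and replay steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implemented system: the record layout (time, temporary identifier, message name), the function name extract_transitions and the choice of initial state are hypothetical, and the transition table contains only the five examples of Table 2.

```python
from collections import defaultdict

# Assumed transition table: (current state, input message) -> next state.
# Only the five example rows of Table 2 are included; a real table is larger.
TRANSITIONS = {
    ("S1 HO in progress", "HANDOVER NOTIFY"): "S1 HO succeeded",
    ("S1 HO initiated", "HANDOVER REQUEST"): "S1 HO allocation",
    ("S1 HO initiated", "HANDOVER CANCEL ACKNOWLEDGE"): "S1 HO cancelled",
    ("S1 HO initiated", "HANDOVER PREPARATION FAILURE"): "S1 HO preparation failed",
    ("S1 HO cancelled", "HANDOVER REQUIRED"): "S1 HO initiated",
}


def extract_transitions(signals, initial_state):
    """Group (time, temp_id, message) records by temporary ID and replay them.

    Returns {temp_id: [(time, from_state, to_state), ...]}, where time is the
    capture time of the message that triggered the transition.
    """
    by_user = defaultdict(list)
    for time, temp_id, message in sorted(signals):  # time-ordered replay
        by_user[temp_id].append((time, message))

    result = {}
    for temp_id, sequence in by_user.items():
        state = initial_state  # hypothetical: supplied by the caller
        transitions = []
        for time, message in sequence:
            next_state = TRANSITIONS.get((state, message))
            if next_state is None:
                continue  # message is not a known input in the current state
            transitions.append((time, state, next_state))
            state = next_state
        result[temp_id] = transitions
    return result
```

Keying the table on the pair (current state, input) mirrors the statement that the next state is determined by the combination of the current state and the input. Messages that do not match any entry are skipped here, which is one possible design choice among several.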

Statistics monitor
After that, the implemented system calculates statistics in order to quantify the behavior of each user based on his/her state transitions. In the implementation, in order to characterize the continuous-time state transitions of a user, we adopt the state transition probability matrix (p(n,m)) as well as the average and the coefficient of variation of the dwell time $t_{(n,m)}$ at state n in a transition from state n to state m. The probability p(n,m) of a transition from state n to state m is calculated as in Equation 1,

$$p(n,m) = \frac{c(n,m)}{\sum_{k} c(n,k)} \tag{1}$$

where c(n,m) denotes the number of transitions from state n to state m observed for the user. The dwell time $t^{(i)}_{(n,m)}$ is calculated for each state transition as in Equation 2,

$$t^{(i)}_{(n,m)} = a^{(i)}_{m} - a^{(i)}_{n} \tag{2}$$

where $a^{(i)}_{m}$ and $a^{(i)}_{n}$ are the arrival times at state m and state n, respectively, in the i-th state transition from state n to state m. To gather these values, we describe a user's behavior with a multi-dimensional vector. Given the definition of the states, the number of possible state transitions is 600. Since we adopt 3 different statistics, users' behavior is described in a 1,800-dimensional space in our implementation.
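As a rough illustration, the three statistics could be computed for one user as in the sketch below. Several points are assumptions and are not taken from the implemented system: the input is a time-ordered list of (arrival time, state index) pairs, the names behavior_statistics and to_vector are hypothetical, the coefficient of variation uses the population standard deviation, and transitions that a user never makes are filled with zeros.

```python
import math
from collections import defaultdict


def behavior_statistics(visits):
    """Per-transition statistics for one user.

    visits: time-ordered list of (arrival_time, state) pairs.
    Returns {(n, m): (p, mean_dwell, cv_dwell)}.
    """
    dwell = defaultdict(list)  # (n, m) -> dwell times at n before moving to m
    for (t_n, n), (t_m, m) in zip(visits, visits[1:]):
        dwell[(n, m)].append(t_m - t_n)  # arrival at m minus arrival at n

    out_count = defaultdict(int)  # number of transitions leaving each state
    for (n, _m), times in dwell.items():
        out_count[n] += len(times)

    stats = {}
    for (n, m), times in dwell.items():
        p = len(times) / out_count[n]
        mean = sum(times) / len(times)
        # Assumption: population variance (divide by the number of samples).
        var = sum((t - mean) ** 2 for t in times) / len(times)
        cv = math.sqrt(var) / mean if mean > 0 else 0.0
        stats[(n, m)] = (p, mean, cv)
    return stats


def to_vector(stats, transitions):
    """Flatten the statistics in a fixed transition order (zeros if unobserved)."""
    vector = []
    for key in transitions:
        vector.extend(stats.get(key, (0.0, 0.0, 0.0)))
    return vector
```

With a fixed list of 600 possible transitions, to_vector would yield the 1,800-dimensional vector mentioned above, three values per transition.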

Distribution visualizer
In order to visualize the distribution in a multi-dimensional space, the distribution visualizer uses the self-organizing map (SOM) [7]. A SOM is an artificial neural network that uses unsupervised learning to construct a two-dimensional space representing a multi-dimensional space. By mapping the distribution from the multi-dimensional space onto a two-dimensional space, we can intuitively see the distribution of users' behavior.
Following the SOM algorithm, the distribution visualizer first defines the vector space based on the entire input data. Second, it plots the quantified behavior of each user in the n-dimensional space one by one. Then it transforms the distribution into a two-dimensional space. In the process of this transformation, it draws a regular grid of circles (namely, units) in the two-dimensional space. Each unit represents principal components, and each plot is placed in the closest circle, so that the more similar two behaviors are, the closer together they are located. When users are labeled in advance, the visualizer highlights users in a specific condition, so that we can compare different conditions of users intuitively.
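To make the mapping step more concrete, the following is a minimal sketch of a generic online SOM on a rectangular grid, written in plain Python. It is not the implemented visualizer: the initialization, the linearly decaying learning rate and neighbourhood width, and the names train_som and best_matching_unit are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a SOM with online updates; returns weights[row][col] -> vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Assumption: initialise every unit with a randomly chosen input vector.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the inputs
            frac = step / steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))
                    w = weights[r][c]
                    for k in range(dim):
                        # Pull the unit toward x, weighted by grid distance.
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

After training, each user's vector is assigned to the unit returned by best_matching_unit, so that users with similar behavior fall into the same or neighbouring units. For high-dimensional inputs such as the vectors described here, a vectorized numerical library would be a more practical choice than nested Python loops.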

Case study
In order to validate the result of the implemented system and assess its usefulness, we visualize users' behavior based on 24 hours of actual anonymized C-Plane signals from a large urban area in Japan. In this case study, we intuitively identify the fundamental characteristics of specific users who had experienced ping-pong handovers and were labeled in advance. First, the signal analyzer parses the captured signals and constructs a signal sequence for each user. Then it extracts state transitions. Figure 3 depicts the state transition diagram. In the figure, the indexes of the nodes are the indexes of the states in Table 1, and the width of each edge represents the probability of the state transition (p(n,m)). For readability, we omit the edges whose p(n,m) is less than 0.10. According to the state transitions, the statistics monitor quantifies the users' behavior in terms of p(n,m) and the mean value and coefficient of variation of the dwell time ($t^{(i)}_{(n,m)}$). We define the input space using the entire 24 hours of input data. We prepare 100 units to describe the input space in the 2-dimensional space, and the units are indexed in a left-to-right and bottom-to-top fashion as shown in Fig. 4. Figure 5 depicts the distribution maps at 0 am, 6 am, 12 pm and 6 pm of the day. In each unit, we plot the users' behavior for each hour in gray. Then we highlight in red the users who had experienced ping-pong handovers in that period.
Figure 4 Indexes of units.

Figure 3 State transition diagram.
According to the figures, the number of visualized users varies with time, and the distribution of users differs especially between 0 am and 6 am. The highlighted users, however, are located in similar units. Focusing on units 2, 3, 13, 14, 15, 23 and 24, their components commonly include the coefficient of variation of $t_{(11,24)}$, p(11,24), p(11,23) and p(2,24). Since ping-pong handovers mean frequent handovers, it is understandable that those users are likely to belong to units that include transitions from 11 to 23 or 24. However, the visualized results also indicate that a number of users who experience ping-pong handovers start an X2 handover right after the initial context setup, which is unexpected by MNOs. Our system thus makes it possible to highlight previously unknown characteristics of ping-pong handovers.
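The left-to-right, bottom-to-top unit indexing of Fig. 4 can be made concrete with a short sketch that converts a unit index into a grid position and checks whether a set of units is adjacent on the map. The 10 x 10 layout of the 100 units and the index of the first unit (the base parameter) are assumptions, and the function names are hypothetical.

```python
def unit_position(index, cols=10, base=1):
    """Map a unit index to (row, col); row 0 is the bottom row, col 0 the left.

    Assumption: indexes run left to right, then bottom to top, starting at base.
    """
    offset = index - base
    return offset // cols, offset % cols


def is_connected(units, cols=10, base=1):
    """True if the units form one 4-connected region of the grid."""
    cells = {unit_position(u, cols, base) for u in units}
    start = next(iter(cells))
    seen, stack = {start}, [start]
    while stack:
        r, c = stack.pop()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == cells


# Units in which the highlighted (ping-pong handover) users are located.
HIGHLIGHTED_UNITS = [2, 3, 13, 14, 15, 23, 24]
```

Under the assumed 10-column layout, the highlighted units form a single connected region in the bottom rows of the map, whether the indexes start at 0 or at 1. This agrees with the observation that the highlighted users are located in similar units.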

Conclusion and future works
In order to analyze users' behavior in the EPC, we implemented a visualization system for the distribution of users' behavior. We drew distribution maps from actual C-Plane data using the implemented system. As future work, we will analyze users' behavior in more depth based on multi-hop state transitions.