PSO-Least Squares SVM for Clustering in Cognitive Radio Sensor Networks

In this paper, a solution for cluster formation in cognitive radio networks is presented. The solution features a network-wide energy consumption model for these networks. Particle swarm optimisation (PSO) and least squares support vector machines (LS-SVMs) are adapted to our clustering problem. The obtained results show that the resulting hybrid AI system provides a good estimate of a cluster formation. Through extensive simulations, we observed that the PSO-LS-SVM method can be effectively used under various spectrum characteristics. Moreover, the formed clusters are reliable and stable in a dynamic frequency environment.


Introduction
Cognitive radio (CR) technology is a key technology that enables spectrum use to be improved in a dynamic manner. The Federal Communications Commission (FCC) reported that a large portion of the assigned spectrum is used only sporadically, leaving a significant amount of the spectrum unused [9].
CR devices can sense the environment (cognitive capability), analyse and learn from the detected information, and adapt to the environment (reconfigurable capability). Moreover, CR technology can be integrated into wireless sensor networks [2]. In this integration, nodes can communicate by tuning their radios to vacant radio channels in a collision-free band instead of using the heavily crowded unlicensed bands. Cognitive radio sensor networks (CRSNs) [1,16] allow multimedia applications with high bandwidth requirements to be implemented. Moreover, the CRSN gathers all of the sensed data of the observed phenomena.
A two-tiered CRSN is composed of a two-level hierarchy. The first level consists of ordinary sensors deployed according to the application-specific objectives. All ordinary sensors are grouped into clusters. The second level consists of the cluster head nodes, which are responsible for aggregating the data and sending them to the base station (BS). Each clustering technique used in the CRSN should therefore reduce the number of messages exchanged between the ordinary sensors and the cluster head nodes, and minimise the distance between the cluster head nodes to reduce the transmission power.
Many solutions for clustering in the CRSN have been proposed in the recent literature. Among others, Younis et al. [17] proposed an election-based clustering method that uses the residual energy of the nodes as the primary clustering parameter and the proximity of a node to its neighbours as the secondary parameter. Unfortunately, it was designed for fixed channel settings and cannot be used in the CRSN. A deterministic clustering scheme was proposed by Bradonjić and Lazos [3]. In this approach, clusters are formed by maximising the sum of the number of idle channels and the number of nodes within the cluster. According to the authors, this solution is stable, but it is NP-hard and violates the design constraints on energy and computation. A spectrum-aware clustering method was presented by Zhang et al. [18]. The proposed method relies on the k-means clustering algorithm. In this method, each node is initially treated as a cluster head, and cluster heads are then merged in each iteration until the number of cluster heads reaches an optimal number obtained a priori by theoretical analysis.
Applying artificial intelligence (AI) is an alternative approach to solving the clustering problem in the CRSN. Cognitive radio sensor networks, which deal with the intelligent assignment and use of the radio spectrum, are a natural research area for the use of AI techniques. Support vector machines (SVMs) have been developed as an alternative that avoids the limitations of artificial neural networks (ANNs) [14], [15]. SVMs compute globally optimal solutions, unlike ANNs, which tend to fall into local minima [4]. Least squares support vector machines (LS-SVMs), introduced by Suykens et al. [11], [12], greatly simplify the training process of the standard SVM by replacing the inequality constraints with equality ones.
Particle swarm optimisation (PSO) [5] is a stochastic optimisation technique inspired by the social behaviour of bird flocking or fish schooling. The use of PSO makes it possible to optimise the performance of the LS-SVM classifier for clustering in the CRSN. Combining the PSO and LS-SVM methods allows us to build a novel hybrid artificial intelligence (AI) system.
The main goal of this paper is to introduce a new hybrid AI system based on PSO-LS-SVM techniques for clustering in CRSNs. In the proposed method, the structure of the CRSN is mapped into the network energy consumption model. Thus, a spectrum-aware clustering process is efficiently realised. We show that the suggested method achieves good performance: the clusters are organised in such a way that the total communication power is minimised. This paper is organised as follows. In Section 2, we present an energy-efficient, spectrum-aware consumption model for CRSNs. Section 3 introduces the hybrid AI system based on the PSO and LS-SVM methods. In Section 4, we evaluate the effectiveness of the proposed method through intensive simulations. Section 5 concludes the paper.

Model of the network energy consumption
In this section, we present the network energy consumption model. We take into account data transmission, data signalling and data sensing in the CRSN.
The data transmission involves intracluster and intercluster communications. In the first case, all CRSN nodes belonging to a cluster send their data to the cluster head [18]. Thus, the sum of the transmission power of all cluster members is given by: where $N_k$ is the number of CRSN nodes in the $k$-th cluster, $1 \le k \le K$, and $P_i$ is the power needed for data transmission from the $i$-th CRSN node to the cluster head node. The distances between cluster members and their cluster head are short within the cluster and, assuming a free-space channel model as an appropriate channel model, we can rewrite the sum transmission power as: where $P_r$ is the minimal received power required for the cluster head to correctly decode the transmitted information, $C_0$ is a constant loss factor, and $d(n_i^k, n_j^k)$ is the Euclidean distance between the $i$-th and $j$-th nodes in the $k$-th cluster.
As the cluster head node, we can select the node with the maximal residual energy. As proven by Zhang [19], the intracluster energy achieves its upper bound if each node has an equal probability of becoming the cluster head. Hence, the sum for intracluster power consumption is given by: is the center of the $k$-th cluster. The total network-wide energy consumption is formulated as follows: where $P_{sen}$ and $P_{sig}$ are the energy devoted to sensing and signalling, $t_{sen}$, $t_{sig}$, and $t_{data}$ are the sensing, signalling, and data transmission times, respectively, and $d_{max}$ is the transmission range of the CRSN node.

Particle Swarm Optimization
The particle swarm optimization (PSO) is a swarm intelligence method that models the social behaviour of organisms such as bird flocking and fish schooling [5] to guide swarms of particles toward the most promising regions of the search space. The way PSO adjusts its particles is conceptually similar to the crossover operation used by genetic algorithms. The main difference between PSO and evolutionary computing is that in PSO potential solutions fly through hyperspace and accelerate toward "better" solutions, whereas evolutionary computation schemes operate directly on potential solutions that are represented as locations in hyperspace [6]. Each particle represents a candidate position (i.e., solution). A particle is considered as a point in a $D$-dimensional space, and its status is characterized by its position and velocity. Thus, the $D$-dimensional position of particle $i$ at the $t$-th iteration can be represented as
$$x_i^t = \{x_{i1}^t, x_{i2}^t, \ldots, x_{iD}^t\}.$$
The velocity (i.e., distance change) of particle $i$ at iteration $t$ can be defined by
$$v_i^t = \{v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t\}.$$
Let $p_i^t = \{p_{i1}^t, p_{i2}^t, \ldots, p_{iD}^t\}$ be the best solution that particle $i$ has obtained until iteration $t$, and let $p_g^t = \{p_{g1}^t, p_{g2}^t, \ldots, p_{gD}^t\}$ be the best solution obtained from $p_i^t$ in the population at iteration $t$. To search for the optimal solution, each particle changes its velocity according to the cognition and social parts as follows:
$$v_{id}^{t+1} = v_{id}^{t} + c_1 r_1 \left(p_{id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd}^{t} - x_{id}^{t}\right), \quad d = 1, 2, \ldots, D,$$
where $c_1$ indicates the cognition learning factor, $c_2$ indicates the social learning factor, and $r_1$, $r_2$ are random numbers uniformly distributed in $U(0, 1)$. Each particle moves to a new potential solution based on the equation
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}, \quad d = 1, 2, \ldots, D.$$
The iteration is terminated when the number of iterations reaches the given maximum number of iterations.
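The update rules of this section can be illustrated with a minimal sketch. It assumes the inertia-free update $v \leftarrow v + c_1 r_1 (p_i - x) + c_2 r_2 (p_g - x)$ followed by $x \leftarrow x + v$; the function name `pso_minimize`, the box `bounds`, and the velocity clamp `v_max_frac` are hypothetical additions that keep the example bounded and are not taken from the paper.

```python
import random

def pso_minimize(objective, bounds, n_particles=8, n_iterations=150,
                 c1=1.8, c2=1.8, v_max_frac=0.2, seed=0):
    """Minimise `objective` over a box with a basic, inertia-free PSO.

    bounds: list of (low, high) pairs, one per dimension.
    v_max_frac: assumed velocity clamp (fraction of each range), not from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [v_max_frac * (hi - lo) for lo, hi in bounds]

    # Random initial positions, zero initial velocities.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests p_i and global best p_g.
    p_best = [xi[:] for xi in x]
    p_val = [objective(xi) for xi in x]
    g_idx = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g_idx][:], p_val[g_idx]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Cognition part pulls toward p_i, social part toward p_g.
                v[i][d] += (c1 * r1 * (p_best[i][d] - x[i][d])
                            + c2 * r2 * (g_best[d] - x[i][d]))
                v[i][d] = max(-v_max[d], min(v_max[d], v[i][d]))
                lo, hi = bounds[d]
                # Position update, kept inside the search box.
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            val = objective(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = x[i][:], val
                if val < g_val:
                    g_best, g_val = x[i][:], val
    return g_best, g_val
```

For example, `pso_minimize(lambda x: sum(t * t for t in x), [(-5.0, 5.0)] * 3)` searches for the minimum of a simple quadratic test function. In the PSO-LS-SVM setting the objective would instead be a clustering fitness evaluated for a candidate vector of LS-SVM parameters.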

LS-SVM Classifiers
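Least squares support vector machines (LS-SVMs) [11], [12] are least squares versions of support vector machines (SVMs) [14], [15], which are a set of related supervised learning methods that analyse data and recognize patterns, and which are used for classification and regression analysis.
Let $\{(x_i, y_i)\}_{i=1}^{N}$ be a given training set, with the input data $x_i \in \mathbb{R}^n$ and output data $y_i \in \mathbb{R}$ with class labels $y_i \in \{-1, +1\}$, and let the linear classifier be
$$y(x) = \operatorname{sign}\left[w^T x + b\right].$$
When the data of the two classes are separable, we have the original SVM classifier [14], [15] that satisfies the following conditions:
$$w^T \varphi(x_i) + b \ge +1 \ \text{ if } y_i = +1, \qquad w^T \varphi(x_i) + b \le -1 \ \text{ if } y_i = -1.$$
These two sets of inequalities can be combined into one single set as follows:
$$y_i \left[w^T \varphi(x_i) + b\right] \ge 1, \quad i = 1, \ldots, N,$$
where $\varphi: \mathbb{R}^n \to \mathbb{R}^m$ is the feature map from the input space to a usually high dimensional feature space. The data points are linearly separable by a hyperplane defined by the pair $(w \in \mathbb{R}^m, b \in \mathbb{R})$. Thus, the classification function is given by
$$y(x) = \operatorname{sign}\left[w^T \varphi(x) + b\right].$$
Instead of estimating with the help of the feature map, we work with a kernel function in the original space given by
$$K(x, x_i) = \varphi(x)^T \varphi(x_i).$$
In order to allow for the violation of Eq. (9), we introduce slack variables $\xi_i$ such that
$$y_i \left[w^T \varphi(x_i) + b\right] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, N.$$
The following minimization problem is then considered:
$$\min_{w, b, \xi} \ \frac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad y_i \left[w^T \varphi(x_i) + b\right] \ge 1 - \xi_i, \ \xi_i \ge 0,$$
where $C$ is a positive constant parameter used to control the tradeoff between the training error and the margin.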
The dual problem of the system (13), obtained from the Karush-Kuhn-Tucker (KKT) conditions [7], leads to a well-known convex quadratic programming (QP) problem. Solving the QP problem is slow for large training sets, and it is difficult to implement in an on-line adaptive form. Therefore, a modified version of the SVM called the least squares SVM (LS-SVM) was proposed by Suykens et al. [11], [13].
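In the LS-SVM method, the following minimization problem is formulated:
$$\min_{w, b, e} \ J(w, b, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^2 \quad \text{subject to} \quad y_k \left[w^T \varphi(x_k) + b\right] = 1 - e_k, \quad k = 1, \ldots, N,$$
where $\gamma$ is the regularization parameter. The corresponding Lagrangian for Eq. (13) is given by
$$L(w, b, e; \alpha) = J(w, b, e) - \sum_{k=1}^{N} \alpha_k \left\{ y_k \left[w^T \varphi(x_k) + b\right] - 1 + e_k \right\},$$
where the $\alpha_k$ are the Lagrange multipliers. The optimality condition leads to the following $(N + 1) \times (N + 1)$ linear system:
$$\begin{bmatrix} 0 & Y^T \\ Y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix},$$
where $Z = [\varphi(x_1)^T y_1; \ldots; \varphi(x_N)^T y_N]$, $Y = [y_1; \ldots; y_N]$, $\vec{1} = [1; \ldots; 1]$, $\alpha = [\alpha_1; \ldots; \alpha_N]$, and $\Omega = Z Z^T$. Due to the application of Mercer's condition [8], there exists a mapping and an expansion
$$\Omega_{kl} = y_k y_l \, \varphi(x_k)^T \varphi(x_l) = y_k y_l \, K(x_k, x_l).$$
Thus, the LS-SVM model for the function estimation is given by
$$y(x) = \sum_{k=1}^{N} \alpha_k K(x, x_k) + b,$$
where the parameters $\alpha_k$ and $b$ are based on the solution to Eqs. (15) and (16).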
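A small sketch may help to show why LS-SVM training reduces to a single linear solve. It assumes the standard LS-SVM classifier system, in which the first row enforces $\sum_k \alpha_k y_k = 0$ and the remaining rows read $y_k b + \sum_l y_k y_l K(x_k, x_l)\,\alpha_l + \alpha_k/\gamma = 1$, and it classifies with $\operatorname{sign}[\sum_k \alpha_k y_k K(x, x_k) + b]$. All function names are hypothetical, the RBF kernel form $\exp(-\|a-b\|^2/(2\sigma^2))$ is an assumption, and the plain Gaussian elimination only keeps the example dependency-free; a numerical library solver would normally be used instead.

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Assumed RBF form: exp(-||a - b||^2 / (2 sigma^2))."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol

def lssvm_train(X, y, gamma=10.0, kernel=rbf_kernel):
    """Build and solve the (N+1) x (N+1) LS-SVM classifier system."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n):
        A[0][k + 1] = y[k]          # first row: sum_k alpha_k y_k = 0
        A[k + 1][0] = y[k]          # first column: y_k * b
        for l in range(n):
            A[k + 1][l + 1] = y[k] * y[l] * kernel(X[k], X[l])
        A[k + 1][k + 1] += 1.0 / gamma  # regularisation on the diagonal
    sol = solve_linear(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # b, alpha

def lssvm_predict(x, X, y, b, alpha, kernel=rbf_kernel):
    """Classify x with sign(sum_k alpha_k y_k K(x, x_k) + b)."""
    s = sum(a * yk * kernel(x, xk) for a, yk, xk in zip(alpha, y, X)) + b
    return 1 if s >= 0 else -1
```

Calling `b, alpha = lssvm_train(X, y)` on a list of labelled points and then `lssvm_predict(x_new, X, y, b, alpha)` classifies a new point. Unlike the QP-based SVM, no iterative optimiser is needed, which is the simplification that motivates the LS-SVM.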

Mixtures of kernels
Each kernel function has its own advantages and disadvantages. For instance, a radial basis function (RBF) kernel $K(x, x_i) = \exp\left(-\|x - x_i\|^2 / (2\sigma^2)\right)$, where $\sigma$ is the width of the radial basis function, is a typical local kernel in which only the data that are close to each other have an influence on the kernel values. The polynomial kernel (see [14]) $K(x, x_i) = [x \cdot x_i + 1]^q$, where $q$ is the kernel parameter that defines the degree of the polynomial, is a typical global kernel in which data points that are far away from each other also influence the kernel values. Therefore, a mixture of these kernels can give better performance than either single kernel.
As defined by Smits et al. [10], an exemplary mixture of the RBF and polynomial kernels is given by
$$K_{mix} = \rho K_{poly} + (1 - \rho) K_{rbf},$$
where $\rho$ is the mixing coefficient treated as a constant scalar.
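As an illustration, the mixture can be written as a convex combination $\rho K_{poly} + (1 - \rho) K_{rbf}$ of the two kernels. The sketch below is a minimal version under that assumption; the function names are hypothetical, and the RBF kernel is assumed to have the form $\exp(-\|x - x_i\|^2/(2\sigma^2))$ (some toolboxes omit the factor 2).

```python
import math

def rbf_kernel(x, xi, sigma):
    """Local kernel: decays quickly as x and xi move apart."""
    sq = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq / (2.0 * sigma ** 2))

def poly_kernel(x, xi, q):
    """Global kernel: (x . xi + 1)^q, grows with the inner product."""
    dot = sum(a * b for a, b in zip(x, xi))
    return (dot + 1.0) ** q

def mixed_kernel(x, xi, rho, sigma, q):
    """Convex mixture of the polynomial and RBF kernels, 0 <= rho <= 1."""
    return rho * poly_kernel(x, xi, q) + (1.0 - rho) * rbf_kernel(x, xi, sigma)
```

Setting `rho=1.0` recovers the pure polynomial kernel and `rho=0.0` the pure RBF kernel, so the mixing coefficient trades global against local behaviour.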

LS-SVM transformed into a clustering problem in CRSN
The input data are described by a coordinate $(r, z)$ and the output data is the energy value. We can transform the Eq. (16) into a system Thus, the $\Omega$ is given by The solution of Eq. (21) gives the values By setting $A = \Omega^{-1}$ and $B = \vec{1}^T \Omega^{-1}$ where $A$ and $B$ are precalculated matrices that depend only on the input vector $x_k$ but not on the vector $y_k$.
The sensors are usually correlated due to the high probability that the adjacent pixels will contain the sensors in the cluster. We assume that the sensor field is two-dimensional and that the image energy distribution of the sensor field on the surface, known as the point spread function (PSF), can be approximated by the Gaussian PSF. The center point of the PSF corresponds to the measured sensor position.
In our approach based on the mixtures of kernels, we take into consideration two sets of indexes of the neighbourhood of the sensor image that satisfy the condition of linearly separable patterns. We recall that linear separability requires that, in order to be classified, the patterns must be sufficiently separated from each other so that the decision surfaces consist of hyperplanes.
The LS-SVM with the RBF and polynomial kernels transformed into a sensor clustering problem has the fitted energy intensity surface function over the constant vector space as follows where (r, z) are the coordinates of the pixels. The function g(r, z) gives the corresponding energy intensity value, b and α are obtained as a solution of Eq. (23).

A multi-class formulation of the LS-SVM transformed into a sensor clustering problem
In comparison with the standard SVM method, the LS-SVM has a lower computational complexity and lower memory requirements. Nevertheless, in certain situations, such as the classification of several characters, clusters, etc., a multi-class classification is very suitable.
In the multi-class formalism, we now use multiple output values $y_i$ with $i = 1, \ldots, n_y$, where $n_y$ defines the number of output values [15]. Thus, in the primal weight space the multi-class classification system possesses the following binary classifiers with mappings on a high dimensional feature space $\varphi^{(i)}(\cdot): \mathbb{R}^n \to \mathbb{R}^{n_{h_i}}$, $i = 1, 2, \ldots, n_y$, with dimensions $n_{h_1}, n_{h_2}, \ldots, n_{h_{n_y}}$.
By the extension of Eq. (23) to a multi-class problem, we obtain where matrix Y is given by

Experimental Results
An experiment to verify the clustering efficiency of the PSO-LS-SVM approach was performed for randomly deployed primary users (PUs) and CRSN nodes in a 200 m × 200 m square area. We assumed that each PU randomly operates on four channels on which the neighbouring PUs are inactive. In the simulation, we supposed that the distance between a transmitting and receiving radio pair is normally distributed with a mean of 30 and a variance of 15.
We used a link gain of $h = h_i/d$, where $d$ is the distance from the transmitter to the receiver. If the distance was less than 10, we set the link gain to one.
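The distance and link-gain assumptions of the simulation can be sketched in a few lines. The constant $h_i$ is not defined in the text, so its default value below is a placeholder, and the function names, the seed, and the clipping of negative samples at zero are likewise hypothetical choices made only for this illustration.

```python
import math
import random

def sample_distances(n, mean=30.0, variance=15.0, seed=0):
    """Draw n transmitter-receiver distances from N(mean, variance)."""
    rng = random.Random(seed)
    std = math.sqrt(variance)  # gauss() expects a standard deviation
    # Clip at zero: a (very unlikely) negative draw is not a valid distance.
    return [max(0.0, rng.gauss(mean, std)) for _ in range(n)]

def link_gain(distance, h_i=1.0, d_min=10.0):
    """Link gain h = h_i / d, set to one when the distance is below d_min."""
    if distance < d_min:
        return 1.0
    return h_i / distance
```

For instance, `[link_gain(d) for d in sample_distances(5)]` gives the gains for five randomly drawn radio pairs.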
The proposed PSO-LS-SVM approach for clustering in the CRSN is composed of two main parts. The PSO algorithm is applied to improve the setting of the parameter values of the LS-SVM. Clustering using the LS-SVM method consists of two phases: a training phase and a fitness calculation phase. A block diagram of the algorithm is depicted in Fig. 1.
The PSO-LS-SVM algorithm was implemented in Matlab R2014b with LS-SVMlab 1.8 for the optimization of the model. Through training, the parameter values of the PSO and LS-SVM approach were set as follows. The cognition learning factor $c_1$ and the social learning factor $c_2$ were both set to 1.8. The number of particles and the number of generations were found to be 8 and 150, respectively. Since the PSO-LS-SVM approach has a larger solution space, the number of solutions evaluated was raised to 1000. The optimized parameters for the LS-SVM include the kernel parameters and the regularization parameter. If the radial basis function (RBF) or the polynomial (Pol) function is chosen as the kernel function, we use the vector $\nu = (\gamma, \sigma_1, \sigma_2, \ldots, \sigma_{N_{input}})$ or $\nu = (\gamma, \sigma, \delta)$, respectively, where $\gamma$ is the regularization parameter, $\delta$ and $\sigma_i$ ($i = 1, 2, \ldots, N_{input}$) are kernel parameters, and $N_{input}$ is the dimension of the input patterns.
To perform the clustering evaluation, six PUs were randomly deployed in our sensor field. Furthermore, the sum of the squared errors (SSE) of the clustering results for various numbers of clusters was taken into account for each scenario. In Fig. 2, we compare the average SSE with respect to CRSNs of different sizes for all of the studied numbers of clusters. The experiment shows that, for a given CRSN, the method achieves the minimal average SSE when the number of clusters is equal to five or six. Fig. 3 plots the average cluster size versus the number of nodes in the CRSN for a given number of active PUs. As we can see, the behaviour in terms of the number of clusters and the cluster size is similar for different numbers of PUs and for various CRSN sizes. However, a small number of PUs gives a greater average cluster size.
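The SSE criterion used in the comparison can be sketched as follows. The sketch assumes the common definition of the clustering SSE, namely the sum of squared Euclidean distances between each node and the centroid of its cluster; the paper does not spell out its exact definition, and the function name is hypothetical.

```python
def clustering_sse(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    sse = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        for m in members:
            sse += sum((m[d] - centroid[d]) ** 2 for d in range(dim))
    return sse
```

Evaluating this quantity for several candidate numbers of clusters and averaging over repeated deployments gives curves of the kind compared in Fig. 2.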
To compare the efficiency of LS-SVM and PSO with various kernels, we used the Rand index. The Rand index measures the percentage of decisions that are correct, namely
$$RI = \frac{TP + TN}{TP + FP + FN + TN},$$
where $TP$ is the number of true positives, $TN$ is the number of true negatives, $FP$ is the number of false positives, and $FN$ is the number of false negatives. Table 1 presents the values of the Rand index for the PSO-LS-SVM approach under different circumstances. As can be seen from Table 1, the efficiency of PSO-LS-SVM with the mixture of kernels is better than that of the other models.
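For clustering, the four counts are usually taken over pairs of nodes: a pair is a true positive if both the reference and the predicted clustering put the two nodes together, a true negative if both separate them, and so on, with the Rand index equal to $(TP + TN)/(TP + FP + FN + TN)$. The sketch below follows this pair-counting convention, which is an assumption about how the counts were obtained; the function name is hypothetical.

```python
from itertools import combinations

def rand_index(labels_true, labels_pred):
    """Rand index = (TP + TN) / (TP + FP + FN + TN) over all node pairs."""
    tp = tn = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_pred and same_true:
            tp += 1  # grouped together in both clusterings
        elif not same_pred and not same_true:
            tn += 1  # separated in both clusterings
        elif same_pred:
            fp += 1  # grouped by the prediction only
        else:
            fn += 1  # separated by the prediction only
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 1.0
```

Because only co-membership of pairs matters, the index is unaffected by how the cluster labels are named.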

Conclusion
In this paper, a new clustering method for cognitive radio sensor networks is introduced. The clustering methodology is founded on the network energy consumption model for CRSNs. Three parts of this model were used: multi-task sensing, signalling and data transmission. Using the energy distribution of the sensor field surface and applying the PSO-LS-SVM system, we have been able to solve the clustering problem in the CRSN. The presented hybrid artificial intelligence system is shown to satisfy the CRSN energy constraints through extensive simulation experiments. The proposed PSO-LS-SVM method offers several advantages with respect to classical clustering schemes, such as the k-means clustering algorithm. Moreover, this method incorporates the spectrum-sensing capability and the spectrum-aware sensing constraint in the nodes and their localisation in the network. The presented method achieves effectiveness comparable with that of the other clustering methods in cognitive radio sensor networks. Future research directions include the development of clustering methods for the CRSN with energy constraints that allow each node to select its role without the need for cooperation between nodes.