Chaotic Time Series for Copper’s Price Forecast

We investigated the potential of Artificial Neural Networks (ANN) to forecast the chaotic time series of the copper price, based on different combinations of network structure and on the possibilities of knowledge discovery in big data. Two neural network models were built to predict the price of copper on the London Metal Exchange (LME) with batches of 100 and 1000 data. We used the Feed Forward Neural Network (FFNN) and the Cascade Forward Neural Network (CFNN), combining the training, transfer and performance functions implemented in MATLAB. The main findings support the use of ANN for financial forecasting of copper price series. The copper price forecast for different batch sizes can be improved by changing the number of neurons, the transfer functions and the performance functions. In addition, a negative correlation of -0.79 was found between the performance indicators RMS and IA.


Introduction
Copper is one of the base metals listed on the major exchanges of the world: the LME, the Commodity Exchange of New York (COMEX) and the Shanghai Futures Exchange (SHFE). Prices on these exchanges reflect the balance between worldwide copper supply and demand, although they may be strongly influenced by currency exchange rates and investment flows, factors that can cause volatile price fluctuations partially linked to changes in the economic cycle [1].
The price of copper is a sensitive issue for major producers such as Codelco, Freeport-McMoRan Copper & Gold, Glencore Xstrata, BHP Billiton, Southern Copper Corporation and American Smelting and Refining Company. Economies such as those of Chile and Zambia rely heavily on copper production and, consequently, on the evolution of its price [2]; Chile is the largest copper producer and exporter in the world.
Several studies include copper among the commodities of interest when evaluating forecasts to improve price prediction. They employ a variety of methods and mathematical models: time series [3][4][5], time series combined with wavelets [6,7], the Fourier transform [8], swarm optimisation algorithms [9] and multi-commodity models [10].
A fairly accurate time series model could predict several years ahead, which is an advantage for planning future requirements. In recent years, research on nonlinear dynamical systems has significantly improved the predictive capacity of time series. Chaos is a universal complex dynamic phenomenon that exists in different natural and social systems such as communication, economics and biology. The study of chaotic behaviour in the economy began in the 1980s, applied to macroeconomic variables such as the gross domestic product (GDP) and monetary aggregates [11]. Since then, several studies have searched for chaotic behaviour in economic and financial series [12,13].
In this context, among the most used techniques and tools for detecting chaotic behaviour in these series are recurrence plots, Space-Time Entropy (STE), the Hurst coefficient, the Lyapunov exponent and the correlation dimension [14]. Additionally, the existence of chaotic behaviour in copper commodity prices was corroborated in [15].
The motivation for this research is the need to evaluate ANN across different dimensions: types of network, functions and structures.

Methodology
The continuous increase in computing power and data availability has drawn growing attention to the use of ANN in many types of prediction problems. ANN can model and predict linear and nonlinear time series with a high degree of accuracy, capturing relationships in the data without prior knowledge of the problem being modelled [16].
The methodology used for the visual and statistical analysis is summarised in Fig. 1, adapted from Loshin [17].
The closing prices of copper traded on the LME were used as input, together with the different learning, optimisation and transfer functions. The quality of the ANN forecasts, simulated in MATLAB R2014a, was then evaluated with the Root Mean Square (RMS) error and the Adequacy Index (IA). The saved data go through an Extract, Transform and Load (ETL) process and are stored in multidimensional form in the Data Warehouse (DW), using SQL Server 2008. The visual analysis of the results is done with the BI software Tableau Desktop 8.3 as front end, and the corresponding statistical analysis with R version 3.1.2.

Segmentation of Data
Data were segmented into two batch sizes: the former with 100 data and the latter with 1000 data, covering the periods from 22/01/2015 to 16/06/2015 and from 01/07/2011 to 16/06/2015, respectively. The segmentation of the evaluated batch records is shown in Table 1.

Artificial Neural Networks
The FFNN used in this work is represented in Fig. 2.

Alternatives of Evaluation
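The different alternatives of a network are given by the combinations of its training, performance and transfer functions and its number of hidden neurons, as shown in equation (1):

$$A = F_{tr} \cdot F_{p} \cdot F_{tf} \cdot n = 9 \cdot 2 \cdot 3 \cdot 20 = 1080 \quad (1)$$

Multiplying by the number of networks and the number of data batches gives the number of alternatives evaluated, equation (2):

$$T = A \cdot R \cdot L = 1080 \cdot 2 \cdot 2 = 4320 \quad (2)$$

There were 27 simulations for each of the 4320 alternatives evaluated in this work, for a total of 116640 simulations according to equation (3):

$$S = T \cdot s = 4320 \cdot 27 = 116640 \quad (3)$$

where $S$ is the total number of simulations carried out and $s$ is the number of simulations for each alternative.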
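The counting argument above can be checked by enumerating the combinations directly. The following is a minimal sketch: the counts (nine training functions, two performance functions, three transfer functions, 1 to 20 neurons, two networks, two batch sizes, 27 runs) come from the text, while the placeholder function names are hypothetical and stand in for the MATLAB functions actually used.

```python
from itertools import product

# Counts taken from the text; the generated names are placeholders only.
train_fcns = [f"train_{i}" for i in range(9)]        # nine training functions
perform_fcns = ["mse", "sse"]                        # two performance functions
transfer_fcns = [f"transfer_{i}" for i in range(3)]  # three transfer functions
neurons = range(1, 21)                               # hidden layer sizes 1..20
networks = ["FFNN", "CFNN"]
batches = [100, 1000]
runs_per_alternative = 27

# Alternatives for a single network: every combination of functions and size.
per_network = list(product(train_fcns, perform_fcns, transfer_fcns, neurons))

# Alternatives evaluated: each of the above for every network and batch size.
all_alternatives = list(product(networks, batches, per_network))

total_simulations = len(all_alternatives) * runs_per_alternative

print(len(per_network), len(all_alternatives), total_simulations)
```

Enumerating the grid rather than only multiplying the counts also yields the list of configurations that a simulation loop would iterate over.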
Fig. 3 shows the "star model" [19] used to perform the multidimensional analysis according to the methodology proposed by Ralph Kimball [20]. The construction of the DW was based on multidimensional modelling, in which the information is structured in fact and dimension tables [21].
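As a rough illustration of a star model, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database and averages the performance measures per network and training function. The table names, column names and the three measurement rows are hypothetical and made up for the example; the DW in this work was built on SQL Server 2008, and its actual schema is the one in Fig. 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one fact table referencing two dimension tables.
conn.executescript("""
CREATE TABLE dim_network   (network_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_train_fcn (train_id   INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_simulation (
    network_id INTEGER REFERENCES dim_network(network_id),
    train_id   INTEGER REFERENCES dim_train_fcn(train_id),
    neurons    INTEGER,
    batch_size INTEGER,
    rms        REAL,
    ia         REAL,
    best_epoch INTEGER
);
""")

conn.executemany("INSERT INTO dim_network VALUES (?, ?)",
                 [(1, "FFNN"), (2, "CFNN")])
conn.executemany("INSERT INTO dim_train_fcn VALUES (?, ?)",
                 [(1, "trainlm"), (2, "traincgb")])

# Made-up measurements, present only to exercise the query below.
conn.executemany(
    "INSERT INTO fact_simulation VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 7, 1000, 0.010, 0.96, 4),
        (1, 1, 7, 1000, 0.012, 0.95, 3),
        (2, 2, 1, 100, 0.020, 0.90, 5),
    ],
)

# Average the measures over the facts, grouped by two dimensions.
rows = conn.execute("""
SELECT n.name, t.name, AVG(f.rms), AVG(f.ia)
FROM fact_simulation AS f
JOIN dim_network   AS n ON n.network_id = f.network_id
JOIN dim_train_fcn AS t ON t.train_id   = f.train_id
GROUP BY n.name, t.name
ORDER BY n.name
""").fetchall()

for row in rows:
    print(row)
```

Keeping the measures in a single fact table and the descriptive attributes in dimensions is what lets a front end slice the averages by any combination of functions, neurons and batch size.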

Performance Measures
The results were validated using performance measures that indicate the degree of generalisation of the model used. The indexes are the root mean square error (RMS) and the index of adequacy (IA). Both are shown in equations (4) and (5), respectively, where $o_i$ and $p_i$ are the observed and predicted values at time $i$, and $N$ is the total number of data. In addition, $p_i' = p_i - o_m$ and $o_i' = o_i - o_m$, where $o_m$ is the average value of the observations [22].
IA indicates the degree of fit between the estimated values and the actual values of a variable. A value close to 1 indicates a good estimate. On the other hand, a near-zero RMS indicates a good-quality fit.
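The two measures can be sketched in a few lines. Equations (4) and (5) are not reproduced in this text, so the forms below are assumptions: RMS is taken as the standard root mean square error (the original may use a normalised variant), and IA is taken as a Willmott-style index of agreement, which matches the primed quantities defined above. The function names and the sample values are illustrative only.

```python
import math


def rms(observed, predicted):
    """Root mean square error (assumed form): sqrt(mean((o_i - p_i)^2))."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def index_of_adequacy(observed, predicted):
    """Willmott-style index (assumed form):
    1 - sum((o_i - p_i)^2) / sum((|p_i'| + |o_i'|)^2),
    where the primes denote deviations from the mean of the observations.
    """
    o_mean = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den


# Illustrative values only, not copper prices.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(rms(obs, pred), index_of_adequacy(obs, pred))
```

Under these definitions a perfect forecast gives an RMS of 0 and an IA of 1, which agrees with the interpretation given above.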

Knowledge Discovery
The knowledge is in the data; however, it must be extracted and built for consumption by users. New technological solutions are required for data management. New software and methodologies make sense of the data, allowing useful information to be extracted for the construction of knowledge and supporting decision making. Market data, together with the new data generated by different data mining techniques, can help the analysis and management of Big Data [23,24]. In this case, the business intelligence tool enables the discovery of knowledge.
Fig. 4 shows the best results obtained for the average RMS, $\overline{RMS}$, of the FFNN in batches of 100 data. The indexes are shown on a colour scale, where blue indicates better performance, corresponding to the minimum RMS, and red indicates lower performance. After several iterations, the best results were obtained with the functions purelin, traincgb, traincgf, traincgp, mse and sse, with between 1 and 5 neurons. The average IA, $\overline{IA}$, of the FFNN indicates a better fit of the curves; in this case the same functions are kept and the range of neurons expands from 1 to 20.
In the case of the FFNN for batches of 1000 data, the best results for the averages $\overline{RMS}$ and $\overline{IA}$ correspond to the functions purelin, trainlm, mse and sse, with 6 to 10 neurons (Table 2). In the case of the CFNN for batches of 100 data, the best results for $\overline{RMS}$ and $\overline{IA}$ correspond to the functions purelin, traincgb, traincgf, traincgp, mse and sse, with 1 to 2 neurons (Table 3). For the CFNN with batches of 1000 data, the best results for $\overline{RMS}$ and $\overline{IA}$ correspond to the functions purelin, trainlm, mse and sse, with 3 to 4 neurons (Table 4).
We found that, for batches of 100 data, the FFNN and CFNN obtained the best results with the training functions traincgf, traincgb and traincgp. For batches of 1000 data, on the other hand, the best training function was trainlm for both FFNN and CFNN.

Forecast
The charts show the forecasts made by the FFNN and the CFNN for the chaotic copper price series (Fig. 5a, Fig. 5b, Fig. 5c and Fig. 5d). FFNN and CFNN show similar values in their performance ratings.

Statistical Analysis
The CFNN (trainFcn = trainlm, performFcn = sse, transferFcn = purelin, learnFcn = learngdm, lot = 1000, numNeuronas = 3) was run for up to 10000 simulations, obtaining good network performance, with mean values of 0.00767957 for RMS, 0.9725127 for IA and 3.1376 for best_epoch; minimum values of 0.007535554, 0.9698627 and 1, respectively; and maximum values of 0.007712344, 0.9731063 and 11, respectively. Table 5 gives the statistical summary of the simulation.
The distribution of the RMS performance measure is skewed to the left (Fig. 6a), so its skewness is negative (the mean is less than the median). In contrast, the IA is skewed to the right (Fig. 6b), so its skewness is positive (the mean is greater than the median), and best_epoch is also skewed to the right (Fig. 6c). The Pearson correlation plot shows a high negative correlation of -0.79 between RMS and IA, which shows that good ANN performance as indicated by RMS is also good according to IA. There is also a negative correlation of -0.49 between IA and best_epoch, and a positive correlation of 0.52 between RMS and best_epoch.
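The statistics used in this analysis can be sketched without any external package. The helpers below are a minimal sketch assuming the population (moment-based) definition of skewness and the usual Pearson coefficient; the original analysis was done in R, whose exact estimators may differ slightly, and the sample values are made up and are not the simulation results.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def median(xs):
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2


def skewness(xs):
    """Population skewness: third central moment divided by sigma cubed."""
    m = mean(xs)
    var = mean([(x - m) ** 2 for x in xs])
    third = mean([(x - m) ** 3 for x in xs])
    return third / var ** 1.5


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up sample with a long left tail: negative skewness, mean below median.
sample = [1.0, 8.0, 9.0, 9.0, 10.0]
print(skewness(sample), mean(sample), median(sample))
```

Applied to the per-simulation RMS, IA and best_epoch values, `skewness` gives the sign of the asymmetry discussed above and `pearson` gives the pairwise correlations.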

Conclusions
Forecasts based on nonlinear neural network models achieved better results than linear forecasting models. Combinations of two neural network models with nine training functions, two performance functions, three transfer functions, batches of 100 and 1000 data and ranges from 1 to 20 neurons have been studied. In future work, it is necessary to review the contribution of other macroeconomic variables to the performance of the model, in particular the capital markets of other commodities.
Finally, given the characteristics of these time series, the same set of combinations of functions should be analysed in other economic price series, to assess the possible relationships between the systems involved.
Fig. 2 shows (a) the FFNN and (b) the CFNN. We have selected some of the features available in the MATLAB toolbox. The inputs used in the system correspond to $x_{-1}, x_{-2}, x_{-3}, x_{-4}, x_{-5}, x_{-6}$ and the target to $x_{0}$ of the copper price time series.
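The input and target layout described here can be sketched as a sliding window over the price series. This is a minimal sketch assuming the six inputs are the six consecutive previous prices and the target is the current price; the function name and the toy series are hypothetical.

```python
def make_lagged_dataset(prices, n_lags=6):
    """Build (inputs, targets) pairs from a price series.

    Each input row holds the n_lags previous prices, most recent first,
    and the matching target is the price that follows them.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(prices)):
        inputs.append([prices[t - k] for k in range(1, n_lags + 1)])
        targets.append(prices[t])
    return inputs, targets


# Toy series standing in for closing prices.
series = [float(v) for v in range(10)]
X, y = make_lagged_dataset(series)
print(len(X), X[0], y[0])
```

A series of N prices yields N - 6 training pairs under this layout, so a batch of 100 data provides 94 pairs and a batch of 1000 data provides 994.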

Fig. 3. Star model for multidimensional analysis of project.

Fig. 5. a) Forecast of the FFNN for verification data with a batch of 100, functions purelin, traincgp and sse, and 5 neurons; b) Forecast of the FFNN for verification data with a batch of 1000, functions purelin, trainlm and sse, and 7 neurons; c) Forecast of the CFNN for verification data with a batch of 100, functions purelin, traincgb and sse, and 1 neuron; d) Forecast of the CFNN for verification data with a batch of 1000, functions purelin, trainlm and mse, and 3 neurons.

Fig. 6. a) Distribution of RMS, b) distribution of IA, c) distribution of best_epoch and d) plot of correlations between IA, RMS and best_epoch.
where $A$ is the number of alternatives for a network, $F_{tr}$ is the number of training functions, $F_{p}$ is the number of performance functions, $F_{tf}$ is the number of transfer functions and $n$ is the number of neurons in the hidden layer. Replacing values in equation (1) gives the number of alternatives for a network.
where $T$ is the number of evaluated alternatives, $R$ is the number of networks (structures of similar functions) and $L$ is the number of batches of data. Replacing values in equation (2) gives the 4320 alternatives evaluated in this work.

Table 2. FFNN for batches of 1000, functions purelin, trainlm, mse and sse with ranges of neurons from 6 to 10.