A Meta-Heuristic Approach for Copper Price Forecasting

The price of copper and its variations represent a very important financial issue for mining companies and for the Chilean government because of their impact on the national economy. The price of commodities such as copper is highly volatile, dynamic and turbulent, which makes forecasting very complex. Using publicly available data from October 24th, 2013 to August 29th, 2014, a multivariate model fitted with meta-heuristic optimization techniques is proposed. In particular, we use Genetic Algorithms and Simulated Annealing to find the best-fitting parameters to forecast the variation of the copper price. A non-parametric test proposed by Pesaran and Timmermann is used to assess the forecasting capacity of the models. Our numerical results show that the Genetic Algorithm approach performs better than Simulated Annealing, being more effective for long-range forecasting.


Introduction
As suggested by Adrian E. Drake: "The increasing availability of computing power in the last two decades has been used to develop new techniques of forecasting" 1. This paper presents a multivariate model to forecast the copper price variation. For this purpose, we propose two meta-heuristic algorithmic approaches to find the best-fitting parameters for our multivariate model.
The first one is a Genetic Algorithm (GA), which is inspired by natural selection [1]. Genetic algorithms were initially proposed by Holland [2] and have found applications in diverse areas such as process optimization [3], machine learning [4], and so on. In particular, recursive genetic algorithms have been used for time series forecasting [5] and for adjusting the parameters of a dynamic multivariate model [6]. The second one is based on the Simulated Annealing (SA) approach. SA has been combined with artificial neural networks and fuzzy systems to forecast time series as well [7] [8].
As far as we know, however, Simulated Annealing has never been used to adjust a dynamic multivariate model to forecast a financial time series.
Forecasting the copper price is an important task not only for investors, but also for the government and for agents involved in the copper mining business. In particular, mining is one of the pillars of the Chilean economy. Chile is the largest copper producer in the world 2 and the production of this commodity represents 12.2% of the Gross Domestic Product (GDP) 3.
The price of copper, like that of all commodities, has a highly volatile, dynamic and turbulent behavior, which makes forecasting a complex task. Antonino Parisi, Franco Parisi and David Díaz suggest that "…transaction strategies based on forecast of the direction in the price level fluctuation are more effective and can generate higher benefits than those based on a specific price level prediction." 4.
The proposed model uses both a Genetic Algorithm and a Simulated Annealing procedure to forecast the sign of the variation of the copper price. The publicly available data used for this purpose correspond to the copper price from October 24th, 2013 to August 29th, 2014 and to the Dow Jones price for the same period.
The paper is organized as follows. The methodology and model formulation are described in Section 2. The meta-heuristic algorithms used to fit the model parameters are explained in Section 3. The numerical results are analyzed in Section 4. A discussion is presented in Section 5. Finally, the conclusions of the study are presented in Section 6.

Methodology
In this section, both the management of the data and the model formulation are presented. The non-parametric test and the sign prediction percentage are also explained and mathematically described.

Data Management
The publicly available data obtained for the copper and Dow Jones prices comprise two hundred and twenty-four observations from October 24th, 2013 to August 29th, 2014. The data were divided into two sets. The first one is used to initially train the multivariate model, whereas the remaining data are out-of-sample data used to measure the predictive capacity of the model. Several models estimate their forecasting rates based on the entire set of observations. In contrast, the proposed model uses a "rolling operation" methodology, which means that the in-sample size remains constant at $n$ samples: at each iteration, the algorithm includes the next observation and discards the oldest one, so the model is always fitted to the most recent data. The rolling time considered in the present paper corresponds to … observations. With this scheme, our model can forecast up to three steps ahead.

Model Formulation
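A dynamic multivariate model is proposed to forecast the price variation of copper considering the past four variations of the copper price and of the Dow Jones. For this purpose, we first define the variation of the copper price at instant $t$ as
$$\Delta C_t = C_t - C_{t-1} \tag{1}$$
Where $\Delta C_t$: Variation of the copper price from $t-1$ to $t$. $C_t$: Copper price at time $t$. $C_{t-1}$: Copper price at time $t-1$.
Similarly, the variation of the Dow Jones price at time $t$ is defined as
$$\Delta D_t = D_t - D_{t-1} \tag{2}$$
Where $\Delta D_t$: Variation of the Dow Jones price from $t-1$ to $t$. $D_t$: Dow Jones price at time $t$. $D_{t-1}$: Dow Jones price at time $t-1$.
To forecast the variation of the copper price, we use the past four variations of the copper price, of the Dow Jones price, and the errors of the past forecasts. The model can be expressed formally as
$$\widehat{\Delta C}_t = \sum_{i=1}^{4} \alpha_i \, \Delta D_{t-i} + \sum_{i=1}^{4} \beta_i \, \Delta C_{t-i} + \sum_{i=1}^{4} \gamma_i \, \varepsilon_{t-i} \tag{3}$$
Where $\widehat{\Delta C}_t$: Forecasted variation of the copper price. $\Delta D_{t-i}$: Real variation of the Dow Jones price at time $t-i$. $\Delta C_{t-i}$: Real variation of the copper price at time $t-i$. $\varepsilon_{t-i}$: Error of the past forecasted variations, obtained as $\varepsilon_{t-i} = \Delta C_{t-i} - \widehat{\Delta C}_{t-i}$. $\alpha_i$, $\beta_i$ and $\gamma_i$ are parameters determined by using the meta-heuristic approaches.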
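The model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variation is taken to be the simple difference between consecutive prices, the model is taken to be linear in the four lagged terms, and the names `variations`, `forecast_variation`, `alpha`, `beta` and `gamma` are hypothetical. The numbers are toy values, not the paper's data or fitted parameters.

```python
def variations(prices):
    """Differences between consecutive prices (assumed form of the variation)."""
    return [later - earlier for earlier, later in zip(prices, prices[1:])]


def forecast_variation(d_copper, d_dow, errors, alpha, beta, gamma, lags=4):
    """Assumed linear model: weighted sum of the last `lags` Dow Jones
    variations, copper variations and forecast errors (index 0 = most recent)."""
    total = 0.0
    for i in range(1, lags + 1):
        total += alpha[i - 1] * d_dow[-i]      # lagged Dow Jones variation
        total += beta[i - 1] * d_copper[-i]    # lagged copper variation
        total += gamma[i - 1] * errors[-i]     # lagged forecast error
    return total


# Toy numbers only, chosen for illustration.
copper = [3.20, 3.25, 3.22, 3.30, 3.28]
dow = [15500.0, 15550.0, 15520.0, 15600.0, 15610.0]
d_copper = variations(copper)
d_dow = variations(dow)
errors = [0.0, 0.0, 0.0, 0.0]          # no past forecasts yet
alpha = [0.001, 0.0, 0.0, 0.0]         # hypothetical parameter values
beta = [0.5, 0.2, 0.0, 0.0]
gamma = [0.0, 0.0, 0.0, 0.0]
next_variation = forecast_variation(d_copper, d_dow, errors, alpha, beta, gamma)
```

Only the sign of `next_variation` matters for the sign prediction used in the rest of the paper, so the choice between absolute and relative variations would not change that sign for positive prices.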
The parameters are adjusted in three steps as follows:
1) The parameters of the model are obtained for the $n$ in-sample observations.
2) Then, a one (two or three) step ahead variation is forecasted.
3) Subsequently, the new observation is added to the in-sample data and the oldest value is discarded. Steps 1) and 2) are repeated.
This method requires the model to be readjusted continually with new real data. The model parameters are not supplied in this paper; only the sign prediction percentage is reported in Section 4.
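The three steps above can be sketched as a generic rolling loop. This is a minimal sketch, not the authors' implementation: `rolling_forecast`, `fit` and `predict` are hypothetical names, and the mean-based `fit_mean` and `predict_mean` are placeholders that stand in for the meta-heuristic parameter search and the forecasting model.

```python
def rolling_forecast(series, n, fit, predict):
    """Refit on a fixed-size window of n observations and forecast one step
    ahead, sliding the window forward by one observation each iteration."""
    forecasts = []
    # Stop one window early so every forecast has a realized value to compare with.
    for start in range(len(series) - n):
        window = series[start:start + n]            # step 3: newest in, oldest out
        params = fit(window)                        # step 1: adjust the parameters
        forecasts.append(predict(params, window))   # step 2: one-step-ahead forecast
    return forecasts


def fit_mean(window):
    """Placeholder 'fit': the window mean stands in for the fitted parameters."""
    return sum(window) / len(window)


def predict_mean(params, window):
    """Placeholder 'predict': simply return the fitted mean."""
    return params


demo = rolling_forecast([1.0, 2.0, 3.0, 4.0, 5.0], n=3,
                        fit=fit_mean, predict=predict_mean)
```

Passing `fit` and `predict` as callables keeps the rolling logic independent of the optimizer, so the same loop could drive either of the two meta-heuristics.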

Sign Prediction Percentage
As explained in Section 1, determining the sign of the variation of the copper price is an effective strategy that can generate higher benefits. To determine the sign prediction percentage of each forecasted period we use the following equation
$$SPP = \frac{100}{N} \sum_{t=1}^{N} H\left(\Delta C_t \cdot \widehat{\Delta C}_t\right) \tag{4}$$
Where $SPP$: Sign prediction percentage. $\Delta C_t$: Variation of the price of copper from $t-1$ to $t$. $\widehat{\Delta C}_t$: Forecasted variation of the price of copper. $H(\cdot)$: The Heaviside function, where $H(x) = 1$ if $x > 0$ and $H(x) = 0$ if $x \le 0$. $N$: Number of forecasts made.
If the $SPP$ is lower than 50%, the predicted variation of the copper price is taken with the opposite sign to that given by the model ($-\widehat{\Delta C}_t$). Accordingly, the prediction percentage is obtained by
$$SPP^{*} = \max\left(SPP,\; 100 - SPP\right) \tag{5}$$
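As a small worked example of the sign prediction percentage, the sketch below counts a hit whenever the real and forecasted variations have the same sign, using the Heaviside convention described above (a product of zero counts as a miss). The function names are hypothetical, and `adjusted_spp` reflects the assumption that flipping the forecast sign turns a percentage below 50% into its complement.

```python
def heaviside(x):
    """1 if x is strictly positive, 0 otherwise."""
    return 1 if x > 0 else 0


def sign_prediction_percentage(actual, forecast):
    """Share (in percent) of forecasts whose sign matches the real variation."""
    hits = sum(heaviside(a * f) for a, f in zip(actual, forecast))
    return 100.0 * hits / len(actual)


def adjusted_spp(spp):
    """Assumed adjustment: below 50%, predict the opposite sign instead."""
    return max(spp, 100.0 - spp)


# Toy variations: three of the four forecasts have the correct sign.
actual = [1.0, -1.0, 2.0, -2.0]
forecast = [0.5, 0.3, 1.0, -1.0]
spp = sign_prediction_percentage(actual, forecast)
```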

Non-parametric Test of Predictive Performance
Pesaran and Timmermann proposed a non-parametric test of predictive performance based on the correct direction of the forecasted variable [9] (directional accuracy). This procedure tests the null hypothesis that the observed variations of the price are independently distributed from the forecasted variations. If the hypothesis is rejected, there is statistical evidence that the model is able to forecast a future variation of the price. The Pesaran and Timmermann test compares the sign of the real variation of the price with the sign of the forecasted variation. To obtain the percentage of real positive variations of the copper price, we use equation (6)
$$P = \frac{1}{N} \sum_{t=1}^{N} H\left(\Delta C_t\right) \tag{6}$$
Where $P$: Percentage of real positive variations of the price of copper.
Similarly, the percentage of positive variations of the forecasted price is obtained by means of
$$\hat{P} = \frac{1}{N} \sum_{t=1}^{N} H\left(\widehat{\Delta C}_t\right) \tag{7}$$
Where $\hat{P}$: Percentage of forecasted positive variations of the copper price.
Furthermore, the success ratio index when the forecasted variations and the real variations of the price are independently distributed is obtained by means of
$$SRI = P \hat{P} + (1 - P)(1 - \hat{P}) \tag{8}$$
The variance of the $SRI$ rate is obtained by means of equation (9) as follows
$$\mathrm{Var}(SRI) = \frac{1}{N}\left[(2\hat{P} - 1)^2 P (1 - P) + (2P - 1)^2 \hat{P} (1 - \hat{P}) + \frac{4}{N} P \hat{P} (1 - P)(1 - \hat{P})\right] \tag{9}$$
Consequently, the variance of the success ratio ($SR$, the observed fraction of correctly predicted signs) is defined as
$$\mathrm{Var}(SR) = \frac{1}{N} SRI \, (1 - SRI) \tag{10}$$
Finally, the directional accuracy test is given by the equation
$$DAT = \frac{SR - SRI}{\sqrt{\mathrm{Var}(SR) - \mathrm{Var}(SRI)}} \tag{11}$$
Under the null hypothesis that the forecasted and real variations are independently distributed, this statistic follows a standard normal distribution. This means that the larger the value of DAT, the stronger the evidence of directional accuracy of the model.
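The directional accuracy test can be computed directly from the two series of variations. The sketch below follows the standard form of the Pesaran and Timmermann statistic; the function name and the toy data are assumptions made for illustration, and the code does not guard against a non-positive variance difference, which can occur in degenerate samples.

```python
import math


def directional_accuracy_test(actual, forecast):
    """Pesaran-Timmermann statistic from real and forecasted variations."""
    n = len(actual)
    p = sum(1 for a in actual if a > 0) / n          # share of real positive moves
    p_hat = sum(1 for f in forecast if f > 0) / n    # share of forecasted positive moves
    sr = sum(1 for a, f in zip(actual, forecast) if a * f > 0) / n  # observed success ratio
    sri = p * p_hat + (1 - p) * (1 - p_hat)          # success ratio under independence
    var_sr = sri * (1 - sri) / n
    var_sri = ((2 * p_hat - 1) ** 2 * p * (1 - p)
               + (2 * p - 1) ** 2 * p_hat * (1 - p_hat)
               + 4 * p * p_hat * (1 - p) * (1 - p_hat) / n) / n
    return (sr - sri) / math.sqrt(var_sr - var_sri)


# Toy data: six of the eight signs are predicted correctly.
actual = [1, -1, 1, -1, 1, -1, 1, -1]
forecast = [1, -1, 1, -1, 1, -1, -1, 1]
dat = directional_accuracy_test(actual, forecast)
```

In this toy case half of the moves are positive in both series, so the success ratio expected under independence is 0.5, and the statistic measures how far the observed 0.75 lies above it.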

Meta-Heuristic Optimization Algorithms
In this section, the meta-heuristic optimization algorithms used to adjust the model parameters are briefly described.

Genetic Algorithm
Genetic algorithms were initially proposed by Holland in 1975 [2]. A genetic algorithm is a meta-heuristic approach inspired by Darwin's theory of the survival of the fittest [1], and it uses a direct analogy of natural evolution. A possible solution is encoded as an individual, and each individual is composed of variables (or chromosomes), with as many chromosomes as there are variables in the problem. To obtain an initial solution for the problem, a population of individuals is generated randomly. The population is then modified at each iteration of the algorithm by the following operators:
1. Fitness Function: The fitness function (objective function) must be defined. This particular optimization problem has no constraints. Thus, the fitness function is the sign prediction percentage, maximized over the vector $\theta$ of model parameters: $$\max_{\theta} \; SPP(\theta) \tag{12}$$
2. Selection: A portion of the existing population is selected. The best solutions are more likely to be selected as parents for the next generation.
3. Crossover: Crossover combines the chromosomes of each parent to create new solutions called offspring.
4. Mutation: Some offspring are randomly modified by a mutation process. Mutation changes a particular gene on the chromosome. The mutation operator enables the population to explore new zones of the feasible space.
The process is repeated, creating new generations of individuals, until a stopping criterion is met. In this particular problem, the stopping criterion was four hundred generations of individuals. Genetic Algorithms have previously been used in financial forecasting [6] [10].
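The operators listed above can be put together in a compact real-valued genetic algorithm. This is a minimal sketch and not the authors' implementation: the paper does not specify the selection, crossover or mutation variants, so tournament selection, uniform crossover, Gaussian mutation and elitism are assumptions, as are all names and default values other than the four hundred generations. A smooth toy objective stands in for the sign prediction percentage.

```python
import random


def genetic_algorithm(fitness, n_genes, pop_size=30, generations=400,
                      bounds=(-1.0, 1.0), mutation_rate=0.1, seed=0):
    """Maximize `fitness` over real-valued individuals of length `n_genes`."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(n_genes)]
                  for _ in range(pop_size)]

    def tournament(scored):
        # Selection: the fitter of two randomly drawn individuals becomes a parent.
        first, second = rng.sample(scored, 2)
        return first[1] if first[0] >= second[0] else second[1]

    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        elite = max(scored, key=lambda pair: pair[0])
        next_population = [list(elite[1])]      # elitism: keep the current best
        while len(next_population) < pop_size:
            mother, father = tournament(scored), tournament(scored)
            # Crossover: each gene is inherited from one parent at random.
            child = [m if rng.random() < 0.5 else f
                     for m, f in zip(mother, father)]
            # Mutation: occasionally perturb a gene, clipped to the bounds.
            child = [min(hi, max(lo, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            next_population.append(child)
        population = next_population

    scored = [(fitness(ind), ind) for ind in population]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


def toy_fitness(genes):
    """Toy objective with its maximum (zero) at 0.5 in every gene."""
    return -sum((g - 0.5) ** 2 for g in genes)


ga_best, ga_score = genetic_algorithm(toy_fitness, n_genes=3, generations=100)
```

To mirror the setting of the paper, `fitness` would evaluate the sign prediction percentage of the model on the in-sample window for a candidate parameter vector; that objective is piecewise constant, which is one reason a derivative-free search is a natural fit.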

Simulated Annealing
Simulated Annealing is a meta-heuristic optimization method proposed independently by Scott Kirkpatrick, C. Daniel Gelatt and Mario P. Vecchi in 1983 and by Vlado Černý in 1985 [11] [12]. The Simulated Annealing algorithm is inspired by the process of annealing in the metallurgical industry. It starts with a high initial temperature, which is slowly decreased during the execution of the algorithm. While the temperature is high, worse solutions are accepted with high probability, which allows the algorithm to explore the solution space and escape possible local optima. As the temperature is reduced, so is the chance of accepting worse solutions, and the algorithm focuses on the areas of the solution space where the global optimum may lie. Simulated Annealing has not been applied to multivariate dynamic models; however, it has been combined with fuzzy models [7] and artificial neural networks [8] for time series forecasting. As in our genetic algorithm approach, we use the objective function given by equation (12), and the stopping criterion was four hundred iterations.
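The acceptance rule and the cooling process described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not give the cooling schedule or the neighborhood move, so geometric cooling and Gaussian perturbations are assumed, and every name and default value other than the four hundred iterations is hypothetical. A toy objective stands in for the sign prediction percentage.

```python
import math
import random


def simulated_annealing(objective, start, iterations=400, t0=1.0,
                        cooling=0.99, step=0.1, seed=0):
    """Maximize `objective`, starting from the list of floats `start`."""
    rng = random.Random(seed)
    current = list(start)
    current_score = objective(current)
    best, best_score = list(current), current_score
    temperature = t0
    for _ in range(iterations):
        # Neighbor: perturb every coordinate with Gaussian noise.
        candidate = [x + rng.gauss(0.0, step) for x in current]
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temperature *= cooling      # geometric cooling (assumed schedule)
    return best, best_score


def toy_objective(params):
    """Toy objective with its maximum (zero) at 0.5 in every coordinate."""
    return -sum((p - 0.5) ** 2 for p in params)


sa_start = [0.0, 0.0, 0.0]
sa_best, sa_score = simulated_annealing(toy_objective, sa_start)
```

Because the best solution seen so far is stored separately from the current one, the returned score can never be worse than that of the starting point, even though the search itself accepts some downhill moves.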

Numerical Results
The numerical results are presented in two subsections. The first one summarizes the model results for the adjustment of the parameters using in-sample data, while the second one summarizes the forecasting results for the out-of-sample data.

In-sample Data Results
For each time step, the model parameters were adjusted to maximize the SPP. A total of one hundred and seventy-nine iterations were performed using the multivariate model. Figure 1 shows the values of SPP for each iteration of the model. From the numerical results presented in Figure 1a), the SPP is larger than 50% for all the iterations when using Genetic Algorithms. In Figure 1b), we observe similar trends when using the Simulated Annealing approach. For in-sample data, the Genetic Algorithm approach has a mean sign prediction percentage of 66.96%, while Simulated Annealing has a mean of 64.59%. This indicates that the Genetic Algorithm approach fits the in-sample data more effectively.

Out-of-Sample Data Results
With the out-of-sample data, we compute the sign prediction percentage and perform the directional accuracy test for one, two and three step ahead forecasting. Table 2 shows the performances obtained. In Table 2, column 1 presents the name of the algorithmic approach. Column 2 shows the SPP values obtained, in percent. Finally, columns 4 and 5 present the values obtained for the DAT and the p-values of the DAT, respectively. For the one hundred and seventy-nine forecasted variations, the Genetic Algorithm has a sign prediction percentage of 57.54%, 57.54% and 53.63% for one, two and three step ahead forecasting, respectively. The corresponding Directional Accuracy Test values are 2.0282, 1.9773 and 0.9772. These values indicate that the predictive capacity of the model is useful up to two steps ahead.
On the other hand, the Simulated Annealing based model has a sign prediction percentage of 58.1%, 50.28% and 51.40% and a Directional Accuracy Test of 2.1724, 0.03135 and 0.3211 for one, two and three step ahead forecasting, respectively. These values indicate that, when fitted with Simulated Annealing, the model can be used only for one step ahead forecasting.
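Since the DAT follows a standard normal distribution under the null hypothesis, the values reported above can be converted into p-values with the complementary error function. The sketch below assumes a one-sided test at the 5% level, which may differ from the convention used for the p-values in Table 2; the function and variable names are hypothetical.

```python
import math


def one_sided_p_value(z):
    """Upper-tail probability of a standard normal variable at z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# DAT values quoted in the text for one, two and three steps ahead.
reported_dat = {
    "Genetic Algorithm": [2.0282, 1.9773, 0.9772],
    "Simulated Annealing": [2.1724, 0.03135, 0.3211],
}
p_values = {name: [one_sided_p_value(z) for z in stats]
            for name, stats in reported_dat.items()}
# Assumed significance level of 5%.
significant = {name: [p < 0.05 for p in ps] for name, ps in p_values.items()}
```

Under this assumption, the Genetic Algorithm forecasts are significant for one and two steps ahead and the Simulated Annealing forecasts only for one step ahead, which agrees with the reading given in the text.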

Discussion
Other authors have adapted their forecasting models by using genetic algorithms [13] [14]. Nevertheless, there are still meta-heuristic approaches whose performance has not yet been compared with that of Genetic Algorithms. In the current literature, different models for forecasting the future price of commodities from spot prices, and their advantages, are well described [15] [16]. However, the algorithms used to adapt these models have not been compared to verify that using genetic algorithms is the right choice. This paper compares Simulated Annealing and Genetic Algorithms on a dynamic multivariate model, with both algorithms fitting the same model to predict future prices. The two methods have been compared on learning problems, where Simulated Annealing found, on average, better solutions than Genetic Algorithms at a lower computational cost [17] [18]. In this paper, by contrast, our results indicate that the Genetic Algorithm approach is the right choice, as it performs better than Simulated Annealing overall and is more effective for longer-range forecasting. Finally, the decision maker can use the adapted parameters to forecast future copper prices using equation (3) proposed in subsection 2.2.

Conclusions
In this article, we proposed two different meta-heuristic approaches to fit a model that forecasts the variation of the copper price. The proposed model needs to be readjusted every time the copper price changes: a model adjusted only once would not be able to forecast many steps into the future without the most recent information about the copper price behavior. Genetic Algorithms have proved to be an efficient optimization approach for forecasting time series, in particular when compared with another meta-heuristic optimization technique such as Simulated Annealing, over which they were more effective for longer-range forecasting. Forecasting the copper price can increase the return and reduce the risk associated with transactions of this base metal. Therefore, this methodology can be used in a real scenario. As future research, we plan to use Genetic Algorithms hybridized with nonlinear model architectures such as nonlinear autoregressive networks with exogenous inputs (NARX) for long-range forecasting.

Fig. 1. a) Sign Predictive Percentage using the Genetic Algorithm. b) Sign Predictive Percentage using Simulated Annealing.

Table 2. Sign predictive percentage and directional accuracy test for one, two and three step ahead forecasting obtained with the Genetic Algorithm and Simulated Annealing approaches.