Authors:Hoai Bach Nguyen; Bing Xue; Peter Andreae Pages: 95 - 110 Abstract: In classification, feature selection is an important pre-processing step to simplify the dataset and improve the data representation quality, which makes classifiers become better, easier to train, and understand. Because of an ability to analyse non-linear interactions between features, mutual information has been widely applied to feature selection. Along with counting approaches, a traditional way to calculate mutual information, many mutual information estimations have been proposed to allow mutual information to work directly on continuous datasets. This work focuses on comparing the effect of counting approach and kernel density estimation (KDE) approach in feature selection using particle swarm optimisation as a search mechanism. The experimental results on 15 different datasets show that KDE can work well on both continuous and discrete datasets. In addition, feature subsets evolved by KDE achieves similar or better classification performance than the counting approach. Furthermore, the results on artificial datasets with various interactions show that KDE is able to capture correctly the interaction between features, in both relevance and redundancy, which can not be achieved by using the counting approach. PubDate: 2016-08-20 DOI: 10.1007/s12065-016-0143-4 Issue No:Vol. 9, No. 3 (2016)

Authors:Nafrizuan Mat Yahya; M. Osman Tokhi; Hyreil Anuar Kasdirin Pages: 1 - 20 Abstract: Bats sonar algorithm (BSA) as a swarm intelligence approach utilises the concept of echolocation of bats to find prey. However, the algorithm is unable to achieve good precision and fast convergence rate to the optimum solution. With this in mind, an adaptive bats sonar algorithm is introduced with new paradigms of real bats echolocation behaviour. The performance of the algorithm is validated through rigorous tests with several single objective optimisation benchmark test functions. The obtained results show that the proposed scheme outperforms the BSA in terms of accuracy and convergence speed and can be efficiently employed to solve engineering problems. PubDate: 2016-02-18 DOI: 10.1007/s12065-016-0134-5 Issue No:Vol. 9, No. 1-2 (2016)

Authors:Seyed Alireza Mohseni; Tony Wong; Vincent Duchaine Pages: 21 - 36 Abstract: As with other nature-inspired algorithms, the cuckoo optimization algorithm (COA) produces a population of candidate solutions to find the (near-) optimal solutions to a problem. In this paper, several modifications, including a dynamic mutation operator, are proposed for this algorithm. Design of experiments is employed to determine factors controlling the value of parameters and the target levels of those values to achieve desirable output. The efficiency of the modified COA algorithm is substantiated with the help of several optimization test problems. The results are then compared to other well-known algorithms such as PSO, DE and harmony search using a non-parametric statistical procedure. In order to analyze its effectiveness, the proposed modified COA is applied to a feature selection problem and spacecraft attitude control problem. PubDate: 2016-04-12 DOI: 10.1007/s12065-016-0135-4 Issue No:Vol. 9, No. 1-2 (2016)

Authors:Mohamad Farhan Mohamad Mohsin; Abdul Razak Hamdan; Azuraliza Abu Bakar Pages: 37 - 51 Abstract: Signal normalization is a part of signal formalization which is a vital data pre-processing constraint required for the functioning of the dendritic cell algorithm. In existing applications, most normalization algorithms are developed purposely for a specific application with drawing on human domain expertise and very few algorithms are designed for general problems. This makes it difficult for the inexperienced user to exploit existing approaches to another problem, particularly when the initial information about the problem is limited. Therefore, this study proposes a new signal normalization method for the dendritic cell algorithm that uses the statistical upper and lower cumulative sum so that the algorithm can be applied to general classification problems. In addition, a new method to calculate the anomaly threshold based on the average mature-contact antigen value is presented to suit the proposed algorithm. The proposed model is evaluated by applying it to eight universal classification datasets and assessing its performance according to four measurement metrics: detection rate, specificity, false alarm rate, and accuracy. Its performance is compared with that of the existing dendritic cell algorithm and four non-bio-inspired classifiers, namely, rough set, decision tree, naïve Bayes, and multilayer perceptron. The results show that the proposed model outperforms the existing model and the other classifiers as well as demonstrates a significant improvement in terms of specificity, false alarm rate, and accuracy for all datasets. This indicates that the proposed normalization approach can be applied to general classification problems and can improve detection performance. PubDate: 2016-04-20 DOI: 10.1007/s12065-016-0136-3 Issue No:Vol. 9, No. 1-2 (2016)

Authors:D. Lim; Y. S. Ong; A. Gupta; C. K. Goh; P. S. Dutta Abstract: As the field of evolutionary optimization continues to expand, it is becoming increasingly common to incorporate various machine learning approaches, such as clustering, classification, and regression models, to improve algorithmic efficiency. However, we note that although problem learning is popularly used in improving the ongoing optimization process, little effort is ever made in extracting re-usable domain knowledge. In other words, the acquired knowledge is seldom transferred and exploited for future design exercises. Focusing on evolutionary optimization, in this paper we investigate the concept of simultaneous problem learning and optimization inspired by the following notions: (1) that prior/dynamically acquired knowledge can enhance the effectiveness of evolutionary search, and (2) that evolution can be geared towards gathering crucial knowledge about the underlying problem. Taking benchmark functions as well as an engineering (process) design problem into consideration, we demonstrate the efficacy of a novel classifier-assisted constrained EA towards simultaneous evolutionary search and problem learning. PubDate: 2016-10-11 DOI: 10.1007/s12065-016-0146-1

Authors:Dietmar Maringer; Sebastian H. M. Deininger Abstract: Selecting and estimating parsimonious models is often desired, but hard to achieve. This is particularly true when models can potentially contain a very large number of parameters but data are scarce—as is the case for many macro-economic models in general and interest-rate models in particular. These models need to cater for a large number of potential relationships and dependencies, but are fitted on low-frequency data to focus on the bigger picture and long-term effects. To identify the ideal model and estimating it is then particularly demanding from an optimization perspective. In this paper, we suggest an evolutionary approach that considers model selection and estimation simultaneously. Numerical experiments with artificial data suggest that the approach is well suited for this type of problem. In an empirical application for short-term and long-term interest rates denominated in US dollar, euro and the Japanese yen, respectively, parsimonious model structures are identified that highlight the dependencies as well as spill-overs across maturities and currencies. PubDate: 2016-09-27 DOI: 10.1007/s12065-016-0145-2

Authors:Igor Deplano; Giovanni Squillero; Alberto Tonda Abstract: Predicting the market’s behavior to profit from trading stocks is far from trivial. Such a task becomes even harder when investors do not have large amounts of money available, and thus cannot influence this complex system in any way. Machine learning paradigms have been already applied to financial forecasting, but usually with no restrictions on the size of the investor’s budget. In this paper, we analyze an evolutionary portfolio optimizer for the management of limited budgets, dissecting each part of the framework, discussing in detail the issues and the motivations that led to the final choices. Expected returns are modeled resorting to artificial neural networks trained on past market data, and the portfolio composition is chosen by approximating the solution to a multi-objective constrained problem. An investment simulator is eventually used to measure the portfolio performance. The proposed approach is tested on real-world data from New York’s, Milan’s and Paris’ stock exchanges, exploiting data from June 2011 to May 2014 to train the framework, and data from June 2014 to July 2015 to validate it. Experimental results demonstrate that the presented tool is able to obtain a more than satisfying profit for the considered time frame. PubDate: 2016-09-02 DOI: 10.1007/s12065-016-0144-3

Authors:Cao Truong Tran; Mengjie Zhang; Peter Andreae; Bing Xue Abstract: Missing values are an unavoidable problem of many real-world datasets. Inadequate treatment of missing values may result in large errors on classification; thus, dealing well with missing values is essential for classification. Feature selection has been well known for improving classification, but it has been seldom used for improving classification with incomplete datasets. Moreover, some classifiers such as C4.5 are able to directly classify incomplete datasets, but they often generate more complex classifiers with larger classification errors. The purpose of this paper is to propose a wrapper-based feature selection method to improve the ability of a classifier able to classify incomplete datasets. In order to achieve the purpose, the feature selection method evaluates feature subsets using a classifier able to classify incomplete datasets. Empirical results on 14 datasets using particle swarm optimisation for searching feature subsets and C4.5 for evaluating the feature subsets in the feature selection method show that the wrapper-based feature selection is not only able to improve classification accuracy of the classifier, but also able to reduce the size of trees generated by the classifier. PubDate: 2016-08-09 DOI: 10.1007/s12065-016-0141-6

Authors:Mohammad Majid al-Rifaie; Tim Blackwell Abstract: This paper extends particle aggregate reconstruction technique (PART), a reconstruction algorithm for binary tomography based on the movement of particles. PART supposes that pixel values are particles, and that particles diffuse through the image, staying together in regions of uniform pixel value known as aggregates. In this work, a variation of this algorithm is proposed and a focus is placed on reducing the number of projections and whether this impacts the reconstruction of images. The algorithm is tested on three phantoms of varying sizes and numbers of forward projections and compared to filtered back projection, a random search algorithm and to SART, a standard algebraic reconstruction method. It is shown that the proposed algorithm outperforms the aforementioned algorithms on small numbers of projections. This potentially makes the algorithm attractive in scenarios where collecting less projection data are inevitable. PubDate: 2016-08-08 DOI: 10.1007/s12065-016-0140-7

Authors:Yih-Chun Cheng; Tom Hartmann; Pei-Yun Tsai; Martin Middendorf Abstract: A population based ant colony optimization algorithm (PACO) for the reconstruction of electrocardiogram (ECG) signals is proposed. Specifically, the PACO finds a subset of nonzero positions of a sparse wavelet domain ECG signal vector that is used for the reconstruction of the signal. A time window is used by the proposed PACO for fixing certain decisions of the ants during the run of the algorithm. The optimization behaviour of the PACO is compared with various algorithms from the literature for ECG signal reconstruction, and with two random search heuristics. Experimental results are presented for ECG signals from the MIT-BIT Arrhythmia database. The influence of several algorithmic parameters and of a local search procedure is evaluated. The results show that the proposed PACO algorithm reconstructs ECG signals with high accuracy. PubDate: 2016-07-30 DOI: 10.1007/s12065-016-0139-0

Authors:Shu-Heng Chen; Umberto Gostoli Abstract: In this paper, we carry out a sensitivity analysis for an agent-based model of the use of public resources as manifested by the El Farol Bar problem. An early study using the same model has shown that a good-society equilibrium, characterized by both economic efficiency and economic equality, can be achieved probabilistically by a von Neumann network, and can be achieved surely with the presence of some agents having social preferences, such as the inequity-averse preference or the ‘keeping-up-with-the-Joneses’ preference. In this study, we examine this fundamental result by exploring the inherent complexity of the model; specifically, we address the effect of the three key parameters related to size, namely, the network size, the neighborhood size, and the memory size. We find that social preferences still play an important role over all the sizes considered. Nonetheless, it is also found that when network size becomes large, the parameter, the bar capacity (the attendance threshold), may also play a determining role. PubDate: 2016-07-02 DOI: 10.1007/s12065-016-0138-1

Authors:Alhanoof Althnian; Arvin Agah Abstract: This paper presents an evolutionary approach that, given a performance goal, produces a communication strategy that can improve a multi-agent system’s performance with respect to the desired goal. The evolved strategy determines what, when, and to whom agents communicate. The proposed approach further enables tuning the trade-off between the performance goal and communication cost, to produce a strategy that achieves a good balance between the two objectives, according the system’s designer needs. Experiments are designed to evaluate the approach using the Wumpus World application domain, with variations of three factors: fitness parameters (including objectives’ weights and action and communication costs), fitness goal, and simulation environment. Results show that the system’s performance can be highly tuned by controlling communication, and that the presented approach has significant utilization in improving the performance with respect to the goal. PubDate: 2016-06-13 DOI: 10.1007/s12065-016-0137-2

Authors:Theodore W. Cornforth; Hod Lipson Pages: 149 - 164 Abstract: Natural and artificial dynamical systems in the real world often have dynamics at multiple time scales. Such dynamics can contribute substantially to the complexity of a dynamical system and increase the difficulty with which it can be analyzed. Although evolutionary algorithms have been proposed that are amenable to the automated modeling of dynamical systems, none have explicitly taken into account multiple time scales or leveraged the information about these dynamics that is inherent in experimental observations. We propose a hybrid approach to the design of models for multiple-time-scale dynamical systems that combines an evolutionary algorithm with other metaheuristics and conventional nonlinear regression. With only minimal human-supplied domain knowledge, the algorithm automates the process of analyzing raw experimental observations and creating an interpretable symbolic model of the system under study. We describe the algorithm in detail and demonstrate its applicability to a variety of both physical and simulated systems. In addition, we study the performance and scalability of the algorithm under different types of dynamics, varying levels of experimental noise, and other factors relevant to the practical application of the algorithm. PubDate: 2015-01-31 DOI: 10.1007/s12065-015-0126-x Issue No:Vol. 8, No. 4 (2015)

Authors:Mohammad-H. Tayarani-N.; Adam Prügel-Bennett Pages: 165 - 184 Abstract: The anatomy of the fitness landscape for the quadratic assignment problem is studied in this paper. We study the properties of both random problems, and real-world problems. Using auto-correlation as a measure for the landscape ruggedness, we study the landscape of the problems and show how this property is related to the problem matrices with which the problems are represented. Our main goal in this paper is to study new properties of the fitness landscape, which have not been studied before, and we believe are more capable of reflecting the problem difficulties. Using local search algorithm which exhaustively explore the plateaus and the local optima, we explore the landscape, store all the local optima we find, and study their properties. The properties we study include the time it takes for a local search algorithm to find local optima, the number of local optima, the probability of reaching the global optimum, the expected cost of the local optima around the global optimum and the basin of attraction of the global and local optima. We study the properties for problems of different sizes, and through extrapolations, we show how the properties change with the system size and why the problem becomes harder as the system size grows. In our study we show how the real-world problems are similar to, or different from the random problems. We also try to show what properties of the problem matrices make the landscape of the real problems be different from or similar to one another. PubDate: 2015-05-22 DOI: 10.1007/s12065-015-0132-z Issue No:Vol. 8, No. 4 (2015)

Authors:Syahaneim Marzukhi; Will N. Browne; Mengjie Zhang Pages: 185 - 201 Abstract: The Three-Cornered Coevolution Framework describes a method that is capable of addressing classification tasks through coevolution (coadaptive evolution) where three different agents (i.e. a generation agent and two classification agents) learn and adapt to the changes of the problems without human involvement. Here, artificial problems can be generated in concert with classification agents in order to provide insight into their relationships. Previous work on the Two-Cornered Coevolution Framework provided foundation for implementing the system that was able to set-up the problem’s difficulty appropriately while triggering the coevolutionary process. However, the triggering process was set manually without utilising the third agent as proposed in the original framework to perform this task. Previous work on the Three-Cornered Coevolution introduced the third agent (a new classification agent) to trigger the coevolutionary process within the system, where its functionality and effect on the system requires investigation. This paper details the implementation for this case; two classification agents that use different styles of learning techniques (e.g. supervised versus reinforcement learning techniques) is adapted in the classification agents to learn the various classification problems. Ultimately, Learning Classifier System (LCS) is chosen to be implemented in the participating agents. LCS has several potential characteristics, such as interpretability, generalization capability and variations in representation, that are suitable for the system. Experiments show that the Pittsburgh-style LCS with the adaptation of Tabu Search technique in S capable to autonomously adjust the problem’s difficulty and generate a wide range of problems for classification. The adaptation of A-PLUS to an ‘on-line’ system is successful implemented. Further, the classification agents (i.e. R and I) are able to solve the classification tasks where the classification performance are varied. The Three-Cornered Coevolution Framework offers a great potential for autonomous learning and provides useful insight into coevolution learning over the standard studies of pattern recognition. The system is capable of autonomously generating various problems, learning and providing insight into each learning system’s ability by determining the problem domains where they perform relatively well. This is in contrast to humans having to determine the problem domains. PubDate: 2015-11-18 DOI: 10.1007/s12065-015-0133-y Issue No:Vol. 8, No. 4 (2015)

Authors:Larry Bull Pages: 55 - 70 Abstract: The direction set by Wilson’s XCS is that modern Learning Classifier Systems can be characterized by their use of rule accuracy as the utility metric for the search algorithm(s) discovering useful rules. Such searching typically takes place within the restricted space of co-active rules for efficiency. This paper gives an overview of the evolution of Learning Classifier Systems up to XCS, and then of some of the subsequent developments of Wilson’s algorithm to different types of learning. PubDate: 2015-01-29 DOI: 10.1007/s12065-015-0125-y Issue No:Vol. 8, No. 2-3 (2015)

Authors:Ryan J. Urbanowicz; Jason H. Moore Pages: 89 - 116 Abstract: Algorithmic scalability is a major concern for any machine learning strategy in this age of ‘big data’. A large number of potentially predictive attributes is emblematic of problems in bioinformatics, genetic epidemiology, and many other fields. Previously, ExSTraCS was introduced as an extended Michigan-style supervised learning classifier system that combined a set of powerful heuristics to successfully tackle the challenges of classification, prediction, and knowledge discovery in complex, noisy, and heterogeneous problem domains. While Michigan-style learning classifier systems are powerful and flexible learners, they are not considered to be particularly scalable. For the first time, this paper presents a complete description of the ExSTraCS algorithm and introduces an effective strategy to dramatically improve learning classifier system scalability. ExSTraCS 2.0 addresses scalability with (1) a rule specificity limit, (2) new approaches to expert knowledge guided covering and mutation mechanisms, and (3) the implementation and utilization of the TuRF algorithm for improving the quality of expert knowledge discovery in larger datasets. Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improve nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to perform on related 200 and 2000-attribute datasets. ExSTraCS 2.0 was also able to reliably solve the 6, 11, 20, 37, 70, and 135 multiplexer problems, and did so in similar or fewer learning iterations than previously reported, with smaller finite training sets, and without using building blocks discovered from simpler multiplexer problems. Furthermore, ExSTraCS usability was made simpler through the elimination of previously critical run parameters. PubDate: 2015-04-03 DOI: 10.1007/s12065-015-0128-8 Issue No:Vol. 8, No. 2-3 (2015)

Authors:Pier Luca Lanzi; Daniele Loiacono Pages: 117 - 132 Abstract: Tile coding is an effective reinforcement learning method that uses a rather ingenious generalization mechanism based on (1) a carefully designed parameter setting and (2) the assumption that nearby states in the problem space will correspond to similar payoff predictions in the action-value function. Previously, we extended XCSF with tile coding prediction and compared it to tabular tile coding, showing that (1) XCSF performs as well as parameter-optimized tile coding, while also (2) evolving individualized parameter settings in each problem subspace. Our comparison was based on a set of well-known reinforcement learning environments (2D Gridworld and the Mountain Car) that involved no action-value discontinuities and so posed no challenge to tabular tile coding. In this paper, we go a step further and test XCSF with tile coding on a set of problems designed to challenge tile coding by introducing discontinuities in the action value landscape. The new testbed (called MazeWorld) extends 2D Gridworld with impenetrable obstacles, a conceptually simple modification that can dramatically increase the problem complexity for tabular tile coding. We compare four versions of XCSF with tile coding (each adapting a different set of parameters) to tabular tile coding on four problems of increasing complexity. We show that our system (1) needs fewer training problems than standard tile coding to reach an optimal policy; (2) it can evolve adequate coding parameters in each subspace without any previous knowledge; and that (3) even when XCSF is not allowed to evolve these parameters, the genetic algorithm will still adapt classifier conditions to properly decompose the problem into subspaces thus being much less sensitive to the parameter settings than tabular tile coding. PubDate: 2015-04-11 DOI: 10.1007/s12065-015-0129-7 Issue No:Vol. 8, No. 2-3 (2015)