Authors:Hoai Bach Nguyen; Bing Xue; Peter Andreae Pages: 95 - 110 Abstract: Abstract
In classification, feature selection is an important pre-processing step to simplify the dataset and improve the data representation quality, which makes classifiers become better, easier to train, and understand. Because of an ability to analyse non-linear interactions between features, mutual information has been widely applied to feature selection. Along with counting approaches, a traditional way to calculate mutual information, many mutual information estimations have been proposed to allow mutual information to work directly on continuous datasets. This work focuses on comparing the effect of counting approach and kernel density estimation (KDE) approach in feature selection using particle swarm optimisation as a search mechanism. The experimental results on 15 different datasets show that KDE can work well on both continuous and discrete datasets. In addition, feature subsets evolved by KDE achieves similar or better classification performance than the counting approach. Furthermore, the results on artificial datasets with various interactions show that KDE is able to capture correctly the interaction between features, in both relevance and redundancy, which can not be achieved by using the counting approach. PubDate: 2016-08-20 DOI: 10.1007/s12065-016-0143-4 Issue No:Vol. 9, No. 3 (2016)

Authors:Nafrizuan Mat Yahya; M. Osman Tokhi; Hyreil Anuar Kasdirin Pages: 1 - 20 Abstract: Abstract
Bats sonar algorithm (BSA) as a swarm intelligence approach utilises the concept of echolocation of bats to find prey. However, the algorithm is unable to achieve good precision and fast convergence rate to the optimum solution. With this in mind, an adaptive bats sonar algorithm is introduced with new paradigms of real bats echolocation behaviour. The performance of the algorithm is validated through rigorous tests with several single objective optimisation benchmark test functions. The obtained results show that the proposed scheme outperforms the BSA in terms of accuracy and convergence speed and can be efficiently employed to solve engineering problems. PubDate: 2016-02-18 DOI: 10.1007/s12065-016-0134-5 Issue No:Vol. 9, No. 1-2 (2016)

Authors:Seyed Alireza Mohseni; Tony Wong; Vincent Duchaine Pages: 21 - 36 Abstract: Abstract
As with other nature-inspired algorithms, the cuckoo optimization algorithm (COA) produces a population of candidate solutions to find the (near-) optimal solutions to a problem. In this paper, several modifications, including a dynamic mutation operator, are proposed for this algorithm. Design of experiments is employed to determine factors controlling the value of parameters and the target levels of those values to achieve desirable output. The efficiency of the modified COA algorithm is substantiated with the help of several optimization test problems. The results are then compared to other well-known algorithms such as PSO, DE and harmony search using a non-parametric statistical procedure. In order to analyze its effectiveness, the proposed modified COA is applied to a feature selection problem and spacecraft attitude control problem. PubDate: 2016-04-12 DOI: 10.1007/s12065-016-0135-4 Issue No:Vol. 9, No. 1-2 (2016)

Authors:Mohamad Farhan Mohamad Mohsin; Abdul Razak Hamdan; Azuraliza Abu Bakar Pages: 37 - 51 Abstract: Abstract
Signal normalization is a part of signal formalization which is a vital data pre-processing constraint required for the functioning of the dendritic cell algorithm. In existing applications, most normalization algorithms are developed purposely for a specific application with drawing on human domain expertise and very few algorithms are designed for general problems. This makes it difficult for the inexperienced user to exploit existing approaches to another problem, particularly when the initial information about the problem is limited. Therefore, this study proposes a new signal normalization method for the dendritic cell algorithm that uses the statistical upper and lower cumulative sum so that the algorithm can be applied to general classification problems. In addition, a new method to calculate the anomaly threshold based on the average mature-contact antigen value is presented to suit the proposed algorithm. The proposed model is evaluated by applying it to eight universal classification datasets and assessing its performance according to four measurement metrics: detection rate, specificity, false alarm rate, and accuracy. Its performance is compared with that of the existing dendritic cell algorithm and four non-bio-inspired classifiers, namely, rough set, decision tree, naïve Bayes, and multilayer perceptron. The results show that the proposed model outperforms the existing model and the other classifiers as well as demonstrates a significant improvement in terms of specificity, false alarm rate, and accuracy for all datasets. This indicates that the proposed normalization approach can be applied to general classification problems and can improve detection performance. PubDate: 2016-04-20 DOI: 10.1007/s12065-016-0136-3 Issue No:Vol. 9, No. 1-2 (2016)

Authors:Cao Truong Tran; Mengjie Zhang; Peter Andreae; Bing Xue Abstract: Abstract
Missing values are an unavoidable problem of many real-world datasets. Inadequate treatment of missing values may result in large errors on classification; thus, dealing well with missing values is essential for classification. Feature selection has been well known for improving classification, but it has been seldom used for improving classification with incomplete datasets. Moreover, some classifiers such as C4.5 are able to directly classify incomplete datasets, but they often generate more complex classifiers with larger classification errors. The purpose of this paper is to propose a wrapper-based feature selection method to improve the ability of a classifier able to classify incomplete datasets. In order to achieve the purpose, the feature selection method evaluates feature subsets using a classifier able to classify incomplete datasets. Empirical results on 14 datasets using particle swarm optimisation for searching feature subsets and C4.5 for evaluating the feature subsets in the feature selection method show that the wrapper-based feature selection is not only able to improve classification accuracy of the classifier, but also able to reduce the size of trees generated by the classifier. PubDate: 2016-08-09 DOI: 10.1007/s12065-016-0141-6

Authors:Mohammad Majid al-Rifaie; Tim Blackwell Abstract: Abstract
This paper extends particle aggregate reconstruction technique (PART), a reconstruction algorithm for binary tomography based on the movement of particles. PART supposes that pixel values are particles, and that particles diffuse through the image, staying together in regions of uniform pixel value known as aggregates. In this work, a variation of this algorithm is proposed and a focus is placed on reducing the number of projections and whether this impacts the reconstruction of images. The algorithm is tested on three phantoms of varying sizes and numbers of forward projections and compared to filtered back projection, a random search algorithm and to SART, a standard algebraic reconstruction method. It is shown that the proposed algorithm outperforms the aforementioned algorithms on small numbers of projections. This potentially makes the algorithm attractive in scenarios where collecting less projection data are inevitable. PubDate: 2016-08-08 DOI: 10.1007/s12065-016-0140-7

Authors:Yih-Chun Cheng; Tom Hartmann; Pei-Yun Tsai; Martin Middendorf Abstract: Abstract
A population based ant colony optimization algorithm (PACO) for the reconstruction of electrocardiogram (ECG) signals is proposed. Specifically, the PACO finds a subset of nonzero positions of a sparse wavelet domain ECG signal vector that is used for the reconstruction of the signal. A time window is used by the proposed PACO for fixing certain decisions of the ants during the run of the algorithm. The optimization behaviour of the PACO is compared with various algorithms from the literature for ECG signal reconstruction, and with two random search heuristics. Experimental results are presented for ECG signals from the MIT-BIT Arrhythmia database. The influence of several algorithmic parameters and of a local search procedure is evaluated. The results show that the proposed PACO algorithm reconstructs ECG signals with high accuracy. PubDate: 2016-07-30 DOI: 10.1007/s12065-016-0139-0

Authors:Shu-Heng Chen; Umberto Gostoli Abstract: Abstract
In this paper, we carry out a sensitivity analysis for an agent-based model of the use of public resources as manifested by the El Farol Bar problem. An early study using the same model has shown that a good-society equilibrium, characterized by both economic efficiency and economic equality, can be achieved probabilistically by a von Neumann network, and can be achieved surely with the presence of some agents having social preferences, such as the inequity-averse preference or the ‘keeping-up-with-the-Joneses’ preference. In this study, we examine this fundamental result by exploring the inherent complexity of the model; specifically, we address the effect of the three key parameters related to size, namely, the network size, the neighborhood size, and the memory size. We find that social preferences still play an important role over all the sizes considered. Nonetheless, it is also found that when network size becomes large, the parameter, the bar capacity (the attendance threshold), may also play a determining role. PubDate: 2016-07-02 DOI: 10.1007/s12065-016-0138-1

Authors:Alhanoof Althnian; Arvin Agah Abstract: Abstract
This paper presents an evolutionary approach that, given a performance goal, produces a communication strategy that can improve a multi-agent system’s performance with respect to the desired goal. The evolved strategy determines what, when, and to whom agents communicate. The proposed approach further enables tuning the trade-off between the performance goal and communication cost, to produce a strategy that achieves a good balance between the two objectives, according the system’s designer needs. Experiments are designed to evaluate the approach using the Wumpus World application domain, with variations of three factors: fitness parameters (including objectives’ weights and action and communication costs), fitness goal, and simulation environment. Results show that the system’s performance can be highly tuned by controlling communication, and that the presented approach has significant utilization in improving the performance with respect to the goal. PubDate: 2016-06-13 DOI: 10.1007/s12065-016-0137-2

Authors:Theodore W. Cornforth; Hod Lipson Pages: 149 - 164 Abstract: Abstract
Natural and artificial dynamical systems in the real world often have dynamics at multiple time scales. Such dynamics can contribute substantially to the complexity of a dynamical system and increase the difficulty with which it can be analyzed. Although evolutionary algorithms have been proposed that are amenable to the automated modeling of dynamical systems, none have explicitly taken into account multiple time scales or leveraged the information about these dynamics that is inherent in experimental observations. We propose a hybrid approach to the design of models for multiple-time-scale dynamical systems that combines an evolutionary algorithm with other metaheuristics and conventional nonlinear regression. With only minimal human-supplied domain knowledge, the algorithm automates the process of analyzing raw experimental observations and creating an interpretable symbolic model of the system under study. We describe the algorithm in detail and demonstrate its applicability to a variety of both physical and simulated systems. In addition, we study the performance and scalability of the algorithm under different types of dynamics, varying levels of experimental noise, and other factors relevant to the practical application of the algorithm. PubDate: 2015-01-31 DOI: 10.1007/s12065-015-0126-x Issue No:Vol. 8, No. 4 (2015)

Authors:Mohammad-H. Tayarani-N.; Adam Prügel-Bennett Pages: 165 - 184 Abstract: Abstract
The anatomy of the fitness landscape for the quadratic assignment problem is studied in this paper. We study the properties of both random problems, and real-world problems. Using auto-correlation as a measure for the landscape ruggedness, we study the landscape of the problems and show how this property is related to the problem matrices with which the problems are represented. Our main goal in this paper is to study new properties of the fitness landscape, which have not been studied before, and we believe are more capable of reflecting the problem difficulties. Using local search algorithm which exhaustively explore the plateaus and the local optima, we explore the landscape, store all the local optima we find, and study their properties. The properties we study include the time it takes for a local search algorithm to find local optima, the number of local optima, the probability of reaching the global optimum, the expected cost of the local optima around the global optimum and the basin of attraction of the global and local optima. We study the properties for problems of different sizes, and through extrapolations, we show how the properties change with the system size and why the problem becomes harder as the system size grows. In our study we show how the real-world problems are similar to, or different from the random problems. We also try to show what properties of the problem matrices make the landscape of the real problems be different from or similar to one another. PubDate: 2015-05-22 DOI: 10.1007/s12065-015-0132-z Issue No:Vol. 8, No. 4 (2015)

Authors:Syahaneim Marzukhi; Will N. Browne; Mengjie Zhang Pages: 185 - 201 Abstract: Abstract
The Three-Cornered Coevolution Framework describes a method that is capable of addressing classification tasks through coevolution (coadaptive evolution) where three different agents (i.e. a generation agent and two classification agents) learn and adapt to the changes of the problems without human involvement. Here, artificial problems can be generated in concert with classification agents in order to provide insight into their relationships. Previous work on the Two-Cornered Coevolution Framework provided foundation for implementing the system that was able to set-up the problem’s difficulty appropriately while triggering the coevolutionary process. However, the triggering process was set manually without utilising the third agent as proposed in the original framework to perform this task. Previous work on the Three-Cornered Coevolution introduced the third agent (a new classification agent) to trigger the coevolutionary process within the system, where its functionality and effect on the system requires investigation. This paper details the implementation for this case; two classification agents that use different styles of learning techniques (e.g. supervised versus reinforcement learning techniques) is adapted in the classification agents to learn the various classification problems. Ultimately, Learning Classifier System (LCS) is chosen to be implemented in the participating agents. LCS has several potential characteristics, such as interpretability, generalization capability and variations in representation, that are suitable for the system. Experiments show that the Pittsburgh-style LCS with the adaptation of Tabu Search technique in S capable to autonomously adjust the problem’s difficulty and generate a wide range of problems for classification. The adaptation of A-PLUS to an ‘on-line’ system is successful implemented. Further, the classification agents (i.e. R and I) are able to solve the classification tasks where the classification performance are varied. The Three-Cornered Coevolution Framework offers a great potential for autonomous learning and provides useful insight into coevolution learning over the standard studies of pattern recognition. The system is capable of autonomously generating various problems, learning and providing insight into each learning system’s ability by determining the problem domains where they perform relatively well. This is in contrast to humans having to determine the problem domains. PubDate: 2015-11-18 DOI: 10.1007/s12065-015-0133-y Issue No:Vol. 8, No. 4 (2015)

Authors:Larry Bull Pages: 55 - 70 Abstract: Abstract
The direction set by Wilson’s XCS is that modern Learning Classifier Systems can be characterized by their use of rule accuracy as the utility metric for the search algorithm(s) discovering useful rules. Such searching typically takes place within the restricted space of co-active rules for efficiency. This paper gives an overview of the evolution of Learning Classifier Systems up to XCS, and then of some of the subsequent developments of Wilson’s algorithm to different types of learning. PubDate: 2015-01-29 DOI: 10.1007/s12065-015-0125-y Issue No:Vol. 8, No. 2-3 (2015)

Authors:Masaya Nakata; Pier Luca Lanzi; Keiki Takadama Pages: 71 - 87 Abstract: Abstract
The XCS classifier system is a rule-based evolutionary machine learning system. XCS evolves classifiers in order to learn generalized solutions.
The XCS with adaptive action mapping (XCSAM) is inherited from XCS, which evolves a best action map where it evolves classifiers that advocate the best action in every state. Accordingly, XCSAM can potentially evolve solutions that are more compact than XCS, which in contrast focuses on a complete action map. Previous experimental results however have shown that, in some problems, XCSAM may produce solutions with more classifiers than XCS. In this paper, we initially show that the original fitness-based selection strategy of XCS produces non effective classifiers which are not likely to be in the best action map (i.e., they are inaccurate ones or do not have best actions) in XCSAM. Then, we introduce a new selection strategy for XCSAM that promotes the evolution of classifiers advocating the best action map and thus produces more compact solutions. The new strategy selects classifiers based both on their fitness (like XCS) and on the parameter optimality of action of XCSAM. The result is a pressure towards classifiers that are accurate and advocate the best actions. We present analyses showing that the new selection strategy successfully enables XCSAM to focus on classifiers having best actions. Our experimental results show that XCSAM with the new selection strategy (called XCSAM-SS) can evolve smaller solutions than XCS (and the original XCSAM) both in single-step and multi-step problems. As a consequence, XCSAM can also learn with smaller iterations than XCS in the single-step problem. Our conclusion is that, as the best action map potentially has a compact solution, XCSAM evolves a much compact solution than XCS by adding an adequate selection strategy. PubDate: 2015-04-02 DOI: 10.1007/s12065-015-0130-1 Issue No:Vol. 8, No. 2-3 (2015)

Authors:Ryan J. Urbanowicz; Jason H. Moore Pages: 89 - 116 Abstract: Abstract
Algorithmic scalability is a major concern for any machine learning strategy in this age of ‘big data’. A large number of potentially predictive attributes is emblematic of problems in bioinformatics, genetic epidemiology, and many other fields. Previously, ExSTraCS was introduced as an extended Michigan-style supervised learning classifier system that combined a set of powerful heuristics to successfully tackle the challenges of classification, prediction, and knowledge discovery in complex, noisy, and heterogeneous problem domains. While Michigan-style learning classifier systems are powerful and flexible learners, they are not considered to be particularly scalable. For the first time, this paper presents a complete description of the ExSTraCS algorithm and introduces an effective strategy to dramatically improve learning classifier system scalability. ExSTraCS 2.0 addresses scalability with (1) a rule specificity limit, (2) new approaches to expert knowledge guided covering and mutation mechanisms, and (3) the implementation and utilization of the TuRF algorithm for improving the quality of expert knowledge discovery in larger datasets. Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improve nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to perform on related 200 and 2000-attribute datasets. ExSTraCS 2.0 was also able to reliably solve the 6, 11, 20, 37, 70, and 135 multiplexer problems, and did so in similar or fewer learning iterations than previously reported, with smaller finite training sets, and without using building blocks discovered from simpler multiplexer problems. Furthermore, ExSTraCS usability was made simpler through the elimination of previously critical run parameters. PubDate: 2015-04-03 DOI: 10.1007/s12065-015-0128-8 Issue No:Vol. 8, No. 2-3 (2015)

Authors:Pier Luca Lanzi; Daniele Loiacono Pages: 117 - 132 Abstract: Abstract
Tile coding is an effective reinforcement learning method that uses a rather ingenious generalization mechanism based on (1) a carefully designed parameter setting and (2) the assumption that nearby states in the problem space will correspond to similar payoff predictions in the action-value function. Previously, we extended XCSF with tile coding prediction and compared it to tabular tile coding, showing that (1) XCSF performs as well as parameter-optimized tile coding, while also (2) evolving individualized parameter settings in each problem subspace. Our comparison was based on a set of well-known reinforcement learning environments (2D Gridworld and the Mountain Car) that involved no action-value discontinuities and so posed no challenge to tabular tile coding. In this paper, we go a step further and test XCSF with tile coding on a set of problems designed to challenge tile coding by introducing discontinuities in the action value landscape. The new testbed (called MazeWorld) extends 2D Gridworld with impenetrable obstacles, a conceptually simple modification that can dramatically increase the problem complexity for tabular tile coding. We compare four versions of XCSF with tile coding (each adapting a different set of parameters) to tabular tile coding on four problems of increasing complexity. We show that our system (1) needs fewer training problems than standard tile coding to reach an optimal policy; (2) it can evolve adequate coding parameters in each subspace without any previous knowledge; and that (3) even when XCSF is not allowed to evolve these parameters, the genetic algorithm will still adapt classifier conditions to properly decompose the problem into subspaces thus being much less sensitive to the parameter settings than tabular tile coding. PubDate: 2015-04-11 DOI: 10.1007/s12065-015-0129-7 Issue No:Vol. 8, No. 2-3 (2015)

Authors:Masaya Nakata; Tim Kovacs; Keiki Takadama Pages: 133 - 148 Abstract: Abstract
Sequence labeling is an interesting classification domain where, like normal classification, every input has a class label, but unlike normal classification, prediction of an input’s label may depend on the values of other inputs or their classes, and so a learner may need to refer to inputs and classes at different time stamps to classify the current input. This is more difficult because a learner does not know where and how many inputs are needed to classify the current input. Our interest is in learning general rules for sequence labeling. The XCS algorithm is a rule-based knowledge discovery system powered by a genetic algorithm which has often been used for classification. Here we present XCS-SL, an extension of XCS classifier system which can be applicable to sequence labeling. Towards an application of Learning Classifier System (LCS) to sequence labeling, we propose a new classifier condition with memory (called a variable-length condition) and a rule-discovery system for the new classifier condition, which enables XCS to apply it to sequence labeling. In XCS-SL, classification rules (called “classifiers” here) can include extra conditions on previous inputs, which act as memories. In sequence labeling, the number of conditions/memories needed may be different for each input, hence, using a fixed number of conditions (i.e., fixed-length condition) for all classifiers is not a good solution. Instead, XCS-SL classifiers have a variable-length condition to provide more or less memory. The genetic algorithm can grow and shrink conditions to find a suitable memory size. On two synthetic benchmark problems XCS-SL learns optimal classifiers, and on a real-world sequence labeling task it derives high classification accuracy and discovers interesting knowledge that shows dependencies between inputs at different times. The comprehensively described system is the first application of a LCS to sequence labeling and we consider it a promising direction for future work. PubDate: 2015-03-10 DOI: 10.1007/s12065-015-0127-9 Issue No:Vol. 8, No. 2-3 (2015)