  Subjects -> COMPUTER SCIENCE (Total: 2102 journals)
    - ANIMATION AND SIMULATION (31 journals)
    - ARTIFICIAL INTELLIGENCE (103 journals)
    - AUTOMATION AND ROBOTICS (105 journals)
    - COMPUTER ARCHITECTURE (10 journals)
    - COMPUTER ENGINEERING (11 journals)
    - COMPUTER GAMES (21 journals)
    - COMPUTER PROGRAMMING (26 journals)
    - COMPUTER SCIENCE (1221 journals)
    - COMPUTER SECURITY (47 journals)
    - DATA BASE MANAGEMENT (14 journals)
    - DATA MINING (36 journals)
    - E-BUSINESS (22 journals)
    - E-LEARNING (30 journals)
    - IMAGE AND VIDEO PROCESSING (40 journals)
    - INFORMATION SYSTEMS (108 journals)
    - INTERNET (95 journals)
    - SOCIAL WEB (53 journals)
    - SOFTWARE (34 journals)
    - THEORY OF COMPUTING (9 journals)

COMPUTER SCIENCE (1221 journals)

Showing 1 - 200 of 872 Journals sorted alphabetically
3D Printing and Additive Manufacturing     Full-text available via subscription   (Followers: 21)
Abakós     Open Access   (Followers: 4)
ACM Computing Surveys     Hybrid Journal   (Followers: 29)
ACM Journal on Computing and Cultural Heritage     Hybrid Journal   (Followers: 8)
ACM Journal on Emerging Technologies in Computing Systems     Hybrid Journal   (Followers: 16)
ACM Transactions on Accessible Computing (TACCESS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Algorithms (TALG)     Hybrid Journal   (Followers: 15)
ACM Transactions on Applied Perception (TAP)     Hybrid Journal   (Followers: 5)
ACM Transactions on Architecture and Code Optimization (TACO)     Hybrid Journal   (Followers: 9)
ACM Transactions on Autonomous and Adaptive Systems (TAAS)     Hybrid Journal   (Followers: 9)
ACM Transactions on Computation Theory (TOCT)     Hybrid Journal   (Followers: 12)
ACM Transactions on Computational Logic (TOCL)     Hybrid Journal   (Followers: 3)
ACM Transactions on Computer Systems (TOCS)     Hybrid Journal   (Followers: 18)
ACM Transactions on Computer-Human Interaction     Hybrid Journal   (Followers: 15)
ACM Transactions on Computing Education (TOCE)     Hybrid Journal   (Followers: 6)
ACM Transactions on Design Automation of Electronic Systems (TODAES)     Hybrid Journal   (Followers: 6)
ACM Transactions on Economics and Computation     Hybrid Journal   (Followers: 1)
ACM Transactions on Embedded Computing Systems (TECS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Information Systems (TOIS)     Hybrid Journal   (Followers: 20)
ACM Transactions on Intelligent Systems and Technology (TIST)     Hybrid Journal   (Followers: 8)
ACM Transactions on Interactive Intelligent Systems (TiiS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)     Hybrid Journal   (Followers: 9)
ACM Transactions on Reconfigurable Technology and Systems (TRETS)     Hybrid Journal   (Followers: 6)
ACM Transactions on Sensor Networks (TOSN)     Hybrid Journal   (Followers: 8)
ACM Transactions on Speech and Language Processing (TSLP)     Hybrid Journal   (Followers: 9)
ACM Transactions on Storage     Hybrid Journal  
ACS Applied Materials & Interfaces     Hybrid Journal   (Followers: 32)
Acta Automatica Sinica     Full-text available via subscription   (Followers: 2)
Acta Informatica Malaysia     Open Access  
Acta Universitatis Cibiniensis. Technical Series     Open Access  
Ad Hoc Networks     Hybrid Journal   (Followers: 11)
Adaptive Behavior     Hybrid Journal   (Followers: 10)
Advanced Engineering Materials     Hybrid Journal   (Followers: 28)
Advanced Science Letters     Full-text available via subscription   (Followers: 11)
Advances in Adaptive Data Analysis     Hybrid Journal   (Followers: 7)
Advances in Artificial Intelligence     Open Access   (Followers: 15)
Advances in Calculus of Variations     Hybrid Journal   (Followers: 4)
Advances in Catalysis     Full-text available via subscription   (Followers: 5)
Advances in Computational Mathematics     Hybrid Journal   (Followers: 19)
Advances in Computer Engineering     Open Access   (Followers: 4)
Advances in Computer Science : an International Journal     Open Access   (Followers: 14)
Advances in Computing     Open Access   (Followers: 2)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 55)
Advances in Engineering Software     Hybrid Journal   (Followers: 28)
Advances in Geosciences (ADGEO)     Open Access   (Followers: 14)
Advances in Human Factors/Ergonomics     Full-text available via subscription   (Followers: 22)
Advances in Human-Computer Interaction     Open Access   (Followers: 20)
Advances in Materials Science     Open Access   (Followers: 14)
Advances in Operations Research     Open Access   (Followers: 12)
Advances in Parallel Computing     Full-text available via subscription   (Followers: 7)
Advances in Porous Media     Full-text available via subscription   (Followers: 5)
Advances in Remote Sensing     Open Access   (Followers: 49)
Advances in Science and Research (ASR)     Open Access   (Followers: 6)
Advances in Technology Innovation     Open Access   (Followers: 6)
AEU - International Journal of Electronics and Communications     Hybrid Journal   (Followers: 8)
African Journal of Information and Communication     Open Access   (Followers: 9)
African Journal of Mathematics and Computer Science Research     Open Access   (Followers: 4)
AI EDAM     Hybrid Journal   (Followers: 1)
Air, Soil & Water Research     Open Access   (Followers: 12)
AIS Transactions on Human-Computer Interaction     Open Access   (Followers: 6)
Algebras and Representation Theory     Hybrid Journal   (Followers: 1)
Algorithms     Open Access   (Followers: 11)
American Journal of Computational and Applied Mathematics     Open Access   (Followers: 5)
American Journal of Computational Mathematics     Open Access   (Followers: 4)
American Journal of Information Systems     Open Access   (Followers: 6)
American Journal of Sensor Technology     Open Access   (Followers: 4)
Anais da Academia Brasileira de Ciências     Open Access   (Followers: 2)
Analog Integrated Circuits and Signal Processing     Hybrid Journal   (Followers: 7)
Analysis in Theory and Applications     Hybrid Journal   (Followers: 1)
Animation Practice, Process & Production     Hybrid Journal   (Followers: 5)
Annals of Combinatorics     Hybrid Journal   (Followers: 4)
Annals of Data Science     Hybrid Journal   (Followers: 12)
Annals of Mathematics and Artificial Intelligence     Hybrid Journal   (Followers: 13)
Annals of Pure and Applied Logic     Open Access   (Followers: 3)
Annals of Software Engineering     Hybrid Journal   (Followers: 13)
Annals of West University of Timisoara - Mathematics and Computer Science     Open Access  
Annual Reviews in Control     Hybrid Journal   (Followers: 8)
Anuario Americanista Europeo     Open Access  
Applicable Algebra in Engineering, Communication and Computing     Hybrid Journal   (Followers: 2)
Applied and Computational Harmonic Analysis     Full-text available via subscription   (Followers: 1)
Applied Artificial Intelligence: An International Journal     Hybrid Journal   (Followers: 12)
Applied Categorical Structures     Hybrid Journal   (Followers: 2)
Applied Computational Intelligence and Soft Computing     Open Access   (Followers: 13)
Applied Computer Systems     Open Access   (Followers: 2)
Applied Informatics     Open Access  
Applied Mathematics and Computation     Hybrid Journal   (Followers: 33)
Applied Medical Informatics     Open Access   (Followers: 10)
Applied Numerical Mathematics     Hybrid Journal   (Followers: 5)
Applied Soft Computing     Hybrid Journal   (Followers: 16)
Applied Spatial Analysis and Policy     Hybrid Journal   (Followers: 5)
Applied System Innovation     Open Access  
Architectural Theory Review     Hybrid Journal   (Followers: 3)
Archive of Applied Mechanics     Hybrid Journal   (Followers: 5)
Archive of Numerical Software     Open Access  
Archives and Museum Informatics     Hybrid Journal   (Followers: 145)
Archives of Computational Methods in Engineering     Hybrid Journal   (Followers: 5)
arq: Architectural Research Quarterly     Hybrid Journal   (Followers: 8)
Artifact     Hybrid Journal   (Followers: 2)
Artificial Life     Hybrid Journal   (Followers: 7)
Asia Pacific Journal on Computational Engineering     Open Access  
Asia-Pacific Journal of Information Technology and Multimedia     Open Access   (Followers: 1)
Asian Journal of Computer Science and Information Technology     Open Access  
Asian Journal of Control     Hybrid Journal  
Assembly Automation     Hybrid Journal   (Followers: 2)
at - Automatisierungstechnik     Hybrid Journal   (Followers: 1)
Australian Educational Computing     Open Access   (Followers: 1)
Automatic Control and Computer Sciences     Hybrid Journal   (Followers: 5)
Automatic Documentation and Mathematical Linguistics     Hybrid Journal   (Followers: 5)
Automatica     Hybrid Journal   (Followers: 13)
Automation in Construction     Hybrid Journal   (Followers: 7)
Autonomous Mental Development, IEEE Transactions on     Hybrid Journal   (Followers: 8)
Basin Research     Hybrid Journal   (Followers: 5)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Big Data and Cognitive Computing     Open Access   (Followers: 2)
Biodiversity Information Science and Standards     Open Access  
Bioinformatics     Hybrid Journal   (Followers: 308)
Biomedical Engineering     Hybrid Journal   (Followers: 16)
Biomedical Engineering and Computational Biology     Open Access   (Followers: 13)
Biomedical Engineering, IEEE Reviews in     Full-text available via subscription   (Followers: 20)
Biomedical Engineering, IEEE Transactions on     Hybrid Journal   (Followers: 35)
Briefings in Bioinformatics     Hybrid Journal   (Followers: 49)
British Journal of Educational Technology     Hybrid Journal   (Followers: 149)
Broadcasting, IEEE Transactions on     Hybrid Journal   (Followers: 12)
c't Magazin fuer Computertechnik     Full-text available via subscription   (Followers: 1)
CALCOLO     Hybrid Journal  
Calphad     Hybrid Journal   (Followers: 2)
Canadian Journal of Electrical and Computer Engineering     Full-text available via subscription   (Followers: 15)
Capturing Intelligence     Full-text available via subscription  
Catalysis in Industry     Hybrid Journal   (Followers: 1)
CEAS Space Journal     Hybrid Journal   (Followers: 2)
Cell Communication and Signaling     Open Access   (Followers: 2)
Central European Journal of Computer Science     Hybrid Journal   (Followers: 5)
CERN IdeaSquare Journal of Experimental Innovation     Open Access   (Followers: 3)
Chaos, Solitons & Fractals     Hybrid Journal   (Followers: 3)
Chemometrics and Intelligent Laboratory Systems     Hybrid Journal   (Followers: 15)
ChemSusChem     Hybrid Journal   (Followers: 7)
China Communications     Full-text available via subscription   (Followers: 8)
Chinese Journal of Catalysis     Full-text available via subscription   (Followers: 2)
CIN Computers Informatics Nursing     Hybrid Journal   (Followers: 11)
Circuits and Systems     Open Access   (Followers: 15)
Clean Air Journal     Full-text available via subscription   (Followers: 1)
CLEI Electronic Journal     Open Access  
Clin-Alert     Hybrid Journal   (Followers: 1)
Clinical eHealth     Open Access  
Cluster Computing     Hybrid Journal   (Followers: 2)
Cognitive Computation     Hybrid Journal   (Followers: 4)
COMBINATORICA     Hybrid Journal  
Combinatorics, Probability and Computing     Hybrid Journal   (Followers: 4)
Combustion Theory and Modelling     Hybrid Journal   (Followers: 14)
Communication Methods and Measures     Hybrid Journal   (Followers: 12)
Communication Theory     Hybrid Journal   (Followers: 23)
Communications Engineer     Hybrid Journal   (Followers: 1)
Communications in Algebra     Hybrid Journal   (Followers: 3)
Communications in Computational Physics     Full-text available via subscription   (Followers: 2)
Communications in Information Science and Management Engineering     Open Access   (Followers: 4)
Communications in Partial Differential Equations     Hybrid Journal   (Followers: 3)
Communications of the ACM     Full-text available via subscription   (Followers: 51)
Communications of the Association for Information Systems     Open Access   (Followers: 16)
COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering     Hybrid Journal   (Followers: 3)
Complex & Intelligent Systems     Open Access   (Followers: 1)
Complex Adaptive Systems Modeling     Open Access  
Complex Analysis and Operator Theory     Hybrid Journal   (Followers: 2)
Complexity     Hybrid Journal   (Followers: 6)
Complexus     Full-text available via subscription  
Composite Materials Series     Full-text available via subscription   (Followers: 8)
Computación y Sistemas     Open Access  
Computation     Open Access   (Followers: 1)
Computational and Applied Mathematics     Hybrid Journal   (Followers: 3)
Computational and Mathematical Biophysics     Open Access   (Followers: 1)
Computational and Mathematical Methods in Medicine     Open Access   (Followers: 2)
Computational and Mathematical Organization Theory     Hybrid Journal   (Followers: 2)
Computational and Structural Biotechnology Journal     Open Access   (Followers: 2)
Computational and Theoretical Chemistry     Hybrid Journal   (Followers: 9)
Computational Astrophysics and Cosmology     Open Access   (Followers: 1)
Computational Biology and Chemistry     Hybrid Journal   (Followers: 12)
Computational Chemistry     Open Access   (Followers: 2)
Computational Cognitive Science     Open Access   (Followers: 2)
Computational Complexity     Hybrid Journal   (Followers: 4)
Computational Condensed Matter     Open Access  
Computational Ecology and Software     Open Access   (Followers: 9)
Computational Economics     Hybrid Journal   (Followers: 9)
Computational Geosciences     Hybrid Journal   (Followers: 17)
Computational Linguistics     Open Access   (Followers: 23)
Computational Management Science     Hybrid Journal  
Computational Mathematics and Modeling     Hybrid Journal   (Followers: 8)
Computational Mechanics     Hybrid Journal   (Followers: 5)
Computational Methods and Function Theory     Hybrid Journal  
Computational Molecular Bioscience     Open Access   (Followers: 2)
Computational Optimization and Applications     Hybrid Journal   (Followers: 8)
Computational Particle Mechanics     Hybrid Journal   (Followers: 1)
Computational Research     Open Access   (Followers: 1)
Computational Science and Discovery     Full-text available via subscription   (Followers: 2)
Computational Science and Techniques     Open Access  
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 30)
Computer     Full-text available via subscription   (Followers: 99)
Computer Aided Surgery     Open Access   (Followers: 6)
Computer Applications in Engineering Education     Hybrid Journal   (Followers: 8)
Computer Communications     Hybrid Journal   (Followers: 16)
Computer Journal     Hybrid Journal   (Followers: 9)


Chemometrics and Intelligent Laboratory Systems
Journal Prestige (SJR): 0.672
Citation Impact (citeScore): 3
Number of Followers: 15  
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 0169-7439
Published by Elsevier [3162 journals]
  • Modeling and application of industrial process fault detection based on pruning vine copula
    • Publication date: Available online 12 November 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Junxia Wan, Shaojun Li
    • Abstract: Industrial processes are usually nonlinear multivariate stochastic systems. Describing the distribution characteristics of variables through a vine copula model can well define complex correlation information. However, modeling based on vine copula involves computational complexity. This study proposes a method to prune vine copula and introduces an indicator to conduct pruning process. This technique constructs a simplified model by removing the weakly correlated component from the copula structure without reducing the accuracy of model. Lastly, fault detection for industrial processes based on pruning vine copula is conducted based on a generalized local probability monitoring index. Experimental results of the monitoring process show that the method can reduce the time of modeling and improve the effect of fault detection.
  • A novel ensemble model using PLSR integrated with multiple activation functions based ELM: Applications to soft sensor development
    • Publication date: Available online 12 November 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Xiaohan Zhang, Qunxiong Zhu, Zhi-Ying Jiang, Yanlin He, Yuan Xu
    • Abstract: Soft sensor plays a decisive role in making control strategies and production plans. However, the difficulty in establishing accurate and robust soft sensors using an individual model is continuously increasing due to the increasing scale and complexity in modeling data. To handle this problem, an effective ensemble model using partial least squares regression (PLSR) integrated with extreme learning machine (ELM) with multiple activation functions (PLSR-MAFELM) is proposed in this paper. The proposed PLSR-MAFELM is simple in construction: firstly, train several ELM models assigned with different activation functions using the least squares solution; secondly, combine ELM models for enhancing accuracy and stability performance; finally, obtain the optimal ensemble outputs by aggregating the outputs of individual ELM models using PLSR. To test the performance of the proposed PLSR-MAFELM model, a UCI benchmark dataset and two real-world applications are selected to carry out simulation case studies. Simulation results show that PLSR-MAFELM can achieve good stability and accuracy performance, which indicates that the generalization capability of soft sensors can be improved through combining some single models.
  • Solution path efficiency and oracle variable selection by Lasso-type
    • Publication date: Available online 10 November 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Sohail Chand, Sarah Ahmad, Madeeha Batool
    • Abstract: The continuous shrinkage while achieving consistent variable selection is a desirable property of these shrinkage methods. In this paper, we have studied the consistent variable selection of various shrinkage methods. We pointed out that efficiency of solution path plays the key role in selection of relevant variables. We also suggested a novel measure to study the solution path efficiency. Our numerical results show that efficient solution path makes the job of tuning parameter easier. Extensive simulations have been performed to compare the variable selection methods for correct classification of relevant and irrelevant predictors under various scenarios. The method selected on the basis of proposed measure has been applied on a real life example to illustrate its application.
  • A Deep Learning Framework for Sequence-Based bacteria type IV secreted effectors Prediction
    • Publication date: Available online 8 November 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Li Xue, Bin Tang, Wei Chen, Jiesi Luo
    • Abstract: Type IV secretion system (T4SS) is a specialized protein delivery system in gram-negative bacteria that injects proteins (called effectors, T4SEs) directly into the eukaryotic host cytosol and facilitates bacterial infection. Since various T4SEs have been experimentally validated to play important roles in a wide variety of biological activities, identifying them is crucial to our understanding of host-pathogen interactions and bacterial pathogenesis. However, experimental identification is often time-consuming and expensive. In the post-genomic era, it becomes imperative to predict new T4SEs using information from the amino acid sequence alone when new proteins are being identified in a high-throughput mode. Consequently, in this work we propose DeepT4, a novel deep learning method to directly classify any protein sequence into T4SEs or non-T4SEs using only the protein primary sequences. The backbone of our framework is a convolutional neural network (CNN), which automatically extracts T4SEs-related features from 50 N-terminal and 100 C-terminal residues of the protein. We train and test the deep CNN model on a comprehensive dataset across multiple bacterial species, with a high receiver operating curve of 0.876 in the 5-fold cross validation and an accuracy of 92.2% for the test set. Moreover, when performing on a common independent dataset, DeepT4 outperforms known sequence-based state-of-the-art T4SEs prediction methods. We believe that deep learning is a valuable method to predict type IV secreted effectors. This study will be useful in elucidating the secretion mechanism of T4SS and facilitating hypothesis-driven experimental design and validation.
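The abstract above says the CNN reads 50 N-terminal and 100 C-terminal residues. The input step might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how residues are encoded or how short proteins are handled, so the one-hot encoding, the zero padding, and the names `terminal_windows`, `one_hot`, and `encode_protein` are all hypothetical and not taken from DeepT4.

```python
# Hypothetical input preparation; the encoding and padding are assumptions,
# not details given in the abstract.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def terminal_windows(sequence, n_len=50, c_len=100):
    """Return the first n_len and the last c_len residues of a protein."""
    return sequence[:n_len], sequence[-c_len:]


def one_hot(residues, length):
    """One-hot encode residues into `length` rows of 20 columns.

    Positions past the end of `residues`, and non-standard residues,
    are left as all-zero rows (an assumed padding scheme).
    """
    rows = []
    for pos in range(length):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(residues) and residues[pos] in INDEX:
            row[INDEX[residues[pos]]] = 1
        rows.append(row)
    return rows


def encode_protein(sequence, n_len=50, c_len=100):
    """Stack the encoded N-terminal and C-terminal windows (150 x 20 by default)."""
    n_term, c_term = terminal_windows(sequence, n_len, c_len)
    return one_hot(n_term, n_len) + one_hot(c_term, c_len)
```

A matrix of this shape is the kind of fixed-size input a convolutional network can consume; the network architecture itself is not described in the abstract and is not sketched here.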
  • Variable selection in partial least squares with the weighted variable contribution to the first singular value of the covariance matrix
    • Publication date: Available online 8 November 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Weilu Lin, Haifeng Hang, Yingping Zhuang, Siliang Zhang
    • Abstract: The selection of informative variables in partial least squares (PLS) is important in process analytical technology (PAT) applications in the pharmaceutical industry, for example, the calibration of spectrometers. In the past, numerous approaches have been proposed to select the variables in partial least squares. In this work, a new variable selection method for PLS with the weighted variable contribution (PLS-WVC) to the first singular value of the covariance matrix for each PLS component is proposed. Several variants of PLS-WVC with different weighting factors are proposed. One variant of PLS-WVC is equivalent to the PLS with variable importance in projection (PLS-VIP). However, the variants with the correlation between Xγwγ and Yγqγ as the weighting factor are preferred based on the results of the simulation cases studies. The proposed PLS-WVCs are integrated with interval PLS (iPLS) further to select the informative wavelength intervals for spectroscopic modelling. The utility of the proposed WVC based variable selection methods in PLS is demonstrated with the real spectral data sets.
  • MVBatch: A matlab toolbox for batch process modeling and monitoring
    • Publication date: Available online 7 November 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): J.M. González-Martínez, J. Camacho, A. Ferrer
    • Abstract: A novel user-friendly graphical interface for process understanding, monitoring and troubleshooting has been developed as a freely available MATLAB toolbox, called the MultiVariate Batch (MVBatch) Toolbox. The main contribution of this software package is the integration of recent developments in Principal Component Analysis (PCA) based Batch Multivariate Statistical Process Monitoring (BMSPM) that overcome modeling problems such as missing data, different speed of process evolution and length of batch trajectories, and multiple stages. An interactive user interface is provided, which aims to guide users in handling batch data through the main BMSPM steps: data alignment, data modeling, and the development of monitoring schemes. In addition, a small-scale non-linear dynamic simulator of the fermentation process of the Saccharomyces cerevisiae cultivation is available to generate realistic batch data under normal and abnormal operating conditions. This generator of synthetic data can be used for teaching purposes or as a benchmark to illustrate and compare the performance of new methods with sound techniques published in the field of BMSPM.
  • Monitoring batch processes with dynamic time warping and k-nearest
    • Publication date: Available online 31 October 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Max Spooner, Murat Kulahci
    • Abstract: A novel data driven approach to batch process monitoring is presented, which combines the k-Nearest Neighbour rule with the dynamic time warping (DTW) distance. This online method (DTW-NN) calculates the DTW distance between an ongoing batch, and each batch in a reference database of batches produced under normal operating conditions (NOC). The sum of the k smallest DTW distances is monitored. If a fault occurs in the ongoing batch, then this distance increases and an alarm is generated. The monitoring statistic is easy to interpret, being a direct measure of similarity of the ongoing batch to its nearest NOC predecessors and the method makes no distributional assumptions regarding normal operating conditions. DTW-NN is applied to four extensive datasets from simulated batch production of penicillin, and tested on a wide variety of fault types, magnitudes and onset times. Performance of DTW-NN is contrasted with a benchmark multiway PCA approach, and DTW-NN is shown to perform particularly well when there is clustering of batches under NOC.
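The monitoring statistic described in the abstract above (the sum of the k smallest DTW distances to a reference set of normal batches) can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes univariate, completed trajectories, uses textbook DTW with an absolute-difference local cost, and takes the alarm limit as a user-supplied number. The names `dtw_distance`, `dtw_nn_statistic`, and `is_alarm` are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW between two numeric sequences.

    The local cost is the absolute difference (an assumption; the abstract
    does not specify the local distance).
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_nn_statistic(batch, reference_batches, k):
    """Sum of the k smallest DTW distances from `batch` to the reference batches."""
    distances = sorted(dtw_distance(batch, ref) for ref in reference_batches)
    return sum(distances[:k])


def is_alarm(batch, reference_batches, k, limit):
    """Flag the batch when the statistic exceeds a chosen control limit."""
    return dtw_nn_statistic(batch, reference_batches, k) > limit
```

For example, `dtw_nn_statistic([0, 0, 0], [[0, 0, 0], [1, 1, 1], [5, 5, 5]], k=2)` sums the distances to the two closest references. How the control limit is chosen, and how an unfinished batch is compared against completed ones, are not covered by the abstract and are left out of this sketch.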
  • A new method for choosing the biasing parameter in ridge estimator for generalized linear model
    • Publication date: Available online 31 October 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Zakariya Yahya Algamal
    • Abstract: Multicollinearity problem arises frequently in several modern applications, such as chemometrics, biology, and other scientific fields. The common feature of the multicollinearity problem is that a large number of predictors are highly correlated. Generalized linear model is a powerful and a popular approach for modeling a large variety of regression data. It is well known that the existence of multicollinearity can inflate the variance of the maximum likelihood estimator. To reduce the effects of multicollinearity, the ridge estimator has been efficiently demonstrated to be an attractive method. However, the choice of the biasing parameter of the ridge estimator is critical. Our aim is to efficiently estimate such a biasing parameter. Towards this aim, a kidney-inspired algorithm, which is a population-based algorithm inspired by the kidney process in the human body, is proposed. Extensive comparisons with different classical biasing parameter estimating methods are conducted through simulation and real data application. The results demonstrate that our proposed approach is able to find the best biasing parameter value with high prediction accuracy. Further, the results indicate that the performance of our proposed approach is superior to that of other competitor methods.
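For readers unfamiliar with the biasing parameter mentioned above, a commonly used form of the ridge estimator for generalized linear models is shown below. The abstract does not state the formula, so the notation is an assumption: $X$ is the design matrix, $\hat{W}$ the estimated weight matrix from the iteratively reweighted least squares fit, $\hat{\beta}_{\text{ML}}$ the maximum likelihood estimate, and $k$ the biasing parameter.

```latex
\hat{\beta}_{\text{ridge}}(k)
  = \left( X^{\top} \hat{W} X + k I \right)^{-1} X^{\top} \hat{W} X \, \hat{\beta}_{\text{ML}},
  \qquad k \ge 0
```

Setting $k = 0$ recovers the maximum likelihood estimate, while larger $k$ shrinks the coefficients more strongly, which is why the choice of $k$ matters.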
  • Adaptive JIT-Lasso modeling for online application of near infrared
    • Publication date: Available online 29 October 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Jin Liu, Xiaoli Luan, Fei Liu
    • Abstract: Near infrared (NIR) spectroscopy has been widely employed as a non-invasive analytical tool in industry. However, most NIR models are built offline, which cannot address changes in process characteristics as well as nonlinearity. To solve this problem, an adaptive JIT-Lasso algorithm was proposed by merging the least absolute shrinkage and selection operator (Lasso) algorithm into just-in-time (JIT) learning. A time-space similarity measure criterion that combined temporal relevance and spatial relevance was used to further improve the performance of the JIT-Lasso algorithm. This solved both the space nonlinearity and the time-varying issue of the process simultaneously. The proposed model updating approach not only solved the nonlinear and the time-varying issues based on JIT learning framework, but also reduced the computational complexity and improved the model interpretability through Lasso. The effectiveness of the method was demonstrated on a spectroscopic dataset from an industrial petroleum desalination process. Compared with traditional partial least squares, kernel partial least squares, locally weighted partial least squares, and locally weighted kernel partial least squares, the proposed method achieves better performance.
  • Prediction of viscosity index and pour point in Ester Lubricants using quantitative structure-property relationship (QSPR)
    • Publication date: Available online 29 October 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Shima Ghanavati Nasab, Abolfazl Semnani, Federico Marini, Alessandra Biancolillo
    • Abstract: Since ancient times, lubricants have been applied in different fields of technology and, from the very beginning, there has been a wide interest in improving some of their physical-chemical properties. The turning point in the development of lubricants came in the twentieth century, when modern synthetic ester base fluids were realized. In fact, with respect to “natural” lubricants (fats and mineral oils) they can be modified in order to optimize some specific technological properties; in particular, it is desirable that they present a high viscosity index and a low pour point. Nevertheless, it is not always straightforward to adjust these parameters, and different theoretical studies have been pursued in this regard. Above all, valid tools to investigate these types of problems are the Quantitative Structure-Properties/Activity relationship studies (QSPA/QSPR). Starting from these considerations, the aim of the present paper is to investigate, by means of QSPR models, whether it is possible to identify or design ester base lubricants with some peculiar technological specificities. In particular, a QSPR analysis has been conducted in order to predict viscosity index and pour point on 41 ester lubricants by means of partial least squares combined with Leardi’s genetic algorithms. The present study has provided satisfying results from the prediction point of view, and it has led to interesting conclusions from the interpretation viewpoint. In fact, it has highlighted that the viscosity index and, to a lesser extent, the pour point are highly correlated to the 3D geometry, the molecular connectivity and the spatial autocorrelation of the investigated substances.
  • Nonlinear fault detection of batch processes based on functional kernel locality preserving projections
    • Publication date: Available online 29 October 2018
    • Source: Chemometrics and Intelligent Laboratory Systems
    • Author(s): Fei He, Chaojun Wang, Shu-Kai S. Fan
    • Abstract: Data-driven fault detection technique has exhibited its wide applications in industrial process monitoring. The batch dataset is often organized as a special three-way array (i.e., batch × variable × time), and the research of data-based batch process monitoring has attracted considerable attention in the literature. A novel method named functional kernel locality preserving projections (FKLPP) is proposed for batch process monitoring. Since the variables' trajectories often show functional nature and can be considered as smooth functions rather than just vectors, then the three-way data can be transformed into a two-way function matrix. Firstly, the variables’ trajectories are expressed by using the functional data analysis (FDA). By doing so, the original batch process data can be transformed into a two-way manner (batches × functions of the variable trajectories) by describing each variable trajectory as a function of time. Then, kernel locality preserving projections is used to perform dimensionality reduction on two-way function matrix directly. Different from principal component analysis (PCA) which aims at preserving the global Euclidean structure of the data, the FKLPP aims to preserve the local neighborhood information and to detect the intrinsic manifold structure of the data. The kernel trick is applied to the construction of nonlinear kernel model. Consequently, FKLPP may be useful to seek more meaningful intrinsic information hidden in the observations. Lastly, the effectiveness and potentials of the FKLPP-based monitoring approach are illustrated by a benchmark fed-batch penicillin fermentation process and the hot strip rolling process.
  • Small moving window calibration models for soft sensing processes with
           limited history
    • Abstract: Publication date: Available online 25 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Casey Kneale, Steven D. Brown. While many soft-sensing strategies have been developed for monitoring chemical processes, most require extensive historical data to train the soft sensor calibration. Many of these strategies involve very complex models to account for drift and other effects in the process. When a process changes, perhaps because of some upset or a change in process parameters, it is often necessary to rebuild the soft-sensing model, a process that can take considerable time. This paper focuses on the use of simple models that can provide soft-sensing with minimal calibration data, and with minimal training of hyperparameters, to permit process monitoring while a more complex model is rebuilt. The goal of this study was to explore simple soft sensing strategies for the case where a new process has been put online with limited history. Five soft sensor methodologies with two update conditions were compared on two experimentally-obtained datasets and one simulated dataset. The soft sensors investigated were moving window partial least squares regression (and a recursive variant), moving window random forest regression, the mean moving window of the property y, and a novel random forest partial least squares regression ensemble (RF-PLS), all of which can be used with small sample sizes so that they can be rapidly placed online. It was found that, on two of the datasets studied, small window sizes led to the lowest prediction errors for all of the moving window methods studied. On the majority of datasets studied, the RF-PLS calibration method offered the lowest one-step-ahead prediction errors compared to those of the other methods, and it demonstrated greater predictive stability at larger time delays than moving window PLS alone. Both the random forest and RF-PLS methods most adequately modeled datasets that did not feature purely monotonic increases in property values, but both methods performed more poorly than moving window PLS models on one dataset with purely monotonic property values. Other data dependent findings are presented and discussed.
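The simplest of the soft sensors listed in the abstract above, the moving-window mean of the property y, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy series and the window sizes are all assumptions made for the example.

```python
from collections import deque


def moving_window_mean_predictions(y, window):
    """One-step-ahead predictions: each y[t] is predicted by the mean of the
    previous `window` observations (fewer while the window is still filling).
    The first value has no history, so its prediction is None."""
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    predictions = []
    for value in y:
        # Predict before the current value is observed.
        predictions.append(sum(recent) / len(recent) if recent else None)
        recent.append(value)
    return predictions


def rmse(y, predictions):
    """Root mean squared error over the time steps that have a prediction."""
    pairs = [(actual, pred) for actual, pred in zip(y, predictions) if pred is not None]
    return (sum((actual - pred) ** 2 for actual, pred in pairs) / len(pairs)) ** 0.5


# Hypothetical, purely monotonic property values.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
preds_w2 = moving_window_mean_predictions(y, window=2)
preds_w1 = moving_window_mean_predictions(y, window=1)
rmse_w2 = rmse(y, preds_w2)
rmse_w1 = rmse(y, preds_w1)
```

On this toy monotonic series the window of 1 gives a lower error than the window of 2, because a mean over older values lags further behind a steadily rising property.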
  • A two-stage gene selection method for biomarker discovery from microarray
           data for cancer classification
    • Abstract: Publication date: Available online 23 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Alok Kumar Shukla, Pradeep Singh, Manu Vardhan. Microarrays permit experts to monitor the expression profiles of thousands of genes across an array of cellular responses, phenotypes, and conditions. Selecting a small subset of discriminative genes (biomarkers) from high-dimensional data is one of the most significant tasks in bioinformatics. In this article, we develop a new hybrid framework combining CMIM and AGA, called CMIMAGA, that can help to determine the significant biomarkers from gene expression data. In the proposed approach, CMIM is applied as a filter that is easy to understand and removes most of the uninformative genes, and an adaptive genetic algorithm (AGA) is employed as a wrapper method to select the highly discriminating genes for distinguishing instances in the reduced datasets. The AGA method uses the classifiers as a fitness function to select the relevant genes and to classify the tumour and cancer correctly. The performance of the proposed approach is evaluated on six widely used microarray datasets using three classifiers, namely Extreme Learning Machine (ELM), Support Vector Machine (SVM), and k-nearest neighbor (k-NN). The experimental results reveal that our approach with ELM achieves better classification accuracy with a minimum number of genes and outperforms other filter and wrapper approaches.
  • Selecting temperature-dependent variables in near-infrared spectra for
    • Abstract: Publication date: 15 December 2018. Source: Chemometrics and Intelligent Laboratory Systems, Volume 183. Author(s): Xiaoyu Cui, Jin Zhang, Wensheng Cai, Xueguang Shao. Temperature-dependent near-infrared (NIR) spectroscopy has been developed and adopted as a new technique for aquaphotomic studies to analyze water structures in aqueous systems. However, due to spectral overlapping, it is difficult to obtain information from the spectra to understand the temperature dependency. In this work, a method is proposed for selecting the temperature-dependent variables (wavenumbers) from NIR spectra measured at different temperatures. Continuous wavelet transform (CWT) was used to decompose the spectra into spectral components with different frequencies, and then Monte-Carlo uninformative variable elimination (MC-UVE) was employed to evaluate the dependency of the variables on temperature. The feasibility of the method was tested on simulated datasets and its applicability was demonstrated on the temperature-dependent NIR spectra of water and aqueous solutions. Several variables were selected from the spectra of water, indicating the complexity of water structure. The variables are located at similar but not identical wavenumbers for different solutions and they have a good relationship with the concentration of the solute, demonstrating that the water spectrum can be a mirror reflecting the differences between aqueous solutions. The method may provide a tool to identify the characteristic variables in NIR spectra for understanding the function of water in aqueous systems.
  • Representative splitting cross validation
    • Abstract: Publication date: Available online 21 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Lu Xu, Ou Hu, Yuwan Guo, Mengqin Zhang, Daowang Lu, Chen-Bo Cai, Shunping Xie, Mohammad Goodarzi, Hai-Yan Fu, Yuan-Bin She. Cross-validation (CV) is widely used to estimate model complexity or the number of significant latent variables (LVs) for multivariate calibration methods like partial least squares (PLS). A basic consideration when developing and validating multivariate calibration models is that both the training and validation sets should be representative and distributed in the experimental space as uniformly as possible. Motivated by this idea, we proposed a new CV method called representative splitting cross-validation (RSCV). In RSCV, firstly, the DUPLEX algorithm was used to sequentially divide the original training set into k (in this work, k = 2, 4, 8 and 16) equal parts. Secondly, a series of k-fold (k = 2, 4, 8 and 16) CVs were performed based on the above data splitting. Finally, the pooled root mean squared error of CV (RMSECV) was used to estimate model complexity. Five real multivariate calibration data sets were investigated and RSCV was compared with leave-one-out CV (LOOCV), 10-fold CV and Monte Carlo CV (MCCV). With a maximum k of 16, RSCV was shown to be a useful and stable method to select PLS LVs, and can obtain simpler models with acceptable computational burden.
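The idea of pooling the RMSECV over a series of k-fold splits (k = 2, 4, 8, 16) can be sketched roughly as below. Several simplifications are assumptions of this sketch and not taken from the paper: the folds are formed by simple interleaving rather than by the DUPLEX algorithm, the model is a one-variable least-squares line rather than PLS, and "pooled" is read as the root of the mean squared error over all held-out predictions from all splits.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_squared_errors(xs, ys, k):
    """Squared prediction errors of every sample when it is held out once.
    Folds are interleaved (sample i goes to fold i % k); this is a stand-in
    for the DUPLEX-based splitting described in the abstract."""
    errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held_out = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out)
    return errors


def pooled_rmsecv(xs, ys, ks=(2, 4, 8, 16)):
    """Pool the squared CV errors from all k-fold runs into one RMSECV."""
    pooled = []
    for k in ks:
        pooled.extend(kfold_squared_errors(xs, ys, k))
    return (sum(pooled) / len(pooled)) ** 0.5


# Hypothetical data: 16 samples, one exact line and one perturbed copy.
xs = [float(i) for i in range(16)]
ys_exact = [2.0 * x + 1.0 for x in xs]
ys_noisy = [y + (0.5 if i % 3 == 0 else -0.25) for i, y in enumerate(ys_exact)]

rmsecv_exact = pooled_rmsecv(xs, ys_exact)
rmsecv_noisy = pooled_rmsecv(xs, ys_noisy)
```

In a model-selection setting one would compute this pooled value for each candidate complexity (for PLS, each number of latent variables) and compare them; the single-line model here only shows the pooling step.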
  • Active learning for modeling and prediction of dynamical fluid processes
    • Abstract: Publication date: Available online 17 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Hongying Deng, Yi Liu, Ping Li, Shengchang Zhang. Accurate prediction of the flow rate curve of a stroke for reciprocating multiphase pumps often encounters several challenges in practice, including process nonlinearity, dynamical characteristics, and changing multiphase transportation conditions. To enhance the prediction performance, an active learning method is proposed to efficiently design informative training data. Some initial training data are first collected from experiments to construct several local Gaussian process regression (GPR) models. Additionally, with the GPR-based probabilistic information, a relative variance-based criterion is proposed to explore the regions in which new data should be introduced into the GPR prediction model. Moreover, an evaluation criterion is designed to implement the active learning procedure efficiently. Consequently, without time-consuming experiments, a set of new representative training data are sequentially introduced into the GPR prediction model. Experimental results and comparative studies for dynamical flow rate prediction of a stroke are carried out to demonstrate the effectiveness of the proposed method.
  • Final quality prediction method for new batch processes based on improved
           JYKPLS process transfer model
    • Abstract: Publication date: Available online 11 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Fei Chu, Xiang Cheng, Runda Jia, Fuli Wang, Meng Lei. Data-driven methods have been successfully used in modern industrial production. Sufficient data is the basis for implementing these methods. However, it is often impossible to meet this requirement for a new industrial process. In this study, an improved JYKPLS (Joint-Y kernel partial least squares) process transfer model is proposed to solve this issue and perform final product quality prediction for a new batch process. Based on latent variable transfer technology, the rich information from similar old process data is transferred to accelerate the building of a new process model. The requirements on the amount of modeling data and prior knowledge of new processes are visibly reduced. Moreover, in order to handle the nonlinear correlation in process data, a kernel function is introduced to make the data linear and separable. As actual production proceeds, the transfer model is improved gradually by updating it with online data. When the prediction error falls into its confidence interval, the old data with lower similarity are eliminated to avoid negative transfer. The prediction results for penicillin concentration verify the effectiveness of the proposed method.
  • A hybrid model for predicting product sulfur concentration of diesel
           hydrogen desulfurization process
    • Abstract: Publication date: Available online 10 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Prafull Sharma, Syed Imtiaz, Salim Ahmed. In this paper, a hybrid sulfur predictor is developed for the hydrodesulfurization unit and validated using lab-scale reactor and industrial data. The proposed sulfur predictor is configured to estimate product sulfur concentration under unknown feed composition changes. The hybrid structure has an offline feed sulfur estimator based on a mechanistic model for the hydrogen desulfurization reactor. The online predictor is based on support vector regression. The developed soft sensor is validated against an industrial data set. A Matlab-based graphical user interface (GUI) is developed for easy deployment of the developed hybrid online sulfur predictor.
  • Confocal Raman spectroscopy and multivariate data analysis for evaluation
           of spermatozoa with normal and abnormal morphology. A feasibility study
    • Abstract: Publication date: Available online 6 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): R.V. Nazarenko, A.V. Irzhak, A.L. Pomerantsev, O. Ye Rodionova. This paper investigates the feasibility of using confocal Raman spectroscopy (CRS) and multivariate analysis for classification of sperm cells. The spectral-based classification is compared with morphological analysis, which is the main criterion for sperm selection in the intracytoplasmic sperm injection procedure. The spectral analysis is conducted using the data driven soft independent modeling of class analogies method. The supervised classification reveals numerous outliers that pass from the 'normal' class to the 'abnormal' class, and vice versa. The ultimate result shows that the initial morphological discrimination overlaps with the spectral classification only partly. It is shown that CRS provides additional information regarding the nuclear DNA stability and helps to reveal spermatozoa with fragmented and defective DNA. This can be a promising direction for future evaluation of spectra from live, unfixed cells.
  • simsMVA: A tool for multivariate analysis of ToF-SIMS datasets
    • Abstract: Publication date: Available online 5 October 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Gustavo F. Trindade, Marie-Laure Abel, John F. Watts. Imaging mass spectrometry datasets grow larger and more complex every year, with unsupervised multivariate analysis (MVA) becoming a routine procedure for most researchers. Moreover, the increasing interdisciplinarity of the field demands the development of software for rapid and accessible MVA for researchers of various backgrounds. This paper presents MATLAB-based software for performing principal component analysis (PCA), non-negative matrix factorisation (NMF) and k-means clustering of large analytical chemistry datasets, with a particular focus on time-of-flight secondary ion mass spectrometry (ToF-SIMS). All five modes of operation (spectra, profiles, images, 3D and multi) are described with a few examples of typical applications at The Surface Analysis Laboratory of the University of Surrey: point spectra analysis of wood growth regions, depth profiling of a metallic multi-layered sample, imaging of an organic coating on a metal substrate and 3D characterisation of an automotive grade polypropylene.
  • Accounting for spatial dependency in multivariate spectroscopic data
    • Abstract: Publication date: Available online 27 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): M. Prakash, J.K. Sarin, L. Rieppo, I.O. Afara, J. Töyräs. We examine a hybrid multivariate regression technique to account for the spatial dependency in spectroscopic data due to adjacent measurement locations in the same joint by combining dimension reduction methods and linear mixed effects (LME) modeling. Spatial correlation is a common limitation (assumption of independence) faced in diagnostic applications involving adjacent measurement locations, such as mapping of tissue properties, and can impede tissue evaluations. Near-infrared spectra were collected from equine joints (n = 5) and corresponding biomechanical (n = 202), compositional (n = 530), and structural (n = 530) properties of cartilage tissue were measured. Subsequently, hybrid regression models for estimating tissue properties from the spectral data were developed in combination with principal component analysis (PCA-LME) scores and least absolute shrinkage and selection operator (LASSO-LME). Performance comparison of PCA-LME and principal component regression, and LASSO-LME and LASSO regression was conducted to evaluate the effects of spatial dependency. A systematic improvement in calibration models’ correlation coefficients (ρ for PCA-LME: 4–52%; ρ for LASSO-LME: 1–10%) and a decrease in cross validation errors (error for PCA-LME: 3–10%; error for LASSO-LME: 1–4%) were observed when accounting for spatial dependency. Our results indicate that accounting for spatial dependency using an LME-based approach leads to more accurate prediction models.
  • Controlling two-dimensional false discovery rates by combining two
           univariate multiple testing results with an application to mass spectral
    • Abstract: Publication date: Available online 26 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Youngrae Kim, Johan Lim, Jong Soo Lee, Jaesik Jeong. Mass spectral data exhibit a small number of signals (true peaks) among many noisy observations (signals or true peaks) in a high-dimensional space. This unique aspect of mass spectral data necessitates solving the problem of testing for many composite null hypotheses simultaneously. In this study, we develop a new procedure to control the false discovery rate of simultaneous multiple hypothesis tests, consisting of many “bivariate” composite null hypotheses. Two types of composite null hypothesis, the intersection-type and the union-type null, are considered separately. The proposed procedure comprises two stages. In the first stage, we simultaneously test each “univariate” simple hypothesis of “bivariate” composite hypotheses at the pre-decided false discovery rate. In the second stage, we combine the marginal univariate test results so that the two-dimensional false discovery rate for the “bivariate” composite null hypotheses is less than the desired significance level α. The new procedure provides a closed-form decision rule on the bivariate test statistics, unlike existing methods for controlling the two-dimensional local false discovery rate (2d-fdr). We numerically compare the performance of our procedure to existing 2d-fdr control methods in different settings. We then apply the procedure to the problem of differentiating the origins of herbal medicine using gas chromatography-mass spectrometry.
  • Current multiblock methods: Competition or complementarity? A
           comparative study in a unified framework
    • Abstract: Publication date: Available online 24 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Stéphanie Bougeard, Ndèye Niang, Thomas Verron, Xavier Bry. We address the issue of exploring (with respect to multiple regression models or to simple pairwise links) the relationships between blocks of variables measured on the same observations. Multiblock methods have been developed over the past twenty years, and are now used more and more frequently, especially for high-dimensional data. We focus on three current methods: regularized Generalized Structured Component Analysis (rGSCA), regularized Generalized Canonical Correlation Analysis (rGCCA) and THEmatic Model Exploration (THEME). These methods are rewritten in a common formal setting and compared with respect to two issues: how they explore block-relationships, and how they separate information from noise. Multiblock methods are applied to simulated data and to real data pertaining to the chemistry framework to illustrate their differences and complementarities.
  • A novel variable selection method based on stability and variable
           permutation for multivariate calibration
    • Abstract: Publication date: Available online 22 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Junming Chen, Chunhua Yang, Hongqiu Zhu, Yonggang Li, Weihua Gui. A novel variable selection method named stability and variable permutation (SVP) is proposed based on the evolutionary principles of ‘intraspecific competition’ and ‘survival of the fittest’. In SVP, variables are selected in an iterative and competitive manner. In each iteration, Monte Carlo sampling (MCS) runs in sample space and variable space for stability and variable permutation, respectively. Variables are divided into elite variables and normal variables according to stability by adaptive reweighted sampling (ARS). Then, combined with variable permutation analysis, an exponentially decreasing function (EDF) is employed to select important variables from the normal variables. Elite variables and important variables form a new variable subset for the next iteration. After the selection iterations are terminated, the root mean square error of cross validation (RMSECV) of each subset is calculated. The variable subset with the minimal RMSECV is considered the optimal variable subset. The performance of SVP is evaluated on three near-infrared (NIR) datasets: a corn oil dataset, a diesel fuel total aromatics dataset and a wheat protein dataset. Compared with moving window PLS (MWPLS), Monte Carlo uninformative variable elimination (MCUVE), competitive adaptive reweighted sampling (CARS), stability competitive adaptive reweighted sampling (SCARS) and variable permutation population analysis (VPPA), SVP shows better prediction results.
  • The common quantitative model for the determination of multiple near
           infrared spectrometers
    • Abstract: Publication date: Available online 21 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Jin Jin Liu, Bao Qiong Li, Hong Lin Zhai, Shao Hua Lu, Sha Sha Li. Calibration model transfer is a practical application. In this contribution, we propose a common model instead of model transfer to avoid complex corrections of the obtained calibration model. The important chemical features of target components are extracted by the Tchebichef image moment method from near infrared (NIR) three-dimensional spectra constructed from the measurements of different spectrometers. The common models are then established with stepwise regression and used for determination on a single spectrometer. The proposed approach was applied to the quantitative analysis of target components in mixtures using two datasets, pharmaceutical samples (measured on two NIR spectrometers of the same type) and corn samples (measured on three NIR spectrometers of different types), and satisfactory results were obtained. Furthermore, the multi-way partial least squares method was carried out and compared with the proposed approach. This study indicates that our approach is effective, accurate and reliable, and that the common quantitative models can reveal the chemical feature information of target components in samples measured on spectrometers of either the same or different types, which is convenient for the application of NIR spectroscopy. Graphical abstract: We introduced the TM approach to build the common quantitative models based on the constructed NIR 3D spectra, which can quantify the target compounds in complex samples measured on single spectrometers.
  • Intelligent computational method for discrimination of anticancer peptides
           by incorporating sequential and evolutionary profiles information
    • Abstract: Publication date: Available online 20 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Muhammad Kabir, Muhammad Arif, Saeed Ahmad, Zakir Ali, Zar Nawab Khan Swati, Dong-Jun Yu. Cancer is one of the prominent threats to human life worldwide. Traditional therapeutic mechanisms like chemotherapy, radiation and surgical operations are exploited for cancer treatment. However, these clinical treatments are unfavorable, challenging and have severe impacts on the human body. Recently, anticancer peptides (ACPs) have emerged as influential anticancer drug agents due to their nontoxic characteristics and safe cellular uptake of therapeutic drugs. In this regard, much progress has been made in developing computational methods for ACP prediction to accelerate their use against cancer. However, challenges remain in terms of discriminative feature representation, the typical class imbalance issue and prediction performance. In this study, we report a novel predictor, TargetACP, which integrates sequential and evolutionary-profile information solely from primary protein sequences. The synthetic minority oversampling technique is utilized to cope with the imbalance between minority (ACP) and majority (non-ACP) samples. Finally, a support vector machine (SVM) is employed as the learning hypothesis. Experimental results demonstrated that our predictor achieved an accuracy of 98.78% on the benchmark dataset using the jackknife cross-validation test. The generalization capability of the proposed method was evaluated on an independent dataset, which yielded an accuracy of 94.66%. The empirical outcomes reveal that our model outperformed existing methods on the same datasets. Furthermore, it is anticipated that the TargetACP model will provide deep insights to the pharmaceutical industry for designing new anticancer drugs and to the research community for developing new ideas in the areas of bioinformatics, proteomics and computational biology.
  • Decision table in Rough Set as a new chemometric approach for synthesis
           optimization: Mn-doped ZnS quantum dots as the example
    • Abstract: Publication date: Available online 20 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Shiwei Yang, Wanli Fan, Tingting Xiong, Dongmei Wang, Keyun Qin, Meikun Fan, Zhengjun Gong. Decision table (DT) in Rough Set was first proposed as a new chemometric approach for the optimization of synthesis strategy in this work. The fluorescence (FL) performance optimization of Mn-doped ZnS quantum dots was utilized as an example to illustrate the analysis procedure and verify the rationality of DT. Five condition attributes (namely synthesis conditions) were first reduced to three through the analysis of FL intensity (decision attribute) and attribute reduction of the first DT. Two core attributes were confirmed: the volume ratio of ZnSO4 and Na2S solution, and the volume of MnCl2 solution. It was then found from attribute reduction of the second DT that the latter was the most important condition attribute, and the optimal synthesis strategy was obtained. The results were then verified by the use of single factor analysis and orthogonal experiment. Finally, it is concluded that the proposed method has the advantages of attribute reduction, core attribute determination and no requirement on evenly distributed conditions, despite the need for certain mathematical knowledge. More importantly, it presents superiority for synthesis optimization when handling a larger number of condition attributes due to attribute reduction. DT might be a new chemometric approach for the optimization of materials synthesis with multiple factors.
  • Tutorial and spreadsheets for Bayesian evaluation of risks of false
           decisions on conformity of a multicomponent material or object due to
           measurement uncertainty
    • Abstract: Publication date: Available online 15 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Ricardo J.N.B. da Silva, Francesca R. Pennecchi, D. Brynn Hibbert, Ilya Kuselman. A tutorial and a user-friendly program for evaluating risks of false decisions in conformity assessment of a multicomponent material or object due to measurement uncertainty, based on a Bayesian approach, are presented. The developed program consists of two separate MS-Excel spreadsheets. It allows calculation of the consumer's and producer's risks concerning each component of the material whose concentration was tested (‘particular risks’) as well as concerning the material as a whole (‘total risks’). According to the Bayesian framework, probability density functions of the actual/‘true’ component concentrations (prior pdfs) and likelihood functions (likelihoods) of the corresponding test results are used to model the knowledge about the material or object. Both cases of independent and correlated variables (the actual concentrations and the test results) are treated in the present work. Spreadsheets provide an estimate of the joint posterior pdf for the actual component concentrations as the normalized product of the multivariate prior pdf and the likelihood, starting from normal or log-normal prior pdfs and normal likelihoods, using Markov Chain Monte Carlo (MCMC) simulations by the Metropolis-Hastings algorithm. The principles of Bayesian inference and MCMC are described for users with basic knowledge in statistics, necessary for correct formulation of a task and interpretation of the calculation results. The spreadsheet program was validated by comparison of the obtained results with analytical results calculated in the R programming environment. The developed program allows estimation of risks greater than 0.003% with standard deviations of such estimates spreading from 0.001% to 1.5%, depending on the risk value. Such estimation characteristics are satisfactory, taking into account known variability in measurement uncertainty associated with the test results of multicomponent materials.
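The core computation described in the abstract above, sampling a posterior that is proportional to a normal prior times a normal likelihood with the Metropolis-Hastings algorithm, can be sketched for a single component as follows. All numbers (prior mean and standard deviation, test result, measurement uncertainty, upper limit) and all function names are hypothetical, and the spreadsheets' multivariate treatment and risk definitions are not reproduced here.

```python
import math
import random
from statistics import NormalDist


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def metropolis_hastings(log_post, start, step, n_samples, burn_in, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    current = start
    current_lp = log_post(current)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            samples.append(current)
    return samples


# Hypothetical single-component example (all values are assumptions).
PRIOR_MEAN, PRIOR_SD = 10.0, 1.0   # prior pdf of the actual concentration
TEST_RESULT, MEAS_SD = 12.0, 1.0   # measured value and its standard uncertainty
UPPER_LIMIT = 12.0                 # hypothetical upper specification limit


def log_posterior(c):
    # Unnormalized log posterior: log prior + log likelihood of the test result.
    return log_normal_pdf(c, PRIOR_MEAN, PRIOR_SD) + log_normal_pdf(TEST_RESULT, c, MEAS_SD)


rng = random.Random(0)
samples = metropolis_hastings(log_posterior, start=PRIOR_MEAN, step=1.0,
                              n_samples=50_000, burn_in=1_000, rng=rng)
mcmc_mean = sum(samples) / len(samples)
# Posterior probability that the actual concentration exceeds the limit.
mcmc_prob_above = sum(s > UPPER_LIMIT for s in samples) / len(samples)

# Analytical normal-normal posterior, used only as a check.
post_var = 1.0 / (1.0 / PRIOR_SD ** 2 + 1.0 / MEAS_SD ** 2)
post_mean = post_var * (PRIOR_MEAN / PRIOR_SD ** 2 + TEST_RESULT / MEAS_SD ** 2)
exact_prob_above = 1.0 - NormalDist(post_mean, math.sqrt(post_var)).cdf(UPPER_LIMIT)
```

For this conjugate case the closed-form posterior makes the sampler unnecessary; it is used here only because it gives an exact answer to compare the MCMC estimate against, in the same spirit as the validation against analytical results mentioned in the abstract.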
  • Professor Yi-Zeng Liang; great global scientist with strong enthusiastic
           and friendship
    • Abstract: Publication date: 15 November 2018. Source: Chemometrics and Intelligent Laboratory Systems, Volume 182. Author(s): Yukihiro Ozaki
  • A combination strategy of random forest and back propagation network for
           variable selection in spectral calibration
    • Abstract: Publication date: Available online 11 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Huazhou Chen, Xiaoke Liu, Zhen Jia, Zhenyao Liu, Kai Shi, Ken Cai. Random forest (RF) and neural networks have received significant interest for statistical data analysis as a result of their good predictive performance and attractive analytical properties. When developing an RF regression model for spectral analysis, some informative wavelengths are supposed to be selected so as to reduce dimension effectively and improve interpretability, whereas a neural network has the merit of restoring high signals in data. A chemometric strategy is proposed in this paper, implemented through the combined use of the RF algorithm and a back propagation (BP) network. The RF-selected informative wavelengths were further refined by a moderate 3-layer BP network, where the number of hidden nodes was tunable and finally determined by searching for the minimum output error. The BP network was trained with the combined running of RF to generate a new comprehensive variable, so that a renewed informative-plus-net variable group could be produced. This renewed group of variables (or this selected group of variables) was used in a multiple linear regression model to predict the spectral analytical ability in quantitatively determining the content of the target analyte. The application case was based on the Fourier transform near infrared dataset of soil samples, aiming to chemometrically determine the content of the nutritional organic carbon. The prediction results indicated that the proposed strategy of combining RF and a BP network can improve prediction accuracy and enhance model interpretability in comparison with the general RF method and the conventional benchmark partial least squares regression. The methodology presented here is of practical significance and has wide application in rapid nutrition determination in the development of precise agriculture.
  • Illustration of merits of semi-supervised learning in regression analysis
    • Abstract: Publication date: 15 November 2018. Source: Chemometrics and Intelligent Laboratory Systems, Volume 182. Author(s): Hiromasa Kaneko. Semi-supervised learning (SSL) is a method for learning the relationship between X and y, and the essential structure of the corresponding dataset, using both labeled and unlabeled data. Among SSL methods, this paper focuses on an approach that uses a combination of labeled and unlabeled samples to reduce the dimension and then performs regression analysis using the labeled samples in the low-dimensional space. While various SSL methods for regression have been developed, there has been insufficient discussion as to why SSL is effective in regression analysis. Therefore, in this study, the merits of SSL in regression analysis are discussed in terms of the stability or the robustness and applicability domains of regression models and the prior distribution of X-variables. The advantages of SSL methods over fully supervised methods in regression are demonstrated using data from numerical simulations, quantitative structure–activity relationships and quantitative structure–property relationships.
  • A computational approach to partial least squares model inversion in the
           framework of the process analytical technology and quality by design
    • Abstract: Publication date: Available online 1 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): S. Ruiz, M.C. Ortiz, L.A. Sarabia, M.S. Sánchez. In the context of the paradigms founding the Quality by Design and Process Analytical Technology initiatives, the work herein presents a computational approach to support the decision-making process, in particular, about the feasibility of a product defined by some a priori given quality characteristics. The approach is based on the computation of the Pareto-optimal front when simultaneously minimizing the expected differences between the predicted and the desired characteristics. Thus, the feasibility is tackled as an optimization problem with the novelty of doing so simultaneously for all the characteristics, preserving the correlation structure, but by handling each individual characteristic separately. With data from a low-density polyethylene production process, with fourteen process variables and five measured characteristics of the final polyethylene, solutions are found to define the Design Space for targeted quality characteristics on the product, without the need to explicitly invert the PLS (Partial Least Squares) prediction model fitted to the process.
  • Feature selection using particle swarm optimization-based logistic
           regression model
    • Abstract: Publication date: Available online 1 September 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Omar Saber Qasim, Zakariya Yahya Algamal. In any classification problem, the dataset typically has a large number of features. However, not all features are necessary to obtain good classification performance, because some of them are irrelevant or redundant. Therefore, classifiers with fewer features but better classification accuracy are favored for ease of interpretation. In this work, a particle swarm optimization algorithm combined with a logistic regression model is proposed. Additionally, the Bayesian information criterion (BIC) is proposed as a fitness function. The performance of different fitness functions is investigated and compared with BIC. The proposed method is evaluated on a large number of datasets of different types. Experimental results demonstrate the usefulness of the proposed method in obtaining significantly improved classification performance with few features. Further, the results show that the proposed method performs competitively compared with other existing fitness functions.
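As a rough illustration of using BIC as a fitness function, the sketch below scores a fitted logistic regression model from its predicted probabilities, using the standard definition BIC = k ln(n) - 2 ln(L). How the paper wires this score into the particle swarm update is not stated in the abstract, so the function names and the toy data here are assumptions.

```python
import math
from typing import Sequence


def log_likelihood(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Bernoulli log-likelihood of binary labels under predicted probabilities."""
    eps = 1e-12  # guard against log(0)
    return sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for y, p in zip(y_true, p_pred)
    )


def bic(y_true: Sequence[int], p_pred: Sequence[float], n_params: int) -> float:
    """BIC = k * ln(n) - 2 * ln(L); lower values indicate a better trade-off
    between fit and number of parameters."""
    n = len(y_true)
    return n_params * math.log(n) - 2.0 * log_likelihood(y_true, p_pred)


# Hypothetical labels and fitted probabilities for one feature subset.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.2, 0.8, 0.7, 0.3, 0.1, 0.6, 0.4]

# With identical fit, the subset using fewer parameters gets the lower BIC.
bic_small = bic(labels, probs, n_params=3)
bic_large = bic(labels, probs, n_params=10)
```

In a feature selection loop, each particle would encode a feature subset, a logistic regression model would be fitted on that subset, and a score like this one would be minimized.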
  • Descriptor selection evaluation of binary gravitational search algorithm
           in quantitative structure-activity relationship studies of benzyl phenyl
           ether diamidine's antiprotozoal activity and Chalcone's anticancer potency
    • Abstract: Publication date: Available online 30 August 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Tayebeh Baghgoli, Mehdi Mousavi, Behnam Mohseni Bababdani. In this work, the effect of various gravitational constant functions, G(t), on the performance of the newly proposed binary gravitational search algorithm (BGSA) as a feature selection method for chemical systems was evaluated. To this end, linear, exponential, logarithmic and square root functions were studied. These functions were implemented sequentially in the BGSA computer code, and their performance was evaluated in a quantitative structure-activity relationship (QSAR) study for selection of the most informative descriptors. In the QSAR study, sixty cationic benzyl phenyl ether diamidine derivatives, which had been synthesized and evaluated against acute infection of Trypanosoma brucei rhodesiense (T.b. rhodesiense), were examined. The number and kind of descriptors selected by the BGSA were highly dependent on the applied G(t) function. The results of internal and external validation tests indicate that the exponential function was superior to the other gravitational constant functions for use in the BGSA. A general model was established using seven descriptors for the ten training and validation sets. Regardless of subsetting, the selected descriptors and generated model can successfully describe the experimental variation of antiprotozoal activity of benzyl phenyl ether diamidine derivatives. In addition, in another QSAR study, the anticancer potency of a series of 87 Chalcone derivatives was satisfactorily modeled using the BGSA-BRANN method. Comparison of BGSA results with those obtained by a genetic algorithm (GA) indicates the superiority of the BGSA.
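The abstract names four families of G(t) but does not give their formulas. The sketch below shows one plausible decreasing schedule of each kind, purely for illustration: every functional form, the initial value `g0`, and the decay rate `alpha` are assumptions, not the definitions used in the paper.

```python
import math
from typing import Callable, Dict


def g_linear(t: int, t_max: int, g0: float = 100.0) -> float:
    """Linear decay from g0 to 0 over the run."""
    return g0 * (1.0 - t / t_max)


def g_exponential(t: int, t_max: int, g0: float = 100.0, alpha: float = 20.0) -> float:
    """Exponential decay; alpha (assumed) controls how fast G shrinks."""
    return g0 * math.exp(-alpha * t / t_max)


def g_logarithmic(t: int, t_max: int, g0: float = 100.0) -> float:
    """Logarithmic decay: fast early drop, slow late drop."""
    return g0 * (1.0 - math.log(1.0 + t) / math.log(1.0 + t_max))


def g_sqrt(t: int, t_max: int, g0: float = 100.0) -> float:
    """Square-root decay from g0 to 0."""
    return g0 * (1.0 - math.sqrt(t / t_max))


SCHEDULES: Dict[str, Callable[..., float]] = {
    "linear": g_linear,
    "exponential": g_exponential,
    "logarithmic": g_logarithmic,
    "sqrt": g_sqrt,
}

# G at the start, midpoint and end of a 100-iteration run for each schedule.
table = {name: [f(t, 100) for t in (0, 50, 100)] for name, f in SCHEDULES.items()}
```

Swapping the schedule changes how quickly the attraction between candidate solutions weakens, which is one way the choice of G(t) could influence which descriptors end up selected.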
  • DBPPred-PDSD: Machine Learning Approach for Prediction of DNA-binding
           Proteins using Discrete Wavelet Transform and Optimized Integrated
           Features Space
    • Abstract: Publication date: Available online 23 August 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Farman Ali, Muhammad Kabir, Muhammad Arif, Zar Nawab Khan Swati, Zaheer Ullah Khan, Matee Ullah, Dong-Jun Yu. DNA-binding proteins play a crucial role in various biological processes such as regulation of DNA modification, repair, replication, and transcription. These proteins widely participate in the production of drugs, antibiotics, and steroids. Many computational approaches have been developed to identify DNA-binding proteins, but some methods are time-consuming and expensive, while others are laborious. It remains a challenging task for researchers to develop computational methods that identify DNA-binding proteins with high precision. In our work, we developed a new computational approach named DBPPred-PDSD, which has more promising predictive power for DNA-binding proteins. We employed two datasets and extracted features via Split Amino Acid Composition (SAAC) and Position Specific Scoring Matrix (PSSM). Further, we applied the Discrete Wavelet Transform (DWT) to the PSSM to extract dominant features. From these feature spaces, optimal features are selected by Maximum Relevance and Minimum Redundancy (mRMR) and fused. To obtain highly informative features, we used Support Vector Machine-Recursive Feature Elimination (SVM-RFE) and passed the result to two well-known classifiers, namely Support Vector Machine (SVM) and Random Forest (RF). Our model with the SVM classifier achieved a higher success rate than other existing methods in the literature on three tests, i.e. jackknife cross-validation, 10-fold cross-validation and independent tests.
  • Collaborative representation based classifier with partial least squares
           regression for the classification of spectral data
    • Abstract: Publication date: Available online 21 August 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Weiran Song, Hui Wang, Paul Maguire, Omar Nibouche. The need to classify high-dimensional spectral data is an increasingly common occurrence in rapid and non-destructive detection of object features and chemical species using spectroscopy. Partial least squares discriminant analysis (PLS-DA) is an effective method for spectral data classification, which is based on a multivariate regression model. Although powerful, PLS-DA suffers from performance degradation under complex conditions such as nonlinearity, imbalance and multiclass, which are common in real applications. Collaborative representation-based classifier (CRC) is a new machine learning algorithm which represents a query by a linear combination of training samples and classifies the query based on the representation. It offers the possibility of classifying even under nonlinearity, imbalance and multiclass conditions. In this paper, we present a novel method for spectral data classification, namely CRC-WPLS, which reaps the benefits of both PLS regression and CRC. This method searches for a weighted, linear combination of all training samples to represent the query by using PLS regression, and then assigns the query to the class which yields the least approximation error. CRC-WPLS is compared to PLS-DA, kernel PLS-DA and representation-based classifiers on fourteen general machine learning datasets and three spectral datasets. Experimental results show the proposed method can outperform 5 baseline methods in most cases, and achieve a high classification accuracy (> 92%) for low grade spectra obtained from portable instrumentation.
  • Fault detection based on time series modeling and multivariate statistical
           process control
    • Abstract: Publication date: Available online 11 August 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): A. Sánchez-Fernández, F.J. Baldán, G.I. Sainz-Palmero, J.M. Benítez, M.J. Fuente. Monitoring complex industrial plants is a very important task in order to ensure the management, reliability, safety and maintenance of the desired product quality. Early detection of abnormal events allows actions to be taken to prevent more serious consequences, improve the system's performance and reduce manufacturing costs. In this work, a new methodology for fault detection is introduced, based on time series models and multivariate statistical process control (MSPC). The proposal explicitly accounts for both the dynamic and the non-linear properties of the system. A dynamic feature selection is carried out to interpret the dynamic relations by characterizing the auto- and cross-correlations for every variable. After that, a time-series based model framework is used to obtain and validate the best descriptive model of the plant (either linear or non-linear). Fault detection is based on finding anomalies in the temporal residual signals obtained from the models by means of univariate and multivariate statistical process control charts. Finally, the performance of the method is validated on two benchmarks, a wastewater treatment plant and the Tennessee Eastman Plant. A comparison with other classical methods clearly demonstrates the superior performance and feasibility of the proposed monitoring scheme.
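The residual-based detection step can be sketched with a simple univariate control chart. This is a minimal Shewhart-style example, assuming residuals from fault-free operation are available to estimate the limits; the three-sigma rule, the function names, and the sample residuals are assumptions, and the multivariate charts used in the paper are not shown.

```python
import statistics
from typing import List, Sequence, Tuple


def control_limits(reference_residuals: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Lower and upper limits at mean +/- k sample standard deviations,
    estimated from residuals recorded under normal operation."""
    mu = statistics.fmean(reference_residuals)
    sigma = statistics.stdev(reference_residuals)
    return mu - k * sigma, mu + k * sigma


def flag_faults(residuals: Sequence[float], limits: Tuple[float, float]) -> List[int]:
    """Indices of residuals that fall outside the control limits."""
    lower, upper = limits
    return [i for i, r in enumerate(residuals) if r < lower or r > upper]


# Hypothetical model residuals: fault-free reference data, then new observations.
reference = [0.1, -0.2, 0.05, -0.1, 0.15, 0.0, -0.05, 0.1]
limits = control_limits(reference)
new_residuals = [0.05, -0.1, 1.2, 0.0, -0.9]
alarms = flag_faults(new_residuals, limits)
```

The residuals here would come from the difference between the plant measurements and the predictions of the fitted time series model; a large residual means the model no longer describes the plant's behavior.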
  • An improved method based on a new wavelet transform for overlapped peak
           detection on spectrum obtained by portable Raman system
    • Abstract: Publication date: Available online 8 August 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Minghui Liu, Zuoren Dong, Guofeng Xin, Yanguang Sun, Ronghui Qu. Peak detection is a particularly important pre-processing step in chemical identification using Raman spectra. At present, most peak detection methods have limited applicability when there are overlapping peaks, especially in spectra measured by portable spectrometers with low resolution. In this paper, an improved method is proposed based on applying continuous wavelet transform (CWT) peak detection with a new wavelet to the deconvolved Raman spectrum. The new wavelet has a smaller linewidth and is more similar to the intrinsic Lorentz line profile of the Raman spectrum. It therefore has several advantages with regard to the detection of overlapping peaks. The proposed method was evaluated using the Raman spectra of solid amino acid mixtures, and the results show that it is better at detecting overlapping peaks than the two other investigated wavelets. The receiver operating characteristic curves show that this method can detect more true peaks while maintaining a low false discovery rate. Moreover, the maximum true positive rate is the largest for the new approach, which indicates better performance for overlapping peak detection.
  • Modern practical convolutional neural networks for multivariate
           regression: Applications to NIR calibration
    • Abstract: Publication date: Available online 20 July 2018. Source: Chemometrics and Intelligent Laboratory Systems. Author(s): Chenhao Cui, Tom Fearn. In this study, we investigate the use of convolutional neural networks (CNN) for near infrared (NIR) calibration. We propose a unified CNN structure that can be used for general multivariate regression purposes. The CNN method and the partial least squares regression (PLSR) method were compared on three different NIR datasets of spectra and lab reference values. The datasets are from different sources and contain 6998, 1000 and 415 training samples and 618, 597 and 108 validation samples, respectively. Results indicated that, compared to the PLSR models, the CNN models are more accurate and less noisy. The convolutional layer in the CNN model can automatically find a suitable spectral preprocessing filter for the dataset, which saves significant effort in training the model.