  Subjects -> COMPUTER SCIENCE (Total: 1985 journals)
    - ANIMATION AND SIMULATION (29 journals)
    - ARTIFICIAL INTELLIGENCE (98 journals)
    - AUTOMATION AND ROBOTICS (98 journals)
    - CLOUD COMPUTING AND NETWORKS (63 journals)
    - COMPUTER ARCHITECTURE (9 journals)
    - COMPUTER ENGINEERING (9 journals)
    - COMPUTER GAMES (16 journals)
    - COMPUTER PROGRAMMING (23 journals)
    - COMPUTER SCIENCE (1153 journals)
    - COMPUTER SECURITY (45 journals)
    - DATA BASE MANAGEMENT (13 journals)
    - DATA MINING (32 journals)
    - E-BUSINESS (22 journals)
    - E-LEARNING (27 journals)
    - ELECTRONIC DATA PROCESSING (21 journals)
    - IMAGE AND VIDEO PROCESSING (40 journals)
    - INFORMATION SYSTEMS (104 journals)
    - INTERNET (92 journals)
    - SOCIAL WEB (50 journals)
    - SOFTWARE (33 journals)
    - THEORY OF COMPUTING (8 journals)

COMPUTER SCIENCE (1153 journals)

Showing 1 - 200 of 872 Journals sorted alphabetically
3D Printing and Additive Manufacturing     Full-text available via subscription   (Followers: 12)
Abakós     Open Access   (Followers: 3)
Academy of Information and Management Sciences Journal     Full-text available via subscription   (Followers: 67)
ACM Computing Surveys     Hybrid Journal   (Followers: 23)
ACM Journal on Computing and Cultural Heritage     Hybrid Journal   (Followers: 8)
ACM Journal on Emerging Technologies in Computing Systems     Hybrid Journal   (Followers: 13)
ACM Transactions on Accessible Computing (TACCESS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Algorithms (TALG)     Hybrid Journal   (Followers: 16)
ACM Transactions on Applied Perception (TAP)     Hybrid Journal   (Followers: 6)
ACM Transactions on Architecture and Code Optimization (TACO)     Hybrid Journal   (Followers: 9)
ACM Transactions on Autonomous and Adaptive Systems (TAAS)     Hybrid Journal   (Followers: 7)
ACM Transactions on Computation Theory (TOCT)     Hybrid Journal   (Followers: 11)
ACM Transactions on Computational Logic (TOCL)     Hybrid Journal   (Followers: 4)
ACM Transactions on Computer Systems (TOCS)     Hybrid Journal   (Followers: 18)
ACM Transactions on Computer-Human Interaction     Hybrid Journal   (Followers: 12)
ACM Transactions on Computing Education (TOCE)     Hybrid Journal   (Followers: 3)
ACM Transactions on Design Automation of Electronic Systems (TODAES)     Hybrid Journal   (Followers: 1)
ACM Transactions on Economics and Computation     Hybrid Journal  
ACM Transactions on Embedded Computing Systems (TECS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Information Systems (TOIS)     Hybrid Journal   (Followers: 20)
ACM Transactions on Intelligent Systems and Technology (TIST)     Hybrid Journal   (Followers: 9)
ACM Transactions on Interactive Intelligent Systems (TiiS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)     Hybrid Journal   (Followers: 10)
ACM Transactions on Reconfigurable Technology and Systems (TRETS)     Hybrid Journal   (Followers: 7)
ACM Transactions on Sensor Networks (TOSN)     Hybrid Journal   (Followers: 8)
ACM Transactions on Speech and Language Processing (TSLP)     Hybrid Journal   (Followers: 11)
ACM Transactions on Storage     Hybrid Journal  
ACS Applied Materials & Interfaces     Full-text available via subscription   (Followers: 21)
Acta Automatica Sinica     Full-text available via subscription   (Followers: 3)
Acta Universitatis Cibiniensis. Technical Series     Open Access  
Ad Hoc Networks     Hybrid Journal   (Followers: 11)
Adaptive Behavior     Hybrid Journal   (Followers: 11)
Advanced Engineering Materials     Hybrid Journal   (Followers: 26)
Advanced Science Letters     Full-text available via subscription   (Followers: 7)
Advances in Adaptive Data Analysis     Hybrid Journal   (Followers: 8)
Advances in Artificial Intelligence     Open Access   (Followers: 15)
Advances in Artificial Neural Systems     Open Access   (Followers: 4)
Advances in Calculus of Variations     Hybrid Journal   (Followers: 2)
Advances in Catalysis     Full-text available via subscription   (Followers: 5)
Advances in Computational Mathematics     Hybrid Journal   (Followers: 15)
Advances in Computer Science : an International Journal     Open Access   (Followers: 13)
Advances in Computing     Open Access   (Followers: 3)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 53)
Advances in Engineering Software     Hybrid Journal   (Followers: 25)
Advances in Geosciences (ADGEO)     Open Access   (Followers: 10)
Advances in Human Factors/Ergonomics     Full-text available via subscription   (Followers: 25)
Advances in Human-Computer Interaction     Open Access   (Followers: 19)
Advances in Materials Sciences     Open Access   (Followers: 16)
Advances in Operations Research     Open Access   (Followers: 11)
Advances in Parallel Computing     Full-text available via subscription   (Followers: 7)
Advances in Porous Media     Full-text available via subscription   (Followers: 4)
Advances in Remote Sensing     Open Access   (Followers: 37)
Advances in Science and Research (ASR)     Open Access   (Followers: 6)
Advances in Technology Innovation     Open Access  
AEU - International Journal of Electronics and Communications     Hybrid Journal   (Followers: 8)
African Journal of Information and Communication     Open Access   (Followers: 6)
African Journal of Mathematics and Computer Science Research     Open Access   (Followers: 4)
Air, Soil & Water Research     Open Access   (Followers: 7)
AIS Transactions on Human-Computer Interaction     Open Access   (Followers: 6)
Algebras and Representation Theory     Hybrid Journal   (Followers: 1)
Algorithms     Open Access   (Followers: 10)
American Journal of Computational and Applied Mathematics     Open Access   (Followers: 3)
American Journal of Computational Mathematics     Open Access   (Followers: 4)
American Journal of Information Systems     Open Access   (Followers: 6)
American Journal of Sensor Technology     Open Access   (Followers: 2)
Anais da Academia Brasileira de Ciências     Open Access   (Followers: 2)
Analog Integrated Circuits and Signal Processing     Hybrid Journal   (Followers: 5)
Analysis in Theory and Applications     Hybrid Journal  
Animation Practice, Process & Production     Hybrid Journal   (Followers: 5)
Annals of Combinatorics     Hybrid Journal   (Followers: 3)
Annals of Data Science     Hybrid Journal   (Followers: 8)
Annals of Mathematics and Artificial Intelligence     Hybrid Journal   (Followers: 6)
Annals of Pure and Applied Logic     Open Access   (Followers: 2)
Annals of Software Engineering     Hybrid Journal   (Followers: 12)
Annual Reviews in Control     Hybrid Journal   (Followers: 6)
Anuario Americanista Europeo     Open Access  
Applicable Algebra in Engineering, Communication and Computing     Hybrid Journal   (Followers: 2)
Applied and Computational Harmonic Analysis     Full-text available via subscription   (Followers: 2)
Applied Artificial Intelligence: An International Journal     Hybrid Journal   (Followers: 14)
Applied Categorical Structures     Hybrid Journal   (Followers: 2)
Applied Clinical Informatics     Hybrid Journal   (Followers: 1)
Applied Computational Intelligence and Soft Computing     Open Access   (Followers: 12)
Applied Computer Systems     Open Access   (Followers: 1)
Applied Informatics     Open Access  
Applied Mathematics and Computation     Hybrid Journal   (Followers: 32)
Applied Medical Informatics     Open Access   (Followers: 10)
Applied Numerical Mathematics     Hybrid Journal   (Followers: 5)
Applied Soft Computing     Hybrid Journal   (Followers: 16)
Applied Spatial Analysis and Policy     Hybrid Journal   (Followers: 4)
Architectural Theory Review     Hybrid Journal   (Followers: 3)
Archive of Applied Mechanics     Hybrid Journal   (Followers: 4)
Archive of Numerical Software     Open Access  
Archives and Museum Informatics     Hybrid Journal   (Followers: 123)
Archives of Computational Methods in Engineering     Hybrid Journal   (Followers: 4)
Artifact     Hybrid Journal   (Followers: 2)
Artificial Life     Hybrid Journal   (Followers: 5)
Asia Pacific Journal on Computational Engineering     Open Access  
Asia-Pacific Journal of Information Technology and Multimedia     Open Access   (Followers: 1)
Asian Journal of Computer Science and Information Technology     Open Access  
Asian Journal of Control     Hybrid Journal  
Assembly Automation     Hybrid Journal   (Followers: 2)
at - Automatisierungstechnik     Hybrid Journal   (Followers: 1)
Australian Educational Computing     Open Access  
Automatic Control and Computer Sciences     Hybrid Journal   (Followers: 3)
Automatic Documentation and Mathematical Linguistics     Hybrid Journal   (Followers: 5)
Automatica     Hybrid Journal   (Followers: 9)
Automation in Construction     Hybrid Journal   (Followers: 6)
Autonomous Mental Development, IEEE Transactions on     Hybrid Journal   (Followers: 8)
Basin Research     Hybrid Journal   (Followers: 4)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Bioinformatics     Hybrid Journal   (Followers: 293)
Biomedical Engineering     Hybrid Journal   (Followers: 16)
Biomedical Engineering and Computational Biology     Open Access   (Followers: 13)
Biomedical Engineering, IEEE Reviews in     Full-text available via subscription   (Followers: 17)
Biomedical Engineering, IEEE Transactions on     Hybrid Journal   (Followers: 32)
Briefings in Bioinformatics     Hybrid Journal   (Followers: 45)
British Journal of Educational Technology     Hybrid Journal   (Followers: 119)
Broadcasting, IEEE Transactions on     Hybrid Journal   (Followers: 10)
c't Magazin fuer Computertechnik     Full-text available via subscription   (Followers: 2)
CALCOLO     Hybrid Journal  
Calphad     Hybrid Journal  
Canadian Journal of Electrical and Computer Engineering     Full-text available via subscription   (Followers: 13)
Catalysis in Industry     Hybrid Journal   (Followers: 1)
CEAS Space Journal     Hybrid Journal  
Cell Communication and Signaling     Open Access   (Followers: 1)
Central European Journal of Computer Science     Hybrid Journal   (Followers: 5)
CERN IdeaSquare Journal of Experimental Innovation     Open Access  
Chaos, Solitons & Fractals     Hybrid Journal   (Followers: 3)
Chemometrics and Intelligent Laboratory Systems     Hybrid Journal   (Followers: 15)
ChemSusChem     Hybrid Journal   (Followers: 7)
China Communications     Full-text available via subscription   (Followers: 7)
Chinese Journal of Catalysis     Full-text available via subscription   (Followers: 2)
CIN Computers Informatics Nursing     Full-text available via subscription   (Followers: 12)
Circuits and Systems     Open Access   (Followers: 16)
Clean Air Journal     Full-text available via subscription   (Followers: 2)
CLEI Electronic Journal     Open Access  
Clin-Alert     Hybrid Journal   (Followers: 1)
Cluster Computing     Hybrid Journal   (Followers: 1)
Cognitive Computation     Hybrid Journal   (Followers: 4)
COMBINATORICA     Hybrid Journal  
Combustion Theory and Modelling     Hybrid Journal   (Followers: 13)
Communication Methods and Measures     Hybrid Journal   (Followers: 11)
Communication Theory     Hybrid Journal   (Followers: 19)
Communications Engineer     Hybrid Journal   (Followers: 1)
Communications in Algebra     Hybrid Journal   (Followers: 3)
Communications in Partial Differential Equations     Hybrid Journal   (Followers: 3)
Communications of the ACM     Full-text available via subscription   (Followers: 53)
Communications of the Association for Information Systems     Open Access   (Followers: 18)
COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering     Hybrid Journal   (Followers: 3)
Complex & Intelligent Systems     Open Access  
Complex Adaptive Systems Modeling     Open Access  
Complex Analysis and Operator Theory     Hybrid Journal   (Followers: 2)
Complexity     Hybrid Journal   (Followers: 6)
Complexus     Full-text available via subscription  
Composite Materials Series     Full-text available via subscription   (Followers: 9)
Computación y Sistemas     Open Access  
Computation     Open Access  
Computational and Applied Mathematics     Hybrid Journal   (Followers: 2)
Computational and Mathematical Methods in Medicine     Open Access   (Followers: 2)
Computational and Mathematical Organization Theory     Hybrid Journal   (Followers: 2)
Computational and Structural Biotechnology Journal     Open Access   (Followers: 2)
Computational and Theoretical Chemistry     Hybrid Journal   (Followers: 9)
Computational Astrophysics and Cosmology     Open Access   (Followers: 1)
Computational Biology and Chemistry     Hybrid Journal   (Followers: 12)
Computational Chemistry     Open Access   (Followers: 2)
Computational Cognitive Science     Open Access   (Followers: 1)
Computational Complexity     Hybrid Journal   (Followers: 4)
Computational Condensed Matter     Open Access  
Computational Ecology and Software     Open Access   (Followers: 8)
Computational Economics     Hybrid Journal   (Followers: 9)
Computational Geosciences     Hybrid Journal   (Followers: 13)
Computational Linguistics     Open Access   (Followers: 23)
Computational Management Science     Hybrid Journal  
Computational Mathematics and Modeling     Hybrid Journal   (Followers: 8)
Computational Mechanics     Hybrid Journal   (Followers: 4)
Computational Methods and Function Theory     Hybrid Journal  
Computational Molecular Bioscience     Open Access   (Followers: 2)
Computational Optimization and Applications     Hybrid Journal   (Followers: 7)
Computational Particle Mechanics     Hybrid Journal   (Followers: 1)
Computational Research     Open Access   (Followers: 1)
Computational Science and Discovery     Full-text available via subscription   (Followers: 2)
Computational Science and Techniques     Open Access  
Computational Statistics     Hybrid Journal   (Followers: 13)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 28)
Computer     Full-text available via subscription   (Followers: 83)
Computer Aided Surgery     Hybrid Journal   (Followers: 3)
Computer Applications in Engineering Education     Hybrid Journal   (Followers: 6)
Computer Communications     Hybrid Journal   (Followers: 10)
Computer Engineering and Applications Journal     Open Access   (Followers: 5)
Computer Journal     Hybrid Journal   (Followers: 7)
Computer Methods in Applied Mechanics and Engineering     Hybrid Journal   (Followers: 22)
Computer Methods in Biomechanics and Biomedical Engineering     Hybrid Journal   (Followers: 10)
Computer Methods in the Geosciences     Full-text available via subscription   (Followers: 1)
Computer Music Journal     Hybrid Journal   (Followers: 14)
Computer Physics Communications     Hybrid Journal   (Followers: 6)
Computer Science - Research and Development     Hybrid Journal   (Followers: 7)
Computer Science and Engineering     Open Access   (Followers: 17)
Computer Science and Information Technology     Open Access   (Followers: 11)
Computer Science Education     Hybrid Journal   (Followers: 12)
Computer Science Journal     Open Access   (Followers: 20)


Chemometrics and Intelligent Laboratory Systems
  [SJR: 0.697]   [H-I: 92]   [15 followers]

   Hybrid Journal (it can contain Open Access articles)
   ISSN (Print) 0169-7439
   Published by Elsevier  [3042 journals]
  • An entropy based approach to estimation of analytical information. A
           hypothesis
    • Abstract: Publication date: Available online 21 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Vladimir V. Apyari
      In this article, we propose a hypothesis that outlines features of the amount of analytical information obtainable during chemical analysis. The information is considered in connection with the thermodynamic properties of the system under analysis, especially with entropy as a measure of lost information. Based on this supposition, a formal mathematical relation has been derived that connects the volume of analytical information in a qualitative chemical analysis with the concentration of an analyte, temperature, molecular weight and thermodynamic properties such as the enthalpy of formation. This relation should not be considered a rigid mathematical connection but may be used as a guide in searching for correlations between the analytical information and the thermodynamic properties of a system. The first examples of such correlations are given. In many cases, the absolute correlation coefficients, R, are higher than 0.9. In our opinion, this hypothesis is promising but remains open to question and requires further confirmation in different fields of analytical chemistry.

      PubDate: 2017-07-24T06:16:15Z
       
  • Improved kernel PLS combined with wavelength variable importance for near
           infrared spectral analysis
    • Abstract: Publication date: Available online 21 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Xin Huang, Li Xia
      In this study, a new strategy called the variable importance kernel PLS (VIKPLS) method is developed for near infrared spectral analysis. The wavelength variable importance is incorporated into KPLS by modifying the primary kernel matrix so that variables in the kernel matrix are given different importance, which provides a feasible way to differentiate between informative and uninformative variables. The importance of a variable is determined by how frequently it appears in the best performing sub-models obtained by weighted bootstrap sampling. The performance of VIKPLS is investigated with three real near infrared (NIR) spectroscopic datasets. Examples are given specifically for modifying the linear kernel and the Gaussian kernel. Compared with standard kernel PLS, the results show that the proposed method can improve the training and prediction performance of KPLS by using the variable importance kernel. VIKPLS could be considered a general and promising mechanism for introducing extra information to improve the performance of KPLS.

      PubDate: 2017-07-24T06:16:15Z
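
One plausible reading of "modifying the primary kernel matrix" for the linear kernel is to weight each variable's contribution to the inner product. The sketch below assumes that form only for illustration; the function name `weighted_linear_kernel`, the weights and the toy spectra are hypothetical and are not taken from the paper.

```python
def weighted_linear_kernel(X, weights):
    """Linear kernel with per-variable weights: K[i][j] = sum_k w_k * x_ik * x_jk.

    X is a list of samples (rows), each a list of variable values.
    weights holds one non-negative importance value per variable (assumed form).
    """
    if any(len(row) != len(weights) for row in X):
        raise ValueError("each sample needs one value per weight")
    return [
        [sum(w * a * b for w, a, b in zip(weights, xi, xj)) for xj in X]
        for xi in X
    ]


# Toy data: two samples, two wavelength variables (made-up numbers).
X = [[1.0, 2.0], [3.0, 4.0]]

plain = weighted_linear_kernel(X, [1.0, 1.0])      # ordinary linear kernel
reweighted = weighted_linear_kernel(X, [1.0, 0.0])  # second variable switched off
```

With all weights equal to 1 this reduces to the ordinary linear kernel, which is a useful sanity check for any weighting scheme.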
       
  • Recognition of flooding and sinking conditions in flotation process using
           soft measurement of froth surface level and QTA
    • Abstract: Publication date: Available online 20 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Lin Zhao, Tao Peng, Yongfang Xie, Chunhua Yang, Weihua Gui
      Accurate recognition of abnormal conditions is crucial for the control and optimization of a running flotation process. In this paper, a novel method using soft measurement of the froth surface level and modified qualitative trend analysis (QTA) is proposed for recognizing flooding and sinking conditions. First, a soft measurement method based on defocus depth recovery is proposed to derive the froth surface level from the 2D froth image. Then, a modified interval-halving QTA is proposed to extract reliable and stable trend information from the froth surface level. Finally, the flooding and sinking conditions are recognized by a classification decision tree that combines the froth surface level and its trend. Offline and online experiments indicate that the proposed approach can accurately recognize flooding and sinking conditions even at an early stage.

      PubDate: 2017-07-24T06:16:15Z
       
  • Multi-class classification for steel surface defects based on machine
           learning with quantile hyper-spheres
    • Abstract: Publication date: Available online 20 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Maoxiang Chu, Jie Zhao, Xiaoping Liu, Rongfen Gong
      Focusing on steel surface defects, a novel multi-class classification method is proposed, termed machine learning with quantile hyper-spheres (QH-ML). To obtain a sparse set with boundary information from a finite defect dataset, a new quantile hyper-sphere data description (QHDD) model is proposed. This model generates a quantile hyper-sphere for each finite defect subset, and the hyper-sphere is insensitive to noise. Then, to realize incremental learning for new samples, an incremental learning with quantile hyper-spheres (QHIL) method is proposed. The advantage of QHIL is that the dataset stays the same size while new boundary information is learned incrementally. Meanwhile, a novel classifier with multiple quantile hyper-spheres (MQHC) is used to realize multi-class classification of steel surface defects. The target class of MQHC uses the QHDD model, and the negative class applies the margin maximization principle. MQHC is naturally suited to multi-class classification and has perfect classification performance. In testing experiments, the proposed QH-ML is used to classify six types of defects with incremental learning. Experimental results show that QH-ML maintains high classification accuracy and efficiency.

      PubDate: 2017-07-24T06:16:15Z
       
  • Moving-window two-dimensional correlation spectroscopy and
           perturbation-correlation moving-window two-dimensional correlation
           spectroscopy
    • Abstract: Publication date: Available online 20 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Shigeaki Morita, Yukihiro Ozaki
      Numerical computations and practical applications of moving-window two-dimensional (MW2D) correlation spectroscopy and perturbation-correlation moving-window two-dimensional (PCMW2D) correlation spectroscopy are reviewed. A series of spectra obtained under a certain external perturbation, e.g., temperature-dependent infrared spectra or time-resolved near-infrared spectra, is used for the analyses. The computation yields a two-dimensional correlation map spread over the plane between the spectral variable axis and the perturbation variable axis. These methods have therefore become promising techniques for finding informative bands in the spectral variable direction as well as informative perturbation points, such as a phase transition temperature, in the perturbation variable direction.

      PubDate: 2017-07-24T06:16:15Z
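
As a rough illustration of how a map over the spectral variable and perturbation axes can be computed, the sketch below slides a window along the perturbation axis and, inside each window, takes the sample variance of every spectral variable (the diagonal of the synchronous correlation spectrum of the mean-centered window). This is a simplified reading of the MW2D idea under that assumption, not the review's exact formulation; the function name and toy data are hypothetical.

```python
def mw2d_autocorrelation(spectra, half_width):
    """Moving-window autocorrelation map.

    spectra: list of rows, one per perturbation point (e.g. temperature),
             each row a list of intensities over the spectral variable.
    half_width: number of neighbours on each side of the window centre (>= 1).
    Returns one row per valid window centre, one column per spectral variable.
    """
    n_points = len(spectra)
    n_vars = len(spectra[0])
    result = []
    for centre in range(half_width, n_points - half_width):
        window = spectra[centre - half_width:centre + half_width + 1]
        size = len(window)
        row = []
        for v in range(n_vars):
            column = [s[v] for s in window]
            mean = sum(column) / size
            # Sample variance of the mean-centered (dynamic) spectra in the window.
            row.append(sum((c - mean) ** 2 for c in column) / (size - 1))
        result.append(row)
    return result


# Toy series: the first band is constant, the second changes fastest mid-series.
spectra = [[0.0, 1.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0], [0.0, 3.0]]
mw_map = mw2d_autocorrelation(spectra, half_width=1)
```

In this toy series the largest value for the second band falls at the middle window, where the intensity changes most quickly, which is how such a map points to a transition along the perturbation axis.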
       
  • Exploring the changes in a series of measurements – The comparison of
           the two-dimensional correlation analysis and the alteration analysis
    • Abstract: Publication date: Available online 20 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): József Simon, Attila Felinger
      Alteration analysis (ALA) has recently been introduced to expand two-dimensional correlation analysis (2DCOR) into further dimensions. 2DCOR is unable to work with 3D data arrays composed of a series of 2D measurements, but ALA has the advantage that it does not increase (multiply) the dimensions of the original data sets. Thus, it can easily be applied to more complex systems. ALA, however, works not only with 3D arrays but with matrices as well. In this study we present a comparison of the two methods. ALA has a different mathematical background, which means that it has different properties. Some drawbacks are therefore inevitable; however, ALA has a number of advantages over 2DCOR. While 2DCOR emphasises the correlation between the changes, ALA focuses on individual changes and provides more detailed information about them. Furthermore, we demonstrate that the connection between these changes can also be described with ALA. In addition, ALA simplifies the visual representation, because the information is shown on a single linear graph instead of two 2D maps (2DCOR). Therefore, ALA is not only an extension of 2DCOR but can also be an alternative to it.

      PubDate: 2017-07-24T06:16:15Z
       
  • Ordered homogeneity pursuit lasso for group variable selection with
           applications to spectroscopic data
    • Abstract: Publication date: Available online 13 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): You-Wu Lin, Nan Xiao, Li-Li Wang, Chuan-Quan Li, Qing-Song Xu
      In high-dimensional data modeling, variable selection methods have been a popular choice to improve the prediction accuracy by effectively selecting the subset of informative variables, and such methods can enhance the model interpretability with sparse representation. In this study, we propose a novel group variable selection method named ordered homogeneity pursuit lasso (OHPL) that takes the homogeneity structure in high-dimensional data into account. OHPL is particularly useful in high-dimensional datasets with strongly correlated variables. We illustrate the approach using three real-world spectroscopic datasets and compare it with four state-of-the-art variable selection methods. The benchmark results on real-world data show that the proposed method is capable of identifying a small number of influential groups and has better prediction performance than its competitors. The OHPL method and the spectroscopic datasets are implemented and included in an R package OHPL available from https://ohpl.io.

      PubDate: 2017-07-24T06:16:15Z
       
  • Reproducibility of nondominated solutions
    • Abstract: Publication date: 15 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 168
      Author(s): Nuno Costa, João Lourenço
      The impact of natural process variability and model uncertainty on the reproducibility of nondominated solutions cannot be ignored if a product or process is to perform as expected when theoretical results are implemented in production environments. To help the decision-maker make a more informed choice when selecting a nondominated solution, two metrics are used to assess the predicted variability of nondominated solutions: one (the predicted standard error) quantifies the uncertainty in the estimated value of each response, and the other (the quality of predictions) quantifies the uncertainty associated with each generated solution. Supplementary material is provided to help practitioners calculate the metric values.

      PubDate: 2017-07-12T05:38:14Z
       
  • Editorial
    • Abstract: Publication date: Available online 8 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Roma Tauler, Philip Hopke


      PubDate: 2017-07-12T05:38:14Z
       
  • On the definition of mean, variance and covariance for periodic variables
           to avoid ambiguity in chemometric and bioinformatic data evaluation
    • Abstract: Publication date: Available online 3 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Gergely Tóth, Péter Király, Dóra Judit Kiss
      In the case of periodic variables, e.g. angular and temporal ones, the choice of the periodic section in which experimental or calculated data are recorded is usually driven by convention. Unfortunately, basic statistics (mean, variance, covariance) and many multivariate statistical data evaluation methods (principal component analysis, clustering and classification methods, regressions …) provide different results for different data windows. We propose to replace this convention with a data-pattern-based method that selects the data window providing the smallest variance for the data. The use of the smallest variance can be theoretically supported by the maximum likelihood principle. We show the advantages of the minimum-variance data window on two Ramachandran plots of simulated triglycine dihedral angles, where the unambiguously calculated means and clustering results agree with common-sense expectations. The second example concerns the enhanced effectiveness of clustering a temporal distribution of the PM10 air pollutant when the proposed data window is applied.

      PubDate: 2017-07-12T05:38:14Z
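
The minimum-variance window idea in this abstract can be sketched for a single angular variable as follows. This is a minimal illustration, assuming angles in degrees, population variance, and a brute-force search over window starts placed at the data points (the representation of the data only changes when the window boundary crosses a data point). The function name and sample angles are hypothetical and do not come from the paper.

```python
def min_variance_window(angles, period=360.0):
    """Pick the periodic window [start, start + period) with the smallest variance.

    Returns (start, mean, variance), with the mean expressed inside that window.
    """
    base = [a % period for a in angles]
    best = None
    for start in sorted(set(base)):
        # Re-express every angle inside the window that begins at `start`.
        shifted = [start + ((a - start) % period) for a in base]
        mean = sum(shifted) / len(shifted)
        variance = sum((x - mean) ** 2 for x in shifted) / len(shifted)
        if best is None or variance < best[2]:
            best = (start, mean, variance)
    return best


# Four directions clustered around 0 degrees (made-up values).
angles = [350.0, 355.0, 5.0, 10.0]
naive_mean = sum(angles) / len(angles)  # conventional [0, 360) window
start, mean, variance = min_variance_window(angles)
wrapped_mean = mean % 360.0
```

In the conventional [0, 360) window the mean of these four angles is 180 degrees, the opposite direction from the cluster. In the minimum-variance window the mean wraps back to 0 degrees.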
       
  • Predicting the gas-liquid critical temperature of binary mixtures based on
           the quantitative structure property relationship
    • Abstract: Publication date: 15 August 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 167
      Author(s): Lulu Zhou, Beibei Wang, Juncheng Jiang, Yong Pan, Qingsheng Wang
      Mixtures are widely used in the chemical industry, and most chemical processes are designed based on the critical properties of mixtures. It is therefore extremely important to study these properties. In this work, a quantitative structure property relationship (QSPR) study was carried out to predict the critical temperatures of binary mixtures. Dragon software was used to calculate the molecular descriptors of the pure chemicals. Descriptors of mixtures were calculated as mole-weighted averages. A genetic algorithm (GA) was used to select the optimal subset of descriptors that contribute significantly to the critical temperature of binary mixtures. The multiple linear regression (MLR) method was used to build the QSPR models. Internal and external validation were used to check the stability and predictive capability of the obtained models. Three different external validation strategies, “points out”, “mixtures out” and “compounds out”, were used to divide the data into training and test sets. The applicability domain (AD) of the models was also discussed. The results show that the obtained models fit the experimental data well (R² of 0.922, 0.925 and 0.900, root mean square errors of 0.025, 0.029 and 0.030, and average absolute errors of 0.014, 0.021 and 0.018, respectively), with excellent internal robustness (Q²_LMO of 0.733, 0.904 and 0.888) and good predictive ability (Q²_ext of 0.888, 0.822 and 0.780). The established models offer a reasonable estimate of the critical temperature of binary mixtures and could therefore provide guidance for chemical process design involving binary mixtures.

      PubDate: 2017-07-03T07:30:41Z
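
The mole-weighted averaging step mentioned in this abstract can be illustrated with a short sketch. It assumes a binary mixture whose descriptor vector is the mole-fraction-weighted sum of the two pure-component vectors; the function name and the descriptor values are hypothetical placeholders, not Dragon output.

```python
def mixture_descriptors(x1, d1, d2):
    """Mole-weighted average of two pure-component descriptor vectors.

    x1 is the mole fraction of component 1; component 2 gets 1 - x1.
    """
    if not 0.0 <= x1 <= 1.0:
        raise ValueError("mole fraction must lie in [0, 1]")
    if len(d1) != len(d2):
        raise ValueError("descriptor vectors must have the same length")
    x2 = 1.0 - x1
    return [x1 * a + x2 * b for a, b in zip(d1, d2)]


# Made-up descriptor values for two pure chemicals.
component_1 = [4.0, 8.0]
component_2 = [0.0, 4.0]
mixed = mixture_descriptors(0.25, component_1, component_2)
```

The resulting vector is what a regression model such as MLR would take as input for that composition.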
       
  • A phase diagram for gene selection and disease classification
    • Abstract: Publication date: 15 August 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 167
      Author(s): Hong-Dong Li, Qing-Song Xu, Yi-Zeng Liang
      Identifying a small subset of genes that can distinguish disease samples from healthy controls plays an important role in evaluating disease risk and facilitating diagnosis. Existing methods often provide a single metric to assess the predictive performance of genes. Also, model-based gene importance is conditioned on the subset of genes used to build multivariate models, and is thus model/context-specific. Existing methods often do not take such context-specific effects into account. Here we present a novel gene selection approach that evaluates the predictive performance of genes using two criteria that take gene interactions into account, and projects the genes onto four different regions in a 2-dimensional plot, like a phase diagram (PHADIA) in chemistry. Using two publicly available microarray datasets, we showed that PHADIA achieves classification accuracies comparable to or better than results reported in the literature. The source codes are freely available at: www.libpls.net.

      PubDate: 2017-07-03T07:30:41Z
       
  • QSAR models for predicting the bioactivity of Polo-like Kinase 1
           inhibitors
    • Abstract: Publication date: Available online 1 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yue Kong, Aixia Yan
      As a member of the serine/threonine kinase family, Polo-like kinase 1 (PLK1) plays a key role in regulating cell cycle progression, particularly mitosis, and has emerged as an important target for cancer therapy. It is necessary and urgent to develop highly predictive in silico models of the bioactivity of PLK1 inhibitors. In our work, 16 single classifier models and one consensus Kohonen's Self-organizing Map (SOM) model were constructed to discriminate highly active PLK1 inhibitors from poorly active ones on a dataset of 601 noncongeneric PLK1 inhibitors. For the 16 single classifier models, we used four machine learning methods: Support Vector Machine (SVM), Naive Bayes (NB), C4.5 Decision Tree (C4.5 DT) and Random Forest (RF). Their MCCs ranged from 0.609 to 0.864 and their accuracies from 78.7% to 93.1% on the test set. The consensus SOM model was then built from four single classifier models to obtain a more reliable and robust model. The consensus model outperformed all the single classifier models, with an MCC of 0.872 and an accuracy of 93.6% on the test set. In addition, we combined two dataset splitting methods (random and SOM-based) with two feature selection methods to find the best combination. SVMAttributeEval combined with the SOM splitting method achieved the best model performance. Additionally, 20 good ECFP_4 features and 20 bad ECFP_4 features were found, which will help chemists to discriminate highly active PLK1 inhibitors from poorly active ones.

      PubDate: 2017-07-03T07:30:41Z
       
  • Semi-supervised fault classification based on dynamic Sparse Stacked
           auto-encoders model
    • Abstract: Publication date: Available online 27 June 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Li Jiang, Zhiqiang Ge, Zhihuan Song
      This paper proposes a hierarchical sparse artificial neural network for classifying the faults in dynamic processes based on limited labeled data. A stacked auto-encoder (SAE) is developed to extract features from different faults. Each neural network in the proposed SAE is given a sparse constraint to learn a sparse stacked auto-encoder (SSAE). Then, a dynamic time window is combined with the SSAE to build a dynamic sparse stacked auto-encoder (DSSAE). A DSSAE model-based semi-supervised fault classification scheme is then formulated to classify the dynamic faulty data. Simulation studies on the Tennessee–Eastman (TE) benchmark process evaluate the performance of the developed method and indicate that the DSSAE method performs better than both SAE and SSAE.

      PubDate: 2017-07-03T07:30:41Z
       
  • Cross-validatory framework for optimal parameter estimation of KPCA and
           KPLS models
    • Abstract: Publication date: Available online 13 June 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yujia Fu, Uwe Kruger, Zhe Li, Lei Xie, Jillian Thompson, David Rooney, Juergen Hahn, Huizhong Yang
      This article revisits recently proposed methods to determine the kernel parameter and the number of latent components for identifying kernel principal component analysis (KPCA) and kernel partial least squares (KPLS) models. A detailed analysis shows that existing work is neither optimal nor efficient in determining these important parameters and may lead to erroneous estimates. In addition to that, most methods are not designed to simultaneously estimate both parameters, i.e. they require one parameter to be predetermined. To address these practically important issues, the article introduces a cross-validatory framework to optimally determine both parameters. Application studies to a simulation example and a total of three experimental or industrial data sets confirm that the cross-validatory framework outperforms existing methods and yields optimal estimations for both parameters. In sharp contrast, existing work has the potential to substantially overestimate the number of latent components and to provide inadequate estimates for the kernel parameter.

      PubDate: 2017-06-16T04:33:55Z
       
  • Incremental model learning for spectroscopy-based food analysis
    • Abstract: Publication date: 15 August 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 167
      Author(s): Katerine Diaz-Chito, Konstantia Georgouli, Anastasios Koidis, Jesus Martinez del Rincon
      In this paper we propose the use of incremental learning for creating and improving multivariate analysis models in the field of chemometrics of spectral data. As main advantages, our proposed incremental subspace-based learning allows creating models faster, progressively improving previously created models and sharing them between laboratories and institutions without requiring transferring or disclosing individual spectra samples. In particular, our approach makes it possible to improve the generalization and adaptability of previously generated models with a few new spectral samples so that they are applicable to real-world situations. The potential of our approach is demonstrated using vegetable oil type identification based on spectroscopic data as a case study. Results show how incremental models maintain the accuracy of batch learning methodologies while reducing their computational cost and handicaps.

      PubDate: 2017-06-12T04:10:43Z
       
  • A non-equidistant wavenumber interval selection approach for classifying
           diesel/biodiesel samples
    • Abstract: Publication date: Available online 9 June 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Felipe Soares, Michel J. Anzanello, Marcelo C.A. Marcelo, Marco F. Ferrão
      In recent years, spectroscopy techniques such as Near infrared (NIR) and Fourier Transform Infrared (FTIR) have been widely adopted as analytical tools in different fields and for several purposes. NIR and FTIR data typically comprise hundreds or even thousands of highly correlated wavenumbers, a fact that can jeopardize the accuracy of several statistical techniques. In light of that, wavenumber selection emerges as an important step in prediction and classification tasks based on spectroscopy data. This paper proposes a novel framework for wavenumber selection aimed at classifying samples into proper categories, which is applied to two data sets from the petroleum sector. The method relies on two main stages: determination of intervals based on the distance between the average spectra of the classes and selection of the most suitable intervals through cross-validation. An improvement in the misclassification rate was achieved for a NIR spectra data set of diesel, decreasing that metric from 13.90% to 11.63% after the application of the proposed method while retaining 23.19% of the original wavenumbers. As for the biodiesel FTIR data set, the method yielded a misclassification rate of 1.21% while retaining 4.95% of the original variables; the misclassification rate was 4.71% when all wavenumbers were used. The proposed method also outperformed traditional approaches for wavenumber selection.

      PubDate: 2017-06-12T04:10:43Z
       
  • A new approach of GA-based type reduction of interval type-2 fuzzy model
           for nonlinear MIMO system: Application in methane oxidation process
    • Abstract: Publication date: Available online 8 June 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Shokoufe Tayyebi, Saeed Soltanali
      The aim of this paper is the application of a type-2 fuzzy model to a nonlinear MIMO system. In the suggested approach, a genetic algorithm is employed to find the optimal type reduction of the interval type-2 fuzzy model for model prediction. To verify its performance, the proposed method is used to predict the methane conversion and the formic acid selectivity of the methane oxidation process with H2O2. In this process, the input variables are temperature, pressure, time on stream, catalyst content, H2O2 concentration and stirring speed. The interval type-2 fuzzy logic model was designed with a reduced number of rules compared with the type-1 fuzzy model. The results confirm that the proposed method based on type reduction of the interval type-2 fuzzy model outperforms the other strategies, with an R2 value greater than 0.988.

      PubDate: 2017-06-12T04:10:43Z
       
  • MCR-ALS applied to the quantification of the 5-Hydroxymethylfurfural using
           UV spectra: Study of catalytic process employing experimental design
    • Abstract: Publication date: Available online 3 June 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Larissa R. Terra, Mariana N. Catrinck, Reinaldo F. Teófilo
      The goals of this work are the study of the production of 5-hydroxymethylfurfural (5-HMF) from glucose, applying niobic acid as catalyst, and the quantification of 5-HMF by selective analytical methods. High-performance liquid chromatography (HPLC) was used as the reference method. The developed alternative method proposes the use of ultraviolet (UV) spectroscopy in combination with multivariate curve resolution alternating least squares (MCR-ALS). In addition, a partial least squares (PLS) regression model was built to compare MCR-ALS and HPLC results. Regression models using MCR-ALS and PLS were built for 5-HMF in the presence of levulinic acid in the range of 2.0–16.0 mg L−1. For HPLC the range used was 10–800 mg L−1. The models were evaluated by analyzing statistical parameters of quality such as root mean square error (RMSE) and correlation coefficient (R). The calibration parameters obtained for MCR-ALS, PLS and HPLC were, respectively: RMSEC of 0.68, 0.27 and 4.92 mg L−1 and R equal to 0.988, 0.998 and 0.999. Central composite design (CCD) was used to optimize the two reaction variables, i.e., time and mass of catalyst for glucose conversion into 5-HMF. The samples were analyzed by UV-MCR-ALS, UV-PLS and HPLC. The predicted concentrations of 5-HMF obtained by MCR-ALS and PLS versus HPLC predictions were evaluated. Both correlations were equal to 0.995. According to paired t-test results at a significance level of 0.05, both model predictions were statistically equal to the values predicted by HPLC. The results show that the UV-MCR-ALS method can predict the concentration of 5-HMF in reaction mixtures with accuracy and obtain the relative concentration and pure spectra in a mixture without chromatographic separation.

      PubDate: 2017-06-07T03:50:59Z
       
  • Prediction subcellular localization of Gram-negative bacterial proteins by
           support vector machine using wavelet denoising and Chou's pseudo amino
           acid composition
    • Abstract: Publication date: Available online 1 June 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Bin Yu, Shan Li, Cheng Chen, Jiameng Xu, Wenying Qiu, Xue Wu, Ruixin Chen
      Information on the subcellular localization of Gram-negative bacterial proteins is of great significance for studying pathogenesis and for drug design and discovery for certain diseases. Protein subcellular localization is an important part of proteomics, while providing new opportunities and challenges for chemometrics. Since the prediction of protein subcellular localization can help to understand their function and the role played by their metabolic processes, a number of protein subcellular localization prediction methods have been developed in recent years. In this paper, we propose a novel method that, for the first time, combines wavelet denoising with a support vector machine to predict the subcellular localization of proteins. Firstly, the features of the protein sequence are extracted by Chou's pseudo amino acid composition (PseAAC), then the extracted feature information is denoised by a two-dimensional (2-D) wavelet. Finally, the optimal feature vectors are input to the SVM classifier to predict the subcellular location of the Gram-negative bacterial proteins. Quite promising predictions are obtained using the jackknife test and compared with other predictive methods. The results indicate that the method proposed in this paper can remarkably improve the prediction accuracy of protein subcellular localization, and it can be used to predict other attributes of proteins.

      PubDate: 2017-06-02T03:32:04Z
       
  • A variable selection method for soft sensor development through mixed
           integer quadratic programming
    • Abstract: Publication date: Available online 25 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Weiyu Jian, Lingyu Zhu, Zuhua Xu, Xi Chen
      Soft sensors are widely employed in industry to predict quality variables, which are difficult to measure online, by using secondary variables. To build an accurate soft sensor, a proper variable selection is critical. In this project, a method of selecting the optimal secondary variables for a soft sensor model is proposed. It is formulated as a nested optimization problem. In each iteration, a mixed integer quadratic programming (MIQP) is conducted with the Bayesian information criterion (BIC) to estimate the prediction error. A warm start (WS) technique is developed to speed up the convergence. The proposed method is evaluated using a number of instances from the UCI Machine Learning Repository. The computational results demonstrate that this method is well suited for finding the best variable subsets. The method is successfully applied to build soft sensors for an industrial distillation column. The results show that the proposed method can effectively select feature variables that will improve the model prediction performance and reduce the model complexity. Comparisons with other methods, including the traditional partial least square technique, are also presented.

      PubDate: 2017-05-28T06:13:36Z
       
  • Selecting local constraint for alignment of batch process data with
           dynamic time warping
    • Abstract: Publication date: Available online 25 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Max Spooner, David Kold, Murat Kulahci
      There are two key reasons for aligning batch process data. The first is to obtain same-length batches so that standard methods of analysis may be applied, whilst the second reason is to synchronise events that take place during each batch so that the same event is associated with the same observation number for every batch. Dynamic time warping has been shown to be an effective method for meeting these objectives. This is based on a dynamic programming algorithm that aligns a batch to a reference batch, by stretching and compressing its local time dimension. The resulting “warping function” may be interpreted as a progress signature of the batch which may be appended to the aligned data for further analysis. For the warping function to be a realistic reflection of the progress of a batch, it is necessary to impose some constraints on the dynamic time warping algorithm, to avoid an alignment which is too aggressive and which contains pathological warping. Previous work has focused on addressing this issue using global constraints. In this work, we investigate the use of local constraints in dynamic time warping and define criteria for evaluating the degree of time distortion and variable synchronisation obtained. A local constraint scheme is extended to include constraints not previously considered, and a novel method for selecting the optimal local constraint with respect to the two criteria is proposed. For illustration, the method is applied to real data from an industrial bacteria fermentation process.
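
      As a rough illustration of the dynamic programming idea described in this abstract, the sketch below aligns a univariate batch trajectory to a reference with textbook dynamic time warping. It assumes the basic symmetric step pattern (diagonal, vertical and horizontal steps) with no global or local constraint, so it is not the constrained scheme the authors propose; the function name `dtw_path` is hypothetical.

```python
def dtw_path(query, reference):
    """Align `query` to `reference`; return (total cost, warping path).

    The path is a list of (query index, reference index) pairs and plays
    the role of the warping function.
    """
    n, m = len(query), len(reference)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance both series
                cost[i - 1][j],      # advance the query only
                cost[i][j - 1],      # advance the reference only
            )
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path
```

      A local constraint would restrict which of the three predecessor steps are allowed at each cell (for example, by limiting consecutive horizontal or vertical moves), which is what limits pathological stretching or compression.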

      PubDate: 2017-05-28T06:13:36Z
       
  • Tchebichef-Hermite image moment method: A novel tool for chemometric
           analysis of three-dimensional spectra
    • Abstract: Publication date: Available online 22 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Bao Qiong Li, Shao Hua Lu, Xue Wang, Min Li Xu, Hong Lin Zhai
      Image moment methods have been applied to the qualitative and quantitative analyses in analytical chemistry owing to their availability. As another attractive image moment method, the Tchebichef-Hermite moment (THM) method, which originates from the Tchebichef moment method and the Hermite moment method, is proposed for the first time in this work. The performance of the THM method was tested with two instances for the quantitative analysis of multiple target compounds on the basis of HPLC-PAD and LC-MS three-dimensional (3D) spectra, respectively. Experimental results indicate that the THM method not only inherits the common advantages of these discrete orthogonal moments in dealing with some fundamental challenges (such as partially overlapped signals, uncalibrated interferences, peak shifts and baseline drifts) during the analytical process of different kinds of 3D spectra, but also has a unique superiority in information extraction ability that simplifies the determination of optimum maximal orders in moment methods. Compared with the previously used Tchebichef moment method, the THM method is much more convenient and efficient.

      PubDate: 2017-05-23T05:49:06Z
       
  • Itakura-Saito distance based autoencoder for dimensionality reduction of
           mass spectra
    • Abstract: Publication date: Available online 20 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yuji Nozaki, Takamichi Nakamoto
      Small signals may contain important information. Mass spectra of chemical compounds are usually given as sparse high-dimensional data with a large dynamic range. As peaks in the high m/z (mass to charge ratio) region of a mass spectrum contribute to sensory information, they should not be ignored during the dimensionality reduction process even if the peak is small. However, in most dimensionality reduction techniques, large peaks in a dataset are typically emphasized more than tiny peaks when Euclidean space is assessed. Autoencoders are a widely used nonlinear dimensionality reduction technique, known as one special form of artificial neural network that gains a compressed, distributed representation after learning. In this paper, we present an autoencoder which uses the IS (Itakura-Saito) distance as its cost function to achieve a high capability of approximating small target inputs in dimensionality reduction. The results of comparative experiments showed that our new autoencoder achieved higher performance in approximating small targets than autoencoders with conventional cost functions such as the mean squared error and the cross-entropy.
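
      To make the contrast with the squared error concrete, the sketch below computes the standard Itakura-Saito divergence, d(x, y) = sum over i of (x_i/y_i - log(x_i/y_i) - 1), for strictly positive vectors. Treating x as the target and y as the reconstruction is an assumption made here for illustration, and the function names are hypothetical; the paper's exact cost function may differ in detail.

```python
import math


def itakura_saito(target, approx):
    """Itakura-Saito divergence between two strictly positive sequences."""
    if len(target) != len(approx):
        raise ValueError("sequences must have the same length")
    total = 0.0
    for x, y in zip(target, approx):
        if x <= 0 or y <= 0:
            raise ValueError("Itakura-Saito divergence needs positive values")
        ratio = x / y
        total += ratio - math.log(ratio) - 1.0
    return total


def squared_error(target, approx):
    """Sum of squared differences, shown only for comparison."""
    return sum((x - y) ** 2 for x, y in zip(target, approx))
```

      Because the divergence depends only on the ratio x_i/y_i, an error of 0.01 on a peak of height 0.01 is penalised far more heavily than the same absolute error on a peak of height 100, whereas the squared error treats both cases almost identically.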

      PubDate: 2017-05-23T05:49:06Z
       
  • “Slicing” data array in quadrilinear component model: An alternative
           quadrilinear decomposition algorithm for third-order calibration method
    • Abstract: Publication date: Available online 20 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Li-Xia Xie, Hai-Long Wu, Xiao-Hua Zhang, Tong Wang, Li Zhu, Shou-Xia Xiang, Zhi Liu, Ru-Qin Yu
      A three-way data array in the trilinear component model can be expressed as sliced matrices and then decomposed into three underlying matrices by an iterative procedure. In this paper, we make an in-depth study of the quadrilinear component model, generalize the “slice” to the four-way scenario, and develop a novel quadrilinear decomposition algorithm for third-order calibration, i.e., slicing alternating quadrilinear decomposition (SAQLD). The presently developed algorithm can be considered as a generalization of ATLD to the four-way case. In the algorithm, updates of four underlying matrices are alternately iterated until convergence is reached. An operation of extracting diagonal elements is adopted, which makes SAQLD focus on extracting the quadrilinear part in data, leading to a significant decrease in the loss function and finally a high-performance computing strategy for SAQLD, i.e., fast convergence. Owing to its specific optimization approach, the proposed SAQLD algorithm recovers parameter matrices faster when compared with the existing quadrilinear decomposition algorithms. Both numerical simulations and experimental measurements demonstrate that third-order calibration based on the SAQLD algorithm allows one to obtain quantitative information regarding known constituents present in samples without worrying about other interferents. Moreover, quantitative results supplied by the SAQLD algorithm are still satisfying when the number of components used in calculation is excessive. Such a feature is very useful in quantitative chemical analysis since it is not easy to accurately determine the appropriate number of components due to the complexity of chemical substrates.

      PubDate: 2017-05-23T05:49:06Z
       
  • DD-SIMCA — A MATLAB GUI tool for data driven SIMCA approach
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Y.V. Zontov, O.Ye. Rodionova, S.V. Kucheryavskiy, A.L. Pomerantsev


      PubDate: 2017-05-23T05:49:06Z
       
  • Performance of hybrid electronic tongue and HPLC coupled with chemometric
           analysis for the monitoring of yeast biotransformation
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Marcin Zabadaj, Iwona Ufnalska, Karolina Chreptowicz, Jolanta Mierzejewska, Wojciech Wróblewski, Patrycja Ciosek-Skibińska
      Monitoring of process parameters is of great importance in the field of bioprocess control. In this work two analytical systems, a hybrid electronic tongue combining potentiometric and voltammetric detection (hET) and HPLC coupled with Partial Least Squares (PLS) analysis (HPLC-PLS), have been evaluated and compared as novel, rapid techniques for biotransformation monitoring. They were applied to the analysis of yeast culture media during batch fermentation carried out for 2-phenylethanol production. The workflow for numerical analysis of the sample fingerprints provided by both techniques is presented. The ability of hET-PLS and HPLC-PLS to predict the main process parameters, such as optical density, time of culture, and glucose and 2-phenylethanol concentration, was studied. Both non-selective analytical systems proved to be suitable tools for yeast fermentation monitoring, while slightly better results were obtained by hET-PLS.

      PubDate: 2017-05-23T05:49:06Z
       
  • A new kernel function of support vector regression combined with
           probability distribution and its application in chemometrics and the QSAR
           modeling
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Sujie Xue, Xuefeng Yan
      Quantitative structure–activity relationship (QSAR) models are extensively used to identify new chemicals affecting human health and speed up the drug discovery process. The development of accurate QSAR models can lead to a reduced number of experiments conducted on rats and mice to analyze new compounds. In a typical QSAR model, only the relationship among variables is considered, and the probability distribution of the samples is disregarded. Thus, a new kernel function of support vector regression (SVR) that integrates probability distribution is proposed. The proposed kernel function, called SVR-pk, satisfies kernel function theory, and the mean and variance of the sample are used to reflect the main distribution information. To verify the performance of the new kernel function, a simulation example, two sets of data from UCI (University of California, Irvine) and two experiments on compound toxicity in rodents, using data obtained from the Carcinogenic Potency Database, are employed. Results show that, compared with other SVR models utilizing kernel functions, SVR-pk exhibits better performance and is more suitable for QSAR modeling.

      PubDate: 2017-05-23T05:49:06Z
       
  • Receptor modeling of environmental aerosol data using MLPCA-MCR-ALS
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yahya Izadmanesh, Jahan B. Ghasemi, Roma Tauler
      Receptor models apportion the measured mass of an ambient particulate matter (PM) sample at a given site, called the receptor, to its emission sources by using multivariate factor analysis. In this work, a general workflow for PM data quality assessment, measurements uncertainty calculations, receptor modeling and error in model parameters estimation is proposed. Maximum likelihood principal component analysis - multivariate curve resolution–alternating least squares (MLPCA-MCR-ALS) is proposed for general bilinear receptor modeling of noisy environmental datasets and compared with other approaches used in the field. Equations to obtain PM-source contribution estimates (PM-SCE) and contribution-to-species to identify emission sources and their attributes are proposed. Propagation of experimental uncertainties in the parameters of the receptor model is obtained using extensive computer resampling methods. Results are shown for source apportionment of a particulate matter (PM10) air monitoring data set obtained under Fairmode-WG3 project.

      PubDate: 2017-05-23T05:49:06Z
       
  • Principal Component Analysis to interpret changes in chromatic parameters
           on paint dosimeters exposed long-term to urban air
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Agustín Herrera, Davide Ballabio, Natalia Navas, Roberto Todeschini, Carolina Cardell
      Atmospheric pollutants can cause the decay of historic paintings exposed to the outdoor elements. This is a cause of great concern, since such contaminants can produce physical-chemical alterations manifested initially in undesirable color change. This paper tests an unsupervised multivariate approach on discrete color parameter data in a pioneering study which combines spectrophotometric data and principal components analysis to detect unaesthetic color change on paint dosimeters in open-air monuments exposed long-term to the urban atmosphere of the city of Granada (South Spain). To this end the chromatic parameters of the CIEL*a*b* and CIEL*C*h* systems (L*, a*, b*, h*, C* and ΔE) were used as variables for subsequent multivariate analysis in order to determine the intrinsic color change trends. The aim is to evaluate the specific chromatic parameter(s) that cause the unaesthetic damage for each type of paint dosimeter, while also considering the influence of the binder (egg yolk/rabbit glue), the pigment (azurite, malachite and lapis lazuli) and, for the first time, the grain size of the studied pigments (azurite). Results demonstrated that this approach is capable of discriminating samples on the basis of dosimeter composition, thus enabling interpretation of their aging process. Azurite and lapis lazuli-laden dosimeters tended to turn green over time as a result of exposure to city air regardless of binder composition and location. By contrast, all malachite-laden dosimeters became bluer over time. Luminosity remained stronger in dosimeters prepared with collagen, an important parameter in binder discrimination. This information is also of great value for restoration purposes.

      PubDate: 2017-05-23T05:49:06Z
       
  • Computerized delimitation of odorant areas in
           gas-chromatography-olfactometry by kernel density estimation: Data
           processing on French white wines
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Jean Blanquet, Yves Le Fur, Jordi Ballester
      GC-O using the detection frequency method gives a list of odor events (OEs) where each OE is described by a linear retention index (LRI) and by the aromatic descriptor given by a human assessor. The aim of the experimenter is to gather OEs in a total olfactogram on which he tries to delimit odorant areas (OAs), then to compute each detection frequency. This paper proposes a computerized mathematical method based on kernel density estimation that builds the total olfactogram as a continuous and differentiable function from the LRIs of the OEs only. The corresponding curve looks like a chromatogram, the peaks of which are potential OAs. The limits of an OA are the LRIs of the two minima surrounding the peak. The method was applied to a large data set: 18 white wines, 17 assessors, 13,037 OEs. A previous manual delimitation made by the experimenter was used as a benchmark to test the quality of the rendition by the computed delimitation. A contingency table containing the numbers of OEs that belonged to both benchmark OAs and computed OAs was built. This table made it possible to assess the quality of the global rendition (Tschuprow's T coefficients) and the quality of the individual rendition of each benchmark OA. In order to define a suitable range of application, the kernel-based method was tested on sub-sets from the global dataset, by randomly drawing n wines out of 18 and p assessors out of 17. The method gave very satisfying results for at least n = 9 wines, p = 7 assessors for the peaks gathering at least (n + p)/2 OEs.
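
      The delimitation idea in this abstract (a density curve built from the LRIs, with odorant areas bounded by the minima on either side of each peak) can be sketched as follows. This is a minimal illustration that assumes a Gaussian kernel, a fixed user-chosen bandwidth and a regular evaluation grid; the function names, the bandwidth and the grid step are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return density


def delimit_areas(lris, bandwidth, step=1.0):
    """Split odor events into areas bounded by local minima of the density.

    Returns a list of (left limit, right limit, number of events) tuples.
    """
    density = gaussian_kde(lris, bandwidth)
    low = min(lris) - 3.0 * bandwidth
    high = max(lris) + 3.0 * bandwidth
    n_steps = int(round((high - low) / step))
    grid = [low + k * step for k in range(n_steps + 1)]
    values = [density(x) for x in grid]
    # A grid point is a local minimum if the curve stops decreasing there.
    minima = [
        grid[k]
        for k in range(1, len(grid) - 1)
        if values[k] < values[k - 1] and values[k] <= values[k + 1]
    ]
    bounds = [low] + minima + [high]
    areas = []
    for left, right in zip(bounds, bounds[1:]):
        count = sum(1 for v in lris if left <= v < right)
        if count:
            areas.append((left, right, count))
    return areas
```

      The bandwidth controls how readily neighbouring peaks merge into one area, so in practice it would have to be tuned against a reference delimitation such as the manual benchmark described above.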

      PubDate: 2017-05-23T05:49:06Z
       
  • Dynamic learning on the manifold with constrained time information and its
           application for dynamic process monitoring
    • Abstract: Publication date: Available online 19 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Jian Yang, Mingshan Zhang, Hongbo Shi, Shuai Tan
      Complex industrial processes exhibit dynamic behavior. Typically, samples are correlated in time. Therefore monitoring methods based on a single process may not perform well under such conditions. In this paper, a novel algorithm named time information constrained embedding (TICE) is proposed to improve the monitoring performance for dynamic processes. In this study, the neighbors are selected to reconstruct the current data point. To account for the serial correlation, a time window of a certain length is adopted to restrict the scope of the neighbor selection. To reveal the distance in the time scale as well as to preserve the neighborhood structure, a new expression of time weight is given to quantify the importance of sequential neighbors. Furthermore, an enhanced objective function is constructed to calculate the transformation matrix. Finally, the superiority of the proposed method is illustrated by an application example (TecQuipment CE117 process trainer) and the Tennessee Eastman (TE) process.

      PubDate: 2017-05-23T05:49:06Z
       
  • On the structure of dynamic principal component analysis used in
           statistical process monitoring
    • Abstract: Publication date: Available online 18 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Erik Vanhatalo, Murat Kulahci, Bjarne Bergquist
      When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time-dependent data. In DPCA the input matrix is augmented by adding time-lagged values of the variables. In building a DPCA model the analyst needs to decide on (1) the number of lags to add, and (2) given a specific lag structure, how many principal components to retain. In this article we propose a new analyst driven method to determine the maximum number of lags in DPCA with a foundation in multivariate time series analysis. The method is based on the behavior of the eigenvalues of the lagged autocorrelation and partial autocorrelation matrices. Given a specific lag structure we also propose a method for determining the number of principal components to retain. The number of retained principal components is determined by visual inspection of the serial correlation in the squared prediction error statistic, Q (SPE), together with the cumulative explained variance of the model. The methods are illustrated using simulated vector autoregressive and moving average data, and tested on Tennessee Eastman process data.
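
      The augmentation step mentioned in this abstract, in which time-lagged copies of the variables are appended to the input matrix before PCA, can be sketched as below. The function name `add_lags` and the row-of-lists data layout are hypothetical choices for illustration; choosing the number of lags is the subject of the paper and is not addressed here.

```python
def add_lags(data, n_lags):
    """Build the DPCA input matrix from a list of observation rows.

    Row t of the result is [x(t), x(t-1), ..., x(t-n_lags)], so the first
    n_lags observations are consumed as history and produce no row.
    """
    if n_lags < 0 or n_lags >= len(data):
        raise ValueError("n_lags must be between 0 and len(data) - 1")
    augmented = []
    for t in range(n_lags, len(data)):
        row = []
        for lag in range(n_lags + 1):
            row.extend(data[t - lag])
        augmented.append(row)
    return augmented
```

      With m variables and l lags the augmented matrix has m(l + 1) columns, which is why the number of lags and the number of retained principal components have to be chosen together.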

      PubDate: 2017-05-23T05:49:06Z
       
  • Predicting DNase I hypersensitive sites via un-biased pseudo trinucleotide
           composition
    • Abstract: Publication date: Available online 18 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Muhammad Kabir, Dong-Jun Yu
      DNase I Hypersensitive sites (DHS) are the regions that are sensitive to cleavage by the DNase I enzyme. Knowledge regarding these sites is helpful for deciphering the functions of non-coding genomic regions. Various biological processes need its intervention. Traditional techniques for predicting DHS sites are laborious and time-consuming. In particular, with the avalanche of DNA sequences generated in the post-genomic era, the development of computational approaches is essential for the precise and timely prediction of DHS sites in DNA sequences. The existing feature encoding schemes such as pseudo dinucleotide composition, pseudo trinucleotide composition etc. cannot effectively express features from DHS sequences. In the current study, we propose a new computational technique to predict DHS sites which uses an Un-biased Pseudo Trinucleotide Composition (Unb-PseTNC) strategy to extract nominal descriptors from the DHS benchmark dataset and avoid bias among the classes during the classification phase. Several classification algorithms, including support vector machine (SVM), probabilistic neural network and k-nearest neighbor, are employed to classify the extracted features. It was observed that SVM in conjunction with Unb-PseTNC outperforms other techniques. In a comparison with other existing predictors under the rigorous jackknife test, our proposed method achieved higher prediction rates. This indicates that the proposed model will become a useful tool to predict DHS sites and can also be utilized for in-depth study of DNA and genome analysis.

      PubDate: 2017-05-23T05:49:06Z
       
  • Determination of allura red dye in hard candies by using digital images
           obtained with a mobile phone and N-PLS
    • Abstract: Publication date: Available online 18 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Bruno G. Botelho, Kele C.F. Dantas, Marcelo M. Sena
      This paper describes the development of an optical sensor device using a smartphone and a homemade dark chamber built with recycled materials. This low cost instrument was employed in the development of multivariate image regression methods for the determination of the azo dye allura red in hard candies. To build the models, 238 candy samples of four flavors and different brands and batches were used. Firstly, a multivariate calibration model using RGB histograms and partial least squares (PLS) was built. This model provided high prediction errors, which were attributed to the presence of textural variations in the images. Then, a more complex image analysis methodology that incorporates spatial information, and consists of preprocessing by a two-dimensional fast Fourier transform followed by multi-way calibration with N-way PLS, provided better results, decreasing the prediction errors around 25–35%. The final model was submitted to a complete multivariate analytical validation, being considered precise, linear, sensitive and unbiased. The analytical range was established between 22.9 and 78.8 mg kg−1 of allura red. Root mean square errors of calibration (RMSEC) and prediction (RMSEP) of 4.8 and 6.1 mg kg−1 were estimated. The developed method is simple, rapid, and nondestructive.

      PubDate: 2017-05-23T05:49:06Z
       
  • Fuzzy decision fusion system for fault classification with analytic
           hierarchy process approach
    • Abstract: Publication date: Available online 17 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yue Liu, Zhiqiang Ge
      The performance of most existing fault detection and classification methods can only be guaranteed when their own assumptions are met. In other words, a method that works well under one condition may not perform well under another. In this paper, a new analytic hierarchy process (AHP) based fuzzy decision fusion system is proposed to tackle the fault classification problem. The AHP approach is introduced to determine the priorities of different classifiers, which are then used as the weights in the ensemble system. Compared with a conventional equal-weighted fusion system, the proposed fuzzy fusion system is able to provide more rational and convincing fault classification results. The effectiveness of the proposed fuzzy fusion system with model evaluation is verified on the Tennessee Eastman (TE) benchmark process.

      PubDate: 2017-05-18T05:43:26Z
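The idea of using AHP priorities as classifier weights can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's fuzzy fusion system: the priorities are approximated with the row geometric-mean method, the fusion is a plain weighted sum of class scores, and the pairwise comparison matrix and scores are invented for the example.

```python
import math


def ahp_priorities(pairwise):
    """Approximate AHP priorities with the row geometric-mean method.

    pairwise[i][j] states how strongly item i is preferred over item j
    (a reciprocal matrix). Returns weights that sum to 1.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_fusion(class_scores, weights):
    """Fuse per-classifier class scores by a weighted sum; return (label, fused)."""
    n_classes = len(class_scores[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, class_scores))
        for c in range(n_classes)
    ]
    return fused.index(max(fused)), fused


# Hypothetical pairwise comparisons of three classifiers (assumption).
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]
weights = ahp_priorities(pairwise)

# Hypothetical class scores from each classifier for one sample (two fault classes).
scores = [[0.8, 0.2], [0.35, 0.65], [0.3, 0.7]]
ahp_label, _ = weighted_fusion(scores, weights)
equal_label, _ = weighted_fusion(scores, [1.0 / 3.0] * 3)
```

In this toy case the most trusted classifier dominates the AHP-weighted decision, whereas equal weighting follows the two less trusted classifiers, which is the kind of difference the abstract attributes to priority-based weights.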
       
  • Efficient android electronic nose design for recognition and perception of
           fruit odors using Kernel Extreme Learning Machines
    • Abstract: Publication date: Available online 17 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Ayşegül Uçar, Recep Özalp
      This study presents a novel android electronic nose construction using Kernel Extreme Learning Machines (KELMs). The construction consists of two parts. In the first part, a low-cost android electronic nose with fast and accurate detection is designed using Metal Oxide Semiconductor (MOS) gas sensors. In the second part, the KELMs are implemented so that the electronic nose achieves fast and highly accurate recognition. The proposed algorithm is designed to recognize the odor of six fruits. Fruits at two concentration levels are placed in the sample chamber of the electronic nose to ensure that the features are invariant to concentration. Odor samples in the form of time series are collected and preprocessed. This is a newly introduced, simple feature extraction step that does not use any dimension reduction method. The resulting salient features are fed to the inputs of the KELMs. Additionally, K-Nearest Neighbor (K-NN) classifiers, Support Vector Machines (SVMs), Least-Squares Support Vector Machines (LSSVMs), and Extreme Learning Machines (ELMs) are used for comparison. According to the comparative results for the proposed experimental setup, the KELMs produced good odor recognition performance in terms of high test accuracy and fast response. In addition, the odor concentration level was visualized on an android platform.

      PubDate: 2017-05-18T05:43:26Z
       
  • Evaluation of calibration transfer methods using the ATR-FTIR technique to
           predict density of crude oil
    • Abstract: Publication date: 15 July 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 166
      Author(s): Rayza R.T. Rodrigues, Julia T.C. Rocha, L. Mirela S.L. Oliveira, Júlio Cesar M. Dias, Edson I. Müller, Eustáquio V.R. Castro, Paulo R. Filgueiras
      Multivariate calibration combined with infrared technique is an alternative to traditional methods of determination of physicochemical parameters in crude oil. However, a multivariate model can only be applied for the instrument in which the spectra were measured. In case of equipment upkeep or change of instrument, transferring the calibration model is necessary for the new instrumental condition or new instrument. In this study, Fourier transform infrared spectra (FTIR) were measured in the mid-infrared region (MIR) in two different instruments for 96 crude oil samples with API gravity ranging from 11.2 to 54.0. Multivariate calibration models by PLS (Partial Least Squares) and OPLS (Orthogonal Projections to Latent Structures) were developed and forefront techniques in the calibration transfer area were tested, namely SBC (Slope and Bias Correction), DS (Direct Standardization) and PDS (Piecewise Direct Standardization). The OPLS method stands out for not requiring transfer samples, although it compromises the accuracy of prediction. Applying spectra transferred by PDS to the OPLS model resulted in accuracy statistically equal to that of the PLS original model.

      PubDate: 2017-05-07T14:46:50Z
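Of the transfer techniques named in the abstract above, slope and bias correction (SBC) is the simplest to illustrate. The sketch below assumes the common formulation in which predictions for the same transfer samples, obtained from spectra measured on the secondary and on the primary instrument, are related by a least-squares line. The function names and the numbers are hypothetical and are not taken from the study.

```python
def fit_slope_bias(secondary_pred, primary_pred):
    """Least-squares line mapping secondary-instrument predictions onto primary ones."""
    n = len(secondary_pred)
    mean_s = sum(secondary_pred) / n
    mean_p = sum(primary_pred) / n
    sxx = sum((s - mean_s) ** 2 for s in secondary_pred)
    sxy = sum(
        (s - mean_s) * (p - mean_p)
        for s, p in zip(secondary_pred, primary_pred)
    )
    slope = sxy / sxx
    bias = mean_p - slope * mean_s
    return slope, bias


def apply_slope_bias(predictions, slope, bias):
    """Correct new secondary-instrument predictions with the fitted line."""
    return [slope * p + bias for p in predictions]


# Hypothetical predicted values for four transfer samples (assumption).
secondary = [12.0, 22.0, 32.0, 42.0]
primary = [11.0, 22.0, 33.0, 44.0]
slope, bias = fit_slope_bias(secondary, primary)
corrected = apply_slope_bias([27.0], slope, bias)
```

Because SBC only rescales predicted values, it leaves the spectra untouched. DS and PDS instead transform the spectra themselves, which this sketch does not attempt.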
       
  • Joint-individual monitoring of large-scale chemical processes with
           multiple interconnected operation units incorporating multiset CCA
    • Abstract: Publication date: Available online 5 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yang Wang, Qingchao Jiang, Xuefeng Yan, Jingqi Fu
      Large-scale processes with multiple interconnected operation units have become popular, and monitoring such processes is imperative. A joint-individual monitoring scheme that incorporates multiset canonical correlation analysis (MCCA) for large-scale chemical processes with several interconnected operation units is proposed. First, MCCA is employed to extract the joint features throughout the entire process. Second, for each operation unit, the measurements are projected into a joint feature subspace and its orthogonal complement subspace that contains the individual features of the unit. Then, corresponding statistics are constructed to examine the joint and individual features simultaneously. The proposed joint-individual monitoring scheme considers the global information throughout the entire process and the local information of a local operation unit and therefore exhibits superior monitoring performance. The joint-individual monitoring scheme is applied on a numerical example and the Tennessee Eastman benchmark process. Monitoring results indicate the efficiency of the proposed monitoring scheme.

      PubDate: 2017-05-07T14:46:50Z
       
  • Estimation of missing values in a food property database by matrix
           completion using PCA-based approaches
    • Abstract: Publication date: Available online 3 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Samuel Mercier, Martin Mondor, Bernard Marcos, Christine Moresoli, Sébastien Villeneuve
      In this work, five matrix completion algorithms were investigated for the estimation of missing values in a food property database: iterative PCA with (IPCAE) and without (IPCA) early stopping, trimmed scores regression with (TSRE) and without (TSR) early stopping, and variational Bayesian PCA (VBPCA). Matrix completion was applied to a food property database (31 properties × 663 observations) developed by meta-analysis for new food product development, a novel application of matrix completion. In the database, 68.7% of the values were missing. VBPCA and TSRE were the most accurate algorithms and explained on average 42% and 40%, respectively, of the variance of the missing values. Incorporating an early stopping step in the TSR and IPCA algorithms decreased overfitting and significantly improved their accuracy. The accuracy of the missing value estimates varied significantly by property, and the coefficient of determination for each property with VBPCA ranged from 0.02 to 0.84. The accuracy of the missing value estimates was higher when properties known for only a few observations were included in the database, indicating that the matrix completion algorithms successfully used the additional information that those properties provided to improve the estimation of the other properties in the database. For 17% of the database, the matrix completion algorithms could identify whether the missing value was above or below the average value of the property with a confidence level above 90%, providing additional information for product characterization at no experimental cost.

      PubDate: 2017-05-07T14:46:50Z
       
  • Detection of Nonlinearity in Soil Property Prediction Models Based on
           Near-infrared Spectroscopy
    • Abstract: Publication date: Available online 1 May 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Lu Yan, Matheus S. Escobar, Hiromasa Kaneko, Kimito Funatsu
      Soil property analysis is indispensable in precision agriculture, an advanced field regarding site-specific management for crop production enhancement and environmental sustainability. Because of the difficulties in soil sample collection and measurement of soil properties, such as moisture content, total carbon, total nitrogen, electricity, and pH, near-infrared (NIR) spectroscopy is a useful technique to predict soil properties by using statistical learning methods. However, the prediction of soil properties without any knowledge about how different variables might influence their behavior is not adequate. Soil properties differ depending on location and environment. The variability within the same area could cause nonlinearity on a global scale. Therefore, to determine which method and strategy are suitable for this task, the detection of nonlinearity between NIR spectroscopy and soil properties is the main purpose of this study. Various numerical tools and graphical methods were applied to this soil property dataset, such as variable selection, sample splitting, applicability domain evaluation, and residual inspection. Global nonlinearity for all five soil properties was confirmed, and the strength of such nonlinearities was found to be property dependent.

      PubDate: 2017-05-02T14:35:46Z
       
  • Fuzzy clustering as rational partition method for QSAR
    • Abstract: Publication date: Available online 27 April 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Alfonso Pérez-Garrido, Francisco Girón-Rodríguez, Andrés Bueno-Crespo, Jesús Soto, Horacio Pérez-Sánchez, Aliuska Morales Helguera
      Various methods are used to partition data sets for QSAR development and model validation. In this work we used fuzzy minimals partitioning and compared this methodology with other rational partition methods such as k-means clustering (KMS) and Minimal Test Set Dissimilarity (MTSD). Ordinary Least Squares (OLS) and Extreme Learning Machine (ELM) methods were used to develop the QSAR models. The generated QSAR equations were validated by the coefficient of determination of the internal leave-one-out (LOO) cross-validation method, $Q^2_{LOO}$, and then the coefficient for the external test set, $Q^2_{ext}$, was compared between partition methods. The results of this comparison showed that, for big and structurally diverse data sets, fuzzy minimals gave an applicability domain similar to KMS and models with better predictability than both KMS and MTSD.

      PubDate: 2017-05-02T14:35:46Z
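The leave-one-out cross-validated coefficient of determination mentioned in the abstract above can be sketched for the simplest case of an OLS model with a single descriptor. This is an illustrative assumption: the definition used here is 1 minus PRESS divided by the total sum of squares about the overall mean of the response, and published variants differ in which mean they use. The data are invented.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(x, y):
    """Leave-one-out Q^2: 1 - PRESS / SS_tot, using the overall mean of y."""
    press = 0.0
    for i in range(len(x)):
        # Refit the model without sample i, then predict sample i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities (assumption).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
q2 = q2_loo(descriptor, activity)
```

An external $Q^2$ follows the same pattern, except that the model is fitted once on the training partition and the squared errors are accumulated over the held-out test set, so the choice of partition method directly affects the value.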
       
  • Double Outlyingness Analysis in Quantitative Spectral Calibration:
           Implicit Detection and Intuitive Categorization of Outliers
    • Abstract: Publication date: Available online 27 April 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Hui Cao, Yajie Yu, Yan Zhou, Xiali Hei
      In this study, outliers in the spectral calibration set were analyzed and categorized based on their error introducing patterns. A double outlyingness analysis (DOA) method for outlier detection and categorization was proposed as a tool to detail the error structures of outliers. Two outlyingness values were calculated based on a proposed procedure. The outlier diagnosis diagram based on the inlier model was drawn to distinguish four types of samples: type I (outlier), incorrect concentration(s) with contaminated spectral signals; type II (outlier), incorrect concentration(s) with uncontaminated spectral signals; type III (outlier), correct concentration(s) with contaminated spectral signals; type IV (inlier), correct concentration(s) with uncontaminated spectral signals. Four data sets for quantitative spectral calibrations were used to compare DOA and five existing methods. Results show that DOA is able to detect all types of outliers and provide a tool to analyze outlier structures.

      PubDate: 2017-05-02T14:35:46Z
       
  • L0-constrained regression using mixed integer linear programming
    • Abstract: Publication date: Available online 12 April 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Mark J. Willis, Moritz von Stosch
      In this work, sparse regression using a penalized least absolute deviations objective function is considered. Regression model sparsity is promoted using an L0 pseudo-norm penalty (the cardinality of the model parameter vector). Implemented using mixed integer linear programming (MILP), the L0 norm (without approximation) is shown to enable efficient and accurate solutions to sparse regression problems of practical size. For model development with a large number of potential model parameters (or features), methods to relax the MILP are also developed: using nonlinear function approximations to the L0 norm, penalty terms are linearized and solved using sequential linear programming. Experimental results (using both simulated and real data) demonstrate that these algorithms are also computationally efficient, producing accurate and parsimonious model structures. The applications considered are the development of a calibration model for prediction with Near Infrared (NIR) data and the development of a model for the prediction of chemical toxicity, a quantitative structure activity relationship (QSAR).

      PubDate: 2017-04-18T14:01:59Z
       
  • Quantitative Analysis of Biofluids Based on Hybrid Spectra Space
    • Abstract: Publication date: Available online 11 April 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Zhigang Li, Tianhe Li, Hong Lv, Qiaoyun Wang, Guangyuan Si, Zhonghai He
      Direct determination of chemical constituents in complex biofluids without the need for any reagent or pre-processing of samples has become a promising technique for clinical analysis. Attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy has been widely studied as a powerful method for reagent-free biofluid analysis. In this work, to make further use of the available information and improve prediction performance, a new hybrid spectra space was constructed from different derivative spectra spaces. Hybrid spectra space ensemble interval partial least squares modeling (HSEiPLS) was then proposed for the quantitative analysis of biofluids. In the experiment determining glucose concentrations in 58 whole blood samples, the F-test was used to determine the optimal number of latent variables for the models, with the F-test significance level set to 0.25. The HSEiPLS model provided a lower root mean square error of prediction (RMSEP), 0.352 mM/L, than the other methods. In the experiment determining cholesterol concentrations in 50 whole blood samples, the HSEiPLS model provided an RMSEP of 0.205 mM/L at the same significance level. The experimental results demonstrate that the proposed HSEiPLS based on hybrid spectra space provides superior predictive power for biofluids.

      PubDate: 2017-04-11T13:31:45Z
       
  • A statistical strategy to assess cleaning level of surfaces using
           Fluorescence spectroscopy and Wilks’ ratio
    • Abstract: Publication date: Available online 5 April 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Iuliana Madalina Stoica, Hamid Babamoradi, Frans van den Berg
      There is a high demand for techniques able to monitor, on-line and in real time, the bio-contamination level of contact surfaces in the food industry. Such techniques could help operators react promptly whenever failures in the cleaning or sanitation operations occur, keep the safety parameters in control at all times during production, and ultimately tailor the operations towards more sustainable and efficient practices. However, monitoring surface areas such as conveyor belts comes with a distinct set of challenges arising from the construction materials used in food processing equipment, such as compositional heterogeneity, background signals, and continuous changes due to wear and tear. In this work we demonstrate the potential of front-face fluorescence spectroscopy in combination with Wilks’ ratio statistics for monitoring large surface areas fouled under industrial working conditions. The technique was tested in both off-line and on-line mode on a polymer-based conveyor surface, which presents an intrinsic natural variation across its running length and which was contaminated artificially as a proof of principle. Results show that any potential contamination shifts the variance and covariance structure of the in-control fluorescence landscapes modeled with PARAFAC, and that this shift is detected as a deviation from the reference clean state in Wilks’ ratio based monitoring charts.

      PubDate: 2017-04-11T13:31:45Z
       
  • NMR-based metabolomic analyses for the componential differences and the
           corresponding metabolic responses of three batches of Farfarae Flos
    • Abstract: Publication date: Available online 29 March 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): ZhenYu Li, Jing Li, ZhengZheng Zhang, Xia Mi, GuanHua Du, XueMei Qin
      Farfarae Flos (FF) is a commonly used herbal drug with a long history of use in Traditional Chinese Medicine (TCM). Nowadays, FF cultivated in northern China is the main source of the FF used in TCM clinical practice. The chemical composition of herbal drugs is always influenced by the weather, geographic location, soil conditions, and cultivation patterns. It is therefore difficult to guarantee the homogeneity or uniformity of herbal drugs. In this study, a metabolomic approach was used to compare three batches of FF collected from different growing regions. The results showed that the three batches of FF differed from each other in both primary and secondary metabolites, and that there were also in vivo differences among the three groups of FF. The clustering pattern of the n-butanol extracts was similar to those of the crude water extracts and serum, indicating that polar compounds, such as phenylpropanoids and flavonoids, play an important role in the water extracts of FF. The results presented here suggest that the metabolomic approach can be a valuable method for evaluating differences between herbal drugs of various origins.

      PubDate: 2017-04-04T13:24:31Z
       
  • Stacked Interval Sparse Partial Least Squares Regression Analysis
    • Abstract: Publication date: Available online 21 March 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Dominic V. Poerio, Steven D. Brown
      A new method based on a combination of stacked interval partial least squares (SIPLS) and sparse partial least squares (SPLS) regression, called stacked interval sparse PLS (SISPLS) regression, is explored. The proposed method is based on splitting spectral data into discrete, equally spaced intervals, building optimized SPLS models on each region, and weighting the local models based on the cross-validation error achieved during the optimization. The method is highly flexible and only performs explicit variable selection when advantageous; instead the aim is to find favorable rotations of the classical PLS solution while also utilizing local information in a spectrum. The SISPLS model regression vector clearly highlights regional and variable importance in the data, permitting a straightforward interpretation of the model. For a specific dataset, the optimal interval size is determined via a random sampling of the calibration data and exhaustive testing of the feasible interval sizes. The method is demonstrated on two NIR datasets and a Raman dataset. In addition to the multi-faceted interpretational advantage from the variable selection and weighting, we show that the predictions from the method are competitive with PLS, SPLS, SIPLS, and VIP selection.

      PubDate: 2017-03-27T13:11:43Z
       
 
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
Fax: +00 44 (0)131 4513327
JournalTOCs © 2009-2016