  Subjects -> COMPUTER SCIENCE (Total: 2011 journals)
    - ANIMATION AND SIMULATION (30 journals)
    - AUTOMATION AND ROBOTICS (98 journals)
    - COMPUTER ARCHITECTURE (9 journals)
    - COMPUTER ENGINEERING (9 journals)
    - COMPUTER GAMES (16 journals)
    - COMPUTER PROGRAMMING (24 journals)
    - COMPUTER SCIENCE (1172 journals)
    - COMPUTER SECURITY (46 journals)
    - DATA BASE MANAGEMENT (13 journals)
    - DATA MINING (32 journals)
    - E-BUSINESS (22 journals)
    - E-LEARNING (29 journals)
    - IMAGE AND VIDEO PROCESSING (39 journals)
    - INFORMATION SYSTEMS (108 journals)
    - INTERNET (92 journals)
    - SOCIAL WEB (50 journals)
    - SOFTWARE (34 journals)
    - THEORY OF COMPUTING (8 journals)

COMPUTER SCIENCE (1172 journals)

Showing 1 - 200 of 872 Journals sorted alphabetically
3D Printing and Additive Manufacturing     Full-text available via subscription   (Followers: 15)
Abakós     Open Access   (Followers: 4)
ACM Computing Surveys     Hybrid Journal   (Followers: 24)
ACM Journal on Computing and Cultural Heritage     Hybrid Journal   (Followers: 9)
ACM Journal on Emerging Technologies in Computing Systems     Hybrid Journal   (Followers: 13)
ACM Transactions on Accessible Computing (TACCESS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Algorithms (TALG)     Hybrid Journal   (Followers: 16)
ACM Transactions on Applied Perception (TAP)     Hybrid Journal   (Followers: 6)
ACM Transactions on Architecture and Code Optimization (TACO)     Hybrid Journal   (Followers: 9)
ACM Transactions on Autonomous and Adaptive Systems (TAAS)     Hybrid Journal   (Followers: 7)
ACM Transactions on Computation Theory (TOCT)     Hybrid Journal   (Followers: 12)
ACM Transactions on Computational Logic (TOCL)     Hybrid Journal   (Followers: 3)
ACM Transactions on Computer Systems (TOCS)     Hybrid Journal   (Followers: 18)
ACM Transactions on Computer-Human Interaction     Hybrid Journal   (Followers: 15)
ACM Transactions on Computing Education (TOCE)     Hybrid Journal   (Followers: 6)
ACM Transactions on Design Automation of Electronic Systems (TODAES)     Hybrid Journal   (Followers: 2)
ACM Transactions on Economics and Computation     Hybrid Journal  
ACM Transactions on Embedded Computing Systems (TECS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Information Systems (TOIS)     Hybrid Journal   (Followers: 21)
ACM Transactions on Intelligent Systems and Technology (TIST)     Hybrid Journal   (Followers: 8)
ACM Transactions on Interactive Intelligent Systems (TiiS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)     Hybrid Journal   (Followers: 10)
ACM Transactions on Reconfigurable Technology and Systems (TRETS)     Hybrid Journal   (Followers: 7)
ACM Transactions on Sensor Networks (TOSN)     Hybrid Journal   (Followers: 9)
ACM Transactions on Speech and Language Processing (TSLP)     Hybrid Journal   (Followers: 10)
ACM Transactions on Storage     Hybrid Journal  
ACS Applied Materials & Interfaces     Full-text available via subscription   (Followers: 25)
Acta Automatica Sinica     Full-text available via subscription   (Followers: 3)
Acta Universitatis Cibiniensis. Technical Series     Open Access  
Ad Hoc Networks     Hybrid Journal   (Followers: 11)
Adaptive Behavior     Hybrid Journal   (Followers: 11)
Advanced Engineering Materials     Hybrid Journal   (Followers: 26)
Advanced Science Letters     Full-text available via subscription   (Followers: 9)
Advances in Adaptive Data Analysis     Hybrid Journal   (Followers: 7)
Advances in Artificial Intelligence     Open Access   (Followers: 16)
Advances in Calculus of Variations     Hybrid Journal   (Followers: 2)
Advances in Catalysis     Full-text available via subscription   (Followers: 6)
Advances in Computational Mathematics     Hybrid Journal   (Followers: 18)
Advances in Computer Science : an International Journal     Open Access   (Followers: 15)
Advances in Computing     Open Access   (Followers: 2)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 52)
Advances in Engineering Software     Hybrid Journal   (Followers: 27)
Advances in Geosciences (ADGEO)     Open Access   (Followers: 11)
Advances in Human Factors/Ergonomics     Full-text available via subscription   (Followers: 27)
Advances in Human-Computer Interaction     Open Access   (Followers: 21)
Advances in Materials Sciences     Open Access   (Followers: 16)
Advances in Operations Research     Open Access   (Followers: 12)
Advances in Parallel Computing     Full-text available via subscription   (Followers: 7)
Advances in Porous Media     Full-text available via subscription   (Followers: 5)
Advances in Remote Sensing     Open Access   (Followers: 40)
Advances in Science and Research (ASR)     Open Access   (Followers: 6)
Advances in Technology Innovation     Open Access   (Followers: 4)
AEU - International Journal of Electronics and Communications     Hybrid Journal   (Followers: 8)
African Journal of Information and Communication     Open Access   (Followers: 8)
African Journal of Mathematics and Computer Science Research     Open Access   (Followers: 4)
Air, Soil & Water Research     Open Access   (Followers: 9)
AIS Transactions on Human-Computer Interaction     Open Access   (Followers: 6)
Algebras and Representation Theory     Hybrid Journal   (Followers: 1)
Algorithms     Open Access   (Followers: 11)
American Journal of Computational and Applied Mathematics     Open Access   (Followers: 5)
American Journal of Computational Mathematics     Open Access   (Followers: 4)
American Journal of Information Systems     Open Access   (Followers: 5)
American Journal of Sensor Technology     Open Access   (Followers: 4)
Anais da Academia Brasileira de Ciências     Open Access   (Followers: 2)
Analog Integrated Circuits and Signal Processing     Hybrid Journal   (Followers: 7)
Analysis in Theory and Applications     Hybrid Journal   (Followers: 1)
Animation Practice, Process & Production     Hybrid Journal   (Followers: 5)
Annals of Combinatorics     Hybrid Journal   (Followers: 3)
Annals of Data Science     Hybrid Journal   (Followers: 11)
Annals of Mathematics and Artificial Intelligence     Hybrid Journal   (Followers: 12)
Annals of Pure and Applied Logic     Open Access   (Followers: 2)
Annals of Software Engineering     Hybrid Journal   (Followers: 13)
Annual Reviews in Control     Hybrid Journal   (Followers: 6)
Anuario Americanista Europeo     Open Access  
Applicable Algebra in Engineering, Communication and Computing     Hybrid Journal   (Followers: 2)
Applied and Computational Harmonic Analysis     Full-text available via subscription   (Followers: 1)
Applied Artificial Intelligence: An International Journal     Hybrid Journal   (Followers: 13)
Applied Categorical Structures     Hybrid Journal   (Followers: 2)
Applied Clinical Informatics     Hybrid Journal   (Followers: 2)
Applied Computational Intelligence and Soft Computing     Open Access   (Followers: 12)
Applied Computer Systems     Open Access   (Followers: 2)
Applied Informatics     Open Access  
Applied Mathematics and Computation     Hybrid Journal   (Followers: 33)
Applied Medical Informatics     Open Access   (Followers: 10)
Applied Numerical Mathematics     Hybrid Journal   (Followers: 5)
Applied Soft Computing     Hybrid Journal   (Followers: 15)
Applied Spatial Analysis and Policy     Hybrid Journal   (Followers: 5)
Architectural Theory Review     Hybrid Journal   (Followers: 3)
Archive of Applied Mechanics     Hybrid Journal   (Followers: 5)
Archive of Numerical Software     Open Access  
Archives and Museum Informatics     Hybrid Journal   (Followers: 137)
Archives of Computational Methods in Engineering     Hybrid Journal   (Followers: 4)
Artifact     Hybrid Journal   (Followers: 2)
Artificial Life     Hybrid Journal   (Followers: 7)
Asia Pacific Journal on Computational Engineering     Open Access  
Asia-Pacific Journal of Information Technology and Multimedia     Open Access   (Followers: 1)
Asian Journal of Computer Science and Information Technology     Open Access  
Asian Journal of Control     Hybrid Journal  
Assembly Automation     Hybrid Journal   (Followers: 2)
at - Automatisierungstechnik     Hybrid Journal   (Followers: 1)
Australian Educational Computing     Open Access   (Followers: 1)
Automatic Control and Computer Sciences     Hybrid Journal   (Followers: 4)
Automatic Documentation and Mathematical Linguistics     Hybrid Journal   (Followers: 5)
Automatica     Hybrid Journal   (Followers: 11)
Automation in Construction     Hybrid Journal   (Followers: 6)
Autonomous Mental Development, IEEE Transactions on     Hybrid Journal   (Followers: 9)
Basin Research     Hybrid Journal   (Followers: 5)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Biodiversity Information Science and Standards     Open Access  
Bioinformatics     Hybrid Journal   (Followers: 287)
Biomedical Engineering     Hybrid Journal   (Followers: 15)
Biomedical Engineering and Computational Biology     Open Access   (Followers: 14)
Biomedical Engineering, IEEE Reviews in     Full-text available via subscription   (Followers: 18)
Biomedical Engineering, IEEE Transactions on     Hybrid Journal   (Followers: 34)
Briefings in Bioinformatics     Hybrid Journal   (Followers: 47)
British Journal of Educational Technology     Hybrid Journal   (Followers: 139)
Broadcasting, IEEE Transactions on     Hybrid Journal   (Followers: 10)
c't Magazin fuer Computertechnik     Full-text available via subscription   (Followers: 2)
CALCOLO     Hybrid Journal  
Calphad     Hybrid Journal  
Canadian Journal of Electrical and Computer Engineering     Full-text available via subscription   (Followers: 14)
Capturing Intelligence     Full-text available via subscription  
Catalysis in Industry     Hybrid Journal   (Followers: 1)
CEAS Space Journal     Hybrid Journal   (Followers: 2)
Cell Communication and Signaling     Open Access   (Followers: 2)
Central European Journal of Computer Science     Hybrid Journal   (Followers: 5)
CERN IdeaSquare Journal of Experimental Innovation     Open Access   (Followers: 1)
Chaos, Solitons & Fractals     Hybrid Journal   (Followers: 3)
Chemometrics and Intelligent Laboratory Systems     Hybrid Journal   (Followers: 14)
ChemSusChem     Hybrid Journal   (Followers: 7)
China Communications     Full-text available via subscription   (Followers: 7)
Chinese Journal of Catalysis     Full-text available via subscription   (Followers: 2)
CIN Computers Informatics Nursing     Full-text available via subscription   (Followers: 11)
Circuits and Systems     Open Access   (Followers: 15)
Clean Air Journal     Full-text available via subscription   (Followers: 2)
CLEI Electronic Journal     Open Access  
Clin-Alert     Hybrid Journal   (Followers: 1)
Cluster Computing     Hybrid Journal   (Followers: 1)
Cognitive Computation     Hybrid Journal   (Followers: 4)
COMBINATORICA     Hybrid Journal  
Combustion Theory and Modelling     Hybrid Journal   (Followers: 14)
Communication Methods and Measures     Hybrid Journal   (Followers: 12)
Communication Theory     Hybrid Journal   (Followers: 20)
Communications Engineer     Hybrid Journal   (Followers: 1)
Communications in Algebra     Hybrid Journal   (Followers: 3)
Communications in Partial Differential Equations     Hybrid Journal   (Followers: 3)
Communications of the ACM     Full-text available via subscription   (Followers: 55)
Communications of the Association for Information Systems     Open Access   (Followers: 18)
COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering     Hybrid Journal   (Followers: 3)
Complex & Intelligent Systems     Open Access   (Followers: 1)
Complex Adaptive Systems Modeling     Open Access  
Complex Analysis and Operator Theory     Hybrid Journal   (Followers: 2)
Complexity     Hybrid Journal   (Followers: 6)
Complexus     Full-text available via subscription  
Composite Materials Series     Full-text available via subscription   (Followers: 9)
Computación y Sistemas     Open Access  
Computation     Open Access  
Computational and Applied Mathematics     Hybrid Journal   (Followers: 2)
Computational and Mathematical Methods in Medicine     Open Access   (Followers: 2)
Computational and Mathematical Organization Theory     Hybrid Journal   (Followers: 2)
Computational and Structural Biotechnology Journal     Open Access   (Followers: 2)
Computational and Theoretical Chemistry     Hybrid Journal   (Followers: 9)
Computational Astrophysics and Cosmology     Open Access   (Followers: 1)
Computational Biology and Chemistry     Hybrid Journal   (Followers: 11)
Computational Chemistry     Open Access   (Followers: 2)
Computational Cognitive Science     Open Access   (Followers: 2)
Computational Complexity     Hybrid Journal   (Followers: 4)
Computational Condensed Matter     Open Access  
Computational Ecology and Software     Open Access   (Followers: 9)
Computational Economics     Hybrid Journal   (Followers: 9)
Computational Geosciences     Hybrid Journal   (Followers: 15)
Computational Linguistics     Open Access   (Followers: 22)
Computational Management Science     Hybrid Journal  
Computational Mathematics and Modeling     Hybrid Journal   (Followers: 8)
Computational Mechanics     Hybrid Journal   (Followers: 4)
Computational Methods and Function Theory     Hybrid Journal  
Computational Molecular Bioscience     Open Access   (Followers: 2)
Computational Optimization and Applications     Hybrid Journal   (Followers: 7)
Computational Particle Mechanics     Hybrid Journal   (Followers: 1)
Computational Research     Open Access   (Followers: 1)
Computational Science and Discovery     Full-text available via subscription   (Followers: 2)
Computational Science and Techniques     Open Access  
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 30)
Computer     Full-text available via subscription   (Followers: 91)
Computer Aided Surgery     Hybrid Journal   (Followers: 5)
Computer Applications in Engineering Education     Hybrid Journal   (Followers: 8)
Computer Communications     Hybrid Journal   (Followers: 10)
Computer Engineering and Applications Journal     Open Access   (Followers: 5)
Computer Journal     Hybrid Journal   (Followers: 9)
Computer Methods in Applied Mechanics and Engineering     Hybrid Journal   (Followers: 22)
Computer Methods in Biomechanics and Biomedical Engineering     Hybrid Journal   (Followers: 12)
Computer Methods in the Geosciences     Full-text available via subscription   (Followers: 2)
Computer Music Journal     Hybrid Journal   (Followers: 18)
Computer Physics Communications     Hybrid Journal   (Followers: 6)
Computer Science - Research and Development     Hybrid Journal   (Followers: 8)
Computer Science and Engineering     Open Access   (Followers: 19)
Computer Science and Information Technology     Open Access   (Followers: 13)
Computer Science Education     Hybrid Journal   (Followers: 14)
Computer Science Journal     Open Access   (Followers: 22)


Chemometrics and Intelligent Laboratory Systems
  [SJR: 0.697]   [H-I: 92]   [14 followers]
   Hybrid journal (it can contain Open Access articles)
   ISSN (Print) 0169-7439
   Published by Elsevier  [3123 journals]
  • Hierarchical mixture of linear regressions for multivariate spectroscopic
    • Abstract: Publication date: 15 March 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 174
      Author(s): Chenhao Cui, Tom Fearn
      This paper investigates the use of the hierarchical mixture of linear regressions (HMLR) and variational inference for multivariate spectroscopic calibration. The performance of HMLR is compared to the classical methods: partial least squares regression (PLSR), and PLS embedded locally weighted regression (LWR) on three different NIR datasets, including a publicly accessible one. In these tests, HMLR outperformed the other two benchmark methods. Compared to LWR, HMLR is parametric, which makes it interpretable and easy to use. In addition, HMLR provides a novel calibration scheme to build a two-tier PLS regression model automatically. This is especially useful when the investigated constituent covers a large range.

      PubDate: 2018-02-05T06:51:13Z
  • Ensemble deep kernel learning with application to quality prediction in industrial polymerization processes
    • Abstract: Publication date: 15 March 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 174
      Author(s): Yi Liu, Chao Yang, Zengliang Gao, Yuan Yao
      For predicting the melt index (MI) in industrial polymerization processes, traditional data-driven empirical models do not utilize the information in a large amount of the unlabeled data. To overcome this data-rich-but-information-poor (DRIP) problem in polymer industries, an ensemble deep kernel learning (EDKL) model is proposed. With an unsupervised learning stage, the deep belief network is adopted to extract useful information from the available data. Then, a kernel learning regression model is formulated to obtain a nonlinear relationship between the extracted features and MI values. Moreover, a bagging-based ensemble strategy is integrated into the deep kernel learning method to enhance the reliability of the prediction model. The industrial MI prediction results demonstrate the advantages of the developed EDKL model as compared with conventional supervised soft sensors (e.g., partial least squares and support vector regression) that only use the limited labeled data.

      PubDate: 2018-02-05T06:51:13Z
  • Improving prediction of extracellular matrix proteins using evolutionary information via a grey system model and asymmetric under-sampling
    • Abstract: Publication date: 15 March 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 174
      Author(s): Muhammad Kabir, Saeed Ahmad, Muhammad Iqbal, Zar Nawab Khan Swati, Zi Liu, Dong-Jun Yu
      Extracellular matrix proteins (ECMP) play a vital part in various biological functions, including cell migration, adhesion, proliferation, and differentiation. Furthermore, embryonic development, angiogenesis, gene expression, and tumor growth are also regulated by ECMP. In view of this significance, precise and reliable identification of ECMP through computational techniques is highly desirable. Although previous works made substantial improvements, accurately predicting ECMP from primary protein sequence is still at an early stage due to the rapid growth of protein samples in online databases. In the current study, a novel sequence-based prediction method called TargetECMP is proposed, which is based on evolutionary information extracted via a grey system model. It uses an asymmetric under-sampling approach to split the benchmark dataset into eleven subsets in order to avoid the class imbalance problem. A jackknife cross-validation test is performed with a support vector machine (SVM) on each subset of data, and then ensemble majority voting is used to integrate the outputs of the SVMs across the subsets. TargetECMP outperformed the existing predictor on both the benchmark dataset and an independent dataset. Owing to these prediction results, the authors expect that the analysis will provide novel insights into basic research and drug discovery in general and the function of extracellular matrix proteins in particular.

      PubDate: 2018-02-05T06:51:13Z
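
The under-sampling and voting scheme summarized in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the TargetECMP implementation: it assumes the majority class is split into disjoint chunks that are each paired with all minority samples, and it swaps in a simple nearest-centroid classifier for the SVM so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def undersample_subsets(majority, minority, n_subsets, seed=0):
    """Split the majority class into n_subsets disjoint chunks and
    pair each chunk with the full minority class (assumed scheme)."""
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    chunks = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [(chunk, minority) for chunk in chunks]

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def train_nearest_centroid(negatives, positives):
    """Stand-in base learner (NOT an SVM): label by the closer class centroid."""
    c_neg, c_pos = centroid(negatives), centroid(positives)

    def predict(x):
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        return 1 if d_pos < d_neg else 0

    return predict

def majority_vote(models, x):
    """Combine the base learners' predictions by simple majority."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical imbalanced toy data: 110 negatives, 10 positives.
rng = random.Random(1)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(110)]
minority = [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(10)]

subsets = undersample_subsets(majority, minority, n_subsets=11)
models = [train_nearest_centroid(neg, pos) for neg, pos in subsets]

print(majority_vote(models, [4.0, 4.0]))
print(majority_vote(models, [0.0, 0.0]))
```

Each base learner sees a balanced training set (10 negatives against 10 positives in this toy case), and using an odd number of subsets avoids tied votes.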
  • Multivariate comparison of classification performance measures
    • Abstract: Publication date: 15 March 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 174
      Author(s): Davide Ballabio, Francesca Grisoni, Roberto Todeschini
      The assessment of the classification performance can be based on class indices, such as sensitivity, specificity and precision, which describe the classification results achieved on each modelled class. However, in several situations, it is useful to represent the global classification performance with a single number. Therefore, several measures have been introduced in literature to deal with this assessment, accuracy being the most known and used. These metrics have been proposed to generally face binary classification tasks and can behave differently depending on the classification scenario. In this study, different global measures of classification performances are compared by means of results achieved on an extended set of real multivariate datasets. The systematic comparison is carried out through multivariate analysis. Further investigations are then derived on specific indices to understand how the presence of unbalanced classes and the number of modelled classes can influence their behaviour. Finally, this work introduces a set of benchmark values based on different random classification scenarios. These benchmark thresholds can serve as the initial criterion to accept or reject a classification model on the basis of its performance.

      PubDate: 2018-02-05T06:51:13Z
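
As a small illustration of the class indices and the global accuracy measure named in the abstract above, the sketch below computes them from a confusion matrix. The matrix values and function names are hypothetical and chosen only to show how accuracy can look favourable while the indices of a small class do not; they are not taken from the paper.

```python
def class_indices(conf, k):
    """Sensitivity, specificity and precision for class k.

    conf[i][j] is the number of samples of true class i assigned to class j.
    """
    n_classes = len(conf)
    total = sum(sum(row) for row in conf)
    tp = conf[k][k]
    fn = sum(conf[k]) - tp                               # class k samples assigned elsewhere
    fp = sum(conf[i][k] for i in range(n_classes)) - tp  # other samples assigned to k
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity, precision

def accuracy(conf):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in conf)
    return sum(conf[i][i] for i in range(len(conf))) / total

# Hypothetical unbalanced two-class result: 100 samples of class 0, 10 of class 1.
conf = [[90, 10],
        [5, 5]]

print(accuracy(conf))
print(class_indices(conf, 0))
print(class_indices(conf, 1))
```

In this toy case the accuracy is 95/110, yet only half of the class 1 samples are recognized, which is the kind of behaviour under unbalanced classes that the abstract refers to.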
  • Alternate deflation and inflation of search space in reweighted sampling: An effective variable selection approach for PLS model
    • Abstract: Publication date: 15 March 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 174
      Author(s): Biswanath Mahanty
      Based on assessment of randomized sub-model populations generated through reweighted binary matrix sampling (BMS), an innovative variable selection strategy for PLS regression models, called alternate deflation and inflation of search space (ADISS), is proposed. Normalized regression coefficients of the best PLS sub-model population are used to formulate the weight vector for reweighted BMS. Unlike most existing algorithms, ADISS alternately shifts between forward selection (inflation) and backward elimination (deflation) of the variable space, minimizing the risk of accidental loss of informative variables. Compared with methods such as competitive adaptive reweighted sampling (CARS), variable iterative space shrinkage approach (VISSA), or Monte Carlo uninformative variable elimination (MC-UVE), the proposed method showed lower cross-validation or prediction error for two different benchmark NIR data sets. ADISS frequently selects nearly the same sets of variables across multiple independent runs, which signifies stability of the output. The unsupervised execution, termination, and projection of the final variable set by the algorithm is an important advantage when considering large-scale data.

      PubDate: 2018-02-05T06:51:13Z
  • A novel convolutional neural network based approach to predictions of process dynamic time delay sequences
    • Abstract: Publication date: 15 March 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 174
      Author(s): Bo Yang, Hongguang Li
      In practice, correlated process variables always involve dynamic time-delay sequences. In this paper, a novel convolutional neural network (CNN) based approach is proposed to predict dynamic time delay sequences. First, based on calculated similarities between correlated process variables, the time delay sequence is extracted offline using a dynamic time delay analysis by elastic windows (EW-DTDA) method. In addition, through an additional correlation analysis between the time delay sequence and process variables data, the process variables that most strongly influence the time delay sequence can be identified. Finally, a deep learning CNN model between the extracted time delay sequence and these influential variables is constructed to predict the time delay sequence online. To validate the effectiveness of the proposed method, it is applied to a real distillation column for analyzing dynamic time delay sequences; the simulation results confirmed the effectiveness of the proposed approach.

      PubDate: 2018-02-05T06:51:13Z
  • Exploring the effects of sparsity constraint on the ranges of feasible solutions for resolution of GC-MS data
    • Abstract: Publication date: 15 February 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 173
      Author(s): Ahmad Mani-Varnosfaderani, Atefeh Kanginejad, Yadollah Yamini
      Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations are non-negative. Sparse non-negative matrix factorizations (SNMFs) are useful when some degrees of sparseness exist in original data, intrinsically. The present contribution is about the implementation of sparsity constraint in multivariate curve resolution-alternating least square (MCR-ALS) technique for analysis of GC-MS/LC-MS data. The GC-MS and LC-MS data are sparse in mass dimension, and implementation of SNMF techniques would be useful for analyzing such two-way chromatographic data. In this work, L1-regularization paradigm has been implemented in each iteration of the MCR-ALS algorithm in order to force the algorithm to return more sparse mass spectra. L1-regularization has been applied by using the least absolute shrinkage and selection operator (Lasso) instead of the ordinary least square. A comprehensive comparison has been made between MCR-ALS and Lasso-MCR-ALS algorithms for resolution of the simulated and real GC-MS data. The comparison has been made by calculation of the values of sum of square errors (SSE) for 5000 times repetition of both algorithms using the random mass spectra and concentration profiles as initial estimates. The results revealed that regularization of L1-norm in mass dimension prevents occurrence of overfitting in ALS algorithm and this increases the probability of finding “true solution” after the resolution procedure. Moreover, the effect of this “sparsity constraint” has been explored on the area of feasible solutions in MCR methods. The results in this work revealed that implementation of this constraint reduces the extent of rotational ambiguity in MCR solutions and can be helpful for resolution of GC-MS data with high degrees of overlapping in mass spectra and concentration profiles.

      PubDate: 2018-02-05T06:51:13Z
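
The abstract above replaces the ordinary least-squares step of each ALS iteration with a Lasso fit. As a rough sketch of why an L1 penalty returns sparser profiles, the example below solves a small Lasso problem by coordinate descent with soft-thresholding. It is a generic textbook solver under assumed notation, minimizing (1/2)·||y − Xb||² + λ·||b||₁; it is not the authors' Lasso-MCR-ALS code, and the function names and data are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso for (1/2)*||y - X b||^2 + lam*||b||_1.

    Assumes every column of X has at least one non-zero entry.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes column j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            )
            z = sum(X[i][j] ** 2 for i in range(n))
            b[j] = soft_threshold(rho, lam) / z
    return b

# Hypothetical orthonormal design, so the exact solution is easy to check by hand.
X = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
y = [3.0, 0.2, -1.0]

print(lasso_cd(X, y, lam=0.0))  # no penalty: ordinary least squares
print(lasso_cd(X, y, lam=0.5))  # L1 penalty: the small coefficient is set to zero
```

With λ = 0 the fit reproduces y, while with λ = 0.5 the small middle coefficient is driven exactly to zero and the others shrink by 0.5, which is the sparsifying effect the abstract relies on for mass spectra.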
  • Determining optimum wavelengths for leaf water content estimation from reflectance: A distance correlation approach
    • Abstract: Publication date: 15 February 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 173
      Author(s): Celestino Ordóñez, Manuel Oviedo de la Fuente, Javier Roca-Pardiñas, José Ramón Rodríguez-Pérez
      This paper proposes a method to estimate leaf water content from reflectance in four commercial vineyard varieties by estimating the local maxima of a distance correlation function. First, it applies four different functional regression models to the data and compares the models to test the viability of estimating water content from reflectance. It then applies our methodology to select a small number of wavelengths (optimum wavelengths) from the continuous spectrum, which simplifies the regression problem. Finally, it compares the results to those obtained by means of two different methods: a nonparametric kernel smoothing for variable selection in functional data and a wavelet-based weighted LASSO functional linear regression. Our approach proved to have some advantages over these two testing approaches, mainly in terms of the computing time and the lack of assumption of an underlying model. Finally, the paper concludes that estimating water content from a few wavelengths is almost equivalent to doing so using larger wavelength intervals.

      PubDate: 2018-02-05T06:51:13Z
  • Distributed feature selection: A hesitant fuzzy correlation concept for microarray high-dimensional datasets
    • Abstract: Publication date: 15 February 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 173
      Author(s): Mohammad Kazem Ebrahimpour, Mahdi Eftekhari
      Feature selection has been the problem of interest for many years. Almost all existing feature selection approaches use all training samples and features at once to select salient features. These approaches are named centralized methods; however, there are other approaches that split the training data on their dimensions in order to run each batch on different clusters (Machine) for the cases which we are dealing with ultra-big data. In this paper, a novel distributed feature selection approach based on hesitant fuzzy sets is proposed. First, datasets are horizontally (by their features) divided into some subsets according to the information energies of hesitant fuzzy sets and shuffling. Then, on each subset our HCPF (Hesitant fuzzy set based feature selection algorithm using Correlation coefficients for Partitioning Features) is applied individually. Finally, a merging procedure is employed that updates the final feature subset according to improvements in the classification accuracy. The effectiveness of the proposed method has been evaluated by twenty two state-of-the-art distributed and centralized algorithms on eight well-known microarray high dimensional datasets. The experimental results reveal that the proposed method has achieved significant results compared to the other approaches due to the statistical non-parametric Wilcoxon signed rank test. Our experiments confirm that the proposed method is effective to tackle feature selection problem in terms of classification accuracy and dimension reduction in ultra-high dimensional datasets.

      PubDate: 2018-02-05T06:51:13Z
  • Non-normalized version of the Direct Inversion in the Spectral Subspace method. A new formulation of the method without Lagrangian
    • Abstract: Publication date: 15 February 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 173
      Author(s): János Eőri, Tamás Vörös, Zsuzsanna Kolos, Gábor Pongor
      Recently, a novel method, called the Direct Inversion in the Spectral Subspace (DISS, J. Math. Chem. 47, 1085–1105 (2010)), has been developed for the quantitative (and, in a limited sense, qualitative) analysis of homogeneous chemical mixtures. The method belongs to the “supervised classification” methods because (beyond the mixture's spectrum) it needs the knowledge of the components' spectra, either experimental or calculated. Two different versions of the DISS method were established: the normalized (approximate) and the non-normalized (accurate) methods. In the present work, the revised non-normalized version is discussed in a general and elegant way together with the normalized variant. The original DISS method (with the use of a sole restriction by the Lagrange multiplier method) leads to an iterative solution of a system of linearized equations. A new formulation of the method (abbreviated as DISS_Magar) is presented without Lagrange multipliers and iteration. The complete equivalence of the DISS_Magar and the original DISS methods has been proved for both the normalized and the non-normalized versions in an elegant and simple way. The DISS method leads to a much smaller system of linear equations than that in the “multiwavelength spectroscopic method,” and it is simpler than the MCR-ALS or ICA algorithms, with the DISS requiring fewer mathematical operations. Further mathematical proofs are presented for the principles underlying the DISS method along with applications to experimental and simulated data sets.

      PubDate: 2018-02-05T06:51:13Z
  • Application of non-linear optimization for estimating Tucker3 solutions
    • Abstract: Publication date: Available online 2 February 2018
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Zohreh Shomali, Nematollah Omidikia, Mohsen Kompany-Zareh
      This contribution reports an extension to multi-way analysis of the non-linear optimization (FMIN) approach previously proposed by Tauler [Analytica Chimica Acta 595 (2007) 289–298] for estimating multivariate curve resolution solutions. TuckFMIN is presented for the calculation of Tucker3 solutions by FMIN. TuckFMIN relies heavily on the minimization of an objective function that is defined directly from the non-fulfillment of the constraints. Starting from higher-order singular value decomposition (HO-SVD) loadings, TuckFMIN represents a new approach for obtaining three proper rotation matrices that transform the HO-SVD loadings into physically and chemically meaningful solutions. Different constraints should be imposed during optimization; sparsity and trilinearity are among the new constraints, introduced for simplicity of the Tucker3 core and for fast and robust convergence in a multi-way decomposition. Simulated fluorescence data were used to evaluate the feasibility of the proposed method. To compare different initialization methods, PARAFAC-ALS and HO-SVD loadings were used. The LOF (lack of fit) from TuckFMIN modeling is always the same as the LOF of HO-SVD and lower than the LOF of PARAFAC-ALS for noise-free data sets. Unlike the PARAFAC-ALS decomposition, TuckFMIN has the best fit regardless of rank deficiency or noise level. A simulation study demonstrates that TuckFMIN can be helpful for faster convergence and for obtaining reproducible results. An experimental 3D fluorescence data set from the interaction of gold nano-particles (AuNP) with the HIV genome is successfully used to evaluate the performance of the TuckFMIN algorithm.

      PubDate: 2018-02-05T06:51:13Z
  • Better interpretable models after correcting for natural variation:
           Residual approaches examined
    • Abstract: Publication date: Available online 1 February 2018
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Mike Koeman, Jasper Engel, Jeroen Jansen, Lutgarde Buydens
      The interpretation of estimates of model parameters in terms of biological information is often just as important as the predictions of the model itself. In this study we consider the identification of metabolites in a possibly biologically heterogeneous case group that show abnormal patterns with respect to a set of (healthy) control observations. For this purpose, we filter normal (baseline) natural variation from the data by projection of the data on a control sample model: the residual approach. This step should more easily highlight the abnormal metabolites. Interpretation is, however, hindered by a problem we named the ‘residual bias’ effect, which may lead to the identification of the wrong metabolites as ‘abnormal’. This effect is related to the smearing effect. We propose to alleviate residual bias by considering a weighted average of the filtered and raw data. This way, a compromise is found between excluding irrelevant natural variation from the data and the amount of residual bias that occurs. We show for simulated and real-world examples that this compromise may outperform inspection of the raw or filtered data. The method holds promise in numerous applications such as disease diagnoses, personalized healthcare, and industrial process control.

      PubDate: 2018-02-05T06:51:13Z
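The compromise described in the abstract above, a weighted average of filtered and raw data, can be illustrated with a minimal sketch. It assumes a control model with a single unit-length loading vector and a free weight between 0 and 1; the function names, the numbers and the way the weight is parameterized are all hypothetical and are not taken from the paper.

```python
def project_out(x, loading):
    """Remove from x the component along a unit-length loading vector."""
    score = sum(xi * pi for xi, pi in zip(x, loading))
    return [xi - score * pi for xi, pi in zip(x, loading)]


def blend(raw, filtered, weight):
    """Weighted average: weight=1 gives the filtered data, weight=0 the raw data."""
    return [weight * f + (1.0 - weight) * r for r, f in zip(raw, filtered)]


# Hypothetical numbers, for illustration only.
loading = [0.6, 0.8, 0.0]   # assumed unit-norm loading from a control-sample model
case = [3.0, 4.0, 2.0]      # assumed mean-centred case observation

filtered = project_out(case, loading)    # residual after removing control variation
compromise = blend(case, filtered, 0.5)  # halfway between raw and filtered
print(filtered, compromise)
```

With a weight of 0.5 the first two variables keep half of their raw value, which shows how the blend retains some raw information that the pure residual discards.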
  • Space-filling designs for mixtures
    • Abstract: Publication date: Available online 1 February 2018
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): C. Gomes, M. Claeys-Bruno, M. Sergent
      Uniform experimental designs (Space Filling Designs) are now widely used with independent variables, particularly in numerical simulation. Many construction methods have been developed to uniformly cover the domain of the variables, but they are still rarely applied to the study of mixtures. In this article, we propose methods to build space-filling designs for mixtures that can be used to model complex phenomena in formulation. Various algorithms for constructing these designs, such as the Kennard and Stone, WSP, Strauss and Dmax algorithms, are detailed and compared with respect to uniformity criteria.

      PubDate: 2018-02-05T06:51:13Z
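As an illustration of one algorithm named in the abstract above, here is a minimal sketch of the classic Kennard and Stone selection rule applied to candidate mixture points. It is the textbook maximin procedure, not necessarily the variant the authors adapted for mixtures; the candidate grid and the function name `kennard_stone` are assumptions made for the example.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Start with the two candidates that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda ij: dist[ij[0]][ij[1]])
    selected = [i0, j0]
    remaining = set(range(n)) - {i0, j0}
    while len(selected) < n_select:
        # Add the candidate whose nearest selected point is farthest away.
        best = max(sorted(remaining),
                   key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidate set: a regular grid on the 3-component simplex,
# so that the proportions of every candidate sum to 1.
steps = 4
candidates = [(i / steps, j / steps, (steps - i - j) / steps)
              for i in range(steps + 1) for j in range(steps + 1 - i)]
design = kennard_stone(candidates, 4)
print([candidates[k] for k in design])
```

On this grid the first three selected points are the pure components (the vertices of the simplex), after which the rule fills in the point farthest from all of them.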
  • Investigation of the photodegradation profile of tamoxifen using
           spectroscopic and chromatographic analysis and multivariate curve
    • Abstract: Publication date: Available online 31 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Marc Marín-García, Giuseppina Ioele, Helena Franquet-Griell, Sílvia Lacorte, Gaetano Ragno, Romà Tauler
      Photodegradation of tamoxifen (TAM) is investigated by chemometric analysis of the multiset data obtained by LC-DAD/MS and UV spectrophotometry. A hydroalcoholic solution of TAM was submitted to photodegradation by means of a dedicated irradiation cabinet able to simulate natural irradiation sources. The irradiance conditions were stressed by increasing the irradiation power to produce a rapid photodegradation of TAM. Drug photodegradation was monitored through UV spectrophotometry and the obtained photoproducts were investigated in detail by DAD/MS liquid chromatography. Data collected from the combination of both instrumental techniques were fused and processed jointly using the Multivariate Curve Resolution–Alternating Least Squares (MCR-ALS) method. A total of five compounds were identified during the drug photodegradation and their kinetic evolution was described. The process included the isomerization of TAM to its (E)-form and the subsequent cyclization of both compounds to give two phenanthrene derivatives. A photooxygenation process can also occur, giving a benzophenone derivative as a photoproduct. The multivariate resolution method proposed in this work allowed the resolution of this complex multicomponent system by the direct analysis of the experimental data.

      PubDate: 2018-02-05T06:51:13Z
  • Bagging classification tree-based robust variable selection for radial
           basis function network modeling in metabonomics data analysis
    • Abstract: Publication date: Available online 2 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Hui Gu, Lu Xu, Meng-Ying Tu, Hai-Yan Fu, Yan-Fang Cui, Yan-Ping Zhou
      Complex datasets are routinely produced by modern analytical platforms in metabonomics surveys, which poses enormous challenges to existing chemometrics tools. In the current study, inspired by the ability of the classification tree (CT) to automatically select the most informative variables and measure their importance, the potential of bagging to improve the reliability and robustness of a single model, and the promising modeling performance of the radial basis function network (RBFN), we designed a new chemometrics tool, i.e., bagging classification tree-radial basis function network (BAGCT-RBFN), for metabonomics data analysis. In BAGCT-RBFN, a series of parallel CT models was first established based on the idea of bagging (BAGCT). The informative variables can be identified by inspecting the variable importance values over all CTs in BAGCT. Then, RBFN was utilized to relate the informative variables identified by BAGCT to the classification memberships. To demonstrate the practical application of BAGCT-RBFN in metabonomics, it was applied to an H-1 NMR-based metabonomics dataset associated with lung cancer. The results showed that BAGCT-RBFN can reliably find a shortlist of discriminatory variables while attaining more satisfactory classification accuracy than traditional CT and RBFN.

      PubDate: 2018-02-05T06:51:13Z
  • A new active learning strategy for soft sensor modeling based on feature
           reconstruction and uncertainty evaluation
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Qifeng Tang, Dewei Li, Yugeng Xi
      Soft sensor techniques have been increasingly applied in chemical processes to predict key variables online. Traditionally, soft sensor modeling needs labeled training samples that contain both subsidiary and key variables. However, labeled samples are limited because key variables are always difficult to obtain online. Thus, a novel active learning (AL) strategy is proposed to reduce the labeling cost, which iteratively selects the most informative candidates by jointly evaluating two criteria: (i) representativeness and (ii) uncertainty. The representative samples are defined as those whose features best reconstruct the global features extracted by kernel principal component analysis (KPCA), while the uncertain samples are selected by the variance estimated from a Gaussian process regression (GPR) model. An optimization scheme is then introduced to solve the optimization problem derived from the two sampling criteria. Three industrial application case studies show that the proposed AL strategy exhibits a good capability to select the most informative samples, which can improve the performance of the soft sensor.

      PubDate: 2017-12-27T12:43:37Z
  • Active learning algorithm can establish classifier of blueberry damage
           with very small training dataset using hyperspectral transmittance data
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Meng-Han Hu, Yu Zhao, Guang-Tao Zhai
      The aim of this study was to evaluate the performance of an active learning algorithm for detecting blueberry damage using hyperspectral transmittance data at a very low labeling cost. A hyperspectral transmittance imaging system was first applied to collect the hyperspectral transmittance data of blueberries. Subsequently, the mean hyperspectral transmittance data were extracted. With only 9 labeled berries, estimated error reduction achieved an accuracy, precision and recall of 0.87, 0.93 and 0.78, respectively, and it consistently improved or maintained the performance of the classifier for the remainder of the queries. Compared with the SOM and SVM models, the classifier based on estimated error reduction also provided higher accuracy, precision and recall with far fewer labeled samples. Active learning algorithms can be extended to large-scale applications in which labeled samples are very limited or expensive and the models need to be transferred frequently. In our case, owing to the significant biological variation among blueberry samples, the classifier required frequent updates in practical applications, and the active learning algorithms could markedly reduce the labeling effort during model updating.

      PubDate: 2017-12-27T12:43:37Z
  • Estimating nitrogen status of rice canopy using hyperspectral reflectance
           combined with BPSO-SVR in cold region
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Kezhu Tan, Shuwen Wang, Yuzhu Song, Yao Liu, Zhenping Gong
      Because of the special soil background, climate conditions and growth period in cold regions, it is necessary to find adaptable models to assess rice growth and nutrient status. This study was conducted to estimate the nitrogen content of rice at different growth stages using the hyperspectral reflectance of the rice canopy, toward the development of precision nitrogen status monitoring. The field experiment was undertaken by applying four levels of nitrogen (N) treatment to the cultivar ‘Daohuaxiang’. The hyperspectral reflectance of the rice canopy at the tillering stage, jointing stage and heading stage was captured using a hyperspectral imaging system in the range of 372–1038 nm covering 128 wavebands. The average spectral reflectance was extracted from five regions of interest (ROIs) of each sample and a total of 192 groups of spectral reflectance data were obtained. Then, eight vegetation indices were calculated from the spectral reflectance. A machine learning method, termed “support vector regression based on binary particle swarm optimization algorithm (BPSO-SVR)”, was proposed to predict nitrogen content. The results were achieved by selecting the best subset of input variables and optimizing the parameters ‘c’ and ‘g’ of SVR through the BPSO-SVR method. In this work, we also established traditional prediction models such as partial least square regression (PLSR), principal components regression (PCR) and GA-BPANN. The predictive power of these regression models was compared using R² (coefficient of determination) and RMSE (root mean square error) of the calibration set and testing set. The newly proposed ‘BPSO-SVR’ method yielded an excellent R² (0.913–0.949) and a smaller RMSE (0.055–0.127) for fitting the nitrogen concentration of the rice canopy over three growth stages. The results showed that the method proposed in this paper for predicting the N content of the rice canopy at different growth stages has potential for nitrogen status monitoring in cold regions.

      PubDate: 2017-12-27T12:43:37Z
  • A visualization approach for unknown fault diagnosis
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Zhangming He, Zhiwen Chen, Haiyin Zhou, Dayi Wang, Yan Xing, Jiongqi Wang
      Since visualization can provide useful information to control engineers about the state of the process, visualization has become an indispensable item in the condition monitoring toolbox. The objective of this paper is to propose a visualization approach and apply it to unknown fault isolation. First, the data-driven parity space (PS) technique is used to identify the stable kernel representation (SKR) of a linear time invariant dynamic system. Then, the signature directions (SDs) and the current directions (CDs) are defined, based on which the detection and isolation rules are proposed for diagnosing both the known faults (KFs) and the unknown faults (UFs). Finally, a visualization approach is provided for projecting high-dimensional fault information onto a lower dimensional and drawable space. This approach maintains the fault isolability so that engineers will be able to diagnose the faults more reasonably. The proposed visualization approach is applied to a vertical take-off and landing (VTOL) aircraft model and a glass tube manufacturing process.

      PubDate: 2017-12-27T12:43:37Z
  • Using spectral and textural data extracted from hyperspectral near
           infrared spectroscopy imaging to discriminate between processed pork,
           poultry and fish proteins
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Cristóbal Garrido-Novell, Ana Garrido-Varo, Dolores Pérez-Marín, José Emilio Guerrero
      This paper proposes a method based on Near Infrared Hyperspectral Imaging for discriminating between pork, poultry and fish species in processed animal protein meals. First, an investigation was conducted into the possible importance of incorporating into the discrimination models anomalous (or singular) pixels as probable discriminant pixels for each species. Subsequently, partial least squares discriminant analysis (PLS-DA) spectral and textural models were constructed. The former reflected the spectral information (spectral trace), and the latter the spatial (textural trace) information based on different groups of features. Finally, the spectral and textural information was integrated using classification trees, to ascertain whether the combined use of such information represented an improvement in accuracy in the effort to discriminate between species. The method was applied to a set of 40 pork, 40 poultry and 40 fish meals analysed in the 1000–1700 nm range. Models were then tested using an external validation set comprising 45 samples (15 pork, 15 poultry and 15 fish meals). The results demonstrated that combining spectral and appearance characteristics in a single classification tree generated better classification results for the samples used in the study (92% correct) than when using the PLS-DA spectral model (83% correct).

      PubDate: 2017-12-27T12:43:37Z
  • Artificially generated near-infrared spectral data for classification
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Vilma Sem, Jana Kolar, Lara Lusa
      Near-Infrared Spectroscopy has become a widely used analytical technique in different research fields due to its non-destructiveness and low cost. The spectra are rich in information but extremely complex; therefore their analysis necessitates the use of advanced statistical methods. The empirical properties of the statistical methods can be assessed using artificially generated data that resemble real Near-Infrared Spectroscopy data. In this paper we propose a new data generation approach (ABS) that takes into account the theoretical knowledge about the near-infrared absorption of the functional groups. The proposed method is compared to real data and to a simpler data generation method, which simulates the data from a multivariate normal distribution whose parameters are estimated from real data (MVNorig). The comparison between real data and the data generation approaches is based on a class-imbalanced classification problem using linear discriminant analysis, classification trees and support vector machines. Both simulation approaches generated spectra with a good resemblance to real data, MVNorig performing slightly better than ABS; using real and simulated data we would have reached similar conclusions about the class-imbalance problem in classification. Both methods can be used to artificially generate near-infrared spectra. The method based on multivariate normal distribution can be used when a large number of real data spectra is available, while the appropriateness of the results of the ABS method depends on the exactness of functional group near-infrared absorption knowledge.

      PubDate: 2017-12-27T12:43:37Z
  • Steel surface defect classification using multiple hyper-spheres support
           vector machine with additional information
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Rongfen Gong, Chengdong Wu, Maoxiang Chu
      A novel multiple hyper-spheres support vector machine with additional information (MHSVM+) is proposed for multi-class steel surface defect classification. Originating from the binary twin hyper-spheres support vector machine, MHSVM+ uses hyper-spheres to solve the classification decision problem. Unlike that method, MHSVM+ is a multi-class classifier that builds a corresponding hyper-sphere for each type of defect dataset. Moreover, MHSVM+ introduces a learning paradigm using additional information, which means it can learn additional information hidden in the defect dataset. Two types of additional information are provided: local neighbor information and local density information. Local neighbor information contains local classification results for defect samples. Local density information is used to capture label noise, isolated samples and important samples in the defect dataset. Both types of additional information are introduced into the MHSVM+ model. Finally, the MHSVM+ classifier is used to classify six types of steel surface defects. Experimental results show that the novel multi-class classifier has perfect classification accuracy for the defect dataset, especially the corrupted defect dataset.

      PubDate: 2017-12-27T12:43:37Z
  • A new reconstruction-based auto-associative neural network for fault
           diagnosis in nonlinear systems
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Shaojun Ren, Fengqi Si, Jianxin Zhou, Zongliang Qiao, Yuanlin Cheng
      The auto-associative neural network (AANN) is a typical nonlinear principal component analysis method, which is widely used in industry for fault diagnosis purposes, especially in nonlinear systems. However, the basic AANN often suffers from “smearing effects” problems that may lead to misdiagnosis, particularly with regard to complex faults involving multiple variables. In this work, a new reconstruction-based AANN (RBAANN) method is proposed to enhance the capacity of fault diagnosis. In RBAANN, a generic derivative equation is developed to investigate the effects of AANN model inputs on the prediction error between model inputs and outputs. Based on the derivative equation, the reconstruction-based index for single or multiple variables, which is defined as the minimum prediction error, is obtained by tuning the corresponding model inputs iteratively. However, without prior knowledge of the real faulty variables, all the possible variable sets need to be evaluated by the reconstruction-based index, and this may result in an exhaustive search and cause a huge computational burden. Thus, a branch and bound algorithm is introduced into RBAANN to solve the variable selection problem. Finally, an efficient fault diagnosis strategy integrating RBAANN and the branch and bound algorithm (BAB-RBAANN) is implemented to further pinpoint the source of the detected faults. This BAB-RBAANN method can efficiently handle both single- and multiple-variable faults for nonlinear systems without prior knowledge. The effectiveness of the proposed methods is evaluated on a validation example and an industrial example. Comparisons with other methods, including principal component analysis techniques, are also presented.

      PubDate: 2017-12-27T12:43:37Z
  • A weighted heteroscedastic Gaussian Process Modelling via particle swarm
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Xiaodan Hong, Yongsheng Ding, Lihong Ren, Lei Chen, Biao Huang
      In many chemical engineering applications, it is often difficult to get accurate first-principle models because of complexity of modern processes. Even if it is possible to do so, it is often time consuming and computationally expensive. Hence, there is a growing need to develop data-driven models. Gaussian process regression (GPR) model has been extensively applied in data-based modelling due to its good adaptability to deal with high dimensional, small samples, and nonlinear problems. The standard GPR algorithm assumes constant noise power throughout the sampling process. However, in process systems, the observation noise often varies so that different sample points are corrupted by different degrees of noise. Under these circumstances, the standard GPR algorithm may not work properly. To model Gaussian process with heteroscedastic noise, this paper introduces a weighting strategy into the standard GPR algorithm, and proposes three weighted GPR algorithms: the clustered GPR (C-GPR) algorithm, the partial weighted GPR (PW-GPR) algorithm and the weighted GPR (W-GPR) algorithm. Different from the standard GPR algorithm, three weighted algorithms put the weight on sampled data by calculating the noise variance for each data point. In addition, in order to optimize the proposed algorithms, this paper utilizes the particle swarm optimization (PSO) algorithm to estimate hyper-parameters of the GPR model, instead of using the traditional conjugate gradient (CG) method. The effectiveness of the three weighted GPR algorithms is verified by means of two numerical examples and a wet spinning coagulation process. Extensive simulation results demonstrate that the proposed algorithms optimized by the PSO algorithm can improve prediction accuracy of the GPR model.

      PubDate: 2017-12-27T12:43:37Z
  • Identifying animal species in NIR hyperspectral images of processed animal
           proteins (PAPs): Comparison of multivariate techniques
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Cecilia Riccioli, Dolores Pérez-Marín, Ana Garrido-Varo
      The use of PAPs in animal feed has several advantages over other feed ingredients, but requires rigorous and accurate control mechanisms that ensure the absence of ruminant meal. In order to differentiate between animal species while simultaneously offering the capacity to inspect PAPs in large volumes, a hyperspectral imaging (HSI) system operating in the NIR spectral range is proposed. This study investigates the sensitivity, specificity and other parameters with which HSI can discriminate between different animal species (ruminants, swine and poultry), making use of various classification methods. Diffuse reflectance spectra were acquired from 125 rendered meal samples in the 1000–1700 nm wavelength range; measured PAPs included particles of scale, hair, feather, blood, grease, skin, muscle and bone from both ruminant and non-ruminant animals, obtained in a rendering plant. Various classification methods were then applied to the dataset to determine the accuracy with which different animal species could be discriminated from each other. Support Vector Machine classification performed best in discriminating between animal species, with a sensitivity and specificity of around 90% and a Matthews correlation coefficient of around 0.7 for non-ruminant species and higher than 0.95 for ruminant species. Other methods, such as PLS-DA and Subspace Discriminant, also produced acceptable results and required less computational time. This study showed that spectral analysis of PAPs, based on diffuse reflectance spectroscopy, is a promising technique for differentiating between ruminant species and other terrestrial animal species. The technique may therefore offer accurate and fast analysis of large volumes of feed products, a necessary prerequisite for the lifting of the EU ban on non-ruminant processed animal proteins.

      PubDate: 2017-12-27T12:43:37Z
  • Finding the optimal time resolution for batch-end quality prediction: MRQP
           – A framework for multi-resolution quality prediction
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Geert Gins, Jan F.M. Van Impe, Marco S. Reis
      Batch-end quality prediction is of paramount importance in industries producing high added-value products. However, all methods available to date use only raw data at its native resolution. This paper proposes a new multi-resolution quality prediction (MRQP) methodology for batch-end quality, which exploits structured correlation in both the time and variables dimensions. The implementation of this methodology results in a more parsimonious and robust model structure, with a predictive performance bounded to be at least as good as the current standard approach. The implementation of MRQP is illustrated in three different case studies, where improvements over the standard single-resolution approach were found to be in the range of 10%–50%. From an interpretation standpoint, multi-resolution models are more robust with respect to the selection of too many predictors, facilitating the identification of key process variables, and providing information on the process time scales that influence final product quality, which can be further exploited for diagnosis, control, and optimization.

      PubDate: 2017-12-27T12:43:37Z
  • In-line Vis-NIR spectral analysis for the column chromatographic processes
           of Ginkgo biloba part I: End-point determination of the elution process
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Wenlong Li, Xu Yan, Houliu Chen, Haibin Qu
      A Vis-NIR (visible-near infrared) spectroscopy based method for the end-point determination of the column chromatographic processes of Ginkgo biloba was established. In-line collected Vis-NIR spectra were used in conjunction with various multivariate data analysis methods, including moving block standard deviation (MBSD), principal component scores distance analysis (PC-SDA), and principal component analysis-moving block standard deviation (PCA-MBSD) for the end-point determination of the elution process. After comparison with results validated by high performance liquid chromatography (HPLC), PCA-MBSD was chosen as the most suitable method. The presented method provides an alternative approach to end-point determination in column chromatographic processes, which has traditionally depended on the workers' operating experience.

      PubDate: 2017-12-27T12:43:37Z
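The moving block standard deviation (MBSD) mentioned in the abstract above can be sketched as follows. This uses the common definition (the standard deviation at each wavelength within a block of consecutive spectra, averaged over wavelengths); the block size, threshold and spectra are hypothetical, and the paper's exact settings are not given in the abstract.

```python
import statistics


def mbsd(spectra, block_size):
    """Mean over wavelengths of the per-wavelength std within each moving block."""
    values = []
    for start in range(len(spectra) - block_size + 1):
        block = spectra[start:start + block_size]
        # zip(*block) yields, for each wavelength, its values across the block.
        per_wavelength = [statistics.stdev(column) for column in zip(*block)]
        values.append(statistics.fmean(per_wavelength))
    return values


def first_below(values, threshold):
    """Index of the first block whose MBSD drops below the threshold, or None."""
    for index, value in enumerate(values):
        if value < threshold:
            return index
    return None


# Hypothetical two-wavelength spectra that change and then stabilise.
spectra = [[1.0, 2.0], [0.8, 1.6], [0.6, 1.2],
           [0.5, 1.0], [0.5, 1.0], [0.5, 1.0]]
profile = mbsd(spectra, block_size=3)
end_point_block = first_below(profile, threshold=0.05)
print(profile, end_point_block)
```

A falling MBSD indicates that successive spectra have stopped changing, which is the signal used to call the end point of a process.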
  • Support vector regression coupled with wavelength selection as a robust
           analytical method
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Felipe Soares, Michel J. Anzanello
      This paper assesses support vector regression (SVR) as a robust alternative to partial least squares (PLS) in multivariate calibration using twelve public domain NIR spectroscopy datasets. It also proposes the use of the support vector regression – recursive feature elimination (SVR-RFE) algorithm to select the most informative wavelengths for SVR models. Models based on full spectra were built using SVR and PLS, while wavelength selection was carried out using SVR-RFE, interval PLS (iPLS), backward interval PLS (biPLS), synergy interval PLS (siPLS), and successive projection algorithm PLS (SPA-PLS). The prediction performance of the tested methods was measured by means of the root mean squared error (RMSE), index of agreement (d-index) and R² on the test set. SVR-based models yielded the best results in 8 out of 12 datasets, 4 of them using full spectra and 4 relying on SVR-RFE selected wavelengths. Statistical comparison of the wavelength selection algorithms was carried out using the Friedman test, which pointed to SVR-RFE as a competitive technique when compared to the other algorithms. This study revealed SVR as a robust alternative to PLS, especially when SVR-RFE is employed for wavelength selection.

      PubDate: 2017-12-27T12:43:37Z
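The three figures of merit named in the abstract above can be computed as in the sketch below. It assumes Willmott's form of the index of agreement and the coefficient-of-determination form of R²; the abstract does not state which variants the authors used, and the data are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def index_of_agreement(obs, pred):
    """Willmott's d: 1 - sum((p-o)^2) / sum((|p-mean(o)| + |o-mean(o)|)^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical test-set reference values and predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(rmse(observed, predicted),
      index_of_agreement(observed, predicted),
      r_squared(observed, predicted))
```

RMSE is expressed in the units of the response, whereas the d-index and R² are dimensionless, so the three are usually reported together.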
  • Deep-learning-based regression model and hyperspectral imaging for rapid
           detection of nitrogen concentration in oilseed rape (Brassica napus L.)
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Xinjie Yu, Huanda Lu, Qiyu Liu
      A deep-learning-based regression model composed of stacked auto-encoders (SAE) and a fully-connected neural network (FNN) was used for the detection and quantification of nitrogen (N) concentration in oilseed rape leaf. SAE was applied to extract deep spectral features from visible and near-infrared (380–1030 nm) hyperspectral images of oilseed rape leaf, and these features were then used as input data for the FNN to predict N concentration. The SAE-FNN model achieved reasonable performance with R²P = 0.903, RMSEP = 0.307% and RPDP = 3.238 for N concentration. Results confirmed the possibility of rapid and nondestructive detection of N concentration in oilseed rape leaf by combining the hyperspectral imaging technique with a deep learning method.

      PubDate: 2017-12-27T12:43:37Z
  • Evaluation of diagnosis methods in PCA-based Multivariate Statistical
           Process Control
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Marta Fuentes-García, Gabriel Maciá-Fernández, José Camacho
      Multivariate Statistical Process Control (MSPC) based on Principal Component Analysis (PCA) is a well-known methodology in chemometrics that is aimed at testing whether an industrial process is under Normal Operation Conditions (NOC). As a part of the methodology, once an anomalous behaviour is detected, the root causes need to be diagnosed to troubleshoot the problem and/or avoid it in the future. While there have been a number of developments in diagnosis in the past decades, no sound method for comparing existing approaches has been proposed. In this paper, we propose such a procedure and use it to compare several diagnosis methods using randomly simulated data and from realistic data sources. This is a general comparative approach that takes into account factors that have not previously been considered in the literature. The results show that univariate diagnosis is more reliable than its multivariate counterpart.

      PubDate: 2017-12-27T12:43:37Z
  • Identification of robust probabilistic slow feature regression model for
           process data contaminated with outliers
    • Abstract: Publication date: Available online 24 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Lei Fan, Hariprasad Kodamana, Biao Huang
      Modeling of high-dimensional dynamic processes is considered a challenging task. In this regard, probabilistic Slow Feature Analysis (PSFA), a dynamic latent variable model, has proven to be a useful tool that extracts temporally correlated dynamic features from high-dimensional raw measurements. The extracted latent Slow Features (SFs) can capture process variations which are useful in developing dynamic models. Industrial data are often affected by outliers, and modeling such data could result in inferior prediction performance. To deal with such scenarios, we propose a robust PSFA (RPSFA) based regression model that models outliers in the observation data using the Student's t-distribution. To estimate the parameters in RPSFA and to extract a reduced number of SFs, we employ the Expectation-Maximization (EM) algorithm under the Maximum Likelihood Estimation (MLE) framework, considering SFs as hidden variables. To estimate the hidden SFs we propose a weighted gain Kalman filter based approach, as the Normal distribution assumption of the observations is no longer valid. The validity and merits of the proposed approach are demonstrated through a simulated example, an industrial application and an experimental study.

      PubDate: 2017-12-27T12:43:37Z
  • MVC3_GUI: A MATLAB graphical user interface for third-order multivariate
           calibration. An upgrade including new multi-way models
    • Abstract: Publication date: Available online 24 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Sarmento J. Mazivila, Santiago A. Bortolato, Alejandro C. Olivieri
      An upgrade is presented of a MATLAB graphical user interface toolbox for implementing third-order/four-way multivariate calibration models. The new Multivariate Calibration 3 (MVC3_GUI) incorporates new models and features that make it a very versatile tool for four-way data processing. In addition to the quadrilinear decomposition (QLD) and latent structure models based on partial least-squares regression and residual trilinearization, included in the earlier software version, non-QLD models are now available. The latter include extended multivariate curve resolution-alternating least-squares (MCR-ALS), augmented parallel factor analysis (Augmented PARAFAC) and PARAFAC2. The software is presented as both a set of MATLAB codes and as a standalone program. MVC3_GUI accepts a variety of ASCII data for input. Appropriate working sensor regions in the different data modes can be selected. Model development and its subsequent application to unknown samples are straightforward from the interface. Prediction results are provided along with analytical figures of merit and standard concentration errors, as calculated by modern concepts of uncertainty propagation. Different examples of use of this updated interface are given in this work.

      PubDate: 2017-12-27T12:43:37Z
  • Applying Tchebichef image moments to quantitative analysis of the
           components in complex samples based on raw NIR spectra
    • Abstract: Publication date: Available online 22 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Jin Jin Liu, Bao Qiong Li, Hong Lin Zhai, Xue Wang, Min Li Xu
      The interferences of irrelevant information, overlapping and shifts of peaks appear mostly in near infrared (NIR) spectroscopy, especially in complex samples, which seriously impede the accurate quantification. In this work, the features of raw NIR spectra represented by Tchebichef image moments (TMs) were employed to partial least square (PLS) modeling. The proposed strategy was applied to quantitative analysis of the components in complex samples based on their raw NIR spectra, and the obtained models were strictly evaluated by their statistical parameters. Our study indicates that the information in raw NIR spectra can be reorganized and represented by TM method owing to its powerful multi-resolution capability and inherent invariance property, which is beneficial to extract the important information of target components. Compared with the PLS and interval partial least square (iPLS) method, the proposed approach could provide accurate and reliable analytical results. Therefore, as an efficient pretreatment method, TMs can be used to improve the analytical precision of PLS based on conventional NIR spectra.

      PubDate: 2017-12-27T12:43:37Z
  • Call for nominations: 2018 Chemometrics and Intelligent Laboratory Systems
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171

      PubDate: 2017-12-27T12:43:37Z
  • A model-based data mining approach for determining the domain of validity
           of approximated models
    • Abstract: Publication date: Available online 16 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Marco Quaglio, Eric S. Fraga, Enhong Cao, Asterios Gavriilidis, Federico Galvanin
      Parametric models derived from simplifying modelling assumptions give an approximated description of the physical system under study. The value of an approximated model depends on the consciousness of its descriptive limits and on the precise estimation of its parameters. In this manuscript, a framework for identifying the model domain of validity for the simplifying model hypotheses is presented. A model-based data mining method for parameter estimation is proposed as central block to classify the observed experimental conditions as compatible or incompatible with the approximated model. A nonlinear support vector classifier is then trained on the classified (observed) experimental conditions to identify a decision function for quantifying the expected model reliability in unexplored regions of the experimental design space. The proposed approach is employed for determining the domain of reliability for a simplified kinetic model of methanol oxidation on silver catalyst.

      PubDate: 2017-11-19T00:54:29Z
  • PRFFECT: A versatile tool for spectroscopists
    • Abstract: Publication date: Available online 11 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Benjamin R. Smith, Matthew J. Baker, David S. Palmer
      PRFFECT is a computer program to aid with spectral preprocessing and the development of classification models. Via a simple text interface, PRFFECT allows users to select wavenumber ranges, perform spectral preprocessing, carry out data partitioning (into training and testing datasets), run a Random Forest classification, compute statistical results, and identify important descriptors for the classification. The preprocessing options provided fall into four categories: binning, smoothing, normalisation, and baseline correction. The program outputs a wide variety of useful data, including classification metrics and graphs showing the importance of individual wavenumbers to the classification models. As a proof of concept, PRFFECT has been benchmarked on preprocessing and classification of four food analysis datasets. Sensitivities and specificities above 0.92 were obtained in all cases. The results show that different preprocessing procedures are optimal for different datasets. The PRFFECT software is freely available to the community via GitHub. Link:

      PubDate: 2017-11-16T00:40:41Z
  • Identification of hindered internal rotational mode for complex chemical
           species: A data mining approach with multivariate logistic regression
    • Abstract: Publication date: Available online 8 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Triet H.M. Le, Tung T. Tran, Lam K. Huynh
      Thermodynamic properties are essential to understand and describe many chemical/biological processes in the real environment. To obtain correct thermodynamic data of chemical species for a wide range of temperatures, a rigorous Hindered Internal Rotation (HIR) treatment must be considered. Such a treatment requires detailed information about the internal rotation (i.e., rotational axis, group, frequency and symmetry and hindrance potential). However, it is very tedious, even prone-to-error, for chemists to prepare the input parameters for such a treatment. Among the HIR parameters, the rotational frequency (or mode) is the most difficult element due to the complex molecular structure and mixing vibrational modes of chemical species. Recently, a rule-based framework has been proposed to help chemists with this tedious process (Le et al., Comput. Theor. Chem., 2017, 61). This approach has been demonstrated to work well for simple species; however, it still lacked the ability to handle more complex cases. Therefore, in this study, a data mining approach is proposed to overcome the challenges of the previous algorithm. Within this framework, the HIR pattern was found using the features extracted from existing data provided by chemists. More specifically, multivariate logistic regression was implemented to analyze the chemical data to better predict the rotational frequency (mode) of chemical species as well as to highlight the effect of each attribute of the rotation. The experimental results were demonstrated to be more accurate than the previous study in terms of both accuracy and completeness. It also gives meaningful insights into the HIR itself. The proposed approach of this research will be integrated into MSMC-GUI ( to provide chemists with both an interactive and robust tool to prepare the data for their thermodynamic calculations on-the-fly.

      PubDate: 2017-11-08T12:51:16Z
  • Robust analysis of spectra with strong background signals by
           First-Derivative Indirect Hard Modeling (FD-IHM)
    • Abstract: Publication date: Available online 7 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): P. Beumers, D. Engel, T. Brands, H.J. Koß, A. Bardow
      Spectral analysis of mixtures often faces challenges due to nonlinear effects such as peak shifts or strong background signals. Nonlinear mixture effects can be effectively treated by the Indirect Hard Modeling (IHM) Method. In IHM, mixture effects are captured by adapting hard models of pure component spectra when fitting a mixture model. However, IHM requires a suitable background treatment, which can become laborious. Background signals do not arise from the components of interest but often superimpose their spectra. In statistical methods for spectral analysis, background treatment is often conducted by derivatives of a spectrum. Derivatives effectively damp broad background signals. Standard IHM is not applicable to derivatives of spectra as the negative parts of a derivative spectrum cannot be modeled by pseudo-Voigt peaks which are always positive. In this work, we propose First-Derivative Indirect Hard Modeling (FD-IHM). FD-IHM uses the analytical derivatives of the peak functions. The analytical derivatives are fitted to numerical derivatives of the spectra. Thereby, we combine background treatment by first derivatives with the IHM method to treat nonlinear effects. The presented FD-IHM is validated using Raman spectra of ethanol/acetone mixtures. To introduce a variety of background signals, we used fluorescence dye, scattering bodies (yeast) and various background light sources. Classical IHM allows us to predict the test sets with a root-mean-square error of prediction (RMSEP) ranging from 0.60 wt% to 2.06 wt%, but careful manual background treatment had to be applied. With FD-IHM, we reduce the RMSEP error by 21%–73% without any background treatment. Thus, FD-IHM allows for both, efficient and accurate analysis of spectra with large background signals.

      PubDate: 2017-11-08T12:51:16Z
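      The core idea of the FD-IHM abstract above, using the analytical derivative of a peak function in place of the peak itself, can be illustrated with a single pseudo-Voigt peak. This is a minimal sketch, not the authors' implementation: the parameterisation (height, centre, half width at half maximum, Lorentzian fraction `eta`) and all function names are assumptions made for illustration.

```python
import math


def pseudo_voigt(x, height, center, hwhm, eta):
    """Pseudo-Voigt peak: weighted sum of a Lorentzian and a Gaussian.

    hwhm is the half width at half maximum shared by both parts and
    eta in [0, 1] is the Lorentzian fraction (assumed parameterisation).
    """
    dx = x - center
    lorentz = hwhm ** 2 / (dx ** 2 + hwhm ** 2)
    gauss = math.exp(-math.log(2) * dx ** 2 / hwhm ** 2)
    return height * (eta * lorentz + (1 - eta) * gauss)


def pseudo_voigt_derivative(x, height, center, hwhm, eta):
    """Analytical first derivative of pseudo_voigt with respect to x."""
    dx = x - center
    d_lorentz = -2 * dx * hwhm ** 2 / (dx ** 2 + hwhm ** 2) ** 2
    gauss = math.exp(-math.log(2) * dx ** 2 / hwhm ** 2)
    d_gauss = -2 * math.log(2) * dx / hwhm ** 2 * gauss
    return height * (eta * d_lorentz + (1 - eta) * d_gauss)


def central_difference(f, x, h=1e-5):
    """Numerical first derivative, standing in for a derivative spectrum."""
    return (f(x + h) - f(x - h)) / (2 * h)
```

      The derivative is negative on the high-wavenumber side of the peak, which is why an always-positive peak function cannot describe a derivative spectrum directly. A constant background offset added to the peak drops out of the numerical derivative, so the analytical derivative can be fitted without a separate baseline term for that offset.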
  • Dynamic hypersphere based support vector data description for batch
           process monitoring
    • Abstract: Publication date: Available online 6 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Jianlin Wang, Weimin Liu, Kepeng Qiu, Tao Yu, Liqiang Zhao
      Support Vector Data Description (SVDD) is an efficient monitoring method that captures the spherically shaped boundary around the normal batch data and sets the control limit related to support vectors (SVs) for online monitoring. Using nonlinear transformation functions, SVDD constructs an irregular hypersphere in high dimensional space. When the batch process is complicated, the accuracy of monitoring will decrease with traditional control limit of SVDD. In this paper, dynamic hypersphere based support vector data description (DH-SVDD) is proposed for batch process monitoring. In training process, static hypersphere is built by the important SVs of training dataset. In testing process, dynamic hypersphere is built by the important SVs of combined dataset with current test sample and training dataset. If there is a significant change between these two hyperspheres, it means that the current test sample is an outlier. Thus, DH-SVDD has a relatively high monitoring accuracy because it fully considers relationship between the current test sample and the historical training dataset in high dimensional space. Comparison is conducted between the proposed DH-SVDD and traditional methods such as K-chart-SVDD, max limit SVDD and validation limit SVDD. The effectiveness of the DH-SVDD is also verified by a semiconductor etch process and a fed-batch penicillin fermentation process.

      PubDate: 2017-11-08T12:51:16Z
  • HYPER-Tools. A graphical user-friendly interface for multivariate and
           hyperspectral image analysis
    • Abstract: Publication date: Available online 4 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): José Manuel Amigo, Nabiallah Mobaraki
      HYPER-Tools is a new graphical user-friendly interface (GUI) especially designed for the analysis of multivariate and hyperspectral images. This easy-to-use interface works under Matlab environment and integrates fundamental types of spectral and spatial pre-processing methods as well as the main chemometric tools (exploratory data analysis, clustering, regression, and classification) for multivariate and hyperspectral image analysis. The main feature of HYPER-Tools is the powerful visualization tools implemented and the interaction of the user with the interface, meaning that the user does barely need Matlab skill to use it. Together with the GUI several tutorials and videos are provided in the official website ( showing the working procedure of HYPER-Tools step by step in different situations.

      PubDate: 2017-11-08T12:51:16Z
  • ChemBCPP: A freely available web server for calculating commonly used
           physicochemical properties
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Jie Dong, Ning-Ning Wang, Ke-Yi Liu, Min-Feng Zhu, Yong-Huan Yun, Wen-Bin Zeng, Alex F. Chen, Dong-Sheng Cao
      The behavior of a chemical in human or environment mostly depends on its several key physicochemical properties, such as aqueous solubility, octanol-water partition coefficient (logP), boiling point (BP), density, flash point (FP), viscosity, surface tension (ST), vapor pressure (VP) and melting point (MP). Commonly, these properties are important for the environmental sciences and drug discovery, such as the absorption, distribution, metabolism, excretion, and toxicity (ADMET) for medicinal compounds and the common risk assessment for problematic chemicals. At present, the quantitative structure-property relationship (QSPR) model was widely applied to save time and money investment in the early stage of chemical research. Although some satisfactory models were already obtained, most of them are not available for the public researchers and thus cannot be directly applied to practical research projects. Herein, in this study, we developed a user-friendly web server named ChemBCPP that can be used to predict aforementioned 8 important physicochemical properties and calculate several other commonly used properties just by uploading a molecular structure or file. In addition, for a new chemical entity, users can not only get its predicted value but also obtain a leverage value (h value) which can be used to evaluate the reliability of predictive result. We believe that ChemBCPP could be widely applied in environmental science, chemical synthesis and drug ADMET fields with the demand for high quality of chemical properties. ChemBCPP could be freely available via

      PubDate: 2017-10-17T23:25:58Z
  • Calibration of a chemometric model by using a mathematical process model
           instead of offline measurements in case of a H. polymorpha cultivation
    • Abstract: Publication date: Available online 13 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): O. Paquet-Durand, T. Ladner, J. Büchs, B. Hitzmann
      Data-driven regression models such as Principal Component Regression (PCR) or Partial Least Squares Regression (PLS) in combination with spectroscopic methods are increasingly applied in bioprocess monitoring. However, as the name “data driven” implies, the calibration of these regression models requires a large amount of predictor (X) and response (Y) data. The predictor data in this case mostly consist of the spectroscopic data, which are easy to generate in large quantities, but the response data typically involve offline measurements in the laboratory that require much more effort to perform in large numbers. It will be shown that in the case of an H. polymorpha cultivation performed in microtiter plates, those tedious offline measurements for response data can be replaced by a mathematical process model. Here, an exponential growth model in an ideal stirred tank reactor with lag time is applied, which has three parameters (lag time, specific growth rate, and yield coefficient). Furthermore, it will be demonstrated that knowledge about the parameter values of this process model is not required, as these values can be determined from 2D fluorescence spectra alone. The only required information about the cultivation is the predictor data, 2D fluorescence spectra in this case, and the initial state of at least three different cultivation runs, that is, the initial values of biomass and substrate (glycerol) concentration. The smallest prediction errors for biomass and glycerol obtained by the new calibration procedure are 0.19 g/L and 0.79 g/L respectively, compared with 0.19 g/L and 1.12 g/L if a classical procedure using offline measurements is applied. The inherently calculated process parameters of lag time, specific growth rate and yield coefficient are 4.77 h, 0.154 h−1, and 0.457 g/g, which are similar to the values determined with offline measurements and a least-squares fit (4.48 h, 0.139 h−1, and 0.466 g/g, respectively).

      PubDate: 2017-10-17T23:25:58Z
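      A process model of the kind described in the H. polymorpha abstract above, with lag time, specific growth rate and yield coefficient as its three parameters, might look like the following sketch. The exact equations are not given in the abstract, so the functional form here (no growth during the lag phase, exponential growth afterwards, substrate consumed in proportion to the biomass formed, growth stopping once the substrate is exhausted) is an assumption, and the function name is hypothetical.

```python
import math


def growth_model(t, x0, s0, lag_time, mu, yield_coeff):
    """Return (biomass, substrate) in g/L at time t in hours.

    x0, s0      initial biomass and substrate concentration (g/L)
    lag_time    duration of the lag phase (h)
    mu          specific growth rate (1/h)
    yield_coeff biomass formed per substrate consumed (g/g)
    """
    if t <= lag_time:
        # Assumed: nothing changes during the lag phase.
        return x0, s0
    # Assumed: growth stops when all substrate has been converted.
    x_max = x0 + yield_coeff * s0
    x = min(x0 * math.exp(mu * (t - lag_time)), x_max)
    s = s0 - (x - x0) / yield_coeff
    return x, s


# Parameter values reported in the abstract; the initial state is made up.
biomass, glycerol = growth_model(
    t=12.0, x0=0.1, s0=10.0, lag_time=4.77, mu=0.154, yield_coeff=0.457
)
```

      In the calibration scheme the abstract describes, trajectories like these would take the place of offline reference measurements as the response data for the chemometric model.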
  • Evaluation and assessment of homogeneity in images. Part 1: Unique
           homogeneity percentage for binary images
    • Abstract: Publication date: Available online 8 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Leandro de Moura França, José Manuel Amigo, Carlos Cairós, Manel Bautista, Maria Fernanda Pimentel
      Texture features analysis is one of the most important approaches for the assessment of homogeneity on images. However, all of them are either relative to the comparison with a standardized set of images, or further multivariate models are strongly required to predict or classify the images according to their features. In this first work, we propose an alternative and novel methodology to calculate a percentage of homogeneity by only using the self-information contained on the image. This methodology is based on the macropixel analysis theory and the generation of what is called the “homogeneity curve”. The homogeneity curve is deeply explored and the knowledge to what it could be considered the most homogeneous and inhomogeneous distribution for every case is spanned. This first work postulates the theory and demonstrates its usefulness with several examples applied to binary images. This will provide a theoretical framework to fully understand the homogeneity curve, postulating a mathematical model to parametrize homogeneity and its plausible deviations.

      PubDate: 2017-10-11T03:12:18Z
  • Optimized self-adaptive model for assessment of soil organic matter using
           Fourier transform mid-infrared photoacoustic spectroscopy
    • Abstract: Publication date: Available online 7 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Fei Ma, Changwen Du, Jianmin Zhou, Yazhen Shen
      Advanced technologies, such as infrared spectroscopy, have been applied to develop rapid, cheap but accurate methods for the analysis of soil organic matter (SOM). However, unsatisfactory prediction accuracy resulting from strong soil heterogeneity limits practical application. In our previous work, a soil identification based self-adaptive partial least squares model (SAM), built using an identification algorithm and partial least squares regression (PLSR), made wider use possible. However, soil identification in the SAM needs further optimization. In this study, we designed an advanced optimal self-adaptive partial least squares model (OPT-SAM), a more general model to predict SOM. 597 soil samples from China with large variances were collected, and the soil spectra were recorded using Fourier transform mid-infrared photoacoustic spectroscopy (FTIR-PAS). Five typical algorithms (correlation coefficients (CC), Euclidean distance (ED), Mahalanobis distance (MD), angle cosine (AC), and k-medoids (KM)) were considered for the identification in the SAM model. The results demonstrated that the performances of CC-SAM, ED-SAM, MD-SAM and AC-SAM were significantly improved in comparison with the SAM without identification (NI-SAM), but KM-SAM showed poor prediction. ED-SAM (R2 = 0.8890, RMSEP = 7.00 g kg−1, RPD = 2.96) showed the highest accuracy and robustness of all algorithms and was an optimal model for soil identification and prediction, and CC-SAM (R2 = 0.8572, RMSEP = 7.89 g kg−1, RPD = 2.44) was an alternative choice, especially for prediction with different soil types.

      PubDate: 2017-10-11T03:12:18Z
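      Three of the identification measures listed in the soil organic matter abstract above (Euclidean distance, angle cosine and correlation coefficient) are simple to state for a pair of spectra. The sketch below shows them together with a hypothetical helper that picks the calibration spectra closest to a target spectrum. How the SAM actually uses the selected samples is not described in the abstract, so that last step is an assumption for illustration only.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_cosine(a, b):
    """Cosine of the angle between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def correlation_coefficient(a, b):
    """Pearson correlation: the angle cosine of the mean-centred spectra."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return angle_cosine([x - mean_a for x in a], [y - mean_b for y in b])


def nearest_by_euclidean(target, library, k):
    """Indices of the k library spectra closest to target (hypothetical helper)."""
    ranked = sorted(
        range(len(library)),
        key=lambda i: euclidean_distance(target, library[i]),
    )
    return ranked[:k]
```

      Euclidean distance is sensitive to overall intensity, whereas angle cosine and correlation compare spectral shape only; that difference is one plausible reason the variants in the abstract perform differently.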
  • Review on data-driven modeling and monitoring for plant-wide industrial
           processes
    • Abstract: Publication date: Available online 29 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Zhiqiang Ge
      Data-driven modeling and its applications in plant-wide processes have recently attracted much attention in both academia and industry. This paper provides a systematic review of data-driven modeling and monitoring for plant-wide processes. First, methodologies of commonly used data processing and modeling procedures for plant-wide processes are presented. The state of research since 2000 on various aspects of plant-wide process monitoring is then reviewed. After that, extensions, opportunities, and challenges in data-driven modeling for plant-wide process monitoring are discussed and highlighted for future research.

      PubDate: 2017-10-04T06:42:34Z
  • A new measure of regression model accuracy that considers applicability
    • Abstract: Publication date: Available online 29 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Hiromasa Kaneko
      The coefficient of determination and the root-mean-squared error (RMSE) evaluate regression models for test samples without considering the applicability domains (ADs) of the models. In this study, we propose a new measure for evaluating the predictive performance of regression models that considers their ADs. The purpose is not selecting the best regression model among various competing models, but determining an appropriate model group corresponding to the AD of each model. The proposed measure is the area under coverage and RMSE curve for coverage less than p% (p%-AUCR). It is confirmed that some regression models have global predictive ability and others have local predictive ability, and p%-AUCR is an appropriate indicator for selecting between local and global regression models depending on the coverage and considering the AD. Selecting a regression model for each sample or each chemical structure using p%-AUCR can improve the prediction accuracy of data sets.

      PubDate: 2017-10-04T06:42:34Z
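      The p%-AUCR measure in the abstract above is described only as the area under the coverage and RMSE curve for coverage below p%. One way to read that description is sketched here; it is an interpretation rather than the author's definition. It assumes each test sample has an applicability-domain score where smaller means closer to the training data (`ad_scores`, a hypothetical input), that coverage is the fraction of samples admitted in that order, and that the area is accumulated with a simple rectangle rule.

```python
import math


def coverage_rmse_curve(ad_scores, errors):
    """Return [(coverage, rmse), ...] as samples are admitted by AD score.

    ad_scores: smaller value = deeper inside the applicability domain (assumed).
    errors:    prediction errors (predicted minus observed) per sample.
    """
    order = sorted(range(len(errors)), key=lambda i: ad_scores[i])
    curve = []
    squared_sum = 0.0
    for rank, i in enumerate(order, start=1):
        squared_sum += errors[i] ** 2
        curve.append((rank / len(errors), math.sqrt(squared_sum / rank)))
    return curve


def aucr(curve, p):
    """Area under the coverage-RMSE curve for coverage up to p (0 < p <= 1)."""
    area = 0.0
    previous_coverage = 0.0
    for coverage, rmse in curve:
        if coverage > p + 1e-12:
            break
        area += (coverage - previous_coverage) * rmse
        previous_coverage = coverage
    return area
```

      Under this reading, a model that is accurate only near its training data has a small area at low p and a large one at high p, while a globally accurate model has a more even profile, which matches the local versus global distinction the abstract draws.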
  • Penalized logistic regression for classification and feature selection
           with its application to detection of two official species of Ganoderma
    • Abstract: Publication date: Available online 28 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Ying Zhu, Tuck Lee Tan, Wai Kwong Cheang
      Two species of Ganoderma, Ganoderma lucidum (G. lucidum) and Ganoderma sinense (G. sinense) have been widely used as traditional Chinese herbal medicine for their high medicinal value. Recent studies show that the two species differ in levels of their main active compounds triterpenoids though both have antitumoral effects. An effective and simple analytical method using attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy to discriminate between the two species is of essential importance for its quality assurance and medicinal value estimation. In this study three penalized logistic regression models, weighted least absolute shrinkage and selection operator (Lasso), elastic net and weighted fusion, using ATR-FTIR spectroscopy have been explored for the purpose of classification and interpretation. The weighted fusion model incorporating spectral correlation structure allowed an automatic selection of a small number of spectral bands and achieved an excellent overall classification accuracy of 99% in discriminating spectra of G. lucidum from that of G. sinense. Its classification performance was superior to that of the weighted Lasso model and elastic net model. The automatic selection of informative spectral features results in substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of Ganoderma regarding its anti-cancer effects.

      PubDate: 2017-10-04T06:42:34Z
  • Concurrent probabilistic PLS regression model and its applications in
           process monitoring
    • Abstract: Publication date: Available online 28 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Qinghua Li, Feng Pan, Zhonggai Zhao
      The probabilistic PLS (PPLS) algorithm derives the latent variables by maximizing the likelihood of input scores and quality scores, but imposes no constraint on the input residuals and the quality residuals, which implies that residuals may contain large information. Motivated by the concurrent PLS method, this paper proposes a concurrent PPLS (CPPLS) method to perform further decomposition of these residuals, and then two more subspaces are obtained. In this method, the maximum-likelihood method along with the expectation-maximization (EM) algorithm are employed to develop the model, in which the variance of each variable explained by latent variables is introduced to determine the number of latent variables. Based on the CPPLS model, five monitoring statistics all based on Mahalanobis norm are constructed for the evaluation of five subspaces decomposed by CPPLS, respectively.

      PubDate: 2017-10-04T06:42:34Z