
Chemometrics and Intelligent Laboratory Systems
  [SJR: 0.697]   [H-I: 92]
   Hybrid journal (it can contain Open Access articles)
   ISSN (Print) 0169-7439
   Published by Elsevier  [3118 journals]
  • A new active learning strategy for soft sensor modeling based on feature reconstruction and uncertainty evaluation
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Qifeng Tang, Dewei Li, Yugeng Xi
      Soft sensor techniques have been increasingly applied in chemical processes to predict key variables online. Traditionally, soft sensor modeling needs labeled training samples that contain both subsidiary and key variables. However, labeled samples are limited because key variables are always difficult to obtain online. Thus, a novel active learning (AL) strategy is proposed to reduce the labeling cost; it iteratively selects the most informative candidates by jointly evaluating two criteria: (i) representativeness and (ii) uncertainty. The representative samples are defined as those whose features best reconstruct the global features extracted by kernel principal component analysis (KPCA), while the uncertain samples are selected by the variance estimated from a Gaussian process regression (GPR) model. An optimization scheme is then introduced to solve the optimization problem derived from the two sampling criteria. Three industrial application case studies show that the proposed AL strategy is capable of selecting the most informative samples, which can improve the performance of the soft sensor.

      PubDate: 2017-12-27T12:43:37Z
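
      The uncertainty criterion described in the abstract above can be illustrated with a minimal sketch. It covers only the GPR-variance half of the strategy; the KPCA-based representativeness criterion and the joint optimization are not shown. The RBF kernel, the unit length scale, the noise level, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def predictive_variance(x_labeled, x_new, noise=1e-6):
    """GPR posterior variance at x_new: k(x*, x*) - k*^T (K + noise I)^-1 k*."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_labeled)]
         for i, a in enumerate(x_labeled)]
    k_star = [rbf(a, x_new) for a in x_labeled]
    alpha = solve(K, k_star)
    return rbf(x_new, x_new) - sum(k * a for k, a in zip(k_star, alpha))

def most_uncertain(x_labeled, candidates):
    """Pick the unlabeled candidate with the largest posterior variance."""
    return max(candidates, key=lambda x: predictive_variance(x_labeled, x))

# Hypothetical one-dimensional inputs: three labeled points and an unlabeled pool.
labeled = [0.0, 1.0, 2.0]
pool = [0.1, 1.5, 5.0]
print(most_uncertain(labeled, pool))  # 5.0, the point farthest from the labeled data
```

      In an active learning loop, the selected candidate would be labeled, moved into the labeled set, and the selection repeated.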
  • Active learning algorithm can establish classifier of blueberry damage with very small training dataset using hyperspectral transmittance data
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Meng-Han Hu, Yu Zhao, Guang-Tao Zhai
      The aim of this study was to estimate the performance of an active learning algorithm for detecting blueberry damage using hyperspectral transmittance data at very low labeling cost. A hyperspectral transmittance imaging system was first applied to collect the hyperspectral transmittance data of blueberries. Subsequently, the mean hyperspectral transmittance data were extracted. With only 9 labeled berries, the estimated error reduction method achieved accuracy, precision and recall of 0.87, 0.93 and 0.78, respectively, and it consistently improved or maintained the performance of the classifier for the remainder of the queries. Compared with the SOM and SVM models, the classifier based on estimated error reduction also provided higher accuracy, precision and recall with far fewer labeled samples. Active learning algorithms can be extended to large-scale applications in which labeled samples are very limited or expensive and the models must be transferred frequently. In our case, because of the significant biological variation among blueberry samples, the classifier required frequent updates in practical applications, and the active learning algorithms could markedly reduce the labeling effort during model updating.

      PubDate: 2017-12-27T12:43:37Z
  • Estimating nitrogen status of rice canopy using hyperspectral reflectance combined with BPSO-SVR in cold region
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Kezhu Tan, Shuwen Wang, Yuzhu Song, Yao Liu, Zhenping Gong
      Because of the special soil background, climate conditions and growth period in cold regions, it is necessary to find adaptable models to assess rice growth and nutrient status. This study was conducted to estimate the nitrogen content of rice at different growth stages using the hyperspectral reflectance of the rice canopy, toward the development of precision nitrogen status monitoring. The field experiment applied four levels of nitrogen (N) treatment to the cultivar ‘Daohuaxiang’. The hyperspectral reflectance of the rice canopy at the tillering, jointing and heading stages was captured using a hyperspectral imaging system in the range of 372–1038 nm, covering 128 wavebands. The average spectral reflectance was extracted from five regions of interest (ROIs) of each sample, and a total of 192 groups of spectral reflectance data were obtained. Then, eight vegetation indices were calculated from the spectral reflectance. A machine learning method, termed “support vector regression based on binary particle swarm optimization algorithm (BPSO-SVR)”, was proposed to predict nitrogen content. The results were achieved by selecting the best subset of input variables and optimizing the parameters ‘c’ and ‘g’ of SVR through BPSO-SVR. In this work, we also established traditional prediction models such as partial least squares regression (PLSR), principal components regression (PCR) and GA-BPANN. The predictive power of these regression models was compared using R² (coefficient of determination) and RMSE (root mean square error) on the calibration set and testing set. The newly proposed ‘BPSO-SVR’ method yielded excellent R² (0.913–0.949) and smaller RMSE (0.055–0.127) values for fitting the nitrogen concentration of the rice canopy over the three growth stages. The results showed that the proposed method for predicting the N content of the rice canopy at different growth stages has potential for nitrogen status monitoring in cold regions.

      PubDate: 2017-12-27T12:43:37Z
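
      The two comparison metrics named in the abstract above, R² and RMSE, follow standard definitions and can be computed as in the sketch below. The function names and the toy values are hypothetical; they are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted nitrogen values (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(observed, predicted), 3))  # 0.98
print(round(rmse(observed, predicted), 3))       # 0.158
```

      A higher R² and a lower RMSE on the testing set, not only on the calibration set, indicate the better-generalizing model.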
  • A visualization approach for unknown fault diagnosis
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Zhangming He, Zhiwen Chen, Haiyin Zhou, Dayi Wang, Yan Xing, Jiongqi Wang
      Since visualization can provide control engineers with useful information about the state of the process, it has become an indispensable item in the condition monitoring toolbox. The objective of this paper is to propose a visualization approach and apply it to unknown fault isolation. First, the data-driven parity space (PS) technique is used to identify the stable kernel representation (SKR) of a linear time-invariant dynamic system. Then, the signature directions (SDs) and the current directions (CDs) are defined, based on which detection and isolation rules are proposed for diagnosing both known faults (KFs) and unknown faults (UFs). Finally, a visualization approach is provided for projecting high-dimensional fault information onto a lower-dimensional, drawable space. This approach maintains fault isolability so that engineers are able to diagnose faults more reasonably. The proposed visualization approach is applied to a vertical take-off and landing (VTOL) aircraft model and a glass tube manufacturing process.

      PubDate: 2017-12-27T12:43:37Z
  • Using spectral and textural data extracted from hyperspectral near infrared spectroscopy imaging to discriminate between processed pork, poultry and fish proteins
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Cristóbal Garrido-Novell, Ana Garrido-Varo, Dolores Pérez-Marín, José Emilio Guerrero
      This paper proposes a method based on Near Infrared Hyperspectral Imaging for discriminating between pork, poultry and fish species in processed animal protein meals. First, an investigation was conducted into the possible importance of incorporating anomalous (or singular) pixels into the discrimination models as probable discriminant pixels for each species. Subsequently, partial least squares discriminant analysis (PLS-DA) spectral and textural models were constructed. The former reflected the spectral information (spectral trace), and the latter the spatial information (textural trace) based on different groups of features. Finally, the spectral and textural information was integrated using classification trees, to ascertain whether the combined use of such information improved accuracy in discriminating between species. The method was applied to a set of 40 pork, 40 poultry and 40 fish meals analysed in the 1000–1700 nm range. Models were then tested using an external validation set comprising 45 samples (15 pork, 15 poultry and 15 fish meals). The results demonstrated that combining spectral and appearance characteristics in a single classification tree generated better classification results for the samples used in the study (92% correct) than using the PLS-DA spectral model (83% correct).

      PubDate: 2017-12-27T12:43:37Z
  • Artificially generated near-infrared spectral data for classification
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Vilma Sem, Jana Kolar, Lara Lusa
      Near-Infrared Spectroscopy has become a widely used analytical technique in different research fields due to its non-destructiveness and low cost. The spectra are rich in information but extremely complex; their analysis therefore necessitates the use of advanced statistical methods. The empirical properties of the statistical methods can be assessed using artificially generated data that resemble real Near-Infrared Spectroscopy data. In this paper we propose a new data generation approach (ABS) that takes into account theoretical knowledge about the near-infrared absorption of the functional groups. The proposed method is compared to real data and to a simpler data generation method, which simulates the data from a multivariate normal distribution whose parameters are estimated from real data (MVNorig). The comparison between real data and the data generation approaches is based on a class-imbalanced classification problem using linear discriminant analysis, classification trees and support vector machines. Both simulation approaches generated spectra with a good resemblance to real data, with MVNorig performing slightly better than ABS; using real and simulated data we would have reached similar conclusions about the class-imbalance problem in classification. Both methods can be used to artificially generate near-infrared spectra. The method based on the multivariate normal distribution can be used when a large number of real spectra is available, while the appropriateness of the results of the ABS method depends on the exactness of the knowledge about functional group near-infrared absorption.

      PubDate: 2017-12-27T12:43:37Z
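
      The simpler baseline in the abstract above (MVNorig) simulates spectra from a multivariate normal distribution whose parameters are estimated from real data. The sketch below shows that general idea only: it estimates a mean vector and sample covariance, then draws samples through a Cholesky factor. The three-wavelength toy "spectra", the jitter term, and all function names are assumptions for illustration; the authors' actual procedure may differ.

```python
import math
import random

def mean_and_cov(rows):
    """Sample mean vector and unbiased covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov

def cholesky(A, jitter=1e-9):
    """Lower-triangular L with L L^T = A (+ jitter on the diagonal)."""
    d = len(A)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] + jitter - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def sample_mvn(mu, L, rng):
    """Draw one sample mu + L z, where z is standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mu)]

# Hypothetical "real" spectra: 5 samples measured at 3 wavelengths.
real = [
    [0.10, 0.52, 0.31],
    [0.12, 0.55, 0.30],
    [0.09, 0.50, 0.35],
    [0.11, 0.58, 0.28],
    [0.13, 0.54, 0.33],
]
mu, cov = mean_and_cov(real)
L = cholesky(cov)
rng = random.Random(0)  # fixed seed so the run is repeatable
synthetic = [sample_mvn(mu, L, rng) for _ in range(3)]
print(synthetic)
```

      With real spectra the number of wavelengths usually far exceeds this toy example, which is consistent with the abstract's remark that the approach suits cases where many real spectra are available.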
  • Steel surface defect classification using multiple hyper-spheres support vector machine with additional information
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Rongfen Gong, Chengdong Wu, Maoxiang Chu
      A novel multiple hyper-spheres support vector machine with additional information (MHSVM+) is proposed for multi-class classification of steel surface defects. Derived from the binary twin hyper-spheres support vector machine, MHSVM+ uses hyper-spheres to solve the classification decision problem. Unlike its binary predecessor, MHSVM+ is a multi-class classifier that builds a corresponding hyper-sphere for each type of defect dataset. Moreover, MHSVM+ introduces a learning paradigm using additional information, which means it can learn additional information hidden in the defect dataset. Two types of additional information are provided: local neighbor information and local density information. Local neighbor information contains local classification results for defect samples, and local density information is used to capture label noise, isolated samples and important samples in the defect dataset. Both types of additional information are introduced into the MHSVM+ model. Finally, the MHSVM+ classifier is used to classify six types of steel surface defects. Experimental results show that the novel multi-class classifier has perfect classification accuracy for the defect dataset, especially the corrupted defect dataset.

      PubDate: 2017-12-27T12:43:37Z
  • A new reconstruction-based auto-associative neural network for fault diagnosis in nonlinear systems
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Shaojun Ren, Fengqi Si, Jianxin Zhou, Zongliang Qiao, Yuanlin Cheng
      The auto-associative neural network (AANN) is a typical nonlinear principal component analysis method that is widely used in industry for fault diagnosis, especially in nonlinear systems. However, the basic AANN often suffers from “smearing effects” that may lead to misdiagnosis, particularly with regard to complex faults involving multiple variables. In this work, a new reconstruction-based AANN (RBAANN) method is proposed to enhance the capacity for fault diagnosis. In RBAANN, a generic derivative equation is developed to investigate the effects of AANN model inputs on the prediction error between model inputs and outputs. Based on the derivative equation, the reconstruction-based index for single or multiple variables, which is defined as the minimum prediction error, is obtained by tuning the corresponding model inputs iteratively. However, without prior knowledge of the real faulty variables, all possible variable sets need to be evaluated by the reconstruction-based index, which may result in an exhaustive search and a huge computational burden. Thus, a branch and bound algorithm is introduced into RBAANN to solve the variable selection problem. Finally, an efficient fault diagnosis strategy that integrates RBAANN and the branch and bound algorithm (BAB-RBAANN) is implemented to further pinpoint the source of the detected faults. The BAB-RBAANN method can efficiently handle both single-variable and multiple-variable faults in nonlinear systems without prior knowledge. The effectiveness of the proposed methods is evaluated on a validation example and an industrial example. Comparisons with other methods, including principal component analysis techniques, are also presented.

      PubDate: 2017-12-27T12:43:37Z
  • A weighted heteroscedastic Gaussian Process Modelling via particle swarm
    • Abstract: Publication date: 15 January 2018
      Source: Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Xiaodan Hong, Yongsheng Ding, Lihong Ren, Lei Chen, Biao Huang
      In many chemical engineering applications, it is often difficult to obtain accurate first-principles models because of the complexity of modern processes. Even when it is possible, doing so is often time consuming and computationally expensive. Hence, there is a growing need to develop data-driven models. The Gaussian process regression (GPR) model has been extensively applied in data-based modelling because it adapts well to high-dimensional, small-sample, and nonlinear problems. The standard GPR algorithm assumes constant noise power throughout the sampling process. However, in process systems, the observation noise often varies, so that different sample points are corrupted by different degrees of noise. Under these circumstances, the standard GPR algorithm may not work properly. To model a Gaussian process with heteroscedastic noise, this paper introduces a weighting strategy into the standard GPR algorithm and proposes three weighted GPR algorithms: the clustered GPR (C-GPR) algorithm, the partial weighted GPR (PW-GPR) algorithm and the weighted GPR (W-GPR) algorithm. Unlike the standard GPR algorithm, the three weighted algorithms weight the sampled data by calculating the noise variance for each data point. In addition, to optimize the proposed algorithms, this paper uses the particle swarm optimization (PSO) algorithm to estimate the hyper-parameters of the GPR model, instead of the traditional conjugate gradient (CG) method. The effectiveness of the three weighted GPR algorithms is verified by means of two numerical examples and a wet spinning coagulation process. Extensive simulation results demonstrate that the proposed algorithms optimized by the PSO algorithm can improve the prediction accuracy of the GPR model.

      PubDate: 2017-12-27T12:43:37Z
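
      The contrast the abstract above draws between constant and point-dependent noise can be written out in the usual GPR notation. The block below is a generic textbook formulation given as a sketch, not the paper's exact C-GPR, PW-GPR, or W-GPR weighting; the symbols (kernel matrix K, cross-covariance vector k_*, targets y, per-point noise variances sigma_i^2) are assumed notation.

```latex
% Standard GPR: one noise variance \sigma^2 shared by all n samples
\mu_* = \mathbf{k}_*^{\top}\left(K + \sigma^{2} I\right)^{-1}\mathbf{y},
\qquad
s_*^{2} = k(x_*, x_*) - \mathbf{k}_*^{\top}\left(K + \sigma^{2} I\right)^{-1}\mathbf{k}_*

% Heteroscedastic GPR: each sample i carries its own noise variance \sigma_i^2
\Sigma = \operatorname{diag}\left(\sigma_1^{2}, \dots, \sigma_n^{2}\right),
\qquad
\mu_* = \mathbf{k}_*^{\top}\left(K + \Sigma\right)^{-1}\mathbf{y},
\qquad
s_*^{2} = k(x_*, x_*) - \mathbf{k}_*^{\top}\left(K + \Sigma\right)^{-1}\mathbf{k}_*
```

      A sample with a large noise variance contributes less to the prediction, which is the sense in which per-point noise estimates act as weights on the sampled data.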
  • Identifying animal species in NIR hyperspectral images of processed animal
           proteins (PAPs): Comparison of multivariate techniques
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Cecilia Riccioli, Dolores Pérez-Marín, Ana Garrido-Varo
      The use of PAPs in animal feed has several advantages over other feed ingredients, but requires rigorous and accurate control mechanisms that ensure the absence of ruminant meal. In order to differentiate between animal species while simultaneously offering the capacity to inspect PAPs in large volumes, a hyperspectral imaging (HSI) system operating in the NIR spectral range is proposed. This study investigates the sensitivity, specificity and other parameters with which HSI can discriminate between different animal species (ruminants, swine and poultry), making use of various classification methods. Diffuse reflectance spectra were acquired from 125 rendered meal samples in the 1000–1700 nm wavelength range; measured PAPs included particles of scale, hair, feather, blood, grease, skin, muscle and bone from both ruminant and non-ruminant animals, obtained in a rendering plant. Various classification methods were then applied to the dataset to determine the accuracy with which different animal species could be discriminated from each other. Support Vector Machine classification performed best in discriminating between animal species, with a sensitivity and specificity of around 90% and a Matthews correlation coefficient of around 0.7 for non-ruminant species and higher than 0.95 for ruminant species. Other methods, such as PLS-DA and Subspace Discriminant, also produced acceptable results and required less computational time. This study showed that spectral analysis of PAPs, based on diffuse reflectance spectroscopy, is a promising technique for differentiating between ruminant species and other terrestrial animal species. The technique may therefore offer accurate and fast analysis of large volumes of feed products, a necessary prerequisite for the lifting of the EU ban on non-ruminant processed animal proteins.

      PubDate: 2017-12-27T12:43:37Z
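      For readers unfamiliar with the figures of merit quoted in the abstract above, the following sketch computes sensitivity, specificity and the Matthews correlation coefficient from a binary confusion matrix. The counts are invented for illustration and are not the study's data; the function name binary_metrics is hypothetical.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Illustrative counts only: 100 positives and 100 negatives, 10 errors each.
sens, spec, mcc = binary_metrics(tp=90, fp=10, tn=90, fn=10)
```

      Note that 90% sensitivity and specificity on balanced classes gives an MCC of 0.8, which shows why the MCC is a stricter summary than either rate alone.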
  • Finding the optimal time resolution for batch-end quality prediction: MRQP
           – A framework for multi-resolution quality prediction
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Geert Gins, Jan F.M. Van Impe, Marco S. Reis
      Batch-end quality prediction is of paramount importance in industries producing high added-value products. However, all available methods hereto only use raw data at its native resolution. This paper proposes a new multi-resolution quality prediction (MRQP) methodology for batch-end quality, which exploits structured correlation in both the time and variables dimensions. The implementation of this methodology results in a more parsimonious and robust model structure, with a predictive performance bounded to be at least as good as the current standard approach. The implementation of MRQP is illustrated in three different case studies, where improvements over the standard single-resolution approach were found to be in the range of 10%–50%. From an interpretation standpoint, multi-resolution models are more robust with respect to the selection of too many predictors, facilitating the identification of key process variables, and providing information on the process time scales that influence final product quality, which can be further exploited for diagnosis, control, and optimization.

      PubDate: 2017-12-27T12:43:37Z
  • In-line Vis-NIR spectral analysis for the column chromatographic processes
           of Ginkgo biloba part I: End-point determination of the elution process
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Wenlong Li, Xu Yan, Houliu Chen, Haibin Qu
      A Vis-NIR (visible-near infrared) spectroscopy based method for the end-point determination of the column chromatographic processes of Ginkgo biloba was established. In-line collected Vis-NIR spectra were used in conjunction with various multivariate data analysis methods, including moving block standard deviation (MBSD), principal component scores distance analysis (PC-SDA), and principal component analysis-moving block standard deviation (PCA-MBSD) for the end-point determination of the elution process. Based on comparison with results validated by high performance liquid chromatography (HPLC), PCA-MBSD was chosen as the most suitable method. The presented method provides an alternative for end-point determination of the column chromatographic processes, which has traditionally depended on the operators' experience.

      PubDate: 2017-12-27T12:43:37Z
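      A moving block standard deviation can be sketched as follows, assuming the common formulation in which the standard deviation is taken per wavelength over a block of consecutive spectra and then averaged over wavelengths. The block size, the threshold and the function names are hypothetical, and the abstract's PCA-MBSD variant (which works on principal component scores) is not reproduced here.

```python
import statistics

def mbsd(spectra, block=3):
    """For each block of consecutive spectra, average the per-wavelength
    sample standard deviation. spectra is a list of equal-length lists."""
    values = []
    for start in range(len(spectra) - block + 1):
        window = spectra[start:start + block]
        per_wavelength = [statistics.stdev(col) for col in zip(*window)]
        values.append(sum(per_wavelength) / len(per_wavelength))
    return values

def end_point(mbsd_values, threshold, block=3):
    """Index of the last spectrum in the first block whose MBSD falls
    below the threshold, or None if the signal never settles."""
    for i, value in enumerate(mbsd_values):
        if value < threshold:
            return i + block - 1
    return None

# Toy two-wavelength spectra that stop changing from the third one on.
spectra = [[1.0, 2.0], [0.5, 1.0], [0.2, 0.4], [0.2, 0.4], [0.2, 0.4]]
trace = mbsd(spectra, block=3)
stop_index = end_point(trace, threshold=0.01, block=3)
```

      In practice the threshold would be set from the noise level of the spectrometer, and a run of several consecutive sub-threshold blocks is often required before declaring the end point.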
  • Support vector regression coupled with wavelength selection as a robust
           analytical method
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Felipe Soares, Michel J. Anzanello
      This paper assesses the support vector regression (SVR) as a robust alternative to partial least squares (PLS) in multivariate calibration using twelve public domain NIR spectroscopy datasets. It also proposes the use of the support vector regression – recursive feature elimination (SVR-RFE) algorithm to select the most informative wavelengths for SVR models. Models based on full spectra were built using SVR and PLS, while wavelength selection methods were carried out using SVR-RFE, interval PLS (iPLS), backward interval PLS (biPLS), synergy interval PLS (siPLS), and successive projection algorithm PLS (SPA-PLS). The prediction performance of tested methods was measured by means of the root mean squared error (RMSE), index of agreement (d-index) and R2 on the test set. SVR-based models yielded the best results in 8 out of 12 datasets, 4 of them using full spectra and 4 relying on SVR-RFE selected wavelengths. Statistical comparison was carried out for the wavelength selection algorithms using Friedman test, which pointed the SVR-RFE as a competitive technique when compared to the other algorithms. This study revealed SVR as a robust alternative to PLS, especially when SVR-RFE is employed for wavelength selection.

      PubDate: 2017-12-27T12:43:37Z
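      The three test-set metrics named in the abstract above can be written out as a short sketch. It assumes Willmott's definition of the index of agreement and the coefficient-of-determination form of R2 (1 minus the ratio of residual to total sum of squares); the abstract does not state which variants were used, and the function names are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def index_of_agreement(obs, pred):
    """Willmott's d-index: 1 means perfect agreement."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den

def r_squared(obs, pred):
    """Coefficient of determination relative to the mean of the observations."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
metrics = (rmse(observed, predicted),
           index_of_agreement(observed, predicted),
           r_squared(observed, predicted))
```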
  • Deep-learning-based regression model and hyperspectral imaging for rapid
           detection of nitrogen concentration in oilseed rape (Brassica napus L.)
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Xinjie Yu, Huanda Lu, Qiyu Liu
      A deep-learning-based regression model composed of stacked auto-encoders (SAE) and a fully-connected neural network (FNN) was used for the detection and quantification of nitrogen (N) concentration in oilseed rape leaf. SAE was applied to extract deep spectral features from visible and near-infrared (380–1030 nm) hyperspectral images of oilseed rape leaf, and these features were then used as input data for the FNN to predict N concentration. The SAE-FNN model achieved reasonable performance with R2P = 0.903, RMSEP = 0.307% and RPDP = 3.238 for N concentration. Results confirmed the possibility of rapid and nondestructive detection of N concentration in oilseed rape leaf by combining the hyperspectral imaging technique with a deep learning method.

      PubDate: 2017-12-27T12:43:37Z
  • Evaluation of diagnosis methods in PCA-based Multivariate Statistical
           Process Control
    • Abstract: Publication date: 15 January 2018
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 172
      Author(s): Marta Fuentes-García, Gabriel Maciá-Fernández, José Camacho
      Multivariate Statistical Process Control (MSPC) based on Principal Component Analysis (PCA) is a well-known methodology in chemometrics that is aimed at testing whether an industrial process is under Normal Operation Conditions (NOC). As a part of the methodology, once an anomalous behaviour is detected, the root causes need to be diagnosed to troubleshoot the problem and/or avoid it in the future. While there have been a number of developments in diagnosis in the past decades, no sound method for comparing existing approaches has been proposed. In this paper, we propose such a procedure and use it to compare several diagnosis methods using randomly simulated data and from realistic data sources. This is a general comparative approach that takes into account factors that have not previously been considered in the literature. The results show that univariate diagnosis is more reliable than its multivariate counterpart.

      PubDate: 2017-12-27T12:43:37Z
  • Identification of robust probabilistic slow feature regression model for
           process data contaminated with outliers
    • Abstract: Publication date: Available online 24 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Lei Fan, Hariprasad Kodamana, Biao Huang
      Modeling of high-dimensional dynamic processes is considered a challenging task. In this regard, probabilistic Slow Feature Analysis (PSFA), a dynamic latent variable model, has proven to be a useful tool which extracts temporally correlated dynamic features from the high-dimensional raw measurements. The extracted latent Slow Features (SFs) can capture process variations which are useful in developing dynamic models. Industrial data is often affected by outliers, and modeling such data could result in inferior prediction performance. To deal with such scenarios, we propose a robust PSFA (RPSFA) based regression model that models outliers in the observation data using the Student's t-distribution. To estimate the parameters in RPSFA and to extract a reduced dimension of SFs, we employ the Expectation-Maximization (EM) algorithm under the Maximum Likelihood Estimation (MLE) framework, considering SFs as hidden variables. To estimate the hidden SFs we propose a weighted gain Kalman filter based approach, as the Normal distribution assumption of the observations is no longer valid. The validity and merits of the proposed approach are demonstrated through a simulated example, an industrial application and an experimental study.

      PubDate: 2017-12-27T12:43:37Z
  • MVC3_GUI: A MATLAB graphical user interface for third-order multivariate
           calibration. An upgrade including new multi-way models
    • Abstract: Publication date: Available online 24 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Sarmento J. Mazivila, Santiago A. Bortolato, Alejandro C. Olivieri
      An upgrade is presented of a MATLAB graphical user interface toolbox for implementing third-order/four-way multivariate calibration models. The new Multivariate Calibration 3 (MVC3_GUI) incorporates new models and features that make it a very versatile tool for four-way data processing. In addition to the quadrilinear decomposition (QLD) and latent structure models based on partial least-squares regression and residual trilinearization, included in the earlier software version, non-QLD models are now available. The latter include extended multivariate curve resolution-alternating least-squares (MCR-ALS), augmented parallel factor analysis (Augmented PARAFAC) and PARAFAC2. The software is presented as both a set of MATLAB codes and as a standalone program. MVC3_GUI accepts a variety of ASCII data for input. Appropriate working sensor regions in the different data modes can be selected. Model development and its subsequent application to unknown samples are straightforward from the interface. Prediction results are provided along with analytical figures of merit and standard concentration errors, as calculated by modern concepts of uncertainty propagation. Different examples of use of this updated interface are given in this work.

      PubDate: 2017-12-27T12:43:37Z
  • Applying Tchebichef image moments to quantitative analysis of the
           components in complex samples based on raw NIR spectra
    • Abstract: Publication date: Available online 22 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Jin Jin Liu, Bao Qiong Li, Hong Lin Zhai, Xue Wang, Min Li Xu
      The interferences of irrelevant information, overlapping and shifts of peaks appear mostly in near infrared (NIR) spectroscopy, especially in complex samples, which seriously impede the accurate quantification. In this work, the features of raw NIR spectra represented by Tchebichef image moments (TMs) were employed to partial least square (PLS) modeling. The proposed strategy was applied to quantitative analysis of the components in complex samples based on their raw NIR spectra, and the obtained models were strictly evaluated by their statistical parameters. Our study indicates that the information in raw NIR spectra can be reorganized and represented by TM method owing to its powerful multi-resolution capability and inherent invariance property, which is beneficial to extract the important information of target components. Compared with the PLS and interval partial least square (iPLS) method, the proposed approach could provide accurate and reliable analytical results. Therefore, as an efficient pretreatment method, TMs can be used to improve the analytical precision of PLS based on conventional NIR spectra.

      PubDate: 2017-12-27T12:43:37Z
  • Call for nominations: 2018 Chemometrics and Intelligent Laboratory Systems
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171

      PubDate: 2017-12-27T12:43:37Z
  • THMGUI: A user-friendly Matlab graphical user interface of
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Bao Qiong Li, Min Li Xu, Xue Wang, Shao Hua Lu, Hong Lin Zhai, Jin Jin Liu
      This paper describes the development and implementation of THMGUI, a graphical user interface (GUI) for implementing the Tchebichef-Hermite moment (THM) method for quantitative analysis purpose on the basis of three-dimensional (3D) landscapes. It consists of four modules, namely “Import data”, “Image moment method”, “Splitting” and “Regression”. The THMGUI allows a user to apply the implemented methods in an easy way, in which the characteristic features acquisition, model establishment and its subsequent application in the analysis of unknown samples are straightforward from the interface. Moreover, the established models that involve the statistical parameters, and prediction results are conveniently managed through THMGUI shells and can be presented intuitively. For illustrating, an example involving the quantitative determination of two components on the basis of LC-MS 3D landscapes is presented.

      PubDate: 2017-12-27T12:43:37Z
  • A model-based data mining approach for determining the domain of validity
           of approximated models
    • Abstract: Publication date: Available online 16 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Marco Quaglio, Eric S. Fraga, Enhong Cao, Asterios Gavriilidis, Federico Galvanin
      Parametric models derived from simplifying modelling assumptions give an approximated description of the physical system under study. The value of an approximated model depends on the consciousness of its descriptive limits and on the precise estimation of its parameters. In this manuscript, a framework for identifying the model domain of validity for the simplifying model hypotheses is presented. A model-based data mining method for parameter estimation is proposed as central block to classify the observed experimental conditions as compatible or incompatible with the approximated model. A nonlinear support vector classifier is then trained on the classified (observed) experimental conditions to identify a decision function for quantifying the expected model reliability in unexplored regions of the experimental design space. The proposed approach is employed for determining the domain of reliability for a simplified kinetic model of methanol oxidation on silver catalyst.

      PubDate: 2017-11-19T00:54:29Z
  • PRFFECT: A versatile tool for spectroscopists
    • Abstract: Publication date: Available online 11 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Benjamin R. Smith, Matthew J. Baker, David S. Palmer
      PRFFECT is a computer program to aid with spectral preprocessing and the development of classification models. Via a simple text interface, PRFFECT allows users to select wavenumber ranges, perform spectral preprocessing, carry out data partitioning (into training and testing datasets), run a Random Forest classification, compute statistical results, and identify important descriptors for the classification. The preprocessing options provided fall into four categories: binning, smoothing, normalisation, and baseline correction. The program outputs a wide variety of useful data, including classification metrics and graphs showing the importance of individual wavenumbers to the classification models. As proof-of-concept, PRFFECT has been benchmarked on preprocessing and classification of four food analysis datasets. Sensitivities and specificities above 0.92 were obtained in all cases. The results show that different preprocessing procedures are optimal for different datasets. The PRFFECT software is available freely to the community via GitHub. Link:

      PubDate: 2017-11-16T00:40:41Z
  • Optimal design of experiments for excipient compatibility studies
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Wannes G.M. Akkermans, Hans Coppenolle, Peter Goos
      A crucial stage in the development of medical drugs is to study which additives, usually called excipients, impact the active ingredient stability. This type of study is generally named an excipient compatibility study and requires a mixture experiment. Subsequently, the effect of the storage conditions, more specifically the relative humidity and temperature, on the stability is investigated. This so-called accelerated life test involves a factorial type of experiment. It has become, however, customary to include the storage conditions in the compatibility study. This provides valuable information concerning potential interactions between excipient combinations and storage conditions. Experiments that combine a mixture experiment with a factorial experiment are generally named mixture-process variable experiments. A limited number of designs for mixture-process variable experiments are available in the literature. One problem is that the proposed designs offer little flexibility. Another is that the required number of runs becomes prohibitively large for large numbers of mixture components. In this paper, we examine flexible, optimal designs for realistic mixture-process variable experiments. Our motivation is to provide guidance to pharmaceutical formulation scientists concerning state-of-the art models and designs for excipient compatibility studies. Using several proof-of-concept examples, we demonstrate that I-optimal designs offer both flexibility and small variances of prediction. We also discuss a real-life example, which could be used as a blueprint for future studies. Because many excipient compatibility studies are not completely randomized, we pay special attention to their logistics and to the resulting randomization restrictions, which lead to split-plot and strip-plot experiments.

      PubDate: 2017-11-08T12:51:16Z
  • Steel surface defects recognition based on multi-type statistical features
           and enhanced twin support vector machine
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Maoxiang Chu, Rongfen Gong, Song Gao, Jie Zhao
      For steel surface defect recognition, feature extraction and classification are very important steps. In this paper, multi-type statistical features and enhanced twin support vector machine classifier are formulated and applied. Firstly, four types of statistical features for different attributes of defect region are proposed. They are insensitive to affine transformation in scale and rotation. And those attributes include shape distance and local binary pattern operators with sign and magnitude. Then, dummy boundary samples and representative samples are extracted from steel surface defect dataset. Dummy boundary samples include the sparse boundary information of dataset. They can reduce the adverse impact of noise samples. Representative samples with local and global properties are used to replace samples with quadratic loss. They can exclude noise samples. Based on dummy boundary samples and representative samples, enhanced twin support vector machine is formulated. On one hand, it can solve multi-class classification problem. On the other hand, it has anti-noise ability and high classification efficiency. At last, enhanced twin support vector machine classifier and multi-type statistical features are applied to recognize five types of steel surface defects. The experimental results show that our proposed multi-class classifier has perfect performance in efficiency and accuracy. And multi-type statistical features are in favor of improving classification performance.

      PubDate: 2017-11-08T12:51:16Z
  • An effective high-quality prediction intervals construction method based
           on parallel bootstrapped RVM for complex chemical processes
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Yuan Xu, Chuan Mi, Qun-Xiong Zhu, Jing-Yang Gao, Yan-Lin He
      Data-driven techniques have become increasingly popular and widely used for prediction in complex chemical processes. In general, prediction results are provided as point estimates. However, point estimates cannot meet accuracy requirements because process data are high-dimensional, highly nonlinear, and noisy. In order to deal with the trend and the uncertainty of process data, an effective prediction intervals (PIs) method based on bootstrap and relevance vector machine (Bootstrapped RVM) is proposed in this paper. In the proposed method, bootstrap is adopted to obtain PIs and RVM is used as a regression tool. In order to accelerate the training and testing phases, a parallel algorithm is utilized in the proposed Bootstrapped RVM method. In addition, to better evaluate the quality of PIs, some performance indicators are improved. Finally, the proposed method is validated by using a standard function and High Density Polyethylene (HDPE) data. Compared with some other PIs methods, the simulation results show that the proposed method can achieve better performance in terms of prediction accuracy and training time.

      PubDate: 2017-11-08T12:51:16Z
  • Biomass concentration prediction via an input-weighed model based on
           artificial neural network and peer-learning cuckoo search
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Qiangda Yang, Hongbo Gao, Weijun Zhang
      Biomass concentration (BC) is considered as one of the most important biochemical parameters. Its reliable on-line estimation is crucial in the real-time status monitoring and quality control of fermentation processes. Considering that each input variable may have different influence on BC in actual fermentation processes, a novel input-weighted empirical model based on the radial basis function neural network (RBFN) and a new peer-learning cuckoo search (PLCS) algorithm, is proposed in this paper to predict BC. The determination of input variable weights and RBFN parameters for the proposed BC prediction model is framed as one and the same optimization problem. Inspired by a common social phenomenon that the mutual learning between team members (peers) would be extremely helpful for their team to accomplish a work efficiently, a PLCS algorithm is proposed to solve the resulting optimization (RO) problem, and thereby accomplish the development of the proposed BC prediction model. The effectiveness and superiority of this new prediction model is validated using the production data from a lab-scale nosiheptide fermentation process. Moreover, the performance of PLCS is also demonstrated on the RO problem with these data and some benchmark functions.

      PubDate: 2017-11-08T12:51:16Z
  • Identification of hindered internal rotational mode for complex chemical
           species: A data mining approach with multivariate logistic regression
    • Abstract: Publication date: Available online 8 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Triet H.M. Le, Tung T. Tran, Lam K. Huynh
      Thermodynamic properties are essential to understand and describe many chemical/biological processes in the real environment. To obtain correct thermodynamic data of chemical species for a wide range of temperatures, a rigorous Hindered Internal Rotation (HIR) treatment must be considered. Such a treatment requires detailed information about the internal rotation (i.e., rotational axis, group, frequency and symmetry and hindrance potential). However, it is very tedious, even prone-to-error, for chemists to prepare the input parameters for such a treatment. Among the HIR parameters, the rotational frequency (or mode) is the most difficult element due to the complex molecular structure and mixing vibrational modes of chemical species. Recently, a rule-based framework has been proposed to help chemists with this tedious process (Le et al., Comput. Theor. Chem., 2017, 61). This approach has been demonstrated to work well for simple species; however, it still lacked the ability to handle more complex cases. Therefore, in this study, a data mining approach is proposed to overcome the challenges of the previous algorithm. Within this framework, the HIR pattern was found using the features extracted from existing data provided by chemists. More specifically, multivariate logistic regression was implemented to analyze the chemical data to better predict the rotational frequency (mode) of chemical species as well as to highlight the effect of each attribute of the rotation. The experimental results were demonstrated to be more accurate than the previous study in terms of both accuracy and completeness. It also gives meaningful insights into the HIR itself. The proposed approach of this research will be integrated into MSMC-GUI ( to provide chemists with both an interactive and robust tool to prepare the data for their thermodynamic calculations on-the-fly.

      PubDate: 2017-11-08T12:51:16Z
  • Dealing with three-way data containing missing values by new weighted
           method for second-order calibration
    • Abstract: Publication date: Available online 7 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yong Li, Hai-long Wu, Xiang-yang Yu
      Multi-way data arrays contain missing values for several reasons, such as various malfunctions of instruments, responses being outside instrument ranges, irregular measurement intervals between samples and data postprocessing. In the present study, one new method, weighted penalty alternating trilinear decomposition (W-APTLD), based on the weighted trilinear model and the idea of alternative trilinear decomposition was given to analyze three-way data arrays containing missing values. In addition, one improved core consistency diagnostic method (W-CORCONDIA) was proposed to estimate the chemical ranks of three-way data arrays containing missing values. The results of one simulation and two real data sets demonstrate that the new method W-APTLD could be used to deal with missing values and reserves the second-order advantage. When meeting excessive factors, W-APTLD could give more accurate results than weighted PARAFAC (W-PARAFAC), PARAFAC with single imputation (PARAFAC-SI) and incomplete data PARAFAC (INDAFAC). The convergence rate of W-APTLD was much faster than W-PARAFAC and PARAFAC-SI but slower than INDAFAC. Better than W-PARAFAC and PARAFAC-SI, W-APTLD could overcome the problem due to severe collinearity. In addition, this new method could be extended to analyze higher-way data arrays containing missing values.

      PubDate: 2017-11-08T12:51:16Z
  • Robust analysis of spectra with strong background signals by
           First-Derivative Indirect Hard Modeling (FD-IHM)
    • Abstract: Publication date: Available online 7 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): P. Beumers, D. Engel, T. Brands, H.J. Koß, A. Bardow
      Spectral analysis of mixtures often faces challenges due to nonlinear effects such as peak shifts or strong background signals. Nonlinear mixture effects can be effectively treated by the Indirect Hard Modeling (IHM) method. In IHM, mixture effects are captured by adapting hard models of pure component spectra when fitting a mixture model. However, IHM requires a suitable background treatment, which can become laborious. Background signals do not arise from the components of interest but are often superimposed on their spectra. In statistical methods for spectral analysis, background treatment is often conducted by taking derivatives of a spectrum. Derivatives effectively damp broad background signals. Standard IHM is not applicable to derivatives of spectra, as the negative parts of a derivative spectrum cannot be modeled by pseudo-Voigt peaks, which are always positive. In this work, we propose First-Derivative Indirect Hard Modeling (FD-IHM). FD-IHM uses the analytical derivatives of the peak functions. The analytical derivatives are fitted to numerical derivatives of the spectra. Thereby, we combine background treatment by first derivatives with the IHM method to treat nonlinear effects. The presented FD-IHM is validated using Raman spectra of ethanol/acetone mixtures. To introduce a variety of background signals, we used fluorescence dye, scattering bodies (yeast) and various background light sources. Classical IHM allows us to predict the test sets with a root-mean-square error of prediction (RMSEP) ranging from 0.60 wt% to 2.06 wt%, but careful manual background treatment had to be applied. With FD-IHM, we reduce the RMSEP by 21%–73% without any background treatment. Thus, FD-IHM allows for both efficient and accurate analysis of spectra with large background signals.

      PubDate: 2017-11-08T12:51:16Z
  • Sampling Error Profile Analysis for calibration transfer in multivariate
    • Abstract: Publication date: Available online 7 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Feiyu Zhang, Wanchao Chen, Ruoqiu Zhang, Boyang Ding, Heming Yao, Jiong Ge, Lei Ju, Wuye Yang, Yiping Du
      A new strategy named Sampling Error Profile Analysis (SEPA) is proposed in the optimization for some parameters in piecewise direct standardization (PDS), such as the number of principal components and window size, and the evaluation for the calibration transfer. Partial least squares (PLS) with mean-centering is used in PDS for calibration transfer. Random re-sampling is carried out in SEPA to obtain a series of subsets and build same number sub-models that produce corresponding number root mean square errors (RMSE), of which the mean value and standard deviation are calculated. To take both accuracy and stability into account, the sum of the mean value and standard deviation are used for parameter optimization and model evaluation. The performance of the proposed strategy has been tested on two data sets: a ternary mixture dataset and a corn dataset. Compared with PDS, SEPA-PDS obtained lower prediction errors, indicating that the transfer model would be more robust and effective when using the parameters optimized by SEPA. Compared with other two commonly used calibration transfer methods of slope and bias correction (SBC) and spectral space transformation (SST), SEPA-PDS acquired more satisfactory results.

      PubDate: 2017-11-08T12:51:16Z
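      The SEPA criterion described in the abstract above (mean plus standard deviation of the RMSE values from randomly re-sampled sub-models) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a one-variable least-squares line stands in for the PLS/PDS sub-models, each RMSE is computed on the samples left out of the subset, and the function names, subset count and training fraction are hypothetical.

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for a PLS/PDS sub-model."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(model, xs, ys):
    """Root mean square error of the fitted line on the given samples."""
    slope, intercept = model
    return math.sqrt(statistics.fmean(
        (slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)))

def sepa_score(xs, ys, n_subsets=200, train_fraction=0.7, seed=0):
    """Return (mean + sd, mean, sd) of RMSE over random re-sampled sub-models.

    Assumption: each RMSE is evaluated on the samples not drawn into the subset.
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = min(len(xs) - 1, max(2, int(train_fraction * len(xs))))
    errors = []
    for _ in range(n_subsets):
        rng.shuffle(indices)
        train, held_out = indices[:n_train], indices[n_train:]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(rmse(model,
                           [xs[i] for i in held_out],
                           [ys[i] for i in held_out]))
    mean_rmse = statistics.fmean(errors)
    sd_rmse = statistics.stdev(errors)
    return mean_rmse + sd_rmse, mean_rmse, sd_rmse

# Hypothetical data: a noisy straight line.
noise = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 0.5) for x in xs]
print(sepa_score(xs, ys))
```

      In a parameter search, this score would be computed once per candidate setting (for example, per window size) and the setting with the lowest score selected, so that a setting with a low but unstable error is penalized.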
  • Dynamic hypersphere based support vector data description for batch
           process monitoring
    • Abstract: Publication date: Available online 6 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Jianlin Wang, Weimin Liu, Kepeng Qiu, Tao Yu, Liqiang Zhao
      Support Vector Data Description (SVDD) is an efficient monitoring method that captures a spherically shaped boundary around the normal batch data and sets a control limit related to the support vectors (SVs) for online monitoring. Using nonlinear transformation functions, SVDD constructs an irregular hypersphere in high-dimensional space. When the batch process is complicated, the monitoring accuracy decreases with the traditional control limit of SVDD. In this paper, dynamic hypersphere based support vector data description (DH-SVDD) is proposed for batch process monitoring. In the training process, a static hypersphere is built from the important SVs of the training dataset. In the testing process, a dynamic hypersphere is built from the important SVs of the dataset that combines the current test sample with the training dataset. A significant change between these two hyperspheres means that the current test sample is an outlier. Thus, DH-SVDD has a relatively high monitoring accuracy because it fully considers the relationship between the current test sample and the historical training dataset in high-dimensional space. The proposed DH-SVDD is compared with traditional methods such as K-chart-SVDD, max limit SVDD and validation limit SVDD. The effectiveness of DH-SVDD is also verified on a semiconductor etch process and a fed-batch penicillin fermentation process.

      PubDate: 2017-11-08T12:51:16Z
  • SRO_ANN: An integrated MatLab toolbox for multiple surface response
           optimization using radial basis functions
    • Abstract: Publication date: Available online 4 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Pablo C. Giordano, Héctor C. Goicoechea, Alejandro C. Olivieri
      SRO_ANN, a MatLab® toolbox for implementing multiple surface response optimization by artificial neural networks (SRO_ANN) is presented. Radial basis functions, a type of artificial neural networks, are applied through an easily managed graphical user interface. A detailed description of the interface is provided, including a simulated and two literature examples which allow one to show the potentiality of the software. The discussed experimental examples correspond to: (1) the maximization of the research octane number (RON) of fuels, influenced by three factors (reaction temperature, operating pressure and low liquid hourly space velocity), and (2) the optimization of the calcification process for diced tomatoes, evaluated through three different responses (calcium content, firmness and pH), which are affected by three factors (calcium concentration, solution temperature and treatment time). The results show that the application of a nonparametric tool can enhance the performance of optimization modeling tasks.

      PubDate: 2017-11-08T12:51:16Z
  • HYPER-Tools. A graphical user-friendly interface for multivariate and
           hyperspectral image analysis
    • Abstract: Publication date: Available online 4 November 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): José Manuel Amigo, Nabiallah Mobaraki
      HYPER-Tools is a new user-friendly graphical interface (GUI) especially designed for the analysis of multivariate and hyperspectral images. This easy-to-use interface works under the Matlab environment and integrates fundamental types of spectral and spatial pre-processing methods as well as the main chemometric tools (exploratory data analysis, clustering, regression, and classification) for multivariate and hyperspectral image analysis. The main features of HYPER-Tools are its powerful visualization tools and the interaction of the user with the interface, meaning that the user barely needs any Matlab skills to use it. Together with the GUI, several tutorials and videos are provided on the official website, showing the working procedure of HYPER-Tools step by step in different situations.

      PubDate: 2017-11-08T12:51:16Z
  • Industrial Mooney viscosity prediction using fast semi-supervised
           empirical model
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Wenjian Zheng, Xuejin Gao, Yi Liu, Limei Wang, Jianguo Yang, Zengliang Gao
      In industrial rubber mixing processes, the quality index (i.e., Mooney viscosity) cannot be measured online directly. Traditional data-driven empirical models for online prediction of the Mooney viscosity have not utilized the information hidden in the large amount of unlabeled data (e.g., process input variables during each mixing batch). A simple semi-supervised nonlinear soft sensor method for Mooney viscosity prediction is developed. It integrates the extreme learning machine (ELM) and graph Laplacian regularization into a unified modeling framework. The useful information in unlabeled data can be explored and introduced into the prediction model. Furthermore, a bagging-based ensemble strategy is combined with the semi-supervised ELM (SELM) to obtain more accurate predictions. Mooney viscosity prediction in an industrial internal mixer demonstrates the promising prediction performance of the proposed method, which incorporates the information in unlabeled data efficiently.

      PubDate: 2017-11-02T12:45:23Z
  • Authenticity assessment and protection of high-quality Nebbiolo-based
           Italian wines through machine learning
    • Abstract: Publication date: Available online 31 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Luigi Portinale, Giorgio Leonardi, Marco Arlorio, Jean Daniel Coïsson, Fabiano Travaglia, Monica Locatelli
      This paper discusses an intelligent data analysis approach, based on machine learning techniques, aimed at defining chemical data analysis methods for assessing the authenticity of some of the highest-value Nebbiolo-based wines from Piedmont (Italy) and protecting them against fake versions. This is an important and very relevant issue in the wine market, where commercial frauds related to such products are estimated to be worth millions of Euros. The objective is twofold: to show that the problem can be addressed without expensive and hyper-specialized wine chemical analyses, and to demonstrate the actual usefulness of classification algorithms for data mining and machine learning on the resulting chemical profiles. Following Wagstaff's proposal for practical exploitation of machine learning approaches, we describe how data have been collected and prepared for the production of different datasets, how suitable classification models have been identified, and how the interpretation of the results suggests the emergence of an active role for machine learning classification techniques, based on standard chemical profiling, in the assessment of the authenticity of the wines targeted by the study. Experiments have been performed both with datasets of real samples and with synthetic datasets artificially generated from real data.

      PubDate: 2017-11-02T12:45:23Z
  • Authentication and inference of seal stamps on Chinese traditional
           painting by using multivariate classification and near-infrared
    • Abstract: Publication date: Available online 31 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Zewei Chen, An Gu, Xin Zhang, Zhuoyong Zhang
      Chinese traditional paintings occupy an important position in Chinese cultural heritage, and it is very important, though difficult, for archaeologists and artists to identify their authenticity. In this work, near-infrared spectroscopy (NIRS) coupled with multivariate models was used for authenticating stamps of 12 seals on a Chinese traditional painting. The robustness of linear and nonlinear multivariate models, i.e. partial least squares-discriminant analysis (PLS-DA) and support vector machine (SVM), was evaluated by adding 5 different levels of noise (from 1% to 5%) to 3 original NIR spectra of each of the stamps. These noise-added spectral data were combined with the original spectra to establish identification models and then to evaluate the ability of the two models to tolerate noise disturbance. Accuracies of 92.6% and 100% were obtained with the linear PLS-DA and nonlinear SVM methods, respectively. The results demonstrate the feasibility of multivariate approaches for authenticating stamps of seals on the Chinese traditional painting. It is also important and necessary to infer the approximate eras of seal stamps on Chinese traditional paintings in archaeological study. By comparing the Mahalanobis distances between the 12 stamps on the painting, hierarchical cluster analysis (HCA) was adopted to assist the inference of eras for the unknown seal stamps on the painting. This work demonstrates that NIR spectroscopy combined with multivariate models can be utilized as a non-destructive approach for the authentication of stamps on Chinese traditional paintings. HCA can also provide useful information for inferring the time period of the stamps of unknown seals on the painting.

      PubDate: 2017-11-02T12:45:23Z
  • An improved multi-kernel RVM integrated with CEEMD for high-quality
           intervals prediction construction and its intelligent modeling application
    • Abstract: Publication date: Available online 31 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Yuan Xu, Mingqing Zhang, Qunxiong Zhu, Yanlin He
      Most existing modeling methods are based on point prediction. However, the accuracy of point prediction cannot meet actual demand because of the high noise, volatility, complexity and irregularity inherent in chemical process data. To solve this problem, this paper proposes a hybrid method for high-quality prediction intervals (PIs) that integrates complementary ensemble empirical mode decomposition (CEEMD), sample entropy (SE), and an improved multi-kernel relevance vector machine (RVM). The proposed PIs method consists of three steps. First, CEEMD is adopted to decompose the original data into several independent intrinsic mode functions (IMFs), and SE is then used to analyze the complexity of the extracted IMFs to obtain recombinant components. Second, an improved multi-kernel RVM (MRVM), in which the linear kernel and the Gaussian kernel are combined, is presented to predict the recombinant components independently. Third, the predicted components are aggregated into an ensemble result using another MRVM to construct the high-quality PIs. To verify the performance of the proposed PIs method, a purified terephthalic acid (PTA) solvent system is selected. Comparative simulation results demonstrate that the proposed PIs method performs markedly better in terms of coverage probability and sharpness in all the step predictions.

      PubDate: 2017-11-02T12:45:23Z
  • Detection of formaldehyde oxidation catalysis by MCR-ALS analysis of
           multiset ToF-SIMS data in positive and negative modes
    • Abstract: Publication date: Available online 25 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Xin Zhang, Nicolas Nuns, Jean-François Lamonier, Romà Tauler, Ludovic Duponchel
      The toxicity of formaldehyde is extremely high even at very low concentrations in air. We have used ToF-SIMS (Time of Flight Secondary Ion Mass Spectroscopy) to explore formaldehyde oxidation catalysed by birnessite (Kx (Mn4+,Mn3+)2O4) in the temperature range from 25 to 200 °C. ToF-SIMS is a powerful tool for the generation of chemical maps because of its high spatial resolution, its sensitivity and its relatively low acquisition time. In this method, mass analysis is performed on the positive and negative ions sputtered out of the uppermost layers of the analyzed sample. ToF-SIMS produces large raw data sets that are rich in chemical information but rather complex to analyze and interpret. This is why the application of chemometric methods is proposed in this work for the exploration of complex ToF-SIMS data sets. MCR-ALS (Multivariate Curve Resolution-Alternating Least Squares) has been applied to resolve both the positive and negative ions present in ToF-SIMS data sets, which are analyzed simultaneously using a data matrix augmentation strategy. Birnessite without catalyzed formaldehyde was first analyzed to resolve background contributions and use them to implement a selectivity constraint for MCR-ALS that removes them in the analysis of the formaldehyde oxidation data sets. Results show that, as the temperature increased, the concentration of formaldehyde combined with manganese ions decreased whereas the concentration of manganese oxide increased. Conformation changes of the manganese formaldehyde metal complex were then inferred. It is concluded that a metal complex species formed between two manganese ions and only one formaldehyde molecule is very unlikely to exist.

      PubDate: 2017-10-26T06:03:27Z
  • Kernel dynamic latent variable model for process monitoring with
           application to hot strip mill process
    • Abstract: Publication date: Available online 21 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Gang Li, Kaixiang Peng, Tao Yuan, Maiying Zhong
      Dynamic models are preferred over static models in the process monitoring of modern manufacturing. Compared with static models, dynamic models can reflect not only correlations but also causality among measurements and manipulated variables. Linear dynamic models are very common due to the simplicity of representation and parameter estimation. However, because of the natural nonlinearity of a dynamic process, it is ineffective to apply linear models over long periods and under varying conditions. Nonlinear dynamic models are hence desired under such circumstances. In this paper, a kernel dynamic latent variable (KDLV) model is proposed to describe the nonlinearity between original measurements and dynamic latent variables. This model extends the dynamic latent variable (DLV) model to the nonlinear case and retains all of its merits. To build such a model, a KDLV search algorithm is proposed to acquire key model parameters from data, and a KDLV modeling procedure is then derived to complete the whole model. After the KDLV model is trained from data, a corresponding detection strategy is also developed to perform fault detection. The KDLV-based fault detection is applied to the monitoring of a hot strip mill process, and a comparison study is also conducted with both DLV and DKPCA models.

      PubDate: 2017-10-26T06:03:27Z
  • The construction of D- and I-optimal designs for mixture experiments with
           linear constraints on the components
    • Abstract: Publication date: Available online 20 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Roelof Coetzer, Linda M. Haines
      Mixture experiments in which linear constraints are imposed on the components of the mixture are used extensively in practice. The problem of constructing designs which are in some sense optimal for this experimental setting is not straightforward. More specifically, the design space is a polytope embedded in a regular simplex. In the present paper, a new approach to the problem, which builds on the fact that points in the polytope can be represented as convex combinations of the vertices of that polytope, is introduced. Some theory underpinning this idea and rooted in the notion of barycentric coordinates is developed and algorithms for the construction of exact and approximate D- and I-optimal designs for the Scheffé model are delineated. The methodology is illustrated by means of examples involving three- and four-component mixtures.

      PubDate: 2017-10-26T06:03:27Z
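      The representation used in the abstract above, where points of the constrained mixture region are convex combinations of the polytope's vertices, can be sketched as follows. The three-component constraints (0.2 <= x1 <= 0.6, x2 >= 0.1, x3 >= 0.1) and the resulting four vertices are a hypothetical example, not taken from the paper, and the sketch only generates feasible candidate points; it does not construct D- or I-optimal designs.

```python
import random

# Vertices of a hypothetical constrained region inside the 3-component simplex:
# 0.2 <= x1 <= 0.6, x2 >= 0.1, x3 >= 0.1, x1 + x2 + x3 = 1.
VERTICES = [
    (0.6, 0.3, 0.1),
    (0.6, 0.1, 0.3),
    (0.2, 0.7, 0.1),
    (0.2, 0.1, 0.7),
]

def random_barycentric_weights(n, rng):
    """Non-negative weights summing to one (normalized exponential draws)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def convex_combination(vertices, weights):
    """Point with the given barycentric weights relative to the vertices."""
    dim = len(vertices[0])
    return tuple(sum(w * v[k] for w, v in zip(weights, vertices))
                 for k in range(dim))

def is_feasible(point, tol=1e-9):
    """Check the hypothetical linear constraints listed above."""
    x1, x2, x3 = point
    return (abs(x1 + x2 + x3 - 1.0) <= tol
            and 0.2 - tol <= x1 <= 0.6 + tol
            and x2 >= 0.1 - tol
            and x3 >= 0.1 - tol)

rng = random.Random(0)
candidates = [convex_combination(VERTICES,
                                 random_barycentric_weights(len(VERTICES), rng))
              for _ in range(5)]
for point in candidates:
    print(point, is_feasible(point))
```

      Every convex combination of the vertices satisfies the linear constraints automatically, which is what makes the weights a convenient search space. The points generated this way are not uniformly distributed over the region.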
  • ChemBCPP: A freely available web server for calculating commonly used
           physicochemical properties
    • Abstract: Publication date: 15 December 2017
      Source:Chemometrics and Intelligent Laboratory Systems, Volume 171
      Author(s): Jie Dong, Ning-Ning Wang, Ke-Yi Liu, Min-Feng Zhu, Yong-Huan Yun, Wen-Bin Zeng, Alex F. Chen, Dong-Sheng Cao
      The behavior of a chemical in humans or the environment mostly depends on several key physicochemical properties, such as aqueous solubility, octanol-water partition coefficient (logP), boiling point (BP), density, flash point (FP), viscosity, surface tension (ST), vapor pressure (VP) and melting point (MP). These properties are important in the environmental sciences and in drug discovery, for example for the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of medicinal compounds and for the risk assessment of problematic chemicals. At present, quantitative structure-property relationship (QSPR) models are widely applied to save time and money in the early stages of chemical research. Although some satisfactory models have already been obtained, most of them are not publicly available and thus cannot be directly applied to practical research projects. In this study, we developed a user-friendly web server named ChemBCPP that can be used to predict the aforementioned 8 important physicochemical properties and to calculate several other commonly used properties just by uploading a molecular structure or file. In addition, for a new chemical entity, users can obtain not only its predicted value but also a leverage value (h value), which can be used to evaluate the reliability of the predicted result. We believe that ChemBCPP could be widely applied in environmental science, chemical synthesis and drug ADMET fields that demand high-quality chemical properties. ChemBCPP could be freely available via

      PubDate: 2017-10-17T23:25:58Z
  • A note on the calculation of reference change values for two consecutive
           normally distributed laboratory results
    • Abstract: Publication date: Available online 13 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): M. Regis, Th.A. Postma, E.R. van den Heuvel
      Population reference limits are inadequate for personalized analyses of medical laboratory results. Reference change values have been recommended as a valid alternative in assessing individual changes across sequential measurements. In this paper, we investigate the accuracy (type I error) and power (complement of type II error) of reference change values under three different statistical modeling scenarios and show that oversimplified hypotheses lead to misinterpretation of laboratory results. The power is strongly affected by the statistical modeling assumptions: it is shown that positive shifts in the individual average health condition are difficult to detect, while it is much easier to identify negative shifts.

      PubDate: 2017-10-17T23:25:58Z
  • Calibration of a chemometric model by using a mathematical process model
           instead of offline measurements in case of a H. polymorpha cultivation
    • Abstract: Publication date: Available online 13 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): O. Paquet-Durand, T. Ladner, J. Büchs, B. Hitzmann
      Data-driven regression models such as Principal Component Regression (PCR) or Partial Least Squares Regression (PLS) in combination with spectroscopic methods are increasingly applied in bioprocess monitoring. However, as the name “data driven” implies, the calibration of these regression models requires a large amount of predictor (X) and response (Y) data. The predictor data in this case mostly consist of the spectroscopic data, which are easy to generate in large quantities, but the response data typically involve offline measurements in the laboratory that require much more effort to perform in large numbers. It will be shown that, in the case of a H. polymorpha cultivation performed in microtiter plates, those tedious offline measurements for response data can be replaced by a mathematical process model. Here, an exponential growth model in an ideal stirred tank reactor with lag time is applied, which has three parameters (lag time, specific growth rate, and yield coefficient). Furthermore, it will be demonstrated that knowledge about the parameter values of this process model is not required, as these values can be determined from 2D fluorescence spectra alone. The only required information about the cultivation is the predictor data (2D fluorescence spectra in this case) and the initial state of at least three different cultivation runs, that is, the initial values of the biomass and substrate (glycerol) concentrations. The smallest prediction errors for biomass and glycerol obtained by the new calibration procedure are 0.19 g/L and 0.79 g/L, respectively, compared with 0.19 g/L and 1.12 g/L if a classical procedure using offline measurements is applied. The inherently calculated process parameters of lag time, specific growth rate and yield coefficient are 4.77 h, 0.154 h−1, and 0.457 g/g, which are similar to the values determined with offline measurements and a least-squares fit (4.48 h, 0.139 h−1, and 0.466 g/g, respectively).

      PubDate: 2017-10-17T23:25:58Z
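      To make the three-parameter process model from the abstract above concrete, here is a minimal sketch of exponential growth with a lag time. The closed form is an assumption, since the abstract does not state the equations: biomass stays at its initial value during the lag phase, then grows exponentially at the specific growth rate until the substrate is used up, and substrate is linked to biomass through the yield coefficient. The parameter values are the ones reported in the abstract; the initial concentrations are hypothetical.

```python
import math

def simulate(t_hours, x0, s0, lag_time, growth_rate, yield_coeff):
    """Return (biomass, substrate) in g/L at time t_hours.

    Assumed model: no growth before lag_time, exponential growth afterwards,
    growth stops once all substrate has been converted to biomass.
    """
    max_biomass = x0 + yield_coeff * s0
    if t_hours <= lag_time:
        biomass = x0
    else:
        biomass = min(max_biomass,
                      x0 * math.exp(growth_rate * (t_hours - lag_time)))
    # Mass balance: each gram of biomass formed consumes 1 / yield_coeff
    # grams of substrate.
    substrate = max(0.0, s0 - (biomass - x0) / yield_coeff)
    return biomass, substrate

# Parameter values reported in the abstract.
PARAMS = dict(lag_time=4.77, growth_rate=0.154, yield_coeff=0.457)
# Hypothetical initial state (biomass and glycerol, g/L).
X0, S0 = 0.1, 10.0

for t in (0.0, 5.0, 10.0, 20.0, 30.0):
    print(t, simulate(t, X0, S0, **PARAMS))
```

      In the calibration approach described in the abstract, model outputs of this kind take the place of offline reference measurements as the response data for the regression model.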
  • Evaluation and assessment of homogeneity in images. Part 1: Unique
           homogeneity percentage for binary images
    • Abstract: Publication date: Available online 8 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Leandro de Moura França, José Manuel Amigo, Carlos Cairós, Manel Bautista, Maria Fernanda Pimentel
      Texture feature analysis is one of the most important approaches for the assessment of homogeneity in images. However, all such approaches either depend on comparison with a standardized set of images or require further multivariate models to predict or classify the images according to their features. In this first work, we propose an alternative and novel methodology to calculate a percentage of homogeneity using only the information contained in the image itself. This methodology is based on macropixel analysis theory and the generation of what is called the “homogeneity curve”. The homogeneity curve is explored in depth, covering what can be considered the most homogeneous and the most inhomogeneous distribution in every case. This first work postulates the theory and demonstrates its usefulness with several examples applied to binary images. This will provide a theoretical framework to fully understand the homogeneity curve, postulating a mathematical model to parametrize homogeneity and its plausible deviations.

      PubDate: 2017-10-11T03:12:18Z
  • Optimized self-adaptive model for assessment of soil organic matter using
           Fourier transform mid-infrared photoacoustic spectroscopy
    • Abstract: Publication date: Available online 7 October 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Fei Ma, Changwen Du, Jianmin Zhou, Yazhen Shen
      Advanced technologies, such as infrared spectroscopy, have been applied to develop rapid, cheap but accurate methods for the analysis of soil organic matter (SOM). However, unsatisfactory prediction accuracy resulting from strong soil heterogeneity limits practical application. In our previous work, a soil-identification-based self-adaptive partial least squares model (SAM), built using an identification algorithm and partial least squares regression (PLSR), made wider use possible. However, soil identification in the SAM needs further optimization. In this study, we designed an advanced optimal self-adaptive partial least squares model (OPT-SAM), a more general model to predict SOM. A total of 597 soil samples from China with large variances were collected, and the soil spectra were recorded using Fourier transform mid-infrared photoacoustic spectroscopy (FTIR-PAS). Five typical algorithms (correlation coefficients (CC), Euclidean distance (ED), Mahalanobis distance (MD), angle cosine (AC), and k-medoids (KM)) were considered for the identification in the SAM model. The results demonstrated that the performances of CC-SAM, ED-SAM, MD-SAM and AC-SAM were significantly improved in comparison with the SAM without identification (NI-SAM), but KM-SAM showed poor prediction. ED-SAM (R2 = 0.8890, RMSEP = 7.00 g kg−1, RPD = 2.96) showed the highest accuracy and robustness of all the algorithms and was the optimal model for soil identification and prediction, while CC-SAM (R2 = 0.8572, RMSEP = 7.89 g kg−1, RPD = 2.44) was an alternative choice, especially for prediction with different soil types.

      PubDate: 2017-10-11T03:12:18Z
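      Three of the similarity measures named in the abstract above (Euclidean distance, angle cosine and correlation coefficient) can be sketched in a few lines. How the SAM uses them is not described in the abstract; the sketch assumes, purely for illustration, that "identification" means ranking library spectra by similarity to a query spectrum. The function names and toy spectra are hypothetical, and the Mahalanobis and k-medoids variants are omitted.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance (ED) between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def angle_cosine(a, b):
    """Cosine of the angle (AC) between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def correlation_coefficient(a, b):
    """Pearson correlation (CC): the angle cosine of the mean-centered spectra."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return angle_cosine([x - mean_a for x in a], [y - mean_b for y in b])

def nearest_by_euclidean(query, library, k):
    """Indices of the k library spectra closest to the query (smallest ED first)."""
    ranked = sorted(range(len(library)),
                    key=lambda i: euclidean_distance(query, library[i]))
    return ranked[:k]

# Toy "spectra" with three channels each (illustration only).
library = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 2.0, 1.0],
    [1.1, 2.1, 3.1],
]
query = [1.0, 2.0, 3.0]
print(nearest_by_euclidean(query, library, k=2))
```

      The measures respond differently to scaling: a spectrum that is a scaled copy of the query has an angle cosine and correlation of one but a nonzero Euclidean distance, which is one reason the choice of measure can change which samples are identified as similar.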
  • Review on data-driven modeling and monitoring for plant-wide industrial
    • Abstract: Publication date: Available online 29 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Zhiqiang Ge
      Data-driven modeling and applications in plant-wide processes have recently attracted much attention in both academia and industry. This paper provides a systematic review of data-driven modeling and monitoring for plant-wide processes. First, methodologies of commonly used data processing and modeling procedures for the plant-wide process are presented. The state of research since 2000 on various aspects of plant-wide process monitoring is then reviewed in detail. After that, extensions, opportunities, and challenges in data-driven modeling for plant-wide process monitoring are discussed and highlighted for future research.

      PubDate: 2017-10-04T06:42:34Z
  • A new measure of regression model accuracy that considers applicability
    • Abstract: Publication date: Available online 29 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Hiromasa Kaneko
      The coefficient of determination and the root-mean-squared error (RMSE) evaluate regression models for test samples without considering the applicability domains (ADs) of the models. In this study, we propose a new measure for evaluating the predictive performance of regression models that considers their ADs. The purpose is not selecting the best regression model among various competing models, but determining an appropriate model group corresponding to the AD of each model. The proposed measure is the area under coverage and RMSE curve for coverage less than p% (p%-AUCR). It is confirmed that some regression models have global predictive ability and others have local predictive ability, and p%-AUCR is an appropriate indicator for selecting between local and global regression models depending on the coverage and considering the AD. Selecting a regression model for each sample or each chemical structure using p%-AUCR can improve the prediction accuracy of data sets.

      PubDate: 2017-10-04T06:42:34Z
  • Penalized logistic regression for classification and feature selection
           with its application to detection of two official species of Ganoderma
    • Abstract: Publication date: Available online 28 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Ying Zhu, Tuck Lee Tan, Wai Kwong Cheang
      Two species of Ganoderma, Ganoderma lucidum (G. lucidum) and Ganoderma sinense (G. sinense), have been widely used in traditional Chinese herbal medicine for their high medicinal value. Recent studies show that the two species differ in the levels of their main active compounds, triterpenoids, though both have antitumoral effects. An effective and simple analytical method using attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy to discriminate between the two species is essential for quality assurance and medicinal value estimation. In this study, three penalized logistic regression models (weighted least absolute shrinkage and selection operator (Lasso), elastic net and weighted fusion) using ATR-FTIR spectroscopy have been explored for the purpose of classification and interpretation. The weighted fusion model incorporating the spectral correlation structure allowed an automatic selection of a small number of spectral bands and achieved an excellent overall classification accuracy of 99% in discriminating spectra of G. lucidum from those of G. sinense. Its classification performance was superior to that of the weighted Lasso model and the elastic net model. The automatic selection of informative spectral features results in a substantial reduction in model complexity and an improvement in classification accuracy, and it is particularly helpful for the quantitative interpretation of the major chemical constituents of Ganoderma regarding its anti-cancer effects.

      PubDate: 2017-10-04T06:42:34Z
  • Concurrent probabilistic PLS regression model and its applications in
           process monitoring
    • Abstract: Publication date: Available online 28 September 2017
      Source:Chemometrics and Intelligent Laboratory Systems
      Author(s): Qinghua Li, Feng Pan, Zhonggai Zhao
      The probabilistic PLS (PPLS) algorithm derives the latent variables by maximizing the likelihood of input scores and quality scores, but imposes no constraint on the input residuals and the quality residuals, which implies that the residuals may still contain substantial information. Motivated by the concurrent PLS method, this paper proposes a concurrent PPLS (CPPLS) method to further decompose these residuals, yielding two more subspaces. In this method, the maximum-likelihood method together with the expectation-maximization (EM) algorithm is employed to develop the model, in which the variance of each variable explained by the latent variables is introduced to determine the number of latent variables. Based on the CPPLS model, five monitoring statistics, all based on the Mahalanobis norm, are constructed to evaluate the five subspaces decomposed by CPPLS, respectively.

      PubDate: 2017-10-04T06:42:34Z