

  Subjects -> STATISTICS (Total: 130 journals)
Showing 1 - 130 of 130 Journals sorted alphabetically
Advances in Complex Systems     Hybrid Journal   (Followers: 11)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 61)
Annals of Applied Statistics     Full-text available via subscription   (Followers: 39)
Applied Categorical Structures     Hybrid Journal   (Followers: 4)
Argumentation et analyse du discours     Open Access   (Followers: 11)
Asian Journal of Mathematics & Statistics     Open Access   (Followers: 8)
AStA Advances in Statistical Analysis     Hybrid Journal   (Followers: 4)
Australian & New Zealand Journal of Statistics     Hybrid Journal   (Followers: 13)
Bernoulli     Full-text available via subscription   (Followers: 9)
Biometrical Journal     Hybrid Journal   (Followers: 11)
Biometrics     Hybrid Journal   (Followers: 52)
British Journal of Mathematical and Statistical Psychology     Full-text available via subscription   (Followers: 18)
Building Simulation     Hybrid Journal   (Followers: 2)
Bulletin of Statistics     Full-text available via subscription   (Followers: 4)
CHANCE     Hybrid Journal   (Followers: 5)
Communications in Statistics - Simulation and Computation     Hybrid Journal   (Followers: 9)
Communications in Statistics - Theory and Methods     Hybrid Journal   (Followers: 11)
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 37)
Current Research in Biostatistics     Open Access   (Followers: 8)
Decisions in Economics and Finance     Hybrid Journal   (Followers: 11)
Demographic Research     Open Access   (Followers: 15)
Electronic Journal of Statistics     Open Access   (Followers: 8)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
ESAIM: Probability and Statistics     Full-text available via subscription   (Followers: 5)
Extremes     Hybrid Journal   (Followers: 2)
Fuzzy Optimization and Decision Making     Hybrid Journal   (Followers: 9)
Geneva Papers on Risk and Insurance - Issues and Practice     Hybrid Journal   (Followers: 13)
Handbook of Numerical Analysis     Full-text available via subscription   (Followers: 5)
Handbook of Statistics     Full-text available via subscription   (Followers: 7)
IEA World Energy Statistics and Balances     Full-text available via subscription   (Followers: 2)
International Journal of Computational Economics and Econometrics     Hybrid Journal   (Followers: 6)
International Journal of Quality, Statistics, and Reliability     Open Access   (Followers: 17)
International Journal of Stochastic Analysis     Open Access   (Followers: 3)
International Statistical Review     Hybrid Journal   (Followers: 13)
International Trade by Commodity Statistics - Statistiques du commerce international par produit     Full-text available via subscription  
Journal of Algebraic Combinatorics     Hybrid Journal   (Followers: 4)
Journal of Applied Statistics     Hybrid Journal   (Followers: 21)
Journal of Biopharmaceutical Statistics     Hybrid Journal   (Followers: 21)
Journal of Business & Economic Statistics     Full-text available via subscription   (Followers: 39, SJR: 3.664, CiteScore: 2)
Journal of Combinatorial Optimization     Hybrid Journal   (Followers: 7)
Journal of Computational & Graphical Statistics     Full-text available via subscription   (Followers: 20)
Journal of Econometrics     Hybrid Journal   (Followers: 84)
Journal of Educational and Behavioral Statistics     Hybrid Journal   (Followers: 6)
Journal of Forecasting     Hybrid Journal   (Followers: 17)
Journal of Global Optimization     Hybrid Journal   (Followers: 7)
Journal of Interactive Marketing     Hybrid Journal   (Followers: 10)
Journal of Mathematics and Statistics     Open Access   (Followers: 8)
Journal of Nonparametric Statistics     Hybrid Journal   (Followers: 6)
Journal of Probability and Statistics     Open Access   (Followers: 10)
Journal of Risk and Uncertainty     Hybrid Journal   (Followers: 33)
Journal of Statistical and Econometric Methods     Open Access   (Followers: 5)
Journal of Statistical Physics     Hybrid Journal   (Followers: 13)
Journal of Statistical Planning and Inference     Hybrid Journal   (Followers: 8)
Journal of Statistical Software     Open Access   (Followers: 21, SJR: 13.802, CiteScore: 16)
Journal of the American Statistical Association     Full-text available via subscription   (Followers: 72, SJR: 3.746, CiteScore: 2)
Journal of the Korean Statistical Society     Hybrid Journal   (Followers: 1)
Journal of the Royal Statistical Society Series C (Applied Statistics)     Hybrid Journal   (Followers: 33)
Journal of the Royal Statistical Society, Series A (Statistics in Society)     Hybrid Journal   (Followers: 27)
Journal of the Royal Statistical Society, Series B (Statistical Methodology)     Hybrid Journal   (Followers: 43)
Journal of Theoretical Probability     Hybrid Journal   (Followers: 3)
Journal of Time Series Analysis     Hybrid Journal   (Followers: 16)
Journal of Urbanism: International Research on Placemaking and Urban Sustainability     Hybrid Journal   (Followers: 30)
Law, Probability and Risk     Hybrid Journal   (Followers: 8)
Lifetime Data Analysis     Hybrid Journal   (Followers: 7)
Mathematical Methods of Statistics     Hybrid Journal   (Followers: 4)
Measurement Interdisciplinary Research and Perspectives     Hybrid Journal   (Followers: 1)
Metrika     Hybrid Journal   (Followers: 4)
Modelling of Mechanical Systems     Full-text available via subscription   (Followers: 1)
Monte Carlo Methods and Applications     Hybrid Journal   (Followers: 6)
Monthly Statistics of International Trade - Statistiques mensuelles du commerce international     Full-text available via subscription   (Followers: 2)
Multivariate Behavioral Research     Hybrid Journal   (Followers: 5)
Optimization Letters     Hybrid Journal   (Followers: 2)
Optimization Methods and Software     Hybrid Journal   (Followers: 8)
Oxford Bulletin of Economics and Statistics     Hybrid Journal   (Followers: 34)
Pharmaceutical Statistics     Hybrid Journal   (Followers: 17)
Probability Surveys     Open Access   (Followers: 4)
Queueing Systems     Hybrid Journal   (Followers: 7)
Research Synthesis Methods     Hybrid Journal   (Followers: 8)
Review of Economics and Statistics     Hybrid Journal   (Followers: 128)
Review of Socionetwork Strategies     Hybrid Journal  
Risk Management     Hybrid Journal   (Followers: 15)
Sankhya A     Hybrid Journal   (Followers: 2)
Scandinavian Journal of Statistics     Hybrid Journal   (Followers: 9)
Sequential Analysis: Design Methods and Applications     Hybrid Journal  
Significance     Hybrid Journal   (Followers: 7)
Sociological Methods & Research     Hybrid Journal   (Followers: 38)
SourceOCDE Comptes nationaux et Statistiques retrospectives     Full-text available via subscription  
SourceOCDE Statistiques : Sources et methodes     Full-text available via subscription  
SourceOECD Bank Profitability Statistics - SourceOCDE Rentabilite des banques     Full-text available via subscription   (Followers: 1)
SourceOECD Insurance Statistics - SourceOCDE Statistiques d'assurance     Full-text available via subscription   (Followers: 2)
SourceOECD Main Economic Indicators - SourceOCDE Principaux indicateurs economiques     Full-text available via subscription   (Followers: 1)
SourceOECD Measuring Globalisation Statistics - SourceOCDE Mesurer la mondialisation - Base de donnees statistiques     Full-text available via subscription  
SourceOECD Monthly Statistics of International Trade     Full-text available via subscription   (Followers: 1)
SourceOECD National Accounts & Historical Statistics     Full-text available via subscription  
SourceOECD OECD Economic Outlook Database - SourceOCDE Statistiques des Perspectives economiques de l'OCDE     Full-text available via subscription   (Followers: 2)
SourceOECD Science and Technology Statistics - SourceOCDE Base de donnees des sciences et de la technologie     Full-text available via subscription  
SourceOECD Statistics Sources & Methods     Full-text available via subscription   (Followers: 1)
SourceOECD Taxing Wages Statistics - SourceOCDE Statistiques des impots sur les salaires     Full-text available via subscription  
Stata Journal     Full-text available via subscription   (Followers: 9)
Statistica Neerlandica     Hybrid Journal   (Followers: 1)
Statistical Applications in Genetics and Molecular Biology     Hybrid Journal   (Followers: 5)
Statistical Communications in Infectious Diseases     Hybrid Journal  
Statistical Inference for Stochastic Processes     Hybrid Journal   (Followers: 3)
Statistical Methodology     Hybrid Journal   (Followers: 7)
Statistical Methods and Applications     Hybrid Journal   (Followers: 6)
Statistical Methods in Medical Research     Hybrid Journal   (Followers: 27)
Statistical Modelling     Hybrid Journal   (Followers: 19)
Statistical Papers     Hybrid Journal   (Followers: 4)
Statistical Science     Full-text available via subscription   (Followers: 13)
Statistics & Probability Letters     Hybrid Journal   (Followers: 13)
Statistics & Risk Modeling     Hybrid Journal   (Followers: 3)
Statistics and Computing     Hybrid Journal   (Followers: 13)
Statistics and Economics     Open Access   (Followers: 1)
Statistics in Medicine     Hybrid Journal   (Followers: 198)
Statistics, Politics and Policy     Hybrid Journal   (Followers: 6)
Statistics: A Journal of Theoretical and Applied Statistics     Hybrid Journal   (Followers: 15)
Stochastic Models     Hybrid Journal   (Followers: 3)
Stochastics: An International Journal of Probability and Stochastic Processes (formerly Stochastics and Stochastics Reports)     Hybrid Journal   (Followers: 2)
Structural and Multidisciplinary Optimization     Hybrid Journal   (Followers: 12)
Teaching Statistics     Hybrid Journal   (Followers: 7)
Technology Innovations in Statistics Education (TISE)     Open Access   (Followers: 2)
TEST     Hybrid Journal   (Followers: 3)
The American Statistician     Full-text available via subscription   (Followers: 23)
The Annals of Applied Probability     Full-text available via subscription   (Followers: 8)
The Annals of Probability     Full-text available via subscription   (Followers: 10)
The Annals of Statistics     Full-text available via subscription   (Followers: 34)
The Canadian Journal of Statistics / La Revue Canadienne de Statistique     Hybrid Journal   (Followers: 11)
Wiley Interdisciplinary Reviews - Computational Statistics     Hybrid Journal   (Followers: 1)


Advances in Data Analysis and Classification
Journal Prestige (SJR): 1.09
Citation Impact (citeScore): 1
Number of Followers: 61  
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 1862-5355 - ISSN (Online) 1862-5347
Published by Springer-Verlag Homepage  [2657 journals]
  • REMAXINT: a two-mode clustering-based method for statistical inference on
           two-way interaction
    • Abstract: We present a novel method, REMAXINT, that captures the gist of two-way interaction in row by column (i.e., two-mode) data, with one observation per cell. REMAXINT is a probabilistic two-mode clustering model that yields two-mode partitions with maximal interaction between row and column clusters. For estimation of the parameters of REMAXINT, we maximize a conditional classification likelihood in which the random row (or column) main effects are conditioned out. For testing the null hypothesis of no interaction between row and column clusters, we propose a \(max-F\) test statistic and discuss its properties. We develop a Monte Carlo approach to obtain its sampling distribution under the null hypothesis. We evaluate the performance of the method through simulation studies. Specifically, for selected values of data size and (true) numbers of clusters, we obtain critical values of the \(max-F\) statistic, determine the empirical Type I error rate of the proposed inferential procedure and study its power to reject the null hypothesis. Next, we show that the novel method is useful in a variety of applications by presenting two empirical case studies and end with some concluding remarks.
      PubDate: 2021-04-27
  • Hierarchical clustering with discrete latent variable models and the
           integrated classification likelihood
    • Abstract: Finding a set of nested partitions of a dataset is useful to uncover relevant structure at different scales, and is often dealt with using a data-dependent methodology. In this paper, we introduce a general two-step methodology for model-based hierarchical clustering. Considering the integrated classification likelihood criterion as an objective function, this work applies to any discrete latent variable model (DLVM) where this quantity is tractable. The first step of the methodology involves maximizing the criterion with respect to the partition. Addressing the known problem of sub-optimal local maxima found by greedy hill climbing heuristics, we introduce a new hybrid algorithm based on a genetic algorithm which allows the space of solutions to be explored efficiently. The resulting algorithm carefully combines and merges different solutions, and allows the joint inference of the number K of clusters as well as the clusters themselves. Starting from this natural partition, the second step of the methodology is based on a bottom-up greedy procedure to extract a hierarchy of clusters. In a Bayesian context, this is achieved by considering the Dirichlet cluster proportion prior parameter \(\alpha \) as a regularization term controlling the granularity of the clustering. A new approximation of the criterion is derived as a log-linear function of \(\alpha \) , enabling a simple functional form of the merge decision criterion. This second step allows the exploration of the clustering at coarser scales. The proposed approach is compared with existing strategies on simulated as well as real settings, and its results are shown to be particularly relevant. A reference implementation of this work is available in the R-package greed accompanying the paper.
      PubDate: 2021-04-13
  • Nonlinear dimension reduction for conditional quantiles
    • Abstract: In practice, data often display heteroscedasticity, making quantile regression (QR) a more appropriate methodology. Modeling the data, while maintaining a flexible nonparametric fitting, requires smoothing over a high-dimensional space which might not be feasible when the number of the predictor variables is large. This problem necessitates the use of dimension reduction techniques for conditional quantiles, which focus on extracting linear combinations of the predictor variables without losing any information about the conditional quantile. However, nonlinear features can achieve greater dimension reduction. We, therefore, present the first nonlinear extension of the linear algorithm for estimating the central quantile subspace (CQS) using kernel data. First, we describe the feature CQS within the framework of reproducing kernel Hilbert space, and second, we illustrate its performance through simulation examples and real data applications. Specifically, we emphasize visualizing various aspects of the data structure using the first two feature extractors, and we highlight the ability to combine the proposed algorithm with classification and regression linear algorithms. The results show that the feature CQS is an effective kernel tool for performing nonlinear dimension reduction for conditional quantiles.
      PubDate: 2021-03-23
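Conditional quantiles, the target of the dimension reduction above, are the minimizers of the check (pinball) loss. As a minimal illustration of that loss (the function name and signature are my own, not from the paper):

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check (pinball) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        r = y - q
        # asymmetric penalty: under-prediction weighted tau, over-prediction 1 - tau
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y_true)
```

For tau = 0.9, under-predicting by one unit costs 0.9 while over-predicting by one unit costs only 0.1, which is why minimizing this loss targets the 0.9 conditional quantile rather than the mean.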
  • Learning multivariate shapelets with multi-layer neural networks for
           interpretable time-series classification
    • Abstract: Shapelets are discriminative subsequences extracted from time-series data. Classifiers using shapelets have proven to achieve performances competitive to state-of-the-art methods, while enhancing the model’s interpretability. While a lot of research has been done for univariate time-series shapelets, extensions for the multivariate setting have not yet received much attention. To extend shapelets-based classification to a multidimensional setting, we developed a novel architecture for shapelets learning, by embedding them as trainable weights in a multi-layer Neural Network. We also investigated the introduction of a novel learning strategy for the shapelets, comprising two additional terms in the optimization goal, to retrieve a reduced set of uncorrelated shapelets. This paper describes the proposed architecture and presents results on ten publicly available benchmark datasets, as well as a comparison with existing state-of-the-art methods. Moreover, the proposed optimization objective leads the model to automatically select smaller sets of uncorrelated shapelets, thus requiring no additional manual optimization on typically important hyper-parameters such as number and length of shapelets. The results show how the proposed approach achieves competitive performance across the datasets, and always leads to a significant reduction in the number of shapelets used. This can make it faster for a domain expert to match shapelets to real patterns, thus enhancing the interpretability of the model. Finally, since the shapelets learnt during training can be extracted from the model they can serve as meaningful insights on the classifier’s decisions and the interactions between different dimensions.
      PubDate: 2021-03-04
      DOI: 10.1007/s11634-021-00437-8
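Shapelet-based classifiers score a series by its minimal distance to each shapelet. A plain sliding-window sketch of that distance (not the paper's trainable-weight network; names are my own):

```python
def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any
    equal-length sliding window of the series."""
    m = len(shapelet)
    best = float("inf")
    for start in range(len(series) - m + 1):
        d = sum((series[start + j] - shapelet[j]) ** 2 for j in range(m)) ** 0.5
        best = min(best, d)
    return best
```

A vector of such distances, one per shapelet, is the feature representation a downstream classifier consumes; the paper's contribution is learning the shapelets themselves by gradient descent rather than extracting them from the training series.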
  • Editorial for ADAC issue 1 of volume 15 (2021)
    • PubDate: 2021-03-01
  • A cost-sensitive constrained Lasso
    • Abstract: The Lasso has become a benchmark data analysis procedure, and numerous variants have been proposed in the literature. Although Lasso formulations are stated so that overall prediction error is optimized, they allow no direct control over the prediction accuracy for particular individuals of interest. In this work we propose a novel version of the Lasso in which quadratic performance constraints are added to Lasso-based objective functions, in such a way that threshold values are set to bound the prediction errors in the different groups of interest (not necessarily disjoint). As a result, a constrained sparse regression model is defined by a nonlinear optimization problem. This cost-sensitive constrained Lasso has a direct application in heterogeneous samples where data are collected from distinct sources, as it is standard in many biomedical contexts. Both theoretical properties and empirical studies concerning the new method are explored in this paper. In addition, two illustrations of the method on biomedical and sociological contexts are considered.
      PubDate: 2021-03-01
  • Kappa coefficients for dichotomous-nominal classifications
    • Abstract: Two types of nominal classifications are distinguished, namely regular nominal classifications and dichotomous-nominal classifications. The first type does not include an ‘absence’ category (for example, no disorder), whereas the second type does include an ‘absence’ category. Cohen’s unweighted kappa can be used to quantify agreement between two regular nominal classifications with the same categories, but there are no coefficients for assessing agreement between two dichotomous-nominal classifications. Kappa coefficients for dichotomous-nominal classifications with identical categories are defined. All coefficients proposed belong to a one-parameter family. It is studied how the coefficients for dichotomous-nominal classifications are related and if the values of the coefficients depend on the number of categories. It turns out that the values of the new kappa coefficients can be strictly ordered in precisely two ways. The orderings suggest that the new coefficients are measuring the same thing, but to a different extent. If one accepts the use of magnitude guidelines, it is recommended to use stricter criteria for the new coefficients that tend to produce higher values.
      PubDate: 2021-03-01
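The starting point of the abstract above is Cohen's unweighted kappa for regular nominal classifications. A minimal sketch of that standard coefficient (the paper's dichotomous-nominal family is not reproduced here):

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa from a square agreement table:
    table[i][j] = number of items rater 1 assigned to category i
    and rater 2 assigned to category j."""
    n = float(sum(sum(row) for row in table))
    k = len(table)
    p_obs = sum(table[i][i] for i in range(k)) / n          # observed agreement
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(row_marg[i] * col_marg[i] for i in range(k))  # chance agreement
    return (p_obs - p_exp) / (1.0 - p_exp)
```

Perfect agreement gives kappa = 1 and agreement at chance level gives kappa = 0; the paper's coefficients modify how an 'absence' category enters this calculation.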
  • Interval forecasts based on regression trees for streaming data
    • Abstract: In forecasting, we often require interval forecasts instead of just a specific point forecast. To track streaming data effectively, this interval forecast should reliably cover the observed data and yet be as narrow as possible. To achieve this, we propose two methods based on regression trees: one ensemble method and one method based on a single tree. For the ensemble method, we use weighted results from the most recent models, and for the single-tree method, we retain one model until it becomes necessary to train a new model. We propose a novel method to update the interval forecast adaptively using root mean square prediction errors calculated from the latest data batch. We use wavelet-transformed data to capture long-term variable information and conditional inference trees for the underlying regression tree model. Results show that both methods perform well, having good coverage without the intervals being excessively wide. When the underlying data generation mechanism changes, their performance is initially affected but can recover relatively quickly as time proceeds. The method based on a single tree performs best in computational (CPU) time compared to the ensemble method. When compared to ARIMA and GARCH modelling, our methods achieve better or similar coverage and width but require considerably less CPU time.
      PubDate: 2021-03-01
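The adaptive update above sets interval width from root mean square prediction errors on the latest batch. A minimal sketch of such an RMSPE-driven symmetric interval, where the multiplier z is an assumed tuning constant of my own, not the paper's exact rule:

```python
def rmspe(actuals, preds):
    """Root mean square prediction error over the latest data batch."""
    return (sum((a - p) ** 2 for a, p in zip(actuals, preds)) / len(actuals)) ** 0.5

def interval_forecast(point_forecast, actuals, preds, z=2.0):
    """Symmetric interval around a point forecast; the half-width is the
    latest-batch RMSPE scaled by an assumed multiplier z."""
    half = z * rmspe(actuals, preds)
    return point_forecast - half, point_forecast + half
```

Because the RMSPE is recomputed on each new batch, the interval widens automatically after a change in the data-generating mechanism and narrows again as the model recovers.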
  • A combination of k-means and DBSCAN algorithm for solving the multiple
           generalized circle detection problem
    • Abstract: Motivated by the problem of identifying rod-shaped particles (e.g. bacilliform bacteria), in this paper we consider the multiple generalized circle detection problem. We propose a method for solving this problem that is based on center-based clustering, where cluster-centers are generalized circles. An efficient algorithm is proposed which is based on a modification of the well-known k-means algorithm for generalized circles as cluster-centers. In doing so, it is extremely important to have a good initial approximation. For the purpose of recognizing detected generalized circles, a QAD-indicator is proposed. A new DBC-index is also proposed, which is specialized for such situations. The recognition process is initiated by searching for a good initial partition using the DBSCAN algorithm. If the QAD-indicator shows that the generalized circle-cluster-center does not recognize the searched generalized circle for some cluster, the procedure continues searching for corresponding initial generalized circles for these clusters using the Incremental algorithm; corresponding generalized circle-cluster-centers are then calculated for the obtained clusters. This typically happens when a data point set stems from intersecting or touching generalized circles. The method is illustrated and tested on different artificial data sets coming from a number of generalized circles and real images.
      PubDate: 2021-03-01
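The cluster-center update in such a scheme must fit a circle to the points of a cluster. The paper's generalized-circle update is not reproduced here; as an illustration, a classic algebraic (Kåsa) least-squares fit for ordinary circles:

```python
import numpy as np

def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit.
    Solves x^2 + y^2 = 2a x + 2b y + c in the least-squares sense;
    the center is (a, b) and the radius is sqrt(c + a^2 + b^2)."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(pts))])
    rhs = x ** 2 + y ** 2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return (a, b), np.sqrt(c + a ** 2 + b ** 2)
```

The fit is linear and therefore cheap enough to run once per cluster per iteration, which is the property a k-means-style loop needs from its center update.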
  • A novel semi-supervised support vector machine with asymmetric squared
           loss
    • Abstract: Laplacian support vector machine (LapSVM), which is based on the semi-supervised manifold regularization learning framework, performs better than the standard SVM, especially for the case where the supervised information is insufficient. However, the use of hinge loss leads to the sensitivity of LapSVM to noise around the decision boundary. To enhance the performance of LapSVM, we present a novel semi-supervised SVM with the asymmetric squared loss (asy-LapSVM) which deals with the expectile distance and is less sensitive to noise-corrupted data. We further present a simple and efficient functional iterative method to solve the proposed asy-LapSVM; in addition, we demonstrate the convergence of the functional iterative method both theoretically and experimentally. Numerical experiments performed on a number of commonly used datasets with noise of different variances demonstrate the validity of the proposed asy-LapSVM and the feasibility of the presented functional iterative method.
      PubDate: 2021-03-01
  • Clustering discrete-valued time series
    • Abstract: There is a need for the development of models that are able to account for discreteness in data, along with its time series properties and correlation. Our focus falls on INteger-valued AutoRegressive (INAR) type models. The INAR type models can be used in conjunction with existing model-based clustering techniques to cluster discrete-valued time series data. With the use of a finite mixture model, several existing techniques such as the selection of the number of clusters, estimation using expectation-maximization and model selection are applicable. The proposed model is then demonstrated on real data to illustrate its clustering applications.
      PubDate: 2021-03-01
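The INAR models mentioned above replace the multiplication in an AR(1) recursion with binomial thinning so the series stays integer-valued. A minimal simulation sketch of an INAR(1) process with Poisson innovations (function names and the sampler choice are my own):

```python
import math
import random

def simulate_inar1(alpha, lam, n, x0=0, seed=1):
    """Simulate an INAR(1) count series X_t = (alpha o X_{t-1}) + eps_t,
    where 'o' is binomial thinning and eps_t ~ Poisson(lam)."""
    rng = random.Random(seed)

    def poisson(mu):
        # Knuth's multiplicative sampler; adequate for small mu
        limit, k, p = math.exp(-mu), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1

    x, series = x0, []
    for _ in range(n):
        # binomial thinning: each of the x current counts survives w.p. alpha
        survivors = sum(1 for _ in range(x) if rng.random() < alpha)
        x = survivors + poisson(lam)
        series.append(x)
    return series
```

The stationary mean is lam / (1 - alpha), so simulated series from different parameter settings give the kind of discrete-valued, autocorrelated data the clustering method is designed for.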
  • From-below Boolean matrix factorization algorithm based on MDL
    • Abstract: During the past few years Boolean matrix factorization (BMF) has become an important direction in data analysis. The minimum description length (MDL) principle was successfully adapted in BMF for model order selection. Nevertheless, a BMF algorithm that performs well w.r.t. standard BMF measures has been missing. In this paper, we propose a novel from-below Boolean matrix factorization algorithm based on formal concept analysis. The algorithm utilizes the MDL principle as a criterion for factor selection. In various experiments we show that the proposed algorithm outperforms existing state-of-the-art BMF algorithms from several standpoints.
      PubDate: 2021-03-01
  • Simultaneous dimension reduction and clustering via the NMF-EM algorithm
    • Abstract: Mixture models are among the most popular tools for clustering. However, when the dimension and the number of clusters are large, estimating the clusters becomes challenging, as does their interpretation. Restrictions on the parameters can be used to reduce the dimension. An example is given by mixtures of factor analyzers (MFA) for Gaussian mixtures. The extension of MFA to non-Gaussian mixtures is not straightforward. We propose a new constraint for parameters in non-Gaussian mixture model: the K components parameters are combinations of elements from a small dictionary, say H elements, with \(H \ll K\) . Including a nonnegative matrix factorization (NMF) in the EM algorithm allows us to simultaneously estimate the dictionary and the parameters of the mixture. We propose the acronym NMF-EM for this algorithm, implemented in the R package nmfem. This original approach is motivated by passengers clustering from ticketing data: we apply NMF-EM to data from two Transdev public transport networks. In this case, the words are easily interpreted as typical slots in a timetable.
      PubDate: 2021-03-01
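The NMF step embedded in the EM algorithm above factors a nonnegative parameter matrix into a small dictionary and combination weights. As a standalone illustration of plain NMF, a minimal Lee-Seung multiplicative-update sketch (not the paper's NMF-EM algorithm):

```python
import numpy as np

def nmf(X, rank, n_iter=300, seed=0):
    """Lee-Seung multiplicative updates for X ~ W @ H under Frobenius loss.
    X must be entrywise nonnegative."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    for _ in range(n_iter):
        # multiplicative updates keep W and H nonnegative by construction
        H *= (W.T @ X) / (W.T @ W @ H + 1e-12)
        W *= (X @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H
```

In NMF-EM the rows of H play the role of the dictionary of size H from the abstract, and each mixture component's parameters are a nonnegative combination of those dictionary elements.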
  • Efficient regularized spectral data embedding
    • Abstract: Data embedding (DE) or dimensionality reduction techniques are particularly well suited to embedding high-dimensional data into a space that in most cases will have just two dimensions. Low-dimensional space, in which data samples (data points) can more easily be visualized, is also often used for learning methods such as clustering. Sometimes, however, DE will identify dimensions that contribute little in terms of the clustering structures that they reveal. In this paper we look at regularized data embedding by clustering, and we propose a simultaneous learning approach for DE and clustering that reinforces the relationships between these two tasks. Our approach is based on a matrix decomposition technique for learning a spectral DE, a cluster membership matrix, and a rotation matrix that closely maps out the continuous spectral embedding, in order to obtain a good clustering solution. We compare our approach with some traditional clustering methods and perform numerical experiments on a collection of benchmark datasets to demonstrate its potential.
      PubDate: 2021-03-01
  • A robust spatial autoregressive scalar-on-function regression with
           t-distribution
    • Abstract: Modelling functional data in the presence of spatial dependence is of great practical importance as exemplified by applications in the fields of demography, economics and geography, and has received much attention recently. However, for the classical scalar-on-function regression (SoFR) with functional covariates and scalar responses, relatively little of the literature is dedicated to this area, which merits further research. We propose a robust spatial autoregressive scalar-on-function regression by incorporating a spatial autoregressive parameter and a spatial weight matrix into the SoFR to accommodate spatial dependencies among individuals. The t-distribution assumption for the error terms makes our model more robust than the classical spatial autoregressive models under normal distributions. We estimate the model by first projecting the functional predictor onto a functional space spanned by an orthonormal functional basis and then applying an expectation–maximization algorithm. Simulation studies show that our estimators are efficient, and are superior in the scenario with spatial correlation and heavy tailed error terms. A real weather dataset demonstrates the superiority of our model to the SoFR in the case of spatial dependence.
      PubDate: 2021-03-01
  • Robust regression with compositional covariates including cellwise
           outliers
    • Abstract: We propose a robust procedure to estimate a linear regression model with compositional and real-valued explanatory variables. The proposed procedure is designed to be robust against individual outlying cells in the data matrix (cellwise outliers), as well as entire outlying observations (rowwise outliers). Cellwise outliers are first filtered and then imputed by robust estimates. Afterwards, rowwise robust compositional regression is performed to obtain model coefficient estimates. Simulations show that the procedure generally outperforms a traditional rowwise-only robust regression method (MM-estimator). Moreover, our procedure yields better or comparable results to recently proposed cellwise robust regression methods (shooting S-estimator, 3-step regression) while it is preferable for interpretation through the use of appropriate coordinate systems for compositional data. An application to bio-environmental data reveals that the proposed procedure—compared to other regression methods—leads to conclusions that are best aligned with established scientific knowledge.
      PubDate: 2021-02-24
      DOI: 10.1007/s11634-021-00436-9
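Compositional covariates carry only relative information, so they are mapped into real coordinates before regression. The simplest such map is the centered log-ratio (clr) transform, sketched below; note the paper itself prefers interpretable coordinate systems such as pivot (ilr) coordinates, so this is only the basic idea:

```python
import numpy as np

def clr(parts):
    """Centered log-ratio transform of a composition with positive parts.
    The resulting real coordinates sum to zero."""
    logx = np.log(np.asarray(parts, dtype=float))
    return logx - logx.mean()
```

Because clr coordinates sum to zero they are collinear, which is one reason robust compositional regression methods work with isometric (ilr) coordinates instead; the log-ratio principle is the same.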
  • Sparse principal component regression via singular value decomposition
    • Abstract: Principal component regression (PCR) is a two-stage procedure: the first stage performs principal component analysis (PCA) and the second stage builds a regression model whose explanatory variables are the principal components obtained in the first stage. Since PCA is performed using only explanatory variables, the principal components have no information about the response variable. To address this problem, we present a one-stage procedure for PCR based on a singular value decomposition approach. Our approach is based upon two loss functions, which are a regression loss and a PCA loss from the singular value decomposition, with sparse regularization. The proposed method enables us to obtain principal component loadings that include information about both explanatory variables and a response variable. An estimation algorithm is developed by using the alternating direction method of multipliers. We conduct numerical studies to show the effectiveness of the proposed method.
      PubDate: 2021-02-08
      DOI: 10.1007/s11634-020-00435-2
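The abstract contrasts its one-stage sparse method with classical two-stage PCR. The two-stage baseline it improves on is easy to sketch via the SVD (function name and signature are my own):

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Classical two-stage principal component regression:
    PCA on centered X via SVD, then least squares of y on the leading scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T          # principal directions (loadings)
    scores = Xc @ V                  # principal component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                 # coefficients in the original predictor space
    return beta, y_mean - x_mean @ beta
```

Here the loadings V are computed from X alone, which is exactly the drawback the abstract points out: the proposed one-stage method instead couples a regression loss with the SVD-based PCA loss so the loadings also reflect the response.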
  • PCA-KL: a parametric dimensionality reduction approach for unsupervised
           metric learning
    • Abstract: Dimensionality reduction algorithms are powerful mathematical tools for data analysis and visualization. In many pattern recognition applications, a feature extraction step is often required to mitigate the curse of dimensionality, a collection of negative effects caused by an arbitrary increase in the number of features in classification tasks. Principal Component Analysis (PCA) is a classical statistical method that creates new features based on linear combinations of the original ones through the eigenvectors of the covariance matrix. In this paper, we propose PCA-KL, a parametric dimensionality reduction algorithm for unsupervised metric learning, based on the computation of the entropic covariance matrix, a surrogate for the covariance matrix of the data obtained in terms of the relative entropy between local Gaussian distributions instead of the usual Euclidean distance between the data points. Numerical experiments with several real datasets show that the proposed method is capable of producing better defined clusters and also higher classification accuracy in comparison to regular PCA and several manifold learning algorithms, making PCA-KL a promising alternative for unsupervised metric learning.
      PubDate: 2021-01-07
      DOI: 10.1007/s11634-020-00434-3
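The entropic covariance matrix above is built from relative entropies between local Gaussian distributions. For univariate Gaussians that relative entropy has a closed form, sketched here as a minimal building block (the paper's multivariate local-model construction is not reproduced):

```python
from math import log

def kl_gauss(mu1, var1, mu2, var2):
    """Closed-form KL divergence KL(N(mu1, var1) || N(mu2, var2))
    between two univariate Gaussian distributions."""
    return 0.5 * (log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)
```

Unlike the Euclidean distance between two points, this quantity is asymmetric and accounts for the local spread of the data, which is what lets PCA-KL behave as a metric-learning variant of PCA.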
  • Functional data clustering by projection into latent generalized
           hyperbolic subspaces
    • Abstract: We introduce a latent subspace model which facilitates model-based clustering of functional data. Flexible clustering is attained by imposing joint generalized hyperbolic distributions on projections of basis expansion coefficients into group-specific subspaces. The model acquires parsimony by assuming these subspaces are of relatively low dimension. Parameter estimation is done through a multicycle ECM algorithm. Applications to simulated and real datasets illustrate competitive clustering capabilities and demonstrate the model's general applicability.
      PubDate: 2021-01-07
      DOI: 10.1007/s11634-020-00432-5
  • A bivariate finite mixture growth model with selection
    • Abstract: A model is proposed to analyze longitudinal data where two response variables are available, one of which is a binary indicator of selection and the other is continuous and observed only if the first is equal to 1. The model also accounts for individual covariates and may be considered as a bivariate finite mixture growth model as it is based on three submodels: (i) a probit model for the selection variable; (ii) a linear model for the continuous variable; and (iii) a multinomial logit model for the class membership. To suitably address endogeneity, the first two components rely on correlated errors as in a standard selection model. The proposed approach is applied to the analysis of the dynamics of household portfolio choices based on an unbalanced panel dataset of Italian households over the 1998–2014 period. For this dataset, we identify three latent classes of households with specific investment behaviors and we assess the effect of individual characteristics on households’ portfolio choices. Our empirical findings also confirm the need to jointly model risky asset market participation and the conditional portfolio share to properly analyze investment behaviors over the life-cycle.
      PubDate: 2020-12-29
      DOI: 10.1007/s11634-020-00433-4
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762


JournalTOCs © 2009-