TEST
Journal Prestige (SJR): 1.514
Citation Impact (citeScore): 1
Hybrid journal (it can contain Open Access articles)
ISSN (Print) 1863-8260 - ISSN (Online) 1133-0686
Published by Springer-Verlag
• Nonequivalence of two least-absolute-deviation estimators for mediation
effects

Abstract: This paper provides two groups of conditions for model consistency in least-absolute-deviation mediation models. Under model consistency, we establish the asymptotic theory of the difference estimator and the product estimator, and show that the two estimators are not only numerically nonequivalent but also asymptotically nonequivalent, in sharp contrast to least squares mediation analysis, where the two estimators are numerically equivalent. In all three possible scenarios of model parameters, both the asymptotic theory and simulation studies show that the product estimator is more efficient than the difference estimator.
PubDate: 2022-11-09
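
The abstract above contrasts the least-absolute-deviation case with least squares mediation analysis, where the difference and product estimators coincide numerically. A minimal sketch of that least squares identity follows; it does not implement the paper's least-absolute-deviation estimators. The simulated data, the coefficient values, and the helper name `ols` are assumptions made only for illustration.

```python
import numpy as np

def ols(X, y):
    """Least squares fit with an intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

# Simulated mediation data (illustrative values, not from the paper):
# x -> m -> y, plus a direct path x -> y.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.7 * m + rng.normal(size=n)

a = ols(x, m)[1]                      # effect of x on the mediator m
c_total = ols(x, y)[1]                # total effect of x on y
full = ols(np.column_stack([x, m]), y)
c_direct, b = full[1], full[2]        # direct effect of x, effect of m

product = a * b                       # product estimator
difference = c_total - c_direct       # difference estimator

# In least squares the two agree up to floating-point error.
print(product, difference)
```

Replacing the least squares fits with median (least-absolute-deviation) regressions would, according to the abstract, break this equality.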

• On functional logistic regression: some conceptual issues

Abstract: The main ideas behind the classic multivariate logistic regression model make sense when translated to the functional setting, where the explanatory variable X is a function and the response Y is binary. However, some important technical issues appear (or are aggravated with respect to those of the multivariate case) due to the functional nature of the explanatory variable. First, the mere definition of the model can be questioned: while most approaches proposed so far rely on the $L^2$-based model, we explore an alternative (in some sense, more general) approach, based on the theory of reproducing kernel Hilbert spaces (RKHS). The validity conditions of such an RKHS-based model, and their relation to those of the $L^2$-based one, are investigated and made explicit in two formal results. Some relevant particular cases are considered as well. Second, we show that, under very general conditions, the maximum likelihood estimator of the logistic model parameters fails to exist in the functional case, although some restricted versions can be considered. Third, we check (in the framework of binary classification) the practical performance of some RKHS-based procedures that are well suited to our model: they are compared to several competing methods via Monte Carlo experiments and the analysis of real data sets.
PubDate: 2022-10-31

• A data-driven reversible jump for estimating a finite mixture of
regression models

Abstract: We propose a data-driven reversible jump (DDRJ) method for selecting and estimating a mixture of regression models in a single run, which can also be applied as a regression model that is robust to outliers. We compare the clustering and estimation performance of the proposed method with that of Expectation–Maximization and Gibbs sampler algorithms combined with model selection criteria on synthetic data sets. Under the tested conditions, DDRJ outperforms these traditional methods in identifying the number of groups, in classification, and in the precision of estimates. Compared with traditional reversible jump algorithms, the data-driven procedure simplifies the calculations and implementation and shows better mixing and faster convergence. Finally, we apply the proposed method to analyze two well-studied data sets: tone perception (a simple data set) and baseball salaries (a more complex data set with a larger number of covariates).
PubDate: 2022-10-31

• Criterion constrained Bayesian hierarchical models

Abstract: The goal of this article is to improve the predictive performance of a Bayesian hierarchical statistical model by incorporating a criterion typically used for model selection. In this article, we view the problem of prediction of a latent real-valued mean as a model selection problem, where the candidate models are from an uncountably infinite set (i.e., the parameter space of the mean represents the candidate set of models). Specifically, we select a subset of our Bayesian hierarchical statistical model’s parameter space with high predictive performance (as measured by a criterion). Explicitly, we truncate the joint support of the data and the parameter space of a given Bayesian hierarchical model to only include small values of the covariance penalized error (CPE) criterion. The CPE is a general expression that contains several information criteria as special cases. Simulation results show that as long as the truncated set does not have near-zero probability, we tend to obtain a lower squared error than Bayesian model averaging. Additional theoretical results are provided as the foundation for these observations. We apply our approach to a dataset consisting of American Community Survey period estimates to illustrate that this perspective can lead to improvements in a single model.
PubDate: 2022-10-05

• On coregionalized multivariate Gaussian Markov random fields:
construction, parameterization, and Bayesian estimation and inference

Abstract: Gaussian Markov random fields (GMRFs) and their multivariate extensions (MGMRFs) are powerful tools for modelling probabilistic interactions of directly related variables. As an important category of graphical models, (M)GMRFs have far-reaching potential applications. In this article, we review a class of coregionalized MGMRFs that coordinates, integrates, and generalizes the key MGMRFs in the literature. Theoretical and analytic results, including simulation and case studies, yield important insights into the model class, its options for parameterization, and the nature of asymmetry modelled by an asymmetric matrix of spatial parameters and its adaptive extensions. We show that while the Markovian interpretation of latent conditionals may be the main appeal of MGMRFs for some applications, another attraction of this model class is its coregionalization models that harness multidimensional interactions, dependencies, correlations, and variabilities for analysis of covariance structure. The latter is further illustrated by presenting models for shared component analysis, principal component analysis, and dimension reduction. The model class and its wide-ranging options for generalization are discussed for their richness and broad applicability to spatial and image data analytics and beyond.
PubDate: 2022-09-23

• Specification testing of partially linear single-index models: a groupwise

Abstract: This paper develops a groupwise dimension reduction-based adaptive-to-model test for partially linear single-index models. The test behaves as a local smoothing test would if the model were bivariate. The test statistic under the null hypothesis is asymptotically normally distributed. The test can detect local alternatives distinct from the null hypothesis at the rate that existing local smoothing tests can achieve when the regression model contains bivariate covariates. Therefore, the curse of dimensionality is largely alleviated. Numerical studies, including two real data examples, are conducted to examine the finite sample performance of the proposed test.
PubDate: 2022-09-17
DOI: 10.1007/s11749-022-00833-y

• Some parametric tests based on sample spacings

Abstract: Assume that we have a random sample from an absolutely continuous distribution (univariate or multivariate) with a known functional form and some unknown parameters. In this paper, we study several parametric tests based on statistics that are symmetric functions of m-step disjoint sample spacings. Asymptotic properties of these tests are investigated under the simple null hypothesis and under a sequence of local alternatives converging to the null hypothesis. The asymptotic properties of the proposed tests are also studied under the composite null hypothesis. We observe that these tests have asymptotic properties similar to those of the likelihood ratio test. Finite sample performances of the proposed tests are assessed numerically. A data analysis based on real data is also reported. The proposed tests provide an alternative to similar tests based on simple spacings (i.e. $m=1$) that were proposed earlier in the literature. These tests also provide an alternative to likelihood ratio tests in situations where the likelihood function may be unbounded and, hence, likelihood ratio tests do not exist.
PubDate: 2022-09-16
DOI: 10.1007/s11749-022-00831-0

• Correct specification of design matrices in linear mixed effects models:
tests with graphical representation

Abstract: Linear mixed effects models (LMMs) are a popular and powerful tool for analysing grouped or repeated observations for numeric outcomes. LMMs consist of a fixed and a random component, which are specified in the model through their respective design matrices. Verifying the correct specification of the two design matrices is important since mis-specifying them can affect the validity and efficiency of the analysis. We show how to use empirical stochastic processes constructed from appropriately ordered and standardized residuals from the model to test whether the design matrices of the fitted LMM are correctly specified. We define two different processes: one can be used to test whether both design matrices are correctly specified, and the other can be used only to test whether the fixed effects design matrix is correctly specified. The proposed empirical stochastic processes are smoothed versions of cumulative sum processes, which have a nice graphical representation in which model mis-specification can easily be observed. The amount of smoothing can be adjusted, which facilitates visual inspection and can potentially increase the power of the tests. We propose a computationally efficient procedure for estimating p-values in which refitting of the LMM is not necessary. Its validity is shown by using theoretical results and a large Monte Carlo simulation study. The proposed methodology could be used with LMMs with multilevel or crossed random effects.
PubDate: 2022-09-08
DOI: 10.1007/s11749-022-00830-1

• Homogeneity tests for one-way models with dependent errors under
correlated groups

Abstract: We consider the problem of testing for the existence of fixed effects and random effects in one-way models, where the groups are correlated and the disturbances are dependent. The classical F-statistic in the analysis of variance is not asymptotically distribution-free in this setting. To overcome this problem, we propose a new test statistic that requires no distributional assumptions and is asymptotically distribution-free. The proposed test statistic takes the form of a natural extension of the classical F-statistic in the sense of distribution-freeness. The new tests are shown to have asymptotic size $\alpha$ and to be consistent. The nontrivial power under local alternatives is also elucidated. The theoretical results are supported by numerical simulations for the model with disturbances from linear time series with innovations of symmetric random variables, heavy-tailed variables, and skewed variables, and furthermore from GARCH models. The proposed test is applied to log-returns for stock prices and uncovers random effects in sectors.
PubDate: 2022-09-02
DOI: 10.1007/s11749-022-00828-9

• Correction to: Second-order and local characteristics of network intensity
functions

PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00791-x

• Correction to: Testing the hypothesis of a block compound symmetric
covariance matrix for elliptically contoured distributions

PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00796-6

• On finite mixtures of Discretized Beta model for ordered responses

Abstract: The paper discusses the specification of finite mixture models based on the Discretized Beta distribution for the analysis of ordered discrete responses, such as ratings and count data. The ultimate goal of the paper is to parameterize clusters of opposite and intermediate response outcomes. After a thorough discussion of model interpretation, identifiability and estimation, the proposal is illustrated with a case study on the probability of voting for German political parties and with a comparative discussion of the state of the art.
PubDate: 2022-09-01
DOI: 10.1007/s11749-022-00800-7

• General dependence structures for some models based on exponential

Abstract: We describe a procedure to introduce general dependence structures on a set of random variables. These include order-q moving average-type structures, as well as seasonal, periodic, spatial and spatio-temporal dependences. The invariant marginal distribution can be in any family that is conjugate to an exponential family with quadratic variance function. Dependence is induced via a set of suitable latent variables whose conditional distribution mirrors the sampling distribution in a Bayesian conjugate analysis of such exponential families. We obtain strict stationarity as a special case.
PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00798-4

• Increasing the replicability for linear models via adaptive significance
levels

Abstract: We put forward an adaptive $\alpha$ (type I error) that decreases as the information grows for hypothesis tests comparing nested linear models. A less elaborate adaptation was presented in Pérez and Pericchi (Stat Probab Lett 85:20–24, 2014) for general i.i.d. models. The calibration proposed in this paper may be interpreted as a Bayes–non-Bayes compromise, or as a simple translation of a Bayes factor into frequentist terms that leads to statistical consistency and, most importantly, it is a step toward statistics that promotes replicable scientific findings.
PubDate: 2022-09-01
DOI: 10.1007/s11749-022-00803-4

• A simple and useful regression model for fitting count data

Abstract: We present a novel regression model for count data where the response variable is BerG-distributed, using a new parameterization of this distribution that is indexed by mean and dispersion parameters. An attractive feature of this model lies in its potential to fit count data when overdispersion, equidispersion, underdispersion, or zero inflation (or deflation) is indicated. The advantage of our new parameterization and approach is the straightforward interpretation of the regression coefficients in terms of the mean and dispersion, as in generalized linear models. The maximum likelihood method is used to estimate the model parameters. We also conduct hypothesis tests for the dispersion parameter and consider residual analysis. Simulation studies are conducted to empirically assess the properties of the estimators, the test statistics, and the residuals in finite samples. The proposed model is applied to two real datasets on wildlife habitat and road traffic accidents, which illustrates its capabilities in accommodating both over- and underdispersed count data. This paper contains Supplementary Material.
PubDate: 2022-09-01
DOI: 10.1007/s11749-022-00801-6

• On automatic kernel density estimate-based tests for goodness-of-fit

Abstract: Although estimation and testing are different statistical problems, if we want to use a test statistic based on the Parzen–Rosenblatt estimator to test the hypothesis that the underlying density function f is a member of a location-scale family of probability density functions, it may seem reasonable to choose the smoothing parameter in such a way that the kernel density estimator is an effective estimator of f irrespective of whether the null or the alternative hypothesis is true. In this paper we address this question by considering the well-known Bickel–Rosenblatt test statistics, which are based on the quadratic distance between the nonparametric kernel estimator and two parametric estimators of f under the null hypothesis. For each of these test statistics we describe its asymptotic behaviour for a general data-dependent smoothing parameter, and we state its limiting Gaussian null distribution and the consistency of the associated goodness-of-fit test procedures for location-scale families. In order to compare the finite sample power performance of the Bickel–Rosenblatt tests based on a null hypothesis-based bandwidth selector with that of other bandwidth selection methods existing in the literature, a simulation study for the normal, logistic and Gumbel null location-scale models is included in this work.
PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00799-3

• Some results on the Gaussian Markov Random Field construction problem
based on the use of invariant subgraphs

Abstract: The study of Gaussian Markov Random Fields has attracted the attention of a large number of scientific areas due to its increasing usage in several fields of application. Here, we consider the construction of Gaussian Markov Random Fields from a graph and a positive-definite matrix, which is closely related to the problem of finding the Maximum Likelihood Estimator of the covariance matrix of the underlying distribution. In particular, it is simultaneously required that the variances and the covariances between variables associated with adjacent nodes in the graph are fixed by the positive-definite matrix and that pairs of variables associated with non-adjacent nodes in the graph are conditionally independent given all other variables. The solution to this construction problem exists and is unique up to the choice of a vector of means. In this paper, some results focusing on a certain type of subgraphs (invariant subgraphs) and a representation of the Gaussian Markov Random Field as a Multivariate Gaussian Markov Random Field are presented. These results ease the computation of the solution to the aforementioned construction problem.
PubDate: 2022-09-01
DOI: 10.1007/s11749-022-00804-3

• Data-driven portmanteau tests for time series

Abstract: Portmanteau tests and information criteria are widely used for checking the hypothesis of independence in time series. More recently, data-driven versions were proposed, where the tests are calibrated based on the largest estimated autocorrelation. It seems natural to introduce a double test statistic (M, Q), where Q is the portmanteau statistic and M is the largest squared autocorrelation. Both statistics have been investigated at length in the past decades. Under reasonable assumptions, we compute the bivariate probability distribution of this double statistic, conditional, in addition, on the lag at which the largest autocorrelation is found. Tests of the null hypothesis of independence based on rejection regions in the plane (M, Q) are proposed, and some methods to select the rejection region in order to maximize power when the alternative hypothesis is unknown are suggested. A simulation study and a thorough comparison with some popular tests have been performed to show the advantages of our proposal. Note that the latter includes some well-known univariate tests, so we could expect not only an optimal choice but also additional information which may prove useful for a better understanding of the time series for both model building and forecasting.
PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00794-8
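
As a rough illustration of the double statistic (M, Q) described in the abstract above, the sketch below computes the largest squared sample autocorrelation, a portmanteau statistic, and the lag at which the maximum occurs. The abstract does not say which portmanteau form is used, so the Box-Pierce form Q = n * sum of squared autocorrelations is an assumption here; the function names and the simulated series are also illustrative only, and the paper's conditional distribution and rejection regions are not reproduced.

```python
import numpy as np

def sample_autocorr(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[:-k], xc[k:]) / denom
                     for k in range(1, max_lag + 1)])

def double_statistic(x, max_lag):
    """Return (M, Q, lag): largest squared autocorrelation, Box-Pierce Q
    (assumed form), and the lag at which the largest value is attained."""
    n = len(x)
    r2 = sample_autocorr(x, max_lag) ** 2
    q_stat = n * r2.sum()              # Box-Pierce portmanteau statistic
    m_stat = r2.max()                  # largest squared autocorrelation
    lag = int(np.argmax(r2)) + 1       # lags are 1-based
    return m_stat, q_stat, lag

# Example on simulated white noise (illustrative data).
rng = np.random.default_rng(1)
series = rng.normal(size=500)
m_stat, q_stat, lag = double_statistic(series, max_lag=10)
print(m_stat, q_stat, lag)
```

Since Q sums all squared autocorrelations and M is only the largest of them, Q is always at least n times M under this definition.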

• Penalized robust estimators in sparse logistic regression

Abstract: Sparse covariates are frequent in classification and regression problems where the task of variable selection is usually of interest. As is well known, sparse statistical models correspond to situations where there are only a small number of nonzero parameters, and for that reason, they are much easier to interpret than dense ones. In this paper, we focus on the logistic regression model, and our aim is to address robust and penalized estimation for the regression parameter. We introduce a family of penalized weighted M-type estimators for the logistic regression parameter that are stable against atypical data. We explore different penalization functions, including the so-called Sign penalty. We provide a careful analysis of the estimators' convergence rates as well as their variable selection capability and asymptotic distribution for fixed and random penalties. A robust cross-validation criterion is also proposed. Through a numerical study, we compare the finite sample performance of the classical and robust penalized estimators under different contamination scenarios. The analysis of real datasets enables us to investigate the stability of the penalized estimators in the presence of outliers.
PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00792-w

• A class of random fields with two-piece marginal distributions for
modeling point-referenced data with spatial outliers

Abstract: In this paper, we propose a new class of non-Gaussian random fields named two-piece random fields. The proposed class allows one to generate random fields that have flexible marginal distributions, possibly skewed and/or heavy-tailed, and, as a consequence, has a wide range of applications. We study the second-order properties of this class and provide analytical expressions for the bivariate distribution and the associated correlation functions. We exemplify our general construction by studying two examples: two-piece Gaussian and two-piece Tukey-h random fields. An interesting feature of the proposed class is that it offers a specific type of dependence that can be useful when modeling data displaying spatial outliers, a property that has been somewhat ignored from a modeling viewpoint in the literature for spatial point-referenced data. Since the likelihood function involves analytically intractable integrals, we adopt the weighted pairwise likelihood as a method of estimation. The effectiveness of our methodology is illustrated with simulation experiments as well as with the analysis of a georeferenced dataset of mean temperatures in the Middle East.
PubDate: 2022-09-01
DOI: 10.1007/s11749-021-00797-5
