STATISTICS
Journal of the Korean Statistical Society
Journal Prestige (SJR): 0.545
Citation Impact (citeScore): 1
Hybrid journal (It can contain Open Access articles)
ISSN (Print) 1226-3192 - ISSN (Online) 2005-2863
Published by Elsevier [2974 journals]
• Bayesian pathway selection

Abstract: We propose a Bayesian pathway selection method that allows the selection of pathways (sets of genes) directly related to a continuous response variable under a non-parametric hierarchical model framework. The observation that sets of genes explain the response variable more effectively than individual genes was the driving force behind this research. We utilize stochastic search variable selection and the kernel machine method to select effective pathways after adjusting for the effects of clinical covariates. Unlike other methods, in which pathways are analyzed separately, the proposed method selects pathways simultaneously. We show that the proposed model can successfully detect effective pathways associated with outcomes through simulation studies and a real data application.
PubDate: 2023-01-24

• Goodness of fit test for uniform distribution with censored observation

Abstract: We develop a new goodness of fit test for the uniform distribution based on a conditional moment characterization. We study the asymptotic properties of the proposed test statistic. We also present a goodness of fit test for the uniform distribution that incorporates right censored observations and study its properties. A Monte Carlo simulation study is carried out to evaluate the finite sample performance of the proposed tests. We illustrate the test procedures using real data sets.
PubDate: 2023-01-23

• The expectation–maximization approach for Bayesian additive Cox regression with current status data

Abstract: In this paper, we propose a Bayesian additive Cox model for analyzing current status data based on the expectation–maximization variable selection method. This model concurrently estimates unknown parameters and identifies risk factors, which efficiently improves model interpretability and predictive ability. To identify risk factors, we assign appropriate priors on the indicator variables which denote whether the risk factors are included. By assuming partially linear effects of the covariates, the proposed model offers flexibility to account for the relationship between risk factors and survival time. The baseline cumulative hazard function and nonlinear effects are approximated via penalized B-splines to reduce the dimension of parameters. An easy to implement expectation–maximization algorithm is developed using a two-stage data augmentation procedure involving latent Poisson variables. Finally, the performance of the proposed method is investigated by simulations and a real data analysis, which shows promising results of the proposed Bayesian variable selection method.
PubDate: 2023-01-22

• Deconvolution problem of cumulative distribution function with heteroscedastic errors

Abstract: We study the nonparametric deconvolution problem for a cumulative distribution function when measurement errors are heteroscedastic and have known distributions. Using a Fourier-type deconvolution method, we propose an estimator for the target function that depends only on a regularization parameter. Our estimator achieves minimax optimal convergence rates when the errors are all either ordinary smooth or supersmooth. A simulation study is also conducted to illustrate the effectiveness of the proposed estimator.
PubDate: 2023-01-21

• Nonresponse adjusted estimation based on a composite weighting method in a panel survey

Abstract: Respondents to panel surveys are commonly divided into continuous and noncontinuous groups based on their response patterns. In this study, we propose an estimator based on composite weights as an effective nonresponse adjustment method to reduce the bias of noncontinuous response groups. We derive the properties of the proposed estimator, such as its bias and mean squared error, and then compare its efficiency with that of alternative estimators. We present the results of simulations demonstrating that the proposed estimator exhibits less variance than the conventional method of directly using the response rate of a noncontinuous response group. It also exhibits a lower bias than that obtained using the response rate of the continuous response group. The composite weighting method used in the proposed estimator shows stable results in terms of minimizing extreme weights, indicating that it may be considered highly effective for noncontinuous response groups in panel surveys.
PubDate: 2023-01-11

• A Bayesian method for multinomial probit model

Abstract: The independence of irrelevant alternatives (IIA) property states that the ratio of any two choice probabilities in a set of alternatives is independent of the presence or absence of other alternatives. In the modeling of multinomial data, the IIA is not feasible. In this situation, the multinomial probit (MNP) model is a type of discrete choice model that is commonly used. Due to the identifiability problem and the positive-definiteness constraint, modeling the covariance matrix in the MNP is difficult. All existing methods use unidentifiable parameters in the covariance matrix to solve the unidentifiability problem and improve the rate of convergence of a data augmentation algorithm. These methods also use the inverse Wishart distribution, which is frequently insufficient (Barnard et al. Stat Sin 10(4):1281–1311, 2000). We employed variance-correlation decomposition to decompose the identifiable covariance matrix into standard deviations and a correlation matrix instead of using the unidentifiable covariance matrix. Hypersphere decomposition was also used to decompose the correlation matrix. Thus, the estimated covariance matrix satisfied the positive definiteness constraint. The performance of our proposed model was illustrated using a detergent dataset from market research.
PubDate: 2022-12-03

• A note on maximum likelihood estimation for mixture models

Abstract: Practitioners as well as some statistics students often blindly use standard software or algorithms to obtain the maximum likelihood estimator (MLE) without checking whether such an estimator exists. Even in simple situations where data come from mixtures of Gaussians, the global MLE does not exist. This note is intended as a teacher's corner, highlighting existence issues related to the MLE for mixture models, even when the components are not necessarily Gaussian.
PubDate: 2022-12-01
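
The nonexistence of a global MLE for Gaussian mixtures mentioned in this abstract can be seen numerically. The sketch below is an illustration only and is not taken from the note itself: the data values, the equal mixing weight, and the function names are all assumptions. It centers one component on a single observation and shrinks that component's standard deviation, so the log-likelihood grows without bound.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, mu1, sigma1, mu2, sigma2, weight=0.5):
    """Log-likelihood of a two-component Gaussian mixture (weight is an assumed fixed value)."""
    return sum(
        math.log(weight * normal_pdf(x, mu1, sigma1)
                 + (1.0 - weight) * normal_pdf(x, mu2, sigma2))
        for x in data
    )


# Hypothetical sample, chosen only for illustration.
data = [-1.2, -0.4, 0.3, 0.9, 2.1]

# Component 1 sits exactly on the first observation; component 2 is a fixed N(0, 1).
sigmas = [1.0, 1e-2, 1e-4, 1e-6]
logliks = [mixture_loglik(data, data[0], s, 0.0, 1.0) for s in sigmas]

for s, ll in zip(sigmas, logliks):
    print(f"sigma1 = {s:g}: log-likelihood = {ll:.3f}")
```

Each reduction of sigma1 raises the log-likelihood, so no finite maximizer exists over the unrestricted parameter space; in practice this is why the variances are often bounded or penalized.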

• Estimation of the parameters of a Wishart extension on symmetric matrices

Abstract: This paper deals with the parameters of a natural extension of the Wishart distribution, namely the Riesz distribution on the space of symmetric matrices. We estimate the shape parameter using two different approaches. The first is based on the method of moments; we give its expression and investigate some of its properties. The second is the maximum likelihood estimator. Unfortunately, in this case we do not have an explicit formula for the estimator, which is expressed in terms of the digamma function and the sample mean of log-gamma variables. However, we derive the strong consistency and asymptotic normality of this estimator. A numerical comparative study of the two estimators is carried out in order to test the performance of the proposed approaches. For the second parameter, the scale parameter, we prove that the distribution of the maximum likelihood estimator given by Kammoun et al. (J Statist Prob Lett 126:127–131, 2017) is related to the Riesz distribution. We examine some properties of this estimator and assess its performance in a numerical study.
PubDate: 2022-12-01

• A computationally efficient and flexible algorithm for high dimensional mean and covariance matrix change point models

Abstract: This paper proposes a computationally efficient algorithm, FBS (Fast Binary Segmentation), for both single and multiple change point detection under high-dimensional setups. As a general technique, it can be widely used in various change point problems, including mean vector and covariance matrix change point models. Based on various $$\ell_{(s,p)}$$-norm aggregations for the cumulative sum (CUSUM) statistics, the new algorithm can be applied to a wide range of alternative structures including sparse and dense settings as special cases. We present the essence of the new algorithm. The efficiency and accuracy of the new algorithm are justified by comparing it with existing techniques. Lastly, a real data application further demonstrates the usefulness of our method.
PubDate: 2022-12-01

• Bayesian empirical likelihood inference for the generalized binomial AR(1) model

Abstract: In this paper, the Bayesian empirical likelihood (BEL) inference is considered for the generalized binomial AR(1) model. We establish a nonparametric likelihood using the empirical likelihood (EL) approach and consider a specific prior based on copulas. An efficient Markov chain Monte Carlo (MCMC) procedure is described for the required computation of the posterior distribution. In the simulation study, we analyze the accuracy and sensitivity of the MCMC algorithm. We also study the robustness of the new method. The results imply that our algorithm converges quickly and is not strongly influenced by the model assumptions. Furthermore, the BEL method is robust. Finally, a real data example is analyzed to illustrate our method.
PubDate: 2022-12-01

• Flexible INAR(1) models for equidispersed, underdispersed or overdispersed counts

Abstract: Equidispersed, underdispersed and overdispersed count data are commonly encountered in practice. To better describe these data characteristics, this paper develops two classes of INAR(1) processes, which not only can model a wide range of overdispersion and underdispersion, but also have the ability to describe the zero-inflated and zero-deflated characteristics of the count data. The probabilistic and statistical properties of the two processes are studied. Estimators of the model parameters are derived by using conditional maximum likelihood (CML) and modified conditional least squares (MCLS) methods. Some asymptotic properties and numerical results of the estimators are investigated. Three real examples are given to show the flexibility and usefulness of the proposed models.
PubDate: 2022-12-01

• Robust estimation for a general functional single index model via quantile regression

Abstract: This paper studies the estimation of a general functional single index model, in which the conditional distribution of the response depends on the functional predictor via a functional single index structure. We find that, under some mild conditions, the slope function can be estimated consistently by the estimator obtained from fitting a misspecified functional linear quantile regression model. We first obtain a consistent estimator of the slope function using functional linear quantile regression based on functional principal component analysis, and then employ a local linear regression technique to estimate the conditional quantile function and establish the asymptotic normality of the resulting estimator. The finite sample performance of the proposed estimation method is studied in Monte Carlo simulations, and is illustrated by an application to a real dataset.
PubDate: 2022-12-01

• Revisiting feature selection for linear models with FDR and power guarantees

Abstract: The problem of feature selection for linear models is re-examined by using the fixed-X knockoff procedure and incorporating the selection probability as variable importance scores. Unlike previous work that predominantly focused on false discovery rate (FDR) control, this paper aims to establish theoretical power guarantees for the fixed-X knockoff procedure in linear models. An intersection selection procedure is proposed to make better use of sample data for estimating the selection probability, which in practice results in increasing the selection power. In addition, a two-stage procedure by using the data splitting technique is further developed to explore related theoretical results under high dimensionality. The performance of the proposal over its main competitors is demonstrated through comprehensive simulation studies and real data analysis.
PubDate: 2022-12-01

• Penalized polygram regression

Abstract: We study regression function estimation over a bounded domain of arbitrary shape based on triangulation and penalization techniques. A total variation type penalty is imposed to encourage fusion of adjacent triangles, which leads to a partition of the domain consisting of disjoint polygons. The proposed method provides a piecewise linear and continuous estimator over a data adaptive polygonal partition of the domain. We adopt a coordinate descent algorithm to handle the non-separable structure of the penalty and investigate its convergence property. Regarding the asymptotic results, we establish an oracle type inequality and the convergence rate of the proposed estimator. A numerical study is carried out to illustrate the performance of this method. An R software package polygram is available.
PubDate: 2022-12-01

• Partial linear regression of compositional data

Abstract: We study a partial linear model in which the response is compositional and the predictors include both compositional and Euclidean variables. We define a partial linear regression model under Aitchison geometry based on the isometric log-ratio (ilr) transformation. An identification condition for the linear parameters is provided in terms of expectations and conditional expectations of the response and covariates. An estimator based on the identification is developed and asymptotic properties of the proposed estimator are derived. The proposed method can be implemented easily by using existing R packages such as np and compositions. The limiting distribution of the proposed estimator is shown to be a normal distribution in Euclidean space, so that it is easy to use for inference. Some finite sample properties are also presented via simulation studies. We also present election data as an illustrative example.
PubDate: 2022-12-01
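
As background for the ilr transformation named in this abstract, the following is a minimal sketch of one common ilr construction (the pivot, or Helmert-type, basis). It is not the paper's code, which refers to R packages; the choice of basis, the function names, and the example composition are assumptions made for illustration. The key property checked here is the isometry: the Euclidean norm of the ilr coordinates equals the norm of the centered log-ratio (clr) vector.

```python
import math


def clr(x):
    """Centered log-ratio: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]


def ilr(x):
    """Isometric log-ratio coordinates under the pivot basis (an assumed choice).

    Coordinate i compares the geometric mean of the first i parts with part i + 1:
        z_i = sqrt(i / (i + 1)) * log(gmean(x_1..x_i) / x_{i+1}),  i = 1..D-1
    """
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(1, len(x)):
        mean_log_first_i = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_log_first_i - logs[i]))
    return coords


def euclidean_norm(v):
    return math.sqrt(sum(c * c for c in v))


# Hypothetical three-part composition (for example, vote shares).
composition = [0.2, 0.3, 0.5]
z = ilr(composition)
print(z)
print(euclidean_norm(z), euclidean_norm(clr(composition)))
```

A composition with D parts maps to D - 1 unconstrained real coordinates, which is what allows standard Euclidean regression tools to be applied after the transformation.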

• Robust coefficients of correlation or spatial autocorrelation based on implicit weighting

Abstract: The Pearson product-moment correlation coefficient represents a fundamental tool for measuring linear association between two data vectors. In various applications, it is often reasonable to consider its weighted version known as the weighted correlation coefficient. This paper starts with theoretical considerations related to properties of the weighted correlation coefficient, particularly to its local robustness and relationship to other similarity measures. Inspired by the least weighted squares regression estimator, a robust correlation coefficient is investigated here together with its spatial autocorrelation extension. Finally, the considered methods are investigated in two image processing tasks.
PubDate: 2022-12-01
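
The weighted correlation coefficient discussed in this abstract can be sketched as follows. This is a generic illustration with hand-chosen weights and made-up data; it does not reproduce the paper's implicit weighting scheme, in which the weights are determined by the data rather than fixed in advance.

```python
import math


def weighted_correlation(x, y, w):
    """Weighted Pearson correlation of x and y with nonnegative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: four points on the line y = 2x plus one gross outlier.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [2.0, 4.0, 6.0, 8.0, -5.0]

equal_weights = [1.0] * 5
downweighted = [1.0, 1.0, 1.0, 1.0, 0.0]  # assumed weights that ignore the outlier

r_plain = weighted_correlation(x, y, equal_weights)
r_robust = weighted_correlation(x, y, downweighted)
print(r_plain, r_robust)
```

With equal weights the function reduces to the ordinary Pearson coefficient, which the single outlier drives negative here; giving the outlier zero weight recovers the perfect linear association of the remaining points.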

• Wild bootstrap Ljung–Box test for residuals of ARMA models robust to variance change

Abstract: The Ljung–Box (LB) test is one of the most popular tests for determining whether autocorrelations in residuals of fitted time series models exist or not. However, it may not be appropriate to apply the LB test to time series data with variance change due to size distortions. In this paper, we propose a wild bootstrap-based LB test for residuals of fitted ARMA models. Our simulation study shows that our wild bootstrap-based LB test achieves the correct sizes and comparable powers in finite samples in the presence of variance change.
PubDate: 2022-12-01
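
For reference, the classical Ljung–Box statistic that this paper builds on is Q = n(n + 2) Σ_{k=1}^{h} ρ̂_k² / (n − k), where ρ̂_k is the lag-k sample autocorrelation of the residuals. The sketch below computes only this standard statistic; the wild bootstrap calibration proposed in the paper is not reproduced, and the function name and input series are assumptions for illustration.

```python
def ljung_box(residuals, max_lag):
    """Classical Ljung-Box statistic Q for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)

    q = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k sample autocorrelation.
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


# Hypothetical strongly alternating series, so the lag-1 autocorrelation is -0.9.
series = [1.0, -1.0] * 5
q_stat = ljung_box(series, max_lag=1)
print(q_stat)
```

In the standard test, Q is compared with a chi-square quantile; the abstract's point is that this reference distribution can give distorted sizes when the variance changes over time, which motivates a bootstrap calibration instead.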

• Monitoring multivariate data with high missing rate by pooling univariate statistics

Abstract: In this paper, we propose a control chart to monitor multivariate data with missing values as an alternative to the most common imputation-based control chart. The chart statistic we use is the weighted sum of the chi-square statistics of each variable, say $$Q^*$$, which Wu et al. (2006) named the pooled component test (PCT) statistic. We modify the statistic in the context of monitoring and approximate the in-control distribution of $$Q^*$$ as a scaled chi-square distribution. The PCT chart we propose in this paper is a Shewhart chart using $$Q^*$$, and its control limits are obtained from the estimated approximate in-control distribution of $$Q^*$$. We numerically show that the PCT chart performs better than the imputation-based methods in the literature. We finally apply it to monitoring a semiconductor manufacturing process using production environmental variables.
PubDate: 2022-12-01

• Correction: Robust MAVE for single-index varying-coefficient models

PubDate: 2022-09-16

• Robust MAVE for single-index varying-coefficient models

Abstract: In this paper, a robust, efficient and easily implemented estimation procedure for single-index varying-coefficient models is proposed by combining minimum average variance estimation (MAVE) with exponential squared loss. The merit of the proposed method is that it is robust against outliers or heavy-tailed error distributions while being as asymptotically efficient as the original MAVE under the normal error case. A practical minorization–maximization algorithm is proposed for implementation. Under some regularity conditions, asymptotic distributions of the resulting estimators are derived. Simulation studies and a real data example are conducted to examine the finite sample performance of the proposed method. Both theoretical and empirical findings confirm that our proposed method works very well.
PubDate: 2022-09-02

