- Mixing convergence of LSE for supercritical AR(2) processes with Gaussian innovations using random scaling
Abstract: We prove mixing convergence of the least squares estimator of autoregressive parameters for supercritical autoregressive processes of order 2 with Gaussian innovations having real characteristic roots with different absolute values. We use an appropriate random scaling such that the limit distribution is a two-dimensional normal distribution concentrated on a one-dimensional ray determined by the characteristic root having the larger absolute value. PubDate: 2024-08-01
- Second-order (s.o.) multi-stage fixed-width confidence interval (FWCI) estimation strategies for comparing location parameters from two negative exponential (NE) populations: illustrations with cancer data
Abstract: We consider two negative exponential (NE) populations with unknown location parameters and unknown but unequal scale parameters. We develop fixed-width confidence interval (FWCI) estimation strategies for comparing the location parameters. First, we formulate a unified multi-stage estimation strategy and derive the theory for an asymptotic second-order (s.o.) expansion of the coverage probability as well as s.o. efficiency. Next, we successively specialize by providing a range of asymptotics associated with (i) purely sequential, (ii) parallel piecewise sequential, (iii) accelerated sequential, and (iv) three-stage strategies. Theoretical findings are supplemented by simulations and real data illustrations from head and neck cancer research. PubDate: 2024-08-01
- On regression and classification with possibly missing response variables in the data
Abstract: This paper considers the problem of kernel regression and classification with possibly unobservable response variables in the data, where the mechanism that causes the absence of information can depend on both predictors and the response variables. Our proposed approach involves two steps: First, we construct a family of models (possibly infinite dimensional) indexed by the unknown parameter of the missing probability mechanism. In the second step, a search is carried out to find the empirically optimal member of an appropriate cover (or subclass) of the underlying family in the sense of minimizing the mean squared prediction error. The main focus of the paper is to look into some of the theoretical properties of these estimators. The issue of identifiability is also addressed. Our methods use a data-splitting approach which is quite easy to implement. We also derive exponential bounds on the performance of the resulting estimators in terms of their deviations from the true regression curve in general \(L_p\) norms, where we allow the size of the cover or subclass to diverge as the sample size n increases. These bounds immediately yield various strong convergence results for the proposed estimators. As an application of our findings, we consider the problem of statistical classification based on the proposed regression estimators and also look into their rates of convergence under different settings. Although this work is mainly stated for kernel-type estimators, it can also be extended to other popular local-averaging methods such as nearest-neighbor and histogram estimators. PubDate: 2024-08-01
- Stochastic comparisons of two finite mixtures of general family of distributions
Abstract: We consider here two finite (arithmetic) mixture models (FMMs) with general parametric family of distributions. Sufficient conditions for the usual stochastic order and hazard rate order are then established under the assumption that the model parameter vectors are connected in p-larger order, reciprocal majorization order and weak super/sub majorization order. Furthermore, we establish hazard rate order and reversed hazard rate order between two mixture random variables (MRVs) when a matrix of model parameters and mixing proportions changes to another matrix in some mathematical sense. We have also considered scale family of distributions to establish some sufficient conditions under which the MRVs have hazard rate order. Several examples are presented to illustrate and clarify all the results established here. PubDate: 2024-08-01
- A proper selection among multiple Buckley–James estimates
Abstract: Consider the semiparametric linear regression estimation problem with right-censored data. Under right censoring, the Buckley–James estimator (BJE) is the standard extension of the least squares estimator. Moreover, an iterative algorithm for the BJE has been implemented in the R package rms. We show that it often does not yield a solution, even if a consistent BJE exists. Yu and Wong (J Stat Comput Simul 72:451–460, 2002) proposed another algorithm to find all possible BJEs. The latter algorithm is modified in this paper so that it indeed finds all BJEs when the underlying regression parameter vector is identifiable. We show that some of these BJEs can be inconsistent. Thus it is important to decide how to select a proper BJE such that it is consistent if the parameter is identifiable. We suggest choosing either one close to the modified semi-parametric maximum likelihood estimator (Yu and Wong in Technometrics 47:34–42, 2005) or a finite boundary point if there are infinitely many BJEs. PubDate: 2024-08-01
- Mean test for high-dimensional data based on covariance matrix with linear structures
Abstract: In this work, the mean test is considered under the condition that the number of dimensions p is much larger than the sample size n when the covariance matrix is represented as a linear structure as possible. At first, the estimator of coefficients in the linear structures of the covariance matrix is constructed, and then an efficient covariance matrix estimator is naturally given. Next, a new test statistic similar to the classical Hotelling’s \(T^2\) test is proposed by replacing the sample covariance matrix with the given estimator of covariance matrix. Then the asymptotic normality of the estimator of coefficients and that of a new statistic for the mean test are separately obtained under some mild conditions. Simulation results show that the performance of the proposed test statistic is almost the same as the Hotelling’s \(T^2\) test statistic for which the covariance matrix is known. Our new test statistic can not only control reasonably the nominal level; it also gains greater empirical powers than competing tests. It is found that the power of mean test has great improvement when considering the structure information of the covariance matrix, especially for high-dimensional cases. Moreover, an example with real data is provided to show the application of our approach. PubDate: 2024-07-02
- An inverse Laplace transform oracle estimator for the normal means problem
Abstract: In an effort to estimate the number of true nulls in large scale multiplicity problems (the normal means problem), we generalize the current Fourier transform based oracle estimator with a Laplace transform based estimator. Our interest in this problem stems from the application of r-power which requires knowledge of the number of nulls (Dasgupta et al. in Sankhya B 78(1):96–118, 2016). We analytically show that our method is consistent and theoretically has lower mean squared error than the existing competitor (Jin in J R Stat Soc Ser B (Stat Methodol) 70(3):461–493, 2008). We follow up by a numerical example and a simulation study that ratifies our theoretical results. PubDate: 2024-07-01 DOI: 10.1007/s00184-023-00922-4
- The distribution of the sample correlation coefficient under variance-truncated normality
Abstract: The non-null distribution of the sample correlation coefficient under bivariate normality is derived when each of the associated two sample variances is subject to stripe truncation including usual single and double truncation as special cases. The probability density function is obtained using series expressions as in the untruncated case with new definitions of weighted hypergeometric functions. Formulas of the moments of arbitrary orders are given using the weighted hypergeometric functions. It is shown that the null joint distribution of the sample correlation coefficients under multivariate untruncated normality holds also in the variance-truncated cases. Some numerical illustrations are shown. PubDate: 2024-07-01 DOI: 10.1007/s00184-023-00918-0
- On the asymptotic behaviour of the joint distribution of the maxima and minima of observations, when the sample size is a random variable
Abstract: In this paper, we obtain the asymptotic form of the joint distribution of maxima and minima of independent observations, when the sample size is a random variable. We also discuss the asymptotic distribution of the range. PubDate: 2024-07-01 DOI: 10.1007/s00184-023-00928-y
- Large deviations for randomly weighted least squares estimator in a nonlinear regression model
Abstract: In this work, we introduce the random weighting method to the nonlinear regression model and study the asymptotic properties for the randomly weighted least squares estimator with dependent errors. The results reveal that this new estimator is consistent. Moreover, some simulations are also carried out to show the performance of the proposed estimator. PubDate: 2024-07-01 DOI: 10.1007/s00184-023-00926-0
- A generalisation of the aggregate association index (AAI): incorporating a linear transformation of the cells of a 2 × 2 table
Abstract: The analysis of aggregate, or marginal, data for contingency tables is an increasingly important area of statistics, applied sciences and the social sciences. This is largely due to confidentiality issues arising from the imposition of government and corporate protection and data collection methods. The availability of only aggregate data makes it difficult to draw conclusions about the association between categorical variables at the individual level. For data analysts, this issue is of growing concern, especially for those dealing with the aggregate analysis of a single 2 × 2 table or stratified 2 × 2 tables and lies in the field of ecological inference. As an alternative to ecological inference techniques, one may consider the aggregate association index (AAI) to obtain valuable information about the magnitude and direction of the association between two categorical variables of a single 2 × 2 table or stratified 2 × 2 tables given only the marginal totals. Conventionally, the AAI has been examined by considering \({\mathrm{p}}_{11}\), the proportion of the sample that lies in the (1, 1)th cell of a given 2 × 2 table. However, the AAI can be expanded for other association indices. Therefore, a new generalisation of the original AAI is given here by reformulating and expanding the index so that it incorporates any linear transformation of \({\mathrm{p}}_{11}\). This study shall consider the consistency of the AAI under the transformation by examining four classic association indices, namely the independence ratio, Pearson’s ratio, standardised residual and adjusted standardised residual, although others may be incorporated into this general framework. We will show how these indices can be utilised to examine the strength and direction of association given only the marginal totals. Therefore, this work enhances our understanding of the AAI and establishes its links with common association indices. PubDate: 2024-07-01 DOI: 10.1007/s00184-023-00919-z
- Bayesian multivariate nonlinear mixed models for censored longitudinal trajectories with non-monotone missing values
Abstract: The analysis of multivariate longitudinal data may often encounter a difficult task, particularly in the presence of censored measurements induced by detection limits and intermittently missing values arising when subjects do not respond to a part of outcomes during scheduled visits. The multivariate nonlinear mixed model (MNLMM) has emerged as a promising analytical tool for multi-outcome longitudinal data following arbitrarily nonlinear profiles with random phenomena. This article presents a generalization of the MNLMM, called MNLMM-CM, designed to simultaneously accommodate the effects of censorship and missingness within a Bayesian framework. Specifically, we develop a Markov chain Monte Carlo procedure that combines a Gibbs sampler with the Metropolis–Hastings algorithm. This hybrid approach facilitates Bayesian estimation of essential model parameters and imputation of non-responses under the missing at random mechanism. The issue of posterior predictive inference for the censored and missing outcomes is also addressed. The effectiveness and performance of the proposed methodology are demonstrated through the analysis of simulated data and a real example from an AIDS clinical study. PubDate: 2024-07-01 DOI: 10.1007/s00184-023-00929-x
- Bounds of expectations of order statistics for distributions possessing monotone reversed failure rates
Abstract: Sharp positive upper mean-variance bounds on the expectations of order statistics based on independent identically distributed random variables with decreasing and increasing failure rates have recently been presented in the literature. In this paper we determine analogous evaluations in the dual cases when the parent distributions have monotone reversed failure rates. PubDate: 2024-05-24 DOI: 10.1007/s00184-024-00968-y
- Bayesian finite mixtures of Ising models
Abstract: We introduce finite mixtures of Ising models as a novel approach to study multivariate patterns of associations of binary variables. Our proposed models combine the strengths of Ising models and multivariate Bernoulli mixture models. We examine conditions required for the local identifiability of Ising mixture models, and develop a Bayesian framework for fitting them. Through simulation experiments and real data examples, we show that Ising mixture models lead to meaningful results for sparse binary contingency tables with imbalanced cell counts. The code necessary to replicate our empirical examples is available on GitHub: https://github.com/Epic19mz/BayesianIsingMixtures. PubDate: 2024-05-20 DOI: 10.1007/s00184-024-00970-4
- Parametric estimation for linear parabolic SPDEs in two space dimensions based on temporal and spatial increments
Abstract: We deal with parameter estimation for linear parabolic second-order stochastic partial differential equations in two space dimensions driven by two types of Q-Wiener processes based on high frequency data with respect to time and space. We propose minimum contrast estimators of the coefficient parameters based on temporal and spatial increments, and provide adaptive estimators of the coefficient parameters based on approximate coordinate processes. We also give an example and simulation results of the proposed estimators. PubDate: 2024-05-18 DOI: 10.1007/s00184-024-00969-x
- Statistical inference for linear quantile regression with measurement error in covariates and nonignorable missing responses
Abstract: In this paper, we consider quantile regression estimation for linear models with covariate measurement errors and nonignorable missing responses. Firstly, the influence of measurement errors is eliminated through the bias-corrected quantile loss function. To handle the identifiability issue under nonignorable missingness, a nonresponse instrument is used. Then, based on the inverse probability weighting approach, we propose a weighted bias-corrected quantile loss function that can handle both nonignorable missingness and covariate measurement errors. Under certain regularity conditions, we establish the asymptotic properties of the proposed estimators. The finite sample performance of the proposed method is illustrated by Monte Carlo simulations and an empirical data analysis. PubDate: 2024-05-18 DOI: 10.1007/s00184-024-00967-z
- Model-X Knockoffs for high-dimensional controlled variable selection under the proportional hazards model with heterogeneity parameter
Abstract: A major challenge arising from data integration pertains to data heterogeneity in terms of study population, study design, or study coordination. Ignoring such heterogeneity in data analysis can lead to biased estimation. In this paper, regression analysis of the proportional hazards model with heterogeneity parameter is studied. We combine the Model-X Knockoffs procedure with fused LASSO approach to control the false discovery rate in the variable selection and learn the integrative data analysis of partially heterogeneous subgroups when the outcome of interest is time to event. A regularized working partial likelihood function is established and a trick of reparameterization is developed for the numerical calculation of the proposed estimator. Simulation studies are conducted to assess the finite-sample performance of the proposed method. A data example from a clinical trial in primary biliary cirrhosis study is analyzed to demonstrate the application of our proposed method. PubDate: 2024-05-06 DOI: 10.1007/s00184-024-00966-0
- On Bernoulli trials with unequal harmonic success probabilities
Abstract: A Bernoulli scheme with unequal harmonic success probabilities is investigated, together with some of its natural extensions. The study includes the number of successes over some time window, the times to (between) successive successes and the time to the first success. Large sample asymptotics, statistical parameter estimation, and relations to Sibuya distributions and Yule–Simon distributions are discussed. This toy model is relevant in several applications including reliability, species sampling problems, record values breaking and random walks with disasters. PubDate: 2024-05-01 DOI: 10.1007/s00184-023-00913-5
- Refining analytic approximation based estimation of mixed multinomial probit models by parameter selection
Abstract: Applying analytic approximations for computing multivariate normal cumulative distribution functions has led to a substantial improvement in the estimability of mixed multinomial probit models, both in terms of accuracy and especially in terms of computation time. This paper makes a contribution by presenting a possible way to improve the accuracy of estimating mixed multinomial probit model covariances based on the idea of parameter selection using cross-validation. Comparisons to the MACML approach indicate that the proposed parameter selection approach is able to recover covariance parameters more accurately, even when there is a moderate degree of independence between the random coefficients. The approach also estimates parameters efficiently, with standard errors tending to be smaller than those of the MACML approach, which can be observed by means of a real data case. PubDate: 2024-05-01 DOI: 10.1007/s00184-023-00920-6
- Optimal subsampling for modal regression in massive data
Abstract: Many modern statistical analysis research efforts are focused on solving the limited computational resources problem that arises when dealing with large datasets. One popular and effective method to address this challenge is to obtain informative subdata from the full dataset based on optimal subsampling probabilities. In this article, we present an optimal subsampling approach for big data modal regression from the perspective of minimizing asymptotic mean squared error. The estimation procedure is achieved by running a two-step algorithm based on the modal expectation-maximization algorithm when the bandwidth for the modal regression is not related to the subsample size. Under certain regularity conditions, we investigate the consistency and asymptotic normality of the subsample-based estimator given the full data. Furthermore, an optimal bandwidth selection approach within this framework is also investigated. Simulation studies demonstrate that our proposed subsampling method performs well in the context of big data modal regression. Empirical evaluation is also conducted using real data. PubDate: 2024-05-01 DOI: 10.1007/s00184-023-00916-2