- Using statistical models for optimal packaging in semiconductor manufacturing processes
Abstract: The importance of the back-end process in semiconductor manufacturing has recently received significant attention from global manufacturers. The analysis of manufacturing data often provides crucial insights into problems inherent in the manufacturing processes. An important goal of the back-end process is to improve the yield of final products, called packages. A simple way to achieve this goal is to characterize low-quality wafers based on the analysis of manufacturing data and discard them before proceeding to the packaging step. Alternatively, this paper proposes a novel packaging method that significantly improves the package yield using statistical models scoring the quality of dies. We prove that the proposed packaging method is optimal and conduct thorough numerical experiments, showing its superiority.
PubDate: 2024-08-07
- Generalized parametric help in Hilbertian additive regression
Abstract: This paper introduces a powerful bias reduction technique applied to local linear additive regression. The main idea is to make use of a parametric family. Existing techniques based on this idea use a parametric model that is linear in the parameter. In this paper we generalize the approaches by allowing nonlinear parametric families. We develop the methodology and theory for response variables taking values in a general separable Hilbert space. Under mild conditions, our proposed approach not only offers flexibility but also gains bias reduction while maintaining the variance of the local linear additive regression estimators. We also provide numerical evidence that supports our approach.
PubDate: 2024-07-30
- Objective Bayesian multiple testing for k normal populations
Abstract: This article proposes objective Bayesian multiple testing procedures for a normal model. The challenging task of considering all the configurations of true and false null hypotheses is addressed here by ordering the null hypotheses based on their Bayes factors. This approach reduces the size of the compared models for posterior search from \(2^k\) to \(k+1\), for \(k\) null hypotheses. Furthermore, the consistency of the proposed multiple testing procedures is established and their behavior is analyzed with simulated and real examples. In addition, the proposed procedures are compared with classical and Bayesian multiple testing procedures in all the possible configurations of true and false ordered null hypotheses.
PubDate: 2024-07-29
- Kernel machine in semiparametric regression with nonignorable missing responses
Abstract: Missing data is prevalent in many fields. Among all missing mechanisms, nonignorable missing data is more challenging for model identification. In this paper, we propose a semiparametric regression model estimation method with nonignorable missing responses. To be specific, we first construct a parametric model for the propensity score and apply the generalized method of moments to obtain the estimated propensity score. For nonignorable missing responses, based on the inverse probability weighting approach, we propose the penalized garrotized kernel machine method to flexibly depict the complex nonlinear relationships between the response and the predictors, allow for interactions between the predictors, and eliminate the redundant variables automatically. The cyclical coordinate descent algorithm is provided to solve the corresponding optimization problems. Numerical results and real data analysis indicate that our proposed method achieves better prediction performance compared with the competing ones.
PubDate: 2024-07-26
- Spatial regression with multiplicative errors, and its application with LiDAR measurements
Abstract: Multiplicative errors in addition to spatially referenced observations often arise in geodetic applications, particularly with light detection and ranging (LiDAR) measurements. However, regression involving multiplicative errors remains relatively unexplored in such applications. In this regard, we present a penalized modified least squares estimator to handle the complexities of a multiplicative error structure while identifying significant variables in spatially dependent observations. The proposed estimator can also be applied to classical additive error spatial regression. By establishing asymptotic properties of the proposed estimator under increasing domain asymptotics with stochastic sampling design, we provide a rigorous foundation for its effectiveness. A comprehensive simulation study confirms the superior performance of our proposed estimator in accurately estimating and selecting parameters, outperforming existing approaches. To demonstrate its real-world applicability, we employ our proposed method, along with other alternative techniques, to estimate a rotational landslide surface using LiDAR measurements. The results highlight the efficacy and potential of our approach in tackling complex spatial regression problems involving multiplicative errors.
PubDate: 2024-07-23
- Online debiased lasso estimation and inference for heterogenous updating regressions
Abstract: In the era of big data, online updating problems have attracted extensive attention. In practice, the covariate set of the models may change according to the conditions of data streams. In this paper, we propose a two-stage online debiased lasso estimation and inference method for high-dimensional heterogenous linear regression models with new variables added midway. At the first stage, the homogenization strategy is conducted to represent the heterogenous models by defining the pseudo covariates and responses. At the second stage, we conduct the online debiased lasso estimation procedure to obtain the final estimator. Theoretically, the asymptotic normality of the heterogenous online debiased lasso estimator (HODL) is established. The finite-sample performance of the proposed estimators is studied through simulation studies and a real data example.
PubDate: 2024-07-19
- Scale invariant and efficient estimation for groupwise scaled envelope model
Abstract: Motivated by different groups containing different group information under the heteroscedastic error structure, we propose the groupwise scaled envelope model that is invariant to scale changes and is permissible for distinct regression coefficients and the heteroscedastic error structure across groups. It retains the potential of the scaled envelope methods to keep the scale invariant and allows for both different regression coefficients and different error structures for diverse groups. Further, we present the maximum likelihood estimators and their theoretical properties, including parameter identifiability, asymptotic distribution and consistency of the groupwise scaled envelope estimator. Lastly, simulation studies and a real-data example demonstrate the advantages of the groupwise scaled envelope estimators, including a comparison with the standard model estimators, groupwise envelope estimators, scaled envelope estimators and separate scaled envelope estimators.
PubDate: 2024-07-14
- Statistical inference of pth-order generalized binomial autoregressive model
Abstract: To capture the higher-order autocorrelation structure for finite-range integer-valued time series of counts, and to consider the interdependence between individuals, a pth-order generalized binomial autoregressive (GBAR(p)) process is proposed in this paper. The stationarity and ergodicity of the GBAR(p) model are proved, and the basic probabilistic and statistical properties of the model are discussed. The unknown parameters are estimated by the conditional least squares and conditional maximum likelihood methods. The performance of the two kinds of estimators is studied via simulations, and the forecasting problem of this model is also considered in this paper. Finally, the model is applied to a real data set and compared with some existing models to investigate the rationality of the GBAR(p) model.
PubDate: 2024-07-13
- Strong convergence of a nonparametric relative error regression estimator under missing data with functional predictors
Abstract: In this paper, we develop a nonparametric estimator of the regression function for a functional explanatory variable and a scalar response variable that is subject to left truncation and right censoring. The estimator is constructed by minimizing the mean squared relative error, which is a robust criterion that reduces the impact of outliers relative to the Nadaraya-Watson estimator. We prove the pointwise and uniform convergence of the estimator under some regularity conditions and assess its performance by a numerical study. We also investigate the robustness of the estimator using the influence function as a measure of sensitivity to outliers and apply the estimator to a real dataset.
PubDate: 2024-07-07
- Modeling and inferences for bounded multivariate time series of counts
Abstract: This paper considers modeling bounded multivariate time series of counts and the inferential procedures of this model. For modeling, we introduce a hybrid type model similar to the scheme of integer-valued autoregressive (INAR) and integer-valued autoregressive conditional heteroscedastic (INARCH) models. To estimate the model parameters, we use the conditional least squares estimator (CLSE) and minimum density power divergence estimator (MDPDE). To evaluate the small-sample performance of the proposed estimators, we conduct a Monte Carlo simulation study and demonstrate that the proposed methods work well. Real data analysis is also carried out using syphilis data in the U.S. for illustration.
PubDate: 2024-06-25
- Bayesian hierarchical spatial model for small-area estimation with non-ignorable nonresponses and its application to the NHANES dental caries data
Abstract: The National Health and Nutrition Examination Survey (NHANES) is a major program of the National Center for Health Statistics, designed to assess the health and nutritional status of adults and children in the United States. The analysis of NHANES dental caries data faces several challenges, including (1) the data were collected using a complex, multistage, stratified, unequal-probability sampling design; (2) the sample size of some primary sampling units (PSU), e.g., counties, is very small; (3) the measures of dental caries have complicated structure and correlation; and (4) there is a substantial percentage of nonresponses, which are expected to be non-ignorable, i.e., not missing at random. We propose a Bayesian hierarchical spatial model to address these analysis challenges. We develop a two-level Potts model that closely resembles the caries evolution process, and captures complicated spatial correlations between teeth and surfaces of the teeth. By adding Bayesian hierarchies to the Potts model, we account for the multistage survey sampling design, while also enabling information borrowing across PSUs for small-area estimation. We incorporate sampling weights by including them as a covariate in the model and adopt flexible B-splines to achieve robust inference. We account for non-ignorable missing outcomes and covariates using the selection model. We use data augmentation coupled with the noisy Monte Carlo algorithm to overcome the numerical difficulty caused by doubly-intractable normalizing constants and sample posteriors. Our analysis results show strong spatial associations between teeth and tooth surfaces, including that dental hygienic factors, such as fluorosis and sealant, reduce dental disease risks.
PubDate: 2024-06-22
- An index for measuring degree of departure from symmetry for ordinal square contingency tables
Abstract: For the analysis of square contingency tables with the same row and column ordinal classifications, this study proposes an index for measuring the degree of departure from the symmetry model using new cumulative probabilities. The proposed index is constructed based on Cressie and Read’s power divergence, or the weighted average of Patil and Taillie’s diversity index. This study derives a plug-in estimator of the proposed index and an approximate confidence interval for the proposed index. The estimator of the proposed index is expected to reduce the bias more than the estimator of the existing index, even when the sample size is not large. The proposed index is identical to the existing index under the conditional symmetry model. Therefore, assuming the probability structure in which the conditional symmetry model holds, the performances of plug-in estimators of the proposed and existing indexes can be simply compared. Through numerical examples and real data analysis, the usefulness of the proposed index compared to the existing index is demonstrated.
PubDate: 2024-06-16
- Optimal designs for comparing several regression curves
Abstract: This article is concerned with the optimal design problem of efficient statistical inference for comparing several regression curves estimated from samples of independent measurements. The objective is to find the \(\mu^c_p\)-optimal designs that minimize an \(L_p\)-norm of the asymptotic variance of the prediction for the contrasts of \(k\) regression curves. General equivalence theorems are established to verify the \(\mu^c_p\)-optimality in the set of all approximate designs. Invariance properties with respect to model reparameterization are also obtained. The results obtained for the linear models are extended to the situation of generalized linear models. Three examples are presented to illustrate the applications of the obtained results.
PubDate: 2024-06-14
- Testing for conditional independence of survival time from covariate
Abstract: This study examined the test of independence of survival time from a covariate in a more general setting using empirical process techniques. Previous research has been extended in several ways: (1) allowing incompleteness of observation owing to censoring; (2) allowing a time-dependent covariate; (3) allowing a non-uniform covariate; and (4) proving the validity of the weighted bootstrap for implementing the proposed testing procedure. Certain classes of test statistics that are functionals of a natural empirical process were studied, and the limiting distribution of these statistics was then derived using the functional delta method. The limiting distributions included some linear functionals of zero mean tight Brownian bridges under the null hypothesis, and the tests were consistent against general alternatives. Tests implemented using weighted bootstrap were shown to be valid. The proposals are illustrated via simulation studies and an application to acute leukemia data.
PubDate: 2024-06-01
- Sequential online monitoring for autoregressive time series of counts
Abstract: This study considers the online monitoring problem for detecting the parameter change in time series of counts. For this task, we construct a monitoring process based on the residuals obtained from integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models. We consider this problem within a more general framework using martingale difference sequences, as the monitoring problem on GARCH-type processes based on the residuals or score vectors can be viewed as a special case of the monitoring problems on martingale differences. The limiting behavior of the stopping rule is investigated in this general set-up and is applied to the INGARCH processes. To assess the performance of our method, we conduct Monte Carlo simulations. A real data analysis is also provided for illustration. Our findings in this empirical study demonstrate the validity of the proposed monitoring process.
PubDate: 2024-06-01
- Use of ridge calibration method in predicting election results
Abstract: Ridge calibration is a penalized method used in survey sampling to reduce the variability of the final set of weights by relaxing the linear restrictions. We proposed a method for selecting the penalty parameter that minimizes the estimated mean squared error of the mean estimator when estimated auxiliary information is used. We showed that the proposed estimator is asymptotically equivalent to the generalized regression estimator. A simple simulation study shows that our estimator has a smaller MSE than the traditional calibration ones. We applied our method to predict election results using the National Barometer Survey and the Korea Social Integration Survey.
PubDate: 2024-06-01
- Logistic regression models for elastic shape of curves based on tangent representations
Abstract: Shape analysis is widely used in many application areas such as computer vision, medical and biological studies. One challenge in analyzing the shape of an object in an image is its invariance to shape-preserving transformations. To measure the distance or dissimilarity between two different shapes, we worked with the square-root velocity function (SRVF) representation and the elastic metric. Since shapes are inherently high-dimensional in a nonlinear space, we adopted a tangent space at the mean shape and a few principal components (PCs) on the linearized space. We proposed classification methods based on logistic regression using these PCs and tangent vectors with the elastic net penalty. We then compared their performance with other model-based methods for shape classification in an application to the shape of algae in watersheds as well as simulated data generated from a mixture of von Mises-Fisher distributions.
PubDate: 2024-06-01
- A self-normalization test for structural breaks in a regression model for panel data sets
Abstract: We construct a new structural break test in a panel regression model using the self-normalization method. The self-normalization test is shown to be superior to an existing test in that the former is theoretically and experimentally valid for regression models with serially and/or cross-sectionally correlated errors while the latter is not. We derive the asymptotic null distribution of the self-normalization test and its consistency under an alternative hypothesis. Unlike the existing test requiring bootstrap computation for critical values, the self-normalization test is implemented easily with a set of simple critical values. A Monte Carlo experiment reports that self-normalization resolves the severe over-size problem of the existing test under serial and/or cross-sectional error correlation.
PubDate: 2024-06-01
- Disseminating massive frequency tables by masking aggregated cell frequencies
Abstract: We propose a confidential approach for disseminating frequency tables constructed for any combination of key variables in the given microdata, including those of hierarchical key variables. The system generates all possible frequency tables by either marginalizing or aggregating fully joint frequency tables of key variables while protecting the original cells with low frequencies through two masking steps: the small cell adjustments for joint tables followed by the proposed algorithm called information loss bounded aggregation for aggregated cells. The two-step approach is designed to control both disclosure risk and information loss by ensuring the k-anonymity of original cells with small frequencies while keeping the loss within a bounded limit.
PubDate: 2024-06-01
- Correction: Disseminating massive frequency tables by masking aggregated cell frequencies
PubDate: 2024-04-03