Abstract: In reliability engineering, the relevation model can be adopted to characterize the performance of redundancy allocation for coherent systems. In this paper, we investigate the problem of allocating relevations to two nodes of a coherent system with independent components in order to enhance system reliability. We first derive the optimal allocation policy for two relevations at two nodes of the system under certain conditions. As a special setting of the relevation, we further discuss optimal allocation strategies for a batch of minimal repairs allocated to two components of the coherent system by applying the useful tool of majorization order. Sufficient conditions are established in terms of the structural relationships between the components induced by minimal cut or path sets and the reliabilities of the components and relevations. Some numerical examples are provided as illustrations. A real application to aircraft indicator-light systems is also presented to demonstrate the applicability of our results. PubDate: 2023-03-17
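For context, the relevation of two lifetimes \(X \sim F\) and \(Y \sim G\) (Krakowski 1973) describes replacing a unit that fails at time u by a unit of age u with lifetime distribution G; its survival function is
\[
\overline{F\#G}(t) \;=\; \bar{F}(t) + \int_0^t \frac{\bar{G}(t)}{\bar{G}(u)}\,\mathrm{d}F(u).
\]
Taking \(G = F\) recovers minimal repair, with \(\overline{F\#F}(t) = \bar{F}(t)\bigl(1 - \ln \bar{F}(t)\bigr)\), the special setting studied in the second part of the paper.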
Abstract: Matrix-valued data arise in many applications. In this paper, we consider the setting where one collects both matrix-valued data \(\textbf{Y}\in \mathbb {R}^{p\times q}\) and a generic scalar X that can be continuous, discrete or categorical. Since the rows and columns of \(\textbf{Y}\) often have specific meanings in practice, it is of interest to make statistical inferences on the significance of the rows and columns of \(\textbf{Y}\). In this paper, by taking the background effect into account, we propose a new measure of the significance of rows and columns based on an additive model. Point estimation, hypothesis testing and confidence intervals for the significance of a given row or column of \(\textbf{Y}\) are considered. Moreover, a procedure is proposed to select significant rows and columns. Our method remains applicable when both p and q are much larger than the sample size n. Simulation results and real data analysis demonstrate the effectiveness of the proposed method. PubDate: 2023-03-13
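As rough intuition only (the paper's measure additionally adjusts for the covariate X and comes with formal tests and intervals), a plain two-way additive decomposition already separates row and column contributions from a background mean; the function below is a hypothetical illustration, not the authors' estimator.

    import numpy as np

    def row_col_effects(Y):
        """Decompose Y approximately as mu + row effect + column effect.

        A generic two-way ANOVA-style decomposition; a large |effect| hints at
        a row or column deviating from the background mean."""
        mu = Y.mean()                  # background (grand mean)
        rows = Y.mean(axis=1) - mu     # one effect per row
        cols = Y.mean(axis=0) - mu     # one effect per column
        return mu, rows, cols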
Abstract: We establish a global unbiased divide-and-conquer estimation method (gub-DC) for the linear model and a global bias-reduced DC estimation method (gbr-DC) for the nonlinear model under memory constraints. To introduce the new strategy in the linear model, we first provide a new insight into the statistical structure through the closed-form representation of the local biased estimator, and then construct a pro forma linear regression with the local estimator as “response variable” and the parameter of interest as “intercept.” Based on this regression structure, we construct a global unbiased estimator as the least squares estimator of the intercept. The gub-DC method applies to various biased estimators in the linear model, such as the ridge, principal component and Stein estimators. Moreover, the method can be extended to the nonlinear model to construct a global bias-reduced estimator. The main advantage over classical DC methods is that the proposed procedures absorb the information hidden in the statistical structure, so the resulting global estimators are strictly unbiased or achieve root-n consistency without any constraint on the number of batches. Another attractive feature is computational simplicity and efficiency. Detailed simulation studies demonstrate that the new estimators are significantly bias-corrected, behave comparably with full-data estimation, and perform better than, or at least no worse than, their competitors. PubDate: 2023-03-01
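One plausible concrete reading of this construction for the ridge case (a sketch under our own assumptions, not the authors' code): the local ridge estimator satisfies \(E[\hat{\beta}_k] = A_k\beta\) with known \(A_k = (X_k^\top X_k + \lambda I)^{-1}X_k^\top X_k\), so stacking the local estimates gives a linear regression in \(\beta\) whose least squares solution is exactly unbiased.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, K, lam = 5000, 10, 10, 5.0       # sample size, dimension, batches, ridge penalty
    beta = rng.normal(size=p)
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)

    A_list, b_list = [], []
    for Xk, yk in zip(np.array_split(X, K), np.array_split(y, K)):
        G = Xk.T @ Xk
        A_list.append(np.linalg.solve(G + lam * np.eye(p), G))          # A_k with E[b_k] = A_k beta
        b_list.append(np.linalg.solve(G + lam * np.eye(p), Xk.T @ yk))  # local ridge estimate b_k

    # Least squares "intercept" of the pro forma regression b_k = A_k beta + noise:
    lhs = sum(A.T @ A for A in A_list)
    rhs = sum(A.T @ b for A, b in zip(A_list, b_list))
    beta_gub = np.linalg.solve(lhs, rhs)   # unbiased: E[beta_gub] = beta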
Abstract: Benford’s law is often used to support critical decisions related to data quality or the presence of data manipulation or even fraud in large datasets. However, many authors argue that conventional statistical tests will reject the null of data “Benford-ness” when applied to samples of the size typical in such applications, even in the presence of tiny and practically unimportant deviations from Benford’s law. They therefore suggest alternative criteria that, however, lack solid statistical foundations. This paper contributes to the debate on the “large n” (or “excess power”) problem in the context of Benford’s law testing. The issue is discussed in relation to the notion of severity testing for goodness-of-fit tests, with a specific focus on tests of conformity with Benford’s law. To this end, we also derive the asymptotic distribution of the mean absolute deviation (MAD) statistic as well as an asymptotic standard normal test. Finally, the severity testing principle is applied to six controversial large datasets to assess their “Benford-ness”. PubDate: 2023-02-28
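For reference, the conventional first-digit MAD statistic compares observed leading-digit frequencies with the Benford probabilities \(\log_{10}(1 + 1/d)\); a minimal sketch follows (the paper's contribution is the asymptotic theory for this statistic, which is not reproduced here).

    import numpy as np

    def benford_mad(x):
        """Mean absolute deviation of first-digit frequencies from Benford's law."""
        x = np.abs(np.asarray(x, dtype=float))
        x = x[x > 0]
        digits = np.array([int(f"{v:e}"[0]) for v in x])       # leading significant digit
        observed = np.array([(digits == d).mean() for d in range(1, 10)])
        benford = np.log10(1.0 + 1.0 / np.arange(1, 10))       # P(D = d), d = 1..9
        return np.abs(observed - benford).mean()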
Abstract: We propose a novel class of first-order integer-valued autoregressive (INAR(1)) models based on a new operator, the so-called geometric thinning operator, which induces a certain non-linearity in the models. We show that this non-linearity can produce better results in terms of prediction when compared to the linear case commonly considered in the literature. The new models are called non-linear INAR(1) (in short, NonLINAR(1)) processes. We explore both stationary and non-stationary versions of the NonLINAR processes. Inference on the model parameters is addressed, and the finite-sample behavior of the estimators is investigated through Monte Carlo simulations. Two real data sets are analyzed to illustrate the stationary and non-stationary cases and the gain the induced non-linearity provides over existing linear methods. A generalization of the geometric thinning operator and an associated NonLINAR process are also proposed and motivated for dealing with zero-inflated or zero-deflated count time series data. PubDate: 2023-02-25
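For contrast, the linear baseline the paper improves upon is the classical binomial-thinning INAR(1) process \(X_t = \alpha \circ X_{t-1} + \epsilon_t\); a minimal simulation sketch is given below (the geometric thinning operator itself is defined in the paper and is not reproduced here).

    import numpy as np

    rng = np.random.default_rng(1)

    def simulate_inar1(alpha, lam, T, x0=0):
        """Classical INAR(1): binomial thinning of the previous count plus Poisson innovations."""
        x = np.empty(T, dtype=int)
        x[0] = x0
        for t in range(1, T):
            x[t] = rng.binomial(x[t - 1], alpha) + rng.poisson(lam)
        return x

    counts = simulate_inar1(alpha=0.5, lam=2.0, T=500)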
Abstract: In partially linear single-index models, there are two different covariate matrices, one for the linear part and one for the nonlinear part, so all covariate information must be divided into two parts before the model can be fitted. In contrast, in the extended partially linear single-index models, all covariate variables are collected in a single matrix that enters both the linear part and the nonlinear part of the model. We propose local smoothing estimators for the model parameters and the unknown function, together with computationally efficient and accurate estimation procedures, and obtain the asymptotic properties of the parameter estimators. We also employ the LASSO penalty to obtain penalized estimators enjoying consistency and the oracle property, so as to carry out estimation and variable selection simultaneously. We then develop a linear hypothesis test for the model parameters. Furthermore, we extend the proposed methodology to increasing-dimensional settings under certain assumptions. Simulation studies are presented that support our analytic results. In addition, a real data analysis is provided for illustration. PubDate: 2023-02-21
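In standard notation (a sketch of the usual formulations, with the customary identifiability constraints such as \(\Vert\theta\Vert = 1\)), the contrast between the two models is
\[
Y = \textbf{Z}^\top\beta + g(\textbf{X}^\top\theta) + \varepsilon
\qquad\text{versus}\qquad
Y = \textbf{X}^\top\beta + g(\textbf{X}^\top\theta) + \varepsilon,
\]
where the extended model on the right lets the same covariate vector \(\textbf{X}\) drive both the linear part and the unknown link g.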
Abstract: Principal component analysis (PCA) and canonical correlation analysis (CCA) are dimension-reduction techniques in which either a random vector is well approximated in a lower-dimensional subspace, or two random vectors from high-dimensional spaces are reduced to a new pair of low-dimensional vectors after applying linear transformations to each of them. In both techniques, the closeness between the higher-dimensional vector and its lower-dimensional representation is of interest, and we measure this closeness through a robust function. Robust SM-estimation has been treated in the context of PCA and CCA, showing outstanding performance under casewise contamination, which encourages the study of its asymptotic properties. We analyze consistency and asymptotic normality for the SM-canonical vectors. As a by-product of the CCA derivations, the asymptotics for PCA can also be obtained. The influence function, a classical measure of robustness, is analyzed, showing the usual performance of S-estimation in different statistical models. The general ideas behind SM-estimation in either PCA or CCA are specially tailored to the context of association, rendering robust measures of association between random variables. By these means, a robust correlation measure is derived, and the connection with the association measure provided by S-estimation for bivariate scatter is analyzed. In addition, we propose a second robust correlation measure which is reminiscent of depth-based procedures. PubDate: 2023-02-20
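At the core of S-estimation (and hence of the SM variants studied here) is the M-scale of residuals, stated here for context: for residuals \(r_1,\dots,r_n\), the scale \(\sigma_n\) solves
\[
\frac{1}{n}\sum_{i=1}^{n} \rho\!\left(\frac{r_i}{\sigma_n}\right) = b,
\]
with \(\rho\) a bounded loss function and b a tuning constant controlling the breakdown point; S-estimators then choose the fit that minimizes this scale.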
Abstract: This paper investigates small area estimation of population averages of unit-level compositional data. The new methodology transforms the compositions into vectors of \(\mathbb{R}^m\) and assumes that the transformed vectors follow a multivariate nested error regression model. Empirical best predictors of domain indicators are derived from the fitted model, and their mean squared errors are estimated by parametric bootstrap. The behavior of the introduced predictors is investigated by means of simulation experiments. An application to real data from the Spanish household budget survey is given, where the target is to estimate the average proportions of annual household expenditure on food, housing and other categories, by Spanish province. PubDate: 2023-02-15
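A common transformation of this kind (the paper's specific choice may differ) is the additive log-ratio, which maps an (m+1)-part composition to \(\mathbb{R}^m\):
\[
\operatorname{alr}(x_1,\dots,x_{m+1}) \;=\; \Bigl(\log\tfrac{x_1}{x_{m+1}},\,\dots,\,\log\tfrac{x_m}{x_{m+1}}\Bigr),
\]
after which standard multivariate mixed models, such as the nested error regression model, can be fitted on the unconstrained scale.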
Abstract: Sharp upper bounds are proved for the probability that a standardized random variable takes a value outside a possibly asymmetric interval around 0. Six classes of distributions for the random variable are considered, namely the general class of ‘distributions’, the class of ‘symmetric distributions’, of ‘concave distributions’, of ‘unimodal distributions’, of ‘unimodal distributions with coinciding mode and mean’, and of ‘symmetric unimodal distributions’. In this way, results by Gauß (Commentationes Societatis Regiae Scientiarum Gottingensis Recentiores 5:1–58, 1823), Bienaymé (C R Hebd Séance Acad Sci Paris 37:309–24, 1853), Chebyshev (Journal de mathématiques pures et appliqués (2) 12:177–184, 1867), and Cantelli (Atti del Congresso Internazionale dei Matematici 6:47–59, 1928) are generalized. For some of the known inequalities, such as the Gauß inequality, an alternative proof is given. PubDate: 2023-01-24
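For orientation, the classical bounds being generalized are, for a random variable X with mean \(\mu\), standard deviation \(\sigma\), mode \(\nu\) and \(\tau^2 = E(X-\nu)^2\):
\[
\text{Bienaym\'e--Chebyshev:}\quad P(|X-\mu|\ge k\sigma) \le \frac{1}{k^2},
\qquad
\text{Cantelli:}\quad P(X-\mu \ge k\sigma) \le \frac{1}{1+k^2},
\]
\[
\text{Gau{\ss} (unimodal } X\text{):}\quad P(|X-\nu| \ge k\tau) \le \frac{4}{9k^2}
\quad\text{for } k \ge 2/\sqrt{3}.
\]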
Abstract: We explore the use of penalized complexity (PC) priors for assessing the dependence structure in a multivariate distribution F, with particular emphasis on the bivariate case. We use the copula representation of F and derive the PC prior for the parameter governing the copula. We show that any \(\alpha\)-divergence between a multivariate distribution and its counterpart with independent components does not depend on the marginal distributions of the components. This implies that the PC prior for the copula parameters can be elicited independently of the specific form of the marginals. This represents a useful simplification in the model-building step and may offer a new perspective in the field of objective Bayesian methodology. We also consider strategies for minimizing the role of subjective inputs in the prior elicitation step. Finally, we explore the use of PC priors in Bayesian hypothesis testing. Our prior is compared with competing default priors, both for estimation and for testing. PubDate: 2023-01-14 DOI: 10.1007/s11749-022-00843-w
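Recall the generic PC-prior construction of Simpson et al. (2017), which this paper specializes to copula parameters (the paper's invariance result covers the broader class of \(\alpha\)-divergences): with base model \(\xi = 0\), here independence, the prior is exponential on the distance scale,
\[
d(\xi) = \sqrt{2\,\mathrm{KLD}\bigl(f(\cdot\mid\xi)\,\Vert\, f(\cdot\mid \xi=0)\bigr)},
\qquad
\pi(\xi) = \lambda\, e^{-\lambda d(\xi)}\left|\frac{\partial d(\xi)}{\partial \xi}\right|,
\]
so the invariance result means \(d(\xi)\), and hence the prior, can be computed from the copula alone, without reference to the marginals.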
Abstract: Sufficient dimension reduction reduces the dimension of a regression model without loss of information by replacing the original predictor with its lower-dimensional linear combinations. Partial (sufficient) dimension reduction arises when the predictors naturally fall into two sets \(\textbf{X}\) and \(\textbf{W}\), and pursues a dimension reduction of \(\textbf{X}\) only. Though partial dimension reduction is a very general problem, few research results are available when \(\textbf{W}\) is continuous and, to the best of our knowledge, none can deal with the situation where the reduced lower-dimensional subspace of \(\textbf{X}\) varies with \(\textbf{W}\). To address this issue, in this paper we propose a novel variable-dependent partial dimension reduction framework and adapt classical sufficient dimension reduction methods to this general paradigm. The asymptotic consistency of our method is investigated. Extensive numerical studies and real data analysis show that our variable-dependent partial dimension reduction method has superior performance compared to the existing methods. PubDate: 2023-01-10 DOI: 10.1007/s11749-022-00841-y
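In symbols (a sketch of the usual conditional-independence formulation), classical partial dimension reduction seeks a fixed matrix B with
\[
Y \perp\!\!\!\perp \textbf{X} \mid \bigl(B^\top\textbf{X},\, \textbf{W}\bigr),
\]
whereas the variable-dependent framework described in the abstract allows the reducing matrix to change with \(\textbf{W}\), i.e., \(Y \perp\!\!\!\perp \textbf{X} \mid (B(\textbf{W})^\top\textbf{X},\, \textbf{W})\).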
Abstract: Testing the structure of a high-dimensional covariance matrix plays an important role in financial stock analyses, genetic series analyses, and many other fields. Testing whether the covariance matrix is block-diagonal in the high-dimensional setting is the main focus of this paper. Several test procedures that rely on normality assumptions, two-diagonal-block assumptions, or sub-block dimensionality assumptions have been proposed to tackle this problem. To relax these assumptions, we develop a test framework based on U-statistics, and the asymptotic distributions of the U-statistics are established under the null and local alternative hypotheses. Moreover, a test approach is developed for alternatives with different sparsity levels. Finally, both a simulation study and real data analysis demonstrate the performance of our proposed methods. PubDate: 2022-12-26 DOI: 10.1007/s11749-022-00842-x
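Writing \(\textbf{X} = (\textbf{X}_{(1)}^\top,\dots,\textbf{X}_{(G)}^\top)^\top\) for the given partition into sub-vectors, the null hypothesis under test is
\[
H_0:\ \Sigma = \operatorname{diag}(\Sigma_{11},\dots,\Sigma_{GG}),
\quad\text{equivalently}\quad
\operatorname{Cov}\bigl(\textbf{X}_{(g)}, \textbf{X}_{(h)}\bigr) = 0 \ \text{ for all } g \ne h.
\]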
Abstract: While many bandwidth selectors exist for estimation, bandwidth selection for statistical matching and prediction has hardly been studied so far. We introduce a computationally attractive selector for nonparametric out-of-sample prediction problems such as data matching, impact evaluation, scenario simulation or imputing missing values. Even though the method is bootstrap-based, we derive closed expressions for the criterion function, which avoids the need for Monte Carlo approximations. We study both asymptotic and finite-sample performance. The derived consistency and convergence rate, together with extensive simulation studies, show that the selector operates successfully. The method is illustrated on real data by studying the gender wage gap in Spain: the salary of Spanish women is predicted nonparametrically by the wage equation estimated for men, conditioned on the women's own characteristics. An important discrepancy between observed and predicted wages is found, exhibiting a serious gender wage gap. PubDate: 2022-12-10 DOI: 10.1007/s11749-022-00838-7
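The counterfactual exercise itself can be sketched with any nonparametric smoother; below is a minimal Nadaraya–Watson version with a hypothetical single covariate (the paper's contribution, the bootstrap-based bandwidth selector producing h, is not reproduced here).

    import numpy as np

    def nw_predict(x_train, y_train, x_new, h):
        """Nadaraya-Watson regression estimate with a Gaussian kernel and bandwidth h."""
        w = np.exp(-0.5 * ((x_new[:, None] - x_train[None, :]) / h) ** 2)
        return (w * y_train).sum(axis=1) / w.sum(axis=1)

    # Counterfactual prediction: fit the wage curve on men, evaluate at women's
    # characteristics (wage_m, x_m, x_w are hypothetical arrays; h would come
    # from the paper's bootstrap selector):
    #   wage_w_pred = nw_predict(x_m, wage_m, x_w, h)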
Abstract: Never was it better said: a correct diagnosis is crucial for patient recovery. In the eradication of poverty, the first of the sustainable development goals (SDGs) established by the United Nations, efforts in the form of social aid and programs will be useless if they are not directed where they are most needed. Monitoring progress on the SDGs is even more urgent after the recent health crisis, which is reversing the global poverty reduction observed since 1990; and since social development funds are always limited, managing them correctly requires disaggregated statistical information on poverty of acceptable quality. However, reliable estimates of living conditions are scarce due to the sample size limitations of most official surveys. Common small area estimation procedures supplement the survey data with auxiliary data sources to produce more reliable disaggregated estimates than those based solely on the survey data. We describe traditional as well as recent model-based procedures for obtaining reliable disaggregated estimates of poverty and inequality indicators, discussing their properties from a practical point of view, placing emphasis on real applications and describing software implementations. We discuss results from recent simulation experiments that compare some of the unit-level methods in terms of bias and efficiency, under model- and design-based setups. Finally, we provide some concluding remarks. PubDate: 2022-12-01 DOI: 10.1007/s11749-022-00822-1
Abstract: We analyze reliability systems with components whose lifetimes are identically distributed and whose joint distribution admits a Samaniego signature representation of the system lifetime distribution. Our main result is the following. Assume that two systems have the same structure and that the lifetimes of the components of the two systems share the same dependence copula. If the lifetime of the first system precedes (succeeds) its component lifetime in the convex transform order, and if the component lifetime of the second system also precedes (succeeds) the component lifetime of the first system in the convex transform order, then the system-component ordering property is preserved by the second system, i.e., the system lifetime precedes (succeeds) the component lifetime in the second system as well. This allows us to derive various sufficient and necessary conditions on the system signature under which the monotone failure rate and density properties of the component lifetimes are inherited by the system lifetime when the component lifetimes are independent. PubDate: 2022-12-01 DOI: 10.1007/s11749-022-00808-z
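For context, the signature representation referred to here expresses the lifetime T of an n-component system through the order statistics of the component lifetimes:
\[
P(T > t) \;=\; \sum_{i=1}^{n} s_i\, P(X_{i:n} > t),
\qquad s_i = P(T = X_{i:n}),
\]
where the signature vector \((s_1,\dots,s_n)\) depends only on the system structure; the paper's standing assumption is that the joint component distribution admits this representation.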
Abstract: The \(\delta\)-shock model is one of the basic shock models, with a wide range of applications in reliability, finance and related fields. In the existing literature, it is assumed that the recovery time of a system from the damage induced by a shock is constant, as is the shock magnitude. However, as technical systems gradually deteriorate with time, it takes more time to recover from this damage, and a larger shock magnitude has the same effect. Therefore, in this paper, we introduce a general \(\delta\)-shock model in which the recovery time depends on both the arrival times and the magnitudes of the shocks. Moreover, we also consider a more general and flexible shock process, namely, the Poisson generalized gamma process. It includes the homogeneous Poisson process, the non-homogeneous Poisson process, the Pólya process and the generalized Pólya process as particular cases. For the defined survival model, we derive relationships for the survival function and the mean lifetime and study some relevant stochastic properties. As an application, an example of the corresponding optimal replacement policy is discussed. PubDate: 2022-12-01 DOI: 10.1007/s11749-022-00810-5
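In one common formulation of the classical model (stated for orientation; the paper generalizes both the recovery time \(\delta\) and the shock process), with \(X_1, X_2, \dots\) the inter-arrival times of the shocks, the system fails at the first shock arriving within \(\delta\) of its predecessor:
\[
\nu = \min\{\,n \ge 2 : X_n < \delta\,\}, \qquad T = \sum_{i=1}^{\nu} X_i .
\]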
Abstract: The joint distribution of two or more variables can be influenced by the outcome of a conditioning variable. In this paper, we propose a flexible Wald-type statistic to test for such influence. The test is based on a conditioned multivariate nonparametric estimator of Kendall’s tau. The asymptotic properties of the test statistic are established under different null hypotheses, such as conditional independence or constant conditional dependence. Two simulation studies are presented: the first shows that the proposed estimator and the bandwidth selection procedure perform well; the second considers different bivariate and multivariate models to check the size and power of the test and runs comparisons with previous proposals where appropriate. The results support the contention that the test is accurate even in complex situations and that its computational cost is low. As an empirical application, we study the dependence between some pillars of European Regional Competitiveness when conditioned on the quality of regional institutions. We find interesting results, such as weaker links between innovation and higher education in regions with lower institutional quality. PubDate: 2022-12-01 DOI: 10.1007/s11749-022-00806-1
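In the bivariate case (the paper's multivariate statistic generalizes this), the conditional Kendall's tau on which such tests rest is
\[
\tau(z) \;=\; \mathbb{E}\bigl[\operatorname{sign}\bigl((X_1 - X_2)(Y_1 - Y_2)\bigr) \,\big|\, Z_1 = Z_2 = z\bigr],
\]
where \((X_1,Y_1,Z_1)\) and \((X_2,Y_2,Z_2)\) are independent copies; it is estimated by kernel-weighting pairs of observations whose conditioning values lie near z, which is where the bandwidth selection mentioned in the abstract enters.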
Abstract: We develop new models for imperfect repair and the corresponding generalized renewal processes for the stochastic description of repairable items that fail when their degradation reaches a specified deterministic or random threshold. The discussion is based on the recently suggested notion of a random virtual age and is applied to monotone degradation processes with independent increments. Imperfect repair reduces the degradation of an item on failure to some intermediate level. For non-homogeneous processes, however, the corresponding age reduction, which sets back the ‘clock’ of the process, is also performed. Some relevant stochastic comparisons are obtained. It is shown that the cycles of the corresponding generalized imperfect renewal process are stochastically decreasing/increasing depending on the monotonicity properties of the failure rate that describes the random failure threshold of an item. PubDate: 2022-12-01 DOI: 10.1007/s11749-022-00813-2
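For orientation, the classical deterministic virtual-age recursions of Kijima, with \(X_n\) the n-th inter-failure time and repair factor \(q \in [0,1]\), are
\[
\text{Kijima I:}\quad V_n = V_{n-1} + qX_n,
\qquad
\text{Kijima II:}\quad V_n = q\,(V_{n-1} + X_n),
\]
with q = 0 corresponding to perfect repair and q = 1 to minimal repair; the random virtual age used in this paper generalizes such recursions.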
Abstract: Estimation and testing are studied for functional data with temporally dependent errors, an interesting example of which is the event-related potential (ERP). B-spline estimators are formulated for the individual smooth trajectories as well as for their population mean. The mean estimator is shown to be oracally efficient in the sense that it is as efficient as the infeasible mean estimator obtained if all trajectories had been fully observed without error contamination. The oracle efficiency entails an asymptotically correct simultaneous confidence band (SCB) for the mean function, which is useful for making inference on the global shape of the mean. Extensive simulation experiments with various time series errors and functional principal components confirm the theoretical conclusions. For a moderate-sized ERP data set, multiple comparisons are made by constructing paired SCBs among four different stimuli, over the three components N450, N1 and N2, separately or simultaneously, leading to interesting findings. PubDate: 2022-12-01 DOI: 10.1007/s11749-022-00820-3
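As a rough sketch of the smooth-then-average idea (the paper's estimator involves specific knot choices and an SCB construction not reproduced here), each trajectory can be spline-smoothed and the fits averaged:

    import numpy as np
    from scipy.interpolate import splev, splrep

    def spline_mean(time, trajectories, s=1.0):
        """Smooth each noisy trajectory with a B-spline and average the fits.

        time: common observation grid (1-D); trajectories: array of shape (n, len(time));
        s: smoothing parameter passed to splrep (a stand-in for the paper's knot choice)."""
        fits = np.array([splev(time, splrep(time, y, s=s)) for y in trajectories])
        return fits.mean(axis=0)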