Abstract: In this work, we discuss the asymptotic behavior of minima and maxima of moving sums of independent and non-identically distributed random variables. We first establish some theoretical results associated with the asymptotic behavior of minima and maxima. Then, we apply these results to exponential and normal models. We also derive strong limit results for the minima and maxima of moving sums taken from these two models. PubDate: 2024-06-01
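
As a small illustration of the objects studied in this abstract, the sketch below computes the moving sums of a finite sequence and their minimum and maximum. The function names `moving_sums` and `moving_sum_extremes` and the fixed window length are assumptions made for illustration; the abstract does not specify how the window is chosen, and the code does not reproduce any of the paper's limit results.

```python
def moving_sums(xs, window):
    """Return all sums of `window` consecutive entries of xs."""
    if not 1 <= window <= len(xs):
        raise ValueError("window must satisfy 1 <= window <= len(xs)")
    total = sum(xs[:window])
    sums = [total]
    # Slide the window one step at a time: add the new entry, drop the oldest.
    for i in range(window, len(xs)):
        total += xs[i] - xs[i - window]
        sums.append(total)
    return sums


def moving_sum_extremes(xs, window):
    """Return (minimum, maximum) of the moving sums of xs."""
    sums = moving_sums(xs, window)
    return min(sums), max(sums)
```

For a sample of independent, non-identically distributed variables, `xs` would simply hold one draw from each distribution in order.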

Abstract: \(U\)-statistics represent a fundamental class of statistics for modeling quantities of interest defined by multi-subject responses. \(U\)-statistics generalize the empirical mean of a random variable \(X\) to sums over every \(m\)-tuple of distinct observations of \(X\). Stute [103] introduced a class of so-called conditional \(U\)-statistics, which may be viewed as a generalization of the Nadaraya-Watson estimates of a regression function. Stute proved their strong pointwise consistency to: $$r^{(k)}(\varphi,\tilde{\mathbf{t}}):=\mathbb{E}[\varphi(Y_{1},\ldots,Y_{k})\mid(X_{1},\ldots,X_{k})=\tilde{\mathbf{t}}]\quad\textrm{for}\quad\tilde{\mathbf{t}}=\left(\mathbf{t}_{1},\ldots,\mathbf{t}_{k}\right)\in\mathbb{R}^{dk}.$$ In the analysis of modern machine learning algorithms, sometimes we need to manipulate kernel estimation within the nonconventional setting with intricate kernels that might even be irregular and asymmetric. In this general setting, we obtain the strong uniform consistency result for the general kernel on Riemannian manifolds with Riemann integrable kernels for the conditional \(U\)-processes. We treat both cases when the class of functions is bounded or unbounded, satisfying some moment conditions. These results are proved under some standard structural conditions on the classes of functions and some mild conditions on the model. Our findings are applied to the regression function, the set indexed conditional \(U\)-statistics, the generalized \(U\)-statistics, and the discrimination problem. The theoretical results established in this paper are (or will be) key tools for many further developments in manifold data analysis. PubDate: 2024-06-01

Abstract: Statistical data analysis is of great interest in every field of management, business, engineering, medicine, etc. During classification and analysis, errors may arise, such as an observation being assigned to a class other than its actual class. All fields of science and economics have substantial problems due to misclassification errors in the observed data. Due to a misclassification error in the data, the sampling process may not suggest an appropriate probability distribution, and in that case, inference is impaired. When these types of errors are identified in variables, the analysis is expected to account for the classification errors. This paper presents the situation where specific counts are reported erroneously as belonging to other counts in the context of the size biased Uniform Poisson distribution, the so-called misclassified size biased Uniform Poisson distribution. Further, we have estimated the parameters of the misclassified size biased Uniform Poisson distribution by applying the method of moments, the maximum likelihood method, and an approximate Bayes estimation method. A simulation study is carried out to assess the performance of the estimation methods. A real dataset is discussed to demonstrate the suitability and applicability of the proposed distribution in modeling count data. A Monte Carlo simulation study is presented to compare the estimators. The simulation results show that the ML estimates perform better than their corresponding moment estimates and approximate Bayes estimates. PubDate: 2024-06-01

Abstract: In actuarial science, it is often of interest to stochastically compare the smallest claim amounts from heterogeneous portfolios. In this paper, we obtain the usual stochastic order between the smallest claim amounts when the matrix of parameters \((\boldsymbol{\alpha},\boldsymbol{\lambda})\) changes to another matrix in terms of chain majorization order. By using the Archimedean copula and weak majorization concepts, we also obtain some conditions for comparison of smallest claim amounts in terms of usual stochastic order. PubDate: 2024-06-01

Abstract: In survival analysis, random right-censoring partitions data into uncensored and censored observations of the lifetime of interest. The dominance of uncensored observations is a familiar methodology in nonparametric estimation motivated by the classical Kaplan–Meier product-limit and Cox partial likelihood estimators. Nonetheless, for high rate censoring it is of interest to understand what, if anything, can be done by aggregating uncensored and censored observations for the staple nonparametric problems of density and regression estimation. The oracle, who knows the distribution of the censoring lifetime, can use each subsample for consistent estimation and hence may shed light on the aggregation. The oracle’s asymptotic theory reveals that density estimation based on censored observations is an ill-posed problem with slower rates of risk convergence, that the ill-posedness occurs in the frequency domain, that its severity increases with frequency, and that accordingly a special aggregation on low frequencies may be beneficial. On the other hand, nonparametric regression based on censored observations is not ill-posed and the aggregation is feasible. Based on these theoretical results, a methodology of aggregation in the frequency domain is developed and the proposed estimators are tested on simulated and real examples. PubDate: 2024-06-01

Abstract: In a number of research areas, such as non-convex optimization and machine learning, determining and assessing regions of monotonicity of functions is pivotal. Numerically, it can be done using the proportion of positive (or negative) increments of transformed ordered inputs. When the number of inputs grows, the proportion tends to an index of increase (or decrease) of the underlying function. In this paper, we introduce a most general index of monotonicity and provide its interpretation in all practically relevant scenarios, including those that arise when the distribution of inputs has jumps and flat regions, and when the function is only piecewise differentiable. This enables us to assess monotonicity of very general functions under particularly mild conditions on the inputs. PubDate: 2024-03-01 DOI: 10.3103/S1066530724700054
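
The numerical recipe mentioned in this abstract (the proportion of positive increments of the transformed ordered inputs) can be sketched as follows. This is only the basic empirical proportion, not the general index of monotonicity introduced in the paper; the function name `proportion_positive_increments` is an assumption, and tied inputs produce zero increments, which this sketch counts as non-positive.

```python
import math
import random


def proportion_positive_increments(h, inputs):
    """Fraction of strictly positive increments of h over the sorted inputs."""
    ordered = sorted(inputs)
    values = [h(x) for x in ordered]
    increments = [b - a for a, b in zip(values, values[1:])]
    if not increments:
        raise ValueError("need at least two inputs")
    return sum(1 for d in increments if d > 0) / len(increments)


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.uniform(0.0, math.pi) for _ in range(2000)]
    # sin increases on [0, pi/2] and decreases on [pi/2, pi], so with
    # uniform inputs the proportion should be close to one half.
    print(proportion_positive_increments(math.sin, xs))
```

A value near 1 suggests the function is increasing over most of the input range, and a value near 0 suggests it is decreasing.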

Abstract: Separation has a significant impact on parameter estimates for logistic regression models in both the frequentist and the Bayesian approach. When separation is present in a sample, the maximum likelihood estimate (MLE) does not exist under standard estimation methods. The existence of posterior means is affected by the presence of separation and also depends on the forms of the prior distributions. Therefore, controlling the appearance of separation when generating samples from logistic regression models plays an important role in parameter estimation techniques. In this paper, we propose necessary and sufficient conditions for separation to occur in logistic regression samples for models with two-dimensional and multi-dimensional independent variables. By using the technique of rotating Cartesian coordinates in p dimensions, the characteristic of separation occurring in general cases is presented. Using these results, we propose algorithms to control the probability of separation in generated samples for given sample sizes and multi-dimensional models of independent variables. The simulation studies show that the proposed algorithms can effectively generate the designed random samples while controlling the probability of separation. PubDate: 2024-03-01 DOI: 10.3103/S1066530724700017

Abstract: In this paper, we estimate the precision matrix \({\Sigma}^{-1}\) of a Gaussian multivariate linear regression model through its canonical form \(({Z}^{T},{U}^{T})^{T}\), where \(Z\) and \(U\) are, respectively, \(m\times p\) and \(n\times p\) matrices. This problem is addressed under the data-based loss function \(\textrm{tr}\ [({\hat{\Sigma}}^{-1}-{\Sigma}^{-1})S]^{2}\), where \({\hat{\Sigma}}^{-1}\) estimates \({\Sigma}^{-1}\), for any ordering of \(m,n\) and \(p\), in a unified approach. We derive estimators which, besides the information contained in the sample covariance matrix \(S={U}^{T}U\), use the information contained in the sample mean \(Z\). We provide conditions under which these estimators improve over the usual estimators \(a{S}^{+}\), where \(a\) is a positive constant and \({S}^{+}\) is the Moore-Penrose inverse of \(S\). Thanks to the role of \(Z\), such estimators are also improved by their truncated version. PubDate: 2024-03-01 DOI: 10.3103/S1066530724700029

Abstract: Our research employs general empirical process methods to investigate and establish moderate deviation principles for kernel-type function estimators that rely on an infinite-dimensional covariate, subject to mild regularity conditions. In doing so, we introduce a valuable moderate deviation principle for a function-indexed process, utilizing intricate exponential contiguity arguments. The primary objective of this paper is to contribute to the existing literature on functional data analysis by establishing functional moderate deviation principles for both Nadaraya–Watson and conditional distribution processes. These principles serve as fundamental tools for analyzing and understanding the behavior of these processes in the context of functional data analysis. By extending the scope of moderate deviation principles to the realm of functional data analysis, we enhance our understanding of the statistical properties and limitations of kernel-type function estimators when dealing with infinite-dimensional covariates. Our findings provide valuable insights and contribute to the advancement of statistical methodology in functional data analysis. PubDate: 2024-03-01 DOI: 10.3103/S1066530724700030

Abstract: As a flexible extension of the common Poisson model, the Conway–Maxwell–Poisson distribution allows for describing under- and overdispersion in count data via an additional parameter. Estimation methods for the two Conway–Maxwell–Poisson parameters are then required to specify the model. In this work, two characterization results are provided related to maximum likelihood estimation of the Conway–Maxwell–Poisson parameters. The first states that maximum likelihood estimation fails if and only if the range of the observations is less than two. Assuming that the maximum likelihood estimate exists, the second result then comprises a simple necessary and sufficient condition for the maximum likelihood estimate to be a solution of the likelihood equation; otherwise it lies on the boundary of the parameter set. A simulation study is carried out to investigate the accuracy of the maximum likelihood estimate as a function of the range of the underlying observations. PubDate: 2024-03-01 DOI: 10.3103/S1066530724700042
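
To make the model and the first characterization concrete, the sketch below evaluates the Conway–Maxwell–Poisson probability mass function in its standard form, with weights proportional to \(\lambda^{x}/(x!)^{\nu}\), and checks the range condition stated in the abstract. The function names, the symbols \(\lambda\) and \(\nu\), and the truncation of the normalizing series at 200 terms are assumptions made for illustration and are not taken from the paper; the truncation is only adequate when the weights decay quickly.

```python
import math


def cmp_pmf(x, lam, nu, terms=200):
    """Approximate CMP probability P(X = x) for rate lam > 0 and dispersion nu >= 0.

    The normalizing constant is a truncated series, evaluated on the log
    scale to avoid overflow in the factorials.
    """
    def log_weight(j):
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    logs = [log_weight(j) for j in range(terms)]
    peak = max(logs)
    log_z = peak + math.log(sum(math.exp(v - peak) for v in logs))
    return math.exp(log_weight(x) - log_z)


def range_at_least_two(observations):
    """Range condition from the abstract: the MLE fails iff this is False."""
    return max(observations) - min(observations) >= 2
```

With `nu = 1` the weights reduce to those of the Poisson distribution, which gives a quick sanity check: `cmp_pmf(0, 2.0, 1.0)` should agree with `math.exp(-2.0)` up to rounding.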

Abstract: In many longitudinal and hierarchical epidemiological frameworks, observations regarding each individual are recorded repeatedly over time. In these follow-ups, accurate measurements of time-dependent covariates might be invalid or expensive to obtain. In addition, in the recording process, or as a result of other undetected reasons, miscategorization of the response variable might occur, so that the recorded value does not reflect the true condition of the response process. In contrast with a binary outcome, for which classification error occurs between two categories, miscategorization in a categorical outcome has more intricate impacts, as a result of the increased number of categories and the asymmetric miscategorization matrix. When no correction is made, insensitivity to errors in either the covariates or the response variable potentially leads to incorrect conclusions, tends to bias the statistical inference, and eventually degrades the efficiency of the decision-making procedure. In this article, we provide an approach to simultaneously adjust for misclassification in the correlated nominal response and measurement error in the covariates, incorporating validation data in the estimation of misclassification probabilities, using the multivariate Gauss–Hermite quadrature technique for the approximation of the likelihood function. Simulation results demonstrate the effects of adjusting for covariate measurement error and response misclassification on the estimation procedure. PubDate: 2023-12-01 DOI: 10.3103/S1066530723040026

Abstract: We propose a method to estimate a sample skewness from the given summary statistics and give explicit formulas for the most common scenarios. We show that our method provides a nearly unbiased estimator for the non-parametric skewness measure. We empirically evaluate the performance on real-life data sets of COVID-19 vaccination status. We also demonstrate how the method can be applied to detect the skewness of the underlying distribution. PubDate: 2023-12-01 DOI: 10.3103/S106653072304004X

Abstract: The marginalized zero-inflated Poisson (MZIP) regression model quantifies the effects of an explanatory variable in the mixture population. Also, in practice the variables are usually partially observed. Thus, we first propose to study the maximum likelihood estimator when all variables are observed. Then, assuming that the probability of selection is modeled using mixed covariates (continuous, discrete and categorical), we propose a semiparametric inverse-probability weighted (SIPW) method for estimating the parameters of the MZIP model with covariates missing at random (MAR). The asymptotic properties (consistency, asymptotic normality) of the proposed estimators are established under certain regularity conditions. Through numerical studies, the performance of the proposed estimators was evaluated. Then the results of the SIPW method are compared to the results obtained by the semiparametric inverse-probability weighted kernel-based (SIPWK) estimator method. Finally, we apply our methodology to a dataset on health care demand in the United States. PubDate: 2023-12-01 DOI: 10.3103/S1066530723040038

Abstract: Nonparametric regression estimation with Gaussian measurement errors in predictors is a classical statistical problem. It is well known that the errors dramatically slow down the rate of regression estimation, and this paper complements that result by presenting a sharp constant. Then an interesting example of using this sharp constant to discover a new curse of dimensionality in functional nonparametric regression is presented, and analysis of real data complements the theory. PubDate: 2023-09-01 DOI: 10.3103/S1066530723030031

Abstract: In this paper, we investigate multivariate doubly truncated moments for a class of multivariate location-scale mixture of elliptical (LSME) distributions. This rich family includes some well-known distributions, such as location-scale mixture of normal, location-scale mixture of Student-\(t\), location-scale mixture of logistic and location-scale mixture of Laplace distributions, as special cases. We first present general formulae for computing the first two moments of the LSME distributions under the double truncation. We then consider a special case for cross moment. As an application, we present the results of multivariate tail conditional expectation (MTCE) for generalized hyperbolic (MGH) distribution. PubDate: 2023-09-01 DOI: 10.3103/S1066530723030043

Abstract: In this paper, we study the information-generating (IG) measure of \(k\)-record values and examine some of its main properties. We establish some bounds for the IG measure of \(k\)-record values. In addition, we present some results related to the characterization of an exponential distribution by maximization (minimization) of the IG measure of record values under certain conditions. We also examine the relative information generating (RIG) measure between the distribution of record values and the corresponding underlying distribution and present some results in this regard. Several examples have been provided throughout the study to illustrate the results. We also consider the problem of estimation of the IG measure for a two-parameter Weibull distribution based on the upper \(k\)-record values. PubDate: 2023-09-01 DOI: 10.3103/S106653072303002X

Abstract: The goal of this paper is to introduce an efficient method for solving problems formulated by stochastic mixed Volterra–Fredholm integral equations driven by space-time white noise. Two-dimensional triangular functions, their operational matrix, and their stochastic operational matrix of integration are considered. This method has several benefits; in addition to validity and a good degree of accuracy, arithmetic operations are carried out without the need for differentiation or integration. Illustrative examples are included to demonstrate the efficiency and applicability of the operational matrices based on two-dimensional triangular functions. PubDate: 2023-09-01 DOI: 10.3103/S1066530723030055

Abstract: In this paper, we deal with the estimation problem for the extreme value parameters in the case of stationary \(\beta\)-mixing series with heavy-tailed distributions. We first introduce two families of estimators generalizing Hill’s estimator. From those families, three asymptotically unbiased estimators of the extreme value index are established. Our approach is based on the generalized Jackknife methodology, which consists of taking any pair of three special cases of our family of estimators to cancel the bias term. The resulting estimators are also used to deduce three asymptotically unbiased estimators of the extreme quantiles. In a simulation study, the performance of our proposed methods is compared to alternative estimators recently introduced in the literature. Finally, our methods are applied to high financial losses data in order to estimate the Value-at-Risk of the daily stock returns on the S&P500 index. PubDate: 2023-06-01 DOI: 10.3103/S1066530723020011
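
For orientation, the classical Hill estimator that the families in this abstract generalize can be sketched as below. This is the textbook estimator applied to an i.i.d. sample, not the paper's bias-corrected estimators for \(\beta\)-mixing series; the function name `hill_estimator` and the choice of `k` (the number of upper order statistics used) are assumptions for illustration.

```python
import math
import random


def hill_estimator(sample, k):
    """Classical Hill estimate of the extreme value index from the k largest values."""
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample)
    threshold = ordered[n - k - 1]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    # Average log-excess of the k largest observations over the threshold.
    return sum(math.log(x / threshold) for x in ordered[n - k:]) / k


if __name__ == "__main__":
    rng = random.Random(0)
    # Pareto sample with tail index 2, so the extreme value index is 1/2.
    sample = [rng.paretovariate(2.0) for _ in range(5000)]
    print(hill_estimator(sample, 500))
```

In practice the estimate is sensitive to `k`, which is one motivation for bias-reduced variants such as those the abstract describes.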

Abstract: Families of distributions built from the fractional or continuous iteration of exponential-type functions are characterized by a wide range of tail-heaviness. The present paper aims to define classes of distributions supported on the whole real line based on the continuous iteration of the hyperbolic sine function sinh. This function has already been commonly employed in univariate transformations such as Johnson’s \(S_{U}\) and sinh–arcsinh transforms. The tail versatility generated by a transformation based on the continuous iteration of sinh is highlighted based on an initial logistic distribution. It leads to the Hyperbolic Tetration distribution. The Double Hyperbolic Tetration distribution, defined from two successive hyperbolic transformations, is also introduced. It is among the first classes of distributions with potentially distinct tetration indices at plus and minus infinity. The distributions are applied to multiple data sets in hydrology. PubDate: 2023-06-01 DOI: 10.3103/S1066530723020023

Abstract: Terrell [18] showed that the Pearson coefficient of correlation of an ordered pair from a random sample of size two is at most one-half, and the equality is attained only for rectangular (uniform over some interval) distributions. In the present note it is proved that the same is true for the discrete case, in the sense that the correlation coefficient attains its maximal value only for discrete rectangular (uniform over some finite lattice) distributions. PubDate: 2023-06-01 DOI: 10.3103/S1066530723020035
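
The quantity in this abstract can be computed exactly for any finite discrete distribution. The sketch below enumerates all pairs from a sample of size two, orders each pair into (minimum, maximum), and returns the squared Pearson correlation as an exact fraction. Returning the square is a choice made here to avoid square roots; it loses no information because the covariance of the minimum and maximum is nonnegative. The function name and the dictionary representation of the distribution are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product


def squared_corr_min_max(pmf):
    """Exact squared correlation of (min, max) for two i.i.d. draws.

    pmf maps each support point to its probability, given as Fractions
    summing to one; the distribution must not be degenerate.
    """
    e_min = e_max = e_min_sq = e_max_sq = e_prod = Fraction(0)
    for (x, px), (y, py) in product(pmf.items(), repeat=2):
        weight = px * py
        low, high = min(x, y), max(x, y)
        e_min += weight * low
        e_max += weight * high
        e_min_sq += weight * low * low
        e_max_sq += weight * high * high
        e_prod += weight * low * high
    cov = e_prod - e_min * e_max
    var_min = e_min_sq - e_min ** 2
    var_max = e_max_sq - e_max ** 2
    return cov * cov / (var_min * var_max)


if __name__ == "__main__":
    uniform_two = {0: Fraction(1, 2), 1: Fraction(1, 2)}
    skewed_two = {0: Fraction(3, 4), 1: Fraction(1, 4)}
    print(squared_corr_min_max(uniform_two))  # 1/9, i.e. correlation 1/3
    print(squared_corr_min_max(skewed_two))   # 3/35, smaller than 1/9
```

On a two-point lattice the uniform distribution gives correlation 1/3, which exceeds the value for the skewed alternative and stays below Terrell's bound of one-half.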