Statistical Methods and Applications
Journal Prestige (SJR): 0.466
Citation Impact (citeScore): 1
Number of Followers: 6
Hybrid journal (it can contain Open Access articles)
ISSN (Print) 1613-981X - ISSN (Online) 1618-2510
Published by Springer-Verlag [2469 journals]
• Correction to: On the equivalence of conglomerability and disintegrability
for unbounded random variables

PubDate: 2022-08-03

• Nonparametric directional testing for multivariate problems in conjunction
with a closed testing principle

Abstract: In disciplines such as economics, sociology, psychology and clinical trials, researchers are often interested in testing treatment effects on several outcomes in the same direction. Such tests can be performed by using equi-directional test statistics for multivariate data. If, on the other hand, treatment effects with respect to one or more of the outcomes differ in direction, the power of equi-directional tests is compromised. Thus, we interchanged the signs of those outcomes by multiplying their values by $$-\,1$$, making the anticipated directions the same. Following this, we employed a recently proposed test statistic that handles equi-directional alternatives, since the direction of treatment effects is made uniform by interchanging the signs. Once a monotonic trend (monotonically increasing for some of the outcomes and monotonically decreasing for others) is demonstrated through the global test statistic, an investigator may be further interested in which specific outcomes or sets of outcomes exhibit these trends. To address this issue, we adapted a closed testing principle. The whole procedure is illustrated with data sets from a toxicology study carried out by the National Toxicology Program and from a study of the per-mile cost of transporting milk from farms to dairy plants by different trucks.
PubDate: 2022-07-28
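
The two steps described in the abstract above, aligning the anticipated directions by sign interchange and then localising effects with a closed testing principle, can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function names `align_directions` and `closed_testing` are hypothetical, and a Bonferroni test on p-values stands in for the paper's global test statistic, which the abstract does not specify.

```python
from itertools import combinations


def align_directions(rows, decreasing):
    """Multiply the outcomes whose column index is in `decreasing` by -1,
    so that every anticipated trend points in the same direction."""
    return [[-v if j in decreasing else v for j, v in enumerate(row)]
            for row in rows]


def closed_testing(p_values, alpha=0.05):
    """Closed testing with a Bonferroni local test (an assumed stand-in).

    The elementary hypothesis H_i is rejected only if every intersection
    hypothesis H_S with i in S is rejected by its local level-alpha test.
    """
    m = len(p_values)

    def local_reject(subset):
        # Bonferroni test of the intersection hypothesis over `subset`.
        return min(p_values[i] for i in subset) <= alpha / len(subset)

    rejected = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        # Enumerate every subset of hypotheses that contains index i.
        rejected.append(all(
            local_reject((i,) + extra)
            for k in range(m)
            for extra in combinations(others, k)
        ))
    return rejected
```

For example, `closed_testing([0.01, 0.04, 0.03])` rejects only the first hypothesis, because the intersection of the second and third is not rejected at level 0.05. The enumeration grows exponentially with the number of outcomes, so this form is only practical for a handful of them.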

• Correction to: RIF regression via sensitivity curves

PubDate: 2022-07-13

• Influence measures in nonparametric regression model with symmetric random
errors

Abstract: In this paper we present several diagnostic measures for the class of nonparametric regression models with symmetric random errors, which includes all continuous and symmetric distributions. In particular, we derive some diagnostic measures of global influence such as residuals, leverage values, Cook’s distance and the influence measure proposed by Peña (Technometrics 47(1):1–12, 2005) to measure the influence of an observation when it is influenced by the rest of the observations. A simulation study to evaluate the effectiveness of the diagnostic measures is presented. In addition, we develop the local influence measure to assess the sensitivity of the maximum penalized likelihood estimator of the smooth function. Finally, an example with real data is given for illustration.
PubDate: 2022-06-25

• RIF regression via sensitivity curves

Abstract: This paper proposes an empirical method to implement the recentered influence function (RIF) regression of Firpo et al. (Econometrica 77(3):953–973, 2009), a relevant method to study the effect of covariates on many statistics beyond the mean. In empirically relevant situations where the influence function is not available or is difficult to compute, we suggest using the sensitivity curve (as reported by Tukey in Exploratory Data Analysis. Addison-Wesley, Reading, MA, 1977) as a feasible alternative. This approach may be computationally cumbersome when the sample size is large. The relevance of the proposed strategy derives from the fact that, under general conditions, the sensitivity curve converges in probability to the influence function. To save computational time, we propose using a cubic spline nonparametric method on a random subsample and then interpolating to the remaining cases, where the curve was not computed. Monte Carlo simulations show good finite sample properties. We illustrate the proposed estimator with an application to the polarization index of Duclos et al. (Econometrica 72(6):1737–1772, 2004).
PubDate: 2022-06-25
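
The sensitivity curve mentioned in the abstract above can be illustrated with a short sketch. It assumes the usual finite-sample definition, SC(x) = n [T(x_1, ..., x_{n-1}, x) - T(x_1, ..., x_{n-1})], and adds it to the statistic to approximate the RIF. The function names are hypothetical, and the spline interpolation over a subsample proposed in the paper is not shown.

```python
from statistics import mean, median


def sensitivity_curve(stat, sample, x):
    """Scaled change in the statistic `stat` when the point x is added
    to `sample` (n counts the sample plus the added point)."""
    n = len(sample) + 1
    return n * (stat(list(sample) + [x]) - stat(sample))


def rif_via_sc(stat, sample, x):
    """Approximate RIF(x) = T(F) + IF(x) with the sensitivity curve
    used in place of the influence function (an illustrative assumption)."""
    return stat(sample) + sensitivity_curve(stat, sample, x)


base = [1.0, 2.0, 3.0, 6.0]
sc_mean = sensitivity_curve(mean, base, 10.0)      # x minus the sample mean
sc_median = sensitivity_curve(median, base, 10.0)  # bounded for the median
```

For the mean, the sensitivity curve reduces to x minus the mean of the base sample, which matches the influence function of the mean and makes the approximate RIF equal to x itself.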

• Does education protect families' well-being in times of crisis?
Measurement issues and empirical findings from IT-SILC data

Abstract: This study analyses the relationship between education and material well-being from a longitudinal perspective using the European Survey on Income and Living Conditions (EU-SILC) data collected in Italy in four waves (2009–2012). It has two main aims: (i) to measure household material well-being on the basis of householders’ responses to multiple survey items (designed to gather information on the household availability of material resources) by advancing indexes that can account for global and relative divergences in households’ material well-being across survey waves; (ii) to assess how education and other sociodemographic characteristics affect absolute well-being and its variation (i.e. relative well-being) in the time span considered. Both aims are pursued by combining measurement and explanatory modelling approaches. That is, the use of the Multilevel Item Response Theory model makes it possible to measure the global household material well-being and its yearly variation (i.e. relative material well-being) in the four waves. Meanwhile, the use of a multivariate (and multivariate multilevel) regression model makes it possible to assess the effects of education and other sociodemographic characteristics on both components (absolute and relative well-being), controlling for the relevant sources of heterogeneity in the data. The added value of the proposed methodologies, the main findings and the economic implications are discussed.
PubDate: 2022-06-23

• Maximum likelihood estimation of missing data probability for nonmonotone
missing at random data

Abstract: In general, statistical analysis with missing data requires specification of a model for the missing data probability and/or the covariate distribution. For nonmonotone missing data patterns, modeling and practical estimation of the missing data probability are very challenging. Recently a semiparametric likelihood model was developed to estimate parametric regression models for the missing data mechanism based on all the observed data, which can deal with arbitrary nonmonotone missing data patterns. However, due to the curse of dimensionality in the likelihood-based models, this method becomes impractical as the number of variables increases. This research generalizes the semiparametric likelihood model such that it can deal with any number of variables with arbitrary nonmonotone missing data patterns. It further introduces a semiparametric estimator of the missing data probability for the partially observed data, which can be used to assess the model fit. An EM algorithm with closed form expressions at each step is used to compute the estimates. Simulation studies in various settings indicate that the performance of the new method is acceptable for practical implementation. The missing data mechanism of a case-control study of hip fractures among male veterans is analyzed to illustrate the method.
PubDate: 2022-06-17

• Generalised calibration with latent variables for the treatment of unit
nonresponse in sample surveys

Abstract: Sample surveys may suffer from nonignorable unit nonresponse. This happens when the decision of whether or not to participate in the survey is correlated with variables of interest; in such a case, nonresponse produces biased estimates for parameters related to those variables, even after adjustments that account for auxiliary information. This paper presents a method to deal with nonignorable unit nonresponse that uses generalised calibration and latent variable modelling. Generalised calibration makes it possible to model unit nonresponse using a set of auxiliary variables (instrumental or model variables) that can be different from those used in the calibration constraints (calibration variables). We propose to use latent variables to estimate the probability of participating in the survey and to construct a reweighting system incorporating such latent variables. The proposed methodology is illustrated, and its properties are discussed and tested in two simulation studies. Finally, it is applied to adjust estimates of the finite population mean wealth from the Italian Survey of Household Income and Wealth.
PubDate: 2022-06-09

• Modelling time-varying covariates effect on survival via functional data
analysis: application to the MRC BO06 trial in osteosarcoma

Abstract: Time-varying covariates are of great interest in clinical research since they represent dynamic patterns which reflect disease progression. In cancer studies, biomarker values change as functions of time, and chemotherapy treatment is modified by delaying a course or reducing the dose intensity according to the patient’s toxicity levels. In this work, a Functional covariate Cox Model (FunCM) to study the association between time-varying processes and a time-to-event outcome is proposed. FunCM first exploits functional data analysis techniques to represent time-varying processes in terms of functional data. Then, information related to the evolution of the functions over time is incorporated into functional regression models for survival data through functional principal component analysis. FunCM is compared to a standard time-varying covariate Cox model, commonly used despite its limiting assumptions that covariate values are constant in time and measured without errors. Data from the MRC BO06/EORTC 80931 randomised controlled trial for treatment of osteosarcoma are analysed. Time-varying covariates related to alkaline phosphatase levels, white blood cell counts and chemotherapy dose during treatment are investigated. The proposed method makes it possible to detect differences between patients with different biomarker and treatment evolutions, and to include this information in the survival model. These aspects are seldom addressed in the literature and could provide new insights for clinical research.
PubDate: 2022-06-09

• Automatic robust Box–Cox and extended Yeo–Johnson
transformations in regression

Abstract: The paper introduces an automatic procedure for the parametric transformation of the response in regression models to approximate normality. We consider the Box–Cox transformation and its generalization to the extended Yeo–Johnson transformation which allows for both positive and negative responses. A simulation study illuminates the superior comparative properties of our automatic procedure for the Box–Cox transformation. The usefulness of our procedure is demonstrated on four sets of data, two including negative observations. An important theoretical development is an extension of the Bayesian Information Criterion (BIC) to the comparison of models following the deletion of observations, the number deleted here depending on the transformation parameter.
PubDate: 2022-06-08
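
For reference, the two transformation families named in the abstract above can be sketched in a few lines. This shows only the standard Box–Cox and Yeo–Johnson definitions for a single response value and a given parameter `lam`. The automatic robust selection of the parameter and the extended Yeo–Johnson variant studied in the paper are not reproduced here.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transformation of a strictly positive response."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def yeo_johnson(y, lam):
    """Standard Yeo-Johnson transformation, defined for any real response."""
    if y >= 0:
        if lam == 0:
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    # Negative responses use the mirrored power 2 - lam.
    if lam == 2:
        return -math.log1p(-y)
    return -((1.0 - y) ** (2.0 - lam) - 1.0) / (2.0 - lam)
```

With `lam = 1` the Yeo–Johnson transformation leaves the response unchanged, while Box–Cox with `lam = 1` shifts it by one. This is a quick sanity check when comparing fitted parameters across the two families.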

• 2-step Gradient Boosting approach to selectivity bias correction in tax
audit: an application to the VAT gap in Italy

Abstract: The revenue loss from tax avoidance can undermine the effectiveness and equity of government policies. A standard measure of its magnitude is known as the tax gap, which is defined as the difference between the total taxes theoretically collectable and the total taxes actually collected in a given period. Estimation from a micro perspective is usually tackled in the context of bottom-up approaches, where data regularly collected through fiscal audits are analyzed in order to provide inference on the general population. However, the sampling scheme of fiscal audits performed by revenue agencies is not random but characterized by a selection bias toward risky taxpayers. The current standard adopted by the Italian Revenue Agency (IRA) for overcoming this issue in the tax audit context is the Heckman model, based on linear models for both the selection and the outcome mechanisms. Here we propose the adoption of CART-based Gradient Boosting in place of standard linear models to account for the complex patterns often arising in the relationships between covariates and outcome. Selection bias is corrected by considering a re-weighting scheme based on propensity scores, attained through the sequential application of a classifier and a regressor. In short, we refer to the method as 2-step Gradient Boosting. We argue that this scheme fits the sampling mechanism of the IRA fiscal audits, and we apply it to a sample of VAT declarations from Italian individual firms in the fiscal year 2011. Results show a marked dominance of the proposed method over the currently adopted Heckman model in terms of predictive performance.
PubDate: 2022-06-06

• Publisher Correction to: Goodness-of-fit test for α-stable distribution
based on the quantile conditional variance statistics

PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00577-3

• Discussion to: Bayesian graphical models for modern biological

Abstract: In this note we discuss the contributions made in the article (Ni et al. Stat Methods Appl, 2021. https://doi.org/10.1007/s10260-021-00572-8).
PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00604-3

• Discussion to: Bayesian graphical models for modern biological

PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00607-0

• Discussion to: Bayesian Graphical Models for Modern Biological

PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00602-5

• Model detection and variable selection for mode varying coefficient model

Abstract: The varying coefficient model is often used in statistical modeling since it is more flexible than the parametric model. However, model detection and variable selection for the varying coefficient model are poorly understood in mode regression. Existing methods in the literature for these problems are often based on mean regression and quantile regression. In this paper, we propose a novel method to solve these problems for the mode varying coefficient model based on the B-spline approximation and the SCAD penalty. Moreover, we present a new algorithm to estimate the parameters of interest, and discuss the selection of the tuning parameters and bandwidth. We also establish the asymptotic properties of the estimated coefficients under some regularity conditions. Finally, we illustrate the proposed method with some simulation studies and an empirical example.
PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00576-4

• Goodness-of-fit test for $$\alpha$$-stable distribution based on the
quantile conditional variance statistics

Abstract: The class of $$\alpha$$-stable distributions is ubiquitous in many areas including signal processing, finance, biology, physics, and condition monitoring. In particular, it allows efficient noise modeling and incorporates distributional properties such as asymmetry and heavy tails. Despite the popularity of this modeling choice, most statistical goodness-of-fit tests designed for $$\alpha$$-stable distributions are based on generic distance measurement methods. To be efficient, those methods require large sample sizes and often do not efficiently discriminate distributions when the corresponding $$\alpha$$-stable parameters are close to each other. In this paper, we propose a novel goodness-of-fit method based on quantile (trimmed) conditional variances that is designed to overcome these deficiencies and outperforms many benchmark testing procedures. The effectiveness of the proposed approach is illustrated using an extensive simulation study focused on the symmetric case. For completeness, an empirical example linked to plasma physics is provided.
PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00571-9

• Estimation and decomposition of food price inflation risk

Abstract: Ensuring aggregate food price stability requires a forward-looking assessment of the risk that unexpected deviations in individual food items’ inflation lead to large shocks in the aggregate food price inflation. To do so, we propose using a multivariate GARCH framework in combination with the Euler method to (1) estimate the conditional standard deviation and quantiles of the food price inflation shocks and (2) attribute the total risk to the underlying food items. For the FAO food price index, we find that even though meat inflation systematically has the highest weight in the aggregate index, cereal inflation is the main contributor to the total food price inflation risk over the period 1990–2018. The use of time series models and the Cornish-Fisher expansion makes the risk characterization forward-looking and a potentially helpful tool for risk management.
PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00574-6
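
The Euler attribution of total risk to individual items, as mentioned in the abstract above, can be illustrated for the standard deviation of a weighted aggregate. This is a minimal sketch under stated assumptions: the function name is hypothetical and the weights and covariance matrix are invented numbers, not values from the FAO food price index. In the paper the covariance would come from a multivariate GARCH model, and the quantile-based decomposition with the Cornish-Fisher expansion is not shown.

```python
import math


def euler_risk_contributions(weights, cov):
    """Split the standard deviation of a weighted sum into per-item
    contributions c_i = w_i * (cov @ w)_i / sigma, which add up to sigma."""
    n = len(weights)
    cov_w = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    sigma = math.sqrt(sum(weights[i] * cov_w[i] for i in range(n)))
    contributions = [weights[i] * cov_w[i] / sigma for i in range(n)]
    return sigma, contributions


# Invented two-item example: item 0 has the larger weight,
# item 1 has the larger variance.
weights = [0.6, 0.4]
cov = [[0.04, 0.01],
       [0.01, 0.16]]
sigma, contributions = euler_risk_contributions(weights, cov)
```

In this invented example the second item contributes more to total risk than the first despite its smaller weight, which mirrors the kind of finding the abstract reports for cereal versus meat inflation.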

• Extending graphical models for applications: on covariates, missingness
and normality

Abstract: The authors of the paper “Bayesian Graphical Models for Modern Biological Applications” have put forward an important framework for making graphical models more useful in applied settings. In this discussion paper, we give a number of suggestions for making this framework even more suitable for practical scenarios. Firstly, we show that an alternative and simplified definition of covariate might make the framework more manageable in high-dimensional settings. Secondly, we point out that the inclusion of missing variables is important for practical data analysis. Finally, we comment on the effect that the Gaussianity assumption has in identifying the underlying conditional independence graph and how this can be circumvented. The Bayesian framework proposed by the authors is flexible enough to accommodate extensions that can deal with these aspects, which are often encountered in real data analyses such as the complex modern applications considered by the authors.
PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00605-2

• Discussion to: Bayesian graphical models for modern biological

Abstract: It is a pleasure to congratulate Ni et al. (Stat Methods Appl 490:1–32, 2021) on the recent advances in Bayesian graphical models reviewed in Ni et al. (Stat Methods Appl 490:1–32, 2021). The authors have given considerable thought to the construction and estimation of Bayesian graphical models that capture salient features of biological networks. My discussion focuses on computational challenges and opportunities along with priors, pointing out limitations of the Markov random field priors reviewed in Ni et al. (Stat Methods Appl 490:1–32, 2021) and exploring possible generalizations that capture additional features of conditional independence graphs, such as hub structure and clustering. I conclude with a short discussion of the intersection of graphical models and random graph models.
PubDate: 2022-06-01
DOI: 10.1007/s10260-021-00600-7

