Statistical Methods and Applications
Journal Prestige (SJR): 0.466
Citation Impact (citeScore): 1
  Hybrid journal (it can contain Open Access articles)
ISSN (Print) 1613-981X - ISSN (Online) 1618-2510
Published by Springer-Verlag [2467 journals]
  • Student’s-t process with spatial deformation for spatio-temporal
           data

      Abstract: Many models for environmental data that are observed in time and space have been proposed in the literature. The main objective of these models is usually to make predictions in time and to perform interpolations in space. Realistic predictions and interpolations are obtained when the process and its variability are well represented through a model that takes into consideration its peculiarities. In this paper, we propose a spatio-temporal model to handle observations that come from distributions with heavy tails and for which the assumption of isotropy is not realistic. As a natural choice for a heavy-tailed model, we take a Student’s-t distribution. The Student’s-t distribution, while being symmetric, provides greater flexibility in modeling data with kurtosis and shape different from the Gaussian distribution. We handle anisotropy through a spatial deformation method. Under this approach, the original geographic space of observations is mapped into a new space where isotropy holds. Our main result is, therefore, an anisotropic model based on the heavy-tailed t distribution. A Bayesian approach and the use of MCMC enable us to sample from the posterior distribution of the model parameters. In Sect. 2, we discuss the main properties of the proposed model. In Sect. 3, we present a simulation study, showing its superiority over the traditional isotropic Gaussian model. In Sect. 4, we present the motivation that led us to propose the t distribution-based anisotropic model: a real dataset of evaporation from the state of Rio Grande do Sul, Brazil.
      PubDate: 2022-12-01
       
  • Double-calibration estimators accounting for under-coverage and
           nonresponse in socio-economic surveys

      Abstract: Under-coverage and nonresponse problems are jointly present in most socio-economic surveys. The purpose of this paper is to propose an estimation strategy that accounts for both problems by performing a two-step calibration. The first calibration exploits a set of auxiliary variables only available for the units in the sampled population to account for nonresponse. The second calibration exploits a different set of auxiliary variables available for the whole population, to account for under-coverage. The two calibrations are then unified in a double-calibration estimator. Mean and variance of the estimator are derived up to the first order of approximation. Conditions ensuring approximate unbiasedness are derived and discussed. The strategy is empirically checked by a simulation study performed on a set of artificial populations. A case study is derived from the European Union Statistics on Income and Living Conditions survey data. The proposed strategy is flexible and suitable in most situations in which both under-coverage and nonresponse are present.
      PubDate: 2022-12-01
       
  • Spare time use: profiles of Italian Millennials (beyond the media hype)

      Abstract: This paper focuses on a particular population segment, that of Millennials, which has attracted much attention over recent years. Beyond the media hype, little is known about the habits of this generation towards spare time use. The present study builds on a previous work devoted to detecting the different ways Italian Millennials interact with spare time, and aims at identifying profiles of Millennials characterized by profile-specific time use habits and styles. In so doing, we (i) account for the multidimensional nature of time use attitude and express it in a reduced number of distinct dimensions and (ii) identify and qualify profiles of Millennials as regards the ascertained time use dimensions. By relying on an extended Item Response Theory model applied to the Italian “Multipurpose survey on households”, our main findings reveal that the way Millennials use spare time and interact with technology is much more complex, varied and multifaceted than the media claim.
      PubDate: 2022-12-01
       
  • The inextricable association of measurement errors and tax evasion as
           examined through a microanalysis of survey data matched with fiscal data:
           a case study

      Abstract: Individual records referring to personal interviews conducted for a survey on income in Modena during 2012 and tax year 2011 were matched with the corresponding records in the Italian Ministry of Finance databases containing fiscal income data for tax year 2011. The analysis of the resulting data set suggested that the fiscal income data were generally more reliable than the surveyed income data. Moreover, the obtained data set enabled identification of the factors determining over- and under-reporting, as well as measurement errors, through a comparison of the surveyed income data with the fiscal income data, only for suitable categories of interviewees, that is, taxpayers who are forced to respect the tax laws (the public sector) and taxpayers who have many evasion options (the private sector). The percentage of under-reporters (67.3%) was higher than the percentage of over-reporters (32.7%). Level of income, age, and education were the main regressors affecting measurement errors and the behaviours of tax evaders. Tax evasion and the impacts of personal factors affecting evasion were evaluated using various approaches. The average tax evasion amounted to 26.0% of the fiscal income. About 10% of the sample was made up of possible total tax evaders.
      PubDate: 2022-12-01
       
  • Quantile regression via the EM algorithm for joint modeling of mixed
           discrete and continuous data based on Gaussian copula

      Abstract: In this paper, we develop a joint quantile regression model for correlated mixed discrete and continuous data using a Gaussian copula. Our approach entails specifying marginal quantile regression models for the responses, and combining them via a copula to form a joint model. To model the quantiles of the continuous response, an asymmetric Laplace (AL) distribution is assigned to the error terms in both the continuous and discrete models. To model the discrete response, an underlying latent variable model and the threshold concept are used. Quantile regression for discrete responses can be fitted using the monotone equivariance property of quantiles. By assuming a latent variable framework to describe discrete responses, the proposed copula still uniquely determines the joint distribution. The likelihood function of the joint model also has a tractable form, but it is not differentiable at some points of the parameter space. However, by using the stochastic representation of the AL distribution, the maximum likelihood estimates of the parameters are obtained using an EM algorithm, and bootstrap confidence intervals for inference about the parameters are computed using a Monte Carlo technique. Some simulation studies are performed to illustrate the performance of the model. Finally, we illustrate applications of the proposed approach using burn injuries data.
      PubDate: 2022-12-01
       
  • On a bivariate copula for modeling negative dependence: application to New
           York air quality data

      Abstract: In many practical scenarios, including finance, environmental sciences, system reliability, etc., it is often of interest to study the various notions of negative dependence among the observed variables. A new bivariate copula is proposed for modeling negative dependence between two random variables that complies with most of the popular notions of negative dependence reported in the literature. Specifically, Spearman’s rho and Kendall’s tau for the proposed copula have a simple one-parameter form with negative values in the full range. Some important ordering properties comparing the strength of negative dependence with respect to the parameter involved are considered. Simple examples of the corresponding bivariate distributions with popular marginals are presented. Application of the proposed copula is illustrated using a real data set on air quality in New York City, USA.
      PubDate: 2022-12-01
       
  • Joint modeling for longitudinal covariate and binary outcome via
           h-likelihood

      Abstract: Joint modeling techniques of longitudinal covariates and binary outcomes have attracted considerable attention in medical research. The basic strategy for estimating the coefficients of joint models is to define a joint likelihood based on two submodels with shared random effects. Numerical integration, however, is required in the estimation step for the joint likelihood, which is computationally expensive due to the complexity of the assumed submodels. To overcome this issue, we propose a joint modeling procedure using the h-likelihood to avoid numerical integration in the estimation algorithm. We conduct Monte Carlo simulations to investigate the effectiveness of our proposed modeling procedures by evaluating both the accuracy of the parameter estimates and computational time. The accuracy of the proposed procedure is compared to the two-stage modeling and numerical integration approaches. We also validate our proposed modeling procedure by applying it to the analysis of real data.
      PubDate: 2022-12-01
       
  • A procedure for testing the hypothesis of weak efficiency in financial
           markets: a Monte Carlo simulation

      Abstract: The weak form of the efficient market hypothesis is identified with the conditions established by different types of random walks (1–3) on the returns associated with the prices of a financial asset. The methods traditionally applied for testing weak efficiency in a financial market as stated by the random walk model test only some necessary, but not sufficient, condition of this model. Thus, a procedure is proposed to detect if a return series associated with a given price index follows a random walk and, if so, what type it is. The procedure combines methods that test only a necessary, but not sufficient, condition for the fulfilment of the random walk hypothesis and methods that directly test a particular type of random walk. The proposed procedure is evaluated by means of a Monte Carlo experiment, and the results show that this procedure performs better (is more powerful) against linear correlation-only alternatives when starting from the Ljung–Box test. On the other hand, against the random walk type 3 alternative, the procedure is more powerful when it is initiated from the BDS test.
      PubDate: 2022-12-01
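
The Ljung–Box test named in the abstract above checks a return series for linear autocorrelation. The sketch below is only an illustration of the standard test statistic, Q = n(n+2) Σ ρ̂_k² / (n − k) summed over lags k = 1..h; it is not the procedure proposed in the paper. The function names and the toy series are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..max_lag} rho_k^2 / (n - k)."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorrelation(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Hypothetical toy series, used only to show the call.
toy_series = [1, 2, 3, 4, 5]
q_statistic = ljung_box_q(toy_series, max_lag=1)
```

Under the null hypothesis of no autocorrelation, Q is approximately chi-square distributed with `max_lag` degrees of freedom when applied to a raw series, so large values of Q count as evidence against the random walk hypothesis.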
       
  • Markov models for duration-dependent transitions: selecting the states
           using duration values or duration intervals?

      Abstract: In a Markov model the transition probabilities between states do not depend on the time spent in the current state. The present paper explores two ways of selecting the states of a discrete-time Markov model for a system partitioned into categories where the duration of stay in a category affects the probability of transition to another category. For a set of panel data, we compare the likelihood fits of the Markov models with states based on duration intervals and with states defined by duration values. For hierarchical systems, we show that the model with states based on duration values has a better maximum likelihood fit than the baseline Markov model where the states are the categories. We also prove that this is not the case for the duration-interval model, under conditions on the data that seem realistic in practice. Furthermore, we use the Akaike and Bayesian information criteria to compare these alternative Markov models. The theoretical findings are illustrated by an analysis of a real-world personnel data set.
      PubDate: 2022-12-01
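
As a rough illustration of the idea in the abstract above, the following Python sketch relabels each category path as (category, duration) states and estimates transition probabilities by row-normalised counts, which is the usual maximum likelihood estimate for a discrete-time Markov chain. It is a minimal sketch, not the authors' method: the function names, the duration cap `max_duration`, and the toy panel are all hypothetical.

```python
from collections import Counter, defaultdict


def duration_states(path, max_duration):
    """Relabel a category path as (category, duration) states.

    Duration counts consecutive periods spent in the same category and is
    capped at `max_duration` (an assumption made to keep the state space finite).
    """
    states = []
    duration = 0
    previous = None
    for category in path:
        duration = duration + 1 if category == previous else 1
        states.append((category, min(duration, max_duration)))
        previous = category
    return states


def transition_mle(paths):
    """Estimate transition probabilities as row-normalised transition counts."""
    counts = defaultdict(Counter)
    for path in paths:
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Hypothetical toy panel: one category path per individual.
panel = [["A", "A", "A", "B"], ["A", "B", "B", "B"], ["A", "A", "B", "B"]]

baseline = transition_mle(panel)  # states are the categories
expanded = transition_mle([duration_states(p, max_duration=2) for p in panel])
```

In the baseline model the probability of leaving category A is a single number, whereas in the expanded model it can differ between ("A", 1) and ("A", 2), which is how duration dependence enters while the Markov property is kept on the enlarged state space.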
       
  • A Bayesian approach to model individual differences and to partition
           individuals: case studies in growth and learning curves

      Abstract: The first objective of the paper is to implement a two-stage Bayesian hierarchical nonlinear model for growth and learning curves, particular cases of longitudinal data with an underlying nonlinear time dependence. The aim is to model simultaneously individual trajectories over time, each with specific and potentially different characteristics, and a time-dependent behavior shared among individuals, including the possible effect of covariates. At the first stage inter-individual differences are taken into account, while, at the second stage, we search for an average model. The second objective is to partition individuals into homogeneous groups, when inter-individual parameters present a high level of heterogeneity. A new multivariate partitioning approach is proposed to cluster individuals according to the posterior distributions of the parameters describing the individual time-dependent behaviour. To assess the proposed methods, we present simulated data and two applications to real data, one related to growth curve modeling in agriculture and one related to learning curves for motor skills. Furthermore, a comparison with finite mixture analysis is shown.
      PubDate: 2022-12-01
       
  • School climate and academic performance of Italian students: the role of
           disciplinary behaviour and parental involvement

      Abstract: Educational researchers have increasingly recognised the importance of school climate as a malleable factor for improving academic performance. In this perspective, we exploit the data collected by the Italian Institute for the Evaluation of the Education System (INVALSI) to assess the effect of some school climate related factors on the academic performance of tenth-grade Italian students. A Multilevel Bayesian Structural Equation Model (MBSEM) is adopted to highlight the effect of some relevant dimensions of school climate (students’ disciplinary behaviour and parents’ involvement) on academic performance and their role in the relationship between student socioeconomic status and achievement. The main findings show that disciplinary behaviour, on the one hand, directly influences the level of competence of the students and, on the other hand, partly mediates the effect of socioeconomic background, whereas parents’ involvement does not appear to exert any significant effect on students’ performance.
      PubDate: 2022-12-01
       
  • Selection of mixed copula for association modeling with tied observations

      Abstract: The link between obesity and hypertension is among the most popular topics explored in medical research in recent decades. However, it is challenging to establish the relationship comprehensively and accurately because the distribution of BMI and blood pressure is usually fat tailed and severely tied. In this paper, we propose a data-driven copula selection approach via penalized likelihood which can deal with tied data by interval censoring estimation. The Minimax Concave Penalty is used to perform unbiased selection of the mixed copula model, owing to its property of converging to the unpenalized solution. Interval censoring with pseudo-likelihood maximization, inspired by survival analysis, is introduced by considering ranks as intervals with upper and lower limits. This paper describes the model and the corresponding iterative algorithm. Simulations comparing the proposed approach with existing methods in different scenarios are presented. Additionally, the proposed method is applied to association modeling on the China Health and Nutrition Survey (CHNS) data. Both numerical studies and real data analysis reveal good performance of the proposed method.
      PubDate: 2022-12-01
       
  • A mixture model approach to spectral clustering and application to textual
           data

      Abstract: The spectral clustering algorithm is a technique based on the properties of the pairwise similarity matrix coming from a suitable kernel function. It is a useful approach for high-dimensional data since the units are clustered in feature space with a reduced number of dimensions. In this paper, we consider a two-step model-based approach within the spectral clustering framework. First, based on simulated data, we discuss criteria for selecting the number of clusters and analyze the robustness of the model-based approach with respect to the choice of the proximity parameters of the kernel functions. Finally, we consider applications of the spectral methods to cluster five real textual datasets and, in this framework, a new kernel function is also proposed. The approach is illustrated through a large numerical study based on both simulated and real datasets.
      PubDate: 2022-12-01
       
  • Correction: Boosted-oriented probabilistic smoothing-spline clustering of
           series

      PubDate: 2022-11-21
       
  • Assessing individual skill influence on housework time of Italian women:
           an endogenous-switching approach

      Abstract: Using Italian data from the Time Use Survey (Istat) on the time devoted by Italian women to housework tasks, in this study we analyze how much the individual ability of a woman employed in the market influences her housework time. To this end, we estimate a two-regime Endogenous-Switching model for both employed and not employed women. As a novelty, ML estimation of this model also provides a point estimate of the across-regime correlation parameter, which allows us to evaluate the individual skill effect on the time devoted to housework tasks by a woman and to calculate the probability of choosing one of the two regimes, corrected for the endogeneity of the choice. The estimation framework allows us to identify the role of the individual skills of Italian women in household decision-making.
      PubDate: 2022-11-21
       
  • A new cure rate frailty regression model based on a weighted Lindley
           distribution applied to stomach cancer data

      Abstract: In this paper, we propose a new cure rate frailty regression model based on a two-parameter weighted Lindley distribution. The weighted Lindley distribution has attractive properties such as a flexible probability density function and a Laplace transform in closed form, among others. An advantage of the proposed model is the possibility to jointly model the heterogeneity among patients by their frailties and the presence of a cured fraction of them. To make the model parameters identifiable, we consider a reparameterized version of the weighted Lindley distribution with unit mean as the frailty distribution. The proposed model is very flexible in the sense that it has some traditional cure rate models as special cases. The statistical inference for the model’s parameters is discussed in detail using maximum likelihood estimation under random right-censoring. Further, we present a Monte Carlo simulation study to verify the maximum likelihood estimators’ behavior assuming different sample sizes and censoring proportions. Finally, the new model is applied to describe the lifetimes of 22,148 patients with stomach cancer, obtained from the Fundação Oncocentro de São Paulo, Brazil.
      PubDate: 2022-11-17
       
  • A multilevel structured latent curve model for disaggregating student and
           school contributions to learning

      Abstract: Educational researchers continue to debate the relative contribution of individual and environmental factors to learning. Concomitant with the proliferation of longitudinal educational testing following students and schools over time, recent research has shown that nonlinear mixed effect models can be parameterized to directly estimate quantities meaningful to learning processes and are situated to address questions about whether learning is driven by the individuals or the context. However, three-level nonlinear models pose estimation challenges because the likelihood does not have a closed-form solution and integral approximations are intractable when there are multiple random effects at multiple levels of the model. Multivariate reparameterization to a structured latent curve model has been suggested as a method to circumvent similar issues in two-level models, but the approach has not yet been extended to the context of three-level models. We extend the idea of structured latent curve models to accommodate data with a three-level hierarchy. We apply the model to six years of mathematics and reading scores from 6346 students in 68 schools to partition the variance of learning parameters into school- and student-level components. The results show that, compared to reading, learning in mathematics is more heavily influenced by school-level factors and that there is evidence for stronger Matthew effects (“the rich get richer”) in mathematics than in reading.
      PubDate: 2022-10-31
       
  • Boosted-oriented probabilistic smoothing-spline clustering of series

      Abstract: Fuzzy clustering methods allow the objects to belong to several clusters simultaneously, with different degrees of membership. However, a factor that influences the performance of fuzzy algorithms is the value of the fuzzifier parameter. In this paper, we propose a fuzzy clustering procedure for data (time) series that does not depend on the definition of a fuzzifier parameter. It comes from two approaches, theoretically motivated for unsupervised and supervised classification cases, respectively. The first is the Probabilistic Distance clustering procedure. The second is the well-known Boosting philosophy. Our idea is to adopt a boosting perspective for unsupervised learning problems; in particular, we address non-hierarchical clustering problems. The global performance of the proposed method is investigated by various experiments.
      PubDate: 2022-10-27
       
  • When does morbidity start? An analysis of changes in morbidity between
           2013 and 2019 in Italy

      Abstract: Morbidity is one of the key aspects for assessing populations’ well-being. In particular, chronic diseases negatively affect the quality of life in old age and increase the risk that the years added to life are years of disability and illness. Novel analyses, interventions and policies are required to understand and potentially mitigate this issue. In this article, we investigate whether a compression of morbidity has been under way in Italy in recent years, in parallel with an increase in life expectancy. Our analysis relies on large repeated cross-sectional data from the national surveillance system PASSI, providing deep insights into the evolution of morbidity together with other socio-demographic variables. In addition, we investigate differences in morbidity across subgroups, focusing on disparities by gender, level of education and economic difficulties, and assessing the evolution of these differences across the period 2013–2019.
      PubDate: 2022-10-24
       
  • A new measure for the attitude to mobility of Italian students and
           graduates: a topological data analysis approach

      Abstract: Students’ and graduates’ mobility is an interesting topic of discussion, especially for the Italian education system and universities. The main reasons for migration and for the so-called brain drain can be found in the socio-economic context and in the famous North–South divide. Measuring mobility and understanding its dynamics over time and space are not trivial tasks. Most studies in the related literature focus on the determinants of this phenomenon. In this paper, instead, we combine tools from graph theory and Topological Data Analysis to propose a new measure for the attitude to mobility. Each mobility trajectory is represented by a graph, and the importance of the features constituting the graph is evaluated over time using persistence diagrams. The attitude to mobility of the students is then ranked by computing the distance between the individual persistence diagram and the theoretical persistence diagram of the stayer student. The new approach is used to evaluate the mobility of the students who enrolled in an Italian university in 2008. The relation between attitude to mobility and the main socio-demographic variables is investigated.
      PubDate: 2022-10-24
       
 