Abstract: Partial ranked set sampling (PRSS) is a cost-effective sampling method. It is a combination of simple random sample (SRS) and ranked set sampling (RSS) designs. The PRSS method allows flexibility for the experimenter in selecting the sample when it is either difficult to rank the units within each set with full confidence or when experimental units are not available. In this article, we introduce and define the likelihood function of any probability distribution under the PRSS scheme. The performance of the maximum likelihood estimators is examined when the available data are assumed to have an exponentiated exponential (EE) distribution under selected RSS schemes as well as SRS. The considered ranked schemes include the PRSS, RSS, neoteric RSS (NRSS), and extreme RSS (ERSS). An intensive simulation study was conducted to compare and explore the behaviour of the proposed estimators. The study demonstrated that the maximum likelihood estimators via PRSS, NRSS, ERSS, and RSS schemes are more efficient than the corresponding estimators under SRS. A real data set is presented for illustrative purposes. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
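The RSS selection step underlying these schemes can be illustrated with a minimal sketch (our own illustration, not the authors' code; perfect ranking within each set is assumed, and all names are ours):

```python
import random

def ranked_set_sample(population, m, cycles=1, rng=None):
    """Basic RSS draw: in each cycle, take m independent sets of m
    units, rank each set (here by the true values, i.e. perfect
    ranking), and measure the i-th order statistic of the i-th set."""
    rng = rng or random.Random(0)
    sample = []
    for _ in range(cycles):
        for i in range(m):
            judged = sorted(rng.sample(population, m))
            sample.append(judged[i])
    return sample

pop = list(range(1000))
rss = ranked_set_sample(pop, m=4, cycles=5)  # 20 measured units out of 80 ranked
```

Variants such as PRSS, NRSS and ERSS modify which order statistics are retained; under perfect ranking, the likelihood for each scheme is then a product of the corresponding order-statistic densities.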

Abstract: Various techniques of scale parameter estimation have been proposed in the case of alpha stable distributions. In the paper, the authors present an estimation technique that involves the k-th record theory. Although this theory is over 40 years old, its implementation in the classical extreme value theory – being the other cornerstone of the presented approach – is quite new, and tempting. Several theoretical properties of the introduced scale parameter estimators are presented. With the use of Monte Carlo methods, a comparative analysis is performed between the approach based on k-th records and approaches based on Hill’s and Pickands’ estimators. Additionally, the paper uses a real-life data set to illustrate how to effectively apply the k-th record estimator of the scale parameter. The research indicates several advantages of the k-th record approach over its other counterparts, especially when dealing with incomplete information about the underlying sample. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
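The k-th record values referred to here are the successive values of the evolving k-th largest observation, which can be extracted with a size-k min-heap. A small sketch of ours (the scale parameter estimator itself is not reproduced in the abstract):

```python
import heapq

def kth_upper_records(xs, k):
    """Successive k-th upper record values of the sequence xs,
    tracked as the evolving k-th largest observation via a size-k
    min-heap.  k = 1 recovers the classical upper records."""
    heap, records = [], []
    for x in xs:
        if len(heap) < k:
            heapq.heappush(heap, x)
            if len(heap) == k:           # first k-th record appears
                records.append(heap[0])  # once k values have been seen
        elif x > heap[0]:
            heapq.heapreplace(heap, x)
            records.append(heap[0])
    return records
```

For k = 1 this yields the classical upper records used in extreme value theory; larger k produces more record values per sample, which is one reason the k-th record approach copes better with incomplete information.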

Abstract: There are many models in the current statistical literature for making inferences based on samples selected from a finite population. Parametric models may be problematic because statistical inference is sensitive to parametric assumptions. The Dirichlet process (DP) prior is very flexible and determines the complexity of the model. It is indexed by two hyper-parameters: the baseline distribution and the concentration parameter. We address two distinct problems in the article. Firstly, we review the current sampling methods for the concentration parameter, which use the continuous baseline distribution. We compare three different methods: the adaptive rejection method, the mixture of Gammas method and the grid method. We also propose a new method based on the ratio of uniforms. Secondly, in practice, some survey responses are known to be discrete. If a continuous distribution is adopted as the baseline distribution, the model is misspecified and standard inference may be invalid. We propose a discrete baseline approach to the DP prior and sample the unobserved responses from the finite population using both a Polya urn scheme and a Multinomial distribution. We apply our discrete baseline approach to a Phytophthora data set. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
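The grid method among those compared can be sketched as follows, using Antoniak's likelihood for the number of distinct clusters k among n observations; the exponential prior and the grid itself are illustrative choices of ours, not the paper's settings:

```python
import math
import random

def sample_alpha_grid(k, n, grid, log_prior=lambda a: -a, rng=None):
    """Grid method for the DP concentration parameter alpha: evaluate
    the unnormalised posterior on a grid, with Antoniak's likelihood
        p(k | alpha, n)  proportional to  alpha**k * Gamma(alpha) / Gamma(alpha + n),
    then draw from the discretised distribution."""
    rng = rng or random.Random(0)
    logw = [log_prior(a) + k * math.log(a)
            + math.lgamma(a) - math.lgamma(a + n) for a in grid]
    top = max(logw)                      # stabilise before exponentiating
    weights = [math.exp(l - top) for l in logw]
    return rng.choices(grid, weights=weights, k=1)[0]

grid = [0.05 * j for j in range(1, 200)]
alpha = sample_alpha_grid(k=12, n=100, grid=grid)
```

The adaptive rejection and ratio-of-uniforms methods target the same posterior without discretising it.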

Abstract: The Poisson-Modification of Quasi Lindley (PMQL) distribution is a newly introduced mixed Poisson distribution for over-dispersed count data. The aim of this article is to introduce the Zero-modified PMQL (ZMPMQL) distribution as an alternative to the PMQL distribution in order to accommodate zero inflation/deflation. The derivation of the ZMPMQL distribution is presented together with some of its important properties, namely the probability mass and distribution functions, mean, variance, index of dispersion, and quantile function. Furthermore, some of its special cases are discussed. The maximum likelihood (ML) estimation method is used for the unknown parameter estimation. A simulation study is conducted in order to evaluate the asymptotic theory of the ML estimation method and to show the superiority of the ML method over the method of moments estimation. The applicability of the introduced distribution is illustrated by using a real-world data set. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
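Zero modification of a count distribution follows a generic recipe, sketched below; since the PMQL pmf is not reproduced in the abstract, a Poisson(2) base pmf stands in purely for illustration:

```python
import math

def zero_modified(base_pmf, pi):
    """Generic zero-modification: a share pi of the mass is moved onto
    zero (pi > 0 inflates zeros; suitably bounded pi < 0 deflates them):
        P_zm(0) = pi + (1 - pi) * P(0)
        P_zm(x) = (1 - pi) * P(x),   x >= 1."""
    return lambda x: (pi if x == 0 else 0.0) + (1 - pi) * base_pmf(x)

# Illustrative stand-in for the PMQL pmf:
base = lambda x: math.exp(-2.0) * 2.0 ** x / math.factorial(x)
zm = zero_modified(base, 0.3)
```

The construction preserves a proper pmf (the modified probabilities still sum to one) while freeing the zero probability from the base model.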

Abstract: Two-predictor suppression situations continue to produce uninterpretable conditions in linear regression. In an attempt to address the theoretical complexities related to suppression situations, the current study introduces two different versions of a software tool called the suppression simulator (Supsim): a) the command-line Python package, and b) the web-based JavaScript tool, both of which are able to simulate numerous random two-predictor models (RTMs). RTMs are randomly generated, normally distributed data vectors x1, x2, and y simulated in such a way that regressing y on both x1 and x2 results in the occurrence of numerous suppression and non-suppression situations. The web-based Supsim requires no coding skills and additionally provides users with 3D scatterplots of the simulated RTMs. This study shows that comparing 3D scatterplots of different suppression and non-suppression situations provides important new insights into the underlying mechanisms of two-predictor suppression situations. An important focus is on the comparison of 3D scatterplots of certain enhancement situations called Hamilton’s extreme example with those of redundancy situations. Such a comparison suggests that the basic mathematical concepts of two-predictor suppression situations need to be reconsidered with regard to the important issue of the statistical control function. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
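The idea of an RTM can be sketched without any regression library: with two predictors, R² has a closed form in the three pairwise correlations, and enhancement (one kind of suppression) occurs when R² exceeds the sum of the squared zero-order correlations. An illustrative sketch of ours, not Supsim's implementation:

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def random_two_predictor_model(n=300, rng=None):
    """One RTM: independent standard normal vectors x1, x2, y.
    The two-predictor R^2 follows from the pairwise correlations;
    enhancement is flagged when R^2 > r_y1^2 + r_y2^2."""
    rng = rng or random.Random(1)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    ry1, ry2, r12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    r2 = (ry1**2 + ry2**2 - 2 * ry1 * ry2 * r12) / (1 - r12**2)
    return r2, r2 > ry1**2 + ry2**2

r2, enhanced = random_two_predictor_model()
```

Repeating the draw many times produces both suppression and non-suppression configurations, which is the raw material for the 3D scatterplot comparisons described above.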

Abstract: Frailty models are a possible choice for addressing the problem of unobserved heterogeneity in individual risks of disease and death. Based on earlier studies, shared frailty models can be utilised in the analysis of bivariate data related to survival times (e.g. matched pairs experiments, twin or family data). In this article, we assume that frailty acts additively on the hazard rate. A new class of shared frailty models based on the generalised Lindley distribution is established. By assuming generalised Weibull and generalised log-logistic baseline distributions, we propose a new class of shared frailty models based on the additive hazard rate. We estimate the parameters of these frailty models within the Bayesian paradigm, using the Markov Chain Monte Carlo (MCMC) technique. Model selection criteria have been applied for the comparison of models. We analyse kidney infection data and suggest the best model. PubDate: Tue, 27 Dec 2022 00:00:00 GMT

Abstract: The purpose of this paper is to study and compare methods for constructing confidence intervals for variance components in an unbalanced one-way random effects model. The methods are based on a classical exact approach, a generalised pivotal quantity, fiducial inference and a fiducial generalised pivotal quantity. The comparison criteria are the empirical coverage probability, which should be maintained at the nominal confidence level of 0.95, and the shortest average length of the confidence interval. The simulation results show that the methods based on the generalised pivotal quantity and on fiducial inference perform very well in terms of both the empirical coverage probability and the average length of the confidence interval. The classical exact method performs well in some situations, while the fiducial generalised pivotal quantity performs well in very unbalanced designs. Therefore, the method based on the generalised pivotal quantity is recommended for all situations. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
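The two comparison criteria (empirical coverage probability and average interval length) are estimated by Monte Carlo in studies of this kind. A generic harness of ours, with a toy t-interval for a normal mean standing in for the variance-component intervals, which are not reproduced here:

```python
import math
import random
import statistics

def empirical_coverage(simulate, ci_method, true_value, reps=500, rng=None):
    """Monte Carlo evaluation of an interval procedure: the share of
    replications whose interval covers the true value, and the
    average interval length."""
    rng = rng or random.Random(0)
    hits, total_len = 0, 0.0
    for _ in range(reps):
        lo, hi = ci_method(simulate(rng))
        hits += lo <= true_value <= hi
        total_len += hi - lo
    return hits / reps, total_len / reps

# Toy stand-in: a t-interval for a normal mean, merely to exercise
# the harness (not a variance-component interval).
def sim(rng):
    return [rng.gauss(0.0, 1.0) for _ in range(30)]

def t_ci(data):
    m = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(len(data))
    return m - 2.045 * se, m + 2.045 * se  # 0.975 t quantile, df = 29

cov, avg_len = empirical_coverage(sim, t_ci, true_value=0.0)
```

Replacing `sim` and `t_ci` with a variance-component generator and each candidate interval reproduces the comparison design described above.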

Abstract: This paper develops optimal designs for situations where it is not feasible for every cluster to be represented in a sample, as in a stratified design, by assuming equal-probability two-stage sampling where clusters are small areas. The paper develops allocation methods for two-stage sample surveys where small-area estimates are a priority. We seek efficient allocations where the aim is to minimize the linear combination of the mean squared errors of composite small area estimators and of an estimator of the overall mean. We suggest some alternative allocations with a view to minimizing the same objective. Several alternatives, including the area-only stratified design, are found to perform nearly as well as the optimal allocation but with better practical properties. Designs are evaluated numerically using Switzerland canton data as well as Botswana administrative districts data. PubDate: Tue, 27 Dec 2022 00:00:00 GMT

Abstract: Sample surveys are often affected by missing observations and non-response caused by the respondents’ refusal or unwillingness to provide the requested information or due to their memory failure. In order to substitute the missing data, a procedure called imputation is applied, which uses the available data as a tool for the replacement of the missing values. Two auxiliary variables create a chain which is used to substitute the missing part of the sample. The aim of the paper is to present the application of the Chain-type factor estimator as a means of source imputation for the non-response units in an incomplete sample. The proposed strategies were found to be more efficient and bias-controllable than similar estimation procedures described in the relevant literature. These techniques could also be made nearly unbiased in relation to other selected parametric values. The findings are supported by a numerical study involving the use of a dataset, proving that the proposed techniques outperform other similar ones. PubDate: Tue, 27 Dec 2022 00:00:00 GMT

Abstract: Randomisation tests (R-tests) are regularly proposed as an alternative method of hypothesis testing when assumptions of classical statistical methods are violated in data analysis. In this paper, the robustness in terms of the type-I-error and the power of the R-test were evaluated and compared with that of the F-test in the analysis of a single factor repeated measures design. The study took into account normal and non-normal data (skewed: exponential, lognormal, Chi-squared, and Weibull distributions), the presence and lack of outliers, and a situation in which the sphericity assumption was met or not under varied sample sizes and number of treatments. The Monte Carlo approach was used in the simulation study. The results showed that when the data were normal, the R-test was approximately as sensitive and robust as the F-test, while being more sensitive than the F-test when data had skewed distributions. The R-test was more sensitive and robust than the F-test in the presence of an outlier. When the sphericity assumption was met, both the R-test and the F-test were approximately equally sensitive, whereas the R-test was more sensitive and robust than the F-test when the sphericity assumption was not met. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
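The R-test for this design permutes treatment labels within each subject, the exchangeable unit in a repeated measures layout. A minimal sketch with an illustrative between-treatment statistic (our own choice, not necessarily the statistic used in the paper):

```python
import random

def ss_between(data):
    """Between-treatment sum of squares of treatment means;
    data is a list of subjects, each a list of t responses."""
    t = len(data[0])
    means = [sum(subj[j] for subj in data) / len(data) for j in range(t)]
    grand = sum(means) / t
    return sum((m - grand) ** 2 for m in means)

def randomisation_test(data, stat=ss_between, n_perm=999, rng=None):
    """R-test: permute treatment labels WITHIN each subject and
    compare the observed statistic with its permutation null."""
    rng = rng or random.Random(0)
    observed = stat(data)
    exceed = 1  # the observed arrangement counts as one
    for _ in range(n_perm):
        permuted = [rng.sample(subj, len(subj)) for subj in data]
        exceed += stat(permuted) >= observed
    return exceed / (n_perm + 1)

# Five subjects, three treatments, with a clear treatment effect:
data = [[1.0, 2.1, 3.9], [0.8, 2.0, 4.2], [1.1, 1.9, 4.0],
        [0.9, 2.2, 4.1], [1.2, 2.0, 3.8]]
p = randomisation_test(data)
```

Because only within-subject permutations are used, the test makes no sphericity or normality assumption, which is why its robustness is compared with the F-test above.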

Abstract: The paper shows that treating the failure-free time in the three-parameter Weibull distribution not as a constant, but as a random variable makes the resulting distribution much more flexible at the expense of only one additional parameter. PubDate: Tue, 27 Dec 2022 00:00:00 GMT
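The construction can be sketched generically: a three-parameter Weibull lifetime is t0 + W, and the idea above is to randomise t0. The mixing distribution below (exponential) is purely illustrative; the paper's specific mixing law is not reproduced here:

```python
import random

def rand_shift_weibull(shape, scale, draw_t0, rng=None):
    """Lifetime X = t0 + W, with W ~ Weibull(shape, scale) and the
    failure-free time t0 drawn from a caller-supplied mixing
    distribution instead of being held fixed."""
    rng = rng or random.Random(0)
    return draw_t0(rng) + rng.weibullvariate(scale, shape)

rng = random.Random(7)
# Illustrative mixing law: exponential failure-free time with mean 0.5.
xs = [rand_shift_weibull(1.5, 2.0, lambda r: r.expovariate(2.0), rng)
      for _ in range(1000)]
```

Marginalising over t0 yields a compound distribution with one extra parameter (that of the mixing law), which is the source of the added flexibility.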

Abstract: Wages and salaries represent the most important component of household disposable income. The aim of the article is to examine how the relationship between the shares of households’ wages and final consumption expenditure in their gross disposable income has developed over the past 20 years. The presented analysis uses publicly available national accounts data for 30 countries for the period of 2000–2019. The studied indicators include the proportion of households’ wages and salaries, and final consumption expenditure in their gross disposable income. Using the proposed method based on the evaluation of changes in the spatial map, it is possible to observe any significant changes in these proportion values in the years of financial crisis and recession, as well as in the years of prosperity. The procedure can therefore serve as an indicator of appreciable changes in economic development. PubDate: Tue, 27 Dec 2022 00:00:00 GMT

Abstract: In this paper, we study interval shrinkage estimators, formed as equally weighted combinations of the point shrinkage estimators at the individual target points θ̄ ∈ (θ0, θ1), for exponentially distributed observations in the presence of outliers drawn from a uniform distribution. Estimators obtained from both shrinkage and interval shrinkage were compared, showing that the estimators obtained via the interval shrinkage method perform better. Symmetric and asymmetric loss functions were also used to calculate the estimators. Finally, a numerical study and illustrative examples were provided to describe the results. PubDate: Fri, 23 Sep 2022 00:00:00 GMT

Abstract: The coronavirus (COVID-19) pandemic affected every country worldwide. In particular, outbreaks in Belgium, the Czech Republic, Poland and Switzerland entered the second wave and were increasing exponentially between July and November 2020. The aims of the study are: to estimate the compound growth rate, to develop a modified exponential time-series model compared with the hyperbolic time-series model, and to estimate the optimal parameters for the models based on the exponential least-squares, three selected points and partial-sums methods, and the hyperbolic least-squares, for the daily COVID-19 cases in Belgium, the Czech Republic, Poland and Switzerland. The speed and spreading power of COVID-19 infections were obtained by using derivative and root-mean-squared methods, respectively. The results show that the exponential least-squares method was the most suitable for the parameter estimation. The compound growth rate of COVID-19 infection was the highest in Switzerland, and the speed and spreading power of COVID-19 infection were the highest in Poland between July and November 2020. PubDate: Fri, 23 Sep 2022 00:00:00 GMT
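The compound growth rate and the exponential least-squares fit mentioned above have standard forms; a sketch of ours, fitting y_t = a·b^t by ordinary least squares on the log scale:

```python
import math

def compound_growth_rate(first, last, periods):
    """Per-period compound growth rate: r = (last / first)**(1 / periods) - 1."""
    return (last / first) ** (1 / periods) - 1

def exponential_least_squares(ys):
    """Fit y_t = a * b**t (t = 0, 1, ...) by OLS on log y;
    returns the estimates (a, b)."""
    n = len(ys)
    ts = list(range(n))
    ls = [math.log(y) for y in ys]
    tbar, lbar = sum(ts) / n, sum(ls) / n
    slope = (sum(t * l for t, l in zip(ts, ls)) - n * tbar * lbar) \
            / (sum(t * t for t in ts) - n * tbar * tbar)
    intercept = lbar - slope * tbar
    return math.exp(intercept), math.exp(slope)

# Exact recovery on a noiseless exponential series y_t = 2 * 3**t:
a, b = exponential_least_squares([2 * 3 ** t for t in range(6)])
```

The fitted b directly implies a per-period growth rate of b - 1, so the two quantities above can be cross-checked against each other.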

Abstract: The Uniformly Minimum Variance Unbiased (UMVU) and the Maximum Likelihood (ML) estimations of R = P(X ≤ Y) and the associated variance are considered for independent discrete random variables X and Y. Assuming a discrete uniform distribution for X and the distribution of Y as a member of the discrete one parameter exponential family of distributions, theoretical expressions of such quantities are derived. Similar expressions are obtained when X and Y interchange their roles and both variables are from the discrete uniform distribution. A simulation study is carried out to compare the estimators numerically. A real application based on demand-supply system data is provided. PubDate: Fri, 23 Sep 2022 00:00:00 GMT
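For X uniform on {1, ..., m} and independent discrete Y, the quantity R = P(X ≤ Y) reduces to an average of upper-tail probabilities of Y. A small exact-arithmetic sketch (our illustration; the UMVU and ML estimators themselves are not reproduced):

```python
from fractions import Fraction

def prob_x_le_y(m, y_pmf):
    """R = P(X <= Y) for X ~ uniform{1, ..., m} independent of a
    discrete Y with pmf y_pmf = {value: probability}:
        R = (1/m) * sum_{x=1}^{m} P(Y >= x)."""
    return sum(sum(p for y, p in y_pmf.items() if y >= x)
               for x in range(1, m + 1)) / m

# Sanity check: X and Y both uniform on {1, ..., m} gives R = (m + 1) / (2m).
m = 4
unif = {y: Fraction(1, m) for y in range(1, m + 1)}
R = prob_x_le_y(m, unif)  # Fraction(5, 8)
```

Estimation then amounts to plugging estimated tail probabilities of Y into this identity, which is where the exponential-family structure assumed in the paper enters.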

Abstract: The COVID-19 pandemic has had a substantial impact on public health all over the world. In order to prevent the spread of the virus, the majority of countries introduced restrictions which entailed considerable economic and social costs. The main goal of the article is to study how the lockdown introduced in Poland affected the spread of the pandemic in the country. The study used synthetic control method to this end. The analysis was carried on the basis of data from the Local Data Bank and a government website on the state of the epidemic in Poland.The results indicated that the lockdown significantly curbed the spread of the COVID-19 pandemic in Poland. Restrictions led to the substantial drop in infections – by 9500 cases – in three weeks. The results seem to stay the same despite the change of assumptions in the study. Such conclusion can be drawn from the performance of the placebo-in-space and placebo-in-time analyses. PubDate: Fri, 23 Sep 2022 00:00:00 GMT

Abstract: In the present study income inequality in Poland is evaluated using corrected income data to provide more reliable estimates. According to most empirical studies based on household surveys and considering the European standards, the recent income inequality in Poland is moderate and decreased significantly after reaching its peaks during the first decade of the 21st century. These findings were challenged by Brzeziński et al. (2022), who placed Polish income inequality among the highest in Europe. Such a conclusion was possible when combining the household survey data with information on personal income tax. In the present study the above-mentioned findings are further explored using 2014 and 2015 data and employing additional corrections to the household survey incomes. Incomes of the poorest people are replaced by their predictions made on a large set of well-being correlates, using the hierarchical correlation reconstruction. Applying this method together with the corrections based on Brzeziński et al.'s results reduces the 2014 and 2015 revised Gini indices, still keeping them above the values obtained with the use of the survey data only. It seems that the hierarchical correlation reconstruction offers more accurate proxies to the actual low incomes, while matching tax data provides better proxies to the top incomes. PubDate: Fri, 23 Sep 2022 00:00:00 GMT
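The inequality measure at stake, the Gini index, can be computed from sorted incomes with a standard identity; a minimal sketch:

```python
def gini(incomes):
    """Gini index via the sorted-sum identity:
        G = 2 * sum_i i * x_(i) / (n * sum x) - (n + 1) / n,
    where x_(i) are the incomes sorted in increasing order."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

g_equal = gini([10, 10, 10, 10])  # perfect equality: 0.0
g_skew = gini([1, 1, 1, 17])      # concentrated income: close to 0.6
```

Because the index depends strongly on the tails, correcting the lowest incomes (hierarchical correlation reconstruction) and the highest incomes (tax-data matching) pulls the estimate in opposite directions, as the abstract describes.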

Abstract: The measurement of preferences can be based on historical observations of consumer behaviour or on data describing consumer intentions. In the latter case, the measurement of preferences is performed using methods which express consumer attitudes at the time of research. However, most of these methods are very laborious, especially when a large number of objects is tested. In such cases incomplete analyses may prove useful. An incomplete analysis involves the division of objects into subgroups, so that each pair of objects appears at exactly the same frequency and all objects are in each subgroup. The purpose of the work is to compare two incomplete methods for measuring the similarity of preferences, i.e. the triad method and the tetrad method. These methods can be used whenever similarities are measured on an ordinal scale. They have been compared in terms of their labour intensity and ability to map the known structure of objects, even when all pairs of objects in subgroups cannot be presented equally frequently. PubDate: Fri, 23 Sep 2022 00:00:00 GMT

Abstract: A Fibonacci-type probability distribution provides probabilistic models for establishing stopping rules associated with the number of consecutive successes. It can be interpreted as a generalised version of the geometric distribution. In this article, after revisiting the Fibonacci-type probability distribution to explore its definition, moments and properties, we propose numerical methods to obtain two estimators of the success probability: the method of moments estimator (MME) and the maximum likelihood estimator (MLE). Their performance was compared in terms of the mean squared error. A numerical study demonstrated that the MLE tends to outperform the MME over most of the parameter space for various sample sizes. PubDate: Fri, 23 Sep 2022 00:00:00 GMT
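The stopping rule behind such distributions, the trial at which the first run of k consecutive successes is completed, admits a simple dynamic program over the current run length; k = 1 gives the geometric case, and for general k the probabilities involve Fibonacci-type numbers. A sketch of ours, not necessarily the authors' parameterisation:

```python
def run_length_pmf(p, k, nmax):
    """pmf of N = the trial at which the first run of k consecutive
    successes is completed, by dynamic programming over the length
    of the current trailing run (states 0 .. k-1)."""
    q = 1.0 - p
    probs = [0.0] * (nmax + 1)
    state = [1.0] + [0.0] * (k - 1)  # state[j] = P(trailing run = j, not stopped)
    for n in range(1, nmax + 1):
        probs[n] = p * state[k - 1]  # run of length k completed at trial n
        new = [0.0] * k
        new[0] = q * sum(state)      # a failure resets the run
        for j in range(k - 1):
            new[j + 1] = p * state[j]
        state = new
    return probs[1:]                 # pmf[i] = P(N = i + 1)

pmf = run_length_pmf(0.5, 2, 30)     # e.g. P(N = 2) = p**2 = 0.25
```

Both the MME and the MLE can be evaluated numerically against this pmf, which is how a comparison in mean squared error would proceed.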

Abstract: One of the greatest challenges facing official statistics in the 21st century is the use of alternative sources of data about prices (scanned and scraped data) in the analysis of price dynamics, which also involves selecting the appropriate formula of the price index at the elementary group (5-digit) level. When consumer price indices of goods and services are constructed, a number of subjective decisions are made at different stages, e.g. regarding the choice of data sources and types of indices used for the purpose of estimation. All of these decisions can affect the bias of consumer price indices, i.e. the extent to which they contribute to the overall uncertainty about the resulting index values. By measuring how robust consumer price indices are, one can assess the impact that the decisions made at the different stages of index construction have on the index values. This assessment involves analysing uncertainty and sensitivity. The purpose of the study described in the article was to determine how much and in which direction the consumer price index changes when including scanner and scraped data in the analysis, in addition to the data on prices collected by enumerators. The impact of these new data sources was assessed by analysing uncertainty and sensitivity under the deterministic approach. To the best of the authors’ knowledge, it is a novel application of robustness analysis to measure inflation using new data sources. The empirical study was based on data for February and March 2021, while scanner and scraped data about selected categories of food products were obtained from one retail chain operating hundreds of points of sale in Poland and selling products online. It was found that the choice of a data source has the most significant impact on the final value of the index at the elementary group level, while the choice of the aggregation formula used to consolidate different data sources is of secondary importance. 
PubDate: Thu, 22 Sep 2022 00:00:00 GMT
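At the elementary (5-digit) level, a common unweighted formula for combining price relatives is the Jevons index; a minimal sketch (illustrative only; the paper's aggregation across data sources and its uncertainty analysis are more involved):

```python
import math

def jevons(p0, p1):
    """Jevons elementary index: the unweighted geometric mean of the
    price relatives p1[i] / p0[i] between two periods."""
    rels = [b / a for a, b in zip(p0, p1)]
    return math.prod(rels) ** (1 / len(rels))

# A uniform 10% price rise across three products gives an index of 1.1:
idx = jevons([2.0, 5.0, 4.0], [2.2, 5.5, 4.4])
```

Running such an elementary index separately on enumerator-collected, scanner and scraped prices, and then varying the aggregation formula, is the kind of decision whose impact the sensitivity analysis above quantifies.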