Authors:Md. Mahabubur Rahman; Bander Al-Zahrani; Muhammad Qaiser Shahbaz Abstract: In this paper, a cubic transmuted Weibull (\( CTW \)) distribution is proposed using the general family of transmuted distributions introduced by Rahman et al. (Pak J Stat Oper Res 14:451–469, 2018). We explore the proposed \( CTW \) distribution in detail and study its statistical properties. Parameter estimation and inference procedures for the proposed distribution are discussed. We conduct a simulation study to assess the performance of the estimation technique. Finally, we consider two real-life data sets to investigate the practicality of the proposed \( CTW \) distribution. PubDate: 2019-01-11 DOI: 10.1007/s40745-018-00188-y

Authors:El-Sayed A. El-Sherpieny; Mamhoud M. Elsehetry Abstract: A new family of distributions called the type II Kumaraswamy half logistic-G class is introduced and studied. Five new special models of the proposed class are presented. Some mathematical properties of the new family are studied. Explicit expressions for the moments, probability weighted moments, quantile function, mean deviation, order statistics and Rényi entropy are investigated. Estimates of the unknown parameters are obtained using the maximum likelihood approach. A simulation study is carried out to estimate the model parameters of the distribution. One real data set is employed to show the usefulness of the new family. PubDate: 2019-01-09 DOI: 10.1007/s40745-018-00187-z

Authors:Manish Sharma; Shikha N. Khera; Pritam B. Sharma Abstract: The Trait Meta Mood Scale (TMMS) is a widely used instrument for measuring emotional intelligence. The scale helps in ascertaining overall emotional intelligence and can be used by organizations to manage the workforce and take corrective measures, thereby increasing efficiency and effectiveness and transforming the organization. If a large data set contains missing values, it becomes difficult to find the overall emotional intelligence of the given group and to carry out statistical analysis. This work proposes a model that applies a neural network to impute the missing data and to perform regression. The model provides a flexible system for measuring emotional intelligence. It paves the way for the application of machine learning not only to the TMMS but also to other scales of emotional intelligence. PubDate: 2019-01-05 DOI: 10.1007/s40745-018-00185-1

Authors:Totan Garai; Dipankar Chakraborty; Tapan Kumar Roy Abstract: In this paper, we investigate a multi-objective inventory model under both stock-dependent demand rate and holding cost rate with fuzzy random coefficients. A chance-constrained fuzzy random multi-objective model and a traditional solution procedure based on an interactive fuzzy satisfying method are discussed. In addition, the technique of fuzzy random simulation is applied to deal with general fuzzy random objective functions and fuzzy random constraints, which are usually difficult to convert into their crisp equivalents. The purpose of this study is to determine the optimal order quantity and inventory level such that the retailer's total profit is maximized and wastage cost is minimized. Finally, an illustrative example is given to show the application of the proposed model. PubDate: 2019-01-03 DOI: 10.1007/s40745-018-00186-0

Authors:M. S. Eliwa; M. El-Morshedy Abstract: In this paper, a new class of bivariate distributions called the bivariate Gumbel-G family is proposed, whose marginal distributions are Gumbel-G families. Several of its statistical properties are derived. After introducing the general class, a special model of the new family is discussed in detail. Bayesian and maximum likelihood techniques are used to estimate the model parameters. A simulation study is carried out to examine the bias and mean square error of the Bayesian and maximum likelihood estimators. Finally, a real data set is analyzed to illustrate the flexibility of the proposed bivariate family. PubDate: 2019-01-02 DOI: 10.1007/s40745-018-00190-4

Authors:Mohiuddin Ahmed Pages: 497 - 512 Abstract: In certain cyber-attack scenarios, such as flooding denial of service attacks, the data distribution changes significantly. This forms a collective anomaly, where some similar kinds of normal data instances appear in abnormally large numbers. Since they are not rare anomalies, existing anomaly detection techniques cannot properly identify them. This paper investigates detecting this behaviour using the existing clustering and co-clustering based techniques and utilizes the network traffic modelling technique via Hurst parameter to propose a more effective algorithm combining clustering and Hurst parameter. Experimental analysis reflects that the proposed Hurst parameter-based technique outperforms existing collective and rare anomaly detection techniques in terms of detection accuracy and false positive rates. The experimental results are based on benchmark datasets such as KDD Cup 1999 and UNSW-NB15 datasets. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0149-0 Issue No:Vol. 5, No. 4 (2018)

Authors:Selamawit Endale Gurmu Pages: 513 - 527 Abstract: Cervical cancer is one of the leading causes of death in the world and represents a tremendous burden on patients, families and societies. It is estimated that over one million women worldwide currently have cervical cancer; most of them have not been diagnosed or have no access to treatment that could cure them or prolong their lives. The goal of this study is to investigate potential risk factors affecting the survival time of women with cervical cancer at Tikur Anbessa Specialized Hospital. Data were taken from the medical record cards of patients enrolled during September 2011–September 2015. The Kaplan–Meier estimation method, the Cox proportional hazards model and parametric shared frailty models were used to analyze the survival time of cervical cancer patients. The study subjects came from clustered communities, and hence the clustered survival data are correlated at the regional level. Parametric frailty models were explored assuming that women within the same cluster (region, for this study) share similar risk factors. We used exponential, Weibull, log-logistic and lognormal distributions, and all models were compared using the AIC. The lognormal inverse Gaussian model has the minimum AIC value among the models compared. The results implied that not having given birth by the end of the study and marrying after the age of twenty significantly prolonged the survival time of patients, while age classes 51–60, 61–70 and > 70, smoking cigarettes, stage III and IV disease, family history of cervical cancer, history of abortion and living with HIV/AIDS significantly shortened it. The findings of this study suggest that age, smoking cigarettes, stage, family history, abortion history, living with HIV/AIDS, age at first marriage and age at first birth are major factors in the survival time of patients. There is heterogeneity between the regions in the survival time of cervical cancer patients, indicating that one needs to account for this clustering variable using frailty models. The fit statistics showed that the lognormal inverse Gaussian frailty model described the survival time data of cervical cancer patients better than the other distributions used in this study. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0150-7 Issue No:Vol. 5, No. 4 (2018)

Authors:Desa Daba Fufa; Belianeh Legesse Zeleke Pages: 529 - 547 Abstract: This paper provides a robust analysis of volatility forecasting of the Euro-ETB exchange rate using weekly data spanning the period January 3, 2000–December 2, 2015. The forecasting performance of various GARCH-type models is investigated based on forecasting performance criteria such as MSE and MAE based tests, and alternative measures of realized volatility. To our knowledge, this is the first study that focuses on the Euro-ETB exchange rate using high frequency data and a range of econometric models and forecast performance criteria. The empirical results indicate that the Euro-ETB exchange rate series exhibits persistent volatility clustering over the study period. We document evidence that ARCH (8), GARCH (1, 1), EGARCH (1, 1) and GJR-GARCH (2, 2) models with normal distribution, Student's t-distribution and GED are the best in-sample estimation models in terms of the volatility behavior of the series. Amongst these models, GJR-GARCH (2, 2) and GARCH (1, 1) with Student's t-distribution are found to perform best in terms of one-step-ahead forecasting based on realized volatility calculated from the underlying daily data and on squared weekly first differences of the logarithm of the series, respectively. The one-step-ahead forecasted conditional variance of the weekly Euro-ETB exchange rate shows large spikes around 2010, and it is evident that the weekly Euro-ETB exchange rate is volatile. These large spikes indicate the devaluation of the Ethiopian birr against the Euro. This volatility behavior may affect international foreign investment and the trade balance of the country. Overall, GJR-GARCH (2, 2) with Student's t-distribution is the best model among those considered, both in terms of the stylized facts and in terms of forecasting the volatility of the Ethiopian Birr/Euro exchange rate. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0151-6 Issue No:Vol. 5, No. 4 (2018)
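
For readers unfamiliar with the GARCH (1, 1) model named in this abstract, its conditional variance follows the standard recursion sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]. The Python sketch below only illustrates that recursion and the resulting one-step-ahead variance forecast. It is not the authors' code: the function name `garch11_variance`, the zero-mean return assumption and all parameter values are hypothetical, and no parameter estimation is performed.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path of a GARCH(1, 1) model with zero-mean returns.

    path[t] is the variance for period t given information up to t - 1;
    the last element is the one-step-ahead forecast after the final return.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    # Start from the unconditional (long-run) variance, a common convention.
    var = omega / (1.0 - alpha - beta)
    path = [var]
    for r in returns:
        # Standard GARCH(1, 1) update: shock term plus persistence term.
        var = omega + alpha * r * r + beta * var
        path.append(var)
    return path


if __name__ == "__main__":
    # Hypothetical weekly log-return series and parameters, for illustration only.
    demo_returns = [0.01, -0.03, 0.02, 0.05, -0.01]
    demo_path = garch11_variance(demo_returns, omega=1e-5, alpha=0.1, beta=0.85)
    print("one-step-ahead variance forecast:", demo_path[-1])
```

A large squared return raises the next period's variance, and the beta term carries that increase forward, which is the volatility clustering the abstract describes.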

Authors:Rong Liu; Robert Rallo; Yoram Cohen Pages: 549 - 563 Abstract: The box-counting approach for fractal dimension calculation is scaled up for big data using a data structure named box locality index (BLI). The BLI is constructed as key-value pairs with the key indexing the location of a “box” (i.e., a grid cell in the multi-dimensional space) and the value counting the number of data points inside the box (i.e., “box occupancy”). Such a key-value pair structure significantly simplifies the traditionally used hierarchical structure and encodes only the information required by the box-counting approach for fractal dimension calculation. Moreover, as the box occupancies (i.e., the values) associated with the same index (i.e., the key) are aggregatable, the BLI grants the box-counting approach the scalability needed for fractal dimension calculation of big data using distributed computing techniques (e.g., MapReduce and Spark). Taking advantage of the BLI, MapReduce and Spark methods for fractal dimension calculation of big data are developed, which conduct box-counting for each grid level as a cascade of MapReduce/Spark jobs in a bottom-up fashion. In an empirical validation, the MapReduce and Spark methods demonstrated good effectiveness and efficiency in fractal dimension calculation for a big synthetic dataset. In summary, this work provides an efficient solution for estimating the intrinsic dimension of big data, which is essential for many machine learning methods and data analytics tasks, including feature selection and dimensionality reduction. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0152-5 Issue No:Vol. 5, No. 4 (2018)
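
The key-value idea described in this abstract can be sketched on a single machine. The Python example below is an illustration under stated assumptions, not the authors' MapReduce or Spark implementation: it assumes points scaled to the unit hypercube [0, 1)^d and a grid that halves the box size at each level, and the function names (`build_bli`, `coarsen`, `box_counting_dimension`) are hypothetical. Keys are grid-cell coordinates, values are box occupancies, and each coarser level is obtained by summing the occupancies of child boxes, which is the aggregation step a reduce-by-key job would perform.

```python
import math
from collections import Counter


def build_bli(points, level):
    """Key-value index at the finest level: cell coordinates -> occupancy.

    Assumes every coordinate lies in [0, 1); the grid has 2**level cells per axis.
    """
    cells = 2 ** level
    return Counter(
        tuple(min(int(c * cells), cells - 1) for c in p) for p in points
    )


def coarsen(bli):
    """Aggregate a level-k index into level k-1 by merging sibling boxes."""
    parent = Counter()
    for key, count in bli.items():
        parent[tuple(k // 2 for k in key)] += count
    return parent


def box_counting_dimension(points, max_level):
    """Least-squares slope of log(occupied boxes) against log(1 / box size)."""
    bli = build_bli(points, max_level)
    xs, ys = [], []
    for level in range(max_level, 0, -1):  # bottom-up: finest grid first
        xs.append(level * math.log(2.0))   # log(1 / box size)
        ys.append(math.log(len(bli)))      # log(number of occupied boxes)
        bli = coarsen(bli)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Points on the diagonal of the unit square form a one-dimensional set.
    line = [((i + 0.5) / 1024, (i + 0.5) / 1024) for i in range(1024)]
    print(box_counting_dimension(line, max_level=5))
```

Because occupancies are plain counts, partial indexes built on different data partitions can be merged by adding values with equal keys, which is what makes the structure suitable for distributed aggregation.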

Authors:M. Elgarhy; I. Elbatal; Muhammad Ahsan ul Haq; Amal S. Hassan Pages: 565 - 581 Abstract: The Lindley distribution is one of the widely used models in reliability modeling. In addition, several researchers have motivated new classes of distributions based on modifications of the quasi Lindley distribution. In this article, a new generalized distribution named the transmuted Kumaraswamy quasi Lindley (TKQL) is introduced. Various statistical properties of the TKQL distribution are provided. The rth moment of the TKQL distribution and its moment generating function are explored. Moreover, estimation of the model parameters is discussed via the method of maximum likelihood. Applications to real data are performed to illustrate the flexibility of the TKQL distribution in comparison with some sub-models. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0153-4 Issue No:Vol. 5, No. 4 (2018)

Authors:A. Shabani; M. Khaleghi Moghadam; A. Gholami; E. Moradi Pages: 583 - 613 Abstract: A new class of lifetime distributions is proposed. Closed form expressions are provided for the density, cumulative distribution, survival and hazard rate functions. Maximum likelihood estimation is discussed and formulas for the elements of the observed information matrix are provided. Simulation studies are conducted. Finally, two real data applications are given showing the flexibility and potential of the new distribution. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0154-3 Issue No:Vol. 5, No. 4 (2018)

Authors:Rabia Aziz; C. K. Verma; Namita Srivastava Pages: 615 - 635 Abstract: Classification of high dimensional data is a crucial task in bioinformatics. Cancer classification from microarray data is a typical application of machine learning because of the large number of genes. Feature (gene) selection and classification with computational intelligence techniques play an important role in the diagnosis and prediction of disease from microarray data. The artificial neural network (ANN) is an artificial intelligence technique for classification, image processing and prediction. This paper evaluates the performance of an ANN classifier using six different hybrid feature selection techniques for gene selection of microarray data. These hybrid techniques use independent component analysis (ICA) as an extraction technique, together with popular filter techniques and bio-inspired algorithms for optimization of the ICA feature vector. Five binary gene expression microarray datasets are used to compare the performance of these techniques and to determine how they improve the performance of the ANN classifier. These techniques can be extremely useful in feature selection because they achieve the highest classification accuracy along with the lowest average number of selected genes. Furthermore, a statistical hypothesis test was employed at a chosen confidence level to check for significant differences between the algorithms. The experimental results show that a combination of ICA with the genetic bee colony algorithm gives superior performance, as it heuristically removes non-contributing features to improve the performance of classifiers. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0155-2 Issue No:Vol. 5, No. 4 (2018)

Authors:U. H. Salemi; S. Rezaei; Y. Si; S. Nadarajah Pages: 637 - 658 Abstract: Selection of optimal progressive censoring schemes for the normal distribution is discussed according to maximum likelihood estimation and best linear unbiased estimation. The selection is based on variances of the estimators of the two parameters of the normal distribution. The extreme left censoring scheme is shown to be an optimal progressive censoring scheme. The usual type-II right censoring case is shown to be the worst progressive censoring scheme for estimating the scale parameter. It can greatly increase the variance of estimators. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0156-1 Issue No:Vol. 5, No. 4 (2018)

Authors:Aboma Temesgen; Abdisa Gurmesa; Yehenew Getchew Pages: 659 - 678 Abstract: Tuberculosis (TB) and HIV have been closely linked since the emergence of AIDS; TB enhances HIV replication by accelerating the natural evolution of HIV infection, which is the leading cause of sickness and death of people living with HIV/AIDS. To improve their lives, co-infected patients start antiretroviral treatment (ART); once a patient starts ART, it is common to measure CD4 and other clinical outcomes that are correlated with survival time. However, separate analysis of such data does not handle the association between the longitudinally measured outcome and the time-to-event, whereas joint modeling does, giving valid and efficient estimates of survival time. This study jointly models longitudinally measured CD4 and time to death to understand their association. Furthermore, the study identifies factors affecting the mean change in square root CD4 measurement over time and risk factors for the survival time of HIV/TB co-infected patients. The study consists of 254 HIV/TB co-infected patients who were 18 years old or older and who were on antiretroviral treatment follow-up from 1 February 2009 to 1 July 2014 at Jimma University Specialized Hospital, West Ethiopia. First, data were analyzed using linear mixed models and survival models separately. After appropriate separate models were selected using the Akaike information criterion, different joint models, with different random effects in the longitudinal model and different shared-parameter association structures in the survival model, were fitted and compared using the deviance information criterion. The linear mixed model showed that functional status, weight, linear time and quadratic time effects have a significant effect on the mean change of CD4 measurement over time. The Cox and Weibull survival models showed that baseline weight, baseline smoking, separated marital status and baseline functional status have a significant effect on the hazard function of the survival time, whereas the joint model showed that the subject-specific baseline value and the subject-specific linear and quadratic slopes of CD4 measurement were significantly associated with the survival time of co-infected patients at the 5% significance level. The longitudinally measured CD4 count marker process is significantly associated with time to death, and the subject-specific quadratic slope of CD4 measurement, baseline clinical stage IV and smoking are high risk factors that lower the survival time of HIV/TB co-infected patients. Since the longitudinally measured CD4 measurement is correlated with survival time, joint modeling is used to handle the association between these two processes and to obtain valid and efficient estimates of survival time. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0157-0 Issue No:Vol. 5, No. 4 (2018)

Authors:Tanmay Sen; Yogesh Mani Tripathi; Ritwik Bhattacharya Pages: 679 - 708 Abstract: This article considers estimation of unknown parameters and prediction of future observations of a generalized exponential distribution based on Type-II hybrid censored data. Bayes point and HPD interval estimates of the unknown parameters are obtained under the assumption of independent gamma priors. Different classical and Bayesian point predictors and prediction intervals are obtained in two-sample situation against squared error loss function. The optimum censoring schemes are computed under various optimality criteria. Monte Carlo simulations are performed to compare different methods and two data sets are analyzed for illustrative purposes. PubDate: 2018-12-01 DOI: 10.1007/s40745-018-0158-z Issue No:Vol. 5, No. 4 (2018)

Authors:Bilal Ahmad Para; Tariq Rashid Jan Abstract: In this paper, a new discrete version of generalized inverse Weibull distribution is proposed using the general approach of discretization. Structural properties of the newly introduced discrete model have been discussed comprehensively. Characterization results have also been made to establish a direct link between the discrete generalized inverse Weibull distribution and its continuous counterpart. Various theorems relating a generalized inverse Weibull distribution with other probability models have also been proved. Finally, a real life count data set from medical sciences is used to illustrate the application of discrete inverse Weibull distribution. PubDate: 2018-11-24 DOI: 10.1007/s40745-018-0184-x

Authors:Amal S. Hassan; Marwa Abd-Allah Abstract: We introduce and study a new three-parameter lifetime distribution named the inverse power Lomax. The proposed distribution is obtained as the inverse form of the power Lomax distribution. Some statistical properties of the inverse power Lomax model are derived. Based on censored samples, maximum likelihood estimators of the model parameters are obtained. An intensive simulation study is performed to evaluate the behavior of the estimators based on their biases and mean square errors. The superiority of the new model over some well-known distributions is illustrated by means of real data sets. The results show that the suggested model can produce better fits than some well-known distributions. PubDate: 2018-11-16 DOI: 10.1007/s40745-018-0183-y

Authors:Derbachew Asfaw; Zeytu Gashaw Abstract: We examined the determinants of university students' admittance into their top wished-fields of study using data from the Ethiopian National Educational Assessment and Examination Agency. The analysis is based on a 2016 cohort of 41,371 applicants in Social Science and 92,135 applicants in Natural Science who were admitted to public universities in Ethiopia. We use a binary logistic regression model applied to four broadly defined fields in the Social Science stream and find that students’ place of residence, gender, EHEECE admission grade and age have a significant positive impact on the decision process towards admitting students into their top wished-fields. Results also showed significant positive interaction effects of EHEECE admission grade, gender and wished-fields on the decision process. We noticed a fair selection between girls and boys into the fields of Law and of Theatrical Fine Art and Music. For girls, the odds of being admitted into the field of Other Social Science and Humanities were relatively better than the odds of being admitted into Business and Economics. We use a polytomous logit regression model applied to seven broadly defined fields in the Natural Science stream and find no selection bias between girls and boys in admitting applicants into the fields of first and second ordered preference, whilst there was variation among the fields ranked thereafter. PubDate: 2018-10-03 DOI: 10.1007/s40745-018-0182-z

Authors:Devendra Kumar; Manoj Kumar Abstract: We introduce a new lifetime distribution, namely the transmuted extended exponential distribution, which generalizes the extended exponential distribution proposed by Nadarajah and Haghighi (Statistics 45:543–558, 2011) with an additional parameter, using the quadratic rank transmutation map studied by Shaw and Buckley (The alchemy of probability distributions: beyond Gram-Charlier expansions, and a skew-kurtotic-normal distribution from a rank transmutation map, 2009. arXiv:0901.0434), to provide greater flexibility in modeling data from a practical point of view. In this paper, our main focus is on estimation from the frequentist point of view, yet some statistical and reliability characteristics of the model are also derived. We briefly describe different estimation procedures, namely maximum likelihood estimation, maximum product of spacings estimation and least squares estimation. Monte Carlo simulations are performed to compare the performance of the proposed estimation methods for both small and large samples. Finally, the potential of the model is analyzed by means of one real data set. PubDate: 2018-09-22 DOI: 10.1007/s40745-018-0181-0
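
As a brief illustration of the construction this abstract refers to: the quadratic rank transmutation map of Shaw and Buckley turns a baseline CDF G into F(x) = (1 + theta) G(x) - theta G(x)^2 with |theta| <= 1, and the Nadarajah–Haghighi extended exponential baseline has CDF G(x) = 1 - exp(1 - (1 + lam x)^alpha) for x > 0. The Python sketch below combines the two. The parameter names (`alpha`, `lam`, `theta`) and function names are assumptions chosen for this example and may differ from the paper's notation.

```python
import math


def nh_cdf(x, alpha, lam):
    """Baseline CDF of the Nadarajah-Haghighi extended exponential distribution."""
    if alpha <= 0 or lam <= 0:
        raise ValueError("alpha and lam must be positive")
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(1.0 - (1.0 + lam * x) ** alpha)


def transmute(g, theta):
    """Quadratic rank transmutation map applied to a baseline CDF value g."""
    if abs(theta) > 1:
        raise ValueError("theta must lie in [-1, 1]")
    return (1.0 + theta) * g - theta * g * g


def te_cdf(x, alpha, lam, theta):
    """CDF of the transmuted extended exponential distribution (sketch)."""
    return transmute(nh_cdf(x, alpha, lam), theta)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for x in (0.5, 1.0, 2.0):
        print(x, te_cdf(x, alpha=1.5, lam=0.8, theta=0.4))
```

Setting `theta = 0` recovers the baseline distribution, and `alpha = 1` with `theta = 0` reduces it further to the ordinary exponential distribution with rate `lam`.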

Authors:Yuvraj Sunecher; Naushad Mamode Khan; Vandna Jowaheer; Marcelo Bourguignon; Mohammad Arashi Abstract: The ranking of some English Premier League (EPL) clubs during the football season is of keen interest to many stakeholders, with special attention to the London rivals: Arsenal, Chelsea and Tottenham. In particular, the first half (GF) and second half (GS) scores, besides being inter-related, are perceived as a convenient measure of the clubs' potential. This paper studies the contributory effects of the possible factors that commonly influence a club's scoring capacity in the two halves, along with forecast diagnostic measures, via a novel flexible bivariate time series model with COM-Poisson innovations, using data from August 2014 to December 2017. PubDate: 2018-09-11 DOI: 10.1007/s40745-018-0180-1