Authors:Ke-Hai Yuan; Ge Jiang, Ying Cheng Abstract: Data in psychology are often collected using Likert-type scales, and it has been shown that factor analysis of Likert-type data is better performed on the polychoric correlation matrix than on the product-moment covariance matrix, especially when the distributions of the observed variables are skewed. In theory, factor analysis of the polychoric correlation matrix is best conducted using generalized least squares with an asymptotically correct weight matrix (AGLS). However, simulation studies showed that both least squares (LS) and diagonally weighted least squares (DWLS) perform better than AGLS, and thus LS or DWLS is routinely used in practice. In either LS or DWLS, the associations among the polychoric correlation coefficients are completely ignored. To mend such a gap between statistical theory and empirical work, this paper proposes new methods, called ridge GLS, for factor analysis of ordinal data. Monte Carlo results show that, for a wide range of sample sizes, ridge GLS methods yield uniformly more accurate parameter estimates than existing methods (LS, DWLS, AGLS). A real-data example indicates that estimates by ridge GLS are 9–20% more efficient than those by existing methods. Rescaled and adjusted test statistics as well as sandwich-type standard errors following the ridge GLS methods also perform reasonably well. PubDate: 2017-05-26T01:55:33.202119-05: DOI: 10.1111/bmsp.12098

Authors:Chen-Wei Liu; Wen-Chung Wang Abstract: Examinee-selected item (ESI) design, in which examinees are required to respond to a fixed number of items in a given set, always yields incomplete data (i.e., when only the selected items are answered, data are missing for the others) that are likely non-ignorable in likelihood inference. Standard item response theory (IRT) models become infeasible when ESI data are missing not at random (MNAR). To solve this problem, the authors propose a two-dimensional IRT model that posits one unidimensional IRT model for observed data and another for nominal selection patterns. The two latent variables are assumed to follow a bivariate normal distribution. In this study, the mirt freeware package was adopted to estimate parameters. The authors conduct an experiment to demonstrate that ESI data are often non-ignorable and to determine how to apply the new model to the data collected. Two follow-up simulation studies are conducted to assess the parameter recovery of the new model and the consequences for parameter estimation of ignoring MNAR data. The results of the two simulation studies indicate good parameter recovery of the new model and poor parameter recovery when non-ignorable missing data were mistakenly treated as ignorable. PubDate: 2017-04-08T07:17:00.085834-05: DOI: 10.1111/bmsp.12097

Authors:Michael Smithson; Yiyun Shou Abstract: This paper introduces a two-parameter family of distributions for modelling random variables on the (0,1) interval by applying the cumulative distribution function of one ‘parent’ distribution to the quantile function of another. Family members have explicit probability density functions, cumulative distribution functions and quantiles in a location parameter and a dispersion parameter. They capture a wide variety of shapes that the beta and Kumaraswamy distributions cannot. They are amenable to likelihood inference, and enable a wide variety of quantile regression models, with predictors for both the location and dispersion parameters. We demonstrate their applicability to psychological research problems and their utility in modelling real data. PubDate: 2017-03-17T09:30:56.538616-05: DOI: 10.1111/bmsp.12091
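
The construction described in this abstract (a parent CDF applied to the quantile function of another distribution) can be illustrated with a minimal sketch. The abstract does not name specific parents, so everything concrete below is an assumption made for illustration: a logistic parent CDF, the logit as the quantile function of the standard logistic, the standardization (z - mu) / sigma, and the function names.

```python
import math


def logit(x: float) -> float:
    """Quantile function of the standard logistic; maps (0, 1) to the real line."""
    return math.log(x / (1.0 - x))


def expit(z: float) -> float:
    """CDF of the standard logistic; maps the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cdf(x: float, mu: float, sigma: float) -> float:
    """Assumed family member: G(x) = F((H^{-1}(x) - mu) / sigma), F and H both logistic."""
    return expit((logit(x) - mu) / sigma)


def quantile(p: float, mu: float, sigma: float) -> float:
    """Explicit inverse of cdf(): x = F(mu + sigma * H^{-1}(p))."""
    return expit(mu + sigma * logit(p))


def pdf(x: float, mu: float, sigma: float) -> float:
    """Derivative of cdf() with respect to x, via the chain rule."""
    c = cdf(x, mu, sigma)
    return c * (1.0 - c) / (sigma * x * (1.0 - x))
```

Under these assumptions, mu = 0 and sigma = 1 reduce the distribution to the uniform on (0, 1), and the median is always expit(mu), which is why mu acts as a location parameter and sigma as a dispersion parameter.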

Authors:Maria Umlauft; Frank Konietschke, Markus Pauly Abstract: Inference methods for null hypotheses formulated in terms of distribution functions in general non-parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set-up Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample size, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal–Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies foster these theoretical findings. A real data set exemplifies its applicability. PubDate: 2017-03-15T02:05:36.852211-05: DOI: 10.1111/bmsp.12089
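
To make the permutation idea concrete, here is a hedged sketch of the simplest special case only: a one-way layout with the classical Kruskal–Wallis H statistic, whose reference distribution is obtained by reshuffling the pooled observations. It is not the authors' procedure for general factorial designs; the function names, the omission of a tie correction in H, and the Monte Carlo p-value formula are assumptions made for illustration.

```python
import random


def midranks(values):
    """Ranks 1..n, with tied values receiving the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value: share of reshuffles with H >= observed."""
    rng = random.Random(seed)
    observed = kruskal_h(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if kruskal_h(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because only ranks enter the statistic, the same code applies unchanged to continuous and ordinal scores; the reshuffling step is justified when observations are exchangeable under the null hypothesis.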

Authors:Joe W. Tidwell; Michael R. Dougherty, Jeffrey S. Chrabaszcz, Rick P. Thomas Abstract: Despite the fact that data and theories in the social, behavioural, and health sciences are often represented on an ordinal scale, there has been relatively little emphasis on modelling ordinal properties. The most common analytic framework used in psychological science is the general linear model, whose variants include ANOVA, MANOVA, and ordinary linear regression. While these methods are designed to provide the best fit to the metric properties of the data, they are not designed to maximally model ordinal properties. In this paper, we develop an order-constrained linear least-squares (OCLO) optimization algorithm that maximizes the linear least-squares fit to the data conditional on maximizing the ordinal fit based on Kendall's τ. The algorithm builds on the maximum rank correlation estimator (Han, 1987, Journal of Econometrics, 35, 303) and the general monotone model (Dougherty & Thomas, 2012, Psychological Review, 119, 321). Analyses of simulated data indicate that when modelling data that adhere to the assumptions of ordinary least squares, OCLO shows minimal bias, little increase in variance, and almost no loss in out-of-sample predictive accuracy. In contrast, under conditions in which data include a small number of extreme scores (fat-tailed distributions), OCLO shows less bias and variance, and substantially better out-of-sample predictive accuracy, even when the outliers are removed. We show that the advantages of OCLO over ordinary least squares in predicting new observations hold across a variety of scenarios in which researchers must decide to retain or eliminate extreme scores when fitting data. PubDate: 2017-02-27T02:20:29.642191-05: DOI: 10.1111/bmsp.12090
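
The ordinal fit criterion named in this abstract, Kendall's τ, can be sketched as follows. This shows only the τ-a statistic that measures ordinal agreement between two variables; it is not the OCLO algorithm itself, and the function name is an assumption.

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Since τ depends only on the ordering of the values, any strictly increasing transformation of either variable (for example cubing the outcome, or stretching a single extreme score) leaves it unchanged, whereas a least-squares fit would change.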

Authors:Siwei Liu Abstract: This paper compares the multilevel modelling (MLM) approach and the person-specific (PS) modelling approach in examining autoregressive (AR) relations with intensive longitudinal data. Two simulation studies are conducted to examine the influences of sample heterogeneity, time series length, sample size, and distribution of individual level AR coefficients on the accuracy of AR estimates, both at the population level and at the individual level. It is found that MLM generally outperforms the PS approach under two conditions: when the sample has a homogeneous AR pattern, namely, when all individuals in the sample are characterized by AR processes with the same order; and when the sample has heterogeneous AR patterns, but a multilevel model with a sufficiently high order (i.e., an order equal to or higher than the maximum order of individual AR patterns in the sample) is fitted and successfully converges. If a lower-order multilevel model is chosen for heterogeneous samples, the higher-order lagged effects are misrepresented, resulting in bias at the population level and larger prediction errors at the individual level. In these cases, the PS approach is preferable, given sufficient measurement occasions (T ≥ 50). In addition, sample size and distribution of individual level AR coefficients do not have a large impact on the results. Implications of these findings on model selection and research design are discussed. PubDate: 2017-02-22T08:45:50.43108-05:0 DOI: 10.1111/bmsp.12096
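
As a rough illustration of the person-specific approach in its simplest form, the sketch below fits a first-order autoregressive coefficient to one individual's series by ordinary least squares. The AR(1) restriction, the OLS estimator, the standard-normal innovations, and the function names are assumptions for illustration; the paper considers higher orders and a multilevel alternative that are not shown here.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard-normal e_t, starting at 0."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, 1.0))
    return series


def fit_ar1(series):
    """OLS slope from regressing y_t on y_{t-1} (with an intercept)."""
    lagged, current = series[:-1], series[1:]
    mean_lag = sum(lagged) / len(lagged)
    mean_cur = sum(current) / len(current)
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    return sxy / sxx
```

Running fit_ar1 separately on each person's series gives one coefficient per individual; with short series the estimates are noisy, which is consistent with the abstract's recommendation of sufficient measurement occasions for the PS approach.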

Authors:Pasquale Anselmi; Luca Stefanutti, Debora Chiusole, Egidio Robusto Abstract: The gain–loss model (GaLoM) is a formal model for assessing knowledge and learning. In its original formulation, the GaLoM assumes independence among the skills. Such an assumption is not reasonable in several domains, in which some preliminary knowledge is the foundation for other knowledge. This paper presents an extension of the GaLoM to the case in which the skills are not independent, and the dependence relation among them is described by a well-graded competence space. The probability of mastering skill s at the pretest is conditional on the presence of all skills on which s depends. The probabilities of gaining or losing skill s when moving from pretest to posttest are conditional on the mastery of s at the pretest, and on the presence at the posttest of all skills on which s depends. Two formulations of the model are presented, in which the learning path is allowed to change from pretest to posttest or not. A simulation study shows that models based on the true competence space obtain a better fit than models based on false competence spaces, and are also characterized by a higher assessment accuracy. An empirical application shows that models based on pedagogically sound assumptions about the dependencies among the skills obtain a better fit than models assuming independence among the skills. PubDate: 2017-02-17T04:05:40.202974-05: DOI: 10.1111/bmsp.12095

Authors:María Rubio-Aparicio; Julio Sánchez-Meca, José Antonio López-López, Juan Botella, Fulgencio Marín-Martínez Abstract: Subgroup analyses allow us to examine the influence of a categorical moderator on the effect size in meta-analysis. We conducted a simulation study using a dichotomous moderator, and compared the impact of pooled versus separate estimates of the residual between-studies variance on the statistical performance of the QB(P) and QB(S) tests for subgroup analyses assuming a mixed-effects model. Our results suggested that similar performance can be expected as long as there are at least 20 studies and these are approximately balanced across categories. Conversely, when subgroups were unbalanced, the practical consequences of having heterogeneous residual between-studies variances were more evident, with both tests leading to the wrong statistical conclusion more often than in the conditions with balanced subgroups. A pooled estimate should be preferred for most scenarios, unless the residual between-studies variances are clearly different and there are enough studies in each category to obtain precise separate estimates. PubDate: 2017-02-06T03:35:27.313476-05: DOI: 10.1111/bmsp.12092

Authors:Paul De Boeck; Haiqin Chen, Mark Davison Abstract: Based on data from a cognitive test presented in a condition with time constraints per item and a condition without time constraints, the effect of speed on accuracy is investigated. First, if the effect of imposed speed on accuracy is negative it can be explained by the speed–accuracy trade-off, and if it can be captured through the corresponding latent variables, then measurement invariance applies between a condition with and a condition without time constraints. The results do show a negative effect and a lack of measurement invariance. Second, the conditional accuracy function (CAF) is investigated in both conditions, with and without time constraints. The CAF shows an (item-dependent) negative conditional dependence between response time and response accuracy and thus a positive relationship between speed and accuracy, which implies that faster responses are more accurate. In sum, there seem to be two kinds of speed effects: a speed–accuracy trade-off effect induced by imposed speed and an opposite CAF effect associated with speed within conditions. The second effect is interpreted as stemming from a within-person variation of the cognitive capacity during the test which simultaneously favours or disfavours speed and accuracy. PubDate: 2017-02-03T07:13:33.4059-05:00 DOI: 10.1111/bmsp.12094

Authors:Jochen Ranger; Jörg-Tobias Kuhn, Carsten Szardenings Abstract: Cognitive psychometric models embed cognitive process models into a latent trait framework in order to allow for individual differences. Due to their close relationship to the response process, the models allow for profound conclusions about the test takers. However, before such a model can be used, its fit has to be checked carefully. In this manuscript we give an overview of existing tests of model fit and show their relation to the generalized moment test of Newey (Econometrica, 53, 1985, 1047) and Tauchen (Journal of Econometrics, 30, 1985, 415). We also present a new test, the Hausman test of misspecification (Hausman, Econometrica, 46, 1978, 1251). The Hausman test consists of a comparison of two estimates of the same item parameters, which should be similar if the model holds. The performance of the Hausman test is evaluated in a simulation study. In this study we illustrate its application to two popular models in cognitive psychometrics, the Q-diffusion model and the D-diffusion model (van der Maas, Molenaar, Maris, Kievit, & Borsboom, Psychological Review, 118, 2011, 339; Molenaar, Tuerlinckx, & van der Maas, Journal of Statistical Software, 66, 2015, 1). We also compare the performance of the test to four alternative tests of model fit, namely the M2 test (Molenaar et al., Journal of Statistical Software, 66, 2015, 1), the moment test (Ranger et al., British Journal of Mathematical and Statistical Psychology, 2016) and the test for binned time (Ranger & Kuhn, Psychological Test and Assessment Modeling, 56, 2014b, 370). The simulation study indicates that the Hausman test is superior to the latter tests. The test closely adheres to the nominal Type I error rate and has higher power in most simulation conditions. PubDate: 2017-02-03T07:07:07.130454-05: DOI: 10.1111/bmsp.12082

Authors:Dylan Molenaar; Maria Bolsinova Abstract: In generalized linear modelling of responses and response times, the observed response time variables are commonly transformed to make their distribution approximately normal. A normal distribution for the transformed response times is desirable as it justifies the linearity and homoscedasticity assumptions in the underlying linear model. Past research has, however, shown that the transformed response times are not always normal. Models have been developed to accommodate this violation. In the present study, we propose a modelling approach for responses and response times to test and model non-normality in the transformed response times. Most importantly, we distinguish between non-normality due to heteroscedastic residual variances, and non-normality due to a skewed speed factor. In a simulation study, we establish parameter recovery and the power to separate both effects. In addition, we apply the model to a real data set. PubDate: 2017-02-03T06:55:41.161916-05: DOI: 10.1111/bmsp.12087

Authors:Oscar L. Olvera Astivia; Bruno D. Zumbo Abstract: The purpose of this paper is to highlight the importance of a population model in guiding the design and interpretation of simulation studies used to investigate the Spearman rank correlation. The Spearman rank correlation has been known for over a hundred years to applied researchers and methodologists alike and is one of the most widely used non-parametric statistics. Still, certain misconceptions can be found, either explicitly or implicitly, in the published literature because a population definition for this statistic is rarely discussed within the social and behavioural sciences. By relying on copula distribution theory, a population model is presented for the Spearman rank correlation, and its properties are explored both theoretically and in a simulation study. Through the use of the Iman–Conover algorithm (which allows the user to specify the rank correlation as a population parameter), simulation studies from previously published articles are explored, and it is found that many of the conclusions purported in them regarding the nature of the Spearman correlation would change if the data-generation mechanism better matched the simulation design. More specifically, issues such as small sample bias and lack of power of the t-test and r-to-z Fisher transformation disappear when the rank correlation is calculated from data sampled where the rank correlation is the population parameter. A proof for the consistency of the sample estimate of the rank correlation is shown as well as the flexibility of the copula model to encompass results previously published in the mathematical literature. PubDate: 2017-01-31T08:38:08.923003-05: DOI: 10.1111/bmsp.12085
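
The distinction between the population rank correlation and the Pearson correlation used to generate data can be sketched under one specific assumption: a Gaussian copula (bivariate normal data), for which the population Spearman correlation is (6/π)·arcsin(ρ/2). The abstract does not say which copulas the paper uses, so the Gaussian choice, the function names, and the simulation set-up below are illustrative assumptions; the Iman–Conover algorithm is not implemented here.

```python
import math
import random


def population_spearman_gaussian(rho):
    """Population Spearman correlation of a bivariate normal with Pearson rho."""
    return 6.0 / math.pi * math.asin(rho / 2.0)


def ranks(values):
    """Ranks 1..n; assumes continuous data, so ties are not handled."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def sample_spearman(x, y):
    """Sample Spearman correlation via the sum of squared rank differences."""
    rank_x, rank_y = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def draw_bivariate_normal(rho, n, seed=0):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return xs, ys
```

With ρ = 0.5, for example, the population rank correlation under this model is about 0.483 rather than 0.5, so a simulation that treats ρ as the target for the Spearman estimate would report an apparent bias that comes from the design rather than from the estimator.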

Authors:Frank Goldhammer; Merle A. Steinwascher, Ulf Kroehne, Johannes Naumann Pages: 238 - 256 Abstract: Completing test items under multiple speed conditions avoids the performance measure being confounded with individual differences in the speed–accuracy compromise, and offers insights into the response process, that is, how response time relates to the probability of a correct response. This relation is traditionally represented by two conceptually different functions: the speed-accuracy trade-off function (SATF) across conditions relating the condition average response time to the condition average of accuracy, and the conditional accuracy function (CAF) within a condition describing accuracy conditional on response time. Using a generalized linear mixed modelling approach, we propose an item response modelling framework that is suitable for item response and response time data from experimental speed conditions. The proposed SATF and CAF model accommodates response time effects between conditions (i.e., person and item SATF slope) and within conditions (i.e., residual CAF slopes), captures person and item differences in these effects, and is suitable for measures with a strong speed component. Moreover, for a single condition a CAF model is proposed distinguishing person, item and residual CAF. The properties of the models are illustrated with an empirical example. PubDate: 2017-05-05T06:25:12.835182-05: DOI: 10.1111/bmsp.12099

Authors:Ingmar Visser; Rens Poessé Pages: 280 - 296 Abstract: The linear ballistic accumulator (LBA) model (Brown & Heathcote, Cogn. Psychol., 57, 153) is increasingly popular in modelling response times from experimental data. An R package, glba, has been developed to fit the LBA model using maximum likelihood estimation, and it is validated by means of a parameter recovery study. At sufficient sample sizes parameter recovery is good, whereas at smaller sample sizes there can be large bias in parameters. In a second simulation study, two methods for computing parameter standard errors are compared. The Hessian-based method is found to be adequate and is (much) faster than the alternative bootstrap method. The use of parameter standard errors in model selection and inference is illustrated in an example using data from an implicit learning experiment (Visser et al., Mem. Cogn., 35, 1502). It is shown that typical implicit learning effects are captured by different parameters of the LBA model. PubDate: 2017-05-05T06:25:12.436464-05: DOI: 10.1111/bmsp.12100
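
For readers unfamiliar with the LBA model, the following sketch simulates a single trial under the commonly used parameterization: a uniform start point on [0, A], normally distributed drift rates, a common threshold b, and a non-decision time t0. It is written in Python for illustration only and does not reproduce the glba package or its interface; all parameter and function names are assumptions.

```python
import random


def simulate_lba_trial(drift_means, b, A, s, t0, rng):
    """Simulate one LBA trial; returns (index of winning accumulator, response time).

    Assumes b > A so every accumulator starts below the threshold. Trials in
    which no accumulator has a positive drift are redrawn.
    """
    while True:
        finish_times = []
        for mean_drift in drift_means:
            start = rng.uniform(0.0, A)        # start point of the accumulator
            drift = rng.gauss(mean_drift, s)   # trial-specific drift rate
            if drift > 0:
                finish_times.append((b - start) / drift)
            else:
                finish_times.append(float("inf"))  # never reaches the threshold
        fastest = min(finish_times)
        if fastest != float("inf"):
            return finish_times.index(fastest), t0 + fastest
```

Repeated calls with a shared random.Random instance give a sample of choices and response times, which is the kind of simulated data a parameter recovery study would then refit.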

Authors:Peter W. Rijn; Usama S. Ali Pages: 317 - 345 Abstract: We compare three modelling frameworks for accuracy and speed of item responses in the context of adaptive testing. The first framework is based on modelling scores that result from a scoring rule that incorporates both accuracy and speed. The second framework is the hierarchical modelling approach developed by van der Linden (2007, Psychometrika, 72, 287) in which a regular item response model is specified for accuracy and a log-normal model for speed. The third framework is the diffusion framework in which the response is assumed to be the result of a Wiener process. Although the three frameworks differ in the relation between accuracy and speed, one commonality is that the marginal model for accuracy can be simplified to the two-parameter logistic model. We discuss both conditional and marginal estimation of model parameters. Models from all three frameworks were fitted to data from a mathematics and spelling test. Furthermore, we applied a linear and adaptive testing mode to the data off-line in order to determine differences between modelling frameworks. It was found that a model from the scoring rule framework outperformed a hierarchical model in terms of model-based reliability, but the results were mixed with respect to correlations with external measures. PubDate: 2017-05-05T06:25:11.209892-05: DOI: 10.1111/bmsp.12101
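
The two-parameter logistic model that the abstract identifies as the common marginal model for accuracy can be written down in a few lines. The parameter names (theta for ability, a for discrimination, b for difficulty) follow common usage and are assumptions here, not notation taken from the paper.

```python
import math


def p_correct_2pl(theta, a, b):
    """Two-parameter logistic model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

When ability equals item difficulty the probability is 0.5, and a larger discrimination a makes the probability change more steeply around that point.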

Authors:Hyeon-Ah Kang Abstract: The Cox proportional hazards model with a latent trait variable (Ranger & Ortner, 2012, Br. J. Math. Stat. Psychol., 65, 334) has shown promise in accounting for the dependency of response times from the same examinee. The model allows flexibility in shapes of response time distributions using the non-parametric baseline hazard rate while allowing parametric inference about the latent variable via exponential regression. The flexibility of the model, however, comes at the price of a significant increase in the complexity of estimating the model. The purpose of this study is to propose a new estimation approach to overcome this difficulty in model estimation. The new procedure is based on the penalized partial likelihood estimator in which the partial likelihood is maximized in the presence of a penalty function. The potential of the proposed method is corroborated by a series of simulation studies for fitting the proportional hazards latent trait model to psychological and educational testing data. The application of the estimation method to the hierarchical framework (van der Linden, 2007, Psychometrika, 72, 287) is also illustrated for jointly analysing response times and accuracy scores. PubDate: 2016-12-13T06:32:21.481205-05: DOI: 10.1111/bmsp.12080

Authors:Maria Bolsinova; Jesper Tijmstra, Dylan Molenaar Abstract: It is becoming more feasible and common to register response times in the application of psychometric tests. Researchers thus have the opportunity to jointly model response accuracy and response time, which provides users with more relevant information. The most common choice is to use the hierarchical model (van der Linden, 2007, Psychometrika, 72, 287), which assumes conditional independence between response time and accuracy, given a person's speed and ability. However, this assumption may be violated in practice if, for example, persons vary their speed or differ in their response strategies, leading to conditional dependence between response time and accuracy and confounding measurement. We propose six nested hierarchical models for response time and accuracy that allow for conditional dependence, and discuss their relationship to existing models. Unlike existing approaches, the proposed hierarchical models allow for various forms of conditional dependence in the model and allow the effect of continuous residual response time on response accuracy to be item‐specific, person‐specific, or both. Estimation procedures for the models are proposed, as well as two information criteria that can be used for model selection. Parameter recovery and usefulness of the information criteria are investigated using simulation, indicating that the procedure works well and is likely to select the appropriate model. Two empirical applications are discussed to illustrate the different types of conditional dependence that may occur in practice and how these can be captured using the proposed hierarchical models. PubDate: 2016-09-12T09:45:56.433563-05: DOI: 10.1111/bmsp.12076