Abstract: Scale-invariance is an open problem in many computer vision subfields. For example, object labels should remain constant across scales, yet model predictions diverge in many cases. This problem gets harder for tasks where the ground-truth labels change with the presentation scale. In image quality assessment (IQA), down-sampling attenuates impairments, e.g., blurs or compression artifacts, which can positively affect the impression evoked in subjective studies. To accurately predict perceptual image quality, cross-resolution IQA methods must therefore account for resolution-dependent discrepancies induced by model inadequacies as well as for the perceptual label shifts in the ground truth. We present the first study of its kind that disentangles and examines the two issues separately via KonX, a novel, carefully crafted cross-resolution IQA database. This paper contributes the following: 1. Through KonX, we provide empirical evidence of label shifts caused by changes in the presentation resolution. 2. We show that objective IQA methods have a scale bias, which reduces their predictive performance. 3. We propose a multi-scale and multi-column deep neural network architecture that improves performance over previous state-of-the-art IQA models for this task. We thus both raise and address a novel research problem in image quality assessment. PubDate: 2023-08-18
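The multi-scale, multi-column architecture is only described at a high level in the abstract. As a rough illustration of the general idea (not the authors' actual KonX model), the hypothetical PyTorch sketch below feeds the same image at two presentation resolutions into separate backbone columns and fuses the pooled features into a single quality score; the backbone choice and layer sizes are assumptions.

```python
# Hypothetical sketch of a two-column, two-scale IQA network (not the KonX model).
import torch
import torch.nn as nn
import torchvision.models as models

class TwoColumnIQA(nn.Module):
    def __init__(self):
        super().__init__()
        # One backbone column per presentation scale; weights are not shared.
        self.col_lowres = models.resnet18(weights=None)
        self.col_highres = models.resnet18(weights=None)
        self.col_lowres.fc = nn.Identity()
        self.col_highres.fc = nn.Identity()
        # Fuse pooled features from both columns and regress a quality score.
        self.head = nn.Sequential(nn.Linear(512 * 2, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, img_low, img_high):
        f_low = self.col_lowres(img_low)      # features from the low-resolution view
        f_high = self.col_highres(img_high)   # features from the high-resolution view
        return self.head(torch.cat([f_low, f_high], dim=1))

model = TwoColumnIQA()
score = model(torch.rand(1, 3, 512, 384), torch.rand(1, 3, 1024, 768))
```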
Abstract: Research efforts have previously explored various components of physical/virtual workspaces that adaptively interact with knowledge workers in order to support them in their work. In this paper, we propose an encompassing framework for these efforts, which we refer to as Human-Workspace Interaction (HWI), with the goal of increasing awareness and understanding of the research area and encouraging its further development. Specifically, we present a taxonomy of HWI focusing on the types of components, research approaches, interaction targets and objectives, and then review the prior research efforts over the past two decades based on these criteria. Finally, we discuss challenges to further advance the development of HWI and future prospects, taking into account the impact of the societal changes caused by the COVID-19 pandemic. PubDate: 2023-08-18
Abstract: The concept of conducting ecologically valid user studies is gaining traction in the field of Quality of Experience (QoE). However, despite previous research exploring this concept, the increasing volume of studies has made it challenging to obtain a comprehensive overview of existing guidelines and the key aspects to consider when designing ecologically valid studies. Therefore, this paper aims to provide a systematic review of research articles published between 2011 and 2021 that offer insight into conducting ecologically valid user studies. From an initial count of 782 retrieved studies, a final count of 12 studies met the predefined criteria and were included in the final review. The systematic review resulted in the extraction of 55 guidelines that provide guidance towards conducting ecologically valid user studies. These guidelines have been grouped within 8 categories (Environment, Technology, Content, Participant Recruitment, User Behavior, Study Design, Task, and Data Collection) overarching the three main dimensions (Setting, Users and Research Methodology). Furthermore, the review discusses the flip side of ecological validity and the implications for QoE research, and provides a basic visualisation model for assessing the ecological validity of a study. In conclusion, the current review indicates that future research should address in more detail how and when research approaches characterized by high ecological validity (and correspondingly, low internal validity) and those characterized by low ecological validity (and normally high internal validity) can best complement each other in order to better understand the key factors influencing QoE for various types of applications, user segments, and settings. Further, we argue that more transparency around the (sub)dimensions of ecological validity with respect to a particular study or set of studies is necessary. PubDate: 2023-07-05 DOI: 10.1007/s41233-023-00059-2
Abstract: Providing sophisticated web Quality of Experience (QoE) has become paramount for web service providers and network operators alike. Due to advances in web technologies (HTML5, responsive design, etc.), traditional web QoE models focusing mainly on loading times have to be refined and improved. In this work, we relate Google's Core Web Vitals, a set of metrics for improving user experience, to the loading time aspects of web QoE, and investigate whether the Core Web Vitals and web QoE agree on the perceived experience. To this end, we first perform objective measurements on the web using Google's Lighthouse. To close the gap between metrics and experience, we complement these objective measurements with subjective assessment by performing multiple crowdsourcing QoE studies. For this purpose, we developed CWeQS, a publicly available customized framework that emulates the entire web page loading process and asks users for their experience while controlling the Core Web Vitals. To properly configure CWeQS for the planned QoE study and the crowdsourcing setup, we conduct pre-studies, in which we evaluate the importance of the loading strategy of a web page and the importance of the user task. The obtained insights allow us to conduct the desired QoE studies for each of the Core Web Vitals. Furthermore, we assess the impact of cookie consent banners, which have become ubiquitous due to regulatory demands, on the Core Web Vitals and investigate their influence on web QoE. Our results suggest that the Core Web Vitals are much less predictive for web QoE than expected and that page loading times remain the main metric and influence factor in this context. We further observe that unobtrusive and acentric cookie consent banners are preferred by end-users and that additional delays caused by interacting with consent banners in order to agree to or reject cookies should be accounted for along with the actual page load time to reduce waiting times and thus to improve web QoE. PubDate: 2023-06-30 DOI: 10.1007/s41233-023-00058-3
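The abstract mentions objective measurements with Google's Lighthouse. As a minimal, hedged sketch of how such a measurement could be collected and parsed (not the authors' measurement pipeline), the snippet below assumes the Lighthouse CLI is installed (e.g., via npm) and reads two Core Web Vitals audits from its JSON report; the example URL and the exact set of metrics are assumptions.

```python
# Hypothetical sketch: run Lighthouse from Python and read two Core Web Vitals audits.
import json
import subprocess

url = "https://example.com"  # placeholder page
subprocess.run(
    ["lighthouse", url, "--output=json", "--output-path=report.json",
     "--chrome-flags=--headless", "--quiet"],
    check=True,
)

with open("report.json") as f:
    report = json.load(f)

lcp_ms = report["audits"]["largest-contentful-paint"]["numericValue"]
cls = report["audits"]["cumulative-layout-shift"]["numericValue"]
print(f"LCP: {lcp_ms:.0f} ms, CLS: {cls:.3f}")
```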
Abstract: Efficient objective and perceptual metrics are valuable tools to evaluate the visual impact of compression artifacts on the visual quality of volumetric videos (VVs). In this paper, we present some of the MPEG group's efforts to create, benchmark and calibrate objective quality assessment metrics for volumetric videos represented as textured meshes. We created a challenging dataset of 176 volumetric videos impaired with various distortions and conducted a subjective experiment to gather human opinions (more than 5896 subjective scores were collected). We adapted two state-of-the-art model-based metrics for point cloud evaluation to our context of textured mesh evaluation by selecting efficient sampling methods. We also present a new image-based metric for the evaluation of such VVs whose purpose is to reduce the cumbersome computation times inherent to the point-based metrics due to their use of multiple kd-tree searches. Each metric presented above is calibrated (i.e., selection of the best values for parameters such as the number of views or grid sampling density) and evaluated on our new ground-truth subjective dataset. For each metric, the optimal selection and combination of features is determined by logistic regression through cross-validation. This performance analysis, combined with MPEG experts' requirements, leads to the validation of two selected metrics and to recommendations on the features of most importance through learned feature weights. PubDate: 2023-06-06 DOI: 10.1007/s41233-023-00057-4
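Cross-validated learning of feature weights against subjective scores is a standard calibration step. The sketch below is a generic, hedged illustration of that procedure on placeholder arrays, using a simple linear model in place of the paper's logistic regression; the feature count, dataset size, and weights are all made up and are not the MPEG data.

```python
# Hypothetical sketch: cross-validated weighting of objective features against
# subjective scores. Placeholder data; a linear model stands in for the paper's
# logistic regression.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
X = rng.normal(size=(176, 4))          # e.g., one row of metric features per impaired video
mos = X @ np.array([0.8, 0.3, 0.0, 0.1]) + rng.normal(scale=0.2, size=176)

preds = np.zeros_like(mos)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train], mos[train])
    preds[test] = model.predict(X[test])

print("Cross-validated PLCC:", pearsonr(preds, mos)[0])
print("Learned feature weights (full fit):", LinearRegression().fit(X, mos).coef_)
```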
Abstract: As virtual reality (VR) technology has developed extensively in recent years, more and more people are using it in different fields. One of the fast-developing areas in VR is exergaming, a combination of physical exercise and a game. VR exergames that aim to engage people in physical activity should look and feel good for users regardless of their age, gender, or previous experience with similar VR technologies. However, recent studies have shown that these factors influence the user experience (UX) in virtual reality. Building on an initial study that reported on the effect of human influencing factors in exergaming, in this work we investigated the influence of user parameters (such as age, gender, and previous VR experience) on their motivation for sports and VR exergaming. The study was conducted using a crowdsourcing platform to recruit a diverse set of participants, with the aim of exploring how different user factors are connected to sports motivation. Results show significant differences in the users' sports motivation and affinity for technology interaction depending on the age group, gender, previous experience with VR, their weekly exercise routine, and how much money they spend on sports yearly. PubDate: 2023-05-04 DOI: 10.1007/s41233-023-00056-5
Abstract: In many research fields, human-annotated data plays an important role as it is used to accomplish a multitude of tasks. One such example is in the field of multimedia quality assessment, where subjective annotations can be used to train or evaluate quality prediction models. Lab-based tests could be one approach to obtain such quality annotations. They are usually performed in well-defined and controlled environments to ensure high reliability. However, this high reliability comes at the cost of higher time consumption and expense. To mitigate this, crowd or online tests could be used. Usually, online tests cover a wider range of end devices, environmental conditions, or participants, which may have an impact on the ratings. To verify whether such online tests can be used for visual quality assessment, we designed three online tests. These online tests are based on previously conducted lab tests, as this enables comparison of the results of both test paradigms. Our focus is on the quality assessment of high-resolution images and videos. The online tests use AVrate Voyager, which is a publicly accessible framework for online tests. To transform the lab tests into online tests, dedicated adaptations in the test methodologies are required. The considered modifications include, for example, patch-based or centre cropping of the images and videos, and random sub-sampling of the to-be-rated stimuli. Based on the analysis of the test results in terms of correlation and SOS analysis, it is shown that online tests can be used as a reliable replacement for lab tests, albeit with some limitations. These limitations relate to, e.g., the lack of appropriate display devices, limitations of web technologies, and modern browsers' support for different video codecs and formats. PubDate: 2023-04-13 DOI: 10.1007/s41233-023-00055-6
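The abstract refers to SOS analysis when comparing lab and online results. One common formulation is the SOS hypothesis, which relates the variance of opinion scores to the MOS on a 5-point scale via SOS²(MOS) = a(-MOS² + 6·MOS - 5). The sketch below fits the SOS parameter a to placeholder per-stimulus ratings; the ratings are made up and this is only an illustration of the analysis, not the actual test data or procedure.

```python
# Hypothetical sketch: fit the SOS parameter 'a' of the SOS hypothesis,
# SOS^2 = a * (-MOS^2 + 6*MOS - 5), to per-stimulus ratings on a 5-point scale.
import numpy as np

# Placeholder: ratings[i] holds the individual 1..5 scores for stimulus i.
ratings = [np.array([4, 5, 4, 3, 5]), np.array([2, 3, 2, 1, 2]), np.array([3, 3, 4, 4, 3])]

mos = np.array([r.mean() for r in ratings])
sos2 = np.array([r.var(ddof=1) for r in ratings])

# Least-squares estimate of 'a' (closed form, since the model is linear in 'a').
x = -mos**2 + 6 * mos - 5
a = np.sum(x * sos2) / np.sum(x**2)
print("Estimated SOS parameter a:", a)
```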
Abstract: Augmented reality (AR) is an emerging technology that has significant potential as a solution for novel procedure assistance in mass customisation. Procedure assistance is a series of steps or instructions required to aid a person in completing a task. The informational phase of a procedure is the period when a user is trying to understand instructions, in advance of implementing them. With AR as a potential solution to communicate these steps, it is important to understand the factors that influence user acceptability and experience. In this context, this paper reports the results of a Quality of Experience (QoE) evaluation of two approaches for informational phase assistance: AR procedure assistance and paper-based procedure assistance (control group). Each approach presented a procedure to solve a Rubik’s Cube® in the minimum number of steps. As part of the evaluation methodology of these procedure assistance methods, different metrics were captured. These included the user’s physiological ratings, facial expression features and self-reported measures in terms of affect, task load and QoE. The results show that AR-based assistance yielded significantly reduced procedure completion times and increased success rates compared to paper-based instructions. Several correlations were discovered between physiological and self-reported measures. For example, frustration and mental task load components were seen to correlate with both electrodermal and interbeat interval ratings. The findings from this work will stimulate experimentation and theoretical discussion on the use of physiological ratings and facial expressions as indicators of task load and QoE. PubDate: 2023-02-16 DOI: 10.1007/s41233-023-00054-7
Abstract: Despite the growing availability of data, simulation technologies, and predictive analytics, it is not yet clear whether and under which conditions users will trust Decision Support Systems (DSS). DSS are designed to support users in making more informed decisions in specialized tasks through more accurate predictions and recommendations. This mixed-methods user study contributes to the research on trust calibration by analyzing the potential effects of integrated reliability indication in DSS user interfaces for process management in first-time usage situations characterized by uncertainty. Ten experts specialized in digital tools for construction were asked to test and assess two versions of a DSS in a renovation project scenario. We found that while users stated that they need full access to all information to make their own decisions, reliability indication in DSS tends to make users more willing to make preliminary decisions, with users adapting their confidence and reliance to the indicated reliability. Reliability indication in DSS also increases subjective usefulness and system reliability. Based on these findings, it is recommended that, when designing reliability indication, practitioners consider displaying a combination of reliability information at several granularity levels in DSS user interfaces, including visualizations such as a traffic light system, and that they also provide explanations for the reliability information. Further research directions towards achieving trustworthy decision support in complex environments are proposed. PubDate: 2022-09-05 DOI: 10.1007/s41233-022-00053-0
Abstract: Virtual reality (VR) applications, especially those where the user is untethered from a computer, are becoming more prevalent as new hardware is developed, computational power and artificial intelligence algorithms become more widely available, and wireless communication networks become faster and more reliable. In fact, recent projections show that by 2022 the number of VR users will double, suggesting the sector was not negatively affected by the worldwide COVID-19 pandemic. The success of any immersive communication system is heavily dependent on the user experience it delivers, thus it has become more crucial than ever to develop reliable models of immersive media experience (IMEx). In this paper, we survey the literature for existing methods and tools to assess human influential factors (HIFs) related to IMEx. In particular, subjective, behavioural, and psycho-physiological methods are covered. We describe tools available to monitor these HIFs, including the user’s sense of presence and immersion, cybersickness, and mental/affective states, as well as their role in overall experience. Special focus is placed on psycho-physiological methods, as it was found that such in-depth evaluation was lacking from the existing literature. We conclude by touching on emerging applications involving multiple-sensorial immersive media and provide suggestions for future research directions to fill existing gaps. It is hoped that this survey will be useful for researchers interested in building new immersive (adaptive) applications that maximize user experience. PubDate: 2022-06-15 DOI: 10.1007/s41233-022-00052-1
Abstract: Modern immersive multisensory communication systems can provide compelling mediated social communication experiences that approach face-to-face communication. Existing methods to assess the quality of mediated social communication experiences are typically targeted at specific tasks or communication technologies. As a result, they do not address all relevant aspects of social presence (i.e., the feeling of being in the presence of, and having an affective and intellectual connection with, other persons). Also, they are typically unsuitable for application to social communication in virtual (VR), augmented (AR), or mixed (MR) reality. We propose a comprehensive, general, and holistic multi-scale (questionnaire-based) approach, based on an established conceptual framework for multisensory perception, to measure the quality of mediated social communication experiences. Our holistic approach to mediated social communication (H-MSC) assessment comprises both the experience of Spatial Presence (i.e., the perceived fidelity, internal and external plausibility, and cognitive, reasoning, and behavioral affordances of an environment) and the experience of Social Presence (i.e., perceived mutual proximity, intimacy, credibility, reasoning, and behavior of the communication partners). Since social presence is inherently bidirectional (involving a sense of mutual awareness), the multi-scale approach measures both the internal (‘own’) and external (‘the other’) assessment perspectives. We also suggest how an associated multi-scale questionnaire (the Holistic Mediated Social Communication Questionnaire or H-MSC-Q) could be formulated in an efficient and parsimonious way, using only a single item to tap into each of the relevant processing levels in the human brain: sensory, emotional, cognitive, reasoning, and behavioral. The H-MSC-Q can be sufficiently general to measure social presence experienced with any (including VR, AR, and MR) multi-sensory (visual, auditory, haptic, and olfactory) mediated communication system. Preliminary validation studies confirm the content and face validity of the H-MSC-Q. In this paper, we focus on the underlying concepts of the H-MSC-Q. We make the initial draft questionnaire available to the community for further review, development, and validation. We hope it may contribute to the unification of quality measures for mediated social communication. PubDate: 2022-06-14 DOI: 10.1007/s41233-022-00051-2
Abstract: Fuelled by the increase in popularity of virtual and augmented reality applications, point clouds have emerged as a popular 3D format for the acquisition and rendering of digital humans, thanks to their versatility and real-time capabilities. Due to technological constraints and real-time rendering limitations, however, the visual quality of dynamic point cloud contents is seldom evaluated using virtual and augmented reality devices; evaluations instead rely on prerecorded videos displayed on conventional 2D screens. In this study, we evaluate how the visual quality of point clouds representing digital humans is affected by compression distortions. In particular, we compare three different viewing conditions based on the degrees of freedom that are granted to the viewer: passive viewing (2DTV), head rotation (3DoF), and rotation and translation (6DoF), to understand how interacting in the virtual space affects the perception of quality. We provide both quantitative and qualitative results of our evaluation involving 78 participants, and we make the data publicly available. To the best of our knowledge, this is the first study evaluating the quality of dynamic point clouds in virtual reality and comparing it to traditional viewing settings. Results highlight the dependency of visual quality on the content under test, and limitations in the way current data sets are used to evaluate compression solutions. Moreover, influencing factors in quality evaluation in VR, and shortcomings in how point cloud encoding solutions handle visually-lossless compression, are discussed. PubDate: 2022-05-07 DOI: 10.1007/s41233-022-00050-3
Abstract: The current pandemic situation has led to an extraordinary increase in remote working activities all over the world. In this paper, we conducted a research study with the aim of investigating the Quality of Remote Working Experience (QRWE) of workers when conducting remote working activities and of analysing its correlation with implicit emotion responses estimated from the speech of video calls or discussions with people in the same room. We implemented a system that captures the audio when the worker is talking and extracts and stores several speech features. A subjective assessment was conducted using this tool, involving 12 people who were asked to provide feedback on the QRWE and assess their sentiment polarity during their daily remote working hours. ANOVA results suggest that speech features may potentially be used to infer the QRWE and the sentiment polarity of the speaker. Indeed, we have also found that the perceived QRWE and polarity are strongly related. PubDate: 2022-03-30 DOI: 10.1007/s41233-022-00049-w
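The specific speech features are not listed in the abstract. As a generic, hedged illustration of how one captured feature could be tested against self-reported polarity groups, the sketch below computes a single prosodic feature (mean fundamental frequency via librosa's pYIN) and runs a one-way ANOVA with SciPy on placeholder groups; the feature choice, file names, and grouping are all assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: extract one speech feature and test it across polarity groups.
import librosa
import numpy as np
from scipy.stats import f_oneway

def mean_f0(wav_path):
    y, sr = librosa.load(wav_path, sr=16000)
    f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C7"), sr=sr)
    return np.nanmean(f0)  # mean fundamental frequency over voiced frames

# Placeholder file lists grouped by self-reported sentiment polarity.
groups = {
    "negative": ["neg_01.wav", "neg_02.wav"],
    "neutral": ["neu_01.wav", "neu_02.wav"],
    "positive": ["pos_01.wav", "pos_02.wav"],
}
features = {label: [mean_f0(p) for p in paths] for label, paths in groups.items()}

stat, p = f_oneway(features["negative"], features["neutral"], features["positive"])
print(f"One-way ANOVA: F={stat:.2f}, p={p:.3f}")
```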
Abstract: Within the worldwide diving community, underwater photography is becoming increasingly popular. However, the marine environment presents certain challenges for image capture, with resulting imagery often suffering from colour distortions, low contrast and blurring. As a result, image enhancement software is used not only to enhance the imagery aesthetically, but also to address these degradations. Although feature-rich image enhancement software products are available, little is known about the user experience of underwater photographers when interacting with such tools. To address this gap, we conducted an online questionnaire to better understand what software tools are being used, and face-to-face interviews to investigate the characteristics of the image enhancement user experience for underwater photographers. We analysed the interview transcripts using the pragmatic and hedonic categories from the frameworks of Hassenzahl (Funology, Kluwer Academic Publishers, Dordrecht, pp 31–42, 2003; Funology 2, Springer, pp 301–313, 2018) for positive and negative user experience. Our results reveal a moderately negative experience overall for both pragmatic and hedonic categories. We draw some insights from the findings and make recommendations for improving the user experience for underwater photographers using image enhancement tools. PubDate: 2021-12-22 DOI: 10.1007/s41233-021-00048-3
Abstract: Human experiences have been studied in multiple disciplines, Human–Computer Interaction (HCI) being one of the largest research fields with its user experience (UX) research. Currently, there is little interaction between experience researchers from different disciplines, although cross-disciplinary knowledge sharing has the potential to accelerate the development of UX and other experience research fields to the next level. This article reports a research profiling study of almost 52,000 experience publications over 125 years, showing the breadth of experience research across disciplines. The data analysis reveals the disciplines that study experiences, the prominent authors, institutions and countries in experience research, the most cited works by experience researchers across disciplines, and how UX research is situated on the map of experience research. This descriptive research profiling study is a necessary first step on the journey of mapping the landscape of experience research, guiding researchers towards understanding experience as a multidisciplinary concept, and establishing a more coherent experience research field. PubDate: 2021-09-17 DOI: 10.1007/s41233-021-00047-4
Abstract: The uptake of chatbots for customer service depends on the user experience. For such chatbots, user experience in particular concerns whether the user is provided relevant answers to their queries and whether the chatbot interaction brings them closer to resolving their problem. Dialogue data from interactions between users and chatbots represents a potentially valuable source of insight into user experience. However, there is a need for knowledge of how to make use of these data. Motivated by this, we present a framework for qualitative analysis of chatbot dialogues in the customer service domain. The framework has been developed across several studies involving two chatbots for customer service, in collaboration with the chatbot hosts. We present the framework and illustrate its application with insights from three case examples. Through the case findings, we show how the framework may provide insight into key drivers of user experience, including response relevance and dialogue helpfulness (Case 1), insight to drive chatbot improvement in practice (Case 2), and insight of theoretical and practical relevance for understanding chatbot user types and interaction patterns (Case 3). On the basis of the findings, we discuss the strengths and limitations of the framework, its theoretical and practical implications, and directions for future work. PubDate: 2021-08-20 DOI: 10.1007/s41233-021-00046-5
Abstract: The development of immersive technologies has brought with it the need to redefine the concept of quality of experience (QoE). Studies have explored QoE in virtual reality (VR) by adopting a top-down approach—these are solely based on existing frameworks and theory, and complemented with novel technical considerations. It can be argued that any QoE framework derived in this manner is limited, as its scope is fixed even prior to any data gathering process. To this end, the current study proposes a bottom-up approach, involving the user in the formulation of a broader QoE model. The repertory grid technique (RGT) was used to analyse and group 360 attributes, listed by participants as criteria they used in judging the quality of a VR experience. The advantage of RGT is that it combines a holistic approach to interpreting the user’s experience with the precision of quantitative analysis. The study resulted in a QoE model that consists of three main groups of attributes (i.e., user, content, and system). Furthermore, the analysis showed that participants listed attributes related to their experience and appraisal of VR, and to the content that they viewed. In contrast, very few system-related attributes were mentioned. Finally, the current study discusses the RGT methodology—and user-driven approaches in general—as a complementary research approach to create a comprehensive and practical QoE model. PubDate: 2021-04-03 DOI: 10.1007/s41233-021-00045-6
Abstract: Driving stress can impair driving performance, which in turn affects the overall driving experience. It is a vital area to focus on when the traffic scenario is challenging in terms of congestion, unruly drivers, and a lack of law enforcement. In Bangladesh, these issues are frequent on the roads. That is why we examined professional drivers' self-reported stress scores and personality profiles, and conducted mixed-method (quantitative and qualitative) user studies that provided a clear indication of driving stress. These findings motivated us to design and develop a low-cost, real-time stress-measurement wearable through human-centered computing, informed by users' feedback and experiences. This wearable unit can infer bodily stress from physiological factors using Heart Rate Variability, along with road conditions. This technology can help support drivers in increasing self-awareness regarding driving stress, which will have a positive impact on drivers' wellbeing and overall driving performance. PubDate: 2021-01-06 DOI: 10.1007/s41233-020-00043-0
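Heart Rate Variability from interbeat intervals is mentioned as the physiological basis. A common time-domain HRV index is RMSSD (root mean square of successive differences of the interbeat intervals); the sketch below computes it for a placeholder interval series. How the wearable actually derives and thresholds stress is not described in the abstract, so this is only an illustrative calculation.

```python
# Hypothetical sketch: compute RMSSD, a time-domain HRV index, from interbeat intervals.
import numpy as np

def rmssd(ibi_ms):
    """Root mean square of successive differences of interbeat intervals (ms)."""
    ibi_ms = np.asarray(ibi_ms, dtype=float)
    diffs = np.diff(ibi_ms)
    return np.sqrt(np.mean(diffs ** 2))

# Placeholder interbeat intervals in milliseconds (e.g., from a PPG/ECG sensor).
ibi = [812, 790, 805, 840, 825, 798, 810]
print(f"RMSSD: {rmssd(ibi):.1f} ms")  # lower RMSSD is often associated with higher stress
```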
Abstract: Due to biased assumptions about the underlying ordinal rating scale in subjective Quality of Experience (QoE) studies, Mean Opinion Score (MOS)-based evaluations provide results that are hard to interpret and can be misleading. This paper proposes to consider the full QoE distribution for evaluating, reporting, and modeling QoE results instead of relying on MOS-based metrics derived from results based on ordinal rating scales. The QoE distribution can be represented in a concise way by using the parameters of a multinomial distribution without losing any information about the underlying QoE ratings, and it even keeps backward compatibility with previous, biased MOS-based results. Considering QoE results as a realization of a multinomial distribution allows one to rely on a well-established theoretical background, which enables meaningful evaluations also for ordinal rating scales. Moreover, QoE models based on QoE distributions keep detailed information from the results of a QoE study of a technical system and thus give an unprecedented richness of insights into the end users’ experience with the technical system. In this work, existing and novel statistical methods for QoE distributions are summarized and exemplary evaluations are outlined. Furthermore, using the novel concept of quality steps, simulative and analytical QoE models based on QoE distributions are presented and showcased. The goal is to demonstrate the fundamental advantages of considering QoE distributions over MOS-based evaluations if the underlying rating data is ordinal in nature. PubDate: 2020-12-26 DOI: 10.1007/s41233-020-00044-z
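As an illustration of the core idea of keeping the full QoE distribution rather than only the MOS, the sketch below models 5-point ACR ratings as a multinomial distribution, retains the full category probabilities, and shows that the MOS remains recoverable from them (backward compatibility). The rating counts are placeholders, and the derived quantities are generic examples rather than the paper's evaluations.

```python
# Hypothetical sketch: represent 5-point ACR ratings as a multinomial distribution.
import numpy as np

# Placeholder rating counts for one test condition: categories 1 ("bad") .. 5 ("excellent").
counts = np.array([2, 5, 14, 20, 9])
n = counts.sum()

p = counts / n                      # multinomial parameters: the full QoE distribution
categories = np.arange(1, 6)
mos = float(categories @ p)         # MOS is still recoverable (backward compatible)
sos = float(np.sqrt(((categories - mos) ** 2) @ p))

# e.g., probability of "good or better" ratings, an insight the MOS alone cannot give
p_good_or_better = p[3:].sum()
print(f"p = {p.round(3)}, MOS = {mos:.2f}, SOS = {sos:.2f}, P(>=4) = {p_good_or_better:.2f}")
```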
Abstract: Subjective speech quality assessment has traditionally been carried out in laboratory environments under controlled conditions. With the advent of crowdsourcing platforms, tasks that need human intelligence can be resolved by crowd workers over the Internet. Crowdsourcing also offers a new paradigm for speech quality assessment, promising higher ecological validity of the quality judgments at the expense of potentially lower reliability. This paper compares laboratory-based and crowdsourcing-based speech quality assessments in terms of comparability of results and efficiency. For this purpose, three pairs of listening-only tests have been carried out using three different crowdsourcing platforms and following the ITU-T Recommendation P.808. In each test, listeners judged the overall quality of the speech samples following the Absolute Category Rating procedure. We compare the results of the crowdsourcing approach with the results of standard laboratory tests performed according to the ITU-T Recommendation P.800. Results show that in most cases, both paradigms lead to comparable results. Notable differences are discussed with respect to their sources, and conclusions are drawn that establish practical guidelines for crowdsourcing-based speech quality assessment. PubDate: 2020-11-22 DOI: 10.1007/s41233-020-00042-1
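Comparability between the two paradigms is typically quantified by correlating per-condition MOS values from the lab and crowdsourcing tests. The sketch below computes ACR MOS per condition and Pearson/Spearman correlations on placeholder ratings; it is a generic illustration of such a comparison, not the P.800/P.808 data or the authors' exact analysis.

```python
# Hypothetical sketch: compare per-condition MOS between a lab and a crowdsourcing test.
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Placeholder ACR ratings (1..5), one array of individual scores per test condition.
lab = [np.array([4, 5, 4, 4]), np.array([2, 3, 2, 2]), np.array([3, 4, 3, 3])]
crowd = [np.array([4, 4, 5, 3, 4]), np.array([2, 2, 3, 1, 2]), np.array([3, 3, 4, 4, 3])]

mos_lab = np.array([r.mean() for r in lab])
mos_crowd = np.array([r.mean() for r in crowd])

print("PLCC :", pearsonr(mos_lab, mos_crowd)[0])
print("SROCC:", spearmanr(mos_lab, mos_crowd)[0])
```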