
IBM Computer Usability Satisfaction Questionnaires: Psychometric Evaluation and Instructions for Use

Taylor & Francis
International Journal of Human-Computer Interaction
Authors:
  • MeasuringU

Abstract

This paper describes recent research in subjective usability measurement at IBM. The focus of the research was the application of psychometric methods to the development and evaluation of questionnaires that measure user satisfaction with system usability. The primary goals of this paper are to (1) discuss the psychometric characteristics of four IBM questionnaires that measure user satisfaction with computer system usability, and (2) provide the questionnaires, with administration and scoring instructions. Usability practitioners can use these questionnaires with confidence to help them measure users' satisfaction with the usability of computer systems.
... Unlike other similar initiatives in the current literature, this research did not focus solely on the design aspects of the game, but on the evaluation of the perceived usability by teachers and students. For this purpose, the IBM-CSUQ tool [68] was used. Furthermore, a survey developed by the authors, based on the research of Calvo-Morata. ...
... To identify the usability of FreeDev, a survey provided by the IBM-CSUQ tool has been used [68], with a 7-point Likert scale [70]. This questionnaire consists of 19 questions that evaluate the satisfaction that users experience when employing the developed application [68]. This questionnaire was developed with the purpose of obtaining data on the ease of use of the system (SYSUSE), the quality of the information provided (INFOQUAL), the quality of the interfaces (INTERQUAL), and an overall evaluation of the application and its ease of use (OVERALL). ...
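The scoring described in this excerpt can be sketched as follows. This is a minimal illustration, not the instrument's official scoring code: the item-to-subscale ranges (items 1-8, 9-15, 16-18, and 1-19) are an assumption taken from the commonly reported CSUQ scoring rules and should be checked against the paper's instructions, and the function name `score_csuq` is hypothetical. Unanswered items are passed as `None` and left out of the averages.

```python
from statistics import mean

# ASSUMPTION: 1-based, inclusive item ranges per subscale, as commonly
# reported for the 19-item CSUQ. Verify against the published instructions.
SUBSCALES = {
    "SYSUSE": range(1, 9),      # items 1-8
    "INFOQUAL": range(9, 16),   # items 9-15
    "INTERQUAL": range(16, 19), # items 16-18
    "OVERALL": range(1, 20),    # items 1-19
}


def score_csuq(responses):
    """Average the answered items of each subscale.

    responses: list of 19 ratings on a 7-point scale (1..7), with None
    marking an unanswered item. Returns a dict of subscale -> mean rating,
    or None for a subscale in which no item was answered.
    """
    if len(responses) != 19:
        raise ValueError("expected 19 item responses")
    for rating in responses:
        if rating is not None and not 1 <= rating <= 7:
            raise ValueError("ratings must lie between 1 and 7")

    scores = {}
    for name, items in SUBSCALES.items():
        # Convert 1-based item numbers to 0-based list indices.
        answered = [responses[i - 1] for i in items if responses[i - 1] is not None]
        scores[name] = mean(answered) if answered else None
    return scores
```

Calling `score_csuq` with one respondent's 19 ratings returns the four scale scores; averaging those across respondents gives the study-level values that excerpts like the one above report.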
Article
Today, teaching faces several challenges, including students’ difficulty in understanding abstract concepts and lack of motivation. To address these problems, the use of virtual reality (VR) has been explored as an innovative and potentially effective educational tool. However, so far, the effectiveness of VR applications and the perception of their use lack a clear and effective approach to be used to support education. The importance of addressing this problem lies in the need to improve the quality of teaching using emerging technologies. For this reason, it is important to find new strategies to improve the effectiveness of teaching using VR. In this context, this research presents the results of the FreeDev application, previously validated with 20 teachers and with 80 engineering students from a private university. FreeDev is a VR application designed to support the teaching of basic programming; it is intended as an educational tool that gives students an immersive experience of how to get started in programming and computational thinking. FreeDev has been well accepted, and both teachers and engineering students see it as a tool that can be used to support education. It is hoped that this research will contribute to the advancement of knowledge in the field of education.
... For both PortionSize and MFP conditions, after recording the simulated lunch meal during the respective study visit, participants completed a User Satisfaction Survey (USS), and the Computer System Usability Questionnaire (CSUQ) [19][20][21] using Research Electronic Data Capture (REDCap) [8,22,23]. REDCap is a secure, web-based application designed to support data capture for research studies, providing the following: 1) an intuitive interface with validated data collection instruments (e.g., the CSUQ) that have been precoded in the REDCap data dictionary formats [24]; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for importing data from external sources [8]. ...
... The USS assessed satisfaction, ease of use, and adequacy of training on how to use each respective application. The CSUQ is a standardized, reliable (coefficient α > 0.89), and validated (criterion-related validity, r = 0.80) questionnaire for adult populations; it was originally designed to evaluate computer programs and has been used to quantify the usability of mobile phone applications [19][20][21]. The CSUQ includes 19 questions, scored using a 7-point Likert scale (1 being the most favorable score), and participants rated overall satisfaction, usefulness, information quality, and interface quality for each respective application. ...
... Race or ethnicity was also self-reported by the participants from a list including non-Hispanic White, non-Hispanic Black, Hispanic, Asian or Pacific Islander, Native American (including Alaskan), biracial or multiracial (specify), or other (specify). Participants completed an 8-item survey that was adapted from prior studies 19,20 to obtain overall satisfaction with PortionSize, satisfaction with embedded food templates and app training, and ease of use. All items were rated on a scale ranging from 1 to 6, with 1 indicating extremely dissatisfied or not at all and 6 indicating extremely satisfied or very much. ...
... Researchers have been using validated questionnaires that help one understand how users perceived the authoring tool's usability and satisfaction. Questionnaires like the System Usability Scale (SUS) [22] and After-Scenario Questionnaire (ASQ) [23] are the ones that are mostly used to assess such variables. The SUS questionnaire helps researchers understand the user-perceived usability of the application, and the ASQ helps to understand the user's satisfaction level when performing tasks using the authoring tool. ...
Article
Virtual reality (VR) for training helps minimize risks and costs by allowing more frequent and varied use of experiential training experiences, leading to active and improved learning. However, creating VR training experiences is costly and time-consuming, requiring software development experts. Additionally, current authoring tools are desktop-oriented, which detaches the process of creating the immersive experience from experiencing it in a situated context. This paper presents the development of an immersive authoring tool designed to create immersive virtual environments that can be used to train operatives. The authoring tool can record and replay animations of each action the user performed that can later be used to instruct other users how the task should be performed. Participants were divided into two groups, and the proposed authoring tool was evaluated using usability, satisfaction, presence and cybersickness. Between groups, Independent T-tests revealed that there were no significant differences between expert and non-expert groups in any of the studied variables. Also, the results showed that the authoring tool had high usability and satisfaction, average presence, and low probability of cybersickness symptoms.
... Accordingly, the rainfall episode used in this study was based on the 16th of August high-impact event experienced in 2020 at Terrassa. Although short in duration (2 hours (Lewis, 1995; Tan et al., 2020b; Tullis & Albert, 2013). For the first section of the A4alerts evaluation, the criteria used in the ANYWHERE-H2020 project (Gebhardt et al., 2019) were applied to assess the value and benefits of EWS platforms from an end-user perspective. ...
Thesis
The present thesis focuses on developing, implementing, and evaluating a comprehensive community and impact-based early warning system (EWS) framework to support the protection actions of authorities and citizens at specific vulnerable locations during flood emergencies. For this purpose, this research delves into various aspects of people-centred impact-based early warning systems, such as meteorological monitoring and forecasting, hazard assessment, impact analysis, warning dissemination tools, mobile crisis apps and the crucial role of community engagement. By integrating these components, the “Site-Specific EWS” (SS-EWS) framework aims to provide locally relevant emergency information to empower individuals and communities to take appropriate self-protection actions based on their response capabilities and local impacts. A significant portion of the thesis is devoted to evaluating the framework, its components and the mobile app developed for warning dissemination (A4alerts) by employing a wide range of strategies. From analyzing the impact-based warnings triggered vs the impacts observed, the usefulness of the A4alerts and its functionalities, the content of the warning messages and their influence on risk perception, these strategies aimed to provide an all-encompassing understanding of the local effectiveness and utility of the SS-EWS. Furthermore, to identify the difficulties, success areas and future improvements, this research presents the social and technical experiences from implementing the community and impact-based driven SS-EWS framework within the selected case-study areas in Catalonia, Spain. The findings of this thesis contribute to the existing knowledge on impact-based EWS by providing innovative methodologies and practical recommendations for their development and implementation within a community context. 
Ultimately, the work presented in this study and its outcomes can pave the way for expanding the SS-EWS and impact-based driven strategies to vulnerable communities needing guidance and support to enhance their response capacity to mitigate the impacts caused by flood emergencies.
... In this sense, one of the most common ways for users to measure the usability of a system is through standardized questionnaires, most notably the System Usability Scale [3], developed by John Brooke, and the Computer System Usability Questionnaire (CSUQ) [10], by James Lewis, both from the mid-1990s, which recently accounted for 43% and 15%, respectively, of post-use software usability evaluations [8]. Also notable is the Post-Study System Usability Questionnaire (PSSUQ) [9], since it is the origin of the CSUQ. Its main difference is that it is worded to evaluate software in the past tense, which makes it of interest for an evaluation that only tests an application, and it can also be applied flexibly to score a variety of applications, such as in health [18], finance [14], and education [2]. ...
Conference Paper
This work presents the results of usability tests of the Antropindicadores application, which aims to support a project of collection and analysis of anthropic indicators in the Brazilian Amazon Region. Usability tests based on the PSSUQ method are used for obtaining end-user opinions, as well as heuristic evaluation for finding usability problems. In general, the obtained data showed good results in the usability tests, obtaining an average grade of 1.19 in the heuristics and 1.27 in the PSSUQ, which can be classified as an "A+" grade by the Curved Grading Scale method.
... Finally, there are some cases in which GPT, probably because of RLHF alignment, refuses to answer questions, making results useless to the user. Without access to the individuals who asked the questions, there is no way to know with certainty what the optimal answer for them would be, and we are therefore using scoring criteria that can be reasonably shared by most users and that are commonly used to evaluate the usability of information systems (Brooke et al., 1996; Lewis, 1995). Tackling these weaknesses can be a large avenue for future work to achieve AI agents better equipped to get closer to the types of answers that humans seek. ...
Preprint
Humans have an innate drive to seek out causality. Whether fuelled by curiosity or specific goals, we constantly question why things happen, how they are interconnected, and many other related phenomena. To develop AI agents capable of addressing this natural human quest for causality, we urgently need a comprehensive dataset of natural causal questions. Unfortunately, existing datasets either contain only artificially-crafted questions that do not reflect real AI usage scenarios or have limited coverage of questions from specific sources. To address this gap, we present CausalQuest, a dataset of 13,500 naturally occurring questions sourced from social networks, search engines, and AI assistants. We formalize the definition of causal questions and establish a taxonomy for finer-grained classification. Through a combined effort of human annotators and large language models (LLMs), we carefully label the dataset. We find that 42% of the questions humans ask are indeed causal, with the majority seeking to understand the causes behind given effects. Using this dataset, we train efficient classifiers (up to 2.85B parameters) for the binary task of identifying causal questions, achieving high performance with F1 scores of up to 0.877. We conclude with a rich set of future research directions that can build upon our data and models.
Article
Background: Literature showed that learners’ perceived usability and perspective toward a technology application affected their learning experience. Fewer studies have investigated immersive virtual reality (IVR) simulation learning of fundamental nursing skills learning (FNSL). Purpose: The aim of the study was to explore the perceived usability of IVR simulations for FNSL among first-year nursing students and their perspectives toward this learning modality. Methods: This study used a mixed-methods design with an educational intervention. Sixty-five first-year nursing students participated in 2 IVR simulation procedures in complementary mode. Surveys and focus groups were conducted in the postintervention period. Results: The findings demonstrated students’ positive inclinations toward IVR simulation learning. Two areas emerged: using IVR simulation as a complementary modality for FNSL and barriers affecting students’ perceived usability toward this technology. Conclusions: With addressing the concerns from students’ perceived usability, immersive virtual reality simulation could be a potential complementary modality for FNSL.
Article
Usability evaluators used an 18-item, post-study questionnaire in three related usability tests. I conducted an exploratory factor analysis to investigate statistical justification to combine items into subscales. The factor analysis indicated that three factors accounted for 87 percent of the total variance. Coefficient alpha analyses showed that the reliability of the overall summative scale was .97, and ranged from .91 to .96 for the three subscales. In the sensitivity analyses, the overall scale and all three subscales detected significant differences among the user groups; and one subscale indicated a significant system effect. Correlation analyses support the validity of the scales. The overall scale correlated highly with the sum of the After-Scenario Questionnaire ratings that participants gave after each scenario. The overall scale also correlated moderately with the percentage of successful scenario completion. These results are consistent with the hypothesis that these alternative measurements tap into a common underlying construct. This construct is probably usability, based on the content of the questionnaire items and the measurement context.
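The coefficient alpha values reported in this abstract come from the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. A minimal sketch of that computation follows; the function name `cronbach_alpha` and the respondents-by-items input layout are assumptions made for illustration, and the sketch expects complete data with no missing ratings.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items table of ratings.

    rows: list of lists, one inner list per respondent, each holding the
    same number k of item ratings (no missing values).
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer all items")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Items that rise and fall together across respondents push the total-score variance well above the sum of the item variances, which is what drives alpha toward 1.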
Article
This study is a part of a research effort to develop the Questionnaire for User Interface Satisfaction (QUIS). Participants, 150 PC user group members, rated familiar software products. Two pairs of software categories were compared: 1) software that was liked and disliked, and 2) a standard command line system (CLS) and a menu driven application (MDA). The reliability of the questionnaire was high, Cronbach’s alpha=.94. The overall reaction ratings yielded significantly higher ratings for liked software and MDA over disliked software and a CLS, respectively. Frequent and sophisticated PC users rated MDA more satisfying, powerful and flexible than CLS. Future applications of the QUIS on computers are discussed.
Article
The Subjective Workload Assessment Technique (SWAT) has been under development for approximately five years. This measure is under a systematic development program to define its strengths and weaknesses. Both laboratory research and field applications are being employed in this evaluation and some of the findings are presented. Current research on refinements to the procedure is discussed.
Chapter
This chapter discusses the conduct of research to guide the development of more useful and usable computer systems. Experimental research in human-computer interaction involves varying the design or deployment of systems, observing the consequences, and inferring from observations what to do differently. For such research to be effective, it must be owned—instituted, trusted and heeded—by those who control the development of new systems. Thus, managers, marketers, systems engineers, project leaders, and designers as well as human factors specialists are important participants in behavioral human-computer interaction research. This chapter is intended as much for those with backgrounds in computer science, engineering, or management as for human factors researchers and cognitive systems designers. It is argued in this chapter that the special goals and difficulties of human-computer interaction research make it different from most psychological research as well as from traditional computer engineering research. The main goal, the improvement of complex, interacting human-computer systems, requires behavioral research but is not sufficiently served by the standard tools of experimental psychology such as factorial controlled experiments on pre-planned variables. The chapter contains about equal quantities of criticism of inappropriate general research methods, description of valuable methods, and prescription of specific useful techniques.
Article
Determining the number of common factors is one of the most important decisions which must be made in the application of factor analysis. Several different approaches and techniques are reviewed here along with associated strengths and weaknesses. It is argued that a combination of approaches will lead to the best judgment regarding the number of factors to retain. A computer program is available which presents the number of factors to retain as suggested by both discontinuity and parallel analyses. Utilization of the program removes the negative aspect associated with the use of each technique.