How Many Users Are Enough for a Card-Sorting Study?

June 2004

Conference: Usability Professionals Association (UPA) 2004 Conference
At: Minneapolis, MN

Tom Tullis - Fidelity Investments (tom.tullis@fmr.com)

Larry Wood - Brigham Young University (WoodL@byu.edu)

ABSTRACT

A study was conducted to assess the minimum number of participants needed for a card-sorting study. Similarity matrices and tree structures from various sample sizes were compared to those based on a set of 168 participants. Results indicate that reasonable structures are obtained from 20-30 participants.

Introduction

Card-sorting, either online or with actual cards, has become a very popular technique for helping to organize the elements of an information system in a way that makes sense to users of that system. It has become a standard tool in the toolbox for most usability professionals and information architects. Card-sorting has been used for designing mainframe menu systems (Tullis, 1985) and, more recently, for designing web sites (e.g., Frederickson-Mele, 1997; Tullis, 2003). A variety of computer-based tools are now available for conducting card-sorting exercises online and/or analyzing the data from manual or online card-sorting studies (e.g., EZSort, WebCAT, WebSort, Socratic CardSort, Classified, CardZort). However, one question that has not been answered is how many users need to do the card-sorting exercise to get an accurate picture of how the information should be organized. The purpose of this study was to answer that question.

The Card-sorting Study

The study on which these analyses are based was conducted online at Fidelity Investments. The purpose of the card-sorting study was to determine how to organize the information for a redesign of the Intranet web site of our usability department. A total of 46 “cards” were used in the study, many of which represented services offered internally by the department, such as “Prototyping”, “Usability Testing”, “Wireless Design”, “Focus Groups”, and (somewhat circularly) “Card-sorting”. Some cards also represented general information about the department, such as “Who We Are”, “Where We Are”, and “Tour of the Usability Lab”.

The card-sorting study was conducted online on our company’s Intranet using the WebCAT tool. Users were presented with a list of the 46 “cards” in a random order. They then dragged a representation of each card into a region for each category that they wanted to create. Categories were not pre-defined; each user created and named their own categories. They could create as many or as few categories as desired. The exercise was complete when they had put every card into a category.

Employees of our company worldwide were invited to participate in the card-sorting study via an announcement in a daily message that is sent to all employees. The incentive to participate was entry in a drawing for a $50 gift check. A total of 172 employees participated in the card-sorting study. Four participants had to be dropped due to incomplete data, resulting in 168 complete card-sorts. For each participant, a file was created reflecting the cards that person grouped together and the names given to those groups. Each of these files can be converted to a similarity matrix showing all pair-wise combinations of cards, in which a pair that was grouped together received a similarity of “1” and a pair not grouped together received a similarity of “0”. Summing all of these individual similarity matrices resulted in an overall similarity matrix with entries ranging from 0 (if no one grouped that pair together) to 168 (if everyone grouped that pair together).
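
The matrix construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (each participant's sort as a list of groups of card indices), the function names, and the three example participants are hypothetical and are not taken from WebCAT or WebSort.

```python
from itertools import combinations

def similarity_matrix(sort, n_cards):
    """Build one participant's 0/1 similarity matrix.

    `sort` is a list of groups; each group is a list of card indices
    (a hypothetical layout, not the WebCAT file format).
    """
    matrix = [[0] * n_cards for _ in range(n_cards)]
    for group in sort:
        # Every pair of cards placed in the same group gets similarity 1.
        for a, b in combinations(group, 2):
            matrix[a][b] = 1
            matrix[b][a] = 1
    return matrix

def overall_similarity(sorts, n_cards):
    """Sum the individual matrices; entries range from 0 to len(sorts)."""
    total = [[0] * n_cards for _ in range(n_cards)]
    for sort in sorts:
        single = similarity_matrix(sort, n_cards)
        for i in range(n_cards):
            for j in range(n_cards):
                total[i][j] += single[i][j]
    return total

# Three made-up participants sorting four cards (indices 0 to 3).
example_sorts = [
    [[0, 1], [2, 3]],
    [[0, 1, 2], [3]],
    [[0], [1], [2, 3]],
]
example_total = overall_similarity(example_sorts, 4)
```

With 168 complete sorts in place of the three made-up ones, each off-diagonal entry of the summed matrix would fall between 0 and 168, as described in the text.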

Data Analysis

These data were then analyzed using a modified version of WebSort to look at random sub-samples of different sizes from the full dataset of 168 participants.

Usability Professionals Association (UPA) 2004 Conference: 

Minneapolis, Minnesota, June 7-11, 2004

The similarity matrix referred to above is the basis on which a statistical cluster analysis (Romesburg, 1984) is performed, the result of which effectively "averages" the categorizations accumulated across a set of participants. The resulting cluster analysis is then displayed as a hierarchical tree structure (known formally as a dendrogram) on which the organization of content and menu structures can be based.
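
As a rough illustration of how a tree can be derived from a summed similarity matrix, the sketch below performs a simple agglomerative clustering. The paper does not say which linkage rule was used, so average linkage is an assumption here, and the function name, the nested-tuple tree representation, and the four-card similarity counts are all hypothetical.

```python
def average_linkage_tree(similarity, labels):
    """Merge clusters pairwise until one nested-tuple tree remains.

    At each step the two clusters with the highest average pairwise
    similarity are joined (average linkage, assumed for illustration).
    """
    # Each cluster is (subtree, list of member indices).
    clusters = [(labels[i], [i]) for i in range(len(labels))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, members_b = clusters[a][1], clusters[b][1]
                total = sum(similarity[i][j] for i in members_a for j in members_b)
                score = total / (len(members_a) * len(members_b))
                # Strict comparison: ties keep the first pair found.
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]

# Made-up similarity counts for four cards (not data from the study).
example_labels = [
    "Remote usability testing",
    "Portable usability testing",
    "Who we are",
    "Where we are",
]
example_similarity = [
    [0, 9, 1, 0],
    [9, 0, 2, 1],
    [1, 2, 0, 8],
    [0, 1, 8, 0],
]
example_tree = average_linkage_tree(example_similarity, example_labels)
```

In this toy case the two most similar pairs are joined first and then merged at the root, which mirrors the way two-card base clusters form the lowest level of a tree.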

The major goal of our research was to assess the degree of similarity of an organizational tree structure based on a sample of participants to a structure based on the full set of 168 participants in order to estimate the minimum number of participants needed to produce an effective organization. As a means to that end, correlation coefficients were calculated between the similarity matrices on which the cluster analysis was based. The assumption is that the more similar the trees, the higher should be the correlation between the similarity matrices on which they are based. Correlation coefficients between the sample similarity matrices and the full similarity matrix were calculated for 10 samples each of sizes 2, 5, 8, 12, 15, 20, 30, 40, 50, 60, and 70 participants. A graph of the resulting mean correlation coefficients is shown in Figure 1.
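
The sub-sampling procedure described above might look roughly like the following sketch. Several details are assumptions rather than facts from the paper: sampling without replacement, the use of a Pearson correlation over the unique card pairs, every function name, and in particular the synthetic `noisy_sort` data, which merely stands in for the 168 real card sorts.

```python
import random
from itertools import combinations
from math import sqrt

def pair_counts(sorts, n_cards):
    """Flatten the summed similarity matrix to one count per card pair."""
    counts = {pair: 0 for pair in combinations(range(n_cards), 2)}
    for sort in sorts:
        for group in sort:
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return [counts[pair] for pair in sorted(counts)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant matrix counts as uncorrelated
    return cov / sqrt(var_x * var_y)

def mean_sample_correlation(sorts, n_cards, sample_size, n_samples, rng):
    """Average correlation between random sub-samples and the full set."""
    full = pair_counts(sorts, n_cards)
    values = []
    for _ in range(n_samples):
        sample = rng.sample(sorts, sample_size)  # without replacement (assumed)
        values.append(pearson(pair_counts(sample, n_cards), full))
    return sum(values) / len(values)

def noisy_sort(rng, n_cards=12, n_groups=4, noise=0.3):
    """Synthetic participant: a fixed grouping with random misplacements."""
    groups = [[] for _ in range(n_groups)]
    for card in range(n_cards):
        target = rng.randrange(n_groups) if rng.random() < noise else card % n_groups
        groups[target].append(card)
    return [group for group in groups if group]

if __name__ == "__main__":
    rng = random.Random(2004)
    all_sorts = [noisy_sort(rng) for _ in range(168)]  # synthetic, not study data
    for size in (2, 5, 8, 12, 15, 20, 30, 40, 50, 60, 70):
        mean_r = mean_sample_correlation(all_sorts, 12, size, 10, rng)
        print(f"N = {size:2d}: mean r = {mean_r:.3f}")
```

Because the data are synthetic, the printed values say nothing about the study's results; the sketch only shows the mechanics of drawing 10 samples per size and averaging their correlations with the full matrix.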

Figure 1. Correlation coefficients for various sample sizes, with error bars. (Plot of Average Correlation, axis range 0.5 to 0.95, against Sample Size, axis range 0 to 70.)

As shown in the graph, the average correlation is a negatively accelerated function of sample size: it increases steeply at the smaller sample sizes, and as the sample size increases beyond 20-30 there is little further increase in the correlation coefficient. Also note that the variance of the values, as indicated by the error bars, is much greater for the smaller samples.

An important question is how the function shown in Figure 1 relates to the similarity of the actual tree structures as a function of sample size. One possible implication is that the structures derived from sample sizes above 30 are very similar to that derived from the full set of participants, while those based on smaller sample sizes differ increasingly as the sample size decreases. To the extent that this is true, it would have implications for determining the minimum number of users needed to obtain valid information.

To illustrate the kinds of differences in structure that occur with trees based on various sample sizes, trees are shown in Figures 3-6 of the Appendix for sample sizes of 40, 20, 10, and 5, respectively. The tree based on all 168 participants is shown in Figure 2 of the Appendix. The highlighted items in Figure 2 are the basic two-card clusters that are formed during a cluster analysis, based on cards that are most similar. A tree is then constructed by either combining these base clusters or adding individual cards to an existing base cluster iteratively until the tree is complete. As shown in Figure 2, there are 17 base clusters derived from the analysis using all 168 participants.

In Figures 3-6, one of the base clusters from the tree based on all 168 participants (card 18-Remote usability testing and card 21-Portable usability testing) is highlighted to indicate how the trees based on various sample sizes might differ from that based on all 168 participants. As shown in the figures, the cluster is still intact in the tree for N=40, whereas in those for N=20 through N=5, those two cards are separated by a greater distance as the sample size becomes smaller. Of course, this is not true of every tree for a given sample size, but it does illustrate the trend.

To provide a more general idea of how tree structures differ as a function of sample size, five trees were generated from each sample size. For all 17 base clusters, the mean separation of the two cards in each base cluster was then calculated across the five samples. The results are listed in Table 1 for each base cluster as a function of sample size. Cluster separation was measured by counting the number of nodes separating the two cards in each base cluster. For example, referring to Figure 4, card 18 and card 21 are only separated by one node, whereas in Figure 6, they are separated by six nodes. A node was defined as the intersection of two branches. Thus, the number of nodes separating a pair consisted of the number of intersections that had to be crossed going up the tree from each pair until a common intersection was found.
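
One way to read the separation measure defined above is sketched below. It is an interpretation, not the authors' code: trees are represented as nested tuples, the function names are hypothetical, and the count is taken to be the number of intersections strictly between each card and the pair's first common intersection, so that an intact two-card cluster has a separation of 0. The example trees are made up and are not the trees in the Appendix.

```python
def leaf_path(tree, leaf, prefix=()):
    """Return the branch choices from the root down to `leaf`, or None."""
    if not isinstance(tree, tuple):
        return prefix if tree == leaf else None
    for index, child in enumerate(tree):
        found = leaf_path(child, leaf, prefix + (index,))
        if found is not None:
            return found
    return None

def node_separation(tree, card_a, card_b):
    """Count intersections crossed from each card up to their common one.

    The common intersection itself is not counted (an assumption chosen
    so that two cards joined directly have separation 0).
    """
    if card_a == card_b:
        raise ValueError("the two cards must be different")
    path_a = leaf_path(tree, card_a)
    path_b = leaf_path(tree, card_b)
    if path_a is None or path_b is None:
        raise ValueError("both cards must be leaves of the tree")
    shared = 0
    for step_a, step_b in zip(path_a, path_b):
        if step_a != step_b:
            break
        shared += 1
    return (len(path_a) - shared - 1) + (len(path_b) - shared - 1)

# Made-up trees; leaves are card numbers.
intact = ((18, 21), 10)                    # cards 18 and 21 joined directly
one_node = ((18, 10), 21)                  # one intersection in between
deeper = (((18, 10), 11), (21, 12))        # three intersections in between

sep_intact = node_separation(intact, 18, 21)
sep_one = node_separation(one_node, 18, 21)
sep_deeper = node_separation(deeper, 18, 21)
```

Averaging this count over the five trees generated at a given sample size would give one cell of Table 1 under this reading of the definition.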

Table 1. Mean separation of base clusters as a function of sample size. Base clusters are from the analysis of all 168 participants; means are derived from five samples of each sample size.

Base cluster (card pair)                                      N = 5   N = 10  N = 15  N = 20  N = 40
42 Design for touch screens / 43 Design for voice based       1.6     0.4     0.8     0.8     0.6
44 Design for elderly users / 46 Design for blind users       2.8     0.4     1.2     1.4     0.8
29 Web Design Guide / 40 Top 10 Web Design Mistakes           1.0     1.0     0       0       0
10 Prototyping / 11 Card sorting                              2.4     2.0     3       3.2     0.6
18 Remote usability testing / 21 Portable usability testing   1.8     0.4     0.4     0.6     0.2
04 Usability checklist / 41 Usability cycle                   1.4     1.8     0.2     2.2     0.6
24 Web usability seminary / 34 Sign up for usability studies  2.4     1.6     2.2     0       0.4
22 On-line help and documentation / 23 Documentation samples  1.6     1.4     1.6     0       0
27 Usable Bits newsletter / 28 Usable Bits archive            0.8     1.4     0       0.4     0
25 Study of the month / 26 Study of the month results         0.6     0       0       0       0
12 User surveys / 13 Focus groups                             1.8     1.2     2.0     2.4     0.8
14 Expert reviews / 36 Case studies                           1.8     2.4     3.8     2.8     1.4
31 Eye-tracker research / 38 HID research                     8       1.8     8.2     2.4     5.2
06 HID news / 07 HID events                                   2.0     0.8     0.6     1.4     0.4
08 Who we are / 45 Where we are                               1.2     0.8     0.6     0.6     0.4
01 HID mission / 20 Tour of HID lab                           1.0     0.6     0.4     0.2     0.4
03 Site feedback / 37 Customer Testimonials                   2.2     4.4     2.2     0.8     1.4

Mean separation across all base pairs                         2.0     1.3     1.5     1.1     0.75
Mean % base pairs separated                                   69%     50%     45%     48%     35%

Conclusions

A general conclusion that can be drawn on the basis of this research is that it may not be cost effective to spend resources to gather information from more than 20-30 participants in a card-sorting study. However, it is important to note that even the trees based on the smallest sample sizes are probably closer to the one for all 168 participants than might be obtained from speculation by a designer who is not a potential user of the content or application for which the organization is being developed. As always, we must exercise appropriate caution in generalizing results from one study. Results will obviously differ as a function of the homogeneity of the participants in a sample and such things as the instructions given to the participants for the card-sorting task.

References

Frederickson-Mele, K. (1997) Usability Testing an Intranet Prototype Shell: A Case Study. CHI’97 Workshop on Usability Testing World-wide Web Sites. Retrieved 1/30/2004 from http://www.acm.org/sigchi/web/chi97testing/mele.htm.

Romesburg, H. C. (1984) Cluster analysis for researchers. Belmont, Calif.: Lifetime Learning Publications.

Tullis, T. S. (1985) Designing a Menu-based Interface to an Operating System. Proceedings of CHI'85 Conference on Human Factors in Computing Systems, San Francisco, CA, April 1985.

Tullis, T. S. (2003) Using Card-sorting Techniques to Organize Your Intranet. Intranet Journal of Strategy and Management, March 2003.

EZSort: http://www-3.ibm.com/ibm/easy/eou_ext.nsf/Publish/410

WebCAT: http://zing.ncsl.nist.gov/WebTools/WebCAT/overview.html

WebSort: http://www.websort.net/

Socratic CardSort: http://www.sotech.com/main/eval.asp?pID=123

Classified: http://www.infodesign.com.au/usabilityresources/classified/

CardZort: http://condor.depaul.edu/~jtoro/cardzort/index.htm

Appendix - Trees from various sample sizes

Figure 2. Tree based on all 168 participants with base clusters highlighted.

Figure 3 - A sample tree from sample size N=40

Figure 4 - A sample tree from sample size N=20

Figure 5 - A sample tree from sample size N=10

Figure 6 - A sample tree from sample size N=5
