How many participants do you need for your user test?

Sample size is a tricky issue in user testing. The number of participants you need depends on what you are testing for and on how crucial it is to uncover every usability issue.

Often we have neither the resources nor the need for advanced statistics, so true validity can be hard to assess, and the number of participants you need can vary a lot. Here are some things to consider when deciding on the sample size for your user test. And remember: one participant is better than none!

The famous five participants

You might have heard Jakob Nielsen’s statement that five participants will discover 85% of the usability problems. This may hold in some cases, but in others five participants are not enough. The statistics behind this statement have an 85% confidence level and an error margin of ±18.5%. This means that there is an 85% chance that a group of five participants will uncover between 66.5% and 100% of the problems. In Laura Faulkner’s study, some groups of five found nearly all usability problems, but one group discovered only 55%. So keep this in mind!
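The five-participant figure comes from a simple discovery model described in the Nielsen article listed under Sources: if each participant finds any given problem with probability L, then n participants are expected to find 1 - (1 - L)^n of all problems. Below is a minimal Python sketch of that model. The function name and the loop are illustrative, and the default of 0.31 is the average Nielsen reports across his projects, so treat it as an assumption that may not match your product.

```python
def expected_discovery(n_participants: int, p_detect: float = 0.31) -> float:
    """Expected share of usability problems found by n participants.

    Assumes every participant finds every problem with the same
    probability p_detect. The default 0.31 is Nielsen's reported
    average and is only an assumption for your own product.
    """
    if n_participants < 0:
        raise ValueError("n_participants must be zero or more")
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be between 0 and 1")
    return 1.0 - (1.0 - p_detect) ** n_participants


# Print the expected discovery share for a few group sizes.
for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} participants: {expected_discovery(n):.0%}")
```

With the default value the model gives roughly 84% for five participants. It is an average over an idealised model, which is why real groups of five can land well below or above it.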

Problem discovery

When planning your test, consider the average percentage of problems you hope to find, together with the minimum percentage you would accept. Or turn the question around: how many problems will you discover with a given number of participants? This sounds complicated, but just look at the table below from a study by Laura Faulkner. It gives a good overview of the relationship between the number of participants and the share of problems found.

In the table you can see that going from 5 to 10 participants greatly increases the expected level of problem discovery, while going from 15 to 20 participants has far less impact. Consider how much undiscovered problems would affect your end users: you can justify fewer participants if undiscovered problems would have a small impact, and you should recruit more if they would have a large impact.
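If you would rather start from the share of problems you want to find, the same kind of reasoning can be turned around. The sketch below is not Faulkner's empirical table. It uses the simple discovery model from Nielsen's article, 1 - (1 - L)^n, and solves it for n. The function name and the detection probability of 0.31 are assumptions for illustration only.

```python
import math


def participants_needed(target_share: float, p_detect: float = 0.31) -> int:
    """Smallest group size whose expected discovery reaches target_share.

    Solves 1 - (1 - p_detect) ** n >= target_share for n.
    p_detect = 0.31 is an assumed average, not a measured value.
    """
    if not 0.0 < target_share < 1.0:
        raise ValueError("target_share must be strictly between 0 and 1")
    if not 0.0 < p_detect < 1.0:
        raise ValueError("p_detect must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target_share) / math.log(1.0 - p_detect))


# Higher targets cost disproportionately more participants.
for target in (0.80, 0.90, 0.95, 0.99):
    print(f"target {target:.0%}: {participants_needed(target)} participants")
```

Under these assumptions an 80% target needs 5 participants and a 95% target needs 9, which mirrors the diminishing returns described above. If your problems are harder to spot (a lower p_detect), the required numbers grow quickly.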

Studies by Virzi and by Perfetti & Landesman indicate that the appropriate number of participants is between three and twenty, depending on the complexity of the product, with five to ten participants as a good baseline. Nielsen Norman Group, however, argues that quantitative studies, where you aim for statistics rather than insights, should include at least 20 participants to get statistically significant numbers (and even more if you want tight confidence intervals).
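To see why tight confidence intervals call for more participants, it helps to look at how the margin of error of an average (for example task time) shrinks with sample size. The sketch below uses the normal approximation, margin = z × standard deviation / √n, which is an assumption on my part. With small samples a t-distribution would give somewhat wider intervals. The function name and the 30-second standard deviation are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(std_dev: float, n_participants: int,
                    confidence: float = 0.95) -> float:
    """Approximate half-width of a confidence interval for a mean.

    Uses the normal approximation, which understates the margin
    for small samples.
    """
    if n_participants < 1:
        raise ValueError("n_participants must be at least 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * std_dev / math.sqrt(n_participants)


# Hypothetical example: task times with a standard deviation of 30 seconds.
for n in (5, 15, 20, 40, 80):
    print(f"{n:2d} participants: +/- {margin_of_error(30.0, n):.1f} s")
```

Because the margin shrinks with the square root of n, halving it takes four times as many participants.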

So how many participants do you need?

A good guideline is to treat five participants as a starting point for getting qualitative insight into your product. When you want to look at usability and UX metrics and perhaps calculate basic statistics, consider having 15 participants or more. Keep in mind that if you are conducting a comparison study, e.g. A/B testing, you should aim for 15 participants per version.

Hope this helped you a bit in the never-ending discussion of how many test participants are enough. And as said before: one participant is better than none. Happy testing!

Sources

Faulkner, Laura. “Beyond the Five-User Assumption: Benefits of Increased Sample Sizes in Usability Testing.” Behavior Research Methods, Instruments, and Computers, Vol. 35, No. 3, 2003.

Landauer, Thomas K. “Research Methods in Human-Computer Interaction.” In Handbook of Human-Computer Interaction, ed. Martin G. Helander. Amsterdam: North Holland, 1988.

Nielsen, Jakob. “Why You Only Need to Test with 5 Users.” Nielsen Norman Group, March 19, 2000. Retrieved September 2, 2020.

Perfetti, Christine, and Lori Landesman. “Eight Is Not Enough.” UIE, June 18, 2001. Retrieved September 2, 2020.

Virzi, R. A. “Refining the Test Phase of Usability Evaluation: How Many Subjects Is Enough?” Human Factors, Vol. 34, 1992.