I provide sample size formulae and tables for the design of studies that compare two or more coefficients of inter-observer agreement or concordance. Such studies may arise, for example, when interest centres on how inter-observer agreement varies across patient subgroups or treatment centres. I consider both continuous and dichotomous outcome measures. Three examples illustrate the results.