Station 4: The Pass Mark

What is standard setting?

"In its most essential form, standard setting refers to the process of establishing one or more cut scores on a test"1. The General Medical Council, in Tomorrow's Doctors, states:
"Medical schools must have appropriate methods for setting standards in assessments to decide whether students have achieved the outcomes for graduates." 2

The concept of standard setting may appear straightforward, but it is not: the cut score obtained can vary depending on the method used. No single standard setting method fits all assessments; often one method is particularly appropriate for a given assessment. Whichever method is chosen, it must be:

  • Defensible - it will satisfy the stakeholders that it is a valid method
  • Explicable - there is a rationale behind the decisions made
  • Stable - the cut scores are stable over time; otherwise its defensibility comes into question.

Standard setting methods

There are two kinds of standard: relative and absolute. The GMC, in Assessment in Undergraduate Medical Education (2009), states:

"Schools should use approaches to standard setting which ensure concordance with absolute standards."3

Relative standards (norm referenced methods) are based on a comparison among the performances of candidates. An example is an assessment in which a set proportion of candidates fails regardless of how well they performed. A good group of students would therefore be disadvantaged if a set proportion of candidates must fail, while a poor group of students would be advantaged.

Absolute standards (criterion referenced methods) are based on how the candidates perform against set criteria. This means that every candidate who performs well enough will pass the assessment, regardless of how the other candidates have performed (so a good group of students will have a higher proportion of candidates passing than a poor group).
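
The difference between the two kinds of standard can be illustrated with a minimal Python sketch. The cohort scores, the 20% fail proportion and the cut score of 60 are invented for illustration and do not come from any real assessment. The function names are hypothetical, and ties between candidates are ignored.

```python
def fails_relative(scores, fail_fraction):
    """Norm referenced: the lowest-scoring fraction of the cohort fails."""
    n_fail = round(len(scores) * fail_fraction)
    return sorted(scores)[:n_fail]


def fails_absolute(scores, cut_score):
    """Criterion referenced: anyone scoring below the cut score fails."""
    return [s for s in scores if s < cut_score]


# Invented cohorts (assumption, for illustration only).
good_group = [72, 75, 78, 80, 85]
poor_group = [40, 45, 50, 62, 70]

for name, group in [("good", good_group), ("poor", poor_group)]:
    print(name,
          "relative:", fails_relative(group, 0.2),
          "absolute:", fails_absolute(group, 60))
```

With these invented numbers the relative standard fails one candidate in each group, including the candidate who scored 72 in the good group. The absolute standard fails nobody in the good group and three candidates in the poor group.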

Specific standard setting methods for clinical assessment

Performance based methods have increasingly become the standard setting methods of choice in clinical skills and performance assessments. These methods derive the standard from judgements made on the actual candidate performance at the time of the assessment. There are three main methods:

  • Borderline Regression
  • Borderline Group
  • Contrasting Groups

In addition to marking the elements of a student's performance against criteria, all these methods require the examiner to assign a global rating for that performance.

Borderline Regression Method

This appears to be one of the most popular methods of setting cut scores for structured clinical assessments. As in the other performance based methods, there is a global rating for each candidate performance. The rating scale may have up to seven categories, e.g. Fail, Borderline, Pass, Very Good, Outstanding.

The checklist scores are plotted against the global rating categories and a regression line is fitted to the data.

The point at which the Borderline category intersects with the regression line is the cut score for the station. This means that all the candidates below the cut score will fail the station regardless of the global category they have been given.
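
The calculation can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation. It assumes that the global rating categories are coded as equally spaced numbers (Fail = 0, Borderline = 1, Pass = 2, Very Good = 3, Outstanding = 4) and that an ordinary least squares line is used. The function name and the data are hypothetical.

```python
def borderline_regression_cut(ratings, scores, borderline_rating):
    """Fit score = intercept + slope * rating by least squares and
    return the predicted score at the Borderline rating."""
    n = len(ratings)
    mean_x = sum(ratings) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in ratings)
    if sxx == 0:
        raise ValueError("all candidates have the same global rating")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ratings, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * borderline_rating


# Invented station data (assumption): one rating and one checklist
# score per candidate, with Borderline coded as 1.
ratings = [0, 1, 1, 2, 2, 3, 4]
scores = [7, 11, 12, 14, 15, 18, 20]

cut = borderline_regression_cut(ratings, scores, borderline_rating=1)
print(f"Station cut score: {cut:.1f}")
```

Because the line is fitted to every candidate's data, a cut score can be obtained even when few candidates were rated Borderline.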

Borderline Group Method

The global rating for this method is generally a 3 point scale:

  • Pass
  • Borderline
  • Fail

The mean score of all those candidates whose performance has been rated as borderline on the global rating scale is used as the cut score for that individual station. The potential difficulty with this method arises if very few or no candidates have received a borderline global rating.
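
A minimal Python sketch of this calculation is given below. The function name and the example results are hypothetical. The sketch raises an error when no candidate was rated Borderline, since the method then gives no cut score.

```python
from statistics import mean


def borderline_group_cut(results):
    """Return the mean checklist score of candidates rated Borderline.

    `results` is a list of (global_rating, checklist_score) pairs.
    """
    borderline = [score for rating, score in results if rating == "Borderline"]
    if not borderline:
        raise ValueError("no candidate received a Borderline rating")
    return mean(borderline)


# Invented station results (assumption, for illustration only).
results = [
    ("Pass", 18),
    ("Borderline", 12),
    ("Fail", 7),
    ("Borderline", 14),
    ("Pass", 16),
]
print("Station cut score:", borderline_group_cut(results))
```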

Contrasting Groups Method

This method again requires the assessor to assign a global rating to the candidate's performance. This can be:

  • Pass/Fail
  • Adequate/inadequate
  • Competent/not competent

The cut score is calculated as the point of intersection between the score distribution curves of the two global rating groups. There is the possibility that these two distributions do not intersect, which then requires a decision as to what the cut score should be.
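
One way to locate the intersection is sketched below in Python. This is an illustration under stated assumptions, not a prescribed procedure. It assumes each group's checklist scores are summarised by a fitted normal curve and searches for the crossing point between the two group means. The function name and the data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def contrasting_groups_cut(fail_scores, pass_scores, tol=1e-6):
    """Return the score at which the fitted Fail and Pass curves cross.

    Each group needs at least two scores that are not all identical.
    """
    fail = NormalDist(mean(fail_scores), stdev(fail_scores))
    passed = NormalDist(mean(pass_scores), stdev(pass_scores))

    def diff(x):
        return fail.pdf(x) - passed.pdf(x)

    lo, hi = fail.mean, passed.mean
    if lo >= hi or diff(lo) <= 0 or diff(hi) >= 0:
        # No usable crossing point: the cut score needs a judgement.
        raise ValueError("curves do not cross between the group means")

    # Bisection: the Fail curve is higher at lo, the Pass curve at hi.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Invented checklist scores for each global rating group (assumption).
fail_scores = [6, 8, 9, 10, 12]
pass_scores = [13, 15, 16, 18, 19]
print(f"Station cut score: {contrasting_groups_cut(fail_scores, pass_scores):.1f}")
```

When the sketch raises an error, the situation corresponds to the case described above in which a decision has to be made about the cut score.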

Examiner training for standard setting

Examiner training is crucial. Examiners must have information about the group of candidates and the level they have reached in their training. Examiner briefing information is provided for the specific station you are in, to guide you during the assessment. Feedback is important, as it lets you review how your marking compares with that of other examiners in that station.

References:

  1. Cizek GJ, Bunch MB. Standard Setting: A Guide to Establishing and Evaluating Performance Standards on Tests. SAGE Publications Ltd; 2007.
  2. General Medical Council Education Committee. Tomorrow's Doctors. General Medical Council; 2009.
  3. General Medical Council. Assessment in Undergraduate Medical Education: Advice Supplementary to Tomorrow's Doctors (2009). GMC; 2009.
  4. Kramer A, Muijtjens A, Jansen K, Düsman H, Tan L, van der Vleuten C. Comparison of a rational and an empirical standard setting procedure for an OSCE. Med Educ 2003; 37(2):132. Kaufman DM, Mann KV, Muijtjens AMM, van der Vleuten CPM. A comparison of standard-setting procedures for an OSCE in undergraduate medical education. Acad Med 2000; 75:267-271.