"In its most essential form, standard setting refers to the process of establishing one or more cut scores on a test"1.
The General Medical Council, in Tomorrow's Doctors, states:
"Medical schools must have appropriate methods for setting standards in assessments to decide whether students have achieved the outcomes for graduates." 2
The concept of standard setting may appear straightforward, but it is not: the cut score obtained can vary depending on the method used. No single standard setting method suits all assessments; often one method is particularly appropriate for a given assessment. When choosing a method, consider that it must be:
There are two different kinds of standards, relative and absolute. The GMC, in Assessment in Undergraduate Medical Education (2009), states:
"Schools should use approaches to standard setting which ensure concordance with absolute standards."3
Relative standards (norm-referenced methods) are based on a comparison among the performances of candidates. For example, a set proportion of candidates fails the assessment regardless of how well they performed.
Absolute standards (criterion-referenced methods) are based on how candidates perform against set criteria. This means that every candidate can pass if they perform well enough, regardless of how the other candidates have performed (so a strong group of students will have a higher proportion of candidates passing than a weak group).
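The difference between the two kinds of standard can be illustrated with a short sketch. The scores, the 20% fail fraction, and the cut score of 60 are made-up values for illustration only, and the function names are hypothetical.

```python
def norm_referenced_pass(scores, fail_fraction):
    """Relative standard: fail a fixed fraction of candidates (the lowest
    scorers), regardless of how well they performed in absolute terms."""
    n_fail = round(len(scores) * fail_fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    failed = set(ranked[:n_fail])
    return [i not in failed for i in range(len(scores))]


def criterion_referenced_pass(scores, cut_score):
    """Absolute standard: pass every candidate who meets the cut score."""
    return [s >= cut_score for s in scores]


# Hypothetical strong cohort: every candidate scores well above 60.
strong_cohort = [72, 75, 78, 81, 84, 88, 90, 93, 95, 97]

relative = norm_referenced_pass(strong_cohort, fail_fraction=0.2)
absolute = criterion_referenced_pass(strong_cohort, cut_score=60)

print(sum(relative), "of", len(strong_cohort), "pass under the relative standard")
print(sum(absolute), "of", len(strong_cohort), "pass under the absolute standard")
```

Under the relative standard two candidates fail even though the whole cohort performed well; under the absolute standard all of them pass.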
Performance-based methods have increasingly been the choice for standard setting in clinical skills and performance assessments. These methods derive the standard from judgements made on the actual candidate test performance at the time of the assessment. They fall into three main methods: borderline regression, borderline group, and contrasting groups.
In addition to marking the elements of a student's performance against criteria, all these methods require the examiner to assign a global rating for that performance.
The borderline regression method appears to be one of the most popular methods of setting cut scores for structured clinical assessments. As with the other borderline methods, there is a global rating for each candidate performance. The global rating scale may have up to seven categories, e.g. Fail, Borderline, Pass, Very Good, Outstanding.
The checklist scores are plotted against the global rating category and a regression line is fitted to the data.
The point at which the Borderline category intersects with the regression line is the cut score for the station.
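The calculation described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the global rating categories are coded as equally spaced integers (here Fail = 1, Borderline = 2, and so on), uses an ordinary least-squares line, and the data and function name are hypothetical.

```python
def borderline_regression_cut(rating_codes, checklist_scores, borderline_code):
    """Fit a least-squares line of checklist score on global rating code and
    return the score the line predicts at the Borderline category."""
    n = len(rating_codes)
    mean_x = sum(rating_codes) / n
    mean_y = sum(checklist_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in rating_codes)
    if sxx == 0:
        raise ValueError("all candidates received the same global rating")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(rating_codes, checklist_scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * borderline_code


# Assumed coding: Fail=1, Borderline=2, Pass=3, Very Good=4, Outstanding=5.
rating_codes = [1, 2, 2, 3, 3, 3, 4, 4, 5]
checklist_scores = [7, 11, 12, 15, 16, 14, 19, 18, 23]  # made-up station scores

cut_score = borderline_regression_cut(rating_codes, checklist_scores,
                                      borderline_code=2)
print(f"Station cut score: {cut_score:.2f}")
```

Because every candidate's score contributes to the regression line, the cut score does not rest only on the few candidates rated Borderline.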
The global rating for the borderline group method is generally a three-point scale:
The mean score of all candidates whose performance has been marked as borderline on the global rating scale is used as the cut score for that individual station.
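As a minimal sketch of this calculation, the cut score is simply the mean checklist score of the candidates rated borderline. The rating labels, scores, and function name below are hypothetical.

```python
from statistics import mean


def borderline_group_cut(global_ratings, checklist_scores,
                         borderline_label="Borderline"):
    """Return the mean checklist score of the candidates whose global
    rating for the station was the borderline category."""
    borderline_scores = [score
                         for rating, score in zip(global_ratings, checklist_scores)
                         if rating == borderline_label]
    if not borderline_scores:
        raise ValueError("no candidate was rated borderline on this station")
    return mean(borderline_scores)


# Made-up results for one station.
global_ratings = ["Fail", "Borderline", "Pass", "Borderline", "Pass", "Borderline"]
checklist_scores = [7, 11, 16, 12, 18, 13]

cut_score = borderline_group_cut(global_ratings, checklist_scores)
print("Station cut score:", cut_score)
```

The sketch raises an error when no candidate is rated borderline, since the method cannot produce a cut score in that case.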
The contrasting groups method again requires the assessor to assign a global rating to the candidate's performance; this can be:
In this situation the cut score is calculated as the point of intersection between the global rating distribution curves.
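One way to locate that point of intersection is sketched below. It rests on several assumptions that are not in the text above: candidates are split into two groups by global rating (called "lower" and "upper" here), each group's checklist scores are modelled with a normal curve, and the crossing point is searched for between the two group means. The data and function name are hypothetical.

```python
from statistics import NormalDist


def contrasting_groups_cut(lower_group_scores, upper_group_scores, tol=1e-6):
    """Fit a normal curve to each group's scores and find, by bisection,
    the score between the two group means where the curves cross."""
    lower = NormalDist.from_samples(lower_group_scores)
    upper = NormalDist.from_samples(upper_group_scores)
    lo, hi = lower.mean, upper.mean
    if lo >= hi:
        raise ValueError("lower group mean must be below upper group mean")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lower.pdf(mid) > upper.pdf(mid):
            lo = mid  # still on the lower group's side of the crossing
        else:
            hi = mid
    return (lo + hi) / 2


# Made-up checklist scores for the two global rating groups.
lower_group_scores = [8, 10, 12]
upper_group_scores = [14, 16, 18]

cut_score = contrasting_groups_cut(lower_group_scores, upper_group_scores)
print(f"Station cut score: {cut_score:.2f}")
```

When the two groups have very different spreads, the curves may cross at more than one point; this sketch only looks between the two means.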
This is crucial. Examiners must have information about the group of candidates and what level they are at in their training. Examiner briefing information is provided for the specific station you are in to guide you during the assessment. Feedback is important as you can then review where you are compared to other examiners in that station.
References: