A Comparison of Differential Scoring Methods For Multiple Choice Tests in Terms of Classical Test and Item Response Theories
Keywords:
multiple choice tests, partial credit model, scoring methods, option weighting

Abstract
The purpose of this research is to determine the effects of binary (1-0) scoring, judgement-based (a priori) option weighting, and empirical option weighting on the reliability and validity of a multiple-choice test under Classical Test Theory and Item Response Theory. Data were collected by administering a multiple-choice test of verbal ability to 1593 students attending several departments at Hacettepe and Gazi Universities. Under Item Response Theory, findings showed that 1-0 scoring estimates the parameters more precisely than weighted scoring across different intervals of the ability scale, and that binary scoring is superior to weighted scoring in terms of validity. Under Classical Test Theory, results indicated that empirical option weighting yields the highest reliability estimate and that all scoring methods have an identical effect on test validity.
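The contrast between the scoring methods compared above can be illustrated with a minimal sketch. The items, answer key, and option weights below are hypothetical examples, not data from the study; in practice, a priori weights would come from expert judgement and empirical weights from item analysis of the response data.

```python
# Hypothetical 3-item multiple-choice test: answer key for binary (1-0) scoring.
KEY = ["B", "A", "D"]

# Hypothetical per-option weights for each item. Under empirical option
# weighting, such weights would be derived from the data (e.g., from
# option-criterion correlations); these values are illustrative only.
WEIGHTS = [
    {"A": 0.1, "B": 1.0, "C": 0.0, "D": 0.3},
    {"A": 1.0, "B": 0.2, "C": 0.1, "D": 0.0},
    {"A": 0.0, "B": 0.4, "C": 0.2, "D": 1.0},
]

def binary_score(responses):
    """1-0 scoring: one point for each response that matches the key."""
    return sum(1 for r, k in zip(responses, KEY) if r == k)

def weighted_score(responses):
    """Option weighting: sum the weight assigned to each chosen option,
    so partially correct distractors still earn partial credit."""
    return sum(w[r] for r, w in zip(responses, WEIGHTS))

responses = ["B", "B", "C"]          # one fully correct answer, two distractors
print(binary_score(responses))       # -> 1
print(weighted_score(responses))     # -> 1.4 (1.0 + 0.2 + 0.2)
```

The sketch shows why the two methods can rank examinees differently: binary scoring discards all information in the distractor choices, while option weighting lets partially informative wrong answers contribute to the total score.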