The practical significance of the choice of item response theory (IRT) model for the results of a statewide assessment was investigated at multiple decision-making levels: the examinee level, the school and district summary levels, and the subgroup level. Data for the study consisted of the student response matrix from South Carolina’s 2014 Palmetto Assessment of State Standards (PASS). The Rasch model, which is used with PASS and with nearly half of PASS-like multiple-choice statewide assessments in other states, was compared with the three-parameter logistic (3PL) model, another popular IRT model used in similar statewide assessments.

Model fit checks indicated that the 3PL model had better person fit than the Rasch model for PASS. Results on the PASS summary scores reported for schools and districts on state and federal report cards showed that, for most schools and districts, the percentage in each PASS performance level and the PASS means were largely unchanged by the choice of the 3PL or Rasch model. However, for some small schools and districts, the choice of IRT model would have striking effects on the percentage in each performance level featured on report cards. Furthermore, at the examinee level, the scores of examinees near the lower end of the score distribution were sensitive to the change in IRT model. Decisions for some of these examinees, such as selection for various support programs or even retention based on PASS scores, might differ under the other model. Among subgroups, students with individualized education plans (IEPs) showed the most change because this subgroup, on average, had scores near the lower end of the score distribution. Across grades and subject areas, 8th grade Math was the most affected, compared with 3rd grade ELA, 3rd grade Math, and 8th grade ELA. The 3PL model’s estimated guessing parameter was higher for 8th grade Math than for the other grades and subjects.
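
One way to see why low-scoring examinees are the most sensitive to the model choice is to compare the two item response functions directly. The following is a minimal sketch, not the study's code: the function names and the item parameters (discrimination 1.0, difficulty 0.0, guessing 0.25) are hypothetical illustrations, not PASS estimates. Under the Rasch model the probability of a correct response is a logistic function of ability minus difficulty; the 3PL model adds a discrimination parameter and a lower asymptote (the guessing parameter).

```python
import math


def rasch_prob(theta, b):
    """Rasch probability of a correct response for ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def three_pl_prob(theta, a, b, c):
    """3PL probability: discrimination a, difficulty b, guessing (lower asymptote) c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters chosen only for illustration.
A, B, C = 1.0, 0.0, 0.25


def model_gap(theta):
    """Difference between the 3PL and Rasch probabilities for the same item."""
    return three_pl_prob(theta, A, B, C) - rasch_prob(theta, B)


# Print both probabilities and their gap across a range of abilities.
for theta in (-3.0, -1.5, 0.0, 1.5, 3.0):
    print(f"theta={theta:+.1f}  rasch={rasch_prob(theta, B):.3f}  "
          f"3pl={three_pl_prob(theta, A, B, C):.3f}  gap={model_gap(theta):.3f}")
```

For this hypothetical item the gap between the two curves shrinks as ability increases, because the guessing parameter matters most where the Rasch probability of success is close to zero.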

In addition to the analysis of the student response matrix from the actual administration of PASS, a small simulation study was performed on the most affected group, the 8th grade Math IEP subgroup, based on the ability and item parameter estimates from the actual examinees. Both the fit and misfit models accurately estimated the modeled true PASS scores, except when the 3PL was the true model and the Rasch was the misfit model used for estimation.
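
The data-generating step of a simulation of this kind can be sketched as follows. This is an illustrative sketch only and does not reproduce the study's procedure: the ability values, item parameters, function names, and random seed are all hypothetical. It draws a 0/1 response matrix from the 3PL model for a set of fixed abilities and items.

```python
import math
import random


def three_pl_prob(theta, a, b, c):
    """3PL probability of a correct response (a: discrimination, b: difficulty, c: guessing)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, rng):
    """Draw one 0/1 response per examinee-item pair from the 3PL model."""
    return [
        [1 if rng.random() < three_pl_prob(theta, a, b, c) else 0
         for (a, b, c) in items]
        for theta in thetas
    ]


# Hypothetical (a, b, c) item parameters and abilities, for illustration only.
HYPOTHETICAL_ITEMS = [(1.2, -0.5, 0.20), (0.8, 0.0, 0.25), (1.0, 0.7, 0.30)]
HYPOTHETICAL_THETAS = [-2.0, -1.0, 0.0]

rng = random.Random(2014)  # fixed seed so the draw is reproducible
responses = simulate_responses(HYPOTHETICAL_THETAS, HYPOTHETICAL_ITEMS, rng)
raw_scores = [sum(row) for row in responses]  # number-correct score per examinee
print(raw_scores)
```

Setting a = 1 and c = 0 for every item reduces the generator to the Rasch model, so the same function can produce data under either assumed true model.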