Talk:Computerized adaptive testing
I'm proposing computerized adaptive tests (CATs) for final exams at my university, but I have met some harsh reactions. Critics argue that this kind of test punishes the most intelligent students and demotivates them. Has anyone had experience implementing CATs for academic courses?
Regards,
Mario Montoya
In order to implement an adaptive test, you must first have scaled the item pool using item response theory (IRT). The sample size typically required for IRT scaling is at least 200, and preferably several hundred or even thousands of people. As a consequence, CAT is uncommon and generally impractical in academic settings unless there is a dedicated staff that produces one or a few exams each year and can manage the CAT. The way this usually evolves is to first administer the exam on computer for a period of time while the database of questions and answers accumulates. Then the scaling is performed and an adaptive version of the exam is introduced.
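As a rough illustration of what the scaled pool is used for, here is a minimal Python sketch. It assumes the simplest IRT model (the one-parameter Rasch model) and a "pick the most informative unused item" rule; the function names and the five-item pool are made up for the example and are not taken from any real testing system.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) probability that a person of ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = p_correct(theta, b)
    return p * (1.0 - p)

def pick_next_item(theta, pool, used):
    """Return the index of the unused item that is most informative at theta.
    Under the Rasch model this is the item whose difficulty is closest to theta."""
    candidates = [i for i in range(len(pool)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, pool[i]))

# Hypothetical scaled pool: one difficulty value per item (assumed, not real data).
pool = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(pick_next_item(0.3, pool, used=set()))  # 2 (difficulty 0.0 is closest to 0.3)
print(pick_next_item(0.3, pool, used={2}))    # 3 (difficulty 1.0 is next closest)
```

The difficulty values in `pool` are exactly what the IRT scaling step has to estimate from real response data, which is why the sample-size requirement comes before any adaptive delivery.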
I haven't seen any studies that support the notion that CATs punish the most intelligent students, although CAT does introduce a new administration method that many students dislike. For example, item review and modification are generally disallowed in CAT because if you get an easier item, then you know that you got the previous item wrong. If you can back up and correct that answer, then your score will be biased. In fact, if item review is allowed, then here's how you subvert the exam: intentionally answer each item wrong until you finish the exam. The CAT will have served the easiest possible items to you. Now go back and answer all the easy items correctly. If you actually get 100% correct, then you should pass, but if you don't, then you can take them to court and have them explain to a judge why, with 100% correct, you did not pass their exam! (This is, from memory, Howard Wainer's argument against allowing item review.) Many students are also frustrated that they cannot skip harder questions and return to them later.
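To make the exploit concrete, here is a toy Python simulation. It is only a sketch under simplifying assumptions: the ability estimate moves by a fixed step after each answer (a real CAT would re-estimate ability with IRT), the next item is the unused one closest in difficulty to the current estimate, and the pool and all names are hypothetical.

```python
def run_cat(answers_correctly, pool, n_items, step=1.0):
    """Toy adaptive test.

    answers_correctly: function taking an item difficulty, returning True/False.
    pool: list of item difficulties (assumed already scaled).
    Returns the difficulties of the items served, in order.
    """
    theta = 0.0          # running ability estimate, starts at the average
    used, served = set(), []
    for _ in range(n_items):
        candidates = [i for i in range(len(pool)) if i not in used]
        # Serve the unused item whose difficulty is closest to the estimate.
        nxt = min(candidates, key=lambda i: abs(pool[i] - theta))
        used.add(nxt)
        served.append(pool[nxt])
        # Crude update (an assumption): up after a correct answer, down after a wrong one.
        theta += step if answers_correctly(pool[nxt]) else -step
    return served

# Hypothetical pool: difficulties from -3.0 to 3.0 in steps of 0.5.
pool = [x / 2 for x in range(-6, 7)]

# The examinee described above: deliberately wrong on every item.
sabotaged = run_cat(lambda b: False, pool, n_items=5)
print(sabotaged)  # [0.0, -1.0, -2.0, -3.0, -2.5]
```

After the first (average) item, everything served sits at the easy end of the pool. With item review allowed, the examinee could then go back and correct all five answers, ending up with 100% correct on a test made of the easiest items.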
Why would you want to introduce CAT exams in an academic setting? CAT is really only helpful if you need extremely reliable tests over a broad range of ability (e.g., for categorization or on a broad entrance exam like the GRE). CAT is too much work and, as Wainer shows, too much of a security risk for general usage.
Amead