Saturday, May 29, 2010

Scantron, Round 2

We are just winding up our second go-round with the Scantron Performance Assessment Series testing, the computer-based assessment of students that CPS requires.

As far as multiple choice assessments go, there are several things I like about the Scantron Performance Assessment Series. The results are available immediately. It works as both a norm-referenced and a criterion-referenced assessment. It provides useful diagnostic and grouping info. It has some okay resources for teachers. If you have the requisite resources (computers, fast Internet, staff), it is fairly easy to administer.

However... I am looking over the results from the testing. I am seeing student scores that vary by hundreds of points from February scores (in some cases, a 20% difference). Some higher, some lower. In statistical terms, really big standard deviations for some classes, especially in the reading assessment.

Scantron differs from other assessments like ISAT or the CPS Benchmark Assessment in that it provides a grade-independent number that should show growth over the years as students learn more (all other things being equal).

Considering just student knowledge (if in fact the assessment measures that), there is little reason for scores to drop. Save for brain injury, it's unreasonable to think that students lose knowledge over a three-month period. I suppose maybe a teacher so completely mis-taught something that the students unlearned it. But that would show up in the numbers in a different way.

The wild score fluctuations raise, again, the striking role of all of the factors extrinsic to testing: things that are not part of the questions themselves (that is, not the content, the wording, the pictures, etc.). In particular, they raise the human dimension of assessment, like the test-taker's willingness, confidence, interest, desire, alertness, and so on.

Napoleon said that "in war, moral factors account for three quarters of the whole; relative material strength accounts for only one quarter." ("Moral" = "morale".) The same pretty much holds, I think, for standardized testing.

When I questioned students whose scores dropped significantly (let's say more than 100 points), they would say things like they didn't feel well, or they were tired, or they didn't try, or they didn't care. After I cajoled, encouraged, and even threatened them to really try their best, their scores typically jumped considerably -- by hundreds of points. (Scantron uses a scale from about 1300 to 3900; an entire grade-level jump is approximately equivalent to a change of 70 to 220 points, depending on the grade.)
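
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the approximate figures quoted above (a scale of about 1300 to 3900, and 70 to 220 points per grade level); the function names are my own, and it assumes nothing about how Scantron actually computes or reports growth.

```python
# Approximate figures quoted in the post; not official Scantron values.
SCALE_MIN, SCALE_MAX = 1300, 3900
POINTS_PER_GRADE_LOW, POINTS_PER_GRADE_HIGH = 70, 220


def grade_levels_spanned(point_change):
    """Return the (fewest, most) grade levels a score change could represent.

    The range exists because a grade level is worth anywhere from
    70 to 220 points, depending on the grade.
    """
    size = abs(point_change)
    return size / POINTS_PER_GRADE_HIGH, size / POINTS_PER_GRADE_LOW


def share_of_scale(point_change):
    """Return the change as a fraction of the whole 1300-3900 scale."""
    return abs(point_change) / (SCALE_MAX - SCALE_MIN)


# A hypothetical 100-point drop between February and May.
fewest, most = grade_levels_spanned(-100)
print(f"{fewest:.1f} to {most:.1f} grade levels")
print(f"{share_of_scale(-100):.1%} of the scale")
```

By this arithmetic, a 100-point drop looks like somewhere between about half a grade level and nearly a grade and a half of "lost" learning, which is hard to believe over three months.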

This moral factor in testing is one of those easily ignored things in the overall fetishization of data. Testing kids is not the same as taking their temperature. When assessing a student, the student must do something -- the student must perform -- their cooperation, their buy-in, their willingness to play along is three-quarters of the whole. The horse and water.

The fake science behind "performance management" must ignore the moral factors because they aren't controllable. And because the moral dimension is ignored, it bumbles from that fundamental error to another and another, until a massive lie has been constructed.

I have to admit that there is one part of me that appreciates the resistance -- mostly unconscious -- on the part of the students to what is being done to them through the seemingly endless testing. Now if that impulse to resist could be nurtured, and shaped, and directed towards something really useful -- now that would be an education.


P.S. I thought we were done, because Friday (5/28) was originally set as the final day, but I just saw an email yesterday saying that the window had been extended to June 3.

P.P.S. I did a review, rather neutral and not too critical, of the Scantron Performance Assessment, available here.
