Monday, March 14, 2011

VAM!

Dr. Bruce Baker, in his column in yesterday’s Record, makes the case that the use of any sort of value-added models (VAMs) to assess teacher effectiveness is dangerous and unfair because of the possibility of identifying good teachers as ineffective. Using student test data to evaluate teacher proficiency, he says, is a non-starter because there’s a risk that a good teacher might be labeled as a bad teacher. VAMs are a crapshoot, no better than a flip of a coin, and do no more than offer “new opportunities to sabotage a teacher’s career.” All that will result from an attempt to measure teacher effectiveness, he warns, will be a “flood of lawsuits like none ever previously experienced.”

Dr. Baker’s concerns circle around the odds of a teacher receiving what’s often referred to as a “false positive.” It’s like cancer screening. Let’s say someone has a lump and goes to the doctor. The doctor orders a biopsy, and the results show that the lump is malignant. Only later, after more tests, does our story turn out happily: the patient is cancer-free, and that original test result was a false positive.

In the context of VAMs, there’s a risk that tying student test scores to teaching effectiveness could yield false positives. Even though the current proposal would base only 45% of a teacher’s evaluation on test data, it’s possible that a good teacher could be deemed ineffective.

Yet compare that to our current system of teacher evaluations. According to the National Center for Education Statistics, in 2007-2008, 2% of public school teachers in New Jersey were either dismissed or didn’t have their contracts renewed. Over a ten-year period, 47 out of 100,000 teachers were terminated.

Either our current system of teacher evaluation has a stunningly high rate of false negatives or teaching public school students is a cakewalk.

Nah, it’s the former. If we go back to our cancer patient, it’s as if the tests on her lump showed no malignancy when she really had cancer: a false negative. In other words, our current method of evaluating teachers is worse than a roll of the dice: it’s a recipe for keeping ineffective teachers in the classroom and increasing the odds that children, especially needy ones, will fall further and further behind.
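For readers who like to see the arithmetic, here’s a minimal sketch of how the false-positive and false-negative bookkeeping works in any screening system. Every number below is made up for illustration; none is an estimate of any actual evaluation proposal.

```python
def screening_outcomes(n, base_rate, sensitivity, specificity):
    """Count outcomes for a screen applied to n teachers.

    base_rate   -- hypothetical share of truly ineffective teachers
    sensitivity -- chance the screen flags a truly ineffective teacher
    specificity -- chance the screen clears a truly effective teacher
    """
    ineffective = round(n * base_rate)
    effective = n - ineffective
    true_pos = round(ineffective * sensitivity)   # bad teachers correctly flagged
    false_neg = ineffective - true_pos            # bad teachers wrongly cleared
    true_neg = round(effective * specificity)     # good teachers correctly cleared
    false_pos = effective - true_neg              # good teachers wrongly flagged
    return true_pos, false_pos, true_neg, false_neg

# Made-up numbers: 10% truly ineffective, 80% sensitivity, 90% specificity.
tp, fp, tn, fn = screening_outcomes(100_000, 0.10, 0.80, 0.90)

print(tp, fp, tn, fn)            # 8000 9000 81000 2000
# Of everyone the screen flags, what share are actually good teachers?
print(round(fp / (tp + fp), 2))  # 0.53
```

Note what the toy numbers show: even a screen that’s right most of the time can flag more good teachers than bad ones when truly ineffective teachers are rare. That’s Dr. Baker’s false-positive worry in a nutshell, and the same arithmetic, run in reverse, is what makes a near-zero dismissal rate look like a flood of false negatives.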

Even the president of the American Federation of Teachers, Randi Weingarten, has this to say on our current non-metrics: “With rare exceptions, teacher evaluation procedures are broken—cursory, perfunctory, superficial and inconsistent.”

But Dr. Baker is more concerned with the chances of firing a good teacher than he is with the impact of ineffective teachers on children. He’s all about teachers’ rights; students’ right to an effective education is entirely absent from his article. False negatives? No big deal. False positives? Your lawsuit is in the mail.

Are VAMs as unreliable as he makes them out to be? Here’s Dan Goldhaber, the director of the Center for Education Data & Research:
When it comes to VAM estimates of performance, we actually know quite a bit. Researchers find that the year-to-year correlations of teacher value-added job performance estimates are in the range of 0.3 to 0.5. These correlations are generally characterized as modest, but are also comparable to those found in fields like insurance sales or professional baseball where performance is certainly used for high-stakes personnel decisions.
So VAMs for teachers aren’t perfect, not by a long shot, though they’re comparable to measures used in other industries. But they’re far better than our current non-system, which awards tenure to just about anyone with a pulse. Will we lose some good teachers? No doubt. Will that loss be mitigated by the overall increase in teacher quality?

That depends on whether you look at VAMs as an instrument intended to protect teachers or as an instrument intended to protect students. Dr. Baker’s analysis is all about the former. For a statistician, he makes a pretty good labor lawyer.

2 comments:

kallikak said...

Dr. Baker is right.

Professional statisticians know that any screening system must be viewed in light of the false positives and negatives it generates.

Typically any effort to control results in one dimension (e.g., limit the number of false positives) is likely to influence expected outcomes in the other.

An ideal VAM screening system for teachers would minimize the number of good teachers labeled as bad while at the same time minimizing the number of bad teachers cleared as good.

If such a system exists, as supported by peer-reviewed analysis, please bring it forward.
