Teacher ratings can’t tell good teachers from bad ones – back to the drawing board?

Corporate and business people who have lived through serious quality improvement programs, especially those based on hard statistical analysis of procedures and products in a manufacturing plant, know the great truths drilled into them by such quality-minded statistical gurus as W. Edwards Deming: The fault, dear Brutus, is not in the teacher, but in the processes generally beyond the teacher’s control.

Here’s the shortest video I could find on Deming’s 14 Points for Management. See especially point #14, about eliminating annual “performance reviews,” because as Dr. Deming frequently demonstrated, the problems that prevent outstanding success are problems of the system, and are beyond the control of the frontline employees (teachers, in this case). I offer this here only for the record, since it’s a rather dull presentation. I find, however, especially among education administrators, that these well-established methods for creating champion performance in an organization are foreign to most Americans. Santayana’s Ghost is constantly amazed at what we refuse to learn.

Wise words from the saviors of business did not give even a moment’s pause to those who think that we could improve education if we could only get rid of those conniving, bad teachers who block our children’s learning. Since the early Bush administration and the passage of the nefarious, so-called No Child Left Behind Act, politicians have pushed for new measures to catch teachers “failing,” and so to thin the ranks of teachers. Bill Gates, the great philanthropist, put millions of dollars into projects in Washington, D.C., Dallas, and other districts, to come up with a way to statistically measure who are the good teachers, the ones who “add value” to a kid’s education year over year.

It was a massive experiment, running in fits and starts for more than a decade. We have the details from two of America’s most vaunted and haunted school districts, Washington, D.C., and New York City, plus Los Angeles and other sites, in projects funded by Bill Gates and others. We can now pass judgment on the idea that identifying the bad-apple teachers and getting rid of them will improve education.

As an experiment, it failed. After teachers were measured eight ways from Sunday for more than a decade, W. Edwards Deming was proved correct: Management cannot tell the bad actors from the good ones.

Most of the time the bad teachers this year were good teachers last year, and vice versa, according to the measures used.

Firing this year’s bad ones only means next year’s good teachers are gone from the scene.

Data have been published in a few places, generally over the complaints of teachers who don’t want to be labeled “failures” when they know better. Curiously, some of the promoters of the scheme also came out against publication.

A statistician could tell why. When graphed, the points of data do not reveal good teachers who consistently add value to their students year after year, nor do the data put the limelight on bad teachers who fail to achieve goals year after year. Instead, they reveal that a teacher we think is good this year on the basis of test scores may well have been a bad teacher on the same measures last year. Worse, many of the “bad teachers” from previous years had scores that rocketed up. The data don’t show any great consistency beyond chance.

So the post over at the blog of G. F. Brandenburg really caught my eye. His calculations, graphed, show that these performance evaluation systems themselves do not perform as expected. Here it is, “Now I understand why Bill Gates didn’t want the value-added data made public”:

It all makes sense now.

At first I was a bit surprised that Bill Gates and Michelle Rhee were opposed to publicizing the value-added data from New York City, Los Angeles, and other cities.

Could they be experiencing twinges of a bad conscience?

No way.

That’s not it. Nor do these educational Deformers think that value-added mysticism is nonsense. They think it’s wonderful and that teachers’ ability to retain their jobs and earn bonuses or warnings should largely depend on it.

The problem, for them, is that they don’t want the public to see for themselves that it’s a complete and utter crock. Nor to see the little man behind the curtain.

I present evidence of the fallacy of depending on “value-added” measurements in yet another graph — this time using what NYCPS says is the actual value-added scores of all of the many thousands of elementary school teachers for whom they have such value-added scores in the school years that ended in 2006 and in 2007.

I was afraid that by using the percentile ranks as I did in my previous post, I might have exaggerated or distorted how bad “value added” really was.

No worries, mate – it’s even more embarrassing for the educational deformers this way.

In any introductory statistics course, you learn that a graph like the one below is a textbook case of “no correlation”. I had Excel draw a line of best fit anyway, and calculate an r-squared correlation coefficient. Its value? 0.057 — once again, just about as close to zero correlation as you are ever going to find in the real world.

In plain English, what that means is that there is essentially no such thing as a teacher who is consistently wonderful (or awful) on this extremely complicated measurement scheme. How teacher X does one year in “value-added” in no way allows anybody to predict how teacher X will do the next year. They could do much worse, they could do much better, they could do about the same.

Even I find this to be an amazing revelation. What about you?

And to think that I’m not making any of this up. (unlike Michelle Rhee, who loves to invent statistics and “facts”.)

You should also see his earlier posts, “Gary Rubenstein is right, no correlation on value-added scores in New York city,” and “Gary Rubenstein demonstrates that the NYC ‘value-added’ measurements are insane.”
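For anyone who wants to check a figure like Brandenburg’s 0.057, here is a minimal sketch in Python of how an r-squared value can be computed from paired scores for the same teachers in two consecutive years. It is not his spreadsheet and it does not use the New York City data: the function names and the five made-up scores are assumptions for illustration only. For a straight line of best fit, r-squared is the square of the ordinary (Pearson) correlation r, so 0.057 corresponds to an r of only about 0.24, and one year’s score accounts for under 6 percent of the variation in the next year’s.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of the variation in ys that a straight-line fit on xs explains."""
    return pearson_r(xs, ys) ** 2


# Hypothetical "value-added" scores for five teachers (NOT real data).
year1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
# If ratings measured something stable, year 2 would track year 1.
consistent_year2 = [-1.0, -0.5, 0.0, 0.5, 1.0]
# If ratings were mostly noise, year 2 would look unrelated to year 1.
scrambled_year2 = [1.0, -2.0, 0.0, 2.0, -1.0]

print("consistent:", r_squared(year1, consistent_year2))
print("scrambled: ", r_squared(year1, scrambled_year2))
print("r implied by r-squared of 0.057:", round(sqrt(0.057), 3))
```

A rating system that picked out reliably good and reliably bad teachers would land near the “consistent” end of that scale. The figure Brandenburg reports sits near the “scrambled” end.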

In summary, many of our largest school systems have spent millions of dollars for a tool to help them find the “bad teachers” to fire, and the tools not only do not work, but may lead to the firing of good teachers, cutting off the legs of the campaign to get better education.

It’s a scandal, really, or an unrolling series of scandals. Just try to find someone reporting it that way. Is anyone?

More, Resources:

America’s bravest and most honest school reformer, Diane Ravitch, reviews the book about how Finland does it right, Finnish Lessons: What Can the World Learn from Educational Change in Finland? by Pasi Sahlberg, with a foreword by Andy Hargreaves, Teachers College Press, 167 pp., $34.95 (paper)

This entry was posted on Sunday, March 4th, 2012 at 7:15 pm and is filed under Education, Education Administration, Education assessment, Scandals, Teachers, Teaching, Testing.

5 Responses to Teacher ratings can’t tell good teachers from bad ones – back to the drawing board?

Founders on education standards: James Madison | Millard Fillmore's Bathtub says:

February 25, 2015 at 2:06 pm

[…] Teacher ratings can’t tell good teachers from bad ones — back to the drawing board? […]

Deming and Peters, and teacher evaluations | Millard Fillmore's Bathtub says:

October 12, 2013 at 1:21 am

[…] “Teacher ratings can’t tell good ones from bad ones — back to the drawing board?&#… […]

War on Teachers and Education, part 1: Prof. Ravitch’s emotion-touching call for a cease-fire on teachers | Millard Fillmore's Bathtub says:

June 10, 2013 at 12:31 am

[…] work in the French Revolution, it didn’t work in Russian in 1917. Management experts like W. Edwards Deming, the most famous of the tough-reorganization management consultants in the d… — the people fired are not the problem, nor do they have the authority to fix the problems, […]

jaycubed says:

March 5, 2012 at 2:19 pm

The problem with the application of Deming’s method is typically that managers “know better” than Deming and pick & choose those elements that coincide with their personal beliefs.

They immediately reject the most important elements they need to learn personally so that they could better manage their enterprises.

The typical result is chaos: with managers blaming employees for managerial incompetence and employees being subjected to hypocritical lessons/behavior from managers.

Why isn’t Deming’s method working in every business on the planet? Because it is never applied completely. Even the Japanese, who applied Deming’s methods more deeply than others, were done in by rejecting their core businesses in an attempt to make profit on ephemera (the collapse in real estate & art speculation caused the shrinkage of the Japanese economy by trillions of yen and drained necessary investment & r&d funds for a decade).

There is also the vulture class of business, the MBAs, whose purpose is to milk the maximum amount of short term profit from a business before it is ruined. Deming’s teaching, insight & methods are the opposite of what the vast majority of these/any people learn in “business school”.

Now I Understand Why Bill Gates Didn’t Want The Value-Added Data Made Public « Millard Fillmore's Bathtub says:

March 5, 2012 at 11:46 am

[…] See more, next post. […]

Millard Fillmore's Bathtub