Carrell and West discovered, however, that just the opposite is true. Students thought that the best teachers were the ones who did the worst at preparing them for the higher level classes. Hence, Carrell and West conclude, "students appear to reward higher grades in the introductory course, but punish professors who increase deep learning."
In the second story, the Texas educational system has taken the notion that students are effective evaluators of teacher performance to a new extreme by letting students decide which professors get teaching rewards of up to $10,000. Dr. Stanley Fish railed against this scheme in a recent column for The New York Times. Fish writes,
Once this gets going (and Texas A&M is already pushing it), you can expect professors to advertise: “Come to my college, sign up for my class, and I can guarantee you a fun-filled time and you won’t have to break a sweat.” If there ever was a recipe for non-risk-taking, entirely formulaic, dumbed-down teaching, this is it.

Part of the problem with using students to evaluate teaching, Fish argues, has to do with "deferred judgment." Time, sometimes years, is often required to fully understand what you have learned from a class, especially a class well taught.
And that is why student evaluations (against which I have inveighed since I first saw them in the ’60s) are all wrong as a way of assessing teaching performance: they measure present satisfaction in relation to a set of expectations that may have little to do with the deep efficacy of learning. Students tend to like everything neatly laid out; they want to know exactly where they are; they don’t welcome the introduction of multiple perspectives, especially when no master perspective reconciles them; they want the answers.
But sometimes (although not always) effective teaching involves the deliberate inducing of confusion, the withholding of clarity, the refusal to provide answers; sometimes a class or an entire semester is spent being taken down various garden paths leading to dead ends that require inquiry to begin all over again, with the same discombobulating result; sometimes your expectations have been systematically disappointed. And sometimes that disappointment, while extremely annoying at the moment, is the sign that you’ve just been the beneficiary of a great course, although you may not realize it for decades.
Needless to say, that kind of teaching is unlikely to receive high marks on a questionnaire that rewards the linear delivery of information and penalizes a pedagogy that probes, discomforts and fails to provide closure. Student evaluations, by their very nature, can only recognize, and by recognizing encourage, assembly-line teaching that delivers a nicely packaged product that can be assessed as easily and immediately as one assesses the quality of a hamburger.

The problems associated with student evaluations have been known for decades. In the many studies I have seen on the topic, not one has shown student evaluations to be an effective measure of teaching performance. (If you find one, please let me know.) Nonetheless, student evaluations are used as the main source, or the only source, of information for evaluating teaching performance. This incentive structure is known to encourage teaching that aims to entertain, lightened student workloads, and grade inflation (which has continued to rise since the introduction of student evaluations in the 1960s).
In one now well-known experiment, two people were asked to teach the same class on the same topic. Both lectured on the topic and had time at the end for some Q&A with the students. One was a professor and expert on the topic. The other was an actor. The actor was charming, charismatic, and funny, and bluffed his way through the whole lecture and Q&A. (You can probably guess where this is going.) The students overwhelmingly rated the actor the better teacher and (here's the kicker) the more knowledgeable on the topic.
The obvious question then becomes, if we know that student evaluations are poor measures of teaching performance, and have known this for decades, why do colleges continue to use them to measure teaching performance? The answer, I believe, has to do with a problem common in the social sciences--the data that is most likely to be used is the data that is easiest to collect. Deans and hiring and promotion committees rely upon this data because that is the data they have. They look at all those means and standard errors nicely laid out in those neat little tables and it becomes easy to assume those numbers mean something. The situation reminds me of a joke told by economist Ken Rogoff to explain why economists failed to foresee the current economic crisis.
A drunk on his way home from a bar one night realizes that he has dropped his keys. He gets down on his hands and knees and starts groping around beneath a lamppost. A policeman asks what he’s doing.

“I lost my keys in the park,” says the drunk.

“Then why are you looking for them under the lamppost?” asks the puzzled cop.

“Because,” says the drunk, “that’s where the light is.”

In the same way that economists failed to understand the economic collapse because they failed to collect the right data, college administrators fail to effectively evaluate teaching because they are collecting the wrong data.
Full version of the Carrell and West study (PDF).
Cowen, Tyler. "Does professor quality matter?"
Fish, Stanley. "Student Evaluations, Part Two."
Douthat, Ross. "In Defense of Student Evaluations."
Douthat, Ross. "Now, The Case Against Student Evaluations."
Jacobs, Alan. "Stanley Fish is right again."