Educational policy analysis archives
Education -- Research -- Periodicals   ( lcsh )
non-fiction   ( marcgt )
serial   ( sobekcm )


Includes EPAA commentaries.

usfldc doi - E11-00075
usfldc handle - e11.75
Academic freedom, tenure, and student evaluation of faculty : galloping polls in the 21st century / Robert E. Haskell.
Education Policy Analysis Archives
Volume 5 Number 6, February 12, 1997, ISSN 1068-2341

A peer-reviewed scholarly electronic journal. Editor: Gene V Glass, Glass@ASU.EDU. College of Education, Arizona State University, Tempe AZ 85287-2411

Copyright 1997, the EDUCATION POLICY ANALYSIS ARCHIVES. Permission is hereby granted to copy any article provided that EDUCATION POLICY ANALYSIS ARCHIVES is credited and copies are not sold.

Academic Freedom, Tenure, and Student Evaluation of Faculty: Galloping Polls In The 21st Century

Robert E. Haskell
University of New England 1

This is the first of four articles by Haskell on this subject. The other articles can be found at Volume 5 Number 17, Volume 5 Number 18, and Volume 5 Number 21.

Abstract: Despite a history of conflicting research on the reliability and validity of student evaluation of faculty (SEF), it has typically not been viewed as an infringement on academic freedom. When it is suggested that SEF may impinge on academic freedom, it is often considered an attack on either student rights, or on the process of evaluating faculty performance in general. Faculty and educational administrator views and surveys are reviewed as SEF is used in salary, promotion and tenure decisions. It is suggested that the literature shows that SEF infringe on instructional responsibilities of faculty by providing a control mechanism over curricular, course content, grading, and teaching methodology. It is further suggested that SEF play a significant role in current attacks on tenure, and that its role in a demographically diverse 21st century educational system has changed from its benign historical origins. It is concluded that contrary to current views, SEF is a serious unrecognized infringement on academic freedom.

Despite the long line of research on the validity of SEF, it has not typically been viewed as impinging upon academic freedom. At first glance, the very notion seems counterintuitive, if not downright mean spirited. Indeed, when it is suggested, it is seen as not only a novel idea, but as an attack on either students, or a general attack on evaluating faculty. In fact, one recent handbook for college administrators (Weeks, 1996) refers to such an idea as "a rather novel attack on the use of student evaluations in assessing a faculty member's performance." Indeed, it has generally been taken for granted that SEF is appropriate and necessary. Unlike the body of research on the methodology of SEF construction and the validity of the ratings, there is a paucity of data on the issue of its impact on academic freedom.

The likely reasons for this paucity of data are that (1) there has been little professional mention of SEF as an infringement on academic freedom, (2) some faculty are embarrassed to admit that student evaluations may influence their professional behavior in the classroom, (3) to question the right of students to evaluate faculty may be considered unprofessional if not undemocratic, (4) to question the right of students to evaluate faculty may be seen as self serving, and (5) SEF tends not to be high status research. Finally, when SEF is recognized to have an impact, unlike traditional threats to tenure and academic freedom, the deceptive appearance of SEF does not seem to warrant serious concern.

In a recent and otherwise carefully reasoned book on academic freedom (Menand, 1996), the issue of SEF is not mentioned. 2 At best, the subject of SEF has been the orphan in discussions on academic freedom with no direct logical lineage. At worst, it has been considered illegitimate. There are signs, however, indicating wider faculty recognition of this issue.
As administrative policy, the use of SEF has largely evolved in a de facto manner; this paper will review the issue in search of a more tutored policy.

It is important to note at the outset that it is not SEF per se that is the issue, but the impact of its use on salary, promotion, and tenure decisions and, equally important, its impact on the delivery of quality education. In the evolution of any policy, the accumulation of data, judgements, and arguments around an issue needs to be coalesced. As the history of legal rights demonstrates, issues not considered to have legitimate standing only come to have standing after a long process of advocacy. It is, therefore, the purpose of this paper to coalesce the arguments, data, and implications of SEF scattered throughout the literature and thus to facilitate this developmental process.

In explicating SEF, many closely related issues must be substantially bracketed. These related issues include (1) its validity (Cahn, 1987; Damron, 1996; Greenwald, 1996; Greenwald and Gillmore, 1996; Scriven, 1993; Seldin, 1984; Tagomori, Bishop, and Laurence, 1995), 3 (2) the problem of defining teaching effectiveness, 4 (3) general variables affecting SEF scores, 5 (4) alternatives to SEF, (5) classroom politically correct or popular standards and perceptions, (6) low student academic preparation, (7) age and gender discrimination issues (Feldman, 1983, 1993), (8) strategies for change, and (9) other integrally related issues such as their being largely responsible for lowered course standards and grade inflation. 6 Though these are important and related issues, they can only be addressed here in so far as they directly impact the focus on SEF and academic freedom.

While SEF was apparently first used in the early 1920's at the University of Washington, beginning in about the 1960's SEF has been increasingly used by universities in decisions on tenure and promotion.
One of the reasons that SEF was instituted, and rightly so, was for informational feedback so that faculty might be more aware of student needs. The instrument has not, however, been used just for informational feedback to professors. If this were the case, then SEF would presumably not be a problem. 7 As Cashin (1996), Director of the Kansas State University Center For Faculty Evaluation and Development, notes, "The higher education rhetoric is almost universal in stating that the primary purpose of faculty evaluation is to help faculty improve their performance. However, an examination of the systems--as used--indicates that the primary purpose is almost always to make personnel decisions. That is, to make decisions for retention, promotion, tenure, and salary increases." Herein lies the problem. 8

In comparing two studies of the same 600 liberal-arts colleges, the author found that the number of institutions using student ratings to evaluate instructors had escalated from 29 per cent to 68 per cent to 86 per cent. The author noted that no other method of evaluation has approached that level of usage (Seldin, 1993). Another survey found that most business schools now use SEF for decision making, with 95% of the deans at 220 accredited undergraduate schools always making use of them as a source of information (Crumbley, 1995). Two nationwide studies of accounting department chairpersons indicated that reliance upon SEF was second only to research publications in professional journals (Yunker and Sterner, 1988). Department chairs and deans often weigh student ratings heavily in the faculty evaluation process. Perhaps no other method of evaluation has become so sacrosanct. SEF are used not only in the U.S. but in Australia, Canada, Europe and Great Britain. Unlike in the U.S., however, in Great Britain SEF by formal questionnaire, despite apparently no formal mandate, are increasingly used, though not weighed as heavily as is information gathered by other means (Husbands and Fosh, 1993).

While considerable research, both past and present, has been conducted to assess the validity of SEF, formal reports or studies suggesting that they might impinge on academic freedom are virtually nonexistent.
There is likewise a paucity of direct hard data to support widespread but informal evidence addressing the issue of SEF as impinging on academic freedom. 9 Informal and reasoned analyses of the issue indicate that because SEF is used for faculty salary, promotion, and tenure decisions, there is pressure to comply with student classroom demands regarding teaching style, grading and a host of other demands (see below). It is suggested that it is this pressure to comply with student demands that directly leads to an infringement upon academic freedom.

SEF is not simply a salary, promotion, and tenure issue, as important as these are for individual faculty; nor is the issue simply that students evaluate faculty. As the findings of this paper suggest, SEF are not the benign instrument they may appear to be or may once have been. Their primary impact goes to the core of academic freedom and to quality of instruction.

A Brief Look At Academic Freedom

Academic freedom and tenure are two sides of the same coin. Following an initial statement of principles in 1915, the current view of tenure was established in 1940 when the American Association of University Professors (AAUP) and the Association of American Colleges (AAC) officially sanctioned it for purposes of preserving faculty's right to academic freedom. Legally, it assures faculty the right to pursue any line of inquiry in the course of their teaching or research without being censored, penalized or fired by university administrators.
10 In 1973, the Commission on Academic Tenure in Higher Education (jointly sponsored by the AAUP and the AAC) made the following recommendation:

The commission believes that "adequate cause" in faculty dismissal proceedings should be restricted to (a) demonstrated incompetence and dishonesty in teaching and research, (b) substantial and manifest neglect of duty, and (c) personal conduct which substantially impairs the individual's fulfillment of institutional responsibilities. The burden of proof in establishing cause for dismissal rests upon the institution (University of Michigan, 1994).


The AAUP's Statement on Teaching Evaluation suggests that

Casual procedures, a paucity of data, and unilateral judgments by department chairs and deans too often characterize the evaluation of teaching in American colleges and universities.... A judicious evaluation of a college professor as teacher should include: (1) an accurate factual description of what an individual does as teacher (AAUP Committee C 1975). 11

How SEF establishes incompetence or neglect of duty is problematic, having largely to do with issues of validity. To further complicate matters, the concept of academic freedom, like most abstract terms, is logically fuzzy around its edges. Moreover, unlike the legal categories of academic freedom and tenure, there is no equivalent legal category of SEF. Consequently, published legal rulings on this issue are scarce.

While academic freedom has not been recognized universally by the courts as equivalent to a constitutional right, it has nevertheless been viewed as a right which the courts have deemed must not be violated in the performance evaluation process. In addition, academic freedom has been associated with the First Amendment right of free speech. Some courts have considered it to be a First Amendment right in and of itself. Academic freedom is a "special concern of the First Amendment, which does not tolerate laws that cast a pall of orthodoxy over the classroom." While the two rights are not necessarily the same, they frequently and sufficiently overlap to trigger judicial scrutiny when the faculty performance evaluation process threatens to impinge on the First Amendment (Copeland and Murry, 1996).

Faculty Assessment of How SEF Infringe Upon Academic Freedom

Mention of this issue within the text of a number of articles throughout the literature provides a kind of uncontrolled data base indicating that an increasing number of faculty consider SEF an impingement on academic freedom.
There are, however, a few formal surveys. While the judgement of a profession's membership on a subject is not necessarily a valid indicator, especially when that judgement renders them the primary beneficiaries, in the absence of significant evidence to the contrary it would seem reasonable that a profession's judgement should be accorded considerable weight in any analysis or process leading to policy decisions.

Before reviewing what the literature seems to show is the predominant faculty view, a statement of at least one faculty member who does not think SEF impinges on academic freedom will illustrate the opposite perspective. This faculty member was quoted as saying,

Do I sense more than a little self-disdain in the comments of my colleagues who portray faculty members as cowed by the prospect of a less than enthusiastic course evaluation by their students? That is not my impression of my colleagues, and I can assure you it is not an accurate representation of me. It is my impression that faculty grade students with quality of performance as the primary, if not sole, criterion (Kresl, 1995).

At least some faculty, then, believe that SEF does not influence their behavior and is therefore not an abridgement of academic freedom.

While formal surveys of faculty views on this issue are relatively rare, one study found that at least one third of faculty respondents reported lowering their grading standards and course level in response to their student evaluations (Ryan, Anderson, and Birchler, 1980). Another study found that 64.8% of respondents believed that "student evaluation forms are responsible for lenient grading." According to another study, 39% of accounting administrator respondents admitted being aware of faculty who altered their instructional behavior in order to improve evaluation scores (Crumbley and Fliedner, 1995). Faculty were also in nearly universal agreement that SEF is important in promotion (86.6%) and tenure (88.2%) reviews (Kolevzon, 1981; see also Avi-Itzhak and Lya, 1986). 12 At the very least, such reviews are strong prima facie evidence of SEF as an infringement on academic freedom.

In two nationwide studies of accounting departments it was shown that 37% of faculty respondents were dissatisfied with their present evaluation system. Of 561 accounting faculty responding to the statement that the present system of SEF is well-designed and properly implemented, 15.3% indicated "strong disagreement," 31.7% "disagree," and 25.5% indicated a "neutral" response (Yunker and Sterner, 1988; Bures, DeRidder and Tong, 1990). 13

Unlike the paucity of formal surveys, there are numerous statements by faculty in the research literature clearly arguing that SEF is an infringement on academic freedom.
These statements by faculty contend that SEF (1) is prima facie evidence of administrative intrusion into the classroom, (2) are often used as an instrument of intimidation forcing conformity to politically correct standards (Young, 1993), (3) create pressure for a self-policed lowered teaching standard (Bonetti, 1994), (4) are responsible for a considerable amount of grade inflation (Greenwald, 1996; Greenwald and Gillmore, 1996), 14 (5) function as prescriptions for classroom demeanor (Damron, 1996), (6) when used for promotions, salary raises or continued employment, become a potent means of manipulating the behavior of faculty (Stone, 1995), (7) when salary and promotion are possible consequences, create pressure for faculty to teach in a manner that results in higher student evaluations (Damron, 1996), (8) contrary to their original intent of improving instruction, do not eliminate poor or below-average teachers but instead increase poor teaching practices (Carey, 1993), (9) illustrate a mercantile philosophy of "consumerism" (Benson and Lewis, 1994), which erodes academic standards (Goldman, 1993; Renner, 1981), (10) have thus lowered the quality of U.S. education (Carey, 1993; Crumbley and Fliedner, 1995; Young, 1993), (11) lead to the inappropriate dismissal of faculty (Parini, 1995), and (12) constitute a threat to academic freedom (Dershowitz, 1994; Stone, 1995). Finally, it would seem that SEF creates an educational conflict of interest between faculty and student, impacting on the quality of instruction. 15

At least one Faculty Senate tenure and promotion committee at a large state university, along with the ACLU (Heller, 1986), found that given the crude state of SEF and its subsequent use, such evaluations are a "de facto violation of academic freedom."
(p.14) Despite these concerns, however, there are some faculty who, given the alternative of peer evaluation, consider SEF as less harmful (Carey, 1993).

Academic freedom involves more than just external political control of course content (Schrecker, 1986), and a Scopes Trial type gag order; it entails pressure to censor unpopular subjects within the popular cultural belief systems of students. In this regard, Alan Dershowitz (1994), the well known Harvard Law School professor and appellate attorney, believes that the administrative use of SEF is a direct threat to academic freedom and to quality education. After teaching on a controversial subject, and receiving negative feedback from students, he says he "realized how dangerous it would be for an untenured professor" to teach about such subjects. He goes on to point out, however, that "Most of the students appreciated the diversity of viewpoints," and evaluated him fairly, noting that he was "very good at presenting alternative views," and "helped me get a less dogmatic view of the law," that he was "open to criticism," that the class was "the most engaging class on campus," that he was "the most intellectually honest professor I've had," and was "fair in presenting sides that usually aren't raised."

There was, however, a sizable group of students who, says Dershowitz, "used the power of their evaluations in an attempt to exact their political revenge for my politically incorrect teaching." He was lecturing on the (sensitive) legal analysis of rape. 16 He notes that


One student said that I do "not deserve to teach at Harvard" because of my "convoluted rape examples." Another argued that women be allowed an "option" not to take my class because I "spent two days talking about false reports of rape," another demanded that my "teaching privileges" be suspended. One woman purported to speak for others: "Every woman I know in the class including myself found his treatment of rape offensive and disturbing." Another woman felt "oppressed throughout the course." Although I always try to learn from my evaluations, I will not be bullied into abandoning a teaching style that I believe is best designed to stimulate thinking. (italics added, p.118-119)

Dershowitz says that it takes no courage for him to exercise his academic freedom, since he holds the rank of full professor and has tenure, but he seriously wonders whether, if he were an untenured assistant professor, he would have the same courage. He concludes:

Are other less established teachers being coerced into changing their teaching by the fear of negative evaluations, which can be fatal to tenure? You bet they are, and it poses a real danger to academic freedom and good education. (italics added, p.118-119)

Dershowitz perhaps overstates his case on the security provided by tenure for faculty on many campuses. Being at Harvard, he is not the typical faculty member. Unlike at the Harvards, the Seven Sisters, and the Ancient Ones, in the trenches of most campuses SEF is accorded inordinate weight (at least by administrations) over research. He is also "politically" protected by being so well known, not to mention that his famous cases and books have probably made him financially independent.

Classroom and Curricula Infringements Based on SEF

The following three recent examples will illustrate the extreme to which SEF can lead in terms of impinging on academic freedom.
Each exemplifies a different aspect of academic freedom being infringed upon by SEF as used at most institutions. There are likely a number of cases similar to a recent one reported in the Chronicle of Higher Education of faculty being dismissed because of negative student evaluations (Magner, 1995).

With four weeks remaining in the semester, a non-tenured faculty member was removed as the instructor of an introductory chemistry course at the University of Montana. 17 His dismissal was prompted by a student petition signed by 100 of the 200 students in one of his classes. These students maintained that he demonstrated an "inability to teach new information" and a "willful academic demoralization of students." Three different views of this situation have emerged. Some contend that he was a tough teacher who became the victim of "whiny" students looking for an easy instructor to spoon-feed them the course material. Supporters maintained that many students in the course, planning professional or graduate programs, feared that a low grade in a difficult subject would diminish their chances of admission. Others maintained he ignored student questions and made tests more difficult than they needed to be.

The faculty member maintained he received good student evaluations in the past, with no problems in his other courses. He believes the reason for the complaint is that he is a demanding teacher who requires students to go beyond simple memorization and regurgitation of facts on his exams. He says he has "never knowingly belittled a student." He sees an anti-intellectualism at work, maintaining that "There is an attitude on...campus that any display of learning or erudition is arrogance." He also maintains that "it's not the students' prerogative to decide who teaches a course, and by extension, how that course should be taught."


Administration is quoted as saying, "We're an open-admission university. A very large fraction of the class was completely unable to compete." As it turns out, the previous year the administration warded off a similar student petition against a tenured full professor of chemistry. In that case, the administration says they "clearly felt we were dealing with some people [students] who wished to get better grades for less work." Granted, all the facts are not in on this case. The faculty member was replaced with a professor who is popular with students. Nevertheless, it is clear that student evaluations were the primary mechanism leading to a faculty member's dismissal when, whatever the facts are, they are at least open to serious question.

As well as academic freedom in the classroom, other cases affect tenure, promotion, and more importantly, curricula. A faculty member (Goldman, 1993) at Wichita State University notes that in a thirty-faculty department which is responsible for certification of teachers, in the past 25 years six faculty have been hired as assistant professors to teach Foundations of Education. Although all were apparently well qualified, having received their doctorates from excellent universities, only one of these faculty was awarded tenure; none was promoted. According to Goldman (1993), the reason for this is student evaluations.

In general, as the data show, required courses hold less interest and receive lower evaluations than elective courses; in general, students who are drawn to become teachers are concrete-sequential, and are less interested in the abstract and theoretical content of the foundations of education course. This leads the faculty who teach the foundational course to receive lower student evaluations than other education faculty.
Because student evaluations are often the major, if not the only, gauge of teaching quality, and since teaching quality usually outranks research and scholarly productivity on most campuses, when tenure, promotion, and salary increases are awarded, these rewards will not be evenly distributed to the foundational, educational psychology faculty. Rewards will accrue to the concrete-oriented methods faculty whose courses will further intensify the concrete orientation of teacher preparation. And the downward spiral will continue. 18

The third example of SEF as infringement on academic freedom reportedly occurred in a business department (in Crumbley, 1995). A rigorous instructor teaching a basic business class gave D's and F's and received SEF scores in the one range (on a 5-point scale). As a consequence, she was removed from teaching that class and assigned to a non-required graduate course where she proceeded to give a 50-50 split of A's and B's each semester. She received SEF scores of up to 4.9 after making this adjustment. She informs the students at the start of the semester that only A's and B's will be given. This administrative strategy of assigning "tough" grading faculty to non-required courses allows students to force easy grading by self-selecting away from the more stringent graders, and thus "censors" certain instructors' courses and by implication the content of what they teach.

These examples suggest how SEF can be used to shape faculty behavior, curricular content, and the kind of faculty that are retained in programs. They also suggest how SEF is selectively used as a primary mechanism of dismissal of faculty who do not conform to student and administrative demands. Though a faculty member may (a) receive good evaluations in other courses, and (b) receive good evaluations from the majority of students, a few negative evaluations may be determinative of disciplinary action toward faculty.
The examples also raise the issue of who and what determines academic standards. On campuses with "open admissions," meaning a lower level of student in courses than has historically been the case, the implication is that an otherwise competent instructor is required to adapt his or her course to whatever level of student is enrolled, or be subject to the above consequences based on student dissatisfaction. Regardless of the specifics, the case illustrates the direct aspects of student evaluations affecting educational standards and raises the question of academic freedom of faculty in the classroom.

Legal Considerations of SEF


There are multiple latent legal issues engendered by SEF. Unlike academic freedom and tenure, as a legal search category SEF does not exist. 19 In a subsequent paper I will coalesce and deal more fully with legal rulings involving SEF in the denial of reappointment and tenure. While the following ruling is Canadian and has no standing in the U.S., it is nevertheless initially relevant, especially given that Canadian protection of free speech is more narrow than in the U.S.

Under Canadian law, a situation recently developed at the University of Regina where a faculty member was denied tenure partly on the basis of his student evaluation scores (Education Employment Law News, 1994). 20 A board of arbitration ruled that

tenure decisions could not be based solely on assessments which were completed by students who had never been made aware of the ramifications of their statements....To base serious career decisions narrowly on student evaluations is not to be encouraged..... This is particularly so when...the students are not advised of the potential use of the evaluation tool... [I]f evaluations are to be used for serious career development purposes those completing them should be aware of the potential consequences of their participation.

At least one similar ruling exists under U.S. law.

At the University of Guam, a ruling to remove anonymous student evaluations from professors' tenure files was handed down by an arbitration board as the result of a rare challenge to the use of such evaluations in tenure and promotion decisions (Blum, 1990). The action was in response to a grievance filed by the university's faculty union, the Guam Federation of Teachers, which is affiliated with the American Federation of Teachers (AFL-CIO). The union said the use of SEF violated the union's contract with the university, which provides that anonymous documents or those "based on hearsay" should not be included in a faculty member's file.
The union also argued that the university improperly interpreted the data from the evaluations.

Some of the issues here are (1) students not being made aware of the purpose and ramifications of their evaluations, (2) the anonymous nature of student evaluations, (3) the invalid analysis of SEF, and therefore, (4) SEF in effect being anecdotal and hearsay data. Since most SEF results are prepared anonymously, an instructor has no recourse to confront his/her evaluators. As will be addressed below, the anonymous nature of SEF is beginning to also be questioned by arbitration boards.

After reviewing the research, a Director of the Office of Educational Assessment at one large state university (Gillmore, 1984) concluded that, "If student rating are to qualify as evidence in support of faculty employment decisions questions concerning their reliability and validity must be addressed." (p.561) Perhaps the most comprehensive review and statistical analysis of SEF validity is that of Greenwald (1996).

The philosopher of science Michael Scriven (1995, 1993, 1991), who has conducted extensive analyses on faculty evaluation methodology, suggests of faculty evaluation in general that,

All are face-invalid and certainly provide a worse basis for adverse personnel action than the polygraph in criminal cases. Based on examination of some hundreds of forms that are or have been used for personnel decisions (as well as professional development), the previous considerations entail that not more than one or two could stand up in a serious hearing. 21

The cost of lawsuits notwithstanding, given the above, it is surprising that there have not been more lawsuits by faculty. 22 It appears, however, that in regard to faculty evaluation in general the courts have not been concerned with validity issues, though this may be changing as well (Rebell, 1990).

Releasing SEF to Students and the Public

In exploring possible legal implications of SEF, it should be made clear that I am not an attorney and approach this section on the basis of the "reasonable man" legal standard. To begin, some faculty believe that due process and defamation issues are involved in SEF (Crumbley, 1996). It has been suggested that faculty are entitled to at least the same rights as students. The Fourteenth Amendment requires, for example, due process before a public institution may deprive one of life, liberty, or property. Given the problematic nature of SEF, due process is in question. In a university, a faculty member's reputation is considered a liberty right, and for tenured faculty the courts have pronounced the possession of tenure a property right. Presumably, any inappropriate action depriving faculty of these rights would be open to legal action.

Though it is illegal to post a student's grades using a social security number or date of birth, 23 on the majority of campuses scientifically questionable SEF and other anecdotal student remarks about faculty teaching are not only used in determining faculty salary increases, promotion and tenure decisions, they are openly published on some university campuses and sanctioned by some administrators and state government officials. In what many faculty see as an outrageous attempt to control the academic classroom, some state governments have sanctioned the release of SEF to the campus community and in some cases to the general public by publishing faculty student evaluations on the university's world wide web pages, thus making them not only available on campus but globally.

Case in point: At the University of Wisconsin, the Chancellor refused to release the SEF, citing a statute allowing personnel evaluations to be withheld from public view. The students took the chancellor to court.
However, after being advised to do so by the state's Attorney General, citing Wisconsin's open-records law, the University of Wisconsin's campus will open students' evaluations of professors for public view. To the credit of the student and faculty senates, they passed resolutions in support of the Chancellor's refusal, and the university's lawyer concurred. Despite these resolutions, the Attorney General disagreed, writing that "the requested records are public records and the university's stated reasons for withholding access do not outweigh the public interest in the records" (Chronicle of Higher Education 1994a, 1994b).

Other schools also published SEF. One recent survey of accounting departments found that 11.4% of the respondents indicated that SEF scores are made available to students (Crumbley and Fliedner, 1995). Indeed, a search using "faculty evaluation" on the world wide web will return numerous examples of published SEF. All this while faculty are restricted from divulging information on students (see Pennsylvania State University, 1996). Articles are, however, beginning to appear that question the legality of publicly releasing SEF (Robinson and Fink, 1996).

It has been suggested that if a university damages a faculty's reputation by publishing false and anecdotal data from SEF, faculty should be able to sue for libel or defamation. The concept of defamation typically refers to communication that causes a person to be shamed, ridiculed, held in contempt by others, or their status lowered in the eyes of the community, or to lose employment status or earnings or otherwise suffer a damaged reputation. Legally, while defamation is governed by state law, it is limited by the First Amendment (Black, 1990). 24 According to one source, however, the courts have generally protected administrators from defamation charges resulting from performance evaluations (Zirkel, 1996).
It would seem, however, that these older precedents applied when administrative evaluations were conducted in private and not publicly distributed.

University administrators are often allowed to release SEF to students when the release of


personnel information is apparently allowed in no other phase of personnel or other key management functions. An Idaho ruling upheld the release of SEF to students by reasoning that students were not the general public and therefore faculty evaluations were not protected under the privacy rights of the Idaho Code (Evaluating Teacher Evaluations, 1996). Given such apparent breaches of confidentiality and privacy, it will be instructive to see how the courts will continue to rule. It would seem that a university should be held responsible for insuring that data made public are valid.

Finally, in typical personnel evaluations, professional validation studies are not permissible unless shown by professionally acceptable methods to be "predictive of or significantly correlated with important elements of work behavior which comprise or are relevant to the job or jobs for which candidates are being evaluated." In Title VII of the Civil Rights Act of 1964 the employer must meet "the burden of showing that any given requirement (or test) has a manifest relationship to the employment in question" (in Griggs v. Duke Power Co., 401 U.S. 424 (1971)). "In view of the possibility inherent in subjective evaluations, supervisory rating techniques should be carefully developed, and the ratings should be closely examined for evidence of bias" (EEOC Guidelines, 29 CFR 1607.5 (b) (4), in Crumbley, 1996).

SEF As Social Judgement and Diagnosis

Given courts' uncritical assumptions regarding the validity of faculty evaluations in general, and the untrained judgement of those making decisions, it would seem appropriate to address the issue of judgement and decision making in SEF. Frequently, in interpreting SEF the method is simply "eyeballing" them.
The "actual" consistency of findings of SEF validity at this point in time notwithstanding, one judge (Rebell, 1990) clearly points out, "because the state of the art concerning teacher-evaluation practices is at a sensitive developmental stage, extensive court intervention at this point can substantially influence---for better or worse---the future direction of basic practice in the field." (p.344) He goes on to say that seen in this light, it is clear that whether increased judicial intervention will have a positive or a negative impact on professional evaluation practice lies more with educators and psychometricians than with the judges, meaning that faculty control over future rulings depends upon providing clear evidence to the courts.

Thus, part of influencing future court rulings is demonstrating relevant research. With the typically unsystematic method by which SEF are interpreted, and the considerable weight accorded selective negative comments by a few students in making tenure and promotion decisions, the meaning of SEF (i.e., diagnosis) as they are applied to the analysis of teaching effectiveness is, in fact, all too frequently an intuitive one.

As to the issue of the validity of findings in SEF, as indicated at the opening of this paper, more recent and rigorously analyzed statistical findings cast serious doubt on their validity. The research literature on judgement and decision making would seem applicable to both students making such judgements and to those interpreting (i.e., diagnosing) the results, especially given the intuitive and non-data based foundations of such judgements and decisions.
In the course of reviewing the research over the years on social judgement and clinical diagnosis, it seems clear that the manner in which nearly all SEF data are analyzed is a subset of the research on social judgement and clinical diagnosis, with both student and interpreter being involved in the same logical and cognitive biases and distortions that result in the pervasive low accuracy level of social judgement in general and clinical diagnosis in particular.

Psychological research has recognized the severe cognitive problems and limitations of "intuitive" and "experience-informed" everyday judgements for over thirty years (Dawes, Faust, and Meehl, 1989; Faust, Guilmette, Hart, Arkes, Fishburne and Davey, 1988; Garb, 1989; Hayes, 1991; Larkin, McDermott, Simon, and Simon, 1980; Rabinowitz, 1993), yet the mistakes


continue in everyday practice situations. Interpretations of SEF are no different. As Franklin & Theall (1990) independently observe relative to SEF,

Even given the inherently less than perfect nature of ratings data and the analytical inclinations of academics, the problem of unskilled users, making decisions based on invalid interpretations of ambiguous or frankly bad data, deserves attention. According to Thompson (1988, p. 217) "Bayes Theorem shows that anything close to an accurate interpretation of the results of imperfect predictors is very elusive at the intuitive level. Indeed, empirical studies have shown that persons unfamiliar with conditional probability are quite poor at doing so (that is, interpreting ratings results) unless the situation is quite simple." It seems likely that the combination of less than perfect data with less than perfect users could quickly yield completely unacceptable practices, unless safeguards were in place to insure that users knew how to recognize problems of validity and reliability, understood the inherent limitations of rating data and knew valid procedures for using ratings data in the contexts of summative and formative evaluation. (79-80).

The authors conclude by noting "It is hard to ignore the mounting anecdotal evidence of abuse. Our findings, and the evidence that ratings use is on the increase, taken together, suggest that ratings malpractice, causing harm to individual careers and undermining institutional goals, deserves our attention." (pp. 79-80). Recognizing such problems is not methodological nit-picking; they are pragmatic and fundamental.

It would therefore be a reasonable recommendation to suggest that those researching the validity of SEF, and especially those interpreting SEF, familiarize themselves with the psychological social judgement and clinical diagnosis literature. Few states or courts require evaluators to be trained in evaluation methods (Rebell, 1990).
25 It is these cognitive problems that need to be addressed in the SEF literature prior to otherwise statistically sophisticated analyses of the data being performed, when rigorous methods are used.

SEF and Administrative Control of Academic Freedom

In addition to legal aspects, there are the pragmatics of SEF which maintain its use. The literature clearly suggests that administrators tend to strongly oppose the elimination of SEF being used for faculty salary, promotion and tenure decisions. There are three reasons for opposing the elimination of SEF. The first seems to be a lack of practical alternatives to SEF (Greenwald, 1996), 26 the second is administrative control, and the third is that student input facilitates student retention in numerous ways.

SEF provides a mechanism of control in a system otherwise lacking direct control over faculty, and is a powerful tool in assuring classroom changes that lead to the retention of student tuition dollars by assenting to the consumer demands of students and of parents who foot the tuition bill. Academic issues such as teaching, grading, curricular requirements, and other academic standards have by tradition and expertise been the exclusive province of faculty. Although academic freedom and the protection of tenure would appear to insure faculty classroom independence, the extent to which faculty thought and behavior are administratively shaped is the extent to which both are infringed upon (see below). 27 Such administrative control mechanisms, of which SEF is one, are therefore seen as infringements on academic freedom.

Control mechanisms are more widespread and intricately embedded in the everyday operations of the university than is generally acknowledged. After a review of the research, Stone (1995) observes that SEF opens the "door to the direct application of bureaucratic control to


academic decisions. It is the very kind of policy that, for example, has enabled educational administrators to mandate the 'politically correct' at the expense of the 'academically credible.'" He further observes,

In great part, the reason for this outcome is that the faculty comprising colleges, departments, and programs are subjected to the same incentives that impact the institution as a whole. Chief among these incentives is internal flow of institutional resources. The allocation of institutional resources is administratively controlled in such a way that the academic units with the greatest enrollment growth are generally afforded the greatest resources. The incentives are such that no individual faculty member can comfortably question activities that build his or her department's or college's budget. After all, the job they save may be their own. The same can be said for peer evaluations of faculty for promotion, tenure, and merit pay. Faculty typically find it difficult to negatively evaluate colleagues who are high credit hour producers and who have high student ratings even if their academic standards are suspect. Credit hours and student ratings are routinely and carefully monitored by the administrative bureaucracy. Learning outcomes are given much less attention. Units that fail to grow not only remain under supported, they are sometimes cannibalized, and their share of the budget is channeled to the areas demonstrating greater potential for growth.... The internal competition for resources shapes individual faculty behavior as well.... They are, therefore, in the interest of self-preservation, especially inclined to accommodate academic standards to the student market.

Stone concludes that administrative influence, including the use of SEF, has become so effective in shaping faculty that their collective voice has been seriously compromised.
28 It should be noted that some administrators and university presidents do recognize the importance of tenure and support its continuance (e.g., Cotter, 1996).

Contrary to some published reports, conditions such as weakened standards, fragmented curriculum, and inflated grades do not simply arise from a spontaneous deterioration of faculty into so-called "deadwood." Neither are most faculty incompetent at teaching. As Stone (1995) points out, "Rather, these problems seem likely to have developed as a result of the continuing insidious pressure placed on teaching and grading practices by the imperative to keep students happy and enrollments up." While most students are positively disposed to SEF, not all students are so oriented. 29

Many students understand the consequences described above. A glance at articles from online student newspapers reveals strong sentiments against what many students consider the erosion of standards created by SEF. One student writer went so far as to say "We therefore suggest a boycott of the 1995 student/teacher evaluations. This boycott will provide a more effective means of communication than anything written on the evaluation itself. Something must be done about the trend of grade inflation. We as students refuse to contribute to the downfall of academia" (Stern and Flynn, 1995). Some students are thus quite aware of the effects of SEF on their education. The foregoing is therefore something to consider seriously in the light of the principle of academic freedom.

Faculty Complicity and Adaptation to SEF

There is the argument that academic freedom is not abridged if faculty voluntarily participate in the SEF process. This argument is problematic for three reasons. First, on most campuses SEF is required by administrative policy, and is therefore neither individually nor


collectively voluntary. 30 Second, as noted by Stone (1995) above, given its mandated use in salary, promotion and tenure decisions, SEF creates significant pressure to conform to administrative and student demands. The third is that the second condition does not seem to meet the requirements of what is legally considered "informed consent" in a contractual agreement.

Given that faculty behaviors, like most people's, are subject to reward structures, it is not surprising that faculty would tend to teach to the evaluations and self-limit their academic freedom prerogative. From a purely psychological/behavioral learning theory perspective, when a faculty's livelihood is at stake, it is understandable why many will adjust their grading and course content level in order to receive a good evaluation. It is also predictable: behaviors that are rewarded tend to increase the probability of their reoccurrence. 31

Similarly, as two research studies note (Crumbley, 1995; Nelson and Lynch, 1984), from an economic business accounting point of view,

If a faculty can choose teaching styles, grading levels, and course content, s/he will naturally prefer choices that are expected to result in higher SEF scores; if faculty know the variables affecting their careers, they will meet these criteria.

There are laws that regulate one's financial statements so as to reduce the manipulation and opportunistic behavior regarding income, yet there is no regulation of SEF. Faculty "have a high incentive to manage SEF, even more so than managers have the incentive to enhance earnings" (Crumbley, 1995). Unfortunately, student ratings tend to discourage instructors from espousing views that might offend popular student prejudices. "Expecting that even highly professional individuals will disinterestedly adhere to academic and intellectual ideals in the face of pervasive incentives to do otherwise is not realistic." (Stone, 1995)

Even one of the strongest advocates of the validity of SEF (Seldin, 1984) warns: "The confidentiality of the data must at all cost remain inviolate. If data are shared, it must be with the consent and at the discretion of the appraised professor. For data surreptitiously to be used for personnel decisions, it will have an immediate chilling--even fatal--effect on the credibility of the entire evaluation program." (p.129) Certainly, when SEF are published, not only confidentiality but also trust has been violated. Again, given the above pressures, it would seem that "voluntary" faculty participation in the SEF process does not meet the requirements of what is legally defined as informed consent. What two legal scholars (Copeland and Murry, 1996) noted in a different context is directly applicable here: "Of course the right to academic freedom means very little if by exercising it the educator suffers, or runs the risk of suffering, financial consequences." (p.249)

SEF As Trojan Horse In The Movement to Eliminate Tenure

There is a final and even more serious threat that SEF can be said to pose to academic freedom than the simple act of student pressure and administrative control of rewards based on them. This last threat is not generally recognized. This threat is the use of SEF in a continuing national movement against tenure and to reduce costs. For about the last twenty years there has been considerable criticism of the concept of tenure. The public has always viewed the purpose of academic tenure as job security, just like seniority in a trade union. And as with unions, the perception is that job security leads to lowered productivity. Accordingly, there have been continuing periodic attempts to eliminate tenure. More recently, however, these attacks have increased.
The attacks come from all levels of society and from inside academia itself, including boards of trustees and state legislatures attempting to either eliminate or revamp academic tenure, the latest being the widely publicized bid by the University of Minnesota Board of Regents (Guernsey, 1996; Healy, 1996; Magner, 1996; The Chronicle of Higher Education 1995).

Typically, the arguments against tenure have been economic. With shrinking budgets,


these old economic arguments have taken on an even more bottom line approach to education, arguing, at the least, for the revamping of tenure for the purpose of enabling schools to become programmatically and fiscally more flexible in a changing world. While views of tenure are changing, including those of the AAUP, traditionally the AAUP has primarily held that "Institutions of higher education are conducted for the common good and not to further the interest of either the individual teacher....[and]...The common good depends upon the free search for truth and its free exposition. Academic freedom in its teaching aspect is fundamental for the protection of the rights of the teacher in teaching and of the student to freedom in learning" (AAUP, 1970).

At least one court has ruled, "The purpose of tenure is to protect academic freedom---the freedom to teach and write without fear of retribution for expressing heterodox ideas" (in Copeland and Murry, 1996, p.250). 32 Herein lies the point: academic freedom can only be guarded within a system of tenure. 33 Certainly tenure does not guarantee academic freedom. Nevertheless, though tenure is not a sufficient condition, it is a necessary condition for academic freedom.

A tenure system makes it procedurally difficult to fire faculty who may disagree with popular beliefs. There is a long history, continuing to the present, of professors being fired because of their unpopular critiques. The classic examples, of course, include the 1925 trial of John T. Scopes, the Tennessee high school teacher who was fired for teaching the Darwinian theory of evolution. In the 1950s, we saw Senator Joseph McCarthy's communist witch hunts that resulted in professors being fired and/or blacklisted. During the 60's and 70's entire departments were wiped out or put in special trusteeship because of faculty critiques of the Vietnam War.
More recently, "political correctness" about gender, race, and other issues, both on the left and the right, has led to the firing of professors, both tenured and nontenured. So even tenure is not an absolute guarantee for academic freedom.

Part of maintaining safeguards to academic freedom is that the university be insulated (not isolated) from the popular culture. To understand tenure, then, it is necessary to understand that the university is a unique organization and in its most important aspects cannot be compared or likened to other organizations in society. First, unlike the business corporation, the university should not be consumer oriented (though there are, of course, certain economic realities). The role of the university is leadership, not as a servant of consumer (i.e., student) demands.

The university tenure function insures one of the only places in society where any issue, no matter how unpopular or unorthodox, can be critically examined in open dialogue without consideration of the political cost, without fear of reprisal, without the pressures of social taboos, social norms, faddish movements, personal notions of etiquette, and other immediate pragmatic pressures that exist in the culture in which it is embedded. Above all other roles, this is the defining feature of the university in a democratic society. Without this role, there is no place where truth--with either a capital "T" or lowercase "t"--can be sought. Once the university becomes indistinguishable from the popular culture, or becomes political, it ceases to be a credible place for the creation and dissemination of knowledge. 34

If tenure is compromised, we will no longer have a university as we know it. It may look like a university, but it will not in fact be one. 35 It will be something else.
If we want to play chess without using the King, we can certainly elect to do so, but whatever game we are then playing, let us be clear that it is not chess. 36

Undergirding this push to eliminate or revamp tenure lies a metaphor that has been transformed into a literal concept: The university as a business entity. The university considered as a business carries with it the attendant and associated ideas of students as consumers in an educational market place, fiscal efficiency, and being a servant of the consumer community in which it is physically embedded. (See Sommer, 1995) The fact is, a university is not like a business. Some courts have recognized the inappropriateness of this metaphor. One has noted


(EEOC v. Franklin & Marshall College, 1985),

I do not agree with the majority's assumption that academic institutions are the same as any other employer. At least insofar as their administrative and governance structures are concerned, colleges and universities differ significantly from garden variety private employers. In the context of application of the provisions of the National Labor Relations Act the Supreme Court has counseled that "principles developed for use in the industrial setting cannot be 'imposed blindly on the academic world.'" NLRB v. Yeshiva University, 444 U.S. 672, 681, 100 S.Ct. 856, 861, 63 L.Ed.2d 115 (1980) p.120

Academic tenure, then, is not like seniority in business, civil service, or a union where the purpose of "tenure" is for the protection of the worker. Unlike these organizations, academic tenure is for the protection of the education function, not individual faculty jobs; it is for the protection of the role of the university in a democratic society. In fact, the metaphor of student as consumer is more appropriately replaced by the metaphor of student as worker or apprentice. Enter SEF.

SEF has become the newest weapon in the attack on tenure. But the assault is not a frontal one. SEF is increasingly used in evaluation of faculty for tenure and low evaluation is taken as prima facie evidence of teaching ineffectiveness, thus justifying not awarding tenure to a faculty. A recent legal handbook for college administrators (Weeks, 1996) says, "Student ratings are an effective tool to respond to litigation...[and]...They can become important evidence if a college should decide to terminate a tenured professor for reasons of competence."
Of course findings from SEF may indeed validly describe incompetence, but given (a) the conflicting data on their validity, (b) the way many institutions have constructed SEF instruments, (c) the often unsystematic statistical method by which SEF are interpreted, and especially (d) the considerable weight accorded negative comments by only a few students in making tenure and promotion decisions, it would seem SEF can all too easily be used as a covert instrument for the elimination of tenure candidates and other faculty who may threaten student tuition dollars and perhaps ideological and popular culture agendas.

As noted above, SEF can be a powerful administrative control mechanism on granting tenure. Abolishing tenure also lowers operating costs. Older faculty who may be unpopular, and who cost more, can be fired and younger, cheaper faculty can be hired. It is often older faculty who consistently receive lower student evaluations than younger faculty (Feldman, 1983). McMurtry (1991) has noted that education has always been subject to external pressures whose purpose is to subordinate it to vested interests of various kinds, whether it is slave-holding oligarchies, theocratic states, political parties or the prevailing dogmas of collective beliefs. The difference today, however, is that threats to academic freedom come from within: the consumer student. 37 As the well known scholar David Reisman (1981) noted years ago in this regard, "This shift from academic merit to student consumerism is one of the two greatest reversals of direction in all the history of American higher education; the other being the replacement of the classical college by the modern university a century ago." (p.xi) Within a consumer model of education, to deny consumer demands is difficult. While it is difficult enough to deal with political, ideological and economic pressures, dealing with consumer pressures has become nearly impossible.
If denying fiscal efficiency is viewed as unreasonable, irresponsible, and even irrational, to deny "consumers" their demands is viewed as undemocratic and downright mean spirited.

It might be asked: if tenure is not joined at the hip with academic freedom, why are Supreme Court Justices granted lifetime tenure? It might be asked, too, why it is that virtually no reasonable person suggests revamping academic freedom, yet many have little qualm--if


any--about eliminating or at least about revamping tenure? The short answer is that for many, tenure has come not to be seen as crucial to academic freedom. The longer answer is that at least in principle, revamping academic freedom would not be acceptable in a democracy. At worst, it would be viewed as the equivalent of the Gestapo busting in the front door of the university (or as a kind of Tiananmen Square of higher education). What is not widely understood is that SEF is often a kind of Trojan Horse in the battle against tenure and academic freedom. 38 It often becomes a stealth mechanism by which to covertly abrogate both tenure and academic freedom.

SEF and Academic Freedom in the 21st Century

As noted above, arguments against tenure have typically been economic ones. As higher education enters the 21st century, with its associated demographic changes, however, arguments against tenure are changing. A paradigm shift is taking place in arguments against academic freedom and tenure, a paradigm that is based in the changing demographics of the student population. It is said that academic freedom tends to be viewed from the perspective of a bygone era when the university faculty and student population were relatively homogeneous. Accordingly, eliminating tenure or at least radically revamping it is increasingly being justified not on matters of principle but by political and other expedient considerations.

It is currently suggested that eliminating or revamping tenure would be fairer to minorities, to the unemployed, and to part-time faculty (Murray, 1996; Parini, 1995; Wilson, 1996), presumably by opening up faculty positions for young professors and minorities. Thus critics of tenure maintain that a university going into the 21st century can no longer afford the luxury of the traditional view of academic tenure. While these are valid problems, changing the parameters of academic freedom and tenure is a dangerous way to solve them.
To the contrary, in a 21st century global society, academic freedom and tenure become more important than they have been in the past, not less. This is why:

With increasing diversity in both the general culture and the university campus, including not only diversity among the student body, but diversity among the faculty, comes increasing conflict of ideas, values, and perspectives. To manage these changes, increasing attempts to reduce the conflict by limiting freedom of speech and action in the name of protecting minorities are being implemented. For example, it has been charged (Schrank, 1993) that,

As now defined, academic freedom...ignores the intersubjectivity of all persons in the setting... such an approach conceals the vulnerability of women and other historically excluded groups who are still marginal in the academy, and does not take account of the historic advantages enjoyed by white heterosexual, able-bodied does not acknowledge power imbalances in relations based on gender, race, sexuality, class, and other dimensions of difference....An exclusive focus on words and ideas can obscure the fact that we are talking about whole people. For many people, ideas and words may well be referents for highly significant experiences which have powerful meanings, evoke strong emotions, and are not simply ideas and words. The meanings and emotions attached to ideas and words render people extremely vulnerable in what, for others, might be a benign intellectual exercise....The argument for including civility recognizes that ideas are not separate, public, and objective but real, emotional, and personal.

Certainly, insensitivities in the classroom are to be discouraged. What this quote reflects is a fundamental paradigm shift in the parameters of academic freedom as historically conceptualized.
SEF can and do reflect these and other political and cultural conflicts, creating what the courts in other contexts have called a "chilling" effect on academic freedom (Recall the


above emotional evaluation by students described by Alan Dershowitz).

This paradigm shift in the parameters of academic freedom, however, is but a subset of a more overarching social shift in the First Amendment right to free speech outside of academia. In an extensive article in a noted law review journal (Matsuda, 1989), the author opens with the statement, "This article rejects an absolutist first amendment position." (p.2321) Such views argue for restricting academic freedom on the basis of the feelings of various and changing constituencies. The problem, as Dershowitz (1994) points out, "is that everyone has a different...exception, and in a nation of equal protection, it is difficult to pick and choose among the proffered exceptions. If we were to accept them all, there would be little left of the First Amendment." (p.41) In academia, this means there would be little left of academic freedom and quality education. 39

Contrary to popular views, tenure and the traditional concept of academic freedom have not simply protected the majority; they have protected and will protect the newly entering minorities. Indeed, women's and other minority studies programs in the university largely owe their very existence to the traditional principle of academic freedom. As Nadine Strossen (1995), head of the ACLU, points out in her well-documented book, restrictions on free speech originally meant to protect minorities often end up being used against them.

In reviewing legal cases involving SEF in denial of tenure, it is clear that SEF can be used as an instrument of covert discrimination against minorities. As a legal handbook (Weeks, 1996) for college administrators points out, "Student ratings are an effective tool to respond to litigation involving discrimination, since they provide both qualitative and quantitative data on teaching."
The handbook was suggesting that SEF can be used to show that the institution was not discriminating on the basis of gender or race but was instead acting on the basis of SEF. In fact, courts have taken the position that even if discrimination can be shown, if it can also be shown that the faculty member is "really" being fired because of poor student evaluations of their teaching, the discrimination is not relevant. While prima facie this is a reasonable judgement, disentangling gender and race bias from student evaluations is extremely problematic given the way evaluation forms are typically constructed, analyzed, and applied.

In conclusion, what this article suggests is that SEF is far from the benign instrument it may once have been in a more homogeneous political, gender, racial, and academically prepared student environment. Unfortunately, on many campuses the traditional model of student and teacher belongs to a past age. 40 Faculty now teach in a litigious context. The new role and impact of SEF need to be reassessed accordingly.

Summary

Validity: As currently used (and perhaps under more stringent conditions as well), the validity of SEF as used in salary, promotion, and tenure decisions is in question relative to methodology: (1) untrained interpreters using (2) intuitive and (3) eyeballing methods of analysis to analyze SEF, (4) not controlling for contaminating variables such as (a) level of course, (b) instructors' standards, (c) grading practices, (d) subject matter or discipline, (e) personality and (f) interest of student, (g) academic level of student, (h) required course vs. elective course, (i) class size, (j) age and (k) gender of instructor, and (l) a host of other variables. (5) Validity is also in question regarding student ability to validly render judgements regarding instruction and curriculum.
Administrative Pressures To Relinquish Faculty Control: (6) SEF are typically required by administrative policy, (7) are therefore involuntary, (8) are used for salary, promotion, and tenure decisions, which allow (8) administrative intrusion into the classroom, (9) create economic incentives for shaping faculty behavior to assure (10) assent to general consumer demands for the type and level of education, (11) assuring classroom changes in accordance with educational fads, (12) political ideology, (13) to teach in a manner that results in student satisfaction, (14) leading to the retention of students, (15) their tuition dollars, resulting in (16) institutional growth.

Control Over Academic Standards: Given the above, pressures are created (17) for faculty to assent and adjust their teaching to whatever level of student the (open enrollment) institution elects to admit, (18) for conforming to student demands in the classroom by lowering standards, which include pressures (19) to lower curricular changes, (20) to make tests less rigorous, (21) to inflate grading, (22) to set fewer classroom requirements and prerequisites, (23) to offer easier and less course content, and (24) to adjust courses to popular culture beliefs and notions. (25) This sets up a conflict of interest between the instructor and quality of education, all of which result (26) in the opposite of the original intent of SEF, which was the improvement of instruction.

General Legal Implications: Depending on their use, SEF (27) are often in conflict with usual personnel practices and procedures, (28) when published can lead to defamation of faculty reputation, (29) can be discriminatory with regard to age, gender, race, and other variables, (30) are involuntarily imposed, and (31) do not meet what is considered contractual informed consent of faculty for their use.

Academic Freedom and Tenure: SEF can be used to (32) inappropriately dismiss competent faculty, (33) abrogate tenure, and (34) abridge academic freedom.

References

Abrami, P. C., Dickens, W. J., Perry, R. P., & Leventhal, L. (1980). Do teacher standards for assigning grades affect student evaluations of instruction? Journal of Educational Psychology 72, 107-118.
Avi-Itzhak, T., and Lya K. (1986).
An investigation into the relationship between university faculty attitudes toward student rating and organizational and background factors. Educational Research Quarterly 10, 31-38.
Benson, D. E., & Lewis, J. M. (1994). Students' evaluation of teaching and accountability: implications from the Boyer and the ASA reports. Teaching Sociology 22, 195-99.
Blum, D. E. (1990, October 3). U. of Guam removes evaluations from files. The Chronicle of Higher Education, Section: Personal & Professional, p. A21.
Bonetti, S. (1994). On the use of student questionnaires. Higher Education Review 26, 57-64.
Bures, A. L., DeRidder, J., and Hsin-Min Tong. (1990). An empirical study of accounting faculty evaluation systems. The Accounting Educators' Journal (Summer), 68-76.
Cahn, S. (1987, October 14). Faculty members should be evaluated by their peers, not by their students. Chronicle of Higher Education, p. B2.
Carey, G. W. (1993). Thoughts on the lesser evil: student evaluations. Perspectives on Political Science 22, 17-20.
Cashin, W. E. (1996). Developing an effective faculty evaluation system. IDEA Paper No. 33. Manhattan: Kansas State University, Center for Faculty Evaluation and Development (January).
Cashin, W. E. (1995). Student ratings of teaching: The research revisited. IDEA Paper No. 32. Manhattan, KS: Kansas State University, Center for Faculty Evaluation and Development.
Chacko, T. I. (1983). Student ratings of instruction: A function of grading standards. Educational Research Quarterly 8(2), 19-25.
Cohen, P. A. (1981). Student ratings of instruction and student achievement: A meta-analysis of multisection validity studies. Review of Educational Research 51, 281-309.
Copeland, J. D., and Murry, J. W., Jr. (1996). Getting tossed from the ivory tower. Missouri Law Review 61, 233-327.
Cohen, P. A. (1983). Comment on "a selective review of the validity of student ratings of teaching." Journal of Higher Education 54, 448-458.
Cotter, William. (1996). Why tenure works. Academe (Jan/Feb), 26-29.
Crumbley, L. (1996). Due process/defamation issues [Not to be construed as legal advice]. Society for A Return to Academic Standards. [Online] Available:
Crumbley, L. (1995). On the dysfunctional atmosphere of higher education: games professors play. Accounting Perspectives 1 (Spring). [Online] Available:
Crumbley, L. D., and Fliedner, Eugene (1995). Accounting administrators' perceptions of student evaluation of teaching (SET) information. Manuscript, Department of Accounting, Texas A&M University.
Damron, J. C. (1996). Instructor personality and the politics of the classroom. Manuscript, Douglas College, New Westminster, British Columbia, Canada V3L 5B2. (Earlier versions appeared in the June 1994 issue of Faculty Matters (No. 5, pages 9-12) and the September 1994 issue of Update (the newsletter of the Okanagan University College Faculty Association). Douglas College.)
Dawes, R. M., Faust, D., and Meehl, P. E. (1989). Clinical versus actuarial judgment. Science 243, 1668-1674.
Dershowitz, A. (1994). Contrary to popular opinion. New York: Berkley Books.
Dilts, D. A., Samavati, H., Moghadam, M. R., and Haber, L. J. (1994). Student evaluation of instruction: Objective evidence and decision making. Journal of Individual Employment Rights 2,7386.
Divoky, J. J.
and Rothermel, A. (1988). Student perception of the relative importance of dimensions of teaching performance across type of class. Educational Research Quarterly 12, 40-45.
Dowell, D. A., & Neal, J. A. (1982). A selective review of the validity of student ratings of teaching. Journal of Higher Education 53, 51-62.
DuCette, J., and Kenney, J. (1982). Do grading standards affect student evaluations of teaching? Some new evidence on an old question. Journal of Educational Psychology 74, 308-314.


Education Employment Law News (1994). How big a role should student evaluations play in the assessment of a professor for tenure? (January), pp. 3-4.
EEOC v. Franklin & Marshall College, No. 84-1739, United States Court of Appeals for the Third Circuit, 775 F.2d 110; 3 Fed. R. Serv. 3d (Callaghan) 282; 85 A.L.R. Fed. 669; 39 Fair Empl. Prac. Cas. (BNA) 211; 38 Empl. Prac. Dec. (CCH) P35,644. Argued August 5, 1985; decided October 21, 1985.
Evaluating Teacher Evaluations (1996). Article in the Topical Interest Group: Assessment in Higher Education submitted by glen9579@novell@uidah on 11/12/96. Available: uatingteacherevaluations.html
Faust, D., Guilmette, T. J., Hart, K., Arkes, H. R., Fishburne, F. J., and Davey, L. (1988). Neuropsychologists' training, experience, and judgement accuracy. Archives of Clinical Neuropsychology 3, 145-163.
Feldman, K. A. (in press). Identifying exemplary teachers and teaching: Evidence from student ratings. In R. P. Perry & J. C. Smart (Eds.), Effective teaching in higher education: Research and practice. New York: Agathon Press.
Feldman, K. A. (1993). College students' view of male and female college teachers: Part II---Evidence from students' evaluations of their classroom teachers. Research in Higher Education 34(2), 151-191.
Feldman, K. A. (1983). Seniority and experience of college teachers as related to evaluations they receive from students. Research in Higher Education 18, 3-124.
Franklin, J., & Theall, M. (1990). Communicating student ratings to decision makers: Design for good practice. In Theall, M., & Franklin, J. (Eds.), Student Ratings of Instruction: Issues For Improving Practice. San Francisco: Jossey-Bass.
Garb, H. N. (1989). Clinical judgment, clinical training, and professional experience. Psychological Bulletin 105, 387-392.
Gilbaugh, John W. (1982). Renner substantiated. Phi Delta Kappan 63(6), February, 418.
Gillmore, G.
(1984). Student ratings as a factor in faculty employment decisions and periodic review. Journal of College and University Law 10, 557-576.
Goldberg, G., and Callahan, J. (1991). Objectivity of students' evaluations of instructors. Journal of Education for Business 66 (July/August), 377-378.
Goldman, L. (1993). On the erosion of education and the eroding foundations of teacher education. Teacher Education Quarterly 20, 57-64.
Goldman, L. (1985). The betrayal of the gatekeepers: grade inflation. Journal of General Education 37, 97-121.
Greenwald, A. G. (1996). Applying social psychology to reveal a major (but correctable) flaw in student evaluations of teaching. University of Washington, Draft Manuscript, March 1, 1996.
Greenwald, A., & Gillmore, G. (1996). No pain, no gain? The importance of measuring course workload in student ratings of instructions. University of Washington. Manuscript submitted for publication.
Guernsey, L. (1996, September 20). Metropolitan State College reviews its tenure policy. The Chronicle of Higher Education, p. A13.
Hayes, S. C. (1991). The emperor's clothes: Examining the "delusions" of professional psychology: The healthy skepticism of David Faust. Scientist Practitioner 1, 22-25.
Healy, P. (1996, January 26). Newly appointed trustees stir up public colleges. The Chronicle of Higher Education, p. A26.
Heller, S. (1986, December 17). Proper use of student evaluations at issue in professor's legal appeal to regain job. The Chronicle of Higher Education, p. 14.
Holmes, D. S. (1972). Effects of grades and disconfirmed grade expectancies on students' evaluations of their instructor. Journal of Educational Psychology 63, 130-133.
Howard, G. S., & Maxwell, S. E. (1982). Do grades contaminate student evaluations of instruction? Research in Higher Education 16, 175-188.
Howard, G. S., & Maxwell, S. E. (1980). Correlation between student satisfaction and grades: A case of mistaken causation? Journal of Educational Psychology 72, 810-820.
Howard, G. S., Conway, C. G., & Maxwell, S. E. (1985). Construct validity of measures of college teaching effectiveness. Journal of Educational Psychology 77, 187-196.
Husbands, C., and Fosh, P. (1993). Students' evaluation of teaching in higher education: Experiences from four European countries and some implications of the practice. Assessment & Evaluation in Higher Education 18, 95-114.
Kemp, B. W., and Kuman, G. S. (1990). Student evaluations: are we using them correctly? Journal of Education for Business 65 (November/December), 106-11.
Kolevzon, M. S. (1981). Grade inflation in higher education: a comparative study. Research in Higher Education 15, 195-212.
Kresl, P. (1995). Grade inflation must be examined before discussed. [Online] Available: ps/4002.html (March 7).
Larkin, J., McDermott, J., Simon, and Simon, H. (1980). Expert and novice performance in solving physics problems.
Science 208, 1335-1342.
Leatherman, C. (1996a, September 13). Commission recommends strengthening college presidents' power. The Chronicle of Higher Education, p. A43.
Leatherman, C. (1996b, October 25). More faculty members question the value of tenure. The Chronicle of Higher Education, p. A12.
Magner, D. K. (1996, September 20). Minnesota regents' proposals stir controversy with faculty. The Chronicle of Higher Education, p. A11.
Magner, D. K. (1995, May 19). Mid-semester removal of professor roils University of Montana. The Chronicle of Higher Education, p. A25.


Marsh, H. W., & Dunkin, M. J. (1992). Students' evaluations of university teaching: A multidimensional perspective. Higher Education: Handbook of Theory and Research 8, 143-233.
Marsh, H. W. (1984). Students' evaluations of university teaching: Dimensionality, reliability, validity, potential biases, and utility. Journal of Educational Psychology 76, 707-754.
Marsh, H. W. (1982). Validity of students' evaluations of college teaching: A multitrait-multimethod analysis. Journal of Educational Psychology 74, 264-279.
Marsh, H. W. (1980). The influence of student, course, and instructor characteristics on evaluations of university teaching. American Educational Research Journal 17, 219-237.
Matsuda, M. J. (1989). Public response to racist speech: considering the victim's story. Michigan Law Review 87, 2320-2378.
McCallum, L. W. (1984). A meta-analysis of course evaluation data and its use in the tenure decision. Research in Higher Education 21, 150-158.
McKeachie, W. J. (1979). Student ratings of faculty: A reprise. Academe 65, 384-397.
McMurtry, J. (1991). Education and the market model. Journal of Philosophy of Education 25, 209-217.
Menand, L. (Ed.) (1996). The Future of Academic Freedom. Chicago: University of Chicago Press.
Menand, L. (1993). The future of academic freedom. Academe (May/June), 11-17.
Murray, B. (1996). Attacks on tenure lead to new standards: To contain spiraling higher-education costs and be fairer to minority faculty, the tenure system should be reformed, educators say. American Psychological Association Monitor (May).
Nelson, J. P., & Lynch, K. A. (1984). Grade inflation, real income, simultaneity, and teaching evaluations. Journal of Economic Education 15, 21-37.
Parini, J. (1995, July 14). Tenure and the loss of faculty talent. The Chronicle of Higher Education, Section: Opinion, POV.
Pennsylvania State University (1996).
Policy On Confidentiality Of Student Records. [Online] Available: '//psuinfo/Policies-Rules for Students/Policies and Rules/Records. Also available via gopher from, S. (1989). Teaching in a litigious age. Change 21 (July/August), 50-54.
Powell, R. W. (1977). Grades, learning, and student evaluation of instruction. Research in Higher Education 7, 193-205.
Rabinowitz, J. (1993). Diagnostic reasoning and reliability: a review of the literature and a model of decision-making. The Journal of Mind and Behavior 14, 297-316.
Rebell, M. A. (1990). Legal issues concerning teacher evaluation. In Millman, Jason, & Darling-Hammond, Linda (Eds.), The new handbook of teacher evaluation (pp. 337-355). Beverly Hills, CA: SAGE.
Reisman, D. (1981). On higher education: The academic enterprise in an era of rising student consumerism. San Francisco: Jossey-Bass.
Renner, Richard R. (1981). Comparing professors: how student ratings contribute to the decline in quality of higher education. Phi Delta Kappan 63(2), October, 128-30.
Robinson, R. K., and Fink, Ross L. (1996). Public disclosure of teaching evaluations: privacy and liability considerations. Journal of Education for Business (May/June), 284-287.
Ryan, J. J., Anderson, J. A., and Birchler, A. B. (1980). Student evaluations: The faculty responds. Research in Higher Education 12 (December), 317-333.
Sacks, P. (1996). Generation X Goes To College: An Eye Opening Account of Teaching in Postmodern America. LaSalle, Il: Open Court.
Schrank, B. (1993). Academic freedom and the inclusive university. Paper presented at the Learned Societies Meeting, Carleton University, Ottawa, Ontario, 5 June 1993. [Online] Available:
Schrecker, E. W. (1986). No ivory tower: McCarthyism & the universities. Cambridge: Oxford University Press.
Scriven, M. (1995). A unified theory approach to teacher evaluation. Studies in Educational Evaluation 21, 111-129.
Scriven, M. (1993). The validity of student ratings: In teacher evaluation. Evaluation & Development Group.
Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park, Calif.: SAGE Publications.
Seldin, P. (1993, July 21). When students rate professors. The Chronicle of Higher Education, Opinion, p. POV.
Seldin, P. (1984). Changing practices in faculty evaluation. San Francisco: Jossey-Bass.
Snyder, C. R., & Clair, M. (1976). Effects of expected and obtained grades on teacher evaluation and attribution of performance. Journal of Educational Psychology 68, 75-82.
Sommer, J. W. (Ed.) (1995). The academy in crisis: The political economy of higher education (pp. 95-126). London: Transactions Publishers.
Stern, J., and Flynn, P. D. (1995).
Students propose a course of action for grade inflation. The Bucknellian (Feb. 20). [Online] Available: ps/4165.html
Stone, J. E. (1995). Inflated grades, inflated enrollment, and inflated budgets: an analysis and call for review at the state level. Education Policy Analysis Archives 3, Number 11 (2075). [Online] Available:
Strossen, N. (1995). Defending pornography: Free speech, sex and the fight for women's rights. New York: Anchor/Doubleday.


Tagomori, H. T., & Bishop, L. A. (1995). Student evaluation of teaching: Flaws in the instruments. Thought & Action 11, 63-78.
The Chronicle of Higher Education (1995, March 31). South Carolina lawmaker would abolish tenure, p. A24.
The Chronicle of Higher Education (1994a, April 27). State calls for release of faculty evaluations, p. A4.
The Chronicle of Higher Education (1994b, December 14). Faculty evaluations to go on line at U. of Texas at Austin, p. A21.
University of Michigan (1994). Toward a definition of tenure. Senate Advisory Committee on University Affairs, Standing Subcommittee on Tenure. Endorsed unanimously by the University of Michigan Senate Assembly, December 12. [Online] Available:
Vasta, R., & Sarmiento, R. F. (1979). Liberal grading improves evaluations but not performance. Journal of Educational Psychology 71, 207-211.
Weeks, K. M. (1996). Managing departments: Chairpersons and the law. Nashville, Tennessee: College Legal Information, Inc.
Wilson, R. (1996, June 7). Scholars off the tenure track wonder if they'll ever get on. The Chronicle of Higher Education, p. A12.
Worthington, A. G., & Wong, P. T. P. (1979). Effects of earned and assigned grades on student evaluations of an instructor. Journal of Educational Psychology 71, 764-775.
Young, R. D. (1993). Student evaluation of faculty: a faculty perspective. Perspectives on Political Science 22, 12-16.
Yunker, B. J., and Sterner, J. (1988). A survey of faculty performance evaluation in accounting. Accounting Educators' Journal (Fall), 63-74.
Zirkel, P. A. (1996). The law of teacher evaluation. Bloomington, IN: Phi Delta Kappa Educational Foundation.

Notes:

1. Address correspondence and notification of commentary to: Robert E. Haskell, Ph.D., Professor of Psychology, Department of Social and Behavioral Sciences, University of New England, Biddeford, Me. 04005. Email I would like to thank Larry Crumbley of Louisiana State University for providing me with some of the sources cited in this article.
I would especially like to thank professors John Damron, of Douglas College, and John Stone, of the Andrew Jackson Institute, for sources and for reading an early draft of this manuscript.

2. The closest and only allusion to SEF is found in Chapter Four by Cass R. Sunstein. In discussing the formal and informal regulation of speech (i.e., certain viewpoints) the author notes, "the evaluation of students and colleagues cannot occur without resort to content, and it would be most surprising if viewpoint discrimination did not affect many evaluations" (p. 106).

3. Since the issue of SEF validity, in terms of learning, is so central, a few observations on the literature are necessary. Greenwald and Gillmore (1996) have categorized some of the significant reviews and empirical research that find in favor of the validity of SEF as measures of quality of instruction, for example, Cashin (1995), Cohen (1981), Feldman (in press), Howard, Conway, and Maxwell (1985), Howard and Maxwell (1980, 1982), Marsh (1980, 1982, 1984), Marsh and Dunkin (1992), and McKeachie (1979). Reviews and empirical critiques that are critical of the validity of SEF include Chacko (1983), Dowell and Neal (1982), Holmes (1972), Powell (1977), Snyder and Clair (1976), Vasta and Sarmiento (1979), and Worthington and Wong (1979). Positions suggesting cautious support for the validity of SEF, while at the same time expressing concerns about the adequacy of that support, include Abrami, Dickens, Perry, & Leventhal (1980). The recent methodologically sophisticated research of Greenwald (1996) and Greenwald and Gillmore (1996) finds strong evidence inconsistent with the common dismissive interpretation of the relationship between SEF and high student grades as reflecting a relationship between amount learned and student ratings.

4. Cahn (1987) suggests that student ratings do not measure the instructional effectiveness or the intellectual achievement of students. SEF measure student satisfaction, attitudes toward the instructor and course, student personality, and the psychosocial needs of the student. Cahn suggests that students know if instructors are likeable, not if they are knowledgeable; they know if lectures are enjoyable, not if they are reliable.
In a meta-analysis, Cohen (1983) concludes, "While the magnitude of the average rating/achievement correlation for the thirty-three multisection courses is not overwhelming [14.4% of shared variance between ratings and the criterion], the relationship is certainly stronger and more consistent than we were led to believe..." (p. 455). Dowell & Neal (1982) conclude that "The research literature can be seen as yielding unimpressive estimates of the validity of student ratings. The literature does not support claims that the validity of student ratings is a consistent quantity across situations. Rather, the evidence suggests that the validity of student ratings is modest at best and quite variable" (p. 59). "The variability in obtained validity coefficients even in studies with reasonable methodological requirements... lead us to suspect that the validity of student ratings is influenced by situational factors to such an extent that a meaningful, generalizable estimate of their validity does not exist. In general... no meaningful estimate of the validity of student ratings can be provided with confidence that is generalizable enough to be useful..." (pp. 60-61).

5. For example, studies demonstrate the following variables: age, gender, class size, year of student, level of student, instructor style, subject matter, major or elective course, student interest in subject matter, instructor grading difficulty, anonymous vs. signed ratings, whether students are informed of their use, and instructor present vs. instructor absent while completing the evaluation (see, for example, Divoky and Rothermel, 1988).

6. At least one study found that 70% of students "indicated that the grade they thought they would get influenced the level at which they rated their professors" (in Goldman, 1985).
The grade inflation literature shows that the typical percentage of A's and B's on many campuses is anywhere from 70-90%, and that the percentage of graduates who are graduating cum laude and above is somewhere in the range of 50-70%.

7. If used correctly (see Copeland and Murry, 1996; Kemp and Kuman, 1990; Scriven, 1995, 1993, 1991; Seldin, 1984), SEF can be very useful instructionally, and when used in conjunction with other methodologically sound evaluation procedures and criteria, they can assist in informing an institution when a faculty member does not pass muster as an effective teacher. The question, of course, is: does student feedback to faculty result in improved teaching and student learning? In a review of studies, Marsh (1984) suggests that there is a small positive correlation for improved student learning of SEF feedback to faculty if used in a carefully constructed collegial consultation process. They are seldom used in this manner.

8. Bona Fides: I am a full professor, so I'm not personally concerned with SEF effect on promotion, tenure, and performance evaluations. I typically receive a 3.4 or 5 on a 4. scale on my evaluations, and in the classroom I do a fairly good dog and pony show as well.

9. Other countries using SEF questionnaires seem not as concerned with research on their validity as is the U.S.; there is apparently almost no published technical literature on validity and bias in SEF in German universities. There is, however, a debate in German higher education as to whether SEF is an invasion of academic autonomy (Husbands and Fosh, 1993).

10. While faculty are entitled to freedom of discussion and inquiry in their classroom, it is a generally recognized limitation that they should not introduce controversial matter which has no relation to their subject. What constitutes "controversial" and "no relation," however, often remains an open question.

11. AAUP Committee C on College and University Teaching, Research, and Publication. It was adopted by the Council of the American Association of University Professors in June 1975 and endorsed by the Sixty-first Annual Meeting as Association policy. They also state: "An important and often overlooked element of evaluating teaching is an accurate description of a professor's teaching. Such a description should include the number and level and kinds of classes taught, the numbers of students, and out-of-class activities related to teaching.
Such data should be very carefully considered both to guard against drawing unwarranted conclusions and to increase the possibilities of fairly comparing workloads and kinds of teaching, of clarifying expectations, and of identifying particulars of minimum and maximum performance. Other useful information might include evidence of the ability of a teacher to shape new courses, to reach different levels and kinds of students, to develop effective teaching strategies, and to contribute to the effectiveness of the individual's and the institution's instruction in other ways than in the classroom."

12. Kolevzon found an interesting breakdown of these responses, focused upon the finding that faculty with fewer years of teaching experience at the university were more impressed with the importance of the role played by student evaluations when compared with senior faculty (p. 208). Faculty with at least 6 years of teaching experience were more strongly supportive of the position that student evaluation forms were responsible for lenient grading. Avi-Itzhak and Lya found, perhaps counterintuitively, that senior faculty were the most opposed to SEF being used for salary, promotion, and tenure.

13. It should go without saying that not all students are the same. SEF vary by maturity and intellectual level, i.e., graduate student evaluations vs. undergraduate (see Divoky and Rothermel, 1988; Dilts, Samavati, Moghadam, and Haber, 1993), and therefore probably by campus and program. This may account in part for some of the wide variation in faculty attitudes toward SEF.

14. Other studies also show that inflating grades leads to higher SEF (see DuCette and Kenney, 1982; Goldberg and Callahan, 1991; Kemp and Kuman, 1990).

15. Given that SEF serve a dual function of both promotion and termination, another area of possible counterproductiveness is the inherent conflict in the necessary cooperation for faculty development purposes; SEF used for such purposes are thus in conflict with faculty self-protection. Unlike in the past, now that SEF are an increasingly important instrument at many institutions, their underlying problems become increasingly evident.

16. Last semester, says Dershowitz, "a small group of students complained about my teaching…[subject]… from a civil liberties perspective. I responded that it was important for the students to hear a variety of perspectives about…[subject]…, just as they hear, without objection, about other crimes. I also reminded them that the majority of students who speak in class present the 'politically correct' views. I told them that the answer to an offensive argument is not to censor but rather to come up with a better argument. A few days later, one of the students told me that I should expect to be 'savaged' in the evaluations that each teacher receives from the students" (italics and brackets added, pp. 118-119).

17. I use these examples merely as illustrations of the role of SEF, making no attempt to assess the validity or non-validity of the examples.

18. I use this case because I have a certain degree of prima facie confidence in the instructor's view. I had a similar experience teaching a required abnormal psychology course to a group of vocational majors (Occupational Therapy). I have been teaching for twenty years, have received good student evaluations, and have earned tenure twice at institutions that rely heavily on SEF. I recently experienced my first student "revolt." The students were under pressure not to get two D's. My course was the last course before they were formally admitted into the major. I received no support from administration, including the department chair.
Moreover, some of the vocational faculty (who incidentally are not doctoral-level faculty), making it known to the students, complained to administration about my course. For interesting similar personal accounts, see Peter Sacks (1996), Generation X Goes To College: An Eye Opening Account of Teaching in Postmodern America. LaSalle, Il: Open Court.

19. I have contacted a number of legal scholars who have written compendia of legal cases regarding the denial of tenure and promotion. Few were able to refer me to cases involving SEF. I would like to express my appreciation to Professor William A. Kaplin, School of Law, Catholic University of America, and to Michael Rebell, of Rebell and Katzive, New York City.

20. I wish to thank Dr. John Damron of Douglas College for this reference.

21. Scriven has done considerable rigorous work on evaluation procedures, particularly on the justification for inferring from ratings to conclusions about the merit of teaching on the basis of statistical correlations between ratings and student learning gains. He suggests that such inferences are invalid unless a number of stringent conditions are met on the design, administration, and use of such rating forms.

22. Litigation resulting from invalid data in Great Britain has occurred. As Husbands and Fosh (1993) note, "As far as we know, there have been no comparable cases in European courts but, if European universities follow the American example of using student evaluations largely or exclusively for summative purposes, it is only a matter of time before there is external examination of the techniques being used, and of their suitability for the purposes for which they are intended" (p. 103). They conclude, "it is a sad commentary on the gullibility of some people in the face of numerical data that it required the intervention of the courts to force the discontinuation of the more gross forms of this type of interpretation" (p. 103).


23. Many faculty are not aware of the extent of the confidentiality of student information. For example, (1) student scores or grades cannot be posted publicly by name, social security number, or any other identifier that can be known by anyone but the instructor and student, (2) student papers or lab reports that have student names and grades cannot be left in places that are accessible to others, (3) students may not have their information revealed to others in a class, (4) faculty are not to request student information without a legitimate educational reason, (5) student grades or other educational information may not be shared with other faculty unless the faculty has a specific legitimate reason to know, (6) libraries are apparently prohibited from revealing to an instructor whether students have read the class reading material s/he has specifically put on reserve for them, (7) student grades or other educational information cannot even be revealed to the parent of the student (who may be paying for the student's education) without written permission of the student. There are many other restrictions as well.
24. According to Black's Law Dictionary (With Pronunciations, Sixth Edition (1990). St. Paul, Minn.: West Publishing Co.) "Libel. A method of defamation expressed by print, writing, pictures, or signs. In its most general sense, any publication that is injurious to the reputation of another. A false and unprivileged publication in writing of defamatory material. Bright v. Los Angeles Unified School Dist., 51 Cal.App.3d 852, 124 Cal.Rptr. 598,694. "A maliciously written or printed publication which tends to blacken a person's reputation or to expose him to public hatred, contempt, or ridicule, or to injure him in his business or profession. Corabi v. Curtis Pub. Co., 441 Pa. 432, 273 A.2d 899,904. "Accusation in writing or printing against the character of a person which affects his reputation, in that it tends to hold him up to ridicule, contempt, shame, disgrace, or obloquy, to degrade him in the estimation of the community, to induce evil opinion of him in the minds of right thinking persons, to make him an object of reproach, to diminish his respectability or abridge his comforts, to change his position in society for the worse, to dishonor or discredit him in the estimation of the public, or his friends and acquaintances, or to deprive him of friendly intercourse in society, or cause him to be shunned or avoided, or where it is charged that one has violated his public duty as a public officer. Almost any language which upon its face has a natural tendency to injure a man's reputation, either generally or with respect to his occupation. Washer v. Bank of America Nat. Trust & Savings Ass'n, 21 Cal.2d 822, 136 P.2d 297, 300."
25. According to Rebell (1990), Florida specifically mandates the school board to provide professional training programs to "ensure that all individuals with evaluation responsibilities understand the proper use of the assessment criteria and procedures" (Fla. Educ. Code, § 231.29(2)), p. 345. Apparently, however, such mandates are not applied to higher educational institutions.
26. Dowell & Neal (1982) suggest that "The attraction of student ratings to higher education officials lies in the seductive way they seem to reduce a complex human activity, teaching, to simple numbers. Administrative over reliance upon ratings may encourage faculty to "perform to criterion," to teach in such a way as to accrue good ratings, even though this may not result in optimal educational practice. Ratings cannot substitute adequately for more informed judgment about teaching effectiveness, which might be obtainable from reviewing course materials, visiting classes, and discussing teaching practices with faculty.
Unfortunately, these detailed qualitative methods are currently regarded as too expensive and difficult to use in most institutions. Evidence presented in [Cohen's 1983] review reinforces our earlier conclusion that student ratings are inaccurate indicators of student learning and therefore are best regarded as indices of "consumer satisfaction" rather than teaching effectiveness" (pp. 60-61).
27. Recently, a group of college and university presidents have proposed that faculty input into


decisions be reduced. The commission's 50-page report, "Renewing the Academic Presidency: Stronger Leadership for Tougher Times," makes recommendations to presidents, professors, trustees, and public officials. Copies of the report, available for $12.95 for members of A.G.B. and $19.95 for non-members, may be obtained from the Association of Governing Boards of Universities and Colleges, One Dupont Circle, Suite 400, Washington 20036; (202) 296-8400. From Leatherman (1996a).
28. It is perhaps ironic in a democracy that there should be so much scrutiny of faculty behavior in the classroom. One faculty member who has taught and lived in totalitarian states observed (in Sacks, 1996), "I grew up under a totalitarian government and worked as a teacher there, and never did anybody come to visit my classroom to make sure I was toeing the line in that sense...there was more "oppression" of teachers [in] the United States, a country that boasts the ideals of freedom [of] speech and thought, and that there was more pressure to conform to "acceptable" ideas in U.S. classrooms." (p. 33)
29. See endnote #12.
30. For purposes of this paper I consider the mandated use or non-use of SEF for salary, promotion and tenure by union contract agreements a special case of the "voluntary" nature of SEF. Some union contracts prohibit their use in salary, promotion and tenure decisions. Others require that students sign their evaluations.
31. There is all too frequently negative administrative support for faculty when students complain. For years I've seen faculty struggle with this issue. In the end, they usually conform to it---at least to some degree. When I informed one administrator that I was going to change my grading system from my already drastically modified Bell curve to the campus norm of giving mostly A's and B's, because I just could no longer "compete" for students in my courses, he soberly looked me straight in the eye and firmly said, "That's a good idea."
32. See Grimes v. Eastern Ill. Univ., 710 F.2d 386, 388 (7th Cir. 1983), cited in Copeland and Murry (1996).
33. A recent article on the nation-wide attacks on tenure repeatedly referred to it as "job security" for professors. It was noted that American Association for Higher Education (AAHE) researchers found that tenure serves an important purpose by "rewarding dedicated, productive professors." Other articles maintain that a tenure system helps to recruit and retain quality faculty. And a representative of the American Association of Psychology Professors argued that "tenure is only awarded to faculty who conduct quality research and teaching" (Murray, 1996). For similar justifications see Leatherman (1996b). After talking about job security in business and in the military, both of which have nearly vanished, one faculty member said, "I can't say there's some obvious reason why higher education should be immune to it" (in Leatherman, 1996b). As a consequence, there seems to be a merging of the popular culture and the academic views of tenure, with the direction of flow from the popular culture to the academic. Arguments in defense of tenure increasingly exhibit a tone of, "...and, oh by the way, it's to help insure academic freedom as well." Faculty could be given five- and ten-year contracts. While this may address the irrelevant issue of job security in some measure, it neither solves the problem nor addresses the issue of academic freedom. This is why it doesn't: First---and this seems to be not widely recognized---it merely postpones being "dehired." Second, it exacerbates the job security issue. By the time a faculty member has been at an institution for ten years, s/he is either an associate professor, or is at a salary range that reduces the chances of finding another position. Increasingly, with institutions cutting costs, most positions are entry level ones. So variable length contracts just


provide an additional cost cutting mechanism for eliminating senior---or soon to be senior---faculty. And third, as a consequence, such contracts tend to put a damper on faculty feeling free to examine and critique issues that are unpopular. With such contracts, the tenure function does not in fact exist.
34. I am not an uncompromising academic isolationist advocating the development of a kind of Monroe Doctrine applied to higher education. I see the value of some changes that have taken place in the university which could probably only have occurred as the result of outside pressures. But such changes should be made with extreme caution, reason, logic, and evidence. Most importantly, any changes should be made within the boundaries of what a university needs to be in order to perform its role in society. To tamper with tenure is to clearly cross one of those boundaries.
35. This is not intended as a blanket apologia for academia. There are many problems within the academy. In many other areas, I am a severe critic of my colleagues' collective behavior.
36. Increasingly, and perhaps understandably, junior faculty are questioning the system of tenure. After all, not only do they not have it, but it is becoming increasingly difficult to attain, politically and in many other ways. A recent survey by the Higher Education Research Institute at the University of California at Los Angeles shows that more than a third of some 34,000 faculty either strongly or somewhat agreed that tenure is an outmoded concept (an increase from 1989), seeing it as "elitist," a "holdover," and a tiered system of "haves and have nots" (in Leatherman, 1996b). But it seems that even older faculty are beginning to question tenure. The foundations of such reasoning, it seems to me, are very egocentric, shortsighted, and dangerous to the very concept of a university in a democratic society.
37. McMurtry delineates the inappropriate use of the consumer metaphor in education. "The best product on the market, as we know, is the one which is the most 'problem-free' for its purchaser--delivered ready made for instant easy use', 'guaranteed replacement' if it does not work, and 'repaired cost-free' whenever it needs maintenance attention. The best education, on the other hand, is the opposite on all standards of excellence. It cannot be produced or delivered by another at all, is never ready-made nor instant, and cannot be guaranteed replacement or service cost-free if it is not working. The higher the standards it has, the less it can be immediate in yield, the more work it demands of its owner, and the more its failures must be overcome by its possessor's own work. An education can never be 'problem free', and poses ever deeper and wider problems the higher the level of excellence it achieves. Freedom in the market is the enjoyment of whatever one is able to buy from others with no questions asked, and profit from whatever one is able to sell to others with no requirement to answer to anyone else. Freedom in the place of education, on the other hand, is precisely the freedom to question, and to seek answers, whether it offends people's self-gratification or not....What is the best policy for buying a product--to assert the customer's claim 'as always right'--is the worst possible policy for a learner. What is the best policy for selling a product--to offend no-one and no vested interest--may be the worst possible policy for an educator. The principles of freedom here are contradictory, and become the more so the more each is realized." (pp. 213-214)
38. Extending the ancient Greek account, in computer science a Trojan Horse refers to a set of instructions hidden within a legitimate program, causing a computer to perform illegitimate functions.
39. A recent book edited by Louis Menand (1996), The Future of Academic Freedom, provides an extensive series of closely reasoned chapters addressing the multicultural-based attacks on the


First Amendment. See especially Henry Louis Gates, Jr.'s "Critical Race Theory and Freedom of Speech."
40. See, for example, Pinsker (1989).

About the Author

Robert E. Haskell

Brief Bio ~ Robert E. Haskell has been teaching college and university level courses for over twenty years. He earned his Ph.D. from the Pennsylvania State University in Psychology and Social Relations, and his M.A. and B.A. from San Francisco State University. His areas of research and teaching include transfer of learning, analogical reasoning, and small group dynamics. Major publications include four books, the latest of which is The Future of Education and Transfer of Learning: A Cognitive Theory of Learning and Instruction For The 21st Century (forthcoming), and numerous presentations, chapters, and research articles in national and international journals. He also serves on several editorial review boards, and is Associate Editor of The Journal of Mind and Behavior. He is former Chair, and currently Professor of Psychology, Department of Social and Behavioral Sciences, University of New England.

Professor of Psychology
University of New England
Biddeford, Maine 04005
UNE Home Page: E-mail:

Copyright 1997 by the Education Policy Analysis Archives

The World Wide Web address for the Education Policy Analysis Archives is
General questions about appropriateness of topics or particular articles may be addressed to the Editor, Gene V Glass, or reach him at College of Education, Arizona State University, Tempe, AZ 85287-2411. (602-965-2692)

EPAA Editorial Board

Michael W. Apple, University of Wisconsin
Greg Camilli, Rutgers University
John Covaleskie, Northern Michigan University
Andrew Coulson
Alan Davis, University of Colorado, Denver
Sherman Dorn, University of South Florida


Mark E. Fetler, California Commission on Teacher Credentialing
Richard Garlikov
Thomas F. Green, Syracuse University
Alison I. Griffith, York University
Arlen Gullickson, Western Michigan University
Ernest R. House, University of Colorado
Aimee Howley, Marshall University
Craig B. Howley, Appalachia Educational Laboratory
William Hunter, University of Calgary
Richard M. Jaeger, University of North Carolina--Greensboro
Daniel Kalls, Ume University
Benjamin Levin, University of Manitoba
Thomas Mauhs-Pugh, Rocky Mountain College
Dewayne Matthews, Western Interstate Commission for Higher Education
William McInerney, Purdue University
Mary P. McKeown, Arizona Board of Regents
Les McLean, University of Toronto
Susan Bobbitt Nolen, University of Washington
Anne L. Pemberton
Hugh G. Petrie, SUNY Buffalo
Richard C. Richardson, Arizona State University
Anthony G. Rud Jr., Purdue University
Dennis Sayers, University of California at Davis
Jay D. Scribner, University of Texas at Austin
Michael Scriven
Robert E. Stake, University of Illinois--UC
Robert Stonehill, U.S. Department of Education
Robert T. Stout, Arizona State University


Contributed Commentary on Volume 5 Number 6: Haskell Academic Freedom, Tenure, and Student Evaluation of Faculty and Volume 5 Number 8: Stake Response to Haskell's "Academic Freedom ... & Student Evaluation"

On Michael Theall's (and implied et al.) "A Reply to Haskell and to Stake"

Robert E. Haskell

In "On Drawing Reasonable Conclusions about Student Ratings of Instruction: a Reply to Haskell and to Stake," Michael Theall (1997), Director of the Center for Teaching and Learning at the University of Illinois at Springfield, opens his critique of my article (Haskell, 1997a) by lamentably describing my piece as rhetorical. He then goes on to characterize my article: as an example of (1) faculty who "fulminate" against SEF as (2) "simplistic," (3) "loaded with misinterpretations of the literature," as (4) "mythology," (5) exhibiting an "ignorance of evaluation/measurement literature," as containing (6) "sweeping generalizations," (7) "misinformation," as (8) "simply ridiculous!" (9) "ripe" with "hysterical rhetoric," as (10) assuming a "mythical group of better students" of some bygone era, asserting that (11) SEF "are the cause of grade inflation," as suggesting (12) "we do away with ratings." Continuing, Theall wrote that (13) "Perhaps the weakest part of his article is what isn't there: constructive suggestions for improvement," that I (14) "suggested that ratings are a violation of academic freedom," and finally---but not exhaustively---(15) that "Academic freedom has been defined in many ways, but never before in a way that suggests the construct (tradition? principle? tenet?) is vulnerable to the influence of student ratings." The last two items, at least, are correct.

With this said, let me say I am perplexed by Theall's response. I am perplexed for three reasons: first, because of what my article actually clearly said and did not say; second, because of Theall's own findings; and third, because of the tone.
Let me begin with what my article actually clearly said and did not say, and Theall's reaction to it. This will lead naturally into his own findings which, ironically, seem to support much of my thesis.

Theall's Inaccuracies and Misinterpretations

In anticipation of responses like Theall's, I laid out the boundaries of my article quite carefully. Evidently, he either did not notice these boundaries or for some reason elected to ignore them. I will start with the more general accusations and proceed to the more specific, both of which, as I will document, are filled with inaccuracies. He says of my article, among other things, that "Perhaps the weakest part of his article is what isn't there: constructive suggestions for improvement." I had indicated that I would not be addressing this issue, nor dealing in detail with findings on validity of SEF, or with grade inflation. I clearly indicated the boundaries. I said:

In explicating SEF many closely related issues must be substantially bracketed. These related issues include (1) its validity (Cahn 1987; Damron 1996; Greenwald, 1996; Greenwald and Gillmore 1996; Scriven 1993; Seldin 1984; Tagomori Bishop, and Laurence. 1995), (2) the problem of defining teaching effectiveness, (3) general variables affecting SEF scores, (4) alternatives to SEF, (5) classroom politically correct or popular standards and perceptions, (6) low student academic preparation, (7) age and gender discrimination issues (Feldman, 1983, 1993), (8) strategies for change, and (9) other integrally related issues such as their being largely responsible for lowered course standards, and grade inflation. Though these are important and related issues, they can only be addressed here in so far as they directly impact the focus on SEF and academic freedom.
Despite these clearly stated boundaries, and despite my providing example citations for many of the areas, like validity, that I could not cover (I refer the reader to endnotes #3, 4, 5, 6 and 14 in my paper), Theall disregarded these limits and proceeded to critique my article by concentrating on (a) validity and (b) grade inflation, exactly those areas that I said would have to be bracketed. I refused to be pulled into these two areas in my paper, and I refuse to be hauled into them now.

Both validity and grade inflation are too important to be briefly dealt with within the confines of this response by simply offering, as did Theall, a citation here and a citation there. Such an ad hoc approach to these important issues is little more than a statistical analogue of using verbal anecdotes to resolve the issue. In a more appropriate forum, if Theall would like to come up with something a little more systematic than the statistical anecdotes he offered, it might be worth responding. (Note 1) In any event, Theall refused to substantially address the main issue of my article: academic freedom


and SEF.

I will now address some of the undocumented specific inaccuracies of Theall's critique. In numerous places in his response, he says,

Robert Haskell suggests that we do away with ratings....The thrust is simply that we should do away with student ratings.... I am targeting the use (perhaps misuse) of these issues in efforts to discredit or do away with faculty evaluation practices in general and student ratings in particular....building his case that ratings were an essentially unreliable form of data that should be done away with.

I did not say that SEF should be done away with. In fact, in (a) my original article and (b) my response to Stake (Haskell, 1997b), both of which Theall evidently read, I clearly stated:

It is important to note at the outset, that it is not SEF per se that is the issue, but the impact of its use on salary, promotion, tenure decisions, and its impact on the delivery of quality education.

In addition, in endnote #6 I said, again:

If used correctly (see Copeland and Murry, 1996; Kemp and Kuman, 1990; Scriven, 1995, 1993, 1991; Seldin, 1984), SEF can be very useful instructionally, and when used in conjunction with other methodologically sound evaluation procedures and criteria, it can assist in informing an institution when a faculty does not pass muster as an effective teacher.

Let me make it clear once again: I am not against SEF, only their use for administrative purposes.
As I said in my paper,

while it is true that SEF may indeed validly describe incompetence...given (a) the conflicting data on their validity, (b) the way many institutions have constructed SEF instruments, (c) the often unsystematic statistical method by which SEF are interpreted, and especially (d) given the considerable weight accorded negative comments by only a few students in making tenure and promotion decisions, it would seem SEF can all too easily be used as a covert instrument for the elimination of tenure candidates and other faculty who may threaten student tuition dollars and perhaps ideological and popular culture agendas.

Note that I said "conflicting data on their validity," not that the literature demonstrates conclusively that they are completely invalid. Given the number of instances in which Theall used the word "practical," I would think, given the time- and resource-intensive requirements for rendering SEF use valid, he would appreciate the fact that the likelihood of its appropriate use is not high. (In fact he does. See below.)

Theall further claims that I make causal attributions from SEF to grade inflation. He wrote,

Lately, there has been a lot of discussion attempting to causally link student ratings to problems such as grade inflation.... Ratings become the cause of the downfall of higher education.... Robert Haskell (1997) suggested that ratings are.... the cause of grade inflation). [It] is very risky to point to one reason for these changes.

Nowhere in my article was it suggested that SEF is "the cause," nor was the term "cause" used or such an attribution made about SEF. I did suggest that

SEF is largely responsible for lowered course standards, and grade inflation....are responsible for a considerable amount of grade inflation.

As scholars and rigorous researchers will likely agree, there is a considerable difference between attributing a single causal connection and suggesting that SEF contribute to grade inflation.
I guess the statistical concept of variance was conveniently forgotten.

Theall claims that I "blame" students. He wrote, "Don't blame students." In anticipating such a reading of my article, I stated in the opening that

to question SEF is often seen as not only a novel idea, but as an attack on either students, or a general attack on evaluating faculty.

I evidently anticipated this claim correctly. In further anticipation of such a reading of my article, in endnote #12 I said,

It should go without saying, that not all students are the same. SEF vary by maturity, intellectual level, i.e., graduate student evaluations v. undergraduate.

Then later in the text I stated,


Many students understand the above described ensuing consequences. A glance at articles from online student newspapers reveals strong sentiments against what many students consider the erosion of standards created by SEF. One student writer went so far as to say "We therefore suggest a boycott of the 1995 student/teacher evaluations. This boycott will provide a more effective means of communication than anything written on the evaluation itself. Something must be done about the trend of grade inflation. We as students refuse to contribute to the downfall of academia" (Stern and Flynn, 1995). Some students are thus quite aware of the effects of SEF on their education.

Theall then goes on to claim that if there is grade inflation, one of the parties we should blame is faculty; that we should "Blame the perpetrators." There is a section in my paper entitled "Faculty Complicity and Adaptation to SEF," in which I analyze this problem, but he does not mention any of the arguments I advance. I come to quite a different conclusion, however, than does Theall.

To my amazement, Theall simply notes, almost in passing, that I suggest that SEF "somehow violate the great traditions of academic life." What does he mean, somehow? The entire article was a fairly detailed "let me count the ways"--a listing, explaining, and accounting of just how I suggest SEF do violate academic freedom, both in the abstract and with concrete examples. So when he simply says of my article that "Academic freedom has been defined in many ways, but never before in a way that suggests the construct (tradition? principle? tenet?) is vulnerable to the influence of student ratings," I would suggest that Theall tell us more about SEF and academic freedom, since this was the main thesis of the article.
I would now like to note very briefly and comment on some other claims made by Theall. My comments will be brief because, again, I dealt with them in my paper. First, he says that students "are the appropriate providers of data," mainly because they are paying "tuition dollars." I addressed these issues in my paper in discussing the misplaced (a) political metaphor of democracy applied to instruction, and (b) corporate metaphor of student as consumer. Theall also claims that

ratings are collected before students get their final grades and thus, their opinions must be based on expectations which are, in turn, based on performance to date. Ratings thus can not be said to reflect a disconfirmed expectancy about the overall course outcome (i.e., the course grade). This would only be the case if ratings were gathered after final grades were distributed and the final grade disconfirmed what was expected as a result of the experiences during the semester. So the ratings relationship is limited to experiences and results during the term rather than to the final grade.

In many if not most cases Theall's claim is incorrect. This is why: (a) in many, and maybe most, courses students receive test grades and the class curve/distribution (or at least should) throughout the semester; (b) the student grapevine is quite efficient in informing students what instructors' grading practices are. So ratings can be said to reflect a disconfirmed expectancy about the overall course outcome.

Theall at least did make one technically valid set of points. Assuming SEF are on their face valid and reliable because there is "little variance across responses," says Theall,

The question now becomes one which deals with: 1) the appropriateness of course content; 2) the standards used for grading; and 3) the question of whether lenient coverage and grading were deliberately chosen in order to influence ratings.
The first two items can be and should be dealt with via curricular mechanisms such as departmental review of courses and content, and faculty agreement on standards for student work.

Granting for the moment SEF validity and reliability (I probably would even really accept their reliability), I would like to address 1) and 2) in the above quote (I addressed the third above). On a collective level, I agree with Theall that faculty have not been responsible in developing and enforcing standards. In fact in endnote #35 I wrote:

This is not intended as a blanket apologia for academia. There are many problems within the academy. In many other areas, I am a severe critic of my colleagues' collective behavior.

Theall's Work Supports Much of My Article

I noted at the beginning of this response that I was perplexed at much of Theall's reaction. I am particularly perplexed at his response to a passage I cited from his work. He says,

Haskell included a long quote from a chapter Jennifer Franklin and I (1990) did in our New Directions for Teaching and Learning issue #43 (Theall & Franklin 1990b). While our point was that ratings practice must be improved, Haskell used the quote to supplement other citations as evidence in building his case that ratings were an essentially unreliable form of data that should be done away with ... a gross misinterpretation of our intent.

I am perplexed for two reasons. First, it is really neither here nor there that the quote apparently does not reflect Theall's (et


al.) intent. Whatever an author's intent was in using a quote does not necessarily reflect on other uses of the material, especially when the quote is not used to suggest an author's position.

Second, I am perplexed because much of Theall's chapter revolves around other material that he suggests demonstrates some of the severe problems in using SEF as they are currently used. The full quote to which Theall refers is as follows:

Even given the inherently less than perfect nature of ratings data and the analytical inclinations of academics, the problem of unskilled users, making decisions based on invalid interpretations of ambiguous or frankly bad data, deserves attention. According to Thompson (1988, p. 217) "Bayes Theorem shows that anything close to an accurate interpretation of the results of imperfect predictors is very elusive at the intuitive level. Indeed, empirical studies have shown that persons unfamiliar with conditional probability are quite poor at doing so (that is, interpreting ratings results) unless the situation is quite simple." It seems likely that the combination of less than perfect data with less than perfect users could quickly yield completely unacceptable practices, unless safeguards were in place to insure that users knew how to recognize problems of validity and reliability, understood the inherent limitations of rating data and knew valid procedures for using ratings data in the contexts of summative and formative evaluation. (pp. 79-80)

The authors conclude by noting "It is hard to ignore the mounting anecdotal evidence of abuse. Our findings, and the evidence that ratings use is on the increase, taken together, suggest that ratings malpractice, causing harm to individual careers and undermining institutional goals, deserves our attention." (pp. 79-80).
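
Thompson's point about imperfect predictors, quoted above, can be illustrated with a small calculation. The sketch below is not drawn from Theall and Franklin's chapter or from my article; the function name `posterior` and all three input figures (a 10% base rate of ineffective teaching, and ratings that flag 80% of ineffective teachers while clearing 80% of effective ones) are hypothetical assumptions chosen only to show how Bayes' theorem behaves.

```python
def posterior(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Return P(truly ineffective | flagged by low ratings) using Bayes' theorem.

    All inputs are hypothetical illustration values, not empirical estimates:
      base_rate   -- assumed share of instructors who are truly ineffective
      sensitivity -- assumed P(flagged | ineffective)
      specificity -- assumed P(not flagged | effective)
    """
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    p = posterior(base_rate=0.10, sensitivity=0.80, specificity=0.80)
    # With these assumed figures, fewer than a third of flagged
    # instructors would actually be ineffective.
    print(f"P(ineffective | flagged) = {p:.2f}")
```

Under these assumed figures, an instrument that sounds "80% accurate" would still be wrong about most of the instructors it flags, which is the kind of counterintuitive result Thompson describes.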
Then, regarding endnote #26 of my paper, Theall objects to my quoting Dowell and Neal noting that: "...Cohen's (1983) review reinforces our earlier conclusion that student ratings are inaccurate indicators of student learning..." (Note 2) Theall omits the rest of the quote, which said, "and therefore are best regarded as indices of 'consumer satisfaction' rather than teaching effectiveness." Even given that Theall (et al.) is talking about the logic in use of SEF, not its inherent validity, he goes on to say,

Conversations with faculty and administrators...led increasingly to concerns about what users [e.g., chairmen; deans] were doing with the information we were providing. We saw that some departmental administrators who routinely use ratings to make decisions about personnel, evaluation policy, and resource allocation, were not familiar enough with important ratings issues to make well informed decisions...Clearly stated disclaimers regarding the limitations of ratings data in particular circumstances appeared to have little effect on the inclination of some clients to use invalid or inadequate data...There are some fundamental concepts for using numbers in decision making. To the degree that these concepts are ignored, interpretations of data become, at best, projective tests reflecting what the user (e.g., a chairperson or dean) already knows, believes, or perceives in the data. Treating tables of numbers like inkblots ('ratings by Rorschach') will cause decisions to be subjective and liable to error or even litigation..

Theall (et al.) continue in that chapter to lay out carefully the methodological preconditions for the reliable use of SEF. For example, in that chapter Theall (et al.) wrote:

Three types of errors come to mind immediately. The first involves interpretation of severely flawed data, with no recognition of the limitations imposed by problems in data collection, sampling, or analysis.
This error can be compared to a Type I error in research - wrongly rejecting the null hypothesis - because it involves incorrectly interpreting the data and coming to an unwarranted conclusion. In this case, misinterpretation of statistics could lead to a decision favoring one instructor over another, when in fact the two instructors are not significantly different...(pp. 87-88)

The second type of error occurs when, given adequate data, there is a failure to distinguish significant differences from insignificant differences. This error can be compared to a Type II error - failure to reject the null hypothesis - because the user does not realize that there is enough evidence to warrant a decision. In this case, failure to use data from available reports (assuming the reports to be complete, valid, reliable and appropriate) may be prejudicial to an instructor whose performance has been outstanding but who, as a result of the error, is not appropriately rewarded or worse, is penalized. (pp. 87-88)

The third type of error occurs when, given significant differences, there is a failure to account for or correctly identify the sources of differences. This error combines the other two types and is caused by misunderstanding of the influences of relevant and irrelevant variables. In this case, a personal predisposition toward teaching style... may lead a user to attribute negative meanings to good ratings, or to misinterpret the results of an item as negative evidence when the item is actually irrelevant and there is no quantitative justification for such a decision. Any of these errors can render an interpretation entirely invalid. (pp. 87-88)

Summarizing, Theall (et al.) wrote,

Let us...state our goal in the following way: "The user will make decisions that are based on valid, reliable
hypotheses about the meaning of data." In this case the user should receive or construct working hypotheses that do the following things:

Take into account problems in measurement, sampling or data collection and include any appropriate warnings or disclaimers regarding the suitability of the data for interpretation and use.

Do not attempt to account for differences between any results when they are statistically not significant (probably <.05).

Disregard any significant differences that are merely artifacts (for example, small differences observed in huge samples, which can technically be significant but are unimportant).

Account for any practically important, significant differences between results in terms of known, likely sources of systematic bias in ratings or reliably observed correlations, as well as in terms of relevant praxiological constructs about teaching or instruction.

The user should also refrain from constructing or acting on hypotheses that do not meet these conditions...(pp. 87-89)...

Again, I said in my paper,

SEF may indeed validly describe incompetence...given (a) the conflicting data on their validity, (b) the way many institutions have constructed SEF instruments, (c) the often unsystematic statistical method by which SEF are interpreted, and especially (d) given the considerable weight accorded negative comments by only a few students in making tenure and promotion decisions, it would seem SEF can all too easily be used as a covert instrument for the elimination of tenure candidates and other faculty who may threaten student tuition dollars and perhaps ideological and popular culture agendas.

So what is the problem of "intent" to which Theall refers?

Inappropriate Tone of Theall's Commentary

Theall also makes brief mention of Stake's (1997) response to my article.
While Stake can take care of himself, I would like to comment on one of Theall's comments, as in my view it is indexical of Theall's setting up "straw men." Theall says,

Stake's reference is probably to "educational seduction", the skill of the infamous "Dr. Fox" who supposedly entertained students and received high ratings despite the fact that he delivered no content (Naftulin, Ware, & Donnelly, 1973). Many who do not care for ratings find one study that supports their position but ignore subsequent work (e.g. Perry et al., 1979; Marsh & Ware, 1982) which points out problems with the original study and proceeds to clarify the issue.

The point is that Stake does not even come close to mentioning the Dr. Fox study, a study that used an actor to deliver lectures with nearly no content. Setting up such straw men allows Theall to then claim to have demolished them. What the purpose of this was, I cannot say.

Finally, in a critique which rather cavalierly takes on not only my article, but Stake (1997) and Greenwald, and Greenwald and Gillmore (1996), (Note 3) perhaps what is most disturbing is that Theall's attempt at rebuttal is, in tone, unjustifiably condescending and, in word, ad hominem. In referring to a research article of his (et al.), from which I quote in my article much to his dissatisfaction, he says,

Frankly, I don't like to recommend articles like this to those not actively involved in ratings research or practice because such writings can mislead readers who aren't really familiar with the cited ratings literature. I'm sorry if that sounds elitist: it isn't intended to be but I do have a reason for noting it.

Theall should be more than "sorry;" as a scholar presumably knowledgeable in science, he should be embarrassed. He should also not flatter himself by suggesting his statement may be considered elitist; arrogant, perhaps, but not elitist.
(Note 4) This is why: First, he has no legitimate reason to assume that I am not familiar with the literature (see below), just because I disagree with him and have reservations about SEF. His assumption seems to be based on the most fundamental of logical and statistical inference fallacies. Theall says,

We (Franklin & Theall, 1989) found that ignorance of evaluation/measurement literature and methods correlated significantly with negative faculty opinions about students and student ratings. I note this because discussions about ratings are so often filled with misinformation.

Now as any first year college student learns in statistics, (a) correlation is not causation, and (b) one cannot infer from a statistical generalization of a group that an individual in that group is an indicator of that generalization. In fact this "logic" is the same as that used by people who exhibit prejudice: applying some generalization or central tendency characteristic of a specific group of people to an individual. Scriven (1988) makes this point in a similar context.


Second, were his claims not so clearly uninformed, they would be an insult not only to me, but to many other scholars as well. This is why that is: To say that scholars not actively involved in doing statistical or experimental research on a subject are not qualified to read, understand and comment on this research immediately puts most theoretical physicists, theoretical biologists, and philosophers of science out of business. Hmmm. (Note 5)

References

Barnett, Larry D. (1996). Are Teaching Evaluation Questionnaires Valid? Assessing the Evidence. Journal of Collective Negotiations in the Public Sector, 25 (4), 335-349.
Cohen, P. A. (1983). Comment on a selective review of the validity of student ratings of teaching. Journal of Higher Education, 54, 448-458.
Franklin, J. L. & Theall, M. (1990). Communicating student ratings to decision makers: design for good practice. In M. Theall & J. Franklin (Eds.) Student ratings of instruction: issues for improving practice. New Directions for Teaching and Learning # 43. San Francisco: Jossey Bass.
Franklin, J. L. & Theall, M. (1989). Who reads ratings. Knowledge, attitudes, and practices of users of student ratings of instruction. Paper presented at the 70th annual meeting of the American Educational Research Association. San Francisco: March 31.
Greenwald, A. G. & Gillmore, G. M. (1997). No pain no gain? The importance of measuring course workload in student ratings of instruction. Journal of Educational Psychology (in press).
Greenwald, A. G. & Gillmore, G. M. (1996). Applying social psychology to reveal a major (but correctable) flaw in student evaluations of teaching. University of Washington, Draft Manuscript, March 1.
Haskell, R. E. (1997b). Contributed Commentary on: Stake Response to Haskell: Academic Freedom, Tenure and Student Evaluation of Faculty. Educational Policy Analysis Archives, 5. Available online: http://olam.ed.asu.edu/epaa/v5n8c1.html.
Haskell, R. E. (1997a).
Academic freedom, tenure, and student evaluations of faculty: Galloping polls in the 21st century. Educational Policy Analysis Archives, 5. Available online:
Naftulin, D. H., Ware, J. E., & Donnelly, F. A. (1973). The Dr. Fox lecture. A paradigm of educational seduction. Journal of Medical Education, 48, 630-635.
Scriven, M. (1988). The New Crisis in Teacher Evaluation: The Improper Use of 'Research-based' Indicators. Professional Personnel Evaluation News. (January) 1-8.
Stake, J. E. (1997). Response to Haskell: academic freedom, tenure, and student evaluations of faculty. Educational Policy Analysis Archives, 5. Available online: http://olam
Theall, M. (1997). On Drawing Reasonable Conclusions about Student Ratings of Instruction: a Reply to Haskell and to Stake. Educational Policy Analysis Archives, 5. Available online: tml.
Thompson, G.E. (1988). Difficulties in interpreting course evaluations: Some Bayesian insights. Research in Higher Education, 28, 217-222.

Notes

1. Theall suggested that I misinterpreted research on the SEF. For example, he says,

Peter Cohen's most recognized contribution to the ratings literature is the meta-analysis of multisection validity studies in which performance on a common final exam was correlated with ratings (1981). There was a .43 correlation between ratings and performance on those exams. In other words, sections with highly rated instructors had higher average scores which could not be attributed to differential grading standards or sampling errors. This is evidence for the validity of ratings, the exact opposite of what Haskell says. Calling that relationship grade inflation is simply ridiculous!

Further, in at least partial defense before readers who may not be conversant with the literature, let me offer at least one somewhat lengthy statistical anecdote from Barnett (1996) that addresses Theall's belief in .43 correlations.
In analyzing some of Cohen's (1981) data, Barnett explains,

I turn now to a second statistical matter, the interpretation of correlation coefficients. The numerical value of a correlation coefficient can be deceiving, even when it is 'statistically significant' (i.e., even when it is unlikely
to have occurred by chance if no relationship exists). A correlation coefficient measures the degree to which change in the amount of the explanatory variable is accompanied by change in the amount of the effect variable, but the most beneficial feature of a coefficient is not its numerical value, which has no inherent, practical meaning. Rather, the square of the numerical value is the most advantageous aspect of a correlation coefficient, for the square indicates the proportion of variation in the effect variable that can be statistically attributed to variation in the explanatory variable.

The research summary by Professor Cashin reported a correlation coefficient of .44 between student ratings of an instructor 'overall' and examination grades [8]. This coefficient means that 19.4 percent of the variation in student learning (as measured by course grades) is explained by variation in instructional quality (as measured by student ratings). If accurate, a correlation of this magnitude is 'practically useful,' as Professor Cashin said, though one must keep in mind that four-fifths of the variation in course grades remains unexplained and is attributable to other factors.

But does this correlation coefficient accurately estimate the relationship between student evaluations of teaching and student achievement? The best research on the magnitude of the relationship is the 'multisection validity study.' When it is ideally designed, such a study possesses the following features: each course included in the study has numerous sections; students are randomly assigned to sections; the sections of a course have different instructors but a common textbook and the same examination(s); all examinations for a course are constructed by a person who does not teach any section; and subjective (essay) components of examinations are graded by the person who developed them.
A review of multisection validity studies cites one work that, the author of the review asserts, eliminates at least in part 'many of the criticisms of the multisection validity study' and 'provide[s] strong support for the validity of students' evaluations of teaching effectiveness' [9, p. 721]. However, the cited work, which subjected the results of other multisection studies to a statistical analysis, did not control a number of critical variables that could have generated or enlarged the relationship between student ratings of teachers and student achievement [10]. Among the missing variables that might have explained the relationship was the rigor of the requirements of the instructor (such as checks for student preparation and amount of material assigned), a factor that may vary considerably across sections of a single course. If the variable was related both to student ratings of instructional quality and to student achievement, a control for the variable could have markedly weakened or entirely eliminated the relationship originally found between student ratings and student achievement. (p. 339)

Another variable that the work omitted was the students' level of interest in the subject matter of the course prior to exposure to the teacher they later evaluated. As will be suggested below, neither of these variables should have been excluded from the analysis and left uncontrolled.

While the work did not incorporate a number of potentially important variables into its data analysis, the work is the source of a set of correlation coefficients (including the coefficient of .44) that Cashin suggested are credible estimates of the relationship between student ratings of teachers and student achievement. A reader of the reproduced coefficients can easily be misled, however, because Cashin failed to make clear that the coefficients may have been seriously confounded by variables whose influence was not removed.
The failure to clarify this point is surprising inasmuch as Cashin explicitly stated that a control is necessary for one of the variables omitted by the work, namely, the interest students initially exhibited in the subject.' [12, p. 5]. (p. 340)

Cohen treated student judgments of 'the amount and difficulty of the work the teacher expects of students' as one component of instructional quality. Whether it is an element of teaching quality or a separate factor, the amount and difficulty of work should be controlled under the conditions mentioned because it may explain much or all of the relationship detected between, on the one hand, evaluations of an instructor overall or on specific dimensions and, on the other, performance on examinations. From the studies he reviewed Cohen calculated a negligible mean correlation coefficient for the relationship between the amount/difficulty of work and student achievement, but he also found a substantial range for the coefficients reported by the studies. Specifically, the interval for 95% of the coefficients extended from -.42 to +.39. Id, at 293 295. Individual studies may thus involve a nontrivial association between the perceived difficulty of teachers and the examination performance of students (p. 341).

I would also suggest that the reader see Greenwald's historical review of past studies and Barnett's review of some of the most cited pieces that claim the validity of SEF.

2. I can understand how such quotes might be embarrassing (a) to an author who professes that SEF is a basically if not completely valid instrument, and (b) to a practitioner trying to convince faculty and administrators to use them in faculty evaluation programs. I have never heard of a consultant who suggests that SEF have been shown to be invalid. Given Theall’s critique, let me make it very clear that I am not here indicting consultants. I have done and occasionally do consulting myself.
I would like to say that Theall (et al.), in word (I cannot speak for deed), does recognize most of what needs to be done to make SEF use in administrative decisions reasonably ethical.

3. Requests for prepublication copies of Greenwald’s articles should be sent to http://weber.u.washing

4. Theall goes on to say, "Stake says that ratings undermine the faith and trust students must place in teachers. Given his opinion of ratings, I assume that he must then feel that teachers have no need to place faith or trust in their students. This
paternalism is antiquated and unrealistic." The logic by which it is unacceptable for Stake not to trust (valid?) perceptions of students, but acceptable for Theall not to trust the perceptions of scholars, escapes me, just as does his calling Stake’s view of students paternalistic while not seeing that his own view, not trusting scholars to correctly read research, is paternalistic. Though he does recognize that his view sounds elitist, he says it is really not.

5. I wonder what Theall would say about child psychologists who have no children of their own, or about healthy psychotherapists who treat the mentally disordered.
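
The variance-explained arithmetic in the Barnett passage quoted in Note 1 can be checked directly. The sketch below is only an illustration: the .44 and .43 coefficients are the ones quoted above, while the function name is hypothetical and the calculation says nothing about whether either coefficient is confounded.

```python
def variance_explained(r):
    """Proportion of variance shared by two variables with correlation r."""
    return r * r


if __name__ == "__main__":
    # .44 is the coefficient Barnett attributes to Cashin's summary;
    # .43 is the coefficient Theall cites from Cohen (1981).
    for r in (0.44, 0.43):
        explained = 100.0 * variance_explained(r)
        unexplained = 100.0 - explained
        print(f"r = {r:.2f}: {explained:.1f}% explained, {unexplained:.1f}% unexplained")
```

For r = .44 this gives 19.4 percent explained and 80.6 percent unexplained, matching Barnett's "four-fifths"; for r = .43 the corresponding figures are 18.5 and 81.5 percent.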


Contributed Commentary on Volume 5 Number 6: Haskell Academic Freedom, Tenure, and Student Evaluation of Faculty and Volume 5 Number 8: Stake Response to Haskell: Academic Freedom, Tenure, and Student Evaluation of Faculty

21 May 1997

Michael Theall

On drawing reasonable conclusions about student ratings of instruction: a reply to Haskell and to Stake.

Kenneth Ebel was right when he said (1983, p. 65) "No corner of the university...lacks faculty members who fulminate against student evaluations, with little or no examination of the large body of research ...that underlies the practice." Lately, there has been a lot of discussion attempting to causally link student ratings to problems such as "grade inflation," and to suggest that ratings somehow violate the great traditions of academic life. I do not mean to trivialize these issues. They are important and deserve discussion. Rather, I am targeting the use (perhaps, misuse) of these issues in efforts to discredit or do away with faculty evaluation practices in general and student ratings in particular.

"Grade-inflation" (i.e., the rise in "average" grades since 1990 or is it 1950, or 1890, or the dawn of time) is always a hot topic because it's easy to make sweeping generalizations about "today's students," especially in comparison to some more-than-likely-mythical group of "better" students from some time past (especially a time "...when I was a student"). While it is probably true that a lot of things have changed about higher education including the standards used to grade, the expectations of both faculty and students, and the overall grade profile, it is very risky to point to one reason for these changes. There just isn't evidence for such a simplistic leap.

When coupled with the sometimes hysterical rhetoric about student evaluations, the inflation complaint itself gets inflated. "Ratings" become the cause of the downfall of higher education.
They are, as one anonymous and angry professor wrote to me, "...a corrupt practice of the 60's....A sop to students from administrators who are unwilling or unable to do anything to really improve teaching." More recently, in an article ripe with rhetoric and loaded with misinterpretations of the literature, Robert Haskell (1997) suggested that ratings are a violation of academic freedom (and of course, the cause of grade inflation). Haskell included a long quote from a chapter Jennifer Franklin and I (1990) did in our New Directions for Teaching and Learning issue #43 (Theall & Franklin, 1990b). While our point was that ratings practice must be improved, Haskell used the quote to supplement other citations as evidence in building his case that ratings were an essentially unreliable form of data that should be done away with ... a gross
misinterpretation of our intent.

Frankly, I don't like to recommend articles like this to those not actively involved in ratings research or practice because such writings can mislead readers who aren't really familiar with the cited ratings literature. I'm sorry if that sounds elitist: it isn't intended to be, but I do have a reason for noting it. We (Franklin & Theall, 1989) found that ignorance of evaluation/measurement literature and methods correlated significantly with negative faculty opinions about students and student ratings. I note this because discussions about ratings are so often filled with misinformation. For example, Haskell (in reference note #26) says that "...Cohen's (1983) review reinforces our earlier conclusion that student ratings are inaccurate indicators of student learning...". Peter Cohen's most recognized contribution to the ratings literature is the meta-analysis of multisection validity studies in which performance on a common final exam was correlated with ratings (1981). There was a .43 correlation between ratings and performance on those exams. In other words, sections with highly rated instructors had higher average scores which could not be attributed to differential grading standards or sampling errors. This is evidence for the validity of ratings, the exact opposite of what Haskell says. Calling that relationship grade inflation is simply ridiculous!

Academic freedom has been defined in many ways, but never before in a way that suggests the construct (tradition? principle? tenet?) is vulnerable to the influence of student ratings. The only comparable complaint I've heard came from a faculty member who missed classes frequently because he attended numerous religious services.
He told me that since the ratings instrument contained an item about the instructor being available and his ratings on that item were depressed due to his frequent absences, the student ratings process violated his constitutionally guaranteed freedom of religion.

Fortunately, in a reply to Haskell in a later issue of EPAA (V5#8), Jeffrey Stake (1997) does us the favor of providing a strong counter-argument to the academic freedom issue. Unfortunately, he then proceeds to propose that "Almost anything that can be done to undermine the administrative practice of getting students to evaluate teaching ought to be done" (p. 1). Stake says that ratings undermine the faith and trust students must place in teachers. Given his opinion of ratings, I assume that he must then feel that teachers have no need to place faith or trust in their students. This paternalism is antiquated and unrealistic. Stake also proposes that administrative use of ratings creates an image that institutions are asking if they "guessed right" about the faculty they hired. Drawing this in the other direction, I suppose that assigning grades creates the image that institutions are asking if they "guessed right" about the students they admitted. Finally, Stake proposes that asking students' opinions "... focuses the attention of students on the acting and special effects, rather than the message." (p. 2) I haven't seen many evaluation forms which contain items on either of these topics. Stake's reference is probably to "educational seduction," the skill of the infamous "Dr. Fox" who supposedly entertained students and received high ratings despite the fact that he delivered no content (Naftulin, Ware, & Donnelly, 1973). Many who do not care for ratings find one study that supports their position but ignore subsequent work (e.g. Perry et al., 1979; Marsh & Ware, 1982) which points out problems with the original study and proceeds to clarify the issue.
In this case, the determination was that, while style is no substitute for substance, "expressiveness" in a lecture can be a powerful aid to students' motivation and attention during lectures and thus, to recall and other aspects of eventual performance.

Jennifer Franklin and I did two studies which involved looking at grades and ratings: one specifically on grade inflation (1991) and one on the attributions of students about their academic performance (1990a). In the first study we looked at thousands of courses from one institution over a five year period during which mandatory ratings were initiated. We found a statistically significant but numerically minute increase in average grades but we also found that average
ratings actually decreased in the same time period. So, even if faculty were trying to "buy" higher ratings by giving higher grades, students "weren't buyin' it."

In the attribution study we found that ratings of instructors were very consistent (i.e. reliable) across all grades. In other words, the ratings were essentially the same from 'A' students and 'F' students. This held up when class average ratings showed that the instructor was considered to be "among the best" as well as when the instructor was considered to be "average" or "among the worst." There was no revenge for low grades or reward for high grades.

Major reviews (e.g., Marsh, 1987) have reported either near zero correlations between workload, difficulty, and ratings, or positive relationships indicating that more work and/or more difficult courses often receive higher ratings (e.g., Cashin, 1988). Greenwald & Gillmore (1996a, 1996b), however, report a positive grades-ratings relationship (i.e., higher grades, higher ratings) and a negative grades-workload relationship (i.e., higher grades, less work). However, they do not then propose a direct, negative ratings-to-workload relationship (i.e., less work, higher ratings) although they suggest it when they say (p. 14) "...finding a negative path between expected grade and workload is critically diagnostic of a causal effect of grading leniency." One might extrapolate that if less/easier work is related to higher grades, and higher grades are related to higher ratings, then less/easier work should also relate to higher ratings. Whether the syllogism holds remains a question. Even if it does, there is still a question about its implications for the student ratings process. Does a relationship, even a causal one, mean that ratings are inherently invalid or unusable? Greenwald and Gillmore note that ratings are imperfect but that flaws are "correctable" (1996b) and suggest making improvements rather than dismissing ratings.
For the record, the relationship between grades and ratings in other studies is relatively consistent, with correlations normally between .3 and .4 (e.g., Feldman, 1976).

The counter-argument to the grades-ratings-inflation idea is that there is (and should be) a relationship between grades and ratings because good teaching begets good learning. Good learning results in both good grades and satisfied learners. This is still a compelling notion but Greenwald and Gillmore (1996a, 1996b) have clarified the definitions and controlled some of the variables more precisely. On the surface, their work suggests that there is a relationship which does not rely on the counter-argument and that when all else is held constant, giving higher grades will result in getting higher ratings. In other words, you can't account for the relationship solely on the basis of the "good teaching begets good learning" counter-argument.

One question about their findings is whether their finding is a psychometric, an instructional, or a psychological phenomenon. Consider that ratings are undeniably a measure of the satisfaction of learners with their learning experience perhaps more than they are a direct or absolute measure of the total quality of instruction. Though this sounds like heresy from a ratings proponent, it is a position that most researchers have held for some time. Consider also that ratings are collected before students get their final grades and thus, their opinions must be based on expectations which are, in turn, based on performance to date. Ratings thus cannot be said to reflect a disconfirmed expectancy about the overall course outcome (i.e., the course grade). This would only be the case if ratings were gathered after final grades were distributed and the final grade disconfirmed what was expected as a result of the experiences during the semester.
So the ratings relationship is limited to experiences and results during the term rather than to the final grade.

So what could account for the grades-ratings relationship if not "better teaching begets better learning"? I propose the following.


Given that student satisfaction is related to the students' perceptions that they have received something of value in return for their tuition dollars, and given that a series of successful classroom and related experiences has unfolded during the semester, it is reasonable to assume that students will believe that they have learned something, will be satisfied, and will provide positive ratings. Now, in many classes (especially the lower level, undergraduate courses that populate many evaluation databases and, by virtue of their enrollments and numbers, may disproportionately influence results) students are not the best judges of the breadth and depth of the instructor's knowledge or the extent to which the instructor has provided a complete, current, or even adequate treatment of the subject. If students have received good grades and positive feedback from their instructors they should: 1) be satisfied; 2) feel that they have learned something; and 3) be able to honestly rate their experiences and their instructors highly [even if the experience might be marginal using some other rating criterion coming from some other group]. The data they provide is thus both valid and reliable. Valid, because they are the appropriate providers of data and (we will assume for the moment) the instruments used ask at least face valid questions. Reliable because there will be little variance across responses (Marsh, 1987) and because their responses will not change much over time (Frey, 1976).

The question now becomes one which deals with: 1) the appropriateness of course content; 2) the standards used for grading; and 3) the question of whether lenient coverage and grading were deliberately chosen in order to influence ratings. The first two items can be and should be dealt with via curricular mechanisms such as departmental review of courses and content, and faculty agreement on standards for student work.
The third item represents an ethical dimension that has much less to do with ratings than it does with behavior. If it is possible to manipulate students covertly (as in "dumbing down" courses so that students are led to believe that they are learning a lot when, in fact, they are only scratching the surface), then the problem belongs to the instructor and/or the department. Don't blame students or the ratings process. If the situation involves overt manipulation (as in making a 'ratings for grades' deal), then the problem belongs to the instructor, the students, the institution, and to higher education itself. Blame the perpetrators and those who let them get away with it, not the process of collecting, reporting, and using ratings data. Ultimately, I believe this suggests that ratings are reliable and valid indicators. If what they indicate doesn't please those who get and use the data, then the users, their departments, or their institutions should correct the situation, rather than discard the data or the processes of collecting, analyzing, or reporting it.

There are three points here. The first is that no system, however well constructed, is safe from manipulation. The second is that there are probably legitimate instances of situations in which the established findings from the ratings literature are violated, even though these situations are too infrequent to influence the results of large database analyses which have been the bases of most research. The third is that ratings are a symptom, not a cause. Many opinions about ratings (like those of Haskell and Stake) seek to "blame" ratings for phenomena like grade inflation. Legitimate concern for the quality of teaching and learning should prompt us to examine the contexts of teaching and learning as well as the ratings that are provided by students, and to carefully review the other available kinds of data before jumping to erroneous conclusions about what "causes" what.
Robert Haskell suggests that we do away with ratings. Perhaps the weakest part of his article is what isn't there: constructive suggestions for improvement. The thrust is simply that we should do away with student ratings. One could just as well argue that we should do away with grades. Either way, no more nasty inflation! WRONG! There are appropriate reasons and uses for both. What's needed is more substantial research, less rhetoric, and better informed faculty and administrators who can distinguish between opinion and evidence, separate mythology from established findings, and explore the situational context in order to arrive at fair and equitable conclusions about faculty performance. Doing this won't violate anyone's rights. In fact, it should lead to better evaluation, and if data are carefully collected, reported, and used, better teaching as well.

REFERENCES
Cashin, W. E. (1988) Student ratings of teaching: a summary of research. (IDEA paper No. 20) Kansas State University, Division of Continuing Education. ERIC # ED 302 567
Cohen, P. A. (1983) Comment on a selective review of the validity of student ratings of teaching. Journal of Higher Education, 54, 448-458.
Cohen, P. A. (1981) Student ratings of instruction and student achievement: a meta-analysis of multisection validity studies. Review of Educational Research, 51, 281-309.
Ebel, K. (1983) The aims of college teaching. San Francisco: Jossey Bass.
Feldman, K. A. (1976) Grades and college students' evaluations of their courses and teachers. Research in Higher Education, 4, 69-111.
Franklin, J. L. & Theall, M. (1991) Grade inflation and student ratings: a closer look. Paper presented at the 72nd annual meeting of the American Educational Research Association. Chicago: April 7. ERIC # ED 349 318
Franklin, J. L. & Theall, M. (1990) Communicating student ratings to decision makers: design for good practice. In M. Theall & J. Franklin (Eds.) Student ratings of instruction: issues for improving practice. New Directions for Teaching and Learning # 43. San Francisco: Jossey Bass.
Franklin, J. L. & Theall, M. (1989) Who reads ratings. Knowledge, attitudes, and practices of users of student ratings of instruction. Paper presented at the 70th annual meeting of the American Educational Research Association. San Francisco: March 31.
Frey, P. (1976) Validity of student instructional ratings. Does timing matter? Journal of Higher Education, 47 (3), 327-336.
Greenwald, A. G. & Gillmore, G. M. (1996a) No pain, no gain? The importance of measuring course workload in student ratings of instruction. Journal of Educational Psychology (in press). Available HTTP: aingain.htm
Greenwald, A. G. & Gillmore, G. M. (1996b) Applying social psychology to reveal a major (but correctable) flaw in student evaluations of teaching. Available HTTP:
Haskell, R. E. (1997) Academic freedom, tenure, and student evaluations of faculty: galloping polls in the 21st century. Educational Policy Analysis Archives V5 #6. Available HTTP:
Marsh, H. W. (1987) Students' evaluations of university teaching: research findings, methodological issues, and directions for future research. International Journal of Educational Research, 11, 253-388.
Marsh, H. W. & Ware, J. E. (1982) Effects of expressiveness, content coverage, and incentive on multidimensional student ratings scales. New interpretations of the Dr. Fox effect. Journal of Educational Psychology, 74, 126-134.
Naftulin, D. H., Ware, J. E., & Donnelly, F. A. (1973) The Dr. Fox lecture. A paradigm of educational seduction. Journal of Medical Education, 48, 630-635.
Perry, R. R., Abrami, P. A., & Leventhal, L. (1979) Educational seduction: the effect of instructor expressiveness and lecture content on student ratings and achievement. Journal of Educational Psychology, 71, 107-116.
Stake, J. E. (1997) Response to Haskell: academic freedom, tenure, and student evaluations of faculty. Educational Policy Analysis Archives V5 # 8. Available HTTP:
Theall, M. & Franklin, J. L. (1990a) Attributions or retributions: student ratings and the perceived causes of performance. Paper presented at the 71st annual meeting of the American Educational Research Association. Boston: April 16. ERIC # ED 319 764
Theall, M. & Franklin, J. L. (1990b) Student ratings of instruction: issues for improving practice. New Directions for Teaching and Learning # 43. San Francisco: Jossey Bass.

About the Author
Michael Theall, Ph.D.
Associate Professor & Director
Center for Teaching and Learning
University of Illinois At Springfield
Springfield, IL 62794-9243
phone: 217-786-7157
fax: 217-786-7188
email:
Education
BA St. Anselm College (1963; English)
MS Syracuse Univ. (1976; Instructional Technology)
PhD Syracuse Univ. (1980; Instructional Design, Development, & Evaluation)