[Catalog metadata (MARCXML/MODS, markup stripped in extraction): Education Policy Analysis Archives (EPAA), ISSN 1068-2341. Vol. 10, no. 36 (September 6, 2002). "Research and rhetoric on teacher certification: A response to Teacher certification reconsidered" / Linda Darling-Hammond. Tempe, Ariz.: Arizona State University; Tampa, Fla.: University of South Florida. Identifier: E11-00284. Date issued (ISO 8601): 2002-09-06.]
Education Policy Analysis Archives
Volume 10 Number 36, September 6, 2002
ISSN 1068-2341

A peer-reviewed scholarly journal
Editor: Gene V Glass, College of Education, Arizona State University

Copyright 2002, the EDUCATION POLICY ANALYSIS ARCHIVES. Permission is hereby granted to copy any article if EPAA is credited and copies are not sold. EPAA is a project of the Education Policy Studies Laboratory. Articles appearing in EPAA are abstracted in the Current Index to Journals in Education by the ERIC Clearinghouse on Assessment and Evaluation and are permanently archived in Resources in Education.

Research and Rhetoric on Teacher Certification: A Response to "Teacher Certification Reconsidered"

Linda Darling-Hammond
Stanford University (Note 1)

Citation: Darling-Hammond, Linda. (2002, September 6). Research and rhetoric on teacher certification: A response to "Teacher Certification Reconsidered," Education Policy Analysis Archives, 10(36). Retrieved [date] from http://epaa.asu.edu/epaa/v10n36.html.

Abstract

In October, 2001, the Baltimore-based Abell Foundation issued a report purporting to prove that there is "no credible research that supports the use of teacher certification as a regulatory barrier to teaching" and urging the discontinuation of certification in Maryland. The report argued that large inequities in access to certified teachers for poor and minority students are not a problem because research linking teacher education to student achievement is flawed. In July, 2002, the U.S. Secretary of Education cited the Abell Foundation paper in his Annual Report on Teacher Quality as the sole source for concluding that teacher education does not contribute to teacher effectiveness. The Secretary's report then
recommended that requirements for education coursework be eliminated from certification standards, and attendance at schools of education and student teaching be made optional. This article documents the many inaccuracies in the Abell Foundation paper and describes the actual findings of many of the studies it purports to review, as well as the findings of other studies it ignores. It details misrepresentations of a number of studies, including inaccurate statements about their methods and findings, false claims about their authors' views, and distortions of their data and conclusions. The article addresses methodological issues regarding the validity and interpretation of research. Finally, the article presents data challenging the Abell Foundation's unfounded claims that uncertified teachers are as effective as certified teachers, that teacher education makes no difference to teacher effectiveness, that verbal ability is the most important determinant of teaching effectiveness, that private schools staffed by uncertified teachers are more effective than public schools, and that untrained teachers are more qualified than prepared teachers. It concludes with a discussion of the policy issues that need to be addressed if all students are to be provided with highly qualified teachers.

In October, 2001, the Baltimore-based Abell Foundation issued a report purporting to prove that there is "no credible research that supports the use of teacher certification as a regulatory barrier to teaching" (Walsh, 2001, p. 5). (Note 2) The Abell Foundation paper argued against Maryland's efforts to strengthen teacher preparation requirements and defended the continuation of a local short-term alternative route into teaching that had come under criticism. Suggesting that "educators, policymakers, the media, and the public mistakenly equate teacher quality with teacher certification" (p.
1), Kate Walsh, the author of the paper, complained that efforts to improve education for poor and minority children in Baltimore by the state and local superintendents of schools and by local advocacy organizations foolishly sought to secure more fully certified teachers for their schools. She cited as wrong-headed newspaper articles raising concerns, for example, that: "Least prepared teachers are at worst city schools: One-third lack basic credentials for certification" (p. 1). Calling misguided the efforts of a Baltimore community group that released a study which "bemoaned the fact that more uncertified teachers were teaching in the city's high-poverty, predominantly African-American schools than the city's whiter, more affluent schools" (p. 2), the paper sought to demonstrate that these inequalities in access to certified teachers are not problematic if certification can be discounted as a determinant of achievement.

The Abell Foundation proposed that Maryland should 1) "eliminate the coursework requirements for teacher certification" and require only a bachelor's degree and a passing score on an appropriate teacher's exam; 2) "report the average verbal ability score of teachers in each school district and of teacher candidates graduating from the State's schools of education;" and 3) "devolve its responsibility for teacher qualification and selection to its 24 public school districts," delegating all hiring authority to individual school principals (pp. vii-viii).

Although these ideas might seem indefensible to those who are engaged in research regarding teacher preparation and recruitment, the U.S. Secretary of Education echoed
these recommendations in his Annual Report on Teacher Quality (USDOE, 2002), a report on the national state of teacher quality required under the 1998 reauthorization of Title II of the Higher Education Act. In this report, the Secretary argued that teacher certification systems are "broken," imposing "burdensome requirements" for education coursework that make up "the bulk of current teacher certification regimes" (p. 8). The report argues that certification should be redefined to emphasize higher standards for verbal ability and content knowledge and to de-emphasize requirements for education coursework, making attendance at schools of education and student teaching optional and eliminating "other bureaucratic hurdles" (p. 19).

The report suggests that its recommendations are based on "solid research." However, only one reference among the report's 44 footnotes is to a peer-reviewed journal article (which is misquoted in the report); most are to newspaper articles or to documents published by advocacy organizations, some of these known for their vigorous opposition to teacher education. (Note 3) For the recommendation that education preparation be eliminated or made optional, the Secretary's report relies exclusively on the Abell Foundation's paper. Though written as a local rejoinder to Maryland's efforts to strengthen teacher preparation and certification, it appears to have become a foundation for federal policy.

This article includes the response I wrote to Walsh's paper (Note 4) when it was first issued, with some additions that respond to a reply she issued with Michael Podgursky (Note 5) and a briefer version of her report recently printed in Education Next, a magazine put out by the Hoover Institution (Walsh, 2002).

In order to make a case for her agenda, Walsh attacks all research that has found relationships between teachers' preparation and their measured effectiveness, including students' achievement.
She characterizes much of the education research as "flawed, sloppy, aged and sometimes academically dishonest" (p. 13), a characterization that more aptly describes her own paper, which consistently misrepresents the statements of researchers, the findings of studies, and the evidence base for her claims. She claims to have reviewed all of the studies ever cited by proponents of teacher education. In fact, a large number of the references in the paper and appendix are not directly on the topic of teacher education, and many studies of teacher education effects are not included in the report. Furthermore, her paper does not actually review most of the studies it mentions. An original report appendix listing studies shrank from 175 in July, 2001 to fourteen in the version of the report released in October, 2001, selected according to no obvious criteria and omitting many of the most prominent studies on the topic. (Note 6) The "reviews" in a now separate appendix published on the foundation's website are generally not careful assessments of research methods or findings but a list of complaints and random observations—sometimes accurate but often not—about various aspects of the studies or how they have been cited by others. (A number of examples are included below.)

All studies have limitations, and some are too problematic to be relied upon, including a number that Walsh relies upon for her own assertions. However, Walsh's paper, which is littered with inaccuracies, misstatements, and misrepresentations, sheds little light on the research or its implications for teacher education and certification. In what follows I discuss the inaccuracies in Walsh's account, the actual findings of many of the studies she purports to review, and the findings of other studies she chooses to ignore, as well as the implications of her proposals for teachers, their knowledge, and the students they
teach.

In the course of the paper, I review some of the studies that have found influences of teacher education and certification on student achievement at the levels of the individual teacher (e.g. Goldhaber & Brewer, 2000; Hawk, Coble & Swanson, 1985; Monk, 1994); the school (Betts, Rueben, & Danenberg, 2000; Fetler, 1999); the school district (Ferguson, 1991; Strauss & Sawyer, 1986); and the state (Darling-Hammond, 2000c). The convergence of findings in analyses using different units of analysis reinforces the strength of the inferences that might be drawn from any single study.

What are the Arguments?

The Abell Foundation report admits that teacher qualifications make a difference, but it also tries to make a case that "the backgrounds and attributes characterizing effective teachers are more likely to be found outside the domain of schools of education. The teacher attribute found consistently to be most related to raising student achievement is verbal ability.... usually measured by short vocabulary tests..." (p. v). Later in the report, Walsh suggests that subject matter knowledge may be an additional criterion for hiring secondary teachers, but not for elementary teachers. Walsh objects to the state requirements regarding content coursework in each of the core academic areas for elementary teachers, since many who want to enter through the alternative Resident Teacher program in Maryland have had trouble meeting these requirements.

Walsh then tries to dismiss all studies that find evidence that knowledge about teaching also makes a difference for teacher performance, or to claim that studies finding positive effects of teacher education or certification are either too old, too small, too highly aggregated, or dependent on evidence about teacher performance other than student achievement, or are not really about certification after all, even if their authors say they are.
She often does this by misrepresenting the studies' actual methods and findings, as I detail below.

While there are legitimate concerns to be raised about various studies in the literature—on all sides of the question—this article does not shed much light on them. A thorough review of the quality and accurately portrayed findings of the several bodies of research that bear on this question would be a service to this field. Unfortunately, this document's inaccuracies and misinterpretations make it of little use in this regard.

In what follows, I address five major issues regarding the Abell report and the research base on teaching and teacher education:

1. Evidence Ignored. Evidence about student learning in reading and other areas documents the need for teachers to have professional knowledge that includes and extends beyond subject matter knowledge. The Abell Foundation report does not consider this evidence or answer the question of how teachers are to acquire this knowledge if they are not professionally prepared.

2. Unfounded Claims. No evidence supports Walsh's claim that either verbal ability or subject matter knowledge alone makes teachers effective. She lacks supporting evidence—and fails to consider contradictory evidence—for her claims about the relative effectiveness of certified and uncertified teachers, the outcomes of teacher education, the primacy of verbal ability as the most important measure of
teaching, the effectiveness of private and public schools and the preparation of their teachers, and the attributes of individuals who enter teaching without certification.

3. Misrepresentations of Research. Walsh's claim that she has reviewed 100 to 200 studies cited in support of teacher education and found that "none of them holds up to scrutiny" is not true. In fact, she is unable to discount a number of important studies that support teacher education or certification. In addition, a large number of the studies relevant to the question of teacher education effects are not reviewed at all in Walsh's paper. Most of the studies she mentions do not concern teacher education or certification directly: at most 80 of the nearly 200 studies listed in the study or appendix are focused on teacher education or certification. A number of those reviewed are badly misrepresented, including inaccurate statements about their methods and findings, false claims about their authors' views, and distortions of their data and conclusions. Many are not reviewed for their methods and findings, but are dismissed because of their sample size, age, dependent variable, or publication venue—unless Walsh likes one of the findings, in which case she uses the study, sometimes after already having dismissed it. Even the studies that Walsh says she reviewed are missing from the appendix of the report, where she refers readers for evidence. (Note 7)

4. Methodological Issues and Double Standards in Using Research. Walsh misunderstands some fundamental research design issues, including the difference between experimental and correlational studies and the interpretation of research conducted at different levels of aggregation.
In her effort to make the evidence base about teacher education disappear, Walsh eliminates from consideration studies that have been cited regarding the contributions of various measures of teacher qualifications to teacher effectiveness if they have small sample sizes, if they were published more than 20 years ago, or if they were published as dissertations, technical reports, or conference papers rather than in peer-reviewed journals. She also eliminates all studies that use measures of teacher effectiveness other than student achievement (e.g. supervisors' ratings of performance, researchers' observation-based measures of teacher practice). There are legitimate issues associated with the sample size, age, quality assurance, and measurement that warrant discussion (see below). However, as a blanket means of eliminating evidence from consideration, this strategy is problematic, as Walsh's frequent citations of studies that fail to meet her own criteria suggest.

5. Illogical Policy Conclusions. While it is clear that teacher certification systems are not perfect and there are many weak teacher education programs, points that I have frequently made in my own research, it does not follow that the response to these problems should be to eliminate expectations for teachers to acquire the knowledge they need to teach students effectively. The more appropriate policy response is to improve the quality of teacher education—a process that has been underway with important results in a number of states, and one that rests on the processes of accreditation and certification that provide policymakers with levers for change and improvement.

Evidence Ignored
While the Abell Foundation report claims that teachers do not need professional knowledge in order to teach, the field has been moving rapidly to codify the ways in which teaching knowledge makes a difference in student learning. For example, the National Reading Panel of the National Institute of Child Health and Human Development last year published a major review of carefully controlled research which found that children's reading achievement is improved by systematic teaching of phonemic awareness, guided repeated oral reading, direct and indirect vocabulary instruction with careful attention to readers' needs, and a combination of reading comprehension techniques that include metacognitive strategies.

The report notes that teacher education is critical to the success of reading instruction with respect to both instruction in phonemic awareness and more complex comprehension skills:

Knowing that all phonics programs are not the same brings with it the implication that teachers must themselves be educated about how to evaluate different programs to determine which ones are based on strong evidence and how they can most effectively use these programs in their own classrooms. It is therefore important that teachers be provided with evidence-based preservice training and ongoing inservice training to select (or develop) and implement the most appropriate phonics instruction effectively. (p. 11)

Teaching reading comprehension strategies to students at all grade levels is complex. Teachers not only must have a firm grasp of the content presented in the text, but also must have substantial knowledge of the strategies themselves, of which strategies are most effective for different students and types of content and of how best to teach and model strategy use....
(Data from the studies reviewed on teacher training) indicated clearly that in order for teachers to use strategies effectively, extensive formal instruction in reading comprehension is necessary, preferably beginning as early as pre-service (National Reading Panel, 2000, pp. 15-16).

Studies have documented that professional training can be effective in providing teachers with the strategies that enable them to teach these complex comprehension skills, and teachers who receive such training significantly improve students' reading outcomes (e.g., Duffy, Roehler, Sivan et al., 1987; Duffy & Roehler, 1989, regarding explicit strategy instruction; Palincsar & Brown, 1989, regarding reciprocal teaching).

Similar insights in our understanding of how to develop student proficiency in mathematics and science, and how to develop teachers' skills for doing so, have recently emerged. For example, recent analyses of the National Assessment of Educational Progress (NAEP) which control for student characteristics and a number of measures of school inputs have found that students whose teachers have majored in mathematics or mathematics education, who have had more pre- or in-service training in how to work with diverse student populations and more training in how to develop higher-order thinking skills, and who engage in more hands-on learning do better on the NAEP mathematics assessments. Similarly, students whose teachers have majored in science or science education and who have had more pre- or in-service training in how to develop laboratory skills and who engage in more hands-on learning do better on the NAEP
science assessments (Wenglinsky, 2000). (Note 8)

A recent review commissioned by the Department of Education, which was carefully vetted by a panel of researchers, disagreed with the Abell Foundation's conclusions. This review, which analyzed 57 studies that met specific research criteria and were published after 1980 in peer-reviewed journals, concluded that the available evidence demonstrates a relationship between teacher education and teacher effectiveness (Wilson, Floden, & Ferrini-Mundy, 2001). The review shows that empirical relationships between teacher qualifications and student achievement have been found across studies using different units of analysis and different measures of preparation and in studies that employ controls for students' socioeconomic status and prior academic performance.

It is ironic that just as the field is learning more about how to prepare teachers to teach children effectively, the Abell Foundation suggests that we truncate teacher education and end the certification policies that would encourage and enable teachers to acquire this knowledge—or at least that we do so for the children of the poor, who also attend school in districts with minimal resources for professional development. The unanswered question is, How are teachers to learn what is known about how to teach well if there are no expectations, incentives, or supports for them to do so?

Unfounded Claims

While ignoring these serious questions, Walsh makes a number of claims that are not supported either by the research she presents or by other evidence in the field. These include the following:

New teachers who are certified do not produce greater student gains than new teachers who are not certified.

There is little evidence that the content and skills taught in preservice education coursework is (sic) either retained or effective.

Verbal ability and subject matter alone are sufficient to produce effective teachers.
Private schools do not hire certified teachers and they are more effective than public schools.

Individuals with higher academic ability will be recruited to teaching if certification standards are eliminated.

The Effectiveness of Certified and Uncertified Teachers

For her proposition that "new teachers who are certified do not produce greater student gains than new teachers who are not certified," Walsh cites seven studies, none of which provides support for this proposition, and five of which actually provide evidence that contradicts her claim. Three of the studies (Bliss, 1992; Stoddart, 1992; Lutz & Hutton, 1989) include no data on student achievement at all, although Walsh elsewhere dismisses all other studies that do not use student achievement data as the dependent variable. (In a reply to my response, Walsh and Podgursky (2001) note that these studies have been deleted in a newly printed version, along with some studies Walsh cited that were not peer reviewed, "so that the report ... does not appear to convey a double
standard" (p. 15)).

Six of the studies Walsh cites actually deal with alternatively certified rather than uncertified teachers—that is, teachers who had undertaken teacher education at the post-baccalaureate level in university- or school district-based programs that rearrange the way teacher education is delivered. The findings across the studies are mixed, but none of them shows that uncertified teachers do as well as certified teachers, and one of them shows that this is clearly not true. Several of the studies point instead to the value of teacher education: The more positive findings are found for the alternatives that provide more complete preparation.

1. Bliss (1992) wrote about the Connecticut alternative certification program, a two-year training model which the author notes features "a significantly longer period of training than in any other alternate route program" in existence at that time (p. 52). This report does not examine uncertified teachers, nor does it meet Walsh's criteria for inclusion in a review of literature, because it includes no data about teacher effectiveness as gauged by student achievement measures. Bliss notes that most recruits reported their initial training to be helpful, and she briefly mentions results from another researcher's survey of recruits' supervisors which suggested mixed reviews of their performance: 33 percent of supervisors said that the alternate route teachers were weaker than others in classroom management (presumably, then, 67 percent said they were not weaker than others in this area), while 38 percent said they were stronger than others in teaching skills (and 62 percent presumably said they were not stronger than others in this area).

2. Stoddart (1992) reports on the subject matter qualifications and attrition rates of recruits to the Los Angeles Teacher Trainee Program, also a two-year training model.
She found that content qualifications were comparable to those of traditionally trained recruits, except for math recruits, who had lower GPAs than traditionally trained mathematics teachers, and that attrition rates for those who entered were relatively low in the first two years but higher than national rates after 5 years. (Note 9) Results cited by Stoddart from other studies about the observed practices of these teachers in comparison with university-trained teachers produced mixed results: university-trained English teachers appeared more skillful than alternate route teachers, but the levels of skill appeared lower for mathematics teachers from both groups.

3. Lutz and Hutton (1989) compared the demographic characteristics, attitudes, certification test scores, and opinions of Dallas Public Schools' alternative certification (AC) recruits with other first year teachers in the district. Like the other studies noted above, this study did not examine student achievement gains of the recruits' students. The program provides summer training to recruits and then places them in mentored internships during the school year while they are completing other coursework. The study found many similarities but some differences between AC recruits and other first year teachers, including significantly lower rates of expected long-term continuation in teaching for the AC recruits (40% vs. 72% for other first year teachers). They also examined supervisors' perceptions of recruits—a measure that Walsh argues should eliminate other studies from consideration. These were positive for the 54% of the pool (59 out of 110) defined as "successful" interns in the study—those who completed the intern year without dropping out (10%) or being held back for
another year or more due to 'deficiencies' in various areas of performance (36%). The study also reported data from another evaluation of the program by the Texas Education Agency (Mitchell, 1987), which surveyed principals, finding that:

The principals rated the [traditionally-prepared] beginning teachers as more knowledgeable than the AC interns on the eight program variables: reading, discipline management, classroom organization, planning, essential elements, ESL methodology, instructional techniques, and instructional models. The ratings of the AC interns on nine other areas of knowledge typically included in teacher preparation programs were slightly below average in seven areas compared with those of beginning teachers. It might therefore be assumed that pre-service teacher education programs are doing something right! (p. 250).

In the paragraph cited above, Lutz and Hutton wax enthusiastic about preservice teacher education programs that seemed in these data to outperform the alternative route. Later they wax enthusiastic about the alternative route, given results from another survey of principals, most of whom felt that alternative credential candidates who eventually made it through the program were comparable to other beginning teachers. At the end of the piece, they note that the high attrition rates and difficulty maintaining the program suggest the alternate route will not likely be a long-term solution to teacher supply problems. Although Walsh cites Lutz and Hutton's enthusiastic feelings about the AC program, she does not accurately report the complete data from the study, including the low rates of successful program completion, the low rates of planned retention in teaching, and the mixed reviews of their performance. In her appendix, she includes this study with the following "review:" "Darling-Hammond ignores the unqualified authors' (sic) endorsement of the merits of alternative route to teaching...."
One presumes that she means to reference the authors' "unqualified endorsement" rather than to call the authors themselves unqualified. Yet as the above excerpts make clear, the study does not provide an unqualified endorsement of the program.

Walsh repeats this mistake in the appendix when she critiques a review of alternate certification programs (Darling-Hammond, 1992). She states that, "Darling-Hammond cites the findings from many studies that looked at alternative programs; but she does not include findings that show alternatively trained teachers are at least as effective at raising academic achievement as those who graduate from traditional programs" (p. A-3), citing Lutz and Hutton (1989), despite the fact that their study presented no empirical data on academic achievement of students and presented mixed evidence about the rated performance and retention rates of these recruits.

Two other studies Walsh cites do include student achievement data, but they do not, as she states, compare certified with uncertified teachers. Both deal with alternatively certified teachers who receive a substantial amount of education coursework while they are undertaking mentored teaching supervised by both university supervisors and classroom mentors.

4. Miller, McKenna, & McKenna (1998) is a matched comparison group study of what the study's authors call a "carefully constructed" university-based alternate
route program for middle school teachers. Reflecting the characteristics of alternative routes endorsed by the National Commission on Teaching and America's Future (1996), this program offered 15 to 25 credit hours of coursework before interns entered classrooms where they were intensively supervised and assisted by both university supervisors and school-based mentors while they completed additional coursework needed to meet full standard state certification requirements. Forty-one of these teachers were compared to a group of 41 traditionally certified teachers matched for years of experience, using ratings of their teaching conducted by trained observers. Then student test score data were collected for 18 of these teachers. Although the sample size is too small to meet Walsh's criteria (Note 10) for studies worth considering (a point she seems to have forgotten here), and data are not provided on student pre-test scores, the study appears reasonably well-conducted.

The traditionally trained teachers in this study felt somewhat more confident in their practice and scored slightly higher on the two sub-scales of an observation instrument used by trained observers to rate their teaching. However, these differences were not significant, and the authors report, without including the actual data analyses, that there were no significant differences in the student achievement of 18 teachers from the two groups by the 3rd year of practice after both had completed all of their education coursework. (The authors did not control for prior achievement levels of students; however, they stated that the initial differences in student achievement across groups were not significant.)
Because the design of this program was so different from many quick-entry alternative routes, Miller, McKenna, and McKenna note that their studies "provide no solace for those who believe that anyone with a bachelor's degree can be placed in a classroom and expect to be equally successful as those having completed traditional education programs.... The three studies reported here support carefully constructed AC programs with extensive mentoring components, post-graduation training, regular in-service classes, and ongoing university supervision" (p. 174). This finding does not support Walsh's contentions throughout her paper that only general intelligence and subject matter knowledge make a difference for teacher effectiveness, her statement that uncertified teachers do as well as certified teachers, or her claim that there is no evidence which supports teacher education and certification.

The other study on alternative certification cited favorably by Walsh (Bradshaw & Hawk, 1996) was not published as a peer-reviewed article or research report, one of Walsh's criteria for rejecting the results of other reports. It is actually not an empirical study but a literature review that, like other reviews Walsh criticizes, is based on a mixture of unpublished papers and on studies that, for the most part, do not examine student achievement. Some of the papers cited do not include empirical evidence at all. Walsh characterizes the report's findings as providing "mixed, inconclusive" evidence. This is certainly true. Studies examining measures of knowledge, teacher beliefs and attitudes, teacher ratings, and student views report no differences on some measures and differences, typically favoring traditionally prepared teachers, on others, especially measures of professional knowledge and performance.

With respect to student achievement, Bradshaw and Hawk list five papers that discuss outcomes for differently trained teachers. The first, an unpublished paper by Barnes, Salmon, and Wale (1989), does not present any empirical data or discussion of specific studies, but it includes a statement that two districts in Texas reportedly found equivalent outcomes for alternative and traditional program teachers. While it does not mention what programs might have been compared, it does include a table listing teacher education programs designated as alternatives. This list includes one- and two-year university-based master's programs (which are called "alternative" in Texas because they are not undergraduate models) along with district alternative programs that generally offer only a few weeks of summer training before teachers are assigned to classrooms. Thus, the "alternative" group included programs providing extensive graduate-level training of the sort that many states would call "traditional," along with programs that provide little formal preparation. Aside from the unanswered question of what analyses some unnamed parties might have done to support assertions about relative effects, the wide range of program models included as "alternative" precludes any inferences about the effects of preparation on teacher effectiveness.

A second study, by Denton & Peters (1988), provides another example of the definitional problems associated with the terms "alternative" and "traditional." This paper actually studied two versions of a university's college-based teacher education program. The one called "alternative" in their paper was in fact an expansion of the regular teacher education program, rather than a reduction in coursework.
Graduates of this more extensive curriculum had students who had stronger performance in earth and physical sciences, while scores in mathematics were stronger for students of the regular teacher education program.

Of the remaining studies, two found that student achievement gains were higher for the students of traditionally prepared teachers in language arts (Gomez & Grobe, 1990, in a comparison with alternatively certified teachers) and mathematics (Hawk, Coble, & Swanson, 1985, in a comparison with uncertified mathematics teachers). The last (Stafford & Barrow, 1994) did not present original research but referenced studies reporting differences, associated primarily with teaching experience, between the performance of alternative program teachers, other first-year teachers, and experienced teachers.

In combination, these studies do not provide any support for the statement that uncertified teachers are as effective as certified teachers. In addition to its other inaccuracies, Walsh's review confuses alternative certification, a strategy that provides candidates with preparation that is differently packaged from what various states deem "traditional" training (usually the difference is that training is post-baccalaureate rather than undergraduate and is streamlined into about a year rather than spread across four years of college), with lack of certification, which generally indicates a lack of preparation. Having already missed this critical distinction, Walsh does not begin to attempt to sort out the effects of the differences in preparation experiences and outcomes associated with different models of teacher education. Thus, she does not note that program designs that include a comprehensive and coherent program of coursework and intensive mentoring (e.g., Miller, McKenna, & McKenna, 1998) have been found to produce more positive evaluations of candidate performance than models that forego most of this coursework and supervised support.

For example, a comparative study of more than 200 alternative certification candidates in New Hampshire, who are certified via three years of on-the-job training in lieu of formal preparation, found they were rated by their principals significantly lower than university-prepared teachers on instructional skills and instructional planning, and they rated their own preparation significantly lower than did the university candidates (Jelmberg, 1995). To understand the outcomes of different approaches, studies of alternatives need to acknowledge the differences in program models.

Finally, Walsh cites two additional studies that include uncertified teachers, but she gets the findings wrong. Neither study shows that uncertified teachers do as well as certified teachers. One shows that the reverse is true.

In one study (Goldhaber & Brewer, 2000), the authors found that high school students who had a certified teacher in mathematics did significantly better, after controlling for initial achievement and student demographic factors, than those who had uncertified teachers. The same trends were true in science, but the influences were somewhat smaller. The effects of certification on achievement were larger than, and in addition to, the effects of a subject matter degree. In this sample, students of a small number of science teachers who held emergency or temporary certification (24 out of the 3,469 teachers in the overall sample) did no worse than the students of certified teachers, although they, too, did better than the students of uncertified teachers. Another analysis of these data (Darling-Hammond, Berry, & Thoreson, 2001) showed that in this sample most of the teachers on temporary/emergency certificates were experienced and most had education training comparable to that of the certified teachers.
Most appeared to be already licensed teachers from out-of-state who were in the transition period to securing a new state license, or experienced teachers teaching out of their main field. Only a third were new entrants whose characteristics may have suggested a content background with little education training. The students of this sub-sample of teachers had lower achievement gains, in an analysis of covariance that controlled for pre-test scores, content degrees, and experience, than those of the more experienced and traditionally trained teachers.

Finally, Walsh cites a recently released study of Teach for America (TFA) by Raymond et al. (2001). This study is relevant to Walsh's discussion of the Resident Teacher Program through which she notes that many TFA recruits enter teaching in Maryland. However, the study did not compare certified to uncertified teachers, as Walsh claims. Although they had the data to do so, the authors chose not to examine how TFA teachers performed in comparison to trained or certified teachers. The study examined the influences of TFA teachers on student achievement scores, using regression methods that controlled for teacher experience and school demographics; thus, the comparison was between TFA recruits and other inexperienced teachers in high-minority schools in Houston, where most underqualified teachers are placed. Since about 50% of Houston's new hires are uncertified and about 35% were found to lack a bachelor's degree in the most recent year of the study, TFA recruits were compared to an extraordinarily underprepared set of teachers. In this comparison, students of TFA teachers did about as well as those of other inexperienced, largely untrained teachers, many of them without bachelor's degrees. (Reviewers of this report have noted that the report should have compared TFA recruits to other BA holders and to prepared or certified teachers; based on the statistics shown, it is not clear that the results of these comparisons would be favorable to TFA.) (Note 11) Another study that compared TFA teachers to certified teachers found significantly higher scores for the students of certified teachers (Laczko-Kerr and Berliner, 2002).

The Raymond et al. report also indicated that minority students in Houston, who are disproportionately taught by these underprepared teachers, lose ground academically each year. In addition, only about 50% of African American and Latino 9th graders in Houston graduate from high school four years later (Haney, 2000; NCES, 2000). It would be hard to argue that the assignment of so many underprepared teachers to these students has nothing to do with their lack of success.

The TFA study found that students of experienced teachers performed significantly better than students of inexperienced teachers, including TFA recruits. Along with the report's finding that, over a three-year period, between 60% and 100% of TFA candidates had left after their second year of teaching, this finding raises additional questions about Teach for America's contribution to the education of Houston students, since they do not stay long enough to gain the experience that could support student achievement. Earlier data from the Maryland Department of Education showed that TFA recruits in Baltimore had similar attrition rates, with 62% gone by the third year of teaching (Darling-Hammond, 2000b). These high attrition rates resemble those found in some other studies of short-term alternative routes (Darling-Hammond, 2000c) and suggest another important outcome of teacher preparation policies.
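Several of the studies discussed above (Goldhaber & Brewer's analysis and the Raymond et al. regressions, for example) estimate teacher effects only after adjusting for prior achievement and student demographics. The sketch below illustrates that general design on synthetic data; the variable names, coefficients, and sample size are invented for illustration and are not taken from any study cited here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Synthetic student records: a pre-test score, a demographic indicator,
# and a teacher-certification indicator. The +1.5 certification effect
# is an assumption built into the fake data, not an empirical estimate.
pretest = rng.normal(50, 10, n)
low_income = rng.integers(0, 2, n).astype(float)
certified = rng.integers(0, 2, n).astype(float)
posttest = 0.8 * pretest - 2.0 * low_income + 1.5 * certified + rng.normal(0, 5, n)

# OLS with an intercept; the coefficient on "certified" is the effect of
# certification holding prior achievement and demographics constant.
X = np.column_stack([np.ones(n), pretest, low_income, certified])
coefs, *_ = np.linalg.lstsq(X, posttest, rcond=None)
print(f"adjusted certification effect ~ {coefs[3]:.2f}")
```

The point of the controls is that a raw comparison of certified and uncertified teachers' students would confound teacher preparation with who those teachers are assigned to teach; regression adjustment separates the two, which is why the studies above emphasize it.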
Both the Houston study and Walsh's own review indicate that experienced teachers are more effective than inexperienced teachers (Walsh, pp. 5-6), yet many short-term alternative program recruits leave quickly. Other research indicates that those who complete 5-year teacher education programs enter and stay in teaching at much higher rates than 4-year teacher education graduates, who stay in teaching at higher rates than teachers hired through alternatives offering only short-term summer training before full-time teaching (Andrew & Schwab, 1995; Darling-Hammond, 2000b). One reason for this might be the fact that 5-year program graduates typically have both a disciplinary major and a full year of student teaching tightly integrated with education coursework.

Student teaching appears to make a strong difference in teacher retention. In a longitudinal study of recent college graduates who entered teaching in 1993, a recent NCES report notes that recruits without student teaching (most common among untrained recruits or those who enter through shorter-term alternative routes) leave teaching at rates nearly twice as high as those who have had this kind of clinical training (Henke, Chen, & Geis, 2000). The authors noted:

In comparison with new teachers who had less training in pedagogy, those with more training were less likely to have left teaching without returning by 1997. Fifteen percent of those who had student taught had left the profession and not returned by 1997, compared with 29 percent of those who had not student taught. Whereas 14 percent of certified teachers had left by 1997, 49 percent of those without certification had done so (p. 49).

Findings about the high attrition rates of those hired without full preparation for teaching raise questions about the cost-effectiveness of a recruitment strategy that relies on teachers with little preparation who are likely to leave the profession before they can learn to become effective with children. Meanwhile, the children they have taught (almost always the most disadvantaged students in the most disadvantaged schools) have not had the benefit of a teacher with either professional knowledge or experience, two sources of greater teaching skill.

A recent study in Texas showed that teacher attrition costs school systems at least $8,000 for each recruit who leaves in the first few years of teaching (Texas Center for Educational Research, 2000). It estimated that the high attrition of beginning teachers in Texas, a growing number of whom enter with little or no preparation and receive few supports in learning to teach, costs the state more than $200 million per year (p. 16). This and other studies of teacher attrition suggest that policymakers should consider both teaching effects and retention patterns when they think about how to recruit and prepare teachers.

Walsh chooses to ignore other studies showing that certified teachers do better than uncertified teachers. One of these, by Hawk, Coble, & Swanson (1985), entitled "Certification: It Does Matter," found, in contradiction to Walsh's statement cited above, that teachers' certification in mathematics has a large and statistically significant effect on student achievement gains in both general mathematics and, to an even greater extent, in algebra. It compared pre- and post-test scores of students whose teachers were certified in mathematics with those of teachers with similar levels of experience who were uncertified in mathematics.
This study is dismissed in one part of Walsh's review as too small (p. 34), so that its findings can be discounted with respect to certification. However, the size of the study does not appear to matter to Walsh when she chooses to cite it as a basis for arguing that only subject matter makes a difference to teaching effectiveness (p. 65). This double standard about the use of research permeates the report. A study is declared inadequate when it finds any contribution of teacher education or certification to any measure of teacher effectiveness, but a study of comparable size or methodology, often the same study, is embraced elsewhere and used to support a different argument.

While the study does have a small sample size (it examined 36 teachers, paired by school, course, and ability level of students being taught, and the 826 students they taught), it is a reasonably well-controlled matched comparison design. The study does support the idea that subject matter knowledge matters to teaching. However, Walsh misrepresents the study as suggesting that only subject matter knowledge matters. The study did not directly examine the isolated effects of subject matter knowledge but the combined effects of subject matter knowledge and educational knowledge, including methods courses in the teaching of the content area, that are part of the certification requirements for an in-field credential. Authors Hawk, Coble, and Swanson concluded:
The results of this study lend support to maintaining certification requirements as a mechanism to assure the public of qualified classroom teachers... (p. 15). (Note 12)

As this and other studies reviewed here suggest, content knowledge in combination with content pedagogical knowledge (that is, knowledge about how to teach the content), which, together with student teaching, constitute the major components of certification, appear to make contributions to student learning that exceed the contributions of either component individually. An important policy point from this and other studies of certification is the fact that teachers would not have been guided or encouraged to acquire the content knowledge and content pedagogical knowledge represented by in-field certification unless there were certification requirements. While Walsh and the Fordham Foundation manifesto she endorses would turn all hiring decisions over to principals, it was principals in these schools, and in many others across the country, who hired and assigned out-of-field teachers to teach mathematics as well as other subjects (Ingersoll, 1998). In a policy world that eliminates teacher certification, there would be no barrier to that practice occurring on an even more widespread basis.

Another, much larger study resulted in similar findings about teacher certification in California. Fetler (1999) examined the relationship between school scores on the state's mathematics test and teachers' average experience levels and certification status in 795 high schools, after controlling for student poverty rates and test participation rates. It found that the percent of teachers on emergency credentials exerted a strong and highly significant negative influence on student achievement. The author concluded that, "After factoring out the effects of poverty, teacher experience and preparation are significantly related to achievement" (p. 13).

This study is cited but never discussed in Walsh's revised report. In her original appendix, Walsh applauded the study's methods but then sought to dismiss its findings with two inaccurate assertions. First, she suggested, incorrectly, that the study's results pertained to subject matter knowledge alone, not to the combination of subject matter and teaching knowledge represented by certification. She misread both the study and the requirements of California's credentialing system to make this claim, appearing to believe that individuals who have passed only the subject matter requirement of a content test are granted full credentials in California (they are not), that individuals who are certified through internship programs (California's alternative route) do not have to complete pedagogical requirements (this is false), and that individuals are hired on emergency permits solely if they lack content knowledge (this is also false). (Note 13) Walsh also suggested, incorrectly, that the study "may have some basic methodology problems, by reaching conclusions using aggregated state-wide data." However, all of the study's data are aggregated to the school level, not the state level. (See the author's confirmation of this statement, below.)

In the original appendix, (Note 14) Walsh stated:

The article would be only be of interest if someone tried to assert that a teacher who knows no math could be a good math teacher. Any attempt to use this study as evidence against the practice of hiring alternatively trained teachers, as appears to be Darling-Hammond's implies (sic) and as Wilson et al. interpret it, loses all of its impact after reading Fetler.... In fact the author.... is primarily advocating ensuring that math teachers take more subject matter coursework, and is clearly disinterested in any effect that may be had from coursework in "professional knowledge."

The author, Mark Fetler, took strong issue with this interpretation of his findings. When I shared Walsh's statement with Fetler, he wrote in reply:

I am surprised that Kate Walsh makes those statements. I had a brief telephone conversation with her, but she was not forthcoming about her intent. Meeting the subject matter requirement involves both knowing the topic, e.g., Algebra, and the specific procedures needed to teach it in the classroom. Someone who knows how to solve quadratic equations, but does not know how to convey that information to children in a classroom, is a poor teacher. Both math subject knowledge and math pedagogy are essential. I believe that my study is consistent with these statements.... I would be surprised to hear of any research that demonstrated successful teaching that lacked either of those elements. My study supports the importance of appropriate credentials. Supposing that you could find people who know math to teach, if they lack the ability to communicate effectively with children, they will not succeed in the classroom and will create dissatisfied students, parents, colleagues, administrators, and board members. It will be a mess. Higher standards, not lower, are the solution.

Fetler also noted that, "the unit of analysis in my paper is the school.
It is not based on statewide aggregated data."

Two other recent school-level studies in California have found significant negative relationships between average student scores on the state examinations and the percentage of teachers on emergency permits, after controlling for student socioeconomic status and other school characteristics (Betts, Rueben, & Dannenberg, 2000; Goe, forthcoming). Like Fetler's study, these studies also found smaller positive relationships between student scores and teacher experience levels, with negative effects on student achievement associated with the proportion of beginning teachers.

California's experience is a good example of what happens when pressures and supports for hiring credentialed teachers are relaxed. After nearly a decade of inadequate and unequal salaries, easy access to emergency permits and waivers, and few incentives for the training and equitable distribution of qualified teachers for high-need fields and locations, California, now one of the lowest-achieving states in the nation, found itself with more than 40,000 teachers teaching on emergency permits or waivers by 1999-2000. The vast majority of these teachers were teaching in a small number of urban school systems in schools with the highest proportions of low-income students and students of color. High-minority schools were nearly seven times as likely to have uncredentialed teachers as low-minority schools. Low-achieving schools were nearly five times as likely to have uncredentialed teachers as high-achieving schools (Note 15) (Shields et al., 2000, pp. 41-43).

These results mirror those already noted in Baltimore, Houston, and other cities. The pattern appears across the country. For example, a recent series in the Chicago Sun-Times (Note 16) documented that "children in the state's lowest-scoring, highest-minority and highest-poverty schools were roughly five times more likely to have teachers who had flunked at least one certification test" and were least likely to have teachers who were "correctly certified." The burden should be on those who argue against efforts to ensure minimally qualified teachers for all students to prove that the confluence of race, poverty, and low achievement with the presence of untrained and uncertified teachers does not further disadvantage our nation's most vulnerable students.

Evidence about Preservice Teacher Education

For the proposition that "there is little evidence that the content and skills taught in preservice education coursework is (sic) either retained or effective" (p. 7), Walsh cites two articles (Murnane, 1983; Veenman, 1984) from among the many dozens of studies of teacher education that could have been retrieved from the peer-reviewed literature, had she done a search. Both of these are very old pieces, published long before recent reforms in teacher education. Neither of them makes any statement in support of Walsh's claim.

Veenman (1984) describes the problems most frequently cited by novice teachers. These included concerns about topics ranging from classroom management to teaching loads and class sizes. Nowhere in the article does he suggest that what teachers learned in preservice education was not retained or effective. In fact, he notes that researchers should look more to the conditions of schooling than to teacher education for explanations for many of the problems beginning teachers cite.
Veenman notes that the outcomes of teacher education may vary by characteristics of programs, citing studies finding that those who had had more intense student teaching, more competency-oriented teacher education coursework, or who were more satisfied with their teacher education experiences reported fewer problems in the classroom.

Murnane's (1983) article is not an empirical study but a brief commentary on the work of another author who proposed the development of doctoral degrees for teacher leaders. While he questions the value of doctoral education for developing pedagogical skills (as would I), Murnane is careful to point out that there are forms of teacher education that may be helpful, and that lack of evidence in large data sets about the effects of preservice education may be related to the lack of data collected on the topic at that time, nearly 20 years ago. (See additional discussion of this point under "Evidence about Verbal Ability" below.)

Walsh ignores the findings of other studies on this topic, including some she has cited for other propositions. She criticizes Evertson, Hawley, and Zlotnik (1985) for their interpretation of the findings of Edward Begle (1979), "a respected mathematician," regarding his findings about teachers' subject matter preparation (p. 34). In one of the few early data sets providing evidence about teacher preparation (a mammoth study of 112,000 students conducted through the National Longitudinal Study of Mathematical Abilities), Begle (reported in Begle & Geeslin, 1972 and, with additional data, in Begle, 1979) found that measures of teacher subject matter knowledge did not exert strong influences on student achievement. He also found that coursework in mathematics methods had a stronger effect on student achievement than higher-level coursework in the subject matter (discussed in Begle, 1979). On the lack of influence of subject matter knowledge in his earlier study (Begle & Geeslin, 1972), Begle noted, and Walsh reports, that the teachers in the study may have had stronger content knowledge than the norm, since they had all been accepted to a National Science Foundation Summer Institute. This is an appropriate point.

However, Walsh chooses to ignore Begle's findings about the value of education coursework. She does not explain why. Walsh cites Begle's work at several points in her text, and refers readers to her appendix for a review of his work that is no longer there. In her separately published appendix, Walsh admits of Begle (1979) that, "this is a scholarly work, employing defensible analyses at the time it was written for examining the data." She then nonetheless sought to dismiss it with a vague statement about possible aggregation bias (although achievement data were aggregated only to the classroom level), "too many variables" in the data set, and "much greater variance in the number of subject matter courses teachers took than the number of methodology courses they took." This last complaint is particularly odd. The implication of greater variability in subject matter courses contradicts the point she makes above about the possibly high levels of subject matter knowledge among sample members (in re: Begle & Geeslin, 1972). In fact, wider variability would generally make it easier to find effects, if they are there to be found, rather than harder.
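The statistical point here can be made concrete with a quick simulation. The scales below are made up; the only claim illustrated is the textbook one that an ordinary-least-squares slope is estimated more precisely when the predictor varies more.

```python
import numpy as np

rng = np.random.default_rng(2)

def slope_standard_error(x_sd: float, n: int = 500, noise_sd: float = 1.0) -> float:
    """Standard error of an OLS slope for one simulated predictor sample.

    The analytic form is noise_sd / sqrt(sum((x - mean(x))^2)), so a wider
    spread in x shrinks the standard error for the same noise and n.
    """
    x = rng.normal(0.0, x_sd, n)
    return noise_sd / np.sqrt(np.sum((x - x.mean()) ** 2))

narrow = slope_standard_error(x_sd=0.5)   # little variation in course-taking
wide = slope_standard_error(x_sd=2.0)     # wide variation in course-taking

# The same true effect yields a larger t-statistic (smaller standard error)
# when the predictor varies more, i.e., it is easier to detect.
print(narrow > wide)  # True
```

This is why greater variance in subject matter coursework, far from being a defect of Begle's data, would have made a subject matter effect easier to find had one been there.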
In another instance (regarding Byrne, 1983), Walsh notes, correctly, that the limited variability in subject matter coursework levels may have made effects more difficult to find. Walsh seems confused about the research findings and their implications, but clear about her goal of discrediting any results that support the value of teachers learning about how to teach their content to others.

Monk (1994) offers similar findings on this question from a more recent data set that incorporates more fine-grained variables about teacher education. Using data on 2,829 students from the Longitudinal Study of American Youth, Monk (1994) found that teachers' content preparation, as measured by coursework in the subject field, is positively related to student achievement in mathematics and science, but he notes that the relationship is curvilinear, with diminishing returns to student achievement of teachers' subject matter courses above a threshold level (e.g., five courses in mathematics). In addition, teacher education coursework (e.g., methods courses in the content area) had a positive effect on student learning in mathematics, exhibiting "more powerful effects than additional preparation in the content area" (p. 142). Monk concluded that "a good grasp of one's subject area is a necessary but not a sufficient condition for effective teaching" (p. 142).

Monk told me that when Walsh first shared her brief appendix review of his work with him, he was surprised that she had used his work to emphasize the importance of subject matter knowledge without acknowledging his findings on the value of education courses. He noted in an email to me that he had communicated to Walsh that:
My study of relationships between teacher course-taking experiences and subsequent student gains in performance showed that the number of both content courses and content-specific pedagogy courses in a teacher's background is positively related to pupil test score gains in the relevant content area. It is misleading to report the positive results for the content courses and to not acknowledge the positive results for the pedagogy courses.

After Monk communicated with Walsh, she did acknowledge in her appendix that Monk's study provides support for the contention that education coursework has a positive effect on teaching performance; however, she did not incorporate this admission in her claims that "not one" of the studies ever cited on this topic provides such support.

In addition to newer databases that allow some large-scale examinations of the influences of teacher education variables on student achievement, recent studies have begun to look at the outcomes of different teacher education program designs. For example, studies of 5-year teacher education programs (programs that include a bachelor's degree in the discipline plus an additional year of education study and extended student teaching) have found graduates to be more confident and better rated than graduates of 4-year programs in the same institutions and as effective as more senior teachers, as well as more likely to enter and remain in teaching (Andrew & Schwab, 1995; Denton & Peters, 1988). Walsh does not review or cite any of these studies, even those that were available for her information from previous research she claims to have scrutinized.

The Influence of Verbal Ability on Teacher Effectiveness

There is little disagreement about the fact that verbal ability and subject matter knowledge influence teacher effectiveness, although Walsh tries to set up a straw man by suggesting, inaccurately, that some researchers, including myself, have argued otherwise.
(See the section on "Misrepresentations of Research" below.) There are two areas of real disagreement, however. One is whether verbal ability alone is the only or best measure of teacher effectiveness. The other is how to evaluate the size of relative contributions of various kinds of knowledge to teac her effectiveness. As examples cited earlier illustrate, the literatur e on teacher characteristics and their effects on teacher performance has been a captive o f the measures most likely to be available in large data sets at any moment in time. While there are many studies evaluating the influences of teachers' standardized test scores, especially measures of verbal or general academic ability, because these v ariables have been readily available in large-scale data sets since the 1960s, data on teac hers' course-taking backgrounds or teacher education experiences have been included in large data sets only since the early 1990s. Thus, there are more studies finding influen ces of variables that have most often been measured. Finally, most of the studies that have included mea sures of verbal ability or content knowledge have not included measures of teacher edu cation or certification. In a recent review, Wayne and Youngs (in press) found five stud ies that observed relationships between measures of teachers' verbal or general aca demic ability and student
achievement and that met the standard of having controlled for students' socioeconomic status and prior achievement. Four of these studies employed data sets from the 1960s and 1970s, and none of the five included measures of teacher education or certification. Looking across studies in these different eras, in many cases the relative effect sizes of verbal ability measures are no larger than those of teacher education and certification measures in the studies that use these instead.

Walsh uses an article by Murnane (1983) written nearly 20 years ago to argue for the primacy of verbal ability as a correlate of teacher effectiveness. She states, illogically, that "to concede this relationship would mean acknowledging that formal teacher preparation is not as critical to student achievement as some would advocate" (p. 41). However, Murnane pointed out in his article that evidence about the influence of verbal ability was partly a function of the fact that teachers' standardized test scores were one of the few variables about teachers available in large-scale databases at that time, which did not include good measures of teacher education. In discussing the results on verbal ability, he diverges from Walsh's interpretation, stating:

Clearly one should not interpret these results as indicating that intellectual ability should be the sole criterion used in recruiting teachers or that formal teacher training cannot make a difference. In fact, the lack of evidence supporting formal preservice training as a source of competence may be to some extent a result of limitations in the available data. For example, all databases suitable for examining the correlates of teaching effectiveness as measured by student achievement gains pertain to a single school district.
Since there is less variation in training among teachers within a district than among teachers in the country at large, these databases do not permit the most powerful possible tests of the efficacy of alternative teacher training programs (p. 565).

Walsh tries to use another article by Greenwald, Hedges, and Laine (1996) as evidence that verbal ability is the only critical variable influencing teacher effectiveness, and misrepresents a communication she had with Larry Hedges, one of the study's authors, regarding the appropriate interpretation of his findings. Characterizing Greenwald, Hedges, and Laine's article as "a sound review of 60 studies," she then criticizes a direct reference to its findings in a report by the National Commission on Teaching and America's Future (Walsh, p. 17). Her criticism first alludes, incorrectly, to a chart in the Commission's report (which in fact referred to another study (Note 17)); then she criticizes the interpretation of the chart. The correct chart in the Commission's report (Figure 5, entitled "Effects of Educational Investments" in Darling-Hammond, 1997, p. 9) was reproduced directly from Greenwald, Hedges, and Laine's table 7, column 1 (p. 379), with the same variable labels and statistics as presented in the original source. It describes the size of increase in student achievement for every $500 spent on several different kinds of investments. Here is a reproduction of the table from Greenwald et al.'s study:

Table 7. The effect of $500 per student on achievement
                               Sample
Input Variable           Full Analysis   Publication bias robustness
Per pupil expenditure    0.15            0.15
Teacher education        0.22            0.20
Teacher experience       0.18            0.17
Teacher salary           0.16            0.08
Teacher/pupil ratio      0.04            0.04

Note: $500 is in 1993-94 dollars; all achievement outcomes are in standard deviation units.

In explaining the table, the study's authors noted that:

The magnitudes (of the effects) for teacher education and teacher experience are higher than, but of the same magnitude, as PPE (per pupil expenditures). That is, one would expect comparable and substantial increases in achievement if resources were targeted to selecting (or retaining) more educated or more experienced teachers. (p. 380)

The Commission used this finding, as Greenwald, Hedges, and Laine had done, as an indicator that investments in teacher education showed stronger influences on pupil achievement gains than investments in other resources, like reduced teacher/pupil ratios. We noted in discussing their overall study that the authors had found evidence of the influences of teacher ability and experience, along with teacher education. However, Walsh criticizes the Commission's two-sentence characterization of the research (which she calls a discussion "in considerable detail") for failing to note that Greenwald, Hedges and Laine found more studies supporting the influences of teacher verbal ability on achievement than what they labeled "teacher education" (measured in their study as masters degrees because this was the most widely used measure in large data sets). She suggests that Hedges disagrees with the Commission's characterization, a view that Hedges clarified was inaccurate when I spoke to him.
He indicated that Walsh had not revealed her interpretation of his findings when she contacted him, and wrote the following to explain his own view of the proper interpretation of his findings:

It is true that the relationship between teacher verbal ability and student achievement is relatively large and consistent across the few studies that have examined it. However this does not imply that investing in teacher ability (among possibly poorly qualified teachers) is a cost effective way to enhance student achievement. There are two
reasons. First, teacher ability (among qualified teachers) may be more expensive than other resources that could be purchased to improve achievement. That is, there could be a strong relationship but high cost. Second, and more important, the relations found in the studies Greenwald, Hedges, and Laine (1996) reviewed were studies of practicing teachers. There is no reason to expect that the same relation holds among those who are not part of the teaching workforce.

The point here, similar to that made by Murnane (above), is not that verbal ability is not important, but that the evidence does not prove it is the only important contributor or the most efficient way to achieve teacher effectiveness. In fact, most current certification systems combine tests of basic skills and general academic ability, subject matter, and teaching knowledge with evidence of successful supervised clinical experience and coursework focused on teaching knowledge and skills to help candidates assemble many sources of expertise in a more coherent way than would otherwise be the case.

In pursuit of her argument that only verbal ability makes a difference, Walsh seeks to discount other studies that have found strong influences of teacher certification test scores on teacher effectiveness as being relevant only to the measurement of verbal ability and irrelevant to the broader question of teacher certification. These studies are also misrepresented.

In her discussion of Schalock (1979) in the appendix (B13), Walsh seeks to dismiss his review's findings about the limited evidence regarding the relationships between teachers' measured intelligence and other indicators of effectiveness because the review is "old, old!!"
and because, she argues, "More recent research such as Summers and Wolfe, 1977; Ferguson, 1991; Ferguson & Womack, 1996 (sic); Murnane, 1983; Hanushek, 1971; Strauss and Sawyer 1986 suggest that intelligence (measured by SAT, verbal ability tests and college selectivity) are indeed substantially important."

Aside from the facts that two of these "more recent" studies pre-date the review she dismisses as "old, old!" and one (Murnane, 1983) is not a study at all, Walsh here cites two studies that she dismisses elsewhere for "aggregation bias" (Ferguson, 1991 and Strauss & Sawyer, 1986; see Walsh, p. 27) and another (Ferguson & Womack, 1993) that she dismisses without stating a reason (see discussion of Wilson et al., in Appendix B). (Note 18) Walsh's readers are referred to Appendix B for reviews of these issues, but the studies are not included there.

Walsh cites Ferguson (1991) for a number of her propositions, including the fact that teacher quality matters (p. 5), that teacher race does not matter (p. 6), and that verbal ability matters (p. 6). Later, when she wants to dismiss the study for its findings about teacher education and certification, she claims that the study suffers from aggregation bias, a concern I address in the next section on methodological issues. Ferguson's analysis of nearly 900 Texas school districts controlled for student background and district characteristics; he found that combined measures of teachers' expertise (scores on a state teacher licensing examination, master's degrees, and experience) accounted for more of the inter-district variation in students' reading and mathematics achievement (and achievement gains) in grades 1 through 11 than student socioeconomic status. An
additional, smaller contribution to student achievement was made by lower pupil-teacher ratios and smaller schools in the elementary grades. The effects were so strong, and the variations in teacher expertise so great, that after controlling for socioeconomic status, the large disparities in achievement between black and white students were almost entirely accounted for by differences in the qualifications of their teachers.

As I noted in an earlier review of this study (Darling-Hammond, 2000c), of the teacher qualifications variables, the strongest relationship was found for scores on the TECAT, a state licensing examination described by the test developer as a test that measures basic skills and professional knowledge. The Texas Education Agency's published outline of the test content shows that it seeks to measure verbal ability, logical thinking, research skills, and a set of items on professional knowledge. Walsh takes issue with this description of the test and argues that the study does not support the value of teacher certification because the test should be considered primarily a basic literacy test. In Walsh's view, this makes it irrelevant to the question of teacher certification, even though it is required for teachers to maintain their certification. She also argues that the relatively smaller influence of master's degrees in Ferguson's study (which accounted for about 5% of the explained variance) means that teacher education is unimportant, and she criticizes the fact that I discuss the three variables associated with teacher quality (TECAT scores, experience, and masters degrees) in combination, although this is also the way in which Ferguson discusses them at several points in his analysis. Walsh's arguments are illogical in several ways.
First, while it is true the TECAT measures basic skills, it also measures other academic abilities and professional knowledge, as confirmed by the test maker's documentation and administering agency's descriptions. There is no basis for making judgments contrary to the claims of the developers. In addition, the test would not exist at all if there were not a state certification system requiring it. Like all of the other variables one can evaluate in studies of this kind, the test scores are a rough proxy for many aspects of teacher capacity that may matter for their performance. In a regression equation of this sort, where one variable stands in for others for which data are not available, it undoubtedly captures the effects of other unmeasured factors. Even if it were true that the test was a weak measure of professional knowledge, this would not mean that professional knowledge is unimportant or that verbal ability is the only important variable for predicting teaching ability. Only a better measure of professional knowledge (coursework or a more in-depth test of teaching knowledge) would allow a test of this question. Finally, as Hedges notes above, since the Ferguson study was based on practicing teachers, its findings do not shed light on the relative effectiveness of non-teachers who might score differently on the tests.

Masters degrees and experience are other very partial measures of teacher knowledge and skill that show a modest effect in this study and a larger effect in Ferguson and Ladd's (1996) similar study in Alabama, which included a weaker test measure of pre-college general skills (the ACT) that is not designed to capture knowledge relevant to teaching. However, masters degrees are also a very crude proxy for teacher education, given the wide variability in the content of masters degrees pursued by teachers, many of which have been pointed at jobs outside of
teaching, such as administration, counseling, and measurement and evaluation. In fact, aside from MAT preparation programs in a small number of institutions and specialist programs for reading and special education, there were few masters degree programs for the study of teaching until the recent advent of 5-year teacher education programs and masters degrees developed around the National Board for Professional Teaching Standards that focus on content pedagogy. Thus, there is reason to expect that some masters degree studies would affect teaching ability, but not much reason to expect the effect of masters degrees as an undifferentiated variable to be uniform or large in the aggregate, a point I have made in earlier commentary (Darling-Hammond, 2000a). Goldhaber and Brewer (1998, 2000) have made the same point and have completed research that documents the greater influence of both bachelors and masters degrees in the content area taught (e.g., mathematics or mathematics education) as compared to undifferentiated degrees.

It makes more sense to consider these variables together as proxies for expertise than to treat them as mythically precise measures of totally unrelated constructs. As I have argued elsewhere, research on teaching suggests a view of expertise that includes general knowledge and ability, verbal ability, and subject matter knowledge as foundations; abilities to plan, organize, and implement complex tasks as additional factors; knowledge of teaching, learning, and children as critical for translating ideas into useful learning experiences; and experience as a basis for aggregating and applying knowledge in non-routine situations (Darling-Hammond, 2000a). David Berliner's studies of expertise in teaching, for example, include experience along with several other traits as a critical aspect of expertise (see, e.g., Berliner, 1986).
All of these factors combine to make teachers effective; furthermore, one cannot fully partial out the effects of one factor as opposed to another, as many are highly correlated.

Walsh also cites Strauss and Sawyer (1986) for her proposition that verbal ability matters (p. 6), but fails to report the study's actual findings and seems unconcerned that it might suffer from "aggregation bias." In a study of 145 school districts in North Carolina, these researchers found that teachers' average scores on the National Teacher Examinations (NTE) had a strong influence on average school district test performance. Although the authors did not specify which portion(s) of the NTE were used as measures, the Weighted Common Examinations Test (WCET) was required in North Carolina at that time. The WCET included separate subtests measuring general knowledge and professional knowledge about teaching. Walsh apparently wants to count this as a test of verbal ability, but does not acknowledge the Professional Knowledge Examination portion of the test.

The authors found that, taking into account per-capita income, student race, district capital assets, student plans to attend college, and pupil/teacher ratios, teachers' certification test scores had a strikingly large effect on students' failure rates on the state competency examinations: a 1% increase in teacher quality (as measured by NTE scores) was associated with a 3 to 5% decline in the percentage of students failing the exam. The authors' conclusion is similar to Ferguson's (1991):

Of the inputs which are potentially policy-controllable (teacher quality, teacher numbers via the pupil-teacher ratio and capital stock), our analysis indicates quite clearly that improving the quality of
teachers in the classroom will do more for students who are most educationally at risk, those prone to fail, than reducing the class size or improving the capital stock by any reasonable margin which would be available to policy makers (p. 47).

The same illogic holds in the dismissal of this study as in the previous one. In addition to questions about the content of tests used in various studies, the measures that appear in large data sets are always relatively crude proxies for the constructs under study, so it is impossible to know with great precision exactly what trait is being represented when a variable shows an effect. For example, scores on tests of academic ability like the SAT have generally been strongly correlated with scores on ETS subject matter and professional knowledge tests (Gitomer, Latham, and Zimek, 1999); in eras when higher degrees were less common (e.g., pre-1980), verbal ability scores were also strongly correlated with masters degrees. Where certification tests are in place, test scores correlate with certification status. And both certification status and masters degrees typically correlate with teacher experience, since most states require teachers to obtain certification in order to remain in the workforce and most teachers have traditionally secured masters degrees by taking courses over time while teaching. (This is changing to some extent where beginning teachers are being trained in post-baccalaureate or 5-year programs and sometimes enter the workforce with a masters degree.)

These interrelationships do not invalidate studies that have used one or more of these variables, but they are one reason why it is difficult to say with certainty which of these measures, or other unmeasured variables that are related to them, are associated with measured effects.
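The "aggregation bias" concern that arises with district-level studies such as Ferguson (1991) and Strauss and Sawyer (1986) can be made concrete with a small simulation. The sketch below uses entirely invented numbers, not data from any study discussed here; it simply shows why a correlation computed on district averages tends to be much larger than the underlying teacher-level correlation, since averaging strips out within-district noise.

```python
import random

random.seed(0)


def corr(xs, ys):
    """Pearson correlation, standard library only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


teacher_q, student_gain = [], []      # individual teacher-level observations
district_q, district_gain = [], []    # district-level averages of the same data

for _ in range(200):                  # 200 hypothetical districts
    shift = random.gauss(0, 1)        # shared district-level quality shift
    qs, gains = [], []
    for _ in range(30):               # 30 teachers per district
        q = shift + random.gauss(0, 1)      # teacher "quality" proxy score
        g = 0.2 * q + random.gauss(0, 1)    # weak true effect plus noise
        qs.append(q)
        gains.append(g)
    teacher_q.extend(qs)
    student_gain.extend(gains)
    district_q.append(sum(qs) / len(qs))
    district_gain.append(sum(gains) / len(gains))

r_teacher = corr(teacher_q, student_gain)
r_district = corr(district_q, district_gain)
print(f"teacher-level correlation:  {r_teacher:.2f}")   # modest
print(f"district-level correlation: {r_district:.2f}")  # considerably larger
```

In this toy example the district-level correlation substantially exceeds the teacher-level one even though the underlying effect is identical at both levels. This is why aggregate correlational studies point to relationships needing finer-grained follow-up rather than yielding precise individual-level effect estimates, and it applies equally to the aggregate findings Walsh accepts and to those she dismisses.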
The correlational studies that Walsh relies on almost exclusively do not establish causation; they point to possible relationships for further, more fine-grained exploration. However, Walsh often dismisses other large studies and the more fine-grained studies from consideration, at least when the findings do not suit her predilections.

Walsh also cites Ferguson & Womack (1993) for her proposition that verbal ability matters most, although the reason for this is unclear. This study of more than 250 candidates from a single teacher education program examined the influences on 13 dimensions of teaching performance of education and subject matter coursework, NTE subject matter test scores, and GPA in the student's major. The ratings of performance were based on detailed descriptors of teaching on 107 items evaluated by subject matter specialists and education supervisors. The authors found that the amount of education coursework completed by teachers explained more than four times as much of the variance in teacher performance as did measures of content knowledge (NTE specialty scores and GPA in the major). It is possible that Walsh cites this study as support for verbal ability influences because she has confused the NTE specialty tests of subject matter knowledge with other components of the NTE battery measuring general academic ability. In any event, the strength of the relationship was very small. Given her willingness to cite the study for a very weak finding about verbal ability, it is interesting that she does not cite it for its much stronger finding that education coursework mattered for teaching performance.
In her separately-published appendix, Walsh seeks to dismiss the Ferguson & Womack study because it is limited to a single institution (Note 19) and uses "supervisor's evaluations" as the measure of performance. As noted earlier, she is willing to use studies based on such measures for her own claims, despite her assertions that they should not be included. More important, in this study the ratings are not the global ratings from school principals that have often been found to be relatively low in reliability. They are lower-inference ratings based on a detailed protocol used by subject matter specialists and university supervisors, which are typically more reliable. In addition, the limitations on generalizability created by the use of a single institution are not fatal to consideration of the findings. They require that the study be considered in the context of other studies on similar questions using different samples. Such studies have been conducted.

In a similar study, which compared relative influences of different kinds of knowledge on 12 dimensions of teacher performance for more than 270 teachers, Guyton and Farokhi (1987) found consistent strong, positive relationships between teacher education coursework performance and teacher performance in the classroom as measured through a standardized observation instrument (the Georgia Teacher Performance Assessment Instrument), while relationships between classroom performance and subject matter test scores were positive but insignificant and relationships between classroom performance and basic academic skill scores were almost nonexistent.
(The two measures of basic academic skills were the Georgia Regents' test, a required examination for public university students, for which the researchers used reading and essay scores, and the state's Teacher Competency Test.)

The researchers noted that extensive reliability studies had been conducted to support the reliability of the TPAI performance measure, which was used statewide as an assessment for certification. Walsh eliminates this study from consideration because it is a single-institution study and refers the reader to Appendix B for her review (p. 25). In her appendix, Walsh criticizes the study for its reliance on supervisors' ratings, again failing to distinguish the research on principals' general teacher evaluation ratings from the research on the reliability of the TPAI as an observational instrument. She also apparently failed to read the study carefully, questioning why the numbers of teachers differ for various comparisons, not having noted the authors' explanation that all correlations depended upon the number of teachers for whom data on both variables were available (p. B11).

Whereas Walsh tries to paint an unambiguous picture about the value of such measures as verbal ability (suggesting, for example, that these scores be reported statewide as a primary measure of accountability) and the lack of value of teacher education, the real picture is decidedly more complex. Her evidence for her claims confuses measures of verbal ability with measures of professional knowledge and subject matter knowledge, and often includes studies that actually show influences of these other kinds of knowledge that are at least as strong as measures of verbal ability. The world is just not as simple as Walsh would like to make it appear.

Even strong advocates of the notion that academic ability matters are not willing to make the kinds of over-assertions Walsh urges.
For example, Hanushek (1992), whom Walsh cites repeatedly for her defense of verbal ability as a key measure,
concludes:

The closest thing to a consistent finding among the studies is that "smarter" teachers who perform well on verbal ability tests do better in the classroom. Even for that the evidence is not very strong (p. 116).

While it would be ridiculous to argue that verbal ability and subject matter knowledge do not matter for teaching, it is equally ridiculous to argue that knowledge of teaching and learning and the opportunity to learn to teach under the close supervision of a master teacher, through student teaching and other guided experiences, do not matter at all. The literature just does not support this reading or the policy implications that Walsh would draw.

The Academic Ability of Teachers who Lack Certification

Another argument made by those who would eliminate certification is that an unconstrained market would allow the recruitment of individuals with higher verbal or general academic ability who do not now enter teaching. While it is probable that some individuals would choose to teach if they did not have to prepare, it is not clear that most of these entrants would be more academically able, that they would be better teachers, or that they would stay long in teaching. It is also unlikely that, given current wages, individuals who are now preparing for much higher-paying careers in medicine, the law, engineering, and other professions that require much more onerous preparation and licensing processes would choose teaching as a career simply because they did not have to be certified.
Labor market contexts are relevant to this question. The qualifications of individuals preparing for teaching improved noticeably between the early 1980s and the early 1990s in terms of both academic attainment and ability measures, in part because of the changes in admissions requirements to teacher education adopted by states and universities, but also likely because of the substantial increases in real wages for teachers that occurred during the 1980s. Whereas prospective teachers were disproportionately drawn from the bottom quartile of college students in the early 1980s (Lanier & Little, 1986), both grades and test scores improved for teacher candidates by the 1990s.

The Recent College Graduates Survey, which tracks college graduates into the labor market, found that the grade point averages of newly qualified teachers in 1990 were higher than those of the average college graduate, with 51% earning a GPA of 3.25 or better as compared to 40% of all graduates (Grey et al., 1993). However, average GPAs were significantly lower for the 15% of college graduates entering teaching who were neither certified nor eligible for certification. Most of the uncertified entrants (57%) had grade point averages below 3.25, and 20% had GPAs below 2.25. Attrition was also high for the untrained candidates. By the time of the survey (one year later), only one-third of the uncertified entrants were still engaged in teaching as their primary jobs (Grey et al., 1993).

In addition, the Educational Testing Service found that among 270,000 test-takers in 1995 through 1997, college admissions test scores were highly correlated with initial teacher licensing scores (Praxis I and Praxis II), and the lowest average scores on both
kinds of tests were those held by individuals who entered teaching without preparation (Gitomer, Latham, and Zimek, 1999). (Walsh describes this 14% of the sample as an "error" in the study since the individuals had not enrolled in a teacher education program; she misunderstands the fact that these Praxis test-takers were the entrants to teaching who used emergency or alternative routes (Note 20).) Prepared teachers scored much higher than unprepared teachers.

While students who prepare to enter fields other than teaching have higher average test scores on measures like the SAT than do those preparing to enter elementary school teaching, there is no significant difference for prospective secondary teachers, most of whom earn a disciplinary degree along with their teaching certificate. The narrowing of this gap between prospective teachers and others is likely a function of the more rigorous admissions requirements for teacher education enacted in most states and the growth in wages between the early 1980s and the mid-1990s.

Finally, the study found that graduates of NCATE-accredited colleges of education passed the Praxis subject matter tests for teacher licensing at a significantly higher rate than did graduates of unaccredited programs, boosting their chances of passing the examination by nearly 10 percent (Gitomer, Latham, and Zimek, 1999). Walsh suggests that this higher Praxis pass rate might simply reflect the fact that NCATE schools could be located in states with low cutoff scores. However, additional analyses of the data by ETS and another independent study (Note 21) indicate that this is not the case. A more likely explanation is that NCATE's requirements that colleges demonstrate how they screen applicants for general ability and that they ensure strong content backgrounds translate into somewhat greater attention to these matters in institutions that are accredited.
These data suggest that standards may increase the general as well as specialized qualifications of prospective teachers. They do not suggest that removal of certification requirements brings higher ability individuals into teaching or keeps them there.

It is important to recognize that labor market incentives operate among individuals actually entering teaching. For example, several studies of alternative certification programs found that the academic records of recruits varied substantially by teaching field, with alternatively-certified candidates in high-demand shortage fields, such as mathematics and science, having much poorer academic records than candidates in other fields and than candidates from traditional teacher education programs in those same fields (see Natriello & Zumwalt, 1992, re: New Jersey; Lutz and Hutton, 1989, re: Dallas; Stoddart, 1992, re: Los Angeles). It is unlikely that eliminating requirements for training would increase the career attractions to teaching for academically able candidates as much as increased wages would. Meanwhile, eliminating training requirements could result in a less well-qualified teaching force, especially if the elimination of certification standards not only reduced the knowledge of entrants but also reduced pressures for competitive wages.

The Private School Argument

Finally, a claim sometimes made by opponents of teacher certification, including Walsh, is that private schools are more effective than public schools, and that this is because of, or at least is not impeded by, the fact that private school teachers are not certified. There are two major problems with the private school "proof": First, there are
conflicting findings about the relative effectiveness of public and private schools, with credible evidence on both sides of the question. Second, most private school teachers are certified, and an even larger majority have specific preparation for teaching, even when they have not sought certification.

On the effectiveness of private schools, Walsh cites Coleman, Hoffer, & Kilgore (1982), who examined data from the first wave of High School and Beyond surveys, conducted in 1980, and found evidence of higher performance for comparable students in Catholic and other private schools as compared to public schools. The researchers attributed their findings primarily to differences in student behavior across school sectors, measured by variables like lower rates of absenteeism, cutting class, and fighting, along with factors like more time spent on homework and higher individual student attendance. They also found that achievement was actually higher for comparable students who were in public schools that had these characteristics. Subsequent studies have produced findings that favor both public and private schools after controlling for student characteristics and school organization (Bryk & Lee, 1992; Lee & Bryk, 1988; Lee, Dedrick, & Smith, 1991). Most studies have pointed to variables like school and class size, school organization, and curriculum differentiation as critical variables in determining both public and private school effectiveness. When these factors are controlled, public school students often do as well or better than private school students in schools with similar features.

Furthermore, differences in the preparation of public and private school personnel are not as large as many people assume.
More than 30 states certify private school personnel (Feistritzer, 1984), and, when Coleman did his analysis, more than 85% of private and parochial school teachers were certified, as compared to about 95% of public school teachers (NCES, 1985). This has changed only slightly in the years since. Although certification is not required for private school teachers in all states, only 34% of private school teachers in 1993-94 (the most recent year for which national data are available) were not certified in their primary assignment field. Some of these teachers were certified in fields other than their primary assignment field. Many undertook teacher preparation, even though they did not apply for or maintain a state license or certificate. In 1993-94, public and private school teachers were almost equally likely to have received an undergraduate degree in education (68.9% for public vs. 61.5% for private elementary teachers and 19.8% for public vs. 19.3% for private secondary teachers) (NCES, 1997, p. 25). The education degree is only a partial indicator of preparation, since the education degree has waned as certification increasingly requires a content degree with an education minor or credential. The percentage of 1992-93 bachelor's degree recipients who had taken education courses was 87.1% for public school teachers and 71.6% for private school teachers, (Note 22) and the average number of education credits earned was 37.4 for public school teachers as compared to 35.2 for private school teachers (NCES, 1997, table A-51). (Note 23)

Public school teachers were also more likely to have taken subject matter degrees in their teaching fields than private school teachers. For example, 66% of public school mathematics teachers held a major or minor in the field, as compared to 58% of those in private schools (Goldhaber & Brewer, 2000, reported a similar finding). The same differentials hold in other fields to somewhat lesser extents.
The greater content preparation of public school teachers is likely a function of the fact that certification has required increasing amounts of subject matter coursework in the field to be taught, thus
leveraging stronger content preparation for public school teachers in states where private school teachers are not required to hold certification. Almost all states now require certified teachers to hold at least a minor in the field to be taught, and many require a major in the field.

Finally, even if it were true that untrained teachers were unusually effective in some private schools for students of comparable initial achievement levels (a point about which there is no published evidence), it would be a large leap of faith to assume that such teachers would be equally effective in schools where many students have much greater educational needs and students are not preselected for their academic ability, their positive school attendance and behavior, and their parents' income and interest in education. There are very large differences in the populations of students attending public and private schools in the United States, (Note 24) which have important implications for teachers' knowledge and skills. It is one thing for a teacher to offer information in whatever manner comes instinctively to students who are academically able, have learned to learn independently, and are well-supported at home by educated parents, tutors, and other supports for their learning. It is quite another thing to teach by the seat of the pants when students do not have these learning supports at home and may present a variety of language and learning differences. Being effective with students who need substantial support for their learning requires greater diagnostic ability and knowledge of how to present information and structure experiences in ways that help them become successful.
Systematic knowledge about how to organize curriculum and reach students with special learning needs is most needed in the schools that serve most students with these needs.

Other Misrepresentations of Research Findings

The remainder of Walsh's review continues the kind of misrepresentations documented above, appearing to rely on the belief that readers will read its accusations but will not read or understand the research itself. Although she prepared a draft appendix with 192 studies that sought to critique many of the studies she dismisses (often inaccurately), it was not published with the report. Appendix B, to which the reader is repeatedly referred for reviews, includes only 14 studies. Throughout the report, the reader is referred to this appendix for critiques of studies that do not appear there. The selection of research included in the published version of the report's appendix is very strange. Many strong studies, some of the key citations in the field, are omitted, along with the flawed rationales for dismissing them that now appear in a separately-published appendix. Some much less important and less well-designed studies are included, with the apparent goal of critiquing their size or designs as though they represented the dozens of studies not mentioned or excluded. Thus, the paper does not include information regarding most of the studies Walsh claims she has reviewed and does not provide evidence for her claim that, of all the studies cited in support of teacher education and certification, "none bear up to scrutiny."

Here are just a few additional examples of major misrepresentations.

1. Goldhaber & Brewer (2000). In a string of citations Walsh lists a study by Goldhaber and Brewer (2000) for its finding that teachers with a degree in their subject matter are more effective than those without such degrees. This study fits all of Walsh's desiderata: It is large (using a data set that includes more than 3,000
teachers), recent, and published in a peer-reviewed journal. However, Walsh does not cite the authors' findings that certification status has an even greater influence on teachers' effectiveness than a degree in the subject area. Later, Walsh states, "...most research indicates that the most distinct problem in schools serving poor children is the number of teachers who are teaching subjects in which they have no expertise (Goldhaber & Brewer, 2000; ... Hawk, Coble, & Swanson, 1985). These studies do not show that certification status as an isolated variable, has any significant effect on the achievement level of children who are poor or minority." (p. A6). Neither study examined the subject matter expertise of teachers in low-income schools, and both found strong effects of certification on student achievement. In fact, Goldhaber and Brewer wrote:

Turning to an examination of the effect of teacher certification, we find that the type (standard, emergency, etc.) of certification a teacher holds is an important determinant of student outcomes. In mathematics, we find the students of teachers who are either not certified in their subject (in these data we cannot distinguish between no certification and certification out of subject area) or hold a private school certification do less well than students whose teachers hold a standard, probationary, or emergency certification in math. Roughly speaking, having a teacher with a standard certification in mathematics rather than a private school certification or a certification out of subject results in at least a 1.3 point increase in the mathematics test. This is equivalent to about 10% of the standard deviation on the 12th grade test, a little more than the impact of having a teacher with a BA and MA in mathematics. Though the effects are not as strong in magnitude or statistical significance, the pattern of results in science mimics that in mathematics.
Teachers who hold private school certification or are not certified in their subject area have a negative (though not statistically significant) impact on science test scores (p. 139).

The authors note that the effect size of "having a teacher with a standard certification in mathematics rather than a private school certification or a certification out of subject" is "a little more than the impact of having a teacher with a BA and MA in mathematics." Of course, the certification itself includes requirements for subject matter knowledge as well as for knowledge of teaching and learning. In fact, certified mathematics teachers are more likely to have a degree in the field than non-certified teachers. The fact that the study found a significant effect of certification status even after controlling for whether teachers had a degree in their field and after controlling for experience suggests that whatever is represented by the certification variable has an influence above and beyond the influence of content knowledge and classroom experience.

2. Druva & Anderson (1983). This meta-analysis of 65 studies examined relationships between science teacher characteristics and teaching behaviors, student achievement in science, or both, using meta-analytic techniques to translate results from a wide range of studies into Pearson correlation coefficients in order to compare them. It found that ratings of teaching effectiveness by principals and students were most strongly correlated with the number of
education courses taken, followed by student teaching grades and teaching experience. On a teacher "effectiveness" scale composed of many teaching behaviors associated in process-product research with student achievement, both science training (examined in 28 studies) and education coursework and performance (examined in 47 studies) were related to effectiveness, as were teacher attitudes, values, and temperament. Associations with cognitive and affective student outcome measures were found for both science training and, to a somewhat smaller extent, for education coursework and performance, based on 34 studies for each of these sets of variables. The authors concluded that:

Student outcomes are positively associated with the preparation of the teacher, especially science training, but also preparation in education and academic work generally.... While the hiring official seeking a new science teacher certainly must look beyond information on the teacher characteristics considered in this study, information on some of these characteristics certainly is worthy of inclusion in the decision-making process.... In general, the hiring official would be well advised to employ teachers with thorough preparation in both professional education and the sciences being taught. There is a relationship between teacher preparation programs and what their graduates do as teachers (p. 477).

Walsh seeks to dismiss the results of this study in part by misreporting them. She states the study "did not show the benefit of education coursework on student achievement" (p. 19), and that education coursework is not significantly related to student outcomes, although significance statistics were not reported in the study.
This assertion is not supported by the authors' reported findings that both science coursework and education training showed a relationship to teacher effectiveness as defined by student outcomes (in both cases, though to a greater extent for science coursework) (Note 25) as well as teaching behaviors and ratings (reported in the case of education coursework only).

3. Darling-Hammond (2000). Walsh criticizes and misquotes a study that this author conducted, which examined both the literature on teacher characteristics and student achievement and conducted a regression analysis of state-level data from the National Assessment of Educational Progress and the Schools and Staffing Surveys (Darling-Hammond, 2000). The study found that measures of teacher preparation and certification were by far the strongest correlates of student achievement in reading and mathematics, both before and after controlling for student poverty and language status. The conclusion discussed a number of potential reasons for these large effects:

The strength of the "well-qualified teacher" variable may be partly due to the fact that it is a proxy for both strong disciplinary knowledge (a major in the field taught) and substantial knowledge of education (full certification). If the two kinds of knowledge are interdependent as suggested in much of the literature, it makes sense that this variable would be more powerful than either subject matter knowledge or teaching knowledge alone. It is also possible that this variable captures other features of the state policy environment, including general investments in, and commitment to education, as
well as aspects of the regulatory system for education, such as the extent to which standards are rigorous and the extent to which they are enforced.... Finally, there may be unmeasured correlations between the extent to which states enact and enforce high standards for teachers and the extent to which they have enacted other policies that are supportive of public schools. Although it does not appear that teaching standards are strongly related to investments regarding class sizes or to overall education spending, it is possible that there are other factors influencing student achievement which generally co-exist with teacher quality and which were unmeasured in these estimates.

Walsh seeks to invalidate these findings by raising two complaints, one of which is inaccurate and the other of which is a matter of legitimate discussion in the field. She states, incorrectly, that, "Darling-Hammond did not control for class size differences among the states" (p. 26). State-level differences in average class size were in fact included in the analyses, and the variable had a very small, insignificant effect. Walsh also complains that the state-level analyses suffer from aggregation bias because they used average student test scores, a critique she also levels against other studies she cited approvingly for their findings in other parts of the paper (see e.g. Ferguson, 1991; Strauss & Sawyer, 1986; Coleman, 1966). (Note 26) There are legitimate debates in the field on this point, and I addressed this question in the study itself, as I do again below in the section on "Methodological Issues." For purposes of tracking broad policy trends at the state level, analyses of state-level data offer one useful lens. This perspective was shared by the nine reviewers who recommended this paper's publication in a peer-reviewed journal and a peer-reviewed research report series.
Finally, the literature review contained in this study is repeatedly mischaracterized throughout Walsh's paper and her appendix as minimizing or ignoring the influences of verbal ability and subject matter preparation for teaching. On the relationship between academic ability and teacher effectiveness, Walsh states:

Darling-Hammond (1999, p. 6) claims there is "little or no relationship between teachers' measured intelligence and their students' achievement." She supports this statement with two studies by Soar, Medley and Cocker (sic) (1983) and Schalock (1979). These two studies simply recycle research from the 1940s and earlier, none of which is retrievable for scrutiny (p. 21).

Walsh misrepresents this analysis by quoting a portion of a sentence out of context and citing the reviews that summarized research on IQ tests as an example of the inappropriate use of older studies. Here is what I actually said:

While studies as long ago as the 1940s have found positive correlations between teaching performance and measures of teachers' intelligence (usually measured by IQ) or general academic ability (Hellfritsch, 1945; LaDuke, 1945; Rostker, 1945; Skinner, 1947), most relationships are small and statistically insignificant. Two
reviews of such studies concluded that there is little or no relationship between teachers' measured intelligence and their students' achievement (Schalock, 1979; Soar, Medley, & Coker, 1983). Explanations for the lack of strong relationship between measures of IQ and teacher effectiveness have included the lack of variability among teachers in this measure and its tenuous relationship to actual performance (Vernon, 1965; Murnane, 1985). However, other studies have suggested that teachers' verbal ability is related to student achievement (e.g., Bowles & Levin, 1968; Coleman et al., 1966; Hanushek, 1971), and that this relationship may be differentially strong for teachers of different types of students (Summers & Wolfe, 1975). Verbal ability, it is hypothesized, may be a more sensitive measure of teachers' abilities to convey ideas in clear and convincing ways (Murnane, 1985)."

Walsh's attempt to distort the text misses two critical points: First, studies of the relationship between IQ and teaching effectiveness (which I noted had found positive though small relationships) were primarily conducted before the 1960s, because IQ tests came into question as measures of ability at that time and were no longer often available in large data sets thereafter. Measures of verbal ability became more popular and widely available in data sets in the 1960s and following, and showed somewhat stronger relationships with teacher outcomes, as I reported in my summary. The studies I cited include many of the same ones that Walsh cites for this proposition, a point she does not acknowledge as she tries to suggest, inaccurately, that I minimize the value of measures of academic ability for teachers. (Note 27)

On the topic of subject matter knowledge, Walsh also suggests on numerous occasions that I seek to minimize the importance of teachers' knowledge of content. She offers my work as an example of her sweeping statement that "certification advocates ...
offer evidence that knowledge of subject matter has little effect on teaching performance" (p. 19). Here is what I actually said in my brief summary of the literature, offering an analysis that clearly acknowledges the importance of subject matter knowledge for teaching and interprets the mixed results of studies in terms of what teachers may need to know in order to teach different things.

Byrne (1983) summarized the results of thirty studies relating teachers' subject matter knowledge to student achievement. The teacher knowledge measures were either a subject knowledge test (standardized or researcher-constructed) or the number of college courses taken within the subject area. The results of these studies were mixed, with 17 showing a positive relationship and 14 showing no relationship. However, many of the "no relationship" studies, Byrne noted, had so little variability in the teacher knowledge measure that insignificant findings were almost inevitable. Ashton and Crocker (1987) found only 5 of 14 studies they reviewed exhibited a positive relationship between measures of subject matter knowledge and teacher performance.

It may be that these results are mixed because subject matter
knowledge is a positive influence up to some level of basic competence in the subject but is less important thereafter. For example, a controlled study of middle school mathematics teachers, matched by years of experience and school setting, found that students of fully certified mathematics teachers experienced significantly larger gains in achievement than those taught by teachers not certified in mathematics. The differences in student gains were greater for algebra classes than general mathematics (Hawk, Coble, & Swanson, 1985). However, Begle and Geeslin (1972) found in a review of mathematics teaching that the absolute number of course credits in mathematics was not linearly related to teacher performance. It makes sense that knowledge of the material to be taught is essential to good teaching, but also that returns to subject matter expertise would grow smaller beyond some minimal essential level which exceeds the demands of the curriculum being taught. This interpretation is supported by Monk's (1994) more recent study of mathematics and science achievement. Using data on 2,829 students from the Longitudinal Study of American Youth, Monk (1994) found that teachers' content preparation, as measured by coursework in the subject field, is positively related to student achievement in mathematics and science, but that the relationship is curvilinear, with diminishing returns to student achievement of teachers' subject matter courses above a threshold level (e.g., five courses in mathematics).

It may also be that the measure of subject matter knowledge makes a difference in the findings. Measures of course-taking in a subject area have more frequently been found to be related to teacher performance than have scores on tests of subject matter knowledge. This might be because tests necessarily capture a narrower slice of any domain.
Furthermore, in the United States, most teacher tests have used multiple-choice measures that are not very useful for assessing teachers' ability to analyze and apply knowledge. More authentic measures may capture more of the influence of subject matter knowledge on student learning. For example, a test of French language teachers' speaking skill was found to have a significant correlation with students' achievement in speaking and listening (Carroll, 1975).

It seems logical that teachers' abilities to handle the complex tasks of teaching for higher-level learning are likely to be associated, to varying extents, with each of the variables reviewed above: verbal ability, adaptability and creativity, subject matter knowledge, understanding of teaching and learning, specific teaching skills, and experience in the classroom, as well as interactions among these variables. In addition, considerations of fit between the teaching assignment and the teacher's knowledge and experience are likely to influence teachers' effectiveness (Little, 1999), as are conditions that support teachers' individual teaching and the additive effect of
teaching across classrooms, such as class sizes and pupil loads, planning time, opportunities to plan and problem solve with colleagues, and curricular supports including appropriate materials and equipment (Darling-Hammond, 1997).

Finally, Walsh suggests in several places that I have characterized the research as indicating a "negative relationship between student outcomes and the NTE subject matter tests" (p. 19). In fact, I stated that "Studies of teachers' scores on the subject matter tests of the National Teacher Examinations (NTE) have found no consistent relationship between this measure of subject matter knowledge and teacher performance as measured by student outcomes or supervisory ratings. Most studies show small, statistically insignificant relationships, both positive and negative (Andrews, Blackmon & Mackey, 1980; Ayers & Qualls, 1979; Haney, Madaus, & Kreitzer, 1986; Quirk, Witten, & Weinberg, 1973; Summers & Wolfe, 1975)." (Note 28) Walsh misrepresents this statement numerous times.

Methodological Issues

One of the ways that Walsh seeks to make much of the research on teacher education disappear is by suggesting that it is inappropriate to cite studies that are older, smaller, use measures of performance other than student achievement scores, are aggregated at a level above the classroom, or are published in venues other than peer-reviewed journals. As noted above, Walsh uses a double standard in selecting research to reject when it finds evidence of the influence of teacher education on student learning and research to cite for her own purposes.
While she discounts the findings of many dissertation studies and technical reports because they were not published in peer-reviewed journals, in making her own claims she cites at least 15 studies that were not published in peer-reviewed journals or technical report series and at least 20 that were published before 1980, including some that she elsewhere dismissed from consideration because she did not like specific findings. For findings she likes, she also cites several that use supervisory ratings as the only measures of teacher effectiveness and others that she later dismisses for aggregation bias. Sometimes she represents the studies' findings accurately; sometimes not. Many of the studies she cites for various propositions do not contain the findings for which they are cited, or, in several cases, any data on the question at all.

I would not argue, as Walsh does, that none of these studies have value as contributions to the literature. However, the double standard she applies in using studies of different eras, sizes, aggregation levels, dependent variables, and publication statuses perhaps proves the point that to evaluate the weight of evidence in a field it is often necessary to triangulate findings that used different methods, over different time periods, and at different levels of aggregation to see where there is an accrual of evidence over time and across methods. Of course it is important to do this with appropriate attention to the methodological strengths and weaknesses of various studies and lines of research. Unfortunately, Walsh often does this poorly, appearing to misunderstand critical research design issues. Below, I discuss the issues of study size and design, level of aggregation, choice of dependent variable (including the use of supervisory ratings of teacher performance), age, and venue of publication.
Study Size and Design

In one part of her review, Walsh bemoans the lack of experimental research. She then rejects the results of studies with experimental designs because of their smaller sample sizes and cites almost exclusively non-experimental correlational studies, which, though larger, lack direct controls for the variables of interest and must rely on statistical manipulations of data to account, indirectly, for these other influences. This kind of correlational research is, of course, legitimate for staking out broad possibilities in relationships among variables, but it has its own limitations. Many of the more carefully controlled experimental designs can in fact offer more solid evidence about effects, because the "treatment" they are studying is known and the samples can be better controlled than is true for large correlational studies that use proxies and statistical controls rather than direct observation of the phenomena of interest. Medical research, for example, typically uses small-sample experimental research as the basis for establishing the possibilities of effects, while using large correlational studies as rough indicators of possible relationships that then require further examination. Single case studies of clinical findings are part of the medical research base, along with small experiments (sometimes carefully controlled and sometimes not), larger clinical trials, and correlational studies looking at broad tendencies.

The usefulness of small experimental and quasi-experimental studies, including those that Walsh cites and sometimes dismisses (and other times embraces, depending on her reading of and agreement with the findings), is not in the definitiveness of their individual findings but in their contribution to a larger body of work from which a preponderance of evidence can be examined.
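The methodological point above, that random assignment controls confounders directly while a naive correlational comparison does not, can be illustrated with a toy simulation. Everything here is hypothetical: the variable names, the "training" treatment, the effect size of 5 points, and the confounding "school advantage" are invented for illustration and come from none of the studies under discussion.

```python
import random

random.seed(0)

# Illustrative model: a binary "training" treatment raises an outcome by
# 5 points, but in observational data trained teachers also tend to work
# in more advantaged schools (an unmeasured confounder).
TRUE_EFFECT = 5.0

def outcome(trained, school_advantage):
    noise = random.gauss(0, 3)
    return 50 + TRUE_EFFECT * trained + school_advantage + noise

def experiment(n=200):
    """Small randomized design: treatment assigned by coin flip,
    so school advantage is balanced across groups on average."""
    treated, control = [], []
    for _ in range(n):
        adv = random.gauss(0, 4)
        if random.random() < 0.5:
            treated.append(outcome(1, adv))
        else:
            control.append(outcome(0, adv))
    return sum(treated) / len(treated) - sum(control) / len(control)

def observational(n=2000):
    """Large correlational comparison: training is correlated with
    advantage, so the naive group difference mixes the training
    effect with the confounder."""
    treated, control = [], []
    for _ in range(n):
        adv = random.gauss(0, 4)
        trained = 1 if (adv + random.gauss(0, 2)) > 0 else 0
        (treated if trained else control).append(outcome(trained, adv))
    return sum(treated) / len(treated) - sum(control) / len(control)

print("randomized estimate:   ", round(experiment(), 1))    # near 5
print("observational estimate:", round(observational(), 1)) # inflated above 5
```

Despite its far larger sample, the observational comparison here is further from the true effect than the small experiment, which is the sense in which small experimental designs can offer more solid evidence about effects than larger correlational studies relying on proxies and statistical controls.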
Although medical researchers generally consider correlational studies to comprise a weaker source of evidence about effects than smaller experimental designs, they recognize that mixed methods of research serve complementary purposes.

Of course, one of the reasons correlational studies must be interpreted with caution is that there is always the question of what direction the correlations may point, sometimes referred to as "reverse causation." There is also the problem that variables in these studies are frequently crude proxies for the actual measures of interest and may either fail to capture the intended construct or in fact be reflecting the influences of other unmeasured variables. As noted above, many of the variables that can arguably be said to reflect constructs of interest are highly correlated with one another. Furthermore, many of the variables of interest are not well-represented in large data sets. Thus it is critical to represent in any review of research a range of studies that can tease apart the different relationships of interest with a range of measures.

Level of Aggregation

Another criticism used to dismiss some studies' findings as irrelevant is the charge of "aggregation bias." For example, Walsh dismisses studies that include favorable findings about the value of teacher education in which data are aggregated at the level of the school or district, although she herself cites similarly aggregated data for her conclusion that verbal ability matters most (e.g. Coleman, 1966; Ferguson, 1991; Strauss & Sawyer, 1986). More important, this critique misses a crucial point about how research results accrue and are triangulated to look at possible relationships among
conditions and outcomes. Just as individual-level data about health practices and outcomes inform medical research, for example, so do highly aggregated data at the level of cities, counties, and even countries when researchers seek to understand why, for example, women in some nations have low levels of breast cancer or men have low levels of heart disease. Studies at different levels of aggregation provide different kinds of insights about the phenomena under study. In building a corpus of research on any topic, a wide array of research strategies and levels of analysis are used.

It is true that the size of measured effects of different variables can vary at different levels of the system; however, it is not always clear in which way the bias will operate. Often, the general direction of the results holds at different levels of the system, even if effect sizes differ. For example, in their Alabama study, Ferguson and Ladd (1996) found the effects on student achievement of teachers' test scores, masters degrees, and experience held at both the district and school levels in terms of both significance and directionality. There are pros and cons of both kinds of analyses. On the one hand, disaggregated data can exhibit greater measurement error. On the other hand, some analysts have argued that omitted variables may bias the coefficients of school input variables upward when data are aggregated to the district or state level (Hanushek, Rivkin, & Taylor, 1995). However, this generalization does not always prove true. For example, although Summers and Wolfe (1975) found that selectivity ratings of each teacher's undergraduate institution were important in explaining 6th grade students' achievement when examined at the individual teacher level, this relationship disappeared when they aggregated the college ratings and other school inputs into school-level averages.
This contradicts the assumption about the usual direction of aggregation bias.

Of course, omitted variables can bias results at any level of the system. Sometimes, especially when the goal of a study is to evaluate broad trends and policy influences, it is important to have data aggregated and analyzed at multiple levels. For interpreting the weight of evidence on a particular issue, the most important question is whether consistent results are found at different levels of aggregation. Just as Walsh cites highly aggregated data as well as less aggregated data on the question of the influences of verbal ability, so the studies examined here reveal influences of measures of teacher education and certification on student achievement at the levels of state (Darling-Hammond, 2000c), school district (Ferguson, 1991; Ferguson & Ladd, 1996; Strauss & Sawyer, 1986), school (Ferguson & Ladd, 1996; Fetler, 1999), and individual teacher (Goldhaber & Brewer, 2000; Hawk, Coble, & Swanson, 1985; Monk, 1994).

Measures for Assessing Teacher Performance

Walsh argues that studies using various ratings of teacher performance other than student achievement test scores should be discounted, noting that supervisory ratings "can be too subjective to measure teacher quality accurately" (p. 20). As support for this, she cites in her appendix a review of research on teacher evaluation I conducted with colleagues at the RAND Corporation (Darling-Hammond, Wise, & Pease, 1983). While her statement of why I cited the review in another article is completely inaccurate, (Note 29) she is correct when she notes that teacher evaluations by principals and other school-based supervisors have been found to lack strong reliability. Our study of evaluation practices noted that this has been a function of principals' lack of time, inadequate expertise for evaluating all teaching situations, insufficient evaluation
training, and inappropriate instrumentation. However, this critique does not extend to ratings of performance that are based on structured observations conducted by trained, expert raters, using instruments that have been developed and demonstrated to have high reliability. Some of the studies Walsh dismisses use systematic rating systems by trained observers (e.g., Ferguson & Womack, 1993; Guyton & Farokhi, 1987). The extent to which ratings of performance should be considered or discounted depends on who conducts the rating process, with what training and instrumentation, under what conditions, and with what efforts to enhance reliability.

Age of Studies

The age of studies is also a legitimate but not determinative issue. Studies do not become invalid merely because they are old. While Walsh argues that many older studies using large data sets lacked certain kinds of variables as controls, this does not stop her from citing many of these studies for propositions with which she agrees. More important, the designs of some older studies are at least as strong as those of some of the more recent studies, and weak studies exist now as then. There is not a strong relationship between study vintage and quality. It is certainly true that teacher education programs and certification requirements have changed over time, so that inferences from studies conducted in one era do not automatically generalize to others; the extent to which one can learn something of use from a study depends on how well the variables are defined and on a knowledge of their relevance to more recent conditions, as well as on the strengths and limits of its methodology.

Vintage does influence the prevalence of studies of certain kinds. With respect to studies of the effects of teacher education and certification, a large number of studies were conducted in the high-demand era of the 1960s and '70s, when there was great variability in entry pathways and much interest in the topic.
It is also true that federal funding for educational research was substantially larger before 1980 than it was during the severe budget cuts of that decade. In addition, in times of relatively low demand, like most of the 1980s, virtually all teachers were certified and there was too little variability to find effects of this variable in large-scale studies. Few studies were concerned with these issues, and few data sets had measures of teacher education variables. Interest and data on this topic have just begun to return in the 1990s. Those who are interested in the extent to which, and the ways in which, different kinds of preparation may matter for teacher performance and student learning can and should be informed by earlier studies where they are applicable to the questions under study.

Publication Venue

Although Walsh is incorrect in her statement that dissertations are not retrievable (there are library systems for doing so, if sometimes less than convenient), it is legitimate to suggest that the kind of review they have received is often more variable, and may be less stringent depending on the university and department, than for many peer-reviewed journals. There are certainly some universities whose dissertation review process is more rigorous than some journals', but the reverse is also certainly true. The same variability in review stringency is true for conference papers and technical reports. However, Walsh herself cites a substantial number of unreviewed papers in support of various positions she takes. There are different schools of thought about how to treat these papers in reviews. Some would argue, as does Veenman (1984), a reviewer cited by Walsh, that
the use of all identified studies is justified for a review that seeks to delineate global trends where large numbers of findings are similar (p. 166). Others would argue that papers that have not been published with peer review should be used only when the review includes a critique of each study's methods. Still others might argue, as Walsh does (at least rhetorically, if not in practice), that such studies should be excluded from consideration. I accept the point that it is a useful common ground to rely on research published in peer-reviewed journals, and I restrict the analysis in this paper to those studies. Even with this criterion, there is substantial evidence to be weighed and discussed.

Who is Affected by this Debate?

The critical issue here is not the protection of researchers' reputations or the turf of schools of education but the protection of students, especially low-income students and students of color, who are disproportionately taught by unprepared and uncertified teachers.
As Walsh's paper shows in her references to data on the disparities in access to qualified teachers for students in Baltimore, the children most affected by these arguments are economically and educationally disadvantaged children in central cities who are substantially abandoned by the funding and hiring protections that should operate to provide a foundation for their education. These are the students whose education is most undermined by their lack of access to teachers who have the knowledge and skills to ensure that they learn to the new high standards the society and the state demand.

What the statistics on the lack of certified teachers actually mean on the ground is that many of Baltimore's most educationally vulnerable children, most of them African American, are taught in their elementary school years by teachers who have had no training in how to teach them to read, much less to develop the other basic and higher-order skills they must have to succeed in school and life. When they fail to learn, they begin the tortuous process of educational failure that will end for many of them in dropping out or being unable to pass the state tests that would grant them a diploma. This then launches a life spent either in a marginal part of the economy that barely yields subsistence wages or, as is true for more than 50% of high school dropouts, in the inability to gain any job at all. In today's economy, these young people are fated to become part of the growing criminal justice system, as incarceration is increasingly linked to inadequate education. More than half of the growing number of inmates in the United States are functionally illiterate and cannot gain access to today's labor market.
This is not unrelated to the fact that so many low-income students have been taught by teachers who never learned how to teach them to read.

Illogical Policy Conclusions

The disparities in access to qualified teachers in Maryland are a function of a state school finance system that has underfunded Baltimore's schools for decades, along with inadequate incentives (for example, service scholarships, forgivable loans, and recruitment attractions like salaries and housing assistance) to encourage individuals to acquire strong training and then teach in high-need fields and locations. The Abell Foundation report does not argue for more equitable funding for the schools that serve Maryland's poor and minority students or for stronger incentives to attract well-prepared teachers to these schools. In fact, the report cites approvingly a paper prepared to stave
off an equity lawsuit in Maryland (Hanushek, 1996b), which argues against district investments in smaller class sizes or higher salaries in Baltimore, asserting that "Baltimore City would not benefit from additional resources as much as it could benefit by better school management." (Note 30) The Abell Foundation report argues that the enormous disparities in resources and qualified teachers between Baltimore and other districts are not a problem because teacher certification does not mean anything, and that, in fact, the solution is to do away with certification altogether.

In suggesting that devolving all hiring decisions to principals is the answer to the problem of recruitment for the schools serving minority and poor children, Walsh ignores the fact that, even if all principals had infinite information at their disposal about the likely effectiveness of teachers and made wise, fully informed choices (two assumptions that have been challenged by some research on teacher selection practices), principals do not control the major levers for addressing the problems of unequal supply: unequal district revenues, noncompetitive teacher salary levels, and the policies that govern recruitment and preparation that would allow them to seek out and hire the individuals they might most want to recruit.

Eliminating certification requirements would eliminate pressures for competitive wages or recruitment incentives for teachers, since an open marketplace in a resource-constrained public sector could resolve shortages by lowering standards. In addition, eliminating certification requirements would eliminate evidence about disparities in students' opportunities to learn, for if there are no minimum standards, there will be no evidence of differences in the extent to which they have been achieved by teachers working with different groups of students. This would in turn reduce pressures for the creation of policies to rectify these inequities.
Finally, eliminating such standards would remove the mechanisms states have been developing and improving to be sure that teachers know their content well, know how to teach that content to students, know how to teach fundamental skills like reading, and have the ability to meet the special needs of learners who may have learning disabilities that require distinct teaching strategies, whose first language is not English, or who simply struggle with certain kinds of academic tasks and need diagnostic assistance.

The outcome of Walsh's argument, were it to be successful in the policy community, would be continued inequality in funding, depressed salaries for teaching in high-need areas, continued lack of access for poor children to a stable teaching force of well-qualified teachers by any definition, and tragic loss of a productive future for students who are underserved.

To be sure, certification is but a proxy for the subject matter knowledge and the knowledge of teaching and learning embodied in various kinds of coursework and in the evidence of ability to practice contained in supervised student teaching. It is true that certification is a relatively crude measure of teachers' knowledge and skills, since the standards for subject matter and teaching knowledge embedded in certification have varied across states and over time, are differently measured, and are differently enforced from place to place. The quality of preparation in both university programs and other alternatives has varied as well, although a number of states have made substantial recent headway in strengthening teachers' preparation and reducing this variability. Given the crudeness of the measure, it is perhaps remarkable that so many studies have found significant effects of teacher certification.
This does not mean that we should be sanguine about certification policies. There are questions about the quality of tests, courses, and institutions that are the subject of study and action across the country (see, for example, Darling-Hammond, Wise, & Klein, 1999). The answer to flaws that may be perceived, however, is not to eliminate or undermine the pathways that enable and require teachers to gain knowledge and students to have access to teachers who have the knowledge they need. If teacher knowledge and skill about both content and how to teach it is important, as substantial evidence suggests it is, the most sensible policy goal is to work to improve preparation opportunities and certification standards so that they increasingly approximate what teachers need to know and do in order to be successful with diverse students.

As Levin (1980) noted, certification is a critically important exercise in the economics of information that should be a target of continual improvement:

(T)he facts that we expect the schools to provide benefits to society that go beyond the sum of those conferred upon individual students, that it is difficult for many students and their parents to judge certain aspects of teacher proficiency, and that teachers cannot be instantaneously dismissed, mean that somehow the state must be concerned about the quality of teaching. It cannot be left only to the individual judgments of students and their parents or the educational administrators who are vested with managing the schools in behalf of society. The purpose of certification of teachers and accreditation of the programs in which they received their training is to provide information on whether teachers possess the minimum proficiencies that are required from the teaching function.
Because this is an exercise in the provision of information, it is important to review the criteria for setting out how one selects the information that is necessary to make a certification or accreditation decision (p. 7).

Conclusion

Kate Walsh has dismissed or misreported much of the existing evidence base in order to argue that teacher education makes no difference to teacher performance or student learning and that students would be better off without state efforts to regulate entry into teaching or to ensure certain kinds of teachers' learning. While she argues for recruiting bright people into teaching (and who could disagree with that?), her proposals offer no incentives for attracting individuals into teaching other than the removal of preparation requirements. While this proposal is couched as the elimination of "barriers" to teaching, evidence suggests that lack of preparation actually contributes to high attrition rates and thereby becomes a disincentive to long-term teaching commitments and to the creation of a stable, high-ability teaching force. Lack of preparation also contributes to lower levels of learning, especially for those students who most need skillful teaching in order to succeed.

The evidence from research presented here and elsewhere makes clear that the policies Walsh endorses could bring harm to many children, especially those who are already least well served by the current system. Those who make such arguments for eliminating one of the few protections these children have should bear the burden of proof for showing how what they propose could lead to greater equity and excellence in American
schools.

Notes

1. The research assistance of Lisa Marie Carlson is gratefully acknowledged.

2. "Teacher Certification Reconsidered: Stumbling for Quality" is published through the Abell Foundation website: www.abellfoundation.org. The version of the report that was publicized and published on this website in October 2001 is the basis for this response. The report has since been amended. In a reply to my response posted to the Abell Foundation website, Walsh noted that some of the errors I pointed out have been removed in the hard copy version the foundation published in December 2001.

3. In addition to the Abell Foundation, these include the Fordham Foundation, which has issued a "manifesto" urging the elimination of teacher education and certification requirements.

4. See The Research and Rhetoric on Teacher Certification: A Response to Teacher Certification Reconsidered, at http://www.nctaf.org

5. See Teacher Certification Reconsidered: Stumbling for Quality, A Rejoinder (November, 2001) at www.abellfoundation.org

6. A separate appendix is published on the Abell Foundation website. Some of its entries have changed as criticisms of the report have been lodged.

7. See, for example, footnote 18 on p. 13, where Walsh refers readers to Appendix B for analysis of six studies, only two of which (Guyton & Farokhi, 1987; Monk, 1994) are actually included there. Appendix B of the published version of Walsh's report includes only 14 of the 192 studies originally included in her draft of July 23, 2001 and does not include most of the key studies on the topic. A longer appendix was later added to the Abell Foundation website.
Readers who consult that document will find that many of the studies listed are not concerned with teacher education but are cited for other reasons related to one of Walsh's own arguments; many others are not reviewed because they were not retrieved or were deemed too old or too small; still others are "reviewed" only in the sense that complaints are made about them or about the way they were cited by another researcher.

8. In a reply to my response, Walsh and Podgursky (2001) suggest that Wenglinsky referred only to in-service education. However, the NAEP questions Wenglinsky analyzed for evidence of teacher learning covered coursework or professional development teachers had encountered before or after entering teaching. The stem for these questions was in each case one of the following: "During the past five years, have you taken courses or participated in professional development activities in any of the following?" or "Have you ever received training in any of the following, either in courses or in-service education?"

9. Another study by the California Commission on Teacher Credentialing found the attrition rates of Los Angeles Teacher Trainees who dropped out before they entered teaching to be quite high. Of the first cohort, 80.3% completed the first year of training and only 64.6% completed the second year and received a clear credential the year after
(Wright, McKibbon, and Walton, 1987). This 35% attrition rate prior to graduation from the program, added to the 53% attrition rate of those who completed the program but left the district within the subsequent 7 years (Stoddart, 1992), left only about 30% of the original cohort in the district after 7 years.

10. In her Education Next article, Walsh (2002) lists a set of studies with sample sizes of up to 55 teachers as "too small to produce results that are reliable or that can be generalized to the larger population" (on-line version, p. 9). However, in her reply to me (Walsh and Podgursky, 2001, p. 14), she states that because Miller, McKenna, & McKenna's study was a matched-pair study, a "gold standard of research," its small numbers (18 teachers for examining student achievement effects) are justified. Yet just pages earlier in the same document (p. 8), she and Podgursky criticize another matched-pair study (Hawk, Coble, & Swanson, 1985), which has a larger sample (36 teachers) and a stronger design for evaluating student achievement (Miller et al. drop most of their teachers and the matched comparison design when they evaluate student test scores), as lacking statistical controls (also missing in the Miller et al. study) and failing to adjust for pre-test scores of students (Miller, McKenna and McKenna do not even present the pre-test scores of students). The Hawk et al. study, which Walsh originally cited approvingly as an argument for content knowledge, is now dismissed by Podgursky as "small and not well-controlled" to avoid having to acknowledge its results, which find positive effects of teacher certification on student achievement.

11. Personal communications with economist Susanna Loeb and statistician William Billet.

12.
As one of dozens of examples of general sloppiness: neither the Goldhaber and Brewer study nor the Hawk, Coble, and Swanson study cited by Walsh for this proposition even treated the question of whether "the most distinct problem in schools serving poor children is the number of teachers who are teaching subjects in which they have no expertise." Neither study examined or reported on the socioeconomic status of students or the distribution of teachers in schools serving different children.

13. As the study clearly states, California uses emergency permits for those who lack either subject matter competence or pedagogy or both. The requirement for a clear credential is passage of both subject matter competence and a set of pedagogical requirements, whether these are completed in a "traditional" or an "alternative" program, which in California would be an internship model requiring the candidates to meet the same standards as traditional programs. In fact, the composition of the emergency permit pool in California is nearly the opposite of what Walsh seems to surmise. This pool includes many teachers who have passed the subject matter test (or alternative content course requirements) in mathematics but who have not completed teacher education requirements. It also includes many teachers who have passed a basic skills test but have not completed either the subject matter or teacher education requirements for a clear credential. It includes very few individuals who have completed teacher education requirements but who have not completed subject matter requirements, since demonstration of subject matter competence is a prerequisite for entering the student teaching or internship portion of teacher education in California. Furthermore, experienced teachers who may be teaching math out of field would generally have been included in Fetler's data set as credentialed, since out-of-field teaching is not monitored by the state through the data set he used.
14. The original appendix was included in Walsh's draft dated July 23, 2001. Her final complete appendix, published in October 2001, modifies this statement only slightly, stating, "The author's principal and clear lament is the lack of subject matter knowledge in mathematics, with little mention at all of education coursework that may be lacking."

15. High-minority schools were defined as those with more than 90% students of color; low-minority schools had fewer than 30%. High-achieving schools were defined as those in the top quartile of achievement on the SAT-9 tests used by the state; low-achieving schools were those in the bottom quartile.

16. Rosalind Rossi, "Teacher woes worst in poor schools," Chicago Sun Times, October 10, 2001.

17. Walsh states that, "L. Darling-Hammond ... presents a chart using an ambiguous term 'Teacher Qualifications' which accounted for nearly half of the student achievement gains" (p. 17). The chart to which Walsh alludes actually referred to another study by Ferguson (1996) and was clearly labeled as such. Another chart next to this one was drawn directly from a table in the Greenwald, Hedges, and Laine study, and was also clearly marked.

18. In a later response to my reply (Walsh & Podgursky, 2001), Walsh notes that she cited Ferguson & Womack in error and meant to cite Ferguson and Ladd (1996). However, this study is one she should have discounted due to its level of aggregation if she were adhering to her own standards for evaluating research.

19. One odd criticism is that the institution, Arkansas Tech, has "low entrance requirements, making it unlikely that enough variance in student ability, background and coursework is present to reflect a broader population. The variance may be too narrow or at least skewed."
Walsh seems to be unaware that the variance in student ability measures is usually much larger in large state universities like this one than it is in more selective colleges, thus making some kinds of inferences more, rather than less, supportable. The more appropriate question about single-institution studies is whether they may generalize to unlike institutions, a legitimate point that Walsh does not raise, and one that should be answered by conducting studies within and across institutional contexts.

20. Some may also have been teachers who needed to take the Praxis as an entrance examination for a post-baccalaureate teacher education program.

21. The ETS re-analysis is soon to be published. An earlier analysis of the federal Baccalaureate and Beyond database found that 1993 graduates of NCATE-accredited teacher education programs were about 50% more likely to have scored above the 50th percentile on SAT and ACT tests than graduates of non-NCATE teacher education programs (Shotel, 1998). NCATE graduates had also taken more social science, computer science, advanced foreign language credit, pre-college mathematics, and teaching coursework and fewer remedial English courses than non-NCATE graduates, with other areas being approximately equal (Shotel, 1998).

22. The proportions who had taken other kinds of liberal arts coursework also differed little. For example, the proportion of 1992-93 bachelor's degree recipients who had taken college coursework in mathematics at the level of calculus and above was 18.3% in
public schools and 16.9% in private schools; science was 77.2% vs. 73.5% (table A-51).

23. These statistics pertain to the youngest teachers in public and private schools: 1992-93 bachelor's degree recipients hired by 1993-94. These teachers are the least likely to be certified, even though they have taken education coursework at rates nearly as high as public school teachers. This suggests that many of these teachers may have prepared to teach but did not seek or secure state certification. In 1993-94, NCES reports that about 36% of private school teachers held no certificate in their primary assignment field (the data are not presented regarding their certification in another field other than the primary teaching assignment). The rates of noncertification ranged from 27% for those with 20 or more years of teaching experience to 51% for those with 3 or fewer years of teaching experience (NCES, 1997, table A3.14a).

24. For example, while most private school students (52%) attend schools that are less than 10% minority, only 31% of public school students do (NCES, Digest of Education Statistics, 1999, p. 71, table 60 and p. 119, table 99). African American and Latino students are at least 50% more likely to attend public than private schools (NCES, 1997, Table A2.13). Most low-income students and students of color now attend public schools in urban public school districts.

25. Walsh objects to a composite "education and performance" variable created by the authors, which included the amount of education coursework, student teaching grade, GPA, and science teaching experience.

26. In Walsh's original appendix, this study is further critiqued because the reviewer was not clear on the meaning of the term "out-of-field" in the study when referencing elementary school teachers.
The article defined the proportion of "well-qualified teachers" as the proportion holding state certification and the equivalent of a major (either an undergraduate major or a master's degree) in the field taught. For elementary teachers, the equivalent of a major was defined as an elementary education degree for generalists who teach multiple subjects to the same group of students, or as a degree in the field taught for elementary specialists (e.g., reading, mathematics or mathematics education, special education). The study defined "out-of-field" for elementary teachers in the same way it was defined for secondary teachers: holding less than a minor or the equivalent in the fields described above (elementary education in the case of generalists, or the specialist field, e.g., reading or mathematics, in the case of specialists).

27. For some mysterious reason, Walsh also tries to make a point that I differentiate (wrongly, in her view) between cognitive ability or IQ and verbal ability (see her footnote 14, p. 8), despite the fact that this is a standard distinction in the literature made by many of the analysts Walsh herself quotes for support of the importance of verbal ability measures. Few measurement experts would argue that IQ, as it was defined and measured in the 1940s and '50s, represents the same construct as verbal ability, as Walsh seems to be invested in proving.

28. Walsh makes a hash of the research cited here on the relationship between teacher test scores and measures of teacher effectiveness, striving to prove that studies which found largely insignificant positive and negative relationships between NTE scores and student achievement at least did not find significant negative relationships. Since there is little disagreement about the value of having teachers demonstrate their basic skills and subject matter knowledge through either coursework or testing, I do not review each of
these older studies here.

29. In her separately-published appendix, Walsh states that, "In 1999, Darling-Hammond summarized the main point of this article as a call for using student achievement as the measure of teacher quality." In fact, in Darling-Hammond (1999), I cited this review for an entirely different point. I cited it for the proposition that "Teachers' abilities to structure material, ask higher order questions, use student ideas, and probe student comments have also been found to be important variables in what students learn."

30. Cited in the separately-published appendix, entry 88, p. 50.

References

Andrew, M. & Schwab, R.L. (1995). Has reform in teacher education influenced teacher performance? An outcome assessment of graduates of eleven teacher education programs. Action in Teacher Education, 17: 43-53.

Andrews, J.W., Blackmon, C.R., & Mackey, J.A. (1980). Preservice performance and the National Teacher Examinations. Phi Delta Kappan, 61(5): 358-359.

Ashton, P. & Crocker, L. (1987). Systematic study of planned variations: The essential focus of teacher education reform. Journal of Teacher Education, 2-8.

Ayers, J.B., & Qualls, G.S. (1979). Concurrent and predictive validity of the National Teacher Examinations. Journal of Educational Research, 73(2): 86-92.

Begle, E.G. (1979). Critical variables in mathematics education: Findings from a survey of the empirical literature. Washington, DC: Mathematical Association of America and National Council of Teachers of Mathematics.

Begle, E.G. & Geeslin, W. (1972). Teacher effectiveness in mathematics instruction. National Longitudinal Study of Mathematical Abilities Reports No. 28. Washington, DC: Mathematical Association of America and National Council of Teachers of Mathematics.

Berliner, D.C. (1986). In pursuit of the expert pedagogue. Educational Researcher (August/September): 5-13.

Betts, J.R., Rueben, K.S., & Danenberg, A. (2000). Equal resources, equal outcomes?
The distribution of school resources and student achievement in California. San Francisco: Public Policy Institute of California.

Bliss, T. (1992). Alternative certification in Connecticut: Reshaping the profession. Peabody Journal of Education, 67(3): 35-54.

Bowles, S., & Levin, H.M. (1968). The determinants of scholastic achievement: An appraisal of some recent evidence. Journal of Human Resources, 3: 3-24.

Bradshaw, L. & Hawk, P. (1996). Teacher Certification: Does It Really Make a Difference in Student Achievement? Greenville, NC: Eastern North Carolina Consortium for Assistance and Research in Education.
Bryk, A.S. & Lee, V.E. (1992). Are politics the problems and markets the answer? An essay review of "Politics, markets and America's schools." Economics of Education Review, 11(4): 439-451.

Byrne, C.J. (1983). Teacher knowledge and teacher effectiveness: A literature review, theoretical analysis and discussion of research strategy. Paper presented at the meeting of the Northwestern Educational Research Association, Ellenville, NY.

Carroll, J.B. (1975). The Teaching of French as a Foreign Language in Eight Countries. New York: John Wiley and Sons.

Coleman, J.S., Campbell, E.Q., Hobson, C.J., McPartland, J., Mood, A.M., Weinfeld, F.D., & York, R.L. (1966). Equality of Educational Opportunity. Washington, DC: U.S. Government Printing Office.

Coleman, J.S., Hoffer, T., & Kilgore, S. (1982). Cognitive outcomes in public and private schools. Sociology of Education, 55(2-3): 65-76.

Darling-Hammond, L. (1992). Teaching and knowledge: Policy issues posed by alternate certification for teachers. Peabody Journal of Education, 67(3): 123-154.

Darling-Hammond, L. (1997). Doing What Matters Most: Investing in Quality Teaching. NY: National Commission on Teaching and America's Future, Teachers College, Columbia University.

Darling-Hammond, L., Wise, A.E., & Klein, S.P. (1999). A license to teach. San Francisco: Jossey-Bass.

Darling-Hammond, L. (2000a). Reforming teacher preparation and licensing: Debating the evidence. Teachers College Record, 102(1): 28-56.

Darling-Hammond, L. (2000b). Solving the Dilemmas of Teacher Supply, Demand, and Standards: How We Can Ensure a Competent, Caring, and Qualified Teacher for Every Child. NY: National Commission on Teaching and America's Future.

Darling-Hammond, L. (2000c). Teacher quality and student achievement. Education Policy Analysis Archives, 8(1): http://epaa.asu.edu/epaa/v8n1.html

Darling-Hammond, L., Berry, B., & Thoreson, A. (2001). Does Teacher Certification Matter? Evaluating the Evidence.
Educational Evaluation and Policy Analysis, 23(1) : 57-77.Darling-Hammond, L., Hudson, L., & Kirby, S. (1989) Redesigning Teacher Education: Opening the Door for New Recruits to Sci ence and Mathematics Teaching Santa Monica: The RAND Corporation.Darling-Hammond, L., Wise, A.E. & Pease, S.R. (1983 ). Teacher evaluation in the organizational context: a review of the literature. Review of Educational Research, 53 : 285-237.Denton, J.J., & Peters, W.H. (1988). Program Assessment Report: Curriculum
Evaluation of a Non-traditional Program for Certifying Teachers. College Station, TX: Texas A&M University.

Druva, C.A., & Anderson, R.D. (1983). Science teacher characteristics by teacher behavior and by student outcome: A meta-analysis of research. Journal of Research in Science Teaching, 20(5): 467-479.

Duffy, G. & Roehler, L. (1989). The tension between information-giving and mediation: Perspectives on instructional explanation and teacher change. In J. Brophy (ed.), Advances in Research on Teaching, Vol. 1. Greenwich, CT: JAI.

Duffy, G., Roehler, L., Sivan, E., Rackliffe, G., Book, C., Meloth, M., Vavrus, L., Wesselman, R., Putnam, J., & Bassiri, D. (1987). Effects of explaining reasoning associated with using reading strategies. Reading Research Quarterly, 22(3): 347-368.

Evertson, C., Hawley, W., & Zlotnick, M. (1985). Making a difference in educational quality through teacher education. Journal of Teacher Education, 36(3): 2-12.

Feistritzer, C.E. (1984). The Making of a Teacher. Washington, DC: National Center for Education Information.

Ferguson, R.F. (1991). Paying for public education: New evidence on how and why money matters. Harvard Journal of Legislation, 28(2): 465-498.

Ferguson, R.F. & Ladd, H.F. (1996). How and why money matters: An analysis of Alabama schools. In Helen Ladd (ed.), Holding Schools Accountable, pp. 265-298. Washington, DC: Brookings Institution.

Ferguson, P. & Womack, S.T. (1993). The impact of subject matter and education coursework on teaching performance. Journal of Teacher Education, 44(1): 55-63.

Fetler, M. (1999). High school staff characteristics and mathematics test results. Education Policy Analysis Archives, 7(9): http://epaa.asu.edu/epaa/v7n9.html

Gitomer, D.H., Latham, A.S., & Ziomek, R. (1999). The Academic Quality of Prospective Teachers: The Impact of Admissions and Licensure Testing. Princeton, NJ: Educational Testing Service.

Goe, L. (forthcoming). Legislating equity: The distribution of emergency permit teachers in California. Berkeley: Graduate School of Education, University of California, Berkeley.

Goldhaber, D.D. & Brewer, D.J. (1998, October). When should we reward degrees for teachers? Phi Delta Kappan: 134-138.

Goldhaber, D.D. & Brewer, D.J. (2000). Does teacher certification matter? High school certification status and student achievement. Educational Evaluation and Policy Analysis, 22: 129-145.

Greenwald, R., Hedges, L.V., & Laine, R.D. (1996). The effect of school resources on student achievement. Review of Educational Research, 66: 361-396.
Grey, L., Cahalan, M., Hein, S., Litman, C., Severynse, J., Warren, S., Wisan, G., & Stowe, P. (1993). New Teachers in the Job Market: 1991 Update. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement.

Guyton, E. & Farokhi, E. (1987). Relationships among academic performance, basic skills, subject matter knowledge and teaching skills of teacher education graduates. Journal of Teacher Education (Sept-Oct.): 37-42.

Haney, W., Madaus, G., & Kreitzer, A. (1987). Charms talismanic: Testing teachers for the improvement of American education. In E.Z. Rothkopf (Ed.), Review of Research in Education, 14: 169-238. Washington, DC: American Educational Research Association.

Haney, W. (2000). The myth of the Texas miracle in education. Education Policy Analysis Archives, 8(41): http://epaa.asu.edu/epaa/v8n41/

Hanushek, E. (1971). Teacher characteristics and gains in student achievement: Estimation using micro data. The American Economic Review, 61(2): 280-288.

Hanushek, E. (1992). The trade-off between child quantity and quality. Journal of Political Economy, 100: 84-117.

Hanushek, E.A., Rivkin, S.G., & Taylor, L.L. (1995). Aggregation Bias and the Estimated Effects of School Resources. Rochester, NY: University of Rochester, Center for Economic Research.

Hanushek, E. (1996b). School Resources and Achievement in Maryland. Baltimore, MD: Maryland State Department of Education.

Hawk, P., Coble, C.R., & Swanson, M. (1985). Certification: It does matter. Journal of Teacher Education, 36(3): 13-15.

Hellfritzch, A.G. (1945). A factor analysis of teacher abilities. Journal of Experimental Education, 14: 166-169.

Henke, R., Chen, X., & Geis, S. (2000). Progress Through the Teacher Pipeline: 1992-93 College Graduates and Elementary/Secondary School Teaching as of 1997. Washington, DC: National Center for Education Statistics, U.S. Department of Education.

Ingersoll, R. (1998). The problem of out-of-field teaching. Phi Delta Kappan (June): 773-776.

Jelmberg, J. (1995). College-based teacher education versus state-sponsored alternative programs. Journal of Teacher Education, 47(1): 60-66 (Jan-Feb 1996).

Laczko-Kerr, I. & Berliner, D. (2002). The effectiveness of Teach for America and other under-certified teachers on student academic achievement: A case of harmful public policy. Education Policy Analysis Archives, 10(37). Available: http://epaa.asu.edu/epaa/v10n37/

LaDuke, D.V. (1945). The measurement of teaching ability. Journal of Experimental Education, 14: 75-100.
Lanier, J. & Little, J. (1986). Research on teacher education. In M. Wittrock (ed.), Handbook of Research on Teaching, Third Edition. New York: Macmillan.

Lee, V.E. & Bryk, A.S. (1988). Curriculum tracking as mediating the social distribution of high school achievement. Sociology of Education, 61: 78-94.

Lee, V.E., Dedrick, R.F., & Smith, J.B. (1991). The effect of the social organization of schools on teachers' self-efficacy and satisfaction. Sociology of Education, 64: 190-208.

Levin, H.M. (1980). Teacher certification and the economics of information. Educational Evaluation and Policy Analysis, 2(4): 5-18.

Little, J.W. (1999). Organizing schools for teacher learning. In L. Darling-Hammond and G. Sykes (eds.), Teaching as the Learning Profession, pp. 233-262. San Francisco: Jossey-Bass.

Lutz, F.W. & Hutton, J.B. (1989). Alternative teacher certification: Its policy implications for classroom and personnel practice. Educational Evaluation and Policy Analysis, 11(3): 237-254.

Miller, J.W., McKenna, M.C., & McKenna, B.A. (1998). A comparison of alternatively and traditionally prepared teachers. Journal of Teacher Education, 49(3): 165-176.

Mitchell, N. (1987). Interim Evaluation Report of the Alternative Certification Program (REA87-027-2). Dallas, TX: DISD Department of Planning, Evaluation, and Testing.

Monk, D. (1994). Subject area preparation of secondary mathematics and science teachers and student achievement. Economics of Education Review, 12(2): 125-142.

Monk, D. & King, J. (1994). Multi-level teacher resource effects on pupil performance in secondary mathematics and science. In R.G. Ehrenberg (ed.), Choices and Consequences. Ithaca, NY: ILR Press.

Murnane, R.J. (1985). Do Effective Teachers Have Common Characteristics: Interpreting the Quantitative Research Evidence. Paper presented at the National Research Council Conference on Teacher Quality in Science and Mathematics, Washington, DC.

Murnane, R.J. (1983). Understanding the sources of teaching competence: Choices, skills and the limits of training. Teachers College Record, 84(3): 564-569.

National Center for Education Statistics (NCES) (1985). The Condition of Education, 1985. Washington, DC: U.S. Department of Education.

National Center for Education Statistics (NCES) (1997). America's Teachers: Profile of a Profession. Washington, DC: U.S. Department of Education.

National Center for Education Statistics (NCES) (2000). Digest of Education Statistics, 1999. Washington, DC: U.S. Department of Education.

National Commission on Teaching and America's Future (1996). What Matters Most:
Teaching for America's Future. New York: Author.

National Reading Panel (2000). Teaching Children to Read: An Evidence-based Assessment of the Scientific Research Literature on Reading and Its Implications for Reading Instruction. Washington, DC: National Institute of Child Health and Human Development.

Natriello, G. & Zumwalt, K. (1992). Challenges to an alternative route for teacher education. In A. Lieberman (Ed.), The 91st Yearbook of the National Society for the Study of Education, Vol. 1, pp. 59-78. Chicago: University of Chicago Press.

Quirk, T.J., Witten, B.J., & Weinberg, S.F. (1973). Review of studies of concurrent and predictive validity of the National Teacher Examinations. Review of Educational Research, 43: 89-114.

Palincsar, A.S. & Brown, A.L. (1984). Reciprocal teaching of comprehension-fostering and comprehension-monitoring activities. Cognition & Instruction, 1: 117-175.

Raymond, M., Fletcher, S., & Luque, J. (2001). Teach for America: An Evaluation of Teacher Differences and Student Outcomes in Houston, Texas. CREDO, The Hoover Institution, Stanford University. Available: http://www.rochester.edu/credo

Rostker, L.E. (1945). The measurement of teaching ability. Journal of Experimental Education, 14: 5-51.

Schalock, D. (1979). Research on teacher selection. In D.C. Berliner (ed.), Review of Research in Education (Vol. 7). Washington, DC: American Educational Research Association.

Shields et al., Stanford Research International (SRI) (2000). The Status of the Teaching Profession, 2000: An Update to the Teaching and California's Future Task Force. Santa Cruz, CA: The Center for the Future of Teaching and Learning.

Shotel, J.R. (Summer 1998). Does NCATE Make a Difference? Quality in Teacher Education. Washington, DC: George Washington University.

Skinner, W.A. (1947). An Investigation of Factors Useful in Predicting Teaching Ability. Master of Education thesis, University of Manchester.

Soar, R.S., Medley, D.M., & Coker, H. (1983). Teacher evaluation: A critique of currently used methods. Phi Delta Kappan, 65(4): 239-246.

Stafford, D. & Barrow, G. (1994). Houston's alternative certification program. The Educational Forum, 58: 193-200.

Stoddart, T. (1992). An alternate route to teacher certification: Preliminary findings from the Los Angeles Unified School District Intern Program. Peabody Journal of Education, 67(3).

Strauss, R.P. & Sawyer, E.A. (1986). Some new evidence on teacher and student competencies. Economics of Education Review, 5(1): 41-48.
Summers, A.A., & Wolfe, B.L. (1975). Which school resources help learning? Efficiency and equality in Philadelphia public schools. The American Economic Review, 67(4): 639-652.

Texas Center for Educational Research (2000). The Cost of Teacher Turnover. Austin, TX: Texas State Board for Teacher Certification (SBEC).

U.S. Department of Education (2002). Meeting the Highly Qualified Teachers Challenge: The Secretary's Annual Report on Teacher Quality. Washington, DC: U.S. Department of Education, Office of Postsecondary Education, Office of Policy Planning and Innovation.

Veenman, S. (1984). Perceived problems of beginning teachers. Review of Educational Research, 54: 143-178.

Vernon, P.E. (1965). Personality factors in teacher trainee selection. British Journal of Educational Psychology (35): 140-149.

Walsh, K. (2001). Teacher Certification Reconsidered: Stumbling for Quality. Baltimore, MD: Abell Foundation. Available: http://www.abellfoundation.org

Walsh, K. (2002, Spring). The evidence for teacher certification. Education Next, 2(1): 79-84.

Wayne, A.J., & Youngs, P. (under review). Teacher characteristics and student achievement gains: A review. Review of Educational Research.

Wenglinsky, H. (2000). How Teaching Matters: Bringing the Classroom Back into Discussions of Teacher Quality. Princeton, NJ: Educational Testing Service.

Wilson, S., Floden, R., & Ferrini-Mundy (2001). Teacher Preparation Research: Current Knowledge, Gaps, and Recommendations. University of Washington: Center for the Study of Teaching and Policy.

Wright, D.P., McKibbon, M., & Walton, P. (1987). The Effectiveness of the Teacher Trainee Program: An Alternate Route into Teaching in California. California Commission on Teacher Credentialing.

About the Author

Linda Darling-Hammond
School of Education
Stanford University
Email: email@example.com

Linda Darling-Hammond is Charles E. Ducommun Professor of Education at Stanford University and was Founding Executive Director of the National Commission on Teaching and America's Future. Her research, policy and teaching focus on teacher education and teaching quality, school restructuring, and educational equity. Among other writings, she is author of The Right to Learn, which received the Outstanding
Book Award from the American Educational Research Association in 1998.

Copyright 2002 by the Education Policy Analysis Archives

The World Wide Web address for the Education Policy Analysis Archives is epaa.asu.edu

General questions about appropriateness of topics or particular articles may be addressed to the Editor, Gene V Glass, firstname.lastname@example.org, or reach him at College of Education, Arizona State University, Tempe, AZ 85287-2411. The Commentary Editor is Casey D. Cobb: email@example.com.