xml version 1.0 encoding UTF-8 standalone no
record xmlns http:www.loc.govMARC21slim xmlns:xsi http:www.w3.org2001XMLSchema-instance xsi:schemaLocation http:www.loc.govstandardsmarcxmlschemaMARC21slim.xsd
leader nam a22 u 4500
controlfield tag 008 c20039999azu 000 0 eng d
datafield ind1 8 ind2 024
subfield code a E11-00322
Educational policy analysis archives.
n Vol. 11, no. 24 (August 04, 2003).
Tempe, Ariz. :
b Arizona State University ;
Tampa, Fla. :
University of South Florida.
c August 04, 2003
High-stakes testing : another analysis / Barak Rosenshine.
Arizona State University.
University of South Florida.
t Education Policy Analysis Archives (EPAA)
A peer-reviewed scholarly journal
Editor: Gene V Glass, College of Education, Arizona State University

Copyright is retained by the first or sole author, who grants right of first publication to the EDUCATION POLICY ANALYSIS ARCHIVES. EPAA is a project of the Education Policy Studies Laboratory. Articles appearing in EPAA are abstracted in the Current Index to Journals in Education by the ERIC Clearinghouse on Assessment and Evaluation and are permanently archived in Resources in Education.

Volume 11 Number 24
August 4, 2003
ISSN 1068-2341

High-Stakes Testing: Another Analysis

Barak Rosenshine
University of Illinois at Urbana-Champaign

Citation: Rosenshine, B. (2003, August 4). High-stakes testing: Another analysis. Education Policy Analysis Archives, 11(24). Retrieved [date] from http://epaa.asu.edu/epaa/v11n24/.

Abstract

Amrein and Berliner (2002b) compared National Assessment of Educational Progress (NAEP) results in high-stakes states against the national average for NAEP scores. They studied NAEP scores for 8th grade mathematics, 4th grade mathematics, and 4th grade reading. They concluded that states that introduced consequences (high stakes) to their statewide tests did not show any particular gains in their statewide NAEP scores. However, there was no comparison group in their analysis. In this analysis, a comparison group was formed from states that did not attach consequences to their statewide tests. This analysis showed that states that attached consequences outperformed the comparison group of states on each of the three NAEP tests for the last four-year period. These results showed that, overall, there was a meaningful carryover from attaching consequences on statewide tests to statewide NAEP scores.

As reported by Viadero (2003a, 2003b), Audrey Amrein and David Berliner (2002a, 2002b, 2002c) have studied the effects of attaching consequences or accountability ("high stakes") to student scores on their statewide exams.
These consequences include monetary awards to schools or teachers, authority to replace a principal or a teacher, and limits on grade-to-grade promotion. Individual states imposed one to six such consequences, with an average of three consequences (see Table 1 in Amrein and Berliner, 2002b).

In an admirable decision, Amrein and Berliner did not look at each state's scores on its own statewide tests. Rather, they looked at each state's scores on an independent measure, the National Assessment of Educational Progress. They compared the four-year changes in NAEP scores in each high-stakes state against the average changes for all the states that took each NAEP test. They studied NAEP results in three areas: 8th grade mathematics, 4th grade mathematics, and 4th grade reading.

They concluded that there were "no consistent effects across states" after consequences were introduced (Amrein & Berliner, 2002b, p. 57). Some states had larger increases than the national average, but the NAEP changes in other states were less than the national average. They concluded that students in high-stakes states were not learning anything beyond the specific content of the statewide tests.

Their analysis, however, did not include a comparison group. The economist John H. Bishop, quoted in Viadero (2003b), said, "The natural thing to do would be to compare the states that had accountability systems to ones that didn't." This reanalysis follows Bishop's suggestion and compares the NAEP gains in the high-stakes states against the NAEP gains in states that did not have statewide accountability procedures.

Not all of the 26 high-stakes states were included in the Amrein and Berliner analysis. They noted that some states may have manipulated their NAEP scores by exempting some of the special education students and students with limited English proficiency from those taking the NAEP test. They excluded these states or individual results from these states.
For example, all the results from North Carolina and Texas, two states with large NAEP increases, were labeled as "unclear" and were not included in their analyses. All of their reported results are for the remaining 8 to 12 "clear" states (depending upon the specific NAEP exam). I believe this separation into "clear" and "unclear" high-stakes states was a valuable step because it allows the analysis to focus only on the results in the clear states.

My analysis uses the NAEP four-year gain from these same 8 to 12 "clear" high-stakes states. An additional 14 to 18 states (depending upon the specific NAEP exam) did not attach consequences to tests, and they served as the comparison group in my analysis. My analysis is based on the gain, from cohort to cohort, between 1996 and 2000 for the two mathematics tests and on the gain between 1994 and 1998 on the reading test. One could, of course, go back eight years, but then the number of high-stakes states would be much smaller. The tables from which these numbers came are readily available at the NAEP website (http://nces.ed.gov/nationsreportcard/).

The results in Table 1 show that the average NAEP increases in the "clear" high-stakes states were much higher than the increases in the comparison states. In 8th grade mathematics and in 4th grade reading, the mean increase for the clear high-stakes states was at least double the increase for the states without consequences.
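
The comparison above, and the effect sizes the article reports for the same three tests, can be checked with a few lines of arithmetic. The following is a minimal sketch rather than the author's procedure. The mean increases are the values printed in Table 1. The function names are hypothetical. The conversion from effect size to percentile assumes a standard normal distribution, which the article does not state explicitly.

```python
from statistics import NormalDist

# Mean four-year NAEP increases as printed in Table 1:
# (clear high-stakes states, states without high stakes)
TABLE_1 = {
    "4th grade mathematics (1996-2000)": (3.45, 2.40),
    "8th grade mathematics (1996-2000)": (3.42, 1.63),
    "4th grade reading (1994-1998)": (3.44, 1.21),
}


def compare(high_stakes, comparison):
    """Return (difference, ratio) of two mean increases."""
    return high_stakes - comparison, high_stakes / comparison


def percentile_from_effect_size(d):
    """Percentile of the comparison group that lies below a score d
    standard deviations above its mean (assumption: normal distribution)."""
    return 100 * NormalDist().cdf(d)


if __name__ == "__main__":
    for test, (hs, comp) in TABLE_1.items():
        diff, ratio = compare(hs, comp)
        print(f"{test}: difference {diff:.2f} points, ratio {ratio:.2f}")
    # Effect sizes quoted in the article's percentile sentence
    for d in (0.35, 0.79, 0.62):
        pct = percentile_from_effect_size(d)
        print(f"effect size {d:.2f}: percentile {pct:.1f}")
```

Under these assumptions the ratios come out at roughly 2.1 for 8th grade mathematics and 2.8 for 4th grade reading, against about 1.4 for 4th grade mathematics. The computed percentiles are about 63.7, 78.5, and 73.2, which agree with the 63rd, 78th, and 73rd percentiles cited in the article if the values are truncated rather than rounded.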
The effect sizes for the comparisons were .35 for 4th grade math, .79 for 8th grade math, and .61 for 4th grade reading. These effect sizes have been called moderate to large. An effect size of .35 means that the average high-stakes state would score at the 63rd percentile of the comparison states. Effect sizes of .79 and .62 correspond to the 78th and the 73rd percentile of the comparison states.

Table 1
Average four-year increases on NAEP scores for Clear High-Stakes States and for States without High Stakes

NAEP Test                            Average four-year increase      Average four-year increase
                                     in clear high-stakes states     in states without high-stakes
4th grade mathematics (1996-2000)    3.45 (n=11 states)              2.40 (n=15 states)
8th grade mathematics (1996-2000)    3.42 (n=7 states)               1.63 (n=13 states)
4th grade reading (1994-1998)        3.44 (n=9 states)               1.21 (n=14 states)

These results suggest that students in the clear high-stakes states were, indeed, learning mathematics and reading that was beyond the specific content of the statewide tests. Their NAEP achievement, on all three tests, was higher than the achievement of students in NAEP states that did not attach high-stakes to their statewide exams.

These results might lead us to reconsider some of the textual statements in the Amrein and Berliner report (2002b). Amrein and Berliner wrote that "...the imposition of high-stakes testing results in a more narrow form of training...." (p. 6) Perhaps, and perhaps not. But this apparent narrow training did not prevent students in the clear high-stakes states from doing quite well on the NAEP tests, better than the students in states that did not attach consequences to statewide tests.

Did state NAEP scores really decrease?

The Amrein and Berliner (2002b) report states that NAEP scores in many high-stakes states "decreased" after consequences were implemented. They wrote that "grade 4 reading achievement decreased" (p. 19) in Alabama after stakes were attached to statewide tests.
They wrote that in Nevada, "grade 4 math achievement decreased." But "decrease" is a relative term. Scores in Nevada increased three points on the 4th grade NAEP math test between 1996 and 2000. But because this increase was less than the national average of four points, Nevada was listed as "grade 4 math achievement decreased." Grade 4 reading achievement increased by three points in Alabama between 1994 and 1998, the same increase as the national average. However, Amrein and Berliner noted that the percentage of students exempted from NAEP increased in Alabama between 1990 and 1994. Therefore, they concluded that "after stakes were attached to tests in Alabama, grade four reading achievement decreased" (p. 19).

Overall, Amrein and Berliner reported that NAEP scores on the 4th grade mathematics test "decreased" in 8 states. However, there was no actual decrease in any of these states: one state had no change between 1996 and 2000, and the remaining 7 states actually showed increases of 1 to 4 points during this period. If one adds them up across the three NAEP exams, there was an absolute decrease in only two of the high-stakes state scores used in the Amrein and Berliner analysis and an absolute decrease in 10 of the NAEP scores for states that did not include accountability measures.

Discussion

This analysis showed that the clear high-stakes states outperformed the comparison states on each of the three NAEP tests for the last four-year period. If my analytic approach makes sense, and if these results are confirmed by others, then I hope we can begin to study what these results mean.

The results showed there was a meaningful carryover effect, in some states, from statewide testing to the NAEP. Based on these results, it is not appropriate for Amrein and Berliner (2002b) to say that attaching accountability to statewide tests "results in a narrow form of training," or that "high-stakes testing creates a 'training effect' only" (p. 6). Nor is it appropriate to say that "students were learning the content of the state-administered test and perhaps little else" (p. 60).

Although attaching accountability to statewide tests worked well in some high-stakes states, it was not an effective policy in all states. South Carolina, Massachusetts, and Alabama did particularly well in 4th grade math, but not New Mexico, West Virginia, or Kentucky.
Indiana and Alabama did particularly well in 8th grade math, but not New Mexico or Missouri. Louisiana, Delaware, and Virginia did particularly well in 4th grade reading, but not Missouri or New Mexico.

It would be appropriate to study these successful and less successful high-stakes states and learn how they achieved their results. It would be less appropriate to simply use these results as a hammer and blindly require all states to impose consequences.

I find it unlikely that the NAEP results in the high-scoring states were obtained merely because two weeks were devoted to test preparation. I find it equally unlikely that the NAEP results in the high-scoring states were obtained only because consequences and accountability were introduced.

My guess would be that there is a strong academic focus in these classrooms and these schools. The research has supported academically focused classrooms since 1960, and having seen lots of trivia in classrooms, I welcome a return to an academic focus. I've been in many low-income, high-achievement elementary schools, and they are, indeed, high-achieving places. I have seen history projects, discussion of novels, Junior Great Books, impressive mathematics lessons, and tutoring during lunch hours. It would be unfair to the efforts of students, teachers, and principals to say that they are merely focusing on statewide tests.

Statewide policies and district policies may also have facilitated these increases. If so, how was that accomplished? Did these policies affect all schools? What forms of help did the state and the district provide for the classrooms? All these items may be worthy of study. We might also reexamine the lists of statewide consequences in Table 1 of Amrein and Berliner (2002b) and ask which of these consequences might act as a motivating and not a threatening factor.

Audrey Amrein and David Berliner have performed an important service by focusing on the consequences that some states have attached to statewide testing. I think their use of state NAEP scores as an independent assessment was a brilliant move. Their use of NAEP scores also allows others to conduct additional analyses of this public data. My additional analysis suggests that students in some high-stakes states have done very well on the NAEP tests. I hope we can now study how this happened.

Acknowledgement

I was in contact with Audrey Amrein throughout this reanalysis, and it could not have been done without her help. She provided clear and full answers to each of my frequent email questions. Some of my ideas for this analysis came as a result of email discussions with David Berliner, and I thank him. Sam Stringfield, Burnie Bond, Bob Stevens, Jere Brophy, and Marilyn Kohl all provided useful questions and comments.

References

Amrein, A.L. & Berliner, D.C. (2002a, March 28). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18). Retrieved July 18, 2003 from http://epaa.asu.edu/epaa/v10n18/.

Amrein, A.L. & Berliner, D.C. (2002b). The impact of high-stakes tests on student academic performance: An analysis of NAEP results in states with high-stakes tests and ACT, SAT, and AP Test results in states with high school graduation exams. Tempe, AZ: Education Policy Studies Laboratory, Arizona State University. Retrieved July 18, 2003 from http://www.asu.edu/educ/epsl/EPRU/documents/EPSL-0211-126-EPRU.pdf.

Amrein, A.L. & Berliner, D.C. (2002c). An analysis of some unintended and negative consequences of high-stakes testing. Tempe, AZ: Education Policy Studies Laboratory, Arizona State University. Retrieved July 18, 2003 from http://www.asu.edu/educ/epsl/EPRU/documents/EPSL-0211-125-EPRU.pdf.

Viadero, D. (2003a, January 8). Reports find fault with high-stakes testing. Education Week, 22(16), 5.

Viadero, D. (2003b, February 5). Researchers debate impact of tests. Education Week, 22(21), 1 & 12.
About the Author

Barak Rosenshine
College of Education
University of Illinois, Urbana-Champaign
Email: firstname.lastname@example.org

Barak Rosenshine is an emeritus professor of educational psychology at the University of Illinois at Urbana-Champaign. His research speciality is classroom instruction.

The World Wide Web address for the Education Policy Analysis Archives is epaa.asu.edu

Editor: Gene V Glass, Arizona State University
Production Assistant: Chris Murrell, Arizona State University

General questions about appropriateness of topics or particular articles may be addressed to the Editor, Gene V Glass, email@example.com or reach him at College of Education, Arizona State University, Tempe, AZ 85287-2411. The Commentary Editor is Casey D. Cobb: firstname.lastname@example.org.

EPAA Editorial Board

Michael W. Apple, University of Wisconsin
David C. Berliner, Arizona State University
Greg Camilli, Rutgers University
Linda Darling-Hammond, Stanford University
Sherman Dorn, University of South Florida
Mark E. Fetler, California Commission on Teacher Credentialing
Gustavo E. Fischman, California State University-Los Angeles
Richard Garlikov, Birmingham, Alabama
Thomas F. Green, Syracuse University
Aimee Howley, Ohio University
Craig B. Howley, Appalachia Educational Laboratory
William Hunter, University of Ontario Institute of Technology
Patricia Fey Jarvis, Seattle, Washington
Daniel Kallós, Umeå University
Benjamin Levin, University of Manitoba
Thomas Mauhs-Pugh, Green Mountain College
Les McLean, University of Toronto
Heinrich Mintrop, University of California, Los Angeles
Michele Moses, Arizona State University
Gary Orfield, Harvard University
Anthony G. Rud Jr., Purdue University
Jay Paredes Scribner, University of Missouri
Michael Scriven, University of Auckland
Lorrie A. Shepard, University of Colorado, Boulder
Robert E. Stake, University of Illinois-UC
Kevin Welner, University of Colorado, Boulder
Terrence G. Wiley, Arizona State University
John Willinsky, University of British Columbia

EPAA Spanish Language Editorial Board

Associate Editor for Spanish Language: Roberto Rodríguez Gómez, Universidad Nacional Autónoma de México, email@example.com

Adrián Acosta (México), Universidad de Guadalajara, adrianacosta@compuserve.com
J. Félix Angulo Rasco (Spain), Universidad de Cádiz, felix.firstname.lastname@example.org
Teresa Bracho (México), Centro de Investigación y Docencia Económica-CIDE, bracho dis1.cide.mx
Alejandro Canales (México), Universidad Nacional Autónoma de México, canalesa@servidor.unam.mx
Ursula Casanova (U.S.A.), Arizona State University, casanova@asu.edu
José Contreras Domingo, Universitat de Barcelona, Jose.Contreras@doe.d5.ub.es
Erwin Epstein (U.S.A.), Loyola University of Chicago, Eepstein@luc.edu
Josué González (U.S.A.), Arizona State University, josue@asu.edu
Rollin Kent (México), Universidad Autónoma de Puebla, email@example.com
María Beatriz Luce (Brazil), Universidad Federal de Rio Grande do Sul-UFRGS, lucemb@orion.ufrgs.br
Javier Mendoza Rojas (México), Universidad Nacional Autónoma de México, javiermr@servidor.unam.mx
Marcela Mollis (Argentina), Universidad de Buenos Aires, mmollis@filo.uba.ar
Humberto Muñoz García (México), Universidad Nacional Autónoma de México, humberto@servidor.unam.mx
Angel Ignacio Pérez Gómez (Spain), Universidad de Málaga, aiperez@uma.es
Daniel Schugurensky (Argentina-Canadá), OISE/UT, Canada
Simon Schwartzman (Brazil), American Institutes for Research-Brazil (AIRBrasil), firstname.lastname@example.org@sman.com.br
Jurjo Torres Santomé (Spain), Universidad de A Coruña, jurjo@udc.es
Carlos Alberto Torres (U.S.A.), University of California, Los Angeles, torres@gseisucla.edu

EPAA is published by the Education Policy Studies Laboratory, Arizona State University