
Educational policy analysis archives


Material Information

Educational policy analysis archives
Physical Description:
Arizona State University
University of South Florida
Arizona State University
University of South Florida.
Place of Publication:
Tempe, Ariz
Tampa, Fla
Publication Date:


Subjects / Keywords:
Education -- Research -- Periodicals   ( lcsh )
non-fiction   ( marcgt )
serial   ( sobekcm )

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
usfldc doi - E11-00455
usfldc handle - e11.455
System ID:



EDUCATION POLICY ANALYSIS ARCHIVES
A peer-reviewed scholarly journal
Editor: Sherman Dorn, College of Education, University of South Florida

Copyright is retained by the first or sole author, who grants right of first publication to the Education Policy Analysis Archives. EPAA is published jointly by the Colleges of Education at Arizona State University and the University of South Florida. Articles are indexed by H.W. Wilson & Co.

Volume 13 Number 33, June 28, 2005, ISSN 1068-2341

Conflicting Demands of No Child Left Behind and State Systems: Mixed Messages about School Performance¹

Robert L. Linn
University of Colorado at Boulder

Citation: Linn, R. L. (2005, June 28). Conflicting demands of No Child Left Behind and state systems: Mixed messages about school performance. Education Policy Analysis Archives, 13(33). Retrieved [date] from http://

Abstract

An ever-increasing reliance on student performance on tests holds schools and educators accountable both to state accountability systems and to the accountability requirements of the No Child Left Behind (NCLB) Act of 2001. While each state has constructed its own definition of Adequate Yearly Progress (AYP) requirements within the confines of NCLB, substantial differences between the accountability requirements of many state systems and NCLB have still resulted in mixed messages regarding the performance of schools. Several features of NCLB accountability and state accountability systems contribute to the identification of a school as meeting goals according to NCLB but failing to do so according to the state accountability system, or vice versa. These include the multiple hurdles of NCLB, the comparison of performance against a fixed target rather than changes in achievement, and the definition of performance goals. The result of these features is a set of AYP measures that is inconsistent both with existing state accountability systems and with state NAEP performance. Using existing achievement to set the cut-score measured by AYP and using the highest-performing schools to set the year-to-year improvement standards would improve the NCLB accountability system.

Keywords: accountability; Colorado; Florida; Kentucky; No Child Left Behind.

¹ I thank the editor, Sherman Dorn, and three unknown reviewers for helpful comments on an earlier draft of this paper. The work reported herein was partially supported under the Educational Research and Development Centers Program, PR/Award Number R305B96002, as administered by the Institute of Education Sciences, U.S. Department of Education. The findings and opinions expressed do not reflect the positions or policies of the National Center for Education Research, the Institute of Education Sciences or the U.S. Department of Education.

Accepted under the editorship of Sherman Dorn. Send commentary for EPAA about this article to Casey Cobb.

Test-based educational accountability has expanded greatly during the past decade. Most states were already using test results to hold schools accountable before President Bush signed the No Child Left Behind (NCLB) Act into law in January 2002. NCLB not only further increased the already relatively strong emphasis in a number of states on the use of student test results as a means of holding schools accountable, but also superimposed a new set of accountability rules that often give signals that conflict with those provided by the state accountability systems.

Mixed Messages

NCLB requires states to test students in grades 3 through 8 in mathematics and English/language arts starting no later than the 2005 school year. NCLB requires each state to have adopted challenging academic content standards and challenging student academic achievement standards (P. L. 107-110, Section 1111(b)(1)(A)). States must also establish adequate yearly progress (AYP) goals for each year from 2002 to 2014 that culminate in the 2014 goal where all students are at or above the proficient student academic achievement standard. As discussed below, however, states still control many important system characteristics in complying with NCLB, such as the specification of content standards, the choice of assessments and the setting of academic achievement standards.
For states with functioning assessment and accountability systems of their own, NCLB accountability has frequently been layered on as a separate system. Kentucky, for example, had a comprehensive school accountability system in place before the enactment of NCLB. The Kentucky accountability system uses tests in seven content areas (reading, writing, mathematics, science, social studies, arts and humanities, and practical living/vocational studies). The tests are administered at selected grades so that the overall testing burden is distributed across grade levels. Composite index scores are used for school accountability. The index scores are derived across content areas and include some non-test measures (e.g., attendance or graduation rate). Biennial accountability targets for the composite index scores are set for schools relative to the school's starting position, defined by the school's accountability index score in the 1999 biennium. Schools that started low have to gain more in their index score than schools that started out relatively high, but all are supposed to reach an index value of 100 by the 2013 biennium (Kentucky Department of Education, 2004). In the computation of the index value, students who score at the highest level (called distinguished) on a test contribute 140 points, students at the proficient level contribute 100 points, and students in various categories below proficient contribute less than 100 points; how much less depends on how far below the proficient level the student's score is.

NCLB imposes a quite different set of accountability requirements on Kentucky schools. Mathematics and reading must be reported separately, and schools must meet annual, rather than biennial, measurable objectives in each subject (not just on a composite score). No extra credit is allowed for students scoring at the distinguished level; they are simply lumped with proficient students in the proficient or above category. School performance is compared to an absolute target, which is the same regardless of where the school started. Schools must meet AYP requirements in both reading and mathematics, not only for the student body as a whole, but also for each of several subgroups (assuming there are enough students in a subgroup to be counted for NCLB accountability purposes): major racial and ethnic groups, English language proficiency status, student disability status, and economic status.

Given the differences between Kentucky's own school accountability system and the NCLB system, it is hardly surprising that the two systems are giving mixed messages. Excluding schools with special designations, 730 of 986 Kentucky schools (74.0%) made AYP in 2004 (Ford & Thacker, 2005). According to Kentucky's own accountability system, however, 943 of the 986 schools (95.6%) met their Kentucky biennium goals in 2003. Thus, the best possible agreement between the two systems would occur if all of the 730 schools that made AYP also met the Kentucky biennium goals and all of the 43 schools that fell short of the Kentucky biennium goals also failed to make AYP, for a combined total agreement of 78.4% (773 of the 986 schools). Even in this best-case scenario, just over 20% of the schools would receive mixed messages by meeting the goal according to one accountability system but failing to do so according to the other.

Table 1
State and NCLB Classifications of Kentucky Schools, 2004

                          Met AYP Targets (a)
Met the State Goal     Yes            No             Total
Yes                    713 (72.3)     230 (23.3)     943 (95.6)
No                      17 (1.7)       26 (2.6)       43 (4.4)
Total                  730 (74.0)     256 (26.0)     986 (100)

(a) Percentages of all 986 schools in parentheses. Based on Ford and Thacker (2005).
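The agreement arithmetic for Kentucky can be checked with a short tally. The following is a minimal sketch, not part of the original analysis: the cell counts are those reported in Table 1, while the dictionary layout and the function name `agreement_summary` are illustrative assumptions. The best-case figure is computed from the marginal totals alone, which mirrors the reasoning in the text.

```python
# Cell counts copied from Table 1 (Ford & Thacker, 2005).
# Keys are (met_state_goal, met_ayp); this layout is an illustrative assumption.
TABLE_1 = {
    (True, True): 713,
    (True, False): 230,
    (False, True): 17,
    (False, False): 26,
}


def agreement_summary(counts):
    """Tally observed and best-possible agreement between the two systems."""
    total = sum(counts.values())
    met_state = counts[(True, True)] + counts[(True, False)]
    met_ayp = counts[(True, True)] + counts[(False, True)]

    # Observed agreement: schools on the diagonal of the 2 x 2 table.
    agree = counts[(True, True)] + counts[(False, False)]

    # Best case given only the marginal totals: as many schools as the
    # margins allow are placed on the diagonal.
    best_case = min(met_state, met_ayp) + min(total - met_state, total - met_ayp)

    return {
        "total": total,
        "agree": agree,
        "mixed": total - agree,
        "best_case_agree": best_case,
    }


summary = agreement_summary(TABLE_1)
for name, value in summary.items():
    share = 100 * value / summary["total"]
    print(f"{name}: {value} ({share:.1f}%)")
```

With the Table 1 counts, the tally gives 247 schools with mixed messages and a best-case agreement of 773 schools.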
Table 1 displays the cross-classification of the 986 schools in terms of meeting or not meeting AYP in 2004 and meeting or not meeting Kentucky's accountability goals. As can be seen, not all the schools that made AYP also met the Kentucky goals: 17 schools made AYP but missed the state goal, while 230 met the state goal but missed AYP. Hence, a quarter of the schools (247 of 986) received mixed messages that they met expectations according to one accountability system but failed to meet them according to the other.

The mixed messages of NCLB and the individual states' own accountability systems are not unique to Kentucky. Florida's state accountability system has assigned letter grades of A, B, C, D, or F to schools based on the performance of students on the state assessment. The distributions of Florida school grades over the seven years between 1999 and 2005 are displayed in Figure 1. These distributions have painted quite a favorable picture of school performance. The percentage of schools receiving grades of A increased from 8.3% in 1999 to 47.6% in 2004. The percentage of schools receiving either an A or a B also increased sharply (from 21.3% in 1999 to 68.0% in 2004), while the percentage of schools receiving Ds or Fs declined from over a quarter of the schools in 1999 (27.9%) to less than a tenth of the schools in 2004 (8.8%).


Figure 1. Distribution of Florida School Grades by Year (Source: http://schoolgrades.fld)
[Bar chart: percent of schools (0 to 60) receiving each grade (A, B, C, D, F), by year, 1999 through 2005.]

The NCLB accountability results in Florida in two recent years have provided a sharp contrast to the positive results from Florida's own accountability system. Florida had the dubious distinction in 2003 of leading the nation as the state with the largest percentage of schools (82%) that failed to make AYP. Although there was some decline in the percentage of Florida schools that did not make AYP in 2004 (from 82 to 77%), only Alabama had an equally high percentage of schools failing to make the AYP target in 2004 (Olson, 2004, p. S6). Other southern states had more modest percentages of schools that failed to make AYP in 2004: Georgia, 20%; Louisiana, 8%; Mississippi, 24%; North Carolina, 29%; South Carolina, 44%; Tennessee, 14%; and Virginia, 25% (Olson, 2004, p. S6). As will be discussed in greater detail below, the variation by state makes little sense in comparison to other information about student performance by state, such as that provided by the National Assessment of Educational Progress (NAEP), but it is nonetheless clear that the state's own accountability system and NCLB are giving quite a mixed picture in Florida. Fifty-six percent of the 1,262 schools in Florida that received an A in 2004 failed to make AYP.

The mixed messages provided in Kentucky and Florida are repeated in varying degrees in a number of other states. Colorado, for example, has an academic performance rating system that assigns schools to one of five graded performance categories called Unsatisfactory, Low, Average, High, and Excellent. Figure 2 displays the percentage of schools that made AYP in 2003 by school type and academic performance rating.
As can be seen in Figure 2, there is a clear relationship between the Colorado academic performance rating of a school and the likelihood that the school will meet AYP. The relationship is far from perfect, however. Consequently, a substantial number of schools are receiving mixed messages. For example, 21.9% of the schools rated Unsatisfactory and 47.5% of the schools rated Low made AYP, while 13.7% of the schools rated High failed to make AYP.


Figure 2. Percentage of Colorado Schools Making AYP in 2003 by School Type and Academic Performance Rating
[Bar chart: percent of schools (0 to 100) making AYP at each academic performance rating (Unsatisfactory, Low, Average, High, Excellent), shown for elementary, middle, and high schools and for the total.]

Although the summary statistics provide clear evidence that mixed messages are being given by NCLB and state accountability systems, it is, as was clearly illustrated by Dillon (2004), the failure of prestigious suburban high schools to meet AYP requirements that seems to have caused the most consternation. Dillon quotes people such as Representative Judy Biggert of Illinois, who helped write the law, and former North Carolina Governor James Hunt, who has praised the law, among others who were dismayed when they learned that particular high schools that they knew to be excellent were identified for failing to make AYP. Although, as is explained below, a variety of aspects of the NCLB identification system make it likely that excellent schools will be found wanting by failing to make AYP, the result is nonetheless confusing to parents and the general public. The confusion is summed up well in the title of Dillon's article, "Good schools or bad? Conflicting ratings leave parents baffled" (2004).

Why Mixed Messages?

There are several features of the NCLB accountability requirements that make it likely that the results will conflict with the accountability systems of individual states. Some of these features also contribute to the wide state-to-state disparities in the proportion of schools that meet AYP. Three of these features were mentioned in passing in the discussion of the Kentucky results: the use of absolute targets rather than improvement targets relative to a school's starting level, the need to meet targets in both reading and mathematics rather than on a composite, and the requirement of meeting targets for separate subgroups within a school. These and other features are elaborated in this section.


Multiple Hurdles

Unlike many state accountability systems, NCLB requires schools to clear several hurdles. The most obvious of these is that student achievement must exceed the annual measurable objective (AMO) in both mathematics and reading/English language arts. Performance on, say, reading achievement that far exceeds the AMO cannot compensate for mathematics achievement that just misses the AMO. In addition to meeting the separate AMOs for mathematics and reading/language arts, schools must have at least 95% of their eligible students participate in the assessments in each subject. The school must also meet the goal established for the other academic indicator required by NCLB, usually attendance rate for elementary and middle schools and graduation rate for high schools. Thus, there is a minimum of 5 hurdles for a school with a homogeneous student body and insufficient numbers in any subgroup to be held accountable for disaggregated results.

The number of hurdles for meeting AYP expands rapidly for large schools with diverse student bodies because of the disaggregation requirements of NCLB. Consider a school with more than the minimum number of students (designated by the state and approved by the U.S. Department of Education for purposes of AYP) in each of, say, 4 racial/ethnic groups, students with limited English proficiency, economically disadvantaged students, and students with disabilities. Such a school would have not 5, but 33, hurdles to clear: the 5 when all students in the school are considered as a whole, plus 16 for the 4 hurdles for each of 4 racial/ethnic groups, plus 4 for students with limited English proficiency, plus 4 for the economically disadvantaged students, plus 4 for the students with disabilities (see Table 2). Thus, schools can meet AYP requirements in only one way, by clearing every hurdle, but can fall short in many different ways.
Given the larger number of hurdles to be cleared by more diverse schools, it is not surprising that Novak and Fuller (2003) found that schools serving more diverse student bodies were less likely to meet AYP requirements than schools serving less diverse student bodies. "[E]ven when students display almost identical average test scores schools with more subgroups are more likely to miss their growth targets under federal rules set by the No Child Left Behind Act" (Novak & Fuller, 2003, p. 1). These results are not unique to California. In Massachusetts, for example, about 48% of the schools failed to make AYP in 2002, but only 9% of the 373 schools with only one subgroup of sufficient size to be used in determining AYP failed to meet the requirement, whereas 87% of the 106 schools with 6 or 7 subgroups included in the determination of AYP failed to meet the requirement (Nelson & Rosenberg, 2004a). These results are to be expected because even a school with high average achievement may be tripped up on one of the multiple hurdles, for example by missing the participation rate for a particular subgroup or because students with disabilities perform below the proficient cutoff in either reading or mathematics. "In Westport, Conn., [for example] the Bedford Middle School, where test scores are often among Connecticut's highest, was called low performing because the school failed to meet the 95 percent standard for testing for the disabled by one student" (Dillon, 2004).
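The hurdle count described in this section can be enumerated directly. The sketch below is illustrative only: the function name `ayp_hurdles` and the tuple layout are assumptions, and the other academic indicator is counted once for the school as a whole, following the 5 + 4 per subgroup arithmetic given in the text.

```python
from itertools import product

SUBJECTS = ("reading/English language arts", "mathematics")
MEASURES = ("participation rate", "percent proficient or above")


def ayp_hurdles(subgroups):
    """List every (group, subject, measure) hurdle a school must clear."""
    groups = ["all students", *subgroups]
    # Each group faces 2 subjects x 2 measures = 4 hurdles.
    hurdles = list(product(groups, SUBJECTS, MEASURES))
    # The other academic indicator is counted once, for the whole school.
    hurdles.append(("all students", "other academic indicator", "goal met"))
    return hurdles


homogeneous = ayp_hurdles([])
diverse = ayp_hurdles([
    "racial/ethnic group 1",
    "racial/ethnic group 2",
    "racial/ethnic group 3",
    "racial/ethnic group 4",
    "limited English proficiency",
    "economically disadvantaged",
    "students with disabilities",
])
print(len(homogeneous), len(diverse))  # prints: 5 33
```

Because a school makes AYP only if every entry in the list is cleared, each added subgroup adds four more ways to fall short.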


Table 2
33 AYP Hurdles for a Large School with a Diverse Student Population

Columns: Reading/English Language Arts (Participation Rate; Percent Proficient or Above); Mathematics (Participation Rate; Percent Proficient or Above); Other Academic Indicator.
Rows (Demographic Group): All Students; Racial/Ethnic Group 1; Racial/Ethnic Group 2; Racial/Ethnic Group 3; Racial/Ethnic Group 4; Economically Disadvantaged; Students with Limited English Proficiency; Students with Disabilities.

Note. Table modeled after Marion, White, Carlson, Erpenbach, Rabinowitz, & Sheinker (2002).

Status vs. Growth Targets

State accountability systems frequently establish performance targets based on growth, thereby taking into account previous performance as well as current status. NCLB requirements, on the other hand, with the exception of the safe harbor provision discussed below, focus only on current status in comparison to the performance target. California's accountability system, like the Kentucky system described above, provides a good example of a carefully developed system that focuses on growth in achievement. Growth is calculated not by tracking individual students and computing indices based on longitudinal data, but by comparing successive cohorts of students (e.g., the achievement of fourth grade students in a school in 2004 compared to the achievement of fourth graders in that school in 2005).

California's system uses an academic performance index (API) that is a weighted combination of performance on tests of English language arts (including writing) and mathematics for grades 2 through 8. For grades 9 through 11, history-social science and science are included along with English language arts and mathematics in the weighted API. The API is scaled to have scores that range from 200 to 1000. An API score of 800 has been selected by the State Board of Education as the target toward which all schools should aspire (California Department of Education, 2004, p. 29). Schools are not sanctioned for falling short of the absolute target of 800, however. Instead, a school is held accountable for meeting its annual API growth target, which is defined as five percent of the distance between the school's API and the statewide performance target, or a minimum of one point (California Department of Education, 2004, p. 30). For example, a school with an API of 700 in 2003 would have an API growth target of 5 points (5% of the 100-point gap between 700 and 800) for 2004, while schools with APIs of 650 and 750 would have growth targets of 7.5 and 2.5, respectively. California's focus on growth rather than status obviously stands in sharp contrast to the federal AYP requirements.

In order for a school to meet all API growth requirements, students in the school who are members of a numerically significant subgroup defined by ethnicity or socioeconomic disadvantage must achieve at least 80 percent of the school-wide annual growth target (California Department of Education, 2004, p. 30). Thus, while the California accountability system includes some attention to subgroup performance within a school, it does so in a way that differs from NCLB in at least two significant ways. First, fewer subgroups are considered. Second, the subgroup target for improvement of performance is somewhat lower than the school-wide target, which, unlike NCLB, implicitly makes some allowance for the less reliable gains in achievement for subgroups, which necessarily have fewer students than are available for calculating the school-wide API changes.

The one provision of the NCLB accountability system that considers improvement from one year to the next, rather than only annual performance in comparison to an AYP target, is the so-called safe harbor provision. If a subgroup of students in a school falls short of the AYP target, the school can still meet AYP if (1) the percentage of students who score below the proficient level is decreased by 10% from the year before, and (2) there is improvement for that subgroup on other indicators. Although the safe harbor provision is intended to allow schools that fall short of the AYP goal to still make AYP if they show substantial improvement, very few schools that would not otherwise make AYP do so because of it. The 10% decrease in students scoring below proficient sets a very high bar in comparison to what is achieved even by schools where students show considerable improvement from one year to the next. In Pennsylvania, for example, only about 1% of the roughly 780 schools that had one or more subgroups miss the AYP target made AYP because of the safe harbor provision in 2002 (Nelson & Rosenberg, 2004b).
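The two safe harbor conditions can be written as a small check. This is a hedged sketch rather than the statutory calculation: the function name `meets_safe_harbor` and its parameters are hypothetical, and the 10% decrease is assumed here to be a relative reduction in the percentage of students scoring below proficient (for example, from 60% to 54% or lower).

```python
def meets_safe_harbor(below_prev, below_now, other_indicator_improved,
                      required_reduction=0.10):
    """Check the two safe harbor conditions for one subgroup.

    below_prev / below_now: percent of the subgroup scoring below proficient
    in the previous and the current year.
    required_reduction: ASSUMED to be a relative reduction (0.10 = 10%).
    """
    if below_prev <= 0:
        raise ValueError("no students were below proficient in the prior year")
    relative_drop = (below_prev - below_now) / below_prev
    return relative_drop >= required_reduction and other_indicator_improved


# A subgroup moving from 60% to 57% below proficient (a 5% relative drop)
# does not qualify; one moving from 60% to 53% does, provided the other
# indicator also improved.
print(meets_safe_harbor(60, 57, True))
print(meets_safe_harbor(60, 53, True))
print(meets_safe_harbor(60, 53, False))
```

Keeping the threshold as a parameter makes it easy to see how many subgroups would qualify under a different reduction criterion.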
The percentage of schools that were saved by safe harbor was somewhat greater in Massachusetts than in Pennsylvania. Forty-four (5%) of the 884 schools that made AYP in Massachusetts did so because of the safe harbor provision. The 44 schools that were saved by the safe harbor provision tested fewer students on average than the typical Massachusetts school, and smaller schools tend to have less stable results from one year to the next. Those schools also had fewer subgroups large enough to be considered in the determination of AYP status (Nelson & Rosenberg, 2004a).

If a provision is desired to allow schools to meet AYP by showing decreases in the percentage of students scoring below the proficient level, then consideration should be given to alternative criteria, such as an above-average decrease in the percentage of students scoring below the proficient level from one year to the next. This would likely lead to a criterion closer to a 3% reduction in the below proficient category from one year to the next rather than the current 10% criterion. Changing the safe harbor provision from a 10% reduction in below proficient to a 3% reduction would go a long way toward solving the problems caused by the multiple hurdles created by subgroup reporting, while still maintaining a focus on the improvement in performance of all subgroups.

In addition to considering growth as measured by comparison of successive cohorts of students, it would also be desirable to allow the use of estimates of growth that rely on longitudinal data with matched student records across years. Tennessee has received considerable attention for its value-added approach based on longitudinal data (see, for example, Sanders and Horn, 1998), and a number of states are interested in tracking individual students across years and using longitudinal results in their accountability systems.
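The California growth-target arithmetic described earlier in this section can be reproduced with a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, the 5% rate, the 800-point target, the one-point minimum, and the 80% subgroup share come from the text, and schools already at or above the statewide target are deliberately left out because the text does not describe their targets.

```python
def api_growth_target(api, statewide_target=800, rate_percent=5, minimum=1):
    """Annual school-wide API growth target for a school below the target."""
    gap = statewide_target - api
    if gap <= 0:
        # Not covered by the description in the text; flagged, not guessed.
        raise ValueError("school is at or above the statewide target")
    return max(gap * rate_percent / 100, minimum)


def subgroup_growth_target(schoolwide_target, share_percent=80):
    """Growth required of a numerically significant subgroup."""
    return schoolwide_target * share_percent / 100


for api in (650, 700, 750):
    target = api_growth_target(api)
    print(api, target, subgroup_growth_target(target))
```

For the three schools in the text's example, the sketch returns school-wide targets of 7.5, 5.0, and 2.5 points, so the target shrinks as a school approaches 800.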
Other Requirements for NCLB Achievement Goals

There are several important differences between the way in which school achievement goals are set for purposes of NCLB and the ways in which they are typically set in state accountability systems. First, as was discussed in the previous section, NCLB judges status in comparison to a target, whereas the typical state accountability system uses improvement targets. Second, NCLB uses an absolute level of performance that is constant regardless of a school's initial status, whereas states usually set targets at levels that depend on the school's performance in a baseline year or biennium. Third, NCLB and the typical state system differ in the establishment of the long-range goals and the timeline for reaching those goals.

Establishment of Proficiency Targets for 2014

As was previously noted, NCLB requires states to set challenging academic performance standards. There must be at least three performance standards for each assessment. NCLB provides only general guidelines to states for defining the academic achievement standards, specifying only that the state establish "(ii) challenging academic achievement standards that (I) are aligned with the State's academic content standards; (II) describe two levels of high achievement (proficient and advanced) that determine how well children are mastering the material in the State academic content standards; and (III) describe a third level of achievement (basic) to provide complete information about the progress of the lower-achieving children toward mastering the proficient and advanced levels of achievement" (NCLB, P. L. 107-110, Sec. 1111(b)(1)(D)). All students must be at the proficient level or above by 2014 for schools and districts to avoid sanctions, regardless of how leniently or stringently the state defines the proficient standard. State starting points for percent proficient or above were supposed to have been established using assessment results from the 2001 academic year.
The state's starting point is equal to the higher of the following two values: (1) the percentage of students in the lowest scoring subgroup who achieve at the proficient level or above and (2) the percentage of students at the proficient level in the school at the 20th percentile in the State, based on enrollment, among all schools ranked by the percentage of students at the proficient level (NCLB, P. L. 107-110, Sec. 1111(b)(2)(E)(ii)).

Establishment of Intermediate (AYP) Targets

States must also establish intermediate goals (the AMOs described above) for AYP between the 2001 starting point and the 100% proficient goal in 2014. The first increase in the goal from the starting point must occur by 2004, and subsequent increases must occur not more than three years after the last increase.

State-to-State Variability in AYP Plans and Judgments

Variability in Trajectory of Intermediate (AYP) Targets

Although some states, e.g., Florida, have set their intermediate goals for AYP by using equal increments each year to move from the starting point in 2002 to 100% in 2014, a number of states have elected to use a stair-step approach to setting their intermediate goals for AYP, with increases in 2005, 2008, 2011, and 2014 and static levels for intermediate years, as illustrated in Figure 3 by Colorado and North Carolina. Other states opted for stair steps in 2005, 2008 and 2011, but then annual increments through 2014, as illustrated in Figure 3 by Arizona and Louisiana. Porter, Linn, and Trimble (2005) referred to the approach illustrated by Arizona and Louisiana as a back-loaded trajectory and called the approach illustrated by Colorado and North Carolina a linear with plateaus trajectory. Porter et al. obtained information about state trajectories for 47 states. Nineteen state plans use a linear with plateaus trajectory, 24 state plans use a back-loaded trajectory, and only 4 states use a straight-line trajectory.
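The difference between a straight-line trajectory and a linear with plateaus trajectory can be illustrated with a short calculation. This sketch rests on assumptions that are not in the text: the function names are hypothetical, and each stair step is assumed to be of equal size, which places the step years on the straight line. The 60.7% starting point is Colorado's reported value, used only as an example input and not as a reconstruction of Colorado's actual annual targets.

```python
def linear_targets(start_pct, start_year=2002, end_year=2014):
    """Equal annual increments from the starting point to 100% proficient."""
    span = end_year - start_year
    return {
        year: start_pct + (100 - start_pct) * (year - start_year) / span
        for year in range(start_year, end_year + 1)
    }


def plateau_targets(start_pct, step_years=(2005, 2008, 2011, 2014),
                    start_year=2002):
    """Stair steps in the given years, flat in between.

    ASSUMPTION: every step closes an equal share of the gap to 100%.
    """
    targets = {}
    steps_taken = 0
    for year in range(start_year, step_years[-1] + 1):
        if year in step_years:
            steps_taken += 1
        targets[year] = start_pct + (100 - start_pct) * steps_taken / len(step_years)
    return targets


straight = linear_targets(60.7)
stepped = plateau_targets(60.7)
for year in sorted(straight):
    print(year, round(straight[year], 1), round(stepped[year], 1))
```

Under these assumptions the two trajectories coincide in the step years and at 100% in 2014, while the stepped targets sit below the straight line in every plateau year.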


Variability in Proficiency Targets

A comparison of the graphs of the AYP targets for the four states displayed in Figure 3 shows that the four states have quite different starting points. North Carolina has a starting point of 74.6% proficient or above, which is slightly more than 10 times as high as Arizona's starting point of 7% proficient. Colorado's starting point of 60.7% is twice as high as Louisiana's starting point of 30.1%. Yet the 2014 AYP target for all students in these four states, as well as in all the other states, is 100% proficient or above. That sort of improvement in student achievement is completely unrealistic (see, for example, Linn, 2003, 2004, and McCombs, Kirby, Barney, Darilek & Magee, 2004, for discussions of the unrealistic nature of the 100% proficient goal by 2014).

Figure 3. Intermediate (AYP) Goals for Four Illustrative States
[Line chart: percent proficient (0 to 100) by year (2002 through 2014) for Arizona, Colorado, Louisiana, and North Carolina.]

The large variation in starting points is a result of the large between-state variability in the stringency of state performance standards. Because of concerns about state control of education and avoidance at the federal level of anything that hints of an attempt to impose a national curriculum, it is not surprising that the definition of academic content standards, the choice of assessments that are used to measure those standards, and the academic achievement (performance) standards are left for the state to determine. The result is that the state performance standards and assessments are not comparable. That would not necessarily be a problem were it not for the requirement that all students reach the proficient level or above by 2014. In addition to being unrealistic, the 100% proficient goal is radically different from one state to the next.
The differences among starting levels for the four states shown in Figure 3 are much larger than the differences in the actual performance of students in eighth grade mathematics. The percentages of public school students who were at or above the proficient level on the 2003 NAEP grade 8 mathematics assessment were as follows for the four states displayed in Figure 3: Arizona, 21%; Colorado, 34%; Louisiana, 17%; and North Carolina, 32% (National Center for Education Statistics, 2004). Although student performance on the 2003 grade 8 NAEP mathematics assessment does differ for these four states, the range from high to low in percent at the proficient level or above is 17 percentage points, which is small in comparison to the range in AYP starting points of 67.6 percentage points. Moreover, the rank order of the four states in terms of AYP starting values does not match the rank order in terms of actual performance on NAEP in 2003.

There obviously are differences between the state eighth grade mathematics assessments and NAEP. States have different academic content standards that overlap imperfectly with the NAEP mathematics framework, and the conditions under which the assessments are administered differ. But Congress apparently expected there to be some reasonable relationship between NAEP and state assessments, since NCLB requires states to participate every other year in the state-by-state NAEP assessments in reading and mathematics at grades 4 and 8 beginning in 2003. Comparisons of state assessment results to NAEP results provide an indication of how well state results generalize. Furthermore, the work of McLaughlin and Bandeira de Mello (2002) indicates that there is a fairly substantial relationship between most state assessments and NAEP. Thus, it seems reasonable to conclude that the differences between states in terms of their percent proficient or above on NAEP and their NCLB starting values have more to do with differences in the stringency of their performance standards than with differences between the state assessments and NAEP.

It should also be noted that the steep annual increases that Arizona and Louisiana chose to use to set AYP targets for the last four years (2011 through 2014) are just the opposite of what might be expected.
Past experience with test-based accountability systems has shown that larger gains are usually made in the first few years following implementation and that gains generally become smaller in later years. Moreover, common sense suggests that it is likely to be much harder to realize a gain of 5 percentage points in moving from 95% to 100% than in moving from 30% to 35% proficient or above.

Variability in Percentage of Schools Meeting AYP Goals

Figure 3 illustrated the fact that states differ substantially in their starting points, as well as in their intermediate AYP goals. There is also considerable state-to-state variability in the percentage of schools that met AYP in each of the first two years (2003 and 2004) of AYP reporting. Olson (2004) reported the percentage of schools making AYP for 41 states in 2003 and 44 states in 2004. The percentage of schools that met AYP goals in 2004 in a 45th state, Illinois, was obtained from the Illinois State Board of Education web site. In 2003 the percentage of schools that met AYP goals ranged from a low of 18% to a high of 95%, with an average of 65.6% for the 41 states listed by Olson (2004). For the 45 states with 2004 results, the range was from 23% to 96%, with an average of 74.2%.

Olson (2004) reported the percentage of schools meeting AYP goals for both 2003 and 2004 for only 36 of the states. The percentage of schools meeting AYP goals was higher in 2004 than in 2003 in 33 of the 36 states, and the three exceptions, Indiana, Louisiana, and Michigan, had decreases from 2003 to 2004 of only 1 or 2 percentage points. The average 2003 to 2004 increase in the percentage of schools making AYP goals was 10.2 percentage points, and eight states (Connecticut, Delaware, Massachusetts, Missouri, North Carolina, Pennsylvania, South Carolina, and Tennessee) had increases that ranged from 20 to 30 percentage points. These appear to be remarkable improvements in a single year.
Although the increases probably reflect some real improvement in student performance, they are largely due to artifacts such as schools getting better at meeting the requirement that at least 95% of the eligible students in each subgroup participate in the assessments. An even larger part of the apparent improvement can more reasonably be attributed to changes in AYP calculations that states requested and the U.S. Department of Education approved between the 2003 and 2004 reports.
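
Two pieces of arithmetic from the preceding discussion can be made explicit. The sketch below is illustrative only: it uses the AYP starting points and 2003 NAEP grade 8 mathematics percentages quoted above for the four states, and the function names (spread, rank_order, share_of_remaining) are hypothetical labels introduced here, not terms used in NCLB or by NAEP. The last function expresses a gain of 5 percentage points as a share of the students who are still below the proficient level, which is one way of seeing why the move from 95% to 100% is harder than the move from 30% to 35%.

```python
# Percent proficient or above, as quoted in the text for the four states.
ayp_start = {"Arizona": 7.0, "Colorado": 60.7,
             "Louisiana": 30.1, "North Carolina": 74.6}
naep_2003 = {"Arizona": 21, "Colorado": 34,
             "Louisiana": 17, "North Carolina": 32}


def spread(values):
    """Range from highest to lowest value, in percentage points."""
    return max(values.values()) - min(values.values())


def rank_order(values):
    """State names sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)


def share_of_remaining(start_pct, end_pct):
    """Fraction of students below proficient at start_pct who must reach
    the proficient level for the school to arrive at end_pct."""
    return (end_pct - start_pct) / (100.0 - start_pct)


print(round(spread(ayp_start), 1))   # range of AYP starting points
print(spread(naep_2003))             # range of NAEP percent proficient
print(rank_order(ayp_start))         # ordering by AYP starting point
print(rank_order(naep_2003))         # ordering by NAEP performance
print(share_of_remaining(30, 35))    # share needed to go from 30% to 35%
print(share_of_remaining(95, 100))   # share needed to go from 95% to 100%
```

Under this reading, moving from 30% to 35% requires reaching roughly 1 in 14 of the students who are not yet proficient, whereas moving from 95% to 100% requires reaching every one of them.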


The Center for Education Policy (CEP) posted a report on its web site dated October 22, 2004 summarizing changes in state implementation of NCLB accountability rules (CEP, 2004). According to the CEP report, 47 states requested approval for changes in their NCLB accountability plans. The U.S. Department of Education posted letters to 35 of those states in time to be reviewed for the CEP report, approving many, though not all, of the changes (CEP, 2004, p. 1). Many of the approved changes make it easier for schools to meet AYP goals. For example, 11 states changed the minimum group size for disaggregated reporting and 12 states introduced the use of confidence intervals, which give the benefit of the doubt to schools in cases where the percentage of students who are proficient or above is somewhat below the target value required for meeting the AYP goal. The number of schools that meet AYP requirements increases when the minimum group size is increased and/or the use of confidence intervals is introduced (Porter, Linn & Trimble, 2005). Thus, states that increased the minimum group size in 2004 or introduced confidence intervals for the first time helped schools meet AYP goals in 2004 that they would not have met without these changes in NCLB accountability plans. It is worth noting in this regard that 4 of the states (Missouri, North Carolina, Pennsylvania, and South Carolina) that were among the 8 states showing the largest increases in percentage of schools meeting AYP goals from 2003 to 2004 were also among the 12 states that started using confidence intervals in 2004.

Meeting AYP Goals and Performance on NAEP

The substantial state-to-state variability in the percentage of schools meeting AYP makes it evident that the likelihood that a school will fail to meet AYP goals depends not only on the performance of students in the school, but also, at least in part, on the state in which the school is located.
Furthermore, as can be seen in Figures 4 and 5, the percentage of schools in a state that meet AYP goals has only a weak relationship to differences among states in student performance on NAEP. Figure 4 shows the relationship between the percentage of schools within the state that met AYP goals in 2003 and the average percent proficient or above across the 2003 NAEP grade 4 reading, grade 8 reading, grade 4 mathematics, and grade 8 mathematics assessments. Figure 5 shows the relationship of the same NAEP average percent proficient or above with the percentage of schools that met AYP goals in 2004.

It is apparent from an inspection of Figures 4 and 5 that there is only a relatively weak relationship between the performance of students in a state on NAEP and the percentage of schools that meet their AYP goals. In 2003, the state with the second lowest average performance on NAEP had more than 90% of its schools make AYP, whereas only half the schools met their AYP goals in the state with the highest performance on NAEP (Figure 4). The relationship between the average NAEP performance in 2003 and the percentage of schools making AYP in 2004 is based on a few more states and is slightly stronger than it was for the 2003 AYP results. Nonetheless, there are some notable outliers shown in Figure 5, where a state with relatively low average performance on NAEP has a noticeably higher percentage of schools making AYP than another state with relatively high average performance on NAEP.
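
The way minimum group sizes and confidence intervals affect AYP determinations, as described above, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from any state plan: the interval is a normal-approximation interval around the observed percent proficient, the group meets the target when the upper bound of that interval reaches the target, and the function name meets_target, its parameters, and the example counts are all hypothetical. States differed in the intervals and procedures they actually adopted.

```python
import math


def meets_target(n_proficient, n_tested, target_pct, z=0.0, min_n=0):
    """Decide whether a group meets the percent-proficient target.

    Returns None when the group is smaller than the minimum group size
    (the group is then not evaluated separately), otherwise True or False.
    z is the normal multiplier for the confidence interval; z=0.0 means
    no confidence interval is applied (an assumption of this sketch).
    """
    if n_tested == 0 or n_tested < min_n:
        return None
    p = n_proficient / n_tested
    # Half-width of a normal-approximation interval for a proportion.
    half_width = z * math.sqrt(p * (1.0 - p) / n_tested)
    return 100.0 * (p + half_width) >= target_pct


# Hypothetical subgroup: 18 of 40 students proficient (45%), target 50%.
print(meets_target(18, 40, 50.0))             # no interval
print(meets_target(18, 40, 50.0, z=1.96))     # with a 95% interval
print(meets_target(18, 40, 50.0, min_n=45))   # minimum group size of 45
```

In this hypothetical case the same subgroup misses the target when no interval is used, meets it when a 95% interval is applied, and is not counted at all when the minimum group size is raised above its enrollment, which is the direction of the effects reported by Porter, Linn and Trimble (2005).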


Figure 4. Relationship of Average of State Percent Proficient on 2003 NAEP and Percentage of Schools Meeting AYP in 2003 (41 States, Correlation = .26)
[Scatterplot. Horizontal axis: 2003 NAEP average percent proficient (15 to 45). Vertical axis: 2003 % schools meeting AYP (0 to 100). States plotted: AK AZ CA CO CT DE FL GA HI ID IL IN IA KS KY LA MD MA MI MS MO MT NE NV NH NC ND OH OK OR PA RI SC TN TX UT VA WA WV WI WY]

Figure 5. Relationship of State Average Percent Proficient on 2003 NAEP and Percentage of Schools Meeting AYP in 2004 (45 States, Correlation = .35)
[Scatterplot. Horizontal axis: 2003 NAEP average percent proficient (15 to 45). Vertical axis: 2004 % schools meeting AYP (0 to 100). States plotted: WY WI WV WA VA VT UT TN SD SC PA OR OH ND NC NY NM NJ NH NV MO MS MN MI MA MD ME LA KY KS IA IN IL ID HI GA FL DE CT CO CA AS AZ AK AL]

Results such as those shown in Figures 3, 4, and 5 clearly illustrate that it is not meaningful to compare states in terms of the performance standards or the rates at which schools in different states meet NCLB's AYP requirements. The performance standards set by states bear little relationship to real between-state variability in student performance. Differences in performance standards set by states, as well as differences in the ways in which states, with the approval of the U.S. Department of Education, comply with NCLB accountability requirements with regard to features such as the minimum number of students needed for determining if subgroups must meet targets and whether or not confidence intervals are used, obscure between-state comparisons of percentages of schools meeting AYP requirements. Although comparison among states is not one of NCLB's purposes, it is clear that the likelihood that a school will be identified as failing to meet AYP targets in a given year, or be placed in the "needs improvement" category, or be subject to more serious sanctions, depends to a substantial degree on the state in which the school is located and not exclusively on the effectiveness of the school.

Conclusion

Test-based accountability has become a pervasive consideration for schools and educators as a consequence of the combination of state accountability requirements and those imposed by NCLB. Because of the substantial differences in state and NCLB requirements, mixed messages that are confusing to the public are being given about school performance. The goals established under NCLB are already unrealistic for many schools that started with low performance in 2002 and will become increasingly so, not only for those schools but for all schools, as the increases in AYP targets occur, especially in 2005 and 2008 when many states will have big jumps in their AYP targets. If the goal for 2014 remains unchanged, essentially all schools will fail to meet the unrealistic goal of 100% proficient or above, and No Child Left Behind will have turned into No School Succeeding. Significant changes in NCLB accountability requirements are needed to avoid labeling all schools as failures.

What are some of the needed changes?
Possibly most important is to make the goal something that is more realistically obtainable. As noted above, NCLB requires states to participate every other year in the NAEP reading and mathematics assessments at grades 4 and 8 starting in 2003. Although the use of state-level NAEP results is not specified in the law, it is reasonable to think of those results as providing some kind of benchmark for state assessments. In 2003, no state or large district had anything close to 100% of its students performing at the basic level, much less the proficient level, at either grade 4 or grade 8 in either reading or mathematics. (It should be noted that the NAEP achievement levels have been the subject of considerable criticism, in part because they are set at levels that are higher than the performance of students in any country; see, for example, Linn, 2003.)

Performance goals mandated by the accountability system should be ambitious, but also should be realistically obtainable with sufficient effort (Linn, 2003, p. 4). At the very least, there needs to be an existence proof. That is, there should be evidence that the goal does not exceed a value that has previously been achieved by the highest performing schools. For example, if the top 10% of schools in a state, in terms of sustained improvements in student achievement, had rates of improvement in the percentage of students achieving at the proficient level or above during the past 5 years that averaged 3 percentage points per year, then adequate yearly progress might be defined as an increase of 3 percentage points each year in the percentage of students achieving at the proficient level or above. That would be a great challenge to the vast majority of schools, but might be a target that is within reach with sufficient effort.
Saying that all students must be at the proficient level or above by 2014, but leaving the definition of proficient achievement to the states, has resulted in so much state-to-state variability in the level of achievement required to meet the proficient standard that "proficient" has become a meaningless designation. Certainly, reporting results in terms of percent proficient or above on state assessments lacks comparability from state to state.


If the percentage of students who are above a cut score on a state assessment is to be used, the cut score should be more meaningful than the state-established proficient levels, which lack any semblance of a common meaning across states. There are several approaches that would be preferable to reporting results in terms of percent proficient or above. One simple approach would be to define the standard or cut score on a state assessment to be equal to the median score in a base year, presumably 2002 (Linn, 2004). The percentage of students scoring above that constant cut score would then be used to monitor improvement in achievement, with target increases set at reasonable levels, e.g., 3 percentage points per year. With a target increase of 3 percentage points a year, the proportion of students scoring above the 2002 median would need to increase from 50% in 2002 to 86% in 2014. That would represent a gigantic improvement in the achievement of the nation's students, but might not be totally unrealistic, and surely is not as poorly defined as 100% proficient or above, given the huge state-to-state variability in the meaning of proficient.

Another alternative would be to use what Popham (2004) has called grade-level descriptors. At-grade-level might correspond more closely to the basic than to the proficient level in most states. Using past experience, targets could be set that would bring the achievement of an ever-increasing percentage of students up to the at-grade-level standard.

The NCLB insistence on a common target for all schools, regardless of where they started, is appealing in the sense that it sets the same high expectations for all, but is nonetheless counterproductive when it leaves schools with initially low performing students with no realistic hope of making the absolute target.
Schools demonstrating substantial improvement should not be labeled as failing to make adequate progress, and for the reasons discussed above, NCLB's safe harbor provision turns out to be no real help to most schools in this regard because of the high hurdle that it establishes.

Holding schools accountable for the performance of students in subgroups that have too often been ignored in the past (e.g., racial/ethnic minorities, economically disadvantaged students, limited English proficient students, and students with disabilities) is a desirable feature of NCLB. As it is implemented, however, it places large, diverse schools at a substantial disadvantage. Changing the safe harbor provision from a 10% reduction in below proficient to, say, a 3% reduction would go a long way toward solving the problems caused by the multiple hurdles created by subgroup reporting while maintaining a focus on the improvement in performance of all subgroups.
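
The two proposals above, a fixed annual increase from the base-year median and a lower safe harbor threshold, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a specification of either proposal: the function names linear_targets and makes_safe_harbor are hypothetical, the safe harbor reduction is read here as a relative reduction in the percentage of students below proficient from one year to the next, any other conditions attached to safe harbor are ignored, and the subgroup figures in the example are invented.

```python
def linear_targets(base_year=2002, base_pct=50.0, annual_gain=3.0,
                   end_year=2014):
    """Target percent above the base-year median for each year, assuming a
    constant gain in percentage points per year."""
    return {year: base_pct + annual_gain * (year - base_year)
            for year in range(base_year, end_year + 1)}


def makes_safe_harbor(prev_below_pct, curr_below_pct,
                      required_reduction=0.10):
    """True if the percent below proficient fell by at least the required
    relative amount (assumed reading: 0.10 means a 10% reduction in the
    previous year's percent below proficient)."""
    if prev_below_pct <= 0:
        return True  # nobody was below proficient in the previous year
    reduction = (prev_below_pct - curr_below_pct) / prev_below_pct
    return reduction >= required_reduction


targets = linear_targets()
print(targets[2002], targets[2014])   # first and last targets

# Hypothetical subgroup: 70% below proficient last year, 67% this year.
print(makes_safe_harbor(70.0, 67.0))                           # 10% rule
print(makes_safe_harbor(70.0, 67.0, required_reduction=0.03))  # 3% rule
```

With these inputs the target rises from 50% in 2002 to 86% in 2014, matching the figures in the text, and the hypothetical subgroup, which reduced its percentage below proficient by a little over 4%, would fall short under a 10% rule but would qualify under a 3% rule.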


References

California Department of Education. (2004). 2003 Academic Performance growth report: Information Guide. Retrieved October 22, 2004, from http://www.

Center for Education Policy. (2004). Rule changes could help more schools meet test score targets for the No Child Left Behind Act. Retrieved October 22, 2004, from

Dillon, S. (2004, September 5). Good schools or bad? Conflicting ratings leave parents baffled. New York Times.

Ford, L. A. & Thacker, A. A. (2005). Consequential impact of the No Child Left Behind Act on Kentucky's Accountability System: Phase I. FR-05-03. Louisville, KY: HumRRO.

Kentucky Department of Education. (2004). 2004 CATS interpretive guide: Detailed information on using your score reports. Frankfort, KY: Author, available online at

Linn, R. L. (2003). Accountability: Responsibility and reasonable expectations. Educational Researcher, 32(7), 3-13.

Linn, R. L. (2004, July). Rethinking the No Child Left Behind Accountability System. Paper presented at Center for Education Policy Forum, Washington, DC. Available online at

Marion, S. T., White, C., Carlson, D., Erpenbach, W. J., Rabinowitz, A. & Sheinker, J. (2002). Making valid and reliable decisions in determining adequate yearly progress. A paper series: Implementing the state accountability requirements under the No Child Left Behind Act of 2001. Washington, DC: Council of Chief State School Officers.

McCombs, J. S., Kirby, S. N., Barney, H., Darilek, H., & Magee, S. (2004). Achieving state and national literacy goals, a long uphill road. TR-180. Santa Monica, CA: Rand.

McLaughlin, D. & Bandeira de Mello, V. (2002, April). Comparison of state elementary school mathematics achievement standards using NAEP 2000. Paper presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA.

National Center for Education Statistics. (2004). The Nation's report card: Mathematics highlights 2003. Available online at http:// reportcard/mathematics/.

Nelson, H. F. & Rosenberg, B. (2004a). AYP status and school proficiency ratings 2002: Massachusetts. Washington, DC: American Federation of Teachers.

Nelson, H. F. & Rosenberg, B. (2004b). AYP status and school proficiency ratings 2002: Pennsylvania. Washington, DC: American Federation of Teachers.


No Child Left Behind Act of 2001, Pub. Law No. 107-110.

Novak, J. R. & Fuller, B. (2003, December). Penalizing diverse schools? Similar test scores, but different students, bring federal sanctions. Policy Brief, December(3), Berkeley, CA, University of California Berkeley: Policy Analysis for California Education, available online at

Olson, L. (2004). Taking Root [Special Report]. Education Week, 24(15), pp. S1-S10.

Popham, W. J. (2004, July). Ruminations regarding NCLB's most malignant provision: Adequate Yearly Progress. Paper prepared for the Center for Education Policy Forum, Washington, DC. Available online at

Porter, A. C., Linn, R. L., & Trimble, S. (2005, April). Consequences of AYP design decisions: Lessons from Kentucky. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Montreal, Canada.

Sanders, W. & Horn, S. (1998). Research findings from the Tennessee value added assessment system (TVAAS) database: Implications for educational evaluation and research. Journal of Personnel Evaluation in Education, 12(3), 247.

About the Author

Robert L. Linn
School of Education
University of Colorado at Boulder
Email:

Robert L. Linn is Distinguished Professor of Education at the University of Colorado at Boulder and Co-Director of the National Center for Research on Evaluation, Standards, and Student Testing. He has published over 200 journal articles and chapters in books dealing with a wide range of theoretical and applied issues in educational measurement and has received several awards for his contributions to the field, including the ETS Award for Distinguished Service to Measurement, the E. L. Thorndike Award, the E. F. Lindquist Award, the National Council on Measurement in Education Career Award, and the American Educational Research Association Award for Distinguished Contributions to Educational Research. He is past president of the American Educational Research Association, past president of the National Council on Measurement in Education, past president of the Evaluation and Measurement Division of the American Psychological Association, and past vice-president for the Research and Measurement Division of the American Educational Research Association. He is a member of the National Academy of Education, a Lifetime National Associate of the National Academies, and serves on two Boards of the National Academy of Sciences.


