
Educational policy analysis archives


Material Information

Title:
Educational policy analysis archives
Physical Description:
Serial
Language:
English
Creator:
Arizona State University
University of South Florida
Publisher:
Arizona State University
University of South Florida.
Place of Publication:
Tempe, Ariz
Tampa, Fla
Publication Date:

Subjects

Subjects / Keywords:
Education -- Research -- Periodicals   ( lcsh )
Genre:
non-fiction   ( marcgt )
serial   ( sobekcm )

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
usfldc doi - E11-00481
usfldc handle - e11.481
System ID:
SFS0024511:00480






Readers are free to copy, display, and distribute this article, as long as the work is attributed to the author(s) and Education Policy Analysis Archives, it is distributed for noncommercial purposes only, and no alteration or transformation is made in the work. More details of this Creative Commons license are available at http://creativecommons.org/licenses/by-nc-nd/2.5/. All other uses must be approved by the author(s) or EPAA. EPAA is published jointly by the Colleges of Education at Arizona State University and the University of South Florida. Articles are indexed by H.W. Wilson & Co. Send commentary to Casey Cobb (casey.cobb@uconn.edu) and errata notes to Sherman Dorn (epaa-editor@shermandorn.com).

EDUCATION POLICY ANALYSIS ARCHIVES
A peer-reviewed scholarly journal
Editor: Sherman Dorn
College of Education
University of South Florida
Volume 14 Number 8 March 20, 2006 ISSN 1068–2341

The Black-White Achievement Gap: Do State Policies Matter?

Henry I. Braun
Educational Testing Service
Aubrey Wang
School District of Philadelphia
Frank Jenkins
Westat, Inc.
Elliot Weinbaum
University of Pennsylvania

Citation: Braun, H. I., Wang, A., Jenkins, F., & Weinbaum, E. (2006). The Black-White achievement gap: Do state policies matter? Education Policy Analysis Archives, 14(8). Retrieved [date] from http://epaa.asu.edu/epaa/v14n8/.

Abstract

A longstanding issue in American education is the gap in academic achievement between majority and minority students. The goal of this study is to accumulate and evaluate evidence on the relationship between state education policies and changes in the Black-White achievement gap, while addressing some of the methodological issues that have led to differences in interpretations of earlier findings. To that end, we consider the experiences of ten states that together enroll more than forty percent of the nation's Black students. We estimate the trajectories


of Black student and White student achievement on the NAEP 8th grade mathematics assessment over the period 1992 to 2000, and examine the achievement gap at three levels of aggregation: the state as a whole, groups of schools (strata) within a state defined by the SES level of the student population, and within schools within a stratum within a state. From 1992 to 2000, at every level of aggregation, mean achievement rose for both Black students and White students. However, for most states the achievement gaps were large and changed very little at every level of aggregation. The gaps are pervasive, profound and persistent. There is substantial heterogeneity among states in the types of policies they pursued, as well as the coherence and consistency of those policies during the period 1988–1998. We find that states' overall policy rankings (based on our review of the data) correlate moderately with their record in improving Black student achievement but are somewhat less useful in predicting their record with respect to reducing the achievement gaps. States' rankings on commitment to teacher quality correlate almost as well as did the overall policy ranking. Thus, state reform efforts are a blunt tool, but a tool nonetheless. Our findings are consistent with the following recommendations: states' reform efforts should be built on broad-based support and buffered as much as possible from changes in budgets and politics; states should employ the full set of policy levers at their disposal; and policies should directly support local reform efforts with proven effectiveness in addressing the experiences of students of different races attending the same schools.

Keywords: achievement gap; National Assessment of Educational Progress (NAEP); state education policies; hierarchical analyses.

Acknowledgments

This research was supported by the U. S.
Department of Education through a grant under the NAEP Secondary Analysis Program (Award Number R902B020004) and by the Educational Testing Service. A. Wang and F. Jenkins were employed by ETS during the course of the project. The authors thank A. Sedlacek (NCES), M. Nettles (ETS) and I. Lawrence (ETS) for their encouragement and support, as well as Stephen Raudenbush and Lauress Wise for their comments on an earlier version of the manuscript.

This report is the result of the efforts of many people and the authors are indebted to them. We wish to recognize the contribution of the late James Watts of the Southern Regional Education Board who helped us make contact with many of our experts. Of course we are especially grateful to the following individuals who responded to the questionnaire and answered questions from project staff.

California: Tej Pandey, Private consultant, Sacramento, California.
Kentucky: Scott Trimble, Kentucky Department of Education (retired), Skip Kifer, University of Kentucky, and Brian Gong, The National Center for the Improvement of Educational Assessment, Inc.
Maryland: William Schafer, University of Maryland (retired) and Paul Gammill, Baltimore City Public Schools (retired).
Michigan: Judy Lanier, Michigan State University (retired), Sharif Shakrani, National Assessment Governing Board, and Scott Jenkins, Agate Software.
New York: Gordon Ambach, Council of Chief State School Officers (retired), John Bishop, Cornell University, and Benjamin Lindemann, New York State Department of Education (retired).
North Carolina: Gary Williamson, MetaMetrics, Inc., Richard Thompson, University of North Carolina, and Lou Fabrizio, North Carolina Department of Public Instruction.


South Carolina: JoAnne Anderson, South Carolina Education Oversight Committee and Sandra Lindsay, South Carolina Department of Education.
Tennessee: Douglas Wood, Columbia University, Karen Weeks, Tennessee State Board of Education, and Connie Smith, Tennessee Department of Education.
Texas: Chriss Cloudt, Texas Education Agency, and John Stevens, Texas Business and Education Council.
Virginia: Shelly Loving-Ryder, Charles Finley, and Pat Wright, all of the Virginia State Department of Education.

The following individuals reviewed and provided advice concerning the state profiles and data collection instruments: Douglas Tuthill, Private consultant, St. Petersburg, Florida; Margaret Goertz, Consortium for Policy Research in Education, University of Pennsylvania; Michael T. Nettles and Richard Coley, ETS; Patricia McAllister, Council of Graduate Schools; Carol Dwyer, ETS. Technical support was provided by Terran Brown, Youn-hee Lim, Haiying Liu, and Terrell Strayhorn. Additional support was provided by Heather Rose Brooks and Juan Guzman. Superb editorial assistance was provided by Elizabeth Brophy. Any errors of fact or omission are those of the authors.

Introduction

A critical and contentious issue in American education is the persistent gap in academic achievement between majority and minority students, especially Black students (Coleman et al., 1966; Jencks & Phillips, 1998; Kober, 2001). This gap has been documented at the national level at school entry (U. S. Department of Education [US DOE], 2002), at fourth and eighth grades (US DOE, 2004), and at twelfth grade (US DOE, 1995; US DOE, 1999). These reports make it clear that the gap at the means of the respective distributions is also reflected in the paucity of minority students at the highest levels of achievement.
Since the publication of the “Coleman Report,”1 substantial resources have been devoted to “closing the gap” through such programs as Head Start, activities funded through Title I2, as well as various programs sponsored by individual states. All too often, schools serving large numbers of disadvantaged students are underfunded and staffed with relatively new and/or underqualified teachers. Furthermore, economic disadvantage, high family mobility and rampant crime all can militate against success in school for these disadvantaged children. For recent reviews see Barton (2003) and Rothstein (2004).

The federal and state governments have attempted to close the achievement gap through efforts aimed at raising learning standards and making state and local policies more coherent and comprehensive, thereby sending clearer messages to educators, students, and parents about what is expected. This is known as the standards-based movement. Furthermore, efforts to reduce funding inequities among school districts are intended to provide additional resources to districts with large proportions of students from poorer families, many of whom are minority. These efforts include Federal initiatives such as Title I, as well as initiatives undertaken by various states, often as the result of lawsuits (Odden & Picus, 2000).

1 One of the most frequently cited studies about the influence of schools and students’ home life on academic achievement was a 1966 study by James Coleman and others, Equality of Educational Opportunity.

2 Title I (Part A) of the Elementary and Secondary Education Act (ESEA) is the largest federal program in K–12 education, aimed at equalizing funding for high-poverty schools and funded at more than $11 billion in the 2003–04 school year.


Some reduction in the achievement gap since the 1970s in both reading and mathematics is indicated in the long-term trend component of the National Assessment of Educational Progress (NAEP) (Hedges & Nowell, 1998). However, results from main NAEP3 from 1990 forward show that the gap has remained nearly constant. For example, in 8th grade mathematics at the national level, the gap between White students and Black students was 33 points in 1990 and 36 points in 2003 (US DOE, 2004). Moreover, the mean scale score in 2003 for Black students (252) falls at the 13th percentile of the distribution of scale scores for White students in 2003, which is an indication of the small overlap of the two distributions. At the same time, it is important to recognize that both groups improved their performance considerably over this period, White students by 18 points and Black students by 15 points. Although the Black-White achievement gap varies from state to state, the typical achievement gap at the state level is about the same magnitude as the achievement gap for the nation as a whole. Nonetheless, as we shall see below, some states have been more successful than others at closing the achievement gap, at least for some categories of schools.

As Federal and state authorities have implemented different strategies to raise achievement overall (and for specific groups), investigators have attempted to determine whether particular policies, or constellations of policies, are differentially successful in raising achievement and/or closing the gap. Evaluations of Title I (US DOE, 2001b; Kosters & Mast, 2003) or of the National Science Foundation (NSF) State Systemic Initiatives (SSIs) (Webb, Kane, Kaufman, & Yang, 2001) fall under this rubric.
Some studies have focused on states’ accountability efforts (Carnoy & Loeb, 2003), some on policies regarding teachers (Darling-Hammond, 2000), while others have considered a broader spectrum of reform initiatives (Swanson & Stevenson, 2002). Conclusions have been mixed and controversy is still the norm, particularly with respect to the long-term consequences of high-stakes testing (Amrein & Berliner, 2002; Carnoy & Loeb, 2003; Raymond & Hanushek, 2003; Braun, 2004; Desimone, Smith, Hayes & Frisvold, 2005; Nichols, Glass & Berliner, 2006).

This should not be surprising because such evaluations are a type of observational study of a small population of units (typically, states), with non-random allocation of “treatments.” Consequently, causal inferences are not warranted. In addition, the actual causal pathways are complex, comprising many interacting systems whose dynamics can and do evolve through the years. Approaches that concentrate on a single policy, such as the introduction of high-stakes testing, or rely on a snapshot of the policy landscape at a particular point in time, cannot yield inferences that can be strongly defended.

Although most studies and public discussion rely on data reported at state or national levels, there is considerable evidence that there is more heterogeneity within states than among states (Raudenbush, Fotiu, Cheong, & Ziazi, 1996). This raises the possibility that patterns of achievement by race or trends in the achievement gap might present a different picture if they were viewed at lower levels of aggregation. (See also Bracey, 2003 on this point.) Analyses at levels below that of the state provide a better basis for examining the apparent effectiveness of policies targeted, for example, at raising the achievement of all students, including Black students; or at raising the achievement of certain types of schools, such as high poverty schools.
On the other hand, studies of individual schools (or even a few schools) must confront the volatility and lack of generalizability attendant on small sample sizes.

3 Main NAEP refers to that component of the NAEP program that is based on periodically revised curriculum frameworks. It is distinguished from the long-term trend component of the NAEP program that is based on the original design established in 1969.

The research presented here is an attempt to address some of these issues, by striking a compromise between state level and school level analyses. With constraints of cost and time in


mind, we have chosen to examine the gap between Black students and their White peers, focusing on ten states in which Black students represent a substantial proportion of the public school population: California, Kentucky, Maryland, Michigan, North Carolina, New York, South Carolina, Tennessee, Texas, and Virginia.4 The selection process is detailed below. Our analysis is distinguished by the examination of student achievement in subsets of schools (which we denote as “strata”) that are determined by the average SES levels of their students.

We have two overarching research questions. First, has the Black-White achievement gap decreased from 1992 through 2000 in certain categories of schools in some of the ten states? And, in addition, do coherent and comprehensive state policies make a difference in closing the achievement gap?

It should be borne in mind that the size and trajectory of the Black-White achievement gap is completely determined by how the achievement of both Black and White students varies over time. Changes in the gap can result from different scenarios. For example, a reduction in the achievement gap may occur with scores in both groups increasing, but Black students experiencing a greater rate of improvement. Alternatively, White students’ achievement may be essentially stagnant while Black students gain. Clearly, the interpretation and policy implications of a particular scenario should involve consideration of all the relevant trajectories. Accordingly, in this study, we present average results for each group separately, as well as for the difference in the averages (i.e., the achievement gap). In particular, we examine the possibility of an association between the reduction in the gap and the increase for Black students over the same period. We also examine the extent to which the variation in the gaps at different levels can be accounted for by characteristics of students and schools.
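The dependence of the gap on both trajectories can be made concrete with a short calculation. The sketch below is illustrative only: the national figures (a 33-point gap in 1990, gains of 18 points for White students and 15 for Black students) are the ones quoted earlier from US DOE (2004), while the function name and the three scenario entries are hypothetical values chosen for illustration, not results from this study.

```python
def gap_change(white_gain, black_gain):
    """Change in the White-minus-Black gap implied by each group's gain."""
    return white_gain - black_gain


# National 8th grade mathematics figures quoted earlier (US DOE, 2004).
gap_1990 = 33
national_white_gain = 18
national_black_gain = 15
gap_2003 = gap_1990 + gap_change(national_white_gain, national_black_gain)

# Hypothetical scenarios (made-up gains) showing why both trajectories matter.
scenarios = {
    "both rise, Black students faster": (10, 16),
    "White students flat, Black students gain": (0, 6),
    "both rise equally": (12, 12),
}

for label, (white_gain, black_gain) in scenarios.items():
    change = gap_change(white_gain, black_gain)
    print(f"{label}: gap changes by {change:+d} points")

print("implied 2003 national gap:", gap_2003)
```

The first two hypothetical scenarios shrink the gap by the same amount even though the underlying trajectories differ, which is why the group averages are reported separately alongside the gap.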
Notwithstanding the complexities involved, we believe that the achievement gap is a critical indicator of the efficacy of our education system and merits attention in its own right. That the achievement gap appears to be generally resistant to policy interventions only serves to underscore the importance of examining its structure in greater depth.

Similarly, in order to address how effective state policies have been in closing the achievement gap from the late 1980s through the late 1990s, we must first understand how the policies have changed over time. Through our policy analysis framework, we identified five main policy levers: governance and the politics of reform, education finance, curriculum and standards, teacher quality, and assessment and accountability. For each of these levers, we identified some key reform-related questions for which we sought empirically based answers. To this end, we conducted, for each state, a comprehensive review of its policy history for the period 1988 to 1998. The review involved organizing and summarizing existing information and documentation relevant to the major policy dimensions, as well as extensive interviews with individuals knowledgeable about the state’s initiatives in the education realm during this period. In particular, we relied on these experts to help us evaluate those aspects of states’ policies that are more difficult to quantify but no less critical to their success: scope and quality, as well as coherence and consistency over time. Our approach is described in more detail in the Methods section.

With respect to achievement, we decided to examine student performance on State NAEP 8th grade mathematics, using results from the 1992, 1996, and 2000 administrations. This choice was made for a number of reasons. First, 8th grade mathematics represents the capstone of a state’s testing program in mathematics mandated in the most recent reauthorization of ESEA, the No Child Left Behind (NCLB) Act.
Second, there is a good deal of empirical evidence that math achievement, more so than reading, is influenced by teacher and school characteristics (Nye, Konstantopoulos, & Hedges, 2004; Dee & Keys, 2004).

4 In what follows, we will often refer to the states by their usual two-letter abbreviations.

Finally, student proficiency at this level is


predictive of subsequent success in high school and participation in post-secondary education. (See the report of the US DOE [2001b] and the references therein.)

Using auxiliary information from the NAEP administrations, we categorized schools in the NAEP sample as higher poverty or lower poverty, based on the percentage of students in the school who were eligible for free or reduced price lunch. Schools with 50% or more of eligible students were classified as higher poverty. The others were classified as lower poverty. In what follows, we refer to these two categories as strata. For the period 1992 to 2000, we computed the changes in 8th grade NAEP mathematics scores for White students, Black students and the differences between them (the Black-White achievement gap) at the state level and at the stratum level within the state.

To facilitate the interpretation of the results, we first set the magnitudes of the Black-White gaps within strata against the general background of between state differences (the usual focus of both policy-makers and the public) as well as against the typical sizes of between stratum differences within states. We then compared states in terms of the changes over the period in the Black-White achievement gaps within each stratum. (This is a reasonable strategy since the definition of the stratum is the same for all states and across administrations.) Contrasts between the higher and lower poverty strata within states were also examined.

Next, we effectively restricted our attention to schools whose NAEP samples included both Black students and White students. For those schools, we employed hierarchical linear models (HLM) to partition overall achievement variance into between-student, within-school and between-school components. We then augmented the basic variance components model with student and school characteristics.
Thus, for each stratum within a state, we were able to estimate the size of the Black-White achievement gap for those students attending the same schools, usually denoted as a “pooled within school” estimate. Moreover, we were able to determine how much of that gap could be accounted for by other student characteristics, such as student socioeconomic status (SES). Although of secondary interest, we also estimated how much of the between school variance could be accounted for by school characteristics, such as their demographic make-up.

We developed a summary description of the achievement record of each state and categorized states on the basis of those summaries. We considered both absolute gains by Black students as well as progress in closing the achievement gap between Black and White students. Separately, we categorized states with respect to their policy histories, first considering each policy lever separately and then a derived overall policy score. Finally, we juxtaposed the two achievement categorizations against the policy categorizations, identifying patterns of interest.

Of course, we are mindful of the fact that, notwithstanding the extensive amount of data analyzed, the observed patterns can only lead to tentative conclusions regarding the effectiveness of particular policies or strategies. The arguments must necessarily be indirect and circumstantial. Nonetheless, we believe that our approach has yielded insights that can be helpful to states as they move forward with their own initiatives and wrestle with the requirements of NCLB. In any event, the results we have obtained can serve as a baseline against which to compare the success of these states over the next decade in improving achievement and closing the Black-White gaps.

The article is organized as follows: In the next section we provide an extended review of the literature.
We then describe our methods, followed by our policy analysis and then separate sections containing the basic descriptive results and the multi-level analyses. The next section presents the linking of the policy and quantitative analyses. The paper concludes with a short discussion of findings and implications.
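The stratum definition and the stratum-level gap described in this introduction can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the 50% free or reduced price lunch cutoff comes from the text, but the function names, the record layout, and the toy scores are hypothetical, and the sketch uses simple unweighted means, whereas an actual NAEP analysis would need to use sampling weights and plausible values.

```python
from statistics import mean

HIGH_POVERTY_CUTOFF = 50.0  # percent eligible for free or reduced price lunch


def stratum(pct_eligible):
    """Classify a school as higher or lower poverty (50% or more is higher)."""
    if pct_eligible >= HIGH_POVERTY_CUTOFF:
        return "higher poverty"
    return "lower poverty"


def gaps_by_stratum(records):
    """Return the White-minus-Black mean score gap within each stratum.

    records: iterable of (school_pct_eligible, race, score) tuples,
    a hypothetical layout used only for this sketch.
    """
    scores = {}
    for pct_eligible, race, score in records:
        scores.setdefault((stratum(pct_eligible), race), []).append(score)

    gaps = {}
    for name in ("higher poverty", "lower poverty"):
        white = scores.get((name, "White"))
        black = scores.get((name, "Black"))
        if white and black:  # a gap needs both groups in the stratum
            gaps[name] = mean(white) - mean(black)
    return gaps


# Toy, made-up records: (percent eligible in the school, race, scale score).
toy_records = [
    (60, "White", 270), (60, "Black", 240), (55, "Black", 250),
    (20, "White", 290), (20, "Black", 260),
]
toy_gaps = gaps_by_stratum(toy_records)
print(toy_gaps)
```

Repeating the computation for each administration and differencing the results would give the change in the gap within a stratum over the period.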


Literature Review

Concerns with U.S. Public Education

There have always been criticisms of the public education system and, correspondingly, a plethora of attempts to remedy the purported problems. This dynamic has been well documented by Tyack and Cuban (1995) and, more recently, by Ravitch (2000). Two key issues are appropriate goals for the system and effectiveness in achieving those goals. Particularly in the last decade, some critics have questioned whether, even with the provision of additional resources, public education is flexible enough to evolve over time to meet increased demands, including better serving all students. Advocates of charter schools and vouchers maintain that without competition and real alternatives, efforts to implement wide-ranging improvements in public education are doomed to failure (Hoxby, 2001).

Over the last twenty years or so, there has been growing concern about how U.S. students (and adults) perform in comparison to their peers in other nations. In a recent analysis of data from the International Adult Literacy Survey, Sum, Kirsch, and Taggart (2002) characterize U.S. performance by the phrase “mediocrity and inequality.” They point out that, on a number of dimensions, the U.S. places at or below the median of 17 developed nations and also exhibits much greater variability than any of the other nations. Although Sum et al. are concerned with the performance of adults, the same characterization applies to the achievement of in-school populations.

Comparisons from international assessments, such as the Third International Mathematics and Science Study (TIMSS) and Third International Mathematics and Science Study-Repeat (TIMSS-R) (US DOE, 2000) make it clear that overall U.S. performance is far from “world class.” Results of the first assessment carried out under the auspices of the Program for International Student Assessment (PISA) were released by the U.S. Department of Education (2002).
Among the 28 Organization for Economic Co-Operation and Development (OECD) nations that took part in this assessment of 15-year-olds, the U.S. scored about average on reading, math and science. Bracey (2002) points out that if U.S. students were disaggregated by race/ethnicity, White students would place 7th, while both Black and Hispanic students would place 27th.

Obviously the wide disparities among demographic subgroups contribute to the “excess variability” noted above. The magnitude and persistence of these disparities is rightly regarded as problematic both for our long run economic competitiveness and the health of our democracy (Friedman, 2005). They are particularly troubling given the projections of the future demographic composition of the U.S. population.

For a broader and more optimistic view, the U.S. Department of Education report (2001a) on educational achievement and Black-White inequality is instructive. Among the findings are that Black-White disparities in college attendance, employment, and earnings are reduced, eliminated or even reversed if individuals are matched on prior educational achievement. However, as long as there are substantial differences in the distributions of achievement, overall differences between Blacks and Whites will remain large and problematic.

Bracey’s (2002) observation regarding disaggregated results is but one example of the problems in reporting data at one (usually the highest) level of aggregation. Raudenbush, Fotiu, Cheong, and Ziazi (1996) make a similar point with respect to comparisons among states based on NAEP data. The latter study also demonstrates that there are substantial differences among states, and among race/ethnic groups within states, in home environment and in learning opportunities in school. Not surprisingly, these differences are strongly associated with differences in achievement.


In view of the American tradition of local control and, in many states, heavy reliance on the local property tax base to fund schools, there are serious questions about whether mechanisms can be found to provide adequate funding to impoverished districts with poor records of achievement. In many states, litigation has resulted in court mandated reforms, but these have often been stymied or poorly implemented, with the result that little progress in equalizing per pupil expenditures has been made.

The Achievement Gap

The persistent differences between Blacks and Whites in both educational achievement and educational attainment have been well documented in a number of sources. A brief historical introduction to the problem, as well as a thoughtful review of relevant issues, is provided by Jencks and Phillips (1998). Hedges and Nowell (1998) analyze data from six surveys and present evidence that the achievement gap has narrowed somewhat since 1970 but that the differences remain substantial. Moreover, they note that “Blacks are hugely underrepresented in the upper tails of the achievement distribution, and this representation [gap] does not seem to be decreasing” (p. 167).5 These conclusions are echoed in a recent report by Barton (2002) for the National Education Goals Panel in which he analyzes data from both National and State NAEP over the period 1990 to 2000. He reports, for example, that over that decade, the gap in 8th grade mathematics went up in 15 states and down in 14 states; however, only two of the 15 were statistically significant and none of the 14 were.

The U.S. Department of Education (2001) report already cited is a comprehensive study of the differences between Blacks and Whites in educational achievement as well as in a variety of educational and economic outcomes.
One of the conclusions drawn is that, “The black-white mathematics gap differed in size across grades, in a manner consistent with, but not necessarily demonstrating, a narrowing of the gap during elementary school, followed by a widening of the gap during junior high school and little change during senior high school” (p. v). It should be noted, however, that these findings are based on stitching together results from a number of cohorts spanning different grade ranges. Thus, they could be affected by unmeasured between-cohort differences. (See Ludwig, 2003, for further comments.)

There is a considerable literature on the nonschool correlates of achievement and myriad partial explanations for the existence of the achievement gap. Miller (1995) presents an exhaustive analysis of the correlations between social class and socio-economic status on the one hand and academic achievement on the other. The correlations are strong for all race/ethnic groups and for all standardized measures of achievement. Moreover, he documents the wide variations among race/ethnic groups in education-relevant resources and their association with differences in achievement. He explains Coleman’s (1966) conclusion that family variables, rather than school variables, account for more of the achievement differences among race/ethnic groups as follows: “The variations in home-based, formal-schooling-derived resources that have been intergenerationally accumulated by families are greater than the variations in education-relevant resources that society is investing in the current generation of children directly through the schools” (p. 119).

5 Of course, for typical test score distributions, a substantial gap at the means of the individual distributions is accompanied by highly disproportionate representation in the tails of the combined distribution.
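The point made in footnote 5 can be illustrated numerically. The following is a sketch under loudly stated assumptions rather than an analysis of NAEP data: both score distributions are taken to be normal with a common standard deviation of 32 points, a value assumed here because it places the Black mean at roughly the 13th percentile of the White distribution, in line with the national 2003 figures quoted in the introduction (a Black mean of 252 and a 36-point gap). The cutoff of one standard deviation above the White mean is likewise an arbitrary illustrative choice.

```python
from statistics import NormalDist

BLACK_MEAN = 252.0   # 2003 national mean quoted in the introduction
GAP = 36.0           # 2003 national gap quoted in the introduction
WHITE_MEAN = BLACK_MEAN + GAP
ASSUMED_SD = 32.0    # assumption for illustration, not a reported NAEP value

white = NormalDist(mu=WHITE_MEAN, sigma=ASSUMED_SD)
black = NormalDist(mu=BLACK_MEAN, sigma=ASSUMED_SD)

# Where the Black mean falls within the (assumed) White distribution.
share_of_white_below_black_mean = white.cdf(BLACK_MEAN)

# Share of each group above a high cutoff: one SD above the White mean.
cutoff = WHITE_MEAN + ASSUMED_SD
white_share_above = 1.0 - white.cdf(cutoff)
black_share_above = 1.0 - black.cdf(cutoff)
tail_ratio = white_share_above / black_share_above

print(f"White students below the Black mean: {share_of_white_below_black_mean:.1%}")
print(f"White share above cutoff: {white_share_above:.1%}")
print(f"Black share above cutoff: {black_share_above:.1%}")
print(f"Ratio of upper-tail shares: {tail_ratio:.1f}")
```

Under these assumptions a gap of a little over one standard deviation at the means becomes a much larger relative disparity in the upper tail, and the disparity grows as the cutoff moves further out.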


Miller also addresses the troubling finding that even controlling for social class (using available measures such as parental educational attainment), there are considerable differences among race/ethnic groups in academic achievement in kindergarten through 12th grades. However, these differences parallel the findings “that there are currently very consequential differences in the amounts of human capital possessed by young white adults and their African American and Latino counterparts and that these variations exist at most educational attainment levels” (p. 170). Again, these disparities appear to be the consequence of differential rates of accumulation of human capital over many generations. From a policy perspective, the inescapable conclusion is that the closing of the achievement gap will only happen over generations. The caution for the methodologist is that available measures of family resources underestimate group differences. This bias must be taken into account when interpreting the results of any analysis.

Phillips, Brooks-Gunn, Duncan, Klebanov, and Crane (1998) offer an accessible account of the issues as well as a sophisticated analysis of data from the Children of the National Longitudinal Survey of Youth. They investigate a number of factors related to family background and parenting styles and habits and conclude that perhaps as much as two-thirds of the test score gap can be accounted for by these factors. They acknowledge, however, that other authors (Herrnstein & Murray, 1994; Hedges et al., 1998) reach a figure closer to one-third. The latter estimates employ a more restrictive definition of socio-economic characteristics. Note that Phillips et al. (1998) augment the usual family variables with a number of other educationally relevant variables that reduce the misspecification described by Miller (1995).
They are also at pains to point out the potential confounding of genotype and home environment, which complicates both interpretation and prediction.

More recently, Barton (2003) continues this line of analysis. Based on a review of the literature, Barton identifies 14 correlates of academic achievement that involve factors related to health, housing, nutrition and school quality. He then provides data to estimate the gap between minority and majority students. For each factor, the gap favors majority students and, in most cases, the gap is very substantial. A plausible conclusion is that the achievement gap is but one consequence of these differences among students and, moreover, that school reform alone is unlikely to eliminate the gap. For a slightly different view, see Lee (2002).

Indeed, school effects have been much studied since Coleman et al. (1966), with the more recent studies examining the contributions that schools may make to the achievement gap. For example, Phillips, Crouse, and Ralph (1998) conduct a meta-analysis and report that, “Black students who start elementary school with the same test scores as the average White student learn less than the average White student between the first and twelfth grades” (p. 257). They further note that “…our results imply that neither differences between the schools that Blacks and Whites attend nor differences in their socioeconomic status suffice to explain why Blacks learn less than Whites with similar initial skills” (p. 267). Ferguson (1998) investigates the differential effects of grouping and tracking and concludes that they cannot account for much of the achievement gap.

Clearly, no single factor can account for the large and persistent test score gaps that have been observed, and the complexity of the problem may be the principal reason that it has proven so refractory to amelioration, at least on a large scale.

Education Reform

In addition to the contributions of families and communities, schools can make a difference in closing the achievement gap. Throughout the 1990s, there was a shared belief that low student achievement in public schools was primarily the result of low standards, incoherent and fragmented policies, and poor use of resources (Corcoran, 1997; U.S. DOE, 2001c). As a result, the 1994 reauthorization of the Elementary and Secondary Education Act (ESEA), the Goals 2000 program, and the National Science Foundation’s State Systemic Initiatives all focused on top-down reform. States were asked to set higher standards and expectations for all students; create more coherent and aligned assessments and curricular frameworks; implement an accountability system with rewards and sanctions; and change teacher certification requirements to enable teachers to prepare students to meet the new standards (Corcoran, 1997).

States are constitutionally responsible for elementary and secondary education and generally play a major role in regulating education, although there is considerable variation across states in the balance between centralized and local (district) control. Where there is sufficient political will, however, every state has the power to promulgate new rules, establish standards, mandate accountability measures, and impose sanctions or rewards. Nonetheless, state-initiated reform efforts are shaped in part by traditions of local control and by the challenge of effectively driving change down through myriad local bureaucracies. Federally-initiated reforms have to contend with constitutional issues, although the most recent re-authorization of the ESEA, the No Child Left Behind Act, appears to have had a dramatic and immediate impact on states’ policies through the threat of withholding federal funding in the event of non-compliance.

During the 1990s, many state reforms focused on disadvantaged students and some directly on the achievement gap, with modest, if any, success (Brady, 2003).
Clearly, there are many possible explanations to account for this discouraging record, including the challenge of maintaining coherence, consistency, and adequate resources in a dynamic political environment, as well as the general difficulty of propagating “top-down” reform without substantial attenuation at the classroom level. Another view is that schools have multiple missions that conflict with one another (Christensen & Karp, 2003). Toward the end of the decade, a number of reports examined the achievement gap and offered recommendations on actions that policy-makers, school officials, parents, and others could take to improve achievement overall and to reduce the gap (College Board, 1999; Center on Education Policy, 2001).

Evaluating the Effectiveness of Reform Efforts

The initiatives taken at the federal and state levels, such as Title I and the NSF’s State Systemic Initiatives (SSIs), are based on assumptions that schools can make a difference in student learning, notwithstanding the substantial influence of family and community characteristics. (Of course, there is a second, implicit, assumption that governmental policies can constructively influence the practice of schools.) There is certainly considerable anecdotal evidence of schools that have made a difference (Education Trust, 1999, 2001; Cawelti, 1999), but large-scale analyses have yielded inconsistent results, with continuing controversy ever since the publication of the Coleman Report in 1966.

As part of the 1994 reauthorization of the ESEA, Congress mandated the national evaluation of Title I to examine the progress of students whom the program was intended to benefit, as well as the implementation of key provisions of the program. The evaluation found that Title I reached more than 12.5 million students, many of whom were from the highest-poverty schools.
Although Title I specifically targeted students in poverty, the impact of the Title I program on student achievement could not be easily disentangled from the contributions of other factors, including the state and local reform efforts which Title I was designed to support. Moreover, results from state assessments and from NAEP present somewhat contrasting pictures of Title I’s success in narrowing the achievement gap. State assessments generally indicate some progress in narrowing the achievement gap, while the results from the long-term trend component of NAEP indicate a slight widening of the achievement gap from the late 1980s to 1999 (U.S. DOE, 1999, 2001c).6

The vision outlined by NSF’s SSI was even more ambitious. It included high standards, along with aligned curriculum, pedagogy, and assessment. Proponents of systemic reform believed that high standards for student learning should form the basis for the alignment of all policies, practices, and resources throughout the educational system. They posited that improvement in student achievement requires coherent policies and coordinated resources that are designed to communicate a clear vision of what students should know and be able to do (Zucker, Shields, Adelman, Corcoran, & Goertz, 1998; Corcoran, 1997; Clune, 1998).

A total of 24 states and Puerto Rico participated in the SSI program in the 1990s. These states experimented with a variety of strategies. Evaluations of SSIs have found that half of the SSI states showed some impact on classroom practice. Moreover, achievement gains were higher in states that had intense professional development focusing on curriculum and materials (Blank, 2000; Zucker et al., 1998). However, researchers concluded that assessing the extent to which an SSI achieved the goal of closing the achievement gap was challenging, as the capacity to do so varied greatly from one SSI to another and, furthermore, most states were not able to ensure equal implementation of SSI activities across all schools. In fact, only a few SSIs attempted to change or restructure the professional development system itself to ensure that all teachers were given access to high quality training. However, it does seem that states that implemented a focused SSI or standards-based reform were able to make a difference in their teachers’ self-reported classroom practice.

It should be evident that the evaluation of such broad reform efforts must confront a number of challenges. These include the difficulty of establishing clear specifications and measurement rubrics for different policy components, of characterizing and tracking changes in these components over time, and of disentangling the effects of the intended policies on targeted outcomes from other factors. Most researchers agree that no single study or approach can provide a definitive answer to any realistic policy question. The issues involved are too complex and the limitations of a particular approach too numerous to fully exclude plausible competing explanations. Instead, it is necessary to triangulate among cross-sectional, longitudinal, and experimental studies. Unfortunately, the time frame for such a comprehensive strategy usually does not conform to the needs of decision-makers.

Although analytic methodologies have increased in sophistication and the databases have become richer and more comprehensive, there is still a frustrating lack of consensus on “what works.” The key is to try to exploit the links between variation in practices and differences in outcomes. Presumably, once unusually effective (or ineffective) schools and teachers have been identified, further investigation is required to determine the specific practices that may be responsible. Such intensive study is time-consuming and expensive (see Klein et al. [2000] and the references therein).

Turning to the evaluation of the effectiveness of state actions, there are many policy differences among states that can be considered. Of course, much depends on the outcome measure selected. Barton and Coley (1998), for example, follow a cohort of 4th grade students in 1992 who attended the 8th grade in 1996 and estimate the gains for each state, based on State NAEP results.
As Camilli (2000) points out, Texas ranks below the median on this measure, rather different from its rank as number one in other accounts of the “Texas Miracle” (Haney, 2000). A recent update can be found in Coley (2003). It is not surprising to find differences in results between cross-cohort and within-cohort analyses (Braun, 2004).

6 Discrepancies in trends between a state assessment and NAEP can be due to many factors including differences in the assessment frameworks, test content, administration protocols and student motivation. For further discussion, consult Thissen (2005) and Koretz (2005).

Certainly, participation in the State NAEP assessments offers states a number of evaluation alternatives. While some of the concerns mentioned above with respect to state tests are allayed with State NAEP, others remain and new ones appear. As an example of the former, there are worries that some states may try, at least to some extent, to exclude many low-performing but eligible students in order to improve states’ reported performance. On the other hand, because participation in NAEP is not mandatory, the representativeness of both the school and student samples must be closely monitored. Furthermore, inasmuch as NAEP does not report scores at the student level and is not considered a high stakes test, variation in student motivation (within and between states) is a potential source of bias that is difficult to quantify.

Nonetheless, State NAEP represents a rich database for analyses at the state level and below. By now there have been a sufficient number of administrations to justify the exploration of the possible effects of reform efforts. There appear to be two main strategies in the use of State NAEP. One relies on aggregation of data to the state level and, depending on the nature of the models adopted, proceeds to draw inferences about the state or about comparisons among the states. The other employs so-called multi-level or hierarchical linear models (HLM) (Raudenbush & Willms, 1995; Raudenbush, 1988; Raudenbush & Bryk, 2002) to develop more detailed pictures of the relationship between achievement and other measured variables as a basis for inference.
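
For readers unfamiliar with the second strategy, a generic two-level model of the textbook kind can be written as below, with students (i) nested in schools (j). This is an illustrative sketch only, not the specification used in any of the studies cited here; the student-level predictor X, the school-level predictor W, and all symbols are assumptions introduced for exposition.

```latex
% Level 1 (students within schools)
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij},
\qquad r_{ij} \sim N(0, \sigma^2)

% Level 2 (between schools)
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j},
\qquad
\beta_{1j} = \gamma_{10} + u_{1j}
```

Here the school-specific intercept and slope are themselves modeled as outcomes, so that between-school variation (the u terms) is separated from within-school variation (the r term) rather than being pooled, as it would be in an analysis of state aggregates.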

A report by the Education Intelligence Agency (Antonucci, 1999) offers a relatively simple but interesting example of the first approach. He first computes a cumulative score for each state by combining its scaled scores on Grade 4 math and reading as well as Grade 8 math and reading obtained from the 1996 and 1998 State NAEP Assessments. He then examines, for example, how state rankings change when attention focuses on the performance of the states’ Title I students. Texas, which is near the median in the overall ranking, climbs to the top in the latter one, while California remains mired near the bottom on both. He also compares states on the differences in scores between students whose teachers had a master’s degree and those whose teachers had a bachelor’s degree. In each state, the comparison favors the former, but in only a third of the states was the improvement per test of practical interest. In this case, California and Texas were adjacent, and at the median. While the specter of confounding is ever-present, Antonucci does make some interesting observations that lead to hypotheses to be further investigated.

More recently, Lee (2002) has considered trends in Black-White and Hispanic-White score gaps in reading and math using long-term trend NAEP. He observes that the “…achievement gaps narrowed in the 1970s and 1980s but then stabilized or widened in the 1990s” (p. 3). He then attempts to correlate these patterns with long term trends in various measures of socioeconomic, educational and cultural conditions. He argues that the latter trends cannot account for the widening of the gaps observed during the 1990s and that it will be “… necessary to investigate simultaneous changes across a broad range of factors from multiple data sources and to examine their interactive, joint influences on the achievement gap” (p. 10).

Darling-Hammond (2000) considers state averages on reading and math (1990, 1992, 1994, 1996) as well as state gains in Grade 4 math (1990 to 1996 and 1992 to 1996) and state gains in Grade 4 reading (1992 to 1994). In addition to State NAEP, she draws on the 1993–94 Schools and Staffing Survey and data from a 50-state survey of teacher policies. The goal of the investigation is to link state achievement to state initiatives targeting the improvement of teacher quality. A variety of analytic techniques are employed, including comparing similarly situated states with different gain records and adjusting state results for differences in contextual variables such as student poverty. This effort is noteworthy for combining quantitative analysis with a comprehensive set of case studies. While recognizing both that states often engage in multiple reform initiatives and that aggregate estimates necessarily miss important within-state variation, she concludes that, “…states interested in improving student achievement may be well-advised to attend, at least in part, to the preparation and qualifications of the teachers they hire and retain in the profession” (p. 35).

Grissmer, Flanagan, Kawata and Williamson (2000) adopt an approach that is analytically more complex. They employ data from the 1990 U.S. Census and from NELS to augment the family data collected by State NAEP, with the rationale that NAEP does not collect some important variables and that some of the student self-report data NAEP does collect is of low reliability. Their starting point is the incomplete matrix of 44 states crossed with the seven 4th grade or 8th grade math and reading tests that were administered between 1990 and 1996 (inclusive of the end years). A cell entry is the (normalized) score of a particular state on a particular test. There were 271 cell entries in all, out of a possible 44 × 7 = 308. Grissmer et al. (2000) proceed to develop and fit a number of different models in order to estimate such quantities as annualized state gains in reading and math and, “to determine whether trends and differences in scores across states for students from similar family backgrounds can be statistically linked to differences in state educational system characteristics that are resource intensive…” (p. xx). In a separate set of analyses they also attempt to estimate the cost-effectiveness of different reform policies.
Among other things, they conclude that 1) most states experienced statistically significant gains in 8th grade math over the period 1990–96; and, 2) policies that emphasize enrolling more children in public prekindergarten programs, reducing pupil-teacher ratios in the lower grades, and providing teachers with adequate resources appear to be effective in improving achievement, at least in states with large numbers of disadvantaged students and whose educational expenditures were below the national average at the beginning of the period.

More recently, Desimone et al. (2005) developed a five-dimensional framework for describing state education policies, as well as protocols for developing descriptors of each state with respect to the framework. They then used multiple regression to relate changes in state performance on NAEP 8th grade mathematics, from 2000 to 2003, to various combinations of state policy descriptors. They studied the NAEP composite score, as well as subscores for procedural knowledge, problem solving and conceptual understanding. This approach was most successful in accounting for variation in states’ procedural knowledge averages in 2003 in terms of four policy descriptors, states’ procedural knowledge averages in 2000, and the interactions among these five predictors (adjusted R² = .5). There was some evidence that gains were greater for states with lower average scores in 2000.

Nichols et al. (2006) contribute to the long-running debate about the efficacy of high-stakes testing in improving student learning, as measured by test scores. Their contribution is two-fold. First, they develop and validate a new indicator, the Accountability Pressure Rating (APR), which ranks states with respect to the pressure exerted on school systems through the implementation of high-stakes testing and its associated consequences.
They also introduce a second indicator (EPR) that tracks meaningful changes in the APR over time. Second, they carry out a series of analyses that examine the pattern of relationships between the APR or the EPR and a variety of criteria, including NAEP gains in math and reading over different time periods. Analyses are carried out both for the entire student population and for specific subgroups. The scope of the study and the heterogeneity of the results make it impossible to provide a simple summary. Suffice it to say that there is some evidence that accountability pressure is related to student gains in grade 4 mathematics, but not in grade 8 mathematics, nor in reading at either grade 4 or grade 8.

Strategies that employ HLM typically focus on the analysis of test scores within a state and take account of the fact that NAEP student samples are clustered by school. They provide a more detailed picture of the structure of achievement in the state as well as a more defensible partitioning of error variance. An excellent example of the application of HLM to important policy issues is provided by Raudenbush et al. (1996). They employ data from the 1992 State NAEP Assessment in 8th grade math to study state-to-state variations in achievement as well as differences between majority and minority students. They find that a combination of home environment indicators and school learning opportunity indicators accounts for most of the considerable between-state variation in 8th grade math achievement. Interestingly, they also examine differences among states in the overall availability of learning opportunities, as well as social and ethnic inequalities in access to those opportunities. Not surprisingly, here, too, they find substantial variation among states. The story that emerges from this analysis offers much more to policy-makers and educators than do simple state rankings by overall achievement.

Swanson and Stevenson (2002) used hierarchical linear models to evaluate the impact of the national standards-based reform on classroom instructional practices, using data from the 1992 and 1996 state National Assessment of Educational Progress (NAEP) in eighth grade mathematics. They found that the states with standards-based reform were more likely to have teachers’ self-reported classroom practices that were consistent with the standards-based models of mathematics education.

With cross-sectional observational studies like NAEP, inferences about policy impact must be made indirectly and circumspectly. Grissmer et al. (2000), for example, conclude that certain residual patterns in the data not accounted for by the model can be plausibly linked to particular reform initiatives. They acknowledge that other explanations are possible and could be developed if more comprehensive data were available. It is worth noting that the conclusions of Grissmer et al.
(2000) concerning the impact of state policies do not fully accord with those of Darling-Hammond (2000) with respect to teachers nor with those of Hanushek and his collaborators (Hanushek, Rivkin & Taylor, 1996) with respect to the impact of resource investments. In view of the current level of disagreement, these authors as well as Raudenbush et al. (1996) concur that further study is warranted.

Even for a particular type of survey data, like State NAEP, there are tradeoffs among different kinds of models. As indicated above, HLM yields a more detailed picture of the topography of achievement and its correlates, but at the cost of having to grapple with sampling variability and the quality and extent of the relevant contextual data. Indeed, Grissmer et al. (2000) make a cogent argument that an important advantage of state-level analyses is that it is possible to augment NAEP data with high quality data from other sources that should reduce model misspecification. They also include an extensive discussion of the problem of aggregation bias. We have taken these issues to heart in the development of our research design.

Framework for Evaluating State Education Policy

In our view, a rigorous investigation of the link between state education policy and student outcomes requires that policy descriptions be constructed around a framework. If successful, such a framework helps to organize data collection and analysis. Equally important, it facilitates meaningful comparisons among states. There have been a number of attempts to develop useful policy frameworks and they are described in brief below.

In their seminal paper that helped to launch the standards-based reform movement, Smith and O’Day (1991) proposed a dual strategy to influence student achievement: state leaders would initiate a coherent and directed reform centered around an instructional guidance system for schools and districts, while local decision makers would have flexibility in adopting and adapting system components to maximize the quality of classroom instruction. The components of an instructional guidance system include curriculum standards and frameworks, curricular materials, assessments, professional development, teacher licensing, teacher re-certification, technical support, accountability, and organizational capacity building (Smith & O’Day, 1991).

This theory of action has been elaborated by many others since. The central argument is that if governments set standards for student performance, and adopt aligned policies for curriculum, assessment, accountability, and governance, then educators will alter their practices, and student performance will improve as a consequence (Corcoran, 1997). The main strategy of systemic reform intended to close the achievement gap is to raise expectations and standards for all students, to align assessments and curriculum with the standards, and to rationalize the evaluation of teaching and learning.

One version of a standards-based policy framework was posited by Swanson and Stevenson (2002) in their quantitative evaluation of the impact of the national standards-based reform on classroom instructional practices. Specifically, they were interested in the correspondence between state-level activism in content standards, performance standards, aligned assessments, and professional standards, on the one hand, and classroom instructional practices, on the other. They developed a composite measure based on a set of 16 teacher-reported classroom instructional activities as the dependent variable, which were collected from a series of studies conducted by the Council of Chief State School Officers (CCSSO) on state policy actions.

Clune (1998), in the context of evaluating nine NSF statewide systemic initiatives, proposed a framework for a state’s standards-based reform efforts. The framework has four related aspects (i.e., standards-based reform, standards-based policy, standards-based curriculum, and standards-based achievement), each with a number of secondary elements.
He argued that each aspect should be rated with respect to breadth and depth, where breadth referred to the number of elements involved and depth referred both to the quality of the implementation and to the strength of influence. This framework was used to evaluate the effectiveness of nine statewide systemic initiatives based on rich case studies.

Desimone et al. (2005), cited above, review some of the relevant literature and propose a policy framework comprising five attributes: consistency, specificity, authority, power and stability. These are described and operational definitions provided. Of particular interest, they examine empirically the relationships among the attributes for the states in their sample. While the article appeared after the present work was completed, there is certainly some overlap in our frameworks. In particular, we also argue that consistency and stability are important aspects of state policy and should be incorporated in any study that seeks to link policy and education outcomes.

Edwards and the staff of Education Week (1997, 1998, 1999, 2000) developed an alternative framework for grading state efforts in raising student achievement in their Quality Counts reports. This framework addressed five aspects of state policies: achievement; standards and assessment; quality of teaching; school climate; and resources. They believed a successful public school system should have these characteristics: high standards in English, math, science and history for all children and assessments that align with those standards; teachers whose primary focus is on student learning and who possess the knowledge, skills, and commitment to teach to higher standards; schools that are organized and operated in a way that encourages and supports teaching and learning; adequate funding distributed equitably to all children and focused on the functions that matter; and all students achieving at high levels and engaged in challenging intellectual work.
They drew their state-level data from such organizations as the U.S. Department of Education, the Education Commission of the States, the American Federation of Teachers, and the Center for Education Reform.

Methods

Preliminaries

Selection Criteria for Participating States

We adopted three criteria for selecting states for the study. First, the chosen states must have participated in the 8th grade State NAEP mathematics assessment in the years 1992, 1996, and 2000. Eighth grade math was chosen because it represents the capstone of the state testing programs in mathematics mandated in the NCLB Act. There is also a significant amount of empirical evidence that math achievement, more than reading, is influenced by factors that are under state policy control, such as teacher and school characteristics. Moreover, a student’s proficiency in math at the 8th grade level is predictive of subsequent success in high school and participation in post-secondary education. Second, the selected states’ public school population had to have encompassed a minimum of 10 percent Black students. While we recognize that all disadvantaged groups confront barriers to achievement, we focused on Black students so that we could concentrate our attention on a manageable number of states. Finally, at least some of the states selected had to be considered “bellwether states”7 on educational policy over the period of interest.

Four states appeared to meet these criteria: Kentucky, North Carolina, Tennessee, and Texas. Initially, California was not included because its public school population did not encompass a minimum of 10 percent Black students. We decided to add it to the study both because it has a large number of Black students and because it is prominent in education reform. We augmented our state sample by including Maryland, Michigan, New York, South Carolina, and Virginia. These states all had three administrations of the State NAEP mathematics assessment and had a minimum of 10 percent Black students in their public school populations. We believed these states could provide useful contrasts to the other states.
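
The mechanical part of the selection criteria above can be sketched as a simple screen. The code is illustrative only: the record type `StateRecord`, its field names, and the demonstration records are hypothetical placeholders, not the data or procedure actually used in the study.

```python
from dataclasses import dataclass

# The two objective criteria stated in the text.
REQUIRED_NAEP_YEARS = frozenset({1992, 1996, 2000})
MIN_BLACK_SHARE = 0.10


@dataclass(frozen=True)
class StateRecord:
    """Hypothetical record; field names are assumptions for illustration."""
    name: str
    naep_math8_years: frozenset   # years of 8th grade State NAEP math participation
    black_share: float            # Black share of public school enrollment (0 to 1)
    bellwether: bool = False      # judged a leader in systemic reform


def passes_screen(state: StateRecord) -> bool:
    """True if the state meets both objective criteria."""
    took_all_years = REQUIRED_NAEP_YEARS <= state.naep_math8_years
    return took_all_years and state.black_share >= MIN_BLACK_SHARE


def screen_states(states):
    """Return the eligible states and whether any of them is a bellwether."""
    eligible = [s for s in states if passes_screen(s)]
    has_bellwether = any(s.bellwether for s in eligible)
    return eligible, has_bellwether


if __name__ == "__main__":
    # Placeholder records, not actual state data.
    demo = [
        StateRecord("State A", frozenset({1990, 1992, 1996, 2000}), 0.25, True),
        StateRecord("State B", frozenset({1992, 2000}), 0.30),
        StateRecord("State C", frozenset({1992, 1996, 2000}), 0.06, True),
    ]
    eligible, ok = screen_states(demo)
    print([s.name for s in eligible], ok)
```

A screen of this kind captures only the first two criteria and the group-level bellwether check. The decision to add California despite its falling below the 10 percent threshold was a matter of judgment that a mechanical filter would not reproduce.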

Analysis of State Policy Data

We used the case study approach to document state educational policies over the period 1988 through 1998. Our research questions were: What policies were in effect during this ten-year period? What was the sequence of these policies? In the aggregate, how coherent and consistent were these policies?

In order to address these questions, we developed a framework for policy analysis that guided the development of our state profile, a questionnaire on state education policy, and a follow-up interview protocol. The framework builds on the earlier work just cited and comprises five components, which we refer to below as “policy levers.” The state profile and questionnaire were revised several times, based on comments from internal and external reviewers. We submitted the final versions of the profile and questionnaire to at least two experts in each state,8 asking them to review the profile, respond to the questionnaire and to participate in an interview. We then triangulated multiple sources of evidence to create an extensive state summary that was eventually reduced to a one-page synthesis for each state.

7 We define bellwether states as those states that are generally considered to have been leaders in systemic education reform.

8 Only one expert from California was interviewed.

After evaluating the state syntheses against our framework, we classified each state into one of three ranked categories, based on our judgment as to the extent to which state policies over the period would lead us to expect substantial improvement in student test scores from 1992 to 2000. The categorization process was carried out for each of the five policy levers, as well as for an overall policy score. The next section describes each of these steps in more detail.

Establishing the Framework for Policy Analysis

The policy analysis framework was developed through an iterative process. We began by reviewing both the theory underlying standards-based reform and the efforts to evaluate the theory. We drew from the work of Smith and O’Day (1991), Swanson and Stevenson (2002), Clune (1998), and the Quality Counts framework (Edwards, 1997, 1998, 1999, 2000). Although each framework had much to recommend it, one concern we had was that none appeared to cover the full range of policy levers available to states. Another concern was that they did not address either how well policies were carried out over time or how strongly aligned they were. Our premise was that for any educational reform to have an impact, it must be developed, implemented, and refined through time (Brady, 2003).

Our framework comprises five policy levers through which states can regulate and monitor their education system: governance, education finance, curriculum and standards, teacher quality, and assessment and accountability. We further posit that an evaluation of state reforms must take into account the quality of implementation, as well as their coherence and consistency over time.

In other words, our framework portrays state policy as the means by which a state can effectively implement its statutory and regulatory authority (governance) through its ability to raise, distribute, and spend money on public education (education finance).
These financial inputs influence how many instructional and other resources are available to students and their teachers. Another way for states to influence educational policy and academic achievement is through the specification of what students should learn and be able to achieve (curriculum, standards and accountability), and who should teach them (teacher qualifications and quality). Finally, the degree to which states have coherently and consistently developed and implemented desired policies is instrumental to their success.

Developing the State Profile, Questionnaire and Interview Protocol

Employing the framework described above, we developed a sample profile of one state using information obtained from over thirty sources, including the Council of Chief State School Officers (CCSSO), the National Center for Education Statistics (NCES), Education Week Quality Counts, the Education Commission of the States, and the Consortium for Policy Research in Education. We also consulted the published research literature as well as a large number of project reports. We then shared this sample profile with five advisors.9 Based on their comments, we revised the profile and shared the second draft with them. We then developed the final version of the state profile (Appendix A), which was subsequently used for all ten states.

Using a similar iterative process, we developed a questionnaire on state education policy (Appendix B) and a follow-up interview protocol (Appendix C). We structured the questionnaire and the interview protocol around the five policy levers. Through these instruments we were particularly interested in obtaining information with respect to issues relating to coherence, consistency, the quality of implementation, perceived positive and negative effects, and lessons learned.

9 These were Margaret Goertz, Co-Director of the Consortium for Policy Research in Education; Douglas Tuthill, private consultant, St. Petersburg, FL; Michael Nettles of Educational Testing Service; Rich Coley of Educational Testing Service; and Patty McAllister of the Council of Graduate Schools.

Selecting State Experts

Except for California, for which we conducted only one interview, at least two individuals in each state were identified on the basis of their knowledge of state policy actions over the period 1988 to 1998, and their willingness to participate in this study. Typically, they work or worked in the state’s department of education or related agencies, regional education laboratories, or policy research institutions. They were sent the draft profile and the questionnaire and asked to review and correct the profile, fill out the questionnaire, send the materials to the researchers, and participate in a 45-minute phone interview. The level of cooperation was excellent. The experts substantially contributed to our understanding of the coherence, consistency, and quality of state policy actions over the period from 1988 to 1998. They also provided references to relevant materials and suggested other respondents.

Developing State Summaries and Ranking the States

Once all the state information was in hand, we triangulated the information and compiled a comprehensive “state summary” for each state. (An example for Kentucky is contained in Appendix D.10) By focusing on those issues that best differentiated the states, we further reduced the summary to produce a one-page synthesis for each state. We then evaluated each state synthesis in comparison to the others and then classified the states into one of three categories based on our judgment as to whether state policies over the period would lead us to expect substantial improvement in NAEP results from 1992 to 2000.
Stratification of Schools

An important goal of this study was to develop parsimonious descriptions of the patterns of achievement of Black students and White students, paying special attention to trends in the achievement gaps. We were particularly interested in describing those trajectories at a level below that of the state as a whole. Of course, we recognized that we would have to balance the construction of more homogeneous groups of schools from a state’s NAEP sample against the increased variability due to smaller sample sizes. This was especially problematic as the number of schools in a state sample could fall well below the target of 100 to 105 schools.

The data for all analyses were taken from NAEP Restricted Use Data Products11 and were organized into 10 x 3 = 30 data sets, one for each combination of state and administration. In each data set, the information for an individual student consists of five plausible values on the NAEP 8th grade mathematics scale, an extensive set of student/teacher/school background variables collected during the NAEP administration, and the variables employed by Westat in developing the sampling plan for the state. For further details, see Chapter 3 of Allen, Jenkins, Kulick, & Zelenak (1997). Subsequently, we linked the NAEP files to the Common Core of Data (CCD) files of the same year using NCES school codes.12

10 The full set of state summaries is available from the first author.
11 NAEP Restricted Use Data Products are available to qualified researchers by license from the National Center for Education Statistics, U.S. Department of Education.

Inasmuch as our interest was in studying trends in achievement for groups of schools within a state, we began by categorizing schools with respect to three characteristics: type of location (TOL), percentage minority (%Min), and percentage of students eligible for free or reduced price lunch (the lunch-program percentage). The first two characteristics are obtained directly from the Westat files and consist of up to eight and up to four categories, respectively.

For the lunch-program percentage, it was decided to develop two categories: schools with less than 50% of students eligible and schools with 50% or more eligible. These categories are denoted as Stratum 1 (S1) and Stratum 2 (S2), respectively. The threshold of 50% was chosen, in part, because the 1994 reauthorization of ESEA permitted schools with 50% or more of their students eligible for free or reduced price lunch to be declared Title I schools, allowing them to use Title I funds on a school-wide basis rather than having to direct them to individual students. Some researchers have used 75% as a threshold, but we found there were too few schools with percentages of eligible students above that level to yield results with acceptable levels of uncertainty.

The classification of a school as S1 or S2 was made on the basis of a NAEP school variable which identifies schools as being in one of eight categories, based on the reported percentage of students eligible for free or reduced price lunch. When that variable was missing, classification was made on the basis of a calculated ratio of the number of eligible students in the grade to the total number of students in the grade, both variables obtained from the CCD files.
When missing data precluded calculating the ratio, we attempted to impute the correct classification based on the median income for the school (from the CCD file) and the relationship between median income and the lunch-program percentage for other schools with the same type of location and minority-percentage designation. The proportion of imputations for a data set varied from 0% to about 10%, with a median value of about 5%. In only a few cases were we unable to categorize a school and forced to exclude its data from the analyses.

Clearly, there is some uncertainty in the correct classification of a school in S1 or S2 (even when there are no missing data), particularly for those schools whose true values are near the boundary value of 50%. The effect of any misclassification would be to reduce the apparent differences between the two strata on any characteristic that was correlated with the S1/S2 classification.

The number of schools in the 30 NAEP state samples ranged from about 80 to about 100. Such numbers preclude very fine stratification of schools. After consideration of the focus of the study and the issues likely to be of greatest interest to policymakers (as well as some exploratory analysis), it was decided to retain only the S1/S2 stratification for each state/administration combination. Information on type of location and percentage minority was retained for later use in modeling.

It is important to note that while the definitions of S1 and S2 are the same across administrations, the universe of schools corresponding to each stratum did change over the period. In addition to school openings and closings, the lunch-program percentage in a school could have changed enough over the eight years from 1992 to 2000 to cause the school to cross the 50% boundary.
Nonetheless, statistics based on properly weighted student scores from the schools in the NAEP sample that fall in a particular stratum (in a specific year) yield approximately unbiased estimates of the corresponding population characteristics of the stratum (for that year).

12 The Common Core of Data is an annually updated database compiled by NCES that includes, among other things, information about schools that is not available through NAEP.

With this structure we committed ourselves to carrying out a basic analysis for each combination of year by state by stratum, or 60 (= 3 x 10 x 2) analyses in all. Although this is a substantial number, it does give us the capacity to compare the time trends in the two strata within a state both with each other and with that of the state as a whole. It also provides us with the opportunity to contrast the record of a particular stratum within a state with those of equivalent strata in other states. This should yield more meaningful comparisons, giving us a more accurate picture of the degree of success a state has had in improving achievement overall, as well as in reducing the gap between Black students and White students. (Of course, since these analyses are purely descriptive, no causal mechanisms are offered or should be inferred.)

Descriptive Analyses

Our program of quantitative analysis falls naturally into two phases. The first is more exploratory and the second is more model-based. In the first phase, our approach was to proceed systematically from higher to lower levels of aggregation. That is, we began with overall comparisons among states and between strata within states. These are not only informative but also provide a substantively meaningful background against which to judge the results of the more focal analyses, which track over time the achievement gap for each stratum within each state. In our presentation of the data, we have attempted both to identify general patterns and to highlight those states or strata whose results are sufficiently different to merit our attention.
All reported statistics are calculated using sampling weights so that they are approximately unbiased estimates of the corresponding population quantities. Each statistic is accompanied by an estimated standard error, obtained through standard NAEP procedures based on the jackknife method (Burke & James, 1997). When necessary, we employed specialized methods to take account of sampling dependencies. For example, the standard error of the mean difference between Black students and White students within a stratum requires such a calculation since these students are grouped by school, with many, if not most, NAEP school samples including students of both races.

Multilevel Modeling

Rationale

The second phase relies on hierarchical linear models or HLM (Raudenbush & Bryk, 2002). This approach was motivated by the observation that the achievement gaps between Black and White students in each stratum in each state are generally large and persistent. However, in a particular stratum/state/year combination, Black students and White students are not distributed proportionately across schools. To the extent that overall school means are correlated with differences in the distributions of students of the two races, the estimate of the stratum-level achievement gap is confounded with differences in school means. One way to remove this confounding is to estimate a pooled within-school achievement gap. That is, in effect, estimate the gap for each school in which students of both races are assessed and then compute an appropriate summary of those estimates. This procedure is easily handled by HLM.

Going a bit farther, it is natural to pose two related questions: First, how much is the estimated achievement gap reduced by taking into account student characteristics other than race?


Second, what proportion of the variance in school means can be accounted for by school characteristics? Both questions should be helpful in understanding the achievement gap. The first question recognizes that students differ with respect to a number of characteristics that are associated with achievement. Indicators of some of these characteristics are available through NAEP. By including them in our models, we can estimate how much of the pooled within-school achievement gap is accounted for by these characteristics. The second question recognizes that certain school characteristics may account for differences in school means. Again, both questions, and their extensions, can be addressed through the development and estimation of an appropriate family of HLMs.

It should be noted that one unavoidable drawback to the HLM approach is that only schools with both Black and White students in the NAEP sample can contribute to the estimation of the achievement gap. For some stratum/state/year combinations, the number of such schools is rather small and leads to relatively large estimated standard errors for that contrast. However, the set of schools incorporated in the HLM analysis includes those schools with NAEP samples containing Black students but no White students or those with White students but no Black students. The latter two kinds of schools have information that can contribute to the estimation of the other regression coefficients in the model.

HLM and Plausible Values

A NAEP assessment consists of a large number of items covering a broad domain. To limit testing time, only a small fraction of the item pool is administered to any one student. To accomplish this, the item pool is first organized into blocks of items. The blocks are then paired according to a balanced incomplete block or a partially balanced incomplete block design into booklets (Lazer, 1999). Booklets are randomly distributed to students within schools.
Because each student responds to relatively few items, a point estimate of an individual’s score based on her cognitive data would be subject to substantial measurement error and, more importantly, would yield biased group estimates. Accordingly, through a rather complex process, NAEP estimates a latent proficiency distribution for each student, conditioning on the student’s responses to the cognitive items, as well as to the background questions (Allen, Johnson, Mislevy, & Thomas, 1999). Five random draws are then made from the student’s latent proficiency distribution. These are called plausible values.

The procedure for obtaining estimates of population quantities and the corresponding estimated standard errors consists of two steps:

1) Calculate the point estimate of the statistic. This is done by computing the statistic five times, once for each set of plausible values, and taking the average.
2) Estimate the (total) variance of the statistic. This is done by computing an estimate of the variance due to the sampling of students and schools, employing data from the first set of plausible values. A second component, representing measurement error, is obtained by computing the variance among the five point estimates calculated in step 1. The total variance is an appropriately weighted combination of the two components.

To facilitate applications to NAEP data, the HLM5 program was adapted for use with plausible values. This new program, HLM5-PV, automatically analyzes a NAEP dataset five times, once for each set of plausible values. (See Raudenbush, Bryk, Cheong, and Congdon (2001) for a full description of the HLM5 software.) The results displayed in the summary output of HLM5-PV reflect steps 1 and 2 above.
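The two-step procedure above can be illustrated with a minimal Python sketch. It is not the HLM5-PV implementation. The function name and the input numbers are hypothetical, and the weighting of the two variance components is an assumption: the text says only that they are "appropriately weighted," and the sketch uses the usual multiple-imputation rule, in which the between-estimate variance is multiplied by (1 + 1/M) for M sets of plausible values.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variance):
    """Combine per-plausible-value estimates into a point estimate and total variance.

    estimates: the same statistic computed once per set of plausible values.
    sampling_variance: variance due to the sampling of students and schools
        (in the text, computed from the first set of plausible values).
    """
    m = len(estimates)
    point = mean(estimates)         # step 1: average of the M estimates
    between = variance(estimates)   # variance among the M estimates (divisor M - 1)
    # ASSUMPTION: the weighting is the standard multiple-imputation rule,
    # total = sampling + (1 + 1/M) * between; the source does not spell it out.
    total = sampling_variance + (1 + 1 / m) * between
    return point, total


# Hypothetical inputs for illustration only; they are not results from the study.
pv_estimates = [270.0, 271.0, 269.0, 272.0, 268.0]
estimate, total_var = combine_plausible_values(pv_estimates, sampling_variance=4.0)
standard_error = total_var ** 0.5
```

Keeping the sampling component separate from the between-estimate component makes it easy to see how much of the reported uncertainty comes from measurement error rather than from the sampling of schools and students.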


HLM and Sampling Weights

The problem of whether and how to incorporate weights in fitting HLMs to survey data is an area of active research (Little, 2003; Chambers, 2003; Pfeffermann, Skinner, Holmes, Goldstein, & Rasbash, 1998; Pfeffermann, Moura, & Nascimento Silva, 2004). As the discussion following the earlier Pfeffermann paper indicates, there is no unanimity in the field with respect to this question, even as to whether weights should be used at all. Alternative suggestions are made, but there is no consensus on a preferred approach. The Pfeffermann et al. (2004) paper explores a promising strategy based on factoring the student design weight into two components: a component that is related to the selection of schools and another that is related to the selection of students within schools. One could then apply the weights to sample versions of certain census estimators. Unfortunately, the version of HLM that was available when the current analysis was done did not offer this option.

Upon reflection, we decided to forgo entirely the use of weights. Our rationale was that, quite aside from the difficulty of implementation, only those schools in which both Black students and White students were assessed could contribute to the estimate of the achievement gap in a particular stratum/state combination. That set of schools did not constitute a simple random sample of the full set of schools for that stratum/state. Consequently, the weights we had available were not the appropriate weights and so, even were we to employ them in some way, we would not be able to make inferences to a larger population of schools. We therefore fit the multilevel models without weights. This complicates comparisons with the descriptive analyses we have just outlined, which do use weights.
Accordingly, we carry out some intermediate analyses that enable us to estimate how much of the differences between the sets of estimates may be due to differences in the weighting strategies. The intermediate analyses are described in the section containing the HLM results.

Centering Predictors

There are several ways that level 1 predictors can be centered, e.g., no centering, centering around the grand mean of the full sample, or centering around the group (i.e., school) mean. Centering the level 1 predictors affects the values and interpretations of the regression coefficients at both levels (Raudenbush & Bryk, 2002, Ch. 2). We chose to center each level 1 predictor around its corresponding school mean. Consequently, we can interpret the estimated Black/White gap as the pooled-within-school difference between Black and White students. When other predictors (also school-mean centered) are included in the level 1 model, the estimated Black-White gap is the pooled-within-school difference between Black and White students adjusted for these other student characteristics. When all predictors are school-mean centered, the level 1 intercept is the average outcome in the school. Thus, when the level 1 intercept is treated as the criterion at level 2, the full between-school variation in school means is being modeled. Raudenbush and Bryk (2002, Ch. 5) present a similar centering strategy for estimating Black-White achievement gaps.

When there is a structured regression at level 2, all level 2 predictors (i.e., school characteristics) are grand-mean centered. With this choice, the intercept in the regression for the level 1 intercept can be interpreted as the mean response of a school with the grand mean value on each level 2 predictor.
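To make the effect of school-mean centering concrete, the following Python sketch computes a pooled within-school gap as the least-squares slope of the outcome on a school-mean-centered group indicator. It is a simplified stand-in under stated assumptions, not the HLM estimator used in the study: there are no random effects, weights, or plausible values, and the function names and the tiny data set are hypothetical. The indicator follows the BvsW coding in the text (1 for White students, 0 for Black students).

```python
from collections import defaultdict


def pooled_within_school_gap(records):
    """Least-squares slope of score on the school-mean-centered indicator.

    records: iterable of (school_id, indicator, score), with indicator 0 or 1.
    Centering within schools removes differences in school means, so only
    within-school contrasts contribute to the slope.
    """
    by_school = defaultdict(list)
    for school, x, y in records:
        by_school[school].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for rows in by_school.values():
        x_bar = sum(x for x, _ in rows) / len(rows)
        y_bar = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


def unadjusted_gap(records):
    """Difference in overall group means, ignoring school membership."""
    ones = [y for _, x, y in records if x == 1]
    zeros = [y for _, x, y in records if x == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


# Hypothetical data: schools differ in mean score and in group composition.
data = [
    ("A", 1, 290.0), ("A", 1, 290.0), ("A", 1, 290.0), ("A", 0, 280.0),
    ("B", 1, 260.0), ("B", 0, 250.0), ("B", 0, 250.0), ("B", 0, 250.0),
    ("C", 0, 240.0), ("C", 0, 240.0),  # single-group school: adds nothing to the slope
]
within_gap = pooled_within_school_gap(data)
overall_gap = unadjusted_gap(data)
```

In this constructed example the gap inside each mixed school is 10 points, while the difference in overall group means is much larger because the groups are unevenly distributed across higher- and lower-scoring schools. School C, which has students from only one group, has a centered indicator of zero for every student and so does not contribute to the gap estimate, which mirrors the limitation noted in the Rationale subsection.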


Model Building

Foundations

HLM can accommodate a large number of predictors at both level 1 (between students within schools) and level 2 (between schools). A fully specified HLM will include every level 1 parameter as an outcome at level 2. The resulting model can be difficult to interpret. It can be further complicated by classical regression problems, such as suppressor effects and multicollinearity. In order to construct a meaningful and usable model, it is advisable to proceed by developing and testing simple models, which are systematically augmented and pruned until an acceptable final model is obtained (Raudenbush & Bryk, 2002). The rest of this subsection describes the process we carried out. Note that in each model all random errors are assumed to be mutually independent.

Unstructured Model

The usual starting point is an unstructured model, in which there are no predictors at either level 1 or level 2. It has the following form:

Level 1: $(\text{math proficiency})_{ij} = B_{0j} + e_{ij}$ for student $i$ in school $j$

Level 2: $B_{0j} = \gamma_{00} + r_{0j}$

with $\operatorname{var}(e_{ij}) = \sigma^2$ and $\operatorname{var}(r_{0j}) = \tau_0^2$.

This is equivalent to an analysis of variance in which $\sigma^2$ is the variance within groups and $\tau_0^2$ is the variance between groups. This model is instructive because it tells us how much of the total variance in the outcomes is between schools. If there is none (or very little), a multi-level analysis is not needed. An estimate of the proportion of variance between groups, or the intraclass correlation, is calculated as:

$$\frac{\tau_0^2}{\sigma^2 + \tau_0^2}$$

Structured level 1 model, unstructured level 2 model

The next step is to introduce predictors at level 1, leaving level 2 unstructured. The predictors are represented by $X^*$’s, with the asterisks denoting that the predictors are school-mean centered. With this family of models, we can identify those student-level predictors that are statistically related to the outcome.
This is exactly the kind of exploration that takes place in ordinary regression analysis. At this stage, we can also determine whether any of the level 1 regression parameters has a random component; i.e., whether a regression parameter varies substantially across schools. The model has the form:


Level 1: $(\text{math proficiency})_{ij} = B_{0j} + B_{1j}X^*_{1ij} + \cdots + B_{pj}X^*_{pij} + e_{ij}$ for student $i$ in school $j$

Level 2:
$B_{0j} = \gamma_{00} + r_{0j}$
$B_{1j} = \gamma_{10} + r_{1j}$
$\vdots$
$B_{pj} = \gamma_{p0} + r_{pj}$

with $\operatorname{var}(e_{ij}) = \sigma^2$ and

$$\begin{pmatrix} \tau_0^2 & \tau_{01} & \cdots & \tau_{0p} \\ \tau_{10} & \tau_1^2 & \cdots & \tau_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ \tau_{p0} & \tau_{p1} & \cdots & \tau_p^2 \end{pmatrix}$$

representing the variance-covariance matrix of the residuals in the level 2 model. By testing whether the variance terms, $\tau_0^2, \tau_1^2, \ldots, \tau_p^2$, are significantly different from zero, we can decide which of the regression coefficients $B_{0j}, B_{1j}, \ldots, B_{pj}$ to treat as fixed and which as random. Specifically, if the variance term corresponding to a level 1 regression coefficient is nonzero, then we conclude that parameter varies over schools and should serve as a criterion in a between-schools regression.

The NAEP database makes available a large number of student covariates. These are described in Appendix E. Following a series of exploratory analyses, the initial pool of student covariates was reduced to three variables to be included in the level 1 model. Predictors were retained if the associated regression coefficient was of constant sign and statistically significant for most of the 60 analysis sets. The three covariates retained were: student socioeconomic status (SES), student academic focus (AcadFoc), and the Black vs. White contrast (BvsW).

Briefly, SES is an index based on a combination of parental education and the number of reading-related items in the home. AcadFoc is an index based on a combination of types and characteristics of math classes taken, student effort, and student beliefs. (Note: The components of AcadFoc vary across years.) BvsW is 1 for White students, 0 for Black students, and missing for other racial/ethnic groups. For further details, consult Appendix E.

Once the final level 1 model was obtained, the intercept and the three regression coefficients were tested to determine if the corresponding variance components were nonzero.
Only the intercept term, reflecting the school mean, had a variance component significantly different from zero. In terms of the model parameters, the variance associated with the intercept, $\tau_0^2$, was found to be significantly different from zero, while the variances associated with the other regression terms were found to be not significantly different from zero. The inference is that typically, within a stratum, the school means were significantly different from one another, but the vectors of regression coefficients are the same across schools. (Of course, this finding substantially simplified the final set of analyses.) The model then takes the form:


Level 1: $(\text{math proficiency})_{ij} = B_{0j} + B_{1j}X^*_{1ij} + B_{2j}X^*_{2ij} + B_{3j}X^*_{3ij} + e_{ij}$ for student $i$ in school $j$

Level 2: $B_{0j} = \gamma_{00} + r_{0j}$

with $\operatorname{var}(e_{ij}) = \sigma^2$, $\operatorname{var}(r_{0j}) = \tau_0^2$, and $B_{1j} = B_1, \ldots, B_{3j} = B_3$ for all $j$.

Structured level 1 model, structured level 2 model

Once the level 1 structure is finalized (i.e., predictors are chosen and it is determined which are fixed and which random), predictors for the random effects are introduced in level 2. In this study, predictors were retained if they were significant in most of the 60 analysis sets. The final model contained 4 predictors in the regression model with the school intercept from level 1 as the criterion. The predictors are Average School SES (AggSES), Percent Black Students Assessed (AggBvsW), School Climate, and Aggregated Academic Focus (AggAcadFoc). The predictors are represented by $W^*$’s, with the asterisks denoting that the predictors are grand-mean centered. Consult Appendix E for definitions. The final model has the following form:

Level 1: $(\text{math proficiency})_{ij} = B_{0j} + B_{1j}X^*_{1ij} + B_{2j}X^*_{2ij} + B_{3j}X^*_{3ij} + e_{ij}$ for student $i$ in school $j$

Level 2: $B_{0j} = \gamma_{00} + \gamma_{01}W^*_{1j} + \gamma_{02}W^*_{2j} + \gamma_{03}W^*_{3j} + \gamma_{04}W^*_{4j} + r_{0j}$,

with $\operatorname{var}(e_{ij}) = \sigma^2$, $\operatorname{var}(r_{0j}) = \tau_0^2$, and $B_{1j} = B_1, \ldots, B_{3j} = B_3$.

Defining trends for the analysis

The full study analyzed data on the mathematics achievement of grade 8 students at three time points, 1992, 1996, and 2000. However, for the purpose of presenting findings on trends, only data from years 1992 and 2000 will be reported. After careful consideration, the authors decided that including the 1996 data would unduly complicate the summaries while contributing only marginally to the link with policy analysis.

When we report on the estimated Black-White achievement gaps from the HLM analyses, we present two sets of estimates. The first set contains what we term adjusted gaps because they are free of average differences in mean scores among schools.
Specifically, in the model below, $B_1$ represents the average difference between Black and White students attending the same schools, pooled across schools in the stratum.13 That is, $B_1$ is treated as fixed across schools. $B_{0j}$ represents the mean over all students in school $j$ and is allowed to vary randomly over schools. The regression models are:

13 Clearly, these estimates draw only on data from schools for which the NAEP sample included both Black and White students.


Level 1: $(\text{mathproficiency})_{ij} = B_{0j} + B_1 \cdot BvsW_{ij} + e_{ij}$ for student $i$ in school $j$, where $BvsW_{ij}$ is the Black-White indicator deviated from its respective school mean.

Level 2 (no predictors): $B_{0j} = \gamma_{00} + r_{0j}$

Table 1
New York S2 (High Poverty Stratum). Pooled Within-School Black-White Achievement Gaps.

Model            1992    2000    1992–2000
Adjusted         17.9     6.9    11.0
Fully Adjusted   14.2     5.1     9.1

Estimates of trends are simply differences of the estimates of $B_1$ in 1992 and 2000. For example, see the adjusted results in Table 1 for S2 (the high-poverty stratum) in New York, which indicate that, over the period of interest, the achievement gap was reduced by 11 points.

The second set of estimates contains the fully adjusted gaps. These estimates are the pooled-within-schools estimated achievement gaps, adjusted for student differences in SES and Academic Focus, and are also free of differences in mean achievement across schools. (Here, the coefficient of BvsW is a partial regression coefficient with respect to the other predictors in the model.) While $B_{0j}$, the average outcome in school $j$, is allowed to vary randomly over schools, the three regression coefficients are assumed to be constant across schools. Using data from the same reduced set of schools as before, the second set of estimates is derived by fitting the model below:

Level 1: $(\text{mathproficiency})_{ij} = B_{0j} + B_1 \cdot SES^*_{ij} + B_2 \cdot AcadFoc^*_{ij} + B_3 \cdot BvsW^*_{ij} + e_{ij}$ for student $i$ in school $j$, where the superscript ‘*’ indicates predictors deviated from school means.

Level 2 (no predictors): $B_{0j} = \gamma_{00} + r_{0j}$

The fully adjusted results in Table 1 indicate a reduction in the achievement gap of 9.1 points. To the extent that the fully adjusted gaps are smaller than the adjusted gaps in a particular year, it is possible to regard the covariates, SES and AcadFoc, as accounting for some of the observed differences in achievement between Black and White students.
On average, we find that the fully adjusted gaps are smaller than the adjusted gaps by about 30 percent. Interpreting trends in fully adjusted gaps is difficult, however, since they are a function of differences between cohorts in both achievement and in the student characteristics employed as predictors.
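The arithmetic behind these trend summaries can be checked with a short Python sketch using only the New York S2 figures from Table 1. The function and variable names are illustrative assumptions, and the sketch reproduces the table's differences and the relative size of the two gap estimates for this one stratum; it does not reproduce the cross-state average.

```python
# Pooled within-school gaps for New York S2, copied from Table 1.
gaps = {
    "adjusted": {1992: 17.9, 2000: 6.9},
    "fully_adjusted": {1992: 14.2, 2000: 5.1},
}


def gap_change(series, start=1992, end=2000):
    """Reduction in the gap between two years (positive means the gap narrowed)."""
    return round(series[start] - series[end], 1)


def share_accounted_for(adjusted, fully_adjusted):
    """Proportion by which the fully adjusted gap is smaller than the adjusted gap."""
    return (adjusted - fully_adjusted) / adjusted


# Trend estimates: differences of the 1992 and 2000 gap estimates.
trend_adjusted = gap_change(gaps["adjusted"])
trend_fully_adjusted = gap_change(gaps["fully_adjusted"])

# Share of the adjusted gap associated with SES and AcadFoc, by year.
share_by_year = {
    year: share_accounted_for(gaps["adjusted"][year], gaps["fully_adjusted"][year])
    for year in (1992, 2000)
}
```

The two trend values match the last column of Table 1 (11.0 and 9.1 points). For this stratum the fully adjusted gap is smaller than the adjusted gap by roughly a fifth in 1992 and roughly a quarter in 2000.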


Findings

Education Policy

All ten states embarked on some type of education reform during the period 1988 through 1998. Many of their efforts focused on setting academic standards for all students, adopting measures to improve teacher quality, developing new assessments of student academic achievement, and establishing accountability systems that were at least partially focused on student outcomes. Employing the policy framework we have developed, we now summarize our findings from a cross-state evaluation of the state profiles, expert interviews, and state syntheses. For each of the five policy levers in our framework, we identified a few key questions.

Governance and the politics of reform. What was the governance structure and politics around governance, in the context of reform? What was the balance between central and local control and did it change over the period? Who were the principal drivers of education reform? What were the main reform mechanisms?

Education finance. To what extent was there continuing commitment to improving the funding of education? What was the level and trajectory of the proportion of state funding in education? What was the level and trajectory of expenditures per student? What was the level and trajectory of the funding gap between high- and low-poverty districts?

Curriculum and standards. To what extent was there an ongoing commitment to improving curriculum and standards, especially in mathematics? Was there a strong state curriculum in mathematics (with statewide textbook adoption and alignment of textbooks with the curriculum)? Was the mathematics curriculum deep and rigorous?

Teacher quality. To what extent were there meaningful initiatives related to teacher quality? Was there a middle grades content-specific teacher certification, especially in mathematics? Was recertification tied to professional development?
How well were teachers compensated compared to the nation as a whole and to the other states in the study? What was the extent of out-of-field teaching and how did it change over the period?

Assessment and accountability. Was there a broad commitment to assessment, especially of high-level skills? Was there continuity in assessment policy? Was there consistency in accountability policy? Was there a strong accountability system (with an effective system of sanctions and rewards) for both Title I and non-Title I schools?

Analysis of Education Reform Policy Levers

Governance and the politics of reform. The ten states varied substantially with respect to state control (see Table 2). For instance, California, New York, North Carolina, Texas, and Virginia all had strong state controls, whereas Maryland, Michigan, South Carolina, and Tennessee had strong local controls. Kentucky fell somewhere in between, as authority was divided between the state department of education and the school districts.


Table 2
Governance and Main Drivers of Reform

State            Strong    Governor,     Court   Business  Higher     Grassroots &
                 State     State         Case    Groups    Education  Special
                 Control   Legislatures                    System     Interest Groups
California       X         X                               X          X
Kentucky         X¹        X             X²      X                    X
Maryland                   X
Michigan                   X                     X
New York         X         X
North Carolina   X         X                     X
South Carolina             X
Tennessee                  X             X³
Texas            X         X             X⁴      X
Virginia         X         X

¹ Implementation of education reforms in Kentucky was divided between the state department of education and school districts. The state department of education created and managed the assessment and accountability systems, while the schools and school-based decision making councils decided on the curriculum and resource distribution.
² In 1989, the Kentucky Supreme Court ruled that the state’s public school system was unconstitutional.
³ In 1993, the Supreme Court of Tennessee ruled that the state’s education finance system violated its constitution’s equal protection clause.
⁴ In 1987, a Texas district judge ruled that the state’s education finance system violated the state’s 1984 Equal Opportunity Act, which aimed to reduce the monetary gap between funding for rich and poor school districts. In September 1991 the state began to comply by increasing spending in the poorest districts.

As the entries in Table 2 indicate, in some states reform was driven “from the top,” by the governor, the state board of education, or the superintendent of schools. These included Maryland, New York, South Carolina, and Virginia. In Kentucky, Michigan, North Carolina, and Texas, business groups also played a central role in providing impetus for reform. Moreover, Kentucky and California stand out as having had a wide variety of constituencies shaping education reform during the period.
For instance, the main drivers of Kentucky’s educational reform during the late 1980s through the 1990s were a combination of local coalitions of grassroots non-governmental organizations, a coalition of school districts, a definitive state Supreme Court ruling, the governor, the General Assembly, the state board, and the commissioner of education. In California, the governor, some state legislators, and the superintendent of education were very active in the state’s first wave of education reform from the mid-1980s through the mid-1990s. However, the state’s higher education system and a large number of non-profit organizations and special interest groups also had a significant impact on the nature and the fate of the state’s education initiatives. California also experienced considerable conflict with respect to both control of reform and the substance of the reforms (Wilson, 2003).

Education Finance. Clearly, funding is essential to the functioning of any education system. However, the impact of the funding is dependent on how it is allocated across the state, the activities it supports, and the efficiency with which it is employed. Table 3 describes the ten states with respect to three indicators of state policies related to education finance: average proportion of state funding, average expenditures per student, and funding disparities among school districts. As described in the table, the last indicator comprises two different statistics. The first is a measure of the funding disparities among districts in the state during the early part of the period of interest. (There is an implicit assumption that, generally, disparities favored lower-poverty districts.) The second is a direct measure of the disparities between lower- and higher-poverty districts, compiled from data toward the latter part of the period of interest. The table entries are the value of the indicator and a subjective ranking of the states into one of three categories.¹⁴ A rank of 1 corresponds to the highest category and a rank of 3 to the lowest category. We also constructed an overall ranking based on combining the three indicators, with the average level of expenditures per pupil having the greatest weight and the average proportion of state funding the least weight. California was assigned rank 1 overall as it was the only state not in the lowest category for any indicator. Note that its per-pupil expenditures straddle the median for that indicator. At the other end of the scale, Virginia was the only state to be placed in the lowest category for three of the indicators. We did not find it possible to distinguish among the remaining eight states.

Interestingly, Kentucky, along with Tennessee and Virginia, employed education finance reform as one of their main policy levers during the period studied. In particular, Kentucky redesigned its school funding system to increase funding for students who required more time to achieve academic success and for teachers who could help these students succeed. Under the new funding formula, districts were required to meet certain local revenue-raising benchmarks; however, those with small tax bases and/or limited property values were entitled to additional state funding. As a result, Kentucky made some progress toward greater equity in spending per student.

Through Public Act 145 and Public Act 335, both enacted in 1993, Michigan completely restructured education funding.
Local property tax was eliminated as the source of funding for the operating costs of K–12 public schools; instead, schools were funded through other sources, including a two-percentage-point increase in the sales tax and use tax and a 50-cent increase in the cigarette tax. As a result, the state’s share of education expenditures increased from about 30% to about 70% in just one year. PA 335 also increased support for professional development, at-risk students, and Math/Science Centers. It also extended the length of the school day from five to six hours.

In 1992, Tennessee overhauled its education finance system by creating the Basic Education Program (BEP), a regression-based formula that determined the funding level required for each school system to achieve a common and basic level of service for all students. Monies from BEP were allocated to both classroom and non-classroom components, including teachers’ salaries, technology, and other school improvements. After five years of graduated increases, full funding was attained in the 1997–98 school year.

¹⁴ A state’s ranking on an indicator is relative to the values of the other states on that indicator. We do not have absolute standards by which we can judge a state’s record.
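
Table 3 summarizes early-period funding disparities among districts with a Gini coefficient. As a rough illustration only, the sketch below computes an unweighted Gini coefficient from a list of district per-pupil revenues. The function name `gini` and the sample figures are hypothetical, and the calculation in Hussar & Sonnenberg (2000) may differ, for example by weighting districts by enrollment.

```python
def gini(values):
    """Unweighted Gini coefficient: 0 means perfect equality; larger means more disparity."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Rank-weighted sum over the sorted values (ranks start at 1).
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * ranked_sum / (n * total) - (n + 1) / n


# Hypothetical per-pupil revenues (dollars) for five districts in one state;
# these are illustrative values, not data from the study.
district_revenue = [4800, 5200, 5500, 6100, 7400]
print(round(gini(district_revenue), 3))
```

On this scale, a value near 0.05 (North Carolina in Table 3) indicates fairly even funding across districts, while a value near 0.10 (Michigan, Tennessee, Virginia) indicates wider disparities.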


Table 3
State Education Finance Indicators and Overall Ranking. Ranks in parentheses.

                  Average          Average Funding Disparity       Average
                  Proportion of    Between High- and               Level of
                  State Funding¹   Low-Poverty Districts           Expenditures⁴   Overall
State             (percentage)     Gini²        Per-pupil gap³     ($/student)     Ranking
California        62.2 (1)         0.082 (2)    $35 (1)            $5,858 (2)      1
Kentucky          65.7 (1)         0.092 (3)    -150 (1)           5,576 (3)       2
Maryland          38.2 (3)         0.083 (2)    701 (3)            7,645 (1)       2
Michigan          43.0 (2)         0.100 (3)    1,261 (3)          7,690 (1)       2
New York          40.6 (2)         0.099 (3)    2,794 (3)          9,770 (1)       2
North Carolina    65.2 (1)         0.047 (1)    413 (2)            5,591 (3)       2
South Carolina    49.5 (2)         0.049 (1)    427 (2)            5,506 (3)       2
Tennessee         45.7 (2)         0.101 (3)    -138 (1)           4,917 (3)       2
Texas             42.2 (2)         0.069 (1)    386 (2)            5,718 (2)       2
Virginia          32.1 (3)         0.100 (3)    879 (3)            6,470 (2)       3

¹ Calculated by the National Center for Education Statistics (NCES) based on 1987–88 through 1997–98 data from the Common Core of Data (CCD) collection of surveys.
² Funding disparities are quantified by two methods. The first is the Gini coefficient, with higher values corresponding to greater disparities. Calculations are based on the Gini coefficients reported by Hussar & Sonnenberg (2000), which used the 1988 through 1990 financial data collected by the Census Bureau as part of the Census Government F-33 School District Finances Survey.
³ The second funding-disparity measure is based on the Education Trust’s (2002) calculation of the gap between the highest- and lowest-poverty districts in 1997. The reported figures above are from Table 2 of the Education Trust (2002), The funding gap: Low-income and minority students receive fewer dollars. Washington, DC: Education Trust. The numbers read as follows: In 1997, the highest-poverty districts in California received $35 less per student in state and local revenues than the lowest-poverty districts. The highest-poverty districts in Kentucky, on the other hand, received $150 more per student in state and local revenues than the lowest-poverty districts.
⁴ Constant prices at 2000–01, calculated by NCES based on 1992–93 through 1997–98 data from the Common Core of Data (CCD) collection of surveys.

Curriculum and Standards. Table 4 compares the extent to which the ten states had a continuous commitment to mathematics curriculum and standards. In particular, we examined the rigor and depth of the mathematics curriculum and associated standards during the period, as well as whether there was a statewide textbook adoption. All ten states had some type of mathematics learning standards in place by 1998. The standards evolved from being relatively vague to being more specific, and from focusing on basic-level skills to focusing on higher-order thinking skills. On the other hand, we found that states varied greatly in terms of rigor, depth, and quality of the mathematics curriculum and standards implemented, as well as the extent to which there was a statewide textbook adoption policy. The experience in California is of special note, as there were deep disagreements about the mathematics curriculum, which influenced assessment policy and other reform efforts.


Table 4
Curriculum and Standards

                  Mathematics      High quality     Statewide textbook
                  curriculum or    curriculum or    adoption in
State             standards        standards        mathematics
California        X                X¹               X²
Kentucky          X³                                X⁴
Maryland          X⁵               X⁶
Michigan          X⁷
New York          X⁸               X⁹
North Carolina    X                X                X
South Carolina    X                X¹⁰              X¹¹
Tennessee         X                X¹²              X¹³
Texas             X¹⁴                               X¹⁵
Virginia          X¹⁶              X                X

¹ California’s state curriculum standards were considered to be high quality when first adopted but did not keep pace with the changing standards in math education. By 1996, they were considered only to be of adequate quality.
² California had statewide textbook adoption for grades K–8.
³ The Kentucky content standards went through a process of evolution; the state developed its core content standards in 1998, at the end of the period being studied.
⁴ Kentucky has a list of approved textbooks but schools make their own choices.
⁵ The Maryland School Performance Assessment Program (MSPAP) shaped Maryland’s curriculum to a great extent.
⁶ The MSPAP became the de facto set of standards and was generally considered to represent high standards.
⁷ Michigan did not make its state standards mandatory for school districts.
⁸ New York created state syllabi for every grade level and subject area. At the high school level, there are two sets of standards, leading to two types of diplomas, Regents and local.
⁹ New York’s high school math curriculum leading to the Regents diploma was broad, deep, and rigorous. Curriculum for the local diploma was much less demanding.
¹⁰ Until 1993, South Carolina had very low-level math standards based on its basic skills tests. In 1993, math standards were completely rewritten to meet the national standards promoted by the National Council of Teachers of Mathematics. Grade-by-grade achievement standards were then created and implemented in 1998.
¹¹ South Carolina provided districts with a list of approved textbooks; however, local districts were able to select other texts and add to this eligible list of books. It was not until 1998 that the textbook list was aligned with the newly enhanced frameworks.
¹² Tennessee had strong and clear content standards into the mid-1990s but then shifted to a more locally controlled set of standards.
¹³ Tennessee has had statewide textbook adoption for more than 20 years. Although texts are nominally supposed to cover state standards, variation in both coverage and quality has been great, with a large number of texts being approved.
¹⁴ Comprehensive statewide standards were only adopted in Texas near the end of our period of study.
¹⁵ The state recommends (and subsidizes) but does not require textbook adoption in all subject areas. During the 1988–98 period, there were not sufficient state standards to make textbook selections based on alignment.
¹⁶ Since 1995, Virginia has demonstrated a strong commitment to a set of high quality learning standards.


In our view, the ten states fell into one of three categories during the period of interest. One state had a consistent commitment to curriculum and standards with a strong state mathematics curriculum: North Carolina. Other states had some commitment to curriculum and standards. A few of these states had a statewide textbook adoption policy that allowed either an extensive list of state approved mathematics textbooks, or substantial flexibility for local adoption that resulted in great variations in the quality and content of textbooks being used across school districts. Others demonstrated their commitment to standards only towards the end of the period being studied. These states are California, New York, South Carolina, Tennessee, and Virginia. The other states had inconsistent or weak commitments toward standards and curriculum: Kentucky, Maryland, Michigan, and Texas.

Teacher Quality. Table 5 compares states on various dimensions of teacher quality initiatives during the period. We were especially interested in how the states differed in terms of their middle grade math certification requirements, the amount of professional development required for recertification, the extent to which the states established professional development centers and standards, the degree of out-of-field teaching in core subject areas in grades 7 to 12, and teacher salary levels. We found that only some states had a continuous and comprehensive commitment to teacher quality during the period.
Table 5
Comparison of State Commitment to Teacher Quality

State | Middle grades math certification | Professional development tied to recertification every 5 to 7 years | State professional development centers/standards | Extent of out-of-field teaching in core subject areas for teachers in grades 7–12 | Salary level
California X X Middle High
Kentucky X High Middle
Maryland Middle High
Michigan X Low High
New York X X Low High
North Carolina X¹ X X Low Low
South Carolina X X X Middle Low
Tennessee X² High Middle
Texas X High Low
Virginia X³ X Middle Middle

¹ Available since 1996.
² Available since 1997.
³ Available but voluntary.

We grouped the states again in three categories. The states with continuous commitment to teacher quality: New York, North Carolina, and South Carolina. States with inconsistent commitment to teacher quality: California, Kentucky, Michigan, and Texas. States with weak or no commitment to teacher quality initiatives: Maryland, Tennessee, and Virginia.

More specifically, North Carolina, after 1996, required all new teachers to participate in a two-year mentoring and evaluation program; furthermore, middle school teachers were required to be licensed in a particular subject area, resulting in the state’s having the lowest percent of out-of-field teaching in core subject areas among the ten states. During the study period, New York maintained a network of teacher centers to support professional development in localities across the state. In addition, beginning in the 1980s, middle grade teachers had to achieve subject-specific certification. This is reflected in the relatively low percent of out-of-field teaching in core subject areas found in New York. South Carolina supported a host of teacher improvement activities both statewide and targeted at low-performing schools. Although the state offered middle school certification, it was not required. This may explain why South Carolina ranks near the middle of the group of ten states with respect to out-of-field teaching in core subject areas.

California had a number of initiatives to improve teacher quality, both in terms of developing the skills of practicing teachers and enhancing the preparation of teachers coming into the system. However, with the class size reduction program and the steady population growth in the state over the period, increasing numbers of students were being taught by teachers with emergency certification or who were not certified for that subject. Kentucky revised its less rigorous standards for middle grade certification in 2000, when the state began to require new and re-certified teachers to pass a national teaching exam and to have two field specialties. However, the lateness of this initiative may explain why Kentucky consistently had the second highest proportion of classes with out-of-field teachers among the ten states. Michigan maintained rigorous teacher certification and recertification programs, combined with high teacher salaries. The state has nearly the lowest percentage of out-of-field teaching among the states studied. Texas had invested in small-scale regional centers and university-based outreach programs.
The state maintained its teacher certification program, which grants a permanent teaching certificate after the candidate has completed 30 semester hours of graduate coursework. As the state does not require specific middle grades certification, Texas, with a steady growth in the school-age population, has one of the highest percentages of out-of-field teaching.

Of the last group of states, Maryland, except for the granting of teacher certifications, essentially relegated responsibility for teacher standards and evaluations to the school districts and schools. Virginia, on the other hand, has had sporadic efforts to improve teacher quality but has largely left this effort to local school districts. Furthermore, the state offered an endorsement for middle school math instruction. Although the state also demanded the accumulation of professional development points for re-certification, points could be accrued through a wide range of activities that could be certified by the district. Beginning in 1994, Tennessee required all new teachers to have an academic major, a full semester of student teaching, and a strong general education core. For existing teachers, however, the state commitment to teacher professional development was minimal and varied with the amount of money available in the state budget. Middle grade certification was created in 1997 but was not content specific. Tennessee had the highest rate of out-of-field teaching of any state in this study. Tennessee also implemented a value-added model (Sanders & Horn, 1997) to evaluate teacher effectiveness. The use of the results was not mandated and fewer than half of the districts employed it in a serious and consistent fashion.

Assessment and Accountability. Table 6 presents a comparison of the ten states on key aspects of their assessment and accountability policies. States were examined with respect to the consistency of their commitment to assessment, the types of tests that were used, their ongoing commitment to accountability, and the effectiveness of their system of rewards and sanctions. Not surprisingly, we found that all the states had policies on assessment and accountability but with varying degrees of focus and with varying effectiveness in holding districts, schools, teachers, and students accountable.


Table 6
Commitment to Assessment and Accountability

State | Consistent commitment to assessment | Assess higher-order skills | Consistent commitment to accountability | High-stakes accountability system of sanctions and rewards
California X¹ X²
Kentucky X¹ X X X³
Maryland X¹ X X
Michigan X X⁴
New York X⁵ X X
North Carolina X⁶ X X X
South Carolina X X
Tennessee X⁷
Texas X X X
Virginia X¹

¹ The type and form of assessment changed during the years of interest.
² California changed from relying on a more open-ended assessment that was aligned with relatively advanced learning objectives in 1993–1994 to norm-referenced standardized tests in 1995–1998.
³ Kentucky’s system of sanctions and rewards has evolved over time with increasing rewards and sanctions for schools and teachers.
⁴ The content of the Michigan assessment evolved from basic skills to higher-level skills during the years of interest.
⁵ New York maintained consistent commitment in assessing middle and high school students, but not elementary school students.
⁶ North Carolina has consistently assessed elementary and middle school students since the late 1970s, but only consistently assessed high school students after 1996.
⁷ Tennessee had a consistent assessment plan in the elementary and middle grades between 1992 and 1998, using a norm-referenced test in grades three through eight.

Three states had continuous and consistent commitment in assessment and accountability, with a strong accountability system incorporating sanctions and rewards: Kentucky, North Carolina, and Texas. Five states had continuous commitment to higher-level skills assessment, but had low-stakes accountability systems: California, Maryland, Michigan, New York, and South Carolina. The two remaining states, Tennessee and Virginia, had consistent commitment to basic skills assessment but had weak accountability systems.

More specifically, Kentucky demonstrated a clear commitment to assessment and accountability during the period of interest, although the mode of assessment, the level of accountability, and the system of rewards and sanctions evolved over time. Since the passage of the Kentucky Education Reform Act (KERA) in 1990, the mode of the standards-based assessment has changed from a norm-referenced to a performance-based and open-response portfolio system. All students in grades 4/5, 7/8, and 11/12 were assessed during this time. The state had a consistent commitment to accountability, although the level of focus shifted from the district to the schools and teachers. From 1988 to 1992, the system focused on accountability for whole school districts. From 1993 to 1998, accountability was much more centered at the school level. Furthermore, the rigor of the accountability system increased during the period of interest. For example, rewards and sanctions for teachers were imposed beginning in 1993 or 1994.


North Carolina had a consistent focus on assessment and accountability throughout the period of interest. The state had used some form of a statewide math assessment since the late 1970s in the elementary and middle grades. During the ten years studied here, the state used a multiple-choice exam that was aligned with state standards. The state also showed consistent commitment to accountability, focusing first on district level accountability. Beginning in 1996, it then focused on school and student level accountability with the introduction of the School Based Management and Accountability Program (commonly known as the ABCs). Furthermore, the state redirected its attention from both input and output measures to only performance-based measures after 1996. It imposed strong accountability on both Title I and non-Title I schools, with a system that used student progress (based on test scores) to award teacher bonuses, as well as school recognition or sanctions. For schools that were severely underperforming, the state had takeover teams of administrators and master teachers who worked for a period of a year to improve practice and performance.

Texas maintained a continuous effort in assessing students in basic skills and held schools accountable during the period. Since 1993–94, the state has demonstrated a consistent commitment to performance-based accountability by sending the message that improved student performance on the state test (TAAS) was the highest priority for schools and districts. Texas had the same system of accountability for both Title I and non-Title I schools. Sanctions on under-performing schools ranged from public reporting to reconstitution. In addition, student accountability came with the required exit exam.
However, as the pressure to produce higher test scores intensified, many low-performing schools began to teach to the test, stage pep rallies, and conduct cram sessions and mock tests in the hope of raising scores. At higher-performing schools, other measures such as the SAT, AP tests, or in-school performance measures continued to define performance. TAAS performance remained high and of relatively little concern at the higher-performing schools.

Of the remaining states, five (California, Maryland, Michigan, New York, and South Carolina) had continuous commitment in accountability and assessment, although the mode or the type of assessment changed over time and the sanctions and rewards were not well articulated or implemented. The remaining two states (Tennessee and Virginia) had inconsistent commitment to assessment and almost no accountability system in place during the period of interest.

California was a leader in the drive for aligning assessment with curriculum and, in mathematics, structuring assessment to tap higher-order skills. The California Learning Assessment System (CLAS) was introduced in 1993 and was a model of a progressive assessment that employed both multiple-choice items and open-ended tasks. CLAS immediately ran into difficulties, due in part to its novelty and in part to its inability to meet all the goals set for it. Political battles around the curriculum, as well as a somewhat unfavorable review by a technical advisory committee, led to the early demise of CLAS. The state then shifted back toward more traditional norm-referenced tests and has continued to tinker with both the grades tested and the assessments used. Accountability during this period generally focused on public reporting of school results, and there was little in the way of rewards and sanctions.

Maryland continuously assessed students during the period of interest. The form of this assessment changed dramatically in 1991, when the state introduced the Maryland School Performance Assessment Program (MSPAP). The MSPAP was a performance-based test that assessed students in grades 3, 5, and 8 in math and English. The test did not yield individual scores and was used to measure class and school performance. The state used MSPAP for measuring both Title I and non-Title I schools’ progress and reported achievement status. It appears, however, that meaningful sanctions were not imposed on low-performing schools, though there was some indication that teachers at the school level would get together and talk about how to improve student test performance.


Michigan has shown a continuous commitment to assessment over the past 30 years. Test content improved markedly in 1990, when the tests evolved into an assessment of higher level essential skills needed for academic and professional advancement. Michigan tested students in math and reading in grades 4, 7, and 11; and in writing, social studies, and science in grades 5, 8, and 11. With the exception of the writing test, the assessments were all criterion-referenced, multiple-choice exams. Michigan maintained a very low-stakes accountability system. Although the state depended heavily on public reporting and school choice, it appears that this has had only limited influence on changing teacher practice. Michigan had a very weak accountability system for non-Title I schools and a slightly more rigorous system for Title I schools. Neither system resulted in significant sanctions for schools. Students and districts had almost no accountability at the state level during this time period.

New York has a long history of assessment, starting with the end-of-course Regents exams that have been administered to high school seniors since 1878. Over the years, the state developed several other exams to assess elementary and middle school students and high school graduates not taking the Regents exam. For instance, the norm-referenced Pupil Evaluation Program (PEP) was developed in 1965. It assessed reading and mathematics in grades 3 and 6 and writing in grade 5. PEP underwent several revisions, with the most significant in the early 1980s when standards were raised and the assessment became a criterion-referenced exam instead of a norm-referenced exam. In 1979, the state introduced a basic skills test, the Regents Competency Testing Program, in reading, writing, and mathematics for all high school students not taking the Regents exams in those areas. Although the state maintained some level of public accountability through reporting of test results, the state did not have strong accountability for its schools, with minimal sanctions or rewards. The accountability system was the same for Title I and non-Title I schools.

South Carolina used the same assessment system and maintained a consistent state accountability system focused on district achievement in both cognitive and non-cognitive measures during the entire time period of interest in this study. The tests emphasized basic skills. The accountability system had limited effectiveness because the standards set represented a relatively low level of achievement, which allowed many districts to escape being identified as in need of assistance despite poor test results and high dropout rates.

Tennessee had a continuous commitment to assessment between 1992 and 1998 but accountability was minimal during the time period of interest. The state adopted the Tennessee Value Added Assessment System in the early 1990s and required all schools to participate with the passage of the Education Improvement Act legislation in 1992. Between 1992 and 1998, the state used norm-referenced tests in grades 3 through 8 to assess student progress. In addition, students had to pass an eighth-grade level competency test in order to graduate. However, with the exception of the relatively low-level graduation requirement, there were no real rewards or sanctions based on performance; furthermore, the state did not disaggregate data until the introduction of the No Child Left Behind Act.

Although it continuously assessed students at the elementary, middle, and high school levels, Virginia had a number of different assessment mechanisms during the time period under consideration.
Through the 1980s and into the early 1990s, the state tested students in grades 4, 8, and 11 using nationally-normed achievement tests. These assessments were used to chart progress but had no consequences for students or schools. Over that same period, the state also had a minimum competency test that was first administered to students in tenth grade. Passing this test was required for graduation and those who did not pass the first time were able to retake the test in eleventh and twelfth grades. Subsequently, the state went through a period in which it administered, in the sixth grade, a set of tests for basic skills in reading, writing, and math. In 1998, the state shifted to assessments aligned with the Standards of Learning. Furthermore, until 1998, Virginia had essentially no performance-based accountability; that is, school and district accreditation was based entirely on “inputs,” a measure of the quality of staff and services being provided to students. The state began to consider performance-based accountability only after the learning standards had been implemented. Thus, only in 1998 was school accreditation linked to student performance.

Conclusion

Clearly, state responses to calls for education reform have been far from uniform. They vary in the policies they employ, what roles are assumed by different parts of the state government and local education agencies, who is held accountable for student learning, and how consistently and coherently they carried out their educational initiatives over the period 1988 through 1998. This heterogeneity is a function of differences in history, political culture, educational governance structure and policies, state demographics, and educational performance. State responses also reflect how effectively state leaders were able to emphasize the same educational issues over time. We found that it was rare for any state to focus on a particular policy lever over the entire ten-year period. Instead, it was more likely for a state to focus on one or two policy levers for several years before shifting to other policy levers. Such shifts during the period studied make categorizing the coherence, consistency, and quality of state policies a challenging task. However, we attempted to do so in the belief that these qualities are essential to understanding how state-driven reform plays out in schools and classrooms. Our results are presented below.

Summary rankings

We now propose to categorize the ten states with respect to their strength in each of the policy components, as well as provide an overall ranking.
Categorizations were based on reviews of all the information available about each state (i.e., state profile, state interviews, state synthesis, and state summary). Nonetheless, such an endeavor is fraught with difficulty and involves an element of subjectivity. Furthermore, the “grading” of the states is essentially normative. We have no absolute basis for judging states’ success in implementing education reform policies. Thus, we consider each state in light of the results for the other nine states, and decide on (what we judge as) a fair placement in one of three categories. In governance and politics of reform we evaluated the state in terms of whether the state had strong central control, whether the drivers of reform came from multiple sources, and whether multiple policy levers were used to initiate reform during the period of interest. In education finance we identified the states that achieved higher levels of per pupil expenditures, greater funding equity across districts, and a greater proportion of state support for education. In curriculum and standards we determined whether the state had a strong commitment to aligning mathematics curriculum with mathematics learning standards and whether there was statewide textbook adoption during the period of interest. In teacher quality we analyzed the extent to which the particular state was committed to raising teacher quality by increasing the licensure requirements and/or providing extensive professional development in the area of mathematics teaching. In assessment and accountability we examined the extent to which the state consistently assessed its students, the extent to which the assessment tool measured higher-order thinking skills, and whether the state was committed to making its districts and schools accountable using a high-stakes accountability system of sanctions and rewards.
Furthermore, the overall ranking is strongly influenced by the focal question our study is intended to address. That question may be framed as follows: Given the character and quality of a state’s policy efforts during the period 1988–1998, in comparison to those of the other states, is it reasonable to expect that it


would achieve relatively greater progress in closing the achievement gap (or increasing the scores of Black students) in comparison to those other states? In this regard, we gave greater weight to the last three components, with the rationale that they can be expected to be more proximal determinants of classroom behaviors and efforts. Thus, a state was assigned to the highest category if, over the period in question and in relation to the other states, changes in policies and/or improvements in policy implementation would lead one to expect substantially greater improvement in test outcomes. Thus, were there a state that had a consistent, coherent, and broad-based policy effort already in place by the late 1980s and kept it in place through the 1990s, we would not necessarily expect great relative improvement, although we might anticipate high absolute scores. More realistically, if a state only achieved coherence and consistency in its efforts by the mid-1990s, then we might expect only “average” improvement over 1992 to 2000, even if the policies and their implementation were exemplary.

Table 7
Ranking of states across policy components.

Ranking   Governance and       Education finance         Curriculum and   Teacher       Assessment and   Overall
          politics of reform                             standards        quality       accountability
1         KY NC TX             CA                        NC               NY NC SC      KY NC TX         NC
2         CA NY VA             KY MD MI NY NC SC TN TX   CA NY SC TN VA   CA KY MI TX   CA MD MI NY SC   CA KY NY SC TX
3         MD MI SC TN          VA                        KY MD MI TX      MD TN VA      TN VA            MD MI TN VA

Table 7 presents the complete set of rankings. Considering the overall ranking, we note the following. Only North Carolina was placed in the highest category. While it would have been preferable to have a more balanced allocation of states across categories, North Carolina’s set of policy component rankings was substantially superior to those of any other state. Five states, California, Kentucky, New York, South Carolina, and Texas, fell in the second group.
These states were committed to education reform but focused on only one or two policy levers. These states also allowed a certain degree of local choice that led to inconsistency across districts. The remaining four states, Maryland, Michigan, Tennessee, and Virginia, fell in the third group. These states did not have strong state control and failed to align their policies to any significant extent during the period studied. Virginia, as already noted, only mounted a consistent effort in the middle of the period under study.


Structure of the Achievement Gap

Preliminaries

As we indicated earlier, it will prove useful to set the findings about the patterns in the Black-White achievement gap against a background of between-state as well as other within-state results. Accordingly, we begin our analysis with the data displayed in Table 8. The subtable at the bottom shows that in 1992 the average for the ten states was nearly four points below the national average, but by 2000 the differential was less than 2 points. In the main table, the ten states are listed in order of their mean achievement levels in 1992. The range is only ten points, from Virginia (268) to North Carolina (258). All the states experienced some improvement over the next eight years, with North Carolina showing the greatest increase (21.7) and California the least (1.3). However, these increases should be viewed in light of the changes in exclusion rates over the same period. These are presented in the last column, and we note that North Carolina stands out with an increase of 10.6%. This has led some commentators to discount entirely North Carolina’s improvement (Amrein & Berliner, 2002). For an alternative view see Braun (2004).

Table 8
State NAEP results.

State   Achievement 1992   Change in mean achievement   Change in exclusion rate
                           (1992 to 2000)               (1992 to 2000)
VA      268                 8.8                          4.7
MI      267                11.1                          5.9
NY      266                 9.8                          4.6
MD      265                11.8                          5.9
TX      265                10.3                          3.0
KY      262                 9.3                          4.9
CA      261                 1.3                          0.4
SC      261                 5.6                          0.9
TN      259                 4.6                         -0.3
NC      258                21.7                         10.6

Average achievement   1992    2000
Nation                266.9   274.4
10 study states       263.2   272.6
Difference              3.7     1.8

Since we are interested in examining patterns of achievement at a level below that of the state, in Table 9 we display counts of schools and students in the NAEP sample for each stratum/state for 1992 and 2000.
For all ten states in both years, the number of schools in the lower poverty stratum (S1) is always larger, and usually much larger, than the number of schools in the higher poverty stratum (S2). In addition, for all ten states, the number of schools in S2 in 2000 is greater than in 1992. The latter outcome is presumably due both to changes in the demographic profiles of the states and differential success in obtaining school participation in NAEP.


Table 9
Number of schools and students by stratum and state.

                   1992                                     2000
                   # of        # of students                # of      # of students
State   Stratum    schools     Total   White   Black        schools   Total   White   Black
CA      S1         [missing]   1903    1014    112          42         956     502     44
        S2         22           613     122     64          29         672     100     88
KY      S1         77          2067    1779    190          60        1458    1270    129
        S2         27           689     604     55          37         836     677    118
MD      S1         83          2186    1422    526          84        2004    1315    444
        S2         10           213      21    173          20         397      34    307
MI      S1         82          2167    1808    207          64        1560    1356     83
        S2         19           449     106    290          21         415     133    235
NY      S1         69          1783    1440    127          42         975     750     72
        S2         16           375      54    173          31         658     101    280
NC      S1         87          2334    1733    498          79        1804    1311    373
        S2         16           435     157    239          25         550     178    294
SC      S1         74          1909    1293    491          48        1179     831    282
        S2         28           716     213    430          47        1127     449    607
TN      S1         73          1966    1616    282          67        1654    1339    218
        S2         21           519     254    231          28         578     285    255
TX      S1         68          1771    1078    193          56        1313     782    150
        S2         30           843     185    106          41        1004     281    153
VA      S1         93          2470    1763    483          89        2145    1492    403
        S2         10           240     115    103          16         324      84    192

The number of students in S1 ranges from 956 (CA/2000) to 2470 (VA/1992). The number of students in S2 ranges from 213 (MD/1992) to 1127 (SC/2000). Looking forward to the comparisons between White students and Black students, the counts for each group are generally respectable; there are, however, six instances in which the number of White or Black students falls below 100. Four occurred in S2 in 1992: CA (Black), KY (Black), MD (White) and NY (White); two occurred in S2 in 2000: MD (White) and VA (White).

Table 10 presents stratum means for 1992 and 2000, as well as the changes over that period. All strata in all states experienced gains. These ranged from 7 (CA, TN) to 24 (NC), with a median of 11 points in S1, and from 2 (CA) to 24 (NY) with a median of 11 points in S2. With the exception of California, Maryland, Tennessee, and Virginia in S2, all the stratum gains were statistically significant.
Thus, NAEP results suggest there was real improvement in both strata for most of the states, but that the typical rate of improvement amounted to only about a point and a half per year.


Table 10
Mean achievement by stratum and state (standard errors in parentheses).

                 Lower poverty (S1)                      Higher poverty (S2)
State            1992        2000        2000–1992       1992        2000        2000–1992
California       268 (2.1)   275 (2.6)    7 (3.3)        242 (2.8)   244 (2.9)    2 (4.0)
Kentucky         266 (1.3)   278 (1.4)   12 (2.0)        254 (1.9)   260 (2.2)    6 (3.0)
Maryland         268 (1.3)   284 (1.4)   16 (1.9)        229 (5.6)   238 (3.5)    9 (6.6)
Michigan         273 (1.8)   284 (1.6)   11 (2.4)        236 (3.4)   251 (2.7)   15 (4.3)
New York         277 (1.5)   288 (1.5)   11 (2.2)        235 (5.4)   259 (4.5)   24 (7.0)
North Carolina   261 (1.2)   285 (1.3)   24 (1.7)        243 (3.3)   264 (3.0)   21 (4.4)
South Carolina   266 (1.2)   274 (2.1)    8 (2.4)        245 (2.0)   259 (1.8)   14 (2.7)
Tennessee        263 (1.2)   270 (2.0)    7 (2.4)        242 (2.6)   247 (3.4)    4 (4.3)
Texas            271 (1.7)   283 (2.0)   11 (2.7)        251 (1.5)   266 (2.1)   15 (2.6)
Virginia         270 (1.3)   279 (1.8)    9 (2.2)        250 (5.1)   258 (3.7)    8 (6.3)
Median           268         281         11              243         259         11


In addition to evaluating the state-specific numerical outcomes, it can also be helpful to consider the stratum results in the aggregate. Figure 1 presents two back-to-back stem-and-leaf displays, for S1 and S2, respectively. The general improvement in both S1 and S2 over the period is quite evident. While this is welcome, comparisons between strata within a year are less heartening. In 1992, the distribution of means for S2 lies well below that of S1. Indeed, there is a seven-point gap between the state with the highest S2 mean (KY) and the state with the lowest S1 mean (NC). The story is essentially the same in 2000: There is a four-point gap between the state with the highest S2 mean (TX) and the state with the lowest S1 mean (TN).

Of greater concern, perhaps, is the comparison between the distributions for S1 in 1992 and S2 in 2000. We note that the highest-ranking states with respect to S2 in 2000 are just at the level achieved by the lowest-ranking states with respect to S1 in 1992. That is, overall in 2000, mean achievement in higher-poverty schools still fell well short of the levels attained by lower-poverty schools eight years earlier. This observation leads us to consider the gaps between strata on a state-by-state basis.

We first investigate, however, the possibility of a relationship between changes in stratum mean and the “baseline” level of achievement in 1992. Such a relationship could complicate the interpretation of the findings, since it raises the possibility that some of the observed patterns might simply be manifesting a statistical artifact such as regression to the mean. Figures 2 and 3 display the gains from 1992 to 2000 for each state for S1 and S2, respectively.
The error bars attached to each state gain extend one standard error in each direction, corresponding to a 68% confidence interval. Leaving aside the outliers (NC in S1 and NY in S2), there does not appear to be even a modest relationship between gains and baseline.

Figure 1
Stem-and-leaf plots for stratum means by state and year. Panels: Lower Poverty Stratum (S1); Higher Poverty Stratum (S2). (Note that the stems for the two displays are aligned numerically, so that comparisons can be easily made between the displays.) In each panel, the central column represents the “stem” and the side columns represent the “leaves.” In 1992, for example, the state with the highest achieving S1 among the ten states had a score level of 277, while the lowest achieving S1 had a score level of 261. Note that the stems for the two panels are aligned vertically, facilitating comparisons across panels. For further guidance on reading stem-and-leaf plots, see Tukey (1977) or any modern introductory statistics textbook.
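The visual check for a relationship between gains and 1992 baseline can also be summarized numerically. The sketch below is an illustration only, not the authors' procedure: it computes a Pearson correlation between the S1 baseline means and the S1 gains as transcribed (rounded) from Table 10, with and without North Carolina. The function name `pearson_r` is our own. With only nine or ten points, a coefficient of this kind is very imprecise, and even moderately sized values are statistically indistinguishable from zero.

```python
from math import sqrt

# Stratum S1 values transcribed from Table 10: (1992 mean, gain 1992 to 2000).
S1 = {
    "CA": (268, 7), "KY": (266, 12), "MD": (268, 16), "MI": (273, 11),
    "NY": (277, 11), "NC": (261, 24), "SC": (266, 8), "TN": (263, 7),
    "TX": (271, 11), "VA": (270, 9),
}

def pearson_r(pairs):
    """Pearson correlation for a list of (baseline, gain) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

r_all = pearson_r(list(S1.values()))
# Drop the S1 outlier named in the text (NC) and recompute.
r_without_nc = pearson_r([v for k, v in S1.items() if k != "NC"])

print(f"S1, all ten states: r = {r_all:.2f}")
print(f"S1, excluding NC:   r = {r_without_nc:.2f}")
```

The same function can be applied to the S2 columns of Table 10, leaving out New York, to mirror the second panel of the comparison.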


Figure 2
Lower poverty stratum (S1). Gain in mean achievement (1992 to 2000) vs. mean achievement in 1992. [Scatterplot with one labeled point per state and error bars; horizontal axis: 1992 achievement level; vertical axis: gain from 1992 to 2000.]

Figure 3
Higher poverty stratum (S2). Gain in mean achievement (1992 to 2000) vs. mean achievement in 1992. [Scatterplot with one labeled point per state and error bars; horizontal axis: 1992 achievement level; vertical axis: gain from 1992 to 2000.]
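The statement that error bars of one standard error correspond to a 68% confidence interval follows from the normal distribution. The short sketch below assumes that the sampling distribution of each estimate is approximately normal (our assumption for illustration); the helper names `normal_coverage` and `interval` are hypothetical.

```python
from math import erf, sqrt

def normal_coverage(k):
    """Probability that a normal variate lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))

def interval(estimate, se, k=1.0):
    """Return (low, high) for the estimate plus or minus k standard errors."""
    return estimate - k * se, estimate + k * se

print(f"coverage of +/-1 SE:    {normal_coverage(1.0):.3f}")
print(f"coverage of +/-1.96 SE: {normal_coverage(1.96):.3f}")

# North Carolina, S1 gain from Table 10: 24 points with standard error 1.7.
low, high = interval(24, 1.7)
print(f"NC S1 gain, one-SE bar: {low:.1f} to {high:.1f}")
```

Widening the bars to 1.96 standard errors would give the more conventional 95% interval, at the cost of visually longer bars in plots such as Figures 2 and 3.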


Analysis of the Gaps

Table 11 displays the stratum gaps for each state in both 1992 and 2000. The stratum gap is defined as the difference in mean achievement between S1 and S2. The states are listed in ascending order of the gap in 1992, with Kentucky having the smallest gap (12) and New York the largest (42). In 2000, Texas and South Carolina had the smallest gaps (16), while Maryland had the largest (46). Note that these stratum gaps are typically many times larger than the corresponding estimated standard errors.

The last panel in Table 11 displays the changes in the gaps from 1992 to 2000, with the corresponding standard errors. (Negative numbers signal increases in the gap and positive numbers signal decreases.) Since these changes are “differences of differences,” the estimated standard errors are somewhat larger than before. Maryland, Kentucky, and California experienced the largest increases (-7, -6, -5), while New York and South Carolina experienced the largest decreases (+14, +6). Only New York’s change is statistically significant. It is interesting to note that, despite having one of the largest increases in the stratum gap among the ten states, in 2000 Kentucky still had the third smallest stratum gap.[15]

Table 11
Stratum gaps in mean achievement by state (standard errors in parentheses).

         [S1 - S2]    [S1 - S2]    Reduction in gap (note 1)
State    1992         2000         1992 to 2000
KY       12 (2.3)     18 (2.7)     -6 (3.5)
NC       19 (3.4)     21 (3.2)     -3 (4.7)
VA       20 (5.3)     21 (4.1)     -1 (6.7)
TX       20 (2.3)     16 (2.9)      4 (3.7)
TN       21 (2.9)     23 (4.0)     -3 (4.9)
SC       21 (2.3)     16 (2.7)      6 (3.6)
CA       26 (3.4)     31 (3.9)     -5 (5.2)
MI       37 (3.9)     33 (3.1)      4 (4.9)
MD       39 (5.7)     46 (3.8)     -7 (6.8)
NY       42 (5.6)     29 (4.7)     14 (7.3)
Median   21           22           -2

Note 1. Calculations for the last panel are based on original estimates of gaps in 1992 and 2000, rather than the rounded values presented in the table.
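The remark that standard errors grow for “differences of differences” can be made concrete. The sketch below assumes that the 1992 and 2000 gap estimates are statistically independent, so that the variance of their difference is the sum of their variances; this is our assumption, and the article does not spell out its computation. Under that assumption, the formula reproduces the standard errors in the last panel of Table 11 for the three rows shown. The reductions computed from rounded gaps can differ by a point from the published ones, as the table note explains.

```python
from math import sqrt

def se_of_difference(se_a, se_b):
    """Standard error of (a - b) when a and b are estimated independently."""
    return sqrt(se_a ** 2 + se_b ** 2)

# ((gap, SE) in 1992, (gap, SE) in 2000), transcribed from Table 11.
table_11 = {
    "KY": ((12, 2.3), (18, 2.7)),
    "NC": ((19, 3.4), (21, 3.2)),
    "NY": ((42, 5.6), (29, 4.7)),
}

results = {}
for state, ((gap_92, se_92), (gap_00, se_00)) in table_11.items():
    reduction = gap_92 - gap_00  # positive means the gap shrank
    results[state] = (reduction, se_of_difference(se_92, se_00))
    print(f"{state}: reduction from rounded gaps = {reduction:+d}, "
          f"SE = {results[state][1]:.1f}")
```

The same root-sum-of-squares rule, applied to the S1 and S2 standard errors in Table 10, gives the standard errors of the 1992 and 2000 stratum gaps themselves.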
The distribution of gaps for the ten states, however, did not change very much over the period, with the median gap increasing slightly from 21 points to 22 points. It is noteworthy that in both years the range from the state with the lowest gap to the state with the largest gap was 30 points, or about three times the size of the overall difference between the highest and lowest ranking states (see Table 8). This supports our contention about the importance of studying patterns of achievement within states. With these comparisons between strata as background, we now turn our attention to comparisons between Black students and White students.

[15] One can wonder whether legislators and other stakeholders in Kentucky should have been more concerned with losing ground in their effort to reduce the gap, or could take solace in maintaining their relative ranking. This contrast highlights the difficulty in making simple summaries of state achievement results over time.

Black-White Achievement Gaps

Table 12 displays the mean scores for White students and Black students and the gaps (White mean minus Black mean) for each state in 1992 and 2000, as well as the changes in the gaps over this period.[16] All the gaps displayed in Table 12 are many times larger than the corresponding standard errors. In 1992, Kentucky had the smallest gap (23) and New York the largest (47), yielding a range of 24 points. In 2000, Kentucky had the smallest gap (22) and Michigan the largest (44), yielding a range of 22 points. Equally important is the fact that the median of the state gaps was large and remained essentially constant over the period. The median gap in 1992 (34 points) is about 50% larger than the corresponding median stratum gap in 1992 (21 points), or about three times larger than the difference between the highest and lowest ranking states overall. There is considerable variability among states in the changes in the gaps, with North Carolina experiencing the largest increase (-7) and New York the largest decrease (+15). Both changes are statistically significant. In 2000, Kentucky was still the state with the smallest gap, while New York had moved to below the median.

Beginning our examination of within-stratum patterns, Table 13 displays the mean achievement for Blacks and Whites for each stratum/state/year combination, as well as the changes for both Blacks and Whites in each stratum/state from 1992 to 2000. There was an increase in mean achievement for both Black students and White students over the period for each stratum/state, although there was considerable variability among states. There were many cases of substantial increases for Black students.
In S1, NY(23) and NC(17) had the best records; in S2, NY(29) and NC(21) again stand out. Both these states had excellent records with respect to gains for White students, although New York in S1 experienced a gain of only 9 points.

Inasmuch as a key goal of the study is to examine outcomes and trajectories in each stratum, Table 13 enables us to compare the gains over the period of White students in S1 to those of White students in S2. The gains in S1 were greater in five states and smaller in four states. However, the only state in which the difference in gains approached significance was Virginia. For Black students, the gains in S1 were greater in five states and lower in five states. In no state was the difference in gains significant.

[16] The estimated standard errors for the gaps take into account the dependency between the mean scores for White students and Black students induced by the clustering of students within schools.


Table 12
Mean achievement for White students and Black students, and Black-White gaps, by state and year (standard errors in parentheses).

         1992                                  2000                                  Reduction in gap
State    White       Black       W-B           White       Black       W-B           1992 to 2000
CA       277 (1.9)   234 (3.6)   42 (3.9)      278 (2.2)   242 (2.8)   36 (3.6)       6 (5.3)
KY       265 (1.1)   242 (2.6)   23 (2.8)      275 (1.3)   253 (2.8)   22 (3.0)       1 (4.1)
MD       279 (1.5)   240 (2.0)   39 (2.4)      290 (1.3)   249 (2.0)   41 (2.2)      -2 (3.3)
MI       277 (1.5)   233 (1.8)   44 (2.5)      286 (1.4)   242 (2.6)   44 (2.6)       0 (3.6)
NY       280 (1.1)   233 (4.4)   47 (4.5)      289 (1.3)   257 (4.3)   32 (4.4)      15 (6.3)
NC       267 (1.0)   239 (1.7)   28 (1.6)      291 (1.1)   256 (1.4)   35 (1.8)      -7 (2.4)
SC       274 (1.1)   242 (1.0)   32 (1.3)      279 (1.5)   249 (1.7)   30 (1.8)       2 (2.2)
TN       266 (1.1)   235 (2.4)   31 (2.4)      271 (1.4)   237 (3.0)   34 (3.1)      -3 (3.9)
TX       279 (1.5)   244 (2.0)   35 (2.3)      288 (1.4)   252 (3.3)   36 (3.0)       0 (3.8)
VA       275 (1.1)   245 (1.8)   30 (2.2)      285 (1.4)   252 (1.9)   33 (2.1)      -3 (3.0)
Median   276         239         34            286         251         35             0


Table 13
Mean achievement for Black and White students by stratum and state (standard errors in parentheses).

         1992                       2000                       2000–1992
State    White        Black        White        Black        White       Black

S1 (Lower poverty)
CA       279 (2.0)    241 (3.3)    280 (2.4)    248 (5.3)     2 (3.1)     7 (6.2)
KY       268 (1.3)    244 (3.1)    281 (1.4)    258 (3.0)    13 (2.0)    14 (4.3)
MD       279 (1.5)    244 (2.0)    291 (1.2)    258 (2.3)    12 (1.9)    15 (3.0)
MI       278 (1.5)    241 (2.8)    287 (1.5)    249 (6.0)     9 (2.1)     8 (6.6)
NY       281 (1.0)    252 (5.1)    290 (1.4)    275 (7.0)     9 (1.7)    23 (8.7)
NC       267 (1.0)    242 (2.0)    293 (1.2)    259 (1.8)    25 (1.6)    17 (2.7)
SC       275 (1.2)    246 (1.2)    283 (2.0)    252 (3.5)     7 (2.3)     7 (3.7)
TN       268 (1.3)    241 (2.3)    274 (1.7)    246 (6.1)     6 (2.1)     5 (6.5)
TX       280 (1.8)    246 (2.6)    290 (1.9)    256 (5.4)    10 (2.6)    10 (6.0)
VA       276 (1.1)    247 (1.6)    285 (1.5)    255 (2.4)     9 (1.8)     8 (2.9)

S2 (Higher poverty)
CA       264 (2.8)    221 (7.3)    267 (5.7)    237 (4.5)     3 (6.3)    16 (8.6)
KY       257 (1.9)    236 (4.9)    263 (2.2)    247 (4.5)     6 (2.9)    11 (6.6)
MD       253 (13.2)   226 (4.8)    261 (11.3)   237 (2.9)     9 (17.4)   11 (5.7)
MI       266 (5.1)    227 (1.6)    280 (4.7)    240 (2.7)    14 (6.9)    13 (3.1)
NY       266 (5.8)    223 (3.7)    283 (3.9)    252 (4.2)    17 (7.0)    29 (5.6)
NC       262 (3.5)    231 (2.1)    280 (3.1)    252 (2.2)    18 (4.7)    21 (3.1)
SC       265 (2.3)    238 (1.6)    272 (2.4)    248 (1.9)     7 (3.3)    10 (2.5)
TN       257 (2.0)    227 (3.0)    261 (3.2)    230 (3.8)     4 (3.8)     2 (4.8)
TX       274 (2.8)    241 (2.9)    282 (1.9)    248 (4.1)     8 (3.4)     7 (5.0)
VA       260 (3.8)    240 (7.2)    279 (4.5)    247 (3.5)    19 (5.9)     7 (8.0)

We now turn our attention to Table 14, which displays the Black-White gaps for each stratum/state/year and the changes in the gaps over the period. This is the main focus of the section. The states are listed in descending order of the improvement (reduction) in the state level gaps, which are presented in the left-most column.
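As a rough arithmetic check on how the stratum-level means in Table 13 relate to the state-level means in Table 12, the sketch below recombines the Kentucky 1992 figures. It is an illustration under a stated assumption: the raw student counts from Table 9 are used as stand-ins for the NAEP sampling weights, which are not given here, so agreement with the published values can only be approximate. The helper `weighted_mean` is our own.

```python
def weighted_mean(values_and_weights):
    """Weighted mean of (value, weight) pairs."""
    total_weight = sum(w for _, w in values_and_weights)
    return sum(v * w for v, w in values_and_weights) / total_weight

# Kentucky, 1992. Means from Table 13; student counts from Table 9.
# Assumption: raw sample counts stand in for the NAEP sampling weights.
white = [(268, 1779), (257, 604)]  # (mean, count) for S1, then S2
black = [(244, 190), (236, 55)]

white_mean = weighted_mean(white)
black_mean = weighted_mean(black)
state_gap = white_mean - black_mean

# Within-stratum gaps: White mean minus Black mean in S1 and in S2.
within_gaps = [w[0] - b[0] for w, b in zip(white, black)]

print(f"White mean: {white_mean:.1f}, Black mean: {black_mean:.1f}")
print(f"state-level gap: {state_gap:.1f}")
print(f"within-stratum gaps (S1, S2): {within_gaps}")
```

For this state and year, the recombined means land close to the Table 12 entries, and the two within-stratum gaps bracket the state-level gap. Running the same calculation for other states shows how much of a state-level gap reflects differences within strata rather than the distribution of students across strata.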
In both 1992 and 2000, and for both S1 and S2, the median (for the ten states) within-stratum Black-White gap was only slightly smaller than the median (for the ten states) within-state Black-White gap (see Table 12). Indeed, that is the case for most states; i.e., for most states, the achievement gap within a stratum is almost as large as the achievement gap for the entire state. Thus, the state-level gap cannot be accounted for by the (substantial) gap between the strata and the differential distributions of Black students and White students across strata. This finding is somewhat surprising, in view of the continuing discussions about the relationships between academic achievement, on the one hand, and race and class, on the other. Given the substantial gap in achievement between S1 and S2 in each state and the fact that Black students, in comparison to White students, are disproportionately enrolled in schools in S2, one might expect that the observed Black-White achievement gap at the state level was largely a


consequence of Black students attending lower achieving schools. In fact, the magnitude of the state level gap was also due to the large differences in achievement between Black students and White students attending schools in the same stratum.

Restricting attention to comparisons within stratum can only partially control for differences in other factors that are associated with achievement. In particular, students of different races are not proportionately distributed across schools within a stratum. As we indicated earlier, that observation motivated the HLM analyses that do control for individual school effects. The results of those analyses are presented in the next section.

Table 14
Black-White gaps (White mean minus Black mean) in mean achievement by stratum and state (standard errors in parentheses).

Reduction in               S1 (Lower poverty) gap                  S2 (Higher poverty) gap
state gap
(1992–2000)   State        1992       2000       1992–2000        1992        2000        1992–2000
15 (6.3)      NY           29 (4.9)   15 (7.1)   14 (8.6)         43 (6.9)    31 (4.5)     12 (8.2)
 6 (5.3)      CA           37 (3.6)   32 (5.7)    5 (6.7)         43 (7.9)    30 (6.7)     13 (10.4)
 2 (2.2)      SC           30 (1.6)   30 (3.1)   -1 (3.5)         27 (2.8)    24 (2.6)      3 (3.8)
 1 (4.1)      KY           24 (3.3)   23 (3.4)    1 (4.7)         21 (5.3)    16 (4.6)      5 (7.0)
 0 (3.6)      MI           37 (3.4)   38 (5.7)   -1 (6.6)         39 (5.7)    39 (4.8)     -1 (7.5)
 0 (3.8)      TX           34 (2.8)   34 (4.5)    0 (5.3)         33 (4.5)    34 (3.8)     -1 (5.9)
-2 (3.3)      MD           35 (2.4)   33 (2.5)    3 (3.4)         26 (11.5)   24 (10.6)     2 (15.6)
-3 (3.0)      VA           29 (2.0)   31 (2.5)   -2 (3.2)         20 (6.7)    32 (6.1)    -11 (9.1)
-3 (3.9)      TN           27 (2.2)   28 (6.1)   -1 (6.4)         29 (3.6)    31 (4.5)     -2 (5.8)
-7 (2.4)      NC           25 (1.8)   34 (2.1)   -9 (2.7)         31 (3.7)    28 (3.5)      3 (5.1)
 0            Median       29         31          0.5             30          30            2.5

With respect to reducing the Black-White gap in S1, NY(+14) and CA(+5) have the best records while NC(-9) has the poorest. Only the result for North Carolina reaches statistical significance. Turning our attention to S2, we see that CA(+13) and NY(+12) have the best records, while VA(-11) has the poorest.
None of the results reach statistical significance. The trends in Table 14 should be interpreted in light of those presented in Table 13. For example, there was a 17-point increase for Black students in S1 in North Carolina, but that was still smaller than the 25-point increase for White students in S1. The difference in gains is reflected in the poor gap trend result for the state. Virginia presents another interesting case. From Table 13, we note that Black students posted increases of 8 and 7 points in S1 and S2, respectively. Although White students in S1 did only a little better (9 points), those in S2 posted a larger gain of 19 points, leading to a 12-point increase in the gap. Overall, the median reduction in the achievement gap in S2 was 2.5 points.

Turning our attention to comparisons between strata (within states), we note that the reduction in the gap in S1 was greater than that in S2 for five states and lower in four states. However, the magnitudes of the differences in gaps were generally small and approached significance only in one state, North Carolina.

It may be misleading to focus on changes in the Black-White gaps without taking into account the patterns of achievement for Black students and White students separately. Accordingly, we present Figures 4 and 5, which display scatterplots of the reduction in the achievement gap and the change in mean achievement for Black students in S1 and S2, respectively. (Again, the error bars extend one standard error in each direction, corresponding to a 68 percent confidence interval.) In


both S1 and S2, New York is an outlier with respect to each dimension. There is no relationship apparent in S1 while, in S2, there is only a hint of a positive relationship.

Thus, in general, states that appear exemplary in the analysis of a particular educational outcome may be viewed quite differently in another analysis. Leaving aside North Carolina for a moment (in view of the questions that have been raised about the change in exclusion rates), consider California, which appeared to make some progress in reducing the Black-White gaps within strata, while making little dent in the stratum gap. New York presents a more consistent picture, displaying typical to exemplary progress at each level of analysis.

Figure 4
Lower poverty stratum (S1). Reduction in achievement gap (1992 to 2000) vs. change in mean achievement for Black students (1992 to 2000). [Scatterplot with one labeled point per state and error bars; horizontal axis: change in mean achievement for Black students (1992 to 2000); vertical axis: reduction in achievement gap (1992 to 2000).]


Figure 5
Higher poverty stratum (S2). Reduction in achievement gap (1992 to 2000) vs. change in mean achievement for Black students (1992 to 2000). [Scatterplot with one labeled point per state and error bars; horizontal axis: change in mean achievement for Black students (1992 to 2000); vertical axis: reduction in achievement gap (1992 to 2000).]

Grading the States

The principal aims of the previous subsection were to present estimates of the Black-White achievement gaps at both the state and the stratum within state levels, as well as to describe the changes in the three sets of gaps over the period of study.[17] While we believe these descriptions are of interest in their own right, they can also serve as the basis for judgments about the (relative) success that the states have had in improving the outcomes for Black students. These judgments will be summarized in a “report card” for the states that can be juxtaposed against the education policy report card that was presented in the previous section. The comparison of the two report cards in the linking section will offer some insight into the relationship between state-initiated reforms and student outcomes.

[17] This program is continued in the following section, where HLMs are employed to estimate the achievement gaps within schools within a stratum.

There is no absolute basis for evaluating a state’s success in raising test scores: The grades assigned to the states are normative, in that each state’s record is considered in the context of the results for the other nine states. Although there is an element of subjectivity in such an effort, we believe that the summary provided by such a ranking serves a useful purpose in facilitating the linking of policy and outcomes.

The states have been ranked separately with respect to two outcomes in each stratum: improving the achievement of Black students and reducing the achievement gap. For the first outcome, we consulted Table 13, and for the second, Table 14. For each combination of stratum and outcome, states were classified into one of three categories. In ranking a state, we considered both the absolute magnitude of the change in relation to those of other states, as well as the effect size of the change (i.e., the ratio of the change to its estimated standard error). The results are presented in Table 15.

Table 15
Ranking states on student achievement outcomes.

          Improving Black student achievement        Closing the achievement gap
Ranking   Lower poverty     Higher poverty           Lower poverty                 Higher poverty
          stratum (S1)      stratum (S2)             stratum (S1)                  stratum (S2)
1         KY, MD, NY, NC    NY, NC                   NY                            CA, NY
2         SC, TX, VA        CA, KY, MD, MI, SC       CA, MD                        KY, NC, SC
3         CA, MI, TN        TN, TX, VA               KY, MI, NC, SC, TN, TX, VA    MD, MI, TN, TX, VA

We observe that the rankings of two states are consistent across outcomes: New York is in the highest category in all four rankings, while Tennessee is in the lowest category in all four rankings. Most states, however, present a somewhat mixed picture. North Carolina fares well in both strata with respect to improving Black student achievement but poorly in closing the achievement gap. California fares better with respect to both outcomes in S2 than in S1. Michigan, Texas and Virginia appear in the lowest category in three of the rankings. South Carolina is assigned the middle category in three of the rankings. Finally, Kentucky and Maryland have the most heterogeneous assigned categories.

Next, we examine patterns in the rankings for the two strata within each outcome. With respect to “Improving Black student achievement”, we observe a modest consistency across strata: New York and North Carolina are in the highest category in both strata, Tennessee is in the lowest category in both strata, and the other seven states differ by only one level between strata.
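The ratio of a change to its estimated standard error, which the ranking procedure above takes into account alongside the absolute magnitude of the change, is straightforward to tabulate. The sketch below does so for the S1 gains of Black students, using values transcribed from Table 13. It is illustrative only: the article reports no numerical cutoffs, the category assignments in Table 15 involved judgment, and this listing is not intended to reproduce them. The function name `effect_ratio` is hypothetical.

```python
# Gains for Black students in S1 as (change, standard error), from Table 13.
black_s1_gains = {
    "CA": (7, 6.2), "KY": (14, 4.3), "MD": (15, 3.0), "MI": (8, 6.6),
    "NY": (23, 8.7), "NC": (17, 2.7), "SC": (7, 3.7), "TN": (5, 6.5),
    "TX": (10, 6.0), "VA": (8, 2.9),
}

def effect_ratio(change, se):
    """Ratio of a change to its estimated standard error."""
    return change / se

# List states by size of change, showing the ratio alongside it.
rows = sorted(black_s1_gains.items(), key=lambda item: item[1][0], reverse=True)
for state, (change, se) in rows:
    print(f"{state}: change = {change:>2}, SE = {se}, "
          f"ratio = {effect_ratio(change, se):.1f}")
```

Viewing the two quantities side by side makes the tension visible: a large change estimated from a small sample can have a smaller ratio than a more modest change that is estimated precisely.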
For the outcome “Closing the Achievement Gap,” there is again a modest consistency across strata: New York is in the highest category in both strata; Michigan, Tennessee, Texas, and Virginia are in the lowest category in both strata; and the other five states differ only by one level between strata.

Results of HLM Analyses

In this section we will discuss the results of carrying out a series of analyses based on HLMs. The analyses were carried out for both strata in all ten states for the years 1992 and 2000. This section presents estimates of the achievement gap employing different models, as well as a number of comparisons of trends in achievement gaps. These more complex models offer some insights into the structure of the achievement gap. Unambiguous interpretations are elusive because of the complexity of the dynamics among the different factors. Moreover, patterns of school segregation combined with limitations of the NAEP sample constrain the generalizability of the results for some states.

We have already indicated some of the caveats that must be kept in mind when interpreting the results of the within-stratum comparisons over time between Black students and White students.


In particular, it has been well documented that there can be substantial differences in average achievement among schools and that students with similar characteristics attending different schools achieve at different levels. Such school/peer effects (to the extent they exist) are confounded with our estimates of the Black-White gaps within strata, because Black students and White students have differential patterns of enrollment across schools within a stratum.

It is natural, then, to ask what the gap estimates would be if it were possible to eliminate the contribution of between-school differences. We turn for answers to the method of hierarchical linear models (HLM). An HLM can include a contrast at the student level of the model that enables us to estimate the average difference in achievement between Black students and White students attending the same schools in a particular stratum. This so-called “pooled within-school” estimate of the Black-White gap is free of average between-school differences.

It is important to recognize that two stages of data restriction are involved. In the first stage, we exclude schools whose NAEP samples have neither White students nor Black students. The second stage occurs within the HLM analysis. Specifically, only those schools with NAEP samples that included both Black and White students can contribute to the estimation of the contrast of interest. For some combinations of stratum/state/year the numbers of schools and students contributing to the estimate were quite small. This is especially the case for S2 in 1992.

Thus, while the prospect of obtaining estimates of achievement gaps free of between-school differences is an exciting one, it comes at considerable cost: The sample of schools on which such estimates are based can be relatively small and the sample is not generally representative of the stratum as a whole.
Moreover, comparisons of these estimated gaps among states are more problematic because of the different degrees of de facto school segregation (within a given stratum) from state to state. Nonetheless, we believe these estimates should be of interest to state policy makers, as they do present a picture of within-school differences in achievement for that subset of the schools that enrolled both Black and White students.

Black-White Achievement Gaps

Table 16 presents new counts of schools, White students, and Black students for each stratum/state combination, for 1992 and 2000. These are schools for which the NAEP sample contained both Black and White students.18 Only the students in these schools contribute to the estimate of the within-school achievement gap.

18 Presumably, there were schools participating in NAEP that had both White students and Black students, but for which the NAEP student sample did not happen to include students of one or the other race. Such schools do not appear in Table 16.
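The idea of a pooled within-school gap can be illustrated with a small sketch. It is not the HLM fitted in the study: it uses a simpler fixed-effects (school-demeaned) estimator as a stand-in, and the function name, the record layout, and the toy scores are all hypothetical. Schools whose sample contains only one group contribute nothing to the estimate, which mirrors the sample restriction described above.

```python
from collections import defaultdict


def pooled_within_school_gap(records):
    """Estimate the White-minus-Black gap using only within-school variation.

    records: iterable of (school_id, is_white, score) tuples (hypothetical layout).
    This is a school-demeaned least-squares slope, a simplified stand-in
    for the student-level contrast in an HLM.
    """
    by_school = defaultdict(list)
    for school, is_white, score in records:
        by_school[school].append((1.0 if is_white else 0.0, float(score)))

    numerator = 0.0
    denominator = 0.0
    for students in by_school.values():
        n = len(students)
        x_bar = sum(x for x, _ in students) / n  # share of White students
        y_bar = sum(y for _, y in students) / n  # school mean score
        for x, y in students:
            numerator += (x - x_bar) * (y - y_bar)
            denominator += (x - x_bar) ** 2  # zero for single-group schools

    if denominator == 0.0:
        raise ValueError("no school sample contains both groups")
    return numerator / denominator


# Toy data (invented for illustration only).
toy = [
    ("A", True, 280), ("A", True, 290), ("A", False, 260),
    ("B", True, 270), ("B", False, 250), ("B", False, 240),
    ("C", True, 300), ("C", True, 310),  # White students only: no contribution
]
within_gap = pooled_within_school_gap(toy)

white = [s for _, w, s in toy if w]
black = [s for _, w, s in toy if not w]
raw_gap = sum(white) / len(white) - sum(black) / len(black)
```

In this toy example the raw gap exceeds the within-school gap because the single-group school C has a high mean, which is the kind of between-school difference the pooled estimate is meant to remove.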


Table 16
Number of schools and students by state (Reduced school sample)

                 1992                                     2000
                               # of Students                            # of Students
State  Stratum   # of Schools  Total  White  Black        # of Schools  Total  White  Black
CA     S1        38             954    451     84         24             544    255     44
       S2         7             250     49     36         16             434     66     56
KY     S1        50            1348   1093    190         38             913    747    129
       S2         8             199    137     55         23             549    406    118
MD     S1        70            1839   1119    515         68            1606   1019    382
       S2         5             111     21     81         11             219     34    158
MI     S1        31             826    653    113         22             550    455     32
       S2         9             215     60    118          9             167     57     77
NY     S1        27             705    466    127         22             506    349     53
       S2         7             177     54     50         19             390     83    117
NC     S1        74            1989   1395    498         64            1484   1002    373
       S2        16             435    157    239         21             469    146    260
SC     S1        73            1885   1271    491         46            1127    804    257
       S2        23             586    213    321         41             981    449    467
TN     S1        50            1354   1054    258         48            1211    914    218
       S2         7             184     87     90         13             285    138    128
TX     S1        50            1292    720    173         42            1003    554    150
       S2        15             482    126     67         18             509     96    102
VA     S1        85            2251   1552    483         78            1868   1255    403
       S2         7             172     89     62         13             258     84    134

Comparing Table 16 to Table 9, we observe that in both years there is a substantial reduction in numbers for all states, with the exception of North Carolina, South Carolina and Virginia. The reduction in the S2 school sample in 1992 resulted in only three states (North Carolina, South Carolina, and Texas) having ten or more schools contributing to the estimation of the contrast. Consequently, the gap estimates for S2 in 1992, based on these reduced samples, have greater uncertainty attached to them. In 2000, however, only one state (Michigan) had fewer than ten schools contributing to the estimation of the contrast.

As explained in the methods section, we fit two types of HLMs, yielding what we termed adjusted and fully-adjusted estimates of the achievement gap.19 In Table 17 we display four different estimates of the Black-White gaps for S1 in 1992 and 2000. For 1992, column (1) contains the estimated gaps already presented in Table 14, which compare weighted estimates of the mean achievement of all White students and of all Black students in the stratum. (The relevant counts are found in Table 9.)
The estimates in the next three columns are based on the reduced school sample, which is delineated in Table 16. They all employ unweighted analyses. Column (2) contains the differences in the average scores of White students and Black students. Column (3) displays the adjusted achievement gaps from the first type of HLM, while column (4) displays the fully-adjusted
19 The first type adjusts student scores for differences in schools attended; the second adjusts for schools attended, as well as student SES and student academic focus.


achievement gaps from the second type of HLM. This pattern is repeated for 2000 in columns (5) through (8). Table 18, for the data from S2, has a parallel structure to Table 17. (Note that the ordering of the states is the same as that in Table 14.)

It is important to recognize at the outset that the estimates based on the full school sample are not directly comparable to estimates based on the reduced school sample, both because of differences in the samples and because of the use of weights in the former but not the latter.20 Accordingly, in what follows we will focus on comparisons among estimates based on the reduced school sample.

We first observe that in S1, in both 1992 and 2000, removing school effects results in a reduction in the estimated gaps. For example, in New York the reductions are 30 – 21 = 9 points and 26 – 17 = 9 points in 1992 and 2000, respectively. In some states the reductions are substantial. In 1992, five states had reductions of at least nine points; in 2000 there were three such states. Turning to S2, removing school effects also results in reductions in the estimated gaps. (The only exceptions are Kentucky and Maryland in 2000.) Again, some states exhibit very substantial reductions in both years. In 1992, three states had reductions of at least ten points; in 2000 there were four such states.

It is noteworthy that even after removing school effects the estimated gaps in each stratum/state/year combination are quite large, with most exceeding 20 points. That is, the typical pooled within-school achievement gap is about the same size as the typical gap between strata (see Table 11). That these within-school gaps are both relatively large and persistent points to the need to probe more deeply into the differences in personal characteristics and school experiences that may account (in a statistical sense) for some portion of the achievement gap within schools.
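The S1 arithmetic above can be reproduced from the reduced-sample columns of Table 17. The following is a minimal sketch: the dictionary layout and function names are illustrative choices, and the inputs are the rounded gaps as printed in the table (include school effects, then remove school effects).

```python
# Reduced school sample, lower poverty stratum (S1), from Table 17:
# (gap including school effects, gap with school effects removed).
S1_GAPS = {
    "NY": {1992: (30, 21), 2000: (26, 17)},
    "CA": {1992: (32, 15), 2000: (32, 15)},
    "SC": {1992: (29, 26), 2000: (30, 26)},
    "KY": {1992: (25, 23), 2000: (24, 24)},
    "MI": {1992: (29, 23), 2000: (37, 33)},
    "TX": {1992: (34, 22), 2000: (31, 18)},
    "MD": {1992: (35, 26), 2000: (31, 25)},
    "VA": {1992: (31, 22), 2000: (29, 21)},
    "TN": {1992: (29, 28), 2000: (28, 24)},
    "NC": {1992: (27, 26), 2000: (33, 29)},
}


def school_effect_reduction(gaps, year):
    """Points by which the gap shrinks when school effects are removed."""
    return {state: by_year[year][0] - by_year[year][1]
            for state, by_year in gaps.items()}


def states_with_reduction_at_least(reductions, threshold):
    """Sorted list of states whose reduction meets the threshold."""
    return sorted(s for s, r in reductions.items() if r >= threshold)


red_1992 = school_effect_reduction(S1_GAPS, 1992)
red_2000 = school_effect_reduction(S1_GAPS, 2000)
large_1992 = states_with_reduction_at_least(red_1992, 9)
large_2000 = states_with_reduction_at_least(red_2000, 9)
```

Because the table entries are rounded, reductions computed this way may differ by a point from values calculated on unrounded estimates.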
When we adjust the estimate of the pooled within-school achievement gap for differences among students in SES and AcadFoc, the gap is decreased in 1992 (column 4) for both S1 and S2, but only very slightly in 2000 (column 8). This may be due to the fact that differences in AcadFoc between White students and Black students were reduced from 1992 to 2000, so that variable was unable to account for much of the differences in achievement between the two groups.

20 Interestingly, for S1, the estimates in columns (1) and (2) are quite similar for most states in both 1992 and 2000. The medians across states are nearly identical. This is essentially the case for S2 in 1992 as well, but not in 2000.


Table 17
Lower poverty stratum: Black-White achievement gaps from four analyses

Columns:
  1992  (1) Full school sample: include school effects
        (2) Reduced school sample: include school effects
        (3) Reduced school sample: remove school effects
        (4) Reduced school sample: remove school effects and student covariates
  2000  (5) Full school sample: include school effects
        (6) Reduced school sample: include school effects
        (7) Reduced school sample: remove school effects
        (8) Reduced school sample: remove school effects and student/school covariates

State    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
NY       29 (4.9)  30 (1.6)  21 (3.6)  14 (3.2)  15 (7.1)  26 (2.5)  17 (5.2)  15 (4.7)
CA       37 (3.6)  32 (2.3)  15 (2.2)  10 (2.8)  32 (5.7)  32 (1.5)  15 (3.9)  11 (4.5)
SC       30 (1.6)  29 (0.6)  26 (1.6)  17 (1.4)  30 (3.1)  30 (1.1)  26 (2.8)  22 (2.6)
KY       24 (3.3)  25 (0.5)  23 (3.5)  16 (2.5)  23 (3.4)  24 (1.4)  24 (3.5)  23 (3.1)
MI       37 (3.4)  29 (0.9)  23 (4.0)  21 (3.2)  38 (5.7)  37 (1.8)  33 (6.2)  30 (5.4)
TX       34 (2.8)  34 (1.3)  22 (2.4)  14 (2.1)  34 (4.5)  31 (0.6)  18 (2.2)  12 (2.4)
MD       35 (2.4)  35 (0.6)  26 (1.8)  18 (1.8)  33 (2.5)  31 (1.3)  25 (2.2)  23 (2.1)
VA       29 (2.0)  31 (1.0)  22 (2.1)  15 (2.0)  31 (2.5)  29 (0.8)  21 (2.0)  17 (1.8)
TN       27 (2.2)  29 (0.6)  28 (2.0)  21 (1.6)  28 (6.1)  28 (1.4)  24 (3.1)  20 (2.9)
NC       25 (1.8)  27 (0.8)  26 (1.7)  19 (1.4)  34 (2.1)  33 (0.7)  29 (2.4)  22 (2.1)
Median   29        30        23        16        31        31        24        22


Table 18
Higher poverty stratum: Black-White achievement gaps from four analyses

Columns:
  1992  (1) Full school sample: include school effects
        (2) Reduced school sample: include school effects
        (3) Reduced school sample: remove school effects
        (4) Reduced school sample: remove school effects and student covariates
  2000  (5) Full school sample: include school effects
        (6) Reduced school sample: include school effects
        (7) Reduced school sample: remove school effects
        (8) Reduced school sample: remove school effects and student/school covariates

State    (1)        (2)       (3)       (4)       (5)        (6)       (7)       (8)
NY       43 (6.9)   34 (2.5)  18 (4.5)  14 (4.1)  31 (4.5)   25 (0.9)   7 (3.1)   5 (3.6)
CA       43 (2.4)   34 (2.7)  13 (3.5)   8 (4.6)  30 (6.7)   23 (2.6)   3 (2.8)   8 (3.4)
SC       27 (2.8)   26 (1.0)  24 (2.9)  18 (2.2)  24 (2.6)   24 (1.2)  22 (2.3)  20 (2.1)
KY       21 (5.3)   24 (2.6)  23 (4.1)  15 (5.2)  16 (4.6)   18 (0.9)  20 (4.3)  19 (3.6)
MI       39 (5.7)   30 (2.0)  16 (7.5)  16 (6.7)  39 (4.8)   25 (1.6)  12 (5.4)  16 (5.6)
TX       33 (4.5)   30 (2.5)  15 (4.0)   8 (5.1)  34 (3.8)   32 (1.9)  14 (3.7)  18 (3.3)
MD       26 (11.5)  23 (2.5)  17 (8.7)  12 (5.4)  24 (10.6)  17 (3.3)  22 (7.9)  23 (6.5)
VA       20 (6.7)   21 (2.4)  20 (3.9)  12 (3.4)  32 (6.1)   26 (1.5)  19 (3.9)  15 (3.8)
TN       29 (3.6)   25 (2.1)  22 (6.6)  17 (4.1)  31 (4.5)   30 (2.9)  22 (8.8)  21 (6.9)
NC       31 (3.7)   30 (0.7)  24 (3.9)  16 (3.4)  28 (3.5)   24 (1.4)  20 (3.0)  17 (3.0)
Median   30         28        19        14        31         24        20        19
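Tables 17 and 18 can be compared stratum against stratum by simple subtraction, which is how the between-strata differences reported in Table 19 are defined (S1 gap minus S2 gap). The sketch below does this for column (3), the 1992 gaps with school effects removed. Variable and function names are illustrative, and the inputs are the rounded values printed in the two tables.

```python
from statistics import median

# Column (3) of Table 17 (S1) and Table 18 (S2):
# 1992 gaps with school effects removed, reduced school sample.
GAP_S1_1992 = {"CA": 15, "KY": 23, "MD": 26, "MI": 23, "NC": 26,
               "NY": 21, "SC": 26, "TN": 28, "TX": 22, "VA": 22}
GAP_S2_1992 = {"CA": 13, "KY": 23, "MD": 17, "MI": 16, "NC": 24,
               "NY": 18, "SC": 24, "TN": 22, "TX": 15, "VA": 20}


def stratum_difference(gap_s1, gap_s2):
    """Gap in the lower poverty stratum minus gap in the higher poverty stratum."""
    return {state: gap_s1[state] - gap_s2[state] for state in gap_s1}


diff_1992 = stratum_difference(GAP_S1_1992, GAP_S2_1992)
median_diff_1992 = median(diff_1992.values())
```

A positive entry means the within-school gap is larger in the lower poverty stratum than in the higher poverty stratum for that state.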


Table 19
Differences in achievement gaps between strata [S1 – S2], within state, by year (full and reduced school samples)

Columns:
  1992  (a) Full school sample: include school effects
        (b) Reduced school sample: remove school effects
        (c) Reduced school sample: remove school effects & student covariates
  2000  (d) Full school sample: include school effects
        (e) Reduced school sample: remove school effects
        (f) Reduced school sample: remove school effects & student covariates

State    (a)    (b)    (c)    (d)    (e)    (f)
CA        -6      2      2      2     12      3
KY         3      0      2      7      4      4
MD         9      9      6      9      3      0
MI        -2      7      5     -1     21     14
NC        -6      2      3      6      9      5
NY       -14      3      0    -16     10     10
SC         3      2     -1      6      4      2
TN        -2      6      4     -3      2     -1
TX         1      7      6      0      4     -6
VA         9      2      3     -1      2      2
Median  -0.5    2.5      3      1    4.5    2.5

Table 19 presents, for each year, the differences in achievement gaps between strata, within a state.21 Results are displayed for the descriptive analysis, as well as both HLM-based analyses. It is evident that, with school effects removed, the achievement gap in the lower poverty stratum (S1) is uniformly higher than in the higher poverty stratum, although no individual state difference is statistically significant. There is a suggestion of the differences between strata being larger in 2000 than in 1992, with substantial increases in California and Michigan. In California, this is a consequence of a reduction in the gap in S2 from 1992 to 2000, while in Michigan it is a consequence of an increase in the gap in S1 coupled with a small reduction in S2. The patterns in between-stratum differences and trends over time are similar for the results from the analysis in which both school effects and student covariates are removed.

We now turn to trends over time in the achievement gap within strata. Results are presented in Table 20.
The entries represent the reduction in the gap between 1992 and 2000, with positive numbers signaling a reduction and negative numbers an increase.22 Focusing first on the results from removing school effects only, we note that for S1 there is a nearly even balance between reductions and increases in the gap, with a median of 0.5 points. This is similar, overall, to the results presented in Table 14, although there are discrepancies for particular states. For example, in the full sample, New York experienced a reduction of 14 points, while for the HLM analysis (based on the reduced school sample) the gap decreased by only 4 points. For S2, eight out of the ten states experienced reductions, with a median reduction of about 2.5 points. Again, this is similar to the results presented in Table 14, where the median reduction was about 2 points. With the reduced school
21 That is, the tabled entry = [White mean – Black mean]S1 – [White mean – Black mean]S2.
22 That is, the tabled entry = [White mean – Black mean]1992 – [White mean – Black mean]2000.


sample, both California and New York experienced the greatest reductions in the gap, as was the case with the full school sample.

Table 20
Reduction in achievement gap within stratum by state [1992 to 2000]. (Reduced school sample.)

Columns:
  S1  (a) Remove school effects
      (b) Remove school effects & student covariates
  S2  (c) Remove school effects
      (d) Remove school effects & student covariates

State    (a)    (b)    (c)    (d)
NY         4     -1     11      9
CA         0     -1     10      0
SC         0     -5      2     -2
KY        -1     -7      3     -4
MI       -10     -9      4      0
TX         3      2      1    -10
MD         1     -5     -5    -11
VA         1     -2      1     -7
TN         4      1      0     -4
NC        -3     -3      4     -1
Median   0.5   -2.5    2.5   -2.5

When we consider the results for the analyses in which the estimated gaps are adjusted for student covariates, as well as for school effects, we observe a preponderance of increases in the gaps for both strata. The median increase in the gap is about 2.5 points. Again, there is some ambiguity in interpreting these results, as they are influenced by the differential trajectories of the covariates over time.

Linking Results

In an earlier section we presented rankings of states into one of three categories on each of five policy levers, as well as an overall ranking (Table 7). Subsequently, we presented the results of our analyses of NAEP score trends drawing on data from the full school samples. Table 15 displays the ranking of states based on those results.

Although policy makers are invariably interested in “what works”, a statistical analysis of the available data cannot answer this question directly. Patterns of association between policy rankings and outcome rankings can, however, provide useful insights. Specifically, it is possible to determine whether those states judged to be more successful in the policy realm were among those that experienced greater improvements in student outcomes and, conversely, whether those judged to be less successful tended to experience poorer outcomes.
Given the nature of the phenomena under study, neither one policy analysis nor a single set of statistics can adequately convey the many strands that constitute a state’s “story” or the complexity of the relationships among the strands for ten different states. We have chosen, therefore, to adopt multiple perspectives.

In this context, an analysis of the possible impact of educational policy requires consideration of the trajectories of both groups, as well as of the achievement gap between them. Accordingly, we employ two outcome measures, improvements in the NAEP scores of Black students and reductions in the achievement gaps between Black students and White students. For


each measure, we examine the state’s record in both the higher poverty and lower poverty strata. In our study of the linkage between the outcome measures and the states’ education policies, we consider the overall policy ranking, as well as the rankings on each policy component. Recall that even though the rankings are grounded in the extensive data collected and analyzed, there is necessarily some subjectivity involved in the process.

To assist in the evaluation of the strength of the linkage between policy and outcomes, we rely on a simple measure of association: We count the number of category shifts that would be necessary to bring a policy ranking and an outcome ranking into perfect alignment. The fewer the number of shifts, the stronger the linkage.23

The Search for Patterns

Consider first the outcome “Improving Black student achievement.” The “Overall” ranking, as well as the rankings on “Teacher quality” and “Assessment and accountability” provide the best matches to the combined (across the two strata) rankings on this outcome. The matches for the other three policy components are clearly poorer. The pattern of association between the “Overall” ranking and the outcomes for the two strata is represented in Figure 6. Note that, ideally, states’ rankings on both dimensions would be identical. In that case, off-diagonal cells would be empty.

                 Improving Black Student Achievement: Rankings
Overall Policy
Rankings         1            2                 3
1                nc NC
2                ky ny NY     sc tx CA KY SC    ca TX
3                md           va MD MI          mi tn TN VA

Figure 6
Relationship between overall policy ranking and improving Black student achievement. Cell entries are states: lower case abbreviation denotes the lower poverty stratum and upper case abbreviation denotes the higher poverty stratum.

One can argue that the quality of the match is nearly as good as is possible, given the pairs of outcome ranks associated with each state.
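The category-shift count described above can be sketched in a few lines. The study does not spell out the exact computation, so this sketch assumes one plausible reading: for every state and stratum, count the number of categories separating the policy rank from the outcome rank, and add these up. The outcome ranks are taken from Table 15 and the overall policy categories from the rows of Figure 6; the function and variable names are hypothetical.

```python
# Overall policy category per state (rows of Figure 6).
POLICY = {"NC": 1, "CA": 2, "KY": 2, "NY": 2, "SC": 2, "TX": 2,
          "MD": 3, "MI": 3, "TN": 3, "VA": 3}

# Outcome categories from Table 15, as {rank: [states]} per stratum.
IMPROVING = {
    "S1": {1: ["KY", "MD", "NY", "NC"], 2: ["SC", "TX", "VA"], 3: ["CA", "MI", "TN"]},
    "S2": {1: ["NY", "NC"], 2: ["CA", "KY", "MD", "MI", "SC"], 3: ["TN", "TX", "VA"]},
}
CLOSING = {
    "S1": {1: ["NY"], 2: ["CA", "MD"],
           3: ["KY", "MI", "NC", "SC", "TN", "TX", "VA"]},
    "S2": {1: ["CA", "NY"], 2: ["KY", "NC", "SC"],
           3: ["MD", "MI", "TN", "TX", "VA"]},
}


def category_shifts(policy, outcome_by_stratum):
    """Total number of one-category moves needed to align outcome with policy.

    Assumption: a shift of k categories for one state in one stratum counts k.
    """
    total = 0
    for ranks in outcome_by_stratum.values():
        for rank, states in ranks.items():
            for state in states:
                total += abs(policy[state] - rank)
    return total


shifts_improving = category_shifts(POLICY, IMPROVING)
shifts_closing = category_shifts(POLICY, CLOSING)
```

Under this reading a smaller total indicates a closer match between the policy ranking and the outcome ranking; other definitions of a "shift" would give different totals.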
For the “Overall” ranking, the state in the highest category is also rank 1 on the outcome in both strata, four of the five states in the middle category
23 This count is approximately a non-normalized version of Kendall’s tau, a measure of association often used with ranked data.


were also rank 2 on the outcome in at least one stratum, and three of the four states in the lowest category were also rank 3 in at least one stratum and never in rank 1. Maryland and New York have the greatest disparities between their overall policy ranking and their ranking on outcomes. Maryland is in the lowest policy category but is in rank 1 and rank 2 with respect to the outcome. New York is solidly in the middle category on policy, but very clearly rank 1 on the outcome in both strata.

With respect to the outcome “Closing the Achievement Gap”, the “Overall” ranking provides the best match to the combined rankings. Among the five policy levers, the rankings on “Curriculum and standards” and “Teacher quality” have the best matches, while “Governance” and “Assessment and accountability” have the poorest matches. The pattern of association between the “Overall” ranking and the outcomes for the two strata is represented in Figure 7. The linkage with policy is weaker for this outcome. The pattern is clearest for those states in the lowest category for overall policy: they are most likely to fall in the lowest category on the outcome for both strata.

                 Closing the achievement gap: Rankings
Overall Policy
Rankings         1            2            3
1                             NC           nc
2                ny CA NY     ca KY SC     ky sc tx TX
3                             md           mi tn va MD MI TN VA

Figure 7
Relationship between overall policy ranking and closing the achievement gap. Cell entries are states: lower case abbreviation denotes lower poverty stratum and upper case abbreviation denotes higher poverty stratum.

As the evaluation of states’ policies makes clear, each state focused on a different set of policy levers (with different degrees of success) over the period of interest. The patterns just described indicate that there is no simple accounting for states’ results. No single dimension predicted states’ outcomes as well as the overall ranking did, although teacher quality was nearly as good.
Policy rankings are more predictive of states’ ranks on improving Black achievement than of their ranks on closing the achievement gap. Not only is the latter outcome more refractory to progress but also it is less obvious what combination of policies can lead to some success.

Reducing the Gap: Selected State Stories

Turning to individual states, New York, with the best record on outcomes, was rated in the highest category on teacher quality, and in the middle category on the other four policy components, as well as on overall policy. During this period, New York was distinguished by its greater emphasis


on professional development for its mathematics teachers than on high-stakes accountability. (It serves as a counter-example to the claim that high stakes testing is essential to making progress.) New York did have a solid assessment program that had been under development from the mid-1980s, as well as a rigorous end-of-high school examination battery (Regents’ Examinations) that was not required for graduation during 1988–98. Nonetheless, it was thought to exert some influence throughout the system.

California was second to New York in reducing the achievement gap. Overall, it was more successful in the higher poverty stratum than in the lower poverty stratum. Like New York, it was rated in the middle category on four of the five policy components, and in the highest category on education finance. California did not institute high-stakes testing and, in fact, its assessment program suffered considerable discontinuities during the period.

North Carolina was distinguished by its placement in the highest category on four policy components and the fact that it substantially raised Black student achievement in both strata. (However, some of this improvement should be discounted in view of the state’s large increase in exclusion rates.) With respect to closing the achievement gap, on the other hand, North Carolina’s record is rather poor.

Kentucky’s policies have been well documented and, in many ways, it was considered a leader in education reform in the 1990s. Although it began with a comprehensive design, its implemented reforms were incomplete and beset by political battles. It had limited success with respect to either outcome.

Texas, which employed high stakes testing as the driving force behind its education reforms, had relatively little success during the period of interest.
Contrasting the data from Texas, on the one hand, with that of New York and California on the other, suggests that test-based accountability alone cannot carry the day. Rather, a reasonably consistent, broad-based effort can be more successful.

Maryland was notable for its consistent and somewhat narrow focus on its state assessment program, the Maryland School Performance Assessment Program (MSPAP). Initiated in 1991, the MSPAP was the state’s primary reform during this period. The state did not expend a great deal of energy on improving fiscal equity between schools and districts, was relatively slow in developing extensive standards for teachers to refer to, and did not do a great deal with respect to raising teacher quality or qualifications. However, because the state is small (with only 24 school districts), the state department was able to maintain good communication channels with every school district.

Compared to the other states in the study, Virginia was late in introducing quality state standards and an accountability system. It was only in 1994 that Virginia began to overhaul the state system of standards and accountability. Starting in 1995, the state undertook a significant effort to familiarize all teachers with what is known as the Standards of Learning (SOLs), a set of standards and objectives for each grade and course. The state funded a lead teacher in each school in order to train teachers and oversee implementation of the SOLs. Assessments aligned with the SOLs were introduced in 1998, toward the end of our period of study.

South Carolina had a different strategy than all the other states in this study. It began in 1993 with a slow introduction of standards and programs to help teachers become familiar with the new standards and practices, followed by progressive upgrading of teacher quality standards and some attempts at equalizing funding for the poorest school districts.
Then, in 1998, the state introduced aligned assessments and increased accountability. The ongoing changes in terms of standards and improvement programs proved to be confusing to some teachers and administrators who felt they were being asked to make incremental changes over many years.

The strong tradition of local control in Michigan, combined with the lack of explicit support from both the teachers’ union and the higher education institutions in the state, meant that changes


introduced by the state education department had limited impact on local districts. The two main reforms in the state were the introduction of the core content standards in 1990 and the 1993–1994 overhaul of school finance.

Tennessee, in spite of its finance reform initiated in 1993, remained the lowest spending state in terms of per-pupil expenditure of any of the states in this study, and remains well below the national average. Combined with standards that varied in quality during the time period in question, no accountability, and an unclear link between standards and assessment, it is not surprising that the state experienced relatively poor outcomes over the period.

Discussion

The challenge for American education is both to raise achievement for all and to close the gaps between majority and minority students. As various education reforms are developed and launched (often with considerable fanfare), the paramount question is: Has there been any improvement and, if so, to what can we attribute the success? The first part of the question is amenable to statistical analysis. The second part, concerning cause and effect, cannot be answered definitively. The difficulty is both that the data are derived from observational surveys and that the different policies overlap in time. Consequently, it is almost impossible to isolate the impact of a particular policy.

This study describes an approach that is at once limited and ambitious. It considers the experiences of just ten states over a decade and focuses on analyzing the NAEP scores of White students and Black students during that period on a single assessment in one grade. On the other hand, the public schools in the ten states in this study enrolled about 40–45% of all Black students in public schools in the nation.
The study breaks “new ground” by probing beneath the surface of reported state results to estimate the achievement gaps within poverty strata and even within schools within strata. Further, it incorporates extensive policy histories for each state, looking for links between states’ policies and their relative success in meeting the challenges cited above.

With respect to outcomes, mean achievement rose in both poverty strata in all ten states: for all students, for White students and for Black students. There was, however, considerable variation across states in the amount of improvement. Further analysis revealed substantial heterogeneity within states: The typical achievement gap between strata within states was about 20 points and the typical Black-White achievement gap within a stratum was about 30 points. For nearly all states, the within-stratum Black-White achievement gaps were very similar in magnitude for the two strata. What was more surprising was that for most states the within-stratum Black-White achievement gaps were nearly as large as the Black-White achievement gap for the state as a whole. That is, the large differences in average scores between lower poverty and higher poverty schools accounted for only a part of the state level Black-White achievement gap. This finding is of some interest in view of suggestions of using social class rather than race as a basis for affirmative action.

To put these results in perspective, in 1992 the difference between the highest state mean and the lowest was only ten points. Yet across the ten states, the median Black-White achievement gap within each stratum of about 30 points is nearly as large as the standard deviation of test scores in the full population, corresponding to an effect size of almost one. Finally, in 1990, when the cross-grade scale in mathematics was established, the difference in scores between the median 8th grader and the median 4th grader was 50 points.
The focus of the study has been on the trajectories of the achievement gaps from 1992 to 2000. Across the ten states, the median gap between strata and the median Black-White achievement


gap within a stratum each remained essentially constant over the period. In fact, for most states there was little change in the outcomes we considered. On the other hand, there were considerable differences in trajectories for some states, which led to the question of whether differences in states’ education policies could account for the variation in states’ results.

To address this question, we developed a policy framework and compiled extensive histories of states’ policies from (roughly) 1988 to 1998, supplemented by information provided by experts in each state. We found that states’ overall policy rankings correlated moderately with their record in improving Black student achievement but were less useful in predicting their record in reducing the Black-White achievement gap. For both outcomes, the strength of the association between policy and the direction of the trajectory was limited by the fact that, for most states, results differed by stratum. No single policy component accounted for the differences among states in the two outcomes as well as did the overall policy rating, although teacher quality did nearly as well. Perhaps the clearest finding was that states in the lowest category on overall policy (Michigan, Tennessee and Virginia) were also in the lowest category in both strata with respect to reducing the achievement gap.

One caution in interpreting the results on the Black-White achievement gaps within strata is due to the differential distributions of Black students and White students across schools within a stratum. Thus, observed gaps could be confounded with average differences among schools. Accordingly, we undertook a series of multi-level analyses to estimate the size of the achievement gap, after eliminating between-school differences. The resulting pooled, within-school estimates were somewhat smaller than the original estimates but still quite large.
In this setting, we observed some interesting consistencies. For all states in both years, the achievement gap in S1 was greater than the achievement gap in S2. For most states the achievement gap was reduced in both strata, particularly in S2.

Limitations

There are a number of limitations that should be taken into account in interpreting these results. First, in carrying out the policy analysis under time and budget constraints, we could not capture all the salient features of the political environment and policy landscape in the ten states over a ten-year period. To do so would have required writing a book on each state, as has been done for California and Kentucky! Such a book would have included interviews with a larger number of experts, more information on the scope and quality of the implementation of the various reform initiatives, as well as data on demographic and economic trends within the state. Our categorizations of the states on the different policy levers were necessarily grounded in the information that we were able to amass and, ultimately, determined by our subjective judgments of the relative strengths of each state’s efforts.

With respect to the data analysis, our estimates of the trajectories are potentially confounded with differences between the 1992 and 2000 cohorts. It is also the case that NAEP was not designed to support the kind of “deep” analyses we have carried out. The target number of schools in a state sample was only about 100 and, in some cases, the realized sample was noticeably smaller. Furthermore, the number of students tested within each school is only about 20. Consequently, the sample sizes for estimating the achievement gaps in many stratum/state/year combinations were disappointingly small and the estimated standard errors correspondingly large.
Because the analyses were exploratory and intended to lead to a delineation of summary patterns, we used the standard errors for guidance, rather than hewing to the conventional .05 level for significance. Nonetheless, it should be borne in mind that the categorizations of the states were based on estimated changes over time that typically were not significantly different from zero and were subject to considerable uncertainty.

Small sample sizes in the reduced school sample were particularly problematic when using HLM to estimate the pooled within-school achievement gap. Moreover, because of the way in which schools were originally selected for NAEP, the estimates derived from the reduced school sample cannot be directly generalized to a larger population of schools. Thus, those results must be treated as suggestive of what one might have found in a school sample expressly drawn to estimate the pooled within-school achievement gap.

This study employed data from the NAEP 8th grade mathematics assessment for ten states. We are not in a position to generalize our findings to other states, other subjects or other grades.

Finally, we must keep in mind that state policies are focused on strengthening the state’s curriculum frameworks and standards, enhancing the alignment of instruction with those standards, and improving scores on the state’s tests. To the extent that the NAEP mathematics framework and the NAEP assessment differ from the state’s, the patterns in NAEP results may not be an entirely fair basis for making judgments about the relative success of the state’s education reform efforts. Furthermore, the magnitude of the achievement gap and its trajectory over time will vary with the choice of the criterion. For example, trends in the difference in the proportions of the two groups meeting the state’s standard of proficiency can be quite unlike trends in the difference in the means for the two groups, whether on a state test or on a comparable NAEP test.
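The dependence of the gap on the choice of criterion can be illustrated with a small numerical sketch. The group means, the common standard deviation, and the proficiency cut score below are hypothetical values chosen for illustration (they are not NAEP or state-test figures), and scores are assumed to be normally distributed. Both groups gain the same ten points, so the gap in means is unchanged, yet the gap in the proportions above the cut score widens.

```python
from statistics import NormalDist

SD = 30.0          # assumed common standard deviation (hypothetical)
CUT_SCORE = 300.0  # assumed proficiency standard (hypothetical)


def proportion_proficient(mean):
    """Share of a normal(mean, SD) group scoring above the cut score."""
    return 1.0 - NormalDist(mu=mean, sigma=SD).cdf(CUT_SCORE)


def gaps(white_mean, black_mean):
    """Return (gap in means, gap in proportions above the cut score)."""
    mean_gap = white_mean - black_mean
    proportion_gap = (proportion_proficient(white_mean)
                      - proportion_proficient(black_mean))
    return mean_gap, proportion_gap


# Hypothetical first year, then a later year in which both groups gain 10 points.
mean_gap_1, prop_gap_1 = gaps(280.0, 250.0)
mean_gap_2, prop_gap_2 = gaps(290.0, 260.0)

print(f"gap in means:       {mean_gap_1:.0f} -> {mean_gap_2:.0f}")
print(f"gap in proportions: {prop_gap_1:.3f} -> {prop_gap_2:.3f}")
```

Under these assumptions the two criteria tell different stories about the same score changes, which is why the choice of criterion needs to be stated whenever a gap trend is reported.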
Conclusions

We believe that this study supports the assertion that analyses of outcomes within states are more interesting and useful than gross comparisons among states. From 1992 to 2000, Black students in the eighth grade made modest gains on the NAEP mathematics assessment but, overall, the Black-White achievement gap remained essentially constant at each of the three levels of analysis we carried out. We conclude that the Black-White achievement gap can be fairly characterized as pervasive, profound and persistent: pervasive because the gap exists in all ten states, profound because the gap can be found at all three levels of aggregation, and persistent because overall the gap did not diminish over the period of study.

While a few states made some progress and others lost ground, in all our analyses most states experienced little change in the achievement gap over the period of the study. Nonetheless, our categorizations of the states yielded an interesting result: policy variations among states appear to account for a modest amount of the differences in outcomes. In other words, top-down reform is a blunt tool, but a tool nonetheless. At the same time, these findings highlight the difficulties in realizing some of the goals that inspire state reforms. Although individual schools may demonstrate progress, large-scale effective reforms appear to be much rarer. While there are many reasons why this is the case, a particularly constructive analysis of the difficulties associated with relying on standards-based reform to effect school improvement can be found in O’Day (2002).

On the basis of this study, we cannot make facile pronouncements of “what works.” Our findings are consistent with the recommendations made by the Center on Education Policy (2001) for closing the achievement gap. We want to draw particular attention to its warning, “Closing the gap will require bold, comprehensive and long-term strategies” (p. iii).
In view of our within-school results, “bold” must include policies that directly support local reform efforts with demonstrated effectiveness in addressing the experiences of students of different races attending the same schools. Considering our overall findings, “comprehensive” should signify full use of all policy levers, rather than reliance on one or two. Coherence and consistency do matter. Finally, “long-term” means that reform efforts should be built on broad-based support and structured to be better able to withstand the vicissitudes of economic trends and state politics.

References

Allen, N. L., Jenkins, F., Kulick, E., & Zelenak, C. A. (1997). Technical report of the NAEP 1996 state assessment program in mathematics. Washington, DC: National Center for Education Statistics.
Allen, N. L., Johnson, E. G., Mislevy, R. J., & Thomas, N. (1999, July). Scaling procedures. In N. L. Allen, J. E. Carlson & C. A. Zelenak (Eds.), The NAEP 1996 technical report, NCES 1999–452 (pp. 75–95). Washington, DC: National Center for Education Statistics.
Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18). Retrieved March 28, 2002, from http://epaa.asu.edu/epaa/v10n18/.
Antonucci, M. (1999). Measure for measure: A magnified look at standardized test scores. Carmichael, CA: Education Intelligence Agency.
Barton, P. E. (2002). Raising achievement and reducing gaps: Reporting progress toward goals for academic achievement in mathematics. Washington, DC: National Education Goals Panel.
Barton, P. E. (2003, October). Parsing the achievement gap. (Policy Information Report). Princeton, NJ: Educational Testing Service.
Barton, P. E., & Coley, R. J. (1998). Growth in school: Achievement gains from the fourth to the eighth grade. (Policy Information Report). Princeton, NJ: Educational Testing Service.
Blank, R. K. (2000). Summary of findings from SSI and recommendations for NSF's role with states: How NSF can encourage state leadership in improvement of science and mathematics education. Washington, DC: Council of Chief State School Officers.
Bracey, G. W. (2002). International comparisons: An excuse to avoid meaningful educational reform. Education Week, 21(19), p. 30.
Bracey, G. W. (2003). Those misleading SAT and NAEP trends: Simpson's paradox at work. Retrieved September 21, 2004, from http://www.americatomorrow.com/bracey/EDDA/EDDRA30.htm
Brady, R. C. (2003). Can failing schools be fixed? Washington, DC: Thomas B. Fordham Foundation.

Braun, H. I. (2004). Reconsidering the impact of high-stakes testing. Education Policy Analysis Archives, 12(1). Retrieved September 22, 2004, from http://epaa.asu.edu/epaa/v12n1/.
Burke, J., & James, P. (1997). Weighting procedures and variance estimation. In N. L. Allen et al. (Eds.), Technical report of the NAEP 1996 state assessment program in mathematics. Washington, DC: National Center for Education Statistics.
Camilli, G. (2000). Texas gains on NAEP: Points of light? Education Policy Analysis Archives, 8(42). Retrieved September 22, 2004, from http://epaa.asu.edu/epaa/v8n42/.
Carnoy, M., & Loeb, S. (2003). Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis, 24(4), 305–331.
Cawelti, G. (1999). Portraits of six benchmark schools: Diverse approaches to improving student achievement. Arlington, VA: Education Research Service.
Center on Education Policy. (2001). It takes more than testing: Closing the achievement gap. Washington, DC: Center on Education Policy.
Chambers, R. L. (2003). Which sample survey strategy? A review of three different approaches. (S3RI Methodology Working Paper No. M03/20). Southampton, UK: Southampton Statistical Sciences Research Institute.
Christensen, L., & Karp, S. (2003, October 8). Why is school reform so hard?: Two classroom veterans rethink the process. Education Week.
Clune, W. H. (1998). Toward a theory of systematic reform: The case of nine NSF Statewide Systematic Initiatives (Research Monograph No. 16). Madison, WI: University of Wisconsin, National Institute for Science Education.
Coleman, J. S., et al. (1966). Equality of educational opportunity. Washington, DC: U.S. Government Printing Office.
Coley, R. J. (2003, November). Growth in school revisited: Achievement gains from the fourth to the eighth grade. (Policy Information Report). Princeton, NJ: Educational Testing Service.
College Board. (1999). Reaching the top: A report of the National Task Force on Minority High Achievement. New York, NY: The College Board.
Corcoran, T. B. (1997). Evaluating systemic reform (TEECH Paper): Teacher Enhancement Electronic Community Hall. Retrieved January 2004, from http://teech.terc.edu/papers/papers/corcoran.htm.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8(1). Retrieved March 20, 2006, from http://epaa.asu.edu/epaa/v8n1/
Dee, T. S., & Keys, B. J. (2004). Does merit pay reward good teachers? Evidence from a randomized experiment. Journal of Policy Analysis and Management, 23(3), 471–488.

Desimone, L. M., Smith, T. M., Hayes, S. A., & Frisvold, D. (2005, Winter). Beyond accountability and average mathematics scores: Relating state education policy attributes to cognitive achievement domains. Educational Measurement: Issues and Practice, 24, 5–18.
Education Trust. (1999). Dispelling the myth: High poverty schools exceeding expectations. Washington, DC: Education Trust.
Education Trust. (2001). Dispelling the myth revisited: Preliminary findings from a nationwide analysis of "high-flying" schools. Washington, DC: Education Trust.
Education Trust. (2002). The funding gap: Low-income and minority students receive fewer dollars. Washington, DC: Education Trust.
Edwards, V. B., & Olson, L. (Eds.) (1997, January). Quality Counts: A report card on the condition of public education in the 50 States. (An Education Week/Pew Charitable Trusts Report). Washington, DC: Editorial Projects in Education.
Edwards, V. B., & Olson, L. (Eds.) (1998, January). Quality Counts '98: The urban challenge (An Education Week/Pew Charitable Trusts Report on Education in the 50 States). Washington, DC: Editorial Projects in Education.
Edwards, V. B., & Olson, L. (Eds.) (1999, January). Quality Counts '99: Rewarding results, punishing failure (An Education Week/Pew Charitable Trusts Report on Education in the 50 States). Washington, DC: Editorial Projects in Education.
Edwards, V. B., & Olson, L. (Eds.) (2000, January). Quality Counts 2000: Who should teach? (An Education Week/Pew Charitable Trusts Report on Education in the 50 States). Washington, DC: Editorial Projects in Education.
Ferguson, R. F. (1998). Can schools narrow the Black-White test score gap? In C. Jencks & M. Phillips (Eds.), The Black-White test score gap (pp. 318–374). Washington, DC: Brookings Institution.
Friedman, B. M. (2005). Meltdown: A case study. The Atlantic Monthly, 296(1), 66–68.
Grissmer, D., Flanagan, A., Kawata, J., & Williamson, S. (2000). Improving student achievement: What state NAEP scores tell us. Santa Monica, CA: RAND Corp.
Haney, W. (2000). The myth of the Texas miracle in education. Education Policy Analysis Archives, 8(41). Retrieved March 20, 2006, from http://epaa.asu.edu/epaa/v8n41/.
Hanushek, E. A., Rivkin, S. G., & Taylor, L. L. (1996). Aggregation and the estimated effects of school resources. The Review of Economics and Statistics, 78, 611–627.
Hedges, L. V., & Nowell, A. (1998). Black-White test score convergence since 1965. In C. Jencks & M. Phillips (Eds.), The Black-White test score gap (pp. 149–181). Washington, DC: Brookings Institution.

Herrnstein, R., & Murray, C. (1994). The bell curve: Intelligence and class structure in American life. New York: Free Press.
Hoxby, C. M. (2001). How school choice affects the achievement of public school students. Paper presented at the Koret Task Force meeting, Hoover Institution, Stanford, CA.
Hussar, W., & Sonnenberg, W. (2000). Trends in disparities in school district level instructional expenditures per pupil (NCES 2000–020). Washington, DC: U.S. Department of Education, National Center for Education Statistics.
Ingersoll, R. M. (1999). The problem of underqualified teachers in American secondary schools. Educational Researcher, 28(2), 26–37.
Jencks, C., & Phillips, M. (1998). The Black-White test score gap: An introduction. In C. Jencks & M. Phillips (Eds.), The Black-White test score gap (pp. 1–52). Washington, DC: Brookings Institution.
Klein, S., Hamilton, L., McCaffrey, D., Stecher, B., Robyn, A., & Burroughs, D. (2000). Teaching practices and student achievement: Report of first-year findings from the 'Mosaic' study of systematic initiatives in mathematics and science. (RAND Education). Santa Monica, CA: RAND.
Kober, N. (2001, April). It takes more than testing: Closing the achievement gap. Washington, DC: Center on Education Policy.
Koretz, D. (2005). Using aggregate-level linkages for estimation and validation: Comments on Thissen and Qian & Braun. Presented at the ETS Conference on Linking and Aligning Scores and Scales: A conference in honor of Ledyard R Tucker’s approach to theory and practice, Princeton, NJ, June 24–25, 2005.
Kosters, M. H., & Mast, B. D. (2003). Closing the education achievement gap: Is Title I working? Washington, DC: AEI Press.
Lazer, S. (1999, July). Assessment instruments. In N. L. Allen, J. E. Carlson & C. A. Zelenak (Eds.), The NAEP 1996 technical report, NCES 1999–452 (pp. 75–95). Washington, DC: National Center for Education Statistics.
Lee, J. (2002). Racial and ethnic achievement gap trends: Reversing the progress toward equity? Educational Researcher, 31(1), 3–12.
Little, R. (2003, November 12). To model or not to model? Competing modes of inference for finite population sampling. Ann Arbor, MI: University of Michigan, School of Public Health. Retrieved January 28, 2005, from http://www.bepress.com/umichbiostat/paper4.
Ludwig, J. (2003). A review of Educational Achievement and Black-White Inequality by J. Jacobsen, C. Olsen, J. King Rice, S. Sweetland, and J. Ralph (Eds.). Education Next, 3(3), 79–82. Retrieved March 11, 2005, from http://www.educationnext.org/20033/79.html.

Miller, S. (1995). An American imperative: Accelerating minority educational advancement. New Haven, CT: Yale University Press.
Nichols, S. L., Glass, G. V., & Berliner, D. C. (2006). High-stakes testing and student achievement: Does accountability pressure increase student learning? Education Policy Analysis Archives, 14(1). Retrieved February 20, 2006, from http://epaa.asu.edu/epaa/v14n1/.
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.
O’Day, J. A. (2002). Complexity, accountability, and school improvement. Harvard Educational Review, 72(3). Retrieved February 20, 2006, from http://gseweb.harvard.edu/~hepg/oday.html.
Odden, A., & Picus, L. (2000). School finance: A policy perspective (2nd ed.). New York: McGraw Hill.
Pfeffermann, D., Moura, F., & Nascimento Silva, P. (2004). Multi-level modelling under informative sampling. (S3RI Methodology Working Paper No. M04/09). Southampton, UK: Southampton Statistical Sciences Research Institute.
Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H., & Rasbash, J. (1998). Weighting for unequal selection probabilities in multilevel models. Journal of the Royal Statistical Society, 60(1), 23–40.
Phillips, M., Crouse, J., & Ralph, J. (1998). Does the Black-White test score gap widen after children enter school? In C. Jencks & M. Phillips (Eds.), The Black-White test score gap (pp. 229–272). Washington, DC: Brookings Institution Press.
Phillips, M., Brooks-Gunn, J., Duncan, G. J., Klebanov, P., & Crane, J. (1998). Family background, parenting practices, and the Black-White test score gap. In C. Jencks & M. Phillips (Eds.), The Black-White test score gap (pp. 103–146). Washington, DC: Brookings Institution.
Raudenbush, S., Bryk, A., Cheong, Y. F., & Congdon, R. (2001). HLM5: Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International, Inc.
Raudenbush, S. W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13(2), 85–116.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks, CA: Sage Publications.
Raudenbush, S. W., Fotiu, R. P., Cheong, Y. F., & Ziazi, Z. M. (1996). Inequality of access to educational opportunity: A national report card for eighth-grade math. Educational Evaluation and Policy Analysis, 20(4), 256.

Raudenbush, S. W., & Willms, J. D. (1995). The estimation of school effects. Journal of Educational and Behavioral Statistics, 20(4), 307–335.
Ravitch, D. (2000). A century of failed school reforms. New York, NY: Simon and Schuster.
Raymond, M. E., & Hanushek, E. A. (2003, Summer). High-stakes research. Education Next, 3(3), 48–55. Retrieved September 21, 2003, from http://www.educationnext.org/20033/.
Rothstein, R. (2004). Class and schools: Using social, economic and educational reform to close the black-white achievement gap. Washington, DC: Economic Policy Institute.
Smith, M. S., & O'Day, J. A. (1991). Systematic school reform. In S. H. Fuhrman & B. Malen (Eds.), The politics of curriculum and testing: The 1990 yearbook of the Politics of Education Association (pp. 233–267). Bristol, PA: Falmer.
Sum, A., Kirsch, I., & Taggart, R. (2002). The twin challenges of mediocrity and inequality: Literacy in the U. S. from an international perspective. Princeton, NJ: Educational Testing Service.
Swanson, C. B., & Stevenson, D. L. (2002, Spring). Standards-based reform in practice: Evidence on state policy and classroom instruction from the NAEP state assessments. Educational Evaluation and Policy Analysis, 24(1), 1–27.
Thissen, D. (2005). Linking assessments based on aggregate reporting: Background and issues. Presented at the ETS Conference on Linking and Aligning Scores and Scales: A conference in honor of Ledyard R Tucker’s approach to theory and practice, Princeton, NJ, June 24–25, 2005.
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley Publishing Company.
Tyack, D., & Cuban, L. (1995). Tinkering toward utopia: A century of public school reform. Cambridge, MA: Harvard University Press.
U. S. Department of Education. (1995, June). National Education Longitudinal Study of 1988 (NELS88)--A profile of American high school seniors in 1992. Washington, DC: National Center for Education Statistics.
U. S. Department of Education. Office of Educational Research and Improvement. National Center for Education Statistics (1999). NAEP 1999 Trends in academic progress: Three decades of student performance, NCES No. 2000–469, by J. R. Campbell, C. M. Hombo, and J. Mazzeo. Jessup, MD: Education Publications Center (ED Pubs).
U. S. Department of Education. (2000). Mathematics and science in the eighth grade: Findings from the Third International Mathematics and Science Study. (NCES 2000–014.) Washington, DC: National Center for Education Statistics.

U. S. Department of Education. (2001a). Educational achievement and Black-White inequality. (NCES No. 2001–061.) Washington, DC: National Center for Education Statistics.
U. S. Department of Education. (2001b). High standards for all students: A report from the national assessment of Title I on progress and challenges since the 1994 reauthorization. Washington, DC: Planning and Evaluation Service.
U. S. Department of Education. (2001c). Paving the way to postsecondary education: K–12 intervention programs for underrepresented youth. (Report of the National Postsecondary Education Cooperative Working Group on Access to Postsecondary Education No. NCES 2001–205r). Jessup, MD: National Center for Education Statistics.
U. S. Department of Education. (2002). Early Childhood Longitudinal Study--Kindergarten Class of 1998–99 (ECLS-K), Psychometric report for kindergarten through first grade. (NCES No. 2002–05). Washington, DC: National Center for Education Statistics.
U. S. Department of Education, National Center for Education Statistics. (2004). National Assessment of Educational Progress (NAEP), The nation's report card: Mathematics highlights 2003. Jessup, MD: ED Pubs or http://nces.ed.gov/nationsreportcard.
Webb, N. L., Kane, J., Kaufman, D., & Yang, J.-H. (2001, June). Study of the impact of statewide systematic initiatives program. Technical report to the National Science Foundation on the use of state NAEP data to assess the impact of the Statewide Systematic Initiatives. Madison, WI: University of Wisconsin, Wisconsin Center for Education Research.
Wilson, S. (2003). California dreaming: Reforming mathematics education. New Haven and London: Yale University Press.
Zucker, A. A., Shields, P. M., Adelman, N. E., Corcoran, T. B., & Goertz, M. E. (1998). Statewide systemic initiatives program. Washington, DC: National Science Foundation.

About the Authors

Henry I. Braun
Educational Testing Service

Aubrey Wang
School District of Philadelphia

Frank Jenkins
Westat, Inc.

Elliot Weinbaum
University of Pennsylvania

Email: hbraun@ets.org

Dr. Henry Braun, a Distinguished Presidential Appointee at ETS, has published in the areas of probability, stochastic modeling and empirical Bayes methods. He is a co-winner of the Palmer O. Johnson award from AERA (1986) and a co-winner of the NCME award for Outstanding Technical Contributions to the Field of Educational Measurement (1991). His current interests include the interplay of technology and assessment, design science, evaluation methodology, the analysis of large-scale assessment data and education policy.

Aubrey H. Wang is a program evaluator at the School District of Philadelphia and the current President of the Chinese American Educational Research and Development Association. Her research interests include educational policy and practice in closing the achievement gap.

Frank Jenkins is a senior statistician at Westat, Inc., where he directs analyses of various educational and health studies involving hierarchical nesting of subjects. At ETS he worked on NAEP evaluations with particular emphasis on psychometric issues and hierarchical linear models. He has also developed Bayesian models for analyzing multivariate performance assessments.

Elliot Weinbaum is a researcher at the Consortium for Policy Research in Education at the University of Pennsylvania. His research interests include intergovernmental relationships and policy-making, performance-based accountability, and the impacts of policy on school improvement. He is currently an investigator in a national study of high schools and the roles that outside organizations play in high schools' strategies to improve instruction.

Appendix A: Example of a State Profile

Kentucky Profile

NOTE: This profile comprises six sections. Each section begins with a number of questions. Some of the questions are answered by the data in the profile, while others appear on the questionnaire we are asking you to fill out.

Context

What were the general characteristics of KY’s public education system during the period 1988 through 1998? How was responsibility for governing the state education system (e.g., accountability, teacher quality, curriculum and standards, finance) distributed during this period? What were the most significant changes?

Education System

In 1996, Kentucky’s K-12 public schools served over 650,000 students who attended school in 176 districts (CPRE, 1996). As of 2003, Kentucky has a total of 176 school districts (1,271 schools) that serve over 610,000 students (KYDE, 2003). The total number of full-time equivalent teaching staff in 2000 was 34,173, with 23,083 elementary teachers, 10,177 secondary teachers, and 913 combined elementary and secondary teachers. The average pupil-teacher ratio in 2000 was 15.64 (KYDE, 2003).

Governance

As of 1996, Kentucky had undergone more changes in its public education system in the previous ten years than any other state in the nation (CPRE, 1996). Kentucky realized substantial achievement gains during the 1990s, after undertaking perhaps the most extensive systemic education reforms of any state in the 1990s. These included substantial equalization of school funding along with large increases in teacher salaries and overall spending; changes in school organization, including multi-age primary grade classrooms; investments in early childhood education; and the introduction of standards and curriculum frameworks, along with portfolios and performance assessments.
Changes in teacher education and licensing accompanied these reforms, including the adoption of the INTASC licensing standards (developed by a consortium of more than 30 states), the introduction of new licensing tests and teacher education requirements, incentives for colleges of education to meet national professional accreditation standards, and massive investments in professional development (Darling-Hammond, 2000).

In response to Kentucky’s low ranking in 1983 in education spending per pupil, teachers’ salaries, pupil-teacher ratio, high school graduation, adults with a high school diploma, and adults with a college degree (Prichard Committee, 1999; Rhoten, Carnoy, Chabran, & Elmore, 2003), the Kentucky Supreme Court in 1989 delivered a landmark decision, ruling that the state’s public school system was unconstitutional and further describing the conditions it deemed to be essential and minimal characteristics of an efficient system of common (public) schools. The General Assembly drafted the Kentucky Education Reform Act (KERA), which became law on July 13, 1990. The Act was amended in each subsequent session of the General Assembly (in 1992, 1994, and 1996) (CPRE, 1996).

The legislation addressed policy structure. The position of the elected Superintendent of Public Instruction was abolished in 1992 by an amendment to Kentucky’s constitution, and almost all of this position’s duties were transferred to the appointed Commissioner of Education (CPRE, 1996). KERA also abolished the existing Department of Education (referred to hereafter as the state education agency, or SEA) on June 30, 1991 and reorganized it to include new positions and a new service-oriented mission, effective July 1, 1991 (CPRE, 1996).

KERA’s components include: a) educational goals indicating what graduates should know and be able to do; b) an assessment process to determine if students are reaching these goals (through the development of the Kentucky Instructional Results Information System, KIRIS); c) an accountability system holding schools responsible for student success; d) increased funding for professional development activities for educators to help students succeed; e) a new system for credentialing teachers; f) early childhood programs; g) funding to help students who require more time to achieve academic success; h) a major commitment to technology; i) full-service schools inclusive of community and agency resources; j) changes in governance structure to alter the politics in Kentucky school districts; and k) a commitment to fund the new initiatives (CPRE, 1996).
Accountability System

After a series of conflicts over, and evaluations of, KIRIS, the process culminated in 1997 when the state fired the testing company because of scoring errors and investigated more than one hundred schools for cheating (Jacobson, 1999; Rhoten, Carnoy, Chabran, & Elmore, 2003). House Bill 53 was signed into law on April 14, 1998 and renamed the KIRIS system the Commonwealth Accountability Testing System (CATS). The law’s immediate political goal and primary technical function was to revise the existing testing instruments. At the level of assessment, CATS modified KIRIS by (1) introducing the Kentucky Core Content Test, which tests students on how well they are learning the basics of math, science, reading, writing, and other subjects; (2) requiring a national, norm-referenced portion, which matches the state’s core curriculum and provides national comparisons for state students; (3) calling for a pared down written portion of the test; and (4) expanding the number of grades in which these tests are administered. The state board selected CTB/McGraw Hill to run the norm-referenced component of the CATS testing system (Rhoten, Carnoy, Chabran, & Elmore, 2003).

In addition, the new sanctions and assistance programs prescribed by the law include: mandatory audits for struggling schools; eligibility to receive CATS school improvement money; education assistance from highly skilled, certified state staff members rather than, as under KIRIS, so-called distinguished educators, who were experienced, state-paid teachers or administrators; and the option for students at low-performing schools to transfer to successful ones (Rhoten, Carnoy, Chabran, & Elmore, 2003).
As Kentucky redesigned its system, a new education financing formula adjusted the state’s district allocation from a plan that dispensed comparable funding based on student attendance and teacher experience and certification to one that varied allocations based on the amount of revenues generated by local taxes. Under this new formula, districts are required to meet certain local revenue-raising benchmarks; however, those with small tax bases and/or limited property values are protected and theoretically equalized with additional state funding. As a result, according to Education Week, Kentucky has accomplished a level of relative equity in spending per student, with a variation rate of only 13 percent between districts, compared with the national average of 23.1 percent. From 1989-90 to 1998-99, Kentucky reduced the gap in per pupil expenditures between wealthy and poor districts by 36.9 percent while raising the state’s national rank in per pupil spending from 42 to 31 (Heine, 2002; Rhoten, Carnoy, Chabran, & Elmore, 2003).

Credentialing of Teachers

Under the Kentucky Education Reform Act (KERA), the credentialing of teachers was moved from the SEA to the Education Professional Standards Board, an autonomous body appointed by the governor. KERA also established the Office of Education Accountability (OEA) under the Legislative Research Commission (LRC). The mission of OEA is to monitor the public education system and the implementation of KERA (CPRE, 1996). The State Board of Education (SBE) includes 11 members appointed by the governor and confirmed by the General Assembly and is responsible for managing the public schools, adopting policies for SEA, and hiring and evaluating the Commissioner of Education (CPRE, 1996).

Finance

How has the proportion of state funding in education changed during this period? How has expenditure per student changed during this period at the state level? How has the gap between high- and low-poverty schools’ expenditures per student changed during this period? To what extent were there efforts to equalize school funding between low-poverty and high-poverty districts during this period [questionnaire]? To what extent were there efforts to link federal and state resources to support specific initiatives for poor/minority schools or students? Did the initiatives focus on mathematics and science learning [questionnaire]?

State Contribution to Education Funding

Table A-1
Trends in Proportions of State Contributions to Education Funding (%State)

                  1987-88  1988-89  1989-90  1990-91  1991-92  1992-93  1993-94  1994-95  1995-96  1996-97  1997-98
Nation              49.54    47.79    47.11    47.16    46.37     45.6     45.2     46.8     47.5     48.0     48.4
California          70.20    66.44    66.86    66.03    65.87     62.2     56.2     54.2     55.8     60.0     60.2
Kentucky            65.19    68.06    67.34    66.95    67.02     67.0     65.9     65.8     65.3     62.9     61.7
Maryland            38.74    38.12    37.29    36.90    38.21     39.4     38.9     37.0     38.2     38.8     39.0
Michigan            35.28    32.53    26.81    26.77    26.57     30.6     28.7     67.3     66.8     65.5     66.0
New York            43.39    43.18    40.75    42.56    40.31     39.2     38.2     40.7     39.7     39.4     39.7
North Carolina      66.74    66.08    66.02    65.29    63.61     63.3     64.0     65.1     64.5     65.4     67.3
South Carolina      54.44    50.04    50.00    49.88    48.34     47.0     46.2     46.3     47.9     52.5     51.5
Tennessee           44.50    46.05    45.77    45.24    42.19     45.6     46.8     47.5     42.9     48.5     47.7
Texas               44.19    43.25    41.92    43.94    43.37     40.0     40.2     40.2     42.9     40.3     44.2
Virginia            32.62    33.84    32.69    32.75    31.11     32.1     30.8     31.8     31.1     32.5     31.4

Source: National Center for Education Statistics Research Department.

Table A-1 presents data on the proportion of state contribution to total education funding from school year (SY) 1988 to SY 1998. Across the years, Kentucky’s average state contribution of 65.7% is higher than the national average of 47.2%. Over this ten-year period, the proportion of state support in Kentucky ranged from a low of 61.7% in SY 1998 to a high of 68.1% in SY 1989. In comparison to the other states, Kentucky consistently had the highest, or close to the highest, proportion of state funding across the 10 years.
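The averages and range quoted for Table A-1 can be recomputed directly from the table rows. The following is a minimal Python sketch, not part of the original study: the names STATE_SHARE and summarize are illustrative assumptions, and only the Nation and Kentucky rows are transcribed.

```python
# State share of education funding (%), 1987-88 through 1997-98,
# transcribed from the Nation and Kentucky rows of Table A-1.
STATE_SHARE = {
    "Nation":   [49.54, 47.79, 47.11, 47.16, 46.37, 45.6, 45.2, 46.8, 47.5, 48.0, 48.4],
    "Kentucky": [65.19, 68.06, 67.34, 66.95, 67.02, 67.0, 65.9, 65.8, 65.3, 62.9, 61.7],
}


def summarize(values):
    """Return (mean, minimum, maximum) of a row, each rounded to one decimal."""
    mean = sum(values) / len(values)
    return round(mean, 1), round(min(values), 1), round(max(values), 1)


if __name__ == "__main__":
    # Print one summary line per transcribed row.
    for name, row in STATE_SHARE.items():
        mean, low, high = summarize(row)
        print(f"{name}: mean {mean}%, low {low}%, high {high}%")
```

The same helper could be applied to any other row of the table, provided the values are transcribed in the same column order.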

Table A-2
Trends in Total Expenditures per Pupil (Total ($) †)

                  1992-93  1993-94  1994-95  1995-96  1996-97  1997-98
Nation              6,351    6,377    6,435    6,447    6,526    6,700
California          5,668    5,650    5,587    5,595    5,796    6,110
Kentucky            5,296    5,395    5,291    5,448    5,680    5,643
Maryland            7,445    7,414    7,482    7,472    7,443    7,615
Michigan            7,303    7,350    7,526    7,689    7,638    7,632
New York            9,545    9,662    9,675    9,475    9,393    9,583
North Carolina      5,437    5,437    5,475    5,348    5,431    5,691
South Carolina      5,165    5,191    5,240    5,416    5,564    5,759
Tennessee           4,514    4,566    4,676    4,728    5,048    5,344
Texas               5,246    5,374    5,563    5,685    5,803    5,893
Virginia            6,224    6,233    6,311    6,265    6,377    6,568

Source: National Center for Education Statistics. (1995, 1996, 1997, 1998, 1999, 2000). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1992-93, 1993-94, 1994-95, 1996-97, 1997-98. Washington, DC: U.S. Printing Office.
† At constant prices of 2000-01.

Table A-2 presents data on total expenditures per pupil for SY 1993 through SY 1998. Kentucky’s average expenditure per student from SY 1993 to SY 1998 is $5,459, lower than the national average of $6,473. Compared to the other nine states, Kentucky spent less per pupil than most during this period.

Table A-3
Variation in Instructional Expenditure per Pupil across School Districts: Gini Coefficient for instructional expenditures per pupil for unified districts, fiscal year 1988 to 1994

                    1988    1989    1990    1991    1992    1993    1994
California         0.075   0.083   0.088   0.087   0.085   0.074   0.047
Kentucky           0.093   0.098   0.084   0.070   0.075   0.075   0.078
Maryland           0.084   0.082   0.084   0.078   0.068   0.066   0.060
Michigan           0.098   0.099   0.104   0.105   0.094   0.091   0.085
New York           0.099   0.098   0.100   0.098   0.095   0.089   0.088
North Carolina     0.045   0.048   0.049   0.045   0.051   0.042   0.042
South Carolina     0.050   0.047   0.050   0.055   0.053   0.049   0.052
Tennessee          0.105   0.103   0.094   0.097   0.091   0.083   0.074
Texas              0.070   0.067   0.070   0.064   0.060   0.057   0.059
Virginia           0.104   0.105   0.091   0.089   0.089   0.079   0.081

Source: Hussar, W. & Sonnenberg, W. (2000). Trends in disparities in school district level instructional expenditures per pupil (NCES 2000-020) (p. 41).

Table A-3 presents data on heterogeneity in instructional expenditure per pupil across school districts. The tabled value of the Gini coefficient is a measure of how far the state was (in a given year) from uniform funding across school districts. A value of zero indicates perfect equality and larger values mean greater disparity. Kentucky’s average Gini coefficient is 0.082, higher than the 10-state average of 0.077, meaning that Kentucky has greater variability in instructional expenditures per pupil across its school districts than the 10-state average. Compared to the other nine states, Kentucky consistently fell near the middle in terms of the level of disparity. There is an indication that the disparities increased slightly after 1994.

Curriculum and Standards

To what extent was there a strong state curriculum in mathematics during this period [questionnaire]? How strong were the mathematics content standards? Was there a statewide textbook adoption in mathematics [questionnaire]? To what extent were the mathematics standards linked to textbooks, performance standards and to assessment [questionnaire]? How has this changed during this period [questionnaire]?

Curriculum Standards

Kentucky’s standards in the four core subjects are described in the Core Content for Assessment, which was first developed in 1996 (AFT, 1996). By 1998, Kentucky had developed a Program of Studies and Learning Descriptions (at the elementary level) to clarify the standards (AFT, 1998).

Starting in 1995, the American Federation of Teachers (AFT) reviewed all 50 states’ standards and rated their curriculum standards. The math standards were rated as generally clear and specific about what students should know and be able to do at the elementary, middle, and high school levels (AFT, 1996; 1998). The science standards were rated as the strongest of the four subjects, offering clear and specific standards about the content students should learn at the elementary, middle, and high school levels (AFT, 1996, 1998).
The English standards were rated as clear, specific and grounded in content at the elementary level; vague on reading and basic writing conventions at the middle level; and vague on writing conventions at the high school level (AFT, 1998). The social studies standards were rated as vague at the elementary and high school levels but clear at the middle school level (AFT, 1998).

Teacher Quality

Were middle grade certificates subject specific? When did this come into effect? Was professional development tied to re-certification? When did this come into effect? How have teacher compensation policies changed over time? What was the extent of out-of-field teaching? How has this changed over time?

Kentucky has a three-tiered certification system. Initial certification begins at Rank III, which requires a bachelor’s degree and the completion of an internship. Rank II teachers must hold a master’s degree or have completed a planned, fifth-year program. For fifth-year programs, professional development tied to an individual’s professional growth plan may be substituted for up to 12 college credits, if approved by the college. Rank I requires 30 hours beyond the master’s degree, acquired within eight years of attaining a BA. Recently, the state passed legislation allowing teachers to use National Board Certification as a substitute for graduate work to attain Rank I status (CPRE, 1996).

Initial Certificate

Effective January 1985, teaching candidates must pass the state teacher certification test and complete a one-year internship before receiving a certificate that is good for 4 years. Thereafter, renewals are based on acquisition of a master’s degree, other advanced training and experience. Renewals are good for 5-year periods (Coley & Goertz, 1998; 1990).

Throughout the period 1988 to 1998, KY offered 3 types of middle grades (5-9) teaching certificates: 1) Provisional Certificate for Teaching in the Middle Grades 5-8, valid for 5 years and requiring a bachelor’s degree; an approved program of preparation including student teaching and two middle grade teaching fields; and passing NTE or PRAXIS II scores (NASDTEC, 1988, 1991, 1996). This became the only certificate available in the middle grades 5-9 by 2000 (NASDTEC, 2000); 2) Standard Certificate for Teaching in the Middle Grades 5-8, which was discontinued on September 1, 1997. This certificate is valid for 5 years and requires eligibility for the Provisional Certificate and a master’s degree or non-degree fifth-year approved program of preparation, which includes a 12 semester-hour professional education component and a 12 semester-hour specialization component (NASDTEC, 1988, 1991, 1996); and 3) Endorsement for Teaching in the Middle Grades 5-8, which was discontinued by 2000.
This type of certificate allows elementary or high school certificates to be endorsed for teaching in the middle grades 5-8 upon completion of one middle grade teaching field and an approved program of professional preparation (NASDTEC, 1988, 1991, 1996, 2000).

According to NASDTEC (1988, 1991, 1996), there is no requirement beyond the elementary credential for holders of elementary credentials to teach a departmentalized class in a specific subject in grades 7-9. For holders of a secondary teaching credential, the requirement to teach a departmentalized class in a specific subject in grades 7-9 is a credential in the subject: each teacher candidate must present at least one acceptable teaching major of 30 semester hours of credit or one area of concentration with 48 semester hours of credit, and teaching minors are accepted only in addition to an area or major (NASDTEC, 1988, 1991, 1996).

Professional Development

Kentucky defined professional development (PD) as any course work, experience, training or renewal activity required by a state to keep a certificate in force (NASDTEC, 1996, 2000). Kentucky does not issue a permanent or life certificate. There are PD requirements to renew the second-stage certificate. The purpose of the PD requirement is continued employment (NASDTEC, 1996, 2000).

Kentucky requires that four days be set aside for teacher professional development and that each teacher receive at least 24 hours of training annually (Goertz, 1988; Coley & Goertz, 1990; CPRE, 1996). However, there is flexibility in how this requirement is met. Teachers may participate in school or district professional development during the school year, or they may satisfy the requirement on their own time, through summer coursework (CPRE, 1996). Legislation provides funding for one of the 4 days to be designated for a centralized or regionalized in-service program.
In addition, funding is provided for the annual Commonwealth Institute of Teachers, a week-long seminar with follow-up weekends for up to 150-200 outstanding teachers (Goertz, 1988; Coley & Goertz, 1990).

Teacher Compensation

Table A-4 presents data on average teacher salaries from SY 1990 through 1997. Kentucky’s average teacher salary, $37,729, was lower than the national average of $42,909. Compared to the other states, Kentucky’s average teacher salary was the lowest in SY 1990 ($36,231) but fell in the middle range in the later years. However, Kentucky’s average teacher salary is consistently lower than the national average across the years.

Table A-4
Trends in Average Teacher Salaries

                  1989-90†  1990-91†  1991-92†  1992-93†  1993-94†  1994-95†  1995-96†  1996-97†
Nation              43,180    42,991    43,344    43,125    42,886    42,774    42,661    42,309
California          51,881    51,147    51,214    49,044    48,661    47,340    47,782    47,369
Kentucky            36,231    38,068    39,122    38,225    37,888    37,568    37,488    37,243
Maryland            50,304    50,093    49,502    47,608    47,271    47,333    46,725    45,457
Michigan            50,229    49,424    51,563    51,912    54,149    54,217    53,752    52,631
New York            53,674    55,020    54,901    55,281    54,812    55,425    54,529    52,886
North Carolina      38,353    38,133    37,163    35,759    35,598    35,846    34,465    34,177
South Carolina      36,731    36,838    35,738    35,812    35,223    35,349    35,582    35,984
Tennessee           37,302    36,934    36,260    36,011    36,541    36,401    37,542    37,755
Texas               37,782    36,741    37,651    38,052    36,547    36,347    35,850    35,727
Virginia            42,644    42,745    40,849    40,413    40,083    39,471    39,311    39,793

Source: American Federation of Teachers Research Department, retrieved September 2003 from http://www.aft.org/research/salary/home.htm
† At constant 2000-01 dollars

Out-of-Field Teaching

Table A-5 presents data on the percent of 7th-12th grade classes in the four core academic fields (math, English, science, and social studies) taught by teachers who did not have a minor or major in the field taught. The NCLB Act requires states to report out-of-field teaching for schools in this manner. Across the nation, the proportion of out-of-field classes increased from 21.8% to 24.2%.
This pattern is also seen in Kentucky, where the proportion of out-of-field classes increased from 29.5% to 31.7%. Compared to the other nine states, Kentucky had the highest or close to the highest proportion of out-of-field classes in both years.
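The changes described here can be checked against the Table A-5 values with a few lines of Python. This is an illustrative sketch rather than part of the original analysis: the names OUT_OF_FIELD and change_in_points are assumptions, and only the United States and Kentucky rows are transcribed.

```python
# Percent of public grade 7-12 core-subject classes taught out of field,
# transcribed from the United States and Kentucky rows of Table A-5.
OUT_OF_FIELD = {
    "United States": {"1993-94": 21.78, "1999-2000": 24.21},
    "Kentucky": {"1993-94": 29.47, "1999-2000": 31.68},
}


def change_in_points(row):
    """Return the 1993-94 to 1999-2000 change in percentage points, to two decimals."""
    return round(row["1999-2000"] - row["1993-94"], 2)


if __name__ == "__main__":
    # Print the signed change for each transcribed row.
    for name, row in OUT_OF_FIELD.items():
        print(f"{name}: {change_in_points(row):+.2f} percentage points")
```

Expressing the change in percentage points, rather than as a relative percent change, keeps the comparison on the same scale as the table itself.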

Table A-5
Percentages of Public 7-12 Grade Classes in the 4 Core Academic Fields Taught by Teachers without a Major or Minor in the Field, by Year and State

                    1993-94  1999-2000
United States         21.78      24.21
California            29.43      26.68
Kentucky              29.47      31.68
Maryland              20.71      22.33
Michigan              19.66      20.44
New York              13.12      18.11
North Carolina        17.72      19.41
South Carolina        23.54      22.49
Tennessee             29.52      35.62
Texas                 20.95      29.67
Virginia              26.60      28.37

Source: Ingersoll, R. M. (2003). Out-of-field teaching and the limits of teacher policy: A research report (p. 19). Co-sponsored by Center for the Study of Teaching and Policy and The Consortium for Policy Research in Education.

Assessment and Accountability

How strong was the accountability system? What was the system of sanctions and rewards? How has this system changed over time [questionnaire]? When was the mathematics assessment system first implemented? How has it changed during this period? Was there a state-level mathematics assessment at the 8th or higher grades? What type was used (basic skills or higher order thinking)? Was it a requirement for graduation? When did school report cards become available? When did the state start to disaggregate achievement data by subgroups [questionnaire]?

Accountability System

The state was one of the earliest to gain approval for its accountability and assessment practices from the federal government (CPRE, 2000). The emphasis of the accountability system has moved from rewards for teachers to rewards for schools and from sanctions to assistance (CPRE, 2000). Carnoy & Loeb (2002) developed a zero-to-five index of the strength of accountability in 50 states based on the use of high-stakes testing to sanction and reward schools. Kentucky is one of a few states that received the second highest rating.

Student Accountability

In 1996, Kentucky had no rewards or consequences for students linked to its standards (AFT, 1996).
By 1998, although there were still no consequences for students who did not meet the standards, there were incentives for students to meet the standards. Students who meet or exceed credit requirements, which include Advanced Placement courses, and who maintain a “C” in all their classes earn the state’s advanced “Commonwealth Diploma” (AFT, 1998). By 2000, student performance on the state assessment was sent to parents (CPRE, 2000).

School Accountability

1998: Kentucky provides funding for extra academic assistance to students who are having difficulty meeting the standards. Students are selected for the intervention based on teacher recommendation (AFT, 1998).
2000: The school accountability system is amended and will be implemented in 2000. Until then, the state has developed an interim model for charting school progress (CPRE, 2000).

District Accountability

1996: Kentucky law requires districts to provide “extended school services” to students who are not performing well enough to meet the state standards, and special funds are provided by the state for this purpose (AFT, 1996).
2000: A formal accountability system for districts has not been developed. However, a district summary of all of the district schools’ report cards is printed in each area’s largest circulation newspaper.

State Assessment System (mathematics)

1988: Students have been tested in reading, writing, language arts, and mathematics in grades 3, 5, 7, and 10 since 1979. Local school districts must provide remedial help to those students falling below state-established performance standards. Under legislation passed in 1984, the State Department of Education established essential skills standards by grade and subject in mathematics, reading, reference skills, spelling and writing and began testing students in these areas in every grade in 1985-86. Remediation is required for 1st and 2nd graders not passing the essential skills tests (Goertz, 1988).
1990: Students are tested in reading, writing, language arts, and mathematics in grades K, 1, 2, 3, 5, 7, and 10. Local school districts must provide remedial help to those students falling below state-established performance standards (Coley & Goertz, 1990).
1996: Kentucky has a state assessment system tied to its standards and given to all students across the state. Students are assessed in the core subject areas in grades 4/5, 7/8, and 11/12. The exact grade varies by subject (AFT, 1996).
1998: Beginning in the 1998/99 school year, Kentucky will implement a new testing program. According to state officials, it will include reading and science tests in grades 4, 7, and high school; writing in grades 4, 7, and 12; and math and social studies tests in grades 5, 8, and high school (AFT, 1998).
2000: The state assessment system centers around the Commonwealth Accountability Testing System (CATS), the new testing system. CTBS-5 (a multiple-choice norm-referenced test) tests grades 3, 6, and 9 in Reading, Math, and Language Arts, and CATS (a multiple-choice and open-response criterion-referenced test) tests mathematics in grades 5 and 11 (CPRE, 2000).

Summary

Did the state have a consistent (in terms of duration and stability) commitment to education reform over the period 1988 through 1998 [questionnaire]? Did the state have a coherent approach (in terms of sequencing and alignment) to education reform during this period [questionnaire]? What were the main reform mechanisms (curriculum control, teacher empowerment, standards-based assessment) during this period [questionnaire]? What were the main positive effects of the reforms on low-poverty and high-poverty schools and students? What were the unintended effects [questionnaire]?

References

American Federation of Teachers. (1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997). AFT 50-state teacher salary survey. Retrieved as an Excel spreadsheet in November 2003 from http://www.aft.org/research/salary/stgrave/Index.htm
American Federation of Teachers. (1995). Making standards matter 1995: A fifty-state progress report on efforts to raise academic standards. Washington, DC: Author.
American Federation of Teachers. (1996). Making standards matter 1996: An annual fifty-state report on efforts to raise academic standards. Washington, DC: Author.
American Federation of Teachers. (1998). Making standards matter 1998: An annual fifty-state report on efforts to raise academic standards. Washington, DC: Author.
Carnoy, M. & Loeb, S. (2002). Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis, 24 (4), 305-331.
Coley, R. J. & Goertz, M. E. (1990). Educational standards in the 50 states: 1990. Princeton, NJ: Educational Testing Service, Policy Information Center.
Consortium for Policy Research in Education. (1996). Teacher professional development profile: Kentucky. Philadelphia, PA: University of Pennsylvania, Graduate School of Education, Consortium for Policy Research in Education.
Consortium for Policy Research in Education. (2000). Assessment and Accountability in the Fifty States: Survey 2000: Kentucky Assessment and Accountability Profile. Philadelphia, PA: University of Pennsylvania, Graduate School of Education, Consortium for Policy Research in Education.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8 (1). Retrieved June, 2003 from http://epaa.asu.edu/epaa/v8n1/.

Goertz, M. E. (1988). State educational standards in the 50 states: An update. Princeton, NJ: Educational Testing Service.
Hussar, W. & Sonnenberg, W. (2000). Trends in disparities in school district level expenditures per pupil (NCES 2000-020). Washington, DC: National Center for Education Statistics.
Ingersoll, R. M. (2003). Out-of-field teaching and the limits of teacher policy: A research report cosponsored by Center for the Study of Teaching and Policy and the Consortium for Policy Research in Education. Seattle, Washington: Center for the Study of Teaching and Policy.
Kentucky Department of Education. (2000). State of Kentucky profile 1999-2000. Lexington, KY: Author. Retrieved December 12, 2003, from http://www.kde.state.ky.us.
National Association of State Directors of Teacher Education and Certification. (1988). Manual on certification and preparation of educational personnel in the United States. Dubuque, Iowa: Kendall/Hunt Publishing Company.
National Association of State Directors of Teacher Education and Certification. (1991). Manual on certification and preparation of educational personnel in the United States. Dubuque, Iowa: Kendall/Hunt Publishing Company.
National Association of State Directors of Teacher Education and Certification. (1996). Manual on certification and preparation of educational personnel in the United States. Dubuque, Iowa: Kendall/Hunt Publishing Company.
National Association of State Directors of Teacher Education and Certification. (2000). Manual on certification and preparation of educational personnel in the United States. Dubuque, Iowa: Kendall/Hunt Publishing Company.
National Center for Education Statistics. (1993). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1992-93. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (1994). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1993-94. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (1995). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1994-95. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (1996). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1995-96. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (1997). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1996-97. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.

National Center for Education Statistics. (1998). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1997-98. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (1999). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1998-99. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (2000). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 1999-2000. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
National Center for Education Statistics. (2001). Statistics in brief: Revenues and expenditures for public elementary and secondary education: School year 2000-2001. Retrieved December 7, 2003, from http://nces.ed.gov/ccd/pub_rev_exp.asp.
Rhoten, D., Carnoy, M., Chabran, M., & Elmore, R. (2003). The conditions and characteristics of assessment and accountability: The case of four states. In Martin Carnoy, Richard Elmore, and Leslie Santee Siskin (Eds.), The new accountability: High schools and high-stakes testing (pp. 13-53). New York, NY: RoutledgeFalmer.

Appendix B: Questionnaire on State Education Policy

January 6, 2004

Dear colleague,

ETS is conducting a study on the relationship between state education policies and student performance on NAEP 8th grade mathematics from the late 1980s through the late 1990s, with a particular focus on changes in the Black-White achievement gap. States in the study are California, Kentucky, Maryland, Michigan, North Carolina, New York, South Carolina, Tennessee, Texas, and Virginia. You have been identified by our colleagues as someone with considerable knowledge of the state’s policies during this period. We are asking you to assist us in three ways (listed below) so that we may complete our description of these policies. We expect this will take about two to three hours of your time in all.

Based on reports from a variety of sources such as the Council of Chief State School Officers (CCSSO), National Center for Education Statistics (NCES), Education Week Quality Counts, Education Commission of the States, and Consortium for Policy Research in Education, we have compiled a profile of the state’s education policies over the period 1988 through 1998. We are concerned with the four major mechanisms by which states regulate, support and monitor the effectiveness of their education system: education finance and governance, curriculum and instructional policies, teacher quality, and accountability.

First, please read the profile and let us know if there are any errors or if we have omitted important information.

Second, please fill out the attached questionnaire, which consists of questions that we were unable to address from secondary sources. Your responses to these questions are critical to our study as we hope to strengthen our understanding of the quality, coherence and consistency of state policy actions.

Third, we will call within the next two weeks to schedule a 45-minute conversation with you to review the information from the questionnaire.

The data collected will be part of a policy report describing the relationship between states’ policy actions and their success in raising the achievement of African American students and in closing the achievement gap with White students. We believe this study is unique in its attempt to develop a comprehensive, longitudinal description of both state policy histories and patterns of student achievement disaggregated by race and school-poverty level. With the controversy surrounding the No Child Left Behind Act, it is all the more important that we examine our recent experience to glean some insights on what aspects of states’ policies may account for their differential success in reducing achievement gaps. The report should be useful to policy makers as well as others interested in strategies for education reform and will be widely disseminated. With your permission, your contribution will be duly noted.

The profile and the questionnaire are provided in MS Word files. If you cannot retrieve the files or prefer to receive the questionnaire in another format, please contact us as soon as possible. When you have completed the questionnaire, please send it as an MS Word attachment to hbraun@ets.org or awang@ets.org or fax it to (609) 734-5960.

If you have any questions about the study, the questionnaire, or the profile, please email or call Aubrey (609-734-5058, awang@ets.org) or Henry (609-734-5887, hbraun@ets.org). We will be happy to talk with you. We need your response to the questionnaire by January 27th. We will also call to set up a 45-minute conversation with you to review the information from the questionnaire.

This study is being supported by the U.S. Department of Education. Unfortunately, we do not have funds to compensate you for your time. We do thank you in advance for your participation in this important policy study and will provide you with copies of the final report.

Sincerely yours,

Aubrey Wang, Associate Research Scientist
Henry Braun, Distinguished Presidential Appointee

Questionnaire on State Education Policy

Directions: All questions should be considered in the context of state actions from the late 1980s through the late 1990s. We know that state policies evolve over time, and we are interested in understanding the development of those policies from about 1988 to 1998. If you have questions, feel free to call or email Aubrey Wang at ETS (awang@ets.org, 609-734-5058) or Henry Braun (hbraun@ets.org, 609-734-5887). Thank you in advance for your help.

I. The first series of questions focuses on the development of and changes in education finance policies in your state from 1988 through 1998.
I-1. How would you characterize the degree of equalization in school funding (between low-poverty and high-poverty districts) in the late 1980s?
I-2. How would you characterize the state’s efforts to equalize funding through the 1990s?
I-3. Were there efforts during this period to link or combine state and federal funding (e.g., Title I; NSF State, Urban or Rural Systemic Initiatives; or Headstart) to support specific initiatives for poor/minority schools or students? If so, please describe.
I-4. Did these initiatives focus on mathematics and science learning? If so, please describe.

II. This series of questions focuses on the development of and changes in the mathematics curriculum and standards in your state from 1988 through 1998.
II-1. To what extent was there a state curriculum in 7th and 8th grade mathematics during this period?
II-2. How would you characterize the curriculum in terms of breadth, depth and rigor?
II-3. Was there a statewide textbook adoption policy in mathematics? If so, please describe (e.g., when did this occur, what was the policy).
II-4. Were the mathematics textbooks aligned with state curriculum? If so, please describe.
II-5. Were there mathematics content standards? If so, were the mathematics content standards linked to a matching set of performance standards during this period? If so, please describe.

III. These questions concern the changes and development in the accountability system in your state from 1988 through 1998.
III-1. When did school report cards become available?
III-2. When did the state start to disaggregate achievement data by subgroups?

III-3. Did non-Title I schools have school accountability during the period 1988 through 1998? When were these systems in place?

IV. Teacher Quality. (Please refer to the Teacher Quality section in the state profile for a summary of our understanding of this area of education reform.)

V. These questions concern the development of educational reforms in your state from 1988 through 1998.
V-1. Who were the main drivers of education reform? What governmental and nongovernmental entities drove education reform in this period? What kinds of mechanisms were in place to continue policy direction in the face of changing political environments?
V-2. What were the main reform mechanisms in your state (e.g., curriculum control, teacher empowerment, standards-based assessment) during this period? What is the evidence?
V-3. Based on your understanding of what occurred in the state during the period 1988 through 1998, do you think the state had a consistent (in terms of duration and stability) commitment to education reform through each of these mechanisms?

Policy Area                 Was there consistent commitment    Please describe what made this possible?
                            during this period?                If these efforts were derailed, please
                                                               describe the relevant factors or forces?
Education Finance           Yes [ ] Somewhat [ ] No [ ]
Curriculum and Standards    Yes [ ] Somewhat [ ] No [ ]
Accountability              Yes [ ] Somewhat [ ] No [ ]
Teacher Quality             Yes [ ] Somewhat [ ] No [ ]

V-4. Based on your understanding of what occurred in the state during the period 1988 through 1998, do you think the state had a coherent approach (in terms of sequencing and alignment of the reform) to education reform?

Policy Area | Was there coherence to the education reforms during this period? | Please describe what made this possible? If these efforts were derailed, please describe the relevant factors or forces?
Education Finance | Yes [ ] Somewhat [ ] No [ ] |
Curriculum and Standards | Yes [ ] Somewhat [ ] No [ ] |
Accountability | Yes [ ] Somewhat [ ] No [ ] |
Teacher Quality | Yes [ ] Somewhat [ ] No [ ] |

V-5. Overall, what were the (generally agreed upon) positive effects of the reforms? What were the unintended effects?

VI. Finally, have we missed anything? Are there other matters that you would like to bring to our attention? Are there references you could recommend? Who else should we speak to? Please provide any comments or information on any other topics related to state policy actions in your state during this period that you feel are important.

Thank you very much for your assistance.

Appendix C: Follow-Up Interview Protocol

Follow-up Questions for State Experts (Example for Kentucky)

I-1. When did we see an influx of state funds into education (our table doesn’t show it)?
I-2. Were these state initiatives focused on disadvantaged minorities in particular, for instance, programs focusing at-schools or mathematics and science?
I-3. NSF SSI-RSI, any information?
II-1. Was the link between content standards and performance standards clear enough to inform teachers, and was there TPD support for improving pedagogy? Standards for all? Teaching?
III-1. School report cards?
IV. Any special initiatives on TPD that are related to reform?
V. Books on KY: Accountability consistency, assessment (major changes), coherence and timing, especially accountability leading other reform components
VI. KY specific issues on demographics of poverty

Appendix D: Example of a State Summary

Kentucky State Summary

I. Governance and Politics in the Context of the Reform

A. State governance: Degree of central vs. local control
Essential Points
State in charge of assessment, accountability
Localities in charge of curriculum, professional development
Summary
Governance of education reform in Kentucky was divided between the state department of education and the schools. The state department of education created and managed the assessment and accountability systems, while the schools and school-based decision making councils decided on the curriculum and resource distribution.

B. Main drivers of education reform
Essential Points
State Supreme Court case and decision, brought by grassroots non-governmental organizations together with business organizations and a coalition of districts
Supported by state Board, Commissioner of Education, Governor, and General Assembly
Succession of leaders of the Kentucky Teachers Association supported the reform when they were promised that the reform would not touch the evaluation of teachers.
During the decade of the 1990s Kentucky changed governors, changed Commissioners of Education, and changed crucial Department of Education personnel. Despite the changes, there was a sense of continuity about the reform. New governors and commissioners embraced the reform as ardently as did their predecessors. Several State Board of Education members held their positions and continued to support the reform during the period being studied.
Summary
The main drivers of Kentucky’s educational reform from the late 1980s through the 1990s were a local coalition of grassroots non-governmental organizations, such as the Prichard Committee and Forward in the Fifth, together with business organizations and a coalition of districts, who brought the lawsuit that resulted in the landmark state Supreme Court ruling that the state’s public school system was unconstitutional. As a result of this court ruling, the General Assembly drafted the blueprint of Kentucky education reform, the Kentucky Education Reform Act (KERA) of 1990, which was amended and sustained through a combination of the election of the state board and the commissioner and continuous support by the General Assembly. Though the central thrust of KERA was targeted at fiscal equity, it set the stage for a host of wide-ranging reforms. It was the first state reform to follow the Governors’ Conference in 1989 and enact its recommendations.

C. Main reform mechanisms
Essential Points
Finance reform
Standards-based assessments
School-based accountability
Increasingly explicitly/rigorously defined standards
Summary
The main reform mechanisms in Kentucky were extensive finance reform, the promulgation of statewide standards, standards-based assessment, and school-based accountability. Passage of the Kentucky Education Reform Act (KERA) required a number of dramatic changes in state oversight of public education. Primary among these was the restructuring of school finances to create a more equitable system. This included the addition of a significant amount of money.
KERA also increased the state role dramatically in standards setting, assessment and accountability. Though Kentucky will not adopt statewide curricula due to state law, it has created a set of standards that between 1990 and 1998 became increasingly well defined in response to public demand. Because these standards were assessed through a number of performance measures, teachers and administrators demanded clearer alignment, thereby increasing the state role. Additionally, rewards and sanctions, delivered by the state, were attached to performance for the first time with the passage of KERA.
Furthermore, the legislature established an Office of Educational Accountability (OEA) that reported to the Legislature about educational reform. The OEA was concerned about the quality of the assessment and accountability system and did extensive work, such as investigating allegations of cheating and conducting or sponsoring evaluations of the state department of education and its contractors.

II. Finance

A. Continuous commitment in education finance
Essential Points
Prodded into action by a court decision in 1989, the Legislature has maintained a commitment to providing equitable and adequate school funding.
There has not been an overwhelming growth in per pupil funding and the state remains well below the national average.

Summary
The financial inequity among schools within the state, more than the overall funding level, was the impetus for education finance reform in Kentucky. In 1989, the state supreme court decided that Kentucky’s system of public education did not fulfill its constitutional obligation. As a result, the state went through a major overhaul of the system, of which finance was a large part. Traditionally, districts in the urban centers have spent considerably more money than those in the rural areas and the eastern part of the state.
According to all reports, efforts at funding equalization have received consistent attention from the state legislature as well as from local communities over the time period being considered here. While Kentucky has tried to equalize funding through a shared state and local commitment, the total expenditures per pupil have not risen dramatically. This is due in part to significant expenditures at the state level in redesigning the system (additional state spending is not represented in per pupil averages, leading to the question about finance discrepancies that I raised earlier) and in part to a desire to redistribute spending rather than raise the total amount.

B. Level and trajectory of proportion of state funding in education
Essential Points
Per pupil spending remains well below the national average.
No clear trajectory in per pupil spending in the state from 1992 to 1998 (possible that the big jump took place in 1990 with passage of KERA).
Summary
The percentage of total educational expenditures paid for by the state reached a high of 68% in 1988 and has been dropping since then (with one small exception). The size of the drops began to increase in 1995, ending with just 61.7% of expenditures paid for by the state in 1997.
The average expenditure per student does not follow the fluctuation in state contribution, indicating fluctuation in the levels of local and federal contributions.
Overall, Kentucky’s per student spending is considerably below the national average for the period from 1992 to 1998. Its expenditures average $5,459 and range between $5,291 and $5,680, without any clear trajectory in either direction. In contrast, the national average of per pupil expenditures for this period was $6,473 and demonstrated a clear and continuous annual increase during the six years.

C. Level and trajectory of gap between high- and low-poverty school expenditures per student
Essential Points
Passage of KERA reduced inequities in funding in 1990.
Inequities have been increasing since 1992 and remain above the average for our ten states.
Summary
Kentucky’s average Gini coefficient is 0.082, higher than the 10-state average of 0.077, meaning that Kentucky has greater inequity in instructional expenditures per pupil across its school districts than the 10-state average. Compared to the other nine states, Kentucky consistently fell near the middle in terms of the level of disparity. After improving on this measure of equity in 1990 and again in 1991, inequity appears to have been creeping back into the system since 1992.

D. Characterization of equalizing efforts in school funding
Essential Points
After initial improvement in equalizing school funding, the state has lost ground.
Many respondents reported that the state has been committed and successful in this area, though our data does not necessarily support this contention.
Summary
As stated, Kentucky’s efforts at equalizing school funding were required by a 1989 court decision. In 1990, the state legislature passed the Support Education Excellence in Kentucky funding program. This program sought to use a formula-driven funding program to meet district needs on an equitable basis. In order to fund the equalization, the state increased the sales tax from 5 cents to 6 cents. At least initially, these efforts resulted in greater equity among schools. However, as stated above, the inequity among schools has been on the rise after that initial correction.

III. Curriculum and Standards

A. Continuous commitment in curriculum and standards
Essential Points
State was initially very vague about curriculum and standards and only achieved widespread core content standards at the end of the time period we are studying.
Reluctance to mandate any state curriculum or materials has led to widely varying implementation of curriculum.
Summary
Curriculum only became a state focus with the passage of the 1990 legislation. However, state law prohibits the establishment of a statewide curriculum. As a result, the state embarked upon establishing a set of state standards in math.
Since 1990, the state has demonstrated a strong commitment to a standards-based approach and has shown a commitment to developing increasingly detailed state content standards upon which assessment is based. This resulted in core content standards in 1998. The staff at the Kentucky Department of Education attempts to make standards clear without prescribing curriculum.
The establishment of standards is somewhat further complicated by the fact that textbook selection is decided at the school level. In combination with evolving standards, it is unclear to what extent the standards and assessments moved school staff toward a common curriculum in the state. For example, discussions continued during much of the time period in question about the extent to which algebra should be taught in grades 7, 8, or 9. As one respondent indicated, while standards documents now exist in some detail, schools in his district continue to pursue very different curricula based on textbook selection, tracking practices, and instructional decisions and resources. This evidence would indicate that while the state has demonstrated an evolving commitment to standards, the impact that this has had at the local level may be quite variable.

B. Strong state curriculum in mathematics (with statewide textbook adoption, alignment of textbook with curriculum, and content standards linked to performance standards)
Essential Points
The state has a list of approved texts but schools make their own choices.
Curriculum alignment was one of the state’s criteria for selecting texts but the materials represent a wide range of approaches and even content.
The state did create content standards but did not provide clear enough guidance for teachers, nor a clear sequence of progression.
Summary
As stated above, the state standards went through a process of evolution. Their strength is a subject of some debate. For example, the 56 “Valued Outcomes” that the state identified were elaborated through two or three versions, but the lists always consisted of “examples,” not a definitive “scope,” and there was never any sequencing laid out by the state. That was in keeping with the idea that Kentucky was a “local control” state with regard to curriculum.
Kentucky did provide a list of approved textbooks (from which districts could generate their own “approved list”), and schools were able to choose and were provided state funds for textbook adoption. However, the list of approved texts represented a broad range of instructional and curricular approaches. There were as many as five different choices in each content area by grade. While one of the criteria for state approval was alignment with standards, the range of approved texts raises some questions about degree of alignment.
All respondents agree that the state and text publishers would argue that the texts were aligned with standards, but the standards were quite vague at the start and have become only progressively more clear, while new approved lists were not always created concomitant with the creation of new standards.
The content standards that the state developed were general descriptions of content and performance. In math, such content standards were not developed until the mid-1990s and were not strongly linked to performance standards until 2001. They included examples of tasks/problems students should be able to do at each grade level.
Beginning in 1998, the state made clear the essential content for all students to know that would be included in the state assessment. This was to be used in conjunction with previously released Academic Expectations and Programs of Studies. Prior to this, there had been some sharing of assessment items rather than curriculum.

C. Breadth, depth and rigor of mathematics curriculum
Essential Points
Math curriculum was not made explicit until the mid to late 1990s.
AFT describes math standards as clear and specific, though this is not an evaluation of rigor.
Summary
Kentucky’s standards in the four core subjects are described in the Core Content for Assessment, which was first developed in 1996 (AFT, 1996). By 1998, Kentucky developed Program of Studies and Learning Descriptions (at the elementary level) to clarify the standards (AFT, 1998).
Starting in 1995, the American Federation of Teachers (AFT) reviewed all 50 states’ standards and rated their curriculum standards. The math standards were rated as generally clear and specific about what students should know and be able to do at the elementary, middle, and high school levels (AFT, 1996; 1998). Kifer would disagree, feeling that the 8th grade math standards are nearly incoherent and do not help teachers to cover essential understandings.

IV. Teacher Quality

A. Continuous commitment in teacher quality initiatives
Essential Points
State did increase funding to be used for professional development; use was a local decision.
State began to provide content-based assistance to teachers in 1992.
Summary
The state did establish the Professional Standards Board in order to strengthen teacher quality. However, the Board did not have a great deal of authority and had little ability to deal with recruitment or retention of high quality teachers; this remained a local responsibility. Districts with strong teacher associations were able to supersede state initiatives that would have impacted the teacher requirements. Additionally, the state’s school accountability system did not include teacher quality as a variable in determining school ratings, and thus teacher quality did not receive a great deal of direct attention.
However, starting in 1990, the state did provide money for teacher professional development. There was local flexibility on how the time and money should be used, but it was meant to build teachers’ content base. Starting in approximately 1992, the state began to provide direct content-based assistance to teachers. Additionally, distinguished educators in the schools (so designated by the state) had the power to terminate teachers who were documented as chronically performing below expectations. The 1990 legislation also raised teacher salary, in the hopes that competition would increase in under-served areas.

B. Middle grades content specific teacher certification, especially in mathematics
Essential Points
Kentucky used to have less rigorous standards for middle grade certification in particular subjects.
By 2000, the state began to require newly certified and re-certified teachers to have passed a national teaching exam and have two field specialties. Within five years, all middle grade teachers should be required to have this higher level certification.
Summary
Effective January 1985, all teachers in Kentucky must pass the state teacher certification test and complete a one-year internship before receiving a certificate that is good for 4 years. Thereafter, renewals are based on acquisition of a Master’s degree, other advanced training and experience. Renewals are good for 5-year periods (Coley & Goertz, 1998; 1990).
Throughout the period 1988 to 1998, KY offered 3 types of middle grades (5-9) teaching certificates. Only the first of these still remains as an option for middle grade teachers:
1) Provisional Certificate for Teaching in the Middle Grades 5-8, valid for 5 years and requiring a bachelor’s degree; an approved program of preparation including student teaching and two middle grade teaching fields; and passing NTE or PRAXIS II scores (NASDTEC, 1988, 1991, 1996). This became the only certificate available in the middle grades 5-9 by 2000 (NASDTEC, 2000).
2) Standard Certificate for Teaching in the Middle Grades 5-8, which was discontinued on September 1, 1997. This certificate is valid for 5 years and requires eligibility for the Provisional Certificate and a master’s degree or non-degree fifth-year approved program of preparation that includes a 12 semester-hour professional education component and a 12 semester-hour specialization component (NASDTEC, 1988, 1991, 1996).
3) Endorsement for Teaching in the Middle Grades 5-8, which was discontinued by 2000. This type of certificate allows elementary or high school certificates to be endorsed for teaching in the middle grades 5-8 upon completion of one middle grade teaching field and an approved program of professional preparation (NASDTEC, 1988, 1991, 1996, 2000).
According to NASDTEC (1988, 1991, 1996), there is no requirement beyond the elementary credential for holders of elementary credentials to teach a departmentalized class in a specific subject in grades 7-9. For holders of a secondary teaching credential, the requirement to teach a departmentalized class in a specific subject in grades 7-9 is a credential in the subject (each teacher candidate must present at least one acceptable teaching major of 30 semester hours of credit or one area of concentration with 48 semester hours of credit; teaching minors are accepted only in addition to an area or major) (NASDTEC, 1988, 1991, 1996).

C. Professional development tied to re-certification
Essential Points
A minimum of 24 hours per year of professional development is required for all teachers.
Increases in teacher rank require attainment of a master’s degree (Rank 2) or a master’s degree plus 30 credit hours (Rank 1).
Summary
Kentucky defined professional development as any course work, experience, training or renewal activity required by a state to keep a certificate in force (NASDTEC, 1996, 2000). Kentucky does not issue a permanent or life certificate. There are professional requirements to renew teaching certificates and to advance in teaching class (NASDTEC, 1996, 2000).
Kentucky requires that four days be set aside for teacher professional development and that each teacher receive at least 24 hours of training annually (Goertz, 1988; Coley & Goertz, 1990; CPRE, 1996). However, there is flexibility in how this requirement is met. Teachers may participate in school or district professional development during the school year, or they may satisfy the requirement on their own time, through summer coursework (CPRE, 1996). Legislation provides funding for one of the 4 days to be designated for a centralized or regionalized in-service program. In addition, funding is provided for the annual Commonwealth Institute of Teachers, a week-long seminar with follow-up weekends for up to 150-200 outstanding teachers (Goertz, 1988; Coley & Goertz, 1990).

D. Level and trajectory of teacher compensation
Essential Points
In 1989, Kentucky’s average salary was the lowest of the ten states being studied.
With the passage of KERA, average salaries increased for two years before beginning a steady decline.
By 1997, Kentucky’s average salary still remained below the national average but had moved closer to the middle of the ten states being studied here.
Summary
Kentucky’s average teacher salary between 1989 and 1997 was lower than the national average during that same time period ($37,729 vs. $42,909). Compared to the other states, Kentucky’s average teacher salary was the lowest in SY 1989 ($36,231). Clearly, the salary received a significant increase with the passage of new legislation in 1990, jumping almost $2,000 on average, and another $1,000 the following year. This trend toward higher salaries lasted only two years, and since 1992 average salary has been declining in Kentucky. However, the state falls in the middle range of the ten states in the later years, as some states saw near continuous declines in average salary.

E. Extent of out-of-field teaching
Essential Points
Like most other states as well as the nation as a whole, the proportion of classes being taught by teachers not majoring or minoring in the field grew in Kentucky between 1993 and 1999.
Kentucky consistently has the second highest proportion of classes with out-of-field teachers in our set of ten states.
Summary
Across the nation, the proportion of middle and high school classes in the four core academic fields (math, English, science, and social studies) being taught by teachers who did not have a minor or major in the field increased from 21.8% to 24.2% between 1993 and 1999. This pattern is also seen in Kentucky, where the proportion of classes being taught by out-of-field teachers increased from 29.5% to 31.7%. Compared to the other states we are studying, Kentucky had the second highest proportion of classes (ranking just above Tennessee) being taught by out-of-field teachers in both years.

V. Assessment and Accountability

A. Continuous commitment in assessment
Essential Point
Assessment has been continuous though grades, subjects, and modes of assessment have varied over the years.
Summary
Though the state has demonstrated a clear commitment to assessing students since the passage of KERA in 1990, there has been evolution in the mode of assessment. From 1988 through 1991, the state used a norm-referenced test. From 1992 to 1998, the state used a performance-based and open-response portfolio assessment system. Since 1998, the state has used a combination of the performance-based assessment and a norm-referenced test.

B. Type of assessment, grade level assessed, requirement for graduation
Essential Point
From 1990 to 1998, students were assessed in grades 4/5, 7/8, and 11/12 (exact grade depended on which of the four core content areas was being measured) through a rich and time-consuming performance-based assessment system.

Summary
Since 1979, students have been tested in reading, writing, language arts, and mathematics in grades 3, 5, 7, and 10 (K, 1, and 2 were added in 1990). Local school districts must provide remedial help to those students falling below state-established performance standards (Coley & Goertz, 1990). Between 1990 and 1998, Kentucky had a state assessment system tied to its standards and given to all students across the state. Students were assessed using performance measures, including in-class activities and portfolio assessments, in the core subject areas in grades 4/5, 7/8, and 11/12. The exact grade varied by subject (AFT, 1996).
Beginning in the 1998 school year, Kentucky implemented a new testing program. It included both multiple choice and open-ended response. The state uses the CTBS-5 (a multiple choice norm-referenced test) in grades 3, 6, and 9 in reading, math, and language arts. In grades 5 and 11, the state uses CATS (a multiple choice and open response criterion-referenced test) to assess mathematics. (I do not have information on what may have been required for graduation during the time period of our study.)

C. Consistent commitment in accountability
Essential Point
Consistent commitment to holding “someone” responsible for student performance. The “someone” has shifted between the school/teachers and the district, not directly impacting the students.
Summary
Though the state has clearly shown a consistent commitment to a system for educational performance accountability, the focus of the state accountability system did shift during the time period under consideration. From 1988 to 1992, the system focused on accountability for whole school districts. From 1993 to 1998, accountability was much more centered at the school level. The system has continued to develop since 1993, adding both indicators and incentives. For example, rewards and sanctions were applied to teachers beginning in 1993 or 1994. Between 1988 and 1998, there was no state mandated individual student accountability in Kentucky.

D. Strong accountability system (with an effective system of sanctions and rewards) for both Title I and non-Title I schools?
Essential Points
The strength of the accountability system has been increasing since KERA was passed.
It moved from a district focus (which allows individual schools to “hide”) to a school and teacher focus during the time period we are studying.
Summary
All public schools in the state have been included in the accountability system since 1992, when accountability was established at the individual school (rather than district) level. The strength of the system has been increasing as elements (such as rewards and sanctions for teachers) are added. Currently, the system receives a 4 out of 5 on a scale of accountability strength developed by Carnoy and Loeb (2004).

VI. Overall Quality of State Reform or “State Story”

A. Alignment in policy
Continuous improvement in alignment of standards and assessment.
Kentucky’s system of public education underwent an enormous upheaval with the Kentucky Education Reform Act in 1990. This was the first state to engage in comprehensive reform along the lines of what was recommended in 1989 at the Governors’ Conference. The system that the legislation created attempted to improve finance, governance, assessment, accountability, and teacher training. However, without a clearly defined set of statewide learning objectives, something that did not exist in 1990, it was difficult to say just what this alignment was around. The state increasingly paid attention to curriculum standards as the difficulties of alignment became clear. Nonetheless, the state did create a statewide performance-based assessment system and was able to demonstrate progress on a number of measures.
Kentucky started with the assessments and had the standards follow, rather than vice versa. The order was not ideal, and they had to do more revision than they would have had they had the patience to allow assessment and accountability to lag behind the other reforms.

B. Consistency in policy
Consistent in terms of goals, fluctuation in terms of methods.
While Kentucky has maintained a consistent statewide focus on improving educational outcomes, and has committed enormous resources to this task, the areas of state policy emphasis have shifted over the years. For example, the state began to provide increasingly explicit curricula and standards due in large part to the demands of teachers around the state. Similarly, the state scaled back the in-depth performance assessments that had been created initially because of popular discontent with the large amount of time that was being used for the purpose of formal statewide assessments.
Perhaps the most consistent policy change has been the modification in state education finance. Though we have some discrepant information on the amounts of state contribution and its increase, all agree that the ways in which schools were financed changed dramatically in 1990 and have been relatively stable since then.

C. Quality of policy implementation
In the areas of standards and curriculum, local authority has resulted in varying curricula and content pacing.
While state policy goals related to finance and governance can be easily accomplished by legislative fiat, implementation of curriculum and standards is more difficult. This was made more difficult in Kentucky because of the way in which this material evolved. Core content was not made clear until 1996, and even this had to be made more explicit for teachers with a Program of Studies document that was published in 1998. In addition, the multiple iterations of the assessment system, from norm-referenced traditional assessment in 1988 to performance-based, criterion-referenced assessment in 1993, served as a significant change in the implementation of the assessment system.
Despite these changes, it should be noted that the changes to the state’s education system were most dramatic in 1990. Though there has certainly been modification since then, all respondents report a seismic shift in the way the state was running the public education system.

D. Perceived positive effects
Change in attitude and focus
More shared language around educational goals
Increased supplementary and support programs
The fundamental shift in education policy that occurred following the court decision had a number of positive consequences for the state. Primary among them was a change in attitude among educators and the general public. Raising student achievement became a primary goal. The system was intentionally designed to give schools the most credit for raising the lowest performers to a higher level. This put a particular emphasis on improving educational outcomes for struggling students. This was the first time that the state began to talk about including all students in the assessment process. In so doing, it raised the status of previously under-served populations, including those with disabilities or particularly low test scores. When statewide data demonstrating that many poor schools were making significant improvements in performance convinced some that poor students could learn, it undermined a previous feeling that student demographics and school context were insurmountable barriers.
With these reforms came nationwide positive attention. Actors in the state and local systems felt that the state had taken a very positive step toward improving education and thus were more convinced of its potential for success. Additionally, the reduction in variation of school funding helped many to be more comfortable with the direction of change.
The changes reduced the place of education as a wedge issue dividing policymakers. There was widespread support for many of the reforms, reinforced by the national attention they garnered. The reforms provided teachers and administrators across the state with a common language and set of goals (though they were initially vague) with which to discuss their own and their students’ work. This occurred in terms of classroom instruction but also in terms of larger issues related to testing, assessment, and the particular needs of individual students. For example, Kentucky’s district assessment coordinators, a position created by the 1990 legislation, created their own professional organization, which has been a powerful group for professional development and advocacy. The assessment system pushed many in the state to focus on higher order thinking skills and applications as opposed to the more basic-skills approach that had existed prior to the 1990 legislation.

Kentucky also created a system of parallel supports in its 1990 legislation. More social supports for students and families, early childhood education, and increased assistance to struggling students were all additional components of the reform that resulted in a better articulated system of services for both students and families in the state.

E. Perceived unintended/negative effects

Significant fluctuation in policy, particularly in the early years.
Shifts in emphasis due to accountability pressures.

Because of its dramatic changes and its early steps in the accountability movement, Kentucky received a great deal of national and research attention. As mentioned above, when this attention was positive, it served a great purpose in the state. However, because the state was in such a period of policy transition, the early research had dramatic effects on the direction of policy, sometimes in unintended ways. External evaluations of the system also served to undercut its credibility in some ways.

Of course, Kentucky has not been insulated from the challenges posed by all accountability systems. For example, when the math portfolio was criticized by math teachers as being too heavily focused on writing, it was eliminated after being in place for three years. Other portfolios, intended to be used to improve instruction, have become classes unto themselves (students call it “portfolio prison”), demonstrating that even more performance-based forms of assessment are subject to the dangers of overly prescriptive test preparation.

The accountability system also made teachers nervous at first. When rewards and sanctions began to be targeted at the school level, teachers became quite nervous about their job security. At those schools that merited rewards under the system, the local decision of how to distribute the award in some cases undercut the types of cooperation and collaboration that the system was hoping to encourage.

F. Lessons learned

Kentucky continues to refine its systems of standards, assessment, and accountability. The state is still not strong in terms of student accountability. In spite of this, many believe that the emphasis on performance and accountability has distorted instruction in some ways, shifting attention to particular areas of assessment rather than a more balanced and consistent approach.

G. Special characterization of the state

Kentucky attempted to address a huge range of issues simultaneously. Initiatives focused on all core subjects and all grade levels and included finance, governance, curriculum, assessment, and accountability. The state even attempted to institute a range of support programs for children and families at the same time. Being both earlier and more far-reaching than many states posed its own set of challenges and information gaps with which the state had to contend while it was undergoing a massive systemic adjustment.

Demographic information: poor minority students are concentrated in the urban areas, while poor White students are in the rural areas, so there is limited mixing of poor students due to large-scale geographic shifts.


Appendix E: HLM Variables

Student Variables

Socioeconomic Status (SES)
SES is the average of the non-missing values of the following two variables.
1. Number of reading-related items in the home (i.e., newspaper, encyclopedia, books, magazines)
2. Educational level of the parent with the greatest education.
If both variables are missing, SES is defined as missing. If exactly one variable is non-missing, SES consists of the non-missing variable. The SES scale has been standardized within state to have mean 0 and unit standard deviation.

Academic focus for the student (AcadFoc)
AcadFoc is the sum of the non-missing values of a set of variables, displayed in Table E-1, that are hypothesized to measure the construct of student academic focus. Results of a confirmatory cluster analysis were consistent with the hypothesis. Note that six variables occur in all years, five occur in two adjacent years, and one variable occurs only in 1992. (Note: All variables available for a given year were used in the analyses for that year.) The scale of the composite variable was standardized within state to have mean 0 and unit standard deviation.

Black vs. White (BvsW)
This is an indicator variable that takes the value 0 if the student is White, 1 if the student is Black, and missing if the student is Hispanic, Asian, or Pacific Islander. If a school has no Black or White students, this variable is missing for the entire school. In the level 1 model, the regression coefficient associated with this variable is equal to the mean difference between Black and White students for the school or the stratum, depending on the structure of the corresponding level 2 model.

Average school SES (AggSeS)
The average of the SES values of students in the NAEP sample in the school.

Percent Black Students Assessed in the School (AggBvsW)
The average of the Black/White indicator in the NAEP sample in the school. This is equal to the proportion of Black students in the subset of Black or White students in the sample.

School Climate (Climate)
The average of the non-missing values of those questionnaire items having to do with school climate, based on responses by teachers and school administrators.

Table E-1 presents a brief description of each variable and indicates the administrations in which they were measured.
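The composite definitions above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the field names, and the five student records are hypothetical, the two SES inputs are assumed to be on comparable scales already, and the population standard deviation is an assumption, since the appendix does not say which form was used.

```python
from statistics import mean, pstdev

def average_non_missing(values):
    """Mean of the entries that are not None; None when every entry is missing."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None

def standardize(values):
    """Rescale non-missing entries to mean 0 and SD 1 (population SD assumed)."""
    present = [v for v in values if v is not None]
    mu, sd = mean(present), pstdev(present)  # sd must be non-zero
    return [None if v is None else (v - mu) / sd for v in values]

def bvsw(race):
    """0 for White, 1 for Black, None (missing) for any other group."""
    return {"White": 0, "Black": 1}.get(race)

# Hypothetical students from a single state; all values are illustrative only.
students = [
    {"school": "A", "race": "White", "home_items": 3, "parent_educ": 4},
    {"school": "A", "race": "Black", "home_items": None, "parent_educ": 2},
    {"school": "A", "race": "Hispanic", "home_items": 2, "parent_educ": None},
    {"school": "B", "race": "Black", "home_items": None, "parent_educ": None},
    {"school": "B", "race": "Black", "home_items": 4, "parent_educ": 3},
]

# SES: average of the non-missing inputs, then standardized within the state.
raw_ses = [average_non_missing([s["home_items"], s["parent_educ"]]) for s in students]
ses = standardize(raw_ses)

# School-level aggregates for school A.
in_a = [s["school"] == "A" for s in students]
agg_bvsw_a = average_non_missing([bvsw(s["race"]) for s, a in zip(students, in_a) if a])
agg_ses_a = average_non_missing([z for z, a in zip(ses, in_a) if a])
```

In this sketch the Hispanic student contributes to the school's average SES but not to AggBvsW, which matches the description of AggBvsW as a proportion within the subset of Black or White students.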


Table E-1
Student Variables Included in the Academic Focus Composite

Variable | Description | 92 96 00
Homework | B003901 How much time is spent each day on homework. (This was recoded to collapse the 1st 2 categories.) | X X X
Pgsread | B001101 How many pages read in school and for homework. Recoded to reverse the categories. | X X X
Textbk | M811601 How often? Do math problems from textbooks. Recoded to reverse the categories. | X X X
Calc | M811605 How often? Use a calculator. Recoded to reverse the categories. | X X X
Mathhi | If Classnow = 3 or 4, Mathhi = 1. Otherwise Mathhi = 0. Classnow: M810501 What kind of class are you taking this year? | X X X
Mathhi9 | If Class9 = 4 or 5, Mathhi9 = 1. Otherwise Mathhi9 = 0. Class9: M811701 What math class will you take in 9th grade? | X X X
Wrksheet | M811602 How often? Do math problems on worksheets. | X X
Reports | M811611 How often? Write reports or do math projects. | X X
Solution | MM00501 How often asked to provide detailed solutions? | X X
Tests | M811607 How often? Take math tests. (Same categories as M811602.) | X X
Howfar | B009801 How far in school do you think you will go? Recoded so that category 6 = 1. | X X
Makeup | M811610 How often? Make up problems for others to solve. | X

School Variables

Aggregated Academic Focus (AggAcadFoc)
The average of the student academic focus variable for the students in the NAEP sample in the school.

The following table indicates the variables comprising this construct and the administrations in which each was measured. Due to changes in the background variables, as well as differences in how well the variables related to the construct, a different set of variables was used for each administration. Of course, this limits comparability across administrations.
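The recodes listed in Table E-1 and the AcadFoc sum can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, reverse coding assumes response categories numbered 1 through n (the actual number of categories per item is not given in the table), and returning a missing value when every component is missing is an assumption.

```python
def reverse_code(value, n_categories):
    """Flip a 1..n_categories response so the highest category becomes 1.

    Assumes categories are coded 1..n_categories; missing stays missing.
    """
    return None if value is None else n_categories + 1 - value

def mathhi(classnow):
    """1 if Classnow is 3 or 4, otherwise 0, following the rule in Table E-1."""
    return 1 if classnow in (3, 4) else 0

def acadfoc(components):
    """Sum of the non-missing component values; None if all are missing (assumed)."""
    present = [v for v in components if v is not None]
    return sum(present) if present else None

# Hypothetical student: one reverse-coded item with 5 assumed categories,
# the Mathhi indicator, one missing item, and one item used as recorded.
example_score = acadfoc([reverse_code(1, 5), mathhi(3), None, 2])
```

Read literally, the Mathhi rule assigns 0 to a student whose Classnow response is missing, so the sketch does the same. The resulting sum would still need to be standardized within state, as described for the composite.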


Table E-2
School Variables Included in the School Climate Construct

Variable | Description | 92 96 00
Absenteeism Student | Is student absenteeism a problem in your school? | X X X
Absenteeism Teacher | Is teacher absenteeism a problem in your school? | X X X
Fighting | Are physical conflicts a problem in your school? | X X X
Parent Support | Is parent support for academics positive or negative? | X X X
Pct Absent | What percent of students are absent on an average day? | X X X
Pct Retained | What percent of students are retained in this grade? | X X X
Pct Enroll | What percent of students are enrolled at the beginning and the end of the year? | X X X
Tardiness | Is student tardiness a problem in your school? | X X
Pct PTA | What percent of parents are in a parent-teacher organization? | X X
Pct Par Schl | What percent of parents attend open house/back-to-school night? | X X
Par Involve | Is lack of parent involvement a problem in your school? | X X
Gangs | Are gang activities a problem in your school? | X X
Misbehavior | Is student misbehavior a problem in your school? | X X
Pct Leave | What percent of full-time teachers leave before the end of the year? | X X
Cut Class | Is cutting classes a problem in your school? | X
Teacher Acad | Are teacher attitudes to academics positive or negative? | X
Property | Is regard for school property positive or negative? | X
Teacher & Student | Are relations between teachers and students positive or negative? | X
Pct Conf | What percent of parents attend parent/teacher conferences? | X
Health | Is student health a problem in your school? | X
Cheating | Is student cheating a problem in your school? | X
Expectations | Are teachers’ expectations for student achievement positive or negative? | X

Seven variables are common to all three years, another seven are common to two adjacent years, and eight variables occur in only one year.
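As a quick consistency check on the summary sentence above, the number of administrations marked for each Table E-2 variable can be tallied. This is only a sketch: the dictionary is transcribed from the X marks in the table, and it records how many of the three administrations are marked, not which ones.

```python
from collections import Counter

# Number of administrations (out of 1992, 1996, 2000) marked for each
# Table E-2 variable, transcribed from the X marks in the table.
marks = {
    "Absenteeism Student": 3, "Absenteeism Teacher": 3, "Fighting": 3,
    "Parent Support": 3, "Pct Absent": 3, "Pct Retained": 3, "Pct Enroll": 3,
    "Tardiness": 2, "Pct PTA": 2, "Pct Par Schl": 2, "Par Involve": 2,
    "Gangs": 2, "Misbehavior": 2, "Pct Leave": 2,
    "Cut Class": 1, "Teacher Acad": 1, "Property": 1, "Teacher & Student": 1,
    "Pct Conf": 1, "Health": 1, "Cheating": 1, "Expectations": 1,
}

# Map "number of administrations" to "number of variables with that count".
tally = Counter(marks.values())
```

The tally gives seven variables with three marks, seven with two, and eight with one, which agrees with the counts stated in the text for the 22 climate variables.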


EDUCATION POLICY ANALYSIS ARCHIVES
http://epaa.asu.edu
Editor: Sherman Dorn, University of South Florida
Production Assistant: Chris Murrell, Arizona State University
General questions about appropriateness of topics or particular articles may be addressed to the Editor, Sherman Dorn, epaa-editor@shermandorn.com.

Editorial Board
Michael W. Apple, University of Wisconsin
David C. Berliner, Arizona State University
Robert Bickel, Marshall University
Gregory Camilli, Rutgers University
Casey Cobb, University of Connecticut
Linda Darling-Hammond, Stanford University
Gunapala Edirisooriya, Youngstown State University
Mark E. Fetler, California Commission on Teacher Credentialing
Gustavo E. Fischman, Arizona State University
Richard Garlikov, Birmingham, Alabama
Gene V Glass, Arizona State University
Thomas F. Green, Syracuse University
Aimee Howley, Ohio University
Craig B. Howley, Appalachia Educational Laboratory
William Hunter, University of Ontario Institute of Technology
Daniel Kallós, Umeå University
Benjamin Levin, University of Manitoba
Thomas Mauhs-Pugh, Green Mountain College
Les McLean, University of Toronto
Heinrich Mintrop, University of California, Berkeley
Michele Moses, Arizona State University
Anthony G. Rud Jr., Purdue University
Michael Scriven, Western Michigan University
Terrence G. Wiley, Arizona State University
John Willinsky, University of British Columbia


EDUCATION POLICY ANALYSIS ARCHIVES
English-language Graduate-Student Editorial Board
Noga Admon, New York University
Jessica Allen, University of Colorado
Cheryl Aman, University of British Columbia
Anne Black, University of Connecticut
Marisa Cannata, Michigan State University
Chad d'Entremont, Teachers College Columbia University
Carol Da Silva, Harvard University
Tara Donahue, Michigan State University
Camille Farrington, University of Illinois Chicago
Chris Frey, Indiana University
Amy Garrett Dikkers, University of Minnesota
Misty Ginicola, Yale University
Jake Gross, Indiana University
Hee Kyung Hong, Loyola University Chicago
Jennifer Lloyd, University of British Columbia
Heather Lord, Yale University
Shereeza Mohammed, Florida Atlantic University
Ben Superfine, University of Michigan
John Weathers, University of Pennsylvania
Kyo Yamashiro, University of California Los Angeles


Archivos Analíticos de Políticas Educativas
Associate Editors: Gustavo E. Fischman & Pablo Gentili, Arizona State University & Universidade do Estado do Rio de Janeiro
Founding Associate Editor for Spanish Language (1998-2003): Roberto Rodríguez Gómez

Editorial Board
Hugo Aboites, Universidad Autónoma Metropolitana-Xochimilco
Adrián Acosta, Universidad de Guadalajara, México
Claudio Almonacid Avila, Universidad Metropolitana de Ciencias de la Educación, Chile
Dalila Andrade de Oliveira, Universidade Federal de Minas Gerais, Belo Horizonte, Brasil
Alejandra Birgin, Ministerio de Educación, Argentina
Teresa Bracho, Centro de Investigación y Docencia Económica-CIDE
Alejandro Canales, Universidad Nacional Autónoma de México
Ursula Casanova, Arizona State University, Tempe, Arizona
Sigfredo Chiroque, Instituto de Pedagogía Popular, Perú
Erwin Epstein, Loyola University, Chicago, Illinois
Mariano Fernández Enguita, Universidad de Salamanca, España
Gaudêncio Frigotto, Universidade Estadual do Rio de Janeiro, Brasil
Rollin Kent, Universidad Autónoma de Puebla, Puebla, México
Walter Kohan, Universidade Estadual do Rio de Janeiro, Brasil
Roberto Leher, Universidade Estadual do Rio de Janeiro, Brasil
Daniel C. Levy, University at Albany, SUNY, Albany, New York
Nilma Limo Gomes, Universidade Federal de Minas Gerais, Belo Horizonte
Pia Lindquist Wong, California State University, Sacramento, California
María Loreto Egaña, Programa Interdisciplinario de Investigación en Educación
Mariano Narodowski, Universidad Torcuato Di Tella, Argentina
Iolanda de Oliveira, Universidade Federal Fluminense, Brasil
Grover Pango, Foro Latinoamericano de Políticas Educativas, Perú
Vanilda Paiva, Universidade Estadual Do Rio De Janeiro, Brasil
Miguel Pereira, Catedratico Universidad de Granada, España
Angel Ignacio Pérez Gómez, Universidad de Málaga
Mónica Pini, Universidad Nacional de San Martin, Argentina
Romualdo Portella do Oliveira, Universidade de São Paulo
Diana Rhoten, Social Science Research Council, New York, New York
José Gimeno Sacristán, Universidad de Valencia, España
Daniel Schugurensky, Ontario Institute for Studies in Education, Canada
Susan Street, Centro de Investigaciones y Estudios Superiores en Antropologia Social Occidente, Guadalajara, México
Nelly P. Stromquist, University of Southern California, Los Angeles, California
Daniel Suarez, Laboratorio de Politicas Publicas-Universidad de Buenos Aires, Argentina
Antonio Teodoro, Universidade Lusófona Lisboa
Carlos A. Torres, UCLA
Jurjo Torres Santomé, Universidad de la Coruña, España


Resource Identifier:
E11-00481
Title:
Educational policy analysis archives. Vol. 14, no. 8 (March 20, 2006).
Place of Publication:
Tempe, Ariz.
Tampa, Fla.
Publisher:
Arizona State University
University of South Florida.
Publication Date:
March 20, 2006
Contents:
Black-white achievement gap : do state policies matter? / Henry I. Braun, Aubrey Wang, Frank Jenkins [and] Elliot Weinbaum.
Subjects / Keywords:
Education -- Research -- Periodicals.
Creator:
Arizona State University.
University of South Florida.
Host Item:
Education Policy Analysis Archives (EPAA)
URL:
http://digital.lib.usf.edu/?e11.481