Education Policy Analysis Archives
Volume 10 Number 12, February 13, 2002
ISSN 1068-2341

A peer-reviewed scholarly journal. Editor: Gene V Glass, College of Education, Arizona State University.

Copyright 2002, the EDUCATION POLICY ANALYSIS ARCHIVES. Permission is hereby granted to copy any article if EPAA is credited and copies are not sold. Articles appearing in EPAA are abstracted in the Current Index to Journals in Education by the ERIC Clearinghouse on Assessment and Evaluation and are permanently archived in Resources in Education.

How Schools Matter: The Link Between Teacher Classroom Practices and Student Academic Performance

Harold Wenglinsky
Educational Testing Service

Citation: Wenglinsky, H. (2002, February 13). How schools matter: The link between teacher classroom practices and student academic performance. Education Policy Analysis Archives, 10(12). Retrieved [date] from http://epaa.asu.edu/epaa/v10n12/.

Abstract

Quantitative studies of school effects have generally supported the notion that the problems of U.S. education lie outside of the school. Yet such studies neglect the primary venue through which students learn, the classroom. The current study explores the link between classroom practices and student academic performance by applying multilevel modeling to the 1996 National Assessment of Educational Progress in mathematics. The study finds that the effects of classroom practices, when added to those of other teacher characteristics, are comparable in size to those of student background, suggesting that teachers can contribute as much to student learning as the students themselves.
Introduction

Much of the discussion in educational reform hinges on the question of whether schools matter. Over the past two decades, policymakers have called for improvements in the academic performance of U.S. students. Many educational reformers, particularly those associated with the standards movement, hold that the key to improving student performance lies in improving the schools. If academic standards are rigorous, curriculum and assessments are aligned to those standards, and teachers possess the skills to teach at the level the standards demand, student performance will improve. However, this perspective is to some extent at odds with another that has emerged from the discussion about school improvement, namely that it is students rather than schools that make the difference. Hence, a New York Times story on how to improve the academic performance of low-income students can include the headline "What No School Can Do" (Traub, 2000). Or, as Laurence Steinberg puts it in Beyond the Classroom: Why School Reform Has Failed and What Parents Need to Do, "neither the source of our achievement problem, nor the mechanism through which we can best address it, is to be found by examining or altering schools" (Steinberg, 1996, p. 60). In this view it is the social backgrounds of students that play the key role in their ability to learn, and only by moving outside of the educational system and attacking the pervasive economic inequalities that exist in the U.S. can student performance be improved. Quantitative research on whether schools matter has generally supported the notion that the problems of U.S. education lie outside of the schools.
Some research finds that when the social backgrounds of students are taken into account, school characteristics do not seem to influence student outcomes, suggesting that schools do not serve as avenues for upward mobility, but instead reinforce existing social and economic inequalities (Coleman et al., 1966; Jencks et al., 1972). Other researchers contend that school characteristics can have a greater effect on student outcomes than would be expected based upon student background (Lee, Bryk and Smith, 1993). But while the research in support of this contention does find significant effects for school characteristics, the magnitudes of these effects tend to be modest, far overshadowed by the effects of student background characteristics. (Note 1)

A possible reason for the lack of large school effects in quantitative research is the failure of such research to capitalize on an insight from qualitative research: the central importance of the classroom practices of teachers. As far back as Willard Waller (1932), qualitative researchers have noted that the interaction which occurs between teachers and students in the classroom is greater than the sum of its parts. Students can leave the classroom with their knowledge and attitudes dramatically altered from what they were before they entered. Quantitative research neglects this dimension of schooling by treating it as a "black box," not worthy of study (Mehan, 1993).
Often teaching is not studied at all, and, when it is, only the characteristics of teachers that are easily measured but far removed from the classroom (such as their level of educational attainment) are included.

The current study seeks to fill this gap in the literature by using quantitative methods to study the link between student academic achievement and teacher classroom practices, as well as other aspects of teaching such as the professional development teachers receive in support of their classroom practices and the more traditional teacher background characteristics, referred to here as teacher inputs. Such a study is made possible by the availability of a large-scale nationally representative database, the
National Assessment of Educational Progress (NAEP), which includes a comprehensive set of classroom practices along with student test scores and other characteristics of students and teachers. For this study, the 7,146 eighth graders who took the 1996 assessment in mathematics are studied along with their mathematics teachers. The statistical technique of multilevel structural equation modeling (MSEM) is employed to address the major methodological shortcomings of the quantitative literature, namely the failure to distinguish between school- and student-level effects, to measure relationships among independent variables, and to explicitly model measurement error. The study finds that classroom practices indeed have a marked effect on student achievement and that, in concert with the other aspects of teaching under study, this effect is at least as strong as that of student background. This finding documents the fact that schools indeed matter, due to the overwhelming influence of the classroom practices of their teachers.

Background

Much of the quantitative literature linking school characteristics to student outcomes has focused on the impact of economic characteristics, or school resources. These studies are known as production function studies. One of the earliest was the Equality of Educational Opportunity Study, commonly referred to as the Coleman Report (Coleman et al., 1966). This study applied Ordinary Least Squares (OLS) regression analysis to nationally representative samples of elementary and secondary school students to relate school resources such as per-pupil expenditures to student academic achievement and other outcomes. The study found that, on average, when student background was taken into account, school resources were not significantly associated with student outcomes. Nearly 400 additional production function studies have since been conducted.
Meta-analyses tabulating the results of such studies between 1964 and 1994 reached divergent conclusions. Some concluded that these studies showed no consistent relationship between school resources and student achievement (Hanushek, 1997, 1996a, 1996b, 1989), while others concluded that the studies showed a consistent, albeit modest, positive relationship (Greenwald, Hedges, & Laine, 1996; Hedges & Greenwald, 1996; Hedges, Laine & Greenwald, 1994). (Note 2)

Another line of inquiry into the impact of schooling on students, focusing on the social and organizational characteristics of schools, also emerged from the Coleman Report. This body of research, known as effective schools research, sought to identify common characteristics of schools in which students performed above what would be expected based upon their backgrounds (Edmonds, 1979; Brookover et al., 1979; Austin & Garber, 1985). While the earliest of these studies tended to be small in scope, later studies using large-scale databases confirmed many of their basic findings (Lee, Bryk & Smith, 1993; Chubb & Moe, 1990). These studies found that such characteristics of schools as the leadership qualities of the principal, the disciplinary environment of the school and the size of the student body all had an effect on student outcomes. In comparison to student background, however, these effects appeared quite modest.

Much of the quantitative research which focused specifically on teaching conformed to a similar pattern, finding little relationship between teacher inputs and student achievement. The Coleman Report measured seven teacher characteristics: years of experience, educational attainment, scores on a vocabulary test, ethnicity, parents' educational attainment, whether the teacher grew up in the area in which he or she was
teaching, and the teacher's attitude toward teaching middle-class students. For most students, this study found these characteristics to explain less than 1% of the variation in student test scores. The findings of the meta-analyses of production function studies were just as mixed for teacher inputs as for other school resources. They found that less than one-third of the studies could document a link between student outcomes and teacher experience, less than one-quarter could do so for teacher salaries, and just one in ten could do so for educational attainment; from such mixed results, the meta-analyses came to divergent conclusions, some suggesting a positive relationship and some suggesting no relationship.

More recent research on teaching has confirmed the lack of a clear relationship between student outcomes and teacher inputs, but with two exceptions: the amount of coursework the teacher had pursued in the relevant subject area and the teacher's scores on basic skills tests. Two analyses of large-scale databases revealed that exposure teachers received to college-level courses in the subject they were teaching led to better student performance. Monk (1994) analyzed 2,829 high school students from the Longitudinal Study of American Youth. These students were tested in mathematics and science in 10th, 11th and 12th grades, and filled out questionnaires on their background characteristics. Their mathematics and science teachers were also surveyed. The study related teacher characteristics to student test scores, taking into account students' earlier test scores, background characteristics and teacher inputs. The study found that the more college-level mathematics or science courses (or math or science pedagogy courses) teachers had taken, the better their students did on the mathematics and science assessments.
The more traditional teacher inputs that had been measured in the earlier production function studies, such as teacher experience or educational attainment, proved unrelated to student achievement. Similar results were obtained in a study by Goldhaber and Brewer (1995). They analyzed data on 5,149 10th graders, 2,245 mathematics teachers and 638 schools drawn from the National Educational Longitudinal Study of 1988 (NELS:88). Of the various inputs studied, the only one found to make a difference was the proxy for college-level mathematics coursetaking, namely whether the teacher had majored in mathematics.

Another series of recent studies suggested that, in addition to the teacher's coursework in the relevant subject making a difference, so too did the teacher's proficiency in basic skills as measured by standardized tests. Ferguson (1991) analyzed data on nearly 900 Texas districts, representing 2.4 million students and 150,000 teachers. He related the district average of various teacher inputs to average student scores on a basic skills test, taking into account student background. All of the school variables taken together accounted for from 25% to 33% of the variation in average student test scores, and one input, teachers' scores on the Texas Examination of Current Administrators and Teachers, a basic skills test, accounted for the lion's share of this effect. Similar results were obtained by Ferguson and Ladd (1996) in their study of Alabama school districts. Another district-level analysis, this time of 145 North Carolina school districts (Strauss & Sawyer, 1986), found a relationship between average teacher scores on a licensure test, the National Teacher Examination, and student scores on two different assessments taken by high school juniors, taking into account other school and student characteristics.
The Coleman data have even been reanalyzed, finding a link between teacher scores on a vocabulary test and student scores on tests in various subject areas (Ehrenberg & Brewer, 1995). That study aggregated data to the school level, analyzing samples of 969 elementary and 256 secondary schools. The study calculated a dependent
variable, a "synthetic gain score," as the difference between mean student scores in the sixth and third grades for elementary school students and in the twelfth and ninth grades for high school students. The study related teachers' educational attainment, experience and scores on a vocabulary test to synthetic gain scores and found only the latter to be consistently related to student performance.

Although large-scale quantitative research studied those aspects of teaching that are easily measurable, such aspects tend to be far removed from what actually occurs in the classroom. To study teacher classroom practices and the kinds of training and support pertinent to these practices which teachers receive, it is necessary to draw primarily on the findings of qualitative research.

The qualitative literature on effective teaching emphasizes the importance of higher-order thinking skills (McLaughlin & Talbert, 1993). Teaching higher-order thinking skills involves not so much conveying information as conveying understanding. Students learn concepts and then attempt to apply them to various problems, or they solve problems and then learn the concepts that underlie the solutions. These skills tend to be conveyed in one of two ways: through applying concepts to problems (applications) or by providing examples or concrete versions of the concept (simulations). In either case, students learn to understand the concept by putting it in another context. In the case of an application, this might mean solving a unique problem with which the student is unfamiliar. In the case of a simulation this might mean examining a physical representation of a theorem from geometry or engaging in a laboratory exercise that exemplifies a law from chemistry.
While both lower-order and higher-order thinking skills undoubtedly have a role to play in any classroom, much of the qualitative research asserts that the students of teachers who can convey higher-order thinking skills as well as lower-order thinking skills outperform students whose teachers are only capable of conveying lower-order thinking skills (see also Phelan 1989; Langer & Applebee, 1987).

The qualitative research also emphasizes three additional classroom practices: individualization, collaboration and authentic assessment. Individualization means that teachers instruct each student by drawing upon the knowledge and experience that that particular student already possesses. Collaborative learning means that teachers allow students to work together in groups. Finally, authentic assessment means that assessment occurs as an artifact of learning activities. This can be accomplished, for instance, through individual and group projects that occur on an ongoing basis rather than at a single point in time (McLaughlin & Talbert, 1993; Graves & Sunstein, 1992; Golub, 1988). (Note 3)

The qualitative research suggests that this set of classroom practices can produce qualitative improvements in the academic performance of all students, regardless of their backgrounds. The focus on higher-order thinking skills is not only appropriate for advanced students; even those in need of more basic skills can benefit from understanding the conceptual basis of these skills. And individualization of instruction does not simply mean using special techniques for low-performing students; techniques developed to address the problems of low-performing students can often help high-performing students as well. Regardless of the level of preparation students bring into the classroom, the qualitative research asserts, decisions that teachers make about classroom practices can either greatly facilitate student learning or serve as an obstacle to it.
Qualitative studies are, by their nature, in-depth portraits of the experiences of specific students and teachers. As such, they provide valuable insight into the interrelationships between various aspects of teacher practice and student learning. However, because they focus on one specific setting, it is difficult to generalize the results of these studies to broader groups of students and teachers. This suggests the need for large-scale quantitative studies that can test the generalizability of the insights from qualitative research.

Yet there has been little quantitative research into whether classroom practices, in concert with other teacher characteristics, have an impact on student learning that is comparable in size to that from background characteristics. Two notable exceptions are a study of the classroom experiences of the nation's students using NELS:88 (National Center for Education Statistics, 1996) and a study of the professional development experiences and classroom practices of California's teachers (Cohen & Hill, 2000). The NELS:88 study related a few classroom practices to student achievement in mathematics and science and found that a focus on higher-order thinking skills had a positive effect in math but not in science. The California study related a few professional development experiences of teachers to their classroom practices, and related both of these to student scores on the state assessment. The study found positive relationships between reform-oriented classroom practices and student achievement as well as between reform-oriented professional development and reform-oriented classroom practices, although these relationships were marginal (mostly significant at the .15 level). While these two studies represent an important departure from production function studies, in their inclusion of measures of classroom practice and professional development, the usefulness of their findings is limited by their data and method.
The measures of classroom practice in the NELS:88 and California databases are hardly comprehensive. Neither database has, among other things, a measure of hands-on learning activities. And the California study combines its few classroom practices into two variables, reform-minded and traditional practice, making it difficult to gauge the effectiveness of particular practices. The NELS:88 data also lack measures of most aspects of professional development, and hence professional development was not included in the NELS:88 study. The California data lack measures of social background for individual students, and hence the California study relied upon the percentage of students in the school who received a free or reduced-price lunch, a weak measure. The two studies also relied upon regression analysis, which, as shall be seen, is problematic in the study of school effects.

These two exceptions notwithstanding, quantitative research has tended to find that the effects of student background on student achievement and other outcomes far overshadow school effects. Some of the research has found no school effects at all, while other research has found effects that are, at best, modest. Specifically in terms of teaching, such research has found that most characteristics of teachers do not matter, and the few that do are not as important as student background. Yet such studies ignore qualitative work that suggests that certain classroom practices are highly conducive to student achievement. If this is the case, then classroom practices may indeed explain a substantial portion of the variance in student achievement. The current study seeks to explore this possibility, through the analysis of a national database that includes an unprecedentedly comprehensive set of classroom practices.
Hypotheses, Data and Method

The study tests two hypotheses concerning teacher quality. Teacher quality has three aspects: the teacher's classroom practices, the professional development the teacher receives in support of these practices, and characteristics of the teacher external to the classroom, such as educational attainment. The first hypothesis is that, of these aspects of teacher quality, classroom practices will have the greatest impact on student academic performance, professional development the next greatest, and teacher inputs the least. The rationale for this expectation is that the classroom is the primary venue in which students and teachers interact; hence, decisions by teachers as to what to do in this venue will most strongly affect student outcomes. Teacher inputs will be least likely to influence student academic performance because they do so less directly, through encouraging classroom practices conducive to high student performance. Professional development falls somewhere between classroom practices and teacher inputs. It does occur outside the classroom, but is more closely tied to specific classroom practices than are teacher inputs. Second, it is hypothesized that teacher quality is as strongly related to student academic performance as student background characteristics. When the effects for all three aspects of teacher quality are added together, the result will be comparable in size to that of student background. The rationale behind this expectation is that, as the qualitative literature suggests, student learning is a product of the interaction between students and teachers, and both parties contribute to this interaction.

To test these hypotheses, this study makes use of NAEP, which can measure all three aspects of teacher quality as well as student performance and other potential influences on student performance.
NAEP is administered every year or two in various subjects to nationally representative samples of fourth, eighth and twelfth graders. The subjects vary, but have included at one time or another mathematics, science, reading, writing, geography and history. In addition to the test itself, NAEP includes background questionnaires completed by the student, the principal, and the teacher in the relevant subject area. The results from NAEP are used to measure trends in student performance over time and to compare performance among various subgroups of students such as males and females (for an overview of NAEP, see Johnson 1994).

For this study, data on the 7,146 eighth graders who took the 1996 mathematics assessment are analyzed. Eighth graders are used for this analysis because they are exposed to a wider range of subject matter than fourth graders, and teacher questionnaires are not available for twelfth graders. Student performance is measured from test scores on the assessment. Student background is measured utilizing six questions from the student background questionnaire: the father's level of education, the mother's level of education, whether there are 25 or more books in the home, whether there is an encyclopedia in the home, whether the family subscribes to a newspaper and whether the family subscribes to a magazine. The three aspects of teacher quality are measured from a background questionnaire completed by the mathematics teacher. Three teacher inputs are measured: the teacher's education level, whether the teacher majored or minored in the relevant subject area (mathematics or math education), and the teacher's years of experience.
Ten measures of professional development are used: the amount of professional development teachers received last year and whether teachers received any professional development in the last five years in the topics of cooperative learning, interdisciplinary instruction, higher-order thinking skills, classroom management, portfolio assessment, performance-based assessment, cultural diversity,
teaching special-needs students, and teaching limited-English-proficient (LEP) students. Finally, 21 classroom practices are utilized: addressing algebra, addressing unique problems, addressing routine problems, using textbooks, using worksheets, having students talk about mathematics, having students write reports, having students solve problems that involve writing about math, having students work with objects, having students work with blocks, having students solve real-world problems, having students hold discussions in small groups, having students write a group paper, having students work with partners, assessing student progress from tests, assessing student progress from multiple-choice tests, assessing student progress from tests involving constructed responses, assessing student progress from portfolios, assessing student progress from individual projects, and the amount of homework assigned. One school characteristic not pertaining to teacher quality is also drawn from the teacher questionnaire, the number of students in the class. See Table 1 for a complete list of variables.

Table 1
Descriptive Statistics for Teacher Inputs and Professional Development

Teacher Inputs                                                         M      SD
  Teacher's Education Level (From 1=M.A.)                             2.38    .46
  Teacher Majors in Mathematics (1=yes, 0=no)                          .69    .43
  Teacher's Years of Experience (From 1=2 or less to 5=25 or more)    3.53   1.17

Professional Development
  Classroom Management (1=yes, 0=no)                                   .44    .46
  Cooperative Learning (1=yes, 0=no)                                   .68    .44
  Cultural Diversity (1=yes, 0=no)                                     .32    .43
  Higher-Order Thinking Skills (1=yes, 0=no)                           .45    .46
  Interdisciplinary Instruction (1=yes, 0=no)                          .50    .47
  Limited English Proficiency (1=yes, 0=no)                            .12    .47
  Performance-based Assessment (1=yes, 0=no)                           .12    .35
  Portfolio Assessment (1=yes, 0=no)                                   .36    .45
  Special-needs Students (1=yes, 0=no)                                 .26    .41
  Amt. Professional Development Last Year (1=none to 5=35+ hours)     3.30   1.14

The method employed in this study is intended to address key methodological problems in the prior literature.
Much of school effects research (including most production function studies as well as the NELS:88 and Cohen & Hill studies of classroom practice)
relies upon OLS regression techniques. One problem with such techniques is that they are not sensitive to the multilevel nature of the data. School effects involve relating variables at one level of analysis, the school, to another level of analysis, the student. Studies using OLS tend either to aggregate student data to the school level or to disaggregate school data to the student level. The first approach can introduce aggregation biases into the models, the second approach can seriously underestimate standard errors, and both approaches can miss important information about the nature of the school effects (Bryk & Raudenbush, 1992; Goldstein, 1995). A second problem with regression techniques is their failure to take measurement error into account. These techniques assume that the variables in the models are perfectly measured by the observed data. Yet the operationalizations of most variables are subject to substantial error, both because the operationalization does not correspond perfectly to the model (e.g., parents' income as a proxy for socioeconomic status) and because data collection procedures are error-prone. Failing to take measurement error into account can lead to biased estimates of model coefficients. A third problem is that regression techniques are not adept at measuring interrelationships among independent variables. School effects often involve a multi-step process, in which one school characteristic influences another that may, in turn, influence the outcome of interest. While it is possible to run a series of models that regress each independent variable on the others, such models tend to be cumbersome and lack statistics measuring the overall fit of the series of models. Because of these difficulties, school effects research often neglects the indirect effects of various school characteristics.

One way to address these problems is through the technique of multilevel structural equation modeling (MSEM).
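The standard-error problem mentioned above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not an analysis of NAEP; the cluster counts, effect sizes, and variable names are invented for illustration. When both the predictor and the residual share a school-level component, the naive OLS standard error, which assumes independent students, is substantially smaller than a cluster-aware one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_schools, n_per = 50, 20

# Predictor and outcome each get a school-level component, so
# observations within the same school are correlated.
school_x = rng.normal(0, 1, n_schools)
school_u = rng.normal(0, 1, n_schools)
x = school_x[:, None] + rng.normal(0, 1, (n_schools, n_per))
y = 2.0 * x + school_u[:, None] + rng.normal(0, 1, (n_schools, n_per))

X, Y = x.ravel(), y.ravel()
X = X - X.mean()
Y = Y - Y.mean()

beta = (X @ Y) / (X @ X)              # OLS slope on centered data
resid = Y - beta * X

# Naive OLS standard error: treats every student as independent.
naive_se = np.sqrt((resid @ resid) / (X.size - 2) / (X @ X))

# Cluster-robust standard error: sums the score x*e within each
# school before squaring (one-regressor sandwich estimator).
scores = (X.reshape(n_schools, n_per) * resid.reshape(n_schools, n_per)).sum(axis=1)
cluster_se = np.sqrt(scores @ scores) / (X @ X)

print(f"slope={beta:.2f}  naive SE={naive_se:.4f}  cluster SE={cluster_se:.4f}")
```

With strong within-school correlation in both x and the residual, the cluster-aware standard error is roughly double the naive one here, which is exactly the underestimation Bryk & Raudenbush and Goldstein warn about.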
Structural equation modeling (SEM) involves two components: factor models and path models (Hayduk, 1987; Jöreskog & Sörbom, 1993). The factor models relate a series of indicators, known as manifest variables, to a construct of those indicators, known as a latent variable. The path models then relate the latent variables to one another. The estimation procedure for both the factor and path components involves three steps. A set of hypothesized relationships is specified by the researcher. Then, through an iterative process, differences between the covariance matrix those relationships imply (Σ) and the covariance matrix of the observed data (S) are minimized. The resulting estimates include coefficients for the hypothesized relationships, t-tests for their statistical significance, and statistics for the goodness of fit between Σ and S. SEM can be adapted to handle multilevel data by employing the estimation procedure separately for the two levels of analysis (Muthén, 1994; Muthén, 1991). The researcher hypothesizes a student-level factor model, a student-level path model, a school-level factor model and a school-level path model. These models can be used to generate two implied covariance matrices: Σ_B, a between-school matrix computed from the school deviations around the grand mean, and Σ_W, a within-school matrix computed from student deviations around group means. The observed data can be similarly partitioned into between- and within-school covariance matrices (S_B and S_W).

MSEMs can address the three problems in the prior literature. First, they do distinguish between schools and students; separate models are specified for each level of analysis and related to one another through a constant. Second, these models take measurement error into account in two ways. For one, the factor models explicitly measure the amount of variance in the manifest variables left unexplained by the latent variables.
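For reference, the iterative minimization described above is usually carried out with the standard maximum-likelihood discrepancy function (a textbook result, not specific to this study), together with the between/within partition of the observed matrices:

```latex
% ML discrepancy between implied \Sigma(\theta) and observed S;
% p is the number of manifest variables, \theta the free parameters.
F_{ML}(\theta) = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
  - \ln\lvert S\rvert - p

% Observed matrices for students i = 1,\dots,n_g in schools g = 1,\dots,G
% (N students in total):
S_W = \frac{1}{N-G}\sum_{g}\sum_{i}
      (y_{ig}-\bar{y}_g)(y_{ig}-\bar{y}_g)^{\top},
\qquad
S_B = \frac{1}{G-1}\sum_{g} n_g
      (\bar{y}_g-\bar{y})(\bar{y}_g-\bar{y})^{\top}
```

The fitting then matches Σ_W against S_W and Σ_B against S_B at their respective levels.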
In addition, factor models can actually reduce measurement error by generating latent variables from multiple manifest variables. Third, the path models estimate interrelationships among
independent variables, allowing for the estimation of indirect effects. The effect sizes and t-scores of the indirect effects are produced, as well as statistics that measure the overall goodness of fit of models that simultaneously specify these interrelationships. (Note 4)

The current study produces three MSEMs. Analyses are conducted using AMOS 3.6 (Arbuckle, 1997), a SEM software package, along with STREAMS 1.8, a pre- and post-processor that simplifies the syntax and output for multilevel models (Gustafsson & Stahl, 1997). In preparation for the preprocessor, the preexisting student-level variable labels are reduced to six characters and missing values are replaced with means for the pertinent variable. The software then aggregates the student-level data to the school level, and creates both a school-level covariance matrix and a pooled matrix of residual student-level covariances. (Note 5) The first MSEM relates teacher inputs to student academic performance, taking into account student socioeconomic status (SES) and class size (see Figure 1 below). The student-level factor model generates an SES construct from the six measures of student background, and an academic performance construct from a single test score. The student-level path model simply measures the covariance between SES and student academic performance. The school-level factor model generates an SES construct from school means of the six measures of student background and an academic construct from the school mean of the single test score. In addition, class size and teachers' years of experience, educational attainment, and major are constructed from individual measures that correspond to these constructs. The school-level path model treats student academic performance as a function of the other constructs.
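The between/within partition on which MSEM estimation rests can be made concrete with a short sketch. This is not the actual AMOS/STREAMS computation; the function and variable names here are hypothetical, chosen only to illustrate how a pooled within-school covariance matrix (S_W) and a between-school covariance matrix (S_B) are computed from raw student records.

```python
import numpy as np

def partition_covariance(scores, school_ids):
    """Partition observed covariance into a pooled within-school matrix (S_W),
    computed from student deviations around their school means, and a
    between-school matrix (S_B), computed from school deviations around
    the grand mean, as described for MSEM above.

    scores: (n_students, n_vars) array of manifest variables
    school_ids: length-n sequence assigning each student to a school
    """
    scores = np.asarray(scores, dtype=float)
    school_ids = np.asarray(school_ids)
    schools = np.unique(school_ids)
    grand_mean = scores.mean(axis=0)

    n, p = scores.shape
    within = np.zeros((p, p))
    between = np.zeros((p, p))
    for s in schools:
        group = scores[school_ids == s]
        gmean = group.mean(axis=0)
        dev_w = group - gmean                   # student deviations around group mean
        within += dev_w.T @ dev_w
        dev_b = (gmean - grand_mean)[:, None]   # school deviation around grand mean
        between += len(group) * (dev_b @ dev_b.T)

    S_W = within / (n - len(schools))           # pooled within-school covariance
    S_B = between / (len(schools) - 1)          # between-school covariance
    return S_W, S_B
```

A variable that is constant within every school contributes nothing to S_W, and all of its variance appears in S_B; this is the sense in which the student-level and school-level models in the text analyze different components of the data.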
The second MSEM relates professional development and teacher inputs to student academic performance and one another, taking into account student SES and class size (see Figure 2 below). The student-level factor and path models are the same as in the
teacher inputs model. Early versions of the school-level factor and path models include SES and student academic performance, constructed as before; the teacher inputs which prove significantly related to student academic performance, each constructed from a single corresponding measure; the amount of time in professional development, constructed from a single corresponding measure; and all nine professional development topics. For the sake of parsimony, the final school-level factor and path models include only those professional development topics significantly related to student academic performance. These are professional development in higher-order thinking skills, constructed from a single corresponding measure, and professional development in teaching different populations of students, constructed from professional development in cultural diversity, professional development in teaching LEP students, and professional development in teaching students with special needs. The parsimonious school-level path model relates each professional development construct to student achievement, and relates the teacher input, class size, and SES both to student achievement and to each professional development construct.

The third MSEM relates classroom practices, professional development, and teacher inputs to student academic performance and one another, taking into account SES and class size (see Figure 3 below). Student-level factor and path models remain the same as in prior models. Early versions of the school-level factor and path models include SES, class size, the teacher inputs that prove significant in the teacher input model, the amount of time in professional development, the topics of professional development that prove significant in the professional development model, and all 21 classroom practices.
For the sake of parsimony, the final school-level factor and path models include only those classroom practices that prove significantly related to student achievement. The final school-level factor model constructs the teaching of higher-order thinking skills from a single measure, solving unique problems; the teaching of lower-order thinking skills
from a single measure, solving routine problems; engaging in hands-on learning from three measures, working with blocks, working with objects, and solving real-world problems; assessing student progress through traditional testing from two measures, multiple-choice testing and the overall frequency of testing; and assessing student progress through more authentic assessments from three measures, portfolio assessments, individual projects, and constructed-response tests. The SES, class size, teacher input, and professional development constructs are handled as in the professional development model. The school-level path model relates these classroom practice constructs to the student achievement construct, relates the professional development constructs to the classroom practice constructs, and relates teacher inputs, SES, and class size to the professional development, classroom practice, and student achievement constructs.

These procedures are modified in two ways to take the design of NAEP into account. First, design effects are employed. NAEP is a stratified, clustered sample. Secondary analyses of NAEP that treat it as a simple random sample will underestimate standard errors, making significance tests overly liberal. One procedure recommended to address this problem is to inflate standard errors estimated under the simple-random-sample assumption by a certain factor, known as a design effect (O'Reilly, Zelenak, Rogers, & Kline, 1996). This study uses a design effect of 2, calculated by estimating the proper standard error for select values in the first MSEM and choosing the most conservative one. Cutoff points for all significance tests are increased by 41% (the increase in standard errors attributable to the square root of the design effect). (Note 6) Second, each MSEM is estimated multiple times, once for each "plausible value" of the student test score, and the resulting parameters and standard errors are pooled.
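The design-effect adjustment just described amounts to a simple rescaling; the following sketch (function names are illustrative, not from the study's actual code) shows the arithmetic of inflating a simple-random-sample standard error by a design effect of 2.

```python
import math

def adjust_se(se_srs, design_effect=2.0):
    """Inflate a standard error estimated under a simple-random-sample
    assumption by the square root of the design effect
    (sqrt(2) is about 1.41, i.e., a 41% increase)."""
    return se_srs * math.sqrt(design_effect)

def t_ratio(estimate, se_srs, design_effect=2.0):
    """t-ratio computed with the design-adjusted standard error; comparing
    it to the usual cutoff is equivalent to raising the cutoff by 41%."""
    return estimate / adjust_se(se_srs, design_effect)
```

For example, a coefficient of 2.0 with a naive standard error of 1.0 yields t = 2.0 under the simple-random-sample assumption, but t ≈ 1.41 after adjustment, which no longer clears the conventional .05 cutoff of 1.96.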
Because each student answers only a small subset of the assessment items, it is not possible to estimate a single student
score. Instead, five estimates are provided based upon the items the student did answer and background information about the student and the school. The appropriate procedure for secondary analyses using these five estimates, which are known as plausible values, is to estimate five separate models, one for each of the plausible values, pool their point estimates by taking their means, and pool their standard errors as the sum of the mean standard error and the variance among the five plausible values, the latter weighted by a factor of 1.2 (Johnson, Mislevy, & Thomas, 1994). (Note 7) The current study employs this technique, producing a total of 15 sets of estimates, five for each of the MSEMs. (Note 8)

Results

Before discussing the results from the three MSEMs, it is worthwhile to summarize what the NAEP data reveal about the prevalence of classroom practices, professional development, and teacher inputs. (Note 9) The data on teacher inputs indicate that eighth-grade math teachers are most likely to possess less than a master's degree, to have majored or minored in mathematics or math education, and to have 10 or more years of experience teaching (Table 1). Approximately 40% of eighth graders have teachers who possess a master's degree or more, with the remainder possessing a bachelor's degree or less. Approximately 70% of eighth graders have teachers who majored or minored in mathematics or math education; the rest have teachers who are teaching out of field. And approximately 60% of eighth graders have teachers with more than 10 years of experience.

The data on professional development indicate that while most teachers receive some professional development in some topics, that professional development tends not to be of long duration, and certain topics tend to be neglected (Table 1).
Most eighth graders have teachers who received some professional development in the last five years in the most common topics, such as cooperative learning or interdisciplinary instruction. But only one-third of eighth graders have teachers who received professional development in cultural diversity, one-quarter have teachers who received professional development in teaching students with special needs, and one-tenth have teachers who received professional development in teaching LEP students. And regardless of the topic of professional development, only a minority of students have teachers who received more than 15 hours of professional development last year.

The prevalence of classroom practices varies greatly (see Table 2). While much of the material covered in eighth grade involves issues of operations and measurement, teachers do cover more advanced topics. More than half of all students are exposed to algebra, and one-quarter to geometry. The kinds of problems students are taught to solve tend to involve a routine set of algorithms; four out of five students commonly work with such problems, as opposed to about half of students working with problems that involve unique situations. All students report taking a math test at least once a month. The nature of the test varies, however. Typically, students take tests that involve extended written responses (more than half do so at least once a month). About one-third of students take multiple-choice tests. Students are also assessed through individual projects and portfolios (also about one-third of students at least once a month). Hands-on learning activities appear quite infrequent. Just one-quarter of students work with objects and just one-tenth work with blocks. Problems with a concrete or practical bent, that address real-world situations, are fairly common, however, with three-quarters of students encountering such problems at least once a week.
Writing about mathematics is fairly uncommon, with just one-third of students doing so at least once a week. Group activities vary in
their frequency; most students discuss math in small groups, but only a minority of students solve problems in groups or work on a problem with a partner. Finally, textbooks and homework are ubiquitous in eighth-grade classrooms; nearly all students use a textbook at least once a week, and most do some homework every day.

Table 2
Descriptive Statistics for Classroom Practices

Classroom Practices                                                                 M      SD
Address Algebra (from 1=none to 4=a lot)                                            2.51   .59
Address Geometry (from 1=none to 4=a lot)                                           2.00   .61
Address Solving Routine Problems (from 1=none to 4=a lot)                           2.78   .43
Address Solving Unique Problems (from 1=none to 4=a lot)                            2.44   .56
Assessment Using Multiple-Choice Questions (from 1=never to 4=a lot/twice a week)   1.99   .83
Assessment Using Short/Long Answers (from 1=never to 4=a lot/twice a week)          2.49   .92
Assessment Using Portfolios (from 1=never to 4=a lot/twice a week)                  1.87   .79
Assessment Using Individual Projects (from 1=never to 4=a lot/twice a week)         2.19   .81
Work with Blocks (from 1=never to 4=almost every day)                               1.52   .58
Work with Objects (from 1=never to 4=almost every day)                              2.09   .77
Solve Real-Life Problems (from 1=never to 4=almost every day)                       2.93   .74
Write Reports (from 1=never to 4=almost every day)                                  1.39   .49
Write about Math (from 1=never to 4=almost every day)                               1.97   .79
Take Math Tests (from 1=never to 4=almost every day)                                2.49   .47
Do Worksheets (from 1=never to 4=almost every day)                                  2.65   .82
Talk about Math (from 1=never to 4=almost every day)                                2.70   1.02
Solve Problems with Other Students (from 1=never to 4=almost every day)             2.84   .83
Discuss Math with Other Students (from 1=never to 4=almost every day)               3.31   .70
Work with Partner (from 1=never to 4=almost every day)                              2.98   .82
Do Homework (from 1=never to 4=almost every day)                                    2.93   .75
Use Textbooks (from 1=never to 4=almost every day)                                  3.63   .65

Table 3
Descriptive Statistics for Other Characteristics of Schools and Students

Other Characteristics of Schools and Students                      M      SD
Class Size (from 0=more than 36 students to 4=1 to 20 students)    2.54   .88
Student's Family Gets Newspaper (1=yes, 0=no)                      .74    .43
Student's Family Has Encyclopedia (1=yes, 0=no)                    .82    .38
Student's Family Gets Magazine (1=yes, 0=no)                       .83    .37
Student's Family Has More than 25 Books (1=yes, 0=no)              .95    .21
Father's Education Level                                           2.91   .94
Mother's Education Level                                           2.85   .96
Math Score: Plausible Value #1                                     272.45 35.85
Math Score: Plausible Value #2                                     272.64 35.89
Math Score: Plausible Value #3                                     272.36 36.35
Math Score: Plausible Value #4                                     272.45 35.85
Math Score: Plausible Value #5                                     272.54 35.56

This description of teacher inputs, professional development, and classroom practices says little about their effectiveness. The fact that certain practices are uncommon may be bad or good, depending upon their impact on student outcomes. It is the role of the series of MSEMs to gauge the effectiveness of these three aspects of teacher quality.

For all three MSEMs, the student-level factor models are similar (Table 4). The factor models show the two student-level characteristics, SES and achievement, to be well measured. All of the indicators of SES have standardized factor loadings ranging from .24 to .33, suggesting that each plays a role in constructing the variable. The construct for achievement consists of a single indicator, and hence has a loading fixed at one and an error fixed at zero. The path model consists simply of the covariance between student SES and student achievement, and this covariance proves significant, with a correlation coefficient of .35 for all models. It should be remembered that this covariance pertains only to the student-level component of the models, meaning that variations in SES among students in the same school are associated with variations in their mathematics scores within that same school. Variations in average SES and achievement between schools are the purview of the school-level models.

Table 4
Student-Level Factor and Path Models
(each row shows the indicator's loading on its construct, followed by its error, for the Input, P.D., and Practices models in turn)

Factor Model                          Input Model            P.D. Model             Practices Model
Mother's Education Level (SES)        2.91* .31  1.00 .86    2.92* .31  1.00 .86    2.91* .31  1.00 .86
Father's Education Level (SES)        2.78* .31  1.00 .85    2.80* .31  1.00 .84    2.79* .31  1.00 .85
Family Gets Newspaper (SES)           1.00* .24  1.00 .92    1.00* .24  1.00 .92    1.00* .24  1.00 .92
Family Gets Encyclopedia (SES)        .92* .26   1.00 .94    .92* .26   1.00 .94    .92* .26   1.00 .94
Family Gets Magazine (SES)            1.16* .33  1.00 .90    1.16* .33  1.00 .90    1.16* .33  1.00 .92
Family Has More than 25 Books (SES)   .60* .30   1.00 .92    .60* .30   1.00 .92    .60* .30   1.00 .85
Plausible Value #1 (Ach)              1.00* .77  1.00 .28    1.00* .77  1.00 .28    1.00* .77  1.00 .28
Plausible Value #2 (Ach)              .99* .77   1.00 .28    .99* .77   1.00 .28    .99* .77   1.00 .28
Plausible Value #3 (Ach)              1.00* .77  1.00 .28    1.00* .77  1.00 .28    1.00* .77  1.00 .28
Plausible Value #4 (Ach)              .99* .77   1.00 .28    .99* .77   1.00 .28    .99* .77   1.00 .28
Plausible Value #5 (Ach)              .98* .77   1.00 .28    .98* .77   1.00 .28    .98* .77   1.00 .28

Path Model
Covariance between SES and Achievement   1.15* .35    1.15* .35    1.15* .35

*p<.05
Cells contain unstandardized and standardized coefficients, in that order.

The school-level factor models also have indicators that contribute substantially to their constructs (Table 5). The loadings for SES range between .17 and .25. (Note 10) Hands-on learning has loadings ranging from .46 to .79. Traditional assessment has loadings ranging from .37 to .57. And authentic assessment has loadings ranging from .41 to .73. All of the constructs generated from a single indicator have loadings fixed at 1 and errors fixed at 0 and so, by definition, their indicators contribute substantially. The one construct for which the indicators do not all contribute substantially is professional development in teaching special populations. Here two of the indicators (cultural diversity and teaching LEP students) load strongly on the construct, but the third (teaching special-needs students) does not. (A sensitivity analysis was conducted in which this indicator was excluded, without significant impact on the model.)

Table 5
School-Level Factor Model: Classroom Practices
(each row shows the indicator's loading on its construct, followed by its error; unstandardized and standardized coefficients, in that order)

Mother's Education (SES)       2.66* .23    1.00 .23
Father's Education (SES)       2.89* .25    1.00 .25
Newspaper (SES)                1.00* .19    1.00 .16
Encyclopedia (SES)             .78* .17     1.00 .07
Books (SES)                    .50* .20     1.00 .10
Magazine (SES)                 1.05* .23    1.00 .10
Plausible Value #1 (Ach)       1.00* .57    1.00 .03
Plausible Value #2 (Ach)       1.00* .58    1.00 .02
Plausible Value #3 (Ach)       1.00* .58    1.00 .03
Plausible Value #4 (Ach)       1.01* .58    1.00 .04
Plausible Value #5 (Ach)       1.00* .58    1.00 .02
Class Size (Class Size)        1.00* 1.00   1.00 .00
PD Time (PD Time)              1.00* 1.00   1.00 .00
PD Hi Order (PD Hi Order)      1.00* 1.00   1.00 .00
PD Cultural (PD Diversity)     1.00* .76    1.00 .46
PD LEP (PD Diversity)          .65* .55     1.00 .59
PD Special (PD Diversity)      .26* .20     1.00 .69
Tchr Major (Tchr Major)        1.00* 1.00   1.00 .00

*p<.05
Cells contain unstandardized and standardized coefficients, in that order.

Table 5
School-Level Factor Model: Classroom Practices (continued)

Real-World Problems (Hands-On Learning)    .64* .46    1.00 .63
Work with Objects (Hands-On Learning)      1.00* .66   1.00 .53
Work with Blocks (Hands-On Learning)       .83* .79    1.00 .43
Take Tests (Trad Assess)                   .35* .37    1.00 .66
Multiple-Choice Tests (Trad Assess)        1.00* .57   1.00 .58
Extended-Response Tests (Auth Assess)      1.01* .65   1.00 .54
Projects (Auth Assess)                     1.00* .73   1.00 .48
Portfolios (Auth Assess)                   .63* .41    1.00 .65
Address Routine Problems (Lower Order)     1.00* 1.00  1.00 .00
Address Unique Problems (Higher Order)     1.00* 1.00  1.00 .00

*p<.05
Cells contain unstandardized and standardized coefficients, in that order.

The school-level path model for teacher inputs shows that one of the three inputs, the teacher's major, is modestly associated with academic achievement. The model consists of a single dependent variable, achievement, related to five independent variables: SES, class size, and the three teacher inputs (Table 6). SES has an effect size of .76, which far overshadows those of class size and teacher's major (.10 and .09, respectively). The teacher's level of education and years of experience prove unrelated to student achievement.

Table 6
School-Level Path Model: Teacher Inputs
(effects on achievement; unstandardized and standardized coefficients, in that order)

SES          198.41** .76
Class Size   3.04* .10
Tchr Major   4.82** .09
Tchr Ed      1.20 .02
Tchr Exp     1.03 .05
Error        1.00 .44

*p<.10  **p<.05

The school-level path model for professional development finds that two topics, addressing special populations of students and higher-order thinking skills, are substantially related to student achievement. The model indicates that schools with high percentages of affluent students tend to have less time spent on professional development generally, and are less likely to expose their teachers to professional development on working with different student populations (Table 7). Schools with smaller average class sizes are also less likely to do these things. But schools with more teachers teaching on topic also devote more time to professional development. Of the three aspects of professional development, the amount of time is not significantly related to achievement. Professional development in higher-order thinking skills and in dealing with special populations, however, do have significant effects, with standardized coefficients of .12 and .21, respectively.

Table 7
School-Level Path Model: Professional Development
(columns are the dependent constructs; cells contain unstandardized and standardized coefficients, in that order)

              PD Diversity    PD Hi Order    PD Time        Ach
SES           -1.29** -.32    -.58 -.09      -1.87* -.12    213.18** .83
Class Size    -.08** -.17     -.04 -.06      -.20* -.11     4.23** .14
Tchr Major    -.01 -.01       .11 .08        .94** .27      5.05** .09
PD Diversity                                                13.24** .21
PD Hi Order                                                 4.88** .12
PD Time                                                     -.23 -.01
Error         1.00 .66        1.00 .70       1.00 .67       1.00 .41

*p<.10  **p<.05
Cells contain unstandardized and standardized coefficients, in that order.

The school-level path model for classroom practices finds three constructs, hands-on learning, solving unique problems, and avoiding reliance on authentic assessments, to be positively related to student achievement (Table 8). All five of the classroom practice constructs are related to some of the earlier variables: SES, class size, teacher major, or the three aspects of professional development. Schools with more affluent students are more likely to solve unique problems and less likely to engage in inauthentic forms of assessment.
Schools where teachers
received professional development in dealing with different student populations are less likely to have students engage in routine problem solving. And schools where teachers received professional development in higher-order thinking skills are more likely to have students engage in hands-on learning. Also, the more time teachers engage in professional development, the more their students engage in hands-on learning and authentic assessment. These practices are associated with student achievement. Schools where students engage in hands-on learning score higher on the mathematics assessment. Schools where students solve unique problems also score higher, as do those schools that do not rely primarily on authentic forms of assessment.

Table 8
School-Level Path Model: Classroom Practices
(columns are the dependent constructs; cells contain unstandardized and standardized coefficients, in that order)

              PD Diversity   PD Hi Order   PD Time      Hands-On     Lower Order  Higher Order  Trad Assess   Auth Assess  Ach
SES           -1.11** -.05   -.43 -.07     -1.83* -.12  -1.02 .14    -.32 -.06    1.15** .17    -2.35** -.34  -.79 -.09    192.26** .74
Class Size    -.09** -.17    -.04 -.06     -.20* -.11   .03 .03      .03 .05      .03 .03       .01 .01       -.09 -.10    2.33 .08
Tchr Major    -.01 -.01      .11 .08       .94** .27    -.04 -.02    -.06 -.05    .02 .01       .19 .13       -.05 -.03    4.19* .07
PD Diversity                                            -.23 -.14    -.30* -.24   -.24 -.16     -.13 -.09     -.14 -.07
PD Hi Order                                             .34** .30    .01 .01      .12 .11       .21 .19       .23* .18
PD Time                                                 .13** .27    .03 .08      .05 .12       -.14** -.32   .14** .26
Hands-On Learning                                                                                                          8.88** .25
Lower Order                                                                                                                -3.85 -.08
Higher Order                                                                                                               4.82** .13
Trad Assess                                                                                                                1.23 .03
Auth Assess                                                                                                                -5.73** -.18
Error         1.00 .67       1.00 .70      1.00 .67     1.00 .63     1.00 .68     1.00 .68      1.00 .62      1.00 .66     1.00 .40
*p<.10  **p<.05
Cells contain unstandardized and standardized coefficients, in that order.

Comparisons among the three school-level path models help to gauge the impact of teaching on student achievement. First, all of the models explain a similar amount of variance. While the residual variance goes from .44 in the teaching model to .41 in the professional development model and .40 in the classroom practices model, these differences are slight. Thus, rather than explain more variance, the more complex models simply reallocate variance among explanatory variables. Second, the three models show the total effect of each teacher quality variable. The total effect is the sum of all direct and indirect effects, and is measured for each aspect from the sum of the effect sizes of the variables indicating that aspect in the model in which that aspect is related to achievement without mediating variables. (Note 11) Thus, the effect size of the one significant teacher input is .09, taken from the teacher inputs model; the effect sizes for the statistically significant aspects of professional development total .33, taken from the professional development model; and the effect sizes for the classroom practices total .56, taken from the classroom practices model. Third, all of the models fit the data well, with goodness-of-fit indices at the .99 or 1.00 level and root mean squared errors of approximation at the .014 level or better.

In sum, it appears that the various aspects of teacher quality are related to student achievement when class size and SES are taken into account.
In particular, the following five variables are positively associated with achievement:

    Teacher major
    Professional development in higher-order thinking skills
    Professional development in diversity
    Hands-on learning
    Higher-order thinking skills

Before discussing further the implications of these results, however, it is necessary to note some shortcomings of the study.

Methodological Caveats

The study suffers from four basic shortcomings. First, the data are cross-sectional. The information about aspects of teacher quality is collected at the same time as student test scores. Consequently, it is not possible to draw inferences about the direction of causation for the relationships that were discovered. It may be that a focus on higher-order thinking skills causes increased student performance, or it may be that having high-performing students drives teachers to focus on higher-order thinking skills. The likelihood of the latter scenario is somewhat reduced in that the models take SES and class size, both proxies of prior academic performance of the student and school, into account. Nonetheless, to confirm the causal direction hypothesized in this study, subsequent research should replicate the results using longitudinal data.

Second, the study covers only one grade level in one subject. It is possible that different sets of classroom practices will prove effective for other subjects and at other grade levels. Third, this study does not measure the link between aspects of teacher quality and the relationship between student test scores and student SES. MSEM measures student-level covariances by pooling
each school's within-school covariance matrix. Consequently, while it is possible to measure the relationship between a school variable and a student outcome, it is not possible to measure the relationship between a school variable and the relationship between two student characteristics. Other multilevel techniques, such as Hierarchical Linear Modeling, while unable to perform certain analyses that MSEM can perform (e.g., confirmatory factor analysis), are able to accomplish this. Subsequent research should supplement the findings of this study by measuring the impact of classroom practices and other aspects of teacher quality on the relationship between student test scores and student background characteristics. Such analyses will make it possible to know not only how teachers can affect the average performance of their class, but how they can affect the distribution of performance within the class.

Finally, better indicators of the constructs used in this study are needed. The SES construct lacks indicators of parents' income or occupation, as well as non-educational materials in the home such as a microwave or washer and dryer, indicators which prior research has found to be an important component of SES. Exposure to each topic of professional development is measured as whether the teacher had received any exposure in the last five years, making it impossible to distinguish between professional development that is rich and sustained and a lone weekend seminar. It is also not possible to measure how receptive teachers are to the professional development they receive. Presumably, a more attentive teacher would benefit more from professional development than a less attentive one. Given that professional development in working with different student populations is so important, it would be useful to include a measure of classroom practices that involves this activity.
And while many of the classroom practices are measured through multiple indicators, some, such as higher-order thinking skills, are not. Additional indicators for single-indicator constructs should be introduced to increase the reliability of the constructs. (Note 12)

Conclusion

Despite these methodological shortcomings, the current study represents an advance over previous work. The first model to some extent exemplifies the traditional approach to gauging the impact of teaching and other school characteristics on student achievement. Although the model differs from most production function studies in including a measurement component and being multilevel, it is otherwise similar. Like OLS, the model relates a single dependent variable to a series of independent variables. The independent variables consist of teacher inputs and a class size measure, controlling for student background. Like most of the prior research, this model finds no significant relationship to test scores for most of the characteristics, with the exception of the teacher's college-level coursework as measured by a major or minor in the relevant field. And like all of the prior research, all school effects are overshadowed by the effect of student SES.

The subsequent models move beyond the first by introducing measures of what teachers actually do in the classroom and the training they receive to support these practices directly, and by modeling interrelationships among the independent variables. They are able to do so because the NAEP database includes a comprehensive set of classroom practices, and because MSEM can model all of the relevant interrelationships. And all of the models, including the teacher inputs model, move beyond most prior research in their ability to take into account measurement error and the multilevel nature of the data. Through these innovations it was possible to confirm the two hypotheses regarding the role that teaching plays in student learning.
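The total-effect arithmetic that underlies these model comparisons can be sketched briefly. The coefficients below are hypothetical stand-ins for illustration only, not values taken from the study's tables, since the exact decomposition is produced by the SEM software.

```python
def indirect_effect(path):
    """The indirect effect along a single path is the product of the
    standardized coefficients on that path."""
    effect = 1.0
    for coefficient in path:
        effect *= coefficient
    return effect

def total_effect(direct, indirect_paths):
    """Total effect = direct effect + the sum of all indirect paths."""
    return direct + sum(indirect_effect(p) for p in indirect_paths)

# Hypothetical standardized coefficients:
pd_to_practice = 0.30    # professional development -> classroom practice
practice_to_ach = 0.25   # classroom practice -> achievement
pd_direct = 0.10         # professional development -> achievement (direct)

pd_total = total_effect(pd_direct, [[pd_to_practice, practice_to_ach]])
# 0.10 + 0.30 * 0.25 = 0.175
```

The study's comparisons of .09, .33, and .56 across the three models are sums of effect sizes of this kind, taken from the model in which each aspect of teacher quality is related to achievement without mediating variables.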
The first hypothesis, that, of the aspects of teacher quality, classroom practices will have the greatest effect, is confirmed by the models. The effect sizes for the various classroom practices total .56; those for the professional development topics total .33; and the effect size for the one teacher input found to have a statistically significant impact is .09. As the qualitative literature leads one to expect, a focus on higher-order thinking skills is associated with improved student performance. Applying problem-solving techniques to unique problems is a key component of such skills. Hands-on learning can be understood in this way as well, in that it involves the simulation of concepts, moving the student from the abstract to the concrete. (Note 13) Also as suggested by the qualitative literature, individualizing instruction seems to be effective. Students whose teachers received professional development in learning how to teach different groups of students substantially outperformed other students. One apparent inconsistency between the findings of this study and the qualitative literature is in the area of authentic assessment, in that the study documents the importance of using some form of traditional testing in assessing student progress. This finding, however, merely suggests that on-going assessments such as portfolios and projects are not sufficient; they need to be supplemented with tests that occur at a distinct point in time.

The second hypothesis, that the total impact of the teaching variables will be comparable to that of student SES, is also confirmed. The sum of the effects from the three aspects of teacher quality is .98. The effect sizes for SES range from .74 to .83, with a value of .76 in the model where all three aspects of teacher quality are included (the classroom practices model).
Thus, the impact of teaching can be said not only to be comparable to that of SES, but even to be somewhat greater.

In addition to confirming the hypotheses regarding the impact of teaching on student learning, the study uncovers important interrelationships among the aspects of teaching. For one, professional development seems to influence teachers' classroom practices strongly. The more professional development teachers received in hands-on learning, and indeed the more professional development they received regardless of topic, the more likely they are to engage in hands-on learning activities. And the more professional development teachers received in working with special student populations, the less likely they are to engage in lower-order activities. Another important interrelationship involves the trade-off between teacher quality and teacher quantity. Smaller class sizes are negatively associated with teachers majoring in their relevant subject and with receiving substantial amounts of professional development, whereas teacher major and time in professional development are positively associated with one another. These relationships suggest that schools tend to choose between hiring more teachers or investing in improved teacher quality through recruiting teachers with better preservice training and providing teachers with more and better in-service training.

In sum, this study finds that schools matter because they provide a platform for active, as opposed to passive, teachers. Passive teachers are those who leave students to perform as well as their own resources will allow; active teachers press all students to grow regardless of their backgrounds. Passive teaching involves reducing eighth-grade mathematics to its simplest components.
All lessons are at a similar level of abstraction; problems are solved in a single step and admit of a single solution; and all students are treated as if they had entered the class with the same level of preparation and the same learning styles. In contrast, active teaching does justice to the complexities of eighth-grade mathematics. Lessons work at multiple levels of abstraction, from the most mundane problem to the most general theorem; problems involve multiple steps and allow multiple paths to their solution; and teachers tailor their methods to the knowledge and experience of each individual student. Schools that lack a critical mass of active
teachers may indeed not matter much; their students will be no less or more able to meet high academic standards than their talents and home resources will allow. But schools that do have a critical mass of active teachers can actually provide a value-added; they can help their students reach higher levels of academic performance than those students otherwise would reach. Through their teachers, then, schools can be the key mechanism for helping students meet high standards.

Acknowledgements

This article could not have been written without the contributions of others. The author thanks the Milken Family Foundation for funding the study and Ashaki Coleman and Carla Cooper for providing invaluable assistance throughout the project. The author also wishes to thank Paul Barton, Richard Coley, Daniel Eignor, Claudia Gentile, Drew Gitomer, Robin Henke, Jacqueline Jones, John Mazzeo, and Lew Solmon for their helpful feedback. Any errors of fact or interpretation, however, are the responsibility of the author.

Notes

1. As is common in the literature, this article uses the terms "effect" and "school effect" to connote statistically significant associations between variables. These associations need not be causal in nature.

2. For a discussion of the methodological issues associated with production function research, see Wenglinsky (1997), Fortune and O'Neil (1994), and Monk (1992).

3. For mathematics, the classroom practices are similar to those endorsed by the National Council of Teachers of Mathematics (1989).

4. It should be noted that some school effects research addresses the problem of the insensitivity of regression analysis to multilevel data through the use of hierarchical linear modeling (HLM). There are trade-offs to using HLM as opposed to MSEM.
HLM has the advantage of being able to treat as a dependent variable not only a student outcome, but also the relationship between that outcome and a student background characteristic; for its part, MSEM makes it possible to explicitly model measurement error and to more fully test relationships among independent variables. While this study uses MSEM, its findings should be supplemented with an HLM analysis.

5. In aggregating teacher characteristics to the school level, the values of all teachers in that school for whom there were data were averaged. It was not possible to create a separate teacher level of analysis because there were generally only one or two teachers surveyed from each school, and thus not a sufficient number of degrees of freedom for a third level.

6. For a fuller discussion of this approach as applied to the 1992 mathematics assessment for eighth graders, see Wenglinsky (1996).

7. More generally, the pooled variance can be expressed as

V = U* + (1 + M^-1)B

where V is the pooled variance, U* is the average sampling variance,
M is the number of plausible values, and B is the variance among the M plausible values.

8. One misleadingly compelling alternative to this approach is to treat the five plausible values as multiple indicators of a test score construct. However, this approach violates the assumption in structural equation models of independence of errors, and has been shown to distort estimates of residual variances and certain statistics, such as the R-squared (Mislevy, 1993).

9. Because NAEP is a sample of students and schools, but not of teachers, descriptive statistics apply to the students rather than the teachers (e.g., 45% of students have teachers who received professional development in higher-order thinking skills, not 45% of teachers received professional development in higher-order thinking skills).

10. Loadings used here are taken from the classroom practices model (Table 5). For constructs that were also included in other models, the loadings proved nearly identical across models. The output for the two other school-level factor models is not presented here but is available upon request.

11. Total effects can be calculated in one of two ways. The first is to estimate a single model that includes all relevant variables, both exogenous and endogenous, and to sum the direct and indirect effects for each variable. This option can be problematic, however, in that the size of the total effect may be an artifact of the number of paths the model permits: the more paths that are fixed at zero for a given variable, the lower the total effect. The second option is to estimate successive models, in which only the direct effects of the variables are used. Thus, in the current case, the first model is made up entirely of exogenous variables; their direct effects on achievement are equal to their total effects. The second model adds a set of endogenous variables.
They are related to achievement only in a direct manner, however, and hence can be treated as total effects. A final set of endogenous variables is added in the third model. These, too, are only directly related to achievement and hence can be treated as estimates of total effects. The presentation of total effects in this study is thus based upon the direct effects of teacher inputs in the first model, of professional development in the second model, and of classroom practices in the third model.

12. Mayer (1999) finds that while composite measures of classroom practices drawn from teacher questionnaires are highly reliable and valid, individual measures are problematic.

13. That said, hands-on learning may not always tap higher-order thinking skills. If a teacher does not make use of hands-on activities in a manner that connects them to underlying concepts, these activities may degenerate into a set of cookbook procedures. The fact that, as this study suggests, it is the better-trained teachers who utilize hands-on techniques suggests, however, that such connections do tend to be made.

References

Arbuckle, J. L. (1997). Amos users' guide, Version 3.6. Chicago: Small Waters Corporation.

Austin, G. R. & Garber, H. (Eds.). (1985). Research on exemplary schools. New York: Academic Press.
Brookover, W., Beady, C., Flood, P., Schweitzer, J. & Wisenbaker, J. (1979). School social systems and student achievement: Schools can make a difference. Brooklyn, NY: J. F. Bergin.

Bryk, A. S. & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage Publications.

Chubb, J. & Moe, T. (1990). Politics, markets and America's schools. Washington, D.C.: The Brookings Institution.

Cohen, D. K. & Hill, H. C. (2000). Instructional policy and classroom performance: The mathematics reform in California. Teachers College Record, 102(2), 294-343.

Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D., & York, R. L. (1966). Equality of educational opportunity. Washington, D.C.: U.S. Government Printing Office.

Edmonds, R. (1979). Effective schools for the urban poor. Educational Leadership, 37(1), 15-24.

Ehrenberg, R. G. & Brewer, D. J. (1995). Did teachers' verbal ability and race matter in the 1960s? Coleman revisited. Economics of Education Review, 14(1), 1-21.

Ferguson, R. F. (1991). Paying for public education: New evidence on how and why money matters. Harvard Journal of Legislation, 28(2), 465-498.

Ferguson, R. F. & Ladd, H. F. (1996). How and why money matters: An analysis of Alabama schools. In H. F. Ladd (Ed.), Holding schools accountable: Performance-based reform in education (pp. 265-298). Washington, D.C.: The Brookings Institution.

Goldhaber, D. D. & Brewer, D. J. (1996). Why don't schools and teachers seem to matter? Assessing the impact of unobservables on educational productivity. Journal of Human Resources, 32(3), 505-520.

Goldstein, H. (1995). Multilevel statistical models (2nd ed.). New York: Halsted Press.

Golub, J. (Ed.). (1988). Focus on collaborative learning. Urbana, IL: National Council of Teachers of English.

Graves, D. H. & Sunstein, B. S. (Eds.). (1992). Portfolio portraits. Portsmouth, NH: Heinemann.

Greenwald, R., Hedges, L. V., & Laine, R. D. (1996). The effect of school resources on student achievement. Review of Educational Research, 66(3), 361-396.

Gustafsson, J. E. & Stahl, P. A. (1997). STREAMS user's guide (Version 1.7). Mölndal, Sweden: Multivariate Ware.

Hanushek, E. A. (1989). The impact of differential expenditures on school performance. Educational Researcher, 18(4), 45-51.

Hanushek, E. A. (1996a). A more complete picture of school resource policies. Review of Educational Research, 66(3), 397-409.
Hanushek, E. A. (1996b). School resources and student performance. In G. T. Burtless (Ed.), Does money matter? The effect of school resources on student achievement and adult success (pp. 43-73). Washington, D.C.: The Brookings Institution.

Hanushek, E. A. (1997). Assessing the effects of school resources on student performance: An update. Educational Evaluation and Policy Analysis, 19(2), 141-164.

Hayduk, L. A. (1987). Structural equation modeling with LISREL: Essentials and advances. Baltimore: Johns Hopkins University Press.

Hedges, L. V. & Greenwald, R. (1996). Have times changed? The relation between school resources and student performance. In G. T. Burtless (Ed.), Does money matter? The effect of school resources on student achievement and adult success (pp. 74-92). Washington, D.C.: The Brookings Institution.

Hedges, L. V., Laine, R. D., & Greenwald, R. (1994). Does money matter? A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher, 23(3), 5-14.

Jencks, C., Smith, M., Acland, H., Bane, M. J., Cohen, D., Gintis, H., Heyns, B., & Michelson, S. (1972). Inequality: A reassessment of the effect of family and schooling in America. New York: Basic Books.

Johnson, E. (1994). Overview of part I: The design and implementation of the 1992 NAEP. In E. Johnson & J. Carlson, The NAEP 1992 Technical Report (pp. 9-32). Princeton, NJ: Educational Testing Service.

Johnson, E., Mislevy, R. J., & Thomas, N. (1994). Scaling procedures. In E. Johnson & J. Carlson, The NAEP 1992 Technical Report (pp. 241-256). Princeton, NJ: Educational Testing Service.

Jöreskog, K. G. & Sörbom, D. (1993). Structural equation modeling with the SIMPLIS command language. Chicago: Scientific Software International.

Langer, J. A. & Applebee, A. N. (1987). How writing shapes thinking. Urbana, IL: National Council of Teachers of English.

Lee, V. E., Bryk, A. S., & Smith, J. B. (1993). The organization of effective secondary schools. Review of Research in Education, 19, 171-267.

Mayer, D. P. (1999). Measuring instructional practice: Can policymakers trust survey data? Educational Evaluation and Policy Analysis, 21(1), 29-45.

McLaughlin, M. W. & Talbert, J. E. (1993). Introduction: New visions of teaching. In M. W. McLaughlin & J. E. Talbert (Eds.), Teaching for understanding (pp. 1-10). San Francisco, CA: Jossey-Bass.

Mehan, H. (1992). Understanding inequality in schools: The contribution of interpretive studies. Sociology of Education, 65(1), 1-20.

Monk, D. H. (1992). Educational productivity research: An update and assessment of its role in education finance reform. Educational Evaluation and Policy Analysis, 14, 307-332.
Monk, D. H. (1994). Subject area preparation of secondary mathematics and science teachers and student achievement. Economics of Education Review, 13(2), 125-145.

Muthén, B. O. (1991). Multilevel factor analysis of class and student achievement components. Journal of Educational Measurement, 28, 338-354.

Muthén, B. O. (1994). Multilevel covariance structure analysis. Sociological Methods and Research, 22(3), 399-420.

National Center for Education Statistics. (1996). High school seniors' instructional experiences in science and mathematics. Washington, D.C.: U.S. Government Printing Office.

National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.

O'Reilly, P. E., Zelenak, C. A., Rogers, A. M., & Kline, D. L. (1996). 1994 trial state assessment program in reading: Secondary-use data files user guide. Washington, D.C.: U.S. Department of Education.

Steinberg, L. D. (1996). Beyond the classroom: Why school reform has failed and what parents need to do. New York: Simon and Schuster.

Strauss, R. P. & Sawyer, E. A. (1986). Some new evidence on student and teacher competencies. Economics of Education Review, 5(1), 41-48.

Traub, J. (2000, January 16). What no school can do. New York Times Magazine, p. 52.

Wenglinsky, H. (1997). How money matters: The effect of school district spending on academic achievement. Sociology of Education, 70(3), 221-237.

About the Author

Harold Wenglinsky
Email: firstname.lastname@example.org

Dr. Harold Wenglinsky worked for six years with the Policy Information Center at Educational Testing Service as a National Assessment of Educational Progress Visiting Scholar, Research Scientist, and Center Director. His research has tackled a wide range of issues of educational policy, from the impact of educational expenditures on student performance to the educational role of Historically Black Colleges and Universities.
His most recent publications have focused on the issue of teacher quality and, more specifically, the link between various teaching practices and student outcomes, as well as identifying methodologies appropriate for using large-scale data to study such effects. He is currently writing a book on the impact of educational technology on teaching and learning.

Copyright 2002 by the Education Policy Analysis Archives

The World Wide Web address for the Education Policy Analysis Archives is epaa.asu.edu

General questions about appropriateness of topics or particular articles may be
addressed to the Editor, Gene V Glass, email@example.com, or reach him at College of Education, Arizona State University, Tempe, AZ 85287-2411. The Commentary Editor is Casey D. Cobb: firstname.lastname@example.org.

EPAA Editorial Board

Michael W. Apple, University of Wisconsin
Greg Camilli, Rutgers University
John Covaleskie, Northern Michigan University
Alan Davis, University of Colorado, Denver
Sherman Dorn, University of South Florida
Mark E. Fetler, California Commission on Teacher Credentialing
Richard Garlikov, email@example.com
Thomas F. Green, Syracuse University
Alison I. Griffith, York University
Arlen Gullickson, Western Michigan University
Ernest R. House, University of Colorado
Aimee Howley, Ohio University
Craig B. Howley, Appalachia Educational Laboratory
William Hunter, University of Calgary
Daniel Kallós, Umeå University
Benjamin Levin, University of Manitoba
Thomas Mauhs-Pugh, Green Mountain College
Dewayne Matthews, Education Commission of the States
William McInerney, Purdue University
Mary McKeown-Moak, MGT of America (Austin, TX)
Les McLean, University of Toronto
Susan Bobbitt Nolen, University of Washington
Anne L. Pemberton, firstname.lastname@example.org
Hugh G. Petrie, SUNY Buffalo
Richard C. Richardson, New York University
Anthony G. Rud Jr., Purdue University
Dennis Sayers, California State University-Stanislaus
Jay D. Scribner, University of Texas at Austin
Michael Scriven, email@example.com
Robert E. Stake, University of Illinois-UC
Robert Stonehill, U.S. Department of Education
David D. Williams, Brigham Young University

EPAA Spanish Language Editorial Board

Associate Editor for Spanish Language: Roberto Rodríguez Gómez, Universidad Nacional Autónoma de México, firstname.lastname@example.org
Adrián Acosta (México), Universidad de Guadalajara, adrianacosta@compuserve.com
J. Félix Angulo Rasco (Spain), Universidad de Cádiz, felix.email@example.com
Teresa Bracho (México), Centro de Investigación y Docencia Económica-CIDE, bracho@dis1.cide.mx
Alejandro Canales (México), Universidad Nacional Autónoma de México, canalesa@servidor.unam.mx
Ursula Casanova (U.S.A.), Arizona State University, casanova@asu.edu
José Contreras Domingo, Universitat de Barcelona, Jose.Contreras@doe.d5.ub.es
Erwin Epstein (U.S.A.), Loyola University of Chicago, Eepstein@luc.edu
Josué González (U.S.A.), Arizona State University, josue@asu.edu
Rollin Kent (México), Departamento de Investigación Educativa-DIE/CINVESTAV, rkent@gemtel.com.mx, firstname.lastname@example.org
María Beatriz Luce (Brazil), Universidad Federal de Rio Grande do Sul-UFRGS, lucemb@orion.ufrgs.br
Javier Mendoza Rojas (México), Universidad Nacional Autónoma de México, javiermr@servidor.unam.mx
Marcela Mollis (Argentina), Universidad de Buenos Aires, mmollis@filo.uba.ar
Humberto Muñoz García (México), Universidad Nacional Autónoma de México, humberto@servidor.unam.mx
Ángel Ignacio Pérez Gómez (Spain), Universidad de Málaga, aiperez@uma.es
Daniel Schugurensky (Argentina-Canadá), OISE/UT, Canada, dschugurensky@oise.utoronto.ca
Simon Schwartzman (Brazil), Fundação Instituto Brasileiro de Geografia e Estatística, email@example.com
Jurjo Torres Santomé (Spain), Universidad de A Coruña, jurjo@udc.es
Carlos Alberto Torres (U.S.A.), University of California, Los Angeles, torres@gseisucla.edu