E11-00163
Educational policy analysis archives. Vol. 8, no. 19 (March 28, 2000).
Tempe, Ariz.: Arizona State University; Tampa, Fla.: University of South Florida. March 28, 2000.
Bridging the gap between testing and technology in schools / Michael Russell [and] Walt Haney.
Education Policy Analysis Archives (EPAA)
Education Policy Analysis Archives

Volume 8 Number 19, March 28, 2000, ISSN 1068-2341

A peer-reviewed scholarly electronic journal
Editor: Gene V Glass, College of Education, Arizona State University

Copyright 2000, the EDUCATION POLICY ANALYSIS ARCHIVES. Permission is hereby granted to copy any article if EPAA is credited and copies are not sold.

Articles appearing in EPAA are abstracted in the Current Index to Journals in Education by the ERIC Clearinghouse on Assessment and Evaluation and are permanently archived in Resources in Education.

Bridging the Gap between Testing and Technology in Schools

Michael Russell
Boston College

Walt Haney
Boston College

Abstract

The widening gap between the increased use of technology in schools and the absence of computers in state-level testing programs raises important implications for policies related to the use of both technology and testing in schools. In this article, we summarize recent developments in the use of technology in schools and in state-level testing programs. We then describe two studies indicating that written tests administered on paper underestimate the achievement of students accustomed to working on computers. We conclude by discussing four approaches to bridging the gap between technology and testing in U.S. schools.

Introduction

The need to improve education in the U.S. has received unprecedented attention recently in the media and in national and state elections. Prescriptions for improving
schools have been many, but two of the most common are what might be called the technology and testing remedies. The technology nostrum holds that the infusion of modern technology into schools will bolster teaching and learning and will prepare students for an increasingly technological workplace. The second prescription, which is often called high-stakes testing, holds that standards-based accountability for students, teachers and schools will provide tangible incentives for improvements in teaching and learning. What is little recognized, however, is that these two strategies are working against each other in a sort of educational time warp. Recent research shows that written tests taken on paper severely underestimate the performance of students accustomed to working on computer (Russell, 1999; Russell & Haney, 1997). The situation is analogous to testing the accounting skills of modern accountants while restricting them to the use of an abacus for calculations.

The Computer Revolution Goes to School

Although the personal-computer revolution began only twenty years ago and widespread use of the World Wide Web (WWW) is even more recent, computer technology has already had a dramatic impact on society and schooling. Between 1984 and 1993, the percentage of people using computers in the workplace nearly doubled, from 24.6 percent to 45.8 percent. Similarly, the percentage of people owning one or more computers in their home increased rapidly, from 8.2 percent in 1984 to 22.8 percent in 1993 to 33.6 percent in 1997 (Newburger, 1997). Although schools have been slower to acquire these technologies, computer use in schools has recently increased rapidly (Zandvliet & Farragher, 1997). While schools had one computer for every 125 students in 1983, they had one for every 9 students in 1995 and one for every 6 students in 1998 (Market Data Retrieval, 1999).
Not only are more computers in classrooms, but schools are also increasing students' use of computers and access to the Internet. A recent national survey of teachers showed that in 1998, 50 percent of K-12 teachers had students use word processors, 36 percent had them use CD-ROMs, and 29 percent had them use the WWW (Becker, 1999). Although it is unclear how computers are affecting student achievement in schools (see, for example, Fabos & Young, 1999, questioning the efficacy of Internet-based telecommunications exchange programs in schools), there is little doubt that the computer revolution has gone to school. As a result, more and more students are writing and performing school assignments on computers.

Performance Testing in Schools

Meanwhile, many states are increasingly seeking to hold students, teachers and schools accountable for student learning as measured by state-sponsored tests. According to annual surveys by the Council of Chief State School Officers (CCSSO, 1998), 48 states use statewide tests to assess student performance in different subject areas. Many of these tests are tied to challenging standards for what students should know and be able to do. Scores on these tests are being used to determine whether to: (1) promote students to higher grades; (2) grant high school diplomas; and (3) identify and sanction or reward low- and high-performing schools (Sacks, 1999). Currently, 32 states control, or plan to control, graduation and/or grade promotion via student performance on state-level tests. Because of the limitations of multiple-choice tests, many statewide tests include sections in which students must write extended answers or written
explanations of their work. As the recent CCSSO report commented, "Possibly the greatest changes in the nature of state student assessment programs have taken place in the 1990s as more states have incorporated open-ended and performance exercises into their tests, and moved away from reliance on only multiple-choice tests" (CCSSO, 1998, p. 17). In 1996-97, an estimated ten to twelve million students nationwide participated in a state-sponsored testing program that required them to write responses longhand (given a total national K-12 enrollment of about 50 million and open-ended assessments in almost all the states in 3 out of 12 grade levels).

In Ohio, for example, students must pass the written portion of the Ohio Proficiency Test in order to graduate from high school (Fisher & Elliott, 2000). Although many observers have criticized state-sponsored high-stakes tests on a variety of grounds (e.g., Heubert & Hauser, 1999; Sacks, 1999), here we direct attention to a widely unrecognized but critical limitation of depending on these tests to drive educational reform: paper-and-pencil written tests yield misleading information on the capabilities of students accustomed to using computers.

Testing Via Computer

Research on testing via computer goes back several decades and suggests that for multiple-choice tests, administration via computer yields about the same results, at least on average, as administering tests via paper-and-pencil (Bunderson, Inouye, & Olsen, 1989; Mead & Drasgow, 1993). However, more recent research shows that for young people who have gone to school with computers, open-ended (that is, not multiple-choice) questions administered via paper-and-pencil yield severe underestimates of students' skills as compared with the same questions administered via computer (Russell, 1999; Russell & Haney, 1997). In both studies, the effect sizes for students accustomed to working on computer ranged from .57 to 1.25.
Effect sizes of this magnitude are unusually large and of sufficient size to be of not just statistical, but also practical significance (Cohen, 1988; Wolf, 1986). Effect sizes of this magnitude, for example, imply that the score for the average student in the experimental group tested on computer exceeds that of 72 to 89 percent of the students in the control group tested via paper and pencil.

Our research on this topic began with a puzzle. While evaluating the progress of student learning in the Accelerated Learning Laboratory (ALL), a high-tech school in Worcester, MA, teachers were surprised by the results from the second year of assessments. Although students wrote more often after computers were widely used in the school, student scores on writing tests declined in the second year of the new program. To help solve the puzzle, the school asked us to assist in comparing paper and computer administration of the tests.

In 1995, a randomized experiment was conducted, with one group of sixty-eight students taking math, science and language arts tests, including both multiple-choice and open-ended items, on paper, and another group of forty-six students taking the same tests on computer (but without access to word processing tools, such as spell-checking or grammar-checking). Before scoring, answers written by hand were transcribed so that raters could not distinguish them from those done on computer. There were two major findings. First, the multiple-choice test results did not differ much by mode of administration. Second, the results for the open-ended tests differed significantly by mode of administration. For the ALL School students who were accustomed to writing on the computer, responses written on computer were much better than those written by hand. This finding occurred across all three subjects tested and on both short answer and
extended answer items. The effects were so large that when students wrote on paper, only 30 percent performed at a "passing" level; when they wrote on computer, 67 percent "passed" (Russell & Haney, 1997).

Two years later, a more sophisticated study was conducted, this time using open-ended items from the new Massachusetts state test (the Massachusetts Comprehensive Assessment System or MCAS) and the National Assessment of Educational Progress (NAEP) in the areas of language arts, science and math. Again, eighth grade students from two middle schools in Worcester, MA, were randomly assigned to groups. Within each subject area, each group was given the same test items, with one group answering on paper and the other on computer. In addition, data were collected on students' keyboarding speed and prior computer use. As in the first study, all answers written by hand were transcribed to computer text before scoring.

In the second study, which included about two hundred students, large differences between computer and paper-and-pencil administration were again evident on the language arts tests. For students who could keyboard moderately well (20 words per minute or more), performance on computer was much better than on paper. For these students, the difference between performance on computer and on paper was roughly a half standard deviation. According to test norms, this difference is larger than the amount students' scores typically change between grade 7 and grade 8 on standardized tests (Haney, Madaus, & Lyons, 1993, p. 234). For the MCAS, this difference in performance could easily raise students' scores from the "failing" to the "passing" level (Russell, 1999).
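One way to read the effect sizes reported for these two studies is to convert them to percentile standings. The sketch below is an illustration, not the authors' analysis: it assumes that scores in both groups are approximately normal with equal spread, so that an effect size d places the average student in the higher-scoring group above a share Φ(d) of the comparison group, where Φ is the standard normal cumulative distribution function. The function name share_outscored is hypothetical.

```python
import math


def share_outscored(effect_size: float) -> float:
    """Share of the comparison group scoring below the average student in the
    higher-scoring group, assuming normal scores with equal spread (an
    assumption of this sketch, not a detail taken from the studies)."""
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


# Effect sizes mentioned in the text: the .57 to 1.25 range from both studies
# and the "roughly a half standard deviation" difference in the second study.
for d in (0.50, 0.57, 1.25):
    print(f"d = {d:.2f}: about {100 * share_outscored(d):.0f} percent")
```

Under these assumptions, effect sizes of .57 and 1.25 correspond to roughly 72 and 89 percent of the comparison group, and a half standard deviation corresponds to roughly 69 percent.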
Recalling that nearly ten million students took some type of state-sponsored written test last year and that nearly half of the students nationwide use word processors in school, these results suggest that state paper-and-pencil tests may be underestimating the abilities of millions of students annually.

In the second study, however, findings were not consistent across all levels of keyboarding proficiency. As keyboarding speed decreased, the benefit of computer administration became smaller. And at very low levels of keyboarding speed, taking the test on computer diminished students' performance (effect size of about 0.40 standard deviations). Similarly, taking the math test on computer had a negative effect on students' scores. This effect, however, became less pronounced as keyboarding speed increased.

Bridging the Gap

These studies highlight the importance of the gap between the technology and testing strategies for school improvement. Increasingly, schools are using computers to improve student learning. To measure increases in student learning, states are depending upon tests administered on paper. The open-ended questions on these tests, however, underestimate the achievement of students who regularly use computers. As a result, this mismatch between the mode of learning and the mode of assessment may be underestimating improvements in achievement. This problem is likely to increase as more students become accustomed to writing on computers. There are at least four possible ways to bridge this gap.

First, schools could decrease the amount of time students spend working on computers so that they do not become accustomed to writing on computers. Some schools have already adopted this practice. After reviewing the first study described above and following the introduction of the new paper-and-pencil MCAS test in Massachusetts, the ALL school required students to write more on paper and less on
computer (Russell, 1999). In another Massachusetts school system, the principal feared that students who write regularly on computer lose penmanship skills, which might lead to lower scores on the new state test. This school increased penmanship instruction across all grades while also decreasing students' time on computers (Holmes, 1999). Such strategies, in effect reducing computer use in schools to better prepare students for low-tech tests, may be pragmatic given the high stakes attached to many state tests. But they are also short-sighted in light of students' entry after graduation into an increasingly high-tech world and workplace.

A second way to bridge the test-technology gap would be to eliminate paper-and-pencil testing and have students perform open-ended tests on computer. This might seem a sensible solution, but it will not be feasible until all schools obtain an adequate technology infrastructure. Moreover, as shown by problems in recent moves to administer some large-scale tests for adults on computers, computerized testing is not the panacea some had hoped. Among other problems, it adds considerably to the cost of testing and creates new test security concerns. But more importantly, as the second study summarized above indicates, administering open-ended tests only on computer would penalize students with poor keyboarding skills.

A third approach would be to offer students the option of performing open-ended tests on paper or on computer. On the surface, this seems like a sensible solution. However, it would add considerable complexity and cost to test administration and scoring procedures. Although there has not been a large amount of research on the extent to which computer printing versus handwriting affects ratings of written work, Powers et al. (1994) report that significant effects can occur. Surprisingly, Powers et al.
found that computer-printed responses produced by adults tended to receive lower scores than the same responses produced by hand. To control for such effects, in offering tests on paper and computer, handwritten responses would need to be converted to computer text. Surely it will be some years before text recognition software is sophisticated enough to convert handwritten responses into computer text. Thus, for the foreseeable future, the cost of transcription would be prohibitive.

But beyond the need to convert responses to the same medium for scoring, the second study summarized above provides evidence that, when given the choice of using computer or paper to write their tests, many students make poor decisions as to which medium they should use. This was evidenced in two ways. First, the correlations between students' preference for taking tests on computer or on paper and their keyboarding speed, and between preference and prior computer experience, were near zero (less than .18). Second, preference was not found to be a significant factor in predicting students' performance. Together, the added complexity of scoring open-ended responses produced in both mediums and students' apparent inaccuracy in selecting the medium that optimizes their performance suggest that simply giving students the option of performing open-ended tests on computer or on paper would do little to reduce the gap between testing and technology.

A fourth approach, and perhaps the most reasonable solution in the short term, is to recognize the limitations of current testing programs. Without question, both computer technology and performance testing can help improve the quality of education. However, until students can take tests in the same medium in which they generally work and learn, we must recognize that the scores from high-stakes state tests do not accurately reflect some students' capabilities.
Reliance on paper-and-pencil written test scores to measure or judge student and/or school performance will mischaracterize the achievement of students accustomed to working on computers. Thus, the gap between the use of technology in schools and testing programs serves as yet another reminder of the dangers
of judging students and schools based solely on written test scores.

Note

We would like to acknowledge the help of Jeff Nellhaus and Kit Viator of the Massachusetts Department of Education, which allowed inclusion of MCAS items in the second study summarized in this article. Also, we wish to note that this article is an expansion of an opinion essay appearing originally in the Christian Science Monitor in July 1999 (Haney & Russell, 1999). Additionally, we thank the National Board on Educational Testing and Public Policy (NBETPP) for its support and suggestions on earlier versions of this article. We also thank two anonymous EPAA reviewers for their suggestions for improving this article. Finally, we thank Carol Shilinsky and the staff of the ALL School, and James Caradonio, the Superintendent of the Worcester, MA, Public Schools, for their generous support of the research recounted here.

References

Becker, H. J. (1999). Internet Use by Teachers: Conditions of Professional Use and Teacher-Directed Student Use. Irvine, CA: Center for Research on Information Technology and Organizations.

Bunderson, C. V., Inouye, D. K. & Olsen, J. B. (1989). The four generations of computerized educational measurement. In Linn, R. L. (Ed.), Educational Measurement (3rd ed.), Washington, D.C.: American Council on Education, pp. 367-407.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Council of Chief State School Officers (1998). Key State Education Policies on K-12 Education: Standards, Graduation, Assessment, Teacher Licensure, Time and Attendance. Washington, DC.

Fabos, B. & Young, M. (1999). Telecommunications in the classroom: Rhetoric versus reality. Review of Educational Research, 69(3), 217-259.

Fisher, M. & Elliott, S. (2000). Proficiency: The test questioned. Dayton Daily News (March 12). http://www.activedayton.com/news/2000/03/12/0313main001.html

Glennan, T. K., & Melmed, A. (1996).
Fostering the use of educational technology: Elements of a national strategy. Santa Monica, CA: RAND.

Haney, W., Madaus, G., & Lyons, R. (1993). The Fractured Marketplace for Standardized Testing. Boston, MA: Kluwer Academic Publishers.

Haney, W. & Russell, M. (1999). Low-tech tests shortchange high-tech students. Christian Science Monitor, July 1, 1999.

Holmes, R. (1999). A gender bias in the MCAS? MetroWest Town Online. http://www.townonline.com/metrowest/archive/022499/

Heubert, J. & Hauser, R. (1999). High stakes: Testing for tracking, promotion and
graduation (Report of the Committee on Appropriate Test Use). Washington, D.C.: National Academy Press. (Available online at http://www.nap.edu).

Market Data Retrieval. (1999). Technology in Education 1999 (A report issued by Market Data Retrieval). Shelton, CT: Market Data Retrieval.

Mead, A. D. & Drasgow, F. (1993). Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. Psychological Bulletin, 114(3), 449-58.

Newburger, E. C. (1997). Computer Use in the United States. Washington, DC: U.S. Census Bureau.

Powers, D., Fowles, M., Farnum, M., & Ramsey, P. (1994). Will they think less of my handwritten essay if others word process theirs? Effects on essay scores of intermingling handwritten and word-processed essays. Journal of Educational Measurement, 31(3), 220-233.

Russell, M. & Haney, W. (1997). Testing Writing on Computers: An Experiment Comparing Student Performance on Tests Conducted via Computer and via Paper-and-Pencil. Education Policy Analysis Archives, 5(3). Available online: http://epaa.asu.edu/epaa/v5n3.html.

Russell, M. (1999). Testing Writing on Computers: A Follow-up Study Comparing Performance on Computer and on Paper. Education Policy Analysis Archives, 7(20). Available online: http://epaa.asu.edu/epaa/v7n20/.

Sacks, P. (1999). Standardized minds. Reading, MA: Perseus Books.

Snyder, T. D. & Hoffman, C. (1990). Digest of Education Statistics. Washington, DC: U.S. Department of Education.

Snyder, T. D. & Hoffman, C. (1994). Digest of Education Statistics. Washington, DC: U.S. Department of Education.

Wolf, F. (1986). Meta-analysis: Quantitative methods for research synthesis. SAGE University series on quantitative applications in the social sciences, series no. 07-059. Newbury Park, CA: SAGE.

Zandvliet, D. & Farragher, P. (1997). A comparison of computer-administered and written tests.
Journal of Research on Computing in Education, 29(4), 423-438.

About the Authors

Michael Russell
Senior Research Associate
Center for the Study of Testing, Evaluation and Educational Policy
323 Campion Hall, Boston College
Chestnut Hill, MA 02467
Phone: 617.552.0889
Fax: 617.552.8419
Email: email@example.com
Michael Russell is a research fellow for the National Board on Educational Testing and Public Policy and a research associate in the Center for the Study of Testing, Evaluation and Educational Policy at Boston College. His research interests include standards-based reform, assessment, and educational technology.

Walt Haney
Center for the Study of Testing, Evaluation and Educational Policy
323 Campion Hall, Boston College
Chestnut Hill, MA 02467
Email: firstname.lastname@example.org

Walt Haney, Ed.D., Professor of Education at Boston College and Senior Research Associate in the Center for the Study of Testing, Evaluation and Educational Policy (CSTEEP), specializes in educational evaluation and assessment and educational technology. He has published widely on testing and assessment issues in scholarly journals such as the Harvard Educational Review, Review of Educational Research and Review of Research in Education and in wide-audience periodicals such as Educational Leadership, Phi Delta Kappan, the Chronicle of Higher Education and the Washington Post. He has served on the editorial boards of Educational Measurement: Issues and Practice and the American Journal of Education and on the National Advisory Committee of the ERIC Clearinghouse on Assessment and Evaluation.

Copyright 2000 by the Education Policy Analysis Archives

The World Wide Web address for the Education Policy Analysis Archives is epaa.asu.edu

General questions about appropriateness of topics or particular articles may be addressed to the Editor, Gene V Glass, email@example.com or reach him at College of Education, Arizona State University, Tempe, AZ 85287-0211 (602-965-9644). The Commentary Editor is Casey D. Cobb: firstname.lastname@example.org.

EPAA Editorial Board

Michael W. Apple, University of Wisconsin
Greg Camilli, Rutgers University
John Covaleskie, Northern Michigan University
Alan Davis, University of Colorado, Denver
Sherman Dorn, University of South Florida
Mark E.
Fetler, California Commission on Teacher Credentialing
Richard Garlikov, email@example.com
Thomas F. Green, Syracuse University
Alison I. Griffith, York University
Arlen Gullickson, Western Michigan University
Ernest R. House, University of Colorado
Aimee Howley, Ohio University
Craig B. Howley, Appalachia Educational Laboratory
William Hunter, University of Calgary
Daniel Kallós, Umeå University
Benjamin Levin, University of Manitoba
Thomas Mauhs-Pugh, Green Mountain College
Dewayne Matthews, Western Interstate Commission for Higher Education
William McInerney, Purdue University
Mary McKeown-Moak, MGT of America (Austin, TX)
Les McLean, University of Toronto
Susan Bobbitt Nolen, University of Washington
Anne L. Pemberton, firstname.lastname@example.org
Hugh G. Petrie, SUNY Buffalo
Richard C. Richardson, New York University
Anthony G. Rud Jr., Purdue University
Dennis Sayers, Ann Leavenworth Center for Accelerated Learning
Jay D. Scribner, University of Texas at Austin
Michael Scriven, email@example.com
Robert E. Stake, University of Illinois-UC
Robert Stonehill, U.S. Department of Education
David D. Williams, Brigham Young University

EPAA Spanish Language Editorial Board

Associate Editor for Spanish Language
Roberto Rodríguez Gómez, Universidad Nacional Autónoma de México, firstname.lastname@example.org

Adrián Acosta (México), Universidad de Guadalajara, adrianacosta@compuserve.com
J. Félix Angulo Rasco (Spain), Universidad de Cádiz, felix.email@example.com
Teresa Bracho (México), Centro de Investigación y Docencia Económica-CIDE, bracho dis1.cide.mx
Alejandro Canales (México), Universidad Nacional Autónoma de México, canalesa@servidor.unam.mx
Ursula Casanova (U.S.A.), Arizona State University, casanova@asu.edu
José Contreras Domingo, Universitat de Barcelona, Jose.Contreras@doe.d5.ub.es
Erwin Epstein (U.S.A.), Loyola University of Chicago, Eepstein@luc.edu
Josué González (U.S.A.), Arizona State University, josue@asu.edu
Rollin Kent (México), Departamento de Investigación Educativa-DIE/CINVESTAV, rkent@gemtel.com.mx, firstname.lastname@example.org
María Beatriz Luce (Brazil), Universidad Federal de Rio Grande do Sul-UFRGS, lucemb@orion.ufrgs.br
Javier Mendoza Rojas (México), Universidad Nacional Autónoma de México, javiermr@servidor.unam.mx
Marcela Mollis (Argentina), Universidad de Buenos Aires, mmollis@filo.uba.ar
Humberto Muñoz García (México), Universidad Nacional Autónoma de México, humberto@servidor.unam.mx
Angel Ignacio Pérez Gómez (Spain), Universidad de Málaga, aiperez@uma.es
Daniel Schugurensky (Argentina-Canadá), OISE/UT, Canada, dschugurensky@oise.utoronto.ca
Simon Schwartzman (Brazil), Fundação Instituto Brasileiro e Geografia e Estatística, email@example.com
Jurjo Torres Santomé (Spain), Universidad de A Coruña, jurjo@udc.es
Carlos Alberto Torres (U.S.A.), University of California, Los Angeles, torres@gseisucla.edu