Education Policy Analysis Archives (EPAA), Vol. 9, No. 5 (February 14, 2001). Tempe, Ariz.: Arizona State University; Tampa, Fla.: University of South Florida.
Education Policy Analysis Archives
Volume 9 Number 5, February 14, 2001, ISSN 1068-2341

A peer-reviewed scholarly journal
Editor: Gene V Glass, College of Education, Arizona State University

Copyright 2001, the EDUCATION POLICY ANALYSIS ARCHIVES. Permission is hereby granted to copy any article if EPAA is credited and copies are not sold.

Articles appearing in EPAA are abstracted in the Current Index to Journals in Education by the ERIC Clearinghouse on Assessment and Evaluation and are permanently archived in Resources in Education.

How the Internet Will Help Large-Scale Assessment Reinvent Itself

Randy Elliot Bennett
Educational Testing Service
U.S.A.

Abstract

Large-scale assessment in the United States is undergoing enormous pressure to change. That pressure stems from many causes. Depending upon the type of test, the issues precipitating change include an outmoded cognitive-scientific basis for test design; a mismatch with curriculum; the differential performance of population groups; a lack of information to help individuals improve; and inefficiency. These issues provide a strong motivation to reconceptualize both the substance and the business of large-scale assessment. At the same time, advances in technology, measurement, and cognitive science are providing the means to make that reconceptualization a reality. The thesis of this paper is that the largest facilitating factor will be technological, in particular the Internet. In the same way that it is already helping to revolutionize commerce, education, and even social interaction, the Internet will help revolutionize the business and substance of large-scale assessment.
Whether for educational admissions, school and student accountability, or public policy, large-scale assessment in the United States is undergoing enormous pressure to change. This pressure is most evident with respect to high-stakes tests, like those used for grade promotion or college entrance. However, it is becoming apparent for lower-stakes survey instruments too, like the National Assessment of Educational Progress (NAEP) (e.g., Pellegrino, Jones, & Mitchell, 1999).

Several factors underlie the pressure to change. First, whereas our tests have incorporated many psychometric advances, they have remained separated from equally important advances in cognitive science, in essence measuring the same things in ever more technically sophisticated ways. Although decades of research have documented the importance of such cognitive constructs as knowledge organization, problem representation, mental models, and automaticity (Glaser, 1991), our tests typically do not account for them explicitly. As a result, our tests probably owe more to the behavioral psychology of the early 20th century than to the cognitive science of today (Shepard, 2000).

A second factor is the mismatch with the content and format of curriculum, a criticism more true of the developed ability tests commonly used in postsecondary admissions than of school achievement measures, but relevant to the latter too. The mismatch arises in part from the fact that the elemental, forced-choice problems dominating many tests are effective indicators of skills and abilities, and thus provide an efficient means for estimating student standing on those constructs. However, the mismatch becomes problematic because of the increasing attention being paid to test preparation. Although persistent direct training on these indicator tasks may increase test performance, it certainly is not the best way to improve construct standing.
Further, it distracts attention from other, arguably more critical, learning activities (Frederiksen, 1984).

Differential performance of population groups is another factor. Because of the curricular mismatch, it is easy to blame group differences on purported bias in the test and more difficult to create a convincing defense than it would be if the tests were strongly linked to learning goals. In a high-stakes decision setting like admissions, tests become a lightning rod for the failure of schools and society to educate all groups effectively. With the potential elimination of affirmative action in university admissions, there is no politically acceptable choice but to reduce the role of such tests. California, Texas, Florida, and Pennsylvania are proposing to admit, or have begun admitting, all students with high-school rank above a certain point to their state higher education systems. At the same time, promotion tests tied to state curricular standards are being put into place to encourage schools to teach all students valued skills. Although in Texas one such test was challenged in court on the basis of differential performance, that challenge was rejected (Schmidt, 2000). This rejection suggests that when well-constructed tests closely reflect the curriculum, group differences should become more an issue of instructional inadequacy than test inaccuracy (Bennett, 1998).

As attention shifts to the adequacy of instruction, the ability to derive meaningful information from test performance becomes more critical. A weak connection between test and curriculum ensures that the value of feedback for the examinee will be limited. Even for tests where the connection is stronger, feedback is still too often of marginal value, in part because of the additional cost and processing time that would be incurred.
For achievement surveys like NAEP, which offer no information to individuals, schools, or districts, motivation to participate is undoubtedly diminished.

Finally, there is efficiency. Testing programs are expensive to operate. That
The Promise of New Technology

Radical improvements in assessment will derive from advances in three areas: technology, measurement, and cognitive science (Bennett, 1999). Of the three, new technology will be the most influential in the short term and, for that reason, I focus on it in this paper. New technology will have the greatest influence because it (not measurement and not cognitive science) is pervading our society. Billions of dollars are being invested annually to create and make commonplace powerful, general technologies for commerce, communications, entertainment, and education. Due to their generality, these technologies can also be used to improve assessment.

These technological advancements revolve primarily around the Internet. The Internet is (or will be) interactive, broadband, switched, networked, and standards-based. What does that mean?

Interactive means that we can present a task to a student and quickly respond to that student's actions.

Switched means that we can engage in different interactions with different students simultaneously. In combination, these two characteristics (interactive and switched) make for individualized assessments.

Broadband means that those interactions can contain lots of information. For assessment tasks, that information could include audio, video, and animation. Those features might make tasks more authentic and more engaging, as well as allow us to assess skills that cannot be measured in paper and pencil (Bennett, Goodman, Hessinger, Ligget, Marshall, Kahn, & Zack, 1999). We might also use audio and video to capture answers, for example, giving examinees choice in their response modalities (typing, speaking, or, for a deaf student, American Sign Language).

Networked indicates that everything is linked. This linkage means that testing agencies, schools, parents, government officials, item writers, test reviewers, human scorers, and students are tied together electronically.
That electronic connection can allow for enormous efficiencies.

Finally, standards-based means that the network runs according to a set of conventional rules that all participants follow. That fact permits both the easy interchange of data and access from a wide variety of computing platforms, as long as the software running on those platforms (e.g., Internet browsers) adheres to those rules too. (Note 1)

As an embodiment of these characteristics, what does the Internet afford? It affords the potential to deliver efficiently on a mass scale individualized, highly engaging content to almost any desktop; get data back immediately; process it; and make information available anywhere in the world, anytime day or night. Paper delivery cannot compete with this potential.

The Internet is, of course, not being built to service the needs of large-scale assessment. It is, instead, being built for e-commerce: to sell products and services over the web to consumers and to businesses directly. Coincidentally, the capabilities needed for e-commerce are essentially those needed for e-assessment: interactive (so that products can be offered and orders transacted), switched (so different business transactions can be conducted with different customers simultaneously), broadband (so that those offers can be as engaging and enticing as possible),
networked (so that product offers, orders, shipping, inventory, and accounting can be integrated), and standards-based (so that everyone can get to it, regardless of computing platform).

Will we be able to count on continued investment in the Internet to support its use as a delivery medium? By any measure, the Internet, and use of it, has grown dramatically, to say the least. As a communications medium, the Internet last year surpassed the telephone, with 3 billion email messages sent each day (Church, 1999). The number of unique URLs (web-page directory and subdirectory addresses) has grown from just under a billion in 1998 to a projected 3 billion in 2000 ("Big fish," 1999). In the United States, the percentage of homes with Internet access has increased from 26% in December 1998 to 42% in August 2000 (U.S. Department of Commerce, 2000). (Note 2) Worldwide, the number of users has grown from somewhere between 117 and 142 million in 1998 to about 400 million in 2000 ("Big fish," 1999; Global Reach, 2000; "How many online?", 2000). Finally, the number of host computers has gone from about 30 million to 75 million from January 1998 to January 2000 ("Internet domain survey host count," 2000). This phenomenal growth may slow as investment subsides from the speculative rates of the past few years. However, the vast size of the Internet and its user base constitute a critical mass that should continue to attract substantial capital.

For commerce, the promise of the Internet is all about being faster, cheaper, and better. Two "laws" of the digital era illustrate this promise. Moore's Law predicts the doubling of computational capability (specifically, at the level of the microchip) every 18 months. As Negroponte (1995) has explained, what filled a room yesterday is on your desk today and will be on your wrist tomorrow. Metcalfe's Law says that the value of a network increases by the square of the number of people on it.
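The arithmetic behind the two laws can be made concrete with a minimal sketch. It is an illustration only: the function names are hypothetical, and both formulas are idealized proportionalities taken from the wording above (a doubling every 18 months, value growing as the square of the user count), not measurements.

```python
def moore_capability(months, doubling_period_months=18.0):
    """Relative computational capability after `months`,
    assuming one doubling every `doubling_period_months` (idealized)."""
    return 2.0 ** (months / doubling_period_months)


def metcalfe_value(users):
    """Relative network value, assumed proportional to the
    square of the number of users (idealized, unitless)."""
    return users ** 2


if __name__ == "__main__":
    # Capability relative to today after 0, 18, and 36 months.
    for months in (0, 18, 36):
        print(months, moore_capability(months))

    # Doubling the number of users quadruples the idealized value.
    print(metcalfe_value(20) / metcalfe_value(10))
```

Under these assumptions, doubling the user base quadruples the network's notional value, which is why each added participant matters more as the network grows.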
The true value of a network is, thus, less about information and more about community (Negroponte, 1995). One can see this effect clearly in eBay, the online auction broker (Cohen, 1999). Each new user potentially benefits every other existing user because every eBay member can be both buyer and seller. (Note 3) Metcalfe's Law is playing out well beyond eBay. Online business-to-business auction brokers are appearing in a variety of industries, including natural gas, electricity, steel, and bandwidth (Friedman, 2000, pp. 386-387; Gibney, 2000).

Another illustration of this cheaper-faster-better result is the effect of the Internet on the traditional relationship between richness and reach, where richness is the depth of the interaction that a business can have with a customer and reach is the number of customers that a business can contact through a given channel. Traditionally, one limited the other. That is, a business could attain maximal reach but only limited richness. For example, through direct mail, broadcast, or newspaper ads a company could communicate with many people but have a meaningful interaction with none of them. Similarly, a business could attain maximal richness but limited reach. Via personal contact (e.g., door-to-door sales), very deep interactions can occur, but with only a relatively small number of people. What has the Internet done? It has transformed the relationship between richness and reach by allowing businesses to touch many people in a personalized but inexpensive way (Evans & Wurster, 2000). What does richness with reach make for? It makes for mass customization.

We can already see the effects in Dell Computer Corporation's business model. Customers can log onto Dell's Internet site (www.dell.com), choose from a menu of basic machine designs, and then configure a particular design to meet their needs. A second example is Radio.SonicNet (http://radio.sonicnet.com/splash.asp).
Radio.SonicNet allows one to pick from a variety of music styles, choose artists within
that style, and indicate how frequently each artist should play. The end result is a radio station uniquely tuned to the individual and continually interesting; it always plays what you like but you never know exactly what it is going to play. As a final example, consider Customatix (www.customatix.com/customatix/common/homepage/HomepageGeneral.po), which allows you to design your own shoes using up to three billion trillion combinations of colors, graphics, logos and materials per shoe. You design them. They build them. And nobody else is likely to have exactly the same ones.

Reinventing Assessment

Reinventing the Business

There are two major dimensions to reinventing assessment. One is the business of assessment. This dimension centers on the core processes that define an enterprise. In many cases, those core processes can become many times more efficient because moving bits is faster and easier than moving atoms (Negroponte, 1995); that is, electronically processing information is far more cost effective than physically manipulating things.
For large-scale testing programs, some examples of the potential for electronic processing are in:

  developing tests, making the items easier to review, revise, and automatically morph into still more items (e.g., Singley & Bennett, in press) because the items themselves are digitally represented;
  delivering tests, eliminating the costs of printing, warehousing, and shipping tons of paper;
  presenting dynamic stimuli like audio, video, and animation, making specialized testing equipment (e.g., audio cassette recorders, VCRs) obsolete (Bennett, Goodman, Hessinger, Ligget, Marshall, Kahn, & Zack, 1999);
  transmitting some types of complex constructed responses to human graders, removing the need to transport, house, and feed the graders (Odendahl, 1999; Whalen & Bejar, 1998);
  scoring other complex constructed responses automatically, reducing the need for human reading (Burstein et al., 1998; Clauser et al., 1997); and
  distributing test results, cutting the costs of printing and mailing reports.

To get a sense of how reinventing the business of assessment might affect testing organizations, take a look at reference book publishing, in particular the case of Encyclopaedia Britannica (Evans & Wurster, 2000; Landler, 1995; Melcher, 1997). Encyclopaedia Britannica was established in Scotland in 1768. It is the oldest and most famous encyclopedia in the English-speaking world. By 1990, its sales had reached $650 million per annum. But then suddenly, Britannica's fortunes drastically changed. In 1996, the company was sold for less than half its net worth (i.e., the value of its assets, including its encyclopedia inventory, minus its liabilities). That same year, it eliminated its entire door-to-door North American sales force. By 1998, sales had fallen 80%. What happened?

What happened was that the reference book business was reinvented because of the emergence of new technology.
At its peak, Britannica was a 32-volume set of books costing well over $1,000. In 1993, Microsoft introduced Encarta on CD-ROM for under
$100, and even though Britannica was much more comprehensive, the difference for most people wasn't worth an extra $900+. Initially, Britannica did not respond, as it didn't take the threat from Encarta seriously. But when it did respond, it did so ineffectively because Britannica wouldn't fit on a single CD-ROM and because the company's large sales force wasn't suited to selling software. But, ultimately, Britannica wasn't ready to cannibalize its existing paper business to enter this new electronic one.

Why is this story important? It's important because similar (though less extreme) scenarios are playing themselves out now in individual investing, book selling, travel planning, music distribution, long distance telephony, and even business-to-business transactions. (As to the last, Cisco Systems makes 90% of its revenue from business-to-business transactions done over the Internet [Cisco Systems, Inc., 2000].) These reinvention scenarios are forcing organizations, including some in educational assessment, to come quickly to grips with where new technology will and will not help core business processes.

As should be obvious, technology-driven changes in business processes can occur quickly and their consequences can be significant for the organizations that service a particular market. In fact, if radical and pervasive enough, process changes can force shifts in the substance of the business itself. So, although reinventing the business of assessment by incorporating technology into specific assessment processes is about trying to achieve the efficiencies needed to remain competitive today, reinventing the substance of assessment (most fundamentally, the reason we do it) is not about today. It's about tomorrow.

Reinventing the Substance

The populations seeking education are changing and so are their purposes for learning.
At the college level, just 16% of students fit the traditional profile: 18-22 years old, full-time, on-campus resident (Levine, 2000a). This is not because fewer 18-22 year olds are going to college. It is because more adults are. The adult cohort is, in fact, the fastest growing segment in postsecondary education (Kerrey & Isakson, 2000). Working adults over age 24 constitute some 44% of college students ("Education prognosis 1999," 1999).

Why are so many adults returning to college? Over the past 25 years, employer demand in the U.S. has shifted toward higher educational qualifications, as indicated by an increasing premium paid for those with a college degree (Barton, 1999). But in addition to this rise in entry qualifications, the knowledge required to maintain a job in many occupations is changing so fast that 50% of all employees' skills are estimated to become outdated within 3-5 years (Moe & Blodget, 2000). Witness any job that requires interaction with information technology (IT), which is a growing proportion of jobs. In fact, by 2006 almost half of all workers will be employed by industries that are either major producers or intensive users of IT products and services (Henry et al., 1999).

So, more people want postsecondary education because they need to have it if they want to become, and stay, employed. And, more of these individuals are nontraditional students who may work, travel in their jobs, or have families. For these people, physically attending classes is not always feasible, let alone convenient. (Note 4)

This population's unmet educational need is increasingly becoming the target of distance learning. According to the National Center for Education Statistics, between fall 1995 and 1997-98, the percentage of higher education institutions offering distance learning courses increased by one-third (from 33% to 44%), and the number of course offerings and enrollments approximately doubled (Lewis et al., 1999). But although
many institutions have delivered distance learning via mail, radio, or television for years, this growth is not in those media. Rather, it is distance learning via the Internet that is booming. Among all higher-education institutions offering any distance learning, the percentage of institutions using asynchronous Internet-based technologies nearly tripled, from 22% in 1995 to 60% in 1997-1998. More recent data from Market Data Retrieval (MDR) confirm the trend ("Report: College Net use growing," 2000). MDR relates that, as of the 1999-2000 academic year, 34% of two- and four-year colleges offered accredited degree programs via computer, up from 15% the year before. As of 2000, U.S. institutions reportedly offered more than 6,000 accredited courses on the Web and, by 2002, over 2 million students are projected to be enrolled, a tripling of the 1998 enrollment (Moe & Blodget, 2000).

At the same time, Internet-based distance learning is finding its way into high school. The need is generated by home-schooled students (of whom there are over 1 million in the U.S.), districts without a full complement of qualified teachers, and the children of migrant workers. So-called "virtual high schools" have emerged in Alabama, Arizona, California, Florida, Illinois, Indiana, Kentucky, Maryland, Massachusetts, Michigan, Missouri, Nebraska, New Mexico, and Utah (Carr, 1999; Carr & Young, 1999; Kerrey & Isakson, 2000). These programs can cross state lines, with offerings open to students regardless of residence. Of particular note is that both the University of Missouri at Columbia High School and the Indiana University High School have been granted accreditation by the North Central Association of Colleges and Schools (Carr, 1999). Accreditation means that students can apply course grades earned through these online institutions toward their high-school graduation. Both programs offer more than 100 high school courses.
The growth of Internet-based distance learning will have a significant impact upon traditional education. For one, it may threaten the existence of established institutions (Dunn, 2000; Levine, 2000b). Many in the private sector see education as a huge industry that produces mediocre results for a high cost. If the private sector can leverage new technologies, like distance learning, to deliver greater value, the institutions that dominate education today will not be the leaders tomorrow. The rapid growth of for-profit education companies (e.g., the University of Phoenix) and the seemingly endless creation of well-capitalized new ones (e.g., UNext, Caliber, KaplanCollege.com, University Access, K12) suggest that a serious challenge to the existing order is well underway. The gravity of the threat is evident in how nonprofits have responded. Cornell University, Columbia University, the University of Maryland, and New York University, among others, have each announced their own for-profit distance learning subsidiaries (Carr, 2000a)!

A second reason that the growth of Internet-based distance learning will influence traditional education is that, regardless of its impact on nonprofit institutions, the distance learning industry will produce sophisticated software that everyone can use, in school and out. Both Dunn (2000) and Tulloch (2000) suggest that this occurrence will blur the distinctions between distance learning and local education. APEX offers an example (http://apex.netu.com/). This company markets online Advanced Placement (AP) courses, targeting districts that want to offer AP but do not have qualified teachers. Districts can, thus, use APEX offerings on site. (Note 5)

The considerable potential of online learning, local or distance, is reflected in a report to the President and Congress of the bipartisan Web-Based Education Commission (Kerrey & Isakson, 2000). The Commission reached the following conclusion:
    The question is no longer if the Internet can be used to transform learning in new and powerful ways. The Commission has found that it can. Nor is the question should we invest the time, the energy, and the money necessary to fulfill its promise in defining and shaping new learning opportunity. The Commission believes that we should. (p. 134, italics in original)

If this statement is acted on, its consequences for assessment are profound. As online learning becomes more widespread, the substance and format of assessment will need to keep pace. Another quote from the Commission's report:

    Perhaps the greatest barrier to innovative teaching is assessment that measures yesterday's learning goals... Too often today's tests measure yesterday's skills with yesterday's testing technologies -- paper and pencil. (p. 59)

So, as students do more and more of their learning using technology tools, asking them to express that learning in a medium different from the one they typically work in will become increasingly untenable, especially where working with the medium is part of the skill being tested (or otherwise impacts it in important ways). Searching for information using the World Wide Web or writing on computer are examples. (Note 6)

These changes in learning methodology offer exciting possibilities for assessment innovation. On site or off, an obvious result of delivering courses via the Internet is the potential for embedding assessment, perhaps almost seamlessly, in instruction (Bennett, 1998). Since students respond to instructional exercises electronically, their responses can be recorded, leaving a continuous learning trace. Depending upon how the course and the assessment are designed, this information could conceivably support a sophisticated model of student proficiencies (Gitomer, Mislevy, & Steinberg, 1995).
That model might be useful both for dynamically deciding what instruction to present next and for making more global judgments about what the student knows and can do at any given point.

In addition to assessment embedded in Internet-delivered courses, one can imagine Internet-delivered assessment embedded in traditional classroom activity. Such assessment might take the form of periodically delivered exercises that both teach and test. In this scenario, the exercises would be standardized and performance might serve, depending upon the level of aggregation, to indicate individual, classroom, school, district, state, or national achievement. Thus, these exercises could serve summative as well as formative purposes and be useful to individuals as well as institutions. If the exercises were of high enough quality, such a model might improve the motivation to participate in voluntary surveys like NAEP.

There are, to be sure, many difficult issues:

1. How can we generate comparable inferences across students and institutions when variation in school equipment may cause items to display differently from one student to the next, potentially affecting performance?
2. How can we deliver assessment dependably given the unreliable nature of computers and the Internet, and the limited technical support available in most schools?
3. How might we make sense of the huge corpus of data that the electronic recording of student actions might provide?
4. How would student learning be affected by students knowing that their actions are being recorded?
5. How can we prevent assessments that serve both instructional and accountability purposes from being corrupted by unscrupulous students or school staff?
6. How can we manage the costs of online assessment?
7. How can we assure that all parties can participate?

Let's, for the moment, turn to this last issue.

Are the Schools Ready?

A continuing concern with such reinvention visions is whether schools (and students) are ready technologically and, in particular, what to do about technology differences across social groups. The National Center for Education Statistics (NCES) reports that as of September 1999, 95% of schools were connected to the Internet, up from 35% in 1994 (NCES, 2000). Schools in all categories (i.e., by grade level, poverty concentration, and metropolitan status) were equally likely to have Internet access. Further, most schools had dedicated lines: only 14% were using dial-up modem, a slower and less reliable access method. (Note 7)

Clearly, many of these schools could have only a single connected machine and that machine could be the one sitting on the principal's desk. How many classrooms were actually wired? According to NCES (2000), as of September 1999, 63% of all instructional rooms had Internet access (up from 3% in 1994, a roughly 20-fold increase in five years). The ratio of students to Internet-connected computers was 9:1, down from 12:1 only a year earlier. These are staggering numbers, for they imply that classrooms are connecting to the Internet at a very rapid rate.

This success is in no small part due to federal efforts. The government's e-rate program has been giving public schools and libraries discounts of up to 90% on phone service, Internet hook-ups, and wiring for several years ("FCC: E-rate subsidy funded," 2000). In total, the program has committed 3.65 billion dollars to over 50,000 institutions, helping connect more than one million public school classrooms (Kennard, 2000).
In addition, 70% of the program's last round of funding went to schools in the lowest income areas.

However, even with these very significant efforts, there continue to be equity issues. As of September 1999, in high poverty schools, the ratio of students to Internet computers was 16 to 1. In low poverty schools, it was less than half that amount: 7 to 1 (NCES, 2000).

What should we conclude? Certainly, with few exceptions, it would be impossible to deliver large-scale assessment via the Internet today. But the trend is clear: the infrastructure is quickly falling into place for Internet delivery of assessment to schools, perhaps first in survey programs like NAEP that require only a small participant sample from each school, but eventually for inclusive assessments delivered directly to the desktop. As evidence, witness the requests-for-proposals recently released by the state education departments of Oregon, Virginia, and Georgia for building Internet-delivered, state-assessment systems (Department of Education, 2000; Virginia Department of Education, undated; State of Georgia, 2001).

Assuming that every classroom is wired, will all students then have the technology skills needed to take tests online? Clearly, more students are becoming computer-familiar every day and developing such skills is a national educational technology goal (Riley, Holleman, & Roberts, 2000). But, as Negroponte (1995) suggests, computer familiarity is really the wrong issue. The secret to good interface
design is to make it go away. Thus, advances in technology will eventually eliminate the need to be computer familiar. After nomadic computing, which we are now entering with the proliferation of wireless Internet devices and personal digital assistants, comes ubiquitous computing (Olsen, 2000), the embedding of new technology into everyday items. Inventions like "radio" paper (Gershenfeld, 1999, p. 18; Maney, 2000; "NCS secures rights," 2000) may allow students to interact with computers in the same way that they interact with paper today. Smart desks are another likelihood, in which case a test may be electronically delivered, quite literally, to every desktop.

In the U.S., then, we may see a future in which every classroom is wired and every student can easily take tests online. What of the rest of the world? To be sure, the Internet is, in origin, an American phenomenon. It derives from research sponsored by the Defense Department in the 1960s (Cerf, 1993). As a result of this history, the overwhelming majority of users were, until very recently, from our shores. At this writing, over 60% of Net users reside outside of the United States and the foreign growth rate now exceeds the domestic one ("How many online?", 2000; "U.S. dominance seen slipping," 2001).

The largest numbers of foreign Internet users are, of course, in developed nations. These nations have the telecommunications infrastructure and citizens with enough disposable income to afford the trappings of Internet use. But what about developing nations? Will they be left irretrievably behind? The challenges for these nations are undoubtedly great. Over time, however, we should see significant progress in building the infrastructure and the user base there too (Cairncross, 1997; Fernandez, 2000).

This progress will occur for at least two reasons. First, the cost of technology has been dropping precipitously and, by Moore's Law, will continue to decline.
Further, because the future of computing is undoubtedly in wireless devices (Grice, 2000), a telecommunications infrastructure will be much cheaper to acquire than the land-lines of old. Second, as Metcalfe's law suggests, markets will become all the more valuable as they are interconnected. (Witness the global economy and the economic benefits resulting to nations from integration with it.) That developing nations join the e-commerce network means greater opportunity for all. It means more vendor choice for the people of developing nations; more opportunity for developed nations to serve these markets; and a new opportunity for third-world businesses themselves to compete globally. (Note 8)

The same holds true for assessment. The Internet will make it easier for developing nations to get access to assessment services from elsewhere and for those nations to distribute their own assessment services regionally or around the world. This ease of access and distribution should make it possible to form international consortia. Such consortia will be able to assemble technical resources that a single nation might not be able to acquire. In addition, those consortia may be able to purchase services from others more efficiently than nations could obtain them individually. Finally, an electronic network should make it easier to participate in international studies, bringing the benefits of benchmarking to nations throughout the world.

But is Technology-Based Assessment Really Worth the Investment?

One of the largest instantiations of technology-based assessment to date is computer-based testing (CBT) in postsecondary admissions. As programs like the Graduate Record Examinations, the Graduate Management Admission Test, and the Test of English as a Foreign Language have found, CBT can be enormously costly. Being among the first large-scale programs to move to computer, they bore the brunt of creating the infrastructure for what was essentially a new business.
The building of that infrastructure was initiated in the early 1990s, before test developers knew how to create tests for computer, before computers were widely available for individuals to take tests on, and before the Internet was ready to bring those tests to students. In essence, these programs needed to build both a factory to stamp out a new product and a new distribution mechanism. A first generation infrastructure now exists, but it is not yet optimized to produce and deliver tests as efficiently as possible. Right now, there's no question about it: for these programs, assessment by computer costs far more than assessment by paper.

If we have learned anything from the history of innovation, it is that new technologies are often initially far too expensive for mass use. That was true of the automobile, telephone service, commercial aviation, and the personal computer, among many other innovations. For example, in 1930 the cost of a three-minute telephone call from New York to London was $250 (in 1990 dollars). By 1995, the cost had dropped to under $1 (World Bank, 1995, cited in Cairncross, 1997, p. 28). As a second instance, when the IBM Personal Computer was introduced in 1981, it cost around $5,000. At the time, the median family income in the United States was on the order of $25,000, so that a computer cost about 20% of the average family's earnings, which was not very affordable. At this writing, the cost of a computer with many times greater capability is a little more than $500 and the median income is closer to $55,000. (Note 9) A computer now costs about 1% of average income. (Note 10)

When a promising new technology appears, individuals and institutions invest, allowing the technology to evolve and a supporting infrastructure to develop. Over the course of that development, failures inevitably occur. Eventually, the technology either dies or becomes commercially viable, that is, efficient enough.

So, who's investing in CBT?
At this point, it's an impressive list including non-profit testing agencies, for-profit testing companies, school districts, state education departments, government agencies, and companies with no history in testing at all. The list includes ACT, the Bloomington (MN) Public Schools, CITO (the Netherlands), the College Board, CTB/McGraw-Hill, Edison Schools, ETS, Excelsior College (formerly Regents College), Harcourt Educational Measurement, Heriot-Watt University (Scotland), Houghton-Mifflin, Microsoft, the National Board of Medical Examiners, the National Institute for Testing and Evaluation (Israel), NCS Pearson, the Northwest Evaluation Association, the Oregon Department of Education, the Qualifications and Curriculum Authority (Great Britain), Thomson Corporation, the University of Cambridge Local Examinations Syndicate (UCLES), the U.S. Armed Forces, Vantage Technologies, and the Victoria (Australia) Board of Studies. These organizations are producing tests for postsecondary admissions, college course placement, course credit, school accountability, instructional assessment, and professional certification and licensure (see the Appendix for details). In concert, they already administer something on the order of 10 million computerized tests each year. (Note 11)

Why are these organizations investing? I think it's because they believe that technology-based assessment will eventually achieve important economies over paper and that, fundamentally, assessment will benefit. But I also think it's because they don't want to become Britannica. That is, they see improvements in the business and substance of assessment which, if they fail to embrace, will lead them to the same fate as that encyclopedia publisher.

CBT as a Disruptive Technology

But as the case of admissions testing suggests, the road to improvement may be a difficult one since CBT might not be a typical innovation. Christensen (1997) distinguishes between two types of innovation, called sustaining and disruptive technologies. Sustaining technologies enhance the performance of established products in ways that mainstream customers have traditionally valued. Historically, most technological advances in any given industry have been sustaining ones (e.g., in the personal computer industry, faster chips and bigger, higher-resolution monitors). Occasionally, disruptive technologies emerge. Companies introduce these technologies hoping their features will provide competitive edge. However, these features characteristically overshoot the market, giving customers more than they need or are willing to pay for. Thus, disruptive technologies result in worse product performance, at least in the near-term, on key dimensions in a company's established markets.

Interestingly, a few fringe customers typically find a disruptive technology's new features attractive. In these niche markets, such technology may thrive. If and when it advances to the level and nature of performance demanded in the mainstream market, the new technology can invade it, rapidly knocking out the traditional technology and its dependent practitioners. Remember Britannica.

CBT has many of the characteristics of a disruptive technology. Established testing organizations are applying it in their mainstream markets, most notably postsecondary admissions. This innovation was introduced, in good part, to provide competitive edge through features like the ability to take a test at one's convenience and to get score reports immediately. As it turned out, these features overshot the market. At least initially, registrations for continuously-offered computer-based admissions tests mirrored those for fixed-date administrations, suggesting that scheduling convenience was not a highly valued feature in the market of the time.
Moreover, examinees were dissatisfied with losing some of the features of paper exams, including the ability to proceed through the test nonlinearly, the option to review the scoring of items actually taken, and the low cost (Perry, 2000).

Although it encountered difficulty in the mainstream admissions testing market, CBT found more rapid acceptance in the niches. One example is information technology (IT) certification, which individuals pursue to document their competence in some computer-related proficiency. In 1999, over three million examinations in 25 languages were administered in this market (Adelman, 2000). Most of these tests were delivered on computer and most were offered on a continuous basis. Three delivery vendors provided the bulk of examinations: CAT, Inc. (a subsidiary of Houghton-Mifflin), Prometric (a subsidiary of Thomson Corporation), and Vue (a subsidiary of NCS Pearson). Together, these vendors operated some 5,000 testing centers in 140 countries. As of June 2000, over 1.9 million credentials had been awarded, most for Microsoft or Novell technologies.

Why is the CBT of today so well suited to this market niche? Let's start by asking what features a testing product must have to succeed in this niche. First, it must be continuously offered because these test candidates build technology skill on their own schedules (at home or on the job), very often through books or online learning. These individuals want to test when they are ready, not when the testing companies are. Second, such a test must generally be offered on computer since technology use is the essence of the certification.

What are the financial considerations associated with serving this market? One consideration is whether the test fee can cover the cost of assessment. As it turns out, this market is less price-sensitive than postsecondary admissions. Why? With IT testing, employers pay the fee for over half the candidates (Adelman, 2000).
In addition, certified employees command a substantial salary premium (4-14%), which makes examinees more willing to absorb the higher fees that CBT currently requires. A second consideration is that security is not as critical as in admissions testing, so large item pools are not needed, reducing production cost. Lower security is tolerable because if an individual appears on the job with a dishonestly obtained credential but without the required skill, he or she will not last. Finally, test volume is self-replicating: there are many repeat test takers because information technology changes rapidly, so skills must be updated constantly. From an innovation perspective, then, IT certification may be one context in which the CBT of today can flourish and develop to better meet the needs of other assessment markets.

So why do industry leaders tend to fail with disruptive technology while fringe players succeed? Industry leaders often fail precisely because they attempt to introduce disruptive technologies into major markets before its time (Christensen, 1997). Because niche markets are often too small to be of interest, leaders do not pursue those opportunities to refine the technology. Instead, they give up, having run out of resources or credibility. Making a disruptive technology work requires iteration, and iteration means failure. Because they risk neither large resources nor reputations in the mainstream market, it is the fringe players who can fail early, often, and inexpensively enough to eventually challenge and overtake the industry leaders.

Toward the Technology-Based Assessment of Tomorrow

Are there other niche markets in which CBT might evolve? One such niche may be online learning. If we believe the Web-Based Education Commission (Kerrey & Isakson, 2000), online learning will become a major enterprise, especially for the lifelong updating of skills. In this market, institutions will be less concerned with questions of who gets in and more with who gets out and what it is they have to do to get out (Messick, 1999). Why?
Because businesses are becoming more concerned with what employees, once hired, know and can do, and less with where they went to school. Similarly, individuals are becoming more concerned with finding course offerings that meet their skill development goals and less with whether those offerings come from one institution or a half-dozen.

What's the assessment need? First, it is for knowledge facilitation and, second, for knowledge certification; that is, to help people develop their skills and then document that they've developed them. What's the assessment challenge? The challenge is to figure out how to design and deliver embedded assessment that provides instructional support and that globally summarizes learning accomplishment. In other words, the challenge is to combine richness with reach to achieve mass customization, that is, to use the Internet's ability to deliver the richness of customized assessment to reach a mass audience.

Can assessment be customized? In very rudimentary ways, it already is. Certainly, we can dynamically adapt along a global dimension, as is done in many of today's computerized tests. But as we move assessment closer to instruction, we should eventually be able to adapt to the interests of the learner and to the particular strengths and weaknesses evident at any particular juncture, as intelligent tutors now do (e.g., Schulze, Shelby, Treacy, & Wintersgill, 2000). Likewise, we should be able to customize feedback to describe the specific proficiencies the learner evidenced in an instructional sequence.

But perhaps the most far-reaching customization of assessment will come through modular online courses, whereby an instructor (or even a sophisticated learner) assembles a series of components into a unique offering. The Department of Defense (DOD) has taken a significant step through the Sharable Courseware Object Reference Model (SCORM) (www.adlnet.org). SCORM is to embody specifications and guidelines providing the foundation for how DOD will use technology to build and operate the learning environment of the future. SCORM will allow mixing and matching of learning segments to create lower cost, reusable training resources. (Note 12) If embedded assessment can be built into course modules following a similar set of conventional specifications, the assessment too will be customized by default.

Conclusion

Whether for postsecondary admissions, school and student accountability, or national policy, large-scale assessment must be reinvented. Reinvention is not an option. If we do not reinvent it, much of today's paper-based testing will become an anachronism, "yesterday's testing technology" in the words of the Web-Based Education Commission (Kerrey & Isakson, 2000), because it will be inconsistent with what and how students learn.

This reinvention must occur along both business and substantive lines. As educators, we often behave as if business considerations are unimportant, even distasteful. However, the business and substance of assessment are intertwined. Even for non-profit educational institutions (state education departments, federal agencies, schools, research organizations), providing quality assessment for a low cost matters. Using new technology to do assessment faster and cheaper can free up the resources to do assessment better.

We will be able to do assessment better because advances in technology, cognitive science, and measurement are laying the groundwork to make reinvention a reality. Whereas the contributions of cognitive and measurement science are in many ways more fundamental than those of new technology, it is new technology that is pervading our society. My thesis, therefore, is that new technology will be the primary facilitating factor precisely because of its widespread societal acceptance.
(Note 13) In the same way that the Internet is already helping to revolutionize commerce, education, and even social interaction, this technological advance will help revolutionize the business and substance of large-scale assessment. It will do so by allowing richness with reach (that is, mass customization on a global scale) as never before. However, as the history of innovation suggests, this reinvention won't come immediately, without significant investment, or without setback.

With few exceptions, we are not yet ready for large-scale assessment via the Internet (at least in our schools). However, as suggested above, this story is not so much about today. It really is about tomorrow.

Notes

This article is based on a paper presented at the annual conference of the International Association for Educational Assessment (IAEA), Jerusalem, May 2000. I appreciate the helpful comments of Isaac Bejar, Henry Braun, and Drew Gitomer on an earlier draft of this manuscript.

1. The Internet takes advantage of many such standards including Internet Protocol (IP) for transmitting packets of information; Transmission Control Protocol (TCP) for verifying the contents of those packets; HyperText Transfer Protocol (HTTP) for transferring web pages; and HyperText Markup Language (HTML) and Extensible Markup Language (XML) for representing structured documents and data on the Web. XML provides a significant advance over HTML in that it allows for the representation of unlimited classes of documents. Leadership in developing and implementing the many standards used by the Internet is provided by the World Wide Web Consortium (www.w3.org). For more on Internet standards, see their website or see Green (1996), who gives a more basic introduction.
2. According to Nielsen//NetRatings, 56% of U.S. households had Internet access as of November 2000 ("Internet access tops 56 percent," 2000).
3. And it works. eBay is reported to be the most successful company in cyberspace, with 22.5 million registered users and 2000 revenues of $430 million (Cohen, 2001). Why? It has none of the costs of retailing: no buying, no warehousing, no shipping, no returns, no overstock.
4. A recent, but potentially significant, addition to this population is the U.S. Army. In July 2000, Secretary of the Army Louis Caldera announced a $600 million program to allow any interested soldier to take college courses over the Internet at little or no cost (Carr, 2000b).
5. A second, perhaps more interesting, example is Florida's Daniel Jenkins Academy, where students physically attend but take all academic courses on-line from off-site teachers (Thomas, 2000).
6. Russell has conducted several studies on the mismatch between learning and testing methods in writing (e.g., Russell & Plati, 2001). The repeated result is that the writing proficiencies of students who routinely use word processors are underestimated by paper-and-pencil tests.
7. The Teaching, Learning, and Computing: 1998 survey provides similar data (Anderson & Ronnkvist, 1999). This survey, conducted using a national probability sample in Spring 1999, reports Internet access in 90% of schools and at least medium-speed, dedicated connections in 57%.
8. Developing a technology infrastructure and integrating into the e-commerce network may, in fact, help jump-start the growth required to deal with the serious problems of public health, education, and welfare that these countries typically face (Friedman, 2000).
9. The median income for a family of four in 1981 was $26,274 (U.S. Census Bureau, 2001). For 1998, it was $56,061.
10. Price and quality-adjusted data tell a similar story. In 1983, the quality-adjusted cost of a personal computer in constant 1996 dollars was $1098 (D. Wasshausen, personal communication, April 13, 2000). By 1996, the cost of a PC, holding quality constant, was $100, less than a tenth of the 1983 cost. By 1999, that quality-adjusted PC had further deflated to $29.
11. I based this estimate on unduplicated volumes claimed by Thomson Prometric (www.prometric.com), Vantage Technologies (www.intellimetric.com/index.html), and the U.S. Armed Forces (A. Nicewander, personal communication, November 2, 2000). These three organizations alone claim some 8.5 million tests annually. These tests include both high-stakes and low-stakes assessments.
12. SCORM is being built upon the work of the IMS Global Learning Consortium (IMS) (www.imsproject.org/aboutims.html). IMS is developing open specifications for facilitating distributed learning activities such as locating and using educational content, tracking learner progress, reporting learner performance, and exchanging student records between administrative systems. Both IMS and SCORM incorporate XML (see note 1 above).
13. That the largest facilitating factor will be technological is not to say that we should necessarily let technology drive the substance of assessment. We shouldn't.

References

ACT and EDS alliance to expand the nation's testing and training opportunities. (1999, June 8). ACT Newsroom [On-line]. Available: www.act.org/news/releases/1999/06-08-99.html
Adelman, C. (2000). A parallel postsecondary universe: The certification system in information technology. Washington, D.C.: Office of Educational Research and Improvement, U.S. Department of Education. Available: www.ed.gov/pubs/ParallelUniverse/
Anderson, R. E., & Ronnkvist, A. (1999). The presence of computers in American schools. Irvine, CA: Center for Research on Information Technology and Organizations. Available: www.crito.uci.edu/tlc/findings/computers_in_american_schools/
Ball, S. (1999). Measurement and the culture of education: The story of VSAM. Educational Measurement: Issues and Practice, 18(2), 50-51.
Barton, P. E. (1999). What jobs require: Literacy, education, and training, 1940-2006. Princeton, NJ: Policy Information Center, Educational Testing Service. Available: www.ets.org/research/pic
Bennett, R. E. (1998). Reinventing assessment: Speculations on the future of large-scale educational testing. Princeton, NJ: Policy Information Center, Educational Testing Service. Available: www.ets.org/research/pic/bennett.html
Bennett, R. E. (1999). Using new technology to improve assessment. Educational Measurement: Issues and Practice, 18(3), 5-12.
Bennett, R. E., Goodman, M., Hessinger, J., Ligget, J., Marshall, G., Kahn, H., & Zack, J. (1999). Using multimedia in large-scale computer-based testing programs. Computers in Human Behavior, 15, 283-294.
Big fish in a big pool. (1999). TIME Digital, December 2.
Burstein, J., Braden-Harder, L., Chodorow, M., Hua, S., Kaplan, B., Kukich, K., Lu, C., Nolan, J., Rock, D., & Wolff, S. (1998). Computer analysis of essay content for automated score prediction (RR-98-15). Princeton, NJ: Educational Testing Service.
Cairncross, F. (1997). The death of distance: How the communications revolution will change our lives. Boston, MA: Harvard Business School Press.
Carr, S. (1999, December 10). 2 more universities start diploma-granting virtual high schools. The Chronicle of Higher Education, p. A49.
Carr, S. (2000a, March 24). Cornell creates a for-profit subsidiary to market distance education programs. The Chronicle of Higher Education, p. A47.
Carr, S. (2000b, August 18). Army bombshell rocks distance education. The Chronicle of Higher Education, p. A35.
Carr, S., & Young, J. R. (1999, October 22). As distance learning boom spreads, colleges help set up virtual high schools. The Chronicle of Higher Education, p. A55.
Cerf, V. (1993). How the Internet came to be. In B. Aboba (Ed.), The online user's encyclopedia. New York: Addison-Wesley. Available: http://www.bell-labs.com/user/zhwang/vcerf.html
Christensen, C. M. (1997). The innovator's dilemma: When new technologies cause great firms to fail. Boston, MA: Harvard University Press.
Church, G. J. (1999). The economy of the future? TIME, 154(14). Available: http://www.time.com/time/magazine/article/0,9171,31522,00.html
Cisco Systems, Inc. (2000). Discover all that's possible on the Internet: 2000 annual report. San Jose, CA: Cisco Systems, Inc. Available: www.cisco.com/warp/public/749/ar2000
Clauser, B. E., Margolis, M. J., Clyman, S. G., & Ross, L. P. (1997). Development of automated scoring algorithms for complex performance assessments: A comparison of two approaches. Journal of Educational Measurement, 34, 141-161.
Cohen, A. (1999). The attic of e. TIME, 154(26). Available: http://www.time.com/time/magazine/article/0,9171,36306-1,00.html
Cohen, A. (2001). eBay's bid to conquer all. TIME, 157(5), 48-51.
Department of Education. (2000). Request for proposals for the technology enhanced student assessment system. Salem, OR: Department of Education. Available: www.ode.state.or.us/asmt/develop/rfptesa.htm
Dunn, S. L. (2000). The virtualizing of education. The Futurist, 34(2), 34-38.
Early test prep. (1999). ABC News.com [On-line]. Available: http://abcnews.go.com/sections/tech/DailyNews/testing991020.html
Education prognosis 1999. (1999, January 11). Business Week, 132-133.
Evans, P., & Wurster, T. S. (2000). Blown to bits: How the economics of information transforms strategy. Boston, MA: Harvard Business School Press.
FCC: E-rate subsidy funded at $2.25 billion cap. (2000). What Works in Teaching and Learning, 32(8), 8.
Fernandez, S. M. (2000). Latin America logs on. TIME, 155(19), B2-B4.
Frederiksen, N. (1984). The real test bias: Influences of testing on teaching and learning. American Psychologist, 39, 193-202.
Friedman, T. L. (2000). The Lexus and the olive tree: Understanding globalization. New York: Anchor Books.
Gershenfeld, N. (1999). When things start to think. New York: Holt.
Gibney, Jr., F. (2000). Enron plays the pipes. TIME, 156(9), 38-39.
Gitomer, D. H., Mislevy, R. J., & Steinberg, L. S. (1995). Diagnostic assessment of troubleshooting skill in an intelligent tutoring system. In P. D. Nichols, S. F. Chipman, & R. L. Brennan (Eds.), Cognitively diagnostic assessment (pp. 72-101). Hillsdale, NJ: Erlbaum.
Glaser, R. (1991). Expertise and assessment. In M. C. Wittrock & E. L. Baker (Eds.), Testing and cognition (pp. 17-30). Englewood Cliffs, NJ: Prentice-Hall.
Global Reach. (2000). Global Internet statistics (by language). Available: www.glreach.com/globstats/index.php3
Green, C. (1996). An introduction to Internet protocols for newbies. Available: www.halcyon.com/cliffg/uwteach/shared_info/internet_protocols.html
Grice, C. (2000). Wireless handhelds will rule the day, PC execs predict. CNET News.com [On-line]. Available: http://news.cnet.com/news/0-1004-200-1560446.html
Henry, D., Buckley, P., Gill, G., Cooke, S., Dumagan, J., Pastore, D., & LaPorte, S. (1999). The emerging digital economy II. Washington, D.C.: U.S. Department of Commerce. Available: www.ecommerce.gov/ede/ede2.pdf
How many online? (2000, December 20). Nua Internet Surveys. Available: www.nua.ie/surveys/how_many_online/index.html
Internet access tops 56 percent in U.S., according to Nielsen//NetRatings. (2000, December 18). Available: http://126.96.36.199/press_releases/PDF/pr_001215.pdf
Internet domain survey host count. (2000). Internet Software Consortium. Available: www.isc.org/ds/hosts.html
Kennard, W. E. (2000, January). E-rate: A success story. Presentation at the Educational Technology Leadership Conference 2000, Washington, D.C.
Kerrey, B., & Isakson, J. (2000). The power of the Internet for learning: Moving from promise to practice (Report of the Web-Based Education Commission). Washington, D.C.: Web-Based Education Commission. Available: http://interact.hpcnet.org/webcommission/index.htm
Landler, M. (1995, May 16). Slow-to-adapt Encyclopaedia Britannica is for sale. New York Times, D1, D22.
Levine, A. (2000a, March). The remaking of the American university. Paper presented at the Blackboard Summit, Washington, D.C.
Levine, A. (2000b, March 13). The soul of a new university. New York Times, p. 21.
Lewis, L., Snow, K., Farris, E., Levin, D., & Greene, B. (1999). Distance education at postsecondary education institutions: 1997-1998 (NCES Statistical Analysis Report 2000-013). Washington, D.C.: National Center for Education Statistics. Available: http://nces.ed.gov/pubs2000/2000013.pdf
Maney, K. (2000). E-novel approach promises new chapter for book lovers. USA Today, 18(169), 8A-9A.
Melcher, R. A. (1997). Dusting off the Britannica: A new order has digital dreams for the august encyclopedia. Business Week Online. Available: www.businessweek.com/1997/42/b3549124.htm
Mendels, P. (1999). The leading issues of '99? Wired schools and accreditation. The New York Times On the Web [On-line]. Available: www.nytimes.com/library/tech/99/12/cyber/education/29education.html
Messick, S. (1999). Technology and the future of higher education assessment. In S. Messick (Ed.), Assessment in higher education: Issues of access, student development, and public policy (pp. 245-254). Hillsdale, NJ: Erlbaum.
Moe, M. T., & Blodget, H. (2000). The knowledge web: People power - Fuel for the new economy. San Francisco: Merrill Lynch.
National Center for Education Statistics. (2000). Stats in brief: Internet access in US public schools and classrooms: 1994-99. Washington, DC: US Department of Education, Office of Research and Improvement.
NCS secures rights to iPaper electronic technology in testing and education market. (2000, July 11). Minneapolis, MN: National Computer Systems (NCS). Available: www.ncs.com/ncscorp/top/news/000711.htm
Negroponte, N. (1995). Being digital. New York: Vintage.
Odendahl, N. (1999, April). Online delivery and scoring of constructed-response assessments. Paper presented at the annual meeting of the American Educational Research Association, Montreal.
Olsen, F. (2000, February 18). A UCLA professor and net pioneer paves the way for the next big thing. The Chronicle of Higher Education, 46.
Pellegrino, J. W., Jones, L. R., & Mitchell, K. J. (1999). Grading the nation's report card. Washington, D.C.: National Academy Press.
Perry, J. (2000). Digital tests spark controversy: Critics say revamped exams limit the options to challenge a score. Online US News [On-line]. Available: www.usnews.com/usnews/edu/beyond/grad/gbgre.htm
Poised to go global: Accuplacer online sales soar. (2000, April). The Bulletin Board, 5(9), 5.
Report: College Net use growing. (2000, March 16). USA Today.com [On-line]. Available: www.usatoday.com/life/cyber/tech/cth566.htm
Riley, R. W., Holleman, F. S., & Roberts, L. G. (2000). e-Learning: Putting a world-class education at the fingertips of all children (The national educational technology plan). Washington, D.C.: U.S. Department of Education. Available: www.ed.gov/Technology/elearning/e-learning.pdf
Russell, M., & Plati, T. (2001). Effects of computer versus paper administration of a state-mandated writing assessment. Teachers College Record. Available: www.tcrecord.org/Content.asp?ContentID=10709
Schmidt, P. (2000, January 21). Judge sees no bias in Texas test for high-school graduation. Chronicle of Higher Education, p. A27.
Schulze, K. G., Shelby, R. N., Treacy, D. J., & Wintersgill, M. C. (2000, April). Andes: A coached learning environment for classical Newtonian physics. In Proceedings of the 11th International Conference on College Teaching and Learning, Jacksonville, FL. Available: www.pitt.edu/~vanlehn/icctl.pdf
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4-14.
Singley, M. K., & Bennett, R. E. (in press). Item generation and beyond: Applications of schema theory to mathematics assessment. In S. Irvine & P. Kyllonen (Eds.), Item generation for test development. Hillsdale, NJ: Erlbaum.
State of Georgia. (2001). Request for proposal number 41400-026-0000000031. Available: http://www2.state.ga.us/Departments/doas/procure/rfp/rfp-41400-026-0000000031.doc
Thomas, K. (2000, April 6). One school's quantum leap. USA Today, 1A. Available: www.usatoday.com/usatonline/20000406/2117463s.htm
Tulloch, J. B. (2000). Sophisticated technology offers higher education options. T.H.E. Journal [On-line]. Available: www.thejournal.com/magazine/vault/A3165.cfm
U.S. Census Bureau. (2001). Median income for 4-person families, by state. Available: www.census.gov/ftp/pub/hhes/income/4person.html
U.S. Department of Commerce. (2000). Falling through the Net: Toward digital inclusion. Available: www.esa.doc.gov/fttn00.pdf
U.S. dominance seen slipping in Internet use, commerce. (2001). Cyberatlas: The Big Picture Geographics. Available: http://cyberatlas.internet.com/big_picture/geographics/article/0,,5911_377801,00.html
Virginia Department of Education. (Undated). Demonstrating success: A statewide web-based Standards of Learning technology and on-line testing initiative (Request for proposal # RFP-WEB2000). Richmond, VA: Virginia Department of Education. Available: www.pen.k12.va.us/VDOE/Technology/soltech/rfp/rfpweb2000.pdf
Whalen, S. J., & Bejar, I. I. (1998). Relational databases in assessment: An application to online scoring. Journal of Educational Computing Research, 18, 1-13.
Appendix: Some Organizations Investing in Computer-Based Testing

ACT, Inc. In partnership with EDS, ACT, Inc. is establishing a nationwide network of electronic testing and training centers. These centers will provide computer-delivered certification and licensure tests for the trades and professions; a computerized measure of workplace skills to guide training decisions; and computerized educational and career guidance. More than 250 ACT Centers are expected to be operational by the end of 2001 ("ACT and EDS," 1999). ACT also offers a computerized placement test for post-secondary institutions to use in determining whether entering students need assignment to remedial or developmental courses in mathematics, reading, writing, and English-as-a-second-language (www.act.org/compass/).

Bloomington (MN) Public Schools. This district was reportedly the first in the US to do its math and reading testing exclusively via computer ("Early test prep," 1999). Bloomington uses an intranet-delivered computer-adaptive test designed by the Northwest Evaluation Association (see entry below) (www.bloomington.k12.mn.us/Staff_Resources/Office_of_Research_and_Evaluat/CALT_Technical_Description/calt_technical_description.htm).

CITO. CITO, the measurement organization of the Netherlands, has developed a computerized adaptive test, WisCat, for placement in adult education. WisCat is used by approximately half the vocational training institutes in the Netherlands (Verschoor, personal communication, November 7, 2000).

College Board. The College Board offers Accuplacer, an adaptive placement test that can be delivered over the Internet for use in postsecondary institutions (www.collegeboard.org/accuplacer/html/accupla1.html). Last year, over 2 million exams were administered ("Poised to go global," 2000), probably making Accuplacer the largest-volume CBT in the world. By July 2001, the Board will also be offering its entire College Level Examination Program (CLEP) on computer: over 30 tests designed to allow individuals to get college credit for knowledge gained outside of school (www.collegeboard.com/clep/clepcntr/html/tc001.html).

CTB/McGraw-Hill. This company offers a PC version of the Test of Adult Basic Education, a measure of reading, mathematics, language, and spelling skills used in adult literacy programs (www.ctb.com/products_services/tabe/index.html).

Edison Schools. This for-profit company manages 113 public schools with a total enrollment of 57,000 students. Edison recently introduced its Benchmark Assessment System, designed to provide teachers with ongoing, instructionally relevant information about the progress of their 2nd to 8th grade students. These computerized assessments in reading, math, writing, and language arts will be administered over 1 million times during the 2000-2001 academic year (www.intellimetric.com/when.newstoday0.html).

Educational Testing Service (ETS). In the 1999-2000 year, ETS administered over a million tests on computer for the GRE, GMAT, and TOEFL programs. In addition, a variety of licensure and certification examinations were given through ETS' Chauncey Group International subsidiary (www.ets.org/cbt/index.html). A second subsidiary, ETS Technologies, markets automated scoring services for computer-delivered writing tests