USF Libraries
USF Digital Collections

Refinement of a chemistry attitude measure for college students


Material Information

Title:
Refinement of a chemistry attitude measure for college students
Physical Description:
Book
Language:
English
Creator:
Xu, Xiaoying
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla
Publication Date:
2010

Subjects

Subjects / Keywords:
Scale development
Reliability
Validity
Factor analysis
Higher education
Dissertations, Academic -- Chemistry -- Masters -- USF   ( lcsh )
Genre:
non-fiction   ( marcgt )

Notes

Abstract:
ABSTRACT: This work presents the evaluation and refinement of a chemistry attitude measure, the Attitude toward the Subject of Chemistry Inventory (ASCI), for college students. The original 20-item and revised eight-item versions of ASCI (V1, V2, V3 and V3.1) were administered to different samples. The evaluation of ASCI focused on two psychometric properties: reliability and validity. Both psychological theory and factor analysis techniques were applied to guide instrument refinement. Evidence for good reliability and validity was established for ASCI V2 score interpretation. The two-factor correlated structure was supported by confirmatory factor analysis. The two subscales, intellectual accessibility and emotional satisfaction, are indicators of the cognition and affect components of attitude, and thus connect well with the tripartite attitude framework (cognition, affect and behavior) in psychology. Regarding the attitude-achievement relationship, beyond the well-known aptitude test score, students' attitude played a significant role in predicting final achievement in the General Chemistry I course. This work represents the first time in the area of chemical education that a benchmark for scale development was successfully implemented to reconstruct an existing instrument. In addition to showcasing a method for examining validity, a second important result of this work is to recommend refinements that lead to greater ease of administration while improving validity: this eight-item instrument now takes very little time to administer.
Thesis:
Thesis (M.S.)--University of South Florida, 2010.
Bibliography:
Includes bibliographical references.
System Details:
Mode of access: World Wide Web.
System Details:
System requirements: World Wide Web browser and PDF reader.
Statement of Responsibility:
by Xiaoying Xu.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains X pages.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
usfldc doi - E14-SFE0003361
usfldc handle - e14.3361
System ID:
SFS0027677:00001


Full Text



Refinement of a Chemistry Attitude Measure for College Students

by

Xiaoying Xu

A thesis submitted in partial fulfillment
of the requirements for the degree of
Master of Science
Department of Chemistry
College of Arts and Sciences
University of South Florida

Major Professor: Jennifer E. Lewis, Ph.D.
Xiao Li, Ph.D.
Robert F. Dedrick, Ph.D.
Santiago Sandi-Urena, Ph.D.

Date of Approval: February 1, 2010

Keywords: scale development, reliability, validity, factor analysis, higher education

Copyright 2010, Xiaoying Xu

Acknowledgements

I would like to thank Dr. Lewis, my major advisor, for her support, guidance, and encouragement during the study. Her enthusiastic, patient, and persistent attitude toward the chemical education program has been extremely valuable in overcoming possible difficulties. This project could not have been done without her. I thank my committee, Drs. Sheryl Li, Robert Dedrick, and Santiago Sandi-Urena, for their essential contributions in many ways. I also thank Drs. Kromrey and Ferron for conversations on data analysis. I am grateful to all group members, Alicia, Bo, Janelle, Karla, Keily, Sachel, Teresa, and Ushiri, for their friendship and academic advice. I appreciate the instrumental role that the chemistry department office staff, instructors, peer leaders, test proctors, lab TAs, and students played, and the data services that the scanning office provided. Special acknowledgement is due to Dr. Rick Moog for helpful suggestions on the manuscript. Finally, I would like to acknowledge the steadfast love, inspiration, and support from my family: Charley and Feng Cheng.

Table of Contents

List of Tables
List of Figures
Abstract
Chapter One: Introduction
    Why Attitude?
    Attitude Concept in Social Psychology
    Attitude Concept in Science Education
    Attitude Concept in This Study
Chapter Two: Literature Review
    Typical Attitude Measurement Scales
    Quality Evaluation for Existing Science Attitude Instruments
    Quality Evaluation for Existing Chemistry Attitude Instruments
    Research Questions
    Question 1
    Question 2
    Question 3
Chapter Three: Methodological Background
    Proposed Instrument Development Processes
    Reliability
    Validity
    Factor Analysis
    Sample Size
    EFA or PCA
    EFA or CFA
    The Number of Factors
    Summary
Chapter Four: Method
    General Research Methods
    Instruments
    Participants and Data Collection
    Data Analysis Strategy
    Scale Reconstruction Based on ASCI V1
    Validity Estimate for ASCI V2 Score
    Nomological Network among Attitude, Aptitude and Chemistry Achievement
    Predictive Validity for ASCI V2 Score
    Chi-Square Analysis for V3.1 Participants
Chapter Five: Quality Evaluation for Original ASCI (V1)
    Results and Discussion
    Proposal for Possible Revisions of ASCI
    Summary
Chapter Six: Quality Evaluation for Revised ASCI
    Quality Evaluation for Revised ASCI V2, Peer-Led Classes
    Results and Discussion
    Nomological Network among Attitude, Aptitude and Chemistry Achievement
    Results and Discussion
    Predictive Validity to Predict Chemistry Achievement
    Results and Discussion
    Evaluation of ASCI V3 for Chemistry Class at University MW
    Results and Discussion
    Evaluation of ASCI V3.1 for Attitudes toward Calculus
    Results and Discussion
    Evaluation of ASCI V2 for Biochemistry Class
    Results and Discussion
    Summary
Chapter Seven: Summary and Concluding Discussion
    Summary and Implication for Chemical Education
    Limitations of the Study and Recommendations for Future Research
References
Appendices
    Appendix A: Commonly Used Acronyms
    Appendix B: ASCI V1
    Appendix C: ASCI V2
    Appendix D: ASCI V3
    Appendix E: ASCI V3.1

List of Tables

Table 1.1.1 Definition of attitude-related concepts with possible items
Table 1.1.2 Definition of science attitude and related concepts with possible items
Table 2.1 Description of four typical attitude measurement scales
Table 2.2 Score for chemistry attitude instruments on JCE according to Blalock
Table 4.1 Summary of administrations for different versions of ASCI
Table 5.1 Descriptive statistics for ASCI V1 General Chemistry labs I and II
Table 5.2 Internal consistency and test-retest reliability for ASCI, General Chemistry labs
Table 5.3 The item loadings for V1 General Chemistry labs
Table 5.4.1 Subscale scores and Pearson correlations for V1
Table 5.4.2 Subscale scores and Pearson correlations by Bauer
Table 5.5 Fit statistics of V1 for four-factor CFA
Table 5.6 Factor correlation for V1 from CFA
Table 5.7 CFA results for Scale Reconstruction based on ASCI V1
Table 6.1 Descriptive statistics for ASCI V2 PL classes
Table 6.2 Subscale scores and Pearson correlations for V2, PL classes along with I labs
Table 6.3 Internal consistency for V2, PL classes along with I Lab I and literature results
Table 6.4 The item loadings for V2, PL classes
Table 6.5 Fit statistics of V2 results for 1- and 2-factor CFA
Table 6.6 Descriptive analysis for attitude, aptitude score and final score on ACS
Table 6.7 Pearson correlation among attitude, aptitude and achievement
Table 6.8 Descriptive statistics for ASCI V3, MW chemistry
Table 6.9 Internal consistency for V3, MW chemistry
Table 6.10 The item loadings for V3 at MW and V2 from PL classes at SE
Table 6.11 Factor scores and Pearson correlations for V3, MW and PL classes at SE
Table 6.12 Fit statistics of V3 at MW from two-factor CFA
Table 6.13 Descriptive statistics of ASCI V3.1 for calculus attitude
Table 6.14 V3.1 calculus
Table 6.15 The item loadings for V3.1 Calculus sections
Table 6.16 Factor score for V3.1 calculus sections
Table 6.17 The frequency of most declared majors for students at each course level
Table 6.18 Fit statistics of V3.1 calculus results from two-factor CFA, along with MW
Table 6.19 Descriptive statistics of ASCI V2 for Biochemistry class
Table 6.20 Internal consistency for V2 in Biochemistry class
Table 6.21 The item loadings of V2 for Biochemistry and PL classes
Table 6.22 Factor score and Pearson correlations for PL and Biochemistry class
Table 6.23 Fit statistics of V2 results from two-factor CFA

List of Figures

Figure 1.1 Conceptual framework for attitude
Figure 3.1 Instrument Development Processes
Figure 3.2 Internal consistency and unidimensionality
Figure 3.3 PCA and EFA Model
Figure 3.4 EFA and CFA model
Figure 4.1 Framework for Scale Development
Figure 6.1 Parameter estimate for one-factor and two-factor model

Refinement of a Chemistry Attitude Measure for College Students

Xiaoying Xu

ABSTRACT

This work presents the evaluation and refinement of a chemistry attitude measure, the Attitude toward the Subject of Chemistry Inventory (ASCI), for college students. The original 20-item and revised eight-item versions of ASCI (V1, V2, V3 and V3.1) were administered to different samples. The evaluation of ASCI focused on two psychometric properties: reliability and validity. Both psychological theory and factor analysis techniques were applied to guide instrument refinement. Evidence for good reliability and validity was established for ASCI V2 score interpretation. The two-factor correlated structure was supported by confirmatory factor analysis. The two subscales, intellectual accessibility and emotional satisfaction, are indicators of the cognition and affect components of attitude, and thus connect well with the tripartite attitude framework (cognition, affect and behavior) in psychology. Regarding the attitude-achievement relationship, beyond the well-known aptitude test score, students' attitude played a significant role in predicting final achievement in the General Chemistry I course. This work represents the first time in the area of chemical education that a benchmark for scale development was successfully implemented to reconstruct an existing instrument. In addition to showcasing a method for examining validity, a second important result of this work is to recommend refinements that lead to greater ease of administration while improving validity: this eight-item instrument now takes very little time to administer.

Chapter One

Introduction

Assessment is an essential component of education. Results from assessments serve diverse functions for diagnosis, placement, and prediction. Course teachers rely on diverse tests to obtain information about mastery of content and other contextual variables such as problem solving and creativity. Although assessment itself cannot suggest proper instruction, it provides evidence for teachers to judge how effectively the teaching has worked to accomplish the expected course objectives. Nationwide exams are mostly used to screen large candidate pools. For example, most colleges use SAT or ACT scores for college admissions and GRE scores for graduate admissions. The critical role of educational assessment has produced a burgeoning testing industry, with a market growing from $7 million in 1955 (adjusted to 1998 dollars) to $263 million in 1997, and to somewhere between $400 million and $700 million in 2002 (WGBH Educational Foundation, 2002).

There has been a long and vigorous debate among education stakeholders over how to interpret test scores and create education policy (Sedlacek, 2004). The debate has developed into a national concern since the implementation of the No Child Left Behind (NCLB) Act in 2001. Under this influential education reform, all participating states and school districts receiving Title I funds have been subject to biennial National Assessment of Educational Progress (NAEP) testing in reading and math for the 4th and 8th grades since 2003, as well as to state testing. All schools need to make adequate yearly progress (AYP) so that all students can meet the prescribed proficiency level by the 2013-2014 school year. Supporters of NCLB believe the mandated performance standards and emphasis on test scores can improve the quality of public education for all students. Critics focus on the limitations of the tests, potential misinterpretations of test scores, and the unintended consequences of a testing program. For example, is the NAEP measuring the most important aspects of the curriculum? Unfortunately, the love of learning and creativity is not included in the accountability system. Does the testing system motivate teachers to teach students only how to fill in exam bubbles, or even to cheat? How does one match up individual state tests to the NAEP and performance standards? Is it meaningful to compare two groups of students to measure educational progress and AYP? Is it fair to label the [...] regardless of the effort those schools have made to improve? Overall, does the testing program result in help for disadvantaged students? These issues, and others, are still debated so frequently in the political arena that it is hard to find a discussion of the relevant [...]

While questions such as these will always be difficult to address, a better understanding of educational measurement theory can provide a more nuanced approach to discussions of education policy. The full assortment of measurement theories and practices is beyond the scope of this paper, but the basics are pertinent for all of us who teach chemistry and are aware of the need to improve STEM education outcomes at the national level. For the purpose of assessing both academic achievement and non-cognitive variables such as [...] present one way to proceed with this task.

Why Attitude?

[...] goal of science education. Usually, scientific literacy focuses on the cognitive knowledge [...] applies appropriate science concepts, principles, laws, and theories in interacting with his [...] (Rubba & Anderson, 1978). However, many science educators emphasize that non-cognitive factors such as values and attitudes are important components of science literacy. According to the American Association for the Advancement of Science (AAAS, 1989), [...] spelling out the knowledge, skills, and attitudes all students should [...] curriculum to be considered as promoting scientific literacy. AAAS places attitudes on an equal footing with knowledge and skills.

[...] toward learning science (Custers & Ten Cate, 2002; Dahle, Forsberg, Svanberg-Hard, Wyon, & Hammar, 1997; Gouveiaoliveira, Rodrigues, & Demelo, 1994; Hren et al., 2004; Parke & Coble, 1997). Educational practitioners have developed quality science curricula informed by attitude-related learning theories. Rather than relying on indoctrination, they have made great efforts to develop various innovative programs, which put more emphasis on problem solving, inquiry-based learning, hands-on activities, and computer-aided instruction (Ault, 1998; Freedman, 1997; Laws, Rosborough, & Poodry, 1999; Ozmen, 2008; Paris, Yambor, & Packard, 1998; Romance & Vitale, 1992; Shymansky, Yore, & Anderson, 2004; Steele, Medder, & Turner, 2000). The ideal is a curriculum that supports both gains in content knowledge and positive attitudes toward science.

While most practitioners are comfortable creating their own measures of content knowledge, and [...] likely to have the necessary skills and comfort level to create and evaluate a measure of attitude. They may turn to existing instruments, and either use them directly or with slight adaptations to fit a particular course. Unfortunately, most existing studies of attitude fail to scrutinize the validity of the instruments they have chosen to use, which can raise questions about study results. Conflicting results, in particular, highlight the importance of using well-characterized instruments, so that the instrument itself can be ruled out as a major source of disagreement. For example, an examination of the literature on attitude and chemistry achievement reveals conflicting results (Fowler, 1980; Gutwill, 1998; Lindsay, 2001; Pribyl, Adams, Hoganmiller, & Stevens, 1991; Shibley, Milakofsky, Bender, & Patterson, 2003; Turner & Lindsay, 2003). While some claim a low correlation between attitude and achievement, others claim the two are strongly positively correlated. Is this disagreement an artifact of the way attitude was measured (differently in each case!), or a real difference stemming from the diverse contexts of the studies?

What if, as in the example above, there is no readily available high-quality instrument that has a long history of working well in the relevant context? With much science education research focused on the K-12 environment, it is often difficult to find high-quality instruments that are appropriate for use in the college science classroom. Creating a high-quality instrument is a research project in itself, and one that many faculty would not be interested in undertaking. However, it is possible to take an existing instrument, test it in the desired setting, and quickly optimize it for use. This study seeks to offer an example of carefully modifying an existing attitude instrument to achieve better quality, while reviewing appropriate strategies for any instrument development process: reliability, validity, factor analysis, and the alignment of any subscales with a theoretical framework. These issues must be addressed in deciding whether an assessment can be [...]

The remainder of this chapter will introduce the central concept of attitude in psychology and science education, and create a conceptual framework for this study.

Attitude Concept in Social Psychology

Attitude is one of the most important concepts in social psychology, dating back to the ancient philosophers (Zanna & Rempel, 1988). Attitude has been extensively investigated in the literature. From a behavioral view, attitude is defined [...] readiness to respond, organized through experience, exerting a directive and/or dynamic [...] (Allport, 1935, p. 810). In other words, attitudes are considered as tendencies or predispositions to respond to certain stimuli, and the traditional tripartite model comprises three major types of responses: cognitive, affective, and behavioral, as illustrated in Fig. 1.1 (Rosenberg & Hovland, 1960, p. 3). Given a particular object about [...] knowledge and beliefs about properties of the object (including both favorable and unfavorable judgments). [...] and bad feelings), as expressed via physiological activity or overt communication. [...] as well as intentions to act (again, both positive and negative actions and intentions). The object of attitude can be something material, such as the game of baseball, or something abstract, such as romantic love. The research and application of attitude theory from this view highlight the responses to stimuli (such as punishments or rewards) and the reinforcement of positive behaviors.

Figure 1.1 Conceptual framework for attitude

Since the latter part of the last century, scholars of the cognitive school have competed with the behaviorists and argued that the influence of environmental stimuli on attitude was overemphasized. Instead, attitude is regarded as a node in a dynamic network, as elaborated in the theories of cognitive dissonance, self-perception, value expectancy, self-efficacy, and dissonance reduction. For example, Bandura believed that the attitude concept used in conjunction with motivation to achieve is self-efficacy in his social cognitive theory (Bandura, 1977). He emphasized the interaction between people, behavior, and environments, and that people have the capacity to regulate their motivation, affect, and behavior. The theory of cognitive dissonance assumes that a person knows [...] (Festinger, 1957, p. 9). For [...] Mary is his mother. The theory proposes [...] (Festinger, 1957). Possibly, John will change his attitude and stop smoking. Self-perception theory posits [...] their own attitude, they may infer their attitude based on external cues and their own previous behaviors (Bem, 1967; Bem & McConnell, 1970).

Eagly & Chaiken reviewed attitude-related theory from different perspectives and proposed a working definition. They considered that attitude is [...] (Eagly & Chaiken, 1993, p. 1). This definition is rooted in cognitive theories, because the nature of attitude is evaluation, which is more subjective. In the behaviorist system, the nature of attitude is the tendency to respond to environmental stimuli. Eagly & Chaiken further conceptualized the evaluation of attitude as all classes of evaluative responding, whether overt or covert, or cognitive, affective, or behavioral. Evaluation thus encompasses the evaluative aspects of beliefs and thoughts, feelings and emotions, and intentions and overt behavior (Eagly & Chaiken, 2007, p. 583). This conceptualization has the three necessary components of the tripartite model: cognition, affect, and behavior, as in Fig. 1.1; therefore, they called it a neo-tripartite model. By this approach, the neo-tripartite framework may provide a useful link with previous attitude research and further theory development.

The three-component structure of attitude has been supported by many empirical studies (Bagozzi & Burnkrant, 1979; Breckler, 1984; Eagly & Chaiken, 2007). However, the dimensionality of attitude has been challenged. Some studies apply a unitary model. Petty et al. (1997) defined attitude [...] evaluation of objects along a dimension from positive to [...]. Attitude here is considered unitary rather than multi-dimensional. This hypothesis is supported in some studies because high correlations were observed among the three components (Ostrom, 1969). Zanna disagreed with the unitary model. He (1988) emphasized that cognition and affect are two different components of attitude. A two-component model has been proposed that does not include the behavior component. The behavior component is often questioned because of the weak power of attitude to predict prescribed behaviors (Biddle & Mutrie, 2001; Hagger, Chatzisarantis, & Biddle, 2002). Furthermore, it is very hard to create specific behavior items when complicated patterns of behavior are involved with the objects. The question arises as to which model is more appropriate: unitary, two-component, or three-component (McGuire, 1985)?

One focus of the attitude literature has been on the attitude-behavior relationship and cross-situation behavior consistency. Most results suggest a weak relationship and inconsistent behavior patterns (Berg, 1966; Bray, 1950; Kutner, Wilkins, & Yarrow, 1952; Lapiere, 1934; Palmer, Burwitz, Dyer, & Spray, 2005; Perlman, Bobak, Steptoe, Rose, & Marmot, 2003). A meta-analysis of 30 attitude-behavior studies yielded no or weak relationships between them (Wicker, 1969). The theory of planned behavior (TPB) was proposed to approach the problem for better prediction (Fishbein & Ajzen, 1975). In TPB, the behavior component is excluded from attitude and considered a distinct concept; an intention link is added to mediate between attitude and behavior. The theory uses attitude and other factors to account for intention, and intention to explain behavior. TPB has been applied in the current literature, and the strong correlation between intention and behavior has been supported (Randall & Wolff, 1994; Rivis & Sheeran, 2003). The relationships among these concepts are not the major research purpose of this study, but they can provide insight for estimating the attitude-related nomological network.

Attitude Concept in Science Education

Historically, learning outcomes have been predominantly measured in terms of cognitive factors by standard aptitude tests, grade point average (GPA), logical thinking ability, and spatial ability (Sedlacek, 2004). However, some researchers argue that the common exams may have little predictive validity for the potential of students from various backgrounds and cultures (Sedlacek, 2004; Sternberg & Williams, 1997; Tracey & Sedlacek, 1984, 1987). Noncognitive factors have been considered an important learning outcome as early as the Socratic era. Assessment of noncognitive properties, including attitude, motivation, self-efficacy, and learning strategy, can provide unique information to account for students' academic achievement.

Science education is influenced by attitude-related theory in various ways. The behavioral theories have been applied to learning and instruction, such as the application of TPB in the SER literature (Fishbein & Ajzen, 1975). One teaching approach is called operant conditioning, which focuses on the effect of reinforcement on desirable learning behaviors. One teaching approach from a social cognitive perspective is self-regulated learning (SRL) (Corno & Mandinach, 1983). Since SRL was proposed in the 1980s, it has gained broad popularity in the field of science education research and practice. It emphasizes the interactions of cognitive, motivational, and contextual factors, and focuses on the effect of environments in helping students become self-regulated learners. SRL has been referred to as the theoretical base in many SER articles (Bong, 2001; Bong & Clark, 1999; Cunningham & Mahoney, 2004; Franko et al., 2008; Gao, Lee, & Harrison, 2008; Hsieh & Schallert, 2008; Margolis & McCabe, 2006; Mills, Pajares, & Herron, 2007; Pajares & Graham, 1999; Schunk, 1991; Sins, van Joolingen, Savelsbergh, & van Hout-Wolters, 2008; Walker, Greene, & Mansell, 2006).

The science attitude concept is rarely defined clearly and operationalized consistently, possibly due to its diverse theoretical sources. Researchers started to use attitude in a vague [...] (Park & Burgess, 1924). Attitude toward science (or science attitude) has been defined and dimensionalized in different ways. It may be defined [...] (Gardner, 1975), which puts emphasis on the affective aspect of attitude. Influenced by TPB, [...] or negative feeling about science that serves as [...] (Koballa & Crawley, 1985), which is more cognitive. These definitions view attitude as a unitary concept, whether the component is affective or cognitive. In the education research literature, science attitude has a broader meaning. It can include non-cognitive factors such as self-concept and scientific attitude. Scientific attitude is a different concept, which is about objectivity, open [...] p. 56.

Attitude can be used interchangeably with other terms. For example, the concept of [...] extensively investigated in learning theory and sometimes used as another name for attitude [...] and sustains goal [...] (Brophy, 2004). Jurisevic et al. used three components for intrinsic motivation: the emotional component of interest, the cognitive component of interest, and the challenge component of internal motivation (Jurisevic, Glazar, Pucko, & Devetak, 2008). The components of motivation partially overlap with the cognitive and affective aspects of attitude. Since the motivation estimate in science learning depends more on the specific science course and institution, it will not be part of the framework in this study.

It is worth mentioning that the cognitive component of attitude is different from the cognitive concept. In education, a cognitive test usually means a knowledge test that measures mastery of content knowledge or thinking skills. In contrast, the cognitive component of attitude is the belief about knowledge, value, and habits of thinking, which is the major issue in this study. More attitude-related terms commonly used in SER, with sample test items, are listed in Table 1.1.

In sum, the various terms overlap. The same name can be used to describe different things, and different names can be used for the same thing. Research findings would be more insightful if major concepts were clearly defined and distinguished from other concepts. It is hard to generalize results across studies using instruments based on vague definitions and different conceptualizations.

Table 1.1.1 Definition of attitude-related concepts with possible test items

Concept | Definition and reference | Possible item
Motivation related | [...] (Brophy, 2004) | [...] (Tuan, Chin, & Shieh, 2005)
Meta-cognition | [...] (Brown, 1987) | [...] (Melanie M. Cooper & Santiago Sandi-Urena, 2009)
Self-efficacy | [...] (Bandura, 1982) | [...] (Uzuntiryaki & Aydin, 2009)
Self-concept | [...] (Pintrich & Schunk, 1996) | [...] (Bauer, 2005)

[...] discussions with school friends about [...]

Table 1.1.2 Definition of science attitude and related concepts with possible test items
Concept | Definition and reference | Possible item
Attitudes toward science | [...] (Bauer, 2005) (Gardner, 1975) | Science is fascinating and challenging
Scientific attitudes | [...]mindedness, and questioning atti[...] (Mayer & Richmond, 1982), p. 56 | I believe my calculation results whatever it makes sense or not.
Value of science | [...] solving, science inquiry, thinking, and the relevance of science [...] (Science 1993, NRC 1996) | [...]ing science is important because I can use it [...] (Tuan et al., 2005)
Nature of science | [...] functions, its interactions with society, an[...] (Aikenhead, 1973) | Science can provide absolute truth for everything.

Attitude Concept in This Study

This study focuses on the concept of attitude toward the science subject, chemistry. The investigator is interested in developing an easy-to-use, high-quality chemistry attitude instrument for college students. To accomplish this goal, it is important to establish the theoretical framework for the instrument design. Moreover, the instrument should be able to stand up to empirical evaluation. If the assessment used to collect data is not based on the proposed theory, and follow-up studies provide no evidence of reliability and validity, it is hard to compare results based on different attitude assessments. As Doran and colleagues pointed out:

[T]hat ambiguity of terms and quality of instruments are two serious problems facing those interested in assessing attitudes to science. The lack of a theoretical base has been cited in nearly all cases as a hindrance to assessment. Furthermore, the lack of empirical support for most of the existing instruments has exacerbated the situation (Doran, Lawrenz, & Helgeson, 1994, p. 428).

This study is not based on one specific theory. Instead, the investigator agrees with Eagly et al. in using the cognitive perspective on the attitude concept, because learning is a complicated and interactive process. Derived from Eagly's working definition, science attitude is defined as the psychological tendency through evaluating the subject of science in general with some degree of favor or disfavor. From this definition, the investigator creates a pragmatic model that includes two attitude components from the neo-tripartite model: cognition and affect.

Although the definitions by Gardner and Petty view attitude as a unitary concept, the investigator would argue for retaining a two-component framework in science education. First, educators [...] ice cream has a cognitive component (unhealthy, not part of a balanced diet) and an affective component (yummy!), students often say science is challenging (cognitive) yet interesting (affective). The affective and cognitive components of attitude remain [...] rather than lumping them together to get a single attitude score or simply gathering information regarding one or the other.

The investigator would also argue that, when the science subject in general is the object of attitude, instruments that exclude the behavior component are the best for many research purposes. Although attitude can be inferred from behavior, behavior is not at the same level of abstraction as cognition and emotion. In that sense, concrete items about behavior on an instrument can actually make it harder for respondents to focus on accurate reporting for more abstract items relating to emotions or beliefs. Also, since desirable behavior patterns can vary dramatically across instructional settings, it is hard to

create behavioral items suitable for different situations in order to estimate attitudinal differences. Furthermore, when attitude is treated as an indicator of future behavior, as the Theory of Planned Behavior recommends, excluding behavior from the attitude study is meaningful (Fishbein & Ajzen, 1975). Therefore, the investigator decided to use the two-component framework.

The two-component model, comprising the cognitive and affective aspects, has been supported by confirmatory factor analysis (CFA) in the SER literature (González, 1992; Mohsin, 1990; Oppenheim, 1992). In an instrument developed in 2000 to assess student attitude toward physical education, which came with reliability and validity evidence, the authors found two primary factors, enjoyment and perceived usefulness, which are congruent with the affective and cognitive aspects of the two-component framework (Subramaniam & Silverman, 2000, 2007). In sum, the two-component (cognition and affect) attitude model will be used as a framework to guide this instrument development process.

In the remainder of this thesis, Chapter Two summarizes existing attitude measurements; Chapter Three introduces the technical background on psychometric quality, including reliability and validity estimates; Chapter Four is the method section for this study; Chapters Five and Six present results and discussion from the administration of the original and revised ASCI; and Chapter Seven summarizes the whole study.

Chapter Two

Literature Review

This chapter starts with a review of the literature on typical attitude scales. Previous attempts to measure attitude toward science and chemistry are then discussed.

Typical Attitude Measurement Scales

Appropriate assessments of science attitude are important for diverse research purposes, such as evaluating course effects and developing attitude theory. The concept of attitude cannot be measured directly, at least not as directly as speed can be measured with a stopwatch. Indicators from responses to carefully designed statements are often used to measure abstract constructs indirectly [...] toward science. Because qualitative approaches, including observations and interviews, are time consuming and not practical for classroom use, this study focuses on quantitative methods using psychometric surveys.

Single-indicator attitude surveys are often used, for example: how do you describe your liking of science on a 1-10 scale? A criticism of single-indicator surveys is the lack of a reliability estimate (Straus & Wauchope, 1992). From a psychometric perspective, a composite score based on multiple indicators is better than a single indicator in terms of reliability and validity evidence. According to the organization of test items, there are four typical attitude scales: Thurstone, Likert, Guttman, and semantic differential (Table 2.1). Thurstone and Guttman scales call for answers of agree or disagree, so the responses are dichotomous.

Likert and semantic differential scales ask participants to choose their position, usually on a five- or seven-point scale.

Table 2.1 Description of four typical attitude measurement scales

Scale: Guttman scale (DeVellis, 2003)
Description: A subset of the survey items having binary answers forms a Guttman scale if they can be ranked in some order so that, for a rational respondent, the response pattern can be captured by a single index on that ordered scale. Agreement with item 3 implies agreement with items of lower rank order: 1 and 2.
Sample test item: To what level is the attitude toward immigrants: Are you willing to permit immigrants to live (least extreme) 1. in your country? 2. in your community? 3. in your neighborhood? 4. next door to you? (most extreme) 5. in your family (marry your child)? (Answer Y/N to each one.)

Scale: Thurstone scale (Thurstone, 1928)
Description: The statements, each followed by a binary answer (e.g., agree or disagree), should constitute a nearly evenly graduated series of scale values (e.g., from very necessary to useless).
Sample test item: Happiness has ----- to do with achieving educational or material goal. 1. nothing 2. something 3. a lot (Answer Y/N to each one.)

Scale: Likert scale (Likert, 1932)
Description: Respondents specify their level of agreement with a declarative statement (for example, from strongly disagree to strongly agree on a 5-point Likert scale).
Sample test item: Science knowledge is important in my daily life. 1. strongly disagree 2. disagree 3. neutral 4. agree 5. strongly agree (Pick one of the five options.)

Scale: Semantic differential scale (Snider & Osgood, 1969)
Description: The respondent is asked to choose where his or her position lies on a scale between bipolar adjectives.
Sample test item: Learning chemistry is: Valuable ... Worthless (Pick one of the seven positions.)

Quality Evaluation for Existing Science Attitude Instruments

Many science attitude instruments exist.
Fortunately, meta-analysis can be used to systematically evaluate the quality of these instruments, and published meta-analyses, where they exist, are a tremendous help for instrument selection. For example, Munby (1983) reviewed 56 instruments for their psychometric and conceptual evidence and recommended seven instruments. However, his selection criteria did not put much weight on validity and item content issues. More recently, Blalock et al. developed a scoring rubric to rate the 66 instruments published in 150 peer-reviewed articles during 1935-2003 (Blalock et al., 2008). The rating rubric includes five major categories:
1) Theoretical background for instrument development (3 pts)

2) Reliability: internal consistency, test-retest, and standard error of measurement (9 pts)
3) Validity: content, discriminant, congruent, contrasting groups, and factor analyses (9 pts)
4) Dimensionality (6 pts)
5) Development and usage (1 pt)

This rubric covers most aspects of psychometric properties. A high-quality instrument should hold to current measurement standards and score high on a rubric like this. However, a high score on this rubric does not necessarily guarantee a high-quality instrument, because Blalock's rubric is relatively lenient rather than harsh. Some important issues were not accounted for in his rating rubric, such as the influence of missing data, descriptive analysis, and normality checks before performing factor analysis. Moreover, there are different factor analysis techniques that serve different research purposes and should be used consistently. At this time, the investigator uses Blalock's rubric to evaluate the existing attitude instruments.

The average score for the 66 instruments was reported to be 9.5 out of 28. Scores ranged from 0 to 22, with a standard deviation of 4.9. Twenty-four instruments scored 7 or below. At the higher end, only two instruments scored higher than 20 points. Blalock's conclusion is that the general quality of the instruments is quite low. Most of them lack key psychometric elements, including an explicit theoretical base and reported reliability and validity evidence. Researchers have created, and continue to create, new attitude instruments of their own instead of revising and improving an existing one, which contributes to the overall poor psychometric quality. Most instruments have never been used in follow-up studies. When an old one is used, very little attention is paid to its quality. Even the most extensively used one, Scientific Attitude Inventory (only scored 10 out of

28), which was even criticized in its revised form (Lichtenstein et al., 2008). Although psychometric properties have long been valued, science attitude instruments have not shown an obvious pattern of quality improvement over the years.

Various reasons may account for this unsatisfactory situation. One is that test validation is a long and complicated process, which always involves mixed research methods and crosses multiple disciplines such as psychology, psychometrics, learning theory, inferential statistics, pedagogy, and subject content knowledge. Another reason is that in most graduate programs students are poorly trained in measurement-related fields, and there is a shortage of measurement professionals to take responsibility for test score validation. [...] sociobehavioral sciences, especially doctoral programs, require a modicum of exposure to statistics and research design, few seem to require the same where measurement is [...]

Proper description and citation of the theoretical background is the first and an important step in guiding instrument development, because it is the starting point for clearly defining what kind of attitude concept the instrument is really intended to measure and the dimensionality of the concept. Developers have tended to bring additional related constructs into an attitude questionnaire without careful definition and discrimination, such as self-concept, self-efficacy, interest, learning strategy, motivation, behavior of effort, learning goals, value of science, and so on. The survey itself then becomes a network of related constructs rather than components of one major construct (more descriptions of these terms are in Table 1.1). The intension and dimensionality of the concept should be consistently applied to create the pilot item pool. If an instrument has more than one

dimension, it is problematic to use it as if it were one-dimensional by using a single composite score (Neuberg, West, Judice, & Thompson, 1997).

The literature review revealed more instrument development problems. It is desirable to have items designed to load conceptually on one thing and not on others (Netemeyer, Bearden, & Sharma). If concepts are not discriminated from one another, it is hard to create items and to make interpretations using sum scores. For example, the item [...] could be an indicator of constructs like self-confidence, self-concept, self-efficacy, self-esteem, or self-respect on different instruments. However, an item like this is not appropriate for use with different concepts on one instrument. Moreover, it is improper to add items representing different objects together to create one [...] Science attitude surveys tend to have items representing various attitude objects, including classroom environment, labs, test anxiety, teachers, and scientists, which seems inclusive and comprehensive. In this case, it is hard to compare the composite score across situations with different backgrounds, because students may have different understandings of specific objects like labs or teachers. The importance of a single attitude object is illustrated as follows:

An attitude instrument yields a score. If this score is to be meaningful, it should [...] defined continuum. For this to happen, the items within the scale must all be related to a single attitude object. A disparate collection of items, reflecting attitude towards a wide variety of attitude object, does not constitute a scale, and cannot yield a meaningful score (Gardner, 1975: 12).

Psychometric evidence should be reported when instruments are used, and the strategies for statistical analysis should be carefully selected to serve the research purpose. Unfortunately, that is very rare. Methodological problems have been found in science or chemistry attitude measurements. Most articles never justify their strategies and just [...]ferent strategies, or overuse the default options provided by the statistical software, and hence cannot stand up to criticism (Cronkhite & Liska, 1980). For example, test-retest reliability requires two administrations of the same instrument to the same people after a period of time, and should not be used for different test forms or different samples. Another issue is the internal consistency [...] unidimensionality and the use of the total score (Gardner, 1995). This means that Cronbach's alpha and factor scores should be reported for each scale rather than for the overall instrument [...]d CFA rather than EFA, why PCA, why oblique rotation rather than orthogonal rotation, how the number of factors to extract was decided, and so on. Furthermore, when factor analysis yields a model with item loading patterns different from the proposed model, usually no explanations or suggestions are given. For these various reasons, the average score for the instruments overall is so low.

To make findings based on instruments more acceptable and generalizable for other scientists and practitioners, it is important to establish evidence for the responses each time the instrument is used. The techniques for evaluating psychometric quality, such as factor analysis, are discussed in depth in Chapter Three.
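As an illustration of why Cronbach's alpha is better reported per scale than for a whole instrument, the minimal Python sketch below computes alpha as K/(K-1) times (1 minus the sum of the item variances divided by the variance of the total score). Everything in it is an assumption made for illustration: the function name cronbach_alpha, the simulated responses, and the choice of nine items driven by three uncorrelated factors. None of it is data or code from this study.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items array of scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                               # number of items
    item_var_sum = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of total score
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


# Hypothetical data: 2000 simulated respondents, nine items.
# Items 0-2, 3-5, and 6-8 each follow one of three uncorrelated factors.
rng = np.random.default_rng(0)
n_respondents = 2000
factors = rng.normal(size=(n_respondents, 3))
noise = rng.normal(scale=0.4, size=(n_respondents, 9))
items = np.repeat(factors, 3, axis=1) + noise

alpha_total = cronbach_alpha(items)
alpha_subscales = [cronbach_alpha(items[:, 3 * j:3 * j + 3]) for j in range(3)]

print(f"alpha, all nine items: {alpha_total:.2f}")
for j, a in enumerate(alpha_subscales, start=1):
    print(f"alpha, subscale {j}:    {a:.2f}")
```

With data of this kind, alpha for all nine items can look acceptable even though the items form three unrelated groups, while the per-subscale values are noticeably higher. A single alpha for the whole instrument therefore says little about whether a total score is meaningful.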

Quality Evaluation for Existing Chemistry Attitude Instruments

The question arises: is the situation for chemistry-specific instruments similar to that for science attitude instruments? Several chemistry attitude instruments have been developed and reported specifically (Barbera, Adams, Wieman, & Perkins, 2008; Bauer, 2005, 2008; Leavers, 1975; S. E. Lewis, Shaw, Heitz, & Webster, 2009; Walczak & Walczak, 2009). They ask students to answer questions on different attitude-related constructs, such as attitude towards science, self-concept, metacognition, and cognitive expectations toward chemistry. When Blalock's rubric is applied to these reports (Table 2.2), the overall psychometric quality is better than for the science attitude measurements but still unsatisfactory. The only two instruments for which detailed factor analysis and factor scores were reported were the Attitude toward the Subject of Chemistry Inventory (ASCI) (Bauer, 2008) and the Chemistry Self-concept Inventory (SCI) (Bauer, 2005). SCI has excellent evidence for reliability and validity, scoring the highest of all the instruments, but it is aligned with a particular sub-construct of attitude rather than with the general attitude concept discussed above as our target. ASCI, which is more aligned with the general attitude concept, scored the second highest, with 18 out of 28 points, well above the average of the 66 science attitude instruments but still lower than desirable. In addition, the ASCI subscales were not captured well via factor analysis, indicating a potential problem with the underlying fit between the instrument and its theoretical framework.
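The rubric totals discussed above can be recomputed from the category scores reported in Table 2.2. The short Python sketch below does this as a consistency check. The dictionary layout and the variable names are illustrative assumptions; the numbers are copied from Table 2.2 and from the rubric category maxima listed earlier in this chapter.

```python
# Category maxima of the rubric as listed in this chapter:
# theory, reliability, validity, dimensionality, development and usage.
RUBRIC_MAX = {"theory": 3, "reliability": 9, "validity": 9,
              "dimensionality": 6, "usage": 1}

# Category scores copied from Table 2.2, in the order of RUBRIC_MAX.
table_2_2 = {
    "Opinion survey": (0, 0, 0, 0, 0),
    "Attitudes toward Laboratories": (3, 0, 3, 0, 0),
    "VOSTS": (3, 0, 6, 0, 1),
    "CHEMX": (3, 3, 9, 0, 1),
    "CLASS": (3, 3, 9, 0, 1),
    "MCAI": (3, 3, 9, 0, 1),
    "ASCI": (3, 2, 9, 3, 1),
    "Self-Concept Inventory": (3, 5, 9, 6, 1),
}

max_total = sum(RUBRIC_MAX.values())
totals = {name: sum(scores) for name, scores in table_2_2.items()}
ranked = sorted(totals, key=totals.get, reverse=True)   # highest total first
mean_total = sum(totals.values()) / len(totals)

for name in ranked:
    print(f"{name:30s} {totals[name]:2d} / {max_total}")
print(f"mean of the eight chemistry instruments: {mean_total:.2f}")
```

The mean of these eight totals can be set beside the average of 9.5 reported for the 66 science attitude instruments, which is the comparison made in the text.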

Table 2.2 Scores for chemistry attitude instruments on Blalock's rubric
Survey | Study population | Theoretical base | Reliability (a) | Validity (b) | Dimensionality | Development and usage | Score
Highest possible points | | 3 | 9 | 9 | 6 | 1 | 28
Opinion survey (Leavers, 1975) | n=130, college chemistry class students (for nonscientist) | 0 | 0 | 0 | 0 | 0 | 0
Attitudes toward Laboratories (Chatterjee, Williamson, McCann, & Peck, 2009) | n=703, college chemistry laboratory students | 3 | 0 | 3 | 0 | 0 | 6
VOSTS (Walczak & Walczak, 2009) | n=37, college chemistry students in a non science [...] | 3 | 0 | 6 | 0 | 1 | 10
CHEMX (Grove & Bretz, 2007) | n=157, college chemistry faculty | 3 | 3 | 9 | 0 | 1 | 16
CLASS (Barbera et al., 2008) | n>500, college students in general chemistry | 3 | 3 | 9 | 0 | 1 | 16
MCAI (M. M. Cooper & S. Sandi-Urena, 2009) | N=537, college students, intact class in chemistry lab | 3 | 3 | 9 | 0 | 1 | 16
ASCI (Bauer, 2008) | n=379, college chemistry lab | 3 | 2 | 9 | 3 | 1 | 18
Self-Concept Inventory (Bauer, 2005) | N=379, students in a non [...] college chemistry course | 3 | 5 | 9 | 6 | 1 | 24
(a) Internal consistency, test-retest, and standard error of measurement
(b) Content, discriminant, congruent, contrasting groups, FA

The investigator found that the Attitude toward the Subject of Chemistry Inventory (ASCI) (Bauer, 2008) was a good candidate for this study for several reasons. Overall, the ASCI is a high-quality instrument, and it is informative for other teachers' implementation in the chemistry classroom. Firstly, the test purpose is clearly stated: to measure attitude toward chemistry in general, not toward a specific course or instructor. The attitude concept was properly defined and distinguished from other variables like belief, self-concept, and scientific habits. Secondly, it uses a 7-point semantic differential scale between two [...] One sample item asks students to rate their position from exciting to boring. It is the

easiest format to answer and intuitive to interpret. Twenty pairs of adjectives were generated after a cognitive interview. Furthermore, ASCI provided acceptable internal consistency and test-retest reliability. Construct validity was supported by correlation of attitude scores with other concepts, contrasting groups, and discriminant and factor analysis. The items were grouped into five subscales according to the factor analysis results: interest and utility (five items), fear (one item), emotional satisfaction (four items), intellectual accessibility (five items), and anxiety producing (four items). Finally, ASCI comes with guidelines for administration, scoring, and score interpretation for instructors and other researchers.

However, there is room for improvement. The factors emerging from factor analysis based on the 379 responses from the first semester of general chemistry laboratories did not reflect the subscales the authors claimed. In fact, only three factors emerged from the factor analysis. The Fear subscale has only one item, which did not load well on other factors, so it cannot be considered a factor. The Emotional Satisfaction items load on [...] Interest and Utility subscale possibly contains two different concepts. Because procedures like confirmatory factor analysis (CFA) were not performed, there is no report of measurement error.

Another reason for choosing ASCI is that its attitude object is chemistry in general, rather than a specific curriculum or lab. This study aims to develop an attitude measure that can yield reliable and valid scores for college students' attitude toward chemistry in general, so that it fits different universities and curricula. If items are

about specific contexts like cooperative learning for college chemistry, students could understand them differently depending on their course settings, which would lead to different item loading patterns for different populations. By using chemistry in general, the investigator expected that the assessment would remain valid for diverse college chemistry course situations.

The study of the Attitude Toward Science in School Assessment (ATSSA) (Germann, 1988) inspired us to use factor analysis to refine an instrument. In its pilot test, ATSSA included 24 items, which loaded on 5 factors in exploratory factor analysis (EFA). The developer concluded that the first factor, comprising 14 items, was consistent with the desired construct, but the other 10 items dealt with more specific objects like testing and labs. In a follow-up administration of a shortened version of the instrument, the one-component structure was supported, and this became the final version of the instrument. In addition to proposing and administering a revised version of ASCI, we will be able to use confirmatory factor analysis to obtain an estimate of the model fit and information about sources of measurement error, which was not easy to do at the time ATSSA was developed.

In this study, because the investigator wanted to capture two subscales (to align with two components of the general theory of the attitude concept), the original ASCI and a revised ASCI (V2), comprising items from the original 20-item ASCI, were proposed for college chemistry students. It is expected that the revised ASCI will yield responses with a two-component structure that maps onto our two-component theoretical framework, and thus improve construct validity. The technical background for scale reconstruction and

quality evaluation, however, deserves further comment and will be discussed in the next chapter. This study attempts to address three specific research questions.

Research Questions

Question 1: What is the psychometric quality of the responses from ASCI in terms of reliability and validity?
Question 2: How can ASCI be modified based on factor analysis and the theoretical framework?
Question 3: What is the psychometric quality of the responses from the revised ASCI versions in terms of reliability and validity?

Chapter Three

Methodological Background

This chapter presents the methodological background on psychometric quality that is needed to understand the following chapters. It starts with the proposed instrument development processes and then elaborates on the two major issues for instrument development: reliability and validity. Several factor analysis techniques are then discussed.

Proposed Instrument Development Processes

The overall instrument development process is illustrated in Fig. 3.1. The first stage (left part) involves the qualitative process. It includes the identification of the test purpose: what is the concept the test is designed to measure? The construct should be clearly defined, along with its content domain and dimensionality. After the test blueprint or table of specifications is created, an initial item pool is generated. An expert and a layperson are asked to review the items to identify and discuss any content issues related to them: What is the scale of measurement for these items? What is your reaction to the analyses using this scale in terms of content validity and face validity? The items are then revised according to the feedback to produce the pilot test.

The second stage (right part) mainly involves the quantitative psychometric process. The pilot test is given to representative samples to examine the responses in terms of reliability, validity, utility, and practicality. Items associated with responses of poor psychometric quality are revised or deleted. Studies continue until the revised test is functioning well. The last step is to develop

guidelines for administration, scoring, and interpreting the scores for further study and other researchers.

Figure 3.1 Instrument development processes ( )

As discussed in the previous chapter, ASCI did a good job in terms of the qualitative part. Starting from ASCI has the advantage of skipping the most time-consuming part of the first stage. The second stage, the quantitative part, especially reliability and validity, is the key point of this study and is discussed in more detail below.

Reliability

The most important issues for a [...] the reliability of the scores (Is it providing consistent results?) and validity (Is it measuring what the developer intended to measure?). The conception of reliability dates back to 1904 (Spearman, 1904). Spearman first proposed the concept and the ways to evaluate the degree of score reliability. Reliability is important because no score can be perfectly reliable, even in the natural sciences. [...]e of measurement errors in

(Pedhazur, 1997, p. 172). If a measurement yields totally random scores, the scores are not reliable and cannot measure anything. Scores with poor reliability cannot be valid, and findings based on unreliably measured scores are questionable. Wilkinson and the APA Task Force on Statistical Inference emphasized the importance of reporting the reliability estimate:

[...] should provide reliability coefficients of the scores for the data being analyzed even when the focus of their research is not psychometric. ... Interpreting the size of observed effects requires an assessment of the reliability of the scores. (Wilkinson, 1999, p. 596)

Reliability estimation has different angles, such as test-retest reliability and internal consistency. Test-retest reliability means that the same individual should score similarly on one instrument at different times if other environmental factors are controlled. However, people usually take the instrument only once, so test-retest reliability is hard to estimate. Another aspect of reliability is internal consistency, which can be [...] reliability estimate because of its convenience of single administration. It is calculated using Equation 3.1:

$$\alpha = \frac{K}{K-1}\left[1 - \frac{\sum \sigma_k^2}{\sigma_{total}^2}\right] \qquad \text{(Equation 3.1)}$$

where $K$ is the number of items, $\sum \sigma_k^2$ is the sum of the $K$ individual item score variances, and $\sigma_{total}^2$ is the variance of the total score. Alpha ranges from negative

infinity to +1 mathematically. The closer it is to +1, the more reliable the total score is. When it is close to +1, the item scores are positively and linearly correlated with each other and with the total score. A value greater than .7 is considered acceptable reliability (Nunnally, 1978). If alpha is well below 0, the items may be perfectly correlated with each other but with mixed signs (a mix of positive and negative relations), which makes alpha low. If alpha is close to 0, the item scores are uncorrelated.

Validity

Reliability is a necessary part of instrument validity but is not sufficient (Moss, 1994). If the scores are not reliable, they cannot be considered valid unless the instrument is designed to measure nothing (Bruce Thompson, 2003). For example, if I step on a scale three times in a day and it reads 0, 100, and 200, I cannot be sure what it measures. Even when a good internal consistency estimate is achieved, it is not sufficient for the validity of one construct and the use of a total score. A good example was presented by Gardner (1995), as shown in Fig. 3.2. Assume we have nine items in the instrument. In situation 1, the items produce random, uncorrelated results, suggesting poor internal consistency and poor dimensionality. In situation 3, all items are highly correlated with each other, which indicates good internal consistency and one dimension. In situation 2, the nine items are divided into three distinct groups. In this situation, overall internal consistency is good and items within groups are correlated. Since the three factors are distinct and mathematically uncorrelated, it is better to report a factor score on each scale rather than a total score. It is not meaningful to add an apple, a hotdog, and a notebook together. When the test has three dimensions, as in situation 2, the validity of using a single

composite score is not established. Instead, internal consistency and factor scores should be reported for each scale rather than for the whole instrument.

Figure 3.2 Internal consistency and unidimensionality (Gardner, 1995)

Validity is about the degree to which scores from a measurement measure what it purports to measure. Score interpretation will affect subsequent decisions and actions, which is also a validity concern. For example, if I stand on the scale and it reads 200, does that mean my IQ is 200? Does that mean I am a smart student? Shall I celebrate? Here is a representative selection of more elaborate definitions and standards for validity.

[Validity is] the degree to which inferences from the results of the assessment of the knowledge, skills, attitudes, and behaviors of the evaluatee are trustworthy (The Joint Committee on Standards for Educational Evaluation, 2007, p. 179)

The key validity issues in future assessment are the same as they have always [...] empirical evidence and theoretical rationale supports the adequacy and

appropriateness of inference and actions based on test scores or other models of assessment (S. Messick, 1989)

[Validity refers to] [...] interpretations of test scores entailed by proposed uses of tests. Validity is, therefore, the most fundamental consideration in developing and evaluating tests. The process of validation involves accumulating evidence to provide a sound scientific basis for the proposed score interpretations. It is the interpretations of test scores required by proposed uses that are evaluated, not the test itself. When test scores are used or interpreted in more than one way, each intended [...] (AERA, APA, & NCME, 1999, p. 9)

Validation speaks to a diverse and potentially critical audience; the arguments [...] Stro[...] whether a practice has appropriate consequences for individuals and institutions, an[...] especially to guard against adverse consequences. (L. J. Cronbach, 1988)

Common features run through these validity definitions. Validity is a property of test scores rather than of the test itself (AERA, APA, & NCME, 1999, p. 9). Both empirical evidence and theoretical rationale need to be considered in terms of validity (S. Messick, 1989). Psychometric evaluation of an instrument score can verify the theory related to the construct. They suggest modification of the theory underlying the construct when necessary. They indicate modification of the instrument for better psychometric quality. It is worth mentioning that test scores can be valid for one purpose


but not for another. Validity is also the process of justifying the inferences based on test scores. Cronbach further emphasizes plausibility against rival hypotheses (L. J. Cronbach, 1988).

Various aspects of validity are discussed in the literature, such as content validity, factorial validity, convergent validity, and criterion-related validity. Content validity involves the development and review of the instrument's content by a panel of subject matter experts for clarity and representativeness of the construct domain. This is a fundamental property of a test, but it is arbitrary and does not depend on the test scores. Factorial validity refers to the loadings of test item scores on the assumed factors from internal structure analysis, and thus provides strong evidence of validity. Convergent validity means the results for one specific construct obtained from different methods should be similar. Discriminant validity means the results for different constructs should be different. Criterion-related validity includes predictive validity (the measurement is used to predict later performance) and concurrent validity (correlation with other related constructs measured at the same time).

A nomological network pertains to the empirical evidence used to build lawful sets of relationships between the measured construct and other theoretical concepts or observable properties in order to make a case for the existence of the construct (Lee J. Cronbach & Meehl, 1955). They used the word "law" to describe the expected relationships among observable properties or theoretical constructs. They proposed that to "make clear what something is" scientifically means to set forth the laws in which it occurs. My understanding is that a robust nomological network grounds a measured construct in a web of proposed relationships, the strengths and directions of which, as determined empirically,


match theoretical predictions. Accordingly, correlational and experimental analyses are the most used techniques to support nomological network building.

All aspects of validity can fall under one overarching category: construct validity. [...] the theory of the construct it is meant to measure. More accurately, it reflects a verification of the inferences and interpretations to be drawn from the test scores and a corresponding modification (if so indicated) of the instrument and/or the theory (Angoff, 1988). [...] unifying concept of validity that integrates criterion and content considerations into a common framework for testing rational hypotheses about theoretically relevant [...] (Samuel Messick, 1980).

Cronbach and Meehl first proposed the [...] factor [...] out profitable ways of dividing the construct (Lee J. Cronbach & Meehl, 1955). The Standards for Educational and Psychological Tests (SEPT) recommended reporting the internal structure of test scores, including dimensionality, which can be supported by statistical techniques such as factor analysis:

If the rationale for a test use or interpretation depends on premises about the relationships among parts of the test, evidence concerning the internal structure of [...] example, that a test is essentially unidimensional. Such a claim could be supported by a multivariate statistical analysis, such as factor analysis, showing that the score variability


attributable to one major dimension was much greater than score variability attributable to any other identified dimension. When a test provides more than one score, the interrelationships of those scores should be shown to be consistent with the construct(s) being assessed. (Standard 1.11)

Despite endorsements from eminent scientists like Cronbach and from the SEPT standard mentioned above, factor analysis has encountered criticism, as Cronkhite and Liska noted:

Apparently, it is so easy to find semantic scales which seem relevant to [information] source, so easy to name or describe potential/hypothetical sources, so easy to capture college students to use the scales to relate the sources, so easy to [...] research assistant returns with the computer printout, and so rewarding to have a guaranteed publication with no fear of nonsignificant results that researchers, once exposed to the pleasures of the factor analytic approach, rapidly become addicted to it. (Cronkhite & Liska, 1980, p. 102)

It is important to keep these criticisms in mind when factor analysis is used for data analysis and interpretation. The factor analytic approach should not be treated as inherently superior to other statistical approaches, and it should not be privileged for publication without careful judgment. To make good use of factor analysis, the research design and analysis strategy should be carefully selected and justified, rather than trying different strategies and reporting only the most desirable results. Several methodological problems were mentioned in the previous chapter. In the following sections, the factor analysis approach is discussed in more detail.


Factor Analysis

Factor analysis is widely used to analyze scores from an instrument. It seeks to discover simple patterns in the relationships among the observed variables. For example, if an instrument has 20 items grouped into 2 scales and the items within each scale are highly correlated, it is reasonable to report 2 factor scores, each summing the items in its group, rather than all 20 items. This makes the interpretation simple and easy.

Sample Size

Factor analysis usually requires a large sample size. However, it is not clear how large is large enough. Some rules of thumb are available (Hogarty, Hines, Kromrey, Ferron, & Mumford, 2005; MacCallum, Widaman, Preacher, & Hong, 2001; Marsh & Balla, 1994; Marsh, Balla, & McDonald, 1988; Shevlin & Miles, 1998).

Unfortunately, there are serious drawbacks to such guidelines. One problem is that these recommendations vary dramatically. For instance, Gorsuch (1983) suggested a ratio of five participants per measured variable and that the sample size never be less than 100. In contrast, Nunnally (1978) and Everitt (1975) proposed ratios of 10 to 1. More importantly, recent research has suggested that such guidelines are not sufficiently sensitive to a variety of important characteristics of the data (MacCallum et al., 1999; Velicer & Fava, 1998). The primary limitation of such guidelines is that adequate sample size is not a function of the number of measured variables per se but is instead influenced by the extent to which factors are overdetermined and the level of the communalities of the measured variables. When each common factor is overdetermined (i.e., at least three or four measured variables represent each common factor) and the communalities are high (i.e., an average of .70 or higher), accurate estimates of population parameters can be obtained with samples as small as 100


(MacCallum et al., 1999). However, under more moderate conditions a sample size of at least 200 might be needed; when these conditions are poor it is possible that samples as large as 400 to 800 might not be sufficient (Fabrigar, Wegener, MacCallum, & Strahan, 1999)

EFA or PCA

EFA and PCA (principal component analysis) are similar eigenvector-based multivariate analysis methods that differ in the direction of regression. The EFA and PCA models are illustrated in Fig. 3.3 (from Ferron's statistics class notes). EFA is used when a researcher wishes to identify a set of latent constructs underlying a battery of measured variables. The primary purpose of EFA is to arrive at a more parsimonious conceptual understanding of a set of measured variables. Before using EFA, a researcher should carefully consider whether this is a goal of the research project (Fabrigar et al., 1999). In PCA, the components are combinations of all items, and they do not reflect latent variables as factors do in EFA. Therefore, there are clear conceptual distinctions between PCA and EFA. Although these approaches often produce similar results, this is not true in certain contexts. When the goal of the analysis is to identify latent constructs underlying measured variables, it is more sensible to use EFA than PCA (Cattell, 1978; Gorsuch, 1983; McDonald, 1985; Mulaik, 1972; Fabrigar et al., 1999).
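The ratio-based rules of thumb quoted in the Sample Size discussion can be compared directly. The short Python sketch below is illustrative only: the function names are hypothetical, and it encodes just the two ratio rules cited in the text (the 5:1 ratio with a floor of 100 attributed to Gorsuch, and the 10:1 ratio attributed to Nunnally and Everitt). It does not capture the communality and overdetermination conditions that the quoted passage argues matter more.

```python
def gorsuch_minimum_n(n_items, ratio=5, floor=100):
    """Minimum N under the 5-participants-per-variable rule, never below 100."""
    return max(ratio * n_items, floor)


def ratio_minimum_n(n_items, ratio=10):
    """Minimum N under a plain participants-per-variable ratio (10:1 here)."""
    return ratio * n_items


# Item counts taken from the text: 20 items (ASCI V1) and 8 items (ASCI V2).
for n_items in (20, 8):
    print(n_items, gorsuch_minimum_n(n_items), ratio_minimum_n(n_items))
```

For a 20-item instrument the two rules give 100 and 200 respondents, a twofold disagreement, which illustrates why such guidelines are hard to rely on by themselves.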


Figure 3.3 PCA and EFA model (panels: PCA model; EFA model)

EFA or CFA

While both EFA and CFA provide parameter estimates including item loadings, CFA offers advantages over EFA at the instrument validation stage. It is even more appropriate when the instrument is designed based on a theoretical and conceptual model. In this situation, the preference for CFA is endorsed by measurement professionals, as embodied in statements that [...][ing] on construct validation argues that the most rigorous test of factorial structu[...] (Greenbaum & Dedrick, 1998) and that CFA is a commonly accepted method to test/confirm dimensionality (Netemeyer et al., 2003). CFA can provide fit indexes to examine the overall goodness of fit of a model, which EFA cannot. It also offers information about the sources of measurement error and about how the model fit can be improved by modifying the model.

EFA and CFA models are illustrated in Fig. 3.4. For the EFA, all items are allowed to load on all [...] weight (or standardized regression coefficient) of each item on its factors. It is given in the factor pattern coefficient (matrix P). The single correlation coefficient between the item


[...] sets of coefficients are numerically identical for orthogonal rotation, when the factor correlation is 0 (so the factor correlation matrix R is an identity matrix), but different for oblique rotation. Both matrices P and S are expected for EPM (Educational and Psychological Measurement) manuscripts (B. Thompson & Daniel, 1996). For CFA, the term "loading" is used to refer to either matrix P or S, since in CFA items are set to load on one factor only and there is only one set of coefficients. When we use the term item loading, we always mean the correlation coefficient for the item on its specific factor.

Figure 3.4 EFA and CFA model (panels: EFA model; CFA model)

The Number of Factors

As to how many factors should be extracted in EFA, there are many rules of thumb. The most used rule is to retain the factors with eigenvalues greater than one, because it is the default option in most statistics packages like Mplus (Muthen & Muthen, 2007). However, this rule is hard to defend, especially when it accepts a factor with an eigenvalue of 1.01 but rejects one with an eigenvalue of .99. Other approaches, such as bootstrap (Thompson, 1994) and parallel analysis, have considerable merit but are underused. It is safest to check whether different approaches lead to the same solution.
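To make the comparison between the eigenvalue-greater-than-one rule and parallel analysis concrete, here is a minimal Python sketch using only the standard library. It is not the procedure run in SPSS or Mplus for this study: the simulated data set, the function names, the number of random replications, and the 95th-percentile threshold are all assumptions chosen for illustration. Parallel analysis retains factors only while the observed eigenvalue exceeds the corresponding eigenvalue obtained from random, uncorrelated data of the same size.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlations; rows are respondents, columns are items."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    ss = [sum(row[j] ** 2 for row in centered) for j in range(p)]
    return [[sum(row[a] * row[b] for row in centered) / math.sqrt(ss[a] * ss[b])
             for b in range(p)] for a in range(p)]


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-12):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations (descending)."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if a[i][j] == 0.0:
                    continue
                # Rotation angle that zeroes the (i, j) entry.
                theta = 0.5 * math.atan2(2.0 * a[i][j], a[i][i] - a[j][j])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki + s * akj, -s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik + s * ajk, -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def parallel_analysis(data, n_sims=50, quantile=0.95, seed=0):
    """Return observed eigenvalues, random-data thresholds, and factors retained."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    observed = symmetric_eigenvalues(correlation_matrix(data))
    sims = []
    for _ in range(n_sims):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
        sims.append(symmetric_eigenvalues(correlation_matrix(noise)))
    thresholds = []
    for k in range(p):
        kth = sorted(sim[k] for sim in sims)
        thresholds.append(kth[min(n_sims - 1, int(quantile * n_sims))])
    retained = 0
    for obs, thr in zip(observed, thresholds):
        if obs <= thr:
            break
        retained += 1
    return observed, thresholds, retained


def simulate_two_factor_data(n=200, items_per_factor=3, seed=1):
    """Hypothetical data: two uncorrelated factors, each driving three items."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append([f + 0.6 * rng.gauss(0, 1)
                     for f in (f1, f2) for _ in range(items_per_factor)])
    return rows


data = simulate_two_factor_data()
observed, thresholds, retained = parallel_analysis(data)
kaiser = sum(1 for ev in observed if ev > 1.0)
print("observed eigenvalues:", [round(ev, 2) for ev in observed])
print("Kaiser rule:", kaiser, "parallel analysis:", retained)
```

On this clean simulated structure both rules agree on two factors. The two can diverge on real data, where an eigenvalue may sit just above or below 1 while the random-data threshold for that position is noticeably higher.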


Summary

It is clear that reliability and validity estimates are necessary to support statistical inferences based on test scores. However, reports of reliability and the establishment of validity are not frequently seen in the literature. The most used factor analysis strategies for instrument development in science education are PCA or EFA, even when the instrument has a clear design of subscales and CFA is the most appropriate. At present, CFA has never been reported for the attitude-related instruments published in JCE. Chapter Four will report the methods for all administrations of the different versions of ASCI.


Chapter Four
Method

The different versions of ASCI were administered five times in this study during the spring and summer semesters of 2009. First, the original ASCI (V1) was administered. Then three revised versions of ASCI were created and given to four different populations. The responses from all administrations were evaluated in terms of psychometric properties and compared with literature results where appropriate. The evaluations had two main foci: reliability and validity.

This method chapter is organized in four parts. The first is a summary of the general research methods that are common to all administrations. The second is the scale reconstruction, i.e., using results from ASCI V1 to create meaningful subscales aligned with the general theory of the attitude concept. The third is the gathering of additional validity evidence for ASCI V2 scores, including the discussion of a nomological network and of predictive validity. The last is the analysis specific to the calculus course, such as the Chi-square test.

General Research Methods

Instruments

For the first administration, ASCI V1 (Bauer, 2008) was used. As discussed previously, it has twenty pairs of adjectives grouped into [...] status towards chemistry in general. For the second administration, the revised ASCI (V2) was used. ASCI V2 (Appendix) is a short form of V1. It has eight pairs of adjectives


organized in two subscales. ASCI V3 has the same eight items as V2; the only difference is the order of items #2 (simple vs complicated) and #8 (chaotic vs organized). V3.1 is the same as V3, [...] calculus.

Participants and Data Collection

The participants for the five administrations are summarized in Table 4.1, organized by time and version order. ASCI V1 was given in all lab rooms of General Chemistry I & II labs at a large Southeastern Public Research University (SE) during the 4th week of the Spring 09 semester. ASCI V2 was given to all the General Chemistry I Peer-led (PL) sessions on Mar. 27 (two days after the third term exam and after the course drop date) at SE. ASCI V3 was given to chemistry students during the last week of Spring 09 at a large Middle West Public Research University (MW). ASCI V3.1 was given to students in seven calculus classrooms during the last week at SE. Later, in the first week of the Summer 09 semester, V2 was given to all the Biochemistry students at SE.

Table 4.1 Summary of administrations for different versions of ASCI

ASCI version | # of items | # of scales | Administration time | Participants | Major changes
V1 | 20 | 5 | Near the end of semester | General Chemistry lab at UNH a | Original, or V1 (Bauer, 2008)
V1 | 20 | 5 | 2/2 ~ 2/6, 4th week, Sp09 | General Chemistry lab I, II, at SE b | Replication of V1
V2 | 8 | 2 | 3/27, 11th week, Sp09 | General Chemistry class at SE | Short form of V1
V3 | 8 | 2 | 4/27 ~ 5/1, last week, Sp09 | Chemistry class at MW c | Change item order from V2
V3.1 | 8 | 2 | 4/20 ~ 4/24, last week, Sp09 | Calculus class at SE | Chemistry in V3, calculus in V3.1
V2 | 8 | 2 | 5/18, 1st week, Summer 09 | Biochemistry class at SE | Replication of V2

Note: administrations are organized by time.
a UNH: University of New Hampshire
b SE: a large Southeastern Public Research University
c MW: a large Middle West Public Research University

For each administration, the instrument was given to each intact class as a paper-and-pencil test, mostly at the beginning of the class. Ten students in one lab section were asked


to take the survey twice, before and after the lab experiment, to calculate test-retest reliability. Students were verbally instructed to answer with their feelings toward chemistry in general, rather than toward a specific teacher or course. A 10-choice scantron (marked from A through J or from 0 through 9) was used to collect the data. Students were asked to bubble in their name, UID, and their answer to each item on the scantron. Any response with multiple answers, missing data, a missing UID, or a value outside the valid range (each response should range from 1 to 7) was excluded from all the analyses below. The same data cleaning process was applied throughout the study. The pattern of missing data was checked to determine whether the missing data might bias the findings.

Data Analysis Strategy

To accomplish the research goals, data from each administration were analyzed for reliability and validity evidence. Various statistical analyses were performed, including descriptive statistics, reliability, effect size of group differences, correlation among different concepts, multiple regression to predict chemistry achievement, and factor analysis (exploratory and confirmatory).

All descriptive statistics were computed in SPSS 17.0 for each item score after the negatively stated items were reverse coded. Normality of the item score distributions was examined by skewness and kurtosis. Internal consis[...] Test-retest reliability was obtained for ASCI V1. Factor scores were created by adding the scores of all the items associated with the factor. Differences in factor scores of student groups at various course levels [...] d effect size (Cohen, 1988). The guidelines


(d > .2, small; d > .5, medium; d > .8, large) are used to interpret the d effect size based on Cohen's standards (1988). d reveals how many standard deviation units apart the two groups are on the factor score.

Factor correlation values were measured in two ways. In the traditional way, factor correlations are based on the factor score as the sum of all the items in the group. Due to the existence of measurement error, these correlations can be underestimated. Factor correlations obtained via confirmatory factor analysis are corrected for measurement error. Both values are reported to support discussion of how well students discriminate the different scales.

EFA was performed on the correlation matrix of all the item data in SPSS 17.0 to better understand the internal structure of the results each time. To decide the number of factors to extract, the eigenvalue-greater-than-one rule was used with care, together with more comprehensive approaches such as the scree plot, parallel analysis, and interpretability. Bauer performed FA using principal components analysis (PCA) and Varimax rotation, which will be referred to as strategy A. Strategy A was occasionally used to allow easy comparison with literature results. However, PCA is appropriate when the research interest is prediction, which is not the major concern here. Because the research interest is the measurement of latent variables, and the components are correlated, principal axis factoring (EFA) with Promax rotation, which allows factors to correlate, was used (strategy B). Strategy B was applied consistently in all FA reports unless stated otherwise.

CFA was performed in Mplus 5.2 to estimate how well the theoretical model fits the data. It was run on a first-order model (4-factor solution for ASCI V1, 2-factor for V2, V3 and V3.1), where the latent factors were set to correlate with each other. Using the variance-covariance


matrix of all the item data as indicated by Cudeck (1989), a maximum likelihood method was employed. All items were set to load on their assumed factors only. The model is identified by fixing the loading of the first item on its factor at 1. In general, models based on a large number of scores are likely to have an inflated Chi-square value for the model fit, so it is a good idea to examine three additional fit statistics: if the Comparative Fit Index (CFI) is greater than .95 (some may use CFI greater than .90 (Cheng & Chan, 2003)), and the Root Mean Square Error of Approximation (RMSEA) and the Standardized Root Mean Square Residual (SRMR) are both less than .08, the model can be considered a good fit to the data (Hu & Bentler, 1999). This criterion will be used consistently for the estimate of model fit.

For the CFA, it is important to falsify plausible rival hypotheses, since untested models may fit the same data even better. For ASCI V1, we tested many possible model options based on responses to a reduced number of items in order to create conceptually meaningful scales for the revised version. For ASCI V2, when the two-factor model fit was supported by CFA, we further tested the plausible rival hypothesis of a one-factor model as the most parsimonious solution.

Scale Reconstruction Based on ASCI V1

A framework for scale reconstruction was created (Fig. 4.1) and the major steps were implemented to refine ASCI. Step 1 is obtaining pilot test information, which was ASCI (V1) from the literature; Step 2 is the administration of the instrument; Step 3 is the descriptive analysis of mean, standard deviation, pattern of missing data, reliability, and item-total correlations; Step 4 is EFA for the internal structure and comparison with the literature; Step 5 is CFA for alternate model fit and for revealing aspects of measurement error; and Step 6 is to propose conceptually meaningful scales when there is a lack of


psychometric evidence for existing scales, and to run CFA on the possible revised versions based on the scores of the remaining items and the reconstructed models. The proposed revised version is then administered to a different population and the process is repeated. Step 7, which we are currently working on in collaboration with others, is extensive administration to independent populations within a similar context. Step 8 (not shown) would involve evaluation in different contexts.

Figure 4.1 Framework for scale development

The ideal candidates for removal from an instrument are items with poor psychometric quality, including those with highly skewed or unbalanced distributions, low variability, low item-total correlations, weak loadings on the hypothesized factor (below .35 from EFA), or heavy loadings on multiple factors. Additionally, CFA can reveal items or


scales with correlated errors, which indicates that only one of the correlated items/scales is needed to capture the information. This scale development strategy has been successfully applied to create new scales emerging from factor analysis of test responses in the literature (Germann, 1988).

Validity Estimate for ASCI V2 Score

Nomological Network among Attitude, Aptitude and Chemistry Achievement

Correlational analysis among attitude, ability and achievement was performed in SAS 9.13 to support the attitude nomological network. Data were obtained from the same general chemistry population as the ASCI V2 data. ASCI V2 score (emotion and cognition components) [...] Student demographic information and their ability score [...] Previous ability was measured by the quantitative portion of the Scholastic Aptitude Test (SATM), which has been used to predict first-year chemistry achievement (Scott E. Lewis & Lewis, 2007). At the end of the semester, students' scores on the First Term General Chemistry Blended Examination from the Examinations Institute of the ACS Division of Chemical Education (40 items) (www4.uwm.edu/chemexams/contact) [...] Due to confidentiality requirements for using the ACS Exam, no test item can be shown here.

The distribution of each variable was examined. Scatterplots of each pair of variables were also visually examined. The relationships among variables were then summarized using correlation coefficients. The following values (.10 small, .30 medium, .50 large) from social science research (Cohen, 1977) were used to interpret the magnitude of the coefficients r.
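The interpretation step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS procedure used in the study: the function names are hypothetical, the six score pairs are invented, and the label for coefficients below .10 is my own placeholder since the text gives no name for that range.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def magnitude(r):
    """Label |r| with the .10 / .30 / .50 benchmarks quoted in the text."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"  # placeholder label; not named in the text


# Hypothetical scores for six students (not data from this study).
attitude = [20, 25, 31, 18, 28, 35]
exam = [19, 24, 27, 21, 22, 33]
r = pearson_r(attitude, exam)
print(round(r, 2), magnitude(r))
```

The sign of r is ignored when labelling the magnitude, so a correlation of -.55 is read as large in the same way as +.55.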


Predictive Validity for ASCI V2 Score

Multiple regression analysis was performed in SAS 9.13 to predict student achievement in general chemistry based on scores from ASCI V2 and SATM. The score on the ACS exam was the dependent variable (the variable we want to predict, or D.V.). Three different regression models were tested using combinations of predictor variables (the variables we use to make the prediction, or independent variables, I.V.). The assumptions underlying correlation and multiple regression include normality, linearity, absence of influential outliers, and homoscedasticity; the results may be greatly affected when these assumptions are violated (Osborne & Waters, 2002). The normality assumption means that all the predictor variables (emotion, cognition and SATM), the dependent variable (ACS) and the prediction error should be normally distributed. The linearity assumption means that there is a linear relationship between the I.V. and the D.V. The homoscedasticity assumption means that the errors should be distributed equally on both sides of the regression line at any value of the predicted score. Data were checked for possible violations of these assumptions. Influential outliers were examined by [...] partial correlations are reported for the best and most parsimonious model.
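Comparing regression models built from different combinations of predictors can be sketched as follows. This is a minimal ordinary least squares illustration in plain Python, not the SAS analysis reported here; the helper names and the eight records are hypothetical, and no assumption checks (normality, linearity, homoscedasticity, outliers) are performed.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Least-squares coefficients (intercept first) and R-squared."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical records: (SATM, attitude score, exam score); not study data.
records = [(520, 20, 18), (560, 31, 24), (600, 25, 23), (640, 35, 30),
           (580, 18, 19), (700, 28, 31), (660, 22, 26), (540, 27, 21)]
y = [exam for _, _, exam in records]
_, r2_satm = fit_ols([(satm,) for satm, _, _ in records], y)
_, r2_both = fit_ols([(satm, att) for satm, att, _ in records], y)
print(round(r2_satm, 3), round(r2_both, 3))
```

Adding a predictor can never lower R-squared, so the gain from the second model has to be judged against the cost of the extra parameter, which is the sense in which the text speaks of the "best and most parsimonious model".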


Chi-square Analysis for V3.1 Participants

In the study of ASCI V3.1 given to calculus students, the instructors of 7 calculus sections agreed to participate. In order to examine how representative the student sample was of the whole college calculus student population, the frequency of declared majors among the surveyed students was compared with that of all the enrolled students who were not surveyed, using Chi-square analysis and the Fisher exact test (Fisher, 1922). For Chi-square analysis, when cells in the contingency table have counts of less than 5, the test assumption of large sample size may be violated. The Fisher exact test is a better estimate when cells have small counts. In this study most students belonged to a few majors such as biomedical sciences and engineering, while many other majors like social science had zero counts. Therefore, the Fisher exact test was performed along with the traditional Chi-square analysis.

In the next two chapters, results from the original and revised ASCI will be presented, each starting with a brief method description followed by analysis and discussion.
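The contrast between the two tests can be illustrated with a small Python sketch. It is a simplification of the analysis described above: the study compared many declared majors, whereas this example uses a hypothetical 2 x 2 table (for instance, one major versus all others, surveyed versus not surveyed), and the function names are my own. Note that the chi-square helper would divide by zero if a whole row or column were empty, which is one practical reason zero-count categories are awkward for that test.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts under independence for an r x c table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    return [[r * c / total for c in col_tot] for r in row_tot]


def chi_square_statistic(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table."""
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts, small enough that every expected count is below 5.
table = [[3, 1], [1, 3]]
smallest_expected = min(min(row) for row in expected_counts(table))
print(chi_square_statistic(table), smallest_expected, round(fisher_exact_2x2(table), 4))
```

When the smallest expected count falls below 5, as it does here, the chi-square approximation is doubtful and the exact p-value is the safer figure to report.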


Chapter Five
Quality Evaluation for Original ASCI (V1)

This chapter presents results from the quality evaluation of ASCI V1. ASCI V1 was given to each intact class of General Chemistry I and II labs during the 4th week of the Spring 09 semester (from Feb. 2 through Feb. 6) at a large Southeastern Public Research University (SE). The data analysis process described by the ASCI developer was applied in this study. New versions of ASCI based on the ASCI V1 results will be proposed.

Results and Discussion

There were 405 complete and correct responses for General Chemistry I labs and 509 responses for General Chemistry II labs. No pattern was found in the missing data. Descriptive statistics are shown in Table 5.1 for each item, organized by the subscales described in Bauer's study (2008), with all 14 negatively stated items recoded. According [...] five items, which are items 15, 2, 6, [...] scale has five items, which are items 19, 16, 8, 20 and 13. [...] five [...] four items (11, 14, 17 and 7). The average item scores ranged from 2.31 to 5.82, and the standard deviations ranged from 1.27 to 1.71. No item was found to have skewness or kurtosis greater than 1.2, which suggests good normality of the item scores. For the General Chemistry I labs, the mean scores for the five [...] and ut[...] were 4.86, 5.13, 4.54, 4.68 and 4.00, which are all above the midpoint, suggesting that the average student feels chemistry is interesting and useful. The item mean[...] were 5.1, 5.82, 3.95, 4.14 and 3.95, which suggests


[...] the item means were 2.80, 3.22, 2.6, 2.31 and 4.02, which indicates they normally think [...] item means were 3.63, 3.76, 4.03 and 3.26, which means they are not emotionally satisfied.

Table 5.1 Descriptive statistics for ASCI V1, General Chemistry labs I and II

Item in each subscale | Lab I (N=405) Mean | Lab I SD | Lab II (N=509) Mean | Lab II SD
Interest & utility
15 r(a) worthwhile / useless | 4.86 | 1.57 | 4.92 | 1.48
2 worthless / beneficial | 5.13 | 1.58 | 5.14 | 1.48
6 r good / bad | 4.54 | 1.6 | 4.56 | 1.53
12 r interesting / dull | 4.68 | 1.6 | 4.56 | 1.61
3 r exciting / boring | 4.00 | 1.54 | 4.05 | 1.61
Anxiety
19 r tense / relaxed | 5.1 | 1.36 | 4.94 | 1.38
16 r work / play | 5.82 | 1.36 | 5.65 | 1.46
8 r scary / fun | 3.95 | 1.41 | 4.06 | 1.44
20 r insecure / secure | 4.14 | 1.4 | 4.16 | 1.36
13 r disgusting / attractive | 3.95 | 1.3 | 3.92 | 1.44
Intellectual accessibility
4 complicated / simple | 2.80 | 1.52 | 2.91 | 1.51
5 confusing / clear | 3.22 | 1.5 | 3.38 | 1.51
1 r easy / hard | 2.6 | 1.27 | 2.98 | 1.45
10 challenging / unchallenging | 2.31 | 1.35 | 2.57 | 1.47
9 r comprehensible / incomprehensible | 4.02 | 1.5 | 4.26 | 1.52
Fear
18 safe / dangerous | 3.88 | 1.39 | 3.78 | 1.44
Emotional satisfaction
11 r pleasant / unpleasant | 3.63 | 1.51 | 3.69 | 1.53
14 r comfortable / uncomfortable | 3.76 | 1.39 | 3.91 | 1.39
17 chaotic / organized | 4.03 | 1.62 | 3.97 | 1.54
7 r satisfying / frustrating | 3.26 | 1.71 | 3.44 | 1.71

Note: Each score should range from 1 to 7, with 4 as the midpoint. Items are organized by subscale in the same order as in the literature. A higher score means students feel chemistry is interesting and useful, anxiety producing, intellectually accessible, dangerous and emotionally satisfying. Item 16 has the highest mean and item 10 has the lowest mean, which indicates that students feel chemistry is work rather than play, and is challenging.
(a) r: item score is reversed before averaging.
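The recoding and rescaling behind these descriptive statistics can be sketched briefly in Python. The sketch is an assumption-laden illustration rather than the SPSS procedure used in the study: the function names and the five raw responses are hypothetical, and the percentage conversion assumes the linear mapping stated in the note to the subscale score tables (scale value 1 = 0%, 7 = 100%).

```python
from statistics import mean, stdev

SCALE_MIN, SCALE_MAX = 1, 7


def reverse_code(score):
    """Flip a 1-7 response so that 1 becomes 7, 2 becomes 6, and so on."""
    return SCALE_MIN + SCALE_MAX - score


def to_percent(mean_item_score):
    """Map a mean item score linearly onto 0-100 (1 -> 0%, 7 -> 100%)."""
    return 100.0 * (mean_item_score - SCALE_MIN) / (SCALE_MAX - SCALE_MIN)


# Hypothetical raw responses of five students to one negatively stated item.
raw = [2, 3, 1, 4, 2]
recoded = [reverse_code(s) for s in raw]
print(recoded, round(mean(recoded), 2), round(stdev(recoded), 2))

# Lab I item means for the intellectual accessibility subscale (Table 5.1).
accessibility_means = [2.80, 3.22, 2.6, 2.31, 4.02]
print(round(to_percent(mean(accessibility_means)), 1))
```

Under this mapping the five Lab I accessibility item means average to about 33.2 on the percentage scale, which agrees with the accessibility value reported for General Chemistry I lab in Table 5.4.1.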


The internal consistencies for each subscale, using listwise deletion, are in Table 5.2. [...] was .82, .71, .79 and .74 for the General Chemistry I labs for each subscale and .85, .79, .82 and .78 for General Chemistry II labs, which are all above the satisfactory level of .70 and comparable to the literature results. The test-retest (before and after the lab) reliability was above .9 and better than the literature report. From these analyses, our instrument data have very good reliability.

Table 5.2 Internal consistency and test-retest reliability for V1, General Chemistry labs

Scale (Items) | Alpha, Lab I (N=405) | Alpha, Lab II (N=509) | Alpha, Bauer's study | Test-retest r, our sample (N=10) | Test-retest r, Bauer's study
Interest & Utility (15, 2, 6, 12, 3) | .82 | .85 | .83 | .91 | .74
Anxiety (19, 16, 8, 20, 13) | .71 | .79 | .77 | .96 | .64
Accessibility (4, 5, 1, 10, 9) | .79 | .82 | .78 | .96 | .71
Emotional satisfaction (11, 14, 17, 7) | .74 | .78 | .79 | .96 | .72

Note: Alpha is the internal consistency; test-retest reliability is by Pearson correlation.

Item loadings from EFA are listed in Table 5.3 along with the loadings Bauer reported. [...] item 8, which is abou[...] vs [...] .38, .26, .53 and .15 on [...] .49, .33, .40 and .20 for General Chemistry II labs. It is hard to tell since this item should [...] however sometimes it [...] that different


student samples from different institutions and different levels may respond to survey items in a similar way.

Table 5.3 The item loadings for V1, General Chemistry labs

Columns f1 to f4 correspond to use, anx, accs and emot, in that order, within each of the three groups.

Item | Gen Chem I labs: f1 | f2 | f3 | f4 | Gen Chem II labs: f1 | f2 | f3 | f4 | Literature: f1 | f2 | f3 | f4
Interest and utility
15 a worthwhile / useless | .75 | .22 | .01 | .06 | .85 | .05 | .08 | .06 | .85 | .01 | .06 | .11
2 worthless / beneficial | .68 | .21 | .05 | .04 | .79 | .01 | .09 | .06 | .79 | .1 | .03 | .04
6 a good / bad | .68 | .20 | .16 | .13 | .72 | .22 | .12 | .17 | .71 | .05 | .2 | .04
12 a interesting / dull | .80 | .17 | .13 | .01 | .77 | .31 | .00 | .05 | .67 | .32 | .02 | .15
3 a exciting / boring | .69 | .08 | .24 | .08 | .57 | .48 | .03 | .02 | .58 | .38 | .05 | .09
Anxiety
19 a tense / relaxed | .17 | .55 | .45 | .31 | .23 | .65 | .39 | .11 | .14 | .75 | .32 | .02
16 a work / play | .14 | .23 | .36 | .71 | .05 | .71 | .39 | .03 | .06 | .74 | .23 | .15
8 a scary / fun | .38 | .26 | .53 | .15 | .49 | .33 | .40 | .20 | .35 | .6 | .18 | .16
20 a insecure / secure | .20 | .64 | .37 | .17 | .35 | .46 | .16 | .43 | .34 | .53 | .23 | .29
13 a disgusting / attractive | .55 | .44 | .08 | .16 | .58 | .45 | .12 | .12 | .42 | .53 | .01 | .11
Intellectual accessibility
4 complicated / simple | .05 | .17 | .72 | .05 | .06 | .22 | .76 | .02 | .03 | .13 | .8 | .13
5 confusing / clear | .10 | .04 | .78 | .06 | .25 | .18 | .77 | .13 | .24 | .33 | .75 | .06
1 a easy / hard | .19 | .24 | .70 | .07 | .29 | .07 | .71 | .18 | .13 | .18 | .73 | .34
10 challenging / unchallenging | .08 | .09 | .69 | .36 | .20 | .25 | .74 | .03 | .29 | .36 | .54 | .01
9 a comprehensible / incomprehensible | .49 | .12 | .52 | .20 | .54 | .11 | .51 | .32 | .38 | .03 | .52 | .41
Fear
18 safe / dangerous | .09 | .29 | .18 | .68 | .01 | .03 | .00 | .87 | .03 | .05 | .05 | .85
Emotional satisfaction
11 a pleasant / unpleasant | .60 | .13 | .50 | .10 | .55 | .47 | .32 | .18 | .5 | .44 | .35 | .27
14 a comfortable / uncomfortable | .46 | .42 | .48 | .03 | .52 | .35 | .31 | .37 | .48 | .43 | .35 | .28
17 chaotic / organized | .25 | .73 | .08 | .07 | .25 | .11 | .28 | .55 | .44 | .34 | .32 | .15
7 a satisfying / frustrating | .49 | .07 | .67 | .03 | .56 | .32 | .43 | .18 | .41 | .3 | .46 | .28

Note: EFA was performed in SPSS 17.0. The same parameters were used, including the principal components method and Varimax (orthogonal) rotation using the covariance matrix. Four factors were extracted for comparison. Item loadings on the assumed factors are highlighted in bold along with the literature loadings. The item loadings with a different loading pattern are in italics.
a: negatively stated

Factor scores (created by adding the scores of all the items associated with the factor) and factor correlation values are listed in Table 5.4.1. General Chemistry II lab students score


[...] subscale. There is no significant difference in other subscales for these two populations. Compared with the literature, our data are [...] major students. They experienced strong feelings [...] sample size. The factor correlation pattern is very similar, which indicates the instrument functions similarly across different populations.

Table 5.4.1 Factor scores and Pearson correlations for V1

                        General Chemistry I lab (n=405)                  General Chemistry II lab (n=509)
                        Interest&use  Anxiety  Access  Fear   Emotion    Interest&use  Anxiety  Access  Fear   Emotion
Mean (a)                54.9          59.9     33.2    47.9   44.5       54.6          59.1     37.0    46.3   45.9
SD                      13.3          15.7     17.6    23.2   19.8       13.7          17.4     19.0    24.0   20.1
Chemistry major students in lab I (N=12) and lab II (N=12)
Mean                    62.6          64.5     42.8    54.1   46.4       61.4          66.7     45.2    50.6   53.6
Factor correlation
Interest&use                          0.55     0.40    0.18   0.63                     0.63     0.38    0.15   0.64
Anxiety                                        0.64    0.17   0.75                              0.66    0.19   0.79
Access                                                 0.17   0.68                                      0.18   0.67
Fear                                                          0.25                                             0.28

Table 5.4.2 Subscale scores and Pearson correlations reported by Bauer (2008)

From literature            Interest & use  Anxiety  Access  Fear   Emotion
Mean (a)                   56              60       44      38     45
Mean of chemistry majors   82              43       46      41     70
Factor correlation
Interest & use                             0.51     0.32    0.15   0.64
Anxiety                                             0.58    0.15   0.72
Access                                                      0.19   0.62
Fear                                                               0.22


a: Subscale scores in percentage for comparison (scale value 1 = 0%, 7 = 100%).

CFA was performed to estimate the goodness of fit of the four-factor solution. The five items 15, 2, 6, 12 and 3 were set to load on the factor "interest and utility" only; items 19, 16, 8, 20 and 13 were set to load on the factor "anxiety" only; and so on for the remaining factors. The estimation of the four-factor model fit (Table 5.5) for the General Chemistry I lab is χ²(N = 405, df = 146, p = .00) = 618.5, CFI = .86, RMSEA = .089, SRMR = .076; for the General Chemistry II lab it is χ²(N = 509, df = 146, p = .00) = 692.9, CFI = .89, RMSEA = .086, SRMR = .079. Since CFI was less than .95 and RMSEA was greater than .08, the four-factor model did not fit the data reasonably well.

Table 5.5 Fit statistics of V1 for four-factor CFA

Fit Statistic   Lab I    Lab II
χ²              618.5    692.9
df              146      146
p               .00      .00
N               405      509
CFI             .86      .89
RMSEA           .089     .086
SRMR            .076     .079

Note: χ² = Chi-square, df = degrees of freedom, N = sample size, CFI = Comparative Fit Index, RMSEA = Root Mean Square Error of Approximation, SRMR = Standardized Root Mean Square Residual

Factor correlations from CFA are shown in Table 5.6. The correlation between the "anxiety" and "emotional satisfaction" factors is about .97, which means these two subscales are strongly correlated and redundant.


Table 5.6 Factor correlation for V1 from CFA

                             Lab I                                       Lab II
                             Anxiety  Access  Emotional satisfaction     Anxiety  Access  Emotional satisfaction
Interest & utility           .72      .46     .80                        .77      .49     .84
Anxiety                               .83     .98                                 .82     .97
Intellectual accessibility                    .85                                         .82

Proposal for Possible Revisions of ASCI

Because no item exhibited an unexpected distribution or contributed negatively to internal consistency, the descriptive analysis did not indicate any item candidates for removal. This result reflected the relatively high quality of ASCI V1. EFA results revealed the item loading pattern to be similar to the literature report, which yielded a smaller number of factors than intended. CFA results indicated that the proposed factors did not produce an acceptable model fit and contained redundancies, revealing the need to refine the instrument by reconsidering its intended scale structure.

Given the existing items, there are various possible combinations for creating scales for a new version. Our first step was simply to look at the scales individually. Table 5.7 presents the model fit results from CFA for six one-scale versions. For the single scale of intellectual accessibility, CFA was performed with all its items (1, 4, 5, 9 and 10) loaded on one factor, with poor fit statistics (Option 1A). Because Item 9 loaded on multiple factors in the EFA of the whole instrument (Table 5.3), we excluded it and ran CFA again based on the remaining four items. Both fit statistics improved: CFI = .99 and SRMR = .02, which supports the one-factor structure (Option 1B).


Table 5.7 CFA results for scale reconstruction based on ASCI (V1) subscales

Option  # of scales  Items                     CFI    RMSEA  SRMR   Comments
1A      1            1, 4, 5, 9, 10            0.94   0.15   0.08   Item 9 harms the validity of the intellectual accessibility scale; removing it produces a good fit
1B      1            1, 4, 5, 10               0.99   0.08   0.02
2A      1            2, 3, 6, 12, 15           0.95   0.14   0.04   Removing 15 (redundant with 2) produces a better fit; regardless, the scale is conceptually problematic
2B      1            2, 3, 6, 12               0.99   0.09   0.02
3       1            8, 13, 16, 19, 20         0.97   0.09   0.04   Good fit for the anxiety scale
4       1            7, 11, 14, 17             0.99   0.07   0.02   Good fit for this emotional satisfaction set, also a good match with attitude theory
V2      2            1, 4, 5, 10; 7, 11, 14, 17  0.96   0.08   0.04   Final version captures both intellectual accessibility and emotional satisfaction, congruent with two-component attitude theory

For the interest and utility scale, when all items (2, 3, 6, 12, 15) are set to load on one factor for CFA (Option 2A), the fit statistics are: χ²(N = 509, df = 10, p = .00) = 54.2, CFI = .95, RMSEA = .14. Based on the model modification indices, the χ² value would decrease by 42.5 if the correlation between the error terms of Items 2 and 15 were added to the model. One of these two items was therefore a good candidate for deletion. A CFA model was run without Item 15 (Option 2B). The fit statistics are: χ²(N = 509, df = 6, p = .00) = 9.7, CFI = .99, SRMR = .02, which supports the one-factor model. However, since the scale actually has two concepts (interest and utility), and it is not desirable for one subscale to have more than one concept, it was not included in the revised version.

The CFA model fits are acceptable for the subscales of Options 3 and 4 [...] linearly related and redundant, only [...] fits better with the affect component of the attitude framework. In order to capture both scales, cognition and affect, we propose the ASCI V2. It consists


[...] model fit of the two-factor structure based on the eight item scores was tenable (Table 5.7). The next step is to determine whether students discriminate these two factors in an instrument that contains only the relevant eight items, which requires the administration of ASCI V2 to another student sample.

Summary

Based on the above analysis, ASCI V1 functions similarly across our population and those described by Bauer. It did not provide a meaningful interpretation for all 20 items and did not reflect the underpinning theoretical framework, because the way the items are grouped is not stable and is inconsistent with the factor analysis results. The analysis of this administration led to a modified version: ASCI V2 was proposed and hopefully can be more conceptually meaningful. It has eight items across two subscales. One is intellectual accessibility and the other is emotional satisfaction; it is therefore congruent with the cognitive and affective components of attitude theory. ASCI V2 and two more derivative versions (V3 and V3.1) were further given to different student populations and will be discussed in the next chapter.


Chapter Six
Quality Evaluation for Revised ASCI

This chapter describes quality evaluation results from four administrations of the modified ASCI. All of these modified versions (V2, V3, V3.1) have eight items grouped across two subscales. The following presentation of results is organized by administration, in the same order as in Table 4.1. ASCI V2 was given to the General Chemistry I Peer-Led (PL) classes on March 27, at the beginning of class, at a large Southeastern public research university, two days after the third term exam and after the drop date. ASCI V3 and V3.1 were created on the basis of V2 to examine whether a changed item order could lead to better validity evidence. V3.1 was modified from V3 to assess its effectiveness in a calculus context, measuring students' attitude toward calculus. Finally, ASCI V2 was given to another student group enrolled in a Biochemistry course.

Quality Evaluation for Revised ASCI V2, Peer-Led classes

Results and Discussion

From the 375 sets of data returned, the 354 respondents with complete sets were used for data analysis. Descriptive statistics were computed for each item score after all four negatively stated items were recoded. Average scores for each item from the 354 respondents ranged from 2.50 to 4.26, and standard deviations ranged from 1.28 to 1.70 (Table 6.1). No item was found to have skewness or kurtosis greater than .7, which suggests good normality of the scores. The item scores on V2 are comparable to the same


item scores in V1. The mean scores for General Chemistry I lab students are comparable to the scores constructed for a similar sample of students who took ASCI V1 in the General Chemistry I Laboratory (Table 6.1).

Table 6.1 Descriptive statistics for ASCI V2, PL classes

Item #                                     Lab I, n=405, V1    PL, n=354, V2
V1   V2                                    Mean   SD           Mean   SD
1    1r   easy-hard                        2.60   1.27         2.81   1.28
4    2    complicated-simple               2.80   1.52         2.95   1.43
5    3    confusing-clear                  3.22   1.50         3.36   1.40
14   4r   comfortable-uncomfortable        3.76   1.39         3.64   1.36
7    5r   satisfying-frustrating           3.26   1.71         3.24   1.70
10   6    challenging-unchallenging        2.31   1.35         2.50   1.50
11   7r   pleasant-unpleasant              3.63   1.51         3.38   1.41
17   8    chaotic-organized                4.03   1.62         4.26   1.66

Note: Item score is reversed before averaging. Each item score should range from 1 to 7, with 4 as the midpoint. A higher score means students feel chemistry is intellectually accessible and emotionally satisfying. Item 8 has the highest mean of 4.26 and item 6 has the lowest mean of 2.50, which indicates they feel chemistry is organized and is challenging.

Internal consistency reliability was measured by alpha for each subscale. Alpha was .82 and .79 (Table 6.3), above the rule-of-thumb satisfactory level of .7 and comparable to ASCI (V1) and the literature results.

Factor scores were created by averaging the item scores within each subscale: 2.91 (31.8% if converted to a percentage) for intellectual accessibility and 3.63 (43.8%) for emotional satisfaction, with a correlation of .64 between the two factors. Factor scores and correlation for this administration of ASCI V2 were very similar to those calculated from ASCI V1 data for a similar sample of students, which indicates these eight items function similarly even when the other 12 items are removed from the instrument.
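The percentage values quoted for the subscale scores follow from rescaling the 1-to-7 response scale so that 1 maps to 0% and 7 maps to 100%. A minimal Python sketch of that conversion is given here; the function name `to_percent` is introduced only for illustration and does not come from the original analysis.

```python
def to_percent(score, low=1.0, high=7.0):
    """Rescale a mean item score on a low..high scale to 0..100 percent."""
    return (score - low) / (high - low) * 100.0

# Subscale means reported for ASCI V2 in the PL classes
print(round(to_percent(2.91), 1))  # intellectual accessibility -> 31.8
print(round(to_percent(3.63), 1))  # emotional satisfaction -> 43.8
```

The same rescaling puts V2 subscale means on the 0-100 scale used for the V1 factor scores, so the two administrations can be compared.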


Table 6.2 Subscale scores and Pearson correlations for ASCI V2, PL classes, along with results from General Chemistry I labs

                    PL (n=354), V2                   Gen. Chemistry I lab (n=405), ASCI V1
                    Access          Emotion          Interest&Use  Anxiety  Access  Fear   Emotion
Mean                2.91 (31.8 a)   3.63 (43.8 a)    54.9          59.9     33.2    47.9   44.5
SD                  1.13            1.20             13.3          15.7     17.6    23.2   19.8
Factor correlation
Interest & Use                                                     0.55     0.40    0.18   0.63
Anxiety                                                                     0.64    0.17   0.75
Access                              .64                                             0.17   0.68
Fear                                                                                       0.25

a: Subscale scores in percentage for comparison (scale value 1 = 0%, 7 = 100%).

Table 6.3 Internal consistency for PL classes along with General Chemistry I Lab I and Bauer's results

Internal consistency by alpha    PL (N=354), V2   Lab I (N=405), V1   Bauer's result (2008), V1
Intellectual accessibility (a)   .82              .79                 .78
Emotional satisfaction           .79              .74                 .79

a: The ASCI V1 intellectual accessibility scale has five items (4, 5, 1, 10, 9); V2 has four of them, without item 9.

An exploratory factor analysis for V2 was performed using strategy A, including the principal components method and Varimax rotation, as in the literature. Two factors were extracted. Item loadings are listed in Table 6.4. All items loaded well (all loadings were greater than .6 here) on the assumed factors in the literature. The loading pattern for the short version was exactly what we expected for the revised version.


Table 6.4 The item loadings for V2, PL classes

Item # in each scale                 V2 loading
                                     F1     F2
Intellectual Accessibility
1r  easy-hard                        0.41   0.66
2   complicated-simple               0.34   0.75
3   confusing-clear                  0.45   0.68
6   challenging-unchallenging        0.00   0.86
Emotional Satisfaction
4r  comfortable-uncomfortable        0.74   0.29
5r  satisfying-frustrating           0.72   0.36
7r  pleasant-unpleasant              0.75   0.38
8   chaotic-organized                0.71   0.02

Note: EFA is performed in SPSS 17.0. Strategy A was used, including the principal components method and Varimax (orthogonal) rotation using the covariance matrix. Two factors are extracted for comparison. Item loadings on the assumed factors are highlighted in bold.
r: negatively stated

CFA was performed to estimate the goodness of fit for the two-factor model. Items 1, 2, 3 and [...] two factors were allowed to correlate. In order to investigate a more parsimonious model, the alternate one-factor model solution was also sought. Models were identified by fixing the first item on its factor at 1. Figure 6.1 presents the standardized parameter estimates for both models.


Figure 6.1 Standardized parameter estimates for the one-factor and two-factor models (N=354). The large ovals designate the latent variables, the small circles are the residual variances, and the rectangles indicate the observed variables. Items were set to load on their assigned factors only. All item loadings are significantly different from 0 (p < .01).

The estimation of the two-factor model fit (Table 6.5) was χ²(N = 354, df = 19, p = .00) = 77.0, CFI = .95, RMSEA = .093, SRMR = .042. Since CFI was .95, RMSEA was close to .08, and SRMR was less than .08, the overall model fits the data reasonably well. The estimation of the one-factor model fit was χ²(N = 354, df = 19, p = .00) = 156.1, CFI = .89, RMSEA = .14, SRMR = .056. Since CFI was less than .95 and RMSEA was greater than .08, the overall one-factor model does not fit the data. Therefore, the two-factor model is more tenable.
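The reported RMSEA values can be checked against the chi-square statistics with the usual point-estimate formula. The sketch below is illustrative only: the function name `rmsea` is ours, and it assumes the variant with N in the denominator (some programs use N - 1 instead, and the text does not say which one the CFA software applied).

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from a model chi-square.

    Assumes n (not n - 1) in the denominator; this is a convention
    chosen for the sketch, not something stated in the text.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))

two_factor = rmsea(77.0, 19, 354)   # chi-square, df, N of the two-factor fit
one_factor = rmsea(156.1, 19, 354)  # chi-square, df, N of the one-factor fit
print(f"{two_factor:.3f} {one_factor:.2f}")
```

Under this assumption the two calls reproduce the reported .093 and .14 to rounding.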


Table 6.5 Fit statistics of V2 results for one- and two-factor CFA

Fit Statistic   One-factor model   Two-factor model
χ²              156.1              77.0
df              19                 19
p               .00                .00
N               354                354
CFI             .89                .95
RMSEA           .14                .093
SRMR            .056               .042

Note: χ² = Chi-square, df = degrees of freedom, N = sample size, CFI = Comparative Fit Index, RMSEA = Root Mean Square Error of Approximation, SRMR = Standardized Root Mean Square Residual

The two aspects of attitude measured by ASCI V2 are expected to be related, though not redundant, which is supported by the two-factor CFA: the correlation between the [...] From the above results, the results of V2 exhibit better psychometric quality to measure and discriminate meaningful components of the attitude construct.

Nomological Network among Attitude, Aptitude and Chemistry Achievement

Prior aptitude scores such as SATM are often used for college admission in the United States because of their power to predict students' performance in college science courses (Scott E. Lewis & Lewis, 2007). In this study, the ACS score was used as part of the final exam for the General Chemistry I course and is expected to correlate with SATM and ACTM. In previous studies, a low to moderate attitude-achievement correlation was found. Here the ASCI score is expected to correlate with ACS. Since the cognition part (intellectual accessibility) [...] the difficulty of chemistry, the cognition score may be correlated with SATM.


Results and Discussion

There were 456 students enrolled in the discussion sections of the General Chemistry I course who had ACS scores; 383 had SATM scores, 292 had ACTM scores, and 354 had ASCI V2 scores. The entire population was included in the correlation analysis. First, we examined the distribution for each variable. The mean, standard deviation, skewness and kurtosis values are presented in Table 6.6. All the skewness and kurtosis values are less than .32, which indicates a normal distribution. Scatterplots of each pair of variables were then examined. No bivariate outliers or nonlinear relationships were found. Attitude scores for SAT and ACT test takers were not significantly different.

Table 6.6 Descriptive analysis for attitude, aptitude score and final score on ACS

Variable   N     Mean    Std Dev   Skewness   Kurtosis
access     354   2.91    1.13      0.32       0.25
emotion    354   3.63    1.2       0.03       0.08
SATM       383   548.5   67.1      0.19       0.23
ACTM       292   23.19   3.66      0.19       0.22
ACS        456   23.17   6.2       0.02       0.07

The relationships among these variables were summarized with correlation coefficients, as presented in Table 6.7, which in essence allows examination of a small nomological network. All relationships were positive, as anticipated, and significant at the .05 level. The correlation between ACS and the other four variables is classified as moderate based on the guideline for interpreting the magnitude of the coefficient r (.10 small, .30 moderate, .50


large) in social science studies (Cohen, 1977). The attitude-achievement correlation (coefficients: .30 and .34) is consistent with extensive studies (Germann, 1988; Reynolds & Walberg, 1992; Salta & Tzougraki, 2004) and not as strong as the ability-achievement correlation (.45 and .46). Since attitude toward chemistry and mathematical ability are conceptually distinct, it is reassuring that low correlations between constructs related to those different concepts are observed (coefficients: .19, .14, .13 and .15). Since the two components of the attitude concept, cognition and emotion, should be related but not identical, the high correlation between these two variables from ASCI V2 (coefficient: .64) is also reassuring. Finally, as expected, SATM and ACTM, both intended to measure mathematics ability, reveal the highest correlation of all (coefficient: .74).

Table 6.7 Pearson correlation among attitude, aptitude and achievement, N = 456, Prob > |r| under H0: Rho = 0

                                                         Emotion   SATM   ACTM   ACS
Cognition (intellectual accessibility part of ASCI)      .64       .19    .14    .30
Affect (emotional satisfaction part of ASCI)                       .13    .15    .34
SATM                                                                      .74    .45
ACTM                                                                             .46

*P < 0.05 for all correlations; correlation coefficients with ACS (dependent variable) are in bold.

In sum, all five variables correlate with each other as expected, and thus provide evidence for a small but reasonable nomological network.

Predictive Validity to Predict Chemistry Achievement

Multiple regression analysis was used to predict achievement in general chemistry, measured by ACS, from three predictor variables: ASCI V2 cognition, emotion and SATM. For the 456 students in the above correlation analysis, 297 of them


have complete data for SATM, ASCI V2 and ACS score. All 297 students were used in the following regression analysis. We expect that attitude can account for a different portion of the variance in chemistry achievement than SATM does.

Results and Discussion

All three predictors were entered into an initial regression model: emotional satisfaction (ASCI V2), intellectual accessibility (ASCI V2), and mathematics ability (SATM). The maximum values of stud[...] were 3.3 and .048, respectively. These small values led us to believe that none of the cases were exerting undue influence on the regression analysis. An examination of a scatterplot of the residuals against the predicted values revealed no violations of the linearity or homoscedasticity assumptions, and the distribution of the residuals was found to be approximately normal (skewness = .098, kurtosis = .025). Based on the screening of the data, it appeared appropriate to proceed with the analysis of results.

The obtained R² value was .286, suggesting that about 28.6% of the variance in ACS Exam score was accounted for by the three predictors. From the squared semipartial correlations of the predictors, SATM uniquely accounted for 17.7% of the variability in ACS score, emotional satisfaction uniquely for 2.4% and intellectual accessibility for .3%. The regression coefficient for intellectual accessibility was found to be not significantly different from zero. Although intellectual accessibility and SATM both correlate moderately with ACS score, SATM partially overlaps with intellectual accessibility. This is a reasonable result, since the intellectual accessibility of chemistry as perceived by General Chemistry I and II students is expected to be highly influenced by mathematical ability.
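To make the overlap among predictors concrete, the variance explained jointly can be read off as the model R² minus the sum of the unique (squared semipartial) contributions. The short Python sketch below uses only the values reported above; the helper name `shared_variance` is hypothetical, and the simple subtraction assumes no suppression effects among the predictors.

```python
def shared_variance(r_squared, unique_parts):
    """Variance explained jointly: R^2 minus the sum of unique contributions."""
    return r_squared - sum(unique_parts)

# Unique contributions reported for the three-predictor model
unique = {
    "SATM": 0.177,
    "emotional satisfaction": 0.024,
    "intellectual accessibility": 0.003,
}
shared = shared_variance(0.286, unique.values())
print(round(shared, 3))  # 0.082
```

Under that assumption, about 8% of the variance in ACS score is explained by the predictors in common rather than by any one of them alone, which is in line with the overlap between SATM and intellectual accessibility noted in the text.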


Therefore, a more parsimonious second model was run with only emotional satisfaction and SATM as predictors. For the two-predictor model, the obtained R² value was .283, suggesting that about 28.3% of the variance in ACS Exam score was accounted for by the two predictors. The adjusted R² was .278. It appears that the model accounted for an acceptable proportion of the [...]. The effect size $f^2 = R^2/(1 - R^2)$ was computed to be .40, which can be interpreted as a large effect. The root mean square error was 4.95, which indicated that predictions of ACS Exam scores from this model tend to be off by about 5 points and cannot be regarded as exact predictions. The obtained prediction equation was:

Predicted ACS score = 2.29 + 1.24*Emotion + 0.038*SATM (Equation 6.1)

The regression coefficients were statistically significant. Since the magnitudes of the variables are different, the regression coefficients should be standardized for better comparison of their contributions to ACS Exam score prediction. Values of .25 and .44 were obtained for emotion and SATM, respectively. The intercept with mean-centered predictors was 23.5, so the predicted ACS score for a student with an average emotional satisfaction score and an average SATM score is 23.5. SATM scores uniquely account for 18.8% of the variability in ACS Exam scores, while emotional satisfaction uniquely accounted for 6.16%.

A two-predictor model using intellectual accessibility and SATM was also examined. SATM uniquely accounted for 18.0% of the variability in ACS score, while intellectual accessibility uniquely accounted for 4.05%. The obtained R² value was .262, suggesting


about 26% of the variance in ACS score was accounted for by the two predictors, which is lower than for the two-predictor model with emotional satisfaction and SATM.

Comparisons of the three regression models suggest that a two-predictor model based on emotional satisfaction and SATM represents the best model to predict ACS Exam score for these data. In terms of predictive validity, the emotional satisfaction score from ASCI V2 can be used to improve predictions of chemistry achievement beyond those that would result from the use of a predictor of mathematical ability alone, which supports the idea that emotional satisfaction is a distinct measured construct.

Evaluation of ASCI V3 for Chemistry Class at University MW

In ASCI V2, items 1, 2 and 3 were all about intellectual accessibility. I am interested in seeing how the survey works when the items from the two scales are more mixed. For this purpose, the positions of items 2 and 8 were swapped, and thus the new version V3 was created. Hopefully this might reduce possible measurement error and achieve better quality for V3. ASCI V3 was given to chemistry students at a large Midwestern public research university (MW) during the week of April 27, 2009. There were 309 responses returned, 305 of which had complete sets of data and were used for the following analysis. The results of ASCI V2 from PL classes are also listed for comparison.

Results and Discussion

Descriptive statistics were computed for each item score after all four negatively stated items were recoded (Table 6.8). The average scores for each item ranged from 3.07 to 4.44, and standard deviations ranged from 1.29 to 1.58. No item was found to have skewness or kurtosis greater than .93, which suggests good normality of the scores. Each


item mean score was higher than that in the V2 version for PL students, which suggests a more positive attitude for the MW students.

Table 6.8 Descriptive statistics for ASCI V3, MW chemistry

Item #                                          MW, n=305, V3     PL, n=354, V2
                                                Mean   SD         Mean   SD
1r             easy-hard                        3.64   1.48       2.81   1.28
2 (in PL, #8)  chaotic-organized                4.44   1.29       4.26   1.66
3              confusing-clear                  4.26   1.40       3.36   1.40
4r             comfortable-uncomfortable        4.41   1.35       3.64   1.36
5r             satisfying-frustrating           4.29   1.58       3.24   1.70
6              challenging-unchallenging        3.07   1.29       2.50   1.50
7r             pleasant-unpleasant              4.13   1.35       3.38   1.41
8 (in PL, #2)  complicated-simple               3.46   1.33       2.95   1.43

Note: Item score is reversed before averaging. Each score should range from 1 to 7, with 4 as the midpoint. A higher score means students feel chemistry is intellectually accessible and emotionally satisfying. Item 2 has the highest mean of 4.44 and item 6 has the lowest mean of 3.07, which indicates they feel chemistry is organized and is challenging.

Internal consistencies were calculated for each subscale using listwise deletion (Table 6.9). According to the instrument design, the intellectual accessibility subscale has four items (items 1, 3, 6 and 8) and the emotional satisfaction subscale has four items (items 4, 5, 7 and 2). Alpha was .77 and .78 for the two subscales, both above the satisfactory level of .70 and comparable to Bauer's results.

Table 6.9 Internal consistency for V3, MW chemistry

Internal consistency by alpha   MW (N=305), V3   PL (N=354), V2   Literature (Bauer, 2008), V1
Intellectual accessibility      .77              .82              .78 (a)
Emotional satisfaction          .78              .79              .79

a: The ASCI V1 intellectual accessibility scale has five items.
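For readers who want to reproduce an internal consistency figure of this kind, the following is a minimal sketch of Cronbach's alpha in plain Python. It is not the procedure used in the study (which relied on a statistics package); the function name `cronbach_alpha` and the response data are hypothetical, and listwise deletion is assumed to have been applied before the call.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns.

    `items` is a list of k lists, one per item, each holding the scores
    of the same respondents in the same order (complete cases only).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum
    item_var = sum(variance(col) for col in items)    # sum of item variances
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical recoded responses of five students to a four-item subscale
demo = [
    [2, 3, 4, 5, 6],
    [1, 3, 4, 4, 6],
    [2, 2, 5, 5, 7],
    [3, 3, 4, 6, 6],
]
print(round(cronbach_alpha(demo), 2))
```

Negatively stated items must be reverse-coded before the call; otherwise their negative covariance with the other items lowers alpha artificially.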


An exploratory factor analysis was performed using strategy B (principal axis factor analysis and Promax rotation). Two factors were extracted. Item loadings are listed in Table 6.10. All items loaded well on the assumed factors except items 1 and 3.

Table 6.10 The item loadings for V3 at MW and V2 from PL sessions at SE

Item # in each scale                            Loading, MW V3 (N=305)   Loading, PL V2 (n=354)
                                                F1     F2                F1     F2
Intellectual accessibility
1r             easy-hard                        0.35   0.49              0.38   0.38
8 (in V2, #2)  complicated-simple               0.02   0.76              0.01   0.81
3              confusing-clear                  0.45   0.37              0.19   0.66
6              challenging-unchallenging        0.19   0.81              0.11   0.73
Emotional satisfaction
4r             comfortable-uncomfortable        0.74   0.00              0.80   0.06
5r             satisfying-frustrating           0.89   0.13              0.79   0.01
7r             pleasant-unpleasant              0.76   0.08              0.80   0.03
2 (in V2, #8)  chaotic-organized                0.39   0.08              0.42   0.06

Note: EFA is performed in SPSS 17.0. Strategy B was used, including principal axis factor analysis and Promax rotation using the covariance matrix. Two factors are extracted for comparison. Item loadings on the assumed factors are highlighted in bold. Items not loaded on the assumed factors in the literature are in italic.
r: negatively stated

Factor scores, created by averaging item scores, and the factor correlation value are listed in Table 6.11. The factor score is 3.61 for intellectual accessibility and 4.31 for emotional satisfaction, with a correlation of .51 between them. The factor correlation was similar to that from V2 for the peer-led classes, which indicates the factors function similarly across different versions. Factor scores for MW students are higher on both scales, which indicates they feel chemistry is more intellectually accessible and satisfying.


Table 6.11 Factor scores and Pearson correlations for V3, MW and PL classes at SE

                             MW, V3 (n=305)                   PL, V2 (n=354)
                             Intellectual access   Emotion    Intellectual access   Emotion
Mean                         3.61                  4.31       2.91                  3.63
SD                           1.07                  1.08       1.13                  1.20
Factor correlation
Intellectual accessibility                         0.51                             0.64

Confirmatory factor analysis (CFA) was performed to estimate the influence of measurement error and the fit of the two-factor model. The estimation of the two-factor model fit for the MW chemistry class (Table 6.12) is χ²(N = 305, df = 19, p = .00) = 113.2, CFI = .89, RMSEA = .13, SRMR = .07. Since CFI was less than .95 and RMSEA was greater than .08, the overall model does not fit the data well. The source of misfit is the relatedness of the error terms of items 6 and 8. The χ² would decrease by 45.0 if this association were added to the model.

Table 6.12 Fit statistics of V3 at MW from two-factor CFA

Fit Statistic   Value
χ²              113.2
df              19
p               .00
N               305
CFI             .89
RMSEA           .13
SRMR            .07

Note: χ² = Chi-square, df = degrees of freedom, N = sample size, CFI = Comparative Fit Index, RMSEA = Root Mean Square Error of Approximation, SRMR = Standardized Root Mean Square Residual

[...] which is consistent with previous findings.


From the above analysis, the construct validity of ASCI V3 results is not supported by the two-factor CFA.

Evaluation of ASCI V3.1 for Attitudes toward Calculus

All versions of ASCI discussed so far are for chemistry attitude. In order to examine how the survey works for [...] was changed to make Version 3.1. At about the same time as the ASCI V3 administration, ASCI V3.1 was given to 7 sections of calculus courses during the first 15 minutes of class (3 from Engineering Calculus II, 1 from Engineering Calculus I, 1 from Engineering Calculus III, 1 from Life Science Calculus II, and 1 from Calculus I) during the week of April 20 through April 24. There were 190 responses returned; the 187 with complete sets of data and a correct UID, without multiple answers, were used for the following analysis.

Results and Discussion

Descriptive statistics were computed for each item score after all four negatively stated items were recoded (Table 6.13). The average scores for each item range from 2.66 to 4.68, and standard deviations range from 1.41 to 1.76. No item was found to have skewness or kurtosis greater than 1, which suggests good normality of the scores.
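The recoding step applied before the descriptive statistics can be sketched as follows. The text only says that the four negatively stated items were recoded; the formula used here (low + high - score, so that 1 becomes 7 on the seven-point scale) is the usual convention and is an assumption, as are the function names and the example answers.

```python
def reverse_code(score, low=1, high=7):
    """Reverse a response on a low..high scale (1 <-> 7, 2 <-> 6, ...)."""
    return low + high - score

def recode(responses, reversed_items):
    """Return a copy of one respondent's answers with selected items reversed.

    `responses` maps item number -> raw score; `reversed_items` holds the
    numbers of the negatively stated items.
    """
    return {item: reverse_code(score) if item in reversed_items else score
            for item, score in responses.items()}

# Hypothetical raw answers; items 1, 4, 5 and 7 are the "r" items
raw = {1: 6, 2: 5, 3: 4, 4: 2, 5: 3, 6: 2, 7: 1, 8: 3}
print(recode(raw, reversed_items={1, 4, 5, 7}))
```

After recoding, a higher score on every item points in the same direction (more accessible, more satisfying), which is what allows item scores to be averaged within a subscale.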


Table 6.13 Descriptive statistics of ASCI V3.1 for calculus attitude

Item #                              Calculus, V3.1 (N=187)
                                    Mean   SD
1r  easy-hard                       3.29   1.59
2   chaotic-organized               4.68   1.47
3   confusing-clear                 4.09   1.65
4r  comfortable-uncomfortable       4.07   1.55
5r  satisfying-frustrating          3.86   1.76
6   challenging-unchallenging       2.66   1.45
7r  pleasant-unpleasant             3.94   1.49
8   complicated-simple              3.16   1.41

Note: Item score is reversed before averaging. Each score should range from 1 to 7, with 4 as the midpoint. A higher score means students feel calculus is intellectually accessible and emotionally satisfying. Item 2 has the highest mean of 4.68 and item 6 has the lowest mean of 2.66 (italic), which indicates they feel calculus is organized and is challenging.

The internal consistencies were calculated for each subscale using listwise deletion (Table 6.14). The intellectual accessibility subscale has four items (items 1, 3, 6 and 8) and the emotional satisfaction subscale has four items (items 2, 4, 5 and 7). Alpha was .83 and .78 for the two subscales, which is above the satisfactory level of .70 and comparable to the literature results.

Table 6.14 Internal consistency for V3.1, calculus sections

Internal consistency by alpha              Calculus V3.1 (N=187)   Literature V1 (Bauer, 2008)
Intellectual accessibility (1, 8, 3, 6)    .83                     .78 (a)
Emotional satisfaction (4, 5, 7, 2)        .78                     .79

a: In the original version from the literature, intellectual accessibility has five items.

An exploratory factor analysis was performed using strategy B (principal axis factor analysis and Promax rotation). Two factors were extracted. Item loadings are listed in Table 6.15. All items loaded well on the assumed factors except items 5 and 2. It is not clear


[...] and cognitive interviews need to be done to examine the possible reason for this different data pattern.

Table 6.15 The item loadings for V3.1, calculus sections, from EFA

Item # in each scale                 Loading, Calculus V3.1 (N=187)   Loading, MW V3 (N=305)
                                     F1     F2                        F1     F2
Intellectual accessibility
1r  easy-hard                        0.53   0.29                      0.35   0.49
8   complicated-simple               0.65   0.07                      0.02   0.76
3   confusing-clear                  0.61   0.20                      0.45   0.37
6   challenging-unchallenging        0.81   0.02                      0.19   0.81
Emotional satisfaction
4r  comfortable-uncomfortable        0.89   0.16                      0.74   0.00
5r  satisfying-frustrating           0.15   0.69                      0.89   0.13
7r  pleasant-unpleasant              0.79   0.02                      0.76   0.08
2   chaotic-organized                0.07   0.88                      0.39   0.08

Note: EFA is performed in SPSS 17.0. Strategy B was used, including principal axis factor analysis and Promax rotation using the covariance matrix. Two factors are extracted for comparison. Item loadings on the assumed factors are highlighted in bold. Items not loaded on the assumed factors in the literature are in italic.
r: negatively stated

Factor scores, created by averaging the associated item scores, are listed in Table 6.16 for each course level. Life Sciences Calculus II had the lowest factor scores: 3.13 for intellectual accessibility and 3.36 for emotional satisfaction. For the Engineering Calculus course, the higher-level courses had significantly higher levels of positive attitudes toward calculus (e.g., the intellectual accessibility score was 3.56, 3.79 and 4.33 for levels I, II and III, respectively), with the median effect size ranging from .2 to .6, which indicates level III students feel that calculus is more intellectually accessible and emotionally satisfying. Calculus I had mean scores similar to those of Engineering Calculus I.
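The text does not say which effect-size statistic was used for the comparison between course levels. As an illustration only, the sketch below assumes Cohen's d with a pooled standard deviation and applies it to the intellectual accessibility means, standard deviations and group sizes given for Engineering Calculus I and II in Table 6.16; the function name `cohens_d` is ours.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean2 - mean1) / math.sqrt(pooled_var)

# Intellectual accessibility: Engineering Calculus I (n=16) vs II (n=92)
d = cohens_d(3.56, 1.03, 16, 3.79, 1.08, 92)
print(round(d, 2))  # 0.21
```

Under this assumption the level I versus level II difference is a small effect, at the lower end of the range quoted in the text.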

Table 6.16 Factor scores for V3.1 calculus sections

                                   Intellectual Accessibility    Emotional Satisfaction
Level                        N     Mean    SD                    Mean    SD
Life Sciences Calculus II    31    3.17    1.05                  3.36    1.27
Calculus I                   29    3.53    1.00                  3.58    1.01
Engineering Calculus I       16    3.56    1.03                  3.56    1.36
Engineering Calculus II      92    3.79    1.08                  3.84    1.26
Engineering Calculus III     19    4.33    1.30                  4.45    1.27
Overall                      187   3.68    1.12                  3.76    1.26

Although a trend of attitude differences among these levels was observed in Table 6.16, it is worthwhile to avoid any hasty conclusion, because the student sample here may not represent the whole calculus population and many factors may exist and affect attitudes. One factor is that students from different majors may have different attitudes towards chemistry. Non-science major students may hold more negative attitudes towards science (Gogolin & Swartz, 1992). The question arises: did the student sample here represent the student population enrolled in calculus in terms of declared major? If some specific major is overrepresented in the sample, the conclusion could be biased. To get a gross impression of the entire enrollment of each class, the frequency of declared majors for each calculus course level is listed in Table 6.17. Calculus I has the most diverse students, from 40 different majors. Life Sciences Calculus II consists of students mostly from Biomedical Sciences, Environmental Science & Policy and other majors. All three levels of Engineering Calculus have students predominantly from engineering or pre-engineering, with the percentage greater than 76%.

The frequency of declared majors for the students in this study was compared with that of the enrolled students who did not take the survey, using Chi-square analysis and the Fisher exact test. For Life Sciences Calculus II, χ²(N = 40, df = 12, p = .15) = 17.1, Fisher value: .21; for Calculus I,

χ²(N = 200, df = 39, p = .91) = 27.8, no Fisher value [...]; for Engineering Calculus III, χ²(N = 218, df = 20, p = .45) = 20.2, Fisher value: .42. None of the above Chi-square tests and Fisher values was significant at the .05 level, which means there is no evidence to conclude that the students sampled in our study and those who did not take the survey differ in declared majors.

Table 6.17 The frequency of the most declared majors for students at each course level

Course level / Major code    Major of students enrolled        Number    Percent (%)
Calculus I
  BMS                        Biomedical Sciences               31        15.5
  PGU                        Pre Engineering                   27        13.5
  PBI                        Pre Biology/Microbiology          23        11.5
  ESP                        Environmental Science & Policy    19        9.5
  CHM                        Chemistry                         17        8.5
  EGU                        Engineering General               14        7
  Total                                                        200       100
Life Calculus II
  BMS                        Biomedical Sciences               13        32.5
  ESP                        Environmental Science & Policy    13        32.5
  EGU                        Engineering General               2         5
  Total                                                        40        100
Engineering Calculus I
  PGU                        Pre Engineering                   121       43.37
  EGU                        Engineering General               92        32.97
  PAR                        Pre Architecture                  28        10.04
  BMS                        Biomedical Sciences               5         1.79
  PBA                        COB Pre Business Admin            5         1.79
  Total                                                        279       100
Engineering Calculus II
  EGU                        Engineering General               141       51.65
  PGU                        Pre Engineering                   95        34.8
  CHM                        Chemistry                         9         3.3
  PAR                        Pre Architecture                  3         1.1
  Total                                                        273       100
Engineering Calculus III
  EGU                        Engineering General               109       50
  PGU                        Pre Engineering                   69        31.65
  BCS                        Computer Science                  10        4.59
  MTH                        Mathematics                       7         3.21
  CHM                        Chemistry                         2         0.92
  Total                                                        218       100
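The p values reported with the Chi-square statistics for the declared-major comparison can be checked from each statistic and its degrees of freedom. The sketch below is a standard-library illustration, not the procedure used in the study (the tests were presumably run in a statistics package). The helper name `chi2_sf_even_df` is hypothetical, and it only handles even degrees of freedom, where the upper tail probability has a closed form.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail probability P(X >= x) for a chi-square variable.

    Only valid for even df, where
        P = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    Odd df would need an incomplete gamma routine (not shown here).
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Statistics quoted in the text: (chi-square, df)
    for chi2, df in ((17.1, 12), (20.2, 20)):
        print(chi2, df, round(chi2_sf_even_df(chi2, df), 3))
```

For the Life Sciences Calculus II test (17.1, df = 12) and the N = 218 test (20.2, df = 20), this gives values that round to the reported p = .15 and p = .45.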

For Engineering Calculus II, χ²(N = 273, df = 20, p = .09) = 8.7, Fisher value: .03. The Fisher exact test indicated a significant difference at p < .05. When examining the roster, 51.7% of students declared a major of Pre Engineering and 34.8% declared Engineering General. Among the surveyed students, 61.5% declared Pre Engineering (a larger share than among those who did not take the survey) and 24.2% declared Engineering General. According to the admission requirements (USF, 2008), students meeting or exceeding certain criteria (high school GPA, SAT, etc.) are granted direct entry to the College, and Pre Engineering [...] calculus attitude score for Engineering Calculus II could be biased using this sample.

CFA was performed to estimate the goodness of fit of the 2-factor model. The estimated fit of the 2-factor model for the Calculus class (Table 6.18) was χ²(N = 187, df = 19, p = .00) = 76.2, CFI = .91, RMSEA = .13, SRMR = .07. Since CFI was less than .95 and RMSEA was greater than .08, the overall model did not fit the data reasonably well. In terms of possible reasons for the misfit, the combination of different course levels for different populations (environment, life and general calculus I, II, and III) could be a [...] improve the model fit. Another reason could be that calculus students [...] subjective words to describe their cognitive and affective attitudes towards calculus, since the words in ASCI were originally constructed specifically for chemistry students. Cognitive interviews may reveal possible hints for calculus attitude items, but they were not performed in this study.
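The relationship between the Chi-square statistic and the RMSEA reported for the calculus CFA can be sketched as follows. The source does not give the formula its software used. The version below divides by N, while some programs divide by N - 1, so treat the function `rmsea` as an illustrative assumption rather than the study's exact computation.

```python
import math


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation from a model chi-square.

    Assumption: uses N in the denominator; other software uses N - 1,
    which changes the third decimal place for small samples.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


if __name__ == "__main__":
    # Calculus, V3.1: chi-square = 76.2, df = 19, N = 187 (values from the text)
    value = rmsea(76.2, 19, 187)
    print(round(value, 2))
    # Cutoff used in the text: RMSEA above .08 signals poor fit
    print("poor fit by RMSEA" if value > 0.08 else "acceptable by RMSEA")
```

With the calculus values the result rounds to the reported .13, which is above the .08 cutoff applied in the text.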

Table 6.18 Fit statistics of V3.1 calculus results from two-factor CFA along with MW

Fit statistic    Calculus, V3.1    MW, V3
χ²               76.2              113.2
df               19                19
p                .00               .00
N                187               305
CFI              .91               .89
RMSEA            .13               .13
SRMR             .07               .07

Note: χ² = Chi-square, df = degrees of freedom, N = sample size, CFI = Comparative Fit Index, RMSEA = Root Mean Square Error of Approximation, SRMR = Standardized Root Mean Square Residual.

The factor correlation from CFA between the "intellectual accessibility" and "emotional satisfaction" factors was .86, which suggests a strong correlation between the two components and is consistent with previous findings.

Although the above comparison was made for different courses and different populations, a similar pattern appeared consistently. In both cases, the V3 and V3.1 surveys may not be useful for identifying the attitude components of college students. The reliability for each scale is acceptable. The construct validity is not completely confirmed. Some items did not load on the assumed factor, and the measurement errors of items 6 and 8 were correlated: adding this correlation to the model decreased χ² by about 34.3 for the calculus data and 50.5 for the MW data. [...] different item order can lead to these different response patterns. One possible theory for the item order effect is that ASCI V2 may trigger students to discriminate the trivial differences among the items. Are the first two items, easy vs hard and complicated vs simple, the same or different? Maybe students got cues to find distinctions for each item and paid more attention. In sum, ASCI V3.1 is not yet ready for the calculus course at this point.

Evaluation of ASCI V2 for Biochemistry Class

The results of ASCI V3 for MW chemistry students and V3.1 for the calculus students exhibit simple and clean internal structure. After examining the source of misfit, the error terms of items 6 and 8 were found to covary, which harmed the model fit in both situations. Meanwhile, the cleaner 2-factor solution was confirmed by the ASCI V2 results. The investigator decided to move back to V2, in the hope that ASCI V2 could reproduce the good model fit result.

ASCI V2 was given to the students enrolled in the Biochemistry course during the last 15 minutes of the class on May 18. There were 105 responses returned, 88 with complete data, which were used for the following analysis.

Results and Discussion

Descriptive statistics were computed for each item score after all 4 negatively stated items were recoded (Table 6.19). The average scores for each item ranged from 2.84 to 4.55, and the standard deviations ranged from 1.25 to 1.62. No item was found to have skewness or kurtosis greater than .9, which suggests good normality of the item responses.

Table 6.19 Descriptive statistics of ASCI V2 for Biochemistry class

Item #                             Biochemistry (N=88)
                                   Mean    SD
1r  easy - hard                    3.25    1.24
2   complicated - simple           3.43    1.55
3   confusing - clear              3.89    1.40
4r  comfortable - uncomfortable    4.06    1.38
5r  satisfying - frustrating       3.84    1.56
6   challenging - unchallenging    2.84    1.57
7r  pleasant - unpleasant          3.82    1.42
8   chaotic - organized            4.51    1.46

Note: Scores of items marked "r" are reversed before averaging. Each score should range from 1 to 7, with 4 as the midpoint. A higher score means students feel chemistry is intellectually accessible and emotionally satisfying. Item 8 has the highest mean of 4.55, and item 6 has the lowest mean of 2.84, which indicates that students feel chemistry is organized and challenging.
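The recoding and scoring steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: reversal on the 7-point scale is taken to be 8 minus the raw score, subscale scores are computed as item means (so they stay on the 1 to 7 scale used in the tables), and the response values in `example` are invented for demonstration. The item groupings follow the V2 subscales (items 1, 2, 3, 6 and items 4, 5, 7, 8), and the reversed items are those marked "r" in Table 6.19.

```python
from statistics import mean

# Items marked "r" (negatively stated) in ASCI V2, Table 6.19
REVERSED_ITEMS = {1, 4, 5, 7}

# V2 subscale membership as given in the text
SUBSCALES = {
    "intellectual_accessibility": (1, 2, 3, 6),
    "emotional_satisfaction": (4, 5, 7, 8),
}


def recode(responses):
    """Reverse negatively stated items on a 1-7 scale (assumed: 8 - score)."""
    return {
        item: (8 - score if item in REVERSED_ITEMS else score)
        for item, score in responses.items()
    }


def subscale_scores(responses):
    """Mean of the recoded items belonging to each subscale."""
    recoded = recode(responses)
    return {
        name: mean(recoded[item] for item in items)
        for name, items in SUBSCALES.items()
    }


if __name__ == "__main__":
    # Hypothetical raw responses for one student: item number -> 1..7
    example = {1: 5, 2: 3, 3: 4, 4: 2, 5: 3, 6: 2, 7: 3, 8: 6}
    print(subscale_scores(example))
```

Keeping the reversed-item set and the subscale map as explicit constants makes it easy to swap in the V3 or V3.1 item order, where the same adjective pairs carry different item numbers.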

Internal consistencies were calculated for each subscale using listwise deletion (Table 6.20). Cronbach's alpha was .78 and .81 for the 2 subscales, which was above the satisfactory level of .70 and comparable to the literature results. An exploratory factor analysis was performed using strategy B (principal axis factoring with Promax rotation). Two factors were extracted. Item loadings for each item are listed in Table 6.21. All items loaded well on the assumed factors.

Table 6.20 Internal consistency for V2 in Biochemistry class

ASCI, V2 scale and items                 Biochemistry (N=95)    PL (n=354)    Literature V1 (Bauer, 2008)
Intellectual accessibility (1,2,3,6)     .79                    .82           .78 a
Emotional satisfaction (4,5,7,8)         .81                    .79           .79

a In ASCI V1, Intellectual accessibility has five items.

Table 6.21 The item loadings of V2 for Biochemistry and PL classes

                                   Biochemistry (N=88)    PL (N=354)
Item # in each scale               F1      F2             F1      F2
Intellectual accessibility
1r  easy - hard                    0.32    0.42           0.41    0.66
2   complicated - simple           0.17    1.07           0.34    0.75
3   confusing - clear              0.37    0.47           0.45    0.68
6   challenging - unchallenging    0.07    0.53           0.00    0.86
Emotional satisfaction
4r  comfortable - uncomfortable    0.75    0.08           0.74    0.29
5r  satisfying - frustrating       0.75    0.08           0.72    0.36
7r  pleasant - unpleasant          0.75    0.08           0.75    0.38
8   chaotic - organized            0.40    0.00           0.71    0.02

Note: EFA was performed in SPSS 17.0. Strategy B was used, including principal axis factoring and Promax rotation using the covariance matrix. Two factors were extracted for comparison. Item loadings on the assumed factors are highlighted in bold. r: negatively stated.

Factor scores, created by summing the item scores, and factor correlations are listed in Table 6.22.
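The internal consistency calculation with listwise deletion can be sketched as below. It assumes the coefficient reported in Table 6.20 is Cronbach's alpha (the usual coefficient judged against a .70 threshold). The function name `cronbach_alpha` and the small data set are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one subscale with listwise deletion.

    rows: one list per respondent, holding that respondent's recoded item
    scores for the subscale; None marks a missing response.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    # Listwise deletion: keep only respondents with complete data
    complete = [row for row in rows if None not in row]
    k = len(complete[0])
    item_variances = [variance(column) for column in zip(*complete)]
    total_variance = variance([sum(row) for row in complete])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical scores for a four-item subscale (five respondents)
    data = [
        [3, 4, 3, 2],
        [5, 5, 4, 4],
        [2, 3, 2, 1],
        [6, None, 5, 5],   # dropped by listwise deletion
        [4, 4, 5, 3],
    ]
    print(round(cronbach_alpha(data), 2))
```

Listwise deletion discards a respondent entirely if any item in the subscale is missing, which is why the analyzed N can be smaller than the number of returned surveys.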

Attitude scores of Biochemistry students were compared only with those of Peer-led classes and Peer leaders. Scores on ASCI V3 are not listed because the 2-factor structure was not confirmed. The attitude score for Biochemistry students was 3.36 for Intellectual accessibility and 4.09 for Emotional satisfaction, with a correlation of .54 between them. The factor correlation was similar to that of V2 for the Peer-led classes, which indicates that the factors function similarly across different populations. The attitude score for Biochemistry students was higher than that of the PL classes on both scales. Because enrollment in the Biochemistry [...] General Chemistry I and II, Organic Chemistry I and II, and Biology [...], the result reflected our expectation that Biochemistry students with more chemistry expertise held a more positive attitude (the effect size is about .3). Peer leaders were most satisfied with chemistry, and they scored the highest on both scales.

Table 6.22 Factor scores and Pearson correlations for PL and Biochemistry classes

ASCI, V2                 Biochemistry (n=88)    PL student (n=354)    Peer leader (n=12)
                         Access    Emotion      Access    Emotion     Access    Emotion
Mean                     3.36      4.06         2.91      3.62        4.27      4.58
SD                       1.13      1.17         1.13      1.20        1.66      2.01
Factor correlation
Intellectual access                0.54                   0.64                  0.33 (n.s.)

CFA was performed to estimate the goodness of fit of the 2-factor model. Each item was set to load on its assumed factor only. The estimated fit of the 2-factor model for the Biochemistry class (Table 6.23) was χ²(N = 88, df = 19, p = .059) = 29.5, which failed to reject the null hypothesis of a good fit. Since CFI was greater than .95, RMSEA was less than .08 and SRMR was less than .08, the overall model fit the data well.

The factor correlation from CFA between the two factors was .67, which suggests a strong correlation between the two components and is consistent with previous findings (r: .80).

Table 6.23 Fit statistics of V2 results from two-factor CFA

Fit statistic    Biochemistry     PL
χ²               29.5             77.0
df               19               19
p                .059 (n.s.)      .00
N                88               354
CFI              .96              .95
RMSEA            .079             .093
SRMR             .050             .042

Note: χ² = Chi-square, df = degrees of freedom, N = sample size, CFI = Comparative Fit Index, RMSEA = Root Mean Square Error of Approximation, SRMR = Standardized Root Mean Square Residual.

Summary

The above results evaluated the reliability and construct validity of ASCI V2 data from two groups of students, enrolled in PL and Biochemistry classes. For these eight items on the survey, the two-factor structure (intellectual accessibility and emotional satisfaction) is supported by CFA for the V2 results but not for the V3 and V3.1 results.

Chapter Seven
Summary and Concluding Discussion

Summary and Implications for Chemical Education

Starting with the original ASCI (V1) developed by Bauer, we designed and implemented a process of scale development to refine the instrument for better construct validity. Better construct validity results in greater ease of interpretation, which we demonstrated via an examination of a nomological network and predictive validity for ASCI V2. The research questions were successfully answered.

Research question one was to evaluate the psychometric quality of the responses from ASCI V1 in terms of reliability and validity. ASCI V1 results showed a data pattern similar to the literature, and the construct validity of ASCI V1 scores was not supported by confirmatory factor analysis.

Research question two was to propose modified versions of ASCI based on factor analysis and the theoretical framework of attitude. Starting from the 20-item ASCI, we deleted redundant items and scales, reconstructed meaningful subscales and estimated the data fit to theoretical models. Several methodological issues were clarified here. For example, confirmatory factor analysis was used as the most rigorous test of whether the internal data structure matches the conceptual framework. Additionally, factor scores were suggested for use when there is more than one factor among the measured data set, with only those items loading on the same construct used to determine the factor score. In general, the investigator has tried to showcase ways in which evidence for reliability and

validity of instrument scores can be provided so readers can interpret and apply research findings with a certain degree of confidence. This work represents the first time in the area of chemical education that a benchmark for scale development was successfully implemented to reconstruct an existing instrument.

Research question three was to evaluate the psychometric quality of the responses from the revised ASCI (V2, V3 and V3.1) in terms of reliability and validity. ASCI V2 was designed to measure two subscales of attitude: intellectual accessibility and emotional satisfaction. These two subscales are indicators of cognition and of affect, respectively, and thus make a good connection with the conceptual framework for attitude from psychology. Evidence for construct validity was obtained. The two-factor correlated structure was upheld by confirmatory factor analysis. As regards correlations among attitude, achievement and ability, attitude correlates with SATM, ACTM, and ACS scores as expected, which provides evidence for nomological network validity. Finally, once aptitude is taken into account, a component of attitude measured by ASCI V2 plays a significant unique part in predicting the final achievement of General Chemistry I Peer-led students. The two-factor correlated structure of ASCI V2 was further replicated for biochemistry students. ASCI scores could discriminate between different student cohorts, as supported by the more positive attitudes held by students in the Biochemistry course, who have a more advanced understanding of chemistry than those in the General Chemistry I course. Construct validity of ASCI V3 and V3.1 scores was not confirmed by CFA.

This study contributes to chemical education research by providing validity evidence for a revised instrument, ASCI V2, for college chemistry students. In addition to showcasing

a method for examining validity, a second important result of this work is to recommend refinements that lead to greater ease of administration while improving validity: this eight-item instrument now takes very little time to administer. Because of its convenience and high psychometric quality, chemistry educators who are interested in determining students' attitudes towards chemistry can easily find time to administer ASCI V2, even to a large class, and can expect the results to be interpretable.

Limitations of the Study and Recommendations for Future Research

We envision that a significant potential use of ASCI V2 will be to identify the effects of curricular reforms via experimental or quasi-experimental designs. Using ASCI V2 as a pre- and post-test for treatment and comparison groups would allow the effect of course experience on the attitude change of each student to be tracked. In addition, this type of study could provide further evidence for validity, which we were not able to provide here, by exploring attitude change directly with the same sample of students on the same measure, and by exploring whether the two factors relate to curricular intentions in an expected way.

It is not the case that ASCI V2 captures all potential aspects of attitude, nor that a single measure is desirable. Exploration of attitude theory and past research practice involves both a variety of quantitative assessments (Table 6.6 is but a small subset) and qualitative investigations. Understanding how ASCI V2 fits into this complex set of possibilities can only be accomplished by further examining its relationships with other measures of attitude, achievement, and additional constructs such as learning strategies, motivation, and behavior.

Ultimately, since we as chemical educators are increasingly asked to utilize assessment within our classrooms, we have a great need for easy-to-use instruments that yield valid, reliable, and readily interpretable scores for constructs of interest. We hope that, in the classes of our collective community [...] more rather than less positive, and also that ASCI V2 will help us all to investigate the truth of that proposition.

Since chemistry and calculus students showed a similar response pattern on ASCI V3 and V3.1, it is worthwhile to create a version V2.1 (by changing the word chemistry into calculus) for calculus classes. If it functions well with calculus, we can possibly develop more derivative versions for other science subjects like physics. However, this remains a challenge because people may use different adjectives to describe their attitudes towards different subjects, which necessitates writing some new items from scratch and conducting cognitive interviews.

References

American Association for the Advancement of Science (1989). Science for all Americans. Project 2061. The American Association for the Advancement of Science: Oxford University Press, Inc.
Aikenhead, G. S. (1973). The measurement of high school students' knowledge about science and scientists. Science Education, 51, 539-549.
Allport, G. W. (1935). Attitudes. In C. M. Murchison (Ed.), Handbook of social psychology (pp. 798-844). London: Oxford University Press.
Angoff, W. H. (1988). Validity: An evolving concept. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 19-32). Hillsdale, NJ: Lawrence Erlbaum Associates.
Ault, C. R. (1998). Criteria of excellence for geological inquiry: The necessity of ambiguity. Journal of Research in Science Teaching, 35(2), 189-212.
Bagozzi, R. P., & Burnkrant, R. E. (1979). Attitude organisation and the attitude-behaviour relationship. Journal of Personality and Social Psychology, 37, 913-929.
Bandura, A. (1977). Social learning theory. Oxford, England: Prentice Hall.
Bandura, A. (1982). Self-Efficacy Mechanism in Human Agency. American Psychologist, 37(2), 122-147.
Barbera, J., Adams, W. K., Wieman, C. E., & Perkins, K. K. (2008). Modifying and validating the Colorado Learning Attitudes about Science Survey for use in chemistry. Journal of Chemical Education, 85(10), 1435-1439.
[...] Concept Inventory for Assessment of the Affective Component of Student Learning. Journal of Chemical Education, 82(12), 1864-1870.
Bauer, C. (2008). Attitude towards Chemistry: A Semantic Differential Instrument for Assessing Curriculum Impacts. Journal of Chemical Education, 85(10), 1440-1445.
Bem, D. J. (1967). Self-Perception: An alternative interpretation of cognitive dissonance phenomena. Psychological Review, 74, 183-200.

Bem, D. J., & McConnell, H. K. (1970). Testing Self-Perception Explanation of Dissonance Phenomena: Salience of Premanipulation Attitudes. Journal of Personality and Social Psychology, 14(1), 23-31.
Berg, K. E. (1966). Ethnic attitudes and agreement with a Negro person. Journal of Personality and Social Psychology, 4, 215-220.
Biddle, S. H. J., & Mutrie, N. (2001). Psychology of physical activity: Determinants, well-being and interventions. New York, NY: Routledge.
Blalock, C. L., Lichtenstein, M. J., Owen, S., Pruski, L., Marshall, C., & Toepperwein, M. (2008). In Pursuit of Validity: A comprehensive review of science attitude instruments 1935-2005. International Journal of Science Education, 30(7), 961-977.
Bong, M. (2001). Between- and within-domain relations of academic motivation among middle and high school students: Self-efficacy, task-value, and achievement goals. Journal of Educational Psychology, 93(1), 23-34.
Bong, M., & Clark, R. E. (1999). Comparison between self-concept and self-efficacy in academic motivation research. Educational Psychologist, 34(3), 139-153.
Bray, D. W. (1950). The prediction of behavior from two attitude scales. Journal of Abnormal and Social Psychology, 45, 64-84.
Breckler, S. J. (1984). Empirical Validation of Affect, Behavior, and Cognition as Distinct Components of Attitude. Journal of Personality and Social Psychology, 47(6), 1191-1205.
Brophy, B. M. (2004). Motivating students to learn (2nd edition). Hillsdale: Lawrence Erlbaum Associates.
Brown, A. L. (1987). Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In F. E. Weinert & R. H. Kluwe (Eds.), Metacognition, motivation, and understanding (pp. 65-116). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Chatterjee, S., Williamson, V. M., McCann, K., & Peck, M. L. (2009). Surveying Students' Attitudes and Perceptions toward Guided-Inquiry and Open-Inquiry Laboratories. Journal of Chemical Education, 86(12), 1427.
Cheng, S. T., & Chan, A. C. M. (2003). The development of a brief measure of school attitude. Educational and Psychological Measurement, 63(6), 1060-1070.
Cohen, J. (1977). Statistical power analysis for the behavioral sciences (revised edition). New York: Academic Press.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd edition). Hillsdale, NJ: Erlbaum.
Cooper, M. M., & Sandi-Urena, S. (2009). Design and Validation of an Instrument To Assess Metacognitive Skillfulness in Chemistry Problem Solving. Journal of Chemical Education, 86(2), 240-245.
Corno, L., & Mandinach, E. B. (1983). The role of cognitive engagement in classroom learning and motivation. Educational Psychologist, 18, 88-108.
Cronbach, L. J. (1988). Five perspectives on validation argument. In H. Wainer & H. Braun (Eds.), Test validity (pp. 3-17). Hillsdale, NJ: Erlbaum.
Cronbach, L. J., & Meehl, P. E. (1955). Construct Validity in Psychological Tests. Psychological Bulletin, 52, 281-302.
Cronkhite, G., & Liska, J. R. (1980). The judgment of communicant acceptability. In M. R. Roloff & G. R. Miller (Eds.), Persuasion: New directions in theory and research. Beverly Hills, CA: Sage.
Cunningham, G. B., & Mahoney, K. (2004). Self-efficacy of part-time employees in university athletics: The influence of organizational commitment, valence of training, and training motivation. Journal of Sport Management, 18(1), 59-73.
Custers, E., & Ten Cate, O. T. J. (2002). Medical students' attitudes towards and perception of the basic sciences: a comparison between students in the old and the new curriculum at the University Medical Center Utrecht, The Netherlands. Medical Education, 36(12), 1142-1150.
Dahle, L. O., Forsberg, P., Svanberg-Hard, H., Wyon, Y., & Hammar, M. (1997). Problem-based medical education: development of a theoretical foundation and a science-based professional attitude. Medical Education, 31(6), 416-424.
DeVellis, R. F. (2003). Scale Development: Theory and Applications, Second Edition. Thousand Oaks, CA: Sage Publications.
Doran, R., Lawrenz, F., & Helgeson, S. (1994). Research on assessment in science. In D. Gabel (Ed.), Handbook of research on science teaching and learning (pp. 388-442). New York: Macmillan.
Eagly, A. H., & Chaiken, S. (1993). The psychology of attitudes. Fort Worth, TX: Harcourt Brace Jovanovich.

Eagly, A. H., & Chaiken, S. (2007). The advantages of an inclusive definition of attitude. Social Cognition, 25(5), 582-602.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272-299.
Festinger, L. (1957). A theory of cognitive dissonance. Stanford, CA: Stanford University Press.
Fishbein, M., & Ajzen, I. (1975). Beliefs, Attitude, Intention, and Behavior: An Introduction to Theory and Research. Reading, MA: Addison-Wesley.
Fisher, R. A. (1922). On the interpretation of χ² from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1), 87-94.
foundation, W. e. (2002). The testing industry's big four: Profiles of the four companies that dominate the business of making and scoring standardized achievement tests. Retrieved 2/16/2010, from http://www.pbs.org/wgbh/pages/frontline/shows/schools/testing/companies.html
Fowler, L. S. (1980). An Application of Piaget Theory of Cognitive Development in Teaching Chemistry: the Learning Cycle. Journal of Chemical Education, 57(2), 135-136.
Franko, D. L., Cousineau, T. M., Trant, M., Green, T. C., Rancourt, D., Thompson, D., et al. (2008). Motivation, self-efficacy, physical activity and nutrition in college students: Randomized controlled trial of an internet-based education program. Preventive Medicine, 47(4), 369-377.
Freedman, M. P. (1997). Relationship among laboratory instruction, attitude toward science, and achievement in science knowledge. Journal of Research in Science Teaching, 34(4), 343-357.
Gao, Z., Lee, A. M., & Harrison, L. (2008). Understanding students' motivation in sport and physical education: From the expectancy-value model and self-efficacy theory perspectives. Quest, 60(2), 236-254.
Gardner, P. L. (1975). Attitudes to science: A review. Studies in Science Education, 2, 1-41.
Gardner, P. L. (1995). Measuring Attitudes to Science: Unidimensionality and Internal Consistency Revisited. Research in Science Education, 25(3), 283-289.
Germann, P. J. (1988). Development of the attitude toward science in school assessment and its use to investigate the relationship between science achievement and attitude toward science in school. Journal of Research in Science Teaching, 25(8), 689-703.

Gogolin, L., & Swartz, F. (1992). A Quantitative and Qualitative Inquiry into the Attitudes toward Science of Nonscience College Students. Journal of Research in Science Teaching, 29(5), 487-504.
González, V. E. (1992). On human attitudes: Root metaphors in theoretical conceptions. University of Gothenburg.
Gouveiaoliveira, A., Rodrigues, T., & Demelo, F. G. (1994). Computer Education: Attitudes and Opinions of 1st-Year Medical Students. Medical Education, 28(6), 501-507.
Greenbaum, P. E., & Dedrick, R. F. (1998). Hierarchical confirmatory factor analysis of the Child Behavior Checklist/4-18. Psychological Assessment, 10(2), 149-155.
Grove, N., & Bretz, S. L. (2007). CHEMX: An Instrument to Assess Students' Cognitive Expectations for Learning Chemistry. Journal of Chemical Education, 84(9), 1524-1929.
Gutwill, J. (1998). Cognitive and attitudinal change in a ModularChem/ChemLinks modular chemistry course at the post-secondary level. Abstracts of Papers of the American Chemical Society, 215, 015-SOCED.
Hagger, M. S., Chatzisarantis, N. L., & Biddle, J. H. (2002). A meta-analytic review of the theories of reasoned action and planned behavior in physical activity: Predictive validity and the contribution of additional variables. Journal of Sport & Exercise Psychology, 24, 3-32.
Hogarty, K. Y., Hines, C. V., Kromrey, J. D., Ferron, J. M., & Mumford, K. R. (2005). The quality of factor solutions in exploratory factor analysis: The influence of sample size, communality, and overdetermination. Educational and Psychological Measurement, 65(2), 202-226.
Hren, D., Lukic, I. K., Marusic, A., Vodopivec, I., Vujaklija, A., Hrabak, M., et al. (2004). Teaching research methodology in medical schools: students' attitudes towards and knowledge about science. Medical Education, 38(1), 81-86.
Hsieh, P. H. P., & Schallert, D. L. (2008). Implications from self-efficacy and attribution theories for an understanding of undergraduates' motivation in a foreign language course. Contemporary Educational Psychology, 33(4), 513-532.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Jurisevic, M., Glazar, S. A., Pucko, C. R., & Devetak, I. (2008). Intrinsic motivation of pre-service primary school teachers for learning chemistry in relation to their academic achievement. International Journal of Science Education, 30(1), 87-107.

Koballa, T. R. J., & Crawley, F. E. (1985). The influence of attitude on science teaching and learning. School Science and Mathematics, 85, 222-232.
Kutner, B., Wilkins, C., & Yarrow, P. R. (1952). Verbal attitudes and overt behavior involving racial prejudice. Journal of Abnormal and Social Psychology, 47, 649-652.
Lapiere, R. T. (1934). Attitude vs. actions. Social Forces, 13, 230-237.
Laws, P. W., Rosborough, P. J., & Poodry, F. J. (1999). Women's responses to an activity-based introductory physics program. American Journal of Physics, 67(7), S32-S37.
Leavers, D. R. (1975). Course Which Changed Attitudes of Students Towards Science. Journal of Chemical Education, 52(12), 804-804.
Lewis, S. E., & Lewis, J. E. (2007). Predicting at-risk students in general chemistry: comparing formal thought to a general achievement measure. Chemistry Education: Research and Practice, 8(1), 32-51.
Lewis, S. E., Shaw, J. L., Heitz, J. O., & Webster, G. H. (2009). Attitude Counts: Self-Concept and Success in General Chemistry. Journal of Chemical Education, 86(6), 744-749.
Lichtenstein, M. J., Owen, S. V., Blalock, C. L., Liu, Y., Ramirez, K. A., Pruski, L. A., et al. (2008). Psychometric reevaluation of the scientific attitude inventory-revised (SAI-II). Journal of Research in Science Teaching, 45(5), 600-616.
Likert, R. (1932). A Technique for the Measurement of Attitudes. Archives of Psychology (140), 1-55.
Lindsay, H. A. (2001). Investigation of the relationship between selected cognitive and non-cognitive variables and achievement in sophomore organic chemistry. Abstracts of Papers of the American Chemical Society, 221, 18-CHED.
MacCallum, R. C., Widaman, K. F., Preacher, K. J., & Hong, S. (2001). Sample size in factor analysis: The role of model error. Multivariate Behavioral Research, 36(4), 611-637.
Margolis, H., & McCabe, P. P. (2006). Improving self-efficacy and motivation: What to do, what to say. Intervention in School and Clinic, 41(4), 218-227.
Marsh, H. W., & Balla, J. (1994). Goodness of Fit in Confirmatory Factor Analysis: the Effects of Sample Size and Model Parsimony. Quality & Quantity, 28(2), 185-217.

Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness of Fit Indexes in Confirmatory Factor Analysis: the Effect of Sample Size. Psychological Bulletin, 103(3), 391-410.
Mayer, V. J., & Richmond, J. M. (1982). An overview of assessment instruments in science. Science Education, 66, 49-66.
McGuire, W. J. (1985). Attitudes and attitude change. In G. Lindzey & E. Aronson (Eds.), Handbook of Social Psychology, 3rd ed. (Vol. 2: 233-346). New York: Random House.
Messick, S. (1980). Test Validity and the Ethics of Assessment. American Psychologist, 36(11), 1012-1027.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd edition) (pp. 13-103). Washington, DC: American Council on Education.
Mills, N., Pajares, F., & Herron, C. (2007). Self-efficacy of college intermediate French students: Relation to achievement and motivation. Language Learning, 57(3), 417-442.
Mohsin, S. M. (1990). Attitude: Concept, formation and change. New Delhi: Wiley, Darya Ganj.
Moss, P. A. (1994). Can there be validity without reliability? Educational Researcher, 23, 5-12.
Munby, H. (1983). An investigation into the measurement of attitudes in science education. Columbus, OH: SMEAC Information Reference Center.
Muthen & Muthen.
Netemeyer, R. G., Bearden, W. O., & Sharma, S. (2003). Scaling Procedures: Issues and Applications. Thousand Oaks, CA: Sage Publications.
Neuberg, S. L., West, S. G., Judice, T. N., & Thompson, M. M. (1997). On dimensionality, discriminant validity, and the role of psychometric analyses in [...] of the Need for Closure Scale. Journal of Personality and Social Psychology, 73, 1017-1029.
Nunnally, J. C. (1978). Psychometric Theory. Second ed. New York: McGraw-Hill.
Oppenheim, V. E. (1992). Questionnaire design, interviewing and attitude measurement. New York: St. Martin's Press.

Osborne, J. W., & Waters, E. (2002). Four Assumptions Of Multiple Regression That Researchers Should Always Test. Practical Assessment, Research, and Evaluation, 8(2).
Ostrom, T. M. (1969). The relationship between the affective, behavioral, and cognitive components of attitude. Journal of Experimental Social Psychology, 5, 12-30.
Ozmen, H. (2008). The influence of computer-assisted instruction on students' conceptual understanding of chemical bonding and attitude toward chemistry: A case for Turkey. Computers & Education, 51(1), 423-438.
Pajares, F., & Graham, L. (1999). Self-efficacy, motivation constructs, and mathematics performance of entering middle school students. Contemporary Educational Psychology, 24(2), 124-139.
Palmer, C. L., Burwitz, L., Dyer, A. N., & Spray, C. M. (2005). Endurance training adherence in elite junior netball athletes: A test of the theory of planned behaviour and a revised theory of planned behaviour. Journal of Sports Sciences, 23(3), 277-288.
Paris, S. G., Yambor, K. M., & Packard, B. W. L. (1998). Hands-on biology: A museum-school-university partnership for enhancing students' interest and learning in science. Elementary School Journal, 98(3), 267-288.
Park, R. E., & Burgess, E. W. (1924). Introduction to the science of sociology. Chicago: Univ. Chicago Press.
Parke, H. M., & Coble, C. R. (1997). Teachers designing curriculum as professional development: A model for transformational science teaching. Journal of Research in Science Teaching, 34(8), 773-789.
Pedhazur, E. J. (1997). Multiple Regression in Behavioral Research (3rd ed.). Orlando, FL: Harcourt Brace.
Perlman, F., Bobak, M., Steptoe, A., Rose, R., & Marmot, M. (2003). Do health control beliefs predict behaviour in Russians? Preventive Medicine, 37(2), 73-81.
Petty, R. E., Wegener, D. T., & Fabrigar, L. R. (1997). Attitudes and attitude change. Annual Review of Psychology, 48, 609-647.
Pintrich, P. R., & Schunk, D. H. (1996). Motivation in Education. Englewood Cliffs, NJ: Prentice Hall.
Pribyl, J. R., Adams, D., Hoganmiller, E., & Stevens, M. (1991). A Study of the Relationships between Achievement Motivation, Coping Skills, Self-Efficacy, Cognitive Level, Health Behavior and Success in Introductory Chemistry Courses. Abstracts of Papers of the American Chemical Society, 201, 216-CHED.

Randall, D. M., & Wolff, J. A. (1994). The Time Interval in the Intention-Behavior Relationship: Meta-analysis. British Journal of Social Psychology, 33, 405-418.
Reynolds, A. J., & Walberg, H. J. (1992). A Structural Model of Science Achievement and Attitude: an Extension to High School. Journal of Educational Psychology, 84(3), 371-382.
Rivis, A., & Sheeran, P. (2003). Descriptive norms as an additional predictor in the theory of planned behaviour: A meta-analysis. Current Psychology, 22(3), 218-233.
Romance, N. R., & Vitale, M. R. (1992). A Curriculum Strategy That Expands Time for In-Depth Elementary Science Instruction by Using Science-Based Reading Strategies: Effects of a Year-Long Study in Grade 4. Journal of Research in Science Teaching, 29(6), 545-554.
Rubba, P. A., & Anderson, H. O. (1978). Development of an instrument to assess secondary students' understanding of the nature of scientific knowledge. Science Education, 62(4), 449-458.
Salta, K., & Tzougraki, C. (2004). Attitudes toward chemistry among 11th grade students in high schools in Greece. Science Education, 88(4), 535-547.
Schunk, D. H. (1991). Self-Efficacy and Academic Motivation. Educational Psychologist, 26(3-4), 207-231.
Sedlacek, W. E. (2004). Beyond the Big Test: Noncognitive Assessment in Higher Education. San Francisco: Jossey-Bass.
Shevlin, M., & Miles, J. N. V. (1998). Effects of sample size, model specification and factor loadings on the GFI in confirmatory factor analysis. Personality and Individual Differences, 25(1), 85-90.
Shibley, I. A., Milakofsky, L., Bender, D. S., & Patterson, H. O. (2003). College chemistry and Piaget: An analysis of gender difference, cognitive abilities, and achievement measures seventeen years apart. Journal of Chemical Education, 80(5), 569-573.
Shymansky, J. A., Yore, L. D., & Anderson, J. O. (2004). Impact of a school district's science reform effort on the achievement and attitudes of third and fourth grade students. Journal of Research in Science Teaching, 41(8), 771-790.
Sins, P. H. M., van Joolingen, W. R., Savelsbergh, E. R., & van Hout-Wolters, B. (2008). Motivation and performance within a collaborative computer-based modeling task: Relations between students' achievement goal orientation, self-efficacy, cognitive processing, and achievement. Contemporary Educational Psychology, 33(1), 58-77.

Snider, J. G., & Osgood, C. E. (1969). Semantic Differential Technique: A Sourcebook. Chicago, IL: Aldine.
Spearman, C. (1904). The proof and measurement of association between two things. American Journal of Psychology, 15, 72-101.
Steele, D. J., Medder, J. D., & Turner, P. (2000). A comparison of learning outcomes and attitudes in student- versus faculty-led problem-based learning: an experimental study. Medical Education, 34(1), 23-29.
Sternberg, R. J., & Williams, W. M. (1997). Does the Graduate Record Examination predict meaningful success in the graduate training of psychologists? A case study. American Psychologist, 52(6), 630-641.
Straus, M. A., & Wauchope, B. (1992). Measurement Instruments. In E. F. Borgatta & M. L. Borgatta (Eds.), Encyclopedia of Sociology (Vol. 2, pp. 1236-1240). NY: Macmillan Publishing Co.
Subramaniam, P. R., & Silverman, S. (2000). Validation of Scores From an Instrument Assessing Student Attitude Toward Physical Education. Measurement in Physical Education and Exercise Science, 4(1), 29-43.
Subramaniam, P. R., & Silverman, S. (2007). Middle school students' attitudes toward physical education. Teaching and Teacher Education, 23(5), 602-611.
The Joint Committee on Standards for Educational Evaluation. (2007). The personnel evaluation standards, 2nd edition.
Thompson, B. (2003). Score reliability: Contemporary thinking on reliability issues. Thousand Oaks, CA: Sage Publications, Inc.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: A historical overview and some guidelines. Educational and Psychological Measurement, 56(2), 197-208.
Thurstone, L. L. (1928). Attitudes Can Be Measured. The American Journal of Sociology, 33(4), 529-554.
Tracey, T. J., & Sedlacek, W. E. (1984). Noncognitive Variables in Predicting Academic Success by Race. Measurement and Evaluation in Guidance, 16(4), 171-178.
Tracey, T. J., & Sedlacek, W. E. (1987). A Comparison of White and Black Student Academic Success Using Noncognitive Variables: a Lisrel Analysis. Research in Higher Education, 27(4), 333-348.
Tuan, H. L., Chin, C. C., & Shieh, S. H. (2005). The development of a questionnaire to measure students' motivation towards science learning. International Journal of Science Education, 27(6), 639-654.

Turner, R. C., & Lindsay, H. A. (2003). Gender differences in cognitive and noncognitive factors related to achievement in organic chemistry. Journal of Chemical Education, 80(5), 563-568.
USF. (2008). USF 2008-2009 Undergraduate Catalog - College of Engineering general academic information. Retrieved 10/6, 2009, from http://www.ugs.usf.edu/catalogs/0809/enacainf.htm
Uzuntiryaki, E., & Aydin, Y. C. (2009). Development and validation of chemistry self-efficacy scale for college students. Research in Science Education, 39, 539-551.
Walczak, M. M., & Walczak, D. E. (2009). Do Student Attitudes toward Science Change during a General Education Chemistry Course? Journal of Chemical Education, 86(8), 985-991.
Walker, C. O., Greene, B. A., & Mansell, R. A. (2006). Identification with academics, intrinsic/extrinsic motivation, and self-efficacy as predictors of cognitive engagement. Learning and Individual Differences, 16(1), 1-12.
Wicker, A. W. (1969). Attitudes versus actions: The relationship of verbal and overt behavior responses to attitude objects. Journal of Social Issues, 25(4), 41-78.
Wilkinson, L. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594-604.
Zanna, M. P., & Rempel, J. K. (1988). Attitudes: A new look at an old concept. In D. Bar-Tal & A. W. Kruglanski (Eds.), The Social Psychology of Knowledge (pp. 315-334). New York: Cambridge University Press.

Appendices

Appendix A: Commonly Used Acronyms

University MW: a large Middle West Public Research University
University SE: a large Southeastern Public Research University
PL: Peer-Led Guided Inquiry
SATM: SAT Math, the math portion of the SAT exam
ACTM: ACT Math, the math portion of the ACT exam
ACS: American Chemical Society
ACS Exam: American Chemical Society First Semester General Chemistry (Special) Examination
Sk: Skewness, a measure of symmetry of the probability distribution
Ku: Kurtosis, a measure of "peakedness" (peaked or flat) of the probability distribution
n: sample size
M: mean
SD: standard deviation
EFA: exploratory factor analysis
CFA: confirmatory factor analysis
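
As an illustration of the Sk and Ku entries above, the following is a minimal sketch of how skewness and kurtosis can be computed from a list of item scores. It assumes the simple moment-based (population) formulas and reports kurtosis as excess kurtosis (normal distribution = 0). Statistical packages often apply bias corrections, so their values can differ slightly. The function names and the example scores are hypothetical and are not taken from this work.

```python
from statistics import fmean


def _central_moment(xs, k):
    """k-th central moment of xs (divides by n, not n - 1)."""
    m = fmean(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def skewness(xs):
    """Moment-based skewness: 0 for a symmetric distribution."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3: negative = flatter than normal."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0


# Hypothetical responses: one of each scale position, 1 through 7.
scores = [1, 2, 3, 4, 5, 6, 7]
print(skewness(scores))         # 0.0 (perfectly symmetric)
print(excess_kurtosis(scores))  # -1.25 (flat, no peak)
```

A positive Sk indicates a longer tail toward high scores, and a negative Ku indicates a flatter-than-normal distribution.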

Appendix B: ASCI V1

Name:                    U** ID                    lab section:_____

A list of opposing words appears below. Rate how well these words describe your feelings about chemistry. Think carefully and try not to include your feelings toward chemistry teachers or chemistry courses. For each line, choose a position between the two words that describes exactly how you feel. Circle that number on this sheet. The middle position is if you are undecided or have no feelings related to the terms on that line. When you are done with all 20 items, also record your responses on the scantron. (B=1, C=2, D=3, E=4, F=5, G=6, H=7)

CHEMISTRY IS

 1  easy            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| hard
                                      middle
 2  worthless       |__1__|__2__|__3__|__4__|__5__|__6__|__7__| beneficial
 3  exciting        |__1__|__2__|__3__|__4__|__5__|__6__|__7__| boring
 4  complicated     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| simple
 5  confusing       |__1__|__2__|__3__|__4__|__5__|__6__|__7__| clear
 6  good            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| bad
                                      middle
 7  satisfying      |__1__|__2__|__3__|__4__|__5__|__6__|__7__| frustrating
 8  scary           |__1__|__2__|__3__|__4__|__5__|__6__|__7__| fun
 9  comprehensible  |__1__|__2__|__3__|__4__|__5__|__6__|__7__| incomprehensible
10  challenging     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| not challenging
11  pleasant        |__1__|__2__|__3__|__4__|__5__|__6__|__7__| unpleasant
                                      middle
12  interesting     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| dull
13  disgusting      |__1__|__2__|__3__|__4__|__5__|__6__|__7__| attractive
14  comfortable     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| uncomfortable
15  worthwhile      |__1__|__2__|__3__|__4__|__5__|__6__|__7__| useless
16  work            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| play
                                      middle
17  chaotic         |__1__|__2__|__3__|__4__|__5__|__6__|__7__| organized
18  safe            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| dangerous
19  tense           |__1__|__2__|__3__|__4__|__5__|__6__|__7__| relaxed
20  insecure        |__1__|__2__|__3__|__4__|__5__|__6__|__7__| secure
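
The scantron coding stated in the instructions (B=1, C=2, D=3, E=4, F=5, G=6, H=7) can be turned back into item scores with a simple lookup. The sketch below is only an illustration: the names `SCANTRON_TO_SCORE` and `decode_responses` are hypothetical, and treating any other bubble (for example A or a blank) as a missing response is an assumption, not a rule stated in this work.

```python
# Letter-to-score mapping from the form instructions: B=1 ... H=7.
SCANTRON_TO_SCORE = {
    letter: score for score, letter in enumerate("BCDEFGH", start=1)
}


def decode_responses(bubbles):
    """Convert a sequence of scantron letters into 1-7 item scores.

    Any symbol outside B-H is returned as None (assumed to be a
    missing response).
    """
    return [SCANTRON_TO_SCORE.get(b.upper()) for b in bubbles]


# Hypothetical response string for an eight-item form.
print(decode_responses("BEHCDFGE"))  # [1, 4, 7, 2, 3, 5, 6, 4]
```

The same function works for the 20-item and the eight-item forms, since it decodes one letter per item regardless of length.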

Appendix C: ASCI V2

Name:                    U ** ID:                    Section:____________

A list of opposing words appears below. Rate how well these words describe your feelings about chemistry. Think carefully and try not to include your feelings toward chemistry teachers or chemistry courses. For each line, choose a position between the two words that describes exactly how you feel. Circle that number on this sheet. The middle position is if you are undecided or have no feelings related to the terms on that line. When you are done with all 8 items, also record your responses on the scantron (B=1, C=2, D=3, E=4, F=5, G=6, H=7). Make sure to bubble in your name and U ** ID.

CHEMISTRY IS

 1  easy            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| hard
                                      middle
 2  complicated     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| simple
 3  confusing       |__1__|__2__|__3__|__4__|__5__|__6__|__7__| clear
 4  comfortable     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| uncomfortable
 5  satisfying      |__1__|__2__|__3__|__4__|__5__|__6__|__7__| frustrating
 6  challenging     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| not challenging
 7  pleasant        |__1__|__2__|__3__|__4__|__5__|__6__|__7__| unpleasant
                                      middle
 8  chaotic         |__1__|__2__|__3__|__4__|__5__|__6__|__7__| organized

Appendix D: ASCI V3

Name:                    U ** ID:                    Peer Leader:____________

A list of opposing words appears below. Rate how well these words describe your feelings about chemistry. Think carefully and try not to include your feelings toward chemistry teachers or chemistry courses. For each line, choose a position between the two words that describes exactly how you feel. Circle that number on this sheet. The middle position is if you are undecided or have no feelings related to the terms on that line. When you are done with all 8 items, also record your responses on the scantron (B=1, C=2, D=3, E=4, F=5, G=6, H=7). Make sure to bubble in your name and U ** ID.

CHEMISTRY IS

 1  easy            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| hard
                                      middle
 2  chaotic         |__1__|__2__|__3__|__4__|__5__|__6__|__7__| organized
 3  confusing       |__1__|__2__|__3__|__4__|__5__|__6__|__7__| clear
 4  comfortable     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| uncomfortable
 5  satisfying      |__1__|__2__|__3__|__4__|__5__|__6__|__7__| frustrating
 6  challenging     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| not challenging
 7  pleasant        |__1__|__2__|__3__|__4__|__5__|__6__|__7__| unpleasant
                                      middle
 8  complicated     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| simple

Appendix E: ASCI V3.1

Name:                    U** ID:                    Instructor: ____________

A list of opposing words appears below. Rate how well these words describe your feelings about calculus. Think carefully and try not to include your feelings toward calculus teachers or calculus courses. For each line, choose a position between the two words that describes exactly how you feel. Circle that number on this sheet. The middle position is if you are undecided or have no feelings related to the terms on that line. When you are done with all 8 items, also record your responses on the scantron (B=1, C=2, D=3, E=4, F=5, G=6, H=7). Make sure to bubble in your name and U** ID.

CALCULUS IS

 1  easy            |__1__|__2__|__3__|__4__|__5__|__6__|__7__| Hard
                                      middle
 2  chaotic         |__1__|__2__|__3__|__4__|__5__|__6__|__7__| organized
 3  confusing       |__1__|__2__|__3__|__4__|__5__|__6__|__7__| Clear
 4  comfortable     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| uncomfortable
 5  satisfying      |__1__|__2__|__3__|__4__|__5__|__6__|__7__| frustrating
 6  challenging     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| Not challenging
 7  pleasant        |__1__|__2__|__3__|__4__|__5__|__6__|__7__| unpleasant
                                      middle
 8  complicated     |__1__|__2__|__3__|__4__|__5__|__6__|__7__| simple