USF Electronic Theses and Dissertations. Electronic thesis in PDF format, 71 pages. OCLC 56389719. http://digital.lib.usf.edu/?e14.393

On Effective and Efficient Experimental Designs for Neurobehavioral Screening Tests: The Choice of a Testing Time for Estimating the Time of Peak Effects

by

Peter A. Toyinbo

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Public Health
Department of Epidemiology and Biostatistics
College of Public Health
University of South Florida

Major Professor: Yiliang Zhu, Ph.D.
Member: C. Hendricks Brown, Ph.D.
Member: Getachew Dagne, Ph.D.

Date of Approval: July 06, 2004

Keywords: Neurotoxicity, dose response, modeling, methodology, benchmark dose.

Copyright 2004, Peter A. Toyinbo

DEDICATION

To Grace

For as long as we can remember, each step of our remarkable journey, you continue to be an inspiration to us. You freely bestow to our family all that you have, all that you are. Gracefully, you led us to one conviction: we are blessed with a lasting amazing Grace. We love you, Grace.

Peter, for Funbi, Femi and Tomi.

ACKNOWLEDGMENTS

I am grateful to Dr. Yiliang Zhu, my advisor and major professor, for his encouragement, support, research expertise, critical evaluation and excellent guidance in grooming this thesis into its end product. In addition, the gracious time, technical guidance and professional advice that Dr. Wei Wang gave were invaluable to the contents of this thesis. I gratefully acknowledge Dr. Hendricks Brown and Dr. Getachew Dagne for their participation on this thesis committee and for sharing key insights and valued expertise in the completion of the thesis.

Special thanks to my friends and colleagues in the Health Risk Assessment Methodology Group of the University of South Florida College of Public Health: Michael Wessel, Jenny Jia and Dr. Sanjay Muniredi, for all the hours of brainstorming, thoughtful comments, and relaxing fun that went into this thesis.

I wish to acknowledge and thank Dr. Virginia Moser of the US EPA as the provider of the FOB data used to carry out this thesis.

Finally, as a fruitful tree would acknowledge its root for its nurture, I am incredibly indebted to my family for their support, patience, understanding, and for believing in me. Especially to Grace, my wife: this accomplishment is the result of your countless gentle nudges.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT
INTRODUCTION
  Neurotoxicity and Neurobehavioral Screening Methods
  Neurobehavioral Screening Protocol
  Experiment Based TOPE Estimate
  Model Based TOPE Estimate
  Objectives of this Study
THEORY AND METHODS
  Dose-Time-Response Models
  TOPE Estimation
  Testing Times and Dose-Time-Response Profiles
  Consideration for Optimal Design Theory
    General Principles
    Optimal Design under Nonlinear Model
  Simulation Rationale
  Experimental Designs
  Simulation Steps
    Step 1: Define the Dose-Response Model and Population Parameters
    Step 2: Generate Datasets for Each Design
    Step 3: Estimate Parameters
    Case Large Variance
  Simulation under Two Additional Models
    Toxico-Diffusion Model: Acute DDT Experiment / Neuromuscular Domain
    Rational Function Model: Acute TET Experiment / Activity Domain
RESULTS
  Acute TET Experiment: Activity Domain / Linear-Exponential Model
    Distribution of TOPE Estimates
    Distribution of TOPE Estimates: Large Variance
  Acute DDT Experiment: Neuromuscular Domain / Toxico-Diffusion Model
  Acute TET Experiment / Activity Domain: Rational Function Model
    Rational Function Model: Case One
    Rational Function Model: Case Two
  Summary of Results and Interpretations
DISCUSSION
REFERENCES

LIST OF TABLES

Table 2.1 Testing Times for EPA/IPCS Collaborative Study Design and Candidate Designs
Table 2.2 Table of Four Testing Times (Hours) by 30 Different Designs
Table 2.3 Acute TET Exposure Study: Activity Scores with Linear-Exponential Model Fit
Table 2.4 Toxico-Diffusion Model Fit to Neuromuscular Scores of Rats Exposed to DDT
Table 2.5 Rational Function Model Specifications
Table 3.1 Convergence across Designs for Acute TET Experiment: Simulated Activity Scores (Large Variance) with Linear-Exponential Model Fit
Table 3.2 Table of Designs with Unique 2nd Testing Times for Acute DDT Experiment
Table 3.3 Summary of Designs for the Estimation of TOPE

LIST OF FIGURES

Figure 2.1 Theoretical Response-Time Profiles for the Highest Dose Group Based on Comparison Models: Activity Scores of Rats Exposed to TET
Figure 3.1 Boxplots of TOPE Estimates across Designs for Acute TET Experiment: Simulated Activity Scores with Linear-Exponential Model Fit
Figure 3.2 Bias and Relative Bias of TOPE Estimates across Designs for Acute TET Experiment: Simulated Activity Scores with Linear-Exponential Model Fit
Figure 3.3 Median TOPE Estimates across Designs for Acute TET Experiment: Simulated Activity Scores with Linear-Exponential Model Fit
Figure 3.4 Plots of Coefficient of Variation (CV) across Designs for Acute TET Experiment: Simulated Activity Scores with Linear-Exponential Model Fit
Figure 3.5 Median TOPE Estimates across Designs for Acute TET Experiment: Simulated Activity Scores (Large Variance) with Linear-Exponential Model Fit
Figure 3.6 Bias and Relative Bias of TOPE Estimates across Designs for Acute TET Experiment: Simulated Activity Scores (Large Variance) with Linear-Exponential Model Fit
Figure 3.7 Boxplots of TOPE Estimates across Designs for Acute TET Experiment: Simulated Activity Scores (Large Variance) with Linear-Exponential Model Fit
Figure 3.8 Coefficient of Variation (CV) across Designs for Acute TET Experiment: Simulated Activity Scores (Large Variance) with Linear-Exponential Model Fit
Figure 3.9 Median TOPE Estimates across Designs for Acute DDT Experiment: Simulated Neuromuscular Scores with Toxico-Diffusion Model Fit
Figure 3.10 Relative Bias of TOPE Estimates across Designs for Acute DDT Experiment: Simulated Neuromuscular Scores with Toxico-Diffusion Model Fit
Figure 3.11 Boxplots of TOPE Estimates across Designs for Acute DDT Experiment: Simulated Neuromuscular Scores with Toxico-Diffusion Model Fit
Figure 3.12 Plots of Coefficient of Variation (CV) across Designs for Acute DDT Experiment: Simulated Neuromuscular Scores with Toxico-Diffusion Model Fit
Figure 3.13 Convergences across Designs of Acute TET Experiment: Simulated Activity Scores with Rational Function Model Fit
Figure 3.14 Boxplots of TOPE Estimates across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 6.16 hr) with Rational Function Model Fit
Figure 3.15 Median TOPE Estimates across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 6.16 hr) with Rational Function Model Fit
Figure 3.16 Relative Bias of TOPE Estimates across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 6.16 hr) with Rational Function Model Fit
Figure 3.17 Plots of Coefficient of Variation (CV) across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 6.16 hr) with Rational Function Model Fit
Figure 3.18 Boxplots of TOPE Estimates across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 2 hr) with Rational Function Model Fit
Figure 3.19 Median and Relative Bias of TOPE Estimates across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 2 hr) with Rational Function Model Fit
Figure 3.20 Plots of Coefficient of Variation (CV) across Designs of Acute TET Experiment: Simulated Activity Scores (TOPE = 2 hr) with Rational Function Model Fit

ABSTRACT

The latest neurotoxicity guidelines, released by the US EPA Office of Prevention, Pesticides and Toxic Substances (OPPTS) in 1998, recommend that in neurobehavioral testing for acute studies, observations and activity testing be made, at a minimum, before the initiation of exposure, at the estimated TOPE (time of peak effects) within 8 hrs of dosing, and at 7 and 14 days after dosing.
It is recommended that estimation of TOPE be made by dosing pairs of rats across a range of doses and making regular observations of gait and arousal. However, it is well known that TOPE may vary with endpoints or exposure conditions. In order to derive quantitative safety measures such as benchmark doses (BMD), dose-time-response modeling must be done first, and a model-based estimate of TOPE is then implied. In many cases, the overall BMD corresponds to a TOPE estimate. In such cases a substantial variation in the TOPE estimate may in turn result in substantial variation in the BMD estimate. Therefore a reliable statistical estimate of TOPE is crucial to the correct determination of BMD.

We therefore performed simulation studies to assess the impact of the experiment-based TOPE on the statistical estimation of the true TOPE on the basis of a fitted dose-time-response model. The simulation allows for the determination of the optimal timing range for the 2nd testing. The results indicated that, given only four repeated observations, the optimal second testing time was about midway between time zero and the true TOPE. Choosing the second testing time at the TOPE may not generate statistical estimates closer to the true TOPE.

CHAPTER 1
INTRODUCTION

1.1 Neurotoxicity and Neurobehavioral Screening Methods

Chemicals are an integral part of life, with the capacity to improve as well as endanger health. A link between human exposure to some chemical substances and neurotoxicity has been firmly established (Anger, 1986; US EPA, 1990). Neurotoxicity is defined as adverse effects on either the structure or functions of the nervous system (US EPA, 1998a). In addition to its primary role in psychological functions, the nervous system controls most, if not all, other bodily processes. The nervous system is sensitive to perturbation from various sources and has limited ability to regenerate.
Therefore, there is a need for regulation of neurotoxicants on a scientific basis. It is important to have consistent guidance on how to evaluate neurotoxic substances and assess their potential to cause transient or persistent, direct or indirect effects on human health (US EPA, 1998a).

In the EPA's neurotoxicity risk assessment guidelines (US EPA, 1998a), five categories of endpoints were described: structural or neuropathological, neurophysiological, neurochemical, behavioral, and developmental. The guidelines outline the scientific basis for evaluating effects due to exposure to neurotoxicants and discuss principles and methods for evaluating data from human and animal studies using the described endpoints.

In collaboration with international organizations such as the International Programme on Chemical Safety (IPCS), the US EPA has been developing and evaluating test methods that may eventually lead to an integrated approach to risk assessment of neurotoxicity. The EPA recommended the use of neurobehavioral screening methods as a first tier test for identifying and quantifying neurotoxicity of chemicals from animal studies (MacPhail et al, 1997; US EPA, 1998a). One such neurotoxicity screening battery is the Functional Observation Battery (FOB) in conjunction with motor activity (US EPA, 1998b). For the purpose of this thesis, FOB will be construed to encompass neurobehavioral screening methods.

1.2 Time of Peak Effects and Benchmark Dose Estimation

There have been increasing efforts to improve the scientific methodologies for risk assessment of neurotoxic effects in humans due to chemical exposure. The US EPA has recommended the use of BMD as an alternative to the NOAEL/LOAEL methodology (US EPA, 1998a). The benchmark dose (BMD) approach aims to identify, through empirical modeling, an effective dose (ED) that would induce an increase (typically 1-10%) in the attributable risk of adverse effects over background (Crump, 1984; Zhu, 2001).
This approach provides for a more quantitative dose-response evaluation when sufficient data are available, and it takes into account the variability in the data and the slope of the dose-response curve (Crump, 1984; U.S. EPA, 1995; Zhu, 2001).

A number of nonlinear mixed effects models have been developed for describing the dose-time-response relationships observed in the FOB data from the EPA Superfund study and the IPCS Collaborative study (Zhu, 2001; Zhu et al, 2003a,b). Methods to implement benchmark dose methodology for neurotoxicity data have also been developed (Zhu et al, 2003c). Zhu (2003) showed that both estimates of attributable risk and BMD vary with exposure level, time of testing, and spontaneous risk. He argues that a time profile of BMD be considered and that the smallest value over the time course be reported as the overall value for deriving a safety dose in regulation. For some dose-response models (which are the focus of this thesis), this overall BMD must correspond to the time of peak effects (TOPE). A reliable estimate of TOPE is therefore crucial to the correct determination of BMD.

1.3 Neurobehavioral Screening Protocol

Neurotoxicity testing procedures must meet certain data requirements of the U.S. EPA under the Toxic Substances Control Act and the Federal Insecticide, Fungicide and Rodenticide Act (US EPA, 1991). In order to minimize variations among the testing procedures, the US EPA Office of Prevention, Pesticides and Toxic Substances (OPPTS) harmonized several other guidelines into a single set of OPPTS guidelines released in 1998 (US EPA, 1998b). Specifically for single dose experiments, the OPPTS guidelines include the following recommendation for time of testing: At a minimum, for acute studies, observations and activity testing should be made before the initiation of exposure, at the estimated TOPE (time of peak effects) within 8 hrs of dosing, and at 7 and 14 days after dosing.
Estimation of TOPE may be made by dosing pairs of rats across a range of doses and making regular observations of gait and arousal.

The OPPTS guidelines of 1998 (US EPA, 1998b) were preceded by a similar design protocol that was adopted in the IPCS Collaborative study (Moser et al, 1997a,b) and which produced the FOB data used in this thesis. Under the Collaborative study protocol, for the acute exposure experiments, FOB and motor activity measurements were conducted at four testing times: $t_1$, immediately prior to exposure; $t_2$, at the estimated time of peak effect (TOPE); $t_3$, one day after dosing; and $t_4$, seven days after dosing.

1.4 Experiment Based TOPE Estimate

The endpoints of FOB tests consist of about 30 noninvasive measures of gross functional deficits that quantify neurobehavioral changes in animals exposed to a chemical substance. The FOB measures can be grouped into six neurobehavioral functional domains: activity, neuromuscular, excitability, sensorimotor, physiological and autonomic functions (McDaniel and Moser, 1993; Moser et al, 1997a). Whereas individual endpoints can be used for risk assessment, there are efforts to explore the use of composite domain scores. It is impractical to employ all available endpoints in a pilot study. Instead, the US EPA recommended that the time of testing to be used in acute studies be selected from a range-finding pilot study that uses gait and arousal as the endpoints for determining TOPE (Moser et al, 1997a; US EPA, 1998b), thus reducing the number of endpoints to a more manageable two. As a result, the experimentally determined TOPE is by design unique to individual chemical agents.
However, a previous study has shown that when the recommended endpoints for determining the TOPE in a pilot study are limited to only gait and arousal, the second testing time ($t_2$) selected (assuming four testing times) for the acute study proper based on this TOPE estimate may not be appropriate for other neurotoxic effects or endpoints which show a different time course (Lammers and Kulig, 1997). Conceivably, apart from random error of measurement, the TOPE estimate thus obtained might systematically differ from the true TOPE (parameter). For this reason, the timing adopted for the second testing may differ substantially from the true TOPE even when the EPA guidelines are followed. The impact of such a selection is largely unknown.

1.5 Model Based TOPE Estimate

The FOB measures were multiscale and were also grouped into six functional domains. In order to reduce the number of endpoints for statistical efficiency, these multiscale measures were converted to domain-specific composite scores (McDaniel and Moser, 1993; Zhu et al, 2003b). Typically, a composite score would be a weighted average of the individual scores involved. According to Zhu et al (2003b) this approach requires, as a prerequisite, conversion of individual measures to a common ordinal scale. The authors therefore converted every measure, continuous or categorical, into a 4-level ordinal scale in which the ranking of an observation was based on how common the corresponding neurobehavioral response was in a reference group.

Zhu et al (2003b) then proceeded to dose-time-response modeling of the domain-specific composite scores, i.e. of grouped FOB measures, as part of the steps leading to BMD estimation. For each acute experiment (or chemical), a statistical model was fitted separately to each of six domains to produce a total of six domain-specific TOPE estimates.
It was these model-based TOPE estimates that were used to compute the benchmark dose for each composite score.

As expected, for individual chemicals, each of the six domain-specific model-based TOPE estimates might differ from the single experiment-based TOPE estimate taken from the pilot study. While differences in values between these two types of TOPE estimates are expected, the reliability of model-based TOPE estimates cannot be presumed. It is conceivable that the reliability of the model-based TOPE estimates might also be affected by the uncertainty inherent in the timing of the 2nd testing, which was determined from the experiment-based TOPE estimate.

1.6 Objectives of this Study

We believe that any uncertainty about the TOPE derived from the pilot experiment is carried over to variability in the timing of the 2nd testing. Furthermore, it is not clear how variability in the 2nd testing time around the true TOPE could impact the statistical estimation of the true TOPE on the basis of a fitted dose-time-response model. Clearly there is a need for effective experimental designs to facilitate the estimation of the true underlying TOPE by any well fitted model. Conversely, a poorly designed experiment often results not only in inefficient use of time and other resources, but also in invalid (biased) and/or imprecise (highly variable) estimation. Therefore investigating effective and efficient time points in FOB tests for identifying TOPE may lead to improved neurotoxicity screening procedures. It is this aspect of the neurobehavioral screening protocol for acute experiments that this thesis will focus on.

The main research questions this thesis seeks to answer are as follows:

1. Under the proposed protocol of the US EPA/IPCS Collaborative study, what impact would the timing of the 2nd testing have on estimating the true TOPE?
2. If we fix the 1st, 3rd and 4th FOB testing times, what would be the optimal range for the 2nd testing time to effectively estimate the TOPE?
3. How sensitive are the nonlinear mixed effects models under consideration to variability in the 2nd testing time?

CHAPTER 2
THEORY AND METHODS

2.1 Dose-Time-Response Models

A family of three linear/nonlinear dose-response models with random effects (Zhu et al, 2003a,b) was fitted to the Functional Observation Battery (FOB) and automated motor activity data from the EPA/IPCS Collaborative Study (Moser et al, 1997b). The three statistical models are the Linear-Exponential, Complementary-Exponential and Toxico-Diffusion models. The first two are different forms of the diffusion model, as briefly described below.

The diffusion model describes the expected response as a function of dose and time:

$$\text{Expected response} = f(t,d) = A + \frac{B\,t\,d\,\exp(-K_e t)}{1 + C\,t\,d\,\exp(-K_e t)}$$

where $t$ = testing time, $d$ = administered dose, and $B$, $C$ and $K_e$ (elimination rate) are parameters to be estimated. $A$ is the baseline level and can be time dependent. If the coefficient $C = 0$, we have the Linear-Exponential model given by

$$\text{Expected response} = f(t,d) = A + B\,t\,d\,\exp(-K_e t)$$

Linearization of the diffusion model with respect to $C\,t\,d\,\exp(-K_e t)$ via a first order Taylor series expansion leads to the Complementary-Exponential model:

$$\text{Expected response} = f(t,d) = A + B\,t\,d\,\exp(-K_e t)\,\{1 - C\,t\,d\,\exp(-K_e t)\}$$

In all three models the elimination rate $K_e$ plays an important role. As $t$ varies over $[0, \infty)$, $f(t,d)$ attains an extreme value (either a maximum or a minimum, depending on the sign of $B$) at $t = 1/K_e$, then returns towards the baseline $f(0,0)$. These three models are capable of modeling neurotoxic effects that are transient in time, with a common time of peak effects (TOPE) at $t = 1/K_e$ irrespective of exposure level. Of the three, only the Linear-Exponential and Toxico-Diffusion models were used as cases in this thesis. Another nonlinear model that has never been fitted to the FOB data was also used in this thesis.
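
As an illustration of the three diffusion-type mean functions above, the following minimal Python sketch evaluates each of them and locates the peak time numerically on a grid. The parameter values, the dose and the helper `peak_time` are hypothetical and chosen only for illustration; they are not the fitted values from the thesis. `C` is deliberately kept small, because for large values of `C*t*d*exp(-Ke*t)` the first-order (Complementary-Exponential) form can have additional turning points.

```python
import math

def toxico_diffusion(t, d, A, B, C, Ke):
    """Diffusion model: A + B*x / (1 + C*x), with x = t*d*exp(-Ke*t)."""
    x = t * d * math.exp(-Ke * t)
    return A + B * x / (1.0 + C * x)

def linear_exponential(t, d, A, B, Ke):
    """Special case of the diffusion model with C = 0."""
    return A + B * t * d * math.exp(-Ke * t)

def complementary_exponential(t, d, A, B, C, Ke):
    """First-order Taylor form: A + B*x*(1 - C*x), with x = t*d*exp(-Ke*t)."""
    x = t * d * math.exp(-Ke * t)
    return A + B * x * (1.0 - C * x)

def peak_time(mean_fn, baseline, t_max=24.0, step=0.01):
    """Grid search for the time of largest departure from the baseline."""
    n_steps = int(round(t_max / step))
    grid = [i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda t: abs(mean_fn(t) - baseline))

# Hypothetical values for illustration only (TOPE = 1/KE = 5 hours).
A, B, C, KE, DOSE = 10.0, -0.5, 0.05, 0.2, 3.0

models = {
    "linear-exponential": lambda t: linear_exponential(t, DOSE, A, B, KE),
    "toxico-diffusion": lambda t: toxico_diffusion(t, DOSE, A, B, C, KE),
    "complementary-exponential": lambda t: complementary_exponential(t, DOSE, A, B, C, KE),
}

for name, fn in models.items():
    print(name, "peak at about t =", round(peak_time(fn, A), 2), "vs 1/Ke =", 1 / KE)
```

Under these assumed values the grid search should return a peak time close to $1/K_e$ for all three forms, which is the property the text relies on.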
Unlike the three models previously described, this model is nonexponential and nonmechanistic in any sense. It is a simple rational function, hence it is referred to as the Rational Function model, and is given by

$$\text{Expected response} = f(t,d) = A + \frac{B\,t\,d}{(K + t)^2}$$

Here the TOPE is also independent of the exposure dose and is computed from the estimable parameter $K$. As $t$ varies over $[0, \infty)$, $f(t,d)$ peaks to a maximum ($B > 0$) at $t = K$ (the TOPE), then decreases back towards $A$. The inclusion of this nonexponential model permits us to further examine the sensitivity of designs to the underlying models.

2.2 TOPE Estimation

Statistical modeling of sample data such as the FOB data aims to capture and describe the underlying distribution of the data analytically, so that it can be understood and interpreted systematically. The TOPE, our parameter of interest, must be estimated directly from a model fit to the data. The reliability of the TOPE estimate depends upon both the statistical estimator and the experimental design.

Ideally we would like the expected value of the estimator to equal the parameter estimated; that is,

$$E(\hat{\theta}) = \theta$$

where $\theta$ is the population parameter and $\hat{\theta}$ is the point estimator of $\theta$. The point estimator is said to be unbiased if the bias $B = E(\hat{\theta}) - \theta = 0$. In addition we would prefer that the variance of the estimator, $V(\hat{\theta})$, be as small as possible, because a smaller variance indicates that under replication a higher fraction of the values of $\hat{\theta}$ will be close to $\theta$. The overall accuracy of the point estimator can be characterized by the mean squared error (MSE), which combines variance and bias into a single measure:

$$MSE(\hat{\theta}) = V(\hat{\theta}) + B^2$$

Thus, assuming we know the true population dose-time-response trend, we can numerically measure the overall quality of a statistical estimator of the TOPE by computing both the bias and the mean squared error.

However, a statistical estimator and its properties generally depend on the experimental design that generates the data.
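
The bias, variance and mean squared error defined above can be computed from a set of replicated estimates when the true parameter value is known, as it is in a simulation. The sketch below is a minimal illustration; the function name `estimator_quality` and the example numbers are hypothetical, and the relative bias and coefficient of variation use their conventional definitions, which are assumed here rather than taken from the text.

```python
import statistics

def estimator_quality(estimates, true_value):
    """Summarize replicated estimates of a parameter whose true value is known."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - true_value
    # Population variance (divisor n), so that MSE = variance + bias^2 holds exactly.
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return {
        "bias": bias,
        "relative_bias": bias / true_value,          # assumed conventional definition
        "variance": variance,
        "cv": statistics.pstdev(estimates) / mean_est,  # assumed conventional definition
        "mse": mse,
    }

# Hypothetical TOPE estimates (hours) from four replications, true TOPE = 6.16 hr.
q = estimator_quality([5.5, 6.0, 6.5, 7.0], true_value=6.16)
print(q)
```

The returned `mse` equals `variance + bias**2` up to floating-point rounding, which mirrors the decomposition given in the text.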
In the FOB tests, for example, different spacings of the testing times constitute different experimental designs, which may lead to TOPE estimators with varying degrees of bias (or lack of it) and mean squared error.

2.3 Testing Times and Dose-Time-Response Profiles

Several factors can shape the profile or time trend of effects of acute exposure to potential neurotoxic chemicals. Such factors include the type of chemical agent, the administered dose (and route), the endpoint being measured and the timing of measurements. Timing is an important factor because effects of acute exposure to neurotoxic compounds usually have specific time profiles, with a certain window of time in which maximum effects can be observed (Zhu et al, 2003b). In the FOB protocol, the US EPA also considered these factors in recommending that the 2nd of four tests (the minimum required) be conducted at the estimated TOPE while the remaining three testing times are fixed. The timing of the 2nd measurement can therefore vary, depending mainly on prior knowledge, if any, of the TOPE of a particular chemical-endpoint combination.

2.4 Consideration for Optimal Design Theory

2.4.1 General Principles

According to Tobia (2004) a regression model may be used to investigate the relation between a response variable and a number of explanatory variables. In some cases one is able to choose the values of the explanatory variables, i.e. one can choose the settings at which observations are made. Such a choice will determine the quality of the experiment. The theory of experimental design governs the quality of the experiment with respect to its effectiveness in providing relevant information about the model.

Using notation similar to Tobia (2004), let us consider a model with $n$ explanatory variables $x_1, \ldots, x_n$. Under a linear relationship the regression model is given by

$$Y_i = \theta_1 f_1(X_i) + \theta_2 f_2(X_i) + \cdots + \theta_k f_k(X_i) + \varepsilon_i$$

and under a nonlinear relationship by

$$Y_i = f(X_i, \theta) + \varepsilon_i$$

An observation $Y_i$ is the sum of the response function $f(X_i, \theta)$ and the error term $\varepsilon_i$, with $X_i = (x_{i1}, \ldots, x_{in})$ as the vector of the explanatory variables and $\theta = (\theta_1, \ldots, \theta_k)$ as the vector of unknown parameters. The errors $\varepsilon_i$ ($i = 1, \ldots, N$) are, in the simplest case, assumed to have expectation zero, constant variance, and to be uncorrelated: $V(\varepsilon_i) = \sigma^2$ and $\varepsilon_i \sim N(0, \sigma^2)$.

Next, we can describe a design as follows. The $m$ points in the experimental region where observations will be made are denoted $X_1^*, X_2^*, \ldots, X_m^*$, where $X_i^* = (x_{i1}, x_{i2}, \ldots, x_{in})$. The number of observations at the point $X_i^*$ is denoted $n_i$, so we have

$$\sum_{i=1}^{m} n_i = N$$

with an experiment denoted xper(N), where $N$ indicates how many observations are made in the design, and

xper(N) = $(X_1^*, \ldots, X_m^*;\ n_1, \ldots, n_m;\ N)$

Under the linear model, the design matrix is the $N$ by $k$ matrix $X$ in which the row for design point $X_i^*$ is repeated $n_i$ times:

$$X = \begin{pmatrix}
f_1(X_1^*) & \cdots & f_k(X_1^*) \\
\vdots & & \vdots \\
f_1(X_1^*) & \cdots & f_k(X_1^*) \\
f_1(X_2^*) & \cdots & f_k(X_2^*) \\
\vdots & & \vdots \\
f_1(X_2^*) & \cdots & f_k(X_2^*) \\
\vdots & & \vdots \\
f_1(X_m^*) & \cdots & f_k(X_m^*) \\
\vdots & & \vdots \\
f_1(X_m^*) & \cdots & f_k(X_m^*)
\end{pmatrix}
\begin{matrix} \\ n_1 \text{ rows} \\ \\ \\ n_2 \text{ rows} \\ \\ \\ \\ n_m \text{ rows} \\ \\ \end{matrix}$$

For the least squares estimator $\hat{\theta} = (\hat{\theta}_1, \ldots, \hat{\theta}_k)$ we have

$$\hat{\theta} = (X^T X)^{-1} X^T Y$$

where $Y$ is the vector of observations, $Y = (Y_1, \ldots, Y_N)$, $M = X^T X$ is referred to as the information matrix and $X$ is termed the design matrix. If the matrix $M$ is not degenerate, then the matrix

$$M^{-1}(X)\,\sigma^2 = Cov(\hat{\theta}) = (X^T X)^{-1} \sigma^2$$

is the dispersion matrix, or the variance-covariance matrix, of the best linear estimator of $\theta$ (Federov, 1972).

The information matrix depends on the choice of the design $X$, and choosing an optimal design means that we have to choose an $X$, say $X^*$, independent of $\theta$ and the error terms, which makes some real-valued function $\Phi\{M(X)\}$ as large as possible, that is, best for all $(\theta, \sigma)$. We can say that $X^*$ is $\Phi$-optimal (Silvey, 1980).
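
To make the roles of the design matrix, the information matrix and the dispersion matrix concrete, here is a minimal sketch for a straight-line model with basis functions $f_1(x) = 1$ and $f_2(x) = x$. The design points, replication counts and helper names are hypothetical, and the model is restricted to $k = 2$ so that the matrix inverse can be written in closed form.

```python
# Basis functions of an illustrative straight-line model: f1(x) = 1, f2(x) = x.
BASIS = (lambda x: 1.0, lambda x: float(x))

def design_matrix(points, reps, basis=BASIS):
    """Stack the row (f1(x), ..., fk(x)) n_i times for each design point x."""
    rows = []
    for x, n in zip(points, reps):
        for _ in range(n):
            rows.append([f(x) for f in basis])
    return rows

def information_matrix(X):
    """M = X^T X."""
    k = len(X[0])
    return [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("degenerate information matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

def least_squares(X, Y, sigma2=1.0):
    """Return theta_hat = (X^T X)^{-1} X^T Y and Cov(theta_hat) = sigma2 * (X^T X)^{-1}."""
    M_inv = inverse_2x2(information_matrix(X))
    XtY = [sum(r[a] * y for r, y in zip(X, Y)) for a in range(2)]
    theta = [sum(M_inv[a][b] * XtY[b] for b in range(2)) for a in range(2)]
    cov = [[sigma2 * v for v in row] for row in M_inv]
    return theta, cov

# Hypothetical design: three points, two observations at each (N = 6).
X = design_matrix([0.0, 1.0, 2.0], [2, 2, 2])
Y = [1.0 + 2.0 * row[1] for row in X]  # noise-free responses from Y = 1 + 2x
theta_hat, cov = least_squares(X, Y)
print(theta_hat, cov)
```

With noise-free responses the estimator recovers the coefficients used to generate the data, and the covariance depends only on where the observations were placed, not on the observed responses.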

The D-, A-, E- and G-optimality criteria are described briefly as follows.

1. The D-criterion considers the generalized variance of the estimator, which is inversely related to the determinant of the information matrix. A D-optimal design is therefore a design for which the determinant of the information matrix is made as large as possible.
2. G-optimality is concerned with the variance of a predicted future observation at a given point $x_0$: $(1 + x_0^T (X^T X)^{-1} x_0)\,\sigma^2$. The design objective is to minimize this variance.
3. A-optimality considers the trace of the matrix $(X^T X)^{-1}$. An A-optimal design minimizes the value of $tr\,(X^T X)^{-1}$, so that the sum of the marginal variances of the estimators is minimal.
4. E-optimality aims to minimize the largest eigenvalue of the matrix $(X^T X)^{-1}$, or equivalently to maximize the smallest eigenvalue of $X^T X$.

2.4.2 Optimal Design under Nonlinear Model

Following the general principles for linear regression, we now consider our example: the Linear-Exponential model, a nonlinear case with two predictor variables. The model

$$y_i = f(t_i, d_i; \theta) + \varepsilon_i = A + B\,t_i\,d_i\,\exp(-K_e t_i) + \varepsilon_i$$

is given in section 2.1 and is a special case of the general nonlinear model

$$Y_i = f(X_i, \theta) + \varepsilon_i$$

where $\theta = (A, B, K_e)$ are unknown parameters. Note that the parameter $K_e$ is of special interest because it determines the time of peak effect (TOPE); $X_i$ is a vector of time $t_i$ and dose $d_i$ for the $i$th observation; and the error terms are independent normal with constant variance: $\varepsilon_i \sim N(0, \sigma^2)$.

The problem of seeking estimates becomes more complicated when the function $f(X_i, \theta)$ is nonlinear in $\theta$. In the Gauss-Newton method, a Taylor series expansion is used to approximate the nonlinear regression model with linear terms, and ordinary least squares is then employed to estimate the parameters (Neter, 1996). Taking a first order Taylor approximation of the mean response function $f(X, \theta)$ at the estimate $\hat{\theta}$, we have

$$Y_i - f(X_i, \hat{\theta}) \approx (A - \hat{A})\,f_1(X_i, \hat{\theta}) + (B - \hat{B})\,f_2(X_i, \hat{\theta}) + (K_e - \hat{K}_e)\,f_3(X_i, \hat{\theta}) + \varepsilon_i$$

where $f_j = \partial f / \partial \theta_j$ and $X$ is our experimental setting of the $t \times d$ combination.
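
The D-, A-, E- and G-criteria listed earlier in this section can be evaluated numerically for a candidate design. The sketch below does this for an illustrative straight-line model with rows $(1, x)$, which keeps the information matrix at $2 \times 2$. The two designs, the prediction point `x0` and the function names are hypothetical; the E-criterion is computed in its standard form, the largest eigenvalue of $(X^T X)^{-1}$, to be minimized.

```python
import math

def info_matrix(xs):
    """M = X^T X for the straight-line model whose design rows are (1, x)."""
    n = float(len(xs))
    sx = float(sum(xs))
    sxx = float(sum(x * x for x in xs))
    return [[n, sx], [sx, sxx]]

def criteria(xs, x0):
    """Evaluate D-, A-, E- and G-type criteria for a design given as a list of x values."""
    (a, b), (_, d) = info_matrix(xs)
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    # Smallest eigenvalue of the symmetric 2x2 matrix M (closed form).
    tr = a + d
    lam_min = (tr - math.sqrt(tr * tr - 4.0 * det)) / 2.0
    v = (1.0, float(x0))
    quad = sum(v[i] * inv[i][j] * v[j] for i in range(2) for j in range(2))
    return {
        "D_det_M": det,                 # larger is better
        "A_trace_inv": inv[0][0] + inv[1][1],   # smaller is better
        "E_max_eig_inv": 1.0 / lam_min,         # smaller is better
        "G_pred_var_factor": 1.0 + quad,        # multiplies sigma^2; smaller is better
    }

# Two hypothetical designs with N = 6 observations on the interval [0, 2].
spread = [0, 0, 1, 1, 2, 2]
endpoints = [0, 0, 0, 2, 2, 2]
for name, xs in (("spread", spread), ("endpoints", endpoints)):
    print(name, criteria(xs, x0=2.0))
```

For this simple linear model the endpoint design does better on all four measures, and the ranking does not depend on the unknown coefficients; that is the contrast with the nonlinear case taken up in section 2.4.2.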
From the standpoint of numerical approximation, the design matrix is determined by $f_1(X, \theta)$, $f_2(X, \theta)$ and $f_3(X, \theta)$, with

$$f_1(X, \theta) = 1; \quad f_2(X, \theta) = t\,d\,\exp(-K_e t); \quad f_3(X, \theta) = -B\,t^2 d\,\exp(-K_e t).$$

Here we find that the function $f_2(X, \theta)$ includes $K_e$, while $f_3(X, \theta)$ includes both $K_e$ and B. Unlike in the linear regression model, the functions here depend on the parameters, and so does the design matrix. The implication is that the optimal design measures actually depend on the true value of $\theta$ as well as on the model. The solution to the optimal design problem is obtained iteratively and necessarily begins with initial or starting values for the regression parameters A, B and $K_e$. With the approximate approach, a matrix D of partial derivatives now plays the role of the X matrix (Neter, 1996). Similarly, the form of D is

$$D = \begin{pmatrix}
1 & t_1 d\,\exp(-K_e t_1) & -B\,t_1^2 d\,\exp(-K_e t_1) \\
\vdots & \vdots & \vdots \\
1 & t_1 d\,\exp(-K_e t_1) & -B\,t_1^2 d\,\exp(-K_e t_1) \\
1 & t_2 d\,\exp(-K_e t_2) & -B\,t_2^2 d\,\exp(-K_e t_2) \\
\vdots & \vdots & \vdots \\
1 & t_m d\,\exp(-K_e t_m) & -B\,t_m^2 d\,\exp(-K_e t_m)
\end{pmatrix},$$

where the row for testing time $t_i$ is repeated $n_i$ times, with $n_1 = n_2 = \cdots = n_m = 50$ and m = 4, and where 1, 2, ..., m correspond to $t_1$, $t_2$, $t_3$ and $t_4$.

In a typical design setting without constraints, the least squares estimator is given by

$$\hat\theta = (D^T D)^{-1} D^T Y,$$

where $\hat\theta$ is the vector of least squares estimated regression coefficients. The variance-covariance matrix of $\hat\theta$ is

$$\mathrm{Var}(\hat\theta) = (D^T D)^{-1} \sigma^2 .$$

In this case, therefore, we are looking for designs $D^*$ which maximize the optimality function $\Phi\{M(D), \theta\}$; that is, $D^*$ may depend on $\theta$ other than through the information matrix alone. Although it could be problematic, initial or starting values for the parameters would have to be found. In this thesis, the optimal design is subject to some special constraints: we are interested only in designs in which the second testing time $t_2$ is to be determined while everything else is fixed.
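To make the dependence of D on the parameters and on $t_2$ concrete, the following is a minimal sketch, not the procedure used in this thesis (which relies on simulation). It assumes constant error variance, ignores the random intercept, and plugs in the parameter estimates reported in Table 2.3 together with the five dose levels and ten rats per dose of the study protocol. The function names and the list of candidate $t_2$ values are illustrative. It evaluates the approximate variance of $\hat{K}_e$, the (3,3) element of $(D^T D)^{-1}\sigma^2$, for a few choices of $t_2$.

```python
import numpy as np

# Assumed "true" parameter values (estimates reported in Table 2.3).
A, B, KE = 1.2738, 0.1755, 0.1623
DOSES = [0.0, 0.75, 1.5, 3.0, 6.0]
RATS_PER_DOSE = 10

def jacobian(t2):
    """Matrix D of partial derivatives (f1, f2, f3) for one candidate design."""
    times = [0.0, t2, 24.0, 168.0]
    rows = []
    for d in DOSES:
        for t in times:
            e = np.exp(-KE * t)
            row = [1.0, t * d * e, -B * t ** 2 * d * e]
            rows.extend([row] * RATS_PER_DOSE)  # one row per rat
    return np.array(rows)

def var_ke(t2, sigma2=1.0):
    """Approximate Var(K_e hat): the (3,3) element of (D^T D)^{-1} sigma^2."""
    D = jacobian(t2)
    M = D.T @ D  # information matrix, up to the factor 1 / sigma^2
    return sigma2 * np.linalg.inv(M)[2, 2]

candidates = [0.2, 1.0, 2.0, 6.0, 12.0, 20.0]  # illustrative t2 values (hr)
variances = {t2: var_ke(t2) for t2 in candidates}

for t2, v in variances.items():
    print(f"t2 = {t2:5.1f} hr   approx Var(K_e hat) / sigma^2 = {v:.5f}")
```

Because D is evaluated at assumed parameter values, a different guess for A, B or $K_e$ changes the ranking of the candidate designs; this is the local character of optimal designs for nonlinear models noted above.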
In defining our experimental region, we are constrained by the FOB design protocol, so we focus on finding the optimum timing of the 2nd of four repeated measurements, with the other three testing times fixed. Five dose groups with their dose values were predecided. Therefore we define our experimental region as follows:

t = 0, $t_2$, 24, 168, where $0 < t_2 \le 20$
d = 0, 0.75, 1.5, 3.0 and 6.0

We also want to be able to allow for dose-group-specific variances. In summary, the design considered here is a special case in which we maximize CMC, where the C matrix selects particular components of the covariance matrix. Our focus of interest is to find an optimal design D{f($K_e$)} for the purpose of estimating the TOPE as a function of one of the unknown model parameters, $K_e$, while acknowledging that this optimality also depends on the parameters $\theta$. In addition, we would like the optimal design to accommodate heteroscedasticity. However, instead of seeking algorithms that would enable us to construct the appropriate design measures, we opted for a more empirical approach based on simulation studies.

2.5 Simulation Rationale

In order to evaluate a design, we assume that an underlying dose-time-response relationship is given, and we generate data according to that relationship plus normal random error. The data are simulated under a chosen design. The simulated data are then used to estimate the parameters of the model under which the simulation was done.

A number of experimental designs are possible with respect to the choice of dose and time. However, these designs vary in their capability of recovering the information about the true parameters. Thus, we wished to perform a test to determine which of these different designs would produce the best estimate(s) of the population parameters. Our target parameter was the time of peak effects (TOPE). For this purpose, we simulated different designs by generating the FOB data based on the true model.
An efficient design should yield parameter estimates that are as close to the true values, and as reliable, as possible. For every simulated design, we fitted the same true model to the data via nonlinear mixed-effects modeling to obtain an estimate of the TOPE. The simulation and model fitting process was replicated N times under each design. The designs which produced the best TOPE estimates were determined from the bias and mean squared error (MSE) statistics obtained from the replicated estimates. The efficiency of each design was evaluated with specific optimality criteria as follows. The absolute relative bias must be less than 5% of the underlying TOPE, and/or the design must be associated with the minimum MSE. The minimum MSE was determined both by computation and by graphical illustration. We placed more emphasis on the MSE as a single measure because it combines the effects of bias and sampling variation of the estimator.

We applied the concept of the coefficient of variation (CV) to relate the MSE to the underlying TOPE. We therefore devised a modified coefficient of variation (mCV), computed as the ratio of the square root of the MSE to the underlying TOPE. This measure was used to compare the variability of different designs based on 2000 replications for each design. Practical designs were determined as those of minimum mCV and/or mCV of no more than 15%.

2.6 Experimental Designs

We employed a number of different experimental designs that were essentially variants of the EPA/IPCS Collaborative Study design (Moser et al, 1997a). These designs differ only with respect to the 2nd testing time point.
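The evaluation criteria of section 2.5 can be sketched in a few lines. This is an illustrative helper, not the code used for the thesis; the function name and the returned keys are hypothetical, and the use of the unbiased sample variance (`ddof=1`) is an assumption. It follows the definitions given in section 2.7.3: MSE = bias squared plus the sample variance of the TOPE estimates, relative bias = 100*bias/true TOPE, and mCV = 100*sqrt(MSE)/true TOPE.

```python
import numpy as np

def evaluate_design(tope_estimates, true_tope):
    """Summarize replicated TOPE estimates for one design."""
    est = np.asarray(tope_estimates, dtype=float)
    bias = est.mean() - true_tope
    mse = bias ** 2 + est.var(ddof=1)        # assumption: unbiased sample variance
    relative_bias = 100.0 * bias / true_tope
    mcv = 100.0 * np.sqrt(mse) / true_tope   # modified coefficient of variation
    return {
        "bias": bias,
        "relative_bias": relative_bias,
        "mse": mse,
        "mcv": mcv,
        "bias_ok": abs(relative_bias) < 5.0,  # absolute relative bias below 5%
        "mcv_ok": mcv <= 15.0,                # mCV of no more than 15%
    }

# Toy input only; real use would pass the converged replicates for one design.
summary = evaluate_design([5.0, 6.0, 7.0], true_tope=6.0)
print(summary)
```

Returning both thresholds as separate flags reflects the "and/or" wording of the criteria: a design may satisfy the bias criterion, the mCV criterion, or both.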
The design for each acute exposure experiment in the EPA/IPCS Collaborative Study was as follows:

Sample size = 50 rats: 5 dose groups with 10 rats per group
Testing times per rat: t1 = 0 hr, t2 = TOPE (hr), t3 = 24 hr, t4 = 168 hr
Total number of observations = 200

In line with the EPA/IPCS study protocol above, we fixed the dose levels and also fixed the three testing times t1, t3 and t4 at baseline, 24 and 168 hours post exposure, respectively. In order to investigate how the choice of the second testing time would affect the estimation of the TOPE, we let t2 vary between designs. Specifically, a sequence of 30 different designs was chosen, with t2 values ranging between 0.2 hr and 20 hr. Table 2.1 illustrates the EPA/IPCS Collaborative Study design and three of the thirty test designs (first, second and thirtieth). The candidate designs thus allowed for comparison with the design that used the true TOPE as its 2nd testing time.

Table 2.1 Testing Times for EPA/IPCS Collaborative Study Design and Candidate Designs

Testing Times   EPA/IPCS Study Design   Design #1   Design #2   ...   Design #30
t1              0 hr                    0 hr        0 hr        ...   0 hr
t2              Estimated TOPE (hr)     0.2 hr      0.4 hr      ...   20 hr
t3              24 hr                   24 hr       24 hr       ...   24 hr
t4              168 hr                  168 hr      168 hr      ...   168 hr

Table 2.2 shows all 30 unique designs, one per table column, with t2 ranging from 0.2 hr to 20 hr. These designs were applied to three dose-time-response models coupled with different combinations of functional domain and experiment (or chemical).

Three dose-time-response models were considered: the Linear-Exponential, Toxico-Diffusion (Zhu et al, 2003a) and Rational Function models. Parameter values were taken from previous model fits to the real datasets (Zhu et al. 2003a,b), except for the Rational Function model, to which reasonable parameter values were simply assigned. Simulations were based on the following three selected combinations of experiment, model, and response variable:

1. Acute TET exposure experiment / Linear-Exponential model / Activity domain composite scores
2. Acute DDT experiment / Toxico-Diffusion model / Neuromuscular domain composite scores
3. Acute TET exposure experiment / Rational Function model / Activity domain composite scores

Table 2.2 Table of Four Testing Times^1 (Hours) by Thirty Different Designs^2

Design     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
t1         0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
t2       0.2   0.4   0.6   0.8     1   1.2   1.4   1.6   1.8     2   2.5     3   3.5     4     5
t3        24    24    24    24    24    24    24    24    24    24    24    24    24    24    24
t4       168   168   168   168   168   168   168   168   168   168   168   168   168   168   168

Design    16    17    18    19    20    21    22    23    24    25    26    27    28    29    30
t1         0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
t2         6     7     8     9    10    11    12    13    14    15    16    17    18    19    20
t3        24    24    24    24    24    24    24    24    24    24    24    24    24    24    24
t4       168   168   168   168   168   168   168   168   168   168   168   168   168   168   168

1. Testing times designated as t1, t2, t3 and t4 occur at 0 hr, t2, 24 hr and 168 hr, respectively.
2. The designs, numbered from 1 to 30, are designated by their respective t2, ranging from 0.2 hr to 20 hr.

2.7 Simulation Steps

Our simulation scheme is illustrated below using the first combination: Activity domain scores in conjunction with the Linear-Exponential model. Simulations of the other two combinations were conducted similarly.

2.7.1 Step 1: Define the Dose-Response Model and Population Parameters

Table 2.3, taken from Zhu et al (2003b), shows the results of the Linear-Exponential model fit to the Activity domain composite scores from the acute TET experiment of the EPA/IPCS Collaborative Study (Moser et al, 1997b). We assumed that the fitted model represents a true dose-response relationship in a hypothetical population of rats; that is, the estimated model parameters were taken as the true values for this population. Based on this true model, the TOPE equals $1/K_e$ (= 6.16 hr).
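The thirty candidate designs of Table 2.2 can be generated programmatically, which is convenient when looping a simulation over designs. The sketch below only reproduces the t2 grid listed in the table; the function and variable names are illustrative.

```python
import numpy as np

def candidate_t2():
    """Second testing times (hr) of the 30 candidate designs in Table 2.2."""
    fine = np.arange(1, 11) * 0.2                 # 0.2, 0.4, ..., 2.0
    mid = np.array([2.5, 3.0, 3.5, 4.0, 5.0])
    coarse = np.arange(6, 21, dtype=float)        # 6, 7, ..., 20
    # Rounding removes floating-point noise such as 0.6000000000000001.
    return np.round(np.concatenate([fine, mid, coarse]), 1)

# Each design is the tuple (t1, t2, t3, t4) in hours; only t2 varies.
designs = [(0.0, float(t2), 24.0, 168.0) for t2 in candidate_t2()]

for number, design in enumerate(designs, start=1):
    print(number, design)
```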
The model specification accommodates heteroscedasticity, with dose-specific standard errors given by 0.2822, 0.2822*1.2884, 0.2822*1.3858, 0.2822*1.2953 and 0.2822*2.548 for the five groups of dose = 0, 0.75 mg, 1.5 mg, 3.0 mg and 6.0 mg, respectively. The model also contains a random intercept (standard error = 0.1697) for each rat to allow for between-rat variation.

Table 2.3 Acute TET Exposure Study: Activity Scores with Linear-Exponential Model^1,2 Fit

Parameter   Value    Std.Error   DF    t-value   p-value
A           1.2738   0.0412      147   30.9129   <.0001
B           0.1755   0.0211      147    8.3261   <.0001
K           0.1623   0.015       147   10.8451   <.0001

Variance Estimate
Dose     0   0.75     1.5      3        6
StdErr   1   1.2884   1.3858   1.2955   2.5479

Random effects
A (Std Dev)   Residual Std Dev
0.1697        0.2822

Model Selection Criteria
AIC        BIC        logLik
248.3158   277.9555   -115.1579

1. Model = A + B*dose*time*exp(-K_e*time)
2. Distinct variance assumed for each dose group, and the dose-specific standard error is in proportion to that of control

2.7.2 Step 2: Generate Datasets for Each Design

For each design, we simulated an experiment consisting of 10 rats in each of five dose groups. Each rat was tested at 4 time points, yielding a total of 200 observations in each experiment. Data were generated based on the following mixed-effects model:

$$y_{ijk} = f(\theta_{ik}, \mathrm{dose}_i, \mathrm{time}_j) + \mathrm{error}_{ijk}, \quad i = 1, 2, \ldots, 5;\; j = 1, 2, 3, 4;\; k = 1, 2, \ldots, 10.$$

In the model, $\theta_{ik}$ also includes random effects (intercepts) that are additive to the population parameters. The response values were obtained by evaluating the function at the true parameter values, the simulated random effects, and the design points of dose and time given in Tables 2.2 & 2.3. The final outcome values were obtained by further adding the simulated errors to the response values.
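The data-generating step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the thesis code: it uses the Table 2.3 values (including the tabulated standard-error ratios), draws one normal random intercept per rat and independent normal errors with dose-specific standard deviation, and the function name, the record layout and the seed are our own choices.

```python
import numpy as np

# Assumed population values, taken from Table 2.3.
A, B, K = 1.2738, 0.1755, 0.1623
DOSES = np.array([0.0, 0.75, 1.5, 3.0, 6.0])
SD_RATIO = np.array([1.0, 1.2884, 1.3858, 1.2955, 2.5479])  # relative to control
RESIDUAL_SD, INTERCEPT_SD = 0.2822, 0.1697
RATS_PER_DOSE = 10

def simulate_dataset(t2, rng):
    """Return an array with columns (rat id, dose, time, response)."""
    times = np.array([0.0, t2, 24.0, 168.0])
    records = []
    rat = 0
    for dose, ratio in zip(DOSES, SD_RATIO):
        for _ in range(RATS_PER_DOSE):
            a = rng.normal(0.0, INTERCEPT_SD)            # random intercept per rat
            mean = A + a + B * dose * times * np.exp(-K * times)
            errors = rng.normal(0.0, RESIDUAL_SD * ratio, size=times.size)
            for t, y in zip(times, mean + errors):
                records.append((rat, dose, t, y))
            rat += 1
    return np.array(records)

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
data = simulate_dataset(2.0, rng)
print(data.shape)
```

Fitting the nonlinear mixed-effects model to each simulated dataset is a separate step that is not shown here.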
For the Linear-Exponential model, for example, the outcome is

$$y_{ijk} = A + a_{ik} + B \cdot \mathrm{dose}_i \cdot \mathrm{time}_j \cdot \exp(-K \cdot \mathrm{time}_j) + \mathrm{error}_{ijk} .$$

Simulations of random effects and errors were accomplished by using computer-generated random numbers from specified distributions as follows:

Random effects: A random effect is unique to an individual rat. Therefore, for 50 rats, 50 random numbers were generated from a normal distribution with zero mean and standard deviation = 0.1697, corresponding to that of the random intercepts from the fitted model (Table 2.3).

Random error and heteroscedasticity: Random error is associated with an individual observation, and variances are unique to individual dose groups. Therefore, for the 40 observations in each of the 5 dose groups, 40 random numbers were generated from a normal distribution with zero mean and a standard deviation specific to that dose group.

Replication: We replicated N = 2000 datasets under each design. N was established by allowing it to increase until the fitted TOPE was stable. We found N = 2000 satisfactory for all designs in this study.

2.7.3 Step 3: Estimate Parameters

The underlying model was fitted to each simulated dataset to obtain estimates of the parameters A, B and K. The TOPE was computed from the estimate of K (TOPE = 1/K). This simulation process resulted in a sample of estimates (2000 replications, convergence rates of about 80% or greater) for each of A, B, K, and the TOPE. Based on this sample of estimates, the following were computed:

Bias = sample mean of the TOPE estimates - the true TOPE,
Mean Squared Error (MSE) = Bias^2 + sample variance of the TOPE estimates,
Relative bias = 100*bias / true TOPE, and
Modified coefficient of variation (mCV) = 100*sqrt(MSE) / true TOPE

2.7.4 Case Large Variance

In order to see the impact of random error on the design, the simulation procedure (Steps 1-4) was repeated for the TET/Activity/Linear-Exponential model setting, employing larger variation (StdErr = 2.0; 78-82% convergence).
This illustrative example enables us to evaluate each study design for this experiment under extreme variability of the population dose-response profiles.

2.8 Simulation under Two Additional Models

Simulation was conducted for two additional models: the Toxico-Diffusion model and the Rational Function model. This allows us to evaluate the sensitivity of designs across model types. The following is specific information about the models, derived from a previous fit of the Toxico-Diffusion model to the real data and from simply assigning parameter values to the Rational Function model.

2.8.1 Toxico-Diffusion Model: Acute DDT Experiment / Neuromuscular Domain

Here we had the opportunity to explore a different member of the same family of models, as well as a different exposure agent and neurobehavioral domain. Table 2.4, taken from Zhu et al (2003b), shows the Toxico-Diffusion model fit to the commonality scores of the neuromuscular domain in the acute DDT experiment. The estimated parameter values were used to simulate data under the Toxico-Diffusion model. From the table, the TOPE estimate computed directly from 1/K is 4.7 hr. This value was assumed to be the true TOPE parameter for this setting.

2.8.2 Rational Function Model: Acute TET Experiment / Activity Domain

This model is a simple rational function developed solely for the purpose of this thesis. Unlike the other two models, it had never before been fitted to the real FOB data. The reason for including the model was to further examine the sensitivity of designs to models. In using this model, we were able to assign parameter values such that 1) the model reasonably describes a dose-time-response profile similar to that observed in the original FOB data, and 2) we would have the opportunity to fit a model to simulated datasets from a population profile with a relatively small true TOPE of 2 hr, the value recorded in the pilot study of the acute TET experiment (Moser et al, 1997b).

The Rational Function model was specified as follows:

model = A + B*dose*time/(K + time^2), where TOPE = sqrt(K)

Two sets of population parameters and sigma were simulated and fitted with the Rational Function model, as specified in Table 2.5. The first set of parameters (Case 1) describes a response profile similar to that of the exponential model fit to the TET/activity scores from the highest dose group. The second set (Case 2) describes a similar profile, but with the true TOPE set lower, at 2 hr.

Table 2.4 Toxico-Diffusion Model^1,2 fit to Neuromuscular Scores of Rats Exposed to DDT^3

Parameter   Value    Std.Error   DF    t-value   p-value
A           1.3388   0.0307      147   43.57     <.0001
B           0.0187   0.0073      147    2.58     0.0109
C           0.0108   0.00667     147    1.61     0.1089
K_e         0.2129   0.0327      147    6.52     <.0001

Variance Estimate
Dose Group   0   10.9     21.8     43.5     87
StdErr       1   0.8859   1.0332   1.6252   1.2037

Random effects
A: Intercept   Residual StdDev
0.1042         0.2785

1. Model = A + B*Dose*Time*exp(-K_e*Time) / (1 + C*Dose*Time*exp(-K_e*Time))
2. Distinct variance assumed for each dose group, and the dose-specific standard error is in proportion to that of control
3. The IPCS/EPA Collaborative Study

Table 2.5 Rational Function Model Specifications

          Parameter coefficients
          A      B       K       Sigma   TOPE (hr)
Case 1    1.27   5.167   37.95   1.0     6.16
Case 2    1.25   1.67    4       0.3     2

In Figure 2.1 the Linear-Exponential (LE) and Rational Function (RF) models are compared by their theoretical response-time profiles for a fixed dose. In the Figure, plot A displays the theoretical curve for the LE model fit to the Activity scores for the highest dose group in the acute TET experiment (Moser et al, 1997b; Zhu et al, 2003b). Plots B1 & B2 (Figure 2.1) are the profiles described by the RF model under the assigned parameter values of Case 1 and Case 2, respectively.

In the simulations, the two cases of the RF model were used as the base models. The standard deviation was specified as $\sigma = 1.0$ across all dose groups in Case 1, where TOPE = 6.16 hr.
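For the Rational Function model specified above, the relation TOPE = sqrt(K) follows from a one-line calculation, sketched here under the assumption of a positive dose and K > 0. It also gives a quick check of the two TOPE values listed in Table 2.5.

```latex
\begin{align*}
  g(t) &= A + \frac{B\,d\,t}{K + t^2}, \\
  \frac{\partial g}{\partial t}
       &= B\,d\,\frac{(K + t^2) - 2t^2}{(K + t^2)^2}
        = B\,d\,\frac{K - t^2}{(K + t^2)^2}, \\
  \frac{\partial g}{\partial t} = 0
       &\;\Longrightarrow\; t_{\mathrm{TOPE}} = \sqrt{K}, \\
  \text{Case 1: } \sqrt{37.95} &\approx 6.16 \text{ hr}, \qquad
  \text{Case 2: } \sqrt{4} = 2 \text{ hr}.
\end{align*}
```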
In Case 2, with TOPE = 2.0 hr, we specified $\sigma = 0.3$.

Figure 2.1 Theoretical Response-Time Profiles for the Highest Dose^1 Group Based on Comparison Models^2: Activity Scores of Rats Exposed to TET
[Figure: three panels, each plotting response (activity score) against time (hr). A. Linear-Exponential Model: TOPE = 6.16 hr. B1. Rational Function Model: TOPE = 6.16 hr. B2. Rational Function Model: TOPE = 2 hr.]
1. Maximum exposure dose was 6 mg in the acute TET experiment of the IPCS/EPA Collaborative Study (Moser et al, 1997b)
2. Linear-Exponential Model = A + B*dose*time*exp(-K_e*time): (A, B, K_e) = (1.27, 0.17, 0.16)
   Rational Function Model = A + B*dose*time/(K + time^2):
   (A, B, K) = (1.27, 5.167, 37.95) for TOPE = 6.16 hr
   (A, B, K) = (1.25, 1.67, 4) for TOPE = 2 hr

CHAPTER 3
RESULTS

3.1 Acute TET Experiment: Activity Domain / Linear-Exponential Model

The simulation results are summarized in this section according to the pattern of standard deviation. Two variance patterns were simulated. In the first category, the random effect and dose-group-specific random errors estimated from the original data were simulated. Specifically, the standard deviations were 0.28, 0.36, 0.39, 0.37 and 0.72 for the five dose groups, respectively. In the second category, a large constant standard deviation of 2.0 was set for all dose groups.

3.1.1 Distribution of TOPE Estimates

The rate of convergence among the 2000 replications was recorded for each of the 30 designs. The convergence rate was greater than 95% in all cases. Thirty boxplots (one boxplot of 2000 TOPE estimates per design) are displayed side by side in Figure 3.1 to show the spread of TOPE estimates for individual designs. The designs with t2 between about 0.5 hr and 6 hr appeared to have a relatively smaller spread of TOPE estimates.
As the second testing time point increases beyond the underlying TOPE = 6.16 hr, the TOPE estimator becomes more variable. These findings suggest that the designs which have their 2nd testing times at or before the TOPE are robust in the estimation of the TOPE.

Figure 3.1 Boxplots of TOPE Estimates^1 Across Designs^2 for Acute TET Experiment: Simulated Activity Scores^3 with Linear-Exponential Model Fit
[Figure: side-by-side boxplots, one per design. Axes: TOPE; 2nd time of testing (hr), from 0.2 to 20.]
1. One boxplot per design with 2000 replications of TOPE estimates
2. Each of 30 designs is designated by the value of its 2nd time of testing along the y-axis
3. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group ($\sigma$ = 0.28, 0.36, 0.39, 0.37 and 0.72), random intercept per subject ($\sigma$ = 0.17). The limits of the x-axis have been reduced for clarity.

Figure 3.2 shows the bias (A) and relative bias (B) of the TOPE estimator. It is seen here that the relative bias of the TOPE estimates was less than 5% for the designs with t2 of 0.6-15 hr and was greater than 5% but less than 10% for designs with t2 between 0.6 and 17 hr. Furthermore, the Figure also shows a likely positive bias associated with t2 that is either much smaller or much larger than the underlying TOPE.

Figure 3.2 Bias and Relative Bias of TOPE Estimates^1 across Designs for Acute TET Experiment: Simulated Activity Scores^2 with Linear-Exponential Model Fit
[Figure: two panels. A: bias against 2nd testing time. B: relative bias (%) against 2nd testing time.]
1. Bias for each design is computed using the mean of 2000 replicates of TOPE estimates. Relative bias = 100*bias/true TOPE.
2. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group ($\sigma$ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject ($\sigma$ = 0.17).
Each of 30 designs is designated by the value of its 2nd time of testing along the x-axis. The vertical dash-dot line passes through the 2nd testing time at the true TOPE (6.16 hr). Horizontal dashed lines, from the bottom, mark 0%, 5% and 10% relative bias (B). The vertical axis of B has been reduced to enhance clarity.

From Figure 3.1 we observe that the distributions of TOPE estimates for the designs appear to be normal when t2 values are in the midrange but are likely positively skewed when t2 is smaller or larger. When a distribution is positively skewed, its mean is greater than its median, and the median then becomes a more robust measure of the center of the distribution. Therefore the profile of the median TOPE estimates in Figures 3.1 & 3.3 provides a supplementary picture of the potential bias of TOPE estimates across designs.

Figure 3.3 Median TOPE Estimates^1 across Designs for Acute TET Experiment: Simulated Activity Scores^2 with Linear-Exponential Model Fit
[Figure: median TOPE estimate against 2nd test time (hr).]
1. Median of 2000 replicates of TOPE estimates.
2. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group ($\sigma$ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject ($\sigma$ = 0.17). Each of 30 designs is designated by the value of its 2nd time of testing along the x-axis. The dash-dot line marks the true TOPE (6.16 hr). Dotted lines mark the upper and lower 5% margins of the underlying TOPE.

The modified coefficient of variation (mCV) is plotted against designs in Figure 3.4. Plot A displays values corresponding to the whole t2 spectrum (i.e. all designs), while plot B focuses on designs with t2 less than 10 hr. The lowest mCV of 11.3% (corresponding to the lowest MSE of 0.481) was associated with the design of t2 = 2 hr. However, all designs with t2 between 1 and 7 hours had mCV less than 15% (Figure 3.4B), and these designs were the most precise in their estimation of the TOPE.
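As a quick arithmetic check of the figures quoted above, the reported minimum MSE of 0.481 and the true TOPE of 6.16 hr reproduce the reported mCV, and the 15% threshold can be restated on the MSE scale. The threshold conversion is our own restatement, not a value reported in the results.

```latex
\begin{align*}
  \mathrm{mCV} &= 100 \times \frac{\sqrt{\mathrm{MSE}}}{\mathrm{TOPE}}
               = 100 \times \frac{\sqrt{0.481}}{6.16}
               \approx 100 \times \frac{0.6935}{6.16}
               \approx 11.3\%, \\
  \mathrm{mCV} \le 15\%
               &\;\Longleftrightarrow\; \sqrt{\mathrm{MSE}} \le 0.15 \times 6.16 = 0.924 \text{ hr}
               \;\Longleftrightarrow\; \mathrm{MSE} \le 0.854 .
\end{align*}
```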
Figure 3.4 Plots of Coefficient of Variation (mCV)^1 Across Designs^2 for Acute TET Experiment: Simulated Activity Scores^3 with Linear-Exponential Model Fit
[Figure: two panels, each plotting coefficient of variation (%) against 2nd testing time (hr). A: all designs, 0 to 20 hr. B: designs with t2 up to 10 hr.]
1. MSE for each design was computed from the mean and the variance of 2000 replicates of TOPE estimates. mCV = 100*sqrt(MSE) / true TOPE
2. Plot A displays all 30 designs. The upper limit of the y-axis has been reduced in plot B for clarity.
3. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group ($\sigma$ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject ($\sigma$ = 0.17). The vertical dash-dot line passes through the 2nd testing time at the true TOPE (6.16 hr).

3.1.2 Distribution of TOPE Estimates: Large Variance

The case of large variability in dose response was also considered in order to assess the impact that such variability might have on designs. We specified a large constant standard deviation ($\sigma$ = 2.0) across all dose groups in the true model. This was expected to be a challenge to modeling, considering that this variation was about three times the standard deviation ($\sigma$ = 0.72) of the most variable dose group in the original dataset. As expected, relatively low convergence percentages (68-78%) were recorded for the designs associated with both the smaller and the larger values of the second testing time. However, designs with t2 values between 0.6 hr and 15 hr recorded convergence fractions of between 78% and 82%.
Table 3.1 Convergence across Designs^1 for Acute TET Experiment: Simulated Activity Scores (Large Variance)^2 with Linear-Exponential Model Fit

Design (t2 (hr))    0.2     0.4     0.6     0.8     1       1.2     1.4     1.6     1.8     2
Convergence (%)     69      76.95   78.45   78.35   76.25   78.45   78.5    79.3    81.05   80.1

Design (t2 (hr))    2.5     3       3.5     4       5       6       7       8       9       10
Convergence (%)     79.1    80.2    80.85   79.6    78.95   80      80.55   81.5    79.05   80.8

Design (t2 (hr))    11      12      13      14      15      16      17      18      19      20
Convergence (%)     80.25   80      78.9    79.7    79.55   76.7    76.65   74.6    71.85   68.9

1. Each design is designated by the value of its 2nd time of testing.
2. Constant variance ($\sigma$ = 2.0) across dose groups

In Figure 3.5 we see that the median TOPE estimates were within the 5% margin of the true TOPE only for designs with t2 of 0.4 to 0.6 hr. We also see from the figure that the median estimates were within about the 10% margin for designs with t2 of less than about 10 hr. On the other hand, Figure 3.6 shows that all of the designs were positively biased in their estimation of the TOPE; the bias was about 10% above the underlying TOPE when t2 was between 2 and 5 hr.

Figure 3.7 reveals that the replicated TOPE estimates were generally positively skewed for designs with larger and (to a lesser extent) smaller values of t2. Because of the positive skewness, the means were generally greater than the medians, and this may explain some of the disparity between the profiles across designs seen in Figures 3.5 and 3.6. The bias displayed in Figure 3.6 may be due in part to the skewness of the distributions and, as a result, the profile of median estimates shown in Figure 3.5 may provide complementary information.

Figure 3.5 Median TOPE Estimates^1 Across Designs^2 for Acute TET Experiment: Simulated Activity Scores (Large Variance)^3 with Linear-Exponential Model Fit
[Figure: median TOPE estimate against 2nd test time (hr).]
1. Median of 2000 replicates of TOPE estimates.
2. The designs are designated by the values of their 2nd time of testing along the x-axis.
3. Constant variance ($\sigma$ = 2.0) across dose groups. The dash-dot line marks the true TOPE (6.16 hr). Dotted lines (from the bottom) mark the lower 10% and 5%, and the upper 5% and 10%, margins of the underlying TOPE.

Figure 3.6 Bias & Relative Bias of TOPE Estimates^1 Across Designs^2 for Acute TET Experiment: Simulated Activity Scores (Large Variance)^3 with Linear-Exponential Model Fit
[Figure: two panels. A: bias against 2nd time of testing (hr). B: relative bias (%) against 2nd time of testing (hr).]
1. Bias for each design was computed using the mean of 2000 replicates of TOPE estimates. Relative bias = 100*bias/true TOPE.
2. The 30 designs (not all are shown on these plots) are designated by the value of their 2nd time of testing along the x-axis.
3. Constant variance ($\sigma$ = 2.0) across dose groups. The dash-dot line passes through the 2nd testing time at the true TOPE (6.16 hr). Dotted lines mark intervals on the x-axis. The limits of both axes in plot B have been reduced to enhance clarity.

Figure 3.7 Boxplots of TOPE Estimates^1 Across Designs^2 for Acute TET Experiment: Simulated Activity Scores^3 (Large Variance) with Linear-Exponential Model Fit
[Figure: side-by-side boxplots, one per design. Axes: TOPE; 2nd time of testing (hr), from 0.2 to 20.]
1. One boxplot of 2000 replications of TOPE estimates per design
2. Each of 30 designs is designated by the value of its 2nd time of testing along the y-axis
3. Constant variance ($\sigma$ = 2.0) across dose groups. The upper limit of the x-axis has been reduced for clarity.

It should be noted, however, that on the downside, the estimates from these designs were associated with a relatively wide spread. The mCV profile of all designs shown in Figure 3.8A and its expanded form in Figure 3.8B both appear to indicate that designs with t2 ranging from 2.5 hr to 6 hr are associated with the lowest mCVs, with the minimum value of 32.4% occurring at t2 of 5 hr.
Although the mCV trend within this t2 range (2.5-6 hr) appears to be unstable (Figure 3.8B), the smallest spread coupled with the least skewness has been demonstrated for designs in this t2 range. In contrast, the mCV is about three times as large as in the case of small variance (refer to section 3.1.1).

Figure 3.8 Coefficient of Variation (mCV)^1 Across Designs^2 for Acute TET Experiment: Simulated Activity Scores^3 (Large Variance) with Linear-Exponential Model Fit
[Figure: two panels, each plotting coefficient of variation (%) against 2nd testing time (hr). A: 0 to 20 hr. B: 0 to 10 hr.]
1. MSE for each design was computed from the mean and the variance of 2000 replicates of TOPE estimates. mCV = 100*sqrt(MSE) / true TOPE
2. Designs with mCV larger than 100% were excluded from this figure for better display of mCV
3. Constant variance ($\sigma$ = 2.0) across dose groups, compared with the most variable dose group ($\sigma$ = 0.7) in the original data. The vertical dash-dot line passes through the 2nd testing time at the true TOPE (6.16 hr). Dotted lines form grids to aid data point localization.

3.2 Acute DDT Experiment: Neuromuscular Domain / Toxico-Diffusion Model

The DDT experiment is another experiment of the IPCS/EPA Collaborative Study (Moser et al, 1997b). The population parameters, plus the random-effect and error variances, were obtained from the results (refer to Table 2.4) of a prior fit of the Toxico-Diffusion model to the dataset (Zhu et al 2003b). Accordingly, dose-group-specific standard deviations ($\sigma$ = 0.28, 0.25, 0.29, 0.45, 0.34) were employed for simulation.

Since the underlying TOPE in the DDT experiment was 4.7 hr, the spacing of t2 was slightly modified here. A total of 31 designs with t2 ranging from 0.2 hr to 17 hr were employed in this phase (Table 3.2). The convergence rate was greater than 96% for all simulated designs when fitting with the Toxico-Diffusion model.
Table 3.2 Table of Designs with Unique 2nd Testing Times for Acute DDT Experiment

design #            1    2    3    4    5    6    7    8    9    10
2nd test time (hr)  0.2  0.4  0.6  0.8  1    1.2  1.4  1.6  1.8  2

design #            11   12   13   14   15   16   17   18   19   20
2nd test time (hr)  2.5  3    3.5  4    4.5  5    5.5  6    6.5  7

design #            21   22   23   24   25   26   27   28   29   30   31
2nd test time (hr)  7.5  8    9    10   11   12   13   14   15   16   17

Figure 3.9 again shows the median TOPE estimates for all designs, which were consistently within the 5% margin of the underlying value of 4.7 hr. The relative bias (Figure 3.10) was less than 5% for t2 between 0.4 hr and 12 hr, and less than 10% for t2 between 0.2 hr and 14 hr; for t2 = 15 hr and beyond, the bias became increasingly large, reaching more than 30% at t2 = 17 hr. Figure 3.11 shows that the distribution of the TOPE estimator here is not much different from the previous boxplots (refer to Figure 3.1). Figure 3.12 clearly demonstrates that the designs with t2 fixed at 0.8-4 hr were associated with mCV of 15% or less.

Figure 3.9 Median TOPE Estimates(1) Across Designs(2) for Acute DDT Experiment: Simulated Neuromuscular Scores(3) with Toxico-Diffusion Model Fit
[Plot: x-axis, 2nd test time (hr); y-axis, median TOPE estimate.]
1. Median of 2000 replicates of TOPE estimates.
2. Each design is designated by the value of its 2nd time of testing along the x-axis.
3. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group (σ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject (= 0.10).
Dash-dot line marks the true TOPE (4.7 hr). Dotted lines (from bottom) mark the lower and upper 5% margins of the true TOPE.

Figure 3.10 Relative Bias of TOPE Estimates(1) Across Designs(2) for Acute DDT Experiment: Simulated Neuromuscular Scores(3) with Toxico-Diffusion Model Fit
[Plot: x-axis, 2nd time of testing (hr); y-axis, relative bias (%).]
1. Bias for each design was computed using the mean of 2000 replicates of TOPE estimates.
2. The designs are designated by the value of their 2nd time of testing along the x-axis.
3. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group (σ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject (= 0.10).
Dash-dot line marks the true TOPE (4.7 hr). Dotted lines (from bottom) mark the 0%, 5% and 10% margins of the true TOPE.

Figure 3.11 Boxplots of TOPE Estimates(1) Across Designs(2) for Acute DDT Experiment: Simulated Neuromuscular Scores(3) with Toxico-Diffusion Model Fit
[Plot: x-axis, TOPE; y-axis, 2nd time of testing (hr).]
1. One boxplot per design with 2000 replications of TOPE estimates.
2. Each of 31 designs is designated by the value of its 2nd time of testing along the y-axis.
3. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group (σ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject (= 0.10).
Upper limit of the x-axis has been reduced to aid visualization.

Figure 3.12 Plots of Coefficient of Variation (mCV)(1) Across Designs(2) for Acute DDT Experiment: Simulated Neuromuscular Scores(3) with Toxico-Diffusion Model Fit
[Panels A and B: x-axis, 2nd testing time (hr); y-axis, coefficient of variation (%).]
1. MSE for each design was computed from the mean and the variance of 2000 replicates of TOPE estimates. mCV = 100*sqrt(MSE) / true TOPE.
2. Only plot A displays all the 31 designs. The upper limits of both axes have been reduced in plot B for better display of mCV.
3. Simulated variance pattern is equivalent to that obtained from the original FOB data: different variance per dose group (σ = 0.28, 0.36, 0.39, 0.37, 0.72), random intercept per subject (= 0.10).
Vertical dash-dot line passes through the 2nd testing time at the true TOPE (4.7 hr). Dotted lines form grids to aid localization of data points.
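The per-design summaries named in the figure notes (median, relative bias, MSE and mCV) can be sketched in a few lines. This is a minimal illustration rather than the code used for the thesis. The function name `design_metrics` is hypothetical, relative bias is assumed to be the signed quantity 100*(mean - true TOPE)/true TOPE, and the sample variance (n - 1 denominator) is assumed because the notes do not say which variance was used.

```python
import math
import statistics


def design_metrics(tope_estimates, true_tope):
    """Summarise the replicated TOPE estimates of one design (one value of t2)."""
    mean_est = statistics.mean(tope_estimates)
    bias = mean_est - true_tope
    # Assumption: sample variance (n - 1 denominator); the source does not specify.
    variance = statistics.variance(tope_estimates)
    # MSE "computed from the mean and the variance": squared bias plus variance.
    mse = bias ** 2 + variance
    return {
        "median": statistics.median(tope_estimates),
        "relative_bias_pct": 100.0 * bias / true_tope,  # assumed signed definition
        "mse": mse,
        "mcv_pct": 100.0 * math.sqrt(mse) / true_tope,  # mCV = 100*sqrt(MSE)/true TOPE
    }


# Toy replicate values for illustration only; these are not thesis data.
example = design_metrics([4.0, 5.0, 6.0], true_tope=5.0)
print(example)  # mcv_pct is 20.0 here: zero bias, variance 1.0, true TOPE 5.0
```

In the simulations described above, `tope_estimates` would hold the 2000 replicate estimates for a single design, and the function would be called once per t2 value.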
3.3 Acute TET Experiment / Activity Domain: Rational Function Model

This model is based on a rational function that may also be used to describe a dose-response profile similar to those demonstrated by the Activity domain in the acute TET experiment. The main purpose of using this model is to test the sensitivity of the designs to the choice of model, particularly with an underlying TOPE as small as 2 hr.

Two sets of population parameters and sigma were simulated. For the first case, TOPE = 6.16 hr and the standard deviation was specified as σ = 1.0 across all dose groups. This standard deviation was about 40% larger than that of the most variant dose group (σ = 0.28*2.55 = 0.71) in the original dataset (Table 2.2). The second case had TOPE = 2 hr with σ = 0.3 for every dose group. In both simulations, the convergence rate of fitting the simulated data was greater than 80% for designs with t2 values up to 10 hr (Figure 3.13A, case 1) and up to 3.5 hr (Figure 3.13B, case 2).

Figure 3.13 Convergence across Designs(1): Simulated Activity Scores(2) of the Acute TET Experiment with Rational Function Model Fit
[Panel A: TOPE = 6.16 hr. Panel B: TOPE = 2 hr. x-axis, time of testing (hr); y-axis, convergence (%).]
1. Each of 30 designs is designated by the value of its 2nd time of testing along the x-axis.
2. Sigma equals 1.0 for A and 0.3 for B.
Vertical dash-dot line is the underlying TOPE.

Since in each instance the t2 range associated with good convergence stretches reasonably well beyond the underlying TOPE in both directions, there is a sufficiently wide time window within which to reliably test candidate designs. Therefore nonconvergence is not a problem.
3.3.1 Rational Function Model: Case One

Figure 3.14 shows that the spread of the replicated TOPE estimates is relatively small for t2 between about 1 hr and 5 hr. The median shifted increasingly to the right (increased) once t2 exceeded about 9 hr, and shifted to the left (decreased) when t2 was less than about 0.6 hr. In addition, toward both of these extremes of t2, the skewness increased and the variability of the TOPE estimator became increasingly large.

Figure 3.14 Boxplots of TOPE Estimates(1) to Compare Designs(2) for Acute TET Experiment: Simulated Activity Scores(3) (TOPE = 6.16 hr) with Rational Function Model Fit
[Plot: x-axis, TOPE; y-axis, 2nd time of testing (hr).]
1. One boxplot per design with 2000 replications of TOPE estimates.
2. Each of 30 designs is designated by the value of its 2nd time of testing along the y-axis.
3. Simulated total variance is 1.0 (control group variance for the original FOB data = 0.6).
Upper limit of the x-axis has been reduced for clarity.

Figure 3.15 Median TOPE Estimates(1) by Designs for Acute TET Experiment: Simulated Activity Scores(2) (TOPE = 6.16 hr) with Rational Function Model Fit
[Plot: x-axis, 2nd test time (hr); y-axis, median TOPE estimate.]
1. Median of 2000 replicates of TOPE estimates.
2. Simulated constant variance per dose group (= 1.0).
Each of 30 designs is designated by the value of its 2nd time of testing along the x-axis. Dash-dot line marks the true TOPE (6.16 hr). Dotted lines mark the upper and lower 5% and 10% margins of the true TOPE. Upper limit of the y-axis has been reduced for clarity.

In Figure 3.15, the median TOPE estimates for designs within the t2 ranges of 0.2-8 hr and 0.2-9 hr are shown to be within the 5% and 10% margins of the underlying TOPE, respectively. For the bias, Figure 3.16 shows that the relative bias was no more than 5% and 10% for t2 in the ranges of 1.4-8 hr and 1.2-9 hr, respectively.
The mCV values for all designs tested were greater than 15%, while the lowest mCV of 18.4% was recorded for t2 of 2.5 hr (Figure 3.17).

Figure 3.16 Relative Bias of TOPE Estimates(1) Across Designs(2) for Acute TET Experiment: Simulated Activity Scores (TOPE = 6.16 hr) with Rational Function Model Fit
[Plot: x-axis, 2nd time of testing (hr); y-axis, relative bias (%).]
1. Bias for each design was computed using the mean of 2000 replicates of TOPE estimates.
2. Thirty designs are designated by the value of their 2nd time of testing along the x-axis.
Vertical dash-dot line passes through the 2nd testing time at the true TOPE (6.16 hr), while dotted lines form grids to aid data point localization.

Figure 3.17 Plots of Coefficient of Variation (mCV)(1) Across 30 Designs(2) of Acute TET Experiment: Simulated Activity Scores(3) with TOPE = 6.16 hr and Rational Function Model Fit
[Panels A and B: x-axis, 2nd testing time (hr); y-axis, coefficient of variation (%).]
1. MSE for each design was computed from the mean and the variance of 2000 replicates of TOPE estimates. mCV = 100*sqrt(MSE) / true TOPE.
2. Designs are designated by the value of their 2nd time of testing along the x-axis. All 30 designs are displayed in A, while in B the upper limits of the x- and y-axes have been reduced for clarity.
3. Simulated constant variance per dose group (= 1.0).
Vertical dash-dot line passes through the 2nd testing time at the true TOPE (6.16 hr). Dotted lines form grids to aid data point localization.

3.3.2 Rational Function Model: Case Two

Inspection of Figures 3.18 and 3.19A reveals a progressive shift of the median estimate away from the underlying TOPE as t2 increased from 2 hr. As t2 increased, the spread and skewness of the replicated TOPE estimates also increased. Quantitatively, the median TOPE estimates were within the 5% and 10% margins of the true value when t2 was no greater than 2.5 hr and 3 hr, respectively (Figure 3.19A).
However, those designs with t2 values of 0.2-3 hr produced estimates with no more than 5% relative bias (Figure 3.19B).

Figure 3.18 Boxplots of TOPE Estimates(1) to Compare Designs(2) for Acute TET Experiment: Simulated Activity Scores(3) (TOPE = 2 hr) with Rational Function Model Fit
[Plot: x-axis, TOPE; y-axis, 2nd time of testing (hr).]
1. One boxplot per design with 2000 replications of TOPE estimates.
2. Each of 30 designs is designated by the value of its 2nd time of testing along the y-axis.
3. Simulated constant variance per dose group (= 0.3).
Upper limit of the x-axis has been reduced for clarity.

Figure 3.19 Median and Relative Bias of TOPE Estimates(1) Across Designs(2) for Acute TET Experiment: Simulated Activity Scores (TOPE = 2 hr) with Rational Function Model Fit
[Panel A: x-axis, 2nd test time (hr); y-axis, median TOPE estimate. Panel B: x-axis, 2nd time of testing (hr); y-axis, relative bias (%).]
1. 2000 replicates of TOPE estimates.
2. Each of 30 designs is designated by the value of its 2nd time of testing along the x-axis.
Dash-dot line marks the true TOPE (2 hr). Dotted lines mark the upper and lower 5% and 10% (A) and 5% (B) margins of the true TOPE. The limits of both axes have been adjusted for better display.

All of the designs tested in this case were associated with relatively high mCV (Figure 3.20). The minimum mCV was 15.1% and was recorded for the design with t2 of 0.4 hr (Figure 3.20B).

Figure 3.20 Plots of Coefficient of Variation (mCV)(1) Across 30 Designs(2) of Acute TET Experiment: Simulated Activity Scores(3) (TOPE = 2 hr) with Rational Function Model Fit
[Panels A and B: x-axis, 2nd testing time (hr); y-axis, coefficient of variation (%).]
1. MSE for each design was computed from the mean and the variance of 2000 replicates of TOPE estimates. mCV = 100*sqrt(MSE) / true TOPE.
2. Designs are designated by their 2nd testing time along the x-axis. All designs are displayed in A, while in B the limits of the x- and y-axes have been adjusted to better display mCV.
3. Simulated constant variance per dose group (= 0.3).
Vertical dash-dot line passes through the 2nd testing time at the true TOPE (2 hr). Dotted lines form grids to aid data point localization.

3.4 Summary of Results and Interpretations

The results are summarized in Table 3.3 below. Each row (category) of the table represents one distinct setting of an experiment with respect to FOB domain and dose-response model structure. Thirty (31 for category C) different designs (each with a different t2) were evaluated within each setting. In the four categories that each involved 30 designs, the t2 values tested ranged from 0.2 hr to 20 hr, while for category C the range was 0.2 hr to 17 hr. The time window included the underlying TOPE (6.16 hr, 4.7 hr and 2.0 hr) in every instance.

In cases A and C, where the simulated datasets share the same variance pattern as the original datasets, designs within a wide range of t2 (0.6-12 hr) yielded TOPE estimates with no more than 5% relative bias away from the true TOPEs. By comparison, in case E, where the underlying TOPE was relatively small (2 hr), only the designs in the t2 range of 0.2-3 hr were able to produce estimates lying within 5% relative bias.

As the variance of the simulated data increased over that of the original dataset, a decreasing number of designs qualified as efficient. For example, in cases A and C, where the variance is comparable to that in the original dataset, the ranges of t2 required to estimate the TOPE to within 10% bias were 0.6-17 hr and 0.2-14 hr, respectively. However, in case B (with about a 200% increase in standard deviation over the most variant group in case A), a narrower range of t2 (2-5 hr) was required to achieve the same level of accuracy for estimating the TOPE.
Case D is intermediate between A and B with respect to variance (about 40% increase) and qualifying designs (1.2-9 hr).

Table 3.3 Summary of Designs for the Estimation of TOPE

Simulation settings
Category  Experiment/Domain/Model  True TOPE (hr)  Variance pattern in simulated data
A         TET/Activity/LE          6.16            Different variance per dose group; random intercept
B         TET/Activity/LE          6.16            Large constant variance, σ = 2.0 (~200% larger)
C         DDT/Neuromuscular/TD     4.7             Different variance per dose group; random intercept
D         TET/Activity/RF          6.16            Constant variance, σ = 1.0 (~40% larger)
E         TET/Activity/RF          2.0             Constant variance per dose group = 1.0

Best designs, designated by best 2nd time of testing or t2 (hr)
Category  Convergence (%)             Median within 5% margin of TOPE  Relative bias ≤5%  Relative bias ≤10%  mCV ≤15%  Lowest mCV
A         0.2-20, all designs (>95%)  0.2-20, all designs              0.6-15             0.6-17              1-7       2.0 (11.3%)
B         0.6-15 (78-82%)             0.4-0.6                          none               2-5                 none      5.0 (32.4%)
C         0.2-17, all designs (>95%)  0.2-17, all designs              0.4-12             0.2-14              0.8-4     1.8 (13.7%)
D         0.2-10 (>80%)               0.2-8                            1.4-8              1.2-9               none      2.5 (18.4%)
E         0.2-3.5 (>80%)              0.2-2.5                          0.2-3              0.2-3               0.4       0.4 (15.1%)

Legend: LE = Linear-exponential model; TD = Toxico-diffusion model; RF = Rational function model; TOPE = Time of peak effects; mCV = Modified coefficient of variation.

Generally, the above findings suggest that, for the cases considered, in order to produce reasonably accurate estimates of the TOPE, say to within 5% of the true value, a design must choose a 2nd testing time not far away from the underlying TOPE. Furthermore, they suggest that the presence of wide variability in the data may reduce the capability of designs for accurate estimation of the TOPE and further restrict the choice of t2 for effective designs to values less than the underlying TOPE. Although the models are different in most of the cases considered here, it is reasonable to expect that variation in data may influence the designs as suggested by our findings.
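The t2 ranges reported in Table 3.3 amount to filtering the designs by a threshold on a per-design metric. The following is a minimal sketch of that selection step. The function names and the toy metric values are hypothetical and are not taken from the thesis.

```python
def qualifying_t2(metric_by_t2, threshold_pct):
    """Return the sorted t2 values whose |metric| does not exceed the threshold."""
    return sorted(t2 for t2, value in metric_by_t2.items()
                  if abs(value) <= threshold_pct)


def as_range(t2_values):
    """Report (min, max) of the qualifying t2 values, or None if no design qualifies."""
    return (t2_values[0], t2_values[-1]) if t2_values else None


# Hypothetical relative-bias values (%) keyed by t2 (hr); NOT taken from the thesis.
toy_relative_bias = {0.5: -8.0, 1.0: -3.0, 2.0: 1.0, 4.0: 4.5, 8.0: 12.0}

print(as_range(qualifying_t2(toy_relative_bias, 5.0)))  # (1.0, 4.0)
print(as_range(qualifying_t2(toy_relative_bias, 2.0)))  # (2.0, 2.0)
```

Reporting only the minimum and maximum hides any gaps inside the range, so where a metric is unstable across neighbouring designs it may be safer to inspect the full list returned by `qualifying_t2`.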
With respect to mCV, the bias and variance of estimation are combined. Because mCV is proportional to the square root of the MSE, the two measures rise and fall together, and a lower value of either indicates higher precision of estimation for a given design. Compared to relative bias (≤5%), a much narrower range of t2 was consistently required to achieve a desirable level of precision of estimates (mCV ≤15%), irrespective of the model or pattern of variance in the data. Generally, where TOPE = 6.16 hr, t2 should be about 2.5 hr in order to attain the smallest mCV, which varied between 10% and 30% for the designs considered. For TOPE = 4.7 hr, the minimum mCV of 13.7% was achieved at t2 = 1.8 hr; however, t2 could lie in the range of about 1-4 hr and still give an mCV of no more than 15%. Similarly, a minimum mCV of about 15% was obtained by only one design, with t2 = 0.4 hr, when the underlying TOPE was 2 hr.

Overall, the designs with the greatest precision (smallest mCV) form a subset of the designs with the highest validity (least bias) in the estimation of TOPE. In the cases considered in this thesis, the most precise TOPE estimates were generally produced when the second testing time was situated about midway between time zero and the underlying TOPE. An exception, where the most precise design had its second testing time (5 hr) relatively closer to the underlying TOPE (6.16 hr), was category B, where variability in the sample was large (σ = 2). Here the smallest attainable relative bias (about 10%) and mCV (about 32%) were larger than those of the other categories. It should be recalled that the trends of both measures (relative bias and mCV) across designs were rather unstable in the minimum regions (refer to Figures 3.6 and 3.8). The implication is that for this category the appropriate t2 for the most effective design could plausibly be anywhere from 2 hr to 6 hr, which still leans toward the lower side of the underlying TOPE of 6.16 hr.
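The way mCV combines bias and variance can be written out explicitly. The notation below is introduced here for illustration only: \hat{T} stands for the TOPE estimator and T for the true TOPE. Relative bias is assumed to be the signed percentage deviation of the mean estimate from T.

```latex
\[
\mathrm{MSE}(\hat{T}) = \operatorname{Var}(\hat{T}) + \bigl(\mathrm{E}[\hat{T}] - T\bigr)^{2},
\qquad
\mathrm{mCV} = 100 \times \frac{\sqrt{\mathrm{MSE}(\hat{T})}}{T}
\]
\[
\mathrm{mCV}^{2}
= \left(100 \times \frac{\sqrt{\operatorname{Var}(\hat{T})}}{T}\right)^{2}
+ \left(100 \times \frac{\mathrm{E}[\hat{T}] - T}{T}\right)^{2}
\]
```

Under these assumptions the mCV of a design can never be smaller than the absolute value of its relative bias. A design therefore has to do well on both bias and spread to keep mCV low, which is consistent with the narrower t2 ranges observed for the mCV criterion.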
In all, the t2 value of each of these identified effective designs remained smaller than the underlying TOPE. It follows that, under various combinations of conditions such as exposure agent, neurobehavioral domain, statistical model, or value of the underlying TOPE, all of the qualifying effective designs seemed to share a common feature: the 2nd testing time should be chosen at a point a little earlier than the underlying TOPE in order to achieve robust estimation of the TOPE.

CHAPTER 4
DISCUSSION

The IPCS/EPA Collaborative Study protocol (Moser et al., 1997b), under which the existing FOB data were generated, proposed that the 2nd testing in a particular experiment be performed at the time of peak effects (TOPE) for that chemical. The TOPE is derived using two endpoints through a pilot experiment. Since the true TOPE may vary with the testing chemical, the dosing level, and the endpoint, the choice of the 2nd testing time can be important in determining the quality of the experiment. This thesis set out principally to find effective designs with respect to the choice of the second testing time point. Through simulation of a set of designs uniquely defined within a range of the 2nd testing time, the most effective designs were selected based on specified criteria. The results of the study showed that many designs are robust against a misspecification of the TOPE choice and can produce TOPE estimates within a 5% margin of relative bias. These designs are also robust with respect to the mean squared error (MSE) or modified coefficient of variation (mCV) criterion; however, the range of t2 becomes narrower because of the inclusion of variance in these criteria. Further, the empirical evidence shows that these designs tend to have the 2nd testing time point before the true TOPE. However, it is not clear in general how much earlier the second testing time point can be.
Further investigation will be helpful before our results can be generalized to a broader situation.

The dose-response models considered in our study dictate that the TOPE is a function of the model parameters and does not vary with dose level. Our simulation utilizes parameter values derived from several real datasets. For each design, we simulated 2000 replicate experiments and fit the underlying dose-response model to them. The convergence rate was generally high when fitting the dose-response models to simulated datasets. Bias in estimating TOPE was generally negligible for most designs.

Based on our findings, there seems to be reasonable latitude around the TOPE for the choice of the 2nd testing time if the statistical estimate of TOPE is to be associated with no more than 5% relative bias. However, this may not in itself be sufficient or be readily achievable in practice. As our study further shows, designs with the second testing performed at the TOPE may be associated with relatively high MSE or mCV. That means that in a single experiment there is a high probability for such a design to yield a TOPE estimate with more than 5% deviation from the true value. Alternatively, designs in which the second testing was performed about halfway between time zero and the true TOPE had the smallest MSE in our study and can therefore be expected to have the highest probability of producing TOPE estimates within the 5% margins of the true value in a single experiment.

The main interpretation of our findings may be exemplified as follows. Let us consider an ideal situation under the proposed IPCS/EPA protocol. Prior to a certain acute exposure experiment, the TOPE was accurately determined to be 4 hr post exposure in a range-finding pilot study.
As implied by the findings of this thesis, if we conduct the 2nd testing of the experiment at 2 hr (or between 1 and 3 hr) post exposure, the subsequent statistical estimate of the TOPE has a higher probability of being close to the true TOPE value of 4 hr than if we had conducted the testing at 4 hr. Therefore the timing of the 2nd testing has an impact on the overall capability for statistical estimation of the true TOPE on the basis of a fitted dose-time-response model.

The significance of our findings can be further illustrated by exploring a more realistic scenario under the IPCS/EPA protocol. Major sources of uncertainty in the TOPE estimate obtained from a pilot study include systematic error or bias (inaccuracy) and random error or statistical variation (imprecision). Therefore, if a pilot study came up with an estimated TOPE = 4 hr, given the uncertainty of estimation we can reasonably assume that the true value could be anywhere between 3 and 5 hr. If the conclusions from our present findings were to apply, then the 2nd testing would be performed halfway below the estimated TOPE, which would be at 2 hr. In effect the 2nd testing time (2 hr) would be about halfway below the true TOPE (which lies anywhere between 3 and 5 hr). That means this design would be close to optimal in spite of the uncertainty in the TOPE estimate from the pilot study. On the other hand, under the IPCS/EPA protocol, the 2nd testing would have to be performed at 4 hr. Such a design might be fairly close to optimal if the true TOPE were between 4 and 5 hr, but it would be even further from optimal if the true TOPE lay between 3 and 4 hr. This last scenario is quite conceivable, given that the pilot-experiment-based TOPE estimates obtained for just two FOB measures may not be truly representative of all 30 FOB response measures both within and across neurobehavioral domains.
Hence it is reasonable to expect that the TOPE estimate may be fraught with substantial uncertainty as depicted above. Designing the experiment proper so that the 2nd testing time is about halfway below the pilot-experiment-based TOPE estimate is therefore recommended. This may increase the probability of getting a TOPE estimate that is close to the true value, thereby facilitating effective statistical estimation of the TOPE.

The findings and recommendations of this thesis may have limited direct application to the OPPTS guidelines released in 1998 (US EPA, 1998b), where the proposed minimum times of testing are before exposure, at TOPE, and 7 and 14 days post exposure. The present study was conducted under the IPCS/EPA protocol that produced the FOB data, where the times of testing were before exposure, at TOPE, and 1 and 7 days post exposure. If the dosing effects are transient, such that the toxic effects are largely washed out between day 1 and day 7, then data collected on day 14 provide very little additional information beyond the data collected at previous testing times. In that situation our findings may be inapplicable to data generated under the 1998 guidelines. It should be noted, though, that the inter-individual variability in dose-time-response characteristics within a given population of rats should be inherent to that population, regardless of the protocol (whether 1997 or 1998) under which observations are made. So far as the statistical models referred to in this thesis adequately fit the dose-time-response trend (e.g. single peak, maximum or minimum; no premature washout) in a given sample, our present findings may be applicable under both protocols. Nevertheless, the potential impact that the difference between data generated under the two protocols may have on statistical modeling should be a subject of future study.
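The pilot-study scenario discussed earlier in this chapter (pilot TOPE estimate of 4 hr, true TOPE somewhere between 3 and 5 hr) reduces to simple arithmetic on the ratio of the 2nd testing time to the true TOPE. The sketch below only tabulates that ratio for the two candidate timings. The function name `timing_ratios` is hypothetical, and the code does not attempt to reproduce the simulation results.

```python
def timing_ratios(t2, candidate_true_topes):
    """Ratio of the 2nd testing time to each candidate true TOPE, rounded to 2 places."""
    return [round(t2 / tope, 2) for tope in candidate_true_topes]


pilot_tope = 4.0                   # pilot estimate from the scenario (hr)
candidate_topes = [3.0, 4.0, 5.0]  # plausible true values given pilot uncertainty (hr)

designs = {
    "halfway below pilot estimate": 0.5 * pilot_tope,  # t2 = 2 hr
    "at pilot estimate (protocol)": pilot_tope,        # t2 = 4 hr
}

for label, t2 in designs.items():
    print(label, t2, timing_ratios(t2, candidate_topes))
# halfway below pilot estimate 2.0 [0.67, 0.5, 0.4]
# at pilot estimate (protocol) 4.0 [1.33, 1.0, 0.8]
```

With t2 = 2 hr the testing time stays below every candidate true TOPE. With t2 = 4 hr it falls at or after the true TOPE whenever the true value is 4 hr or less, which is the region the simulations found least favourable.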
In further research it will be useful to investigate whether at least two testing time points surrounding the TOPE may be needed, one earlier and one later. It will also be helpful to assess the impact of the TOPE estimate on the variability of the benchmark dose (BMD), which is related to the TOPE estimate. Such research may help to further quantify the relative contributions of the designs compared in this thesis to the variation in TOPE and BMD estimates.

REFERENCES

Anger, W. K. (1986). Workplace exposures. In Neurobehavioral Toxicology (Z. A. Annau, ed.), pp. 331-347. Johns Hopkins University Press.

Crump, K. S. (1984). A new method for determining allowable daily intakes. Fundamental and Applied Toxicology 4, 854-871.

Fedorov, V. (1972). Theory of optimal experiments. Academic Press, New York.

Lammers, J. H. C. M., and Kulig, B. M. (1997). Multivariate time of peak effects assessment for use in selecting time of testing in acute neurotoxicity studies. Neurotoxicology 18, 1079-84.

MacPhail, R. C., Tilson, H. A., Moser, V. C., et al. (1997). The IPCS Collaborative Study on Neurobehavioral Screening I. Background and Genesis. Neurotoxicology 18, 925-928.

McDaniel, K. L., and Moser, V. C. (1993). Utility of a neurobehavioral screening battery for differentiating the effects of two pyrethroids, permethrin and cypermethrin. Neurotoxicology and Teratology 15, 71-83.

Moser, V. C., Becking, G. C., Cuomo, V., et al. (1997). The IPCS collaborative study on neurobehavioral screening methods V: results of chemical testing. Neurotoxicology 18, 969-1056.

Moser, V. C., Tilson, H. A., MacPhail, R. C., Becking, G. C., Cuomo, V., Frantik, E., Kulig, B. M., and Winneke, G. (1997). The IPCS collaborative study on neurobehavioral screening methods II: protocol design and testing procedures. Neurotoxicology 18, 929-938.

Neter, J., Kutner, M., Nachtsheim, C., and Wasserman, W. (1996). Applied linear statistical models. WCB/McGraw-Hill, Boston, MA.

Silvey, S. (1980). Optimal design. Chapman and Hall, London.

Tobias, R. D. (Retrieved 2004). The structure of optimal design algorithms. http://support.sas.com/rnd/app/papers/optex.pdf.

U.S. EPA (1991). Pesticide assessment guidelines, subdivision F. Hazard evaluation: human and domestic animals. OPPTS Addendum 10. PB91-154617. NTIS, Springfield VA, Washington, DC.

U.S. EPA (1995). Reportable quantity adjustments; final rule, pp. 60(112):30926-30962. Federal Register.

U.S. EPA (1998). Neurotoxicity Screening Battery. OPPTS 870.6200. In Health Effects Test Guidelines. EPA 712-C-98-238.

US EPA (1990). Neurotoxicity: identifying and controlling poisons of the nervous system, Vol. OTA-BA-436, pp. 105-144. Office of Technology Assessment (OTA), U.S. Congress, Washington, DC: Government Printing Office.

US EPA (1998). Guidelines for Neurotoxicity Risk Assessment, pp. 63:26926-26954. Federal Register.

Wackerly, D., Mendenhall, W., III, and Scheaffer, R. (1996). Mathematical statistics with applications. Wadsworth Publishing Company, California.

Zhu, Y. (2001). Neurobehavioral toxicity risk assessment: dose-time-response modeling and benchmark dose estimation. Reported to National Center for Environmental Assessment, US EPA.

Zhu, Y. (2003). Mixed-effects toxicokinetic and toxicodynamic dose-response models for neurotoxicity risk assessment. Environmetrics. Under review.

Zhu, Y., and Toyinbo, P. (2003). Analyses of Neurobehavioral Data: Dose-time-response Modeling of Composite Scores. Unpublished manuscript.

Zhu, Y., Jia, Z., Wang, W., Gift, J., Moser, V., and Pierre-Louis, B. (2004c). Analyses of the Neurobehavioral Screening Data III: Benchmark Dose Estimation. Under EPA's peer review.

Zhu, Y., Toyinbo, P., Woodruff, S., Liu, T., and Moser, V. (2004b). Analyses of Neurobehavioral Data II: Dose-time-response Modeling of Composite Scores. Under EPA's peer review.

Zhu, Y., Wessel, M., Woodruff, S., Liu, T., and Moser, V. (2004a). Analyses of Neurobehavioral Screening Data I: Dose-Time-Response Modeling of Continuous Outcomes. Under EPA's peer review.