USFDC Home  USF Electronic Theses and Dissertations   RSS 
Material Information
Subjects
Notes
Record Information

Full Text 
xml version 1.0 encoding UTF8 standalone no
record xmlns http:www.loc.govMARC21slim xmlns:xsi http:www.w3.org2001XMLSchemainstance xsi:schemaLocation http:www.loc.govstandardsmarcxmlschemaMARC21slim.xsd leader nam 2200397Ka 4500 controlfield tag 001 001795320 005 20080626162332.0 007 cr mnu uuuuu 008 070703s2006 flu sbm 000 0 eng d datafield ind1 8 ind2 024 subfield code a E14SFE0001561 035 (OCoLC)151100958 040 FHM c FHM 049 FHMM 090 QA36 (ONLINE) 1 100 Rajaram, Lakshminarayan. 0 245 Statistical models in environmental and life sciences h [electronic resource] / by Lakshminarayan Rajaram. 260 [Tampa, Fla] : b University of South Florida, 2006. 520 ABSTRACT: The dissertation focuses on developing statistical models in environmental and life sciences. The Generalized Extreme Value distribution is used to model annual monthly maximum rainfall data from 44 locations in Florida. Time dependence of the rainfall data is incorporated into the model by assuming the location parameter to be a function of time, both linear and quadratic. Estimates and confidence intervals are obtained for return levels of return periods of 10, 20, 50, and 100 years. Locations are grouped into statistical profiles based on their similarities in return level graphs for all locations, and locations within each climatic zone. A family of extreme values distributions is applied to model simulated maximum drug concentration (Cmax) data of an anticoagulant drug. For small samples (n < 100) data exhibited bimodality. The results of investigating a mixture of two extreme value distributions to model such bimodal data using twoparameter Gumbel, Pareto and Weibull concluded that a mixture of two Weibull distributions is the only suitable FTSel.For large samples Cmax data are modeled using the Generalized Extreme Value, Gumbel, Weibull, and Pareto distributions. These results concluded that the Generalized Extreme Value distribution is the only suitable model. A system of random differential equations is used to investigate the drug concentration behavior in a threecompartment pharmacokinetic model which describes coumermycin's disposition. The rate constants used in the differential equations are assumed to have a trivariate distribution, and hence, simulated from the trivariate truncated normal probability distribution. Numerical solutions are developed under different combinations of the covariance structure and the nonrandom initial conditions. We study the dependence effect that such a pharmacokinetic system has among the three compartments as well as the effect of variance in identifying the concentration behavior in each compartment.^We identify the time delays in each compartment. We extend these models to incorporate the identified time delays. We provide the graphical display of the time delay effects on the drug concentration behavior as well as the comparison of the deterministic behavior with and without the time delay, and effect of different sets of time delay on deterministic and stochastic behaviors. 502 Dissertation (Ph.D.)University of South Florida, 2006. 504 Includes bibliographical references. 516 Text (Electronic dissertation) in PDF format. 538 System requirements: World Wide Web browser and PDF reader. Mode of access: World Wide Web. 500 Title from PDF of title page. Document formatted into pages; contains 181pages. Includes vita. 590 Adviser: Chris P. Tsokos, Ph.D. 653 Trivariate normal. Simulation. Covariance structure. Bioavailability. Extreme value distribution. 690 Dissertations, Academic z USF x Mathematics Doctoral. 773 t USF Electronic Theses and Dissertations. 949 FTS SFERS ETD QA36 (ONLINE) 4 856 u http://digital.lib.usf.edu/?e14.1561 PAGE 1 Statistical Models in Environmental and Life Sciences by Lakshminarayan Rajaram A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Mathematics College of Arts and Sciences University of South Florida Major Professor: Chris P. Tsokos, Ph.D. Marcus Mcwaters, Ph.D. Kandethody Ramachandran, Ph.D. Geoffrey Okogbaa, Ph.D. Date of Approval: April 6, 2006 Keywords: trivariate normal, simulation, cova riance structure, bioa vailability, extreme value distribution Copyright 2006 Lakshminarayan Rajaram PAGE 2 Dedication I dedicate this dissertation to my wife, Ha, to my son, Ganesh, and to my brother, Swamy. Especially to Swamy, who spent his entire life nurturing all of us, our parents and siblings, with his unflinching devotion a nd love. His loving words of encouragement from the depths of his magnanimous heart and that everpresent radiant smile have solely contributed to my commitment to complete the doctoral degree. PAGE 3 Acknowledgments I take this opportunity to acknowledge a number of people who have helped me to succeed in accomplishing my goal. Firstly, I would like to express my profound appreciation to Dr. Chris P. Tsokos for suggesting the research problems, for his continual support and encouragement all through my graduate studies. I also want to thank the other members of my committeeDr. Marcus McWaters, Dr. K. Ramachandran, and Dr. Okogbaafor their guidance and support. Finally, I like to express a special note of thanks to Dr. A.N.V. Rao for all his encouragement. PAGE 4 i Table of Contents List of Tables v List of Figure vii Abstract xi Chapter One Review of Literature and Statistical Modeling Using Extreme Value Distributions 1.0 Introduction and the Focus of Chapter One 1 1.1 Literature Review of Extreme Value Theory 3 1.2 Probability Framework for Extreme Value Theory 6 1.3 Generalized Extreme Value (GEV) Distribution 11 1.3.1 Probability Density Func tion and Maximum Likelihood 11 1.3.2 Quantile 13 1.3.3 Model Diagnostics 14 1.3.4 GEV Modeling of Nonstationary Processes 16 1.4 Gumbel Distribution 18 1.4.1 Probability Density Func tion and Maximum Likelihood 18 1.4.2 Quantile 20 1.4.3 Model Diagnostics 20 1.4.4 Statistical Modeling of Annual Maximum Discharge 21 1.5 Frechet Distribution 24 1.5.1 Probability Density Func tion and Maximum Likelihood 24 1.5.2 Quantile 25 1.5.3 Model Diagnostics 26 1.6 Weibull Distribution 27 1.6.1 Probability Density Func tion and Maximum Likelihood 27 1.6.2 Quantile 28 1.6.3 Model Diagnostics 29 1.6.4 Statistical Modeling of Annual Maximum Storm Surge 30 1.7 Generalized Pareto Distribution 32 1.7.1 Probability Density Func tion and Maximum Likelihood 33 1.7.2 Quantile 34 1.7.3 Model Diagnostics and Methods of Selecting a Threshold Value 36 1.7.4 Statistical Modeling of Daily Precipitation Data 37 1.8 Aims of the Present Research 41 PAGE 5 ii Chapter Two Statistical Modeling of A nnual Monthly Maximum Rainfall Using the Generalized Extreme Value Distribution 2.0 Introduction 44 2.1 Focus of Chapter Two 45 2.1.1 Data 45 2.1.2 Methodology 48 2.1.3 Models 49 2.2 Results and Discussions 50 2.3 Similarity Profiles by Each of the Three Climatic Zones 57 2.4 Conclusion 64 Chapter Three Mixed Statistical Model for the Pharmacokinetic Parameter, Maximum Drug Concentration (Cmax): Small Samples 3.0 Introduction 65 3.1 Outline of Each Section of Chapter Three 66 3.2 Pharmacokinetics 66 3.2.1 Bioavailability and Asse ssment of Bioavailability 67 3.2.2 Introduction to Warfarin 68 3.2.3 Maximum Drug Concentration (C max ) Data 70 3.3 Focus of the Chapter Three 71 3.4 Statistical Methods for Small Samples 72 3.4.1 Generalized Extreme Value and Gumbel Distributions 72 3.4.2 Weibull Distribution 72 3.4.3 Pareto Distribution 74 3.5 Mixture of Two Extrem e Value Distributions 75 3.5.1 Introduction 75 3.5.2 Mixture of Two Gu mbel Distributions 77 3.5.3 Mixture of Two Pareto Distributions 80 3.5.4 Mixture of Two Wei bull Distributions 83 3.6 Results of the Modeling on Sample Sizes 50 and 100 88 3.6.1 Mixture of Two Pareto Distributions 88 3.6.2 Mixture of Two Wei bull Distributions 89 3.7 Summary and Conclusions 95 Chapter Four Statistical Model for the Pharmacokinetic Parameter, Maximum Drug Concentration (C max ): Large Samples 4.0 Introduction 97 4.1 Outline of Each Section of Chapter Four 97 4.2 Generalized Extreme Value (GEV) Distribution 98 4.2.1 Derivations of Normal Equations 100 4.2.2 Validity of the GEV Model 101 4.3 Gumbel Distribution (Type I Extreme Value Distribution) 102 4.3.1 Derivations of Normal Equations 102 4.3.2 Validity of the Gumbel Model 103 PAGE 6 iii 4.4 Weibull Distribution (Type II I Extreme Value Distribution) 104 4.4.1 Derivations of Normal Equations 104 4.4.2 Validity of the Weibull Model 105 4.5 Generalized Pareto Distribution 105 4.6 Results of the Modeling on Large Samples 107 4.6.1 Assessment of Generalized Extreme Value Model 107 4.6.2 Assessment of Gumbel Model 110 4.6.3 Assessment of Weibull Model 111 4.6.4 Assessment of Gene ralized Pareto Model 111 4.7 Return Level Profiles Based on the Quartiles 113 4.8 Summary and Conclusions 114 Chapter Five Statistical Modeling of a Pharmacokinetic System 5.0 Introduction 116 5.1 Focus of the Chapter Five 119 5.2 A Pharmacokinetic Model for the Antibiotic Drug, Coumermycin A 1 120 5.3 Simulation of Rate Constants Usi ng Trivariate Normal Distributions 124 5.4 Discussion of the Results 128 5.4.1 Effects of Varying the Values of the Standard Deviation and the Correlation Structure for Studying the Behaviors of x 1 (t), x 2 (t) and x 3 (t) 129 5.4.2 Summary of the Effects of the Standard Deviation and the Correlation Structure on the Behaviors of x 1 (t), x 2 (t) and x 3 (t) 130 5.5 Stochastic Behavior of the Drug Concentration in Each Compartment 136 5.5.1 Effects of Varying the Values of the Standard Deviation and the Correlation Structure on the Deterministic ( x i (t) ) and Stochastic ( E[ x i (t)]) behaviors of the Indi vidual Compartments 136 5.5.2 Summary of the Effects of the Standard deviation and the Correlation Structure on the Deterministic and the Stochastic Characterizations 137 5.6 Simulation of Rate Constants Usi ng Trivariate Exponential Distribution 142 5.6.1 Introduction 142 5.6.2 Discussion of Results 144 5.7 Conclusion 148 Chapter Six Statistical Mode ling of a Pharmacokinetic System Using a System of Delay Random Differential Equations 6.0 Introduction 149 6.1 Focus of Chapter Six 151 6.2 Pharmacokinetic Model 154 6.3 Discussion of the Results 156 6.3.1 Overall Behavior of the Drug Concentration in all Three Compartments 156 6.3.2 Comparison of Deterministic Behavior Between Delay and Nondelay Models 158 PAGE 7 iv 6.3.3 Comparison of Deterministic a nd Stochastic behaviors of the Delay Models 163 6.4 Conclusion 168 Chapter Seven Future Research Studies 7.0 Possible Extensions of the Present Research Studies 169 References 171 Bibliography 179 About the author End Page PAGE 8 List of Tables Table 1.6.1 Estimates of shape and scale pa rameters with their standard errors from 3 from 3 and 2parameter Weibull fits on the data set containing 111 annual maximum storm surge. 31 Table 1.7.1 Estimates of scale and shape pa rameters with standard errors from the generalized Pareto model on th e Fort Collins precipitation data. 38 Table 1.7.2 95% Confidence interval es timation using profile likelihood for the GPD shape parameter and 100year return level. 38 Table 2.1 Tabulation of the location, clim ate zone (central, north, or south), years of data, latitude and longitu de for all forty four locations. 46 Table 2.2 Summary of the stationary or nonstationary form of the maximum rainfall data for 56 years of da ta for the forty four locations. 52 Table 2.3 Return levels of the extreme rain fall with profile devi ance of quantiles. 54 Table 3.1 Maximum likelihood estimators of the parameters with standard errors from the mixture of two Gumbel models fit to the mixture of two Gumbel data sets simulated using ,40 ,200,100021 1 n 40.0 ,*2002 pd 78 Table 3.2 Maximum likelihood estimators of the parameters with standard errors from the mixture of two Gumbel models fit to the mixture of two Gumbel data sets simulated using ,40 ,200,50021 1 n 40.0 ,*2002 pd 79 Table 3.3 Maximum likelihood estimators of the parameters with standard errors from the mixture of two Gumbel models fit to the mixture of two Gumbel data sets simulated using ,40 ,100,50021 1 n 40.0 ,*2002 pd 79 Table 3.4 Parameter estimates and the sta ndard errors from the mixture Pareto model (n = 50). The Akaike Information Criterion (AIC) for this model is 720. 88 v PAGE 9 vi Table 3.5 Parameter estimates and the sta ndard errors from the mixture Pareto model (n = 100). The Akaike Information Criterion (AIC) for this model is 1537. 89 Table 3.6A Parameter estimates and the sta ndard errors from the mixture Weibull model(n = 50). The Akaike Information Criterion (AIC) for this model is 727.63 89 Table 3.6B Parameter estimates and the sta ndard errors from the mixture Weibull model (n = 50). The AIC value for this model is 728. 90 Table 3.7A Parameter estimates and the sta ndard errors from the mixture Weibull model (n=100). The AIC value for this model is 1445 91 Table 3.7B Parameter estimates and the sta ndard errors from the mixture Weibull model (n = 100). The AIC value for this model is 1445. 92 Table 4.1 Parameter Estimates (SE) for the GEV Model. 107 Table 4.2 Parameter Estimates (SE) for the Gumbel Model. 110 Table 4.3 Parameter Estimates (SE) for the ThreeParameter Weibull model. 111 Table 4.4 Parameter Estimates (SE) fo r the generalized Pareto Model. 112 Table 4.5 Parameter Estimates (SE) for the GEV fit on four groups of data. 113 Table 5.2.1 Rate constants for the disposition of coumermycin A 1 121 Table 5.3.1 Flowchart desc ribing all possible configur ations in our numerical study of the pharmacokinetic system. 127 Table 5.4.1 Effects of varying the value of the standard deviation and correlation structure for studying the behavior of x 1 (t), x 2 (t) and x 3 (t). 129 Table 5.5.1 Effects of varying the values of the standard deviation and correlation structure on the deterministic a nd stochastic behaviors of the individual compartments. 136 PAGE 10 vii List of Figures Figure 1.4.1 Probability plot for the annual maximum discharges of Feather River. 22 Figure 1.4.2 Graph of u(T) Threshold ve rsus Tyear for the annual maximum discharges of Feather River data set. 23 Figure 1.6.1 Estimates of storm surges based on MLE of 2and 3parameter Weibull distribution (n = 111). 31 Figure 1.7.1 Diagnostic plots for the GPD fit of the Fort Collins Precipitation Data (threshold value = 0.40). 39 Figure 1.7.2 Profile loglikelihood plot for GPD 100year re turn level (inches) for Fort Collins precipitation data. 40 Figure 1.7.3 Profile loglikelihood plot for GPD shape parameter for Fort Collins precipitation data. 40 Figure 2.0 Map of the State of Florid a identifying all 44 locations that are used in the statistical modeling. 47 Figure 2.1 Diagnostics plots for Plant City: Quantile and density plots. 51 Figure 2.2 Return Level versus Retu rn Periods for all 44 locations. 56 Figure 2.3A Return level profiles for all 18 locations in Central Climatic Zone. 58 Figure 2.3B Return level profiles for all 12 locations in North Climatic Zone. 58 Figure 2.3C Return level profiles for all 14 locations in South Climatic Zone. 59 Figure 2.3D 10Year return le vel versus location with 95% Confidence Interval for all forty four locations. 60 Figure 2.3E 20Year return level versus location with 95% Confidence Interval for all forty four locations. 61 Figure 2.3F 50Year return level versus location with 95% Confidence Interval for all forty four locations. 62 PAGE 11 Figure 2.3G 100Year return level versus location with 95% Confidence Interval for all forty four locations. 63 Figure 3.1 Probability Plot from the 2parameter Weibull fit for n = 50. 73 Figure 3.2 Probability Plot from the 2parameter Weibull fit for n = 100. 73 Figure 3.3 Probability plot from the Pareto fit for n = 50. 74 Figure 3.4 Probability plot from the Pareto fit for n = 100. 75 Figure 3.5A Fitted distribution of maximum drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.6A 90 Figure 3.5B Fitted distribution of maximum drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.6B 90 Figure 3.6A Fitted distribution of maximum drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.7A 91 Figure 3.6B Fitted distribution of maximum drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.7B 92 Figure 3.7A Prediction of C max based on the model parameters displayed in Table 3.6A 93 Figure 3.7B Prediction of C max based on the model parameters displayed in Table 3.6B 93 Figure 3.8A Prediction of C max based on the model parameters displayed in Table 3.7A 94 Figure 3.8B Prediction of C max based on the model parameters displayed in Table 3.7B 94 Figure 4.1 Diagnostic Plots for GEV 1000 n 108 viii PAGE 12 ix Figure 4.2 Prediction of C max versus Number of Subjects with 90% Confidence Interval 109 Figure 4.3 Prediction of C max versus Number of Subjects with 95% Confidence Interval 109 Figure 4.4 Prediction of C max versus Number of Subjects with 99% Confidence Interval 110 Figure 4.5 Prediction of C max versus Number of Subjects from the Maximum Likelihood Estimates of the GEV Model 114 Figure 5.2.1 Model for the disposition of Coumermycin A 1 120 Figure 5.4A Deterministic characterization of drug concentration in all compartments. 132 Figure 5.4B Deterministic characterization of coumermycin A 1 concentration (x 1 (t)). 133 Figure 5.4C Deterministic characterization of coumermycin A 1 concentration (x 2 (t)). 134 Figure 5.4D Deterministic characterization of coumermycin A 1 concentration (x 3 (t)). 135 Figure 5.5A Deterministic and stoc hastic characterizations of x 1 (t) as correlation changes from 0.25 to 0.75, and standard deviation, from 5% to 20%. 139 Figure 5.5B Deterministic and stoc hastic characterizations of x 2 (t) as correlation changes from 0.25 to 0.75, and standard deviation, from 5% to 20%. 140 Figure 5.5C Deterministic and stoc hastic characterizations of x 3 (t) as correlation changes from 0.25 to 0.75, and standard deviation, from 5% to 20%. 141 Figure 5.6A Graphs of the drug conc entration behaviors in the three compartments of the two si mulations when the initial conditions are c 1 = 1, c 2 = 0, and c 3 = 0. 145 Figure 5.6B Graphs of the drug conc entration behaviors in the three compartments of the two si mulations when the initial conditions are c 1 = 1, c 2 = 0.05, and c 3 = 0.05. 146 PAGE 13 x Figure 5.6C Graphs of the drug conc entration behaviors in the three compartments of the two si mulations when the initial conditions are c 1 = 1, c 2 = 0.10, and c 3 = 0.10. 147 Figure 6.1.1 Flowchart describing the configurations for all 27 models for each of the four sets of constant time delay values. 153 Figure 6.2.1 Model for the disposition of Coumermycin A 1 154 Figure 6.3.1 Deterministic characterizati ons of drug concentration in all 3 compartments. 157 Figure 6.3.2A Deterministic characterization of coumermycin A 1 concentration (x 1 (t)) for time delay values 0.5h, 1h, and 3.5h. 160 Figure 6.3.2B Deterministic characterization of coumermycin A 1 concentration (x 2 (t)) for time delay values 1h, 2h, and 6h. 161 Figure 6.3.2C Deterministic characterization of coumermycin A 1 concentration (x 3 (t)) for time delay values 1.5h, 3h, and 8h. 162 Figure 6.3.3A Stochastic and Deterministic characterizations of drug Concentration in all three compar tments with standard deviation for time delay values: t 1 = 3.5h, t 2 = 6h, t 3 = 3h. 164 Figure 6.3.3B Deterministic (x 1 (t)) and stochastic (E[x 1 (t)]) characterizations of coumermycin A 1 concentration for all four sets of time delay values in hours 165 Figure 6.3.3C Deterministic (x 2 (t)) and stochastic (E[x 2 (t)]) characterizations of coumermycin A 1 concentration for all four sets of time delay values in hours 166 Figure 6.3.3D Deterministic (x 3 (t)) and stochastic (E[x 3 (t)]) characterizations of coumermycin A 1 concentration for all four sets of time delay values in hours 167 PAGE 14 Statistical Models in Environmental and Life Sciences Lakshminarayan Rajaram ABSTRACT The dissertation focuses on developing stat istical models in environmental and life sciences. The Generalized Extreme Value distribu tion is used to model annual monthly maximum rainfall data from 44 locations in Florida. Time dependence of the rainfall data is incorporated into the m odel by assuming the location parameter to be a function of time, both linear and quadratic. Estimates and confidence intervals are obtained for return levels of return periods of 10, 20, 50, and 100 y ears. Locations are group ed into statistical profiles based on their similarities in return level graphs for all locations, and locations within each climatic zone. A family of extreme values distributions is applied to model simulated maximum drug concentration (C max ) data of an anticoagulant drug. For small samples (n 100) data exhibited bimodality. The results of investigating a mixture of two extreme value distributions to model such bimodal data using twoparameter Gumbel, Pareto and Weibull concluded that a mixture of two Weibull distributions is the only suitable model. xi For large samples, C 100 n max data are modeled using the Generalized Extreme Value, Gumbel, Weibull, and Pareto distri butions. These results concluded that the Generalized Extreme Value distribu tion is the only suitable model. PAGE 15 xii A system of random differential equati ons is used to investigate the drug concentration behavior in a threecompar tment pharmacokinetic model which describes coumermycins disposition. The rate constant s used in the differential equations are assumed to have a trivariate distribution, and hence, simulated from the trivariate truncated normal probability distribution. Numerical solutions are deve loped under different combinations of the covariance structure and the nonrandom initial conditions. We study the dependence effect that such a pharmacokinetic system has among the three compartments as well as the effect of variance in identifying the concentration behavi or in each compartment. We identify the time delays in each compartment. We extend these models to incorporate the identified time delays. We provide the graphical display of the time delay effects on the drug concentration behavior as well as the comparison of the deterministic behavior with and without the time delay, and effect of different sets of time delay on dete rministic and stochastic behaviors. PAGE 16 1 Chapter One Review of Literature and Statistical M odeling Using Extreme Value Distributions 1.0 Introduction and the Focus of Chapter One In real life, examples such as How tall should one design an embankment so that the sea reaches this level only once in 100 years? ; What is the lowest value the Dow Jones Industrial Average can reach in the next three years? ; How high can the drug concentration in bloodstream go before causing toxicity? require estimation. But since no data or only few have been observed as by definition extreme events are rare a branch of statistics that helps us to deal with such rare situations and that gives a scientific alternative to pure guesswork is the Extreme Value Theory (EVT). The explosion of the space shuttle Challe nger in 1983 was the consequence of an extreme event: the exceptionally low temperature (15 o F lower than the next coldest previous launch) the night before launchi ng ultimately led to fa ilure of the Orings (objects that are used to seal mechanical parts against fluid movement, air or liquid) which caused the disaster. Using standard EV Tanalysis, one could have predicted that one should not launch at such cold temperatur e, despite having no measurements at such low temperatures. If one seeks to estimate about everyday events, it might not matter if extreme data are cut off. But if one asks questions about events that do not happen very often, one should apply Extreme Value Theory; especially as these are the situations where one has the most to lose or win. For the layperson, events such as earthquakes, hurricanes, and PAGE 17 2 stock market crashes seem to follow no rule, bu t careful analysis has helped to discover distributions that acceptably model these extreme events [19]. The most important feature of an extr eme value analysis is the objective to quantify the stochastic behavior of the maximum and the minimum of i.i.d. random variables. The distributional properties of extremes (maximum and minimum), extreme and intermediate order statistics, and exceed ances over high (or below low) thresholds are determined by the upper and lower tails of the underlying dist ribution. The extreme value analysis requires the estimation of the probability of events that are more extreme than any that have already been observed [23]. Extreme value theory is a unique discipline that develops stat istical techniques for describing the unusual phenomena such as rainfall, floods, wind gusts, air pollution, ea rthquakes, risk management, insurance, and financial matters. Focus of Chapter One Section 1.1 contains an extensive literature review that describes the evolution of the extreme value theory (EVT) to its present status. Section 1.2 focuses on the probability framework for the EVT. Sections 1.3 through 1.7 present a survey of statistical modeling using Generalized Extreme Value (GEV), Gumble, Frechet, Weibull, and Generalized Pareto (GP) distributions. For each of these distributions, a thorough discussion of the probability density function, maximum likeli hood estimation, quantile expression, and model diagnostics is given along with a numerical example for a few of them. Section 1.8 provides a brief overview of th e aims and objectives of the present study with emphasis being placed on each research problem. PAGE 18 3 1.1 Literature Review of Extreme Value Theory Historically, work on extreme value problem s can be traced back to as early as 1709 when Nicholas Bernoulli discussed the mean largest distance fr om the origin given n points lying at random on a straight line of a fixed length t [45]. Probably the first paper that described an application of extreme values in flood flows was by Fuller [35] in 1914, and Griffith [44], in 1920, brought out an a pplication while discu ssing the phenomena of rupture and flow in solids. A paper writte n in 1922 by von Bortkiew icz [12] may have contributed to a systematic development of ex treme value theory. His paper dealt with the distribution of range in random samples from a normal distribution, and the concept of distribution of largest value was introduced for the firs t time. In 1923, von Mises [69] evaluated the expected value of this distri bution, and Dodd [27] calcu lated its median and discussed some nonnormal parent distributions A paper that had more direct relevance to the extreme value theory was written in 1927 by Frechet [33] in which he discussed the asymptotic distributions of largest values. In 1928, Fisher and Tippett [31] published the results of their research into the same pr oblem. In addition, they showed that extreme limit distributions can only be one of three types. In 1936, von Mises [70] presented sufficient conditions for the weak convergence of th e largest order statistic to each of the three types of limit distributions given by Fi sher and Tippett. In 1943, Gnedenko [43], in this breakthrough paper presented a solid foundation for the extreme value theory and provided necessary and sufficient conditions for the weak convergence of the extreme order statistics. Gnedenkos work was refi ned later on by many others that include Mejzler [68] in 1949 and de Haan [25] in 1970. PAGE 19 4 Following the theoretical developments of the extreme value theory during 1920s and mid 1930s, many scholarly papers dealing w ith the variety of practical applications of the theory were published in late 1930s and 1940s. E.J. Gumbel played a pioneering role during 1940s and 1950s, and from the a pplication point of view, he made many significant contributions to th e extreme value theory. He presented all of these in his statistics of extremes [45, 46] in 1958 and this work wa s pivotal in promoting extreme value theory as a tool for modeling the extrem al behavior of observed physical processes. The Generalized Extreme Value (GEV), Gumbel, Frechet, Weibull, and the Generalized Pareto (GP) distributions are just the tip of the iceberg of an entirely new and quickly growing branch of statistics. The Gumbel di stribution has light or medium tails, Frechet distribution has heavy tails, and Weibull di stribution has bounded or short tails. The Pareto distribution is used to model how income is distributed, and to estimate finite limit of human lifespan [1]. Since the publication of Gnedenkos limit theorem [42] for maxima in 1941, and Gumbels statistics of extremes extreme value theory has found applications in engineering, environmental modeling, and fi nance [84]. Some recent applications of Gumbel distribution have been for fire protection and insuran ce problems [76], the prediction of earthquake magnit udes, modeling of extremely high temperatures [16], and the prediction of high return levels of wind speeds relevant for the design of civil engineering structures [71]. Recent applica tions of Frechet include estimation of the probabilities of extreme occu rrences in Germanys stock index and prediction of the behavior of solar proton peak fluxes [15, 100]. Weibull applic ations include modeling of failure strengths of loadsharing and window glasses [49, 8], analysis of corrosion PAGE 20 5 failures of leadsheathed cables at the Kennedy Space center [62], and estimating the occurrence probability of gian t freak waves in the sea area around Japan [101]. Internet traffic, structural reliability and biotech analyses are few of the other prime targets for EVT applications, as their data distributions display heavy tails, too. Another more recent application of EVT is in the finance indus try. A quantity known as ValueatRisk (VaR) has become the standard risk measurement to protect portfolio hol ders against adverse market conditions and prevent them from taki ng extraordinary risks [5, 6]. The VaR is defined as the quantile of the ProfitandLo ss (P&L) distribution of value V t at time t over the holding period, or horizon, h Existing standard methods to calculate VaR assume normality of the data which is inappropriate since the unconditional distribution of financial time series is known to be hea vytailed. This gave birth to the use of EVT methods to model the tail and to estimat e VaR more reliably. In 1975, Canfield [1 7], and Canfield and Borgman [18] have discussed the usefulness of this distribution to model timetofailure data in reliability studies. In 1986, Rossi et al [78, 79, 10] proposed a twocomponent extreme value distribution for flood frequency analysis. In 1987, Achcar et al [3] have discussed the advantage of tr ansforming a survival data to Gumbel distribution form before analyzing it. PAGE 21 1.2 Probability Framework for Extreme Value Theory Extreme Value Theory (EVT) is the study of probabilistic extremes and focuses primarily on the asymptotic behavior as sample size approaches infinity [23]. Let be a sequence of independent random variables having a common distribution, F The model focuses on the statistical behavior of nXXX ,,.........,21 },......,,max{21 n nXXX M where usually represent values of a proce ss measured on a regular time scale, for example, hourly measurements of stock prices or plasma drug concentrations over a certain period, so that represents the maximum of the process over n time units of observation. If n is the number of observations in a day, then corresponds to the daily maximum. s iX nM nM In theory, the distribution of can be derived exactly for all values of n : nM )(}Pr{.........}Pr{ }...,,......... Pr{}Pr{1 1zFzX zX zXzXzMn n n n (1.1) The difficulty that arises in practice is the fact that the distribution function F is unknown. One possibility is to use standard st atistical techniques to estimate F from observed data, and then substitute this estimate into (1.1). But very small discrepancies in the estimate of F can lead to substantial discrepancies for n F This leads to an approach based on asymptotic argument which re quires determining what possible limit distributions are possible for as nM n The question then is what are the possible limit distributions in the extremal case? 6 PAGE 22 The same problem arises in classical statistics when we insist that with probability 1, nX converges to the population mean. In Central Limit Theorem (CLT), this problem is resolved by allowing a linear scaling, so that )1,0( N Xn nn where n and nn are linear rescalings which prevent the degenerate limits. The same approach is adopted in obta ining the lim its of the distribution of looking instead for limiting distributions of nM n nna bM where are sequences of normalizing coefficients such that n nba and n nn na bM F leads to a nondegenerate distribution as Specifically, we seek n n nb a and 0 such that zG a bM Fn nn n where does not depend on n )( zGExtremal Types Theorem : If there exist sequences of constants n nb a and 0 such that, as n )( Pr zGz a bMn nn where G is a nongenerate distribution function, then, G belongs to one of the following families [23]: I. ; expexp)( z a bz zG II. ; exp b; z 0, )( bz a bz zG 7 PAGE 23 III. ; ,1 , exp )( bz bz a bz zG for parameters and, in the case of II and III, (location) ,0(scale) b a .0(shape) In words, the theorem states th at the rescaled sample maxima n nna bM converge in distribution to a variable ha ving a distribution within one of these families labeled I, II, and III. Collectively, the three classes of distri bution are referred to as the extreme value distributions. The remarkable feature of this re sult is that the three types of extreme value distributions are the only possibl e limits for the distributions of n nna bM regardless of the distribution F for the population. It is in th is sense that the above theorem provides an extreme value analog of the central limit theorem. The three types of limits that arise in Ex trem al Types Theorm have distinct forms of behavior, corresponding to the diffe rent forms of tail behavior for F of These distinct forms can be made precise by l ooking at the behavior of G at its upperend point For the Weibull distribution is finite, while for both the Frechet and Gumbel distributions iX.zz zThe three families of distributions Gumbel, Frechet, and Weibull can be combined into a single family of models having distribution f unctions of the form 1exp)(1 z zG (1.2) 8 PAGE 24 9 defined on ,0/1: zz where the parameters satisfy < < > 0 and < < [23]. The equation (1.2) above is the generalized extreme value family of distributions. This was obtained independently by Von Mise s [70] and Jenkinson [54]. Equation (1.2) can be written as: 0;for z expexp 0;for ;0for z1exp )(1 z z z zG (1.3) The parameters of th is distribution are (shape), (location), and (scale). The distribution in (1.3) is also referred to as the von Mises type extrem e value distribution or the von MisesJenkinson type distribution. Shape Parameter The importance of the shape parameter is apparent from the above equations. If is negative the quantities are bounded above, and any extrapolation will lead to a finite limit. On the other hand if it is 0 or positive then the limiting quantity is unbounded, and extrapolation leads to an infi nite limit. The shape parameter determines the shape, and depending on its value, the function can cha nge drastically. From the sensitivity to the shape parameter, it is obvious extreme valu es will change drastically depending on the value of the shape parameter. PAGE 25 The Type I (Gumbel) class of extreme valu e distribution is obtained in the limit as .0 This is the case of an exponentially decreasing tail. The Type II (Frechet) class of extr eme value distribution corresponds to case 0 When 0 the GEV distribution has a fi nite lower end point, given by / This is the case of a polynomially decreasing tail function and therefore, corresponds to a longtailed parent distribution. The Type III (Weibull) class of extr eme value distribution corresponds to case 0 When 0 the GEV has a finite upper end point, also given by / The case 0 is that of a finite upper endpoint, and is therefore shorttailed. 10 PAGE 26 1.3 Generalized Extreme Value Distribution 1.3.1 Probability Density Fu nction and Maximum Likelihood The probability density [60] function co rresponding to the equation (1.3) is 0 for z1 e exp 0;for 0;for zz 1 1 1exp )(1 1 1 z ze z z zg (1.4) The standard forms of the probability density function and cum ulative distribution function for the GEV distributions are obtai ned from the equations (1.4) and (1.3), respectively, by taking1 and 0 Maximum Likelihood Estimation Suppose we have n observations Z1, Z2, Z3, .. Zn for which the GEV distribution (1.3) is appropr iate [23]. The loglikelihood for GEV parameters when 0 is 1 1log 1 1log),,(1 1 1 n i i n i iz z n l (1.5) provided that ,.....,2,1for ,0 1 ni zi (1.6) At parameter combinations for which (1.6) is violated, corresponding to a configuration for which at leas t one of the observed data fa lls beyond an endpoint of the distribution, the likelihood is zero and the loglikelihood is 11 PAGE 27 The case = 0 requires separate treatment using the Gumbel limit of the GEV distribution. This lead s to the loglikelihood exp log),(1 1 n i i n i iz z nl (1.7) Maximization of the pair of equations (1.5) and (1 .7) with respect to the parameter vector ),,( leads to the maximum likelihood estimate with respect to the entire GEV family. There is no analytical solution, but for a given dataset the maximization is straightforward using standa rd numerical optimization algorithms [23]. The maximum likelihood estimators are the values of the unknown parameters that maximize the loglikelihood. In practi ce these are local maxi ma found by nonlinear optimization. The regularity conditions that are required for the usual asymptotic properties associated with the maximum likelihood estimator are not satisfied by the GEV model because the endpoints of the GEV distribution are functions of the parameters; is an upper endpoint of the distribution when ,0 and a lower endpoint when .0 The violation of the regularity conditions means that the standard asymptotic likelihood results are not automati cally applicable. Smith [80] studied this problem and obtained th e following results: When 1 maximum likelihood estimators ar e unlikely to be obtainable. When 2 1 1 maximum likelihood estimators are generally obtainable, but do not have the standard asymptotic properties. When 2 1 the second and higher moments do not exist. 12 PAGE 28 The standard asymptotic results of c onsistency, asymptotic efficiency and asymptotic normality hold for these distributions when 2 1 The elements of the matrix of secondorde r partial derivatives, evaluated at the maximum likelihood estimators, are known as the observed information matrix and the inverse of this matrix is the variancecovariance matrix of the maximum likelihood estimators. The square roots of the diagonal entr ies of this inverse ma trix are estimates of the standard deviations of the three parameter estimates, widely known as standard errors of those estimates. All these results are asymptotic approximations valid for large sample sizes, but in practice they are widely used even when the sample sizes are fairly small. 1.3.2 Quantile The interest is in the quantile for which pz pzGp 1)( where G denotes the cumulative distribution function (cdf ). Estimates of extreme quantiles are obtained by inverting 1exp)(1 z zG as follows: )1log(1 p zp where pzGp 1)( 0for p)log(1log0for )1log(1 p zp (1.8) pz is the return level associated with the return period (1/p), since to a reasonable degree of accuracy, the level is expected to be exceeded on average once every (1/p) years, pz 13 PAGE 29 and more precisely, is exceeded by the maximum in any particular year with probability pzp1.3.3 Model Diagnostics For checking the validity of a GEV mode l, graphical goodnessoffit checks such as probability, quantile, return level a nd density plots are discussed next [ 23]. Probability Plot: The probability plot is a comparison of the empirical and fitted distribution functions. With ordered data)( )2()1(......nz zz the empirical distribution function evaluated at is given by )(iz 1/)( ~ )( nizGi By substituting the GEV parameter estimates into equation (1.4), the corresponding model based estimates are 1 )( )( 1exp)( i iz zG. If the GEV model fits the data well, )( ~ )( )( )( i izGzG for each i so a probability plot, consisting of the points nizGzGi i,.......2,1,)( ),( ~)()( should lie close to the unit diagonal. The drawback of the probability plot for extreme value models is that )( ~ and )( )( )( i izGzG are bound to approach 1 as increases. Since the accuracy of the model for large values of z is of interest in extreme value problems, the probability plot seems to provide inadequate informati on in the region of most interest. )( iz 14 PAGE 30 Quantile Plot : The deficiency of the probability plot can be eliminated by the quantile plot, consis ting of the points niz n i Gi,...2,1,, 1 )( 1 where 1 log1 1 1n i m i G If the GEV model is an adequate fit, then the points on the quantile plot should lie close to a unit diagonal. Return Level Plot: Since quantiles enable probability models to be expressed on the scale of data, the relationship of the GEV model to its parameters is most easily interpreted in terms of the quantil e expression (1.8). Suppose we define)1log( p yp Then, 0for )log(y0,for 1p p py z Suppose is plotted against on a logarithm ic scale (or equivalently, if is plotted against ). The plot is linear forpzpypzpy log0 If ,0 the plot is convex with asymptotic limit as at 0 p / If ,0 the plot is concave and has no finite bound [23]. This graph is a return level plot. The return level plots are convenient for both model presentation and validation. The tail of the distribution is compressed so that return level estimates for long return pe riods are displayed, while the linearity of the plot in the case0 provides a baseline against which to judge the effect of the estimated shape parameter. Confidence interval and empirical estimates of the return level function enable the return level plot to be used as a model di agnostic. If the GEV model is an adequate fit 15 PAGE 31 for the data, the modelbased curve and em pirical estimates should be in reasonable agreement [23]. Interpretation of a Tyear Return Level: If P(z) is the probability of a level z being exceeded in a single year, then th at level is often said to have a return period, which is in the inverse of P(z) years. For example, a sea level ha ving a probability of being exceeded in a year of 0.01 is said to have a return period of 100 years. A sea level that has a probability of being exceeded on ce in hundred years is called the 100year return level Density Plot: The probability, quantile and retu rn level plots are based on a comparison of modelbased and empirical estim ates of the distribution function. Another diagnostic is based on the comparison of the pr obability density functi on of a fitted model with the histogram of the data. This compar ison is less informative than the previous plots since the form of a histogram depends on the choice of groupi ng interval. There is no unique estimator of a density function, and hence, the comparison is more difficult and subjective. 1.3.4 GEV Modeling of Nonstationary Processes A process is referred to as a nonstationa ry process if its ch aracteristics change systematically with respect to change in ti me [23]. For example, the basic level of the annual maximum rainfall may change linearl y over an observation period. If the distribution to be used is GEV, then it follows that a suitable model for X t the annual maximum rainfall level in year t can be written as X t GEV ((t), ), where (t) = 0 + 1 t for parameters 0 and 1 This models the variations through time in the observed process as a linear trend in th e location parameter of the gene ralized extreme value model. The parameter 1 represents the annual rate of ch ange in annual maximum rainfall. 16 PAGE 32 More complex changes in the location parameter may be modeled as a quadratic model given by 2 210)( ttt Nonstationarity can also be expres sed in terms of other extreme value parameters such as where the exponential function is used to ensure that the positivity of tet10)( is respected for all values of t. Modeling the extreme value model shape parameters as a smooth function of time is rather unrealistic since the shape parameters are difficult to estimate with precision [23]. An existence of a situation in which the extremal behavior of one series being related to that of another vari able, referred to as a covariate, is a possibility. For example, in the statistical modeling of maximum drug concentrations in a pharmacokinetic study using extreme value theory, the variable age, can be one of the covariates. 17 PAGE 33 1.4 Gumbel Distribution The Type I extreme value distribution has two form s. One is based on the smallest extreme and the other is based on the larges t extreme. These are referred to as the minimum and maximum cases, respectively. The Gumbel di stribution is also referred to as a Type I extreme value distributi on or logWeibull distribution. 1.4.1 Probability Density Fu nction and Maxim um Likelihood Probability Density Function Maximum Case : The pro bability density function of the Gumbel distribution is z z zf exp exp 1 )( where is the location parameter and is the scale parameter. Minimum Case: The probability density function of the Gumbel distribution is z z zf exp exp 1 ) ( w h e r e is the location parameter and is the scale parameter. Cumulative Distribution Function Maximum Case: The cumulative distribution functi on of the Gumbel distribution is z z zF expexp)( 18 PAGE 34 Minimum Case : The cumulative distribution function of the Gumbel distribution is z z zF expexp1)( Maximum Likelihood Estimation From equation (1.5), the loglikelihood for the GEV parameters when 0 is 1 1log 1 1log),,(1 1 1 n i i n i iz z n l (1.9) provided that ,.....,2,1for ,0 1 ni zi When 0 (1.9) reduces to the loglikelihood func tion (23) for Gumbel distribution as exp log),(1 1 n i i n i iz z nl (1.10) In order to find the MLEs of and we need to maximize (1.10). We first set .0 and 0 ll After some mathematical manipulations, we get two equations with two unknowns, n i iz n1 exp 1 ln (1.11) 0 exp exp 1 1 1 1 n i i n i i i n i iz z z z n (1.12) These equations can be solved numerically to obtain and 19 PAGE 35 The corresponding 95% confidence intervals for and given by n n n n2 2 2 2 60793.0 96.1 60793.0 96.1 and 10867.1 96.1 10867.1 96.1 1.4.2 Quantile The interest is in the quantile, for whichpz pzFp 1)( where F denotes the cumulative distribution function given by z zF expexp) ( (1.13) Inverting for the above f orm ofpzFp 1)( F we get )1log(log p zp and hence, the MLE of pzFp 1)( is given by p zp 1log(log (1.14) 1.4.3 Model Diagnostics QuantileQuantile (QQ) : The QQ plot is a grap hical tool to assess the fit of the model to the data. For the Gumbel distribution, the QQ plot is where are obtained by arranging the data in the ascending order so that i izy versuss iz)( nz zz .....21 are known as the order statistics or the observed quantiles, and known as the expected quantiles, are computed usings iy .,....3,2,1 1 lnln ni i n yi If the points lie close to the 45 degree stra ight line it is an indication of an adequate fit of the Gumbel distribution to the data [23]. 20 PAGE 36 Gumbel Probability Plot : The Gumbel probability plot is the plot of where are the observed data points that are sorted from the smallest to the largest to form the order statistics,i izy versuss iz)()( )2()1(...........nZ ZZ The values of are obtained from s iy 11n i Fyi where n is the number of observed data points, and 1 F is the inverse Gumbel cumulative distribution function given by z zF 1 lnln1. Note that and are not needed to check the lin earity of set of points 1 ,1 )(n i Fzi, that is, 1 1 lnln ,)(n i zi= i n zi1 lnln ,)( for i = 1, 2, 3, n. The probability plot for the Gumbel distributi on w ill be linear if and only if the plot of i n zi1 lnln ,)(is linear for i = 1, 2, 3, n If the plot produces points that fall close to a straight line, then the Gumbel distribution is a reasonable m odel. 1.4.4 Statistical Modeling of Annual Maximum Discharge Following are the data that consist of n = 59 annual maximum discharges of Feather River in ft3/sec from 1902 to 1960 [22, 26]. 230000, 203000, 185000, 185000, 152000, 140000, 135000, 128000, 122000, 118000, 113000, 110000, 108000, 102000, 102000, 94000, 92100, 85400, 84200, 83100, 81400, 81000, 80400, 80100, 75400, 65900, 64300, 62300, 60100, 59200, 58600, 55700, 54800, 21 PAGE 37 54400, 46400, 45600, 42400, 42400, 42000, 36700, 36400, 34500, 31000, 28200, 24900, 23400, 22600, 22400, 20300, 19200, 16800, 16800, 16400, 16300, 14000, 13000, 11600, 8860, 8080 The probability plot in Figure 1.4.1 seem s linear, implying the adequacy of the Gumbel model to fit to the data. The ma ximum likelihood parameters were computed as 4.47309 and 1.37309 Plot of (ln(ln((n + 1)/i))) vs A nnual Maximum Flood Levels2 1 0 1 2 3 4 5 0 50000100000150000200000250000 Annual Maximum Flood levels(ln(ln((n + 1)/i))) Figure 1.4.1 Probability plot for the annual maximum discharges of Feather River. The probability that the maximum flood le vel observed during a given year does not exceed a certain level L is given by LZPj In the data set the largest value observed is 230,000. Therefore, the probability of exceeding this value of 230000 in any given year can be calculated as follows [22]: 08825.37309 41816.47309 230000 expexp1)230000 (1)230000 (j jZF ZP That is, .00744.0)230000 ( jZP 22 PAGE 38 Thus, the return period is 134 00744.0 1 years. To compute Tyear level u(T) (where u( T) = T year flood level) where T = 100 year ) T 1 (1)( 1 1))(( ))((1 1 ))((1 1 FTu T TuFTuF T TuZP 937,218 100 99 1 lnln*08825.37309 41816.47309 100 99 100 1 1)100(1 1 F Fu Therefore, the 100year return level is 218937. The graph in Figure 1.4.2 displays the predic ted return levels as a function of return period. Anticpated Tyear level as a function of u(T) using estimates derived via MLE0 25000 50000 75000 100000 125000 150000 175000 200000 225000 250000 0102030405060708090100110120130140150Tyearu(T) Threshold Figure 1.4.2 Graph of u(T) Threshold versus Tyear for the annual m aximum discharges of Feather River data set. 23 PAGE 39 1.5 Frechet Distribution The Frechet is the second extreme value distribution, also kn own as the Type II extrem e value distribution. It has wide ranging applications in engineering, environmental modeling, finance and other ar eas. Recent applications include prediction of solar proton peak fluxes and modeling interf acial damage in micr oelectronic packages and material properties of constituent particles in an aluminum alloy [96]. 1.5.1 Probability Density Fu nction and Maxim um Likelihood When0 the cumulative distribution function of the GEV distribution reduces to reduces to the cumulative distribution function for Frechet distribution given by exp)( z zF The corresponding probability de nsity function is given by exp )(1 z z zf (1.15) where and are the shape, location and sc ale parameters, respectively. Maximum Likelihood Estimation For a random sample the likelihood function is given by nzzz ,,.........,21 n i i n i i nnt t l1 )1( 1exp ),,( The maximum likelihood estimates are obtained by setting [96, 65] and , ly.respective and , at 0 log and ,0 log ,0 log L L L After some mathematical simplification, we get the equations 24 PAGE 40 n i i n i i n i i it t t tn n1 1 1 ) log( ) ( ) log() ( (1.16) 1 )1 ( ) ( ) ( 1 1 1 )1 ( n i i n i i n i it t tn (1.17) and 1 1 1 n i it n (1.18) These equations can be solved numerically to obtain the estimates and , 1.5.2 Quantile The interest is in the quantile for whi chpz ,1 pzFp where F denotes the cumulative distribution function [96]. Inverting pzFp 1)( for )( zF p pz p z p )1log( exp 1 )1log(1 p zp (1.19) By the invariance property of th e m ethod of maximum likelihood, the mle of (1.19) is given by )1log( 1 p zp (1.20) For large we can assum e that equation (1.20) is normally distributed with the mean given by equation (1.19) and the variance given by n 25 PAGE 41 ) cov()1log(log)1log( 2 ) cov()1log(log)1log( 2 ) cov()1log(2 ) var()1log(log)1log( ) var()1log() var( var2 2 1 2 1 2 2 4 2 2 p p p p p p p p zp The )%1(100 confidence interval for (1.19) is given by ) ( 2 p pzVarzz (1.21) where 2z is the )% 2 1(100 percentile of the standard normal distribution [96]. 1.5.3 Model Diagnostics Model diagnostics for Frechet distributi on is examined through the PP Plot and QQ Plot in the same way as it was done in the case of Gu mbel distribution. 26 PAGE 42 1.6 Weibull Distribution The Weibull is the third extreme value distribution, also known as the Type III extrem e value distribution. It has wide ranging applications in engineering, environmental modeling, finance and other ar eas. Recent applications include evaluating the magnitude of future earthquakes in the Pa cific, Argentina, Japan and in the Indian subcontinent [77]. 1.6.1 Probability Density Fu nction and M aximum Likelihood When0 the cumulative distribution functi on of the GEV distribution reduces to the cumulative distribution functi on for Weibull distribution given by exp1)( z zF The probability density function for a 3parameter Weibull model (61) is given by exp1 z z zf (1.22) where and are the shape, location and sc ale parameters, respectively. The pdf for a 2param eter Weibull is obtained by setting = 0 in equation (1.22). The pdf for a 1parameter Weibull is obtained by setting = 0 in equation (1.22) and assuming constant. C In the formulation of oneparameter Weibull we assume that the shape parameter is known a priori, from past e xperience. The advantage of doing this is that data sets with fewer observations can be analyzed. 27 PAGE 43 Maximum Likelihood Estimation For a random sample the likelihood function is given by nzzz ,,.........,21 n i i n i i nnt t l1 1 1exp ),,( The maximum likelihood estimates are obtained by setting and , ly.respective and , at 0 log and ,0 log ,0 log L L L After some mathematical simplification, we get the equations n i n i i n i i i it t tn t n1 1 1 ) ( ) log() ( ) log( (1.23) 1 )1 ( ) ( ) ( 1 1 1 1 n i i n i i n i it t tn (1.24) and 1 1 1 n i it n (1.25) These equations can be solved numerically to obtain the estimates and , 1.6.2 Quantile The interest is in the quantile for whi chpz ,1pzFp where F denotes the cumulative distributi on function given by z zF exp1) ( where and are as described above. Inverting for the above form of F, the expression for quantile is given by, pzFp 1)( 28 PAGE 44 log1p zp (1.26) By the invariance property of the method of maximum likelihood, the mle of (1.26) is given by )log( 1p zp (1.27) For large n, we can assume that equation (1.27) is normally distributed with the mean given by equation (1.26) and the variance given by ) cov(logloglog 2 ) cov(logloglog 2 ) cov( log2 ) var(loglog log ) var()log) var( var2 2 1 2 1 2 2 4 2 2 p p p p p p p p zp The )%1(100 confidence interval for (1.26) is given by ) ( 2 p pzVarzz (1.28) where 2z is the )% 2 1(100 percentile of the standa rd normal distribution. 1.6.3 Model Diagnostics Model diagnostics for Weibull distributi on is examined through the PP Plot and QQ Plot in the same way as it was done in the case of Gu mbel distribution. 29 PAGE 45 30 1.6.4 Statistical Modeling of Annual Maximum Storm Surge Storm surge return periods are useful to land use planners and emergency managers for assessing the likelihood of extr eme water depths associated with tropical cyclones [45, 46]. At a given location, it is desirable to determine sound statistical estimates of return periods (for 20, 25, 50, 100year epochs) and corresponding assessments of uncertainty due to both limitations in the historical record and effects of parameter estimation. Using the historical storm set consisting of 951 North Atlantic tropical cyclones for the pe riod 18861996 for one particular location a total of 111 annual maximum storm surges we re obtained. This example is discussed in detail in Caribbean Storm Surge Return Periods: Final Report by Mark E. Johnson, December 1997 [55]. These 111 annual maximum storm surges are the data for fitting the model. The parameter estimates for 2parameter and 3parameter Weibull fits are obtained using the Statistical An alysis Software (SAS) [91], ve rsion 9.1. In the case of 2parameter Weibull, the location parameter is assumed to be zero whereas in the case of 3parameter Weibull, it can be either set equal to a value less than the smallest data value or estimated. The graphs of the return levels for 3parameter and 2parameterWeibull distribution fits are displayed in Figure 1.6.1. It can be noted that they are almost similar in the prediction of the return levels. The maximum likelihood estimates of the shape and scale parameters for the two Weibull di stributions are given in Table 1.6.1 PAGE 46 Table 1.6.1 Estimates of shape and scale para meters with their standard errors from 3parameter and 2parameter Weibull fits on the dataset containing 111 annual maximum storm surges. MODEL (SE) (SE) 3Parameter Weibull 0.247296 (0.0412) 0.601609 (0.0449) 2Parameter Weibull 0.291671 (0.0421) 0.696758 (0.0506) Estimates of storm surges based on MLE fit of 3parameter Weibull distribution (n = 111)0.5 0.8 1.0 1.3 1.5 1.8 2.0 2.3 2.5 2.8 3.0 3.3 3.5 3.8 4.0 0102030405060708090100110120130140150 Return PeriodReturn Levels Estimates of storm surges based on MLE fit of 2parameter Weibull distribution (n = 111)0.5 0.8 1.0 1.3 1.5 1.8 2.0 2.3 2.5 2.8 3.0 3.3 3.5 3.8 4.0 0102030405060708090100110120130140150 Return PeriodReturn Levels Figure 1.6.1 Estimates of storm surges based on MLE of 2and 3parameter Weibull 31 PAGE 47 1.7 Generalized Pareto Distribution Introduction The principal drawback to the classica l GEV/Gumbel method is that only one value is selected per epoch. This reduces the data available for analysis. To increase the number of cases for analysis, an alternativ e approach is to obtain PeakOverThreshold (POT) maxima, extracted from sample data se ries to produce a series of extreme values above a chosen (high) threshold, and use them with the Generalized Pareto Distribution. This is a common approach to extreme valu e statistics based on the exceedances over high thresholds [23, 81, 82, 83]. Probability Framework Let X 1 X 2 be a sequence of iid random variables, having marginal distribution function F. Let u be a certain high threshold value. Any X i that exceeds u is considered an extreme event. Denoting an arbitrary term in the X i sequence by X, it follows that a description of the stochastic behavior of extreme events is given by the conditional probability: .0 )(1 )(1  y uF yuF uXyuXP Let M n = Max{X 1 X 2 . . .,X n }. Denote an arbitrary term in the X i sequence by X, and suppose that F satisfies Extremal Type Theorem so that for large n, Pr{M n z} G(z) where 11exp)( z zG for some > 0 and Then, for large enough u, the distribution function of (X u), conditional on X > u, is approximately: 32 PAGE 48 1~ 11)( y yH (1.29) defined on 0 ~ 1 and 0: y yy where )( ~ u (1.30) 1.7.1 Probability Density Fu nction and Maximum Likelihood Probability Density Function The probability density function, corresponding to the cumulative distribution function given in equation (1.29), is 1 11 1 )( y yh (1.31) where and are the scale and shape parameters, respectively. The shape parameter is dominant in determining the qualitative behavior of the GPD, just as it is for the GEV [23]: when < 0 the distribution of excesses has an upper bound of / u when > 0 the distribution has no upper limit. when = 0, the distribution is also un bounded, which should be interpreted by taking the limit of 1~ 11)( y yH as 0 leading to 0,1)(yeyHy which corresponds to an exponential distribution with parameter /1 Maximum Likelihood Estimation Suppose that the values are the k excesses of a threshold u For kyyy .,,.........,21 0 the loglikelihood is derived as 33 PAGE 49 k i iy kl1,1log 1 1log),( (1.32) provided 01 iy for ;,.......,3,2,1 k i otherwise, ),( l In the case,0 the loglikelihood is obtained from 1)(yeyH as k i iy kl11 log)( Analytical maximization of the loglikelihood is not possible, so numerical techniques are again required, taking care to av o id numerical instabilities when0 in (1.32). 1.7.2 Quantile It is more convenient to interpret ex treme value models in terms of quantiles (or return levels ). Suppose that a generalized Pare to distribution with parameters and is a suitable model for exceedances of a threshold u by a variable X [23]. That is, for X > u 11Pr ux uXxX It follows that 11Pr Pr ux uXxX The level that is ex ceeded on average once every m observations is the solution of mx m ux uXm1 1Pr1 Solving for and using mx }, Pr{ uXu we get 0for )log( 0for 1 u u mmu mux (1.33) 34 PAGE 50 provided m is sufficiently large to ensure that x m > u x m is the m observation return level. Plotting x m against m on a logarithmic scale produces the same qualitative features as retu rn level plot based on GEV model: linearity if = 0; convexity if < 0; concavity if > 0. It is more convenient to give quantiles or return levels on an annual scale, so that Nyear return level is the level expected to be exceeded once every N years. If there are n y observations per year, this corresponds to the m observation return level with m = N*n y Hence, the Nyear return level is given by 0for )*log( 0for 1* uy uy NnNu nNuz (1.34) In order to find we use the maximum likelihood estimates of and and an estimate of Nz u The probability of an individual observation exceeding the threshold u is u This has a natural estimator of / nku the sample proportion of observations exceeding u Since the number of exceedances of u follows the binomial ),(unB distribution, u is also the maximum likelihood estimator of u The variance of can be derived by the delta met hod, but the uncertainty in the estimate of mx u should also be included in the calc ulation. From the properties of the binomial distribution, n Varuuu/) 1( Then, the complete variancecovariance matrix for , u is approximately where denotes the (i, j) term of the variancecovariance 2,2 2,1 1,2 1,1 0 0 0 0 ) var( vv vv Vu jiv, 35 PAGE 51 matrix of and By the delta method, where m T m mxVxxVar ) ( mm u m T mxxx x,, evaluated at , u 1.7.3 Model Diagnostics and Methods of Selecting a Threshold Value Model Diagnostics If the generalized Pareto model is reasonable for modeling excesses of u then both probability plot and quantile plot are approximately linear. The return level plot consists of the locus of points mxm for large values of m where 1 u mmux The density function of the fitted generalized Pareto model can be compared to a histogram of exceedances. Methods of Selecting a Threshold Value Selecting an appropriate th reshold value is very critical. Too low a threshold value is likely to violate the asymptotic basi s of the model, leading to bias. Too high a threshold value might result in obtaining t oo few excesses, leading to high variance. The standard practice is to adopt as low a thresh old as possible, subject to the limit model providing a reasonable approximation [23]. First Method: This method of selecting a thr eshold value is an exploratory technique carried out prior to model estimati on. The method is based on the mean of the generalized Pareto distribution. This requires plotting the points max 1 )( : 1 xuux n uun i i u where consist of the observations )( )2()1(,.......,,unxxx un 36 PAGE 52 that exceed u, and is the largest of the This plot is called mean residual life plot or the mean excess function [90]. This plot should be approximately linear in u above a threshold value at which the generalized Pareto distribution pr ovides a valid approximation to the excess distribution. Confiden ce limits can be added to the plot based on the approximate normality of sample means. maxx iX 0u Second Method: This method is an assessment of the stability of the parameter estimates based on the fitting of models across a range of different threshold values. Above a level at which the asymptotic motivation for the generalized Pareto distribution is valid, estimates of the shape parameter should be approximately constant, whereas the estimates of 0u u should be linear in u. 1.7.4 Statistical Modeling of Daily Precipitation Data The data set used has n = 36524 observa tions during the time period 1900 1999 for a single location in Fort Collins, Colorado, USA [59]. Exceedance rate (per year) = 10.24. The va lue of negative log likelihood is 109.91. The model diagnostic plots are displayed in Figure 1.7.1. 37 PAGE 53 Table 1.7.1 Estimates of scale and shape para meters with standard errors from the generalized Pareto model on the Fort Collins precipitation data. PARAMETER ESTIMATE STANDARD ERROR (scale) 0.33972 0.0164 (shape) 0.18684 0.0374 The Table 1.7.2 provides the estimates of shape parameter and 100year return level with 95% profile likelihood confidence intervals. Figures 1.7.2 and 1.7.3 display the same results graphically. Table 1.7.2 95% Confidence in terval estimation using profile likelihood for the GPD shape parameter and 100year return level. PARAMETER ESTIMATE 95% CONFIDENCE INTERVAL Shape Parameter 0.1868 2650.0 ,1181.0 100year Return level 5.2207 82.6 ,24.4 38 PAGE 54 0.00.40.8 0.00.6 Probability PlotEmpiricalModel 12345 13 Quantile PlotModelEmpirical 010 Return Level PlotReturn period (years)Return level 0.1101000 Density Plotxf(x) 012345 0.01.53.0 Figure 1.7.1 Diagnostic plot s for the GPD fit of the Fort Collins Precipitation Data (threshold value = 0.40) 39 PAGE 55 Figure 1.7.2 Profile loglikelihood plot for GPD 100year return level (inches) for Fort Collins precipitation data. Figure 1.7.3 Profile loglikelihood plot for GPD shape parameter for Fort Collins precipitation data. This dataset is of special interest due to a flood th at occurred on July 28, 1997. The amount recorded for the high precipitati on event of July 1997 was 4.63 inches. Note that the confidence interval for the 100year return leve l obtained above does include 4.63 inches. 40 PAGE 56 1.8 Aims of the Present Research The main objective of Chapter 2 is to apply the Generalized Extreme Value (GEV) distribution to statistically model th e annual monthly maximum rainfall data for the years 1950 through 2004 for forty four locat ions, representing centr al, north and south climatic region of the State of Florida, and use the estimated model parameters for predictive purposes. The time dependence of the rainfall data is incorporated into the GEV model as a nonstationary component in the location parameter. Estimates and confidence intervals are obtained for return leve ls of return periods of 10, 20, 50, and 100 years, and all 44 locations are cl assified into clusters based on the similarity profiles. Chapters 3 and 4 focus on the investigation of the applicability of the theories of a mixture of two extreme value distributions and nonmixture extreme value distributions to statistically model the maximum drug concentration data (C max ), a critical pharmacokinetic parameter in the drug de velopment process), the computation of statistical estimates of the model parameters th at provide the scientific best prediction of C max the application of computer algorithms to examine the various model diagnostics and goodnessoffit tests, and to answer quest ions regarding the be havior of the data. In a typical pharmacokinetic experiment, a fixed dose of drug is administered to subjects, and each subject has blood samples taken at time points and is the C n ,,.....,,21 pttt ) (maxjC max observed in subject j Thus, the data set has random C n max observations. For the present study, the si mulated data values for C max are used with the assumption that these random C n max values are having the extreme value distribution with location, scale and shape parameters. ... dii 41 PAGE 57 42 By careful examination of the histograms of the simulated C max for small samples, a bimodal nature in the distribution of C max was observed. Hence, Chapter 3 focuses on the application of a mixture of two extreme value distributions to statistically model the C max data in small samples. The particular pr obability density func tions used in the mixture model are determined by examining a number of different distributions and evaluating the set of distributi ons that best fit the data. Chapter 4 focuses on the application of nonmixture extreme value distributions to statistically model the C max data in large samples. The particular probability distribution functions used are determined by examining a number of extreme value distributions and evaluating the dist ribution that best fits the data. An extensive search of the statistical and pharmacological literature has failed to show any research articles relevant to the application of extreme value theory or a mixture of extreme value distributions to statistically model the C max values. This is the major way that the approach presented in this study differs from the pharmacokinetic approach to dealing with C max Chapter 5 presents the sta tistical modeling of a pharmacokinetic system to investigate the drug concentration beha vior in an open threecompartment pharmacokinetic model which describes the disposition of an antibiotic drug, coumermycin A 1 We studied a system of random differential equati ons representing this model. The three rate constants that are used in the system of random differential equations are assumed to have a trivariate distribution. The trivariate statistical distributions that will be used to simulate these rate constants are truncated normal and exponential probability distributions. PAGE 58 43 More precisely, the numerical solutions for the system will be obtained using the rate constants that are simulated from these distributions for differe nt combinations of covariance structure and initial conditions where the initial co nditions are nonrandom. The effects of different combinations of covariance structures and initials conditions on the deterministic behavior of the drug concentration and the mean behavior of the random solutions as a function of time will be discussed in addition to comparing these two behaviors. Also discussed is the su itability of the use of these two probability distributions to simulate the rate constants. In Chapter 6 we incorporate the constant time delays into the system of random differential equations considered in Chapter 5, develop extensive numerical solutions to the system of delay random differential equa tions and study their impact on the rate of decay, absorption and biotransformation of th e drug concentration. We begin with the three time delays observed in Chapter 5, that is, a time delay of 3.5 hours for the drug to reach compartment two from compartment one, 6 hours for the drug to reach compartment one from compartment two, a nd 3 hours for the drug to reach compartment three from compartment two. In addition we consider other sets of constant time delay values to study the behavior of the drug concentration under different time delays. We discuss the impact of time delays on the overall behavior of the drug concentration in all three compartments, and on the deterministic behavior of the drug concentration and the mean behavior of the random solutions as a function of time. In addition the deterministic behaviors of the drug concentration with and without time delay for each compartment are compared. Chapter 7 presents the possible exte nsions of the present research. PAGE 59 44 Chapter Two Statistical Modeling of A nnual Monthly Maximum Rainfall Using the Generalized Extreme Value Distribution 2.0 Introduction Events such as extreme rainfalls, earthquakes, hurricanes, and stock market crashes seem to follow no rule, but careful anal ysis has helped to discover distributions that acceptably model these extreme events [19]. The most important feature of an extreme value analysis (EVA) is the objective to quantify the stochas tic behavior of the maximum and the minimum of i.i.d. random variables. The EVA requires the estimation of the probability of events that are more extreme than any that have already been observed [23]. Extreme value th eory is a unique di scipline that develops statistical techniques for describing the unusual phenom ena such as extreme rainfalls, floods, wind gusts, earthquakes, risk ma nagement, and insurance. E.J. Gumbel, in his statistics of extremes [45, 46] in 1958, discussed many applications that dealt with hydrology or climatology. As part of the global climate change, an accelerated hydrologic cycle incl uding an increase in heavy precipitation is anticipated on a theoretical basi s [94, 95], is predicted by nume rical models of the climate system [24], and has been detected in obser ved precipitation [32, 58]. Katz et al. [47] used extreme value distributions to fit to th e precipitation data at a single location (Fort Collins, Co, USA) for the time period 19001999. A review of applications of extreme value distributions to climate data can be found in Farago and Katz [29]. An Introduction PAGE 60 45 to Statistical Modeling of Extreme Values is a fairly recent book which can serve as an excellent reference [23]. 2.1 Focus of the Chapter Two The main focus of this chapter is the application of the th eory of Generalized Extreme Value (GEV) distribution to derive statistical models to monitor the rainfall data of forty four locations. These locations are sele cted from different climatic zones such as central, north and south climatic regions of th e State of Florida. The rainfall data is a function of time and thus the subject probability density function is to be adjusted to inherit the time dependence. This is done by adjusting the location parameter () to be time dependent. Thus, our statistical model will incorporate both stationary and nonstationary components to obtain the predictions of median ra infall with profile confidence intervals for each location in addition to providi ng similarity profiles for rainfall locations and identifying clusters. 2.1.1 Data The data consist of total monthly ra infall records from 1950 to 2004 for these forty four locations. The data were obtain ed through two sources: National Oceanic and Atmospheric Administration Na tional Data Center [72] through annual subscription, and Southeast Regional Climate Center [89]. We have used what is known as the blockmaxima method to define the extreme rainfa ll as the maximum of monthly rainfalls within each year. The data for any month in a given year is considered missing if that month has more than five days of data missing. PAGE 61 46 Table 2.1 Tabulation of the locati on, climate zone (central, nort h, or south), years of data, latitude and longitude for all forty four locations LOCATION CLIMATE ZONE YEARS OF DATA LATITUDE LONGITUDE Apalachicola North 1950 2004 2944'N 851'W Arcadia South 1950 2004 2713'N 812'W Avon Park South 1950 2004 2736'N 812'W Bartow South 1950 2004 2754'N 811'W Brooksville Central 1950 2004 287'N 82'W Clermont South 1950 2004 2827'N 815'W Daytona Beach Central 1950 2004 29'N 81'W De Funiak Springs North 1950 2004 3045'N 865'W Deland Central 1950 2004 2901'N 819'W Ft. Lauderdale South 1950 2004 2606'N 802'W Ft. Myers South 1950 2004 2635'N 812'W Gainesville Central 1950 2004 291'N 82'W Hialeah South 1950 2004 2549'N 807'W Inverness Central 1950 2004 2848'N 829'W Jacksonville AP North 1950 2004 300'N 81'W Jacksonville Beach No rth 1950 2004 30'N 81'W Key West AP South 1950 2004 2433'N 815'W Kissimmee Central 1950 2004 2817'N 815'W Lake City North 1950 2004 3011'N 826'W Lakeland Central 1950 2004 2801'N 815'W Live Oak North 1950 2004 3017'N 828'W Madison North 1950 2004 3027'N 835'W Melbourne Central 1950 2004 2806'N 809'W Miami Airport South 1950 2004 2547'N 809'W Miami Beach South 1950 2004 25'N 80'W Plant City Central 1950 2004 2801'N 829'W Saint Augustine North 1950 2004 2953'N 810'W Saint Leo Central 1950 2004 2820'N 826'W Saint Petersburg Central 1950 2004 2746'N 828'W Sanford Central 1950 2004 2848'N 816'W Stuart South 1950 2004 2712'N 800'W Tallahassee North 1950 2004 3024'N 841'W Tampa Airport Central 1950 2004 2758'N 822'W Tarpon Springs Central 1950 2004 289'N 82'W Titusville Central 1950 2004 288'N 80'W Vero Beach Central 1950 2004 2739'N 805'W West Palm Beach South 1950 2004 26'N 80'W Winter Haven Central 1950 2004 2801'N 814'W Naples South 1950 2004 2610'N 813'W Ocala Central 1950 2004 2905'N 825'W Orlando Central 1950 2004 2826'N 810'W Panama City North 1950 2004 3010'N 852'W Pensacola North 1950 2004 3029'N 871'W Perry North 1950 2004 3006'N 834'W PAGE 62 47 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 28 29 30 31 32 34 33 35 37 36 38 39 40 41 42 43 44 21 23 24 22 25 27 26 1 2 3 4 1=Pensacola 12=Tampa AP 23=Hialeah 34=Clermont 2=DeFuniak 13=Lakeland 24=Ft. Lauderdale 35=Sanford 3=Panama City 14=Bartow 25=West Palm 36=Daytona Beach 4=Apalachicola 15=St. Petersburg 26=Stuart 37=Deland 5=Perry 16=Plant City 27=Vero Beach 38=Saint Augustine 6=Gainesville 17=Arcadia 28=Avon Park 39=Jacksonville BCH 7=Ocala 18=Ft. Myers 29=Melbourne 40=Jacksonville AP 8=Brooksville 19=Naples 30=Wint er Haven 41=Lake City 9=Inverness 20=Key West AP 31=Orlando 42=Live Oak 10=Saint Leo 21=Miami Beach 32=Kissimmee 43= Madison 11=Tarpon Springs 22=Miami Airpor t 33= itusville 44= Tallahassee Figure 2.0 Map of the State of Fl orida [89] identifying all 44 lo cations that are used in the statistical modeling. PAGE 63 2.1.2 Methodology To achieve our objective we need to pe rform a statistical goo dnessoffit test and this requires estimates of the parameters that are inherent in the GEV. To obtain such estimates we utilize the maximum likeli hood functional form of the GEV given by n i n i i i nx x L1 1 1 1 11exp 1 1 ,, (2.1) The estimates of , and that is, are taken to be those values which maximize the likelihood function L. Furthermore, we calculated the return level of rainfall X and , T for the GEV pdf, given by the following expression T xT1 1log1 (2.2) where T is defined as the rainfall return period Under the assumption that X T can be approximated by the probability distribution, we can obtain the confidence limits for the return level of rainfall, that is, profile deviance of X 2 T The %1 confidence interval for X T can be obtained using the following expression 2 1,1)( ),,( ln2: Tp TXL L X where ,,sup)(T TpXLXL 48 PAGE 64 2.1.3 Models The models fitted to the rainfall data are the GEV model with the following variations in the location parameter, : Model 1: Stationary, where , and are constants, implying that the maximum rainfall data is stationary with respect to time. Model 2: Linear trend in which the lo cation parameter, possesses a linear behavior with time and is numerically approxi mated with its slope indicating a decrease or increase in rainfall maximum. The scale ( ) and shape ( ) parameters are constants. Model 3: Quadratic trend in which the loca tion parameter, is treated as timedependent with nonlinear behavior with time and is numerically approximated. The scale ( ) and shape ( ) parameters are constants. When once the most appropriate of the models has been identified, we proceed to estimate numerically the median of the rainfall data using the following analytical expression, Median = )2log(1 (2.3) In addition, we obtain approximate nume rical estimates of 90%, 95%, and 99% confidence limits using the following equations, respectively, )95.0log(1 ,)20log(1 (2.4) )975.0log(1 ,)40log(1 (2.5) 49 PAGE 65 )995.0log(1 ,)200log(1 (2.6) 2.2 Results and Discussion Models 1 through 3 were fitted to the annual m aximum monthly rainfall data for each of the forty four locati ons listed in Table 2.1. The me thod of maximum likelihood was used. For Plant City data, the stationary GE V m odel (Model 1) leads to a maximized loglikelihood of 139.97. The GEV model with a linear trend in (Model 2) has a maximized loglikelihood of 137.85, and the GEV model with a quadratic trend in (Model 3) resulted in a maximized likeli hood of 136.44. The deviance statistic for comparing Model 1 and Model 2 is greater than suggesting that a model with linear trend is preferable over th e stationary model ( is constant). However, the deviance statistic for comparing Model 1 and Model 3 is greater than This suggests that the model with a quadratic trend is preferable. The two model diagnostic plots, quantile and the density plots that are given in Figure 2.1 su pport the fit of the model. Thus, we conclude that the best model for extreme rainfall in Plant City is Model 3. 84.32 950,199.52 950,2For Hialeah data, the stationary GEV mode l (Model 1) leads to a maximized loglikelihood of 157.5. The GEV mode l with a linear trend in (Model 2) has a maximized loglikelihood of 155.34, and the GEV model with a quadratic trend in (Model 3) resulted in a maximized likelihood of 155.19. The deviance statistic for comparing Model 1 and Model 2 is greater than suggesting that a model with linear trend is preferable over the stationary mode l ( is constant). However, the deviance 84.32 950,1 50 PAGE 66 statistic for comparing Model 1 and Model 3 is less than and for comparing Model 2 and Model 3 is less than Neither of them is statistically significant. Thus, we conclude that the best model fo r extreme rainfall in Hialeah is Model 2. 99.52 950,284.32 950,1The above analyses were repeated for the remaining forty two locations. Figure 2.1 Diagnostics plots for Plant City: Quantile and density plots 51 PAGE 67 52 Table 2.2 Summary of the stati onary or nonstationary form of the maximum rainfall data for 56 years of data for the forty four locations LOCATION CLIMATE ZONE TYPE OF NONSTATIONARITY ESTIMATE OF SLOPE ASSUMING LINEAR TREND Brooksville Central Stationary 0.2354 Daytona Beach Central Quadratic 2.2908 Deland Central Stationary 2.192 Gainesville Central Stationary 1.2524 Kissimmee Central Stationary 3.9379 Lakeland Central Linear 5.5988 Melbourne Central Stationary 4.4886 Ocala Central Stationary 1.7752 Orlando Central Stationary 1.0776 Plant City Central Quadratic 4.7798 Saint Leo Central Quadratic 5.475 Saint Petersburg Central Stationary 0.5477 Sanford Central Stationary 0.148 Tampa Airport Central Stationary 0.9241 Tarpon Springs Central Linear 4.4139 Titusville Central Stationary 0.8291 Vero Beach Central Stationary 4.1174 Winter Haven Central Stationary 2.7802 Apalachicola North Stationary 1.773 De Funiak Springs North Stationary 1.6623 Jacksonville AP North Stationary 0.4221 Jacksonville Beach Nort h Stationary 0.9628 Lake City North Stationary 1.089 Live Oak North Stationary 1.2083 Madison North Stationary 1.3547 Panama City North Stationary 0.9075 Pensacola North Linear 5.6868 Perry North Stationary 0.5131 Saint Augustine North Stationary 1.1906 Tallahassee North Stationary 2.576 Arcadia South Stationary 4.5838 Avon Park South Stationary 0.8828 Bartow South Quadratic 4.546 Clermont South Stationary 1.5075 Ft. Lauderdale South Stationary 5.149 Ft. Myers South Stationary 4.3141 Hialeah South Linear 5.621 Inverness South Stationary 1.749 Key West AP South Stationary 0.2484 Miami Airport South Stationary 3.2126 Miami Beach South Stationary 2.3161 Naples South Stationary 1.3354 Stuart South Stationary 2.173 West Palm Beach Sout h Stationary 0.034 PAGE 68 53 Table 2.2 also displays the estimate of the slope assuming the linear trend for the forty four locations. Thirty six locations are identified as stationa ry. Four locations are identified as nonstationary, a nd the nonstationarity is due to linear trend. Four locations are identified as nonstationar y, and the nonstationarity is due to quadratic trend. Table 2.3 displays return levels of the ex treme rainfall for the return period 10, 20, 50 and 100 years along with the profile deviance of quantiles for a few locations. Figure 2.2 displays the graphs of the return level X T for T= 10, 20, 50, and 100 for all forty four locations. From these graphs of return levels, these forty four locations can be grouped into five different clusters. Hialeah, Panama City and Jacksonville Beach form a cluster with Hialeah showing the highest return levels of a ll forty four stations. Gainesville, Lakeland, Daytona Beach and Madison form another cluster with Gainesville showing the lowest return levels of all forty four locations. West Palm Beach, Miami Airport and St. Petersburg form another cluster. The rest of the locations can be categorized into two large clusters. PAGE 69 54 Table 2.3 Return levels of the extreme ra infall with profile deviance of quantiles LOCATION PERIOD (YEARS) RETURN LEVELS 90% CONFIDENCE INTERVAL 95% CONFIDENCE INTERVAL 99% CONFIDENCE INTERVAL Brooksville 10 16.28 (14.9, 18.57) (14.67, 19.22) (14.26, 20.79) 20 18.43 (16.53, 22.19) (16.27, 23.34) (15.77, 26.21) 50 21.34 (18.52, 28.03) (18.14, 30.2) (17.52, 35.87) 100 23.62 (19.87, 33.42) (19.42, 36.75) (18.72, 45.72) Daytona Beach 10 13.28 (13.67, 16.24) (12.49, 16.6 5) (13.16, 17.63) 20 14.61 (14.98, 18.5) (13.21, 19.16) (14.41, 20.79) 50 16.14 (16.42, 21.63) (15.2, 22.74) (15.78, 25.57) 100 17.16 (17.32, 24.11) (16.21, 25.67) (16.62, 29.71) Gainesville 10 13.53 (12.83, 14.49) (12.71, 14.75) (12.47, 15.4) 20 14.49 (13.71, 15.88) (13.58, 16.31) (13.34, 17.39) 50 15.55 (14.62, 17.69) (14.48, 18.4) (14.23, 20.28) 100 16.23 (15.15, 19.05) (15, 20.02) (14.75, 22.66) Lakeland 10 13.72 (12.71, 15.11) (12.55, 15.53) (12.25, 16.56) 20 14.89 (13.73, 17.09) (13.56, 17.77) (13.24, 19.51) 50 16.26 (14.82, 19.87) (14.61, 21.04) (14.27, 24.11) 100 17.19 (15.45, 22.12) (15.23, 23.77) (14.89, 28.23) Melbourne 10 14.54 (13.54, 15.99) (13.37, 16.38) (13.05, 17.31) 20 15.98 (14.78, 18.11) (14.59, 18.73) (14.24, 20.26) 50 17.67 (16.13, 21.02) (15.91, 22.06) (15.53, 24.7) 100 18.83 (16.97, 23.31) (16.71, 24.76) (16.3, 28.51) St. Petersburg 10 17.65 (16.04, 20.4) (15.77, 21.2) (15.28, 23.19) 20 20.19 (17.98, 24.87) (17.68, 26.35) (17.1, 30.13) 50 23.60 (20.29, 32.19) (19.85, 35.1) (19.17, 42.91) 100 26.25 (21.8, 39.04) (21.35, 43.62) (37.91, 56.27) Vero Beach 10 16.52 (14.98, 19.15) (14.73, 19.91) (14.27, 21.84) 20 18.95 (16.81, 23.41) (16.51, 24.83) (15.95, 28.53) 50 22.31 (19.07, 30.52) (18.62, 33.36) (17.92, 41.16) 100 24.98 (20.62, 37.3) (20.13, 41.83) (19.32, 54.72) Ft. Myers 10 17.11 (16.22, 18.37) (16.06, 18.74) (15.75, 19.66) 20 18.27 (17.31, 20.15) (17.15, 20.76) (16.84, 22.32) 50 19.49 (18.35, 22.45) (18.19, 23.47) (17.9, 26.18) 100 20.23 (18.91, 24.15) (18.77, 25.55) (18.49, 29.36) Lake City 10 14.54 (13.37, 16.33) (13.18, 16.82) (12.83, 18.01) 20 16.44 (14.87, 19.3) (14.62, 20.15) (14.2, 22.3) 50 19.05 (16.78, 24) (16.47, 25.61) (15.9, 29.8) 100 21.13 (18.19, 28.31) (17.79, 30.74) (17.13, 37.33) PAGE 70 55 Table 2.3 (Continued) Pensacola 10 17.50 (16.27, 20.29) (16.02, 21.01) (15.54, 22.81) 20 19.66 (18.09, 24.14) (17.81, 25.42) (17.26, 28.72) 50 22.38 (20.16, 30.04) (19.76, 32.45) (19.16, 38.92) 100 24.36 (21.44, 35.24) (21.06, 38.87) (20.38, 48.97) Tallahassee 10 16.33 (15.04, 18.41) (14.83, 18.99) (14.43, 20.37) 20 18.33 (16.58, 21.67) (16.34, 22.66) (15.87, 25.12) 50 20.96 (18.41, 26.74) (18.07, 28.56) (17.51, 33.26) 100 22.96 (19.6, 31.26) (19.16, 34) (18.55, 41.24) Ft. Lauderdale 10 18.76 (17.64, 20.27) (17.44, 20.67) (17.06, 21.6) 20 20.32 (19.06, 22.4) (18.84, 23) (18.45, 24.48) 50 22.07 (20.55, 25.13) (20.34, 26.08) (19.91, 28.49) 100 23.20 (21.49, 27.15) (21.25, 28.41) (20.82, 31.69) Hialeah 10 20.15 (18.66, 23.2) (18.38, 23.99) (17.85, 25.95) 20 22.99 (20.78, 27.78) (20.45, 29.2) (19.82, 32.84) 50 26.92 (23.4, 35.15) (22.93, 37.9) (22.15, 45.22) 100 30.08 (25.21, 41.95) (24.65, 46.21) (23.78, 57.9) Naples 10 16.57 (15.52, 18.1) (15.34, 18.52) (14.99, 19.52) 20 18.09 (16.82, 20.34) (16.61, 21.01) (16.24, 22.69) 50 19.90 (18.25, 23.47) (18.03, 24.6) (17.61, 27.56) 100 21.15 (19.18, 25.96) (18.9, 27.56) (18.46, 31.82) PAGE 71 Return Levels versus Return Period10 15 20 25 30 5101520253035404550556065707580859095100Return PeriodReturn Level Brooksville Daytona Beach Deland Gainsville Kissimmee Lakeland Melbourne Ocala Orlando Plant City Saint Petersburg Sanford Tampa Airport Tarpon Springs Titusville Vero Beach Winter Haven Saint Leo Apalachicola De Funiak Springs Jacksonville AP Jacksonville Beach Lake City Live Oak Panama City Pensacola Perry Saint Augustine Tallahassee Madison Arcadia Avon Park Bartow Clermont Ft. Lauderlade Ft. Myers Hialeah Inverness Key West AP Miami Airport Miami Beach Naples Stuart West Palm Beach Figure 2.2: Return Level versus Return Periods for all 44 locations 56 PAGE 72 57 2.3 Similarity Profiles by Each of the Three Climatic Zones In the central climatic zone 18 locations are selected. From the similarities in the return level profiles shown in the graphs of Figure 2.3A these 18 locations can be grouped into 5 clusters. Vero Beach and St. Petersburg form one group with St. Petersburg showing the highest return levels of all eight een locations. Lakeland, Daytona Beach and Gainesville form another cluster w ith Gainesville showing the lowest return levels of all eighteen locations. While Brooks ville and Tarpon Springs form on cluster, another cluster is formed by Winter Have n, Ocala, Tampa Airport and Melbourne. The remaining seven locations turn out to be another cluster. In the north climatic zone 12 locations are selected. From the similarities in the return level profiles shown in the graphs of Figure 2.3B these 12 locations can be grouped into 5 clusters. Madison has the lowest return level and seems to stand by itself. While Lake City, St. Augustine and Jacksonville Airport form one cluster, Jacksonville Beach and Panama City form another cluster. The remaining six locations make up another cluster. In the south climatic zone 14 locations are selected. From the similarities in the return level profiles shown in the graphs of Figure 2.3C these 14 locations can be grouped into 5 clusters. Hialeah stands by itself and has the hi ghest return level. Bartow stands by itself and has the lowest return level. While Miami Airport and West Palm Beach form a cluster, Napl es, Miami Beach, Avon Park, Ft. Myers and Clermont make another cluster. The remaining five locations make up another cluster. PAGE 73 Return Levels versus Return Period (Climate Zone: Central)10 15 20 25 30 5101520253035404550556065707580859095100 Return PeriodReturn Level St. Pete Vero Beach Tarpon Springs Brooksville Lakeland Daytona Beach Gainsville Winter Haven Tampa Airport Ocala Melbourne Figure 2.3A Return level profiles for all 18 locations in Central Climatic Zone Return Levels versus Return Period (Climate Zone: North)10 15 20 25 30 5101520253035404550556065707580859095100Return PeriodReturn Level Panama City Jacksonville Beach Madison Jacksonville AP St. Augustine Lake City Figure 2.3B Return level profiles for all 12 locations in North Climatic Zone 58 PAGE 74 Return Level versus Return Period (Climate Zone: South)10 15 20 25 30 510152025303540455055606570758085909510 0Return PeriodReturn Level Hialeah Bartow West Palm Beach Miami Airport Ft. Myers Naples Clermont A von Park Miami Beach Figure 2.3C Return level profiles for all 14 locations in South Climatic Zone The plots of 10year, 20year, 50year, a nd 100year return le vels along with the 95% profile likelihood confidence intervals for all forty four locations provide further insight into the similarities in these pr ofiles. Figures 2.3D th rough 2.3G provide these plots for 10year, 20year, 50year, and 100year return levels, respectively 59 PAGE 75 10Year Return Level versus Location with 95% Profile Confidence Interval10 15 20 25 30 35 40 45 50 02468101214161820222426283032343638404244 Location10Year Return Level 1=Brooksville 2=Daytona Beach 3=Deland 4=Gainsville 5=Kissimmee 6=Lakeland 7=Melbourne 8=Ocala 9=Orlando 10=Plant City 11=St. Petersburg 12=Sanford 13=Tampa Airport 14=Tarpon Springs 15=Titusville 16=Vero Beach 17=Winter Haven 18=St Leo 19=Apalachicola 20=De Funiak Springs 21=Jacksonville AP 22=Jacksonville Beach 23=Lake City 24=Live Oak 25=Madison 26=Panama City 27=Pensacola 28=Perry 29=Saint Augustine 30=Tallahassee 31=Arcadia 32=Avon Park 33=Bartow 34=Clermont 35=Ft. Lauderlade 36=Ft. Myers 37=Hialeah 38=Inverness 39=Key West AP 40=Miami Airport 41=Miami Beach 42=Naples 43=Stuart 44=West Palm Beach Central North South Figure 2.3D 10Year return le vel versus location with 95% CI for all 44 locations 60 PAGE 76 20Year Return Level versus Location with 95% Profile Confidence Interval10 15 20 25 30 35 40 45 50 02468101214161820222426283032343638404244 Location20Year Return Level 1=Brooksville 2=Daytona Beach 3=Deland 4=Gainsville 5=Kissimmee 6=Lakeland 7=Melbourne 8=Ocala 9=Orlando 10=Plant City 11=St. Petersburg 12=Sanford 13=Tampa Airport 14=Tarpon Springs 15=Titusville 16=Vero Beach 17=Winter Haven 18=St Leo 19=Apalachicola 20=De Funiak Springs 21=Jacksonville AP 22=Jacksonville Beach 23=Lake City 24=Live Oak 25=Madison 26=Panama City 27=Pensacola 28=Perry 29=Saint Augustine 30=Tallahassee 31=Arcadia 32=Avon Park 33=Bartow 34=Clermont 35=Ft. Lauderlade 36=Ft. Myers 37=Hialeah 38=Inverness 39=Key West AP 40=Miami Airport 41=Miami Beach 42=Naples 43=Stuart 44=West Palm Beach Central North South Figure 2.3E 20Year return le vel versus location with 95% CI for all 44 locations. 61 PAGE 77 50Year Return Level versus Location with 95% Profile Confidence Interval10 15 20 25 30 35 40 45 50 02468101214161820222426283032343638404244 Location50Year Return Level 1=Brooksville 2=Daytona Beach 3=Deland 4=Gainsville 5=Kissimmee 6=Lakeland 7=Melbourne 8=Ocala 9=Orlando 10=Plant City 11=St. Petersburg 12=Sanford 13=Tampa Airport 14=Tarpon Springs 15=Titusville 16=Vero Beach 17=Winter Haven 18=St Leo 19=Apalachicola 20=De Funiak Springs 21=Jacksonville AP 22=Jacksonville Beach 23=Lake City 24=Live Oak 25=Madison 26=Panama City 27=Pensacola 28=Perry 29=Saint Augustine 30=Tallahassee 31=Arcadia 32=Avon Park 33=Bartow 34=Clermont 35=Ft. Lauderlade 36=Ft. Myers 37=Hialeah 38=Inverness 39=Key West AP 40=Miami Airport 41=Miami Beach 42=Naples 43=Stuart 44=West Palm Beach Central North South Figure 2.3F 50Year return le vel versus location with 95% CI for all 44 locations 62 PAGE 78 100Year Return Level versus Location with 95% Profile Confidence Interval10 15 20 25 30 35 40 45 50 02468101214161820222426283032343638404244 Location100Year Return Level 1=Brooksville 2=Daytona Beach 3=Deland 4=Gainsville 5=Kissimmee 6=Lakeland 7=Melbourne 8=Ocala 9=Orlando 10=Plant City 11=St. Petersburg 12=Sanford 13=Tampa Airport 14=Tarpon Springs 15=Titusville 16=Vero Beach 17=Winter Haven 18=St Leo 19=Apalachicola 20=De Funiak Springs 21=Jacksonville AP 22=Jacksonville Beach 23=Lake City 24=Live Oak 25=Madison 26=Panama City 27=Pensacola 28=Perry 29=Saint Augustine 30=Tallahassee 31=Arcadia 32=Avon Park 33=Bartow 34=Clermont 35=Ft. Lauderlade 36=Ft. Myers 37=Hialeah 38=Inverness 39=Key West AP 40=Miami Airport 41=Miami Beach 42=Naples 43=Stuart 44=West Palm Beach Central North South Figure 2.3G 100Year return level versus lo cation with 95% CI for all 44 locations. 63 PAGE 79 64 2.4 Conclusion We have modeled the annual maximum m onthly rainfall data for forty four locations from the central, north and south clim atic zones in the State of Florida using the Generalized Extreme Value distribution. Since th e rainfall data is a function of time, we utilized the timedependence structure in th e location parameter to build nonstationary model with the nonstationarity in the form of a linear trend as well as a quadratic trend. Extreme rainfalls for some locations do disp lay an evidence of nonstationarity in the form of either a linear trend or a quadratic tr end. The return level estim ates can be used to determine ways to protect people and the property by taking appropriate measures against extreme rainfalls that may lead to floods and other waterrelated natural disasters. Based on the magnitude of these estimates, the forty four locations can be classified into five clusters as described in the results and discussion section. PAGE 80 65 Chapter Three Mixed Statistical Model for the Phar macokinetic Parameter, Maximum Drug Concentration(C max ): Small Samples 3.0 Introduction The maximum drug concentration in blood or plasma after the administration of a certain drug in an individual is considered as an extreme value. Extreme maximum drug concentration levels are associated with unde sirable clinical outcomes such as toxicity, especially in cases of thos e drugs that have narrow therapeutic windows. A drug is identified as having a narrow therapeutic index if small changes in systemic concentration can lead to either subtherap eutic or toxic results in patients [9, 67]. Warfarin is among those drugs considered to have a narrow therapeutic index. As these maximum drug concentration levels vary in patie nts due to varying factors, such as age, body weight, genetic polymorphism and drugdrug interactions, analys is of such data using appropriate statistical techniques is crucial in understanding the behavior of the drug [52]. Thus, the purpose of the present study is to formulate a necessary theory that includes the computation of sta tistical estimates of the model parameters that provide the scientific best predic tion of maximum drug concentrati on, and applicatio n of algorithms to answer key questions concer ning the behavior of the data. PAGE 81 3.1 Outline of Each Sect ion of Chapter III Section 3.2 introduces the concept of pha rmacokinetics, bioavailability and its assessment. In addition, it discusses Warfarin drug, its application for the prevention and treatment of thromboembolic disorders, side effects, and interaction with other drugs. The section ends with the description of the simulation of the maximum drug concentration (C max ) data. Section 3.3 presents th e focus of chapter three. Section 3.4 introduces the statis tical distributions that ar e considered to model the bimodal data. Section 3.5 presents a brief introduction to the concept of a mixture of two extreme value distributions. This section also presents an analytical basis for not using the mixture of two Gumbel models using a simulation study. It further discusses the mixture models for Pareto and Weibull, a nd the derivations of the normal equations. Section 3.6 has a detailed discussion of the results of the analysis on small samples from the mixture models. Section 3.7 presents the conclusion. 3.2 Pharmacokinetics Pharmacokinetics is a mathematical quantif ication of the amounts of drug in the body over time. It is the study of the processes of bodily absorption, distribution, metabolism, and excretion of a drug [36, 38] The pharmacokinetic parameters required in the Food and Drug Administration regulations for an invivo bioa vailability study are the maximum drug concentration, C max the area under the plasma or blood concentrationtime curve, time to achieve the maximum concentration, T 0AUC max elimination halflife, 21t and the rate constant, k e of the therapeutic moiety [30]. 66 PAGE 82 3.2.1 Bioavailability and Asse ssment of Bioavailability Bioavailability According to the Food and Drug Administ ration guidelines, bioavailability is defined as the rate and extent to which the active i ngredient or active moiety is absorbed from a drug product and becomes ava ilable at the site of action. The maximum drug concentration is a function of both the rate and extent of absorption [30]. Before a medication can have any effect on the body, it must enter the bloodstream. If the medicati on is administered intrav enously, then 100% of the medication is free in the bloodstream immediat ely after dosing. The simulated data that are used in this chapter are from a me dication that is administered orally. Bioavailability of orally administered drugs depends on absorption from the gastrointestinal tract (GI) and few factors that affect this process are the enzymes in the gastrointestinal tract that metabolize the dr ug and thus prevent certain percentage of it from getting into the bloodstream; firstpass metabolism of the drug after it is absorbed into the blood; and some portion of the dr ug may bind with protein in the plasma and hence may not be free in the bl oodstream for the body to use it. Assessment of Bioavailability One of the three main pharmacokinetic parameters that are used to assess the bioavailability is the maximum drug concentration. The increases with an increase in the dose as well as with an increase in the absorption rate. The the time at which the occurs reflects the rate of dr ug absorption, and decreases as the absorption rate increases [30]. ,maxC maxC maxT maxC 67 PAGE 83 68 3.2.2 Introduction to Warfarin Warfarin (Brand name: Coumadin) is the most widely prescribed anticoagulant drug for the prevention and treatment of thro mboembolic disorders. Because of large interpatient variability in doseanticoagulan t effect relationship a nd a narrow therapeutic index, dosing is a challenge. The goal is to minimize the risk of serious bleeding events without compromising the anticoagulant effect Haemorrhage is the major complication (~ 5 30% of patients). Haem aturia, epistaxis, uterine bl eeding, petechial, or simple bruising are common side effect s but GI, intracranial, and retroperitoneal bleeding may also occur [52]. An enormous variation in pharmacological response is a characteristic feature of the clinical use of Warfarin. The average pa tient requires a daily dose of about 5 mg to maintain blood hypocoagulability but dosage requirements can vary from 1 25% daily. Differences in both pharmacoki netics and pharmacodynamics contribute to this variation. Some well documented factors th at influence the response to Warfarin are shown below. Factors influencing the response to Warfarin Factors Effect Mechanism Age with increasing age Enhanced receptor site sensitivity Pregnancy Increased blood coagulability Liver disease Defective synthesis of clotting factors Heart failure Reduced clotting factor synthesis Hyperthyroidism Increased clotting factor degradation Concomitant drugs Induced or inhibited metabolism PAGE 84 69 Warfarin shows a bimodal plasma concentr ation distribution. In other words, the same dose of drug gives high levels in some pa tients, low levels in others. Warfarin is eliminated almost entirely (> 99%) by meta bolism. Cytochrome P450 (CYP2C9) is the enzyme primarily responsible for the metabo lism of warfarin and it has been suggested that genetic variation in the gene coding for this enzyme contributes significantly to the large interpatient variability in warfarindos e requirements. Of all the covariates tested, CYP2C9 genotype was the only one identified as having a significant effect, with a 46% (14%) reduction in clearance for warfarin as compared to wildtype [4]. Patients nutritional status is another factor to be considered before the Warfarin therapy. Those who are malnourished should receive lower doses of Warfarin because they probably have low vitamin K intake and decrease d serum albumin concentrations. Women generally require lower doses than men, a nd thus, gender is another factor. Major pharmacokinetic drugdrug interactions can occu r with warfarin and drugs that interfere with the metabolism of warfarin resulting in va riable plasma concentrations of warfarin. Examples of major pharmacokinetics drug inte ractions with warfar in are shown below [52]. Major Pharmacokinetic drug interactions with warfarin Drug Effect Mechanism Cholestyramine Decreased absorption Amiodarone Decreased metabolism Cotrimoxazole and Cimetidine Decreased metabolism Barbiturates, carbamazepine, Rifampicin Increased metabolism PAGE 85 A system, commonly called the Interna tional Normalized Ratio (INR), was established by the World Health Organization and the International Committee on Thrombosis and Hemostasis for reporting th e results of blood coagulation tests. All results are standardized using the International Sensitivity In dex (ISI) for the particular thromboplastin reagent and instrument combin ation utilized to perform the test. For example, a person taking the anticoagulan t warfarin might optimally maintain a Prothrombin Time (PT) of 2 to 3 INR. The pur pose of administering wa rfarin is to give enough dosage so as to achieve the patients PT of 2 to 3 INR. The INR is the ratio of a patients PT to a normal (control) sample, raised to the power of the ISI [98]. That is, ISI normal testPT PT INR Knowledge of the pharmacoki netics of warfarin is he lpful in understanding the initial response to therapy. Warfarin can be detected in the plasma one hour after oral administration, and peak concentrations occur within two to eight hours of administration. 3.2.3 Maximum Drug Concentration (C max ) Data A Monte Carlo simulation method was used to generate maximum drug concentration data. Monte Carlo simulation relies on pseudo random numbers to generate random times to C max based on C max The analysis tool pack of Microsoft Excel is utilized to perform the simulation. The simulation was based on an assumption that the distribution of logarithm of pharm acokinetic parameter (that is, ln[C max ]) is normally distributed. The ln[C max ] of warfarin were used in th e simulation. In this case, random simulation of ln[C max ] was performed using published mean and coefficient of variation of ln[C max ]. The obtained ln[C max ] was transformed to C max antilog value. As a result, 70 PAGE 86 we got the random simulated C max values. The sample size of this simulated data set is 1000. The random samples of varying sample sizes, ranging from 50 to 500, were generated. The histograms of the data fo r sample sizes n = 50, n = 100, n = 200, n = 300, n = 400, n = 500 were examined. 3.3 Focus of the Chapter Three For sample size N 100, Warfarin C max shows a bimodal distribution. In other words, the same dose of drug gives high levels in some patients, low levels in others. Because of the bimodal nature in the maximu m drug concentration data when sample size is small, a mixture of two extreme value dist ributions is more appropriate in adequately modeling such bimodal data. Thus, the focus of the present chapter is the statistical modeling of the simulated C max the maximum drug concentration data for warfarin, for small samples using a mixture of two extreme value distributions. Th e particular probability density functions used in the mixture model are determin ed by examining a number of different distributions and evaluating the se t of distributions that best fit the data. In each of these distributions the maximum likelihood met hods were used to estimate the model parameters and their standard errors. The criter ion used to select th e model is the Akaike Information Criterion (AIC). The AIC for a model is computed as where M is the number of parameters estimated for the model, and L is the maximized likelihood for that model. The smaller the valu e of AIC the better is the fit of the model to the observed data. These models are examin ed below in separate sections, followed by an overall summary of results. ,2ln2 ML 71 PAGE 87 3.4 Statistical Methods for Small Samples 3.4.1 Generalized Extreme Value and Gumbel Distributions For small sample of sizes 50 and 100, th e Gumbel distribution was fit. The pvalue for all three goodnessof fit tests (Chisquare, Ande rsonDarling, and KolmogorovSmirnoff) was nonsignificant (> 0.10), lead ing to the failure of rejecting the null hypothesis that the shape parameter is zero. The same conclusion was drawn based on the comparison of the value of Akaike Informati on Criterion (AIC) from this model to the value of AIC from the generalized extreme value model. But a nonmixture model with the shape parameter being zero is not a suitable model to fit to the maximum drug concentration data that exhibit bimodality in small samples. This leads to the exploration of a mixture model to ade quately model such data. 3.4.2 Weibull Distribution Although the threeparamete r Weibull model seems to fit the maximum drug concentration data for small samples based on the goodnessoffit test, the bimodal nature of the data are better fit by a mixt ure model. The twoparameter Weibull does not provide a good fit of the data. This was a ssessed by using the lik elihood ratio test, and also through AndersonDarling and Cram ervon Mises goodnessoffit tests. The following probability plots display the absen ce of linearity in th e graph suggesting the inappropriateness of the 2parameter nonmixt ure Weibull model. The points on the plot could be modeled by two straight lines of diffe rent slopes. This may be an indicative of a bimodal data set. 2 72 PAGE 88 2Parameter Weibull Probability Plot (n= 50)0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 0.000.100.200.300.400.500.600.700.800.901.00Observed probabilityExpected probability Figure 3.1 Probability Plot from the 2parameter Weibull fit for n = 50. 2Parameter Weibull Probability Plot (n= 100)0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 0.000.100.200.300.400.500.600.700.800.901.00Observed probabilityExpected probability Figure 3.2 Probability Plot from the 2parameter Weibull fit for n = 100. 73 PAGE 89 Hence, the modeling of the data for sma ll samples of size 50 and 100 was carried out using the mixture of two Weibull extreme value distributions (for which the location parameter = 0) with a mixing parameter .10 pp 3.4.3 Pareto Distribution The Pareto distribution was not a good fit to the maximum concentration data for small samples. This was assessed by using the goodnessoffit test. The probability plots shown below lack the linearity in their appearances. Hence, a mixture of two Pareto distributions will be examined to measure its adequacy to model the bimodal data when sample size is small. 2 Pareto(3.0583, 1121.7) F itted pvalue Input pvalue 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 3.3 Probability plot from the Pareto fit for n = 50. 74 PAGE 90 Pareto(1.9105, 875.92) Fitted pvalue Input pvalue 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 3.4 Probability plot from the Pareto fit for n = 100. 3.5 Mixture of Two Extreme Value Distributions 3.5.1 Introduction Because of the bimodal nature of th e maximum concentration data, these observations were used to estimate the para meters of a twocomponent mixture model. The twocomponent mixture model has the general form [61], );()1();(),,;(2 1 xfp xpfpxf (3.1) 75 PAGE 91 where );(1 xf is the probability density function with parameters for one subgroup, and );(2 xf is the probability density function with parameters for the other group. The mixing parameter is the fraction of obser vations in one group and 10 pp )1( p is the rest of the observations in the other group [61]. Suppose p is known : In a sample of n one would expect approximately observations from the first component in the above probability density function and observations from the second. If the two distributions are fairly well separated and p is not too close to 0 or 1, the smalle st observations will be from one component and the largest observations are from th e other. Then, one can treat the k np )1(pn 1 smallest observations as the first k 1 in a sample of size from the first distribution, and the k np 2 largest as the largest observations in a sample of size ) 1(pn from the second distribution. We can estimate the parameters of the two distributions separately from these two parts of the data. Suppose p is unknown: Being unsure about how sensitive the procedure is to the exact value of p being used, it is desirable to estimate p from the mixture model using different start values for p so as to examine the consistency in the estimation of the model parameters and their standard errors The particular probability density functions used for );(1 xf and );(2 xf were selected after examining the extreme value type I (or Gumbel), Pareto, and Weibull distributions for each component of the model. 76 PAGE 92 3.5.2 Mixture of Two Gumbel Distributions Using the probability density function for Gumbel, equation (3.1) can now be written as, 2 2 2 2 1 1 1 1 2 1 2121exp 1 1 exp 1 ,,,,; i i i ix x x x iiee p ee pp xf (3.2) where 1 and 1 are the location and scale parameters respectively for the first probability density function in the mixture model, and 2 and 2 are for the second probability density function. The parameter p is the mixing parameter, and i = 1, 2,.n observations. The Gumbel distri bution is unimodal, and its shape parameter has a value of zero. Since the shape parameter determines the shape, and depending on its value, the function can change drastically using a distribution such as Gumbel for which the shape parameter is zero does not adequately model the data that exhibits bimodality The simulation results described below serves as analytical basis for not using a mixture of two Gumbels to model the maximum concentration data. Simulation Three Gumbel density functions were simulated using a set of prechosen parameters. The first Gumbel was generated using 1y 2001 and 1 1*2.0 The second Gumbel was generated using 2y d *112 and 12 where the value of was fixed. The third Gumbel is a mixture of defined as d 3y 2 1 and yy 2 1 3*1*ypypy where p is a mixture parameter 10 p The value of 40.0 p was used. 77 PAGE 93 The above simulation was performed for three different sample sizes 1000, 500, and 100, respectively. Each simulation was conduc ted for each of the five values of the parameter 01 .0 ,25.0 ,5.0 ,0.1 ,5.1 d In the next step, a mixture of two Gumbel data sets was simulated using the same set of parameters as describe d above. A mixture of two Gumb el models was then fit to the simulated data and the maximum likelihood estimators were obtained. Table 3.1 displays the results of the mixture model fit. Table 3.1 Maximum likelihood estimators of the pa rameters with standard errors from the mixture of two Gumbel models fit to the mixture of two Gumbel data sets simulated using ,40 ,200,100021 1 n 40.0 ,*2002 pd n d 1 (SE) 1 (SE) 2 (SE) 2 (SE) p(SE) 1000 1.50 203.9 (2.23) 41.5 (1.71) 499.1 (1.64) 37.9 (1.27) 0.40 (0.02) 1000 1.00 203.1 (2.38) 40.9 (2.06) 403.1 (1.84) 39.1 (1.34) 0.40 (0.02) 1000 0.50 202.4 (4.61) 40.8 (3.55) 308.5 (3.10) 39.4 (1.86) 0.44 (0.04) 1000 0.25 225.5 (1.67) 48.4 (1.18) 599.5 1.0 0.99 (0.03) 1000 0.05 208.3 (1.37) 41.0 (1.04) 421.3 (0.64) 1.0 (1.18) 0.99 (0.00) The mixture of two Gumbel model turns out to be a reasonable fit to the data when The maximum likelihood parameter estimates are very close to the preset values that were used to si mulate the data. However, for values of the maximum likelihood estimators and the standard errors are unreliable, leading to the unsuitability of a mixture of two Gumbel models as gets smaller and smaller. 5 .0 and ,0.1 ,5.1 d ,05.0 and 25.0 d d 78 PAGE 94 A similar simulation analysis was done when the sample sizes are 500 and 100. Table 3.2 and Table 3.3 present the results fr om the analysis using the mixture of two Gumble models when sample sizes n = 500 and n = 100, respectively. Table 3.2 Maximum likelihood estimators of the pa rameters with standard errors from the mixture of two Gumbel models fit to the mixture of two Gumbel data sets simulated using ,40 ,200,50021 1 n 40.0 ,*2002 pd n d 1 (SE) 1 (SE) 2 (SE) 2 (SE) p(SE) 500 1.50 192.9 (3.02) 40.4 (2.53) 500.2 (2.39) 39.1 (1.80) 0.40 (0.02) 500 1.00 195.1 (3.28) 39.1 (2.52) 399.8 (2.59) 39.9 (2.14) 0.39 (0.02) 500 0.50 244.1 (3.35) 62.4 (2.80) 330.8 (2.71) 8.91 (3.06) 0.92 (0.02) 500 0.25 223.9 (2.52) 60.0 (1.88) 270.0 (0.84) 1.90 (0.85) 0.96 (0.02) 500 0.05 205.2 (1.90) 40.0 (1.48) 316.0 (1.09) 1.20 (1.22) 0.99 (0.01) Table 3.3 Maximum likelihood estimators of the pa rameters with standard errors from the mixture of two Gumbel models fit to the mixture of two Gumbel data sets simulated using ,40 ,100,10021 1 n 40.0 ,*2002 pd n d 1 (SE) 1 (SE) 2 (SE) 2 (SE) p(SE) 100 1.50 205.4 (6.40) 35.3 (5.17) 507.6(2.39) 44.4 (4.85) 0.40 (0.05) 100 1.00 203.1 (5.57) 32.8 (5.16) 405.8 (6.26) 44.1 (5.45) 0.41 (0.05) 100 0.50 241.4 (7.34) 68.4 (7.25) 303.9 (0.91) 1.00 (0.88) 0.95 (0.03) 100 0.25 226.8 (5.20) 45.0 (3.86) 448.6 (6.55) 1.00 (17.5) 0.98 (0.10) 100 0.05 202.9 (4.00) 36.9 (3.26) 160.5 (2.88) 1.00 (4.10) 0.98 (0.08) 79 PAGE 95 From the values that are shaded in the a bove tables, it is clear that the maximum likelihood estimators for the parameters of a mixture of two Gumbel models are unreliable. This becomes even more evident when the sample size gets smaller. These analyses results clearly indicate the nonsuit ability of a mixture of two Gumbel models, especially, for sample sizes moreover, for sample sizes as large as 1000 the mixture of two Gumbels is not suitable when the difference in the two Gumbel location parameters is ; 500 25.0 units. Hence, a mixture of two Gumble models will not be further investigated. 3.5.3 Mixture of Two Pareto Distributions Using the probability density function for Pareto, equation (3.1) can now be written as, 1 1 2 2 2 1 1 1 1 1 21212 11 1 1,,,,; i i iix p x p p xf (3.3) where 1 and 1 are the shape and scale parameters respectively for the first probability density function in the mixture model, and 2 and 2 are for the second probability density function. The parameter p is the mixing parameter as previously defined, and i = 1, 2,.n observations. Maximum Likelihood Estimation of the Parameters The loglikelihood function for a sample of n observations is given by n i i ix p x p xL1 1 1 2 2 2 1 1 1 1 12 11 1 1ln),(ln (3.4) 80 PAGE 96 The vector of partial derivatives of ),(ln xL with respect to p,,,,2121 gives the score function as n i j iixf1;ln where j = 1, 2, 3, 4, 5. Further simplification of (3.4) yields the following equation x i ix p x p L21 1 2 22 1 2 1 1 112 2 1 1)1( lnln (3.5) The partial derivative [66] of (3.5 ) with respec t to the parameter 1 is equal to x i i i i i ix p x p x x x x p L2 2 1 1 1 11 2 22 1 1 11 2 2 111 1 1 11 2 1 1 1 1 1 11 1)1( 1 ln 1 1 ln Simplifying and setting it equal to zero gives the following equation: 0 )1( ln ln 12 2 1 1 1 1 11 2 22 1 1 1 11 2 11 2 11 1 11 11 1 11 1 1 11 2 2 x i i i ii i i i ix p x px xx x x x x p (3.6) Taking the partial derivative of loglikelihood with respect to 2 and setting it equal to 0, the following equation is obtained xb a L2 2 21ln (3.7) where a and b are given by the following equations: 81 PAGE 97 2 2 2 2 22 22 2 2 2 2 22 22 11 1 2 22ln)( ln )(2 i i i i ii i i i ipxpx x px xx x x x x a i i i i i ix x xpx x xppb22 1 2 22 11 11 1 1 11 22* 2 1 The derivative with respect to 1 is given by xd cpL1 1ln (3.8) where c and d are given by the following equations: 1 22 1 1 111 ii ixx x c )(* 11 1 2 22 11 11 1 1 11 222 1i i i i i ix x xppx x x pd The partial derivative [66] with respect to 2 is given by xf e L2 21ln (3.9) where e and f are given by the following equations: 2 2 111 1 2 222 ppxxx x eii i 82 PAGE 98 i i i i i ix x xppx x x pf22 1 2 22 11 11 1 1 11 22 2 1 0 1 0 ln2 2 1 1 2 2 1 11 2 22 1 1 1 11 2 1 2 22 1 1 1 11 2 i i i ix p x p x x p L (3.10) Equations (3.6) through (3.10) ar e the norm al equations that ca nnot be solved explicitly to get closed form solutions for the maximum likelihood estimates. Instead, for each sample the equations must be so lved using an iterative numerical procedure. The maximum likelihood estimates are those values of that maximize the likelihood. The parameter estimates were calculated numerically using the version 2.2 statistical programming language [51]. p and , 2121, ,,.........,21 nxxxp and , 2121mle3.5.4 Mixture of Two Weibull Distributions Using the probability density function fo r a tw oparameter Wei bull, equation (3.1) can now be written as, 2 2 1 12 1 22 2 1 1 11 1 2121exp 1 exp ,,,,; i i i i iix x p x x p p xf (3.11) where 1 and 1 are the shape and scale parameters respectively for the first probability density function in the mixture model, and 2 and 2 are for the second probability density function. The parameter p is the mixing parameter, and i = 1, 2,.n observations. 83 PAGE 99 Maximum Likelihood Estimation of the Parameters Maximum likelihood was used to estima te m odel parameters. The likelihood function for a sample of n observations is given by n i i i i ix x p x x p xL1 2 1 22 2 1 1 11 12 2 1 1exp 1 exp ),( The log likelihood function is given by n i i i i ix x p x x p xL1 2 1 22 2 1 1 11 12 2 1 1exp 1 exp ln),(ln (3.12) The vector of partial derivatives of ) ,(ln xL with respect to p ,,,,2121 gives the score function as: n i j iixf1;ln where j = 1, 2, 3, 4, 5 We now find the partial deri vatives [66] with respect to each of the five unknown param eters and set each one of them equa l to zero to get a set of normal equations. n i i i i i i ip x x x x x x p L1 1 2 1 2 2 1 2 1 2 22 1 1 1 1 212 2 2 2 1 1exp exp exp 1 lnln n i i i i i i ip x x x x x x p x L1 2 2 2 2 2 1 2 1 1 12 2 2 2 1 1exp exp exp 1 lnln 84 PAGE 100 n i x i x i x i x i i x i i x ipe x e x e x p e x x p e xx p e x p Li i i i i i1 2 2 2 2 1 1 1 2 1 1 1 1 1 1 12 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1ln ln ln simplifying and setting it equal to zero results in the following equation: 0 ln ln1 2 2 2 2 1 1 1 2 1 1 1 1 1 11 1 2 1 1 2 2 2 1 1 1 1 2 2 n i x i x i x i i i i i i xi i i ipe x e x e x p xxxx x e p (3.13) n iL1 2r denominato numerator ln where the denominator term is given by the following expression: pe x e x e x pi i ix i x i x i2 2 2 2 2 2 1 1 12 2 2 2 1 1 and the numerator term is given by the following expression: pe x x pe xx pe x e x x e xx e xi i i i i ix i i x i i x i x i i x i i x i2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 22 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2ln ln ln ln 0 ln2L 85 PAGE 101 0 ln ln ln ln1 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 21 1 2 1 1 2 2 2 1 2 2 2 2 2 2 1 1 n i x i x i x i i i i i i i i i i i xi i i ipe x e x e x p p xx p xx p xxxxx x e (3.14) n i x i x i x i x i x ipe x e x e x p e x pe x p Li i i i i1 2 2 2 2 1 1 2 11 2 1 11 2 1 12 2 2 2 2 2 1 1 1 1 1 1 1 1 1ln Simplifying and setting equal to zero gives 0 11 2 2 2 2 1 1 2 1 1 2 1 11 1 2 1 1 2 2 2 1 1 1 2 2 n i x i x i x i i i xpe x e x e x p xx e pi i i i (3.15) 86 PAGE 102 n i x i x i x i x i x i x i x ii i i i i i ie x pe x e x p e x p e x pe x e x L1 2 2 2 2 1 1 2 22 2 2 22 2 2 2 22 2 2 22 2 2 22 2 2 2 2 2 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2ln Further simplification of this e quation gives the following equation: 0 11 2 2 2 2 1 1 2 2 2 2 2 2 2 2 21 1 2 1 1 2 2 2 1 2 2 2 2 1 1 n i x i x i x i i i i i xpe x e x e x p x pp xxx e pi i i i (3.16) 0 ln p L 01 2 2 2 2 1 1 2 2 1 11 1 2 1 1 2 2 2 1 1 1 2 2 2 1 n i x i x i x i x i x ii i i i ie x pe x e x p e x e x (3.17) These normal equations (3.13) through (3.17) ca nnot be solved explicitly to give general formulas for the maximum likelihood estimates. Instead, for each sample the equations must be solved using an iterative numerical procedure. The maximum likelihood estimates are those values of that maximize the likelihood. The parameter es timates were calculated numerically using the version 2.2 statistical programming language [51]. p and , 2121, ,,.........,21nxxxp and , 2121mle 87 PAGE 103 3.6 Results of the Modeling on Sample Sizes 50 and 100 3.6.1 Mixture of Two Pareto Distributions The Pareto mixture model was fit into the data to obtain all 5 parameter estimates using the maximum likelihood method for various start values of the mixture parameter p In the model, p is the estimated mixture parameter, 1 and 1 are the scale and shape parameters of the Pareto distribution for the first subgroup, and 2 and 1 are the scale and shape parameters of the Pa reto distribution for the second subgroup. For n = 50 and for any start value of p greater or equal to 0.5, with an increment of 0.05, the model resulted in a singular matr ix and the results obtained with start value of p being 0.05, 0.10, 0.15, 0.25, and 0.45 are displayed in Table 3.4 Table 3.4 Parameter estimates and the standard errors from the mixture Pareto model (n = 50). The Akaike Information Criterion (AIC) for this model is 720.01. PARAMETER p 1 2 1 2 ESTIMATE 0.63 1121.7 1587.5 4.40 4.99 STANDARD ERROR 0.16 0 134.86 2.12 2.46 For n = 100, and for any start value of p with an increment of 0.05, Table 3.5 displays the results. All models with different start values for p gave the same parameter estimates but the standard errors of the pa rameter estimates of the second component of the model are extremely high. Moreover, 99.1% of the data values were put in one subgroup for each of these models. 88 PAGE 104 Table 3.5 Parameter estimates and the standard errors from the mixture Pareto model (n = 100). The Akaike Information Crite rion (AIC) for this model is 1537.22 PARAMETER p 1 2 1 2 ESTIMATE 0.99 875.9 1331.4 1.91 0.50 STANDARD ERROR 0.10 0.00 1167843201 0.49 986008150 3.6.2 Mixture of Two Weibull Distributions The Weibull mixture model was fit into the data to obtain all 5 parameter estimates using the maximum likelihood method for various start values of the mixture parameter p The estimated parameters p 1 1 2 and 1 are as defined before but for the Weibull distributions. For n = 50, and for any start value of p less than or equal to 0.45 with an increment of 0.05 Table 3.6A displays the results. The density graphs corresponding to the parameter estimates in Table 3.6A, obtaine d from the mixture of two Weibull models, are presented in Figure 3.5A. Table 3.6A Parameter estimates and the standard errors from the mixture Weibull model (n = 50). The Akaike Information Criterion (AIC) for this model is 727.63. PARAMETER p 1 2 1 2 ESTIMATE 0.44 1981.66 1452.70 7.81 8.57 STANDARD ERROR 0.23 169.8 60.41 4.17 2.62 89 PAGE 105 0 0.0005 0.001 0.0015 0.002 0.0025 0 500 1000 1500 2000 2500 Figure 3.5A Fitted distribution of maximu m drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.6A For n = 50, and for start value of p greater or equal to 0.55, results are displayed in Table 3.6B. The start value of p is 0.5 resulted in a singular matrix. The density graphs corresponding to the parameter estimates in Table 3.6B are presented in Figure 3.5B. Table 3.6B Parameter estimates and the standard errors from the mixture Weibull model (n = 50). The AIC value for this model is 727.93. PARAMETER p 1 2 1 2 ESTIMATE 0.31 1331.37 1857.15 11.99 6.25 STANDARD ERROR 0.15 45.64 102.29 5.38 1.84 0 0.0005 0.001 0.0015 0.002 0.0025 0.003 0.0035 0 500 1000 1500 2000 2500 Figure 3.5B Fitted distribution of maximu m drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.6B 90 PAGE 106 For n = 100, and for start value of p less than or equal to 0.45 with an increment of 0.05, Table 3.7A displa ys the results. The density graphs corresponding to the parameter estimates in Table 3.7A are presented in Figure 3.6A. Table 3.7A Parameter estimates and the standard errors from the mixture Weibull model (n = 100). The AIC value for this model is 1445.41 PARAMETER p 1 2 1 2 ESTIMATE 0.65 1472.56 1893.03 7.06 5.87 STANDARD ERROR 0.55 57.9 558.58 2.01 6.49 0 0.0002 0.0004 0.0006 0.0008 0.001 0.0012 0.0014 0.0016 0.0018 0 500 1000 1500 2000 2500 Figure 3.6A Fitted distribution of maximu m drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.7A For n = 100, the results, when start value of p is greater or equal to 0.50, are displayed in Table 3.7B. The density graphs corresponding to the parameter estimates in Table 3.7B are presented in Figure 3.6B. 91 PAGE 107 Table 3.7B Parameter estimates and the st andard errors from the mixture Weibull model (n = 100). The AIC value for this model is 1445.41 PARAMETER p 1 2 1 2 ESTIMATE 0.28 1326.38 1737.35 11.23 5.31 STANDARD ERROR 0.17 43.58 106.56 4.95 1.09 0 0.0005 0.001 0.0015 0.002 0.0025 0.003 0.0035 0 500 1000 1500 2000 2500 Figure 3.6B Fitted distribution of maximu m drug concentration for each subgroup and combined subgroups, based on the parameter estimates in Table 3.7B The graphs in Figures 3.7A and 3.7B show the predicted maximum drug concentration based on the mixture model parameters displayed in Table 3.6A and Table 3.6B, respectively, when the sample size is 50. Similarly, graphs in Figures 3.8A and 3.8B show the predicted maximum drug c oncentration based on the mixture model parameters displayed in Table 3.7A and Table 3.7B, respectively, when the sample size is 100. 92 PAGE 108 Prediction of Cmax from a Mixture Model of Two Weibull Distributions (n = 50)2500 2550 2600 2650 2700 2750 28000 10000 20000 30000 4000 0 5 000 0 60000 70000 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 3.7A Prediction of C max based on the model parameters displayed in Table 3.6A. Prediction of Cmax from a Mixture Model of Two Weibull Distributions (n = 50)2500 2550 2600 2650 2700 2750 28000 1000 0 2 000 0 30000 40000 50000 60000 7000 0 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 3.7B Prediction of C max based on the model parameters displayed in Table 3.6B. 93 PAGE 109 Prediction of Cmax from a Mixture Model of Two Weibull Distributions (n = 100)2500 2550 2600 2650 2700 2750 28000 10000 20000 30000 40000 5 000 0 60000 70000 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 3.8A Prediction of C max based on the model parameters displayed in Table 3.7A Prediction of Cmax from a Mixture Model of Two Weibull Distributions (n = 100)2500 2550 2600 2650 2700 2750 28000 1000 0 2 000 0 30000 40000 50000 60000 7000 0 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 3.8B Prediction of C max based on the model parameters displayed in Table 3.7B 94 PAGE 110 3.7 Summary and Conclusions For n = 50, the Pareto model results in a singular matrix when the start value of p is greater or equal to 0.50 with an incr ement of 0.05. For other start values of p the model parameters have fluctuating standard errors. The inadequacy of the Pareto model stems from this sensitivity of the model to the start value of p in estimating the model parameters and their standard errors, although the AIC value in this case is slightly smaller in comparison with the AIC value for the Weibull mixture model. For n = 100, although the values of the estimated model parameters from the Pareto distribution remain almost the same for any start value of the mixture parameter p its inadequacy comes from the fact that it places 99.1% of the da ta in one subgroup, and the standard errors for the estimated para meters for the second probability density component of the distribution are extremel y high. The value of Akaike Information Criterion (AIC) is much larger in comparison with the AIC value for Weibull mixture model. This leads to the examination of the results of Weibull mixture model. For n = 50, the parameter estimates and their standard errors from the Weibull model are consistent for any start value of p less than or equal to 0.45 with an increment of 0.05. The results are tabulated in the Table 3.6A. Another se t of consistent parameter estimates was obtained for any start value of p greater or equal to 0.55 w ith an increment of 0.05. These results are presented in Table 3.6B. Th e AIC value remains the same in both cases. 95 For n = 100, and any start value of p less than or equal to 0.45, the results are tabulated in Table 3.7A. Similarly, for n = 100 and any start value of p greater or equal to 0.5, the results are tabulated in Table 3.7B. One can observe that the estimated value of PAGE 111 the mixture parameter in Table 3.7B is equal to (1 estimated p from Table 3.7A), and the estimates of the scale parameters get interchanged as well as estimates of the shape parameters. Based on the comparison of the AIC values between Pareto a nd Weibull for both sample sizes n = 50 and n = 100, and taki ng into account the inconsistency in the estimation of the Pareto model parameters, a mixture of two Weibull models is the only suitable model to fit to the data that exhi bit bimodality when sample size is small. 96 PAGE 112 97 Chapter Four Statistical Model for the Pharmacokinetic Parameter, Maximum Drug Concentration (C max ): Large Samples 4.0 Introduction In chapter three, we examined a mixtur e of two extreme valu e distributions to statistically model the maximum drug con centration data that exhibited a bimodal distribution in small samples (N 100). The focus of the present chapter is to examine various statistical models appropriate for largesample extreme model distributions: generalized extreme value distribution, Gu mbel distribution, generalized Pareto distribution, the threeparameter and the twoparameter Weibull distributions. In each of these distributions the maximum likelihood me thods were used to estimate the model parameters and their standard errors, and using these estimates, extreme quantiles of the maximum drug concentration distribution were obtained. These models are examined below in separate sections, followed by an overall summary of results. 4.1 Outline of Each Section of Chapter Four Section 4.2 presents the derivations of the normal equations for the generalized extreme value distribution, and discu sses the validity of the model. Sections 4.3 and 4.4 have the same information as in section 4.3 but for Gumbel and Weibull distributions, respectively. Section 4.5 gives a description of generalized Pareto distribution pertaining to mode ling maximum drug concentration data. PAGE 113 Section 4.6 presents a detailed discussion of the results of th e statistical modeling of the maximum drug concentration data for large samples using the extreme value models. Section 4.7 contains th e return level profiles based on the quartiles, followed by the summary and conclusions in section 4.8. 4.2 Generalized Extreme Value (GEV) Distribution The Generalized Extreme Value distributi on asymptotically models maxima from any distribution with a stable maximum valu e distribution. The modeling of maxima is used to answer questions about how often, in th e future, values larger than a certain value may occur. In the present problem, the interest lies in the estimation of the return level of the maximum drug concentration for any give n patient in the population. For example, the interest is to estimate the return level of the maximum drug concentration for an n th patient with %1100 confidence interval where 10.0 and ,05.0 ,01.0 The data set consists of maximum drug c oncentration values. Each data value can be represented as where is a sequence of independent random variables having a comm on distribution function F. In the present application, the represent values of drug concentration measured at n preset time points so that represents the maximum drug concentration of these n values for each individual. The distribution of can be derived as [23] }, ,........,,max{21 n nXXX M nXXX ,........,,21 sXi' nM nM n n nzFzX zXzXzM )( Pr...... Pr Pr Pr2 1 In practice, this is not helpful since F is not known, and a small discrepancy in the estimate of F can lead to substantial discrepancies for In reallife applications, one has to be able to statistically estimate th e probability of predicting the maximum drug nF 98 PAGE 114 concentration for an individual after the administration of the drug with confidence bounds, and thus characterize th e stochastic distribution of maximum drug concentrations. Probability Density Function The statistical modeling of the maximum dr ug concentration data starts with the family of models having the dist ribution function of the form [23] 0;for x expexp 0;for ;0for x1exp )(1 x x x xFX (4.1) This is the generalized extreme value (GEV) family of distributions. When ,0 the above distribution reduces to Type 1 extreme value distribution known as the Gumbel distribution. When ,0 it reduces to Type 2 extr eme value distribution known as the Frechet distribution, and finally, when ,0 it reduces to Type 3 extreme value distribution referred to as the Weibull di stribution. The probability density function corresponding to (4.1) is 0 for 1 e exp 0;for 0;for 1 1 1exp )(1 1 1 x e x x x x xfx x X (4.2) where and ,, are the shape, location and sc ale parameters, respectively. 99 PAGE 115 4.2.1 Derivation of Normal Equations Suppose we have n observations for which the GEV distribution is appropriate. The loglikelihood for the GEV parameters when nXXX ,,.........,21 0 is given by [23] 1 1 11 1log 1 1log),,( n i i n i ix x n l (4.3) The partial derivatives [66] of (4.3) w ith respect to each of th e three unknown parameters and ,, are obtained and set each one of th em equal to zero to get a set of normal equations. 0l 0 ln ln ln ln ln ln 1 11 1 1 2 1 1 2 2 x i i i i i i i i i i i ii i i i ix x x x x x x x x x x xx x xx x (4.4) 0l 0 11 x i ix x (4.5) 100 PAGE 116 0 l 01 1 x i i ii ix x xx x (4.6) These normal equations (4.4) through (4.6) cannot be solv ed explic itly to give general formulas for the maximum likelihood estimates Instead, for each sample the equations must be solved using an iterative numerical procedure. The maximum likelihood estimates are those values of that maximize the likelihood. The parameter estimat es and standard errors were calculated numerically using the (extreme value distribution) package for the R statistical computing system [75, 53]. and , ,,.........,21 nxxx and , evd4.2.2 Validity of the GEV model The validity of the GEV model to model the maximum drug concentration data is exam ined through probability, quantile, and retu rn level plots, and each of these plots is based on a comparison of modelbased and empirical estimates of the distribution function. Another method to examine the valid ity of the model is the comparison of the probability density function of a fitted model with a histogram of the data. Any departure from linearity in the probability or quantile plot indicates the inadequacy of the model to fit the maximu m drug concentration data. The return level plots enable us to assess the adequacy of the model. The modelbased curve and 101 PAGE 117 empirical estimates should be in reasonable agreement. The presence of substantial disagreement indicates the in adequacy of the GEV model. 4.3 Gumbel Distribution (Type I Extreme Value Distribution) Probability Density Function The shape parameter determines the shape, and depending on its value, the distribution function can cha nge drastically. The standard asymptotic results of consistency, asymptotic efficiency and asymptotic normality hold when .5.0 For ,5.01 maximum likelihood estimators do not have the asymptotic properties and for 1 they are unlikely to be obtainable. It is often of interest to test the hypothesis that the shape parameter is 0, leading to modeling the maximum drug concentration data using Gumbel distribution. When ,0 the probability density function for GEV reduces to Gumbel probability density function given by x ee xfx x x, exp 1 )( (4.7) 4.3.1 Derivation of Normal Equations As in the case of GEV, the loglikeli hood for the Gumbel parameters [23] is n i i n i ix x nl1 1exp log),( (4.8) Taking the partial derivatives of ),( l with respect to the parameters and the result is the following: 102 PAGE 118 1 ne ln i xi (4.9) 2 1 n i i x i xxexe i li i (4.10) Setting 0 l and solving for we have n i xien1ln)ln( (4.11) Setting 0 l, and solving for we get n i x i n i xi ie xe x1 1 (4.12) The parameter estimates and standard errors were calculated numerically using the B estFit 4.5.4 [11], a probability distribution fitting software package for Microsoft Windows from Palisade Corporation. 4.3.2 Validity of the Gumbel Model The validity of the Gumbel model to mode l the maxim um drug concentration data is examined through probability and quantile plots, and through the comparison of the probability density function of a fitted model with a histogram of the data. The suitability of any particular member of the GEV fa mily can be assessed by comparison with the maximum likelihood estimate within the entire family. The 103 PAGE 119 appropriateness of replacing the GEV family with the Gumbel family, corresponding to the 0 subset of the GEV family is also assesse d using the likelihood ra tio test statistic. 4.4 Weibull Distribution (Type III Extreme Value Distribution) Probability Density Function Since the fitted shape from the generalized extreme value distribution is negative, Weibull distribution was used to model the maximum drug concentration data to examine if it is an appropriate distribution to m odel the data. The threeparameter probability density function is given by )(1 xe x xf ,0 ,0, (4.13) where and ,, are the shape (also called Weibull slop e), location, and scale parameters, respectively. Different values of can have marked effects on the behavior of the distribution. 4.4.1 Derivation of Normal Equations For a random sample the loglikelihood function is given by nxxx ,........,,21 n i n i i ix x nnL11ln1lnlnln (4.14) The maximum likelihood estimates are obtained by setting and , ly. respective and , at 0 ln and ,0 ln ,0 ln L LL After some mathematical simplification, we get the equations 104 PAGE 120 n i n i i n i i i it t tn t n1 1 1 ) ( ) log() ( ) log( (4.15) 1 )1 ( ) ( ) ( 1 1 1 1 n i i n i i n i it t tn (4.16) and 1 1 1 n i it n (4.17) These equations can be solved numerically to obtain the estimates The parameter estimates and standard errors were calculated numerically using the Statistical Analysis Software (SAS ), Version 9.1, Cary, N.C. [91] and , 4.4.2 Validity of the Weibull Model The validity of the Weibull model to model the maximum drug concentration was examined using the goodnessoffit tests by using CramervonMises and AndersonDarling tests. The appropriaten ess of replacing the threeparameter Weibull with the twoparameter Weibull, corresponding to the location parameter 0 was also assessed using the likelihood ra tio test statistic. 4.5 Generalized Pareto Distribution The generalized Pareto distribution (G PD) is a twoparameter family of distributions which can be used to mode l exceedances over a threshold. The most common estimators for the shape and scale parameters of the GPD are maximum likelihood estimators. The probability de nsity function for the GPD is given by 105 PAGE 121 1 11 1 ),;( x xf (4.18) where and are the shape and the scale parameters, respectively. One of the crucial factors is to select an appropriate threshold value. The two techniques that are used to select a threshol d value are the use of m ean residual life plots, or fit the generalized Pareto distribution at a range of thresholds and look for stability of parameter estimates. Above a cer tain value of the threshold at which the generalized Pareto distribution provides a valid approximation to the excess distribution, the mean residual plot should be approximately linear inu. By arbitrarily selecting a range of threshold values, the mean residual plots drawn did not display lin earity, indicating the inappropriateness of the use of the generalized Pareto dist ribution to model the maximum drug concentration data. 0u 106 PAGE 122 4.6 Results of the Modeling on Large Samples 4.6.1 Assessment of Generalized Extreme Value (GEV) Distribution For large samples of sizes 300, 400, 500 and 1000, the generalized extreme value distribution was fit. The various diagnostic plots for assessing the accuracy of the GEV model fitted to the maximum drug concentratio n data were carefully examined. Both the probability and quantile plots were linear or nearlinear indicating no evidence of lack of fit to the GEV distribution. The estimated shape parameter was negative, and as a consequence of this, the return level curve asym ptotes to a finite level. The fitted density seems a reasonable fit to the histogram of the maximum drug concentrations. Thus, all four diagnostic plots supported the fitted GE V model. The Table 4.1 displays maximum likelihood estimates for the GEV model along wi th their standard errors, and Deviance statistic for each of the samples of sizes 300, 400, 500, and 1000. The Figure 4.1 displays the diagnostic plots for the case of n = 1000. Figures 4.2 through 4.4 display the return level graphs with 90%, 95%, and 99% confidence intervals, respectively. Table 4.1 Parameter Estimates (SE) for the GEV Model n (SE) (SE) (SE) DEVIANCE 1000 1360.81 (10.6) 303.45 (7.5) 0.095 (0.020) 14468.9 500 1363.00 (15.2) 305.27 (10.7) 0.082 (0.029) 7250.02 400 1346.68 (16.2) 291.44 (11.5) 0.075 (0.033) 5765.8 300 1355.65 (21.4) 330.03 (15.2) 0.112 (0.041) 4388.8 In the above table and , are the location, the scale and the shape parameters of the GEV. Using the parameter estimates and th eir corresponding standard errors one can construct 95% (90% or 99%) confidence interv al. A greater accuracy in the computation of confidence intervals can be obtai ned by the use of profile likelihood. 107 PAGE 123 Figure 4.1 Diagnostic Plots for GEV 1000 n 108 PAGE 124 Prediction of Cmax with 90% Confidence Interval Generalized Extreme Value (GEV) Distribution (n = 1000)2700 2900 3100 3300 3500 3700 3900 41000 1000 0 2 000 0 30000 40000 50000 60000 7000 0 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 4.2 Prediction of C max versus Number of Subjects with 90% Confidence Interval Prediction of Cmax with 95% Confidence Interval Generalized Extreme Value (GEV) Distribution (n = 1000)2700 2900 3100 3300 3500 3700 3900 41000 10000 20000 30000 40000 5 000 0 60000 70000 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 4.3 Prediction of C max versus Number of Subjects with 95% Confidence Interval 109 PAGE 125 Prediction of Cmax with 99% Confidence Interval Generalized Extreme Value (GEV) Distribution (n =1000)2700 2900 3100 3300 3500 3700 3900 41000 10000 20000 30000 4000 0 5 000 0 60000 70000 80000 90000 100000Number of SubjectsModel Based Prediction of Cmax Figure 4.4 Prediction of C max versus Number of Subjects with 99% Confidence Interval 4.6.2 Assessment of Gumble Model Next, the suitability of replacing the GEV family with Gumbel family ( 0 ) was assessed. Table 4.2 Parameter Estimates (SE) for the Gumbel Model n (SE) (SE) DEVIANCE DEVIANCE (GEV) DF 2 1000 1344.58 (10.1) 296.25 (7.2) 14487 14469 1 18 500 1349.6 (14.28) 299.56 (11.2) 7257 7250 1 7 400 1334.66 (15.2) 285.95 (11.1) 5770 5766 1 4 300 1336.09 (19.5) 320.46 (17.3) 4395 4389 1 6 pvalues < 0.05 The pvalue corresponding to each of the four values presented in Table 4.2 is less than 0.05, leading to the rejection of the null hypothesis that the shape parameter is 0. 2 110 PAGE 126 The degree of significance increases as the sa mple size increases. This leads to the conclusion that the Gumbel model is not a suitable model to model the maximum drug concentration data when the sample size is large. 4.6.3 Assessment of Weibull Model Since the value of the shape parameter is negative, the suitability of replacing the GEV model with the Weibull model was assessed. Although the Weibull model was suitable for small samples with its shortcomings to adequately explain the bimodal nature of the data, its suitability to model the la rge sample data was inconsistent. Table 4.3 displays the parameter estimates for the Weibull fit. Table 4.3 Parameter Estimates (SE) for the ThreeParameter Weibull Model n (SE) (SE) (SE) valuep 1000 624.54 (9.55) 993.54 (16.24) 2.68 (0.07) 0.004 500 775.20 (19.9) 835.42 (29.38) 2.18 (0.10) 0.014 400 704.72 (17.1) 889.85 (27.3) 2.43 (0.11) 0.064 300 774.5 (36.4) 833.9 (49.6) 2.06 (0.16) 0.47 *pvalue corresponding to goodnessoffit test 2 The inconsistency of the suitability of the Weibull model stems from the fact that the pvalue is statistically mo re significant as the sample size becomes large, leading to the rejection of the null h ypothesis that the Weibull model fits the maximum drug concentration data when the sample size becomes large. 4.6.4 Assessment of Generalized Pareto Model Also examined was the suitability of the generalized Pareto distribution to model the maximum drug concentration data. The result s from this approach indicated that the Pareto distribution was an inadequate distri bution to model the data. Table 4.4 displays the parameter estimates for the generalized Pareto fit. 111 PAGE 127 Table 4.4 Parameter Estimates (SE) for the generalized Pareto Model n (SE) (SE) valuep 1000 636.7 (16.9) 1.196 (0.14) 0.0 500 800.4 (21.9) 1.634 (0.19) 0.0 400 724.5 (25.3) 1.432 (0.22) 0.0 300 835.8 (27.2) 1.774 (0.23) 0.0 *pvalue corresponding to goodnessoffit test 2 The goodnessoffit test used was Chisquare and the pvalues were statistically significant, leading to the rejection of the null hypothesis that the Pare to distribution is an appropriate distribution for any of the large sample sizes 300, 400, 500, or 1000. 112 PAGE 128 113 4.7 Return Level Profiles Based on the Quartiles The complete data set (n =1000) was split into four groups based on the quartiles. The first quartile value is 1249.32. The sec ond and third quartile values are 1463.41 and 1710.56 respectively. The first group, named as Q1, consists of all values of maximum drug concentrations that are less than or equal to 1249.32. The second group, named as Q2, consists of all values of maximum drug con centrations that are less than or equal to 1463.41. The third group, named as Q3, c onsists of all values of maximum drug concentrations that are less than or equal to 1710.56. The last group, named as ALL, consists of all 1000 original maximum drug concentration values. The generalized extreme value (GEV) distribution was fit into each of these groups and maximum likelihood parameters were estimated. The following table displays the parameter estimates along with the standard e rrors for each of the four groups. Table 4.5 Parameter Estimates (SE) fo r the GEV fit on four groups of data. n (SE) (SE) (SE) Q1 1072.02 (13.22) 166.72 (12.44) 0.9404 (0.00) Q2 1163.85 (11.80) 208.79 (13.21) 0.6847 (0.04) Q3 1269.82 (9.67) 223.78 (7.09) 0.4497 (0.03) ALL 1360.81 (10.63) 303.45 (7.51) 0.0950 (0.02) The graphs of the quantiles derived fr om the maximum likelihood estimators for each of the four groups are displayed in a si ngle graph in Figure 4.5. If the maximum value of maximum drug concentrations that can be attained is limited by these quartile values that define the groups Q1, Q2, and Q3 then the quantile graphs for the groups Q1, PAGE 129 Q2, and Q3 are almost linear as n becomes a very large number. The graph is displayed in Figure 4.5. Prediction of Cmaxversus Number of Subjects 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000 3200 3400 36000 10 000 20000 30000 4 0000 50000 60000 70000 80000 900 0 0 100 0 0 0Number of Sub j ects Model Based Prediction of Cmax Q1 Q2 Q3 ALL Figure 4.5 Prediction of C max versus Number of Subject s from the Maximum Likelihood Estimates of the GEV Model 4.8 Summary and Conclusions The results from the application of extr eme value theory to model the maximum drug concentration data for larg e samples are summarized in the form of tables in section 4.6. The Gumbel distribution (testing the 0 :0 H ) is an inadequate model based on goodnessoftests pvalue shown in Table 4.2. A similar conclusion can be drawn for the 3parameter Weibull because of its inconsistency in fitting to the data as sample size increases. This is evident from the pvalues shown in Table 4.3 for different values of Table 4.4 tabulates the pvalues corresponding to the Pa reto fit to the data for 2 n 114 PAGE 130 different sample sizes. From this, it is easy see that the Pareto distribution is an inadequate model to fit to the maximum drug concentration data. From the model diagnostic plots for th e generalized extreme value distribution (GEV) shown in Figure 4.1, a nd the deviance statistic shown in Table 4.1, it is evident that GEV is the only suitable model to m odel the maximum drug concentration data for large samples. Figures 4.2, 4.3, and 4.4 show the qua ntile graphs obtained using the maximum likelihood estimators from the GEV model with 90%, 95%, and 99% confidence intervals, respectively. The usefulness of these graphs is that one can predict the maximum drug concentration value in a given patient after the administrati on of the drug with 90%, 95%, and 99% confidence limits. Using the extreme value theory to model and predict has not appeared in the literature prior to this study. The behavior and the effect of any drug in a human body are influenced by gender, age, weight, interactions with other drugs, severity of the disease, and other factors which are unique to a pa rticular drug. The methodology developed in this study can easily be applie d to construct profiles of of a drug for each covariate or combination of covariates. maxC maxC 115 PAGE 131 116 Chapter Five Statistical Modeling of a Pharmacokinetic System 5.0 Introduction In the pharmaceutical drug discovery and development process, mathematical modeling, computerintensive si mulations and statistical validations have become an integral part of the process. For any chem ical compound to be useful as a drug it must exhibit a sufficient binding to th e target receptor, that is, it must have efficacy at the site of action. In addition, sufficient quantities of th e compound must reach the site of action and remain there long enough for the de sired therapeutic effect, that is, it must have desirable pharmacokinetics In order to accomplish these two goals it is necessary that the drug be absorbed into the body, be distributed to the site of action, and remain in the body long enough for its benefits to emerge. Each of these areas is subject to research in pharmaceutical drug development and requires extensive mathematical modeling. Mathematical modeling and the resulting computer simulations have become part of a successful method of doing scientific research by complementing and blending experiment and theory [13, 14, 28]. Mathematical/numerical modeling not only offers many advantages but also aids and accelerat es research and development by predicting the response of the system under different, often critical, conditions, giving insight into processes over a time scale, and elimin ating the need of doing expensive tests on unsuccessful designs. PAGE 132 In multicompartment pharmacokinetic models the two rate constants that are of importance are the absorption rate constant and the elimination rate constant. The absorption rate constant expresses the speed of absorption and affects such pharmacokinetic parameters as the maximu m drug concentration, time at which the maximum concentration occurs, and the area under the concentrationtime curve. The average drug concentration during a longterm therapy depends on the extent of absorption, and the degree of fluctuation is re lated to the absorpti on rate constant. The elimination rate constant is a function of how a drug is cleared from the blood by the eliminating organs and how the drug distribut es throughout the body [93]. As a result of increasing interest in the ki netics of drug absorption, dist ribution, and elimination, many analytical and mathematical techniques have been developed to perform highly sophisticated pharmacokinetic analyses [37]. The purpose of mathematical models in biology and medicine is to obtain a comprehensible representation of complex biol ogical activities. M odels are based on the experiments, and experiments do contain errors of random nature. In addition, biological systems have intrinsic variability. These fact ors indicate the impor tance of stochastic approach to formulations of mathematical models in medicine and biology [63, 64]. The use of compartment models in pharmacokinetics was first proposed by Teorell [92] in 1937, and ever since, these m odels have been used extensively for the study of drug concentration problems in pharmacokinetics [39, 40]. An n compartment model is described by the following system of differential equations: n ji j jji iiin ijtxktxkt1,.....,3,2,1, x (5.0.1) 117 PAGE 133 with initial conditions where ,000i ixx txi is the drug concentration in the i th compartment; are the rate constants which represent the proportional constants associated with the rate of absorption or elimination from the compartments. The rate constants are nonnegati ve and satisfy the relationship ),.....,3,2,1,( n jikij n jikkji ij ii,.....,3,2,1, Assuming that the rate constants are know n, the system of differential equations, (5.0.1), can be solved to obtai n a theoretical description of the drug concentration as a function of time in each compartment. But the complicating factor in this problem is the uncertainty in the rate constant s due to factors such as vari ed experimental conditions like sampling and measurement errors, variations in patient parameters as functions of time and environmental effects. These uncertainties naturally lead to the assumption that the rate constants behave as stocha stic variables, and hence, the use of stochastic procedures in the analysis is appropriate [86]. A well c onstructed model in a stochastic setting will converge in the mean to the deterministic solution. By considering the initial concentration or rate constants or both to be stochastic, the system of differential equations, (5.0.1), can be made stochastic. One can then seek to develop the probability density function that will characterize the random behavior of drug concentration. Having the probability dist ribution one can obtain the mean behavior of the drug concentration and compare it to its deterministic counterpart. Tsokos et al [97] have discussed in detail the procedur e for stochastizing the pharmacokinetic model that describes the profile of an antibiotic drug coumermycin A 1 and have given a 118 PAGE 134 119 numerical comparison between the determinis tic trajectory of the above drug with the mean value of the random solu tion as a function of time. 5.1 Focus of the Chapter Five The focus of the present study is to develop numerical solutions for a system of random differential equations that represents the threecompartment open system which describes the disposition of coumermycin A 1 that was considered by Tsokos et al. [97]. Schematically, the open threecompartment pha rmacokinetic model that we are studying is illustrated in Figure 5.2.1 and the system of differential equations describing the model is given by the equation 5.2.1. The rate constant s used in the system are assumed to be random and have been characterized by a trivar iate probability distri bution. The trivariate probability distributions that will be used to simulate these rate constants are the truncated trivariate normal and e xponential probability distributions. More precisely, the numerical solutions for the system of random differential equations will be obtained using the rate cons tants that are simulated from the trivariate probability distribution under diffe rent combinations of covariance structures and initial conditions, where the initial conditions are nonrandom. The effects of different values of covari ance structures and in itial conditions on the deterministic behavior of the drug concen tration and the mean behavior of the random solutions as a function of time will be disc ussed in addition to comparing these two behaviors. Also discussed is the suitability of the use of the two probability distributions to simulate the rate constants PAGE 135 5.2 A Pharmacokinetic Model for the Antibiotic Drug, Coumermycin A 1 In the model for the disp osition of coumermycin A 1 the first compartment reflects solely the plasma water, while the sec ond compartment comprises the remaining body distribution space and the third compartment re flects the biotransformation and excretion of the drug [57]. In designing a mode l for the disposition of coumermycin A 1 the additional factors to be considered are the nonelimination of the drug from the central or plasma compartment. Therefore biotransformation is occurring elsewhere in the body. The appropriate model for the disposition of coumermycin A 1 introduced by Kaplan [57] is as follows: x 1 (t) x 2 (t) k 12 Compartment 1 Compartment 2 plasma water k 21 extracellular water and tissues k 23 120 x 3 (t) Compartment 3 biotransformation and excretion Figure 5.2.1: Model for the di sposition of Coumermycin A 1 The pharmacokinetic profile of coumermycin A 1 was determined in four human subjects based on the serum level data from the report of a clinical study [85]. The associated rate constants were calculated [57] and are presented in Table 5.2.1 below. PAGE 136 Table 5.2.1 Rate constants for the disposition of coumermycin A 1 SUBJECT SUBJECT A SUBJECT B SUBJECT C SUBJECT D k 12 hr 1 0.338 0.297 0.273 0.222 k 21 hr 1 0.060 0.097 0.030 0.033 k 23 hr 1 0.041 0.065 0.027 0.017 The above compartmental model is described by the following system of differential equations: txktx txkktxktx txktxktx223 3 22321 112 2 221 112 1 (5.2.1) with the initial conditions 3 32 21 10 ,0 ,0 cxcxcx ; where txtxtx3 21 and , are the amounts of the drug concentration in compartments 1, 2, and 3, respectively, at time 0 are the rate constants of the system. ;0 t and ,,23 2112 kkk In the matrix form, the system (3.1) can be written as follows: tx tx tx k kkk kk tx tx tx tx3 2 1 23 2321 12 2112 3 2 10 0 0 0 with initial conditions (5.2.2) 3 2 1c c c C The solution to the differential system (5.2.1) with its corresponding initial conditions is given by Tsokos et al [97] as follows, 121 PAGE 137 t t t t te k c kk ck e kk ck k c kk cckk kk cckk k c de k c kk ck ce kk ck k c be k c kk ck ae kk ck k c tx tx tx txt 3 2 3 2 3 223 12 12223 2221 12323 3321 23 31 123 2 23 32223321 122 2 23 33223221 23 32 1 23 12 12223 2221 12323 2321 23 31 23 12 12223 2221 12323 2321 23 31 3 2 1 1 (5.2.3) where ,01 2312 2 122321 122321 24 *5.0 kkkkkkkk, 2312 2 122321 122321 34 *5.0 kkkkkkkk, ,12223 221kk k a ,12323 321kk k b ,23 2k c ,23 3k d and .123122 23 2 23 3221 kk k k In the matrix form, the genera l solu tion can be written as: ACtx 1 where and 3 2 1c c c C 122 PAGE 138 A is a matrix whose three columns A.1 A.2, and A.3 are given as follows: 33 x 1 1 1 1.3 2 3 2 3 22 332 23 2 3 23 2 3 23 t t t t t tee k dece k beae k A e12.12323 321 12223 221 122 2 123 3 23 21 122 2 123 3 23 213 3 2 3 2kk k kk k k de k ce k k k be k ae k k At t t t t 0 0 3.223 2 12323 321 122 2 23 3221 te kkk k kk k A Tsokos et. al. [97] compared the deterministic behavior of the concentration of coumermycin A 1 and the mean behavior of the rando m solution as a function of time with the assumption that initial concentrations being 0 and 0,13 2 1 ccc and the joint probability density function of random vector being probabilistically characterized by k elsewhere 0 0 ,( 2 1 exp 2 ;1 1 2 1 0 kk k D tkfT where ,2 3 23 13 32 2 2 12 31 21 2 1 23 21 12 23 21 12 kE kE kE k k k k and is a normalization factor. 0D 123 PAGE 139 5.3 Simulation of Rate Constants Using Trun cated Trivariate Normal Distribution The importance of rate constants is quite evident as they affect such pharmacokinetic parameters as maximum drug concentration, time at which this maximum concentration is attained and the area under the concentra tiontime curve as well as how a drug is cleared from the blood and distributed through out the body. Chaung and Lloyd [6] specifically mention th at the rate constants are correlated, but the presumption of their independence or of having some speci fic correlation is necessary in order that an an alysis of the problem to be relevant. Thus, we study the drug concentration behavior in a stoc hastic setting with respect to small (0.25), medium (0.50) and high (0.75) correlation structure. The standard deviation in any type of pharmacokinetic system plays a major role in the final interpretation of the drug concentration behavior in each of the compartments. Thus, we study its effect on the drug concentrat ion behavior with resp ect to small (5% of the mean), medium (10% of the mean), a nd large (20% of the mean) increase in the standard deviation. The two necessary components, the mean vector and the covariance matrix, to generate rate constants that are truncated tr ivariate normally distributed are obtained as follows: Mean Vector : The first entry in the mean colu mn vector was computed as the average of values measured on four patients, the second entry was the average of values measured on the same four patients, and the third entry was the average of values measured on the same four patients. 12k 21k 23k 124 PAGE 140 125 Covariance Matrix : The values used for the correl ation coefficient and standard deviations to obtain the covariance matrix stru cture are described in Figure 5.3.1 in the form of a flowchart. It also describes the set of initial conditions used in solving the system of random differential equations. The numerical solutions for the system of random differential equations are obtained for the following three sets of initial conditions: 1. vector (c 1 c 2 c 3 ) = (1.0, 0.00, 0.00) 2. vector (c 1 c 2 c 3 ) = (1.0, 0.05, 0.05) 3. vector (c 1 c 2 c 3 ) = (1.0, 0.10, 0.10). For each set of initial conditions, three co rrelation coefficients (0.25, 0.50 and 0.75) and three values for the standard devi ation (5%, 10% and 20%), represented as a percentage of the mean value, will be c onsidered. Thus, numerical solutions will be obtained for a total of 3 = 27 models. All the numerical solutions summarized in this study are based on 1000 simulations. It is helpful to have a convenient nota tion to refer to individual models. For example, the notation for the model consisting of the third set of initial conditions, correlation = 0.50 and standard deviation = 10% of the mean is M (3, 0.50, 10%). Similarly, M(3, 0.25, 5%) refers to the m odel consisting of the third set of initial conditions, correlation = 0.25 and standard deviation = 5% of the mean. Thus, a total of twenty seven model configurations that we studied are presented in Figure 5.3.1 In section 5.4 we present the representa tive graphs along with the discussion of the numerical results pertaining to the behavior of the deterministic characterization in PAGE 141 126 each of the three compartments as the correla tion structure and standard deviation values change. In section 5.5 we present the representa tive graphs along with the discussion of the numerical results pertaining to the behavior of the stochastic characterization of individual compartments as the correlati on structure and standard deviation values change. In section 5.6 we discuss the simulation of rate constants using the trivariate exponential probability distribut ion and the results from solving the system of random differential equations usi ng these rate constants. PAGE 142 127 PAGE 143 5.4 Discussion of the Results The numerical results of the different mode ls, as outlined in Figure 5.3.1, that we study are presented in Figures 5.4A through 5.4D. Each of the three representative graphs in Figure 5.4A shows the deterministic behavior of coumermycin A 1 concentration in all three compartments for the first 50 hours. This behavior of the drug concentration in all three compartmen ts is as expected. The drug concentration in the first compartment ( plasma water) decays rapidly as indicated by the graph of, while the drug concentrati on in the second compartment ( extracellular water and tissues ) increases during the first six to eight hours as indicated by the graph of before it starts decaying. The c oncentration behavior in the third compartment ( biotransformation and excretion) increases exponentially as indicated by the graph of tx1 tx2 tx3 Discussion of the results displayed in the graphs in Figures 5.4B through 5.4D is given in Table 5.4.1. More preci sely, the behavior of the deterministic characterization in each of the three compartments as is affected by the change in the correlation structure and in crease in the standard deviation is displayed. 128 PAGE 144 129 5.4.1 Effects of Varying the Values of the Standard Deviation and the Correlation Structure for Studying the Behaviors of x 1 (t), x 2 (t) and x 3 (t) Table 5.4.1 Effects of varying the value of the standard deviation and correlation structure for studying the behavior of x 1 (t), x 2 (t) and x 3 (t). Change in Change in Correlation Standard Deviation 0.25 to 0.50 0.25 to 0.75 5% to 10% x 1 (t): Rate of decay is almost identical for the Rate of decay is significantly faster first 6 hours when r = 0.50, and after when r = 0.75 for the first 10 hours that, significantly slower rate of decay. and after that, rates are almost identical. x 2 (t): Amount of absorption is significan tly Amount of absorption is higher for higher when r = 0.50 after the first 4 r = 0.75 for the first 7 hours, and hours, and this difference increases after that, a significantly lower with time. absorption. x 3 (t): Amount of biotransformation is Amount of biotransformation is significantly lower when r = 0.50 an d significantly higher when r = 0.75. this differences increases with time. 5% to 20% x 1 (t): Rate of decay is significantly slower Rate of decay is significantly slower when r = 0.50 after the first 2 hour s. when r = 0.75 between 2 and 24 hours and after that, the two rates are almost identical. x 2 (t): Amount of absorption is slightly lo wer Amount of absorption is significantly when r = 0.50 between 2 and 10 hours, lower when r = 0.75 all through. and after that, higher amount of absorption. x 3 (t): Amount of biotransformation is Amount of biotransformation is significantly lower when r = 0.50 and al most identical for the first 20 hours this differences increases with time. and after that, it is slightly higher for r = 0.75. PAGE 145 130 5.4.2 Summary of the Effects of the Standa rd Deviation and the Correlation Structure on the Behaviors of x 1 (t), x 2 (t) and x 3 (t) Drug Concentration, x 1 (t) For a combination of changes in the sta ndard deviation from 5% to 10% and the correlation from 0.25 to 0.50, the rate of decay is almost identical for the first six hours (delay effect) and then a significantly slower rate of decay. For correlation change from 0.25 to 0.75, rate of decay is faster for the fi rst ten hours, and after that, almost identical. For a combination of changes in the sta ndard deviation from 5% to 20% and the correlation from 0.25 to 0.50, the rate of decay is slower after the first 2 hours, while it is significantly slower between 2 and 24 hours when the correlation changes from 0.25 to 0.75. From these results we can conclude that, for the combination of higher values of the correlation and standard deviation, th e rate of decay of drug concentration is significantly affected as show n in the sample graphs of x 1 (t) in Figure 5.4B. Drug Concentration, x 2 (t) As standard deviation cha nges from 5% to 10% and correlation changes from 0.25 to 0.50, the amount of absorption of th e drug concentration is significantly higher after the first four hours, while for the change in correlation from 0.25 to 0.75 it is higher for the first 7 hours (delay eff ect) and after that the amount of absorption is significantly lower. For a combination of changes in the st andard deviation from 5% to 20% and correlation from 0.25 to 0.50, the amount of absorption is lower between 2 and 10 hours but higher after that time point, whereas for th e change from 0.25 to 0.75, the absorption is lower all through. PAGE 146 131 Hence, there is a significant effect of both standard deviati on and correlation on the absorption rate as displaye d in the sample graphs of x 2 (t) in Figure 5.4C. However, it is important to note this effect on the abso rption rate as it affects such pharmacokinetic parameters as the maximum drug concentr ation, time at which the maximum drug concentration occurs, and the area u nder the concentrationtime profile. Drug Concentration, x 3 (t) For a combination of changes in the standa rd deviation from e ither 5% to 10% or 5% to 20%, and the correlation from 0.25 to 0.50, there is a significantly lower amount of biotransformation, and this difference increases with time. When the correlation changes from 0.25 to 0.75, and standard deviation from 5% to 10%, the amount of biotransformation is significantly higher but for the change in standard deviation from 5% to 20%, the amount of biotransformati on is higher after the first 20 hours after being almost identical during the first 20 hours (delay effect). Thus, higher standard deviation and correla tion values do have a significant effect on the biotransformation of th e drug concentration as depicted by the sample graphs of x 3 (t) shown in Figure 5.4D. PAGE 147 Deterministic Characterizat ion of Drug Concentration M(3, 0.25, 5%)0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 4681012141618202224262830 323436 3840 42 44 464850TimeConcentration x1(t) x3(t) x2(t) Deterministic Characterizat ion of Drug Concentration M(3, 0.50, 10%)0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 024 68 10 121416182022242628303234363840 4244 464850TimeConcentration x1(t) x3(t) x2(t) Deterministic Characterizat ion of Drug Concentration M(3, 0.75, 20%)0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2468 10 12141618202224262830323436 38 404244 464850 TimeConcentration x1(t) x3(t) x2(t) Figure 5.4A Deterministic characterization of drug concentration in all compartments 132 PAGE 148 Deterministic Characterization of Coumermycin A1 Concentration (x1(t)) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 4 6 8 10 1214161820222426283032343638 40 42 44 464850TimeConcentration r0.25_x1(t) r0.50_x1(t) r0.50 x1 ( t ) r0.25x 1 ( t ) Figure 5.4B Deterministic characterization of coumermycin A 1 concentration (x 1 (t)) 133 PAGE 149 Deterministic Characterization of Coumermycin A1Concentration (x2(t)) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 46 8 10 12141618202224262830323436 38 40 42 44464850TimeConcentration r0.25_x2(t) r0.50_x2(t) r0.50x 2 ( t ) r0.25x 2 ( t ) 134 Figure 5.4C Deterministic characterization of coumermycin A 1 concentration (x 2 (t)) PAGE 150 Deterministic Characterization of Coumermycin A1Concentration (x3(t)) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 4 6 8 10 12 14161820222426283032343638 40 42 44 464850TimeConcentration r0.25_x3(t) r0.50_x3(t) r0.50x 3 ( t ) r0.25x 3 ( t ) Figure 5.4D Deterministic characterization of coumermycin A 1 concentration (x 3 (t)) 135 PAGE 151 136 5.5 Stochastic Behavior of the Drug Concentration in Each Compartment 5.5.1 Effects of Varying the Values of th e Standard Deviation a nd Correlation Structure on the Deterministic (x i (t)) and Stochastic (E[ x i (t)]) Behaviors of the Individual Compartments The stochastic results of the drug concentration behavior in each of the three compartments are summarized in Table 5.5.1 for varying values of correlation structure and varying increase in th e standard deviation. Table 5.5.1 Effects of varying the values of the standard deviation and correlation structure on the deterministic and stochastic behaviors of the i ndividual compartments Change in Standard Change in Correlation Deviation 0.25 to 0.50 0.25 to 0.75 5% to 10% x 1 (t): Both decrease rapidly but after the fi rst Both the stochastic and deterministic 2 hours, the stochastic is consistently behaviors decrease but the stochastic higher. is consistently higher. x 2 (t): Both deterministic and stochastic are Both deterministic and stochastic are almost identical for the first 4 hou rs, almost identical although the and after that, stochastic is significantly deterministic is slightly higher for lower. the first 6 hours, and after that, stochastic is significantly higher. x 3 (t): Stochastic characterization is higher Deterministic trajectory is higher and this difference increases with time. this difference increases with time. 5% to 20% x 1 (t): Stochastic characterization is sign ificantly Stochastic characterization is slower than the deterministic trajectory. significantly slower than the deterministic trajectory. x 2 (t): Deterministic is higher for the first 8 Deterministic is higher for the first 6 hours and after that, the stoc hastic is hours and after that, the stochastic is significantly higher and this differe nce significantly higher and this increases with time. difference increases with time. x 3 (t): Deterministic trajectory is significantly Deterministic is significantly higher higher than the stochastic characterizatio n than the stochastic characterization after the first 6 hours, and this differe nce after the first 4 hours, and this increases with time. difference increases with time. PAGE 152 137 5.5.2 Summary of the Effects of the Standa rd Deviation and the Correlation Structure on the Deterministic and the Stochastic Characterizations Deterministic and Stochastic Characterizations of x 1 (t) For a combination of changes in standa rd deviation from 5% to 10% and the correlation from 0.25 to 0.50 (or 0.25 to 0.75), the stochastic characterization has a rate of decrease that is consistently slower than that of the deterministic trajectory. This slower rate of decrease of the stochastic characterization is even more pronounced as the standard deviation changes from 5% to 20% and correlation changes from 0.25 to 0.50 (or 0.25 to 0.75). This can be seen from the representative graph presented in Figure 5.5A. More precisely, in Figure 5.5A, we note th at after approximately 3.5 units of time, there is a difference between the determinis tic and stochastic behaviors of the drug concentration in compartment one. Deterministic and Stochastic Characterizations of x 2 (t) As the standard deviation changes from 5% to 10% and correlation changes from 0.25 to 0.50, from both characterizations be ing almost identical for the first 4 hours (delay effect), the stochastic is significantly lower than the deterministic after that point of time. But when the correlation changes from 0.25 to 0.75, after being almost identical for the first 6 hours, the stochastic behavior of x 2 (t) is significantly higher than the deterministic. For a combination of changes in standa rd deviation from 5% to 20% and the correlation from 0.25 to 0.50, the determinis tic behavior is higher for the first 8 hours (delay effect), but after that period the stoc hastic characterization is significantly higher with the difference increasing with time. Ho wever, when the correlation changes from 0.25 to 0.75, the deterministic is higher for th e first 6 hours (delay e ffect), and after that, PAGE 153 138 the stochastic behavior of x 2 (t) is significantly higher with the difference increasing with time. This identifies the time delay of init ial drug concentration to reach compartment two. This can be seen from the represen tative graph presented in Figure 5.5B. Deterministic and Stochastic Characterizations of x 3 (t) For a combination of changes in standa rd deviation from 5% to 10% and the correlation from 0.25 to 0.50, the stochastic ch aracterization is significantly higher with the difference increasing with time while, for th e change in correlation from 0.25 to 0.75, the deterministic is significantly higher with the difference increasing with time. As the standard deviation changes from 5% to 20% and correlation changes from 0.25 to 0.50, the deterministic is significantl y higher after about approximately the first 6 hours (delay effect), and this difference incr eases with time. The same behavior repeats but after about approximately the first 3 hours when the corre lation changes from 0.25 to 0.75. This can be seen in the representa tive graph presented in Figure 5.5C. This identifies the time delay of the drug concentration to reach compartment three from two. Thus, the deterministic and the stochastic characterizations of the drug concentrations x 1 (t), x 2 (t), and x 3 (t) are significantly affected by the variations in the values of the standard deviation and the corr elation structure. This significance in the effect is evident, especially, for the highe r values of the standard deviation and the correlation structure. PAGE 154 Stochastic and Deterministic Characterization of Drug Concentration in Compartment 1 along with Standard Deviation 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 4 6 8 10 1214161820222426283032343638 40 42 44 464850 TimeConcentration x1(t) E[x1(t)] S.D.[x1(t)] x 1 ( t ) E [ x 1 ( t ) ] S.D. [ x 1 ( t )] Figure 5.5A Deterministic and st ochastic characterizations of x 1 (t) as correlation changes from 0.25 to 0.75, and standard deviation, from 5% to 20%. 139 PAGE 155 Stochastic and Deterministic Characterization of Drug Concentration in Compartment 2 along with Standard Deviation 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 4 6 8 10 1214161820222426283032343638 40 42 44 464850 TimeConcentration x2(t) E[x2(t)] S.D.[x2(t)] x 2 ( t ) E [ x 2 ( t ) ] S.D. [ x 2 ( t )] Figure 5.5B Deterministic and st ochastic characterizations of x 2 (t) as correlation changes from 0.25 to 0.75, and standard deviation, from 5% to 20%. 140 PAGE 156 Stochastic and Deterministic Characterization of Drug Concentration in Compartment 3 along with Standard Deviation 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0 2 4 6 8 10 1214161820222426283032343638 40 42 44 464850 TimeConcentration x3(t) E[x3(t)] S.D.[x3(t)] x 3 ( t ) E [ x 3 ( t ) ] S.D. [ x 3 ( t )] Figure 5.5C Deterministic and st ochastic characterizations of x 3 (t) as correlation changes from 0.25 to 0.75, and standard deviation, from 5% to 20%. 141 PAGE 157 5.6 Simulation of Rate Constants Usin g Trivariate Exponential Distribution 5.6.1 Introduction Except for the normal distribution, mu ltiv ariate extensions of univariate distributions are not obvious. The normal is simple because the first and the second moments extend in an obvious way, and they completely determine the multivariate normal distributions. There are various ways to de ne a m ultivariate exponential distribution. Marshall and Olkin de ned a multivariate exponential distribut ion that arises naturally in modeling observable processes, in terms of Poisson s hocks. The idea is to consider a system of d components that experience shocks that follo w independent Poisson processes. There are d Poisson processes that aect only one component; there are that aect two components simultaneously; and so on; and one that aects all d components simultaneously. 2 d Each component is therefore aected by 2d 1 Poisson shock processes. The element of the multivariate exponential random variable is the interval between two successive shocks experienced by the component; that is, it is the minimum of independent exponential random variables. The exponential rate parameter for the element is the sum of the rates of the Poisson shock processes that aect the component. The magnitudes of the rates of the processes that aect both the and components simultaneously determ ine the covariance between the and elements of the multivariate exponential random variable. Marshall and Olkin discuss a number of thithi12dthithithithjthithj 142 PAGE 158 interesting properties of this distribution, including the fact that it, like the univariate exponential distribution, is memoryless. For a dvariate exponential random variable, choose 2 d 1 exponential random variables with rate parameters 1 ,..., d 12 ,..., 1d 23 ,..., 2d ,..., 123 ,..., 12d ,..., 12d We generate 2 d 1 exponential random variates t 1 ,...,t d ,t 12 ,...,t 1d ,t 23 ,...,t 2d ,...,t 123 ,...,t 12d ,...,t 12d independently from these distributions. Then, for the element, we take x thi i as the minimum of all of the t s with a subscript containing i. For example, x 1 = min{t 1 ,...,t d ,t 12 ,...,t 1d ,t 123 ,...,t 12d ,...,t 12d }. The MarshallOlkin dvariate exponential distributi on can also be expressed in terms of conditional distributions. A complication, however, is the fact that this multivariate distribution has regions of positive probability for which the Lebesgue measure is 0. (These co rrespond to events that aect two or more components simultaneously.) Dagpunar (1988) describes a method based on conditi onal distributions. The discontinuities require sp ecial handling, and so the method only works well for small values of d. Based on the above, for the trivariate cas e, that is, for a 3variate exponential random variable, choose exponential random variables with rate parameters 7123 and ,,,,,,123 231312321 The value of 1 is taken as the average of values 12k 143 PAGE 159 measured on four patients. Similarly, the values for 3 2 and are the average of values and the average of values, respectively measured on the same four patients. 21k 23k The value for 12 is computed as the average of the values of 2 1 and ; the value of 13 is computed as the average of 3 1 and ; the value of 123 is computed as the average of 3 21 and Generate exponential random variates For the element, take x 7 123 and ,,,,,,123 231312321ttttttt thi i as the minimum of all the t s with subscript containing i That is, 23 12323133 3 21 12323122 2 12 12313121 1},,,min{ },,,min{ },,,min{ kttttx kttttx kttttx The rate constants that follow trivariate e xponential distribution ar e generated using the programming codes. 5.6.2 Discussion of Results The values of the rate constants generated by the exponential probability distributions are such small values for the majority of the simulations that the overall behaviors of the drug concentra tion in the three compartments do not reflect the expected behavior of the system. The sample graphs presenting the overall behavior of the system are shown in Figure 5.6A when the initial conditions for the system of random differential equations are 0,0,13 21 ccc Figure 5.6B and Figure 5.6C show similar sample graphs displaying the overall behavior of the system when the initial conditions are and 05 .0,05.0,13 21 ccc ,10.0 and ,10.0,13 21 c cc respectively. 144 PAGE 160 Thus, this approach that uses the rate constants generated from the exponential probability distribution in solving the system of random differential equations resulted in a poor and inconsistent characterization of the coumermycin A 1 drug concentrations as shown in the graphs 5.6A through 5.6C. This is a strong indication of the unsuitability of the use of exponential probabi lity distribution to simula te the rate constants. Characterization of Drug Concentration (c1= 1, c2= 0, c3= 0)0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 02468101214161820222426283032343638404244464850 TimeConcentration X1(t) X2(t) X3(t) Characterization of Drug Concentration (c1= 1, c2= 0, c3= 0)0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 02468101214161820222426283032343638404244464850 TimeConcentration X1(t) X2(t) X3(t) Figure 5.6A Graphs of the drug concentration behaviors in the three compartments of the two simulations when the initial conditions are c 1 =1, c 2 =0, and c 3 =0. 145 PAGE 161 Deterministic Characterizat ion of Drug Concentration (c1 = 1, c2 = 0.05, c3 = 0.05) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 02468 1012 14161820222426283032343638404244 464850TimeConcentration x1(t) x2(t) x3(t) Deterministic Characterizat ion of Drug Concentration (c1 = 1, c2 = 0.05, c3 = 0.05) 00 0.1 02 03 0.4 05 06 0.7 08 09 10 02468101214161820222426283032343638404244464850TimeConcentration x1(t) x2(t) x3(t) Figure 5.6B Graphs of the drug concentration beh aviors in the three compartments of the two simulations when the initial conditions are c1 = 1, c2 = 0.05, and c3 = 0.05. 146 PAGE 162 Deterministic Characterizat ion of Drug Concentration (c1 = 1, c2 = 0.10, c3 = 0.10) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 024681012 1416182022242628303234363840 4244 464850TimeConcentration x1(t) x2(t) x3(t) Deterministic Characterizat ion of Drug Concentration (c1 = 1, c2 = 0.10, c3 = 0.10) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 02468 1012 14161820222426283032343638 404244 464850TimeConcentration x1(t) x2(t) x3(t) Figure 5.6C Graphs of the drug concentration behaviors in the three compartments of the two simulations when the initial conditions are c 1 = 1, c 2 = 0.10, and c 3 = 0.10. 147 PAGE 163 148 5.7 Conclusion We have presented a deterministic and st ochastic analytical formulation of a threecompartment open pharmacokinetic system that describes the behavior of the coumermycin A 1 concentration in each compartment. A stochastic comparison with the deterministic characterization of the drug con centration behavior in each compartment is given. The results clearly identify the depe ndence effect that exists among the three compartmental structure of the antibiotic. Using as the basic structure the actual data that was collected from administering coumermycin A 1 to four different patients, we studied numerically the deterministic and stochastic behaviors. As expect ed, stochastic version uniformly is different than that of its deterministic counterpart. Our numerical study included the behavior of the drug concentration as we vary the correlation stru cture from small, medium to high and also varying the variance from small, medium to high. Thus, we recommend that the stochastic formulation and an alysis to be more appropriate for decision making for the antibiotic drug, coumermycin A 1 commonly used for the Lyme disease. PAGE 164 149 Chapter Six Statistical Modeling of a Ph armacokinetic System Using a System of Delay Random Differential Equations 6.0 Introduction Pharmacokinetics is a discipline that uses mathematical models to describe and predict the timecourse of drug concentra tions in body fluids. Our ability to apply pharmacokinetic principles has vastly improved in the past few decades as a result of advances in analytical chemistry techniques, the development of highly sensitive methods of quantitation of drug concentrations in plasma and tissues, and the availability of iterative computer methods. Pharmacodynamics is a discipline that st udies the time course and intensity of drug action or response. The ability to unders tand and predict indi vidual differences in drug response is of critical importance. A dvances in the clini cal applications of pharmacodynamics are due to improved prec ision and objectivity in methods for measuring human drug response. The ki neticdynamic modeling which uses mathematical methods to link drug concentrations directly to c linical effect has contributed to this advancement (Pharmac okinetics and Pharmacodynamics, David J. Greenblatt, Jerold S. Harmatz, Lisa L. von Moltke, and Richard I. Shader). In any pharmacokinetics or pharmacodynamics study when a process that develops in the body over a course of time, such as drug absorption, drug dissolution, drug bioavailability, etc. is delayed, then th is delay, referred to as time delay, is an PAGE 165 150 important parameter of the process. In such cases, it is necessary to build a structured model of a pharmacokinetic system with time delays. The time delay is independent of the method used to study a particular process, and signifies the delay between the time of dosing and time of appearance of concentra tion in the sampling compartment. The delay in drug response may complicate drug monitori ng. For example, digoxins effect on the heart is delayed because the dr ug requires time to be distri buted to the active site, and hence, digoxin concentrations should not be measured with in six hours of a dose, even after intravenous administration. Before that time, the plasma concentration does not reflect the concentration at the active site [93]. In the previous chapter, the focus wa s to investigate th e drug concentration behavior in a threecompartment open pha rmacokinetic model which describes the disposition of an antibiotic drug coumermycin A 1 We studied a system of nondelay random differential equations representing this model. The three rate constants that were used in the system of differential equations were simulated using the trivariate truncated normal probability distribution. The initial values of the rate constant s that were used in the simulation were calculated from the pharmacokinetic profile of coumermycin A 1 determined in four human subjects based on the serum level data obtained from the report of a clinical study. The extensiv e numerical solutions for the system of nondelay random differential equations under different combin ations of the covariance structure and the initial conditions were developed. Numerical comparisons of the determin istic characterizations of the drug concentration as a function of time of the individual compartments to study the effect of various combinations of the covariance st ructure and the initial conditions on these PAGE 166 characterizations were presented. A similar comparison between the deterministic and the stochastic characterizati ons was also presented. 6.1 Focus of Chapter Six The focus of the present chapter is to inco rporate the time delay into the system of random differential equations that was consider ed in the previous chapter, and develop numerical solutions for the system which describes the dispositi on of coumermycin A 1 Schematically, the open threecompartment pha rmacokinetic model with the time delay components that we are studying is illustrated in Figure 6.2.1. More precisely, the numerical soluti ons for the system of delay random differential equations will be obtained for diffe rent sets of constant time delay values of 1 2 and 3 but under the same combinations of th e covariance structur e and the initial conditions that were described in the previous chapter. The rate constants used in the system are same as those simulated using th e trivariate truncated normal distribution as described in the previous chapter. From the previous chapter we observed a time delay of 3.5 hours for the drug to reach compartment two from compartment one, a time delay of 6 hours for the drug to reach compartment one from compartment two, and a time delay of 3 hours for the drug to reach compartment three from compartment two. In the present chapter we use these time delays hhh 3,6,5.33 2 1 to obtain numerical solutions for the system of delay differential equations and study its impact on the rate of decay, absorption and biotransformation of the drug concentration. In addition we consider other se ts of constant time delay values to study the behavior of the drug concentration unde r different time delays. 151 PAGE 167 152 For all 27 configurations shown in Fi gure 6.1.1 the numerical solutions for the system of delay random differential equatio ns are obtained based on 1000 simulations and using each of the following four sets (referred to as DT1, DT2, DT3, and DT4) of constant time delay values in hours: DT1. vector( 1 2 3 ) = (0.5, 1, 1.5) DT2. vector( 1 2 3 ) = (1, 2, 3) DT3. vector( 1 2 3 ) = (3.5, 6, 3) DT4. vector( 1 2 3 ) = (4, 6, 8) At first we present the discussion of the eff ect of each of the four sets of constant time delay values on the overall behavior of the drug concentr ation in all three compartments. Secondly, the comparison of the determinis tic behavior of the drug concentration with and without time delay for each compartment will be discussed. Finally, the effects of di fferent sets of constant time delay values on the deterministic behavior of the drug concentr ation and the mean behavior of the random solutions as a function of time will be discussed. PAGE 168 153 PAGE 169 6.2 Pharmacokinetic Model The pharmacokinetic model with the delay components is as follows: Compartment 1 plasma water Compartment 2 extracellular water and tissues Compartment 3 biotransformation and excretion x1(t) x2(t) x3(t)12312k21k23k Figure 6.2.1 Model for the disposition of coumermycin A 1 The above model is described by the fo llowing system of random differential equations: 3223 3 322322211112 2 22211112 1 txktx txktxktxktx txktxktx with the initial conditions 3 32 21 10 ,0 ,0cxcxcx ; txtxtx3 21 and , are the amounts of the drug concentra tion in compartments 1, 2, and 3, respectively, at time ;0 t 1 2 and 3 are the time delays before the drug concentration reaches compartment 2 from compartment 1, compartment 1 from compartment 2, and compartment 3 from compartment 2, respectively; are the rate constants of the system. 0 and ,,23 2112kkk 154 PAGE 170 In the matrix form, the above system of differential equations can be written as follows: 32 22 11 23 23 21 12 2112 3 2 1 0 0 0 tx tx tx k kkk kk tx tx tx tx with initial conditions 3 2 1c c c C In general, solutions of this type of equa tions are difficult to fi nd in closed forms. Using an integral approach, integrati ng each side of the equations, we have, 1 2221 1112 1mdttxkdttxktx 2 3223 2221 1112 2mdttxkdttxkdttxktx 3 3223 3mdttxktx where are constants of integration. By knowing the values of each of the functions prior to 3 ,21 and mmm 0 it this system can be solved using the method of steps. However, for the present study the numerical solutions to the above system of delay differential equations were obtaine d through a computer program written in Mathematica 5.1 and SAS 9.1 computer languages. Both programs produced identical results. In section 6.3, we present the discussi on of the numerical results along with the representative graphs, followed by conclusion in section 6.4. 155 PAGE 171 6.3 Discussion of the Results 6.3.1 Overall Behavior of the Drug Con centration in all Three Compartments The graphs presented in Figure 6.3.1 show the deterministic behavior of coumermycin A 1 concentration in all three compartm ents for the first 50 hours under four different sets of constant time delay values th at are described in sec tion 6.1. The behavior of the drug concentration in all three compartments is as expected. The drug concentration in the first compartment ( plasma water) decays rapidly as indicated by the graph of after the initial delay, while th e drug concentration in the second compartment ( extracellular water and tissues ) increases during the first eight to ten hours as indicated by the graph of tx1 tx2 before it starts decaying. The peak values are slightly higher than those for the nondelay models of the previous chapte r. The concentration behavior in the third compartment ( biotransformation and excretion ) increases exponentially as indicated by the graph of tx3 Across all models this overall behavior of the drug concentration in all three compartments is similar. Deterministic Characterizations of Drug Concentration in Compartments 1, 2 and 3 (M(3, 20%, 0.75); Delay: t1 = 0.5 h, t2 = 1 h, t3 = 1.5 h)0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration x3(t) x1(t) x2(t) 156 PAGE 172 Deterministic Characterizations of Drug Concentration in Compartments 1, 2 and 3 (M(3, 20%, 0.75); Delay: t1 = 1 h, t2 = 2 h, t3 = 3 h)0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration x2(t) x1(t) x3(t) Deterministic Characterization of Drug Concentration in Compartments 1, 2 and 3 (M(3, 20%, 0.75); Delay: t1 = 3.5 h, t2 = 6 h, t3 = 3 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration x1(t) x3(t) x2(t) Deterministic Characterization of Drug Concentration in Compartments 1, 2 and 3 (M(3, 20%, 0.75); Delay: t1 = 4 h, t2 = 6 h, t3 = 8 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration x1(t) x3(t) x2(t) 157 Figure 6.3.1 Deterministic characterizations of drug concentration in all 3 compartments PAGE 173 6.3.2 Comparison of Deterministic Behavi or Between Delay and NonDelay Models Deterministic Behavior of Drug Concentration, x 1 (t) The difference in the rate of decay in the drug concentration in compartment one due to a small delay of 0.5 hour is almost ne gligible during the firs t 12 hours, while this difference increases as the time delay increases from 0.5 to 1 hour. The deterministic behaviors of the drug concentration in compartment one for both delay and nondelay cases are almost identical after about th e first 12 hours when the time delay in compartment one is small ( 1 = 0.5 or 1 h) compared to th e length of time (50 hours) the drug concentrations are being monitored. Th ese can be observed from the graphs shown in the top two panels of Figure 6.3.2A. As the time delay increases from 1 hour to 3.5 or 4 hours, not only the difference in the rate of decay of drug concentrati on in compartment one is highly significant compared to the nondelay profile but also takes longer period of time, beyond 24 hours, before the difference between the two profiles become negligibly small. For the case of ,5.31h the graph is shown in the bo ttom panel of Figure 6.3.2A. Deterministic Behavior of Drug Concentration, x 2 (t) The rate of change in the drug concentration in compartment two is subject to two time delay constants. It takes a time delay of 1 for the drug to reach compartment two from compartment one, and a time delay of 2 for the drug to reach compartment one from compartment two. The difference in the rate of absorption of the drug concentration in compartment two due to a sm all delay of 1 hour is almost negligible during the first 8 hours, while this difference increases as the time delay increases from 1 to 2 hour. The peak value in both cases of time delay is almo st the same. The deterministic behaviors of 158 PAGE 174 the drug concentration in compartment two for both dela y and nondelay cases are almost identical after about the firs t 8 hours when the time delay in compartment one is small ( 1 =1) compared to the length of time (50 hours) the drug concentrations are being monitored. This can be observed from the gr aph shown in the top panel of Figure 6.3.2B. As the time delay increases from 1 hour to 2 or 6 hours, not only the difference in the rate of absorption of dr ug concentration in compartmen t two is significant compared to the nondelay profile but also attains a higher peak value, especially in case of 6hour delay. All through the observation period of 50 hours the am ount of absorption is slightly higher than those observed in its correspondi ng nondelay case. These are displayed in the graphs shown in the middle a nd bottom panels of Figure 6.3.2B. Deterministic Behavior of Drug Concentration, x 3 (t) The rate of change in the drug concentration in compartment three is subject to one time delay constant, that is, the time 3 taken by the drug to reach compartment three from compartment two. Although the amount of biotransformation and excretion in case of a small time delay of 1.5 hours is almost identical to that of the nondelay case, there is a uniformly significant difference in the dete rministic behaviors of for both delay and nondelay as the time delay increases from 1.5 to 3 hours or 3 to 8 hours. These behaviors are shown in Figures 6.3.2C. 159 PAGE 175 Deterministic Characterization of Coumermycin A1 Concentration (x1(t)) (M(3, 20%, 0.75); Delay t1 = 0.5 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Delay Nondelay Deterministic Characterization of Coumermycin A1 Concentration (x1(t)) (M(3, 20%, 0.75); Delay t1 = 1 h) 000 0.10 020 030 0.40 050 0.60 0.70 080 090 100 02468101214161820222426283032343638404244464850 TimeDrug Concentration Delay Nondelay Deterministic Characterization of Coumermycin A1 Concentration (x1(t)) (M(3, 20%, 0.75); Delay t1 = 3.5 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Nondelay Delay Figure 6.3.2A Deterministic characterization of coumermycin A1 concentration (x1(t)) for time delay values 0.5h, 1h, and 3.5h 160 PAGE 176 Deterministic Characterization of Coumermycin A1 Concentration (x2(t)) (M(3, 20%, 0.75); Delay t2 = 1 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Delay Nondelay Deterministic Characterization of Coumermycin A1 Concentration (x2(t)) (M(3, 20%, 0.75); Delay t2 = 2 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Nondelay Delay Deterministic Characterization of Coumermycin A1 Concentration (x2(t)) (M(3, 20%, 0.75); Delay t2 = 6 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Nondelay Delay Figure 6.3.2B Deterministic characterization of coumermycin A 1 concentration (x 2 (t)) for time delay values 1h, 2h, and 6h 161 PAGE 177 Deterministic Characterization of Coumermycin A1 Concentration (x3(t)) (M(3, 20%, 0.75); Delay t3 = 1.5 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Delay Nondelay Deterministic Characterization of Coumermycin A1 Concentration (x3(t)) (M(3, 20%, 0.75); Delay t3 = 3 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Delay Nondelay Deterministic Characterization of Coumermycin A1 Concentration (x3(t)) (M(3, 20%, 0.75); Delay t3 = 8 h) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration Delay Nondelay Figure 6.3.2C Deterministic characterization of coumermycin A 1 concentration (x 3 (t)) for time delay values 1.5h, 3h, and 8h 162 PAGE 178 6.3.3 Comparison of Deterministic and Stoc hastic Behaviors for the Delay Models Deterministic and Stochastic Characterizations of x 1 (t) The deterministic and stochastic characterizations are almost identical during the first 6 hours for the first two sets of time de lay values whereas for the last two sets of delay constants they are almost identical du ring the first 7 hours. But after these time points, the stochastic characterization is uni formly higher than the deterministic for all four sets of time delay values. Despite the varying time delay values the two characterizations are almost identic al during the first 6 to 7 hours. Deterministic and Stochastic Characterizations of x 2 (t) The deterministic and stochastic characterizations are almost identical during the first 10 hours for the first two se ts of time delay values wherea s for the last two sets of delay constants they are almo st identical during the first 12 to 14 hours. Bu t after these time points, the stochastic characterization is significantly higher than the deterministic for all four sets of time delay values. Deterministic and Stochastic Characterizations of x 3 (t) The deterministic and stochastic character izations are identical during the initial time delay values for all four sets of time delay values. But after these time points, the stochastic characterization is significantly highe r than the deterministic in all four cases, and the difference increases with time. Three panels that are displayed in Figure 6. 3.3A show the representative graphs of the deterministic and stochastic characterizati ons of the drug concentr ation for the case of time delay constants, h hh 3 and ,6 ,5.33 2 1 Figures 6.3.3B 6.3.3D show both characterizations of all three compartments and for all four sets of time delay values. 163 PAGE 179 Stochastic and Deterministic Characterizations of Drug Concentration in Compartment 1 with Standard Deviation: M(3, 20%, 0.75) and Delay: t1 = 3.5 h, t2 = 6 h, t3 = 3 h0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration E[x1(t)] x1(t) S.D.[x1(t)] Stochastic and Deterministic Characterizations of Drug Concentration in Compartment 2 with Standard Deviation: M(3, 20%, 0.75) and Delay: t1 = 3.5 h, t2 = 6 h, t3 = 3 h0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration E[x2(t)] x2(t) S.D.[x2(t)] Stochastic and Deterministic Characterizations of Drug Concentration in Compartment 3 with Standard Deviation: M(3, 20%, 0.75) and Delay: t1 = 3.5 h, t2 = 6 h, t3 = 3 h0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 02468101214161820222426283032343638404244464850 TimeDrug Concentration x3(t) E[x3(t)] S.D.[x3(t)] Figure 6.3.3A Stochastic and deterministic ch aracterizations of drug concentration in all three compartments with standard deviation for time delay values: t 1 =3.5h, t 2 =6h, t 3 =3h 164 PAGE 180 Deterministic Characterization of Coumermycin A1 Concentration ( x1(t)) for All Four Sets of Time Delay Values in Hours: DT1 = (0.5, 1, 1.5), DT2 = (1, 2, 3), DT3 = (3.5, 6, 3), and DT4 = (4, 6, 8)000 010 020 030 040 050 060 070 080 090 100 02468101214161820222426283032343638404244464850 TimeDrug Concentration DT2 DT4 DT3 DT1 Stochastic Characterization of Coumermycin A1 Concentration (E[x1(t)]) for All Four Sets of Time Delay Values in Hours: DT1 = (0.5, 1, 1.5), DT2 = (1, 2, 3), DT3 = (3.5, 6, 3), and DT4 = (4, 6, 8)000 010 020 030 040 050 060 070 080 090 100 02468101214161820222426283032343638404244464850 TimeDrug Concentration DT2 DT1 DT4 DT3 Figure 6.3.3B Deterministic (x1(t)) and stochastic (E(x1(t)) characterizations of coumermycin A1 drug concentration for all four se ts of time delay values in hours 165 PAGE 181 Deterministic Characterization of Coumermycin A1 Concentration (x2(t)) for All Four Sets of Time Delay Values in Hours: DT1 = (0.5, 1, 1.5), DT2 = (1, 2, 3), DT3 = (35, 6, 3), and DT4 = (4, 6, 8)000 010 020 030 040 050 060 070 080 090 100 02468101214161820222426283032343638404244464850TimeDrug Concentration DT1 DT2 DT3 DT4 Stochastic Characterization of Coumermycin A1 Concentration (E[ x2(t)]) for All Four Sets of Time Delay Values in Hours: DT1 = (0.5, 1, 1.5), DT2 = (1, 2, 3), DT3 = (3.5, 6, 3), and DT4 = (4, 6, 8)000 010 020 030 040 050 060 070 080 090 100 02468101214161820222426283032343638404244464850TimeDrug Concentration DT1 DT2 DT3 DT4 Figure 6.3.3C Deterministic (x2(t)) and stochastic (E(x2(t)) characterizations of coumermycin A1 drug concentration for all four se ts of time delay values in hours 166 PAGE 182 Deterministic Characterization of Coumermycin A1 Concentration (x3(t)) for All Four Sets of Time Delay Values in Hours: DT1 = (0.5, 1, 1.5), DT2 = (1, 2, 3), DT3 = (3.5, 6, 3), and DT4 = (4, 6, 8)000 010 020 030 040 050 060 070 080 090 100 02468101214161820222426283032343638404244464850TimeDrug Concentration DT1 DT2 DT3 DT4 Stochastic Characterization of Coumermycin A1 Concentration (E[x3(t)]) for All Four Sets of Time Delay Values in Hours: DT1 = (0.5, 1, 1.5), DT2 = (1, 2, 3), DT3 = (3.5, 6, 3), and DT4 = (4, 6, 8)000 010 020 030 040 050 060 070 080 090 10002468101214161820222426283032343638404244464850TimeDrug Concentration DT1 DT2 DT3 DT4 Figure 6.3.3D Deterministic (x3(t)) and stochastic (E(x3(t)) characterizations of coumermycin A1 drug concentration for all four se ts of time delay values in hours 167 PAGE 183 168 6.4 Conclusion In this study we have presented a dete rministic and stochastic formulation of a threecompartment open pharmacokinetic syst em with predetermined constant time delays incorporated in the system that de scribes the behavior of the coumermycin A 1 in each compartment. We have presented the e ffects of constant time delays on the drug concentration behavior in each compartment. We studied numerically the deterministic behavior of delay and nondelay systems, and performed the comparison of the drug concentration behaviors. The re sults clearly indicate the dependence effect that exists among the compartments when the time delays are larger. We studied numerically the deterministic and stochastic behaviors of the delay system. As expected the stochastic version uniformly is different than that of its deterministic counterpart, especially at larger values of the time delays. Our numerical study included the behavior of the drug concen tration as we vary time delay values from small, medium to high. Understanding the time delay th at exists between the time of administration of the drug and the first time point at which drug c oncentration is observe d in the system is crucial in drug monitoring process. PAGE 184 Chapter Seven Future Research Studies 7.0 Possible Extensions of the Present Research Studies In this chapter we shall pose some possible extensions of the present research. In chapter two, we have used the Generalized Extreme Value (GEV) distribution for statistical modeling of the annual monthl y maximum rainfall from 44 locations in the State of Florida. Using the model parameters we obtain estimates and confidence intervals for return levels of various return periods. We fu rther classify all forty four locations into clusters based on the similarity profiles. It would be of interest to develop a statistical model that identifies the contributory independent va riables that affect the rainfall characterizations in the State of Florida. For example, to determine if su ch independent variables as humidity, air temperature, altitude, longitude, wind velocity, air pressure, among others are specifically statistically significantly contributing to the amount of rainfall in Florida. Having established such models which ma y be nonlinear in nature with a high degree of accuracy, namely, high (greater than 0.85) r 2 and adjustedr 2 we want to look at similar models for regional segments in the United States and furthermore, how such models relate to nontropical regions such as desert regions in the United States. In chapters three and four, we have utilized the family of extreme value theory to statistically model the ma ximum drug concentration maxC a critical pharmacokinetic 169 PAGE 185 parameter in the drug development process in a clinical trial, for predictive purposes. This is the first application of extreme value theory to model maxC Extreme maximum drug concentration levels of any drug in a human body can cause toxicity, inefficaciousness, or even se rious health problems. Hence, it is of paramount importance that a model, incorporating such covariates as age, gender, presence of other diseases and use of othe r concurrent medications, among others, be developed. Using such models, one can predict the maximum dr ug concentration levels in a patient whose profile is known. This can revolutionize the concep t of individualized, optimal patient care. For the problem of statistical modeling of a threecompartment pharmacokinetic system using a system of random differential equations, from chapters five and six, we propose to investigate the id entification of interactions among the compartments. Furthermore, we want to determine if a markovian approach to modeling the threecompartment model is meaningful to get a better understand ing of the drug. In addition, we like to determine if c onnecting the third and first compartment will provide more information about th e drug disposition. This will require the modification of the stochastic system process to a system of four differential equations. Also, the question of interest is to investigate the stochas tic system from a nonlinear perspective and determine and ju stify in deviating from a linear to nonlinear statistical characterization of the behavior of the drug, coumermycin A 1 170 PAGE 186 171 List of References 1 Aarssen, K and De Haan, L. (1994): On the maximal life span of humans, Mathematical Population Studies, 4, 259281. 2 Abdallah, H.Y. and Ludden, T.M. (1995): A spreadsheet program for simulation of bioequivalence and bioava ilability studies. Comput Biol Med, 25: 349354, 3 Achcar, J.A., Bolfarine, H. and Peri cchi, L.R. (1987). Transformtaion of survival data to an extreme value distribution, Sta tistician 36, 229234. 4 Aithal, G, Day, P and Daly, K.A.: Association of polymorphisms in the cytochrome P450 CYP2C9 with Warfar in does requirement and risk of bleeding complications. The Lancet, Volume 353, Issue 9154, Pages 717719. 5 Artzner P., F. Delbaen, J.M. Eber & D. Heath (1997). Thinking coherently. Risk 10, 6871. 6 Artzner P., F. Delbaen, J.M. Eber & D. Heath (1999).Coherent measures of risk. Mathematical Finance 9, 203228. 7 Balkema, A., and L. de Haan (1974). Residual life time at great age Annals of Probability, 2, 792804. 8 Behr, R.A., Karson, M.J. and Minor, J. E. Reliabilityanalysis of window glass failure pressure data, St ructural Safety, 1991, 11, 4358. 9 Benet L.Z. and Goyan, J.E. Bioequivalence and narrow therapeutic index drugs. Pharmacotherapy 1995, 15:433440 10 Beran, M., Hosking, J.R.M. and Ar nell, N. (1986). Comment on Twocomponent extreme value distribution for flood frequency analysis by Fabio Rossi, Mauro Fiorentino, Pasquale Versace, Water Resources Res 22, 263266. 11 BestFit, Version 4.5.4. Probability Distri bution Fitting software for Microsoft Windows, Palisade Corporation. PAGE 187 172 12 Bortkiewicz, L., von (1922). Varia tionsbreite und mittlerer Fehler, Sitzungsber Berli. Math. Ges. 21, 311. 13 Bourne, David W.A.: Mathematical Modeling of Pharmacokinetic Data. Technomic Publishing Company, 1995. 14 Bourne, David W.A.: Re: Pharmacokinetic Software [Online] 1996 July 23 PAGE 188 173 25 De Haan, L. (1970). On Regular Variation and its Application to the Weak Convergence of Sample Extremes Mathematical Centre Tracts 32, Mathematisch Centrum, Amsterdam. 26 Deshotels, B. and Fitzgerald M. (200 1). Designing for extreme temperatures, Chance, vol. 14, no. 3, Summer issue. 27 Dodd, E.L. (1923). The greatest and least variateunder general laws of error, Trans. Amer. Math. Soc. 25, 525539. 28 Erdman, D.: A study of kinetics: The es timation and simulation of systems of firstorder differential equations SAS Institute Inc., Cary, NC. 29 Farago, T. and Katz, R. (1990): Extrem es and design values in climatology, World Meteorological Organization WCAP14, WMO/TDNo. 386. 30 FDA Guidance for Industry Bioavaila bility and Bioequiva lence Studies for Orally Administered Drug Produc ts General Considerations. 31 Fisher, R.A. and Tippett, L.H.C. (192 8). On the estimation of the frequency distributions of the largest or smallest member of a sample. Procedding of the Cambridge Philosophical Society 24, 180190. 32 Folland, C.K., Karl, T.R. (2001): Observed climate variability and change. In: Houghton JT et al., editors. Climate ch ange 2001: the scientific basis. Cambridge: Cambridge University Press; 2001. p.99181. 33 Frechet, M. (1927). Sur la loi de prob abilite de lecart maximum, Ann. Soc. Polon. Math. Cracovie 6, 93116. 34 Friesen, M.H. and Walker, S.E. Ar e the current bioequivalence standards sufficient for the acceptance of narrow th erapeutic index drugs? Utilization of a computer simulated Warfarin bioequivalence model. J. Pharm Pharmaceut Sci, 2(1): 1522, 1999. 35 Fuller, W.E. (1914). Flood flows, Trans Amer. Soc. Civil Engineers 77, 564. 36 Gabrrielsson, J. and Weiner, D. Pharmacokinetic and Pharmacodynamic Data Analysis: Concepts and Applicati ons. Swedish Pharmaceutical Press, Stockholm, 1994. 37 Garrett, E.R.: Antibiot. Chemotherapia, 12, 227 (1964) 38 Gibaldi, M. and Perrier, D. Pharmac okinetics, Second Edition, Marcel Dekker, Inc. New York, 1982. PAGE 189 174 39 Gibaldi, M., Levy, G. and Wein traub, H.: Drug distribution and pharmacological effects, Clin. Pharmacol. Therap., 12, 1971, pp. 734742. 40 Gibaldi, M. and Levy, G.: Dosedepe ndent decline of pha rmacologic effects of drugs with linear pharmacokinetic characteristics, Journal of Pharmaceutical Sciences, Vol. 61, No. 4, April 1972. 41 Gilleland, E., Katz, R. and Young, G. Ex tremes Toolkit: Weather and Climate Applications of Extreme Value Statistics. 42 Gnedenko B.V. (1941). Limit theorems fo r the maximal term of a variational series. Comptes Rendus de lAcadem ie des Sciences delURSS 32, 79. 43 Gnedenko, B.V. (1943), Sur la distribution limite du terme maximum d'une serie aleatoire Annals of Mathematics, 44, 423453. Translated and reprinted in: Breakthroughs in Statistics Vol I, 1992, eds. S. Kotz and N.L. Johnson, SpringerVerlag, pp. 195225. 44 Griffith, A.A. (1920). The phenomena of rupture and flow in solids, Philos. Trans. Roy. Soc. London A, 221, 163198. 45 Gumbel E.J. (1958). Statistics of Extr emes. Columbia Univ. Press., New York. 46 Gumbel, E.J. (1954). Statistical Theory of Extrem e Values and Some Practical Applications National Bureau of Standards, Applied Mathematics Series, Vol.33. 47 Gurwitz JH, Avorn J, RossDegnan D, Choodnovskiy I, Ansell J. Aging and the anticoagulant response to wa rfarin therapy. Ann Intern Med 1992;116:9014. 48 Haines, S.T., Reflection on generic Warfarin. Am J Health Syst Pharm, 55:72933; 1998 49 Harlow, D.G., Smith, R.L. and Taylor, H.M. Lower tail analysis of the distribution of the strength of load sharing systems, Journal of Applied Probability 1983, 20, 358367. 50 Harrison L, Johnston M, Massicotte MP Crowther M, Moffatt K, Hirsh J. Comparison of 5mg and 10mg loading doses in initiation of warfarin therapy. Ann Intern Med 1997;126:1336. 51 Holman DJ. 2003.mle: a programming language for building likelihood models. PAGE 190 175 52 Horton J.D. and Bushwick, B.M. Warfar in Therapy: Evolving strategies in anticoagulation. Published by American Academy of Family Physicians, February 1999. 53 Ihaka, R. and Gentleman, R. (1996) R: A language for data analysis and graphics. Journal of Computationa l and Graphical Statistics, 5, 299314. 54 Jenkinson, A.F. (1955). The frequency distribution of the annual maximum (or minimum) values of meteorological events. Quarterly Jo urnal of the Royal Meterorological Society 81, 158172. 55 Johnson, M.E. (December 1997). Caribbean Storm Surge Return Periods: Final Report 56 Johnson, M.E. and Watson, Jr. C.C. (1999): Hurricane return period estimation, 10 th Symposium on Global Change Studies, 1015 Jan 1999. Dallas, Texas. 57 Kaplan, S.T.: Pharmacokinetic profile of Coumermycin A, Journal of Pharmaceutical Sciences, Vol. 59, No. 3, 1970, pp. 309313. 58 Karl, T.R., Knight, R.W. (1998). Secu lar trends of precipitation amount, frequency, and intensity in the United States. Bull Am Meterol Soc 1998; 79:23141. 59 Katz, R.W., Parlange, M.B. and Naveau, P. (2002). Statistics of extremes in hydrology, Advances in Water resources, 25: 12871304. 60 Kotz S. and Nadarajah S. (2000). Extreme Value Distributions, Theory and Applications. Imperial College Press. 61 Lawless, J.F. Statistical Models and Methods for Lifetime Data. Wiley series in Probability and Mathematical Statistics. 62 Lee, R.U. Statisticalanalysis of co rrosion failures of leadsheathed cables, Materials Performance 1992, 31, 2023. 63 Leslie, P.H.: A stochastic model for studying the properties of certain biological systems by numerical me thods, Biometrika, 45, 1958, pp. 1631. 64 Levy G. and Jusko, W.J.: Multicompa rtment pharmacokinetic models and pharmacological effects, Journal of Ph armaceutical Sciences, Vol. 58, No. 4, April 1969. PAGE 191 176 65 Longin M.F. (1996). The asymptotic di stribution of extreme stock market returns, The Journal of Business, Volume 69, Issue 3 (July, 1996), 383408. 66 Maple 9, Mathematical Software Package, Maplesoft. 67 Maximizing value from integrated PKPD Prediction. Bayer Technology Service. http://www.bioinformaticsforumuk.net/ oxbf/images/bioconvergence/talks/Joer g_Lippert_Bayer.ppt 68 Mejzler, D.G. (1949). On a theorem of B.V. Gnedenko. Sb. Trudov Inst. Mat. Akad. Nauk Ukrain. SSR 12, 3135 69 Mises, R. von (1923). Uber die vari ationsbreite einer Beobachtungsreihe, Sitzungsber. Berlin. Math. Ges 22, 38. 70 Mises, R. von (1954). La distribution de la plus gr ande de n valeurs. In Selected Papers Volume II, pages 271294. American Mathematical Society, Providence, RI. This paper was the re production of Mises, R, vons original paper published in Rev. Math. Union Interbalk. 1, 141160 (1936). 71 Naess, A. Estimation of long return period design values for wind speeds, Journal of Engineering MechanicsAmer ican Society of Civil Engineers 1998, 124, 252259. 72 National Oceanic and Atmospheric Ad ministration, National Data Center ( http://www.ncdc.noaa.gov ) 73 Pearson JD, Morrell CH, Brant LJ. 1 992. Mixture models for investigating complex distributions. J Quant Anthroplo 3:325345. 74 Pickands, J. (1975). Statis tical inference using extrem e order statistics. Annals of Statistics 3, 119131. 75 R, Statistical Computing System 76 Ramachandran, G. (1982). Properties of extreme order statistics and their application to fire protection and insurance problems, Fire Safety Journal 1982, 5, 5976 77 Rao, N.M., Rao, P.P., Kaila, K.L. The fi rst and third asymptotic distributions of extremes as applied to the seismic source regions of India and adjacent areas, Geophysical Journal International 1997, 128, 639646. PAGE 192 177 78 Rossi, F. Fiorentino, M. and Versac e, P. (1986). Twocomponent extreme value distribution for flood frequency analysis, Water Resources Res 22. 79 Rossi, F. (1986). Reply to Comment on Twocomponent extreme value distribution for flood frequency analysis, Water Resources Res 22, 267269. 80 Smith, R.L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika 72, 6790. 81 Smith, R.L. (1987). Estimating tails of probability distributions. The Annals of Statistics 15, 11741207. 82 Smith, R.L. and Weissman, I. (1985) Maximum likelihood estimation of the lower tail of a probability distributio n. J. Roy. Statist. Soc. B, 47, 285298. 83 Smith, R.L. and Davison, A.C. (1990). Model for exceedances over high thresholds (with discussion), J. Roy. Statist. Soc. B, 52, 393442. 84 Smith, R.L. (2003). Statistics of extremes with applications in environment, insurance and finance. Department of St atistics, University of North Carolina, Chapel Hill. Web reference: 85 Solomon, H.: Unpublished data on file, HoffmannLa Roche, Nutley, N.J., 1968. 86 Soong, T.T., Pharmacokinetics with uncertain ties in rate constants I, II, and III the inverse problem, Math. Biosciences, 12, 1971, pp. 235243. 87 Soong, T.T.: Random Differential Equations in Science and Engineering, Vol. in Mathematics in Science and Engi neering, Academic Press, New York, 1973. 88 Soong, T.T.: Solutions of a class of random differential equations, SIAM J. Appl. Math., Vol. 24, No. 4, June 1973. 89 Southeast Regional Climate Center ( http://www.sercc.com ) 90 SPLUS functions for extreme value modeling: An accompaniment to the book An Introduction to Statisti cal Modeling of Extreme Values by Stuart Coles. 91 Statistical Analysis Software (SAS), Version 9.1. SAS Institute Inc., Cary, NC, USA. PAGE 193 178 92 Teorell, T.: Kinetics of distribution of substances administered in the body, Arch. Intern. Pharmacod yn. Therap. 57 (1937), 205240. 93 The Merck Manual of Diagnosis an d Therapy, Section 22: Clinical Pharmacology, Chapter 299: Pharmacoki netics; Chapter 303: Monitoring Drug Treatment 94 Trenberth, K.E. (1998). Atmospheric mo isture residence times and cycling: implications for rainfall rates and clim ate change. Climatic Change 1998; 39: 66794. 95 Trenberth, K.E. (1999). Conceptual fram ework for changes of extremes of the hydrological cycle with climate ch ange. Climatic Change 1999; 42: 32739. 96 Tsokos, C.P. and Nadarajah, S. Extreme value models for software reliability. 97 Tsokos, J.O. and Tsokos, P.T.: Statistical modeling of pharmacokinetic systems, J. Dynamic Systems, Measurement and Control, ASME, March 1976 98 Web reference: http://www. dml.co.nz/yourtest/inr.asp 99 Wettstein, Justin J. and Mearns, Linda O. (2002). The influence of the north atlanticarctic oscillation on mean, variance and extremes of temperature in the northeastern United States and Ca nada. Journal of Climate, 15:35863600, 2002. 100 Xapson, M.A., Summers, G.P. and Barke, E.A.. Extreme value analysis of solar energetic motion peak fl uxes, Solar Physics 1998, 183, 157164. 101 Yasuda, T. and Mori, N. Occurrence pr operties of giant freak waves in sea area around Japan, Journal of Waterway Port Coastal and Ocean EngineeringAmerican Society of Civil Engineers 1997, 123, 209213. PAGE 194 179 Bibliography 1 Aarons, L.: Pharmacokinetic a nd pharmacodynamic modelling in drug development. Stat. Methods Med. Res. (editorial) 1999, Vol 8, pp 181182. 2 Bellman, R., Topics in pharmacokinetic s, III: Repeated dosage and impulse control, Math. Biosci., 12, 1971, pp. 15. 3 BharuchaReid, A.T.: Random Integral Equations, Vol. in Mathematics in Science and Engineering, Academic Press, New York, 1972. 4 Lain Glen: Pharmacokinetic Analysis, Anaesthesia and intensice care medicine, Vol. 6, Issue 8, pp 280282, 2005. 5 Lotsch J Kobal G Geisslinger G .: Programming of a flexible computer simulation to visualize pharmacokine ticpharmacodynamic models. Int. J. Clin. Pharmacol. Ther. 2004 Jan: 42(1):1522. 6 Niazi, S.: Multicompartment pharmacokine tic analysis and simulations using a programmable calculator. Int. J. Biomed. Comput. 1979 May: 10(3):24555. 7 Russell, C.D.: A Bayesian 3Compartment Model for 99m TcMAG3 Clearance: Journal of Nuclear Medi cine (2003), Vol. 44 No. 8 13571361. 8 Schoenwald, R.D.: Pharmacokinetic Principles of Dosing Adjustment, Technomic Publishing Co. Lancaster, 2001. 9 Shargel Leon, Yu Andrew B.C. Applicat ions of Computers in PK. In: Applied Biopharmaceutics and PK. 3rd ed. New York: PrenticeHall International Editions, 1993:571583. 10 Thomas P. Lodise Jr., Ben Lomaestro, Keith A. Rodvold, Larry H. Danziger, and George L. Drusano: Pharmacodynamic Profiling of Piperacillin in the Presence of Tazobactam in Patients through the Use of Population Pharmacokinetic Models and Monte Ca rlo Simulation. Antimicrobial Agents and Chemotherapy, December 2004, p. 47184724, Vol. 48, No. 12. 11 Tsokos, C.P. and Padgett, W.J.: The or igin and application of stochastic integral equations, Int. J. Systems Sci., 2, 1971, pp. 135148. PAGE 195 180 12 Tsokos, C.P. and Padgett, W.J.: Random In tegral Equations with Applications to Life Sciences and Engineering, Vol. in Mathematics in Science and Engineering, Academic Press, New York, 1974. PAGE 196 181 About the Author Lakshminarayan Rajaram received a b achelors degree in Physics and Mathematics, and a masters degree in Pure Mathematics from the University of Mysore, India. He also received a masters degree in Applied Mathematics from the New Jersey Institute of Technology, Newark, N.J. in 1986. He worked as a biostatistician for clinical trials in the Division of E ndocrinology and Metabolism at the Johns Hopkins School of Medicine, and later, in the Clinical Research Division at TheraTech, Inc., Utah and Baker Norton Pharmaceuticals, Miami. He also worked as a data analyst at the Institute on Aging, University of South Florida. At presen t, he is a tenured f aculty in mathematics and statistics at the St. Petersburg College Tarpon Springs Campus He is an adjunct faculty of biostatistics in the Department of Epidemiology and Biostatistics, University of South Florida. He also is an independent consultant in the database design, data management and biostatistics for healthrela ted studies including pharmaceutical clinical trials. Additionally, he gives lectures and workshops in the pharmacokinetics, biostatistics, and SAS (Statistical Analysis Software) programming both in the United States and abroad. 