Statistical Changes in Lakes in Urbanizing Watersheds and Lake Return Frequencies Adjusted for Trend and Initial Stage Utilizing Generalized Extreme Value Theory

by Shayne Paynter

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Civil and Environmental Engineering
College of Engineering
University of South Florida

Major Professor: Mahmood Nachabe, Ph.D.
Mark Ross, Ph.D.
George Yanev, Ph.D.
Mark Stewart, Ph.D.
Jayajit Chakraborty, Ph.D.

Date of Approval: January 5, 2009

Keywords: regression, time series, autocorrelation, flood, drought

Copyright 2009, Shayne Paynter

Table of Contents

List of Tables
List of Figures
ABSTRACT
1.0 Regional Scale Spatiotemporal Consistency of Precipitation Variables Related to Water Resource Management and Planning
    1.1 Background
    1.2 Materials and Methods
        1.2.1 Data
        1.2.2 Precipitation Variables
        1.2.3 Gamma Distribution
        1.2.4 GEV Distribution
        1.2.5 Spatial Analysis
        1.2.6 Temporal Analysis
    1.3 Results and Discussion
        1.3.1 Spatial Analysis
            1.3.1.1 Annual Rainfall
            1.3.1.2 Rainfall Days per Year
            1.3.1.3 Event Rainfall Annual Maximum
            1.3.1.4 Annual Maximum Interevent Days
        1.3.2 Temporal Analysis
    1.4 Conclusions
2.0 Statistical Changes of Lake Stages in Urbanizing Watersheds
    2.1 Background
    2.2 Materials and Methods
        2.2.1 Lake Information and Data
        2.2.2 Time Series Analysis
        2.2.3 Autocorrelation and Variance
        2.2.4 Regression
    2.3 Results and Discussion
        2.3.1 Time Series Modeling
        2.3.2 Autocorrelation and Variance
        2.3.3 Regression
    2.4 Conclusions
3.0 Use of Generalized Extreme Value Covariates to Improve Estimation of Trends and Return Frequencies for Lake Levels
    3.1 Background
    3.2 Materials and Methods
        3.2.1 Lake Information and Data
        3.2.2 GEV Distribution
    3.3 Results and Discussion
        3.3.1 Trend Analysis
        3.3.2 Starting Stage Analysis
            3.3.2.1 Flood Return Period
            3.3.2.2 Drought Return Period
    3.4 Conclusions
4.0 Conclusion
References
About the Author

List of Tables

Table 1-1: Rainfall gage data summary
Table 1-2: Precipitation variable summary
Table 1-3: Gamma variable parameter summary
Table 1-4: GEV variable parameter summary
Table 1-5: GEV model comparison
Table 2-1: Lake characteristics summary
Table 2-2: Regional lake data summary
Table 2-3: Lake watershed population growth
Table 2-4: SARIMA model parameters
Table 2-5: Autocorrelation and variance
Table 2-6: Regression model parameters
Table 3-1: Lake data summary
Table 3-2: GEV flood parameter summary
Table 3-3: GEV drought parameter summary

List of Figures

Figure 1-1: Rainfall gage location
Figure 1-2: Gage G16 quantile plot of annual rainfall
Figure 1-3: Annual rainfall cumulative distribution
Figure 1-4: Gage G16 quantile plot of the rainfall days per year
Figure 1-5: Rainfall days per year cumulative distribution
Figure 1-6: Gage G16 quantile plot of event rainfall annual maximum
Figure 1-7: Event rainfall annual maximum cumulative distribution
Figure 1-8: Gage G16 quantile plot of the annual maximum interevent days
Figure 1-9: Interevent days annual maximum cumulative distribution
Figure 1-10: Quantile plot for gage G1 annual rainfall days model 2
Figure 1-11: Gage G1 observed variable data
Figure 2-1: Lake and rainfall gage location
Figure 2-2: Moon Lake
Figure 2-3: Cow Lake
Figure 2-4: Cow Lake stages (1976-2007)
Figure 2-5: Autocorrelation of residuals for Cow Lake (1976-1980, lag in weeks)
Figure 2-6: Cow Lake autocorrelation with exponential fit lines (1976-1991)
Figure 2-7: Cow Lake response versus fit (1976-1980)
Figure 2-8: Cow Lake quantile-quantile plot of the residuals (1976-1980)
Figure 3-1: Location map of study lakes
Figure 3-2: Lake Carroll stage data
Figure 3-3: Lake Weohyakapka stage data
Figure 3-4: Lake Arbuckle flood stage standardized residual quantiles
Figure 3-5: Lake Carroll flood stage standardized residual quantiles
Figure 3-6: Lake Arbuckle flood frequencies with and without covariates
Figure 3-7: Lake Trafford flood frequencies with and without covariates
Figure 3-8: Lake Trafford drought stage standardized residual quantiles
Figure 3-9: Lake Weohyakapka drought stage standardized residual quantiles
Figure 3-10: Lake Arbuckle drought frequencies with and without covariates
Figure 3-11: Lake Carroll drought frequencies with and without covariates

ABSTRACT

Many water resources throughout the world are demonstrating changes in historic water levels. Potential reasons for these changes include climate shifts, anthropogenic alterations or basin urbanization. The focus of this research was threefold: 1) to determine the extent of spatiotemporal changes in regional precipitation patterns, 2) to determine the statistical changes that occur in lakes with urbanizing watersheds and 3) to develop accurate prediction of trends and lake level return frequencies. To investigate rainfall patterns regionally, appropriate distributions, either gamma or generalized extreme value (GEV), were fitted to variables at a number of rainfall gages utilizing maximum likelihood estimation. The spatial distribution of rainfall variables was found to be quite homogeneous within the region in terms of an average annual expectation. Furthermore, the temporal distribution of rainfall variables was found to be stationary with only one gage evidencing a significant trend.

In order to study statistical changes of lake water surface levels in urbanizing watersheds, serial changes in time series parameters, autocorrelation and variance were evaluated and a regression model to estimate weekly lake level fluctuations was developed. The following general conclusions about lakes in urbanizing watersheds were reached: 1) the statistical structure of lake level time series is systematically altered and is related to the extent of urbanization, 2) in the absence of other forcing mechanisms, autocorrelation and baseflow appear to decrease and 3) the presence of wetlands adjacent to lakes can offset the reduction in baseflow. In regard to the third objective, the direction and magnitude of trends in flood and drought stages were estimated and both long-term and short-term flood and drought stage return frequencies were predicted utilizing the generalized extreme value (GEV) distribution with time and starting stage covariates. All of the lakes researched evidenced either no trend or very small trends unlikely to significantly alter prediction of future flood or drought return levels. However, for all of the lakes, significant improvement in the prediction of extremes was obtained with the inclusion of starting lake stage as a covariate.

1.0 Regional Scale Spatiotemporal Consistency of Precipitation Variables Related to Water Resource Management and Planning

1.1 Background

This dissertation consists of three main sections which correspond to three papers submitted to technical journals and represent three distinct but interrelated subjects. The first section investigates the spatial homogeneity and temporal stationarity of rainfall in a given region. As lakes and the changes in lake levels are the major focus of this research, it is necessary to first identify any spatial or temporal trends in the major input to lake levels so that any further analysis can take these trends into account.
The second section investigates changes in lake levels induced by urbanization. The focus was to remove or account for signals other than urbanization, such as rainfall or control structures, as much as possible. Once lakes with sufficiently isolated urbanization signals were identified, changes in time series model parameters, autocorrelation and baseflow were investigated. The third section, utilizing the methods developed in the first section as well as taking advantage of the autocorrelation in lake levels identified in the second section, sought to significantly improve the prediction of lake level return periods.

During the last century, the planet has experienced tremendous growth in population and development. In addition, many researchers worldwide have concluded that climate change is occurring in the form of increases or decreases in temperature, rainfall, drought, evaporation and other climatological variables. Specifically, in regard to precipitation, Kunkel and Andsager (1999) and Karl and Knight (1998) investigated extreme rainfall events and, utilizing the nonparametric Kendall statistical test, found upward trends in both the magnitude and number of events in some parts of the United States. However, Dahamsheh and Aksoy (2007) found no such trends in rainfall in other areas of the world using the nonparametric Spearman rank order correlation statistical test. Garcia et al (2007) identified positive trends, negative trends and the absence of trends in extreme rainfall in various regions of Spain with nonparametric tests as well as by utilizing time-dependent parameters of the Generalized Extreme Value (GEV) distribution. Zolina et al (2004) applied the gamma distribution to daily rainfall from various gages across Europe and found substantial variability from region to region in terms of both the presence and direction of trends and the shape and scale parameters of the distribution.

At the same time that rainfall patterns are changing, many changes in the general trends of lake, stream and other surface water levels have occurred. Many anthropogenic factors, such as watershed urbanization, water supply pumping and structural changes to the water body itself, or climatic changes could be responsible for these trends. Water resources are vital for many reasons including recreation, tourism, environment, ecology and water supply. Understanding what is responsible for impacts to water levels is key to determining an effective future management plan. As rainfall, or in the case of drought the lack thereof, is the major contributing factor influencing water stages, precipitation trends must be identified before investigating other potential factors.

Much of the literature evaluates different statistical methods for determining trend, since trend detection in hydroclimatological data is complicated by the time scale of data, non-normal distributions, seasonality, autocorrelation, data collection methods, censored or missing data, nonstationarity and other difficulties. Hirsch et al (1982) developed a seasonal Kendall nonparametric test that overcomes some of the difficulties of traditional methods of trend detection. Using Monte Carlo simulations, Zhang et al (2004) compared ordinary least squares regression, the nonparametric Kendall test, and allowing the parameters of the GEV distribution to vary with time. The study found that while the nonparametric test is more robust than ordinary least squares, varying the parameters of the GEV distribution significantly outperforms the other two methods and increases the chances of correctly identifying a trend. Recent research shows wide application of GEV theory to hydroclimatological data. Katz et al (2002) provide a discussion of the statistics of hydroclimatological extremes and the application of the generalized extreme value distribution.
Nadarajah and Shiau (2005), Zhang et al (2004) and Garcia et al (2007) utilized maximum likelihood estimation of generalized extreme value parameters, allowed the parameters to vary with covariates such as time or other phenomena, and employed a likelihood ratio test to determine if the parameter covariates improve the fit.

Identifying trends in both space and time is integral to effective water resource management and planning. Spatial trends are important at the regional scale because sufficient climatological data are often unavailable at a particular lake, stream, reservoir or other water resource. Establishing regional precipitation homogeneity allows data from regional gages to be utilized at a particular water resource anywhere within that region. Temporal trends are equally important to water resource management, as robust identification of a downward trend could allow for early implementation of mitigation plans for water supply, wetland or lake restoration and other water resource issues.

There is substantial research regarding both spatial and temporal variability of rainfall. Zolina et al (2004) found substantial regional variation in the parameters of the gamma distribution across Europe. In a similar study, Groisman et al (1999) found spatial and temporal regional stability in the shape parameter of the gamma distribution when applied to daily rainfall. The spatial distribution of rainfall trends was investigated by Cannarozzo et al (2006) and found to be homogeneous across Sicily. Further research evaluated spatial and temporal dependence of rainfall data across South America and found evidence of regional dependence. Furthermore, it was found that the regional dependence extended further in the latitudinal direction (Kuhn et al, 2007).

Spatial and temporal changes in rainfall are investigated in this research by analyzing the probability distribution of a set of precipitation variables that are likely to influence regional lake, stream or other water resource levels. These variables include annual rainfall, rainfall days per year, the maximum rainfall per week in any given year, and the maximum number of days between events in any given year. These variables are determined for rainfall gages surrounding Moon Lake, located in Pasco County, Florida, United States. This lake was chosen because there are a sufficient number of rainfall gages within the region with complete data sets and long records. Furthermore, the lake itself has not been anthropogenically altered by pumping or other means and rainfall is the major influence on lake levels. To determine spatial homogeneity, an average set of distribution parameters is estimated and confidence limits are established to determine if the variables at each gage fall within these limits. To determine temporal stationarity, distribution parameters are allowed to vary with time to ascertain if doing so provides a better fit than constant parameters.

Much of the referenced literature regarding climate has focused on identifying large-scale global or continental changes in weather patterns. Identifying trends utilizing GEV distribution parameter covariates rather than ordinary least squares or Mann-Kendall statistical tests is a recent development in the literature; the papers cited have applied this method to flood peaks, daily precipitation and temperature. This research applies GEV distribution covariates to rainfall variables that have not yet been analyzed in such a way. Furthermore, the use of parameter confidence limits to assess the variables analyzed here was not found in the literature review.

The objectives of this research are:
1) Determine if the rainfall variables analyzed exhibit spatial homogeneity and temporal stationarity utilizing methods that can be adapted to other regions
2) Develop a representative distribution for each variable that can be utilized for water resource management and planning within a given region.

1.2 Materials and Methods

1.2.1 Data

Because rainfall is highly variable from gage to gage even within a limited area, several rainfall gages within a 40-kilometer radius of Moon Lake were analyzed. Several gages within 8, 16, 24, 32 and 40 kilometer radii of the lake were selected based on achieving adequate coverage of the region. Furthermore, only gages with at least 25 years of daily rainfall data that were at least 95 percent complete were selected. In some cases, gage records were extended by joining the records of two immediately adjacent gages. Rainfall data available to the general public was obtained from both the Southwest Florida Water Management District and the National Climatic Data Center. Figure 1-1 gives a graphic overview of gage locations. Table 1-1 gives a summary of the data associated with each gage.

Figure 1-1: Rainfall gage location

Table 1-1: Rainfall gage data summary

Gage No.  Station Name     Year Data Begins  Year Data Ends  Percent Complete  Kilometers from Moon Lake
G1        Growers Kent     1973              2007            99.8              8
G2        Starkey          1983              2007            97.5              8
G3        Eldridge Wilde   1973              2007            99.0              16
G4        South Pasco      1976              2007            98.8              16
G5        Island Ford      1973              2007            98.3              16
G6        Tarpon Springs   1901              2004            98.7              24
G7        Lutz             1965              2005            100.0             24
G8        Whalen           1975              2005            100.0             24
G9        Crews Lake       1976              2007            98.9              24
G10       Hunter Lake      1976              2006            99.1              24
G11       Imperial Key     1974              2007            99.2              32
G12       Weeki Wachee     1971              2007            99.6              32
G13       Bay Lake         1970              2007            99.1              32
G14       Dunedin          1970              2005            99.9              40
G15       Temple Terrace   1975              2007            99.7              40
G16       St. Leo          1901              2007            98.3              40
G17       Tampa Int.       1901              2007            99.8              40
G18       Horse Lake       1981              2007            98.6              40

1.2.2 Precipitation Variables

Because the focus of this research is on water resource planning and management, the relationship between rainfall, runoff and water body stage is key. As such, rainfall variables to be analyzed for the presence of trends need to be selected carefully. Variables that will likely have a significant impact on water levels include: total annual rainfall, rainfall days per year, annual maximum event rainfall, and annual maximum interevent days.

Because lake levels exhibit significant autocorrelation, variables too close in time cannot be considered independent and therefore cannot be used to analyze trends. Before selecting the aforementioned variables, Moon Lake level data was analyzed to determine the extent of autocorrelation via the autocorrelation function. Before the autocorrelation function could be applied, the lake data was detrended and any seasonality removed by differencing. The autocorrelation function, a dimensionless measure of linear dependence of time series values at lag $k$, $r_k$, is given by:

$$ r_k = \frac{\sum_{t=1}^{N-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{N} (x_t - \bar{x})^2} \tag{1} $$

where $x_t$ is the time series value at time $t$, $\bar{x}$ is the mean of the series and $N$ is the number of observations. The numerator of the equation is the autocovariance, a measure of how strongly the deviations from the mean of time series values $k$ steps apart are related, and the denominator is the variance at lag 0. Moon Lake exhibits statistically significant autocorrelation up to nearly one year; therefore, annual variables are the focus of this research.

Annual rainfall is the major water resource input and is directly related to water stages. Any significant upward or downward trend will correspond to increasing or decreasing future water levels. Lakes, streams and other water resources require a regular replenishment of rainfall to maintain water levels throughout the year. The distribution of rainfall over a given year will contribute to changes in water levels, as a few heavy storms per year would generate a different water surface level signature than many small events. The number of rainfall days per year, in combination with the event and interevent annual maximum, will capture this dynamic. Because there is unlikely to be any effect on stages from very small amounts of rainfall due to interception, pooling, transpiration and other extractions, rainfall less than 0.5 cm for any given day is filtered out. For purposes of this research, rainfall days are defined as any day with rainfall greater than 0.5 cm.

Changes in the magnitude and frequency of extreme events will be identified by the maximum annual event rainfall variable. For purposes of this variable and its relation to water stages, an event is defined as the sum total of a week of rainfall; a moving weekly window was applied throughout each year. Increasing trends in interevent times, which correspond to drought, will correlate to future lowered water levels. Interevent time is defined as any number of consecutive days with daily rainfall less than 0.5 cm.

In order to evaluate spatial homogeneity, each variable at each gage location was fitted with the most applicable distribution. In the case of total annual rainfall and rainfall days per year, the gamma distribution was utilized. In the case of extreme variables, the annual maximum event rainfall and the annual maximum interevent time, the GEV distribution was used.

1.2.3 Gamma Distribution

The use of the gamma distribution for precipitation data has been established in the literature (Zolina et al, 2004; Semenov and Bengtsson, 2002; Watterson and Dix, 2003; Groisman et al, 1999), although other distributions such as the Weibull and Poisson have also been used (Sharda and Das, 2005; Burgueno et al, 2004).
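
As a brief aside on the quantities defined in Section 1.2.2, the lag-k autocorrelation of Equation (1) and the three threshold-based annual variables can be computed directly from a series of values. The following is a minimal sketch, not the code used in this research. The function names, the dictionary keys and the short 10-day series are hypothetical. It assumes one year of daily rainfall in centimeters with no missing days and at least seven values, and it sums unfiltered daily values inside the weekly window, which is a simplifying assumption.

```python
RAIN_THRESHOLD_CM = 0.5  # daily rainfall below this is treated as no rainfall
EVENT_WINDOW_DAYS = 7    # an "event" is the rainfall total of a moving week


def autocorrelation(x, k):
    """Lag-k autocorrelation r_k of a detrended, deseasonalized series."""
    n = len(x)
    mean = sum(x) / n
    # Numerator: sum of products of deviations for values k steps apart.
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    # Denominator: sum of squared deviations (lag 0).
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def annual_precipitation_variables(daily_cm):
    """Rainfall days, maximum weekly event and longest dry spell for one year.

    Assumes len(daily_cm) >= EVENT_WINDOW_DAYS.
    """
    rain_days = sum(1 for r in daily_cm if r > RAIN_THRESHOLD_CM)

    # Moving weekly window: largest 7-day rainfall total in the year.
    last_start = len(daily_cm) - EVENT_WINDOW_DAYS
    max_event = max(
        sum(daily_cm[i:i + EVENT_WINDOW_DAYS]) for i in range(last_start + 1)
    )

    # Longest run of consecutive days with rainfall below the threshold.
    longest = current = 0
    for r in daily_cm:
        current = current + 1 if r < RAIN_THRESHOLD_CM else 0
        longest = max(longest, current)

    return {
        "rain_days": rain_days,
        "max_event_cm": max_event,
        "max_interevent_days": longest,
    }


# Hypothetical 10-day series (cm), used only to exercise the functions.
example = [0.0, 1.2, 0.3, 0.0, 0.0, 2.0, 0.6, 0.0, 0.0, 0.1]
print(annual_precipitation_variables(example))
print(autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], 1))
```

Applied year by year to each gage record, functions of this kind would yield one value per variable per year, which is the form of data the distribution fits in the following sections require.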

The gamma distribution is positively skewed and has good flexibility by allowing for variability in both mean and variance with its shape and scale parameters. The gamma distribution function is given by:

$$ f(x; \alpha, \beta) = \frac{(x/\beta)^{\alpha-1} e^{-x/\beta}}{\beta\,\Gamma(\alpha)}, \quad x \ge 0,\; \alpha, \beta > 0 \tag{2} $$

and

$$ \Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t}\,dt \tag{3} $$

where $x$ is the random variable, i.e. total annual rainfall, $\alpha$ is the shape parameter, $\beta$ is the scale parameter and $\Gamma(\alpha)$ is the gamma function. The parameters of the gamma distribution are estimated using maximum likelihood estimation. Maximum likelihood has been found to generally provide better estimates of parameters for the gamma distribution when compared to other methods (Choi and Wette, 1969; Wilks, 1990). Maximum likelihood estimation determines the parameters that maximize the probability of the sample data by maximizing the likelihood function, either by differentiating the log-likelihood function and equating it to zero or, if this does not yield explicit solutions, by using numerical techniques such as the Newton-Raphson method. The log-likelihood function for the gamma distribution developed by Choi and Wette (1969) is given as follows:

$$ l(\alpha, \beta) = -n\alpha\log\beta - n\log\Gamma(\alpha) + (\alpha-1)\sum_{i=1}^{n}\log x_i - \frac{1}{\beta}\sum_{i=1}^{n} x_i \tag{4} $$

where $x_1, \ldots, x_n$ represent a random sample of the gamma-distributed random variable.

1.2.4 GEV Distribution

The GEV distribution has gained widespread application in recent literature because of its flexibility and ability to capture the frequency of extremes (Martins and Stedinger, 2000; Nadarajah and Shiau, 2005; Morrison and Smith, 2002). The GEV is the generalized form of three commonly applied extreme value distributions: the Gumbel, the Frechet and the Weibull. The GEV is applicable to variables of block maxima, where the blocks are equal divisions of time. The GEV cumulative distribution function is given by:

$$ F(x) = \exp\left\{-\left[1+\xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\} \tag{5} $$

where $x$ is the random variable, $\mu$ is the location parameter, $\sigma$ is the scale parameter, $\xi$ is the shape parameter and $1+\xi(x-\mu)/\sigma > 0$.
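
Equation (5) is simple enough to evaluate directly. The sketch below is one possible implementation, with hypothetical function names and illustrative parameter values. The Gumbel limit for a shape parameter near zero and the inverse (quantile) function are standard results that are assumed here rather than taken from the text, and the code is not hardened against floating-point overflow far out in the tails.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function F(x) of Equation (5)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Limit xi -> 0: the Gumbel distribution.
        return math.exp(-math.exp(-z))
    s = 1.0 + xi * z
    if s <= 0.0:
        # Outside the support: below the lower bound (xi > 0) F = 0,
        # above the upper bound (xi < 0) F = 1.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-s ** (-1.0 / xi))


def gev_quantile(p, mu, sigma, xi):
    """Value x with F(x) = p, for 0 < p < 1 (inverse of Equation (5))."""
    y = -math.log(p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi


# Illustrative parameters only (not fitted values from this study).
mu, sigma, xi = 18.0, 5.0, 0.1
x_median = gev_quantile(0.5, mu, sigma, xi)
print(x_median, gev_cdf(x_median, mu, sigma, xi))
```

The sign of the shape parameter decides which tail is bounded, which is why the out-of-support branch returns 0 for a positive shape and 1 for a negative shape.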

It readily follows that the subdistributions are:

Gumbel ($\xi \to 0$):

$$ F(x) = \exp\left\{-\exp\left[-\left(\frac{x-\mu}{\sigma}\right)\right]\right\}, \quad -\infty < x < \infty \tag{6} $$

Frechet ($\xi > 0$):

$$ F(x) = \begin{cases} 0, & x \le \mu \\ \exp\left\{-\left(\frac{x-\mu}{\sigma}\right)^{-1/\xi}\right\}, & x > \mu \end{cases} \tag{7} $$

Weibull ($\xi < 0$):

$$ F(x) = \begin{cases} \exp\left\{-\left[-\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}, & x < \mu \\ 1, & x \ge \mu \end{cases} \tag{8} $$

In a similar fashion to the gamma distribution parameters, GEV distribution parameters are determined using maximum likelihood estimation. The log-likelihood function, for $\xi \ne 0$, is given by:

$$ l(\mu, \sigma, \xi) = -m\log\sigma - \left(1+\frac{1}{\xi}\right)\sum_{i=1}^{m}\log\left[1+\xi\left(\frac{x_i-\mu}{\sigma}\right)\right] - \sum_{i=1}^{m}\left[1+\xi\left(\frac{x_i-\mu}{\sigma}\right)\right]^{-1/\xi} \tag{9} $$

given that $1+\xi(x_i-\mu)/\sigma > 0$ for $i = 1, \ldots, m$ (Coles, 2004).

Also similar to the gamma distribution, there is no explicit solution for the maximum of the log-likelihood function and it must be found using numerical methods.

1.2.5 Spatial Analysis

Once the distribution parameters were estimated for each variable at each of the rainfall gages, the fits were confirmed with the Kolmogorov-Smirnov test statistic. The Kolmogorov-Smirnov test statistic was applied to the gamma and GEV distributions at a given significance level to determine the goodness of fit. The test statistic, $D$, is given by

$$ D = \max_x \left| P_x(x) - S_n(x) \right| $$

where $P_x(x)$ is the completely specified theoretical cumulative distribution function and $S_n(x)$ is the sample cumulative distribution function based on $n$ observations. If $D$ was greater than the critical value at a given significance level, the hypothesis that the sample data fits a given distribution was rejected (Haan, 2002). Fits were then averaged with 99-percent confidence limits to create a representative distribution for the region. Given the substantial variation of rainfall across large regions of Florida, if the gage-specific distributions fall within the confidence limits of the average, spatial homogeneity is reasonably established for the region in which the gages are located. The focus is to establish an average expectation for each variable in any given year; therefore, it is especially important that variable distribution fits are contained by the confidence limits near the 0.5 quantile (the median).
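
The fit-then-check procedure of this section can be illustrated for the gamma case. The following is a minimal, standard-library sketch under stated assumptions, not the procedure actually coded for this research. The function names are hypothetical and the sample is synthetic (drawn with Python's random module, not gage data). The shape parameter is found by a golden-section search on the profile log-likelihood, which is swapped in for the Newton-Raphson method mentioned in Section 1.2.3. The gamma cumulative distribution is evaluated with a power series that is adequate only for moderate values of x/beta. The critical value 1.36/sqrt(n) is the usual large-sample approximation at an assumed 5 percent significance level, and it is not strictly valid when the parameters are estimated from the same sample.

```python
import math
import random


def gamma_loglik(xs, alpha, beta):
    """Gamma log-likelihood for shape alpha and scale beta (Equation 4)."""
    n = len(xs)
    return (-n * alpha * math.log(beta) - n * math.lgamma(alpha)
            + (alpha - 1.0) * sum(math.log(x) for x in xs)
            - sum(xs) / beta)


def fit_gamma_mle(xs, lo=1e-3, hi=1e4, tol=1e-10):
    """Maximum likelihood estimates (alpha, beta) by a one-dimensional search.

    For a fixed alpha the likelihood is maximized by beta = mean / alpha,
    so only alpha is searched, on a log scale, by golden section.
    """
    mean = sum(xs) / len(xs)

    def profile(u):
        shape = math.exp(u)
        return gamma_loglik(xs, shape, mean / shape)

    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = profile(c), profile(d)
    while b - a > tol:
        if fc > fd:          # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = profile(c)
        else:                # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = profile(d)
    alpha = math.exp((a + b) / 2.0)
    return alpha, mean / alpha


def gamma_cdf(x, alpha, beta):
    """Gamma CDF via the series for the regularized incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    z = x / beta
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (alpha + n)
        total += term
    prefactor = math.exp(-z + alpha * math.log(z) - math.lgamma(alpha + 1.0))
    return min(1.0, total * prefactor)


def ks_statistic(xs, cdf):
    """Largest gap D between the sample CDF and a theoretical CDF."""
    ordered = sorted(xs)
    n = len(ordered)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(ordered))


# Synthetic "annual rainfall" sample in cm (hypothetical, for illustration).
rng = random.Random(1)
sample = [rng.gammavariate(25.0, 5.5) for _ in range(40)]
alpha_hat, beta_hat = fit_gamma_mle(sample)
d_stat = ks_statistic(sample, lambda x: gamma_cdf(x, alpha_hat, beta_hat))
d_crit = 1.36 / math.sqrt(len(sample))  # approximate 5 percent critical value
print(alpha_hat, beta_hat, d_stat, d_stat > d_crit)
```

Searching only over the shape parameter works because, for any fixed shape, the best scale is the sample mean divided by the shape, so the two-parameter problem reduces to one dimension.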

1.2.6 Temporal Analysis

In order to analyze the temporal variation, distribution parameters were allowed to vary with time and then compared to the original distribution model to determine if a statistically significant better fit was achieved. If there is an upward or downward trend in a particular hydroclimatological variable, the extreme values themselves are generally getting larger or smaller over time and the distribution itself is potentially changing. Changing distribution parameters with time allows for the distribution to be nonstationary and also gives an estimate of the rate of change. The use of a GEV parameter covariate, such as time, to identify trends in hydroclimatological data has been well established (Katz et al, 2002; Nadarajah and Shiau, 2005; Garcia et al, 2007). For consistency in temporal trend analysis and evaluation of parameter rates of change, the annual rainfall and annual rainfall days variables were fitted with the GEV distribution similar to the block maxima variables and then parameters were allowed to vary with time. It should be noted that, as a check on the GEV parameters, the original gamma fitted variable parameters were also allowed to vary with time to identify any differences from the GEV models in trend detection.

For purposes of this research, model 1 is the GEV distribution with parameters $\mu$, $\sigma$ and $\xi$ held constant; model 2 generalizes model 1, which it contains as a submodel, with

$$ \mu(t) = a + bt \tag{10} $$
$$ \sigma(t) = c + dt \tag{11} $$
$$ \xi(t) = e + ft \tag{12} $$

where $t$ is the time in years and $a$, $b$, $c$, $d$, $e$ and $f$ are constants of a linear trend evaluated at each year.
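
Models 1 and 2 share one log-likelihood, with model 1 as the special case b = d = f = 0. The sketch here evaluates that log-likelihood for given constants and compares two maximized log-likelihoods with a likelihood ratio test (Coles, 2004). It is an illustration under stated assumptions, not the fitting code used in this research. The function names are hypothetical, the numerical maximization itself is omitted, years are indexed t = 1, ..., m, the annual maxima are made-up values, and the chi-square critical values are the standard tabulated 5 percent points for 1 to 3 additional parameters.

```python
import math

# Upper 5 percent points of the chi-square distribution by degrees of freedom.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}


def gev_loglik_linear(xs, a, b, c, d, e, f):
    """GEV log-likelihood with mu(t)=a+b*t, sigma(t)=c+d*t, xi(t)=e+f*t.

    xs[0] is the observation for year t = 1. Returns -inf when any year
    violates sigma(t) > 0 or 1 + xi(t) * (x_t - mu(t)) / sigma(t) > 0.
    """
    total = 0.0
    for t, x in enumerate(xs, start=1):
        mu, sigma, xi = a + b * t, c + d * t, e + f * t
        if sigma <= 0.0:
            return -math.inf
        z = (x - mu) / sigma
        if abs(xi) < 1e-12:
            # Gumbel limit of the GEV log-density for xi -> 0.
            total += -math.log(sigma) - z - math.exp(-z)
            continue
        s = 1.0 + xi * z
        if s <= 0.0:
            return -math.inf
        total += (-math.log(sigma) - (1.0 + 1.0 / xi) * math.log(s)
                  - s ** (-1.0 / xi))
    return total


def prefers_time_dependent(l1, l2, extra_params=3):
    """Likelihood ratio test of model 1 (l1) against model 2 (l2).

    l1 and l2 are maximized log-likelihoods; extra_params is the number
    of additional parameters in model 2 (3 for b, d and f).
    """
    deviance = 2.0 * (l2 - l1)
    return deviance > CHI2_CRIT_5PCT[extra_params]


# Hypothetical annual maxima (cm); the constants are not fitted values,
# so these are evaluations of the log-likelihood, not maximized values.
maxima = [18.2, 21.5, 16.9, 25.3, 19.8, 22.4, 17.6, 20.1]
l_const = gev_loglik_linear(maxima, 18.0, 0.0, 3.0, 0.0, 0.1, 0.0)
l_trend = gev_loglik_linear(maxima, 17.0, 0.25, 3.0, 0.0, 0.1, 0.0)
print(l_const, l_trend)
print(prefers_time_dependent(-100.0, -95.0))
```

In practice the six constants would be found by passing the negative of this function to a numerical optimizer, once with b, d and f fixed at zero and once with all six free.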
The log-likelihood for the GEV distribution with parameters that are a function of time is given by:

$$\ell(\mu,\sigma,\xi) = -\sum_{t=1}^{m}\left\{\log\sigma(t) + \left(1+\frac{1}{\xi(t)}\right)\log\left[1+\xi(t)\left(\frac{x_t-\mu(t)}{\sigma(t)}\right)\right] + \left[1+\xi(t)\left(\frac{x_t-\mu(t)}{\sigma(t)}\right)\right]^{-1/\xi(t)}\right\} \tag{13}$$

given that $1+\xi(t)\left(\frac{x_t-\mu(t)}{\sigma(t)}\right) > 0$ for all $t = 1, \dots, m$ (Coles, 2004).

Once parameters were estimated for both cases, the models were compared to determine whether the time covariate gives a statistically significant better fit and parameters are indeed changing with time. In order to test model 1 against model 2, the likelihood ratio test was utilized. If $\ell_1$ and $\ell_2$ represent the maximized log-likelihoods of model 1 and model 2, respectively, then a deviance statistic is given by

$$D = 2(\ell_2 - \ell_1)$$

Assuming a chi-square distribution, a quantile, $c_\alpha$, at significance level $\alpha$ can be determined; if $D > c_\alpha$, model 2 explains significantly more of the variation in the data (Coles, 2004). In cases where a model with time-dependent parameters shows a significantly better fit based on the likelihood ratio test, fits were further investigated by examining standard quantile plots. However, because model 2 is nonstationary and parameters vary at each observation, the random variable X should be transformed to a new variable Z for the quantile plot. A transform to the standard Gumbel distribution is given by (Coles, 2004):

$$Z_t = \frac{1}{\xi(t)}\log\left[1+\xi(t)\left(\frac{X_t-\mu(t)}{\sigma(t)}\right)\right] \tag{14}$$

1.3 Results and Discussion

In order to examine the general bounds and consistency of the data, average, maximum and minimum values were calculated for variables at each gage. Table 1-2 gives a summary of the variables determined at each rainfall gage as well as the standard deviation, variance and average for all gages. From Table 1-2, it is apparent that the data are generally quite consistent. The standard deviation for each of the four variables (annual rainfall, annual rainfall days, annual event maximum and annual interevent days) indicates minimal variation around the average.
One exception is gage G17, which has the lowest average annual rainfall and the lowest average annual maximum event rainfall. Although the values for both variables are within three standard deviations of the mean, indicating the gage may not be a complete outlier, there are some potential physical reasons this gage may not be consistent with the others. This is the southernmost gage, located 40 kilometers from Moon Lake, and is near the northernmost part of the South Tampa peninsula. It is possible that precipitation dynamics at this gage are influenced by proximity to Tampa Bay and Little Tampa Bay. Furthermore, this gage is in the most urbanized area of all the gages, which may have an effect on rainfall patterns.

Table 1-2: Precipitation variable summary

            Annual rainfall (cm)     Annual rainfall days     Annual max. event rainfall (cm)   Annual interevent days
Gage No.    Avg.     Max     Min     Avg.    Max     Min      Avg.    Max     Min               Avg.    Max     Min
G1          145.3    236.    88.4    61.1    103.    41.0     19.8    42.2    9.9               39.4    71.0    19.0
G2          127.0    199.    92.2    55.9    71.0    38.0     18.0    43.7    6.6               40.1    67.0    25.0
G3          130.3    175.    95.3    54.9    74.0    39.0     19.1    38.6    9.4               42.7    84.0    24.0
G4          130.3    172.    78.2    57.2    70.0    41.0     17.3    34.5    10.2              40.4    82.0    22.0
G5          142.0    197.    83.1    60.6    81.0    32.0     20.1    44.7    11.4              40.4    76.0    22.0
G6          130.0    190.    87.9    55.0    80.0    34.0     19.3    40.1    11.2              42.7    76.0    24.0
G7          128.5    190.    87.6    54.8    72.0    33.0     17.8    33.3    8.4               44.3    94.0    21.0
G8          142.2    209.    95.8    58.5    73.0    43.0     18.8    30.0    9.4               39.3    67.0    21.0
G9          139.2    203.    65.8    55.7    72.0    35.0     19.1    40.4    7.6               42.5    79.0    24.0
G10         131.8    187.    89.2    54.9    73.0    42.0     19.3    46.5    9.9               41.4    66.0    24.0
G11         150.4    225.    101.    55.5    76.0    35.0     18.0    30.0    10.4              36.5    76.0    19.0
G12         148.1    255.    96.3    58.6    87.0    42.0     21.3    38.4    10.2              41.5    73.0    24.0
G13         148.8    225.    62.2    64.0    86.0    27.0     19.6    32.5    9.4               39.0    105.    19.0
G14         136.7    212.    75.2    59.6    79.0    38.0     20.1    65.0    11.2              40.9    81.0    24.0
G15         139.4    222.    85.6    59.8    71.0    41.0     18.3    34.5    8.6               40.2    66.0    22.0
G16         138.2    192.    92.5    59.7    79.0    45.0     17.8    39.6    9.9               41.7    78.0    23.0
G17         115.6    172.    75.9    51.9    68.0    37.0     16.0    35.1    7.9               41.3    69.0    23.0
G18         142.2    186.    93.2    59.3    82.0    41.0     18.0    36.3    9.9               39.9    70.0    23.0
Std. Dev.   9.1      23.1    10.7    3.0     8.4     4.6      1.3     8.1     1.3               1.8     10.2    1.9
Var.        32.5     209.    44.7    9.0     70.7    21.2     0.5     25.4    0.8               3.1     104.    3.7
Total       136.9    202.    85.9    57.6    77.6    38.0     18.8    39.1    9.7               40.8    76.7    22.4

1.3.1 Spatial Analysis

Both the gamma and GEV distributions gave generally good fits to the respective variables they were applied to, based upon the Kolmogorov-Smirnov 99-percent test statistic. The parameters for the gamma distribution based upon maximum likelihood estimation, as well as goodness-of-fit estimations, are summarized in Table 1-3. Since gage G16 has the longest record and is generally representative of the fit dynamics at other gages, quantile-quantile plots for each variable at this gage are displayed for visual inspection (Figures 1-2, 1-4 and 1-6).

Table 1-3: Gamma variable parameter summary

            Annual Rainfall                      Annual Rainfall Days
Gage No.    Shape    Scale    KS*     KS D**     Shape    Scale    KS*     KS D**
G1          16.49    8.81     0.28    0.11       20.98    2.91     0.28    0.06
G2          28.47    4.46     0.34    0.14       57.84    0.97     0.34    0.16
G3          32.76    3.98     0.27    0.14       47.77    1.15     0.27    0.1
G4          34.22    3.81     0.29    0.10       48.85    1.17     0.29    0.14
G5          33.77    4.21     0.29    0.07       28.47    2.13     0.29    0.17
G6          34.06    3.82     0.24    0.08       44.6     1.23     0.24    0.12
G7          25.12    5.12     0.26    0.10       34.34    1.6      0.26    0.14
G8          29.47    4.82     0.29    0.12       52.53    1.11     0.29    0.08
G9          18.61    7.47     0.29    0.13       45.93    1.21     0.29    0.15
G10         31.60    4.17     0.29    0.10       47.55    1.15     0.29    0.09
G11         21.30    7.06     0.28    0.11       40.59    1.37     0.28    0.12
G12         17.34    8.54     0.27    0.12       27.97    2.09     0.27    0.14
G13         18.20    8.19     0.27    0.12       22.64    2.83     0.27    0.11
G14         20.60    6.63     0.27    0.09       26.24    2.27     0.27    0.12
G15         22.30    6.26     0.28    0.06       77.83    0.77     0.28    0.12
G16         32.71    4.22     0.23    0.11       53.54    1.12     0.23    0.1
G17         22.93    5.04     0.24    0.11       60.03    0.86     0.24    0.11
G18         25.39    5.60     0.32    0.11       35.79    1.66     0.32    0.07

*KS refers to the Kolmogorov-Smirnov statistic at 99 percent significance
**KS D refers to the Kolmogorov-Smirnov test statistic, D, for the gamma distribution

1.3.1.1 Annual Rainfall

Examining the quantile-quantile plot (Figure 1-2) of the observed annual
rainfall and the gamma predictions provides a visual confirmation of the fit; observed and fitted values roughly follow a 45-degree line, indicating agreement. Below approximately 110 cm and above 160 cm of rainfall, the data points in Figure 1-2 begin to deviate somewhat from the 45-degree match line. The cumulative distribution for each gage, with the average distribution and corresponding 99-percent confidence limits superimposed, is shown in Figure 1-3. The 99-percent lower confidence limit for the average distribution contains the fits for all gages with the exception of gage G17, which is well outside this limit for all frequencies. Also observed in Figure 1-3, and consistent with the quantile plot, for frequencies near the 0.7 percentile, corresponding to nearly 165 cm of rainfall, the fits for gages G1, G11, G12 and G13 begin to fall slightly outside of the upper confidence limit.

Figure 1-2: Gage G16 quantile plot of annual rainfall (quantiles of G16 against quantiles of Gamma(shape = 32.71, scale = 4.22))

Figure 1-3: Annual rainfall cumulative distribution (F(x) against annual rainfall in cm for gages G1 to G18, with the average and the upper and lower confidence limits)

1.3.1.2 Rainfall Days per Year

The quantile-quantile plot for gage G16 rainfall days per year (Figure 1-4) indicates a fit with slight variation around the match line near the lower percentiles and above 60 days. The cumulative distribution (Figure 1-5) shows that while only gage G15 falls outside of the confidence limits at a low percentile, several fits fall outside the confidence limits at high percentiles. Gage G17 falls outside the lower confidence limit above the 0.18 percentile and all other gage fits are contained by the lower limit.
Along the upper confidence limit, gage G13 falls outside above the 0.38 percentile and gages G1, G5 and G14 fall outside above approximately the 0.7 percentile, corresponding to just above 60 days. The parameters for the GEV distribution based upon maximum likelihood estimation, as well as goodness-of-fit estimations, are summarized in Table 1-4. It can be seen from the table that the GEV distribution generally gives a good fit based upon the Kolmogorov-Smirnov test statistic. All but two gage fits, including G17, are contained at the 0.5 percentile.

Figure 1-4: Gage G16 quantile plot of the rainfall days per year (quantiles of G16 against quantiles of Gamma(shape = 53.54, scale = 1.115))

Figure 1-5: Rainfall days per year cumulative distribution (F(x) against annual rainfall days (>0.5 cm) for gages G1 to G18, with the average and the upper and lower confidence limits)

Table 1-4: GEV variable parameter summary

            Annual Maximum Event Rainfall                    Annual Maximum Interevent Days
Gage No.    Location   Scale   Shape   KS*     KS D**        Location   Scale   Shape   KS*     KS D**
G1          16.32      5.16    0.10    0.28    0.10          36.27      9.20    0.03    0.28    0.10
G2          14.90      5.65    0.00    0.34    0.12          36.97      11.47   0.04    0.34    0.16
G3          15.91      4.67    0.10    0.27    0.12          35.03      9.37    0.22    0.27    0.11
G4          13.84      3.81    0.28    0.29    0.06          35.82      9.03    0.15    0.29    0.08
G5          16.93      4.33    0.14    0.29    0.11          33.66      9.80    0.10    0.29    0.07
G6          16.01      4.07    0.23    0.27    0.08          40.79      10.89   0.03    0.24    0.06
G7          15.04      4.69    0.01    0.26    0.07          35.72      8.85    0.09    0.26    0.07
G8          16.55      4.82    0.14    0.29    0.07          33.67      9.80    0.10    0.29    0.09
G9          15.93      5.90    0.05    0.29    0.10          33.70      9.16    0.14    0.29    0.10
G10         15.61      4.66    0.17    0.29    0.10          33.76      11.81   0.12    0.29    0.09
G11         16.11      4.12    0.13    0.28    0.08          30.85      8.59    0.08    0.28    0.11
G12         18.35      5.55    0.05    0.27    0.10          34.62      8.39    0.22    0.27    0.08
G13         16.91      5.27    0.09    0.27    0.09          32.04      8.99    0.16    0.27    0.11
G14         16.06      4.30    0.25    0.27    0.12          34.66      7.92    0.18    0.27    0.11
G15         15.62      4.83    0.00    0.28    0.10          35.95      9.15    0.13    0.28    0.10
G16         14.97      3.71    0.17    0.23    0.10          34.55      9.44    0.17    0.23    0.11
G17         12.98      4.21    0.15    0.24    0.08          36.75      8.40    0.04    0.24    0.08
G18         15.76      3.84    0.03    0.32    0.09          34.72      8.49    0.03    0.32    0.12

*KS refers to the Kolmogorov-Smirnov statistic at 99 percent significance
**KS D refers to the Kolmogorov-Smirnov test statistic, D, for the GEV distribution

1.3.1.3 Event Rainfall Annual Maximum

The GEV fitted variables demonstrated somewhat more variability at the extreme end of the scale. Examining the quantile plot for the gage G16 fit (Figure 1-6), a good fit is evident up to approximately 20 cm of rainfall, beyond which a large deviation from the match line is observed. This deviation is likely due to the fact that some of these event rainfall maximums are a result of hurricanes or tropical storms, which are not part of normal rainfall mechanisms or distributions. The cumulative distribution for all gages for the event rainfall annual maximum (Figure 1-7) indicates that several GEV fits fall outside the confidence limits near the 0.8 percentile, corresponding to approximately 24 cm of rainfall.
Gages G6, G14 and G16 fall outside of the lower confidence limit above the 0.86 percentile while gages G2, G8, G9, G11 and G13 fall outside of the upper confidence limit above the 0.76 percentile. The fits of gages G4, G12 and G17, located 16, 32 and 40 kilometers from Moon Lake, respectively, fall outside well before the 0.8 percentile. Gage G17 falls entirely outside of the lower confidence limit and gage G4 falls outside the lower confidence limit above the 0.64 percentile. Gage G12 falls outside the upper confidence limit above the 0.42 percentile. All but two gage fits, including G17, are contained at the 0.5 percentile.

Figure 1-6: Gage G16 quantile plot of event rainfall annual maximum (quantiles of G16 against quantiles of Generalized Extreme Value(location = 14.965, scale = 3.708, shape = 0.171))

Figure 1-7: Event rainfall annual maximum cumulative distribution (F(x) against event rainfall annual maximum in cm for gages G1 to G18, with the average and the upper and lower confidence limits)

1.3.1.4 Annual Maximum Interevent Days

As shown in Figure 1-8, the quantile plot for gage G16 shows similar dynamics to other gages for the annual maximum interevent days. A good fit is evident up to approximately 65 days of interevent time. From this point, some deviation from the match line is observed. The cumulative distribution (Figure 1-9) exhibits substantial variability at higher percentiles. Nearly all gage fits are contained within the confidence limits up to approximately the 0.74 percentile, with the exception of gage G6, which falls just outside the upper confidence limit from the 0.30 percentile. From the 0.74 percentile upwards, gages G11, G12, G13, G14 and G16 fall slightly outside the lower confidence limit.
Although the interevent annual maximum exhibits more variability outside the 99-percent confidence limits associated with the average distribution, a case for regional homogeneity can be made. Six of the gages that fall outside the confidence limits are greater than 32 kilometers from Moon Lake. Gage G6 is the only gage that falls outside the limits prior to the 0.72 percentile. All other fits are contained at the 0.5 percentile.

In terms of establishing spatial homogeneity with regard to water resource management, it is important to establish average expected conditions. Although for most of the variable fits analyzed there were areas at the upper and lower percentiles where a few individual gage fits exceeded the 99-percent upper or lower confidence limit of the average, nearly all gage fits were contained at the 0.5 percentile, representing the average annual variable value. Gage G17 was the only fit that consistently fell outside the confidence limits for multiple variables. As previously discussed, this gage is located the furthest from Moon Lake and may be subject to different rainfall forcing mechanisms. Most of the other fits that exceeded the confidence bounds at high or low percentiles exceeded them at locations where the fits themselves break down. Exceeding the confidence limits slightly at high or low percentiles is most likely a function of the fit variability and not an indication of spatial nonhomogeneity. Thus, the precipitation variables evaluated appear to exhibit spatial homogeneity within the given confidence limits for the region analyzed. An average distribution of rainfall variables that could be used for water resources management and planning within the region studied was developed. Furthermore, the methods utilized to evaluate spatial variability and develop representative distributions could easily be applied to other regions.
Figure 1-8: Gage G16 quantile plot of the annual maximum interevent days (quantiles of G16 against quantiles of Generalized Extreme Value(location = 34.553, scale = 9.436, shape = 0.167))

Figure 1-9: Interevent days annual maximum cumulative distribution (F(x) against interevent annual maximum in days for gages G1 to G18, with the average and the upper and lower confidence limits)

1.3.2 Temporal Analysis

Although the total annual rainfall and rainfall days per year were fitted with the gamma distribution for purposes of spatial analysis, they were fitted with the GEV distribution in order to investigate temporal variability in a manner consistent with the other variables and to compare relative changes in parameters with covariates. As such, the Kolmogorov-Smirnov test statistic was applied to these fits as well; nearly all of the variables at each gage gave as good or better a fit with the GEV distribution. This is largely due to the Weibull distribution, which is also often applied to rainfall data, being a subdistribution of the GEV. The use of the Weibull distribution is consistent with the findings of Burgueno et al (2004), who found the Weibull distribution well suited to fit time intervals between rainfall, and Sharda and Das (2005), who found the Weibull distribution fit weekly rainfall data better than the gamma distribution, as well as other rainfall research. As a check, the gamma parameters of the original fits were also allowed to vary with time, with very similar results in terms of trend identification.

All parameters were allowed to vary with time; however, no significant improvement was evidenced by using one parameter over another. Therefore, for purposes of model comparison, the location parameter was allowed to vary with time in model 2 and this was compared with the constant parameter model 1.
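The comparison of model 2 against model 1 can be illustrated with a short Python sketch. It assumes that model 2 adds a single extra parameter (the slope of the location trend), so the deviance is compared with a chi-square distribution with one degree of freedom, for which the 99-percent quantile is about 6.63. The function names are hypothetical; the deviance values in the usage lines are the gage G1 entries reported in Table 1-5.

```python
import math

def deviance(loglik_model1, loglik_model2):
    """Deviance statistic D = 2 * (l2 - l1) for nested models."""
    return 2.0 * (loglik_model2 - loglik_model1)

def chi2_sf_1df(d):
    """P(chi-square with 1 degree of freedom > d) = erfc(sqrt(d / 2))."""
    if d <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(d / 2.0))

def trend_is_significant(d, alpha=0.01):
    """True when the time-dependent model fits significantly better."""
    return chi2_sf_1df(d) < alpha

# Gage G1 deviance values as reported in Table 1-5.
g1 = {
    "annual rainfall": 9.05,
    "annual rainfall days": 17.06,
    "annual maximum event rainfall": 0.24,
    "annual maximum interevent days": 11.08,
}
for name, d in g1.items():
    print(f"{name}: D = {d:.2f}, p = {chi2_sf_1df(d):.4f}, "
          f"trend = {trend_is_significant(d)}")
```

Working with the p-value rather than a fixed critical value makes it easy to repeat the comparison at other significance levels; with more than one added parameter, the degrees of freedom (and therefore the reference distribution) would change.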
Table 1-5 gives a summary of the likelihood ratio for the two models at the 99-percent significance level.

Table 1-5: GEV model comparison

            Annual Rainfall     Annual Rainfall Days   Annual Maximum Event Rainfall   Annual Maximum Interevent Days
Gage No.    M2/M1*    **        M2/M1*    **           M2/M1*    **                    M2/M1*    **
G1          9.05      6.63      17.06     6.63         0.24      6.63                  11.08     6.63
G2          0.49      6.63      1.70      6.63         0.08      6.63                  2.73      6.63
G3          0.75      6.63      0.42      6.63         0.07      6.63                  0.49      6.63
G4          0.79      6.63      6.92      6.63         3.73      6.63                  0.20      6.63
G5          1.26      6.63      1.81      6.63         1.06      6.63                  3.39      6.63
G6          0.34      6.63      0.02      6.63         0.95      6.63                  0.03      6.63
G7          0.02      6.63      2.75      6.63         0.00      6.63                  0.17      6.63
G8          0.00      6.63      0.18      6.63         0.47      6.63                  0.04      6.63
G9          0.03      6.63      0.00      6.63         2.18      6.63                  0.30      6.63
G10         0.29      6.63      1.17      6.63         0.26      6.63                  2.21      6.63
G11         1.11      6.63      3.51      6.63         2.38      6.63                  0.61      6.63
G12         5.26      6.63      13.72     6.63         1.11      6.63                  0.57      6.63
G13         0.00      6.63      14.60     6.63         0.00      6.63                  2.47      6.63
G14         6.46      6.63      3.55      6.63         0.40      6.63                  0.75      6.63
G15         0.76      6.63      0.21      6.63         0.00      6.63                  0.93      6.63
G16         0.01      6.63      2.42      6.63         4.14      6.63                  0.99      6.63
G17         0.27      6.63      0.02      6.63         1.35      6.63                  0.65      6.63
G18         1.01      6.63      0.35      6.63         6.19      6.63                  0.05      6.63

*Model 2/Model 1 likelihood ratio
**99-percent significance (the likelihood ratio should be greater to indicate a trend)

From the table it can be seen that gages G4, G12 and G13 show a weak trend in the rainfall days per year. In the case of G4, this is an upward trend; G12 and G13 demonstrate a downward trend. Because a similar trend in the other variables is not evident, a possible increase or decrease in rainfall days does not correspond to changes in annual rainfall or to changes in event maximums. Furthermore, an increase or decrease in rainfall days would be expected to correlate to a decrease or increase, respectively, in annual maximum interevent days, which has not occurred at these three gages. As such, the trend for these gages does not appear to be significant. Gage G1, however, shows a significant trend in rainfall days, interevent days and annual rainfall.
Looking at the trend equations for the location parameter for each variable:

Rainfall Days: $\mu = 70.90 - 0.88t$ (15)
Interevent Days: $\mu = 23.53 + 0.64t$ (16)
Annual Rainfall: $\mu = 156.05 - 1.72t$ (17)

It can be seen that the location parameter of the GEV fit for the number of rainfall days decreases by approximately 0.88 per year. This statistically and logically corresponds to a 0.64 per year increase in the location parameter of the interevent days GEV fit and a 1.72 per year decrease in the location parameter of the total annual rainfall GEV fit. Because annual rainfall days exhibited the strongest trend, quantile and probability plots for model 2 (Figure 1-10) were visually investigated to confirm the model fit.

Figure 1-10: Quantile plot for gage G1 annual rainfall days model 2

From the plot, the fit is adequate. Other gage G1 variables were investigated visually and similar results were found. The graph of the observed variable data (Figure 1-11) also suggests that the trends indicated by model 2 are apparent. It is possible that, given the lack of similar trends for all other gages analyzed, the trends observed at this particular gage are due to time scale and a longer record would weaken the apparent trend. Focusing on the annual rainfall variable, which has been divided by three to scale with the other variables in Figure 1-11, there appears to be a downward trend up until 2002 and 2004, when two of the highest rainfall totals on record occur. The annual rainfall days follow a similar pattern. The maximum interevent days variable is nearly a mirror image of the annual rainfall: an upward trend is apparent until 2002, after which several low extremes are recorded.
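The quantile plot for a nonstationary model relies on the transform to the standard Gumbel distribution given in equation (14). The Python sketch below shows one way the plotted points could be produced when only the location parameter varies with time. It is an illustration under assumptions: the intercept and slope are taken from equation (15), while the scale, the shape and the observed series are hypothetical placeholders, and the plotting position i/(m + 1) is the usual convention rather than something stated in the text.

```python
import math

def to_standard_gumbel(x, mu_t, sigma_t, xi_t):
    """Transform an observation to the standard Gumbel scale (xi_t != 0)."""
    y = 1.0 + xi_t * (x - mu_t) / sigma_t
    if y <= 0.0:
        raise ValueError("observation lies outside the support of the fitted GEV")
    return math.log(y) / xi_t

def gumbel_plot_points(xs, a, b, sigma, xi):
    """Return (model quantile, empirical quantile) pairs for a quantile plot.

    The location follows mu(t) = a + b*t for t = 1..m; scale and shape are
    held constant in this sketch.
    """
    zs = sorted(
        to_standard_gumbel(x, a + b * t, sigma, xi)
        for t, x in enumerate(xs, start=1)
    )
    m = len(zs)
    # Standard Gumbel quantile at plotting position i / (m + 1).
    return [(-math.log(-math.log(i / (m + 1))), z) for i, z in enumerate(zs, start=1)]

# Hypothetical rainfall-day counts; a and b come from equation (15),
# sigma and xi are placeholders chosen only for illustration.
rain_days = [72, 66, 70, 61, 68, 59, 63, 57, 60, 52]
for model_q, empirical_q in gumbel_plot_points(rain_days, 70.90, -0.88, 6.0, 0.1):
    print(f"{model_q:7.3f} {empirical_q:7.3f}")
```

Points that fall close to the one-to-one line indicate that the time-dependent model describes the data adequately, which is the visual check applied to gage G1.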
Given that only one of the gages analyzed demonstrates a significant trend based on time-dependent parameters of the GEV distribution, a very strong case for temporal stationarity in rainfall patterns in this region can be made. This method of allowing GEV parameters to vary with time is a robust method of analyzing trends and can be applied to rainfall data in other regions.

Figure 1-11: Gage G1 observed variable data (annual rainfall (cm/3), annual rainfall days, annual event maximum (cm) and annual interevent maximum (days) against year, 1975 to 2005)

1.4 Conclusions

The focus of this research was to determine the extent of spatiotemporal changes in precipitation patterns in a particular region utilizing methods that can be generalized and applied at the regional scale. Furthermore, the aim was to develop average distributions for each rainfall variable that can be applied to any water resource within the region analyzed, regardless of the proximity of gage data. Given the documented changes in spatial and temporal rainfall patterns in many parts of the world, it was expected that some changes in rainfall patterns would be evident in the region analyzed. However, based upon the spatial analysis, the vast majority of variables analyzed at each gage were confined to a 99-percent confidence band associated with the average fit, gamma or GEV, of the data. There were some exceptions; however, most of these were at gages at the outer fringes of the area analyzed and at percentiles near the high or low end. Nearly all of the fits were contained at the 0.5 percentile, representing the average annual variable value a particular water resource can expect to experience. The method utilized to establish spatial homogeneity can easily be applied to other areas.
Furthermore, developing an average, representative fit for a given region can be a powerful tool when forecasting return levels of the various variables and when managing and planning water resources.

With regard to temporal variability, it was also somewhat surprising that almost no significant trends were detected. Many of the gages investigated would have demonstrated a trend if analyzed with traditional methods such as ordinary least squares or the nonparametric Mann-Kendall test. However, varying parameters with time to detect a trend is a robust method to overcome traditional difficulties inherent in real-world hydroclimatological data, including limited or incomplete data, autocorrelation, and changes in collection methods. Furthermore, in cases where trends are detected, this method gives an estimate of how much distributions are changing with time. In essence, since the distribution itself is changing with time, return frequencies are a function of time and the trend in parameters gives an estimate of this change.

2.0 Statistical Changes of Lake Stages in Urbanizing Watersheds

2.1 Background

During the last century, the planet has experienced tremendous growth in population and development. In order to support this growth, an ever-increasing strain is being placed on water resources. While prolonged droughts or other changes in rainfall patterns present clear impacts to water resources, the urbanization of a basin changes the rainfall-runoff relationship in such a way that impacts to water resources management are not so clear. Many researchers have studied some of the impacts of urbanization on water resources in various watersheds. Meyer and Wilson (2001) found that streams in urbanized basins exhibited a reduction in baseflow.
In a similar study, Rose and Peters (2001) found that streams in urbanized watersheds demonstrated a decrease in baseflow, an increase in peak flows and a decrease in recession times for both baseflow and peak flow. Both studies attribute their findings to an increase in impervious area and rapid storm runoff from efficient collection systems and the corresponding decrease in infiltration. However, Meyer (2005) found no trend in baseflow changes in several streams with urbanized watersheds and attributed this to the low permeability of near-surface soils and presence of stormwater detention systems. Smith and Baeck (2002) found that the increased efficiency of the drainage network in a rapidly urbanizing stream watershed is the major factor in a positive trend in flood magnitude, primarily due to a shortened response time. Changnon and Demissie (1996) found that in two urban and two rural stream basins undergoing changes in rainfall and land use, the majority of the increase in mean flows was due to the land use changes. Furthermore, mean and peak flows in the urban watersheds demonstrated considerably more response to rainfall shifts. McMahon et al (2003) investigated changes in stream watersheds undergoing urbanization in three different locales. The study found that increased urbanization was related to increases in stream flashiness and variability but found less relation to the duration of high or low stage conditions.

Much of the literature has focused on accurate modeling and prediction of water levels in order to predict future levels, identify the contribution of individual forcing mechanisms such as land use change, or explore changes in the rainfall or runoff and water level response.
Altunkaynak (2007) employed artificial neural networks to model increases in Van Lake, Turkey, water levels and compared the results to traditional autoregressive moving average (ARMA) models; while the neural network outperformed the time series models, both had low error. In a similar study, Khan and Coulibaly (2006) compared a support vector machine and a seasonal autoregressive (SAR) model in long-term prediction of lake water levels and found that while the support vector machine outperformed the SAR model, both also had low error. Privalsky (1992) utilized a SAR model in combination with spectral analysis to study the statistical properties of Lake Erie, United States, water levels. Irvine and Eberhardt (1992) utilized an integrated autoregressive moving average (ARIMA) model to characterize water levels at Lake Erie and Lake Ontario. Montanari et al (1997) applied a fractionally differenced ARIMA model to Lake Maggiore, Italy, and found the model outperformed traditional ARIMA models, with some limitations in accounting for seasonality. Yin and Nicholson (2002) utilized an autoregressive model coupled with rainfall inputs to predict the levels of Lake Victoria (Tanzania, Uganda and Kenya).

Several studies focused on water budget models or physically based models to characterize changes in lake levels. Lenters (2004) found that changes in Lake Superior, United States and Canada, were primarily due to climatic and land use changes based upon a water budget model. Li et al (2007) and Jones et al (2001) also utilized a water budget model to determine that declines in water levels at Lake Qinghai, China, and at several lakes in Africa, respectively, were primarily due to climatic changes rather than land use or other changes. Elias and Ierotheos (2006) utilized a transfer function model, a dynamic linear relationship model and a physically based model to describe the relationship between precipitation and lake levels.
In addition to time series modeling, some of the research literature has applied regression to lake stages. Gao (2004) utilized multiple linear regression to develop quantiles of lake level fluctuations. McBean and Motiee (2008) utilized regression to identify long-term trends in precipitation, temperature and inflow to the Great Lakes of North America. Gibson et al (2006) found rainfall variability to be the primary driving force on Great Slave Lake, Canada, levels and developed a regression model for water level fluctuations based on this variable alone. However, Mendoza et al (2006) determined that lake level fluctuations can be estimated by regressing monthly mean precipitation as well as temperature. Lall et al (2006) developed a locally weighted polynomial regression model to forecast the Great Salt Lake, United States, biweekly volume.

Because lakes are vital for tourism, recreation, ecology, the environment and water supply, understanding statistical changes of lake dynamics in urbanizing watersheds is integral to effective future water resource management. In this research, changes in the statistical structure of lakes in urbanizing watersheds are investigated by evaluating serial differences in time series parameters, autocorrelation and variance as well as by developing a regression model to estimate weekly lake level fluctuations. The focus of the research was to develop general expectations for lake levels in urbanizing areas that can be applied globally utilizing methods that can be applied to other locations. The time series modeling involves fitting a seasonal integrated autoregressive moving average (SARIMA) model to lake levels over subperiods of approximately equal length over the data record available and identifying trends in parameter values. The regression model was fit to weekly fluctuations in water surface levels.
The regression model independent variables consist of rainfall components as well as lake stage and temperature components. These analyses were performed for six lakes in Pasco County, Florida, United States, that demonstrate consistent urbanization and have not been substantially anthropogenically altered other than the addition of control structures or culverts at two lakes as part of lake management for urban planning.

While some studies have focused on modeling stream stages or the effect of urbanization on stream stages, very few have focused on lake levels. For both streams and lakes, it is difficult to separate out the signal of urbanization from the multitude of forcing mechanisms inherent in an urbanized watershed, including pumping, surface withdrawal, dredging, filling, diversion and installation of control structures. Furthermore, lake stage data are typically less available than stream data; hence the previously cited lake regression analyses utilized a coarser time scale than that employed in this research. The focus on isolating the urbanization signal on lake levels and the methods utilized to identify statistical changes in lake levels in urbanizing basins as applied in this research were not found in the literature review.

The objectives of this research in regard to urbanizing lake watersheds are to:
1) Determine changes in the statistical structure of lake level time series.
2) Analyze serial changes in lake level autocorrelation and variance.
3) Identify changes in the runoff/baseflow relationship.

As a result of researching these objectives, general expectations for water resource managers will be developed, as these are requisite tools for effective planning.
2.2 Materials and Methods

2.2.1 Lake Information and Data

In order to sufficiently isolate the impacts of watershed urbanization, other change mechanisms, including anthropogenic alterations, pumping and climate change, must be accounted for or eliminated as much as possible. Paynter and Nachabe (2008) found that rainfall patterns within west-central Florida are spatially homogeneous and temporally stationary, and no significant shifts that would correlate to trends in lake levels were evident. For the present study, six lakes were selected within the west-central Florida region based on the availability of sufficient data, lack of direct lake withdrawals and lack of proximity to well fields. Some of the lakes selected have had control structures added while others have been relatively unaltered other than urbanization of the basin. Because changes such as adding a weir or a culvert to a lake are inherent to urbanization and management of water resources, these lakes were included in this analysis. Some of the lakes analyzed are flow-through lakes, which receive substantial flow from upstream water bodies and discharge to downstream lakes or wetlands. Other lakes are simply drainage lakes with no flow-through. Because all of the lakes are located within a fairly small geographic area, they generally evidence similar geologic characteristics, including being located in a silty-sandy environment above a limestone formation. A summary of the characteristics of the six lakes utilized can be found in Table 2-1. The presence of other potential signal sources in addition to urbanization, including flow from upstream lakes, control structures and adjacent wetlands, is also included in the table. Lake data available to the general public were obtained from the United States Geological Survey and the Southwest Florida Water Management District.
The data are generally daily but because there are several small gaps of a week or more throughout the data, the time series of each lake was converted to weekly increments by selecting the first day of available stage data, adding seven days to this date consecutively throughout the record and selecting the lake level that corresponds to each weekly date. In order to determine the degree of watershed urbanization, data were utilized from the United States census for 1980, 1990 and 2000 population counts at the census tract level. Population data prior to 1980 are only available at the county level for Pasco County. Census tracts were substantially larger than the lake basins studied. For each lake basin, population was distributed evenly across the tract and proportioned to each basin. Population density within each watershed, which has a more meaningful implication for estimating basin urbanization, was developed from these basin-specific population values. Lake Thomas and King Lake fall within the same census tract, as do Cow Lake and Lake Padgett. As such, these pairs of lakes will exhibit similar densities although the actual population numbers will be different. For each particular lake, rainfall data from nearby gages were utilized. Figure 2-1 provides an aerial view of the lakes and available rainfall gages. Figures 2-2 and 2-3 depict aerial views of Moon Lake and Cow Lake.
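The weekly conversion described above can be illustrated with a short sketch. The Python example below is a minimal illustration and not the processing actually used in this research; the function names and the stage values are hypothetical. It steps seven days at a time from the first available date and keeps the stage recorded on each weekly date, leaving a gap where no reading exists.

```python
from datetime import date, timedelta


def weekly_subsample(daily_levels):
    """Keep one stage per week, stepping 7 days from the first available date.

    daily_levels maps datetime.date -> lake stage (m); some days may be missing.
    Returns a list of (date, stage) pairs; stage is None when the weekly date
    has no reading.
    """
    if not daily_levels:
        return []
    current = min(daily_levels)
    last = max(daily_levels)
    weekly = []
    while current <= last:
        weekly.append((current, daily_levels.get(current)))
        current += timedelta(days=7)
    return weekly


def percent_complete(weekly):
    """Percent of weekly dates that have an observed stage."""
    observed = sum(1 for _, stage in weekly if stage is not None)
    return 100.0 * observed / len(weekly)


# Hypothetical 21-day record starting 1 January 1976; 8 to 10 January are missing.
start = date(1976, 1, 1)
record = {
    start + timedelta(days=i): 23.50 + 0.01 * i
    for i in range(21)
    if i not in (7, 8, 9)
}
weekly = weekly_subsample(record)
```

In this invented record the second weekly date falls inside the gap, so it is returned as a missing value that would have to be interpolated or excluded.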
Table 2-1: Lake characteristics summary

| Lake | Basin Area (km²) | Lake Area (km²) | Basin/Lake Ratio | Flow Through | Weir Structure | Adjacent Wetland Area (km²) | Adjacent Wetland Percent of Basin Area |
|---|---|---|---|---|---|---|---|
| Moon | 0.78 | 0.43 | 1.8 | N | N | 0.07 | 9.0 |
| Padgett | 17.09 | 0.79 | 21.6 | Y | N | 0.12 | 0.7 |
| Thomas | 2.59 | 0.66 | 3.9 | N | N | 0.12 | 4.6 |
| Ann Parker | 8.00 | 0.38 | 21.1 | Y | Y | 0.31 | 3.9 |
| King | 4.40 | 0.55 | 8.0 | Y | Y | 0.25 | 5.7 |
| Cow | 1.55 | 0.40 | 3.9 | N | N | 0.00 | 0.0 |

Figure 2-1: Lake and rainfall gage location

Figure 2-2: Moon Lake

Figure 2-3: Cow Lake

2.2.2 Time Series Analysis

Statistical changes in lake levels were first explored with time series analysis. For each lake, the total stage record was split into approximately equivalent units of less than 10 years and fit with a time series model to investigate any systematic changes in parameters. Some of these subunits were shifted slightly by one or two years to achieve nearly constant variance in time subseries that exhibited heteroscedasticity. Time series analysis assumes a stationary time series; any systematic change in the mean (trend) and variance and any periodic variations (seasonality) were accounted for. Surface water levels throughout the world have evidenced trends related to pumping, climate change or other factors and seasonal variation is found as water levels rise in the rainy season and fall in the dry season. In addition, longer seasonal trends, such as those induced by the approximately 10-year El Niño phenomenon, may also be present in hydrologic time series. For these data, any trends were removed by differencing, i.e., by subtracting adjacent values. Seasonality was also removed by differencing, i.e., by subtracting values one period away. For purposes of this research, the period utilized was 12 months. Based on the literature (Irvine and Eberhardt, 1992; Altunkaynak, 2007), autoregressive moving average models (ARMA) are often found to fit lake data well.
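As a minimal sketch of the differencing described above, the Python example below applies a simple difference (subtracting adjacent values) and a seasonal difference (subtracting values one period away) to a short series. The series, the function name and the period of 4 are invented for brevity; the research itself used a 12-month period.

```python
def difference(series, lag=1):
    """Return the differenced series x[t] - x[t - lag]."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical stages (m): a linear trend plus a repeating cycle of period 4.
period = 4
cycle = [0.0, 0.3, 0.0, -0.3]
stages = [20.0 + 0.05 * t + cycle[t % period] for t in range(12)]

simple_diff = difference(stages, lag=1)         # subtract adjacent values
seasonal_diff = difference(stages, lag=period)  # subtract values one period away
```

In this invented series the repeating cycle cancels exactly in the seasonal difference, leaving an approximately constant value of 0.2 m per period that reflects the trend alone.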
Although there are more robust methods for forecasting lake levels, such as neural networks, the focus of this research is on changes in the statistical structure of lake levels, for which time series analysis is aptly suited. An autoregressive process is based completely on previous values of the time series. An autoregressive process of order p is given by:

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + w_t \tag{18}$$

where $x_t$ is the current value of the lake level time series, $x_{t-p}$ is the time series value at lag p, $\phi_p$ is a unique constant parameter for each lagged value and $w_t$ is Gaussian white noise with mean zero. A moving average process is based completely on previous values of white noise. A moving average process of order q is given by:

$$x_t = w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \cdots + \theta_q w_{t-q} \tag{19}$$

where $\theta_q$ is a unique constant parameter for each white-noise value. An ARMA process combines equations 18 and 19 and is said to be of order p, q. For most of the lakes studied, an ARMA model with simple differencing was required, yielding an integrated autoregressive moving average (ARIMA) model of order p, d, q where d is the number of times the series needs to be differenced to remove trends and achieve stationarity. Once the series was stationary, ARMA parameters were determined using maximum likelihood estimation. The likelihood of the model is given by:

$$L(\beta, \sigma_w^2) = (2\pi\sigma_w^2)^{-n/2} \left[ r_1^0(\beta)\, r_2^1(\beta) \cdots r_n^{n-1}(\beta) \right]^{-1/2} \exp\left[ -\frac{S(\beta)}{2\sigma_w^2} \right] \tag{20}$$

where

$$S(\beta) = \sum_{t=1}^{n} \left[ \frac{\left( x_t - x_t^{t-1}(\beta) \right)^2}{r_t^{t-1}(\beta)} \right] \tag{21}$$

and $\beta = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)$ is the vector of model parameters, $\sigma_w^2$ is the variance of white noise and $r_t^{t-1}$ is the mean squared error of the one-step-ahead prediction, i.e., of $x_t - x_t^{t-1}$. Parameter estimates were obtained by maximizing (20) with respect to $\beta$ and $\sigma_w^2$ (Shumway and Stoffer, 2006). A seasonal component was added to the ARIMA model in some cases to adequately capture periodic fluctuations. A seasonal ARIMA is given by:

$$\Phi_P(B^s)\,\phi(B)\,x_t = \Theta_Q(B^s)\,\theta(B)\,w_t \tag{22}$$

where

$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps} \tag{23}$$

and

$$\Theta_Q(B^s) = 1 + \Theta_1 B^s + \Theta_2 B^{2s} + \cdots + \Theta_Q B^{Qs} \tag{24}$$

are the seasonal autoregressive and seasonal moving average operators, respectively, of order P and Q with seasonal period s. Here B is the backshift operator, $B x_t = x_{t-1}$, and $\phi(B)$ and $\theta(B)$ are the ordinary autoregressive and moving average operators corresponding to equations 18 and 19. In essence, the seasonal part of the model estimates current time series values from values one seasonal period or more in the past. SARIMA models are noted as ARIMA (p, d, q) x (P, D, Q) where D is the number of seasonal differences. Once a model was fitted to the data, diagnostics were performed to ensure randomness of the residuals, including inspecting the autocorrelations of the residuals, $r_e(h)$, where h is the lag, and the Ljung-Box statistic, given by:

$$Q = n(n+2) \sum_{h=1}^{H} \frac{r_e^2(h)}{n-h} \tag{25}$$

where n is the sample size and H is arbitrarily chosen, typically near lag 20 (Shumway and Stoffer, 2006). The test statistic Q is chi-square distributed and the null hypothesis of randomness is rejected if Q falls above the selected significance limit quantile. To achieve parsimony, the simplest model that adequately fit the data for all time series subsets was sought for each lake and model parameters were compared to identify any trends. In ranking the models, adjusted R² and the Bayesian information criterion (BIC) were utilized. Whereas R² is a common measure of the goodness-of-fit, BIC takes into account the number of parameters required to achieve the fit; lower BIC values indicate a preferred model. The general approach for each lake was to increase the number of parameters from a first-order autoregressive model and note any improvements in the R², Ljung-Box statistic or BIC criteria.

2.2.3 Autocorrelation and Variance

Lake levels exhibit significant autocorrelation, a measure of how related adjacent values are. The autocorrelation function, a dimensionless measure of linear dependence of time series values at lag k, is given by:

$$r_k = \frac{\sum_{t=1}^{N-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{N} (x_t - \bar{x})^2} \tag{26}$$

where N is the number of observations and $\bar{x}$ is the series mean. The potential change in lake level autocorrelation was evaluated by analyzing any serial changes in the autocorrelation of the lake levels. A time scale of weeks is too long to capture any changes in autocorrelation due to shifts in the rainfall/runoff response from urbanization. However, changes in the slower process of infiltration and baseflow into a lake, including a hypothesized reduction due to basin urbanization, should be evident. Furthermore, effects of changing mechanisms, such as a control structure, inherent to urbanization may also alter lake memory. Baseflow, for purposes of this research, is defined as the fraction of watershed rainfall that infiltrates the ground and subsequently, over weeks and months, enters the lake. This includes potential spillover from wetlands that may enter the lake well after a precipitation event has occurred. Statistically significant autocorrelation values for each time series subunit were approximated with an exponential fit so that they could be characterized by a single parameter and compared to other subperiods within each lake. The exponential fit is of the form:

$$r_k = e^{-\lambda k} \tag{27}$$

where $\lambda$ is a constant and k is the lag. Because some of the time subseries appear to exhibit heteroscedasticity when compared to one another, an F test for significantly different variances was employed.

2.2.4 Regression

Any changes in lake level statistical signatures found with time series modeling were further explored with regression. Instead of focusing on the water level itself, the differences between weekly stages were modeled. The time scale of overland flow to lakes is in hours or days while the time scale of baseflow recharge to lakes is in weeks. Hence, lake level fluctuations over a week contain both a runoff and a baseflow component; insufficient data are available to reduce the time scale and separate these components. As such, evaluating changes over time between the runoff fraction and baseflow fraction may be difficult unless these changes are prominent.
However, changes over time in the baseflow recharge to lakes derived from rainfall that has fallen more than a week in the past can be evaluated. For the regression model, the difference (Yt) of the current week's water level from that of the previous week was regressed against both the total rainfall for the current week (Rw), representing a combination of rainfall runoff and baseflow, and the rainfall total for the month preceding the current week (Rm), representing solely baseflow. In order to improve the goodness-of-fit of the model, terms for temperature (T) and starting water level (W) were also included. The average temperature helps to capture evaporation, which, contrary to rainfall, has an inverse relationship with lake levels. Although evaporation pan data for the lakes analyzed were not available, temperature is highly correlated to evaporation. The starting water level helps capture lake morphology since, in general, when a lake is at a lower level, less volume is available at a given elevation difference; as lake stage increases, generally the area of the lake also increases and larger amounts of runoff and baseflow are required to make a unit change in elevation. This variable also has an inverse relationship with lake stages since the higher the initial stage, the less impact rainfall ultimately has on fluctuations. The regression equation is given by:

$$Y_t = \beta_0 + \beta_1 R_w + \beta_2 R_m + \beta_3 T + \beta_4 W + \epsilon_t \tag{28}$$

where $\epsilon_t$ is the random error term. The parameters $\beta_0$ through $\beta_4$ were estimated with the method of least squares. In most cases, a longer period of lake level data was available than rainfall data and lake level records had to be truncated for the regression analysis. In order to assure parsimony, adjusted R² and Akaike's information criterion (AIC) were utilized on the entire data set to verify that each of the four variables substantially contributes to explaining water stage fluctuation for each lake.
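Equation 28 can be illustrated with a small least-squares sketch. The Python example below is an illustration under stated assumptions rather than the software used in this research: the coefficient values, the synthetic rainfall, temperature and stage data and all function names are invented, and the AIC expression shown, n ln(RSS/n) + 2k, is one common least-squares form that omits an additive constant.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """rows holds (R_w, R_m, T, W) tuples; returns [b0, b1, b2, b3, b4]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # normal equations


def predict(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


def residual_sum_of_squares(rows, y, beta):
    return sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))


def aic_least_squares(rss, n, k):
    """Assumed AIC form for least squares: n ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k


# Synthetic weekly data; every number below is invented for illustration.
random.seed(0)
true_beta = [0.02, 0.012, 0.001, -0.0008, -0.03]
rows = [
    (random.uniform(0, 10), random.uniform(0, 30),
     random.uniform(15, 30), random.uniform(0, 2))
    for _ in range(200)
]
y = [predict(true_beta, r) + random.gauss(0, 0.01) for r in rows]

beta_hat = fit_least_squares(rows, y)
rss = residual_sum_of_squares(rows, y, beta_hat)
aic = aic_least_squares(rss, len(y), len(beta_hat))
```

Dropping one regressor, refitting and comparing the resulting AIC against the full model mirrors the variable screening described above; a lower AIC indicates the preferred model.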
AIC takes into account the number of parameters required to achieve a particular fit; lower AIC values indicate a preferred model. The model selected for each lake consisted of the four variables in equation (28) or a subset thereof to be applied to each time subperiod. Several common diagnostics were run to verify there were no substantial correlations among the regressors and no overly influential outliers. In particular, the correlation matrix was calculated for each model while Cook's distance was utilized to measure the influence of specific data points and identify outliers. Data points with Cook's distances of near one are considered to have significant influence and merit further investigation. As with the time series analysis, sequential subunits of time were analyzed to identify any systematic changes in parameters for the independent variables. However, in the case of regression, since population in the region began to substantially increase in the 1970s, regression values for a time period as close to 1970 as possible were desired to represent pre-urbanization in each basin. As such, the first four years of data for each lake were modeled to represent the pre-urbanized lake dynamics and the remaining portion of the data was utilized to represent the urbanized lake basin dynamics; the two subperiods were compared to identify changes.

2.3 Results and Discussion

Table 2-2 provides a summary of the lake data. The average difference between the maximum and minimum for the lakes analyzed is 1.9 m. Given the very flat topography of west-central Florida, relatively small differences in water level fluctuations can inundate large areas and houses are routinely set as low as 0.3 m above expected high water marks. The standard deviation and variance are fairly consistent with ranges of 0.30 m and 0.23 m², respectively.
All but two lake data sets were greater than 95 percent complete when the number of weekly data points available is divided by the total number of weeks for the time period analyzed. The two lakes with less data had evenly spaced gaps usually less than 2 to 3 weeks apart for which weekly data could be interpolated easily from adjacent values. In the cases of King Lake and Cow Lake, data had to be truncated after 1991 as data points thereafter were too sparse to achieve meaningful results. Data for Lake Thomas had to be truncated after 1994 and the period from 1992 to 1999 had to be excised from Lake Ann Parker for similar reasons.

Based upon the GIS census data, population in the vicinity of each lake has measurably increased over the time periods studied. As the population grows, the watershed morphs from rural to residential development with significant increases in impervious area, channelized drainage provisions and possible infill due to the raising of lots for home construction. The population density growth around each lake is summarized in Table 2-3. The Moon Lake watershed exhibits the greatest overall gains with well over 100 percent density growth from both 1980 to 1990 and 1990 to 2000. Lake Ann Parker, although heavily urbanized, showed the smallest percent gains in population density while Lake Thomas showed the smallest densities overall. Cow Lake population density was considerably higher than that of other lakes.
Table 2-2: Regional lake data summary

| Lake | Period of Record | Average (m) (NGVD) | Maximum (m) (NGVD) | Minimum (m) (NGVD) | Percent Complete* | Standard Deviation (m) | Variance (m²) |
|---|---|---|---|---|---|---|---|
| Moon | 1965-2007 | 11.63 | 12.58 | 10.24 | 96.0 | 0.47 | 0.22 |
| Padgett | 1970-2003 | 21.18 | 21.9 | 20.23 | 95.5 | 0.25 | 0.06 |
| Thomas | 1968-2003 | 22.34 | 23.01 | 21.04 | 98.6 | 0.34 | 0.12 |
| Ann Parker | 1969-2007 | 15.91 | 16.8 | 14.75 | 72.4 | 0.36 | 0.13 |
| King | 1976-2007 | 21.85 | 22.53 | 20.39 | 100.0 | 0.27 | 0.07 |
| Cow | 1976-2007 | 23.56 | 24.11 | 23.03 | 82.3 | 0.17 | 0.29 |

*Note: Percent complete excludes years with too few data points to use for analysis

Table 2-3: Lake watershed population growth

| Lake | 1980 Density (pop./km²) | 1990 Density (pop./km²) | 2000 Density (pop./km²) | Percent Density Growth 80-90 | Percent Density Growth 90-00 |
|---|---|---|---|---|---|
| Moon | 19.9 | 92.7 | 238.4 | 366.2 | 157.2 |
| Padgett | 39.5 | 80.2 | 140.7 | 103.2 | 75.4 |
| Thomas | 27.5 | 55.9 | 59.5 | 103.2 | 6.4 |
| Ann Parker | 137.6 | 181.7 | 203.5 | 32.1 | 12.0 |
| King | 58.0 | 117.9 | 125.5 | 103.2 | 6.4 |
| Cow | 181.1 | 367.9 | 645.3 | 103.2 | 75.4 |

2.3.1 Time Series Modeling

Plots of the lake level time series do not display any obvious trends (refer to Figure 2-4 for Cow Lake stages). However, in cases where the autocorrelation of the raw data indicated that a trend is likely present, since the correlations slowly decay to insignificance, differencing was utilized. Because lake levels exhibit a high degree of autocorrelation, subtracting an adjacent value is not substantially different than subtracting a value one seasonal period ago; a single simple difference sufficiently removed any trend or seasonality in each of the lake time series. The model order for each lake as well as a summary of the parameters for each model can be found in Table 2-4. The Ljung-Box values were sufficiently low to ensure randomness of residuals. In a single subseries each for Moon Lake, Lake Padgett and Cow Lake, Ljung-Box values were slightly outside 95-percent significance limits for randomness of the residuals.
In order to bring Ljung-Box values within these limits, several additional parameters would be required, overfitting the other subseries for each lake. As such, the simpler overall model was chosen and is sufficient for the analysis herein. All lakes required at most two autoregressive terms. Moon Lake, Lake Thomas, King Lake and Cow Lake required a single differencing as well as a single seasonal term. Moon Lake, Lake Padgett and Lake Ann Parker required at most two moving average terms.

Figure 2-4: Cow Lake stages (1976-2007) [plot of stage (m) versus time (years)]

Table 2-4: SARIMA model parameters

| Lake / model order | Subseries Period | φ1 | φ2 | θ1 | θ2 | Φ1 | Ljung-Box |
|---|---|---|---|---|---|---|---|
| Moon (1,1,2)x(1,0,0) | 1965-1976 | 0.68 | N/A | 0.29 | 0.07 | 0.02 | 15.1 |
| | 1977-1986 | 0.72 | N/A | 0.49 | 0.00 | 0.03 | 23.2 |
| | 1987-1996 | 0.78 | N/A | 0.56 | 0.06 | 0.13 | 24.6 |
| | 1997-2006 | 0.81 | N/A | 0.68 | 0.09 | 0.01 | 35.0 |
| Padgett (2,0,1) | 1970-1975 | 1.15 | 0.20 | 0.11 | N/A | N/A | 21.5 |
| | 1976-1983 | 1.40 | 0.43 | 0.31 | N/A | N/A | 14.1 |
| | 1984-1991 | 1.594 | 0.62 | 0.45 | N/A | N/A | 20.3 |
| | 1992-2000 | 1.44 | 0.46 | 0.30 | N/A | N/A | 36.7 |
| Thomas (2,1,0)x(1,0,0) | 1968-1976 | 0.20 | 0.03 | N/A | N/A | 0.01 | 28.9 |
| | 1977-1985 | 0.09 | 0.10 | N/A | N/A | 0.12 | 18.4 |
| | 1986-1994 | 0.09 | 0.03 | N/A | N/A | 0.11 | 28.3 |
| Ann Parker (2,0,1) | 1969-1976 | 1.12 | 0.15 | 0.10 | N/A | N/A | 26.6 |
| | 1977-1984 | 1.20 | 0.23 | 0.02 | N/A | N/A | 14.3 |
| | 1985-1991 | 1.24 | 0.28 | 0.08 | N/A | N/A | 23.8 |
| | 2000-2007 | 1.71 | 0.72 | 0.46 | N/A | N/A | 15.9 |
| King (1,1,0)x(1,0,0) | 1976-1980 | 0.01 | N/A | N/A | N/A | 0.12 | 18.7 |
| | 1981-1985 | 0.17 | N/A | N/A | N/A | 0.18 | 23.7 |
| | 1986-1991 | 0.12 | N/A | N/A | N/A | 0.13 | 13.6 |
| Cow (1,1,0)x(1,0,0) | 1976-1980 | 0.03 | N/A | N/A | N/A | 0.13 | 17.2 |
| | 1981-1985 | 0.05 | N/A | N/A | N/A | 0.06 | 40.7 |
| | 1986-1991 | 0.22 | N/A | N/A | N/A | 0.07 | 17.7 |

From inspection of Table 2-4, it can be seen that Lakes Padgett and Ann Parker are the only lakes to exhibit a consistent pattern for all parameters.
For every subseries, Lake Ann Parker shows serial increases in the first autoregressive parameter and the first moving average parameter and a serial decrease in the second autoregressive parameter. Lake Padgett shows a nearly identical pattern with the exception of the most recent subseries. Lakes Padgett and Ann Parker have a substantially larger basin-to-lake area ratio than the other lakes; it is surmised that the effects of watershed urbanization are widespread enough compared to the lake itself to overcome the addition of outlet culverts and control structures and other potential anthropogenic influences on the statistical structure of the time series. While neither reflects serial changes in all parameters, Moon Lake demonstrates a consistent increase in the first autoregressive parameter and the first moving average parameter while Cow Lake demonstrates a consistent decrease in the autoregressive parameter. Moon Lake demonstrates the largest percent population density gains, representing the greatest relative change in urbanization, while Lake Padgett and Cow Lake demonstrate the largest population gain in overall numbers. Lake Ann Parker has consistently high density for the time periods analyzed and although its percent growth has been small, it has added substantial population. King Lake and Lake Thomas do not demonstrate appreciable patterns in parameters for consecutive subseries. These two lakes exhibit the lowest population density as well as the lowest percent growth for the most recent time period. It is possible that an insufficient level of urbanization was achieved within these watersheds to overcome other signals and cause serial changes in lake level time series. From the research, the degree and extent of urbanization appear to influence changes in the statistical structure of lake level time series.
Furthermore, these changes are apparent despite the multitude of other signals present in the basins. Figure 2-5 indicates that errors are random for the Cow Lake time series model. Similar results were achieved for other lakes.

Figure 2-5: Autocorrelation of residuals for Cow Lake (1976-1980, lag in weeks) [ACF of residuals for Cow Lake, with 5% significance limits for the autocorrelations]

2.3.2 Autocorrelation and Variance

The results of the autocorrelation analysis for the time subseries are found in Table 2-5, which includes the exponential parameter that characterizes the autocorrelation from lag 0 until the autocorrelation drops to insignificance. The table also displays the variance in lake levels for each time subperiod. As a basin becomes more and more urbanized, the increase in basin imperviousness and more efficient runoff collection systems should reduce the infiltration within the basin and decrease the time of concentration. This intuitively would lead to a larger fraction of runoff from a given rainfall event reaching a lake more quickly and reducing the much slower process of groundwater recharge to the lake. This change would conceivably result in higher peak stages and lower low stages, since in times of drought, baseflow replenishment would be reduced. It is surmised that a decrease in baseflow will translate to a reduction in autocorrelation in an urbanizing lake watershed with no other signals. A reduction in autocorrelation translates to a steeper autocorrelation curve with a larger exponential parameter. This is the case for Cow Lake and Lake Ann Parker; however, Moon Lake and Lake Padgett show a consistent trend towards longer memory and Lake Thomas and King Lake display a general trend towards longer memory. One potential reason for increased autocorrelation may be wetlands adjacent to the lakes.
Lake Thomas, King Lake and Moon Lake have the largest adjacent wetland area as a percent of the total basin. As a larger fraction of runoff from increasing urbanization flows into these adjacent wetlands, the average wetland water levels may increase and pass a larger amount of baseflow into the lake, increasing memory. It is also possible that wetlands are inherently more efficient at recharging the lakes than watershed infiltration due to their proximity. This is consistent with Meyer (2005), who found that although most streams in urbanized watersheds demonstrate a decrease in baseflow, streams with low-permeability near-surface soils and a substantial number of detention basins, which can serve as a recharge mechanism, demonstrated increases in baseflow. For the lakes that discharge into wetlands (all lakes except Cow Lake and King Lake), higher antecedent tailwaters for the lakes due to increased overall runoff volume into the wetland storage areas may also contribute to increases in lake memory. In cases where a control structure helps regulate lake stages, including Lake Ann Parker and King Lake, effects on autocorrelation are difficult to isolate; it is expected that if lake stages are lowered by the control structure, autocorrelation would generally decrease as less water is stored in the lake, while the opposite should occur if the control structure increases stages. From the data available, it is unclear when the control structures in these lakes were installed or modified. Differences in autocorrelation might also be difficult to filter out at Lake Padgett and Lake Ann Parker, which are flow-through lakes and receive flow from upstream lakes with their own sets of basin and lake alterations.
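The exponential parameter used to characterize lake memory can be illustrated with a brief sketch. The Python example below computes the sample autocorrelation in the form of equation 26 and fits the exponential form of equation 27. The fitting method (least squares on the logarithm of the autocorrelation, forced through lag 0), the significance cutoff of 1.96 divided by the square root of the sample size, and all function names are assumptions made for illustration; the fitting procedure actually used in this research is not stated.

```python
import math


def autocorrelation(x, k):
    """Sample autocorrelation of the series x at lag k (form of equation 26)."""
    n = len(x)
    mean = sum(x) / n
    numerator = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    denominator = sum((value - mean) ** 2 for value in x)
    return numerator / denominator


def significant_acf(x, max_lag):
    """ACF values from lag 0 up to, but excluding, the first insignificant lag.

    Assumption: significance is judged against the approximate 95-percent
    white-noise limit 1.96 / sqrt(n).
    """
    limit = 1.96 / math.sqrt(len(x))
    values = []
    for k in range(max_lag + 1):
        r_k = autocorrelation(x, k)
        if r_k < limit:
            break
        values.append(r_k)
    return values


def fit_exponential_parameter(acf_values):
    """Fit r_k = exp(-lam * k) by least squares on ln(r_k), through the origin.

    acf_values[k] is the autocorrelation at lag k; all values must be positive
    and at least one lag beyond lag 0 is required.
    """
    if len(acf_values) < 2:
        raise ValueError("need at least lags 0 and 1 to fit the decay")
    numerator = sum(k * math.log(r) for k, r in enumerate(acf_values))
    denominator = sum(k * k for k in range(len(acf_values)))
    return -numerator / denominator


# Hypothetical autocorrelations that decay exactly as exp(-0.1 k).
example_acf = [math.exp(-0.1 * k) for k in range(13)]
lam = fit_exponential_parameter(example_acf)
```

A larger fitted parameter corresponds to a steeper autocorrelation curve and therefore to shorter lake memory.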
Table 2-5: Autocorrelation and variance

| Lake | Subseries Period | Exponential Parameter | Variance | Significantly Different* |
|---|---|---|---|---|
| Moon | 1965-1976 | 0.034 | 0.13 | |
| | 1977-1986 | 0.035 | 0.11 | N |
| | 1987-1996 | 0.022 | 0.17 | Y |
| | 1997-2006 | 0.017 | 0.38 | Y |
| Padgett | 1970-1975 | 0.261 | 0.04 | |
| | 1976-1983 | 0.064 | 0.06 | Y |
| | 1984-1991 | 0.097 | 0.06 | N |
| | 1992-2000 | 0.099 | 0.07 | Y |
| Thomas | 1968-1976 | 0.063 | 0.05 | |
| | 1977-1985 | 0.040 | 0.08 | Y |
| | 1986-1994 | 0.033 | 0.06 | Y |
| Ann Parker | 1969-1976 | 0.060 | 0.13 | |
| | 1977-1984 | 0.104 | 0.06 | Y |
| | 1985-1991 | 0.110 | 0.08 | Y |
| | 2000-2007 | 0.237 | 0.20 | Y |
| King | 1976-1980 | 0.089 | 0.04 | |
| | 1981-1985 | 0.091 | 0.07 | Y |
| | 1986-1991 | 0.033 | 0.08 | N |
| Cow | 1976-1980 | 0.077 | 0.03 | |
| | 1981-1985 | 0.107 | 0.03 | N |
| | 1986-1991 | 0.219 | 0.01 | Y |

*Indicates if the current time period is significantly different from the previous at the 95-percent confidence level

As should be expected, the results of autocorrelation analysis are consistent with the time series modeling; similar patterns of autoregressive parameter increase or decrease through time are observed. However, for lakes that required two autoregressive terms, including Lake Padgett, Lake Thomas and Lake Ann Parker, the trends in parameters for these terms were usually opposite to one another. Most of the lakes demonstrated an increase in variance over time, although the variance for Cow Lake was nearly constant. While Lake Ann Parker had the highest variance in the most recent time period, it did not demonstrate a serial increase in variance as did the other lakes. With the exception of Moon Lake, the increases in variance were marginal and, in many cases, insignificant at the 95-percent confidence level.

Cow Lake appears to give the best representation of the effects of urbanization on lake level autocorrelation (Figure 2-6) as it is the only lake of the six studied that does not have an adjacent wetland or discharge to a wetland, is not a flow-through lake and does not have a control structure. Furthermore, there is a high degree of dense urbanization around the lake.
Cow Lake exhibits a significant decrease in autocorrelation over time. However, as it is the only lake studied with the aforementioned characteristics, a strong trend cannot be established.

Figure 2-6: Cow Lake autocorrelation with exponential fit lines (1976-1991) [autocorrelation versus lag (weeks) for 1976-1980, 1981-1985 and 1986-1991]

2.3.3 Regression

Table 2-6 gives a summary of the independent variables and associated R² for the pre-urbanized and urbanized time periods at each lake as well as the R² and AIC for the overall model, which includes the entire data set.

Table 2-6: Regression model parameters

| Lake | Subseries Period | Pre-week Rain* | Pre-month Rain* | Starting Lake Stage* | Temp.* | R² | AIC |
|---|---|---|---|---|---|---|---|
| Moon | All Values | | | | | 44.16 | 2.92 |
| | 1973-1976 | 0.0088 | 0.0009 | N/A | 0.0005 | 48.8 | |
| | 1977-2007 | 0.0084 | 0.0012 | 0.0092 | 0.0009 | 45.7 | |
| Padgett | All Values | | | | | 52.9 | 2.62 |
| | 1976-1979 | 0.0152 | N/A | 0.0334 | 0.0011 | 73.9 | |
| | 1980-2000 | 0.0133 | N/A | 0.0292 | 0.0005 | 48.0 | |
| Thomas | All Values | | | | | 53.0 | 1.30 |
| | 1976-1979 | 0.0096 | 0.0008 | N/A | 0.0009 | 62.3 | |
| | 1980-2000 | 0.0120 | N/A | 0.0218 | 0.0008 | 51.5 | |
| Ann Parker | All Values | | | | | 39.4 | 3.09 |
| | 1972-1976 | 0.0078 | 0.0017 | 0.0581 | N/A | 30.7 | |
| | 1977-2007 | 0.0097 | 0.0010 | 0.0214 | 0.0011 | 43.4 | |
| King | All Values | | | | | 52.8 | 1.61 |
| | 1976-1979 | 0.0155 | N/A | 0.0356 | 0.0009 | 66.5 | |
| | 1980-1991 | 0.0128 | 0.0006 | | 0.0010 | 48.1 | |
| Cow | All Values | | | | | 44.2 | 1.34 |
| | 1976-1979 | 0.0129 | 0.0010 | 0.0410 | 0.0005 | 68.6 | |
| | 1980-1991 | 0.0097 | N/A | 0.0794 | N/A | 36.3 | |

*N/A indicates the parameter was not significant at the 0.05 level of significance. AIC was run for the model selection on the entire data set only

There was a large spread in the multiple R² values for the subperiod models, from 30.7 to 73.9. However, a fair amount of uncertainty is expected due to the multitude of variables that contribute to lake levels as well as the availability of data. In time periods in which there were more data gaps, i.e., one or more weeks in which missing values had to be imputed by interpolation, R² values decreased.
Rainfall records were nearly 100 percent complete for the time periods analyzed; however, there is substantial regional variability in rainfall and none of the lakes had gages located immediately at the lakes themselves. Rainfall gages were within two to six km of the lakes analyzed. Although evaporation is highly correlated with temperature, it is also dependent on wind, humidity and other factors for which data were not available. While transpiration can be a significant fraction of the water budget in a shallow water table environment (Nachabe et al., 2005), data for the lakes studied were not available and a transpiration variable term was not included in the regression. Although the lakes are located in a geologically similar region and generally have silty and sandy soils, local differences in soil types, including the presence of wetlands adjacent to a lake, can have an influence on the rainfall-baseflow interaction within individual lakes. Given these factors, the obtained R² values are generally acceptable. The independent variables are significant in explaining the changes in lake levels. For all lakes, the regression model inclusive of all four independent variables was deemed most appropriate. Figure 2-7 demonstrates the fit of the model versus the observed values for Cow Lake. Figure 2-8 demonstrates the normality of the residuals for Cow Lake. Residuals for all regression models exhibited normality and homoscedasticity with some deviation from normality at the extremes. All values for the correlation coefficients of the regressors at each lake were equal to or less than 0.4, indicating there is no significant correlation. All lakes with the exception of King Lake exhibited Cook's distances of less than 0.5, indicating no presence of outliers. One data point for King Lake exhibited a Cook's distance of near one, indicating a value of significant influence. However, from inspection of the data, this appears to be a valid data point.
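Cook's distance, used above to screen for influential points, can be sketched for a regression with a single regressor. The Python example below is a simplified illustration and not the four-variable model of equation 28: it regresses a weekly stage difference on weekly rainfall only, and the data values and the function name are invented. For each point it evaluates D_i = e_i² h_i / (p s² (1 - h_i)²), where e_i is the residual, h_i the leverage, p the number of fitted parameters and s² the residual variance.

```python
def cooks_distances(x, y):
    """Cook's distance for each point of a simple linear regression of y on x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance
    leverages = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    return [
        (e * e / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical weekly rainfall (cm) and stage differences (m).
# The last point is a deliberately planted outlier.
rain = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stage_change = [0.000, 0.012, 0.019, 0.031, 0.040,
                0.049, 0.061, 0.070, 0.081, 0.300]

distances = cooks_distances(rain, stage_change)
flagged = [i for i, d in enumerate(distances) if d >= 1.0]
```

Points collected in `flagged` would merit inspection, following the rule of thumb stated earlier that distances near one indicate significant influence.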
Figure 2-7: Cow Lake response versus fit (1976-1980) [lake stage difference versus fitted values: Runoff + Baseflow + Lake stage + Temperature]

Figure 2-8: Cow Lake quantile-quantile plot of the residuals (1976-1980) [residuals versus quantiles of standard normal]

Because of the unique nature of each lake studied, including the presence of adjacent wetlands, control structures, degree of urbanization and flow-through characteristics, it is important to ascertain the individual aspects of each lake that contribute to changes in the regression parameters. For most of the lakes analyzed, the previous-week rainfall variable is fairly consistent with time and is always highly significant. Lake Thomas and Lake Ann Parker are the only lakes to demonstrate an increase in this parameter. Lake Padgett, King Lake and Cow Lake demonstrate a significant decrease in this parameter while Moon Lake shows a marginal decrease. Because this parameter includes the effects of both runoff and baseflow, it is difficult to draw significant conclusions as any increase in runoff may be offset by decreases in baseflow. Daily data would likely be required to reach definitive conclusions as to urbanization-induced changes in this parameter.

Most lakes demonstrated a trend towards decreased baseflow based upon analysis of the pre-month rainfall parameter. At Lake Thomas and Cow Lake, baseflow is significant in the pre-urbanized period but not thereafter, representing a reduction in baseflow. At Lake Ann Parker there is a significant decrease of 41 percent in this parameter. At Lake Padgett, the baseflow parameter is not significant for either time period. However, since the runoff/baseflow pre-week rainfall parameter is highly significant for both subperiods and due to the extremely large basin-to-lake-area ratio, this is likely more representative of a reduction in baseflow than a reduction in runoff.
Two sites exhibited increases in baseflow; at Moon Lake and King Lake, the baseflow parameter is either low or insignificant in the first time period and larger or significant thereafter. These two lakes have the highest adjacent wetland percentage of the lake basin, giving credence to wetlands being a mechanism of increased baseflow and offsetting some of the impacts of urbanization. Furthermore, despite Moon Lake demonstrating substantial gains in population density and having a higher density than all but Cow Lake in the most recent time period, baseflow still increased. The baseflow trends were consistent with the autocorrelation analysis; increasing baseflow correlated to longer autocorrelation while decreasing baseflow correlated to shorter autocorrelation. Lake Thomas and Lake Padgett were exceptions to this consistency.

Starting lake levels were significant for nearly all lakes. In the few cases where starting lake levels were not significant, lake levels were generally higher than in periods in which starting levels were significant. This is consistent with the morphology of most lakes in which lake surface area increases with depth and a similar volume of runoff makes for a smaller increase in stages as lake levels rise. The temperature variable is significant in most cases. In cases where it is not significant, this is probably due to factors such as wind or humidity exerting a greater relative impact on evaporation.

Because Lake Padgett, Lake Ann Parker and King Lake are flow-through lakes and Lake Ann Parker and King Lake have added control structures, it is difficult to isolate the signal of urbanization on the relative values of the regression parameters.
Although Moon Lake, Lake Thomas and Cow Lake do not have the aforementioned complications, the presence of wetlands adjacent to Moon Lake and Lake Thomas also clouds the results, since increased basin runoff may be stored in the wetlands and recharge the lake rather than entering the lake as baseflow from the watershed. As was the case for the autocorrelation analysis, Cow Lake provides the best case for examining any effects of watershed urbanization. Furthermore, despite the presence of many other signals unique to each lake, most lakes demonstrated an overall decrease in baseflow.

2.4 Conclusions

Separating the signal of lake-basin urbanization from the multitude of signals inherent in an urbanizing watershed is problematic. The particular lakes chosen for this study were not substantially influenced by pumping, surface water extraction or precipitation trends, helping isolate the effects of urbanization. Many of the lakes did exhibit other sources of influence on water levels, including the addition of control structures or culverts, presence of adjacent wetlands and inflow from upstream lakes. With regard to the time series modeling, lakes with a large basin-to-lake area ratio demonstrated definite trends in model parameters. As the basin/lake ratio increases, the urbanization signal is likely increased enough to be detected by the time series modeling. Furthermore, a significant increase in basin population density appears to systematically alter the time series signature, despite the presence of other conflicting signals. It was hypothesized that urbanization would shorten the autocorrelation of lakes as the baseflow fraction decreased due to more efficient drainage and increased impervious area.
While this was certainly true in the Cow Lake basin, which is the most heavily urbanized and does not have a control structure, inflow from an upstream lake or an adjacent wetland, it was not true in several other lake watersheds. Because all other watersheds have wetlands immediately adjacent to the lakes studied, or the lakes discharge into wetlands, it is surmised that wetlands serve as an efficient recharge mechanism and can compensate for effects of urbanization on baseflow by storing the increased runoff volume associated with urbanization and slowly passing it back to the lake over time. In nearly all the lakes studied, variance increased with time; in most cases, however, the increase was negligible. For the regression analysis, four lakes demonstrated a decrease in baseflow contribution while the two lakes with the largest relative adjacent wetland areas demonstrated the opposite. Based upon the research, the following general conclusions about lakes in urbanizing watersheds can be reached:

1) The statistical structure of lake level time series is systematically altered and is related to the extent of urbanization.
2) In the absence of other forcing mechanisms, autocorrelation and baseflow appear to decrease.
3) The presence of wetlands adjacent to lakes can offset the reduction in baseflow.

These conclusions can be applied globally to similar regions that consist of lakes undergoing urbanization in flat, humid, shallow water table environments with wetlands. Furthermore, the methodology utilized can be applied at lakes in environments both similar and dissimilar to those studied in this research.

3.0 Use of Generalized Extreme Value Covariates to Improve Estimation of Trends and Return Frequencies for Lake Levels

3.1 Background

One of the most important tools in effective water management is the accurate forecast of both long-term and short-term extreme values for both flood and drought conditions.
High water stages associated with flood can cause extensive erosion or property damage, while low stages associated with drought affect wildlife, ecology, recreation and water supply. Frequency return periods for both peak highs and lows are often utilized to gauge risk and evaluate mitigation methods to minimize this risk. Accurately identifying trends in lake levels can affect long-term decision making such as forecasting water supply, while improving the prediction of near-term frequency return periods can affect short-term planning such as the determination of evacuation zones in the face of an approaching hurricane. Another significant benefit of more accurate short-term forecasts is giving resource managers adequate tools in January to determine how much water to let out of a lake to prepare for flood stages that often occur in August or September.

Changes in the general trends of lakes, streams and other surface water bodies have been observed in many parts of the world. These trends may be due to factors such as watershed urbanization, water supply pumping, morphological changes to the water body itself or climatic changes. Traditional methods of trend detection, such as ordinary least squares (OLS) or the Mann-Kendall test, are not well suited to hydrologic systems, since these systems often exhibit time scale issues, non-normal distributions, seasonality, autocorrelation, inconsistent data collection, missing data and other complications that render these traditional methods unreliable. In a similar fashion, traditional methods of predicting extreme flood and drought frequencies, such as distribution fitting without parameter covariates, may be highly inaccurate in lake-type systems, especially in the short term. In the case of lakes, traditional frequency return estimates assume extremes are independent of trend or starting lake stages.
However, due to the significant autocorrelation of lake levels, the initial stage can have a significant influence on the severity of a given event. If a 100-year precipitation event occurs at a low lake stage, the peak stage will be much lower than if the initial lake stage is high, because of the additional storage available. In Florida, with the annual threat of hurricanes and flat topography where small differences in extreme stages can have significant impacts, utilizing flood or drought predictions that take starting lake stage and future trends into account will allow for more accurate appraisals of both short-term and long-term risk.

One of the objectives of this research was to evaluate trends in a robust manner that can accommodate the autocorrelation, missing data, nonstationarity, etc. previously noted. Many studies have attempted to identify appropriate methods of trend detection in hydrologic data. Hirsch et al. (1982) present the seasonal Mann-Kendall test to improve upon traditional methods of trend detection and accommodate some of the aforementioned complications in hydrologic data. Katz et al. (2002) describe the use of extreme distribution parameter covariates in combination with maximum likelihood estimation as a more rigorous methodology to identify trends in hydrologic data. Zhang et al. (2004) utilized Monte Carlo simulations to compare OLS, the nonparametric Kendall test and a GEV distribution whose parameters are allowed to vary with time. According to the study, while the nonparametric test is more effective at identifying trends than OLS, allowing a GEV parameter covariate significantly outperforms both OLS and the Kendall test. The GEV distribution has recently been widely applied in hydrologic studies.
Nadarajah and Shiau (2005) utilized the distribution to model flood events for 39 years of data at the Pachang River, Taiwan, and employed parameter covariates of flood volume, duration and time to peak to both identify trends and improve the fit. Morrison and Smith (2002) found the GEV distribution to adequately fit flood peaks in streams with at least 30 years of data in the Appalachian Mountains, United States. Garcia et al. (2007) found the GEV distribution with a time covariate to adequately fit and identify trends in daily extreme rainfall in the Iberian Peninsula at gages with 40 years of data.

Another objective of this research was to investigate methods to improve estimation of extreme lake stages by incorporating variables, including starting stage and time, in addition to lake stage. Several studies have analyzed the relation between initial stages, antecedent conditions and flood return periods in various hydrologic systems. Other studies have attempted to quantify the multivariate nature of flooding in streams and other water bodies by incorporating terms such as time to flood peak, initial stage, flood volume, etc. into models and predictions. Buchberger (1995) developed near-term flood risk estimates for Lake Erie, United States, based on an autoregressive time series model and the joint occurrence of a normally distributed storm surge, and found that conventional frequency analysis underestimates flood risk when starting lake stages are high and overestimates flood risk when starting lake stages are low. Struthers and Sivapalan (2007) developed flood return periods dependent upon thresholds of evaporation, rainfall frequency, catchment response time, field capacity storage and catchment storage capacity. In a similar study, Kusumastuti et al. (2007) developed lake-specific flood frequency return period curves based on field capacity storage and total storage thresholds.
Kusumastuti et al. (2008) developed lake flood frequency return periods based on several catchment and lake thresholds, including antecedent storage, catchment-to-lake area ratio and magnitude of storm depths. The antecedent storage in the lake was found to be a dominant control on flood frequency and magnitude. Goel et al. (1998) developed flood frequency curves based on the joint probability of flood volume and flood peak for the Narmada River, India.

Lake level trends in both flood and drought were investigated in this research utilizing the GEV distribution with a time parameter covariate. Lake flood and drought stages were also modeled with the GEV distribution utilizing covariates of starting lake stage and time. If the addition of time or lake stage covariates offered a significant improvement of the fit, frequency return period curves were developed for these cases. The lakes studied are located in Florida, United States, and have at least 50 years of data that are not significantly anthropogenically altered.

Trend identification in lake levels utilizing the GEV distribution, as well as the development of variable return periods based on starting lake stages, is a practical application of GEV distribution theory that has not yet been applied to lakes. Estimates of trend that are more accurate than those derived from traditional methods such as OLS, together with more accurate flood and drought frequencies based on starting water level, will be of significant use in water resource management. Applications include hurricane evacuation decisions; lake management decisions, including letting an adequate amount of water out of a lake to minimize flooding impacts from an approaching hurricane or tropical storm; development of appropriate average water levels to maintain throughout the year, based upon return curves that can be adjusted to the average water levels selected; and preparation for increases or decreases in future flooding or drought.
The objectives of this research with regard to lake levels were to 1) accurately identify the direction and magnitude of trends in flood and drought stages and 2) provide more accurate predictions of both long-term and short-term flood and drought stage return frequencies utilizing GEV with time and starting stage covariates.

3.2 Materials and Methods

3.2.1 Lake Information and Data

Lakes with at least 50 years of data were selected across the southwestern portion of Florida that were mostly anthropogenically unaltered, i.e., not subject to significant dredging, placement of berms, pumping, installation of major control structures, etc. in a way that would significantly change the time series signature and, hence, the underlying distribution. Given the degree of urbanization across Florida, it is not possible to find completely unaltered lakes with sufficient data. However, four relatively unaltered lakes were utilized: Lake Arbuckle, Lake Carroll, Lake Trafford and Lake Weohyakapka (Figure 3-1).

Figure 3-1: Location map of study lakes

3.2.2 GEV Distribution

Trends in lake levels and return level frequencies were identified utilizing extreme value models. The main variables modeled were the annual maximum and minimum lake levels, i.e., the flood and drought stages. In order to analyze any trends, distribution parameters were allowed to vary with time. Because lake levels exhibit substantial autocorrelation, it is surmised that annual starting lake levels have a significant impact on the distribution of annual extremes; therefore, the GEV parameters also were allowed to vary with initial stage. The starting lake stage was taken as the water level on January 1st of any given year. The time and starting lake stage covariate models were compared to the original distribution model to determine if a statistically significant better fit was achieved.
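The three series described above (annual maximum, annual minimum and January 1 starting stage) can be derived from a dated stage record as in the minimal sketch below. The function name, the record layout and the sample readings are assumptions made for illustration; the source does not describe its data handling, and returning `None` for a missing January 1 reading is one possible choice rather than the author's procedure.

```python
from datetime import date


def annual_extremes(records):
    """Group (date, stage) records by calendar year.

    Returns {year: (start_stage, annual_max, annual_min)}, where start_stage
    is the January 1 reading, or None if that day is missing from the record.
    """
    by_year = {}
    for day, stage in records:
        by_year.setdefault(day.year, []).append((day, stage))

    summary = {}
    for year, rows in sorted(by_year.items()):
        stages = [stage for _, stage in rows]
        start = next((s for d, s in rows if d == date(year, 1, 1)), None)
        summary[year] = (start, max(stages), min(stages))
    return summary


# Hypothetical readings in metres, invented purely for illustration.
sample = [
    (date(2001, 1, 1), 10.6),
    (date(2001, 5, 15), 10.1),
    (date(2001, 9, 20), 11.3),
    (date(2002, 1, 1), 10.9),
    (date(2002, 6, 1), 10.4),
    (date(2002, 9, 5), 11.0),
]
extremes = annual_extremes(sample)
```

Each year then contributes one block maximum and one block minimum, with the starting stage available as a covariate for that year.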
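If covariates do significantly improve the fit, the distribution itself is potentially changing as these covariates change. Changing distribution parameters with time or starting stage allows the distribution to be nonstationary and also gives an estimate of the rate of change.

The GEV is the generalized form of three commonly applied extreme value distributions: the Gumbel, the Frechet and the Weibull. The GEV is applicable to variables of block maxima, where the blocks are equal divisions of time. The GEV cumulative distribution function is given by:

$$F(x) = \exp\left\{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\} \tag{29}$$

where $x$ is the random variable, $\mu$ is the location parameter, $\sigma$ is the scale parameter, $\xi$ is the shape parameter and $1 + \xi(x-\mu)/\sigma > 0$. It readily follows that the subdistributions, corresponding to $\xi \to 0$, $\xi > 0$ and $\xi < 0$ respectively, are:

Gumbel:
$$F(x) = \exp\left\{-\exp\left[-\left(\frac{x-\mu}{\sigma}\right)\right]\right\}, \quad -\infty < x < \infty \tag{30}$$

Frechet:
$$F(x) = \begin{cases} 0, & x \le \mu \\ \exp\left\{-\left(\dfrac{x-\mu}{\sigma}\right)^{-1/\xi}\right\}, & x > \mu \end{cases} \tag{31}$$

Weibull:
$$F(x) = \begin{cases} \exp\left\{-\left[-\left(\dfrac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}, & x < \mu \\ 1, & x \ge \mu \end{cases} \tag{32}$$

GEV distribution parameters are determined using maximum likelihood estimation. The log-likelihood function, for $\xi \ne 0$, is given by:

$$\ell(\mu,\sigma,\xi) = -m\log\sigma - \left(1 + \frac{1}{\xi}\right)\sum_{i=1}^{m}\log\left[1 + \xi\left(\frac{x_i-\mu}{\sigma}\right)\right] - \sum_{i=1}^{m}\left[1 + \xi\left(\frac{x_i-\mu}{\sigma}\right)\right]^{-1/\xi} \tag{33}$$

given that $1 + \xi(x_i-\mu)/\sigma > 0$ for $i = 1, \ldots, m$ (Coles, 2004).

The log-likelihood for the GEV distribution with parameters that are a function of time $t$ or starting lake stage $s$ is given by:

$$\ell(\mu,\sigma,\xi) = -\sum_{t=1}^{m}\left\{\log\sigma(t,s) + \left(1 + \frac{1}{\xi(t,s)}\right)\log\left[1 + \xi(t,s)\left(\frac{x_t-\mu(t,s)}{\sigma(t,s)}\right)\right] + \left[1 + \xi(t,s)\left(\frac{x_t-\mu(t,s)}{\sigma(t,s)}\right)\right]^{-1/\xi(t,s)}\right\} \tag{34}$$

given that $1 + \xi(t,s)\left(\dfrac{x_t-\mu(t,s)}{\sigma(t,s)}\right) > 0$ for all $t = 1, \ldots, m$ (Coles, 2004).

For purposes of this research, model 1 is the GEV distribution with parameters $\mu$, $\sigma$ and $\xi$ held constant. The distribution parameters for model 1 for each lake were estimated and the goodness-of-fit was evaluated with the Kolmogorov-Smirnov test statistic at the 95-percent significance level. For model 2, the location parameter of model 1 was allowed to vary with time or starting stage or both to investigate the presence of trends and determine if model 1 could be improved.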
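To make the covariate log-likelihood concrete, the sketch below evaluates it in plain Python for the case used in this chapter, where only the location parameter varies and the scale and shape are constant. The function names and the sample values are assumptions introduced for illustration, and the source does not state which software performed the maximization; in practice a numerical optimizer would be applied to the negative of this function.

```python
import math


def gev_loglik(x, mu, sigma, xi):
    """GEV log-likelihood with one location value per observation.

    x and mu are equal-length sequences; sigma and xi are held constant.
    Returns -inf when the support constraint 1 + xi*(x - mu)/sigma > 0
    fails, so an optimiser would reject such parameter values.
    """
    if sigma <= 0:
        return -math.inf
    total = 0.0
    for xt, mt in zip(x, mu):
        z = (xt - mt) / sigma
        if abs(xi) < 1e-9:  # Gumbel limit as xi -> 0
            total += -math.log(sigma) - z - math.exp(-z)
            continue
        y = 1.0 + xi * z
        if y <= 0:
            return -math.inf
        total += (-math.log(sigma)
                  - (1.0 + 1.0 / xi) * math.log(y)
                  - y ** (-1.0 / xi))
    return total


def loglik_model2(x, cov, a, b, sigma, xi):
    """Location varies linearly with one covariate y: mu = a + b*y."""
    return gev_loglik(x, [a + b * y for y in cov], sigma, xi)


# Hypothetical annual maxima and starting stages (m), for illustration only.
annual_max = [11.2, 10.9, 11.6, 11.0]
start_stage = [10.6, 10.3, 11.0, 10.5]
ll = loglik_model2(annual_max, start_stage, a=3.5, b=0.7, sigma=0.2, xi=-0.1)
```

Setting `b = 0` recovers the constant-parameter likelihood, which is what makes the two models directly comparable.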
Model 2 therefore generalizes model 1 (model 1 is the special case $b = 0$) with

$$\mu = a + by \tag{35}$$

where $y$ is either the time in years or the starting lake stage and $a$ and $b$ are constants. Model 3 in turn generalizes model 2 with

$$\mu = c + dt + es \tag{36}$$

where $t$ is the time in years, $s$ is the starting lake stage and $c$, $d$ and $e$ are constants. Once parameters were estimated for all three cases, the models were compared to determine if the time and/or starting lake stage covariate gives a statistically significant better fit. In order to test one model against another, the likelihood ratio test was utilized. If $\ell_1$ and $\ell_2$ represent the maximized log-likelihoods of the simpler and the more general model to be compared, then a deviance statistic is given by:

$$D = 2\{\ell_2 - \ell_1\} \tag{37}$$

Assuming a chi-square distribution, a quantile $c_\alpha$ at significance $\alpha$ can be determined, and if $D > c_\alpha$ the more general model explains significantly more of the variation in the data (Coles, 2004). Model 2 will be compared to model 1, while model 3 will be compared to both model 1 and model 2. In cases where a model with parameter covariates demonstrated a significantly better fit, fits were further investigated by examining standard quantile plots for visual confirmation of the fit improvement. However, because models 2 and 3 are nonstationary and parameters vary at each observation, the random variable $X$ should be transformed to a new variable $Z$ for the quantile plot. A transform to the standard Gumbel distribution is given by (Coles, 2004):

$$Z_t = \frac{1}{\xi(t,s)}\log\left\{1 + \xi(t,s)\left(\frac{X_t-\mu(t,s)}{\sigma(t,s)}\right)\right\} \tag{38}$$

Quantile-quantile plots were developed for these transformed standardized variables. If models 2 or 3 demonstrate an improved fit, it means estimated frequency return periods are changing with time or starting lake stage. Although the maximum likelihood ratio test is given more weight than the quantile-quantile plots, the test compares the fit of all actual data points to the model and gives even weight to all frequency events.
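The likelihood ratio comparison can be sketched as follows for the two cases that arise here, one or two additional parameters. The function names are assumptions for illustration; the closed-form quantiles rely on standard facts (a chi-square variable with one degree of freedom is a squared standard normal, and with two degrees of freedom it is exponential with mean 2), so the sketch does not cover other degrees of freedom.

```python
import math
from statistics import NormalDist


def chi2_quantile(alpha, df):
    """Upper-alpha quantile of the chi-square distribution for df = 1 or 2."""
    if df == 1:
        return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    if df == 2:
        return -2.0 * math.log(alpha)
    raise ValueError("only df = 1 or 2 handled in this sketch")


def deviance(loglik_general, loglik_restricted):
    """Deviance statistic: twice the gain in maximized log-likelihood."""
    return 2.0 * (loglik_general - loglik_restricted)


def is_significant(d, df, alpha=0.05):
    """True when the covariate model fits significantly better."""
    return d > chi2_quantile(alpha, df)


c1 = chi2_quantile(0.05, 1)  # one extra parameter
c2 = chi2_quantile(0.05, 2)  # two extra parameters
```

At a significance of 0.05 these quantiles come out at roughly 3.84 and 5.99, in line with the critical values quoted with the parameter summary tables.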
Because low-frequency events are of main interest, quantile-quantile plots were utilized to focus on the fit in the extreme end of the distribution. If an adequate fit in this region was not confirmed via the plots, the simplest model that adequately predicted extremes was selected. Return level plots for the most appropriate model for both flood and drought were developed at each lake. Estimates of quantiles for the return level plots are given by:

$$q_{x,p} = \begin{cases} \mu(t,s) - \dfrac{\sigma}{\xi}\left[1 - \left(-\ln(1-p)\right)^{-\xi}\right], & \xi \ne 0 \\ \mu(t,s) - \sigma\ln\left(-\ln(1-p)\right), & \xi = 0 \end{cases} \tag{39}$$

where $q$ is the quantile estimate for lake stage $x$ at frequency $p$ (Beirlant et al., 2004).

3.3 Results and Discussion

A summary of the data utilized is given in Table 3-1. Specifically, the period of record, the maximum, minimum and average stages and the standard deviation are provided. Plots of the maximum, minimum and starting stage for Lakes Carroll and Weohyakapka are provided in Figures 3-2 and 3-3. From Table 3-1, the standard deviation in lake levels is consistently near 0.3 m. The average difference between the maximum and minimum for the lakes analyzed is 2.13 m. Given the flat topography of west-central Florida and other similar regions, relatively small differences in water level fluctuations can inundate large areas and impact structures that are routinely set as low as 0.3 m above expected high water marks. From inspection of the figures, it appears likely that annual starting stage is correlated with both annual maximum flood and minimum drought stages, as the starting stage approximately parallels both the flood and drought stages. The fits of the lake stages for flood and drought are given in Tables 3-2 and 3-3, respectively, for all GEV models. The Kolmogorov-Smirnov values were well within the 95-percent test statistic for the no-covariate fits, indicating the fits are acceptable.
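The return level estimate can be evaluated directly, as in the minimal sketch below, which also checks it against the GEV cumulative distribution function. The intercept, slope, scale and shape used here are illustrative placeholders, not fitted values from this study, and the function names are assumptions.

```python
import math


def gev_return_level(p, mu, sigma, xi):
    """Stage exceeded with probability p in a given year."""
    yp = -math.log(1.0 - p)
    if abs(xi) < 1e-9:  # Gumbel case, xi = 0
        return mu - sigma * math.log(yp)
    return mu - (sigma / xi) * (1.0 - yp ** (-xi))


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-9:
        return math.exp(-math.exp(-z))
    return math.exp(-((1.0 + xi * z) ** (-1.0 / xi)))


# Illustrative parameters only (not fitted values from this study):
# location = a + b * starting stage, with constant scale and shape.
a, b, sigma, xi = 9.0, 0.5, 0.3, -0.2
levels = {s: gev_return_level(0.01, a + b * s, sigma, xi)
          for s in (15.8, 16.4, 17.0)}
```

Because only the location parameter depends on the covariate in this sketch, the return level curves for different starting stages are vertical translations of one another, separated by the slope times the difference in starting stage.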
Table 3-1: Lake data summary

Lake          Period of    Average     Maximum     Minimum     Standard
              Record       (m NGVD)    (m NGVD)    (m NGVD)    Deviation (m)
Arbuckle      1942-2008    16.35       17.79       15.59       0.36
Carroll       1946-2003    10.76       12.10        9.41       0.34
Trafford      1941-2007     5.98        6.95        4.85       0.29
Weohyakapka   1958-2008    18.64       19.46       17.95       0.24

Figure 3-2: Lake Carroll stage data. Axes: year versus stage (m); series: maximum, minimum and starting stage.

Figure 3-3: Lake Weohyakapka stage data. Axes: year versus stage (m); series: maximum, minimum and starting stage.

Table 3-2: GEV flood parameter summary

Lake          Loc.     Time     Start stage   Scale    Shape    Likelihood ratio*
                       cov.     cov.

No covariate (model 1)
Arbuckle      16.831   N/A      N/A           0.347    0.286
Carroll       10.984   N/A      N/A           0.300    0.147
Trafford       6.285   N/A      N/A           0.193    0.191
Weohyakapka   18.928   N/A      N/A           0.214    0.335

Time covariate (model 2)                                        Model 2/Model 1
Arbuckle      17.085   0.007    N/A           0.336    0.385    10.301
Carroll       11.176   0.006    N/A           0.282    0.149    6.567
Trafford       6.301   0.000    N/A           0.193    0.192    0.090
Weohyakapka   18.876   0.002    N/A           0.216    0.386    1.204

Starting stage covariate (model 2)                              Model 2/Model 1
Arbuckle       8.954   N/A      0.484         0.343    0.387    8.203
Carroll        3.299   N/A      0.712         0.165    0.194    45.895
Trafford       4.015   N/A      0.376         0.179    0.115    4.457
Weohyakapka    5.691   N/A      0.710         0.192    0.464    16.431

Time and starting stage covariates (model 3)                    Model 3/Model 1:Model
Arbuckle      13.178   0.007    0.240         0.335    0.488    16.626/8.422
Carroll        3.470   0.000    0.697         0.169    0.169    45.782/0.113
Trafford       3.858   0.001    0.407         0.177    0.109    4.898/0.441
Weohyakapka    6.341   0.000    0.676         0.190    0.449    17.081/0.649

*At the 95-percent confidence interval, a maximum likelihood ratio of greater than 3.842 for model 2/model 1 or model 3/model 2 and 5.992 for model 3/model 1 indicates a significantly better fit. The model 2/model 1 and model 3/model 1 ratios were also compared to determine if the additional degree of freedom improves the fit. Selected models are bolded.
Table 3-3: GEV drought parameter summary

Lake          Loc.     Time     Start stage   Scale    Shape    Likelihood ratio*
                       cov.     cov.

No covariate (model 1)
Arbuckle      15.844   N/A      N/A           0.182    0.150
Carroll       10.304   N/A      N/A           0.409    0.395
Trafford       5.554   N/A      N/A           0.301    0.504
Weohyakapka   18.354   N/A      N/A           0.194    0.326

Time covariate (model 2)                                        Model 2/Model 1
Arbuckle      15.915   0.002    N/A           0.181    0.169    2.484
Carroll       10.561   0.008    N/A           0.380    0.455    11.005
Trafford       5.498   0.001    N/A           0.295    0.465    0.549
Weohyakapka   18.293   0.0026   N/A           0.193    0.363    2.023

Starting stage covariate (model 2)                              Model 2/Model 1
Arbuckle      10.414   N/A      0.335         0.154    0.383    7.536
Carroll        0.117   N/A      0.947         0.200    0.264    71.923
Trafford       0.804   N/A      1.055         0.200    0.323    39.937
Weohyakapka    3.855   N/A      0.778         0.149    0.445    32.044

Time and starting stage covariates (model 3)                    Model 3/Model 1:Model
Arbuckle       9.441   0.001    0.396         0.166    0.276    21.753/14.217
Carroll        0.439   0.001    0.921         0.198    0.259    72.413/0.489
Trafford       1.519   0.000    1.177         0.177    0.329    34.432/5.505
Weohyakapka   15.839   0.002    0.134         0.176    0.430    8.160/23.884

*At the 95-percent confidence interval, a maximum likelihood ratio of greater than 3.842 for model 2/model 1 or model 3/model 2 and 5.992 for model 3/model 1 indicates a significantly better fit. The model 2/model 1 and model 3/model 1 ratios were also compared to determine if the additional degree of freedom improves the fit. Selected models are bolded.

3.3.1 Trend Analysis

With regard to modeling both lake flood and drought stages with the GEV distribution and a time covariate, only Lake Carroll demonstrated a statistically significant improvement in the model 2 fit over the model 1 fit with the GEV distribution alone. Lake Arbuckle exhibited a trend in annual flood stages but not drought stages.
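The trend screening just described amounts to comparing each time-covariate likelihood ratio against the one-parameter critical value. The short sketch below does this with the ratios copied from the flood and drought parameter summary tables; the dictionary layout and function name are assumptions for illustration.

```python
# Likelihood ratios for the time-covariate fits (model 2 versus model 1),
# copied from the flood and drought parameter summary tables.
TIME_COVARIATE_RATIOS = {
    "flood": {"Arbuckle": 10.301, "Carroll": 6.567,
              "Trafford": 0.090, "Weohyakapka": 1.204},
    "drought": {"Arbuckle": 2.484, "Carroll": 11.005,
                "Trafford": 0.549, "Weohyakapka": 2.023},
}
THRESHOLD_ONE_PARAMETER = 3.842  # critical value quoted in the table notes


def lakes_with_trend(ratios, threshold=THRESHOLD_ONE_PARAMETER):
    """Lakes whose time covariate gives a significantly better fit."""
    return sorted(lake for lake, d in ratios.items() if d > threshold)


flood_trend = lakes_with_trend(TIME_COVARIATE_RATIOS["flood"])
drought_trend = lakes_with_trend(TIME_COVARIATE_RATIOS["drought"])
```

The screening flags Arbuckle and Carroll for flood stages and only Carroll for drought stages, matching the statement that Lake Carroll alone shows a significant trend in both.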
For Lake Carroll, the model 2 location parameter for flood stages, which yields an estimate of the relation between lake stage and time $t$ in years, is given by:

$$\mu = 11.176 - 0.006t \tag{40}$$

And the model 2 location parameter for drought stages is given by:

$$\mu = 10.561 - 0.008t \tag{41}$$

Although the maximum likelihood ratio for both trends is substantially larger than the 95-percent confidence limit threshold, the actual change in flood or drought stage is relatively small: a decrease of 0.006 m and 0.008 m, respectively, for each year that the trend is extended into the future. This slight trend is visually confirmed in Figure 3-2. For Lake Arbuckle, the model 2 location parameter for flood stages is given by:

$$\mu = 17.085 - 0.007t \tag{42}$$

The trend is again downward and of similar order, a decrease of 0.007 m per year. The lakes studied are relatively unaltered with regard to excessive pumping, dredging, management or other mechanisms that may induce dramatic trends. Furthermore, Paynter and Nachabe (2008) determined that the rainfall patterns in the southwest Florida region do not exhibit significant trends that would correlate to changes in lake levels. Lakes Arbuckle, Trafford and Weohyakapka are fairly undeveloped when compared to Lake Carroll, which is highly urbanized. Although many lakes in Florida have demonstrated significant trends due to pumping or anthropogenic change, it appears that lakes left in a fairly natural state, such as the four studied for this research, exhibit slight but statistically significant trends in the case of Lakes Carroll and Arbuckle, or no trends in the cases of Lakes Trafford and Weohyakapka.

3.3.2 Starting Stage Analysis

3.3.2.1 Flood Return Period

According to the maximum likelihood ratios, model 2, with a starting stage covariate, is most appropriate for Lakes Carroll, Trafford and Weohyakapka, while model 3 is most appropriate for Lake Arbuckle.
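To put the fitted trends in perspective, the sketch below projects the change in the location parameter over a few horizons. The slopes are the per-year decreases reported for equations (40) to (42); the horizons of 10, 25 and 50 years and the names used are assumptions chosen for illustration, and the projection simply extends the linear trend.

```python
# Fitted time-covariate slopes (m per year), entered as negative values
# because the text reports each of these trends as a decrease.
TREND_SLOPES = {
    "Carroll flood": -0.006,
    "Carroll drought": -0.008,
    "Arbuckle flood": -0.007,
}


def projected_shift(slope, years):
    """Change in the GEV location parameter after the given number of years."""
    return slope * years


shifts = {name: [round(projected_shift(m, n), 3) for n in (10, 25, 50)]
          for name, m in TREND_SLOPES.items()}
```

Even over 50 years the projected shifts are a few tenths of a metre, which is of the same order as the standard deviation of the lake levels.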
It should be noted that the magnitude of the likelihood ratio is proportional to the degree of improvement of the fit; the model 2 ratios are generally high. Only Lake Arbuckle demonstrated a statistically significant improvement in fit when covariates for both time and starting stage are included. However, as the trend component of model 3 is negligible, the simpler model 2 was selected. The Lake Arbuckle (Figure 3-4) and Lake Carroll (Figure 3-5) quantile-quantile plots demonstrate an adequate fit for model 2. Quantile-quantile plots for Lakes Trafford and Weohyakapka also demonstrated adequate model 2 fits. With the exception of Lake Carroll, in most of the quantile-quantile plots for flood, the fit breaks down at the extreme end for all models. This is partly due to these points representing hurricanes or tropical storms that are not part of the same distribution as normal rainfall events, and partly due to extrapolating extreme events from 50 years of data. After evaluating both the maximum likelihood ratios and the quantile-quantile plots, model 2 was selected for all four lakes in terms of flood stage. The location parameter, which yields an estimate of the relation between starting stage and flood stage, is given by the following for Lakes Arbuckle, Carroll, Trafford and Weohyakapka, respectively, for starting stage $s$:

$$\mu = 8.954 + 0.484s \tag{43}$$

$$\mu = 3.299 + 0.712s \tag{44}$$

$$\mu = 4.015 + 0.376s \tag{45}$$

$$\mu = 5.691 + 0.710s \tag{46}$$

For every unit change in starting stage, there is a substantial change, ranging from 0.376 to 0.712 m, in the flood stage for a given year, indicating a very high degree of correlation.
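As a worked example of equations (43) to (46), the sketch below evaluates the flood location parameter for Lake Arbuckle at three starting stages. The stages 15.8, 16.4 and 17.0 m are those shown in the legend of the Lake Arbuckle flood frequency figure; the dictionary layout and function name are assumptions for illustration.

```python
# Intercept and slope of the flood location parameter, equations (43)-(46).
FLOOD_LOCATION = {
    "Arbuckle": (8.954, 0.484),
    "Carroll": (3.299, 0.712),
    "Trafford": (4.015, 0.376),
    "Weohyakapka": (5.691, 0.710),
}


def flood_location(lake, start_stage):
    """Location parameter (m) of the flood-stage GEV for a starting stage."""
    intercept, slope = FLOOD_LOCATION[lake]
    return intercept + slope * start_stage


# Starting stages read from the Lake Arbuckle return period figure legend.
arbuckle = {s: round(flood_location("Arbuckle", s), 3)
            for s in (15.8, 16.4, 17.0)}
spread = round(arbuckle[17.0] - arbuckle[15.8], 3)
```

Since the scale and shape are constant in model 2, the same spread in location carries over to every return period, so the curves for the lowest and highest starting stage are offset by the slope times the range of starting stages.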
Figure 3-4: Lake Arbuckle flood stage standardized residual quantiles (empirical versus model 2)

Figure 3-5: Lake Carroll flood stage standardized residual quantiles (empirical versus model 2)

Figures 3-6 and 3-7 give the model 2 flood return period for Lakes Arbuckle and Trafford associated with the maximum, minimum and average starting stage, as well as the return period associated with no covariate. For each lake, the return period associated with the average starting stage is fairly close to the return period associated with no covariate. For Lakes Arbuckle, Carroll and Weohyakapka, there is some divergence between these two curves towards the larger return periods. At Lakes Arbuckle and Carroll, this is likely because these lakes exhibit some trends; since the starting stage should correlate to any trends, the inclusion of the starting stage covariate improves the fit and causes divergence from the fit without a covariate. The flood return period associated with no covariate is bounded by that associated with the maximum and minimum starting stage. In years with a low starting stage, traditional frequency analysis overpredicts the 100-year flood by 108.3, 129.4, 75.9 and 179.2 percent of standard deviation for Lakes Arbuckle, Carroll, Trafford and Weohyakapka, respectively. In years with a high starting stage, traditional frequency analysis underpredicts the 100-year flood by 50, 232.4, 69.0 and 91.7 percent of standard deviation for the same lakes. As such there is a 0.57 m, 1.22 m, 0.42 m and 0.65 m difference, respectively, between the 100-year return period stage for the maximum and minimum starting lake stage covariate. Given the flat topography in Florida and other similar regions, a difference of as much as 1.22 m can mean a substantial increase in the extent of flooding and the potential number of structures flooded.
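The percentages above can be converted to metres with the standard deviations from the lake data summary table, as in the sketch below. The data layout and function name are assumptions for illustration; the numbers are those quoted in the text and table.

```python
# Standard deviations (m) from the lake data summary table and the 100-year
# flood over/underprediction, in percent of standard deviation, from the text.
FLOOD_BIAS = {
    # lake: (std dev, overprediction %, underprediction %)
    "Arbuckle": (0.36, 108.3, 50.0),
    "Carroll": (0.34, 129.4, 232.4),
    "Trafford": (0.29, 75.9, 69.0),
    "Weohyakapka": (0.24, 179.2, 91.7),
}


def bias_in_metres(std_dev, over_pct, under_pct):
    """Return (overprediction, underprediction, combined span) in metres."""
    over = std_dev * over_pct / 100.0
    under = std_dev * under_pct / 100.0
    return over, under, over + under


spans = {lake: round(bias_in_metres(*vals)[2], 2)
         for lake, vals in FLOOD_BIAS.items()}
```

The combined spans agree with the reported differences between the maximum and minimum starting stage curves to within about 0.01 m, the small discrepancy for Lake Carroll being attributable to rounding of the published standard deviation.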
Figure 3-6: Lake Arbuckle flood frequencies with and without covariates. Axes: return period (years) versus stage (m); curves for starting stage = 15.8, 16.4 and 17.0 and for no covariate.

Figure 3-7: Lake Trafford flood frequencies with and without covariates. Axes: return period (years) versus stage (m); curves for starting stage = 5.3, 5.9 and 6.4 and for no covariate.

Since more area is available at consistently higher elevations of a lake, it takes more runoff or baseflow volume to cause a unit rise in stage at higher lake elevations. Because of this, it would be expected that in a lake left in its natural state, return period curves would flatten out at more extreme frequencies. However, once a lake basin is urbanized, the watershed infilled with construction and management structures installed, it is difficult to consistently predict the shape of these curves in a general sense. Lakes Arbuckle, Trafford and Weohyakapka are relatively undeveloped and they demonstrate the expected flattening of the return period curves at higher frequencies. Lake Carroll is the most urbanized and it shows some steepening of the return period curves at extreme events.

3.3.2.2 Drought Return Period

According to the maximum likelihood ratios, model 2 (with a starting stage covariate) is most appropriate for Lakes Carroll, Trafford and Weohyakapka, while model 3 is most appropriate for Lake Arbuckle. As in the flood analysis, the trend component is quite small and the simpler model 2 was deemed appropriate. Similar to the flooding case, the likelihood ratios for model 2 are quite high, indicating that model 2 explains substantially more of the variation. The quantile-quantile plots for Lakes Arbuckle, Carroll, Trafford (Figure 3-8) and Weohyakapka (Figure 3-9) indicate an adequate fit for model 2.
As with the flood quantiles, there is divergence between the model and empirical data at the extremes. This is likely due to longer time-scale cycles, such as La Niña, that cause excessively dry years and are not explicitly included in the models; model 2 should capture some, but not all, of these longer cycles with the inclusion of starting stage. Some of the fit breakdown is also due to extrapolating events greater than the 50-year event from 50 years of data. After evaluating both the maximum likelihood ratios and the quantile-quantile plots, model 2 was selected for all four lakes.

Figure 3-8: Lake Trafford drought stage standardized residual quantiles (empirical versus model 2)

Figure 3-9: Lake Weohyakapka drought stage standardized residual quantiles (empirical versus model 2)

The location parameter associated with the most appropriate model for each lake is given by the following for Lakes Arbuckle, Carroll, Trafford and Weohyakapka, respectively, for starting stage $s$:

$$\mu = 10.414 + 0.335s \tag{47}$$

$$\mu = 0.117 + 0.947s \tag{48}$$

$$\mu = -0.804 + 1.055s \tag{49}$$

$$\mu = 3.855 + 0.778s \tag{50}$$

As in the flood case, for every unit change in starting stage, there is a substantial change in the drought stage for that year, in this case ranging from 0.335 m to 1.055 m. Figures 3-10 and 3-11 give the drought return period for Lakes Arbuckle and Carroll associated with the maximum, minimum and average starting stage, as well as the return period associated with no covariate. Similar to the flood return period case, the return period associated with no covariate is bounded by that associated with the maximum and minimum starting stage and nearly parallels the return period associated with the average starting stage covariate. One exception is Lake Carroll, where the no-covariate return period curves deviate significantly from the average starting stage covariate curves towards the extreme end.
Lake Carroll was the only lake to exhibit a significant drought trend and, as in the flood case, including the starting stage as a covariate captures some of this trend and provides a better fit. In years with a low starting stage, traditional frequency analysis overpredicts the 100-year drought by 41.4, 105.9, 158.6 and 104.2 percent of standard deviation for Lakes Arbuckle, Carroll, Trafford and Weohyakapka, respectively. In years with a high starting stage, traditional frequency analysis underpredicts the 100-year drought by 66.7, 373.5, 251.7 and 191.7 percent of standard deviation for the same lakes. As such there is a 0.39 m, 1.63 m, 1.19 m and 0.72 m difference, respectively, between the 100-year return period stage for the maximum and minimum starting lake stage covariate. In similar fashion to flood stages, it is expected that drought return period curves would flatten at more extreme return periods, since there are more water loss mechanisms at higher lake stages. At lower stages, the only method of water loss may be evapotranspiration or recharge to the ground. All four lake drought return curves follow this general pattern.
Figure 3-10: Lake Arbuckle drought frequencies with and without covariates (stage in m versus return period in years; curves for starting stages of 15.8, 16.4 and 17.0 and for no covariate)

Figure 3-11: Lake Carroll drought frequencies with and without covariates (stage in m versus return period in years; curves for starting stages of 9.9, 10.7 and 11.6 and for no covariate)

There appears to be no correlation between the flood-stage difference and the drought-stage difference associated with the minimum and maximum starting lake stages within each lake; i.e., a small difference in the Lake Carroll flood stage associated with the maximum and minimum starting stage does not indicate a small difference in the drought stage associated with the maximum and minimum starting stage. This is likely due to different physical dynamics operating in the flood and drought cases. Flood stages are generally controlled by some management mechanism, e.g., a weir, culvert or gate, while drought stages are largely uncontrolled other than by natural losses such as evapotranspiration or seepage to the ground. Furthermore, at extreme flood stages the basin morphology may change relative to the lake at lower stages; e.g., the basin may be flatter at higher stages than at lower stages, a basin pop-off to another basin may be reached, or housing construction may have significantly altered the historic basin by infill.

In all cases, for both flood and drought, adding covariates for both trend and starting stage offers little improvement over a starting stage covariate alone. This is likely because any monotonic trend in time should be captured in the starting stage variable and because the trends identified were very small.
Potential scenarios in which including both time and starting stage covariates could improve a fit include situations in which overall trends are not reflected in the January 1st stage. One possibility is a seasonal trend, such as an increase in summer floods due to hurricanes or tropical storms followed by periods of low rainfall, whereby the annual starting stage returns to normal.

3.4 Conclusions

The lakes studied were relatively unaltered in terms of extensive pumping, dredging, filling or other measures that would significantly alter the underlying lake level distribution. All of the lakes researched evidenced either no trend or very small trends unlikely to significantly alter prediction of future flood or drought return levels. However, for all of the lakes, significant improvement in the fits was obtained with the inclusion of starting lake stage as a covariate. This is likely because any monotonic trends are captured in the starting stage itself and the trends identified were negligible. Traditional methods of estimating flood or drought stages significantly overpredict stages when starting lake stages are low and underpredict stages when starting stages are high. The difference between these predictions can be substantially more than one meter, a significant amount in urbanized watersheds in areas of the world with flat topography. Flood differences of over one meter can mean significant alterations in evacuation or other water management decisions. In addition to improving prediction of extreme events, utilizing GEV with time or starting stage covariates can provide guidance in lake management decisions, such as how much water to release from a lake in preparation for an approaching hurricane, what lake levels to maintain throughout the year and how to set minimum structure floor elevations in the watershed.
Although there is less that can be done from a management standpoint in regards to drought, utilizing GEV with covariates provides a more accurate estimate of expected drought return periods, which can be useful in forecasting future water supply or impacts to tourism. The methodology employed in this research provides a means to estimate the direction and magnitude of lake trends that is robust despite the inherent difficulties in determining trends in hydrologic data. The methodology also allows for more accurate prediction of flood and drought return frequencies and can be applied to nearly any region globally.

4.0 Conclusion

The focus of this research was threefold: 1) to determine the extent of spatiotemporal changes in precipitation patterns utilizing methods that can be regionalized and applied to various water resources such as lakes, streams, reservoirs or other water bodies; 2) to determine the statistical changes that occur in lakes with urbanizing watersheds; and 3) to develop accurate prediction of trends and lake level return frequencies.

In terms of spatial changes in precipitation, the vast majority of variables analyzed at each gage were confined to a 99-percent confidence band associated with the average fit, gamma or GEV, of the data. There were some exceptions; however, most of these were at gages at the outer fringes of the area analyzed and at percentiles near the high or low end. Nearly all of the fits were contained at the 0.5 quantile (the 50th percentile), representing the average annual variable a particular water resource can expect to experience. In regards to temporal variability, it was also somewhat surprising that almost no significant trends were detected. Many of the gages investigated would have demonstrated a trend if analyzed with traditional methods such as ordinary least squares or the nonparametric Mann-Kendall test.
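For reference, the nonparametric Mann-Kendall test mentioned above can be sketched as follows. This is a minimal, textbook form of the test rather than the procedure used in this research: it assumes independent observations, applies no correction for tied values or autocorrelation, and uses the normal approximation for the test statistic. The function name and the example series are hypothetical.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p-value) for a basic Mann-Kendall trend test.

    Assumptions: independent observations, no tie correction,
    normal approximation for the distribution of S.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")

    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the no-trend hypothesis when there are no ties.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


# Hypothetical annual rainfall totals (not data from this research).
annual_totals = [1210.0, 1185.0, 1302.0, 1250.0, 1330.0, 1295.0, 1360.0, 1341.0]
s_stat, z_stat, p = mann_kendall(annual_totals)
print("S =", s_stat, "Z =", round(z_stat, 3), "p =", round(p, 4))
```

Because the test treats the observations as independent, serially correlated records can make a trend look more significant than it is, which is one reason results from such a test may differ from those of a distribution-based approach.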
Separating the signal of lake-basin urbanization from the multitude of signals inherent in an urbanizing watershed is problematic. With regard to the time series modeling, lakes with a large basin-to-lake area ratio demonstrated definite trends in model parameters. As the basin/lake ratio increases, the urbanization signal is likely increased enough to be detected by the time series modeling. Furthermore, a significant increase in basin population density appears to systematically alter the time series signature, despite the presence of other conflicting signals. It was hypothesized that urbanization would shorten the autocorrelation of lakes as the baseflow fraction decreased due to more efficient drainage and increased impervious area. While this was certainly true in the Cow Lake basin, which is the most heavily urbanized and has no control structure, inflow from an upstream lake or adjacent wetland, it was not true in several other lake watersheds. Because all other watersheds have wetlands immediately adjacent to the lakes studied or the lakes discharge into wetlands, it is surmised that wetlands serve as an efficient recharge mechanism and can compensate for the effects of urbanization on baseflow by storing the increased runoff volume associated with urbanization and slowly passing it back to the lake over time. For the regression analysis, four lakes demonstrated a decrease in baseflow contribution while the two lakes with the largest relative adjacent wetland areas demonstrated the opposite. Based upon the research, the following general conclusions about lakes in urbanizing watersheds can be reached: 1) The statistical structure of lake level time series is systematically altered and is related to the extent of urbanization. 2) In the absence of other forcing mechanisms, autocorrelation and baseflow appear to decrease. 3) The presence of wetlands adjacent to lakes can offset the reduction in baseflow.
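The notion of a shortened autocorrelation can be made concrete with the sample autocorrelation function of a lake level series. The sketch below is a generic illustration, not the time series model fitted in this research: the function names, the 0.5 threshold and the example weekly stages are all hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a non-negative integer lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series has zero variance")
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


def first_lag_below(series, threshold):
    """First lag at which the autocorrelation drops below the threshold.

    Returns None if no lag does. A smaller value indicates a series
    that loses its memory of earlier stages more quickly.
    """
    for lag in range(1, len(series)):
        if autocorrelation(series, lag) < threshold:
            return lag
    return None


# Hypothetical weekly lake stages in meters (not data from this research).
weekly_stage = [10.2, 10.3, 10.5, 10.6, 10.4, 10.1, 9.9, 9.8, 10.0, 10.3, 10.6, 10.7]
print("lag-1 autocorrelation:", round(autocorrelation(weekly_stage, 1), 3))
print("first lag below 0.5:", first_lag_below(weekly_stage, 0.5))
```

Comparing such a statistic between an early and a late portion of a record is one simple way to look for the kind of change in memory described above.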
In regards to utilizing the GEV distribution with covariates to identify trends, all of the lakes researched evidenced either no trend or very small trends unlikely to significantly alter prediction of future flood or drought return levels. However, for all of the lakes, significant improvement in the fits was obtained with the inclusion of starting lake stage as a covariate. This is likely because any monotonic trends are captured in the starting stage itself and the trends identified were negligible. Traditional methods of estimating flood or drought stages significantly overpredict stages when starting lake stages are low and underpredict stages when starting stages are high. The difference between these predictions can be nearly two meters, a significant amount in urbanized watersheds in areas of the world with flat topography. Differences of nearly two meters can mean significant alterations in evacuation or other water management decisions. In addition to improving prediction of extreme events, utilizing GEV with time or starting stage covariates can provide guidance in lake management decisions, such as how much water to release from a lake in preparation for an approaching hurricane and what lake levels to maintain throughout the year; it can also help determine minimum structure floor elevations in the watershed and allow more accurate forecasting of future water supply or impacts to tourism.

The methodology utilized for each of the three focus areas of the research can be applied to other regions globally. Furthermore, the results can likely also be applied to similar regions with flat topography and shallow water table environments. The focus of this research is on water management and engineering, and several implications and applications can be derived from it.
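The effect of a starting stage covariate on a return level can be illustrated with the standard GEV quantile formula, with the location parameter written as a linear function of starting stage. The sketch below is not the fitted model from this research: all parameter values (intercept, slope, scale, shape) and starting stages are hypothetical, only the location parameter is assumed to vary with starting stage, and the formula is written for annual maxima (the flood case).

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once every `return_period` years.

    Standard GEV quantile for annual maxima with location mu,
    scale sigma and shape xi (xi = 0 gives the Gumbel case).
    """
    if return_period <= 1:
        raise ValueError("return period must be greater than 1 year")
    if sigma <= 0:
        raise ValueError("scale must be positive")
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def location(intercept, slope, starting_stage):
    """Location parameter as a linear function of starting stage."""
    return intercept + slope * starting_stage


# Hypothetical parameters, for illustration only.
INTERCEPT, SLOPE = 2.0, 0.8
SCALE, SHAPE = 0.3, -0.1
LOW_STAGE, HIGH_STAGE = 9.9, 11.6  # hypothetical starting stages (m)

for period in (10, 50, 100):
    low = gev_return_level(location(INTERCEPT, SLOPE, LOW_STAGE), SCALE, SHAPE, period)
    high = gev_return_level(location(INTERCEPT, SLOPE, HIGH_STAGE), SCALE, SHAPE, period)
    print(period, "year:", round(low, 2), "to", round(high, 2), "m")
```

When only the location parameter carries the covariate, the gap between the low- and high-starting-stage return levels is the same at every return period and equals the slope times the difference in starting stage. A single no-covariate curve cannot reproduce that spread, which is consistent with the over- and underprediction described above.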
Developing regional rainfall patterns that take into account potential trends allows water managers to develop realistic expectations for future water supply, water levels, etc. at ungaged water resources in a given region. Understanding the impacts of urbanization allows for better management and engineering decisions in regards to lakes and other water resources. For example, the finding that wetlands mitigate some of the effects of urbanization may imply that constructing or preserving wetlands adjacent to a lake should be considered as part of a regional development plan. The benefits and implications of more accurate short-term flood and drought return period predictions are many. Improving these predictions by approximately one meter or more can mean very different flood evacuation zones in areas with flat topography and may alter the design of control structures, the decision of how much water to release and the appropriate lake levels to maintain throughout the year.

About the Author

Shayne Paynter received a Bachelor's Degree in Civil Engineering from Florida State University in 1991 and a Master's Degree in Civil Engineering from the University of South Florida in 2002. He currently lives with his wonderful wife, Sunitha, without whose support this research would not have been completed, and daughter, Inara.
001 002028699
005 20090911152724.0
007 cr bnuuuuuu
008 090911s2009 flu s 000 0 eng d
024 E14SFE0002807
035 (OCoLC)436301191
040 FHM c FHM
049 FHMM
090 TA170 (Online)
100 Paynter, Shayne.
245 Statistical changes in lakes in urbanizing watersheds and lake return frequencies adjusted for trend and initial stage utilizing generalized extreme value theory [electronic resource] / by Shayne Paynter.
260 [Tampa, Fla] : University of South Florida, 2009.
500 Title from PDF of title page. Document formatted into pages; contains 105 pages. Includes vita.
502 Dissertation (Ph.D.)--University of South Florida, 2009.
504 Includes bibliographical references.
516 Text (Electronic dissertation) in PDF format.
520 ABSTRACT: Many water resources throughout the world are demonstrating changes in historic water levels. Potential reasons for these changes include climate shifts, anthropogenic alterations or basin urbanization. The focus of this research was threefold: 1) to determine the extent of spatiotemporal changes in regional precipitation patterns; 2) to determine the statistical changes that occur in lakes with urbanizing watersheds; and 3) to develop accurate prediction of trends and lake level return frequencies. To investigate rainfall patterns regionally, appropriate distributions, either gamma or generalized extreme value (GEV), were fitted to variables at a number of rainfall gages utilizing maximum likelihood estimation. The spatial distribution of rainfall variables was found to be quite homogenous within the region in terms of an average annual expectation. Furthermore, the temporal distribution of rainfall variables was found to be stationary with only one gage evidencing a significant trend. In order to study statistical changes of lake water surface levels in urbanizing watersheds, serial changes in time series parameters, autocorrelation and variance were evaluated and a regression model to estimate weekly lake level fluctuations was developed. The following general conclusions about lakes in urbanizing watersheds were reached: 1) the statistical structure of lake level time series is systematically altered and is related to the extent of urbanization; 2) in the absence of other forcing mechanisms, autocorrelation and baseflow appear to decrease; and 3) the presence of wetlands adjacent to lakes can offset the reduction in baseflow. In regards to the third objective, the direction and magnitude of trends in flood and drought stages were estimated and both long-term and short-term flood and drought stage return frequencies were predicted utilizing the generalized extreme value (GEV) distribution with time and starting stage covariates. All of the lakes researched evidenced either no trend or very small trends unlikely to significantly alter prediction of future flood or drought return levels. However, for all of the lakes, significant improvement in the prediction of extremes was obtained with the inclusion of starting lake stage as a covariate.
538 Mode of access: World Wide Web. System requirements: World Wide Web browser and PDF reader.
590 Advisor: Mahmood Nachabe, Ph.D.
653 Regression; Time series; Autocorrelation; Flood; Drought
690 Dissertations, Academic -- USF -- Civil and Environmental Engineering -- Doctoral.
773 USF Electronic Theses and Dissertations.
856 http://digital.lib.usf.edu/?e14.2807