USF Libraries
USF Digital Collections

Nonlinear dependence and extremes in hydrology and climate

Material Information

Title:
Nonlinear dependence and extremes in hydrology and climate
Physical Description:
Book
Language:
English
Creator:
Khan, Shiraj
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla.
Publication Date:

Subjects

Subjects / Keywords:
Mutual information
South America
Precipitation
Time series
Extreme value distribution
CCSM3 climate model
Chaos
Dissertations, Academic -- Civil Engineering -- Doctoral -- USF   ( lcsh )
Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Summary:
ABSTRACT: The presence of nonlinear dependence and chaos has strong implications for predictive modeling and the analysis of dominant processes in hydrology and climate. Analysis of extremes may aid in developing predictive models in hydro-climatology by giving an enhanced understanding of the processes driving the extremes and perhaps by delineating possible anthropogenic or natural causes. This dissertation develops and utilizes a set of tools for predictive modeling, specifically for nonlinear dependence, extremes, and chaos, and tests the viability of these tools on real data. Commonly used dependence measures, such as linear correlation, the cross-correlogram, or Kendall's tau, cannot capture the complete dependence structure in data unless the structure is restricted to linear, periodic, or monotonic forms. Mutual information (MI) has been frequently utilized for capturing the complete dependence structure, including nonlinear dependence. Since geophysical data are generally finite and noisy, this dissertation attempts to address a key gap in the literature: the evaluation of recently proposed MI-estimation methods, particularly in terms of their robustness for short and noisy data, in order to choose the best method for capturing nonlinear dependence. For 100 data points, kernel density estimators (KDE) and k-nearest neighbors (KNN) perform best at high and low noise-to-signal levels, respectively, whereas for 1000 data points KNN performs best consistently across noise levels.
One real application of MI-based nonlinear dependence is to capture extrabasinal connections between the El Nino-Southern Oscillation (ENSO) and river flows in the tropics and subtropics, specifically the Nile, Amazon, Congo, Parana, and Ganges rivers; the analysis reveals 20-70% higher dependence than suggested so far by linear correlations. For extremes analysis, this dissertation develops a new measure, the precipitation extremes volatility index (PEVI), which measures the variability of extremes and is defined as the ratio of return levels. The spatio-temporal variability of PEVI, based on the Poisson-generalized Pareto (Poisson-GP) model, is investigated using weekly maxima observations available on 2.5-degree grids for 1940-2004 in South America. From 1965 to 2004, the PEVI shows increasing trends in a few parts of the Amazon basin and the Brazilian highlands, north-west Venezuela including Caracas, north Argentina, Uruguay, Rio de Janeiro, Sao Paulo, Asuncion, and Cayenne. Catingas, a few parts of the Brazilian highlands, Sao Paulo, and Cayenne experience an increasing number of consecutive 2- and 3-day extremes from 1965 to 2004. This dissertation also addresses the ability to detect a chaotic signal from finite time series observations of hydrologic systems. Tests with simulated data demonstrate the presence of thresholds, in terms of noise to chaotic-signal and seasonality to chaotic-signal ratios, beyond which the set of currently available tools is not able to detect the chaotic component. Our results indicate that the decomposition of a simulated time series into the corresponding random, seasonal, and chaotic components is possible from finite data. Real streamflow data from the Arkansas and Colorado rivers do not exhibit chaos. While a chaotic component can be extracted from the Arkansas data, such a component is either not present or cannot be extracted from the Colorado data.
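As a rough illustration of the MI-based dependence measure described in the abstract, the sketch below estimates MI with a Gaussian kernel density estimator and compares it with the linear correlation on simulated quadratic data. The bandwidth h = N^(-1/6) on standardized data follows the figure captions in the full text; everything else is an assumption made for illustration: the function names, the resubstitution form of the estimator, the simulated data, and the transform sqrt(1 - exp(-2 MI)) used to put MI on a correlation-like scale (a common choice that this record does not state).

```python
import numpy as np


def kde_density(points, data, h):
    """Evaluate a product-Gaussian kernel density estimate.

    points: (m, d) evaluation locations; data: (n, d) sample; h: bandwidth.
    """
    n, d = data.shape
    diff = (points[:, None, :] - data[None, :, :]) / h
    kernel = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    return kernel.sum(axis=1) / (n * (h * np.sqrt(2.0 * np.pi)) ** d)


def mutual_information_kde(x, y):
    """Resubstitution MI estimate (in nats) from Gaussian KDEs."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = x.size
    h = n ** (-1.0 / 6.0)  # bandwidth quoted in the figure captions
    xy = np.column_stack([x, y])
    p_xy = kde_density(xy, xy, h)
    p_x = kde_density(x[:, None], x[:, None], h)
    p_y = kde_density(y[:, None], y[:, None], h)
    return float(np.mean(np.log(p_xy / (p_x * p_y))))


def nonlinear_cc(mi):
    """ASSUMPTION: map MI to [0, 1) via sqrt(1 - exp(-2 * MI))."""
    return float(np.sqrt(1.0 - np.exp(-2.0 * max(mi, 0.0))))


# Hypothetical test data: a quadratic relation and an independent pair.
rng = np.random.default_rng(0)
n = 200
x = rng.standard_normal(n)
y_quad = x ** 2 + 0.1 * rng.standard_normal(n)
y_indep = rng.standard_normal(n)

mi_quad = mutual_information_kde(x, y_quad)
mi_indep = mutual_information_kde(x, y_indep)
r_quad = float(np.corrcoef(x, y_quad)[0, 1])

print("quadratic: linear r =", round(r_quad, 3),
      "nonlinear CC =", round(nonlinear_cc(mi_quad), 3))
print("independent: MI =", round(mi_indep, 3))
```

The quadratic pair is a case where a linear measure can miss dependence that MI detects. The estimate for the independent pair need not be exactly zero, which is consistent with the upward bias for KDE noted in the caption of Table 5.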
Thesis:
Dissertation (Ph.D.)--University of South Florida, 2007.
Bibliography:
Includes bibliographical references.
System Details:
System requirements: World Wide Web browser and PDF reader.
System Details:
Mode of access: World Wide Web.
Statement of Responsibility:
by Shiraj Khan.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains 160 pages.
General Note:
Includes vita.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001921026
oclc - 190845637
usfldc doi - E14-SFE0002142
usfldc handle - e14.2142
System ID:
SFS0026460:00001


Text (Electronic dissertation) in PDF format.
Advisor: Auroop R. Ganguly, Ph.D.

This dissertation study was partially funded by the SEED money funds of the Laboratory Directed Research and Development Program of the Oak Ridge National Laboratory (ORNL), managed by UT-Battelle, LLC for the U.S. Department of Energy (DOE) under Contract No. DE-AC05-00OR22725. The SEED money funded research has resulted in published or accepted peer-reviewed papers. Thus, this dissertation study is not subject to export control or related U.S. DOE regulations.

I wish to express my special gratitude to both of my mentors Dr. Auroop R. Ganguly and Prof. Sunil Saigal for generously supporting my dissertation with their time, commitment, scientific expertise, and helpful comments and suggestions. I would like to thank them from the bottom of my heart for encouraging me and stimulating my analytical and scientific thinking throughout my doctoral work which helped me immensely to develop independent thinking, research and writing skills. I would also like to thank Prof. Sunil Saigal for helping me realize my dreams of pursuing higher studies in the United States by offering me full scholarship in the Department of Civil and Environmental Engineering at Carnegie Mellon University (CMU) and for continually supporting me financially during my doctoral work.

I am grateful for having an exceptional doctoral committee and wish to thank Dr. David J. Erickson III, Dr. Thomas Wilbanks, Prof. Manish Agrawal, and Prof. Bellie Sivakumar for their continual support and encouragement. I wish to acknowledge and thank Drs. David J. Erickson III, Sharba Bandyopadhyay, Gabriel Kuhn, George Ostrouchov, Marcia Branstetter, and Prof. Amar Gupta for collaborating and imparting knowledge at various stages of my dissertation. I would like to thank all the people who provided feedback and support at various phases of my dissertation: Profs. Guiling Wang, Shafiqul Islam, Norberto O. Garcia, Rafael Bras, Tailen Hsing, and Drs. Alexander Kraskov, Earle Williams, Carlos Nobre, Rick Katz, Brant Liebmann, Dave Allured, United States Geological Survey (USGS) and Bureau of Reclamation (BoR). I would also like to take this opportunity to thank all my teachers at IIT Roorkee, CMU, and USF. My special thanks also go out to all my friends and apartment-mates in Tampa, Knoxville, Pittsburgh and elsewhere for supporting, encouraging and bearing with me throughout my doctoral studies.

Finally, I am very grateful to my mother (Rajia Khan) and father (Basi Ullah Khan) for their unwavering faith in my abilities, showing me the true worth of hard work, and supporting me throughout my life. I would like to thank Firoj (brother), Farha (sister), and Nazim (brother-in-law) for their love, support, and encouragement.

List of Tables iv
List of Figures vii
Abstract xvi
Chapter 1 Introduction 1
Chapter 2 Motivation 6
Chapter 3 Relative Performance of Mutual Information Estimation Methods for Quantifying the Dependence Among Short and Noisy Data 8
3.1 Introduction 9
3.2 Mutual information and its estimation methods 14
3.2.1 Definitions of mutual information 14
3.2.2 Mutual information estimators 16
3.2.2.1 Kernel density estimators (KDE) 16
3.2.2.2 k-nearest neighbors (KNN) 17
3.2.2.3 Edgeworth approximation of differential entropy (Edgeworth) 18
3.2.2.4 Adaptive partitioning of the XY plane (Cellucci) 19
3.3 Details of the data 19
3.3.1 Computations of theoretical mutual information 21
3.3.1.1 Linear 21
3.3.1.2 Quadratic 22
3.3.1.3 Periodic 22
3.4 Results 23
3.4.1 Performance of linear and nonlinear dependence measures 28
3.4.1.1 Linear 29
3.4.1.2 Quadratic 31
3.4.1.3 Periodic 33
3.4.1.4 Chaotic 35
3.4.2 Performance of KDE and KNN with different parameter values 36
3.5 Conclusion and discussion 37
Chapter 4 Nonlinear Statistics Reveals Stronger Ties Between ENSO and the Tropical Hydrological Cycle 40
4.1 Introduction 40
4.2 Data and methodology 41
4.2.1 ENSO and river flow data 41
4.2.2 Mutual information (MI) 44
4.3 MI estimation methods 46
4.3.1 Kernel density estimator (KDE) 47
4.3.2 k-nearest neighbors (KNN) 47

4.4 Analysis of simulations 48
4.4.1 Details of the simulated data 48
4.4.2 Conclusion from simulations 49
4.5 Comparisons of MI estimation methods using simulations 49
4.5.1 Comparison between KDE, KNN, and Edgeworth 49
4.5.1.1 Simulation cases and their theoretical values 54
4.5.1.2 Short time series 61
4.5.1.3 Long time series 64
4.5.1.4 Conclusion from the analysis of simulations 64
4.5.1.5 Discussion from the analysis of simulations 66
4.5.2 Comparison of nonlinear dependence measures with a rank-based dependence 66
4.6 Real data analysis 75
4.6.1 Description of results 79
4.6.2 Conclusion from the analysis of real data 80
4.7 Discussion 80
Chapter 5 Spatio-temporal Variability of Daily and Weekly Precipitation Extremes in South America 82
5.1 Introduction 82
5.2 Data and methodology 86
5.2.1 Data availability 86
5.2.2 Methodology 86
5.2.2.1 Poisson-GP model 86
5.2.2.2 Precipitation extremes volatility index (PEVI) 88
5.2.2.3 Quality of the Poisson-GP model 89
5.2.3 Data preparation for the validity of the Poisson-GP model 91
5.2.3.1 Daily 96
5.2.3.2 Weekly maxima 99
5.2.3.3 Weekly maxima residuals 102
5.3 Results and discussions 111
5.3.1 Brazil 115
5.3.2 North Argentina 116
5.3.3 Venezuela 117
5.3.4 Uruguay 117
5.3.5 Paraguay 118
5.3.6 Suriname and French Guiana 118
5.3.7 Extremes with topography and vegetation 119
5.4 Summary and conclusions 121
Chapter 6 Detection and Predictive Modeling of Chaos in Finite Hydrological Time Series 124
6.1 Introduction 124
6.2 Tools and methods 125
6.2.1 State of the art and literature review: tools and concepts 125
6.2.1.1 Correlation dimension 126
6.2.1.2 Artificial neural networks (ANNs) 127
6.2.1.3 Phase-space reconstruction (PSR) prediction 128
6.3 Data description 129
6.3.1 Simulated data 129
6.3.2 Hydrologic time series 130
6.4 Results with simulated data 131

6.4.2 Mixed time series 134
6.4.2.1 Mixture of chaotic and seasonal series 134
6.4.2.2 Mixture of chaotic and random series 140
6.5 Analysis with hydrologic time series 145
6.5.1 Arkansas river 145
6.5.2 Colorado river 146
6.6 Summary and conclusions 149
Chapter 7 Conclusions 151
References 153
About the Author End Page

Table 1. Linear: Description of results where each entry consists of three columns given as (1) Column 1: 0, -, or +, where `0', `-' and `+' mean nonlinear CCs are zero, negatively and positively biased with respect to theoretical CCs, respectively, (2) Column 2: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with theoretical CCs, respectively, and (3) Column 3: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with linear CCs, respectively. Bold and slanted entries indicate the best and the second best methods for each case specified in the top headings of the table, respectively. 29
Table 2. Quadratic: Description of results where each entry consists of three columns given as (1) Column 1: 0, -, or +, where `0', `-' and `+' mean nonlinear CCs are zero, negatively and positively biased with respect to theoretical CCs, respectively, (2) Column 2: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with theoretical CCs, respectively, and (3) Column 3: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with linear CCs, respectively. Bold and slanted entries indicate the best and the second best methods for each case specified in the top headings of the table, respectively. 30
Table 3. Periodic: Description of results where each entry consists of three columns given as (1) Column 1: 0, -, or +, where `0', `-' and `+' mean nonlinear CCs are zero, negatively and positively biased with respect to theoretical CCs, respectively, (2) Column 2: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with theoretical CCs, respectively, and (3) Column 3: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with linear CCs, respectively. Bold and slanted entries indicate the best and the second best methods for each case specified in the top headings of the table, respectively. 31
Table 4. Runoff data statistics (1000 m$^3$/s). 43
Table 5. MI estimates with standard errors given in parentheses between two Gaussian noise sets $(X_i, Y_i)$: $X \sim N(0,1)$; $Y \sim N(0,1)$; $i = 1, \ldots, N$, where X and Y are iid and independent of each other. The total number of samples for N = 50, N = 100, and N = 1000 are 200, 100, and 20, respectively. The MI estimates and its standard errors are the mean and standard deviation from the total samples. The MI should be zero between two Gaussian noise sets. The MI estimates obtained from all three methods are close to zero but biased upwards in the case of KDE and KNN. 56

), are estimated as $2(\hat{\lambda}, \hat{\rho}) - (\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$, where $(\hat{\lambda}, \hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.
Table 7. Linear and nonlinear CCs between the annual flow of the Amazon River and the ENSO index averaged for eight quarters. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month. The bias-corrected estimates are estimated as $2(\hat{\lambda}, \hat{\rho}) - (\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$, where $(\hat{\lambda}, \hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.
Table 8. Linear and nonlinear CCs between the annual flow of the Congo River and the ENSO index averaged for eight quarters. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month. The bias-corrected estimates are estimated as $2(\hat{\lambda}, \hat{\rho}) - (\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$, where $(\hat{\lambda}, \hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.
Table 9. Linear and nonlinear CCs between the annual flow of the Parana River and the ENSO index averaged for eight quarters. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month. The bias-corrected estimates are estimated as $2(\hat{\lambda}, \hat{\rho}) - (\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$, where $(\hat{\lambda}, \hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations. $\widehat{\mathrm{MSE}}$ and its standard errors given in parentheses are the mean and standard deviation of MSEs estimated from 100 jackknife replications of size 0.8N observations. 77

), are estimated as $2(\hat{\lambda}, \hat{\rho}) - (\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$, where $(\hat{\lambda}, \hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.
Table 11. Variation in the annual flow of rivers associated with ENSO. Linear and nonlinear CCs are estimated using LR and KDE, respectively. Months in a quarter are given in []. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month. 79
Table 12. Streamflow data statistics (values in m$^3$/s) 131
Table 13. Separation of white noise from a mixture of chaotic, seasonal and white noise series using the PSR with m = 10. 139
Table 14. Separation of white noise from a mixture of Lorenz (X component) and white noise using the PSR with m = 10. 141
Table 15. Separation of white noise from a mixture of Lorenz (X component) and white noise using the ANN with m = 10. 141
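The table captions above describe a jackknife-style bias correction: an estimate from all N observations, 100 replications on subsets of size 0.8N, and a corrected value of twice the full estimate minus the replication mean. A minimal sketch of that recipe follows. It is not the dissertation's code: the function names, the use of Pearson correlation as the estimator, random subsets drawn without replacement, and the sample standard deviation (ddof=1) are all assumptions made for illustration. The 5% and 95% quantile bounds follow the caption of Figure 31.

```python
import numpy as np


def pearson_cc(x, y):
    """Linear correlation coefficient, standing in for any CC estimator."""
    return float(np.corrcoef(x, y)[0, 1])


def jackknife_summary(x, y, estimator, n_rep=100, frac=0.8, seed=0):
    """Bias-corrected estimate from random subsets of size frac * N.

    ASSUMPTION: subsets are drawn without replacement, independently
    for each replication; the source captions do not say how.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    m = int(round(frac * n))
    theta_hat = estimator(x, y)          # estimate from all N observations
    reps = np.empty(n_rep)
    for r in range(n_rep):
        keep = rng.choice(n, size=m, replace=False)
        reps[r] = estimator(x[keep], y[keep])
    theta_dot = float(reps.mean())       # mean of the replications
    lower, upper = np.quantile(reps, [0.05, 0.95])
    return {
        "estimate": theta_hat,
        "jackknife_mean": theta_dot,
        "bias_corrected": 2.0 * theta_hat - theta_dot,
        "std_error": float(reps.std(ddof=1)),  # ddof=1 is an assumption
        "lower_90": float(lower),
        "upper_90": float(upper),
    }


# Hypothetical data: a noisy linear relation with N = 100.
rng = np.random.default_rng(1)
x = rng.standard_normal(100)
y = x + 0.5 * rng.standard_normal(100)
result = jackknife_summary(x, y, pearson_cc)
print(result)
```

Any estimator with the same `(x, y)` signature, such as an MI-based nonlinear CC, could be passed in place of `pearson_cc`.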

Figure 1. Plot of 100 points with different noise-to-signal ratios (shown by plus) and with zero noise level (shown by dots). Noise-to-signal ratios on the left and right figures are 0.1 and 0.5, respectively. (A) $X \sim N(0,1)$; $Y$: $y_i = x_i^2 + \varepsilon_i$, where $\varepsilon \sim N(0, \sigma_\varepsilon)$ is the Gaussian noise with zero mean and $\sigma_\varepsilon$ standard deviation. (B) $X$: $x_i = H_{x_i} + \varepsilon_{x_i}$; $Y$: $y_i = H_{y_i} + \varepsilon_{y_i}$, where $H_X$ and $H_Y$ are the X and Y components of the Henon map, respectively. $\varepsilon_x \sim N(0, \sigma_{H_X})$ and $\varepsilon_y \sim N(0, \sigma_{H_Y})$, where $\sigma_{H_X}$ and $\sigma_{H_Y}$ are the standard deviations of $H_X$ and $H_Y$, respectively. 11
Figure 2. Linear: normal (left) and kernel (right) densities with different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) with 100 points. For kernel density, a Gaussian kernel with optimal smoothing parameter $h_o$ given in Eq. (8) is used. (A) $\sigma_\varepsilon/\sigma_s = 0.2$. (B) $\sigma_\varepsilon/\sigma_s = 0.9$. The linear dependence structure can be seen clearly in (A) but cannot be readily identified in (B) based on eye estimation. 24
Figure 3. Quadratic: normal (left) and kernel (right) densities with different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) with 100 points. For kernel density, a Gaussian kernel with optimal smoothing parameter $h_o$ given in Eq. (8) is used. (A) $\sigma_\varepsilon/\sigma_s = 0.2$. (B) $\sigma_\varepsilon/\sigma_s = 0.9$. At low noise, such as in (A), the nonlinear dependence can be clearly seen as shown by the kernel density. However at high noise, such as in (B), the dependence structure is not readily discernible visually from the kernel density. 25
Figure 4. Periodic: normal (left) and kernel (right) densities with different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) with 100 points. For kernel density, a Gaussian kernel with optimal smoothing parameter $h_o$ given in Eq. (8) is used. (A) $\sigma_\varepsilon/\sigma_s = 0.2$. (B) $\sigma_\varepsilon/\sigma_s = 0.9$. With increasing noise levels, the nonlinear dependence structure cannot be identified visually as shown by the kernel density plots. 26
Figure 5. Chaotic: normal (left) and kernel (right) densities with different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) with 100 points. For kernel density, a Gaussian kernel with optimal smoothing parameter $h_o$ given in Eq. (8) is used. (A) $\sigma_\varepsilon/\sigma_s = 0.2$. (B) $\sigma_\varepsilon/\sigma_s = 0.9$. Kernel density plot shows the Henon attractor in (A). However the Henon attractor cannot be readily distinguished visually in (B). 27
Figure 6. Linear: Comparisons between linear CCs from LR and nonlinear CCs from KDE, KNN, Edgeworth, Cellucci, and Kendall's $\tau$, at different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) for (A) 50 points, (B) 100 points, and (C) 1000 points. 28
Figure 7. Quadratic: Comparisons between linear CCs from LR and nonlinear CCs from KDE, KNN, Edgeworth, Cellucci, and Kendall's $\tau$, at different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) for (A) 50 points, (B) 100 points, and (C) 1000 points. 30

Figure 9. Chaotic: Comparisons between linear CCs from LR and nonlinear CCs from KDE, KNN, Edgeworth, Cellucci, and Kendall's $\tau$, at different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) for (A) 50 points, (B) 100 points, and (C) 1000 points. 34
Figure 10. Performance of KDE and KNN with different values of smoothing parameter (h) and number of nearest neighbors (k), respectively. The results from quadratic and periodic are presented in the left and right, respectively. (A) KDE with 100 points. (B) KNN with 100 points. (C) KNN with 1000 points. In (A), $h_o$ is the optimal smoothing parameter for a Gaussian kernel given in Eq. (8). 36
Figure 11. Annual flow (a) and average monthly flow (b) of the Nile River from 1873-1989, Amazon River from 1903-1985, Congo River from 1905-1985, Parana River from 1904-1997, and Ganges River from 1934-1993. The following years happened to be the warm episodes of ENSO: 1877, 1880, 1884, 1887, 1891, 1896, 1899, 1902, 1905, 1911, 1914, 1918, 1923, 1925, 1930, 1932, 1939, 1941, 1951, 1953, 1957, 1965, 1969, 1972, 1976, 1982, 1986, 1991, 1993, and 1997. The average annual flow and El Nino years are shown as dotted lines and solid dots, respectively, as shown in (a). 42
Figure 12. Comparison of linear (LR) and MI-based dependence obtained after fitting bivariate normal distribution (Norm) to each pair. The dependence is estimated with respect to different noise to signal ratios and quarters in the simulated and real data, respectively. (a) Case 4 (Chaotic): Henon map with 100 points. (b) Relationship between ENSO and Nile River flow. In (a), the mean dependence from both cases is the same whereas there is a very slight difference in variances for a few noise to signal ratios. In (b), both cases capture the same mean dependence whereas variances differ very slightly for a few quarters. 45
Figure 13. Normal and kernel densities with different noise ($\sigma_n$) to signal ($\sigma_s$) ratios for Case 1 (Linear) with N = 100. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$, is used. (a) $\sigma_n/\sigma_s = 0.1$. (b) $\sigma_n/\sigma_s = 0.5$. (c) $\sigma_n/\sigma_s = 1.0$. The linear dependence structure can be seen clearly for cases (a) and (b) but cannot be readily identified for case (c) based on eye estimation. 50
Figure 14. Normal and kernel densities with different noise ($\sigma_n$) to signal ($\sigma_s$) ratios for Case 2 (Quadratic) with N = 100. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$, is used. (a) $\sigma_n/\sigma_s = 0.1$. (b) $\sigma_n/\sigma_s = 0.5$. (c) $\sigma_n/\sigma_s = 1.0$. At lower noise levels, such as in cases (a) and (b), the nonlinear dependence can be clearly seen as shown by the kernel density plots. However at higher noise levels, such as in case (c), the dependence structure is not readily discernible visually from the kernel density. 51
Figure 15. Normal and kernel densities with different noise ($\sigma_n$) to signal ($\sigma_s$) ratios for Case 3 (Periodic) with N = 100. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$, is used. (a) $\sigma_n/\sigma_s = 0.1$. (b) $\sigma_n/\sigma_s = 0.5$. (c) $\sigma_n/\sigma_s = 1.0$. With increasing noise levels, the nonlinear dependence structure cannot be identified visually, as shown by the kernel density plots. 52

Figure 17. Nonlinear and linear CCs for Case 1 (Linear) with 90% confidence bounds obtained from KDE and LR, respectively. (a) N = 50. (b) N = 100. In all cases, linear and nonlinear estimates from all three methods overlap with theoretical CCs indicating that the linear and nonlinear estimation methods capture the true dependence when there is only a linear dependence. But at higher noise levels, KDE seems to have an edge over KNN and Edgeworth because of its narrow bounds. 57
Figure 18. Nonlinear and linear CCs between functions, such as (a) Case 1 (Linear); (b) Case 2 (Quadratic); (c) Case 3 (Periodic); and (d) Case 4 (Chaotic), and their 90% confidence bounds are obtained using KDE and LR, respectively. CCs and their 90% bounds are obtained from 200 samples of size N = 50. At higher noise levels, KDE captures the true dependence given by theoretical CCs as shown in (a), (b), and (c). In (c), KDE estimates are not different from linear CCs considering 90% confidence bounds. In (b) and (d), KDE gives more correlation as compared to the linear correlation and there is a clear separation between their 90% confidence bounds. 58
Figure 19. Nonlinear and linear CCs between functions, such as (a) Case 1 (Linear); (b) Case 2 (Quadratic); (c) Case 3 (Periodic); and (d) Case 4 (Chaotic), and their 90% confidence bounds are obtained using KDE and LR, respectively. CCs and their 90% bounds are obtained from 100 samples of size N = 100. At higher noise levels, KDE captures the true dependence given by theoretical CCs as shown in (a), (b), and (c). In (c), KDE estimates are not different from linear CCs considering 90% confidence bounds. In (b) and (d), KDE gives more correlation as compared to the linear correlation and there is a clear separation between their 90% confidence bounds. 59
Figure 20. Nonlinear and linear CCs with 90% confidence bounds obtained from KDE and LR, respectively, using N = 1000 points. (a) Case 1 (Linear); (b) Case 2 (Quadratic); (c) Case 3 (Periodic); and (d) Case 4 (Chaotic). At lower noise levels, KNN seems to be the best as it overlaps with theoretical CCs and has narrow bounds. In (c), linear and Edgeworth estimates overlap exactly. The performance of Edgeworth is not good in (c) and (d). At higher noise levels, KDE and KNN estimates overlap and also include theoretical CCs but KNN estimates also overlap with linear CCs in (d). Thus, KDE seems to have an edge over KNN as its 90% confidence bounds are narrow and do not overlap with linear CCs when the data is noisy and relatively large. 60
Figure 21. Nonlinear and linear CCs for Case 2 (Quadratic) with 90% confidence bounds obtained from KDE and LR, respectively. (a) N = 50. (b) N = 100. All three nonlinear correlation estimates include theoretical CCs but 90% confidence bounds from KNN and Edgeworth also overlap with linear CCs at higher noise levels in (a). This shows that KNN and Edgeworth estimates are not different from linear CCs at higher noise levels. KDE quantifies more correlation as compared to the linear correlation as their 90% confidence bounds do not overlap indicating that KDE can truly capture the nonlinear dependence. 61

Figure 23. Nonlinear and linear CCs for Case 4 (Chaotic) with 90% confidence bounds obtained from KDE and LR, respectively. (a) N = 50. (b) N = 100. At higher noise levels, KNN and Edgeworth CCs overlap with linear CCs indicating that they do not capture anything more than the linear correlation. KDE is the best in capturing the nonlinear dependence as its 90% confidence bounds do not overlap with linear CCs. 63
Figure 24. Comparison of correlation coefficients obtained for Case 2 (Quadratic) from KNN with different number of nearest neighbors (k), i.e., 3, 7, 11, and 15. (a) N = 50. (b) N = 100. As k increases, both the bias and variance increase at lower noise levels. For higher noise levels, the bias increases but the variance decreases as k increases. 65
Figure 25. Correlation coefficients for Case-Cubic with 90% confidence bounds obtained from LR, KDE, KNN, Edgeworth, and Kendall's tau. (a) N = 50. (b) N = 100. (c) N = 1000. For all cases, the lowest curve is obtained from Kendall's tau. In (a) and (b), Kendall's tau overlaps with linear at lower noise levels. But at higher noise levels, it overlaps with KNN. In (c), Kendall's tau captures the lowest dependence. 67
Figure 26. The bivariate normal and kernel density between the ENSO index for different quarters and the annual flow of the Nile River. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$ where N is the total number of observations, is used. (a) Quarter 1. (b) Quarter 4. (c) Quarter 5. Quarter 1 and 5 show the lowest and highest linear CCs between the ENSO index and the Nile flow, respectively (Table 6). Quarter 1 and 4 show the lowest and highest nonlinear CCs between the ENSO index and the Nile flow, respectively (Table 6). 69
Figure 27. The bivariate normal and kernel density between the ENSO index for different quarters and the annual flow of the Amazon River. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$ where N is the total number of observations, is used. (a) Quarter 3. (b) Quarter 7. Quarter 7 and 3 show the lowest and highest linear and nonlinear CCs between the ENSO index and the Amazon flow, respectively (Table 7). 70
Figure 28. The bivariate normal and kernel density between the ENSO index for different quarters and the annual flow of the Congo River. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$ where N is the total number of observations, is used. (a) Quarter 2. (b) Quarter 3. (c) Quarter 7. Quarter 7 and 2 show the lowest and highest linear CCs between the ENSO index and the Congo flow, respectively (Table 8). Quarter 7 and 3 show the lowest and highest nonlinear CCs between the ENSO index and the Congo flow, respectively (Table 8). 71

Figure 30. The bivariate normal and kernel density between the ENSO index for different quarters and the annual flow of the Ganges River. For kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$ where N is the total number of observations, is used. (a) Quarter 1. (b) Quarter 5. Quarter 1 and 5 show the lowest and highest linear and nonlinear CCs between the ENSO index and the Ganges flow, respectively (Table 10). 73
Figure 31. Nonlinear and linear CCs with their 90% confidence bounds between ENSO and annual river flows of Nile, Amazon, Congo, Parana, and Ganges using KDE and LR approaches, respectively. The bias-corrected estimates, plotted as solid dots, are estimated as $2(\hat{\lambda}, \hat{\rho}) - (\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$, where $(\hat{\lambda}, \hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)}, \hat{\rho}_{(\cdot)})$ is the mean of 100 jackknife replications of size 0.8N observations. The 90% confidence bounds are given by 5% and 95% quantiles of 100 jackknife replications of size 0.8N. 74
Figure 32. Percentage of total data available at each grid point: (a) Percentage of daily data available in 65 years from 1940-2004; and (b) Mean percentage of daily data available in 40 years from 1965-2004 computed using 25-year moving window from 1965-2004, i.e., 1965-1989, 1966-1990, ..., 1980-2004. Each grid point having at least 14 years of data is considered for the analysis. This means that all grid points having more than 22% and 56% of data are used for the analysis in (a) and (b), respectively. The white regions on the map indicate either non-availability of data or insufficient data, i.e., less than 14 years of data, for the analysis. 85
Figure 33. Grid point having (longitude, latitude) as (315, 10): Daily data with threshold given as 99%-quantile (shown as a horizontal line in blue in (a) and (b)). (a) Time series for 65 years; (b) Time series for 4 years; (c) Excesses over a threshold for the first 10 years; (d) Auto-correlation plot; (e) Probability plot; and (f) Quantile plot. We observe strong seasonality and temporal dependence and also some clustering of extremes. The quality of probability and quantile plots is poor. 93
Figure 34. Grid point having (longitude, latitude) as (310, 25): Daily data with threshold given as 99%-quantile (shown as a horizontal line in blue in (a) and (b)). (a) Time series for 65 years; (b) Time series for 4 years; (c) Excesses over a threshold for the first 10 years; (d) Auto-correlation plot; (e) Probability plot; and (f) Quantile plot. The seasonal patterns are weak but there exists temporal dependence and clusters of extremes. The quality of probability and quantile plots is good. 94
Figure 35.


Figure 37. Grid point having (longitude, latitude) as (310, -25): Weekly maxima data with threshold given as 95%-quantile (shown as a horizontal line in blue in (a) and (b)). (a) Time series for 65 years; (b) Time series for 4 years; (c) Excesses over a threshold for the first 20 years; (d) Auto-correlation plot; (e) Probability plot; and (f) Quantile plot. The seasonal patterns are not evident from time series plots and there is some improvement in clustering of extremes as compared to daily data (Figure 34c). We observe some temporal dependence but it seems to be of the same order as from daily (Figure 34d). The quality of probability and quantile plots is good but not better than the plots from daily (Figures 34e,f).
Figure 38. Grid point having (longitude, latitude) as (315, -10): Weekly maxima residuals data with threshold given as 95%-quantile (shown as a horizontal line in blue in (a) and (b)). (a) Time series for 65 years; (b) Time series for 4 years; (c) Excesses over a threshold for the first 10 years; (d) Auto-correlation plot; (e) Probability plot; and (f) Quantile plot. There exist strong seasonal patterns and clusters of extremes. We observe temporal dependence but it is less as compared to daily and weekly maxima (Figures 33d and 36d). The quality of probability and quantile plots is good and also better than that from daily and weekly maxima (Figures 33e,f and 36e,f).
Figure 39. Grid point having (longitude, latitude) as (310, -25): Weekly maxima residuals data with threshold given as 95%-quantile (shown as a horizontal line in blue in (a) and (b)). (a) Time series for 65 years; (b) Time series for 4 years; (c) Excesses over a threshold for the first 20 years; (d) Auto-correlation plot; (e) Probability plot; and (f) Quantile plot. The seasonal patterns are absent. There is no improvement in clustering of extremes as compared to weekly maxima (Figure 37c). The temporal dependence disappears completely. We observe significant improvements in temporal dependence as compared to daily and weekly maxima (Figures 34d and 37d). The quality of probability and quantile plots is good.
Figure 40. Scale ($\sigma$) and shape ($\xi$) parameters and their standard errors from weekly maxima precipitation for 1940-2004: (a) Spatial variability of $\sigma$ in mm; (b) Spatial variability of standard errors of $\sigma$ in mm; (c) Spatial variability of $\xi$; and (d) Spatial variability of standard errors of $\xi$.
Figure 41. Scale parameter ($\sigma$) and its standard errors in mm from weekly maxima precipitation residuals: (a) Spatial variability of $\sigma$ from 1940-2004; (b) Spatial variability of standard errors of $\sigma$ from 1940-2004; (c) Temporal variability from 1965-2004; and (d) $R^2$ from linear trends shown in (c). In (c), the white region at a location given by (longitude, latitude) as (295, -2.5) indicates -1.77.
Figure 42. Shape parameter ($\xi$) and its standard errors from weekly maxima precipitation residuals: (a) Spatial variability of $\xi$ from 1940-2004; (b) Spatial variability of standard errors of $\xi$ from 1940-2004; (c) Temporal variability from 1965-2004; and (d) $R^2$ from linear trends shown in (c). In (c), the white region at a location given by (longitude, latitude) as (295, -2.5) indicates 0.063.


Figure 44. Spatial variability of 50-year and 200-year RLs and their standard errors in mm from weekly maxima precipitation residuals for 1940-2004: (a) 50-year RL; (b) Standard errors of 50-year RL; (c) 200-year RL; and (d) Standard errors of 200-year RL. In (d), the white region at a location given by (longitude, latitude) as (312.5, -7.5) indicates 193.48 mm.
Figure 45. Temporal variability of 50-year and 200-year return levels (RL) from weekly maxima precipitation residuals for 1965-2004: (a) Temporal variability of 50-year RL from 1965-2004; (b) $R^2$ from linear trends shown in (a); (c) Temporal variability of 200-year RL from 1965-2004; and (d) $R^2$ from linear trends shown in (c). In (c), the white regions at four locations given by (longitude, latitude) as (292.5, -5), (302.5, -5), (305, -5) and (307.5, -7.5) indicate -22.07, -26.39, -21.12, and -34.47, respectively.
Figure 46. Precipitation extremes volatility index (PEVI), defined as the ratio of 200-year and 50-year RLs, from weekly maxima precipitation residuals: (a) Spatial variability for 1940-2004; (b) Temporal variability from 1965-2004; and (c) $R^2$ from linear trends shown in (b). In (a), the white regions at two locations given by (longitude, latitude) as (312.5, -7.5) and (315, -7.5) indicate 2.22 and 1.82, respectively. In (b), the white region at a location given by (longitude, latitude) as (307.5, -7.5) indicates 0.042.
Figure 47. (Please look at the last figure for an enlarged one) Percentage of the number of consecutive 2- and 3-days extremes out of the total number of extremes based on daily precipitation for 1940-2004. Threshold is chosen as the 99%-quantile of daily time series. (a) Spatial variability of consecutive 2-days extremes from 1940-2004; (b) Spatial variability of consecutive 3-days extremes from 1940-2004, where the yellow regions showing values between 0 and -2 do not indicate any values but represent regions where the number of consecutive 3-days extremes is zero; (c) Temporal variability of consecutive 2-days extremes from 1965-2004; (d) Temporal variability of consecutive 3-days extremes from 1965-2004, where the yellow regions showing values between -0.6 and -0.8 do not indicate any values but represent regions where the number of consecutive 3-days extremes is zero; (e) $R^2$ from linear trends shown in (c); and (f) $R^2$ from linear trends shown in (d), where the yellow regions that lie between 0 and -0.2 do not indicate $R^2$ values but represent grids where the number of consecutive 3-days extremes is zero. In (c), the white region at a location given by (longitude, latitude) as (295, -7.5) indicates 4.66. In (d), the white regions at two locations given by (longitude, latitude) as (295, -7.5) and (302.5, -10) indicate 4.71 and -0.97, respectively.
Figure 48. Percentage of the number of monthly extremes out of the total number of extremes based on daily precipitation for the period 1940-2004. Threshold is chosen as the 99%-quantile of daily time series. Extremes mostly occur from December to April with January receiving the highest number of extremes. The period from July to October is relatively quieter with respect to extremes.


Figure 50. Monthly streamflow time series observed at the Arkansas river.
Figure 51. Daily streamflow series observed at the Colorado river.
Figure 52. Ln C(r) vs. Ln r plot for Lorenz (X component) time series. The curves are shown from top to bottom in ascending order of embedding dimension, m = 2, 4, .., 20.
Figure 53. Relation between correlation exponent and embedding dimension for Lorenz (X component), seasonal, and white noise series.
Figure 54. The variation of CC and MSE with forecast lead time for white noise with $\sigma = 0.16$.
Figure 55. Relation between correlation exponent and embedding dimension for mixed time series. Series includes Lorenz X-component and seasonality with f = 10 Hz and different amplitudes.
Figure 56. Mixed time series (Lorenz X-component and seasonality) and its periodograms showing the variation of power spectral density (PSD) with frequency. Top: Lorenz X-component and seasonality with f = 10 Hz and A = 5. Middle: Lorenz X-component and seasonality with f = 25 Hz and A = 10. Bottom: Lorenz X-component and seasonality with f = 50 Hz and A = 13.
Figure 57. Variation of CC and MSE with forecast lead time for chaotic and mixed series. L, WN and S stand for Lorenz, white noise and seasonality, respectively.
Figure 58. Correlation exponent vs. embedding dimension plot for mixed time series consisting of Lorenz (X component) and white noise.
Figure 59. Correlation exponent vs. embedding dimension for monthly streamflow series at the Arkansas river.
Figure 60. Ln C(r) vs. Ln r plot for the series, after removing noise, observed at the Arkansas river. The curves are shown in ascending order of embedding dimension, m = 1, 2, .., 20 from top to bottom.
Figure 61. Multistep ahead predictions for the Arkansas river streamflow data. Top: one-step ahead predictions. Middle: two-step ahead predictions. Bottom: three-step ahead predictions.
Figure 62. Monthly streamflow data at the Arkansas river: Variation of MSE with forecast lead time for the original and deterministic data. The deterministic data is obtained after removing noise from original data.
Figure 63. Correlation exponent vs. embedding dimension plot for daily streamflow series at the Colorado river.
Figure 64. Ln C(r) vs. Ln r for daily streamflow series, after removing noise and seasonality, at the Colorado river. The curves are shown in ascending order of embedding dimension, m = 1, 2, .., 20 from top to bottom.
Figure 65. Multistep ahead predictions for the Colorado river streamflow data. Top: one-step ahead predictions. Middle: two-step ahead predictions. Bottom: three-step ahead predictions.


The presence of nonlinear dependence and chaos has strong implications for predictive modeling and the analysis of dominant processes in hydrology and climate. Analysis of extremes may aid in developing predictive models in hydro-climatology by giving enhanced understanding of processes driving the extremes and perhaps delineating possible anthropogenic or natural causes. This dissertation develops and utilizes a different set of tools for predictive modeling, specifically nonlinear dependence, extremes, and chaos, and tests the viability of these tools on real data. Commonly used dependence measures, such as linear correlation, cross-correlogram or Kendall's $\tau$, cannot capture the complete dependence structure in data unless the structure is restricted to linear, periodic or monotonic. Mutual information (MI) has been frequently utilized for capturing the complete dependence structure including nonlinear dependence. Since the geophysical data are generally finite and noisy, this dissertation attempts to address a key gap in the literature, specifically, the evaluation of recently proposed MI-estimation methods to choose the best method for capturing nonlinear dependence, particularly in terms of their robustness for short and noisy data. The performance of kernel density estimators (KDE) and k-nearest neighbors (KNN) is the best for 100 data points at high and low noise-to-signal levels, respectively, whereas KNN is the best for 1000 data points consistently across noise levels. One real application of nonlinear dependence based on MI is to capture extrabasinal connections between El Nino-Southern Oscillation (ENSO) and river flows in the tropics and subtropics, specifically the Nile, Amazon, Congo, Parana, and Ganges rivers, which reveals 20-70% higher dependence than those suggested so far by linear correlations. For extremes analysis, this dissertation develops a new measure, the precipitation extremes volatility index (PEVI), which measures the variability of extremes and is defined as the ratio of return levels. Spatio-temporal variability of PEVI, based on the Poisson-generalized Pareto (Poisson-GP) model, is investigated on weekly maxima observations available at 2.5° grids for 1940-2004 in South America. From 1965-2004, the PEVI shows increasing trends in few parts of the Amazon basin and the Brazilian


In hydrology and climate, the presence of nonlinear dependence and chaos has strong implications for predictive modeling and the analysis of dominant processes. Analysis of hydrological and climatological extremes may also aid in predictive modeling by giving enhanced understanding of processes driving the extremes and perhaps delineating possible anthropogenic or natural causes. This dissertation analyzes and develops three components of predictive modeling, specifically nonlinear dependence, extremes, and chaos, and tests their viability and scalability on real data. This section introduces these components, describes the state of the art in each component, points out gaps in their respective literatures, and outlines procedures for targeting their respective gaps.

In nonlinear systems, the understanding of underlying nonlinear processes and their interactions is very important for predictive modeling as well as for generating bounds on predictability. However, data analysis methods based on nonlinear dynamical approaches are typically not robust when applied to short and noisy data [1]. The definition of what constitutes short and noisy, in terms of data sizes and noise-to-signal ratios, may be application and context specific. A consideration of data availability scenarios in a couple of domains, specifically the earth sciences and biomedical engineering, in conjunction with the literature on mutual information (MI) estimation methods, suggests that a critical gap continues to exist in our understanding of situations where the length of data sets is short, particularly of the order of 100 or 1000 data points. Linear correlation may not be an adequate measure of dependence even for simple nonlinear functional forms. This can be simply shown in the case of two variables $(X, Y)$, where $Y = X^2$, and $X$ is uniformly distributed in the interval $(-1, 1)$. The theoretical covariance and hence the linear correlation reduces to zero even though the variable $Y$ is completely specified once $X$ is known. The situation gets even more problematic when the nonlinear interactions get more complex. The problem of detecting excessive spurious dependence or missing existing dependence structures among nonlinear signals is exacerbated for short and noisy data. The degree to which even a small amount of noise can obscure the underlying dependence structure is evident from Fig. 1, which shows two cases, quadratic and Henon, based on simulations with 100 points each. In both cases, the simulated data are contaminated with Gaussian noise with zero mean and standard deviation given


ENSO events impact regional precipitation in the tropics and subtropics, ultimately causing inter-annual variability in river flows. The ocean-atmosphere-land interactions are complex and far from being completely understood and accurately modeled. A slight disturbance in these interactions would usually result in sometimes surprising distant correlations and climate patterns. Analyses of the rainfall anomalies during the warm (El Nino) and cold (La Nina) episodes of ENSO suggest the existence of nonlinear sea surface temperature (SST)-rainfall relationships in the tropics and a strong influence of SST forcing on equatorial rainfall in the geographic vicinity of that forcing [11]. To properly explain and ultimately predict this variability, it is important to disentangle, as far as possible, long range climatic phenomena from recent effects such as those possibly produced by deforestation and global warming. While the relationships among many climate and hydrological variables are decidedly nonlinear [12], linear dependence measures are still being used as a


Precipitation extremes can have significant impacts on human society, economics, and nature. An understanding of the intensity and frequency of precipitation extremes can be very useful for infrastructure development to prevent flooding and landslides, as well as for water resources and agricultural management. A better understanding of precipitation extremes can help hydrologic scientists and climatologists gain enhanced understanding of precipitation processes driving the extremes and perhaps delineate possible anthropogenic or natural causes. Previous studies investigated trends and variability of precipitation extremes in many parts of the world in the twentieth century, specifically the United States [22, 23], India [24], Southeast Asia and the South Pacific [25], Australia [23, 26], Europe [27], Caribbean [28], Italy [29], Balkans [30], Canada, Norway, Russia, China, Mexico [23], Japan [31], Sweden [32], southeastern South America [33], and the state of Sao Paulo, Brazil [34]. Recently the spatio-temporal variability of dependence among precipitation extremes was investigated over the entire South America for the period 1940-2004 using a new approach (suggested by Kuhn [35]) [36]. However, we are not aware of any prior investigations on spatial and temporal variability of precipitation extremes over the entire continent of South America. The generalized extreme value (GEV) distribution, developed by Jenkinson [37], has been traditionally utilized for modeling precipitation extremes [38]. This approach is also called the block maxima approach since it fits the distribution to the highest values in blocks of equal size, e.g., maximum yearly precipitation. It has some advantages, e.g., its requirements can be met by a simplified summary of data and the block maxima can be assumed to be independent random variables [41]. But the main drawback of the GEV distribution is that it does not utilize all the available information about the upper tail of the distribution, e.g., two highest extreme precipitation events may occur in the same year [41]. An alternative approach is to use peaks over threshold (POT), which originated in hydrology and makes use of all the data available, e.g., all daily precipitation data [42]. The statistical model underlying the POT method consists of (1) a Poisson process for the occurrences of extremes over a large threshold and (2) a generalized Pareto (GP) distribution (with scale ($\sigma$) and shape ($\xi$) parameters), developed by Pickands [43], for the distribution of excesses over a large threshold. This model is also termed


The presence of chaos in hydrology has been suggested by previous researchers [48]. The ability to detect and model chaotic behavior from finite hydrologic time series has recently been debated [65, 66]. Characterization of chaos from real-world observations is known to be a difficult problem in nonlinear dynamics [67]. The complexity was highlighted in the context of climate models by [70], who demonstrated that sensitivity to initial conditions may become less apparent when the randomness in internal atmospheric variables begins to dominate. Fundamental questions still remain unanswered in these areas, for example


In geophysics, the ability to predict may be increased (a) by including nonlinear relationships between geophysical processes, (b) by generating t-year return levels (levels expected to be exceeded on average once every t years) based on the analysis of low probability and high risk geophysical events (also called extremes), and (c) if there is a presence of chaos, which implies short-term predictability, in geophysical processes. Since the geophysical data are generally short and noisy, there is a need to develop and investigate nonlinear dependence, extremes, and chaos detection measures which are robust to finite and noisy data.

Mutual information has been frequently utilized for capturing the complete dependence structure including nonlinear dependence. Recently, several methods have been proposed for the MI estimation, such as KDE [2], KNN [5], Edgeworth approximation of differential entropy [6], and adaptive partitioning of the XY plane [8]. However, outstanding gaps in the current literature have precluded the ability to effectively automate these methods, which, in turn, has caused limited adoption by the application communities. This dissertation attempts to address a key gap in the literature, specifically, the evaluation of the above methods to choose the best method, particularly in terms of their robustness for short and noisy data, based on comparisons with the theoretical MI estimates, which can be computed analytically, as well as with linear correlation and Kendall's $\tau$. In addition, there is also a need to find an optimal smoothing parameter for a Gaussian kernel for KDE and an optimal number of nearest neighbors for KNN when the data are short and noisy.

An understanding of the intensity and frequency of precipitation extremes can be very useful for infrastructure development to prevent flooding and landslides, as well as for water resources and agricultural management. Previous studies [22] investigated trends and variability of precipitation extremes in many parts of the world in the twentieth century but no study was found focusing on precipitation extremes over the entire continent of South America. Since the nations of South America are developing, highly populated, and not capable enough to respond to disasters caused by precipitation extremes, there is a clear need to investigate spatial and temporal variability of precipitation extremes in South America. Extreme value distributions generate information about extremes in terms of their parameters which can be understood only by statisticians. There is a need to develop a measure which should be statistically valid, easily quantified and visualized over large geographical areas, and understood


Commonly used dependence measures, such as linear correlation, cross-correlogram or Kendall's $\tau$, cannot capture the complete dependence structure in data unless the structure is restricted to linear, periodic or monotonic. Mutual information (MI) has been frequently utilized for capturing the complete dependence structure including nonlinear dependence. Recently, several methods have been proposed for the MI estimation, such as kernel density estimators (KDE), k-nearest neighbors (KNN), Edgeworth approximation of differential entropy, and adaptive partitioning of the XY plane. However, outstanding gaps in the current literature have precluded the ability to effectively automate these methods, which, in turn, has caused limited adoption by the application communities. This dissertation attempts to address a key gap in the literature, specifically, the evaluation of the above methods to choose the best method, particularly in terms of their robustness for short and noisy data, based on comparisons with the theoretical MI estimates, which can be computed analytically, as well as with linear correlation and Kendall's $\tau$. Here we consider smaller data sizes, such as 50, 100, and 1000, where this dissertation considers 50 and 100 data points as very short and 1000 as short. We consider a broader class of functions, specifically linear, quadratic, periodic and chaotic, contaminated with artificial noise with varying noise-to-signal ratios. Our results indicate KDE as the best choice for very short data at relatively high noise-to-signal levels, whereas the performance of KNN is the best for very short data at relatively low noise levels as well as for short data consistently across noise levels. In addition, the optimal smoothing parameter of a Gaussian kernel appears to be the best choice for KDE while three nearest neighbors appear optimal for KNN. Thus, in situations where the approximate data sizes are known in advance, and exploratory data analysis and/or domain knowledge can be used to provide a priori insights on the noise-to-signal ratios, the results in the paper point to a way forward for automating the process of MI estimation.


In nonlinear systems, the understanding of underlying nonlinear processes and their interactions is very important for predictive modeling as well as for generating bounds on predictability. However, data analysis methods based on nonlinear dynamical approaches are typically not robust when applied to short and noisy data [1]. The definition of what constitutes short and noisy, in terms of data sizes and noise-to-signal ratios, may be application and context specific. A consideration of data availability scenarios in a couple of domains, specifically the earth sciences and biomedical engineering, in conjunction with the literature on mutual information (MI) estimation methods, suggests that a critical gap continues to exist in our understanding of situations where the length of data sets is short, particularly of the order of 100 or 1000 data points.

Physically-based definitions for what constitutes long versus short data sizes need to follow from a comparison of sampling coverage time-span vis-a-vis the characteristic period of the dynamical system under consideration. The characteristic period can be, for example, one full seasonal cycle for purely seasonal observations, or a complete span of the attractor for a chaotic system. If the sample size is large but the sampling coverage is restricted to a small portion of the cycle or the attractor, then observations are still not representative of the population. In this sense, the data size must still be considered short in a physical sense because the data do not have the coverage necessary to make the relevant inferences. While samples with greater coverage are more representative of the population, the tradeoff, especially for a limited number of samples, is that the sampling frequency needs to be adequate to capture the features of the dynamical system and make appropriate inferences from the observations. The sampling frequency in this sense is related to the Nyquist frequency of the system. In this sense, even if the sampling coverage is large but the frequency is inadequate, the data size must still be considered short from a physical perspective. Thus, the Nyquist frequency on the one hand, and the characteristic period of the dynamical system under consideration on the other, provide guidelines for the definitions of long versus short data sizes, and indeed provide a physical basis for such definitions. However, in real-world situations, the characteristic period of the dynamical system or the signal bandwidth may not necessarily be known a priori, and in some cases, may be difficult to estimate if the data are contaminated with non-repeatable patterns, measurement errors, or other forms of noise. Thus, for such systems, there is a need for caution before making a claim that a set of observations is short or, perhaps more important, long enough. This paper is concerned with simulated data, where we have knowledge of the system, and generates noise sequences from independent and identically distributed processes. Here we implicitly define the characteristic time (basic period) of the system as equal


We use the term noise in a generic sense to include variability in measurement errors as well as any inherent, but non-repeatable, randomness that may be present in complex systems. Indeed, noise levels encountered in real-world data may vary considerably depending on the domain, data collection methods, measurement accuracy, inherent randomness in the observables, as well as other factors. Here we consider noise-to-signal ratios that range all the way from zero, which implies no noise, to unity, which implies that the noise is as dominant as the underlying signal itself. For this dissertation, we call a noise-to-signal ratio of zero to about a half low noise and higher ratios high noise.

Linear correlation may not be an adequate measure of dependence even for simple nonlinear functional forms. This can be simply shown in the case of two variables $(X, Y)$, where $Y = X^2$, and $X$ is uniformly distributed in the interval $(-1, 1)$. The theoretical covariance and hence the linear correlation reduces to zero even though the variable $Y$ is completely specified once $X$ is known. The situation gets even more problematic when the nonlinear interactions get more complex. However, the application of nonlinear dynamical and/or information theoretic measures of dependence can be a challenge, especially when only short and noisy data are available. For example, the identification of the underlying nonlinear dynamical component via the correlation dimension is known to be a difficult problem for geophysical [21] or electroencephalographic (EEG) [78, 79] signals. Similarly, the detection of the underlying interactions among variables characterizing a complex system becomes a difficult task [80]. The inherent difficulty of numerical estimation as well as perceived problems with model parsimony or overfitting have resulted in relatively limited use of nonlinear approaches, even when the underlying processes are known to be nonlinear. The problem exists in certain biomedical applications [81], but grows more acute in domains like geophysics [21, 84] where the data collection and generation processes are often not repeatable. Our definition of what constitutes short and noisy data is motivated by problems in these domains. The references cited earlier show that very short and short data sets, as well as low noise and high noise conditions, do exist for real-world problems. Thus, there is a clear need to investigate methods, which are robust to short and noisy data, for the determination of nonlinear multivariate interactions. However, the methodologies need to be rigorously tested such that well-known problems in nonlinear statistics like overfitting do not yield misleading correlations.
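The $Y = X^2$ example given earlier in this section can be checked numerically. The covariance is $E[X^3] - E[X]E[X^2] = 0$ for $X$ uniform on $(-1, 1)$, so the sample linear correlation should be close to zero even though $Y$ is a deterministic function of $X$. The following is a minimal sketch; the sample size and random seed are arbitrary choices made for illustration and are not taken from the dissertation.

```python
import numpy as np

# Illustrative settings (assumptions, not values from the text)
rng = np.random.default_rng(0)
n = 100_000

# X uniform on (-1, 1); Y is fully determined by X
x = rng.uniform(-1.0, 1.0, size=n)
y = x ** 2

# Sample linear correlation coefficient between X and Y
r = np.corrcoef(x, y)[0, 1]
print(f"sample linear CC between X and X^2: {r:.4f}")
```

A dependence measure restricted to linear structure therefore reports essentially no relationship here, which is the motivation for the MI-based measures discussed in the text.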


As discussed earlier, several methods for the estimation of MI have been suggested in recent years, such as KDE [2], adaptive partitioning of the observation space [3], Parzen window density estimator [4], KNN [5], Edgeworth approximation of differential entropy (Edgeworth) [6], mutual information carried by the rank sequences [7], and adaptive partitioning of the XY plane (referred to here as Cellucci) [8]. The goal of this dissertation is to investigate and compare recently developed MI estimation methods, specifically KDE, KNN, Edgeworth, and Cellucci, based on simulated data generated from linear, quadratic, periodic, and chaotic data contaminated artificially with various levels of Gaussian noise. We generate 50, 100, and 1000 points for our analysis. As mentioned earlier, the motivation for the data sizes comes from a specific geophysical application (the relationship of the interannual climate index known as ENSO with the variability of tropical river flows [84]) and a specific biomedical application (dependence among EEG signals [81, 82]). The simulated data allow us to compare the relative performance of the MI estimation methods across an order of magnitude in terms of data sizes and noise-to-signal ratios ranging from 0 to 1 in increments of 0.1. Uncertainties on the MI estimates are obtained through bootstrapping and provided as 90% confidence bounds. The total number of bootstraps used for 50, 100, and 1000 points are 200, 100, and 10, respectively, reflecting a pragmatic trade-off between the need for accuracy and computational tractability. However, such trade-offs may not be required in more efficient or higher performance computational implementations. The performance of the MI-estimation methods is compared against each other and against baselines comprising the linear correlation coefficient (CC) obtained from linear regression (LR) and rank-based CCs from Kendall's $\tau$. We have also used theoretical MI values from linear, quadratic, and periodic functions, which can be computed analytically, for comparing the performance of different MI-estimation methods. The purpose of the above comparisons is to identify the one MI-estimation method or combination of MI-estimation methods in terms of robustness to short and noisy data, at least for the illustrations considered here, whose estimation values are closest to the theoretical MI values and significantly different from linear estimates in that their confidence bounds do not intersect.

The rest of the paper is organized as follows. In Sec. 3.2, the MI and its estimation methods are described. The MI is defined in Sec. 3.2A while we outline the four MI estimation methods, namely KDE, KNN, Edgeworth, and Cellucci, in Sec. 3.2B. In Sec. 3.3, the description of simulated data sets to be analyzed is


Several dependence measures, such as linear correlation, cross-correlogram, Kendall's $\tau$, and MI, have been utilized to capture the dependence structure between a pair of variables $(X, Y)$. However, while the first three measures can only capture linear, periodic or monotonic dependence, MI can describe the full dependence structure including nonlinear dependence if any [86]. In addition, MI reduces to the linear dependence when the data are indeed linearly related. In an information theoretic sense, MI quantifies the information stored in one variable about another variable. MI has several satisfying theoretical properties and analogous relations with the linear correlation. While the linear CC can be used to calculate the prediction mean squared errors (MSE) from linear regression, MI can be used to compute a bound on the achievable prediction MSE based on the information content in the independent variables about the dependent variables. MI has been shown to have traditional analysis of variance (ANOVA)-like interpretations [87]. For time serial data, MI can be computed as a function of temporal lags to obtain nonlinear versions of the auto- or cross-correlation (ACF or CCF) functions. The information theoretic properties of MI, which make it a reliable measure of statistical dependence, have been described by Cover and Thomas [88]. The applicability of MI for feature, parameter and model selection problems has been described by Brillinger [87]. Besides the direct use of MI in the computation of nonlinear dependence [87, 89], MI has indicated value in areas ranging from optimal time delay embeddings during phase-space reconstructions [9] to extracting causal relationships among variables ([90, 91]) [8].

3.2.1 Definitions of mutual information

For the bivariate random variables $(X, Y)$, the MI is defined as
$$I(X;Y) = \int_Y \int_X p_{XY}(x,y) \log \frac{p_{XY}(x,y)}{p_X(x)\,p_Y(y)}\,dx\,dy, \tag{1}$$
where $p_{XY}(x,y)$ is the joint probability density function (pdf) between $X$ and $Y$; and $p_X(x)$ and $p_Y(y)$ are the marginal pdfs [8]. The unit of MI is defined corresponding to the base of the logarithm in Eq. (1), i.e., nats for $\log$, bits for $\log_2$, and Hartleys for $\log_{10}$. MI is non-negative and symmetrical, i.e., $I(X;Y) = I(Y;X)$. It is also invariant under one-to-one transformations, i.e., $I(X;Y) = I(U;V)$, where $u = f(x)$, $v = f(y)$,


MI between random variables $X$ and $Y$ can also be defined in terms of information entropies as
$$I(X;Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y), \tag{2}$$
where $H(X)$ and $H(Y)$ are called the marginal information entropies, which measure the information content in $X$ and $Y$, respectively; $H(Y|X)$ is the entropy of $Y$ conditional on $X$, which measures the information content remaining in $Y$ if the information content in $X$ is known completely; and $H(X,Y)$ is the joint information entropy, which measures the information content in a pair of random variables $X$ and $Y$. The bivariate case is considered here for simplicity.

The linear CC ($\rho$) between two variables $X$ and $Y$ is a measure of the strength of the linear dependence between the variables, and its magnitude varies from 0 to 1. The estimation of the most likely value and the corresponding uncertainties is relatively straightforward. However, the estimation of the mean and uncertainty bounds for an MI-based dependence measure that is normalized to scale between 0 and 1 is an area of ongoing research.

If $(X, Y)$ is bivariate normal, the MI and linear CC are related as $I(X;Y) = -0.5\log[1 - \rho(X,Y)^2]$ [92]. Joe [93] proposed a linear CC like measure for MI, which scales from 0 to 1, given as
$$\hat{\lambda}(X,Y) = \sqrt{1 - \exp[-2\hat{I}(X;Y)]}, \tag{3}$$
where $\hat{\lambda}(X,Y)$ and $\hat{I}(X;Y)$ are the estimated nonlinear CC and MI, respectively. Later Granger and Lin [94] used the same measure to estimate nonlinear CC from the MI. While this dissertation utilizes nonlinear CC based solely on MI, other bases for nonlinear CC suggested in the literature include mutual nonlinear prediction [95] and nonlinear association analysis [96]. A detailed comparison of the various definitions of nonlinear CC and their relative performances is left as an area for future research. In order to estimate the predictability of $Y$ given $X$, once the MI is known, Brillinger [87] proposed an equation which provides a lower bound on the prediction MSE. This equation, which is analogous to the MSE for linear regression obtained from the linear correlation coefficient, is given as
$$\widehat{\mathrm{MSE}}(Y) \ge \frac{1}{2\pi e}\exp\left[2\{\hat{H}(Y) - \hat{I}(X;Y)\}\right];$$
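The relations in this passage can be made concrete with a short numerical sketch. It maps a linear CC to the bivariate-normal MI, maps that MI back to a 0-to-1 nonlinear CC using Joe's measure, and evaluates Brillinger's lower bound on the prediction MSE. The function names are hypothetical and introduced only for illustration, and the unit-variance Gaussian $Y$ is an assumption chosen so that the entropy is known in closed form; natural logarithms (nats) are used throughout.

```python
import numpy as np

def gaussian_mi(rho):
    """MI in nats of a bivariate normal pair with linear CC rho."""
    return -0.5 * np.log(1.0 - rho ** 2)

def nonlinear_cc(mi):
    """Joe's measure: map an MI value in nats to a 0-to-1 dependence measure."""
    return np.sqrt(1.0 - np.exp(-2.0 * mi))

def mse_lower_bound(h_y, mi):
    """Brillinger's lower bound on the prediction MSE of Y given X (nats)."""
    return np.exp(2.0 * (h_y - mi)) / (2.0 * np.pi * np.e)

rho = 0.8                                   # illustrative linear CC
mi = gaussian_mi(rho)                       # MI implied by rho
lam = nonlinear_cc(mi)                      # recovers rho for Gaussian data
h_y = 0.5 * np.log(2.0 * np.pi * np.e)      # entropy of a unit-variance Gaussian Y
bound = mse_lower_bound(h_y, mi)            # reduces to 1 - rho**2 in this case

print(f"MI = {mi:.4f} nats, nonlinear CC = {lam:.4f}, MSE bound = {bound:.4f}")
```

In this Gaussian special case the bound coincides with the linear-regression MSE, $1 - \rho^2$, which is the sense in which the two expressions are analogous.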


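$$\hat{I}(X;Y) = \frac{1}{n}\sum_{i=1}^{n}\log\frac{\hat{p}_{XY}(x_i,y_i)}{\hat{p}_X(x_i)\,\hat{p}_Y(y_i)}, \tag{4}$$
where $\hat{p}_{XY}(x_i,y_i)$ is the estimated joint pdf and $\hat{p}_X(x_i)$ and $\hat{p}_Y(y_i)$ are the estimated marginal pdfs at $(x_i, y_i)$.

For the multivariate data set $(x_1, \ldots, x_n)$, where each $x$ is in a $d$-dimensional space, the multivariate kernel density estimator with kernel $K$ is defined by
$$\hat{p}(x) = \frac{1}{nh^d}\sum_{i=1}^{n}K\!\left(\frac{x - x_i}{h}\right), \tag{5}$$
where $h$ is the smoothing parameter [97]. We choose the standard multivariate normal kernel defined by
$$K(x) = (2\pi)^{-d/2}\exp\!\left(-\tfrac{1}{2}x^{T}x\right). \tag{6}$$
Using Eqs. (5) and (6), the probability density function is defined as
$$\hat{p}(x) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{(2\pi)^{d/2}h^{d}|S|^{1/2}}\exp\!\left(-\frac{(x - x_i)^{T}S^{-1}(x - x_i)}{2h^{2}}\right), \tag{7}$$
where $S$ is the covariance matrix and $|S|$ is the determinant of $S$. For a normal kernel, Silverman [97] suggested an optimal smoothing parameter or Gaussian bandwidth given as
$$h_o = \left(\frac{4}{d+2}\right)^{1/(d+4)} n^{-1/(d+4)}. \tag{8}$$
Moon et al. [2] presented the same procedure and utilized Eq. (7) for estimating marginal probability densities, i.e., $\hat{p}_X$ and $\hat{p}_Y$, and the joint probability density, i.e., $\hat{p}_{XY}$, and substituted these densities in Eq. (4) to estimate MI.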
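The KDE procedure described above can be sketched as follows: a Gaussian kernel density estimate scaled by the sample covariance is evaluated at every sample point, and the log density ratio is averaged over the sample. This is a minimal sketch rather than the implementation used in the dissertation. The function names are hypothetical, and using the Gaussian bandwidth with $d = 1$ for each marginal and $d = 2$ for the joint density is an assumption made for illustration.

```python
import numpy as np

def gaussian_bandwidth(n, d):
    """Gaussian reference bandwidth for n points in d dimensions."""
    return (4.0 / (d + 2.0)) ** (1.0 / (d + 4.0)) * n ** (-1.0 / (d + 4.0))

def kde_at_samples(data):
    """Gaussian KDE with covariance scaling, evaluated at each sample point."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    n, d = data.shape
    h = gaussian_bandwidth(n, d)
    cov = np.atleast_2d(np.cov(data, rowvar=False))   # sample covariance S
    cov_inv = np.linalg.inv(cov)
    diff = data[:, None, :] - data[None, :, :]         # pairwise differences, (n, n, d)
    maha = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    norm = (2.0 * np.pi) ** (d / 2.0) * h ** d * np.sqrt(np.linalg.det(cov))
    return np.exp(-maha / (2.0 * h ** 2)).sum(axis=1) / (n * norm)

def kde_mutual_information(x, y):
    """Average log ratio of joint to product-of-marginal density estimates (nats)."""
    p_xy = kde_at_samples(np.column_stack([x, y]))
    p_x = kde_at_samples(x)
    p_y = kde_at_samples(y)
    return float(np.mean(np.log(p_xy / (p_x * p_y))))

# Illustrative data (assumed for the example): correlated vs. independent Gaussians
rng = np.random.default_rng(1)
n = 500
x = rng.standard_normal(n)
y_dep = 0.9 * x + np.sqrt(1.0 - 0.9 ** 2) * rng.standard_normal(n)
y_ind = rng.standard_normal(n)

mi_dep = kde_mutual_information(x, y_dep)
mi_ind = kde_mutual_information(x, y_ind)
print(f"dependent pair: {mi_dep:.3f} nats, independent pair: {mi_ind:.3f} nats")
```

The pairwise-difference array has size $n \times n \times d$, which is acceptable for the data sizes considered here (up to 1000 points) but would need chunking for much longer series.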


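where $\hat{p}(x_i)$ is the estimated marginal pdf at $x_i$. Kraskov et al. [5] expanded Eq. (9) as
$$\hat{H}(X) = \ldots \tag{10}$$
where $n$ and $k$ are the number of data points and nearest neighbors, respectively; $d_X$ is the dimension of $x$; and $c_{d_X}$ is the volume of the $d_X$-dimensional unit ball. For two random variables $X$ and $Y$, let $\epsilon(i)/2$ be the distance between $(x_i, y_i)$ and its $k$th neighbor denoted by $(kx_i, ky_i)$. Let $\epsilon_x(i)/2$ and $\epsilon_y(i)/2$ be defined as $\|x_i - kx_i\|$ and $\|y_i - ky_i\|$, respectively. $n_x(i)$ is the number of points $x_j$ such that $\|x_i - x_j\| \le \epsilon_x(i)/2$. $\psi(x)$ is the digamma function, $\psi(x) = \Gamma(x)^{-1}\,d\Gamma(x)/dx$, which satisfies the relation $\psi(x+1) = \psi(x) + 1/x$, with $\psi(1) = -C$, where $C = 0.5772156649$ is the Euler-Mascheroni constant. Similarly, $\hat{H}(Y)$ can be derived by replacing $x$ with $y$ in Eq. (10). In a similar way, the estimated joint entropy between $X$ and $Y$ can be given as $\hat{H}(X,Y) = \ldots$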

PAGE 38

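$$p(x) \approx \phi_p(x)\left(1 + \frac{1}{3!}\sum_{i,j,k}\kappa^{i,j,k}h_{ijk}(x)\right), \qquad (11)$$

where $\phi_p(x)$ is the normal distribution with the same mean and covariance matrix as p; (i,j,k) is the input dimension, where $(i,j,k) \in (1,\ldots,d)$; $\kappa^{i,j,k}$ is the standardized cumulant, i.e., $\kappa^{i,j,k} = \kappa_{ijk} \ldots$

Let p(x) be defined in a set X. The differential entropy of X is defined as

$$H(X) = -\int_X p(x)\log p(x)\,dx,$$

where $H(\phi_p) = 0.5\log|S| + \frac{d}{2}\log 2\pi + \frac{d}{2}$ and $Z(x) = \frac{1}{3!}\sum_{i,j,k}\kappa^{i,j,k}h_{ijk}(x)$. Substituting p(x) in Eq. (12) leads to

$$H(p) \approx H(\phi_p) - \int_X \phi_p(x)\left[Z(x) + 0.5Z(x)^2\right]dx \approx H(\phi_p) - \frac{1}{12}\sum_{i=1}^{d}(\kappa^{i,i,i})^2 - \frac{1}{4}\sum_{i,j=1;\, i \ne j}^{d}(\kappa^{i,i,j})^2 - \frac{1}{72}\sum_{i,j,k=1;\, i < j < k}^{d}(\kappa^{i,j,k})^2.$$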
PAGE 39

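Let the x and y axes be partitioned into an equal number of elements, denoted by $N_E$, which leads to $\hat{p}_X(i) = \hat{p}_Y(j) = 1/N_E$ and an expected joint occupancy of $n/N_E^2$ per element. $N_E$ is computed from a more conservative criterion, i.e., $E_{XY}(i,j) = n/N_E^2 \ge 5$ for all elements, rather than the Cochran criterion. After computing $N_E$, $N_E$ partitions in the x axis and $N_E$ partitions in the y axis are used for the estimation of the joint probability density at the (i,j)th element of the XY partition, i.e., $\hat{p}_{XY}(i,j)$. The MI is estimated by substituting $\hat{p}_X$, $\hat{p}_Y$, and $\hat{p}_{XY}$ in the equation given as

$$\hat{I}(X;Y) = \sum_{i=1}^{N_E}\sum_{j=1}^{N_E}\hat{p}_{XY}(i,j)\log\frac{\hat{p}_{XY}(i,j)}{\hat{p}_X(i)\,\hat{p}_Y(j)}.$$

3.3 Details of the data

We analyze simple examples of linear, quadratic, and periodic functions, as well as a chaotic system, specifically the Henon map, contaminated with different levels of artificial Gaussian noise.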
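To make the chaotic test case concrete, the sketch below generates a Henon-map series and contaminates it with Gaussian noise at a chosen noise-to-signal ratio. Several details are assumptions made only for illustration, since the text does not give them here: the classical parameters a = 1.4 and b = 0.3, the initial state (0, 0), the discarded transient, the pairing of consecutive states as (X, Y), and the definition of the noise standard deviation as the ratio times the standard deviation of the clean signal.

```python
import math
import random

def henon_series(n, a=1.4, b=0.3, burn_in=100):
    """Iterate x' = 1 - a x^2 + y, y' = b x; return n values of x after a transient."""
    x, y = 0.0, 0.0
    values = []
    for step in range(n + burn_in):
        x, y = 1.0 - a * x * x + y, b * x
        if step >= burn_in:
            values.append(x)
    return values

def sample_std(values):
    """Sample standard deviation (n - 1 denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def add_noise(signal, noise_to_signal, rng):
    """Add zero-mean Gaussian noise with std = ratio * std(signal)."""
    sigma_n = noise_to_signal * sample_std(signal)
    return [s + rng.gauss(0.0, sigma_n) for s in signal]

rng = random.Random(1)
clean = henon_series(101)
# Hypothetical pairing: X is the current state, Y the next state.
x_obs = add_noise(clean[:-1], 0.5, rng)
y_obs = add_noise(clean[1:], 0.5, rng)
```

The same `add_noise` helper could be applied to the linear, quadratic, and periodic signals to sweep the noise-to-signal ratio.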

PAGE 41

In this dissertation, we consider four different types of simulations, i.e., linear, quadratic, periodic, and chaotic systems. For the linear, quadratic, and periodic cases, the exact MIs as defined by Eq. (2) can be computed as shown below.

3.3.1.1 Linear

In order to compute H(Y), the pdf of Y, i.e., $p_Y(y)$, is needed. Since Y = X + Z, and X and Z are independent, $p_Y(y)$ can be obtained through the convolution of the pdfs of X and Z, given as

$$p_Y(y) = \int_{-\infty}^{\infty} p_X(x)\,p_Z(y - x)\,dx. \qquad (14)$$

Solving Eq. (14), we get $p_Y(y) = (2\pi)^{-1/2}(\sigma_X^2 + \sigma_\varepsilon^2)^{-1/2}\exp[-x^2 \ldots$

PAGE 43

We first estimate MI from KDE, KNN, Edgeworth, and Cellucci, and then substitute in Eq. (3) to get the nonlinear CC estimates. Linear CCs are obtained from LR, whereas rank-based CCs are estimated from Kendall's $\tau$. The mean of the CCs and its 90% confidence bounds are evaluated using bootstrapping. The total number of bootstrap samples used for 50, 100, and 1000 data points is 200, 100, and 10, respectively. The correlation coefficient presented here is the mean of the bootstrap samples. The lower and upper limits of the 90% confidence bounds are given as the 5% and 95% quantiles of the bootstrap samples, respectively.
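The bootstrap summary described above (mean of the resampled CCs, with the 5% and 95% quantiles as the 90% confidence bounds) can be sketched as follows. The sketch assumes that pairs (x_i, y_i) are resampled jointly with replacement and that quantiles are obtained by linear interpolation; neither detail is stated in the text. The linear (Pearson) CC is used as the statistic purely for illustration, and any MI-based CC estimator with the same signature could be passed instead.

```python
import math
import random

def pearson(x, y):
    """Linear correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def quantile(sorted_values, q):
    """Quantile of an already sorted list, by linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac

def bootstrap_cc(x, y, statistic, n_boot, rng):
    """Return (mean, 5% quantile, 95% quantile) of the bootstrapped statistic."""
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample pairs jointly
        estimates.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    mean = sum(estimates) / n_boot
    return mean, quantile(estimates, 0.05), quantile(estimates, 0.95)

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = [xi + 0.5 * rng.gauss(0.0, 1.0) for xi in x]
cc_mean, cc_low, cc_high = bootstrap_cc(x, y, pearson, 100, rng)
```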

PAGE 48

In order to compare the performance of different methods, we compare nonlinear CCs from KDE, KNN, Edgeworth, and Cellucci with linear CCs obtained from LR. If the confidence bounds of the nonlinear CCs overlap with the bounds of the linear CCs, the nonlinear correlations are not different from the linear correlations at the 90% confidence level. Nonlinear CCs obtained from the MI estimation methods are compared with theoretical CCs derived from the theoretical MI values, which can be computed analytically for three out of the four test cases considered here, namely linear, quadratic, and periodic. The performance of the MI estimation methods is also compared with a rank-based correlation measure, specifically Kendall's $\tau$. Plots of normal and kernel density estimates for the linear, quadratic, periodic, and chaotic cases are shown in Figs. 13-16.

Figure 6. Linear: Comparisons between linear CCs from LR and nonlinear CCs from KDE, KNN, Edgeworth, Cellucci, and Kendall's $\tau$, at different noise-to-signal ratios ($\sigma_\varepsilon/\sigma_s$) for (A) 50 points, (B) 100 points, and (C) 1000 points.

PAGE 49

Column groups: Very short data (low noise, high noise); Short data (low noise, high noise)
KDE +YY KNN 0YY +YY 0YY 0YY -NN -YY -YY +YY Kendall's $\tau$ -NY -NN -NN

PAGE 50

Table 2. Quadratic: Description of results where each entry consists of three columns given as (1) Column 1: 0, -, or +, where `0', `-' and `+' mean nonlinear CCs are zero, negatively and positively biased with respect to theoretical CCs, respectively, (2) Column 2: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with theoretical CCs, respectively, and (3) Column 3: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with linear CCs, respectively. Bold and slanted entries indicate the best and the second best methods for each case specified in the top headings of the table, respectively.

Column groups: Very short data (low noise, high noise); Short data (low noise, high noise)
KDE -NN -YN KNN 0YN -YN -YN Cellucci -NN -YY -YN -NY -NY -NY

PAGE 51

Table 3. Periodic: Description of results where each entry consists of three columns given as (1) Column 1: 0, -, or +, where `0', `-' and `+' mean nonlinear CCs are zero, negatively and positively biased with respect to theoretical CCs, respectively, (2) Column 2: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with theoretical CCs, respectively, and (3) Column 3: Y or N, where `Y' and `N' mean 90% confidence bounds of nonlinear CCs overlap and do not overlap with linear CCs, respectively. Bold and slanted entries indicate the best and the second best methods for each case specified in the top headings of the table, respectively.

Column groups: Very short data (low noise, high noise); Short data (low noise, high noise)
KDE 0YN -NY -NY -NY -NY Cellucci -NY -YY Kendall's $\tau$ -NY -NN -NN

PAGE 55

For very short data, KNN estimates higher CCs than all other methods when $\sigma_\varepsilon/\sigma_s$ is less than around 0.5, after which KDE yields higher values compared to all other methods (Figs. 23A and 23B). The performance of Kendall's $\tau$ is the worst since it captures less dependence than the linear correlation for the majority of noise levels. At low noise, both Edgeworth and Cellucci are ruled out because they are lower than KNN and KDE and have wide confidence bounds. Thus, KNN seems to be a better choice at low noise since KDE is negatively biased. As the noise level increases, the confidence bounds from all methods widen. At high noise, the confidence bounds from KNN, Edgeworth, and Cellucci overlap with linear CCs. KDE seems to have an edge over the other methods since it has narrow confidence bounds and does not overlap with linear CCs. Thus, KNN and KDE may be utilized for very short data at low and high noise, respectively.

For short data, Cellucci differs completely from the other estimators at high noise (Fig. 23C). KNN is a better choice at low noise because it appears to be the most consistent. At high noise, KNN and Edgeworth are ruled out because they overlap with linear CCs due to their wide confidence bounds. KDE overlaps with KNN, but it stands out due to its ability to capture more correlation than purely linear correlation. Thus, for short series, KNN and KDE may be utilized at low and high noise, respectively.

PAGE 56

In the case of KDE, the amount of smoothing defined by the smoothing parameter, h in Eq. (5), is very important for the density estimation, which, in turn, influences the MI estimates. The selection of an appropriate smoothing parameter needs to be guided by the end-use of the density estimates. Here we use the optimal smoothing parameter for a Gaussian kernel ($h_o$) with KDE, given in Eq. (8). We investigate the effects of h on nonlinear CC estimates from KDE by selecting different values of h around $h_o$. For KNN, the number of nearest neighbors (k) governs the overall amount of smoothing in the densities, which are subsequently used in the entropy estimation given in Eq. (10). Small values of k lead to small bias and large variance, whereas large k results in large bias and small variance. Thus, the bias-variance tradeoff, which is a common issue encountered in statistical estimation procedures, is also important here. Kraskov et al. [5] warned against using large k since

PAGE 57

For very short data, the bias and variance from KDE increase with the increase of h at low noise and at all noise levels, respectively (Fig. 10A). At low noise, KDE does not overlap with theoretical CCs. However, KDE with $h = 0.75h_o$ and $h = h_o$ performs better at high noise since their 90% confidence bounds overlap with theoretical CCs. The bias and variance from KNN increase as the number of nearest neighbors increases across all noise levels (Fig. 10B). At low noise, the performance of KNN with k = 3 is the best of all the cases considered here since it has small bias and its confidence bounds overlap with theoretical CCs. At high noise, KNN has large bias and variance for all k. If KNN needs to be used at high noise, k = 3 appears to be a better choice since it is closer to theoretical CCs as compared to the others, and the variances from all k are comparable. Thus, for very short data, KDE with $h = 0.75h_o$ or $h = h_o$ may be utilized at high noise, whereas KNN with k = 3 seems to be a better choice at low noise.

For short data, KNN with k = 3 performs better at low noise since it has small bias and variance (Fig. 10C). As k increases, the bias increases and the variance decreases at high noise. KNN with all k considered here performs better at high noise, but the selection of an appropriate k needs to be guided by the acceptable levels of bias and variance. Thus, for short data, KNN with k = 3 is the best since it overlaps exactly with theoretical CCs and its variance does not differ significantly from the others.

3.5 Conclusion and discussion
Our results indicate that two MI-estimation methods, specifically KDE and KNN, outperform the other methods and estimation procedures in terms of their ability to capture the dependence structure, including nonlinear dependence where present. We find that KNN is the best estimator for very short data with relatively low noise, while KDE works better for very short data when the noise levels are higher. A visual examination of the density plots may help in explaining the relative performance of KDE and KNN (Figs. 13-16 in Appendix B). For short data, KNN is the best choice for capturing the nonlinear dependence across all noise levels except when the data are generated from chaotic dynamics, where KDE is a better choice at higher noise levels. We surmise that the relative performance of KDE and KNN with respect to various noise levels is a consequence of the bias-variance tradeoff. Previous literature suggests that KDE estimates can often be highly biased if the particular KDE recipe used here is followed [85], while KNN estimates can have significant variance when the number of nearest neighbors (k) is set to low values, e.g., k = 3 as used in this

PAGE 58

In general, the above discussions and pointers appear to suggest that the results for nonlinear dependence obtained from KDE and KNN may in fact reflect the lower bounds of what may be potentially achievable through improvements or intelligent combinations of KNN and KDE. Specifically, both the KDE and KNN estimates can be potentially improved by utilizing a plug-in method for kernel, smoothing parameter (h), or k selection. Such plug-in procedures would cause additional estimation variance but may reduce the overall MSE of estimation. However, the development or utilization of procedures for the selection of optimal kernels, smoothing parameters, or nearest neighbors may be rather involved and hence is left as an area of future research.

We have presented preliminary justifications for the relative performance of the MI estimation methods based on considerations like the bias-variance tradeoff and the nature of the approximations underlying the estimation procedures. Our evaluation suggests that the development of guidance for the use of the most suitable estimation procedure may be possible and would depend on known data or domain characteristics and exploratory data analysis. If such guidance can indeed be provided, this could conceivably lead to the development of automated or semi-automated procedures for the choice of the most appropriate estimation procedure and the corresponding parameters. However, significant future research on multiple test cases comprising simulated and real data may be necessary before such procedures can be deployed in real-world settings.

Acknowledgements

PAGE 60

Cross-spectrum analysis based on linear correlations in the time domain suggested a coupling between large river flows and the El Nino-Southern Oscillation (ENSO) cycle. A nonlinear measure based on mutual information (MI) reveals extrabasinal connections between ENSO and river flows in the tropics and subtropics that are 20-70% higher than those suggested so far by linear correlations. The enhanced dependence observed for the Nile, Amazon, Congo, Parana, and Ganges rivers, which affect large, densely populated regions of the world, has significant impacts on inter-annual river flow predictabilities and, hence, on water resources and agricultural planning.

4.1 Introduction

ENSO events impact regional precipitation in the tropics and subtropics, ultimately causing inter-annual variability in river flows. The ocean-atmosphere-land interactions are complex and far from being completely understood and accurately modeled. A slight disturbance in these interactions would usually result in sometimes surprising distant correlations and climate patterns. Analyses of the rainfall anomalies during the warm (El Nino) and cold (La Nina) episodes of ENSO suggest the existence of nonlinear sea surface temperature (SST)-rainfall relationships in the tropics and a strong influence of SST forcing on equatorial rainfall in the geographic vicinity of that forcing [11]. To properly explain and ultimately predict this variability, it is important to disentangle, as far as possible, long-range climatic phenomena from recent effects such as those possibly produced by deforestation and global warming.

While the relationships among many climate and hydrological variables are decidedly nonlinear [12], linear dependence measures are still being used as a matter of course to relate ENSO and inter-annual variability in river flows. These measures have ranged from linear correlation coefficients (CC) in the time domain [13] to cross-spectrum analysis [17,18]. One of the reasons for using linear measures is that the inherent noise and periodicity in the observations, together with the short length of the available samples, make it difficult to use nonlinear approaches in climate and hydrology [19].

PAGE 61

ENSO events are associated with SST anomalies over the eastern and central equatorial Pacific Ocean. In this dissertation, the ENSO index is defined in terms of the monthly SST variations from the long-term mean, averaged over the regions 2°-6°N, 90°-170°W; 2°N-6°S, 90°-180°W; and 6°-10°S, 110°-150°W of the Pacific Ocean. This data set was published as a homogenized monthly series of the mean SST anomaly for the period 1872-1989 [98]. After 1989, NINO3.4 is used as the ENSO index because its geographical region (5°N-5°S, 120°-170°W) is close to the regions corresponding to the Wright SST.

PAGE 63

The discharge data of the Amazon River was collected monthly from the Rio Negro stage at Manaus (lat. 3°S, long. 60°W) over the period from 1903 to 1985. The integrated runoff at the Manaus gauge covers more than 3 M km² of the Andean and western Amazon watershed [17]. The seasonal streamflow cycle of the Amazon indicates that the minimum and maximum discharges are observed in November and June, respectively (Figure 11b).

The Congo River discharge data was collected monthly from the river stage at Kinshasa, Zaire (lat. 4.3°S, long. 15.3°E) from 1905 to 1985. As the Congo River basin, covering approximately 3.8 M km², is located around the equator, it experiences a marked semi-annual rainfall cycle which is associated with the north/south movement of the intertropical convergence zone (ITCZ) across tropical Africa [99]. This is evident from the seasonal cycle of the Congo River, which indicates two peaks in May and December (maximum) and the lowest flow in August (Figure 11b).

The Parana River discharge data for the period 1904-1997 was collected monthly at Corrientes (lat. 27°S, long. 59°W), located downstream of the confluence of the Paraguay and the Parana rivers. The seasonal cycle of the Parana exhibits a single peak in February with a long recession and low discharge in September (Figure 11b).

The monthly Ganges River discharge data was recorded over the period from 1934 to 1993 at the Hardinge Bridge in Bangladesh by the Bangladesh Water Development Board. It experiences the flood season from July to October, during which 82% of the average annual flow occurs [15]. The peak flow and low discharge of the Ganges are observed in August and April, respectively, as exhibited by the seasonal cycle (Figure 11b).

Table 4. Runoff data statistics (1000 m³/s).

Parameter     Nile          Amazon        Congo           Parana          Ganges
Mean          2.79          113.50        481.84          204.49          134.22
Std. dev.     0.46          85.24         50.90           49.64           27.03
Max. (year)   4.06 (1879)   1301 (1922)   659.57 (1962)   462.73 (1983)   202.84 (1956)
Min. (year)   1.46 (1913)   832 (1926)    386.81 (1984)   112.95 (1944)   75.46 (1992)

PAGE 64

MI is a measure of statistical dependence among random variables which captures the full dependence structure, both linear and nonlinear. The concept of MI was originally developed in communication theory and has been applied to multiple domains over the last few decades [5,9]. Considering two random variables X and Y, the MI, denoted by I(X;Y), is defined as

$$I(X;Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y), \qquad (17)$$

where H(X) or H(Y) is the marginal information entropy, which measures the information content in a signal, and H(X,Y) is the joint information entropy, which measures the information content in a joint system of X and Y. The MI between two random variables X and Y can also be defined as

$$I(X;Y) = \int_Y\int_X p_{XY}(x,y)\log\frac{p_{XY}(x,y)}{p_X(x)\,p_Y(y)}\,dx\,dy, \qquad (18)$$

where $p_{XY}(x,y)$ is the joint probability density function (pdf) between X and Y, and $p_X(x)$ and $p_Y(y)$ are the marginal pdfs. The MI values range from 0 (independent) to $\infty$ (completely dependent). For a bivariate normal set (X,Y), the MI and the linear CC, denoted by $\rho$, are related as $I(X;Y) = -0.5\log[1 - \rho(X,Y)^2]$ [93]. For comparing linear and nonlinear dependence measures, the MI-based nonlinear CC, i.e., $\lambda$, ranging from 0 to 1, is defined from the above relationship as

$$\hat{\lambda}(X,Y) = \sqrt{1 - \exp[-2\hat{I}(X;Y)]}, \qquad (19)$$

where $\hat{\lambda}(X,Y)$ and $\hat{I}(X;Y)$ are the estimated nonlinear CC and MI between X and Y, respectively [93,94]. In addition, just as the mean squared errors (MSE) can be derived from LR, a lower bound of the MI-based MSE, which is a measure of the predictability of Y based on the information content in X, can be estimated as

$$\widehat{\mathrm{MSE}}(Y) \ge \frac{1}{2\pi e}\exp[2(\hat{H}(Y) - \hat{I}(X;Y))], \qquad (20)$$

PAGE 65

Figure 12. Comparison of linear (LR) and MI-based dependence obtained after fitting a bivariate normal distribution (Norm) to each pair. The dependence is estimated with respect to different noise-to-signal ratios and quarters in the simulated and real data, respectively. (a) Case 4 (Chaotic): Henon map with 100 points. (b) Relationship between ENSO and Nile River flow. In (a), the mean dependence from both cases is the same, whereas there is a very slight difference in variances for a few noise-to-signal ratios. In (b), both cases capture the same mean dependence, whereas variances differ very slightly for a few quarters.

PAGE 66

The estimation of the MI requires the estimation of the joint and marginal pdfs, which, in turn, are frequently obtained from histogram and kernel density based estimators. Estimates of MI are consistent and asymptotically converge to the true or theoretical value when the data sets are relatively large and error-free. Since observations of river flows and the ENSO index are short and usually affected by various errors, it is important to assess various MI estimation methods for short and noisy data. Recently developed methodologies have been explored for estimating the MI, such as kernel density estimators (KDE) [2], k-nearest neighbors (KNN) [5], and the Edgeworth approximation of differential entropy (Edgeworth) [6].

PAGE 67

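For any bivariate data set (X,Y) of size N, $\hat{I}(X;Y)$ is estimated as

$$\hat{I}(X;Y) = \frac{1}{N}\sum_{i=1}^{N}\log\frac{\hat{p}_{XY}(x_i,y_i)}{\hat{p}_X(x_i)\,\hat{p}_Y(y_i)}, \qquad (21)$$

where $\hat{p}_{XY}(x_i,y_i)$ is the estimated joint pdf, and $\hat{p}_X(x_i)$ and $\hat{p}_Y(y_i)$ are the estimated marginal pdfs at $(x_i,y_i)$.

The multivariate kernel density estimator using a normal kernel is defined as

$$\hat{p}_X(x) = \frac{1}{Nh^d}\sum_{i=1}^{N}\frac{1}{(2\pi)^{d/2}|S|^{1/2}}\exp\left(-\frac{(x - x_i)^{T}S^{-1}(x - x_i)}{2h^2}\right), \qquad (22)$$

where N is the number of data points; x and $x_i$ are d-dimensional vectors; S is the covariance matrix on the $x_i$; |S| is the determinant of S; and h is the kernel bandwidth, also called the smoothing parameter [2]. In this dissertation, the smoothing parameter is chosen as the optimal Gaussian bandwidth for a normal kernel, given as $h = [4/(d+2)]^{1/(d+4)}N^{-1/(d+4)}$. The MI estimates are obtained by first estimating $\hat{p}_X$, $\hat{p}_Y$, and $\hat{p}_{XY}$ from equation (22) and then plugging them into equation (21).

4.3.2 k-nearest neighbors (KNN)

The MI between X and Y is estimated as

$$\hat{I}(X;Y) = \psi(k) - \frac{1}{k} - \frac{1}{N}\sum_{i=1}^{N}\left[\psi(n_x(i)) + \psi(n_y(i))\right] + \psi(N),$$

where N and k are the number of data points and nearest neighbors, respectively; if $\epsilon(i)/2$ is the distance between $(x_i,y_i)$ and its kth neighbor, denoted by $(kx_i,ky_i)$, and if $\epsilon_x(i)/2$ and $\epsilon_y(i)/2$ are given as $||x_i - kx_i||$ and $||y_i - ky_i||$, respectively, then $n_x(i)$ is the number of points $x_j$ such that $||x_i - x_j|| \le \epsilon_x(i)/2$; $n_y(i)$ can be calculated similarly; $\psi(x)$ is the digamma function, $\psi(x) = \Gamma(x)^{-1}\,d\Gamma(x)/dx$, which satisfies the relation $\psi(x+1) = \psi(x) + 1/x$, with $\psi(1) = -C$, where C = 0.5772156649 is the Euler-Mascheroni constant [5]. This dissertation chooses k as 3 since Kraskov et al. [5] suggested k > 1 in order to reduce statistical errors and also advised avoiding large values of k, which lead to an increase in systematic errors.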
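A k-nearest-neighbor MI estimate in the spirit of Kraskov et al. [5] can be sketched in plain Python. The sketch follows the second Kraskov estimator, I = psi(k) - 1/k - mean[psi(n_x) + psi(n_y)] + psi(N), and rests on a few assumptions that should be checked against [5] before any serious use: the joint-space distance is the maximum norm, the projected half-widths are taken as the largest x- and y-distances among the k nearest neighbors, and ties are ignored (continuous data). The function names are hypothetical. Only integer arguments of the digamma function are needed, so the recurrence quoted above suffices.

```python
import random

EULER_GAMMA = 0.5772156649

def digamma_int(n):
    """psi(n) for a positive integer n, from psi(1) = -C and psi(x+1) = psi(x) + 1/x."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def ksg_mi(x, y, k=3):
    """k-nearest-neighbor MI estimate (nats) for two scalar series."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Maximum-norm distances from point i to every other point.
        dists = sorted((max(abs(x[i] - x[j]), abs(y[i] - y[j])), j)
                       for j in range(n) if j != i)
        neighbours = [j for _, j in dists[:k]]
        half_x = max(abs(x[i] - x[j]) for j in neighbours)
        half_y = max(abs(y[i] - y[j]) for j in neighbours)
        # Marginal counts within the projected half-widths (point i excluded).
        n_x = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) <= half_x)
        n_y = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) <= half_y)
        total += digamma_int(n_x) + digamma_int(n_y)
    return digamma_int(k) - 1.0 / k - total / n + digamma_int(n)

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y_dep = [xi + 0.3 * rng.gauss(0.0, 1.0) for xi in x]   # strongly dependent
y_ind = [rng.gauss(0.0, 1.0) for _ in range(300)]      # independent of x
mi_dep = ksg_mi(x, y_dep)
mi_ind = ksg_mi(x, y_ind)
```

Unlike the KDE estimate, this estimator can return small negative values for independent data, which is one reason a clamp is often applied before converting MI to a nonlinear CC.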

PAGE 68

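Using the Edgeworth expansion of the density p(x), $x = [x_1,\ldots,x_d]$, the differential entropy is defined as

$$H(p) = H(\phi_p) - J(p) \approx H(\phi_p) - \frac{1}{12}\sum_{i=1}^{d}(\kappa^{i,i,i})^2 - \frac{1}{4}\sum_{i,j=1;\, i \ne j}^{d}(\kappa^{i,i,j})^2 - \frac{1}{72}\sum_{i,j,k=1;\, i < j < k}^{d}(\kappa^{i,j,k})^2.$$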
PAGE 69

The simulations indicate that the presence of noise typically leads to an under-estimation of the true MI between the underlying nonlinear signals (see section 4.5). As compared to KNN and Edgeworth, KDE is found to capture the underlying nonlinear dependence more consistently between two time series when they are short and noisy, assuming such dependence exists (see section 4.5). We also compare nonlinear dependence measures, such as KDE, KNN, and Edgeworth, with a rank-based dependence measure, i.e., Kendall's tau. From Kendall's tau, we observe a large negative bias in nonlinear dependence in the simulated data contaminated with noise (see section 4.5). Thus in this dissertation LR and KDE approaches have been consistently used to estimate and compare linear and nonlinear CCs.

4.5 Comparisons of MI estimation methods using simulations

4.5.1 Comparison between KDE, KNN, and Edgeworth

The three MI estimation approaches, viz. KDE, KNN, and Edgeworth, are investigated to find the most effective method in terms of quantifying the underlying nonlinear dependence for noisy and short data. Linear and nonlinear dependence measures are evaluated and validated using simulated time series with different linear or nonlinear behavior. The best method should give zero correlation when there is no dependence and quantify any linear or nonlinear correlation which may be present. The presence of noise makes the detection of linear and nonlinear dependence difficult, as this dependence may be obscured if the noise component dominates. This is true especially for short time series and is evident from the density plots for linear, nonlinear, periodic, and nonlinear dynamical systems with different noise-to-signal ratios (Figures 13-16). With increasing noise levels, the linear and nonlinear dependence cannot be readily discerned visually, as shown in the kernel density plots (Figures 13-16). This dissertation chooses the best method as the one which captures more of the true nonlinear dependence such that the 90% confidence bounds do not overlap with the estimates

PAGE 70

Figure 13. Normal and kernel densities with different noise ($\sigma_n$) to signal ($\sigma_s$) ratios for Case 1 (Linear) with N = 100. For the kernel density, a Gaussian kernel with optimal Gaussian bandwidth, given as $h = N^{-1/6}$, is used. (a) $\sigma_n/\sigma_s = 0.1$. (b) $\sigma_n/\sigma_s = 0.5$. (c) $\sigma_n/\sigma_s = 1.0$. The linear dependence structure can be seen clearly for cases (a) and (b) but cannot be readily identified for case (c) based on eye estimation.

PAGE 74

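where $H(Y|X) = H(Z) = 0.5\log(2\pi e\sigma_n^2)$. The pdf of Z is

$$p_Z(z) = (2\pi)^{-1/2}(\sigma_n)^{-1}\exp\left(-\frac{z^2}{2\sigma_n^2}\right).$$

From equation 26, the pdf of Y is given as $p_Y(y) = (2\pi)^{-1/2}(\sigma_x^2 + \sigma_n^2)^{-1/2}\exp[-x^2 \ldots$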

PAGE 75

Equation 28 is solved using numerical integration for different values of $\sigma_n$. H(Y), given as $H(Y) = -\int p_Y(y)\log p_Y(y)\,dy$, is computed using $p_Y(y)$. The MI is computed by substituting H(Y) and H(Y|X) in equation 25.

In each case three sets of data of size (N) 50, 100, and 1000 points are used. In this dissertation, 50 and 100 points (comparable to the sizes of the geophysical data sets used in the dissertation) and 1000 points are considered as short and long time series, respectively. The total number of samples considered for N = 50, N = 100, and N = 1000 is 200, 100, and 20, respectively. CCs are defined as the mean CC from the total number of samples. The 90% confidence bounds of CCs are given by the 5% and 95% quantiles of CCs obtained from the total samples. The correlation coefficients are plotted against the noise ($\sigma_n$) to signal ($\sigma_s$) ratio, i.e., $\sigma_n/\sigma_s$. CCs obtained from KDE, KNN, Edgeworth, and LR are compared to find the best method which

PAGE 76

Table 5. MI estimates with standard errors given in parentheses between two Gaussian noise sets $(X_i, Y_i)$: $X \sim N(0,1)$; $Y \sim N(0,1)$; $i = 1,\ldots,N$, where X and Y are iid and independent of each other. The total number of samples for N = 50, N = 100, and N = 1000 is 200, 100, and 20, respectively. The MI estimates and their standard errors are the mean and standard deviation from the total samples. The MI should be zero between two Gaussian noise sets. The MI estimates obtained from all three methods are close to zero but biased upwards in the case of KDE and KNN.

Method      N = 50            N = 100           N = 1000
KDE         0.1033 (0.0351)   0.0707 (0.0192)   0.0270 (0.0046)
KNN         0.0829 (0.0549)   0.0555 (0.0425)   0.0220 (0.0143)
Edgeworth   0.0298 (0.0282)   0.0139 (0.0115)   0.0016 (0.0012)

The best method needs to be robust to noise and short data, both in the sense that the dependence among the underlying nonlinearities is captured and in the sense that the computed dependence is indeed zero when the variables are known to be independent. The MI between two independent variables is zero, and this fact can be utilized as a consistency check for each method. Here two Gaussian noise sets having zero mean and unit variance are analyzed. The MI estimates are found to be close to zero from all three methods (Table 5). This demonstrates that all three methods, viz. KDE, KNN, and Edgeworth, pass the consistency test and yield near-zero dependence when the two variables are independent. Edgeworth gives the best estimates and the smallest error bounds between two uncorrelated Gaussians. The MI estimates from KDE and KNN are biased upwards. KDE is preferable over KNN as the standard errors from KNN are more than 1.5 times those from KDE.

PAGE 81

Case 1 (Linear): Analysis with increasing levels of noise shows that CCs decrease with increasing noise-to-signal ratios. This is expected as the noise component obscures the underlying linear dependence. Nonlinear CCs obtained from all three methods are close to and overlap with the theoretical values (which, in turn, are expected to be identical to linear CCs in this case), indicating that all the approaches (Linear, KDE, KNN, Edgeworth) succeed in capturing the dependence between the random variables (Figures 17, 19a, and 18a). Nonlinear CCs from KDE are biased upwards for several noise-to-signal ratios (Figures 19a, 18a, and 20a). At higher noise levels, confidence bounds obtained from KNN and Edgeworth are much wider as compared to KDE's confidence bounds (Figures 17 and 20a). While all correlation methods capture the true linear dependence, KDE is the better choice as it has narrower confidence bounds.

Figure 21. Nonlinear and linear CCs for Case 2 (Quadratic) with 90% confidence bounds obtained from KDE and LR, respectively. (a) N = 50. (b) N = 100. All three nonlinear correlation estimates include theoretical CCs, but the 90% confidence bounds from KNN and Edgeworth also overlap with linear CCs at higher noise levels in (a). This shows that KNN and Edgeworth estimates are not different from linear CCs at higher noise levels. KDE quantifies more correlation as compared to the linear correlation, as their 90% confidence bounds do not overlap, indicating that KDE can truly capture the nonlinear dependence.

PAGE 82

Figure 22. Nonlinear and linear CCs for Case 3 (Periodic) with 90% confidence bounds obtained from KDE and LR, respectively. (a) N = 50. (b) N = 100. Edgeworth captures nothing more than the linear correlation. At low noise levels, KNN seems to be the best as it overlaps with theoretical CCs and its bounds are narrow. At higher noise levels, KDE and KNN CCs overlap and also include linear and theoretical CCs, but the 90% confidence bounds from KNN are wider than those obtained from KDE. At higher noise levels and for relatively small data sets, KDE seems to have an edge over KNN because of its narrow bounds.

PAGE 83

Figure 23. Nonlinear and linear CCs for Case 4 (Chaotic) with 90% confidence bounds obtained from KDE and LR, respectively. (a) N = 50. (b) N = 100. At higher noise levels, KNN and Edgeworth CCs overlap with linear CCs, indicating that they do not capture anything more than the linear correlation. KDE is the best in capturing the nonlinear dependence as its 90% confidence bounds do not overlap with linear CCs.

PAGE 84

The Henon example above is yet another case where, at higher noise levels and for known nonlinear dependence between the variables, estimates of CCs based on KNN and Edgeworth cannot conclusively show the presence of that dependence because of their large error bounds. Although their performance is often better than KDE for small noise levels, they fail in large-noise scenarios in small data sets. Since the data are short and highly noisy in geophysics, this simulation exercise points to KDE as the best estimator of MI/CCs among the estimators available.

4.5.1.3 Long time series

PAGE 86

Kendall's tau is a rank-based dependence measure used to estimate the strength of dependence between variables. It provides a robust approach to assess monotonic nonlinear dependence. In fact, linear and rank correlation have been used jointly to understand the nature of dependence in the literature [100]. Let $(X,Y) \sim F$ be a pair of random variables with distribution F, and let $(\tilde{X},\tilde{Y}) \sim F$ be independent of (X,Y) (and with the same distribution F). Kendall's tau between X and Y is defined as

$$\tau = P\left[(X - \tilde{X})(Y - \tilde{Y}) > 0\right] - P\left[(X - \tilde{X})(Y - \tilde{Y}) < 0\right]. \qquad (29)$$
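The definition of Kendall's tau above has a direct sample analogue: the fraction of concordant pairs minus the fraction of discordant pairs. The short sketch below uses this simple form, in which tied pairs count as neither concordant nor discordant (an assumption; tie-corrected variants exist). It also illustrates why a rank-based measure cannot capture non-monotonic dependence: a symmetric quadratic relationship yields a tau of zero.

```python
def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

tau_increasing = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])
tau_decreasing = kendall_tau([1, 2, 3, 4], [40, 30, 20, 10])
# y = x^2 on a symmetric grid: fully dependent, but not monotonic.
grid = [-2, -1, 0, 1, 2]
tau_quadratic = kendall_tau(grid, [v * v for v in grid])
```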

PAGE 87

The complete description of Kendall's tau is given in [101].

Figure 25. Correlation coefficients for Case-Cubic with 90% confidence bounds obtained from LR, KDE, KNN, Edgeworth, and Kendall's tau. (a) N = 50. (b) N = 100. (c) N = 1000. For all cases, the lowest curve is obtained from Kendall's tau. In (a) and (b), Kendall's tau overlaps with linear at lower noise levels, but at higher noise levels it overlaps with KNN. In (c), Kendall's tau captures the lowest dependence.

PAGE 94

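), plotted as solid dots, are estimated as $2(\hat{\lambda},\hat{\rho}) - (\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$, where $(\hat{\lambda},\hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$ is the mean of 100 jackknife replications of size 0.8N observations. The 90% confidence bounds are given by the 5% and 95% quantiles of the 100 jackknife replications of size 0.8N.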
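The bias correction described in this caption (twice the full-sample estimate minus the mean of 100 subsample replications of size 0.8N, with the 5% and 95% quantiles of the replications as bounds) can be sketched as follows. The sketch assumes the replications are drawn at random without replacement and that quantiles use linear interpolation; the helper names are hypothetical, and the linear CC stands in for whichever CC estimator is being corrected.

```python
import math
import random

def pearson(x, y):
    """Linear correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def quantile(sorted_values, q):
    """Quantile of an already sorted list, by linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac

def jackknife_corrected(x, y, statistic, rng, n_rep=100, fraction=0.8):
    """Return (bias-corrected estimate, 5% quantile, 95% quantile)."""
    n = len(x)
    size = int(round(fraction * n))
    full_estimate = statistic(x, y)
    replications = []
    for _ in range(n_rep):
        idx = rng.sample(range(n), size)   # leave out n - size observations
        replications.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    replications.sort()
    mean_rep = sum(replications) / n_rep
    corrected = 2.0 * full_estimate - mean_rep
    return corrected, quantile(replications, 0.05), quantile(replications, 0.95)

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(80)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]
corrected, low, high = jackknife_corrected(x, y, pearson, rng)
```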

PAGE 95

This dissertation assumes that the seasonal cycle for a particular year consists of 12 months starting with the month having the lowest average discharge. It also assumes that long-term flow variability due to ENSO can be captured in the annual flow, which, in turn, is defined as the integrated streamflow of the seasonal cycle. Here eight quarterly ENSO indices, i.e., three quarters just before the seasonal cycle, four quarters corresponding to the seasonal cycle, and one quarter just after the seasonal cycle, are derived from quarterly averages of mean monthly SST anomalies. The bivariate normal and kernel densities between the quarterly ENSO indices and the annual flow of the Nile, Amazon, Congo, Parana, and Ganges rivers are estimated and plotted (Figures 26-30). Linear and nonlinear CCs between the ENSO index and the annual flow of the Nile, Amazon, Congo, Parana, and Ganges rivers are obtained using LR and KDE, respectively (Figure 31). The bias-corrected CCs and their 90% confidence bounds are estimated using jackknifing.

The jackknife is used to estimate the bias-corrected $\lambda$ and $\rho$ and their standard errors using KDE and LR, respectively. The technique is described below for $\lambda$ and is the same for $\rho$. In the case of real data analysis, the total number of observations (N) varies from 60 to 117. If d observations for jackknifing are left out and p

PAGE 96

), are estimated as $2(\hat{\lambda},\hat{\rho}) - (\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$, where $(\hat{\lambda},\hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.

Quarter                           Linear CC       MSE (LR)      Nonlinear CC    MSE (KDE)
Aug.-, Sep.-, Oct.- (A-S-O-)      0.135 (0.048)   2.53 (0.18)   0.300 (0.033)   2.14 (0.17)
Nov.-, Dec.-, Jan.- (N-D-J-)      0.137 (0.058)   2.53 (0.20)   0.378 (0.030)   2.03 (0.16)
Feb.-, Mar.-, Apr.- (F-M-A-)      0.286 (0.053)   2.34 (0.20)   0.475 (0.025)   1.82 (0.13)
May, Jun., Jul. (MJJ)             0.504 (0.035)   1.95 (0.16)   0.634 (0.023)   1.44 (0.12)
Aug., Sep., Oct. (ASO)            0.528 (0.037)   1.88 (0.15)   0.617 (0.027)   1.47 (0.12)
Nov., Dec., Jan. (NDJ)            0.501 (0.040)   1.92 (0.17)   0.597 (0.029)   1.51 (0.14)
Feb., Mar., Apr. (FMA)            0.466 (0.037)   2.04 (0.14)   0.555 (0.029)   1.63 (0.11)
May+, Jun.+, Jul.+ (M+J+J+)       0.171 (0.042)   2.51 (0.16)   0.361 (0.030)   2.03 (0.15)

Table 7. Linear and nonlinear CCs between the annual flow of the Amazon River and the ENSO index averaged for eight quarters. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month. The bias-corrected estimates are estimated as $2(\hat{\lambda},\hat{\rho}) - (\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$, where $(\hat{\lambda},\hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.

Quarter                           Linear CC       MSE (LR)      Nonlinear CC    MSE (KDE)
Feb.-, Mar.-, Apr.- (F-M-A-)      0.092 (0.063)   4.82 (0.41)   0.341 (0.027)   3.84 (0.29)
May-, Jun.-, Jul.- (M-J-J-)       0.202 (0.053)   4.59 (0.45)   0.379 (0.040)   3.68 (0.33)
Aug.-, Sep.-, Oct.- (A-S-O-)      0.325 (0.056)   4.29 (0.36)   0.423 (0.041)   3.49 (0.28)
Nov., Dec., Jan. (NDJ)            0.321 (0.049)   4.34 (0.38)   0.397 (0.032)   3.62 (0.30)
Feb., Mar., Apr. (FMA)            0.220 (0.052)   4.63 (0.40)   0.340 (0.039)   3.85 (0.32)
May, Jun., Jul. (MJJ)             0.141 (0.049)   4.77 (0.46)   0.302 (0.033)   3.94 (0.38)
Aug., Sep., Oct. (ASO)            0.044 (0.049)   4.85 (0.45)   0.180 (0.039)   4.16 (0.37)
Nov.+, Dec.+, Jan.+ (N+D+J+)      0.048 (0.053)   4.82 (0.43)   0.228 (0.036)   4.06 (0.34)

PAGE 97

), are estimated as $2(\hat{\lambda},\hat{\rho}) - (\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$, where $(\hat{\lambda},\hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.

Quarter                           Linear CC       MSE (LR)      Nonlinear CC    MSE (KDE)
Nov.-, Dec.-, Jan.- (N-D-J-)      0.230 (0.045)   1.74 (0.19)   0.346 (0.034)   1.39 (0.14)
Feb.-, Mar.-, Apr.- (F-M-A-)      0.271 (0.054)   1.66 (0.18)   0.449 (0.045)   1.24 (0.13)
May-, Jun.-, Jul.- (M-J-J-)       0.261 (0.045)   1.71 (0.19)   0.478 (0.027)   1.23 (0.13)
Aug., Sep., Oct. (ASO)            0.132 (0.054)   1.79 (0.19)   0.362 (0.036)   1.38 (0.14)
Nov., Dec., Jan. (NDJ)            0.092 (0.040)   1.83 (0.21)   0.397 (0.031)   1.37 (0.15)
Feb., Mar., Apr. (FMA)            0.167 (0.049)   1.77 (0.17)   0.352 (0.035)   1.38 (0.12)
May, Jun., Jul. (MJJ)             0.023 (0.055)   1.81 (0.19)   0.229 (0.031)   1.47 (0.14)
Aug.+, Sep.+, Oct.+ (A+S+O+)      0.072 (0.054)   1.79 (0.22)   0.291 (0.038)   1.42 (0.17)

Table 9. Linear and nonlinear CCs between the annual flow of the Parana River and the ENSO index averaged for eight quarters. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month. The bias-corrected estimates are estimated as $2(\hat{\lambda},\hat{\rho}) - (\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$, where $(\hat{\lambda},\hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations. $\widehat{\mathrm{MSE}}$ and its standard errors given in parentheses are the mean and standard deviation of MSEs estimated from 100 jackknife replications of size 0.8N observations.

Quarter                           Linear CC       MSE (LR)      Nonlinear CC    MSE (KDE)
Dec.-, Jan.-, Feb.- (D-J-F-)      0.141 (0.040)   1.59 (0.21)   0.315 (0.033)   1.08 (0.08)
Mar.-, Apr.-, May- (M-A-M-)       0.211 (0.041)   1.54 (0.22)   0.297 (0.038)   1.08 (0.09)
Jun.-, Jul.-, Aug.- (J-J-A-)      0.366 (0.043)   1.35 (0.22)   0.437 (0.032)   0.95 (0.08)
Sep., Oct., Nov. (SON)            0.435 (0.043)   1.30 (0.16)   0.513 (0.031)   0.89 (0.07)
Dec., Jan., Feb. (DJF)            0.476 (0.058)   1.25 (0.13)   0.542 (0.043)   0.85 (0.04)
Mar., Apr., May (MAM)             0.453 (0.052)   1.27 (0.13)   0.530 (0.043)   0.84 (0.06)
Jun., Jul., Aug. (JJA)            0.251 (0.055)   1.52 (0.20)   0.403 (0.033)   1.01 (0.07)
Sep.+, Oct.+, Nov.+ (S+O+N+)      0.087 (0.047)   1.60 (0.22)   0.309 (0.041)   1.08 (0.09)

PAGE 98

), are estimated as $2(\hat{\lambda},\hat{\rho}) - (\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$, where $(\hat{\lambda},\hat{\rho})$ are the original nonlinear and linear CCs between the annual flow and ENSO, respectively, considering all N observations. $(\hat{\lambda}_{(\cdot)},\hat{\rho}_{(\cdot)})$ and their standard errors given in parentheses are the mean and standard deviation of 100 jackknife replications of size 0.8N observations.

Quarter                           Linear CC       MSE (LR)      Nonlinear CC    MSE (KDE)
Jul.-, Aug.-, Sep.- (J-A-S-)      0.036 (0.076)   4.70 (0.55)   0.351 (0.037)   3.64 (0.46)
Oct.-, Nov.-, Dec.- (O-N-D-)      0.130 (0.071)   4.83 (0.46)   0.359 (0.039)   3.76 (0.43)
Jan.-, Feb.-, Mar.- (J-F-M-)      0.118 (0.063)   4.71 (0.43)   0.416 (0.033)   3.53 (0.39)
Apr., May, Jun. (AMJ)             0.408 (0.051)   4.08 (0.34)   0.509 (0.040)   3.23 (0.30)
Jul., Aug., Sep. (JAS)            0.492 (0.043)   3.56 (0.34)   0.567 (0.034)   2.82 (0.31)
Oct., Nov., Dec. (OND)            0.483 (0.040)   3.73 (0.35)   0.562 (0.028)   2.99 (0.29)
Jan., Feb., Mar. (JFM)            0.415 (0.043)   3.98 (0.36)   0.516 (0.030)   3.18 (0.31)
Apr.+, May+, Jun.+ (A+M+J+)       0.270 (0.053)   4.53 (0.48)   0.490 (0.029)   3.36 (0.40)

Table 11. Variation in the annual flow of rivers associated with ENSO. Linear and nonlinear CCs are estimated using LR and KDE, respectively. Months in a quarter are given in []. The month preceding (following) the seasonal cycle is indicated by a negative (positive) sign following a month.

River     Previous Studies    Linear CC    Nonlinear CC
Nile      25% [SON] [13]      28% [ASO]    40% [MJJ]
Amazon    10% [DJF] [14]      11% [ASO]    18% [ASO]
Congo     10% [MAM] [14]      7% [FMA]     23% [MJJ]
Parana    19% [DJF] [14]      23% [DJF]    29% [DJF]
Ganges    29% [JJA] [15]      24% [JAS]    32% [JAS]

4.6.1 Description of results

Linear CCs are negative between the Nile flow and all quarters of the ENSO index, between the Amazon flow and quarters 2 to 6, between the Congo flow and quarters 1 to 7, and between the Ganges flow and all quarters. Since nonlinear CCs obtained from KDE, KNN, and Edgeworth do not have directionality, the absolute values of linear CCs are considered and plotted. The MI-based nonlinear dependence measure, i.e., KDE, generates higher CCs and lower MSEs than the linear dependence measure, i.e., LR, which shows that KDE captures more of the extrabasinal connection between ENSO and river flows in the tropical and subtropical regions of the world than LR does (Tables 11 and 6-10). The percentage variation in the annual flow of rivers associated with ENSO is calculated as the square of the CCs. KDE suggests an increase of around 20-70% in the extrabasinal connection between ENSO and river flows over that suggested by LR (Figure 31 and Table 11). In the case of the Nile, the 90% confidence bounds of linear and nonlinear CCs are well separated for 5 quarters, including the quarter with the highest nonlinear CC, indicating that KDE captures greater dependence between ENSO and the annual flow than LR (Figure 31a). KDE suggests greater dependence between the Congo flow and ENSO since the 90% confidence bounds of linear and nonlinear CCs are well separated for all quarters except the first quarter (Figure 31c). In the case of the Amazon, Parana, and Ganges, the 90% confidence bounds of linear and nonlinear CCs overlap for all those quarters which have higher linear CCs, but for the other quarters the bounds are well separated (Figures 31b, 31d, and 31e). This indicates that both KDE and LR capture nothing more than the linear dependence for some quarters based on the 90% confidence bounds. However, there is an increase in the bias-corrected CCs from KDE as compared to

The results with the real data reported here suggest that there exists a nonlinear extrabasinal connection between ENSO and river flows in the tropics and subtropics. This dissertation also shows an appreciable increase in the variation of annual river flows linked to ENSO when a nonlinear relationship measure is used instead of linear measures. Hence, these results indicate additional predictability in the ENSO-streamflow extrabasinal connection when MI-based approaches are used, as compared to the linear approaches used by researchers to date. The additional dependence captured by the MI-based nonlinear CCs may be useful for developing more accurate and longer streamflow models. This can, in turn, help in water resources management (e.g., reservoirs and dams for flood control, power generation, drought mitigation and preparedness for water supply).

4.7 Discussion

Streamflow series may reflect monotonic trends related to anthropogenic factors, which may include diversions, consumptions and flow regulations within the basin, in addition to possible impacts of climate change. An estimation of the likely magnitudes, as well as a qualitative assessment of the evidence, of such changes may need to be performed on a case by case basis for each basin. Just as an example, three of the co-authors of this paper performed qualitative investigations for streamflows of two rivers within the United States [21]. These investigations demonstrated that meaningful studies may need to be rather time-consuming, hence such efforts are left as areas of future research for the purposes of this paper. Discussions regarding the specific datasets utilized in this paper can be found within the data sources as well as within the previous studies that have utilized these datasets [13,17]. We would like to note that accounting for all the known trends, if possible, may have a significant impact on the ENSO to streamflow connection. Thus, it is likely that the ENSO-streamflow extrabasinal connection is actually even higher than estimated if such trends were to be accounted for. Conversely, it is possible that some extremes are highlighted in the anthropogenic basin flow trends which tend to overemphasize the ENSO connection. On the other hand, an argument can perhaps

Although ENSO has a direct influence on rainfall anomalies over the tropical and subtropical regions, only a portion of the variation in the annual flow of rivers located in these regions is associated with ENSO events. This may be due to the complex relationship between rainfall and runoff, which, in turn, depends on surface hydrological and ocean-atmosphere-land interaction processes as well as noisy and potentially incomplete or corrupted data.

In recent decades, economic, population and geo-political pressures have resulted in significant changes in land-use patterns that may alter the land-atmosphere-water cycle in the tropics and subtropics. These changes in the water cycle can, in turn, impact regional precipitation, water vapor flux, and surface water flows, causing regional as well as global shifts in seasonal-to-interannual atmospheric variability. A better understanding and quantification of the relationship between ENSO and river discharges can help scientists and policy makers understand and get prepared for the changes in river discharge patterns besides attributing such changes to natural or anthropogenic drivers.

Acknowledgements

Spatial and temporal variability of precipitation extremes are investigated by utilizing daily observations available at 2.5° gridded fields in South America for the period 1940-2004. All 65 years of data from 1940-2004 are analyzed for spatial variability. The temporal variability is investigated at each spatial grid by utilizing 25-year moving windows from 1965-2004 and visualized through plots of the slope of the regression line in addition to its quality measure ($R^2$). The Poisson-generalized Pareto (Poisson-GP) model, which is a peaks over threshold (POT) approach, is applied to weekly precipitation maxima residuals based on the 95%-quantile threshold, while daily data are utilized to analyze the number of consecutive daily extremes and daily extremes in a month based on the 99%-quantile threshold. Using the Poisson-GP model, we compute parameters of the GP distribution, return levels (RL) and a new measure called the precipitation extremes volatility index (PEVI). The PEVI measures the variability of extremes and is expressed as a ratio of return levels. From 1965-2004, the PEVI shows increasing trends in the Amazon basin except eastern parts, few parts of the Brazilian highlands, north-west Venezuela including Caracas, north Argentina, Uruguay, Rio de Janeiro, Sao Paulo, Asuncion, and Cayenne. Catingas, few parts of the Brazilian highlands, Sao Paulo and Cayenne experience an increasing number of consecutive 2- and 3-day extremes from 1965-2004. The number of daily extremes, computed for each month, suggests that local extremes occur mostly from December to April, with July to October being relatively quiet periods.

5.1 Introduction
Precipitation extremes can have significant impacts on human society, economics, and nature. Flooding is directly associated with precipitation extremes which can cause a large number of casualties, loss of property, waterborne disease outbreaks in humans, plants and animals [103], and extensive damage to crops. An understanding of the intensity and frequency of precipitation extremes can be very useful for infrastructure development to prevent flooding and landslides, as well as for water resources and agricultural management. This may help nations and world bodies like the UN to be better prepared for future disasters caused by floods and flash floods. A better understanding of precipitation extremes can help hydrologic scientists and clima-

Previous studies investigated trends and variability of precipitation extremes in many parts of the world in the twentieth century, specifically the United States [22,23], India [24], Southeast Asia and the South Pacific [25], Australia [23,26], Europe [27], Caribbean [28], Italy [29], Balkans [30], Canada, Norway, Russia, China, Mexico [23], Japan [31], Sweden [32], southeastern South America [33], and the state of Sao Paulo, Brazil [34]. Recently the spatio-temporal variability of dependence among precipitation extremes was investigated over the entire South America for the period 1940-2004 using a new approach (suggested by Kuhn [35]) [36]. However, we are not aware of any prior investigations on spatial and temporal variability of precipitation extremes over the entire continent of South America.

Extreme value theory (EVT) has been widely used in hydrology to perform flood frequency analyses by utilizing historical records of precipitation, streamflow and other variables [104]. In recent years, EVT has been applied in multiple disciplines including hydrology [39,44], ecology [41,47], hurricane damage [105], temperature [106], wind speed [107], and wildfire sizes [108]. The generalized extreme value (GEV) distribution, developed by Jenkinson [37], has been traditionally utilized for modeling precipitation extremes [38]. This approach is also called the block maxima approach since it fits the distribution to the highest values in blocks of equal size, e.g., maximum yearly precipitation. It has some advantages, e.g., its requirements can be met by a simplified summary of data and the block maxima can be assumed to be independent random variables [41]. But the main drawback of the GEV distribution is that it does not utilize all the available information about the upper tail of the distribution, e.g., the two highest extreme precipitation events may occur in the same year [41]. An alternative approach is to use peaks over threshold (POT), which originated in hydrology and makes use of all the data available, e.g., all daily precipitation data [42]. The statistical model underlying the POT method consists of (1) a Poisson process for the occurrences of extremes over a large threshold and (2) a generalized Pareto (GP) distribution (with scale ($\sigma$) and shape ($\xi$) parameters), developed by Pickands [43], for the distribution of excesses over a large threshold. This model is also termed the Poisson-GP model. Recently, the GP distribution has been utilized for modeling threshold excesses from daily precipitation data [44,45]. This dissertation utilizes the Poisson-GP model for investigating the spatial and temporal variability of precipitation extremes at each grid point in South America.

Daily precipitation data is available in 2.5° gridded fields for the period 1940-2004 in South America. The Poisson-GP model assumes the data to be independent and identically distributed (IID) [39]. A long-term trend and seasonality in the data violate the assumption of identically distributed data whereas

The daily precipitation data used in this dissertation was published for Brazil, Venezuela, north Argentina, Paraguay, Uruguay, Suriname and French Guiana from 1940-2004 by Liebmann and Allured [109]. The data was presented in 2.5° gridded fields which were constructed using daily precipitation totals from 7900 stations. The daily precipitation at each point on a 2.5° grid was calculated by averaging daily precipitation from all stations within a radius of 1.875° of the point. The complete description of this dataset is given in [109]. The spatial variability is investigated for 65 years from 1940-2004, where the percentage of data points available for the analysis at each grid point is shown in Figure 32a. We analyze all those grid points having 14 or more years of data. For the spatial variability from 1940-2004, 223 grid points are analyzed since they have 14 or more years of data. We investigate temporal variability for 40 years from 1965-2004 by considering a 25-year moving window, i.e., 1965-1989, 1966-1990, ..., 1980-2004, which generates 16 values. Figure 32b shows the mean percentage of data points, i.e., the mean of 16 percentages of data for 16 windows from 1965-2004, available for the analysis at each grid point. A total of 216 grid points are analyzed for the temporal variability since they have mean percentage values of 56% or more, which is equivalent to 14 or more years of data out of 25 years.

5.2.2 Methodology

5.2.2.1 Poisson-GP model

Let $x_1, \ldots, x_n$ be a sequence of IID measurements. An extreme event $x$ is defined when it exceeds a threshold $u$. If $x_{(1)}, \ldots, x_{(k)}$ are the $k$ exceedances over threshold $u$, then threshold excesses are defined as $y_i = x_{(i)} - u$, for $i = 1, \ldots, k$. If $y_1, \ldots, y_k$ is an independent sequence of a random variable, the distribution of these threshold excesses can be approximated by a member of the GP family [110]. The

where $y > 0$; $\sigma > 0$ is a scale parameter; and $-\infty < \xi < \infty$ is a shape parameter. The shape parameter is important to understand the qualitative behavior of the GP distribution. The GP distribution has an upper bound for $\xi < 0$ (also called bounded distribution) whereas it is unbounded for $\xi = 0$ (also called light-tailed distribution) and has no upper limit for $\xi > 0$ (also called heavy-tailed distribution) [41]. The parameters $\sigma$ and $\xi$ of the GP distribution are estimated by maximizing the log-likelihood function since maximum likelihood estimation assigns the highest probability to the observed data by adopting the model with the greatest likelihood out of all the models under consideration [110]. The log-likelihood function for the GP distribution defined in Equation 31 is given as

$$
\ell(\sigma, \xi) =
\begin{cases}
-k \log(\sigma) - (1 + 1/\xi) \sum_{i=1}^{k} \log(c_i), & \xi \neq 0,\ c_i > 0 \\
-k \log(\sigma) - \sigma^{-1} \sum_{i=1}^{k} y_i, & \xi = 0
\end{cases}
\tag{32}
$$

where $c_i = (1 + \xi y_i / \sigma)$ [110]. The GP models can be easily interpreted using extreme upper quantiles or return levels. In hydrology, the return level is generally defined on an annual scale, e.g., for a return period $N$, the $N$-year return level is defined as the level expected to be exceeded once in every $N$ years, or having an exceedance probability of $1/N$ in any given year. The $N$-year return level can be obtained by inverting Equation 31 as

$$
F^{-1}_{\sigma,\xi} = RL_N =
\begin{cases}
u + \dfrac{\sigma}{\xi} \left[ (N n_y \zeta_u)^{\xi} - 1 \right], & \xi \neq 0 \\
u + \sigma \log(N n_y \zeta_u), & \xi = 0
\end{cases}
\tag{33}
$$

where $u$ and $n_y$ are the threshold and number of observations in a year, respectively; and $\zeta_u = k/n$ is the probability of an individual observation exceeding $u$ [110].

The GP distribution is a limiting distribution of excesses over a large threshold, therefore the choice of threshold can be critical. If a threshold is low, it is likely to violate the asymptotic basis of the model, leading to bias in estimation and extrapolation, whereas a high threshold will result in a small number of exceedances for model estimation, leading to large estimation variance [110]. Two methods for threshold selection, which provide a reasonable approximation to the distribution of threshold excesses, are available: (a) find a threshold $u_0$ from the mean residual plot, which is a plot between mean of excesses and threshold $u$, above which the

We compare the distribution of the inter-arrival times of threshold exceedances with the exponential distribution using the goodness-of-fit statistic $D_{SP}$, suggested by Michael [113], which is based on the stabilized probability plot. Let $t_1, \ldots, t_k$ be $k$ inter-arrival times of exceedances over threshold given as the 95%-quantile. If $t_1 <$

If the inter-arrival times of threshold exceedances follow a one-dimensional homogeneous Poisson process,

The quality of the fitted GP model to threshold excesses can be assessed by probability and quantile plots. If $y_1, \ldots, y_k$ are the $k$ excesses over a threshold $u$ and $\hat{F}$ is an estimated GP model, the probability plot can be generated as $\{(i/(k+1), \hat{F}(y_i));\ i = 1, \ldots, k\}$;

The validity of the Poisson-GP model is based on the assumption that the data should be IID. The presence of long term trends, seasonality, and temporal correlations violates the assumption of IID data [47]. Precipitation data may be temporally correlated and have long term or seasonal trends [47]. Galambos [116] investigated the effect of long term trends, seasonality, and temporal dependence in the data on the validation of extreme value theory and found that if the auto-correlation decreases as lag time increases, the asymptotic distribution of extremes is the same as that from IID samples. The detection of clustering of extremes is also important because the maximum likelihood estimation technique assumes the time series of excesses over a large threshold to be independent [41]. Clustering gradually disappears as the threshold increases but there are some variables, such as temperature, which exhibit clustering even with high thresholds [110]. If clusters of extremes exist over a high threshold, they can be removed with the ad hoc and inefficient declustering procedure presented by Todorovic and Zelenhasic [42], which generates a time series by choosing the highest value of each cluster.

We analyze three different sets of data generated from the daily data, namely daily values, weekly maxima, and weekly maxima residuals, to choose the data that best satisfy the IID assumption and improve the quality of the Poisson-GP model. Instead of presenting the results from 223 grid points, we outline the results for

The results discussed and presented in this section provide a couple of interesting insights, which, in turn, have influenced our data analysis choices. First, minor to relatively more significant reductions, in terms of seasonality or periodicity, clustering of extremes and autocorrelation, as well as improvements in terms of the

We analyze weekly maxima residuals to investigate the spatial and temporal variability of threshold, 50-year RL, 200-year RL, and PEVI (Figures 43, 44, 45, and 46). Increasing or decreasing trends in precipitation from 1965-2004 can be evaluated from the temporal variability of thresholds. At each grid point, the threshold is chosen as the 95%-quantile of the time series. Spatial variability is investigated for 65 years (1940-2004) and the last 40 years (1965-2004) are analyzed for the temporal variability, which is given as the slope of the linear trend obtained by fitting a regression line to 16 values computed from a 25-year moving window from 1965-2004, i.e., 1965-1989, 1966-1990, ..., 1980-2004.

Using the GP distribution, the spatial and temporal variations in $\sigma$ and $\xi$ and their standard errors are evaluated and shown in Figures 41 and 42. $\sigma$ ranges from 5-15 mm in major parts of South America except some parts of the Amazon basin, north Argentina, and Paraguay, where it is more than 15 mm, whereas the standard error of $\sigma$ varies from 0-3 mm in the whole of South America (Figures 41a and 41b). From 1965-2004, $\sigma$ increases in eastern Brazil including Rio de Janeiro and major parts of the Brazilian Highlands, Uruguay, Paraguay, some parts of north Argentina, south Venezuela, French Guiana and Suriname, whereas decreasing trends in $\sigma$ are observed in the Amazon basin, Venezuela, the Mato Grosso Plateau, Catingas, Sao Paulo, and Buenos Aires (Figure 41c). The shape parameter ($\xi$) is mostly greater than zero in the whole of South America except eastern Brazil and north Argentina (Figure 42a). $\xi$ ranges from 0.4-0.6 and 0.2-0.4 in the Catingas and Mato Grosso Plateau, respectively. The standard error of $\xi$ varies from 0.05-0.15 in the whole of South America (Figure 42b). From 1965-2004, the temporal variations in $\xi$ indicate increasing trends in Venezuela, eastern Brazil including Rio de Janeiro and Sao Paulo, and major parts of the Amazon basin, the Brazilian Highlands, Uruguay, Paraguay, and north Argentina including Buenos Aires (Figure 42c).
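The moving-window trend described above can be sketched as follows. This is a minimal illustration, assuming the series is stored as one array of values with a matching array of calendar years, and that the window statistic is the 95%-quantile threshold. The function name `moving_window_trend` and the synthetic gamma-distributed weekly series are hypothetical and are not from the dissertation.

```python
import numpy as np

def moving_window_trend(years, values, stat, start=1965, end=2004, width=25):
    """Slope and R^2 of a regression line through per-window statistics.

    Windows are start..start+width-1, start+1..start+width, and so on;
    for 1965-2004 and width 25 this gives 16 windows.
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    first_years = np.arange(start, end - width + 2)      # 1965, ..., 1980
    window_stats = np.array([
        stat(values[(years >= y0) & (years <= y0 + width - 1)])
        for y0 in first_years
    ])
    slope, intercept = np.polyfit(first_years, window_stats, 1)
    fitted = slope * first_years + intercept
    ss_res = np.sum((window_stats - fitted) ** 2)
    ss_tot = np.sum((window_stats - window_stats.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return window_stats, slope, r2

# Synthetic example: 52 weekly values per year (hypothetical data).
rng = np.random.default_rng(0)
years = np.repeat(np.arange(1965, 2005), 52)
weekly = rng.gamma(shape=2.0, scale=5.0, size=years.size)
thresholds, slope, r2 = moving_window_trend(
    years, weekly, lambda v: np.quantile(v, 0.95))
```

The same function could be applied to any per-window quantity, for example a return level or the PEVI, by passing a different `stat`.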

Individual nations need to make policy decisions about water resources, agricultural planning, infrastructure management and disaster readiness or mitigation strategies. Thus, we present our results by countries in Sections 5.3.1-5.3.6. An investigation of precipitation extremes in conjunction with topography and vegetation, which is presented in Section 5.3.7, can lead to enhanced hydrological and climatological insights.

5.3.1 Brazil

In the Amazon basin, the threshold is larger than in the other parts of Brazil but shows a decreasing trend from 1965-2004 (Figure 43). In the eastern parts of the Amazon basin, 50-year and 200-year RLs and their standard errors are higher than in the other parts of South America, but these RLs decrease more sharply than in the other parts of South America from 1965-2004 (Figures 44 and 45). Both 50-year and 200-year RLs show decreasing trends from 1965-2004 in the whole basin (Figure 45). The PEVI is higher in some eastern parts of the basin but it decreases sharply from 1965-2004 in those parts (Figure 46). We observe increasing PEVI trends from 1965-2004 in the major parts of the basin including the north-west (NW), where it shows sharply increasing trends. The percentage of consecutive 2-day extremes is less than 10% whereas the major parts of the basin have no consecutive 3-day extremes (Figures 47a and 47b). From 1965-2004, the percentage of consecutive 2-day extremes increases only in the western parts of the basin whereas no trends in the percentage of consecutive 3-day extremes are observed because there are no consecutive 3-day extremes in the basin (Figures 47c and 47d).

In Catingas and the Mato Grosso Plateau, thresholds are lower and show decreasing trends from 1965-2004 (Figure 43). Catingas has the highest 50-year RL, 200-year RL, and PEVI but their trends indicate downward behavior from 1965-2004 (Figures 44, 45, and 46). In the Mato Grosso Plateau, the PEVI is higher relative to the major parts of South America and lies between 1.4 and 1.5 but it shows a decreasing trend from 1965-2004 (Figure 46). The numbers of consecutive 2- and 3-day extremes are less than 10% and 2% of the total extremes, respectively, in both Catingas and the Mato Grosso Plateau (Figures 47a and 47b). From 1965-2004, the percentage of consecutive 2-day extremes shows increasing trends in Catingas

In the Brazilian Highlands, the threshold is low but it shows an increasing trend in the southern parts (Figure 43). Thresholds are low in east Brazil except southeastern Brazil, where they are much higher relative to the other parts of South America. From 1965-2004, we observe increasing trends in threshold along the eastern coastal regions of Brazil including Rio de Janeiro, but thresholds show decreasing trends in Brasilia, Sao Paulo and their surrounding regions. 50-year RL, 200-year RL, and PEVI are low in the Brazilian Highlands and east Brazil but they show increasing trends in the major parts of the Brazilian Highlands and east Brazil including Rio de Janeiro and Sao Paulo (Figures 44, 45, and 46). In Brasilia, decreasing trends in PEVI are observed from 1965-2004 (Figure 46). 20-35% and 6-16% of the total extremes occur consecutively for 2 and 3 days, respectively, in the Brazilian Highlands and north-east (NE) Brazil (Figures 47a and 47b). In Rio de Janeiro and Sao Paulo, 15-20% of the total extremes occur for 2 days consecutively (Figure 47a). From 1965-2004, the numbers of consecutive 2- and 3-day extremes show increasing trends in NE Brazil, few parts of the Brazilian Highlands and east Brazil including Sao Paulo, but they decrease in Rio de Janeiro (Figures 47c and 47d). In Brasilia, the numbers of consecutive 2-day and 3-day extremes vary from 20-25% and 6-8%, respectively, and they show decreasing trends from 1965-2004 (Figure 47).
The Amazon basin experiences most of the extremes from January to April, with March being the month most prone to extremes, while it receives most of the rainfall from December to May (Figure 48). The wetter months in NE Brazil and Catingas are from December to May but they receive most of the extremes from January to April, with March receiving the highest number of extremes. The Mato Grosso Plateau experiences most extremes from December to February, with January receiving the highest number of extremes. The Brazilian Highlands and south-east (SE) Brazil receive most of the rainfall from November to April but the highest number of extremes is observed in the summer months, i.e., December to February, with January being the most critical with respect to the number of extremes.

5.3.2 North Argentina

Some parts of NE Argentina including Buenos Aires have higher thresholds relative to the other parts of South America (Figure 43). From 1965-2004, increasing trends in threshold are observed in the major parts of north Argentina excluding Buenos Aires. 50-year and 200-year RLs do not show much variation and their trends from 1965-2004 show increasing behavior in the major parts excluding Buenos Aires (Figures 44 and 45). The PEVI ranges from 1.1-1.2 in the major parts but it also lies between 1.2 and 1.3 in some parts

In Venezuela, thresholds are low but they show increasing trends from 1965-2004 everywhere except in some southern parts (Figure 43). We do not observe much variation in 50-year and 200-year RLs, but these RLs show increasing trends from 1965-2004 only in the northern parts including Caracas (Figures 44 and 45). The PEVI is between 1.2 and 1.3 everywhere except in the NW including Caracas, where it is high and ranges from 1.3-1.6 (Figure 46). Increasing trends in PEVI are observed from 1965-2004 in the major parts including Caracas. Some northern parts including Caracas receive 5-10% of the total extremes for 2 days consecutively (Figure 47a). Venezuela does not experience extremes occurring for 3 days consecutively (Figure 47b). From 1965-2004, the number of consecutive 2-day extremes shows increasing trends only in south Venezuela whereas no trends are observed in consecutive 3-day extremes in the major parts since these parts do not receive consecutive 3-day extremes (Figures 47c and 47d). Decreasing trends in consecutive 3-day extremes are observed from 1965-2004 only in east Venezuela. The main rainy season in Venezuela is from May to November and it receives the largest number of extremes from June to August, with June being the most critical with respect to the number of extremes (Figure 48).

5.3.4 Uruguay

Uruguay has high thresholds in South America and their trends from 1965-2004 indicate increasing levels in most parts except Montevideo and its surrounding regions (Figure 43). We do not observe much variation in 50-year and 200-year RLs and their trends show increasing behavior from 1965-2004 everywhere except in Montevideo and its surrounding areas (Figures 44 and 45). The PEVI ranges from 1.2-1.3 everywhere

In Paraguay, thresholds are high and they show increasing trends from 1965-2004 in most parts including Asuncion (Figure 43). 50-year RLs do not vary much whereas some variations are observed in 200-year RLs (Figure 44). Increasing trends in both 50-year and 200-year RLs are observed from 1965-2004 but these trends increase more rapidly in Asuncion and its surrounding areas than in the other parts (Figure 45). The PEVI varies from 1.1-1.3 everywhere except in Asuncion and its surrounding regions, where it varies from 1.3-1.4 (Figure 46a). From 1965-2004, the PEVI increases in the major parts but it is increasing more rapidly in Asuncion and its surrounding areas than in the other parts (Figure 46b). The number of extremes occurring consecutively for 2 days is less than 10% of the total extremes whereas the major parts of the country do not receive consecutive 3-day extremes (Figures 47a and 47b). Decreasing trends in both consecutive 2- and 3-day extremes are observed from 1965-2004 (Figures 47c and 47d). In Asuncion, less than 5% and 2% of the total extremes occur for 2 and 3 days consecutively, respectively, and their trends show decreasing behavior from 1965-2004. Paraguay receives heavy rainfall in summer, i.e., October to March, and experiences the largest number of extremes from December to February, with December receiving the highest number of extremes (Figure 48).

5.3.6 Suriname and French Guiana

In Suriname and French Guiana, thresholds are low but their trends from 1965-2004 show sharply increasing behavior everywhere including Paramaribo and Cayenne (Figure 43). No variations in 50-year and 200-year RLs are observed but trends in 50-year and 200-year RLs decrease in Suriname and increase in some parts of French Guiana including Cayenne (Figures 44 and 45). The PEVI lies between 1.2 and 1.4 in Suriname whereas it ranges from 1.2-1.3 in French Guiana (Figure 46a). From 1965-2004, the PEVI shows decreasing trends in both Suriname and French Guiana except Cayenne, where the PEVI increases (Figure 46b). The number of consecutive 2-day extremes varies from 1-20% in Suriname, with Paramaribo receiving 15-20%

In the mid- and high-altitudes of the Brazilian Highlands and east Venezuela, the PEVI varies from 1.1-1.4 and shows increasing trends from 1965-2004 in some areas (Figure 46a). The high-altitudes of the Brazilian Highlands receive 20-35% and 6-16% of the total extremes consecutively for 2 and 3 days, respectively, and their trends show increasing behavior from 1965-2004 in some parts (Figure 47). But in the high-altitudes of east Venezuela, consecutive 3-day extremes are not observed and less than 5% of the total extremes occur consecutively for 2 days, with trends that show decreasing behavior from 1965-2004. The mid-altitudes and lowlands of east Brazil show very little variation in PEVI, which lies between 1.1 and 1.2, but most of eastern Brazil, which includes Rio de Janeiro and Sao Paulo, experiences increasing trends in PEVI from 1965-2004 (Figure 46). In east Brazil, the numbers of consecutive 2- and 3-day extremes range from 10-30% and 1-14%, respectively, and their trends show increasing behavior from 1965-2004 only in some parts including Sao Paulo (Figure 47). Catingas with lowlands has the highest PEVI in South America whereas the lowlands of the Amazon basin and the Mato Grosso Plateau have higher PEVI values (Figure 46a). In the lowlands of Venezuela, north Argentina, Uruguay, Paraguay, Suriname, and French Guiana, the PEVI lies between 1.1 and 1.3 (Figure 46a). From 1965-2004, the major parts of the lowlands of the Amazon basin, north Venezuela, north Argentina, Uruguay, and Paraguay experience increasing trends in PEVI whereas decreasing trends in PEVI are observed in the lowlands of the Mato Grosso Plateau, Suriname, and French Guiana except Cayenne (Figure 46b). All the lowland regions of Brazil, north Argentina, Venezuela, Paraguay, Uruguay, Suriname and French Guiana experience less than 10% of the total extremes for 2 days consecutively whereas the number of consecutive 3-day extremes is zero in most of these areas (Figures 47a and 47b). Only some of the lowland areas of the Amazon basin, particularly western parts of the basin, the Mato

In the evergreen forests of the Amazon basin, south Venezuela, Suriname, and French Guiana, the PEVI ranges from 1.1-1.3 (Figure 46a). Catingas with evergreen forest has the highest PEVI in South America. Increasing trends in PEVI are observed only in some parts of the Amazon basin and south Venezuela (Figure 46b). In the Amazon basin, Catingas, and south Venezuela, the number of consecutive 2-day extremes is less than 10% of the total extremes and its trends show increasing behavior from 1965-2004 (Figures 47a and 47c). In Suriname, the number of consecutive 2-day extremes varies from 1-20% and shows decreasing trends from 1965-2004 whereas it ranges from 10-15% and indicates increasing trends from 1965-2004 in French Guiana. The number of consecutive 3-day extremes is zero in the evergreen forests of Catingas, south Venezuela, and some parts of the Amazon basin, and shows decreasing trends in Suriname and increasing trends in the major parts of French Guiana from 1965-2004 (Figures 47b and 47d).

In the savannas of north Venezuela, the PEVI ranges from 1.2-1.6 and shows increasing trends from 1965-2004 while the number of consecutive 2-day extremes is less than 10% and shows decreasing trends from 1965-2004 (Figures 46 and 47). In the cropland/natural vegetation of the Brazilian Highlands, the Mato Grosso Plateau, east Brazil, north Argentina, Uruguay, and Paraguay, the PEVI is low, lying between 1.1 and 1.3, and shows increasing trends from 1965-2004 in some of their areas (Figure 46). The Brazilian Highlands and NE Brazil experience 20-35% and 6-16% of the total extremes consecutively for 2 and 3 days, respectively, and their trends show increasing behavior from 1965-2004 in some areas (Figure 47). Less than 10% of the total extremes occur consecutively for 2 days in the Mato Grosso Plateau, SE Brazil, north Argentina, Uruguay, and Paraguay but their trends increase from 1965-2004 in some parts of the Mato Grosso Plateau and north Argentina. The number of consecutive 3-day extremes is less than 2% in SE Brazil, some parts of the Mato Grosso Plateau, north Argentina, Uruguay, and Paraguay. From 1965-2004, trends in consecutive 3-day extremes decrease in SE Brazil, the Mato Grosso Plateau, Uruguay, and Paraguay but increase in some parts of north Argentina.
Caution should be exercised while interpreting the results at all those grid points where

This dissertation analyzed the spatial and temporal variability of precipitation extremes in South America based on daily precipitation data available in 2.5° gridded fields from 1940-2004. At each grid point, 65 years of data from 1940-2004 were used to understand spatial variability whereas the temporal variability was investigated for 40 years (1965-2004) and was given as the slope of the linear trend obtained by fitting a regression line to 16 values computed from a 25-year moving window from 1965-2004, i.e., 1965-1989, 1966-1990, ..., 1980-2004. We analyzed weekly precipitation maxima residuals and utilized the Poisson-GP model to investigate the spatial and temporal variability of threshold, the scale ($\sigma$) and shape ($\xi$) parameters, and 50-year and 200-year RLs. The temporal variability of precipitation was evaluated from the temporal variability of thresholds. The threshold was chosen as the 95%-quantile of the time series. We also investigated the spatial and temporal variability of the PEVI, which measures the variability of extremes and is defined as the ratio of 200-year and 50-year RLs. Based on daily precipitation data, we investigated the spatial and temporal variability of the percentage of the number of consecutive 2- and 3-day extremes out of the total number of extremes. The spatial variability of the percentage of the number of extremes in a particular month out of the total number of extremes was also investigated based on daily precipitation data. The threshold for the analysis of daily precipitation data was chosen as the 99%-quantile of the time series.
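The return levels and the PEVI summarized above can be sketched as follows. This is a minimal illustration, assuming the standard GP return-level expression, with the PEVI taken as the ratio of the 200-year to the 50-year RL as stated in the text. The function names and the parameter values in the example (threshold, scale, shape, 52 weekly observations per year, 5% exceedance probability) are hypothetical and are not results from the dissertation.

```python
import math

def gp_return_level(u, sigma, xi, n_years, n_per_year, zeta_u):
    """N-year return level of a Poisson-GP model.

    u          threshold
    sigma, xi  GP scale and shape parameters
    n_years    return period N in years
    n_per_year number of observations in a year (n_y)
    zeta_u     probability that an observation exceeds u (k/n)
    """
    m = n_years * n_per_year * zeta_u
    if abs(xi) < 1e-12:                       # limiting case xi = 0
        return u + sigma * math.log(m)
    return u + (sigma / xi) * (m ** xi - 1.0)

def pevi(u, sigma, xi, n_per_year, zeta_u):
    """PEVI taken as the ratio of the 200-year to the 50-year return level."""
    rl200 = gp_return_level(u, sigma, xi, 200, n_per_year, zeta_u)
    rl50 = gp_return_level(u, sigma, xi, 50, n_per_year, zeta_u)
    return rl200 / rl50

# Hypothetical parameters for one grid point (weekly data, 95%-quantile threshold).
rl50 = gp_return_level(u=30.0, sigma=10.0, xi=0.1, n_years=50,
                       n_per_year=52, zeta_u=0.05)
rl200 = gp_return_level(u=30.0, sigma=10.0, xi=0.1, n_years=200,
                        n_per_year=52, zeta_u=0.05)
index = pevi(u=30.0, sigma=10.0, xi=0.1, n_per_year=52, zeta_u=0.05)
```

For a positive threshold and positive return levels the ratio is above 1, and it grows as the shape parameter increases, so a heavier tail gives a larger PEVI.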
Precipitation is high, as indicated by high thresholds, in SE Brazil, Uruguay, Paraguay, and Buenos Aires. The PEVI is high in the eastern parts of the Amazon basin, Catingas, the Mato Grosso Plateau, NW Venezuela including Caracas, and Asuncion. From 1965-2004, both precipitation and the PEVI show increasing trends in the eastern coastal regions of Brazil including Rio de Janeiro, the Brazilian Highlands particularly the southern parts, north Venezuela including Caracas, some parts of north Argentina, Uruguay, Paraguay including Asuncion, and Cayenne. The Amazon basin except its eastern parts and Sao Paulo experience increasing trends in the PEVI and decreasing trends in precipitation. In the eastern parts of the Amazon basin, Catingas, the Mato Grosso Plateau, Brasilia, Buenos Aires, and Montevideo, simultaneous decreasing trends are observed in precipitation and the PEVI. The PEVI shows decreasing trends in Suriname including Paramaribo and French Guiana excluding Cayenne although increasing precipitation trends are observed in these areas. The numbers of consecutive 2- and 3-day extremes are high in the Brazilian Highlands, NE Brazil, and Brasilia. Trends in precipitation and the number of both consecutive 2- and 3-day extremes increase in few parts of

The areas of interest, based on an increasing PEVI from 1965-2004, are the Amazon basin, the Brazilian Highlands, Venezuela, Uruguay, Paraguay, and some of the highly populated cities in South America, specifically Rio de Janeiro, Sao Paulo, Caracas, Asuncion, and Cayenne. Some parts of east Brazil, few parts of the Brazilian Highlands, Sao Paulo, and Cayenne also experience an increasing number of consecutive 2- and 3-day extremes. Water resources engineers and planners, disaster management agencies, and policy makers need to pay special attention to the regions with increasing trends in the PEVI and consecutive 2- and 3-day daily extremes, especially when these regions overlap with densely populated areas, while planning for infrastructure development and disaster management. Civil engineers can utilize the results of this dissertation for the design of hydraulic structures, specifically when considering the optimal safety factors in their design. Hydrologists and climatologists need to delve deeper into the potential causes of the observed spatio-temporal trends in extremes for delineating the variability of extremes due to natural and anthropogenic effects.

Precipitation extremes may result in significant loss of human life and property. However, the damages caused by precipitation can be influenced by a variety of factors other than just precipitation maxima or the statistical properties thereof. These factors include surface and sub-surface hydrology since the damages caused by precipitation extremes are primarily caused by floods and flash floods, which, in turn, are strongly influenced by the physics of runoff and infiltration. The other factor is population: certainly the (catastrophic) impact of disasters depends on (high) population densities and the location of critical infrastructures or national/human assets which may be potentially damaged by precipitation extremes. Finally, the actual damages would also be a function of the resilience of communities and critical infrastructures to precipitation extremes and related disasters.

The ability to detect the chaotic signal from a finite time series observation of hydrologic systems is addressed in this paper. The presence of random and seasonal components in hydrological time series, like rainfall or runoff, makes the detection process challenging. Tests with simulated data demonstrate the presence of thresholds, in terms of noise to chaotic-signal and seasonality to chaotic-signal ratios, beyond which the set of currently available tools is not able to detect the chaotic component. The investigations also indicate that the decomposition of a simulated time series into the corresponding random, seasonal and chaotic components is possible from finite data. Real streamflow data from the Arkansas and Colorado rivers are used to validate these results. Neither of the raw time series exhibits chaos. While a chaotic component can be extracted from the Arkansas data, such a component is either not present or cannot be extracted from the Colorado data. This indicates that real hydrologic data may or may not have a detectable chaotic component. The strengths and limitations of the existing set of tools for the detection and modeling of chaos are also studied.

6.1 Introduction

The presence of nonlinear dynamics and chaos has strong implications for predictive modeling and the analysis of dominant processes in any discipline. The existence of chaotic behavior has been demonstrated in diverse areas ranging from turbulence [119], weather or climate [67,120] and geophysics [54,123], to biology or medicine [127], finance [128], and electrical circuits [131]. The presence of chaos in hydrology has been suggested by previous researchers [48]. The ability to detect and model chaotic behavior from finite hydrologic time series has recently been debated [65,66].

Characterization of chaos from real-world observations is known to be a difficult problem in nonlinear dynamics [67]. The complexity was highlighted in the context of climate models by [70], who demonstrated that sensitivity to initial conditions may become less apparent when the randomness in internal atmospheric variables begins to dominate. Fundamental questions still remain unanswered in these areas, for example the ability to detect chaos from a finite time series with random and seasonal components, the ability to

This paper investigates the ability of nonlinear dynamical tools to detect, characterize, and predict chaos from finite hydrologic observations, using both simulated and real time series. Realistic simulated data are generated by contaminating chaotic signals with random and seasonal components, while real streamflow data are used from the Arkansas and Colorado rivers. The correlation dimension method is used for detecting the possible presence of chaos. Nonlinear predictive models, namely the phase-space reconstruction (PSR) and artificial neural networks (ANN), are employed for time series decomposition and prediction. This dissertation develops several new insights and interesting results. The presence of thresholds for the detectability of chaos is demonstrated, specifically when a chaotic signal is mixed with random or seasonal signals, or a combination thereof. These thresholds can be expressed in terms of the relative dominance of the chaotic component compared to the random or seasonal components. The ability to decompose a time series into the contributions from the individual components (random, seasonal and chaotic) is shown. The corresponding implications for predictive modeling and characterizing the nonlinear dynamics are highlighted. Real streamflow data analysis provides additional insights. First, not all hydrologic time series contain chaotic components which can be detected and modeled. Second, for certain finite hydrologic time series, the presence of chaos can indeed be detected, isolated from random and seasonal components, and utilized for predictive modeling.
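The simulated-data construction described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the dissertation's own code: the Lorenz parameters (sigma = 10, r = 28, b = 8/3) follow the data description in this chapter, whereas the fourth-order Runge-Kutta integrator, the step size `dt`, the initial condition, the burn-in length, and the seasonal period expressed in samples are hypothetical choices made only for the example. The function names `lorenz_x` and `contaminate` are likewise illustrative.

```python
import math
import random


def lorenz_x(n, dt=0.01, sigma=10.0, r=28.0, b=8.0 / 3.0, burn_in=1000):
    """Integrate the Lorenz system with classical RK4; return n samples of x."""

    def deriv(state):
        x, y, z = state
        return (sigma * (y - x), -x * z + r * x - y, x * y - b * z)

    def step(state):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        return tuple(
            s + dt / 6.0 * (a + 2.0 * bb + 2.0 * c + d)
            for s, a, bb, c, d in zip(state, k1, k2, k3, k4)
        )

    state = (1.0, 1.0, 1.0)  # arbitrary initial condition (assumption)
    for _ in range(burn_in):  # discard the initial transient
        state = step(state)
    xs = []
    for _ in range(n):
        state = step(state)
        xs.append(state[0])
    return xs


def contaminate(chaotic, noise_std=0.0, season_amp=0.0, season_period=50, seed=0):
    """Add Gaussian white noise and a cosine seasonal term to a chaotic series."""
    rng = random.Random(seed)
    return [
        c
        + rng.gauss(0.0, noise_std)
        + season_amp * math.cos(2.0 * math.pi * t / season_period)
        for t, c in enumerate(chaotic)
    ]
```

For instance, `contaminate(lorenz_x(1000), noise_std=0.16)` gives a 1000-point mixture of the Lorenz X component and white noise; varying `noise_std` or `season_amp` changes the noise to chaotic-signal or seasonality to chaotic-signal ratio.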
The rest of the paper is organized as follows. Section 2 presents the tools and methods employed in this dissertation. The simulated and real hydrological data are discussed in Section 3. Sections 4 and 5 present and discuss the results obtained with simulated and real hydrological time series, respectively. The summary and conclusions of this paper are presented in Section 6.

6.2 Tools and methods

6.2.1 State of the art and literature review: tools and concepts

The theoretical concepts underlying the methodologies for the detection and modeling of nonlinear dynamical and chaotic components, as well as their implementation, are available in the literature [64,132]. The present dissertation exploits the correlation dimension method, as well as nonlinear predictive models like

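where $\Theta$ is the Heaviside step function; $N$ is the number of points in the time series; and $r$ is the radius of a sphere with its center at either of the current points, $z_i$ or $z_j$. The relation between the correlation sum $C(r)$ given by Eq. (1) and the correlation exponent $\nu$ is expressed as

$$C(r) \approx c\, r^{\nu}$$

where $c$ is a constant and $\nu = \lim_{r \to 0} \dfrac{\mathrm{Ln}\, C(r)}{\mathrm{Ln}\, r}$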

[137] observed that the finite correlation exponent achieved using the correlation dimension method is not a good indicator of the presence of chaos since linear stochastic processes may also yield a finite correlation exponent. The determination of the correlation exponent is greatly influenced by several factors including limited data size, the presence of noise, delay time, and the presence of a large number of zeros in the dataset. Hydrological time series are finite, contaminated with noise, and may contain a large number of zeros. A finite and small dataset produces a smaller scaling region. This may not be sufficient to calculate the slope of the Ln C(r) vs. Ln r curve and may result in an underestimation of the correlation exponent. A large scaling region may be better delineated when a large dataset is used which, in turn, results in a better estimation of the slope. [138] suggested that the minimum number of data points required for the correct estimation of the correlation exponent is $N_{min} = 10^{2+0.4m}$, where $m$ is the embedding dimension. The presence of noise may affect the scaling behavior and may tend to make the slope of the Ln C(r) vs. Ln r plot larger for small values of r, resulting in an overestimation of the correlation exponent. If the delay time, $\tau$, is too small, the phase space may contain very little information and may result in an underestimation of the correlation exponent. If $\tau$ is too large, the phase space may miss out nearby diverging trajectories, resulting in an overestimation of the correlation exponent. The presence of a large number of zeros in the time series produces a phase space with limited information about the underlying dynamics and results in an underestimation of the correlation exponent.

6.2.1.2 Artificial neural networks (ANNs)

As the number of input nodes in the ANN is increased, the prediction skills increase up to a certain point and then become constant. A large number of input nodes reflects the use of additional lagged variables to model the time series. When the number of inputs is too low, the information content in the lagged values is not adequately captured. This results in a higher bias and hence lower prediction skills. When the number is too large, the functional form to be modeled grows complex, leading to a larger error variance and corresponding decay in skills. This bias-variance tradeoff (and related issues) usually results in an optimal number of nodes where the skills attain a maximum. In certain cases, the number of input values, representing the number of lagged variables of a time series, at which the skills attain a maximum has been equated with the number of variables influencing the dynamical system (e.g., [59]).

6.2.1.3 Phase-space reconstruction (PSR) prediction

In the present dissertation, the local approximation approach suggested by [143] is used to determine $\hat{F}_T$. This approach reduces the complexity of this function by dividing its domain into many neighborhoods and estimating an approximation map for each neighborhood. As suggested by [126], $\hat{F}_T$ can be modeled as a

where $(z^{1}_{1}, \ldots, z^{1}_{1-(m-1)\tau}), \ldots, (z^{k}_{k}, \ldots, z^{k}_{k-(m-1)\tau})$ are the $k$ states nearest to the current state, $Z_i$, and are computed based on the Euclidean distance measure $\|Z_i - Z_j\|$, where $Z_j = (z^{j}_{j}, \ldots, z^{j}_{j-(m-1)\tau})$; $b_1, \ldots, b_m$ are the function coefficients; and $z^{1}_{1+T}, \ldots, z^{k}_{k+T}$ are the $k$ predicted data points. Using function coefficients, $b_i$, from Eq. (4), the predicted value for the current state is given as $\hat{z}_{i+T} = \sum_{r=1}^{m} z_{i-(r-1)\tau}\, b_r$, where $\tau$ is the delay time. In the present dissertation, the values employed for $k$ and $\tau$ are 25 and 1, respectively. The prediction accuracy is represented in terms of CC and MSE between the original data and the predicted data. It has been suggested [59] that the embedding dimension at which the CC vs. m curve saturates may indicate the minimum number of variables required to capture the dynamics of the system.

6.3 Data description

The effectiveness of the methods described in section 2.1 in terms of their ability to detect the presence of low-dimensional chaos, as well as for short-term predictions, is investigated. This is accomplished through an analysis of both simulated and real hydrologic data.

6.3.1 Simulated data

Hydrological time series are always finite and may be contaminated with noise and seasonality. To obtain realistic insights, simulated data is generated by contaminating chaotic time series with noise, i.e. white and autoregressive, and seasonality. The chaotic time series is represented here by the Lorenz system of equations as

$$\frac{dx}{dt} = \sigma (y - x); \quad \frac{dy}{dt} = -xz + rx - y; \quad \frac{dz}{dt} = xy - bz \qquad (40)$$

where $\sigma = 10$; $r = 28$; and $b = 8/3$. The Lorenz system is highly sensitive to the initial conditions and is, therefore, chaotic. Seasonality implies periodicity which, in turn, could broadly include

Real hydrologic series including monthly streamflow time series of the Arkansas river at Little Rock and daily streamflow time series of the Colorado river below Parker dam are investigated. The raw data is obtained from the U.S. Geological Survey site.

The presence (or absence) of low-dimensional chaos in a short hydrological (monthly streamflow) time series at the Arkansas river at Little Rock in Arkansas is first studied. The Arkansas River is the fourth longest river in the United States and is a tributary of the Mississippi which flows east and southeast through Colorado, Kansas, Oklahoma and the state of Arkansas. It is located at Latitude 34°45'00" and Longitude 92°16'25". The associated drainage area is 158,090 square miles while the contributing drainage area is 135,849 square miles. The temperature at Little Rock ranges from a mean low of 40°F in January to a mean high of 81°F in July. The mean annual precipitation at Little Rock is 50.26 inches. The streamflow data of 43 years (October 1927-September 1970) is analyzed. The monthly streamflow time series at the Arkansas river is shown in Fig. 50.

The daily streamflow data observed at the Colorado river below Parker dam, Arizona-California is also investigated for the existence of chaotic behavior. The Colorado River flows through Colorado, New Mexico, Utah, California, Arizona and Nevada. It drains a part of the arid regions on the western slopes of the Rocky

Statistics of the streamflow data observed at the Arkansas and Colorado rivers are given in Table 12.

Table 12. Streamflow data statistics (values in m^3/s)

Parameter            Arkansas river   Colorado river
Data points          516              1461
Number of zeros      0                0
Mean                 0.6529           0.1672
Standard deviation   0.684            0.0594
Variance             0.4678           0.00353
Maximum value        4.7571           0.2819
Minimum value        0.0187           0.0334

6.4 Results with simulated data

6.4.1 Pure chaotic, random and seasonal time series

In this dissertation, a chaotic time series is represented by the X component of Eq. (5); a random series is a normally-distributed white noise with $\mu = 0$ and $\sigma = 0.16$; and a seasonal series is represented by a periodic function, i.e. a cosine function with a frequency of 10 Hz and an amplitude of 10. Each series has 1000

Figure 53. Relation between correlation exponent and embedding dimension for Lorenz (X component), seasonal, and white noise series.
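A correlation exponent vs. embedding dimension relation of the kind shown in this figure can be approximated with a short routine. The sketch below is an illustration only: it uses the standard pairwise correlation sum with the Euclidean norm and takes the exponent as a least-squares slope of Ln C(r) against Ln r over a user-supplied list of radii. The dissertation instead reads the slope from a moving window of the curve, so the fitting rule here is a swapped-in simplification, and the names `embed`, `correlation_sum` and `correlation_exponent` are hypothetical.

```python
import math


def embed(series, m, tau=1):
    """Delay-embed a scalar series into m-dimensional state vectors."""
    start = (m - 1) * tau
    return [
        tuple(series[i - j * tau] for j in range(m))
        for i in range(start, len(series))
    ]


def correlation_sum(states, radius):
    """Fraction of distinct state pairs closer than `radius` (Euclidean norm)."""
    n = len(states)
    count = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if math.dist(states[i], states[j]) < radius:
                count += 1
    return 2.0 * count / (n * (n - 1))


def correlation_exponent(series, m, radii, tau=1):
    """Least-squares slope of Ln C(r) vs. Ln r over the supplied radii."""
    states = embed(series, m, tau)
    points = []
    for radius in radii:
        c = correlation_sum(states, radius)
        if c > 0.0:  # Ln C(r) is undefined when no pair falls inside the radius
            points.append((math.log(radius), math.log(c)))
    if len(points) < 2:
        raise ValueError("need at least two radii with a non-zero correlation sum")
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx
```

Calling `correlation_exponent(series, m, radii)` for m = 1, 2, ... and plotting the result against m gives a curve that can be inspected for saturation. The pairwise loop is quadratic in the series length, which is acceptable for series of about 1000 points.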

As described earlier and indicated by previous researchers [60,136,144], hydrologic and other real systems tend to generate observables that have random and seasonal components, in addition to any nonlinear deterministic (or chaotic) signal that may be present. Simulated data are generated using a mixture of these components to understand the ability to identify, characterize and quantify chaos from amidst seasonality and noise through the commonly used tools for nonlinear dynamics and chaos.

6.4.2.1 Mixture of chaotic and seasonal series

Org. σ_wn   Est. σ_wn   % error   MSE        CC
0.075       0.0910      21.3      0.0083     0.9999
0.1         0.1254      25.4      0.0157     0.9999
0.2         0.2318      15.9      0.0537     0.9997
0.33        0.3608      9.33      0.1301     0.9992
1.0         1.1985      19.85     1.4349     0.9913
2.0         2.6403      32.01     6.9706     0.9587
3.0         4.1341      37.8      17.0917    0.9022
5.0         6.5895      31.8      43.4134    0.7806
10.0        12.624      26.24     159.3665   0.4948

Figure 58. Correlation exponent vs. embedding dimension plot for mixed time series consisting of Lorenz (X component) and white noise.

The separation of the nonlinear deterministic signal from the random component is now considered. The separation is obtained through the use of predictive modeling strategies that have demonstrated value for chaotic systems. Several assumptions are made for this purpose. First, it is assumed that the functional form encapsulated by the trained nonlinear predictive models can be utilized to model the deterministic component of the time series. Thus, a decomposition of the series into a deterministic and a random component is realized. Second, it is assumed that the chaotic-signal is contained within the deterministic component and adequately modeled by the predictive models. Thus, if a functional form $\hat{f}_{NN}$ modeled by the ANN exists, where $\hat{z}_{t+T} = \hat{f}_{NN}(z_t, z_{t-\tau}, \ldots, z_{t-(m-1)\tau}) + \epsilon$, the ANN can be trained using the entire dataset to find $\hat{f}_{NN}$. Once trained, the estimate of $\hat{z}_t$ provided by the ANN is assumed to represent the deterministic component, and the residual is assumed to represent noise. This assumption holds as long as the functional form modeled by the predictive model is valid. The second assumption implies that the chaotic-signal, if any, is contained within the deterministic signal isolated using the predictive modeling-based decomposition strategy. It is noted, however, that this approach does not distinguish between chaotic and non-chaotic determinism. Once

Table 14. Separation of white noise from a mixture of Lorenz (X component) and white noise using the PSR with m = 10.

Org. σ_wn   Est. σ_wn   % error   Est. μ_wn   Interval for μ_wn    Interval for σ_wn   MSE      CC
0.075       0.0861      14.8      -0.0012     -0.0066 to 0.0041    0.0825 to 0.0901    0.0074   0.9999
0.1         0.1066      6.6       -0.0022     -0.0088 to 0.0045    0.1021 to 0.1115    0.0114   0.9999
0.2         0.2008      0.4       0.0         -0.0135 to 0.0115    0.1924 to 0.2101    0.0403   0.9997
0.33        0.3502      6.12      -0.0011     -0.0229 to 0.0208    0.3355 to 0.3664    0.1226   0.9991

Table 15. Separation of white noise from a mixture of Lorenz (X component) and white noise using the ANN with m = 10.

Org. σ_wn   Est. σ_wn   % error   Est. μ_wn   Interval for μ_wn    Interval for σ_wn   MSE      CC
0.075       0.0871      16.1      0.0         -0.0054 to 0.0054    0.0834 to 0.0911    0.0076   0.9999
0.1         0.1129      12.9      0.0         -0.007 to 0.007      0.1081 to 0.1181    0.0138   0.9999
0.2         0.2238      11.9      0.0         -0.014 to 0.014      0.2144 to 0.2341    0.05     0.9996
0.33        0.3541      7.3       0.0         -0.0221 to 0.0221    0.3392 to 0.3704    0.1391   0.9991
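The PSR-based separation summarized in these tables can be illustrated with a small local linear predictor. The following is a sketch under assumptions rather than the dissertation's implementation: for each state it fits the coefficients b_1, .., b_m by least squares over the k nearest states (Euclidean distance) and treats the prediction residual as the noise estimate. The defaults m = 10, k = 25 and tau = 1 follow the values quoted in this chapter; the tiny `ridge` term, the normal-equation solver, and the function names `solve`, `psr_predict` and `decompose` are hypothetical additions for numerical stability and illustration.

```python
import math


def solve(a, y):
    """Solve the square system a b = y by Gaussian elimination with pivoting."""
    n = len(y)
    aug = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    b = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * b[c] for c in range(r + 1, n))
        b[r] = s / aug[r][r]
    return b


def psr_predict(series, m=10, k=25, tau=1, horizon=1, ridge=1e-8):
    """Local linear T-step-ahead forecasts; returns (target indices, predictions)."""
    first = (m - 1) * tau
    last = len(series) - horizon  # only states whose successor is observed
    states = {i: [series[i - r * tau] for r in range(m)] for i in range(first, last)}
    idx, preds = [], []
    for i, zi in states.items():
        # k nearest neighbours of the current state, excluding the state itself
        near = sorted(
            (j for j in states if j != i),
            key=lambda j: math.dist(zi, states[j]),
        )[:k]
        # normal equations for z_{j+T} ~ sum_r b_r * z_{j-(r-1)tau}
        ata = [
            [
                sum(states[j][p] * states[j][q] for j in near)
                + (ridge if p == q else 0.0)
                for q in range(m)
            ]
            for p in range(m)
        ]
        aty = [sum(states[j][p] * series[j + horizon] for j in near) for p in range(m)]
        b = solve(ata, aty)
        idx.append(i + horizon)
        preds.append(sum(zr * br for zr, br in zip(zi, b)))
    return idx, preds


def decompose(series, **kwargs):
    """Split a series into a predicted (deterministic) part and a residual."""
    idx, preds = psr_predict(series, **kwargs)
    residual = [series[i] - p for i, p in zip(idx, preds)]
    return preds, residual
```

The mean and standard deviation of the returned residual are the quantities that would be compared with the known noise level in a simulated mixture. Whether the residual is actually noise depends on the assumptions stated in the text about the fitted functional form.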

Figure 60. Ln C(r) vs. Ln r plot for the series, after removing noise, observed at the Arkansas river. The curves are shown in ascending order of embedding dimension, m = 1, 2, .., 20 from top to bottom.

The monthly streamflow time series at the Arkansas river is shown in Fig. 50 and is analyzed using the correlation dimension method to detect the presence (or absence) of low-dimensional chaos in the system. The methodology used for the analysis is described in section 2.1.1. A moving window of size 3 units is used to compute the slope. The number of Ln r values for which the slope is available and constant is limited for this data. The slope at a middle range (around Ln r = 0.8) is used as the correlation exponent. Fig. 59 shows the correlation exponent vs. m plot for the original monthly streamflow series which does not saturate based either on an eye estimation or the objective techniques described in section 2.1.1. This indicates that either the system is stochastic or the chaotic component, if present, is dominated by noise or seasonality. The time series is analyzed using the PSR with m = 10 to isolate the deterministic component from the original series. The methodology is similar to that used for simulated data, and is described in detail in section 4.2. The mean and standard deviation of the non-deterministic component, i.e. noise, separated using the PSR are 0.0097 and 0.7159, respectively. The mean and standard deviation of the original series and the series after the removal of noise, i.e. deterministic series, are $\mu = 0.6529$; $\sigma = 0.6833$ and $\mu = 0.5908$; $\sigma = 0.6564$, respectively. In the present example, the ratio of $\sigma_n$ (0.7159) and $\sigma_c$ (0.6529) is 1.09 which is greater than the threshold value (i.e., 0.02) obtained for the simulated data in section 4.2.2. An attempt to fit a periodic function to the deterministic series to isolate seasonal behavior is not made since no dominant frequency in the series is observed from frequency domain analysis. The deterministic series is analyzed using the correlation dimension method with the same embedding dimension, i.e. 10, and window size, i.e. 3, to examine low-dimensional chaos in the system. Fig. 60 shows the Ln C(r) vs. Ln r plot for the series. The correlation exponent vs. m plot in Fig. 59 for the deterministic series shows that the correlation exponent saturates after an embedding dimension of 15 based on the objective techniques. The saturation value of the correlation exponent is 3.97. The number of variables required to model the streamflow dynamical system would be 4. The low value of correlation dimension suggests the possible presence of low-dimensional chaos in the streamflow dynamics. Fig. 61 shows one-step, two-step, and three-step ahead predictions for the original monthly streamflow series at the Arkansas river. Fig. 62 shows MSE vs. forecast lead time plots for the original and deterministic series. The MSE values at forecast lead times of 7 and 8 are 27.22 and 315.27, respectively. For the original series, MSE values are high and increase as the forecast lead time increases, indicating the presence of a non-deterministic component, confirming the results obtained earlier

The daily streamflow series observed at the Colorado river below Parker dam is shown in Fig. 51. This series is analyzed for low-dimensional chaotic behavior using the correlation dimension method. The slope is calculated by considering a moving window of size 3 units. The correlation exponent is the slope value at Ln r = 2.2. The correlation exponent vs. m plot for the original series is shown in Fig. 63. Based on an eye estimation and the objective techniques described in section 2.1.1, it is seen that the correlation exponent does not reach a saturation value. This indicates that either the system is stochastic or it contains some dominant noise/seasonality component. To separate the non-deterministic component, the original series with $\mu = 0.1672$ and $\sigma = 0.0594$ is analyzed using the PSR with m = 10. The mean and the standard deviation of the non-deterministic component, i.e. noise, separated using the PSR are 0.0 and 0.0099, respectively. The statistics for the deterministic component that might contain chaos and seasonality are given as $\mu = 0.1669$

Figure 65. Multistep ahead predictions for the Colorado river streamflow data. Top: one-step ahead predictions. Middle: two-step ahead predictions. Bottom: three-step ahead predictions.

The strengths of the use of nonlinear dynamical tools in conjunction with statistical time and frequency domain methodologies were demonstrated in this work. This dissertation investigated the ability to detect and model chaos from finite hydrologic observations, especially when randomness and seasonality are present. The ability to detect and model nonlinear dynamical and chaotic components, from finite real-world observations, is likely to have significant implications for scientific understanding and predictive modeling in multiple disciplines. This research represents a step forward in these directions. The results of the present investigation demonstrated the presence of thresholds, expressed in terms of noise to chaotic-signal and seasonality to chaotic-signal ratios. The ability to detect chaos from observations depends on whether the chaotic component in the hydrologic time series is dominant enough to satisfy the thresholds. It was shown that the overall time series can be decomposed into the contributing random, seasonal and chaotic components. Time series was decomposed using nonlinear predictive modeling for separating the chaotic component, statistical methods for characterizing random data, and frequency domain approaches for isolating seasonality. This has direct implications for a scientific understanding of hydrologic phenomena and the dominant processes that may be present as well as in the development of predictive models. For example, it was shown that the chaotic component, once detected and isolated, can be better predicted in the short-term through nonlinear models like ANN and PSR. The insights obtained from simulated data were used to interpret the results of real streamflow data from the Arkansas and Colorado rivers. It was observed that a chaotic component can be detected, isolated and utilized for improved predictive modeling from finite hydrologic time series like the Arkansas data. However, it was seen that for certain hydrologic data like the Colorado river data, this may not be possible. This may be due to the absence of chaotic/nonlinear dynamical component in the data. If the data observed at the Colorado river does contain chaotic/nonlinear dynamic component, then it may be completely dominated by randomness introduced as a result of the stochastic mode of dam operation on a daily basis.

The application of different tools, specifically nonlinear dependence, extremes, and chaos, on the real data indicates that these tools may be easily fit as different components in developing predictive models for hydrology and climate. The insights obtained from the real data analysis using these tools are described below separately.

Rigorous analysis of recently developed MI-estimation methods indicates that two MI-estimation methods, specifically KDE and KNN, outperform the other methods and estimation procedures in terms of their ability to capture the dependence structure including nonlinear dependence where present. We find that KNN is the best estimator for very short data with relatively low noise while KDE works better for very short data when the noise levels are higher. For short data, KNN is the best choice for capturing the nonlinear dependence across all noise levels except when the data are generated from chaotic dynamics, where KDE is a better choice at higher noise levels. We surmise that the relative performance of KDE and KNN with respect to various noise levels is a consequence of the bias-variance tradeoff. The bias in the KDE estimates dominates the variance of the estimates for low noise-to-signal ratios. The KNN performs relatively better for low noise levels since its bias and variance are lower than that from KDE. However, the converse is true for high noise-to-signal ratios, and hence the KDE performs relatively better. For high noise, the variance dominates because of the noise in the data but the variance associated with k = 3 for KNN increases dramatically.
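The two families of estimators compared above can be sketched compactly. The code below is a hedged illustration, not the estimators as configured in the dissertation: the KDE version is a resubstitution estimate with Gaussian product kernels and a Gaussian reference-rule bandwidth (an assumed choice), and the KNN version follows the first algorithm of the Kraskov et al. estimator cited as [5], with k = 3 as mentioned in the text. The function names and the integer-only digamma helper are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + H_{n-1}."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def _reference_bandwidth(values):
    """Gaussian reference-rule bandwidth for a two-dimensional product kernel."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return std * n ** (-1.0 / 6.0)


def kde_mutual_information(x, y):
    """Resubstitution MI estimate (in nats) from Gaussian kernel densities."""
    n = len(x)
    hx, hy = _reference_bandwidth(x), _reference_bandwidth(y)
    total = 0.0
    for i in range(n):
        px = py = pxy = 0.0
        for j in range(n):
            ex = math.exp(-0.5 * ((x[i] - x[j]) / hx) ** 2)
            ey = math.exp(-0.5 * ((y[i] - y[j]) / hy) ** 2)
            px += ex
            py += ey
            pxy += ex * ey
        px /= n * hx * math.sqrt(2.0 * math.pi)
        py /= n * hy * math.sqrt(2.0 * math.pi)
        pxy /= n * hx * hy * 2.0 * math.pi
        total += math.log(pxy / (px * py))
    return total / n


def knn_mutual_information(x, y, k=3):
    """k-nearest-neighbour MI estimate (in nats), max-norm in the joint space."""
    n = len(x)
    total = 0.0
    for i in range(n):
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th neighbour in the joint space
        nx = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n
```

Both functions are quadratic in the sample size, which suits the short series (100 to 1000 points) discussed here. Applying them to the same data at increasing noise levels is one way to observe the bias-variance behaviour described in the text, although the numbers obtained will depend on the bandwidth rule and on k.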
The application of MI-based nonlinear dependence on the real data suggests that there exists a nonlinear extrabasinal connection between ENSO and river flows in the tropics and subtropics. This study also shows an appreciable increase of 20-70% in the variation of annual river flows linked to ENSO using a nonlinear relationship measure as compared to linear measures. Hence, these results indicate additional predictability in the ENSO-streamflow extrabasinal connection when MI-based approaches are used, as compared to the linear approaches used by researchers to date. The additional dependence captured by the MI-based nonlinear CCs may be useful for developing more accurate and longer streamflow models. Although ENSO has a direct influence on rainfall anomalies over the tropical and subtropical regions, only a portion of the variation in the annual flow of rivers located in these regions is associated with ENSO events. This may be due to

Analysis of precipitation extremes in South America indicates that the areas of interest, based on an increasing PEVI from 1965-2004, are the Amazon basin, the Brazilian Highlands, Venezuela, Uruguay, Paraguay, and some of the highly populated cities in South America, specifically Rio De Janeiro, Sao Paulo, Caracas, Asuncion, and Cayenne. Some parts of east Brazil, few parts of the Brazilian highlands, Sao Paulo, and Cayenne also experience increasing number of consecutive 2- and 3-days extremes. Water resources engineers and planners, disaster management agencies, and policy makers need to pay special attention to the regions with increasing trends in the PEVI and consecutive 2- and 3-days daily extremes, especially when these regions overlap with densely populated areas, while planning for infrastructure development and disaster management.

Analysis of chaos indicates that the chaotic component, once detected and isolated, can be better predicted in the short-term through nonlinear models like ANN and PSR. The insights obtained from simulated data are used to interpret the results of real streamflow data from the Arkansas and Colorado rivers. It is observed that a chaotic component can be detected, isolated and utilized for improved predictive modeling from finite hydrologic time series like the Arkansas data. However, for certain hydrologic data like the Colorado river data, this may not be possible. This may be due to the absence of chaotic/nonlinear dynamical component in the data. If the data observed at the Colorado river does contain chaotic/nonlinear dynamic component, then it may be completely dominated by randomness introduced as a result of the stochastic mode of dam operation on a daily basis.
The above results also indicate the possibility of inter-connecting these tools for the purpose of developing predictive models. Since the geophysical data are generally noisy and MI-estimation methods are sensitive to noise, the noisy component in the data can be reduced, before applying MI-estimation methods for capturing nonlinear dependence, using ANN or PSR provided there is a presence of chaos in the data. This procedure may enhance the capability of MI-estimation methods for capturing nonlinear dependence. Nonlinear dependence between variables can help in identifying critical prediction variables out of many predictors and also enhance scientific understanding of relationships between different geophysical processes. The identification of relationships between variables may help in identifying additional variables, also called covariates, for extreme value distributions in order to improve extreme value models. There is a caveat that the use of covariates, having greater dependence, in extreme value distributions may not necessarily improve extreme value models.

[1] M. Barahona and C.-S. Poon. Detection of nonlinear dynamics in short, noisy time series. Nature, 381:215, 1996.
[2] Y. Moon, B. Rajagopalan, and U. Lall. Estimation of mutual information using kernel density estimators. Phys. Rev. E, 52(3):2318, 1995.
[3] G. A. Darbellay and I. Vajda. Estimation of the information by an adaptive partitioning of the observation space. IEEE Trans. Inform. Theory, 45(4):1315, 1999.
[4] N. Kwak and C.-H. Choi. Input feature selection by mutual information based on Parzen window. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(12):1667, 2002.
[5] A. Kraskov, H. Stogbauer, and P. Grassberger. Estimating mutual information. Phys. Rev. E, 69:066138, 2004.
[6] M. M. V. Hulle. Edgeworth approximation of multivariate differential entropy. Neural Computation, 17:1903, 2005.
[7] Q. Wang, Y. Shen, and J. Q. Zhang. A nonlinear correlation measure for multivariable data set. Physica D, 200:287, 2005.
[8] C. J. Cellucci, A. M. Albano, and P. E. Rapp. Statistical validation of mutual information calculations: Comparisons of alternative numerical algorithms. Phys. Rev. E, 71:066208, 2005.
[9] A. M. Fraser and H. L. Swinney. Independent coordinates for strange attractors from mutual information. Phys. Rev. A, 33(2):1134, 1986.
[10] S. Khan, S. Bandyopadhyay, A. R. Ganguly, S. Saigal, D. J. Erickson III, V. Protopopescu, and G. Ostrouchov. Relative performance of mutual information estimation methods for quantifying the dependence among short and noisy data. Phys. Rev. E, Accepted, 2007.
[11] M. P. Hoerling, A. Kumar, and M. Zhong. El Nino, La Nina, and the nonlinearity of their teleconnections. J. Clim., 10:1769, 1997.
[12] Y. H. Jin, A. Kawamura, K. Jinno, and R. Berndtsson. Nonlinear multivariable analysis of SOI and local precipitation and temperature. Nonlinear Processes in Geophysics, 12:67, 2005.
[13] E. A. B. Eltahir. El Nino and the natural variability in the flow of the Nile River. Water Resour. Res., 32(1):131, 1996.
[14] K. N. Amarasekera, R. F. Lee, E. R. Williams, and E. A. B. Eltahir. ENSO and the natural variability in the flow of tropical rivers. J. Hydrology, 200:24, 1997.
[15] D. W. Whitaker, S. A. Wasimi, and S. Islam. The El Nino-Southern Oscillation and long-range forecasting of flows in the Ganges. Int. J. Climatol., 21:77, 2001.
[16] F. Anctil and P. Coulibaly. Wavelet analysis of the interannual variability in southern Quebec streamflow. J. Clim., 17:163, 2004.

[17] J. E. Richey, C. Nobre, and C. Deser. Amazon river discharge and climate variability: 1903-1985. Science, 246:101, 1989.
[18] G. Wang and E. A. B. Eltahir. Use of ENSO information in medium- and long-range forecasting of the Nile floods. J. Clim., 12:1726, 1999.
[19] E. Tziperman, L. Stone, M. A. Cane, and S. Zebiak. El Nino chaos: Overlapping of resonances between the seasonal cycle and the Pacific ocean-atmosphere oscillator. Science, 264:72, 1994.
[20] S.-I. An and F.-F. Jin. Nonlinearity and symmetry of ENSO. J. Clim., 17:2399, 2004.
[21] S. Khan, A. R. Ganguly, and S. Saigal. Detection and predictive modeling of chaos in finite hydrological time series. Nonlinear Processes in Geophysics, 12:41, 2005.
[22] T. R. Karl, R. W. Knight, and N. Plummer. Trends in high-frequency climate variability in the twentieth century. Nature, 377:217, 1995.
[23] P. Y. Groisman, T. R. Karl, D. R. Easterling, R. W. Knight, P. F. Jamason, K. J. Hennessy, R. Suppiah, C. M. Page, J. Wibig, K. Fortuniak, V. N. Razuvaev, A. Douglas, E. Forland, and P.-M. Zhai. Changes in the probability of heavy precipitation: important indicators of climatic change. Climatic Change, 42:243, 1999.
[24] B. N. Goswami, V. Venugopal, D. Sengupta, M. S. Madhusoodanan, and P. K. Xavier. Increasing trend of extreme rain events over India in a warming environment. Science, 314:1442, 2006.
[25] M. J. Manton, P. M. Della-Marta, M. R. Haylock, K. J. Hennessy, N. Nicholls, L. E. Chambers, D. A. Collins, G. Daw, A. Finet, D. Gunawan, K. Inape, H. Isobe, T. S. Kestin, P. Lefale, C. H. Leyu, T. Lwin, L. Maitrepierre, N. Ouprasitwong, C. M. Page, J. Pahalad, N. Plummer, M. J. Salinger, R. Suppiah, V. L. Tran, B. Trewin, I. Tibig, and D. Yee. Trends in extreme daily rainfall and temperature in Southeast Asia and the South Pacific: 1961-1998. Int. J. Climatol., 21:269, 2001.
[26] R. Suppiah and K. J. Hennessy. Trends in total rainfall, heavy rainfall events, and number of dry events in Australia. Int. J. Climatol., 18(10):1141, 1998.
[27] M. Haylock and C. Goodess. Interannual variability of European extreme winter rainfall and links with mean large-scale circulation. Int. J. Climatol., 24:759, 2004.
[28] T. C. Peterson, M. A. Taylor, R. Demeritte, D. L. Duncombe, S. Burton, F. Thompson, A. Porter, M. Mercedes, E. Villegas, R. S. Fils, A. K. Tank, A. Martis, R. Warner, A. Joyette, W. Mills, L. Alexander, and B. Gleason. Recent changes in climate extremes in the Caribbean region. J. Geophys. Res., D21:4601, doi:10.1029/2002JD002251, 2002.
[29] M. Brunetti, M. Maugeri, T. Nanni, and A. Navarra. Droughts and extreme events in regional daily Italian precipitation series. Int. J. Climatol., 22:543, 2002.
[30] T. Cavazos. Using self-organizing maps to investigate extreme climate events: An application to wintertime precipitation in the Balkans. J. Climate, 13(10):1718, 2000.
[31] T. Iwashima and R. Yamamoto. A statistical analysis of the extreme events: Long-term trend of heavy daily precipitation. J. Meteorol. Soc. Japan, 71:637, 1993.
[32] C. Hellstrom and B. A. Malmgren. Spatial analysis of extreme precipitation in Sweden 1961-2000. AMBIO: A Journal of the Human Environment, 33(4):187, 2004.
[33] L. M. V. Carvalho, C. Jones, and B. Liebmann. Extreme precipitation events in southeastern South America and large-scale convective patterns in the South Atlantic convergence zone. J. Clim., 15:2377-2394, 2002.

[34] B. Liebmann, C. Jones, and L. M. V. Carvalho. Interannual variability of daily extreme precipitation events in the state of Sao Paulo, Brazil. J. Climate, 14:208, 2001.
[35] G. Kuhn. On Dependence and Extremes. PhD thesis, Munich University of Technology, 2006.
[36] G. Kuhn, S. Khan, A. R. Ganguly, and M. L. Branstetter. Geospatial-temporal dependence among weekly precipitation extremes with applications to observations and climate model simulations in South America. Advances in Water Resources, doi:10.1016/j.advwatres.2007.05.006, 2007.
[37] A. F. Jenkinson. The frequency distribution of the annual maxima (or minima) values of meteorological elements. Q. J. Meteorol. Soc., 81:158, 1955.
[38] E. J. Gumbel. Statistics of extremes. Columbia University Press, New York, 1958.
[39] R. W. Katz, M. B. Parlange, and P. Naveau. Statistics of extremes in hydrology. Advances in Water Resources, 25:1287, 2002.
[40] S. Nadarajah. Extremes of daily rainfall in west central Florida. Climatic Change, 69:325, 2005.
[41] R. W. Katz, G. S. Brush, and M. B. Parlange. Statistics of extremes: Modeling ecological disturbances. Ecology, 86(5):1124, 2005.
[42] P. Todorovic and E. Zelenhasic. A stochastic model for flood analysis. Water Resour. Res., 6:1641-1648, 1970.
[43] J. Pickands. Statistical inference using extreme order statistics. Annals of Statistics, 3:119, 1975.
[44] Y. Li, W. Cai, and E. P. Campbell. Statistical modeling of extreme rainfall in Southwest Western Australia. J. Climate, 18:852, 2005.
[45] P. S. Wilson and R. Toumi. A fundamental probability distribution of heavy rainfall. Geophys. Res. Lett., 32(14):L14812, 10.1029/2005GL022465, 2005.
[46] S. Khan, G. Kuhn, A. R. Ganguly, and D. J. Erickson III. Spatio-temporal variability of daily and weekly precipitation extremes in South America. Water Resour. Res., In revision, 2007.
[47] S. D. Gaines and M. W. Denny. The largest, smallest, highest, lowest, longest, and shortest: extremes in ecology. Ecology, 74:1677, 1993.
[48] I. Rodriguez-Iturbe, F. B. De Power, M. B. Sharifi, and K. P. Georgakakos. Chaos in rainfall. Water Resour. Res., 25(7):1667, 1989.
[49] S. Islam, R. L. Bras, and I. Rodriguez-Iturbe. A possible explanation for low correlation dimension estimates for the atmosphere. J. Appl. Meteorol., 32:203, 1993.
[50] A. W. Jayawardena and F. Lai. Analysis and prediction of chaos in rainfall and streamflow time series. J. Hydrol., 153:23, 1994.
[51] J. Stehlik. Deterministic chaos in runoff series. J. Hydrol. Hydromech, 47:271, 1999.
[52] B. Sivakumar. Chaos theory in hydrology: important issues and interpretations. J. Hydrol., 227(1-4):1, 2000.
[53] B. Sivakumar. Forecasting monthly streamflow dynamics in the western United States: A nonlinear dynamical approach. Environmental Modeling and Software, 18(8/9):721, 2003.
[54] B. Sivakumar. Chaos theory in geophysics: Past, present and future. Chaos, Solitons and Fractals, 19(2):441, 2004.

[55] B. Sivakumar, R. Berndtsson, J. Olsson, K. Jinno, and A. Kawamura. Dynamics of monthly rainfall-runoff process at the Gota basin: A search for chaos. Hydrology and Earth System Sciences, 4(3):407-417, 2000.
[56] B. Sivakumar, R. Berndtsson, J. Olsson, and K. Jinno. Evidence of chaos in rainfall-runoff process. Hydrological Sciences Journal, 46(1):131, 2001.
[57] B. Sivakumar, R. Berndtsson, and M. Persson. Monthly runoff prediction using phase-space reconstruction. Hydrological Sciences Journal, 46(3):377, 2001.
[58] B. Sivakumar, R. Berndtsson, J. Olsson, and K. Jinno. Discussion on "Analysis of cross-correlated chaotic streamflows" by Elshorbagy et al. Hydrological Sciences Journal, 47(3):523, 2002.
[59] B. Sivakumar, M. Persson, R. Berndtsson, and C. B. Uvo. Is correlation dimension a reliable indicator of low-dimensional chaos in short hydrological time series? Water Resources Research, 38(2), 10.1029/2001WR000333:31, 2002.
[60] A. Elshorbagy. Noise reduction approach in chaotic hydrologic time series revisited. Canadian Water Resources Journal, 26(4):537, 2001.
[61] A. Elshorbagy, U. S. Panu, and S. P. Simonovic. Analysis of cross-correlated chaotic streamflows. Hydrological Sciences Journal, 46(5):781, 2001.
[62] A. Elshorbagy, S. P. Simonovic, and U. S. Panu. Estimation of missing streamflow data using principles of chaos theory. Journal of Hydrology, 255(1-4):123, 2002.
[63] A. Elshorbagy, S. P. Simonovic, and U. S. Panu. Noise reduction in chaotic hydrologic time series: Facts and doubts. Journal of Hydrology, 256(3-4):147, 2002.
[64] M. N. Islam and B. Sivakumar. Characterization and prediction of runoff dynamics: a nonlinear dynamical view. Adv. Water Resour., 25:179, 2002.
[65] D. Schertzer, Y. Tchiginskaya, S. Lovejoy, P. Hubert, H. Bendjoudi, and M. Larchevêque. Which chaos in the rainfall-runoff process? Hydrological Sciences Journal, 47(1):139, 2002.
[66] B. Sivakumar, R. Berndtsson, J. Olsson, and K. Jinno. Reply to "Which chaos in the rainfall-runoff process?" by Schertzer et al. Hydrological Sciences Journal, 47(1):149, 2002.
[67] K. Fraedrich. Estimating the dimensions of weather and climate attractors. J. Atmos. Sci., 43:419, 1986.
[68] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer. Testing for nonlinearity in time series: The method of surrogate data. Physica D, 58:77, 1992.
[69] S. Basu and E. Foufoula-Georgiou. Detection of nonlinearity and chaoticity in time series using the transportation distance function. Phys. Lett. A, 301:413, 2002.
[70] G. Wang. A conceptual modeling study on biosphere-atmosphere interactions and its implications for physically based climate models. Journal of Climate, 17(13):2572, 2004.
[71] R. L. Bras and I. Rodriguez-Iturbe. Rainfall generation: A non-stationary time varying multidimensional model. Water Resour. Res., 12:450, 1976.
[72] Y. Tessier, S. Lovejoy, P. Hubert, D. Schertzer, and S. Pecknold. Multifractal analysis and modeling of rainfall and river flows and scaling, causal transfer functions. J. Geophysical Res., 101(D21):26427-26440, 1996.

PAGE 177

E.DouglasandA.P.Barros.Probablemaximumprecipitationestimationusingmultifractals:Appli-cationintheeasternunitedstates.J.Hydrometeorology,4(6):1012,2003.[74] R.J.KuligowskiandA.P.Barros.Localizedprecipitationforecastsfromanumericalweatherpredic-tionmodelusingarticialneuralnetworks.WeatherandForecasting,13(4):1194,1998.[75] R.J.KuligowskiandA.P.Barros.Usingarticialneuralnetworkstoestimatemissingrainfalldata.J.ofAmericanWaterResourcesAssociation,34(6):1,1998.[76] G.KimandA.P.Barros.Quantitativeoodforecastingusingmultisensordataandneuralnetworks.J.Hydrology,246:45,2001.[77] A.R.GangulyandR.L.Bras.Distributedquantitativeprecipitationforecastingcombininginforma-tionfromradarandnumericalweatherpredictionmodeloutputs.J.ofHydrometeorology,AmericanMeteorologicalSociety,4(6):1168,2003.[78] J.TheilerandP.E.Rapp.Re-examinationoftheevidenceforlow-dimensional,nonlinearstructureinthehumanelectroencephalogram.Electroencephalogr.Clin.Neurophysiol.,98(3):213,1996.[79] K.LehnertzandC.E.Elger.Canepilepticseizuresbepredicted?Evidencefromnonlineartimeseriesanalysisofbrainelectricalactivity.Phys.Rev.Lett.,80(22):5019,1998.[80] D.A.SmirnovandB.P.Bezruchko.Estimationofinteractionstrengthanddirectionfromshortandnoisytimeseries.Phys.Rev.E,68:046209,2003.[81] R.Q.Quiroga,A.Kraskov,T.Kreuz,andP.Grassberger.Performanceofdifferentsynchronizationmeasuresinrealdata:Acasestudyonelectroencephalographicsignals.Phys.Rev.E,65:041903,2002.[82] N.NicolaouandS.J.Nasuto.CommentonPerformanceofdifferentsynchronizationmeasuresinrealdata:Acasestudyonelectroencephalographicsignals.Phys.Rev.E,72:063901,2005.[83] R.Q.Quiroga,A.Kraskov,andP.Grassberger.ReplytoCommenton'Performanceofdifferentsynchronizationmeasuresinrealdata:Acasestudyonelectroencephalographicsignals'.Phys.Rev.E,72:063902,2005.[84] 
S.Khan,A.R.Ganguly,S.Bandyopadhyay,S.Saigal,D.J.EricksonIII,V.Protopopescu,andG.Os-trouchov.Nonlinearstatisticsrevealsstrongertiesbetweenensoandthetropicalhydrologicalcycle.Geophys.Res.Lett.,33:L24402,doi:10.1029/2006GL027941,2006.[85] B.Rajagopalan,U.Lall,andD.G.Tarboton.Evaluationofkerneldensityestimationmethodsfordailyprecipitationsampling.StochasticEnvironmentalResearchandRiskAssessment,11(6):523,doi:10.1007/BF02428432,1997.[86] G.A.Darbellay.Anestimatorofthemutualinformationbasedonacriterionforindependence.Com-putationalStatisticsandDataAnalysis,32:1,1999.[87] D.R.Brillinger.Somedataanalysesusingmutualinformation.BrazilianJ.ProbabilityandStatistics,18:163,2004.[88] T.M.CoverandJ.A.Thomas.Elementsofinformationtheory.Wiley,NewYork,1991.[89] R.Steur,J.Kurths,C.O.Daub,J.Weise,andJ.Selbig.Themutualinformation:Detectingandevaluatingdependenciesbetweenvariables.Bioinformatics,81(2):S231S240,2002.[90] J.Xu,Z.-R.Liu,R.Liu,andQ.-F.Yang.Informationtransmissioninhumancerebralcortex.PhysicaD,106:363,1997.157

PAGE 178

T.Schreiber.Measuringinformationtransfer.Phys.Rev.Lett.,85(2):461,2000.[92] M.S.Pinsker.Informationandinformationstabilityofrandomvariablesandprocesses.SanFran-cisco:Holden-Day,1964.[93] H.Joe.Relativeentropymeasuresofmultivariatedependence.J.AmericanStatisticalAssociation,84(405):157,1989.[94] C.GrangerandJ.Lin.Usingthemutualinformationcoefcientstoidentifylagsinnonlinearmodels.J.TimeSeriesAnalysis,15(4):371,1994.[95] S.J.Schiff,P.So,andT.Chang.Detectingdynamicalinterdependenceandgeneralizedsynchronythroughmutualpredictioninaneuralensemble.PhysicalReviewE,54(6):6708,1996.[96] H.K.M.Meeren,J.P.M.Pijn,E.L.J.M.Luijtelaar,A.M.L.Coenen,andF.H.L.Silva.Corticalfocusdriveswidespreadcorticothalamicnetworksduringspontaneousabsenceseizuresinrats.J.Neuroscience,22(4):1480,2002.[97] B.W.Silverman.Densityestimationforstatisticsanddataanalysis.ChapmanandHall/CRC,1986.[98] P.B.Wright.Homogenizedlong-periodSouthernOscillationindices.Int.J.Climatol.,9:33,1989.[99] M.C.ToddandR.Washington.ClimatevariabilityincentralequatorialAfrica:InuencefromtheAtlanticsector.Geophys.Res.Lett.,31(23),2004.[100] G.PizarroandU.Lall.ElNinoandFloodsintheUSWest:Whatcanbeexpect?EOS,TransactionsoftheAGU,83(32):349,2002.[101] M.KendallandJ.D.Gibbons.Rankcorrelationmethods.ACharlesGrifn,5thEdition,1990.[102] B.EfronandR.J.Tibshirani.Anintroductiontothebootstrap.ChapmanandHall/CRC,1993.[103] F.C.Curriero,J.A.Patz,J.B.Rose,andS.Lele.TheassociationbetweenextremeprecipitationandwaterbornediseaseoutbreaksintheUnitedStatus,1948-1994.Am.J.PublicHealth,91(8):1194,2001.[104] J.R.StedingerandT.A.Cohn.Floodfrequencyanalysiswithhistoricalandpaleooodinformation.WaterResources,22:785,1986.[105] R.W.Katz.Stochasticmodelingofhuricanedamage.J.Appl.Meteorol.,41:754,2002.[106] B.G.BrownandR.W.Katz.Regionalanalysisoftemperatureextremes:spatialanalogforclimatechange?J.Climate,8:108,1995.[107] J.P.Palutikof,B.B.Brabson,D.H.Lister,andS.T.Adcock.Areviewofmethodstocalculateextremewindspeeds.MeterologicalApplications,6:119,1999.[108] 
F.P.Schoenberg,R.Peng,andJ.Woods.Onthedistributionofwildresizes.Environmetrics,14:583592,2003.[109] B.LiebmannandD.Allured.DailyprecipitationgridsforSouthAmerica.Bull.Amer.Meteor.Soc.,86(11):1567,2005.[110] S.G.Coles.Anintroductiontostatisticalmodelingofextremevalues.Springer-Verlag,London,UK,2001.[111] D.R.Easterling,J.L.Evans,P.Ya.Groisman,T.R.Karl,K.E.Kunkel,andP.Ambenje.Observedvariabilityandtrendsinextremeclimateevents.Bull.Am.Met.Soc.,81:417,2000.158

PAGE 179

G.A.MeehlandC.Tebaldi.Moreintense,morefrequent,andlongerlastingheatwavesinthe21stcentury.Science,305:994,2004.[113] J.R.Michael.TheStabilizedProbabilityPlot.Biometrika,70(1):11,1983.[114] S.G.Coles.Ongoodness-of-ttestsforthetwo-parameterWeibulldistributionderivedfromthestabilizedprobabilityplot.Biometrika,76(3):593,1989.[115] A.C.Kimber.TestsfortheExponential,WeibullandGumbelDistributionsBasedontheStabilizedProbabilityPlot.Biometrika,72(3):661,1985.[116] J.Galambos.Theasymptotictheoryofextremeorderstatistics.RobertE.Krieger,Malabar,FLorida,USA,1987.[117] B.RajagopalanandU.Lall.Ak-nearest-neighborsimulatorfordailyprecipitationandotherweathervariables.WaterResour.Res.,35(10):3089,1999.[118] D.Yates,S.Gangopadhyay,B.Rajagopalan,andK.Strzepek.Atechniqueforgeneratingre-gionalclimatescenariosusinganearest-neighboralgorithm.WaterResour.Res.,39(7):1199,doi:10.1029/2002WR001769,2003.[119] H.D.Abarbanel.NonlinearSystems.VCHPublishers,NewYork,1994.[120] E.N.Lorenz.Atmosphericpredictabilityasrevealedbynaturallyoccurringanalogues.J.Atmos.Sci.,26:636,1969.[121] K.Fraedrich.Estimatingweatherandclimatepredictabilityonattractors.J.Atmos.Sci.,44:722,1987.[122] C.Essex,T.Lookman,andM.A.H.Nerenberg.Theclimateattractoronshorttimescales.Nature,326:64,1987.[123] A.Hense.Onthepossibleexistenceofastrangeattractorforthesouthernoscillation.BeitrPhy.Atmos.,60(1):34,1987.[124] B.P.Wilcox,M.S.Seyfried,andT.H.Matison.Searchingforchaoticdynamicsinsnowmeltrunoff.WaterResour.Res.,27(6):1005,1991.[125] E.N.Lorenz.TheEssenceofChaos(TheJessieandJohnDanzLectureSeries).UniversityofWash-ingtonPress,1996.[126] A.PorporatoandL.Ridol.Cluestotheexistenceofdeterministicchaosinriverow.Int.J.Mod.Phys.B,10:1821,1996.[127] Y.Almog,O.Oz,andS.Akselrod.Correlationdimensionestimation:canthisnonlineardescriptioncontributetothecharacterizationofbloodpressurecontrolinrats?IEEETransactionsonBiomedicalEngineering,46(5):535,1990.[128] 
D.Hsieh.Chaosandnonlineardynamics:Applicationstonancialmarkets.JournalofFinance,46:1839,1991.[129] R.R.Trippi.Chaos&NonlinearDynamicsintheFinancialMarkets:Theory,EvidenceandApplica-tions.IrwinProfessionalPublishing,1995.[130] A.L.Cornelis.Visualizationofchaosfornancemajors.In2000FinanceEducatorsConference:FinanceEducationintheNewMillennium,Proceedingsofthe2000AnnualConference,pages187226,DeakinUniversity,Burwood,Victoria,Australia,2000.159

PAGE 180

G.S.Yim,J.W.Ryu,Y.J.Park,S.Rim,S.Y.Lee,W.H.Kye,andC.M.Kim.Chaoticbehaviorsofoperationalampliers.PhysicalReviewE,69,2004.[132] F.Takens.Detectingstrangeattractorsinturbulence,inDynamicalSystemsandTurbulence,LecturenotesinMathematics,898,pp.366-381.SpringerVerlag,NewYork,1980.[133] B.Sivakumar,S.Y.Liong,C.Y.Liaw,andK.K.Phoon.Evidenceofchaoticbehaviorinsingaporerainfall.J.Am.WaterResour.Assoc.,34:301,1998.[134] R.C.Hilborn.ChaosandNonlinearDynamics.OxfordUniversityPress,2000.[135] P.GrassbergerandI.Procaccia.Measuringthestrangenessofstrangeattractors.PhysicaD,9:189208,1983.[136] G.SugiharaandR.M.May.Nonlinearforecastingasawayofdistinguishingchaosfrommeasurementerrorintimeseries.Nature,344:734,1990.[137] A.R.OsborneandA.Provenzale.Finitecorrelationdimensionforstochasticsystemswithpower-lawspectra.PhysicaD,35:357,1989.[138] M.A.H.NerenbergandC.Essex.Correlationdimensionandsystematicgeometriceffects.Phys.Rev.Lett.A,42(12):7065,1990.[139] ASCE.Taskcommitteeonapplicationsofarticialneuralnetworksinhydrology,ii,hydrologicappli-cations.J.Hydrol.Eng.,5(2):124,2000.[140] A.S.WeigendandN.A.Gershenfeld.TimeSeriesPrediction:ForecastingtheFutureandUnder-standingthePast.Addison-Wesley,Reading,MA,1994.[141] G.Zhang,B.E.Patuwo,andM.Y.Hu.Forecastingwitharticialneuralnetworks:Thestateoftheart.Int.J.Forecasting,14:35,1998.[142] C.M.Bishop.NeuralNetworksforpatternrecognitions.OxfordUniversityPress,1995.[143] J.D.FarmerandJ.J.Sidorowich.Predictingchaotictimeseries.Phys.Rev.Lett.,59:845,1987.[144] D.KoutsoyiannisandDPachakis.Deterministicchaosversusstochasticityinanalysisandmodelingofpointrainfallseries.J.Geophys.Res.,101:26,441,451,1996.[145] USGS.privatecommunication.2004.[146] BoR.BureauofReclamation,privatecommunication.2004.160

PAGE 181

Shiraj Khan is a Ph.D. candidate in the Department of Civil and Environmental Engineering at the University of South Florida. He is expected to receive his Ph.D. degree on August 11, 2007. He received a Bachelor of Technology (B.Tech.) degree from the Indian Institute of Technology (IIT), Roorkee, in 2001. At IIT, he was the only student selected for The President of India Gold Medal for outstanding proficiency, including character, conduct, excellence in academic performance, extra-curricular activities, and social services. He also received the IIT Silver Medal for earning the highest GPA in Civil Engineering. In 2005, he was selected by the American Society of Engineers of Indian Origin for the Kalpana Chawla-Ford Motor Company Award in recognition of his high scholastic achievements in engineering. During his doctoral studies, he published six papers in high-quality journals and three book chapters, and presented several conference papers.