USF Libraries
USF Digital Collections

Bayesian and Empirical Bayes approaches to power law process and microarray analysis


Material Information

Title:
Bayesian and Empirical Bayes approaches to power law process and microarray analysis
Physical Description:
Book
Language:
English
Creator:
Chen, Zhao, 1977-
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla.
Publication Date:

Subjects

Subjects / Keywords:
Bayesian
power law process
Empirical Bayes
Nonhomogeneous Poisson Process
EBarrays
microarray
Dissertations, Academic -- Mathematics -- Doctoral -- USF   ( lcsh )
Genre:
government publication (state, provincial, territorial, dependent)   ( marcgt )
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Summary:
ABSTRACT: In this dissertation, we apply Bayesian and Empirical Bayes methods for reliability growth models based on the power law process. We also apply Bayes methods for the study of microarrays, in particular, in the selection of differentially expressed genes. The power law process has been used extensively in reliability growth models. Chapter 1 reviews some basic concepts in reliability growth models. Chapter 2 shows classical inferences on the power law process. We also assess the goodness of fit of a power law process for a reliability growth model. In chapter 3 we develop Bayesian procedures for the power law process with failure truncated data, using non-informative priors for the scale and location parameters. In addition to obtaining the posterior density of parameters of the power law process, prediction inferences for the expected number of failures in some time interval and the probability of future failure times are also discussed. The prediction results for the software reliability model are illustrated. We compare our result with the result of Bar-Lev, S.K. et al. Also, posterior densities of several parametric functions are given. Chapter 4 provides Empirical Bayes for the power law process with natural conjugate priors and nonparametric priors. For the natural conjugate priors, a two-hyperparameter prior and a more generalized three-hyperparameter prior are used. In chapter 5, we review some basic statistical procedures that are involved in microarray analysis. We will also present and compare several transformation and normalization methods for probe level data. The objective of chapter 6 is to select differentially expressed genes from tens of thousands of genes. Both classical methods (fold change, T-test, Wilcoxon Rank-sum Test, SAM and local Z-score) and Empirical Bayes methods (EBarrays and LIMMA) are applied to obtain the results. Outputs of a typical classical method and a typical Empirical Bayes Method are discussed in detail.
Thesis:
Thesis (Ph.D.)--University of South Florida, 2004.
Bibliography:
Includes bibliographical references.
System Details:
System requirements: World Wide Web browser and PDF reader.
System Details:
Mode of access: World Wide Web.
Statement of Responsibility:
by Zhao Chen.
General Note:
Includes vita.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains 109 pages.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001478786
oclc - 56563572
notis - AJS2476
usfldc doi - E14-SFE0000430
usfldc handle - e14.430
System ID:
SFS0025122:00001


This item is only available as the following downloads:


Full Text

PAGE 1

Bayesian and Empirical Bayes Approaches to Power Law Process and Microarray Analysis

by

Zhao Chen

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Mathematics
College of Arts and Sciences
University of South Florida

Major Professor: A.N.V. Rao, Ph.D.
Committee member: Marcus McWaters, Ph.D.
Committee member: Kandethody Ramachandran, Ph.D.
Committee member: Lihua Li, Ph.D.
Committee member: George Yanev, Ph.D.

Date of Approval: July 12th, 2004

Keywords: Bayesian, Empirical Bayes, Power Law Process, Nonhomogeneous Poisson Process, Microarray, EBarrays

© Copyright 2004, Zhao Chen

PAGE 2

DEDICATION

To my parents, Guoren Chen and Wandi Mo, who made all of this possible, for their endless encouragement and patience. And also to my fiancé, Peng Feng, who never fails to remind me how to live as if every day were my most precious.

PAGE 3

ACKNOWLEDGMENTS

There is one person who deserves my deepest thanks and respect for his continued support during the writing of this dissertation: my advisor, Dr. A.N.V. Rao. His technical and editorial advice was essential to the completion of this dissertation, and he has taught me innumerable lessons and insights on the workings of academic research in general. I also greatly appreciate his advice on many other aspects of academic life.

My thanks also go to the members of my major committee. Dr. Chris Tsokos showed me what is most important as a statistician and how to present a work efficiently. I could not have finished my graduate study without Dr. Marcus McWaters' continued support. Thanks to Dr. Kandethody Ramachandran and Dr. George Yanev for reading previous drafts of this dissertation and providing many valuable comments that improved its presentation and contents. I would like to give my special appreciation to Dr. Lihua Li. He offered me a great opportunity to gain experience in applying statistical methods. He also gave me much precious advice on how to work as a professional. I thank Dr. Jiang Kun for chairing my dissertation defense.

Many thanks to Dr. Eric Haura. He is the one who motivated me to work on microarray analysis and has always been supportive of the work. I had a very pleasant time working with Daniel J. Daucer and Lanxi Song in the laboratory. A good friend of mine, Heng Li, contributed a great deal of help, and no words can describe my deep appreciation to him.

I really appreciate the mathematics department of the University of South Florida for offering me the opportunity to study here. During the past 5 years, not only did I gain

PAGE 4

knowledge every day, but also the friendship of the people in the department. Thank you for your smiles, Ayako, Beverly, Nancy, Frances and Marianne. You made me feel like family.

PAGE 5

Bayesian and Empirical Bayes Approaches to Power Law Process and Microarray Analysis

Zhao Chen

ABSTRACT

In this thesis, we apply Bayes and Empirical Bayes methods for reliability growth models based on the power law process. We also apply Bayes methods for the study of microarrays, in particular, in the selection of differentially expressed genes.

The power law process has been used extensively in reliability growth models. Chapter 1 reviews some basic concepts in reliability growth models. Chapter 2 shows classical inferences on the power law process. We also assess the goodness of fit of a power law process for a reliability growth model. In chapter 3 we develop Bayesian procedures for the power law process with failure truncated data, using non-informative priors for the scale and location parameters. In addition to obtaining the posterior density of parameters of the power law process, prediction inferences for the expected number of failures in some time interval and the probability of future failure times are also discussed. The prediction results for the software reliability model are illustrated. We compare our result with the result of Bar-Lev, S.K. et al. [7]. Also, posterior densities of several parametric functions are given. Chapter 4 provides Empirical Bayes for the power law process with natural conjugate priors and nonparametric priors. For the natural conjugate priors, a two-hyperparameter prior and a more generalized three-hyperparameter prior are used.

In chapter 5, we review some basic statistical procedures that are involved in microarray analysis. We will also present and compare several transformation and normalization

PAGE 6

methods for probe level data. The objective of chapter 6 is to select differentially expressed genes from tens of thousands of genes. Both classical methods (fold change, T-test, Wilcoxon Rank-sum Test, SAM and local Z-score (Chen, Z. [17])) and Empirical Bayes methods (EBarrays and LIMMA) are applied to obtain the results. Outputs of a typical classical method (SAM) and a typical Empirical Bayes Method (EBarrays) are discussed in detail.

PAGE 7

Contents

Abstract ... v
List of Tables ... xi
List of Figures ... xii
Chapter 1 Reliability Growth and Growth Models ... 1
1.1 Fundamentals of Reliability ... 1
1.2 Counting Process ... 2
1.3 Homogeneous Poisson Process (HPP) ... 5
1.4 Nonhomogeneous Poisson Process ... 6
1.5 Power Law Process ... 8
1.5.1 Historical Review ... 9
1.5.2 Model Motivation ... 10
1.5.3 Present Study ... 12
1.6 Summary ... 14
Chapter 2 Classical Inference on the Power Law Process ... 15
2.1 Introduction ... 15
2.2 Point Estimation of Parameters β and θ ... 16
2.2.1 Maximum Likelihood Estimates β̂ and θ̂ ... 17
2.2.2 Unbiased Estimates ... 19

PAGE 8

2.2.3 Linearly Best Efficient Estimate of β ... 19
2.3 Interval Estimation and Tests of Hypothesis ... 20
2.4 Estimation of Intensity Function ... 21
2.5 Mean Time Between Failure (MTBF) ... 22
2.5.1 MTBF_n ... 22
2.5.2 MTBF_Q ... 23
2.6 Goodness-of-fit Tests ... 25
2.6.1 Duane Plot ... 26
2.6.2 Ratio Power Test ... 27
2.6.3 Log-ratio Test ... 28
2.7 Summary ... 28
Chapter 3 Bayesian Inference on the Power Law Process ... 30
3.1 Likelihood Function ... 31
3.1.1 Likelihood of y_1 ... 31
3.1.2 Likelihood of y = (y_1, y_2, y_3, ..., y_k) ... 31
3.2 Posterior Density of ([...], [...]) ... 33
3.3 Posterior Density of [...] and [...] ... 34
3.3.1 Marginal Posterior Density of [...] ... 34
3.3.2 Marginal Posterior Density of [...] ... 36
3.4 Predictive Inference ... 38
3.4.1 Predictions Based on Posterior Density ... 38
3.4.2 Prediction Inference Based on Estimation of Intensity Function ... 40
3.4.3 Data Comparison ... 42
3.5 Posterior Density for Some Parametric Functions ... 42
3.5.1 Posterior Density for System Reliability ... 44

PAGE 9

3.5.2 Posterior Density for the Expected Number of Failures in Some Time Interval ... 45
3.5.3 Posterior Density for Intensity Function ... 46
3.6 Summary ... 46
Chapter 4 Empirical Bayes Analysis on the Power Law Process ... 48
4.1 Parametric Empirical Bayes on the PLP ... 48
4.1.1 Two Hyperparameters (a, m) ... 49
4.1.2 Three Hyperparameters (a, m, y_m) ... 53
4.1.3 Prior Plots ... 54
4.2 Nonparametric Prior on the Power Law Process ... 55
4.3 Further Research ... 61
4.4 Summary ... 61
Chapter 5 Microarray Analysis: Normalization and Transformation of Probe-Level Data ... 63
5.1 Overview of Microarray Analysis ... 63
5.2 Data Description ... 64
5.3 Probe-Level Analysis of Oligonucleotide Arrays ... 66
5.3.1 MAS 5.0 - Microarray Analysis Suite Version 5.0 ... 67
5.3.2 MBEI - Model-Based Expression Index (dChip 2001) ... 68
5.3.3 RMA - Robust Multichip Analysis ... 69
5.3.4 VSN - Variance Stabilization of Network ... 70
5.3.5 Comparison ... 70
5.4 Summary ... 71
Chapter 6 Statistical Methods of Selecting Differentially Expressed Genes ... 72
6.1 Select Differentially Regulated Genes Using Classical Statistics Methods ... 72

PAGE 10

6.1.1 Fold Change Method ... 72
6.1.2 T-test ... 74
6.1.3 Wilcoxon Rank-Sum Test ... 76
6.1.4 SAM ... 77
6.1.5 Local Z-score Method ... 81
6.2 Select Differentially Expressed Genes by Bayesian and Empirical Bayes Methods ... 82
6.2.1 EBarrays ... 83
6.2.2 LIMMA ... 86
6.2.3 Cyber-T ... 87
6.3 Other Work in Microarray ... 88
6.4 Summary ... 89
Bibliography ... 90

PAGE 11

List of Tables

TABLE 2.1 Software Failure Times in Seconds ... 26
TABLE 4.1 Commonly Used Kernel Functions ... 60
TABLE 6.1 SAM Result Table ... 80
TABLE 6.2 EBarrays Result Table ... 84
TABLE 6.3 LIMMA Result Table ... 87

PAGE 12

List of Figures

FIGURE 1.1 The Sample Path for a Counting Process ... 3
FIGURE 1.2 Three Different Types of Systems ... 11
FIGURE 2.1 Duane Plot for Data Table 2.1 ... 27
FIGURE 2.2 Plot of Number of Failures against Cumulative Operating Time for Data Table 2.1 ... 28
FIGURE 3.1 Marginal Density of [...] ... 35
FIGURE 3.2 The Predictive Probability Function of N(7,344, 80,000) ... 43
FIGURE 3.3 The Probability of (n+1)th Failure Time ... 43
FIGURE 4.1 Bayesian Model for the PLP ... 49
FIGURE 4.2 Parametric Empirical Bayes Model for the PLP ... 50
FIGURE 4.3 Prior Contour Plot (a=2, m=2, y_m=2) ... 56
FIGURE 4.4 Prior Three Dimension Plot (a=2, m=2, y_m=2) ... 56
FIGURE 4.5 Prior Contour Plot (a=2, m=6, y_m=2) ... 57
FIGURE 4.6 Prior Three Dimension Plot (a=2, m=6, y_m=2) ... 57
FIGURE 4.7 Prior Contour Plot (a=2, m=2, y_m=6) ... 58
FIGURE 4.8 Prior Three Dimension Plot (a=2, m=2, y_m=6) ... 58
FIGURE 4.9 Prior Contour Plot (a=6, m=2, y_m=2) ... 59
FIGURE 4.10 Prior Three Dimension Plot (a=6, m=2, y_m=2) ... 59
FIGURE 5.1 Data Analytic Tasks in Microarray Experimentation ... 65
FIGURE 5.2 Signal Extraction in Probe Level Data ... 66
FIGURE 5.3 STAT3 v.s. GFP Mean Value (STAT3 = 0.98745 GFP) ... 67

PAGE 13

FIGURE 6.1 Lowess Regression for Five Experiments ... 73
FIGURE 6.2 Histogram of T-statistics ... 75
FIGURE 6.3 Histogram of Log-expression Wilcoxon Statistics ... 76
FIGURE 6.4 SAM Plot ... 79
FIGURE 6.5 Boxplot of Gene Expression for Five Experiments and Five Controls ... 81
FIGURE 6.6 EBarrays Fit GG Model - From Li Wong Normalized Data ... 85

PAGE 14

Chapter 1

Reliability Growth and Growth Models

Repairable systems are often modeled after a class of stochastic point processes; see Ascher (1984) [2], Engelhardt [26], and Rigdon and Basu (1990) [49]. This class usually consists of homogeneous Poisson processes (HPP), nonhomogeneous Poisson processes (NHPP), branching Poisson processes (BPP) and renewal processes (RP). In this chapter we present some fundamental results on homogeneous Poisson processes and nonhomogeneous Poisson processes. A particular nonhomogeneous Poisson process with a power law intensity function can be employed as a reliability growth model and will play a major role in our research. The results covered in this chapter are basic for the research presented in later chapters.

1.1 Fundamentals of Reliability

The reliability function is the probability that a system will carry out its mission through time $t$ and is denoted by $R(t)$. Let $T$ denote the failure time since the initial startup of the system (assumed to be at time $t = 0$). Let $N(t)$ denote the cumulative number of failures from time 0 to time $t$. Then the reliability function is defined as

$$R(t) = \Pr[T > t] = 1 - F(t) = \int_t^{\infty} f(s)\,ds,$$

PAGE 15

where $F(t)$ is the cumulative distribution function of $T$ and $f(t)$ is the probability density function of $T$. We note that $R(t) = \Pr[N(t) = 0]$. The reliability function is also called the survival function of $T$. $R(t)$ decreases in $t$, from 1 at $t = 0$ to 0 at $t = \infty$.

Reliability plays a key role in developing products and in enhancing competitiveness. There is a lot of literature on the reliability of nonrepairable systems. However, we will focus on repairable systems.

A repairable system is a system which, after failing to perform one or more of its designed functions satisfactorily, can be restored to a fully satisfactory performance by any method, other than replacement of the entire system (Ascher (1984) [2]). A good portion of the literature on repairable systems seems to be motivated by applications to mechanical systems. However, repairable systems are not limited to such cases. An important application area is the reliability of software systems. Consider a system that is tested until it fails. Since it is repairable, we debug the software and run it again. Then we continue to test the system until it fails again. Before the system is put on the market, we need to reach a desirable reliability, which will reflect the quality of the final design. This process of testing a system has been referred to as reliability growth. In what follows, we will discuss how to model this process in a suitable way.

1.2 Counting Process

Models utilizing a counting process have played a key role in the analysis of systems composed of randomly occurring events. By way of motivation, suppose that we are interested in observing the occurrences of a repeatable event over a period of time. One of the simplest examples is the arrival of customers at a service station, such as a bank. Another example is the occurrences of earthquakes of a specified magnitude at a particular location over time. The example that is of interest to us here is the points in time

PAGE 16

Figure 1.1: The Sample Path of a Counting Process

when a system's software fails. In all such cases, the event of interest does not occur with any regularity and is therefore not exactly predictable. We are not sure about the exact times at which the events will occur and consequently about the exact number of events that will occur in any time interval. Such a random phenomenon is called a point process. A counting process is simply the count of the number of events that have occurred in any specified interval of time.

Since $N(t)$ is unknown for any value of $t$, we are faced with the problem of describing our uncertainty about an infinite collection of random variables, one for each $t$. Any indexed collection of random variables is called a stochastic process, and when the interest is focused on counts, the process is called a counting process and is denoted by $\{N(t),\ t \ge 0\}$. The sample path of a counting process is given by Figure 1.1. The horizontal axis represents time; the vertical axis represents the total number of counts over time. The path is a step function starting at zero and taking jumps of size one at each $t_i$, that is, the cumulative time of the $i$th failure.

Our purpose of this chapter is to introduce some probabilistic models for the

PAGE 17

counting process. The most commonly used models are homogeneous and nonhomogeneous Poisson processes. We now define a Poisson process and the intensity function.

A counting process $N(t)$ is said to be a Poisson process if

1. $N(0) = 0$;
2. For any $a$ [...] $x$.

The hazard function is the limit, as the interval length goes to zero, of the probability that a unit fails for the first and only time in a small interval given that it survived to the beginning of the interval, divided by the interval length. The hazard function is a conditional failure rate and gives the relative rate of failure at time $t$. It can also be calculated by dividing the derivative of the cumulative distribution function $F(t)$ by the probability of surviving past time $t$, that is,

$$h(t) = \frac{f(t)}{1 - F(t)}.$$
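As a small illustration of the formula $h(t) = f(t)/(1 - F(t))$, the following sketch evaluates the hazard for an exponential failure time, where the hazard should come out constant. The function name `hazard` and the rate value 0.5 are illustrative assumptions and are not taken from the dissertation.

```python
import math


def hazard(pdf, cdf, t):
    """Hazard h(t) = f(t) / (1 - F(t)) for a failure-time distribution."""
    return pdf(t) / (1.0 - cdf(t))


# Hypothetical example: exponential failure time with rate 0.5 (assumed value).
rate = 0.5


def exp_pdf(t):
    return rate * math.exp(-rate * t)


def exp_cdf(t):
    return 1.0 - math.exp(-rate * t)


# For the exponential distribution the hazard equals the rate at every t.
hazard_values = [hazard(exp_pdf, exp_cdf, t) for t in (0.1, 1.0, 5.0)]
print(hazard_values)
```

A constant hazard is the single-failure analogue of the constant intensity of a homogeneous Poisson process.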

PAGE 18

Note that $\lambda(t)\,\Delta t$ gives approximately the probability of a failure in a small time interval $(t, t + \Delta t]$. In a counting process, the expected number of failures up to time $t$ is denoted by $m(t) = E[N(t)]$. The intensity function can be obtained by taking the derivative of $m(t)$. It is an absolute rate.

1.3 Homogeneous Poisson Process (HPP)

The counting process $\{N(t),\ t \ge 0\}$ is said to be a homogeneous Poisson process (HPP) if the intensity function $\lambda(t)$ is a constant, that is, $\lambda(t) = \lambda$, $\lambda > 0$, and

1. $N(0) = 0$;
2. The process has independent increments and stationary increments.

A point process has stationary increments if for all $k$, $P(N(t, t+s] = k)$ is independent of $t$. It can be shown that the number of events in any interval of length $s = t_2 - t_1$ has a Poisson distribution with mean $\lambda s$, that is

$$P[N(t_2) - N(t_1) = n] = \frac{e^{-\lambda s} (\lambda s)^n}{n!}, \qquad 0 \le t_1 < t_2,\ n = 0, 1, \ldots.$$

The intensity function is also referred to as the repair rate.

The homogeneous Poisson process has the following properties; proofs are given in Rigdon (2000) [50]:

Property 1. A process is an HPP with constant intensity function $\lambda$ if and only if the times between events are iid exponential random variables with mean $1/\lambda$.

Property 2. If $0$ [...]
PAGE 19

$T_2 < \cdots$ [...] $\Pr[T > t] = \Pr[N(t) = 0] = e^{-\lambda t}$.

The times between events mentioned in Property 1 form the sequence of interarrival times, denoted by $\{X_i = T_i - T_{i-1},\ i = 1, 2, \ldots\}$. Note that, in an HPP, the $X_i$ are independent and identically exponentially distributed with mean $1/\lambda$. Hence the expected length of the interval $(t_{n-1}, t_n]$ between consecutive events is $1/\lambda$.

$$\Pr[X_1 > x] = \Pr[N(x) = 0] = e^{-\lambda x},$$
$$\Pr[X_2 > x \mid X_1 = x_1] = \Pr\{\text{zero events in } (x_1, x_1 + x]\} = e^{-\lambda x},$$
$$\vdots$$
$$\Pr[X_n > x \mid X_{n-1} = x_{n-1}] = \Pr\{\text{zero events in } (x_{n-1}, x_{n-1} + x]\} = e^{-\lambda x}.$$

1.4 Nonhomogeneous Poisson Process

A nonhomogeneous Poisson process (NHPP) is a Poisson process whose intensity function is not a constant. A counting process $\{N(t),\ t \ge 0\}$ is a nonhomogeneous Poisson process if

1. $N(0) = 0$;
2. The process has independent increments.

It can be shown that the number of failures in any interval $(t_1, t_2]$ has a Poisson distribution with mean $\int_{t_1}^{t_2} v(t)\,dt$, where $v(t)$ denotes the intensity function. That is,

$$P[N(t_2) - N(t_1) = k] = \frac{1}{k!} \exp\left\{-\int_{t_1}^{t_2} v(t)\,dt\right\} \left(\int_{t_1}^{t_2} v(t)\,dt\right)^{k}.$$

For our purposes, these occurrences in time will be the failure times of a repairable system. Though the models discussed in the following may be applicable to other situations, we shall use the term failures instead of events from now on.
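The count probability for a nonhomogeneous Poisson process can be sketched numerically as below. This is a minimal illustration, assuming a simple midpoint rule for the integral of the intensity; the helper names `integrate` and `nhpp_count_prob` and the two example intensities are hypothetical choices, not part of the dissertation.

```python
import math


def poisson_prob(k, mean):
    """P[count = k] for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def nhpp_count_prob(k, intensity, t1, t2):
    """P[N(t2) - N(t1) = k] for an NHPP with the given intensity function."""
    return poisson_prob(k, integrate(intensity, t1, t2))


# Constant intensity 2.0 (the HPP special case): mean count on (0, 1.5] is 3.
p_hpp_zero = nhpp_count_prob(0, lambda t: 2.0, 0.0, 1.5)

# Linearly increasing intensity v(t) = 2t: mean count on (0, 2] is 4.
p_nhpp = [nhpp_count_prob(k, lambda t: 2.0 * t, 0.0, 2.0) for k in range(60)]
print(p_hpp_zero, sum(p_nhpp))
```

With a constant intensity the function reduces to the homogeneous Poisson probabilities of Section 1.3, which gives a quick consistency check.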

PAGE 20

There are two different sampling protocols which provide data on repairable systems: (i) the failure truncated case and (ii) the time truncated case.

Data are said to be failure truncated when testing stops after a predetermined number of failures. Suppose that a repairable system is observed until $n$ failures occur ($n$ fixed), so we observe the ordered failure times $t_1$ [...]

$$\begin{cases} 0, & y \le 0, \\ m(y)/m(t_n), & 0 < y \le t_n. \end{cases}$$

[...]

Property 3. If a NHPP with intensity function $\lambda(t)$ is observed until time $t$, and if the failure times are $T_1$ [...]
PAGE 21

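the interval $(0, t]$, then conditioned on $N(t) = n$, the random variables $T_1$ [...]

$$\begin{cases} 0, & y \le 0, \\ m(y)/m(t), & 0 < y \le t. \end{cases}$$

[...]

Property 4. The probability that the system fails after time $t$ is known as the reliability function, $R(t)$. For a nonhomogeneous Poisson process, the number of failures in any interval $(t_1, t_2]$ has a Poisson distribution with mean $\int_{t_1}^{t_2} \lambda(t)\,dt$. Taking $t_1 = 0$ and $t_2 = t$, the reliability function is

$$R(t) = \Pr[T > t] = \Pr[N(t) = 0] = \frac{e^{-\int_0^t \lambda(s)\,ds} \left(\int_0^t \lambda(s)\,ds\right)^{0}}{0!} = e^{-\int_0^t \lambda(s)\,ds} = e^{-\Lambda(t)},$$

where $\Lambda(t) = \int_0^t \lambda(s)\,ds$.

1.5 Power Law Process

A common functional form for the intensity function in the NHPP is

$$\lambda(t) = \frac{\beta}{\theta} \left(\frac{t}{\theta}\right)^{\beta - 1} \quad \text{for } \beta > 0,\ \theta > 0,$$

where $\theta$ and $\beta$ are the scale parameter and shape parameter respectively. The intensity function is proportional to the cumulative failure time $t$ raised to a power; therefore this special nonhomogeneous Poisson process is usually called the Power Law Process. The mean value function $\Lambda(t)$ of the process is

$$\Lambda(t) = E[N(t)] = \int_0^t \lambda(s)\,ds = \int_0^t \frac{\beta}{\theta} \left(\frac{s}{\theta}\right)^{\beta - 1} ds = \left(\frac{t}{\theta}\right)^{\beta}.$$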
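The relationship between the power law intensity and its mean value function can be checked numerically. The sketch below is illustrative only: the function names and the parameter values (shape 1.5, scale 2.0, time 5.0) are assumptions chosen for the example, and the integral uses a plain midpoint rule.

```python
import math


def plp_intensity(t, beta, theta):
    """Power law intensity (beta / theta) * (t / theta) ** (beta - 1)."""
    return (beta / theta) * (t / theta) ** (beta - 1.0)


def plp_mean(t, beta, theta):
    """Mean value function (t / theta) ** beta, the expected number of failures by t."""
    return (t / theta) ** beta


def plp_reliability(t, beta, theta):
    """Pr[N(t) = 0] = exp(-(t / theta) ** beta): no failure up to time t."""
    return math.exp(-plp_mean(t, beta, theta))


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Assumed example values, not taken from the dissertation's data.
beta, theta, t = 1.5, 2.0, 5.0
numeric_mean = integrate(lambda s: plp_intensity(s, beta, theta), 0.0, t)
closed_form_mean = plp_mean(t, beta, theta)
print(numeric_mean, closed_form_mean, plp_reliability(t, beta, theta))
```

The numerical integral of the intensity should agree with the closed form $(t/\theta)^{\beta}$ up to the discretization error of the midpoint rule.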

PAGE 22

An alternative way of describing the power law process is to consider the sequence of successive failure times $T_1, T_2, \ldots$, where $T_i$ is the time of the $i$th failure. Then the time of the first failure $T_1$ has a Weibull distribution with scale and shape parameters $\theta$ and $\beta$. $T_i$ ($i = 2, 3, \ldots, n$) have left truncated Weibull distributions conditional on $T_1 = t_1, \ldots, T_{i-1} = t_{i-1}$. Therefore, the Power Law Process is also called the Weibull Process.

1.5.1 Historical Review

The power law process has been widely used in reliability growth (Crow (1982) [22]), in software reliability models (Kyparisis and Singpurwalla (1985) [35]), and in repairable systems (Ascher and Feingold (1984) [2], Engelhardt and Bain (1986) [26], Rigdon and Basu (1989) [49]). Other names for the Power Law model are the Duane Model (Duane (1964) [24]) and the AMSAA model. AMSAA stands for the United States Army Materiel Systems Analysis Activity.

There is a lot of literature on the power law process from a classical statistics view. Much theoretical work describing the Power Law model was performed in the 1970's (Lee, L. and Lee, K. (1978) [36] and Engelhardt and Bain (1978) [26] [3]). Classical inference on the power law process, such as point estimation, confidence intervals, tests of hypothesis for parameters and estimates of the intensity function, was reviewed by Rigdon and Basu (1989) [49]. Calabria (1988) [12] examined modified maximum likelihood estimators of the expected number of failures in a given time interval and of the failure intensity, and compared their mean squared errors with those of the MLEs. Qiao, H. and Tsokos, C. (1998) [44] obtained the best efficient estimates of the intensity function.

Bayesian inference on the power law process was also studied during the past two decades. Bayesian point and interval estimates were obtained by Guida, M. (1989) [30] and Kyparisis and Singpurwalla (1985) [35]. Informative and noninformative priors were both employed in the failure truncated data case. Bar-Lev et al. (1991) [7] discussed both time and

PAGE 23

failure truncated data by using noninformative priors. They derived prediction distributions of future failure times and the number of failures in some future time interval. This involves complicated numerical calculation. Calabria (1990) [13] also derived prediction distributions by using noninformative and informative priors.

These references deal with a single system and usually assume the parameters are independent. Crow (1974) [21] and Bain (1978) [3] analyzed independent equivalent multi-systems by employing the power law process. Power bounds for a test of equality of trends in several independent power law processes were discussed by Calabria, R., Guida, M. and Pulcini, G. (1992) [14]. Huang and Bier (1998) [33] presented a natural conjugate prior for the PLP.

1.5.2 Model Motivation

The most commonly used models for repairable systems are the homogeneous and nonhomogeneous Poisson processes. Let us start with different data sets. Figure 1.2 displays the time dot plots and scatter plots of cumulative failure number against cumulative failure time under different situations of repairable systems.

In Figure 1.2, the first situation (a) illustrates an improving system. After removing bugs, times between failures tend to get longer and the system is improving. The intensity function decreases since the probability of failure gets smaller as the system ages. This can be employed as a reliability growth model and hence is of the most interest.

The second situation (b) illustrates a steady system. Times between failures tend to stay the same. The intensity function remains constant since the probability of failure does not change.

The third situation (c) illustrates a deteriorating system. After removing bugs, times between failures tend to get shorter and the system is deteriorating. The intensity function increases since the probability of failure gets larger as the system ages.

Usually the assumptions of independent and identical distribution for times between

PAGE 24

Figure 1.2: Three Different Types of Systems

PAGE 25

failures in repairable systems are invalid. We must consider models in which the assumptions do not hold. The intensity function (repair rate) plays an important role in selecting a model because it contains the information about the likelihood of a failure at or around any time $t$. The intensity function changes as the system ages. In situation (a), the repairable system is improving. In situation (c), the system is deteriorating. In those two cases, we should employ the NHPP. In situation (b), the intensity function is a constant, and the HPP is a more proper choice.

If the NHPP is selected as the model, a very commonly used NHPP is the power law process (PLP). The power law process is flexible enough to set up models for those three situations by applying different values of the shape parameter. Details will be given in the next chapter. Besides the flexibility of the PLP, the fact that its mean function can be easily derived is also a plus.

1.5.3 Present Study

We transform the parameters by taking the logarithm of $\theta$ and the reciprocal of $\beta$. A location parameter and a scale parameter are obtained. This makes noninformative priors more appropriate. Thus we applied non-informative priors to get the posterior densities of the parameters and obtained their Bayes estimators. A newly developed Bayes estimator of the intensity function is shown. We compared the prediction result obtained from posterior inferences with the result obtained from a Bayesian estimator of the intensity function. The current approach simplifies the calculation considerably. The posterior densities of several parametric functions are discussed in the last part. The Bayesian approach requires numerical integration. We either use some approximation method or create a computer program to calculate the data.

We also provide our original work by applying natural conjugate priors and nonparametric kernel priors in Empirical Bayes analysis for the power law process. For

PAGE 26

the natural conjugate priors, a two-hyperparameter prior and a more generalized three-hyperparameter prior (Huang and Bier (1998) [33]) are used. Given the estimates of the hyperparameters, we obtain closed forms for the prior and posterior distributions in a special case.

Another area of current research interest focuses on Bayesian and Empirical Bayes methods for microarrays. Currently we have an opportunity to work with Dr. Haura on microarray data analysis in Moffitt Lee Cancer Center, one of the largest national cancer centers. In this work, the main objective is to select differentially expressed genes among around 22,000 genes. The data are nonpaired 5-control (GFP protein) and 5-experiment (Stat3) gene expression. Both classical and Bayesian methods are applied. We briefly address the statistical structure and illustrate the results of two classical methods (SAM (Storey 2002) and local Z-score (Chen, Z. [17])) and one parametric Empirical Bayes method (Newton, 2002). Several partial lists of differentially expressed genes are shown. The results are obtained by SAS programming and research software packages. We also discuss some normalization methods and applications on probe-level and expression-level data.

Bayesian methods are developed in many other fields of microarray analysis, for instance, assessing differential expression (Newton 2002, Speed 2002, Smyth 2003), clustering (Sebastiani 2002), decomposition (Ochs 2002), principal component analysis and prediction (Mike West, 2000). The main problems in microarray analysis come from the low number of replicates and the large number of genes. Bayesian analysis partially helps to solve the problem by considering the typical variability in the system. Further work can be done by employing Bayesian or Empirical Bayes analysis in microarray studies.

PAGE 27

1.6 Summary

We address some fundamental concepts which are involved in reliability models and reliability growth. A class of point processes is usually used to model repairable systems, such as homogeneous Poisson processes, nonhomogeneous Poisson processes, branching Poisson processes and renewal processes. Our research will focus on a commonly used nonhomogeneous process, the Power Law Process. We give a historical review of the power law process in this chapter, and we briefly show how the power law process can be applied in reliability growth models. Our research interest also includes Bayesian and Empirical Bayes approaches to microarray analysis, especially in the area of selection of differentially expressed genes.

PAGE 28

Chapter 2

Classical Inference on the Power Law Process

In this chapter, we address some classical inference results on the Power Law Process. These include point and interval estimates for the parameters, hypothesis testing, estimation of the intensity function and estimation of the mean time between failures (MTBF). We will also discuss three goodness-of-fit tests. The first two will be illustrated with a real data set.

2.1 Introduction

As mentioned in the last chapter, the power law process can be described as a nonhomogeneous Poisson process $\{N(t),\ t \ge 0\}$ with intensity function:

$$\lambda(t) = \frac{\beta}{\theta} \left(\frac{t}{\theta}\right)^{\beta - 1}, \quad \text{for } \beta > 0,\ \theta > 0.$$

The mean value function $\Lambda(t)$ of the process is:

$$\Lambda(t) = E[N(t)] = \int_0^t \lambda(s)\,ds = \int_0^t \frac{\beta}{\theta} \left(\frac{s}{\theta}\right)^{\beta - 1} ds = \left(\frac{t}{\theta}\right)^{\beta}.$$

The shape parameter $\beta$ affects how the system deteriorates or improves over time. If $\beta > 1$, the intensity function $\lambda(t)$ is increasing, the failures tend to occur more frequently, and we call the system deteriorating. If $\beta < 1$, the intensity function $\lambda(t)$ is decreasing, and the system is improving. In this situation, the power law process can be applied as a reliability growth model. If $\beta = 1$, the power law process is reduced to

PAGE 29

a simple homogeneous Poisson process with intensity $1/\theta$, where $\theta$ is a scale parameter.

The power law process has been widely used as a model for repairable systems (Ascher and Feingold (1984) [2], Engelhardt and Bain (1986) [26], Rigdon and Basu (1989) [49]) and in software reliability growth models (Kyparisis and Singpurwalla (1985) [35]). For example, Duane (1964) demonstrated that many systems developed at General Electric seemed to follow a model closely related to the power law process.

2.2 Point Estimation of Parameters β and θ

There are two different sampling protocols which provide data on the power law process: (i) the failure truncated case and (ii) the time truncated case. These two terms are defined in the previous chapter.

The joint pdf of the failure times $T_1, T_2, \ldots, T_n$ from a NHPP with intensity function $\lambda(t)$ is then given by (Crow (1982) [22])

$$f(t_1, t_2, \ldots, t_n) = \prod_{i=1}^{n} \lambda(t_i) \exp\left(-\int_0^{w} \lambda(x)\,dx\right), \qquad (2.1)$$

where $w$ is a so-called stopping time: $w = t_n$ for the failure truncated case, $w = t$ for the time truncated case. Thus for the failure truncated case with $\lambda(t) = \frac{\beta}{\theta} \left(\frac{t}{\theta}\right)^{\beta - 1}$, the joint density of $T_1$ [...]
PAGE 30

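and the random variable $N$ has a Poisson distribution with mean $(t/\theta)^{\beta}$, so

$$f_N(n) = \frac{\left[(t/\theta)^{\beta}\right]^{n} \exp\left[-(t/\theta)^{\beta}\right]}{n!}, \qquad n = 0, 1, \ldots.$$

Thus, the joint density of $N$ and $T_1$ [...]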
PAGE 31

$2n\beta/\hat{\beta}$ has a chi-square distribution with $2(n - \delta)$ degrees of freedom, where $\delta = 1$ for the failure truncated case and $\delta = 0$ for the time truncated case. Now we will show

$$E(\hat{\beta}) = \frac{n\beta}{n - 1 - \delta}, \qquad \mathrm{Var}(\hat{\beta}) = \frac{n^2 \beta^2}{(n - 1 - \delta)^2 (n - 2 - \delta)}.$$

For the proof we shall use the following Lemma.

Lemma. Let $X$ be $\chi^2$ distributed with $n$ degrees of freedom. Then

$$E(X^k) = \frac{2^k\, \Gamma\!\left(\frac{n}{2} + k\right)}{\Gamma\!\left(\frac{n}{2}\right)},$$

where $k$ is an integer such that $\frac{n}{2} + k > 0$. In particular,

$$E(X) = n, \qquad E\!\left(\frac{1}{X}\right) = \frac{1}{n - 2}, \qquad E(X^2) = n^2 + 2n, \qquad E\!\left(\frac{1}{X^2}\right) = \frac{1}{(n - 2)(n - 4)}.$$

Proof. By the Lemma above, we have

$$E(\hat{\beta}) = 2n\beta\, E\!\left(\frac{1}{\chi^2_{2(n - \delta)}}\right) = \frac{n\beta}{n - 1 - \delta}.$$

We also have

$$E(\hat{\beta}^2) = E\!\left[\left(\frac{2n\beta}{\chi^2_{2(n - \delta)}}\right)^{2}\right] = \frac{4 n^2 \beta^2}{[2(n - \delta) - 2][2(n - \delta) - 4]} = \frac{n^2 \beta^2}{(n - 1 - \delta)(n - 2 - \delta)}.$$

Thus, the variance of the MLE of $\beta$ is

$$\mathrm{Var}(\hat{\beta}) = E(\hat{\beta}^2) - [E(\hat{\beta})]^2 = \frac{n^2 \beta^2}{(n - 1 - \delta)^2 (n - 2 - \delta)}.$$

A modified maximum likelihood estimate of $\beta$ is given by Suresh and Rao (1992). This modified MLE is given by

$$\beta' = \frac{n - 1}{n}\, \hat{\beta}.$$
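The expectation formula for the MLE can be checked by simulation in the failure truncated case ($\delta = 1$). The sketch below is a rough Monte Carlo illustration under stated assumptions: it uses the standard failure truncated form $\hat{\beta} = n / \sum_{i=1}^{n-1} \log(t_n/t_i)$, it generates failure times by transforming the arrival times of a unit-rate homogeneous Poisson process through the inverse of the mean value function, and all names, the seed and the parameter values are hypothetical choices for the example.

```python
import math
import random


def simulate_plp_failure_truncated(n, beta, theta, rng):
    """First n failure times of a power law process.

    If G_1 < G_2 < ... are arrival times of a unit-rate HPP, then
    T_i = theta * G_i ** (1 / beta) are failure times of the PLP.
    """
    times, arrival = [], 0.0
    for _ in range(n):
        arrival += rng.expovariate(1.0)
        times.append(theta * arrival ** (1.0 / beta))
    return times


def beta_mle_failure_truncated(times):
    """MLE of the shape parameter for failure truncated data (assumed standard form)."""
    n, t_n = len(times), times[-1]
    return n / sum(math.log(t_n / t) for t in times[:-1])


# Assumed example settings, not taken from the dissertation's data.
rng = random.Random(2004)
n, beta, theta, reps = 10, 0.7, 100.0, 20_000

estimates = [
    beta_mle_failure_truncated(simulate_plp_failure_truncated(n, beta, theta, rng))
    for _ in range(reps)
]
sample_mean = sum(estimates) / reps

delta = 1  # failure truncated case
theory_mean = n * beta / (n - 1 - delta)
modified_mean = (n - 1) / n * sample_mean  # mean of the modified MLE
print(sample_mean, theory_mean, modified_mean)
```

The simulated mean of $\hat{\beta}$ should land close to $n\beta/(n - 1 - \delta)$, which makes the upward bias of the MLE for small $n$ visible.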

2.2.2 Unbiased Estimates

The MLEs are biased, and we can adjust them to obtain unbiased estimates. The unbiased estimator of $\beta$ is (Bain and Engelhardt 1991 [4])

\[ \bar{\beta} = \frac{n-1-\delta}{n}\,\hat{\beta} = \frac{n-1-\delta}{\sum_{i=1}^{n}\log(t_n/t_i)}, \]

with $\delta = 1$ for the failure truncated case and $\delta = 0$ for the time truncated case. The variance of $\bar{\beta}$ is

\[ \mathrm{Var}(\bar{\beta}) = \mathrm{Var}\!\left(\frac{n-1-\delta}{n}\,\hat{\beta}\right) = \frac{\beta^2}{n-2-\delta}. \]

In Calabria et al. (1988), it was shown that the unbiased estimate of $\beta$ is more efficient than the biased estimate.

2.2.3 Linearly Best Efficient Estimate of $\beta$

Qiao and Tsokos (1998) showed that there exists a linearly best efficient estimate of $\beta$, denoted by $\beta^{*}$.

Theorem. Assume $\bar{\beta}$ is an unbiased estimate of $\beta$ with finite variance. Then there exists a unique $\alpha_0$ such that

\[ \mathrm{MSE}(\alpha_0\bar{\beta}) = \min_{\alpha}\,\mathrm{MSE}(\alpha\bar{\beta}). \]

Moreover,

\[ \alpha_0 = \frac{\beta^2}{\beta^2 + \mathrm{Var}(\bar{\beta})}. \]

In the above theorem, the MSE is defined as the expected value of the square of the deviation of the estimate from the true value, and it equals the variance plus the square of the bias. Applying their theorem, we have

\[ \alpha_0 = \frac{\beta^2}{\beta^2 + \dfrac{\beta^2}{n-2-\delta}}, \]

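and the best efficient estimate of $\beta$ is

\[ \beta^{*} = \frac{n-2-\delta}{n-1-\delta}\,\bar{\beta} = \frac{n-2-\delta}{\sum_{i=1}^{n}\log(t_n/t_i)}. \]

We also have

\[ \mathrm{MSE}(\alpha_0\bar{\beta}) = \alpha_0^2\,\mathrm{Var}(\bar{\beta}) + (\alpha_0-1)^2\beta^2 = \frac{\beta^4\,\mathrm{Var}(\bar{\beta})}{[\beta^2+\mathrm{Var}(\bar{\beta})]^2} + \frac{\beta^2\,\mathrm{Var}(\bar{\beta})^2}{[\beta^2+\mathrm{Var}(\bar{\beta})]^2} = \frac{\beta^2\,\mathrm{Var}(\bar{\beta})}{\beta^2+\mathrm{Var}(\bar{\beta})} = \frac{\beta^2}{n-1-\delta}. \]

Writing $\mathrm{EFF}(a \mid b) = \mathrm{MSE}(b)/\mathrm{MSE}(a)$, the efficiencies of the MLE $\hat{\beta}$, the unbiased estimate $\bar{\beta}$ and the linearly best estimate $\beta^{*}$ are

\[ \mathrm{EFF}(\bar{\beta}\mid\beta^{*}) = \frac{\mathrm{MSE}(\beta^{*})}{\mathrm{MSE}(\bar{\beta})} = \frac{n-2-\delta}{n-1-\delta} < 1, \]

\[ \mathrm{EFF}(\hat{\beta}\mid\beta^{*}) = \frac{\mathrm{MSE}(\beta^{*})}{\mathrm{MSE}(\hat{\beta})} = \frac{n-3}{n+6} < 1 \quad \text{(failure truncated case)}, \]

\[ \mathrm{EFF}(\hat{\beta}\mid\beta^{*}) = \frac{\mathrm{MSE}(\beta^{*})}{\mathrm{MSE}(\hat{\beta})} = \frac{n-2}{n+2} < 1 \quad \text{(time truncated case)}, \]

\[ \mathrm{EFF}(\hat{\beta}\mid\bar{\beta}) = \frac{\mathrm{MSE}(\bar{\beta})}{\mathrm{MSE}(\hat{\beta})} = \frac{n-2}{n+6} < 1 \quad \text{(failure truncated case)}, \]

\[ \mathrm{EFF}(\hat{\beta}\mid\bar{\beta}) = \frac{\mathrm{MSE}(\bar{\beta})}{\mathrm{MSE}(\hat{\beta})} = \frac{n-1}{n+2} < 1 \quad \text{(time truncated case)}. \]

Here $\mathrm{MSE}(\hat{\beta}) = \mathrm{Var}(\hat{\beta}) + [E(\hat{\beta})-\beta]^2$, which equals $\beta^2(n+6)/[(n-2)(n-3)]$ in the failure truncated case and $\beta^2(n+2)/[(n-1)(n-2)]$ in the time truncated case. Therefore, the linearly best estimate $\beta^{*}$ has the greatest efficiency, and the unbiased estimate $\bar{\beta}$ has a better efficiency than the MLE $\hat{\beta}$.

2.3 Interval Estimation and Tests of Hypothesis

We again apply the fact that $2n\beta/\hat{\beta}$ has a chi-square distribution with $2(n-\delta)$ degrees of freedom, where $\delta = 1$ in the failure truncated case and $\delta = 0$ in the time truncated case. Then we can write

\[ P\!\left(\chi^2_{1-\alpha/2}\big(2(n-\delta)\big) < \frac{2n\beta}{\hat{\beta}} < \chi^2_{\alpha/2}\big(2(n-\delta)\big)\right) = 1-\alpha. \]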

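Note: $\alpha$ is the significance level here and not a model parameter. So the confidence interval for $\beta$ is

\[ \frac{\chi^2_{1-\alpha/2}\big(2(n-\delta)\big)\,\hat{\beta}}{2n} < \beta < \frac{\chi^2_{\alpha/2}\big(2(n-\delta)\big)\,\hat{\beta}}{2n}. \]

The result that $2n\beta/\hat{\beta}$ has a chi-square distribution with $2(n-\delta)$ degrees of freedom can also be used to construct a test at significance level $\alpha$ for

\[ H_0: \beta = \beta_0 \quad \text{versus} \quad H_1: \beta \neq \beta_0. \]

The rule is to reject $H_0$ if

\[ \frac{2n\beta_0}{\hat{\beta}} < \chi^2_{1-\alpha/2}\big(2(n-\delta)\big) \quad \text{or} \quad \frac{2n\beta_0}{\hat{\beta}} > \chi^2_{\alpha/2}\big(2(n-\delta)\big), \]

that is,

\[ \hat{\beta} < \frac{2n\beta_0}{\chi^2_{\alpha/2}\big(2(n-\delta)\big)} \quad \text{or} \quad \hat{\beta} > \frac{2n\beta_0}{\chi^2_{1-\alpha/2}\big(2(n-\delta)\big)}. \]

It is often useful to test $H_0: \beta = 1$ versus $H_1: \beta \neq 1$. The power law process reduces to the homogeneous Poisson process when $\beta = 1$, so this tests whether the system is remaining stable or not. The alternative can also be $H_1: \beta > 1$, which means the system is deteriorating, or $H_1: \beta < 1$, which means the system is improving.

2.4 Estimation of Intensity Function

Recall that the intensity function of the power law process is

\[ \nu(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1}, \qquad \beta > 0,\ \theta > 0. \]

The simplest approach is to use the maximum likelihood estimates of $\beta$ and $\theta$ to evaluate $\nu(t)$. We have

\[ \hat{v}(w) = \frac{\hat{\beta}}{\hat{\theta}}\left(\frac{w}{\hat{\theta}}\right)^{\hat{\beta}-1} = \frac{n\hat{\beta}}{w}, \]

where $w = t_n$ for the failure truncated case and $w = t$ for the time truncated case. Other estimates are obtained by Tsokos and Rao (1995) [59] and Qiao and Tsokos (1998) [44].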
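The point estimates of Sections 2.2 to 2.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `plp_point_estimates` and its arguments are hypothetical, the data are the 38 cumulative failure times of Table 2.1, the closed form $\hat{\theta} = w/n^{1/\hat{\beta}}$ follows from $\hat{v}(w) = n\hat{\beta}/w$, and for time truncated data the logarithms are taken relative to the stopping time $w$.

```python
import math

# Cumulative software failure times in seconds (Table 2.1, Musa 1979).
TIMES = [115, 115, 198, 376, 570, 706, 1780, 1798, 1813, 1905,
         1955, 2026, 2632, 3821, 3861, 4649, 4871, 4943, 5558, 6147,
         6162, 6552, 8415, 9752, 14260, 15094, 18494, 18500, 23061, 26229,
         36800, 37363, 40133, 40785, 46378, 58074, 64798, 67344]


def plp_point_estimates(times, failure_truncated=True, stop_time=None):
    """Point estimates for the power law process (hypothetical helper).

    delta = 1 for failure truncated data (w = t_n) and delta = 0 for time
    truncated data (w = stop_time, an assumption of this sketch).
    """
    n = len(times)
    delta = 1 if failure_truncated else 0
    w = times[-1] if failure_truncated else stop_time
    log_sum = sum(math.log(w / t) for t in times)

    beta_mle = n / log_sum                      # MLE of beta
    beta_unbiased = (n - 1 - delta) / log_sum   # unbiased estimate
    beta_best = (n - 2 - delta) / log_sum       # linearly best efficient estimate
    theta_mle = w / n ** (1.0 / beta_mle)       # MLE of theta
    intensity_mle = n * beta_mle / w            # MLE of the intensity at w
    return {
        "beta_mle": beta_mle,
        "beta_unbiased": beta_unbiased,
        "beta_best": beta_best,
        "theta_mle": theta_mle,
        "intensity_mle": intensity_mle,
    }


if __name__ == "__main__":
    for name, value in plp_point_estimates(TIMES).items():
        print(f"{name}: {value:.6g}")
```

The three estimates of $\beta$ differ only in the numerator, so for a fixed data set they are always ordered as best efficient, unbiased, MLE from smallest to largest.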

Rigdon and Basu (2000) [50] treated the failure truncated case and the time truncated case together. In summary, the estimates of the intensity function are the unbiased estimate

\[ \hat{v}_{UB} = \frac{(n-1)(n-2)\,\hat{\beta}}{n\,w} \]

and the best efficient estimate with minimum mean squared error

\[ \hat{v}_{MMSE} = \frac{(n-2)(n-3)\,\hat{\beta}}{n\,w}. \]

2.5 Mean Time Between Failure (MTBF)

Mean time between failures is defined to be the average time that a component works without failure. It is an important metric for assessing the reliability of a repairable system. The reciprocal of the intensity function is widely accepted as an approximate estimate of the MTBF (Cox & Lewis 1966 [20]), denoted by $MTBF_A$. However, such a relationship is only true for the HPP (Thompson 1981 [57]). The expected time between the $n$th and the $(n+1)$th failure, denoted by $MTBF_n$, is the mean time between failures in the time interval $(t_n, t_{n+1})$. In this section, we first derive $MTBF_n$ as a function of $\beta$ and $\theta$, then we show an estimate of $MTBF_n$ derived by Qiao and Tsokos [44]. Their estimate of $MTBF_n$ will be referred to as $MTBF_Q$.

2.5.1 $MTBF_n$

Let $F(t_{n+1}\mid t_n)$ be the cdf of the $(n+1)$th failure time given that the system last failed at time $t_n$. The probability $1 - F(t_{n+1}\mid t_n)$ that the system fails after time $t_{n+1}$ is equal to the probability that the system experiences zero failures in $(t_n, t_{n+1}]$. This can be used to derive $MTBF_n$ for the power law process. From the previous section, with mean value function $\Lambda(t) = (t/\theta)^{\beta}$,

\[ \Pr[N(t_{n+1}) - N(t_n) = 0] = \exp\{-[\Lambda(t_{n+1}) - \Lambda(t_n)]\}. \]

Hence

\[ f(t_{n+1}\mid t_n) = \frac{d}{dt_{n+1}}F(t_{n+1}\mid t_n) = \frac{d}{dt_{n+1}}\Big\{1 - \Pr[N(t_{n+1}) - N(t_n) = 0]\Big\} \]

\[ \qquad = \nu(t_{n+1})\exp\{-[\Lambda(t_{n+1}) - \Lambda(t_n)]\}. \]

Therefore

\[ MTBF_n = \int_{t_n}^{\infty} t_{n+1}\, f(t_{n+1}\mid t_n)\,dt_{n+1} - t_n = \int_{t_n}^{\infty} t_{n+1}\,\frac{\beta}{\theta}\left(\frac{t_{n+1}}{\theta}\right)^{\beta-1}\exp\left\{-\frac{1}{\theta^{\beta}}\left[t_{n+1}^{\beta} - t_n^{\beta}\right]\right\}dt_{n+1} - t_n \]

\[ \qquad = \theta\,\exp\left\{\left(\frac{t_n}{\theta}\right)^{\beta}\right\}\Gamma\!\left(\frac{1}{\beta}+1\right)\Pr\!\left[X > \left(\frac{t_n}{\theta}\right)^{\beta}\right] - t_n, \]

where $X \sim \Gamma\!\left(\frac{1}{\beta}+1,\ 1\right)$. For $\beta = 1$ this reduces to $MTBF_n = \theta$, the HPP value.

2.5.2 $MTBF_Q$

Qiao and Tsokos (1998) [45] investigated the relation between the MTBF and the reciprocal of the intensity function. They provided upper and lower bounds for the estimate of $MTBF_n$. What follows is a brief proof for the reliability growth model, $\beta < 1$. Consider the intensity function at the last failure time,

\[ \nu(t) = \frac{\beta}{\theta}\left(\frac{t_n}{\theta}\right)^{\beta-1}. \]

Let $\alpha = 1/\beta$ and $h = (t_n/\theta)^{\beta}$; then

\[ \frac{1}{\nu(t)} = \alpha\theta\,h^{\alpha-1}. \]

We can also rewrite $MTBF_Q$ as

\[ MTBF_Q = \alpha\theta\,e^{h}\int_{h}^{\infty} e^{-t}\,t^{\alpha-1}\,dt. \]

Case 1: $0 < \beta \le \frac{1}{2}$

This case is equivalent to $\alpha \ge 2$. Therefore we may write $\alpha = m + 1 + \gamma$, where $m \ge 1$ is an integer and $\gamma \in [0, 1)$. We first take the special case $\gamma = 0$. In this case, $MTBF_Q$ can be expanded as

\[ MTBF_Q = \theta\,e^{h}e^{-h}\sum_{i=0}^{m}\left[\prod_{k=0}^{i}(\alpha-k)\right]h^{m-i} \]

\[ \qquad = \theta\,h^{m}\sum_{i=0}^{m}\frac{1}{h^{i}}\prod_{k=0}^{i}(\alpha-k) = \frac{1}{\nu(t)}\left[1 + \frac{\alpha-1}{h} + \cdots + \frac{(\alpha-1)(\alpha-2)\cdots(\alpha-m)}{h^{m}}\right]. \]

The expression above shows that the difference between $MTBF_Q$ and $1/\nu(t)$ is given by

\[ \frac{1}{\nu(t)}\left[\frac{\alpha-1}{h} + \cdots + \frac{(\alpha-1)(\alpha-2)\cdots(\alpha-m)}{h^{m}}\right]. \]

For general $\gamma$, repeated integration by parts gives

\[ MTBF_Q = \theta\sum_{i=0}^{m}\left[\prod_{k=0}^{i}(\alpha-k)\right]h^{\alpha-1-i} + \theta\left[\prod_{k=0}^{m+1}(\alpha-k)\right]e^{h}\int_{h}^{\infty}t^{\alpha-m-2}e^{-t}\,dt, \]

where the last term is nonnegative and vanishes when $\gamma = 0$. From the expression of $MTBF_Q$, it can be seen that

\[ MTBF_Q \ge \frac{1}{\nu(t)}\left[1 + \frac{\alpha-1}{h} + \cdots + \frac{(\alpha-1)(\alpha-2)\cdots(\alpha-m)}{h^{m}}\right] \ge \frac{1}{\nu(t)}\left[1 + \frac{\alpha-1}{h}\right], \]

and, provided $h > \alpha - 1$,

\[ MTBF_Q \le \frac{1}{\nu(t)}\sum_{i=0}^{\infty}\left(\frac{\alpha-1}{h}\right)^{i} = \frac{1}{\nu(t)}\cdot\frac{1}{1-\frac{\alpha-1}{h}}. \]

Hence,

\[ \frac{1}{\nu(t)}\left[1 + \frac{\alpha-1}{h}\right] \le MTBF_Q \le \frac{1}{\nu(t)}\cdot\frac{1}{1-\frac{\alpha-1}{h}}, \]

where $\alpha = 1/\beta$ with $0 < \beta \le 1/2$, $h = (t_n/\theta)^{\beta}$ and $\nu(t) = \frac{1}{\alpha\theta}\left(\frac{t_n}{\theta}\right)^{\beta-1}$. Thus the point estimate for $MTBF_Q$ is taken as the midpoint of the two bounds,

\[ MTBF_Q = \frac{1}{2\nu(t)}\left[1 + \frac{\alpha-1}{h} + \left(1 - \frac{\alpha-1}{h}\right)^{-1}\right]. \tag{2.5} \]

Case 2: $\frac{1}{2} < \beta < 1$

In this case, we may write $\alpha = 1 + \gamma$ with $0 < \gamma < 1$. Thus

\[ MTBF_Q = \alpha\theta\,e^{h}\int_{h}^{+\infty}e^{-t}\,t^{\gamma}\,dt \ge \alpha\theta\,e^{h}h^{\gamma}\int_{h}^{+\infty}e^{-t}\,dt = \alpha\theta\,h^{\gamma} = \frac{1}{\nu(t)}. \]

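We may also obtain, integrating by parts and using $t^{\gamma-1} \le h^{\gamma-1}$ for $t \ge h$,

\[ MTBF_Q = \alpha\theta\,e^{h}\left[h^{\gamma}e^{-h} + \gamma\int_{h}^{+\infty}t^{\gamma-1}e^{-t}\,dt\right] \le \alpha\theta\,e^{h}\left[h^{\gamma}e^{-h} + \gamma\int_{h}^{+\infty}h^{\gamma-1}e^{-t}\,dt\right] = \frac{1}{\nu(t)}\left[1 + \frac{\gamma}{h}\right]. \]

Therefore, in this case, we have

\[ \frac{1}{\nu(t)} \le MTBF_Q \le \frac{1}{\nu(t)}\left[1 + \frac{\gamma}{h}\right]. \]

Thus the point estimate for $MTBF_Q$ in this case is

\[ MTBF_Q = \frac{1}{2\nu(t)}\left[2 + \frac{\gamma}{h}\right]. \]

For the intensity function $\nu(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1}$, with $MTBF_A = 1/\nu(t)$ and $\gamma = \alpha - 1$, we conclude

\[ MTBF_Q = \begin{cases} \dfrac{MTBF_A}{2}\left[1 + \dfrac{\alpha-1}{h} + \dfrac{1}{1-\frac{\alpha-1}{h}}\right], & 0 < \beta \le \frac{1}{2}, \\[2ex] \dfrac{MTBF_A}{2}\left[2 + \dfrac{\gamma}{h}\right], & \frac{1}{2} < \beta < 1, \\[2ex] MTBF_n, & \beta > 1. \end{cases} \]

2.6 Goodness-of-fit Tests

There are several ways to assess the fit of the power law process. The Duane plot is an informal graphical method. Exact goodness-of-fit tests can be constructed by making an appropriate transformation of the failure times. Such transformations include the ratio-power transformation and the log-ratio transformation. We use the following table of software failure times to illustrate how the Duane plot and the ratio-power transformation work in testing the goodness of fit. This data set consists of 38 software failure times taken from Musa (1979) [41].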
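As a numerical cross-check of Section 2.5, the sketch below evaluates $MTBF_n$ by integrating the conditional survival function and compares it with the Case 1 bounds on $MTBF_Q$. It is a minimal illustration, not the authors' procedure: the function names are hypothetical, the integral is rewritten with the substitution $u = (t/\theta)^{\beta} - h$ so that $MTBF_n = \alpha\theta\int_0^{\infty}(u+h)^{\alpha-1}e^{-u}\,du$, and the infinite range is truncated at an assumed cutoff because the integrand decays like $e^{-u}$.

```python
import math


def mtbf_n(beta, theta, t_n, upper=60.0, steps=6000):
    """Expected time from t_n to the next failure of a power law process.

    Uses composite Simpson integration on [0, upper]; `upper` and `steps`
    (which must be even) are assumptions of this sketch.
    """
    alpha = 1.0 / beta
    h = (t_n / theta) ** beta

    def integrand(u):
        return (u + h) ** (alpha - 1.0) * math.exp(-u)

    width = upper / steps
    total = integrand(0.0) + integrand(upper)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(j * width)
    return alpha * theta * total * width / 3.0


def mtbf_bounds_case1(beta, theta, t_n):
    """Lower and upper bounds for 0 < beta <= 1/2, assuming h > alpha - 1."""
    alpha = 1.0 / beta
    h = (t_n / theta) ** beta
    inv_intensity = alpha * theta * h ** (alpha - 1.0)  # 1 / nu(t_n)
    ratio = (alpha - 1.0) / h
    return inv_intensity * (1.0 + ratio), inv_intensity / (1.0 - ratio)


if __name__ == "__main__":
    low, high = mtbf_bounds_case1(0.4, 7.0, 67344.0)
    print(low, mtbf_n(0.4, 7.0, 67344.0), high)
```

For $\beta = 1$ the routine returns $\theta$, and for $\beta = 1/2$ the integral can be done by hand, $MTBF_n = 2\theta(1+h)$, which coincides with the lower bound.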

Table 2.1: Software Failure Times in Seconds (Musa 1979 [41])

   i     t_i      i     t_i      i     t_i      i     t_i
   1     115     11    1955     21    6162     31   36800
   2     115     12    2026     22    6552     32   37363
   3     198     13    2632     23    8415     33   40133
   4     376     14    3821     24    9752     34   40785
   5     570     15    3861     25   14260     35   46378
   6     706     16    4649     26   15094     36   58074
   7    1780     17    4871     27   18494     37   64798
   8    1798     18    4943     28   18500     38   67344
   9    1813     19    5558     29   23061
  10    1905     20    6147     30   26229

i: failure number; t_i: cumulative failure time.

2.6.1 Duane Plot

If the power law process is the correct model, the Duane plot should be roughly linear. This follows from

\[ E[N(t)] = \Lambda(t) = \left(\frac{t}{\theta}\right)^{\beta}. \]

Thus,

\[ E\!\left[\frac{N(t)}{t}\right] = \frac{1}{t}\left(\frac{t}{\theta}\right)^{\beta} = \frac{t^{\beta-1}}{\theta^{\beta}}. \]

After taking the natural logarithm of both sides, we have

\[ \log E\!\left[\frac{N(t)}{t}\right] = (\beta-1)\log t - \beta\log\theta. \]

This shows that $\log E[N(t)/t]$ is a linear function of $\log t$, where $t$ is fixed and $N(t)$ is random. Figure 2.1 suggests that the power law process is indeed a proper model, since the Duane plot shows a linear relation.
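The straight-line relation behind the Duane plot can be checked numerically. The following sketch regresses $\log(N(t_i)/t_i)$ on $\log t_i$ by ordinary least squares, taking $N(t_i) = i$ at the $i$th failure time. The function name `duane_fit` is hypothetical, and the least squares line is only an informal graphical aid: by the equation above its slope plays the role of $\beta - 1$ and its intercept the role of $-\beta\log\theta$.

```python
import math

# Cumulative software failure times in seconds (Table 2.1, Musa 1979).
TIMES = [115, 115, 198, 376, 570, 706, 1780, 1798, 1813, 1905,
         1955, 2026, 2632, 3821, 3861, 4649, 4871, 4943, 5558, 6147,
         6162, 6552, 8415, 9752, 14260, 15094, 18494, 18500, 23061, 26229,
         36800, 37363, 40133, 40785, 46378, 58074, 64798, 67344]


def duane_fit(times):
    """Least squares line through the Duane plot points.

    Returns (slope, intercept) of log(i / t_i) against log(t_i).
    """
    xs = [math.log(t) for t in times]
    ys = [math.log((i + 1) / t) for i, t in enumerate(times)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    slope, intercept = duane_fit(TIMES)
    print("graphical estimate of beta:", slope + 1.0)
```

A slope between $-1$ and $0$ corresponds to $0 < \beta < 1$, that is, to an improving system.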

Figure 2.1: Duane Plot for Data Table 2.1

2.6.2 Ratio Power Test

A common goodness-of-fit test is based on the ratio-power transformation, which is defined by

\[ \hat{R}_i = \left(\frac{t_i}{t_n}\right)^{\bar{\beta}}, \qquad i = 1, 2, \ldots, n-1, \]

where $\bar{\beta}$ is the unbiased estimator obtained in subsection 2.2.2. The hypotheses are

$H_0$: The power law process is the correct model.
$H_1$: The power law process is not the correct model.

The test statistic for the Cramér-von Mises test is

\[ C_R^2 = \frac{1}{12(n-1)} + \sum_{i=1}^{n-1}\left[\hat{R}_i - \frac{2i-1}{2(n-1)}\right]^{2}. \]

From the data in Table 2.1, the statistic is $C_R^2 = 0.25$. We accept the null hypothesis at the 5% level. Furthermore, Figure 2.2 is a scatter plot of the number of failures. It suggests an improving system since it is concave down.
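The ratio-power statistic is straightforward to compute directly from the failure times. The sketch below is an illustration with a hypothetical function name; it uses the failure truncated unbiased estimate $\bar{\beta} = (n-2)/\sum_i \log(t_n/t_i)$ and returns only the statistic, since the critical value has to be taken from published Cramér-von Mises tables for the power law process.

```python
import math

# Cumulative software failure times in seconds (Table 2.1, Musa 1979).
TIMES = [115, 115, 198, 376, 570, 706, 1780, 1798, 1813, 1905,
         1955, 2026, 2632, 3821, 3861, 4649, 4871, 4943, 5558, 6147,
         6162, 6552, 8415, 9752, 14260, 15094, 18494, 18500, 23061, 26229,
         36800, 37363, 40133, 40785, 46378, 58074, 64798, 67344]


def ratio_power_statistic(times):
    """Cramér-von Mises statistic C_R^2 for failure truncated data."""
    n = len(times)
    t_n = times[-1]
    log_sum = sum(math.log(t_n / t) for t in times)
    beta_unbiased = (n - 2) / log_sum

    m = n - 1  # number of transformed observations
    ratios = [(t / t_n) ** beta_unbiased for t in times[:-1]]
    statistic = 1.0 / (12 * m)
    for i, r in enumerate(ratios, start=1):
        statistic += (r - (2 * i - 1) / (2 * m)) ** 2
    return statistic


if __name__ == "__main__":
    print("C_R^2 =", ratio_power_statistic(TIMES))
```

Under the power law process the transformed values $\hat{R}_i$ behave like ordered uniform observations on $(0, 1)$, which is why they are compared with the plotting positions $(2i-1)/(2(n-1))$.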

Figure 2.2: Plot of Number of Failures against Cumulative Operating Time for Data Table 2.1

2.6.3 Log-ratio Test

In the log-ratio goodness-of-fit test for the power law process, we make the log-ratio transformation

\[ U_i = \log\frac{t_n}{t_{n-i}}. \]

If the power law process with parameters $\beta$ and $\theta$ is the proper model, it can be shown that $U_1$ […]
[…] hypothesis tests of the model parameters were given. We derived a unified form of the linearly best efficient estimate of the shape parameter $\beta$ for failure truncated and time truncated data. For completeness, we also included the estimation of the intensity function and of the mean time between failures (MTBF). In the last part, we applied a real data set to show how three existing goodness-of-fit tests for the PLP work in model checking.

Chapter 3

Bayesian Inference on the Power Law Process

Recall that the power law process can be described as a nonhomogeneous Poisson process $\{N(t), t \ge 0\}$ with intensity function

\[ \nu(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1}, \qquad \beta > 0,\ \theta > 0. \]

The mean value function $\Lambda(t)$ of the process is

\[ \Lambda(t) = E[N(t)] = \int_0^t \nu(s)\,ds = \int_0^t \frac{\beta}{\theta}\left(\frac{s}{\theta}\right)^{\beta-1}ds = \left(\frac{t}{\theta}\right)^{\beta}. \]

In this chapter, we use the transformation $\mu = \ln\theta$, $\sigma = 1/\beta$ and obtain the location parameter $\mu$ and the scale parameter $\sigma$. We develop Bayesian procedures for the power law process with failure truncated data, using non-informative priors for the scale parameter $\sigma$ and the location parameter $\mu$. Bayesian inference differs from the classical methods in that we treat the parameters $\mu$ and $\sigma$ as random variables instead of fixed numbers. In addition to obtaining the posterior density of the parameters of the power law process, Bayesian prediction inferences for the expected number of failures and the future failure times are discussed. Predictive inference based on a Bayesian estimate of the intensity function greatly simplifies the calculations. We compare our results with the paper of Bar-Lev, S.K., Lavi, I. and Reiser, B. (1992) [7] by using the data set from Musa (1979). We also derive posterior densities of the system reliability, the mean value function and the intensity function.

3.1 Likelihood Function

3.1.1 Likelihood of $y_1$

As pointed out in section 1.5, the time of the first failure $T_1$ has a Weibull distribution with scale parameter $\theta$ and shape parameter $\beta$, that is, $T_1 \sim \mathrm{Weibull}(\theta, \beta)$ with pdf

\[ f(t_1) = \frac{\beta}{\theta}\left(\frac{t_1}{\theta}\right)^{\beta-1}e^{-(t_1/\theta)^{\beta}} \]

and cdf

\[ F(t_1) = \int_0^{t_1} f(s)\,ds = \int_0^{t_1}\frac{\beta}{\theta}\left(\frac{s}{\theta}\right)^{\beta-1}e^{-(s/\theta)^{\beta}}ds = -e^{-(s/\theta)^{\beta}}\Big|_0^{t_1} = 1 - e^{-(t_1/\theta)^{\beta}}. \]

Let $Y_1 = \ln T_1$; then

\[ F(y_1) = \Pr(Y_1 < y_1) = \Pr(T_1 < e^{y_1}) = 1 - \exp\left\{-\frac{e^{\beta y_1}}{\theta^{\beta}}\right\}, \]

so that the pdf of $Y_1$ is

\[ f(y_1) = \frac{\beta}{\theta^{\beta}}\exp\left\{\beta y_1 - \frac{e^{\beta y_1}}{\theta^{\beta}}\right\}, \qquad \theta > 0,\ \beta > 0. \]

3.1.2 Likelihood of $\tilde{y} = (y_1, y_2, y_3, \ldots, y_n)$

Let $Y_k = \ln T_k$, for $k = 2, \ldots, n$, where $T_k$ is the $k$th ordered failure time. Then

\[ \Pr\{Y_k > y_k \mid t_{k-1}\} = \Pr\{\ln T_k > y_k \mid t_{k-1}\} = \Pr\{T_k > e^{y_k} \mid e^{y_{k-1}}\} = \Pr\{\text{zero failures in } (e^{y_{k-1}}, e^{y_k}]\} \]

\[ \qquad = \exp\left\{-\int_{e^{y_{k-1}}}^{e^{y_k}}\nu(s)\,ds\right\} = \exp\left\{-\Lambda(s)\Big|_{e^{y_{k-1}}}^{e^{y_k}}\right\}. \]

Since the mean value function is $\Lambda(t) = (t/\theta)^{\beta}$, we have

\[ \Lambda(s)\Big|_{e^{y_{k-1}}}^{e^{y_k}} = \frac{1}{\theta^{\beta}}e^{\beta y_k} - \frac{1}{\theta^{\beta}}e^{\beta y_{k-1}}, \qquad k = 2, \ldots, n. \]

Therefore, the cdf of $Y_k$ conditional on the observation $y_{k-1}$ is

\[ F(y_k \mid y_{k-1}) = 1 - \Pr\{Y_k > y_k \mid y_{k-1}\} = 1 - \exp\left\{-\frac{e^{\beta y_k}}{\theta^{\beta}} + \frac{e^{\beta y_{k-1}}}{\theta^{\beta}}\right\} \]

and the pdf of $Y_k$ conditional on the observation $y_{k-1}$ is

\[ f(y_k \mid y_{k-1}) = \frac{\beta}{\theta^{\beta}}\exp\left\{\beta y_k - \frac{e^{\beta y_k}}{\theta^{\beta}} + \frac{e^{\beta y_{k-1}}}{\theta^{\beta}}\right\}. \]

Hence, the joint likelihood function of $\tilde{y}$ for the failure truncated case is

\[ L(\tilde{y}) = L(y_n, y_{n-1}, y_{n-2}, \ldots, y_1) = f(y_n \mid y_{n-1})\,f(y_{n-1} \mid y_{n-2})\cdots f(y_2 \mid y_1)\,f(y_1) = \left(\frac{\beta}{\theta^{\beta}}\right)^{n}\exp\left\{\beta\sum_{i=1}^{n}y_i - \frac{1}{\theta^{\beta}}e^{\beta y_n}\right\}. \tag{3.1} \]

Similarly, it can be shown that for time truncated data the likelihood function is

\[ L(\tilde{y}) = \left(\frac{\beta}{\theta^{\beta}}\right)^{n}\exp\left\{\beta\sum_{i=1}^{n}y_i - \frac{1}{\theta^{\beta}}e^{\beta y}\right\}, \]

where $y = \ln t$ and $t$ is the stopping time. To simplify our work, we only consider the failure truncated case. Inferences for the time truncated case are similar, since the likelihood function has the same form.

Now we make a transformation of the parameters $\beta$ and $\theta$. Let $\mu = \ln\theta$ and $\sigma = 1/\beta$; then the joint likelihood function (3.1) of $\tilde{y}$ becomes

\[ L(\tilde{y}) = L(y_n, y_{n-1}, y_{n-2}, \ldots, y_1) = \frac{1}{\sigma^{n}}\exp\left\{\frac{\sum_{i=1}^{n}(y_i-\mu)}{\sigma} - \exp\left(\frac{y_n-\mu}{\sigma}\right)\right\}. \tag{3.2} \]

Hence, from the likelihood function (3.2), the classical MLEs of $\mu$ and $\sigma$ are obtained as

\[ \hat{\sigma} = y_n - \frac{1}{n}\sum_{i=1}^{n}y_i \]
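The log of the likelihood (3.2) and the closed-form MLEs can be checked against each other numerically. The sketch below is illustrative only: the function names are hypothetical, the six failure times are made-up values rather than data from the text, and it uses $\hat{\mu} = y_n - \hat{\sigma}\ln n$, which follows from setting the derivative of the log-likelihood with respect to $\mu$ to zero.

```python
import math


def log_likelihood(mu, sigma, ys):
    """Log of the likelihood (3.2) for log failure times ys (failure truncated)."""
    n = len(ys)
    return (-n * math.log(sigma)
            + sum(y - mu for y in ys) / sigma
            - math.exp((ys[-1] - mu) / sigma))


def mle_location_scale(ys):
    """Closed-form MLEs (mu_hat, sigma_hat) from the likelihood (3.2)."""
    n = len(ys)
    sigma_hat = ys[-1] - sum(ys) / n
    mu_hat = ys[-1] - sigma_hat * math.log(n)
    return mu_hat, sigma_hat


if __name__ == "__main__":
    # Hypothetical cumulative failure times, used only to exercise the code.
    times = [5.0, 40.0, 90.0, 200.0, 450.0, 700.0]
    ys = [math.log(t) for t in times]
    mu_hat, sigma_hat = mle_location_scale(ys)
    print(mu_hat, sigma_hat, log_likelihood(mu_hat, sigma_hat, ys))
```

Evaluating the log-likelihood at a few points around $(\hat{\mu}, \hat{\sigma})$ is a cheap sanity check that the closed forms really locate the maximum.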

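and

\[ \hat{\mu} = y_n - \hat{\sigma}\ln n. \]

3.2 Posterior Density of $(\mu, \sigma)$

We note that $\mu$ is a location parameter and $\sigma$ is a scale parameter. From Box and Tiao (1973) [10], the noninformative priors for $\mu$ and $\sigma$ are

\[ \pi_0(\mu) = \text{constant}, \qquad \pi_0(\sigma) \propto \frac{1}{\sigma}. \]

We also note that these priors are both improper, since their integrals are not finite. If $\mu$ and $\sigma$ are independent, the joint prior is $\pi_0(\mu, \sigma) \propto 1/\sigma$. If $\mu$ and $\sigma$ are dependent, $\pi_0(\mu, \sigma) \propto 1/\sigma^2$ (Bar-Lev, S.K. et al. 1992 [7]). Here we assume that $\mu$ and $\sigma$ are independent. Hence, by Bayes' rule the posterior density $\pi(\mu, \sigma \mid \tilde{y})$ is

\[ \pi(\mu, \sigma \mid \tilde{y}) = \frac{L(\tilde{y} \mid \mu, \sigma)\,\frac{1}{\sigma}}{\int_{-\infty}^{\infty}\int_0^{\infty}L(\tilde{y} \mid \mu, \sigma)\,\frac{1}{\sigma}\,d\sigma\,d\mu}. \tag{3.3} \]

We now compute the marginal density $m(\tilde{y})$ of $\tilde{y}$. It is given by

\[ m(\tilde{y}) = \int_{-\infty}^{\infty}\int_0^{\infty}L(\tilde{y} \mid \mu, \sigma)\,\frac{1}{\sigma}\,d\sigma\,d\mu = \int_{-\infty}^{\infty}\int_0^{\infty}\frac{1}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}(y_i-\mu)}{\sigma} - \exp\left(\frac{y_n-\mu}{\sigma}\right)\right\}d\sigma\,d\mu \]

\[ \qquad = \int_0^{\infty}\frac{1}{\sigma^{n+1}}\int_{-\infty}^{\infty}\exp\left\{\frac{\sum_{i=1}^{n}y_i}{\sigma}\right\}\exp\left\{-\frac{n\mu}{\sigma}\right\}\exp\left\{-e^{(y_n-\mu)/\sigma}\right\}d\mu\,d\sigma \]

\[ \qquad = \int_0^{\infty}\frac{1}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}y_i}{\sigma}\right\}\exp\left\{-\frac{n y_n}{\sigma}\right\}\int_{-\infty}^{\infty}\exp\left\{\frac{n(y_n-\mu)}{\sigma}\right\}\exp\left\{-e^{(y_n-\mu)/\sigma}\right\}d\mu\,d\sigma. \]

Let $h = \exp\{(y_n-\mu)/\sigma\}$; then

\[ \frac{dh}{d\mu} = e^{(y_n-\mu)/\sigma}\left(-\frac{1}{\sigma}\right). \]

We also have

\[ \int_{-\infty}^{\infty}\exp\left\{\frac{n(y_n-\mu)}{\sigma}\right\}\exp\left\{-e^{(y_n-\mu)/\sigma}\right\}d\mu = \sigma\int_0^{\infty}h^{n-1}e^{-h}\,dh = \sigma\,\Gamma(n). \]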

Thus,

\[ m(\tilde{y}) = \Gamma(n)\int_0^{\infty}\frac{1}{\sigma^{n}}\exp\left\{\frac{\sum_{i=1}^{n}y_i - n y_n}{\sigma}\right\}d\sigma = \frac{\Gamma(n)}{\left(n y_n - \sum_{i=1}^{n}y_i\right)^{n}}\int_0^{\infty}\left(\frac{n y_n - \sum_{i=1}^{n}y_i}{\sigma}\right)^{n}\exp\left\{-\frac{n y_n - \sum_{i=1}^{n}y_i}{\sigma}\right\}d\sigma = \frac{\Gamma(n)\,\Gamma(n-1)}{\left(n y_n - \sum_{i=1}^{n}y_i\right)^{n-1}}. \]

Using this in (3.3), we obtain the joint posterior density of $(\mu, \sigma)$ as

\[ \pi(\mu, \sigma \mid \tilde{y}) = c(\tilde{y})\,\frac{1}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}(y_i-\mu)}{\sigma} - \exp\left(\frac{y_n-\mu}{\sigma}\right)\right\}, \tag{3.4} \]

where

\[ c(\tilde{y}) = \frac{1}{m(\tilde{y})} = \frac{\left(n y_n - \sum_{i=1}^{n}y_i\right)^{n-1}}{\Gamma(n)\,\Gamma(n-1)}. \]

3.3 Posterior Density of $\mu$ and $\sigma$

3.3.1 Marginal Posterior Density of $\sigma$

The marginal posterior density of $\sigma$ is obtained by integrating the joint posterior density (3.4) with respect to $\mu$. That is,

\[ \pi(\sigma \mid \tilde{y}) = \int_{-\infty}^{\infty}\pi(\mu, \sigma \mid \tilde{y})\,d\mu = c(\tilde{y})\,\Gamma(n)\,\frac{1}{\sigma^{n}}\exp\left\{\frac{\sum_{i=1}^{n}y_i - n y_n}{\sigma}\right\}. \]

Figure 3.1 gives the posterior density of $\sigma$ for the data in Table 2.1. Therefore, the Bayesian point estimate for $\sigma$ under squared error loss is

\[ \tilde{\sigma}_B = E(\sigma \mid \tilde{y}) = \int_0^{\infty}\sigma\,\pi(\sigma \mid \tilde{y})\,d\sigma = \frac{n y_n - \sum_{i=1}^{n}y_i}{n-2}. \]

The Bayesian maximum likelihood estimate of $\sigma$, obtained as the maximum of the posterior pdf of $\sigma$, is

\[ \hat{\sigma}_B = \frac{n y_n - \sum_{i=1}^{n}y_i}{n}. \]
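The marginal posterior of $\sigma$ is easy to evaluate numerically, which gives a quick check on the normalizing constant $c(\tilde{y})\,\Gamma(n)$ and on the two point estimates. The sketch below uses hypothetical function names and the failure times of Table 2.1; the density is computed on the log scale with `math.lgamma` to avoid overflow, and the integration range $[0.5, 12]$ is an assumption chosen to cover essentially all of the posterior mass for this data set.

```python
import math

# Cumulative software failure times in seconds (Table 2.1, Musa 1979).
TIMES = [115, 115, 198, 376, 570, 706, 1780, 1798, 1813, 1905,
         1955, 2026, 2632, 3821, 3861, 4649, 4871, 4943, 5558, 6147,
         6162, 6552, 8415, 9752, 14260, 15094, 18494, 18500, 23061, 26229,
         36800, 37363, 40133, 40785, 46378, 58074, 64798, 67344]


def posterior_scale(times):
    """Return (n, A) with A = n * y_n - sum(y_i) for log failure times y_i."""
    n = len(times)
    return n, sum(math.log(times[-1] / t) for t in times)


def sigma_posterior_density(sigma, n, a):
    """Marginal posterior density of sigma: A^(n-1) / Gamma(n-1) * sigma^(-n) * exp(-A / sigma)."""
    log_density = ((n - 1) * math.log(a) - math.lgamma(n - 1)
                   - n * math.log(sigma) - a / sigma)
    return math.exp(log_density)


def sigma_point_estimates(times):
    """Posterior mean (squared error loss) and posterior mode of sigma."""
    n, a = posterior_scale(times)
    return a / (n - 2), a / n


def simpson(func, lower, upper, steps=4000):
    """Composite Simpson rule with an even number of steps."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * func(lower + j * width)
    return total * width / 3.0


if __name__ == "__main__":
    n, a = posterior_scale(TIMES)
    print(sigma_point_estimates(TIMES))
    print(simpson(lambda s: sigma_posterior_density(s, n, a), 0.5, 12.0))
```

The density has the form of an inverse gamma distribution, which is why the mean and mode have the simple closed forms given above.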

Figure 3.1: Marginal Density of $\sigma$

Evaluated from the data in Table 2.1, we have $\tilde{\sigma}_B = 2.653$ and $\hat{\sigma}_B = 2.513$. The marginal posterior variance of $\sigma$, denoted by $V_{\sigma \mid \tilde{y}}(\tilde{\sigma}_B)$, is the estimation error for $\tilde{\sigma}_B$:

\[ V_{\sigma \mid \tilde{y}}(\tilde{\sigma}_B) = E_{\sigma \mid \tilde{y}}(\sigma - \tilde{\sigma}_B)^2 = E_{\sigma \mid \tilde{y}}(\sigma^2) - 2\tilde{\sigma}_B\,E_{\sigma \mid \tilde{y}}(\sigma) + \tilde{\sigma}_B^2 = E(\sigma^2) - \tilde{\sigma}_B^2. \]

Since

\[ E(\sigma^2) = \int_0^{\infty}c(\tilde{y})\,\Gamma(n)\,\frac{1}{\sigma^{n-2}}\exp\left\{\frac{\sum_{i=1}^{n}y_i - n y_n}{\sigma}\right\}d\sigma = \frac{\left(n y_n - \sum_{i=1}^{n}y_i\right)^2}{(n-2)(n-3)}, \]

it follows that

\[ V_{\sigma \mid \tilde{y}}(\tilde{\sigma}_B) = \frac{\left(n y_n - \sum_{i=1}^{n}y_i\right)^2}{(n-2)^2(n-3)}. \]

Moreover, the $m$th moment is given by

\[ E(\sigma^m) = \left(n y_n - \sum_{i=1}^{n}y_i\right)^{m}\frac{\Gamma(n-m-1)}{\Gamma(n-1)}, \qquad m = 1, 2, \ldots, n-2. \]

3.3.2 Marginal Posterior Density of $\mu$

The marginal posterior density of $\mu$ is obtained by integrating the joint posterior density (3.4) with respect to $\sigma$; hence,

\[ \pi(\mu \mid \tilde{y}) = \int_0^{\infty}\pi(\mu, \sigma \mid \tilde{y})\,d\sigma = c(\tilde{y})\int_0^{\infty}\frac{1}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}(y_i-\mu)}{\sigma} - \exp\left(\frac{y_n-\mu}{\sigma}\right)\right\}d\sigma. \]

The evaluation of this integral requires numerical procedures. Here we derive an approximate estimator of $\mu$ by using Lindley's approximation.

Lindley's Approximation: Lindley (1980) [54] developed an asymptotic approximation to the ratio

\[ I = \frac{\int_{\Omega}w(\theta)\,e^{L(\theta)}\,d\theta}{\int_{\Omega}v(\theta)\,e^{L(\theta)}\,d\theta}, \]

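where $\theta = (\theta_1, \ldots, \theta_m)$, $L(\theta)$ is the logarithm of the likelihood function, $w(\theta)$ and $v(\theta)$ are arbitrary functions of $\theta$, and $\Omega$ represents the range space of $\theta$. Clearly, if $w(\theta) = u(\theta)v(\theta)$ and $v(\theta)$ is the prior distribution of $\theta$, then the posterior expectation of $u(\theta)$ given the data $x = (x_1, \ldots, x_n)$ is

\[ I = E(u(\theta) \mid x), \]

which is the Bayes estimator of $u(\theta)$ under the squared error loss function. To obtain the Bayes estimate of $\mu$, we need to approximate $I$ for $m = 2$, assuming $\theta_1$ and $\theta_2$ are independent. Lindley gave the following expansion:

\[ I = u + \frac{1}{2}\left(u_{11}\sigma_{11} + u_{22}\sigma_{22}\right) + \rho_1 u_1\sigma_{11} + \rho_2 u_2\sigma_{22} + \frac{1}{2}\left(L_{30}u_1\sigma_{11}^2 + L_{03}u_2\sigma_{22}^2 + L_{21}u_2\sigma_{11}\sigma_{22} + L_{12}u_1\sigma_{22}\sigma_{11}\right), \tag{3.5} \]

evaluated at $(\hat{\theta}_1, \hat{\theta}_2)$, where

\[ u_{11} = \frac{\partial^2 u}{\partial\theta_1^2}, \quad u_{22} = \frac{\partial^2 u}{\partial\theta_2^2}, \quad L_{30} = \frac{\partial^3 L}{\partial\theta_1^3}, \quad L_{03} = \frac{\partial^3 L}{\partial\theta_2^3}, \quad L_{21} = \frac{\partial^3 L}{\partial\theta_1^2\,\partial\theta_2}, \quad L_{12} = \frac{\partial^3 L}{\partial\theta_2^2\,\partial\theta_1}, \]

\[ \sigma_{11} = (-L_{20})^{-1} = \left(-\frac{\partial^2 L}{\partial\theta_1^2}\right)^{-1}, \quad \sigma_{22} = (-L_{02})^{-1} = \left(-\frac{\partial^2 L}{\partial\theta_2^2}\right)^{-1}, \quad \rho_1 = \frac{\partial\log v}{\partial\theta_1}, \quad \rho_2 = \frac{\partial\log v}{\partial\theta_2}. \]

In our case, $\theta = (\mu, \sigma)$ and

\[ u(\mu, \sigma) = \mu, \qquad v(\mu, \sigma) = \frac{1}{\sigma^{n+1}}, \qquad L(\mu, \sigma) = \frac{\sum_{i=1}^{n}(y_i-\mu)}{\sigma} - \exp\left(\frac{y_n-\mu}{\sigma}\right), \]

so that $u_1 = 1$, $u_2 = 0$, $u_{11} = 0$, $u_{22} = 0$ and $\rho_1 = 0$; the value of $\rho_2$ is not needed because $u_2 = 0$. The derivatives of $L$ with respect to $\mu$ are

\[ L_{10} = -\frac{n}{\sigma} + \frac{1}{\sigma}\exp\left(\frac{y_n-\mu}{\sigma}\right), \qquad L_{20} = -\frac{1}{\sigma^2}\exp\left(\frac{y_n-\mu}{\sigma}\right), \]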

\[ L_{21} = \frac{2}{\sigma^3}\exp\left(\frac{y_n-\mu}{\sigma}\right) + \frac{y_n-\mu}{\sigma^4}\exp\left(\frac{y_n-\mu}{\sigma}\right), \qquad L_{30} = \frac{1}{\sigma^3}\exp\left(\frac{y_n-\mu}{\sigma}\right), \]

\[ L_{01} = -\frac{\sum_{i=1}^{n}(y_i-\mu)}{\sigma^2} + \frac{y_n-\mu}{\sigma^2}\exp\left(\frac{y_n-\mu}{\sigma}\right), \]

\[ L_{02} = \frac{2\sum_{i=1}^{n}(y_i-\mu)}{\sigma^3} - \frac{2(y_n-\mu)}{\sigma^3}\exp\left(\frac{y_n-\mu}{\sigma}\right) - \frac{(y_n-\mu)^2}{\sigma^4}\exp\left(\frac{y_n-\mu}{\sigma}\right), \]

\[ L_{03} = -\frac{6\sum_{i=1}^{n}(y_i-\mu)}{\sigma^4} + \frac{6(y_n-\mu)}{\sigma^4}\exp\left(\frac{y_n-\mu}{\sigma}\right) + \frac{6(y_n-\mu)^2}{\sigma^5}\exp\left(\frac{y_n-\mu}{\sigma}\right) + \frac{(y_n-\mu)^3}{\sigma^6}\exp\left(\frac{y_n-\mu}{\sigma}\right), \]

\[ L_{12} = -\frac{2n}{\sigma^3} + \frac{2}{\sigma^3}\exp\left(\frac{y_n-\mu}{\sigma}\right) + \frac{4(y_n-\mu)}{\sigma^4}\exp\left(\frac{y_n-\mu}{\sigma}\right) + \frac{(y_n-\mu)^2}{\sigma^5}\exp\left(\frac{y_n-\mu}{\sigma}\right). \]

Using the expression (3.5), in which several terms are equal to zero, we obtain

\[ E(\mu \mid \tilde{x}) \approx \mu + \frac{1}{2}L_{30}\,\sigma_{11}^2 + \frac{1}{2}L_{12}\,\sigma_{22}\sigma_{11} = \mu + \frac{1}{2}L_{30}\frac{1}{L_{20}^2} + \frac{1}{2}L_{12}\frac{1}{L_{20}L_{02}} \]

\[ \qquad = \mu + \frac{\sigma}{2\exp\left(\frac{y_n-\mu}{\sigma}\right)} + \frac{-\dfrac{2n\sigma^3}{\exp\left(\frac{y_n-\mu}{\sigma}\right)} + 2\sigma^3 + 4(y_n-\mu)\sigma^2 + (y_n-\mu)^2\sigma}{-4\sigma\sum_{i=1}^{n}(y_i-\mu) + 4(y_n-\mu)\,\sigma\exp\left(\frac{y_n-\mu}{\sigma}\right) + 2(y_n-\mu)^2\exp\left(\frac{y_n-\mu}{\sigma}\right)}, \]

which is evaluated at $(\hat{\mu}, \hat{\sigma})$, the MLEs of $\mu$ and $\sigma$.

3.4 Predictive Inference

3.4.1 Predictions Based on Posterior Density

For $\mu = \ln\theta$ and $\sigma = 1/\beta$, the mean value function of the power law process becomes $\Lambda(t) = (t/e^{\mu})^{1/\sigma}$. Let $N(s_1, s_2)$ denote the number of failures occurring in the interval $(s_1, s_2]$. Then $N(s_1, s_2)$ has a Poisson distribution with mean

\[ \Lambda(s_1, s_2) = \left(\frac{s_2}{e^{\mu}}\right)^{1/\sigma} - \left(\frac{s_1}{e^{\mu}}\right)^{1/\sigma}. \]

Hence the probability that $r$ failures occur in the time interval $(s_1, s_2]$ is

\[ P[N(s_1, s_2) = r \mid \mu, \sigma] = \frac{1}{r!}\left[\Lambda(s_1, s_2)\right]^{r}\exp\left[-\Lambda(s_1, s_2)\right]. \]

Consequently, by plugging in the posterior density $\pi(\mu, \sigma \mid \tilde{y})$, the predictive distribution of $N(s_1, s_2)$ is

\[ P[N(s_1, s_2) = r \mid \tilde{y}] = \int_0^{\infty}\int_{-\infty}^{\infty}P[N(s_1, s_2) = r \mid \mu, \sigma]\,\pi(\mu, \sigma \mid \tilde{y})\,d\mu\,d\sigma \]

\[ \qquad = \frac{\Gamma(n+r)}{\Gamma(r+1)}\,c(\tilde{y})\int_0^{\infty}\sigma^{-n}\exp\left(\frac{\sum y_i}{\sigma}\right)\left(s_2^{1/\sigma} - s_1^{1/\sigma}\right)^{r}\left(s_2^{1/\sigma} - s_1^{1/\sigma} + e^{y_n/\sigma}\right)^{-(n+r)}d\sigma. \tag{3.6} \]

An important special case is $s_1 = t_n = e^{y_n}$, that is, we are interested in the probability of the number of failures occurring in some future time interval $(t_n, s_2]$. In this case, (3.6) reduces to

\[ P[N(t_n, s_2) = r \mid \tilde{y}] = \binom{n+r-1}{r}\sum_{k=0}^{r}(-1)^{k}\binom{r}{k}\left[\frac{n y_n - \sum y_i}{(n+k)\ln s_2 - \sum y_i - k y_n}\right]^{n-1}. \tag{3.7} \]

For the special case $s_1 = 0$, $s_2 = s$, that is, another equivalent system is to begin operating and we want to predict the number of failures of the new system over the interval $(0, s]$ for some $s$ of interest, (3.6) reduces to

\[ P[N(0, s) = r \mid \tilde{y}] = \frac{\Gamma(n+r)}{\Gamma(r+1)}\,c(\tilde{y})\int_0^{\infty}\sigma^{-n}\prod_{i=1}^{n}\left(\frac{t_i}{s}\right)^{1/\sigma}\left[1 + \left(\frac{t_n}{s}\right)^{1/\sigma}\right]^{-(n+r)}d\sigma. \tag{3.8} \]

The integrals in (3.6) and (3.8) can be computed numerically. However, for the case $s > t_n$ (that is, $\ln s > y_n$), (3.8) can be written as an infinite sum (Bar-Lev, S.K. et al. 1992 [7]):

\[ P[N(0, s) = r \mid \tilde{y}] = \binom{n+r-1}{r}\sum_{k=0}^{\infty}\binom{-(n+r)}{k}\left[\frac{n y_n - \sum y_i}{(n+k)\ln s - \sum y_i - k y_n}\right]^{n-1}. \]

If $s = t_n$, then (3.8) reduces to

\[ P[N(0, t_n) = r \mid \tilde{y}] = \binom{n+r-1}{n-1}\Big/ 2^{\,n+r}. \]

We shall now discuss the prediction of future failure times. Given the currently available data $t_1, t_2, \ldots, t_n$, we have to predict the future $(n+r)$th failure time $T_{n+r}$. Define $Z_r =$

$T_{n+r} - T_n$, conditional on the observation $T_n = t_n$. Then the prediction of $T_{n+r}$ is equivalent to the prediction of $Z_r$. From equation (3.7), we have

\[ P[N(t_n, s_2) = r \mid \tilde{y}] = \binom{n+r-1}{r}\sum_{k=0}^{r}(-1)^{k}\binom{r}{k}\,\phi_k^{\,n-1}, \qquad \text{where } \phi_k = \frac{n y_n - \sum y_i}{(n+k)\ln s_2 - \sum y_i - k y_n}, \]

and, with $s_2 = t_n + z$,

\[ P(Z_r \le z \mid \tilde{y}) = P(\text{at least } r \text{ failures in } (t_n, t_n+z] \mid \tilde{y}) = 1 - P(\text{at most } r-1 \text{ failures in } (t_n, t_n+z] \mid \tilde{y}) \]

\[ \qquad = 1 - \sum_{j=0}^{r-1}\binom{n+j-1}{j}\sum_{k=0}^{j}\binom{j}{k}(-1)^{k}\,\phi_k^{\,n-1}. \tag{3.9} \]

The time of the $(n+r)$th failure can now be estimated by evaluating $E(Z_r)$.

3.4.2 Prediction Inference Based on Estimation of Intensity Function

Let $Q = \nu(T_n)/\hat{\nu}(T_n)$. It is known that $nQ \sim \chi^2_{n-1}$ approximately (Lee, L. and Lee, K. 1987). Here

\[ \hat{\nu}(T_n) = \frac{e^{-\hat{\mu}/\hat{\sigma}}}{\hat{\sigma}}\,T_n^{1/\hat{\sigma}-1} \quad \text{and} \quad T_n = e^{Y_n}, \]

where $\hat{\mu}$ and $\hat{\sigma}$ are the classical MLEs of $\mu$ and $\sigma$ respectively. Similarly,

\[ \nu(T_n) = \frac{e^{-\mu/\sigma}}{\sigma}\,T_n^{1/\sigma-1}, \]

which gives

\[ \mu = -\sigma\ln\frac{\sigma\nu}{t_n^{1/\sigma-1}}. \]

Recall that the noninformative priors are $\pi_0(\mu) = \text{constant} = c$ and $\pi_0(\sigma) \propto 1/\sigma$. Hence, the prior of $\nu$ conditional on $\sigma$ is

\[ \pi_0(\nu \mid \sigma) = c\left|\frac{d\mu}{d\nu}\right| = \frac{c\,\sigma}{\nu}. \]

Therefore the prior of $\nu$ is

\[ \pi_0(\nu) \propto \int\pi_0(\nu \mid \sigma)\,\pi_0(\sigma)\,d\sigma \propto \frac{1}{\nu}. \]

If $\nu$ is given, then from the equation $\hat{\nu} = \nu/Q$ and $nQ \sim \chi^2_{n-1}$, the p.d.f. $L(\hat{\nu} \mid \nu)$ of $\hat{\nu}$ conditional on $\nu$ is

\[ L(\hat{\nu} \mid \nu) = \frac{\left(\frac{n\nu}{\hat{\nu}}\right)^{\frac{n-3}{2}}e^{-\frac{n\nu}{2\hat{\nu}}}}{2^{\frac{n-1}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)}\cdot\frac{n\nu}{\hat{\nu}^2}. \]

By Bayes' rule, the posterior density satisfies $\pi(\nu(T_n) \mid \hat{\nu}(T_n)) \propto L(\hat{\nu} \mid \nu)\,\pi_0(\nu)$, and

\[ \pi(\nu \mid \hat{\nu}) = \frac{L(\hat{\nu} \mid \nu)\,\pi_0(\nu)}{\int L(\hat{\nu} \mid \nu)\,\pi_0(\nu)\,d\nu} = \frac{\left(\frac{n\nu}{\hat{\nu}}\right)^{\frac{n-3}{2}}e^{-\frac{n\nu}{2\hat{\nu}}}}{2^{\frac{n-1}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)}\cdot\frac{n}{\hat{\nu}}. \tag{3.10} \]

It is concluded from the posterior density (3.10) that $n\nu/\hat{\nu} \sim \chi^2_{n-1}$. The Bayesian point estimates for $\nu$ are the posterior mean and the posterior mode,

\[ \tilde{v}_B = E(\nu) = \frac{n-1}{n}\,\hat{\nu}, \qquad \hat{v}_B = \frac{n-3}{n}\,\hat{\nu}. \]

We now use $\tilde{v}_B$ to give prediction inference. The probability of the number of failures in the time interval $(t_n, y]$ is

\[ P[N(t_n, y) = r \mid \tilde{v}] = \frac{e^{-\int_{t_n}^{y}\tilde{v}_B(t)\,dt}\left(\int_{t_n}^{y}\tilde{v}_B(t)\,dt\right)^{r}}{r!}. \tag{3.11} \]

The probability distribution of the $(n+1)$th failure time is

\[ P(Z_1 \le z \mid \tilde{t}) = \Pr(\text{at least 1 failure in } [t_n, t_n+z] \mid \tilde{t}) = 1 - \Pr(\text{no failure in } [t_n, t_n+z] \mid \tilde{t}) = 1 - \exp\left(-\int_{t_n}^{t_n+z}\tilde{v}_B(t)\,dt\right). \tag{3.12} \]

One of the drawbacks of Bayesian analysis is the need for numerical calculation. Predictive inference based on the Bayesian estimate $\tilde{v}_B$ of the intensity function greatly simplifies the computation. In the following section, we use a data set to show that

the results remain very close to those of the traditional posterior density approach, which usually needs a double integral.

3.4.3 Data Comparison

Figure 3.2 displays the predictive probability function of $N(t_n, s_2)$ for the data in Table 2.1 with $t_n = 67{,}344$ and $s_2 = 80{,}000$. The line with circles represents the predictive probability distribution (3.7) based on the posterior density, while the line with stars represents the predictive probability distribution (3.11) based on a Bayesian estimate of the intensity function $v$. In the time interval (67,344, 80,000), the peak point shows that the most probable number of failures is three, with a probability of about 1/4.

Figure 3.3 displays the predictive probability function of the $(n+1)$th failure time $Z_1$ given by (3.9) and (3.12) for the data in Table 2.1 with $t_n = 67{,}344$ and $s_2 = 80{,}000$. As in Figure 3.2, the line with circles represents the predictive probability distribution (3.9) based on the posterior density, while the line with stars represents the predictive probability distribution (3.12) based on a Bayesian estimate of the intensity function $v$. The next failure will almost certainly occur within the next 14,000 seconds, and within 2,000 seconds there is a probability of about 1/3 of a failure occurring.

Figures 3.2 and 3.3 show that the two different approaches give very close results. However, the approach based on the Bayesian estimate of the intensity function simplifies the computation and does not require numerical integration.

3.5 Posterior Density for Some Parametric Functions

In this section, we shall derive posterior distributions of functions of $(\mu, \sigma)$ which are of particular interest. Those functions are the system reliability, the expected number of failures in some time interval and the intensity function.
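The predictive quantities compared in Section 3.4.3 can be evaluated with a short script. The sketch below is an illustration under explicit assumptions, not the code used for the figures: the function names are hypothetical; the alternating sum in (3.7) is evaluated with Python's `decimal` module at 80 digits because the terms cancel heavily in ordinary floating point when $r$ is large; the posterior-based distribution of $Z_1$ is (3.9) with $r = 1$; and the plug-in version of (3.12) assumes that $\tilde{v}_B$ is held constant at its value at $t_n$ over the prediction window.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 80  # high precision for the alternating sum in (3.7)

# Cumulative software failure times in seconds (Table 2.1, Musa 1979).
TIMES = [115, 115, 198, 376, 570, 706, 1780, 1798, 1813, 1905,
         1955, 2026, 2632, 3821, 3861, 4649, 4871, 4943, 5558, 6147,
         6162, 6552, 8415, 9752, 14260, 15094, 18494, 18500, 23061, 26229,
         36800, 37363, 40133, 40785, 46378, 58074, 64798, 67344]


def predictive_count_pmf(times, s2, r):
    """P[N(t_n, s2) = r | data] from (3.7)."""
    n = len(times)
    ys = [Decimal(t).ln() for t in times]
    a = n * ys[-1] - sum(ys)              # n * y_n - sum(y_i)
    base = n * Decimal(s2).ln() - sum(ys)
    step = Decimal(s2).ln() - ys[-1]
    total = Decimal(0)
    for k in range(r + 1):
        phi_k = a / (base + k * step)
        total += (-1) ** k * math.comb(r, k) * phi_k ** (n - 1)
    return float(math.comb(n + r - 1, r) * total)


def prob_next_failure_posterior(times, z):
    """P(Z_1 <= z | data) from (3.9) with r = 1."""
    n, t_n = len(times), times[-1]
    a = sum(math.log(t_n / t) for t in times)
    phi_0 = a / (a + n * math.log((t_n + z) / t_n))
    return 1.0 - phi_0 ** (n - 1)


def prob_next_failure_plugin(times, z):
    """P(Z_1 <= z) from (3.12), assuming a constant intensity v_B over (t_n, t_n + z]."""
    n, t_n = len(times), times[-1]
    beta_mle = n / sum(math.log(t_n / t) for t in times)
    v_b = (n - 1) / n * (n * beta_mle / t_n)  # posterior mean of the intensity at t_n
    return 1.0 - math.exp(-v_b * z)


if __name__ == "__main__":
    for r in range(8):
        print(r, predictive_count_pmf(TIMES, 80000, r))
    print(prob_next_failure_posterior(TIMES, 2000), prob_next_failure_plugin(TIMES, 2000))
```

Two internal checks are available: the probabilities from (3.7) should sum to one over $r$, and $1 - P[N(t_n, t_n+z) = 0 \mid \tilde{y}]$ should agree with the posterior-based probability for $Z_1$.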

Figure 3.2: The Predictive Probability Function of $N(67{,}344;\ 80{,}000)$

Figure 3.3: The Probability of $(n+1)$th Failure Time

3.5.1 Posterior Density for System Reliability

Recall that the reliability function is defined to be the probability of no failures over a specified time interval. For a given repairable system for which data have been collected, a high reliability over some future time of interest will affect decisions on replacement. Also, a high reliability for some period of interest in reliability growth may imply that it is worthwhile ending the development process. With this in mind, set

\[ R = R(y, s) = P[N(e^{y}, e^{s}) = 0] = \exp\left\{-e^{(s-\mu)/\sigma} + e^{(y-\mu)/\sigma}\right\}, \]

which implies

\[ \mu = \sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{-\ln r}. \]

Since $R$ is increasing in $\mu$, the posterior cumulative distribution of the reliability is

\[ F(r) = \int_0^{\infty}\Pr\left[\mu \le \sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{-\ln r}\ \Big|\ \sigma\right]\pi(\sigma \mid \tilde{y})\,d\sigma. \]

Hence, the posterior pdf of the reliability is

\[ f(r \mid \tilde{y}) = \int_0^{\infty}\pi\!\left(\sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{-\ln r},\ \sigma\ \Big|\ \tilde{y}\right)\frac{\partial}{\partial r}\left[\sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{-\ln r}\right]d\sigma \]

\[ \qquad = \int_0^{\infty}\frac{c(\tilde{y})}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}y_i}{\sigma} - n\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{-\ln r} + \frac{e^{y_n/\sigma}\ln r}{e^{s/\sigma} - e^{y/\sigma}}\right\}\left(\frac{-\sigma}{r\ln r}\right)d\sigma \]

\[ \qquad = \frac{(-\ln r)^{n}}{-r\ln r}\int_0^{\infty}\frac{c(\tilde{y})}{\sigma^{n}}\exp\left\{\frac{\sum_{i=1}^{n}y_i}{\sigma}\right\}r^{\,e^{y_n/\sigma}/(e^{s/\sigma} - e^{y/\sigma})}\left(e^{s/\sigma} - e^{y/\sigma}\right)^{-n}d\sigma. \]

As a special case, we consider an equivalent system which is just beginning to operate, i.e., the time interval $(0, e^{s}]$:

\[ f(r \mid \tilde{y}) = \frac{(-\ln r)^{n}}{-r\ln r}\int_0^{\infty}\frac{c(\tilde{y})}{\sigma^{n}}\exp\left\{\frac{\sum_{i=1}^{n}y_i - ns}{\sigma}\right\}r^{\,e^{(y_n-s)/\sigma}}\,d\sigma. \]

For $s = \ln t_n = y_n$, we obtain

\[ f(r \mid \tilde{y}) = (-\ln r)^{n-1}\,c(\tilde{y})\int_0^{\infty}\frac{1}{\sigma^{n}}\exp\left\{-\frac{n y_n - \sum_{i=1}^{n}y_i}{\sigma}\right\}d\sigma = \frac{(-\ln r)^{n-1}}{\Gamma(n)}, \qquad 0 < r < 1. \]

3.5.2 Posterior Density for the Expected Number of Failures in Some Time Interval

As with the reliability function, in a reliability growth model we stop modifying the system when the expected number of failures in some period of interest becomes small enough. Here the expected number of failures in a given interval is of interest. The expected number of failures over the time interval $(e^{y}, e^{s}]$ is the increment of the mean value function over that interval, which is

\[ m = m(y, s) = e^{(s-\mu)/\sigma} - e^{(y-\mu)/\sigma}, \]

which implies

\[ \mu = \sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{m}. \]

Since $m$ is decreasing in $\mu$, the posterior cdf of the mean value is

\[ F(m) = 1 - \int_0^{\infty}\Pr\left[\mu \le \sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{m}\ \Big|\ \sigma\right]\pi(\sigma \mid \tilde{y})\,d\sigma. \]

Hence, the posterior density of $m$ is

\[ f(m \mid \tilde{y}) = -\int_0^{\infty}\pi\!\left(\sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{m},\ \sigma\ \Big|\ \tilde{y}\right)\frac{\partial}{\partial m}\left[\sigma\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{m}\right]d\sigma \]

\[ \qquad = -\int_0^{\infty}\frac{c(\tilde{y})}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}y_i}{\sigma} - n\ln\frac{e^{s/\sigma} - e^{y/\sigma}}{m} - \frac{m\,e^{y_n/\sigma}}{e^{s/\sigma} - e^{y/\sigma}}\right\}\left(-\frac{\sigma}{m}\right)d\sigma \]

\[ \qquad = m^{n-1}\int_0^{\infty}\frac{c(\tilde{y})}{\sigma^{n}}\exp\left\{\frac{\sum_{i=1}^{n}y_i}{\sigma} - \frac{m\,e^{y_n/\sigma}}{e^{s/\sigma} - e^{y/\sigma}}\right\}\left(e^{s/\sigma} - e^{y/\sigma}\right)^{-n}d\sigma. \]

3.5.3 Posterior Density for Intensity Function

After the completion of the testing stage of a system, the system is supposed to have a constant failure rate (intensity function) throughout its useful lifetime. Thus we are interested in the value of the intensity function at the time development stops. At time $t = e^{y}$ the intensity function is

\[ \nu = \nu(y) = \frac{1}{\sigma}\exp\left\{\frac{y-\mu}{\sigma} - y\right\}, \]

which implies

\[ \mu = y - \sigma\left[\ln(\sigma\nu) + y\right]. \]

Since $\nu$ is decreasing in $\mu$, we can write the cdf of $\nu$ as

\[ F(\nu) = 1 - \int_0^{\infty}\Pr\left[\mu \le y - \sigma\left(\ln(\sigma\nu) + y\right)\ \big|\ \sigma\right]\pi(\sigma \mid \tilde{y})\,d\sigma. \]

Therefore, the posterior density for $\nu$ is

\[ f(\nu \mid \tilde{y}) = -\int_0^{\infty}\pi\!\left(y - \sigma[\ln(\sigma\nu) + y],\ \sigma\ \big|\ \tilde{y}\right)\frac{\partial}{\partial\nu}\Big[y - \sigma\left(\ln(\sigma\nu) + y\right)\Big]d\sigma \]

\[ \qquad = \int_0^{\infty}\frac{c(\tilde{y})}{\sigma^{n+1}}\exp\left\{\frac{\sum_{i=1}^{n}y_i - ny}{\sigma} + n\left[\ln(\sigma\nu) + y\right] - \sigma\nu\,e^{\,y + (y_n-y)/\sigma}\right\}\frac{\sigma}{\nu}\,d\sigma \]

\[ \qquad = \nu^{n-1}e^{ny}\int_0^{\infty}c(\tilde{y})\exp\left\{\frac{\sum_{i=1}^{n}y_i - ny}{\sigma} - \sigma\nu\,e^{\,y + (y_n-y)/\sigma}\right\}d\sigma. \]

We have obtained posterior densities for the reliability function, the mean value function and the intensity function. Therefore, we can find Bayesian point estimates using numerical calculation. If the integrals cannot be evaluated in closed form, many mathematical software packages, such as Mathematica, Maple and Matlab, can perform the double integration.

3.6 Summary

We applied a logarithm transformation to the scale parameter $\theta$ and a reciprocal transformation to the shape parameter $\beta$. Then we obtained a location parameter $\mu$ and

PAGE 60

a scale parameter. This makes noninformative priors more appropriate. We developed Bayesian procedures for the power law process with failure-truncated data based on the transformed parameters. Basic Bayesian results such as the posterior density, the marginal posterior distribution of each parameter and Bayesian point estimates of the parameters are obtained. The estimates include the estimate under the squared loss function and the Bayesian MLE. The posterior variance for the estimate under squared loss and the $m$th moment are also derived. We applied Lindley's approximation to find the estimate under the squared loss function.

Then we discussed Bayesian prediction inferences for the expected number of failures and the future failure times. There are two ways to approach this. One uses the posterior density; the other uses a newly developed Bayesian estimate of the intensity function. Predictive inference with the second approach greatly simplifies the calculations. We compare our results with the paper of Bar-Lev, S.K. et al. (1992) [7] by using the data set from Musa (1979). Finally, we also derive posterior densities of the system reliability, the mean value function and the intensity function.

PAGE 61

Chapter 4
Empirical Bayes Analysis on the Power Law Process

In this chapter we focus on Empirical Bayes (EB) analysis on the Power Law Process by employing parametric EB priors and nonparametric EB priors. For the parametric EB priors, we apply a two-hyperparameter natural conjugate prior and a more generalized three-hyperparameter natural conjugate prior. Those priors were stated in Huang and Bier (1998) [33]. Here we derive an Empirical Bayes procedure to estimate the natural conjugate priors.

To compare with the previous chapter: when we completely know the prior, the approach is Bayesian. If we do not know the prior completely, we use Empirical Bayes, assuming that the parameters of the prior are fixed but unknown. Since we have past experience about the parameters of the model, we can employ data to estimate the hyperparameters of the priors, and hence estimate the priors, in parametric Empirical Bayes.

4.1 Parametric Empirical Bayes on the PLP

Parametric empirical Bayes procedures are easier to work with if the intensity function is parametrized as
\[ \lambda(t) = \lambda \beta t^{\beta - 1}, \quad t > 0 . \]
Throughout most of this dissertation, we have used
\[ \lambda(t) = \frac{\beta}{\theta} \left( \frac{t}{\theta} \right)^{\beta - 1}, \quad \text{for } \beta > 0,\ \theta > 0 . \]

PAGE 62

Figure 4.1: Bayesian Model for the PLP

The relationship between $\lambda$ and $\theta$ is $\lambda = \theta^{-\beta}$. If $k$ systems are similar, but not identical, a parametric empirical Bayes (PEB) approach can be applied. We assume that the system parameters (in our case the $(\lambda_i, \beta_i)$'s) are drawn from some prior distribution $\pi(\lambda, \beta)$. Figure 4.1 displays the Bayesian framework of the previous chapter. Figure 4.2 displays the parametric empirical Bayes model for the PLP. We assume the $(\lambda_i, \beta_i)$'s make up a random sample selected from the prior. This assumption is reasonable if the $k$ systems are made by the same manufacturing process. Here we employ natural conjugate priors with two and three hyperparameters.

4.1.1 Two Hyperparameters $(a, m)$

Assume there are $k$ systems. Let $\tilde{t}_i$ denote the vector of failure times for system $i$, and let
\[ T = [\tilde{t}_1, \tilde{t}_2, \ldots, \tilde{t}_k] \]

PAGE 63

Figure 4.2: Parametric Empirical Bayes Model for the PLP

denote the two-dimensional array of failure times. The likelihood function of the first $n_i$ failure times for system $i$ is
\[ L(\tilde{t}_i \mid \lambda, \beta) = L(t_{i1}, t_{i2}, \ldots, t_{i n_i} \mid \lambda, \beta) = \lambda^{n_i} \beta^{n_i} \prod_{j=1}^{n_i} t_{ij}^{\beta - 1} \exp\!\left( -\lambda t_{n_i}^{\beta} \right). \tag{4.1} \]
The natural conjugate prior distribution for the power law failure model is given by
\[ \pi_0(\lambda, \beta \mid m, a) = c^{-1} \lambda^{m-1} \beta^{m-1} \left[ \exp(-a)\, t_{n_i}^{m} \right]^{\beta - 1} \exp\!\left( -\lambda t_{n_i}^{\beta} \right), \tag{4.2} \]
where $m, a$ are positive hyperparameters, $c$ is a constant, and $t_{n_i}$ is a fixed truncated failure time for system $i$:
\[ c = \int\!\!\int \lambda^{m-1} \beta^{m-1} \left[ \exp(-a)\, t_{n_i}^{m} \right]^{\beta - 1} \exp\!\left( -\lambda t_{n_i}^{\beta} \right) d\lambda\, d\beta = \Gamma^2(m)\, a^{-m} \left[ \exp(-a)\, t_{n_i}^{m} \right]^{-1}. \tag{4.3} \]
Then the marginal distribution of $\tilde{t}_i$ given $m$, $a$ is
\[ m(\tilde{t}_i \mid m, a) = \int\!\!\int L(t_{i1}, t_{i2}, \ldots, t_{i n_i} \mid \lambda, \beta)\, \pi_0(\lambda, \beta \mid m, a)\, d\lambda\, d\beta \]
\[ = c^{-1} \int\!\!\int \lambda^{n_i + m - 1} \beta^{n_i + m - 1} \Big[ \exp(-a)\, t_{n_i}^{m} \prod_{j=1}^{n_i} t_{ij} \Big]^{\beta - 1} \exp\!\left( -2 \lambda t_{n_i}^{\beta} \right) d\lambda\, d\beta \]

PAGE 64

\[ = c^{-1} \int 2^{-(m + n_i)}\, \Gamma(n_i + m)\, t_{n_i}^{-(m + n_i)\beta} \Big[ \exp(-a)\, t_{n_i}^{m} \prod_{j=1}^{n_i} t_{ij} \Big]^{\beta - 1} \beta^{n_i + m - 1}\, d\beta \]
\[ = c^{-1}\, \frac{2^{-(m + n_i)}\, \Gamma^2(n_i + m) \left[ a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij} \right]^{-(n_i + m)}}{\exp(-a)\, t_{n_i}^{m} \prod_{j=1}^{n_i} t_{ij}} \]
\[ = \frac{2^{-(m + n_i)}\, \Gamma^2(n_i + m) \left[ a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij} \right]^{-(n_i + m)}}{\prod_{j=1}^{n_i} t_{ij}\; \Gamma^2(m)\, a^{-m}} . \tag{4.4} \]
Since the $k$ systems are independent, the marginal distribution of $T$ given $m$ and $a$ is
\[ m(T \mid m, a) = m(\tilde{t}_1 \mid m, a)\, m(\tilde{t}_2 \mid m, a) \cdots m(\tilde{t}_k \mid m, a) = \prod_{i=1}^{k} \left\{ \frac{\Gamma^2(n_i + m) \left[ a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij} \right]^{-(n_i + m)}}{2^{m + n_i} \prod_{j=1}^{n_i} t_{ij}\; \Gamma^2(m)\, a^{-m}} \right\}. \tag{4.5} \]
In order to obtain the MLEs of $m$ and $a$, we take the natural logarithm of (4.5):
\[ \ln[m(T \mid m, a)] = \sum_{i=1}^{k} \Big\{ -(m + n_i) \ln 2 + 2 \ln \Gamma(n_i + m) - (n_i + m) \ln\Big[ a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij} \Big] - \ln \prod_{j=1}^{n_i} t_{ij} - 2 \ln \Gamma(m) + m \ln a \Big\}. \tag{4.6} \]
Taking the derivative of (4.6) with respect to $a$ and setting it equal to zero, we have
\[ \sum_{i=1}^{k} \left[ \frac{n_i + m}{a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij}} - \frac{m}{a} \right] = 0 . \tag{4.7} \]
Similarly, taking the derivative of (4.6) with respect to $m$ and setting it equal to zero, we have
\[ \sum_{i=1}^{k} \left[ -\ln 2 + 2\, \frac{\Gamma'(n_i + m)}{\Gamma(n_i + m)} - \ln\Big[ a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij} \Big] - 2\, \frac{\Gamma'(m)}{\Gamma(m)} + \ln a \right] = 0 . \tag{4.8} \]
According to the following well-known property of the Gamma function,
\[ -\frac{\Gamma'(z)}{\Gamma(z)} = \frac{1}{z} + \gamma + \sum_{n=1}^{\infty} \left[ \frac{1}{n + z} - \frac{1}{n} \right], \]
we have $\Gamma'(n_i + m)/\Gamma(n_i + m) - \Gamma'(m)/\Gamma(m) = \sum_{j=1}^{n_i} 1/(m + j - 1)$, so equation (4.8) simplifies to
\[ 2 \sum_{i=1}^{k} \sum_{j=1}^{n_i} \frac{1}{m + j - 1} + k \ln \frac{a}{2} - \sum_{i=1}^{k} \ln\Big[ a + n_i \ln t_{n_i} - \ln \prod_{j=1}^{n_i} t_{ij} \Big] = 0 . \tag{4.9} \]

PAGE 65

The MLEs of $a$ and $m$ can be obtained by solving equations (4.7) and (4.9) numerically. In general, the likelihood equations do not admit a closed-form solution, and a numerical method must be employed to approximate the MLEs of the hyperparameters $(a, m)$.

In the special case that we have one system, that is $k = 1$, with observations $t_1, t_2, \ldots, t_n$, we are able to obtain the estimate of $a$ in closed form from the Bayesian maximum likelihood approach. The following results shall be considered as Bayesian inference.
\[ \hat{a} = \hat{m} \ln t_n - \frac{\hat{m}}{n} \ln \prod_{i=1}^{n} t_i . \]
We can use the Newton-Raphson method to get the MLE for $m$. The posterior distribution is
\[ \pi(\lambda, \beta \mid \tilde{t}, n, \hat{m}, \hat{a}) = \frac{1}{m(\tilde{t} \mid \hat{m}, \hat{a})}\, c^{-1} \lambda^{n + \hat{m} - 1} \beta^{n + \hat{m} - 1} \Big[ \exp(-\hat{a})\, t_n^{\hat{m}} \prod_{i=1}^{n} t_i \Big]^{\beta - 1} \exp\!\left( -2 \lambda t_n^{\beta} \right). \tag{4.10} \]
Hence, the prior distribution and the posterior distribution are from the same family; the priors are natural conjugate priors. The posterior mean for $\lambda$ is
\[ \tilde{\lambda}_B = E(\lambda) = \int\!\!\int \lambda\, \pi(\lambda, \beta \mid \tilde{t}, n, \hat{m}, \hat{a})\, d\lambda\, d\beta = \frac{c^{-1}\, 2^{-(n + \hat{m} + 1)}\, \Gamma(n + \hat{m} + 1)\, \Gamma(\hat{m} + n) \left[ \hat{a} - \ln \prod_{i=1}^{n} t_i + (n + 1) \ln t_n \right]^{-(n + \hat{m})}}{m(\tilde{t} \mid \hat{m}, \hat{a}) \left[ \exp(-\hat{a})\, t_n^{\hat{m}} \prod_{i=1}^{n} t_i \right]} \]
\[ = \frac{\Gamma(n + \hat{m} + 1)}{2\, \Gamma(\hat{m} + n)} \left\{ \frac{\hat{a} + (n + 1) \ln t_n - \ln \prod_{i=1}^{n} t_i}{\hat{a} + n \ln t_n - \ln \prod_{i=1}^{n} t_i} \right\}^{-(n + \hat{m})}. \tag{4.11} \]
The posterior mean for $\beta$ is
\[ \tilde{\beta}_B = E(\beta) = \int\!\!\int \beta\, \pi(\lambda, \beta \mid \tilde{t}, n, \hat{m}, \hat{a})\, d\lambda\, d\beta = \frac{c^{-1}\, 2^{-(n + \hat{m})}\, \Gamma(n + \hat{m} + 1)\, \Gamma(\hat{m} + n) \left[ \hat{a} - \ln \prod_{i=1}^{n} t_i + n \ln t_n \right]^{-(n + \hat{m} + 1)}}{m(\tilde{t} \mid \hat{m}, \hat{a}) \left[ \exp(-\hat{a})\, t_n^{\hat{m}} \prod_{i=1}^{n} t_i \right]} \]
\[ = \frac{\Gamma(n + \hat{m} + 1)}{\Gamma(n + \hat{m}) \left( \hat{a} + n \ln t_n - \sum_{i=1}^{n} \ln t_i \right)} . \tag{4.12} \]
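The closed form for $\hat{a}$ in the single-system case can be checked numerically against the log marginal likelihood (4.6). The following is a minimal Python sketch of that check, not the procedure used in this dissertation: the function names and the failure times are hypothetical, and $m$ is held fixed rather than estimated.

```python
import math


def log_marginal_likelihood(systems, m, a):
    """Log marginal likelihood (4.6) summed over failure-truncated systems.

    `systems` is a list of increasing failure-time lists, one list per system.
    """
    total = 0.0
    for times in systems:
        n = len(times)
        sum_log = sum(math.log(t) for t in times)
        # s = n_i * ln(t_{n_i}) - ln(prod_j t_ij), non-negative for sorted times
        s = n * math.log(times[-1]) - sum_log
        total += (-(m + n) * math.log(2.0)
                  + 2.0 * math.lgamma(n + m)
                  - (n + m) * math.log(a + s)
                  - sum_log
                  - 2.0 * math.lgamma(m)
                  + m * math.log(a))
    return total


def a_hat_single_system(times, m):
    """Closed-form solution of (4.7) for k = 1: a = m ln(t_n) - (m/n) ln(prod t_i)."""
    n = len(times)
    sum_log = sum(math.log(t) for t in times)
    return m * math.log(times[-1]) - (m / n) * sum_log


# Hypothetical failure times for one system, with m fixed at 2 for illustration.
times = [1.0, 3.0, 6.0, 10.0, 15.0]
a_hat = a_hat_single_system(times, 2.0)
print(a_hat, log_marginal_likelihood([times], 2.0, a_hat))
```

For fixed $m$, moving $a$ away from `a_hat` in either direction should lower the value returned by `log_marginal_likelihood`, which is what equation (4.7) asserts for $k = 1$.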

PAGE 66

4.1.2 Three Hyperparameters $(a, m, y_m)$

The situation is similar to that of the previous subsection, but here we take a more general conjugate prior. Recall the likelihood function:
\[ L(\tilde{t}_i \mid \lambda, \beta) = L(t_{i1}, t_{i2}, \ldots, t_{i n_i} \mid \lambda, \beta) = \lambda^{n_i} \beta^{n_i} \prod_{j=1}^{n_i} t_{ij}^{\beta - 1} \exp\!\left( -\lambda t_{n_i}^{\beta} \right). \]
The natural conjugate prior distribution for the power law failure model is given by
\[ \pi_0(\lambda, \beta \mid m, a, y_m) \propto \lambda^{m-1} \beta^{m-1} \Big( \prod_{i=1}^{m} y_i \Big)^{\beta - 1} \exp\!\left( -\lambda y_m^{\beta} \right). \tag{4.13} \]
The parameters $y_1, \ldots, y_m$ can be interpreted as a pseudo data set, where $m$ is the number of failures and $y_i$ is the time of the $i$th failure. For simplicity and without loss of generality, we can choose $\prod_{i=1}^{m} y_i = \exp(-a)\, y_m^{m}$. Thus the natural conjugate prior becomes
\[ \pi_0(\lambda, \beta \mid m, a, y_m) = c^{-1} \lambda^{m-1} \beta^{m-1} \left[ \exp(-a)\, y_m^{m} \right]^{\beta - 1} \exp\!\left( -\lambda y_m^{\beta} \right), \tag{4.14} \]
where $m, a, y_m$ are positive hyperparameters and $c$ is a constant,
\[ c = \int\!\!\int \lambda^{m-1} \beta^{m-1} \left[ \exp(-a)\, y_m^{m} \right]^{\beta - 1} \exp\!\left( -\lambda y_m^{\beta} \right) d\lambda\, d\beta = \Gamma^2(m)\, a^{-m} \left[ \exp(-a)\, y_m^{m} \right]^{-1}. \tag{4.15} \]
Then the marginal distribution of $\tilde{t}_i$ given $m$, $a$, $y_m$ is
\[ m(\tilde{t}_i \mid m, a, y_m) = \int\!\!\int L(t_{i1}, t_{i2}, \ldots, t_{i n_i} \mid \lambda, \beta)\, \pi_0(\lambda, \beta \mid m, a, y_m)\, d\lambda\, d\beta \]
\[ = c^{-1} \int\!\!\int \lambda^{n_i + m - 1} \beta^{n_i + m - 1} \Big[ \exp(-a)\, y_m^{m} \prod_{j=1}^{n_i} t_{ij} \Big]^{\beta - 1} \exp\!\left[ -\lambda \left( t_{n_i}^{\beta} + y_m^{\beta} \right) \right] d\lambda\, d\beta \]
\[ = \frac{1}{c} \int \Gamma(n_i + m) \left[ t_{n_i}^{\beta} + y_m^{\beta} \right]^{-(m + n_i)} \Big[ \exp(-a)\, y_m^{m} \prod_{j=1}^{n_i} t_{ij} \Big]^{\beta - 1} \beta^{n_i + m - 1}\, d\beta . \tag{4.16} \]
Since the $k$ systems are independent, the marginal distribution of $T$ given $m$, $a$ and $y_m$ is
\[ m(T \mid m, a, y_m) = m(\tilde{t}_1 \mid m, a, y_m)\, m(\tilde{t}_2 \mid m, a, y_m) \cdots m(\tilde{t}_k \mid m, a, y_m). \tag{4.17} \]
Again the likelihood equations do not admit a closed-form solution, and a numerical method must be employed to approximate the MLEs of $(a, m, y_m)$. However, suppose

PAGE 67

we only have observations $t_1, t_2, \ldots, t_n$ from one system, which means we only have a random sample of size one, $(\lambda, \beta)$, from the prior $\pi(\lambda, \beta \mid a, m, y_m)$; our inference shall then be regarded as a Bayesian maximum likelihood approach. Hence, for this special case, the posterior distribution of $(\lambda, \beta)$ is
\[ \pi(\lambda, \beta \mid \tilde{t}, n, \hat{m}, \hat{a}, \hat{y}_m) = \frac{c^{-1} \lambda^{n + \hat{m} - 1} \beta^{n + \hat{m} - 1} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \exp\!\left[ -\lambda \left( \hat{y}_m^{\beta} + t_n^{\beta} \right) \right]}{m(\tilde{t} \mid \hat{m}, \hat{a}, \hat{y}_m)} . \tag{4.18} \]
The posterior mean for $\lambda$ is
\[ E(\lambda) = \int\!\!\int \lambda\, \pi(\lambda, \beta \mid \tilde{t}, n, \hat{m}, \hat{a}, \hat{y}_m)\, d\lambda\, d\beta = \frac{c^{-1}\, \Gamma(n + \hat{m} + 1) \int \beta^{n + \hat{m} - 1} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \left( \hat{y}_m^{\beta} + t_n^{\beta} \right)^{-(n + \hat{m} + 1)} d\beta}{m(\tilde{t} \mid \hat{m}, \hat{a}, \hat{y}_m)} \]
\[ = \frac{\Gamma(n + \hat{m} + 1) \int \beta^{n + \hat{m} - 1} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \left( \hat{y}_m^{\beta} + t_n^{\beta} \right)^{-(n + \hat{m} + 1)} d\beta}{\Gamma(n + \hat{m}) \int \beta^{n + \hat{m} - 1} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \left( \hat{y}_m^{\beta} + t_n^{\beta} \right)^{-(n + \hat{m})} d\beta} . \tag{4.19} \]
The posterior mean for $\beta$ is
\[ E(\beta) = \int\!\!\int \beta\, \pi(\lambda, \beta \mid \tilde{t}, n, \hat{m}, \hat{a}, \hat{y}_m)\, d\lambda\, d\beta = \frac{c^{-1}\, \Gamma(n + \hat{m}) \int \beta^{n + \hat{m}} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \left( \hat{y}_m^{\beta} + t_n^{\beta} \right)^{-(n + \hat{m})} d\beta}{m(\tilde{t} \mid \hat{m}, \hat{a}, \hat{y}_m)} \]
\[ = \frac{\int \beta^{n + \hat{m}} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \left( \hat{y}_m^{\beta} + t_n^{\beta} \right)^{-(n + \hat{m})} d\beta}{\int \beta^{n + \hat{m} - 1} \left[ \exp(-\hat{a})\, \hat{y}_m^{\hat{m}} \prod_{i=1}^{n} t_i \right]^{\beta - 1} \left( \hat{y}_m^{\beta} + t_n^{\beta} \right)^{-(n + \hat{m})} d\beta} . \tag{4.20} \]
It should be noted that the problem with parametric Empirical Bayes (PEB) is that we treat the estimates of the prior parameters as if they were the prior parameters themselves. The PEB approach does not account for uncertainty in the estimates of these hyperparameters. Variation in these estimates would lead to more variation in the estimates of functions of the parameters, such as the intensity and the reliability.

4.1.3 Prior Plots

The joint prior density is given by (4.14). Integrating with respect to $\lambda$ gives
\[ \pi_0(\beta \mid m, a, y_m) = \int c^{-1} \lambda^{m-1} \beta^{m-1} \left[ \exp(-a)\, y_m^{m} \right]^{\beta - 1} \exp\!\left( -\lambda y_m^{\beta} \right) d\lambda = \frac{a^{m} \beta^{m-1} \exp(-a\beta)}{\Gamma(m)} . \tag{4.21} \]

PAGE 68

The marginal prior distribution of $\beta$ is thus a Gamma distribution with mean $m/a$ and variance $m/a^2$. The conditional prior distribution of $\lambda$ given $\beta$ is
\[ \pi_0(\lambda \mid \beta) = \frac{\pi_0(\lambda, \beta)}{\pi_0(\beta)} = \frac{c^{-1} \lambda^{m-1} \beta^{m-1} \left[ \exp(-a)\, y_m^{m} \right]^{\beta - 1} \exp\!\left( -\lambda y_m^{\beta} \right)}{a^{m} \beta^{m-1} \exp(-a\beta) / \Gamma(m)} = \frac{y_m^{m\beta}\, \lambda^{m-1} \exp\!\left( -\lambda y_m^{\beta} \right)}{\Gamma(m)}, \]
which is a Gamma distribution with mean $m / y_m^{\beta}$ and variance $m / y_m^{2\beta}$.

Figure 4.3 to Figure 4.10 are the prior density plots for different values of the hyperparameters.

1. Figure 4.3 and Figure 4.4 give the contour and three-dimensional graphs for $m = 2$, $a = 2$, $y_m = 2$. Only the region with x-axis in [0, 4] and y-axis in [0, 4] is drawn.
2. If only $m$ increases and the other parameters are fixed, all the means and variances increase. The graph moves away from the x-axis and the y-axis and is more spread out, as shown in Figure 4.5 and Figure 4.6 for $a = 2$, $m = 6$, $y_m = 2$.
3. If only $y_m$ increases and the other parameters are fixed, the mean $m / y_m^{\beta}$ and the variance $m / y_m^{2\beta}$ decrease. The graph moves closer to the y-axis and is more concentrated, as shown in Figure 4.7 and Figure 4.8 for $m = 2$, $a = 2$, $y_m = 6$.
4. If only $a$ increases and the other parameters are fixed, the mean $m/a$ and the variance $m/a^2$ decrease. The graph moves closer to the x-axis and is more concentrated, as shown in Figure 4.9 and Figure 4.10 for $a = 6$, $m = 2$, $y_m = 2$.

4.2 Nonparametric Prior on the Power Law Process

We now assume that $\beta_j$, $j = 1, 2, \ldots, m$, from $m$ systems are drawn from a prior distribution $\pi_0(\beta)$. The goal of density estimation is to approximate the probability density function $\pi_0(\beta)$. Assume we have $m$ independent, identically distributed observations $\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_m$, which are obtained from the $m$ systems using the classical MLEs of the previous chapter.
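Returning to the natural conjugate prior of Section 4.1.3: its factorization into a Gamma marginal for $\beta$ and a Gamma conditional for $\lambda$ given $\beta$ gives a convenient way to evaluate the joint density on a grid, for example when drawing contour plots. The Python sketch below is only an illustration under that factorization; the function names are hypothetical, and the evaluation point is arbitrary.

```python
import math


def gamma_pdf(x, shape, rate):
    """Gamma density with the given shape and rate, evaluated at x > 0."""
    return math.exp(shape * math.log(rate) + (shape - 1.0) * math.log(x)
                    - rate * x - math.lgamma(shape))


def joint_prior(lam, beta, m, a, ym):
    """pi_0(lambda, beta | m, a, y_m) as Gamma(beta; m, a) * Gamma(lambda; m, y_m**beta)."""
    return gamma_pdf(beta, m, a) * gamma_pdf(lam, m, ym ** beta)


def joint_prior_direct(lam, beta, m, a, ym):
    """The same density written as in (4.14), with the constant c from (4.15)."""
    kernel = (lam ** (m - 1) * beta ** (m - 1)
              * (math.exp(-a) * ym ** m) ** (beta - 1)
              * math.exp(-lam * ym ** beta))
    c = math.gamma(m) ** 2 * a ** (-m) / (math.exp(-a) * ym ** m)
    return kernel / c


# Hyperparameter settings (m, a, y_m) matching the prior plots.
for m, a, ym in [(2, 2, 2), (6, 2, 2), (2, 2, 6), (2, 6, 2)]:
    # Mean and variance of the marginal Gamma prior of beta: m/a and m/a^2.
    print(m, a, ym, m / a, m / a ** 2, joint_prior(0.5, 1.2, m, a, ym))
```

Both functions should return the same value at any point with $\lambda > 0$ and $\beta > 0$, which is a quick consistency check on (4.14), (4.15) and (4.21).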

PAGE 69

Figure 4.3: Prior Contour Plot ($a = 2$, $m = 2$, $y_m = 2$)
Figure 4.4: Prior Three-Dimension Plot ($a = 2$, $m = 2$, $y_m = 2$)

PAGE 70

Figure 4.5: Prior Contour Plot ($a = 2$, $m = 6$, $y_m = 2$)
Figure 4.6: Prior Three-Dimension Plot ($a = 2$, $m = 6$, $y_m = 2$)

PAGE 71

Figure 4.7: Prior Contour Plot ($a = 2$, $m = 2$, $y_m = 6$)
Figure 4.8: Prior Three-Dimension Plot ($a = 2$, $m = 2$, $y_m = 6$)

PAGE 72

Figure 4.9: Prior Contour Plot ($a = 6$, $m = 2$, $y_m = 2$)
Figure 4.10: Prior Three-Dimension Plot ($a = 6$, $m = 2$, $y_m = 2$)

PAGE 73

The kernel density estimator $\hat{\pi}_0(\beta)$ for the estimation of the density value $\pi_0(\beta)$ is defined as
\[ \hat{\pi}_0(\beta) = \frac{1}{m h} \sum_{j=1}^{m} K\!\left( \frac{\hat{\beta}_j - \beta}{h} \right), \]
where $K$ denotes the kernel function and $h$ denotes the bandwidth, or smoothing parameter. The bandwidth controls the amount of smoothing. If $h$ is large, there is a lot of smoothing, and if $h$ is small there is less smoothing. $\hat{\pi}_0(\beta)$ is a nonparametric probability density estimate of $\pi_0(\beta)$. A number of possible kernel functions are listed in the following table:

Table 4.1: Commonly Used Kernel Functions

Kernel          K(u)
Uniform         $\frac{1}{2}\, I(|u| \le 1)$
Triangle        $(1 - |u|)\, I(|u| \le 1)$
Epanechnikov    $\frac{3}{4} (1 - u^2)\, I(|u| \le 1)$
Quartic         $\frac{15}{16} (1 - u^2)^2\, I(|u| \le 1)$
Triweight       $\frac{35}{32} (1 - u^2)^3\, I(|u| \le 1)$
Gaussian        $\frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{1}{2} u^2 \right)$
Cosinus         $\frac{\pi}{4} \cos\!\left( \frac{\pi}{2} u \right) I(|u| \le 1)$

A kernel function has the following properties: (i) $K(u) = K(-u)$, (ii) $\int_{-\infty}^{\infty} K(u)\, du = 1$, (iii) $\int_{-\infty}^{\infty} u K(u)\, du = 0$ and (iv) $\int_{-\infty}^{\infty} u^2 K(u)\, du \ne 0$.

To obtain the prior density $\hat{\pi}_0(\beta)$, we need to select a kernel function $K$ and a bandwidth $h$. In our work, we employ the most commonly used kernel function, the Gaussian kernel. It is differentiable everywhere and is given by
\[ K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} u^2} . \]
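The kernel estimator above is straightforward to evaluate once a bandwidth is chosen. The following Python sketch is a minimal illustration with the Gaussian kernel; the function names and the sample of estimates are hypothetical, and the bandwidth uses the common normal-reference rule $h = 1.06\, s\, m^{-1/5}$, which is an assumption here and not necessarily the bandwidth selector used in this work.

```python
import math
from statistics import stdev


def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, estimates, h):
    """Kernel density estimate at x: (1 / (m h)) * sum_j K((estimate_j - x) / h)."""
    m = len(estimates)
    return sum(gaussian_kernel((b - x) / h) for b in estimates) / (m * h)


def normal_reference_bandwidth(estimates):
    """Rule-of-thumb bandwidth h = 1.06 * s * m^(-1/5) (assumed selector)."""
    return 1.06 * stdev(estimates) * len(estimates) ** (-0.2)


# Hypothetical parameter estimates from five systems.
estimates = [0.40, 0.50, 0.55, 0.70, 0.90]
h = normal_reference_bandwidth(estimates)
for x in (0.3, 0.6, 0.9):
    print(x, kernel_density(x, estimates, h))
```

Because the Gaussian kernel integrates to one, the estimate is itself a proper density for any $h > 0$; a larger `h` simply spreads each observation's contribution more widely.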

PAGE 74

To determine an optimal kernel density and bandwidth, we need to minimize the mean integrated squared error (MISE), which is defined as
\[ \mathrm{MISE}(K, h) = E \int_{-\infty}^{\infty} \left( \hat{\pi}_0(x) - \pi_0(x) \right)^2 dx . \]
For a fixed Gaussian kernel, an estimate of the bandwidth $h$ can be obtained by Silverman's method for unimodal densities (see details in chapter 5 of [51]). To estimate the nonparametric density for $\lambda$, we apply the same procedure as above. Hence, the nonparametric kernel prior density for $\beta$ is
\[ \hat{\pi}_P(\beta) = \frac{1}{m \hat{h}_1} \sum_{i=1}^{m} \frac{1}{\sqrt{2\pi}} \exp\!\left\{ -\frac{1}{2} \left( \frac{\beta - \hat{\beta}_i}{\hat{h}_1} \right)^2 \right\} \]
and the nonparametric kernel prior density for $\lambda$ is
\[ \hat{\pi}_P(\lambda) = \frac{1}{m \hat{h}_2} \sum_{i=1}^{m} \frac{1}{\sqrt{2\pi}} \exp\!\left\{ -\frac{1}{2} \left( \frac{\lambda - \hat{\lambda}_i}{\hat{h}_2} \right)^2 \right\}. \]
This is the simplest way of employing nonparametric Empirical Bayes on the Power Law Process. Further work can be done to obtain more nearly optimal nonparametric priors.

4.3 Further Research

It is a fundamental basis of Bayesian decision theory that statistical inference should start with the determination of three factors: the distribution family for the observations, the prior distribution for the parameters, and the loss associated with decisions. Further work can be done to check the robustness of the priors we used, such as noninformative priors, natural conjugate EB priors and nonparametric Empirical Bayes priors. We can slightly change the prior and see what happens to the decision. Two commonly used measures are the range of the posterior decision and a comparison of Bayes risk criteria.

4.4 Summary

We worked on Empirical Bayes (EB) analysis on the Power Law Process by employing parametric EB priors and nonparametric EB priors. For the parametric EB

PAGE 75

priors, we apply a two-hyperparameter natural conjugate prior and a more generalized three-hyperparameter natural conjugate prior. Those priors were mentioned in Huang and Bier (1998) [33]. Here we derive an Empirical Bayes procedure to estimate the natural conjugate priors. We employed past experience to estimate the priors through data. We considered a special case in which we have only one system. In that case, the analysis becomes a Bayesian maximum likelihood approach. Hence, we also showed some results from the Bayesian perspective in this chapter. For the nonparametric EB priors, we have $k$ estimates of a parameter from $k$ systems, and we then construct a nonparametric prior with a normal kernel function and an optimal bandwidth.

PAGE 76

Chapter 5
Microarray Analysis: Normalization and Transformation of Probe-Level Data

This chapter provides an overview of microarrays from a statistical perspective. We give a description of the target data sets. We also summarize and compare several methods of transformation and normalization for probe-level data.

5.1 Overview of Microarray Analysis

DNA microarray technology is a tool for studying how large numbers of genes interact with each other, and it enables the simultaneous analysis of thousands of DNA sequences for genetic and genomic research. Microarray technology has been developing rapidly over the last several years.

Statistical and data-analytic techniques are involved in all stages of microarray experimentation and analysis. A task map for microarray data analysis is given in Figure 5.1. This chapter covers the application of several existing methods in probe-level analysis of oligonucleotide arrays. In the next chapter, which addresses a more important issue, several classical and Bayesian statistical methods are applied to analyze differentially expressed genes in expression-level data produced by Dr. Haura's laboratory.

Selecting differentially expressed genes across different conditions is the first level of gene expression analysis. The second level considers aspects such as common functionalities, interactions and co-regulation; therefore clustering is an important issue. The third level aims to find the underlying regulatory regions and gene networks that

PAGE 77

ultimately are responsible for the observed patterns.

5.2 Data Description

We obtained two sets of data from the experiments in Dr. Haura's laboratory. The first set consists of DNA probe-level data, which is the base data set obtained by scanning hybridized cDNA. The second set is gene expression-level data. Gene expression-level data are computed from probe-level data. Various methods have been developed for computing the gene expression index (Irizarry et al. 2003; Lemon et al. 2002; Holder et al. 2001; Naef et al. 2001; Zhou and Abagyan 2002; Affymetrix Inc. 2001b; Zhang et al. 2002). In the following section we give a brief comparison of four main methods and the corresponding software.

Here is an introduction to probe-level data, the raw data from which gene expressions are computed. The first type of probe is referred to as a perfect match (PM). Each PM probe is paired with a mismatch (MM) probe. These two probes are referred to as a probe pair. Each gene expression is represented by 11-20 probe pairs, as shown in Figure 5.2, and usually by a value representing the average difference between PM and MM. The purpose of the MM probe design is to measure non-specific binding and background noise. After scanning the arrays hybridized to labeled RNA samples, intensity values $PM_{ij}$ and $MM_{ij}$ are recorded for arrays $i = 1, \ldots, I$ and probe pairs $j = 1, \ldots, J$ for any given probe set.

In our data set, the expression-level data contain 22215 genes, so the data set consists of 22215 rows. We perform treatments on five samples and use another five samples as controls, which yields a data set with 10 columns. The five control units are denoted by GFP (GFP protein) and the five experimental units are denoted by STAT3 (Stat3: a member of the family of signal transducers and activators of transcription).

PAGE 78

Figure 5.1: Data-analytic Tasks in Microarray Experimentation

PAGE 79

Figure 5.2: Signal Extraction in Probe Level Data

A scatter plot of experiment gene expression (STAT3) against control gene expression (GFP) is shown in Figure 5.3. It gives a brief idea of the values of gene expression. Each dot represents a gene. The regression line is supposed to be $y = x$, based on the assumption that up-regulated genes and down-regulated genes with similar average intensity roughly cancel out, or otherwise that most genes remain unchanged. This assumption is usually true in large genome studies (Dudoit et al. 2002 [25]).

5.3 Probe-Level Analysis of Oligonucleotide Arrays

An important step in microarray analysis is the normalization of raw data. For Affymetrix gene chips, summarizing 11 to 20 probe pairs into one measure of expression is an essential step. Normalization aims to account for systematic technical differences in the measurement process and to control for many experimental sources of variability. Consistent measurement differences between chips are due to image analysis (identifying and quantifying each spot on the array), different amounts of RNA, hybridization conditions

PAGE 80

Figure 5.3: STAT3 vs. GFP Mean Value: STAT3 = 0.98745 GFP

(temperature, time, mixing, etc.) and scanner settings (laser and detector, chemistry of the fluorescent label, etc.).

Different approaches, all relying on some specific model assumptions, have been proposed. Two key elements should be specified in each approach: the mathematical model and the normalization method. We will discuss a standard method used in the Affymetrix Microarray Suite 5.0 software (MAS 5.0) and compare it with three other preprocessing algorithms: the robust multichip analysis (RMA), the model-based expression index (MBEI), and a variance stabilization method (VSN). RMA and MBEI (dChip) are available within the Bioconductor project.

5.3.1 MAS 5.0 – Microarray Analysis Suite Version 5.0

Microarray Analysis Suite Version 5.0 (MAS 5.0) is produced by Affymetrix Inc. (2002). The H. Lee Moffitt Cancer Center applies this package to extract signals from scanned images. Here is an introduction to this approach. The mathematical model

PAGE 81

is
\[ \text{Signal} = \text{TukeyBiweight}\{ \log_2(PM_j - MM_j^{*}) \} \]
for probe pair $j$. $MM_j^{*}$ is an adjusted $MM_j$ that is never bigger than $PM_j$. Tukey Biweight is a robust weighted averaging procedure. The median is calculated to identify the center of the data. The distance of each data point from the median is calculated. The distance determines how each value is weighted in the average, i.e. outliers far from the median contribute little to the average.

MAS 5.0 offers only a global normalization procedure for the summarized probe sets. It adjusts the trimmed mean signal to a specified target signal value, in some cases 500. Expression measures for each probe set are calculated with Tukey Biweight before normalization. MAS 5.0 assigns an expression call to each probe set and also offers the possibility of performing data scaling.

MAS 5.0 has two drawbacks. An average over different probes is not really meaningful, since probes have intrinsically different hybridization characteristics. In addition, the MAS 5.0 method does not learn from the cross-chip performance of individual probes.

5.3.2 MBEI – Model-Based Expression Index (dChip 2001)

MBEI accounts for individual probe-specific effects and provides automatic detection and handling of outliers and image artifacts. dChip is a software package produced by Li and Wong (2001) that implements the Model-Based Expression Index method. This model is based on the observation that the variation of a specific probe across multiple arrays can be considerably smaller than the variance across probes in a probe set, and that some probes are outliers. To take this into account, the following multiplicative model was proposed:
\[ PM_{ij} - MM_{ij} = \theta_i \phi_j + e_{ij}, \]

PAGE 82

which indicates a strong probe affinity effect $\phi_j$. The distribution of the error $e_{ij}$ is assumed to be independent of signal strength. Furthermore, dChip allows assessment of a standard error (SE) for each probe set intensity measure, which is an indicator of the hybridization quality of the probe set. Standard errors of $\phi_j$ (probe pair $j$) are useful for discarding probe sets with low hybridization quality. Standard errors of $\theta_i$ (array $i$) are used to identify array outliers.

Normalization methods in MBEI are invariant set normalization, quantile normalization and cyclic Lowess. For invariant set normalization, a set of non-differentially expressed genes is selected by the invariant ranks of the probe intensities. Those genes can be regarded as baselines. Invariant set normalization produces a better fit of the replicates than the MAS 5.0 scaling procedure. The Lowess technique fits a nonlinear curve to the scatter plots of the probe pair differences of a baseline array against all the other arrays and then forces the curve onto the line $y = x$.

5.3.3 RMA – Robust Multichip Analysis

A log-scale linear additive model, Robust Multichip Analysis (RMA), was proposed by Bolstad, Irizarry and Speed (2002). This method analyzes data for a set of chips using only PM and ignoring MM. The mathematical model is
\[ \log(PM_{ij}) = \log(\theta_i) + \log(\phi_j) . \]
A robust linear fitting procedure, such as median polish, is used to estimate the log-scale expression values $\theta_i$. In practice, $\log(PM_{ij} - BG) = \log(a_i) + \log(b_j) + \log(e_{ij})$. The signal $\log(PM_{ij})$ represents the transformation that background-corrects and normalizes. Thus the background value is important here. Recent results suggest that subtracting MM as a way of correcting for non-specific binding is not always appropriate. An unadjusted MM value may add more noise.
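The median polish step mentioned for RMA can be illustrated with a short Python sketch. This is a generic median polish on a small table, not the Bioconductor implementation; the function name, the layout (rows are probes, columns are arrays) and the toy log-intensities are assumptions made for illustration.

```python
from statistics import median


def median_polish(table, n_iter=10):
    """Fit table[j][i] = overall + row_effect[j] + col_effect[i] + residual[j][i]
    by alternately sweeping out row and column medians."""
    rows, cols = len(table), len(table[0])
    resid = [list(map(float, row)) for row in table]
    overall = 0.0
    row_eff = [0.0] * rows
    col_eff = [0.0] * cols
    for _ in range(n_iter):
        # Sweep the median out of each row (probe).
        for j in range(rows):
            med = median(resid[j])
            row_eff[j] += med
            resid[j] = [v - med for v in resid[j]]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep the median out of each column (array).
        for i in range(cols):
            med = median(resid[j][i] for j in range(rows))
            col_eff[i] += med
            for j in range(rows):
                resid[j][i] -= med
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
    return overall, row_eff, col_eff, resid


# Toy log2 PM values: three probes (rows) on three arrays (columns).
probe_effects = [0.0, 1.0, -1.0]
array_levels = [8.0, 9.0, 10.0]
table = [[p + a for a in array_levels] for p in probe_effects]
overall, row_eff, col_eff, resid = median_polish(table)
# Log-scale expression summary for each array under this layout.
print([overall + c for c in col_eff])
```

Under this layout the per-array expression summary is the overall level plus the column effect, while the row effects play the role of the probe affinities.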

PAGE 83

Normalization methods include quantile normalization and curve-fitting normalization. The quantile method tries to give the same distribution of probe intensities to each array in a set of arrays. The method is based on the idea that a quantile-quantile plot shows that the distributions of two data vectors are the same if the plot is a straight diagonal line. The idea can be extended to $n$ dimensions.

5.3.4 VSN – Variance Stabilization of Network

VSN is a normalization procedure produced by Huber et al. (2002) [34] and also a method to preprocess DNA microarray expression data. In probe-level data analysis, VSN uses the same mathematical model as RMA except for the normalization, which is performed by variance-stabilizing transformations. As the name states, a variance stabilization transformation removes the dependence of the variance on the total intensity. This gives genes with higher intensities the same chance of being ranked high as genes with lower intensity.

5.3.5 Comparison

To compare the probe-level transformation and normalization methods, several criteria should be considered, such as precision, consistency, specificity, sensitivity and accuracy. Precision means the reproducibility of a measurement, as estimated by the standard error across replicate chips. Specificity means the proportion of the signal that originates from the intended transcript (i.e. cross hybridization). Sensitivity gives the lowest transcript concentration for an acceptable accuracy. Accuracy measures the distance of the measurement from the true value.

Li and Wong demonstrated that the multiplicative model is a more sensible model for analyzing data from high-density oligonucleotide array experiments compared to MAS

PAGE 84

5.0. MBEI (dChip) is also more suitable for further analysis than MAS 5.0, whose estimation does a reasonable job on probe sets that are bright. dChip and RMA do a better job on genes that are less abundant.

According to Irizarry et al., RMA has better precision than MAS 5.0 and dChip, based on a higher squared correlation coefficient, especially for low expression levels. Concerning the number of true positives identified using spike-in experiments, RMA performs slightly better than dChip, but much better than MAS 5.0. On the basis of published data, RMA also shows better sensitivity and specificity than dChip and MAS 5.0.

The advantage of RMA and VSN is two-fold: first, we are able to detect more of the spike-in genes while getting fewer false positives; secondly, the resulting data are easier to analyze. The strong intensity dependency of MAS 5.0 data has disappeared. RMA and VSN perform similarly on some data sets. However, quantile normalization in RMA is faster, and the log transformation in RMA is more interpretable than the arcsine transformation in VSN.

At the moment RMA appears to be the best method available. However, it is also necessary to check the model assumptions for any given data.

5.4 Summary

Microarray analysis is a fairly new research area that has developed in the past few years. Here we first outlined how statistical techniques are involved in all stages of microarray analysis. Then we described the real data set that we are analyzing. We summarized and compared several methods for the transformation and normalization of probe-level data. This is treated as low-level analysis in microarray studies, and the results are usually obtained with existing software packages. Following this step, we have gene expression data, which will be analyzed in the next chapter.
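The quantile normalization described in Section 5.3.3, which gives every array the same distribution of probe intensities, can be sketched in a few lines of Python. This is a simplified illustration rather than the RMA implementation: the function name and the toy intensities are hypothetical, and ties within an array are broken by position, which is an assumption.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    `arrays` is a list of equal-length intensity lists, one per array.
    Each value is replaced by the mean, across arrays, of the values
    that share its rank.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(arr) for arr in arrays]
    # Reference distribution: mean of the r-th smallest value over all arrays.
    reference = [sum(sa[r] for sa in sorted_arrays) / len(arrays)
                 for r in range(n)]
    normalized = []
    for arr in arrays:
        order = sorted(range(n), key=lambda idx: arr[idx])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Two toy arrays with three probes each.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization, sorting any two arrays gives identical vectors, so a quantile-quantile plot of one array against another lies exactly on the diagonal.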

PAGE 85

Chapter 6
Statistical Methods of Selecting Differentially Expressed Genes

In this chapter, the objective is to find differentially expressed genes among 22,215 genes. The data are non-paired gene expressions from five control (GFP protein) and five experimental (STAT3) samples. Both classical and Bayesian methods are applied. Classical methods include fold change, the T-test, the Wilcoxon rank-sum test, the local Z-score (Chen, Z. [17]) and SAM (Storey 2002). Empirical Bayes methods include EBarrays (Newton, 2002 [42]), LIMMA (Smyth G.K. 2003 [55]) and Cyber-T (Baldi and Long 2001 [5]). We mainly discuss two classical methods, SAM (Storey 2002) and the local Z-score (Z. Chen [17]), and one parametric Empirical Bayes method (Newton, 2002). Several partial lists of differentially expressed genes are shown. The classical methods aim to control the false discovery rate, while the Empirical Bayes method (EBarrays) aims to classify genes by expression patterns using posterior probability.

6.1 Select Differentially Regulated Genes Using Classical Statistics Methods

6.1.1 Fold Change Method

The fold change method is the simplest and most intuitive approach. However, the fold threshold is chosen arbitrarily, so we may get too many or too few genes. Usually the variance of gene expression data is large at low intensity and small at high intensity. Figure 6.1 is a

PAGE 86

Figure 6.1: Lowess Regression for Five Experiments

scatter plot of log-ratio against log-intensity, where the vertical axis represents
\[ \text{Log ratio} = \log_2(\mathrm{STAT3}/\mathrm{GFP}) \]
and the horizontal axis represents
\[ \text{Log intensity} = \log_{10}(\mathrm{STAT3} \times \mathrm{GFP}). \]
It has a funnel shape. With the fold change method, this leads to many false positives at the low-intensity end and to missed true positives at the high-intensity end. To improve the sensitivity, the local Z-score will be illustrated later (Z. Chen [17]).

In microarray analysis, we have a large set of genes. Before the gene profiles of RNA samples can be analyzed and interpreted, the GFP and STAT3 intensities must be normalized relative to one another so that the STAT3/GFP ratio provides an unbiased representation of relative expression. Per-chip normalization is essentially a type of scaling to adjust the total or average intensity of each array. Per-gene normalization compares the results for a single gene across all the samples.
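The two axes of the scatter plot and a simple fold change call can be computed as in the following Python sketch. The function names and the intensities are hypothetical, and the two-fold threshold is only an example of the arbitrary choice discussed above.

```python
import math


def log_ratio(stat3, gfp):
    """Vertical axis: log2(STAT3 / GFP)."""
    return math.log2(stat3 / gfp)


def log_intensity(stat3, gfp):
    """Horizontal axis: log10(STAT3 * GFP)."""
    return math.log10(stat3 * gfp)


def fold_change_calls(stat3_values, gfp_values, fold=2.0):
    """Flag genes whose absolute log2 ratio reaches log2(fold)."""
    threshold = math.log2(fold)
    return [abs(log_ratio(s, g)) >= threshold
            for s, g in zip(stat3_values, gfp_values)]


# Hypothetical mean intensities for three genes.
stat3 = [400.0, 100.0, 90.0]
gfp = [100.0, 100.0, 200.0]
for s, g in zip(stat3, gfp):
    print(log_intensity(s, g), log_ratio(s, g))
print(fold_change_calls(stat3, gfp))
```

Because the same threshold is applied at every intensity, the call ignores the larger variance of low-intensity genes, which is the weakness of the fold change method noted above.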

PAGE 87

Since most genes will not change, Figure 6.1 is supposed to be centered at zero. Based on this assumption, normalization is necessary to balance the expression intensities so that meaningful biological comparisons can be made. The following are two normalization methods for gene expression data from a single array hybridization, reviewed by John Quackenbush (2002) [46]. Note that the log-ratio for each gene is denoted by $T_i$,
\[ T_i = \log_2(\mathrm{STAT3}_i / \mathrm{GFP}_i), \]
and the log-intensity of each gene is denoted by $A_i$,
\[ A_i = \log_{10}(\mathrm{GFP}_i \times \mathrm{STAT3}_i). \]

Total intensity normalization

Let $T_i' = T_i - \log_2(N_{total})$, where $N_{total} = \sum_i \mathrm{STAT3}_i / \sum_i \mathrm{GFP}_i$. Thus, the mean of $T_i'$ is equal to zero. This is equivalent to subtracting a constant from the logarithm of the expression ratio.

Normalization using regression techniques

A basic normalization method is print-tip LOWESS normalization. LOWESS stands for LOcally WEighted polynomial regreSSion (Dudoit et al. 2002). We set $y_i = \log_2(\mathrm{STAT3}_i / \mathrm{GFP}_i)$ and $x_i = \log_{10}(\mathrm{STAT3}_i \times \mathrm{GFP}_i)$ for each gene. Then we fit a regression such that $y_i = m x_i + b$ and obtain an estimate $\hat{y}(x_i)$. We then use this estimate to plot the scatter graph of $T_i' = T_i - \hat{y}(x_i)$ on $A_i$, where $T_i$ and $A_i$ are defined as above. $T_i$ is thereby centered at zero by the regression line.

From Figure 6.1, it is obvious that normalization is not a key issue for our data, since the log-ratios in our data are already centered at zero. The slope of the regression line is approximately zero, so there is only a slight change after normalization.

6.1.2 T-test

The T-test, a univariate statistical test, is used to select differentially expressed genes. In a T-test, the empirical means $m_c$ and $m_t$ and variances $s_c^2$ and $s_t^2$ are used to calculate

PAGE 88

Figure 6.2: Histogram of T-statistics

the normalized distance between two populations in the form
\[ t = \frac{m_c - m_t}{\sqrt{\dfrac{s_c^2}{n_c} + \dfrac{s_t^2}{n_t}}} . \tag{6.1} \]
Here, for each population, $m = \sum x_i / n$ and $s^2 = \sum (x_i - m)^2 / (n - 1)$ are the estimates of the mean and the variance. It is well known that $t$ follows approximately a Student distribution, with
\[ f = \frac{\left[ s_c^2 / n_c + s_t^2 / n_t \right]^2}{\dfrac{(s_c^2 / n_c)^2}{n_c - 1} + \dfrac{(s_t^2 / n_t)^2}{n_t - 1}} \tag{6.2} \]
degrees of freedom. When $t$ exceeds a certain threshold depending on the confidence level selected, the two populations are considered to be different. Because in the t-test the distance between the population means is normalized by the empirical standard deviations, it has the potential to address some of the shortcomings of the simple fixed fold-threshold approach. The fundamental problem with the t-test for array data is that the numbers of replicates $n_c$ and/or $n_t$ are often small, since experiments remain costly or tedious to repeat. Small populations of size $n = 1$, 2, or 3 are still very common and lead to poor estimates of the variance. Thus a better framework is needed to address

PAGE 89

Figure6.3:HistogramofLog-expressionWilcoxonStatitsicstheseshortcomings.LaterwewilldescribeseveralBayesianprobabilisticframeworksforarraydata,whichcaneectivelyaddresstheproblemofpoorreplicates.Figure6.2givesahistogramof22215genes'T-statistics.Fromthegraph,wecanconcludethatmostgeneexpressiondonotchangesinceT-statisticsisconcentratedatthecenter.6.1.3WilcoxonRank-SumTestWilcoxonRank-SumtestisanonparametrictestfortwoindependentsamplesandisequivalenttoMann-WhitneyUtest.ThestatisticTjisconcernedaboutadierenceinmeans,whereWilcoxonRank-Sumstatisticismoreconcernedaboutgeneraldistribution.Forexample,ifallveexperimentsexpressionvalueisslightlygreaterthanvecontrolsexpressionvaluebychance,itwillgiveasignicantstatisticvaluewhenthetruthisnot.ThusT-statisticsismorepowerful.Figure6.3givesahistogramofWilcoxonstatisticsforallthegenes.76
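
As an illustration of Sections 6.1.2 and 6.1.3, the following Python sketch computes the statistic of (6.1), the degrees of freedom of (6.2), and a rank-sum statistic for a single gene. The function names `welch_t` and `rank_sum` are hypothetical and chosen only for this example; the rank-sum is taken here as the sum of the treatment ranks in the pooled sample, with tied values sharing their average rank, which is one common convention and an assumption about the form used in this chapter.

```python
import math
from statistics import mean, variance


def welch_t(control, treatment):
    """Return (t, f): the statistic of (6.1) and the degrees of freedom of (6.2)."""
    n_c, n_t = len(control), len(treatment)
    m_c, m_t = mean(control), mean(treatment)
    # statistics.variance divides by n - 1, matching s^2 in the text.
    v_c = variance(control) / n_c
    v_t = variance(treatment) / n_t
    t = (m_c - m_t) / math.sqrt(v_c + v_t)
    f = (v_c + v_t) ** 2 / (v_c ** 2 / (n_c - 1) + v_t ** 2 / (n_t - 1))
    return t, f


def rank_sum(control, treatment):
    """Sum of the ranks of the treatment values in the pooled sample."""
    pooled = sorted(control + treatment)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i + 1 .. j and share their average.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(rank_of[x] for x in treatment)
```

With three replicates per group and equal sample variances, (6.2) reduces to $2(n - 1) = 4$ degrees of freedom, which gives a quick sanity check on the sketch.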

6.1.4 SAM

SAM stands for Significance Analysis of Microarrays and is a well-performing method for identifying differentially expressed genes in DNA microarrays (Newton, 2002) [42]. In this section, we give a brief account of how SAM works. The result table and plot for our data will also be interpreted.

Statistically speaking, identifying differentially expressed genes is a multiple hypothesis testing problem, in which all genes are tested simultaneously to decide which genes are differentially expressed. There are four key steps in SAM, as follows:

STEP 1. A statistic is formed for each gene $j$ by

$$d_j = \frac{\bar{x}_{j2} - \bar{x}_{j1}}{s_j + s_0},$$

where $\bar{x}_{j1}$ and $\bar{x}_{j2}$ are the average gene expression for gene $j$ under control and experiment respectively, and $s_j$ is the pooled standard deviation for gene $j$. A small positive constant $s_0$ is added to the denominator in order to ensure that the variance of $d_j$ is independent of gene expression. The value of $s_0$ is chosen to minimize the coefficient of variation of $d_j$, which is computed as a function of $s_j$ in moving windows across the data.

STEP 2. Calculate the null distribution of the statistics. Each gene has a null distribution. The null distribution can most easily be calculated by permuting the group labels. For example, we label our data as (1,1,1,1,1,2,2,2,2,2). To assess the null distribution, we do a random permutation of the sample labels, recompute the statistics and count how many exceed a threshold, say, 2. We repeat this, say, 200 times. We can then find the average number of genes exceeding 2 and use this number to estimate the false discovery rate (FDR) in step 4.

STEP 3. Choose rejection regions and compare the statistics from the observations to the statistics from the null distribution. Let $d_{(j)}$ denote the order statistics of $d_j$, such that $d_{(1)} \le d_{(2)} \le \dots \le d_{(J)}$, and let $\bar{d}_{(j)}$ be the estimate of the expected order statistics from $K$ permutations. $K$ can be chosen by the user, for example, 200 in our data. $\Delta$ is a threshold chosen by the user in the software; for example, we use 0.53245 in our data. We define the rejection regions as:

$$d_{(j)} - \bar{d}_{(j)} \ge \Delta \quad \text{or} \quad d_{(j)} - \bar{d}_{(j)} \le -\Delta.$$

STEP 4. Find the estimate of the False Discovery Rate (FDR). The False Discovery Rate is the rate at which truly unchanged genes are called differentially expressed. The FDR was proposed by Benjamini and Hochberg (1995). An estimate of the FDR (Storey 2002 [56]) is

$$\widehat{FDR} = \frac{R^0}{R}\,\hat{\pi}_0(\lambda'),$$

where $R^0$ is the average number of significant genes from $K$ random permutations of the labels, which can be interpreted as the average number of falsely discovered genes:

$$R^0 = \frac{1}{K} \sum_{k=1}^{K} \#\{ d_j^{(k)} : d_j^{(k)} \ge t_1 \text{ or } d_j^{(k)} \le t_2 \}.$$

Here $d_j^{(k)}$ is the statistic of gene $j$ under the $k$-th permutation, and $t_1$ and $t_2$ are the upper and lower cut-offs of the rejection region. $R$ is the number of significant genes discovered based on the threshold $\Delta$,

$$R = \#\{ d_j : d_j \ge t_1 \text{ or } d_j \le t_2 \},$$

and $\hat{\pi}_0(\lambda')$ is an estimate of the overall proportion of true null hypotheses, which depends on another threshold $\lambda'$:

$$\hat{\pi}_0(\lambda') = \frac{J - R(\lambda')}{J - R^0(\lambda')}.$$

Table 6.1 displays a list of significant genes obtained from VSN normalized gene expression. Figure 6.4 displays a typical result plot from the SAM package. We input our data type as two class, unpaired data. Since we use VSN normalized gene expression, the data are already in log scale. The number of permutations is set to one hundred to calculate the d-statistic under the null hypothesis. The critical value $\Delta$ is adjusted to 2.06105 in order to obtain a reasonable false discovery rate of 0.05747. Score(d) is the value of the statistic from the observations. The numerator of d, the denominator $(s + s_0)$ of d and the fold change are also given. The q-value is the lowest pFDR (positive false discovery rate) at which the gene is called significant. It is similar to a p-value, but is interpreted as the probability that a gene is a false positive given that its statistic is as extreme as or more extreme than the observed statistic.
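
The four steps above can be sketched in Python under some simplifying assumptions. This is not the SAM package itself: the names `d_statistics` and `estimate_fdr` are hypothetical, $s_j$ is computed as the pooled standard error of the difference in means, the rejection region is a symmetric cut-off on $|d_j|$ (as in the example of Step 2) rather than the $\Delta$ rule on order statistics of Step 3, and $\hat{\pi}_0$ is passed in by the caller instead of being estimated.

```python
import math
import random
from statistics import mean


def d_statistics(rows, labels, s0):
    """d_j = (mean of group 2 - mean of group 1) / (s_j + s0) for each gene (row)."""
    out = []
    for row in rows:
        g1 = [x for x, lab in zip(row, labels) if lab == 1]
        g2 = [x for x, lab in zip(row, labels) if lab == 2]
        m1, m2 = mean(g1), mean(g2)
        n1, n2 = len(g1), len(g2)
        ss = sum((x - m1) ** 2 for x in g1) + sum((x - m2) ** 2 for x in g2)
        # Assumption: pooled standard error of the difference in means.
        s_j = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
        out.append((m2 - m1) / (s_j + s0))
    return out


def estimate_fdr(rows, labels, s0, cutoff, n_perm=200, pi0=1.0, seed=0):
    """Return (FDR estimate, R, R0) using a symmetric cut-off on |d_j|."""
    rng = random.Random(seed)
    observed = d_statistics(rows, labels, s0)
    r = sum(abs(d) >= cutoff for d in observed)  # genes called significant
    false_calls = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # random permutation of the group labels
        permuted = d_statistics(rows, shuffled, s0)
        false_calls += sum(abs(d) >= cutoff for d in permuted)
    r0 = false_calls / n_perm  # average number of calls under the null
    fdr = pi0 * r0 / r if r else 0.0
    return fdr, r, r0
```

A call such as `estimate_fdr(rows, [1] * 5 + [2] * 5, s0=0.02, cutoff=2.0)` mirrors the layout of five controls followed by five experiments used in this chapter; the values of `s0` and `cutoff` in that call are illustrative only.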

Figure 6.4: SAM Plot

Significant Genes List

Input Parameters
Imputation Engine                           10-Nearest Neighbor Imputer
Data Type                                   Two class, unpaired data
Data in log scale?                          True
Number of Permutations                      100
Blocked Permutation?                        False
RNG Seed                                    37571352
(Delta, Fold Change)                        (2.06105, )

Computed Quantities
Computed Exchangeability Factor S0          0.024499213
S0 percentile                               0.02
False Significant Number (Median, 90%)      (.59364, 2.37455)
False Discovery Rate (Median, 90%)          (0.05747, 0.22987)
Pi0Hat                                      0.59346

578 Positive Significant Genes

Row     Gene Name      Gene ID          Score(d)   Numerator   Denominator   Fold Change   q-value
8487    208992_s_at    gb:BC000627.1    29.0263    3.5768      0.1232        12.0184       0.0398
4516    204988_at      gb:NM_005141.1   21.2830    1.8281      0.0858        3.5462        0.0398
5178    205650_s_at    gb:NM_021871.1   19.9642    2.1003      0.1052        4.2850        0.0398
21449   222088_s_at    gb:AA778684      17.8435    1.5972      0.0895        3.0238        0.0398
15609   216236_s_at    gb:AL110298.1    17.4420    1.3602      0.0780        2.5724        0.0398
18977   219612_s_at    gb:NM_000509.3   17.3390    2.4297      0.1401        5.3553        0.0398
2289    202760_s_at    gb:NM_007203.1   17.3031    1.5197      0.0878        2.8728        0.0398
5177    205649_s_at    gb:NM_000508.2   16.6140    1.5350      0.0924        2.8874        0.0398
3032    203504_s_at    gb:NM_005502.1   16.5730    1.4455      0.0872        2.7329        0.0398
18520   219155_at      gb:NM_012417.1   16.4220    0.7138      0.0435        1.6401        0.0398

Figure 6.5: Boxplot of Gene Expression for Five Experiments and Five Controls

6.1.5 Local Z-score Method

As mentioned for the fold change method, the scatter plot of log-ratio against log-intensity is funnel-shaped. Gene expression intensities have high variation at low intensity and lower variation at high intensity. This is an extremely serious problem in our data. If we only use a single cut-off threshold, we will get a large number of false positive genes at low intensity and miss differentially expressed genes at high intensity. The local Z-score method balances the weights on the low and high intensities.

The data we are analyzing are gene intensities (MAS 5.0) from five experiments (STAT3) and five controls (GFP), with 10 columns and 22215 rows. The steps in calculating the local Z-score are as follows:

Step 1. Normalization across arrays: each column is divided by its median. Figure 6.5 displays box plots of the gene expression of the ten arrays before normalization. After step 1, the medians will be exactly the same.

Step 2. Get the Combined Experiment Intensity and the Combined Control Intensity: the fifth root of the product of the five experiments (controls).

Step 3. Calculate LOGRATIO and LOGINTENSITY based on the combined experiment intensity and the combined control intensity.

Step 4. Apply Lowess normalization to LOGRATIO (in our case, the ratio is already centered at 0).

Step 5. Graph LOGRATIO versus LOGINTENSITY.

Step 6. Split the data into 4 local parts based on the quartiles of LOGINTENSITY.

Step 7. For each part, calculate the local Z-score:

$$Z\text{-score} = \frac{\text{Logratio}}{\text{local SD}}.$$

We call genes whose z-score is greater than 2 differentially expressed genes. In our result table, we have 159 genes from part I (black), 167 genes from part II (blue), 152 genes from part III (green) and 170 genes from part IV (red).

The main advantage of the local z-score is that it partially solves the problem that the changes of low-intensity genes appear more significant than those of high-intensity genes. The local z-score method gives a good chance of finding significant genes at high intensity. A single cut-off on fold change will give us too many low-intensity genes. With the local z-score, all levels of intensity get a relatively equal chance.

6.2 Select Differentially Expressed Genes by Bayesian and Empirical Bayes Methods

There is a kind of information sharing among genes. The data from other genes provide some information about the typical variability in the system. This can benefit our analysis because experiments often involve tens of thousands of genes but only tens of microarrays, so the amount of information per gene can be relatively low. Empirical Bayes (EB) methods are well-suited to high dimensional inference problems and thus provide a natural approach to microarray data analysis. In general, classical methods focus on controlling the false discovery rate, while the Empirical Bayes method "EBarrays" aims to classify genes by expression patterns using posterior probabilities (Newton 2002 [42]).

6.2.1 EBarrays

Instead of applying statistical inference to individual genes as in classical methods, Empirical Bayes analysis takes account of information that is shared among genes. Michael A. Newton and Christina Kendziorski (2002) developed an EB framework for selecting differentially expressed genes, and EBarrays is the software package, which is available as an R library. The output of EBarrays provides the posterior probabilities of differential expression across multiple conditions. This section focuses on an overview of how EBarrays works under two conditions.

Our data can be described by two patterns:

(1,1,1,1,1,1,1,1,1,1) and (1,1,1,1,1,2,2,2,2,2).

The first pattern represents equivalent expression (EE) and the second pattern represents differential expression (DE). The results from EBarrays provide the posterior probabilities of the two patterns, as illustrated in Table 6.2. P1 is the posterior probability that a gene has an EE pattern and P2 is the posterior probability that a gene has a DE pattern. Here is the mathematical structure:

A distribution for equivalent expression ($EE_j$) for gene $j$, samples $i = 1, 2, \dots, n$:

$$f_0(x_j) = \int \left( \prod_{i=1}^{n} f_{obs}(x_{ji} \mid \mu) \right) \pi(\mu)\, d\mu.$$

A distribution for differential expression ($DE_j$):

$$f_1(x_j) = f_0(x_{j1})\, f_0(x_{j2}).$$

The posterior probability of differential expression calculated by Bayes' rule is:

$$\frac{p\, f_1(x_j)}{p\, f_1(x_j) + (1 - p)\, f_0(x_j)}.$$

Table 6.2: EBarrays Result Table

Name          P1         P2         Fold Change   Link    Description
117_at        7.01E-44   1.00E+00   0.417294305   3310    X51757/FEATURE=cds
200612_s_at   2.12E-04   1.00E+00   0.72426676    163     gb:NM_001282.1
200613_at     2.61E-04   1.00E+00   0.728831113   1173    gb:NM_004068.1
200632_s_at   2.74E-29   1.00E+00   2.053329445   10397   gb:NM_006096.1
200635_s_at   3.00E-06   1.00E+00   1.429124935   5792    gb:AU145351
200636_s_at   9.75E-06   1.00E+00   1.418685758   5792    gb:NM_002840.1
200637_s_at   2.63E-09   1.00E+00   1.525636934   5792    gb:AI762627
200697_at     1.58E-26   1.00E+00   1.987423539   3098    gb:NM_000188.1
200699_at     3.32E-06   1.00E+00   0.695327778   11014   gb:BE962456
200768_s_at   3.62E-04   1.00E+00   1.376725964   4144    gb:BC001686.1
200769_s_at   2.92E-04   1.00E+00   1.378797068   4144    gb:NM_005911.1

Two particular specifications of the general mixture model are described in EBarrays: the Gamma-Gamma model and the Lognormal-Normal model. The Gamma-Gamma model assumes that the observation component is a Gamma distribution with shape parameter $\alpha > 0$ and mean value $\mu_j$, while the marginal distribution for $\mu_j$ is an Inverse Gamma with hyperparameters. The Lognormal-Normal model assumes that the observation component is a Lognormal distribution with mean value $\mu_j$ and common variance $\sigma^2$, while the marginal distribution of $\mu_j$ is a normal distribution with hyperparameters.

It is critical to check the model assumptions before we apply a model. Figure 6.6 displays a good fit of the Gamma-Gamma model to gene expression from Li-Wong normalization.

Figure 6.6: EBarrays Fit GG Model - From Li Wong Normalized Data

An important advantage of EBarrays is that the data from other genes provide some information about the typical variability in the system through the marginal distribution. The general framework provided by EB analysis is quite flexible. Probability distributions are specified in several layers and account for multiple sources of variation. The posterior probability of differential expression is a very useful inference in EBarrays. It transforms the evidence to the familiar scale of probability. The posterior probability calculation carries over naturally to comparisons among multiple conditions. It is also easier for a non-statistician to interpret.

Methods that treat genes as separate fixed effects may have reduced efficiency compared to methods that treat the genes as arising from some population, since they do not take advantage of the information shared among genes. Furthermore, classifying genes into expression patterns by the posterior probability is an optimal procedure in the context of the mixture model: it minimizes the expected number of errors. In classical testing, the goal is to bound the false discovery rate and maximize the power (Newton 2002 [42]).

A main drawback is that the data may not satisfy the assumptions; for instance, the distributions of expression across genes or within an array often do not follow a normal or gamma distribution.

6.2.2 LIMMA

LIMMA is a software package associated with the paper titled "Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments" by Gordon K. Smyth (2003) [55]. This paper extended and restated the hierarchical model of Lonnstedt and Speed (2002) [40] in the context of general linear models. Consistent, closed form estimators are derived through the marginal distribution of the observed statistics. A simulation study shows the advantage of this method: the estimators lead to a lower false discovery rate.

A Bayes inferential approach is proposed in terms of a moderated t-statistic in which posterior variances are substituted for the sample variances. The moderated t-statistic has the advantage over the ordinary t-statistic that very small sample variances are adjusted heavily while larger sample variances are moderated to a lesser relative degree. The moderated t-statistic approach has the advantage over the posterior odds that the number of hyperparameters to be estimated is reduced and knowledge of the non-null prior for the fold change is not required. The moderated t-statistic is distributed independently of the sample variances, so that inference about the variance and effect hyperparameters can be considered separately. Moreover, the inferential approach extends easily to provide tests involving two or more patterns through the use of moderated F-statistics.

Table 6.3 gives a partial result for our data from the LIMMA package. M is the log fold change, that is, the difference in mean log-expression between the two groups. t is the moderated t-statistic and P.Value is its associated p-value. B is the log-odds that the gene is differentially expressed.
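
To show how substituting a posterior variance changes the statistic, the following Python sketch computes an ordinary and a moderated t-statistic for one gene in a two-group design. It follows the usual form of the moderated variance, $\tilde{s}^2 = (d_0 s_0^2 + d\, s^2)/(d_0 + d)$, where $s^2$ is the pooled sample variance on $d$ degrees of freedom. The function name `moderated_t` is hypothetical, and the prior degrees of freedom `d0` and prior variance `s0_sq` are supplied by the caller as an assumption; in the LIMMA package they are estimated from all genes.

```python
import math
from statistics import mean


def moderated_t(control, treatment, d0, s0_sq):
    """Return (ordinary t, moderated t, degrees of freedom of the moderated t)."""
    n_c, n_t = len(control), len(treatment)
    m_c, m_t = mean(control), mean(treatment)
    d = n_c + n_t - 2  # residual degrees of freedom
    ss = sum((x - m_c) ** 2 for x in control) + sum((x - m_t) ** 2 for x in treatment)
    s_sq = ss / d  # pooled sample variance (assumed positive here)
    # Posterior variance: the sample variance shrunk toward the prior value.
    s_tilde_sq = (d0 * s0_sq + d * s_sq) / (d0 + d)
    scale = math.sqrt(1 / n_c + 1 / n_t)
    ordinary = (m_t - m_c) / (math.sqrt(s_sq) * scale)
    moderated = (m_t - m_c) / (math.sqrt(s_tilde_sq) * scale)
    return ordinary, moderated, d0 + d
```

For a gene whose three replicates per group happen to agree very closely, for example `moderated_t([5.00, 5.01, 4.99], [5.20, 5.21, 5.19], d0=4, s0_sq=0.04)`, the ordinary statistic is very large because the sample variance is tiny, while the moderated statistic is much smaller. The prior values in this call are illustrative only.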

Table 6.3: LIMMA Result Table

Name          M             t             P.Value    B             Description
208992_s_at   3.576775568   43.75188021   1.51E-12   26.00787978   gb:BC000627.1
204988_at     1.826076349   31.30802095   9.49E-11   22.89101456   gb:NM_005141.1
205650_s_at   2.100255784   29.96882705   1.19E-10   22.42797956   gb:NM_021871.1
222088_s_at   1.597187059   26.40506191   5.19E-10   21.02410681   gb:AA778684
202760_s_at   1.519678276   25.54032036   5.88E-10   20.64054422   gb:NM_007203.1
216236_s_at   1.36024007    25.21315777   6.06E-10   20.49054964   gb:AL110298.1
205649_s_at   1.535027452   24.67935255   6.42E-10   20.2398186    gb:NM_000508.2
216238_s_at   2.307502018   24.66346006   6.42E-10   20.23223735   gb:BG545288

6.2.3 Cyber-T

The idea of using a t-statistic with a Bayesian adjusted denominator was proposed by Baldi and Long (2001), who also developed the useful Cyber-T program. Independent normal distributions are used to model the log-expression values. Based on the inspection of typical microarray datasets, it is reasonable to assume that the mean and variance are dependent. In this method, the prior of the mean conditional on the variance is a normal distribution with two hyperparameters. The prior of the variance is assumed to be an inverse Gamma with another two hyperparameters. They obtained the posterior density, which has the same form as the joint prior density. Bayesian estimates for the parameters and hyperparameters are obtained. Simulation shows that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with the simple t-test or fold methods, and partly compensates for the lack of replication.

However, the work was limited to two-sample control versus treatment designs, and the model did not distinguish between differentially and non-differentially expressed genes. They also did not develop consistent estimators for the hyperparameters. The degrees of freedom associated with the prior distribution of the variances was set to a default value, while the prior variance was simply set equal to locally pooled sample variances.

6.3 Other Work in Microarray

Although differential expression analysis can be applied to a large number of genes, it remains within the restriction of the old one-gene-at-a-time model. Most genes act in relation with other genes. The patterns of expression across multiple genes and experiments are critical in DNA microarray analysis. To detect such patterns, clustering must be introduced.

At this level, instead of assuming that genes are independent, researchers are interested in the covariance of genes, that is, in whether there exist multi-gene patterns: clusters of genes that share the same behavior over or across different treatments. Multi-gene expression patterns can be used to characterize diseases and to discriminate, for example, between different kinds of cancers. Various clustering methods (Sebastiani 2002) have been proposed, including k-means and hierarchical clustering. Clustering methods can be applied not only to genes, but also to conditions, DNA sequences, and other related data. The most popular package for clustering microarray data is from Michael Eisen's lab, available at http://rana.lbl.gov/EisenSoftware.htm.

Array data are inherently high-dimensional, hence dimensionality reduction and visualization are particularly useful. Principal component analysis (PCA) (Mike West 2002) and clustering are the most important and widely used methods. PCA can be viewed as a method to compress and visualize data. It provides an optimal linear dimension reduction technique in the mean-square sense.

6.4 Summary

We worked on the selection of differentially expressed genes among 22,215 genes and obtained several gene lists as results. Both classical and Bayesian methods are applied. The classical methods consist of fold change, T-test, Wilcoxon Rank-sum test, local Z-score and SAM (Storey 2002). The Empirical Bayes methods consist of EBarrays (Newton, 2002) [42], LIMMA (Smyth G.K. 2003) [55] and Cyber-T (Baldi and Long 2001) [6]. SAM (Storey 2002), the local Z-score (Z. Chen) [17] and a parametric Empirical Bayes method (Newton, 2002) are discussed in more detail. We illustrated the procedures in the form of clear steps. We also showed the main advantages and drawbacks of each method and explained the result tables. Higher level microarray analysis, such as clustering and PCA, can be done as further work.
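
As a compact illustration of the local Z-score procedure summarized above (Steps 1 to 7 of Section 6.1.5), the following Python sketch may be helpful. It is a minimal version under stated assumptions and not the code used to produce the results in this chapter: the function name `local_z_scores` is hypothetical, each array is passed as a list of positive intensities with one entry per gene, the Lowess step (Step 4) is skipped on the assumption that the ratios are already centered at zero, and the local SD is taken as the sample standard deviation of the log-ratios within each intensity quartile.

```python
import math
from statistics import median, stdev


def local_z_scores(experiments, controls, n_parts=4):
    """Local Z-score per gene from experiment and control arrays (columns)."""

    def normalise(col):
        # Step 1: divide each array by its median.
        med = median(col)
        return [x / med for x in col]

    def combine(cols):
        # Step 2: k-th root of the product of k arrays (geometric mean).
        k = len(cols)
        n_genes = len(cols[0])
        return [math.exp(sum(math.log(c[g]) for c in cols) / k) for g in range(n_genes)]

    exp_comb = combine([normalise(c) for c in experiments])
    ctl_comb = combine([normalise(c) for c in controls])

    # Step 3: log-ratio and log-intensity from the combined intensities.
    log_ratio = [math.log2(e / c) for e, c in zip(exp_comb, ctl_comb)]
    log_intensity = [math.log10(e * c) for e, c in zip(exp_comb, ctl_comb)]

    # Step 6: split the genes into parts by quantiles of log-intensity.
    order = sorted(range(len(log_ratio)), key=lambda g: log_intensity[g])
    size = len(order)
    z = [0.0] * size
    for p in range(n_parts):
        part = order[p * size // n_parts:(p + 1) * size // n_parts]
        # Step 7: divide each log-ratio by the SD of its own part.
        local_sd = stdev([log_ratio[g] for g in part])
        for g in part:
            z[g] = log_ratio[g] / local_sd
    return z
```

Each part needs at least two genes with differing log-ratios for the local SD to be defined and positive; with 22215 genes and four parts this is not a practical concern.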

Bibliography

[1] Anderson, P. and Borgan, O., Statistical Models Based on Counting Processes. (1992)
[2] Ascher, H. and Feingold, H., Repairable Systems Reliability. (1984)
[3] Bain, L. J., Statistical Analysis of Reliability and Life Testing Models (1978)
[4] Bain, L. J. and Engelhardt, M., Statistical Analysis of Reliability and Life Testing Models (1991) New York: 2nd edition, Marcel Dekker.
[5] Baldi, P. and Hatfield, G. W., DNA Microarrays and Gene Expression: From Experiments to Data Analysis and Modeling. (2002)
[6] Baldi, P. and Long, A. D., A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-Test and Statistical Inferences of Gene Changes. Bioinformatics. 17, (2001) 509-519.
[7] Bar-Lev, S. K., Lavi, I. and Reiser, B., Bayesian Inference for the Power Law Process, Ann. Inst. Statist. Math. Vol. 44, No. 4 (1992) 623-639.
[8] Beiser, J. A. and Rigdon, S. E., Bayes Prediction for the Number of Failures of a Repairable System. IEEE Transactions on Reliability. 46, 291-295 (1997)
[9] Berger, J., Statistical Decision Theory and Bayesian Analysis, 2nd Edition, Springer-Verlag, New York, Berlin, and Heidelberg. (1985)
[10] Box, G. E. P. and Tiao, G. C., Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Massachusetts (1973).

[11] Broberg, P., Statistical Methods for Ranking Differentially Expressed Genes, Genome Biology (2003)
[12] Calabria, R., Guida, M. and Pulcini, G., Some Modified Maximum Likelihood Estimators for the Weibull Process. Reliability Engineering and System Safety. 23, 51-58. (1987)
[13] Calabria, R., Guida, M. and Pulcini, G., Bayes Estimation of Prediction Intervals for a Power Law Process, Comm. Statist. Theory Methods. 19, (1990) 3023-3035.
[14] Calabria, R., Guida, M. and Pulcini, G., Power Bounds for a Test of Equality of Trends in k Independent Power Law Processes. Comm. Statist. Theory Methods. 21 (1992) 3275-3290.
[15] Calabria, R. and Pulcini, G., Bayes Inference for the Modulated Power Law Process, Comm. Statist. Theory Methods. 26(10), (1997), 2421-2438.
[16] Chen, Z. and Rao, A. N. V., Bayesian Inference on Power Law Process, Fourth International Conference on Dynamic Systems and Applications at Atlanta (submitted) May 21-24 (2003)
[17] Chen, Z. and Rao, A. N. V., Methods of Selecting Differentially Expressed Genes, Fourth World Congress of Nonlinear Analysts WCNA-2004 (in preparation) June 30-July 7 (2004)
[18] Chen, Z., Computing Probabilities Using the Multiplication Rule, Presentation at Wilkes Barre, PA (2004)
[19] Chen, Z., Hypothesis Testing: Calculating and Minimizing Type II Error Probability, Presentation at Fort Myers, FL (2004)

[20] Cox, D. R. and Lewis, P. A. W., The Statistical Analysis of Series of Events. Wiley (1966)
[21] Crow, L. H., Reliability Analysis for Complex Repairable Systems. Reliability and Biometry
[22] Crow, L. H., Confidence Interval Procedures for the Weibull Process with Applications to Reliability Growth. Technometrics. 24 (1982) 67-72.
[23] Data Analysis Tools for DNA Microarrays (2003)
[24] Duane, J. T., Learning Curve Approach to Reliability Monitoring, IEEE Trans. Aerospace Electron System. 2 (1964) 563-566
[25] Dudoit, S. et al., Statistical Methods for Identifying Differentially Expressed Genes in Replicated cDNA Microarray Experiments. Statistica Sinica. 12 (2002) 111-139.
[26] Engelhardt, M. and Bain, L. J., On the Mean Time Between Failures for Repairable Systems. IEEE Transactions on Reliability (1986) 419-422.
[27] Engelhardt, M. and Bain, J. L., Statistical Analysis of a Compound Power Law Model for Repairable Systems, IEEE Transactions on Reliability, R-36 (1987) 392-396.
[28] Engelhardt, M. and Bain, J. L., Prediction Intervals for the Weibull Process, Technometrics. 20 (1978) 167-169.
[29] Gnedenko, B. V. and Ushakov, I. A., Probabilistic Reliability Engineering. (1995)
[30] Guida, M., Calabria, R. and Pulcini, G., Bayes Inference for a Non-homogeneous Poisson Process with Power Law Intensity. IEEE Transactions on Reliability. R-38 (1989) 603-609.

[31] Hatmann, O. and Samans, B., Microarray Data Normalization and Transformation: Comparing MAS, VSN and RMA for Affymetrix GeneChips. Technical Report: Affymetrix GeneChip normalization with BioConductor (2003)
[32] Higgins, J. J. and Tsokos, C. P., A quasi-Bayes estimate of the failure intensity of a reliability-growth model, IEEE Transactions on Reliability. R-30 (1981) 471-475.
[33] Huang, Y. and Bier, V., A Natural Conjugate Prior for the Non-Homogeneous Poisson Process with a Power Law Intensity Function, Commun. Statist. 27(2), 525-551. (1998)
[34] Huber, W., Heydebreck, A., Sultmann, H., Poustka, A. and Vingron, M., Variance Stabilization Applied to Microarray Data Calibration and to the Quantification of Differential Expression. Bioinformatics. 18, 96-104. (2002)
[35] Kyparisis, J. and Singpurwalla, N. D., Bayesian Inference for the Weibull Process with Applications to Assessing Software Reliability Growth and Predicting Software Failures. Computer Science and Statistics: Proceedings of the Sixteenth Symposium on the Interface. (1985)
[36] Lee, L. and Lee, K., Some Results on Inference for the Weibull Process, Technometrics. 20, (1978) 41-45.
[37] Lee, M., Kuo, F., Whitmore, G. and Sklar, J., Importance of Replication in Microarray Gene Expression Studies: Statistical Methods and Evidence from Repetitive cDNA Hybridizations. PNAS (2000)
[38] Lewin, A., Richardson, S., Marshall, C., Glazier, A. and Aitman, T., Bayesian Modelling of Differential Gene Expression (2003).
[39] Lindqvist, B., Statistical Modeling and Analysis of Repairable Systems, Mathematical Methods in Reliability. September, (1997) -19.

[40] Lonnstedt, I. and Speed, T. P., Replicated Microarray Data. Stat. Sinica. 12, 31-36 (2002)
[41] Musa, J. D., Software Reliability Data, deposited in IEEE Computer Society Repository, New York, 1979.
[42] Newton, M. and Kendziorski, Parametric Empirical Bayes Methods for Microarrays. The Analysis of Gene Expression Data: Methods and Software. (2002) 254-271.
[43] Parmigiani, G., Garrett, E. S., Irizarry, R. A. and Zeger, S. L., The Analysis of Gene Expression Data: Methods and Software. (2003)
[44] Qiao, H. and Tsokos, C., Best Efficient Estimates of the Intensity Function of the Power Law Process. Journal of Applied Statistics, 25, 111-120 (1998)
[45] Qiao, H. and Tsokos, C. P., Reliability Growth: MTBF vs. Intensity Function of the Weibull Process.
[46] Quackenbush, J., Microarray Data Normalization and Transformation, Nature Genetics Supplement, 32, December 2002, 496-501.
[47] Rigdon, S. E. and Basu, A. P., The Effect of Assuming a Homogeneous Poisson Process When the True Process is a Power Law Process. Journal of Quality Technology, 22, 111-117 (1990)
[48] Rigdon, S. E. and Basu, A. P., The Power Law Process: a Model for the Reliability of Repairable Systems. Journal of Quality Technology 21, 251-260. (1990)
[49] Rigdon, S. E. and Basu, A. P., Estimating the Intensity Function of a Power Law Process at the Current Time: Time Truncated Case. Communications in Statistics Simulation and Computation 19, 1079-1104. (1990)

[50] Rigdon, S. E. and Basu, A. P., Statistical Methods for the Reliability of Repairable Systems (2000)
[51] Roberts, H. W. and Rao, A. N. V., Predicting the Performance of Software Systems via the Power Law Process, Dissertation of University of South Florida (2000)
[52] Singpurwalla, N. and Wilson, S., Statistical Methods in Software Engineering. (1999) 1-48.
[53] Speed, T., Interdisciplinary Statistics: Statistical Analysis of Gene Expression Microarray Data. (2003)
[54] Sinha, S. K., Reliability and Life Testing. John Wiley and Sons (1986).
[55] Smyth, G. K., Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments. Statistical Applications in Genetics and Molecular Biology 3 (2003).
[56] Storey, J. D. and Tibshirani, R., SAM Thresholding and False Discovery Rates for Detecting Differential Gene Expression in DNA Microarray. The Analysis of Gene Expression Data (2002) 272-290.
[57] Thompson, Jr., W. A., On the Foundations of Reliability. Technometrics 23 (1981).
[58] Tsokos, C. P., Reliability Growth: Nonhomogeneous Poisson Process. Recent Advances in Life-Testing and Reliability 319-334, CRC, Boca Raton, FL. (1995)
[59] Tsokos, C. and Rao, A. N. V., Estimation of Failure Intensity for the Weibull Process, Journal of Engineering Reliability and System Safety (1995)
[60] Ushakov, I., Handbook of Reliability Engineering. (1994)

About the Author

Ms. Zhao Chen received her bachelor's degree in International Finance from Hunan University in 1997. She then worked as an accountant for two years in China. She started her master's program in the mathematics department of the University of South Florida in August 1999 and received her master's degree in mathematics in May 2001. She was admitted to the doctoral program, specializing in Statistics, in Fall 2001. During her study at the University of South Florida, she received financial support as a teaching assistant. She has instructed various mathematics and statistics courses.

Ms. Chen attended a number of seminars and presented her research at two international conferences and at the weekly seminar in the department. She also worked as a volunteer instructor in the USF Urban Scholars Program. From last July to the present, she has worked as a biostatistician at the H. Lee Moffitt Cancer Center, specializing in microarray and protein analysis.


Catalog record
Control number: 001478786
Identifier: E14-SFE0000430
OCLC number: (OCoLC)56563572
Local identifiers: AJS2476; SFE0000430
Cataloging source: FHM
Call number: QA36 (ONLINE)
Author: Chen, Zhao, 1977-
Title: Bayesian and Empirical Bayes approaches to power law process and microarray analysis [electronic resource] / by Zhao Chen.
Publication: [Tampa, Fla.] : University of South Florida, 2004.
Thesis (Ph.D.)--University of South Florida, 2004.
Includes bibliographical references.
Includes vita.
Text (Electronic thesis) in PDF format.
System requirements: World Wide Web browser and PDF reader.
Mode of access: World Wide Web.
Title from PDF of title page.
Document formatted into pages; contains 109 pages.
Adviser: A.N.V. Rao.
Keywords: Bayesian; power law process; Empirical Bayes; Nonhomogeneous Poisson Process; EBarrays; microarray.
Subject: Dissertations, Academic -- USF -- Mathematics -- Doctoral.
Host item: USF Electronic Theses and Dissertations.
URL: http://digital.lib.usf.edu/?e14.430