USF Libraries
USF Digital Collections

Path centrality :


Material Information

Title:
Path centrality : a new centrality measure in networks
Physical Description:
Book
Language:
English
Creator:
Alahakoon, Tharaka
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla
Publication Date:

Subjects

Subjects / Keywords:
Betweenness centrality
Social networks
Randomized algorithms
Experimental algorithmics
Graphs
Dissertations, Academic -- Computer Science & Engineering -- Masters -- USF   ( lcsh )
Genre:
non-fiction   ( marcgt )

Notes

Abstract:
In network analysis, it is useful to identify important vertices in a network. Based on varying notions of vertex importance, a number of centrality measures have been defined and studied in the literature. Some popular centrality measures, such as betweenness centrality, are computationally prohibitive for large-scale networks. In this thesis, we propose a new centrality measure called k-path centrality and experimentally compare this measure with betweenness centrality. We present a polynomial-time randomized algorithm for distinguishing high k-path centrality vertices from low k-path centrality vertices in any given (unweighted or weighted) graph. Specifically, for any graph G = (V,E) with n vertices and for every choice of parameters alpha in (0,1), epsilon in (0,1/2), and integer k in [1,n], with probability at least 1-1/n^2 our randomized algorithm distinguishes all vertices v in V that have k-path centrality Ck(v) more than n^(alpha)*(1+2*epsilon) from all vertices v in V that have k-path centrality Ck(v) less than n^(alpha)*(1-2*epsilon). The running time of the algorithm is O(k^(2)*epsilon^(-2)*n^(1-alpha)*ln(n)). Next, we present a polynomial-time randomized approximation algorithm for computing the k-path centrality values of all vertices in any given (unweighted or weighted) graph. Specifically, for any graph and for every choice of parameters alpha in (0,1/2) and integer k in [1,n], with probability at least 1-1/n^2 our randomized approximation algorithm computes the k-path centrality value of every vertex within an additive error of at most n^(1/2+alpha). The running time of the algorithm is O(k^(3)*n^(1-2*alpha)*ln(n)). Theoretically and experimentally, our algorithms are (for suitable choices of parameters) significantly faster than the best known deterministic algorithm for computing exact betweenness centrality values (Brandes' algorithm).
Through experiments on both real and randomly generated networks, we demonstrate that vertices that have high betweenness centrality values also have high k-path centrality values.
Thesis:
Thesis (MSCS)--University of South Florida, 2010.
Bibliography:
Includes bibliographical references.
System Details:
Mode of access: World Wide Web.
System Details:
System requirements: World Wide Web browser and PDF reader.
Statement of Responsibility:
by Tharaka Alahakoon.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains X pages.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
usfldc doi - E14-SFE0004573
usfldc handle - e14.4573
System ID:
SFS0027888:00001


Full Text

PAGE 1

Path Centrality: A New Centrality Measure in Networks

by

Tharaka Alahakoon

A thesis submitted in partial fulfillment
of the requirements for the degree of
Master of Science in Computer Science
Department of Computer Science and Engineering
College of Engineering
University of South Florida

Major Professor: Rahul Tripathi, Ph.D.
Adriana Iamnitchi, Ph.D.
Miguel Labrador, Ph.D.

Date of Approval: May 28, 2010

Keywords: betweenness centrality, social networks, randomized algorithms, experimental algorithmics, graphs

Copyright © 2010, Tharaka Alahakoon

PAGE 2

DEDICATION

to my parents and to my two brothers

PAGE 3

ACKNOWLEDGEMENTS

It is my pleasure to thank those who made this thesis possible; this work would not have been completed without the help and support of many individuals. I am grateful to everyone who has supported me with their advice and suggestions. In particular, I would like to express my sincere gratitude to my advisor, Dr. Rahul Tripathi, for giving me the opportunity to conduct my Master's thesis under him. I thank him for the patience, motivation, enthusiasm, and immense knowledge with which he guided and supported me over the course of it. I am grateful to him for the long discussions that helped me sort out the technical details of my work and for his careful reading of and comments on countless revisions of this thesis.

I also wish to express my warm and sincere thanks to Dr. Adriana Iamnitchi for many valuable hours of discussion that helped me understand my experimental area better. I thank her for providing me with very valuable datasets for my experiments and for her guidance in the experimental portion of this thesis. I would like to acknowledge Dr. Miguel Labrador for his comments during my thesis defense, in particular the suggestion of using gnuplot for plotting graphs.

I would also like to acknowledge the use of the services provided by Research Computing, University of South Florida. I am grateful to many people on the faculty and staff of the Department of Computer Science and Engineering and the Department of Mathematics and Statistics at the University of South Florida for their assistance and encouragement in various ways during my course of studies. I would like to thank the many friends and colleagues who have helped me through these difficult years. Their support and care strengthened me to overcome setbacks and stay focused on my graduate study. I greatly value their friendship and I deeply appreciate their belief in me.

Last but not least, I thank my family for believing in me and loving me for all these years. Their love and support kept me moving forward in many joyful as well as difficult times.

PAGE 4

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ALGORITHMS
ABSTRACT
CHAPTER 1 INTRODUCTION
    1.1 Motivation
    1.2 Notations
CHAPTER 2 CENTRALITY MEASURES
    2.1 Popular Centrality Measures
    2.2 Betweenness Centrality
    2.3 Bounded-Distance Betweenness Centrality for Unweighted Graphs
CHAPTER 3 k-PATH CENTRALITY
    3.1 The Notion of k-Path Centrality for Unweighted Graphs
    3.2 The Notion of k-Path Centrality for Weighted Graphs
    3.3 A Randomized Approximation Algorithm for k-Path Centrality
CHAPTER 4 RESULTS
    4.1 Randomly Generated Networks
    4.2 Real Networks
CHAPTER 5 CONCLUSION
REFERENCES
ABOUT THE AUTHOR

PAGE 5

LIST OF TABLES

Table 1.1 Notations
Table 2.1 Runtime of each procedure in Brandes' betweenness centrality algorithm
Table 4.1 Summary information of a random network with 5000 vertices and 10% density
Table 4.2 Summary information of a random network with 5000 vertices and 20% density
Table 4.3 Summary information of a random network with 5000 vertices and 30% density
Table 4.4 Summary information of a random network with 5000 vertices and 40% density
Table 4.5 Summary information of a random network with 5000 vertices and 50% density
Table 4.6 Summary information of a random network with 5000 vertices and 60% density
Table 4.7 Summary information of Zachary's Karate Club dataset
Table 4.8 Summary information of Les Miserables dataset
Table 4.9 Summary information of Word Adjacencies dataset
Table 4.10 Summary information of Condensed Matter 1999 dataset
Table 4.11 Summary information of Condensed Matter 2003 dataset
Table 4.12 Summary information of Condensed Matter 2005 dataset
Table 4.13 Summary information of High-Energy Theory dataset
Table 4.14 Summary information of Internet dataset
Table 4.15 Summary information of Yeast dataset
Table 4.16 Summary information of Kazaa dataset 1
Table 4.17 Summary information of Kazaa dataset 2
Table 4.18 Summary information of Kazaa dataset 3
Table 4.19 Summary information of Computational Geometry dataset
Table 4.20 Summary information of Small World dataset
Table 4.21 Summary information of Small, Griffith and Descendants dataset
Table 4.22 Summary information of Scientometrics dataset
Table 4.23 Summary information of Self-Organizing Maps dataset
Table 4.24 Summary information of Zewail dataset

PAGE 7

LIST OF FIGURES

Figure 4.1 Plot showing running times in seconds of Brandes' algorithm and our randomized approximation algorithm
Figure 4.2 Scatter plot for a random network with 5000 vertices and 10% density
Figure 4.3 Scatter plot for a random network with 5000 vertices and 20% density
Figure 4.4 Scatter plot for a random network with 5000 vertices and 30% density
Figure 4.5 Scatter plot for a random network with 5000 vertices and 40% density
Figure 4.6 Scatter plot for a random network with 5000 vertices and 50% density
Figure 4.7 Scatter plot for a random network with 5000 vertices and 60% density
Figure 4.8 Scatter plot for Zachary's Karate Club dataset
Figure 4.9 Scatter plot for Les Miserables dataset
Figure 4.10 Scatter plot for Word Adjacencies dataset
Figure 4.11 Scatter plot for Condensed Matter 1999 dataset
Figure 4.12 Scatter plot for Condensed Matter 2003 dataset
Figure 4.13 Scatter plot for Condensed Matter 2005 dataset
Figure 4.14 Scatter plot for High-Energy Theory dataset
Figure 4.15 Scatter plot for Internet dataset
Figure 4.16 Scatter plot for Yeast dataset
Figure 4.17 Scatter plot for Kazaa dataset 1
Figure 4.18 Scatter plot for Kazaa dataset 2
Figure 4.19 Scatter plot for Kazaa dataset 3
Figure 4.20 Scatter plot for Computational Geometry dataset
Figure 4.21 Scatter plot for Small World dataset
Figure 4.22 Scatter plot for Small, Griffith and Descendants dataset
Figure 4.23 Scatter plot for Scientometrics dataset
Figure 4.24 Scatter plot for Self-Organizing Maps dataset
Figure 4.25 Scatter plot for Zewail dataset

PAGE 9

LIST OF ALGORITHMS

1 Initialize
2 Relax & Count
3 Accumulation
4 Brandes' Betweenness
5 Bounded-Distance Betweenness
6 Indicator
7 Unweighted k-Path
8 Weighted k-Path
9 Randomized Approximation k-Path

PAGE 10

Path Centrality: A New Centrality Measure in Networks

Tharaka Alahakoon

ABSTRACT

In network analysis, it is useful to identify important vertices in a network. Based on the varying notions of importance of vertices, a number of centrality measures are defined and studied in the literature. Some popular centrality measures, such as betweenness centrality, are computationally prohibitive for large-scale networks. In this thesis, we propose a new centrality measure called $k$-path centrality and experimentally compare this measure with betweenness centrality.

We present a polynomial-time randomized algorithm for distinguishing high $k$-path centrality vertices from low $k$-path centrality vertices in any given (unweighted or weighted) graph. Specifically, for any graph $G = (V, E)$ with $n$ vertices and for every choice of parameters $\alpha \in (0, 1)$, $\varepsilon \in (0, 1/2)$, and integer $k \in [1, n]$, with probability at least $1 - 1/n^2$ our randomized algorithm distinguishes all vertices $v \in V$ that have $k$-path centrality $C_k(v)$ more than $n^{\alpha}(1 + 2\varepsilon)$ from all vertices $v \in V$ that have $k$-path centrality $C_k(v)$ less than $n^{\alpha}(1 - 2\varepsilon)$. The running time of the algorithm is $O(k^2 \varepsilon^{-2} n^{1-\alpha} \ln n)$.

Next, we present a polynomial-time randomized approximation algorithm for computing the $k$-path centrality values of all vertices in any given (unweighted or weighted) graph. Specifically, for any graph and for every choice of parameters $\alpha \in (0, 1/2)$ and integer $k \in [1, n]$, with probability at least $1 - 1/n^2$ our randomized approximation algorithm computes the $k$-path centrality value of every vertex within an additive error of at most $n^{1/2+\alpha}$. The running time of the algorithm is $O(k^3 n^{1-2\alpha} \ln n)$.

Theoretically and experimentally, our algorithms are (for suitable choices of parameters) significantly faster than the best known deterministic algorithm for computing exact betweenness centrality values (Brandes' algorithm). Through experimentations on both real and randomly generated networks, we demonstrate that vertices that have high betweenness centrality values also have high $k$-path centrality values.

PAGE 12

CHAPTER 1
INTRODUCTION

1.1 Motivation

A network is a directed or undirected graph in which vertices typically represent people, organizations, or entities, and edges denote relationships, likeliness, or physical connections. A fundamental problem in the analysis of networks is to determine vertices or edges of great importance in a given network. The motivation of this thesis is to find important vertices of a network in significantly less time.

Many centrality measures have been proposed to evaluate the importance of a vertex or an edge in a network. Betweenness centrality is considered to be the most popular of all. The best known deterministic algorithm for computing betweenness centrality of all vertices or edges takes $O(n + m)$ space and $O(nm)$ time in unweighted graphs, and $O(n + m)$ space and $O(nm + n^2 \log n)$ time in weighted graphs [Bra01], where $n$ is the number of vertices and $m$ is the number of edges in the graph. In this thesis, we introduce a new centrality measure called $k$-path centrality and devise a polynomial-time randomized approximation algorithm for computing all $k$-path centrality values with small error probability. Our randomized approximation algorithm is significantly faster than the algorithm for computing betweenness centrality of all vertices. We also experimentally demonstrate that vertices with high betweenness centrality values also have high $k$-path centrality values. Therefore, we believe that the $k$-path centrality notion and the randomized approximation algorithm proposed in this thesis may have practical merit in network analysis.

1.2 Notations

In a network, it is critical for various practical reasons to identify important vertices or edges. In this thesis, we will focus on the importance of vertices, but the concept can also be

PAGE 13

applied to the importance of edges. For convenience, we can describe a network as a directed or undirected graph $G = (V, E)$, where $V$ represents the set of vertices and $E$ represents the set of edges. Let $n$ and $m$ denote the cardinality of $V$ and the cardinality of $E$, respectively. Let $W$ be a nonnegative weight function on the edges of a graph. Without loss of generality, we can assume that the graphs considered in this thesis are edge-weighted, since the edges of an unweighted graph can be assumed to have unit weights. Also, we can assume all graphs considered in this thesis to be directed, since any undirected edge $\{u, v\}$ can be split into two directed edges $(u, v)$ and $(v, u)$.

Let $G = (V, E)$ be a graph and let $s$, $t$ be arbitrary vertices of $G$. We define a path from $s$ to $t$ as a sequence of vertices that starts from $s$ and ends at $t$ such that any two consecutive vertices in the sequence are connected by an edge of $G$. The weight of a path is defined to be the sum of the weights of all the edges in the path. The shortest path from $s$ to $t$ in $G$ is a path of minimum weight, and we denote its weight by $d_G(s, t)$. Let $P_s(t)$ be the set of predecessors of a vertex $t$ on shortest paths from $s$ to $t$ in $G$. Let $\sigma_{st}$ denote the number of shortest paths from $s$ to $t$ in $G$. For any $v \in V$, let $\sigma_{st}(v)$ denote the number of shortest paths from $s$ to $t$ in $G$ that pass through $v$. Note that $d_G(s, s) = 0$, $\sigma_{ss} = 1$, and $\sigma_{st}(v) = 0$ if $v \in \{s, t\}$.

Some useful notations used in this thesis are summarized in Table 1.1.

Table 1.1 Notations

Notation            Meaning
$V$                 Set of vertices in graph $G = (V, E)$
$E$                 Set of edges in graph $G = (V, E)$
$n$                 Cardinality of $V$
$m$                 Cardinality of $E$
$W(u, v)$           Weight of the directed edge from $u \in V$ to $v \in V$, where $(u, v) \in E$
$d_G(s, t)$         The weight of a shortest path from $s \in V$ to $t \in V$
$P_s(t)$            Set of predecessors of a vertex $t$ on shortest paths from $s \in V$ to $t \in V$
$\sigma_{st}$       The number of shortest paths from $s \in V$ to $t \in V$
$\sigma_{st}(v)$    The number of shortest paths from $s \in V$ to $t \in V$ that pass through $v \in V$
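To make the notation concrete, the following is a minimal Python sketch (not taken from the thesis) that computes $d_G(s, t)$, $\sigma_{st}$, and $P_s(t)$ for a single source $s$ in an unweighted directed graph using breadth-first search. The function name `count_shortest_paths`, the adjacency-list dictionary `adj`, and the small diamond-shaped example graph are illustrative assumptions.

```python
from collections import deque

def count_shortest_paths(adj, s):
    """Single-source BFS on an unweighted directed graph.

    adj maps every vertex to a list of its out-neighbors (hypothetical format).
    Returns (dist, sigma, pred), where dist[t] = d_G(s, t) for reachable t,
    sigma[t] = number of shortest s-t paths, and pred[t] = P_s(t).
    """
    dist = {s: 0}
    sigma = {v: 0 for v in adj}
    sigma[s] = 1                      # sigma_ss = 1 by convention
    pred = {v: [] for v in adj}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:         # v is discovered for the first time
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
                pred[v].append(u)
    return dist, sigma, pred

# Diamond graph: a -> b, a -> c, b -> d, c -> d
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
dist, sigma, pred = count_shortest_paths(adj, "a")
print(dist["d"], sigma["d"], sorted(pred["d"]))
```

In the diamond graph there are two shortest paths from a to d, one through b and one through c, so both b and c are predecessors of d.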

PAGE 14

CHAPTER 2
CENTRALITY MEASURES

2.1 Popular Centrality Measures

A major focus in network analysis is on finding important vertices or edges in a given network. Centrality measures are used to rank vertices or edges based on their importance in the network. Henceforward, we will consider vertex-based centrality measures and ignore their edge-based counterparts, since the latter are defined analogously to the former. Some of the earliest studies of network analysis occurred in the field of social science [WF94, Sco00]. Over the years, network researchers have introduced many indices to measure the centrality of a vertex. The simplest centrality measure is the degree. Because the degree of a vertex depends only on its neighbors, the degree measure does not give a good indication of the position of the vertex relative to the whole network. Therefore, centrality measures that rank a vertex according to its position in the network are preferable. Some popular centrality measures are Closeness [Bea65, Sab66], Eigenvector [Bon72], Graph [HH95], Stress [Shi53], and Betweenness [Ant71, Fre77].

Definition 2.1 (Closeness Centrality [Bea65]) For every vertex $v \in V$ of a directed weighted graph $G(V, E)$, the closeness centrality $C_C(v)$ of $v$ is defined by

$$C_C(v) = \frac{n - 1}{\sum_{t \in V} d_G(v, t)}.$$

Definition 2.2 (Graph Centrality [HH95]) For every vertex $v \in V$ of a directed weighted graph $G(V, E)$, the graph centrality $C_G(v)$ of $v$ is defined by

$$C_G(v) = \frac{1}{\max_{t \in V} d_G(v, t)}.$$

PAGE 15

Definition 2.3 (Stress Centrality [Shi53]) For every vertex $v \in V$ of a directed weighted graph $G(V, E)$, the stress centrality $C_S(v)$ of $v$ is defined by

$$C_S(v) = \sum_{s \neq v} \sum_{t \neq v, s} \sigma_{st}(v).$$

Definition 2.4 (Betweenness Centrality [Ant71, Fre77]) For every vertex $v \in V$ of a directed weighted graph $G(V, E)$, the betweenness centrality $C_B(v)$ of $v$ is defined by

$$C_B(v) = \sum_{s \neq v} \sum_{t \neq v, s} \frac{\sigma_{st}(v)}{\sigma_{st}}.$$

According to the above definitions, a high centrality score of a vertex indicates that the vertex is easily reachable from other vertices. As an example, suppose that we want to know who the most important people in a research group are. If we assign edge weights between two people according to the number of papers they published together, we get an undirected, weighted graph. As mentioned before, we can treat this graph as a directed graph by splitting each edge into two directed edges, one in each direction. Any of the above centrality measures will assign each vertex a value. By arranging these values in decreasing order, we can figure out who the most important researchers in that group are. Note that we might get a different ordering depending on the centrality measure used. It is debatable which centrality measure best describes the importance of vertices in a graph. Centrality measures assign values to vertices that allow us to compare two or more vertices. Typically, the higher the centrality value, the greater the importance of that vertex.

The computation of these centrality measures is a time-consuming task. A lot of research is devoted to devising fast algorithms for computing these measures. The betweenness centrality measure captures the degree of influence a vertex has over the information flow in a network. Therefore, in recent years betweenness centrality has gained popularity among many social network analysts. The best known exact algorithm for computing betweenness centrality is given by Brandes [Bra01]. Computation of the betweenness centrality using Brandes' algorithm takes $O(n + m)$ space and $O(nm)$ time on unweighted graphs, and $O(n + m)$ space and $O(nm + n^2 \log n)$ time on weighted graphs, where $n$ is the number of vertices and $m$ is

PAGE 16

the number of edges. Based on the idea used by Eppstein and Wang [EW01], Brandes and Pich [BP07] proposed an approximation algorithm to compute betweenness centrality. Their algorithm randomly chooses a small sample of vertices called pivots. It then computes single-source shortest paths from each pivot to all other vertices. Then the algorithm defines the estimated betweenness centrality of a vertex in terms of the average contribution of each pivot. The accuracy and the runtime of the algorithm depend on the number of pivots selected.

Geisberger, Sanders and Schultes [GSS08] proposed linear scaling and bisection scaling techniques to approximate betweenness centrality. In linear scaling, they assume that the contribution of the sample depends linearly on the distance to the sample; this can be implemented using a slight variation of Brandes' betweenness algorithm. In bisection scaling, the contribution of the sample is counted only on the second half of the path, which requires a different approach with another level of random sampling. Bader et al. [BKMM07] presented an adaptive-sampling-based algorithm for approximately computing the betweenness centrality of a given vertex. They showed that if the betweenness centrality of a vertex is $n^2/t$, for some constant $t \geq 1$, then with probability at least $1 - 2\varepsilon$, its betweenness centrality can be estimated to within a factor of $1/\varepsilon$, where $0 < \varepsilon < 1/2$, using $\varepsilon t$ samples of source vertices.

We next visit betweenness centrality in detail.

2.2 Betweenness Centrality

Anthonisse [Ant71] and Freeman [Fre77] independently introduced the betweenness centrality of a vertex. The computation of the betweenness centrality of a vertex $v \in V$ requires two steps:

• counting the number of shortest paths between all pairs $(s, t)$ of vertices of $G$
• counting the number of such paths between $(s, t)$ that pass through $v$

Note that a vertex $v$ lies on a shortest path from $s$ to $t$ if and only if the equality $d_G(s, t) = d_G(s, v) + d_G(v, t)$ is satisfied. We can compute the weight and the number of shortest paths between any pair $(s, t)$ of vertices using traversal algorithms such as the breadth-first search algorithm for unweighted graphs and Dijkstra's algorithm for weighted graphs. For computing

PAGE 17

$\sigma_{st}$, we use the fact that $\sigma_{st} = \sum_{v \in P_s(t)} \sigma_{sv}$, where $P_s(t)$ denotes the set of predecessors of the vertex $t$ on shortest paths from $s$ to $t$. After computing $\sigma_{st}$ values for all $s, t \in V$, we can compute $\sigma_{st}(v)$ using the following fact:

$$\sigma_{st}(v) = \begin{cases} \sigma_{sv} \cdot \sigma_{vt} & \text{if } d_G(s, t) = d_G(s, v) + d_G(v, t) \\ 0 & \text{otherwise.} \end{cases}$$

This simple algorithm for computing betweenness centrality requires $O(n^2)$ space and $O(n^3)$ time. Calculating betweenness centrality in networks with a huge number of vertices is computationally prohibitive. Brandes [Bra01] was able to compute betweenness centrality in $O(n + m)$ space and $O(nm)$ time on unweighted graphs, and $O(n + m)$ space and $O(nm + n^2 \log n)$ time on weighted graphs. Brandes defined the notion of the dependency of a vertex $s \in V$ on a single vertex $v \in V$ as $\delta_{s\bullet}(v) = \sum_{t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$ and proved the following recursive relation on $\delta_{s\bullet}(v)$, which is central to his algorithm:

$$\delta_{s\bullet}(v) = \sum_{w \,:\, v \in P_s(w)} \frac{\sigma_{sv}}{\sigma_{sw}} \left(1 + \delta_{s\bullet}(w)\right).$$

Below we give Brandes' betweenness centrality algorithm for weighted graphs. The unweighted case can be handled more simply using breadth-first search instead of Dijkstra's algorithm. The pseudocode for Brandes' betweenness algorithm is described in the procedure Brandes' Betweenness. This procedure makes calls to the procedures Initialize, Relax & Count, and Accumulation.

The procedure Initialize takes a graph $G(V, E)$, array $d$ of distances, array $\sigma$ of number of shortest paths, array $P$ of predecessors, queue $Q$, stack $S$, and a start vertex $s$ as input. It does the following:

• For every vertex $v$ except $s$, the distance $d[v]$ from $s$ to $v$ is set to $\infty$. The distance $d[s]$ from $s$ to itself is set to 0.
• For every vertex $v$ except $s$, the number of shortest paths $\sigma[v]$ from $s$ to $v$ is set to 0. The value $\sigma[s]$ is set to 1.

PAGE 18

• For every vertex $v$, the predecessor list $P[v]$ of $v$ is set to $\emptyset$.
• Add vertex $s$ to the queue $Q$.
• Set stack $S$ to $\emptyset$.

input : Graph G = (V, E), Array d, Array σ, Array P, Queue Q, Stack S, Vertex s
output: None
begin
    foreach vertex v ∈ V do
        d[v] ← ∞; σ[v] ← 0; P[v] ← ∅;
    d[s] ← 0; σ[s] ← 1; Q ← {s}; S ← ∅;
end
Algorithm 1: Initialize

The procedure Relax & Count takes two vertices $u$ and $v$ of a graph $G$, array $d$ of distances, array $\sigma$ of number of shortest paths, array $P$ of predecessors, queue $Q$, and array $W$ of edge weights as input. It does the following:

• If the current distance estimate $d[v]$ of $v$ is greater than the sum of the distance estimate $d[u]$ of $u$ and the weight $W(u, v)$ of edge $(u, v)$, then
    – set $d[v]$ to the latter value.
    – set $\sigma[v] \leftarrow \sigma[u]$.
    – set $u$ as the only predecessor of $v$.
    – insert/update $v$ in the queue $Q$ with new key $d[v]$.
• If the current distance estimate $d[v]$ of $v$ is equal to the sum of the distance estimate $d[u]$ of $u$ and the weight $W(u, v)$ of edge $(u, v)$, then

PAGE 19

    – set $\sigma[v] \leftarrow \sigma[v] + \sigma[u]$.
    – append $u$ to the predecessor list of $v$.

input : Array d, Array σ, Array P, Queue Q, Array W, Vertex u, Vertex v
output: None
begin
    if d[v] > d[u] + W(u, v) then
        d[v] ← d[u] + W(u, v);
        σ[v] ← σ[u];
        P[v] ← {u};
        insert/update Q by setting Q(v) with new key d[v];
    else if d[v] = d[u] + W(u, v) then
        σ[v] ← σ[v] + σ[u];
        P[v] ← P[v] ∪ {u};
end
Algorithm 2: Relax & Count

The two cases are mutually exclusive: the equality case applies only when the distance estimate was not just lowered, since otherwise $\sigma[u]$ would be counted twice.

The procedure Accumulation takes a graph $G$, array $C_B$ of betweenness centrality measures, array $\sigma$ of number of shortest paths, array $P$ of predecessors, stack $S$, and a start vertex $s$ as input. It does the following:

• For every vertex $v$ of $G$, the dependency of the vertex $s$ on $v$ that contributes to the betweenness centrality $C_B(v)$ of $v$ is computed.

input : Graph G = (V, E), Array C_B, Array σ, Array P, Stack S, Vertex s
output: None
begin
    foreach vertex v ∈ V do
        δ[v] ← 0;
    while S is nonempty do
        pop w ← S;
        for v ∈ P[w] do
            δ[v] ← δ[v] + (σ[v] / σ[w]) · (1 + δ[w]);
        if w ≠ s then
            C_B(w) ← C_B(w) + δ[w];
end
Algorithm 3: Accumulation

The procedure Brandes' Betweenness takes a graph $G$ and array $W$ of edge weights as input. It does the following:

PAGE 20

• For every vertex $v$, its betweenness centrality $C_B(v)$ is computed.

input : Graph G = (V, E), Array W
output: Array C_B
begin
    foreach vertex v ∈ V do
        C_B(v) ← 0;
    foreach vertex s ∈ V do
        Initialize(G, d, σ, P, Q, S, s);
        while Q is nonempty do
            u ← Extract-Min(Q);
            push u to S;
            foreach vertex v such that (u, v) ∈ E do
                Relax&Count(d, σ, P, Q, W, u, v);
        Accumulation(G, C_B, σ, P, S, s);
    return C_B;
end
Algorithm 4: Brandes' Betweenness

Analysis: Brandes' betweenness centrality algorithm consists of several procedures. Based on a Fibonacci heap implementation of the priority queue $Q$, the running time of these procedures can be summarized as follows.

Table 2.1 Runtime of each procedure in Brandes' betweenness centrality algorithm

Procedure               Runtime
Initialize              $O(n)$
Relax & Count           amortized $O(1)$
Accumulation            $O(n)$
Brandes' Betweenness    $O(nm + n^2 \log n)$

Thus, the overall running time of Brandes' betweenness centrality algorithm is $O(nm + n^2 \log n)$. Using breadth-first search instead of Dijkstra's algorithm when the input graph is unweighted, the computation time of Brandes' betweenness centrality algorithm reduces to $O(nm)$.
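The four procedures above can be combined into one short Python sketch. It is an illustration under stated assumptions rather than the thesis's implementation: the graph is a dictionary of dictionaries `adj[u][v] = weight` with strictly positive weights, the name `brandes_betweenness` is hypothetical, and a binary heap from the standard `heapq` module (with lazy deletion of stale entries) stands in for the Fibonacci heap assumed in the analysis, so the running time of this sketch is $O(nm \log n)$ rather than $O(nm + n^2 \log n)$.

```python
import heapq

def brandes_betweenness(adj):
    """Betweenness centrality of every vertex of a weighted directed graph.

    adj[u][v] is the (strictly positive) weight of edge (u, v); every vertex
    must appear as a key of adj. Returns a dict {v: C_B(v)}.
    """
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # Initialize
        dist = {v: float("inf") for v in adj}
        sigma = {v: 0 for v in adj}
        pred = {v: [] for v in adj}
        dist[s], sigma[s] = 0, 1
        heap = [(0, s)]
        stack = []
        settled = set()
        # Dijkstra with Relax & Count
        while heap:
            d_u, u = heapq.heappop(heap)
            if u in settled:          # stale heap entry, skip it
                continue
            settled.add(u)
            stack.append(u)
            for v, w in adj[u].items():
                new_dist = d_u + w
                if new_dist < dist[v]:
                    dist[v] = new_dist
                    sigma[v] = sigma[u]
                    pred[v] = [u]
                    heapq.heappush(heap, (new_dist, v))
                elif new_dist == dist[v]:
                    sigma[v] += sigma[u]
                    pred[v].append(u)
        # Accumulation, in order of non-increasing distance from s
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb

# Undirected path a - b - c, split into directed unit-weight edges.
path = {"a": {"b": 1}, "b": {"a": 1, "c": 1}, "c": {"b": 1}}
print(brandes_betweenness(path))
```

On the path graph, only b lies strictly between other vertices: it is on the unique shortest path for the ordered pairs (a, c) and (c, a), so its betweenness is 2 and the endpoints score 0.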

PAGE 21

2.3 Bounded-Distance Betweenness Centrality for Unweighted Graphs

In betweenness centrality for unweighted graphs, we count shortest paths irrespective of their length. Borgatti and Everett [BE06] proposed limiting the path length. Their argument was that very long paths are seldom used. They called this measure $k$-betweenness centrality because it considers only paths of length at most $k$. Later, Brandes [Bra08] redefined this measure as bounded-distance betweenness centrality.

Definition 2.5 (Bounded-Distance Betweenness Centrality [Bra08]) For any fixed $k \in \mathbb{N}^+$ and for every vertex $v \in V$ of a directed unit-weighted (i.e., unweighted) graph $G = (V, E)$, the bounded-distance betweenness centrality $C_{B(k)}(v)$ of $v$ is defined by

$$C_{B(k)}(v) = \sum_{s, t \in V \,:\, d_G(s, t) \leq k} \frac{\sigma_{st}(v)}{\sigma_{st}}.$$

Bounded-distance betweenness centrality for an unweighted graph can be explicitly computed using Brandes' betweenness centrality algorithm, where we stop the breadth-first search of the algorithm when a vertex of distance $k$ is reached [Bra08].

The procedure Bounded-Distance Betweenness takes a graph $G$, integer $k \in \mathbb{N}^+$, and an array $W$ representing the weight function on the edges of $G$ as input. It does the following:

• For every vertex $v$, the bounded-distance betweenness $C_{B(k)}(v)$ is computed.

PAGE 22

input : Graph G, Integer k ∈ N⁺, Array W such that W(e) = 1 for every edge e
output: Array C_B(k)
begin
    foreach vertex v ∈ V do
        C_B(k)(v) ← 0;
    foreach vertex s ∈ V do
        Initialize(G, d, σ, P, Q, S, s);
        while Q is nonempty do
            u ← Extract-Min(Q);
            if d[u] > k then
                break;
            push u to S;
            foreach vertex v such that (u, v) ∈ E do
                Relax&Count(d, σ, P, Q, W, u, v);
        Accumulation(G, C_B(k), σ, P, S, s);
    return C_B(k);
end
Algorithm 5: Bounded-Distance Betweenness

Analysis: The bounded-distance betweenness algorithm above is a restricted version of Brandes' betweenness algorithm. Since the traversal of the graph from each vertex reaches only the vertices that are at distance at most $k$, the while loop breaks after reaching distance $k$. In the worst case, if the distances from any vertex to all other vertices are at most $k$, this algorithm performs the same work as Brandes' betweenness algorithm, which takes $O(nm)$ time on unweighted graphs.
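The distance cutoff can be illustrated with a small Python sketch for unweighted graphs. It is a sketch under assumptions, not the thesis's code: it uses breadth-first search in place of the priority queue, represents the graph as a dictionary of out-neighbor lists, and the name `bounded_distance_betweenness` is hypothetical. Vertices at distance exactly $k$ are recorded but not expanded, so only pairs $(s, t)$ with $d_G(s, t) \leq k$ contribute.

```python
from collections import deque

def bounded_distance_betweenness(adj, k):
    """Bounded-distance betweenness C_B(k)(v) on an unweighted directed graph.

    adj maps every vertex to a list of out-neighbors; k is a positive integer.
    """
    cb = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        pred = {v: [] for v in adj}
        stack = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            if dist[u] == k:          # do not explore beyond distance k
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    pred[v].append(u)
        # Same accumulation as in Brandes' algorithm, restricted to the stack
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb

# Directed path a -> b -> c -> d
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
print(bounded_distance_betweenness(chain, 2))
print(bounded_distance_betweenness(chain, 3))
```

With $k = 2$ the pair (a, d), at distance 3, is excluded, so b and c each score 1; with $k = 3$ every pair is included and both score 2, matching ordinary betweenness.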

PAGE 23

CHAPTER 3
k-PATH CENTRALITY

3.1 The Notion of k-Path Centrality for Unweighted Graphs

In the previous chapter, we introduced some popular centrality measures. Roughly speaking, the betweenness centrality of a vertex determines how important that vertex is in relation to the overall information flow in the network. Therefore, it is considered to be an important centrality measure for social networks. Brandes' betweenness centrality algorithm [Bra01] is the best known deterministic algorithm for computing the betweenness of all the vertices of a graph. However, a significant shortcoming of Brandes' algorithm is that its running time is prohibitive for large-scale networks. With the motivation to find the most important vertices in a network, in this chapter we introduce a new centrality measure called $k$-path centrality.

Since the unweighted case is easier to grasp, we first define $k$-path centrality for unweighted graphs. For any $s \in V$, let $p(s; \ell)$ denote the number of paths of length exactly $\ell \in \mathbb{N}^+$ that start from $s$. For any $s, v \in V$, let $p_v(s; \ell)$ denote the number of paths of length exactly $\ell \in \mathbb{N}^+$ that start from $s$ and pass through $v$.

Definition 3.1 ($k$-Path Centrality for Unweighted Graphs) For any fixed $k \in \mathbb{N}^+$ and for every vertex $v \in V$ of a directed unit-weighted (i.e., unweighted) graph $G(V, E)$, the $k$-path centrality $C_k(v)$ of $v$ is defined by

$$C_k(v) = \sum_{s \neq v} \sum_{1 \leq \ell \leq k} \frac{p_v(s; \ell)}{p(s; \ell)}.$$

We present a randomized algorithm with small error probability (at most $1/n^2$) that distinguishes vertices with high $k$-path centrality values from vertices with low $k$-path centrality values, where $n$ is the number of vertices in the graph. Specifically, for every choice of real numbers $\alpha \in (0, 1)$, $\varepsilon \in (0, 1/2)$ and fixed integer $k \in [1, n]$, with probability at least $1 - 1/n^2$,

PAGE 24

if a vertex $v \in V$ has $k$-path centrality $C_k(v) > n^{\alpha}(1 + 2\varepsilon)$, then our randomized algorithm reports that $v$ is a vertex with high $k$-path centrality, and if a vertex $v \in V$ has $k$-path centrality $C_k(v) < n^{\alpha}(1 - 2\varepsilon)$, then our randomized algorithm reports that $v$ is a vertex with low $k$-path centrality.

[...] If $B(v) > 6 \ln n / \varepsilon^2$, then vertex $v$ is reported as a high $k$-path centrality vertex. Otherwise it is reported as a low $k$-path centrality vertex.

input : Graph G, Real ε ∈ (0, 1/2), Array B
output: None
begin
    n ← |V(G)|;
    foreach vertex v ∈ V do
        if B(v) > 6 ln n / ε² then
            Print "v has a high k-path centrality";
        else
            Print "v has a low k-path centrality";
end
Algorithm 6: Indicator

The procedure Unweighted k-Path takes an unweighted graph $G$, real value $\alpha \in (0, 1)$, real value $\varepsilon \in (0, 1/2)$, and integer $k \in [1, n]$ as input. It does the following:

• For every vertex $v$, indicate whether it is a high $k$-path centrality vertex or a low $k$-path centrality vertex.

PAGE 25

input : Graph G, Real α ∈ (0, 1), Real ε ∈ (0, 1/2), Integer k ∈ [1, n]
output: None
begin
    foreach vertex v ∈ V do
        B(v) ← 0; v ← unexplored;
    n ← |V(G)|;
    Stack S ← ∅;
    outer-loop ← 1;
    while outer-loop ≤ 6 k ε^(−2) n^(1−α) ln n do
        Let k' ∈ [1, k] be an integer chosen uniformly at random;
        Let x ∈ V be a vertex chosen uniformly at random;
        x ← explored;
        push x to S;
        inner-loop ← 1;
        while inner-loop ≤ k' and not all outgoing edges from x are explored do
            Let (x, y) be an edge from x to an unexplored vertex y, chosen uniformly at random;
            y ← explored;
            push y to S;
            B(y) ← B(y) + 1;
            inner-loop ← inner-loop + 1;
            x ← y;
        while S is nonempty do
            pop x ← S;
            x ← unexplored;
        outer-loop ← outer-loop + 1;
    Indicator(G, ε, B);
end
Algorithm 7: Unweighted k-Path
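A minimal Python sketch of the Unweighted k-Path and Indicator procedures follows. It is an illustration under assumptions rather than the author's implementation: the graph is a dictionary of out-neighbor lists with at least two vertices, the function names `kpath_walk_counts`, `indicator`, and `unweighted_kpath` are hypothetical, the number of walks is rounded up to an integer, and a per-walk set replaces the explored flags and the stack used to reset them.

```python
import math
import random

def kpath_walk_counts(adj, k, num_walks, rng):
    """Run num_walks self-avoiding random walks and count visits per vertex."""
    count = {v: 0 for v in adj}
    vertices = list(adj)
    for _ in range(num_walks):
        length = rng.randint(1, k)        # k' chosen uniformly from [1, k]
        x = rng.choice(vertices)          # start vertex chosen uniformly
        explored = {x}                    # vertices explored on this walk
        for _ in range(length):
            candidates = [y for y in adj[x] if y not in explored]
            if not candidates:            # no edge to an unexplored vertex
                break
            x = rng.choice(candidates)
            explored.add(x)
            count[x] += 1
    return count

def indicator(count, n, eps):
    """Label each vertex 'high' or 'low' using the threshold 6 ln n / eps^2."""
    threshold = 6 * math.log(n) / eps ** 2
    return {v: ("high" if c > threshold else "low") for v, c in count.items()}

def unweighted_kpath(adj, alpha, eps, k, seed=0):
    """Return (labels, counts) for an unweighted directed graph."""
    rng = random.Random(seed)
    n = len(adj)
    num_walks = math.ceil(6 * k * n ** (1 - alpha) * math.log(n) / eps ** 2)
    count = kpath_walk_counts(adj, k, num_walks, rng)
    return indicator(count, n, eps), count

# Star with every leaf pointing at the hub: only the hub is ever walked into.
star = {"hub": [], "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"]}
labels, counts = unweighted_kpath(star, alpha=0.5, eps=0.25, k=2)
print(labels)
```

In this example no walk can ever enter a leaf, so every leaf has a count of 0 and is labeled low, while the hub is entered by every walk that starts at a leaf.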

PAGE 26

Analysis: Let $\ell = 6 k \varepsilon^{-2} n^{1-\alpha} \ln n$. Let vertex $v$ be a high $k$-path centrality vertex. Then $C_k(v) > n^{\alpha}(1 + 2\varepsilon)$. Define, for $1 \leq i \leq \ell$,

$$X_i = \begin{cases} 1 & \text{if the } i\text{th random walk passes through } v \\ 0 & \text{otherwise.} \end{cases}$$

It is easy to see that when the algorithm terminates, $B(v) = \sum_{i=1}^{\ell} X_i$. Note that, writing $j$ for the length of a walk,

$$E[X_i] = \Pr[X_i = 1] = \frac{1}{kn} \sum_{s \in V} \sum_{1 \leq j \leq k} \Pr[\text{a random walk of length } j \text{ starts from } s \text{ and passes through } v] = \frac{1}{kn} \cdot 0 + \frac{1}{kn} \sum_{s \neq v} \sum_{1 \leq j \leq k} \frac{p_v(s; j)}{p(s; j)} = \frac{1}{kn} C_k(v).$$

Therefore,

$$E[B(v)] = \sum_{i=1}^{\ell} E[X_i] = \frac{\ell}{kn} C_k(v) > \frac{\ell}{k n^{1-\alpha}} (1 + 2\varepsilon),$$

and, since $(1 - \varepsilon)(1 + 2\varepsilon) \geq 1$ for $\varepsilon \in (0, 1/2)$,

$$\Pr\left[B(v) \leq \frac{\ell}{k n^{1-\alpha}}\right] \leq \Pr\left[B(v) \leq (1 - \varepsilon) E[B(v)]\right] \leq e^{-\frac{\varepsilon^2}{2} \cdot \frac{\ell}{k n^{1-\alpha}} (1 + 2\varepsilon)} \quad \text{(by Chernoff bounds)} \quad \leq e^{-3 \ln n} = n^{-3}.$$

From the above analysis, we can conclude (taking a union bound over the $n$ vertices) that the $k$-path centrality randomized algorithm has an error probability of at most $n^{-2}$ and has runtime $O(\ell k) = O(k^2 \varepsilon^{-2} n^{1-\alpha} \ln n)$.

3.2 The Notion of k-Path Centrality for Weighted Graphs

Now we will generalize the $k$-path centrality notion and define it for weighted graphs. For any $s \in V$ and $\ell \in \mathbb{N}^+$, let $B(s; \ell)$ denote the number of paths $\pi_{st}$ from $s$ to some vertex $t \in V$ with one of the following properties:

PAGE 27

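• The total weight of the path $\pi_{st}$ from $s$ to $t$ is $\ell$, i.e., $W(\pi_{st}) = \ell$.
• The total weight of the path $\pi_{st}$ from $s$ to $t$ is less than $\ell$ and there is an edge $(t, w)$ that does not belong to $\pi_{st}$ such that $W(\pi_{st}) + W(t, w) > \ell$.

Similarly, for any $s, v \in V$ and $\ell \in \mathbb{N}^+$, let $B_v(s; \ell)$ denote the number of paths $\pi_{st}(v)$ from $s$ to some vertex $t \in V$ that pass through $v$ and have one of the following properties:

• The total weight of the path $\pi_{st}(v)$ from $s$ to $t$ that passes through $v$ is $\ell$, i.e., $W(\pi_{st}(v)) = \ell$.
• The total weight of the path $\pi_{st}(v)$ from $s$ to $t$ that passes through $v$ is less than $\ell$ and there is an edge $(t, w)$ that does not belong to $\pi_{st}(v)$ such that $W(\pi_{st}(v)) + W(t, w) > \ell$.

Now we can define $k$-path centrality for the weighted case similarly to the unweighted case.

Definition 3.2 ($k$-Path Centrality for Weighted Graphs) For any fixed $k \in \mathbb{N}^+$ and for every vertex $v \in V$ of a directed weighted graph $G(V, E)$, the $k$-path centrality $C_k(v)$ of $v$ is defined by

$$C_k(v) = \sum_{s \neq v} \sum_{1 \leq \ell \leq k} \frac{B_v(s; \ell)}{B(s; \ell)}.$$

Note that the above definition works for both weighted and unweighted graphs. We give a randomized algorithm (Algorithm 8) such that, for every choice of real numbers $\alpha \in (0, 1)$, $\varepsilon \in (0, 1/2)$ and fixed integer $k \in [1, n]$, with probability at least $1 - 1/n^2$, if a vertex $v \in V$ has $k$-path centrality $C_k(v) > n^{\alpha}(1 + 2\varepsilon)$, then the randomized algorithm reports that $v$ is a vertex with high $k$-path centrality, and if a vertex $v \in V$ has $k$-path centrality $C_k(v) < n^{\alpha}(1 - 2\varepsilon)$, then the randomized algorithm reports that $v$ is a vertex with low $k$-path centrality. [...]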

input: Graph $G$, Array $W$, Real $\alpha \in (0,1)$, Real $\varepsilon \in (0,1/2)$, Integer $k \in [1,n]$
output: None
begin
    foreach vertex $v \in V$ do $B_v \leftarrow 0$; $v \leftarrow$ unexplored;
    $n \leftarrow |V(G)|$;
    Stack $S \leftarrow \emptyset$;
    outer-loop $\leftarrow 1$;
    while outer-loop $\le 6k\varepsilon^{-2}n^{1-\alpha}\ln n$ do
        Let $k' \in [1,k]$ be an integer chosen uniformly at random;
        Let $x \in V$ be a vertex chosen uniformly at random;
        $x \leftarrow$ explored; push $x$ to $S$;
        inner-loop $\leftarrow 1$;
        while inner-loop $\le k'$ and not all outgoing edges from $x$ are explored do
            Let $(x,y)$ be a random edge from $x$ to an unexplored vertex $y$, chosen with probability inversely proportional to its edge weight;
            $y \leftarrow$ explored; push $y$ to $S$;
            $B_y \leftarrow B_y + 1$;
            inner-loop $\leftarrow$ inner-loop $+ 1$;
            $x \leftarrow y$;
        while $S$ is nonempty do
            pop $x \leftarrow S$; $x \leftarrow$ unexplored;
        outer-loop $\leftarrow$ outer-loop $+ 1$;
    Indicator($G$, $\alpha$, $B$);
end
Algorithm 8: Weighted $k$-Path

3.3 A Randomized Approximation Algorithm for $k$-Path Centrality

Our algorithm for approximate computation of $k$-path centrality values in a network follows the Weighted $k$-Path procedure; the main difference is in the number of iterations. By the analysis of the algorithm given below, we found the required number of iterations to be $2k^2n^{1-2\alpha}\ln n$ for every choice of integer $k \in [1,n]$ and real number $\alpha \in (0,1/2)$.

The procedure Randomized Approximation $k$-Path (Algorithm 9) takes a graph $G$, real value $\alpha$, integer $k$, and array $W$ of edge weights as input. It does the following:

• For every vertex $v$, an approximation to the $k$-path centrality $C_k(v)$ is computed.
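To give a feel for the iteration counts stated above, the following Python sketch evaluates both expressions for concrete parameters. The function names are illustrative and do not come from the thesis, and rounding up to an integer is an assumption, since the text states the counts as real-valued expressions.

```python
import math


def weighted_k_path_iterations(n, k, alpha, eps):
    """Outer-loop bound of Algorithm 8: 6 * k * eps^-2 * n^(1 - alpha) * ln n.

    Rounding up with ceil is an assumption made for this sketch.
    """
    return math.ceil(6 * k * eps ** -2 * n ** (1 - alpha) * math.log(n))


def approximation_iterations(n, k, alpha):
    """Outer-loop bound of Algorithm 9: 2 * k^2 * n^(1 - 2*alpha) * ln n.

    Rounding up with ceil is an assumption made for this sketch.
    """
    return math.ceil(2 * k ** 2 * n ** (1 - 2 * alpha) * math.log(n))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    n, k = 5000, 14
    print("Algorithm 8:", weighted_k_path_iterations(n, k, alpha=0.5, eps=0.25))
    print("Algorithm 9:", approximation_iterations(n, k, alpha=0.25))
```

In both expressions a larger $\alpha$ reduces the number of random walks, at the price of a coarser guarantee.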


input: Graph $G$, Array $W$, Real $\alpha \in (0,1/2)$, Integer $k \in [1,n]$
output: An approximation to the $k$-path centrality $C_k(v)$ is computed
begin
    foreach vertex $v \in V$ do $B_v \leftarrow 0$; $v \leftarrow$ unexplored;
    $n \leftarrow |V(G)|$;
    $\ell \leftarrow 2k^2n^{1-2\alpha}\ln n$;
    Stack $S \leftarrow \emptyset$;
    outer-loop $\leftarrow 1$;
    while outer-loop $\le \ell$ do
        Let $k' \in [1,k]$ be an integer chosen uniformly at random;
        Let $x \in V$ be a vertex chosen uniformly at random;
        $x \leftarrow$ explored; push $x$ to $S$;
        inner-loop $\leftarrow 1$;
        while inner-loop $\le k'$ and not all outgoing edges from $x$ are explored do
            Let $(x,y)$ be a random edge from $x$ to an unexplored vertex $y$, chosen with probability inversely proportional to its edge weight;
            $y \leftarrow$ explored; push $y$ to $S$;
            $B_y \leftarrow B_y + 1$;
            inner-loop $\leftarrow$ inner-loop $+ 1$;
            $x \leftarrow y$;
        while $S$ is nonempty do
            pop $x \leftarrow S$; $x \leftarrow$ unexplored;
        outer-loop $\leftarrow$ outer-loop $+ 1$;
    foreach vertex $v \in V$ do
        Output approximate $k$-path centrality of $v \leftarrow B_v \cdot \frac{kn}{\ell}$;
end
Algorithm 9: Randomized Approximation $k$-Path

Analysis: We give the analysis for the case of unweighted graphs; the analysis for weighted graphs is similar. Let $v$ be a vertex in the unweighted graph $G(V,E)$. Define, for $1 \le i \le \ell$,

\[ X'_i = \begin{cases} 1 & \text{if the $i$-th random walk passes through $v$,} \\ 0 & \text{otherwise.} \end{cases} \]


It is easy to see that when the algorithm terminates, $B_v = \sum_{i=1}^{\ell} X'_i$, and

\[ E[X'_i] = \Pr[X'_i = 1] = \frac{1}{kn}\sum_{s \in V}\sum_{1 \le \ell \le k}\Pr[\text{a random walk of length $\ell$ starts from $s$ and passes through $v$}] = \frac{1}{kn}\cdot 0 + \frac{1}{kn}\sum_{s \ne v}\sum_{1 \le \ell \le k}\frac{p^{v}_{s,\ell}}{p_{s,\ell}} = \frac{1}{kn}C_k(v). \]

Let us define $X_i = kn X'_i$, so that $E[X_i] = C_k(v)$. Notice that each $X_i$ belongs to the range $[0, kn]$. Also, note that the $X_i$'s are independent random variables.

Theorem 3.3 (Hoeffding's Bound) If $X_1, X_2, \ldots, X_h$ are independent random variables with $a_i \le X_i \le b_i$ and $\mu = E\left[\frac{\sum_{i=1}^{h} X_i}{h}\right]$ is the expected mean, then, for every $\xi > 0$, we have

\[ \Pr\left[\left|\frac{\sum_{i=1}^{h} X_i}{h} - \mu\right| \ge \xi\right] \le 2e^{-2h^2\xi^2/\sum_{i=1}^{h}(b_i - a_i)^2}. \]

The output of the algorithm is $\sum_{i=1}^{\ell} X_i/\ell$. Applying Hoeffding's bound, we get

\[ \Pr\left[\left|\frac{\sum_{i=1}^{\ell} X_i}{\ell} - C_k(v)\right| \ge \xi\right] \le 2e^{-2\ell^2\xi^2/(\ell k^2n^2)} = 2e^{-2\ell\xi^2/(k^2n^2)}. \]

Suppose that $\xi = n^{\frac{1}{2}+\alpha}$. Then the probability of error is at most $2e^{-2\ell n^{1+2\alpha}/(k^2n^2)} = 2e^{-2\ell/(k^2n^{1-2\alpha})}$. This probability of error can be made at most $1/n^3$ by making $2e^{-2\ell/(k^2n^{1-2\alpha})} \le e^{-\ln n^3}$, or $\ell \ge k^2n^{1-2\alpha}\left(\frac{3}{2}\ln n + \frac{1}{2}\ln 2\right)$. Therefore, for our algorithm we can set $\ell = 2k^2n^{1-2\alpha}\ln n$, where the additive error is $n^{\frac{1}{2}+\alpha}$.

In general, if the additive error is $\xi$, then we can use $\ell = 2k^2n^2\ln n/\xi^2$ iterations. Since each iteration walks at most $k$ steps, this results in a running time of $O(k^3n^2\ln n/\xi^2)$ for the randomized algorithm. Thus, for $\xi = n^{1/2+\alpha}$, the randomized algorithm takes $O(k^3n^{1-2\alpha}\ln n)$ time.
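The Randomized Approximation $k$-Path procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the graph is assumed to be a dict mapping each vertex to a dict of neighbours and positive edge weights, the function name and the optional `iterations` and `rng` parameters are hypothetical, and a per-walk set replaces the stack $S$ that the pseudocode uses to reset the explored marks.

```python
import math
import random


def approx_k_path_centrality(adj, k, alpha, rng=None, iterations=None):
    """Estimate k-path centrality by sampling self-avoiding random walks.

    adj: dict vertex -> dict neighbour -> positive edge weight (assumed format).
    iterations: optional override; defaults to 2 * k^2 * n^(1 - 2*alpha) * ln n.
    """
    rng = rng or random.Random()
    vertices = list(adj)
    n = len(vertices)
    if iterations is None:
        # Rounding up (and using at least one walk) is an assumption of this sketch.
        iterations = max(1, math.ceil(2 * k ** 2 * n ** (1 - 2 * alpha) * math.log(n)))

    count = {v: 0 for v in vertices}
    for _ in range(iterations):
        walk_len = rng.randint(1, k)       # k' chosen uniformly from [1, k]
        x = rng.choice(vertices)           # start vertex chosen uniformly
        explored = {x}                     # plays the role of the stack S
        for _ in range(walk_len):
            candidates = [(y, w) for y, w in adj[x].items() if y not in explored]
            if not candidates:
                break                      # no unexplored neighbour remains
            ys = [y for y, _ in candidates]
            # Probability inversely proportional to the edge weight.
            inv_weights = [1.0 / w for _, w in candidates]
            y = rng.choices(ys, weights=inv_weights, k=1)[0]
            explored.add(y)
            count[y] += 1
            x = y

    scale = k * n / iterations             # output is B_v * k * n / l
    return {v: count[v] * scale for v in vertices}


if __name__ == "__main__":
    # Hypothetical toy input: a star with hub "c" and five leaves.
    star = {"c": {f"l{i}": 1.0 for i in range(5)}}
    for i in range(5):
        star[f"l{i}"] = {"c": 1.0}
    print(approx_k_path_centrality(star, k=2, alpha=0.25, rng=random.Random(7)))
```

On the toy star graph the hub receives by far the largest estimate, since every walk that starts at a leaf must pass through it.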


CHAPTER 4
RESULTS

In this chapter we compare Brandes' betweenness centrality algorithm [Bra01] with our approximation algorithm for $k$-path centrality. We conducted an extensive set of experiments on both synthetic (i.e., randomly generated) and real networks. For all of our experiments, we use the values $\alpha = 0.01$ and $k = \ln(n+m)$, where $n$ is the number of vertices and $m$ is the number of edges in the graph. All the experiments in this thesis were done on the Research Computing cluster at the University of South Florida. The Research Computing cluster consists of 316 nodes with a total of 2064 cores, and the total memory is 4.812 terabytes. We refer the reader to the Research Computing web page [Res10] for further details about the research environment used in this thesis.

4.1 Randomly Generated Networks

Our first experiment compares the running time of Brandes' betweenness centrality algorithm with that of our approximation algorithm for $k$-path centrality. For this experiment, we created 4500 undirected, unweighted random graphs with varying numbers of vertices and varying density. The density $d$ of a graph is defined as the ratio of the number of edges in the graph to the total possible number of edges, i.e., $d = m/\binom{n}{2}$. We used a simple method to construct random graphs with a given number of vertices $n$ and density $d$. First we compute the corresponding number of edges $m$, given the number of vertices $n$ and density $d$, using the relation $m = d\binom{n}{2}$. Then, to generate a random graph $G(V,E)$ that has $n$ vertices and $m$ edges, we add $m$ random edges to the initially empty set of edges as follows: uniformly at random pick two vertices from $V$ and connect them by an edge.

Figure 4.1 shows a comparison of the running times (in seconds) of Brandes' betweenness centrality algorithm and our randomized approximation algorithm for $k$-path centrality on undirected, unweighted random graphs. The random graphs are generated using the randomized sampling procedure described above. In this plot, the number of vertices varies from 100 to 5000 in increments of 100, and the density varies from 10% to 90% in increments of 10%.

Figure 4.1 Plot showing running times in seconds of Brandes' algorithm and our randomized approximation algorithm

We can clearly see that the running time of Brandes' betweenness centrality algorithm is much higher than the running time of our approximation algorithm for $k$-path centrality when both algorithms are tested on the same random graph. This demonstrates that our algorithm yields a significant speedup in running time over Brandes' betweenness centrality algorithm. Furthermore, this holds true for any graph, directed or undirected and weighted or unweighted.

In our second set of experiments, we compare the values of exact betweenness centrality with the corresponding values of approximate $k$-path centrality in random graphs. It is interesting to see that in all our plots for random graphs with density at most 60%, there is a nice linear correlation between the exact betweenness values and the corresponding approximate $k$-path centrality values. We now present these results for random networks using Tables 4.1 through 4.6 and Figures 4.2 through 4.7. Each table shows information about a network, the computation times of the algorithms, and the correlation between betweenness and approximate $k$-path centrality values. Each figure, corresponding to a table as well as an experiment, shows the scatter plot of exact betweenness centrality values with respect to the approximate $k$-path centrality values of all vertices of the network. Since experimentation on networks with 5000 vertices was feasible within our computational requirements, we present below some results for random networks in which the number of vertices is 5000 and the density varies from 10% to 60% in increments of 10%.
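The random-graph construction described in Section 4.1 can be sketched in Python as follows. The function names are illustrative only. Two details are assumptions rather than statements from the text: duplicate vertex pairs are rejected so that the graph ends up with exactly $m$ distinct edges, and the path length $k = \ln(n+m)$ is rounded to the nearest integer.

```python
import math
import random


def edge_count(n, density):
    """Number of edges m = d * C(n, 2) for n vertices and density d in [0, 1]."""
    return round(density * math.comb(n, 2))


def path_length(n, m):
    """Path length k = ln(n + m); rounding to nearest is an assumption."""
    return round(math.log(n + m))


def random_graph(n, density, rng=None):
    """Return a set of m distinct undirected edges (u, v) with u < v."""
    rng = rng or random.Random()
    m = edge_count(n, density)
    edges = set()
    while len(edges) < m:
        # Pick two distinct vertices uniformly at random.
        u, v = rng.sample(range(n), 2)
        # Storing the pair in sorted order makes (u, v) and (v, u) identical,
        # so a repeated pick is silently rejected (assumption of this sketch).
        edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    n, d = 5000, 0.10
    m = edge_count(n, d)
    print(m, path_length(n, m))
```

For dense graphs the rejection loop wastes many picks; sampling the complement or shuffling all vertex pairs would be alternatives, but the loop stays closest to the procedure described in the text.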


Table 4.1 Summary information of a random network with 5000 vertices and 10% density

Number of vertices ($n$): 5,000
Number of edges ($m$): 1,249,750
Density $d = m/\binom{n}{2}$: 10%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 1,334 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 14
Running time of our $k$-path centrality algorithm: 488 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9868
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 100%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 80%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 80%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 85%

Figure 4.2 Scatter plot for a random network with 5000 vertices and 10% density
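The correlation coefficient and the top-$N$ overlap percentages reported in these summary tables can be computed along the following lines. This is a sketch under assumptions: the thesis does not say which correlation coefficient is used, so the Pearson coefficient is assumed here, the function names are illustrative, and ties in the ranking are broken by dictionary order.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_overlap_percent(betweenness, k_path, top):
    """Percentage of vertices common to the top-`top` lists of both measures.

    Both arguments are dicts mapping vertex -> centrality value.
    """
    top_b = set(sorted(betweenness, key=betweenness.get, reverse=True)[:top])
    top_k = set(sorted(k_path, key=k_path.get, reverse=True)[:top])
    return 100.0 * len(top_b & top_k) / top


if __name__ == "__main__":
    # Hypothetical toy values, not data from the thesis.
    betweenness = {"a": 10.0, "b": 7.0, "c": 3.0, "d": 1.0}
    k_path = {"a": 5.0, "b": 1.5, "c": 2.0, "d": 0.5}
    vertices = list(betweenness)
    print(pearson_correlation([betweenness[v] for v in vertices],
                              [k_path[v] for v in vertices]))
    print(top_overlap_percent(betweenness, k_path, top=2))
```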


Table 4.2 Summary information of a random network with 5000 vertices and 20% density

Number of vertices ($n$): 5,000
Number of edges ($m$): 2,499,500
Density $d = m/\binom{n}{2}$: 20%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 3,218 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 15
Running time of our $k$-path centrality algorithm: 1,285 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9786
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 70%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 74%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 79%

Figure 4.3 Scatter plot for a random network with 5000 vertices and 20% density


Table 4.3 Summary information of a random network with 5000 vertices and 30% density

Number of vertices ($n$): 5,000
Number of edges ($m$): 3,749,250
Density $d = m/\binom{n}{2}$: 30%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 5,254 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 15
Running time of our $k$-path centrality algorithm: 1,758 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9683
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 60%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 70%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 70%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 75%

Figure 4.4 Scatter plot for a random network with 5000 vertices and 30% density


Table 4.4 Summary information of a random network with 5000 vertices and 40% density

Number of vertices ($n$): 5,000
Number of edges ($m$): 4,999,000
Density $d = m/\binom{n}{2}$: 40%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 7,535 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 15
Running time of our $k$-path centrality algorithm: 2,375 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9509
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 50%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 50%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 62%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 71%

Figure 4.5 Scatter plot for a random network with 5000 vertices and 40% density


Table 4.5 Summary information of a random network with 5000 vertices and 50% density

Number of vertices ($n$): 5,000
Number of edges ($m$): 6,248,750
Density $d = m/\binom{n}{2}$: 50%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 9,749 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 16
Running time of our $k$-path centrality algorithm: 4,029 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9451
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 50%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 65%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 60%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 68%

Figure 4.6 Scatter plot for a random network with 5000 vertices and 50% density


Table 4.6 Summary information of a random network with 5000 vertices and 60% density

Number of vertices ($n$): 5,000
Number of edges ($m$): 7,498,500
Density $d = m/\binom{n}{2}$: 60%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 11,451 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 16
Running time of our $k$-path centrality algorithm: 4,624 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9290
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 60%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 45%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 70%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 69%

Figure 4.7 Scatter plot for a random network with 5000 vertices and 60% density


4.2 Real Networks

Next we experiment with several real network data sets obtained from various resources [DH97, BM06, BJMO07, New08]. We gratefully acknowledge the effort of the individuals who obtained these data sets from different sources and made them available to the public. The information about the various data sets is stated below.

• Zachary's Karate Club data set [Zac77, New08]: According to Wayne Zachary in 1977, it contains a network of friendships between members of a karate club at a US university. Vertices represent members of the karate club and edges represent relationships between members. This is an undirected, unweighted graph with 34 vertices and 78 edges that has been cited in some literature [New04, LN05, DFLJ07, GL08, MG09, MMO09, CHL10]. The summary information of this data set is shown in Table 4.7. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.8.

• The "Les Miserables" data set [Knu93, New08]: It contains data of a weighted, undirected network of co-appearances of characters in Victor Hugo's novel "Les Miserables". Vertices represent characters and edges connect any pair of characters that appear in the same chapter of the book. The number of such co-appearances gives the edge weights. There are 77 vertices and 254 edges in this network. This data set has been cited in some literature related to community structures [HJ08, MMO09, CHL10]. The summary information of this data set is shown in Table 4.8. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.9.

• The Word Adjacencies data set [New06, New08]: It contains data of a network of 112 vertices that represent commonly occurring adjectives and nouns in the novel "David Copperfield" by Charles Dickens, as described by Mark Newman. The edges connect any pair of words that appear adjacent to each other at any point in the text. The network is unweighted and undirected and consists of 425 edges. This data set has been cited in some papers by Newman and others [New06, NL07, BRTC08]. The summary information of this data set is shown in Table 4.9. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.10.

• Condensed Matter data sets [New01b, New08]: There are three Condensed Matter Collaboration data sets used in this thesis. They consist of weighted, undirected networks of coauthorships between scientists posting preprints on the Condensed Matter E-Print Archive starting from January 1, 1995. The weight of an edge between two scientists represents the number of coauthorships between them. The end dates of collection of these three data sets are December 31, 1999, June 30, 2003 and March 31, 2005. These networks have been cited in some literature [New01a, New01b, New01c, BRTC08, JGH10]. The summary information for these three data sets is shown in Tables 4.10, 4.11, and 4.12. The scatter plots showing approximate $k$-path centrality values in relation to the corresponding betweenness values for these data sets are shown in Figures 4.11, 4.12, and 4.13.

• High-Energy Theory data set [New01b, New08]: According to Mark Newman, this data set consists of a weighted, undirected network of coauthorships between 8361 scientists posting preprints on the High-Energy Theory E-Print Archive between January 1, 1995 and December 31, 1999. This network consists of 8361 vertices and 15751 edges. The weight of an edge between two scientists represents the number of coauthorships between them. This network has been cited in some literature [New01a, New01b, BRTC08]. The summary information of this data set is shown in Table 4.13. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.14.

• Internet data set [New08]: This data set represents an unweighted, undirected network with 22963 vertices and 48436 edges. Using BGP tables posted by the University of Oregon Route Views Project, Mark Newman created this data set on July 22, 2006. It consists of a symmetrized snapshot of the structure of the Internet at the level of autonomous systems. This network has been cited in some literature [BRTC08, GCZ09]. The summary information of this data set is shown in Table 4.14. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.15.

• Yeast data set [BCC+03, BM06]: This data set represents yeast proteins as vertices and protein interactions as edges. There are 1870 proteins and 8960 interactions between pairs of proteins. This is an unweighted, undirected network, which has been cited in some papers [BCC+03, BRTC08]. The summary information of this data set is shown in Table 4.15. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.16.

• Kazaa data sets [IRF04]: Kazaa is a popular peer-to-peer file-sharing system. As of June 2003, there were more than 4 million estimated concurrent users. Here, information about the files requested for download is used to build a graph using the users as vertices of the graph. Two users are connected in this graph if they have the same download during some interval. Three data sets are formed in three different time intervals and we refer the reader to the paper [IRF04] for further details. The summary information of these data sets is shown in Tables 4.16, 4.17, and 4.18. The scatter plots showing approximate $k$-path centrality values in relation to the corresponding betweenness values for these data sets are shown in Figures 4.17, 4.18, and 4.19.

• Computational Geometry data set [BM06]: This data set consists of author collaboration in computational geometry, where vertices represent authors. If two authors wrote a common work (book, paper, etc.), then we have an edge between those two vertices and the edge weight is the number of such common works. This undirected, weighted graph has 6158 vertices and 11898 edges. We refer the reader to the Pajek web page [BM06] for further details. The summary information of this data set is shown in Table 4.19. The scatter plot showing approximate $k$-path centrality values in relation to the corresponding betweenness values for this data set is shown in Figure 4.20.

• Pajek data sets [BM06]: The last five data sets consist of citation networks where vertices represent people and edges represent relationships between two people. The number of vertices varies from 233 to 6651 and the number of edges varies from 994 to 54253. All five graphs are undirected and unweighted, and we refer the reader to the Pajek web page [BM06] for further details. These data sets have been cited in some literature [Bat03, PPZ10]. The summary information of these data sets is shown in Tables 4.20 through 4.24. The scatter plots showing approximate $k$-path centrality values in relation to the corresponding betweenness values for these data sets are shown in Figures 4.21 through 4.25.


Table 4.7 Summary information of Zachary's Karate Club data set

Number of vertices ($n$): 34
Number of edges ($m$): 78
Density $d = m/\binom{n}{2}$: 13.904%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: < 1 second
Alpha: 0.01
Path length $\ln(m+n)$: 5
Running time of our $k$-path centrality algorithm: < 1 second
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9129
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 70%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 85%

Figure 4.8 Scatter plot for Zachary's Karate Club data set


Table 4.8 Summary information of Les Miserables data set

Number of vertices ($n$): 77
Number of edges ($m$): 254
Density $d = m/\binom{n}{2}$: 8.681%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: < 1 second
Alpha: 0.01
Path length $\ln(m+n)$: 6
Running time of our $k$-path centrality algorithm: < 1 second
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9061
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 80%

Figure 4.9 Scatter plot for Les Miserables data set


Table 4.9 Summary information of Word Adjacencies data set

Number of vertices ($n$): 112
Number of edges ($m$): 425
Density $d = m/\binom{n}{2}$: 6.837%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: < 1 second
Alpha: 0.01
Path length $\ln(m+n)$: 6
Running time of our $k$-path centrality algorithm: < 1 second
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9268
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 90%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 70%

Figure 4.10 Scatter plot for Word Adjacencies data set


Table 4.10 Summary information of Condensed Matter 1999 data set

Number of vertices ($n$): 16,726
Number of edges ($m$): 47,594
Density $d = m/\binom{n}{2}$: 0.034%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 640 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 11
Running time of our $k$-path centrality algorithm: 52 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.6502
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 60%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 60%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 45%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 56%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 56%

Figure 4.11 Scatter plot for Condensed Matter 1999 data set


Table 4.11 Summary information of Condensed Matter 2003 data set

Number of vertices ($n$): 31,163
Number of edges ($m$): 120,029
Density $d = m/\binom{n}{2}$: 0.025%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 2,884 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 12
Running time of our $k$-path centrality algorithm: 183 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.6920
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 60%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 60%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 55%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 54%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 55%

Figure 4.12 Scatter plot for Condensed Matter 2003 data set


Table 4.12 Summary information of Condensed Matter 2005 data set

Number of vertices ($n$): 40,421
Number of edges ($m$): 175,693
Density $d = m/\binom{n}{2}$: 0.022%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 5,405 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 12
Running time of our $k$-path centrality algorithm: 311 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.6982
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 60%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 70%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 45%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 54%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 55%

Figure 4.13 Scatter plot for Condensed Matter 2005 data set


Table 4.13 Summary information of High-Energy Theory data set

Number of vertices ($n$): 8,361
Number of edges ($m$): 15,751
Density $d = m/\binom{n}{2}$: 0.045%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 109 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 10
Running time of our $k$-path centrality algorithm: 14 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.7032
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 40%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 60%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 50%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 42%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 54%

Figure 4.14 Scatter plot for High-Energy Theory data set


Table 4.14 Summary information of Internet data set

Number of vertices ($n$): 22,963
Number of edges ($m$): 48,436
Density $d = m/\binom{n}{2}$: 0.018%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 917 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 11
Running time of our $k$-path centrality algorithm: 462 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.9563
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 85%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 84%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 82%

Figure 4.15 Scatter plot for Internet data set


Table 4.15 Summary information of Yeast data set

Number of vertices ($n$): 1,870
Number of edges ($m$): 8,960
Density $d = m/\binom{n}{2}$: 0.513%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 10 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 9
Running time of our $k$-path centrality algorithm: 2 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.8600
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 60%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 56%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 69%

Figure 4.16 Scatter plot for Yeast data set


Table 4.16 Summary information of Kazaa data set 1

Number of vertices ($n$): 1,400
Number of edges ($m$): 6,316
Density $d = m/\binom{n}{2}$: 0.645%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 6 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 9
Running time of our $k$-path centrality algorithm: 2 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.8515
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 70%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 80%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 80%

Figure 4.17 Scatter plot for Kazaa data set 1


Table 4.17 Summary information of Kazaa data set 2

Number of vertices ($n$): 1,550
Number of edges ($m$): 8,028
Density $d = m/\binom{n}{2}$: 0.669%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 13 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 9
Running time of our $k$-path centrality algorithm: 5 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.8735
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 60%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 80%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 78%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 83%

Figure 4.18 Scatter plot for Kazaa data set 2


Table 4.18 Summary information of Kazaa data set 3

Number of vertices ($n$): 2,424
Number of edges ($m$): 13,354
Density $d = m/\binom{n}{2}$: 0.455%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 16 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 10
Running time of our $k$-path centrality algorithm: 8 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.8063
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 80%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 90%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 85%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 78%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 72%

Figure 4.19 Scatter plot for Kazaa data set 3


Table 4.19 Summary information of Computational Geometry data set

Number of vertices ($n$): 6,158
Number of edges ($m$): 11,898
Density $d = m/\binom{n}{2}$: 0.063%
Directed or undirected: undirected
Weighted or unweighted: weighted
Running time of Brandes' betweenness centrality algorithm: 42 seconds
Alpha: 0.01
Path length $\ln(m+n)$: 10
Running time of our $k$-path centrality algorithm: 9 seconds
Correlation coefficient between exact betweenness values and approximate $k$-path centrality values over all vertices: 0.8254
Percentage of vertices common to both the top 5 betweenness values and the top 5 $k$-path centrality values: 60%
Percentage of vertices common to both the top 10 betweenness values and the top 10 $k$-path centrality values: 60%
Percentage of vertices common to both the top 20 betweenness values and the top 20 $k$-path centrality values: 65%
Percentage of vertices common to both the top 50 betweenness values and the top 50 $k$-path centrality values: 72%
Percentage of vertices common to both the top 100 betweenness values and the top 100 $k$-path centrality values: 75%

Figure 4.20 Scatter plot for Computational Geometry data set

Table 4.20 Summary information of Small World dataset

Number of vertices n: 233
Number of edges m: 994
Density $d = m / \binom{n}{2}$: 3.662%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: < 1 second
Alpha: 0.01
Path length ln(m + n): 7
Running time of our k-path centrality algorithm: < 1 second
Correlation coefficient between exact betweenness values and approximate k-path centrality values over all vertices: 0.8938
Percentage of vertices common to both the top 5 betweenness values and the top 5 k-path centrality values: 100%
Percentage of vertices common to both the top 10 betweenness values and the top 10 k-path centrality values: 100%
Percentage of vertices common to both the top 20 betweenness values and the top 20 k-path centrality values: 80%

Figure 4.21 Scatter plot for Small World dataset

Table 4.21 Summary information of Small, Griffith and Descendants dataset

Number of vertices n: 1,024
Number of edges m: 4,922
Density $d = m / \binom{n}{2}$: 0.455%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 3 seconds
Alpha: 0.01
Path length ln(m + n): 9
Running time of our k-path centrality algorithm: 2 seconds
Correlation coefficient between exact betweenness values and approximate k-path centrality values over all vertices: 0.9232
Percentage of vertices common to both the top 5 betweenness values and the top 5 k-path centrality values: 100%
Percentage of vertices common to both the top 10 betweenness values and the top 10 k-path centrality values: 80%
Percentage of vertices common to both the top 20 betweenness values and the top 20 k-path centrality values: 90%
Percentage of vertices common to both the top 50 betweenness values and the top 50 k-path centrality values: 84%
Percentage of vertices common to both the top 100 betweenness values and the top 100 k-path centrality values: 79%

Figure 4.22 Scatter plot for Small, Griffith and Descendants dataset

Table 4.22 Summary information of Scientometrics dataset

Number of vertices n: 2,729
Number of edges m: 10,416
Density $d = m / \binom{n}{2}$: 0.280%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 22 seconds
Alpha: 0.01
Path length ln(m + n): 9
Running time of our k-path centrality algorithm: 5 seconds
Correlation coefficient between exact betweenness values and approximate k-path centrality values over all vertices: 0.8258
Percentage of vertices common to both the top 5 betweenness values and the top 5 k-path centrality values: 100%
Percentage of vertices common to both the top 10 betweenness values and the top 10 k-path centrality values: 70%
Percentage of vertices common to both the top 20 betweenness values and the top 20 k-path centrality values: 90%
Percentage of vertices common to both the top 50 betweenness values and the top 50 k-path centrality values: 78%
Percentage of vertices common to both the top 100 betweenness values and the top 100 k-path centrality values: 79%

Figure 4.23 Scatter plot for Scientometrics dataset

Table 4.23 Summary information of Self-Organizing Maps dataset

Number of vertices n: 3,772
Number of edges m: 112,731
Density $d = m / \binom{n}{2}$: 1.585%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 38 seconds
Alpha: 0.01
Path length ln(m + n): 10
Running time of our k-path centrality algorithm: 22 seconds
Correlation coefficient between exact betweenness values and approximate k-path centrality values over all vertices: 0.9450
Percentage of vertices common to both the top 5 betweenness values and the top 5 k-path centrality values: 100%
Percentage of vertices common to both the top 10 betweenness values and the top 10 k-path centrality values: 90%
Percentage of vertices common to both the top 20 betweenness values and the top 20 k-path centrality values: 85%
Percentage of vertices common to both the top 50 betweenness values and the top 50 k-path centrality values: 80%
Percentage of vertices common to both the top 100 betweenness values and the top 100 k-path centrality values: 77%

Figure 4.24 Scatter plot for Self-Organizing Maps dataset

Table 4.24 Summary information of Zewail dataset

Number of vertices n: 6,651
Number of edges m: 54,253
Density $d = m / \binom{n}{2}$: 0.245%
Directed or undirected: undirected
Weighted or unweighted: unweighted
Running time of Brandes' betweenness centrality algorithm: 155 seconds
Alpha: 0.01
Path length ln(m + n): 11
Running time of our k-path centrality algorithm: 30 seconds
Correlation coefficient between exact betweenness values and approximate k-path centrality values over all vertices: 0.6676
Percentage of vertices common to both the top 5 betweenness values and the top 5 k-path centrality values: 40%
Percentage of vertices common to both the top 10 betweenness values and the top 10 k-path centrality values: 70%
Percentage of vertices common to both the top 20 betweenness values and the top 20 k-path centrality values: 70%
Percentage of vertices common to both the top 50 betweenness values and the top 50 k-path centrality values: 78%
Percentage of vertices common to both the top 100 betweenness values and the top 100 k-path centrality values: 79%

Figure 4.25 Scatter plot for Zewail dataset
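The correlation coefficient rows in these tables compare two score vectors over all vertices. The sketch below shows one way such a number could be computed, assuming the Pearson (linear) correlation coefficient is meant; the tables do not name the coefficient, so that choice is an assumption. The function name `pearson` and the toy vectors are hypothetical.

```python
import math
from typing import Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    # Undefined (division by zero) if either sample is constant.
    return cov / math.sqrt(var_x * var_y)

# Toy vectors (hypothetical): `approx` is an exact linear function of `exact`,
# so the coefficient is 1 up to floating-point error.
exact = [0.0, 2.0, 4.0, 6.0]
approx = [1.0, 5.0, 9.0, 13.0]
print(pearson(exact, approx))
```

A value near 1 indicates that the points of the corresponding scatter plot lie close to a straight line with positive slope.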

CHAPTER 5
CONCLUSION

In this thesis, we introduced a new centrality measure called k-path centrality for unweighted graphs. Let n denote the number of vertices and m denote the number of edges in an input graph. We gave a polynomial-time randomized algorithm that, on any input graph, allows us to distinguish between vertices that have high k-path centrality values and vertices that have low k-path centrality values. This algorithm has error probability at most $1/n^2$ and runs in time $O(k^2 \varepsilon^{-2} n^{1-\alpha} \ln n)$, where $\alpha \in (0, 1)$, $\varepsilon \in (0, 1/2)$, and integer $k \in [1, n]$ are adjustable parameters. Then we extended our definition of k-path centrality to weighted graphs and gave a similar polynomial-time randomized algorithm that distinguishes between high k-path centrality vertices and low k-path centrality vertices in weighted graphs. We showed the analysis of the randomized algorithm for unweighted graphs, and that the analysis of the randomized algorithm for weighted graphs is similar.

Next, we presented a polynomial-time randomized approximation algorithm for computing the k-path centrality values of all vertices in any input graph. This algorithm has an additive error of at most $n^{1/2+\alpha}$ and runs in time $O(k^3 n^{1-2\alpha} \ln n)$, where $\alpha \in (0, 1/2)$ and integer $k \in [1, n]$ are adjustable parameters. Through extensive experiments on both real and randomly generated networks, we illustrated that vertices with high betweenness centrality values also have high k-path centrality values. Scatter plots of exact betweenness centrality values against approximate k-path centrality values indicate a linear correlation and a high correlation coefficient. In all our experiments, we set k to $\ln(n + m)$ and $\alpha$ to 0.01. With these choices of parameters, our randomized approximation algorithm for k-path centrality is asymptotically faster than Brandes' betweenness centrality algorithm. We compared the running times of these algorithms on both real and randomly generated networks. By experimentation, we demonstrated that our randomized approximation algorithm for k-path centrality is much faster than Brandes' betweenness centrality algorithm (for example, 30 seconds versus 155 seconds on the Zewail dataset, Table 4.24).

In this thesis, we did not present any formal justification for why we chose $\alpha$ as 0.01 and k as $\ln(n + m)$. Our choice of these parameters was motivated by the quality of our experimental results. In future work, we would like to find the optimal values for these parameters through rigorous analysis. We mention in conclusion that our algorithm is extremely efficient in terms of time and memory usage in finding vertices that have high k-path centrality values. Moreover, since approximate k-path centrality values computed by our algorithm show a linear correlation with exact betweenness centrality values in our experimental results, the notion of k-path centrality and the algorithms proposed in this thesis may have practical value in the analysis of networks.
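To make the asymptotic comparison concrete, the bound $k^3 n^{1-2\alpha} \ln n$ can be evaluated with k = ln(n + m) and alpha = 0.01 and set beside the n * m term that bounds Brandes' algorithm on unweighted graphs. This is only a rough sketch: constant factors hidden by the O-notation are ignored, so the ratio is not a predicted speedup, and the helper names `kpath_bound` and `brandes_bound` are hypothetical.

```python
import math

def kpath_bound(n: int, m: int, alpha: float = 0.01) -> float:
    """k^3 * n^(1 - 2*alpha) * ln(n) with k = ln(n + m); constants ignored."""
    k = math.log(n + m)
    return k ** 3 * n ** (1 - 2 * alpha) * math.log(n)

def brandes_bound(n: int, m: int) -> float:
    """n * m, the unweighted-graph bound for Brandes' algorithm; constants ignored."""
    return float(n * m)

# Zewail dataset (Table 4.24): n = 6,651 vertices, m = 54,253 edges.
n, m = 6651, 54253
print(f"k-path bound  ~ {kpath_bound(n, m):.3e}")
print(f"Brandes bound ~ {brandes_bound(n, m):.3e}")
print(f"ratio         ~ {brandes_bound(n, m) / kpath_bound(n, m):.1f}")
```

Because k grows only logarithmically in n + m, the k-path bound is close to linear in n up to logarithmic factors, while the Brandes bound also grows with the number of edges m.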

ABOUT THE AUTHOR

Tharaka Alahakoon received his B.S. in Computer Science from the University of South Florida (USF). He is currently pursuing two Master's degrees from USF, one in Computer Science and the other in Mathematics. His research interests are Randomized Algorithms, Combinatorics and Graph Theory. He is conducting his research work under Dr. Rahul Tripathi in the Department of Computer Science and Engineering at USF. In addition to his research work, he is currently working as a graduate teaching assistant in the Department of Mathematics and Statistics at USF.


MARC Record

Identifier:
E14-SFE0004573
Author:
Alahakoon, Tharaka.
Title:
Path centrality : a new centrality measure in networks [electronic resource] / by Tharaka Alahakoon.
Publication:
[Tampa, Fla] : University of South Florida, 2010.
Notes:
Title from PDF of title page.
Document formatted into pages; contains X pages.
Thesis (MSCS)--University of South Florida, 2010.
Includes bibliographical references.
Text (Electronic thesis) in PDF format.
Mode of access: World Wide Web.
System requirements: World Wide Web browser and PDF reader.
Abstract:
ABSTRACT: In network analysis, it is useful to identify important vertices in a network. Based on the varying notions of importance of vertices, a number of centrality measures are defined and studied in the literature. Some popular centrality measures, such as betweenness centrality, are computationally prohibitive for large-scale networks. In this thesis, we propose a new centrality measure called k-path centrality and experimentally compare this measure with betweenness centrality. We present a polynomial-time randomized algorithm for distinguishing high k-path centrality vertices from low k-path centrality vertices in any given (unweighted or weighted) graph. Specifically, for any graph G = (V,E) with n vertices and for every choice of parameters alpha between (0,1), epsilon between (0,1/2), and integer k between [1,n], with probability at least 1-1/n^2 our randomized algorithm distinguishes all vertices v in V that have k-path centrality Ck(v) more than n^(alpha)*(1+2*epsilon) from all vertices v in V that have k-path centrality Ck(v) less than n^(alpha)*(1-2*epsilon). The running time of the algorithm is O(k^(2)*epsilon^(-2)*n^(1-alpha)*ln(n)). Next, we present a polynomial-time randomized approximation algorithm for computing the k-path centrality values of all vertices in any given (unweighted or weighted) graph. Specifically, for any graph and for every choice of parameters alpha between (0,1/2) and integer k between [1,n], with probability at least 1-1/n^2 our randomized approximation algorithm computes the k-path centrality value of every vertex within an additive error of at most n^(1/2+alpha). The running time of the algorithm is O(k^(3)*n^(1-2*alpha)*ln(n)). Theoretically and experimentally, our algorithms are (for suitable choices of parameters) significantly faster than the best known deterministic algorithm for computing exact betweenness centrality values (Brandes' algorithm). 
Through experimentations on both real and randomly generated networks, we demonstrate that vertices that have high betweenness centrality values also have high k-path centrality values.
Advisor:
Rahul Tripathi, Ph.D.
Subjects / Keywords:
Betweenness centrality
Social networks
Randomized algorithms
Experimental algorithmics
Graphs
Dissertations, Academic -- Computer Science & Engineering -- Masters -- USF
Host item:
USF Electronic Theses and Dissertations.
URL:
http://digital.lib.usf.edu/?e14.4573