Material Information

Title:
Geos : a service for the management of geo-social information in a distributed system
Physical Description:
Book
Language:
English
Creator:
Anderson, Paul
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla
Publication Date:
2010

Subjects

Subjects / Keywords:
Data Management
Peer-to-Peer Systems
Social Graph
Socially-Aware Applications
Privacy Protection
Dissertations, Academic -- Computer Science & Engineering -- Masters -- USF   ( lcsh )
Genre:
non-fiction   ( marcgt )

Notes

Abstract:
Applications and services that take advantage of social data usually infer social relationships using information produced only within their own context, using a greatly simplified representation of users' social data. We propose to combine social information from multiple sources into a directed and weighted social multigraph in order to enable novel socially-aware applications and services. We present GeoS, a geo-social data management service which implements a representative set of social inferences and can run on a decentralized system. We demonstrate GeoS' potential for social applications on a collection of social data that combines collocation information and Facebook friendship declarations from 100 students. We demonstrate its performance by testing it both on PlanetLab and a LAN with a realistic workload for a 1000 node graph.
Thesis:
Thesis (MSCS)--University of South Florida, 2010.
Bibliography:
Includes bibliographical references.
System Details:
Mode of access: World Wide Web.
System Details:
System requirements: World Wide Web browser and PDF reader.
Statement of Responsibility:
by Paul Anderson.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains X pages.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
usfldc doi - E14-SFE0004552
usfldc handle - e14.4552
System ID:
SFS0027867:00001


Full Text

GeoS: A Service for the Management of Geo-Social Information in a Distributed System

by

Paul Anderson

A thesis submitted in partial fulfillment of the requirements for the degree of
Master of Science in Computer Science
Department of Computer Science and Engineering
College of Engineering
University of South Florida

Major Professor: Adriana Iamnitchi, Ph.D.
Cristian Borcea, Ph.D.
Jay Ligatti, Ph.D.

Date of Approval: May 18, 2010

Keywords: Data Management, Peer-to-Peer Systems, Social Graph, Socially-Aware Applications, Privacy Protection

Copyright © 2010, Paul Anderson

Dedication

To my parents. Thank you for teaching me the importance of an education.

Acknowledgments

This thesis would not have been possible without the assistance of many people. My colleagues, Nicolas Kourtellis and Joshua Finnis, designed the Prometheus architecture, helped in experimentation, and were always there for me to discuss ideas with. Jeremy Blackburn brought up many great ideas and questions throughout the design of GeoS. Daniel Boston and Susan Juan Pan from NJIT provided me with social data that was very useful for experimentation. Mayur Palankar's presentation on social networking is where the first spark of the idea for GeoS started. Sachae Soso created the first applications that made use of GeoS. I would like to thank my major professor, Dr. Adriana Iamnitchi, most of all, for all the support she gave, knowledge she shared, and for allowing me to do my research with her.

This research was supported by the National Science Foundation under Grant No. CNS-0831785. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsors.

Table of Contents

List of Tables
List of Figures
Abstract
Chapter 1: Introduction
  1.1 Motivation
  1.2 Solution, Thesis Contribution, and Roadmap
Chapter 2: Target Applications and Motivating Scenarios
Chapter 3: GeoS
  3.1 Design Objectives and Performance Requirements
    3.1.1 Social Sensors
    3.1.2 Decentralization of Social Data
    3.1.3 Management of Social Data
    3.1.4 Social Inference Requests
  3.2 Architecture
    3.2.1 Overall Architecture of Prometheus and GeoS as a Component
    3.2.2 Communication Between GeoS and Other Components
    3.2.3 GeoS Design
      3.2.3.1 Data Structure
      3.2.3.2 Updates
      3.2.3.3 Adaptive Threshold Function
      3.2.3.4 Example Social Input
      3.2.3.5 Aging
      3.2.3.6 Message Formats
  3.3 Social Inference Algorithms
    3.3.1 Friend Test
    3.3.2 Top Friends
    3.3.3 Neighborhood
    3.3.4 Social Strength
Chapter 4: Implementation
Chapter 5: Experimental Setup
  5.1 Social Graph Generation
    5.1.1 Synthetic Graphs
    5.1.2 NJIT Social Graph
  5.2 Generation of Workloads
    5.2.1 Input Based on Facebook
    5.2.2 Requests Based on Twitter
    5.2.3 Requests Based on BitTorrent
  5.3 Testbed
Chapter 6: Experimental Evaluation
  6.1 Performance Metrics
  6.2 Experimental Results
    6.2.1 GeoS Capabilities
    6.2.2 GeoS Performance
    6.2.3 Prometheus Performance
Chapter 7: Related Work
  7.1 Social Networks
  7.2 Data Collection
  7.3 Socially-Aware Applications
Chapter 8: Future Work
  8.1 Additional Social Inference Functions
  8.2 Expanded Social Data
  8.3 Open Questions
Chapter 9: Conclusion
References

List of Tables

Table 1. Stored Social Data.
Table 2. Format of Messages in GeoS.
Table 3. Properties of Graph Generated Using Nearest Neighbor Algorithm.
Table 4. Properties of NJIT Graph.
Table 5. Probability Distribution Function for Input Using Facebook Data.
Table 6. Probability Distribution Function for the Neighborhood Request Using Twitter Data.
Table 7. Probability Distribution Function for the Social Strength Request Using BitTorrent Data.
Table 8. Mean Number of Contacts Returned by a Neighborhood Request Submitted by Ego for Various Combinations of Edges and Weights With Mean Social Strength Between Ego and the Contacts Within Two Hops in Parentheses.

List of Figures

Figure 1. An Overview of the Prometheus Architecture.
Figure 2. GeoS and the Gateway Communicating.
Figure 3. Directed Graph Versus Undirected Graph for an Inference Request of at Least Path Length Two.
Figure 4. Inputs Being Received and Then Distributed Through an Update.
Figure 5. Applying Remote Input Information to the Adaptive Threshold Function.
Figure 6. Algorithm for Neighborhood(ego, , x, radius).
Figure 7. A Social Strength Calculation Example.
Figure 8. Algorithm for Social Strength (Continued).
Figure 9. Traversal of NJIT Graph.
Figure 10. Performance Results of Neighborhood Requests that Were Computed Locally.
Figure 11. Performance Results of Neighborhood Requests that Were Computed Using Direct Remote Peers.
Figure 12. Performance Results of Neighborhood Requests that Were Computed Using Indirect Remote Peers.
Figure 13. Neighborhood Results Zoomed-In on Seven Seconds and Below.
Figure 14. Performance Results of Social Strength Requests that Were Computed Using Direct Remote Peers or Completely Locally.
Figure 15. Social Strength Results Zoomed-In on One Second and Below.
Figure 16. Fulfilled Neighborhood Requests Versus Time to Deliver Result Over All Social Hops.
Figure 17. Fulfilled Neighborhood Requests Versus Time to Deliver Result Grouped by Social Hops.
Figure 18. Fulfilled Social Strength Requests Versus Average Time to Deliver Result.
Figure 19. Proposed Algorithm for New Social Strength(ego, alter).

Abstract

Applications and services that take advantage of social data usually infer social relationships using information produced only within their own context, using a greatly simplified representation of users' social data. We propose to combine social information from multiple sources into a directed and weighted social multigraph in order to enable novel socially-aware applications and services. We present GeoS, a geo-social data management service which implements a representative set of social inferences and can run on a decentralized system. We demonstrate GeoS' potential for social applications on a collection of social data that combines collocation information and Facebook friendship declarations from 100 students. We demonstrate its performance by testing it both on PlanetLab and a LAN with a realistic workload for a 1000 node graph.

Chapter 1: Introduction

The popularity of user-generated content tools exposes an unprecedented amount of information about the interaction between Internet users. In particular, two recent classes of Internet applications are revealing much social information: popular online social networks such as Facebook or LinkedIn; and widely adopted collaborative tools such as CiteULike or Delicious that provide a rich information fabric through tags, annotations, and organization of text. In addition, the ubiquitous GPS and Bluetooth-capable mobile devices can generate yet another layer of social data through location and collocation information.

This unprecedented amount of social information can be leveraged in support of social computing applications that could really harness the power of social ties between users. However, the state of the art in recording and maintaining social information places three significant obstacles in the way of developing innovative social applications. First, many sources of social information rely on declared relationships that all have the same weight, which obscures relevant information in terms of trust, shared interest, social solidarity, etc. Some social networks have tried to correct this particular problem. For example, Facebook has attempted to use the volume of interaction between users to rank their relationships, but this rank is not available to third-party applications [Won10].

Second, the manual declaration of relationships can leave out relevant social ties or pressure users into declaring unsubstantiated relationships. The implicit social incentives in such systems (the more friends a user has, the better) motivate users to declare a large number of friends. For example, one study of Facebook [WBS+09] found that more than a third of Facebook users have 100 or more friendships declared. The same study also found that for most users, 70% of their interactions on Facebook happen with only 20% of their friends. These two facts show that users declare many friends, but then only interact with a small subset of them. Another study found that a user who has many friends on an online social network is perceived as more confident and popular than if that same user had fewer friends [KRZBS07].
Third, the business model of current online social networks leads to centralized storage and management of user social data, and gives users little control over their social data. For example, all existing systems are centralized: online social networks profit from using and/or selling details on their users' social networks for marketing purposes [NM09]. This business model can also restrict application developers by limiting what social data they are allowed to access and how they can use this data [Facb].

This thesis proposes GeoS, a geo-social data management service that addresses these issues and allows for the rapid development of socially-aware applications that make use of the vast amounts of social data available. GeoS automatically infers social relationships by receiving details of social interactions from third-party geo-social applications that we refer to as social sensors. The social relationships reported by GeoS can be much more complex than the simple boolean friendship declarations used in many online social networks, as they record common activities and estimate how strongly those activities relate two people. Examples of activities include collocation at a concert, discussing a football game on Facebook, and sending an SMS to a friend to ask about the upcoming hiking trip. GeoS also provides a set of social inference functions that third-party applications can use to find, for example, indirect social connections, the users interacted with most, the strength of a social relationship, and socially connected users who are physically nearby. Finally, GeoS is built as a part of the Prometheus system that has a fully decentralized architecture on a peer-to-peer network. Prometheus provides users complete control over where their social data is stored and who has the right to access it. Prometheus also provides the support for GeoS' distributed management of social data.

1.1 Motivation

There are three main motivations behind the development of GeoS. The first is to speed up the development of socially-aware applications by allowing them to use a third-party service that collects and manages social data and provides social inferences based on the data it has collected. Most online social networks already give some access to social information, but they limit the amount of information accessible: in particular, they can only gather information on how users interact with their service and, more importantly, some networks limit what data can be accessed and how it can be used [Facb].
The second motivation for GeoS is to provide more detailed social information than current services do. Most research so far has only considered one or two sources of social information [GM04, AYP+05, MPDG08] and used a simplified representation of the social network [AHK+07, LKG+08, ZC10]. Because each source of social information is limited to its own context, a single source can only provide a piece of a user's complete social information. If social information is being represented by a simple graph data structure, important information is likely to be lost because very little is known about relations beyond the simple fact that certain ones exist. A complete view of users' social information allows for analyses that are more detailed. Note that the simple graph most sociological studies use to represent social networks is likely chosen due to the difficulties inherent to surveys [BKKS84, MS63], one of the main providers of social information in sociology.

The third motivation is to give users complete control over the storage of their social information by decentralizing the data managed by GeoS. This is in stark contrast to current online social networks where, for example, Facebook keeps users' social data even after they delete their account [Kha09]. When an organization gathers data about many users in one centralized spot, there is the possibility of an entity using this data to monitor those users, creating what is often called a Big Brother scenario. In addition, if an organization has centralized control over a set of social data, users may use it less due to possible privacy concerns, or the organization may limit access to the data, stifling advances in social computing applications and related social network analysis. Finally, if an application was built to use a single centralized service and that service went offline, the application would no longer function until it was rewritten.
1.2 Solution, Thesis Contribution, and Roadmap

GeoS is a distributed service that collects social data, manages the vast amount of available social data using a novel social graph data structure, and solves social inference requests to support a variety of social computing applications. GeoS receives social information from different sources, including cell phone call logs, friendship declarations on social networking sites such as Facebook and MySpace, and emails, with no limit on the possible sources. The received social information is recorded in a social graph that has a tag and weight associated with each edge that connects two users. The tag and weight system records the strength of each activity or social group that relates two people, and since the edges are directed, relationship strength can be asymmetric. The particular graph data structure used in GeoS enables finding nuanced social relations in terms of the type and strength, typically specified by an application. As a part of the Prometheus architecture, GeoS runs on a peer-to-peer network that allows users to have full control of where they want their social information stored. We also present in this thesis experimental results that show the performance and capabilities of GeoS.

In Chapter 2, we present applications and scenarios that have motivated the design of GeoS. Chapter 3 contains the objectives that were in mind while designing GeoS, the architecture of Prometheus, and the API and functionality of the social inference functions within GeoS. Chapter 4 lists the implementation details of GeoS. We detail how we prepared for experimentation in Chapter 5. That is followed by the experimental results in Chapter 6. Related work is presented in Chapter 7. We propose future work in Chapter 8. In Chapter 9, we summarize the work presented in this thesis.

Chapter 2: Target Applications and Motivating Scenarios

Many classes of applications can benefit from the services offered by GeoS. Some of those applications would be impossible to build with current services, while others require less work to implement or can gain more features by using GeoS.

One example application allows users to invite friends who went separately to a sporting event to join them afterwards. Using current technology, this could be done by manually using Google Latitude to see which friends are located near the user, and then manually sending a message out to each friend the user wants to invite. With GeoS, an application could simply query GeoS to infer which friends of the user are also at the game and invite them. The application that uses GeoS could also provide more features, such as filtering the invited friends based on relationship strength.

A similar type of application would be one that simplifies notifying select friends about an upcoming event. One of the easiest ways to do this right now would be to send an invitation to select friends on a social networking site like Facebook. The problem with this is that most people have Facebook friends in the hundreds [WBS+09], and so it would be time consuming to select whom to invite from such a large set. GeoS allows users to filter their friends easily. For example, a user can limit friendships returned to only friends with a strong relationship and/or only friends with which they do certain activities. In addition, GeoS allows the user to invite friends of friends. Thus, a user could invite to an outdoor event all his hiking friends, along with their friends. In contrast to the last example application, the present location of the users is not checked and the application tests the amount that hiking relates the users instead of the general relationship strength between them. Also, in this example the friends of friends were returned instead of only direct connections.
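
The hiking invitation scenario can be sketched as a filter over tagged edges. This is only an illustration under stated assumptions: the function name `invitees`, the dictionary layout of the graph, and the single weight threshold are hypothetical, and the sketch is not the Neighborhood algorithm that GeoS defines in Section 3.3.

```python
def invitees(graph, ego, tag, min_weight):
    """Return ego's contacts for `tag` plus those contacts' own contacts.

    `graph` maps ego -> alter -> tag -> weight (a hypothetical layout).
    Only edges whose `tag` weight is at least `min_weight` are followed.
    """
    def contacts(user):
        return {
            alter
            for alter, tags in graph.get(user, {}).items()
            if tags.get(tag, 0.0) >= min_weight
        }

    direct = contacts(ego)
    result = set(direct)
    for friend in direct:
        result |= contacts(friend)  # friends of friends, same tag and threshold
    result.discard(ego)             # the requester is never invited
    return result


graph = {
    "alice": {"bob": {"hiking": 0.6, "work": 0.9}, "carol": {"work": 0.8}},
    "bob": {"dave": {"hiking": 0.5}, "alice": {"hiking": 0.6}},
    "carol": {"erin": {"hiking": 0.9}},
}
print(sorted(invitees(graph, "alice", "hiking", 0.4)))  # ['bob', 'dave']
```

Carol is connected to Alice only through "work", so neither Carol nor Carol's hiking friend Erin is returned. This mirrors the point that the application tests how strongly hiking relates the users rather than their general relationship strength.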
Another application silences phone calls when the user is in a particular social context and the incoming phone call is outside that context, regardless of location. The social context of the user is inferred by looking at the social connections the user has with those the user is currently collocated with. Because it does not rely on location, this application could silence the user's phone automatically when the user is in a meeting, as well as when they are at lunch with coworkers. The application could infer the user's social context even when no direct social connection exists between them and someone at their location. For example, the phone could silence personal calls when in an interview with the regional manager whom the user has never met before but an indirect connection from the user to their boss to the regional manager exists. Several applications meant to solve the common problem of cell phone interruption have already been developed, but they do not provide as many features as the proposed one that uses GeoS. SenSay [SSF+03] tries to discover if the phone should be silenced through two methods: if the user has scheduled an interruptible event in their calendar or if the user is speaking. The first method requires work from the user, while the second will not function if the phone does not pick up any loud sounds because the user is talking softly or not talking at all. A commercial application, Locale [Loc], silences the phone when the user is at certain locations defined by the user, but this requires the user to set up these locations ahead of time. Neither of these solutions differentiates between personal and business callers.

King et al. [KBI09] have proposed BatTorrent, a derivative of BitTorrent meant for mobile phones that uses social knowledge to decide whether to trust a low battery message from another peer. When the application receives a low battery message from a trusted source, the receiver no longer requires the other peer to upload, allowing it to conserve battery power. BatTorrent needs GeoS to check whether the sender is a trusted person: an untrusted user could report that their battery is low in order to game the system and avoid having to upload. GeoS infers trust for BatTorrent by determining the strength of the social connection from the seeder to the peer with the low battery. The intuition is that the number and intensity of existing social ties (even ones that do not directly connect the two users) can in this case be transformed into social incentives for good behavior and an estimate of social trust. Without GeoS, the user would have to decide manually whether users were trusted or not.

A few applications already use social knowledge to great effect, but we believe many of them would have results that are more precise by using the more complex social data from multiple sources provided by GeoS.
RE: [GKF+06] uses whitelists to help prevent the spreading of spam. What makes RE: special is that it allows friend-of-friend whitelisting: if user A whitelists user B and user B whitelists user C, emails from user C will be allowed to reach user A. The current system requires users to declare which senders should have their emails allowed, but GeoS would allow the whitelist to be inferred automatically.

F2F is a peer-to-peer network that uses social information to provide users with incentives for sharing [LD06]. The authors only test the example application BlockParty with social graphs from Orkut and a dating site, but as stated in the paper, users may not be willing to share their data or resources with those in the tested graphs. If a similar application wanted to automatically decide whom to share with, GeoS provides it with a weight rather than a simple boolean value found in Orkut. In addition, interesting possibilities could open up if the type of relation that connects people is also used. For example, a user could possibly only share their concert photos with their music friends.

Peerspective uses social knowledge to make searching the web more powerful [GMD06]. The weights and edge types in GeoS could be used to decide how strongly to bias each user's search results. For example, if a user searches for "team schedule", the system could check the user's sports friends to see what type of teams they usually searched for. The higher the sports weight connecting the user to his or her friend, the more the two must interact about sports, and thus, the more the friend's results should matter on the subject.

One final possible use of GeoS, and perhaps the most important, is in the area of sociological studies. GeoS allows for sociology researchers to record social information from multiple sources in a complex social graph with little programming required. GeoS together with social sensors can simplify sociological studies while giving more detailed results.
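
The friend-of-friend whitelisting rule described for RE: earlier in this chapter can be written down in a few lines. The sketch below assumes the whitelists are already available as a mapping from each user to the set of users they whitelist; the function name and the data layout are hypothetical, and neither RE: nor GeoS is claimed to implement the rule this way.

```python
def is_whitelisted(whitelists, recipient, sender):
    """True if `sender` is on `recipient`'s whitelist, or on the whitelist of
    someone the recipient whitelists (friend-of-friend)."""
    direct = whitelists.get(recipient, set())
    if sender in direct:
        return True
    return any(sender in whitelists.get(friend, set()) for friend in direct)


# A whitelists B, B whitelists C, C whitelists D (directed declarations).
whitelists = {"A": {"B"}, "B": {"C"}, "C": {"D"}}
print(is_whitelisted(whitelists, "A", "C"))  # True: reached through B
print(is_whitelisted(whitelists, "A", "D"))  # False: three hops away
```

In a GeoS-backed variant, the sets in `whitelists` would presumably be derived from the social graph instead of being declared by hand, for example by keeping only edges above some weight; that derivation is an assumption and is not shown here.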

Chapter 3: GeoS

In the following sections, we describe the objectives of GeoS along with how it meets those objectives (Section 3.1). This is followed by the architecture of GeoS (Section 3.2). That section includes an overview of Prometheus, the system of which GeoS is a part, with a focus on the role of GeoS within Prometheus and how GeoS interacts with the other parts. The design of GeoS and how it interacts with users' applications is also included in the architecture description. The specifications of the algorithms GeoS uses to complete social inferences follow (Section 3.3).

3.1 Design Objectives and Performance Requirements

The purpose of GeoS is both to manage geo-social information submitted by diverse social sensors while reducing privacy-loss risks and to answer a variety of social inference requests in a reasonable time. To this end, the design objectives are:
- store social information in a decentralized manner
- accept input from a variety of social sensors
- manage the geo-social information it receives

The performance requirements are:
- reasonably fast processing of social inference requests
- light communication load

3.1.1 Social Sensors

A main objective of GeoS is to allow any source of geo-social information, or what we call social sensors, to send input to GeoS. We expect the following social sensors to be developed:
- mobile devices that find users who are at the same location as the owner and then determine at which location the interaction is occurring
- applications attached to social networking sites, email applications, instant messaging applications, and other providers of text-based social interaction that establish the topic discussed by looking for keywords
- devices that report information about the user's phone call and SMS details
- services that record the sharing of files between users
- extensions to video games that observe users interacting together

However, GeoS should not have to be modified whenever a new social sensor is developed. Thus, the social input message format must be open, but the system also has to keep network communication at a low level and privacy-loss risks at a minimum. To meet these objectives, we keep the input message format small, simple, and generic. The small message size creates little network traffic, which is important when we expect mobile devices to be one of the main providers of input. The simple nature of the input has enough information for GeoS to build a social graph over time, yet reduces the possible privacy loss to network security attacks, such as man-in-the-middle attacks. Finally, its generic format allows any social sensor to begin sending social information without any changes to GeoS.
In our design we made the following assumptions about social sensors:

First, social sensors are sophisticated enough to aggregate the information they collect. This will allow them to both recognize trends and limit the number of messages they send to GeoS. A type of trend we expect social sensors to detect, using activity recognition through ubiquitous computing [LHP+07, PFF+03], is routines. For example, if a user goes to the gym each day, we expect their device not to report a social interaction with the other daily gym goers around them. We assume most social sensors will report in at most once a day, but that the sensor will send several messages consecutively at that time. Aggregating input is also good for privacy because it protects against a malicious user knowing exactly what another user is currently doing. For example, if input was not aggregated and a malicious user saw that users A and B just had a "movie" interaction, the malicious user would know A and B are likely at a movie theater currently. The more time between sending aggregated input, the more difficult it would be to build up a user's schedule, but then there come the problems of the social sensor potentially losing its information (e.g., from becoming lost or damaged) and the information not being in the social graph for other applications to make use of it.

Second, while we expect social sensors will likely aggregate information, we still expect the value of an individual input to be small because an accurate relationship weight will be built slowly over time. Since users are likely to have applications use relationship strength as a factor when allowing or denying privileges, it is better in terms of privacy for a relationship weight to begin low and slowly grow to an accurate weight than to start out high and be reduced. A benefit of individual inputs being small is that the loss of a single input will not strongly affect a user's social information. To combat the slow building of a relationship weight, we expect applications will bootstrap the user's social information by looking at the user's past social interactions through such avenues as social networking websites and emails.

Third, social sensors can also identify how strongly an individual interaction affects the connection between two users by looking at many variables, such as the length of the interaction, how socially active the users are, and how interested the users are in the activity they did together. We also assume that the social sensors will be able to ascertain what activity it was that the two users did together. From the above expected social sensor types, co-location input could take the GPS coordinates of the location, use reverse geocoding to find the address of the location, and finally use some type of database to find what activity is related with that location. The social sensors that handle text-based social interactions could mine the text for keywords that show what activity the users were discussing by text. The phone call device could use speech recognition to, again, look for keywords to discover what topic the users discussed.

3.1.2 Decentralization of Social Data

When managing the geo-social data, a main objective is to make the users feel safe by giving them control over the storage of their social data and, again, to keep potential privacy loss to a minimum. To this end, we decided that GeoS would operate on a decentralized platform. The decentralized platform provides many benefits to users, including:
- users are given command over their social data
- Big Brother scenarios are prevented by not giving an entity access to the social data of all users
- the effects of a security attack are lessened versus a centralized platform because if a single system is compromised, a reduced number of users are affected

3.1.3 Management of Social Data

In terms of managing geo-social data, GeoS is responsible for the following:
- organizing users' social information in a graph data structure
- applying and tracking input
- propagating social information across the decentralized system
- submitting social inference requests to other users when social information on those users is needed

3.1.4 Social Inference Requests

GeoS needs to provide a set of social inference functions as an API for applications. These inference functions need to be reasonably efficient since users are likely to be interacting directly with them. Given the likely scale of the graph (for example, Facebook has 400 million active users [Faca]), the fact that the graph will be distributed across many peers, and potential privacy issues, some social metrics, such as betweenness centrality, are too complex. The ones implemented in GeoS, however, are simple and can still have a high impact on socially-aware applications.

3.2 Architecture

We built the GeoS architecture so that the service could manage social data in a distributed system. GeoS is a part of Prometheus, a peer-to-peer system that provides users with control over where their social data is stored and who accesses it.

3.2.1 Overall Architecture of Prometheus and GeoS as a Component

We designed GeoS to run on a peer-to-peer distributed system and cooperate with other services that are specific to handling the management of a distributed system [KFA+10]. We named the architecture composed of GeoS together with the services that handle the distributed system Prometheus. The components that make up Prometheus are:
- Ermis - manages the decentralized peer-to-peer network and groups peers based on which ones a user declares as trusted
- Registrar - manages which peers users trust
- Gateway - enforces privacy through encryption and access control lists (ACLs)
- GeoS - manages the social data of users

An overview of the Prometheus architecture can be found in Figure 1.

Figure 1. An Overview of the Prometheus Architecture.

Each peer that is a part of Prometheus runs a copy of the four services. Users of Prometheus declare peers as trusted and have their social data stored on those peers. Since most people are sensitive about the privacy of their social data, we expect users to choose peers owned by users that they have social relationships with as their trusted peers. Thus, GeoS can aid users in choosing their trusted peers by using its social inferences. Ermis maintains a list of which peers are trusted by a user in that user's trusted peer list (TPL). These TPLs contain the IP addresses of the peers that are trusted for the user and are updated whenever a trusted peer joins the peer-to-peer network. As will be described in detail later, all messages in the Prometheus system first go through the Gateway so that it can enforce privacy and so that only one component is in charge of the routing of messages to the correct peers. When the Gateway needs to know where to route a message, it requests the trusted peer list for the destination user from Ermis.

The user defines an ACL, which in turn the Gateway uses to decide who has access to the user's data. The ACLs are a powerful form of privacy and let the user use many parameters when deciding who can and cannot access their data. Such parameters include the distance in the social graph from which the request for data originated, how the users are socially connected, and who the requesting user is. The Gateway uses public-key encryption when transferring messages between peers in order to protect the privacy of the user from possible data interceptions.

Ermis acts as an interface for SCRIBE [RKCD01], which is itself built upon Pastry [RD01]. As an implementation of distributed hash tables [PRR97], Pastry is a peer-to-peer network overlay that allows for the quick routing of messages and is able to adapt to peers leaving and joining the network frequently. SCRIBE organizes peers into groups, which, in the case of Prometheus, allows it to contact all the trusted peers of a user quickly through anycast or multicast.
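
To make the three ACL parameters named in this section concrete (social distance, type of social connection, and identity of the requester), the following minimal sketch evaluates one access rule. The `AccessRule` fields, the `is_allowed` function, and the order of the checks are all hypothetical; the Gateway's actual ACL format is not described in this section.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRule:
    """Hypothetical ACL entry owned by one user."""
    max_hops: int                                     # largest allowed social distance
    allowed_tags: set = field(default_factory=set)    # empty set = any relation type
    blocked_users: set = field(default_factory=set)   # requesters always refused


def is_allowed(rule, requester, hops, tags):
    """Decide whether `requester` may read the owner's social data.

    `hops` is the social distance between owner and requester, and `tags`
    are the relation types that connect them.
    """
    if requester in rule.blocked_users:
        return False
    if hops > rule.max_hops:
        return False
    if rule.allowed_tags and not (rule.allowed_tags & set(tags)):
        return False
    return True


rule = AccessRule(max_hops=2, allowed_tags={"work"}, blocked_users={"mallory"})
print(is_allowed(rule, "bob", 1, ["work", "hiking"]))  # True
print(is_allowed(rule, "bob", 3, ["work"]))            # False: too far away
```

A rule of this shape would let a user expose work-related data to colleagues within two hops while refusing a specific requester outright.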


3.2.2 Communication Between GeoS and Other Components

The components run as separate services and communicate with each other using sockets to allow for complete modularity. All communication with GeoS first goes through the Gateway; all input, requests, results, and even communication between GeoS peers must go through the Gateway. In this way, the Gateway can check its ACLs, decrypt incoming messages, and encrypt outgoing messages. This also allows GeoS to almost blindly send messages to other peers since all it needs to do is tell the Gateway which user the message is to be sent to. If the Gateway on peer A receives a message for a user who has not set peer A as trusted, the Gateway will forward the request to a correct peer. Because all communication to GeoS first goes through the Gateway, the format of messages to the Gateway is defined here.

Messages to the Gateway use the following generic format:

component<-r> | <destination IP> | destination user | requesting user | message<:::reply IP:::reply port:::ID>

Note that optional fields are placed in angle brackets and variables are italicized. Based on the previous generic format, to make a social inference request, an application sends a message to the Gateway using the following format:

geos-r || destination user | requesting user | inference(destination user):::reply IP:::reply port:::ID

Social input uses the following format:

geos || destination user | destination user | input(destination user)
Fields are delimited by the pipe character. The component field defines which component of Prometheus the message is destined for. The optional destination IP field is only used when a message is to be delivered to a specific peer. If that field is empty, a peer trusted by destination user is chosen. The ACL of destination user is checked to see if requesting user is allowed to access their social data. The message field is simply the message to be delivered. The optional -r in the component field and :::reply IP:::reply port:::ID after the message field are used when the message expects a reply. In order to save resources, the Gateway and GeoS close all socket connections as soon as they receive the entire message. However, in order for the Gateway to send a reply, it must know at what IP and port the sender will be listening. The ID is for the sender to identify the reply.

The only time GeoS sends a message destined for Ermis is when it needs a message delivered to all the trusted peers of a user. Ermis does this for GeoS when GeoS needs an update sent out or the most up-to-date coordinate data. As with all communication leaving from or coming to GeoS, GeoS uses the Gateway to communicate with Ermis. There is also indirect communication between GeoS and Ermis when GeoS makes a request of the Gateway and the Gateway needs information on the peer-to-peer network.
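The field layout described above can be illustrated with a small parser. This is only a sketch under stated assumptions, not the Gateway's actual code: the names GatewayMessage and parse_gateway_message are hypothetical, whitespace around delimiters is assumed to be insignificant, and the reply port is assumed to be an integer.

```python
from dataclasses import dataclass
from typing import Optional

FIELD_SEP = "|"    # fields are delimited by the pipe character
REPLY_SEP = ":::"  # separates the message from the optional reply details


@dataclass
class GatewayMessage:
    """Hypothetical container for one parsed Gateway message."""
    component: str
    expects_reply: bool
    destination_ip: Optional[str]
    destination_user: str
    requesting_user: str
    message: str
    reply_ip: Optional[str] = None
    reply_port: Optional[int] = None
    request_id: Optional[str] = None


def parse_gateway_message(raw: str) -> GatewayMessage:
    # Split into at most five fields so a pipe inside the message body survives.
    parts = raw.split(FIELD_SEP, 4)
    if len(parts) != 5:
        raise ValueError("expected five pipe-delimited fields")
    component, dest_ip, dest_user, req_user, body = (p.strip() for p in parts)

    # The optional "-r" suffix on the component marks a message that expects a reply.
    expects_reply = component.endswith("-r")
    if expects_reply:
        component = component[:-2]

    reply_ip = reply_port = request_id = None
    if expects_reply:
        # Reply details trail the message: message:::reply IP:::reply port:::ID
        pieces = body.rsplit(REPLY_SEP, 3)
        if len(pieces) != 4:
            raise ValueError("reply expected but reply fields are missing")
        body, reply_ip, port_text, request_id = (p.strip() for p in pieces)
        reply_port = int(port_text)  # assumption: the port is numeric

    return GatewayMessage(
        component=component,
        expects_reply=expects_reply,
        destination_ip=dest_ip or None,  # empty means "any trusted peer"
        destination_user=dest_user,
        requesting_user=req_user,
        message=body,
        reply_ip=reply_ip,
        reply_port=reply_port,
        request_id=request_id,
    )
```

For instance, parsing "geos-r||bob|alice|friend_test(bob,alice,work,0.2):::10.0.0.5:::9000:::17" would yield component "geos", no destination IP, and the reply details as separate fields; the user names and address are made up for illustration.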


Figure 2. GeoS and the Gateway Communicating.


3.2.3 GeoS Design

GeoS uses a directed multigraph where each edge has a weight and a label for holding the social data of its users (detailed in Section 3.2.3.1). To reduce the amount of network traffic throughout the distributed system and the energy costs on mobile devices that are running social sensors, social sensors may send input to only one trusted peer. To keep the social data consistent amongst the trusted peers of a user, peers periodically send updates, as described in Section 3.2.3.2. The system decides when to send updates by using an adaptive threshold function (presented in Section 3.2.3.3) that is designed to be fair with respect to the amount of network traffic created for each user.

3.2.3.1 Data Structure

GeoS uses a weighted, directed multigraph to represent the social data for the users. Each user can have multiple parallel edges connecting them to another user, with each edge having a different tag associated with it. A tag usually represents a social activity shared between the two users. For example, "hiking", "school", "work", and "movies" are possible tags. The social sensors will provide activity tags. There are some tags that will most likely be explicitly declared, such as "family". Social sensors to tags is a many-to-many relationship: each social sensor can send input of any tag and each tag can be used to label an input sent by any social sensor. A social sensor is not restricted to using a single tag. However, because of this lack of a restriction, when social sensors send input for a particular edge, they must take into account the weight currently on the edge in case another social sensor has changed it.

Each edge has a weight associated with it that represents how strongly that activity relates the two users. The weight is a real number between zero and one, inclusively, and varies depending on how frequently the two users share the activity.
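The data structure described above can be sketched with nested dictionaries. This is a minimal illustration rather than the GeoS implementation: the class name SocialMultigraph and its methods are hypothetical, and clamping accumulated weights to the range [0, 1] is an assumption made here to honor the stated weight bounds.

```python
from collections import defaultdict


class SocialMultigraph:
    """Directed, weighted multigraph: edges[ego][alter][tag] -> weight."""

    def __init__(self):
        self._edges = defaultdict(lambda: defaultdict(dict))

    def add_input(self, ego, alter, tag, delta):
        """Apply a social input to the ego -> alter edge labeled with tag."""
        current = self._edges[ego][alter].get(tag, 0.0)
        # Assumption: weights are clamped so they stay within [0, 1].
        updated = min(1.0, max(0.0, current + delta))
        self._edges[ego][alter][tag] = updated
        return updated

    def weight(self, ego, alter, tag):
        """Weight of one parallel edge, or 0.0 if that edge does not exist."""
        return self._edges.get(ego, {}).get(alter, {}).get(tag, 0.0)

    def tags(self, ego, alter):
        """Tags of all parallel edges from ego to alter."""
        return sorted(self._edges.get(ego, {}).get(alter, {}))
```

Because edges are stored per direction, an input reported by A about B changes only the A-to-B edge; the B-to-A edge is untouched.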
We chose to represent the social graph as a directed graph because in sociology it is generally accepted that ties are usually asymmetrically reciprocal, differing in content and intensity [Wel88]. In addition, making the graph directed enhances privacy without requiring extra work by ensuring that if user A attempts to artificially raise its relationship strength with user B by sending a massive number of interactions, it will not affect the strength of the relationship user B has with user A. If the graph were undirected, the system would have to detect that a bulk of unidirectional interactions, or "spam", were received and eliminate them to keep users from artificially increasing the mutual relationship strength between them and other users. Detecting spam with an undirected graph would require GeoS to record the amount of interaction reported by each user for every user they reportedly interacted with and if the amount of interaction between two users


Figure 3. Directed Graph Versus Undirected Graph for an Inference Request of at Least Path Length Two.

is decidedly one-way, the sender of the spam can be removed. While this certainly does work, recording the one-way interactions uses extra resources and is similar to making the graph directed. In addition, GeoS would remove spam after the system discovered it instead of spam never existing in the first place. Removing duplicate or inconsistent input in an undirected graph requires a bit more thought.

Using a directed graph also simplifies data management in a distributed environment by removing the chance of an input being applied twice: in an undirected graph, if both parties that are part of an interaction send a social input, the input will be applied twice if the two users share a trusted peer. In addition, if the two applications or devices that sent the input sent different values, an inconsistency would occur. These two facts would require GeoS or the social sensors to do extra work. For example, the social sensors of the interacting users might have to communicate with each other to determine which sensor is sending the input for the two users and to verify that an agreed upon weight is being sent.

The biggest cost to using the directed graph is that, as seen in Figure 3, some requests that require searching 2 hops or greater away will require a remote request when they would not have to in an undirected graph. In the example, A wants information about C through B and C wants information about A through B, but B has not declared the trusted peer of A and C as its trusted peer. In the undirected version of the example, A and C could get the required information locally.
Other social data such as when the user last interacted with each other user and the current coordinate location on file for the user is also stored with the user's social graph. We refer to the entirety of a user's social data as their social state data file, or SSDF. Data on the amount of traffic the user receives from input (whose use shall be described in Section 3.2.3.3) is also transferred with the SSDF, even though it may not be considered social data. It is easier to include this data with the SSDF rather than send it separately. In a simple graph with 1000 nodes and one edge type that was randomly generated using the Nearest Neighbor algorithm described in Section 5.1, the mean size of a single user's SSDF was 1062 bytes. The user's current coordinates are included in a transferred SSDF only in case of peer failure. Because we expect location


to change frequently, a peer will query all trusted peers of a user for the latest coordinate sent by the user any time it needs location data.

Table 1. Stored Social Data.

Data | Description | Entry in Transferred SSDF

Social Graph
Graph | Directed multigraph data structure that can be used to find edges and nodes in the graph | Built using second
Weights | After an edge has been found in the graph, its weight can be looked up in this data structure | Second
Timestamps | Contains the time since two given users last interacted | Third
Start Time | Used with timestamps to calculate server time | N/A

Update Preparation
Graph Delta | Amount weights have changed since last update sent | N/A
Average Input | Average number and weight of input to calculate adaptive threshold | Fourth

Coordinates
Coordinates | Latitude, longitude, and altitude | Fifth
Coordinate Timestamp | Timestamp set by input device that is used to make sure most up-to-date coordinate is used |

3.2.3.2 Updates

Each GeoS peer tracks the input it received for each user. If the total weight increase for a user passes a certain threshold, GeoS will send an update to Ermis via the Gateway for Ermis to distribute an update to all other trusted peers of the user. The update is in the format of an SSDF and includes all outbound edges from the user, and the weight for each of those edges. By including all edges and weights for the user, nodes who might have missed previous updates are updated. The local Gateway receives this update, encrypts it with the user's key, and hands it over to Ermis for delivery to all of the user's trusted peers. Upon receiving it, the Gateway of each trusted peer decrypts the update and delivers it to the local GeoS.
Each trusted peer of a user has its own local copy of the social graph for that user. When a trusted peer of a user receives an input that another trusted peer of that user does not, their local copies are inconsistent. A goal of GeoS is to keep each local copy of a user's social graph in a near consistent state across the distributed system while keeping network traffic low. Sending input for a user to every trusted peer of that user would keep the graphs consistent as long as no peer missed an input, but it would also cause a lot of traffic across the network each time a social sensor sent input. Using the periodic update system allows each graph to be in a near consistent state while only occasionally creating extra traffic to each trusted peer.


When a user causes an update to be sent out, each copy of their social graph on all of their trusted peers has its base state brought to the same point. Input that is received, but not yet released in an update, is kept separate from the data in the base state and can be imagined as being "on top" of it. As seen in Figure 4, when an update is received by a peer, the current social graph for the user is replaced by the new base state in the update, and then any unreleased input is applied on top of it. Two graphs are equivalent if they are at the same base state and there is no unreleased input. When a peer sends an update, the Gateway receives a local copy of the SSDF, which it keeps encrypted. When a new GeoS node comes online, it will be sent one of these encrypted local copies from a Gateway to bootstrap it. The SSDF brings the new node to the same base state as all the other peers.

3.2.3.3 Adaptive Threshold Function

Studies have found that the social interaction frequency of people generally takes the form of a power-law distribution. In other words, some users are very active socially while others are much less active [GWH07]. Because there are generally such large variations in how socially active people are, we expect that different users will have a large divergence in the amount of input their social sensors send to GeoS. A static threshold for deciding when GeoS should send updates could leave users in an out-of-date state for a long time if their social sensors report little social interaction. Consequently, we chose an adaptive threshold for each user.

This adaptive threshold function works by recording the average weight of input messages the peer receives for the user along with the average number of inputs that peer receives per hour for the user. It then uses this data to try to cause the user to send out on average X updates an hour.
The threshold is set to:

threshold = (<average input weight> × <average number of inputs an hour>) / <desired average updates an hour>

The desired average updates an hour can be scaled depending on how much traffic is wished for in the network. For our experiments, it was set to three updates per hour. As mentioned in Section 3.2.3.2, the average weight and input count are distributed via the SSDF so that the threshold function on new peers will know how often to send updates. These remote weight and input counts also influence the adaptive threshold of already existing peers, but the local counts influence the function more. The thought behind this is that the accumulated knowledge of how socially active a user is should influence a peer's adaptive threshold function, but not as strongly as its own personal experience. Since a local peer will receive its own counts back in a remote peer's SSDF update, it subtracts its counts from the incoming remote counts before using them. The procedure by which a count is processed when received in an update is given in Figure 5.
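As a worked illustration of the threshold formula, the sketch below computes the threshold and checks whether an update should be released. The function names adaptive_threshold and should_send_update are hypothetical, and the strict "greater than" comparison is an assumption based on the wording that the accumulated weight must pass the threshold.

```python
def adaptive_threshold(avg_input_weight, avg_inputs_per_hour,
                       desired_updates_per_hour=3.0):
    """Weight that must accumulate before an update is released.

    The product of the first two arguments is the average weight received
    per hour; dividing by the desired update rate gives the average weight
    that arrives between two updates.
    """
    if desired_updates_per_hour <= 0:
        raise ValueError("desired_updates_per_hour must be positive")
    return avg_input_weight * avg_inputs_per_hour / desired_updates_per_hour


def should_send_update(accumulated_weight, threshold):
    # Assumption: an update is released once the accumulated weight
    # strictly exceeds the threshold.
    return accumulated_weight > threshold
```

With an average input weight of 0.25, twelve inputs an hour, and three desired updates an hour, the threshold is 1.0: a very active user must accumulate more weight per update than a quiet user, so both send updates at roughly the same rate.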


Figure 4. Inputs Being Received and Then Distributed Through an Update.

number of remote inputs ← received number of remote inputs − number of local inputs;
remote server's total runtime ← received runtime − time since local peer was initialized;
remote inputs per second ← number of remote inputs / remote server's total runtime;
local inputs per second ← number of local inputs / time since local peer was initialized;
inputs per second ← remote inputs per second + local inputs per second / 3;
total input ← inputs per second × time since local peer was initialized;

Figure 5. Applying Remote Input Information to the Adaptive Threshold Function.
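The procedure in Figure 5 can be sketched in Python as follows. The subtraction of the local peer's own counts and the per-second rates follow the figure; the function name blend_input_rate and the two weighting parameters are hypothetical. In particular, weighting the local rate twice as heavily as the remote rate is an assumption chosen only to reflect the statement that local counts influence the function more, and the guards against zero runtimes are also additions.

```python
def blend_input_rate(received_inputs, received_runtime,
                     local_inputs, local_uptime,
                     local_weight=2.0, remote_weight=1.0):
    """Combine remote and local input rates for the adaptive threshold.

    Returns (inputs_per_second, total_input) for the local peer.
    """
    # Remove the local peer's own contribution from the received totals.
    remote_inputs = max(0.0, received_inputs - local_inputs)
    remote_runtime = max(0.0, received_runtime - local_uptime)

    # Per-second rates; a zero runtime yields a zero rate (added guard).
    remote_rate = remote_inputs / remote_runtime if remote_runtime > 0 else 0.0
    local_rate = local_inputs / local_uptime if local_uptime > 0 else 0.0

    # Assumption: weighted average in which the local rate dominates.
    blended = ((remote_weight * remote_rate + local_weight * local_rate)
               / (remote_weight + local_weight))

    # Scale back to a count over the local peer's own lifetime.
    return blended, blended * local_uptime
```

The second return value corresponds to the "total input" line of the figure: the blended rate projected over the time the local peer has been running.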


We chose to base the calculations on the average number of updates because it would allow us to easily shape the network traffic that comes from updates being sent. It also allowed for easy calculations since GeoS only needed to keep track of the average volume of input.

3.2.3.4 Example Social Input

Now that all parts of the update process have been explained, the following is an example of the steps input goes through to be applied in GeoS. In this example, two users A and B are collocated at a university and their social sensors detect their interaction. The interaction is noticed by an application on their mobile phone that uses Bluetooth to search for other Bluetooth devices. When it finds a new Bluetooth device present, it queries a service that translates devices to GeoS usernames. When it determines the other Bluetooth device is a social sensor of another user, it records how long the users are together to verify they were not just passing by. The users stay together for several minutes and so it decides that it must have been a real interaction. The social sensor now uses GPS to find its current location, then uses a reverse geocoding service to determine the place, and finally uses a service to find the tag associated with this place ("school"). With the tag found, it now calculates a weight. It takes into account many factors, such as how long the users were together, how often the owner interacts with others at school, and if the two users are related in some other way.

The social sensor for user A collects data for several hours and then decides it is prepared to send a social input. The input is encrypted on the phone and then sent to the Gateway of a trusted peer of user A. The Gateway decrypts the input and forwards it to GeoS. GeoS applies the input to its social graph. The "school" edge from user A to user B is increased by the calculated weight. The weight of the input is greater than the threshold returned by the adaptive threshold function, and so an update is released by the peer.
GeoS compiles together the user's SSDF and gives it to the Gateway, telling it to have Ermis multicast the update to all trusted peers of the user. The Gateway encrypts the SSDF and gives it to Ermis. Ermis multicasts the update to the Gateways of all the trusted peers of the user. These Gateways verify the update is allowed by their ACLs, decrypt the update, and then pass it to GeoS. Each of the GeoS instances replaces its SSDF with the one in the update and then applies on top of that any input it has received, but not yet released in an update.

3.2.3.5 Aging

When people do not interact for an extended period of time, the social bonds between them begin to slowly degrade. Consequently, unless GeoS is used for a short-term study, it will need some type of aging function to mimic this slow degradation of social ties due to lack of interaction.


Our current approach is to reduce the weight between users by 10% each week they do not interact. This results in a very slow degradation and relationships never fully end but eventually become very weak. This seems like a passable solution since it gives users plenty of time to reinitiate contact. However, we have no way of knowing whether this, or a variation of it, is the correct solution from a sociological view. In the following, we propose a few possible solutions that could solve the aging problem.

Since GeoS has support for input to have a negative value, one alternative is for an outside application to handle the specifics of aging. Leaving the aging to an outside application would be a good solution because it is a modular approach. This way, anyone can work on an aging function. Note that although GeoS supports input that has a negative value, this is currently the only case where we expect it.

Another approach to the aging problem is to have users manually set how much time they want to go by before aging occurs and how strongly the aging function should affect the relationship when it does. In a similar fashion, instead of setting an amount of time before aging should occur, a user could manually go into an application and degrade the connection whenever they choose. For example, if they notice someone they no longer interact with is showing up in the results for inference requests, they could enter the degradation application and reduce the strength of the connection they have with the other user. If users must degrade relation strengths themselves, the system gives them full control over their social graph and the responsibility is taken away from GeoS or an application. If an application handled aging, a mistake or miscalculation in the application could ruin a great deal of social data.

However, requiring the user to do extra work is one of the worst problems of this approach since we designed GeoS to require very little user intervention. Also, users may have a difficult time determining what values to use and end up disrupting their social graph.

3.2.3.6 Message Formats

The format of calls to GeoS is similar in appearance to typical programming function calls. The format of these calls can be found in Table 2.
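The weekly 10% reduction described in Section 3.2.3.5 can be written as a small worked example. This is a sketch only: the name aged_weight is hypothetical, and it assumes the reduction compounds once per whole week elapsed since the last interaction.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def aged_weight(weight, seconds_since_last_interaction, weekly_decay=0.10):
    """Edge weight after aging for the given period without interaction."""
    # Assumption: only complete weeks count, and the decay compounds.
    weeks = int(seconds_since_last_interaction // SECONDS_PER_WEEK)
    return weight * (1.0 - weekly_decay) ** weeks
```

Under this reading, a weight of 1.0 falls to about 0.81 after two weeks and to about 0.004 after a year, which matches the observation that relationships become very weak but never fully end.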
3.3 Social Inference Algorithms

GeoS includes several social inference functions to allow users' applications to get social information easily. By providing an API that includes inference functions, developers are able to easily query GeoS for social information. The social inference functions provided with GeoS are relatively simple, and we expect more complicated social inferences to be developed in the future using the default ones as a base.


Table 2. Format of Messages in GeoS.

Communication With Users' Applications

User input
soc_input(ego, alter, tag, weight)    Returns: Void
geo_input(ego, alter, longitude, timestamp)    Returns: Void

Requests
SocS(ego, alter, request_num)    Returns: Real number
neighborhood(ego, tag, min_weight, hops, request_num)    Returns: List of users who met specifications
top_friends(ego, tag, num_top, request_num)    Returns: List of up to num_top users
friend_test(ego, alter, tag, min_weight, request_num)    Returns: Boolean
get_coordinate(ego, request_num)    Returns: Latitude and longitude
proximity_friends(ego, tag, min_weight, hops, distance, request_num)    Returns: List of users who met specifications

Communication Within Prometheus

SSDF Distribution
initialize(SSDF)    Returns: Void
mass_update(update)    Returns: Void

GeoS-to-GeoS Communication
get_coordinate(ego, replyUser)    Returns: Latitude, longitude
get_coordinate_secondary(ego, replyIP, replyUser, request_num)    Returns: Latitude, longitude
get_coordinate_reply(latitude, longitude, timestamp, request_num)    Returns: Void
neighborhood_secondary(ego, tag, min_weight, hops, replyIP, replyUser, request_num, hops_travelled, user_chain)    Note: Results in neighborhood_reply call with encryption
neighborhood_reply(list_of_users, request_num)    Returns: Void
SocS_secondary(ego, alter, replyIP, replyUser, request_num)    Note: Results in SocS_reply call
SocS_reply(social_strength, request_num)    Returns: Void

Outbound Messages to Other Components
ermis | 127.0.0.1 | user | user | scribeupdate ssdfego0enc> SSDF    Note: Results in mass_update call
gateway | 127.0.0.1 ||| log ssdfegoSSDF    Note: Creates local, encrypted copy of SSDF

If application developers build future social inference functions from the composition of the basic provided ones, no changes have to be made to GeoS. However, while there is a gain of modularity by doing it this way, there will be a large amount of back and forth communication compared to if the new function was implemented within GeoS. One idea to reduce this unneeded communication is to create user services that run on the same peer as GeoS, receive complex requests from users, and then automatically take care of
submitting a composition of functions. These inference functions were first presented in [AKFI10].

3.3.1 Friend Test

The function friend_test(ego, userB, tag, x) is a boolean function that checks whether ego is directly connected to userB in the social graph by an edge with label tag and with a minimum weight of x. A mobile


phone application can use this function to determine whether an incoming call from a coworker should be let through or silenced on weekends.

3.3.2 Top Friends

The function top_friends(ego, tag, n) returns a size n set of users in the social graph that are directly connected to ego by an edge with label tag, ordered by decreasing weights. An application can use this function, for example, to invite users highly connected with ego to share content related to activity tag.

3.3.3 Neighborhood

The function neighborhood(ego, tag, x, radius) returns the set of users in ego's neighborhood of radius who are connected through social ties of a label tag and weight larger than x. The function performs a breadth-first search on the social graph. An application that silences a user's cell phone near coworkers uses this function to infer ego's work neighborhood in the social graph even if not directly connected to the user (i.e., the phone will still be silenced if on a business lunch with a regional manager that the user has not met before).

If the radius is greater than one, GeoS may need to use the Gateway to request other GeoSs on other peers to infer social information to complete the inference. This need for a remote GeoS will only happen if the local GeoS is not the trusted peer for a user that is radius − 1 hops away from ego. The local GeoS will send a neighborhood_secondary request to a remote GeoS of this user for which it is not a trusted peer. The secondary request contains within it the usual details of tag and weight, but also contains how many more hops to search and where to reply to. The secondary request could potentially result in this user also sending off a secondary request of its own. Each peer that gets a secondary request replies back to the sender with the result (even if it was empty). When the original peer finally gets all the replies it is waiting on, it compiles the results (removing duplicates if any), and returns the results. The waiting peer will eventually time out if it does not receive a reply.
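The three inferences described so far can be sketched on a purely local graph. This is an illustration under stated assumptions, not the GeoS code: the graph is a plain nested dictionary graph[ego][alter][tag] = weight, the request_num argument and all remote (secondary) requests are omitted, and ties in top_friends are broken alphabetically, which the text does not specify.

```python
from collections import deque


def friend_test(graph, ego, alter, tag, min_weight):
    """True if ego has an edge to alter with this tag and weight >= min_weight."""
    weight = graph.get(ego, {}).get(alter, {}).get(tag)
    return weight is not None and weight >= min_weight


def top_friends(graph, ego, tag, n):
    """Up to n users directly connected to ego by tag, by decreasing weight."""
    scored = [(edges[tag], alter)
              for alter, edges in graph.get(ego, {}).items() if tag in edges]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # assumed tie-break
    return [alter for _, alter in scored[:n]]


def neighborhood(graph, ego, tag, min_weight, radius):
    """Breadth-first search over edges with this tag and weight > min_weight."""
    if radius < 1:
        return []
    distance = {ego: 0}  # doubles as the visited set
    results = []
    queue = deque([ego])
    while queue:
        current = queue.popleft()
        for alter, edges in graph.get(current, {}).items():
            if alter in distance:
                continue
            if tag in edges and edges[tag] > min_weight:
                distance[alter] = distance[current] + 1
                results.append(alter)
                # Only expand users that are still inside the radius.
                if distance[alter] < radius:
                    queue.append(alter)
    return results
```

In a distributed deployment, the point where this sketch appends a user to the queue is where a peer that does not hold that user's data would instead issue a secondary request.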
The searching of the graph uses a breadth-first search algorithm. The algorithm starts out with a queue that is empty except for ego. The algorithm continues to remove the user in the front of the queue and make it the currently searched user until the queue is empty. Each neighbor of the current user who has an edge that meets the specifications and who has not been visited yet is added to the final results. A user added to the final results is either added to the queue or processed on a remote peer. If the user trusts the peer, the algorithm places the user in the back of the queue; otherwise, it sends off a remote request. GeoS sends off


remote requests right away with a unique request number attached to them. After traversing the graph as far as possible locally, the algorithm waits to receive replies from all requests. It then combines the results from the remote requests, removes duplicates, and gives the results to the Gateway.

input : ego, tag, x, radius
output: The set of users in ego's neighborhood of radius who are connected through social ties of a label tag and weight larger than x

Enqueue(ego);
while queue not empty do
    ego ← Dequeue();
    foreach alter in Neighbors(ego) do
        if alter not visited then
            if tag in Edges(ego, alter) then
                if x < Weight(ego, alter, tag) then
                    Visit(alter);
                    distance[alter] = distance[ego] + 1;
                    results ← results ∪ {alter};
                    if distance[alter] < radius then
                        Enqueue(alter);
                    end
                end
            end
        end
    end
end

Figure 6. Algorithm for Neighborhood(ego, tag, x, radius).

The request proximity_friends(ego, tag, x, radius, distance) is a simple extension of neighborhood. After the results are gathered from neighborhood, it removes any users who are not within a given physical distance. This inference could be used to invite nearby friends to a bar after a football game. To get the location of a user, it simply calls get_coordinate. Since locations are likely to change often, social sensors send location input to only one peer and location information is not included in updates. Instead, whenever get_coordinate is called, it contacts every trusted peer of the user to get the current coordinate the peer has on file for the user along with the timestamp given with that location. Whichever coordinate received has the latest timestamp associated with it is the one used in the distance calculation. The distance between two coordinates is calculated using the spherical law of cosines.

3.3.4 Social Strength

The function social_strength(ego, alter) returns a real number between 0 and 1 that quantifies the social strength between ego and alter from ego's perspective. This value is normalized to ego's social ties to ensure that the social strength is less sensitive to the social activity of the users. As shown in Figure 7, the normalized


Figure 7. A Social Strength Calculation Example.

weight from A to its adjacent neighbor B (NW{AB}) is the sum of all the weights of the edges from A to B (aggregating over all types of interactions between A and B) divided by the largest of all the sums of weights going from user A to one of its neighbors D. The path strength from A to C through B (PS{ABC}) is the lowest of all the NW on that path, divided by the length of the path. Finally, the social strength from user A to user C (SocS{AC}) is the largest path strength from A to C.

The social strength request provides users with a quantitative measure on social solidarity [BF01] based on the frequency and intensity of their social interactions. Such a function could be used, for example, to estimate social incentives for resource sharing, such as storage or hosting a service for another user or a community, or the battery-aware BitTorrent example presented above. Social strength can be applied to dyads that may not be directly connected. Moreover, even if a direct tie exists, an indirect path through common friends may be stronger [Bia97]. We limit the length of the indirect path that connects two users to 2, using a well-accepted result in sociology known as the "horizon of observability": Friedkin [Fri83] shows that people know about persons up to two social hops away, even though they do not interact directly with them. We interpret this as an indication that it is unlikely for users who are connected by a shortest path of length 3 or longer to have a non-zero social strength between them.

Since the social strength requires searching two hops away, one GeoS peer may not locally have all the required information. This means it will have to use the Gateway to contact GeoSs on other peers as seen in Section 3.2.2. The remote peer will return the normalized weight from the remote user to userB, which can be combined with local data to calculate the path strength. The local GeoS will make sure it is possible for a path going through the remote user to result in a larger path strength than the largest path strength found


locally before sending out the remote request. A social strength request from a user A to any other user will result in a maximum number of remote calls equal to the number of neighbors user A has in the social graph.
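The social strength definition can be sketched on a local graph as follows. This is an illustration, not the GeoS implementation: the graph is assumed to be a nested dictionary graph[a][b][tag] = weight, every lookup is local (no remote requests), and the helper names are hypothetical. The pruning test mirrors the check that a two-hop path can only matter if half of the first hop's normalized weight exceeds the best strength found so far.

```python
def edge_sum(graph, a, b):
    """Sum of the weights of all parallel edges from a to b."""
    return sum(graph.get(a, {}).get(b, {}).values())


def normalized_weight(graph, a, b):
    """edge_sum(a, b) divided by the largest edge_sum from a to any neighbor."""
    largest = max((edge_sum(graph, a, d) for d in graph.get(a, {})), default=0.0)
    return edge_sum(graph, a, b) / largest if largest > 0 else 0.0


def social_strength(graph, ego, alter):
    """Largest path strength from ego to alter over paths of length 1 or 2."""
    best = 0.0
    for b in graph.get(ego, {}):
        nw_ab = normalized_weight(graph, ego, b)
        if b == alter:
            # Direct tie: path of length 1.
            best = max(best, nw_ab)
            continue
        # A two-hop path through b is bounded by nw_ab / 2; skip it if it
        # cannot beat the best strength found so far.
        if nw_ab / 2 <= best:
            continue
        if alter in graph.get(b, {}):
            nw_bc = normalized_weight(graph, b, alter)
            # Path strength: lowest normalized weight divided by path length.
            best = max(best, min(nw_ab, nw_bc) / 2)
    return best
```

In the distributed setting, the call normalized_weight(graph, b, alter) is the value a remote peer would return when the local peer does not hold b's data.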


input : ego, alter
output: A real number between 0 and 1 that quantifies the social strength between ego and alter from ego's perspective

Sum(userA, userB):
begin
    sum ← 0;
    foreach tag in Edges(userA, userB) do
        sum ← sum + Weight(userA, userB, tag);
    end
end

MaxSum(userA):
begin
    max ← 0;
    foreach userB in Neighbors(userA) do
        if Sum(userA, userB) > max then
            max ← Sum(userA, userB);
        end
    end
end

NormalizedWeight(userA, userB):
begin
    Sum(userA, userB) / MaxSum(userA);
end

PathStrength(userA, userB, userC):
begin
    if NormalizedWeight(userA, userB) < NormalizedWeight(userB, userC) then
        NormalizedWeight(userA, userB) / pathLength;
    else
        NormalizedWeight(userB, userC) / pathLength;
    end
end

Social_Strength(ego, alter):
begin
    max ← 0;
    foreach userB in Neighbors(ego) do
        if userB is alter then
            if NormalizedWeight(ego, userB) > max then
                max ← NormalizedWeight(ego, userB);
            end
        else
            foreach userC in Neighbors(userB) do
                if userC is alter then
                    if PathStrength(ego, userB, userC) > max then
                        max ← PathStrength(ego, userB, userC);
                    end
                end
            end
        end
    end
end

Figure 8. Algorithm for Social_Strength (Continued).


Chapter 4: Implementation

GeoS is written in Python and uses some preexisting Python modules. An important such module is NetworkX, which is a graphing module that GeoS uses for handling the directed, multiple parallel edge, graph data structure. Specifically, we use the MultiDiGraph class. The class provides simple to use functions to deal with graph tasks that would be difficult to implement manually using built-in data structures. Adding edges, finding neighbors of a user, and determining the number and type of edges connecting two users are all easily accomplished using this class. The class also provides graph algorithms, such as calculating betweenness centrality, which GeoS does not use currently, but may prove useful for future GeoS algorithms. Some possible uses of these graph algorithms are presented in Chapter 8. NetworkX was chosen over other graphing libraries, like iGraph, because of how well it is documented and because it is written in Python, which allows for easier testing.

Python dictionaries, also known as associative arrays, are used for holding the weight for each edge along with other edge data like the sum of all edges between two users. GeoS also uses dictionaries for storing the GPS location of each user, and the last time each pair of users interacted. The dictionary data structure is a built-in part of Python and simplifies programming in GeoS by allowing it to use usernames as keys.
For remote requests between GeoS nodes, we built a request-handling system. Remote requests are sent any time a peer does not have edge data that it needs to correctly fulfill a request due to it not being a user's trusted peer. Each time a peer sends a remote request, a request number is incrementally generated that will be unique for the local GeoS until twice the maximum integer size requests have been sent. The request number only needs to be unique for the local GeoS because the peer only uses it to identify the result of the remote request when it receives it. After the remote GeoS fulfills the request, it replies with the result and the request number. The local GeoS collects all these replies and then gives them to the function that made the remote requests so that it can finish processing its results. The local GeoS continues waiting for replies until either a timeout value is passed or the request is marked as fulfilled. If a request requires social data more than two hops away, it is possible for a peer to be waiting on a remote peer who is waiting on other remote peers.
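The request-handling scheme described above can be sketched with the standard threading module. This is a minimal illustration, not the GeoS implementation: the class RemoteRequestTracker and its method names are hypothetical, each remote request is assumed to receive exactly one reply, and the wraparound of request numbers is not modeled because Python integers do not overflow.

```python
import itertools
import threading
import time


class RemoteRequestTracker:
    """Hands out locally unique request numbers and collects their replies."""

    def __init__(self):
        self._next_number = itertools.count(1)
        self._lock = threading.Lock()
        self._pending = {}  # request number -> (event, list holding the reply)

    def open(self):
        """Register a new remote request and return its request number."""
        with self._lock:
            number = next(self._next_number)
            self._pending[number] = (threading.Event(), [])
        return number

    def deliver(self, number, result):
        """Called when a reply arrives; returns False for unknown or late replies."""
        with self._lock:
            entry = self._pending.get(number)
        if entry is None:
            return False
        event, box = entry
        box.append(result)
        event.set()
        return True

    def collect(self, numbers, timeout):
        """Wait until every request is fulfilled or the shared timeout passes."""
        deadline = time.monotonic() + timeout
        results = {}
        for number in numbers:
            with self._lock:
                event, box = self._pending[number]
            remaining = max(0.0, deadline - time.monotonic())
            if event.wait(remaining):
                results[number] = box[0]
            # Fulfilled or timed out, the request is no longer pending.
            with self._lock:
                self._pending.pop(number, None)
        return results
```

A function that issues several remote requests would call open once per request, send the numbers along with the requests, and then block in collect; replies that arrive after the timeout are rejected by deliver.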


GeoS is multi-threaded. Connections are listened for with a non-blocking socket because non-blocking sockets are faster than blocking sockets [McM], and the Unix select system call is used for checking when there are incoming connections. Whenever it receives a new connection, the peer spawns a new thread, receives the incoming message, and calls the function that corresponds to the request. We chose threading over using a single thread with asynchronous connections because when a function makes remote requests, it must wait until it receives the results before it can finish processing. It would be much more complicated to call the request function, send out the remote requests, go back into the main receiving loop, have the receiving loop keep checking whether all the remote requests have been fulfilled or timed out while it is still taking new requests, and then go back to finishing the request instead of just running the request in a thread, sending the remote requests, and then putting the request's thread to sleep until all the results are received or the remote requests time out.

GeoS also uses threading to run certain calls on a schedule. One scheduled thread updates the number of requests received each second for logging purposes, another cleans up any remote requests that never received a reply, and finally, a last one runs every fifteen minutes to check if users who have not sent out an update for a long period of time now exceed the threshold needed to send an update. For the latter, the threshold will be lower than last checked due to time passing.


Chapter 5: Experimental Setup

To test the performance and capabilities of GeoS, we executed several experiments. All testing data were either based on social data or were actual social data to make the tests as close to the real world as possible. Several ideas for creating the social graphs to be used in the tests were considered. For performance tests, the frequency of remote input and request messages were based on probability distribution functions (PDF) from data released by previous research.

5.1 Social Graph Generation

When running experiments on the service, a graph that was based on real social data was desired so that the results of the experiments would be similar to what would be seen in the real world. The graph needed to have many users and multiple parallel edges linking users. Data from multiple sources is very useful for testing the features of GeoS because the more known about the users, the more detailed the graph will be and the closer it will be to what a graph made in GeoS will look like in the real world.

5.1.1 Synthetic Graphs

The next idea on how to create a useful social graph was to generate random graphs with properties similar to a social graph. This would allow us to generate a graph that had exactly the desired number of nodes. A modified version of the Nearest Neighbor algorithm was chosen for generating the graphs after seeing that Sala et al. found the algorithm consistently produced graphs with properties similar to real social graphs [SCW+10]. The Nearest Neighbor algorithm was originally presented in [Váz03] and Sala et al. modified the original algorithm in order to bring the properties of graphs it generates closer to those seen in real social graphs. The only feature graphs generated with this algorithm lack is multiple parallel edges between users, but they are still useful for testing. For our tests, we generated a graph with 1000 nodes using the algorithm. Its properties can be seen in Table 3.

We have considered that the algorithm can build a social multigraph while keeping the social graph properties. We propose that each node in the graph can be randomly given a set of real numbers between


Table 3. Properties of Graph Generated Using Nearest Neighbor Algorithm.

Nodes: 1000
Edges: 5784
Average Clustering Coefficient: 0.32
Average Clustering Coefficient of a Random Graph: 0.12
Average Eccentricity: 7.64
Power-law Distribution: 1.39
Average Shortest Path Length: 3.58
Average Shortest Path of a Random Graph: 3.94
Average Degree: 11.57

0 and 1, where each value represents how interested they are in a subject represented by an edge tag. Whenever the algorithm connects a pair of users by an edge, two directed edges are created between them for each tag. The weights of the edges are randomly generated between 0 and their interest in the tag's subject. In this way, if the graph was reduced to just one edge type, it would still be a social graph, but some edges leaving users will have such a low weight that they are almost nonexistent.

5.1.2 NJIT Social Graph

To verify GeoS also works correctly on real social data, a data set that included social information from two sources was tested. The social data consist of one month of Bluetooth collocation data and Facebook friend lists for a set of 104 students at the New Jersey Institute of Technology (NJIT). The data collection took place on this medium-sized urban campus, and the subjects were representative of the various majors offered on campus, with 75% undergraduates and 28% women.

Similar to the Reality Mining traces [EP06], smart phones were distributed to students and an application quietly recorded the Bluetooth addresses of nearby devices and periodically transmitted them to a server. Given that the sample size was small compared to the university population (randomly selected out of 9000 students) and that many students are commuters, the trace data is sparse. For example, about half the subjects reported less than 24 hours of data for the entire month, and only 17% of the scans detected other Bluetooth devices in proximity. The typical user provided a few hours of data per day, especially during the weekdays. The same set of subjects installed a Facebook application to participate in a survey, and gave permission for their friend lists to be collected when they installed this application.
The recorded social data were separated into three categories: pairs of users who were collocated for a total of at least 45 minutes during the study, with at least 15 of those minutes consecutive (CL.45); pairs of users who were collocated for a total of at least 90 minutes during the study, with at least 15 of those minutes consecutive (CL.90); and pairs of users who declared each other friends on Facebook (FB). Each pair in the CL.90 data was also in the CL.45 data. There were 87 users in the FB data, 94 in the CL.45 data, and 81 in the CL.90 data. There were a total of 100 unique users in the three data sets, with 81 appearing in both the Facebook and collocation data.

We created a multigraph based on the NJIT data where edges were labeled "facebook" with weight 0.1 for the FB data, "collocation" with weight 0.2 for the CL.45 data, and "collocation" with weight 0.3 for the CL.90 data. A search of the graph starting from a node in the periphery can be seen in Figure 9, and the graph's properties can be found in Table 4. In Figure 9, black edges are collocation edges with weight 0.3, gray edges are collocation edges with weight 0.2, and dotted edges are facebook edges with weight 0.1.

Table 4. Properties of NJIT Graph.

Nodes:                                              100
Edges:                                              469
Average Clustering Coefficient:                     0.37
Average Clustering Coefficient of a random graph:   0.09
Average Eccentricity:                               4.47
Power-law Distribution:                             1.37
Average Shortest Path Length:                       2.50
Average Shortest Path of a Random Graph:            2.98
Average Degree:                                     9.38

Figure 9. Traversal of NJIT Graph.

5.2 Generation of Workloads

For experimentation, realistic workloads were created for testing the real-world performance of the service. There have been several studies into how often people communicate through electronic sources like Twitter and Facebook [WBS+09, KGA08]. We combined this information with details on the inward and outward connections of the nodes in the graph to create the various workloads. Our assumption, based on findings from [WBS+09], is that the amount and ratio of connections a user has is representative of the user's social activity level.

5.2.1 Input Based on Facebook

We emulate a Facebook social sensor based on a Facebook trace analysis [WBS+09]. We extract workload characteristics such as the probability distribution function and the rate of user activity.

The sensor ranks users into seven groups based on their social degree (in and out degree). Using the cumulative distribution function from the Facebook study (Figure 8 in [WBS+09]), we map each group into a particular probability class, as shown in Table 5. Once the group is selected, a user who has not been selected yet is picked to be the source of the input, S. For the destination of the input, another user is picked from the available social connections of S. The weight x of the input is kept constant at a small value across all inputs for all users. Since the users are picked based on their social degree, users with higher social degree will probabilistically produce more input. We also use the same study's observation that users produce an average of about one Facebook interaction per user per day, which translates into roughly one Facebook interaction every 90 seconds for a user base of 1000 users (86,400 seconds per day divided by 1000 interactions is 86.4 seconds).

Table 5. Probability Distribution Function for Input Using Facebook Data.

% Social Degree Rank   % of Total          Group Assigned   Probability to
(desc. order)          Interaction (CDF)   to User          Choose User (PDF)
5                      40.0                1                0.400
10                     60.0                2                0.200
20                     80.0                3                0.200
30                     90.0                4                0.100
40                     95.0                5                0.050
50                     97.5                6                0.025
> 50                   100.0               7                0.025

5.2.2 Requests Based on Twitter

In order to emulate the workload produced by applications that need neighborhood inferences, we used an analysis of Twitter traces [KGA08]. Intuitively, we associate a tweet in Twitter with a neighborhood request centered at the leader of the tweet in GeoS. From previous Twitter trace analysis, we extract the probability distribution function and the rate of submitted tweets.

We ranked users into three groups based on their social degree (ratio of incoming to outgoing edges). Using the CDF from the Twitter study (Figure 4 in [KGA08]), each group is mapped onto a particular probability to be selected and submit a neighborhood request, as shown in Table 6. Once the group is selected, a user who has not been selected yet is picked to be the source of the request. The number of hops for the request is randomly picked from 1, 2, or 3 hops (the average path length in a social network is typically low, so we assume that an inference of 3 social hops leads to a large enough graph coverage for such a request). From the same study, we assume about one neighborhood inference request every 12 seconds for a trace set of 1000 users.
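The group-based selection of Section 5.2.1 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the workload generator used in the experiments: the function names, the value chosen for the constant weight x, and the decision to skip a time slot when the chosen group has no unselected user left are all our own assumptions. Only the group boundaries and probabilities come from Table 5.

```python
import random

# Table 5: probability of choosing each of the seven degree-ranked groups.
GROUP_PDF = [0.400, 0.200, 0.200, 0.100, 0.050, 0.025, 0.025]
# Table 5: upper rank bound of each group, in percent of users
# ranked by descending social degree.
RANK_BOUNDS = [5, 10, 20, 30, 40, 50, 100]
INPUT_WEIGHT = 0.01  # hypothetical value for the small constant weight x


def interarrival_seconds(num_users, per_user_per_day=1.0):
    """Mean time between inputs for the whole user base."""
    return 86400.0 / (num_users * per_user_per_day)


def assign_groups(degree):
    """Rank users by descending social degree and split them into groups."""
    ranked = sorted(degree, key=degree.get, reverse=True)
    groups, start = [], 0
    for bound in RANK_BOUNDS:
        end = len(ranked) * bound // 100
        groups.append(ranked[start:end])
        start = end
    return groups


def next_input(groups, neighbors, used, rng):
    """Draw a group by the Table 5 PDF, then a source S and a destination.

    Returns (source, destination, weight), or None when the chosen group
    has no unselected user left (an assumption; the text does not say
    what happens in that case).
    """
    group = rng.choices(groups, weights=GROUP_PDF, k=1)[0]
    candidates = [u for u in group if u not in used and neighbors[u]]
    if not candidates:
        return None
    source = rng.choice(candidates)
    used.add(source)
    destination = rng.choice(sorted(neighbors[source]))
    return source, destination, INPUT_WEIGHT
```

The same pattern would apply to the Twitter-based requests of Section 5.2.2, with three groups instead of seven and a hop count drawn from 1, 2, or 3.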


Table 6. Probability Distribution Function for the Neighborhood Request Using Twitter Data.

Followers/Following   Average Number       Group Assigned   Probability to
(In/Out Degree)       of Tweets per Day    to User          Choose User (PDF)
100-1000              86                   1                0.53
10-100                50                   2                0.31
0-10                  25                   3                0.16

5.2.3 Requests Based on BitTorrent

The social strength between two users may be needed as a measure to infer trust between them. For example, in a battery-aware BitTorrent application [KBI09] on mobile devices, users may rely on social incentives to be allowed to temporarily free ride the system when low on battery. Their social contacts will check their social strength with the leecher who is low on battery to decide whether to allow it access to data.

We emulate requests for social strength inference based on an analysis of BitTorrent traces [GCX+05], from which we again extract the probability distribution function and rate of submitted requests. Two users are randomly selected as the source and destination of the social strength inference request. The source is also associated with a number of requests to submit throughout the experiment, using the probability distribution function shown in Table 7. We assume about one social strength inference request every 12 seconds for a trace set of 1000 users.

Table 7. Probability Distribution Function for the Social Strength Request Using BitTorrent Data.

Number of Requests to Submit   Probability to Choose User (PDF)
1                              0.450
2                              0.144
4                              0.178
8                              0.104
15                             0.079
25                             0.026
35                             0.015
45                             0.004

5.3 Testbed

All tests were executed on Linux-based computers. The experiments meant to test the performance of GeoS were done using ten quad-core computers with 4GB of RAM running on a LAN. This was to reduce the possibility of network delays. The experiments that tested the Prometheus architecture as a whole were carried out on ten PlanetLab machines. The specifics of these machines are unknown, but we tried to spread the machines across different locations around the world. Three were in North America, three were in Europe, three were in Asia, and one was in South America. A copy of the application that sent and received requests was on each peer. The source and destination of each request were randomly chosen. Performance tests continually sent neighborhood and social strength requests along with social input, which caused updates to be sent.

We mapped users onto peers using either a uniform or a non-uniform distribution. With either distribution, we kept the number of users on each peer constant. In the uniform distribution, the only concern was giving each peer around the same number of total edges. This resulted in an almost random distribution from the viewpoint of the social graph. The non-uniform distribution instead uses the social graph to guide the placement of users. It works by first picking the ten users with the highest degree and assigning each one to a peer. Whenever a user is assigned to a peer, all of its neighbors are sorted by their degree and added to the end of the queue for the peer. Each peer takes turns removing users from its queue until an unassigned user is removed. This last removed user is then assigned to the peer. Any user who is removed but not assigned is put in a secondary queue. If the main assignment queue empties, the first user in the secondary queue is removed and its neighbors are added to the assignment queue. When assigning users to peers, assigning users close to the first user is the highest priority. The next highest priority is to assign users with the highest number of edges. The overall goal was for peers to have a similar number of total edges while keeping socially connected users on the same peer. In our test using a randomly generated graph, the largest number of edges on one peer was 437, the smallest number on one peer was 338, and the mean number of edges on a peer was 394.4.
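The non-uniform placement described above can be sketched as follows. This is a minimal sketch of our reading of the procedure, not the code used in the experiments. Three details are assumptions because the text does not specify them: ties in degree are broken by user identifier, each user's neighbors are enqueued at most once per peer (to guarantee termination), and a peer whose queues are both exhausted falls back to the highest-degree unassigned user. The name place_users is hypothetical.

```python
from collections import deque


def place_users(neighbors, num_peers):
    """Map each user to a peer index, keeping socially close users together.

    neighbors maps a user to the set of users it shares an edge with.
    """
    degree = {u: len(ns) for u, ns in neighbors.items()}
    by_degree = sorted(neighbors, key=lambda u: (-degree[u], u))
    assigned = {}                                     # user -> peer index
    main = [deque() for _ in range(num_peers)]        # assignment queues
    secondary = [deque() for _ in range(num_peers)]   # removed, already taken
    expanded = [set() for _ in range(num_peers)]      # neighbors enqueued once

    def enqueue_neighbors(peer, user):
        # Assumption: expand each user at most once per peer.
        if user in expanded[peer]:
            return
        expanded[peer].add(user)
        main[peer].extend(sorted(neighbors[user],
                                 key=lambda v: (-degree[v], v)))

    def assign(peer, user):
        assigned[user] = peer
        enqueue_neighbors(peer, user)

    # Seed each peer with one of the highest-degree users.
    for peer, seed in enumerate(by_degree[:num_peers]):
        assign(peer, seed)

    # Peers take turns; each turn ends when one unassigned user is found.
    while len(assigned) < len(neighbors):
        for peer in range(num_peers):
            if len(assigned) == len(neighbors):
                break
            picked = None
            while picked is None:
                if main[peer]:
                    user = main[peer].popleft()
                    if user in assigned:
                        secondary[peer].append(user)
                    else:
                        picked = user
                elif secondary[peer]:
                    enqueue_neighbors(peer, secondary[peer].popleft())
                else:
                    # Assumption: fall back to the best remaining user.
                    picked = next(u for u in by_degree if u not in assigned)
            assign(peer, picked)
    return assigned
```

Because peers take strict turns, every peer ends up with the same number of users (within one), which matches the constraint that the number of users per peer is kept constant.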


Chapter 6: Experimental Evaluation

We performed three classes of experimentation. First, we tested the capabilities of GeoS using some scenarios the service may be used for in the future. Second, we tested the performance of the service. Finally, we analyzed the performance of the entire Prometheus architecture for any observations that can be made about GeoS as a part of it.

6.1 Performance Metrics

There are two main metrics on which the performance of GeoS is judged. In the capabilities test, the metric is simply a qualitative one. For example, do the results follow what would be expected for the graph? For the performance tests, the time to fulfill a request is evaluated.

6.2 Experimental Results

We present the following experimental results. The experiments were conducted using either a graph with 1000 nodes randomly generated using a modified version of the Nearest Neighbor algorithm [SCW+10] or a real social graph built using collocation and Facebook traces from NJIT. The tests were administered on either ten PCs on a LAN cluster or ten PlanetLab machines spread throughout the world.

6.2.1 GeoS Capabilities

We first demonstrate how a social application could exploit the GeoS functionalities and the use of the weighted multigraph by presenting the following toy scenario that uses the NJIT data set.

Students who attend a particular university together are recorded in GeoS as connected by a "collocation" edge with different weights (0.2 for 45 minutes of collocation and 0.3 for 90 minutes of collocation) reported by their Bluetooth social sensor. They may also be Facebook friends, using information reported by a Facebook social sensor.


Table 8. Mean Number of Contacts Returned by a Neighborhood Request Submitted by Ego for Various Combinations of Edges and Weights, With Mean Social Strength Between Ego and the Contacts Within Two Hops in Parentheses.

Social Hops     1          2           3    4    5    6
FB              3 (0.60)   11 (0.39)   23   32   36   37
CL.45           8 (0.71)   38 (0.35)   72   84   87   87
CL.90           3 (0.95)   8 (0.62)    17   29   40   49
CL.45 or FB     9 (0.64)   43 (0.33)   78   90   92   92
CL.45 and FB    1 (0.92)   5 (0.53)    17   27   31   32
CL.90 or FB     5 (0.73)   17 (0.44)   34   49   60   67
CL.90 and FB    1 (1.0)    2 (0.77)    6    12   17   19

Alice has acquired several tickets to the sold-out football game and does not want any of the tickets to go to waste, even if she does not personally know everyone who uses one of her tickets. Since anyone who uses her tickets needs to attend with her, she wants to pick those with whom she has the closest relationship. To aid her in deciding who to give the tickets to, she uses GeoS to assess the social strength of her connection with students to whom she is connected directly or indirectly. First, she submits a neighborhood request for a given number of social hops using particular types of edges and combinations ("collocation" with a weight of 0.2 or 0.3 and "facebook" with a weight of 0.1) to retrieve the students connected directly or indirectly with her (Table 8). When she decides to be strict and find only the students who were collocated with her the longest and are Facebook friends with her, far fewer students are returned. Depending on where Alice is in the social graph, a mean of one or two students within one or two hops are returned, thus making the search focused and useful. Next, she requests from GeoS the social strength between her and each of the students returned by the previous neighborhood request. All she has to do is contact the students with the highest social strength (1.0, even 2 social hops away) to invite them to the game.
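The kind of filtered neighborhood request Alice submits can be illustrated with a small sketch. The graph layout, the function neighborhood, and the predicate helpers below are hypothetical names chosen for illustration; they are not the GeoS request interface. We assume that a CL.45 filter accepts any collocation edge of weight at least 0.2, since every CL.90 pair is also a CL.45 pair.

```python
from collections import deque


def neighborhood(graph, ego, hops, allowed):
    """Users reachable from ego within `hops`, following only user pairs
    whose set of parallel edges is accepted by `allowed`.

    graph[u][v] maps an edge label to the weight of the edge u -> v.
    """
    seen = {ego}
    frontier = deque([(ego, 0)])
    while frontier:
        user, dist = frontier.popleft()
        if dist == hops:
            continue
        for other, edges in graph.get(user, {}).items():
            if other not in seen and allowed(edges):
                seen.add(other)
                frontier.append((other, dist + 1))
    return seen - {ego}


def collocated(min_weight):
    """Accept pairs with a collocation edge of at least min_weight."""
    return lambda edges: edges.get("collocation", 0.0) >= min_weight


def facebook(edges):
    """Accept pairs with a facebook edge."""
    return "facebook" in edges


def both(p, q):
    return lambda edges: p(edges) and q(edges)


def either(p, q):
    return lambda edges: p(edges) or q(edges)


def connect(graph, a, b, **edges):
    """Add the same labeled edges in both directions (toy data only)."""
    graph.setdefault(a, {})[b] = dict(edges)
    graph.setdefault(b, {})[a] = dict(edges)


# A made-up five-student graph, not the NJIT data.
toy = {}
connect(toy, "alice", "bob", collocation=0.3, facebook=0.1)
connect(toy, "alice", "carol", collocation=0.2)
connect(toy, "bob", "dave", collocation=0.3, facebook=0.1)
connect(toy, "carol", "erin", facebook=0.1)

strict = neighborhood(toy, "alice", 2, both(collocated(0.3), facebook))
loose = neighborhood(toy, "alice", 2, either(collocated(0.2), facebook))
```

On this toy graph the strict "CL.90 and FB" filter returns only bob and dave, while the loose "CL.45 or FB" filter returns all four other students, mirroring the trend in Table 8.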
GeoS was able to simplify Alice's task of finding the students with whom she has the strongest relationship. It is intuitive to see that the users she had the strongest weight with from the neighborhood request were the same ones with whom she had the strongest social strength. One exception is that following two-hop Facebook edges of weight 0.1 results in a higher social strength than following two-hop collocation edges of minimum weight 0.2. This is seen because there are fewer Facebook edges than collocation edges, which results in a higher probability of both a Facebook and a collocation edge appearing between two users if there is already a Facebook edge between them.

As the restrictions on edges are lowered, the number of users returned by the neighborhood request increases and the mean social strength of those users decreases. Restricting the edges that can be traversed makes many nodes unreachable, which is especially interesting since the two types of edges used different social sensors. If this test had been done using only Facebook or only collocation data, many social connections would be lost. This observation fits with the claim that only considering one source of social information can result in a loss of important details.

6.2.2 GeoS Performance

The following data are from an experiment executed on the LAN cluster and show the performance of GeoS. Because the LAN has very few network delays and because these values were recorded from within GeoS, the data focus almost solely on the performance of GeoS. During the test, a total of 1,855,737 messages were received by the ten peers running GeoS. There were an average of 8.1 messages received per second and the test continued for a little more than six and a half hours. In the following figures, the Y axis represents how long it took to fulfill a request and the X axis represents the order of incoming messages. The time measured is from when GeoS receives the request until GeoS sends the result to the Gateway.

The results from the neighborhood request will be shown first. The performance results start with requests that were processed completely locally, followed by one-network-hop requests (a remote peer had to be contacted for information), and go up to requests that required two network hops (a remote peer had to be contacted for information and, in turn, the remote peer had to contact a second remote peer).

Figure 10 shows that the requests which were handled locally were completed very quickly. 99% of local requests are completed in less than a millisecond. The mean run time was 0.12 milliseconds, the median run time was 0.11 milliseconds, and the maximum time a request took was 3.86 milliseconds.

Next, Figure 11 shows some interesting information on requests that needed to contact a remote peer. The first thing to notice is that there is a spike at the beginning that is an order of magnitude larger than the rest of the results. This is caused by Ermis first finding which peers are online and which ones are the trusted peers of the requested user. After this initial spike, the running times are low due to Ermis caching which peers are online. Even though there are many new delays added, such as network transfer and the Gateway encrypting results from the remote peer, the mean run time of a request is 0.18 seconds and the median is 0.13 seconds.
Over 90% of these one-network-hop requests are completed in less than a quarter of a second.

Figure 10. Performance Results of Neighborhood Requests that Were Computed Locally.

Figure 11. Performance Results of Neighborhood Requests that Were Computed Using Direct Remote Peers.

Figure 12 shows neighborhood requests where a remote peer is contacted indirectly. The initial spike is seen again, but this time it is even taller. The few lines that reach all the way to 30 seconds mean that a GeoS peer sent out a remote request to a remote peer, but never heard anything back. For testing purposes, GeoS has a very large timeout value of 15 seconds times the maximum number of network hops this remote request could end up taking (15 seconds times two network hops gives the 30 seconds seen here). These 30 second delays are only seen at the very beginning and only a couple of times; this means that a few requests failed at the beginning of the test when a peer or service was not ready to fulfill them. The 20 second spikes show that for a three hop request, Ermis waited for ten seconds at the first network hop and then waited ten seconds again at the second network hop. In Figure 13, we cut off the vertical part of Figure 12 that is above seven seconds in order to zoom in on the results after the initial spike. While the run times are not quite as fast as some seen earlier, 97% are still finished in less than a second. The mean run time is 0.45 seconds and the median is 0.31 seconds. The median is a little more than double that seen in the two hop request, which indicates that there is no major delay added even though many more remote requests are being spawned.

Figure 12. Performance Results of Neighborhood Requests that Were Computed Using Indirect Remote Peers.

Figure 13. Neighborhood Results Zoomed-In on Seven Seconds and Below.

Finally, Figure 14 shows the performance results of social strength requests, while Figure 15 cuts off the vertical part of the graph above one second to zoom in on the results below a second. The social strength performance results are very similar to those seen in the second set of neighborhood results, which makes sense since a social strength request may contact remote peers of connected users as well. It should be noted that social strength requests do not require any remote calls if an indirect connection could not be as strong as one already found in the local graph. A non-uniform distribution was used for this test, which, as will be shown in Section 6.2.3, does increase the probability of an inference request being completed locally. 97.5% of the requests were done in less than a quarter of a second. The mean run time was 0.12 seconds and the median was 0.10 seconds.

Figure 14. Performance Results of Social Strength Requests that Were Computed Using Direct Remote Peers or Completely Locally.

Figure 15. Social Strength Results Zoomed-In on One Second and Below.

6.2.3 Prometheus Performance

The neighborhood request results from the experiments on the LAN cluster and PlanetLab are shown in Figure 16. The experiment measures from when the request leaves the requester until the result is received. As expected, the requests submitted in PlanetLab take longer to finish than in the cluster (approximately 10 times longer). What is interesting is the difference in the response time between the uniform and non-uniform distributions of users onto peers: only about 32% of the requests finish in less than one second in the uniform distribution, compared to 42.7% in the non-uniform.

Figure 16. Fulfilled Neighborhood Requests Versus Time to Deliver Result Over All Social Hops.

In Figure 17 we present the results from PlanetLab, grouped using the number of social hops. The results demonstrate that for one social hop, the user-peer distribution does not affect the time needed to fulfill a request. On the other hand, when the request is for two social hops, the difference is significant: only 16% of the requests finish in less than one second in the uniform distribution versus 33.7% in the non-uniform. When the request is for three social hops, the non-uniform distribution delivers the result about 3% faster than the uniform. We expect socially connected users to share peers, and this sharing of peers clearly provides a performance improvement when a request could need remote information.

Figure 17. Fulfilled Neighborhood Requests Versus Time to Deliver Result Grouped by Social Hops.

The social strength request results from the experiments on the LAN cluster and PlanetLab are shown in Figure 18. We again notice the difference in the performance of the system between the uniform and non-uniform distributions of users onto peers. Only about 16.8% of the requests finish in less than one second in the uniform distribution, compared to 37.3% in the non-uniform. Note the similarity of the results from the social strength requests (Figure 18) and the neighborhood requests for 2 social hops (Figure 17). This is because the calculation of the social strength request goes as far as two social hops.

Single hop requests in these tests have two distinct request completion points, one at 100 milliseconds and another at 1 second. One observation is that these times are much worse than those measured from the perspective of GeoS. This shows that there is some network delay in sending and receiving the request and its result. The 100 millisecond requests resulted from the request initiator, the peer that receives the request, and the peer that needs to fulfill the request being linked through a fast connection (likely from being the same machine or being close geographically). Since several of the peers are on different continents, it does not seem unusual that the other requests take 1 second to complete. GeoS was shown earlier to complete local requests in less than one millisecond on average. The PlanetLab computers are slower than the computers in the earlier test, but even if GeoS were ten times slower in this case, it would account for only about one percent of the overall processing time for most requests (less than 10 milliseconds out of a 1 second request).


Figure 18. Fulfilled Social Strength Requests Versus Average Time to Deliver Result.


Chapter 7: Related Work

Digital services and applications that work with social information are a popular subject currently, as is evidenced by the amount of recent research on the subject. Our exploration of related work first covers general research into the area of social networks, which is then followed by an examination of the many pieces of research that either collect social information or analyze social information collected by others, with us finally looking into applications and services that could make use of a social data storage service.

7.1 Social Networks

Our research on social networks focused on social graphs and social network analysis, but also included basic sociology concepts, such as the formation of social ties. The goal in the research was to gain enough knowledge of the sociology to make sure the design of GeoS stayed in line with generally accepted sociology theories.

The core idea that relationship strength should be based on interactions comes from [Hom51], a classic sociology book, in which Homans states "the more frequently persons interact with one another, the stronger their sentiments of friendship for one another are apt to be." As pointed out by Engeström in [Eng05], objects, such as a job or sports team, are what cause people to interact with each other. It was this idea that spawned the design decision of using a graph with multiple parallel edges between users, where each edge type is a class of object. By combining the ideas from those two sources, it was decided that each edge would have its own weight, allowing for the strength by which one class of object (out of possibly many) relates two users to be found.

[Gra73] is another classic sociology work. In it, Granovetter puts social connections in one of three groups: strong, weak, or absent. He finds that if there is an A-B link and an A-C link then there is a good chance of a B-C link. He also finds that weak ties are more likely to link different small groups than stronger ties. These significant findings are not directly used in GeoS, but they were important to remember while developing the system.


Granovetter also defines the strength of relationships as "a (probably linear) combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie." While some of these factors are likely difficult for a social sensor to calculate, the authors of social sensors should still keep them in mind. Note however that Granovetter later states in the paper that the amount of time invested into a relationship probably defines the relationship strength for some cases. This is much easier for social sensors to track and fits better with Homans' observations [Hom51].

Friedkin reports in [Fri83] that "observability is very unlikely among persons who are three or more steps removed." This is an essential finding for the design of the social strength function. Friedkin also reported in the same paper that observability is more likely with an increased number of paths from the observer to the other user.

In [YHC08], Yoneki et al. look at several data sets from CRAWDAD. The main focus of the paper is on finding different kinds of hub nodes (degree, rank, cross, date, and party). The paper also looks at how removing hub nodes affects epidemic spread. The data sets are represented with undirected, weighted graphs. Hub nodes are not currently considered in GeoS, but [YHC08] revealed how useful the hub nodes can be for conveying information.

7.2 Data Collection

A large part of GeoS is its job of collecting social data from multiple sources. To aid in the design of this, we needed to do much research into the collection of social data. Our research focused on papers that used social information to infer information about the user, analysis of online social networks, and building a social graph using social information.

Lewis et al.'s paper [LKG+08] discusses using Facebook data to determine social relationships. It also gives ideas on how Facebook data can be used for determining a relationship. For example, it points out that if two users are tagged together in a photograph on Facebook, they must have been physically together.
Another analysis of Facebook data occurs in [GWH07], where the Facebook message sending habits of 4.2 million Facebook users are analyzed. Observations of the authors include the following:

- only 15.1% of all pairs of analyzed Facebook friends exchanged messages with each other through Facebook
- users sent an average of 0.97 messages a week
- socializing through Facebook is more likely to happen in a time when the user is less likely to interact with people

All these observations are useful for developing a Facebook social sensor.

Eagle and Pentland use data solely from cell phones to infer friendships and routines in [EP06]. They also describe how cell phones are wearable sensors. In addition, the paper contains statistics on how often cell phones were used for different functions in their study. The idea that routines could be detected allowed us to make the assumption that a social sensor could determine that a particular interaction is just part of a routine. The inference of friendship is also a strong argument that accurate social sensors can be developed.

Matsuo et al. create a social graph based on the appearance of people's names together on web pages in [MMH+06]. A similar system presented in [CBM04] scans through a user's sent and received emails for email addresses and then searches the web for each of these addresses. By doing this, it can find the homepage of the person and from there, the user's contact information can be gathered along with those who are related to them. This allows a social network to be generated. Both papers provide the idea of a social sensor that spiders the web for relationships.

In [SA04], the Enron email data set is analyzed. This data set was used in early experimentation of GeoS because directed edges with realistic weights can be easily generated using the length and number of emails sent between users. Mislove et al. analyze the Flickr, LiveJournal, Orkut, and YouTube social networks along with social graphs based on them in [MMG+07]. This analysis is especially interesting because it shows the graph properties of some real social graphs.
Miluzzo et al. present CenceMe in [MLF+08]. CenceMe is a mobile phone application that uses the many different sensing devices available on a mobile phone (GPS, microphone, accelerometer, Bluetooth, camera) to detect where the owner of the phone is and what the owner is doing. In [PM09], the authors use multiple sensors (GPS, GSM, Wi-Fi, Bluetooth) to determine the places a user visits regularly and what activities the user does there. Their system can also predict activities and the visiting of places. The fact that both sets of authors were able to infer the activity users were doing reinforces the claim that it is possible for a real social sensor to detect what class of activity two users are interacting in regards to.

Wyatt et al. use a microphone on a small device to detect when the wearer is having a conversation in [WBCK08]. The device records conversations in a way that preserves privacy, but still had greater than 90% accuracy in detecting conversations in testing. The paper also found that speech patterns change less when someone is talking to a close relation. This may be useful information for developing collocation or phone call social sensors.


In [NK06], Nicolai et al. have a short discussion about the Bluetooth data the authors collected. The authors also present some of their theories on determining social context using Bluetooth. We expect that Bluetooth will be the best way to find collocated users, and so this discussion is important when developing a collocation social sensor.

The authors of [SR08] find that people who chat with each other are more likely to have similar interests. Also, those people are likely to be the same age and from the same location and of opposite gender. In addition, they find that people who have mutual friends are likely to have similar interests. Neto et al. find that users with similar tagging activity likely share interests in [NCA+09]. These findings may be useful when developing social sensors. For example, maybe the intensity of an interaction should be stronger if the two users who are interacting are less likely to interact.

7.3 Socially-Aware Applications

The overall objective of GeoS is to allow socially-aware applications to query it for the social information of users. An important stage in the design of GeoS was determining what type of social data would be needed by these applications. We generated some of our own ideas for socially-aware applications, but also researched what applications had already been developed.

A context-aware mobile phone is created in [SSF+03]. Basically, the project will turn off a phone's ringer and do other things to the phone based on what context the user is in. It uses sensor data (accelerometer, microphone, light sensor) together with the user's calendar. The paper was of great use while designing GeoS. First, it silences cell phones, which is an early goal that we wanted a socially-aware application that uses GeoS to be able to do. Second, it combines data from multiple sources (cell phone sensors and the user's calendar).
However, the phone silencer only silences an incoming call if the owner is speaking or if their calendar says they are busy at the time. The former could have problems if the owner is in a situation where the phone should be silenced, but the user is not talking (such as a lecture). Also, it could silence an important phone call while the owner is having a casual conversation. The latter requires some work from the user. For these reasons, we feel it would be better for the device to also use the user's social context.

Ostra, a system to reduce spam, uses a social graph to determine the maximum number of messages a user can send before requiring at least one of their sent messages to be marked as a wanted communication [MPDG08]. SybilGuard uses the fact that in the case of a sybil attack, all of the sybil identities will be close to fully-connected with each other in order to reduce the impact of a sybil attack [YKGF06]. Friendstore allows users to back up their data onto the computers of those they have a social relationship with [TCL08]. All of these applications use simple boolean edges for relationships and we believe that they would benefit from having a more complex social graph available to them. For example, Friendstore could possibly allow a remote user A to store more data on the system than remote user B if there existed a stronger social relationship between the owner and A than between the owner and B.

Li and Dabek found that using social relationship information in a peer-to-peer network provides benefits due to users having social incentives to stay online [LD06]. Hogg and Adamic recommend using a weighted social graph for a visible reputation level that would be difficult to spoof [HA04]. RE: uses two-hop relationships to automatically populate email whitelists [GKF+06].


Chapter 8: Future Work

GeoS provides many novel features, but there are still several ideas we would like to research. The inclusion of more social inference functions will support the development of advanced socially-aware applications. GeoS will eventually manage the entirety of a user's social data and to this end, implementing support for the management of social data other than a user's relationships and geo-location is needed. Finally, some questions are still open which require further research.

8.1 Additional Social Inference Functions

Several new social inference functions have been considered, but require further research. First, a more accurate version of the social strength function has been hypothesized. When calculating the strength of indirect links, the current version only considers the strongest path. In addition to the Horizon of Observability, Friedkin observed in [Fri83] that users with more paths linking them to an indirect colleague were more likely to know of the colleague's work. Using this observation, we would like to extend the social strength function to take into account the number of paths linking two users. We propose an algorithm for this function in Figure 19. This algorithm utilizes some ideas from [DGS09], where the authors present an algorithm for calculating the probability of trust between two users who are not directly connected. The algorithm we propose is incomplete, as it is completely untested and we do not completely agree with its results in the case of a path where all the edge weights are 1.0. In such a case, the result is a social strength of 1.0, which seems too high when the users may not even know each other.
Including algorithms that can estimate the graph measurements of a node without requiring the entirety of the graph would add a very powerful feature to GeoS. For example, the betweenness centrality of a node is useful for determining the importance of a node in connecting others, but it requires the entire graph. Because we expect many users will limit their social information to friends of friends, and to refrain from creating a huge amount of network traffic, a graph algorithm in GeoS should not need more than two hops of social information. Another possibility is for the algorithms to be written in such a way that a user could give needed information while retaining its privacy.


input:  ego, alter
output: A real number between 0 and 1 that quantifies the social strength between ego and alter from ego's perspective

Sum(userA, userB):
begin
    foreach tag in Edges(userA, userB) do
        sum ← sum + Weight(userA, userB, tag);
    end
end

MaxSum(userA):
begin
    max ← 0;
    foreach userB in Neighbors(userA) do
        if Sum(userA, userB) > max then
            max ← Sum(userA, userB);
        end
    end
end

NW(userA, userB):
begin
    if userA is userB then
        1.0;
    else
        Sum(userA, userB) / MaxSum(userA);
    end
end

PathStrength(userA, userB, userC):
begin
    if NW(userA, userB) < NW(userB, userC) then
        1 - NW(userA, userB)^(2 / pathLength);
    else
        1 - (NW(userA, userB) * NW(userB, userC))^(1 / pathLength);
    end
end

SocialStrength(ego, alter):
begin
    foreach userB in Neighbors(ego) do
        if alter in Neighbors(userB) then
            product ← product * PathStrength(ego, userB, alter);
        end
    end
    1 - product;
end

Figure 19. Proposed Algorithm for New SocialStrength(ego, alter).
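To make the behavior of the proposed algorithm easier to explore, the following is a minimal executable sketch of Figure 19. It is a direct transcription under several assumptions that the figure leaves open: the running sum starts at 0 and the running product starts at 1, pathLength is fixed at 2 because only two-hop paths are enumerated, and a user with no outgoing weight has a normalized weight of 0. The dictionary layout of the graph and the Python function names are our own.

```python
def edge_sum(graph, a, b):
    """Sum(a, b): total weight of all tagged edges from a to b.

    graph[a][b] maps an edge tag to the weight of the edge a -> b.
    """
    return sum(graph.get(a, {}).get(b, {}).values())


def max_sum(graph, a):
    """MaxSum(a): largest Sum(a, b) over the neighbors b of a."""
    return max((edge_sum(graph, a, b) for b in graph.get(a, {})),
               default=0.0)


def nw(graph, a, b):
    """NW(a, b): weight from a to b normalized by a's strongest tie."""
    if a == b:
        return 1.0
    top = max_sum(graph, a)
    # Assumption: no outgoing weight means a normalized weight of 0.
    return edge_sum(graph, a, b) / top if top > 0 else 0.0


def path_strength(graph, a, b, c, path_length=2):
    """PathStrength(a, b, c): one minus the strength of path a -> b -> c."""
    first, second = nw(graph, a, b), nw(graph, b, c)
    if first < second:
        return 1.0 - first ** (2.0 / path_length)
    return 1.0 - (first * second) ** (1.0 / path_length)


def social_strength(graph, ego, alter):
    """SocialStrength(ego, alter): combine every two-hop path to alter."""
    product = 1.0  # assumption: the figure does not initialize product
    for b in graph.get(ego, {}):
        if alter in graph.get(b, {}):
            product *= path_strength(graph, ego, b, alter)
    return 1.0 - product


# A path whose edge weights are all 1.0, the case discussed in Section 8.1.
all_ones = {"ego": {"b": {"t": 1.0}}, "b": {"alter": {"t": 1.0}}}

# Two independent two-hop paths from ego to alter (made-up weights).
two_paths = {
    "ego": {"b1": {"t": 0.4}, "b2": {"t": 0.8}},
    "b1": {"alter": {"t": 0.5}},
    "b2": {"alter": {"t": 0.3}, "x": {"t": 0.6}},
}
```

With these inputs, social_strength(all_ones, "ego", "alter") evaluates to 1.0, which reproduces the concern raised in Section 8.1, and the second graph shows how each additional path multiplies down the remaining product and so raises the combined strength.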


A final social inference function that has been contemplated is one that automatically infers a user's interests by analyzing the social graph of the user. The basic idea is that the stronger the weights a user has for a particular tag, the more that user must interact regarding that tag. If a user takes part in a particular activity A more than an activity B, the user must be more interested in activity A. Determining interest levels could be as simple as summing all of the weights of each tag a user is directly connected through and comparing each tag's summed weight to each other tag's summed weight.

8.2 Expanded Social Data

There are several pieces of social information that could be managed by GeoS. These include contact information for the user and its friends, schedule (declared and inferred), and group membership (declared and inferred). By storing social information in GeoS, the social information is available in a distributed system and can make use of or be used in social inference functions.

Contact information can have a type of access control list that requires that ego be connected to users by a particular path in order for them to gain access to a certain field of contact information. For example, a user may say that the only contact information it wants given to those it has a work relationship with is its office phone, or a different user may say a social strength of at least 0.3 is required to access its email address. This would greatly simplify the sharing of contact information for users who have multiple forms of communication open to them.
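One possible shape for such an access control list is sketched below. This is purely illustrative and not a design that GeoS implements: the FieldRule type, its two conditions, and the sample phone number and address are hypothetical. Each field of contact information carries an optional required edge tag and a minimum social strength, and a requester sees a field only if both conditions hold.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class FieldRule:
    """Access rule for one field of contact information (hypothetical)."""
    value: str
    required_tag: Optional[str] = None  # requester must share this edge tag
    min_strength: float = 0.0           # requester needs at least this strength


def visible_contact_info(rules: Dict[str, FieldRule],
                         edge_tags: Set[str],
                         strength: float) -> Dict[str, str]:
    """Return the fields a requester may see.

    edge_tags are the tags on the edges linking the owner to the requester,
    and strength is the social strength between them.
    """
    visible = {}
    for name, rule in rules.items():
        if rule.required_tag is not None and rule.required_tag not in edge_tags:
            continue
        if strength < rule.min_strength:
            continue
        visible[name] = rule.value
    return visible


# Made-up contact data for illustration only.
rules = {
    "office_phone": FieldRule("555-0100", required_tag="work"),
    "email": FieldRule("ego@example.org", min_strength=0.3),
}
```

Under these rules a weakly connected coworker would see only the office phone, while a strongly connected hiking partner would see only the email address.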
A schedule is a basic piece of social information, but we propose that since it is possible to infer routines [EP06], a complete schedule for a user could be inferred using past knowledge together with declared events. Inferred events would include a probability that the event will occur, and outside users could query this inferred schedule for times when the user might be available. Events could include tags that are parallel to edge tags and describe the event. Similar to the contact information, a type of access control list can be placed on the schedule so that only certain users would be able to know the user may be available at a particular time. Querying for availability is meant to ease the finding of time slots where multiple users can meet, while filtering access to availability is designed to reduce the chance a user will be asked if they are available at a time in which they are busy or doing an activity the requestor may not be interested in. For example, if the inferred schedule believes there is a 60% chance a user is doing a hiking activity at a given time, the user may only allow strongly connected hikers to see that the user is likely available at that time.

Research has been done into the inference of group membership [GPJB07]. GeoS could use the knowledge of group membership in several ways. It could be used to validate the social graph within GeoS, the social graph in GeoS could be used to validate the group-inference algorithm's findings, or GeoS could use group membership for social inferences.

8.3 Open Questions

GeoS requires social sensors and requires that social sensors provide it accurate input. Exactly how this input is determined requires further research. When designing a set of social sensors, not only must they be able to take interactions and turn them into accurate weights, but the social sensors must also be able to take into account the weight that other social sensors have already added to the graph. It is very much possible that two users will discuss an activity by phone or email and then meet in person to do that activity. This case would result in the phone or email sensor sending an input for the tag associated with that activity, followed by the collocation sensor later sending another input for that same activity tag. One idea is that social sensors will include the current weight of the edge when calculating the weight of a new input. If this idea is feasible, it may reduce the history needed for calculating a weight, but would result in more calls to GeoS. Another idea is that each social sensor will create its own tag that is a subset of another tag (for example, "hiking-email" and "hiking-collocation" would be subsets of "hiking") and GeoS will aggregate these tags in an intelligent way to calculate the superset.

Which tag to attach to an input is also still under consideration. GeoS currently allows any tag to be sent with input, but it can be argued that this is too open because one social sensor may label an activity "work" while another labels the same activity "job". Thus, it may be better to restrict tags to a published set despite making the system a little more closed off. Another idea for tags, which requires a set group of tags, is that they should be arranged in a hierarchy going from specific to general. For example, "baseball" and "football" may be children of "sports", which may itself be a child of "recreation". The weight of the "sports" tag would be based on the weight of its children along with any input sent directly to that tag. The most difficult part of this approach would be determining the weight of tags that have children.
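One way to experiment with the hierarchy idea is to compute a parent tag's weight from its direct input and its children's weights. How these should be combined is exactly the open question, so the default rule below (direct weight plus the sum of the children's weights) is an arbitrary placeholder that can be swapped through the combine parameter. The TagNode class, the tag_weight function, and the sample weights are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class TagNode:
    name: str
    direct_weight: float = 0.0  # input sent directly to this tag
    children: list["TagNode"] = field(default_factory=list)


def tag_weight(
    node: TagNode,
    combine: Callable[[Iterable[float]], float] = sum,
) -> float:
    """Direct input of a tag plus the combined weight of its children.

    `combine` is a placeholder policy (sum by default); choosing it is
    the open question, not something this sketch settles.
    """
    child_weights = [tag_weight(child, combine) for child in node.children]
    if not child_weights:
        return node.direct_weight
    return node.direct_weight + combine(child_weights)


sports = TagNode("sports", 0.1, [TagNode("baseball", 0.3), TagNode("football", 0.2)])
recreation = TagNode("recreation", 0.0, [sports, TagNode("hiking", 0.4)])

summed = tag_weight(recreation)          # children added together
strongest = tag_weight(recreation, max)  # only the strongest child counts
```

Passing max instead of sum makes a parent reflect only its strongest child, which shows how much the result depends on the chosen policy.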
The final open question is how updates should be handled. The current approach is very simple and can result in data being lost due to a race condition or invalid data being propagated through a group of trusted peers. Much research has been done recently into consistency in a peer-to-peer system, but we want to be sure to choose the optimal solution before implementing it in GeoS.

Three solutions for ensuring consistency in a peer-to-peer network have been looked into thus far. These are:

- Balanced Consistency Maintenance (BCoM) Protocol [HFB10]
- WOOT Framework [OUMI06]
- Atlas Peer-to-Peer Architecture [AM06]

With the BCoM Protocol, for each shared object (e.g., a user's SSDF in our case), a dissemination tree structure is formed above the P2P overlay. Updates are then sent to the root of the tree and propagated to the rest of the peers. The WOOT framework is designed for consistency in collaborative editing over a P2P network, but the removal and addition of edges, weights, and other social information seen in GeoS is similar to the removal and addition of characters in text. The framework provides a set of algorithms that can ensure that insertions and deletions are applied in the correct order across the P2P network. The Atlas architecture uses a timestamping service that provides monotonically increasing timestamps to the peers. This allows peers to detect missed updates and apply them in the correct order when they are later received.
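The timestamp-based ordering described for the Atlas architecture can be illustrated with a small buffer that applies updates in timestamp order and holds back any that arrive early. This is a sketch, not the protocol from [AM06]: it assumes the timestamping service hands out consecutive integers starting at 1, so that a gap signals a missed update, and the OrderedApplier class and its method names are hypothetical.

```python
from typing import Callable


class OrderedApplier:
    """Applies updates in timestamp order, buffering those that arrive early."""

    def __init__(self, apply: Callable[[str], None]) -> None:
        self._apply = apply
        self._next = 1                      # next timestamp expected
        self._pending: dict[int, str] = {}  # early arrivals, keyed by timestamp

    def receive(self, timestamp: int, update: str) -> None:
        if timestamp < self._next:
            return  # duplicate of an update that was already applied
        self._pending[timestamp] = update
        # Drain every buffered update that is now contiguous.
        while self._next in self._pending:
            self._apply(self._pending.pop(self._next))
            self._next += 1

    def missing(self) -> list[int]:
        """Timestamps known to have been issued but not yet received."""
        if not self._pending:
            return []
        upper = max(self._pending)
        return [t for t in range(self._next, upper) if t not in self._pending]


log: list[str] = []
peer = OrderedApplier(log.append)
peer.receive(1, "add edge A->B")
peer.receive(3, "remove edge A->C")  # arrives early; update 2 is missing
gap = peer.missing()                 # the peer can now ask for update 2
peer.receive(2, "set weight A->B")   # fills the gap; updates 2 and 3 apply
```

After the last call, log holds the three updates in timestamp order even though they arrived as 1, 3, 2. A real deployment would also need a way to fetch the missing updates from other peers, which this sketch leaves out.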

Chapter 9: Conclusion

We have shown that GeoS can combine social data from many separate sources into a weighted, directed, and labeled multigraph data structure. Thanks to the popularity of ubiquitous computing and online social networks, there are vast quantities of social data easily available to be fed to GeoS.

The social information within GeoS is stored on a decentralized system and allows for inferences of users' social information. The inferred social information can be used by socially-aware applications to offer more sophisticated functionality than they could by relying on an online social network alone. The decentralization of social data increases privacy and helps avoid a "big brother" scenario. We presented the currently available social inference functions and demonstrated their performance.

We also presented applications that could use the results of the social inference functions. In addition, the functionality of the social inference functions was tested. The other services in the Prometheus architecture, of which GeoS is a part, were briefly discussed.

GeoS fulfills our original design objectives of storing social data in a decentralized system, while recording social input from any type of social sensor, and providing a set of social inference functions that can be used by the applications of users.

References

[AHK+07] Ahn, Yong Yeol, Seungyeop Han, Haewoon Kwak, Sue Moon, and Hawoong Jeong: Analysis of topological characteristics of huge online social networking services. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 835, New York, NY, USA, 2007. ACM, ISBN 978-1-59593-654-7.
[AKFI10] Anderson, P., N. Kourtellis, J. Finnis, and A. Iamnitchi: On Managing Social Data for Enabling Socially-Aware Applications and Services. In Social Network Systems 2010. ACM, 2010.
[AM06] Akbarinia, R. and V. Martins: Data management in the APPA P2P system. In Int. Workshop on High-Performance Data Management in Grid Environments (HPDGrid). Citeseer, 2006.
[AYP+05] Anwar, Zahid, William Yurcik, Vivek Pandey, Asim Shankar, Indranil Gupta, and Roy H. Campbell: Leveraging social-network infrastructure to improve peer-to-peer overlay performance: Results from orkut. CoRR, abs/cs/0509095, 2005.
[BF01] Bourgeois, Michael and Noah E. Friedkin: The distant core: social solidarity, social distance and interpersonal ties in core-periphery structures. Social Networks, 23:245, October 2001.
[Bia97] Bian, Yanjie: Bringing strong ties back in: Indirect ties, network bridges, and job searches in China. American Sociological Review, 62:366, 1997, ISSN 00031224.
[BKKS84] Bernard, H. Russell, Peter Killworth, David Kronenfeld, and Lee Sailer: The problem of informant accuracy: The validity of retrospective data. Annual Review of Anthropology, 13:495-517, 1984, ISSN 00846570.
[CBM04] Culotta, Aron, Ron Bekkerman, and Andrew Mccallum: Extracting social networks and contact information from email and the web. In Collaboration, Electronic messaging, Anti-Abuse and Spam Conference. Citeseer, 2004.
[DGS09] Dubois, Thomas, Jennifer Golbeck, and Aravind Srinivasan: Rigorous probabilistic trust inference with applications to clustering. In IEEE/WIC/ACM Conference on Web Intelligence, 2009.
[Eng05] Engeström, Jyri: Why some social network services work and others don't, or: the case for object-centered sociality, 2005. http://bit.ly/aVrQRb, visited on 2010-01-23.
[EP06] Eagle, Nathan and Alex (Sandy) Pentland: Reality mining: sensing complex social systems. Personal and Ubiquitous Computing, 10:255, May 2006, ISSN 1617-4909 (PRINT) 1617-4917 (ONLINE).
[Faca] Facebook: Facebook statistics. http://www.facebook.com/press/info.php?statistics, visited on 2010-02-17.
[Facb] Facebook: Statement of rights and responsibilities. http://www.facebook.com/terms.php, visited on 2010-02-17.
[Fri83] Friedkin, N. E.: Horizons of observability and limits of informal control in organizations. Social Forces, 62:57, 1983.

[GCX+05] Guo, Lei, Songqing Chen, Zhen Xiao, Enhua Tan, Xiaoning Ding, and Xiaodong Zhang: Measurements, analysis, and modeling of bittorrent-like systems. In Internet Measurement Conference 2005, pages 35, 2005.
[GKF+06] Garriss, S., M. Kaminsky, M. J. Freedman, B. Karp, D. Mazières, and H. Yu: RE: reliable email. In Proceedings of the 3rd conference on Networked Systems Design & Implementation, pages 22, 2006.
[GM04] Goecks, Jeremy and Elizabeth D. Mynatt: Leveraging social networks for information sharing. In CSCW '04: Proceedings of the 2004 ACM conference on Computer supported cooperative work, pages 328, New York, NY, USA, 2004. ACM, ISBN 1-58113-810-5.
[GMD06] Gummadi, Krishna P., Alan Mislove, and Peter Druschel: Exploiting social networks for internet search. In Proc. 5th Workshop on Hot Topics in Networks, pages 79, Irvine, CA, 2006.
[GPJB07] Gupta, A., S. Paul, Q. Jones, and C. Borcea: Automatic identification of informal social groups and places for geo-social recommendations. International Journal of Mobile Network Design and Innovation, 2:159, 2007.
[Gra73] Granovetter, Mark S.: The strength of weak ties. The American Journal of Sociology, 78:1360, May 1973.
[GWH07] Golder, S. A., D. Wilkinson, and B. A. Huberman: Rhythms of social interaction: Messaging within a massive online network. In 3rd International Conference on Communities and Technologies, June 2007.
[HA04] Hogg, Tad and Lada Adamic: Enhancing reputation mechanisms via online social networks. In Proceedings of the 5th ACM conference on Electronic commerce - EC '04, page 236, 2004.
[HFB10] Hu, Y., M. Feng, and L. N. Bhuyan: A Balanced Consistency Maintenance Protocol for Structured P2P Systems. 2010 Proceedings IEEE INFOCOM, pages 1, 2010.
[Hom51] Homans, George C.: The human group / by George C. Homans. Routledge & Kegan, London, 1951.
[KBI09] King, Zach, Jeremy Blackburn, and Adriana Iamnitchi: Battorrent: A battery-aware bittorrent for mobile devices. In Proceedings of the 11th International Conference on Ubiquitous Computing, Poster Session, 2009.
[KFA+10] Kourtellis, N., J. Finnis, P. Anderson, J. Blackburn, and A. Iamnitchi: Prometheus: Distributed management of geo-social data. In USENIX NSDI, 2010.
[KGA08] Krishnamurthy, Balachander, Phillipa Gill, and Martin Arlitt: A few chirps about twitter. In WOSP '08: Proceedings of the first workshop on Online social networks, pages 19, New York, NY, USA, 2008. ACM, ISBN 978-1-60558-182-8.
[Kha09] Khan, Urmee: Facebook controversy over right to delete personal information, February 2009. http://bit.ly/2HaBtU, visited on 2010-02-09.
[KRZBS07] Kleck, Christine, Christen Reese, Dawn Ziegerer-Behnken, and S. Shyam Sundar: The company you keep and the image you project: Putting your best face forward in online social networks. In 57th Annual Conference of the International Communication Association, San Francisco, CA, 2007.
[LD06] Li, Jinyang and Frank Dabek: F2f: reliable storage in open networks. In Proceedings of the 5th International Workshop on Peer-to-Peer Systems, February 2006.

[LHP+07] Logan, B., J. Healey, M. Philipose, E. Tapia, and S. Intille: A long-term evaluation of sensing modalities for activity recognition. UbiComp 2007: Ubiquitous Computing, pages 483, 2007.
[LKG+08] Lewis, Kevin, Jason Kaufman, Marco Gonzalez, Andreas Wimmer, and Nicholas Christakis: Tastes, ties, and time: A new social network dataset using facebook.com. Social Networks, 30:330, 2008, ISSN 0378.
[Loc] Locale: Locale for android. http://www.twofortyfouram.com/, visited on 2009-11-12.
[McM] McMillan, Gordon: Socket programming howto. http://docs.python.org/dev/howto/sockets.html, visited on 2009-05-11.
[MLF+08] Miluzzo, Emiliano, Nicholas D. Lane, Kristóf Fodor, Ronald Peterson, Hong Lu, Mirco Musolesi, Shane B. Eisenman, Xiao Zheng, and Andrew T. Campbell: Sensing meets mobile social networks: the design, implementation and evaluation of the cenceme application. In SenSys '08: Proceedings of the 6th ACM conference on Embedded network sensor systems, pages 337, New York, NY, USA, 2008. ACM, ISBN 978-1-59593-990-6.
[MMG+07] Mislove, Alan, Massimiliano Marcon, Krishna P. Gummadi, Peter Druschel, and Bobby Bhattacharjee: Measurement and analysis of online social networks. In Proceedings of the 5th Internet Measurement Conference, October 2007.
[MMH+06] Matsuo, Yutaka, Junichiro Mori, Masahiro Hamasaki, Keisuke Ishida, Takuichi Nishimura, Hideaki Takeda, Koiti Hasida, and Mitsuru Ishizuka: Polyphonet: an advanced social network extraction system from the web. In WWW '06: Proceedings of the 15th international conference on World Wide Web, pages 397, New York, NY, USA, 2006. ACM Press, ISBN 1595933239.
[MPDG08] Mislove, Alan, Ansley Post, Peter Druschel, and Krishna P. Gummadi: Ostra: leveraging trust to thwart unwanted communication. In Proceedings of the 5th Symposium on Networked Systems Design and Implementation, pages 150, Berkeley, CA, USA, 2008. USENIX Association, ISBN 111-999-5555-22-1.
[MS63] Morgan, James N. and John A. Sonquist: Problems in the analysis of survey data, and a proposal. Journal of the American Statistical Association, 58:415, 1963, ISSN 01621459.
[NCA+09] Neto, Elizeu S., David Condon, Nazareno Andrade, Adriana Iamnitchi, and Matei Ripeanu: Individual and social behavior in tagging systems. In HT '09: Proceedings of the 20th ACM conference on Hypertext and hypermedia, pages 183, New York, NY, USA, 2009. ACM, ISBN 978-1-60558-486-7.
[NK06] Nicolai, Tom and Holger Kenn: Towards detecting social situations with bluetooth. In Ubicomp 2006.
[NM09] Neate, Rupert and Rowena Mason: Networking site cashes in on friends, January 2009. http://bit.ly/1aPH, visited on 2010-02-10.
[OUMI06] Oster, G., P. Urso, P. Molli, and A. Imine: Data consistency for P2P collaborative editing. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work, page 268. ACM, 2006.
[PFF+03] Philipose, M., K. Fishkin, D. Fox, H. Kautz, D. Patterson, and M. Perkowitz: Guide: Towards understanding daily life via auto-identification and statistical analysis. In Proc. of the Int. Workshop on Ubiquitous Computing for Pervasive Healthcare Applications (Ubihealth). Citeseer, 2003.

[PM09] Papliatseyeu, Andrei and Oscar Mayora: Mobile habits: Inferring and predicting user activities with a location-aware smartphone. In 3rd Symposium of Ubiquitous Computing and Ambient Intelligence 2008, volume 51 of Advances in Soft Computing, pages 3432. Springer Berlin / Heidelberg, 2009, ISBN 978-3-540-85866-9.
[PRR97] Plaxton, C. Greg, Rajmohan Rajaraman, and Andréa W. Richa: Accessing nearby copies of replicated objects in a distributed environment. In SPAA '97: Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, pages 311, New York, NY, USA, 1997. ACM, ISBN 0-89791-890-8.
[RD01] Rowstron, Antony and Peter Druschel: Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Middleware '01: Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg, pages 329-350. Springer-Verlag, 2001.
[RKCD01] Rowstron, A., A. M. Kermarrec, M. Castro, and P. Druschel: SCRIBE: The design of a large-scale event notification infrastructure. Networked Group Communication, pages 30, 2001.
[SA04] Shetty, J. and J. Adibi: The enron email dataset database schema and brief statistical report. Information Sciences Institute Technical Report, University of Southern California, 2004.
[SCW+10] Sala, Alessandra, Lili Cao, Christo Wilson, Robert Zablit, Haitao Zheng, and Ben Y. Zhao: Measurement-calibrated graph models for social network experiments. In 19th International World Wide Web Conference. ACM, April 2010.
[SR08] Singla, Parag and Matthew Richardson: Yes, there is a correlation: - from social networks to personal behavior on the web. In WWW '08: Proceeding of the 17th international conference on World Wide Web, pages 655, New York, NY, USA, 2008. ACM, ISBN 978-1-60558-085-2.
[SSF+03] Siewiorek, Daniel, Asim Smailagic, Junichi Furukawa, Andreas Krause, Neema Moraveji, Kathryn Reiger, Jeremy Shaffer, and Fei Lung Wong: Sensay: A context-aware mobile phone. In Proceedings of the 7th IEEE International Symposium on Wearable Computers, page 248, Washington, DC, USA, 2003. IEEE Computer Society, ISBN 0-7695-2034-0.
[TCL08] Tran, Dinh Nguyen, Frank Chiang, and Jinyang Li: Friendstore: cooperative online backup using trusted nodes. In SocialNets '08: Proceedings of the 1st workshop on Social network systems, pages 37, New York, NY, USA, 2008. ACM, ISBN 978-1-60558-124-8.
[Váz03] Vázquez, Alexei: Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Physical Review E, 67:056104, May 2003.
[WBCK08] Wyatt, Danny, Jeff Bilmes, Tanzeem Choudhury, and James A. Kitts: Towards the automated social analysis of situated speech data. In UbiComp '08: Proceedings of the 10th international conference on Ubiquitous computing, pages 168, New York, NY, USA, 2008. ACM, ISBN 978-1-60558-136-1.
[WBS+09] Wilson, Christo, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, and Ben Y. Zhao: User interactions in social networks and their implications. In Proceedings of the 4th European conference on Computer systems, pages 205, New York, NY, USA, 2009. ACM, ISBN 978-1-60558-482-9.
[Wel88] Wellman, Barry: Structural analysis: From method and metaphor to theory and substance, pages 19. Cambridge University Press, 1988, ISBN 0-521-24441-2 (HARDCOVER); 0-521-28687-5 (PAPERBACK). ID: 1988-97324001.

[Won10] Wong, Phil: Conversations about the internet 5: Anonymous facebook employee, January 2010. http://bit.ly/7O2Yqf, visited on 2010-02-10.
[YHC08] Yoneki, Eiko, Pan Hui, and Jon Crowcroft: Distinct types of hubs in human dynamic networks. In SocialNets '08: Proceedings of the 1st workshop on Social Network Systems, pages 7, New York, NY, USA, April 2008. ACM, ISBN 978-1-60558-124-8.
[YKGF06] Yu, Haifeng, Michael Kaminsky, Phillip B. Gibbons, and Abraham Flaxman: Sybilguard: defending against sybil attacks via social networks. In SIGCOMM '06: Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications, pages 267, New York, NY, USA, 2006. ACM, ISBN 1-59593-308-5.
[ZC10] Zhang, Jian and Chaomei Chen: Collaboration in an open data escience: A case study of sloan digital sky survey. CoRR, abs/1001.3663, 2010.