Global patterns in coronavirus diversity

Material Information

Title:
Global patterns in coronavirus diversity
Series Title:
Virus Evolution
Publisher:
Oxford University Press
Publication Date:

Record Information

Source Institution:
University of South Florida
Holding Location:
University of South Florida
Rights Management:
This item is licensed with the Creative Commons Attribution License. This license lets others distribute, remix, tweak, and build upon this work, even commercially, as long as they credit the author for the original creation.
Resource Identifier:
2057-1577 ( ISSN )
10.1093/ve/vex012 ( DOI )
K26-05322 ( USFLDC DOI )
k26.5322 ( USFLDC Handle )

Related Items

Related Item:
https://doi.org/10.1093/ve/vex012

USFLDC Membership

Aggregations:
Karst Information Portal

Postcard Information

Format:
Book

Full Text


Global patterns in coronavirus diversity

Simon J. Anthony,1,2,3,* Christine K. Johnson,4 Denise J. Greig,4 Sarah Kramer,1,5 Xiaoyu Che,1 Heather Wells,1 Allison L. Hicks,1 Damien O. Joly,6,7 Nathan D. Wolfe,6 Peter Daszak,3 William Karesh,3 W. I. Lipkin,1,2 Stephen S. Morse,2 PREDICT Consortium,8 Jonna A. K. Mazet,4,† and Tracey Goldstein4,*,†

1 Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, USA
2 Department of Epidemiology, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, USA
3 EcoHealth Alliance, 460 West 34th Street, New York, NY 10001, USA
4 One Health Institute & Karen C Drayer Wildlife Health Center, School of Veterinary Medicine, University of California Davis, Davis, CA 95616, USA
5 Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, USA
6 Metabiota, Inc. One Sutter, Suite 600, San Francisco, CA 94104, USA
7 Wildlife Conservation Society, New York, NY 10460, USA
8 http://www.vetmed.ucdavis.edu/ohi/predict/publications/Authorship.cfm

* Corresponding author: E-mail: sja2127@cumc.columbia.edu; tgoldstein@ucdavis.edu
† Joint senior author.

Abstract

Since the emergence of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) it has become increasingly clear that bats are important reservoirs of CoVs. Despite this, only 6% of all CoV sequences in GenBank are from bats. The remaining 94% largely consist of known pathogens of public health or agricultural significance, indicating that current research effort is heavily biased towards describing known diseases rather than the ‘pre-emergent’ diversity in bats. Our study addresses this critical gap, and focuses on resource poor countries where the risk of zoonotic emergence is believed to be highest. We surveyed the diversity of CoVs in multiple host taxa from twenty countries to explore the factors driving viral diversity at a global scale. We identified sequences representing 100 discrete phylogenetic clusters, ninety-one of which were found in bats, and used ecological and epidemiologic analyses to show that patterns of CoV diversity correlate with those of bat diversity. This cements bats as the major evolutionary reservoirs and ecological drivers of CoV diversity. Co-phylogenetic reconciliation analysis was also used to show that host switching has contributed to CoV evolution, and a preliminary analysis suggests that regional variation exists in the dynamics of this process. Overall our study represents a model for exploring global viral diversity and advances our fundamental understanding of CoV biodiversity and the potential risk factors associated with zoonotic emergence.

Keywords: coronavirus; viral ecology; bat; evolution.

© The Author 2017. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Virus Evolution, 2017, 3(1): vex012
doi: 10.1093/ve/vex012
Research article

Downloaded from https://academic.oup.com/ve/article-abstract/3/1/vex012/3866407/Global-patterns-in-coronavirus-diversity by University of South Florida user on 16 October 2017


1. Introduction

In 2002/3, SARS-Coronavirus (SARS-CoV) emerged in the Guangdong province of southern China (Drosten et al. 2003; Ksiazek et al. 2003). It quickly spread to twenty-seven countries, infecting 8,098 people with 774 deaths, and was declared the first global pandemic of the 21st century. Bats were identified as the reservoir (Lau et al. 2005; Li et al. 2005) and probable source (Ge et al. 2013) of the outbreak. In 2012, the SARS pandemic was followed by MERS-Coronavirus (MERS-CoV), which emerged in the Middle East (Zaki et al. 2012) with 1,782 confirmed human cases and 640 deaths (as of September 2016). Camels were identified as the likely source of human infections (Reusken et al. 2013; Azhar et al. 2014); however, bats were again found to host closely related (MERS-like) viruses and are therefore assumed to be the original evolutionary source (Woo et al. 2006; Anthony et al. 2012; Corman et al. 2014a, b; Wacharapluesadee et al. 2015; Anthony et al. 2017).

Together, these outbreaks have cemented the Coronaviridae as a family of zoonotic concern, and stimulated a surge in viral discovery efforts in bats [reviewed by Drexler et al. (2014)]. These efforts appear to show that almost all human CoVs have zoonotic origins or otherwise circulate in animals, including human 229E [bats (Pfefferle et al. 2009); camels (Sabir et al. 2016)], NL63 [bats (Donaldson et al. 2010; Huynh et al. 2012)], and OC43 [cattle (Vijgen et al. 2005)]. Even non-human CoVs such as porcine epidemic diarrhea virus (PEDV) may have emerged by host switching from other animals [bats (Tang et al. 2006; Huang et al. 2013)]. Overall, it seems that CoV diversity in bats is substantial (Drexler et al. 2014), that these viruses are prone to host switching (Woo et al. 2009), and that they are a current, historic, and future threat to public health.

Despite recent efforts, there are several notable gaps in our understanding of CoV diversity. Foremost, our knowledge of CoVs in resource-limited countries is poor (Drexler et al. 2014).
This is particularly problematic given that many of these same areas are predicted to be hotspots of disease emergence (Jones et al. 2008). Second, there has been little effort to understand the evolutionary and ecological drivers of CoV diversity on a global scale, or to evaluate host and regional variation in the factors that contribute to the risk of emergence (e.g. host switching). Indeed, most studies to date have been somewhat limited in their geographic scope and have included only small sample sizes with little epidemiologic or ecological context. In short, there is a critical need for a more global perspective on CoV diversity.

In 2009, the PREDICT project was established, in part, to address this need. Focused on strengthening capacity and identifying viruses in wildlife at high-risk interfaces (PREDICT_Consortium 2014), this USAID Emerging Pandemic Threats (EPT) initiative worked with local partners in twenty countries across Latin America, Africa, and Asia over 5 years to better understand the current diversity of CoVs (as well as other viruses) and evaluate the factors that drive this diversity at different scales. Herein, we report a large diversity of CoV sequences (mostly from bats), show that the biogeography of bats has shaped the diversity of CoVs globally, and provide evidence to suggest there could be regional variation in host switching and the risk for zoonotic emergence.

2. Methods

2.1 Animals and samples

Bats (n = 12,333), rodents (n = 3,387), and non-human primates (n = 3,470) were humanely sampled (capture and release) from twenty ‘hotspot’ countries (Jones et al. 2008), representing central Africa (Cameroon, Gabon, Democratic Republic of Congo, Republic of Congo, Rwanda, Tanzania, Uganda), Latin America (Peru, Bolivia, Brazil, Mexico), and Asia (Bangladesh, Cambodia, China, Indonesia, Laos, Malaysia, Nepal, Thailand, Viet Nam).
Samples were also collected from humans (n = 1,124) in seven countries in central Africa and Asia as a pilot to begin to explore the propensity for viral sharing with wildlife. In all twenty countries, wildlife samples were collected from ‘high risk’ interfaces, where direct or indirect contact with humans might promote zoonotic viral transmission. The selected sampling sites included areas of land-use change (deforestation, conversion to agriculture); sites in and around human dwellings; foci of ecotourism; markets, restaurants, and farms along the animal value chain; and areas where occupational exposure was likely (animal sanctuaries, agricultural activities). When possible, individuals were identified to lowest taxonomic order (genus and species) and assigned to an age class (adult, subadult, neonate) by the field teams. All samples from swabs (e.g. oral, urine, rectal), fluids (e.g. saliva), and tissues were collected into (1) NucliSens® Lysis Buffer (bioMérieux, Inc., Marcy-l’Étoile, France) and (2) viral transport media, and then frozen in the field in liquid nitrogen and transferred to the laboratory for storage at -80 °C. All animal sampling activities were conducted with permissions from local authorities and under the Institutional Animal Care and Use Committee at the University of California, Davis (protocol number: 16048). All human activities were reviewed and approved by the UC Davis Institutional Review Board (IRB), under protocols: 215253 and 432330.

2.2 Viral discovery

RNA was extracted from all samples, and cDNA prepared using SuperScript III (Invitrogen). Two broadly reactive consensus PCR assays targeting non-overlapping fragments of the orf1ab were used to detect both known and novel CoVs (Quan et al. 2010; Watanabe et al. 2010). The first [the ‘Watanabe’ assay (Watanabe et al. 2010)] amplified a 434 bp fragment of the RNA-dependent RNA polymerase (RdRp) corresponding to nucleotides (NTs) 15,156–15,589 in the human CoV OC43 genome (NC_005147), while the second [the ‘Quan’ assay (Quan et al. 2010)] amplified a 332 bp fragment of a different peptide downstream of the RdRp, corresponding to NTs 18,323–18,654.
Amplified products of the expected size were cloned and sequenced (traditional Sanger dideoxy sequencing) according to standard protocols, and sequences edited manually in Geneious Pro (ver 9.1.3, Biomatters, Auckland, NZ). We note that this approach was adopted to facilitate viral discovery in resource-poor settings, which are the target of pandemic preparedness initiatives such as USAID PREDICT, and where techniques such as high throughput sequencing are largely unavailable.

2.3 Phylogenetics

Coronavirus sequences from the Quan and Watanabe datasets were aligned with reference sequences collected from GenBank using MUSCLE (Edgar 2004), followed by manual alignment in Se-Al v2.0a11 (http://tree.bio.ed.ac.uk). The best-fitting model of nucleotide substitution for each dataset was selected by Akaike’s Information Criterion (AIC) using jModeltest v2.1.5 (Darriba et al. 2012). For both alignments, the general time reversible model with a gamma distribution of rate heterogeneity was the best-fitting model. Maximum likelihood trees were constructed from each alignment using MEGA v6.06 (Tamura et al. 2013). Histograms of all pairwise sequence identities (%) were plotted, as described previously for hantaviruses (Maes et al. 2009), and used to define cut-offs between viral sequence clusters for analysis (akin to operational taxonomic units).

2.4 Diversity measures

The Earth’s surface was divided into grid cells, each 10° latitude by 10° longitude. Thirty-four of these grid cells included areas from which bats were sampled (bats were identified down to the species level in thirty cells), and twenty-seven cells had samples testing positive for CoVs. If the coordinates listed for a particular bat were on the line between grid cells, it was arbitrarily included in the grid cell with higher latitude/longitude. Viral and host species richness for each cell was calculated by determining the total number of unique virus and bat species within each cell. Alpha diversity was calculated using the Shannon index (Jost et al. 2011), which was chosen over the Gini–Simpson index because it does not place disproportionate weight on dominant species. Since we do not know how representative our sampling was, the Shannon index was deemed the best approach. The effective number of viruses and bats in each cell was then correlated using Kendall’s tau, since the values were not normally distributed. Sensitivity analyses were performed by shifting the grid cells in 1° increments and recalculating richness and effective species numbers.
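
The grid assignment, effective number of species, and rank correlation described in Section 2.4 can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names are hypothetical, the Shannon index is computed from a list of per-detection taxon labels (the text does not state how abundances were tallied), and Kendall's tau is computed in its simple tau-a form, which may differ from the tie-corrected variant an R routine would report.

```python
import math
from collections import Counter
from itertools import combinations

def grid_cell(lat, lon, size=10):
    """Index of the size-by-size degree cell containing a point.

    Floor division places a point lying exactly on a boundary into the
    cell with the higher latitude/longitude, as described in the text.
    """
    return math.floor(lat / size), math.floor(lon / size)

def shannon_effective_number(labels):
    """exp(H): the 'effective number' of taxa for a list of taxon labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(h)

def kendall_tau_a(x, y):
    """(concordant - discordant) / number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs
```

Calling `kendall_tau_a` on the per-cell effective numbers of viruses and bats gives the kind of statistic reported in the Results; a P value would additionally need a permutation or exact test.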
Matrices of beta diversity between each combination of grid cells were developed using the Jaccard index (Jost et al. 2011), and Mantel tests were used to determine whether the distance between grids was associated with the dissimilarity in virus and bat species. Geographic distance matrices were calculated by finding the centroid of each grid cell and calculating the geodesic distance between each pair using the R package ‘geosphere’ (Geosphere 2015) and tests were performed using Spearman’s rank correlation. All analyses of diversity were conducted in R version 2.3.2.

2.5 Network model

A presence/absence matrix was constructed to show the distribution of viral sequence clusters across all species. Using the Python package NetworkX (Hagberg et al. 2008), a network model was constructed, connecting bat species to all viral clusters that were identified within that species in our data. The network is made up of thirty-one connected components, eighty-two viral clusters, eighty-five bat species, and 159 edges. Networks were plotted using Gephi using the force-directed algorithm ForceAtlas2 (Jacomy et al. 2014). Specifically, we plotted two bipartite graphs, one where hosts are colored by region and one in which they are colored by family.

2.6 Cophylogeny

The Jane (Conow et al. 2010) software tool was used for cophylogenetic reconstructions. This approach applies an a priori ‘cost scheme’ to different evolutionary events in order to test the degree of congruence between two trees (virus and host). These ‘events’ include cospeciation, duplication, host switching, and failure to diverge (or sorting) (Charleston 1998; Conow et al. 2010), which are used to create a minimal-cost reconstruction of the evolutionary history between the viruses and hosts.
Another cophylogeny program, CoRe-PA (Merkle et al. 2010), uses a parameter-adaptive approach to estimate an appropriate cost scheme without any prior value assignment. For our analysis, we used Jane with the default pre-determined cost scheme where cospeciation is the null hypothesis and, therefore, the ‘cheapest’ event, with the assumption that host switching would theoretically be more costly than genetic drift within a species to which a virus is already adapted. Analysis was repeated with varied cost values in Jane to test for sensitivity to our cost scheme, as well as in CoRe-PA for robustness, and event types and costs from each reconstruction were compared between the two programs. As these changes ultimately did not significantly influence the outcome of the analysis, exact event assignments used in the final analysis were inferred solely from Jane with the default cost scheme.

The analyses were performed using all unique associations between viruses and their respective hosts (limited to bats). Alpha-CoV sequences and beta-CoV sequences were analyzed separately, since each appears to have diversified independently within bats. This gave us a total of four reconstructions (alpha-Quan, alpha-Watanabe, beta-Quan, beta-Watanabe). Only one instance of each virus–host association was included, even if multiple instances of the same association were observed (i.e. our analysis does not account for the frequency of detection). Associations were excluded if the host was not identified to the species level or if the cytochrome B sequence did not exist in GenBank or could not be amplified by PCR. The topology of the virus tree was inferred from the topology of the full tree for consistency, since subsets of sequences often produce different arrangements.
For each reconstruction, the most recent evolutionary event leading to a virus–host association was recorded. Given that sorting is particularly sensitive to missing data (it indicates a loss on the tree when in fact it may simply be an artifact of under-sampling), we have excluded this event from our analysis. We also distinguished between two potential types of host switching, based on an assumption that there could be differences in zoonotic risk between viruses that only move between closely related bats (e.g. those within the same genus) and those that are able to make more substantive jumps into distantly related species (those in other host genera or families). We therefore use ‘host switching’ to refer specifically to viral lineages that have moved into a host belonging to another genus or family, and ‘sharing’ to indicate a viral lineage that has moved into a different host species of the same genus (without speciating). We selected host genus as the cut-off to be consistent with current conventions of taxonomic hierarchy, and verified that the number of species per genus is relatively consistent between regions. We further verified that the mean genetic distance between host species is consistent between regions (i.e. that the taxonomic distance a virus has to navigate when moving between any two species is roughly similar in all three regions).
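
The genus-level distinction between ‘sharing’ and ‘host switching’ defined above can be expressed as a small rule. The sketch below is only illustrative: the function names are hypothetical, hosts are assumed to be given as binomial names whose first word is the genus, and it classifies a single donor/recipient pair rather than reproducing how Jane assigns events on a tree.

```python
def genus_of(binomial_name):
    """Return the genus, assumed to be the first word of a binomial name."""
    return binomial_name.split()[0]

def classify_transfer(donor_species, recipient_species):
    """Label a movement of a viral lineage between two host species.

    'sharing'        : different species within the same genus
    'host switching' : recipient belongs to another genus (or family)
    """
    if donor_species == recipient_species:
        return "same host"
    if genus_of(donor_species) == genus_of(recipient_species):
        return "sharing"
    return "host switching"
```

For example, a lineage moving from Pteropus medius to Pteropus alecto would be labeled ‘sharing’, whereas a move from Pteropus medius into a Rhinolophus species would be labeled ‘host switching’.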
A binary regression model with logistic link function was used to investigate whether the degree of host switching varied by region. The dependent variable was host switching, and either cospeciation or sharing was the reference group. Region was the independent variable. We used the generalized estimating equations (GEE) (Liang and Zeger 1986) method in order to account for multiple event types within one family and derive a robust estimate of the standard errors. Event data for all four reconstructions were aggregated for the analysis. Where host–virus associations were duplicated, one was removed. If the duplicated event types did not agree [which can occur given the sensitivity of these methods to variation in topology and number of taxa in each tree (Conow et al. 2010)], we randomly selected one of the two possibilities and repeated the randomization 100 times. The GEE estimates from the 100 random iterations were then combined using rules established by Rubin (1987). The analysis was repeated to explore variation based on bat family (independent variable). Some families only possessed one type of event, which would introduce quasi-complete separation into the GEE model, so they were excluded from the corresponding models. However, the numbers of bat species within these excluded families were very small (<5) compared to the overall number of bat species for the whole study, so we felt this would not impact any overall trends.

2.7 Viral discovery effort

The relationship between sampling effort and viral discovery was evaluated using a log-link Poisson regression model, fitted with the count of viruses for each species as the dependent variable and the number of animals sampled as the independent variable. This model is widely applied to count data. The coefficient of the sample number variable is the log of the ratio of the expected number of viruses with (n + 1) samples collected to the expected number of viruses with n samples collected, where n can be any arbitrary positive integer. Using this estimated coefficient, we calculated the estimated numbers of viruses (viral sequence clusters) that can be found with respect to the sampling effort (i.e. all possible numbers of collected animals within a species), and they were then used to plot the estimated curve together with its 95% confidence band. We clarify that we use the term ‘expected’ to indicate the statistical expectation based on the estimate of the regression model. We acknowledge that the model fit is not as good as the model with an additional quadratic term of the independent variable; nonetheless, it avoids the problem of over-fitting and realistically reflects the relationship between sampling effort and viral discovery when the number of animals collected is relatively lower.

2.8 Factors associated with CoV positivity

Due to the small numbers of positive individuals obtained for rodents, non-human primates, and humans, only bat data were analyzed for factors associated with positivity. A total of 12,333 bats were tested. This included 5,624 males and 5,767 females (922 were not sexed). Among aged bats 7,385 were adults, 1,029 were sub-adults, and 8 were neonates (3,911 were not aged).
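
The pooling of GEE estimates across the 100 random iterations (end of Section 2.6) follows Rubin's rules, which can be sketched as below. This is a generic illustration under stated assumptions: the function name is hypothetical, the inputs are taken to be one coefficient (for example a log odds ratio) and its standard error per iteration, and nothing here reproduces the authors' actual software or values.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Combine m estimates of one parameter using Rubin's rules.

    pooled estimate = mean of the m estimates
    total variance  = W + (1 + 1/m) * B
      W = mean within-iteration variance (squared standard errors)
      B = between-iteration sample variance of the estimates
    Returns (pooled_estimate, pooled_standard_error). Requires m >= 2.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)
```

If the pooled coefficient is a log odds ratio, exponentiating it (and the ends of a normal-approximation interval around it) returns the result to the odds-ratio scale used in the Results.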
Binary logistic regression models were used to evaluate significant variables, restricting analyses to species for which ≥50 individuals were tested. Model selection was first performed in R (version 3.2.5) (R Foundation for Statistical Computing, 2008) using the R package ‘gmulti’ and the step function, and the best model determined by AIC. A robust estimation of standard error was calculated by clustering by individual Animal Identification number to account for non-independence of multiple tests from the same animal using STATA 13.1, SE (College Station, TX, USA). The independent variables included in analyses were Specimen Type (blood, feces/rectal swabs, guano, oral/rectal sample, oral/nasal swabs, tissue, urine/urogenital swabs), Host Taxon (family, sub-family, genus), Host Age (adult, sub-adult, neonate), Season (wet, dry), and Interface (animal use, human activity, land use, pristine area). Season was designated as ‘wet’ or ‘dry’ according to month and proximity to the equator for each country. Interfaces were broadly grouped according to the type and intensity of animal contact with people, specifically (1) Land Use (animals sampled in areas with crops, extractive industries, livestock activities), (2) Animal Use (hunting, markets, restaurants, trade, wild animal farms, wildlife managements, zoos/sanctuaries, handling by veterinarian/researchers), (3) Human Activities (ecotourism, in and around human dwellings), and (4) Pristine (where animal–human contact was not likely). The model fit for each region was evaluated using the Hosmer–Lemeshow goodness-of-fit test and pseudo R2 values.

2.9 Virus ‘Hotspot’ Maps

The number of viral sequence clusters (richness) in each sub-clade was summarized by host family (Supplementary Table S1), and chi square tests used to evaluate whether viral sub-clade and host family were independent of each other. Bat species richness maps were then created to infer hotspots of CoV diversity. For each viral clade, spatial distribution data on all bat species belonging to associated families was obtained from the International Union for Conservation of Nature (IUCN). Using ArcGIS version 10 (ESRI 2011), bat species were quantified by counting overlapping polygons. The resulting count data were then converted to a raster and plotted. All sampled bats, as well as all identified sequence clusters within the clade in question, were then plotted over the raster data.

3. Results

3.1 Global diversity of CoVs

A total of 19,192 animals and humans were assayed for the presence of CoV by consensus PCR (cPCR) (Table 1). The majority were bats (n = 12,333), representing 282 species from twelve families. Overall, the proportion of CoV positive individuals was 8.6% in bats (n = 1,065/12,333) and 0.2% in non-bats (n = 17/6,859). In other words, over 98% of all positive individuals were bats.

Partial sequences were obtained from two non-overlapping fragments of the orf1ab gene, yielding 654 sequences for the ‘Quan’ region and 950 for the ‘Watanabe’ region (see ‘Methods’). There was a 27% overlap, where sequences were obtained for both regions from the same sample. The distribution of pairwise sequence identities defined a 90% cut-off between taxa (Fig. 1), and the resulting monophyletic groups (in which all sequences had ≥90% identity) were used as our operational taxonomic units. Groups that shared less than 90% identity to a known sequence were labeled sequentially as PREDICT_CoV-1, -2, -3 etc., while groups sharing ≥90% identity to a sequence already in GenBank were considered to be strains of a known virus and assigned the same name as the matching sequence (e.g. Kenya_bat_CoV_KY33 or HKU-9). Sequences that shared >90% identity but were found in different hosts were considered part of the same taxonomic unit. Based on these criteria, 100 discrete viral taxa were identified, ninety-one of which were found in bats (Table 1; Fig. 2). Importantly, we make no claims that these groups correspond to ‘species’ as full orf1ab replicase sequences would be required for that (King et al. 2012). Instead we clarify that we are using these partial fragments to cluster sequences into ‘operational taxonomic units’ (analogous to OTU clustering in microbiome research). We also stress that our cut-off was determined based on two discrete regions within the orf1ab separated by >3,000 nucleotides, and that these regions correspond to unique peptides post-translational cleavage.
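
The 90% identity criterion for operational taxonomic units can be illustrated with a short check on aligned sequences. This is a sketch under assumptions rather than the authors' procedure: the function names are hypothetical, sequences are assumed to be pre-aligned to equal length, alignment columns containing a gap in either sequence are skipped (the text does not say how gaps were treated), and monophyly is not tested here.

```python
from itertools import combinations

def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned sequences, skipping gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

def all_pairs_meet_cutoff(sequences, cutoff=90.0):
    """True if every pair in the group shares at least `cutoff` percent identity."""
    return all(
        percent_identity(a, b) >= cutoff
        for a, b in combinations(sequences, 2)
    )
```

Collecting `percent_identity` over all pairs gives the values that would be binned into a histogram like the one used to choose the cut-off.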
Of note was the detection of subgroup 2a CoV sequences in bats, humans and non-human primates from four different countries. This sub-clade is largely considered the rodent sub-clade (Lau et al. 2015; Wang et al. 2015; Tsoleridis et al. 2016), but here we detected sequences corresponding to the known virus betacoronavirus-1 in Pteropus medius from Bangladesh, Pteropus alecto from Indonesia, and from humans in China, and sequences corresponding to murine coronavirus in non-human primates from Nepal (these sequences were detected by four different laboratories, and no rodent samples were processed at the same time) (Fig. 2). In addition, we report the discovery of several sub-clade 2b and 2c CoV sequence clusters (the SARS and MERS sub-clades, respectively) (Fig. 2), and the detection of human coronavirus 229E-like sequences in Hipposideros and Rhinolophus bats sampled in the Republic of Congo (ROC), Uganda, Cameroon and Gabon, supporting previous suggestions that 229E has zoonotic origins (Pfefferle et al. 2009; Hu et al. 2015). Finally, we highlight the detection of avian-associated infectious bronchitis virus-like sequences (IBV) in bats and a closely related virus (PREDICT_CoV-49) in non-human primates from Bangladesh, as well as porcine epidemic diarrhea virus (PEDV) in bats (Fig. 2). Again, we confirm that no livestock samples were processed in any of these laboratories that might have acted as a source of contamination.

3.2 Factors driving viral diversity

Viral α-diversity (the ‘effective number’ of species) was significantly correlated with bat α-diversity (tau = 0.325, P = 0.022) (Fig. 3), suggesting that more viruses will be found in regions where bat diversity is higher. This association was maintained when viral richness, rather than effective number of species, was used as the diversity index (tau = 0.421, P < 0.001). Viral and bat β-diversity were also significantly correlated (Mantel test; rho = 0.575, P < 0.001). In both cases, diversity differentiated almost entirely into three discrete communities by region (Fig. 3). Only Africa and Asia shared any viral sequence clusters (PREDICT_CoV-35 and HKU9; Fig. 4), demonstrating that bats have had a strong biogeographic influence on the global ecology and evolution of CoVs.
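
The β-diversity comparison reported above rests on Jaccard dissimilarities between grid cells and a Mantel test using Spearman's rank correlation. The sketch below shows one way those pieces fit together; it is not the authors' R workflow. All function names are hypothetical, a haversine great-circle distance stands in for the geodesic distance computed with ‘geosphere’, and the P value is a one-sided permutation estimate. The correlation helper assumes neither distance vector is constant.

```python
import math
import random

def jaccard_dissimilarity(taxa_a, taxa_b):
    """1 - |A and B| / |A or B| for the taxa found in two grid cells."""
    a, b = set(taxa_a), set(taxa_b)
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))

def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Mantel test on two square distance matrices (lists of lists).

    Rows and columns of dist_b are permuted together; the P value is the
    share of permutations with rho at least as large as the observed rho.
    """
    n = len(dist_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    a = [dist_a[i][j] for i, j in pairs]
    observed = spearman(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [dist_b[order[i]][order[j]] for i, j in pairs]
        if spearman(a, permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

In use, one matrix would hold Jaccard dissimilarities between cells and the other the distances between cell centroids; the same call can compare the viral and bat dissimilarity matrices directly.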
Co-phylogeneticreconciliationanalysiswasusedtoinvestigatetheevolutionarymechanismsdrivingthevirus–hostassociationsobserved,forexamplehostswitching,viralsharing,or co-speciation(see‘Methods’).Overall,hostswitchingwasthe dominantmechanism,followedbyco-speciation( Fig.5 ).When analyzedbyregion,hostswitching(inter-genustransmission) remaineddominantinAfricaandAsia,butnotinLatin America.Despitefewerhost-switchingeventsinLatinAmerica, therewasaconcomitantincreaseinvirussharing(intra-genus transmission).Thissuggeststhatvirusesarestillswitching hostsinLatinAmericabutpreferentiallymovebetweenclosely relatedspecies,whileinAfricaandAsiavirusescrossbetween moredistantlyrelatedspecies.Toassessthesensitivityofour resultstoourchosenmethod(i.e.useofapre-definedcost schemeintheprogramJane),werepeatedtheanalysisforthe Quanalpha-CoVandbeta-CoVreconstructionsusingCoRePA,a parameter-adaptiveapproachthatestimatesanappropriate costschemewithoutanypriorvalueassignment.Comparing theresults,wenoted91%agreementbetweenJaneandCoRePA inalpha-CoVreconstruction(thirty-fourevents,ofwhichthirtyoneagreed),and100%agreementinthebeta-CoVreconstructions(twenty-sixevents,allofwhichagreed).Wefurthernote thatthecostschemecalculatedinCoRePAwas0.7forhost switchingand0.1forco-speciation—supportingthedefaultcost schemeusedinJane(i.e.thathostswitchingshouldbeconsideredmore‘costly’thanco-speciation). Abipartitenetworkmodelconnectingviralsequenceclusterstotheirhostssupportstheseresults,illustratingthatbat CoVswereconnectedtoseveralhostfamiliesinAfricaandAsia whilebeinglargelyrestrictedtoasinglebatfamilyinLatin Table1. Summaryofindividualstestedandnumberpositiveforatleastonecoronavirusbyhosttaxa. HosttaxatestedNo.individualstestedNo.individualspositiveNo.distinctvirusesdetected Bats12,3331,06591 Non-HumanPrimates3,47042 RodentsandShrews3,387117 Humans1,12422 Total19,1921,082100aaNote:Numbersdonottotalastwovirusesweredetectedintwotaxa. Figure1. 
HistogramoftherelativefrequencyofpairwisesequenceidentitiesusedtodenethecutoffbetweenoperatingtaxonomicunitsforallCoVsequences detected.Abimodaldistributionwasobservedandacutoffof90%sequenceidentityusedtoseparatesequencesfromboththeWatanabe(PanelA)andQuan( PanelB)assaysintodiscreteviraltaxa. S.J.Anthony etal. | 5 Downloaded from https://academic.oup.com/ve/article-abstract/3/1/vex012/3866407/Global-patterns-in-coronavirus-diversityby University of South Florida useron 16 October 2017

PAGE 6

Table2. VariablesassociatedwithcoronaviruspositivebattestsinAfrica,Asia,andLatinAmerica.Regionaldatasetsforeachlogisticregressionmodelconsistedofbatspecieswheregreaterthanftybatsweretested.Agedatasetswereasubsetoftheregionaldatasetsasageclass wasnotdeterminedforallindividuals.Numbersinboldarestaticallysignicantfor P < 0.05. Oddsratio P 95%Confidenceintervals Africa Sampletype LowerUpper Feces/rectalswab 350.98 < 0.001146.64840.07 Oral/nasalswab 5.05 < 0.0012.0312.54 Oral/rectalswab 30.98 < 0.00113.4671.30 Blood1.650.5400.338.09 Tissue1.00 Hostfamily Hipposideridae1.220.5590.622.41 Pteropodidae3.84 < 0.0012.276.49 Molossidae1.00 Season Dry2.42 < 0.0011.793.25 Wet1.00 Interface Animaluse1.98 0.0011.352.91 Pristinearea1.390.6240.375.29 Landuse1.660.2190.743.74 Humanactivity1.00 Age Subadult 5.91 < 0.0014.288.17 Adult1.00 Asia Sampletype Feces/rectalswab 62.80 < 0.00115.44255.49 Oral/nasalswab 13.25 < 0.0013.1156.38 Urine/urogenitalswab 46.92 < 0.00110.85202.80 Guano 22.12 < 0.0013.66133.56 Tissue,Oral/rectalswab1.00 Hostfamily Emballonuridae0.310.260.042.37 Miniopteridae 9.78 < 0.0015.6916.81 Pteropodidae 2.16 < 0.0011.293.62 Rhinolophidae1.080.820.572.03 Vespertilionidae 3.67 < 0.0012.186.18 Hipposideridae1.00 Season Dry 1.490.031.032.16 Wet1.00 Interface Animaluse 3.300.011.298.40 Humanactivity 3.480.011.368.95 Pristinearea1.810.350.526.35 Landuse1.00 Age Subadult 1.840.001.262.67 Adult1.00 LatinAmerica Sampletype Feces/rectalswab 15.660.0005.0248.79 Oral/rectalswab 27.860.0006.86113.22Oral/nasalswab1.00 Hostsubfamily Carolliinae6.950.060.9053.76 Stenodermatinae5.040.120.6737.89 Glossophaginae1.00 Interface Animaluse 5.250.011.4818.65 Humanactivity2.730.120.779.69 Landuse4.220.100.7822.95 Pristinearea1.00 Season Dry1.300.520.582.89 Wet1.00 Age Subadult1.730.150.833.62 Adult1.00 6| VirusEvolution ,2017,Vol.3,No.1 Downloaded from https://academic.oup.com/ve/article-abstract/3/1/vex012/3866407/Global-patterns-in-coronavirus-diversityby University of South Florida 
useron 16 October 2017

PAGE 7

America (Fig. 4, Panel B). Further, host switching was nearly four times more likely than virus sharing in Africa, when compared with Latin America (OR: 3.858; P = 0.040). CoVs in Asia were also more likely to host switch than share when compared with Latin America; however, the results were not significant (OR: 2.474; P = 0.143). No particular bat family was associated with increased host switching, but this may reflect small sample sizes when the data are summarized by family.

3.3 Estimated number of CoVs in bats

A large number of host (bat) species in our study were negative for CoV (n = 197); however, none of these species were sampled extensively (Fig. 6). We found that all species with sample sizes > 110 individuals were positive for one or more CoVs, suggesting that we would have detected CoVs in some of the negative species in our study if sampling effort were increased. Due to the high number of these negative species and their effect on the average number of virus sequence clusters per species, they were excluded from our estimates. By retaining only those species for which > 110 individuals were sampled (there were twenty-seven species that qualified), we estimated the average number of CoVs per species to be 2.67 (std = 1.38), thus accounting for the likely scenario that some species will have more viruses and others less. We then extrapolated the average to all 1,200 bat species to estimate a total potential richness of 3,204 CoVs (range 1,200–6,000 CoVs), most of which have yet to be described.

3.4 Factors predicting CoV positivity

In order to refine future surveillance efforts to find the undiscovered diversity of CoVs in bats, we evaluated the factors predicting CoV positivity. For each region, the best model included specimen type, bat family (or in the case of Latin America, subfamily), season, and animal–human interface (Table 2). Season was not included in the top model for Latin America, but the model with season included was within two AIC units (delta AIC = 1.06) and therefore considered to have as much support as the top model (Burnham and Anderson 2002). Specimen type was highly associated with CoV positivity, and samples containing feces or fecal swabs were significantly more likely to test positive than other specimen types in all regions. Bat family was important in Asia and Africa. Season was also significant, with samples collected during the dry season more likely to test positive in Africa and Asia than those collected during the wet season (albeit with low odds ratios). Age class appears to be important, as sub-adults were more likely to test positive than adults in Africa and Asia. Sex was not related to CoV positivity in Asia or Latin America, while males were slightly more likely to be CoV positive in Africa (OR = 1.5, CI 1.2–1.9, P < 0.001). Broad categories of animal–human interfaces were significantly associated with CoV positivity among bats in all three regions: in Africa and Latin America, bats sampled at the animal use interface category were more likely to be positive, and in Asia, bats sampled at the animal use and human activities interfaces were more likely to be positive than those sampled at other interfaces. The significance of the animal interface in all three regions suggests that practices around animal use may be important for disease transmission.

Figure 2. Maximum likelihood phylogenetic reconstructions for all partial CoV RdRp fragments for both the Quan (Panel A) and Watanabe (Panel B) assays. Sequences are collapsed into clades, representing our operational taxonomic units (sequences sharing ≥90% identity), and the number of sequences for each taxon is indicated in parentheses. Representative published sequences from GenBank have been included for comparison (GenBank accession number indicated). Both trees were rooted using the related Breda virus (NC_007447) and all nodes have ≥60% bootstrap support. Pie charts indicate the distribution of viral taxa by host (bat) family, in each virus subclade.

Figure 3. Comparison of viral and bat diversity. The earth’s surface was divided into grid cells by latitude and longitude (10 × 10 degree units) for diversity calculations (Panel A). Grid cells where bats were sampled are numbered in each region. Alpha diversity (Shannon H) for virus (Panel B) and host (Panel C) were correlated, indicating that areas of high bat diversity also have high viral diversity. Darker cells indicate higher alpha diversity (i.e. more viral or host taxa) in each grid cell. Beta diversity was also correlated between virus (Panel D) and host (Panel E), and differentiated into three discrete communities by region: Latin America (grid cells 1–10), Africa (grid cells 11–20), and Asia (grid cells 21–34). Shading indicates that either viruses (in red) or hosts (in blue) are shared between two corresponding grid cells, with darker cells indicating higher pairwise similarity.

Figure 4. Network model showing the connection of CoVs and their hosts. Viral sequence clusters (colored grey) are connected to host species, either by region (Panel A) or family (Panel B). Viral and host communities separate almost entirely by region; only Africa and Asia are connected by two shared viruses (HKU9 and PREDICT_CoV-35) found in species from both continents. Networks also show that viruses appear to be shared by multiple host families in Africa and Asia, while being more restricted to a single family in Latin America.

3.5 Future sampling effort

Based on our study, we were able to estimate the sampling effort that would be required to find all CoVs in bats, based on a Poisson regression model of our data (Fig. 6). With 154 individuals, we estimate that an average of one CoV will be detected (95% CI 136–177). With 397 individuals, we estimate that up to five CoVs will be detected (95% CI 351–458). As we did not observe any species with greater than five viral sequence clusters, we make the assumption that sampling 397 individuals should capture the full diversity of CoVs in each bat species.
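As a rough illustration of the sampling-effort calculation in Section 3.5, the sketch below fits a Poisson regression of viral clusters detected per species on sampling effort, then inverts the fitted mean to ask how many individuals correspond to a given expected number of CoVs. The text does not report the covariate form, the coefficients, or the fitting routine, so the log link with a single linear term, the hand-written iteratively reweighted least squares (IRLS) loop, the function names, and every data value are assumptions made for illustration. The output is not expected to reproduce the 154 and 397 thresholds.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by iteratively reweighted least squares."""
    # Common GLM starting values; the small offset keeps log() finite for y = 0.
    mu = [yi + 0.1 for yi in y]
    b0 = b1 = 0.0
    for _ in range(iterations):
        eta = [math.log(m) for m in mu]
        # Working response and weights for the log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Closed-form weighted least squares of z on x (two parameters).
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        b1 = (sw * swxz - swx * swz) / det
        b0 = (swz - b1 * swx) / sw
        mu = [math.exp(b0 + b1 * xi) for xi in x]
    return b0, b1


def individuals_needed(target, b0, b1):
    """Invert the fitted mean: effort at which the expected count equals target."""
    return (math.log(target) - b0) / b1


# HYPOTHETICAL data, not taken from the study:
# sampling effort per species (hundreds of individuals) and clusters detected.
effort = [0.2, 0.5, 1.1, 1.5, 2.0, 3.0, 4.0]
clusters = [0, 0, 1, 1, 2, 3, 5]

b0, b1 = fit_poisson(effort, clusters)
n_one = 100 * individuals_needed(1, b0, b1)
n_five = 100 * individuals_needed(5, b0, b1)
print(f"individuals for one expected CoV: {n_one:.0f}")
print(f"individuals for five expected CoVs: {n_five:.0f}")
```

In practice a statistics package would normally be used for the fit and for the confidence intervals; the explicit loop is shown only to make the steps visible.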


Figure 5. Relative proportion of evolutionary events leading to observed virus:host associations for CoVs in bats. Cophylogenetic reconstructions were used to identify each event (Supplementary Fig. S1: Panels A–D), and significance evaluated by region. Across all regions, host switching was the dominant evolutionary event (Panel A). When separated by region, host switching remained dominant in Africa (Panel B) and Asia (Panel C), but not in Latin America (Panel D).

Figure 6. Relationship between sampling effort and viral detection among all bat species sampled. A Poisson regression model was used to estimate the expected number of viruses based on the number of animals sampled, by species (each circle indicates a separate species). The 95% confidence intervals are indicated.
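The richness extrapolation in Section 3.3 is simple arithmetic, sketched below using only figures reported in the text: a mean of 2.67 CoVs per well-sampled species, 1,200 bat species, and between one and five viral clusters per positive species. Reading the reported range as those one-to-five bounds multiplied by the species count is an interpretation that happens to match the published numbers, not a method the text states; the function and variable names are illustrative.

```python
def extrapolate_richness(covs_per_species, n_species):
    """Scale a per-species CoV count to a number of bat species."""
    return round(covs_per_species * n_species)


N_BAT_SPECIES = 1200          # bat species, as used in the text
MEAN_COVS_PER_SPECIES = 2.67  # mean over the 27 species with > 110 individuals
MIN_COVS, MAX_COVS = 1, 5     # assumed bounds: fewest and most clusters per positive species

total = extrapolate_richness(MEAN_COVS_PER_SPECIES, N_BAT_SPECIES)
low = extrapolate_richness(MIN_COVS, N_BAT_SPECIES)
high = extrapolate_richness(MAX_COVS, N_BAT_SPECIES)
print(total, low, high)  # prints: 3204 1200 6000
```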


3.6 Predicting the distribution of unknown CoVs

Viral sub-clade and host family were not independent of each other (χ² by clade, P < 0.001; Supplementary Table S1), demonstrating that different clades have significant associations with specific bat families. For example, subgroup 2d CoVs were significantly associated with pteropid bats, and were only identified in regions where pteropid bats exist. Likewise, subgroup 2b CoVs were associated with rhinolophid and hipposiderid bats and 2c with vespertilionid bats. To ‘predict’ the potential distribution of unknown CoVs, we plotted the known distribution of bats belonging to these families and assumed that ‘hotspots’ of bat diversity imply hotspots of viral diversity for each sub-clade (Fig. 7). We note that the alpha CoV genus has been considered as one group, since its members do not resolve well into sub-clades (de Groot et al. 2009), and that no 2a map was generated given that only one bat virus from this clade was identified.

Figure 7. Viral diversity ‘hotspot’ maps. Panel A shows the potential hotspots for CoVs based on the distribution of bats worldwide. Locations of alpha CoV sequences from this study are shown in black and beta CoV sequences in blue, indicating there is no geographical bias based on viral genus (i.e. alpha and beta CoVs are equally likely to be found in all regions). Some viral sub-clades were associated with particular bat families, and the spatial distribution data of all species belonging to these families were plotted to indicate the potential hotspots of viral diversity (richness) for these sub-clades. Panel B indicates the potential distribution of 2b CoVs based on the distribution of rhinolophid and hipposiderid bats. Locations of 2b-positive animals identified in this study are indicated in black, and correlate with areas of high species richness (for these families). CoV-positive animals for other sub-clades are shown in light blue. Panel C indicates the potential distribution of 2c CoVs based on the distribution of vespertilionid bats. Locations of 2c-positive animals identified in this study are indicated in black. This map suggests there are hotspots of 2c diversity in regions not covered in this study (e.g. Europe). Panel D indicates the potential distribution of 2d viruses based on the distribution of pteropid bats. The map suggests these viruses may have a more limited distribution, compared with viruses of other sub-clades. Locations of 2d-positive animals identified in this study are shown in black.

4. Discussion

The emergence of SARS and MERS has driven a need to understand more about the diversity, ecology and evolution of coronaviruses, particularly at so-called ‘hotspots’ of zoonotic emergence (Jones et al. 2008; Drexler et al. 2014). To this end, we surveyed the diversity of CoVs from twenty countries in Latin America, Africa, and Asia to identify global factors driving viral diversity and to look for regional differences in factors that contribute to the risk of emergence, such as host switching. In total, we identified sequences from 100 discrete phylogenetic clusters, ninety-one of which were found in bats.

Our data suggest that the diversity of bat CoVs has been driven primarily by host ecology. First, viral richness was strongly correlated with bat richness, suggesting that most CoVs will be found in regions where bat diversity is highest. Second, we showed that CoV diversity separates into three distinct communities by region, echoing the distribution of bats and suggesting an ecological dependence on their hosts. And third, we identified particular associations between viral subclade and bat family, indicating that CoVs have evolved with (or adapted to) preferred families. Collectively, these data show that the global diversity and distribution of CoVs in bats is non-random and is driven by variation in the biogeography of bats.

Regional variation was also observed in the proportion of host switching events, relative to other evolutionary mechanisms such as co-speciation, duplication, or sharing (see Methods for definitions). Overall, host switching was the dominant mechanism, supporting the general trend observed for CoVs (Vijaykrishna et al. 2007; Woo et al. 2009; Lau et al. 2012) as well as similar findings for other viral families, including paramyxoviruses (Melade et al. 2016), hantaviruses (Ramsden et al. 2009), and arenaviruses (Coulibaly-N’Golo et al. 2011; Irwin et al. 2012). However, when analyzed by region, there were proportionally fewer events in Latin America compared with Africa or Asia. Given that host switching is the first critical step in zoonotic emergence (Li et al. 2005; Pfefferle et al. 2009; Woo et al. 2012; Azhar et al. 2014), we suggest this finding could reflect regional differences in the risk of disease emergence.
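The regional comparison above was reported in the Results as odds ratios (for example, OR: 3.858 for host switching versus sharing in Africa relative to Latin America). A minimal sketch of how an odds ratio and a Wald confidence interval can be computed from a 2 × 2 table of event counts follows. The counts are hypothetical placeholders, the function name is illustrative, and the reported estimates may come from a different procedure (such as a regression model), so this is not a reconstruction of the authors' analysis.

```python
import math


def odds_ratio(switch_a, share_a, switch_b, share_b, z=1.96):
    """Odds of host switching (vs sharing) in region A relative to region B.

    Returns the point estimate and a Wald interval computed on the log scale.
    All four counts must be positive.
    """
    estimate = (switch_a * share_b) / (share_a * switch_b)
    se = math.sqrt(1 / switch_a + 1 / share_a + 1 / switch_b + 1 / share_b)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# HYPOTHETICAL event counts, chosen only to make the arithmetic easy to follow:
# region A has 20 switches and 10 shares; region B has 10 switches and 20 shares.
estimate, lower, upper = odds_ratio(20, 10, 10, 20)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to a difference between regions that is significant at the chosen level, which is how the Africa and Asia comparisons in the Results differ.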
It is important to note that we distinguish ‘host switching’ (inter-genus or family switching) from ‘sharing’ (intra-genus switching) in our analysis and that while the proportion of host switches was lower in Latin America, the proportion of sharing events was concomitantly higher. This indicates that CoVs in this region still switch hosts, just like they do in Africa and Asia, but that they preferentially move between closely related species. While we do not yet understand the factors driving this difference, if we assume that the risk of spillover to humans reflects the taxonomic distances that CoVs are able to jump (Kreuder Johnson et al. 2015), our results would indicate a higher risk for zoonotic emergence in Africa and Asia [acknowledging that host switching is not the only important factor driving zoonotic emergence (Jones et al. 2008; Keesing et al. 2010; Karesh et al. 2012)]. While evidence of regional variation in host switching has not been reported previously for viruses (to our knowledge), similar patterns have been observed in hemosporidian parasites (Ellis et al. 2015).

In total, we estimated that there are at least 3,204 CoVs (based on the cluster definition used in this study) in bats. This suggests that although there is still a large number of coronaviruses that remain to be discovered, their detection remains an achievable goal, and we advocate for their discovery given the potential for even greater insights into the global ecology and evolution of these viruses (e.g. further investigation into regional differences in host switching), and the opportunity to elucidate their individual zoonotic potential (Ge et al. 2013; Wang et al. 2014; Yang et al. 2014; Anthony et al. 2017). Building on our study, we suggest that future surveillance efforts should consider the number of samples carefully and include up to 400 individuals (of either sex) per species in order to maximize the chance of detecting all CoVs, and that including fewer than 154 individuals per species might reduce the chance of finding even one virus and would likely produce a poor return on investment. We also recommend that feces (or rectal swabs) followed by saliva (or oral swabs) should be prioritized if resources are limited and all sample types cannot be processed, and that sampling should be biased towards immature animals and in the dry season.

To predict where this unknown diversity is likely to be, we plotted the known distribution of all bat families associated with each CoV sub-clade. The 2b SARS clade was associated with rhinolophid and hipposiderid bats, so we plotted the distribution of all bat species belonging to these families to generate a map of viral diversity (richness). Based on this map, we would expect the greatest diversity of unknown 2b viruses to be found in South East Asia, which is consistent with previous studies describing 2b viruses in bats, but also in isolated pockets throughout Africa. In contrast, the 2c MERS clade was associated with vespertilionid bats, predicting that 2c diversity will be highest in Mexico, Europe, Central and South East Africa, and parts of South East Asia. Again, this is consistent with the distribution of 2c viruses reported here, and previously in the literature (Woo et al. 2007; Reusken et al. 2010; Anthony et al. 2012; Lelli et al. 2013; Corman et al. 2014a, b; Anthony et al. 2017).

Certain limitations should be considered in the interpretation of our data. First, the diversity of CoV sequences detected is almost certainly biased by our sampling effort. A small number of species were sampled abundantly, thus maximizing our chances of finding a CoV (27/282 species had > 110 individuals); however, most were under-sampled (232/282 with ≤50 individuals). While this biases the probability of finding a CoV towards the more abundantly sampled species, we stress that it also reflects the uneven distribution of naturally occurring communities (Magurran 2011), and that targeting the very rare species would be prohibitively expensive. It is also unclear if population size has an effect on the number of viruses present in a species (e.g. do larger populations accommodate more viral diversity?). Second, our analysis only represents the ‘last’ evolutionary event identified. It does not represent, or account for, the complete evolutionary history of these lineages. Therefore, while a single event has been ascribed to each association (see ‘Methods’), it does not preclude other events having an equal or greater impact in the past. We are limited to using this most recent event type in order to assign a single evolutionary event to each unique association. It should also be noted that events are dependent on the relationships observed in the reconstructed trees and could change as more viruses are added or longer sequences are used. Third, we have only generated partial sequences for analysis, which limits our ability to comment on the specific zoonotic potential of each virus or provide a comprehensive analysis of their genetic histories and taxonomy, for example, estimating the frequency or impact of recombination, which can be an important driver of host switching and disease emergence (Anthony et al. 2017). We certainly support full genome sequencing wherever possible [and have such efforts underway (Anthony et al. 2017)]; however, difficulties in virus isolation or sequencing directly from swabs (where material and viral load can be low, yielding variable results) can often preclude extending sequences much beyond the short fragments amplified by consensus PCR (Drexler et al. 2014). Equally, logistic and permitting issues in moving biological samples across borders, and the need to first develop in-country capacity to fully sequence these viruses, have all limited our ability to characterize (most of) these viruses further.

Our ‘state of the art’ is in the unique scope of our study, focusing on a single viral family at a global scale. To achieve this, the USAID PREDICT project first had to establish a strategy for viral discovery in twenty different countries, many of which had no prior capacity to do this work at all. For this reason, we used a simple yet highly implementable approach based on cPCR. Initially, this approach might not appear to be as powerful as high throughput sequencing (HTS) techniques; however, this study has demonstrated the particular and considerable strengths of cPCR as an affordable screening and discovery tool in resource-poor settings, where HTS is still largely unsupported.


Studying viral diversity on a global scale is just one component of a larger strategy to understand the factors driving viral diversity. Ecological and evolutionary mechanisms can contribute differently across scales, and we therefore advocate complementing the global perspective with studies that focus on entire viral communities (rather than only one viral family) within single individuals or host species (rather than only on a global scale) (Anthony et al. 2015). We propose that it is critical to understand both the component parts of the ‘zoonotic pool’ (Morse et al. 2012) and the nature of their interactions at different scales if we are to move towards better predictive models, and ultimately to reducing zoonotic emergence.

Finally, we offer a specific comment regarding the public health threat posed by bats. While it is tempting to conclude that bats harbor a large number of potentially zoonotic CoVs, most of the putative viruses detected in this study are unlikely to pose any threat to humans, either because they lack the biological pre-requisites to infect human cells or because the ecology of their hosts limits the opportunity for spillover. Studies such as this are intended to advance our understanding of the fundamental biology of viruses, not to create alarm or incite the retaliatory culling of bats. Indeed, such actions often have unanticipated consequences, and can even enhance disease transmission, as seen with rabies in vampire bats (Streicker et al. 2012; Blackwood et al. 2013). Bats are important insectivores, pollinators and seed dispersers and we underscore both their vital ecosystem role and the need to consider any public health interventions carefully.

Supplementary data

Supplementary data are available at Virus Evolution online.
Conflict of interest: None declared.

Acknowledgements

This study was made possible by the generous support of the American people through the United States Agency for International Development (USAID) Emerging Pandemic Threats PREDICT project (cooperative agreement number GHN-A-OO-09-00010-00). We thank the governments of Cameroon, Gabon, Democratic Republic of Congo, Republic of Congo, Rwanda, Tanzania, Uganda, Peru, Bolivia, Brazil, Mexico, Bangladesh, Cambodia, China, Indonesia, Laos, Malaysia, Nepal, Thailand, and Viet Nam for permission to conduct this study, and the field teams and collaborating laboratories that performed sample collection and testing.

References

Anthony, S. J. et al. (2012) ‘Coronaviruses in bats from Mexico’, Journal of General Virology, 94: 1028–38.
Anthony, S. J. et al. (2015) ‘Non-random patterns in viral diversity’, Nature Communications, 6: 8147.
Anthony, S. J. et al. (2017) ‘Further evidence for bats as the evolutionary source of MERS coronavirus’, mBio, 8/2. pii: e00373-17. doi: 10.1128/mBio.00373-17.
Azhar, E. I. et al. (2014) ‘Evidence for camel-to-human transmission of MERS coronavirus’, The New England Journal of Medicine, 370: 2499–505.
Blackwood, J. C. et al. (2013) ‘Resolving the roles of immunity, pathogenesis, and immigration for rabies persistence in vampire bats’, Proceedings of the National Academy of Sciences of the United States of America, 110: 20837–42.
Burnham, K. P., and Anderson, D. R. (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. New York: Springer.
Charleston, M. A. (1998) ‘Jungles: a new solution to the host/parasite phylogeny reconciliation problem’, Mathematical Biosciences, 149: 191–223.
Conow, C. et al. (2010) ‘Jane: a new tool for the cophylogeny reconstruction problem’, Algorithms for Molecular Biology: AMB, 5: 16.
Corman, V. M. et al. (2014a) ‘Rooting the phylogenetic tree of middle east respiratory syndrome coronavirus by characterization of a conspecific virus from an African bat’, Journal of Virology, 88: 11297–303.
Corman, V. M. et al. (2014b) ‘Characterization of a novel betacoronavirus related to middle East respiratory syndrome coronavirus in European hedgehogs’, Journal of Virology, 88: 717–24.
Coulibaly-N’Golo, D. et al. (2011) ‘Novel arenavirus sequences in Hylomyscus sp. and Mus (Nannomys) setulosus from Cote d’Ivoire: implications for evolution of arenaviruses in Africa’, PLoS One, 6: e20893.
Darriba, D. et al. (2012) ‘jModelTest 2: more models, new heuristics and parallel computing’, Nature Methods, 9: 772.
de Groot, R. J. et al. (2009) ‘Virus taxonomy: classification and nomenclature of viruses’, in A. M. Q. King, M. J. Adams, E. B. Carstens, and J. Lefkkowitch (eds) Ninth Report of the International Committee for the Taxonomy of Viruses. Amsterdam: Elsevier.
Donaldson, E. F. et al. (2010) ‘Metagenomic analysis of the viromes of three North American bat species: viral diversity among different bat species that share a common habitat’, Journal of Virology, 84: 13004–18.
Drexler, J. F., Corman, V. M., and Drosten, C. (2014) ‘Ecology, evolution and classification of bat coronaviruses in the aftermath of SARS’, Antiviral Research, 101: 45–56.
Drosten, C. et al. (2003) ‘Identification of a novel coronavirus in patients with severe acute respiratory syndrome’, The New England Journal of Medicine, 348: 1967–76.
Edgar, R. C. (2004) ‘MUSCLE: multiple sequence alignment with high accuracy and high throughput’, Nucleic Acids Research, 32: 1792–7.
Ellis, V. A. et al. (2015) ‘Local host specialization, host-switching, and dispersal shape the regional distributions of avian haemosporidian parasites’, Proceedings of the National Academy of Sciences of the United States of America, 112: 11294–9.
ESRI. (2011) ArcGIS Desktop: Release 10. Redlands, CA: Environmental Systems Research Institute.
Ge, X. Y. et al. (2013) ‘Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor’, Nature, 503: 535–8.
Hijmans, R. J., Williams, E., and Vennes, C. (2015) Geosphere: Spherical Trigonometry v. R Package version 1.5.
Hagberg, A. A., Schult, D. A., and Swart, P. J. (2008) ‘Exploring network structure, dynamics, and function using NetworkX’, in G. Varoquaux, T. Vaught, and J. Millman (eds.) Proceedings of 7th Python in Science Conference (SciPy2008), pp. 11–15. Pasadena, CA.
Hu, B. et al. (2015) ‘Bat origin of human coronaviruses’, Virology Journal, 12: 221.
Huang, Y. W. et al. (2013) ‘Origin, evolution, and genotyping of emergent porcine epidemic diarrhea virus strains in the United States’, mBio, 4: e00737–13.
Huynh, J. et al. (2012) ‘Evidence supporting a zoonotic origin of human coronavirus strain NL63’, Journal of Virology, 86: 12816–25.


Irwin, N. R. et al. (2012) ‘Complex patterns of host switching in New World arenaviruses’, Molecular Ecology, 21: 4137–50.
Jacomy, M. et al. (2014) ‘ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software’, PLoS One, 9: e98679.
Jones, K. E. et al. (2008) ‘Global trends in emerging infectious diseases’, Nature, 451: 990–3.
Jost, L., Chao, A., and Chazdon, R. L. (2011) ‘Compositional similarity and beta diversity’, in A. E. Magurran and B. J. McGill (eds.) Biological Diversity: Frontiers in Measurement and Assessment. Oxford: Oxford University Press.
Karesh, W. B. et al. (2012) ‘Ecology of zoonoses: natural and unnatural histories’, Lancet, 380: 1936–45.
Keesing, F. et al. (2010) ‘Impacts of biodiversity on the emergence and transmission of infectious diseases’, Nature, 468: 647–52.
King, A. M. Q. et al. (2012) Virus Taxonomy: Classification and Nomenclature of Viruses. Ninth Report of the International Committee on Taxonomy of Viruses. Elsevier Academic Press.
Kreuder Johnson, C. et al. (2015) ‘Spillover and pandemic properties of zoonotic viruses with high host plasticity’, Scientific Reports, 5: 14830.
Ksiazek, T. G. et al. (2003) ‘A novel coronavirus associated with severe acute respiratory syndrome’, The New England Journal of Medicine, 348: 1953–66.
Lau, S. K. et al. (2005) ‘Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats’, Proceedings of the National Academy of Sciences of the United States of America, 102: 14040–5.
Lau, S. K. et al. (2012) ‘Recent transmission of a novel alphacoronavirus, bat coronavirus HKU10, from Leschenault’s rousettes to pomona leaf-nosed bats: first evidence of interspecies transmission of coronavirus between bats of different suborders’, Journal of Virology, 86: 11906–18.
Lau, S. K. et al. (2015) ‘Discovery of a novel coronavirus, China Rattus coronavirus HKU24, from Norway rats supports the murine origin of Betacoronavirus 1 and has implications for the ancestor of Betacoronavirus lineage A’, Journal of Virology, 89: 3076–92.
Lelli, D. et al. (2013) ‘Detection of coronaviruses in bats of various species in Italy’, Viruses, 5: 2679–89.
Li, W. et al. (2005) ‘Bats are natural reservoirs of SARS-like coronaviruses’, Science, 310: 676–9.
Liang, K., and Zeger, S. L. (1986) ‘Longitudinal data analysis using generalized linear models’, Biometrika, 73: 13–22.
Maes, P. et al. (2009) ‘A proposal for new criteria for the classification of hantaviruses, based on S and M segment protein sequences’, Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases, 9: 813–20.
Magurran, A. E. (2011) Biological Diversity: Frontiers in Measurement and Assessment. Oxford: Oxford University Press.
Melade, J. et al. (2016) ‘An eco-epidemiological study of Morbilli-related paramyxovirus infection in Madagascar bats reveals host-switching as the dominant macro-evolutionary mechanism’, Scientific Reports, 6: 23752.
Merkle, D., Middendorf, M., and Wieseke, N. (2010) ‘A parameter-adaptive dynamic programming approach for inferring cophylogenies’, BMC Bioinformatics, 11 Suppl 1: S60.
Morse, S. S. et al. (2012) ‘Prediction and prevention of the next pandemic zoonosis’, The Lancet, 380: 1956–65.
Pfefferle, S. et al. (2009) ‘Distant relatives of severe acute respiratory syndrome coronavirus and close relatives of human coronavirus 229E in bats, Ghana’, Emerging Infectious Diseases, 15: 1377–84.
PREDICT_Consortium. (2014) Reducing Pandemic Risk. Promoting Global Health. One Health Institute, University of California Davis, Davis, CA.
Quan, P. L. et al. (2010) ‘Identification of a severe acute respiratory syndrome coronavirus-like virus in a leaf-nosed bat in Nigeria’, mBio, 1: pii: e00208-10. doi: 10.1128/mBio.00208-10.
R Foundation for Statistical Computing. (2008) R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Ramsden, C., Holmes, E. C., and Charleston, M. A. (2009) ‘Hantavirus evolution in relation to its rodent and insectivore hosts: no evidence for codivergence’, Molecular Biology and Evolution, 26: 143–53.
Reusken, C. B. et al. (2010) ‘Circulation of group 2 coronaviruses in a bat species common to urban areas in Western Europe’, Vector Borne Zoonotic Dis, 10: 785–91.
Reusken, C. B. et al. (2013) ‘Middle East respiratory syndrome coronavirus neutralising serum antibodies in dromedary camels: a comparative serological study’, The Lancet. Infectious Diseases, 13: 859–66.
Rubin, D. B. (1987) Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.
Sabir, J. S. et al. (2016) ‘Co-circulation of three camel coronavirus species and recombination of MERS-CoVs in Saudi Arabia’, Science, 351: 81–4.
Streicker, D. G. et al. (2012) ‘Ecological and anthropogenic drivers of rabies exposure in vampire bats: implications for transmission and control’, Proceedings. Biological Sciences / the Royal Society, 279: 3384–92.
Tamura, K. et al. (2013) ‘MEGA6: molecular evolutionary genetics analysis version 6.0’, Molecular Biology and Evolution, 30: 2725–9.
Tang, X. C. et al. (2006) ‘Prevalence and genetic diversity of coronaviruses in bats from China’, Journal of Virology, 80: 7481–90.
Tsoleridis, T. et al. (2016) ‘Discovery of novel alphacoronaviruses in European rodents and shrews’, Viruses, 8: 84.
Vijaykrishna, D. et al. (2007) ‘Evolutionary insights into the ecology of coronaviruses’, Journal of Virology, 81: 4012–20.
Vijgen, L. et al. (2005) ‘Complete genomic sequence of human coronavirus OC43: molecular clock analysis suggests a relatively recent zoonotic coronavirus transmission event’, Journal of Virology, 79: 1595–604.
Wacharapluesadee, S. et al. (2015) ‘Diversity of coronavirus in bats from Eastern Thailand’, Virology Journal, 12: 57.
Wang, Q. et al. (2014) ‘Bat origins of MERS-CoV supported by bat coronavirus HKU4 usage of human receptor CD26’, Cell Host & Microbe, 16: 328–37.
Wang, W. et al. (2015) ‘Discovery, diversity and evolution of novel coronaviruses sampled from rodents in China’, Virology, 474: 19–27.
Watanabe, S. et al. (2010) ‘Bat Coronaviruses and experimental infection of bats, the Philippines’, Emerging Infectious Diseases, 16: 1217–23.
Woo, P. C. et al. (2006) ‘Molecular diversity of coronaviruses in bats’, Virology, 351: 180–7.
Woo, P. C. et al. (2007) ‘Comparative analysis of twelve genomes of three novel group 2c and group 2d coronaviruses reveals unique group and subgroup features’, Journal of Virology, 81: 1574–85.
Woo, P. C. et al. (2009) ‘Coronavirus diversity, phylogeny and interspecies jumping’, Experimental Biology and Medicine, 234: 1117–27.

Woo, P. C. et al. (2012) ‘Genetic relatedness of the novel human group C betacoronavirus to Tylonycteris bat coronavirus HKU4 and Pipistrellus bat coronavirus HKU5’, Emerging Microbes and Infection, 1: e35.
Yang, Y. et al. (2014) ‘Receptor usage and cell entry of bat coronavirus HKU4 provide insight into bat-to-human transmission of MERS coronavirus’, Proceedings of the National Academy of Sciences of the United States of America, 111: 12516–21.
Zaki, A. M. et al. (2012) ‘Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia’, The New England Journal of Medicine, 367: 1814–20.

