USF Libraries
USF Digital Collections

Topology, morphisms, and randomness in the space of formal languages


Material Information

Title:
Topology, morphisms, and randomness in the space of formal languages
Physical Description:
Book
Language:
English
Creator:
Kephart, David E.
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla.
Publication Date:

Subjects

Subjects / Keywords:
Entropy
Language norm
Language distance
Language space
Pseudo-metric
Symbolic dynamics
Dissertations, Academic -- Mathematics -- Doctoral -- USF   ( lcsh )
Genre:
government publication (state, provincial, territorial, dependent)   ( marcgt )
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Summary:
ABSTRACT: This paper outlines and implements a systematic approach to the establishment, investigation, and testing of distances and topologies on language spaces. The collection of all languages over a given number of symbols forms a semiring, appropriately termed a language space. Families of languages are defined by interrelations among words. The traditional classification begins with the syntax rules or grammar of the language, that is, the word-transformations by which the entire language can be produced from a single axiom, or starting word. The study of distances between languages as objects and of the topologies induced by language distances upon spaces of languages has been of a limited character. Known language distances introduce topologically awkward features into a language space, such as total disconnectedness. This dissertation examines the topologies induced by three language distances and the effect that each one has upon the notion of a random language, and discusses continuity and word-distribution of structure-preserving language transformations, i.e., morphisms. This approach starts from metric-like requirements, but adduces an additional condition intuitively appropriate to gauging language distance. At the same time, strict (i.e., non-metric) pseudo-metrics are admitted as possible language distance functions, and these are investigated by the use of metric quotient spaces. The study of the notion of randomness implied by the topology induced by such a pseudo-metric on a language space offers insight into the structure of language spaces and verifies the viability of the pseudo-metric. Three language pseudo-metrics are studied in this dissertation: a version of the most commonly-used (Cantor) word metric; an upper-density (Besicovitch) pseudo-metric borrowed from the study of cellular automata; and an adaptation and normalization of topological entropy, each evaluated on the symmetric set-difference between languages.
It is shown that each of these distances induces a distinct topology on the space of languages. The topology induced by Cantor distance is compact and totally disconnected; the topologies induced by the other two are non-compact, with entropic distance resulting in a topology that is the strict refinement of the Besicovitch topology, enhancing the picture of the smaller languages in the Besicovitch topology. It is also shown that none of the three topologies gives quantitative expression to the distinction between regular and linear languages, although, using Martin-Löf randomness tests, it is shown that each pseudo-metric is associated with a new notion of a random language. A classification of language mappings is introduced, with the aim of identifying those which best preserve the structure of languages under specific topologies. There are results regarding continuity of mappings, the matrix representation of the pre-image of certain morphisms, and the formal expressions of the probability distribution of the image of certain morphisms. The continuity of an injective morphism on its image is demonstrated under limited conditions. Finally, the questions which this approach leaves open are detailed. While basic facts about a permutation-invariant version of symmetric set difference are shown, this has yet to be fully elaborated.
Thesis:
Thesis (Ph.D.)--University of South Florida, 2005.
Bibliography:
Includes bibliographical references.
System Details:
System requirements: World Wide Web browser and PDF reader.
System Details:
Mode of access: World Wide Web.
Statement of Responsibility:
by David E. Kephart.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains 147 pages.
General Note:
Includes vita.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001670381
oclc - 62321793
usfldc doi - E14-SFE0001250
usfldc handle - e14.1250
System ID:
SFS0025571:00001


Full Text

PAGE 2

Lacking a reason, she said, why continue?
In her calm question I gained purpose
And now she requests I not mention her name!
On such irony, life turns.
Yes, I could name others, my brother Michael,
It would be fitting, but not complete; to
Her of long suffering and practical optimism, of
Up-turned face and crystalline realism
I dedicate this to my wife, who gave me how and why to go on.

PAGE 4

[...] ... iii
List of Figures ... iv
Abstract ... v
Chapter 1 Formalizing distance and topology on a space of languages ... 1
1.1 Are languages the support of a meaningful distance? ... 1
1.2 Basic definitions ... 5
1.2.1 General notation ... 5
1.2.2 Languages and language families ... 6
1.2.3 Language spaces ... 9
1.3 Morphisms on words and language spaces ... 10
1.3.1 Extending literal mappings and word morphisms to language morphisms ... 11
1.3.2 Language Space Isomorphisms ... 20
1.3.3 The permutative set difference of languages ... 22
1.4 Language pseudo-metrics and language norms ... 24
1.4.1 Definition of language pseudo-metric and language norm ... 25
1.4.2 Link between language norms and language pseudo-metrics ... 26
1.5 Random languages ... 29
1.5.1 Martin-Löf randomness tests ... 31
1.5.2 A general approach to randomness in topological language spaces ... 33
Chapter 2 Formal language space as a Cantor space ... 35
2.1 The Cantor language norm and distance ... 36
2.1.1 Definition ... 36
2.1.2 Language cylinder sets and their enumeration ... 38
2.2 The Cantor topology on a language space ... 41
2.3 Randomness under Cantor distance ... 42
2.3.1 An upper-semi-computable measure on open sets ... 42
2.3.2 Examples of nonrandom languages ... 46
2.3.3 Random equals non-recursively enumerable ... 47
Chapter 3 The Besicovitch language pseudo-metric ... 50
3.1 A Besicovitch pseudo-metric on language spaces ... 53
3.2 The Besicovitch language norm is surjective ... 56
3.3 Besicovitch distance quotient space ... 58
3.3.1 The Besicovitch distance equivalence relation and induced quotient space ... 59
3.3.2 The metric quotient topology ... 61
3.3.3 Convergence in the quotient space ... 64

PAGE 5

[...] ... 64
3.3.5 An upper quotient space, homeomorphic to the unit interval ... 69
3.4 The geometry of the Besicovitch topology ... 76
3.4.1 A point in the quotient space ... 78
3.4.2 Open and closed neighborhoods of languages ... 79
3.4.3 Ideals and the elements of the upper quotient space ... 81
3.5 The Chomsky hierarchy revisited ... 83
3.5.1 The finite languages are not dense ... 83
3.5.2 All locally testable languages have norm zero ... 84
3.5.3 Regular languages ... 86
3.5.4 Non-RE languages ... 87
3.6 Random Languages under the Besicovitch pseudo-metric topology ... 88
Chapter 4 The entropic pseudo-metric ... 92
4.1 Definition of entropy and entropic distance ... 93
4.1.1 Entropy: the rate of exponential language growth ... 94
4.1.2 The entropic language norm is surjective ... 96
4.2 The entropic quotient space ... 97
4.2.1 Definition of the entropic quotient spaces ... 98
4.2.2 A point in the quotient space ... 100
4.2.3 Elements of the upper quotient space ... 103
4.3 Entropy and the Chomsky hierarchy ... 108
4.3.1 The finite languages ... 108
4.3.2 Locally testable languages ... 109
4.4 Randomness in the entropy topology ... 110
4.4.1 Separability ... 110
4.4.2 Measure ... 111
4.4.3 Randomness tests ... 111
4.4.4 Random languages ... 111
Chapter 5 Language morphisms and continuity ... 113
5.1 The continuity of morphisms ... 114
5.2 The pre-image of a section of words ... 117
5.2.1 The word symbol matrices and symbol-length vectors of morphisms ... 118
5.2.2 Distribution of the morphic image of a language space ... 121
Chapter 6 Conclusions regarding distances between language; what is to be done ... 124
6.1 Conclusions regarding the Cantor, Besicovitch, and entropy topologies ... 124
6.2 Tasks yet to be finished ... 127
6.2.1 Distinguishing regular languages metrically ... 128
6.2.2 Distances between grammars ... 129
6.2.3 Algebraic topology on the space of languages ... 129
6.2.4 The dL metric ... 130
6.3 Conclusion ... 131
References ... 134
About the Author ... End Page

PAGE 6

[...] ... 126
2 Language distances between large languages ... 127

PAGE 7

[...] ... 62
2 The Besicovitch quotient spaces ... 70
3 The relationship of the Besicovitch language space, quotient space, and unit interval (upper quotient space) ... 77
4 Word in a generally testable language ... 84
5 The quotient spaces of the entropic distance ... 106

PAGE 16

The classical language families are defined in terms of grammars that generate, or recognize, them. A grammar $G$ is a 4-tuple, $G(N, T, \Pi, S)$, where $N$, the non-terminal symbols, and $T$, the terminal symbols, are disjoint alphabets used to produce a language, and $S \in N$ is an axiom which initiates the production of the language. The rules of production are given by the finite relation $\Pi$, which is a subset of $(N \cup T)^* N (N \cup T)^* \times (N \cup T)^*$ such that $\Pi \cap [(N \cup T)^* \times T^*] \neq \emptyset$. If the word pair $(u, v) \in \Pi$ we write $u \to v$. If $x, y \in (N \cup T)^*$ and $x = x_1 u x_2$, $y = y_1 v y_2$ such that $u \to v$, we write $x \Rightarrow y$. Denote by the symbol $\Rightarrow^*$ the reflexive and transitive closure of $\Rightarrow$. Then the language recognized by the grammar $G$, denoted $L(G)$, is just the set $L(G) = \{u \in T^* : S \Rightarrow^* u\}$.

The family of all languages recognized by grammars is the family of Type 0 languages ($L_0$). It is a standard proof that the Type 0 languages are in fact the recursively enumerable languages, viz., those languages recognized by Turing Machines (see, e.g., pp. 178-9 in [35]). We will use the designation RE for this family. This establishes an intimate relationship between language classification and theoretical computer science.
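The derivation relation above can be illustrated with a short Python sketch. It is only a bounded approximation under stated assumptions: sentential forms longer than `max_len` are discarded, so a word is found only if it has a derivation that never exceeds that length (always the case when no rule shortens a form). The function name `derive` and the sample grammar for $\{a^n b^n : n \geq 1\}$ are hypothetical and not taken from the dissertation.

```python
from collections import deque

def derive(rules, axiom, terminals, max_len):
    """Breadth-first search over sentential forms of length <= max_len.

    rules is a list of (u, v) pairs meaning u -> v. Returns the terminal
    words reachable from the axiom without exceeding max_len."""
    seen = {axiom}
    queue = deque([axiom])
    words = set()
    while queue:
        x = queue.popleft()
        if all(c in terminals for c in x):
            words.add(x)
        for u, v in rules:
            start = x.find(u)
            while start != -1:
                # rewrite one occurrence of u: x1 u x2 => x1 v x2
                y = x[:start] + v + x[start + len(u):]
                if len(y) <= max_len and y not in seen:
                    seen.add(y)
                    queue.append(y)
                start = x.find(u, start + 1)
    return words

# Hypothetical grammar: S -> aSb | ab
RULES = [("S", "aSb"), ("S", "ab")]
print(sorted(derive(RULES, "S", {"a", "b"}, 6), key=len))
```

For a general Type 0 grammar no such bound exists, which is why membership is only semi-decidable.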

PAGE 18

1. A language $I \in P_A$ is called a right ideal of $A^*$ if $I A^* \subseteq I$.
2. A language $I \in P_A$ is called a left ideal of $A^*$ if $A^* I \subseteq I$.
3. A language $I \in P_A$ is called a two-sided ideal of $A^*$ if $A^* I A^* \subseteq I$.

Where the notion of the size of a language is not otherwise defined, the size of a language will mean its cardinality. Thus, the size of a language is a natural number if the language is finite, and is infinite (actually, $\omega$), otherwise.

1.2.3 Language spaces

The collection of all possible languages is a set, and we call this collection the formal language space. The collection of all languages over a given alphabet $A$ is then the language space over $A$, the collection of all subsets of $A^*$, i.e., the power set $P(A^*)$. We will denote the language space over $A$ by $P_A$. As a power set, a language space inherits the set operations on languages over $A$. In particular, $P_A$ is a commutative monoid under the set operation of union. Denoting by $\cdot$ the operation of language concatenation, the triple $(P_A, \cdot, \{\lambda\})$ is a monoid. Since language concatenation is distributive over set union, the 5-tuple $(P_A, \cup, \cdot, \emptyset, \{\lambda\})$ is a semiring. Moreover, a language space $P_A$ is closed under countable unions and set differences, so that the language space over $A$ is a $\sigma$-ring on the set $A^*$. Thus the unusual situation exists that, if a measure is defined on $A^*$, then every element of the language space $P_A$ becomes a measurable set.
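The semiring laws can be spot-checked on finite languages. The sketch below represents a language as a Python set of strings, with the empty string standing in for the empty word; the helper `concat` and the sample languages are illustrative assumptions, and a check on three small languages is of course not a proof.

```python
def concat(L, M):
    """Language concatenation: every word of L followed by every word of M."""
    return {u + v for u in L for v in M}

EMPTY = frozenset()      # the empty language, identity for union
UNIT = frozenset({""})   # the language holding only the empty word

L, M, N = {"a", "ab"}, {"b", ""}, {"ba"}

# concatenation distributes over union on both sides
assert concat(L, M | N) == concat(L, M) | concat(L, N)
assert concat(M | N, L) == concat(M, L) | concat(N, L)
# the empty-word language is the identity for concatenation
assert concat(L, UNIT) == L == concat(UNIT, L)
# the empty language annihilates under concatenation
assert concat(L, EMPTY) == set()
```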

PAGE 19

1. letL[k],L[
PAGE 21

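For any language morphism $\Phi : P_A \to P_B$ and any non-empty language $M \in P_A$, $\Phi(M) = \bigcup_{w \in M} \Phi(\{w\})$. This means that countable union is preserved. Moreover, if language $N'$ is a non-empty finite subset of $\Phi(M)$, then there is a finite language $N \subseteq M$ such that $\Phi(N) \supseteq N'$. Note, however, that it is not true that, for any finite language $N' \subseteq \Phi(N)$, there exists a (finite or infinite) language $N$ such that $\Phi(N) = N'$. For example, suppose $A = \{a, b, c\}$ and that $\Phi : P_A \to$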

PAGE 22

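A language morphism $\Phi$ is nonerasing if, for every finite language $F$, $\#\Phi(F) \geq \#F$.

Proof. Suppose, toward contradiction, that the language morphism $\Phi : P_A \to P_B$ meets the hypothesized conditions, but is not nonerasing because it maps some infinite language $L$ to a finite language $G$. Then certainly there is a finite subset $F$ of $L$ such that $\#F > \#G$. By Fact 1, $\Phi(F) \subseteq G$, and so $\#F > \#\Phi(F)$, contrary to hypothesis.

Notice that it is possible that, under a monoid morphism, the image of the empty language is non-empty and the image of $\{\lambda\}$ is not $\{\lambda\}$. The following is an example of this.

As an example, define the language mapping $\tilde{\Phi} : P_2 \to P_2$ such that $\tilde{\Phi}$ takes $\emptyset$ to $N_2^*$ and languages other than $\emptyset$ to the language consisting of all words containing up to as many instances of the symbol 2 as the length of the shortest word in the language. That is, let $\tilde{\Phi}(\emptyset) = \{1, 2\}^*$, $\tilde{\Phi}(\{\lambda\}) = 1^*$, and for non-empty $L \subseteq \{1, 2\}^+$, $\tilde{\Phi}(L) = \{w \in P_2 : |w|_2 \leq m_L\}$, where $m_L$ is the length of the shortest word in $L$. Then, for $L, M \subseteq \{1, 2\}^+$,
$\tilde{\Phi}(\{\lambda\}) \tilde{\Phi}(L) \tilde{\Phi}(\{\lambda\}) \tilde{\Phi}(M) \tilde{\Phi}(\{\lambda\}) = \{w \in P_2 : |w|_2 \leq m_L + m_M\} = \tilde{\Phi}(L) \tilde{\Phi}(M) = \tilde{\Phi}(LM)$
and, since $\lambda \in \tilde{\Phi}(L)$ for all $L \subseteq \{1, 2\}^+$, we also have that $\tilde{\Phi}(\emptyset) \tilde{\Phi}(L) = \tilde{\Phi}(L) \tilde{\Phi}(\emptyset) = \tilde{\Phi}(\emptyset) = \{1, 2\}^*$. Then $\tilde{\Phi}$ is a monoid morphism from $P_2$ to $P_2$.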

PAGE 23

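If we define the language mapping $\tilde{\Phi}'$ to map $\emptyset$ to $\emptyset$, so that
$\tilde{\Phi}'(L) = \begin{cases} \emptyset, & \text{if } L = \emptyset, \text{ and} \\ \tilde{\Phi}(L), & \text{otherwise,} \end{cases}$
where $\tilde{\Phi}$ is the monoid morphism from Example 1, then $\tilde{\Phi}'(L \cup M) = \{w \in P_A : |w|_2 \leq \min\{m_L, m_M\}\} = \tilde{\Phi}'(L) \cup \tilde{\Phi}'(M)$ for all $L, M \in P_2$, and hence $\tilde{\Phi}'$ is a language morphism. The following fact lists the restrictions which result in better behavior on the part of monoid and language morphisms.

Every non-trivial monoid morphism $\Phi : P_A \to P_B$ maps $\emptyset$ to either $\emptyset$ or to an infinite language and maps $\{\lambda\}$ to either $\{\lambda\}$ or an infinite language. If $\Phi$ maps any language in $P_A$ to a finite language in $P_B$, then $\Phi(\{\lambda\}) = \{\lambda\}$. A language morphism $\Phi : P_A \to P_B$ which maps any language in $P_A$ to a finite language in $P_B$ maps $\emptyset$ to $\emptyset$.

Proof. If $\Phi$ is a monoid morphism, then $\Phi(\emptyset)\Phi(\emptyset) = \Phi(\emptyset)$, so $\Phi(\emptyset)$ can be no finite, non-empty language other than $\{\lambda\}$. But, if $\Phi(\emptyset) = \{\lambda\}$, and $L \in P_A$, then we have $\Phi(\emptyset) = \Phi(L\emptyset) = \Phi(L)\Phi(\emptyset) = \Phi(L)\{\lambda\} = \Phi(L)$, which implies that $\Phi$ is not non-trivial, contrary to hypothesis. In addition, $\Phi(\{\lambda\})\Phi(\{\lambda\}) = \Phi(\{\lambda\})$, so that $\Phi(\{\lambda\})$ can be no non-empty finite language other than $\{\lambda\}$. If $\Phi(\{\lambda\}) = \emptyset$, and $L \in P_A$, then we would have $\Phi(L) = \Phi(L\{\lambda\}) = \Phi(L)\emptyset = \emptyset$;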

PAGE 24

Every word morphism $\varphi : A^* \to B^*$ extends to the unique language mapping $\varphi : P_A \to P_B$ such that
$\varphi(L) = \{\varphi(v) : v \in L\} = \bigcup_{v \in L} \varphi(v)$ (1.3)
and $\varphi(\emptyset) = \emptyset$.

A construction which is the converse of Fact 4, i.e., for a given language morphism to find the word morphism of which it is the extension, is possible for nonexpansive language codes. We prove this after showing that language codes must be nonerasing, and that nonerasing and nonexpansive language morphisms are language codes.
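The extension described in equation (1.3) can be sketched in Python for finite languages. The helper names `word_morphism` and `language_mapping`, and the particular literal mapping, are hypothetical choices made for illustration; the assertions check on one example that the extension preserves union and concatenation and sends the empty language to itself.

```python
def word_morphism(literal):
    """Extend a symbol-to-word table to a morphism on words."""
    return lambda w: "".join(literal[a] for a in w)

def language_mapping(phi):
    """Extend a word morphism to languages, word by word, as in (1.3)."""
    return lambda L: {phi(v) for v in L}

literal = {"a": "01", "b": "1"}   # hypothetical literal mapping A -> B*
phi = word_morphism(literal)
Phi = language_mapping(phi)

L, M = {"a", "ab"}, {"b", ""}
assert Phi(L | M) == Phi(L) | Phi(M)                      # union preserved
assert Phi({u + v for u in L for v in M}) == \
       {x + y for x in Phi(L) for y in Phi(M)}            # concatenation preserved
assert Phi(set()) == set()                                # empty language fixed
```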

PAGE 25

Consider all permutations $\sigma$ of the symbols of the alphabet $A$, where $\#A > 2$. These can be enumerated, so that the set of all such permutations is $P = \{\sigma_i : 0 \leq i$
PAGE 26

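Let the language mapping $\Phi : P_2 \to P_4$ be as follows:
$\Phi(\{\lambda\}) = \{\lambda\}$
$\Phi(\{1\}) = \{1, 3\}$
$\Phi(\{2\}) = \{2, 4\}$.
Then $\Phi$ is clearly not nonexpansive, yet, since it is defined on each symbol of $P_2$, it can be defined, by concatenation and set union, on all languages over $P_2$. In this case, it is clear that the resulting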

PAGE 27

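Define the literal mapping $\varphi : A \to B^*$ to be such that $\varphi : a \mapsto w_a$ and let $\varphi$ also represent the extension of $\varphi$ to a word morphism. Let $\Phi'$ be the unique extension of $\varphi$ to a language morphism, as mentioned in Fact 4. Then $\varphi$ preserves concatenation of symbols, so that
$\Phi'(\{v\}) = \Phi(\{v\})$, (1.5)
by equation (1.4). But in addition, $\varphi$ is a code; for, suppose that $\varphi(u) = \varphi(v)$. This means that, if $|u| = k$ and $|v| = l$, then $\varphi(u[1] \cdots u[k]) = w_{u[1]} \cdots w_{u[k]} = w_{v[1]} \cdots w_{v[l]} = \varphi(v[1] \cdots v[l])$ and thus that $\Phi(\{u\}) = \Phi(\{u[1]\}) \cdots \Phi(\{u[k]\}) = \Phi(\{v[1]\}) \cdots \Phi(\{v[l]\}) = \Phi(\{v\})$. But $\Phi$ is a language code, so this is a contradiction, and so $\varphi$ is a code. Finally, by equation (1.3), $\Phi'$ preserves unions of words. Therefore, (1.2) holds for $\Phi'$. Since the same equation holds for $\Phi$, we have that $\Phi' = \Phi$.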

PAGE 28

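Let $\Phi$ be a language mapping such that, if $L \in P_A$, then $\Phi(L) = \{w \in L : x \in L \text{ implies } |x| \geq |w|\}$; i.e., $\Phi(L)$ is the set of the shortest words in $L$. Then $\Phi(L) \subseteq A^{m_L}$, where $m_L$ represents the length of the shortest word in $L \neq \emptyset$, and $\Phi(\emptyset) = \emptyset$. This is a monoid morphism, since, if $L$ and $M$ are non-empty languages, the shortest words in $LM$, for $L, M \in P_A$, must be of the form $uv$ where $u$ is one of the shortest words in $L$ and $v$ is one of the shortest words in $M$. Supposing that $L = \emptyset$, then $\Phi(L) = \emptyset$, in which case $\Phi(LM) = \Phi(ML) = \emptyset = \Phi(L)\Phi(M)$. The mapping $\Phi$ is not non-erasing, since $m_L \in N$, and, therefore, $\Phi$ maps infinite languages into $FIN_A$. Suppose that there exists a word morphism $\varphi : A^* \to B^*$ such that $\Phi(L) = \{\varphi(w) : w \in L\}$ for every $L \in P_A$. Consider language $L = \{a, aa\} \in P_A$. Then $\varphi(a) = \varphi(aa) = a$. But $\varphi$ is a word morphism, so $\varphi(aa) = \varphi(a)\varphi(a) = aa$, a contradiction. Therefore, no such word morphism exists.

Let $\sigma$, called the shift operator on languages, be defined thus:
$\sigma(w) = \begin{cases} w[2, |w|] & \text{if } |w| > 1, \text{ and} \\ \lambda & \text{otherwise.} \end{cases}$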
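The shortest-words mapping discussed above is easy to experiment with on finite languages. In this Python sketch the names `shortest` and `concat` and the sample languages are illustrative assumptions; the checks confirm, on one example, that the mapping respects concatenation, and they replay the $\{a, aa\}$ argument showing that no word-by-word image can reproduce it.

```python
def concat(L, M):
    return {u + v for u in L for v in M}

def shortest(L):
    """Set of the shortest words of L; the empty language maps to itself."""
    if not L:
        return set()
    m = min(len(w) for w in L)
    return {w for w in L if len(w) == m}

L, M = {"a", "aa", "b"}, {"ab", "bab"}
# monoid-morphism behaviour on one example
assert shortest(concat(L, M)) == concat(shortest(L), shortest(M))
assert shortest(concat(set(), M)) == concat(shortest(set()), shortest(M)) == set()

# the mapping collapses {a, aa} to {a}; a word morphism sending a to a
# would have to send aa to aa, so its word-by-word image keeps both words
identity_image = {w for w in {"a", "aa"}}
assert shortest({"a", "aa"}) == {"a"} != identity_image
```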

PAGE 29

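If $\varphi$ is a natural language isomorphism, where $\varphi$ is the word morphism of which it is the extension, then $\varphi(\lambda) = \lambda$.

Proof. If $|\varphi(\lambda)| > 0$, then $|\varphi(a)| = |\varphi(a\lambda)| = |\varphi(a)\varphi(\lambda)| = |\varphi(a)| + |\varphi(\lambda)| > |\varphi(a)|$, which is impossible.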

PAGE 30

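1. the language spaces $P_A$ and $P_B$ are naturally isomorphic;
2. the alphabets $A$ and $B$ have the same cardinality;
3. there exists a surjective language code from $P_A$ onto $P_B$; and
4. for a given permutation $\sigma$ on the symbols of $A$, $P(\sigma(A)^*)$ and $P_B$ are naturally isomorphic.

Proof. (2) $\Rightarrow$ (1) If $\#A = \#B$, then there exists a bijection $\varphi : A \to B$. Then by Lemma 1.1, the morphism $\varphi$, which is the extension to a code of literal mapping $\varphi$, extends to a nonerasing, nonexpansive language code $\varphi : P_A \to P_B$. Note that $|\varphi(v)| = |v|$, for all $v \in P_A$. Suppose word $w \in B^k$ for some $k \in N$; if $k = 0$, $w = \lambda$, and $\varphi(\{\lambda\}) = \{\lambda\}$ by Fact 3; if $k > 0$, then $w = w[1] \cdots w[k]$, which is the $\varphi$-image of the unique word $\varphi^{-1}w[1]\, \varphi^{-1}w[2] \cdots \varphi^{-1}w[k] \in A^k$, which is a word $v \in A^k$ such that $\varphi(\{v\}) = \{w\}$, namely, the concatenation of the elements of $A$ which map, under $\varphi$, to $w[1], w[2], \ldots, w[k]$, that is, the unique word $\varphi^{-1}(w) \in A^*$. From this observation, and by equation (1.2), $\varphi$ is surjective. Consequently $\varphi$ is a natural isomorphism between the language spaces $P_A$ and $P_B$.

(1) $\Rightarrow$ (2) Assume there is a natural language isomorphism $\varphi$ between $P_A$ and $P_B$, the extension of the morphism $\varphi : A^* \to B^*$. Then the word morphism $\varphi$ must be injective, since the language isomorphism is injective. For each $b \in B$ there exists a unique language $L_b \subseteq A^*$ such that $\varphi(L_b) = \{b\}$. Suppose that, for some $b \in B$, $w_b = uv$, where $u, v \in A^+$. Then $|\varphi(w_b)| = |\varphi(uv)| = |\varphi(u)\varphi(v)| = |\varphi(u)| + |\varphi(v)| = |b| = 1$, implying that $\varphi(u) = \lambda$ or $\varphi(v) = \lambda$, which is a contradiction. Therefore, $w_b \in A$ for each $b \in B$. Finally, $\varphi(A) \subseteq B$: indeed, if $a \in A$ and $\varphi(a) = t \in A^k$, with $k > 1$, then $\varphi(\{a\}) = \varphi\{w_{t[1]} \cdots w_{t[k]}\}$, contradicting the bijectivity of a natural language space isomorphism. Therefore, the function $\tilde{\varphi} = \varphi|_A$ is a bijection between $A$ and $B$, and $\#A = \#B$.

(2) $\Rightarrow$ (3) Trivial.

(3) $\Rightarrow$ (2) If $\Phi$ is a surjective language code, then Lemma 1.1 says that $\Phi$ is a bijection between $P_A$ and $P_B$, and, therefore, by 1.1, the extension to a language code of a code $\varphi : A^* \to B^*$. Further, the code $\varphi$ is surjective on $B$. For suppose, toward contradiction, that there is a word $w$ in $B^*$ without a pre-image in $A^*$. But this implies that the language $\{w\} \in P_B$ has no pre-image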

PAGE 32

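Since $\sigma$ is a group action on $P_A \times P_A$, $F \triangle_\sigma G = G \triangle_{\sigma^{-1}} F$ and hence $\#F \triangle_\sigma G = \#G \triangle_{\sigma^{-1}} F$. The purpose of the permutative symmetric set-difference is to inspect structural differences between languages, rather than those brought about by a recoding of the symbols of a language. Let $L \triangle_k M = L[$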
PAGE 35

Cardinality of a language is a permutative generalized language norm. Cardinality is a measure on $A^*$, and so it fulfills the requirements of a language norm. Since the cardinality of words of any length is unchanged by a permutation of the symbols of $A$, it is permutative. Since there exist infinite languages, cardinality is a generalized permutative language norm. Note that any subadditive measure and, therefore, any measure on $A^*$ is a language norm.

1.4.2 Link between language norms and language pseudo-metrics

From each language pseudo-metric there is a natural link to a unique language norm, and for each language norm there is a natural link to a unique language pseudo-metric. If a language norm is permutative, there is a link to a possibly distinct permutative language pseudo-metric. The construction involved in the proof of this depends, in part, on the following lemma from set theory.

PAGE 36

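For each language norm $\|\cdot\| : P_A \to \mathbb{R}_{\geq 0}$ there is a unique language pseudo-metric $d : P_A \times P_A \to \mathbb{R}_{\geq 0}$ such that $\|L \triangle M\| = d(L, M)$ for all $L, M \in P_A$, and, for each language pseudo-metric $d : P_A \times P_A \to \mathbb{R}_{\geq 0}$, there is a unique language norm such that $\|L\| = d(L, \emptyset)$.

Proof. Uniqueness is obvious. Given language norms $\|\cdot\|_a$ and $\|\cdot\|_b$ on $P_A$ such that $\|L \triangle M\|_a = d(L, M) = \|L \triangle M\|_b$ for all languages $L, M \in P_A$, then
$\|L\|_a = \|L \triangle \emptyset\|_a = d(L, \emptyset) = \|L \triangle \emptyset\|_b = \|L\|_b$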
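The passage between a norm and its pseudo-metric can be made concrete with the cardinality norm on finite languages. In this Python sketch, `norm` and `dist` are illustrative names, the distance is defined as the norm of the symmetric difference, and the loop spot-checks the pseudo-metric axioms and the identity $\|L\| = d(L, \emptyset)$ on a handful of sample languages; it is a check on examples, not a proof.

```python
def norm(L):
    """Cardinality norm of a finite language."""
    return len(L)

def dist(L, M):
    """Distance induced by the norm: the norm of the symmetric difference."""
    return norm(L ^ M)

langs = [frozenset(s) for s in
         (set(), {"a"}, {"a", "b"}, {"b", "ab"}, {"", "a", "ab"})]

for L in langs:
    assert dist(L, L) == 0
    assert dist(L, frozenset()) == norm(L)      # norm recovered from distance
    for M in langs:
        assert dist(L, M) == dist(M, L)         # symmetry
        for N in langs:
            assert dist(L, N) <= dist(L, M) + dist(M, N)   # triangle inequality
```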

PAGE 38

[Definition 3.1, Theorem 3.2 [37]] Let $(X, \rho)$ be a metric space. The sequence $\{x_i\}_{i \in N}$ is said to converge to $x$, denoted $x_i \to x$, if, for all $\varepsilon > 0$, $\rho(x_i, x) < \varepsilon$ for all but finitely many $i \in N$. Then $x$ is the limit of $\{x_i\}_{i \in N}$.

1.5 Random languages

We now consider how to define a random language. A random element in a system is one which is typical of that system. In this way, a random element exemplifies the expressibility of a given structure. Since we have in mind, ultimately, application to systems of arbitrary variability, one justification for a given topology of languages is the characteristics of random languages under that topology. The idea is that these must exist and must be sufficiently indescribable to accommodate all heretofore undiscovered regularities in phenomena, but, at the same time, should be hemmed in by languages that are describable, i.e., extraordinary enough that they may be described succinctly.

PAGE 40

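The standard bijection $\langle \cdot, \cdot \rangle$ between $N \times N$ and $N$ will be made use of in this dissertation. The function $\langle \cdot, \cdot \rangle : N \times N \to N$ assigns non-negative integers systematically to elements of $N \times N$. It assigns 0 to $(0, 0)$. Suppose all integers in the set $\{0, \ldots, n\}$ have been assigned, and that the number $n$ has been assigned to the pair $(i, j)$. If $i > 0$, $\langle \cdot, \cdot \rangle$ assigns the number $n + 1$ to the pair $(i - 1, j + 1)$. If $i = 0$, $\langle \cdot, \cdot \rangle$ assigns the number $n + 1$ to the pair $(j + 1, 0)$. In this way, assignment is made to all elements $(i, j)$ such that $i + j \leq k$ before any assignment is made to any pair $(i, j)$ such that $i + j > k$. Thus, $\langle \cdot, \cdot \rangle$ is injective. On the other hand, if $j = 0$, then $\langle i, j \rangle = \langle 0, i - 1 \rangle + 1$, or else $(i + 1, j - 1) \in N \times N$, and so $\langle i, j \rangle = \langle i + 1, j - 1 \rangle + 1$. Thus, the function $\langle \cdot, \cdot \rangle$ is a bijection. Calculation shows that this enumeration may be defined as follows:
$\langle i, j \rangle = \frac{1}{2}(i + j)(i + j + 1) + j.$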
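The agreement between the step-by-step assignment and the closed formula can be checked mechanically. In the Python sketch below, `pair` implements the closed formula and `successor` implements the rule for moving from the pair numbered $n$ to the pair numbered $n + 1$; both names are illustrative, and the loop only verifies the first thousand values.

```python
def pair(i, j):
    """Closed form of the pairing function <i, j>."""
    return (i + j) * (i + j + 1) // 2 + j

def successor(i, j):
    """Pair that receives the next integer after the one assigned to (i, j)."""
    return (i - 1, j + 1) if i > 0 else (j + 1, 0)

point = (0, 0)
for n in range(1000):
    assert pair(*point) == n      # the walk and the formula agree
    point = successor(*point)
```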

PAGE 41

[Calude, Marcus, Staiger [5]] A randomness space is a triple $(X, B, \mu)$ where $(X, \tau)$ is a separable topological space, $\mu$ is a probability measure on the $\sigma$-algebra generated by $\tau$, and $B$ is the enumeration of a basis of $\tau$.

We next define a means of effectively sorting out elements of $X$ which may be labelled non-random at a specific confidence level. The confidence level is the $\mu$-measure of the intersection of a sequence of describable open sets. Describability is effectively approximated by the recursive enumerability of the numbers determining the successive open sets of this sequence as unions of basis elements of the topology. It is, of course, possible to describe all open sets as unions of basis elements. What we require beyond this is that the union of the sets of numbers corresponding to all basis elements comprising the [...] The recursive enumeration of unions of open sets is possible because we have required that $B$ is an enumeration of a basis of the topology.

1. A randomness test on $X$ is a $B$-computable sequence $V = \{V_i\}_{i \in N}$ such that $\mu(V_i) \leq 2^{-i}$ for all $i \in N$.
2. An element $x \in X$ is called nonrandom if $x \in \bigcap_{i \in N} V_i$ for some randomness test $\{V_i\}_{i \in N}$ on $X$; an element of $X$ is called random if it is not nonrandom.

The fact which makes Martin-Löf Randomness Tests so useful is that, under certain circumstances, a randomness space has a universal randomness test.

[Martin-Löf, [29]] Let $(X, B, \mu)$ be a randomness space. Then a randomness test $U = \{U_i\}_{i \in N}$ on $X$ is universal if, for any randomness test $V = \{V_i\}_{i \in N}$ on $X$, there exists a fixed $c \in N$ such that $V_{i+c} \subseteq U_i$ for all $i \in N$, where the constant $c$ depends on $U$ and $V$, but not $i$.

PAGE 42

[Theorem 3.10, [16]] Let $(X, B, \mu)$ be a randomness space. If $\mu$ is upper-semi-computable, then there is a universal randomness test on $X$.

A nowhere dense set is one which contains no closure of an open set. A meager subset of $X$ is the union of countably many nowhere dense sets in $X$. From Theorem 1.2 and Definition 1.5.2, we have the following.

[Theorem 3.11, [16]] If there is a universal randomness test $U$ on the randomness space $(X, B, \mu)$, then
1. The set of nonrandom elements of $X$ is of $\mu$-measure 0, and nowhere dense.
2. If the set of nonrandom elements of $X$ is dense in $X$, then the set of random elements is meager in $X$.

Thus we have the well-known result that the properties of the nonrandom elements not found within a universal randomness test satisfy all stochastic requirements for randomness. Moreover, they satisfy every possible stochastic test for randomness, including any which have not been conceived.

1.5.2 A general approach to randomness in topological language spaces

The following is an outline, drawn from the definitions and theorems of the previous section, of the method we adopt for the examination of randomness in any given language space topology.

PAGE 43

1. Elaborate the language space, or an appropriate topological space derived from it, as a separable space, and find an enumeration for a basis of the topology of that space.
2. Determine a measure on the $\sigma$-algebra generated by the language space (or other appropriate space) topology and prove its upper-semi-computability.
3. Display the associated nested open sets of languages, the measures of which converge quickly enough to yield a randomness test.
4. Draw conclusions about the character of a universal randomness test and, thereby, give a rough approximation to what a random language is under the given topology.

Thanks to the extensive work done on randomness, we will not need to fully elaborate this procedure in every case. For example, should we encounter a space homeomorphic to the unit interval, steps one, two, and three have long since been completed. The unit interval is mentioned as an example in [16], and we will make use of the principal conclusion mentioned there: constructible numbers, it is shown, are nonrandom.

With these preparatory definitions and observations, we turn to the consideration of three language pseudo-metrics. The first of these is a metric.

PAGE 45

2 raised to the power of the length of the shortest word in the language.

PAGE 46

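2, while $d^{(\sigma)}_1(L_1, L_2) = \frac{1}{8}$, since, under the permutation $\sigma : a \mapsto b$ and $\sigma : b \mapsto a$, we find that $\sigma(L_1) \triangle L_2 = \{bba, baa, bbba, baaa, \ldots\}$; and, hence, that $\|\sigma(L_1) \triangle L_2\|_1 = \frac{1}{8}$.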
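A comparison of this kind between the plain and the permutative distance can be reproduced on small finite languages. The Python sketch below rests on two assumptions that the surviving text does not state in full: that the Cantor norm of a non-empty language is $2^{-m}$, with $m$ the length of its shortest word, and that the permutative distance is the minimum of the norm of $\sigma(L) \triangle M$ over all permutations $\sigma$ of the alphabet. The languages `L1` and `L2` are hypothetical stand-ins, not the ones used in the dissertation.

```python
from fractions import Fraction
from itertools import permutations

def cantor_norm(L):
    """Assumed form: 2**(-m), m the shortest word length; 0 if L is empty."""
    return Fraction(1, 2 ** min(map(len, L))) if L else Fraction(0)

def recode(L, sigma):
    """Apply a symbol permutation to every word of L."""
    return {"".join(sigma[c] for c in w) for w in L}

def permutative_distance(L, M, alphabet):
    """Smallest norm of sigma(L) symmetric-difference M over all sigma."""
    best = None
    for image in permutations(alphabet):
        sigma = dict(zip(alphabet, image))
        d = cantor_norm(recode(L, sigma) ^ M)
        best = d if best is None else min(best, d)
    return best

L1 = {"a", "ab", "abb"}     # hypothetical example languages
L2 = {"b", "ba", "bab"}
print(cantor_norm(L1 ^ L2), permutative_distance(L1, L2, "ab"))
```

Here the plain distance is governed by the one-letter words, while swapping the two symbols aligns everything except the words of length three.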

PAGE 47

2, it follows that elements of the standard metric basis of the Cantor topology form the collection
$C = \{B_{2^{-k}}(L) : k \in N, L \in P_A\}$. (2.2)

The set $B_{2^{-k}}(L)$ is the collection of languages which agree with $L$ on all words of lengths up to and including length $k$. Referring to the word-metric $\omega$, a cylinder set of words over $A$ is defined as a collection $C_w = wA^*$, for some word $w$ in $A^*$. Then $C_w$ is the set of all words $v$ such that $w \in \mathrm{Pref}(v)$. We see that, analogously, the basis elements of $\tau_1$ are the cylinder sets of languages, i.e., all the languages in $P_A$ which agree with some given language $L$ on all words of all lengths up to and including some length $k$.

[Vianu, Daniela [13, 38]] The language cylinder set of length $k \in N$, denoted $C_{L,k}$, around language $L \in P_A$ is the set
$C_{L,k} = \{M \in P_A : L[k] = M[k]\}$. (2.3)
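The correspondence between open balls of radius $2^{-k}$ and language cylinder sets can be tried out on finite languages. This Python sketch assumes, as the fragments above suggest but do not state in full, that the Cantor norm of a non-empty language is $2^{-m}$ with $m$ the length of its shortest word, and that the distance is the norm of the symmetric difference; the function names and sample languages are illustrative.

```python
from fractions import Fraction

def cantor_norm(L):
    """Assumed form: 2**(-m), m the shortest word length; 0 if L is empty."""
    return Fraction(1, 2 ** min(map(len, L))) if L else Fraction(0)

def cantor_distance(L, M):
    return cantor_norm(L ^ M)

def in_cylinder(M, L, k):
    """True when M agrees with L on every word of length at most k."""
    return {w for w in L if len(w) <= k} == {w for w in M if len(w) <= k}

L = {"a", "ab", "abb"}
M = {"a", "ab", "bbb"}      # differs from L only at length 3

assert cantor_distance(L, M) == Fraction(1, 8)
for k in range(5):
    # M lies in the ball of radius 2**-k around L exactly when it lies in C_{L,k}
    assert in_cylinder(M, L, k) == (cantor_distance(L, M) < Fraction(1, 2 ** k))
```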

PAGE 48

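Given $k \in N$, each distinct subset of $A^{\leq k}$ corresponds to a single cylinder set of languages of length $k$. Note that $C_{L,k+1} \subsetneq C_{L,k}$, for the language $M = (L \cap A^{\leq k}) \cup (L^c \cap A^{k+1})$ is non-empty for all $L \in P_A$ and all $k \in N$. Yet $M \in C_{L,k} \setminus C_{L,k+1}$.

The intersection of two cylinder sets, say $C_{L,j}$ and $C_{M,k}$, where $j \leq k$, is either $\emptyset$ or $C_{M,k}$. To be precise,
$C_{L,j} \cap C_{M,k} = \begin{cases} \emptyset, & \text{if } L[j] \neq M[j] \\ C_{M,k}, & \text{if } L[j] = M[j]. \end{cases}$
For, suppose language $N \in C_{L,j} \cap C_{M,k}$ and $L[j] \neq M[j]$. Then there exists $i \leq j$ such that $L[i] \neq M[i]$. But we have, by our supposition, that $N[i] = L[i]$ and $N[i] = M[i]$, which is impossible. In addition, note that the intersection of countably many cylinder sets contains a single language, and is not a cylinder set. In particular, $\bigcap_{i \in N} C_{L,i} = \{L\}$, which is not a cylinder set.

The union of two cylinder sets, $C_{L,j} \cup C_{M,k}$, where $j \leq k$, is either a single cylinder set or a union of disjoint cylinder sets. To be precise,
$C_{L,j} \cup C_{M,k} = \begin{cases} C_{L,j}, & \text{if } L[j] = M[j] \\ C_{L \setminus M[j],\, j} \cup C_{L \cap M,\, j} \cup C_{M \setminus L[j],\, k} & \text{if } L[j] \neq M[j], \end{cases}$
where the last case is the union of three disjoint cylinder sets, two of length $j$, the third of length $k$. Since this fragmentation into disjoint cylinder sets can continue indefinitely, a countable union of cylinder sets is a countable union of disjoint cylinder sets.

Let $C_k$ denote the collection of all language cylinder sets of length $k$. That is, let
$C_k = \{C_{L,k} : L \in P_A\}$.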
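Since each subset of the words of length at most $k$ determines exactly one cylinder set of length $k$, the size of $C_k$ can be counted by brute force for small $k$. The Python sketch below does this for a two-letter alphabet, where there are $2^{k+1} - 1$ words of length at most $k$ and hence $2^{2^{k+1} - 1}$ cylinder sets; the function names are illustrative, and each cylinder set is represented by its defining finite language.

```python
from itertools import combinations, product

def words_up_to(alphabet, k):
    """All words over the alphabet of length 0, 1, ..., k."""
    return ["".join(p) for n in range(k + 1)
            for p in product(alphabet, repeat=n)]

def cylinders(alphabet, k):
    """One representative per cylinder set of length k: the set of words of
    length <= k shared by every language in the cylinder."""
    ws = words_up_to(alphabet, k)
    return {frozenset(c) for r in range(len(ws) + 1)
            for c in combinations(ws, r)}

for k in range(3):
    assert len(words_up_to("ab", k)) == 2 ** (k + 1) - 1
    assert len(cylinders("ab", k)) == 2 ** (2 ** (k + 1) - 1)
```

For $k = 0, 1, 2$ this gives 2, 8 and 128 cylinder sets, so a measure spread evenly over $C_k$ would give each cylinder weight $1/2$, $1/8$ and $1/128$.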

PAGE 52

#Ck=2k+11 43

PAGE 53

#Ckm=i00Xm=0i0m

PAGE 54

2, since $\#C_0 = 2^{\frac{2^{0+1} - 1}{2 - 1}} = 2$. At step 1, our approximation to the measure of $L$ is $1/8 + 1/8 = 1/4$, since we found two cylinder sets of length 1. At the second step, we have one cylinder set of length 1 and two of length 2, so our second approximation is $1/8 + 1/128 + 1/128 = 9/64$.

PAGE 55

Any finite language $F$ is nonrandom, as there is a maximum length $n \in N$ of a word in $F$. Therefore, $F$ is the sole language found in the intersection of the cylinder sets $C_{F \cap A^0, 0}$, $C_{F \cap A^1, 1}$, ..., C_{F ∩ A
PAGE 60

[...] $\frac{1}{2n+1} \sum_{i=-n}^{n} |\,\cdot\,(x) - \,\cdot\,(x)|$; i.e., by the upper density of the mean of the ordinary $\ell_p$-norm of the difference between the functions as the interval of evaluation widens symmetrically around $x = 0$. Because the Besicovitch pseudo-metric depends on the evaluation of the two functions only at discrete intervals, it is nat

PAGE 62

If language $L$ is a subset of language $M$, then, for each $k \in N$, $\#L[k] \leq \#M[k]$. It follows that the function $\|\cdot\|$ is a language norm, which we will call the Besicovitch language norm. Therefore,

PAGE 63

The general features of Besicovitch distance on language spaces are as follows:
1. The range of Besicovitch distance and language norm on language spaces is the unit interval. In other words, the definition could say $d : P_A \times P_A \to [0, 1]$. This is, in fact, a continuous map. Surjectivity will be demonstrated in Lemma 3.2 and Corollary 3.2.1. Continuity will be shown as a consequence of Corollary 3.11.1.
2. The Besicovitch distance between a language and its complement is 1. That is, $d(L, L^c) = 1$. This follows from the fact that $(L \cap A^k) \triangle (L^c \cap A^k) = A^k$ for each $k \in N$.
3. On every language space $P_A$, the Besicovitch distance is a strict pseudo-metric; for consider the languages $L = \emptyset$ and $M = \{a\}$, where $a \in A$. Then $L \neq M$, yet $\#L \triangle_k M = 1$, for all $k \in N$, and so $d(L, M) = \lim_{k \to \infty} \frac{1}{\#A}$
PAGE 64

By (1.1) on page 7, (3.2) may be written as follows: $d(L, M) = \limsup_{k \to \infty} \#L \triangle$
PAGE 65

2. We proceed to show that the range of the Besicovitch language norm is the entire unit interval.

3.2 The Besicovitch language norm is surjective

We claim that $\|P_A\| = [0, 1]$, i.e., that the image of the entire language space under the Besicovitch norm is precisely the entire unit interval. To justify this claim, we need to be able to construct a language with an arbitrary norm. We identify a collection of such languages.
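The finite-stage ratios behind an upper-density norm can be computed directly, which gives some feel for how a language acquires a norm strictly between 0 and 1. The Python sketch below assumes (the defining equation did not survive in this text) that stage $k$ compares the number of words of the language of length at most $k$ with the number of all words of length at most $k$, and that the norm is the limit superior of these ratios. Languages are given as membership predicates; all names and both sample languages are illustrative.

```python
from fractions import Fraction
from itertools import product

def words_up_to(alphabet, k):
    return ["".join(p) for n in range(k + 1)
            for p in product(alphabet, repeat=n)]

def density(member, alphabet, k):
    """Fraction of the words of length <= k that belong to the language."""
    ws = words_up_to(alphabet, k)
    return Fraction(sum(1 for w in ws if member(w)), len(ws))

def single(w):
    return w == "a"             # the one-word language {a}

def starts_a(w):
    return w.startswith("a")    # all words beginning with a

for k in (2, 4, 8):
    print(k, density(single, "ab", k), density(starts_a, "ab", k))
```

The ratios for the one-word language shrink toward 0, consistent with its being at distance 0 from the empty language, while those for the second language approach 1/2.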

PAGE 66

But then, of course, $\|L\| = r$, which establishes the claim. This establishes the claim stated at the beginning of this section. We restate it, for clarity.

PAGE 68

The relation $\sim$ is an equivalence relation on $P_A$.

Proof. Reflexivity and symmetry are obvious. Let $L, M, N \in P_A$ such that $L \sim M$ and $M \sim N$. Since $d$ is a pseudo-metric, $0 = d(L, M) + d(M, N) \geq d(L, N)$. From Remark 6(1), $L \sim N$.

1. The equivalence class of language $L \in P_A$ will be denoted $[L]$.
2. The collection of equivalence classes will be called the Besicovitch quotient space over $P_A$, which we will denote $Q_A$. Elements of the quotient space, i.e., points in $Q_A$, will be denoted with sans serif letters, $\mathsf{L}, \mathsf{M}, \mathsf{N}, \ldots$. Collections of points will be denoted in bold-face sans serif, $\boldsymbol{\mathsf{L}}, \boldsymbol{\mathsf{M}}, \boldsymbol{\mathsf{N}}, \ldots$.
3. Let $\pi$ denote the quotient mapping $\pi : P_A \to Q_A$ which takes a language in $P_A$ to its equivalence class in $Q_A$.

Since $Q_A$ is a partition of $P_A$, the mapping $\pi$ is well-defined and surjective on $Q_A$, but not injective, since a single point of $Q_A$ is the image under $\pi$ of every language in an equivalence class. Since $P_A$ is a semiring under set union, the set operations of union, intersection, and complementation are preserved by mappings from collections of points in $Q_A$ to the sets of languages of which they are the equivalence classes. In particular, every topology on $Q_A$ is the quotient of a topology on $P_A$.

PAGE 71

Thus $d(L, M) \geq \varepsilon/2 > 0$, by Definition 3.3.4, contrary to our hypothesis.

Now we show that the quotient map $\pi$ is an isometry. When we talk about Besicovitch distance between languages, from now on, we can interchangeably discuss distance between points in $Q_A$.

PAGE 76

From these calculations, setting $\mathsf{J}_{w,i} = \pi(J_{w,i})$, the sequence $\{\mathsf{J}_{w,i}\}_{i \in N}$ has the following properties: no subsequence of this sequence can converge, yet every language in each point of the sequence has the same norm.

PAGE 77

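[...] $\frac{1}{2} \#\big((L \setminus M) \cap A^j\big)$ and language $N_{2,j} \subseteq (M \setminus L) \cap A^j$ such that $\#N_{2,j} = \frac{1}{2} \#\big((M \setminus L) \cap A^j\big)$. Define language $N$ such that $N[j] = (L \cap M) \cup N_{1,j} \cup N_{2,j}$ for all $j \in N$. If $L \triangle M$ is an infinite language, it follows that $d(L, N) = d(N, M) = \frac{1}{2} d(L, M)$. If $L \triangle M$ is finite, this is true as well, since all three distances are 0.

[...] $J_{a,i}$ and $J_{a,i+1}$ in the sequence $J_a$ in the proof of Lemma 3.7. For $i > 0$, we obtain, by $2^k - 1$ applications of the above conclusion, a sequence of languages $N_{a,i} = \{N_j\}_{j \leq 2^k} = \{N_0 = J_{a,i}, N_1, \ldots, N_{2^k - 1}, N_{2^i} = J_{a,i+1}\}$. The distance between language $N_j$ and language $N_{j+1}$ in $N_{a,i}$, by Lemma 3.7, is
$\frac{1}{2^{i+k}}$. (3.14)
The concatenation of sequences $N_{a,0}$, $N_{a,1}$, and so on, gives a language sequence $N'_a$ which contains $N_a$ as a subsequence. Consider the point sequence $\mathsf{L}' = \pi(N'_a)$. From (3.14) $\mathsf{L}'$ is Cauchy convergent but from Lemma 3.7 it is not convergent.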

PAGE 78

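1. let $\mathsf{L} \approx \mathsf{M}$ if $\|L\| = \|M\|$ for all $L \in \mathsf{L}$, $M \in \mathsf{M}$;
2. let $\langle \mathsf{L} \rangle = \{\mathsf{M} \in Q_A : \mathsf{M} \approx \mathsf{L}\}$, and denote by $N$ the collection $\{\langle \mathsf{L} \rangle : \mathsf{L} \in Q_A\}$, denote elements of $N$ in script, $\mathcal{L}, \mathcal{M}, \mathcal{N}, \ldots$, and denote collections of elements of $N$ in boldface script, $\boldsymbol{\mathcal{L}}, \boldsymbol{\mathcal{M}}, \boldsymbol{\mathcal{N}}, \ldots$; and
3. let $\eta$ be the map from $Q_A$ to $N$ which takes point $\mathsf{L}$ to its equivalence class, $\langle \mathsf{L} \rangle$.

Finally, for $r \in [0, 1]$, let $\mathbf{r}$ denote $\{\mathsf{L} \in Q_A : \|L\| = r \text{ for all } L \in \mathsf{L}\}$.

It is obvious that $\approx$ is an equivalence relation. The quotient map is well-defined by Corollary 3.5.3. Since $\mathbf{r} = \langle \mathsf{M} \rangle$ for each $\mathsf{M} \in \mathbf{r}$, this implies by Remark 6, that $\mathbf{r} = \mathcal{M}$, for precisely one element $\mathcal{M} \in N$. We equip the upper quotient space with a metric.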

PAGE 79

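[...] $L^c$ 2. The only compact equivalence classes, however, are the singletons, $\mathbf{0}$ and $\mathbf{1}$.

Then the set $U$ is apparently equivalent to the subset topology on the unit interval. That is, there is a homeomorphism between $N$ and $[0, 1]$ if the function $\eta$ induces the quotient topology on $N$. We extend the abuse of notation used with languages and the quotient space and write $L \in \mathbf{r}$ (or $L \in \mathcal{L}$) to mean that language $L$ is to be found in points of the equivalence class $\mathbf{r}$ (or $L \in \mathbf{r} = \mathcal{L}$). We write $\mathbf{L} \subseteq \mathbf{r}$ to mean that each language in the language collection $\mathbf{L}$ is in one (but not necessarily the same) point in the equivalence class $\mathbf{r}$. We write $\mathbf{L} \subseteq \boldsymbol{\mathcal{L}}$ to indicate that the image $\eta[\pi(\mathbf{L})]$ is a subset of the collection of elements of $\boldsymbol{\mathcal{L}} \subseteq N$. We will show that, with exactly two exceptions, $\mathbf{r}$ is an uncountable subset of $Q_A$. The elements $\mathbf{0}$ and $\mathbf{1}$ are the exceptions.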

PAGE 80

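This provides the required condition, because if $d(L, M_1) = 1$ and $d(L, M_2) = 1$, this means that $L \triangle M_1$ and $L \triangle M_2$ are in $\mathbf{1}$ (from Definition 3.1.1), implying by Lemma 3.8 that $d(L \triangle M_1, L \triangle M_2) = 0$ and thus that $d(M_1, M_2) = 0$; so $M_1$ and $M_2$ are equivalent.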

PAGE 81

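[…] 2 for any language $L \in \mathsf{L}$.

If $L \in P_A$, $\|L\| = r$, and $0 \le s \le r\ (\le 1)$, then there exists a subset of $L$, the language $M \subseteq L$, such that $\|M\| = s$.

Proof. If $s = 0$, let $M = \emptyset$, and we are done. If $r = s$, let $M = L$, and we are done. Therefore, assume that $s \in (0, r)$. Note that $s/r > 0$. Form the language sequence $L = \{L[i]\}_{i \in \mathbb{N}}$, and, from this, form the integer sequence $\{m_i\}_{i \in \mathbb{N}}$ such that the following equality holds:

$$m_i = \left\lfloor (s/r)\, \#L[i] \right\rfloor. \qquad (3.17)$$

[…] to the cardinality of the set of shorter words.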
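The selection rule (3.17) can be sketched on finitely many levels. This is a minimal illustration, not the construction completed in the proof: it assumes that $L[i]$ is available as a finite set of words for each index $i$, the function name `thin_language` is hypothetical, and the choice of which $m_i$ words to keep (here the lexicographically first ones) is arbitrary.

```python
from fractions import Fraction
from math import floor


def thin_language(levels, s, r):
    """Keep floor((s/r) * #L[i]) words from each level L[i], as in (3.17).

    levels: dict mapping an index i to the finite set L[i]
            (an assumed finite stand-in for the language sequence).
    s, r:   exact rationals with 0 < s < r.
    """
    ratio = Fraction(s) / Fraction(r)
    thinned = {}
    for i, words in levels.items():
        m_i = floor(ratio * len(words))          # m_i from (3.17)
        thinned[i] = set(sorted(words)[:m_i])    # any m_i words of L[i] would do
    return thinned


levels = {1: {"a", "b"}, 2: {"aa", "ab", "ba", "bb"}}
thinned = thin_language(levels, Fraction(1, 4), Fraction(1, 2))
print({i: sorted(ws) for i, ws in thinned.items()})
```

Exact rationals are used so that the floor in (3.17) is not disturbed by floating-point rounding.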

PAGE 82

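It follows that, if $0 \le r \le s \le 1$, then for any language $L \in \mathbf{r}$, there exists a language $M \supseteq L$ such that $M \in \mathbf{s}$. The appropriate language is $L$, if $0 = r = s$; $A^*$, if $s = 1$; and may be constructed as in Proposition 2 by inverting the fractions in (3.17), (3.18), and (3.19), if $s \in (0, 1)$.

Then, by Proposition 2, there is a language $M \subseteq L$ such that $\|M\| = r - \varepsilon$. But since $r - \varepsilon$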
PAGE 84

[…] 2, every basis element of the topology except for $P_A$ itself excludes every language $M$ such that $M \in \mathsf{L}^c$, i.e., uncountably many languages $M$ such that $\|M\| = \|L\|$.

PAGE 85

There exist right ideals of $A^*$ in every equivalence class of $Q_A$, by Lemma 3.5. This is an observation regarding the location of languages in the upper quotient space, however. We consider the geometry of the situation more closely. A sketch of the relationship of the spaces and typical members of each is given in Figure 3.

3.4 The geometry of the Besicovitch topology

A language space is also lent a characteristic structure by the Besicovitch topology. It was shown that (only) the languages in $\mathsf{L}^c$ are at distance 1 from languages in the point $\mathsf{L}$ of $Q_A$. In subsection one, we reopen the discussion of languages in points of $Q_A$, to arrive at a more complete description of points. In subsection two, we develop a partial description of open and closed sets in the quotient space, and show how to produce an arbitrary boundary element of the closure of a basis set of

PAGE 88

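[…] 2, then $B_\varepsilon(L)$ contains $\mathbf{0}$ or $\mathbf{1}$ depending upon whether $\|L\|$ is less than or greater than $\tfrac{1}{2}$.

Note the anomalous character of neighborhoods with radius greater than $\tfrac{1}{2}$. Suppose language $L$ is such that $\|L\| = \tfrac{5}{8}$. Then $B_{11/16}(L)$ contains every language in both $\mathbf{0}$ and $\mathbf{1}$, but not every language $M$ such that $\|M\| = \tfrac{1}{2}$, since there is a point $\mathsf{M}$ containing a language $N$ such that $\|N\| = \tfrac{1}{2}$ and $d(M, N) = \tfrac{1}{8}$ (by Proposition 2), but, considering the languages in $\mathsf{M}^c$, such as $N^c$, the triangle inequality gives that $d(L, N^c) \ge 1 - \tfrac{1}{8} = \tfrac{7}{8} > \tfrac{11}{16}$. Hence, $\mathsf{M}^c$ is in the same equivalence class as $\mathsf{M}$, which is only $\tfrac{1}{8}$ away from the class to which $L$ belongs, but not in $B_{11/16}(L)$.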

PAGE 89

1. 2.

PAGE 90

Ifl",nolanguageMsuchthatlkMk>"isin

PAGE 91

2k
PAGE 93

qn qn+r=k s s s s

PAGE 94

Therefore,writing,forconvenience,qi=bi=ncandri=ikqifori2N,theBesicovitchnormofLcanbecalculatedasfollows.Assumek0n.ThenkLk=limsupk0!18><>:Pk02i=0sqi(Pn1j=0j)+sqk01Prk01j=0j 85

PAGE 95

[…] 2, then the basis element $B_{1/4}(M)$ of the topology, i.e., $B_{1/4}(M) = \{N \in P_A : d(M, N) < \tfrac{1}{4}\}$, contains no locally testable language. Therefore, the family $\mathrm{LOC}_A$ is not dense in the topologized space $P_A$.

3.5.3 Regular languages

All finite and local languages belong to $\mathbf{0}$. These sub-families of the regular languages contain no open set, nor does their closure contain any open sets in any of the three topological spaces associated with Besicovitch distance. Regular languages are possibly dense in the language space, and they are certainly to be found in elements of the upper quotient space at an arbitrarily small distance from any element in the upper quotient space.

PAGE 96

The regular languages are dense in the Besicovitch topology. The non-RE languages, however, are ubiquitous.

3.5.4 Non-RE languages

As we will readily show, there is a non-RE language in every point of $Q_A$. This is a consequence of the not-surprising fact that, because $d$ is a strict pseudo-metric, the equivalence classes are uncountable.

PAGE 97

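[…] $\}_{k \in \mathbb{N}}$, so that $\|L\| = 0$. Let the integer sequence $\{k_i\}_{i \in \mathbb{N}}$ be a strictly increasing, non-negative sequence that is not recursively enumerable. Then the language $L' = \{a^{2^{k_i}} : i \in \mathbb{N}\}$ is non-RE, and $L' \subseteq L$. Moreover, there are uncountably many non-RE sequences $\{k_i\}_{i \in \mathbb{N}}$.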

PAGE 98

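[Example 3.61, [16]] Every computable number is nonrandom.

Proof. The machine which generates the $\alpha$-ary expansion of a computable number $q$ successively confines the value of $q$ to an interval of $\mu_u$-measure $\alpha^{-1}, \alpha^{-2}, \ldots$. This is a randomness test on $([0, 1], B_u, \mu_u)$ containing the number $q$. If, for example, the expansion of the real number $q$ to three places is $0.d_1 d_2 d_3$, then

$$q \in \left[d_1 \alpha^{-1}, (d_1 + 1)\alpha^{-1}\right] \cap \left[(d_1 \alpha + d_2)\alpha^{-2}, (d_1 \alpha + d_2 + 1)\alpha^{-2}\right] \cap \left[(d_1 \alpha^2 + d_2 \alpha + d_3)\alpha^{-3}, (d_1 \alpha^2 + d_2 \alpha + d_3 + 1)\alpha^{-3}\right],$$

which is clearly a randomness test.

From Definition 3.6.1, every language in $\mathbf{0}$ is nonrandom in the Besicovitch topology. Combining this with Lemma 3.20, it emerges that there is a nonrandom language that is non-RE, which is to say, the sets of random languages under the Besicovitch and Cantor topologies do not coincide.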
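The shrinking intervals in this proof can be computed directly from the digits. The sketch below is illustrative only: the function name `nested_intervals` is hypothetical, the base is passed as an ordinary integer, and exact rationals are used so that the endpoints can be compared without rounding.

```python
from fractions import Fraction


def nested_intervals(digits, base):
    """Intervals [n / base**k, (n + 1) / base**k] fixed by the first k digits."""
    intervals = []
    numerator = 0
    for k, d in enumerate(digits, start=1):
        numerator = numerator * base + d          # integer formed by d_1 ... d_k
        scale = base ** k
        intervals.append((Fraction(numerator, scale), Fraction(numerator + 1, scale)))
    return intervals


intervals = nested_intervals([1, 4, 2], 10)
for low, high in intervals:
    print(low, high, high - low)
```

Each interval has length `base ** -k` and lies inside the previous one, which is the sense in which the digits of a computable number confine it to sets of arbitrarily small measure.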

PAGE 104

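Then the function $\|\cdot\|_h$ is indeed a language norm, since $\|\emptyset\|_h = 0$ and, if $L \subseteq M$, then $\#L[k] \le \#M[k]$ for all $k$; this, in turn, means that the function $d_h$ is a language pseudo-metric.

If there exists a sequence of integers $\{k_i\}_{i \in \mathbb{N}}$ such that $\#(L \triangle_{k_i} M) = \alpha^{k_i}$ for all $i \in \mathbb{N}$, then $\lim_{i \to \infty}$ […]

Except when the language $L$ is finite, the entropic norm is proportional to the topological entropy of an infinite factorial language. Indeed, $\|L\|_h = h(L) / \log_2 \alpha$, if $L$ is a factorial language. In general, however, only an inequality between $\|L\|_h$ and $h(L) / \log_2 \alpha$ holds. Note that the permutative version of the entropic distance, namely, $d^{(\pi)}_h(L, M) = \limsup_{k \to \infty}$ […]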

PAGE 107

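1. Let $[L]_h$ denote the $h$-equivalence class containing language $L$.
2. Then the entropic quotient space $Q_A^h$ is the collection $\{[L]_h : L \in P_A\}$; $\mathsf{L}, \mathsf{M}, \mathsf{N}, \ldots$ will denote points of $Q_A^h$, and $\mathbf{L}, \mathbf{M}, \mathbf{N}, \ldots$ will denote subsets of $Q_A^h$.
3. The quotient mapping from $P_A$ to $Q_A^h$ takes a language to its $h$-equivalence class.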

PAGE 108

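1. Let $\langle \mathsf{L} \rangle_h$ denote the $h$-equivalence class of point $\mathsf{L}$ in $Q_A^h$.
2. Let $\mathcal{N}_h$ denote the space made up of the collection of all $h$-equivalence classes, where $\mathcal{L}, \mathcal{M}, \ldots$ denote elements of this space and boldface script letters denote subsets of this space.
3. Let the quotient map from $Q_A^h$ to $\mathcal{N}_h$ take points in $Q_A^h$ to their $h$-equivalence classes.

If $r \in [0, 1]$, we will denote by $\mathbf{r}_h$ the set $\{\mathsf{L} \in Q_A^h : \|L\|_h = r, \text{ if } L \in \mathsf{L}\}$ of all points containing languages with norm $r$. As with elements of $\mathcal{N}$, it is clear that, if $\mathcal{L} \in \mathcal{N}_h$, then $\mathcal{L} = \mathbf{r}_h$ for some $r \in [0, 1]$.

Although it is trivial that $\mathbf{0}_h$ is a singleton in $Q_A^h$, the following example illustrates the dissimilarity between $\mathbf{1}_h$ and $\mathbf{1}$.

Let $L'$ be an $r$-expansive language, where $r = \tfrac{2}{3}$. Let $L'' \subseteq L'$ be an $r$-expansive language, where $r = \tfrac{1}{3}$. Construct languages $L_1$ and $L_2$ as follows. Let

$$L_1[k] = \begin{cases} L'[k], & \text{if } k \text{ is not a power of } 2; \text{ and} \\ A^k, & \text{if } k = 2^i,\ i \in \mathbb{N} \end{cases}$$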

PAGE 109

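[…] 3. This is completely at odds with the relationship of elements of $\mathbf{1}$ to each other: elements of $\mathbf{1}_h$ have their own complements in $\mathbf{1}_h$; pairs of points in $\mathbf{1}_h$ need not be closer than $\tfrac{2}{3}$ to each other, in $d_h$ distance; and, thus, $\mathbf{1}_h$ contains representatives of different $h$-equivalence classes.

Finally, we define distance on $\mathcal{N}_h$ as follows.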

PAGE 111

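This establishes our claim. In other words, for each point $\mathsf{L}^{(h)}$ of the entropic quotient space, there exists a point $\mathsf{M}$ of $Q_A$ such that, if language $L$ is such that $L \in \mathsf{L}^{(h)}$, then $L \in \mathsf{M}$. This leads to the following corollary.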

PAGE 113

since, by assumption, $s + 2\varepsilon = r < 1$. As $\varepsilon$ goes to 0, the upper bound established above and (4.9), taken together, prove that, if $r \ne s$, then $d_h(L, M) = \max\{r, s\}$.

If $\varepsilon$
PAGE 114

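If $\varepsilon$ […] $r$, then $\|M\|_h < \varepsilon$.

Proof. The first conclusion follows from Lemma 4.8: if $d_h(L, M) < \varepsilon$ […] $\varepsilon$, if $\|L\|_h \ne \|M\|_h$, which is a contradiction. The second follows similarly: if $\|M\|_h > \varepsilon$ […] $r$, then $d_h(L, M) = \max\{\|L\|_h, \|M\|_h\} \ge \varepsilon$, meaning that $M \notin B^h_\varepsilon(L)$.

Note that Corollary 4.8.3 does not say precisely what happens within the element $\mathbf{r}_h$. The converse of part 2 of the corollary is obvious: open neighborhoods of radius greater than the norm of the language around which they are formed consist of the collection of all languages of norm less than the radius. But we now inquire into the content of open neighborhoods with radius less than or equal to the norm of the center language. From part 1 of the corollary we know that, however else they may be described, they are contained within the $h$-equivalence class to which the center belongs.

[…] 2, (4.10) and, hence, $r + \varepsilon$ […] $\tfrac{1}{2} + \tfrac{1}{2} = 1$. This implies the following inequality if $k > k_{\varepsilon, r}$: $\alpha^{k(r + \varepsilon)} + \alpha^{k r'}$ […] $k_{\varepsilon, r}$, such that $\#L[k_i] > \alpha^{k_i (r - \varepsilon)}$ for all $i \in \mathbb{N}$. There exists a language $L_{r', i} \subseteq A^{k} \setminus L[k]$ such that $\#L_{r', i} = \lfloor \alpha^{k_i r'} \rfloor$, for all $i \in \mathbb{N}$, as there are enough words to form such a language by (4.11). Let language $M = L \cup \bigcup_{i \in \mathbb{N}} L_{r', i}$. Then $\|L \triangle M\|_h = r'$. Thus, $d_h(L, M) = r'$ […] $r'$, by Lemma 4.8, which is a contradiction.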
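The identity $d_h(L, M) = \max\{\|L\|_h, \|M\|_h\}$ for languages of different norm (Lemma 4.8) can be checked numerically on a small case. The sketch below rests on assumptions not spelled out in this excerpt: that $L[k]$ is the set of words of $L$ of length $k$, and that the entropic norm is the limit superior of $\log_\alpha(\#L[k]) / k$, with $\alpha$ the alphabet size. The function name `entropic_norm_estimate` is hypothetical, and a single finite $k$ only approximates the limit.

```python
from itertools import product
from math import log


def entropic_norm_estimate(level_k, k, alphabet_size):
    """Finite-k estimate log_alpha(#L[k]) / k of the entropic norm (0 for an empty level)."""
    if not level_k:
        return 0.0
    return log(len(level_k), alphabet_size) / k


k = 12
full = {"".join(w) for w in product("ab", repeat=k)}    # level k of the full language
no_bb = {w for w in full if "bb" not in w}               # level k of the words avoiding bb

norm_full = entropic_norm_estimate(full, k, 2)
norm_no_bb = entropic_norm_estimate(no_bb, k, 2)
distance = entropic_norm_estimate(full ^ no_bb, k, 2)   # norm of the symmetric difference

print(round(norm_full, 4), round(norm_no_bb, 4), round(distance, 4))
```

The two languages have different growth rates, and the estimate for their symmetric difference is already close to the larger of the two norms at this small length.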

PAGE 115

1

PAGE 118

klogs+n1

PAGE 122

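1. that $d$ is an arbitrary language pseudo-metric with a corresponding language norm $\|\cdot\|$,
2. that $Q_A$ is the quotient space defined by the equivalence under which $L$ and $M$ are equivalent if $d(L, M) = 0$, and points in $Q_A$ are denoted in sans serif font,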

PAGE 123

3. that the quotient map from $P_A$ into $Q_A$ is the map under which the quotient metric $\mathsf{d}$ is an isometry of $d$, and
4. the quotient topology is the image under the quotient map of the pseudo-metric topology induced by $d$.
5. Whenever necessary, $\mathbf{r}$ will denote the class of points in the quotient space containing all languages at distance $r$ from the point containing the empty language, which is to say, all languages of norm $r$.

Where possible, of course, $\mathcal{N}$ will be the unit interval. This requires not only some type of normalization of the norm, but its surjectivity on $[0, 1]$, so we do not assume $\mathcal{N} = [0, 1]$ in this chapter.

It was said in Section 1.3 that semiring morphisms preserve aspects of language structure. We proceed from this standpoint.

1. let $M(A, B)$ denote the collection of language morphisms from $P_A$ into $P_B$, a subset of the larger class of monoid morphisms;
2. let the collection of language space codes from $P_A$ to $P_B$ be denoted $C(A, B)$; and
3. let $W(A, B)$ denote the collection of non-expansive, non-erasing language morphisms from $P_A$ into $P_B$ which are necessarily injective, i.e., every language morphism which is an extension (see Proposition 1) of an arbitrary word morphism $\varphi \colon A \to B$.

It is clear from earlier remarks that $M(A, B) \supsetneq W(A, B)$ and $C(A, B) \supsetneq W(A, B)$.

5.1 The continuity of morphisms

The continuity of a morphism in $M(A, B)$, one for which every open set of languages in $P_B$ is the image under the morphism of an open set in $P_A$, would at first seem to depend on the nature of the morphism and the character of the language pseudo-metrics used to topologize the two spaces. We show that dependence of continuity upon the language pseudo-metric takes the form of dependence upon the number of symbols in the alphabets of the spaces.

The first question, then, is: which morphisms are continuous in a language space topologized by a pseudo-metric $d$? Continuity between two language spaces topologized by $d_1$ is a

PAGE 125

Consider the morphism in $M(N_4, N_2)$ which is an extension of the word morphism $\varphi$ such that $\varphi(1) = \varphi(3) = 1$ and $\varphi(2) = \varphi(4) = 2$; and consider the language $L = \{1\} \in P_2$. The languages $\{1\}$, $\{3\}$, and $\{1, 3\}$ comprise the inverse image of $L$. A cylinder of length $n$ set around $\{1\} \in P_2$ is the image, under the morphism, of three cylinder sets in $P_4$, namely $C_{\{1\}, n}$, $C_{\{3\}, n}$, and $C_{\{1, 3\}, n}$, i.e., one around each of these languages. This clearly can be extended to any language in $P_2$.

A rather restricted generalization regarding continuity of language codes is presented.

Every language code is continuous on its image.

A more general issue is, given a language distance and the topology it induces, where does a morphism map different families of languages? One possibility is a mapping which, although not trivial,
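Returning to the example of the word morphism with $\varphi(1) = \varphi(3) = 1$ and $\varphi(2) = \varphi(4) = 2$, the inverse image of the language $\{1\}$ can be found by brute force. This is a sketch with hypothetical function names; it assumes the language morphism acts word by word, and it searches only single-symbol words, which suffices here because the word morphism preserves length.

```python
from itertools import combinations

# Word morphism from the example: symbols 1 and 3 map to 1, symbols 2 and 4 map to 2.
PHI = {"1": "1", "3": "1", "2": "2", "4": "2"}


def image_of_word(word):
    return "".join(PHI[symbol] for symbol in word)


def image_of_language(language):
    return frozenset(image_of_word(w) for w in language)


def preimages(target, candidate_words):
    """All nonempty subsets of candidate_words whose image is exactly target."""
    target = frozenset(target)
    found = []
    for size in range(1, len(candidate_words) + 1):
        for subset in combinations(sorted(candidate_words), size):
            if image_of_language(subset) == target:
                found.append(set(subset))
    return found


result = preimages({"1"}, ["1", "2", "3", "4"])
print(result)
```

The search returns exactly the three languages named in the example.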

PAGE 128

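Then the following is obviously the case: $|\varphi(w)|$ is the dot product $\langle \cdot, \cdot \rangle$ of $w_\varphi$ with the vector of $\varphi$ symbol lengths, where $\langle \cdot, \cdot \rangle$ is the usual dot product of vectors. Conversely, if $x$ is a $g$-component vector over $\mathbb{N}$ whose dot product with the vector of $\varphi$ symbol lengths is $k$, then for any word $x \in A^*$ which has exactly $x_i$ symbols in $A_i$, for $1 \le i \le g$, $|\varphi(x)| = k$.

[…], which is to say, the number of possible $\varphi$ symbol-length vectors. Next we require the number of $\varphi$ symbol-matrices $W^\varphi_x$, that is, $g \times h$ matrices over $\mathbb{N}$ the column vectors of which sum to the entries in a given $\varphi$ symbol-length vector, with $w_{ij} = 0$ if $j >$ […]$_i$. Finally, given a $\varphi$ symbol matrix $W^\varphi_x$, we need to calculate the number of distinct words $w$ such that $|w|_{a_{ij}} = w_{ij}$ for each symbol $a_{ij} \in A$, where $1 \le i \le g$, $1 \le j \le h$. The final step is standard: the number of distinct permutations of $w^\varphi = |w|$ symbols of which there are $w_{ij}$ instances each of symbols $a_{ij} \in A$, where $1 \le i \le g$, $1 \le j \le h$, is […]$_\varphi(w)$, where […]$_\varphi(W^\varphi_w) = |w|!$ […]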

PAGE 129

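Let $\varphi \in C(A, B)$, where $A = \{a, b\}$, $B = \{d, e, f\}$ such that $\varphi \colon a \mapsto de$ […]

Then $m_\varphi = 2$, $A_1 = \{b\}$, $A_2 = \{a\}$, […]. If $k$ is even, i.e., $k = 2l$, $l \in \mathbb{N}$, then $C_k = \{(2i, l - i) : 0 \le i \le l\}$, and if $k = 2l + 1$, then $C_k = \{(2i + 1, l - i) : 0 \le i \le l\}$. The vectors in each of these may therefore be indexed by $i \in N_l \cup \{0\}$. For each $k \in \mathbb{N}$ and each $c \in C_k$, $D_{c_1} = \{(c_1)\}$ and $D_{c_2} = \{(c_2)\}$, while $|c| = c_1 + c_2$, which is to say $|c| = l + i$ if $k = 2l$ and $|c| = l + i + 1$ if $k = 2l + 1$. Thus (5.3) simplifies to (5.1), since there is only one partition of each set $D_{c_i}$ for $i \in N_g = \{1, 2\}$. Further, each vector $d_i$ has only one component for $i \in \{1, 2\}$. Thus, for $k = 2l$, the total number of words in $A^*$ with $\varphi$-images in $B^k$ is given by

$$\#\left[\varphi(A^*) \cap B^k\right] = \sum_{i=0}^{l} \frac{(l + i)!}{(2i)!\,(l - i)!}$$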

PAGE 130

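[…] $(2i + 1)!\,(k - i)!$.

Calculation shows that, therefore,

$$\#\left[\varphi(A^*) \cap B^k\right] + \#\left[\varphi(A^*) \cap B^{k+1}\right] = \#\left[\varphi(A^*) \cap B^{k+2}\right],$$

for, if $k = 2l$,

$$\sum_{i=0}^{l} \left[\frac{(l + i)!}{(2i)!\,(l - i)!} + \frac{(l + i + 1)!}{(2i + 1)!\,(l - i)!}\right] = 1 + \sum_{i=1}^{l} \left(1 + \frac{2i}{l - i + 1}\right) \frac{(l + i)!}{(2i)!\,(l - i)!} + 1$$

$$= \frac{[(l + 1) + (l + 1)]!}{[2(l + 1)]!\,[(l + 1) - (l + 1)]!} + \sum_{i=1}^{l} \frac{(l + i + 1)!}{(2i)!\,(l + 1 - i)!} + \frac{[(l + 1) + 0]!}{(2 \cdot 0)!\,[(l + 1) - 0]!} = \sum_{i=0}^{l+1} \frac{(l + 1 + i)!}{(2i)!\,(l + 1 - i)!},$$

and similarly for $k = 2l + 1$. Since obviously $\varphi(A^*) \cap B = \{f\}$ and $\varphi(A^*) \cap B^2 = \{de, ff\}$, it follows that $\#\,\varphi(A^*) \cap B$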
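The recurrence above can be checked by direct enumeration. The sketch below assumes the code is $a \mapsto de$, $b \mapsto f$; the image of $b$ is not legible in this copy and is inferred from the stated images $\{f\}$ and $\{de, ff\}$. The function names are hypothetical, and the closed form for odd $k$ is the analogue of the even case, with $2i + 1$ occurrences of $b$.

```python
from itertools import product
from math import factorial

# Assumed code: a -> "de", b -> "f" (the image of b is an inference, see above).
PHI = {"a": "de", "b": "f"}


def count_by_enumeration(k):
    """Number of words over {a, b} whose image under PHI has length exactly k."""
    total = 0
    for n in range(k + 1):                       # a word of length n has image length >= n
        for letters in product("ab", repeat=n):
            if sum(len(PHI[x]) for x in letters) == k:
                total += 1
    return total


def count_by_formula(k):
    """Sum over i of (word length)! / ((number of b)! * (number of a)!)."""
    l, parity = divmod(k, 2)                     # k = 2l or k = 2l + 1
    return sum(
        factorial(l + i + parity) // (factorial(2 * i + parity) * factorial(l - i))
        for i in range(l + 1)
    )


counts = [count_by_enumeration(k) for k in range(10)]
print(counts)
```

For these small lengths the enumerated counts agree with the closed form and satisfy the stated three-term recurrence.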
PAGE 133

1. Identify a language norm, i.e., a map of a language space into the non-negative real numbers, with the characteristic that the empty language maps to 0 and the norm of a language is no greater than the norm of a superset of that language.
2. Define the corresponding language pseudo-metric as the norm of the symmetric set-difference of two languages.
3. Determine basic properties of these functions, such as whether they are additive on the union of sets.

PAGE 134

4. If the pseudo-metric is strict, map languages at distance 0 from each other to equivalence classes. Investigate the properties of these classes and the character of convergence in the resulting metric quotient space.
5. If the quotient space is not compact, try to establish an upper quotient space by identifying equivalence classes containing languages with the same norm.
6. Investigate the location of the member-families of the Chomsky hierarchy and other known characteristics of languages in the three-fold topology induced.
7. Determine the definition of randomness in the topologized language space.

In pursuing this approach, we wanted to determine whether such a method could give clear topological and linguistic results, whether different forms of randomness emerged, and what effect the use of different topologies would have on language morphisms.

The first metric was based on finding the shortest difference between languages, the second was based on the cumulative differences between languages, and the third was based on the exponential growth rate of differences between languages. Lemmas 4.10 and 3.17, and Corollaries 3.18.1, 2.2.1, and 4.8.5 show the following.

Let $A = \{a, b, c\}$. Consider the following languages over $A$:

PAGE 135

1/4 1/2 1/2 0 1/2 1/2 1/2 0 1/4 1/2 1/4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Table 1: Language distances between small languages

Consider the following languages $L_5$, $L_6$, and $L_7$, derived from $L_2$, $L_3$, and $L_4$ of Example 15, respectively. They fall into the language families $\mathrm{REG} \setminus \mathrm{LOC}$, $\mathrm{CS} \setminus \mathrm{REG}$ and $\mathrm{CF} \setminus \mathrm{CS}$, respectively.

PAGE 136

1/4 1/2 0 1/2 1/2 0 1/4 23/36 0 47/72 47/72 0 1 1 0 1 1 0 3/4 L6 3/8 L7 9/26

1. The family of regular languages is dense in the Besicovitch language topological space.

PAGE 137

Every language code (injective language morphism generated by a literal mapping) is continuous on its image. It was shown, in Lemma 5.2, that a language code is continuous on its image if it is a map from a topologized space to a space with a finer topology.

Additionally, the investigation into randomness on the entropic topology was transferred to randomness on the upper quotient space, with the unsatisfactory result that all languages sufficiently large might be regarded as nonrandom. An adjustment to this approach is clearly in order.

6.2.1 Distinguishing regular languages metrically

The distinction of certain subfamilies of the regular languages by metrics has been demonstrated in that they have been forced into the same equivalence class and therefore cannot be dense in the topologized language space. But it remains to accomplish a similar distinction by metric means between languages in $\mathrm{LIN}_A$, for instance, and languages in $\mathrm{REG}_A$. A sketch of one possible attack on this problem is presented here.

The idea is to construct a metric designed with syntactic goals. To make certain that regular languages are differentiated by this metric from non-regular languages, take advantage of the Myhill-Nerode theorem in the following manner. Let $d_R(L, M)$ be given by the ratio of the number of follower sets in $L \triangle_k M$ to $k$, with the limit supremum taken as $k$ goes to infinity.
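The ratio inside this proposed distance can be sketched for a fixed $k$. The sketch makes two assumptions that the text does not settle: the follower set of a prefix $u$ in a finite set of words $S$ is taken to be $\{v : uv \in S\}$, and $L \triangle_k M$ is read as the words of length $k$ in the symmetric difference. Both function names are hypothetical, and only nonempty follower sets are counted.

```python
def follower_sets(words):
    """Distinct nonempty follower sets {v : uv in words}, taken over all prefixes u."""
    words = set(words)
    prefixes = {w[:i] for w in words for i in range(len(w) + 1)}
    return {
        frozenset(w[len(u):] for w in words if w.startswith(u))
        for u in prefixes
    }


def follower_ratio(L, M, k):
    """Number of follower sets of the length-k part of the symmetric difference, over k."""
    diff_k = {w for w in set(L) ^ set(M) if len(w) == k}
    return len(follower_sets(diff_k)) / k


print(len(follower_sets({"ab", "aab"})))
print(follower_ratio({"ab", "ba"}, {"ab"}, 2))
```

The quantity $d_R$ would then be the limit superior of this ratio over $k$, which a finite computation can only suggest.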

PAGE 139

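(6.2)

Note that $d_{\mathrm{FIN}_A} = d_1$. Open balls of radius $\varepsilon$ around a language $L \in P_A$, given $\mathcal{F} \subseteq \mathrm{FIN}_A$, consist of the following:

$$B_{\varepsilon, \mathcal{F}}(L) = \begin{cases} \{L\}, & \text{if } d_{\mathcal{F}}(L, M) \ge \varepsilon,\ \forall M \in P_A \setminus L; \\ P_A, & \text{if } \varepsilon \ge 1; \text{ and} \\ C_{\mathcal{F}_L, \varepsilon}, & \text{otherwise,} \end{cases} \qquad (6.3)$$

where $\mathcal{F}_L = \{F \in \mathcal{F} : \#F \triangle N$ implies $L \triangle$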
PAGE 140

1. The metric space $(P_A, d_{\mathcal{L}})$ is not necessarily compact.
2. Let […]
3. Let $I_{\mathcal{F}}$ denote the set of all isolated points in $(P_A, d_{\mathcal{F}})$. Then the following are equivalent:
(a) The set $I_{\mathcal{F}}$ is dense in $(P_A, d_1)$;
(b) The set $\mathcal{F}$ is nowhere dense in $(P_A, d_{\mathcal{F}})$; and
(c) The set $\mathcal{F}$ is a maximal nowhere dense set.

6.3 Conclusion

The space of languages proves to be rich in material and relevant questions from the standpoint of topology, morphisms, and randomness. Language collections are quantifiable as objects and in relation to each other. The approach of giving a rigorous definition of a language pseudo-metric, by supplementing ordinary pseudo-metric conditions with the requirement that languages with a greater number of distinct words be farther apart, proves to make the structures within language collections more apparent. By Proposition 1, each language pseudo-metric may be associated with a language norm, i.e., a definition of the relative size of a language. This has shown interesting results and raised intriguing questions.

Under a Cantor distance, language spaces are totally disconnected; they are all copies of the Cantor space. By Theorem 2.1, the non-recursively enumerable languages are the random languages in the Cantor space.

Under a Besicovitch distance, the language space is connected, non-compact, and can be partitioned into equivalence classes with unique antipodes. We used Proposition 2 to show that, under a second quotient map, the unit interval appears as the compact quotient of the space.

We adapted topological entropy into an entropic distance, and showed, in Corollary 4.8.5, that entropic distance induces a topology which is a refinement of the Besicovitch topology. We found a subfamily of the regular languages which is dense in this space,

PAGE 143

[1] J. Arsuaga, J. Roca, D. W. Sumners, S. Trigueros, M. Vazquez. Knotting probability of DNA molecules confined in restricted volumes: DNA knotting in phage capsids. PNAS. 99:8 (April 16, 2002) 5373.
[2] A. S. Besicovitch. Almost Periodic Functions. Dover Publications, Inc., New York, NY, 1954.
[3] F. Blanchard, E. Formenti, P. Kurka. Cellular automata in the Cantor, Besicovitch, and Weyl topological spaces. Complex Systems. 11:2 (1997) 107.
[4] C. Calude, P. Hertling, H. Jurgensen, K. Weihrauch. Randomness on full shift spaces. Chaos, Solitons, and Fractals. 12 (2001) 491.
[5] C. Calude, S. Marcus, L. Staiger. A topological characterization of random sequences. Information Processing Letters. 88 (2003) 245.
[6] A. Carbone and M. Gromov. Functional labels and syntactic entropy on DNA strings and proteins. Theoretical Computer Science. 300 (2003) 35.
[7] G. Cattaneo, E. Formenti, L. Margara, J. Mazoyer. A Shift-Invariant Metric on $S^{\mathbb{Z}}$ Inducing a Non-trivial Topology. Mathematical Foundations of Computer Science (MFCS '97, Bratislava). I. Privara, P. Rusika, eds. Lecture Notes in Computer Science 1295 (1997) 179-188.
[8] X. Chen, M. Li, X. Li, B. Ma, P. M. B. Vitanyi. The Similarity Metric. IEEE Transactions on Information Theory. 50:12 (December, 2004) 3250.
[9] C. Choffrut and J. Karhumaki. Combinatorics of Words. G. Rozenberg, A. Salomaa, eds. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. Springer-Verlag. Berlin, 1997. 329.

PAGE 144

[10] R. Cilibrasi and P. Vitanyi. Automated Meaning Discovery Using Google. http://homepages.cwi.nl/paulv/papers/amdug.pdf. 1.
[11] K. Culik II, J. Karhumaki, and P. Salmela. Fixed Point Approach to Commutation of Languages. Aspects of Molecular Computing, Essays Dedicated to Tom Head on the Occasion of His 70th Birthday. N. Jonoska, G. Paun, G. Rozenberg, eds. Springer-Verlag. Berlin, 2004.
[12] E. Formenti. On the sensitivity of additive cellular automata in Besicovitch topologies. Theoretical Computer Science. 301 (2003) 341.
[13] D. Genova and N. Jonoska. The Topological Space of Formal Languages and Forbidding-Enforcing Systems. Preprint.
[14] T. Harju and J. Karhumaki. Morphisms. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. G. Rozenberg, A. Salomaa, eds. Springer-Verlag. Berlin, 1997. 439.
[15] M. Herlihy and N. Shavit. The Topological Structure of Asynchronous Computability. Journal of the ACM. 46:6 (Nov 1999) 858.
[16] P. Hertling and K. Weihrauch. Random elements in effective topological spaces with measure. Information and Computation. 181 (2003) 32.
[17] S. Hussini, L. Kari, S. Konstanides. Coding properties of DNA languages. Proceedings of the 7th International Meeting on DNA Based Computers. N. Jonoska and N. C. Seeman, eds. Lecture Notes in Computer Science 2340 (2001) 107.
[18] D. Jones, C. Orengo, J. Thornton, eds. Bioinformatics: Genes, Proteins, & Computers. BIOS Scientific Publishers, Limited. Oxford, 2003.
[19] N. Jonoska and K. Mahalingam. Languages of DNA based code words. Preliminary Proceedings of the 9th International Meeting on DNA Based Computers. J. Chien and J. Reif, eds. (2003) 58.
[20] N. Jonoska, D. Kephart, K. Mahalingam. Generating DNA code words. Congressus Numerantium. 156 (2002) 99.

PAGE 145

[21] N. Jonoska and M. Saito. Algebraic and topological models for DNA recombinant processes. Developments in Language Theory. C. S. Calude et al., eds. Lecture Notes in Computer Science 3340 (2004) 49.
[22] D. Kephart and J. Lefevre. CODEGEN: the generation and testing of DNA code words. Proceedings of the IEEE 2004 Conference on Evolutionary Computation. (2004) 1865.
[23] B. P. Kitchens. Symbolic Dynamics; One-sided, Two-sided and Countable Markov Shifts. Springer-Verlag. Berlin, 1998.
[24] W. Kuich. Semirings and Formal Power Series: Their Relevance to Formal Languages and Automata. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. G. Rozenberg, A. Salomaa, eds. Springer-Verlag. Berlin, 1997. 609.
[25] M. Li and P. Vitanyi. An Introduction to Kolmogorov Complexity and Its Applications. 2nd ed. Springer Verlag. New York, 1997.
[26] D. Lind and B. Marcus. Symbolic Dynamics and Coding. Cambridge University Press. New York, NY, 1999.
[27] S. MacLane. Categories for the Working Mathematician. Springer-Verlag. New York, NY, 1971.
[28] A. Mateescu and A. Salomaa. Aspects of Classical Language Theory. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. G. Rozenberg, A. Salomaa, eds. Springer-Verlag, Berlin, 1997. 175.
[29] Per Martin-Lof. Definition of Random Sequences. Information and Control. 9 (1966) 602-619.
[30] J. R. Munkres. Topology. 2nd ed. Prentice-Hall, Inc., Upper Saddle River, NJ, 2000.
[31] J. Roca, D. W. Sumners, M. E. Vazquez. Novel display of knotted DNA molecules by two-dimensional gel electrophoresis. Nucleic Acids Research. 29:13 (2001) e671.
[32] D. W. Sumners. Lifting the Cover: Using Topology to Probe the Hidden Action of Enzymes. Notices of the AMS. 42:5 (May 1995) 528.

PAGE 146

[33] G. Paun, G. Rozenberg, A. Salomaa. DNA Computing; New Computing Paradigms. Springer-Verlag. Berlin, 1998.
[34] J.-E. Pin. Syntactic Semigroups. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. G. Rozenberg, A. Salomaa, eds. Springer-Verlag, Berlin, 1997. 680.
[35] G. Rozenberg, A. Salomaa, eds. Handbook of Formal Languages, Vol. 1: Word, Language, Grammar. Springer-Verlag. Berlin, 1997.
[36] A. Ray and A. Surana. A Language Measure for Supervisory Control. Applied Mathematics Letters. 16 (2003) 985.
[37] W. Rudin. Real and Complex Analysis. McGraw-Hill Book Co. Singapore, 1987.
[38] V. Vianu. The Bodnarchuk Metric Space of Languages and the topology of the learning space. Mathematical Foundations of Computer Science 1977, Proceedings, 6th Symposium, Tatranska Lomnica, September 5, 1977. J. Gruska, ed. Lecture Notes in Computer Science 53 (1977) 537.
[39] S. Wolfram. Theory and Applications of Cellular Automata. World Scientific. Singapore, 1986.


xml version 1.0 encoding UTF-8 standalone no
record xmlns http:www.loc.govMARC21slim xmlns:xsi http:www.w3.org2001XMLSchema-instance xsi:schemaLocation http:www.loc.govstandardsmarcxmlschemaMARC21slim.xsd
leader nam Ka
controlfield tag 001 001670381
003 fts
005 20051216095137.0
006 m||||e|||d||||||||
007 cr mnu|||uuuuu
008 051122s2005 flu sbm s000 0 eng d
datafield ind1 8 ind2 024
subfield code a E14-SFE0001250
035
(OCoLC)62321793
040
FHM
c FHM
1 100
Kephart, David. E.
0 245
Topology, morphisms, and randomness in the space of formal languages
h [electronic resource] /
by David E. Kephart.
260
[Tampa, Fla.] :
b University of South Florida,
2005.
502
Thesis (Ph.D.)--University of South Florida, 2005.
504
Includes bibliographical references.
516
Text (Electronic thesis) in PDF format.
538
System requirements: World Wide Web browser and PDF reader.
Mode of access: World Wide Web.
500
Title from PDF of title page.
Document formatted into pages; contains 147 pages.
Includes vita.
520
ABSTRACT: This paper outlines and implements a systematic approach to the establishment, investigation, and testing of distances and topologies on language spaces. The collection of all languages over a given number of symbols forms a semiring, appropriately termed a language space. Families of languages are defined by interrelations among words. The traditional classification begins with the syntax rules or grammar of the language, that is, the word-transformations by which the entire language can be produced from a single axiom, or starting word. The study of distances between languages as objects and of the topologies induced by language distances upon spaces of languages has been of a limited character. Known language distances introduce topologically awkward features into a language space, such as total disconnectedness.This dissertation examines the topologies induced by three language distances, the effect that each one has upon the notion of a random language, and discusses continuity and word-distribution of structure-preserving language transformations, i.e., morphisms. This approach starts from metric-like requirements, but adduces an additional condition intuitively appropriate to gauging language distance. At the same time, strict, i.e. non-metric pseudo-metrics are admitted as possible language distance functions, and these are investigated by the use of metric quotient spaces.The study of the notion of randomness implied by the topology induced by such a pseudo-metric on a language space offers insight into the structure of language spaces and verifies the viability of the pseudo-metric.Three language pseudo-metrics are studied in this dissertation: a version of the most commonly-used (Cantor) word metric; an upper-density (Besicovitch) pseudo-metric borrowed from the study of cellular automata; and an adaptation and normalization of topological entropy, each evaluated on the symmetric set-difference between languages. 
It is shown that each of these distances induces a distinct topology on the space of languages. The topology induced by Cantor distance is compact and totally disconnected, the topologies induced by the other two are non-compact, with entropic distance resulting in a topology that is the strict refinement of the Besicovitch topology, enhancing the picture of the smaller languages in the Besicovitch topology. It is also shown that none of the three topologies gives quantitative expression to the distinction between regular and linear languages, although, using Martin-Lof randomness tests, it is shown that each pseudo-metric is associated with a new notion of a random language. A classification of language mappings is introduced, with the aim of identifying those which best preserve the structure of languages under specific topologies. There are results regarding continuity of mappings, the matrix representation of the pre-image of certain morphisms, and the formal expressions of the probability distribution of the image of certain morphisms. The continuity of an injective morphism on its image is demonstrated under limited conditions. Finally, the questions which this approach leaves open are detailed. While basic facts about a permutation-invariant version of symmetric set difference are shown, this has yet to be fully elaborated.
590
Adviser: Natasha Jonoska.
653
Entropy.
Language norm.
Language distance.
Language space.
Pseudo-metric.
Symbolic dynamics.
690
Dissertations, Academic
z USF
x Mathematics
Doctoral.
049
FHMM
090
QA36 (Online)
773
t USF Electronic Theses and Dissertations.
4 856
u http://digital.lib.usf.edu/?e14.1250