E14-SFE0001582
Stearns, Adrienne M.
Production and perception of place of articulation errors [electronic resource] / by Adrienne M. Stearns.
[Tampa, Fla] : University of South Florida,
ABSTRACT: Speech errors have been utilized since the beginning of the last century to learn more about how speech is produced, both physically and cognitively. Collection of speech errors has progressed from writing down naturally occurring speech errors to recording experimentally induced speech errors to current studies, which are using instrumentation to record acoustic and kinematic information about experimentally induced speech errors. One type of instrumentation being used in articulatory research is ultrasound. Ultrasound is gaining popularity for use by those interested in learning how speech is physically produced because of its portability and noninvasiveness. Ultrasound of the tongue during speech provides visual access to the articulatory movements of the tongue. This study utilizes ultrasound recordings of speech errors in two ways. In Experiment 1, ultrasound images of participants' tongues were recorded while they read tongue twisters designed to elicit speech errors. The tongue twisters were CVC words or CV syllables with onset velar or alveolar stops. Within the ultrasound video, the angle of the tongue blade and elevation of the tongue dorsum were measured during the onset stop closure. Measurements of tongue twisters were compared to baseline production measures to examine the ways in which erroneous productions differ from normal productions. It was found that an error could create normal productions of the other category (i.e., categorical errors) or abnormal productions that fell outside the normal categories (i.e., gradient errors). Consonant productions extracted from ultrasound video were presented auditory-only to naïve listeners in Experiment 2. Listeners heard a variety of normal, gradient error, and categorical error productions. Participants were asked to judge what they heard as the onset sound. Overwhelmingly, the participants heard normal productions as well as gradient error productions as the target sound. 
Categorical error productions were judged to be different from the target (e.g., velar for alveolar). The only effect of erroneous production appears to be a slight increase in reaction time to respond with a choice of percept, which may suggest that error tokens are abnormal in some way not measured in this study.
Thesis (M.A.)--University of South Florida, 2006.
Includes bibliographical references.
Text (Electronic thesis) in PDF format.
System requirements: World Wide Web browser and PDF reader.
Mode of access: World Wide Web.
Title from PDF of title page.
Document formatted into pages; contains 67 pages.
Adviser: Stefan A. Frisch, Ph.D.
Slips of the tongue.
Speech Language Pathology
USF Electronic Theses and Dissertations.
Production and Perception of Place of Articulation Errors
by Adrienne M. Stearns

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science
Department of Communication Sciences and Disorders
College of Arts and Sciences
University of South Florida

Major Professor: Stefan A. Frisch, Ph.D.
Catherine L. Rogers, Ph.D.
Jean C. Krause, Ph.D.

Date of Approval: March 31, 2006

Keywords: ultrasound, articulatory gestures, slips of the tongue, consonants, tongue twister

Copyright 2006, Adrienne M. Stearns
Dedication

This thesis project is dedicated to Tom Sievers, for his unwavering support and constant encouragement. Without his love, steadfastness, and occasional capering, I would not have had the confidence or strength to finish this journey.
Acknowledgements

I would like to thank the friends and family who have supported me and given me encouragement during this project. First of all, I would like to express how much I have enjoyed working with my thesis chair, Dr. Stefan Frisch. His support and patience helped me complete this thesis, which I am very proud of, and provided me with an opportunity to learn more about ultrasound research and contribute to the field. Dr. Frisch was always dependable and flexible, and I benefited greatly from having him as a mentor. I would like to thank my committee member Dr. Catherine Rogers for showing me how interesting speech science is. I would also like to thank my committee member Dr. Jean Krause for her encouragement and support during this project. Thanks to both of you for your insightful comments and for your contributions to a lively and entertaining defense.

I would like to thank my friends who have tried to keep me on an even keel during the last two years. Danielle Melowic and Shannon Weekley have always cheered me on while providing a good laugh when it was needed. It has been a joy to work with Sylvie Wodzinski in the lab and I have come to count her as a true friend.

Finally, I would like to thank my family for their ever-present encouragement. My parents, Robert and Freda Stearns, and my aunt and uncle, Pat and Paul Stearns, always told me that I could do anything I wanted to. Their support bolstered my determination to complete this project.
TABLE OF CONTENTS

List of Tables
List of Figures
Abstract
Chapter One: Introduction
  Patterns Observed in Research on Naturally Occurring Speech Errors
  Patterns Observed in Research on Experimentally Induced Speech Errors
  Speech Error Assessment Utilizing Instrumentation
  Ultrasound and Speech Error Research
  Purpose
Chapter Two: Ultrasound Error Elicitation
  Stimuli
    Participants
    Procedure
  Measurement
  Results
    P1
    P2
    P3
    P4
  Discussion
Chapter Three: Perception of Place of Articulation Errors
  Stimuli
    Procedure
    Participants
  Results
    Identification
    Reaction Times
  Discussion
Chapter Four: Conclusions/Discussion
  Conclusions
  Discussion
  Clinical Uses
References
Appendix A: Perception Results for Each Stimulus
LIST OF TABLES

Table 1 Tongue twisters read by participants
Table 2 Stimulus classification system
Table 3 Category classification of P1’s productions
Table 4 Category classification of P2’s productions
Table 5 Category classification of P3’s productions
Table 6 Category classification of P4’s productions
Table 7 Number of stimulus presentations by category and percentage of error selection by listeners
Table 8 Reaction times and z-scores for perception of alveolar production targets
Table 9 Reaction times and z-scores for perception of velar production targets
LIST OF FIGURES

Figure 1 Sagittal view of the tongue body with ultrasound; tongue tip to the right
Figure 2 Cushion-scanning technique within head stabilizing frame
Figure 3 Measurements applied to ultrasound image of the tongue
Figure 4 Ultrasound images of alveolar (top) and velar (bottom) productions with mean and 2 SD ranges depicted for tongue dorsum and tongue blade
Figure 5 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P1
Figure 6 Articulations for P1 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance
Figure 7 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P2
Figure 8 Articulations for P2 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance
Figure 9 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P3
Figure 10 Articulations for P3 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance
Figure 11 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P4
Figure 12 Articulations for P4 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance
Figure 13 Distributions of perceptual judgements
Figure 14 Reaction time to stimuli on the basis of production target and 3-way error categorization
Production and Perception of Place of Articulation Errors

Adrienne M. Stearns

ABSTRACT

Speech errors have been utilized since the beginning of the last century to learn more about how speech is produced, both physically and cognitively. Collection of speech errors has progressed from writing down naturally occurring speech errors to recording experimentally induced speech errors to current studies, which are using instrumentation to record acoustic and kinematic information about experimentally induced speech errors. One type of instrumentation being used in articulatory research is ultrasound. Ultrasound is gaining popularity for use by those interested in learning how speech is physically produced because of its portability and noninvasiveness. Ultrasound of the tongue during speech provides visual access to the articulatory movements of the tongue.

This study utilizes ultrasound recordings of speech errors in two ways. In Experiment 1, ultrasound images of participants’ tongues were recorded while they read tongue twisters designed to elicit speech errors. The tongue twisters were CVC words or CV syllables with onset velar or alveolar stops. Within the ultrasound video, the angle of the tongue blade and elevation of the tongue dorsum were measured during the onset stop closure. Measurements of tongue twisters were compared to baseline production
measures to examine the ways in which erroneous productions differ from normal productions. It was found that an error could create normal productions of the other category (i.e., categorical errors) or abnormal productions that fell outside the normal categories (i.e., gradient errors).

Consonant productions extracted from ultrasound video were presented auditory-only to naïve listeners in Experiment 2. Listeners heard a variety of normal, gradient error, and categorical error productions. Participants were asked to judge what they heard as the onset sound. Overwhelmingly, the participants heard normal productions as well as gradient error productions as the target sound. Categorical error productions were judged to be different from the target (e.g., velar for alveolar). The only effect of erroneous production appears to be a slight increase in reaction time to respond with a choice of percept, which may suggest that error tokens are abnormal in some way not measured in this study.
CHAPTER ONE: INTRODUCTION

Speech errors have been reported by many people in many ways. Sigmund Freud in 1901 termed speech errors “faulty actions.” The colloquial label for Freud’s faulty actions is “Freudian slip.” Freud attributed the speech error to an error in human action, speech, or memory caused by the unconscious mind. These errors often appear to the observer as being bizarre and nonsensical. He concluded that articulation errors reveal a “hidden” cognitive process of which even the speaker is unaware. Reverend William Archibald Spooner was reported to regularly make speech errors by transposing the first consonant or consonant cluster of words. Very few speech errors were verifiably recorded as being produced by him, and most of those have been determined to be purposeful, such as “You’ve missed my history lectures → You’ve hissed my mystery lectures” (MacKay, 1970). However, the term “spoonerism” became popular in describing accidental speech errors indicative of a faulty phonological process where the initial consonants of words are transposed.

Speech errors provide insight into the speech production process by bringing to light the range of possibilities that exist for articulatory actions. Research into slips of the tongue has concluded that language processing makes use of phonological constituents as units (e.g., features, segments, onsets, rimes). Speech errors can involve the rearrangement of a single segment, syllable onset, syllable rime, or whole syllable,
suggesting that each of these constituents exists as a unit in language structure and processing (Fromkin, 1971). At the same time, speech errors are more likely to occur between segments that occupy the same position within the word, syllable, and prosodic structure, suggesting that speech is generated by using word “templates” into which segments are fitted (Shattuck-Hufnagel, 1992). Stemberger and Treiman (1986) and Dell (1990) found that frequently occurring words and segments are more protected from speech errors, which they explain using an activation/competition model of speech production. By learning more about how speech errors are produced and perceived, more can be learned about the processes involved in production and perception of normal speech.

Patterns Observed in Research on Naturally Occurring Speech Errors

Fromkin gathered data on naturally occurring speech errors by logging errors noticed while in conversation with others or described by others to her (Fromkin, 1971). She analyzed errors linguistically and found that speech errors follow a pattern of rules. She concluded that most speech errors occur at the level of the phonological segment and that only those segments that are near to each other and in parallel structural positions (e.g., word final and word final or word initial and word initial) are involved in errors. In other words, prosodic constituents of words (e.g., onset, rime, final consonant) are involved in errors rather than random selections of phonemes throughout the word(s) in error. Grammatically non-permissible phones or phonetic sequences did not occur in the errors in her data. It was also determined that errors may be semantic in nature and that
the similarity of the phonological form of words appears to play a role in word substitutions.

Garrett (1975, 1976) and Shattuck (1975) collected a corpus of approximately 3,400 speech errors that occurred in spontaneous speech (MIT corpus). The errors collected were either heard directly by Garrett or Shattuck, or reported by friends. Garrett stated that the errors in his corpus revealed four generalizations: (1) interacting elements are similar to each other, (2) interacting environments are also similar, (3) phonological well-formedness is preserved, and (4) speech segments in an error interact with both word and phrasal stress. In the phoneme errors in Garrett’s corpus, consonants and vowels did not substitute for each other in the creation of an error. That is, a consonant exchanged with other consonants and a vowel exchanged with other vowels, but consonants did not exchange for vowels or vice versa. Garrett stated that two nonphonological processing levels for word forms exist that affect the production of speech errors. One is the “real time” constraint, which is affected by adjacency of elements, and the other is a “functional” constraint, which is sensitive to the structural placement of the target word in the sentence. Therefore, generation of a speech error is constrained not only by similar elements in similar environments, but also by syntactic well-formedness.

Shattuck-Hufnagel and Klatt (1979) concluded that most phoneme errors occur as a result of mis-selection between two similar phonemes competing for a single prosodic location in an utterance. They studied an expanded version of the MIT corpus containing nearly 6,000 errors. They found that phoneme substitutions and exchanges accounted for approximately 35% of the total errors. The remaining errors consisted either of single
phoneme errors that lacked an identifiable target segment or errors involving longer sequences of segments, morphemes, words, or phrases. Shattuck-Hufnagel and Klatt state that the planning process used to generate speech follows three steps: (1) planning segments or phonemes, (2) sequencing structurally-defined slots, and (3) the integration mechanism, which consists of a “scan-copier to insert segments into the slots, a bookkeeper to check off or delete segments that have been copied, and an output error monitor.” Malfunctions in any part of this process could account for errors in selecting the incorrect segment, incorrect placement of the segment, or overuse or underuse of the segment if the integration mechanism fails.

Dell and Reich (1981) endeavored to methodologically create a corpus of naturally occurring slips of the tongue. They instructed students who had not yet taken coursework related to theories of speech errors to collect examples of slips of the tongue heard personally during a 1-month timeframe. The students were instructed not to record speech errors that they were unsure of hearing correctly or that they had not heard themselves. Over the course of two years, five data collection periods were initiated, resulting in a corpus of approximately 4,000 English examples. The analysis of the corpus was restricted to word level errors and sound errors that involved only the initial consonant, because these types of errors are easier to detect and therefore more likely to be accurately collected by the students. In the analysis of initial consonant errors, they found that a large percentage of speech errors created real words instead of nonsense items. Using two-word pairs, anticipation, perseveration, and transposition errors were analyzed. Anticipation errors are cases where a segment is produced before its intended position. In anticipation errors, they found that the erroneous word, which
occurs first, was more likely to be a real word than a nonsense word. Perseveration errors are cases where a segment is produced again after being produced in its intended position. In perseveration errors, they also found that the erroneous word, which occurs second, was more likely to be a real word than a nonsense word. Transposition errors are cases where two segments exchange position. In transposition errors, they found that the first word was more likely to be a real word than nonsense, but the second word was not. This suggests that the production of the first word in the transposition is the actual error, and the erroneous production of the second word occurs because of the remaining unused intended phoneme from the production of the first word. Given the strong tendency for error productions to create real words rather than nonsense, Dell and Reich concluded that the phonological forms in the mental lexicon influence the outcome of a phonological error.

Stemberger and Treiman (1986) found that the second consonant (C2) in word-initial consonant clusters is more likely to be produced in error than the first consonant (C1) of the cluster. Their experiments show that C2 is more often deleted than C1 during a speech error. They also demonstrate that when a consonant is added to a word during a speech error, a C2 is added more often than a C1. As a result, they determined that C1 and C2 are distinct positions within the template of the syllable onset in single syllable words. They propose that C1 positions are used for both initial singletons and the initial consonant in a cluster whereas C2 positions are used only in clusters. Stemberger and Treiman state that activation discrepancies between C1 and C2 during speech production are the cause of the difference in error frequencies. C1, being the first sound in the word, activates many other words beginning with the same letter when it is heard by the
listener. On the other hand, C2 only serves to narrow the field and is not as important in distinguishing the word from other options. The extensive use of C1 positions across a larger variety of words also leads to greater activation levels in the lexicon for C1. This increased activation level protects C1 from errors relative to C2.

Patterns Observed in Research on Experimentally Induced Speech Errors

Baars, Motley, and MacKay (1975) developed experimental procedures to induce speech errors in their subjects rather than relying on naturally occurring errors. The authors sought to create a functional separation of the input of the target from the articulation of the corresponding error outcome. In this way, they were able to generate spoonerism-type errors in their subjects. By presenting lexical and nonsense word pairs to naïve listeners, the authors sought to discover if lexical or nonsense spoonerisms were predominant. The results of their experiments demonstrated that spoonerisms generally occur in favor of creating lexical outcomes, suggesting that subjects edit their output before producing it. However, if the subject is primed to read nonsense words, even if the spoonerism would create lexical outcomes, the words are interpreted as nonsense, and the subject does not edit their output to create lexical items.

It has also been determined that speech errors generally occur in content words, rather than function words. Dell (1990) concluded that this is a result of usage frequency, which is related to increased activation. Function words are used far more frequently than content words in speech. Dell proposes that extreme familiarity with function words as a result of their high frequency, rather than their status in a closed class, is why they are rarely involved in speech errors. He also addresses the concept that function words have
a special status in language production. He states that function words can only be considered “special” if they are treated as whole units and not as segments that fit within a frame. Function words may be stored as whole units (i.e., segments already associated with a frame) due to the high frequency with which they are used in speech. Therefore, it is far less likely that function words will be produced incorrectly since they are automatically inserted into place as whole units in utterances.

Shattuck-Hufnagel (1992) determined that sounds that appear in the same prosodic position in the word are more likely to interact in speech errors as a result of their similar placement. Her study examined the role of shared word onset position and shared stress in influencing error rates. She concluded that pairs of word-onset consonants interact in segmental speech errors more often than pairs of consonants in two different word positions. Also, more errors occur between pairs of consonants that share stress than between pairs that do not share stress. Therefore, word onset position must be considered one of the prosodic constituents of phonological encoding for speech production, distinct from a syllable onset position.

Meijer (1997) conducted three experiments to determine the interaction of syllable position and stress in a word and the production of speech errors. When matching for word syllable and stress, speech errors between initial consonants were produced twice as often as speech errors between word onsets and word-internal segments, confirming previous research that word onset is especially vulnerable to speech errors. This finding was further upheld by results from his second experiment, in which word-internal structures were targeted versus word-initial structures. Meijer presented a sentence containing two disyllabic target words and two interfering words. Two
conditions were presented: (1) where a word internal phoneme could interact to create a lexical item (e.g., nephew + copy = coffee) and (2) where a word onset phoneme could interact with a word internal phoneme to create a lexical item (e.g., feather + copy = coffee). Even when word onsets were interacting with word-internal phonemes, it was the word onset that displaced word-internal phonemes rather than the other way around, which shows that word onsets are particularly volatile in creating speech errors. Finally, the idea that vowels cannot interact with consonants in speech errors was shown to be a problem of opportunity rather than phonotactic constraints. In a speech error where substituting a vowel instead of a consonant would produce a word, interactions did occur and their frequency of occurrence seemed comparable with the more common interactions between consonants (e.g., candy → candle).

Speech Error Assessment Utilizing Instrumentation

All previous data discussed were obtained using transcription of speech errors either in a laboratory setting or as they occurred by chance in normal conversation. However, by relying on auditory perception translated into individual segments by transcription, bias may have been introduced into the data collection process. IPA is, by definition, a categorical system of transcribing individual phoneme sounds. This system does not provide a straightforward means to encode half-productions or undershooting of the target phoneme, nor does it allow for articulatory combinations that occur when a speaker mistakenly attempts to produce two phonemes at the same time. In addition, the ability of listeners to accurately perceive these types of productions is unclear. As a result of lack of confidence in the reliability of transcribed speech errors to accurately
reflect the speech production phenomena, interest developed in technological approaches to recording and analyzing speech errors.

Mowrey and MacKay (1990) conducted one of the first experiments to use articulatory instrumentation to collect data on speech errors. Their experiment used electromyographical (EMG) data to examine muscle fiber activity in speech errors. Their results show that errors involving abnormal muscular activation do occur. These errors occur at a gestural level that is smaller than the phoneme unit, and they found these errors were not detectable in their own perception. Since the errors are not detectable without instrumentation, the results cast doubt on the conclusions drawn from transcribed speech errors that rely solely on human perception.

Speech errors can be produced by undershooting the target articulation or by overshooting the target articulation (Boucher, 1994). Boucher examined x-rays of articulators in motion while speakers uttered phrases. Auditorily, the speech errors sounded distinctly like a completely different phoneme than the target. However, the x-ray data show that the errors were a result of either not articulating enough or moving the articulators too far in an attempt to produce the target phoneme. He concluded that restricting the experimenter’s encoding of speech errors to an alphabetic transcription may obscure important aspects of the nature of the speech error itself.

Acoustic analysis of speech errors involving /s/ and /z/ by Frisch and Wright (2002) concluded that errors of devoicing /z/ occur acoustically/articulatorily more often than errors of voicing /s/. However, auditory perception appears to be more sensitive to voicing errors of /s/, which then is comprehended as /z/. Therefore, in a transcription of speech errors, it would appear that /s/ is in error far more often than /z/, even though the
opposite is actually the case. Frisch and Wright also reported recorded speech errors that were not phonologically well-formed, which empirically contradicts previous claims that all errors occur within phonologically well-formed segments. They concluded that their findings were incompatible with models of speech production that involve selecting, organizing, and integrating phonetic segmental units, as suggested by Shattuck-Hufnagel (1979).

Pouplier (2003) used an electromagnetic midsagittal articulometer (EMMA) to investigate speech errors. Participants had magnetic coils that received electromagnetic signals attached to specific points along the midline of their tongues. The signals from the coils were then tracked to determine the movement of the tongue during articulation. She found that erroneous gestural movements are produced at different rates. She also found that it is more common for gestures from one phoneme to intrude upon another than for the gestures of the target phoneme to be reduced. Pouplier presented auditory stimuli gathered from the EMMA study to participants to obtain their perceptual judgements of the initial consonants. In a perceptual experiment contrasting /t/ and /k/, intruding gestures of the tongue dorsum during production of /t/ had a systematic perceptual effect (i.e., hearing /k/), whereas intruding tongue tip gestures during production of /k/ did not significantly affect identification.

Ultrasound and Speech Error Research

Ultrasound imaging has been gaining popularity as a method to study articulatory gestures during speech production. Ultrasound studies in the 1960s and 1970s were conducted using large, cumbersome equipment that only created 1-dimensional
images (Gick, 2002). Since that time, technology has progressed and current ultrasound workstations have the capability to produce 2-dimensional images with far greater temporal resolution using real-time B-scan (Stone, 2005). Real-time B-scan is a mode wherein individual 1-dimensional images are combined to reconstruct a 2-dimensional grayscale image in the shape of a 90-120 degree wedge. In this mode, modern ultrasound machines can obtain 30 scans per second or more, which is considered fast enough to accurately capture most speech movements.

Ultrasound images are produced by placing a transducer, or probe, against the subject's skin while the transducer emits a sound wave between 2 and 10 MHz. This sound wave travels through soft tissue (like the tongue) and is reflected by air back to the transducer (Figure 1). The image that is created reveals where soft tissue stops and air begins. For instance, when the probe is placed under the chin, the sound waves travel through the tongue but are reflected back from the top of the tongue where it meets air in the oral cavity. This reflection is shown as a white "line" following the curve of the tongue and is used as a landmark in measurement of ultrasound images. The upper surface of the tongue is the black edge at the underside of the white line where the air reflection begins.
Figure 1 Sagittal view of the tongue body with ultrasound; tongue tip to the right (labels: blade, tip, dorsum, root, cushion, transducer; scale bar 1 cm)

Due to the placement of the ultrasound probe under the speaker's chin, shadows are cast in the image by the hyoid bone, mandible, and sublingual cavity (Gick, 2002). The shadows have the potential to obscure the tongue tip and tongue root, depending on the placement of the probe and the subject's anatomy. Securing the probe in a fixed position relative to the participant's chin improves reliability of the ultrasound images produced. To further improve the reliability of the ultrasound probe positioning in relation to the articulators, a cushion placed over the probe allows for jaw movements during speaking without displacing the probe (Peng, Jost-Brinkmann, & Miethke, 1996).
Purpose

Clearly, differences exist between instrumental speech error data and perceptual speech error data. Perceptual speech error data recorded by transcription have been used to conclude that all errors are phonologically well-formed and that errors are categorical in nature. Instrumental speech error results demonstrate that errors can violate phonotactic constraints and are not always categorical in nature (Mowrey & MacKay, 1990; Pouplier & Goldstein, 2005). The distinction between how errors are articulated and how they are perceived is explored in this thesis in two experiments. Experiment 1 examines the articulation of tongue twisters using ultrasound imaging of the tongue. Experiment 2 uses productions from Experiment 1 in a perception task where naive listeners are asked to identify productions with a variety of articulatory characteristics.
CHAPTER TWO: ULTRASOUND ERROR ELICITATION

Experiment 1 used an ultrasound workstation to record images of articulators during spoken tongue twisters. To reduce demands on short-term memory, the participants were provided a written list of tongue twisters to read. The tongue twisters contrasted alveolars /t, d/ with velars /k, g/ in the initial position of words and nonwords. Because repetition of tongue twisters has been shown to result in speech errors, the participants repeated each tongue twister six times (Shattuck-Hufnagel, 1992). To create more "natural" phonological speech errors, the speakers were instructed to speak at a normal rate to reduce the articulatory/motor programming load (Wilshire, 1999). It was expected that the speakers would produce both "categorical errors", where a normal phoneme from the wrong category is produced, and "gradient errors", which involve an abnormal or incomplete articulatory gesture.

Stimuli

The tongue twisters comprised 22 phrases containing four words or nonsense syllables each (Table 1). In the first eight tongue twisters, all four CV syllables in each twister began with the same phoneme (/t/, /d/, /k/, or /g/) and were used to obtain baseline measures of articulation for each speaker. The second set of six tongue twisters consisted of CVC words wherein the word-initial phonemes alternated in place of
articulation only (i.e., /t/ and /k/ or /d/ and /g/) and contrasted only between tongue tip and tongue dorsum raising gestures. The CVC words contained the target consonants as onsets, low vowels (/ɑ/ and /æ/), and labial final consonants. The final eight tongue twisters were CV syllables (nonwords) with the same alternating initial phonemes and the same low vowels. Low vowels were used in all the tongue twisters to reduce interference during articulation of stop consonants. Labial coda consonants were used in the CVCs to avoid introducing additional tongue gestures that might obscure clear observation of the target initial consonants (Wodzinski, 2004; Pouplier & Goldstein, 2005). Each tongue twister or tongue twister-like utterance was repeated six times, following the speech error elicitation procedure of Shattuck-Hufnagel (1992).

Table 1 Tongue twisters read by participants

Baseline data    Words               Nonsense syllables
/ /              top cap cop tab     / /
/ /              tap cab cob top     / /
/ /              dam gob gap damp    / /
/ /              cap top tab cop     / /
/ /              cab tap top cob     / /
/ /              gob dam damp gap    /d /
/ /                                  / /
/ /                                  / /

Participants

Four (4) volunteers from an undergraduate speech and hearing sciences course participated in the experiment. They received extra credit for their participation. Each participant had completed a phonetics course to ensure that they could read stimuli written in International Phonetic Alphabet (IPA) font. All were
females between 19 and 45 years of age. All participants were monolingual native speakers of American English who reported no past speech or hearing disorders.

Procedure

Lingual articulation was imaged using an Aloka SSD-1000 ultrasound machine with a 3.5 MHz convex probe. The participant was seated in a wheel-less straight-backed chair with her back against the back of the chair and feet flat on the floor. The probe was held stationary using the cushion-scanning technique (Peng et al., 1995) and was placed along the inferior midline of the participant's mandible pointing upward toward the tongue (Figure 2). The participant sat with her head protruded forward slightly to provide the best image of the tongue body. The participant's head was stabilized using a frame with contact points at the forehead, left and right sides of the head, and the back of the head, similar to the HATS system (Stone & Davis, 1995). Due to the interconnectedness of the support structures of the stabilization system, any head movement caused the ultrasound probe to move concurrently, maintaining a fixed relationship between the head and the probe.
Figure 2 Cushion-scanning technique within head stabilizing frame

Measurement

All experiment data sets were measured using the programs Adobe Premiere 6.0 and Adobe Photoshop 7.0. The video recording of the tongue was viewed in Adobe Premiere 6.0. The video of tongue movement was observed frame-by-frame until the frame closest to the midpoint of consonant closure was determined. The frame was then imported into Photoshop for measurement. Cues utilized to determine the exact closure location included direction of tongue movement preceding and following closure, flattening of the tongue against the alveolar ridge or palate, and the bright line of the tongue edge that appears when the tongue surface is motionless during closure. The audio waveform was also used to identify the appropriate frame containing stop closures and releases (following Wodzinski, 2004).
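As a rough illustration of the frame-selection step, the sketch below picks the video frame nearest the temporal midpoint of a closure. It is not the procedure used in the study, which relied on visual inspection in Adobe Premiere supported by the audio waveform. The function name, the closure and release time stamps, and the 30 frames-per-second rate are all assumptions made for the example.

```python
def closure_midpoint_frame(closure_onset_s: float, release_s: float,
                           fps: float = 30.0) -> int:
    """Index of the video frame nearest the temporal midpoint of a stop closure.

    closure_onset_s and release_s are hypothetical time stamps (in seconds)
    read off the audio waveform; fps is an assumed video frame rate.
    """
    if release_s < closure_onset_s:
        raise ValueError("release must not precede closure onset")
    midpoint_s = (closure_onset_s + release_s) / 2.0
    # Frame n is taken to start at n / fps seconds; choose the nearest frame.
    return round(midpoint_s * fps)


if __name__ == "__main__":
    # Closure from 1.20 s to 1.28 s: midpoint 1.24 s, nearest frame at 30 fps.
    print(closure_midpoint_frame(1.20, 1.28))
```

At 30 frames per second each frame spans roughly 33 ms, so a short stop closure may be covered by only two or three frames. This is one reason the visual cues listed above matter when choosing among neighboring frames.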
Figure 3 Measurements applied to ultrasound image of the tongue (labels: blade angle, closure angle, dorsum distance, blade points, closure; scale bar 1 cm)

Closure location for velar productions was measured by identifying the most anterior and posterior points of velar closure. Closure location was then quantified by computing the angle from the horizontal plane at the base of the ultrasound image to the midpoint between the anterior and posterior closure points. The closure angle for velar productions from baseline twisters was averaged by vowel (i.e., / / and / / vs. / / and / /). Since /æ/ is a front vowel and /ɑ/ is a back vowel, the velar closure position changed accordingly due to coarticulation. The average closure angle provided data about the general direction of closure that was typical for the participant for that vowel. The tongue dorsum elevation was measured along the average closure angle for all
productions (velar and alveolar), depending upon which vowel was contained in the utterance.

Tongue blade angle was measured for both alveolar and velar productions. In velar productions, the tongue blade declines sharply away from the palatal closure location (Figure 3). For alveolar consonants, the actual closure of the tongue tip against the alveolar ridge is often not visible in an ultrasound image because the ultrasound beam from the probe is reflected by air under the tongue tip. Consequently, the angle of elevation of the tongue blade is used to assess the alveolar closure gesture. The angle is computed from two points and expressed as the angle of declination from the horizontal. The first point chosen was the most anterior portion of the visible tongue blade/tip, and the second was a point about one centimeter posterior to the first along the tongue blade. Based on these two points, the amount of elevation or declination of the tongue tip from the horizontal plane determines the blade angle.

Studies of the reliability of measurements from ultrasound have found that measures are reliable (Stone, Sonies, Shawler, Weiss, & Nadel, 1983). However, Stone's study was based on measurement procedures that were somewhat different from those used in this study. Reliability of ultrasound measures used in the University of South Florida Speech Perception and Production Lab was investigated by Hardin (2005). The measurements of three raters were compared based upon their choice of video frame depicting closure, anterior and posterior tongue blade and dorsum closure points, blade and velar angle, and dorsum distance. The impact of measuring experience was also assessed by comparing measurements from one rater who progressed from inexperienced to experienced over the course of her research. Hardin confirmed that overall reliability
of ultrasound measures was acceptable, and the measures from different raters led to similar results.

Results

Using comparisons with the baseline measurements from each participant, regular (or normal) articulations were differentiated from irregular (or abnormal) articulations within the target items (tongue twisters) based on the distribution of tongue blade and tongue dorsum measures. The mean and standard deviation (SD) of measures from the baseline stimuli were calculated to provide reference values for stimulus classification. Productions were categorized by determining whether a measurement fell within or outside of the range of 2 SD from the mean of the baseline measurements for tongue blade angle and tongue dorsum elevation for both velar and alveolar target sounds. Stimuli were classified to reflect how individual measurements from the stimulus were related to the baseline measures. Words or syllables that contained articulations determined to be within 2 SD of the mean of the baseline measures were categorized as normal articulations (following Pouplier, 2003).

To determine if an articulation error was gradient or categorical, it was compared not only to the mean for its own category but also to the mean for the other category (Figure 4). For instance, if a production of /t/ or /d/ was articulated with the tongue blade far lower than the mean for that sound (i.e., more than 2 SD below the mean for alveolars), it was also compared to see if the tongue blade was so low as to be within 2 SD of the mean of the velar category. Likewise, if a velar consonant articulation was determined to have a much higher tongue blade than normal, then it was compared to the
alveolar category to determine if it fell within 2 SD of the mean for alveolar productions. If the measurement fell outside of 2 SD from the mean for its own category, but not within 2 SD of the other category, then it was considered to be a gradient error because it was produced in the space between the two categories. However, if a measurement fell outside of 2 SD from the mean of its own category and within 2 SD of the mean of the other category, that production was considered to be a categorical error for the tongue blade. The same process was used to determine error type for tongue dorsum measurements in alveolar and velar productions.
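The 2 SD rule described above can be sketched in a few lines. This is an illustrative reading of the rule, not the software used in the study, and the `Baseline` class, the `classify` function, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """Mean and SD of one measure (e.g., blade angle) for one category."""
    mean: float
    sd: float

    def contains(self, value: float, k: float = 2.0) -> bool:
        # True when the value lies within k SD of the category mean.
        return abs(value - self.mean) <= k * self.sd


def classify(value: float, target: Baseline, other: Baseline) -> str:
    """Return 'N' (normal), 'C' (categorical error), or 'G' (gradient error)."""
    if target.contains(value):
        return "N"  # within 2 SD of the intended category
    if other.contains(value):
        return "C"  # outside the target range, inside the other category
    return "G"      # outside both ranges


if __name__ == "__main__":
    # Hypothetical blade-angle baselines in degrees (not values from the study).
    alveolar = Baseline(mean=-20.0, sd=4.0)  # normal range -28 to -12
    velar = Baseline(mean=-50.0, sd=5.0)     # normal range -60 to -40
    for angle in (-22.0, -35.0, -45.0):
        print(angle, classify(angle, target=alveolar, other=velar))
```

The target range is checked first, so a value that lies inside both ranges counts as normal. One simplification to note: the sketch labels any value outside both ranges as gradient, including values that deviate away from the other category rather than toward it.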
Figure 4 Ultrasound images of alveolar (top) and velar (bottom) productions with mean and 2 SD ranges depicted for tongue dorsum and tongue blade (labels in each image: normal alveolar blade angle, normal velar blade angle, normal alveolar dorsum distance, normal velar dorsum distance)
Classification codes were assigned to articulatory measures to reflect the position of the tongue blade and tongue dorsum in relation to normal articulation, gradient error, or categorical error (Table 2). "B" was used to designate the tongue blade and "D" was used to designate the tongue dorsum. "N" was used to classify a normal articulation, "G" was used to classify a gradient error, and "C" was used to denote a categorical error. Therefore, "GBCD" refers to a stimulus that was measured to have a gradient error in the tongue blade angle and a categorical error in the tongue dorsum elevation. The measurements for each participant were categorized to determine the number of "normal", "gradient error", and "categorical error" productions.
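A minimal sketch of how the four-letter codes combine the two per-measure classes, and how productions might then be tallied. The function name and the example list of productions are hypothetical.

```python
from collections import Counter

VALID_CLASSES = {"N", "G", "C"}  # normal, gradient error, categorical error


def stimulus_code(blade_class: str, dorsum_class: str) -> str:
    """Combine per-measure classes into a code such as 'GBCD'."""
    if blade_class not in VALID_CLASSES or dorsum_class not in VALID_CLASSES:
        raise ValueError("classes must be one of 'N', 'G', 'C'")
    # The blade class precedes 'B'; the dorsum class precedes 'D'.
    return f"{blade_class}B{dorsum_class}D"


if __name__ == "__main__":
    # Hypothetical (blade, dorsum) classes for five productions.
    productions = [("N", "N"), ("G", "C"), ("N", "G"), ("N", "N"), ("C", "C")]
    tally = Counter(stimulus_code(b, d) for b, d in productions)
    print(tally)
```

Three classes on each of two measures give nine possible codes, which correspond to the nine rows of Table 2.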
Table 2 Stimulus classification system

Classification  Definition

NBND  Normal articulation
      - Within 2 SD of baseline measurements for target category
NBGD  Gradient error in tongue dorsum distance
      - Dorsum distance outside mean ± 2 SD ranges for both categories
NBCD  Categorical error in dorsum distance
      - Dorsum distance outside 2 SD of baseline measures for target category
      - Dorsum distance is within 2 SD of baseline measures for other category
GBND  Gradient error in blade angle
      - Blade angle outside mean ± 2 SD ranges for both categories
CBND  Categorical error in blade angle
      - Blade angle outside 2 SD of baseline measures for target category
      - Blade angle is within 2 SD of baseline measures for other category
GBGD  Gradient error in both blade angle and dorsum distance
      - Measures for both blade angle and dorsum distance fall outside mean ± 2 SD for target category and other category
GBCD  Categorical error in dorsum distance; gradient error in blade angle
      - Dorsum distance outside of 2 SD of target category and within 2 SD of other category
      - Blade angle outside mean ± 2 SD ranges for both categories
CBGD  Categorical error in blade angle; gradient error in dorsum distance
      - Blade angle outside of 2 SD of target category and within 2 SD of other category
      - Dorsum distance outside mean ± 2 SD ranges for both categories
CBCD  Categorical error in blade angle and dorsum distance
      - Blade angle and dorsum distance within 2 SD of other category

Each participant was measured independently. Due to differences in head size and relative placement of the probe from articulators between participants, the measures for one participant are not directly comparable to another. However, the overall patterns of normal productions are quite similar, while the error data present a continuum of error values from virtually error free to error ridden. Data were analyzed using a within-subjects approach, using each participant's baseline measures to examine the productions from the word and syllable tongue twisters for that participant.

P1

Figure 5 shows the range of production results for P1 and provides a visual comparison of their location in articulatory space. Within both the blade angle and dorsum distance panels, the productions are divided between alveolar and velar, which are subdivided by vowel and then again by baseline and alternating twisters. The boxes represent the 2 SD surrounding the mean measurements for blade angle in degrees and dorsum distance in centimeters. The extent to which productions were measured to be outside of the 2 SD ranges, either above or below, is marked by the range lines extending above and/or below the boxes. Where range lines are not present either above or below a 2 SD box, no production was measured to be outside of 2 SD for that target. The means are located in the center of the 2 SD boxes. The means and 2 SD ranges were calculated for blade angle and dorsum distance measures for both baseline productions and alternating twisters by onset consonant and vowel.

The mean blade angles and dorsum distances for velar productions appear to be slightly lower for the alternating twisters than for the baseline measures for P1. While the 2 SD boxes are roughly the same size, overall they are marginally lower for the twisters. The mean and range for the alveolars appear similar in all contexts, indicating that P1's productions were measured to occur in roughly the same place in articulatory space.
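The grouping behind the 2 SD boxes can be sketched as follows: measurements are grouped by onset place, vowel, and condition, and each group yields a mean with a range of 2 SD on either side. The row format, the vowel labels "a" and "ae", and the values are hypothetical. The use of the sample standard deviation is also an assumption, since the text does not say which SD formula was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def two_sd_ranges(rows):
    """Map (place, vowel, condition) -> (lower, mean, upper) with 2 SD bounds.

    rows is an iterable of (place, vowel, condition, value) tuples.
    """
    groups = defaultdict(list)
    for place, vowel, condition, value in rows:
        groups[(place, vowel, condition)].append(value)

    ranges = {}
    for key, values in groups.items():
        m = mean(values)
        s = stdev(values)  # sample SD (assumption); needs at least 2 values
        ranges[key] = (m - 2 * s, m, m + 2 * s)
    return ranges


if __name__ == "__main__":
    # Hypothetical blade angles in degrees, not data from the study.
    rows = [
        ("alveolar", "a", "base", -20.0),
        ("alveolar", "a", "base", -22.0),
        ("alveolar", "a", "base", -18.0),
        ("velar", "a", "base", -50.0),
        ("velar", "a", "base", -54.0),
        ("velar", "a", "base", -46.0),
    ]
    for key, (low, centre, high) in two_sd_ranges(rows).items():
        print(key, round(low, 2), round(centre, 2), round(high, 2))
```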
Figure 5 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P1 (panels: Blade Angle (deg) and Dorsum Distance (cm); alveolar and velar targets by vowel, baseline vs. twister)

Table 3 shows the categorization of P1's 536 measurable productions. Table 3 shows that, while P1 had only two categories of errors (GBND and NBGD), they accounted for 13% of her total productions. Eight of the total 71 productions classified as gradient errors in either blade or dorsum gestures were produced during baseline measures, which is consistent with other participants. The majority of her error gestures consisted of NBGD in the nonword and word twisters where a velar target was produced with a slightly lower dorsum elevation than expected (as seen in Figure 5). However, this difference may be due to normal coarticulation and not truly erroneous production.
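The proportions quoted for P1 can be checked directly from the counts in Table 3. The sketch below uses only the non-zero rows of that table; the dictionary layout and variable names are illustrative.

```python
# Non-zero rows of Table 3 for P1: (baseline, nonwords, words).
table3 = {
    "NBND": (185, 159, 121),
    "NBGD": (5, 37, 21),
    "GBND": (3, 1, 4),
}

total = sum(sum(counts) for counts in table3.values())
errors = sum(sum(counts) for code, counts in table3.items() if code != "NBND")
baseline_errors = sum(counts[0] for code, counts in table3.items()
                      if code != "NBND")
error_percent = 100 * errors / total

if __name__ == "__main__":
    print(total, errors, baseline_errors, round(error_percent, 1))
```

The same tally can be applied to the tables for the other participants.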
Table 3 Category classification of P1's productions

Blade Angle  Dorsum Distance  Baseline  Nonwords  Words  Total
NB           ND               185       159       121    465
NB           GD               5         37        21     63
NB           CD               -         -         -      0
GB           ND               3         1         4      8
GB           GD               -         -         -      0
GB           CD               -         -         -      0
CB           ND               -         -         -      0
CB           GD               -         -         -      0
CB           CD               -         -         -      0
Total                         193       197       146    536

In Figure 6, the two measurements of each consonant production for P1 are displayed in scatter plots. The x-axis is the angle of declination of the tongue blade and the y-axis is the distance of the tongue dorsum from the probe in the typical direction of closure for that vowel. The dotted lines represent the means of dorsum distance and blade angle, while the boxes represent 2 SD surrounding the mean in which 95% of the articulations should fall. The box in the upper left corner of the figure represents the "normal" location for velar productions. The box in the lower right corner of the figure represents the "normal" location for alveolar productions. The figures divide the twister productions into baseline (first row), alveolar target items in alternating twisters (second row), and velar target items in alternating twisters (third row). The left column is / / vowel stimuli and the right column is / / vowel stimuli.

P1 had very small SD ranges around the mean tongue blade angle and dorsum distance, compared to the other participants. Her productions of baseline twisters and alternating twisters were very tightly packed into their target production boxes, with no categorical errors measured. As
discussed previously, P1 produced a small number of gradient errors that can be seen as those items outside of the 2 SD boxes, particularly in the two velar target panels.
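The expectation stated above, that about 95% of articulations should fall within 2 SD of the mean, holds for a single measure if that measure is approximately normally distributed. That distributional assumption is made here for illustration and is not tested in the text. Under it, the expected coverage can be computed from the error function:

```python
import math


def coverage_within_k_sd(k: float) -> float:
    """Probability that a normally distributed value lies within k SD of its mean."""
    return math.erf(k / math.sqrt(2.0))


if __name__ == "__main__":
    print(round(coverage_within_k_sd(2.0), 4))
```

On this assumption, a handful of productions outside the 2 SD range is expected even in error-free speech, which fits the small number of baseline productions that fell outside it.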
Figure 6 Articulations for P1 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance (scatter plots; rows: Baseline Targets, Alveolar Targets, Velar Targets; one column per vowel; x-axis: Blade Angle (deg); y-axis: Dorsum Dist (cm))
P2

When the mean articulation measurements of the baseline twisters were compared to measurements from articulations within alternating twisters for P2, it was discovered that coarticulation had a pronounced effect on tongue dorsum elevation for alveolars for at least this participant and perhaps others as well (most noticeably P3). A visual comparison of the 2 SD boxes for alveolar productions in the dorsum distance panel of Figure 7 clearly shows that the means for baselines and alternating twisters are substantially different. This discrepancy was discovered while categorizing the productions for P2. Therefore, baseline measures for tongue dorsum elevation during alveolar productions in alternating twisters were obtained for P2 by using the mean and standard deviation from the alternating twister productions, excluding measurements that fell within the velar category for dorsum distance (i.e., categorical errors).

While the sizes of the 2 SD boxes vary between baseline and twister productions within the velar and alveolar blade angles and the velar dorsum distance, the midpoints within each set are comparable. This suggests that, unlike the alveolar dorsum distance, these productions had more variability but were centered on the same average target, as would be expected when errors are elicited. The very long range lines that are present in Figure 7 for alternating twisters reflect erroneous productions. Those lines indicate that some production(s) fell in the articulatory space in which the other category was normally produced. For instance, the bottom of the line for the blade angle of alveolar alternating twisters with the vowel / / appears to fall just below the midpoint for the box representing the 2 SD range for velar alternating twisters with the vowel / /, suggesting
at least one categorical error. At the same time, the top of the line for the dorsum distance of alveolar targets in alternating twisters with the vowel / / appears to be slightly higher than the midline for velar target productions in alternating twisters with the vowel / /, which also suggests at least one categorical error.

Figure 7 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P2 (panels: Blade Angle (deg) and Dorsum Distance (cm); alveolar and velar targets by vowel, baseline vs. twister)

Table 4 shows that P2 produced a total of 541 measurable articulations. She produced 11 gestures within baseline measures that fell outside of the 2 SD cutoff for normal articulations, and they involved either a gradient blade angle error with normal dorsum distance (GBND) or a normal blade angle with gradient dorsum distance error (NBGD). No baseline production was measured to be a categorical error. Normal gestures comprised 83% of P2's productions. The word twisters generated 33 articulation errors (23% of all word productions) while the nonword twisters generated 41 errors (20% of all nonword productions). Unlike the word twisters, the nonword
twisters generated errors in all but one category, CBGD, which had no errors from any set of twisters.

Table 4 Category classification of P2's productions

Blade Angle  Dorsum Distance  Baseline  Nonwords  Words  Total
NB           ND               181       165       110    456
NB           GD               6         7         8      21
NB           CD               -         2         -      2
GB           ND               5         8         15     28
GB           GD               -         3         1      4
GB           CD               -         1         1      2
CB           ND               -         1         -      1
CB           GD               -         -         -      0
CB           CD               -         19        8      27
Total                         192       206       143    541

Figure 8 shows P2's articulations in scatter plots. In baseline productions, the majority of her productions fall neatly within the 2 SD boxes. For both / / and / / vowels, there were two velar productions articulated with substantially lower blade angle than normal. However, since the focus of this paper is on articulations that are closer to the other category rather than farther away, these productions are not considered to be erroneous.

When comparing P2's productions of alveolar-initial twisters to the mean and 2 SD range, P2 made several articulation errors involving productions that fell into the velar box (categorical errors) and in the space between the velar box and the alveolar box (gradient errors). A handful of gestural errors made while trying to approximate a velar target are evidenced in the alveolar boxes in the bottom row of figures. The categorical errors are within the outlines of the box while the gradient errors fall in the space between the two boxes.
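The adjusted alveolar dorsum baseline described for P2 (mean and SD taken from the alternating twister productions after excluding values that fall within the velar category) can be sketched as follows. The function name, the 2 SD exclusion width, the use of the sample SD, and all values are assumptions for illustration.

```python
from statistics import mean, stdev


def adjusted_alveolar_baseline(twister_values, velar_mean, velar_sd, k=2.0):
    """Mean and SD of alveolar dorsum distances, excluding velar-range values.

    Values within k SD of the velar mean are treated as categorical errors
    and left out before the reference values are computed.
    """
    kept = [v for v in twister_values if abs(v - velar_mean) > k * velar_sd]
    if len(kept) < 2:
        raise ValueError("need at least two non-velar values")
    return mean(kept), stdev(kept)


if __name__ == "__main__":
    # Hypothetical dorsum distances in cm; 9.4 falls in the velar range.
    values = [8.0, 8.2, 7.8, 9.4]
    m, s = adjusted_alveolar_baseline(values, velar_mean=9.5, velar_sd=0.1)
    print(round(m, 2), round(s, 2))
```

Excluding the velar-range values first keeps clear categorical errors from inflating the alveolar reference range.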
Figure 8 Articulations for P2 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance (scatter plots; rows: Baseline Targets, Alveolar Targets, Velar Targets; one column per vowel; x-axis: Blade Angle (deg); y-axis: Dorsum Dist (cm))
P3

The midpoints of the 2 SD boxes for dorsum distance for alveolar target productions of baseline twisters and alternating twisters appear to be substantially different for P3, which is similar to the pattern for P2 (Figure 9). As a result, error categorization for alveolar blade angle for P3 used the same procedure as for P2. As with P2, the ends of the range lines for each set of alternating twisters fall within the range of the 2 SD boxes for the other category (i.e., alveolar vs. velar), suggesting categorical errors in all alternating twister sets.

Figure 9 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P3 (panels: Blade Angle (deg) and Dorsum Distance (cm); alveolar and velar targets by vowel, baseline vs. twister)

P3 had very small SD ranges around the mean tongue blade angle and dorsum distance for her baseline measures. However, she did make several categorical and gradient errors. Table 5 shows that P3 produced a total of 589 measurable articulations. Eight of the total 125 errors were produced during baseline measures. P3 produced errors
in all but one category, NBCD. The largest error category is GBND, with 52 of the 125 total errors. Of the 117 gesture errors produced in the nonword or word twisters, the percentage of errors produced in nonword twisters and word twisters was comparable at 29% each. Categorical errors in both tongue blade and tongue dorsum comprised 34 of the 117 errors, which is 29%. P3 produced 68 errors while reading the nonword twisters and 49 while reading the word twisters.

Table 5 Category classification of P3's productions

Blade Angle  Dorsum Distance  Baseline  Nonwords  Words  Total
NB           ND               180       163       121    464
NB           GD               5         2         -      7
NB           CD               -         -         -      0
GB           ND               3         34        28     65
GB           GD               -         3         3      6
GB           CD               -         8         1      9
CB           ND               -         2         1      3
CB           GD               -         1         -      1
CB           CD               -         18        16     34
Total                         188       231       170    589

Figure 10 shows that a large number of P3's velar articulations in alternating twisters had a higher dorsum distance than the mean generated from the baseline measures. This pattern was matched by the categorical error productions of alveolar targets in velar articulatory space. The consistency of the elevated dorsum distance for velars during alternating twisters is mysterious. Since velar stops involve raising the tongue dorsum to the palate, the baseline dorsum distance for closure should be the physical upper limit of tongue raising. Three possibilities exist to explain this finding.
1. Coarticulation may have caused a shift in closure location between baseline and alternating twister productions so that the angle used to measure dorsum distance is not the same as the direction of closure.
2. A shift in the participant's head position relative to the ultrasound probe between the baseline and alternating twister portions of the experiment may have occurred. While the participants were placed in an apparatus to restrict head movement, a small upward shift in head position is not impossible.
3. The participant may have hyperarticulated the alternating twisters in an effort to emphasize the different productions. This may have involved lowering the jaw onto the probe, thereby displacing the probe slightly, or raising the dorsum more fully to enhance the contrast.
Figure 10 Articulations for P3 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance (scatter plots; rows: Baseline Targets, Alveolar Targets, Velar Targets; one column per vowel; x-axis: Blade Angle (deg); y-axis: Dorsum Dist (cm))
P4

P4 had a pattern of production somewhat unlike those of the other participants. For this participant, blade angle measures overlapped substantially between alveolar and velar targets, especially in the baseline productions. In the dorsum distance measures for alveolar targets, there is some indication that the mean was shifted higher than that of the baseline measures due to coarticulation (as for P2 and P3). For the dorsum distance measure for velars, there is some indication of a shift upward in the alternating twisters, as observed in P3.

[Figure 11: two panels, Blade Angle (deg) and Dorsum Distance (cm), each comparing baseline and twister productions by vowel for alveolar and velar targets.]
Figure 11 Standard deviations surrounding means and ranges of productions for blade angle and dorsum distance for P4

P4 produced a total of 527 measurable articulations (Table 6). Due to the overlap between blade angle categories for alveolar and velar productions, no productions can be classified as gradient errors for blade angle. Considering only dorsum distance measures, she made 6 errors during baseline productions, which is comparable to other participants.
P4 made a total of 8 dorsum distance errors in the nonword and word twisters combined, only one of which was categorical (classified as NBCD).

Table 6 Category classification of P4's productions
Blade Angle  Dorsum Distance  Baseline  Nonwords  Words  Total
NB  ND  170 189 145 504
NB  GD  6 5 2 13
NB  CD  1 1
GB  ND  0
GB  GD  0
GB  CD  0
CB  ND  8 1 9
CB  GD  0
CB  CD  0
Total   184 195 148 527

Figure 12 shows the scatter plots of P4's measures. P4 produced one categorical error in blade angle in the velar + / / twisters and none in the other three twister categories. This error, classified as CBND in Table 6, is located in the bottom right panel of Figure 12. One of P4's productions with an alveolar target and / / vowel falls within the 2 SD box for velars; however, it is categorized as NBCD due to the overlap in blade angle between the velar and alveolar categories. It is likely that the large 2 SD boxes for dorsum elevation for alveolars are camouflaging gradient errors in dorsum raising, but there is no straightforward means to assess how much of the variability is due to coarticulation with the velars and how much is due to gestural intrusion in a gradient speech error.
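The category labels used in Tables 5 and 6 (e.g., NBND, GBND, CBCD) can be illustrated with a minimal sketch. It assumes that a measure is normal (N) when it lies within 2 SD of the baseline mean for its target, categorical (C) when it lies within 2 SD of the baseline mean for the other category, and gradient (G) otherwise, with the target box taking priority when the boxes overlap. The function names and the baseline values in the usage lines are hypothetical and are not taken from the data reported here; treating every value outside both boxes as gradient is a simplification.

```python
def classify_measure(value, target_mean, target_sd, other_mean, other_sd):
    """Return 'N', 'C', or 'G' for a single articulatory measure."""
    in_target_box = abs(value - target_mean) <= 2 * target_sd
    in_other_box = abs(value - other_mean) <= 2 * other_sd
    if in_target_box:
        return "N"  # assumption: the target box wins when the two boxes overlap
    if in_other_box:
        return "C"
    return "G"  # assumption: anything outside both boxes counts as gradient


def classify_production(blade, dorsum, target, other):
    """Combine both measures into a label such as 'NBND' or 'CBCD'.

    `target` and `other` are dicts of hypothetical baseline statistics:
    {"blade": (mean, sd), "dorsum": (mean, sd)}.
    """
    b = classify_measure(blade, *target["blade"], *other["blade"])
    d = classify_measure(dorsum, *target["dorsum"], *other["dorsum"])
    return f"{b}B{d}D"


# Hypothetical baseline statistics, for illustration only.
alveolar = {"blade": (-20.0, 5.0), "dorsum": (6.5, 0.2)}
velar = {"blade": (-55.0, 5.0), "dorsum": (7.6, 0.15)}

print(classify_production(-22.0, 6.6, alveolar, velar))  # alveolar target, both measures in the alveolar box
print(classify_production(-54.0, 7.5, alveolar, velar))  # alveolar target produced in the velar box
print(classify_production(-22.0, 7.0, alveolar, velar))  # alveolar target with dorsum between the boxes
```

Under this sketch, a participant with heavily overlapping blade angle boxes, as observed for P4, would rarely receive a C label for blade angle, because the target box is checked first.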
[Figure 12: scatter-plot panels for Baseline Targets, Alveolar Targets, and Velar Targets, one column per vowel, with Alveolar and Velar series; x-axis Blade Angle (deg), y-axis Dorsum Dist (cm).]
Figure 12 Articulations for P4 delineated by vowel and baseline, alveolar, and velar productions by blade angle and dorsum distance
Discussion

The productions for P1 were essentially error free. A small amount of increased variability in the alternating twister productions was observed, which may have been gradient errors or may have been due only to coarticulation. The productions measured for P2 and P3 are quite similar to one another. They differ from P1 in that both participants generated numerous speech errors while reading the tongue twisters. The overlapping mean and standard deviation boxes for blade angle for P4 make it more difficult to directly compare those measurements to those of other participants. This discussion will therefore focus on the data from P2 and P3, as both have obvious loci of productions for alveolar and velar targets and both produced a number of errors.

P2 and P3 had clear categorical errors where both the blade angle and dorsum distance values were normal for a production of the other category (i.e., a normal production of the wrong segment). A small number of productions were measured to be categorical in only one dimension. P2 had categorical errors in dorsum distance for alveolar targets combined with a normal blade angle or a gradient error in blade angle. P2 had categorical blade angle errors for alveolar targets combined with both normal dorsum distance and gradient error dorsum distance.

P2 and P3 had clear gradient errors, where one dimension was intermediate between a normal production for its target category and a normal production for the other category. In most cases these were closer to the normal values, but the large number of instances where the productions are in space clearly distant from the cloud of normal productions suggests that at least some are gradient errors and not just instances of coarticulation. This replicates findings by Mowrey and MacKay (1990), Frisch and
Wright (2002), and Pouplier (2003). A small number of productions are gradient errors in both dimensions and are found in the space directly between the 2 SD boxes. These might be viewed as gestural blends, where gestures for both segments compete to be articulated.

Pouplier and Goldstein (2005) concluded that the majority of their production errors consisted of "intrusion errors." They stated that articulatory gestures of one phoneme (e.g., /k/) were more likely to intrude upon another articulatory gesture (e.g., /t/) than for the articulatory gesture for a phoneme to be reduced. In particular, many errors produced in their experiment exhibited the intrusion of tongue dorsum raising on /t/, resulting in a simultaneous co-production of two gestures. The data in this experiment do not show a large number of co-productions (which would appear above the normal alveolar box and to the right of the normal velar box in Figures 6, 8, 10, and 12).

For P2, a few alveolar productions do seem to fit this pattern; however, categorical errors with a fully raised dorsum and lowered tongue tip were much more common. Also, some of P2's velar productions seem to have a small reduction in dorsum raising. The reduction errors seem to be about as frequent as categorical errors in the velars where the tongue dorsum is not raised and the tongue blade is raised.

For P3, it appears some small increase in dorsum raising frequently occurs for an alveolar target, which is correlated with a small decrease in blade angle. This may be comparable to the dorsum intrusion error of Pouplier and Goldstein (2005). Categorical errors in alveolar targets were also quite common for P3. For velar targets, categorical errors were most common. A small number of velar targets may have a reduction of dorsum raising, especially for / /.
Overall, these data do show some similarity to the findings of Pouplier and Goldstein (2005). However, these data contain fewer gradient errors and more categorical errors than observed by Pouplier and Goldstein. This is probably due in part to a difference in the production task that was used: Pouplier and Goldstein collected a large number of repetitions of two-word phrases, while this study used fewer repetitions of four-word phrases. In addition, this study used a lower rate of production. It would not be surprising if a larger number of faster repetitions led to more gradient articulations than were found with the tongue twister task of this study, due to the increased motor planning demand produced by attempting to speak a large number of repetitions as quickly as possible.

Many errors classified as gradient fell just outside of the 2 SD boxes and so may just reflect increased variability in articulation due to the more strenuous nature of the tongue twister task. However, a few cases cannot be wholly dismissed as variability, but also are not categorical errors. These cases show that gradient errors observed by Pouplier and Goldstein (2005), as well as by Frisch and Wright (2002) and Mowrey and MacKay (1990), occur in these data as well. These cases are rare in comparison to the occurrences of categorical errors, which were the most common error type for P2 and P3. The higher rate of categorical errors probably reflects the more language-like nature of the tongue twister task, which suggests that most of the errors observed in naturally occurring speech error corpora are categorical errors.
CHAPTER THREE: PERCEPTION OF PLACE OF ARTICULATION ERRORS

The second experiment is based on the instrumental speech error data from Experiment 1 that revealed variations in articulation. Words and nonwords containing misarticulated phoneme gestures from these data were presented to naive listeners to obtain their perceptual judgements of the phoneme produced. Listener judgements of the phoneme produced were compared to actual measures from instrumental analysis using ultrasound.

Based on research by Mowrey and MacKay (1990) and Pouplier and Goldstein (2005), it was expected that the listeners would only perceive the production as different from the target when the misarticulated gestures were categorically different. When presented with "correct" articulations or gradient errors, the listeners were expected to perceive the target phoneme. Reaction times for responses were also collected to see if reaction times would reflect gradience in the productions that might not be seen in the identification data.

Stimuli

The stimuli were taken from the previous experiment involving ultrasound imaging of participants articulating tongue twisters. The stimuli presented to participants were compiled from examples of regular and irregular articulations uttered by P2 from Experiment 1. Several potential stimulus items were not used due to partial productions
of the word or syllable involving the initial consonant only or other factors interfering with clear auditory perception of the target sound (e.g., laughing). Otherwise, all available irregular articulations were used. For each irregular articulation that was used, a normal articulation of the same syllable in the same twister from a different repetition was also included, where possible.

Procedure. The experiment took place in a sound-controlled room where each participant sat in one of four carrels. Each carrel housed an LCD monitor with a mouse and headphones. The stimuli were presented using TDT System 3 stimulus presentation and recording equipment and were randomized using ECOS/Win experiment software. The stimuli were played over headphones at approximately 65 dB. The participant was presented with four options (i.e., /t/, /k/, /d/, or /g/) for each stimulus item presented. The response options were presented on the computer monitor and the participants used the mouse to click on their choices. Participants were instructed to listen carefully, as each stimulus was only played once. When the participant responded, the reaction time delay between the onset of the stimulus and the response was also recorded by the computer.

Participants. Forty-six (46) students in an undergraduate speech anatomy course participated in the experiment. They received extra credit for their participation. Subjects were between 19 and 45 years of age. All participants were monolingual native speakers of American English who reported no past speech or hearing disorders.

Results

It was expected that responses matching the target phoneme would occur for normal articulations and gradient errors. It was expected that categorical errors would be
more likely to be perceived as different from the target phoneme. Reaction times were expected to demonstrate slower reactions to gradient errors than to normal productions or categorical errors.

Identification

Table 7 shows the percentage of responses that reflected the perception of a place of articulation error from the target, classified by production category. The table shows the average over all presentations of stimuli in that category to all participants. Stimuli that were judged to be in a different voicing category than the target (i.e., /d/ for /k/ or /t/) were eliminated from the analysis. Appendix A contains a complete list of target items and percentage of place of articulation error judgements by item.

The analysis reveals that perception was nearly uniform across participants, with only a few exceptions. All but two of the 21 stimuli in category CBCD (categorical error) were judged to be 100% in the other category than the production target. Of the two stimuli that were judged to be less than 100% in the other category, only two people judged one stimulus to sound the same as the target and only one person judged the other stimulus to be the same as the target. One of the four stimuli for category GBGD (gradient in blade angle, gradient in dorsum distance) was a velar target, while the remaining three were alveolar targets. The velar target was judged by all participants to be different from the target and all the alveolar targets were judged to be the same as their target category, which explains the exact 25% overall percentage of error selection. Examination of the measurements for this stimulus showed that the velar stimulus fell just outside the 2 SD box for the alveolar category. In fact, all GBGD stimuli fell roughly in this area and were heard as alveolar productions. Table 7 also reveals that many items measured to be gradient
errors were not perceived to be categorically different from their target by any participant.

Table 7 Number of stimulus presentations by category and percentage of error selection by listeners
Category  Number of Stimuli  Percentage of Error Selection
NBND      54                 0.7%
NBGD      9                  0.2%
NBCD      2                  0.0%
GBND      22                 0.6%
GBGD      4                  25.0%
GBCD      0                  -
CBND      1                  0.0%
CBGD      0                  -
CBCD      21                 99.7%

Figure 13 illustrates how the stimuli were perceived by the participants as a function of the articulatory measures. As in Experiment 1, the boxes on the figure outline 2 SD from the means for each target sound, which are shown as dotted lines. The diagonal lines through the centers of the panels indicate the midpoint between alveolar and velar articulations. Perceptions of the productions are perfectly predicted by this dividing line. Articulations on the alveolar side were heard as alveolars and articulations on the velar side were heard as velars. Note that this perceptual boundary is fairly far in articulatory space from the region of normal productions for these sounds. This suggests that there can be considerable variability in production without changing the identity of the percept, which is an issue that will be addressed more fully in the discussion.

The few stimuli of the "normal" or "gradient error" categories that were perceived by some listeners to be different from the target had no distinguishing characteristics
from other stimuli in the same categories. Those items designated in Figure 13 as "mixed" are items that more than one person judged differently from the majority of the participants; they are shown on the figure as red circles. Items that just one person responded to as different from the target category are not marked as mixed, as these may have been accidental incorrect responses rather than genuine differences in perception.

All of the mixed items are found in the alveolar targets. In the upper left panel, which shows alveolar targets for the vowel / /, the mixed item was measured to be a categorical error production (CBCD) but was judged by two participants to be a target alveolar production. Three mixed items are in the upper right panel, which shows alveolar targets for the vowel / /. Two were measured to be normal productions (NBND) and one was measured to have a gradient error of the blade angle (GBND). The two NBND stimulus items were judged to be categorically different by five and seven participants respectively, while the GBND stimulus was judged to be different by four participants. Of the 53 stimulus items within the NBND category ("normal"), an additional five items were judged by only one person to be categorically different from the target, which may be attributable to an accidental incorrect selection of consonant choice by the participant during the experiment.

In category NBGD (gradient error in dorsum distance), one participant judged one item out of nine to be categorically different. Two of 22 items in the GBND category (gradient error in blade angle) were judged by one participant to be categorically different. In the category GBGD, four stimulus items were presented to listeners. Three of the items were perceived to be the target 100% of the time. One of the GBGD
stimulus items was judged to be categorically different by 100% of the participants. The GBGD item judged to be categorically different was a velar target that was measured to be close to the alveolar box, shown on Figure 13 in the lower left panel.

[Figure 13: four scatter-plot panels, Alveolar Targets (top; series Normal, Error, Mixed) and Velar Targets (bottom; series Normal, Error), one column per vowel; x-axis Blade Angle (deg), y-axis Dorsum Dist (cm).]
Figure 13 Distributions of perceptual judgements

Reaction Times

Reaction times for identification of the onset phoneme were gathered during the course of the experiment. The participants were given no particular
instructions to answer quickly; therefore, reaction times varied greatly between participants. Also, the experiment used a mouse for judgement selection rather than a "button box," which has the potential to introduce additional variability in the lag time between selection of choice and logging of an answer. However, some evidence of patterns in reaction time was observed despite these limitations.

Due to the variation in reaction times for each participant, z-scores were calculated to standardize the results and allow statistical comparisons across participants. Z-scores are used to compare the relative standings of items from distributions with different means and/or different standard deviations. The z-score for an item indicates how far and in what direction that item deviates from its distribution's mean and is expressed in units of its distribution's standard deviation. Therefore, the z-scores for each participant reveal how quickly or slowly they responded to a category of stimuli, based upon their individual mean and standard deviation. Z-scores that are near zero indicate a reaction time that was near average, whereas z-scores that are positive correspond to a longer response time than average and z-scores that are negative correspond to a shorter response time than average. In an effort to control for outliers, z-scores that were more than 3 SD from the average were discarded before averaging across subjects and stimuli.

The reaction times are broken down by alveolar targets and velar targets in Tables 8 and 9, respectively. Within each table, the number of stimuli is given, and the reaction times are further broken down by the average reaction time, SD of reaction time, average z-score, and SD of z-score. Tables 8 and 9 show that the pattern in actual reaction times matches the pattern in z-scores, suggesting that the participants had similar overall patterns of reaction time even though they had mean reaction times that differed
by 500-1000 ms in some cases. These tables also show that some of the error categories had very few stimulus items, so the reaction times for these categories are potentially not representative.

Table 8 Reaction times and z-scores for perception of alveolar production targets
Category  Number of Stimuli  Average RT (ms)  SD RT (ms)  Average Z-score  SD Z-score
NBND      38                 1966             512         -0.25            0.93
NBGD      2                  2078             575         0.02             0.79
NBCD      2                  2152             533         0.27             0.85
GBND      21                 2034             531         -0.05            0.91
GBGD      1                  2043             517         -0.03            0.83
CBGD      3                  1962             496         -0.23            0.76
CBCD      15                 2139             534         0.24             1.22

Table 9 Reaction times and z-scores for perception of velar production targets
Category  Number of Stimuli  Average RT (ms)  SD RT (ms)  Average Z-score  SD Z-score
NBND      16                 2085             538         0.10             0.90
NBGD      7                  2131             579         0.18             0.95
GBND      1                  1957             480         -0.20            1.29
GBGD      1                  1942             608         -0.32            1.50
CBCD      7                  1993             569         -0.20            1.01

Given that several of the gradient error production categories have very few stimulus items in them, these data were aggregated for statistical analysis. The gradient error categories were combined into a single group, resulting in three production error groups: Normal, Gradient Error, and Categorical Error. Analysis of variance (ANOVA) was performed with two independent variables: Target (Alveolar, Velar) and Error Category (Normal, Gradient Error, and Categorical Error). The dependent variable was the z-score for the reaction time. ANOVA was performed for z-scores averaged over
items within subjects (analysis by subjects) and also for z-scores averaged over subjects within items (analysis by items). For the main effect of target, there was no significant difference [by subjects F(1,270) = 0.166, p > .05; by items F(1,108) = 0.046, p > .05]. In other words, on average the alveolar targets were responded to at the same rate as velar targets. For the main effect of error category, results were mixed [by subjects F(2,270) = 6.031, p < .01; by items F(2,108) = 1.568, p > .05]. A post-hoc Tukey HSD test for the analysis by subjects found that responses to the normal productions were faster than responses to the gradient errors and categorical errors, but the gradient errors and categorical errors were responded to at the same rate. The interaction of target and error category was significant [by subjects F(2,270) = 80.875, p < .01; by items F(2,108) = 15.042, p < .01]. The rate of response to different error types differed depending on whether the target was alveolar or velar.

This interaction can clearly be seen in Figure 14, which shows the average reaction time grouped by target and error category as in the ANOVA. Figure 14 shows that for alveolar targets, normal productions were responded to more quickly than categorical error productions. On the other hand, for velar targets, normal productions were responded to more slowly than categorical error productions. This interaction can be explained if the data are considered in terms of the listeners' percept rather than the production target. A categorical error for an alveolar target is produced in the velar articulatory space, resulting in a velar percept. Similarly, a categorical error for a velar target is produced in the alveolar articulatory space, resulting in an alveolar percept. Comparing the reaction time for normally produced alveolars (first column in Figure 14) versus alveolars that are the result of errors (last column in Figure 14), the responses to
the normal alveolars are 27 ms faster than the responses to the velar errors produced as alveolars. There is an analogous pattern for the velars. Comparing the reaction time for normally produced velars (fourth column in Figure 14) versus velars that are the result of errors (third column in Figure 14), the responses to the normal velars are 54 ms faster than the responses to the alveolar errors produced as velars.

With this perceptual pattern in mind, a second trend also emerges. Overall reaction times to alveolar percepts (first, second, and last columns) are on average 115 ms faster than reaction times to velar percepts (third, fourth, and fifth columns). There is also some evidence that reaction times to gradient errors are slower than reaction times to normal items. This is seen most clearly for the alveolars, but the small trend for the velars is also in the same direction.

[Figure 14: column chart of Reaction time (ms) for Normal, Gradient, and Categorical productions, grouped by Alveolar Target and Velar Target.]
Figure 14 Reaction time to stimuli on the basis of production target and 3-way error categorization
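The z-score standardization described in the Reaction Times section can be illustrated with a minimal sketch. It standardizes each participant's reaction times against that participant's own mean and standard deviation and then discards values more than 3 SD from the mean. The function name, the use of the sample standard deviation, and the example reaction times are assumptions made for illustration; the thesis does not state which SD estimator was used.

```python
from statistics import mean, stdev


def zscore_by_participant(rts_by_participant, cutoff=3.0):
    """Convert each participant's raw RTs (ms) to z-scores and drop outliers.

    `rts_by_participant` maps a participant ID to a list of reaction times.
    Returns a dict with the same keys, holding the z-scores whose absolute
    value does not exceed `cutoff`.
    """
    result = {}
    for participant, rts in rts_by_participant.items():
        m = mean(rts)
        sd = stdev(rts)  # assumption: sample SD (n - 1 denominator)
        if sd == 0:
            zs = [0.0 for _ in rts]  # all RTs identical, so no deviation to express
        else:
            zs = [(rt - m) / sd for rt in rts]
        result[participant] = [z for z in zs if abs(z) <= cutoff]
    return result


# Hypothetical reaction times: the second participant is uniformly slower,
# but the relative pattern across stimuli is the same.
example = {
    "fast_listener": [1000, 1200, 1400],
    "slow_listener": [2000, 2400, 2800],
}
print(zscore_by_participant(example))
```

In this hypothetical example both listeners receive the same z-scores despite mean reaction times that differ by more than a second, which is the property that allows comparison across participants.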
Discussion

Overall, the results of this perception experiment follow a very simple pattern: stimuli that were measured to be within velar articulatory space were heard as velar productions and stimuli that were measured to be within alveolar articulatory space were heard as alveolar productions. Only two stimuli were measured to be very near to the "line" between velar space and alveolar space, but both fall just on the alveolar side of the articulatory space. Both of these productions were heard as alveolars by the participants. The reaction time analysis found suggestive evidence that error productions are not exactly the same as normal productions. This finding should be investigated further using more sensitive procedures.

Overall, the findings are compatible with the claims of Mowrey and MacKay (1990) and Frisch and Wright (2002) that gradient errors are heard as normal. The results are also comparable to the study of Pouplier and Goldstein (2005), who found gradient errors were generally not perceived as misproductions by naive listeners. Pouplier and Goldstein's claim that dorsum intrusions on alveolars are more perceptible than tongue tip intrusions on velars could not be tested due to the small number of intrusions in the present data.
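The simple pattern described above, in which the percept follows the side of the midpoint line on which a token was produced, can be sketched as follows. The sketch expresses each measure as a fraction of the way from the alveolar baseline mean to the velar baseline mean and averages the two fractions, so that 0.5 corresponds to the midpoint. This rescaling is an assumption chosen to put degrees and centimeters on a common footing; how the diagonal lines in Figure 13 were actually drawn is not specified here, and the means in the usage lines are hypothetical.

```python
def predicted_percept(blade, dorsum, alveolar_mean, velar_mean):
    """Return 'alveolar' or 'velar' according to the side of the midpoint boundary.

    `alveolar_mean` and `velar_mean` are (blade_angle_deg, dorsum_dist_cm)
    tuples of hypothetical baseline means.
    """
    # Fraction of the way from the alveolar mean (0.0) to the velar mean (1.0)
    # in each dimension; this removes the unit mismatch between the measures.
    blade_frac = (blade - alveolar_mean[0]) / (velar_mean[0] - alveolar_mean[0])
    dorsum_frac = (dorsum - alveolar_mean[1]) / (velar_mean[1] - alveolar_mean[1])
    position = (blade_frac + dorsum_frac) / 2
    # Tokens exactly on the boundary are assigned to 'alveolar' arbitrarily.
    return "velar" if position > 0.5 else "alveolar"


# Hypothetical baseline means, for illustration only.
ALVEOLAR = (-20.0, 8.0)
VELAR = (-55.0, 9.0)

print(predicted_percept(-30.0, 8.3, ALVEOLAR, VELAR))  # gradient token nearer the alveolar mean
print(predicted_percept(-45.0, 8.8, ALVEOLAR, VELAR))  # gradient token nearer the velar mean
```

A gradient error just outside the alveolar 2 SD box therefore still maps to an alveolar percept in this sketch, which mirrors the finding that such tokens were heard as alveolars.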
CHAPTER FOUR: CONCLUSIONS/DISCUSSION

Conclusions

Experiment 1 used ultrasound to capture articulation in action in tongue twisters and quantify where it happened in articulatory space. Analysis of the measurements revealed that participants made both categorical errors of articulation and gradient errors of articulation. Some of the errors made by the speakers were produced as a completely different phoneme. On the other hand, some of the errors made by the speakers were only somewhat different from the target, but were different enough to fall outside of the normal articulatory space for the target.

Experiment 2 used data collected during Experiment 1 and presented tokens of normal productions, gradient errors, and categorical errors to naive listeners. As one would expect, listeners perceived normal productions as the target and categorical error productions as the other category. Interestingly though, perception of gradient errors appeared to depend on where in articulatory space the token was produced. Gradient errors produced on the alveolar side of the articulatory space were perceived to be alveolars and gradient errors produced on the velar side of the articulatory space were heard as velars. Analysis of reaction times from this experiment revealed that listeners were slower to respond to gradient and categorical error productions compared
to normal productions. On average, responses to alveolar percepts were faster than responses to velar percepts, obscuring the error pattern.

Discussion

Transcription of speech errors has provided a great deal of linguistic information about how speech is produced. However, for speech errors to be perceived by listeners in running speech, they must necessarily be obvious. Transcribing speech errors into IPA also constrains them by imposing a label upon what was heard.

Experiment 1 revealed that errors occur that are not within the normal range for either the target category or the other category. These findings are similar to those of other researchers who have used instrumentation to quantify speech errors (Mowrey & MacKay, 1990; Frisch & Wright, 2002; Pouplier & Goldstein, 2005). To determine how gradient errors are perceived, they were included as stimulus items along with normal productions and categorical errors in Experiment 2. The perception of all the gradient errors depended upon where in articulatory space they were produced. If the gradient error was produced on the alveolar side of the space, then it was perceived to be an alveolar; however, if it was produced on the velar side of the space, then it was perceived as a velar. This indicates that transcription of speech errors is not accurate enough to record small articulatory errors made by speakers.

One possible explanation for the way gradient errors are perceived is found in quantal theory (Stevens, 1972). The thesis of the quantal theory is that certain, relatively large changes in articulator position will cause little change in the acoustic signal, while other, relatively small changes in articulator placement will cause large changes in the
acoustic signal. The extent of the acoustic change appears to be related to the particular region of the vocal tract where the articulation is located. In certain critical regions, a slight adjustment of articulatory placement will cause a large acoustic change in sound. Therefore, the boundary line between articulatory spaces may be the critical region that causes the production to be perceived as either on target or in error. The relatively large area between the "box" of normal articulation for velars and alveolars and the boundary line between the two categories in perception would allow for a great deal of variability of articulation that would still result in the listener perceiving the target sound. Quantal theory also proposes that, in general, languages use phonemes that are articulated in regions that are large enough to withstand individual variations of production. These large articulatory spaces are thought to account for relatively steady state productions while the articulators are moving through the area. The findings of this study support these ideas, in that relatively large variability in production did not affect identification.

While quantal theory may provide a convenient explanation for the correlation of gradient error perception and articulatory space, the reaction time data are not as clear. The possibility exists that the measurements from the ultrasound recordings do not provide a complete picture of the productions. The difference in reaction times for normal and error productions (especially the categorical errors) suggests that the listeners were affected by some cue that was unaccounted for by articulatory measures of blade angle and dorsum distance during the stop closure. Further research into reaction times for perception of speech errors needs to be conducted to determine if some as yet unmeasured part of the utterance interferes with perception of errors.
Clinical Uses

It has been demonstrated in this thesis and elsewhere that articulatory gestures produced slightly differently than "normal" can produce sounds that are heard as the target by the listener (Boucher, 1994; Frisch & Wright, 2002; Mowrey & MacKay, 1990; Pouplier, 2003; Pouplier & Goldstein, 2005). The perception of articulatory gestures that are more than slightly different as categorical errors implies that the gesture is perceived as the category closest to where it is produced. Ultrasound could provide a visual clue as to where an individual is producing a particular phoneme.

The use of ultrasound in a clinical setting with clients who have articulation errors could provide an accessible visual tool in remediating articulation impairments. Many a speech-language pathologist can relate anecdotes about trying to explain to a client, young or old, hearing-impaired or normal hearing, how to move the tongue to produce the target sound. Providing a real time ultrasound image to the client of their own tongue and how it moves could provide a valuable asset to speech-language pathologists. While the clinician instructs the client where and how to move the tongue, the client can, at the same time, watch to see if they are moving their tongue to the appropriate location. The clinician can point to the screen and show the client the general area where they need to put their tongue in order to approximate the target sound. The results of this thesis show that the speaker can produce a perceptually correct sound even if the articulators are not in exactly the same location each time, or in the ideal position desired by the clinician. Therefore, guiding the client to an approximation of the target sound may be enough to help them produce a perceptually accurate phoneme.
Research into the usability of ultrasound in field applications has demonstrated that a reliable ultrasound image can be produced if the subject rests his/her head against a stable surface, such as a headrest against a wall (Gick, Bird, & Wilson, 2005). Combining the use of a headrest with a fixed ultrasound probe, such as one mounted on a microphone stand, further enhances the reliability of the ultrasound image. Finally, in order for a clinician to utilize these methods in combination with therapy, placing the ultrasound monitor where the client could see it easily would allow for interaction between articulation and visual feedback.

While Bernhardt, Gick, Bacsfalvi, and Adler-Bock (2005) point out that portable ultrasound machines may not generate the consistency of data collection that stationary machines with fixed transducers provide, they do allow treatment to be provided in locations convenient for the client. Their research has shown that use of ultrasound as visual feedback can improve the client’s understanding of what they are being taught by the clinician. Bernhardt et al. suggest that the clinician demonstrate the use of the ultrasound probe for the client not only to show how to position it properly (if a fixed transducer is not used) but also to provide a visual ultrasound image of the desired target production. A sagittal view of the tongue from the transducer provides a descriptive image of velar and alveolar productions, vowel productions, approximants /r/ and /l/, and sibilants and affricates. Rotating the transducer to create a coronal image can provide another image of sibilants and affricates that shows grooving of the tongue.

Descriptive research has been conducted that uses ultrasound to examine the shape of the tongue during the articulation of vowels and consonants. This research has generated a great deal of information regarding how the tongue postures and moves to
create sounds. Clinical research on the application of ultrasound as a tool for abnormal speech production and on the use of ultrasound to provide articulatory feedback is just beginning.
REFERENCES

Baars, B., Motley, M., & MacKay, D. (1975). Output editing for lexical status in artificially elicited slips of the tongue. Journal of Verbal Learning and Verbal Behavior, 14, 382-391.
Bernhardt, B., Gick, B., Bacsfalvi, P., & Adler-Bock, M. (2005). Ultrasound in speech therapy with adolescents and adults. Clinical Linguistics & Phonetics, 19(6/7), 605-617.
Boucher, V.J. (1994). Alphabet-related biases in psycholinguistic enquiries: considerations for direct theories of speech production and perception. Journal of Phonetics, 22(1), 1-18.
Dell, G.S., & Reich, P. (1981). Stages in sentence production: An analysis of speech error data. Journal of Verbal Learning and Verbal Behavior, 20, 611-629.
Dell, G.S. (1990). Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes, 5(4), 313-349.
Frisch, S.A., & Wright, R. (2002). The phonetics of phonological speech errors: an acoustic analysis of slips of the tongue. Journal of Phonetics, 30, 139-162.
Fromkin, V. (1971). The non-anomalous nature of anomalous utterances. Language, 47(1), 27-52.
Freud, S. (1901). The psychopathology of everyday life. Standard Edition of the Complete Psychological Works of Sigmund Freud. Vol. 6.
Garrett, M. (1975). The analysis of sentence production. In G. Bower (Ed.), The Psychology of Learning and Motivation. New York: Academic Press, 133-177.
Garrett, M. (1976). Syntactic processes in sentence production. In E.C.T. Walker & Wales (Eds.), New Approaches to Language Mechanism. Amsterdam: North Holland.
Gick, B. (2002). The use of ultrasound for linguistic phonetic fieldwork. Journal of the International Phonetic Association, 32(2), 113-121.
Gick, B., Bird, S., & Wilson, I. (2005). Techniques for field application of lingual ultrasound imaging. Clinical Linguistics & Phonetics, 19(6/7), 503-514.
Hardin, S.A. (2005). Reliability of Hand Measures of Ultrasound Analysis. Unpublished Master’s Thesis, University of South Florida, Tampa, FL.
MacKay, D. (1970). Spoonerisms: The structure of errors in the serial order of speech. Neuropsychologia, 8, 323-350.
Meijer, P. (1997). What speech errors can tell us about word-form generation: The roles of constraint and opportunity. Journal of Psycholinguistic Research, 26(1), 141-158.
Mowrey, R., & MacKay, I. (1990). Phonological primitives: electromyographic speech error evidence. Journal of the Acoustical Society of America, 88(3), 1299-1312.
Peng, C.L., Jost-Brinkmann, P.G., & Miethke, R.R. (1996). The cushion-scanning technique: A method of dynamic tongue sonography and its comparison with the transducer-skin coupling scanning technique during swallowing. Journal of Academic Radiology, 3, 239-244.
Pouplier, M. (2003). Units of Phonological Encoding: Empirical Evidence. Ph.D. dissertation, Yale University.
Pouplier, M., & Goldstein, L. (2005). Asymmetries in the Perception of Speech Production Errors. Journal of Phonetics, 33, 47-75.
Shattuck, S. (1975). Speech Errors and Sentence Production. Unpublished doctoral dissertation, Department of Psychology, Massachusetts Institute of Technology, Cambridge, MA.
Shattuck-Hufnagel, S., & Klatt, D. (1979). The limited use of distinctive features and markedness in speech production: evidence from speech error data. Journal of Verbal Learning and Verbal Behavior, 18, 41-55.
Shattuck-Hufnagel, S. (1992). The role of word structure in segmental serial ordering. Cognition, 42, 213-259.
Stemberger, J., & Treiman, R. (1986). The internal structure of word-initial consonant clusters. Journal of Memory and Language, 25, 163-180.
Stevens, K.N. (1972). The Quantal Nature of Speech: Evidence from Articulatory-Acoustic Data. In E.E. David, Jr. and P.B. Denes (Eds.), Human Communication: A Unified View. New York: McGraw-Hill, 51-66.
Stone, M. (2005). A guide to analyzing tongue motion from ultrasound images. Clinical Linguistics & Phonetics, 19(6/7), 455-501.
Stone, M., & Davis, E. (1995). A head transducer support system for making ultrasound images of tongue/jaw movement. Journal of Acoustical Society of America, 98, 3107-3112.
Stone, M., Sonies, B., Shawler, T., Weiss, G., & Nadel, L. (1983). Analysis of real-time ultrasound images of tongue configuration using grid-digitizing system. Journal of Phonetics, 11, 207-218.
Wilshire, C.E. (1999). The “tongue twister” paradigm as a technique for studying phonological encoding. Language and Speech, 42(1), 57-82.
Wodzinski, S.M. (2004). Ultrasound Analysis of Velar Fronting. Unpublished Master’s Thesis, University of South Florida, Tampa, FL.
APPENDIX A: PERCEPTION RESULTS FOR EACH STIMULUS

Code   Stimulus   Count   Percentage
NBND   n1kae1r2   46      0.0%
NBND   n1ta2      46      0.0%
NBND   n1ta4      46      0.0%
NBND   n1ta6r2    46      0.0%
NBND   n3kae3     46      0.0%
NBND   n3kae4     46      0.0%
NBND   n3ta1      46      0.0%
NBND   n3ta3      45      0.0%
NBND   n3tae4     46      0.0%
NBND   n3tae5     46      0.0%
NBND   n3tae7     46      0.0%
NBND   n4da1      45      0.0%
NBND   n4da2      45      2.2%
NBND   n4dae2     46      0.0%
NBND   n4dae3     46      0.0%
NBND   n5ta2      46      0.0%
NBND   n6da1      46      0.0%
NBND   n6da2      46      0.0%
NBND   n6da4      46      2.2%
NBND   n6dae5     46      2.2%
NBND   n6ga1      46      0.0%
NBND   n6ga2      46      2.2%
NBND   n7kae2     46      0.0%
NBND   n7ta2      46      0.0%
NBND   n7ta3      46      0.0%
NBND   n8da1      46      0.0%
NBND   n8da5      46      0.0%
NBND   n8dae3     46      0.0%
NBND   n8ga3      46      0.0%
NBND   w1ka5      46      0.0%
NBND   w1ta3      46      0.0%
NBND   w1ta4      46      0.0%
NBND   w2ta2      46      0.0%
NBND   w3dae1r4   46      0.0%
NBND   w3dae1r5   46      0.0%
NBND   w3dae2r2   46      0.0%
NBND   w3dae2r3   46      0.0%
NBND   w3gae3     46      0.0%
NBND   w4ka1      46      0.0%
NBND   w4ka3      46      0.0%
NBND   w4kae1     46      0.0%
NBND   w4kae2     46      0.0%
NBND   w4ta2      46      0.0%
NBND   w4ta4      46      0.0%
NBND   w4tae2r2   46      10.9%
NBND   w4tae3     46      0.0%
NBND   w4tae5     46      0.0%
NBND   w5ka2      46      0.0%
NBND   w5ka3      46      0.0%
NBND   w5ta1      46      0.0%
NBND   w5tae3     46      0.0%
NBND   w5tae4     46      15.2%
NBND   w6dae1r2   46      0.0%
NBND   w6ga5      46      2.2%
NBGD   n1kae1r1   46      0.0%
NBGD   n3kae1     46      0.0%
NBGD   n7ta4      46      0.0%
NBGD   w3dae2r5   46      0.0%
NBGD   w3gae2     46      0.0%
NBGD   w4ka2      46      0.0%
NBGD   w4ka5      46      0.0%
NBGD   w5ka1      46      0.0%
NBGD   w5ka4      46      2.2%
NBCD   n3ta2      46      0.0%
NBCD   n8da3      46      0.0%
GBND   n3kae5     46      0.0%
GBND   n3ta7      46      0.0%
GBND   n3tae3     46      0.0%
GBND   n3tae6     46      2.2%
GBND   n4da5      46      0.0%
GBND   n5ta4      46      0.0%
GBND   n5tae5     46      0.0%
GBND   w2ta6      46      0.0%
GBND   w3dae1r1   46      0.0%
GBND   w3dae1r2   46      0.0%
GBND   w3dae1r3   46      0.0%
GBND   w3dae2r1   46      0.0%
GBND   w3dae2r4   46      0.0%
GBND   w4ta1      46      0.0%
GBND   w4ta3      46      0.0%
GBND   w4ta5      46      0.0%
GBND   w4ta6      46      0.0%
GBND   w4tae6     46      0.0%
GBND   w5ta5      46      2.2%
GBND   w5tae1     46      0.0%
GBND   w5tae2     46      8.7%
GBND   w6dae1r3   46      0.0%
GBGD   n1ta1r2    46      0.0%
GBGD   n1ta3      46      0.0%
GBGD   n6ga6      46      100.0%
GBGD   w1ta1      46      0.0%
CBND   n1ta1r1    46      0.0%
CBCD   n1ta5      45      100.0%
CBCD   n1ta6r1    46      100.0%
CBCD   n4da4      45      100.0%
CBCD   n4dae4     44      100.0%
CBCD   n4dae6     46      100.0%
CBCD   n4ga4      46      100.0%
CBCD   n6da3      46      100.0%
CBCD   n6da5      46      100.0%
CBCD   n6da6      46      97.8%
CBCD   n6dae3     46      100.0%
CBCD   n7kae6     46      100.0%
CBCD   n7ta1      45      100.0%
CBCD   n8da2r1    45      95.6%
CBCD   n8dae4     44      100.0%
CBCD   n8ga4      46      100.0%
CBCD   w1ka6      46      100.0%
CBCD   w2ta3      46      100.0%
CBCD   w4kae3     46      100.0%
CBCD   w4kae4     46      100.0%
CBCD   w4tae1     46      100.0%
CBCD   w4tae2     eliminated – heard as place and voice error
CBCD   w6ga6      46      100.0%
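The Percentage values in Appendix A are consistent with a number of listener responses divided by the Count for that stimulus, rounded to one decimal place (for example, 2.2% matches 1 response out of 46, and 95.6% matches 43 out of 45). The appendix itself does not state the numerator, so the following Python sketch is an assumption for illustration only; the function name `percentage` and the response counts passed to it are hypothetical and are not taken from the author's analysis.

```python
def percentage(responses: int, count: int) -> float:
    """Return responses / count as a percentage rounded to one decimal place.

    `responses` is an assumed numerator (how many of the `count` judgments
    for a stimulus fell in the category being tallied). The appendix reports
    only `count` and the resulting percentage.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    if not 0 <= responses <= count:
        raise ValueError("responses must be between 0 and count")
    return round(100.0 * responses / count, 1)


if __name__ == "__main__":
    # Hypothetical numerators chosen because they reproduce values in the table.
    for responses, count in [(0, 46), (1, 46), (5, 46), (43, 45), (45, 46)]:
        print(f"{responses}/{count} -> {percentage(responses, count)}%")
```

Under this reading, a single differing response changes the percentage by roughly two points, which is why most non-zero entries for normal and gradient productions appear as 2.2%.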