Speech Processing 15-492/18-492
Human Speech Processing: Phonetics and Phonology
The vocal tract
From meat to voice
- Blow air out of the lungs
- Vibrate the larynx
- Vocal tract shape defines resonance
- Obstructions modify the sound
  - Tongue, teeth, lips, velum (nasal passage)
The ear
From sound to brain waves
- Sound waves
- Vibrate the eardrum
- Cause fluid in the cochlea to vibrate
- Spiral cochlea
  - Vibrates hairs inside the cochlea
  - Different frequencies vibrate different hairs
  - Converts time domain to frequency domain
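The cochlea's time-to-frequency conversion is loosely analogous to a Fourier transform. The sketch below is only an illustration of that analogy, not a model of the ear: it computes a naive discrete Fourier transform of a synthetic tone and reports the strongest frequency. The function names, the 8000 Hz sample rate and the 440 Hz test tone are all assumptions chosen for the example.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin (0 .. N/2) for real-valued samples."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags

def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(samples)

# Assumed test signal: a pure 440 Hz tone, 400 samples at 8000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(400)]
peak_hz = dominant_frequency(tone, SAMPLE_RATE)
print(peak_hz)
```

With 400 samples at 8000 Hz each bin is 20 Hz wide, so the 440 Hz tone falls exactly on one bin. A real signal would need windowing and a faster transform.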
From grunts to meaning
- Grunts and vocalization
- Lots of variation available (continuous systems, not discrete)
- Noises become distinct, recognizable
- Grow into languages, dialects and idiolects
- What are the fundamental units?
Articulatory Movements
Electromagnetic Articulograph
Phonemes
- Defined as fundamental units of speech
- If you change one, it can change the meaning
  - “pat” to “bat”
  - “pat” to “pam”
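The "change one unit, change the meaning" test can be sketched as a minimal-pair check over phoneme sequences. This is an illustrative sketch only; the function name and the transcriptions (written with the phone symbols used on the vowel and consonant slides) are assumptions.

```python
def is_minimal_pair(a, b):
    """True if two phoneme sequences have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(1 for x, y in zip(a, b) if x != y) == 1

# Assumed transcriptions of the slide's examples.
PAT = ["P", "AE", "T"]
BAT = ["B", "AE", "T"]
PAM = ["P", "AE", "M"]

print(is_minimal_pair(PAT, BAT))  # differ only in the first phoneme
print(is_minimal_pair(PAT, PAM))  # differ only in the last phoneme
print(is_minimal_pair(BAT, PAM))  # differ in two positions
```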
Vowel Space
- One or two banded frequencies (formants)
English (US) Vowels
- AA: wAshington
- AE: fAt, bAd
- AH: bUt, hUsh
- AO: lAWn, mAll
- AW: hOW, sOUth
- AX: About, cAnoe
- AY: hIde, bUY
- EH: gEt, fEAther
- ER: makER, sEARch
- EY: gAte, EIght
- IH: bIt, shIp
- IY: bEAt, shEEp
- OW: lOne, nOse
- OY: tOY, OYster
- UH: fUll
- UW: fOOl
English Consonants
- Stops: P, B, T, D, K, G
- Fricatives: F, V, HH, S, Z, SH, ZH
- Affricates: CH, JH
- Nasals: N, M, NG
- Liquids and glides: L, R, Y, W
- Note: unvoiced vs voiced
  - P vs B, F vs V
Number of Phonemes in Language
- US English: 43
- UK English: 44
- Japanese: 25
- Hindi: 81
- The numbers aren’t definitive, though
  - Depends on who you ask,
  - And what you want them for
Not all variation is Phonetic
- Phonology: linguistically discrete units
  - May be a number of different ways to say them
  - /r/ trill (Scottish or Spanish) vs US way
- Phonetics vs Phonemics
  - Phonemics: discrete units
  - Phonetics: all sounds
- /t/ in US English: becomes “flap”
  - Phonemic: “water” / w ao t er /
  - Phonetic: “water” / w ao dx er /
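The flap in “water” can be sketched as a rewrite rule that maps a phonemic transcription to a phonetic one. This is a deliberately simplified assumption: it flaps every t that sits between two vowels, whereas real US English flapping also depends on stress and other context. The function name and the vowel set (lowercased from the vowel slide) are illustrative.

```python
# Vowel symbols from the English (US) vowel list, lowercased to match the slide's transcription.
VOWELS = {
    "aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
    "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw",
}

def apply_flap(phonemes):
    """Replace t with dx whenever it occurs between two vowels (simplified rule)."""
    out = list(phonemes)
    for i in range(1, len(phonemes) - 1):
        if (phonemes[i] == "t"
                and phonemes[i - 1] in VOWELS
                and phonemes[i + 1] in VOWELS):
            out[i] = "dx"
    return out

print(apply_flap(["w", "ao", "t", "er"]))  # intervocalic t becomes dx
print(apply_flap(["t", "ao", "k"]))        # word-initial t is left alone
```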
Dialect and Idiolect
- Variation within language (and speakers)
- Phonetic
  - “Don” vs “Dawn”, “Cot” vs “Caught”
  - R deletion (Haavaad vs Harvard)
- Word choice:
  - Y’all, Yinz
- Politeness levels
Not all languages use the same set
- Aspirated stops (Korean, Hindi)
  - P vs PH
- English uses both, but doesn’t care
  - Pot vs sPot (place hand in front of mouth)
- L-R distinction in Japanese not phonological
- US English dialects:
  - Mary, Merry, Marry
- Scottish English vs US English
  - No distinction between “pull” and “pool”
  - Distinction between “for” and “four”
Different language dimensions
- Vowel length
  - Bit vs beat
  - Japanese: shujin (husband) vs shuujin (prisoner)
- Tones
  - F0 (tune) used phonemically, to distinguish words
  - Chinese, Thai, Burmese
- Clicks
  - Xhosa
Co-articulation
- Voicing actually doesn’t always stop
  - “have honey”, “impossible”
- Nasalized voices, lip rounding
  - “min” vs “bit”, “sow” vs “see”
- Lexical stress:
  - EMphasis, emPHAsis
  - PROject, proJECT
- Reduction, contraction
  - “A boy is riding a bike”
  - “I want to go to Disneyland.”
  - “I will go tomorrow”
Prosody
- Intonation
  - Tune
- Duration
  - How long or short each phoneme is
- Phrasing
  - Where the breaks are
Intonation (F0)
- Rate of vibration during voiced speech
  - Males: 80-140 times a second
  - Females: 130-220 times a second
  - Children: 180-320 times a second
- Used for:
  - Emphasis
  - Style: questions, statements, confidence, etc.
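As a rough illustration of F0 as a rate of vibration, the sketch below estimates F0 from a synthetic voiced signal by picking the autocorrelation peak, then looks up which of the slide's ranges contain it. The autocorrelation method, the function names, the 8000 Hz sample rate and the 100 Hz test tone are assumptions made for the example; the ranges are the ones listed above, and since they overlap, a value can match more than one group.

```python
import math

# Typical F0 ranges in Hz, as listed on the slide; note that they overlap.
F0_RANGES_HZ = {
    "male": (80, 140),
    "female": (130, 220),
    "child": (180, 320),
}

def estimate_f0(samples, sample_rate, min_hz=80, max_hz=320):
    """Estimate F0 as the autocorrelation peak among lags covering [min_hz, max_hz]."""
    min_lag = sample_rate // max_hz
    max_lag = sample_rate // min_hz
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[t] * samples[t + lag]
                    for t in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def groups_for_f0(f0_hz):
    """Return every speaker group whose typical range contains f0_hz."""
    return [name for name, (lo, hi) in F0_RANGES_HZ.items() if lo <= f0_hz <= hi]

# Assumed test signal: a 100 Hz sine standing in for voiced speech.
SAMPLE_RATE = 8000
voiced = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(800)]
f0 = estimate_f0(voiced, SAMPLE_RATE)
print(f0, groups_for_f0(f0))
```

Real speech is far less periodic than a sine wave, so practical pitch trackers add normalization, voicing detection and smoothing on top of this idea.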
Intonation Contour
Intonation Information
- Large pitch range (female)
- Authoritative, since it goes down at the end
  - News reader
- Emphasis on “Finance” (H*)
- Final has a rise: more information to come
- Female American newsreader from WBUR
  - (Boston University Radio)
Intonation Examples
- Fixed durations, flat F0
- Declining F0
- “hat” accents on stressed syllables
- Accents and end tones
- Statistically trained
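The first three contour styles above can be sketched as simple per-frame F0 generators. This is a toy illustration, not the method used to produce the lecture's audio examples: the function names, the linear shape of the decline, and all numeric values (start and end F0, accent size, stressed frame positions) are assumptions.

```python
def flat_contour(f0_hz, n_frames):
    """Constant F0 for every frame."""
    return [float(f0_hz)] * n_frames

def declining_contour(start_hz, end_hz, n_frames):
    """F0 falling linearly from start_hz to end_hz across the utterance."""
    if n_frames == 1:
        return [float(start_hz)]
    step = (end_hz - start_hz) / (n_frames - 1)
    return [start_hz + step * i for i in range(n_frames)]

def add_hat_accents(contour, stressed_frames, boost_hz):
    """Raise F0 by boost_hz on each stressed frame, leaving the rest untouched."""
    accented = list(contour)
    for i in stressed_frames:
        accented[i] += boost_hz
    return accented

# Assumed values: a 5-frame utterance falling from 120 Hz to 100 Hz,
# with 20 Hz accents on frames 1 and 3.
base = declining_contour(120, 100, 5)
accented = add_hat_accents(base, [1, 3], 20)
print(flat_contour(110, 5))
print(base)
print(accented)
```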
Words
- Words
  - The things with space around them (sort of)
  - Chinese, Thai and Japanese don’t use spaces
  - Speech doesn’t use spaces
  - Blackboard vs Black Board
- English
  - Morphology: walk, walks, walking, walked
- Japanese
  - Morphology: aruku, arukimasu, arukimashita, aruite, arukitai, arukitakatta, arukemasu, ...
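When there are no spaces, word boundaries have to be recovered some other way. One simple baseline is greedy longest-match segmentation against a word list, sketched below. The function name and the tiny lexicon are assumptions for illustration, and the “Blackboard vs Black Board” case shows why the result depends entirely on what the lexicon contains.

```python
def max_match(text, lexicon):
    """Greedy left-to-right segmentation: always take the longest lexicon word.

    Characters that start no known word are emitted as single-character tokens.
    """
    words = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in lexicon:
                words.append(text[i:j])
                i = j
                break
        else:
            words.append(text[i])
            i += 1
    return words

# Hypothetical lexicons: with and without the compound "blackboard".
with_compound = {"the", "black", "board", "blackboard"}
without_compound = {"the", "black", "board"}

print(max_match("theblackboard", with_compound))
print(max_match("theblackboard", without_compound))
```

Greedy matching cannot tell which reading the speaker intended; it only prefers the longest known word.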
Speech Acts
- Words aren’t always what they seem
  - Can you pass the salt?
  - Boston. Boston! Boston?
  - Yeah, right
- Multiple ways to say the same thing:
  - I want to go to Boston.
  - Yes
Human Speech
- Human production and perception
  - Quite different from computers
- Phonology
  - Defining the alphabet of speech
  - Different languages make different distinctions
- Intonation
  - How it’s said