Progress in the framework of the RESPITE project at DaimlerChrysler

SLIDE 1

Research & Technology

Progress in the framework of the RESPITE project at DaimlerChrysler Research & Technology

Dr.-Ing. Fritz Class and Joan Marí
Martigny, Jan. 2002

SLIDE 2

Contents

DaimlerChrysler off-line demonstrator
  • Block diagram of our off-line demonstrator
  • Next evaluation experiments using our demonstrator

On-going research in Discriminative Feature Extraction
  • TANDEM acoustic modelling
  • Linear-Discriminant-Analysis-based (LDA) front-end
  • Quadratic-Discriminant-Analysis-based (QDA) front-end
  • A two-layer perceptron to generate state posteriors from QDA features (RBFs)

Results

SLIDE 3

DC off-line demonstrator: block diagram

[Block diagram. Recoverable labels: AURORA PCM, Feature Extraction, Acoustic Modelling, Viterbi Decoding, AURORA REF, SCLITE package, Evaluation results; conversion blocks "to UNI_IO feature" and "to UNI_IO likelihood"; DC ASR system; CTK/QUICKNET/MSTK.]

SLIDE 4

DC off-line demonstrator: next steps

  • Evaluate the IDIAP Multi-Stream toolkit on the AURORA 2000 database and compare its results with those of the SPRACHcore and CTK toolkits
  • Given the results of that evaluation and our system requirements, determine which technique best suits our purposes
  • Using our own in-car American English database, compare our baseline system with the selected optimum technique

SLIDE 5

Contents

DaimlerChrysler off-line demonstrator
  • Block diagram of our off-line demonstrator
  • Next evaluation experiments using our demonstrator

On-going research in Discriminative Feature Extraction
  • TANDEM acoustic modelling
  • Linear-Discriminant-Analysis-based (LDA) front-end
  • Quadratic-Discriminant-Analysis-based (QDA) front-end
  • A two-layer perceptron to generate state posteriors from QDA features (RBFs)

Results and Conclusions

SLIDE 6

Discriminative Feature Extraction: TANDEM training

[Block diagram. Recoverable labels: Feature Extraction, Database PCM, Database CMF, Database ALI, EBP algorithm, NN weights, Forward NN pass, Database NN outputs, PCA comput., PCA matrix, PCA, Database PCA, Unsuprv. Clustering, CB, CB inversion, CB inv, VQ, HMM Forward-Backward.]

Neural Net training
Non-linear transform of the feature space
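
The chain "Forward NN pass, then PCA" in the diagram above can be sketched as follows. This is only an illustration under stated assumptions: the sigmoid hidden layer, the absence of bias terms, and every numeric value (`W_HIDDEN`, `W_OUT`, `PCA_MEAN`, `PCA_MATRIX`) are hypothetical and not taken from the slides. In the actual system the net weights would come from EBP training and the PCA matrix from the PCA computation step.

```python
import math

def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def log_posteriors(x, w_hidden, w_out):
    """Forward NN pass: sigmoid hidden layer, softmax output, then log."""
    hidden = [1.0 / (1.0 + math.exp(-a)) for a in matvec(w_hidden, x)]
    return [math.log(p) for p in softmax(matvec(w_out, hidden))]

def tandem_features(x, w_hidden, w_out, pca_mean, pca_matrix):
    """Centre the log-posteriors and project them with a given PCA matrix."""
    logp = log_posteriors(x, w_hidden, w_out)
    centred = [lp - m for lp, m in zip(logp, pca_mean)]
    return matvec(pca_matrix, centred)

# Hypothetical toy values: 2 inputs, 3 hidden units, 3 classes, 2 PCA outputs.
W_HIDDEN = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
W_OUT = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 1.0]]
PCA_MEAN = [-1.1, -1.1, -1.1]
PCA_MATRIX = [[0.7, 0.0, -0.7], [0.4, -0.8, 0.4]]

frame = [0.3, -1.2]
logp = log_posteriors(frame, W_HIDDEN, W_OUT)
features = tandem_features(frame, W_HIDDEN, W_OUT, PCA_MEAN, PCA_MATRIX)
print(features)
```

The resulting vector is what would be passed on to the HMM stage as the transformed feature.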

SLIDE 7

Discriminative Feature Extraction: LDA training

[Block diagram. Recoverable labels: Feature Extraction, Database PCM, Database CMF, Database ALI, Suprvsd. Clustering, CB, LDA comput., LDA matrix, trnsfrm. & inv., inverted CB, LDA transform, VQ, Forwrd-Backwrd HMM.]

Supervised Clustering
Linear transform to reduce dimensionality
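
To make "linear transform to reduce dimensionality" concrete, here is a minimal sketch of the two-class Fisher discriminant in two dimensions. It is a deliberately simplified stand-in: the front-end described above works on many classes obtained by supervised clustering, whereas this example uses two hand-made classes (`CLASS_X`, `CLASS_O`, both hypothetical) and projects them onto a single direction.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(2)]

def scatter(points, mu):
    """2x2 scatter matrix of the points around mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Direction w = Sw^-1 (mean_b - mean_a) for two classes in 2-D."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, point):
    """Reduce a 2-D point to one dimension along w."""
    return w[0] * point[0] + w[1] * point[1]

# Hypothetical toy data for two classes.
CLASS_X = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.5), (2.0, 2.0)]
CLASS_O = [(5.0, 1.0), (6.0, 2.0), (7.0, 2.5), (6.0, 1.0)]

w = fisher_direction(CLASS_X, CLASS_O)
proj_x = [project(w, p) for p in CLASS_X]
proj_o = [project(w, p) for p in CLASS_O]
print(w, proj_x, proj_o)
```

On this toy data the two classes no longer overlap after projection, which is the property the LDA matrix is meant to provide before vector quantisation.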

SLIDE 8

Discriminative Feature Extraction: LDA

[Diagram comparing the two training schemes. TANDEM training: EBP algorithm, Database ALI, NN weights (steps t-1, t). LDA training: Supervised clustering, Database ALI, CB (steps t-1, t).]

Bayes rule:

$$P(q_i \mid x) = \frac{p(x \mid q_i)\,P(q_i)}{\sum_j p(x \mid q_j)\,P(q_j)}$$

[Scatter plot of two classes, marked x and o, in the (x1, x2) plane.]
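
As a small worked example of the Bayes rule on this slide, the sketch below turns class-conditional likelihoods p(x|q_j) and priors P(q_j) into posteriors P(q_i|x). The three likelihood and prior values are made up for illustration.

```python
def posteriors(likelihoods, priors):
    """Bayes rule: P(q_i|x) = p(x|q_i) P(q_i) / sum_j p(x|q_j) P(q_j)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # this is p(x)
    return [j / evidence for j in joint]

# Hypothetical values for three classes.
example = posteriors([0.2, 0.05, 0.1], [0.5, 0.3, 0.2])
print(example)
```

The denominator is shared by all classes, so it only rescales the numerators so that the posteriors sum to one.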

SLIDE 9

Discriminative Feature Extraction: QDA

TANDEM features are obtained from log-posteriors. Applying Bayes rule as in the previous slide:

$$\log P(q_i \mid x) = \log p(x \mid q_i) - \log \sum_j p(x \mid q_j)\,P(q_j) + \log P(q_i)$$

With one Gaussian $N(\mu_i, \Sigma_i)$ per class and $p(x) = \sum_j p(x \mid q_j)\,P(q_j)$:

$$\log P(q_i \mid x) \propto -\frac{1}{2}\,(x - \mu_i)'\,\Sigma_i^{-1}\,(x - \mu_i) - \log p(x)$$

  • A quadratic equation is obtained
  • TANDEM can be interpreted as a kind of non-linear feature extraction

Key questions at this point are:
  • Is one Gaussian per cluster enough?
  • How many classes should be used?
  • Is the Gaussianity assumption always a good one?
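
The log-posterior under the one-Gaussian-per-class assumption can be sketched as below. For brevity the sketch assumes diagonal covariances, which is a simplification of the full Σ_i on the slide, and all means, variances and priors are hypothetical. The term inside `log_gaussian_diag` is quadratic in x, and the log-evidence is computed with the usual log-sum-exp trick.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (quadratic in x)."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def qda_log_posteriors(x, means, variances, priors):
    """log P(q_i|x) = log p(x|q_i) + log P(q_i) - log p(x)."""
    scores = [log_gaussian_diag(x, m, v) + math.log(p)
              for m, v, p in zip(means, variances, priors)]
    top = max(scores)
    log_evidence = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_evidence for s in scores]

# Hypothetical two-class model in two dimensions.
MEANS = [[0.0, 0.0], [3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [1.0, 2.0]]
PRIORS = [0.5, 0.5]

qda_logp = qda_log_posteriors([0.2, -0.1], MEANS, VARIANCES, PRIORS)
print(qda_logp)
```

Using several Gaussians per class, or a different number of classes, changes only `log_gaussian_diag` and the lists passed in, which is where the key questions above come into play.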

SLIDE 10

Discriminative Feature Extraction: RBFs

Returning to Bayes rule:

$$P(q_i \mid x) = \frac{p(x \mid q_i)\,P(q_i)}{\sum_j p(x \mid q_j)\,P(q_j)}$$

RBFs are a compromise between connectionist and parametric modelling. We could express it as:

$$P(q_i \mid x) = f\big(w_{ik},\, N(\mu_k, \Sigma_k)\big)$$

where $f$ is the softmax function and $N$ is the Gaussian pdf.

[Network diagram: Gaussian units N(µ_k, Σ_k) feeding output units f_i(.).]

An RBF network is thus obtained.
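
A minimal sketch of such an RBF network follows. The slide only states that f is the softmax and N the Gaussian pdf. Combining the two through a weighted sum of the Gaussian activations is the standard RBF form and is assumed here, as are the diagonal covariances and all numeric values.

```python
import math

def gaussian_diag(x, mean, var):
    """Gaussian pdf with diagonal covariance, used as the hidden unit."""
    log_density = sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
                      for xi, m, v in zip(x, mean, var))
    return math.exp(log_density)

def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rbf_posteriors(x, means, variances, weights):
    """P(q_i|x) = softmax_i( sum_k w_ik * N(x; mu_k, Sigma_k) )."""
    phi = [gaussian_diag(x, m, v) for m, v in zip(means, variances)]
    scores = [sum(w_ik * p for w_ik, p in zip(row, phi)) for row in weights]
    return softmax(scores)

# Hypothetical network: two Gaussian units, two output classes.
MEANS = [[0.0, 0.0], [3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0]]
WEIGHTS = [[5.0, 0.0], [0.0, 5.0]]  # WEIGHTS[i][k] plays the role of w_ik

rbf_post = rbf_posteriors([0.1, 0.0], MEANS, VARIANCES, WEIGHTS)
print(rbf_post)
```

The hidden layer is parametric (means and covariances), while the output weights can be trained discriminatively, which is the compromise the slide refers to.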

SLIDE 11

Discriminative Feature Extraction: results

Results TESTA

[Chart: WER (%) at 0 dB, 10 dB, 20 dB and Clean for ANN/HMM CMF+MSG, ANN/HMM PLP+MSG, DC baseline and LDA_MSG-LDA_CMF; vertical axis from 0 to 35.]

Recognition results on AURORA 2000
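
For readers unfamiliar with the metric on the vertical axis, word error rate is the word-level edit distance between reference and recognised transcription, divided by the number of reference words. The sketch below is a plain dynamic-programming version for a single utterance with a non-empty reference. It is not the SCLITE implementation, and the digit strings are invented.

```python
def word_error_rate(reference, hypothesis):
    """WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,                            # deletion
                            curr[j - 1] + 1,                        # insertion
                            prev[j - 1] + (ref_word != hyp_word)))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)

# One substitution ("two" -> "too") and one deletion ("four") out of four words.
wer = word_error_rate("one two three four", "one too three")
print(wer)  # 50.0
```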

SLIDE 12

Discriminative Feature Extraction: results

Results TESTB

[Chart: WER (%) at 0 dB, 10 dB, 20 dB and Clean for ANN/HMM MFCC+MSG, ANN/HMM PLP+MSG, DC baseline and LDA_MSG-LDA_CMF; vertical axis from 0 to 50.]

Recognition results on AURORA 2000

SLIDE 13

Discriminative Feature Extraction: results

Results TESTC

[Chart: WER (%) at 0 dB, 10 dB, 20 dB and Clean for ANN/HMM MFCC+MSG, ANN/HMM PLP+MSG, DC baseline and LDA_MSG-LDA_CMF; vertical axis from 0 to 45.]

Recognition results on AURORA 2000

SLIDE 14

Discriminative Feature Extraction: results

TESTB-STREET, WORD MLP (MFCC)

MLP Type                     0 dB   10 dB   20 dB   clean   Average   Weights

Double Delta
(3x9x11)=297 + 480 + 127     37.5     4.4     1.8     2.4      11.5   203,520
(3x9x11)=297 + 254 + 127     36.0     7.1     2.9     2.7      12.2   107,696

Delta
(2x9x11)=198 + 480 + 127     35.2     6.9     2.6     2.6      11.8   156,000
(2x9x11)=198 + 254 + 127     36.6     7.5     2.7     2.9      12.4    82,550

No Delta
(17 x 11)=187 + 480 + 127    36.5     7.8     3.1     3.1      12.6   150,720
(17 x 11)=187 + 254 + 127    37.4     8.5     3.0     3.3      13.1    79,756
(13 x 11)=143 + 480 + 127    37.7     8.3     2.7     2.8      12.9   129,600
(13 x 11)=143 + 254 + 127    39.1     8.3     3.1     3.3      13.5    68,580
(9 x 11)=99 + 480 + 127      39.1     8.1     2.9     2.8      13.2   108,480
(9 x 11)=99 + 254 + 127      40.1     8.7     3.1     3.1      13.8    57,404

Reduction of the dimensionality of the Neural Net
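
The Weights column appears to count input-to-hidden plus hidden-to-output connections with no bias terms. This is an inference from the numbers rather than something the slide states, but it reproduces all ten rows. A short check, with `mlp_weight_count` as a hypothetical helper name:

```python
def mlp_weight_count(n_in, n_hidden, n_out):
    """Connections in a two-layer perceptron, ignoring bias terms."""
    return n_in * n_hidden + n_hidden * n_out

# (inputs, hidden units, outputs) -> weight count as listed in the table.
TABLE = {
    (3 * 9 * 11, 480, 127): 203520,
    (3 * 9 * 11, 254, 127): 107696,
    (2 * 9 * 11, 480, 127): 156000,
    (2 * 9 * 11, 254, 127): 82550,
    (17 * 11, 480, 127): 150720,
    (17 * 11, 254, 127): 79756,
    (13 * 11, 480, 127): 129600,
    (13 * 11, 254, 127): 68580,
    (9 * 11, 480, 127): 108480,
    (9 * 11, 254, 127): 57404,
}

mismatches = [shape for shape, listed in TABLE.items()
              if mlp_weight_count(*shape) != listed]
print(mismatches)  # []
```

Read this way, going from the largest to the smallest net cuts the weight count from 203,520 to 57,404 while the average moves from 11.5 to 13.8.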

SLIDE 15

Discriminative Feature Extraction: conclusions

  • TANDEM acoustic modelling can also be performed with discriminant parametric models (QDA)
  • As a compromise between connectionist and parametric modelling, RBFs can be used for TANDEM
  • Concatenating LDA-PLP and LDA-MSG features gives a slight improvement over our baseline LDA system
  • Word-based hybrid ANN/HMMs perform best