
SLIDE 1

Speaker line-up calibration of the i-vector based speaker recognition system for forensic application

The International Association of Forensic Phonetics and Acoustics 2011 Annual Conference 24-28 July, 2011; Vienna, Austria

M. I. Mandasari, D. van Leeuwen and M. McLaren

Centre for Language and Speech Technology Radboud University Nijmegen The Netherlands

SLIDE 2

Outline

Why Likelihood Ratio (LR) calibration?
LR calibration methods
▫ Linear calibration
▫ Line-up calibration (2011)
I-vector based automatic speaker recognition system for forensic application
Experiment and results

SLIDE 3

Likelihood Ratio (LR)

In forensic evidence reporting

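▫ Scores are represented as LRs
▫ The LR is used by the fact finder to compute the posterior odds

$$\underbrace{\frac{P(H_p \mid E)}{P(H_d \mid E)}}_{\text{posterior odds}} = \underbrace{\frac{P(E \mid H_p)}{P(E \mid H_d)}}_{\text{LR}} \times \underbrace{\frac{P(H_p)}{P(H_d)}}_{\text{prior odds}}$$

where $E$ is the trace, $H_p$ the prosecution hypothesis and $H_d$ the defense hypothesis.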
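
The fact finder's use of the LR can be illustrated numerically. This is a minimal sketch, assuming the standard odds form of Bayes' rule (posterior odds = LR × prior odds). The function names and the example numbers are illustrative and do not come from the slides.

```python
def posterior_odds(lr, prior_odds):
    """Update prior odds with a likelihood ratio (odds form of Bayes' rule)."""
    return lr * prior_odds


def odds_to_probability(odds):
    """Convert odds in favour of the prosecution hypothesis to a probability."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical case: the system reports LR = 100 and the fact finder
    # starts from prior odds of 1 to 100 (0.01).
    odds = posterior_odds(lr=100.0, prior_odds=0.01)
    print("posterior odds:", odds)
    print("posterior probability:", odds_to_probability(odds))
```

The forensic system only supplies the LR; the prior odds, and therefore the posterior odds, remain the responsibility of the fact finder.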

SLIDE 4

Why is LR calibration important?

A study by Rodriguez and Ramos (2007) [ref. 5]: “LR calculated from the un-calibrated system was often misleading, while the calibrated system produced more reliable LR”

[Diagram: Automatic Speaker Recognition System → Calibration → LR; a well-calibrated system is good for forensics]

SLIDE 5

LR calibration method

▫ Linear calibration: 2007 [ref. 7]
▫ Line-up calibration: 2011 [ref. 6]

SLIDE 6

Linear calibration

Scores → Linear transformation → LR
Calibration:
▫ Optimize the linear transformation
▫ Using a set of development scores
▫ to minimize …

Cllr provides an estimate of the calibration error over all priors.

Miscalibration cost:

▫ Low miscalibration cost indicates that the system produces more reliable LRs.
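
The linear calibration idea can be sketched as follows. This is an illustration, not the authors' implementation: it assumes the calibrated log-LR is `a * score + b`, that Cllr is the usual log-likelihood-ratio cost from [ref. 7], and that scores are of order 1 (as cosine scores are). Plain gradient descent stands in for whatever optimizer the authors used, and the function names, step size and iteration count are hypothetical.

```python
import numpy as np


def cllr(tar_llr, non_llr):
    """Cllr in bits, given natural-log LRs of target and non-target trials."""
    tar = np.asarray(tar_llr, dtype=float)
    non = np.asarray(non_llr, dtype=float)
    # logaddexp(0, x) evaluates log(1 + exp(x)) in a numerically stable way.
    cost_tar = np.mean(np.logaddexp(0.0, -tar))
    cost_non = np.mean(np.logaddexp(0.0, non))
    return 0.5 * (cost_tar + cost_non) / np.log(2.0)


def fit_linear_calibration(tar_scores, non_scores, steps=2000, step_size=0.5):
    """Fit llr = a * score + b on development scores by gradient descent on Cllr."""
    tar = np.asarray(tar_scores, dtype=float)
    non = np.asarray(non_scores, dtype=float)
    a, b = 1.0, 0.0
    for _ in range(steps):
        # Per-trial derivatives of the two log-loss terms w.r.t. the log-LR.
        g_tar = -1.0 / (1.0 + np.exp(a * tar + b))
        g_non = 1.0 / (1.0 + np.exp(-(a * non + b)))
        grad_a = 0.5 * (np.mean(g_tar * tar) + np.mean(g_non * non))
        grad_b = 0.5 * (np.mean(g_tar) + np.mean(g_non))
        a -= step_size * grad_a
        b -= step_size * grad_b
    return a, b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic development scores, for illustration only.
    tar = rng.normal(0.5, 0.15, 500)
    non = rng.normal(0.0, 0.15, 500)
    a, b = fit_linear_calibration(tar, non)
    print("Cllr before:", cllr(tar, non))
    print("Cllr after: ", cllr(a * tar + b, a * non + b))
```

A system that always outputs LR = 1 has Cllr = 1 bit, so values well below 1 indicate informative LRs.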

SLIDE 7

Line-up LR calibration method

Motivated by the witness line-up scenario in forensic tasks.

[Figure: a witness facing a line-up made up of the suspect and foils]

SLIDE 8

Line-up LR calibration method

Each speaker's score is “lined up” with the scores of all foil speakers
Determining the rank within the line-up set
Computing the calibrated LR value!
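
The ranking step can be sketched as below. The slides do not give the formula that maps a rank to a calibrated LR (it is defined in [ref. 6]), so this illustration stops at the rank. The function name and the convention that rank 1 means "scores above every foil" are assumptions.

```python
import numpy as np


def lineup_rank(suspect_score, foil_scores):
    """Rank of the suspect's score within a line-up of foil scores.

    Rank 1 means the suspect scores higher than every foil; each foil that
    scores strictly higher than the suspect pushes the rank down by one.
    """
    foils = np.asarray(foil_scores, dtype=float)
    return 1 + int(np.sum(foils > suspect_score))


if __name__ == "__main__":
    # Hypothetical scores of one trace against the suspect and three foils.
    print(lineup_rank(0.8, [0.1, 0.9, 0.3]))
```

Because the rank depends only on how the score compares with the foil scores, it is unaffected by any monotonic rescaling of the raw scores.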

SLIDE 9

I-vector based speaker recognition

An i-vector is a speech representation in a low-dimensional total variability space [Dehak et al., 2009].

Pipeline: Speech A, Speech B → i-vectors in the total variability space (400D) → Linear Discriminant Analysis (LDA) projection (200D) → Within-Class Covariance Normalization (WCCN) → cosine kernel scoring → scores → LR calibration → LRs

Cosine kernel score: $\mathrm{score}(w_A, w_B) = \dfrac{w_A \cdot w_B}{\lVert w_A \rVert \, \lVert w_B \rVert}$
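
Cosine kernel scoring of two i-vectors can be sketched as follows, assuming the vectors have already been through the LDA and WCCN steps. The function name and the toy 3-dimensional vectors are illustrative; the system on this slide works with 200-dimensional projected vectors.

```python
import numpy as np


def cosine_score(w_a, w_b):
    """Cosine similarity between two (already projected) i-vectors."""
    w_a = np.asarray(w_a, dtype=float)
    w_b = np.asarray(w_b, dtype=float)
    norm_product = np.linalg.norm(w_a) * np.linalg.norm(w_b)
    if norm_product == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return float(np.dot(w_a, w_b) / norm_product)


if __name__ == "__main__":
    w_a = [1.0, 2.0, 3.0]
    w_b = [2.0, 4.0, 6.0]
    # The score is symmetric: swapping the two speech samples changes nothing.
    print(cosine_score(w_a, w_b), cosine_score(w_b, w_a))
```

The score depends only on the angle between the two vectors, so swapping the roles of speech A and speech B gives the same value.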

SLIDE 10

I-vector system for forensics [ref. 4]

The i-vector speaker recognition system …
▫ performs well in both classification and calibration, and
▫ offers a good separation of target and non-target scores

The symmetrical behavior of the i-vector system is of particular interest in forensic evidence reporting, where long speech samples can be collected from a suspected speaker in an interview scenario while the trace may be of uncontrolled duration.

SLIDE 11

i-vector classification performance

Symmetrical!

SLIDE 12

Experiment setup

i-vector based automatic speaker recognition
Dataset:
▫ NIST SRE 2010 (halved into two datasets with disjoint speakers)
▫ Durations of 5, 10, 20 and 40 sec., and full utterances
Linear vs. line-up calibration method
Performance parameter
▫ Classification: EER (Equal Error Rate)
▫ Calibration: miscalibration
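
The classification measure can be sketched as below. This is an approximate EER that only tries the observed scores as thresholds and takes the point where the miss rate and the false-alarm rate are closest. The function name is hypothetical and evaluation toolkits typically interpolate more carefully.

```python
import numpy as np


def equal_error_rate(tar_scores, non_scores):
    """Approximate EER: error rate where miss and false-alarm rates meet."""
    tar = np.sort(np.asarray(tar_scores, dtype=float))
    non = np.sort(np.asarray(non_scores, dtype=float))
    thresholds = np.unique(np.concatenate([tar, non]))
    # A trial is accepted when its score is at or above the threshold.
    # Miss rate: fraction of target scores strictly below the threshold.
    p_miss = np.searchsorted(tar, thresholds, side="left") / tar.size
    # False-alarm rate: fraction of non-target scores at or above the threshold.
    p_fa = 1.0 - np.searchsorted(non, thresholds, side="left") / non.size
    i = np.argmin(np.abs(p_miss - p_fa))
    return float(0.5 * (p_miss[i] + p_fa[i]))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic scores, for illustration only.
    print(equal_error_rate(rng.normal(0.5, 0.15, 1000), rng.normal(0.0, 0.15, 1000)))
```

Because the EER depends only on the ordering of the scores, a monotonic linear calibration cannot change it.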

SLIDE 13

Classification Performance

[Figure: classification performance, female and male panels]

SLIDE 14

Classification Performance

[Figure: classification performance, male and female panels]

SLIDE 15

Classification Performance

Line-up calibration still offers symmetrical behavior,
EER with line-up calibration is generally better than with linear calibration, and
The EER improvement is greater in short-duration cases.

To conclude…

▫ Line-up calibration generally gives better classification performance than the linear calibration method.

SLIDE 16

Calibration Performance

[Figure: calibration performance, female and male panels]

SLIDE 17

Calibration Performance

[Figure: calibration performance, female and male panels]

SLIDE 18

Calibration Performance

In both the male and female cases, the miscalibration parameter of the linear calibration method is generally better than that of the line-up calibration method; however,
The difference in calibration performance, measured by Cllr, is small (not more than 0.01)

To conclude

▫ The calibration performance of the line-up method is not better than that of the linear method, but the gap is small.

SLIDE 19

Our Findings

▫ EER is better with line-up calibration, which suggests that this calibration method acts more like score normalization* in the system.

Performance                     Gender    Linear vs. line-up calibration
Classification (EER, %)         Male      .3822
                                Female    .3496
Calibration (miscalibration)    Male      .0052
                                Female    .0104

SLIDE 20

References

1. Butcher, A. R. (2002). Forensic Phonetics: Issues in speaker identification evidence. Proceedings of the Inaugural International Conference of the Institute of Forensic Studies, Italy, pp. 3-5.
2. Brümmer, N. (2006). Focal II: Toolkit for calibration of multi-class recognition scores. Software available at http://www.dsp.sun.ac.za/~nbrummer/focal/index.htm.
3. Dehak, N., Dehak, R., Glass, J., Reynolds, D. and Kenny, P. (2010). Cosine similarity scoring without score normalization techniques. Proceedings of Odyssey.
4. Mandasari, M. I., McLaren, M. and van Leeuwen, D. (2011). Evaluation of i-vector Speaker Recognition Systems for Forensic Application. Submitted to the 12th Annual Conference of the International Speech Communication Association, Florence, Italy.
5. Rodriguez, J. G. and Ramos, D. (2007). Forensic automatic speaker classification in the “coming paradigm shift”. Speaker Classification, pp. 205-217. Springer.
6. van Leeuwen, D. and Brümmer, N. (2011). A speaker line-up for the likelihood ratio. Submitted to the 12th Annual Conference of the International Speech Communication Association, Florence, Italy.
7. van Leeuwen, D. and Brümmer, N. (2007). An introduction to application-independent evaluation of speaker recognition systems. Speaker Classification, pp. 330-353. Springer.

SLIDE 21

Vienna, 25 July 2011