Rapid Computation of I-vector
Longting XU 1,2, Kong Aik LEE 1, Haizhou Li 1 and Zhen Yang 2
1 Institute for Infocomm Research (I2R), Singapore
2 Nanjing University of Posts and Telecommunications, China
Introduction
Odyssey 2016, Bilbao, Spain 2
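Prior: $x \sim \mathcal{N}(0, I)$
Observations: $o_1, o_2, \ldots, o_T$
Posterior: $x \mid o_1, \ldots, o_T \sim \mathcal{N}(\phi, L^{-1})$

Posterior mean $\phi$ and covariance $L^{-1}$:
$\phi = L^{-1} T^{\top} \Sigma^{-1} F$
$L^{-1} = (I + T^{\top} \Sigma^{-1} N T)^{-1}$

The same expressions without $\Sigma$:
$\phi = L^{-1} T^{\top} F$
$L^{-1} = (I + T^{\top} N T)^{-1}$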
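The posterior expressions above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. It assumes the statistics are already centered and whitened (so $\Sigma$ does not appear), that the rows of T are grouped per mixture component, and that N is block-diagonal with the occupancy N_c repeated over the feature dimension. The function name `ivector_posterior` and the argument `feat_dim` are hypothetical.

```python
import numpy as np

def ivector_posterior(T, N_c, F, feat_dim):
    """Posterior mean and covariance of x under the prior N(0, I).

    T        : (C * feat_dim, M) total variability matrix
    N_c      : (C,) zero-order statistics (occupancy per component)
    F        : (C * feat_dim,) first-order statistics (assumed centered, whitened)
    feat_dim : feature dimension per component
    """
    M = T.shape[1]
    n = np.repeat(N_c, feat_dim)              # diagonal of N
    L = np.eye(M) + T.T @ (n[:, None] * T)    # L = I + T^T N T
    cov = np.linalg.inv(L)                    # posterior covariance L^{-1}
    mean = cov @ (T.T @ F)                    # posterior mean L^{-1} T^T F
    return mean, cov

# Toy usage with random data (sizes are illustrative only).
rng = np.random.default_rng(0)
C, feat_dim, M = 4, 3, 5
T = rng.standard_normal((C * feat_dim, M))
N_c = rng.uniform(0.5, 5.0, size=C)
F = rng.standard_normal(C * feat_dim)
mean, cov = ivector_posterior(T, N_c, F, feat_dim)
print(mean.shape, cov.shape)
```

Forming and inverting L is the step whose cost the rest of the talk tries to avoid.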
– Of particular interest for i-vector extraction on hand-held devices and in large-scale cloud-based applications
– The number of senone posteriors is approaching 10k and beyond [Sadjadi et al., 2016]
– The T matrix is trained offline as a one-off, so its computational load is not generally seen as a bottleneck
– Simplifying the posterior covariance estimation
– Directly estimate the posterior mean without evaluating the posterior covariance
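$T = U S V^{\top} = [U_1 \; U_2] S V^{\top}$

$(I + N T (T^{\top} T)^{-1} T^{\top})^{-1} = (I + N U_1 U_1^{\top})^{-1} = (I + N (I - U_2 U_2^{\top}))^{-1} = (I + N - N U_2 U_2^{\top})^{-1}$

With $A = I + N$:
$(A - N U_2 U_2^{\top})^{-1} = A^{-1} + A^{-1} N (I - U_2 U_2^{\top} A^{-1} N)^{-1} U_2 U_2^{\top} A^{-1}$

$(I + N T (T^{\top} T)^{-1} T^{\top})^{-1} = (I + N - N U_2 U_2^{\top})^{-1} = (A - N U_2 U_2^{\top})^{-1}$

$(I + N T (T^{\top} T)^{-1} T^{\top})^{-1} = A^{-1} + A^{-1} N U_2 U_2^{\top} (I - A^{-1} N U_2 U_2^{\top})^{-1} A^{-1}$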
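The SVD and matrix-inversion steps on this slide can be checked numerically. The sketch below is an illustration on random data, under the reading that U_1 spans the column space of T, U_2 spans its orthogonal complement, and A = I + N. The function name `check_identities` and the toy sizes are hypothetical.

```python
import numpy as np

def check_identities(seed=0, C=4, feat_dim=3, M=5):
    """Return booleans for: T (T^T T)^{-1} T^T = U1 U1^T, T^T U2 = 0,
    and the two expansions of (A - N U2 U2^T)^{-1}."""
    rng = np.random.default_rng(seed)
    D = C * feat_dim
    T = rng.standard_normal((D, M))
    N = np.diag(np.repeat(rng.uniform(0.5, 5.0, size=C), feat_dim))
    I = np.eye(D)

    U, _, _ = np.linalg.svd(T, full_matrices=True)
    U1, U2 = U[:, :M], U[:, M:]               # U = [U1 U2]

    P = T @ np.linalg.inv(T.T @ T) @ T.T      # projector onto span(T)
    A_inv = np.linalg.inv(I + N)              # A = I + N
    Q = U2 @ U2.T

    lhs = np.linalg.inv(I + N @ P)
    rhs_a = A_inv + A_inv @ N @ np.linalg.inv(I - Q @ A_inv @ N) @ Q @ A_inv
    rhs_b = A_inv + A_inv @ N @ Q @ np.linalg.inv(I - A_inv @ N @ Q) @ A_inv

    return (
        np.allclose(P, U1 @ U1.T),
        np.allclose(T.T @ U2, 0.0),
        np.allclose(lhs, rhs_a),
        np.allclose(lhs, rhs_b),
    )

print(check_identities())
```

The second check, that T^T U_2 vanishes, is what later allows the correction term to drop out once it is multiplied from the left by T^T.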
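Uniform occupancy assumption, with $A = I + N$:
$A^{-1} N = (I + N)^{-1} N \approx \alpha I$, for $0 \le \alpha \le 1$, i.e. $\frac{N_c}{1 + N_c} \approx \alpha$ for all $c$

$(I + N T (T^{\top} T)^{-1} T^{\top})^{-1} \approx A^{-1} + \alpha U_2 U_2^{\top} (I - A^{-1} N U_2 U_2^{\top})^{-1} A^{-1}$

$(T^{\top} T)^{-1} T^{\top} (I + N T (T^{\top} T)^{-1} T^{\top})^{-1} \approx (T^{\top} T)^{-1} T^{\top} A^{-1}$, since $T^{\top} U_2 = 0$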
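A small numerical sketch may help show what the uniform occupancy assumption buys. It compares the expression (T^T T)^{-1} T^T (I + N T (T^T T)^{-1} T^T)^{-1} F with its simplified form (T^T T)^{-1} T^T (I + N)^{-1} F. The two agree when every component has the same occupancy and differ otherwise. Function names, `feat_dim`, and the toy data are hypothetical, and F is assumed to be centered and whitened.

```python
import numpy as np

def exact_projection(T, N_c, F, feat_dim):
    """(T^T T)^{-1} T^T (I + N P)^{-1} F with P = T (T^T T)^{-1} T^T."""
    D = T.shape[0]
    n = np.repeat(N_c, feat_dim)              # diagonal of N
    G = np.linalg.inv(T.T @ T)
    P = T @ G @ T.T
    inner = np.eye(D) + n[:, None] * P        # I + N P (N is diagonal)
    return G @ T.T @ np.linalg.solve(inner, F)

def fast_projection(T, N_c, F, feat_dim):
    """(T^T T)^{-1} T^T (I + N)^{-1} F; no CF x CF inverse is needed."""
    n = np.repeat(N_c, feat_dim)
    return np.linalg.solve(T.T @ T, T.T @ (F / (1.0 + n)))

rng = np.random.default_rng(0)
C, feat_dim, M = 4, 3, 3
T = rng.standard_normal((C * feat_dim, M))
F = rng.standard_normal(C * feat_dim)

uniform = np.full(C, 2.0)                     # same occupancy everywhere
skewed = np.array([1.0, 10.0, 3.0, 0.5])      # non-uniform occupancy

gap_uniform = np.linalg.norm(
    exact_projection(T, uniform, F, feat_dim) - fast_projection(T, uniform, F, feat_dim))
gap_skewed = np.linalg.norm(
    exact_projection(T, skewed, F, feat_dim) - fast_projection(T, skewed, F, feat_dim))
print(gap_uniform, gap_skewed)
```

With real statistics the occupancies are never exactly uniform, so the fast form is an approximation whose error depends on how far the N_c spread.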
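Method              Complexity             Memory cost      Time ratio
Baseline (slow)     O(CFM^2 + M^3)         O(CFM)           106.44
Baseline (fast)     O(CFM + CM^2 + M^3)    O(CFM + CM^2)    11.99
Proposed (exact)    O(CFM + CM^2 + M^3)    O(CFM + CM^2)    12.65
Proposed (fast)     O(CFM)                 O(CFM)           1

Posterior mean $\phi$ for each method:
Baseline (slow):    $\phi = (I + \sum_c N_c T_c^{\top} T_c)^{-1} T^{\top} F$
Baseline (fast):    $\phi = (I + \sum_c N_c A_c)^{-1} T^{\top} F$
Proposed (exact):   $\phi = (\sum_c (1 + N_c) T_c^{\top} T_c)^{-1} T^{\top} F$
Proposed (fast):    $\phi = (T^{\top} T)^{-1} T^{\top} (I + N)^{-1} F$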
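The four extraction variants in the table can be sketched side by side. This is an illustration under stated assumptions, not the authors' code. It assumes the four expressions correspond, in order, to the four table rows, that A_c stands for the precomputed matrix T_c^T T_c, that T_c is the block of rows of T belonging to component c, and that F is centered and whitened. All function names are hypothetical.

```python
import numpy as np

def component_grams(T, C):
    """Stack of T_c^T T_c, shape (C, M, M); can be precomputed offline."""
    Tc = T.reshape(C, -1, T.shape[1])         # rows grouped per component
    return np.einsum("cfm,cfn->cmn", Tc, Tc)

def baseline_slow(T, N_c, F):
    """(I + sum_c N_c T_c^T T_c)^{-1} T^T F, products formed per utterance."""
    C, M = len(N_c), T.shape[1]
    Tc = T.reshape(C, -1, M)
    L = np.eye(M)
    for c in range(C):
        L = L + N_c[c] * (Tc[c].T @ Tc[c])
    return np.linalg.solve(L, T.T @ F)

def baseline_fast(T, grams, N_c, F):
    """Same as baseline_slow, but using the precomputed T_c^T T_c."""
    L = np.eye(T.shape[1]) + np.tensordot(N_c, grams, axes=1)
    return np.linalg.solve(L, T.T @ F)

def proposed_exact(T, grams, N_c, F):
    """(sum_c (1 + N_c) T_c^T T_c)^{-1} T^T F."""
    L = np.tensordot(1.0 + N_c, grams, axes=1)
    return np.linalg.solve(L, T.T @ F)

def proposed_fast(T_pinv, N_c, F):
    """(T^T T)^{-1} T^T (I + N)^{-1} F with T_pinv = (T^T T)^{-1} T^T precomputed."""
    feat_dim = F.shape[0] // len(N_c)
    return T_pinv @ (F / (1.0 + np.repeat(N_c, feat_dim)))

rng = np.random.default_rng(0)
C, feat_dim, M = 4, 3, 3
T = rng.standard_normal((C * feat_dim, M))
F = rng.standard_normal(C * feat_dim)
N_c = rng.uniform(0.5, 5.0, size=C)

grams = component_grams(T, C)                 # offline
T_pinv = np.linalg.solve(T.T @ T, T.T)        # offline
for f in (baseline_slow(T, N_c, F), baseline_fast(T, grams, N_c, F),
          proposed_exact(T, grams, N_c, F), proposed_fast(T_pinv, N_c, F)):
    print(np.round(f, 3))
```

Only `proposed_fast` avoids building and solving an M x M system per utterance, which is consistent with the O(CFM) row of the table.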
– Gender-dependent, with C = 512 mixtures
– 57-dimensional MFCC
– SWB, SRE'04, 05, 06

– M = 400
– Trained on the same dataset as the UBM

– LDA to 300 dimensions, followed by length normalization
– 200 speaker factors
– Full residual covariance for channel modeling
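The LDA and length-normalization step in the back-end can be sketched as below. The projection matrix here is a random stand-in, since the trained LDA transform is not part of these slides. The function name `lda_and_length_norm` is hypothetical, and only the dimensions (400 in, 300 out) come from the setup above.

```python
import numpy as np

def lda_and_length_norm(ivectors, lda_matrix):
    """Project i-vectors with a given LDA matrix, then scale each to unit length."""
    projected = ivectors @ lda_matrix                      # (n, 400) -> (n, 300)
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / norms

rng = np.random.default_rng(0)
ivectors = rng.standard_normal((5, 400))       # five toy 400-dim i-vectors
lda_matrix = rng.standard_normal((400, 300))   # stand-in for a trained LDA transform
normalized = lda_and_length_norm(ivectors, lda_matrix)
print(normalized.shape)
```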
[Results tables: EER and MinDCF10]
– Subspace-orthonormalizing prior
– Uniform-occupancy assumption