Kernel combination for SVM speaker verification

Abstract

We present a new approach for constructing the kernels used to build support vector machines for speaker verification. The idea is to construct new kernels by taking linear combination of many kernels such as the GLDS and GMM supervector kernels. In this new kernel combination, the combination weights are speaker dependent rather than universal weights on score level fusion and there is no need for extra-data to estimate them. An experiment on the NIST 2006 speaker recognition evaluation dataset (all trial) was done using three different kernel functions (GLDS kernel, linear and Gaussian GMM supervector kernels). We compared our kernel combination to the optimal linear score fusion obtained using logistic regression. This optimal score fusion was trained on the same test data. We had an equal error rate of $\simeq 5,9%$ using the kernel combination technique which is better than the optimal score fusion system ($\simeq 6,0%$).