Signal Processing
Asked by an6 on January 13, 2021
I’m currently making a program for speech recognition. In the codebook-generation step using the LBG (Linde-Buzo-Gray) algorithm, I’ve read that the splitting factor is generally set to $\varepsilon = 0.01$.
The splitting factor is used to split the centroid of the speech features according to the formulae
\begin{align}
Y_n^+ &= Y_n(1+\varepsilon)\\
Y_n^- &= Y_n(1-\varepsilon)
\end{align}
where $n$ is the index of the given codeword/centroid to be split and $Y_n$ is the codeword.
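For concreteness, here is a minimal sketch of that splitting step in Python/NumPy. The function name `split_codebook`, the argument `eps`, and the array shapes (one row per codeword) are my own assumptions for illustration, not from any particular reference:

```python
import numpy as np

def split_codebook(codebook, eps=0.01):
    """Double the codebook: each codeword Y_n becomes the pair
    Y_n * (1 + eps) and Y_n * (1 - eps)."""
    return np.vstack([codebook * (1 + eps), codebook * (1 - eps)])

# Typical LBG start: one codeword (the global centroid of the
# training features), then split 1 -> 2 -> 4 -> ... codewords.
features = np.random.randn(1000, 12)           # stand-in for e.g. 12-dim MFCC vectors
codebook = features.mean(axis=0, keepdims=True)
codebook = split_codebook(codebook, eps=0.01)  # now 2 codewords
```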
Also, after each split, every speech feature vector is assigned to its nearest codeword and the centroids are updated (essentially k-means clustering of the features). This is repeated until the relative drop in the codebook's distortion falls below $\varepsilon$.
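A hedged sketch of that refinement loop follows: nearest-neighbour assignment, centroid update, and a relative-distortion stopping test. The function name `refine`, the reuse of `eps` as the stopping threshold, and the `max_iter` safeguard are assumptions of this sketch:

```python
import numpy as np

def refine(codebook, features, eps=0.01, max_iter=100):
    """Lloyd iteration: assign every feature vector to its nearest
    codeword, recompute each centroid, and stop once the relative
    drop in average distortion falls below eps."""
    codebook = codebook.copy()
    prev = np.inf
    for _ in range(max_iter):
        # Squared Euclidean distances, shape (num_features, num_codewords).
        d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        distortion = d2[np.arange(len(features)), nearest].mean()
        # Move each codeword to the mean of the vectors assigned to it.
        for n in range(len(codebook)):
            members = features[nearest == n]
            if len(members):
                codebook[n] = members.mean(axis=0)
        if (prev - distortion) / distortion < eps:
            break
        prev = distortion
    return codebook, distortion
```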
Although my program seems to be working fine, I’d like to know why the splitting factor is usually set to 0.01.
Any help is appreciated. This is my first time working with codebooks and vector quantization.