Data Science, asked on September 2, 2021
I was taking an online ML course, and the lecturer gave the following rule of thumb for choosing the number of dimensions when embedding categorical data:

the embedding vector dimension should be the 4th root of the number of categories
The lecturer worked for Google, and when I searched the internet for this, the only source I found was a Google blog post that mentions it briefly (google blog link). I'm guessing it's something they came up with at Google, but I was wondering whether anyone has seen it in a research paper.
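For concreteness, here is a minimal Python sketch of the rule as stated above; the function name `embedding_dim` and the example category counts are my own illustration, not from the course or the blog post:

```python
def embedding_dim(num_categories: int) -> int:
    """Rule of thumb: embedding dimension ~ 4th root of the number of categories."""
    return round(num_categories ** 0.25)

# A feature with 10,000 categories gets a 10-dimensional embedding;
# one with 100 categories gets round(100 ** 0.25) = 3 dimensions.
print(embedding_dim(10_000))  # 10
print(embedding_dim(100))     # 3
```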
Google has published word embeddings with an embedding dimension of 300. Inverting the rule you quoted (categories $\approx$ dimension$^4$), the vocabulary should contain roughly $300^4 = 8.1 \times 10^9$ distinct words. If Google counts n-grams rather than just single words, that order of magnitude seems plausible.
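As a quick check of that arithmetic (plain Python, assuming nothing beyond the rule itself):

```python
# Inverting the rule: a 300-dimensional embedding implies roughly
# 300 ** 4 categories (distinct words) in the vocabulary.
vocab_size = 300 ** 4
print(f"{vocab_size:,} = {vocab_size:.1e}")  # 8,100,000,000 = 8.1e+09
```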
Answered by user1288043 on September 2, 2021