Combine K-nearest neighbor with categorical embedding

Data Science Asked by Woden on January 22, 2021

I’ve tried a few ways to do my multi-class classification. For the categorical data, I used the embedding technique with TensorFlow, which places similar categories closer together in the embedding space. This approach gives me roughly 25%–30% accuracy, which is low.

For the numerical data, I used the KNN algorithm, which gave me roughly 40% accuracy. I am wondering whether there is any way to combine these two techniques to achieve a better result. For example, perhaps the class probabilities produced by KNN could form a layer that is concatenated with the embedding layer, with Dense layers trained on top of the combined features.

I’ve searched on the Internet. It’s not ensembling, which averages the predictions of several models; it’s more like concatenating the layers together.
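For what it’s worth, the concatenation idea can be prototyped outside of Keras. The sketch below is a minimal, hypothetical illustration using scikit-learn on synthetic data: KNN class probabilities from the numerical features are concatenated with a categorical representation (one-hot vectors stand in for a learned Embedding layer), and a downstream classifier is trained on the combined features. All names and data here are made up for the example; in a real Keras model the last step would be a `Concatenate()` layer followed by Dense layers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in data: 6 numerical features, 1 categorical column with 5 levels.
X_num, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                               n_classes=3, random_state=0)
X_cat = rng.integers(0, 5, size=(400, 1))

Xn_tr, Xn_te, Xc_tr, Xc_te, y_tr, y_te = train_test_split(
    X_num, X_cat, y, test_size=0.25, random_state=0)

# KNN on the numerical features; keep the class probabilities.
knn = KNeighborsClassifier(n_neighbors=7).fit(Xn_tr, y_tr)
P_tr, P_te = knn.predict_proba(Xn_tr), knn.predict_proba(Xn_te)

# One-hot vectors stand in for the Embedding output here;
# in Keras this would be the flattened Embedding layer.
E_tr = np.eye(5)[Xc_tr.ravel()]
E_te = np.eye(5)[Xc_te.ravel()]

# Concatenate both feature blocks and train a downstream classifier
# (the Keras analogue: Concatenate() followed by Dense layers).
clf = LogisticRegression(max_iter=1000).fit(np.hstack([P_tr, E_tr]), y_tr)
acc = clf.score(np.hstack([P_te, E_te]), y_te)
print(f"combined-features accuracy: {acc:.3f}")
```

Whether the concatenation helps depends on how complementary the two feature blocks are; the same wiring carries over to a multi-input Keras model.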

Any help is highly appreciated.

One Answer

If I understand correctly, you label-encode the categorical variables and feed them to a neural network. If this is the case, you can try the following:

  1. Add the numerical variables.
  2. Create and train an autoencoder.
  3. Use the encoder part to map the input to a vector space and perform k-nearest neighbors on it.

You can read about the second method in https://towardsdatascience.com/detecting-credit-card-fraud-with-autoencoders-in-python-98391cace8a3 It uses a dataset with only numerical variables, but since you label-encode your categorical variables, it applies to your case too.
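The three steps above can be sketched in a few lines. This is a toy illustration, not the article's implementation: a tiny one-hidden-layer autoencoder is trained by hand with full-batch gradient descent on synthetic data (in practice you would build the autoencoder in Keras and reuse its encoder half), and KNN then classifies in the learned latent space. The data, dimensions, and labels are all made up for the example.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy data: 8 columns standing in for label-encoded categoricals
# plus numericals, already stacked into one matrix.
X = rng.normal(size=(300, 8))
y = (X[:, 0] + X[:, 3] > 0).astype(int)  # synthetic labels

# Tiny autoencoder (8 -> 3 -> 8): tanh encoder, linear decoder,
# trained by full-batch gradient descent on the reconstruction error.
d_in, d_code = X.shape[1], 3
W1 = rng.normal(scale=0.1, size=(d_in, d_code))
W2 = rng.normal(scale=0.1, size=(d_code, d_in))
lr = 0.05
for _ in range(500):
    H = np.tanh(X @ W1)            # encoder output (the latent code)
    R = H @ W2                     # decoder reconstruction
    err = R - X
    gW2 = H.T @ err / len(X)       # gradient w.r.t. decoder weights
    gH = err @ W2.T * (1 - H**2)   # backprop through tanh
    gW1 = X.T @ gH / len(X)        # gradient w.r.t. encoder weights
    W1 -= lr * gW1
    W2 -= lr * gW2

# Map inputs to the 3-d latent space and run KNN there.
codes = np.tanh(X @ W1)
knn = KNeighborsClassifier(n_neighbors=5).fit(codes[:200], y[:200])
acc = knn.score(codes[200:], y[200:])
print(f"KNN accuracy in latent space: {acc:.3f}")
```

The key point is the split: the encoder is kept as a fixed feature map once training finishes, and the distance-based classifier operates on the compressed codes rather than the raw mixed-type input.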

Correct answer by Dimitrios Panagopoulos on January 22, 2021
