Combine K-nearest neighbor with categorical embedding

Data Science Asked by Woden on January 22, 2021

I’ve tried a few ways to do my multi-class classification. For the categorical data, I used the embedding technique with TensorFlow, which places similar categories closer together in the embedding space. This approach gives me roughly 25%–30% accuracy, which is low.

For the numerical data, I used the KNN algorithm, which gave me roughly 40% accuracy. I am wondering whether there is any way to combine these two techniques to achieve a better result. For example, perhaps the class probabilities produced by KNN could form a layer that is concatenated with the embedding layer, with Dense layers trained on top of the combined features.

I’ve searched on the Internet. It’s not ensembling, which averages the predictions of several models; it’s more like concatenating the layers together.
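For what it’s worth, the concatenation idea can be prototyped outside of Keras. The sketch below is a minimal, hypothetical illustration using scikit-learn on synthetic data: KNN class probabilities from the numerical features are concatenated with a categorical representation (one-hot vectors stand in for a learned Embedding layer), and a downstream classifier is trained on the combined features. All names and data here are made up for the example; in a real Keras model the last step would be a `Concatenate()` layer followed by Dense layers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in data: 6 numerical features, 1 categorical column with 5 levels.
X_num, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                               n_classes=3, random_state=0)
X_cat = rng.integers(0, 5, size=(400, 1))

Xn_tr, Xn_te, Xc_tr, Xc_te, y_tr, y_te = train_test_split(
    X_num, X_cat, y, test_size=0.25, random_state=0)

# KNN on the numerical features; keep the class probabilities.
knn = KNeighborsClassifier(n_neighbors=7).fit(Xn_tr, y_tr)
P_tr, P_te = knn.predict_proba(Xn_tr), knn.predict_proba(Xn_te)

# One-hot vectors stand in for the Embedding output here;
# in Keras this would be the flattened Embedding layer.
E_tr = np.eye(5)[Xc_tr.ravel()]
E_te = np.eye(5)[Xc_te.ravel()]

# Concatenate both feature blocks and train a downstream classifier
# (the Keras analogue: Concatenate() followed by Dense layers).
clf = LogisticRegression(max_iter=1000).fit(np.hstack([P_tr, E_tr]), y_tr)
acc = clf.score(np.hstack([P_te, E_te]), y_te)
print(f"combined-features accuracy: {acc:.3f}")
```

Whether the concatenation helps depends on how complementary the two feature blocks are; the same wiring carries over to a multi-input Keras model.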

Any help is highly appreciated.

One Answer

If I understand correctly, you label-encode the categorical variables and feed them to a neural network. If this is the case, you can try the following:

  1. Add the numerical variables.
  2. Create and train an autoencoder.
  3. Use the encoder part to map the input to a vector space and perform k-nearest neighbors on it.

You can read about the second method in https://towardsdatascience.com/detecting-credit-card-fraud-with-autoencoders-in-python-98391cace8a3 It uses a dataset with only numerical variables, but since you label-encode your categorical variables, it applies to your case too.
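The three steps above can be sketched in a few lines. This is a toy illustration, not the article's implementation: a tiny one-hidden-layer autoencoder is trained by hand with full-batch gradient descent on synthetic data (in practice you would build the autoencoder in Keras and reuse its encoder half), and KNN then classifies in the learned latent space. The data, dimensions, and labels are all made up for the example.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy data: 8 columns standing in for label-encoded categoricals
# plus numericals, already stacked into one matrix.
X = rng.normal(size=(300, 8))
y = (X[:, 0] + X[:, 3] > 0).astype(int)  # synthetic labels

# Tiny autoencoder (8 -> 3 -> 8): tanh encoder, linear decoder,
# trained by full-batch gradient descent on the reconstruction error.
d_in, d_code = X.shape[1], 3
W1 = rng.normal(scale=0.1, size=(d_in, d_code))
W2 = rng.normal(scale=0.1, size=(d_code, d_in))
lr = 0.05
for _ in range(500):
    H = np.tanh(X @ W1)            # encoder output (the latent code)
    R = H @ W2                     # decoder reconstruction
    err = R - X
    gW2 = H.T @ err / len(X)       # gradient w.r.t. decoder weights
    gH = err @ W2.T * (1 - H**2)   # backprop through tanh
    gW1 = X.T @ gH / len(X)        # gradient w.r.t. encoder weights
    W1 -= lr * gW1
    W2 -= lr * gW2

# Map inputs to the 3-d latent space and run KNN there.
codes = np.tanh(X @ W1)
knn = KNeighborsClassifier(n_neighbors=5).fit(codes[:200], y[:200])
acc = knn.score(codes[200:], y[200:])
print(f"KNN accuracy in latent space: {acc:.3f}")
```

The key point is the split: the encoder is kept as a fixed feature map once training finishes, and the distance-based classifier operates on the compressed codes rather than the raw mixed-type input.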

Correct answer by Dimitrios Panagopoulos on January 22, 2021
