
Use Categorical features in BERT model

Data Science Asked by saran on April 30, 2021

I am trying to fine-tune a BERT-base model for binary text classification using multiple features: 3 text features and 4 categorical features. The text features are more than 500 tokens long, and the four categorical features have binary values 0/1. For each categorical feature I created a corresponding derived text feature. For example, 'Impact_Code' has values 0/1, so I derived a new feature 'Impact_Code_Derived' with the value 'Impact Code Not Available' for 0 and 'Impact Code Available' for 1. I derived new features like this for all 4 categorical features, and I use the text features as they are.
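For concreteness, a minimal sketch of this derivation step, assuming the features live in a pandas DataFrame (the sample values are made up):

```python
import pandas as pd

df = pd.DataFrame({"Impact_Code": [0, 1, 1, 0]})

# Map the binary flag to a short phrase that BERT can tokenize.
df["Impact_Code_Derived"] = df["Impact_Code"].map(
    {0: "Impact Code Not Available", 1: "Impact Code Available"}
)

# The same mapping pattern is applied to the other three categorical flags.
```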

While fine-tuning, I get BERT embeddings for all 8 features, i.e., I take BERT's last hidden state for each feature, which has size (Batch_Size x 512 x 768). I average-pool over each token's features (1 token x 768 features -> 1 value), so the size becomes (Batch_Size x 512). I then concatenate the average-pooled outputs of all 8 features (Batch_Size x 512 x 8) and pass the result to a fully connected layer with tanh activation whose output size is 1024. That 1024-dimensional output goes into another FC layer, which produces 2 outputs.
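Here is a minimal PyTorch sketch of this setup, assuming one shared bert-base encoder from Hugging Face transformers and the 8 tokenized features stacked into a single (batch, 8, 512) tensor; the class and variable names are my own illustration, and the shapes follow the description above:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class MultiFeatureBert(nn.Module):
    def __init__(self, num_features=8, seq_len=512):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # Concatenating 8 pooled vectors of length 512 gives a 4096-dim input.
        self.fc1 = nn.Linear(num_features * seq_len, 1024)
        self.fc2 = nn.Linear(1024, 2)

    def forward(self, input_ids, attention_mask):
        # input_ids, attention_mask: (batch, num_features, seq_len)
        pooled = []
        for i in range(input_ids.size(1)):
            out = self.bert(input_ids=input_ids[:, i],
                            attention_mask=attention_mask[:, i])
            hidden = out.last_hidden_state        # (batch, 512, 768)
            # Average over the 768 hidden features of each token.
            pooled.append(hidden.mean(dim=-1))    # (batch, 512)
        x = torch.cat(pooled, dim=-1)             # (batch, 512 * 8)
        x = torch.tanh(self.fc1(x))               # (batch, 1024)
        return self.fc2(x)                        # (batch, 2) logits
```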

Since I am seeing accuracy of only around 60%, I am not sure whether:

  1. this is the right approach to handle categorical features, and
  2. this is the right approach to handle multiple features in BERT.

Please let me know your suggestions. I searched online but could not find a clear answer to these doubts.
