
Overfitting in Huggingface's TFBertForSequenceClassification

Asked on Data Science, December 22, 2021

I'm using Hugging Face's TFBertForSequenceClassification for multi-label tweet classification. During training the model achieves good accuracy, but the validation accuracy is poor. I've tried to reduce the overfitting with some dropout, but performance is still poor. The model is as follows:

import tensorflow as tf
from transformers import BertConfig, TFBertForSequenceClassification

# Get and configure the BERT model with extra dropout and 13 output labels
config = BertConfig.from_pretrained("bert-base-uncased", hidden_dropout_prob=0.5, num_labels=13)
bert_model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased", config=config)

# Optimizer, loss, and metric for training
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=0.00015, clipnorm=0.01)
loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.CategoricalAccuracy('accuracy')

bert_model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
bert_model.summary()

The summary is as follows:

[Image: BERT model summary]

When I fit the model, the outcome is:

history = bert_model.fit(train_ds, epochs=30, validation_data = test_ds)

[Image: training history showing good training accuracy but poor validation accuracy]

One Answer

In my experience, it is better to build your own classifier on top of a BERT model, adding 2-3 layers for the classification head, since the built-in sequence-classification head uses only a single layer. For better generalization, the model should be deeper, with proper regularization. Since you have around 13 classes, you should use a deeper head and a good number of training examples for each class.
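
As a rough illustration of this idea (this is not code from the question; the layer sizes, dropout rates, and sequence length are assumptions to be tuned), a custom head on top of the bare TFBertModel encoder might look like this:

import tensorflow as tf
from transformers import TFBertModel

MAX_LEN = 128     # hypothetical sequence length; match your tokenizer settings
NUM_LABELS = 13

# Bare BERT encoder, without the built-in classification head
bert = TFBertModel.from_pretrained("bert-base-uncased")

input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

# Pooled [CLS] representation as the sentence embedding
x = bert(input_ids, attention_mask=attention_mask).pooler_output

# A deeper head with dropout for regularization (sizes are assumptions)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(256, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
logits = tf.keras.layers.Dense(NUM_LABELS)(x)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=logits)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.CategoricalAccuracy("accuracy")],
)

The same train_ds/test_ds pipeline from the question can then be passed to model.fit(), provided the dataset yields the input_ids and attention_mask tensors under those names.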

Answered by SrJ on December 22, 2021
