Data Science Asked on December 31, 2020
I implemented a custom NER model with the training data below. The first time I trained it, it gave good predictions for both Name and PrdName. The code is below.
import random
from pathlib import Path

import spacy

if __name__ == '__main__':
    TRAIN_DATA = [
        ('My Name is Rajesh', {'entities': [(11, 17, 'Name')]}),
        ('My Name is Bakul', {'entities': [(11, 16, 'Name')]}),
        ('My Name is Pritam', {'entities': [(11, 17, 'Name')]}),
        ('My Name is Rakesh', {'entities': [(11, 17, 'Name')]}),
        ('My Name is Jayeeta', {'entities': [(11, 18, 'Name')]}),
        ('this is the price of bag', {'entities': [(21, 24, 'PrdName')]}),
        ('what is the price of ball?', {'entities': [(21, 25, 'PrdName')]}),
        ('what is the price of jegging?', {'entities': [(21, 28, 'PrdName')]}),
        ('what is the price of t-shirt?', {'entities': [(21, 28, 'PrdName')]}),
    ]
    iterations = 20
    try:
        model = 'live_ner_model'
        nlp = spacy.load(model)  # load existing spaCy model
    except Exception:
        model = None
        print("Exception")
        nlp = spacy.blank('en')  # create blank Language class
        print("Created blank 'en' model")
    if 'ner' not in nlp.pipe_names:
        ner = nlp.create_pipe('ner')
        nlp.add_pipe(ner)
        print("Create NER")
    else:
        ner = nlp.get_pipe('ner')
        print("Existing NER")
    # Add new entity labels to entity recognizer
    for _, annotations in TRAIN_DATA:
        for ent in annotations.get('entities'):
            ner.add_label(ent[2])
    # get names of other pipes to disable them during training
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
    with nlp.disable_pipes(*other_pipes):  # only train NER
        optimizer = nlp.begin_training()
        for itn in range(iterations):
            print("Starting iteration " + str(itn))
            random.shuffle(TRAIN_DATA)
            losses = {}
            for text, annotations in TRAIN_DATA:
                nlp.update(
                    [text],  # batch of texts
                    [annotations],  # batch of annotations
                    drop=0.2,  # dropout - make it harder to memorise data
                    sgd=optimizer,  # callable to update weights
                    losses=losses)
            print(losses)
    # Save model
    output_dir = 'live_ner_model'
    if output_dir is not None:
        output_dir = Path(output_dir)
        if not output_dir.exists():
            output_dir.mkdir()
        nlp.meta['name'] = model  # rename model
        nlp.to_disk(output_dir)
        print("Saved model to", output_dir)
    # Test the saved model
    output_dir = 'live_ner_model'
    print("Loading from", output_dir)
    nlp2 = spacy.load('live_ner_model')
    test_text = """
    what is the price of cup. My Name is Rahim
    """
    doc2 = nlp2(test_text)
    for ent in doc2.ents:
        print(ent.label_, ent.text)
But when I train the existing model further with new data that contains only PrdName entities (or any other new entity, but no Name), the Name predictions go wrong. I think this happens because the updated training data excludes the Name entity.
Is there any way to improve the model with new training data without affecting what it already learned? Can someone share an idea? If possible, please share sample code.
Environment: Anaconda, spacy=v2.0.1, python=3.7
The model depends entirely on the training data: if you train it with data that has only PrdName as a label, it learns only this label and can predict only this label. You need to provide as much training data as possible, containing all the labels you want the model to predict.
For the record, NER models are usually trained with thousands of sentences in order to account for the diversity of contexts in which a named entity can appear.
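To make that concrete, here is a minimal sketch of one way to keep every label represented when new data arrives: merge the earlier examples with the new ones and check label coverage before training. It uses plain Python only. The names `OLD_DATA`, `NEW_DATA`, `label_counts`, `merge_training_data` and `missing_labels` are hypothetical helpers made up for illustration, and the two `NEW_DATA` sentences are invented examples, not data from the question.

```python
from collections import Counter

# Examples the existing model was originally trained on (taken from the question).
OLD_DATA = [
    ('My Name is Rajesh', {'entities': [(11, 17, 'Name')]}),
    ('what is the price of ball?', {'entities': [(21, 25, 'PrdName')]}),
]

# Hypothetical new examples that only carry the PrdName label.
NEW_DATA = [
    ('what is the price of cup?', {'entities': [(21, 24, 'PrdName')]}),
    ('this is the price of mug', {'entities': [(21, 24, 'PrdName')]}),
]


def label_counts(data):
    """Count how many annotated spans each label has in the data."""
    return Counter(
        label
        for _, annotations in data
        for _, _, label in annotations.get('entities', [])
    )


def merge_training_data(old_data, new_data):
    """Combine old and new examples, keeping the first copy of each text."""
    seen = set()
    merged = []
    for text, annotations in old_data + new_data:
        if text not in seen:
            seen.add(text)
            merged.append((text, annotations))
    return merged


def missing_labels(data, required_labels):
    """Return the required labels that have no example in the data."""
    return sorted(set(required_labels) - set(label_counts(data)))


merged = merge_training_data(OLD_DATA, NEW_DATA)
print(missing_labels(NEW_DATA, ['Name', 'PrdName']))  # ['Name']
print(missing_labels(merged, ['Name', 'PrdName']))    # []
```

The merged list would then be used as `TRAIN_DATA` in the training loop from the question, so that every pass over the data still contains Name examples. Separately, and as far as I know, `nlp.begin_training()` in spaCy v2 re-initializes the model weights, so calling it on a model loaded from disk may discard what was learned earlier. Later v2 releases provide `nlp.resume_training()` for updating an existing model, but check the documentation for your installed version.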
Answered by Erwan on December 31, 2020