Data Science Asked by Mohit Saini on November 23, 2020
I want to train a model for phrase similarity. It should predict a similarity score for pairs of phrases like the ones in the examples below.
Examples:
International Business Machines = I.B.M
Synergy Telecom = SynTel
Beam inc = Beam Incorporate
Sir J J Smith = Johnson Smith
Alex, Julia = J Alex
James B. D. Joshi = James Joshi
James Beaty, Jr. = Beaty
Is there any dataset available to train this type of model?
This is a difficult problem, but definitely worth exploring.
An interesting resource to look into is DBpedia. It aims to extract structured information from the Wikipedia project. It is available under a free license (CC-BY-SA).
You can conveniently explore the project online, e.g.:
Note that you are restricted to the extensive but finite knowledge on Wikipedia. For example, Synergy Telecom/SynTel seems not to have an entry. Some creativity would be required to overcome this limitation.
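One way to turn DBpedia into training pairs, offered only as a rough sketch: Wikipedia redirect pages often correspond to alternative names of the same entity, and DBpedia exposes them through the `dbo:wikiPageRedirects` property. Using redirects as a source of similar phrases is an assumption of this example, not something the dataset is designed for, and the results would need manual filtering. The helper names (`build_redirect_query`, `parse_aliases`, `to_positive_pairs`) are hypothetical, and the code uses only the Python standard library.

```python
import json
import urllib.parse
import urllib.request

# Public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_redirect_query(resource_name: str, limit: int = 50) -> str:
    """Build a SPARQL query listing pages that redirect to a DBpedia resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?page WHERE {\n"
        f"  ?page dbo:wikiPageRedirects <http://dbpedia.org/resource/{resource_name}> .\n"
        f"}} LIMIT {limit}"
    )


def parse_aliases(payload: dict) -> list:
    """Turn a SPARQL JSON result into readable alias strings.

    The alias is derived from the last path segment of each redirect URI
    (an assumption: the page title is a usable surface form).
    """
    aliases = []
    for binding in payload["results"]["bindings"]:
        local_name = binding["page"]["value"].rsplit("/", 1)[-1]
        aliases.append(urllib.parse.unquote(local_name).replace("_", " "))
    return aliases


def fetch_aliases(resource_name: str, limit: int = 50) -> list:
    """Query the live endpoint (needs network access)."""
    params = urllib.parse.urlencode({"query": build_redirect_query(resource_name, limit)})
    request = urllib.request.Request(
        f"{DBPEDIA_ENDPOINT}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_aliases(json.load(response))


def to_positive_pairs(canonical: str, aliases: list) -> list:
    """Label every (alias, canonical) pair as similar (score 1.0)."""
    return [(alias, canonical, 1.0) for alias in aliases if alias != canonical]


# Example usage (not executed here because it needs network access):
# pairs = to_positive_pairs("IBM", fetch_aliases("IBM"))
```

This only yields positive pairs. Negative pairs would have to come from somewhere else, for instance by pairing names of unrelated entities.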
Answered by Simon on November 23, 2020
This seems to correspond to entity linking or possibly named entity coreference. You might find some datasets here.
Answered by Erwan on November 23, 2020