Data Science Asked by Borislav Stoilov on April 3, 2021
After reading several papers, I am not sure whether it is possible to generate text with the same meaning (i.e. paraphrase it) using only Word2vec.
I have found other approaches that use pairs of sentences and train neural nets to find the most similar one, but that is hard to maintain, and it would be hard to generate relevant content that way.
What I would like is a Word2vec-powered algorithm that takes raw text and returns paraphrased text.
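For context on what "only Word2vec" buys you: the most you can do with word vectors alone is swap each content word for its nearest neighbour in embedding space. The sketch below illustrates that idea with a tiny hand-made embedding table (the vectors and vocabulary are invented for illustration; in practice you would load pretrained Word2vec vectors, e.g. with gensim's `KeyedVectors`). It also shows the weakness of the approach: there is no notion of grammar or sentence-level context, only word-by-word substitution.

```python
import math

# Hypothetical toy embeddings standing in for pretrained Word2vec vectors.
EMBEDDINGS = {
    "house": [0.90, 0.10, 0.00],
    "home":  [0.85, 0.15, 0.05],
    "dog":   [0.10, 0.90, 0.20],
    "cat":   [0.15, 0.85, 0.25],
    "the":   [0.00, 0.00, 1.00],
}

# Only swap content words; function words like "the" stay put.
CONTENT_WORDS = {"house", "home", "dog", "cat"}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbour(word):
    """Return the most similar *other* word in the vocabulary."""
    if word not in EMBEDDINGS:
        return word
    vec = EMBEDDINGS[word]
    candidates = ((w, cosine(vec, v)) for w, v in EMBEDDINGS.items() if w != word)
    return max(candidates, key=lambda wv: wv[1])[0]

def substitute(sentence):
    """Naive 'paraphrase': replace each content word by its nearest neighbour."""
    return " ".join(
        nearest_neighbour(w) if w in CONTENT_WORDS else w
        for w in sentence.split()
    )

print(substitute("the dog in the house"))  # -> "the cat in the home"
```

Note that this can just as easily turn "dog" into "cat" (a near neighbour but not a synonym), which is exactly why pure Word2vec substitution is rarely enough for real paraphrasing.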
Someone has defined a paraphrasing problem on Tensor2Tensor: https://github.com/tensorflow/tensor2tensor/releases
You can also define your own problem, but you may have to supply the corpus yourself: https://tensorflow.github.io/tensor2tensor/new_problem.html
If you are just looking for a vectorized representation of your sentences, Google's BERT might be worth looking at; bert-as-service is quite convenient: https://github.com/hanxiao/bert-as-service
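Once you have sentence vectors, checking whether two sentences are paraphrases reduces to comparing those vectors, typically with cosine similarity. A minimal sketch, using toy 4-dimensional vectors as stand-ins for the real encodings that bert-as-service's `BertClient().encode([...])` would return (the vector values here are invented for illustration):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two dense sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for vectors that bert-as-service would produce, e.g.:
#   from bert_serving.client import BertClient
#   vecs = BertClient().encode([
#       "The three little pigs each had their own home.",
#       "Each of the three little pigs had their own house.",
#   ])
vec_a = [0.20, 0.80, 0.10, 0.40]
vec_b = [0.25, 0.75, 0.15, 0.35]

# Similar sentences should score close to 1.0.
print(round(cosine_similarity(vec_a, vec_b), 3))  # -> 0.995
```

A threshold on this score (what counts as "close enough") is something you would have to tune on your own data.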
Perhaps you could reframe your problem slightly as a question-answering task, which I think both T2T and BERT are well equipped to handle. The questions could be of the form "The three little pigs each had their own home?", for which an answer paraphrasing the question could be "Each of the three little pigs had their own house." (But I'm only guessing here.) Maybe there is a way to tweak the problem so you don't have to include a question mark; then it would be a paraphrasing task.
I'm sorry I can't provide a more professional answer, but I think T2T is definitely worth looking at.
Answered by mLstudent33 on April 3, 2021