TransWikia.com

Inference from text data without labels or a target

Data Science Asked by sayan_sen on May 16, 2021

I have a use case where I have text data entered by an approver when approving a loan.

Using NLP, I have to infer what the reasons for approval could be. How should I go about it?

The text is in a non-English language. Can clustering the text help? Is it possible to cluster text in a non-English language using Python libraries?

One Answer

Is it possible to cluster text in a non-English language using Python libraries?

Sure! Classic approaches based on bag-of-words are language independent. Modern approaches based on deep neural networks mostly rely on pre-trained models, so you need to find a model for your language or train one from scratch (which requires a lot of text in that language). For example, if you are using AWS infrastructure, check the Object2Vec algorithm.
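To illustrate why bag-of-words needs no language-specific model, here is a minimal sketch using only the Python standard library. It builds TF-IDF vectors and groups texts with a simple threshold-based "leader" clustering, which I swapped in for something like k-means only to keep the example dependency-free. The Spanish sentences, the function names and the `threshold` value are all invented for illustration, and the tokenizer assumes the language separates words with spaces or punctuation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # \w is Unicode-aware in Python 3, so accented and non-Latin letters are kept.
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many texts each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF so that no weight is ever zero or undefined.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_cluster(docs, threshold=0.2):
    """Assign each text to the most similar existing cluster leader,
    or start a new cluster if no leader is similar enough."""
    vectors = tfidf_vectors(docs)
    leaders = []  # index of the first text of each cluster
    labels = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster_id, leader in enumerate(leaders):
            sim = cosine(vec, vectors[leader])
            if sim >= best_sim:
                best, best_sim = cluster_id, sim
        if best is None:
            leaders.append(i)
            best = len(leaders) - 1
        labels.append(best)
    return labels


# Toy approver comments (invented, Spanish).
docs = [
    "ingresos estables y buen historial crediticio",
    "buen historial crediticio e ingresos estables",
    "garantía hipotecaria suficiente",
    "garantía hipotecaria aprobada",
]
labels = leader_cluster(docs)
print(labels)  # [0, 0, 1, 1]
```

On real data you would probably replace the leader clustering with a library implementation (for example k-means on a TF-IDF matrix) and tune the threshold or number of clusters, but the vectorization step stays the same regardless of language.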

Can clustering the text help?

It can help. For instance, for an initial labeling you can cluster the data into groups of similar texts and label each group according to its overall concept. A more sophisticated solution (easily implemented in Python) is topic modeling, e.g. the LDA algorithm.

An even more sophisticated solution is, again, pre-trained models such as S-BERT.

In the same direction, I also recommend running a keyword analysis with algorithms like RAKE or YAKE.
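To give an idea of what a RAKE-style keyword analysis does, here is a simplified re-implementation of the scoring idea in plain Python. It is not the API of any RAKE or YAKE package: candidate phrases are cut at stopwords and punctuation, each word is scored as degree divided by frequency, and a phrase scores the sum of its words. The function name, the sample text and the short Spanish stopword set are assumptions. For a real language you would need a proper stopword list.

```python
import re
from collections import defaultdict


def rake_keywords(text, stopwords):
    """Rank candidate phrases with a simplified RAKE scoring scheme."""
    phrases = []
    # Punctuation ends a phrase, and so does any stopword.
    for chunk in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in chunk.split():
            if word in stopwords:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))

    freq = defaultdict(int)    # how often each word occurs in candidates
    degree = defaultdict(int)  # total length of the phrases it occurs in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)

    word_score = {w: degree[w] / freq[w] for w in freq}
    phrase_score = {p: sum(word_score[w] for w in p) for p in set(phrases)}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(((" ".join(p), s) for p, s in phrase_score.items()),
                  key=lambda item: (-item[1], item[0]))


# Invented approver comment and a tiny illustrative stopword set.
text = ("Cliente con ingresos estables y buen historial crediticio. "
        "Garantía hipotecaria suficiente.")
stopwords = {"con", "y", "e", "el", "la", "de"}

keywords = rake_keywords(text, stopwords)
for phrase, score in keywords:
    print(f"{score:4.1f}  {phrase}")
```

Longer multi-word phrases score higher under this scheme, which is why it tends to surface expressions like "buen historial crediticio" rather than single common words.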

Hope it helps!

Answered by Kasra Manshaei on May 16, 2021
