Use Large Existing Dataset to Extract Information From Text

Data Science Asked on July 6, 2021

What I have:
I have a large dataset of documents together with known data about each one.
Specifically, I have the text of about 1M documents, and for each one I know certain fields, for example the invoice number.

What I need:
Is there a way to use my dataset to train a model that takes in the text of a new document and predicts what its invoice number is?

I’m just getting my feet wet with ML. I’ve used ML.NET to predict the document type (one of about 12 possible types) from my dataset using MulticlassClassification. There, the prediction is always one of the 12 known labels.
Here, I need the model to use my training data to predict a value that is not one of a fixed set of labels it was trained on.
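
To make the difference from classification concrete, one common way to frame this kind of problem is as sequence labeling: instead of predicting one label per document, the model labels each token as part of the invoice number or not. The sketch below is only an illustration of how (text, known value) pairs could be turned into token-level BIO labels for such a model. It assumes the known value appears verbatim in the text; the function names, the `INVOICE_NO` field name, and the sample text are all made up for this example.

```python
import re


def tokenize(text):
    """Split text into (token, start, end) triples on runs of non-whitespace."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def bio_labels(text, value, field="INVOICE_NO"):
    """Label each token O, B-<field>, or I-<field> from the first occurrence of `value`.

    Assumption: `value` occurs verbatim in `text`. If it does not, every
    token is labeled O and the caller decides whether to skip the document.
    """
    tokens = tokenize(text)
    labels = ["O"] * len(tokens)
    start = text.find(value)
    if start == -1:
        return tokens, labels
    end = start + len(value)
    inside = False
    for i, (_, tok_start, tok_end) in enumerate(tokens):
        # A token belongs to the field if it overlaps the matched character span.
        if tok_start < end and tok_end > start:
            labels[i] = ("I-" if inside else "B-") + field
            inside = True
    return tokens, labels


# Hypothetical sample document and its known invoice number.
text = "Invoice No: INV-20394 Date: 2021-07-06 Total: 118.40"
tokens, labels = bio_labels(text, "INV-20394")
pairs = [(tok, lab) for (tok, _, _), lab in zip(tokens, labels)]
```

Labels produced this way could then serve as training data for a token classifier, which can pick out an invoice number it has never seen because it learns from context and token shape rather than from a closed label set.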

machine learning, machine learning model, text, text classification, training
