
How to train NER LSTM on single sentence level

Data Science Asked by Rien on January 29, 2021

My documents are only a single sentence long, containing one annotation.
Sentences with the same named entity type are of course similar, but not context-wise.

NER training examples (as far as I know) usually consist of sequentially related documents, i.e. the next document is context-wise related to the previous one. Consider the example below: the first sentence is about the US, with location annotations, and the second sentence is about an organisation but still related to the previous one.

The United States of America (LOC), commonly known as the United States (U.S. or US).

The Bank of America (ORG) is a multinational investment bank.

My dataset, for example, would be:

The United States of America (LOC), commonly known as the United States (U.S. or US) or America (LOC).

The Netherlands (LOC), informally Holland, is a country in Western Europe.

Peter (PER) works at the harbour.

These sentences are not related. When training a bi-LSTM, should I somehow separate the sentences during training, so that the model doesn't treat the annotation of the current sentence as related to the previous sentence?

Take for example the dataset previewed in this Kaggle notebook. It has a sentence_idx to separate sentences by ID, but other than that it's just one gigantic list of words (with features). What happens when a (bi-)LSTM finds itself in a completely different context, where the current sentence has absolutely no relation to the previous one?
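To illustrate, one way to recover per-sentence training examples from that flat layout would be to group on sentence_idx; here is a rough pandas sketch (the file name and column names are assumptions based on that dataset, so adjust them to the actual schema):

```python
import pandas as pd

# File and column names assumed from the Kaggle dataset mentioned above.
df = pd.read_csv("ner_dataset.csv", encoding="latin1")

# Group the one gigantic word list back into per-sentence sequences,
# so each training example is a single sentence rather than a stream.
sentences = (
    df.groupby("sentence_idx")
      .apply(lambda g: list(zip(g["word"], g["tag"])))
      .tolist()
)

print(sentences[0])  # e.g. [('Thousands', 'O'), ('of', 'O'), ...]
```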

Normally I suppose this isn't a problem because documents are very large, but mine are just a single sentence long. They share no context, except that, for example, sentences with a LOC annotation are of course about locations, but not a specific context.

I had a very difficult time describing my problem; questions and edits to make it clearer are very welcome. I believe a similar question is: How to train NER LSTM on single sentence level

One Answer

Looks like this is done with the hyperparameter called timesteps (or sequence length). The number of timesteps indicates when the cell state is reset, and it can be set to the sentence length.

I just have to figure out whether this can be a variable length, or whether the sentences should be padded.
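A minimal Keras sketch of that idea, with each sentence as its own sequence, padded to a fixed number of timesteps and the padding masked so it cannot leak into the cell state (all layer sizes and the toy data below are illustrative assumptions, not taken from a real corpus):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Bidirectional, LSTM,
                                     TimeDistributed, Dense)

# Toy word/tag indices for three unrelated one-sentence documents;
# index 0 is reserved for padding (hence mask_zero=True below).
X_raw = [[4, 9, 2, 7], [5, 3], [8, 6, 1, 2, 3]]
y_raw = [[1, 0, 0, 2], [2, 0], [3, 0, 0, 0, 1]]

max_len = 6  # fixed timesteps; the LSTM state resets at each new sequence
X = pad_sequences(X_raw, maxlen=max_len, padding="post", value=0)
y = pad_sequences(y_raw, maxlen=max_len, padding="post", value=0)

model = Sequential([
    # mask_zero=True propagates a mask so the padded positions do not
    # influence the LSTM state or the loss.
    Embedding(input_dim=10, output_dim=64, mask_zero=True),
    Bidirectional(LSTM(64, return_sequences=True)),
    TimeDistributed(Dense(4, activation="softmax")),  # 4 tag classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, batch_size=2, epochs=2)
```

With the mask in place, fixed-length padding behaves like variable-length sequences, since the padded timesteps are ignored; the alternative would be to bucket sentences by length and feed each bucket as its own batch.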

Answered by Rien on January 29, 2021
