Lalit Vyas

Reputation: 241

How many training examples (sentences) are required for custom NER using spaCy in Python? [Just a rough idea]

Let's say I have 10 custom entities to recognize. Roughly how many annotated training sentences should I provide?

Thank You, in Advance!! :)

I am new to this, please help

Upvotes: 4

Views: 5407

Answers (2)

Ridhima Garg

Reputation: 67

For a custom NER model in spaCy, you will need around 100 samples for each entity, and the dataset should be free of bias.

All this is as per my experience.

Suggestion: a custom spaCy model is worth exploring, but for production or any serious project you can't depend on it alone. You will need additional NLP work, such as relation extraction, alongside it.

Hope this helps.

Upvotes: 1

Hitesh Laddha

Reputation: 57

To develop a custom NER model, you need at least 50-100 occurrences of each entity, each in its proper context. With less data than that, your custom model will overfit. So, depending on your data, you will need at least 200 to 300 sentences.
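If it helps, here is a rough sketch for checking how many occurrences of each entity your annotated data actually contains. It assumes the data is a list of `(text, {"entities": [(start, end, label)]})` tuples, which is a common layout for spaCy training examples. The sample sentences, the helper names, and the `expected_labels` parameter are made up for illustration, and the default threshold of 50 is just the lower end of the range above, not a spaCy setting.

```python
from collections import Counter

# Hypothetical annotated sentences: (text, {"entities": [(start, end, label)]}).
TRAIN_DATA = [
    ("Apple hired Tim Cook in 1998.", {"entities": [(0, 5, "ORG"), (12, 20, "PERSON")]}),
    ("Tim Cook now leads Apple.", {"entities": [(0, 8, "PERSON"), (19, 24, "ORG")]}),
    ("Satya Nadella runs Microsoft.", {"entities": [(0, 13, "PERSON"), (19, 28, "ORG")]}),
]


def count_entity_occurrences(train_data):
    """Count how many annotated spans exist for each entity label."""
    counts = Counter()
    for _text, annotations in train_data:
        for _start, _end, label in annotations.get("entities", []):
            counts[label] += 1
    return counts


def labels_below_threshold(train_data, min_count=50, expected_labels=()):
    """Return {label: count} for labels with fewer than min_count occurrences.

    expected_labels lets you list labels that may not appear in the data at
    all, so they are reported with a count of 0 instead of being missed.
    """
    counts = count_entity_occurrences(train_data)
    labels = set(counts) | set(expected_labels)
    return {label: counts[label] for label in sorted(labels) if counts[label] < min_count}


if __name__ == "__main__":
    print(count_entity_occurrences(TRAIN_DATA))
    print(labels_below_threshold(TRAIN_DATA, min_count=50, expected_labels=["ORG", "PERSON", "GPE"]))
```

Any label this reports is one you would probably want to annotate more sentences for before training.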

Upvotes: 3
