Reputation: 1035
Is it possible to train a spaCy NER model with validation data? Or to split off part of the data as a validation set, as in Keras (validation_split in model.fit)? Thanks
with nlp.disable_pipes(*other_pipes):  # only train NER
    for itn in tqdm(range(n_iter)):
        random.shuffle(train_data_list)
        losses = {}
        # batch up the examples using spaCy's minibatch
        batches = minibatch(train_data_list, size=compounding(8., 64., 1.001))
        for batch in batches:
            texts, annotations = zip(*batch)
            nlp.update(texts, annotations, sgd=optimizer, drop=0.35,
                       losses=losses)
Upvotes: 0
Views: 2120
Reputation: 11474
Use the spacy train CLI instead of the demo script:
spacy train lang /path/to/output train.json dev.json
The validation data is used to choose the best model from the training iterations and optionally for early stopping.
The main task is converting your data to spaCy's JSON training format; see: https://stackoverflow.com/a/59209377/461847
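If all you have is the single train_data_list from the question, one way to get separate train and dev sets is to hold out a fraction yourself before converting. Below is a minimal sketch in plain Python, assuming the (text, annotations) tuples from the question. The helper train_dev_split, its dev_fraction and seed parameters, and the toy examples are hypothetical names for illustration, not part of spaCy.

```python
import random


def train_dev_split(examples, dev_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and return (train, dev) lists.

    Hypothetical helper, not a spaCy function.
    """
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be strictly between 0 and 1")
    shuffled = list(examples)  # copy, so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_dev = max(1, int(len(shuffled) * dev_fraction))  # keep at least one dev example
    return shuffled[n_dev:], shuffled[:n_dev]


# Toy data in the (text, annotations) shape used by the loop in the question.
train_data_list = [
    ("Apple opened store %d" % i, {"entities": [(0, 5, "ORG")]})
    for i in range(10)
]

train_examples, dev_examples = train_dev_split(train_data_list, dev_fraction=0.2)
print(len(train_examples), len(dev_examples))
```

Each of the two lists would then be converted to its own JSON file (train.json and dev.json in the command above). Splitting once up front, rather than reshuffling on every run, keeps the dev set from leaking into training.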
Upvotes: 2