eng2019

Reputation: 1035

Set validation data in SpaCy NER training

Is it possible to train spaCy NER with validation data, or to split off part of the training data as a validation set the way Keras does (validation_split in model.fit)? Thanks

import random
from tqdm import tqdm
from spacy.util import minibatch, compounding

# nlp, other_pipes, optimizer, n_iter and train_data_list are assumed to be defined earlier
with nlp.disable_pipes(*other_pipes):  # only train NER
    for itn in tqdm(range(n_iter)):
        random.shuffle(train_data_list)
        losses = {}
        # batch up the examples using spaCy's minibatch
        batches = minibatch(train_data_list, size=compounding(8., 64., 1.001))
        for batch in batches:
            texts, annotations = zip(*batch)
            nlp.update(texts, annotations, sgd=optimizer, drop=0.35,
                       losses=losses)

Upvotes: 0

Views: 2120

Answers (1)

aab

Reputation: 11474

Use the spacy train CLI instead of the demo script:

spacy train lang /path/to/output train.json dev.json

The validation data is used to choose the best model from the training iterations and optionally for early stopping.

The main task is converting your data to spaCy's JSON training format; see: https://stackoverflow.com/a/59209377/461847
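Before converting, the annotated examples have to be divided into the two sets that end up in train.json and dev.json. Here is a minimal sketch of a hold-out split, assuming the data is a list of (text, annotations) tuples like the question's train_data_list. The helper name split_train_dev, the 0.2 fraction and the toy examples are illustrative only and are not part of spaCy:

```python
import random


def split_train_dev(examples, dev_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and split it into (train, dev) lists.

    Hypothetical helper, not a spaCy function.
    """
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be strictly between 0 and 1")
    shuffled = list(examples)              # copy, so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_dev = max(1, int(len(shuffled) * dev_fraction))  # keep at least one dev example
    return shuffled[n_dev:], shuffled[:n_dev]


# Toy examples in the (text, annotations) shape used in the question
train_data_list = [
    ("Visit London {}".format(i), {"entities": [(6, 12, "GPE")]})
    for i in range(10)
]

train_data, dev_data = split_train_dev(train_data_list, dev_fraction=0.2)
```

Each of the two lists can then be converted separately to the JSON training format and passed to spacy train as the train and dev files. Splitting once up front, rather than reshuffling on every run, keeps the validation examples out of training entirely.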

Upvotes: 2
