How to check if a token is present in a document with spaCy?

Question

I have a large collection of long spaCy documents and a list of keywords that I want to look up in each document. For example, I want to find the word "Aspirin" in the text of a website that was parsed with spaCy. The list of keywords is quite long.

Naive approach

Skip spaCy and just use `if keyword in website_text:` as a simple matcher. The downside is that token boundaries are ignored, so a search for "test" yields false positives on words like "tested", "attested", etc.
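
To illustrate the problem, here is a minimal sketch. The sample text and variable names are made up for this example, and the whitespace split is only a crude stand-in for real tokenization:

```python
# Hypothetical sample text, not taken from a real website
website_text = "The drug was tested and the results were attested."
keyword = "test"

# Substring check: also matches inside "tested" and "attested"
substring_hit = keyword in website_text

# Whole-token check on a crude whitespace split, for comparison
tokens = [t.strip(".,;:!?").lower() for t in website_text.split()]
token_hit = keyword.lower() in tokens

print(substring_hit, token_hit)
```

The substring check reports a hit even though "test" never occurs as a word of its own, which is exactly the false positive I want to avoid.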

Use spaCy's matchers

Matchers are an option, but I would need to build a lot of matchers automatically from my list of keywords.
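
For illustration, one way this could look is a single PhraseMatcher fed with the whole keyword list, rather than one matcher per keyword. This is only a sketch under assumptions: it uses the spaCy v3 `PhraseMatcher.add(key, docs)` signature, a blank English pipeline (tokenizer only), and hypothetical helper names (`build_matcher`, `found_keywords`) and sample data that I made up.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Blank pipeline: tokenizer only, no trained model required (assumption)
nlp = spacy.blank("en")


def build_matcher(keywords):
    """Build one PhraseMatcher from the whole keyword list (hypothetical helper)."""
    # attr="LOWER" compares lowercase token text, so matching is case-insensitive
    matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
    # make_doc only tokenizes, which keeps pattern creation cheap for long lists
    matcher.add("KEYWORDS", [nlp.make_doc(kw) for kw in keywords])
    return matcher


def found_keywords(doc, matcher):
    """Return the lowercased keywords that occur as whole tokens in doc."""
    return {doc[start:end].text.lower() for _, start, end in matcher(doc)}


keywords = ["Aspirin", "test"]  # made-up sample list
matcher = build_matcher(keywords)
doc = nlp("Aspirin was tested, not attested.")
hits = found_keywords(doc, matcher)
print(hits)
```

Because matching happens on tokens, "test" should not match inside "tested" or "attested" here. I don't know whether this scales well to a very long keyword list and many documents.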

Is there a recommended way to achieve this task?
