Reputation: 21
I am trying to use spaCy's fuzzy matching. I am using version 3.5.0 of both the spaCy package and the en_core_web_sm model.
I ran the following:
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
pattern = [{"LOWER": "hello"}, {"FUZZY": "world"}]
matcher.add("my_name", [pattern])
When running the above, I get the following error:
MatchPatternError: Invalid token patterns for matcher rule 'my_name'
Pattern 0: [pattern -> 1 -> FUZZY] extra fields not permitted
I'm not familiar with what this error is trying to say. Given that the above example was taken from spaCy's documentation, I would not expect an error to occur. If I remove {"FUZZY": "world"}, the code runs without error.
Would someone please explain why the error is being returned?
Upvotes: 2
Views: 828
Reputation: 15593
You can't use FUZZY by itself in a rule; it needs to be nested under another attribute that tells it which field to check against. If you check the docs again, you'll see this line:
Just like REGEX, it always needs to be applied to an attribute like TEXT or LOWER.
You can also see that all the examples look something like {"LOWER": {"FUZZY": ... or {"TEXT": {"FUZZY": ... (FUZZY nested under an attribute). It seems natural that the token text would be matched by default, but you actually have to specify that, since it's possible to match against other things like lemmas etc.
In this case you can fix your code by changing your pattern to this:
pattern = [{"LOWER": "hello"}, {"LOWER": {"FUZZY": "world"}}]
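As a quick check, here is a minimal sketch of the corrected pattern in use. It assumes spaCy 3.5 or later (the release that added FUZZY). To stay self-contained it uses spacy.blank("en") rather than en_core_web_sm, which is my substitution and should be enough here because LOWER is a lexical attribute. The sample texts and the find_matches helper are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank English pipeline (tokenizer only) stands in for
# en_core_web_sm, since LOWER does not need any trained components.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# FUZZY is nested under LOWER, so the fuzzy comparison runs against
# the lowercased token text.
pattern = [{"LOWER": "hello"}, {"LOWER": {"FUZZY": "world"}}]
matcher.add("my_name", [pattern])


def find_matches(text):
    """Hypothetical helper: return the text of every matched span."""
    doc = nlp(text)
    return [doc[start:end].text for _, start, end in matcher(doc)]


# "wrld" is one edit away from "world", so it should fall within the
# default fuzzy threshold; "there" is too far from "world" to match.
close_spans = find_matches("Hello wrld")
far_spans = find_matches("Hello there")
print(close_spans)
print(far_spans)
```

If the default threshold turns out too loose or too strict, the docs also describe FUZZY1 through FUZZY9 for setting the number of allowed edits explicitly.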
Upvotes: 2