Reputation: 1326
I'm wondering if a bigquery/patents person can pop in on something that I'm scratching my head over.
Basically, if you search for two phrases, say "phrase 1" and "phrase 2", patents.google.com seems to do something more than just search the title/abstract/description/claims. I've seen cases where patents.google.com returns a patent, yet at least one of the phrases doesn't appear anywhere in that patent's text when you open it. And when you run a bigquery query matching on title or abstract or description or claims, the result count is way lower than what patents.google.com reports.
Is it using some additional info/structures not available to bigquery?
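To make the comparison concrete, the literal match described above amounts to roughly the following. This is only a minimal Python sketch over in-memory records: the dict layout, the `FIELDS` tuple, and the `matches_all_phrases` helper are assumptions for illustration, not the BigQuery schema or whatever patents.google.com actually does.

```python
# Hypothetical field names, mirroring title/abstract/description/claims.
FIELDS = ("title", "abstract", "description", "claims")

def matches_all_phrases(patent, phrases, fields=FIELDS):
    """Return True if every phrase occurs literally in at least one field."""
    # Join with newlines so a phrase cannot match across a field boundary.
    text = "\n".join((patent.get(f) or "") for f in fields).lower()
    return all(phrase.lower() in text for phrase in phrases)

# Made-up records, purely for illustration.
patents = [
    {"id": "A", "title": "Uses phrase 1", "claims": "and also phrase 2"},
    {"id": "B", "title": "Uses phrase 1", "abstract": "nothing else here"},
    {"id": "C", "description": "PHRASE 2 appears, then Phrase 1"},
]

hits = [p["id"] for p in patents if matches_all_phrases(p, ["phrase 1", "phrase 2"])]
print(hits)
```

A patent like "B" is the puzzling case: a purely literal match rejects it, so if the site returns it, the site must be matching on something beyond these four text fields.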
I noticed that there are "similar" and "cited by" fields in the google-research patent db, but these seem more useful for seeding, which is (I think) the reverse of starting from phrases and finding patents for those phrases.
I also noticed a field "embedded_v1" which is an array of floats for some kind of similarity(?) measure, but I don't see how you'd use it as part of the query here.
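If that array is an embedding vector (an assumption; all that is known here is that it is an array of floats), the usual way to use it would be cosine similarity against a seed patent's vector. A minimal sketch, with hypothetical helper names and made-up vectors:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def rank_by_similarity(seed_vector, candidates):
    """Sort (patent_id, vector) pairs from most to least similar to the seed."""
    return sorted(
        candidates,
        key=lambda item: cosine_similarity(seed_vector, item[1]),
        reverse=True,
    )

# Made-up 3-dimensional vectors; real embeddings would be much longer.
seed = [1.0, 0.0, 0.0]
candidates = [("X", [0.0, 1.0, 0.0]), ("Y", [0.9, 0.1, 0.0]), ("Z", [0.5, 0.5, 0.0])]
ranked_ids = [pid for pid, _ in rank_by_similarity(seed, candidates)]
print(ranked_ids)
```

This still needs a seed patent's vector to compare against, so on its own it doesn't go from phrases to patents.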
thanks!
Upvotes: 0
Views: 135