LEBRON L
LEBRON L

Reputation: 17

Acquire patents' forward citation data from BigQuery by application

I want to collect data by application_number like this. The real application number is CN 201510747352.

SELECT c.application_number AS Pub, COUNT(p.publication_number) AS CitedBy 
     FROM `patents-public-data.patents.publications` AS p, UNNEST(citation) AS c 
     WHERE c.application_number IN ('CN-201510747352-A') 
     GROUP BY c.application_number

But it can't work. The url is the patent page. Who can do me a favor? https://patents.google.com/patent/CN105233911B/zh?oq=CN201510747352.8

Upvotes: 0

Views: 297

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173121

My guess is that patent can be cited after it's status is Application - so instead of using initial number CN-201510747352 - you should use app/pub number when status is Application - Also you need to apply not only distinct counting but also excluding counting same app with -A or -B or etc. suffix - that is why you will see use of the regex_extract function

#standardSQL
SELECT 
  c.publication_number AS Pub, 
  COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p, 
UNNEST(citation) AS c 
WHERE c.publication_number LIKE ('CN-105233911%') 
GROUP BY c.publication_number  

with result

Row Pub             CitedBy  
1   CN-105233911-A  10   

... If I only have the application data, how can I realize it?

#standardSQL
SELECT 
  c.publication_number AS Pub, 
  COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p, 
UNNEST(citation) AS c 
WHERE c.publication_number IN (
  SELECT publication_number 
  FROM `patents-public-data.patents.publications`
  WHERE application_number IN ('CN-201510747352-A') 
)
GROUP BY c.publication_number 

Upvotes: 1

Related Questions