[PYTHON] The most cited patent in the world

Introduction

Since the holidays have begun, I wanted to make an urgent analysis, so I wondered which patent document had the highest number of citations in the world.

SQL in BigQuery

For the first time in online learning, I learned that SQL is pronounced as squeaky. ..

What was difficult was how to create a citation count pattern that eliminates the total number of citations and duplicates (excluding being used many times for reasons for refusal of the same application), and because BigQuery contains only citation information, it is cited. The part that generates information (reverse direction). It's a little complicated, but it looks like the following.

SQL for population
WITH bibtable as (
  SELECT
    pub.application_number AS appnum,
    pub.publication_number AS pubnum,
    pub.filing_date as appday,
    STRING_AGG(DISTINCT(applicants.name)) AS applicants ,
    SUBSTR(STRING_AGG(ipcs.code),0,1) AS ipc4,
    STRING_AGG(DISTINCT(title.text)) AS title
  FROM `patents-public-data.patents.publications_201912` AS pub,
     UNNEST(title_localized) AS title,
     UNNEST(assignee_harmonized) as applicants,
     UNNEST(ipc) as ipcs
     GROUP BY appnum,pubnum,appday
)

SELECT 
  pubnum,
  SUBSTR(pubnum,0,2) AS appcountry,
  SUBSTR(STRING_AGG(DISTINCT(CAST(bibtable.appday AS STRING))),0,4) AS appyear,
  COUNT(main.application_number) AS total_cit_count,
  COUNT(DISTINCT(main.family_id)) AS unique_cit_count,
  STRING_AGG(DISTINCT(ipc4)) AS ipcs,
  STRING_AGG(DISTINCT(title)) AS titles,
  STRING_AGG(DISTINCT(applicants)) AS applicants,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'A' THEN 1 ELSE 0 END) as IPC_A,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'B' THEN 1 ELSE 0 END) as IPC_B,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'C' THEN 1 ELSE 0 END) as IPC_C,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'D' THEN 1 ELSE 0 END) as IPC_D,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'E' THEN 1 ELSE 0 END) as IPC_E,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'F' THEN 1 ELSE 0 END) as IPC_F,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'G' THEN 1 ELSE 0 END) as IPC_G,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'H' THEN 1 ELSE 0 END) as IPC_H,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'US' THEN 1 ELSE 0 END) as US,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'JP' THEN 1 ELSE 0 END) as JP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'EP' THEN 1 ELSE 0 END) as EP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'KR' THEN 1 ELSE 0 END) as KR,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'CN' THEN 1 ELSE 0 END) as CN,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'WO' THEN 1 ELSE 0 END) as WO
FROM 
  `patents-public-data.patents.publications_201912` as main,
  UNNEST(main.citation) AS cit,
  UNNEST(main.ipc) AS ipcs
LEFT JOIN bibtable
  ON cit.publication_number = bibtable.pubnum
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE
GROUP BY appnum,pubnum
ORDER BY total_cit_count DESC,pubnum DESC

Another SQL version for the population: ARRAY_AGG and STRUCT may be cleaner
WITH bibtable as (
  SELECT
    pub.application_number AS appnum,
    pub.publication_number AS pubnum,
    pub.filing_date as appday,
    STRING_AGG(DISTINCT(applicants.name)) AS applicants ,
    STRING_AGG(DISTINCT(title.text)) AS texts,
    SUBSTR(STRING_AGG(ipcs.code),0,1) AS ipc4,
    STRING_AGG(DISTINCT(title.text)) AS title
  FROM `patents-public-data.patents.publications_201912` AS pub,
     UNNEST(title_localized) AS title,
     UNNEST(assignee_harmonized) as applicants,
     UNNEST(ipc) as ipcs
     GROUP BY appnum,pubnum,appday
)

SELECT 
  bibtable.pubnum,
  SUBSTR(bibtable.pubnum,0,2) AS appcountry,
  SUBSTR(STRING_AGG(DISTINCT(CAST(bibtable.appday AS STRING))),0,4) AS appyear,
  #STRING_AGG(DISTINCT(cit.application_number)) as appnum,
  #STRING_AGG(DISTINCT(cit.type)) AS cit_type,
  COUNT(main.application_number) AS total_cit_count,
  COUNT(DISTINCT(main.family_id)) AS unique_cit_count,
  STRING_AGG(DISTINCT(ipc4)) AS ipcs,
  STRING_AGG(DISTINCT(title)) AS titles,
  STRING_AGG(DISTINCT(applicants)) AS applicants,
  ARRAY_AGG(STRUCT(SUBSTR(ipcs.code,0,1) AS ipc_sec,
                   SUBSTR(main.publication_number,0,2) AS cc,
                   SUBSTR(CAST(main.filing_date AS STRING),0,4) AS year,
                   appls)
            ) AS fcit,
   #STRING_AGG(cit.type),STRING_AGG(DISTINCT(family_id))
FROM 
  `patents-public-data.patents.publications_201912` as main,
  UNNEST(main.citation) AS cit,
  UNNEST(main.ipc) AS ipcs,
  UNNEST(main.assignee) AS appls
LEFT JOIN bibtable
  ON cit.publication_number = bibtable.pubnum
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE
GROUP BY appnum,pubnum
ORDER BY total_cit_count DESC,pubnum DESC

0. Output result (top 10)

The execution time is about 5 minutes. A total of 30,262,060 documents were extracted. Below are the top 10 citations.

Row pubnum appcountry appyear total_cit_count unique_cit_count ipcs titles applicants IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
1 US-4683202-A US 1985 8996 4766 C Process for amplifying nucleic acid sequences CETUS CORP 1602 148 6116 13 1 12 1083 21 5542 0 1701 1 1 1751
2 US-4816567-A US 1983 8668 4748 C Recombinant immunoglobin preparations GENENTECH INC 3319 27 4354 8 0 3 954 3 4387 0 1878 0 1 2402
3 US-4683195-A US 1986 7174 3763 C Process for amplifying, detecting, and/or-cloning nucleic acid sequences CETUS CORP 1115 154 4853 1 1 11 1021 18 4593 0 1355 1 1 1224
4 US-5223409-A US 1991 4956 2523 C Directed evolution of novel binding proteins PROTEIN ENG CORP 1598 17 2833 10 0 2 493 3 2629 2 1092 1 0 1232
5 US-5523520-A US 1994 4759 3761 C Mutant dwarfism gene of petunia GOLDSMITH SEEDS INC 4585 1 162 0 2 1 5 3 4759 0 0 0 0 0
6 US-4946778-A US 1989 4657 2474 C Single polypeptide chain binding molecules GENEX CORP 1655 9 2436 1 0 1 554 1 2521 1 1146 1 0 988
7 US-5585089-A US 1995 4604 2520 A Humanized immunoglobulins PROTEIN DESIGN LABS INC 1829 38 2306 0 0 1 430 0 2337 1 976 0 2 1288
8 US-5530101-A US 1990 4036 1987 A Humanized immunoglobulins PROTEIN DESIGN LABS INC 1776 9 1920 0 0 1 330 0 2564 2 672 0 4 794
9 US-5892900-A US 1996 4035 1944 G Systems and methods for secure transaction management and electronic rights protection INTERTRUST TECH CORP 53 26 3 0 1 1 2886 1065 3955 2 19 4 29 26
10 US-2003189401-A1 US 2003 3849 1958 C Organic electroluminescent device INT MFG & ENG SERVICES CO LTD 2 14 33 1 1 6 759 3033 3820 1 16 3 1 8

The number one in the world is US4683202A "Process for amplifying nucleic acid sequences". Is it a biotechnology application? Second place is like that.

Save the results as a BigQuery table. Call bigquery from pythono and try to visualize the result with pandas.

result

1. Whole

Distribution of citation counts

code
from google.cloud import bigquery
project_id = '~~~~~'
client = bigquery.Client(project=project_id)

#Query
query =  """
                SELECT total_cit_count,count(*) AS fcit_count
                FROM `bqml_tutorial.fcitdata`
                GROUP BY total_cit_count
                ORDER BY total_cit_count DESC
         """
#Receive at pandas.
df_count_fcit = client.query(query).to_dataframe()

#plot

import plotly.express as px
px.histogram(df_count_fcit,x='total_cit_count')

Looking at this, I'm worried that there is one mountain per 3000 times, rather than being gentle in the order of the number of citations. This tendency is the same in the histogram of the number of citations without duplication (there is one mountain per 1800 times).

image.png

Distribution of cited ages

code
#The part to get the data in the dataframe is only the column name change, so it is omitted.
fig = px.bar(df_count_fcit[df_count_fcit['appyear'].astype(int)>1990],x='appyear',y='count')
fig.show()

image.png

Which country's application is most cited?

code
SELECT appcountry,SUM(unique_cit_count) AS count
FROM `bqml_tutorial.fcitdata`
GROUP BY appcountry
ORDER BY count DESC
fig = px.bar(df_count_fcit[0:10].sort_values(by='count'),y='appcountry',x='count',orientation='h',text='count')
fig.show()

newplot (34).png

Applicants with many citations

code
SELECT applicants,COUNT(*) AS count
FROM `bqml_tutorial.fcitdata`
GROUP BY applicants
ORDER BY count DESC
fig = px.bar(df_count_fcit[0:10].sort_values(by='count'),y='applicants',x='count',orientation='h',text='count')
fig.show()

newplot (35).png

2. By technical field

code
SELECT * FROM (
  SELECT 
    ipcs,
    row_number() over (partition by ipcs order by unique_cit_count DESC) AS rank,
    pubnum,
    total_cit_count,unique_cit_count,
    applicants,
    titles
  FROM `bqml_tutorial.fcitdata`
  GROUP BY ipcs,pubnum,total_cit_count,unique_cit_count,titles,applicants
  )
WHERE rank <= 5

Almost US

List
section rank pubnum total_cit_count unique_cit_count applicants titles
A 1 US-5585089-A 4604 2520 PROTEIN DESIGN LABS INC Humanized immunoglobulins
A 2 US-5530101-A 4036 1987 PROTEIN DESIGN LABS INC Humanized immunoglobulins
A 3 US-4658085-A 2222 1829 UNIV GUELPH Hybridization using cytoplasmic male sterility, cytoplasmic herbicide tolerance, and herbicide tolerance from nuclear genes
A 4 US-5693762-A 3455 1822 PROTEIN DESIGN LABS INC Humanized immunoglobulins
A 5 US-5569825-A 3082 1730 GENPHARM INT Transgenic non-human animals capable of producing heterologous antibodies of various isotypes
B 1 US-2006113549-A1 3391 1756 TOKYO INST TECH Light-emitting device
B 2 US-2010092800-A1 3150 1678 CANON KK Substrate for growing wurtzite type crystal and method for manufacturing the same and semiconductor device
B 3 US-6336137-B1 1505 940 SIEBEL SYSTEMS INC Web client-server system and method for incompatible page markup and presentation languages
B 4 US-4723129-A 1489 928 CANON KK Bubble jet recording method and apparatus in which a heating element generates bubbles in a liquid flow path to project droplets
B 5 US-6766817-B2 1856 908 TUBARC TECHNOLOGIES LLC Fluid conduction utilizing a reversible unsaturated siphon with tubarc porosity action
C 1 US-4683202-A 8996 4766 CETUS CORP Process for amplifying nucleic acid sequences
C 2 US-4816567-A 8668 4748 GENENTECH INC Recombinant immunoglobin preparations
C 3 US-4683195-A 7174 3763 CETUS CORP Process for amplifying, detecting, and/or-cloning nucleic acid sequences
C 4 US-5523520-A 4759 3761 GOLDSMITH SEEDS INC Mutant dwarfism gene of petunia
C 5 US-5223409-A 4956 2523 PROTEIN ENG CORP Directed evolution of novel binding proteins
D 1 US-3849241-A 1577 926 EXXON RESEARCH ENGINEERING CO Non-woven mats by melt blowing
D 2 US-4340563-A 1491 906 KIMBERLY CLARK CO Method for forming nonwoven webs
D 3 US-3802817-A 1282 805 ASAHI CHEMICAL IND Apparatus for producing non-woven fleeces
D 4 US-3692618-A 1236 802 METALLGESELLSCHAFT AG Continuous filament nonwoven web
D 5 US-4100324-A 1344 775 KIMBERLY CLARK CO Nonwoven fabric and method of producing same
e 1 US-7174579-B1 23 13 BAUZA PEDRO Temperature display system
E 1 US-4902508-A 1159 493 PURDUE RESEARCH FOUNDATION Tissue graft composition
E 2 CN-201962688-U 392 391 FIRST ENGINEERING COMPANY OF CCCC FOURTH HARBOR ENGINEERING CO LTD Diaphram wall wharf pile foundation structure,Underground wall structure
E 3 US-5809415-A 720 352 OPENWAVE SYS INC Method and architecture for an interactive two-way data communication network
E 4 US-5445304-A 709 312 UNITED STATES SURGICAL CORP Safety device for a surgical stapler cartridge
E 5 US-7098794-B2 618 279 KIMBERLY CLARK CO Deactivating a data tag for user privacy or tamper-evident packaging
F 1 US-7213940-B1 917 506 LED LIGHTING FIXTURES INC Lighting device and lighting method
F 2 US-5037397-A 895 384 MEDICAL DISTRIBUTORS INC Universal clamp
F 3 US-2010327766-A1 611 378 LEVINE DAVID B,RECKER MICHAEL V Wireless emergency lighting system
F 4 US-6577073-B2 569 367 MATSUSHITA ELECTRIC IND CO LTD Led lamp
F 5 US-6432098-B1 661 361 PROCTER & GAMBLE Absorbent article fastening device
G 1 US-5892900-A 4035 1944 INTERTRUST TECH CORP Systems and methods for secure transaction management and electronic rights protection
G 2 US-2007194379-A1 3592 1833 JAPAN SCIENCE & TECH AGENCY Amorphous Oxide And Thin Film Transistor
G 3 US-5731856-A 3571 1795 SAMSUNG ELECTRONICS CO LTD Methods for forming liquid crystal displays including thin film transistors and gate pads having a particular structure
G 4 US-2006208977-A1 3373 1746 SEMICONDUCTOR ENERGY LAB Semiconductor device, and display device, driving method and electronic apparatus thereof
G 5 US-2009280600-A1 3163 1727 JAPAN SCIENCE & TECH AGENCY Amorphous oxide and thin film transistor
H 1 US-2006244107-A1 3493 1844 SUGIHARA TOSHINORI,KAWASAKI MASASHI,OHNO HIDEO Semiconductor device, manufacturing method, and electronic device
H 2 US-7674650-B2 3529 1837 SEMICONDUCTOR ENERGY LAB Semiconductor device and manufacturing method thereof
H 3 US-2006108636-A1 3475 1777 TOKYO INST TECH Amorphous oxide and field effect transistor
H 4 US-7061014-B2 3500 1776 JAPAN SCIENCE & TECH AGENCY Natural-superlattice homologous single crystal thin film, method for preparation thereof, and device using said single crystal thin film
H 5 US-6294274-B1 3391 1771 TDK CORP,KAWAZOE HIROSHI
J 1 US-7693341-B2 36 25 APPLE INC Workflows for color correcting images
J 2 US-7847532-B2 20 13 ASTEC INT LTD Centralized controller and power manager for on-board power systems
J 3 US-7364473-B2 11 7 FUJITSU LTD Connector for electronic device
J 4 US-7239379-B2 1 1 TECHNOLOGY INNOVATIONS LLC Method and apparatus for determining a vertical intensity profile through a plane of focus in a confocal microscope
K 1 US-7208984-B1 18 9 LINEAR TECHN INC CMOS driver with minimum shoot-through current
K 2 US-7287321-B2 13 7 DENSO CORP Multi-layer board manufacturing method
K 3 US-7233174-B2 10 4 TEXAS INSTRUMENTS INC Dual polarity, high input voltage swing comparator using MOS input transistors
K 4 US-7752749-B2 2 1 PANASONIC CORP Electronic component mounting method and electronic component mounting device
M 1 US-8233880-B2 21 16 JOHNSON GARTH,GLOBAL TEL LINK CORP Integration of cellular phone detection and reporting into a prison telephone system
N 1 US-8141117-B1 16 10 LOBECK MATTHEW R,ARRIS GROUP INC,CHRISTENSEN KORY D,CONINGSBY DONNA JO,BROADUS CHARLES R,KELLUM JOHN M PC media center and extension device for interfacing with a personal video recorder through a home network
N 2 US-7258343-B2 15 9 BANDAI AMERICA INC Card game and methods of play
N 3 US-8134758-B2 4 2 UEDA HIDENORI,NAKAISHI YOSHIAKI,OKI DATA KK Image reading apparatus, image forming apparatus, image forming system that employs the image reading apparatus and the image forming apparatus
N 4 US-8198132-B2 2 1 GALERA MANOLITO,ALABIN LEOCADIO MORONA,FAIRCHILD SEMICONDUCTOR Isolated stacked die semiconductor packages
O 1 US-8306908-B1 36 20 WEST CORP,PETTAY MARK J,JOHNSON ROBERT E,KEMPKES RODNEY J,BARKER THOMAS B,STRUBBE TODD B Methods and apparatus for intelligent selection of goods and services in telephonic and electronic commerce
Q 1 US-7689487-B1 57 36 AMAZON COM INC Computer-assisted funds transfer system
Q 2 US-8170916-B1 41 18 AHMED WAQAS,MANOLACHE FLORIN V,AMAZON TECH INC,MONGRAIN SCOTT ALLEN,MUNTEANU VALENTIN RADU,RUDEANU CORNELIU GABRIEL ALEXANDRU,WILSON AARON D,DICKER RUSSELL A,ROSCA VAL DAN DAR ION I Related-item tag suggestions
Q 3 US-7130826-B1 5 3 IBM Method and apparatus for conducting coinless transactions
Q 4 US-8190451-B2 2 2 DANCHA LYNNE A,TAN AGNES W H,LINDQUIST TAMMIE J,LLOYD KAREN D,GROUP HEALTH PLAN INC,KOOPMEINERS MICHAEL Method and computer program product for predicting and minimizing future behavioral health-related hospital admissions
R 1 US-7360912-B1 42 30 PASS & SEYMOUR INC Electrical device with lamp module
R 2 US-7196508-B2 24 15 MIRAE CORP Handler for testing semiconductor devices
R 3 US-8052455-B1 10 7 HONGFUJIN PREC IND SHENZHEN,HON HAI PREC IND CO LTD Mounting apparatus for flash drive
R 4 US-8128432-B2 10 5 REVOL DIDIER,LAROCHE VINCENT,FORATIER NATHALIE,JAOUEN JEAN-MARC,LEGRAND SNC,COUSY JEAN-PIERRE,LEGRAND FRANCE Insert and method of assembling such an insert
R 5 US-8002580-B2 5 2 ANDREW LLC Coaxial cable crimp connector
V 1 US-7692056-B2 1 1 SHANGHAI RES INST PETROCHEMICAL TECHNOLOGY SINOPEC Process for producing lower olefins from methanol or dimethylether
Y 1 US-8471523-B2 1 1 MULTI FUNCTION CO LTD,LIN WEI-JONG Charging/discharging device having concealed universal serial bus plug(s)

3. By country

Extract the top 5 by country. It is interesting that there are quite a few differences in each country. Japan is all about semiconductors.

code
SELECT *
FROM `bqml_tutorial.fcitdata`
WHERE appcountry = 'WO'# OR appcountry = 'JP' OR appcountry = 'EP' OR appcountry = 'CN' OR appcountry = 'KR' OR appcountry = 'WO'
ORDER BY unique_cit_count DESC
LIMIT 5
List
Row pubnum appcountry appyear total_cit_count unique_cit_count ipcs titles applicants IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
1 JP-2002076356-A JP 2000 3084 1758 H Semiconductor device JAPAN SCIENCE & TECH CORP 1 4 11 2 0 2 691 2373 3017 43 6 4 5 9
2 JP-2004103957-A JP 2002 3110 1746 H Transparent thin film field effect transistor using a homologous thin film as an active layer,Transparent thin film field effect type transistor using homologous thin film as active layer JAPAN SCIENCE & TECH CORP,OTA HIROMICHI 1 4 18 1 0 2 688 2396 2912 93 8 90 0 7
3 JP-2003086808-A JP 2001 3066 1742 H Thin film transistor and matrix display,Thin film transistor and matrix display device SHARP KK,KAWASAKI MASASHI,ONO HIDEO 1 3 14 1 0 2 683 2362 2950 96 0 7 1 12
4 JP-2002289859-A JP 2001 3060 1734 H Thin film transistor,Thin-film transistor MINOLTA CO LTD 1 3 11 1 0 2 685 2357 2979 57 2 10 6 6
5 JP-2000044236-A JP 1998 3028 1722 H Articles having a transparent conductive oxide thin film and its manufacturing method,Article having transparent conductive oxide thin film and its production HOYA CORP 1 7 20 1 0 2 680 2317 2991 13 12 3 3 6
applicants IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
1 US-4683202-A US 1985 8996 4766 C Process for amplifying nucleic acid sequences CETUS CORP 1602 148 6116 13 1 12 1083 21 5542 0 1701 1 1 1751
2 US-4816567-A US 1983 8668 4748 C Recombinant immunoglobin preparations GENENTECH INC 3319 27 4354 8 0 3 954 3 4387 0 1878 0 1 2402
3 US-4683195-A US 1986 7174 3763 C "Process for amplifying, detecting, and/or-cloning nucleic acid sequences" CETUS CORP 1115 154 4853 1 1 11 1021 18 4593 0 1355 1 1 1224
4 US-5523520-A US 1994 4759 3761 C Mutant dwarfism gene of petunia GOLDSMITH SEEDS INC 4585 1 162 0 2 1 5 3 4759 0 0 0 0 0
5 US-5223409-A US 1991 4956 2523 C Directed evolution of novel binding proteins PROTEIN ENG CORP 1598 17 2833 10 0 2 493 3 2629 2 1092 1 0 1232
1 EP-2226847-A2 EP 2005 2388 1510 H "Amorphous oxide and thin film transistor,Oxyde amorphe et transistor à couche mince,Amorpher Oxid- und Dünnschichttransistor" JAPAN SCIENCE & TECH AGENCY 1 3 9 0 0 2 594 1779 2387 0 0 1 0 0
2 EP-0239400-A2 EP 1987 1789 1190 C "Recombinant antibodies and methods for their production,Anticorps recombinants et leurs procédés de production,Rekombinante Antikörper und Verfahren zu deren Herstellung" WINTER GREGORY PAUL 593 1 1066 0 0 0 129 0 572 0 490 1 2 724
3 EP-1737044-A1 EP 2005 1662 1187 H "Oxyde amorphe et transistor film mince,Amorph-oxid- und dünnfilmtransistor,Amorphous oxide and thin film transistor" JAPAN SCIENCE & TECH AGENCY 0 3 7 0 0 2 452 1198 1651 0 4 0 1 6
4 EP-0404097-A2 EP 1990 1568 1152 A "Récepteurs mono- et oligovalents, bispécifiques et oligospécifiques, ainsi que leur production et application,Bispezifische und oligospezifische, mono- und oligovalente Rezeptoren, ihre Herstellung und Verwendung,Bispecific and oligospecific, mono- and oligovalent receptors, production and applications thereof" BEHRINGWERKE AG 505 6 910 0 0 0 147 0 262 0 495 0 0 811
5 EP-1737044-B1 EP 2005 1339 934 H "Amorphes oxid und dünnfilmtransistor,Amorphous oxide and thin film transistor,Oxyde amorphe et transistor film mince" JAPAN SCIENCE & TECH AGENCY 1 0 4 1 0 0 236 1097 1339 0 0 0 0 0
1 CN-201962688-U CN 2010 392 391 E "Diaphram wall wharf pile foundation structure,Underground wall structure" FIRST ENGINEERING COMPANY OF CCCC FOURTH HARBOR ENGINEERING CO LTD 0 0 0 0 392 0 0 0 0 0 0 0 392 0
2 CN-1895777-A CN 2005 297 296 B "A kind of sieving sulphate molecular sieving catalyst and its preparation method,Porous molecular-sieve catalyst for assembling carbide and its preparation" UNIV BEIJING CHEMICAL 0 7 290 0 0 0 0 0 0 0 0 0 297 0
3 CN-1634601-A CN 2003 680 288 A "Medical device for one kind of bacteriostatic method,Method for sterilizing medical appliance" JILIN PROVINCE ZHONGLI IND CO 631 8 0 0 0 4 6 31 677 0 0 0 3 0
4 CN-1262969-A CN 2000 290 288 B "Catalyst using TiO2 as carrier to load metal nitride Mo2N,TiO" UNIV NANKAI 0 7 283 0 0 0 0 0 0 0 0 0 290 0
5 CN-1470327-A CN 2002 286 286 C "Metal nitride catalyst preparing method and catalyst,A kind of metal catalyst catalyst preparation method and catalyst catalyst" CHINA PETROCHEMICAL CORP 0 7 279 0 0 0 0 0 0 0 0 0 286 0
1 KR-20000051826-A KR 1999 192 179 C "New organomattalic complex molecule for the fabrication of organic light emitting diodes,신규한 착물 및 그의 제조 방법과 이를 이용한 유기 발광 소자" LG CHEMICAL LTD 0 0 162 0 0 0 0 30 12 0 0 168 0 12
2 KR-20110003229-A KR 2009 379 175 A "하이브리드 수술용 로봇 시스템 및 수술용 로봇 제어방법,Hybrid surgical robot system and control method thereof" ETERNE INC 360 6 0 0 0 1 7 5 366 0 1 4 1 7
3 KR-20000074034-A KR 1999 202 145 H "Ultra-slim Repeater with Variable Attenuator,케이블 손실 보상이 가능한 초소형 중계기" ACE TECH 0 0 0 0 0 0 10 192 197 0 0 4 0 1
4 KR-20120009843-A KR 2010 168 142 G "이동단말기 및 그의 어플리케이션 공유 방법,Mobile terminal and method for sharing applications thereof" LG ELECTRONICS INC 0 0 0 0 0 0 17 151 155 0 0 3 2 8
5 KR-20190000980-A KR 2017 148 141 H "A personal Healthcare System,퍼스널 헬스케어 시스템" LEE DONG WON 15 38 0 1 15 24 19 36 1 0 0 147 0 0
1 WO-2004114391-A1 WO 2004 2859 1667 H "Dispositif a semi-conducteur et son procede de production, et dispositif electronique,Semiconductor device, its manufacturing method, and electronic device,Semiconductor devices, their manufacturing methods, and electronic devices" "KAWASAKI MASASHI,SHARP KK,SUGIHARA TOSHINORI,OHNO HIDEO" 1 3 10 1 0 2 666 2176 2843 7 4 1 1 3
2 WO-9201047-A1 WO 1991 2421 1413 C "Methods for producing members of specific binding pairs,Procede de production de chainon de paires a liaison specifique" "CAMBRIDGE ANTIBODY TECH,MEDICAL RES COUNCIL" 797 2 1410 0 0 0 212 0 850 2 780 1 4 784
3 WO-9633735-A1 WO 1996 1764 1023 C "Anticorps humains derives d'une xenosouris immunisee,Human antibodies derived from immunized xenomice" CELL GENESYS INC 645 2 1003 0 0 0 114 0 616 29 510 2 1 606
4 WO-9634096-A1 WO 1995 1712 1022 C "Anticorps humains derives de xeno-souris immunisees,Human antibodies derived from immunized xenomice" CELL GENESYS INC 626 2 977 0 0 0 107 0 553 17 523 2 0 617
5 WO-9307278-A1 WO 1992 1137 995 A "Sequence d'adn synthetique ayant une action insecticide accrue dans le mais,Synthetic dna sequence having enhanced insecticidal activity in maize" CIBA GEIGY AG 515 1 616 0 0 0 5 0 81 0 230 0 1 825

4. Cross-country citation

For example, a pattern in which a US gazette is used in a JP review. This time, we targeted the applications of 6 JPOs, which are the 5 major JPOs, JP, US, CN, EP, KR plus the international patent application (WO). Is extracted. Usually, the reference is written in the language of the country, but if there is no document with similar technology, foreign documents may be brought in. Looking at the country, most of them are in the US, but it seems that there are many telecommunications systems (G, H). Impression that there are many Japanese companies. Is it easy to use as a technical document?

code
SELECT 
  pubnum, 
  ipcs,
  total_cit_count,unique_cit_count,
  applicants,
  titles,
  US,JP,EP,CN,KR,WO
  FROM `bqml_tutorial.fcitdata`
  WHERE US>0 AND JP>0 AND EP >0 AND CN>0 AND KR >0 AND WO >0
  #GROUP BY ipcs,pubnum,total_cit_count,unique_cit_count,titles,applicants
  ORDER BY total_cit_count DESC
  LIMIT 10

List
pubnum ipcs total_cit_count unique_cit_count applicants titles US JP EP CN KR WO
US-5892900-A G 4035 1944 INTERTRUST TECH CORP Systems and methods for secure transaction management and electronic rights protection 3955 2 19 29 4 26
US-2003189401-A1 C 3849 1958 INT MFG & ENG SERVICES CO LTD Organic electroluminescent device 3820 1 16 1 3 8
US-2007194379-A1 G 3592 1833 JAPAN SCIENCE & TECH AGENCY Amorphous Oxide And Thin Film Transistor 3574 1 7 1 2 7
US-2007108446-A1 H 3501 1758 SEMICONDUCTOR ENERGY LAB Semiconductor device and manufacturing method thereof 3464 4 11 7 13 2
US-2006244107-A1 H 3493 1844 SUGIHARA TOSHINORI,KAWASAKI MASASHI,OHNO HIDEO Semiconductor device, manufacturing method, and electronic device 3461 2 12 7 3 8
US-2006108636-A1 H 3475 1777 TOKYO INST TECH Amorphous oxide and field effect transistor 3452 6 10 3 1 3
US-2006110867-A1 H 3454 1752 TOKYO INST TECH Field effect transistor manufacturing method 3426 4 9 5 4 6
US-2006113565-A1 H 3439 1763 TOKYO INST TECH Electric elements and circuits utilizing amorphous oxides 3412 4 3 10 2 8
US-2008038882-A1 H 3401 1734 TAKECHI KAZUSHIGE,NAKATA MITSURU Thin-film device and method of fabricating the same 3342 13 8 24 4 10
US-2007090365-A1 H 3398 1742 CANON KK Field-effect transistor including transparent oxide and light-shielding member, and display utilizing the transistor 3384 3 1 8 1 1

It may be easier to understand this area by visualizing it as a network diagram.

5. Correction by time

It may be said that the citations of the new era are disadvantageous because the time has not passed since they were published, and the citations of the old era tend to be more frequently cited, so I corrected it a little. If you really work on it, there is disconnection bias, but that is the process. Is not this time. The US & 2008-2009 applications have come to the top.

code
DECLARE kotoshi INT64;
SET kotoshi = CAST(FORMAT_DATE("%Y",CURRENT_DATE()) AS INT64);

SELECT pubnum,appcountry,appyear,total_cit_count,unique_cit_count,kotoshi - CAST(appyear AS INT64) AS pub_interval, total_cit_count / (kotoshi - CAST(appyear AS INT64)) AS ave_cit,ipcs,	titles
FROM `bqml_tutorial.fcitdata`
ORDER BY ave_cit DESC
LIMIT 20

List
pubnum appcountry appyear total_cit_count unique_cit_count pub_interval ave_cit ipcs titles
US-2009278122-A1 US 2009 3164 1725 11 287.6 H Amorphous oxide and thin film transistor
US-2009280600-A1 US 2009 3163 1727 11 287.5 G Amorphous oxide and thin film transistor
US-2010092800-A1 US 2009 3150 1678 11 286.4 B Substrate for growing wurtzite type crystal and method for manufacturing the same and semiconductor device
US-2010065844-A1 US 2009 3149 1677 11 286.3 H Thin film transistor and method of manufacturing thin film transistor
US-7732819-B2 US 2008 3341 1737 12 278.4 H Semiconductor device and manufacturing method thereof
US-2009134399-A1 US 2008 3303 1708 12 275.3 C Semiconductor Device and Method for Manufacturing the Same
US-2009073325-A1 US 2008 3298 1705 12 274.8 G "Semiconductor device and method for manufacturing the same, and electric device"
US-2008224133-A1 US 2008 3285 1714 12 273.8 H Thin film transistor and organic light-emitting display device having the thin film transistor
US-2008258141-A1 US 2008 3279 1720 12 273.3 H "Thin film transistor, method of manufacturing the same, and flat panel display having the same"
US-2009068773-A1 US 2008 3275 1709 12 272.9 H Method for fabricating pixel structure of active matrix organic light-emitting diode

6. Reference: Non-patent documents with many citations

Since non-patent documents are included in the cited documents, I extracted them (no name identification). Compared to patent documents, it's a lot more complicated to put in ... The number one citation is the paper of the sequencer of the following gene sequence except for strange ones.

ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410

code
SELECT 
  cit.npl_text AS title,
  COUNT(cit.npl_text) AS count,
  STRING_AGG(DISTINCT(SUBSTR(main.publication_number,0,2))) AS cit_country,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'A' THEN 1 ELSE 0 END) as IPC_A,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'B' THEN 1 ELSE 0 END) as IPC_B,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'C' THEN 1 ELSE 0 END) as IPC_C,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'D' THEN 1 ELSE 0 END) as IPC_D,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'E' THEN 1 ELSE 0 END) as IPC_E,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'F' THEN 1 ELSE 0 END) as IPC_F,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'G' THEN 1 ELSE 0 END) as IPC_G,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'H' THEN 1 ELSE 0 END) as IPC_H,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'US' THEN 1 ELSE 0 END) as US,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'JP' THEN 1 ELSE 0 END) as JP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'EP' THEN 1 ELSE 0 END) as EP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'KR' THEN 1 ELSE 0 END) as KR,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'CN' THEN 1 ELSE 0 END) as CN,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'WO' THEN 1 ELSE 0 END) as WO
FROM 
  `patents-public-data.patents.publications_201912` as main,
  UNNEST(main.citation) AS cit,
  UNNEST(main.ipc) AS ipcs
WHERE cit.npl_text NOT IN ('','None','NEANT','NICHTS ERMITTELT','NICHTS-ERMITTELT','No Search','International Search Report.','No further relevant documents disclosed')
  AND ipcs.first = TRUE
GROUP BY cit.npl_text
ORDER BY count DESC

List
title count cit_country IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
"PATENT ABSTRACTS OF JAPAN vol. 2003, no. 12 5 December 2003 (2003-12-05)" 3694 "FR,NL,EP,LU,CH,GR,WO" 317 805 407 60 59 461 819 766 0 0 1922 0 0 1354
PATENT ABSTRACTS OF JAPAN 3117 "WO,FR,US,DE,NL,EP" 367 659 505 17 73 260 703 533 1 0 1258 0 0 1717
"ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410" 2385 "WO,EP" 516 2 1756 1 0 0 109 1 0 0 964 0 0 1421
"Kraft et al., ""Linkage disequilibrium and fingerprinting in sugar beet,"" Theor Appl Genet, 101:323-326, 2000." 2256 US 2193 0 63 0 0 0 0 0 2256 0 0 0 0 0
"SAMBROOK ET AL.: ""Molecular Cloning: A Laboratory Manual"", 1989, COLD SPRING HARBOR LABORATORY PRESS" 2086 "WO,EP" 528 14 1389 0 0 1 153 1 0 0 987 0 0 1099
"JONES ET AL., NATURE, vol. 321, 1986, pages 522 - 525" 1969 "EP,WO" 584 6 1221 0 0 0 158 0 0 0 881 0 0 1088
"WARD ET AL., NATURE, vol. 341, 1989, pages 544 - 546" 1908 "EP,WO" 504 3 1233 0 0 0 168 0 0 0 816 0 0 1092
"BIRD ET AL., SCIENCE, vol. 242, 1988, pages 423 - 426" 1900 "EP,WO" 555 4 1199 0 0 0 142 0 0 0 739 0 0 1161
"CLACKSON ET AL., NATURE, vol. 352, 1991, pages 624 - 628" 1784 "WO,EP" 536 9 1035 0 0 0 204 0 0 0 728 0 0 1056
"Meghji et al., ""Inbreeding depression, inbred and hybrid grain yields, and other traits of maize genotypes representing three eras,"" Crop Science, 24:545-549, 1984." 1754 US 1698 0 56 0 0 0 0 0 1754 0 0 0 0 0

7. Excluding US applicant citations

Since there are many applicant citations in US applications, the WHERE clause part was corrected with the intention of excluding it.

~~Only different parts
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE AND cit.category IN ('EXA','SEA','ISR')

Then, in the following ranking.

Recommended Posts