[PYTHON] Das am häufigsten zitierte Patent der Welt

Einführung

Da die Feiertage begonnen haben, wollte ich eine dringende Analyse durchführen und fragte mich, welches Patentdokument die weltweit höchste Anzahl an Zitaten aufweist.

SQL in BigQuery

Zum ersten Mal beim Online-Lernen habe ich gelernt, dass SQL als quietschend ausgesprochen wird. ..

Schwierig war es, eine Gesamtzahl von Zitaten und Deduplizierungen zu erstellen (außer, dass sie aus Gründen der Ablehnung derselben Anwendung häufig verwendet werden) und ein Zitierzählmuster zu erstellen. Da BigQuery nur Zitierinformationen enthält, wird es zitiert. Der Teil, der Informationen generiert (umgekehrte Richtung). Es ist ein wenig kompliziert, aber es sieht wie folgt aus.

SQL für Population
WITH bibtable as (
  SELECT
    pub.application_number AS appnum,
    pub.publication_number AS pubnum,
    pub.filing_date as appday,
    STRING_AGG(DISTINCT(applicants.name)) AS applicants ,
    SUBSTR(STRING_AGG(ipcs.code),0,1) AS ipc4,
    STRING_AGG(DISTINCT(title.text)) AS title
  FROM `patents-public-data.patents.publications_201912` AS pub,
     UNNEST(title_localized) AS title,
     UNNEST(assignee_harmonized) as applicants,
     UNNEST(ipc) as ipcs
     GROUP BY appnum,pubnum,appday
)

SELECT 
  pubnum,
  SUBSTR(pubnum,0,2) AS appcountry,
  SUBSTR(STRING_AGG(DISTINCT(CAST(bibtable.appday AS STRING))),0,4) AS appyear,
  COUNT(main.application_number) AS total_cit_count,
  COUNT(DISTINCT(main.family_id)) AS unique_cit_count,
  STRING_AGG(DISTINCT(ipc4)) AS ipcs,
  STRING_AGG(DISTINCT(title)) AS titles,
  STRING_AGG(DISTINCT(applicants)) AS applicants,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'A' THEN 1 ELSE 0 END) as IPC_A,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'B' THEN 1 ELSE 0 END) as IPC_B,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'C' THEN 1 ELSE 0 END) as IPC_C,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'D' THEN 1 ELSE 0 END) as IPC_D,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'E' THEN 1 ELSE 0 END) as IPC_E,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'F' THEN 1 ELSE 0 END) as IPC_F,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'G' THEN 1 ELSE 0 END) as IPC_G,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'H' THEN 1 ELSE 0 END) as IPC_H,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'US' THEN 1 ELSE 0 END) as US,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'JP' THEN 1 ELSE 0 END) as JP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'EP' THEN 1 ELSE 0 END) as EP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'KR' THEN 1 ELSE 0 END) as KR,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'CN' THEN 1 ELSE 0 END) as CN,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'WO' THEN 1 ELSE 0 END) as WO
FROM 
  `patents-public-data.patents.publications_201912` as main,
  UNNEST(main.citation) AS cit,
  UNNEST(main.ipc) AS ipcs
LEFT JOIN bibtable
  ON cit.publication_number = bibtable.pubnum
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE
GROUP BY appnum,pubnum
ORDER BY total_cit_count DESC,pubnum DESC

SQL-spezifische Version für die Grundgesamtheit: ARRAY_AGG und STRUCT sind möglicherweise sauberer
WITH bibtable as (
  SELECT
    pub.application_number AS appnum,
    pub.publication_number AS pubnum,
    pub.filing_date as appday,
    STRING_AGG(DISTINCT(applicants.name)) AS applicants ,
    STRING_AGG(DISTINCT(title.text)) AS texts,
    SUBSTR(STRING_AGG(ipcs.code),0,1) AS ipc4,
    STRING_AGG(DISTINCT(title.text)) AS title
  FROM `patents-public-data.patents.publications_201912` AS pub,
     UNNEST(title_localized) AS title,
     UNNEST(assignee_harmonized) as applicants,
     UNNEST(ipc) as ipcs
     GROUP BY appnum,pubnum,appday
)

SELECT 
  bibtable.pubnum,
  SUBSTR(bibtable.pubnum,0,2) AS appcountry,
  SUBSTR(STRING_AGG(DISTINCT(CAST(bibtable.appday AS STRING))),0,4) AS appyear,
  #STRING_AGG(DISTINCT(cit.application_number)) as appnum,
  #STRING_AGG(DISTINCT(cit.type)) AS cit_type,
  COUNT(main.application_number) AS total_cit_count,
  COUNT(DISTINCT(main.family_id)) AS unique_cit_count,
  STRING_AGG(DISTINCT(ipc4)) AS ipcs,
  STRING_AGG(DISTINCT(title)) AS titles,
  STRING_AGG(DISTINCT(applicants)) AS applicants,
  ARRAY_AGG(STRUCT(SUBSTR(ipcs.code,0,1) AS ipc_sec,
                   SUBSTR(main.publication_number,0,2) AS cc,
                   SUBSTR(CAST(main.filing_date AS STRING),0,4) AS year,
                   appls)
            ) AS fcit,
   #STRING_AGG(cit.type),STRING_AGG(DISTINCT(family_id))
FROM 
  `patents-public-data.patents.publications_201912` as main,
  UNNEST(main.citation) AS cit,
  UNNEST(main.ipc) AS ipcs,
  UNNEST(main.assignee) AS appls
LEFT JOIN bibtable
  ON cit.publication_number = bibtable.pubnum
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE
GROUP BY appnum,pubnum
ORDER BY total_cit_count DESC,pubnum DESC

0. Ausgabeergebnis (Top 10)

Die Ausführungszeit beträgt ca. 5 Minuten. Insgesamt wurden 30.262.060 Dokumente extrahiert. Das Folgende sind die Top 10 der zitierten Zeiten.

Row pubnum appcountry appyear total_cit_count unique_cit_count ipcs titles applicants IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
1 US-4683202-A US 1985 8996 4766 C Process for amplifying nucleic acid sequences CETUS CORP 1602 148 6116 13 1 12 1083 21 5542 0 1701 1 1 1751
2 US-4816567-A US 1983 8668 4748 C Recombinant immunoglobin preparations GENENTECH INC 3319 27 4354 8 0 3 954 3 4387 0 1878 0 1 2402
3 US-4683195-A US 1986 7174 3763 C Process for amplifying, detecting, and/or-cloning nucleic acid sequences CETUS CORP 1115 154 4853 1 1 11 1021 18 4593 0 1355 1 1 1224
4 US-5223409-A US 1991 4956 2523 C Directed evolution of novel binding proteins PROTEIN ENG CORP 1598 17 2833 10 0 2 493 3 2629 2 1092 1 0 1232
5 US-5523520-A US 1994 4759 3761 C Mutant dwarfism gene of petunia GOLDSMITH SEEDS INC 4585 1 162 0 2 1 5 3 4759 0 0 0 0 0
6 US-4946778-A US 1989 4657 2474 C Single polypeptide chain binding molecules GENEX CORP 1655 9 2436 1 0 1 554 1 2521 1 1146 1 0 988
7 US-5585089-A US 1995 4604 2520 A Humanized immunoglobulins PROTEIN DESIGN LABS INC 1829 38 2306 0 0 1 430 0 2337 1 976 0 2 1288
8 US-5530101-A US 1990 4036 1987 A Humanized immunoglobulins PROTEIN DESIGN LABS INC 1776 9 1920 0 0 1 330 0 2564 2 672 0 4 794
9 US-5892900-A US 1996 4035 1944 G Systems and methods for secure transaction management and electronic rights protection INTERTRUST TECH CORP 53 26 3 0 1 1 2886 1065 3955 2 19 4 29 26
10 US-2003189401-A1 US 2003 3849 1958 C Organic electroluminescent device INT MFG & ENG SERVICES CO LTD 2 14 33 1 1 6 759 3033 3820 1 16 3 1 8

Die Nummer eins in der Welt ist US4683202A "Verfahren zur Amplifikation von Nukleinsäuresequenzen". Ist es eine biobasierte Anwendung? Der zweite Platz ist so.

Speichern Sie die Ergebnisse als BigQuery-Tabelle. Rufen Sie bigquery von pythono auf und versuchen Sie, das Ergebnis mit Pandas zu visualisieren.

Ergebnis

1. Ganz

Verteilung der Anzahl der Zitate

code
from google.cloud import bigquery
project_id = '~~~~~'
client = bigquery.Client(project=project_id)

#Abfrage
query =  """
                SELECT total_cit_count,count(*) AS fcit_count
                FROM `bqml_tutorial.fcitdata`
                GROUP BY total_cit_count
                ORDER BY total_cit_count DESC
         """
#Bei Pandas empfangen.
df_count_fcit = client.query(query).to_dataframe()

#Handlung

import plotly.express as px
px.histogram(df_count_fcit,x='total_cit_count')

Wenn ich mir das anschaue, mache ich mir Sorgen, dass es einen Berg pro 3000 Mal gibt, anstatt in der Reihenfolge der Anzahl der Zitate sanft zu sein. Diese Tendenz ist im Histogramm der Anzahl der Zitate ohne Verdoppelung dieselbe (es gibt einen Berg pro 1800-mal).

image.png

Verteilung des angegebenen Alters

code
#Der Teil, um die Daten im Datenrahmen abzurufen, ist nur die Änderung des Spaltennamens, daher wird er weggelassen.
fig = px.bar(df_count_fcit[df_count_fcit['appyear'].astype(int)>1990],x='appyear',y='count')
fig.show()

image.png

Welches Land wird am häufigsten zitiert?

code
SELECT appcountry,SUM(unique_cit_count) AS count
FROM `bqml_tutorial.fcitdata`
GROUP BY appcountry
ORDER BY count DESC
fig = px.bar(df_count_fcit[0:10].sort_values(by='count'),y='appcountry',x='count',orientation='h',text='count')
fig.show()

newplot (34).png

Bewerber mit vielen Zitaten

code
SELECT applicants,COUNT(*) AS count
FROM `bqml_tutorial.fcitdata`
GROUP BY applicants
ORDER BY count DESC
fig = px.bar(df_count_fcit[0:10].sort_values(by='count'),y='applicants',x='count',orientation='h',text='count')
fig.show()

newplot (35).png

2. Nach technischem Bereich

code
SELECT * FROM (
  SELECT 
    ipcs,
    row_number() over (partition by ipcs order by unique_cit_count DESC) AS rank,
    pubnum,
    total_cit_count,unique_cit_count,
    applicants,
    titles
  FROM `bqml_tutorial.fcitdata`
  GROUP BY ipcs,pubnum,total_cit_count,unique_cit_count,titles,applicants
  )
WHERE rank <= 5

Fast US

Liste
Sektion rank pubnum total_cit_count unique_cit_count applicants titles
A 1 US-5585089-A 4604 2520 PROTEIN DESIGN LABS INC Humanized immunoglobulins
A 2 US-5530101-A 4036 1987 PROTEIN DESIGN LABS INC Humanized immunoglobulins
A 3 US-4658085-A 2222 1829 UNIV GUELPH Hybridization using cytoplasmic male sterility, cytoplasmic herbicide tolerance, and herbicide tolerance from nuclear genes
A 4 US-5693762-A 3455 1822 PROTEIN DESIGN LABS INC Humanized immunoglobulins
A 5 US-5569825-A 3082 1730 GENPHARM INT Transgenic non-human animals capable of producing heterologous antibodies of various isotypes
B 1 US-2006113549-A1 3391 1756 TOKYO INST TECH Light-emitting device
B 2 US-2010092800-A1 3150 1678 CANON KK Substrate for growing wurtzite type crystal and method for manufacturing the same and semiconductor device
B 3 US-6336137-B1 1505 940 SIEBEL SYSTEMS INC Web client-server system and method for incompatible page markup and presentation languages
B 4 US-4723129-A 1489 928 CANON KK Bubble jet recording method and apparatus in which a heating element generates bubbles in a liquid flow path to project droplets
B 5 US-6766817-B2 1856 908 TUBARC TECHNOLOGIES LLC Fluid conduction utilizing a reversible unsaturated siphon with tubarc porosity action
C 1 US-4683202-A 8996 4766 CETUS CORP Process for amplifying nucleic acid sequences
C 2 US-4816567-A 8668 4748 GENENTECH INC Recombinant immunoglobin preparations
C 3 US-4683195-A 7174 3763 CETUS CORP Process for amplifying, detecting, and/or-cloning nucleic acid sequences
C 4 US-5523520-A 4759 3761 GOLDSMITH SEEDS INC Mutant dwarfism gene of petunia
C 5 US-5223409-A 4956 2523 PROTEIN ENG CORP Directed evolution of novel binding proteins
D 1 US-3849241-A 1577 926 EXXON RESEARCH ENGINEERING CO Non-woven mats by melt blowing
D 2 US-4340563-A 1491 906 KIMBERLY CLARK CO Method for forming nonwoven webs
D 3 US-3802817-A 1282 805 ASAHI CHEMICAL IND Apparatus for producing non-woven fleeces
D 4 US-3692618-A 1236 802 METALLGESELLSCHAFT AG Continuous filament nonwoven web
D 5 US-4100324-A 1344 775 KIMBERLY CLARK CO Nonwoven fabric and method of producing same
e 1 US-7174579-B1 23 13 BAUZA PEDRO Temperature display system
E 1 US-4902508-A 1159 493 PURDUE RESEARCH FOUNDATION Tissue graft composition
E 2 CN-201962688-U 392 391 FIRST ENGINEERING COMPANY OF CCCC FOURTH HARBOR ENGINEERING CO LTD Diaphram wall wharf pile foundation structure,Struktur der unterirdischen Verbindung 续 墙 头 桩 桀
E 3 US-5809415-A 720 352 OPENWAVE SYS INC Method and architecture for an interactive two-way data communication network
E 4 US-5445304-A 709 312 UNITED STATES SURGICAL CORP Safety device for a surgical stapler cartridge
E 5 US-7098794-B2 618 279 KIMBERLY CLARK CO Deactivating a data tag for user privacy or tamper-evident packaging
F 1 US-7213940-B1 917 506 LED LIGHTING FIXTURES INC Lighting device and lighting method
F 2 US-5037397-A 895 384 MEDICAL DISTRIBUTORS INC Universal clamp
F 3 US-2010327766-A1 611 378 LEVINE DAVID B,RECKER MICHAEL V Wireless emergency lighting system
F 4 US-6577073-B2 569 367 MATSUSHITA ELECTRIC IND CO LTD Led lamp
F 5 US-6432098-B1 661 361 PROCTER & GAMBLE Absorbent article fastening device
G 1 US-5892900-A 4035 1944 INTERTRUST TECH CORP Systems and methods for secure transaction management and electronic rights protection
G 2 US-2007194379-A1 3592 1833 JAPAN SCIENCE & TECH AGENCY Amorphous Oxide And Thin Film Transistor
G 3 US-5731856-A 3571 1795 SAMSUNG ELECTRONICS CO LTD Methods for forming liquid crystal displays including thin film transistors and gate pads having a particular structure
G 4 US-2006208977-A1 3373 1746 SEMICONDUCTOR ENERGY LAB Semiconductor device, and display device, driving method and electronic apparatus thereof
G 5 US-2009280600-A1 3163 1727 JAPAN SCIENCE & TECH AGENCY Amorphous oxide and thin film transistor
H 1 US-2006244107-A1 3493 1844 SUGIHARA TOSHINORI,KAWASAKI MASASHI,OHNO HIDEO Semiconductor device, manufacturing method, and electronic device
H 2 US-7674650-B2 3529 1837 SEMICONDUCTOR ENERGY LAB Semiconductor device and manufacturing method thereof
H 3 US-2006108636-A1 3475 1777 TOKYO INST TECH Amorphous oxide and field effect transistor
H 4 US-7061014-B2 3500 1776 JAPAN SCIENCE & TECH AGENCY Natural-superlattice homologous single crystal thin film, method for preparation thereof, and device using said single crystal thin film
H 5 US-6294274-B1 3391 1771 TDK CORP,KAWAZOE HIROSHI
J 1 US-7693341-B2 36 25 APPLE INC Workflows for color correcting images
J 2 US-7847532-B2 20 13 ASTEC INT LTD Centralized controller and power manager for on-board power systems
J 3 US-7364473-B2 11 7 FUJITSU LTD Connector for electronic device
J 4 US-7239379-B2 1 1 TECHNOLOGY INNOVATIONS LLC Method and apparatus for determining a vertical intensity profile through a plane of focus in a confocal microscope
K 1 US-7208984-B1 18 9 LINEAR TECHN INC CMOS driver with minimum shoot-through current
K 2 US-7287321-B2 13 7 DENSO CORP Multi-layer board manufacturing method
K 3 US-7233174-B2 10 4 TEXAS INSTRUMENTS INC Dual polarity, high input voltage swing comparator using MOS input transistors
K 4 US-7752749-B2 2 1 PANASONIC CORP Electronic component mounting method and electronic component mounting device
M 1 US-8233880-B2 21 16 JOHNSON GARTH,GLOBAL TEL LINK CORP Integration of cellular phone detection and reporting into a prison telephone system
N 1 US-8141117-B1 16 10 LOBECK MATTHEW R,ARRIS GROUP INC,CHRISTENSEN KORY D,CONINGSBY DONNA JO,BROADUS CHARLES R,KELLUM JOHN M PC media center and extension device for interfacing with a personal video recorder through a home network
N 2 US-7258343-B2 15 9 BANDAI AMERICA INC Card game and methods of play
N 3 US-8134758-B2 4 2 UEDA HIDENORI,NAKAISHI YOSHIAKI,OKI DATA KK Image reading apparatus, image forming apparatus, image forming system that employs the image reading apparatus and the image forming apparatus
N 4 US-8198132-B2 2 1 GALERA MANOLITO,ALABIN LEOCADIO MORONA,FAIRCHILD SEMICONDUCTOR Isolated stacked die semiconductor packages
O 1 US-8306908-B1 36 20 WEST CORP,PETTAY MARK J,JOHNSON ROBERT E,KEMPKES RODNEY J,BARKER THOMAS B,STRUBBE TODD B Methods and apparatus for intelligent selection of goods and services in telephonic and electronic commerce
Q 1 US-7689487-B1 57 36 AMAZON COM INC Computer-assisted funds transfer system
Q 2 US-8170916-B1 41 18 AHMED WAQAS,MANOLACHE FLORIN V,AMAZON TECH INC,MONGRAIN SCOTT ALLEN,MUNTEANU VALENTIN RADU,RUDEANU CORNELIU GABRIEL ALEXANDRU,WILSON AARON D,DICKER RUSSELL A,ROSCA VAL DAN DAR ION I Related-item tag suggestions
Q 3 US-7130826-B1 5 3 IBM Method and apparatus for conducting coinless transactions
Q 4 US-8190451-B2 2 2 DANCHA LYNNE A,TAN AGNES W H,LINDQUIST TAMMIE J,LLOYD KAREN D,GROUP HEALTH PLAN INC,KOOPMEINERS MICHAEL Method and computer program product for predicting and minimizing future behavioral health-related hospital admissions
R 1 US-7360912-B1 42 30 PASS & SEYMOUR INC Electrical device with lamp module
R 2 US-7196508-B2 24 15 MIRAE CORP Handler for testing semiconductor devices
R 3 US-8052455-B1 10 7 HONGFUJIN PREC IND SHENZHEN,HON HAI PREC IND CO LTD Mounting apparatus for flash drive
R 4 US-8128432-B2 10 5 REVOL DIDIER,LAROCHE VINCENT,FORATIER NATHALIE,JAOUEN JEAN-MARC,LEGRAND SNC,COUSY JEAN-PIERRE,LEGRAND FRANCE Insert and method of assembling such an insert
R 5 US-8002580-B2 5 2 ANDREW LLC Coaxial cable crimp connector
V 1 US-7692056-B2 1 1 SHANGHAI RES INST PETROCHEMICAL TECHNOLOGY SINOPEC Process for producing lower olefins from methanol or dimethylether
Y 1 US-8471523-B2 1 1 MULTI FUNCTION CO LTD,LIN WEI-JONG Charging/discharging device having concealed universal serial bus plug(s)

3. Nach Land

Extrahieren Sie die Top 5 nach Ländern. Es ist interessant, dass es in jedem Land einige Unterschiede gibt. In Japan dreht sich alles um Halbleiter.

code
SELECT *
FROM `bqml_tutorial.fcitdata`
WHERE appcountry = 'WO'# OR appcountry = 'JP' OR appcountry = 'EP' OR appcountry = 'CN' OR appcountry = 'KR' OR appcountry = 'WO'
ORDER BY unique_cit_count DESC
LIMIT 5
Liste
Row pubnum appcountry appyear total_cit_count unique_cit_count ipcs titles applicants IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
1 JP-2002076356-A JP 2000 3084 1758 H Halbleiterbauelement JAPAN SCIENCE & TECH CORP 1 4 11 2 0 2 691 2373 3017 43 6 4 5 9
2 JP-2004103957-A JP 2002 3110 1746 H Transparenter elektrischer Feldeffekttransistor mit Dünnschicht, der einen homologen Dünnfilm als aktive Schicht verwendet,Transparent thin film field effect type transistor using homologous thin film as active layer JAPAN SCIENCE & TECH CORP,OTA HIROMICHI 1 4 18 1 0 2 688 2396 2912 93 8 90 0 7
3 JP-2003086808-A JP 2001 3066 1742 H Thin film transistor and matrix display,Dünnschicht- und Matrixanzeigegerät SHARP KK,KAWASAKI MASASHI,ONO HIDEO 1 3 14 1 0 2 683 2362 2950 96 0 7 1 12
4 JP-2002289859-A JP 2001 3060 1734 H Dünner Film,Thin-film transistor MINOLTA CO LTD 1 3 11 1 0 2 685 2357 2979 57 2 10 6 6
5 JP-2000044236-A JP 1998 3028 1722 H Artikel mit einem transparenten leitfähigen Oxiddünnfilm und sein Herstellungsverfahren,Article having transparent conductive oxide thin film and its production HOYA CORP 1 7 20 1 0 2 680 2317 2991 13 12 3 3 6
applicants IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
1 US-4683202-A US 1985 8996 4766 C Process for amplifying nucleic acid sequences CETUS CORP 1602 148 6116 13 1 12 1083 21 5542 0 1701 1 1 1751
2 US-4816567-A US 1983 8668 4748 C Recombinant immunoglobin preparations GENENTECH INC 3319 27 4354 8 0 3 954 3 4387 0 1878 0 1 2402
3 US-4683195-A US 1986 7174 3763 C "Process for amplifying, detecting, and/or-cloning nucleic acid sequences" CETUS CORP 1115 154 4853 1 1 11 1021 18 4593 0 1355 1 1 1224
4 US-5523520-A US 1994 4759 3761 C Mutant dwarfism gene of petunia GOLDSMITH SEEDS INC 4585 1 162 0 2 1 5 3 4759 0 0 0 0 0
5 US-5223409-A US 1991 4956 2523 C Directed evolution of novel binding proteins PROTEIN ENG CORP 1598 17 2833 10 0 2 493 3 2629 2 1092 1 0 1232
1 EP-2226847-A2 EP 2005 2388 1510 H "Amorphous oxide and thin film transistor,Oxyde amorphe et transistor à couche mince,Amorpher Oxid- und Dünnschichttransistor" JAPAN SCIENCE & TECH AGENCY 1 3 9 0 0 2 594 1779 2387 0 0 1 0 0
2 EP-0239400-A2 EP 1987 1789 1190 C "Recombinant antibodies and methods for their production,Anticorps recombinants et leurs procédés de production,Rekombinante Antikörper und Verfahren zu deren Herstellung" WINTER GREGORY PAUL 593 1 1066 0 0 0 129 0 572 0 490 1 2 724
3 EP-1737044-A1 EP 2005 1662 1187 H "Oxyde amorphe et transistor film mince,Amorph-oxid- und dünnfilmtransistor,Amorphous oxide and thin film transistor" JAPAN SCIENCE & TECH AGENCY 0 3 7 0 0 2 452 1198 1651 0 4 0 1 6
4 EP-0404097-A2 EP 1990 1568 1152 A "Récepteurs mono- et oligovalents, bispécifiques et oligospécifiques, ainsi que leur production et application,Bispezifische und oligospezifische, mono- und oligovalente Rezeptoren, ihre Herstellung und Verwendung,Bispecific and oligospecific, mono- and oligovalent receptors, production and applications thereof" BEHRINGWERKE AG 505 6 910 0 0 0 147 0 262 0 495 0 0 811
5 EP-1737044-B1 EP 2005 1339 934 H "Amorphes oxid und dünnfilmtransistor,Amorphous oxide and thin film transistor,Oxyde amorphe et transistor film mince" JAPAN SCIENCE & TECH AGENCY 1 0 4 1 0 0 236 1097 1339 0 0 0 0 0
1 CN-201962688-U CN 2010 392 391 E "Diaphram wall wharf pile foundation structure,Struktur der unterirdischen Verbindung 续 墙 头 桩 桀" FIRST ENGINEERING COMPANY OF CCCC FOURTH HARBOR ENGINEERING CO LTD 0 0 0 0 392 0 0 0 0 0 0 0 392 0
2 CN-1895777-A CN 2005 297 296 B "Eine Art verpacktes Botenmolekül,Porous molecular-sieve catalyst for assembling carbide and its preparation" UNIV BEIJING CHEMICAL 0 7 290 0 0 0 0 0 0 0 0 0 297 0
3 CN-1634601-A CN 2003 680 288 A "Für eine Art von medizinischem Behandlungsinstrument bakteriostatische Methode,Method for sterilizing medical appliance" JILIN PROVINCE ZHONGLI IND CO 631 8 0 0 0 4 6 31 677 0 0 0 3 0
4 CN-1262969-A CN 2000 290 288 B "Catalyst using TiO2 as carrier to load metal nitride Mo2N,TiO" UNIV NANKAI 0 7 283 0 0 0 0 0 0 0 0 0 290 0
5 CN-1470327-A CN 2002 286 286 C "Metal nitride catalyst preparing method and catalyst,Eine Art Metallisierer" CHINA PETROCHEMICAL CORP 0 7 279 0 0 0 0 0 0 0 0 0 286 0
1 KR-20000051826-A KR 1999 192 179 C "New organomattalic complex molecule for the fabrication of organic light emitting diodes,신규한 착물 및 그의 제조 방법과 이를 이용한 유기 발광 소자" LG CHEMICAL LTD 0 0 162 0 0 0 0 30 12 0 0 168 0 12
2 KR-20110003229-A KR 2009 379 175 A "하이브리드 수술용 로봇 시스템 및 수술용 로봇 제어방법,Hybrid surgical robot system and control method thereof" ETERNE INC 360 6 0 0 0 1 7 5 366 0 1 4 1 7
3 KR-20000074034-A KR 1999 202 145 H "Ultra-slim Repeater with Variable Attenuator,케이블 손실 보상이 가능한 초소형 중계기" ACE TECH 0 0 0 0 0 0 10 192 197 0 0 4 0 1
4 KR-20120009843-A KR 2010 168 142 G "이동단말기 및 그의 어플리케이션 공유 방법,Mobile terminal and method for sharing applications thereof" LG ELECTRONICS INC 0 0 0 0 0 0 17 151 155 0 0 3 2 8
5 KR-20190000980-A KR 2017 148 141 H "A personal Healthcare System,퍼스널 헬스케어 시스템" LEE DONG WON 15 38 0 1 15 24 19 36 1 0 0 147 0 0
1 WO-2004114391-A1 WO 2004 2859 1667 H "Dispositif a semi-conducteur et son procede de production, et dispositif electronique,Semiconductor device, its manufacturing method, and electronic device,Halbleiterbauelemente und ihre Herstellungsverfahren sowie elektronische Bauelemente" "KAWASAKI MASASHI,SHARP KK,SUGIHARA TOSHINORI,OHNO HIDEO" 1 3 10 1 0 2 666 2176 2843 7 4 1 1 3
2 WO-9201047-A1 WO 1991 2421 1413 C "Methods for producing members of specific binding pairs,Procede de production de chainon de paires a liaison specifique" "CAMBRIDGE ANTIBODY TECH,MEDICAL RES COUNCIL" 797 2 1410 0 0 0 212 0 850 2 780 1 4 784
3 WO-9633735-A1 WO 1996 1764 1023 C "Anticorps humains derives d'une xenosouris immunisee,Human antibodies derived from immunized xenomice" CELL GENESYS INC 645 2 1003 0 0 0 114 0 616 29 510 2 1 606
4 WO-9634096-A1 WO 1995 1712 1022 C "Anticorps humains derives de xeno-souris immunisees,Human antibodies derived from immunized xenomice" CELL GENESYS INC 626 2 977 0 0 0 107 0 553 17 523 2 0 617
5 WO-9307278-A1 WO 1992 1137 995 A "Sequence d'adn synthetique ayant une action insecticide accrue dans le mais,Synthetic dna sequence having enhanced insecticidal activity in maize" CIBA GEIGY AG 515 1 616 0 0 0 5 0 81 0 230 0 1 825

4. Cross-Country-Zitat

Zum Beispiel ein Muster, in dem US-Gazetten bei JP-Untersuchungen verwendet werden. Dieses Mal haben wir Anmeldungen von sechs Patentämtern eingereicht, bei denen es sich angeblich um die fünf wichtigsten Patentämter handelt, darunter JP, US, CN, EP und KR sowie internationale Patentanmeldungen (WO). Wird extrahiert. Normalerweise wird die in der Landessprache verfasste Literatur als Referenz verwendet. Wenn jedoch keine Literatur mit ähnlicher Technologie vorhanden ist, kann ausländische Literatur eingebracht werden. Mit Blick auf das Land sind die meisten von ihnen USA, aber es scheint, dass es viele Telekommunikationssysteme gibt (G, H). Eindruck, dass es viele japanische Unternehmen gibt. Ist es einfach als technisches Dokument zu verwenden?

code
SELECT 
  pubnum, 
  ipcs,
  total_cit_count,unique_cit_count,
  applicants,
  titles,
  US,JP,EP,CN,KR,WO
  FROM `bqml_tutorial.fcitdata`
  WHERE US>0 AND JP>0 AND EP >0 AND CN>0 AND KR >0 AND WO >0
  #GROUP BY ipcs,pubnum,total_cit_count,unique_cit_count,titles,applicants
  ORDER BY total_cit_count DESC
  LIMIT 10

Liste
pubnum ipcs total_cit_count unique_cit_count applicants titles US JP EP CN KR WO
US-5892900-A G 4035 1944 INTERTRUST TECH CORP Systems and methods for secure transaction management and electronic rights protection 3955 2 19 29 4 26
US-2003189401-A1 C 3849 1958 INT MFG & ENG SERVICES CO LTD Organic electroluminescent device 3820 1 16 1 3 8
US-2007194379-A1 G 3592 1833 JAPAN SCIENCE & TECH AGENCY Amorphous Oxide And Thin Film Transistor 3574 1 7 1 2 7
US-2007108446-A1 H 3501 1758 SEMICONDUCTOR ENERGY LAB Semiconductor device and manufacturing method thereof 3464 4 11 7 13 2
US-2006244107-A1 H 3493 1844 SUGIHARA TOSHINORI,KAWASAKI MASASHI,OHNO HIDEO Semiconductor device, manufacturing method, and electronic device 3461 2 12 7 3 8
US-2006108636-A1 H 3475 1777 TOKYO INST TECH Amorphous oxide and field effect transistor 3452 6 10 3 1 3
US-2006110867-A1 H 3454 1752 TOKYO INST TECH Field effect transistor manufacturing method 3426 4 9 5 4 6
US-2006113565-A1 H 3439 1763 TOKYO INST TECH Electric elements and circuits utilizing amorphous oxides 3412 4 3 10 2 8
US-2008038882-A1 H 3401 1734 TAKECHI KAZUSHIGE,NAKATA MITSURU Thin-film device and method of fabricating the same 3342 13 8 24 4 10
US-2007090365-A1 H 3398 1742 CANON KK Field-effect transistor including transparent oxide and light-shielding member, and display utilizing the transistor 3384 3 1 8 1 1

Es ist möglicherweise einfacher, diesen Bereich zu verstehen, indem Sie ihn als Netzwerkdiagramm visualisieren.

5. Korrektur nach Zeit

Man kann sagen, dass die zitierten Dokumente der neuen Ära nachteilig sind, weil die Zeit seit ihrer Veröffentlichung nicht vergangen ist und die Dokumente der alten Ära häufiger zitiert werden, deshalb habe ich sie ein wenig korrigiert. Wenn Sie ernsthaft daran arbeiten, gibt es Disconnection Bias, aber diesen Prozess Ist diesmal nicht. Die Anwendungen für die USA und 2008-2009 sind an die Spitze gekommen.

code
DECLARE kotoshi INT64;
SET kotoshi = CAST(FORMAT_DATE("%Y",CURRENT_DATE()) AS INT64);

SELECT pubnum,appcountry,appyear,total_cit_count,unique_cit_count,kotoshi - CAST(appyear AS INT64) AS pub_interval, total_cit_count / (kotoshi - CAST(appyear AS INT64)) AS ave_cit,ipcs,	titles
FROM `bqml_tutorial.fcitdata`
ORDER BY ave_cit DESC
LIMIT 20

Liste
pubnum appcountry appyear total_cit_count unique_cit_count pub_interval ave_cit ipcs titles
US-2009278122-A1 US 2009 3164 1725 11 287.6 H Amorphous oxide and thin film transistor
US-2009280600-A1 US 2009 3163 1727 11 287.5 G Amorphous oxide and thin film transistor
US-2010092800-A1 US 2009 3150 1678 11 286.4 B Substrate for growing wurtzite type crystal and method for manufacturing the same and semiconductor device
US-2010065844-A1 US 2009 3149 1677 11 286.3 H Thin film transistor and method of manufacturing thin film transistor
US-7732819-B2 US 2008 3341 1737 12 278.4 H Semiconductor device and manufacturing method thereof
US-2009134399-A1 US 2008 3303 1708 12 275.3 C Semiconductor Device and Method for Manufacturing the Same
US-2009073325-A1 US 2008 3298 1705 12 274.8 G "Semiconductor device and method for manufacturing the same, and electric device"
US-2008224133-A1 US 2008 3285 1714 12 273.8 H Thin film transistor and organic light-emitting display device having the thin film transistor
US-2008258141-A1 US 2008 3279 1720 12 273.3 H "Thin film transistor, method of manufacturing the same, and flat panel display having the same"
US-2009068773-A1 US 2008 3275 1709 12 272.9 H Method for fabricating pixel structure of active matrix organic light-emitting diode

6. Referenz: Nicht patentierte Dokumente mit vielen Zitaten

Da nicht patentierte Dokumente in den zitierten Dokumenten enthalten sind, habe ich versucht, sie zu extrahieren (keine Namensidentifikation). Es ist viel komplizierter als die Patentdokumente, aber ... Die Nummer eins, die zitiert wird, ist das Papier des Sequenzers der folgenden Gensequenz mit Ausnahme von seltsamen.

ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410

code
SELECT 
  cit.npl_text AS title,
  COUNT(cit.npl_text) AS count,
  STRING_AGG(DISTINCT(SUBSTR(main.publication_number,0,2))) AS cit_country,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'A' THEN 1 ELSE 0 END) as IPC_A,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'B' THEN 1 ELSE 0 END) as IPC_B,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'C' THEN 1 ELSE 0 END) as IPC_C,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'D' THEN 1 ELSE 0 END) as IPC_D,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'E' THEN 1 ELSE 0 END) as IPC_E,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'F' THEN 1 ELSE 0 END) as IPC_F,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'G' THEN 1 ELSE 0 END) as IPC_G,
  SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'H' THEN 1 ELSE 0 END) as IPC_H,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'US' THEN 1 ELSE 0 END) as US,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'JP' THEN 1 ELSE 0 END) as JP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'EP' THEN 1 ELSE 0 END) as EP,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'KR' THEN 1 ELSE 0 END) as KR,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'CN' THEN 1 ELSE 0 END) as CN,
  SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'WO' THEN 1 ELSE 0 END) as WO
FROM 
  `patents-public-data.patents.publications_201912` as main,
  UNNEST(main.citation) AS cit,
  UNNEST(main.ipc) AS ipcs
WHERE cit.npl_text NOT IN ('','None','NEANT','NICHTS ERMITTELT','NICHTS-ERMITTELT','No Search','International Search Report.','No further relevant documents disclosed')
  AND ipcs.first = TRUE
GROUP BY cit.npl_text
ORDER BY count DESC

Liste
title count cit_country IPC_A IPC_B IPC_C IPC_D IPC_E IPC_F IPC_G IPC_H US JP EP KR CN WO
"PATENT ABSTRACTS OF JAPAN vol. 2003, no. 12 5 December 2003 (2003-12-05)" 3694 "FR,NL,EP,LU,CH,GR,WO" 317 805 407 60 59 461 819 766 0 0 1922 0 0 1354
PATENT ABSTRACTS OF JAPAN 3117 "WO,FR,US,DE,NL,EP" 367 659 505 17 73 260 703 533 1 0 1258 0 0 1717
"ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410" 2385 "WO,EP" 516 2 1756 1 0 0 109 1 0 0 964 0 0 1421
"Kraft et al., ""Linkage disequilibrium and fingerprinting in sugar beet,"" Theor Appl Genet, 101:323-326, 2000." 2256 US 2193 0 63 0 0 0 0 0 2256 0 0 0 0 0
"SAMBROOK ET AL.: ""Molecular Cloning: A Laboratory Manual"", 1989, COLD SPRING HARBOR LABORATORY PRESS" 2086 "WO,EP" 528 14 1389 0 0 1 153 1 0 0 987 0 0 1099
"JONES ET AL., NATURE, vol. 321, 1986, pages 522 - 525" 1969 "EP,WO" 584 6 1221 0 0 0 158 0 0 0 881 0 0 1088
"WARD ET AL., NATURE, vol. 341, 1989, pages 544 - 546" 1908 "EP,WO" 504 3 1233 0 0 0 168 0 0 0 816 0 0 1092
"BIRD ET AL., SCIENCE, vol. 242, 1988, pages 423 - 426" 1900 "EP,WO" 555 4 1199 0 0 0 142 0 0 0 739 0 0 1161
"CLACKSON ET AL., NATURE, vol. 352, 1991, pages 624 - 628" 1784 "WO,EP" 536 9 1035 0 0 0 204 0 0 0 728 0 0 1056
"Meghji et al., ""Inbreeding depression, inbred and hybrid grain yields, and other traits of maize genotypes representing three eras,"" Crop Science, 24:545-549, 1984." 1754 US 1698 0 56 0 0 0 0 0 1754 0 0 0 0 0

7. Ohne US-Anmelderzitate

Da es in US-Anmeldungen viele Bewerberzitate gibt, wurde der WHERE-Klauselteil mit der Absicht korrigiert, ihn auszuschließen.

~~Nur verschiedene Teile
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE AND cit.category IN ('EXA','SEA','ISR')

Dann in der folgenden Rangfolge.

Recommended Posts