[PYTHON] Get a domain owned by a specific organization

Motivation

There are all kinds of inventive people in the world, and apparently there is something called Iranechkei that blocks reception of only the Japan Broadcasting Corporation. (This was quite a while ago.) So I wanted to think about what kind of processing would be needed to do the same thing on the digital side.

Change log

--Posted on April 28, 2020
--2020/4/29 Changed the tag from Python3 to Python, after I was notified (through the edit request feature, I think?) that it should be Python.

What I thought

--Block viewing of the radio waves belonging to a specific organization. => In network terms, block access to the domains owned by a specific organization.

How to implement (domain search)

  1. Query the whois server run by JPRS etc. with the whois command. => Not enough: I could not find a way to search by organization name with the whois command.
  2. Send a request programmatically to the search site run by JPRS etc. and parse the data that comes back. => Success

This article uses approach 2.
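For comparison, approach 1 boils down to a plain WHOIS lookup: open a TCP connection to port 43, send one query line, and read until the server closes the connection. The following is only a minimal sketch of that protocol. The function name `whois_query` is hypothetical, the default host `whois.jprs.jp` is my assumption for the JPRS server, and the reply is decoded leniently because I have not checked which encoding the server uses.

```python
import socket


def whois_query(query: str, host: str = "whois.jprs.jp", port: int = 43,
                timeout: float = 10.0) -> str:
    """Send one WHOIS query and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The request is a single line terminated by CRLF
        sock.sendall(query.encode("utf-8") + b"\r\n")
        chunks = []
        # The server marks the end of the reply by closing the connection
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Assumption: decode leniently, the actual reply encoding is not verified here
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `whois_query("NHK.JP")` would return the record for that one domain, which illustrates the limitation: the query is keyed by domain name, so it does not directly answer "which domains does this organization own".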

Implementation (domain search)

As the example, I will use the Japan Broadcasting Corporation, since it was what got me thinking about this. No ill intent is meant.

python3.8


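import re

import requests

# JPRS WHOIS web search; type=DOM-HOLDER looks up domains by holder (organization) name
url = 'https://whois.jprs.jp/'
params = {'key': 'Japan Broadcasting Corporation', 'type': 'DOM-HOLDER'}
res = requests.get(url, params=params, timeout=10)
res.raise_for_status()

# Each hit follows a dom"> marker in the HTML; capture the text up to ".JP"
for domain in re.findall(r'dom">(.*\.JP)', res.text):
    print(domain)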

It printed the domains. Oh, and please turn a blind eye to the messy code.
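If the parsing step is going to be reused, it may help to pull it out into a function that can be tested without hitting JPRS. This is only a sketch: `extract_domains` is a hypothetical name, and the pattern assumes each hit is rendered as `dom">NAME.JP<`, which is what the regex in the script above implies but which I have not verified against the current page markup.

```python
import re
from typing import List

# Assumption: every hit appears as ...dom">NAME.JP<... in the result page
DOMAIN_RE = re.compile(r'dom">([^<]*\.JP)')


def extract_domains(html: str) -> List[str]:
    """Return the domains found in a result page, de-duplicated, in page order."""
    # dict.fromkeys keeps the first occurrence of each domain and preserves order
    return list(dict.fromkeys(DOMAIN_RE.findall(html)))
```

Stopping the match at `<` keeps two hits on the same line from being merged into one, which a greedy `.*` would do.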

Output result


NHK.OR.JP
NHK.JP
NHK-KEIZAI.JP
NHK1S.JP
NHK-ONDEMAND.JP
AKNC.JP
ITWHITEBOX.JP
IT-WHITEBOX.JP
STRL-TRIAL.JP
ORCUSGATE.JP
TENKAME.JP
NHK-ASSIST.JP
NHK-NEWS.JP
MTSSG.JP
NHKWORLD-JAPAN.JP
RADIRER.JP
NHKID.JP
Japan Broadcasting Corporation.JP
IT White Box.JP
Ayaka Ikezawa.JP

Process finished with exit code 0

How to implement (blocking communication)

--If I go any further I feel like I might get taken down like a certain someone, so from here on it is just speculation and thinking out loud. (Idle chat)

  1. Look up the IP addresses somehow (nslookup (whisper)) and add them to a blacklist. => Traffic still reaches the router, so someone could quibble that the router itself counts as "equipment capable of receiving". At that point the exercise is meaningless.
  2. Using a server outside the house (GCP, AWS, Azure, etc.), build a system (a proxy or ... (whisper)) that somehow blocks communication with the IP addresses found in 1. and the domains from the previous section. => The blocking happens on the server side, and none of that traffic can reach your own terminal through the server, so perhaps it cannot be said that you have installed equipment capable of receiving ... ??
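The lookup half of idea 1 could be sketched roughly as follows, using the standard library resolver instead of nslookup. The names `resolve_ips` and `build_blacklist` are hypothetical, and this only collects addresses. How they are actually fed to a router, firewall, or proxy is left open, just as in the text above.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_ips(domain: str) -> List[str]:
    """Resolve a domain to its IP addresses; return [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(domain, None)
    except (socket.gaierror, UnicodeError):
        return []
    # Each entry ends with a sockaddr tuple whose first field is the address string
    return sorted({info[4][0] for info in infos})


def build_blacklist(domains: Iterable[str],
                    resolver: Callable[[str], List[str]] = resolve_ips
                    ) -> Dict[str, List[str]]:
    """Map each domain (lowercased) to the IP addresses a block rule would cover."""
    return {domain.lower(): resolver(domain) for domain in domains}
```

Note that addresses can change over time and may be shared with unrelated sites, so a list built this way would need to be refreshed and checked before being used for anything.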

Summary

--Like a kitchen knife, this can become a weapon depending on how it is used, but it is handy when used as a tool. For example ...

  1. If you enter your own company's organization name and build this into an internal site, you can see which domains your company has registered. (A newly acquired domain will not show up until JPRS reflects it, so there is some lag.)
  2. The response to the request used here also includes a URL for each domain's detailed information, so you could use those URLs to build an internal site that shows the domains as a tree.
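As one possible reading of the tree idea, the domain names themselves can be nested by label, from the TLD downwards. This is a minimal sketch with hypothetical helper names (`build_domain_tree`, `render_tree`); it only groups the names and does not fetch the per-domain detail pages.

```python
from typing import Dict, Iterable


def build_domain_tree(domains: Iterable[str]) -> Dict[str, dict]:
    """Nest domains by label from the TLD down, e.g. JP -> OR -> NHK."""
    tree: Dict[str, dict] = {}
    for domain in domains:
        node = tree
        # Walk the labels right to left so the TLD becomes the root
        for label in reversed(domain.upper().split(".")):
            node = node.setdefault(label, {})
    return tree


def render_tree(tree: Dict[str, dict], indent: int = 0) -> str:
    """Render the nested dict as indented text, one label per line."""
    lines = []
    for label in sorted(tree):
        lines.append("  " * indent + label)
        child = render_tree(tree[label], indent + 1)
        if child:
            lines.append(child)
    return "\n".join(lines)
```

For example, `render_tree(build_domain_tree(["NHK.OR.JP", "NHK.JP"]))` puts `JP` at the root with `NHK` and `OR` beneath it, and `NHK` again under `OR`.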
