Introduction

When running an Internet service, I sometimes want to determine the region [^ host-user] of the host that is accessing it.

For example, in Japan you can sometimes tell that "the reverse lookup of the IP address was *.hkd.mesh.ad.jp, so the access appears to come from Hokkaido" [^ maxmind], but this article does not go into that much detail. Although I say "region", the granularity [^ region] considered here is that of a country.

[^ maxmind]: For that level of detail, a service such as MaxMind is probably the way to go.

[^ host-user]: What I really want is the area where the accessing user lives, but I assume that only the IP address of the accessing host is known. In practice, the source IP address only tells you the region of the organization the address was distributed to; where the host or user actually is remains another story (VPNs exist, for one).

IP address and region association

IANA allocates IP addresses to each [RIR](https://ja.wikipedia.org/wiki/%E5%9C%B0%E5%9F%9F%E3%82%A4%E3%83%B3%E3%82%BF%E3%83%BC%E3%83%8D%E3%83%83%E3%83%88%E3%83%AC%E3%82%B8%E3%82%B9%E3%83%88%E3%83%AA), and the RIRs that receive them allocate and assign them to each region.

Therefore, if you process the "registered to which region" lists published by each RIR, you can build a list of the IP addresses distributed to each region.

[^ region]: Roughly a country, but with cases such as "Guam has its own allocation, separate from the United States."

Allocation and assignment

As the "allocation / assignment" in the article title suggests, there are two terms for distribution. Likewise, the lists issued by the RIRs contain two statuses, "allocated" and "assigned".

This is described in the JPNIC document "What is Allocation and Assignment". In short, the difference is whether the recipient of the distributed address space uses it itself.

- allocate
  - Distribution to a managing organization (which distributes further to its members)
- assign
  - Distribution to an end user (who puts the distributed address space to actual use)

In this article, when the two do not need to be distinguished, I write "distribution".

Predecessors

Since this need has been around for a long time, there are already some lists on the Internet that have been created based on the above ideas.

I want to make it myself

It would be no fun to stop at finding an existing list, so I decided to make one myself. Since data that can be treated as the correct answer is already available on the net, the result can be checked by comparing the generated list against it.

Problems in making a list

This section is a rough summary of the earlier articles.

The format of the RIR lists is published by APNIC and others. You can build the list by reading that specification, but there are a few problems.
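
As a rough illustration of that format, each data record is a `|`-separated line of the form `registry|cc|type|start|value|date|status`. The sketch below parses one such line using only the Python standard library. The sample line and the names `RirRecord` and `parse_record` are made up for this illustration, and the filtering rules are a simplifying assumption rather than a complete reading of the specification.

```python
from typing import NamedTuple, Optional


class RirRecord(NamedTuple):
    registry: str
    cc: str      # country code, e.g. "JP"
    kind: str    # "ipv4" or "ipv6"
    start: str   # first address of the range
    value: int   # IPv4: address count, IPv6: prefix length
    status: str  # "allocated" or "assigned"


def parse_record(line: str) -> Optional[RirRecord]:
    """Return a record for a distributed IPv4/IPv6 range, else None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    fields = line.split("|")
    # Version and summary lines do not look like address records.
    if len(fields) < 7 or fields[2] not in ("ipv4", "ipv6"):
        return None
    if fields[6] not in ("allocated", "assigned"):
        return None
    return RirRecord(fields[0], fields[1], fields[2],
                     fields[3], int(fields[4]), fields[6])


sample = "apnic|JP|ipv4|192.0.2.0|256|20200101|allocated"  # made-up record
print(parse_record(sample))
```

Records for AS numbers and ranges that are not yet distributed are dropped here because only distributed addresses matter for this list.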

Not CIDR notation for IPv4

IPv4 records use a "start address + address count" notation [^ ipv4-value]. Programs that determine the region from an IP address usually work with CIDR, so it is more convenient to convert the records to CIDR. In addition, a single record may describe a block [^ non-cidr] that cannot be expressed as one CIDR [^ historical-noncidr].

IPv6 records already use CIDR notation [^ ipv6-value], so this problem does not occur.
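
To make the conversion concrete: a count that is a power of two and starts on a matching boundary maps to a single CIDR (4096 addresses is a /20, since 32 - log2(4096) = 20), while other counts need several blocks. The sketch below uses `summarize_address_range` from the standard-library `ipaddress` module, not netaddr, and the function name and record values are made up for illustration.

```python
import ipaddress


def ipv4_record_to_cidrs(start: str, count: int) -> list:
    """Convert a "start address + address count" pair into CIDR blocks."""
    first = ipaddress.IPv4Address(start)
    last = first + (count - 1)  # last address in the range
    return list(ipaddress.summarize_address_range(first, last))


# 4096 addresses on an aligned boundary fit in a single /20.
print(ipv4_record_to_cidrs("10.0.16.0", 4096))
# 768 addresses are not a power of two, so more than one block is needed.
print(ipv4_record_to_cidrs("192.0.2.0", 768))
```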

[^ ipv4-value]: The value section of the original text says "In the case of IPv4 address the count of hosts for this range. This count does not have to represent a CIDR range.", so it states explicitly that the value is not CIDR.

[^ ipv6-value]: The original text says "In the case of an IPv6 address the value will be the CIDR prefix length from the 'first address' value of `<start>`."

[^ historical-noncidr]: It's probably a format that existed before the concept of CIDR.

Even with CIDR, there are times when blocks are subdivided

This is not mentioned in the RIR documentation. The earlier articles do describe it, though, and when you implement the merge yourself the list does indeed get shorter [^ historical-noncidr].

There is nothing to gain from a redundant list, so it is best to keep it short.

For how CIDR blocks are merged, the article "What kind of processing does CIDR + CIDR do?" is recommended.
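
A small example of what merging means: two adjacent /25 blocks that together fill a /24 can be replaced by that /24, while an unrelated block is left alone. This sketch uses `collapse_addresses` from the standard-library `ipaddress` module rather than netaddr, and the addresses are documentation ranges picked for illustration.

```python
import ipaddress

blocks = [
    ipaddress.ip_network("192.0.2.0/25"),
    ipaddress.ip_network("192.0.2.128/25"),   # adjacent to the first block
    ipaddress.ip_network("198.51.100.0/24"),  # unrelated, stays as it is
]

# collapse_addresses merges adjacent and overlapping networks.
merged = list(ipaddress.collapse_addresses(blocks))
print(merged)
```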

Try to make it with Python

The task is to generate CIDR blocks and merge them, and Python has a library, netaddr, for operations such as merging CIDRs.

CIDR record generation

`IPRange` represents an arbitrary IP address range. Pass the start and end IP addresses to its constructor and call [cidrs()](https://netaddr.readthedocs.io/en/latest/api.html#netaddr.IPRange.cidrs), and you get a list [^ cidr-array] of CIDRs (`IPNetwork` objects).

[^ cidr-array]: This is because, as mentioned above, the IP address range cannot always be represented by a single CIDR.

from netaddr import IPAddress, IPRange

# `start` and `value` come from the RIR record (example values, not real data)
start = "192.0.2.0"
value = 768
start_ip = IPAddress(start, version=4)
end_ip = IPAddress(int(start_ip) + value - 1)  # last address = start + count - 1
cidr_list = IPRange(start_ip, end_ip).cidrs()

Join CIDR blocks

Feed a list of CIDRs (`IPNetwork` objects) to `IPSet` and they are merged appropriately. The merged CIDRs can be retrieved with `iter_cidrs()`.

from netaddr import IPSet

# `v4_cidr_list` is the list of IPNetwork objects collected from the records
v4set = IPSet(v4_cidr_list)
for cidr in v4set.iter_cidrs():
    print(cidr)
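
Putting the two steps together, the whole flow (read records, convert each IPv4 range to CIDR, merge per country) might look roughly like the sketch below. It is not the code from the Gist: it uses the standard-library `ipaddress` module in place of netaddr, handles only IPv4, and the function name `build_country_cidrs` and the sample records are made up for illustration.

```python
import ipaddress
from collections import defaultdict


def build_country_cidrs(lines):
    """Group distributed IPv4 ranges by country code and merge them."""
    per_country = defaultdict(list)
    for line in lines:
        fields = line.strip().split("|")
        # Keep only IPv4 records that were actually distributed.
        if len(fields) < 7 or fields[2] != "ipv4":
            continue
        if fields[6] not in ("allocated", "assigned"):
            continue
        first = ipaddress.IPv4Address(fields[3])
        last = first + (int(fields[4]) - 1)
        per_country[fields[1]].extend(
            ipaddress.summarize_address_range(first, last)
        )
    # Merge adjacent blocks so each country's list is as short as possible.
    return {
        cc: list(ipaddress.collapse_addresses(nets))
        for cc, nets in per_country.items()
    }


# Made-up records in the RIR list format (not real registry data).
sample_lines = [
    "apnic|JP|ipv4|192.0.2.0|128|20200101|allocated",
    "apnic|JP|ipv4|192.0.2.128|128|20200102|assigned",
    "apnic|GU|ipv4|198.51.100.0|256|20200103|allocated",
]
for cc, nets in build_country_cidrs(sample_lines).items():
    print(cc, [str(n) for n in nets])
```

Here the two JP records are adjacent, so they collapse into a single block even though one is allocated and the other assigned.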

What I made

I have written at length here, but the code itself came together quickly. The library is great.

It is nothing special, but I put it on Gist.

https://gist.github.com/walkure/d1d87d8b4aad3c692edef1cce0f69aab

What about other languages

For Perl, merging CIDRs seems to be possible with the [Net::CIDR::Lite](https://metacpan.org/pod/Net::CIDR::Lite) library [^ perl-lib]. I meant to actually write it, but when I ran cpan it crashed with "Free to wrong pool 1f7d20 not 89034600d957d249 at C:\Perl64\site\lib/IO/Socket/SSL.pm line 2739." On investigation it looked like a known, unresolved problem [^ community-active state] that occurs only on Windows, so I gave up.

For Go, there is a netaddr-inspired [^ cidrman-readme] library called cidrman, but at present it implements only IPv4 [^ cidrman-v6 issue].

For PHP, there is an article titled "Create your own IPv6 and IPv4 address allocation list by country".

[PYTHON] Make an IP address allocation / assignment list for a certain region