[PYTHON] Aim for GIS Master # 2 R-tree for fast search of nearby roads

In the previous post, I extracted the road NW from OSM

Last time, I extracted the road NW from OSM. Aim for GIS Master # 1 Extract Road NW from OpenStreetMap

If you take a closer look, you may find that POI-like items such as traffic lights, post offices, and cafes are also included. And you can also take the Polygon on the site. Well, of course. → However, not all. There are various ways to create the site, some of which have only buildings, and some of which include parking lots. You may also be making Polygon. No, it's still this quality, so I really respect the mappers. I can't. I am always grateful for your help.

You can get various things. But I'll save that story for the third time.

This time I would like to write how to use the extracted road NW.

What I want to do is find a road near my current location

What I want to do is which road is closest to this place It is to draw from the extracted road NW.

(If the GPS error is not so large) Since it is considered that the road was the closest to the location information obtained from the GPS logger, All you have to do is draw the nearest road. It's a very simple map matching.

How do you do it? (Problem)

As a premise Roads pulled from OSM often have multiple midpoints (nodes). It may be entered as a node at the time of a curve, or it may be entered as an intersection with another road. → The latter is also interesting. Leave it.

In that case, let's leave the GPS error for the time being, Measure the distance between your current location and all the nodes, and the road with the shortest node is the one It can be said that it is a road that should be matched.

Now let's calculate the distance between the current location and all the nodes! Just in case, I want to put out about 3 candidates, so let's arrange them in ascending order of distance and get the 3 road ids from the top! Is it better to put all the nodes in the database and search by SQL? Arrange in descending order of Euclidean distance! !!

order by SQRT(POW((lon-Target position lon),2) + POW((lat-Target position lat),2)) desc

↑ I haven't tried it, but I don't know when it will end. It may be surprisingly fast, but I can only see the late future. It is rejected.

How do you do (Solution)

It is a solution "solution" by running away w I think there are many ways to do it, but for me it was relatively easy and fast.

R-tree From R-tree wikipedia

This. The point is to index the position data. Easy to use with python. → It seems to be a python wrapper version of libspatialindex, You need to install libspatialindex first. I remember that I was a little addicted to the installation, but ... When I tried to import with python as below after installation, I got an error like "libspatialindex-cannot be read". from rtree import index

It should have worked if I put the path containing the so file of libspatialindex in LD_LIBRARY_PATH in the environment variable. That's all for me.

When you are ready, put the OSM road information in the R-tree and create an index. Since the road NW was taken in the previous post, all the intermediate points (position rows that are LINE STRING (position 1, position 2, ...)) are taken out from there. → Since the distance between these positions varies, I think that linear interpolation may be performed.

I will insert it.

#Usage example with reference to the tutorial

#rtree properties
p = rtree.index.Property()
p.dat_extension = 'data'
p.idx_extension = 'index'
file_idx = index.Index('rtree', properties = p)

#From here on down is all points. Loop or anything is fine

#It is convenient after searching if you create an object and enter information
hogehogeho = {}
hogehogeho["name"]="test" #Osm here_Do you put id or something

#If you put it in a rectangle, lon1,lat1,lon2,Although it is lat2, I will put a point, so I will put the same value continuously for the time being
file_idx.insert(id=id, bounds=(lon1, lat1, lon1, lat1), obj=hogehogeho)

I executed file_idx.insert (~) at all points to create an index. I think it will be tens of GB in Japan alone. Be careful about the capacity of the HDD.

Once you get here, just search.

#You can search like this
# - target_lon,target_lat is the location information you want to search
# -3 is the number of searches
# -Basically, you can take them in order of closeness(Confirmation required)
#→ When calculating the distance with the earth in mind, I feel that there were some things that were slightly out of order.
#But I don't think it will shift by 1m, so I want to think it's okay!
for pdat in file_idx.nearest((target_lon,target_lat), 3, objects=True):
    print(pdat.object['name'])
    #The hit index is pdat.Since it can be taken with bbox, target_lon,target_lat and pdat.When you calculate the distance of bbox, the actual distance comes out.

Operation check app

So I tried to check the operation. I made a prototype tool that draws the road in the vicinity when you click on the map. You can see that the road in the vicinity is drawn at the place where the red diamond marker is clicked. The colors are red → orange → green → light blue → blue → black below that indicates the road closest to the clicked position. The crowns are displayed in order from 1 to 5. Well, I don't think there is a 2, but I think it's under the 3.

akihabara_rtree.png

By the way, the search speed is almost unnoticeable. I think it is no exaggeration to say that it is a moment. This should not be the case if you calculated the distance with all points.

Now you are ready for battle

It is now possible to search for roads near this point at high speed. After that, it is necessary to match while looking at the angle of the road acquired according to the data. → Hey, this is the most troublesome ... If the GPS accuracy is good, the nearest road should be decided by majority vote. .. .. This is difficult. Yes. I am also struggling.

But the battle is ready! I would like to say that.

Later talk?

OSM roads are quite quirky, and even after passing an intersection, the ids continue to be the same and often become L-shaped. For my use, I want to divide it into one road at an intersection, so I put in a pre-process to divide the original data. I used the R-tree made above for this as well. → How much do you like ... All the nodes are searched in the neighborhood by rtree in order, and the nodes with the distance of 0 are extracted. A road with a node with a distance of 0 can be said to be an intersection if it is a point where multiple unique osm_ids overlap. So in that case, cut the road. It is carved and given something like sub_id to generate something unique.

Digression

I've used the rtree library in another case, but searching for close data and range search is extremely fast. Moreover, at that time, it was a distance search for 8-dimensional data. By the way, multiple dimensions can be realized by specifying properties. p.dimension = 3 The biggest reason for choosing this library is that it supports multiple dimensions. → If 2D is called multidimensional, then all of them are multidimensional. I think there were few things that could be done in 8 dimensions.

Originally, I used a database to do 8-dimensional distance calculation (which should have been Euclidean distance). I made a query to get 5 items in ascending order of distance. The number of records is hundreds of thousands, but one search takes less than 10 seconds. However, when I searched on R-tree, it took less than 3 seconds and I was able to get the result in 1/3 of the time. great!

Since then, I've become accustomed to using R-tree for everything.

Recommended Posts

Aim for GIS Master # 2 R-tree for fast search of nearby roads
Aim for GIS Master # 3 I investigated a mesh-like Geocoding method