Reading ShapeFiles of the National Land Numerical Information with Python and importing them into a database

ShapeFile is a data format that stores the position and shape of spatial features together with their attribute information. The files with the .shp extension included in downloads from sources such as the National Land Numerical Information are ShapeFiles.

The goal here is to read such a ShapeFile with Python and store its contents in SpatiaLite.

ShapeFile details

A ShapeFile consists of three files: a main file, an index file, and an attribute file. Their file names are identical except for the extension.

■ Main file: counties.shp
■ Index file: counties.shx
■ Attribute file: counties.dbf

The main file contains the spatial data. The index file records where each record is located in the main file, which makes it easy to access individual shapes. The attribute file stores the attribute values of each record.
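
Because the three files differ only in their extension, the companion files can be derived from the path of the main file. The following is a minimal sketch of that idea; the helper name shapefile_components is hypothetical and is not part of any library.

```python
from pathlib import Path


def shapefile_components(shp_path):
    """Return the expected main, index, and attribute file paths for one ShapeFile."""
    base = Path(shp_path)
    return {
        "main": base.with_suffix(".shp"),
        "index": base.with_suffix(".shx"),
        "attribute": base.with_suffix(".dbf"),
    }


# Report which of the three files are actually present on disk.
components = shapefile_components("counties.shp")
for role, path in components.items():
    print(role, path.name, "found" if path.exists() else "missing")
```

A check like this is useful before processing downloaded data, since a missing .shx or .dbf file only shows up later as a read error.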

For the detailed specification of these files, refer to the following document.

** Shapefile technical information ** http://www.esrij.com/cgi-bin/wp/wp-content/uploads/documents/shapefile_j.pdf
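
As one illustration of what the specification describes, the main file and the index file both begin with a fixed 100-byte header. The sketch below decodes a few of its fields with the standard struct module. The function name parse_shp_header and the sample values are made up for illustration; the byte layout follows the published specification (file code 9994 and file length are big-endian, the version, shape type, and bounding box are little-endian).

```python
import struct


def parse_shp_header(header):
    """Decode selected fields of the 100-byte header of a .shp or .shx file."""
    if len(header) < 100:
        raise ValueError("a ShapeFile header is 100 bytes long")
    (file_code,) = struct.unpack(">i", header[0:4])       # big-endian
    (length_words,) = struct.unpack(">i", header[24:28])  # big-endian, 16-bit words
    version, shape_type = struct.unpack("<ii", header[28:36])  # little-endian
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    if file_code != 9994:
        raise ValueError("not a ShapeFile: file code is %d" % file_code)
    return {
        "file_length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
    }


# Synthetic header for demonstration only (not taken from real data).
sample = (
    struct.pack(">i", 9994)
    + b"\x00" * 20
    + struct.pack(">i", 50)
    + struct.pack("<ii", 1000, 3)
    + struct.pack("<8d", 136.0, 36.0, 137.0, 37.0, 0.0, 0.0, 0.0, 0.0)
)
info = parse_shp_header(sample)
print(info)
```

With a real file, the same function could be given the first 100 bytes read in binary mode. In practice the library introduced in the next section handles this parsing for you.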

Manipulate ShapeFile in Python

To work with ShapeFiles in Python, the following library is recommended.

https://github.com/GeospatialPython/pyshp

** Installation method ** Place shapefile.py in any folder on the import path and import it. The library is also published on PyPI under the name pyshp.

At the time of writing, this library supports Python 2.4 through the 3.x series.

Example with the National Land Numerical Information

This example reads N02-05-g_RailroadSection.shp, which is part of the railway data of the National Land Numerical Information.

** National land numerical information Railway data ** http://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-N02-v2_2.html

# -*- coding: utf-8 -*-
from __future__ import print_function
import os
import sys
# Make the pyshp folder next to this script importable
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'pyshp'))
import shapefile

sf = shapefile.Reader('original_data\\N02-05_GML\\N02-05\\N02-05-g_RailroadSection.shp')
for sr in sf.iterShapeRecords():
    # Attribute values of this record
    print('attribute:', sr.record)

    # Shape type code
    # NULL = 0
    # POINT = 1
    # POLYLINE = 3
    # POLYGON = 5
    # MULTIPOINT = 8
    # POINTZ = 11
    # POLYLINEZ = 13
    # POLYGONZ = 15
    # MULTIPOINTZ = 18
    # POINTM = 21
    # POLYLINEM = 23
    # POLYGONM = 25
    # MULTIPOINTM = 28
    # MULTIPATCH = 31
    print('shapeType:', sr.shape.shapeType)

    # List of coordinate points
    print('points:', sr.shape.points)

    # Indexes into points where each part starts (for multi-part lines and polygons)
    print('parts:', sr.shape.parts)
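
The parts list printed above holds the index in points at which each part begins, so a multi-part line or polygon can be separated by slicing points between consecutive start indexes. A minimal sketch follows; the helper name split_parts and the sample coordinates are hypothetical.

```python
def split_parts(points, parts):
    """Split a flat list of points into one list per part."""
    # Each entry in parts is the index in points where a part starts;
    # the last part runs to the end of the list.
    bounds = list(parts) + [len(points)]
    return [points[start:end] for start, end in zip(bounds[:-1], bounds[1:])]


# Made-up data: five points forming two separate lines.
points = [(0, 0), (1, 1), (2, 2), (10, 10), (11, 11)]
parts = [0, 3]
lines = split_parts(points, parts)
print(lines)
```

Inside the loop above, the same helper could be called as split_parts(sr.shape.points, sr.shape.parts).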

iterShapeRecords() parses the .shp file sequentially from the beginning. Only one record is held in memory at a time, so it is well suited to processing large data sets.

However, iterShapeRecords() assumes that the next record starts at the byte immediately after the content length recorded in the current record header. This assumption holds for most files, but not for all of them. For example, processing A31-12_17_GML.shp from the following data set results in an error.

** National land numerical information Estimated inundation area (Ishikawa prefecture) ** http://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-A31.html

This file cannot be parsed from the .shp file alone because there is garbage data between the records. The index file (.shx) is needed to locate where each record actually starts.
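
To make the difference concrete, the sketch below decodes the record offsets stored in an index file and compares them with the position that sequential parsing would assume. According to the specification, each index entry after the 100-byte header is a pair of big-endian 32-bit integers (offset and content length), both counted in 16-bit words. The function name read_shx_offsets and the byte values are made up for illustration and are not taken from the actual A31 data.

```python
import struct


def read_shx_offsets(shx_bytes):
    """Return (byte_offset, content_bytes) for every record listed in a .shx file."""
    entries = []
    # Skip the 100-byte header; each index entry is 8 bytes long.
    for pos in range(100, len(shx_bytes) - 7, 8):
        offset_words, length_words = struct.unpack(">ii", shx_bytes[pos:pos + 8])
        # Both values are in 16-bit words, so double them to get bytes.
        entries.append((offset_words * 2, length_words * 2))
    return entries


# Synthetic index with two records. The second record starts 6 bytes later
# than the end of the first one, imitating garbage between records.
shx = b"\x00" * 100 + struct.pack(">ii", 50, 10) + struct.pack(">ii", 67, 10)
offsets = read_shx_offsets(shx)

# Sequential parsing expects the next record right after the
# 8-byte record header plus the content of the current record.
first_offset, first_length = offsets[0]
sequential_guess = first_offset + 8 + first_length

print("offsets from index:", offsets)
print("sequential guess for record 2:", sequential_guess)
```

In this made-up case the sequential guess lands inside the garbage, while the offset from the index points at the real record header. This is why index-based access still works on such files.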

In such a case, the file can be read without iterShapeRecords() by requesting each record by its index, as shown below.

# -*- coding: utf-8 -*-
from __future__ import print_function
import os
import sys
# Make the pyshp folder next to this script importable
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'pyshp'))
import shapefile

sf = shapefile.Reader('original_data\\A31-12\\output\\A31-12_17_GML\\old\\A31-12_17.shp')
try:
    # This .shp file is not well formed: the content length stored in the
    # file does not match the actual length of the records.
    # The only option is to fetch each record through the .shx file.
    i = 0
    while True:
        # Raises IndexError once i goes past the last record
        shape = sf.shape(i)
        rec = sf.record(i)
        # Attribute values of this record
        print('attribute:', rec)

        # Shape type code
        # NULL = 0
        # POINT = 1
        # POLYLINE = 3
        # POLYGON = 5
        # MULTIPOINT = 8
        # POINTZ = 11
        # POLYLINEZ = 13
        # POLYGONZ = 15
        # MULTIPOINTZ = 18
        # POINTM = 21
        # POLYLINEM = 23
        # POLYGONM = 25
        # MULTIPOINTM = 28
        # MULTIPATCH = 31
        print('shapeType:', shape.shapeType)

        # List of coordinate points
        print('points:', shape.points)

        # Indexes into points where each part starts (for multi-part lines and polygons)
        print('parts:', shape.parts)

        i += 1
except IndexError:
    # All records have been read
    pass

With these techniques, a ShapeFile can be parsed with Python and its contents imported into SpatiaLite.
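
As a rough sketch of the import step, the points and parts of a polyline can be converted to WKT text and written to a database. The example below uses only the standard sqlite3 module and stores the WKT as plain text, so it runs without the SpatiaLite extension. The table name railroad, its columns, and the sample coordinates are all hypothetical. With SpatiaLite loaded, the same WKT could instead be passed to its GeomFromText() function together with an appropriate SRID to create a real geometry value.

```python
import sqlite3


def to_multilinestring_wkt(points, parts):
    """Build a MULTILINESTRING WKT string from pyshp-style points and parts."""
    bounds = list(parts) + [len(points)]
    lines = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        coords = ", ".join("%r %r" % (x, y) for x, y in points[start:end])
        lines.append("(" + coords + ")")
    return "MULTILINESTRING(" + ", ".join(lines) + ")"


# In-memory database with a hypothetical table; geometry is kept as WKT text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE railroad (id INTEGER PRIMARY KEY, name TEXT, wkt TEXT)")

# Made-up coordinates: four points forming two separate lines.
sample_points = [(136.5, 36.5), (136.6, 36.6), (137.0, 37.0), (137.1, 37.1)]
sample_parts = [0, 2]
wkt = to_multilinestring_wkt(sample_points, sample_parts)

conn.execute("INSERT INTO railroad (name, wkt) VALUES (?, ?)", ("sample line", wkt))
conn.commit()
stored = conn.execute("SELECT wkt FROM railroad").fetchone()[0]
print(stored)
```

This only illustrates the general flow of shape to WKT to table row. The program linked below may organize its tables and geometry handling differently.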

The following program reads ShapeFiles of the National Land Numerical Information, including sediment disaster risk location data, estimated inundation area data, and data on tornadoes and other wind gusts, and stores them in SpatiaLite. https://github.com/mima3/kokudo/blob/master/kokudo_db.py

Demo http://needtec.sakura.ne.jp/kokudo/

For how to use SpatiaLite, refer to the following article. http://qiita.com/mima_ita/items/64f6c2b8bb47c4b5b391
