Python dummy data generation (address edition)

Background

Generating an address with `address()` from faker-python gives something like this:

4-14-7 Matsuishi, Nakano-ku, Hyogo Crest Chizuka 106

This puts Nakano Ward in Hyogo Prefecture, which is a thoroughly fake address. If you search for an address like this, for example when testing an application that actually uses a map, the lookup will not resolve to a pinpoint location.

So this time I want to generate data with the following two points in mind:

**1. Generate real address data instead of fake data.**
**2. Get many addresses in a single run as dummy data.**

How to build it

The first method that came to mind was to fetch address information from an API that provides it and format the result. I will try that for now.

HeartRails Geo API

The API I will rely on this time is the HeartRails Geo API. Dummy addresses are generated using its API for searching addresses by latitude and longitude.
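For reference, the request behind this lookup can be sketched with the standard library alone. The endpoint and the `method`, `x` and `y` parameters are the ones the program in this article uses; the helper name `build_url` is made up for this illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

# JSON endpoint as used in this article
BASE_URL = 'http://geoapi.heartrails.com/api/json'

def build_url(lug, lat):
    """Build the searchByGeoLocation URL for a longitude (x) / latitude (y) pair."""
    query = urlencode({'method': 'searchByGeoLocation', 'x': lug, 'y': lat})
    return BASE_URL + '?' + query

# Print the URL only; nothing is fetched here
print(build_url(139.699048, 35.780655))
```

Note that `x` is the longitude and `y` is the latitude, which is easy to get backwards.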

Address generation program

import random

import requests

# The method name is sent as a query parameter in get_data(), so it is not repeated in the URL
#xml_url = 'http://geoapi.heartrails.com/api/xml'
json_url = 'http://geoapi.heartrails.com/api/json'

# API request function
def get_data(lug, lat):
    payload = {'method': 'searchByGeoLocation', 'x': lug, 'y': lat}
    try:
        ret = requests.get(json_url, params=payload, timeout=10)
        return ret.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # ValueError covers a response body that is not valid JSON
        print("ErrorContent: ", e)
        return None

# Address data formatting function
def serealize_data(data):
    try:
        dic = data['response']['location'][0]
        return dic['prefecture'] + dic['city'] + dic['town']
    except (KeyError, IndexError, TypeError) as e:
        # A missing 'location' key means no address was found at that point;
        # TypeError covers data being None after a failed request
        print(e)
        return None

# Random number generation for longitude / latitude
def gene_number(lug_fnum, lug_lnum, lat_fnum, lat_lnum):
    lug = round(random.uniform(lug_fnum, lug_lnum), 6)
    lat = round(random.uniform(lat_fnum, lat_lnum), 6)
    return lug, lat

def main():
    for i in range(10):
        lug, lat = gene_number(123, 154, 20, 46)
        ret = get_data(lug, lat)
        print("longitude:" + str(lug) + "," + "latitude:" + str(lat))
        print("%s\n" % serealize_data(ret))

if __name__ == '__main__':
    main()

I chose JSON as the data format this time, for no particular reason.

- The get_data function fetches the data.
- The serealize_data function formats the fetched data.
- The gene_number function generates a random longitude and latitude within a range covering Japan, which are then passed to get_data.
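To see what the formatting step does without calling the API, here is a small sketch that runs the same lookup on two hand-written dictionaries. The `response`, `location`, `prefecture`, `city` and `town` keys are the ones the program reads. The sample values and the `error` entry are made up for illustration and are not captured API output, and the name `format_address` is hypothetical.

```python
def format_address(data):
    """Join prefecture, city and town of the first hit, or return None."""
    try:
        loc = data['response']['location'][0]
        return loc['prefecture'] + loc['city'] + loc['town']
    except (KeyError, IndexError, TypeError):
        return None

# Hand-written sample of a hit (values are illustrative)
hit = {'response': {'location': [
    {'prefecture': '東京都', 'city': '中野区', 'town': '弥生町三丁目'}
]}}

# Hand-written sample of a miss: no 'location' key (shape is an assumption)
miss = {'response': {'error': 'not found'}}

print(format_address(hit))
print(format_address(miss))
```

Returning None for a miss lets the caller decide whether to skip the point or retry with new coordinates.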

Verification


Longitude: 149.691295,Latitude: 20.525873

'location'
None
Longitude: 146.369748,Latitude: 23.905043

'location'
None
Longitude: 128.552226,Latitude: 28.268003

'location'
None
Longitude: 138.839354,Latitude: 36.14651

Akihata, Kanra-cho, Kanra-gun, Gunma Prefecture
Longitude: 128.442362,Latitude: 24.173392

'location'
None
Longitude: 149.328955,Latitude: 35.501685

'location'
None
Longitude: 143.701187,Latitude: 31.806533

'location'
None
Longitude: 152.518577,Latitude: 38.932277

'location'
None
Longitude: 131.0144,Latitude: 38.670175

'location'
None
Longitude: 149.70269,Latitude: 36.445081

'location'
None

I asked for 10 addresses per run, but most of the results came back as None. Only the fourth one, Akihata, Kanra-cho, Kanra-gun, Gunma Prefecture, was generated.

Why...?

Ah...

As a test, I looked up the second pair of coordinates on a map.

(Map screenshot: zyuusyoo.png)

It's the sea.

Change the latitude and longitude setting to the center of Tokyo

lug,lat = gene_number(139.51,139.76,35.68,35.87)

Longitude: 139.699048,Latitude: 35.780655
4-chome, Azusawa, Itabashi-ku, Tokyo

Longitude: 139.739455,Latitude: 35.733378
1-chome, Sugamo, Toshima-ku, Tokyo

Longitude: 139.542504,Latitude: 35.711219
5-chome, Sekimae, Musashino-shi, Tokyo

Longitude: 139.591343,Latitude: 35.718665
Zenpukuji 3-chome, Suginami-ku, Tokyo

Longitude: 139.683952,Latitude: 35.787578
3-chome, Sakashita, Itabashi-ku, Tokyo

Longitude: 139.624341,Latitude: 35.795323
3-chome, Niikura, Wako City, Saitama Prefecture

Longitude: 139.543935,Latitude: 35.771779
Uenohara 2-chome, Higashikurume-shi, Tokyo

Longitude: 139.517793,Latitude: 35.718969
2-chome, Hanakoganei-Minamicho, Kodaira-shi, Tokyo

Longitude: 139.674748,Latitude: 35.687898
3-chome, Yayoi-cho, Nakano-ku, Tokyo

Longitude: 139.606371,Latitude: 35.746559
4-chome, Shakujiimachi, Nerima-ku, Tokyo

Every one was a hit.

Conclusion

Apparently, when I roughly set the range for Japan to longitude 123-154 and latitude 20-46, I forgot that it also includes a lot of sea.
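A rough back-of-the-envelope check suggests how much of that range is sea. The sketch below compares the area of the longitude 123-154 / latitude 20-46 box with Japan's land area. It assumes a spherical Earth with a 6371 km radius and takes Japan's land area as roughly 378,000 km²; both figures and all names in the code are assumptions for illustration, not part of the original program.

```python
import math

EARTH_RADIUS_KM = 6371.0        # mean Earth radius (assumed spherical)
JAPAN_LAND_AREA_KM2 = 378_000   # approximate land area of Japan

def box_area_km2(lon_min, lon_max, lat_min, lat_max):
    """Area of a longitude/latitude rectangle on a sphere."""
    d_lon = math.radians(lon_max - lon_min)
    band = math.sin(math.radians(lat_max)) - math.sin(math.radians(lat_min))
    return EARTH_RADIUS_KM ** 2 * d_lon * band

area = box_area_km2(123, 154, 20, 46)
land_fraction = JAPAN_LAND_AREA_KM2 / area
print(f"box area: {area:,.0f} km^2, land fraction: {land_fraction:.1%}")
```

Under these assumptions Japanese land covers only around 4 to 5 percent of the box, so getting mostly None from uniformly random points is what one would expect.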

So it seems better to narrow the longitude and latitude range to land areas inside the country. Beyond that, it should be workable with a few functions of my own and some fine-tuning.
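One possible shape for that fine-tuning, as a minimal sketch: sample from a list of land-only boxes and keep retrying until enough addresses have been collected. The names `BOXES`, `random_point`, `collect_addresses` and `fake_lookup` are hypothetical, and the only box listed is the Tokyo range used above. The lookup is passed in as a function so the sketch runs offline; in practice it could be something like `lambda lug, lat: serealize_data(get_data(lug, lat))`.

```python
import random

# Hypothetical land-only boxes: (lon_min, lon_max, lat_min, lat_max).
# Only the Tokyo range from this article is listed; add more as needed.
BOXES = [
    (139.51, 139.76, 35.68, 35.87),
]

def random_point(boxes=BOXES):
    """Pick a box at random, then a random point inside it."""
    lon_min, lon_max, lat_min, lat_max = random.choice(boxes)
    lug = round(random.uniform(lon_min, lon_max), 6)
    lat = round(random.uniform(lat_min, lat_max), 6)
    return lug, lat

def collect_addresses(n, lookup, max_attempts=100):
    """Call lookup(lug, lat) until n addresses are found or attempts run out."""
    found = []
    for _ in range(max_attempts):
        if len(found) >= n:
            break
        address = lookup(*random_point())
        if address is not None:  # None means no address at that point
            found.append(address)
    return found

# Stand-in lookup so the sketch runs without network access
def fake_lookup(lug, lat):
    return f"address near ({lug}, {lat})"

print(collect_addresses(3, fake_lookup))
```

The `max_attempts` cap keeps the loop from hammering the API forever if a box turns out to contain no addresses.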
