My motivation, in bullet points:
- Geolonia Address Data had just been released, so I wanted to try something with it
- I found a suspicious person information site around the same time, so this was an easy decision
- I wanted to try scraping with Selenium (I had only ever used Beautiful Soup)
I made something like this.
https://suzukidaisuke.gitlab.io/fushinsha_map/
It's like a suspicious-person-information version of Oshima Teru (Oshimaland). Admittedly, it's far less polished than that.
I've uploaded the code to GitLab, so you can read it there: https://gitlab.com/suzukidaisuke/fushinsha_map
What I did:
- Scraped the suspicious person reports for Tokyo (August 2020) from the Japan Suspicious Person Information Center
- Tokyo reports basically have titles like the following, so I extract the address part with a regular expression:

    (Tokyo) Public obscenity at 3-chome, Tanashi-cho, Nishitokyo-shi, after noon on August 24

- Joined the extracted address against the Geolonia address data to get latitude and longitude
- Plotted the reports on a map using the folium library
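The extract-then-join steps above can be sketched roughly like this. Everything here is illustrative: the real titles are in Japanese so the actual regex in the repo differs, and the Geolonia data is a CSV with one row per chome, stood in for below by a small dict with made-up coordinates.

```python
import re

# A report title in the Tokyo format described above.
title = ("(Tokyo) Public obscenity at 3-chome, Tanashi-cho, "
         "Nishitokyo-shi, after noon on August 24")

# Extract the address part with a regular expression.
# This English-language pattern is only an illustration of the idea.
match = re.search(r"at (.+?), after", title)
address = match.group(1) if match else None
print(address)  # 3-chome, Tanashi-cho, Nishitokyo-shi

# Tiny in-memory stand-in for the Geolonia address data.
# The coordinates are made up for illustration.
geolonia = {
    "3-chome, Tanashi-cho, Nishitokyo-shi": (35.73, 139.54),
}
lat, lon = geolonia[address]
print(lat, lon)
```

From there, the notebook in the repo plots each geocoded report with folium, roughly one `folium.Marker([lat, lon], popup=title)` per report added to a `folium.Map`.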
- For how quickly it came together, I think it turned out nicely
- If you look at the notebook in the Git repository, you can see roughly how it works
- It seems I could take it further (a nationwide version, setting up a server, etc.), but I'll stop here
- If your addresses go down to the "○○-chome" level, you can easily build something like this
- Selenium is easy and convenient, but is Beautiful Soup faster?
- I didn't notice that I had pushed a large file to Git. Maybe I should use something like this: https://docs.gitlab.com/ee/topics/git/lfs/
Be careful of suspicious people.