[PYTHON] Creating a scraping tool
This is my first post on Qiita. Thank you for reading.
# Self-introduction
31 years old, male.
Graduated from the Department of Information Science at a national university.
Age 22: joined an independent SIer and was stationed on-site at a food wholesale company.
Age 26: moved to the information systems department of that food wholesale company,
where I have worked ever since.
# Why I wanted to learn Python
I was transferred to the planning side of the food wholesale company's information systems department and proposed introducing AI. However, the AI vendors' quotes were far too expensive, and the user departments rejected the proposal as not cost-effective.
I had a complex about only ever having done legacy development, so I wondered whether I could put together a proposal that incorporated deep learning myself, and started studying Python. While studying Python I learned about scraping, thought there would be demand for it, and tried turning it into a tool.
# About the scraping tool
The food wholesale company has more than 50 branches and more than 100 stores, each with different customers, so it was impossible for the information systems department to handle everything centrally. The tool is therefore designed so that anyone at a store with some IT knowledge and patience can use it.
### Execution method
A batch file is distributed to each PC's Startup folder, so that when the PC is started in the morning, the batch file runs the Python program. The program fetches each customer's information and, if there is any difference from the previously fetched content, shows the URL and the new information in a pop-up.
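As a rough sketch of the compare-and-notify part (simplified for this post; the file layout and function names here are just for illustration, not the exact code in the repository):

```python
import csv
import os
from tkinter import Tk, messagebox

def load_rows(path):
    """Load a CSV file as a list of rows; return [] if it does not exist yet."""
    if not os.path.exists(path):
        return []
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))

def notify_if_changed(url, previous_path, new_rows):
    """Compare newly scraped rows with the previous run and pop up anything new."""
    previous_rows = load_rows(previous_path)
    added = [row for row in new_rows if row not in previous_rows]
    if added:
        root = Tk()
        root.withdraw()  # hide the empty main window, show only the message box
        details = "\n".join(", ".join(row) for row in added)
        messagebox.showinfo("New information", f"{url}\n{details}")
        root.destroy()
    # overwrite the previous result so tomorrow's run compares against today's
    with open(previous_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(new_rows)
```

The batch file in the Startup folder only needs to call the Python interpreter with the script, so the stores never have to open a terminal themselves.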
### File structure
The input is a simple CSV file, so users can create it themselves. The output is also CSV, which makes it easy to compare against the previous run. (A sketch of reading this file follows the list below.)
### Specified items
1. URL
2. Class names of the items to acquire (up to 3 can be specified)
3. Output file name
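Assuming one row per target in the input CSV (URL, up to three class names, then the output file name), a minimal sketch of reading that file and scraping by class could look like the following. The file name `targets.csv` and the use of requests and BeautifulSoup here are just illustrative assumptions.

```python
import csv
import requests
from bs4 import BeautifulSoup

def read_targets(config_path):
    """Read the input CSV: URL, class1[, class2[, class3]], output file name."""
    targets = []
    with open(config_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            url, *class_names, out_file = row
            # keep only the class names that were actually filled in (up to 3)
            targets.append((url, [c for c in class_names if c], out_file))
    return targets

def scrape_target(url, class_names):
    """Fetch the page and collect the text of every element with one of the classes."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    rows = []
    for class_name in class_names:
        for element in soup.find_all(class_=class_name):
            rows.append([class_name, element.get_text(strip=True)])
    return rows

for url, class_names, out_file in read_targets("targets.csv"):
    results = scrape_target(url, class_names)
    with open(out_file, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(results)
```

Keeping the input to just these three kinds of fields is what lets each store maintain its own file without help from the information systems department.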
### Challenges
1. Items without a class cannot be acquired
→ If the element you want does not have a class defined, the tool cannot get it. I considered also matching on id and name, but that would make the input confusing, so I left it out. A point for future improvement.
2. Extra items other than the target may also be acquired
→ Since the output is not used as input data for anything else, users are asked to delete the unnecessary parts themselves. I did add handling for acquiring weather information, but generalizing that kind of filtering would get complicated, so it is not supported. A point for future improvement.
3. Users must take care not to violate each site's terms of service and not to put excessive load on the servers (a small sketch of this follows the list).
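On point 3, a small sketch of what being careful can look like in code: checking robots.txt before fetching and pausing between requests. This is a general pattern rather than the exact behaviour of the tool.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def allowed_by_robots(url, user_agent="*"):
    """Check the site's robots.txt before fetching a URL."""
    parts = urlparse(url)
    robots = RobotFileParser()
    robots.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    robots.read()
    return robots.can_fetch(user_agent, url)

# hypothetical URLs; in the tool these would come from the input CSV
target_urls = ["https://example.com/news", "https://example.com/prices"]

for url in target_urls:
    if allowed_by_robots(url):
        print(f"OK to fetch: {url}")
        # ...fetch and parse the page here...
    else:
        print(f"Disallowed by robots.txt, skipping: {url}")
    time.sleep(5)  # pause between requests so the site is not overloaded
```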
# And on to changing jobs
This has turned into a long post, but now that I have picked up some web-related development skills, I want to become an engineer who does not depend on a single company, so I have started looking for a new job. I am using this scraping tool as my portfolio.
It is published on GitHub. I would be very grateful for any advice.
https://github.com/yamamasa2020/scraping-tool