[PYTHON] Created a service that allows you to search J League data

Overview

Article summary

In this article I introduce the service I built and walk through the development flow and the points where I got stuck. I hope it helps anyone building something similar.

App overview

- This is a service that lets you search soccer J-League data.
- You can jump to the search site from here.
- It consists of a Web API that serves J League data collected by scraping and a front-end screen for searching.
- Currently, you can look up consecutive-win records from 2017 to 2019.
- Front search screen

Reason for creating the app

- I'm a supporter of Kawasaki Frontale, a soccer J-League team. After the team broke the record for consecutive wins in 2020 (10 wins in a row!), I wanted to look up past data.
- I wanted to build a Web API myself to understand Web APIs better.

Overall composition

The source code for both the front end and the back end is on GitHub.

Server configuration diagram

[Figure: server configuration diagram]

ER diagram

[Figure: ER diagram]

Technology used

- Python (with the Flask library)

Development flow & stumbling points

Flow

  1. Build an environment with Docker
  2. Create a DB by scraping J League data site
  3. Convert data to Web API using Flask API
  4. Deploy to AWS

Flow details

I will explain the flow of development while introducing the sites that I referred to.

1. Build an environment with Docker

I based the setup on the article "Connecting to MySQL with Python in Docker".

version: "3"

services:
  mysql_db_j:
    container_name: "mysql_db_j"
    image: mysql:5.7
    command: mysqld --character-set-server=utf8 --collation-server=utf8_unicode_ci
    environment: # Set up mysql database name and password
      MYSQL_ROOT_PASSWORD: password
      MYSQL_DATABASE: employees
      MYSQL_USER: user
      MYSQL_PASSWORD: password
    networks:
      - app-tier

  python3_j:
    restart: always
    build: .
    ports:
      - '3000:3000'
      - '5000:5000'
    container_name: "python3_j"
    working_dir: "/root/"
    tty: true
    stdin_open: true
    depends_on:
      - mysql_db_j
    networks:
      - app-tier
    volumes:
      - .:/root/app/
    # command: python3 app/app.py
    command: >
      bash -c "python3 app/create_teamid.py
      && python3 app/create_result.py
      && python3 app/app.py
      "
    
networks:
  app-tier:
    driver: bridge

This launches a MySQL container and a Python container and connects them over a shared network. The `command:` entry also creates the DB and starts Flask at startup.

2. Create a DB by scraping J League data site

Scraping uses `requests` and `beautifulsoup`. I used `mysql-connector-python` to connect from Python to MySQL. References:
- Summary of how to use Python + mysql-connector-python
- How to install and connect MySQL Connector with Python3 [Introducing how to use it comfortably]
- Official documentation
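As a rough illustration of the parsing half of this step, here is a minimal sketch that turns an HTML results table into a list of dictionaries ready to be inserted into the DB. To keep it free of dependencies it uses the standard library's `html.parser` in place of `beautifulsoup`, and it reads a hard-coded string instead of fetching a page with `requests`. The table layout, the column names, and the team names are all hypothetical; the real data site will look different.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows made only of <th> cells stay empty and are skipped
                self.rows.append(self._row)
            self._row = None


def parse_results(html):
    """Return one dict per match row (assumed columns: date, home, score, away)."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [
        {"date": date, "home": home, "score": score, "away": away}
        for date, home, score, away in parser.rows
    ]


# Hypothetical markup; the real site's table layout will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Home</th><th>Score</th><th>Away</th></tr>
  <tr><td>2019-02-23</td><td>Team A</td><td>2-1</td><td>Team B</td></tr>
  <tr><td>2019-03-01</td><td>Team C</td><td>0-0</td><td>Team A</td></tr>
</table>
"""

results = parse_results(SAMPLE_HTML)
```

Each dictionary in `results` can then be written to MySQL with a parameterized `INSERT`.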

3. Convert data to Web API using Flask API

The data saved in the DB is sent as JSON using Flask. Reference:
- Easy to use Flask
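To make the "DB rows in, JSON out" idea concrete, here is a small sketch in plain Python of what a consecutive-wins search could do before the result is handed back as JSON. The query parameter names (`yearFrom`, `yearTo`, `continuou_recordFrom`, `continuou_recordTo`) are taken from the request URL that appears later in this article; the record shape, the helper names, and the sample data are my assumptions, not the actual implementation.

```python
import json


def winning_streaks(results):
    """Return the lengths of all runs of consecutive wins.

    `results` is a chronological list of "W", "D" or "L" strings.
    """
    streaks = []
    run = 0
    for outcome in results:
        if outcome == "W":
            run += 1
        else:
            if run:
                streaks.append(run)
            run = 0
    if run:  # a streak that is still running at the end of the list
        streaks.append(run)
    return streaks


def search_records(records, params):
    """Filter streak records by year range and streak length, return JSON text."""
    year_from, year_to = int(params["yearFrom"]), int(params["yearTo"])
    min_len = int(params["continuou_recordFrom"])
    max_len = int(params["continuou_recordTo"])
    hits = [
        r for r in records
        if year_from <= r["year"] <= year_to and min_len <= r["streak"] <= max_len
    ]
    return json.dumps(hits)


# Hypothetical season data standing in for rows read from the DB.
season = {
    ("Team A", 2018): ["W", "W", "D", "W", "W", "W", "L"],
    ("Team B", 2018): ["L", "W", "D"],
}
records = [
    {"team": team, "year": year, "streak": length}
    for (team, year), outcomes in season.items()
    for length in winning_streaks(outcomes)
]

params = {"yearFrom": "2017", "yearTo": "2019",
          "continuou_recordFrom": "2", "continuou_recordTo": "15"}
body = search_records(records, params)
```

In a Flask view, the same list of dictionaries would be returned with `jsonify` instead of `json.dumps`.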

Also, you have to use `flask_cors`. Without it, I get an error like this:

Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.

It is necessary as a CORS measure.
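In the Flask app, calling `CORS(app)` after `from flask_cors import CORS` adds the missing headers. To show what the browser is actually looking for, here is a dependency-free sketch of a tiny WSGI app that answers the preflight `OPTIONS` request and attaches `Access-Control-Allow-Origin` to normal responses. It is an illustration of the protocol, not a replacement for `flask_cors`; the allowed origin `*`, the response body, and the helper `call` are assumptions made for the example.

```python
import json

ALLOWED_ORIGIN = "*"  # assumption: allow any origin; restrict this in production


def cors_app(environ, start_response):
    """Tiny WSGI app that answers preflight requests and tags responses for CORS."""
    cors_headers = [
        ("Access-Control-Allow-Origin", ALLOWED_ORIGIN),
        ("Access-Control-Allow-Methods", "GET, OPTIONS"),
        ("Access-Control-Allow-Headers", "Content-Type"),
    ]
    if environ["REQUEST_METHOD"] == "OPTIONS":
        # Preflight: no body, only the headers the browser checks for.
        start_response("204 No Content", cors_headers)
        return [b""]
    body = json.dumps({"status": "ok"}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json")] + cors_headers)
    return [body]


def call(app, method):
    """Invoke a WSGI app directly (no server) and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    chunks = app({"REQUEST_METHOD": method, "PATH_INFO": "/"}, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)


preflight_status, preflight_headers, _ = call(cors_app, "OPTIONS")
```

If the `Access-Control-Allow-Origin` header is absent from the preflight response, the browser refuses to send the real request, which is exactly the error quoted above.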

4. Deploy to AWS

I installed Docker on EC2 and started the containers there, referring to the articles below. I really wanted to use ECS ... References:
- [How to install docker and docker-compose on AWS EC2 instance and launch a simple web service](https://qiita.com/y-do/items/e127211b32296d65803a#%E7%B5%8C%E7%B7%AF)
- [Start the Docker container on EC2 and run Flask (Try creating a simple Web service with Flask [5th])](http://pixelbeat.jp/build-flask-on-container-on-ec2/)

Stumbling points

What should I do with the design ...

When I started developing a Web API, I couldn't find much information that could serve as a reference for the "first step". There are various explanations of RESTful principles, but I was left wondering what the standard practice actually is. For the time being, I designed it based on my own experience of using Web APIs.

Data is not saved to the DB!?

I was connecting to MySQL with a library called mysql-connector-python, but changes I made from Python were not saved to the DB even though the code ran. I was stuck for a long time, but the official documentation for `commit()` says: "Since by default Connector/Python does not autocommit, it is important to call this method after every transaction that modifies data for tables that use transactional storage engines." I have to read the documentation properly!

AWS is tough

When I published the front end on GitHub Pages, the following error occurred:

index.js:30 Mixed Content: The page at 'https://yuta97.github.io/j-search-front/' was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint 'http://18.178.166.9:5000/continuous-records/match/win/?yearFrom=2017&yearTo=2019&continuou_recordFrom=2&continuou_recordTo=15'. This request has been blocked; the content must be served over HTTPS.

In short, the problem is that a site served over HTTPS calls an API server over HTTP. Browsers block this kind of mixed content. As a solution, I tried to enable SSL on the EC2 API server, but it didn't work ... As a last resort, I host the front end on S3 and use HTTP communication from there.
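The rule behind the error can be written down in a few lines. This is a simplified sketch of the check, not the browser's full algorithm (real browsers make exceptions, for example for localhost); the function name and the `http://example.com/` front-end URL are hypothetical, while the other two URLs come from the error message above.

```python
from urllib.parse import urlsplit


def is_blocked_mixed_content(page_url, request_url):
    """True when an HTTPS page requests a plain-HTTP endpoint (simplified rule)."""
    page_scheme = urlsplit(page_url).scheme
    request_scheme = urlsplit(request_url).scheme
    return page_scheme == "https" and request_scheme == "http"


API_URL = "http://18.178.166.9:5000/continuous-records/match/win/"

# GitHub Pages serves the front end over HTTPS, so the HTTP API call is blocked.
blocked_on_pages = is_blocked_mixed_content(
    "https://yuta97.github.io/j-search-front/", API_URL
)

# A front end served over plain HTTP (as with the S3 workaround) is not affected.
blocked_on_http_host = is_blocked_mixed_content("http://example.com/", API_URL)
```

This is why moving the front end to an HTTP-hosted page works around the error, and why serving the API over HTTPS would be the proper fix.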

Impressions

Some loosely written impressions.

- It is important for your mental health to start small and improve step by step, as in agile development.
- How do people in the media check for record updates? Is there a database somewhere?
- You have to get used to AWS by using it. I still have a hard time understanding it.
- I want to know more about security. If you don't know what kinds of attacks exist, you won't know how to defend against them.

Future outlook

I will list what I want to do in the future.

- Sort function
- Search for other data (consecutive losses, unbeaten runs, etc.)
- Strengthening security (such as requiring an API key)
- API tests
- API documentation (can it be done with Swagger?)
- Use of other AWS services (API Gateway, ECS, RDS)
- SSL
- Automatically collect the latest data
- UI improvement (I want to use Vue.js etc.)
