[Python] Until a scraping beginner saves the J-League standings to a CSV file

Review of scraping

I was curious about scraping and wanted to try getting some data, so I gave it a go while following the site below. https://www.atmarkit.co.jp/ait/articles/1910/18/news015_2.html I am writing this up as a review, and I hope it helps others who are new to scraping! The code was written in Python on Google Colab, so some parts may differ from what you would write locally.

Basics of scraping

I scraped with urllib's request module and Beautiful Soup. request fetches the specified web page or other file, and Beautiful Soup extracts the desired information from what was fetched. As on the reference site, I wrote a program that gets the J-League standings. In addition, I went as far as saving the result to a CSV file. The code used this time is shown below.


from bs4 import BeautifulSoup
from urllib import request

url = 'https://www.jleague.jp/standings/j1/'
response = request.urlopen(url)
content = response.read()
response.close()

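# Fall back to UTF-8 when the server does not declare a charset
charset = response.headers.get_content_charset() or 'utf-8'
html = content.decode(charset, 'ignore')
# Name the parser explicitly so that Beautiful Soup does not have to guess
soup = BeautifulSoup(html, 'html.parser')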

table = soup.find_all('tr')

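standing = []
for row in table:
    tmp = []
    for item in row.find_all('td'):
        if item.a:
            # Cells that contain a link: keep only the first half of the text
            tmp.append(item.text[0:len(item.text) // 2])
        else:
            tmp.append(item.text)
    # Drop the first and last cells, which are not needed in the table
    del tmp[0]
    del tmp[-1]
    standing.append(tmp)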

for item in standing:
    print(item)

import pandas as pd

# Drop the first entry so that only the club rows remain
del standing[0]
df = pd.DataFrame(standing, columns=['Rank', 'Club', 'Points', 'Played', 'Won', 'Drawn', 'Lost', 'Goals for', 'Goals against', 'Goal difference'])

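from google.colab import drive

# Mount Google Drive so that the path below exists in the Colab runtime
drive.mount('/content/drive')

filename = 'j1league.csv'
path = '/content/drive/My Drive/' + filename

# utf-8-sig adds a BOM so that Excel opens the file as UTF-8
with open(path, 'w', encoding='utf-8-sig') as f:
    df.to_csv(f, index=False)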
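
The file above is opened with the utf-8-sig encoding. As a minimal sketch of what that encoding does, the following writes a small table and reads it back. It is not the code from this article: it uses the standard csv module instead of pandas and a temporary directory instead of Google Drive so that it runs anywhere, and the header, rows, and the helper name write_and_read_back are hypothetical sample values.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the scraped standings
header = ['Rank', 'Club', 'Points']
rows = [['1', 'Club A', '70'], ['2', 'Club B', '65']]


def write_and_read_back(path):
    """Write the sample table as CSV, then read it back as a list of rows."""
    # utf-8-sig prepends a byte order mark (BOM) to the file
    with open(path, 'w', encoding='utf-8-sig', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    # Reading with utf-8-sig strips the BOM again
    with open(path, encoding='utf-8-sig', newline='') as f:
        return list(csv.reader(f))


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'j1league_sample.csv')
    result = write_and_read_back(path)
    # Check the raw bytes: the UTF-8 BOM is EF BB BF
    with open(path, 'rb') as f:
        starts_with_bom = f.read(3) == b'\xef\xbb\xbf'

print(result)
print(starts_with_bom)
```

Reading the file back with plain utf-8 instead would leave the BOM character at the start of the first header cell, so it is safer to read with the same encoding that was used for writing.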

While implementing this I checked each step in detail and put print() calls in between, but the code here runs everything through to saving the file in one go.
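
As an example of checking one step on its own, here is a small sketch that applies the same row trimming as the loop above to a hand-made row, without any network access. The function name clean_row and the sample row are hypothetical. The sample assumes that the text of a linked cell contains the club name twice, which is what the halving in the loop suggests, and that the first and last cells carry nothing needed.

```python
def clean_row(cells):
    """Trim one row given as (text, has_link) pairs, like the loop above."""
    tmp = []
    for text, has_link in cells:
        if has_link:
            # Keep only the first half of the text of a linked cell
            tmp.append(text[0:len(text) // 2])
        else:
            tmp.append(text)
    # Drop the first and last cells
    del tmp[0]
    del tmp[-1]
    return tmp


# Hypothetical row: empty first cell, rank, linked club cell, points, empty last cell
sample = [('', False), ('1', False), ('Club AClub A', True), ('70', False), ('', False)]
print(clean_row(sample))  # ['1', 'Club A', '70']
```

Note that a row with no cells at all would raise an IndexError at del tmp[0], so printing the rows while developing is a quick way to notice such cases.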
