[PYTHON] Machine learning in Delemas (data acquisition)

The other day I finished the much-talked-about Coursera machine learning course, so I want to try it out in practice. I will try to predict the three types (Cu, Co, Pa) using the profile data of Idolmaster Cinderella Girls (.wikipedia.org/wiki/%E3%82%A2%E3%82%A4%E3%83%89%E3%83%AB%E3%83%9E%E3%82%B9%E3%82%BF%E3%83%BC_%E3%82%B7%E3%83%B3%E3%83%87%E3%83%AC%E3%83%A9%E3%82%AC%E3%83%BC%E3%83%AB%E3%82%BA).

Data acquisition

The first step is to acquire the data used for training. I looked for a Delemas counterpart to the Pokemon API but could not find a suitable one, so I got the data from the Delemas wiki I usually use (wiki.gamerch.com/).

For the scraping method, I referred to the following page: http://qiita.com/Azunyan/items/9b3d16428d2bcc7c9406

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import urllib2
import csv
from bs4 import BeautifulSoup

#URL to access
url = "https://imascg-slstage-wiki.gamerch.com/%E3%82%A2%E3%82%A4%E3%83%89%E3%83%AB%E4%B8%80%E8%A6%A7"
#Read URL
html = urllib2.urlopen(url)
#Handle html with Beautiful Soup
soup = BeautifulSoup(html, "html.parser")
#Get all the contents of the first table
table = soup.findAll("table")[0]
#Decompose table row by row
rows = table.findAll("tr")

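# Write each table row to the CSV file; "with" closes the file when done.
# Python 2's csv module expects the file to be opened in binary mode.
with open("aimasudata.csv", 'wb') as csvFile:
    writer = csv.writer(csvFile)
    for row in rows:
        csvRow = []
        for cell in row.findAll(['td', 'th']):
            # Encode each cell as UTF-8 so Japanese text can be written
            csvRow.append(cell.get_text().encode('utf-8'))
        writer.writerow(csvRow)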
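
The script above targets Python 2 (urllib2 exists only there). A rough Python 3 equivalent might look like the sketch below. The function names table_to_rows, save_rows, and fetch_html are made up for illustration and do not come from the original script. It also strips surrounding whitespace from each cell, which the original does not do.

```python
import csv
import urllib.request

from bs4 import BeautifulSoup


def table_to_rows(html, table_index=0):
    """Return the cell texts of one <table> on the page as a list of rows."""
    soup = BeautifulSoup(html, "html.parser")
    # Pick the table by its position on the page (0 = first table).
    table = soup.find_all("table")[table_index]
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all(["td", "th"])
        # In Python 3, get_text() returns str, so no per-cell encode is needed.
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


def save_rows(rows, path):
    """Write the rows to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


def fetch_html(url):
    """Download a page and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```

With the url from the script above, save_rows(table_to_rows(fetch_html(url)), "aimasudata.csv") would be the corresponding call. Whether the wiki still serves the same table layout is not something this sketch can guarantee.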

Result

The output looks like this (screenshot: スクリーンショット 2017-04-01 23.10.30.png).

Note

- I did not know how to read HTML tags, so it took a long time to work out what to pass to soup.findAll. If you just want the data from a table, specify the table tag and find out which table on the page it is (its index, starting from 0).
- Calling cell.get_text() on Japanese data reportedly fails because the text cannot be encoded as ASCII, so encoding to UTF-8 is required.
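
The encoding note can be illustrated with a small sketch. It assumes Python 3, where the failure only appears if ASCII is requested explicitly. In the Python 2 script above, the csv module attempts that ASCII conversion implicitly when handed Unicode text. The sample string and file name here are made up and are not taken from the wiki.

```python
import csv
import os
import tempfile

text = "アイドル"  # sample Japanese text, not taken from the wiki

# ASCII has no representation for Japanese characters, so encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent the same text; each of these characters takes 3 bytes.
utf8_bytes = text.encode("utf-8")

# In Python 3, pass the encoding to open() instead of encoding each cell.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow([text, "Cu"])

# Reading the file back with the same encoding recovers the original text.
with open(path, newline="", encoding="utf-8") as f:
    first_row = next(csv.reader(f))

print(ascii_ok, len(utf8_bytes), first_row)
```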
