Continued from ① https://qiita.com/yohiro/items/04984927d0b455700cd1 ② https://qiita.com/yohiro/items/5aab5d28aef57ccbb19c ③ https://qiita.com/yohiro/items/cc9bc2631c0306f813b5 ④ https://qiita.com/yohiro/items/d376f44fe66831599d0b
- Reference material: Udemy, "Everyone's AI Course: Artificial Intelligence and Machine Learning Learned from Scratch with Python"
scikit-learn is the machine learning library used this time.
Given the length and width of the petals and sepals, the iris variety is identified. 0 represents "Setosa", 1 represents "Versicolor", and 2 represents "Virginica".
from sklearn import datasets
from sklearn import svm
#Reading Iris measurement data
iris = datasets.load_iris()
`iris` contains the following data:
iris.data
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
...
iris.target
[0 0 ... 1 1 ... 2 2]
...
Both have 150 elements. There are 50 labeled samples for each of "0: Setosa", "1: Versicolor", and "2: Virginica".
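The sizes and the per-variety counts mentioned above can be checked directly rather than read off the printed arrays. The following is a minimal sketch; `class_counts` is a name introduced here for illustration.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# 150 samples x 4 features, and 150 labels
print(iris.data.shape)
print(iris.target.shape)

# Column order of iris.data, and the variety name behind each label 0, 1, 2
print(iris.feature_names)
print(iris.target_names)

# Number of samples per label (index = label value)
class_counts = np.bincount(iris.target)
print(class_counts)
```

`iris.feature_names` is worth a look because it shows the column order of `iris.data`: sepal length, sepal width, petal length, petal width.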
# Linear support vector machine
clf = svm.LinearSVC()
#Training with support vector machine
clf.fit(iris.data, iris.target)
Train the support vector machine using the `fit` method.
The linear support vector machine used this time is a model that draws a line (a plane in 3D, or a hyperplane in higher dimensions) to separate groups of points plotted in a space with any number of dimensions.
In this case there are four features to handle: sepal length, sepal width, petal length, and petal width. So I think the labeled data is plotted in a four-dimensional space and a boundary that can tell the varieties apart is drawn there.
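To make the "boundary in four-dimensional space" idea more concrete, the sketch below inspects what a fitted `LinearSVC` stores. It assumes the default one-vs-rest setup, in which each variety gets its own weight vector and offset, and the predicted variety is the one with the largest score. `max_iter=10000` and `random_state=0` are assumptions made for this sketch, not settings from the article.

```python
import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()

# max_iter and random_state are assumptions for this sketch
clf = svm.LinearSVC(max_iter=10000, random_state=0)
clf.fit(iris.data, iris.target)

# One weight vector (4 features) and one offset per variety
print(clf.coef_.shape)
print(clf.intercept_.shape)

# Score of every sample against every hyperplane: w . x + b
manual_scores = iris.data @ clf.coef_.T + clf.intercept_
library_scores = clf.decision_function(iris.data)

# The predicted variety is the one whose hyperplane gives the largest score
manual_pred = clf.classes_[np.argmax(library_scores, axis=1)]
print(np.array_equal(manual_pred, clf.predict(iris.data)))
```

So with three varieties there are three hyperplanes rather than a single line, each described by four weights and one offset.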
Let the clf created above read three samples, and classify each of them as "0: Setosa", "1: Versicolor", or "2: Virginica".
#Judge the variety
print(clf.predict([[5.1, 3.5, 1.4, 0.1], [6.5, 2.5, 4.4, 1.4], [5.9, 3.0, 5.2, 1.5]]))
There is a warning that Liblinear failed to converge, but the three samples do seem to be classified.
C:\Anaconda3\python.exe C:/scikit_learn/practice.py
C:\Anaconda3\lib\site-packages\sklearn\svm\_base.py:947: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
"the number of iterations.", ConvergenceWarning)
[0 1 2]
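The warning itself suggests increasing the number of iterations. One possible way to address it, sketched below, is to raise `max_iter` and also standardize the features before training. The pipeline, the value `max_iter=10000`, `random_state=0`, and the name `scaled_clf` are assumptions for this sketch and were not part of the original experiment, so the predictions are not guaranteed to match the output above.

```python
from sklearn import datasets
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()

# Standardize each feature, then train a linear SVM with a higher iteration limit
scaled_clf = make_pipeline(
    StandardScaler(),
    LinearSVC(max_iter=10000, random_state=0),
)
scaled_clf.fit(iris.data, iris.target)

# The same three samples as in the article
samples = [[5.1, 3.5, 1.4, 0.1], [6.5, 2.5, 4.4, 1.4], [5.9, 3.0, 5.2, 1.5]]
predictions = scaled_clf.predict(samples)
print(predictions)

# Accuracy on the training data itself (not a measure of generalization)
train_accuracy = scaled_clf.score(iris.data, iris.target)
print(train_accuracy)
```

Putting the scaler inside the pipeline means new samples passed to `predict` are scaled the same way as the training data.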
I tried visualizing what kind of data iris.data contains.
from sklearn import datasets
import matplotlib.pyplot as plt
#Reading Iris measurement data
iris = datasets.load_iris()
# Setosa, Versicolour, Virginica
sepal_length = [[], [], []]
petal_length = [[], [], []]
sepal_width = [[], [], []]
petal_width = [[], [], []]
for num, data in enumerate(iris.data):
    cls = iris.target[num]
    # iris.data columns: sepal length, sepal width, petal length, petal width
    sepal_length[cls].append(data[0])
    sepal_width[cls].append(data[1])
    petal_length[cls].append(data[2])
    petal_width[cls].append(data[3])
plt.subplot(1,2,1)
plt.scatter(sepal_length[0], petal_length[0], c="red", label="Setosa", marker="+")
plt.scatter(sepal_length[1], petal_length[1], c="blue", label="Versicolour", marker="+")
plt.scatter(sepal_length[2], petal_length[2], c="green", label="Virginica", marker="+")
plt.xlabel('sepal_length')
plt.ylabel('petal_length')
plt.legend()
plt.subplot(1,2,2)
plt.scatter(sepal_width[0], petal_width[0], c="red", label="Setosa", marker="+")
plt.scatter(sepal_width[1], petal_width[1], c="blue", label="Versicolour", marker="+")
plt.scatter(sepal_width[2], petal_width[2], c="green", label="Virginica", marker="+")
plt.xlabel('sepal_width')
plt.ylabel('petal_width')
plt.legend()
plt.show()
If you imagine drawing lines between the Setosa, Versicolour, and Virginica groups, you can see that data near the boundary between Versicolour and Virginica may be difficult to classify.
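One way to check this impression numerically is to hold out part of the data and look at a confusion matrix. The sketch below is an assumption-laden illustration, not part of the original experiment: the 70/30 stratified split, `max_iter=10000`, and `random_state=0` are arbitrary choices, and the exact counts depend on them.

```python
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

iris = datasets.load_iris()

# Hold out 30% of the data, keeping the three varieties balanced
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, stratify=iris.target, random_state=0
)

clf = LinearSVC(max_iter=10000, random_state=0)
clf.fit(X_train, y_train)

# Rows are the true varieties, columns are the predicted varieties
cm = confusion_matrix(y_test, clf.predict(X_test))
print(cm)
```

Off-diagonal entries are misclassifications. If the plots are a good guide, they would be expected mainly in the Versicolour and Virginica rows, with Setosa rarely confused with the other two.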