[PYTHON] Summary of how to use scikit-learn (machine learning)

Clustering analysis (k-means method)

・ Pass the data frame as df and the number of clusters as num.
・ Specify an integer random seed in random_state (fixed to 0 in the code below).

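def clustering_analytics(df, num):
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans
    #Work on a copy so the caller's data frame is not modified
    df_temp = df.copy()
    sc = StandardScaler()
    #Standardization (each column rescaled to mean 0, variance 1)
    df_std = sc.fit_transform(df_temp)

    #k-means with num clusters; random_state fixes the random seed
    kmeans = KMeans(n_clusters=num, random_state=0)
    kmeans.fit(df_std)
    #Attach each row's cluster label to the copy
    df_temp["cluster"] = kmeans.labels_
    return df_temp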
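
To show what the clustering steps above produce, here is a minimal sketch that runs the same standardize-then-KMeans sequence inline on a tiny data frame. The column names and values in toy_df are hypothetical and made up for illustration, and n_init=10 is passed explicitly only to avoid depending on the library default.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two well-separated groups of three rows each.
toy_df = pd.DataFrame({
    "height": [150.0, 152.0, 151.0, 180.0, 182.0, 181.0],
    "weight": [50.0, 52.0, 51.0, 80.0, 82.0, 81.0],
})

# Same steps as the function above: standardize, then fit k-means.
scaled = StandardScaler().fit_transform(toy_df)
model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(scaled)

# Attach the labels to a copy, leaving toy_df itself unchanged.
result = toy_df.copy()
result["cluster"] = model.labels_
print(result)
```

The label numbers themselves (0 or 1) carry no meaning. Only which rows share a label matters.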

Principal component analysis (PCA)

・ Pass the data frame as df and the number of principal components as num.

def PCA_analytics(df, num):
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    import numpy as np
    import pandas as pd
    sc = StandardScaler()
    df_temp = df.copy()
    #Standardization
    df_std = sc.fit_transform(df_temp)
    pca = PCA(n_components = num)
    pca.fit(df_std)
    #Project the standardized data onto the principal components
    df_temp__pca = pca.transform(df_std)
    pca_df = pd.DataFrame(df_temp__pca)

    print('principal components')
    print(pca.components_)
    print('mean')
    print(pca.mean_)
    print('covariance matrix')
    print(pca.get_covariance())
    #np.linalg.eig returns (eigenvalues, eigenvectors); the eigenvectors are the columns of v
    W, v = np.linalg.eig(pca.get_covariance())
    print('eigenvectors')
    print(v)
    print('eigenvalues')
    print(W)
    return pca_df
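
As a rough check on what the printed eigenvalues mean, the sketch below compares the largest eigenvalues of pca.get_covariance() with pca.explained_variance_ and also prints pca.explained_variance_ratio_. The data in toy_df is synthetic and hypothetical, not taken from the function above. The sketch also swaps in np.linalg.eigh for np.linalg.eig, since a covariance matrix is symmetric.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data: 50 rows, 3 columns, two of them strongly correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=50)
toy_df = pd.DataFrame({
    "a": base + 0.1 * rng.normal(size=50),
    "b": 2.0 * base + 0.1 * rng.normal(size=50),
    "c": rng.normal(size=50),
})

# Same preprocessing as the function above, keeping 2 principal components.
scaled = StandardScaler().fit_transform(toy_df)
pca = PCA(n_components=2).fit(scaled)
projected = pca.transform(scaled)

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order,
# so reverse them and keep the two largest.
eigenvalues, eigenvectors = np.linalg.eigh(pca.get_covariance())
top_eigenvalues = eigenvalues[::-1][:2]

print(top_eigenvalues)                # largest eigenvalues of the covariance matrix
print(pca.explained_variance_)        # variance captured by each principal component
print(pca.explained_variance_ratio_)  # the same, as a fraction of the total variance
```

The first two printed arrays should agree up to floating-point error, which is one way to see that the eigenvalues are the variances along each principal component.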
