To publish a self-made module so that it can be installed with pip install my_awesome_module, it is enough to push it to GitHub together with setup.py.
GitHub page: https://github.com/kyohashi/model_selection
pip install looks for modules on PyPI (Python Package Index): https://pypi.org/
For example, if you open pypi.org and search for numpy, numpy is indeed there: https://pypi.org/search/?q=numpy
When registering a module on PyPI, you need setup.py, which holds the module's metadata, along with the module's source code.
To summarize the above, pip install hogehoge looks up hogehoge on PyPI and installs the module according to the setup.py associated with it.
So a module becomes pip installable once it is registered on PyPI, but you can actually publish it on GitHub as well.
The point is that pip only needs to be told where the source code and setup.py are, so once the module is on GitHub, anyone can install it with
pip install git+(URL)
This time I tried publishing on GitHub instead of PyPI.
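The pip install git+(URL) form can also be driven from Python. The following is a minimal sketch, not part of the original module: the helper names build_pip_git_command and install_from_github are made up for illustration, and the optional ref argument relies on pip's support for appending @branch, @tag, or @commit to a Git URL.

```python
import subprocess
import sys


def build_pip_git_command(repo_url, ref=None):
    """Build the argument list for `pip install git+<repo_url>[@ref]`.

    Hypothetical helper for illustration; not part of model_selection.
    """
    target = "git+" + repo_url
    if ref is not None:
        # pip accepts a branch, tag, or commit hash after "@"
        target += "@" + ref
    # Use the current interpreter's pip so the package lands in this environment
    return [sys.executable, "-m", "pip", "install", target]


def install_from_github(repo_url, ref=None):
    """Actually run the install; needs git on PATH and network access."""
    subprocess.run(build_pip_git_command(repo_url, ref), check=True)


cmd = build_pip_git_command("https://github.com/kyohashi/model_selection.git")
print(" ".join(cmd[1:]))
```

Only the command is built and printed here; calling install_from_github with the same URL would perform the install. Pinning to a tag or commit through ref is useful when you want reproducible installs from a repository that keeps changing.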
The files registered on GitHub and the contents of setup.py are as follows.
file organization
.
├── README.md
├── requirements.txt
├── setup.py
└── src
    └── model_selection
        ├── __init__.py
        ├── bayes_clustering.py
        └── utils
            ├── __init__.py
            └── check_datashape.py
setup.py
from glob import glob
from os.path import basename
from os.path import splitext

from setuptools import setup
from setuptools import find_packages


def _requires_from_file(filename):
    # Read one requirement per line from requirements.txt
    with open(filename) as f:
        return f.read().splitlines()


setup(
    name="model_selection",
    version="0.1.0",
    description="statistical model selection with Bayesian IC like WAIC",
    author="kyohashi",
    url="https://github.com/kyohashi/model_selection.git",
    # Packages live under src/
    packages=find_packages("src"),
    package_dir={"": "src"},
    # Also register any .py files placed directly under src/ as modules
    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    include_package_data=True,
    zip_safe=False,
    install_requires=_requires_from_file('requirements.txt'),
)
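To make the py_modules and install_requires arguments more concrete, here is a small sketch that evaluates the same expressions against a temporary directory. The file standalone.py and the requirements shown are hypothetical (the real requirements.txt is not reproduced in this post), and requires_from_file is a slightly more defensive variant of _requires_from_file that skips blank lines and comments.

```python
import tempfile
from glob import glob
from os.path import basename, join, splitext
from pathlib import Path


def top_level_modules(src_dir):
    """Module names for the .py files directly inside src_dir (not in subfolders)."""
    return sorted(splitext(basename(p))[0] for p in glob(join(src_dir, "*.py")))


def requires_from_file(filename):
    """One requirement per line, skipping blank lines and comment lines."""
    with open(filename) as f:
        lines = (line.strip() for line in f)
        return [line for line in lines if line and not line.startswith("#")]


with tempfile.TemporaryDirectory() as root:
    src = Path(root, "src")
    (src / "model_selection").mkdir(parents=True)
    (src / "model_selection" / "__init__.py").touch()
    (src / "standalone.py").touch()  # hypothetical top-level module
    req = Path(root, "requirements.txt")
    req.write_text("numpy\n\n# sampling\npymc3\n")  # hypothetical contents

    modules = top_level_modules(str(src))
    requirements = requires_from_file(str(req))

print(modules)       # ['standalone']
print(requirements)  # ['numpy', 'pymc3']
```

Note that the glob only matches files directly under src. In the file layout shown above there are none, so py_modules would be an empty list there and all of the code is picked up through find_packages instead.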
Looking at setup.py, you can see that it treats the Python files under the src folder as modules.
Once the files above are pushed to GitHub, the module can be installed with
pip install git+https://github.com/kyohashi/model_selection.git
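To see which package names find_packages("src") would produce for this layout, the sketch below rebuilds the tree in a temporary directory and walks it. discover_packages is a simplified stand-in written for this illustration, not the setuptools implementation; it assumes the basic rule that a folder counts as a package when it contains __init__.py.

```python
import os
import tempfile
from pathlib import Path


def discover_packages(where):
    """Simplified stand-in for setuptools.find_packages(where).

    A directory counts as a package if it contains __init__.py, and
    subdirectories are only searched inside directories that are packages.
    """
    found = []
    for root, dirs, _files in os.walk(where):
        kept = []
        for d in sorted(dirs):
            full = os.path.join(root, d)
            if os.path.isfile(os.path.join(full, "__init__.py")):
                rel = os.path.relpath(full, where)
                found.append(rel.replace(os.sep, "."))
                kept.append(d)
        dirs[:] = kept  # do not descend into non-package folders
    return found


with tempfile.TemporaryDirectory() as root:
    # Recreate the src tree from the file layout above
    src = Path(root, "src")
    utils = src / "model_selection" / "utils"
    utils.mkdir(parents=True)
    for path in (
        src / "model_selection" / "__init__.py",
        src / "model_selection" / "bayes_clustering.py",
        utils / "__init__.py",
        utils / "check_datashape.py",
    ):
        path.touch()
    packages = discover_packages(str(src))

print(packages)  # ['model_selection', 'model_selection.utils']
```

Because package_dir maps the root package directory to src, the installed code should be importable as model_selection.bayes_clustering and model_selection.utils.check_datashape, without a src prefix.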
The module created this time aims to help determine the number of clusters. Specifically, as shown in the figure below, the optimal number of clusters is estimated by fitting a GMM and computing WAIC for every candidate number of clusters. PyMC3 is used for MCMC sampling.
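As a rough illustration of the selection step, the sketch below computes WAIC from pointwise log-likelihoods and picks the candidate with the smallest value. This is not the module's code (which relies on PyMC3); it assumes the common deviance-scale convention WAIC = -2 (lppd - p_waic), where lower is better, and the function names and toy numbers are made up for the example.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log(mean(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic(log_lik):
    """WAIC on the deviance scale.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and data point i.
    At least two draws are needed for the variance term.
    """
    n_points = len(log_lik[0])
    columns = [[row[i] for row in log_lik] for i in range(n_points)]
    # Log pointwise predictive density, averaged over posterior draws
    lppd = sum(log_mean_exp(col) for col in columns)
    # Effective number of parameters: per-point variance of the log-likelihood
    p_waic = sum(variance(col) for col in columns)
    return -2.0 * (lppd - p_waic)


def select_n_clusters(log_lik_by_k):
    """Return the candidate number of clusters with the smallest WAIC."""
    return min(log_lik_by_k, key=lambda k: waic(log_lik_by_k[k]))


# Made-up log-likelihoods (2 draws x 3 points), not results from the module
toy = {
    1: [[-3.0, -3.1, -2.9], [-3.2, -3.0, -3.1]],
    2: [[-1.0, -1.2, -0.9], [-1.1, -1.0, -1.0]],
}
best_k = select_n_clusters(toy)
print(best_k)
```

In practice the log-likelihood matrix for each candidate would come from the MCMC draws of the fitted GMM rather than from hand-written numbers.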
There is also an example on toy data, so please refer to it as well. Usecase: https://kyohashi.github.io/model_selection/gmm_usecase.html