[PYTHON] BigGorilla environment construction memo

What I did

- Built a BigGorilla environment
- Tried the FlexMatcher sample

What I found

- The current standard practice is to use pyenv only to install Anaconda and to manage environments with conda.
- ~~(As of July 12, 2017) Environment construction does not work.~~
  - ~~The dependencies of the originally published conda environment are broken.~~
  - ~~Workaround: download the yml from Anaconda Cloud, delete the line that specifies urllib, and create the environment from the local file.~~
  - Addendum: the file has since been updated, and the environment now installs as described in the official documentation.
- The FlexMatcher sample did not work either.
  - It looks hard to get running without reading the code.

What to do next

- Read the FlexMatcher code

Environment

- Mac OS X 10.11 El Capitan
- Homebrew is already installed
- Anaconda is installed through pyenv

BigGorilla environment construction (The following information is old. It is kept as a work record)

pyenv was out of date, so I updated it first. Reference: "Update the version of python managed by pyenv" (Qiita)

Install anaconda

$ pyenv install anaconda3-4.2.0
$ pyenv global anaconda3-4.2.0

Create an environment for BigGorilla. ~~This fails.~~ Postscript (2017/07/21): this now works. The old record follows.

$ conda env create biggorilla/py3gorilla
Collecting urllib==1.21.1
Downloading urllib-1.21.1.tar.gz (226kB)
100% |████████████████████████████████| 235kB 640kB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/setup.py", line 191
s.connect((base64.b64decode(rip), 017620))
                                  ^
SyntaxError: invalid token
 ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/
CondaValueError: Value error: pip returned an error.
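
For reference, the SyntaxError in this log points at the literal 017620. Python 2 accepted a bare leading zero as an octal literal, while Python 3 requires the 0o prefix, so this setup.py cannot even be parsed under Python 3. The following is a minimal sketch that checks this with compile(); the helper name is_valid_python3 is my own and not part of any tool mentioned here.

```python
def is_valid_python3(source):
    """Return True if `source` parses as a Python 3 expression."""
    try:
        compile(source, "<snippet>", "eval")
        return True
    except SyntaxError:
        return False


# Python 2 style octal literal, as it appears in the failing setup.py line
legacy_ok = is_valid_python3("017620")
# Python 3 style octal literal with the 0o prefix
modern_ok = is_valid_python3("0o17620")
# The same digits interpreted as base 8
port = int("17620", 8)

print(legacy_ok, modern_ok, port)  # False True 8080
```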

The environment was only partially installed, but I tried activating it anyway. Running source activate Py3Gorilla kills the shell. When Anaconda is installed through pyenv, conda's activate script has to be called by its full path. References: "Note on how to use Conda" (Qiita); "Python environment construction for those who aim to be a data scientist 2016" (Qiita)

$ conda info -e
# conda environments:
#
Py3Gorilla               /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla
root                  *  /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0

$ source /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla/bin/activate Py3Gorilla

I ran the Jupyter notebook provided as an installation check, but it reported that the Py3Gorilla kernel could not be found, and the import in its first cell failed.

$ anaconda download biggorilla/hi_gorilla
$ jupyter notebook hi_gorilla.ipynb
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-770f0b5370fe> in <module>()
----> 1 import py_stringmatching as sm
      2
      3 # This notebook imports a package that most users do not have installed
      4 # before using BigGorilla. Running the notebook successfully implies the
      5 # successful installation of BigGorilla.

ImportError: No module named 'py_stringmatching'
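
A quicker way to see which packages are missing, without opening the notebook, is to ask the interpreter of the activated environment directly. The sketch below uses importlib.util.find_spec from the standard library; the helper name missing_modules is my own, and the module names are taken from the traceback above and from the FlexMatcher path that appears later in this memo.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Run this with the python of the environment you want to check.
print(missing_modules(["py_stringmatching", "flexmatcher"]))
```

An empty list means both packages are importable from the interpreter that ran the check.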

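Once conda env create has run, even if it failed partway, the environment prefix stays registered, and running the command again reports that the prefix already exists. Remove the environment with conda env remove -n before retrying.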

$ conda env create biggorilla/py3gorilla
Using Anaconda API: https://api.anaconda.org
CondaValueError: Value error: prefix already exists: /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla

$ conda env remove -n Py3Gorilla

Package plan for package removal in environment /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla:

The following packages will be REMOVED:

openssl:    1.0.2l-0
pip:        9.0.1-py36_1
python:     3.6.1-2
readline:   6.2-2
setuptools: 27.2.0-py36_0
sqlite:     3.13.0-0
tk:         8.5.18-0
wheel:      0.29.0-py36_0
xz:         5.2.2-1
zlib:       1.2.8-3

Proceed ([y]/n)? y

Unlinking packages ...
[      COMPLETE      ]|###############################################################################| 100%
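
To check beforehand whether an environment name is already registered, one option is to parse the listing printed by conda info -e. This is only a sketch under the assumption that the output keeps the layout shown earlier in this memo (name, optional * marker, path) and that paths contain no spaces; parse_conda_envs is a name I made up for illustration.

```python
def parse_conda_envs(listing):
    """Map environment names to prefixes from `conda info -e` style text."""
    envs = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name [*] path" row
        # First column is the name, last column is the prefix path;
        # the optional "*" in between marks the active environment.
        envs[parts[0]] = parts[-1]
    return envs


SAMPLE = """\
# conda environments:
#
Py3Gorilla               /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla
root                  *  /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0
"""

envs = parse_conda_envs(SAMPLE)
print("Py3Gorilla" in envs)  # True
```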

~~When I tried this on July 12, 2017, it failed with the error below and the environment was not installed. (An older yml seems to be picked up, probably because the file uploaded in June has an odd file name. A future update will probably fix this.)~~

Addendum: the file has since been updated, and the environment now installs as described in the official documentation.

$ conda env create biggorilla/py3gorilla
Collecting urllib==1.21.1
Downloading urllib-1.21.1.tar.gz (226kB)
100% |████████████████████████████████| 235kB 640kB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/setup.py", line 191
s.connect((base64.b64decode(rip), 017620))
                                  ^
SyntaxError: invalid token
 ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/
CondaValueError: Value error: pip returned an error.

The environment can be installed by downloading the yml from the Files page on Anaconda Cloud and removing the line that specifies urllib. The newer yml also installs, but its flexmatcher version is older (a regression?).
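
The manual edit can also be scripted. The sketch below drops only a dependency line for the package named exactly urllib and leaves similarly named packages such as urllib3 alone. The yml text in the example is a made-up illustration, not the real contents of Py3Gorilla.yml, and drop_urllib_pin is a hypothetical helper name.

```python
import re

# Matches a list entry for the package "urllib" with an optional version
# specifier, but not "urllib3" or other names that merely start with it.
URLLIB_LINE = re.compile(r"^\s*-\s*urllib\s*([=<>!~].*)?$")


def drop_urllib_pin(yml_text):
    """Return yml_text without dependency lines for the urllib package."""
    kept = [line for line in yml_text.splitlines() if not URLLIB_LINE.match(line)]
    return "\n".join(kept) + "\n"


# Hypothetical file contents, for illustration only
EXAMPLE_YML = """\
name: Py3Gorilla
dependencies:
- pip:
  - flexmatcher
  - urllib==1.21.1
  - urllib3
"""

cleaned = drop_urllib_pin(EXAMPLE_YML)
print(cleaned)
```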

# Remove the half-built environment
$ conda env remove -n Py3Gorilla

# Edit the downloaded yml to delete the urllib line, then recreate the environment from it
$ vim ~/Downloads/Py3Gorilla.yml
$ conda env create --name test --file ~/Downloads/Py3Gorilla.yml

# With pyenv, conda's activate script must be called by its full path; plain source activate kills the shell
$ source /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/bin/activate test

# Download the check notebook and start it
$ anaconda download biggorilla/hi_gorilla
$ jupyter notebook hi_gorilla.ipynb
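
Since the full path to activate has to be typed every time, here is a small sketch that builds the command from its parts. It assumes the directory layout seen in the source command above (versions/<anaconda version>/envs/<env name>/bin/activate under the pyenv root); activate_command is my own helper name.

```python
from pathlib import PurePosixPath


def activate_command(pyenv_root, anaconda_version, env_name):
    """Build the full-path `source .../bin/activate <env>` command line."""
    activate = (
        PurePosixPath(pyenv_root)
        / "versions" / anaconda_version
        / "envs" / env_name
        / "bin" / "activate"
    )
    return f"source {activate} {env_name}"


print(activate_command("/Users/kkanazaw/.pyenv", "anaconda3-4.2.0", "test"))
```

The printed line can be pasted into the shell; running it from Python itself would not work, because source only affects the shell that executes it.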

Try the FlexMatcher sample

Next, I tried the FlexMatcher sample.

Sample code is provided, so I copied the source and pasted it into a Jupyter notebook.

Running it failed with an error.

Execution result

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-34cd037abc3a> in <module>()
     27 mapping_list = [data1_mapping, data2_mapping]
     28 fm.create_training_data(schema_list, mapping_list)
---> 29 fm.train()
     30 
     31 # Creating a test schmea

/Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/lib/python3.5/site-packages/flexmatcher/flexmatcher.py in train(self)
     27     The class considers panda dataframes as databases and their column names as
     28     the schema. FlexMatcher learn to do schema matching by training on
---> 29     instances of dataframes and how their columns are matched against the
     30     mediated schema.
     31 

/Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/lib/python3.5/site-packages/flexmatcher/flexmatcher.py in <listcomp>(.0)
     27     The class considers panda dataframes as databases and their column names as
     28     the schema. FlexMatcher learn to do schema matching by training on
---> 29     instances of dataframes and how their columns are matched against the
     30     mediated schema.
     31 

/Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/lib/python3.5/site-packages/flexmatcher/classify.py in predict_training(self, folds)

TypeError: 'float' object cannot be interpreted as an integer
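
I have not read the FlexMatcher code yet, so the following is only a guess at the kind of bug behind this message, not a diagnosis. A common cause of this TypeError in Python 3 is a size computed with /, which always returns a float, being passed to something like range() that needs an int. The function names and numbers below are made up for illustration.

```python
def fold_size_true_division(n_rows, folds):
    # "/" is true division in Python 3, so the result is a float (e.g. 2.0)
    return n_rows / folds


def fold_size_floor_division(n_rows, folds):
    # "//" keeps the result an int, which range() accepts
    return n_rows // folds


try:
    list(range(fold_size_true_division(10, 5)))
    message = None
except TypeError as exc:
    message = str(exc)

indices = list(range(fold_size_floor_division(10, 5)))

print(message)
print(indices)
```

If the cause turns out to be of this kind, switching to // or wrapping the value in int() at the call site would be the usual fix.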
