[PYTHON] Install Faiss on CentOS 7

Overview:

A note on how to install Faiss on CentOS 7 without Anaconda. I don't know whether Anaconda itself installs cleanly on CentOS 7, but pip did not work smoothly for me, so I am recording the steps that did.

Background:

For some reason, I could not install it on CentOS 7! Searching worldwide, I found other people who seemed to have hit similar problems, but nothing that clearly said "this is the fix!" ... so I spent more than half a day on it. Oops. I confirmed that the install also fails on several other CentOS 7 machines, so I concluded the problem is specific to CentOS 7 and wanted to pin down a procedure.


At first it was going well ...

Faiss is a library published by Facebook that implements fast algorithms for similarity search (and clustering). I'm building a semantic search mechanism on top of vectors produced by SentenceBERT. Initially, the implementation simply computed the cosine similarity between the query vector and every candidate vector, sorted the results, and took the most similar vector. However, as the number of search targets grew, I felt the computational cost would get out of hand.
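
A minimal sketch of that brute-force approach, in plain Python. The actual implementation is not shown in this note, so the function names cosine_similarity and most_similar and the toy vectors are hypothetical, and non-zero vectors are assumed.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, candidates, top_n=1):
    """Score every candidate against the query, then sort: cost grows with the number of candidates."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(candidates)]
    scored.sort(reverse=True)
    return [(i, score) for score, i in scored[:top_n]]

# Toy 3-dimensional vectors standing in for SentenceBERT embeddings (hypothetical data)
candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
print(most_similar(query, candidates, top_n=2))
```

Every query touches every candidate vector, which is why the cost becomes a concern as the collection grows.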

After a little research, I found a library called "Faiss" that can index vectors (?) and search them at low cost (in a short time). I immediately tried it on Google Colab!

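# In Colab, a leading "!" runs the line as a shell command
!pip3 install faiss-cpu
import numpy as np
import faiss

# sentence_vectors / query_embeddings: SentenceBERT embeddings computed beforehand
d = max([len(v) for v in sentence_vectors])  # vector dimensionality
index = faiss.IndexFlatL2(d)  # exact (brute-force) index using L2 distance
index.add(np.array(sentence_vectors).astype('float32'))  # Faiss works on float32
closest_n = 1  # number of nearest neighbors to return per query
# D: distances, I: positions of the nearest vectors in sentence_vectors
D, I = index.search(np.array(query_embeddings).astype('float32'), closest_n)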

Easy win!!

It was still a test with a small number of vectors to search (100 or fewer), so I didn't notice a dramatic change ... but the search time did get shorter, and the results were no different from those obtained with cosine similarity, so I set about incorporating it into the real system.
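
One possible reason the results matched: IndexFlatL2 ranks by L2 distance rather than cosine similarity, but for unit-length vectors the two give the same ordering, because ||a - b||^2 = 2 - 2 cos(a, b). Whether the SentenceBERT vectors here were normalized is not stated, so treat that as an assumption. A small sketch in plain Python with made-up vectors:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def squared_l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Made-up vectors, normalized to unit length
candidates = [normalize(v) for v in [[3, 1, 0], [0, 2, 1], [1, 1, 1], [-1, 0, 2]]]
query = normalize([2, 1, 0.5])

# Rank candidates by descending cosine similarity and by ascending L2 distance
by_cosine = sorted(range(len(candidates)), key=lambda i: -cosine(query, candidates[i]))
by_l2 = sorted(range(len(candidates)), key=lambda i: squared_l2(query, candidates[i]))
print(by_cosine == by_l2)
```

If the vectors are not normalized, the two rankings can differ, so it may be worth normalizing before adding vectors to the index.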


Now ... for the main event: installing it on the CentOS 7 server!! An easy win here too, of course, right?!

Install with pip, just like on Colab. Faiss seems to be installed mainly through Anaconda, but I don't use Anaconda, so pip it is. With pip, you specify the package name faiss-cpu or faiss-gpu ... and to switch between CPU and GPU, you apparently uninstall one and install the other.

https://pypi.org/project/faiss-cpu/ https://pypi.org/project/faiss-gpu/

However! I get an error! Why?!

$ sudo pip3 install faiss-cpu
Collecting faiss-cpu
  Downloading https://files.pythonhosted.org/packages/8b/3e/d64ff22504a70fb15457de8fb2f5fd84e35448fdcd9958880ae8d0438a82/faiss-cpu-1.6.4.post2.tar.gz
Building wheels for collected packages: faiss-cpu
  Running setup.py bdist_wheel for faiss-cpu ... error
  Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i9sic395/faiss-cpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmp2c2gltlxpip-wheel- --python-tag cp36:
  running bdist_wheel
  running build
  running build_py
  running build_ext
  building 'faiss._swigfaiss' extension
  swigging faiss/faiss/python/swigfaiss.i to faiss/faiss/python/swigfaiss_wrap.cpp
  swig -python -c++ -Doverride= -I/usr/local/include -Ifaiss -DSWIGWORDSIZE64 -o faiss/faiss/python/swigfaiss_wrap.cpp faiss/faiss/python/swigfaiss.i
  unable to execute 'swig': No such file or directory
  error: command 'swig' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for faiss-cpu
  Running setup.py clean for faiss-cpu
Failed to build faiss-cpu
Installing collected packages: faiss-cpu
  Running setup.py install for faiss-cpu ... error
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i9sic395/faiss-cpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-q0l4dufw-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    running build_ext
    building 'faiss._swigfaiss' extension
    swigging faiss/faiss/python/swigfaiss.i to faiss/faiss/python/swigfaiss_wrap.cpp
    swig -python -c++ -Doverride= -I/usr/local/include -Ifaiss -DSWIGWORDSIZE64 -o faiss/faiss/python/swigfaiss_wrap.cpp faiss/faiss/python/swigfaiss.i
    unable to execute 'swig': No such file or directory
    error: command 'swig' failed with exit status 1
    
    ----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i9sic395/faiss-cpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-q0l4dufw-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-i9sic395/faiss-cpu/
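
The key line in the log is "unable to execute 'swig'": pip downloaded the source tarball and tried to build it, and that build calls the swig command. A small diagnostic sketch using only the standard library checks whether such tools are on PATH. The helper name find_build_tools and the choice of tools to check are assumptions for illustration, and having them installed does not guarantee that the build succeeds.

```python
import shutil

def find_build_tools(tools=("swig", "g++")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}

for tool, path in find_build_tools().items():
    print(f"{tool}: {path or 'NOT FOUND'}")
```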

For comparison, I tried it on my local Mac. Maybe Colab was the only place it worked?! No, faiss-cpu installs on my local Mac without any problems ... This smells bad!

I got lost in the labyrinth ...

I searched the Web in all sorts of ways but couldn't find a decisive fix ... Then I found a package called faiss-centos, whose name combines exactly the keywords of my current troubles. This looks promising!!

https://pypi.org/project/faiss-centos/

Full of anticipation, here goes!!

$ sudo pip3 install faiss-centos
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting faiss-centos
  Could not find a version that satisfies the requirement faiss-centos (from versions: )
No matching distribution found for faiss-centos

Ouch ...

I wandered around the Web some more, and I wonder whether the following issue held the solution ... I'm not sure. https://github.com/facebookresearch/faiss/issues/866

After that, I tried all sorts of things!

For example: faiss-centos is distributed as an egg, not a wheel, so I tried downgrading pip to version 8 ... tried unzipping the egg file ... tried installing openblas-serial and gmp-devel ...

However, oblivious to my anguish, it kept complaining that _swigfaiss could not be found, or something like that! A tough problem. I'm tired ...

Take a break ... do something else for a change of pace, drink some tea, sulk ...


Huh?!

Well, I rested, my brain recovered, and it was already night! I logged in to the CentOS 7 server again ...

$ python3
Python 3.6.8 (default, Apr  2 2020, 13:34:55) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faiss
>>> 

No errors!!

It was an error before the break ... Did the seven dwarfs finally pay me a visit??

So I moved to the target folder and tried again ... error. I went back to the directory I started in and tried again ... it works! What's the difference??

It turned out that I had downloaded the egg from "https://pypi.org/project/faiss-centos/" and unzipped faiss_centos-1.5.2-py3.6.egg, which created a faiss/ folder directly under that directory, and that folder was what got imported. A ray of light ...
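
This matches how Python resolves imports: an interactive python3 session puts the current directory first on sys.path, so a faiss/ folder sitting there is importable only when python3 is started from that directory. A sketch for checking where a module would be loaded from, using the standard library's importlib.util.find_spec; the helper name module_location is hypothetical.

```python
import importlib.util
import sys

def module_location(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin

# In an interactive session the first entry is '' (the current directory)
print("first sys.path entry:", repr(sys.path[0]))
print("faiss would be loaded from:", module_location("faiss"))
```

Running this from different directories shows whether faiss comes from the local folder, from site-packages, or from nowhere.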

In that case ... what if I copy this faiss/ folder into site-packages/???


Conclusion

After that, I worked out which additional libraries need to be installed and pinned down the procedure for installing Faiss on CentOS 7.

Once I knew the answer, this was all it took ...

$ wget https://files.pythonhosted.org/packages/f6/8b/ab69a201ea1b8be759ba16f172f92d1fb935a8f4a94f02fe52c7d8ec579f/faiss_centos-1.5.2-py3.6.egg
$ unzip faiss_centos-1.5.2-py3.6.egg
$ sudo cp -r ./faiss /usr/local/lib/python3.6/site-packages
(Or, depending on the environment: $ sudo cp -r ./faiss /usr/lib/python3.6/site-packages)
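
The unzip-and-copy step can also be sketched in Python. An .egg file is a zip archive, so the standard zipfile module can read it, and sysconfig.get_paths() reports the site-packages directory of the running interpreter, which avoids guessing between /usr/local/lib and /usr/lib. The helper name extract_package is hypothetical, and writing into a system site-packages still requires root privileges.

```python
import sysconfig
import zipfile

def extract_package(egg_path, package, dest_dir):
    """Extract only the `package/` directory from an egg (a zip archive) into dest_dir."""
    prefix = package + "/"
    with zipfile.ZipFile(egg_path) as egg:
        members = [m for m in egg.namelist() if m.startswith(prefix)]
        egg.extractall(dest_dir, members=members)
    return members

# Where this interpreter looks for third-party packages
print(sysconfig.get_paths()["purelib"])

# Usage (not run here):
# extract_package("faiss_centos-1.5.2-py3.6.egg", "faiss", sysconfig.get_paths()["purelib"])
```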

$ sudo yum install openblas-serial
$ sudo yum install gmp gmp-devel

$ python3
Python 3.6.8 (default, Apr 2 2020, 13:34:55) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faiss
>>>
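
A bare import only proves that the module loads. A slightly stronger check, sketched here under the assumption that numpy is installed alongside Faiss, builds a tiny IndexFlatL2 and confirms that each vector's nearest neighbor is itself. The function name faiss_smoke_test and the sample vectors are made up; the function returns None when faiss or numpy cannot be imported.

```python
def faiss_smoke_test():
    """Return True/False for a tiny IndexFlatL2 search, or None if faiss/numpy are missing."""
    try:
        import numpy as np
        import faiss
    except ImportError:
        return None
    vectors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]], dtype="float32")
    index = faiss.IndexFlatL2(vectors.shape[1])
    index.add(vectors)
    # Search the index with its own vectors: each one should find itself first
    _, ids = index.search(vectors, 1)
    return ids[:, 0].tolist() == [0, 1, 2]

print(faiss_smoke_test())
```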

If anyone runs into a similar problem, I hope this serves as a reference.
