I'm interested in big data, real-time analysis, data mining, machine learning, and so on, because everyone is blogging and talking about them and it all sounds interesting. So this is purely personal interest and experimentation. I'm barely more than a beginner, so I'm just enjoying it at my own level.
These are notes on setting up Apache Spark to run inside IPython Notebook. Googling turns up plenty of guides, but I want a record I can keep close at hand, so I'm writing it down. Just as I finished, Spark 1.2.0 was released, so this is already slightly out of date, but I think the steps are the same anyway.
When installed with Homebrew, Spark is placed in /usr/local/Cellar/apache-spark/1.1.1.
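If you don't have Spark yet, the Homebrew install itself is a one-liner (assuming Homebrew is already set up; the version you get depends on the formula at the time):

```shell
# Install Apache Spark via Homebrew (1.1.1 at the time of writing)
brew install apache-spark
```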
Set the SPARK_HOME environment variable:
export SPARK_HOME="Folder where spark was unzipped"
Create an IPython profile
$ ipython profile create pyspark
Create startup/00-pyspark-setup.py inside the IPython profile folder:
#coding:utf-8
import os
import sys
os.environ['SPARK_HOME'] = '/usr/local/Cellar/apache-spark/1.1.1'
spark_home = os.environ.get('SPARK_HOME', None)
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')
sys.path.insert(0, os.path.join(spark_home, 'libexec/python'))
sys.path.insert(0, os.path.join(spark_home, 'libexec/python/lib/py4j-0.8.2.1-src.zip'))
execfile(os.path.join(spark_home, 'libexec/python/pyspark/shell.py'))
In my environment the profile folder is ~/.ipython/profile_pyspark. The py4j-0.8.2.1-src.zip filename differs depending on the Spark version, so rewrite it to match yours. On Windows, I believe the profile lives somewhere under the user folder.
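Since the py4j version string changes between Spark releases, one way to avoid hardcoding it is to glob for the zip instead. A minimal sketch; the Homebrew fallback path is just my environment's, adjust it to yours:

```python
import glob
import os
import sys

# Fall back to the Homebrew location if SPARK_HOME isn't set (my environment)
spark_home = os.environ.get('SPARK_HOME', '/usr/local/Cellar/apache-spark/1.1.1')

# Match whatever py4j version ships with this Spark release
candidates = glob.glob(os.path.join(spark_home, 'libexec/python/lib/py4j-*-src.zip'))
if candidates:
    sys.path.insert(0, candidates[0])
```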
Try starting it:
$ ipython notebook --profile=pyspark
It looks like it's running. Nice!
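To confirm that the `sc` SparkContext created by the startup script is actually usable, a quick smoke test in a notebook cell; this assumes the notebook was launched with the pyspark profile above, so `sc` already exists:

```python
# 'sc' is the SparkContext created by pyspark/shell.py at startup.
# Count the even numbers in 0..99 as a smoke test.
rdd = sc.parallelize(range(100))
even_count = rdd.filter(lambda x: x % 2 == 0).count()
print(even_count)  # should be 50
```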
Reference: http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/