initial import split off from https://github.com/harisekhon/tools
d694d8a1
Hari Sekhon committed
13 changed files
.gitignore (+6)
+# This doesn't actually work on already committed files, git update-index --assume-unchanged is set in Makefile instead
+scrub_custom.conf
+solr-env.sh
+*.log
+*.swp
+
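The comment in .gitignore points out that ignore rules have no effect on files Git already tracks, and that `git update-index --assume-unchanged` is used from the Makefile instead. The Makefile rule itself is not shown in this view, so the following is only a rough sketch of the same idea in Python; the helper names and the `repo_dir` parameter are hypothetical.

```python
import subprocess


def assume_unchanged_cmd(paths):
    """Build the git command that stops git reporting local edits to tracked files."""
    # Hypothetical helper: the real Makefile target is not visible in this commit view.
    return ["git", "update-index", "--assume-unchanged", "--"] + list(paths)


def mark_assume_unchanged(paths, repo_dir="."):
    """Run the command inside repo_dir; raises CalledProcessError if git fails."""
    subprocess.run(assume_unchanged_cmd(paths), cwd=repo_dir, check=True)
```

For example, `mark_assume_unchanged(["solr-env.sh", "scrub_custom.conf"])` would cover two of the files listed above, assuming they are already committed.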
.ipython-notebook-pyspark.00-pyspark-setup.py (+28)
+#
+# Author: Hari Sekhon
+# Date: 22/8/2014
+#
+
+# Yarn library support
+
+# Requires SPARK_HOME to be set
+
+__author__ = "Hari Sekhon"
+__version__ = "0.1"
+
+import glob
+import os
+import sys
+
+# This only runs PySpark in local mode, not Yarn mode
+#
+# See ipython-notebook-spark.py for cluster mode (YARN or Standalone)
+
+spark_home = os.getenv('SPARK_HOME', None)
+if not spark_home:
+    raise ValueError('SPARK_HOME environment variable is not set')
+sys.path.insert(0, os.path.join(spark_home, 'python'))
+for lib in glob.glob(os.path.join(spark_home, 'python/lib/py4j-*-src.zip')):
+    sys.path.insert(0, lib)
+execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))
+
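The setup script above relies on `execfile`, which exists only in Python 2. As a hedged sketch of how the same steps might look under Python 3, the version below separates the path computation from the execution step; `pyspark_paths`, `run_file`, and `setup_pyspark` are illustrative names, not part of the commit.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the entries the setup script prepends, in their final sys.path order."""
    paths = [os.path.join(spark_home, "python")]
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    # sorted() only makes the order deterministic; the original iterates glob order.
    for lib in sorted(glob.glob(pattern)):
        paths.insert(0, lib)
    return paths


def run_file(path, namespace):
    """Python 3 stand-in for execfile(path): compile and run the file in namespace."""
    with open(path) as handle:
        code = compile(handle.read(), path, "exec")
    exec(code, namespace)


def setup_pyspark(namespace):
    """Mirror the startup script: validate SPARK_HOME, extend sys.path, run shell.py."""
    spark_home = os.getenv("SPARK_HOME")
    if not spark_home:
        raise ValueError("SPARK_HOME environment variable is not set")
    sys.path[:0] = pyspark_paths(spark_home)
    run_file(os.path.join(spark_home, "python", "pyspark", "shell.py"), namespace)
```

In an IPython startup file this would be called as `setup_pyspark(globals())`, so that whatever names `shell.py` defines end up in the interactive namespace, as they do with `execfile`.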
.ipython-notebook-pyspark.ipython_notebook_config.py.j2 (+19)
+#
+# Generated by Hari Sekhon's {{ name }} on {{ date }}
+#
+# Can be hand-edited as long as you leave the path to the passwd.txt file,
+# otherwise it will be overwritten on the next run of {{ name }}
+#
+# Template Source: {{ template_path }}
+#
+
+c = get_config()
+# Set to 0.0.0.0 for portability, but it is better to set it to your IP
+# so that users are given the correct URL
+c.NotebookApp.ip = '{{ ip }}'
+#c.NotebookApp.password = u'{{ password }}'
+passwd_txt = '{{ passwd_txt }}'
+c.NotebookApp.password = open(passwd_txt).read().strip()
+c.NotebookApp.open_browser = False
+c.IPKernelApp.pylab = 'inline'
+
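The template reads the notebook password from the file named by `passwd_txt` each time the config is loaded, using a bare `open(...).read()`. A slightly more defensive variant, offered only as a sketch, closes the file explicitly and fails early on an empty file; `read_password_hash` is a hypothetical helper name, and the assumption that the file holds a single hashed password string is not confirmed by the commit.

```python
def read_password_hash(passwd_txt):
    """Return the contents of passwd_txt with surrounding whitespace stripped."""
    with open(passwd_txt) as handle:
        value = handle.read().strip()
    if not value:
        # Assumption: an empty file is a setup mistake rather than "no password".
        raise ValueError("no password found in %s" % passwd_txt)
    return value
```

Inside the generated config this would replace the inline read, for example `c.NotebookApp.password = read_password_hash(passwd_txt)`.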
.travis.yml
hadoop_hdfs_files_native_checksums.jy
hadoop_hdfs_files_stats.jy
hadoop_hdfs_time_block_reads.jy
ipython-notebook-pyspark.py
LICENSE
Makefile
pig-udfs.jy
README.md
welcome.py