Wilson Mar

Notebooks that display HTML markup between executable Python code

Overview

The name Jupyter comes from the combination of Julia, Python, and R (the statistics package).

Jupyter is a web application. You can run several separate notebook servers at once, each on its own port.

Each Jupyter notebook contains explanatory text, math equations, code, and visualizations all in one easily sharable document.

Notebooks were called “IPython” notebooks before support for languages other than Python was added.

Below is a guided learning experience. You perform actions in a planned sequence that takes you through various features. Commentary along the way describes concepts to understand.

You will open a notebook, add Markdown text, and add a heading while trying various keyboard shortcuts.

This is based on excellent tutorials from several sources:

https://github.com/dunovank/jupyter-themes


These occur within a Terminal window:

Jupyter session

  1. Make a directory to hold Jupyter session files.

  2. Navigate to a URL containing Jupyter Notebook files.

    Notebooks are JSON files with the extension .ipynb. Some sites append “.txt” to the file name for storage; a later step removes it.

    https://github.com/jakevdp/PythonDataScienceHandbook

    https://www.eecis.udel.edu/~boncelet/ipython.html

    https://github.com/odewahn/ipynb-examples

    https://github.com/ogrisel/notebooks

    An analysis of gravitational waves from two colliding black holes detected by the LIGO experiment.

    Sentiment-network.zip from Andrew Trask contains notebooks for sentiment analysis.
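
Because a notebook is plain JSON, it can be inspected without Jupyter. The sketch below is only an illustration: it assumes the version-4 notebook layout (a top-level "cells" list whose entries carry "cell_type" and "source"), and the file name sample.ipynb and the helper summarize are made up for the example.

```python
import json
import tempfile
from pathlib import Path

# A minimal notebook in the (assumed) version-4 layout: metadata plus a list of cells.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A heading\n"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello')\n"],
        },
    ],
}


def summarize(path):
    """Return (cell_type, first source line) for each cell in a notebook file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    summary = []
    for cell in data.get("cells", []):
        source = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        text = "".join(source) if isinstance(source, list) else source
        first_line = text.splitlines()[0] if text else ""
        summary.append((cell["cell_type"], first_line))
    return summary


# Write the sample to a temporary file (hypothetical name) and read it back.
sample_path = Path(tempfile.mkdtemp()) / "sample.ipynb"
sample_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
cell_summary = summarize(sample_path)
print(cell_summary)
```

Pointing summarize at a downloaded notebook gives a quick look at its contents before opening it in the browser.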

  3. Download a Jupyter Notebook file into your folder.

    You may have to move downloaded files from your Downloads folder.

  4. In a new Terminal window, start Jupyter (the trailing & runs it in the background so you can keep working on the command line):

    jupyter notebook &
    

    This creates a hidden folder, .jupyter, in your home folder, starts a local web server, and opens a tab in your default browser:

    http://localhost:8888/tree?token=cdc…

    Additional notebook servers increment the port number from 8888.
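
The idea of counting up from 8888 can be sketched as below. This is not Jupyter's own code, only an illustration of the behavior: the function name first_free_port and its parameters are hypothetical, and it simply tries to bind each port in turn on the local machine.

```python
import socket


def first_free_port(start=8888, attempts=50, host="127.0.0.1"):
    """Return the first port at or above `start` that can be bound on `host`."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except (OSError, OverflowError):
                # Port is taken (or out of range): try the next one.
                continue
            return port
    raise RuntimeError("no free port found in the range tried")


# If a server already holds 8888, this would report a higher number.
print(first_free_port())
```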

  5. Shut down the server from Terminal by pressing control + C twice.
  6. Enter the jupyter notebook command again to restart the server.

  7. Create a kernel by clicking on the New drop-down and selecting one.

    NOTE: A kernel is the process that runs the code in a notebook. Each open notebook page has its own kernel.

  8. Click on the Files tab to show all the files and folders in the current directory.

  9. Click Open.

  10. List all the currently running notebooks by clicking on the Running tab.

  11. Select “Python [Conda root]” for a new window (with tabs).

    CAUTION: Just closing the browser leaves a kernel running.

  12. To close and halt the kernel, select File, then Close and Halt.

    Conda

  13. cd to the folder containing the Jupyter notebook above.

  14. A) Install the packages listed in the requirements file into the currently active conda environment:

    conda install --file requirements.txt

    B) Create a stand-alone environment named PDSH with Python 3.5 and all the required package versions:

    conda create -n PDSH python=3.5 --file requirements.txt
    
  15. Activate the conda environment (for example, source activate PDSH; newer conda versions use conda activate PDSH).

  16. Install packages from within the Conda environment:

    conda install jupyter notebook numpy matplotlib scikit-learn bokeh
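
One way to confirm that the install worked is to ask Python, from inside the activated environment, whether each package can be found. This is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical, and note that scikit-learn is imported under the name sklearn.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not conda package names: scikit-learn imports as sklearn.
wanted = ["notebook", "numpy", "matplotlib", "sklearn", "bokeh"]
print("Not importable:", missing_modules(wanted))
```

An empty list means every package is visible to the Python interpreter in that environment.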

    Editing

  17. Click on the Edit tab.

    NOTE: Cells are where code is written and run.

    Several of the selections have an icon equivalent.

  18. Press Esc key.

  19. Each cell can be changed by just clicking on it.

    When the thick left border on a box is colored green, the box is in edit mode.

    If you don’t see a blinking cursor in the cell, click in the cell.

  20. To run the cell and leave edit mode, hold down Shift and press Enter.

    Command mode

    When a new cell is created, it is in command mode.

    When the thick left border on a box is colored blue, the box is in command mode.

    When the bar is blue, press H for a help screen about Keyboard shortcuts. It reads:

    “Command mode binds the keyboard to notebook level actions.”

    When the bar is blue, press A to create a new cell above the currently selected cell.

    Press B to create a cell below the currently selected cell.

    Command Palette

    NOTE: This does not work in Firefox and Internet Explorer, only in Chrome and Apple Safari.

  21. When in command mode, click the little keyboard icon, called the “command palette”.

    A panel appears with a search bar.

  22. Press down arrow to scroll down.

    Keyboard shortcuts are on the right side.

  23. Type the name of a command to search for it. This speeds up your workflow because you no longer hunt through the menus with your mouse.

    See http://docutils.sourceforge.net/rst.html

  24. Press Esc key.

    Python Code

  25. Click on the “Code” drop-down, which sets the type of content typed in the cell (input box).

  26. Select Markdown, a formatting syntax for writing web content.

  27. Type in some markdown text.

    https://daringfireball.net/projects/markdown/basics

    https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

    Two dollar signs ($$) begin and end a displayed math entry. Single dollar signs ($) wrap math inside a line of text.
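
As an illustration (the formula is an arbitrary example, not taken from the tutorials above), typing the following into a Markdown cell and running it should render one inline symbol and one displayed equation:

```latex
The sample mean $\bar{x}$ of $n$ values is

$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$
```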

  28. To run the Markdown cell, click the >| icon (which elsewhere means skip forward to the end), or press Shift + Enter (run and move to the next cell) or Control + Enter (run and stay on the cell).

    Heading

  29. Click the up/down arrows to move the cell above or below cells that have already been run.

  30. Select Heading, then type one # per heading level (each additional # makes a smaller heading) before typing the heading text.

    Open and Rename Notebook

  31. Click Files tab.
  32. Click Open.

    Select _____???

  33. Click the checkbox next to the file name.
  34. Click the Rename button.
  35. Remove the “.txt” and press OK.
  36. Click on the file name itself to open the file.

    A new tab should appear.

    Timings

    To time a single statement, put %timeit in front of it. IPython runs the statement repeatedly and reports the time per loop, scaling the units (ns, µs, ms, s) automatically.

    To time everything in a cell, put %%timeit at the top of the cell.
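
The magics only work inside IPython or a notebook. Outside a notebook, the standard library's timeit module gives a similar measurement. The sketch below is an approximation of what %timeit does, not its implementation; the function total_of_squares and the repeat counts are arbitrary choices for the example.

```python
import timeit


def total_of_squares(n=1000):
    """A small function to time: the sum of i*i for i below n."""
    return sum(i * i for i in range(n))


# Take 5 measurements; each measurement calls the function 200 times.
calls_per_run = 200
runs = timeit.repeat(total_of_squares, number=calls_per_run, repeat=5)

# The fastest run is the least disturbed by other activity on the machine.
best_per_call = min(runs) / calls_per_run
print(f"best of 5: {best_per_call * 1e6:.1f} microseconds per call")
```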

    Debugging

    %pdb in a cell toggles automatic debugging: once it is on, an uncaught exception opens the pdb debugger at the point of failure.

    https://docs.python.org/3/library/pdb.html
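
Roughly the same effect can be had in plain Python with pdb.post_mortem. The wrapper below is a hypothetical helper written for this page, not part of IPython; the interactive flag is off by default so the example runs without stopping at a debugger prompt.

```python
import pdb
import sys


def run_with_postmortem(func, *args, interactive=False):
    """Call func(*args); on an exception, optionally open pdb where it failed."""
    try:
        return func(*args)
    except Exception:
        if interactive:
            # Opens the debugger on the traceback of the exception just raised.
            pdb.post_mortem(sys.exc_info()[2])
        raise


print(run_with_postmortem(divmod, 7, 2))  # prints (3, 1)
```

Calling run_with_postmortem(lambda: 1 / 0, interactive=True) in a terminal would drop into the debugger at the division before the error propagates.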

    Visualization magic keywords

    %matplotlib inline

    See http://ipython.readthedocs.io/en/stable/interactive/magics.html for docs about magic commands.

    nbextensions

  37. Code to load an extension dynamically:
    %%javascript
    IPython.load_extensions('custom_shortcuts');
    

    See https://carreau.gitbooks.io/jupyter-book/content/Jsextensions.html

    Raw NBConvert

    nbextensions code modifies the UI and behavior of Jupyter itself in the browser.

    So Jupyter notebook extensions are written in JavaScript and CSS.

    nbextensions are installed in the directory of the same name, either system wide or in your user profile. Their entry point is named .js

    Output as HTML

  38. To share a Notebook with others who do not have Notebook installed, convert the Notebook to HTML or Markdown.

    PROTIP: Some prefer receiving Markdown text so they can paste it into blog editing software, which formats the Markdown to their own liking.

    
    jupyter nbconvert --to html notebook.ipynb
    

    See https://nbconvert.readthedocs.io/en/latest/usage.html

    See https://ipython.org/ipython-doc/1/interactive/nbconvert.html#notebook-format
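
To give a feel for what a Markdown conversion involves, here is a rough standard-library sketch. It is not how nbconvert works internally and handles far less: it assumes the version-4 notebook layout, assumes code cells are Python, and ignores cell outputs. The function name notebook_to_markdown is made up for the example.

```python
FENCE = "`" * 3  # three backticks, built here so this example stays readable


def notebook_to_markdown(notebook):
    """Turn a notebook dict into Markdown text: prose as-is, code in fences."""
    parts = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # "source" may be a list of lines or a single string.
        text = "".join(source) if isinstance(source, list) else source
        if cell.get("cell_type") == "markdown":
            parts.append(text.strip())
        elif cell.get("cell_type") == "code":
            # Assumption: the notebook's code cells are Python.
            parts.append(f"{FENCE}python\n{text.strip()}\n{FENCE}")
    return "\n\n".join(parts) + "\n"


sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "Some text.\n"]},
        {"cell_type": "code", "source": "x = 1 + 1\nprint(x)\n", "outputs": []},
    ]
}
markdown_text = notebook_to_markdown(sample)
print(markdown_text)
```

For real notebooks, the jupyter nbconvert command shown above is the better choice because it also handles outputs, images, and other cell types.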

    GitHub

    https://github.com/blog/1995-github-jupyter-notebooks-3 GitHub renders Jupyter Notebooks with Git Large File Support.

    NBViewer

    http://nbviewer.jupyter.org/ renders notebooks from a GitHub repo or from notebooks stored elsewhere.

    Pandas Slideshows

  39. Click the View tab.

    See http://nbviewer.jupyter.org/format/slides/github/jorisvandenbossche/2015-PyDataParis/blob/master/pandas_introduction.ipynb#/

Turi (Dato) Python algorithms

GraphLab Create from Dato provides scalable “pre-implemented” ML algorithms using Python installed using Anaconda. Entire courses on its use are at:

  • https://www.coursera.org/learn/ml-foundations
  • https://www.turi.com/learn/userguide/
  • https://www.turi.com/products/create/docs/
  • https://github.com/learnml/machine-learning-specialization
  • https://www.coursera.org/learn/ml-clustering-and-retrieval/supplement/iF7Ji/software-tools-you-ll-need-for-this-course

When the one-year free license is over, note that scikit-learn is a free alternative that also uses Python with Anaconda.

https://www.youtube.com/watch?v=GxZUdZMPGpQ Large-Scale Deep Learning with TensorFlow Turi, Inc.

Additional

Clusters is no longer used to create multiple kernels for parallel computing.

https://ipyparallel.readthedocs.io/en/latest/intro.html

https://bcourses.berkeley.edu/courses/1267848 Introduction to Data Science at UC Berkeley

Python for Data Analysis by Wes McKinney