All the options for Enterprise integration
Microsoft has a “Cheat Sheet” to help you select a Machine Learning algorithm from among the various initiatives:
A-Z List of Machine Learning Studio Modules from Microsoft Azure
https://bit.ly/a4r-mlbook Azure Machine Learning: Microsoft Azure Essentials
https://www.youtube.com/watch?v=eUce2cB844s&t=19m52s Hands-On with Azure Machine Learning 26 Sep 2016 predicts car prices
Some utilities may involve conventional lookups of data:
Algorithmia.com provides API interfaces to algorithms offered by its partners.
https://algorithmia.com/algorithms/opencv/ChangeImageFormat (from jpg to png)
- Google Translate API has been working on websites for years.
Image Recognition / Computer Vision
https://algorithmia.com/algorithms/opencv/FaceDetection then https://algorithmia.com/algorithms/opencv/CensorFace
Some of these make use of OpenCV (CV = Computer Vision).
http://www.deeplearningbook.org by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Google Cloud Speech API, which powers Google’s own voice search and voice-enabled apps.
Speech to Text
Analyze text for positive or negative sentiment, based on a training database of potential word meanings:
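As a rough illustration of the idea, here is a minimal sketch of lexicon-based sentiment scoring. The `LEXICON` dictionary and its scores are hypothetical stand-ins for a real training database, and the function names are made up for this example; hosted sentiment services are likely far more elaborate.

```python
# Hypothetical word scores; a real system would derive these from training data.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}


def sentiment_score(text):
    """Sum the scores of known words; unknown words count as 0."""
    words = text.lower().split()
    return sum(LEXICON.get(word.strip(".,!?"), 0) for word in words)


def sentiment_label(text):
    """Map the numeric score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("I love this great phone!"))
```

A plain word sum like this ignores negation ("not good") and context, which is one reason trained models are usually preferred.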
Document (article) Search
TF-IDF (Term Frequency - Inverse Document Frequency) emphasizes important words: those which appear frequently in a document (common locally) but rarely in the corpus searched (rare globally). The weighted words of a document form a vector. Term frequency is measured by word count (how many occurrences of each word).
The IDF, used to downweight common words, is the log of (#docs divided by (1 + #docs using the given word)).
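The weighting described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming whitespace tokenization and the natural log; the function name `tf_idf` and the sample sentences are made up for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per document.

    tf  = raw count of the word in the document
    idf = log(N / (1 + number of documents containing the word))
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Count, for each word, how many documents contain it at least once.
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    weights = []
    for words in tokenized:
        counts = Counter(words)
        weights.append(
            {w: c * math.log(n_docs / (1 + doc_freq[w])) for w, c in counts.items()}
        )
    return weights


docs = ["the cat sat", "the dog sat", "the bird flew", "the fish swam"]
for vector in tf_idf(docs):
    print(vector)
```

With this form of the IDF, a word that appears in every document gets a negative weight, so very common words such as "the" are pushed below the rare ones.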
Cosine similarity normalizes vectors so that a small angle θ between two of them identifies similar documents.
Normalizing makes the comparison invariant to the number of words. The common compromise is to cap maximum word count.
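A minimal sketch of cosine similarity over sparse word vectors, assuming each document is stored as a `{word: weight}` dictionary. The sample documents are invented to show the invariance to document length.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


short_doc = {"azure": 1, "learning": 2}
long_doc = {"azure": 10, "learning": 20}  # same proportions, ten times the words
other_doc = {"conda": 3, "python": 1}

print(cosine_similarity(short_doc, long_doc))
print(cosine_similarity(short_doc, other_doc))
```

The short and long documents point in the same direction, so their similarity is essentially 1 despite the difference in word count, while documents sharing no words score 0.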
Microsoft Azure Machine Learning
https://azure.microsoft.com/en-us/services/machine-learning offers free plans
Guest Workspace for 8 hours on https://studio.azureml.net/Home/ViewWorkspaceCached/…
Registered free workspaces with 10 GB storage can scale resources to increase experiment execution performance.
All their plans offer:
- Stock sample datasets
- R and Python script support
- Full range of ML algorithms
- Predictive web services
Follow this machine learning tutorial to use Azure Machine Learning Studio to create a linear regression model that predicts the price of an automobile based on different variables such as make and technical specifications. Then iterate on a simple predictive analytics experiment after
Regression predicts a number (such as a price).
Classification predicts a category, often labeled with a string.
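To make regression concrete, here is a minimal sketch of fitting a straight line by ordinary least squares in plain Python. The horsepower and price figures are invented for illustration, and this is not how Azure Machine Learning Studio implements its module.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: engine horsepower and car price.
horsepower = [70, 100, 130, 160]
price = [12000, 15000, 18000, 21000]

slope, intercept = fit_line(horsepower, price)
predicted = slope * 115 + intercept  # predicted price for a 115 hp car
print(round(predicted))
```

A real car-price model would use many variables (make, body style, engine size), but the principle of minimizing squared error is the same.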
Enter Microsoft’s Learning Studio:
As per this video:
Look at examples in the Cortana Intelligence Gallery
Take the introductory tutorial:
Create a model
As per this video using
- Clean Missing Data
- Clip Outliers
- Edit Metadata
- Feature Selection
- Learning with Counts
- Normalize Data
- Partition and Sample
- Principal Component Analysis
- Quantize Data
- SQLite Transformation
- Synthetic Minority Oversampling Technique
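Three of the steps listed above (cleaning missing data, clipping outliers, normalizing) can be sketched in plain Python. These helper functions are hypothetical illustrations of the concepts, not the Studio modules themselves, which offer many more options.

```python
def clean_missing(values):
    """Replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def clip_outliers(values, low, high):
    """Force every value into the range [low, high]."""
    return [min(max(v, low), high) for v in values]


def normalize(values):
    """Min-max scale the values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [10, None, 30, 500, 20]
cleaned = clean_missing(raw)
clipped = clip_outliers(cleaned, 0, 100)
scaled = normalize(clipped)
print(scaled)
```

Order matters here: filling the gap with the mean before clipping lets the outlier 500 inflate the filled value, so in practice it may be better to clip first.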
Train the model
- Cross Validation
- Parameter Sweep
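Cross validation splits the data into k folds and uses each fold once for testing while training on the rest. A minimal sketch of the index bookkeeping, assuming no shuffling (the name `k_fold_splits` is made up for this example):

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_splits(10, 3):
    print("train:", train, "test:", test)
```

A parameter sweep would typically repeat this loop once per candidate setting and keep the setting with the best average test score.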
Score and test the model
Make predictions with Elastic APIs
- Request-Response Service (RRS) Predictive Experiment
- Batch Execution Service (BES)
- Retraining API
Python 3.6 has formatted string literals (f-strings).
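A short example of the feature, with made-up variable names. Expressions inside the braces are evaluated in place, and a format specification can follow a colon.

```python
name = "my_env"
packages = 3
size_mb = 1234.5678

# The f prefix enables {expression} and {expression:format} substitution.
message = f"{name} has {packages} packages using {size_mb:.1f} MB"
print(message)  # my_env has 3 packages using 1234.6 MB
```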
Conda is similar to virtualenv and pyenv, other popular environment managers.
conda install numpy pandas matplotlib
conda install jupyter notebook
conda install -c https://conda.binstar.org/menpo opencv
Can’t find it? Look among all users and operating systems supported
anaconda search -t conda pygame
On a Mac https://anaconda.org/tlatorre/pygame is not recognized because it’s only for Linux.
On Stack Overflow a user recommends one that supports Windows 32 and 64, macOS, and Linux:
conda install -c cogsci pygame=1.9.2a0
pip install pygame
Copy a user/package to show more info:
anaconda show USER/PACKAGE
List the packages installed, each with its version number and the Python version it was built for:
conda list
Create new environment for Python, specifying packages needed:
conda create -n my_env python=3 numpy pandas
Enter an environment on Mac:
source activate my_env
When you’re in the environment, the environment’s name appears in the prompt:
(my_env) ~ $.
Leave the environment (like exit):
source deactivate
On Windows, it’s just deactivate.
Get back in again.
Create an environment file by redirecting the output of an export:
conda env export > some_env.yaml
When sharing your code on GitHub, it’s good practice to make an environment file and include it in the repository. This makes it easier for people to install all the dependencies for your code. I also usually include a pip requirements.txt file, generated with pip freeze, for people not using conda.
Load an environment metadata file:
conda env create -f some_env.yaml
List environments created on your machine:
conda env list
Remove an environment:
conda env remove -n some_env
Add a package
anaconda show menpo/opencv3
conda install --channel https://conda.anaconda.org/menpo opencv3
Test within Python (at the >>> prompt):
import cv2
print(cv2.__version__)
The response should be:
Install readline to do autocompletion in Jupyter notebooks by hitting
conda/pip install readline
Readline comes with anaconda
http://www.h2o.ai/h2o/ is a platform built using Java on top of H2O’s rapid in-memory distributed parallel processing. Its models can be visually inspected during training, which is unique to H2O, so users can immediately spot a job that should be stopped and more quickly iterate to find the optimal approach.
Hello World [6:52] apples and oranges
Visualizing a Decision Tree - Apr 13, 2016 [6:31]
50 examples of each of 3 species of irises, with 4 features (sepal and petal length and width), at https://en.wikipedia.org/wiki/Iris_flower_data_set
open -a preview iris.pdf
sudo python3 -m pip install pydot
What Makes a Good Feature?
Let’s Write a Pipeline
Writing Our First Classifier [8:43]
Train an Image Classifier with TensorFlow for Poets
- Classifying Handwritten Digits with TF.Learn
https://hub.docker.com/r/jbgordon/recipes/ is a Docker image to help folks having trouble with Pydot or Graphviz. It has all the dependencies set up, along with installation instructions.
awesome-machine-learning provides many links to resources, so they will not be repeated here.