Here's how the corporate overlords are making humans into robots.
Major organizations are in an arms race to offer Artificial Intelligence (AI) and Machine Learning (ML) services in their clouds:
- Microsoft Cortana in Azure
- IBM Watson
- Amazon Alexa
Each of the above is a cloud vendor hoping to cash in by charging to process other people's data.
Benedict Evans, the resident futurist at venture capital firm Andreessen Horowitz, observes in a recent blog post that the future of AI remains opaque: “This field is moving so fast that it’s not easy to say where the strongest leads necessarily are, nor to work out which things will be commodities and which will be strong points of difference.”
Algorithmia.com provides API interfaces to algorithms offered by its partners.
awesome-machine-learning provides many links to resources, so they will not be repeated here.
In 2015, Microsoft showed off its facial recognition capabilities with how-old.net, which guesses how old someone is from a photo. At conferences, Microsoft built booths that take visitors' pictures.
In 2016, Microsoft unleashed the Tay chat bot.
Algorithms from Azure
Below are various initiatives by Microsoft (MS) and other organizations:
The A-Z List of Machine Learning Studio Modules from Microsoft Azure includes basic database and UI features such as forms, which suggests Microsoft is building standard computing functions on top of its AI capabilities.
Some utilities may involve conventional lookups of data:
https://algorithmia.com/algorithms/opencv/ChangeImageFormat (from jpg to png)
- Google Translate API has been working on websites for years.
Image Recognition / Computer Vision
https://algorithmia.com/algorithms/opencv/FaceDetection then https://algorithmia.com/algorithms/opencv/CensorFace
Some of these make use of OpenCV (CV = Computer Vision).
Speech to Text
The Google Cloud Speech API powers Google's own voice search and voice-enabled apps.
NLP Sentiment Analysis
Analyze text for positive or negative sentiment (opinion), based on a training database of potential word meanings. This involves Natural Language Processing (NLP):
Andrew W. Trask, a PhD student at the University of Oxford researching Deep Learning for Natural Language Processing, authored Grokking Deep Learning.
Bag of Words and Word2vec transform words into vectors. Use TFLearn, a Python library, for quickly building networks.
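The bag-of-words idea can be sketched in a few lines of plain Python: each text becomes a vector of word counts against a fixed vocabulary. (A toy illustration with a made-up vocabulary; a real pipeline would use TFLearn or Word2vec embeddings.)

```python
from collections import Counter

def bag_of_words(text, vocab):
    """Count how often each vocabulary word appears in the text."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

vocab = ["great", "terrible", "fun", "boring"]
vec = bag_of_words("A great movie great fun", vocab)  # -> [2, 0, 1, 0]
```

A sentiment classifier is then trained on such vectors, with labels like positive/negative as the target.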
Document (article) Search
TF-IDF (Term Frequency - Inverse Document Frequency) emphasizes important words: those which appear frequently in a document (common locally) but rarely in the corpus searched (rare globally). Each document becomes a vector of weights. Term frequency is measured by word count (how many occurrences of each word).
The IDF used to downweight common words is the log of the number of documents divided by (1 + the number of documents using the given word).
Cosine similarity normalizes the vectors so that a small angle theta between them identifies similar documents.
Normalizing makes the comparison invariant to the number of words in each document. A common compromise is to cap the maximum word count.
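The TF-IDF weighting and cosine similarity described above can be sketched in pure Python on a tiny made-up corpus (an illustration of the formulas, not a production search engine):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as the dog barked",
]

def tf_idf(doc, corpus):
    """TF-IDF weights: word count * log(N / (1 + docs containing the word))."""
    tf = Counter(doc.split())
    n = len(corpus)
    return {
        w: tf[w] * math.log(n / (1 + sum(w in d.split() for d in corpus)))
        for w in tf
    }

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Because the vectors are normalized by their lengths, a long document and a short document about the same topic still score as similar.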
Microsoft Azure Machine Learning
https://azure.microsoft.com/en-us/services/machine-learning offers free plans
Guest Workspace for 8 hours on https://studio.azureml.net/Home/ViewWorkspaceCached/…
Registered free workspaces with 10 GB storage can scale resources to increase experiment execution performance.
All their plans offer:
- Stock sample datasets
- R and Python script support
- Full range of ML algorithms
- Predictive web services
Follow this machine learning tutorial to use Azure Machine Learning Studio to create a linear regression model that predicts the price of an automobile based on variables such as make and technical specifications. Then iterate on a simple predictive analytics experiment.
Enter Microsoft’s Learning Studio:
As per this video:
Look at examples in the Cortana Intelligence Gallery
Take the introductory tutorial:
Create a model
As per this video using
- Clean Missing Data
- Clip Outliers
- Edit Metadata
- Feature Selection
- Learning with Counts
- Normalize Data
- Partition and Sample
- Principal Component Analysis
- Quantize Data
- SQLite Transformation
- Synthetic Minority Oversampling Technique
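Two of these steps, Clean Missing Data and Normalize Data, can be sketched in plain Python to show what they do to a column of values (a toy illustration of the concepts, not the Azure module APIs):

```python
def clean_missing(values):
    """Replace None entries with the mean of the observed values (mean imputation)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def normalize(values):
    """Min-max scale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

prices = clean_missing([10.0, None, 30.0])  # -> [10.0, 20.0, 30.0]
scaled = normalize(prices)                  # -> [0.0, 0.5, 1.0]
```

The Azure modules offer more options (median imputation, z-score scaling, etc.), but the intent is the same: fill gaps and put features on a comparable scale before training.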
Train the model
- Cross Validation
- Parameter Sweep
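Cross validation splits the data into k folds, training on k-1 folds and scoring on the held-out fold so every row is tested exactly once. A minimal fold-splitting sketch (not Azure's implementation):

```python
def k_fold_splits(data, k):
    """Yield (train, test) pairs; each fold is held out once as the test set."""
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

A parameter sweep then repeats this procedure for each candidate setting (e.g. learning rate, tree depth) and keeps the setting with the best average score across folds.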
Score and test the model
Make predictions with Elastic APIs
- Request-Response Service (RRS) Predictive Experiment
- Batch Execution Service (BES)
- Retraining API
Python 3.6 has formatted string literals (f-strings).
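For example, expressions inside the braces of an f-string are evaluated at run time and interpolated into the string:

```python
name = "Azure ML"
version = 3.6
# Expressions inside {} are evaluated and interpolated at run time.
print(f"{name} examples require Python {version} or later")
print(f"2 + 2 = {2 + 2}")
```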
Conda is similar to virtualenv and pyenv, other popular environment managers.
conda install numpy pandas matplotlib
conda install jupyter notebook
List the packages installed, with their version numbers and the version of Python:
conda list
Create new environment for Python, specifying packages needed:
conda create -n my_env python=3 numpy pandas
Enter an environment on Mac:
source activate my_env
When you’re in the environment, the environment’s name appears in the prompt:
(my_env) ~ $
Leave the environment on a Mac:
source deactivate
On Windows, it’s just deactivate.
Get back in again:
source activate my_env
Create an environment file by piping the output from an export:
conda env export > some_env.yaml
When sharing your code on GitHub, it’s good practice to make an environment file and include it in the repository. This will make it easier for people to install all the dependencies for your code. I also usually include a pip requirements.txt file using pip freeze (learn more here) for people not using conda.
Create an environment from a metadata file:
conda env create -f some_env.yaml
List environments created on your machine:
conda env list
Remove an environment:
conda env remove -n some_env