Wilson Mar

Brand names for how corporate overlords are making humans into robots


Overview

Leading Companies

In the 2010s there has been an “arms race” among cloud vendors to offer Artificial Intelligence (AI) and Machine Learning (ML) services in their clouds:

Each of the above is a cloud vendor hoping to cash in by charging for storing and processing data on its cloud. Facebook, as we all know to our chagrin, makes money by selling advertisers targeted access to its users, based on their data.

Benedict Evans, resident futurist at venture capital firm Andreessen Horowitz, observes in a blog post that the future of AI remains opaque: “This field is moving so fast that it’s not easy to say where the strongest leads necessarily are, nor to work out which things will be commodities and which will be strong points of difference.”

awesome-machine-learning provides many links to resources, so they will not be repeated here.

Pubs

There is a website that specializes in academic publications about Artificial Intelligence. See the arXiv Paper Analysis Worksheet (Responses) on Google Sheets.

Microsoft Academic Graph (MAG) is a knowledge base mined from the Bing web index. It models scholarly activities: field of study, author, institution, paper, venue, and event.

Algorithmia

Algorithmia.com provides APIs to algorithms offered by its partners. It has these data conversion utilities for conventional lookups of data:

  • https://algorithmia.com/algorithms/alixaxel/CoordinatesToTimezone

  • https://algorithmia.com/algorithms/Geo/ZipData

  • https://algorithmia.com/algorithms/Geo/ZipToState

  • https://algorithmia.com/algorithms/Geo/LatLongDistance

  • https://algorithmia.com/algorithms/Geo/LatLongToUTM

  • https://algorithmia.com/algorithms/util/ip2hostname

  • https://algorithmia.com/algorithms/opencv/ChangeImageFormat (from jpg to png)

The offerings

Translation

Google Translate (https://translate.google.com) and the Google Translate API have been translating text and websites since 2006. In 2017 Google made a breakthrough.

Microsoft’s Translator Speech

Computer Vision

The open-source OpenCV (Open Source Computer Vision) library was an early entrant and is still used today by many because it is written in C and C++ and runs quite efficiently.

Microsoft’s Computer Vision

Hands-on guide: build a classifier with Custom Vision at https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/getting-started-build-a-classifier

Microsoft’s “Face”

  • https://algorithmia.com/algorithms/z/ColorPalettefromImage

  • Google Cloud Vision API

  • https://algorithmia.com/algorithms/opencv/FaceDetection then https://algorithmia.com/algorithms/opencv/CensorFace

  • https://algorithmia.com/algorithms/ocr/RecognizeCharacters OCR

Some of these make use of OpenCV (CV = Computer Vision).

Voice Recognition

Microsoft’s Web App Bot

NLP Sentiment Analysis

Sentiment analysis rates text as expressing positive or negative sentiment (opinion), based on a training database of potential word meanings. This involves Natural Language Processing (NLP):

  • https://algorithmia.com/algorithms/nlp/SentimentAnalysis

  • IBM’s algorithm

Andrew W. Trask, a PhD student at the University of Oxford working on Deep Learning for Natural Language Processing, authored Grokking Deep Learning.

Use Bag of Words and Word2vec to transform words into vectors. Use TFLearn, a Python library for quickly building neural networks.
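To make the Bag of Words idea concrete, here is a minimal sketch in plain Python. It assumes a simple lowercase tokenizer and two made-up sample sentences; the function names (tokenize, build_vocabulary, bag_of_words) are illustrative and do not come from TFLearn or any other library. Word2vec is not shown because it learns dense vectors from training rather than from counting.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents):
    """Map each distinct word in the corpus to a fixed position (sorted order)."""
    words = sorted({word for doc in documents for word in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}

def bag_of_words(text, vocabulary):
    """Return a count vector with one slot per vocabulary word."""
    counts = Counter(tokenize(text))
    return [counts.get(word, 0) for word in vocabulary]

# Hypothetical sample data, for illustration only.
documents = [
    "the movie was great",
    "the movie was not great, not at all",
]
vocabulary = build_vocabulary(documents)
vectors = [bag_of_words(doc, vocabulary) for doc in documents]

print(list(vocabulary))  # ['all', 'at', 'great', 'movie', 'not', 'the', 'was']
print(vectors[0])        # [0, 0, 1, 1, 0, 1, 1]
print(vectors[1])        # [1, 1, 1, 1, 2, 1, 1]
```

Word order is discarded: only the counts survive, which is why the second sentence differs from the first mainly in its "not" slot.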

Document (article) Search

Google made its fortune by offering search services.

Microsoft’s Bing Search

TF-IDF (Term Frequency - Inverse Document Frequency) emphasizes important words: those which appear frequently in a document (common locally) but rarely in the corpus searched (rare globally). Each document is represented as a vector of word weights. Term frequency is measured by word count (how many occurrences of each word).

The IDF, used to downweight common words, is the log of the number of documents divided by (1 + the number of documents using the given word).
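The weighting described above can be sketched in a few lines of Python. This is a hedged illustration using exactly the variant stated here, log(#docs / (1 + #docs using the word)); libraries such as scikit-learn use slightly different smoothing. The three-document corpus and the function names are made up for the example.

```python
import math
from collections import Counter

def inverse_document_frequency(corpus):
    """IDF per word: log(#docs / (1 + #docs containing the word))."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for tokens in corpus:
        doc_freq.update(set(tokens))  # count each word once per document
    return {word: math.log(n_docs / (1 + df)) for word, df in doc_freq.items()}

def tf_idf(tokens, idf):
    """Weight each word's count in one document by its corpus-wide IDF."""
    return {word: count * idf.get(word, 0.0)
            for word, count in Counter(tokens).items()}

# Hypothetical corpus, already tokenized by whitespace.
corpus = [doc.split() for doc in (
    "the cat sat",
    "the dog sat",
    "the bird flew",
)]
idf = inverse_document_frequency(corpus)
weights = tf_idf(corpus[0], idf)

# "cat" (rare globally) outweighs "the" (present in every document).
print(max(weights, key=weights.get))
```

With this particular formula a word that occurs in every document receives a slightly negative weight, because the denominator exceeds the number of documents. That is harmless for ranking, but it is one reason other variants add smoothing terms.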

Cosine similarity normalizes vectors by their length, so that a small angle (theta) between two vectors indicates that the documents are similar.

Normalizing makes the comparison invariant to the number of words. The common compromise is to cap maximum word count.
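A minimal Python sketch of cosine similarity follows, assuming documents are already represented as equal-length lists of word weights. The sample vectors are invented to show the length invariance: doubling every count leaves the score unchanged.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical word-count vectors over a shared three-word vocabulary.
short_doc = [1, 2, 0]
long_doc = [2, 4, 0]   # same proportions, twice the length
other_doc = [0, 0, 3]  # shares no words with the others

print(cosine_similarity(short_doc, long_doc))   # close to 1: same direction
print(cosine_similarity(short_doc, other_doc))  # 0.0: no words in common
```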

Recommender

Recommender systems advise users about what to do, based on patterns detected in similar situations observed in the past.

Common techniques are collaborative filtering and factorization machines.

At scale, the solution can be implemented using sparse distributed matrices in PySpark.
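As a small-scale illustration of collaborative filtering, the sketch below predicts a rating as the similarity-weighted average of what other users gave the same item. It is plain single-machine Python, not PySpark, and it is user-based filtering rather than a factorization machine. The ratings, user names, and function names are all hypothetical.

```python
import math

def similarity(ratings_a, ratings_b):
    """Cosine similarity computed over the items both users rated."""
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in shared)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = similarity(ratings[user], other_ratings)
        weighted_sum += weight * other_ratings[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None

# Hypothetical ratings on a 1-5 scale; "ann" has not rated "casablanca".
ratings = {
    "ann": {"alien": 5, "brazil": 4},
    "bob": {"alien": 5, "brazil": 4, "casablanca": 2},
    "cat": {"alien": 1, "brazil": 5, "casablanca": 5},
}
prediction = predict(ratings, "ann", "casablanca")
print(round(prediction, 2))
```

Because ann's tastes match bob's exactly, the prediction lands nearer to bob's rating of 2 than to cat's rating of 5. The dictionaries store only the ratings that exist, which is the same sparsity that distributed matrix implementations exploit.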

Footnotes

https://www.wikiwand.com/en/Deep_learning

More

This is one of a series on AI, Machine Learning, Deep Learning, Robotics, and Analytics:

  1. AI Ecosystem
  2. Machine Learning
  3. Testing AI

  4. Microsoft’s AI
  5. Microsoft’s Azure Machine Learning Algorithms
  6. Microsoft’s Azure Machine Learning tutorial
  7. Microsoft’s Azure Machine Learning certification

  8. Python installation
  9. Jupyter notebooks processing Python for humans

  10. Image Processing
  11. Amazon Lex text to speech

  12. Code Generation

  13. Multiple Regression calculation and visualization using Excel and Machine Learning
  14. Tableau Data Visualization