Wilson Mar

Here’s how to earn a legitimate micro-degree

Overview

This page contains my notes about both the Data Science track and the Artificial Intelligence track of the Microsoft Professional Program (MPP).

What’s Data Science?

Business Intelligence (BI) is closely tied to the design, implementation, and use of data warehouses, and the database-oriented technology to support them.

Data scientists are consumers of BI. They are less concerned with building permanent infrastructure for others to use, and more concerned with answering questions and communicating the results.

Microsoft

The Microsoft Professional Program (MPP) Certification for Artificial Intelligence, announced April 2018, covers 10 skills over 10 courses of 8 to 16 hours each, led by Graeme Malcolm:

https://academy.microsoft.com/en-us/professional-program/data-science
Data science track in the Microsoft Professional Program.

https://www.edx.org/microsoft-professional-program-certficate-data-science
Microsoft Professional Program Certificate in Data Science offered by EdX.com

The courses cover the use of Excel, Python, and R on desktop machines, plus Spark big data in Azure. You need to take only 9 of the 12 classes offered, because some of the courses let you choose either Python or R.

PROTIP: Pay to get verified on 10 classes (for a total of $990), then audit the rest for free. For example, if you only want to learn Python and not R:

The DS and AI columns show each course's position in the Data Science and Artificial Intelligence tracks ("-" means the course is not in that track). Rows numbered 1), 2), 3) are alternatives that share one fee.

| US$ | Course | Instructors | DS | AI |
| --- | --- | --- | --- | --- |
| $99 | Introduction to Data Science | Graeme Malcolm, Liberty Munson | 1 | - |
| $99 | Introduction to AI | Graeme Malcolm | - | 1 |
| $99 | Querying Data with Transact-SQL | Geoff Allix | 2 | - |
| $99 | 1) Analyzing and Visualizing Data with Excel | Dany Hoter, Jonathan Sanito | 3 | - |
|     | 2) Analyzing and Visualizing Data with Power BI | Dany Hoter | 4 | - |
| $99 | Essential Mathematics for Artificial Intelligence | Graeme Malcolm | - | 3 |
| $99 | Ethics and Law in Data and Analytics | Ben Olsen, Geneva Lasprogata, Nathan Colaner | - | 4 |
| $99 | Essential Statistics for Data Analysis using Excel | Liberty J. Munson, Matthew Minton | 5 | - |
| $99 | 1) Introduction to Python for Data Science | Filip Schouwenaars, Jonathan Sanito | 6 | 3 |
|     | 2) Introduction to R for Data Science | Filip Schouwenaars, Jonathan Sanito | 6 | - |
| $99 | Data Science Essentials | Graeme Malcolm, Cynthia Rudin, Steve Elston | 7 | 5 |
| $99 | Principles of Machine Learning | Graeme Malcolm, Steve Elston, Cynthia Rudin | 8 | 6 |
| $99 | 1) Programming with Python for Data Science | ? | 9 | - |
|     | 2) Programming with R for Data Science | ? | 9 | - |
| $99 | 1) Applied Machine Learning | Graeme Malcolm, Cynthia Rudin, Steve Elston | - | - |
|     | 2) Implementing Predictive Solutions with Spark in Azure HDInsight (using Python, Scala, and R with Apache Spark) | Graeme Malcolm | 9 | - |
|     | 3) Developing Intelligent Applications | Gerry O'Brien, Amy Nicholson | 9 | - |
| $49 | Data Science Professional Capstone Project | Graeme Malcolm | 10 | - |
| $99 | Deep Learning Explained | Jonathan Sanito, Sayan Pathak, Roland Fernandez | - | 7 |
| $99 | Reinforcement Learning Explained | Jonathan Sanito, Roland Fernandez, Adith Swaminathan | - | 8 |
| $99 | 1) Natural Language Processing (NLP) | Lei Ma, Roland Fernandez, Xiaodong He | - | 9 |
|     | 2) Speech Recognition | Adrian Leven | - | 9 |
|     | 3) Computer Vision and Image Analysis | Andrew Byrne, Ivan Griffin, Daire McNamara | - | 9 |
| $49 | Microsoft Professional Capstone : Artificial Intelligence (DAT 264X) | Graeme Malcolm | - | 10 |

Last year, the predecessor of this program charged $25 for the intro class and $49 for the “Capstone”.

NOTE: There is NO one-on-one tutoring with these classes, unlike the $1,000 Xamarin certificate. And there is no job search assistance like Udacity provides with its $398 Machine Learning course.

Each class contains several modules.

A set of classes begins every 3 months, starting on the first of January, April, July, and September.

CAUTION: If you sign up a week before the end of the quarter, you’ll only have a week to complete that class.

Tools references

  • The Excel 2016 Professional Plus edition has the Power Pivot add-in and some advanced Power Query functionality.

  • Machine Learning: https://aka.ms/edx-dat211x-aml01

  • Stream Analytics: https://aka.ms/edx-dat211x-az01
  • Python Tools for Visual Studio: https://aka.ms/edx-dat208x-vs01

  • Web Services in .NET 2.0: https://aka.ms/edx-dat211x-wsn

  • SQL Database: https://aka.ms/edx-dat101x-sql01
  • SQL Data Warehouse: https://aka.ms/edx-dat101x-sql02
  • Microsoft SQL server: https://aka.ms/edx-dat215.5x-sql
  • Power BI: https://aka.ms/edx-dat101x-pbi01

Cognitive Services on Azure

  • Microsoft Azure: https://aka.ms/edx-dat211x-az02
  • Microsoft Azure portal: https://aka.ms/edx-dat211x-az04
  • Cognitive Services API: https://aka.ms/edx-dat211x-cogs
  • Cognitive Services API: https://aka.ms/edx-dat211x-cog01 resolves to https://www.microsoft.com/en-us/

Cognitive Services enables you to build applications that can tap into the powerful artificial intelligence (AI) algorithms that Microsoft has developed and hosts on Microsoft Azure. These AI algorithms offer machine-based intelligence for your applications around vision, speech, and language.

API Headers contain:

Content-Type (Optional) - a string type indicating the media type of the body that is sent to the API

Ocp-Apim-Subscription-Key - a string type that contains your API key that is found in your Cognitive Services account on Microsoft Azure
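
As a rough sketch, the two headers above can be assembled into a Python dictionary before making a request. The function name `build_headers`, the `COGNITIVE_KEY` environment variable, and the `application/json` default are my own assumptions for illustration, not something the course prescribes.

```python
import os


def build_headers(subscription_key, content_type="application/json"):
    """Return the two request headers described above as a dict."""
    if not subscription_key:
        raise ValueError("a Cognitive Services subscription key is required")
    return {
        # Optional: media type of the body sent to the API.
        "Content-Type": content_type,
        # Required: the API key from your Cognitive Services account.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }


# COGNITIVE_KEY is a hypothetical environment variable name;
# "your-key-here" is only a placeholder, not a working key.
key = os.environ.get("COGNITIVE_KEY", "your-key-here")
headers = build_headers(key)
```

Reading the key from the environment keeps it out of source control, which matters because the key is tied to your Azure account.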

Text Analytics is the component that allows you to gain access to natural language processing. The data returned from an analytics session can provide insight into sentiment and topic mapping for your applications.

  • Text Analytics APIs: https://aka.ms/edx-dat211x-az03

With Sentiment Analysis, submit text that originates from sources such as Twitter accounts, forums, blogs, or customer feedback on your web sites, and then have the service evaluate the text for keywords. These are then used to indicate what users think about your company or products. Use this in marketing campaigns to take a baseline analysis before launching your campaign, then do another analysis and compare any differences after the campaign ends. This can help determine trends and validity of your marketing efforts.

  • https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment is the request URL, followed by the necessary request headers.
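
Here is a minimal sketch of preparing a sentiment request against that URL using only the Python standard library. The body shape (a `documents` list with `id`, `language`, and `text`) is what I recall of the v2.0 API, so treat it as an assumption; `build_sentiment_request` is a hypothetical helper name, and the key is a placeholder. The request is built but not sent.

```python
import json
import urllib.request

SENTIMENT_URL = (
    "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
)


def build_sentiment_request(texts, subscription_key, language="en"):
    """Build (but do not send) a POST request for a list of texts."""
    # Assumed body shape: one entry per text, each with a unique string id.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        SENTIMENT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


request = build_sentiment_request(["I love this bike."], "your-key-here")

# Sending needs a valid key and network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```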

Topic Detection mines articles or other textual content for identification of key issues or topics that users are discussing. The information available in using this feature can help determine the direction you may want to take on a new or envisioned product. It can also help identify popular topics where users are focusing, offering insight into what is important to them.

  • https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/topics[?minDocumentsPerWord][&maxDocumentsPerWord], followed by the necessary request headers.
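
The square brackets in that URL mark optional query parameters. A small sketch of building the URL with or without them follows; `build_topics_url` is a hypothetical helper, and the example values are made up.

```python
from urllib.parse import urlencode

TOPICS_URL = (
    "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/topics"
)


def build_topics_url(min_documents_per_word=None, max_documents_per_word=None):
    """Append only the optional query parameters that were supplied."""
    params = {}
    if min_documents_per_word is not None:
        params["minDocumentsPerWord"] = min_documents_per_word
    if max_documents_per_word is not None:
        params["maxDocumentsPerWord"] = max_documents_per_word
    if not params:
        return TOPICS_URL
    return TOPICS_URL + "?" + urlencode(params)


# Made-up thresholds, for illustration only.
print(build_topics_url(2, 50))
```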

Key Phrase Extraction identifies the main points of an article. Pass the text of the article into the Text Analytics engine to extract key phrases from the English text. This helps focus effort on the article's main points and saves the time of reviewing and evaluating articles manually.

Language Detection takes text such as a blog post or a response to an article, post, or Tweet, and determines which of 120 possible languages it is written in.

  • https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages[?numberOfLanguagesToDetect], followed by the necessary request headers.
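
To make the language detection call more concrete, here is a sketch of building the request body and picking the best match out of a response. The response shape (`detectedLanguages` entries with `name`, `iso6391Name`, and `score`) is how I remember the v2.0 API, so it is an assumption, and `SAMPLE_RESPONSE` is made-up data rather than real service output.

```python
import json


def build_language_body(texts):
    """Serialize texts into the assumed request body (id + text per document)."""
    documents = [
        {"id": str(i), "text": text} for i, text in enumerate(texts, start=1)
    ]
    return json.dumps({"documents": documents})


def top_language(response_json):
    """Map each document id to the name of its highest-scoring language."""
    result = {}
    for doc in json.loads(response_json)["documents"]:
        best = max(doc["detectedLanguages"], key=lambda lang: lang["score"])
        result[doc["id"]] = best["name"]
    return result


# Made-up response for illustration; real scores will differ.
SAMPLE_RESPONSE = json.dumps({
    "documents": [
        {
            "id": "1",
            "detectedLanguages": [
                {"name": "French", "iso6391Name": "fr", "score": 0.1},
                {"name": "English", "iso6391Name": "en", "score": 0.9},
            ],
        }
    ],
    "errors": [],
})

print(top_language(SAMPLE_RESPONSE))
```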

Machine Learning

Here is one of the few courses on use of C# for machine learning.

Instructors

I personally find that the course doesn’t gloss over topics that are difficult to understand. It covers the most useful skills in a short amount of time.

Graeme Malcolm, content developer for the series, has a great Scottish accent.

Liberty Munson - module 2

Authman Apatira

Amy Nicholson

Richard Conway

Capstone Project

The capstone project is entered in the quarter after your last class. It’s a Cortana competition.

PROTIP: Audit the capstone project before taking other classes to get a preview of what is required of you. When auditing, though, you can progress only through the first of the 3 weeks.

And only one try is permitted per answer.

Each challenge becomes available at the beginning of each week (in UTC, which matches London time in winter).

Week 1: Explore and analyze the AdventureWorks Sales and Customers CSV data files, for 30% of the grade. The Sales file contains the columns CustomerID, BikeBuyer, and AvgMonthSpend.

Week 2: Create predictive machine learning models, for 50% of the grade.

Week 3: Write and submit a report of your analysis and findings, for 20% of the grade.

Week 4: Review and grade the reports submitted by three fellow students.
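
The three graded weeks above add up to 100%, so the overall capstone grade works out as a weighted sum. A quick sketch follows; the dictionary keys and the example scores are made up, and I am assuming each week is scored as a fraction of its available points.

```python
# Weights taken from the week descriptions above (30% + 50% + 20%).
WEIGHTS = {
    "week1_explore": 0.30,
    "week2_models": 0.50,
    "week3_report": 0.20,
}


def capstone_grade(scores):
    """Combine per-week scores (each 0.0 to 1.0) into one weighted grade."""
    return sum(WEIGHTS[week] * scores[week] for week in WEIGHTS)


# Made-up scores: full marks in week 1, 80% in week 2, 50% in week 3.
example = {"week1_explore": 1.0, "week2_models": 0.8, "week3_report": 0.5}
print(round(capstone_grade(example), 2))
```

Because week 2 carries half the grade, a weak model costs more than a weak report.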

References

http://konect.uni-koblenz.de/

Excel

  1. Find and remove duplicates
  2. Create a new column to turn category text into numbers for processing.
  3. Fill in missing values (on 7/20). Bold such numbers.
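
For comparison, the same three cleaning steps can be sketched in plain Python. The rows are made-up sample data, and filling gaps with the mean of the known values is my assumption; the class may have used a different fill rule. The `was_filled` flag stands in for bolding the cell in Excel.

```python
from statistics import mean

# Made-up rows: (day, weather, sales). None marks a missing value.
rows = [
    ("Mon", "Sunny", 120),
    ("Tue", "Rainy", None),
    ("Tue", "Rainy", None),  # exact duplicate of the previous row
    ("Wed", "Cloudy", 90),
]

# 1. Remove duplicates while keeping first-seen order.
unique_rows = list(dict.fromkeys(rows))

# 2. Turn category text into numbers, numbered in order of first appearance.
codes = {}
for _, weather, _ in unique_rows:
    codes.setdefault(weather, len(codes))

# 3. Fill missing values with the mean of the known ones (an assumption),
#    and flag each filled value so it can be told apart later.
known = [sales for _, _, sales in unique_rows if sales is not None]
fill_value = mean(known)
cleaned = [
    (day, codes[weather], fill_value if sales is None else sales, sales is None)
    for day, weather, sales in unique_rows
]

for day, weather_code, sales, was_filled in cleaned:
    print(day, weather_code, sales, "(filled)" if was_filled else "")
```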

PROTIP: Create a separate workbook tab to calculate summary statistics such as maximums and minimums. Call it “Calcs”. Doing this rather than putting it ???

NOTE: The Analysis ToolPak was not available on the Mac until Excel 2016, where it is enabled from the Tools menu under Excel Add-ins, rather than under File, Options as on Windows editions of Excel.

Create a separate workbook tab for the histogram bins, based on: https://support.office.com/en-us/article/Create-a-histogram-in-Excel-2016-for-Mac-4d6ada52-3153-4c81-a85f-8c4ee798e95a
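
To show what the bins do, here is a small Python sketch of the counting. I am assuming each bin is an inclusive upper bound with one extra overflow bin at the end, which is how I understand the Excel histogram tool to count; the values and bin edges are made up.

```python
from bisect import bisect_left


def histogram(values, bin_upper_bounds):
    """Count values per bin.

    Each bin holds values greater than the previous bound and less than
    or equal to its own bound; the final slot counts everything above
    the last bound. bin_upper_bounds must be sorted ascending.
    """
    counts = [0] * (len(bin_upper_bounds) + 1)
    for value in values:
        # Index of the first bound that is >= value.
        counts[bisect_left(bin_upper_bounds, value)] += 1
    return counts


# Made-up data: bins for <=10, <=20, <=30, and "more".
print(histogram([5, 10, 11, 20, 25, 31], [10, 20, 30]))
```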

PROTIP: I constructed the spreadsheet during the course so that I can re-use it later to analyze data other than the “lemonade stand” data used in the class.

More

This is one of a series on AI, Machine Learning, Deep Learning, Robotics, and Analytics:

  1. AI Ecosystem
  2. Machine Learning
  3. Testing AI

  4. Microsoft’s AI
  5. Microsoft’s Azure Machine Learning Algorithms
  6. Microsoft’s Azure Machine Learning tutorial
  7. Microsoft’s Azure Machine Learning certification

  8. Python installation
  9. Jupyter notebooks processing Python for humans

  10. Image Processing
  11. Tesseract OCR using OpenCV
  12. Amazon Lex text to speech

  13. Code Generation

  14. Multiple Regression calculation and visualization using Excel and Machine Learning
  15. Tableau Data Visualization