Here’s how to earn a legitimate micro-degree
This page contains my notes about both the Microsoft Professional Program (MPP) for Data Science track and Artificial Intelligence track.
What’s Data Science?
Business Intelligence (BI) is closely tied to the design, implementation, and use of data warehouses, and the database-oriented technology to support them.
Data scientists are consumers of BI. They are less concerned with building permanent infrastructure for others to use, and more concerned with answering questions and communicating the results.
Microsoft Professional Program (MPP) Certification for Artificial Intelligence, announced April 2018, covers 10 skills over 10 courses of 8 to 16 hours each, led by Graeme Malcolm.
Data science track in the Microsoft Professional Program.
Microsoft Professional Program Certificate in Data Science offered by EdX.com
The courses cover the use of Excel, Python, and R on desktop machines, plus Spark for big data in Azure. You need to take only 9 of the 12 classes offered, because some of the courses let you choose between Python and R.
PROTIP: Pay to get verified on 10 classes (for a total of $990), then audit the rest for free. For example, if you only want to learn Python and not R:
The predecessor of this program last year charged $25 for the intro class and $49 for the “Capstone”.
NOTE: There is NO one-on-one tutoring with these classes, unlike the $1,000 Xamarin certificate. And there is no job search assistance like Udacity provides with its $398 Machine Learning course.
Each class contains several modules.
A set of classes begins every 3 months, starting the first of January, April, July, September.
CAUTION: If you sign up a week before the end of the quarter, you’ll only have a week to complete that class.
The Excel 2016 Professional Plus edition has the Power Pivot add-in and some advanced Power Query functionality.
- Machine Learning: https://aka.ms/edx-dat211x-aml01
- Stream Analytics: https://aka.ms/edx-dat211x-az01
- Python Tools for Visual Studio: https://aka.ms/edx-dat208x-vs01
- Web Services in .NET 2.0: https://aka.ms/edx-dat211x-wsn
- SQL Database: https://aka.ms/edx-dat101x-sql01
- SQL Data Warehouse: https://aka.ms/edx-dat101x-sql02
- Microsoft SQL Server: https://aka.ms/edx-dat215.5x-sql
- Power BI: https://aka.ms/edx-dat101x-pbi01
Cognitive Services on Azure
- Microsoft Azure: https://aka.ms/edx-dat211x-az02
- Microsoft Azure portal: https://aka.ms/edx-dat211x-az04
- Cognitive Services API: https://aka.ms/edx-dat211x-cogs
- Cognitive Services API: https://aka.ms/edx-dat211x-cog01 resolves to https://www.microsoft.com/en-us/
Cognitive Services enables you to build applications that can tap into the powerful artificial intelligence (AI) algorithms that Microsoft has developed and hosts on Microsoft Azure. These AI algorithms offer machine-based intelligence for your applications around vision, speech, and language.
API Headers contain:
- Content-Type (optional): a string indicating the media type of the body that is sent to the API.
- Ocp-Apim-Subscription-Key: a string containing your API key, which is found in your Cognitive Services account on Microsoft Azure.
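As a rough illustration of the two headers above, here is a minimal Python sketch that assembles them into a dictionary. The function name `build_headers` and the default media type `application/json` are my own assumptions for the example, not something the course prescribes.

```python
def build_headers(subscription_key, content_type="application/json"):
    """Return the HTTP headers for a Cognitive Services call.

    build_headers is a hypothetical helper; "application/json" is an
    assumed default for the optional Content-Type header.
    """
    if not subscription_key:
        raise ValueError("subscription_key is required")
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    if content_type:  # Content-Type is optional, so it can be left out
        headers["Content-Type"] = content_type
    return headers


# "YOUR-API-KEY" is a placeholder for the key from your Azure account.
print(build_headers("YOUR-API-KEY"))
```

Keeping the key in one helper like this makes it easier to read it from an environment variable later instead of pasting it into every call.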
Text Analytics is the component that allows you to gain access to natural language processing. The data returned from an analytics session can provide insight into sentiment and topic mapping for your applications.
- Text Analytics APIs: https://aka.ms/edx-dat211x-az03
With Sentiment Analysis, submit text that originates from sources such as Twitter accounts, forums, blogs, or customer feedback on your web sites, and then have the service evaluate the text for keywords. These are then used to indicate what users think about your company or products. Use this in marketing campaigns to take a baseline analysis before launching your campaign, then do another analysis and compare any differences after the campaign ends. This can help determine trends and validity of your marketing efforts.
- https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment is the request URL, followed by the necessary request headers.
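To make the sentiment call and the before/after campaign comparison more concrete, here is a hedged Python sketch. It only builds the request and does not send it. The JSON body shape (a `documents` list with `id`, `language`, and `text`) and the idea that the service returns one numeric score per document are assumptions on my part; the helper names are hypothetical.

```python
import json
import urllib.request

SENTIMENT_URL = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"


def build_sentiment_request(texts, subscription_key, language="en"):
    """Build (but do not send) a POST request for a list of texts.

    The body layout is an assumption about what the service expects.
    """
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(SENTIMENT_URL, data=body, headers=headers, method="POST")


def campaign_shift(baseline_scores, after_scores):
    """Difference between the mean score after and before a campaign."""
    baseline = sum(baseline_scores) / len(baseline_scores)
    after = sum(after_scores) / len(after_scores)
    return after - baseline


request = build_sentiment_request(["Great bike!", "Too expensive."], "YOUR-API-KEY")
print(request.full_url, request.get_method())

# Hypothetical scores, only to show the baseline comparison.
print(campaign_shift([0.4, 0.6], [0.7, 0.9]))
```

Passing `request` to `urllib.request.urlopen` would actually send it, which needs a real key in place of the placeholder.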
Topic Detection mines articles or other textual content for identification of key issues or topics that users are discussing. The information available in using this feature can help determine the direction you may want to take on a new or envisioned product. It can also help identify popular topics where users are focusing, offering insight into what is important to them.
- https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/topics[?minDocumentsPerWord][&maxDocumentsPerWord], followed by the necessary request headers.
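The square brackets in the URL above mark optional query parameters. A small Python sketch of building that URL with or without them follows; the function name and the example values 2 and 50 are made up for illustration.

```python
from urllib.parse import urlencode

TOPICS_URL = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/topics"


def build_topics_url(min_documents_per_word=None, max_documents_per_word=None):
    """Append only the optional query parameters that were supplied."""
    params = {}
    if min_documents_per_word is not None:
        params["minDocumentsPerWord"] = min_documents_per_word
    if max_documents_per_word is not None:
        params["maxDocumentsPerWord"] = max_documents_per_word
    if not params:
        return TOPICS_URL
    return TOPICS_URL + "?" + urlencode(params)


print(build_topics_url())        # no optional parameters
print(build_topics_url(2, 50))   # both parameters, illustrative values
```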
Key Phrase Extraction determines the main points of an article. Pass the text of the article into the Text Analytics engine to extract key phrases from the English text in the article. This can help to focus efforts on the main points of an article, saving time in reviewing and evaluating the article(s) manually.
Language Detection takes text such as a blog post or a response to an article, post, or Tweet, and determines which language it is written in, among 120 possible languages.
- https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages[?numberOfLanguagesToDetect], followed by the necessary request headers.
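When `numberOfLanguagesToDetect` asks for more than one candidate, you still have to pick one. Here is a sketch of reading a response and keeping the highest-scoring language per document. The response layout (`documents`, `detectedLanguages`, `name`, `score`) is my assumption of what the service returns, and `SAMPLE_RESPONSE` is invented for the example rather than captured from a real call.

```python
import json

# Invented sample, assumed to resemble what the service returns.
SAMPLE_RESPONSE = """
{"documents": [
    {"id": "1", "detectedLanguages": [
        {"name": "English", "iso6391Name": "en", "score": 0.9},
        {"name": "Dutch", "iso6391Name": "nl", "score": 0.1}]}],
 "errors": []}
"""


def top_language(response_text):
    """Map each document id to the name of its highest-scoring language."""
    result = {}
    for document in json.loads(response_text)["documents"]:
        best = max(document["detectedLanguages"], key=lambda lang: lang["score"])
        result[document["id"]] = best["name"]
    return result


print(top_language(SAMPLE_RESPONSE))
```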
Here is one of the few courses on use of C# for machine learning.
I personally find that the course doesn’t gloss over topics that are difficult to understand. It covers the most useful skills in a short amount of time.
Graeme Malcolm, content developer for the series, has a great Scottish accent.
Liberty Munson - module 2
The capstone project is entered in the quarter after your last class. It’s a Cortana competition.
PROTIP: Audit the capstone project before taking other classes to get a preview of what is required of you. But you can progress only through the first of 3 weeks.
And only one try is permitted per answer.
Each challenge becomes available at the beginning of each week (UTC, which matches London time in winter).
Week 1: Explore and analyze the AdventureWorks Sales and Customers CSV data files, for 30% of the grade. The Sales file contains the columns CustomerID, BikeBuyer, and AvgMonthSpend.
Week 2: Create predictive machine learning models, for 50% of the grade.
Week 3: Write and submit a report of your analysis and findings, for 20% of the grade.
Week 4: Review and grade the reports submitted by three fellow students.
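For the Week 1 exploration, a first pass might compare average monthly spend between bike buyers and non-buyers. The sketch below uses only the three column names mentioned above; the four sample rows are invented, and treating BikeBuyer as a 0/1 flag is my assumption.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented rows; the real capstone file is much larger.
SAMPLE_CSV = """CustomerID,BikeBuyer,AvgMonthSpend
1001,0,50
1002,1,90
1003,1,110
1004,0,70
"""


def spend_by_bike_buyer(csv_file):
    """Average AvgMonthSpend for each BikeBuyer value."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["BikeBuyer"]].append(float(row["AvgMonthSpend"]))
    return {flag: mean(values) for flag, values in groups.items()}


summary = spend_by_bike_buyer(io.StringIO(SAMPLE_CSV))
print(summary)
```

To run it against the real file, pass `open("Sales.csv", newline="")` instead of the `io.StringIO` object; that file name is a guess, so use whatever the download is actually called.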
- Find and remove duplicates
- Create a new column to turn category text into numbers for processing.
- Fill in missing values (on 7/20). Bold such numbers.
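The class does these three cleaning steps in Excel. For comparison, here is a rough Python equivalent. The rows are invented in the spirit of the lemonade data, filling with the column mean is my assumption (the notes do not say which fill method is used), and a `sales_filled` flag stands in for bolding the filled numbers.

```python
from statistics import mean

# Invented rows for illustration.
rows = [
    {"date": "7/18", "weather": "Sunny", "sales": 120.0},
    {"date": "7/18", "weather": "Sunny", "sales": 120.0},  # exact duplicate
    {"date": "7/19", "weather": "Cloudy", "sales": 80.0},
    {"date": "7/20", "weather": "Sunny", "sales": None},   # missing value
]


def remove_duplicates(records):
    """Keep the first copy of each identical row."""
    seen, unique = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(record))
    return unique


def encode_category(records, column):
    """Add a <column>_code field that numbers categories in order seen."""
    codes = {}
    for record in records:
        record[column + "_code"] = codes.setdefault(record[column], len(codes))
    return codes


def fill_missing(records, column):
    """Fill None with the column mean and flag the rows that were filled."""
    fill_value = mean(r[column] for r in records if r[column] is not None)
    for record in records:
        record[column + "_filled"] = record[column] is None
        if record[column] is None:
            record[column] = fill_value


clean = remove_duplicates(rows)
encode_category(clean, "weather")
fill_missing(clean, "sales")
for record in clean:
    print(record)
```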
PROTIP: Create a separate workbook tab to calculate summary values such as maximums and minimums. Call it “Calcs”. Doing this rather than putting it ???
NOTE: The Analysis ToolPak was not available on the Mac until the 2016 version, where it’s enabled from the Tools menu, under Excel Add-ins, rather than from File, Options as on Windows editions of Excel.
Create a separate workbook tab for the histogram bins, based on: https://support.office.com/en-us/article/Create-a-histogram-in-Excel-2016-for-Mac-4d6ada52-3153-4c81-a85f-8c4ee798e95a
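To check the spreadsheet's bin counts by hand, here is a small Python sketch that counts values into equal-width bins. The values and the bin width of 10 are invented. Note that this sketch labels each bin by its lower edge, which I believe differs from how Excel's histogram tool labels bins, so compare the counts rather than the labels.

```python
def histogram(values, bin_width, start=None):
    """Count values into equal-width bins, keyed by each bin's lower edge."""
    low = min(values) if start is None else start
    counts = {}
    for value in values:
        index = int((value - low) // bin_width)
        lower_edge = low + index * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


# Invented values for illustration.
sample = [12, 15, 21, 22, 29, 35, 41]
print("min:", min(sample), "max:", max(sample))
print(histogram(sample, bin_width=10, start=10))
```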
PROTIP: I constructed the spreadsheet during the course so that I can re-use it later to analyze data other than the “lemonade stand” data used in the class.