Wilson Mar

Here’s how to earn a legitimate micro-certification


Overview

Definitions

Business Intelligence (BI) is closely tied to the design, implementation, and use of data warehouses, and the database-oriented technology to support them.

Data scientists are consumers of BI. They are less concerned with building permanent infrastructure for others to use, and more concerned with answering questions and communicating the results.

Microsoft

https://academy.microsoft.com/en-us/professional-program/data-science
Data science track in the Microsoft Professional Program.

https://www.edx.org/microsoft-professional-program-certficate-data-science
Microsoft Professional Program Certificate in Data Science offered by EdX.com

The courses cover the use of Excel, Python, and R on desktop machines, plus Spark for big data in Azure. You need to take only 9 of the 12 classes offered, because some courses let you choose between Python and R.

PROTIP: Pay to get verified on just 9 classes (for a total of $767), then audit the rest for free. For example, if you only want to learn Python and not R:

US$ | Course | Instructor
$25 | Data Science Orientation | Liberty Munson
$99 | Querying Data with Transact-SQL | ?
$99 | 1) Analyzing and Visualizing Data with Excel | ?
    | 2) Analyzing and Visualizing Data with Power BI | Dany Hoter
$99 | Essential Statistics for Data Analysis using Excel | ?
$99 | 1) Introduction to Python for Data Science | Filip Schouwenaars, DataCamp
    | 2) Introduction to R for Data Science | ?
$99 | Data Science Essentials | Cynthia Rudin & Steve Elston
$99 | Principles of Machine Learning | ?
$99 | 1) Programming with Python for Data Science | ?
    | 2) Programming with R for Data Science | ?
$99 | 1) Applied Machine Learning | ?
    | 2) Implementing Predictive Solutions with Spark in Azure HDInsight (using Python, Scala, and R with Apache Spark) | ?
    | 3) Developing Intelligent Applications | Gerry O'Brien & Amy Nicholson
$49 | Data Science Professional Capstone Project | ?

NOTE: There is NO one-on-one tutoring with these classes, unlike the $1,000 Xamarin certificate. And there is no job search assistance like Udacity provides with its $398 Machine Learning course.

Each class contains several modules.

A new set of classes begins every 3 months, on the first of January, April, July, and October.

CAUTION: If you sign up a week before the end of the quarter, you’ll only have a week to complete that class.

Tools references

  • The Excel 2016 Professional Plus edition has the Power Pivot add-in and some advanced Power Query functionality.

  • Machine Learning: https://aka.ms/edx-dat211x-aml01

  • Stream Analytics: https://aka.ms/edx-dat211x-az01
  • Python Tools for Visual Studio: https://aka.ms/edx-dat208x-vs01

  • Web Services in .NET 2.0: https://aka.ms/edx-dat211x-wsn

  • SQL Database: https://aka.ms/edx-dat101x-sql01
  • SQL Data Warehouse: https://aka.ms/edx-dat101x-sql02
  • Microsoft SQL Server: https://aka.ms/edx-dat215.5x-sql
  • Power BI: https://aka.ms/edx-dat101x-pbi01

Cognitive Services on Azure

  • Microsoft Azure: https://aka.ms/edx-dat211x-az02
  • Microsoft Azure portal: https://aka.ms/edx-dat211x-az04
  • Cognitive Services API: https://aka.ms/edx-dat211x-cogs
  • Cognitive Services API: https://aka.ms/edx-dat211x-cog01 resolves to https://www.microsoft.com/en-us/

Cognitive Services enables you to build applications that can tap into the powerful artificial intelligence (AI) algorithms that Microsoft has developed and hosts on Microsoft Azure. These AI algorithms offer machine-based intelligence for your applications around vision, speech, and language.

API Headers contain:

Content-Type (Optional) - a string type indicating the media type of the body that is sent to the API

Ocp-Apim-Subscription-Key - a string type that contains your API key that is found in your Cognitive Services account on Microsoft Azure
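The two headers above can be sketched as a small Python helper. This is only an illustration: the function name `build_headers` and the placeholder key are my own, and I'm assuming a JSON body (hence `application/json`).

```python
import json

# Placeholder only: the real key comes from your Cognitive Services account on Azure.
SUBSCRIPTION_KEY = "YOUR-API-KEY"


def build_headers(subscription_key, content_type="application/json"):
    """Return the two request headers described above as a dict."""
    return {
        # Media type of the body sent to the API (assumed to be JSON here).
        "Content-Type": content_type,
        # Identifies your Cognitive Services subscription.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }


headers = build_headers(SUBSCRIPTION_KEY)
print(json.dumps(headers, indent=2))
```

PROTIP: Keep the real key out of source code, for example by reading it from an environment variable.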

Text Analytics is the component that allows you to gain access to natural language processing. The data returned from an analytics session can provide insight into sentiment and topic mapping for your applications.

  • Text Analytics APIs: https://aka.ms/edx-dat211x-az03

With Sentiment Analysis, submit text that originates from sources such as Twitter accounts, forums, blogs, or customer feedback on your web sites, and then have the service evaluate the text for keywords. These are then used to indicate what users think about your company or products. Use this in marketing campaigns to take a baseline analysis before launching your campaign, then do another analysis and compare any differences after the campaign ends. This can help determine trends and validity of your marketing efforts.

  • https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment is the request URL, followed by the necessary request headers.
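Here is a minimal sketch of the before-and-after campaign comparison, using only the Python standard library. It builds the request without sending it. The body shape (a `documents` list with `id`, `language`, and `text`) and the response shape (a `score` per document) are my assumptions about the v2.0 API, and the scores shown are hand-written, not real results.

```python
import json
import urllib.request

SENTIMENT_URL = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"


def build_sentiment_request(texts, subscription_key, language="en"):
    """Build (but do not send) a POST request for the sentiment endpoint."""
    # Assumed body shape: one entry per text, each with an id, language, and text.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        SENTIMENT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def average_score(response_json):
    """Average the per-document scores in an (assumed) response payload."""
    scores = [doc["score"] for doc in response_json["documents"]]
    return sum(scores) / len(scores)


request = build_sentiment_request(["Great product!", "Support was slow."], "YOUR-API-KEY")

# Hypothetical responses, hand-written to illustrate the baseline comparison.
before = {"documents": [{"id": "1", "score": 0.40}, {"id": "2", "score": 0.50}]}
after = {"documents": [{"id": "1", "score": 0.70}, {"id": "2", "score": 0.60}]}
change = average_score(after) - average_score(before)
print(f"Average sentiment change: {change:+.2f}")
```

To actually call the service, pass `request` to `urllib.request.urlopen` and parse the returned JSON with `json.load`.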

Topic Detection mines articles or other textual content for identification of key issues or topics that users are discussing. The information available in using this feature can help determine the direction you may want to take on a new or envisioned product. It can also help identify popular topics where users are focusing, offering insight into what is important to them.

  • https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/topics[?minDocumentsPerWord][&maxDocumentsPerWord], followed by the necessary request headers.

Key Phrase Extraction identifies the main points of an article. Pass the text of the article into the Text Analytics engine to extract key phrases from the English text in the article. This helps you focus on the main points, saving the time of reviewing and evaluating the article(s) manually.

Language Detection takes the text of a blog post, or of a response to an article, post, or Tweet, and determines what language it is written in, among 120 possible languages.

  • https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages[?numberOfLanguagesToDetect], followed by the necessary request headers.
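The square brackets in the topics and languages request URLs above mark optional query parameters. A minimal sketch of building such URLs in Python follows. The helper name `build_url` and the parameter values are illustrative assumptions, not recommended settings.

```python
from urllib.parse import urlencode

BASE_URL = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0"


def build_url(operation, **optional_params):
    """Append only the optional query parameters that were actually supplied."""
    supplied = {k: v for k, v in optional_params.items() if v is not None}
    url = f"{BASE_URL}/{operation}"
    return f"{url}?{urlencode(supplied)}" if supplied else url


# With no optional parameters the URL has no query string.
print(build_url("languages"))
# Illustrative values only.
print(build_url("languages", numberOfLanguagesToDetect=2))
print(build_url("topics", minDocumentsPerWord=3, maxDocumentsPerWord=50))
```

Either way, the request headers still need to be sent along with the request.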

Machine Learning

Here is one of the few courses on use of C# for machine learning.

Instructors

I personally find that the course doesn’t gloss over topics that are difficult to understand. It covers the most useful skills in a short amount of time.

Graeme Malcolm, content developer for the series, has a great Scottish accent.

Liberty Munson - module 2

Authman Apatira

Amy Nicholson

Richard Conway

Capstone Project

The capstone project is entered in the quarter after your last class. It’s a Cortana competition.

PROTIP: Audit the capstone project before taking other classes to get a preview of what is required of you. But you can progress only through the first of 3 weeks.

And only one try is permitted per answer.

Each challenge is available at the beginning of each week (UTC time during London Winter).

Week 1: Explore and analyze the AdventureWorks Sales and Customers CSV data files, for 30% of the grade. The Sales file contains the columns CustomerID, BikeBuyer, and AvgMonthSpend.

Week 2: Create predictive machine learning models, for 50% of the grade.

Week 3: Write and submit a report of your analysis and findings, for 20% of the grade.

Week 4: Review and grade the reports submitted by three fellow students.
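To see how the 30/50/20 weighting plays out, here is a small Python sketch. The weights come from the week descriptions above. The scores are hypothetical, and I'm assuming each week is scored from 0 to 100 and that the Week 4 peer review carries no weight of its own, since none is listed.

```python
# Weights taken from the week descriptions above.
WEIGHTS = {"week1_explore": 0.30, "week2_models": 0.50, "week3_report": 0.20}


def capstone_grade(scores):
    """Combine per-week scores (each assumed to be 0-100) into one weighted grade."""
    return sum(WEIGHTS[week] * scores[week] for week in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"week1_explore": 90, "week2_models": 70, "week3_report": 80}
print(round(capstone_grade(example), 1))
```

Because Week 2 carries half the weight, a weak model costs more than a weak report.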

References

http://konect.uni-koblenz.de/

Excel

  1. Find and remove duplicates
  2. Create a new column to turn category text into numbers for processing.
  3. Fill in missing values (on 7/20). Bold such numbers.
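For those on the Python track, the same three steps can be sketched in plain Python. The rows and column names are made up for illustration. Filling with the mean of the known values is my assumption, since the steps above don't say how to fill. The `sales_filled` flag stands in for bolding the filled cells.

```python
# Hypothetical rows; the column names are made up for illustration.
rows = [
    {"day": "Mon", "weather": "Sunny", "sales": 120},
    {"day": "Tue", "weather": "Rainy", "sales": None},   # missing value
    {"day": "Mon", "weather": "Sunny", "sales": 120},    # exact duplicate
    {"day": "Wed", "weather": "Cloudy", "sales": 90},
]

# 1. Find and remove duplicates, keeping the first occurrence.
seen, unique_rows = set(), []
for row in rows:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        unique_rows.append(dict(row))

# 2. Add a column that maps each category label to a number.
labels = sorted({row["weather"] for row in unique_rows})
codes = {label: number for number, label in enumerate(labels)}
for row in unique_rows:
    row["weather_code"] = codes[row["weather"]]

# 3. Fill missing values (with the mean of the known values, an assumption)
#    and flag them, the counterpart of bolding the filled cells in Excel.
known = [row["sales"] for row in unique_rows if row["sales"] is not None]
mean_sales = sum(known) / len(known)
for row in unique_rows:
    row["sales_filled"] = row["sales"] is None
    if row["sales_filled"]:
        row["sales"] = mean_sales

for row in unique_rows:
    print(row)
```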

PROTIP: Create a separate workbook tab to calculate summaries, maximums, and minimums. Call it “Calcs”. Doing this rather than putting it ???

NOTE: The Analysis ToolPak was not available on the Mac until the 2016 version, where it’s enabled from the Tools menu, Excel Add-ins, rather than from File, Options as on Windows editions of Excel.

Create a separate workbook tab for the histogram bins, based on: https://support.office.com/en-us/article/Create-a-histogram-in-Excel-2016-for-Mac-4d6ada52-3153-4c81-a85f-8c4ee798e95a
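To check what the bins tab should produce, here is a rough Python sketch of the counting. I'm assuming the convention that each bin counts values up to and including its upper bound, with a final overflow bin for anything larger. The data and bin edges are hypothetical.

```python
from bisect import bisect_left


def histogram(values, bin_upper_bounds):
    """Count values per bin; each bin holds values <= its upper bound.

    Values above the last bound land in a final overflow bin.
    bin_upper_bounds must be sorted in ascending order.
    """
    counts = [0] * (len(bin_upper_bounds) + 1)
    for value in values:
        # Index of the first bound that is >= value, or the overflow slot.
        counts[bisect_left(bin_upper_bounds, value)] += 1
    return counts


# Hypothetical data and bin edges, for illustration only.
values = [12, 25, 25, 31, 47, 52, 58, 63, 70, 95]
bins = [25, 50, 75]

counts = histogram(values, bins)
for label, count in zip([f"<= {b}" for b in bins] + ["More"], counts):
    print(f"{label:>6}: {'#' * count} ({count})")
```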

PROTIP: I constructed the spreadsheet during the course so that I can reuse it later to analyze data other than the “lemonade stand” data used in the class.