Mastering Spark Data Science on Azure

Apache Spark is a fast, in-memory data-processing engine with expressive development APIs that let data scientists run streaming workloads, build sophisticated machine-learning models, and perform other tasks common in extracting information from large datasets. Apache Spark for Azure HDInsight makes high-performance Spark clusters widely available, and Azure's Data Science Virtual Machine is well suited to learning Spark and other tools such as Jupyter and Microsoft R Server. In this series, Microsoft data scientist Mark Tabladillo takes a deep dive into Apache Spark and the Spark ecosystem and demonstrates how to use the Spark support in Azure to make short work of big-data workloads and build machine-learning models.

Course Title                          | Author          | Duration | Topic(s)
Introduction to Spark                 | Mark Tabladillo | 00:58:49 | Data Science, Spark, Azure, Big Data
Business Intelligence Tools and Spark | Mark Tabladillo | 01:02:01 | Data Science, Spark, Azure, Big Data
Data Processing with Spark 2          | Mark Tabladillo | 00:48:21 | Data Science, Spark, Azure, Big Data
Text Analytics with Spark ML          | Mark Tabladillo | 01:05:57 | Data Science, Spark, Azure, Machine Learning, Big Data
Regression with Spark ML              | Mark Tabladillo | 01:06:39 | Data Science, Spark, Azure, Machine Learning, Big Data
Classification with Spark ML          | Mark Tabladillo | 01:24:13 | Data Science, Spark, Azure, Machine Learning, Big Data
Clustering with Spark ML              | Mark Tabladillo | 01:10:03 | Data Science, Spark, Azure, Machine Learning, Big Data
Recommendation with Spark ML          | Mark Tabladillo | 00:59:49 | Data Science, Spark, Azure, Machine Learning, Big Data