Mastering Hadoop Data Science on Azure

Apache Hadoop is an open-source framework for extracting information from very large datasets using the MapReduce programming model. It distributes workloads across the nodes of a cluster for parallel processing, and it uses the Hadoop Distributed File System (HDFS) to provide high aggregate bandwidth across the cluster. In this series, Frank La Vigne takes a deep dive into Hadoop and the Hadoop ecosystem and demonstrates how to run these tools locally or on Azure HDInsight clusters to make short work of big data.
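
To give a feel for the MapReduce model mentioned above, here is a minimal single-process sketch of a word count in plain Python. It is not taken from the courses and does not use Hadoop; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen for illustration only. On a real cluster, the map and reduce steps would run in parallel on different nodes and the framework would handle the shuffle.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (done by the framework on a cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

Because each line is mapped independently and each key is reduced independently, both steps can be spread across many machines, which is what lets the model scale to large datasets.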

Course Title	Author	Duration	Topic(s)
Introducing Hadoop	Frank La Vigne	00:23:27	Data Science, Hadoop, Azure, Big Data
Processing Big Data with MapReduce	Frank La Vigne	01:07:23	Data Science, Azure, Hadoop, Big Data
Using Hive to Query Hadoop	Frank La Vigne	00:57:31	Data Science, Azure, Hadoop, Big Data, Hive
Using Pig with Hadoop	Frank La Vigne	00:50:13	Data Science, Pig, Hadoop, Big Data
Using HBase	Frank La Vigne	00:50:35	Data Science, HBase, Hadoop, Big Data