The Ultimate Introduction to Big Data
MP4, AVC, 260 kbps, 1920x1080 | English, AAC, 126 kbps, 2 Ch | 14h 29m | 4.4 GB
Instructor: Frank Kane

See it. Do it. Learn it! Businesses rely on data for decision-making, success, and survival. The volume of data companies can capture grows every day, and big data platforms like Hadoop help store, manage, and analyze it. In The Ultimate Introduction to Big Data, big data guru Frank Kane introduces you to big data processing systems and shows you how they fit together. This liveVideo spotlights over 25 different technologies in over 14 hours of video instruction.

Designed for data storage and processing, Hadoop is a reliable, fault-tolerant framework. The most celebrated features of this open source Apache project are HDFS, Hadoop's highly scalable distributed file system, and the MapReduce data processing engine. Together, they can process vast amounts of data across large clusters. An ecosystem of hundreds of technologies has sprung up around Hadoop to answer the ever-growing demand for large-scale data processing solutions. Understanding the architecture of massive-scale data processing applications is an increasingly important and desirable skill, and you'll have it when you complete this liveVideo course!

The Ultimate Introduction to Big Data teaches you how to design powerful distributed data applications. With lots of hands-on exercises, instructor Frank Kane goes beyond Hadoop to cover many related technologies, giving you valuable firsthand experience with modern data processing applications. You'll learn to choose an appropriate data storage technology for your application and discover how Hadoop clusters are managed by YARN, Tez, Mesos, and other technologies. You'll also experience the combined power of HDFS and MapReduce for storing and analyzing data at scale.
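To give a feel for the map-shuffle-reduce flow that MapReduce applies across a cluster, here is a minimal single-machine sketch in plain Python (a hypothetical word-count illustration; a real job would run distributed mapper and reducer tasks over data stored in HDFS):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as Hadoop does
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["big data big clusters", "data at scale"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"], counts["data"])  # 2 2
```

On a real cluster the same three steps run in parallel on many machines, with the shuffle moving intermediate pairs over the network so each reducer sees all values for its keys.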
Using other key parts of the Hadoop ecosystem like Hive and MySQL, you'll analyze relational data, and then tackle non-relational data analysis using HBase, Cassandra, and MongoDB. With Kafka, Sqoop, and Flume, you'll make short work of publishing data to your Hadoop cluster. When you're done, you'll have a deep understanding of data processing applications on Hadoop and its distributed systems.

What you will learn

- Using HDFS and MapReduce for storing and analyzing data at scale
- Analyzing relational data using Hive and MySQL
- Creating scripts to process data on a Hadoop cluster using Pig and Spark
- Using HBase, Cassandra, and MongoDB to analyze non-relational data
- Querying data interactively with Drill, Phoenix, and Presto
- Choosing an appropriate data storage technology for your application
- Understanding how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie
- Publishing data to your Hadoop cluster using Kafka, Sqoop, and Flume
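Much of the relational-data analysis the course covers with Hive comes down to running SQL-style queries over large tables. As a rough local stand-in, the same kind of query can be tried with Python's built-in sqlite3 module (the `ratings` table, its columns, and its data here are invented purely for illustration; Hive would run an equivalent query over files in HDFS):

```python
import sqlite3

# A tiny in-memory table standing in for a (much larger) Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, 50, 5), (2, 50, 4), (3, 50, 3), (1, 33, 2)],
)

# A HiveQL-style aggregation: average rating per movie.
rows = conn.execute(
    "SELECT movie_id, AVG(rating) FROM ratings GROUP BY movie_id ORDER BY movie_id"
).fetchall()
print(rows)  # [(33, 2.0), (50, 4.0)]
```

The point of Hive is that this familiar SQL syntax gets translated into distributed jobs on the cluster, so the same query scales from a toy table to billions of rows.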