Taming Big Data with MapReduce and Hadoop – Hands On!
Learn MapReduce fast by building over 10 real examples, using Python, MRJob, and Amazon’s Elastic MapReduce Service.
Created by Frank Kane, Sundog Education Team | 5 hours on-demand video course
“Big data” analysis is a hot and highly valuable skill, and this course will quickly teach you two technologies fundamental to it: MapReduce and Hadoop. Ever wonder how Google manages to analyze the entire Internet on a continual basis? You’ll learn those same techniques, using your own Windows system right at home.
Learn to frame data analysis problems as MapReduce problems through over 10 hands-on examples, then scale them up to run on cloud computing services. You’ll be learning from a former engineer and senior manager at Amazon and IMDb.
What you’ll learn
- Understand how MapReduce can be used to analyze big data sets
- Write your own MapReduce jobs using Python and MRJob
- Run MapReduce jobs on Hadoop clusters using Amazon Elastic MapReduce
- Chain MapReduce jobs together to analyze more complex problems
- Analyze social network data using MapReduce
- Analyze movie ratings data using MapReduce and produce movie recommendations from it
- Understand other Hadoop-based technologies, including Hive, Pig, and Spark
- Understand what Hadoop is for, and how it works
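To give a feel for what "framing a problem as MapReduce" means, here is a minimal sketch in plain Python. It is not taken from the course and does not use MRJob or Hadoop: the `run_job` helper, the sample records, and the assumed `user_id movie_id rating` line layout are all hypothetical, chosen only to show the map, group-by-key, and reduce phases in one place.

```python
from collections import defaultdict


def mapper(line):
    # Assumed (hypothetical) input layout: "user_id movie_id rating".
    _user_id, _movie_id, rating = line.split()
    # Emit one (key, value) pair per record: the rating and a count of 1.
    yield rating, 1


def reducer(rating, counts):
    # All values emitted for the same key arrive together; sum them.
    yield rating, sum(counts)


def run_job(lines):
    # Stand-in for the framework's shuffle step: group mapper output by key.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)

    # Feed each key and its grouped values to the reducer, in key order.
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


# Made-up sample records, for illustration only.
sample = ["1 50 5", "1 172 5", "2 50 4", "3 133 1"]
ratings_histogram = run_job(sample)
print(ratings_histogram)
```

On a real cluster the grouping step is handled by the framework and the mapper and reducer run on many machines at once. The sketch keeps everything in one process so the data flow is easy to follow.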
Who this course is for:
- This course is best for students with some prior programming or scripting ability. We will treat you as a beginner when it comes to MapReduce and getting everything set up for writing MapReduce jobs with Python, MRJob, and Amazon’s Elastic MapReduce service – but we won’t spend a lot of time teaching you how to write code. The focus is on framing data analysis problems as MapReduce problems and running them either locally or on a Hadoop cluster. If you don’t know Python, you’ll need to be able to pick it up based on the examples we give. If you’re new to programming, you’ll want to learn a programming or scripting language before taking this course.