Hadoop Developer Course Content:
- How MapReduce and the Hadoop Distributed File System work
- How to write MapReduce code in Java or other programming languages
- What issues to consider when developing MapReduce jobs
- How to implement common algorithms in Hadoop
- Best practices for Hadoop development and debugging
- How to leverage other projects such as Apache Hive, Apache Pig, Sqoop, and Oozie
- Advanced Hadoop API topics required for real-world data analysis
Outline
Introduction
The Motivation for Hadoop
- Problems with traditional large-scale systems
- Requirements for a new approach
Hadoop: Basic Concepts
- An Overview of Hadoop
- The Hadoop Distributed File System
- Hands-On Exercise
- How MapReduce Works
- Hands-On Exercise
- Anatomy of a Hadoop Cluster
- Other Hadoop Ecosystem Components
Writing a MapReduce Program
- The MapReduce Flow
- Examining a Sample MapReduce Program
- Basic MapReduce API Concepts
- The Driver Code
- The Mapper
- The Reducer
- Hadoop’s Streaming API
- Using Eclipse for Rapid Development
- Hands-On Exercise
- The New MapReduce API
Integrating Hadoop into the Workflow
- Relational Database Management Systems
- Storage Systems
- Importing Data from RDBMSs With Sqoop
- Hands-On Exercise
- Importing Real-Time Data with Flume
- Accessing HDFS Using FuseDFS and Hoop
Delving Deeper Into The Hadoop API
- More about ToolRunner
- Testing with MRUnit
- Reducing Intermediate Data With Combiners
- The configure and close methods for Map/Reduce Setup and Teardown
- Writing Partitioners for Better Load Balancing
- Hands-On Exercise
- Directly Accessing HDFS
- Using the Distributed Cache
- Hands-On Exercise
Common MapReduce Algorithms
- Sorting and Searching
- Indexing
- Machine Learning With Mahout
- Term Frequency – Inverse Document Frequency
- Word Co-Occurrence
- Hands-On Exercise
Using Hive and Pig
- Hive Basics
- Pig Basics
- Hands-On Exercise
Practical Development Tips and Techniques
- Debugging MapReduce Code
- Using LocalJobRunner Mode For Easier Debugging
- Retrieving Job Information with Counters
- Logging
- Splittable File Formats
- Determining the Optimal Number of Reducers
- Map-Only MapReduce Jobs
- Hands-On Exercise
Joining Data Sets in MapReduce
- You can attend the first two classes or 3 hours for free. If you like the classes, you can then register.
- For full course details, please visit our website: www.hadooponlinetraining.net
- The course lasts 30 days or 45 hours. Training is one-to-one, with special care for each student and hands-on experience.
- Resume preparation and interview assistance will be provided.
- For any further details, please contact:
- INDIA: +91-9052666559
- USA: +1-6786933475
- Please email all queries to info@magnifictraining.com