IT Online Training and Placements in USA.

Big Data and Hadoop

Big Data and Hadoop Online Training

 

Hadoop Online Training

 

Modules Topic Description
Module 1 Introduction and Overview of Hadoop
  • What is Hadoop?
  • History of Hadoop
  • Building Blocks - Hadoop Eco-System
  • Who is behind Hadoop?
  • What Hadoop is good for and what it is not
  • Parallel Computing vs. Distributed Computing
  • How to configure Hadoop on your system
  • NameNode architecture (EditLog, FsImage, location of replicas)
  • Secondary NameNode architecture
  • DataNode architecture
Module 2 Hadoop Distributed FileSystem (HDFS)
  • HDFS Overview and Architecture
  • HDFS Installation
  • HDFS Use Cases
  • Hadoop FileSystem Shell
  • FileSystem Java API
Module 3 HBase - The Hadoop Database
  • HBase Overview and Architecture
  • HBase Installation
  • HBase Shell
  • Java Client API
  • Java Administrative API
  • Filters
  • Scan Caching and Batching
  • Key Design
  • Table Design
Module 4 MapReduce 2.0 and YARN
  • MapReduce 2.0 and YARN Overview
  • MapReduce 2.0 and YARN Architecture
  • Installation
  • Input and Output Formats
  • Job Scheduling (FIFO, Fair Scheduler, Capacity Scheduler)
  • HDFS and HBase as Source and Sink
  • Job Configuration
  • Job Submission and Monitoring
  • Anatomy of Job Execution on YARN
  • Distributed Cache
  • Hadoop Streaming
Module 5 Hadoop Developer Tasks
  • Writing a MapReduce Program
  • Reading and writing data using Java
  • Hadoop Eclipse integration
  • Mapper in Detail
  • Reducer in Detail
  • Using Combiners
  • Reducing Intermediate Data with Combiners
  • Writing Partitioners for Better Load Balancing
  • Sorting in HDFS
  • Searching in HDFS
  • Indexing in HDFS
  • Hands-On Exercise
Module 6 Hadoop Administrative Tasks
  • Routine Administrative Procedures
  • Understanding dfsadmin and mradmin
  • Block Scanner, Balancer
  • Health Check & Safe mode
  • DataNode commissioning/decommissioning
  • Monitoring and Debugging on a production cluster
  • NameNode Backup and Recovery
  • Upgrading Hadoop
Module 7 MapReduce Workflows
  • Decomposing Problems into MapReduce Workflow
  • Using JobControl
  • Oozie Introduction and Architecture
  • Oozie Installation
  • Developing, Deploying, and Executing Oozie Workflows
Module 8 Pig
  • Pig Overview
  • Installation
  • Pig Latin
  • Developing Pig Scripts
  • Processing Big Data with Pig
  • Joining Datasets with Pig
Module 9 Inheritance
  • Types of inheritance
  • Advantage of inheritance
  • Single inheritance
  • Multilevel inheritance
  • Hierarchical inheritance
  • Overriding methods
  • Runtime polymorphism
Module 10 Hive
  • Hive Overview
  • Installation
  • HiveQL