The Best College Academy of Our Small City

Hadoop Admin Course

Course Duration: 35 hours

Prerequisites:
  • Linux Administration
  • Basic shell concepts
  • Knowledge of Hadoop and big data concepts

Class format: 70% hands-on, 30% concepts


Hadoop Admin Course overview


  • How the Hadoop Distributed File System (HDFS) and MapReduce work
  • What hardware configurations are optimal for Hadoop clusters
  • Configure Hadoop's options for best cluster performance
  • Configure NameNode High Availability
  • Configure NameNode Federation
  • Configure the FairScheduler to provide service-level agreements for multiple users of a cluster
  • How to install and implement Kerberos-based security for your cluster
  • What system administration issues exist with other Hadoop projects such as Hive, Pig, and HBase

Introduction to Big Data


  • Characteristics of Big Data
  • Why parallel computing is important
  • Discuss various products developed by vendors

Introducing Hadoop


  • Components of Hadoop
  • Starting Hadoop
  • Identify various processes
  • Hands on

Working with HDFS


  • Basic file commands
  • Web Based User Interface
  • Reading & Writing to files
  • Run a word count program
  • View jobs in the Web UI
  • Hands on

Installation & Configuration of Hadoop


  • Types of installation (RPMs and tar files)
  • Set up ‘ssh’ for the Hadoop cluster
  • Tree structure
  • XML, masters and slaves files
  • Checking system health
  • Discuss block size and replication factor
  • Benchmarking the cluster
  • Hands on
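
As a rough illustration of the block size and replication factor topic above, the sketch below estimates how many blocks a file occupies and how much raw disk it consumes. It assumes a 128 MB block size and a replication factor of 3, which are the usual defaults in Hadoop 2 and later; the function name `hdfs_footprint` is hypothetical and not part of any Hadoop API.

```python
def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication=3):
    """Estimate HDFS usage for one file.

    Assumed defaults: 128 MB blocks, replication factor 3.
    Returns (blocks, block_replicas, raw_bytes_on_disk).
    """
    # Ceiling division: the last block may be only partially filled.
    blocks = -(-file_size_bytes // block_size_bytes)
    # Every block is stored `replication` times across the cluster.
    block_replicas = blocks * replication
    # A partial last block only occupies its actual length on disk,
    # so raw usage is the file size times the replication factor.
    raw_bytes = file_size_bytes * replication
    return blocks, block_replicas, raw_bytes


if __name__ == "__main__":
    one_gib = 1024 ** 3
    print(hdfs_footprint(one_gib))
```

Under these assumptions, a 1 GiB file is split into 8 blocks, stored as 24 block replicas, and takes 3 GiB of raw disk across the cluster.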

Advanced administration activities


  • Adding and de-commissioning nodes
  • Purpose of secondary name node
  • Recovery from a failed name node
  • Managing quotas
  • Enabling trash
  • Hands on

Monitoring Cluster


  • Hadoop infrastructure monitoring
  • Hadoop specific monitoring
  • Install and configure Nagios / Ganglia
  • Capture metrics
  • Hands on

Components of the Hadoop ecosystem

  • Discuss Hive, Sqoop, Pig, HBase, Flume
  • Use cases of each
  • Use Hadoop streaming to write code in Perl / Python
  • Hands on
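
To give a feel for the Hadoop streaming topic above, here is a minimal word count sketch in Python. Streaming mappers and reducers exchange tab-separated key/value lines over standard input and output, and the framework sorts mapper output by key before the reduce step. The function names `mapper`, `reducer`, and `run_locally` are hypothetical, and the local run only simulates the sort that the framework would perform.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each word; the input must already be sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort -> reduce on one machine, without a cluster."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_locally(["the quick brown fox", "the lazy dog the end"]):
        print(record)
```

On a real cluster the mapper and reducer would typically be two separate scripts that read `sys.stdin` and print to standard output, passed to the Hadoop streaming jar through its `-mapper` and `-reducer` options along with `-input` and `-output` paths.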

Takeaway from this course

  • Set up your own Hadoop project following best practices
  • Understand the troubleshooting aspects of Hadoop administration
  • Tune Hadoop infrastructure through hands-on architecture and design work