Course Duration: 35 hours
Prerequisites:
- Linux Administration
- Basic shell concepts
- Knowledge of Hadoop and Big Data concepts
Format: 70% hands-on, 30% concepts
Hadoop Admin Course overview
- How the Hadoop Distributed File System (HDFS) and MapReduce work
- What hardware configurations are optimal for Hadoop clusters
- Configure Hadoop's options for best cluster performance
- Configure NameNode High Availability
- Configure NameNode Federation
- Configure the FairScheduler to provide service-level agreements for multiple users of a cluster
- How to install and implement Kerberos-based security for your cluster
- What system administration issues exist with other Hadoop projects such as Hive, Pig, and HBase
Introduction to Big Data
- Characteristics of Big Data
- Why parallel computing is important
- Discuss various products developed by vendors
Introducing Hadoop
- Components of Hadoop
- Starting Hadoop
- Identify various processes
- Hands-on
Working with HDFS
- Basic file commands
- Web-based user interface
- Reading from and writing to files
- Run a word count program
- View jobs in the Web UI
- Hands-on
Installation & Configuration of Hadoop
- Types of installation (RPMs and tar files)
- Set up ‘ssh’ for the Hadoop cluster
- Directory tree structure
- XML configuration files and the masters and slaves files
- Checking system health
- Discuss block size and replication factor
- Benchmarking the cluster
- Hands-on
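
To make the block size and replication factor discussion concrete, the following minimal sketch estimates how many blocks a file occupies and how much raw disk space it consumes. It assumes a 128 MB block size and a replication factor of 3, the usual defaults in Hadoop 2 and later (Hadoop 1 defaulted to 64 MB). The function name hdfs_footprint is illustrative only and is not part of any Hadoop API.

```python
MIB = 1024 * 1024


def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MIB, replication=3):
    """Estimate (blocks, block replicas, raw bytes) for one file in HDFS.

    Assumed defaults: 128 MiB blocks, replication factor 3.
    """
    # Ceiling division: a partially filled last block still counts as a block.
    blocks = -(-file_size_bytes // block_size_bytes)
    # Every block is stored 'replication' times across the DataNodes.
    replicas = blocks * replication
    # HDFS does not pad the last block, so raw usage is size times replication.
    raw_bytes = file_size_bytes * replication
    return blocks, replicas, raw_bytes


if __name__ == "__main__":
    blocks, replicas, raw = hdfs_footprint(300 * MIB)
    print(blocks, replicas, raw // MIB)
```

Under these assumptions a 300 MiB file occupies 3 blocks, 9 block replicas, and 900 MiB of raw disk. Many small files therefore cost the NameNode far more metadata per byte than a few large ones, which is one reason block size is a tuning topic.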
Advanced administration activities
- Adding and de-commissioning nodes
- Purpose of the Secondary NameNode
- Recovery from a failed NameNode
- Managing quotas
- Enabling trash
- Hands-on
Monitoring Cluster
- Hadoop infrastructure monitoring
- Hadoop-specific monitoring
- Install and configure Nagios / Ganglia
- Capture metrics
- Hands-on
Components of the Hadoop ecosystem
- Discuss Hive, Sqoop, Pig, HBase, Flume
- Use cases of each
- Use Hadoop streaming to write code in Perl / Python
- Hands-on
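
As an illustration of the Hadoop streaming topic above, here is a minimal word count sketch in Python. Streaming runs any executable that reads lines from standard input and writes tab-separated key/value lines to standard output, and it sorts the mapper output by key before the reducer sees it. The script layout, the function names, and the "map" / "reduce" command-line argument are assumptions made for this example, not a convention required by Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit (word, 1) for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def parse_pairs(lines):
    """Parse 'word<TAB>count' lines as produced by the mapper."""
    for line in lines:
        word, _, count = line.rstrip("\n").partition("\t")
        if word and count:
            yield word, int(count)


def reduce_pairs(pairs):
    """Reducer: sum counts per word. Input must already be sorted by word."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical dispatch: one script serves as both mapper and reducer.
    if sys.argv[1] == "map":
        for word, count in map_lines(sys.stdin):
            print(f"{word}\t{count}")
    elif sys.argv[1] == "reduce":
        for word, count in reduce_pairs(parse_pairs(sys.stdin)):
            print(f"{word}\t{count}")
```

The logic can be checked without a cluster by piping a text file through the script in map mode, then through sort, then through the script in reduce mode. On a cluster, the same script would be supplied through the streaming jar's -mapper and -reducer options together with -input and -output; the location of the streaming jar depends on the installation.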
Takeaways from this course
- Set up your own Hadoop project following best practices
- Troubleshoot common issues in Hadoop administration
- Tune Hadoop infrastructure through hands-on architecture and design work