Hadoop Package
Course Syllabus:
• BigData
• What is BigData
• Characteristics of BigData
• Problems with BigData
• Handling BigData
• Distributed Systems
• Introduction to Distributed Systems
• Problems with Existing Distributed Systems in Dealing with BigData
• Requirements of a New Approach
• HADOOP history
• HADOOP Core Concepts
• HDFS
• MapReduce
• HADOOP Cluster
• Install Pseudo cluster
• Install Multi node cluster
• Introduction to HADOOP Cluster Configuration
• How the Five Daemons Work
• NameNode
• JobTracker
• SecondaryNameNode
• TaskTracker
• DataNode
• Introduction to HADOOP EcoSystem projects
• Writing MapReduce programs
• Understanding HADOOP API
• Basic Form of a HADOOP MapReduce Application
- Driver Code
- Mapper Code
- Reducer Code
• Eclipse integration with HADOOP for Rapid Application Development
• Understanding ToolRunner
• More about ToolRunner
• Combiner
• Reducer
• configure and close methods
• Common MapReduce Algorithms
• Sorting
• Searching
• Indexing
• TF-IDF
• Word Co-Occurrence
• HADOOP EcoSystem
• Flume
• Sqoop
• Importing data from RDBMS using Sqoop
• Hive
• Introduction to Hive
• Creating tables in Hive
• Running queries
• Pig
• Introduction to Pig
• Different modes of Pig
• When to use Hive and when to use Pig
• HBASE
• Basics of HBASE
• Advanced MapReduce Programming
• Developing custom Writable
• Developing custom WritableComparable
• Understanding Input Output formats
• Introduction to Oozie
• Hands-on exercise for each concept