The Best College Academy of Our Small City

HADOOP-HIVE

Course objective

By the end of this course, you will be able to work on Hive programming.

Course Content

  1. Hive Introduction
    • What is Hive
    • Inside Hive
  2. Hive Data types and File Format
    • Primitive Data Types
    • Collection Data Types
    • Text File encoding of Data Types
  3. Hive Data Definition
    • Database in Hive
    • Alter Database
    • Creating Tables
    • Partitioned and Managed Tables
    • Dropping Tables
    • Alter Tables
  4. Hive Data manipulation
    • Loading data into Managed Table
    • Inserting data into Tables and Queries
    • Creating tables and loading them in one query
    • Exporting Query output
  5. Hive QL: Queries
    • Select .. From clause
    • Where
    • Group By
    • Join Statements
    • Order By and Sort By
    • Distribute By with Sort By
    • Cluster By
    • Casting
    • Union All
  6. Hive QL: Views
    • Views to reduce query complexity
    • Views that restrict data based on conditions
    • Views and Map Type for Dynamic Tables
    • View Odds and Ends
  7. Hive QL: Indexes
    • Creating Index
    • Rebuilding Index
    • Showing an Index
    • Dropping an Index
  8. Schema Design
    • Table by Day
    • Over Partitioning
    • Unique Keys and Normalization
    • Making multiple Passes over Same Data
    • Partitioning every table
    • Bucketing Table Data Storage
    • Adding columns to Tables
    • Using Columnar Tables
    • Almost Always Use Compression
  9. Tuning
    • Using EXPLAIN
    • EXPLAIN EXTENDED
    • Limit Tuning
    • Optimized Joins
    • Local Mode
    • Parallel Execution
    • Strict Mode
    • Tuning the Number of Mappers and Reducers
    • JVM Reuse
    • Indexes
    • Dynamic Partition Tuning
    • Speculative Execution
    • Single MapReduce MultiGROUP BY
    • Virtual Columns
  10. Hive - Functions
    • Discovering and Describing Functions
    • Calling Functions
    • Standard Functions
    • Aggregate Functions
    • Table Generating Functions
    • A UDF for Finding a Zodiac Sign from a Day
    • UDF Versus GenericUDF
    • Permanent Functions
    • User-Defined Aggregate Functions
    • Creating a COLLECT UDAF to Emulate GROUP_CONCAT
    • User-Defined Table Generating Functions
    • UDTFs that Produce Multiple Rows
    • UDTFs that Produce a Single Row with Multiple Columns
    • UDTFs that Simulate Complex Types
    • Accessing the Distributed Cache from a UDF
    • Annotations for Use with Functions
    • Deterministic
    • Stateful
    • DistinctLike
    • Macros
  11. Customizing Hive File and Record Formats
    • File Versus Record Formats
    • Demystifying CREATE TABLE Statements
    • File Formats
    • SequenceFile
    • RCFile
    • Example of a Custom Input Format: DualInputFormat
    • Record Formats: SerDes
    • CSV and TSV SerDes
    • ObjectInspector
    • Think Big Hive Reflection ObjectInspector
    • XML UDF
    • XPath-Related Functions
    • JSON SerDe
    • Avro Hive SerDe
    • Defining Avro Schema Using Table Properties
    • Defining a Schema from a URI
    • Evolving Schema
    • Binary Output

Course Takeaway