Extracting Business Value From Big Data With Pig and Hive

Course Type: Intermediate
Course Number: 1254
Duration: 4 Days

Increase productivity by avoiding the low-level Java coding characteristic of MapReduce, and rapidly begin extracting business value for competitive advantage. In this big data training course, you will learn to access previously inaccessible data, gather and feed data into Hadoop for storage, transform and filter data using Pig, and extract value using Hive and Spark SQL.

You Will Learn How To

  • Manipulate complex data sets stored in Hadoop for competitive advantage
  • Automate the transfer of data into Hadoop storage with Flume and Sqoop
  • Filter data with Extract-Transform-Load (ETL) operations using Pig
  • Query multiple data sets for analysis with Pig and Hive
  • Perform real-time queries on Hadoop data with Tez and Spark SQL

Important Course Information

Recommended Experience:

  • Knowledge of databases and SQL

Course Outline

The Hadoop Ecosystem

Hadoop overview

  • Surveying the Hadoop components
  • Defining the Hadoop architecture
  • Exploring HDFS and MapReduce

Storing data in HDFS

  • Achieving reliable and secure storage
  • Monitoring storage metrics
  • Controlling HDFS from the Command Line

Parallel processing with MapReduce

  • Detailing the MapReduce approach
  • Transferring algorithms, not data
  • Dissecting the key stages of a MapReduce job

Automating data transfer

  • Facilitating data ingress and egress
  • Aggregating data with Flume
  • Configuring data fan-in and fan-out
  • Moving relational data with Sqoop

Executing Data Flows with Pig

Describing characteristics of Apache Pig

  • Contrasting Pig with MapReduce
  • Identifying Pig use cases
  • Pinpointing key Pig configurations

Structuring unstructured data

  • Representing data in Pig's data model
  • Running Pig Latin commands at the Grunt Shell
  • Expressing transformations in Pig Latin Syntax
  • Invoking Load and Store functions

Performing ETL with Pig

Transforming data with Relational Operators

  • Creating new relations with joins
  • Reducing data size by sampling
  • Extending Pig with user-defined functions

Filtering data with Pig

  • Consolidating data sets with unions
  • Partitioning data sets with splits
  • Injecting parameters into Pig scripts

Manipulating Data with Hive

Leveraging business advantages of Hive

  • Factoring Hive into components
  • Imposing structure on data with Hive

Organising data in the Hive Data Warehouse

  • Creating Hive databases and tables
  • Contrasting available data types in Hive
  • Loading and storing data efficiently with SerDes

Designing data layout for maximum performance

  • Populating tables from queries
  • Partitioning Hive Tables for optimal queries
  • Composing HiveQL queries

Extracting Business Value with HiveQL

Performing joins on unstructured data

  • Distinguishing joins available in Hive
  • Optimising join structure for performance

Pushing HiveQL to the limit

  • Sorting, distributing and clustering data
  • Reducing query complexity with views
  • Improving query performance with indexes

Deploying Hive in production

  • Designing Hive schemas
  • Setting up data compression
  • Debugging Hive scripts

Streamlining storage management with HCatalog

  • Unifying the data view with HCatalog
  • Leveraging HCatalog to access the Hive metastore
  • Communicating via the HCatalog interfaces
  • Populating a Hive table from Pig

Interacting with Hadoop Data in Real Time

  • Performing low-latency queries with Impala
  • Leveraging the Tez execution engine to improve performance
  • Reducing data access time with Spark SQL

Convenient Ways to Attend This Instructor-Led Course

Hassle-Free Enrolment: No advance payment required to reserve your seat.
Tuition Fee due 30 days after you attend your course.

In the Classroom or Live, Online

Tuition Fee (Standard): £2095

AFTERNOON START: Attend these live courses online via AnyWare

27 - 30 Mar (4 Days)
2:00 PM - 9:30 PM BST
Herndon, VA / Online (AnyWare)

25 - 28 Sep (4 Days)
2:00 PM - 9:30 PM BST
Herndon, VA / Online (AnyWare)

Guaranteed to Run

Private Team Training

Enrolling at least 3 people in this course? Consider bringing this (or any course that can be custom designed) to your preferred location as a private team training.

For details, call 0800 282 353.

Course Tuition Fee Includes:

After-Course Instructor Coaching
When you return to work, you are entitled to schedule a free coaching session with your instructor for help and guidance as you apply your new skills.

After-Course Computing Sandbox
You'll be given remote access to a preconfigured virtual machine for you to redo your hands-on exercises, develop/test new code, and experiment with the same software used in your course.

Free Course Exam
You can take your Learning Tree course exam on the last day of your course or online at any time after class and receive a Certificate of Achievement with the designation "Awarded with Distinction."

Training Hours

Standard class hours:
9:00 a.m. - 4:30 p.m.

Last day class hours:
9:00 a.m. - 3:30 p.m.

Free Course Exam – Last Day:
3:30 p.m. - 4:30 p.m.

Each class day:
Informal discussion with instructor about your projects or areas of special interest:
4:30 p.m. - 5:30 p.m.

AFTERNOON START class hours:
2:00 p.m. - 9:30 p.m.

Last day class hours:
2:00 p.m. - 8:30 p.m.

Free Course Exam – Last Day:
8:30 p.m. - 9:30 p.m.

Each class day:
Informal discussion with instructor about your projects or areas of special interest:
9:30 p.m. - 10:30 p.m.
