COURSE ID | GES-BD |
DURATION | 39 hrs |
DELIVERY METHOD | Classroom Instructor-led training (CILT); Online Instructor-led training (OILT) |
RDBMS vs Hadoop; Ecosystem tour (9 products); Vendor comparison (Cloudera, Hortonworks, MapR, Amazon EMR); Hardware recommendations; NameNode and DataNode architecture; Write pipeline; Read pipeline; Heartbeats; Rack awareness; Block scanner; JobTracker/TaskTracker architecture; Shuffle: Sort + Partitioning; Speculative execution; Input/output formats; Distributed cache
Introduction to Hadoop FS and Processing Environment’s UIs
How to read and write files; Basic Unix commands for Hadoop; Hadoop FS shell; Hadoop releases practical; Hadoop daemons practical
Pig Introduction
Why Pig if MapReduce is there?; How Pig is different from programming languages; Introduction to Pig data flow; How schema is optional in Pig; Pig data types; Pig commands: Load, Store, Describe, Dump; MapReduce jobs started by Pig commands; Execution plan; Pig UDFs; Pig use cases; Pig assignment; Complex use cases on Pig; XML data processing in Pig; Structured data processing in Pig; Semi-structured data processing in Pig; Pig advanced assignment; Real-time scenarios on Pig; When we should use Pig; When we shouldn't use Pig; Live examples of Pig use cases
Hive Introduction
Meta storage and metastore; Introduction to Derby database; Hive data types; HQL; DDL, DML and sub-languages of Hive; Internal, external and temp tables in Hive; Differences between a SQL-based data warehouse and Hive
Hive releases
Why Hive is not the best solution for OLTP; OLAP in Hive; Partitioning; Bucketing; Hive architecture; Thrift Server; Hue interface for Hive; How to analyze data using a Hive script; Differences between Hive and Impala; UDFs in Hive; Complex use cases in Hive; Hive advanced assignment; Real-time scenarios of Hive; POC on Pig and Hive, with real-time data sets and problem statements; How MapReduce works as a processing framework; End-to-end execution flow of a MapReduce job; Different tasks in a MapReduce job; Why is the Reducer optional while the Mapper is mandatory?; Introduction to Combiner; Introduction to Partitioner; Programming languages for MapReduce; Why Java is preferred for MapReduce programming; POC based on Pig, Hive, HDFS, MR; Introduction to NoSQL; Why NoSQL if SQL has been on the market for years; NoSQL-based databases on the market; CAP theorem; ACID vs. CAP; OLTP solutions with different capabilities; Which NoSQL-based solution can handle specific requirements; Examples of companies like Google, Facebook, Amazon, and other clients who are using NoSQL-based databases; HBase architecture of column families
How to work on MapReduce in real time; MapReduce complex scenarios; Introduction to HBase; Introduction to other NoSQL-based data models; Drawbacks of Hadoop; Why Hadoop can't work for real-time processing; How HBase and other NoSQL-based tools made real-time processing possible on top of Hadoop; HBase table and column family structure; HBase versioning concept; HBase flexible schema; HBase advanced
Introduction to ZooKeeper
How ZooKeeper helps in the Hadoop ecosystem; How to load data from relational storage into Hadoop; Sqoop basics; Sqoop practical implementation; Sqoop alternative; Sqoop connector; Quick revision of previous classes to fill gaps and correct misunderstandings; How to load data into Hadoop that comes from a web server or other storage without a fixed schema; How to load unstructured and semi-structured data into Hadoop; Introduction to Flume; Hands-on with Flume; How to load Twitter data into HDFS using Hadoop; Introduction to Oozie; How to schedule jobs using Oozie; What kinds of jobs can be scheduled using Oozie; How to schedule time-based jobs; Hadoop releases; Where to get Hadoop and other components to install; Introduction to YARN; Significance of YARN; Introduction to Hue; How Hue is used in real time; Hue use cases; Real-time Hadoop usage; Real-time cluster introduction; Hadoop Release 1 vs Hadoop Release 2 in real time; Hadoop real-time project; Major POC based on a combination of several tools of the Hadoop ecosystem; Comparison between Pig and Hive real-time scenarios; Real-time problems and frequently faced errors with solutions; Introduction to Spark; Introduction to Scala basics; Features of Spark and Scala available in Hue; Why Spark demand is increasing in the market; How we can use Spark with the Hadoop ecosystem; Datasets for practice; Spark use cases with real-time scenarios; Spark practical with advanced concepts; Scala platform with complex use cases; Real-time project use case examples based on Spark and Scala |
Recent comments
"I would like to thank the Global ERP Solutions trainers for enhancing my technical knowledge, which boosted my career and confidence, and for guiding me throughout my training. The training was superb and helped me upgrade my knowledge and technical skills.
I will sincerely refer it to my friends."
"Thanks for the excellent big data training."
"Well-qualified trainers at Global ERP Solutions; they have done an outstanding job. Great job!"