Over the years, businesses the world over have been saving data about their business, customers, products and services; by some estimates, about 2.5 quintillion bytes of data are now generated every day. This data may be structured or unstructured, but businesses now realize there is immense wealth treasured in these data sets: they can reveal a great deal about product development, customer behaviour, buying patterns, trends, social patterns and projections. Hence the need to learn Big Data. Enter "Hadoop", the core platform for structuring Big Data and making it useful for analytics. The Hadoop training program equips you to pull the right data from what has been buried over the years and run analytics on it to improve the business.
WE OFFER 75% OFF ON COURSE FEES FOR MARTYRS' FAMILIES AND 50% OFF ON COURSE FEES FOR ARMY DEPENDANTS.
Hadoop Course Details
Introduction to Big Data & Hadoop
Types of data and their significance
Need for Big Data analytics.
Why Big Data with Hadoop?
History of Hadoop.
Node, rack and cluster.
Architecture of Hadoop.
Characteristics of the NameNode.
Significance of the JobTracker and TaskTrackers.
Phase co-ordination with the JobTracker.
Secondary NameNode usage and workings.
Hadoop releases and their significance.
Working with DataNodes.
YARN architecture.
Significance of scalability of operation.
Use cases where not to use Hadoop.
Use cases where Hadoop is used.
Hadoop and the Java API
Hadoop classes.
What is MapReduceBase?
Mapper Class and its Methods.
What is a Partitioner, and its types.
Hadoop-specific datatypes.
Working on unstructured data analytics.
What is an iterator and its usage techniques.
Types of mappers and reducers.
What is the OutputCollector and its significance.
Working with joins of datasets.
Complications with MapReduce.
Anagram, TeraGen and TeraSort examples.
Working with multiple mappers.
Working with weather data on multiple DataNodes in a fully distributed architecture.
Use cases where the MapReduce anatomy fails.
Interview questions based on Java MapReduce.
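The map, shuffle/sort and reduce phases covered in the topics above can be sketched without any Hadoop dependency. The following is a plain-Java simulation of a word count, not Hadoop's actual API (a real job would extend `Mapper` and `Reducer` and use `OutputCollector`/`Context`); the class and method names here are our own:

```java
import java.util.*;

// A minimal, dependency-free sketch of the MapReduce anatomy:
// map emits (word, 1) pairs, the framework groups values by key
// (shuffle/sort), and reduce sums each group's values.
public class WordCountSketch {

    // "map" phase: one input line -> a list of (key, value) pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) out.add(Map.entry(word, 1));
        }
        return out;
    }

    // "shuffle/sort" phase: group values by key, as the framework would
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // "reduce" phase: (key, [v1, v2, ...]) -> (key, sum of values)
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> out = new TreeMap<>();
        grouped.forEach((k, vs) -> out.put(k, vs.stream().mapToInt(Integer::intValue).sum()));
        return out;
    }

    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) pairs.addAll(map(line));
        return reduce(shuffle(pairs));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = run(List.of("big data big wins", "data flows"));
        System.out.println(counts); // {big=2, data=2, flows=1, wins=1}
    }
}
```

In a real cluster the shuffle step also partitions keys across reducers, which is where the Partitioner topic above fits in.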
Pig Latin I
Introduction to Pig Latin; history and evolution of Pig Latin.
Why Pig is used only with Big Data.
Pig architecture: overview of the compiler and execution engine.
Pig releases and their significance, with bug fixes.
Pig-specific datatypes; complex datatypes: bags, tuples and fields.
Pig-specific methods.
Comparison between Yahoo Pig and Facebook Hive.
Working with the Grunt shell; Grunt commands (17 in total).
Pig data-input techniques for flat files (comma-separated, tab-delimited and fixed-width).
Working with the schemaless approach.
How to attach a schema to a file/table in Pig.
Schema referencing for similar tables and files.
Working with delimiters.
Pig Latin II
Working with BinaryStorage and TextLoader.
Big Data operations and the read/write analogy.
Filtering datasets: filtering rows with a specific condition.
Filtering rows with multiple conditions.
Filtering rows with string-based conditions.
Sorting datasets: sorting rows by a specific column or columns.
Multilevel sort; analogy of a sort operation.
Grouping datasets and co-grouping data.
Joining datasets: types of joins supported by Pig Latin.
Aggregate operations such as average, sum, min, max and count.
Creating a UDF (user-defined function) using Java.
Calling a UDF from a Pig script.
Data validation scripts.
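As a sketch of the "creating a UDF using Java" topic above, the core logic of a simple field-cleaning UDF might look like the following. This is plain Java so it compiles without the Pig jars; in a real UDF this logic would live in the `exec(Tuple)` method of a class extending `org.apache.pig.EvalFunc<String>`, and the jar would be registered in the script with `REGISTER`. The class and method names here are illustrative:

```java
// Plain-Java sketch of the transformation logic a Pig UDF would carry.
// A real Pig UDF wraps this in exec(Tuple) on a subclass of EvalFunc<String>.
public class TrimUpperUdf {

    // The would-be exec() body: validate the input field and normalize it.
    static String exec(String field) {
        if (field == null || field.trim().isEmpty()) {
            return null; // Pig treats null as a missing value
        }
        return field.trim().toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(exec("  hadoop ")); // HADOOP
        System.out.println(exec("   "));       // null
    }
}
```

The same pattern (validate, then transform or return null) is what the "data validation scripts" topic above builds on.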
Hive
Installation and configuration.
Interacting with HDFS using Hive.
MapReduce programs through Hive.
Loading, filtering and grouping.
Data types and operators; joins and groups.
Sample programs in Hive.
Alter and Delete in Hive.
Partitioning in Hive.
Joins in Hive.
Unions in Hive.
Industry-specific configuration of Hive parameters.
Authentication & Authorization.
Statistics with Hive.
Archiving in Hive.
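The grouping and aggregation that a HiveQL query such as `SELECT dept, count(*) FROM emp GROUP BY dept` performs can be sketched in plain Java. Hive compiles such queries into MapReduce jobs over HDFS data; this in-memory version (with an invented table and invented column names) only shows the computation:

```java
import java.util.*;
import java.util.stream.*;

// Plain-Java sketch of what a HiveQL "SELECT dept, count(*) ... GROUP BY dept"
// computes over a small in-memory "table". Hive would run this as a
// MapReduce job; the table and column names here are invented.
public class GroupBySketch {

    static Map<String, Long> countByDept(List<String[]> rows) {
        // each row is {name, dept}; group on the dept column and count rows per group
        return rows.stream()
                   .collect(Collectors.groupingBy(r -> r[1], TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String[]> emp = List.of(
            new String[]{"asha", "sales"},
            new String[]{"ravi", "hr"},
            new String[]{"meena", "sales"});
        System.out.println(countByDept(emp)); // {hr=1, sales=2}
    }
}
```

Partitioning in Hive (covered above) pushes this idea into the storage layout: rows are physically split by the partition column so a query touches only the partitions it needs.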
HBase & ZooKeeper
HBase from an architectural point of view.
RegionServers and their implementation.
Client APIs and their features.
How the messaging system works.
Columns and column families
Loading HBase with semi-structured data.
Internal data storage in HBase.
Creating table with column families
HBase: Advanced Usage, Schema Design
Loading data from Pig into HBase.
Data import and export with Sqoop.
Deploying a ZooKeeper quorum and configuring it throughout the cluster.
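HBase's internal data model (rows, column families and column qualifiers, covered above) can be pictured as nested sorted maps: rowKey → columnFamily → qualifier → value. A minimal in-memory sketch, leaving out the timestamp/version dimension and HFile persistence that real HBase adds:

```java
import java.util.*;

// Sketch of HBase's logical data model as nested sorted maps:
// rowKey -> columnFamily -> columnQualifier -> value.
// Real HBase adds a timestamp per cell and persists regions to HFiles;
// this only illustrates how column families group columns under a row.
public class HBaseModelSketch {

    private final Map<String, Map<String, Map<String, String>>> table = new TreeMap<>();

    void put(String rowKey, String family, String qualifier, String value) {
        table.computeIfAbsent(rowKey, k -> new TreeMap<>())
             .computeIfAbsent(family, k -> new TreeMap<>())
             .put(qualifier, value);
    }

    String get(String rowKey, String family, String qualifier) {
        return table.getOrDefault(rowKey, Map.of())
                    .getOrDefault(family, Map.of())
                    .get(qualifier);
    }

    public static void main(String[] args) {
        HBaseModelSketch t = new HBaseModelSketch();
        // one row, two column families -- like "create 'user', 'info', 'stats'"
        t.put("row1", "info", "name", "asha");
        t.put("row1", "stats", "logins", "42");
        System.out.println(t.get("row1", "info", "name")); // asha
    }
}
```

Because families are declared at table-creation time while qualifiers are free-form, schema design in HBase is largely about choosing row keys and families well.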
YARN & Hadoop 2.0
Introduction to YARN and MR2 daemons.
Active and standby NameNodes.
Resource Manager and Application Master
Container objects and containers.
Cloudera Manager and Impala
Load balancing in the cluster with NameNode federation.
Architectural differences between Hadoop 1.0 and 2.0
Hadoop on Amazon Cloud
Introduction to cloud infrastructure.
Amazon SaaS, PaaS and IaaS.
Creating EC2 instance for processing.
Creating S3 buckets
Deploying data onto the cloud.
Choosing the size of our instance.
Configuring an EMR instance.
Creating a virtual cluster on Amazon.
Deploying the project and getting statistics.