Over the years, businesses the world over have been saving data pertaining to their business, customers, products, services and so on; according to some sources, about 2.5 quintillion bytes of data are now generated every day worldwide. This data may be structured or unstructured, but the point is that businesses now realize there is immense wealth in these data sets: they can yield a lot of information for product development and for understanding customers' behavior, buying patterns, trends, social patterns and projections. Hence the need to learn Big Data. Introducing “Hadoop”, the core platform for structuring Big Data and making it useful for analytics. The Hadoop training program equips you to pull the right data from what has been buried over the years and run analytics on it to improve the business.
WE OFFER 75% OFF COURSE FEES TO MARTYRS' FAMILIES AND 50% OFF COURSE FEES FOR DEPENDANTS OF ARMY PERSONNEL.
Types of data and their significance
Need for Big Data analytics.
Why Big Data with Hadoop?
History of Hadoop.
Node, Rack, Cluster.
Architecture of Hadoop.
Characteristics of Namenode.
Significance of JobTracker and Tasktrackers.
hase co-ordination with JobTracker.
Secondary Namenode usage and workaround.
Hadoop releases and their significance.
Workaround with datanodes.
YARN architecture. Significance of scalability of operation.
Use cases where not to use Hadoop.
Use cases where Hadoop is used.
Hadoop classes. What is MapReduceBase? Mapper class and its methods. What is a Partitioner, and its types. Hadoop-specific data types. Working on unstructured data analytics. What is an iterator, and its usage techniques. Types of mappers and reducers. What is an output collector, and its significance. Workaround with joining of datasets. Complications with MapReduce. MapReduce anatomy. Anagram example, TeraGen example, TeraSort example, WordCount example. Working with multiple mappers. Working with weather data on multiple datanodes in a fully distributed architecture. Use cases where MapReduce anatomy fails. Interview questions based on Java MapReduce.
Introduction to Pig Latin, history and evolution of Pig Latin, why Pig is used only with Big Data, Pig architecture and overview of compiler and execution engine, Pig releases and their significance with bug fixes,
Pig-specific data types, complex data types (bags, tuples, fields), Pig-specific methods, comparison between Yahoo Pig and Facebook Hive.
Working with Grunt shell, Grunt commands (total 17). Pig data input techniques for flat files (comma separated, tab delimited and fixed width).
Working with schemaless approach. How to attach a schema to a file/table in Pig, schema referencing for similar tables and files, working with delimiters.
Working with BinaryStorage and Text Loader. Big Data operations and read/write analogy.
Filtering datasets. Filtering rows with a specific condition. Filtering rows with multiple conditions. Filtering rows with string-based conditions. Sorting datasets.
Sorting rows by a specific column or columns. Multilevel sort. Analogy of a sort operation. Grouping datasets and co-grouping data. Joining datasets. Types of joins supported by Pig Latin.
Aggregate operations like average, sum, min, max, count. Flatten operator. Creating a UDF (user-defined function) using Java. Calling a UDF from a Pig script. Data validation scripts.
Introduction. Installation and configuration. Interacting with HDFS using Hive. MapReduce programs through Hive.
Hive commands. Loading, filtering, grouping. Data types, operators. Joins, groups. Sample programs in Hive. Alter and delete in Hive. Partition in Hive.
Indexing. Joins in Hive. Unions in Hive. Industry-specific configuration of Hive parameters. Authentication & authorization. Statistics with Hive. Archiving in Hive. Hands-on exercise.
HBase from an architectural point of view. RegionServers and their implementation. Client APIs and their features. How the messaging system works. Columns and column families. Configuring hbase-site.xml.
Available clients. Loading HBase with semi-structured data. Internal data storage in HBase. Timestamps. HBase architecture. Creating a table with column families. MapReduce integration.
HBase: advanced usage, schema design. Loading data from Pig to HBase. Sqoop architecture. Data import and export in Sqoop. Deploying quorum and configuration throughout the cluster.
Introduction to YARN and MR2 daemons. Active and standby Namenodes. Resource Manager and Application Master. Node Manager.
Container objects and container. Namenode federation. Cloudera Manager and Impala. Load balancing in cluster with Namenode federation. Architectural differences between Hadoop 1.0 and 2.0.
Introduction to cloud infrastructure. Amazon SaaS, PaaS and IaaS. Creating an EC2 instance for processing. Creating S3 buckets. Deploying data on to the cloud.
Choosing the size of our instance. Configuration of an EMR instance. Creating a virtual cluster on Amazon. Deploying the project and getting stats.