Understanding Big Data and Hadoop
Learning Objectives: In this module, you will understand what Big Data is, the limitations of traditional solutions for Big Data problems, how Hadoop solves those problems, the Hadoop Ecosystem, Hadoop Architecture, HDFS, the Anatomy of File Read and Write, and how MapReduce works.
- Introduction to Big Data & Big Data Challenges
- Limitations & Solutions of Big Data Architecture
- Hadoop & its Features
- Hadoop Ecosystem
- Hadoop 2.x Core Components
- Hadoop Storage: HDFS (Hadoop Distributed File System)
- Hadoop Processing: MapReduce Framework
- Different Hadoop Distributions
Topics:
Hadoop Architecture and HDFS
Learning Objectives: In this module, you will learn Hadoop Cluster Architecture, the important configuration files of a Hadoop Cluster, Data Loading Techniques using Sqoop & Flume, and how to set up a Single Node and a Multi-Node Hadoop Cluster.
- Hadoop 2.x Cluster Architecture
- Federation and High Availability Architecture
- Typical Production Hadoop Cluster
- Hadoop Cluster Modes
- Common Hadoop Shell Commands
- Hadoop 2.x Configuration Files
- Single Node Cluster & Multi-Node Cluster set up
- Basic Hadoop Administration
Topics:
Hadoop MapReduce Framework
Learning Objectives: In this module, you will gain a comprehensive understanding of the Hadoop MapReduce framework and of how MapReduce works on data stored in HDFS. You will also learn advanced MapReduce concepts such as Input Splits, Combiner & Partitioner.
- Traditional way vs MapReduce way
- Why MapReduce
- YARN Components
- YARN Architecture
- YARN MapReduce Application Execution Flow
- YARN Workflow
- Anatomy of MapReduce Program
- Input Splits, Relation between Input Splits and HDFS Blocks
- MapReduce: Combiner & Partitioner
- Demo of Health Care Dataset
- Demo of Weather Dataset
Topics:
Advanced Hadoop MapReduce
Learning Objectives: In this module, you will learn advanced MapReduce concepts such as Counters, Distributed Cache, MRUnit, Reduce Join, Custom Input Format, Sequence Input Format and XML parsing.
Apache Pig
Learning Objectives: In this module, you will learn Apache Pig, the types of use cases where Pig can be used, the tight coupling between Pig and MapReduce, Pig Latin scripting, Pig running modes, Pig UDF, Pig Streaming & Testing Pig Scripts. You will also work on a healthcare dataset.
Apache Hive
Learning Objectives: This module will help you understand Hive concepts, Hive Data types, loading and querying data in Hive, running Hive scripts and Hive UDF.
- Introduction to Apache Hive
- Hive vs Pig
- Hive Architecture and Components
- Hive Metastore
- Limitations of Hive
- Comparison with Traditional Database
- Hive Data Types and Data Models
- Hive Partition
- Hive Bucketing
- Hive Tables (Managed Tables and External Tables)
- Importing Data
- Querying Data & Managing Outputs
- Hive Script & Hive UDF
- Retail use case in Hive
- Hive Demo on Healthcare Dataset
Topics:
Advanced Apache Hive and HBase
Learning Objectives: In this module, you will understand advanced Apache Hive concepts such as UDF, Dynamic Partitioning, Hive indexes and views, and optimizations in Hive. You will also acquire in-depth knowledge of Apache HBase, HBase Architecture, HBase running modes and its components.
- Hive QL: Joining Tables, Dynamic Partitioning
- Custom MapReduce Scripts
- Hive Indexes and Views
- Hive Query Optimizers
- Hive Thrift Server
- Hive UDF
- Apache HBase: Introduction to NoSQL Databases and HBase
- HBase vs RDBMS
- HBase Components
- HBase Architecture
- HBase Run Modes
- HBase Configuration
- HBase Cluster Deployment
Topics:
Advanced Apache HBase
Learning Objectives: This module will cover advanced Apache HBase concepts. We will see demos on HBase Bulk Loading & HBase Filters. You will also learn what ZooKeeper is, how it helps in monitoring a cluster & why HBase uses ZooKeeper.
Processing Distributed Data with Apache Spark
Learning Objectives: In this module, you will learn what Apache Spark, SparkContext & the Spark Ecosystem are. You will learn how to work with Resilient Distributed Datasets (RDD) in Apache Spark. You will run an application on a Spark Cluster & compare the performance of MapReduce and Spark.
Oozie and Hadoop Project
Learning Objectives: In this module, you will understand how multiple Hadoop ecosystem components work together to solve Big Data problems. This module will also cover a Flume & Sqoop demo, the Apache Oozie Workflow Scheduler for Hadoop Jobs, and Hadoop Talend integration.
www.edureka.co © 2019 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
Certification Project
- Analysis of an Online Book Store
- Find out the frequency of books published each year. (Hint: Sample dataset will be provided)
- Find out in which year the maximum number of books was published.
- Find out how many books were published based on ranking in the year 2002.
- Sample Dataset Description
- The Book-Crossing dataset consists of 3 tables that will be provided to you.
- Airlines Analysis
- Find the list of airports operating in India
- Find the list of airlines having zero stops
- Find the list of airlines operating with codeshare
- Find which country or territory has the highest number of airports
- Find the list of active airlines in the United States
- Sample Dataset Description
- In this use case, there are 3 data sets: Final_airlines, routes.dat, and airports_mod.dat
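
Most of the project tasks listed above come down to two operations: filtering records and counting them by key, which is the same pattern a MapReduce job applies at scale. The sketch below shows that pattern in plain Python on a few made-up rows. The sample data, the field names (`country`, `stops`, `airline`), and all function names are assumptions for illustration only; they are not the actual layout of the Book-Crossing or airline files, and this is not Hadoop code.

```python
from collections import Counter

# Hypothetical sample rows; the real project files may use different columns.
BOOKS = [  # (title, year_of_publication)
    ("Book A", 2001),
    ("Book B", 2002),
    ("Book C", 2002),
    ("Book D", 1999),
    ("Book E", 2002),
]
AIRPORTS = [
    {"name": "Airport 1", "country": "India"},
    {"name": "Airport 2", "country": "India"},
    {"name": "Airport 3", "country": "United States"},
]
ROUTES = [
    {"airline": "AA", "stops": 0},
    {"airline": "BB", "stops": 1},
    {"airline": "CC", "stops": 0},
]


def books_per_year(books):
    """Count books by publication year (emit the year as key, then sum per key)."""
    return Counter(year for _title, year in books)


def busiest_year(books):
    """Return the year with the largest number of published books."""
    return books_per_year(books).most_common(1)[0][0]


def airports_in(country, airports):
    """Filter airport names by country."""
    return [a["name"] for a in airports if a["country"] == country]


def airlines_with_zero_stops(routes):
    """Return the distinct airlines that have at least one non-stop route."""
    return sorted({r["airline"] for r in routes if r["stops"] == 0})


def country_with_most_airports(airports):
    """Count airports per country and return the country with the most."""
    return Counter(a["country"] for a in airports).most_common(1)[0][0]


if __name__ == "__main__":
    print(sorted(books_per_year(BOOKS).items()))
    print(busiest_year(BOOKS))
    print(airports_in("India", AIRPORTS))
    print(airlines_with_zero_stops(ROUTES))
    print(country_with_most_airports(AIRPORTS))
```

In an actual Hadoop solution, the key extraction would typically live in a Mapper and the summing in a Reducer, or the whole task could be written as a grouped query in Hive or Pig.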