Practical Guide to setup Hadoop and Spark Cluster using CDH

Language: English
Total Size: 10.09 GB
Info Hash: b41ba1b86df1729aa312d26bb25147f14ffa75da
Added: 21-09-2022 01:54
Views: 260
Seeds: 2
Leechers: 0
Completed: 129



Description

Cloudera is one of the leading vendors of Hadoop and Spark distributions. In this practical guide, you will learn, step by step, how to set up a Hadoop and Spark cluster using CDH.

Install – Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects.

   Set up a local CDH repository
   Perform OS-level configuration for Hadoop installation
   Install Cloudera Manager server and agents
   Install CDH using Cloudera Manager
   Add a new node to an existing cluster
   Add a service using Cloudera Manager

Configure – Perform basic and advanced configuration needed to effectively administer a Hadoop cluster

   Configure a service using Cloudera Manager
   Create an HDFS user's home directory
   Configure NameNode HA
   Configure ResourceManager HA
   Configure proxy for Hiveserver2/Impala

Manage – Maintain and modify the cluster to support day-to-day operations in the enterprise

   Rebalance the cluster
   Set up alerting for excessive disk fill
   Define and install a rack topology script
   Install a new type of I/O compression library in the cluster
   Revise YARN resource assignment based on user feedback
   Commission/decommission a node
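
As a rough illustration of the "rack topology script" item above: Hadoop can be pointed at an executable (via the net.topology.script.file.name property) that receives one or more node addresses as arguments and prints a rack path for each. The sketch below is only an assumption about what such a script might look like; the addresses, rack names, and the RACK_MAP and resolve_racks names are hypothetical and not taken from the course.

```python
#!/usr/bin/env python3
"""Minimal rack topology script sketch (hypothetical mapping, not from the course)."""
import sys

# Hypothetical IP-to-rack mapping; replace with your own cluster inventory.
RACK_MAP = {
    "10.0.0.11": "/dc1/rack1",
    "10.0.0.12": "/dc1/rack1",
    "10.0.0.21": "/dc1/rack2",
}

# Rack path returned for any node that is not in the mapping.
DEFAULT_RACK = "/default-rack"


def resolve_racks(nodes):
    """Return one rack path per input node, in the same order."""
    return [RACK_MAP.get(node, DEFAULT_RACK) for node in nodes]


if __name__ == "__main__":
    # Hadoop passes node addresses as command-line arguments and
    # reads the whitespace-separated rack paths from standard output.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

Run by hand with a few addresses as arguments, the script prints one rack path per address, with unknown nodes falling back to the default rack. A real deployment would more likely load the mapping from a file maintained alongside the cluster inventory.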

Secure – Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices

   Configure HDFS ACLs
   Install and configure Sentry
   Configure Hue user authorization and authentication
   Enable/configure log and query redaction
   Create encrypted zones in HDFS

Test – Benchmark the cluster operational metrics, test system configuration for operation and efficiency

   Execute file system commands via HTTPFS
   Efficiently copy data within a cluster/between clusters
   Create/restore a snapshot of an HDFS directory
   Get/set ACLs for a file or directory structure
   Benchmark the cluster (I/O, CPU, network)

Troubleshoot – Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios

   Resolve errors/warnings in Cloudera Manager
   Resolve performance problems/errors in cluster operation
   Determine reason for application failure
   Configure the Fair Scheduler to resolve application delays

Our Approach

   You will start by creating a Cloudera QuickStart VM (if you have a laptop with 16 GB of RAM and a quad-core processor). This will help you get comfortable with Cloudera Manager.
   You will be able to sign up for GCP and receive up to $300 in credit while the offer lasts. Credits are valid for up to a year.
   You will then get a brief overview of GCP and provision 7 to 8 virtual machines using templates. You will also attach an external hard drive to configure for HDFS later.
   Once the servers are provisioned, you will set up Ansible for server automation.
   You will set up a local repository for Cloudera Manager and the Cloudera Distribution of Hadoop using packages.
   You will then set up Cloudera Manager with a custom database, and then the Cloudera Distribution of Hadoop using the wizard that comes with Cloudera Manager.
   As part of setting up the Cloudera Distribution of Hadoop, you will set up HDFS, learn HDFS commands, set up YARN, configure HDFS and YARN high availability, learn about schedulers, set up Spark, transition to parcels, and set up Hive, Impala, HBase, Kafka, and so on.

Who this course is for:

   System administrators who want to understand the Big Data ecosystem and set up clusters
   Experienced Big Data administrators who want to learn how to manage Hadoop and Spark clusters set up using CDH
   Entry-level professionals who want to learn the basics and set up Big Data clusters

Requirements

   Basic Linux Skills
   A 64-bit computer with a minimum of 4 GB of RAM
   Operating system – Windows 10, Mac, or a Linux flavor

Last Updated 6/2019
