Udemy - Apache Spark SQL – Bigdata In-Memory Analytics Master Course

Language: English
Total Size: 1.78 GB
Info Hash: 7abe4e3f3192b7f993a66e07869530e18a2c4782
Added: 20-08-2019 15:18




Description

This course is designed for everyone from complete beginners to experienced professionals who want to strengthen their Spark SQL skills. The hands-on sessions cover the end-to-end setup of a Spark cluster on AWS and on local systems.

In a data pipeline, whether the source data is structured or unstructured, the final extracted data ends up in structured form, and that is the stage at which analysis happens. SQL is a popular query language for analyzing structured data.
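
As a rough illustration of that flow, the sketch below parses a few unstructured text lines into structured rows and then analyzes them with SQL. It is not Spark code: Python's built-in sqlite3 module stands in for the SQL engine, and the log format, the table name `trips`, and the sample lines are all made up for the example.

```python
import re
import sqlite3

# Hypothetical raw input: free-text lines, one of them malformed.
RAW_LINES = [
    "2019-05-01 station=A duration=300",
    "2019-05-01 station=B duration=120",
    "2019-05-02 station=A duration=600",
    "malformed line",
]

# Assumed line format for this example only.
LINE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}) station=(\w+) duration=(\d+)$")


def extract_rows(lines):
    """Extraction step: keep only lines that match the pattern, as typed tuples."""
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            rows.append((match.group(1), match.group(2), int(match.group(3))))
    return rows


def total_duration_by_station(rows):
    """Analysis step: load the structured rows into a table and query it with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE trips (day TEXT, station TEXT, duration INTEGER)")
        conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT station, SUM(duration) FROM trips "
            "GROUP BY station ORDER BY station"
        )
        return cursor.fetchall()
    finally:
        conn.close()


structured = extract_rows(RAW_LINES)
totals = total_duration_by_station(structured)
print(totals)
```

The malformed line is dropped during extraction, so the SQL stage only ever sees rows with a fixed schema.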

Apache Spark facilitates distributed in-memory computing. Spark has a built-in module, Spark SQL, for structured data processing. Users can mix SQL queries with Spark programs, and Spark SQL integrates seamlessly with Spark's other constructs.

Spark SQL can load and write data from various sources such as RDBMSs, NoSQL databases, and cloud storage like S3, and it easily handles different data formats such as Parquet, Avro, JSON, and many more.

Spark provides two types of APIs:

Low-level API: RDDs

High-level API: DataFrames and Datasets

Spark SQL works well with other Spark components such as Spark Streaming, Spark Core, and GraphX, because the high-level and low-level APIs are well integrated.

The initial part of the course is an introduction to the Lambda Architecture and the big data ecosystem. The remaining sections concentrate on reading and writing data between Spark and various data sources.

DataFrames and Datasets are the basic building blocks of Spark SQL. We will learn how to work with transformations and actions on RDDs, DataFrames, and Datasets.
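
The difference between transformations and actions is that transformations are lazy (they only describe work) while actions trigger the computation. The sketch below mimics that behavior in plain Python. It is an analogy, not the Spark API: `LazyDataset` and its `log` attribute are hypothetical names invented for this example, and a real RDD or DataFrame also distributes the work across a cluster.

```python
from typing import Callable, Iterator, List


class LazyDataset:
    """Toy stand-in for an RDD/DataFrame: transformations are recorded, actions run them."""

    def __init__(self, make_iter: Callable[[], Iterator[int]], log: List[str]) -> None:
        self._make_iter = make_iter  # builds a fresh iterator each time an action runs
        self.log = log               # shared record of when work actually happens

    @classmethod
    def from_list(cls, items: List[int]) -> "LazyDataset":
        return cls(lambda: iter(items), [])

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Transformation: returns a new dataset and does no work yet.
        def make() -> Iterator[int]:
            for x in self._make_iter():
                self.log.append(f"map({x})")
                yield fn(x)
        return LazyDataset(make, self.log)

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        # Transformation: also lazy.
        def make() -> Iterator[int]:
            for x in self._make_iter():
                self.log.append(f"filter({x})")
                if pred(x):
                    yield x
        return LazyDataset(make, self.log)

    def collect(self) -> List[int]:
        # Action: runs the whole chain and returns the results.
        return list(self._make_iter())

    def count(self) -> int:
        # Action: runs the whole chain and returns the number of elements.
        return sum(1 for _ in self._make_iter())


numbers = LazyDataset.from_list([1, 2, 3, 4])
evens_squared = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)

work_before_action = len(evens_squared.log)  # nothing has been computed yet
result = evens_squared.collect()             # the action triggers the computation
print(work_before_action, result)
```

Chaining `filter` and `map` only builds up a description of the work; the log stays empty until `collect` or `count` is called.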

We will also cover optimizing tables with partitioning and bucketing.
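
To give a feel for what these two techniques do, here is a minimal plain-Python sketch of how rows could be laid out on disk. It is an illustration under stated assumptions, not Spark code: partitioning is shown as one `column=value` directory per distinct value, and bucketing as a hash of a column modulo a fixed bucket count. The rows, column names, bucket count, and `bucket_NNNNN` file names are hypothetical, and CRC32 is used only as a stable stand-in for the hash function Spark uses internally.

```python
import zlib
from collections import defaultdict
from typing import Dict, List

NUM_BUCKETS = 4  # hypothetical bucket count

# Made-up sample rows for illustration only.
ROWS = [
    {"city": "San Jose", "station_id": 2, "duration": 300},
    {"city": "San Jose", "station_id": 5, "duration": 120},
    {"city": "San Francisco", "station_id": 2, "duration": 600},
    {"city": "San Francisco", "station_id": 9, "duration": 240},
    {"city": "San Jose", "station_id": 2, "duration": 180},
]


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a key to a bucket number in [0, num_buckets) with a stable hash."""
    # Stand-in hash: CRC32 gives the same value on every run, unlike hash() on str.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def layout(rows: List[dict], partition_col: str, bucket_col: str,
           num_buckets: int = NUM_BUCKETS) -> Dict[str, List[dict]]:
    """Group rows by partition directory and bucket file."""
    files: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        bucket = bucket_for(str(row[bucket_col]), num_buckets)
        path = f"{partition_col}={row[partition_col]}/bucket_{bucket:05d}"
        files[path].append(row)
    return dict(files)


def paths_for_partition(files: Dict[str, List[dict]], partition_col: str,
                        value: str) -> List[str]:
    """Partition pruning: a filter on the partition column only touches one directory."""
    prefix = f"{partition_col}={value}/"
    return sorted(path for path in files if path.startswith(prefix))


files = layout(ROWS, partition_col="city", bucket_col="station_id")
for path in sorted(files):
    print(path, len(files[path]))
print(paths_for_partition(files, "city", "San Jose"))
```

Because rows with the same `station_id` always hash to the same bucket number, a join or aggregation on that column can work bucket by bucket, while a filter on `city` can skip every other partition directory entirely.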

To aid understanding of data processing, the following use cases are included to illustrate the complete data flow:

1) NHL Dataset Analysis

2) Bay Area Bike Share Dataset Analysis

Who this course is for:

   Beginners who want to get started with Spark SQL on Apache Spark
   Data analysts and big data analysts
   Those who want to leverage in-memory computing on structured data

Requirements

   Introductory knowledge of the big data ecosystem
   Basics of SQL

Last updated 5/2019
