Data Engineering Essentials - SQL, Python and Spark

Build Data Engineering Pipelines using SQL, Python and Spark



Platform: Udemy
Status: Available
Duration: 38 Hours

Price: $0.00 (regular price $129.99)



What you'll learn

  • Set up a Development Environment on GCP
  • Database Essentials using Postgres
  • Programming Essentials using Python
  • Data Engineering using Spark DataFrame APIs
  • Data Engineering using Spark SQL

Requirements

  • Laptop with a decent configuration (minimum 4 GB RAM and a dual-core processor)
  • Free GCP sign-up with the available credit
  • A CS or IT degree or prior IT experience is highly desirable

Description

As part of this course, you will learn the Data Engineering essentials needed to build Data Pipelines using SQL, Python, and Spark.

About Data Engineering

Data Engineering is the processing of data to meet downstream needs. As part of Data Engineering, we build different kinds of pipelines, such as Batch Pipelines and Streaming Pipelines. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, these roles are known as ETL Development, Data Warehouse Development, etc.
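
As a rough illustration of what a Batch Pipeline does, the sketch below takes a fixed set of records, filters them, and aggregates them into the shape a downstream consumer might need. The record layout, the sample data, and the `run_batch_pipeline` name are hypothetical and are not taken from the course material.

```python
from collections import defaultdict

# Hypothetical sample input; a real pipeline would read from files or a database.
ORDERS = [
    {"order_id": 1, "status": "COMPLETE", "amount": 100.0},
    {"order_id": 2, "status": "PENDING", "amount": 50.0},
    {"order_id": 3, "status": "COMPLETE", "amount": 25.5},
    {"order_id": 4, "status": "CANCELED", "amount": 10.0},
]


def run_batch_pipeline(records, wanted_statuses):
    """Keep records with a wanted status, then total the amount per status."""
    totals = defaultdict(float)
    for record in records:
        if record["status"] in wanted_statuses:
            totals[record["status"]] += record["amount"]
    return dict(totals)


revenue_by_status = run_batch_pipeline(ORDERS, {"COMPLETE", "PENDING"})
print(revenue_by_status)
```

A Streaming Pipeline would apply similar logic to records as they arrive instead of to a complete set at once.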

Course Details

As part of this course, you will learn Data Engineering essentials such as SQL and programming using Python and Spark. Here is the detailed agenda for the course.

Database Essentials - SQL using Postgres

  • Getting Started with Postgres
  • Basic Database Operations (CRUD: Insert, Select, Update, Delete)
  • Writing Basic SQL Queries (Filtering, Joins and Aggregations)
  • Creating Tables and Indexes
  • Partitioning Tables and Indexes
  • Predefined Functions (String Manipulation, Date Manipulation and other functions)
  • Writing Advanced SQL Queries

Programming Essentials using Python

  • Perform Database Operations
  • Getting Started with Python
  • Basic Programming Constructs
  • Predefined Functions
  • Overview of Collections - list and set
  • Overview of Collections - dict and tuple
  • Manipulating Collections using loops
  • Understanding Map Reduce Libraries
  • Overview of Pandas Libraries
  • Database Programming - CRUD Operations
  • Database Programming - Batch Operations
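
To give a feel for the last two topics, here is a minimal sketch of CRUD and batch operations from Python. It is an assumption-laden illustration: the course works with Postgres, but this sketch uses the standard library's `sqlite3` module with an in-memory database so that it runs without a server, and the `users` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)

# Create: insert a single row using bound parameters.
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    ("Ann", "ann@example.com"),
)

# Batch operation: insert several rows with one executemany call.
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [("Bob", "bob@example.com"), ("Cy", "cy@example.com")],
)

# Update and Delete.
conn.execute(
    "UPDATE users SET email = ? WHERE name = ?",
    ("ann@new.example.com", "Ann"),
)
conn.execute("DELETE FROM users WHERE name = ?", ("Cy",))
conn.commit()

# Read: fetch the remaining rows in insertion order.
rows = conn.execute(
    "SELECT name, email FROM users ORDER BY user_id"
).fetchall()
print(rows)
conn.close()
```

Against Postgres the same statements would go through a Postgres driver, and the parameter placeholder style would depend on that driver.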

Setting up Single Node Cluster for Practice

  • Set up a Single Node Hadoop Cluster
  • Set up Hive and Spark on the Single Node Cluster
  • Introduction to the Hadoop Ecosystem
  • Overview of HDFS Commands

Data Engineering using Spark SQL

  • Getting Started with Spark SQL
  • Basic Transformations
  • Managing Tables - Basic DDL and DML
  • Managing Tables - DML and Partitioning
  • Overview of Spark SQL Functions
  • Windowing Functions

Data Engineering using Spark DataFrame APIs

  • Data Processing Overview
  • Processing Column Data
  • Basic Transformations - Filtering, Aggregations and Sorting
  • Joining Data Sets
  • Windowing Functions - Aggregations, Ranking and Analytic Functions
  • Spark Metastore Databases and Tables

Desired Audience

Here is the desired audience for this course.

College students and entry-level professionals who want hands-on expertise in Data Engineering. This course provides enough skills to face interviews for entry-level Data Engineer positions.

Experienced application developers who want to gain expertise in Data Engineering.

Conventional Data Warehouse Developers, ETL Developers, Database Developers and PL/SQL Developers who want to gain enough skills to transition successfully into Data Engineering roles.

Testers who want to improve their testing capabilities for Data Engineering applications.

Any other hands-on IT professionals who want to learn Data Engineering through hands-on practice.

Prerequisites

Logistics

  • Computer with a decent configuration (at least 4 GB RAM; 8 GB is highly desirable)
  • A dual-core processor is required; a quad-core processor is highly desirable
  • Chrome Browser
  • High Speed Internet

Desired Background

  • Engineering or Science Degree
  • Ability to use a computer
  • Knowledge of, or working experience with, databases and any programming language is highly desirable

Who this course is for:

  • Computer Science or IT students, or other graduates with a passion for getting into IT
  • Data Warehouse Developers who want to transition to Data Engineering roles
  • ETL Developers who want to transition to Data Engineering roles
  • Database or PL/SQL Developers who want to transition to Data Engineering roles
  • BI Developers who want to transition to Data Engineering roles
  • QA Engineers to learn about Data Engineering
  • Application Developers to gain Data Engineering Skills