Data Engineering course 2023 Syllabus

Scope of studies

44 hours via Zoom:

  1. 15 weekly sessions, 2 hours per session (30 hours).
  2. Each session will include:
    1. Reviewing the previous session's home assignment.
    2. Learning new topics and discussing the next home assignment.
  3. Final project: 14 hours.

Course description

Developments in information technology enable the collection and processing
of huge amounts of data from a wide variety of sources.
The Data Engineer is typically in charge of managing data workflows, pipelines,
and ETL processes so that
business entities can obtain important business insights
and significantly improve business performance.

The purpose of this course is to teach students the principles of data engineering
in a practical way that will allow them to start working in the field.

Lecturer

Omid Vahdaty, CTO, Jutomate.
A Big Data Ninja and a meetup organizer.
Omid has over 20 years of experience helping build systems from the ground up,
in startups at all stages from seed to exit (SQream, Jajah, etc.) and in big media
organizations (Walla News, Investing.com, etc.).

He specializes in Big Data architecture, product innovation, and strategic engineering thinking,
designing systems for a startup environment: agile, cost-effective, and with a fast learning curve.

Omid's LinkedIn: https://www.linkedin.com/in/omid-vahdaty/
Personal blog: https://big-data-demystified.ninja/

The course is designed for

People who work in the high-tech sector and want to acquire new skills in the world of data,
either to broaden their knowledge or to change their
specialization.

Technical information

The course is limited to a maximum of 30 students.
Live lectures via Zoom.
Home assignments: watching videos and doing exercises.
Students will need a laptop or desktop for the exercises.
A personal Gmail account is required.
A GCP account is required.
Most of the work will be hands-on (in Linux environments with no GUI).

Course plan:

Subject                              Module   Hours

Intro                                1        2
Linux, Containers, Windows           2        2
GCS + Storage + File System + GCE    3        2
SQL and Big Data Impact              4        2
SQL via BigQuery                     5        2
Redshift                             6        2
Athena & Glue                        7        2
Hadoop overview                      8        2
Python Basics                        9        2
Airflow Basics                       10       2
Airflow Advanced                     11       2
Architecture & Big Data              12       2
Planning Final project               13       2
Project: student implementation      -        14
Quality review Final project         14       2
Presentation Final project           15       2

Sum Total                                     44