Data Engineering course 2023 Syllabus
Scope of studies
44 hours via Zoom:
- 15 weekly sessions, 2 hours per session.
- Each session will include:
  - a review of the previous session's home assignment.
  - new topics and a discussion of the next home assignment.
- Final project: 14 hours.
Course description
Information technology developments enable the collection and processing
of huge amounts of data from a wide variety of sources.
The Data Engineer is typically in charge of managing data workflows, pipelines,
and ETL processes so that business entities can gain important business insights
and significantly improve business performance.
The purpose of this course is to teach students the principles of data engineering
in a practical way that will allow them to start working in the field.
Lecturer
Omid Vahdaty, CTO at Jutomate.
A Big Data Ninja and a meetup organizer.
Omid has over 20 years of experience helping build systems from the ground up,
in startups at all stages from seed to exit (SQream, Jajah, etc.) and in big media
organizations (Walla News, Investing.com, etc.).
He specializes in big data architecture, product innovation, and strategic engineering thinking,
designing systems for a startup environment: agile, cost-effective, and with a fast learning curve.
Omid’s LinkedIn: https://www.linkedin.com/in/omid-vahdaty/
Personal blog: https://big-data-demystified.ninja/
The course is designed for
People who work in the high-tech sector and want to acquire new skills in the world of data, either to broaden
their knowledge or to change their specialization.
Technical information
The course is limited to a maximum of 30 students.
Frontal lectures via Zoom.
Home assignments: watching videos and completing exercises.
Students will need a laptop or desktop for the exercises.
A personal Gmail account is required.
A GCP account is required.
Most of the work will be hands-on (in Linux environments with no GUI).
Course plan:
Subject | Module | Hours |
Intro | 1 | 2 |
Linux, Containers, Windows | 2 | 2 |
GCS + Storage + File System + GCE | 3 | 2 |
SQL and Big Data Impact | 4 | 2 |
SQL via BigQuery | 5 | 2 |
Redshift | 6 | 2 |
Athena & Glue | 7 | 2 |
Hadoop overview | 8 | 2 |
Python Basics | 9 | 2 |
Airflow basics | 10 | 2 |
Airflow Advanced | 11 | 2 |
Architecture & Big Data | 12 | 2 |
Planning Final project | 13 | 2 |
Project: student implementation | - | 14 |
Quality review Final project | 14 | 2 |
Presentation Final project | 15 | 2 |
Total | | 44 |