architecture

Advanced ETL Demystified

Lecturer: Omid Vahdaty 18.10.2021

PySpark's advantages over traditional ETL, plus advanced techniques for parsing JSON-based data sets at very large scale.

Video

Slides


——————————————————————————————————————————
I put a lot of thought into these blog posts so that I can share the information in a clear and useful way.
If you have any comments, thoughts, or questions, or you need someone to consult with,

feel free to contact me via LinkedIn – Omid Vahdaty:

Uncategorised

Full devops automation for microservices in AWS using Simloud platform

Lecturer: Assaf Weissblat 13.10.2021

Key concepts for implementing microservices in AWS, and how to use the Simloud platform to automate that implementation.

Video


AWS EMR, dataproc

Jupyter Demystified

Author: Omid Vahdaty 13.10.2021

Following my confusion around the managed Jupyter offerings in AWS and GCP, I put together a short piece of research to clarify the differences between Jupyter Notebook, JupyterHub, and JupyterLab. I also added bootstrap script instructions for managing admin users. Based on this research, I had the pleasure of correcting the official AWS documentation. Let me know if you find this useful!


ETL tools

An Intro to Rivery Through Kits – Plug and Play Data Models

Lecturer: Dan Greenberg 2.9.2021

Rivery is a new-age SaaS ELT platform that helps companies streamline analytics in their data warehouse of choice. In this session, the audience will learn how to deploy pre-built data models through a UI-based approach. Gone are the days when ETL developers had to spend hours or days building their data pipelines. In Rivery, we’ve developed a feature called Kits to help data teams instantly create powerful data pipelines, automated data transformations, and pre-defined tables for common data sources and use cases.

The session will start with a high-level overview of the types of data pipelines that can be built within Rivery. From there, we will explore an example Kit to show this capability in action in a real-world setting.

About the Lecturer: Dan Greenberg leads the Sales & Partnerships team at Rivery. He’s been with the company for almost 4 years, and before that he spent time in the data analytics space at Keyrus and IBM.

Video

Slides


airflow

Airflow Performance and Best Practices Demystified

Lecturer: Omid Vahdaty 4.5.2021

Airflow architecture, performance tuning for an unstable cluster, cost implications, the configuration options available for resolving unusual Airflow issues, and how to use CloudWatch to monitor Airflow performance.

Video

Slides

