This is my suggestion for training new data engineers on Python basics. I am a strong believer in the “cut the bullshit and give me what I need” approach, and in learning by example. However, bear in mind that, at the right moment, you are expected to have a deeper understanding of the code you are writing.
I took the liberty of compiling a list of subjects, all focused on data engineering tasks. This blog will be updated from time to time. I used the free online course by Microsoft, which is quite good at covering all the basics. Note that you may need an IDE such as PyCharm to write your code.
If you prefer to read a step-by-step Python manual, start with this excellent Python tutorial.
If you are already a developer and need a quick one-hour tutorial of basic Python examples, you may want something quicker. For that, I have created a Colab notebook with Python basics. Be sure to understand the difference between the PyCharm IDE and Colab.
All the Python code examples and basic Python subjects a data engineer needs to get started:
1. Python basics such as:
print("Hello World")
More basic Python commands we are going to cover here: if/else conditionals; data types such as int, string, date, array, and list; casting of data types; string manipulation; and date manipulation.
Quick start by reviewing these code examples: string manipulation, string to date.
If you want verbal explanations, watch these lectures by Microsoft:
Working with dates in Python
Conditionals in Python
Python collections (lists and arrays)
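To make the basics above concrete, here is a minimal sketch that touches casting, conditionals, string manipulation, string-to-date conversion, and lists in one place. The file name, the `size_bucket` function, and its thresholds are made up for illustration and are not taken from the linked examples.

```python
from datetime import datetime, timedelta

# Casting between data types
row_count = int("42")          # string -> int
price = float("19.90")         # string -> float
label = str(row_count)         # int -> string


# if / elif / else conditional (thresholds are arbitrary, for illustration only)
def size_bucket(n):
    """Classify a row count into a rough size bucket."""
    if n == 0:
        return "empty"
    elif n < 1000:
        return "small"
    else:
        return "large"


# String manipulation on a hypothetical file name
raw = "  Events_2020-01-31.CSV "
clean = raw.strip().lower()                          # trim spaces, lower-case
table, date_part = clean.replace(".csv", "").split("_")

# String to date, date arithmetic, and date back to string
run_date = datetime.strptime(date_part, "%Y-%m-%d")
next_day = run_date + timedelta(days=1)
partition = next_day.strftime("%Y%m%d")

# Lists
tables = ["users", "orders"]
tables.append(table)

print(size_bucket(row_count), table, partition, tables)
```

The `strptime` / `strftime` pair is the part worth memorizing: the first turns text into a date using a format string, the second turns a date back into text.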
2. Python loops and functions:
Start by reading an example use case: code that browses a local folder and its files. If it is still not clear and intuitive, watch the really short and to-the-point videos by Microsoft:
Python functions
Calling APIs
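As a rough illustration of the folder-browsing use case, the sketch below combines a function with nested `for` loops over `os.walk`. The function name `list_files` and the `.csv` default are assumptions for this example, not the code from the linked post.

```python
import os


def list_files(folder, extension=".csv"):
    """Return sorted paths of files under `folder` (recursively) ending with `extension`."""
    matches = []
    # os.walk yields one (directory, subdirectories, files) tuple per directory
    for dirpath, dirnames, filenames in os.walk(folder):
        for name in filenames:
            # compare case-insensitively so "DATA.CSV" is also picked up
            if name.lower().endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Scan the current working directory and print each file with its size
    for path in list_files("."):
        print(path, os.path.getsize(path), "bytes")
```

Keeping the loop inside a function makes it easy to reuse the same logic for other folders or extensions, for example `list_files("/tmp", ".json")`.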
3. Python input/output:
A very useful example is command line arguments: use case of Python command line arguments. This comes up often in the context of Airflow, so be sure to understand it as a data engineer.
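A minimal sketch of command line arguments using the standard library `argparse` module follows. The argument names (`--run-date`, `--table`, `--dry-run`) are hypothetical; a scheduler such as Airflow would typically pass values like these when it launches the script.

```python
import argparse


def parse_args(argv=None):
    """Parse command line arguments; argv=None means read them from sys.argv."""
    parser = argparse.ArgumentParser(description="Load one day of data (illustrative).")
    parser.add_argument("--run-date", required=True, help="Date to process, YYYY-MM-DD")
    parser.add_argument("--table", default="events", help="Target table name")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print what would happen without doing it")
    return parser.parse_args(argv)


# The explicit list stands in for what the shell would pass on the command line
args = parse_args(["--run-date", "2020-01-31", "--dry-run"])
print(args.run_date, args.table, args.dry_run)
```

In a real script you would call `parse_args()` with no argument and run it as, for example, `python load.py --run-date 2020-01-31`, where `load.py` is a placeholder name.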
3.1 Handling JSON in Python
As a data engineer, a huge chunk of your time will be spent parsing JSON. Below are quick explanations of how to handle JSON in Python.
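Here is a small sketch of the usual round trip with the standard library `json` module: text to dictionary, pick out nested fields, then dictionary back to text. The record and its field names are invented for illustration.

```python
import json

# A hypothetical record, as it might arrive from an API or a log file
raw = ('{"user": {"id": 7, "tags": ["a", "b"]}, '
       '"events": [{"type": "click", "ts": "2020-01-31T10:00:00"}]}')

record = json.loads(raw)                                # JSON string -> dict

user_id = record["user"]["id"]                          # nested access
country = record["user"].get("country", "unknown")      # .get avoids a KeyError
event_types = [event["type"] for event in record["events"]]

record["processed"] = True
as_text = json.dumps(record, indent=2, sort_keys=True)  # dict -> JSON string

print(user_id, country, event_types)
```

Using `.get` with a default is a common habit when parsing real-world JSON, since optional fields are often missing.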
4. Recursion
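One place recursion shows up in data engineering is flattening nested records before loading them into a table. The sketch below is one possible way to do it; the `flatten` function and the dotted key naming are assumptions made for this example.

```python
def flatten(obj, prefix=""):
    """Flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recursive step: the function calls itself on the nested dict
            flat.update(flatten(value, name))
        else:
            # Base case: a plain value is stored as-is
            flat[name] = value
    return flat


nested = {"user": {"id": 7, "geo": {"country": "IL"}}, "active": True}
print(flatten(nested))
```

Every recursive function needs a base case that stops the recursion; here it is any value that is not a dictionary.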
5. Python exception handling
An example of exception handling should be trivial; if not, watch the videos below.
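For reference, a minimal sketch of `try` / `except` / `finally` is shown here. The helper names `to_int` and `read_first_line` are hypothetical and chosen only to show two common cases: bad input values and missing files.

```python
def to_int(value, default=None):
    """Convert value to int; return `default` instead of raising on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError) as err:
        # Catch only the exceptions we expect, and report what went wrong
        print(f"could not convert {value!r}: {err}")
        return default


def read_first_line(path):
    """Return the first line of a text file, or None if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.readline().rstrip("\n")
    except FileNotFoundError:
        return None
    finally:
        # finally runs whether or not an exception was raised
        print(f"finished attempt to read {path}")


print(to_int("12"), to_int("abc", default=0))
```

Catching specific exception types, rather than a bare `except:`, keeps unexpected bugs visible instead of silently swallowing them.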
6. Python example of hashing via SHA (strictly speaking, SHA is a one-way hash rather than reversible encryption), also a useful use case in data engineering.
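A short sketch with the standard library `hashlib` module is shown here, for instance to mask an identifier before loading it into a warehouse. The `sha256_hex` helper and its optional `salt` parameter are assumptions for this example, and the e-mail address is a placeholder.

```python
import hashlib


def sha256_hex(text, salt=""):
    """Return the SHA-256 digest of salt + text as a 64-character hex string."""
    # hashlib works on bytes, so the string is encoded first
    return hashlib.sha256((salt + text).encode("utf-8")).hexdigest()


masked = sha256_hex("user@example.com")
print(len(masked))

# The same input always gives the same digest, so masked values can still be joined on
print(sha256_hex("user@example.com") == masked)
```

Because a hash cannot be reversed, this is suited to masking or deduplication, not to data you need to read back later.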
7. Handling modules and packages in Python: sometimes you need to install packages to get started.
Virtualenv in Python is a very useful use case when working with Airflow, as each ETL/script may require different imports.
8. Python DataFrames / pandas: yes, it is a complex subject with a world of use cases, but you may want to get yourself familiar with it.
9. GCP Python packages: BigQuery and GCS examples
10. AWS packages: Boto3. TBD.
11. Visualization packages such as seaborn, and notebooks such as Google Colab
https://colab.research.google.com/notebooks/intro.ipynb#recent=true
12. Connecting to a DB inside a Python script: use case of connecting to a DB via SQLAlchemy
13. Saving a string to a file in Python
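Saving a string to a file takes only a few lines. The sketch below is one way to do it; the `save_text` helper and the output path are hypothetical.

```python
from pathlib import Path


def save_text(path, text):
    """Write `text` to `path` as UTF-8, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # "w" overwrites an existing file; use "a" to append instead
    with open(target, "w", encoding="utf-8") as handle:
        handle.write(text)
    return target


if __name__ == "__main__":
    saved = save_text("output/report.txt", "rows loaded: 42\n")
    print(saved.read_text(encoding="utf-8"))
```

The `with` block closes the file automatically, even if the write raises an exception.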
14. Airflow-related Python examples … TBD
15. TensorFlow … TBD
A few other Python examples to get you started are committed in our Git repository:
https://github.com/omidvd79/Big_Data_Demystified/tree/master/python_basic
https://github.com/omidvd79/Big_Data_Demystified/tree/master/python_gcp_examples
——————————————————————————————————————————
I put a lot of thought into these blogs so that I could share the information in a clear and useful way. If you have any comments, thoughts, or questions, or you need someone to consult with, feel free to contact me: