Airflow SequentialExecutor Installation on CentOS 7.6

Author: Omid Vahdaty, 13.8.2020

This is a step-by-step manual for installing Airflow with the SequentialExecutor on CentOS 7.6.

If you need to install Airflow on Ubuntu 18, use this blog instead.

Getting started

sudo yum install -y git
git clone https://github.com/omidvd79/Big_Data_Demystified.git
sudo sh ~/Big_Data_Demystified/linux/centos7/install_pip.sh
  • Install Python – minimum version 3.5; I used 3.7. The script below (also in the repo) builds Python 3.7.2 from source:

https://github.com/omidvd79/Big_Data_Demystified/blob/master/linux/centos7/install_python3_7_cento7.sh

#install python 3.7
# build dependencies
sudo yum install -y gcc openssl-devel bzip2-devel libffi-devel wget
# download and unpack the source
cd ~/
wget https://www.python.org/ftp/python/3.7.2/Python-3.7.2.tgz
tar xzf ~/Python-3.7.2.tgz
cd Python-3.7.2
# build and install as python3.7 (altinstall leaves the system python untouched)
./configure --enable-optimizations
sudo make altinstall

# verify the installation
python3.7 -V
  • Install Airflow using the setup scripts from the cloned repo:
cd ~/Big_Data_Demystified/airflow/setup/
cp * ~/
sudo sh install_airflow_centos7.sh
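I have not reproduced install_airflow_centos7.sh here; check the repo for the exact steps. Roughly, a SequentialExecutor setup with the Python 3.7 built above boils down to something like this sketch (the pip invocation is my assumption, not the script's actual contents):

# sketch only - the real steps are in install_airflow_centos7.sh
sudo python3.7 -m pip install apache-airflow    # SequentialExecutor is Airflow's default executor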
  • Create an optional logs folder in the home folder, and the dags folder (see the note on AIRFLOW_HOME after these commands):
cd ~/
mkdir gs_logs
mkdir -p airflow/dags
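Airflow defaults AIRFLOW_HOME to ~/airflow and looks for DAGs in $AIRFLOW_HOME/dags, so the dags folder created above is picked up automatically. If you prefer to make that explicit, an optional sketch:

export AIRFLOW_HOME=~/airflow
echo 'export AIRFLOW_HOME=~/airflow' >> ~/.bashrc    # persist for future shells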
  • Start Airflow for the first time by running initdb, the webserver, and the scheduler (see the note after these commands for running them in the background):
# initialise the database; note this is only needed ONCE, at setup time :)
airflow initdb
# start the web server, default port is 8080
airflow webserver -p 8080
#start the scheduler
airflow scheduler
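The webserver and scheduler commands above each block the terminal. If you want to keep them running in the background instead, something like this works (a sketch; I reuse the optional ~/gs_logs folder created earlier for the log files):

nohup airflow webserver -p 8080 > ~/gs_logs/webserver.log 2>&1 &
nohup airflow scheduler > ~/gs_logs/scheduler.log 2>&1 &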
  • After the Airflow web UI is up, don't forget to add the GCP-related variables under Airflow web >> Admin >> Variables (a CLI alternative follows the table):
Key           Value
gce_zone      us-central1-a
gcp_project   myProjectID
gcs_bucket    gs://airflow_gcs_bucket
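On Airflow 1.10.x the same variables can also be set from the command line (a sketch using the example values from the table above):

airflow variables --set gce_zone us-central1-a
airflow variables --set gcp_project myProjectID
airflow variables --set gcs_bucket gs://airflow_gcs_bucket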
  • Another thing to remember in GCP: you need to specify the default project in the BigQuery connection. Go to Airflow web >> Admin >> Connections >> bigquery_default >> Project Id, and set it to your project ID, e.g.:
myProjectID
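If you prefer the CLI, the Airflow 1.10.x connections command cannot update a connection in place, so the usual pattern is to delete and recreate the default one (a sketch; the conn_extra key name is what the GCP hook expects, and myProjectID is a placeholder for your own project ID):

airflow connections --delete --conn_id bigquery_default
airflow connections --add --conn_id bigquery_default --conn_type google_cloud_platform --conn_extra '{"extra__google_cloud_platform__project": "myProjectID"}'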


——————————————————————————————————————————
I put a lot of thought into these blogs, so I could share the information in a clear and useful way.
If you have any comments, thoughts, or questions, or you need someone to consult with,

feel free to contact me via LinkedIn:
