Airflow SequentialExecutor Installation manual and basic commands

Author: Omid Vahdaty 15.8.2018

I used the commands below. It took me several attempts, so I list here the commands that worked for me.

  1. Create a new machine on GCE or EC2 with minimal resources, preferably free tier. If you are using GCE, make sure the Cloud API access scopes have the BQ and GCS APIs enabled. If you are using AWS, make sure the machine has a role with the required permissions. [I highly recommend a disk size of at least 200GB, as the Airflow logs folder fills up quickly, causing Airflow to crash.]
  2. Install git
sudo apt-get -y install git

3. Clone the Big Data Demystified git repository

git clone

4. Copy the airflow installation script to home folder and run it

cd Big_Data_Demystified/
cd airflow/
cd setup/
cp * ~/
cd ~/
sudo sh
#this script will also work on GCP Debian OS

5. Optionally, create a logs folder in the home folder, and create the dags folder.

mkdir gs_logs
mkdir -p airflow/dags

6. Start Airflow for the first time by running initdb, webserver and scheduler:

# initialise the database; notice this is run only ONCE, at setup time 🙂
airflow initdb
# start the web server, default port is 8080
airflow webserver -p 8080
#start the scheduler
airflow scheduler

You can check out our sh script to start Airflow; notice it was customized to our needs.


Common problem starting the web server:

To stop airflow

#stop server:  Get the PID of the service you want to stop 
ps -eaf | grep airflow
# Kill the process 
kill -9 {PID}
#or in one command (ubuntu):
pkill airflow
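
Since `kill -9` gives the process no chance to shut down cleanly, it can be worth trying SIGTERM first and forcing the kill only if that fails. A minimal sketch, assuming a POSIX shell; `stop_pid` is a hypothetical helper name and not part of Airflow:

```shell
#!/bin/sh
# stop_pid PID [TIMEOUT_SECONDS]
# Hypothetical helper: ask a process to exit with SIGTERM, wait up to
# TIMEOUT_SECONDS (default 10), and only then fall back to SIGKILL.
stop_pid() {
    pid=$1
    timeout=${2:-10}

    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is not running"
        return 1
    fi

    kill -TERM "$pid"
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # the process has exited
        sleep 1
        waited=$((waited + 1))
    done

    # Still alive after the grace period: force it.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

Usage would be `stop_pid {PID}`, with the PID taken from the `ps -eaf | grep airflow` output above.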

Go over the config file

nano airflow/airflow.cfg

Notice where the logs and DAGs folders are located:

base_log_folder = /home/omid/airflow/logs
dags_folder = /home/omid/airflow/dags
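
Because the logs folder fills up quickly, it may help to read its location from the config and prune old files from time to time. The following is only a sketch: `cfg_get` and `clean_old_logs` are hypothetical helper names, the 30-day retention is an arbitrary assumption, and the config path should be adjusted to your install.

```shell
#!/bin/sh
# cfg_get KEY FILE
# Print the value of the first "KEY = value" line in an airflow.cfg-style file.
cfg_get() {
    sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# clean_old_logs DIR DAYS
# Delete regular files under DIR that were last modified more than
# roughly DAYS days ago (find's -mtime +DAYS).
clean_old_logs() {
    [ -d "$1" ] || { echo "no such directory: $1" >&2; return 1; }
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}

# Example (30 days is an arbitrary choice):
#   log_dir=$(cfg_get base_log_folder "$HOME/airflow/airflow.cfg")
#   clean_old_logs "$log_dir" 30
```

Running the example from cron would keep the disk from filling up, at the cost of losing older task logs.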

Make sure HTTP port 8080 is open on the machine via GCP/AWS. Instructions for GCE:

Enter GCE and choose your instance
Right-click on the 3 dots (in the right corner of the instance row on GCE)
View Network Details
add your IP and remove 
change the HTTP port to 8080
Check connectivity (based on the External IP):
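
One way to check connectivity from your own machine is a quick curl call against the external IP. This is a sketch that assumes curl is installed; `airflow_url` and `check_airflow_web` are hypothetical helper names, and 203.0.113.10 is a placeholder address, not a real instance.

```shell
#!/bin/sh
# airflow_url HOST [PORT]
# Build the web UI address; the port defaults to 8080.
airflow_url() {
    echo "http://$1:${2:-8080}/"
}

# check_airflow_web HOST [PORT]
# Succeeds if anything answers HTTP on that address within 5 seconds.
check_airflow_web() {
    curl --silent --output /dev/null --max-time 5 "$(airflow_url "$1" "$2")"
}

# Example with a placeholder address (replace with your instance's external IP):
#   check_airflow_web 203.0.113.10 8080 && echo reachable || echo "not reachable"
```

If the check fails while the webserver is running, the firewall rule or the allowed source IP is the usual suspect.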

On an AWS EC2 machine:

Change the IP/port in the security groups of the EC2 instance

To avoid permission problems between different Linux users, you might want to consider GCS FUSE on GCP machines. I assume the DAGs are located in the bucket named below. This also decouples your DAGs from the instance and generally makes uploading new DAGs easier.
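
A mount along these lines might look like the sketch below. It assumes gcsfuse is already installed; `mount_dags_bucket` is a hypothetical helper name, and the bucket name and mount point in the example are placeholders rather than the author's actual values.

```shell
#!/bin/sh
# mount_dags_bucket BUCKET MOUNT_POINT
# Mount a GCS bucket with gcsfuse so the DAGs folder lives in the bucket.
# With DRY_RUN=1 the command is printed instead of executed.
mount_dags_bucket() {
    bucket=$1
    mount_point=$2
    # --implicit-dirs: show folders that exist only as object-name prefixes.
    # -o allow_other:  let Linux users other than the mounting one access the
    #                  mount (non-root mounts may also need user_allow_other
    #                  in /etc/fuse.conf).
    set -- gcsfuse --implicit-dirs -o allow_other "$bucket" "$mount_point"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        mkdir -p "$mount_point" && "$@"
    fi
}

# Example with placeholder names:
#   mount_dags_bucket my-dags-bucket "$HOME/airflow/dags"
```

The `allow_other` option is what addresses the multi-user permission issue; without it only the mounting user can read the DAGs.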



After the Airflow web UI is up, don't forget to add the GCP-related variables in Airflow web >> Admin >> Variables:

gce_zone	us-central1-a	
gcp_project	myProjectID	
gcs_bucket	gs://airflow_gcs_bucket
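
Instead of typing the variables into the UI, they can also be loaded from a JSON file. A sketch, assuming the same three values as above; the file name is a hypothetical choice, and the import command depends on your Airflow version, so check it against `airflow variables --help` first.

```shell
#!/bin/sh
# Write the three variables to a JSON file that the Airflow CLI can import.
# VARS_FILE is a hypothetical location; override it if you prefer another path.
VARS_FILE="${VARS_FILE:-./airflow_variables.json}"

cat > "$VARS_FILE" <<'EOF'
{
  "gce_zone": "us-central1-a",
  "gcp_project": "myProjectID",
  "gcs_bucket": "gs://airflow_gcs_bucket"
}
EOF

# Then import the file (pick the line matching your version):
#   Airflow 1.10.x:  airflow variables -i "$VARS_FILE"
#   Airflow 2.x:     airflow variables import "$VARS_FILE"
```

Keeping the file in version control makes it easier to rebuild the instance later.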

Another thing to remember on GCP: you need to specify the default project in the BigQuery connection. Go to Airflow web >> Admin >> Connections >> bigquery_default >> Project Id, and add your project ID as the value.


Airflow user for login (this is the Airflow 2.x syntax; on Airflow 1.10.x the equivalent command is airflow create_user)

airflow users create \
    --username admin \
    --firstname FIRST_NAME \
    --lastname LAST_NAME \
    --role Admin \
    --email


Advanced commands to start / stop Airflow services

  1. Start Web Server
    nohup airflow webserver $* >> ~/airflow/logs/webserver.logs &
  2. Start Celery Workers
    nohup airflow worker $* >> ~/airflow/logs/worker.logs &
  3. Start Scheduler
    nohup airflow scheduler >> ~/airflow/logs/scheduler.logs &
  4. Navigate to the Airflow UI
  5. Start Flower (Optional)
    • Flower is a web UI built on top of Celery, to monitor your workers.
    • nohup airflow flower >> ~/airflow/logs/flower.logs &
  6. Navigate to the Flower UI (Optional)
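
The nohup commands above can be wrapped so that each service also records its PID, which makes stopping it later easier. This is a sketch only: `start_bg` is a hypothetical helper, and the `.pid` files are my own convention here, not something these commands produce by themselves.

```shell
#!/bin/sh
# Folder for log and PID files; defaults to the logs path used above.
LOG_DIR="${LOG_DIR:-$HOME/airflow/logs}"

# start_bg NAME COMMAND [ARGS...]
# Hypothetical helper: run COMMAND under nohup in the background, append its
# output to $LOG_DIR/NAME.logs and record its PID in $LOG_DIR/NAME.pid.
start_bg() {
    name=$1
    shift
    mkdir -p "$LOG_DIR" || return 1
    nohup "$@" >> "$LOG_DIR/$name.logs" 2>&1 &
    echo "$!" > "$LOG_DIR/$name.pid"
}

# Example, mirroring the commands above:
#   start_bg webserver airflow webserver -p 8080
#   start_bg scheduler airflow scheduler
#   kill "$(cat "$LOG_DIR/scheduler.pid")"   # stop the scheduler later
```

Note that the recorded PID is only that of the launched command; child processes it spawns are not tracked.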

Example path of Airflow Dags folder:


More examples can be found in another blog post on this website:


Another good manual:


I put a lot of thought into these blogs so that I could share the information in a clear and useful way. If you have any comments, thoughts, or questions, or you need someone to consult with, feel free to contact me via LinkedIn:
