adamSC

Reputation: 101

Implementing Postgres Sql in Apache Airflow

I have Apache Airflow running on an Ubuntu 18.04.3 server. When I set it up, I used the default SQLite database, which only works with the Sequential Executor. I did this just to play around and get used to the system. Now I'm trying to use the Local Executor, so I need to move my database from SQLite to the recommended PostgreSQL.

Does anybody know how to make this transition? All of the tutorials I've found set up Airflow with PostgreSQL from the beginning. I know there are a ton of moving parts and I'm scared of messing up what I currently have running. If anybody knows how to do this or can point me to where to look, it would be much appreciated. Thanks!

Upvotes: 10

Views: 16144

Answers (3)

Firas Omrane

Reputation: 932

Another option, instead of editing the airflow.cfg file, is to set the environment variable AIRFLOW__CORE__SQL_ALCHEMY_CONN to the connection string of the PostgreSQL server you want.

Example: export AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://user:pass@hostaddress:port/database

Or you can set it in your Dockerfile.
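The variable name follows Airflow's AIRFLOW__{SECTION}__{KEY} convention for overriding airflow.cfg options. Here is a minimal Python sketch of that mapping; the helper name airflow_env_var and the credentials in the URL are placeholders for illustration, not part of Airflow itself.

```python
import os


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name that overrides an airflow.cfg option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_var("core", "sql_alchemy_conn")

# Placeholder credentials and host; replace them with your own.
os.environ[name] = "postgresql+psycopg2://airflow:airflow@localhost:5432/airflow"

print(name)
```

Setting os.environ this way only affects the current Python process and any child processes it starts, so for a real deployment the export line in your shell profile or service definition is usually the better place.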

See documentation here

Upvotes: 1

Marcelo Machado

Reputation: 1397

Just to complete @lalligood's answer with some commands:

In the airflow.cfg file, look for sql_alchemy_conn and update it to point to your PostgreSQL server:

sql_alchemy_conn = postgresql+psycopg2://user:pass@hostaddress:port/database

For instance:

sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
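If the password contains characters such as @ or /, they need to be percent-encoded before going into the URL. A small sketch using only the standard library follows; the function name build_conn_url is hypothetical, and the values passed to it are the example ones from above.

```python
from urllib.parse import quote


def build_conn_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a postgresql+psycopg2 URL, percent-encoding the user and password."""
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql+psycopg2://{safe_user}:{safe_password}@{host}:{port}/{database}"


# Matches the example line above.
print(build_conn_url("airflow", "airflow", "localhost", 5432, "airflow"))

# A password with reserved characters gets encoded instead of breaking the URL.
print(build_conn_url("airflow", "p@ss/word", "localhost", 5432, "airflow"))
```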

As the line above indicates, you need both a user and a database called airflow, so you have to create them. To do so, open your psql command line and run the following commands to create a user called airflow (with the password used in the connection string), create a database called airflow, and grant the user all privileges on that database:

CREATE USER airflow WITH PASSWORD 'airflow';
CREATE DATABASE airflow;
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;

Now you are ready to initialize Airflow against PostgreSQL (on Airflow 2.x the equivalent command is airflow db init):

airflow initdb

If everything went well, open the psql command line again, connect to the airflow database with the \c airflow command, and run \dt to list all tables in that database. You should see a list of Airflow tables; currently there are 23.

Upvotes: 13

lalligood

Reputation: 144

I was able to get it working by doing the following 4 steps:

  1. Assuming that you are starting from scratch, initialize your airflow environment with the SQLite database. The key takeaway here is that this generates the airflow.cfg file.
  2. Update the sql_alchemy_conn line in airflow.cfg to point to your PostgreSQL server.
  3. Create the airflow role + database in PostgreSQL. (Revoke all permissions on the airflow database from PUBLIC & ensure the airflow role owns the airflow database!)
  4. (Re)Initialize airflow (airflow initdb) & confirm that you see ~19 tables in the airflow database.
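Step 2 can be scripted if you prefer not to edit the file by hand. The sketch below rewrites a single option in the text of airflow.cfg while leaving comments and other lines untouched. The helper set_cfg_option and the sample config text are made up for illustration; it also switches the executor setting to LocalExecutor, which is what the question is ultimately after.

```python
import re


def set_cfg_option(cfg_text: str, key: str, value: str) -> str:
    """Replace the first `key = ...` line in cfg_text, keeping everything else as is."""
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", flags=re.MULTILINE)
    # A function replacement avoids backslash-escape surprises in the value.
    new_text, count = pattern.subn(lambda _m: f"{key} = {value}", cfg_text, count=1)
    if count == 0:
        raise KeyError(f"{key} not found in config text")
    return new_text


# Made-up excerpt standing in for a freshly generated airflow.cfg.
sample = """[core]
# The executor class that airflow should use.
executor = SequentialExecutor
sql_alchemy_conn = sqlite:////home/user/airflow/airflow.db
"""

updated = set_cfg_option(sample, "executor", "LocalExecutor")
updated = set_cfg_option(
    updated,
    "sql_alchemy_conn",
    "postgresql+psycopg2://airflow:airflow@localhost:5432/airflow",
)
print(updated)
```

To apply it to a real file, read airflow.cfg into a string, pass it through the function, and write the result back, ideally after making a backup copy.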

Upvotes: 0
