Apache Airflow on K8s via ArgoCD Helm

Art Krisada
3 min read · Jul 4, 2024


A short note on my project running Apache Airflow on K8s.

I used ArgoCD. You can follow how to set up and install Argo from my old post. It's a good read before installing your Airflow.

First, from ArgoCD, create a new App.

Click the Edit as YAML button.

This Airflow setup uses the LocalExecutor, so Redis is not needed. The K8s namespace is apache-airflow. You can change the values to suit your needs.

I chose to persist logs on a PVC named pvc-poc-apache-airflow-log. You need to create a PVC with the same name you set here.
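A minimal PVC manifest for the log claim could look like the sketch below. The storageClassName is an assumption; replace it with a class available in your cluster, and make sure it supports ReadWriteMany since several Airflow pods mount the log volume.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-poc-apache-airflow-log   # must match logs.persistence.existingClaim
  namespace: apache-airflow
spec:
  accessModes:
    - ReadWriteMany                  # log volume is shared by scheduler and webserver
  storageClassName: standard         # assumption: replace with your storage class
  resources:
    requests:
      storage: 10Gi                  # matches logs.persistence.size below
```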

I use an external PostgreSQL database to keep Airflow metadata. Change the values of data.metadataConnection.host, data.metadataConnection.user, data.metadataConnection.pass, and data.metadataConnection.db to match your database.
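If you have not prepared that database yet, a sketch of the setup, assuming you can reach your PostgreSQL instance with psql as a superuser (the names match the chart values used later in this post):

```sql
-- Create the Airflow metadata role and database.
-- Role/database names and password are the sample values from this post;
-- use your own in a real deployment.
CREATE ROLE airflow LOGIN PASSWORD 'airflow';
CREATE DATABASE apache_airflow OWNER airflow;
```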

I use git-sync to sync DAGs from a private GitLab repository. You need to create a deploy token in GitLab and set the value of dags.gitSync.repo to something like https://YOUR-gitlab+deploy-token-HERE@gitlab.com/your_repo/your_dags.git

GitLab Repo

Go to your DAGs project > Settings > Repository > Deploy tokens to generate one.
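Before wiring the token into the chart, it can be worth checking that it actually grants read access. One way to do that (the URL below is the same placeholder as above; substitute your own token and repository path):

```shell
# Lists the remote refs if the deploy token can read the repository;
# fails with an authentication error otherwise.
git ls-remote https://YOUR-gitlab+deploy-token-HERE@gitlab.com/your_repo/your_dags.git
```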

Paste the YAML below into your ArgoCD page. Make changes as appropriate.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: apache-airflow
spec:
  destination:
    name: ''
    namespace: apache-airflow
    server: 'https://kubernetes.default.svc'
  source:
    path: ''
    repoURL: 'https://airflow.apache.org'
    targetRevision: 1.14.0
    chart: airflow
    helm:
      parameters:
        - name: createUserJob.applyCustomEnv
          value: 'false'
        - name: createUserJob.useHelmHooks
          value: 'false'
        - name: migrateDatabaseJob.applyCustomEnv
          value: 'false'
        - name: migrateDatabaseJob.useHelmHooks
          value: 'false'
        - name: useStandardNaming
          value: 'true'
        - name: logs.persistence.enabled
          value: 'true'
        - name: logs.persistence.existingClaim
          value: pvc-poc-apache-airflow-log
        - name: config.core.executor
          value: LocalExecutor
        - name: dags.gitSync.enabled
          value: 'true'
        - name: dags.gitSync.repo
          value: >-
            https://YOUR-gitlab+deploy-token-HERE@gitlab.com/your_repo/your_dags.git
        - name: dags.gitSync.branch
          value: main
        - name: dags.gitSync.subPath
          value: ''
        - name: executor
          value: LocalExecutor
        - name: workers.persistence.size
          value: 10Gi
        - name: data.metadataConnection.host
          value: postgres-primary.postgres.svc.cluster.local
        - name: data.metadataConnection.user
          value: airflow
        - name: data.metadataConnection.db
          value: apache_airflow
        - name: data.metadataConnection.pass
          value: airflow
        - name: logs.persistence.size
          value: 10Gi
        - name: postgresql.enabled
          value: 'false'
        - name: redis.enabled
          value: 'false'
        - name: pgbouncer.enabled
          value: 'false'
        - name: webserverSecretKey
          value: fb18e14f211ede3d0cac061db57d408d
        - name: webserver.defaultUser.firstName
          value: Airflow
        - name: webserver.defaultUser.username
          value: airflow
        - name: webserver.defaultUser.password
          value: airflow
        - name: webserver.defaultUser.lastName
          value: Admin
        - name: webserver.defaultUser.email
          value: admin@airflow.com
  sources: []
  project: default
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
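Note that the webserverSecretKey in the YAML above is only a sample value; every deployment should use its own random key. One simple way to generate one, using Python's standard library:

```python
import secrets

# Generate a random 32-character hex string, suitable for webserverSecretKey.
key = secrets.token_hex(16)
print(key)
```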

Click the Save button. It will take you back to the Create App page.

Click Create and wait for your Airflow to sync. When all pods are created, check your Airflow by port-forwarding the apache-airflow-webserver service, then open http://localhost:YOURPORT in your browser.
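The port-forward can be done with kubectl; 8080 is the chart's default webserver service port, and the local port on the left is your choice:

```shell
# Forward local port 8080 to the Airflow webserver service
# (service name and namespace as configured above).
kubectl port-forward svc/apache-airflow-webserver 8080:8080 -n apache-airflow
```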

Log in with the username/password airflow/airflow.

For a production configuration, you can read more here.
