Installation#
A first approach to managing experimental-computational workflows for a biolab experiment with Apache Airflow and Docker.
Requirements#
Git: to download and manage the source code.
Docker: to manage the dependencies and run this example.
Windows Subsystem for Linux (WSL - only for Windows users): install this dependency from the Microsoft Store (Ubuntu, Alpine, or another distribution). Recent versions of Docker require it to run containers. Check whether a file named .wslconfig exists in your user root directory (example: C:\Users\Username); if not, create it and add the lines shown below to limit the memory available to the virtual machine (this works around a known memory issue in Docker for Windows).
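A minimal .wslconfig, assuming the WSL 2 backend of Docker Desktop (the memory limit belongs under the [wsl2] section):

[wsl2]
memory=6GB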
Usage#
After getting all the requirements, clone the repository using Git:
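For example (the URL and directory name below are placeholders; substitute the actual repository address):

git clone <repository-url>
cd <repository-name>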
To fetch the files from the submodules the workflow depends on, initialize and update them before building the Docker containers:
git submodule init
git submodule update
Attention
Check if the submodule directories contain files after updating; an empty directory means the corresponding submodule was not fetched.
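To verify, git submodule status lists each submodule with its checked-out commit; a commit hash prefixed with - means that submodule has not been initialized yet:

git submodule status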
Now, go into the newly cloned repository, where the docker-compose.yml file is located. The first step is to initialize Airflow by running:
docker-compose up airflow-init
and then generate and start the services by running:
docker-compose up -d
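To confirm the services came up, docker-compose ps lists each container and its current state:

docker-compose ps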
Wait a few seconds and you should be able to access the KIWI experiment DAGs at http://localhost:8080/.
Log in with user: airflow and password: airflow.
Important
After installing Airflow and generating the services, a global variable must be set for the DAGs to run! Follow the steps in setting configuration variables.
Danger
The variable host_path must be set in order for the DAGs to run! Follow the steps in setting configuration variables.
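As a sketch, assuming the service names from the official Airflow docker-compose setup (the path value is a placeholder), the variable can also be set from the command line with the Airflow CLI:

docker-compose exec airflow-webserver airflow variables set host_path /absolute/path/to/this/repository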
Frameworks and Libraries#
The workflow makes use (or will make use) of several packages from the KIWI group:
DBpandas
simm2
fitm2
mode
BVI matlab
LinODEnet
These dependencies are installed automatically by Docker: the docker-compose.yml file defines the services, and each image is built from its corresponding Dockerfile as an Airflow service.
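As a purely hypothetical sketch (the service name, build context, and Dockerfile path are assumptions, not the project's actual configuration), such a service might be declared in docker-compose.yml like this:

services:
  simm2:
    build:
      context: ./simm2
      dockerfile: Dockerfile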