Do you like DAGs? Implementing a Graph Executor for Bacalhau
by Enrico Rotundo, Associate Data Scientist
Winder.AI helped Protocol Labs, a technology company in the crypto space, develop Bacalhau, a novel decentralised computational platform that focuses on the AI lifecycle. This case study describes some of our work on the project; for more information, view the Bacalhau website.
This video describes this work in more detail. A short description can be found below.
Problem: Empowering the Distributed Compute Network with DAGs
Bacalhau is a next-generation decentralised computational platform that exposes functionality to jobs at a massive scale. But sometimes you need more control over how the jobs complete, or you need to chain jobs together to form dependencies. When this happens, it's often easier to describe the workload as a directed acyclic graph (DAG) in which each job is a node and the edges describe how jobs depend on each other.
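To make the idea concrete, here is a minimal sketch of a workload described as a DAG, using only the Python standard library. The job names (`preprocess`, `train`, `evaluate`) and the `execution_order` helper are hypothetical and chosen for illustration; this is not how Bacalhau represents jobs internally.

```python
from graphlib import TopologicalSorter

# Hypothetical three-job pipeline: each key is a job (a node), and each value
# is the set of jobs that must finish before it can start (incoming edges).
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
}


def execution_order(deps):
    """Return one valid order in which the jobs can run."""
    # static_order() raises graphlib.CycleError if the graph contains a
    # cycle, which is exactly the "acyclic" requirement of a DAG.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)  # ['preprocess', 'train', 'evaluate']
```

Because this example is a simple chain, there is only one valid order. In a wider graph, several jobs may have no dependency on each other and could run in parallel.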
Building support for DAGs into the Bacalhau platform is a key part of the project because it allows users to express complex workflows simply. It also opens the platform to more complex use cases, such as machine learning pipelines.
Solution: MLOps Development Expertise
The Winder.AI team collaborated with the rest of the Bacalhau team to deliver iterative improvements to the platform. We then developed a prototype of the DAG executor and tested it against a range of use cases.
Result: Developing a Bacalhau Airflow Integration
We found that the best first step was to integrate with Airflow, a popular open-source workflow management system. This allowed us to quickly prototype and validate the DAG executor. We developed an operator for Airflow that allows users to submit jobs to the Bacalhau platform directly from within Airflow Python code. This plugin is now available on GitHub.
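The general pattern behind such an integration can be sketched in plain Python: walk the graph in dependency order, submit each job, and hand the results of upstream jobs to the jobs that depend on them. Everything below is an assumption made for illustration. The `Job` class, `submit_job`, and `run_pipeline` are hypothetical stand-ins, and none of them reflects the real Airflow operator or the Bacalhau API.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Job:
    """Hypothetical job description; not the real Bacalhau job spec."""
    name: str
    command: str
    upstream: list = field(default_factory=list)


def submit_job(job, inputs):
    """Stand-in for a real submission call; returns a fake result id."""
    return f"result-of-{job.name}"


def run_pipeline(jobs):
    """Submit jobs in dependency order, passing upstream results downstream."""
    by_name = {job.name: job for job in jobs}
    graph = {job.name: set(job.upstream) for job in jobs}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        job = by_name[name]
        # Every parent has already run, so its result is available here.
        inputs = [results[parent] for parent in job.upstream]
        results[name] = submit_job(job, inputs)
    return results


# The jobs are declared out of order on purpose; the graph decides the order.
pipeline = [
    Job("evaluate", "python eval.py", upstream=["train"]),
    Job("preprocess", "python prep.py"),
    Job("train", "python train.py", upstream=["preprocess"]),
]
results = run_pipeline(pipeline)
print(list(results))  # ['preprocess', 'train', 'evaluate']
```

A workflow manager such as Airflow takes over the scheduling part of this loop, so an operator only needs to handle the submission of a single job and the hand-off of its results.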
Value of This Work
By leveraging Winder.AI’s experience, we were able to quickly prototype and validate an advanced pipelining solution for the Bacalhau project. In the future we want to extend this work to:
- Open-source a stable Bacalhau operator for Airflow
- Develop a solution so that an Airflow instance is provisioned for you when you submit a DAG to Bacalhau
- Integrate lineage tracking for Airflow DAGs
Contact
If this work is of interest to your organization, then we’d love to talk to you. Please get in touch with the sales team at Winder.AI and we can chat about how we can help you.