Do you like DAGs? Implementing a Graph Executor for Bacalhau

by Enrico Rotundo, Associate Data Scientist

Winder.AI helped Protocol Labs, a technology company in the crypto space, develop Bacalhau, a novel decentralised computational platform that focuses on the AI lifecycle. This case study describes some of our work on the project. For more information, view the Bacalhau website.

This video describes this work in more detail. A short description can be found below.

Problem: Empowering the Distributed Compute Network with DAGs

Bacalhau is a next-generation decentralised computational platform that runs jobs at massive scale. But sometimes you need more control over how jobs complete, or you need to chain jobs together so that one depends on another. When this happens, it's often easier to describe the workload as a directed acyclic graph (DAG), in which each job is a node and the edges describe how jobs depend on each other.
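To make the idea concrete, here is a minimal sketch of a workload described as a DAG, using only the Python standard library. The job names and the `execution_order` helper are hypothetical and chosen for illustration; this is not how Bacalhau represents jobs internally.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a job (a node), and each value is
# the set of jobs it depends on (its incoming edges).
dag = {
    "download": set(),
    "preprocess": {"download"},
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}


def execution_order(graph):
    """Return the jobs in an order that respects every dependency edge."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is what enforces the "acyclic" part of a DAG.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dag)
print(order)
```

Any executor that runs the jobs in this order is guaranteed to start a job only after all of the jobs it depends on have finished.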

Building support for DAGs into the Bacalhau platform is a key part of the project because it allows users to express complex workflows simply. It also opens the platform up to more complex use cases, such as machine learning pipelines.

Solution: MLOps Development Expertise

The Winder.AI team collaborated with the rest of the Bacalhau team to deliver iterative improvements to the platform. We then developed a prototype of the DAG executor and tested it against a range of use cases.

Result: Developing a Bacalhau Airflow Integration

We found that the best first step was to integrate with Airflow, a popular open-source workflow management system. This allowed us to quickly prototype and validate the DAG executor. We developed an operator for Airflow that allows users to submit jobs to the Bacalhau platform directly from within their Airflow Python code. This plugin is now available on GitHub.
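The general shape of such an operator can be sketched without Airflow installed. Airflow operators implement an `execute(context)` method, and the class below mirrors that shape in plain Python. Everything here is an assumption made for illustration: `SubmitJobTask`, `fake_submit`, `run_pipeline` and the job spec fields are hypothetical names, not the API of the published plugin or of Bacalhau.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Records every spec passed to the fake client, so the flow can be inspected.
submitted: List[Dict[str, Any]] = []


def fake_submit(spec: Dict[str, Any]) -> str:
    """Hypothetical stand-in for a client call that submits a job and returns its ID."""
    submitted.append(spec)
    return f"job-{len(submitted)}"


@dataclass
class SubmitJobTask:
    """Operator-like task: one instance submits one job when executed."""

    task_id: str
    image: str
    command: List[str]
    submit: Callable[[Dict[str, Any]], str]
    upstream: List[str] = field(default_factory=list)

    def execute(self, context: Dict[str, Any]) -> str:
        # Look up the job IDs produced by upstream tasks so that this job
        # can reference their outputs as its inputs.
        inputs = [context["results"][name] for name in self.upstream]
        spec = {"image": self.image, "command": self.command, "inputs": inputs}
        return self.submit(spec)


def run_pipeline(tasks: List[SubmitJobTask]) -> Dict[str, str]:
    """Run tasks that are assumed to be listed in dependency order."""
    context: Dict[str, Any] = {"results": {}}
    for task in tasks:
        context["results"][task.task_id] = task.execute(context)
    return context["results"]


tasks = [
    SubmitJobTask("hello", "ubuntu", ["echo", "hello"], fake_submit),
    SubmitJobTask("count", "ubuntu", ["wc", "-c"], fake_submit, upstream=["hello"]),
]
results = run_pipeline(tasks)
print(results)
```

In a real Airflow deployment the scheduler, not a loop like `run_pipeline`, decides when each task runs, and the value returned from `execute` is shared with downstream tasks through Airflow's XCom mechanism.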

Value of This Work

By leveraging Winder.AI's experience, we were able to quickly prototype and validate an advanced pipelining solution for the Bacalhau project. In the future we want to extend this work by:

  • Open-sourcing a stable Bacalhau operator for Airflow
  • Provisioning an Airflow instance automatically when you submit a DAG to Bacalhau
  • Integrating lineage tracking for Airflow DAGs

Contact

If this work is of interest to your organization, then we’d love to talk to you. Please get in touch with the sales team at Winder.AI and we can chat about how we can help you.
