At Mu Sigma Labs, I led a project on BPMN-based analytics automation and pipeline orchestration. Built on the open-source platform Activiti, the system served about 3,000 internal users and handled critical reporting and data pipelines. I owned, developed, tested, and maintained it.
Technologies Used
The core technologies employed were:
- Backend: Java and Spring Boot
- Scripting: Python and R for analytics tasks
- Frontend: Angular for the user interface
Understanding BPMN
Business Process Model and Notation (BPMN) provided the foundation for our automation approach. BPMN is a graphical notation standard for business processes, and we extended it to accommodate automated pipelines and human-in-the-loop processes.
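To make the notation concrete, here is a minimal sketch that builds the XML form of a four-step BPMN process (start, script task, user task, end) with Python's standard library. The element names and namespace come from the BPMN 2.0 standard; the process id, node ids, node names, and the `targetNamespace` value are made up for illustration, and the script body and form details are left out.

```python
import xml.etree.ElementTree as ET

# Namespace of the BPMN 2.0 model schema.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
ET.register_namespace("bpmn", BPMN_NS)


def q(tag):
    """Qualify a tag name with the BPMN namespace."""
    return f"{{{BPMN_NS}}}{tag}"


def build_process(process_id, nodes):
    """Build a linear BPMN process.

    `nodes` is an ordered list of (element_tag, node_id, display_name);
    consecutive nodes are connected with sequence flows.
    """
    definitions = ET.Element(
        q("definitions"),
        {"targetNamespace": "http://example.com/bpmn"},  # hypothetical
    )
    process = ET.SubElement(
        definitions, q("process"), {"id": process_id, "isExecutable": "true"}
    )
    for tag, node_id, name in nodes:
        ET.SubElement(process, q(tag), {"id": node_id, "name": name})
    for source, target in zip(nodes, nodes[1:]):
        ET.SubElement(
            process,
            q("sequenceFlow"),
            {
                "id": f"flow_{source[1]}_{target[1]}",
                "sourceRef": source[1],
                "targetRef": target[1],
            },
        )
    return definitions


# Hypothetical process: score customers, then ask a human to approve.
root = build_process(
    "scoring_process",
    [
        ("startEvent", "start", "Start"),
        ("scriptTask", "score", "Score customers"),
        ("userTask", "approve", "Approve results"),
        ("endEvent", "end", "End"),
    ],
)
xml_text = ET.tostring(root, encoding="unicode")
```

Printing `xml_text` shows the same structure a graphical BPMN editor would save: nodes plus the sequence flows that connect them.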
We made two major extensions to the platform:
Scripting Tasks: Key to Automation
We integrated R and Python scripting into the platform to enable sophisticated automation tasks. This let us design complex pipelines, such as a propensity-to-click prediction model that combines user input, predictive modeling, and approval workflows.
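How the scripts were actually wired into the engine is not described here, so the following is only a sketch of one plausible approach: run each script in a separate interpreter, pass the process variables in as JSON on stdin, and read the updated variables back from stdout. The function `run_script_task`, the variable names, and the placeholder "model" are all hypothetical.

```python
import json
import subprocess
import sys


def run_script_task(script, variables, timeout=60):
    """Run `script` in a fresh Python interpreter.

    Process variables go in as JSON on stdin; the script is expected to
    print the updated variables as JSON on stdout.
    """
    result = subprocess.run(
        [sys.executable, "-c", script],
        input=json.dumps(variables),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # a non-zero exit code raises CalledProcessError
    )
    return json.loads(result.stdout)


# Hypothetical scoring step. The ratio below is only a stand-in for a
# real propensity model, which would be loaded and applied here.
SCORE_SCRIPT = """
import json, sys
variables = json.load(sys.stdin)
variables["propensity"] = variables["clicks"] / variables["impressions"]
json.dump(variables, sys.stdout)
"""

updated = run_script_task(SCORE_SCRIPT, {"clicks": 12, "impressions": 400})
```

Running scripts out of process keeps a failing or slow script from taking down the engine, at the cost of serializing variables on every call. An R script could be launched the same way through the `Rscript` command instead of `sys.executable`.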
User Tasks: Human in the Loop
User tasks, including forms and approvals, were integrated with email notifications: the assigned user can follow the link in the email to submit the form or approve the task. The notification can also carry context from previous tasks in the form of attached documents.
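As an illustration of the notification step, the sketch below assembles such an email with Python's standard `email` package: a link back to the task plus attachments carrying context from earlier tasks. The base URL, sender address, query parameters, token, and file contents are all hypothetical, and actually sending the message (for example over SMTP) is left out.

```python
from email.message import EmailMessage
from urllib.parse import urlencode

BASE_URL = "https://workflow.example.com/tasks"  # hypothetical task UI


def build_task_email(task_id, assignee_email, action, token, attachments=()):
    """Build a notification email for a pending user task.

    `attachments` is an iterable of (filename, bytes) pairs produced by
    earlier tasks in the process.
    """
    query = urlencode({"action": action, "token": token})
    link = f"{BASE_URL}/{task_id}?{query}"

    msg = EmailMessage()
    msg["From"] = "workflow@example.com"  # hypothetical sender
    msg["To"] = assignee_email
    msg["Subject"] = f"Action required: {action} task {task_id}"
    msg.set_content(
        f"A task is waiting for you.\nOpen this link to {action} it:\n{link}\n"
    )
    for filename, data in attachments:
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    return msg


# Dummy attachment bytes stand in for a report from a previous task.
message = build_task_email(
    "task-42",
    "analyst@example.com",
    "approve",
    "s3cr3t",
    attachments=[("model_report.csv", b"dummy report bytes")],
)
```

In a real deployment the token would need to be unguessable and tied to both the task and the assignee, since the link alone is enough to act on the task.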
Pure Pipelining Example
The platform can also serve as a simple pipeline runner: chain only scripting tasks, with no user tasks in between.
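Conceptually, a pipeline of only scripting tasks is a chain of functions in which each task receives the variables produced by the previous one. The sketch below shows that idea in plain Python; the three task functions and their data are invented for illustration and are not part of the platform.

```python
from functools import reduce


def extract(variables):
    """Stand-in for a task that loads data."""
    return {**variables, "rows": [3, 1, 2]}


def transform(variables):
    """Stand-in for a task that cleans or reshapes the data."""
    return {**variables, "rows": sorted(variables["rows"])}


def report(variables):
    """Stand-in for a task that summarizes the result."""
    rows = variables["rows"]
    return {**variables, "summary": f"{len(rows)} rows, max={rows[-1]}"}


def run_pipeline(tasks, variables):
    """Run tasks in order, feeding each one the previous task's output."""
    return reduce(lambda state, task: task(state), tasks, variables)


result = run_pipeline([extract, transform, report], {"source": "demo"})
print(result["summary"])  # prints "3 rows, max=3"
```

Each task returns a new dictionary instead of mutating its input, which makes it easy to log or inspect the variables between steps.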
Dynamic Triggers
Our platform supported several trigger mechanisms, including cron-based schedules and REST APIs, so that external systems could initiate tasks.
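To show what a cron-based trigger has to decide, here is a small sketch that checks whether a schedule is due at a given minute. It assumes the classic five-field cron syntax (minute, hour, day of month, month, day of week with 0 = Sunday); the platform's actual cron dialect is not stated here and may differ. As a simplification, all five fields must match, whereas traditional cron treats the two day fields specially. The schedule names are hypothetical.

```python
from datetime import datetime

# Allowed (low, high) values: minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def _field_matches(field, value, low, high):
    """Check one cron field; supports '*', 'a', 'a-b', lists, and '/step'."""
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = int(rng)
            end = high if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if `moment` falls on the five-field cron `expression`."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected: minute hour day-of-month month day-of-week")
    values = [
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        (moment.weekday() + 1) % 7,  # Python: Monday=0; cron: Sunday=0
    ]
    return all(
        _field_matches(field, value, low, high)
        for field, value, (low, high) in zip(fields, values, FIELD_RANGES)
    )


def due_processes(schedules, moment):
    """Return the names of all processes whose schedule matches `moment`."""
    return [name for name, expr in schedules.items() if cron_matches(expr, moment)]


# Hypothetical schedules: weekdays at 06:00, and every 15 minutes.
SCHEDULES = {"daily_report": "0 6 * * 1-5", "refresh_scores": "*/15 * * * *"}
monday_6am = datetime(2024, 1, 1, 6, 0)  # 1 January 2024 was a Monday
due = due_processes(SCHEDULES, monday_6am)
```

A scheduler would call `due_processes` once per minute and start each returned process. A REST trigger would bypass the schedule check and start the process directly.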
The Role of BPMN in ML Pipelining
Why use BPMN for automation and pipelining? We don’t use UML for coding, right?
We adopted BPMN for automation and pipelining because it serves both technical and non-technical users. Whether orchestrating simple data flows or complex machine learning pipelines, users could design automated workflows in the same graphical notation.
Moreover, the platform’s support for human-in-the-loop processes addressed a critical gap in conventional pipelining systems: it offered mechanisms for human validation and approval, which machine learning and MLOps workflows often require.
Key Contributions
During my tenure, I focused on:
- Enabling scripting support for R and Python
- Enhancing logging and observability of scripting tasks
- Implementing timer and dynamic events
- Implementing dynamic user task assignments
- Improving scalability and modularity of the platform
- Addressing user interface enhancements and optimizations
Recognition ⭐️
For my contributions, I received a spot award.