Getting started with Orbitra Flows
Orbitra Flows is your workflow orchestration framework built on top of Prefect and integrated with Orbitra’s environment. It provides a streamlined way to build, deploy, and monitor data pipelines and automated workflows in your organization’s infrastructure.
What is Prefect?
Prefect is a modern workflow orchestration platform that allows you to build, observe, and react to data pipelines. Think of it as a way to define Python functions as tasks, chain them together into workflows (called “flows”), and execute them reliably with features like retries, scheduling, and observability.
Flows and Tasks
In Prefect terminology:
- Task: A Python function decorated with @task that represents a discrete unit of work
- Flow: A Python function decorated with @flow that orchestrates multiple tasks into a complete workflow
In Orbitra Flows, the @orbitra_deployment decorator replaces the Prefect @flow decorator. You don’t need both: @orbitra_deployment handles flow registration and deployment configuration in one step.
Orbitra Flows vs Plain Prefect
While you can use Prefect directly, Orbitra Flows provides:
- Declarative Deployments: Use the @orbitra_deployment decorator to register and manage flow deployments consistently:
  - Multiple schedules with cron expressions, intervals, and RRULE support
  - Custom container sizing and concurrency limits for your deployments
- Pre-configured connections to Orbitra Lake, authentication, and compute resources
- Managed execution identity integrated with the Orbitra ecosystem (Lake, Flows, Email, and related services)
- Monitoring and alerting integrations, including notifications via Microsoft Teams and Slack
- Self-hosted infrastructure alerting for operational visibility
- Multiple pre-configured self-hosted worker profiles (Container Instances Spot, dedicated Container Instances, and VMs)
- Repository templates with pre-configured CI/CD pipelines for flow deployment
- Cloud storage of results, so that retries can reuse completed outputs when a task or flow fails
Orbitra Flows SDK in 2 minutes
1) Define a simple flow
2) Add retry logic and result persistence
Orbitra Flows works seamlessly with Orbitra Lake for data persistence. The persist_result=True parameter enables cloud storage of task results, allowing subsequent runs to reuse successful task outputs if the flow fails.
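To make the reuse behavior concrete, here is a minimal standard-library sketch of the idea only. It is not the Orbitra Flows or Prefect implementation: the `persisted` decorator, the `STORE` directory, and the `extract`/`load` functions are hypothetical stand-ins, and a local temporary directory plays the role of cloud storage.

```python
import functools
import hashlib
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for cloud storage: a local temporary directory.
STORE = Path(tempfile.mkdtemp(prefix="results_"))


def persisted(fn):
    """Save a function's result and reuse it on later calls with the same arguments."""

    @functools.wraps(fn)
    def wrapper(*args):
        # Key the stored result by function name and arguments.
        key = hashlib.sha256(json.dumps([fn.__name__, args]).encode()).hexdigest()
        path = STORE / f"{key}.json"
        if path.exists():
            return json.loads(path.read_text())
        result = fn(*args)
        path.write_text(json.dumps(result))
        return result

    return wrapper


calls = {"extract": 0, "load": 0}  # counts how often each step really executes


@persisted
def extract(n):
    calls["extract"] += 1
    return list(range(n))


def load(rows, fail):
    calls["load"] += 1
    if fail:
        raise RuntimeError("downstream step failed")
    return len(rows)


def run_flow(fail):
    rows = extract(3)
    return load(rows, fail)


# First run: extract() succeeds and is persisted, then load() fails.
try:
    run_flow(fail=True)
except RuntimeError:
    pass

# Second run: the stored extract() result is reused instead of recomputed.
loaded = run_flow(fail=False)
```

After both runs, `extract` has executed once while `load` has executed twice, which is the effect result persistence is meant to give you for expensive upstream tasks.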
3) Add production configuration
Now let’s add schedules, resource limits, and other production settings. The @orbitra_deployment decorator handles flow definition and deployment configuration. In this step we add:
- A cron schedule to run daily at 2 AM
- Concurrency limit to prevent overlapping runs
- Tags for organization
- Container size configuration
- Enabled schedules on creation
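For reference, "daily at 2 AM" corresponds to the standard five-field cron expression `0 2 * * *` (minute 0, hour 2, every day). The sketch below only illustrates when such a schedule fires, using the standard library. `next_daily_run` is a hypothetical helper, not part of Orbitra Flows or Prefect, and it ignores time zones.

```python
from datetime import datetime, timedelta

DAILY_2AM_CRON = "0 2 * * *"  # minute=0, hour=2, any day-of-month, month, weekday


def next_daily_run(now, cron=DAILY_2AM_CRON):
    """Return the next firing time after `now` for an 'M H * * *' daily cron expression."""
    minute, hour, dom, month, dow = cron.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily schedules")
    candidate = now.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


before = next_daily_run(datetime(2024, 1, 1, 1, 30))  # earlier than 2 AM: fires today
after = next_daily_run(datetime(2024, 1, 1, 3, 0))    # later than 2 AM: fires tomorrow
```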
Advanced Configuration
The @orbitra_deployment decorator supports advanced container-size options for flows that run heavy computation. You can use predefined sizes (“XS”, “S”, “M”, “L”) or specify custom resources.
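The "predefined name or custom resources" choice can be pictured as a small validation step. The sketch below is an assumption-heavy illustration, not the SDK's API: `normalize_container_size` and the custom keys `cpu` and `memory_gb` are hypothetical, and only the four size names come from the text above. Check the Orbitra Flows reference for the real parameter and field names.

```python
PREDEFINED_SIZES = ("XS", "S", "M", "L")  # size names taken from the documentation text


def normalize_container_size(size):
    """Accept either a predefined size name or a custom resource mapping.

    The custom keys `cpu` and `memory_gb` are hypothetical placeholders.
    """
    if isinstance(size, str):
        if size not in PREDEFINED_SIZES:
            raise ValueError(f"unknown size {size!r}; expected one of {PREDEFINED_SIZES}")
        return {"preset": size}
    if isinstance(size, dict):
        missing = {"cpu", "memory_gb"} - set(size)
        if missing:
            raise ValueError(f"custom size is missing: {sorted(missing)}")
        return {"custom": {"cpu": float(size["cpu"]), "memory_gb": float(size["memory_gb"])}}
    raise TypeError("size must be a predefined name or a resource mapping")


preset = normalize_container_size("M")
custom = normalize_container_size({"cpu": 2, "memory_gb": 4})
```

Validating the size up front means a typo such as "XXL" fails when the deployment is defined rather than when the container is requested.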
Monitoring and Observability
Orbitra Flows automatically tracks:
- Flow run status (success, failure, running);
- Task execution times and states;
- Logs from all tasks and flows;
- Retry attempts and failures.