Data is the lifeblood of modern business, but moving it is often a frustrating and fragile process. Traditional ETL (Extract, Transform, Load) pipelines, frequently written as large, monolithic scripts, are notoriously brittle. A minor change in a data source or an API can cause the entire process to fail, leading to hours of painful debugging and data downtime. But what if we could build data pipelines like we build modern software—out of small, robust, and reusable components?
This is the core idea behind action.do: breaking down complex processes into their fundamental building blocks. By embracing atomic actions, you can transform your ETL pipelines from fragile scripts into resilient, scalable, and observable agentic workflows.
If you've ever inherited a 1,000-line Python script responsible for your company's entire data integration, you already know the pain. Monolithic ETL processes suffer from several critical flaws: a failure in any single step can take down the whole pipeline, individual pieces of logic can't be reused or tested in isolation, and when something breaks there's little visibility into which step failed or why.
This approach doesn't scale. As data volume and complexity grow, these scripts become unmanageable liabilities.
The .do platform introduces a powerful concept to solve this: the atomic action.
So, what is an atomic action? It's the smallest, indivisible unit of work in a workflow. Think of it as a self-contained, executable function designed to do one thing exceptionally well.
Each action has clearly defined inputs, performs a specific task, and produces a predictable output. They are the fundamental building blocks of every powerful Agentic Workflow.
It's crucial to understand the distinction. Actions are the individual steps, while workflows are the orchestration of multiple actions in a specific sequence or with conditional logic. You build complex and powerful ETL pipelines by composing simple, reusable actions together, much like assembling LEGO bricks to create a sophisticated model.
Let's imagine we need to build a pipeline that pulls new user sign-ups, enriches their data with a third-party service, and loads the clean data into our analytics warehouse.
Instead of one giant script, we define three distinct atomic actions.
First, we create an action to fetch new user data. This action is self-contained and only responsible for extraction.
import { Action } from '@do-co/agent';

// Define an action to fetch new users from a source
const fetchNewUsers = new Action('fetch-new-users', {
  title: 'Fetch New Users',
  description: 'Retrieves a batch of new user records from the primary API.',
  input: {
    since: { type: 'string', description: 'ISO timestamp for last fetch' },
  },
  async handler({ since }) {
    console.log(`Fetching users since ${since}...`);
    // Logic to call your internal API endpoint
    // const users = await internalApi.get(`/users?since=${since}`);
    const users = [{ id: 1, email: 'alex@example.com' }, { id: 2, email: 'casey@example.com' }]; // Mock data
    return { users };
  },
});
Next, we define an action to enrich the data. This action doesn't know or care where the data came from; it only knows how to process a user record.
import { Action } from '@do-co/agent';

// Define an action to enrich user data
const enrichUserProfile = new Action('enrich-user-profile', {
  title: 'Enrich User Profile',
  description: 'Enriches a user profile with data from a third-party service.',
  input: {
    email: { type: 'string', required: true },
  },
  async handler({ email }) {
    console.log(`Enriching profile for ${email}...`);
    // In a real scenario, you'd call an external API like Clearbit or FullContact
    const enrichedData = { company: 'Example Inc.', title: 'Developer' };
    return { enrichedData };
  },
});
Finally, we define an action to load the fully processed record into our data warehouse. Its sole responsibility is insertion.
import { Action } from '@do-co/agent';

// Define an action to load data into a warehouse
const loadToWarehouse = new Action('load-to-warehouse', {
  title: 'Load to Data Warehouse',
  description: 'Loads a final user record into the analytics data warehouse.',
  input: {
    record: { type: 'object', required: true },
  },
  async handler({ record }) {
    console.log('Loading record to warehouse:', record);
    // Logic to connect and INSERT INTO your warehouse (e.g., Snowflake, BigQuery)
    return { success: true, recordId: record.id };
  },
});
With these atomic actions defined, a workflow on the .do platform would orchestrate them in sequence: fetch the new users, enrich each record, and load the results into the warehouse.
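As a rough illustration of what that composition amounts to, here is a minimal sketch of the data flow in plain TypeScript. The runAction helper is a hypothetical stand-in for however the .do platform invokes an action by name; the actual workflow definition syntax may differ.

// Illustrative only: the data flow the workflow expresses, as plain TypeScript.
// `runAction` is a hypothetical stand-in for the platform's action invocation.
declare function runAction(name: string, input: object): Promise<any>;

async function runUserPipeline(since: string) {
  // 1. Extract: pull the batch of new users
  const { users } = await runAction('fetch-new-users', { since });

  // 2. Transform: enrich each user record independently
  const enriched = await Promise.all(
    users.map(async (user: { id: number; email: string }) => {
      const { enrichedData } = await runAction('enrich-user-profile', { email: user.email });
      return { ...user, ...enrichedData };
    })
  );

  // 3. Load: write each finished record to the warehouse
  for (const record of enriched) {
    await runAction('load-to-warehouse', { record });
  }
}

Because each step is its own action, success or failure is reported per action run rather than for the script as a whole.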
If the enrich-user-profile action fails due to an API key issue, the entire pipeline doesn't crash blindly. You know exactly which step failed, for which user, and why—making debugging trivial.
The true power of this model is that you can create your own custom actions. The .do SDK is designed to let you transform your unique business logic into reusable, programmable building blocks.
Have a proprietary data-cleaning algorithm? Turn it into a cleanse-proprietary-data action. Need to interact with a legacy internal system? Wrap it in a query-legacy-crm action. This turns your business operations into Business as Code—versionable, testable, and scalable assets that can be used across countless workflows.
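Following the same pattern as the actions above, a custom cleansing action might look like the sketch below. The cleanseRecord call is a placeholder for your own proprietary algorithm, not part of the SDK.

import { Action } from '@do-co/agent';

// Wrap proprietary cleaning logic in a reusable atomic action.
// `cleanseRecord` is a placeholder for your own implementation.
const cleanseProprietaryData = new Action('cleanse-proprietary-data', {
  title: 'Cleanse Proprietary Data',
  description: 'Applies in-house cleaning rules to a raw record.',
  input: {
    record: { type: 'object', required: true },
  },
  async handler({ record }) {
    // const cleaned = cleanseRecord(record); // your proprietary algorithm
    const cleaned = { ...record, cleaned: true }; // placeholder result
    return { cleaned };
  },
});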
Stop firefighting brittle ETL scripts and start building robust, automated data systems. By breaking down complexity into discrete, atomic actions, you create a foundation for scalable and maintainable task automation. Your data pipelines become less of a liability and more of a strategic asset—reliable, transparent, and ready for whatever comes next.
Ready to build your first atomic ETL pipeline? Define, execute, and scale your data processing on the .do platform today.