Managing Callbacks

What are Callbacks?

Callbacks are the way Captain Data delivers results for async and schedule execution modes. Instead of waiting for the entire operation to complete, you receive results progressively as they become available.

When to use callbacks: Choose async or schedule mode when you need to process large datasets, want to consume data in parallel, or need to handle long-running operations without timeouts.

Understanding Async and Schedule Modes

Both async and schedule are asynchronous execution modes: results are not returned in the response payload but are delivered via callbacks while the run executes.

schedule is built on top of async and only adds ways to postpone and schedule calls. Both modes share the same callback delivery mechanism. For a detailed comparison and guidance on when to use each mode, see When to use live, async or schedule modes.

Setting Up Callbacks

Every action call in async/schedule mode requires a callback parameter:

url

string

Your webhook URL that will receive the callback data. Must be HTTPS.

headers

array

Optional headers for authentication or custom metadata (API keys, etc.).

Please refer to each async/schedule action in the API Reference for specific details.
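As a rough illustration of the callback parameter described above, the sketch below builds the object and rejects non-HTTPS URLs before any request is sent. The helper name build_callback is hypothetical, and the shape of each headers entry (a name/value object) is an assumption; check the API Reference for the exact format.

```python
from urllib.parse import urlparse


def build_callback(url, headers=None):
    """Return a callback object with the documented `url` and `headers` fields."""
    parsed = urlparse(url)
    # The webhook URL must be HTTPS, so fail early on anything else.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"callback url must be an absolute HTTPS URL: {url!r}")
    callback = {"url": url}
    if headers:
        # Assumption: each header is sent as a {"name": ..., "value": ...} object.
        callback["headers"] = [
            {"name": name, "value": value} for name, value in headers.items()
        ]
    return callback
```

The resulting dictionary would be passed as the callback field of an async or schedule action call.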

Callback Structure

All callbacks follow the same consistent format:

Callback Delivery: Both async and schedule modes deliver results via the callback URL provided in your request.
Consistent Format: All execution modes use the same action logic, so inputs and results are identical regardless of mode.
Error Handling: Errors follow the standard API error format.

async callback format

{
  "run": {
    "run_uid": "string",
    "batch_uid": "string",
    "status": "CREATED" || "INVALID" || "QUEUED" || "SCHEDULED" || "BLOCKED" || "STOPPED" || "RUNNING" || "FAILED" || "PARTIAL_SUCCEEDED" || "SUCCEEDED"
  },
  "input": {} || null,
  "custom_data": {} || null,
  "error": {} || null,
  "results": [] || null
}

run

object

Execution context containing run_uid (unique to the entire run), batch_uid (unique to this batch), and status (current execution state).

input

object | null

The processed input data that was used for this batch (cleaned and validated by our engine).

custom_data

object | null

Your custom data passed via inputs.custom_data. Available at the root level for easy access.

error

object | null

Error details if this batch failed. Follows the standard API error format.

results

array | null

The actual results for this batch (null if there was an error). Format matches the specific action’s output.
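One way to work with this payload on the receiving side is to parse it into a small typed object. The sketch below mirrors the fields listed above; the Callback class and parse_callback function are hypothetical names, not part of any Captain Data SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Callback:
    run_uid: str
    batch_uid: str
    status: str
    input: Optional[dict]
    custom_data: Optional[dict]
    error: Optional[dict]
    results: Optional[list]

    @property
    def failed(self) -> bool:
        # A batch is treated as failed when the error object is present.
        return self.error is not None


def parse_callback(payload: dict) -> Callback:
    """Map a decoded callback body onto the Callback dataclass."""
    run = payload["run"]
    return Callback(
        run_uid=run["run_uid"],
        batch_uid=run["batch_uid"],
        status=run["status"],
        input=payload.get("input"),
        custom_data=payload.get("custom_data"),
        error=payload.get("error"),
        results=payload.get("results"),
    )
```

A webhook handler would typically call parse_callback(json.loads(body)) and branch on failed or status.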

How Callbacks Work

Parallel Processing & Pagination

Async actions are designed to scale efficiently by processing multiple pages concurrently:

Improved performance: Pages are processed in parallel
Manageable payloads: Each page generates a separate callback
Faster time-to-first-result: Start consuming data immediately

Callback Flow

Trigger Action

Action is triggered in async mode with max_results: 100

Auto-Pagination

Backend automatically paginates results with page_size: 10

Receive Running Callbacks

You receive 10 callbacks with:

status: RUNNING
Each containing ~10 results (may vary slightly as we filter out ads and other content)

Final Success Callback

Once the run completes, you receive one final callback with status: SUCCEEDED

Important: You may receive multiple callbacks for a single execution, and they may arrive out of order due to parallel processing. Always use the run_uid and batch_uid to track and organize your data.
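The arithmetic behind the flow above can be sketched as a ceiling division. This is only an estimate under the assumptions of the example (one RUNNING callback per page, one final callback, no retries); the function names are hypothetical and the page size is chosen by the backend.

```python
def expected_running_callbacks(max_results: int, page_size: int) -> int:
    """Number of pages, and therefore RUNNING callbacks, for a run."""
    if max_results < 0 or page_size <= 0:
        raise ValueError("max_results must be >= 0 and page_size must be > 0")
    # Integer ceiling division: a partially filled last page still counts.
    return (max_results + page_size - 1) // page_size


def expected_total_callbacks(max_results: int, page_size: int) -> int:
    """RUNNING callbacks plus the single final callback."""
    return expected_running_callbacks(max_results, page_size) + 1
```

With max_results of 100 and page_size of 10, this gives 10 RUNNING callbacks and 11 callbacks in total, matching the flow described above.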

Handling Callbacks & Idempotency

Since you may receive multiple callbacks (including retries), implement idempotency to prevent duplicate processing:

Best Practices

Track processed results using meaningful keys (e.g., linkedin_profile_id) or run_uid/batch_uid
Aggregate progressively as you receive RUNNING callbacks
Finalize only when you receive the SUCCEEDED callback
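The practices above can be sketched as a small in-memory aggregator. It is a minimal illustration, not a production design: the RunAggregator class is hypothetical, state would normally live in a database, and the deduplication key (run_uid, batch_uid, status) is an assumption about what makes a callback unique.

```python
class RunAggregator:
    """Collects results per run and ignores callbacks it has already seen."""

    def __init__(self):
        self._seen = set()        # (run_uid, batch_uid, status) already processed
        self._results = {}        # run_uid -> aggregated results
        self._finalized = set()   # run_uids that received SUCCEEDED

    def handle(self, payload: dict) -> bool:
        """Process one callback; return False if it was a duplicate."""
        run = payload["run"]
        key = (run["run_uid"], run["batch_uid"], run["status"])
        if key in self._seen:
            return False  # retry or replay of a callback we already handled
        self._seen.add(key)
        # Aggregate progressively; `results` may be null on errors.
        bucket = self._results.setdefault(run["run_uid"], [])
        bucket.extend(payload.get("results") or [])
        if run["status"] == "SUCCEEDED":
            self._finalized.add(run["run_uid"])
        return True

    def results(self, run_uid: str) -> list:
        return list(self._results.get(run_uid, []))

    def is_finalized(self, run_uid: str) -> bool:
        return run_uid in self._finalized
```

Because duplicates are dropped by key rather than by arrival order, the aggregator gives the same result when callbacks arrive out of order or are delivered more than once.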

Benefits

This approach allows you to:

Stream results progressively
Handle partial failures gracefully
Ensure data consistency even with retries

Captain Data provides endpoints to track, retrieve, and manage your callbacks.

Listing Callbacks

Use GET /runs/callbacks to retrieve all callbacks with filtering and pagination.

Available filters:

status: PENDING, RUNNING, FAILED, SUCCESS
run_uid: Filter by specific run
limit & offset: Pagination
sort: Sorting (prefix with - for descending)

Example: List failed callbacks

curl -X GET "https://api.captaindata.co/v4/runs/callbacks?status=FAILED&limit=20&sort=-created_at" \
  -H "X-API-Key: <YOUR_API_KEY>"
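The same call could be prepared from Python with the standard library. This sketch only builds the request; the helper name build_list_callbacks_request is hypothetical, and the response shape is not covered here, so parsing is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.captaindata.co/v4"


def build_list_callbacks_request(api_key: str, **filters) -> Request:
    """Build a GET /runs/callbacks request with the given query filters."""
    # Drop filters left as None so they do not appear in the query string.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{BASE_URL}/runs/callbacks" + (f"?{query}" if query else "")
    return Request(url, headers={"X-API-Key": api_key}, method="GET")
```

For example, build_list_callbacks_request(key, status="FAILED", limit=20, sort="-created_at") prepares the same request as the curl command above; passing it to urllib.request.urlopen would send it.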

Getting a Specific Callback

Use GET /runs/callbacks/{callback_uid} to fetch detailed information:

Example: Get callback details

curl -X GET "https://api.captaindata.co/v4/runs/callbacks/{callback_uid}" \
  -H "X-API-Key: <YOUR_API_KEY>"

Replaying Callbacks

Use POST /runs/callbacks/{callback_uid}/replay to retry failed callbacks:

How replay works:

Each replay creates a new callback (up to 3 total replays)
You can only replay the original callback, not callback responses
This helps you track each attempt and identify specific issues with your webhook URL

When to use replay:

Your webhook URL was temporarily unavailable
You received a callback but want to retry processing
You need to debug callback delivery issues

Example: Replay a callback

curl -X POST "https://api.captaindata.co/v4/runs/callbacks/{callback_uid}/replay" \
  -H "X-API-Key: <YOUR_API_KEY>" \
  -H "Content-Type: application/json"
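Since replays are capped at 3, a client may want to track how many it has already requested before calling the endpoint. The sketch below is a local guard only: the ReplayBudget class is hypothetical, the count lives in memory, and the path follows POST /runs/callbacks/{callback_uid}/replay as stated in the text.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.captaindata.co/v4"
MAX_REPLAYS = 3  # documented limit on total replays


class ReplayBudget:
    """Builds replay requests and refuses to exceed the replay limit locally."""

    def __init__(self, max_replays: int = MAX_REPLAYS):
        self.max_replays = max_replays
        self._used = {}  # original callback_uid -> replays requested so far

    def build_replay_request(self, api_key: str, callback_uid: str) -> Request:
        used = self._used.get(callback_uid, 0)
        if used >= self.max_replays:
            raise RuntimeError(f"replay limit reached for callback {callback_uid}")
        self._used[callback_uid] = used + 1
        url = f"{BASE_URL}/runs/callbacks/{quote(callback_uid, safe='')}/replay"
        headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
        return Request(url, headers=headers, method="POST")
```

The server remains the source of truth for the limit; the guard simply avoids sending requests that are expected to be rejected.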

If you have a sandbox, you will have access to several workspaces. The API key you send determines which workspace each call applies to.

Next Steps

See the API Reference for detailed endpoint usage, parameters, and more examples.
