Skip to content

Job Idempotency (Deduplication)

When you click Run All Experiments quickly more than once, the backend automatically detects the duplicate requests and returns the existing in-progress job instead of spawning new Celery tasks.

What Gets Deduplicated

A submission is considered a duplicate if all three of these match a job already in queued or running state created within the last 30 seconds:

Key Description
dataset_source_id The dataset the pipeline is running on
node_id The canvas node ID of the terminal (Training / Tuning) node
branch_index Which parallel branch (0 = first, 1 = second, …)

Parallel Branches Are Not Deduplicated Against Each Other

If your canvas has two paths feeding the same training node (parallel experiment mode), Branch 0 and Branch 1 get different branch_index values from the partitioner and are each allowed to create their own job.

Branch 0: Dataset → Encode → [Train]          branch_index=0
Branch 1: Dataset → Encode → Scale → [Train]  branch_index=1

Clicking twice fires two requests per branch. Request 2 of Branch 0 is deduplicated against Request 1 of Branch 0. Branch 1 is unaffected.

The 30-Second Window

The dedup window is 30 seconds. After 30 seconds have passed since the last job was created, a new click always starts a fresh run. This prevents accidental blocking of legitimate re-runs.

How It Works Internally

The backend uses a per-key asyncio.Lock to make the "check if job exists → create if not" step atomic:

Click 1  →  acquires lock  →  no job found  →  creates job J1  →  releases lock
Click 2  →  acquires lock  →  J1 found (queued)  →  returns J1  →  releases lock

Without the lock, both clicks could see an empty table simultaneously before either one committed its new row (a classic read-modify-write race).

Known Limitation

The lock lives in-process. If you run multiple uvicorn workers (--workers N) each worker has its own lock dictionary and cross-process races are not prevented. The current default single-worker configuration is fully protected.