Comparison with other tools

Deciding between daggle and something else? This page maps daggle against the main alternatives you may already be using. The short version:

| If you use… | Consider daggle when… |
| --- | --- |
| targets | You need scheduling, retries, notifications, deadlines, or multi-pipeline orchestration around your existing tar_make() runs. |
| cron + Rscript | You want dependencies, retries, timeouts, parallel execution, and structured logs for scheduled R scripts. |
| dagu | You run R in your pipelines and want first-class R awareness rather than shelling out to Rscript. |
| Airflow / Prefect / Dagster | Your workloads are R-centric and you don’t need a database-backed, Kubernetes-scale platform. |
| GitHub Actions | You need to run R workflows on your own machines, with direct access to private data and seconds (not minutes) of trigger latency. |
| orderly2 | You need scheduling or broader workflow orchestration around reproducible orderly reports. |

Skip to any section below – they’re independent.


targets

targets is an R package for reproducible, dependency-aware computation. It tracks which objects are stale and rebuilds them.

daggle and targets solve different problems. targets answers “which objects need rebuilding?”. daggle answers “when should this run, what happens if it fails, and who gets told?”.

The two are complementary. A common pattern is to run tar_make() inside a daggle step:

name: nightly-etl
trigger:
  schedule: "0 2 * * *"        # every night at 02:00
steps:
  - id: pipeline
    targets: .                 # run tar_make() on the project in this directory
    timeout: 2h                # kill the run if it exceeds two hours
    retry:
      limit: 2                 # retry up to twice,
      backoff: exponential     # with exponentially growing waits
    on_failure:
      r_expr: slackr::slackr_msg("nightly ETL failed")

daggle handles the when, the retries, the timeouts, and the notifications; targets handles the what and the cache. Neither replaces the other.

|  | targets | daggle |
| --- | --- | --- |
| Core job | Dependency-based caching and invalidation | Scheduling, orchestration, observability |
| Scheduling | None (you call tar_make()) | Built-in cron, CLI, API, file watch, webhook |
| Failure handling | Stops; you check tar_meta() | Retries, timeouts, SIGTERM/SIGKILL, hooks |
| Notifications | None | on_success / on_failure hooks (R expressions) |
| Multi-pipeline | One _targets.R per project | Unlimited DAGs, each a YAML file |
| Non-R steps | tar_target_raw() + system2() | First-class command: steps |
| Package dev | Not designed for it | test:, check:, document:, pkgdown: steps |
| Artifact tracking | Object store keyed by hash | Format-aware (.rds, .parquet, .arrow), SHA-256 hashed, per-run |
| Run comparison | tar_meta() diffing by hand | daggle diff run-a run-b |

Pick targets alone if you have a single-project, locally invoked pipeline and you just want tar_make() to rebuild stale objects.

Pick daggle (with or without targets inside) if you need scheduling, retries, notifications, deadlines, parallel DAGs, or run history.


cron + Rscript

The incumbent for most R users doing scheduled work: a crontab line that calls Rscript some_script.R.

|  | cron + Rscript | daggle |
| --- | --- | --- |
| Dependencies between scripts | Implicit (run order, hope for the best) | Explicit depends: with parallel execution |
| Failure visibility | Buried in syslog or email | Structured events, daggle status, hooks |
| Retries | DIY in each script | Declarative retry: { limit: 3, backoff: exponential } |
| Timeouts | None (runaway R processes forever) | SIGTERM then SIGKILL with process group cleanup |
| Passing data between scripts | Hardcoded temp file paths | ::daggle-output:: markers + per-run artifact dir |
| History | None | Full run history with logs and artifacts |
| Triggers | Time only | Cron + file watch + webhook + on_dag + git + condition |
| Setup cost | Zero | go install + one YAML file |

Migration is incremental. You can put one script behind daggle, keep the rest in cron, and migrate more as you see value.
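
A minimal sketch of that first migrated script. The trigger, script:, timeout:, retry:, and depends: fields are the ones from the table above; the fetch.R and clean.R file names and the bracketed depends: form are illustrative assumptions:

name: ingest
trigger:
  schedule: "*/15 * * * *"     # same cadence as the old crontab line
steps:
  - id: fetch
    script: fetch.R            # was: */15 * * * * Rscript fetch.R
    timeout: 10m
    retry:
      limit: 3
      backoff: exponential
  - id: clean
    script: clean.R
    depends: [fetch]           # explicit ordering instead of "hope for the best"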

Pick cron alone if you have a single script, no dependencies, no failure notifications needed, and you’re happy reading syslog.

Pick daggle if anything on that list starts to pinch.


dagu

dagu is daggle’s closest spiritual ancestor and a direct inspiration – same philosophy: single binary, YAML DAGs, file-based storage, local-first.

|  | dagu | daggle |
| --- | --- | --- |
| Language awareness | None – all steps are shell commands | R-native: script:, r_expr:, test:, check:, quarto:, … |
| R step boilerplate | command: Rscript --no-save --no-restore script.R | script: script.R |
| renv / library mgmt | Manual env var setup | Auto-detects renv.lock, sets R_LIBS_USER |
| R version constraints | DIY | r_version: ">= 4.3" |
| Test output | Exit code only | Structured pass/fail/skip + coverage |
| R CMD check | DIY shell command | check: with parsed errors/warnings/notes |
| Rendering | DIY | quarto: and rmd: with format and site support |
| Process observability | None | Per-step peak_rss_kb, user_cpu_sec, sys_cpu_sec |
| Live monitoring | Web UI polling | daggle monitor TUI + SSE event stream |

The gap: dagu treats R like any other shell command. It has no concept of what an R CMD check warning means, what renv is, or why a testthat failure should show which tests failed.
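
To make the boilerplate row concrete, a hedged sketch of the same testthat step in both tools. The dagu step uses its standard name:/command: fields; on the daggle side, test: taking a package path is an assumed shape, and testthat::test_local() is one way to run tests from the shell:

# dagu: R is just another shell command; all you get back is an exit code
steps:
  - name: tests
    command: Rscript --no-save --no-restore -e 'testthat::test_local()'

# daggle: the same step, with parsed pass/fail/skip and coverage (assumed shape)
steps:
  - id: tests
    test: .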

Pick dagu if your steps are mostly shell commands and R is incidental.

Pick daggle if R is the assumed language.


Airflow / Prefect / Dagster

Airflow, Prefect, and Dagster are the enterprise Python pipeline tools – the industry standard for data engineering at scale.

|  | Airflow et al. | daggle |
| --- | --- | --- |
| Primary language | Python-first (DAGs as Python code) | R-first (DAGs as YAML, steps are R) |
| Install | pip + database + webserver + scheduler + workers | Single binary |
| Infrastructure | PostgreSQL/MySQL, Redis/RabbitMQ, Python env | R |
| Memory at idle | 500 MB+ | 10–30 MB |
| R integration | BashOperator("Rscript ...") | Native R step types |
| Distributed execution | Yes (Celery, K8s, Dask) | No – single host |
| Learning curve | Weeks | Minutes |
| Approval gates | Custom operator code | First-class approve: step (sketch below) |
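
A sketch of the approval-gates row in daggle terms. approve: and depends: appear in the tables above; the message field, the bracketed depends: form, and the step and file names are assumptions for illustration:

steps:
  - id: train
    script: train_model.R           # hypothetical training script
  - id: gate
    approve:
      message: "Promote the model?" # assumed field on the approve: step
    depends: [train]
  - id: deploy
    script: deploy_model.R
    depends: [gate]                 # runs only after a human approves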

Pick Airflow-class tools if you have a data engineering team, dozens of pipelines across many languages, Kubernetes or similar for workers, or strict compliance requirements that demand a managed platform.

Pick daggle if your workload is R-centric and the operational cost of Airflow isn’t justified.


GitHub Actions

Many R package authors use GHA for testing and checking on every push and pull request; r-lib/actions provides solid R setup.

|  | GitHub Actions | daggle |
| --- | --- | --- |
| Runs where | GitHub’s cloud runners | Your machine, your server |
| Data access | Must upload/download | Direct access to local files, databases, network |
| Triggers | Push, PR, schedule | CLI, cron, file watch, webhook, on_dag, git, condition |
| R awareness | r-lib/actions provides setup | Native R step types with structured output |
| Iteration speed | Push, wait, read logs | daggle run with instant feedback |
| Cost | Free tier limits, then paid | Free |
| Privacy | Code and data on GitHub infrastructure | Everything stays local |
| Scheduling latency | Minutes | Seconds |

They coexist well. Use GHA for public CI on pushes and PRs. Use daggle locally (or on a self-hosted server) for:

  • workflows that touch private data (see the sketch after this list),
  • scheduled runs that need seconds (not minutes) of trigger latency,
  • interactive iteration on DAGs where the push-wait-read-logs loop is too slow.
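
For example, a private-data workflow with a file-watch trigger might look like this sketch. quarto: steps and file-watch triggers come from the tables above, while the watch: spelling and the file paths are assumptions:

name: refresh-dashboard
trigger:
  watch: data/latest.csv      # assumed spelling of the file-watch trigger
steps:
  - id: render
    quarto: dashboard.qmd     # hypothetical Quarto document
    timeout: 30m
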

orderly2

orderly2 is an R package from VIMC for reproducible report packets with dependency tracking between reports.

|  | orderly2 | daggle |
| --- | --- | --- |
| Core focus | Reproducible report packets with provenance | General workflow orchestration |
| Scheduling | None | Built-in cron + 5 other trigger types |
| Step types | R scripts producing “packets” | 27 step types: script, test, check, quarto, targets, … |
| Scope | Reports and their dependencies | Any R workflow: ETL, training, package dev, reports |
| Artifact model | Immutable packets with metadata | Format-aware artifacts per run, SHA-256 hashed |

These aren’t mutually exclusive. If you already use orderly2, daggle can call orderly2::orderly_run() from an r_expr: or script: step when you need scheduling, retries, or notifications around orderly report generation.
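
A sketch of that wrapper, reusing the trigger, retry, and on_failure fields from the first example; the “incidence” report name is hypothetical:

name: weekly-reports
trigger:
  schedule: "0 6 * * 1"        # Mondays at 06:00
steps:
  - id: incidence
    r_expr: orderly2::orderly_run("incidence")
    retry:
      limit: 2
      backoff: exponential
    on_failure:
      r_expr: slackr::slackr_msg("weekly orderly report failed")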


When not to use daggle

Honest limits:

  • You’re Python-first. All our R-specific affordances (renv, testthat output, R CMD check, Quarto integration) are dead weight. Use Airflow, Prefect, or Dagster.
  • You want a bundled drag-and-drop DAG editor. daggle intentionally ships only a minimal read-only status dashboard (daggle serve). The UI story is a REST API plus the daggleR package, so you build whatever Shiny app fits your team – an operator dashboard, a report browser, a mobile status page. That’s a feature if you already know Shiny (you get a UI in the language your R users read anyway), a limit if you want visual DAG authoring out of the box.
  • You need distributed execution across many machines. daggle runs on a single host. It doesn’t schedule to Kubernetes or a cluster.
  • You need tight integration with one cloud vendor’s managed pipeline service. daggle is local-first by design.
  • Your needs genuinely are tar_make(). Single project, no schedule, no notifications, no multi-pipeline? Keep targets.
  • Your needs genuinely are one cron line. One script, no dependencies, no retries? Keep cron.

In short, daggle fits the space between “a crontab line” and “a data platform” – where you need more than cron but less than Airflow, and where R is the assumed language.