File Layout
daggle uses a fixed directory structure for configuration, DAG definitions, and run data.
Directory structure
~/.config/daggle/
  config.yaml                      # Global configuration (tool paths, cleanup, etc.)
  dags/                            # Global DAG definitions
    my-pipeline.yml
    nightly-etl.yml
  projects.yaml                    # Registered project directories

~/.local/share/daggle/
  runs/                            # Run data, organized by DAG and date
    <dag>/
      <YYYY-MM-DD>/
        run_<id>/
          meta.json                # Run metadata (params, start time, status)
          events.jsonl             # Append-only event log (see Event Schema)
          dag.yaml                 # Snapshot of the DAG YAML at run start
          dag_diff.patch           # Unified diff vs. the prior run (only if dag_hash changed)
          <step>.stdout.log        # Step stdout (includes output markers)
          <step>.stderr.log        # Step stderr
          <step>.inline.R          # Rendered inline R code (for r_expr steps)
          <step>.sessioninfo.json  # R sessionInfo(), written only on R step failure
  proc/
    scheduler.pid                  # Scheduler daemon PID file

.daggle/                           # Project-local DAG definitions (repo root)
  my-dag.yml
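Because the run layout is fixed, run directories can be enumerated with plain path operations. The following is a minimal sketch, not part of daggle: the names `RunDir` and `iter_runs` are hypothetical, and the data directory is passed in explicitly instead of being resolved.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class RunDir(NamedTuple):
    """One run directory, split into the components of the layout above."""
    dag: str
    date: str     # name of the <YYYY-MM-DD> directory
    run_id: str   # the part of the directory name after "run_"
    path: Path


def iter_runs(data_dir: Path, dag: str) -> Iterator[RunDir]:
    """Yield runs/<dag>/<date>/run_<id>/ directories, oldest date first.

    ISO dates sort chronologically as plain strings. Within one date the
    order is by directory name, which is not necessarily chronological.
    """
    dag_root = data_dir / "runs" / dag
    if not dag_root.is_dir():
        return
    for date_dir in sorted(p for p in dag_root.iterdir() if p.is_dir()):
        for run_dir in sorted(date_dir.glob("run_*")):
            if run_dir.is_dir():
                run_id = run_dir.name[len("run_"):]
                yield RunDir(dag, date_dir.name, run_id, run_dir)
```

For example, `list(iter_runs(Path.home() / ".local/share/daggle", "nightly-etl"))` would list every stored run of the `nightly-etl` DAG under the default data directory.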
DAG discovery order
When resolving a DAG by name, daggle searches in this order:
1. --dags-dir flag (if provided)
2. DAGGLE_DAGS_DIR environment variable (if set)
3. .daggle/ in the current working directory (project-local)
4. ~/.config/daggle/dags/ (global)
The first match wins.
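The search order can be sketched as an ordered list of directories that is scanned until a file matches. This is an illustration under assumptions, not daggle's implementation: `dag_search_dirs` and `resolve_dag` are hypothetical names, the sketch assumes a DAG named `foo` lives in `foo.yml` or `foo.yaml`, and it uses the default global directory without accounting for a relocated config directory.

```python
from pathlib import Path
from typing import List, Mapping, Optional, Sequence


def dag_search_dirs(
    dags_dir_flag: Optional[str],
    env: Mapping[str, str],
    cwd: Path,
    home: Path,
) -> List[Path]:
    """Build the list of directories to search, highest priority first."""
    dirs: List[Path] = []
    if dags_dir_flag:                       # 1. --dags-dir flag
        dirs.append(Path(dags_dir_flag))
    if env.get("DAGGLE_DAGS_DIR"):          # 2. environment variable
        dirs.append(Path(env["DAGGLE_DAGS_DIR"]))
    dirs.append(cwd / ".daggle")            # 3. project-local
    dirs.append(home / ".config" / "daggle" / "dags")  # 4. global default
    return dirs


def resolve_dag(
    name: str,
    search_dirs: Sequence[Path],
    extensions: Sequence[str] = (".yml", ".yaml"),  # assumed file extensions
) -> Optional[Path]:
    """Return the first existing DAG file for `name`, or None."""
    for directory in search_dirs:
        for ext in extensions:
            candidate = directory / f"{name}{ext}"
            if candidate.is_file():
                return candidate            # first match wins
    return None
```

A call such as `resolve_dag("my-dag", dag_search_dirs(None, os.environ, Path.cwd(), Path.home()))` (with `import os`) would prefer a project-local `my-dag.yml` over a global file of the same name.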
Run directory contents
Each run gets its own directory under runs/<dag>/<date>/run_<id>/.
| File | Description |
|---|---|
| meta.json | Run metadata: DAG name, parameters, start/end timestamps, final status. |
| events.jsonl | Append-only event log. See Event Schema. |
| dag.yaml | Copy of the DAG YAML taken at run start. Used by dag_diff.patch and by daggle archive to make runs self-describing and reproducible. |
| dag_diff.patch | Unified diff of this run’s dag.yaml against the previous run’s. Written only when the DAG hash changed between runs. Gives a self-contained “what changed?” record without requiring git. |
| <step>.stdout.log | Captured stdout for each step. Includes raw output markers. |
| <step>.stderr.log | Captured stderr for each step. |
| <step>.inline.R | Rendered R source for r_expr steps (useful for debugging). |
| <step>.sessioninfo.json | sessionInfo() snapshot written only when an R step fails. Contains r_version, platform, error_message, session_info (full text), and timestamp. Useful for compliance and post-mortem debugging: it shows which package versions were active at the moment of failure without re-running R. |
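Since these are plain JSON and JSON Lines files, a run can be inspected with any standard parser. The sketch below is illustrative only: `load_run` and `failed_r_steps` are hypothetical helpers, and no particular keys inside meta.json or events.jsonl are assumed.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Tuple


def load_run(run_dir: Path) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Return (metadata, events) for one run directory."""
    meta = json.loads((run_dir / "meta.json").read_text(encoding="utf-8"))

    events: List[Dict[str, Any]] = []
    events_path = run_dir / "events.jsonl"
    if events_path.exists():
        with events_path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:                      # tolerate blank lines
                    events.append(json.loads(line))
    return meta, events


def failed_r_steps(run_dir: Path) -> List[str]:
    """Names of steps that left a sessioninfo file behind.

    The file is written only when an R step fails, so its presence
    identifies the failed R steps of the run.
    """
    suffix = ".sessioninfo.json"
    return sorted(p.name[: -len(suffix)] for p in run_dir.glob("*" + suffix))
```

Reading events.jsonl line by line keeps the helper usable on a run that is still in progress, because an append-only log only ever gains lines at the end.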
Overriding directories
| Mechanism | Config dir | Data dir | DAGs dir |
|---|---|---|---|
| CLI flags | – | --data-dir | --dags-dir |
| Environment variables | DAGGLE_CONFIG_DIR | DAGGLE_DATA_DIR | DAGGLE_DAGS_DIR |
| XDG fallback | $XDG_CONFIG_HOME/daggle | $XDG_DATA_HOME/daggle | – |
| Default | ~/.config/daggle | ~/.local/share/daggle | (discovery order) |
Priority runs top to bottom: CLI flags override environment variables, which override the XDG fallback, which overrides the defaults. A dash (–) means the mechanism does not apply to that directory.
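The priority chain for the config and data directories can be expressed as a single fall-through function. This is a sketch under stated assumptions: `resolve_dir`, `data_dir`, and `config_dir` are hypothetical names, and an empty string is treated the same as an unset value.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_dir(
    flag: Optional[str],
    env: Mapping[str, str],
    env_var: str,
    xdg_var: str,
    default: Path,
) -> Path:
    """Pick a directory: CLI flag, then env var, then XDG, then default."""
    if flag:
        return Path(flag)
    if env.get(env_var):
        return Path(env[env_var])
    if env.get(xdg_var):
        return Path(env[xdg_var]) / "daggle"   # XDG base dir + app name
    return default


def data_dir(flag: Optional[str], env: Mapping[str, str], home: Path) -> Path:
    return resolve_dir(flag, env, "DAGGLE_DATA_DIR", "XDG_DATA_HOME",
                       home / ".local" / "share" / "daggle")


def config_dir(env: Mapping[str, str], home: Path) -> Path:
    # The config directory has no CLI flag, so the chain starts at the env var.
    return resolve_dir(None, env, "DAGGLE_CONFIG_DIR", "XDG_CONFIG_HOME",
                       home / ".config" / "daggle")
```

Passing the environment and home directory as arguments, instead of reading `os.environ` and `Path.home()` inside the function, keeps each level of the chain easy to check in isolation.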
Global configuration (config.yaml)
The file ~/.config/daggle/config.yaml holds global settings. All fields are optional.
# Override tool paths (useful when the scheduler can't find binaries)
tools:
  rscript: /usr/local/bin/Rscript
  quarto: /opt/homebrew/bin/quarto
  git: /usr/bin/git

# Automatic cleanup of old run data
cleanup:
  older_than: "30d"
  interval: "6h"

# Named notification channels, referenced from hooks by name.
# See the Hooks page for usage.
notifications:
  team-slack:
    type: slack
    webhook_url: https://hooks.slack.com/services/...
  ops-email:
    type: smtp
    smtp_host: mail.example.com
    smtp_port: 587
    smtp_from: daggle@example.com
    smtp_to: [ops@example.com]
    smtp_user: daggle
    smtp_password: ${SMTP_PASSWORD}

Tool path resolution
At startup, daggle resolves each tool path using this precedence:
1. tools: in config.yaml, an explicit absolute path (highest priority)
2. exec.LookPath, which searches the current PATH
3. Bare binary name as a fallback (may fail if the binary is not on PATH)
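The three-step precedence can be sketched as follows. This is an analogue, not daggle's code: Python's `shutil.which` stands in for `exec.LookPath`, `resolve_tool` is a hypothetical name, and the mapping from config key (`rscript`) to binary name (`Rscript`) is supplied by the caller as an assumption.

```python
import shutil
from typing import Callable, Mapping, Optional, Tuple


def resolve_tool(
    key: str,
    binary: str,
    configured: Mapping[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Tuple[str, str]:
    """Return (path, source), where source is "config", "path", or "bare"."""
    explicit = configured.get(key)
    if explicit:                 # 1. explicit path from the tools: section
        return explicit, "config"
    found = which(binary)
    if found:                    # 2. lookup on the current PATH
        return found, "path"
    return binary, "bare"        # 3. bare name; may fail when executed
```

For example, `resolve_tool("git", "git", {})` returns the PATH hit when git is installed, while `resolve_tool("rscript", "Rscript", {"rscript": "/usr/local/bin/Rscript"})` returns the configured path without consulting PATH at all.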
Auto-detection: When daggle discovers a tool via PATH lookup (step 2), it automatically saves the resolved absolute path to config.yaml. This means running any daggle command once from an interactive shell (e.g., daggle doctor or daggle run) will persist the tool paths. Future scheduler runs — even as a system service with a minimal PATH — will find the saved paths in config.
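The persistence step can be pictured as "fill in the tools section only when it is empty and the PATH lookup succeeds". The sketch below works on an in-memory dictionary that mirrors config.yaml; `remember_tool` is a hypothetical name, `shutil.which` again stands in for `exec.LookPath`, and writing the dictionary back to disk is deliberately left out.

```python
import os
import shutil
from typing import Any, Callable, Dict, Optional


def remember_tool(
    config: Dict[str, Any],
    key: str,
    binary: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Store the absolute path of `binary` under config["tools"][key].

    Returns True when the config was changed and therefore needs saving.
    An existing entry is never overwritten.
    """
    tools = config.get("tools") or {}
    if tools.get(key):
        return False             # explicit path already configured
    found = which(binary)
    if not found:
        return False             # nothing to persist
    tools[key] = os.path.abspath(found)
    config["tools"] = tools
    return True
```

Never overwriting an existing entry is what lets a path saved from an interactive shell survive later runs under a service with a minimal PATH.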
Run daggle doctor to see which paths daggle resolved.