File Layout

daggle uses a fixed directory structure for configuration, DAG definitions, and run data.

Directory structure

~/.config/daggle/
  config.yaml                    # Global configuration (tool paths, cleanup, etc.)
  dags/                          # Global DAG definitions
    my-pipeline.yml
    nightly-etl.yml
  projects.yaml                  # Registered project directories

~/.local/share/daggle/
  runs/                          # Run data, organized by DAG and date
    <dag>/
      <YYYY-MM-DD>/
        run_<id>/
          meta.json              # Run metadata (params, start time, status)
          events.jsonl           # Append-only event log (see Event Schema)
          dag.yaml               # Snapshot of the DAG YAML at run start
          dag_diff.patch         # Unified diff vs. the prior run (only if dag_hash changed)
          <step>.stdout.log      # Step stdout (includes output markers)
          <step>.stderr.log      # Step stderr
          <step>.inline.R        # Rendered inline R code (for r_expr steps)
          <step>.sessioninfo.json # R sessionInfo() — written only on R step failure
  proc/
    scheduler.pid                # Scheduler daemon PID file

.daggle/                         # Project-local DAG definitions (repo root)
  my-dag.yml

DAG discovery order

When resolving a DAG by name, daggle searches in this order:

1. --dags-dir flag (if provided)
2. DAGGLE_DAGS_DIR environment variable (if set)
3. .daggle/ in the current working directory (project-local)
4. ~/.config/daggle/dags/ (global)

The first match wins.

Run directory contents

Each run gets its own directory under runs/<dag>/<date>/run_<id>/.

File	Description
`meta.json`	Run metadata: DAG name, parameters, start/end timestamps, final status.
`events.jsonl`	Append-only event log. See Event Schema.
`dag.yaml`	Copy of the DAG YAML taken at run start. Used by `dag_diff.patch` and by `daggle archive` to make runs self-describing and reproducible.
`dag_diff.patch`	Unified diff of this run’s `dag.yaml` against the previous run’s. Written only when the DAG hash changed between runs. Gives a self-contained “what changed?” record without requiring git.
`<step>.stdout.log`	Captured stdout for each step. Includes raw output markers.
`<step>.stderr.log`	Captured stderr for each step.
`<step>.inline.R`	Rendered R source for `r_expr` steps (useful for debugging).
`<step>.sessioninfo.json`	`sessionInfo()` snapshot written only when an R step fails. Contains `r_version`, `platform`, `error_message`, `session_info` (full text), and `timestamp`. Useful for compliance and post-mortem debugging — proves which package versions were active at the moment of failure without re-running R.

Overriding directories

Mechanism	Config dir	Data dir	DAGs dir
CLI flags	–	`--data-dir`	`--dags-dir`
Environment variables	`DAGGLE_CONFIG_DIR`	`DAGGLE_DATA_DIR`	`DAGGLE_DAGS_DIR`
XDG fallback	`$XDG_CONFIG_HOME/daggle`	`$XDG_DATA_HOME/daggle`	–
Default	`~/.config/daggle`	`~/.local/share/daggle`	(discovery order)

Priority is top to bottom: CLI flags override environment variables, which override the XDG locations, which override the defaults.

Global configuration (`config.yaml`)

The file ~/.config/daggle/config.yaml holds global settings. All fields are optional.

# Override tool paths (useful when the scheduler can't find binaries)
tools:
  rscript: /usr/local/bin/Rscript
  quarto: /opt/homebrew/bin/quarto
  git: /usr/bin/git

# Automatic cleanup of old run data
cleanup:
  older_than: "30d"
  interval: "6h"

# Named notification channels, referenced from hooks by name.
# See the Hooks page for usage.
notifications:
  team-slack:
    type: slack
    webhook_url: https://hooks.slack.com/services/...
  ops-email:
    type: smtp
    smtp_host: mail.example.com
    smtp_port: 587
    smtp_from: daggle@example.com
    smtp_to: [ops@example.com]
    smtp_user: daggle
    smtp_password: ${SMTP_PASSWORD}

Tool path resolution

At startup, daggle resolves each tool path using this precedence:

1. tools: in config.yaml (explicit absolute path; highest priority)
2. exec.LookPath (searches the current PATH)
3. Bare binary name (fallback; may fail if not on PATH)

Auto-detection: When daggle discovers a tool via PATH lookup (step 2), it saves the resolved absolute path to config.yaml. Running any daggle command once from an interactive shell (e.g., daggle doctor or daggle run) is therefore enough to persist the tool paths. Later scheduler runs, even as a system service with a minimal PATH, will find the saved paths in config.yaml.
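
The three-step precedence can be sketched with Go's standard `os/exec` package. This is a minimal illustration rather than daggle's source: `resolveTool` is a hypothetical name, and writing the result back to config.yaml is left out. The second return value marks paths found via PATH lookup, which are the ones the auto-detection step would persist.

```go
package main

import (
	"fmt"
	"os/exec"
)

// resolveTool applies the precedence described above. configured is the
// value from the tools: section of config.yaml ("" if absent). fromPath
// is true only when the path was discovered via exec.LookPath.
func resolveTool(name, configured string) (path string, fromPath bool) {
	if configured != "" {
		return configured, false // 1. explicit path from config.yaml
	}
	if found, err := exec.LookPath(name); err == nil {
		return found, true // 2. discovered on the current PATH
	}
	return name, false // 3. bare name; may fail when executed
}

func main() {
	for _, tool := range []string{"Rscript", "quarto", "git"} {
		p, fromPath := resolveTool(tool, "")
		fmt.Printf("%-8s %s (found via PATH: %v)\n", tool, p, fromPath)
	}
}
```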

Run daggle doctor to see which paths daggle resolved.