daana-cli 0.7.0: foundation rework, ~2× faster, full-refresh, M:N relationships and more!


Released 2026-05-18

0.7.0 is a foundation rework. The headlines:

~2× faster deploy and execute thanks to parallelization across the metadata pipeline and pipeline-SQL caching. Metadata loading on BigQuery is 3-4× faster.
One execution mode, smaller operational footprint. Everything now runs inside the Go binary itself. The local Postgres sidecar is gone - no external database, no Docker port, no second process to babysit. The legacy execution-mode flag is gone with it; the in-process path is the only path. Batch lifecycle now lives in the warehouse and works the same from a laptop, CI/CD, or a teammate's machine.
--full-refresh at any scope: workflow, entity, or attribute. Same semantics as dbt's and Dataform's --full-refresh, with the option to scope finer: rebuild the full workflow (execute --full-refresh), a single entity (--entity X), or one attribute (--attribute Y). Pairs with new pipeline-execution lifecycle tracking (status, stale-run auto-recovery, --force).
Many-to-many relationships are now correct and tested across PostgreSQL and BigQuery. Set-oriented relationship views replace the previous pair-oriented behavior. Pre-0.7.0 binaries silently broke INCREMENTAL relationship processing on any model change after deploy; this is fixed.
Casing in YAML is now whatever you want it to be. Entity IDs, workflow IDs, attribute names, and relationship names accept lowercase, UPPERCASE, Mixed_Case, mixedCase. Names are preserved as written; cross-references are case-insensitive at compile time. Framework-generated columns standardize to lowercase.
INCREMENTAL correctness work on late-arriving rows, watermark progression, and M:N transitions. The bugs that make a practitioner trust or distrust a platform - surfaced honestly and fixed.

The upgrade from any pre-0.7.0 build is a standard drop-and-redeploy (10-15 minutes for a full DWH; runs are idempotent). Full migration steps and operational notes are in the Migration section near the end of this post.

What shipped

Foundation rework

The most significant structural change removes the local Postgres sidecar container that previously brokered SQL queues and internal operations. Everything now runs inside the Go binary itself - no external database, no Docker port, no second process to babysit. Local setup is a single executable.

This simplification eliminates the legacy execution-mode flag (the in-process path is now the only path), removes the Docker sidecar requirement, and collapses what used to be two operational subsystems into one. Two large code paths were deleted outright rather than refactored: the baseline-comparison infrastructure and the stored-procedure execution path.

A few small consequences:

install no longer has a local subcommand. install remote is now just install.
daana-cli reset is gone. It only ever existed to reset the sidecar.
Faster everything. Removing the SQL queue indirection cuts a layer of overhead from every deploy and execute. The ~2× speedup numbers below are with this overhead already removed.

~2× faster deploy and execute

Three parallelization and caching changes:

Parallel workers default raised from 4 to 8. This brings the metadata workers in line with the rest of the execution pipeline and removes a long-standing mismatch.
Metadata pipeline is parallel. Previously sequential; now runs across the 8 workers. Metadata loading on BigQuery is 3-4× faster as a result.
PROCINST key caching. Eliminates roughly 10 minutes of redundant pipeline-SQL regeneration per execute on BigQuery.

The combined effect is ~2× faster deploy and execute end-to-end. The bigger your model, the more you feel it.
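To make the two ideas concrete, here is a minimal Python sketch of a bounded worker pool plus a key-based SQL cache. It is an illustration only, not daana-cli's Go implementation: `load_metadata`, `build_pipeline_sql`, and the cache shape are hypothetical names chosen for this example, and only the worker count of 8 comes from the release notes.

```python
from concurrent.futures import ThreadPoolExecutor

WORKERS = 8  # default worker count named in the release notes


def build_pipeline_sql(key, generate, cache):
    """Return the SQL for `key`, calling `generate` only on a cache miss."""
    if key not in cache:
        cache[key] = generate(key)
    return cache[key]


def load_metadata(tasks, load_one, workers=WORKERS):
    """Run independent metadata tasks on a bounded pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_one, tasks))
```

The cache pays off when the same key is requested repeatedly within one run; the pool pays off when the tasks are independent and mostly wait on the warehouse.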

Many-to-many relationships fixed

Pre-0.7.0, M:N relationship support had real correctness issues under the INCREMENTAL ingestion strategy. 0.7.0 ships a hardened implementation:

M:N X-table reader rewritten with fingerprint-based set-change detection.
INCREMENTAL truedelta aligned with FULL_LOG for exact-tuple modeling. Within-batch transitions now emit historical rows correctly.
Set-oriented relationship views. view_<source>_has_<target> now returns the active member set per source entity at the latest effective timestamp. The previous pair-oriented partitioning returned one row per source/target pair; the new partitioning ties all members of the latest set together so the view returns them as one set.
Forward-only emission under INCREMENTAL. INCREMENTAL relationship pipelines no longer write row_st = 'N' rows for set transitions. Set membership changes are detected by the set-oriented view at query time rather than by row-level closer/opener emission. FULL_LOG continues to soft-delete rows for source-row disappearance.
Canonical test suite extended to cover BigQuery in addition to PostgreSQL.
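The idea behind fingerprint-based set-change detection can be sketched in a few lines of Python. This is a sketch under assumptions, not the actual X-table reader: the hash choice (SHA-256), the function names, and the dictionary shapes are all hypothetical. The point is that a source entity's whole target set is reduced to one order-independent value, so a change in membership shows up as a changed fingerprint.

```python
import hashlib


def set_fingerprint(members):
    """Order-independent fingerprint of a set of target keys."""
    digest = hashlib.sha256()
    for member in sorted(set(members)):
        digest.update(member.encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


def changed_sources(previous, incoming):
    """Return source keys whose incoming target set differs from the stored fingerprint.

    previous: dict of source key -> stored fingerprint
    incoming: dict of source key -> iterable of target keys
    """
    return sorted(
        source
        for source, targets in incoming.items()
        if previous.get(source) != set_fingerprint(targets)
    )
```

A source whose members arrive in a different order produces the same fingerprint and is not flagged; a source that gains or loses a member is.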

Customers running INCREMENTAL pipelines on M:N relationships may see different row counts than before. The new counts are the correct ones.

--full-refresh at any scope

--full-refresh is the standard term across dbt and Dataform for "rebuild this thing from scratch." In dbt and Dataform it operates at model / dataset scope. In daana-cli 0.7.0 it operates at three scopes - full workflow, entity, or attribute:

# Rebuild the entire workflow
daana-cli execute --full-refresh

# Rebuild a single entity
daana-cli execute --full-refresh --entity CUSTOMER

# Rebuild a single attribute on an entity
daana-cli execute --full-refresh --entity CUSTOMER --attribute customer_email

--entity requires --full-refresh. --attribute requires --entity. Names are validated via the same single source of truth (SSOT) used by the linter.
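The two flag dependencies are easy to state as a small check. The Python below is a hypothetical model of those rules for illustration; `validate_refresh_flags` is not part of daana-cli, and the error strings are made up for this example.

```python
def validate_refresh_flags(full_refresh, entity=None, attribute=None):
    """Return a list of rule violations; an empty list means the combination is allowed."""
    errors = []
    if entity is not None and not full_refresh:
        errors.append("--entity requires --full-refresh")
    if attribute is not None and entity is None:
        errors.append("--attribute requires --entity")
    return errors
```

Under this model, all three commands shown earlier pass, while `--entity` without `--full-refresh` or `--attribute` without `--entity` is rejected.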

The use case: you've changed a single attribute mapping or noticed stale data in one entity, and you want to rebuild that slice without taking the full DWH down. Previously the answer was "drop and redeploy"; now it's a scoped rebuild - at whatever scope fits the problem.

Naming convention relaxed

Entity IDs, workflow IDs, attribute names, and relationship names now accept any casing - lowercase, UPPERCASE, Mixed_Case, mixedCase. The checker and compiler previously enforced UPPER_SNAKE_CASE.

Names must still be valid SQL identifiers: letters, digits, underscores. Hyphens are no longer accepted. Model YAML names are preserved exactly as written; cross-reference lookups are case-insensitive at compile time; compiled output uses the model's canonical casing.
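One way to picture "preserved as written, resolved case-insensitively" is a lookup keyed on the lowercase name that returns the original spelling. The Python below is a sketch, not the compiler's code: `build_index` and `resolve` are hypothetical, the character rule is taken literally from the paragraph above (letters, digits, underscores), and rejecting two names that differ only by case is an assumption of this example rather than documented behavior.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")  # letters, digits, underscores


def build_index(names):
    """Map lowercase name -> name as written, rejecting invalid or colliding names."""
    index = {}
    for name in names:
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
        key = name.lower()
        if key in index:
            # Assumption: case-only duplicates are treated as an error here.
            raise ValueError(f"{name!r} collides with {index[key]!r}")
        index[key] = name
    return index


def resolve(index, reference):
    """Look up a cross-reference in any casing and return the canonical spelling."""
    return index[reference.lower()]
```

With an index built from `Customer` and `ORDER_LINE`, a reference written as `CUSTOMER` or `order_line` resolves to the spelling used in the model.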

The three corresponding lint warnings - entity-name-casing, attribute-type-casing, relationship-name-casing - were removed.

Framework columns now lowercase

All Daana-generated column names (eff_tmstp, ver_tmstp, type_key, data_key, inst_key, row_st, inst_row_key, popln_tmstp) and table suffixes (_idfr, _desc, _x) now use lowercase, following modern SQL convention.

This is a breaking change for downstream consumers. If your BI dashboards, custom SQL, or dbt models reference framework columns in UPPERCASE, audit them and update to lowercase before redeploying.
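For the audit, a small script can flag uppercase references in your own SQL before you redeploy. The sketch below is a starting point under assumptions: it covers only the framework columns listed above (not the table suffixes), uses plain whole-word matching rather than SQL parsing, and the function names are hypothetical. Review its output by hand, since it will also match the same words inside string literals or comments.

```python
import re

FRAMEWORK_COLUMNS = [
    "eff_tmstp", "ver_tmstp", "type_key", "data_key",
    "inst_key", "row_st", "inst_row_key", "popln_tmstp",
]

# Whole-word match on the UPPERCASE spelling only.
_PATTERN = re.compile(
    r"\b(" + "|".join(col.upper() for col in FRAMEWORK_COLUMNS) + r")\b"
)


def find_uppercase_refs(sql):
    """Return (line_number, column) pairs for uppercase framework-column references."""
    hits = []
    for lineno, line in enumerate(sql.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def lowercase_refs(sql):
    """Rewrite uppercase framework-column references to lowercase."""
    return _PATTERN.sub(lambda m: m.group(1).lower(), sql)
```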

BigQuery improvements

Several BigQuery-specific changes ship in 0.7.0:

_desc and _x tables are now partitioned via PARTITION BY RANGE_BUCKET(type_key, ...). Cluster keys restructured to favor ver_tmstp (system load time) over eff_tmstp (business time). This reduces bytes scanned on analytical queries and enables incremental DAR / data-mart builds via the WHERE ver_tmstp > last_run_tmstp pattern as a clustered range scan.
Cross-project batch_expression validation now runs correctly when the source project differs from the target project.
DATE source columns no longer fail batch filtering. The generated filter casts the column to TIMESTAMP and wraps batch values in TIMESTAMP().
Transient BigQuery errors are now retried via a new bqretry SSOT package. jobBackendError and jobInternalError are classified as transient and retried with exponential backoff across executors, metadata read, preflight, query, table-exists, and storage-write API paths.
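The retry behavior in the last item follows a common pattern: classify the error, retry only the transient ones, and double the wait each time. The Python below sketches that pattern; it is not the bqretry package. `WarehouseError`, `with_retry`, the attempt limit, and the base delay are hypothetical, and only the two error reasons come from the release notes.

```python
import time

TRANSIENT_REASONS = {"jobBackendError", "jobInternalError"}


class WarehouseError(Exception):
    """Hypothetical error type carrying a BigQuery-style reason code."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def with_retry(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); retry transient failures with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except WarehouseError as err:
            last_attempt = attempt == max_attempts - 1
            if err.reason not in TRANSIENT_REASONS or last_attempt:
                raise  # non-transient, or out of attempts
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. Production retry loops usually add jitter to the delay as well; that is omitted here for brevity.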

Existing customers should drop and redeploy their BigQuery data warehouse to benefit from the partitioning changes. PostgreSQL is unchanged.

INCREMENTAL correctness: late arrivals, history views, M:N

Three related fixes ship together (tracked internally as DAANA-131):

Late-arrival rows for TYPE_TABLE attributes are now captured at their own effective time. Previously, late arrivals could be silently dropped because the truedelta compared the earliest source row against the target's chronologically-latest row regardless of timestamp; rows with matching values were marked NOCHANGE and never inserted.
view_<entity>_hist returns one row per value-run. Entity history views now apply gaps-and-islands compaction. Consecutive observations of the same attribute tuple collapse into one row anchored at the earliest observed effective timestamp for that value-run. COUNT(*), AVG, and SUM over the view now treat each value-state as one observation rather than each ingest event. Audit consumers that need every observation should query <entity>_desc directly.
Active-set semantics on view_<source>_has_<target> - covered above under M:N.
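Gaps-and-islands compaction, as applied to the history views, is easiest to see on a small example. The Python sketch below collapses consecutive observations of the same value into one row anchored at the earliest timestamp of the run. It illustrates the technique only; the real views are generated SQL, and `compact_history` and the tuple layout are hypothetical.

```python
from itertools import groupby


def compact_history(observations):
    """Collapse consecutive equal values into one row per value-run.

    observations: iterable of (eff_tmstp, value_tuple) for one entity instance.
    Returns a list of (eff_tmstp, value_tuple), one per run, anchored at the
    earliest effective timestamp observed for that run.
    """
    ordered = sorted(observations, key=lambda obs: obs[0])
    runs = []
    # groupby only merges *adjacent* equal keys, which is exactly the
    # "island" semantics: a value that returns later starts a new run.
    for value, group in groupby(ordered, key=lambda obs: obs[1]):
        first = next(group)
        runs.append((first[0], value))
    return runs
```

Five observations of gold, gold, silver, gold, gold become three rows: gold, silver, gold. A COUNT(*) over the compacted result counts value-states, not ingest events.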

Pipeline execution lifecycle tracking

Every pipeline execution is now tracked in a batch_history table with a status lifecycle: Running → Complete / Failed. Failed pipelines don't advance the watermark. Stale runs from crashes are auto-recovered via DW reconciliation.

Two related additions:

--force flag on execute overrides stale run detection when a previous execution appears stuck from a crash.
batch_stale_timeout in workflow advanced settings - configurable timeout (default 8h) for stale run detection. Pipelines running longer than this are auto-recovered on the next execution.
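The stale-run rule can be modeled as a small function over a run's status and start time. The Python below is a sketch under assumptions, not daana-cli's logic: the function names and tuple shape are hypothetical, and it assumes that a Running row younger than the timeout blocks a new execution unless --force is given. Only the status names and the 8h default come from the text above.

```python
from datetime import datetime, timedelta

STALE_TIMEOUT = timedelta(hours=8)  # default batch_stale_timeout named above


def classify_run(status, started_at, now, timeout=STALE_TIMEOUT):
    """Return 'stale' for a Running row older than the timeout, else the status itself."""
    if status == "Running" and now - started_at > timeout:
        return "stale"
    return status


def may_start(previous_runs, now, force=False, timeout=STALE_TIMEOUT):
    """Decide whether a new execution may start.

    previous_runs: iterable of (status, started_at) tuples.
    Assumption: a run that is Running and not yet stale blocks a new start;
    force overrides that check.
    """
    if force:
        return True
    return all(
        classify_run(status, started_at, now, timeout) != "Running"
        for status, started_at in previous_runs
    )
```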

This replaces the old local CSV batch tracking (~/.daana/batch/). Because the state now lives in the warehouse, it works from CI/CD, containers, and across team members.

Documentation overhaul

docs.daana.dev was rebuilt. Search now works locally and the tutorial flow is redesigned. New pages cover advanced mapping patterns, an expanded glossary, and a unified brand presentation. CI validates link integrity on every prebuild.

Modeling additions worth knowing

Three new features land mid-stack and are worth surfacing for anyone in the middle of a mapping or model file:

Unlimited-text column type via TypeTextDDL - a new atom for unlimited-length string columns. Maps to text (PostgreSQL), STRING (BigQuery / Snowflake), CLOB (Oracle), VARCHAR(MAX) (MSSQL). Use for free-text fields where a fixed length is impractical.
Per-table batch_expression in mapping YAML - override the workflow-level batch expression on individual tables when source tables use different timestamp columns or require custom SQL.
Relationship-only mapping tables now supported - mapping tables can declare relationships without any attributes. Cleaner pattern for pure relationship sources; previously such tables were rejected with an empty-attributes validation error.

Reliability and dialect work

A handful of quality-of-life fixes ship as part of the release:

MSSQL and Oracle dialects fail fast with a clear "not yet supported" error at config load, factory time, preflight, and every other entry point. Previously they fell through silently to the PostgreSQL executor, producing wrong-syntax SQL and unpredictable corruption. The dialects remain in the connection-profile schema under a PlannedDialects list so profiles can be authored in advance.
Auto-discover source schemas in the test harness - typos in mapping source_table references are caught with a validation error showing the closest match.
PostgreSQL identifier length guard - the engine validates generated index / table / schema names against PostgreSQL's 63-character limit, with a clear error and suggested fix.
Dialect-aware TRUNCATE TABLE - BigQuery and Snowflake get per-table truncate statements; PostgreSQL keeps multi-table truncate. Resolves cases where multi-table truncate failed on BigQuery / Snowflake.
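Two of the guards above are simple enough to sketch. The Python below shows a length check against PostgreSQL's identifier limit and a closest-match suggestion for a mistyped table name. Both functions are hypothetical illustrations, not the engine's code. The check measures UTF-8 bytes because PostgreSQL's limit is 63 bytes, which equals 63 characters for ASCII names; the suggestion uses the standard library's `difflib`, which may differ from whatever matching daana-cli uses.

```python
import difflib

PG_MAX_IDENTIFIER_BYTES = 63  # PostgreSQL truncates identifiers beyond this


def check_identifier_length(name):
    """Raise ValueError if a generated name exceeds PostgreSQL's identifier limit."""
    length = len(name.encode("utf-8"))
    if length > PG_MAX_IDENTIFIER_BYTES:
        raise ValueError(
            f"identifier {name!r} is {length} bytes; "
            f"PostgreSQL allows at most {PG_MAX_IDENTIFIER_BYTES}"
        )


def suggest_source_table(reference, known_tables):
    """Return the closest known table name for a reference, or None if nothing is close."""
    matches = difflib.get_close_matches(reference, known_tables, n=1)
    return matches[0] if matches else None
```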

Migration

Existing customers upgrade via a standard drop-and-redeploy. The exact sequence:

Update the daana-cli image / binary.
Drop the metadata schema and the data warehouse schema.
daana-cli install → daana-cli deploy → daana-cli execute.

The drop-and-redeploy itself takes roughly 10-15 minutes for a full DWH. Runs are idempotent.

What changes under the hood during migration

The metadata pipeline was redesigned (DAANA-97). Pre-DAANA-97 binaries silently broke INCREMENTAL relationship processing on any model change after deploy. Customers who previously ran deploy without a clean install may have orphan relationships in their _x tables. The symptom is "data looks fine, but relationships don't update." daana-cli execute --full-refresh --entity X per affected entity is the less invasive recovery path if you want to avoid a full drop-and-redeploy.
Entity key columns in metadata tables were widened to VARCHAR(256) (previously BIGINT). type_key, data_key, inst_key, and the batch-lifecycle keys remain BIGINT. See ADR-109 for the rationale.
The deploy sequence is now 5 phases: Staging DDL → Entity DDL → Populate metadata staging → Execute metadata pipeline → Function DDL. Function DDL runs last so semantic views embed type_key values read from the metadata tables rather than sequential counters.
Downstream consumers that reference framework columns in UPPERCASE must be updated to lowercase.

Operational notes on the upgrade

Maintenance window depends on your architecture. With the recommended three-layer setup (DAS → DAB → DAR), only the DAB layer rebuilds during the 10-15 minutes. DAR and downstream BI consumers don't notice anything as long as the rebuild completes within their refresh interval. If you query DAB directly from BI, plan a window accordingly.

State preservation: drop-and-redeploy doesn't touch your YAML. Models, mappings, workflows, connection profiles - the source of truth lives in your version-controlled YAML, and the new build regenerates everything from it. Same source, same agentic interface, new generated SQL.

Tested environments: latest PostgreSQL. BigQuery is a managed cloud service without explicit version pinning - current GCP-default behavior is what's tested. MSSQL and Oracle remain authored under PlannedDialects; they fail fast with a clear error at runtime.

If something doesn't go to plan, ping us. Daana is at the scale where we know every customer's setup personally. The Daana community Slack is the fastest path to a human who can help - for upgrade questions, recovery from anything unexpected, or coordinating with us on the timing. We coordinate upgrades; we don't ship and abandon.

Try it

If you are an existing customer on a pre-0.7.0 build, see the migration section above before upgrading. For new evaluations, 0.7.0 is the build to start with.