Features

Bug fixes

Internal

  • Fix compatibility with dplyr 1.0.8 (#698).

Features

  • mutate(), transmute(), distinct() and summarize() now support dplyr::across() and extra arguments (#640).
  • Key tracking for the first three verbs is less strict and based on name equality (#663).
  • relocate() now works on zoomed dm objects (#666).
  • dm_add_fk() gains on_delete argument which copy_dm_to() picks up and translates to an ON DELETE CASCADE or ON DELETE NO ACTION specification for the foreign key (#649).
  • copy_dm_to() defines foreign keys during table creation, for all databases except DuckDB. Tables are created in topological order (#658). For cyclic relationship graphs, table creation is attempted in the original order and may fail (#664).
  • waldo::compare() shows better output for dm objects (#642).
  • dm_paste() output uses trailing commas in the dm::dm() and tibble::tibble() calls, and sorts column attributes by name, for better modularity (#641).
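
The table creation order described above can be illustrated with a minimal base R sketch. This is not the code used by the package: the function creation_order(), the table names and the foreign-key edges are hypothetical and serve only to show the idea of creating parent tables before their children, with a fallback to the original order when the relationship graph has a cycle.

```r
# Hypothetical foreign-key edges: each child table references a parent table
fks <- data.frame(
  child = c("flights", "flights", "planes"),
  parent = c("airlines", "planes", "manufacturers"),
  stringsAsFactors = FALSE
)
tables <- c("flights", "planes", "airlines", "manufacturers")

# Illustrative helper (an assumption, not part of dm): returns the tables
# in an order where every parent comes before the tables that reference it
creation_order <- function(tables, fks) {
  ordered <- character()
  remaining <- tables
  while (length(remaining) > 0) {
    # Foreign keys whose parent table has not been created yet
    open_fks <- fks[!(fks$parent %in% ordered), , drop = FALSE]
    # A table is ready once none of its foreign keys are still open
    ready <- remaining[!(remaining %in% open_fks$child)]
    if (length(ready) == 0) {
      # Cycle detected: fall back to the original order
      return(tables)
    }
    ordered <- c(ordered, ready)
    remaining <- setdiff(remaining, ready)
  }
  ordered
}

creation_order(tables, fks)
# "airlines" "manufacturers" "planes" "flights"
```

With a cyclic graph, for example two tables that reference each other, the sketch returns the tables unchanged, which mirrors the documented behavior that creation is then attempted in the original order and may fail.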

Breaking changes

Bug fixes

Internal

  • Remove method only needed for RSQLite < 2.2.8, add warning if loaded RSQLite version is <= 2.2.8 (#632).
  • Adapt MSSQL tests to testthat update (#648).

Features

Bug fixes

Internal

Bug fixes

Features

Internal

  • Skip examples that might require internet access on non-CI platforms.

Features

Internal

  • Always run database tests on sqlite for df source.
  • Establish compatibility with testthat > 3.0.2 (#566, @moodymudskipper).

Breaking changes

Features

  • dm_add_fk() gains ref_columns argument that supports creating foreign keys to non-primary keys (#402).
  • dm_get_all_pks() gains table argument for filtering the returned primary keys (#560).
  • dm_get_all_fks() gains parent_table argument for filtering the returned foreign keys (#560).
  • dm_rm_fk() gains an optional ref_columns argument. This function now supports removal of multiple foreign keys filtered by parent or child table or columns, with a message (#559).
  • dm_rm_pk() gains columns argument and allows filtering by columns and by tables or removing all primary keys. The rm_referencing_fks argument has been deprecated in favor of the new fail_fk argument (#558).
  • dm_get_all_fks() has been optimized for speed and no longer sorts the keys (#560).
  • dm operations are now slightly faster overall.
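
As a rough illustration of the new arguments listed above, the following sketch creates a foreign key that points to a column other than the primary key and then filters the key listings. The tables customers and orders and their columns are hypothetical; the calls assume a dm version that includes the changes in this section.

```r
library(dm)

# Hypothetical tables: `orders` refers to `customers` via `email`,
# which is not the primary key of `customers`
customers <- tibble::tibble(
  id = 1:2,
  email = c("a@example.org", "b@example.org")
)
orders <- tibble::tibble(
  order_id = 1:3,
  customer_email = c("a@example.org", "a@example.org", "b@example.org")
)

my_dm <- dm(customers, orders)
my_dm <- dm_add_pk(my_dm, customers, id)

# ref_columns selects the referenced column explicitly
my_dm <- dm_add_fk(my_dm, orders, customer_email, customers, ref_columns = email)

# Restrict the listings to a single table or parent table
dm_get_all_pks(my_dm, table = customers)
dm_get_all_fks(my_dm, parent_table = customers)
```

No integrity check is requested here, so the sketch only records the relationship; whether the referenced column is actually unique is not verified.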

Internal

  • The internal data structure for a dm object has changed to accommodate foreign keys to other columns than the primary key. An upgrade message is shown when working with a dm object from an earlier version, e.g. if it was loaded from a cache or an .rds file (#402).
  • Drop "dm_v1" class from dm objects again, because it would have made every S3 dispatch more costly. An internal "version" attribute is used instead (#547).

Breaking changes

  • Deprecate dm_get_src(), tbl.dm(), src_tbls.dm() and copy_to.dm(). These functions have better alternatives and use the notion of a “data source” which is being phased out of dplyr (#527).
  • *_pk() and *_fk() functions gain an ellipsis argument that comes before check, force and rm_referencing_fks arguments (#520).

Features

  • dm_add_pk() and dm_add_fk() support compound keys via the c() notation, e.g. dm_add_pk(dm, table, c(col1, col2)). dm_nycflights13() returns a data model with compound keys by default. Use compound = FALSE to return the data model from dm v0.1.13 or earlier (#3).
  • dm_get_all_fks() includes parent_pk_cols column that describes the primary key columns of the parent table (#335).
  • dm_from_src() supports the schema argument also for MariaDB and MySQL databases (#516).
  • dm objects now inherit from "dm_v1" in addition to "dm", to allow backward-compatible changes of the internal format (#521).
  • Use hack to create compound primary keys on the database (#522).
  • dm_examine_constraints() and other check functions count the number of rows that violate constraints for primary and foreign keys (#335).
  • copy_dm_to(set_key_constraints = FALSE) downgrades unique indexes to regular indexes (#335).
  • rows_truncate() implemented for data frames (#335).
  • dm_enum_fk_candidates() enumerates columns in the order they appear in the table (#335).

Features

Bug fixes

Performance

  • enum_fk_candidates() now only checks distinct values, which improves performance for large tables. As a consequence, only the number of distinct values is reported for mismatches, not the number of mismatching rows/entries (#494).
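
The difference between the two counts can be seen in a small base R sketch. The vectors below are made-up data and the code does not reproduce the package internals; it only contrasts counting distinct mismatching values with counting mismatching rows.

```r
# Hypothetical candidate column in the child table and key values in the parent
child_values <- c(1, 1, 2, 5, 5, 5)
parent_keys <- c(1, 2, 3)

# Checking distinct values only: one lookup per distinct value
distinct_mismatches <- setdiff(unique(child_values), parent_keys)
length(distinct_mismatches)
# 1 (the value 5)

# Counting mismatching rows would require looking at every entry
sum(!(child_values %in% parent_keys))
# 3 (three rows contain the value 5)
```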

Documentation

Internal

  • Columns with missing values are no longer primary keys (#469).
  • Fix dm_from_src() for MSSQL when learn_keys = FALSE (#427).
  • Tests use expect_snapshot() everywhere (#456).
  • Fix compatibility with testthat 3.0.1 (#457).
  • Bump RMariaDB required version to 1.0.10 to work around timeout with R CMD check.
  • dm_from_src() accepts schema argument for MSSQL databases (#367).

Breaking changes

Features

  • copy_dm_to() gives a better error message for bad table_names (#397).
  • dm objects with local data sources no longer show the “Table source” part in the output.
  • Error messages now refer to “tables”, not “elements” (#413).
  • New dm_bind() for binding two or more ‘dm’ objects together (#417).

Bug fixes

  • For databases, the underlying SQL table names are quoted early to avoid later SQL syntax errors (#419).
  • dm_financial() no longer prints message about learn_keys = FALSE.
  • dm_rows_update() and related functions now use the primary keys defined in x for establishing matching rows.

Internal

Features

  • dm_paste() generates self-contained code (#401).
  • Errors regarding cycles in the relationship graph now show the shortest cycle (#405).
  • Implement rows_truncate() for databases.
  • collect() works on a zoomed dm, with a message.
  • The data model is drawn in a more compact way if it consists of multiple connected components.
  • dm_add_pk(check = TRUE) gives a better error message.

Bug fixes

  • rows_insert() works if column names consist of SQL keywords (#409).
  • Cycles in other connected components don’t affect filtering in a cycle-free component.
  • Avoid src_sqlite() in examples (#372).

Internal

  • Testing SQLite, Postgres and SQL Server on GitHub Actions (#408, @pat-s).
  • Testing packages with all “Suggests” uninstalled.

Features

Documentation

Internal

  • Require dplyr >= 1.0.0.
  • Use GitHub Actions (#369, @pat-s).
  • Avoid src_sqlite() in vignettes (#372).
  • Rename vignettes (#349).
  • Rename error class "dm_error_tables_not_neighbours" to "dm_error_tables_not_neighbors".
  • Shortened README and intro article (#192, @jawond).
  • Better testing for MSSQL (#339).
  • Fix compatibility with dplyr 1.0.0.

Features

Bug fixes

  • Fix visualization of column that acts as a foreign key more than once (#37).
  • dm_add_pk(), dm_rm_pk(), dm_add_fk() and dm_rm_fk() are now stricter when keys exist or when attempting to remove keys that don’t exist. A more relaxed mode of operation may be added later (#214).
  • examine_cardinality(), dm_examine_constraints() and enum_pk_candidates() now work for columns named n.
  • dm_set_key_constraints() (and by extension copy_dm_to(set_key_constraints = TRUE)) now quotes identifiers in the SQL that creates foreign keys on the database.
  • collect() gives a better error message when called on a "zoomed_dm" (#294).
  • check_subset() gives a clean error message if the tables are complex expressions.
  • dm_from_src(schema = "...") works on Postgres if search_path is not set on the connection.
  • compute.zoomed_dm() no longer throws an error.
  • Remove unused DT import (#295).

Compatibility

  • Remove use of deprecated src_df() (#336).
  • Fix compatibility with dplyr 1.0.0 (#203).

Documentation

Internal

  • Testing on local data frames (by default), optionally also SQLite, Postgres, RMariaDB, and SQL Server. Currently requires development versions and various pull requests (#334, #327, #312, #76).
  • dm_nycflights13(subset = TRUE) memoizes subset and also reduces the size of the weather table.
  • Expand definitions of deprecated functions (#204).
  • Implement format.dm().
  • Adapt to tidyselect 1.0.0 (#257).
  • Zooming and unzooming is now faster if no columns are removed.
  • Table names must be unique.
  • dm_examine_constraints() formats the problems nicely.
  • New class for prettier printing of keys (#244).
  • Add experimental schema support for dm_from_src() for Postgres through the new schema and table_type arguments (#256).

Features

Documentation

  • New demo.
  • Add explanation for empty dm (#100).

Bug fixes

  • Avoid asterisk when printing local zoomed_dm (#131).
  • cdm_select_tbl() works again when multiple foreign keys are defined between two tables (#122).
  • Fix R CMD check.
  • Remove the src component from dm (#38).
  • Internal: Add function checking if all tables have same src.
  • Internal: Add 2 classed errors.
  • cdm_get_src() for local dm always returns a src based on .GlobalEnv.
  • cdm_flatten() gains ... argument to specify which tables to include. Currently, all tables must form a connected subtree rooted at start. Disambiguation of column names now happens after selecting relevant tables. The resulting SQL query is more efficient for inner and outer joins if filtering is applied. Flattening with a right_join with more than two tables is not well-defined and gives an error (#62).
  • Add a vignette for joining functions (#60, @cutterkom).
  • Shorten message in cdm_disambiguate_cols().
  • cdm_flatten_to_tbl() disambiguates only the necessary columns.
  • When flattening, the column name of the LHS (child) table is used (#52).
  • Fix formatting in enum_pk_candidates() for character data.
  • cdm_add_pk() and cdm_add_fk() no longer check data integrity by default.
  • Explicitly checking that the join argument is a function, to avoid surprises when the caller passes data.
  • cdm_copy_to() works correctly with filtered dm objects.
  • cdm_apply_filters() actually resets the filter conditions.
  • A more detailed README file and a vignette for filtering (#29, @cutterkom).
  • cdm_draw() no longer supports the table_names argument, use cdm_select_tbl().
  • Copying a dm to a database now creates indexes for all primary and foreign keys.

Breaking changes

Performance

New functions

Minor changes

Documentation

  • Add setup article (#7).

Internal

  • Using simpler internal data structure to store primary and foreign key relations (#26).
  • New nse_function() replaces h() for marking functions as NSE to avoid R CMD check warnings.
  • Simplified internal data structure so that creation of new operations that update a dm becomes easier.
  • When copying a dm to a database, NOT NULL constraints are set at creation of the table. This removes the necessity to store column types.
  • Using {RPostgres} instead of {RPostgreSQL} for testing.

Initial GitHub release.

Creating dm objects and basic functions:

Primary keys

Foreign keys

Flattening

  • cdm_join_tbl()

Filtering

Interaction with DBs

Utilizing foreign key relations

Miscellaneous