
We detail in this vignette how {constructive} works and how you might define custom constructors or custom .cstr_construct.*() methods.

This document provides the general theory, but you are encouraged to look at examples.

In particular, the package {constructive.example}, accessible at https://github.com/cynkra/constructive.example/, contains 2 examples: supporting a new class (“qr”), and implementing a new constructor for an already supported class (“tbl_df”). This package might be used as a template.

The scripts starting with “s3-” and “s4-” in the {constructive} package provide many more examples in a similar but slightly different shape. Those 2 resources, along with the explanations in this document, should get you started. Don’t hesitate to open issues if things are unclear.

The next 6 sections describe the inner logic of the package. The last 2 sections explain how to support a new class and/or define your own constructors.

The package is young and subject to breaking changes; we apologize in advance for any API changes that break your code.

Recursion system

  • .cstr_construct() builds code recursively, without checking input or output validity, without handling errors, and without formatting.
  • construct() wraps .cstr_construct() and does this extra work.
  • .cstr_construct() is a generic and many methods are implemented in the package. For instance, construct(iris) will call .cstr_construct.data.frame(), which down the line will call .cstr_construct.atomic() and .cstr_construct.factor() to construct its columns.
  • Additionally, before dispatching, .cstr_construct() attempts to match its input x against the list of objects provided to the data argument, and returns the name of the matching object if there is one.
.cstr_construct
#> function(x, ..., data = NULL) {
#>   data_name <- perfect_match(x, data)
#>   if (!is.null(data_name)) return(data_name)
#>   UseMethod(".cstr_construct")
#> }
#> <bytecode: 0x5597802ffbd0>
#> <environment: namespace:constructive>
.cstr_construct(letters)
#> [1] "c("                                                                                                        
#> [2] "  \"a\", \"b\", \"c\", \"d\", \"e\", \"f\", \"g\", \"h\", \"i\", \"j\", \"k\", \"l\", \"m\", \"n\", \"o\","
#> [3] "  \"p\", \"q\", \"r\", \"s\", \"t\", \"u\", \"v\", \"w\", \"x\", \"y\", \"z\""                             
#> [4] ")"
construct(letters)
#> c(
#>   "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o",
#>   "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
#> )
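The recursion and the data lookup described above can be sketched in base R alone. This is a toy stand-in, not the package's code: toy_construct() and its methods are hypothetical names, and deparse() replaces the real atomic constructors.

```r
# Toy generic: look for an identical object in `data` first, then dispatch.
# All names here are hypothetical and only illustrate the mechanism.
toy_construct <- function(x, ..., data = NULL) {
  for (nm in names(data)) {
    if (identical(x, data[[nm]])) return(nm)
  }
  UseMethod("toy_construct")
}

# Lowest level: let deparse() produce the code.
toy_construct.default <- function(x, ...) {
  paste(deparse(x), collapse = " ")
}

# Lists: construct every element recursively, then assemble the call.
# `data` travels through `...` so nested elements can be matched too.
toy_construct.list <- function(x, ...) {
  args <- vapply(x, toy_construct, character(1), ...)
  if (!is.null(names(x))) args <- paste(names(x), "=", args)
  sprintf("list(%s)", paste(args, collapse = ", "))
}

code <- toy_construct(
  list(a = 1:3, b = list("x", letters)),
  data = list(letters = letters)
)
code
#> [1] "list(a = 1:3, b = list(\"x\", letters))"
```

Evaluating the generated string with eval(parse(text = code)) gives back the original list, which is the property construct() verifies for real objects.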

.cstr_construct.?() methods

.cstr_construct.?() methods typically have this form:

.cstr_construct.Date <- function(x, ...) {
  opts <- .cstr_fetch_opts("Date", ...)
  if (is_corrupted_Date(x) || opts$constructor == "next") return(NextMethod())
  constructor <- constructors$Date[[opts$constructor]]
  constructor(x, ..., origin = opts$origin)
}
  • .cstr_fetch_opts() gathers options provided to construct() through the opts_*() function (see next section), or falls back to a default value if none were provided.
  • If the object is corrupted or if the user decided to bypass the current method by choosing “next” as a constructor for this class, we return NextMethod() to forward all our inputs to a lower level constructor.
  • constructor() actually builds the code from the object x, the parameters forwarded through ... and the optional construction details gathered in opts (here the origin)
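The role of NextMethod() in this template can be tried out with a small base R imitation. demo_construct() and its methods are hypothetical, and the constructor choice is passed as a plain argument here instead of going through an opts_*() object.

```r
# Hypothetical generic and methods, base R only.
demo_construct <- function(x, ...) UseMethod("demo_construct")

# Lower level method: build the raw data and add the class back explicitly.
demo_construct.default <- function(x, ...) {
  code <- paste(deparse(unclass(x)), collapse = " ")
  if (is.object(x)) {
    code <- sprintf("structure(%s, class = %s)", code, deparse(class(x)))
  }
  code
}

demo_construct.Date <- function(x, ..., constructor = "as.Date") {
  corrupted <- !is.double(x)
  # corrupted object, or explicit opt out: forward everything to the next method
  if (corrupted || constructor == "next") return(NextMethod())
  sprintf("as.Date(%s)", deparse(format(x)))
}

demo_construct(as.Date("2003-10-20"))
#> [1] "as.Date(\"2003-10-20\")"
demo_construct(as.Date("2003-10-20"), constructor = "next")
#> [1] "structure(12345, class = \"Date\")"
demo_construct(structure("12345", class = "Date"))
#> [1] "structure(\"12345\", class = \"Date\")"
```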

opts_?() function

When implementing a new method you’ll need to define and export the corresponding opts_?() function. It gives the user a way to choose a constructor, and it returns the options object that .cstr_fetch_opts() retrieves in the .cstr_construct.?() method.

It should always have this form:

opts_Date <- function(
    constructor = c(
      "as.Date", "as_date", "date", "new_date",  "as.Date.numeric", 
      "as_date.numeric", "next", "atomic"
    ),
    ..., 
    origin = "1970-01-01"
  ) {
  .cstr_combine_errors(
    constructor <- .cstr_match_constructor(constructor),
    rlang::check_dots_empty()
  )
  .cstr_options("Date", constructor = constructor, origin = origin)
}
  • The class is present in the name of the opts_?() function and as the first argument of .cstr_options().
  • A character vector of constructors is provided, starting with the default constructor
  • A “next” value is mandatory except for internal types.
  • Additional arguments passed to the constructors might be added, as we did here with origin

The following code illustrates how the information is retrieved.

# .cstr_fetch_opts() takes a class and the dots and retrieves the relevant options
# if none were provided it falls back on the default value for the relevant opts_?() function
test <- function(...) {
  .cstr_fetch_opts("Date", ...)
}
test(opts_Date("as_date"), opts_data.frame("read.table"))
#> <constructive_options_Date/constructive_options>
#> constructor: "as_date"
#> origin:      "1970-01-01"
test()
#> <constructive_options_Date/constructive_options>
#> constructor: "as.Date"
#> origin:      "1970-01-01"
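To make the lookup less abstract, here is a rough base R imitation of the pattern. The internals of .cstr_options() and .cstr_fetch_opts() are not shown in this document, so toy_opts_Date() and toy_fetch_opts() are assumptions about the general idea: a classed options object is found among the dots, or built from defaults.

```r
# Hypothetical option builder: the first constructor is the default.
toy_opts_Date <- function(constructor = c("as.Date", "next"), ..., origin = "1970-01-01") {
  constructor <- match.arg(constructor)
  structure(
    list(constructor = constructor, origin = origin),
    class = c("toy_options_Date", "toy_options")
  )
}

# Hypothetical lookup: return the first options object of the requested class,
# or fall back on the defaults of the matching toy_opts_*() function.
toy_fetch_opts <- function(class, ...) {
  for (arg in list(...)) {
    if (inherits(arg, paste0("toy_options_", class))) return(arg)
  }
  get(paste0("toy_opts_", class), mode = "function")()
}

toy_fetch_opts("Date", toy_opts_Date("next"))$constructor
#> [1] "next"
toy_fetch_opts("Date")$constructor
#> [1] "as.Date"
```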

is_corrupted_?() function

is_corrupted_?() checks whether x has the internal type, the attributes, and sometimes the structure expected of a well-formed object of a given class, and returns TRUE when it does not.

If an object is corrupted for a given class we cannot use constructors for this class, so we move on to a lower level constructor by calling NextMethod() in .cstr_construct().

This is important so that constructive doesn’t choke on corrupted objects but instead helps us understand them.

For instance, in the following example x prints like a date but it’s corrupted. A date should not be built on top of characters, and this object cannot be built with as.Date() or other idiomatic date constructors.

x <- structure("12345", class = "Date")
x
#> [1] "2003-10-20"
x + 1
#> Error in unclass(e1) + unclass(e2): non-numeric argument to binary operator

We have defined:

is_corrupted_Date <- function(x) {
  !is.double(x)
}

As a consequence, the next method, .cstr_construct.default(), will be called through NextMethod() and will handle the object using an atomic vector constructor:

construct(x)
#> "12345" |>
#>   structure(class = "Date")
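The Date check only looks at the internal type. For a class that also relies on attributes, a check might look like the following sketch. toy_is_corrupted_factor() and its exact rules are assumptions made for illustration, not the package's definition.

```r
# Hypothetical checker for factor-like objects, base R only.
toy_is_corrupted_factor <- function(x) {
  codes <- unclass(x)          # avoid dispatching on the possibly broken class
  lvls <- attr(x, "levels")
  !is.integer(codes) ||        # wrong internal type
    !is.character(lvls) ||     # missing or ill-typed levels
    any(codes < 1L | codes > length(lvls), na.rm = TRUE)  # codes outside the levels
}

toy_is_corrupted_factor(factor(c("a", "b", NA)))
#> [1] FALSE
# built on characters rather than integers
toy_is_corrupted_factor(structure(c("a", "b"), levels = c("a", "b"), class = "factor"))
#> [1] TRUE
# integer code with no corresponding level
toy_is_corrupted_factor(structure(3L, levels = c("a", "b"), class = "factor"))
#> [1] TRUE
```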

constructors

{constructive} exports a constructors environment object. It contains environments named after classes, and these in turn contain the constructor functions.

A constructor is retrieved in the .cstr_construct.?() method with:

constructor <- constructors$Date[[opts$constructor]]

For instance, the default constructor for “Date” is:

constructors$Date$as.Date
#> function(x, ..., origin = "1970-01-01") {
#>   compatible_with_char <-
#>     all(rlang::is_integerish(x) & (is.finite(x) | (is.na(x) & !is.nan(x))))
#>   if (!compatible_with_char || all(is.na(x))) {
#>     return(constructors$Date$as.Date.numeric(x, ..., origin = origin))
#>   }
#>   code <- .cstr_apply(list(format(x)),  "as.Date", ..., new_line = FALSE)
#>   repair_attributes_Date(x, code, ...)
#> }
#> <bytecode: 0x55977f16bef0>
#> <environment: namespace:constructive>

A function call is made of a function and its arguments. A constructor sets the function and constructs its arguments recursively. This is done with the help of .cstr_apply() once these outputs have been prepared. In the case above we have 2 logical paths: dates can be infinite, but date vectors containing infinite elements cannot be represented by as.Date(<character>), our preferred choice.

x <- structure(c(12345, 20000), class = "Date")
y <- structure(c(12345, Inf), class = "Date")
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
constructors$Date$as.Date(y)
#> [1] "as.Date(c(12345, Inf), origin = \"1970-01-01\")"
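The nested environment layout and the two logical paths can be reproduced with a few lines of base R. This is a simplified sketch under stated assumptions: toy_constructors is a hypothetical registry, deparse() stands in for .cstr_apply(), and no attribute repair is done.

```r
# Hypothetical registry: one environment per class, holding constructor functions.
toy_constructors <- new.env()
toy_constructors$Date <- new.env()

# Fallback path: the numeric form can express any double value.
toy_constructors$Date$as.Date.numeric <- function(x, ...) {
  data_code <- paste(deparse(unclass(x)), collapse = " ")
  sprintf("as.Date(%s, origin = \"1970-01-01\")", data_code)
}

# Preferred path: the character form, valid only for whole, finite or NA day counts.
toy_constructors$Date$as.Date <- function(x, ...) {
  v <- unclass(x)
  ok <- (is.na(v) & !is.nan(v)) | (is.finite(v) & v == floor(v))
  if (!all(ok)) return(toy_constructors$Date$as.Date.numeric(x, ...))
  sprintf("as.Date(%s)", paste(deparse(format(x)), collapse = " "))
}

x <- structure(c(12345, 20000), class = "Date")
y <- structure(c(12345, Inf), class = "Date")
code_x <- toy_constructors$Date[["as.Date"]](x)
code_y <- toy_constructors$Date[["as.Date"]](y)
code_x
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
code_y
#> [1] "as.Date(c(12345, Inf), origin = \"1970-01-01\")"
```

Each generated string can be checked by evaluating it with eval(parse(text = ...)) and comparing the result to the input with identical().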

It’s important to consider corner cases when defining a constructor. If some cases can’t be handled by the constructor, we should fall back to another constructor or to another .cstr_construct() method.

For instance, constructors$data.frame$read.table() falls back on constructors$data.frame$data.frame() when the input contains non-atomic columns, which cannot be represented in a table input. constructors$data.frame$data.frame() itself falls back on .cstr_construct.list() when the data frame contains list columns not defined using I(), since data.frame() cannot produce such objects.

The last line of constructors$Date$as.Date(), the call to repair_attributes_Date(), does the attribute reparation.

Attribute reparation

Constructors should always end with a call to .cstr_repair_attributes() or to a function that wraps it.

This is needed to adjust the attributes of an object after idiomatic constructors such as as.Date() have defined their data and canonical attributes.

x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
# attributes are not visible due to "Date"'s printing method
x
#> [1] "2003-10-20" "2024-10-04"

# but constructive retrieves them
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\")) |>"
#> [2] "  structure(some_attr = 42)"

.cstr_repair_attributes() essentially sets attributes, with the following exceptions:

  • It doesn’t set names, these should be handled by the constructors
  • It doesn’t set the class explicitly if it’s identical to the idiomatic class, i.e. the class returned by the constructor before the repair call, and provided through the idiomatic_class argument
  • It doesn’t set attributes that we choose to ignore because they are set by the constructor (e.g. row names for data frames or levels for factors)

.cstr_repair_attributes() does a bit more but we don’t need to dive deeper in this vignette.

constructive:::repair_attributes_Date
#> function(x, code, ...) {
#>   .cstr_repair_attributes(
#>     x, code, ...,
#>     idiomatic_class = "Date"
#>   )
#> }
#> <bytecode: 0x559780354b50>
#> <environment: namespace:constructive>

constructive:::repair_attributes_factor
#> function(x, code, ...) {
#>   .cstr_repair_attributes(
#>     x, code, ...,
#>     ignore = "levels",
#>     idiomatic_class = "factor"
#>   )
#> }
#> <bytecode: 0x55978178a900>
#> <environment: namespace:constructive>
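The three exceptions listed above are easy to mimic in base R. toy_repair_attributes() below is a hypothetical helper that implements only those rules. The real .cstr_repair_attributes() does more and builds its arguments recursively rather than with deparse().

```r
# Hypothetical helper: append a structure() call for the attributes that the
# idiomatic constructor did not already set.
toy_repair_attributes <- function(x, code, ignore = NULL, idiomatic_class = NULL) {
  attrs <- attributes(x)
  # names and constructor-owned attributes are not handled here
  attrs[c("names", ignore)] <- NULL
  # no need to set the class if the constructor already produces it
  if (identical(attrs[["class"]], idiomatic_class)) attrs[["class"]] <- NULL
  if (length(attrs) == 0) return(code)
  values <- vapply(attrs, function(a) paste(deparse(a), collapse = " "), character(1))
  args <- paste(names(attrs), "=", values)
  c(paste(code, "|>"), sprintf("  structure(%s)", paste(args, collapse = ", ")))
}

x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
repaired <- toy_repair_attributes(
  x,
  "as.Date(c(\"2003-10-20\", \"2024-10-04\"))",
  idiomatic_class = "Date"
)
repaired
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\")) |>"
#> [2] "  structure(some_attr = 42)"

# nothing to repair: levels are ignored and the class is the idiomatic one
toy_repair_attributes(factor("a"), "factor(\"a\")", ignore = "levels", idiomatic_class = "factor")
#> [1] "factor(\"a\")"
```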

Register a new class

Registering a new class is done by defining and registering a .cstr_construct.?() method. In a package, you might register the method with {roxygen2} by using the “@export” tag.

Register new constructors

You should not attempt to manually modify the constructors object of the {constructive} package. Instead, you should:

  • Define an unexported constructor function
  • Call .cstr_register_constructors(class_name, constructor_name = constructor_function, ...)

Do the latter in .onLoad() if the new constructor is to be part of a package, for instance:

# in zzz.R
.onLoad <- function(libname, pkgname) {
  .cstr_register_constructors(
    class_name, 
    constructor_name1 = constructor1, 
    constructor_name2 = constructor2
  )
}
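To give an idea of what such a registration amounts to, here is a base R sketch using a hypothetical registry and a hypothetical toy_register_constructors(). It only illustrates the principle of adding named functions to a per-class environment and makes no claim about how .cstr_register_constructors() is implemented.

```r
# Hypothetical registry and registration helper, base R only.
toy_registry <- new.env()

toy_register_constructors <- function(class_name, ...) {
  funs <- list(...)
  stopifnot(length(funs) > 0, !is.null(names(funs)), all(names(funs) != ""))
  # create the environment for this class the first time it is used
  if (!exists(class_name, envir = toy_registry, inherits = FALSE)) {
    assign(class_name, new.env(), envir = toy_registry)
  }
  class_env <- get(class_name, envir = toy_registry, inherits = FALSE)
  for (nm in names(funs)) {
    assign(nm, funs[[nm]], envir = class_env)
  }
  invisible(NULL)
}

# an unexported constructor function, registered under the name "iso_string"
iso_string <- function(x, ...) sprintf("as.Date(%s)", deparse(format(x)))
toy_register_constructors("Date", iso_string = iso_string)

ls(toy_registry$Date)
#> [1] "iso_string"
toy_registry$Date$iso_string(as.Date("2003-10-20"))
#> [1] "as.Date(\"2003-10-20\")"
```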