A toolbox for debugging and refactoring in R

.title[
# A toolbox for debugging and refactoring in R
]
.author[
### Antoine Fabri
]
.author[
### cynkra GmbH
]
.date[
### June 8th, 2023
]

---

<style>
.big40 {
  font-size: 40px;
}

.xlarge { 
     font-size: 150%
}
.large { 
     font-size: 130% 
}
.medium {
     font-size: 80%
}
.small { 
     font-size: 70% 
}
.xsmall {
     font-size: 50%
}

.caption {
  text-align: center;
  font-size: .8rem;
}

.small-code .remark-code{
  font-size: 80%
}

.xsmall-code .remark-code{
  font-size: 50%
}
</style>

## Hi I'm Antoine


.pull-left[
* https://www.cynkra.com/about/
* https://github.com/moodymudskipper
* https://github.com/cynkra
* https://stackoverflow.com/
* https://twitter.com/antoine_fabri
]

---
class: small20

## Resources

```r
install.packages(c("tidyverse", "lintr", "styler", "usethis", "renv", "covr", "flow", "constructive"))
remotes::install_github("moodymudskipper/refactor")
remotes::install_github("moodymudskipper/boomer")
```

- {flow} : https://moodymudskipper.github.io/flow/
- {constructive} : https://cynkra.github.io/constructive/
- {boomer} : https://moodymudskipper.github.io/boomer/
- {refactor} : https://github.com/moodymudskipper/refactor
- Good practice: https://design.tidyverse.org
- Style: https://style.tidyverse.org
- Git: https://happygitwithr.com
- Packages: https://r-pkgs.org
- Base R debugging: 
  * https://adv-r.hadley.nz/debugging.html
  * https://www.youtube.com/watch?v=9vABzGCQeqU

---

## Warm up

-   **Warm up** 📌
-   Setup
-   Clean up
-   Step up
-   Fix up
-   Wrap up

---

## Warm up: Bugs everywhere

-   You'll spend about half of your time fixing bugs or introducing new ones
-   Debugging : bugs bugs bugs
-   Refactoring : planning for fewer bugs, introducing new bugs
-   Features : design them, try to break them, solve the bugs
-   R users are not developers

---
class: small20

## Warm up:  Mental game

-   You really really want to believe that the code you wrote is good
-   You don't know when you'll be done, your boss is waiting, your client is paying
-   You feel like an imposter because it "should be easy"
-   You catch yourself just staring at the screen, you're not even thinking anymore
-   Or you're thinking hard, but you don't really know what you're thinking about

---

## Warm up:  Mental game

-   Have a walk
-   Write it down
-   Talk it out

.pull-right[.center[
![](images/duck.png)
.caption[https://www.smbc-comics.com/]
]]

---

## Warm up: Art or science ?

A bit of both ?

---

## Warm up: How to get out of trouble?

Don't get in trouble!

-   There are ways to get out of trouble

-   But did you really have to get into trouble ?

---

## Warm up: How not to get into trouble

-   Follow good practice for your own code

-   Depend on good code, don't reinvent the wheel

-   Don't let technical debt accumulate

---

## Warm up:  Technical debt

-   Cost of taking shortcuts or making compromises
    -   Tight project deadlines,
    -   Lack of resources
    -   Inexperience
    -   Devs just wanna have fun ?
    - ...

Refactoring = investment to reduce technical debt

(and recover your sanity)

---

## Setup

-   Warm up
-   **Setup** 📌
-   Clean up
-   Step up
-   Fix up
-   Wrap up

---

## Setup

* We have a messy code base (maybe not your code)
* We want a clean package to enjoy dedicated tools

... Where do we start ?

![](images/hadoken.jpeg)

---
class: small20

## Setup

A path to refactor a messy codebase into a package.

Real project : order might not be strict, steps might overlap.

- Create a project

- Version control

- Syntactic code

- No absolute paths

- Create a package

- Declare dependencies

- Extract existing functions

---

## Setup: Create a project

- Create a project 📌

- Version control

- Syntactic code

- No absolute paths

- Create a package

- Declare dependencies

- Extract existing functions

---

## Setup: Create a project

- No more messy flying scripts on your desktop

```r
usethis::create_project("path/to/project") # `path` is required; this one is a placeholder
```

Organization / collaboration / reproducibility...

=> Sets you up for the next steps

---

## Setup: Version control

- Create a project : `usethis::create_project()`

- Version control 📌

- Syntactic code

- No absolute paths

- Create a package

- Declare dependencies

- Extract existing functions

---

## Setup: Version control

Without version control :

-   Email team about updates
-   Updates directly on production server
-   Previous code is lost, accidental deletions are deadly
-   No trace of who made the changes
-   Versions of code between users might be out of sync
-   Users are afraid to make any change

---

## Setup:  Version control

With Version control :

-   Version control itself is a communication tool
-   Work on branches without affecting production code until confident
-   All changes can be reverted
-   All changes and their author can be identified
-   Everyone is synced
-   No harm is irreversible, so users are more confident

---

## Setup:  Version control

.pull-left[
* Not much knowledge needed to start
* Can be done from RStudio
* Jenny Bryan https://happygitwithr.com

```r
usethis::use_git()
```
]

---

## Setup: Syntactic code

- Create a project : `usethis::create_project()`

- Version control : `usethis::use_git()`

- Syntactic code 📌

- No absolute paths

- Create a package

- Declare dependencies

- Extract existing functions

---

## Setup: Syntactic code

-   Codebases often contain non-syntactic code
    -   Messy WIP files
    -   Carelessly commented-out code
    -   Incorrect copy and paste
    -   ...

`refactor::check_files_parse()` checks all files of the project and makes sure that R scripts really are R scripts and that their code is syntactic.

```r
refactor::check_files_parse()
```
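
The idea can be sketched in base R. This is not the {refactor} implementation, just a minimal illustration built on `parse()`; the helper name `files_that_fail_to_parse()` is hypothetical.

```r
# Hypothetical helper: return the R scripts under `dir` that fail to parse
files_that_fail_to_parse <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.[rR]$", recursive = TRUE, full.names = TRUE)
  parses <- vapply(files, function(file) {
    # parse() fails on non-syntactic code, tryCatch() turns the error into FALSE
    tryCatch({
      parse(file, keep.source = FALSE)
      TRUE
    }, error = function(e) FALSE)
  }, logical(1))
  files[!parses]
}

# Usage on a throwaway folder containing one good and one broken script
tmp <- tempfile("demo")
dir.create(tmp)
writeLines("x <- 1 + 1", file.path(tmp, "good.R"))
writeLines("x <- (1 + ", file.path(tmp, "bad.R"))
basename(files_that_fail_to_parse(tmp))
```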

---

## Setup: No absolute paths

- Create a project : `usethis::create_project()`

- Version control : `usethis::use_git()`

- Syntactic code : `refactor::check_files_parse()`

- No absolute paths 📌

- Create a package

- Declare dependencies

- Extract existing functions

---

## Setup: No absolute paths

-   Avoiding absolute paths is the norm in software development
-   They force all users to use the same directory layout
-   First reason why your code is not reproducible

If external data is stored in files outside of the project :
- Paths to these files should be set in environment variables, options or config files
- Could you have a data package with those ?
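
A minimal sketch of the environment variable approach, in base R. The variable name `MYPROJECT_DATA_DIR`, the helper `data_dir()` and the file name are made up for the illustration.

```r
# Hypothetical variable name: each user sets MYPROJECT_DATA_DIR on their own
# machine, for instance in their .Renviron file
data_dir <- function() {
  dir <- Sys.getenv("MYPROJECT_DATA_DIR", unset = NA)
  if (is.na(dir)) {
    stop("Please set the MYPROJECT_DATA_DIR environment variable", call. = FALSE)
  }
  dir
}

# Simulate a user's setup for the demo, then build a path from it
Sys.setenv(MYPROJECT_DATA_DIR = tempdir())
sales_path <- file.path(data_dir(), "sales.csv")
```

The code itself contains no absolute path, so each user can keep their own directory layout.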

---

## Setup: No absolute paths

Relative paths, relative to what ?

-   Relative paths are relative to the working directory
    -   By default, the project folder when running an R script inside a project
    -   By default, the Rmd file's folder when knitting a report
    -   A function might call `setwd()` and alter it, and then your scripts don't work anymore
    -   They are often built with `file.path()`

Using `setwd()` sets you up for bad surprises: other scripts can call `setwd()` too and disrupt your code, possibly writing files in the wrong places.

---

## Setup: No absolute paths

`here::here()` creates a path relative to the project folder. When {here} is loaded, it takes the current working directory (often, but not always, the project root itself) and finds the project root using heuristics.

-   It guarantees your scripts and Rmds will refer to the same project root
-   Functions that use it won't be polluted by a user or function calling `setwd()`

---

## Setup: No absolute paths

* Use {lintr} to detect absolute paths, so you can convert them, and to find
  problematic function calls.
* Use `here::here()` in markdown reports so their paths start from the same project root as in your R scripts

```r
library(lintr)

## Find absolute paths
lint_dir(linters = absolute_path_linter())

## Find uses of undesirable functions setwd and getwd
lint_dir(linters = undesirable_function_linter(c(setwd = NA, getwd = NA)))

## in a markdown report
here::here("hello", "world.png")
```

---

## Setup: Create a package

- Create a project : `usethis::create_project()`

- Version control : `usethis::use_git()`

- Syntactic code : `refactor::check_files_parse()`

- No absolute paths : `lintr::lint_dir()`

- Create a package 📌

- Declare dependencies

- Extract existing functions

---

## Setup:  Create a package

-   Let's  make our current project a package!

```r
usethis::create_package(".")  # in the current directory
usethis::create_package(path) # at a chosen location
```

- Move or copy our current project into an "inst/" subfolder.
- Or edit `.Rbuildignore` to ignore some folders
- We have a package! (with no objects yet!)
- Hadley Wickham https://r-pkgs.org

---

## Setup: Declare dependencies

- Create a project : `usethis::create_project()`

- Version control : `usethis::use_git()`

- Syntactic code : `refactor::check_files_parse()`

- No absolute paths : `lintr::lint_dir()`

- Create a package : `usethis::create_package()`

- Declare dependencies 📌

- Extract existing functions

---

## Setup: Declare dependencies

-   With scripts we declare dependencies with `library()` calls
-   Or we use the `dplyr::select()` notation and we often don't declare anything at all

-   For packages we need to add dependencies to the DESCRIPTION file, then we can use the `dplyr::select()` notation
-   If we want to call `select()` without prefix we also need to import it in our package

---

## Setup: Declare dependencies

-   `renv::dependencies()` will analyse your code and attempt to retrieve all dependencies.
-   Then edit the DESCRIPTION file to list those in `Imports`, or use `usethis::use_package("dplyr")`

-   For meta packages like {tidyverse}, mention names separately: ggplot2, tibble, tidyr, readr, purrr, dplyr, stringr, forcats

---

## Setup: Declare dependencies

-   Find all the `library()`/`require()` calls in your project
-   Create an "R/imports.R" script with one line per package attached with `library()` :

```r
#' @import dplyr
#' @import ggplot2
NULL
```

- Ctrl + Shift + D (or `devtools::document()`) will populate the NAMESPACE file
- Ctrl + Shift + L (or `devtools::load_all()`) will attach the imported functions
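
Finding the `library()`/`require()` calls can be sketched in base R by walking the parsed code. The helper `attached_packages()` is hypothetical and deliberately naive: it would, for instance, be fooled by `library(pkg, character.only = TRUE)`.

```r
# Hypothetical helper: names of packages attached with library()/require() in `code`
attached_packages <- function(code) {
  found <- character()
  walk <- function(call) {
    fun <- call[[1]]
    if (is.name(fun) && as.character(fun) %in% c("library", "require") && length(call) > 1) {
      found <<- c(found, as.character(call[[2]]))
    }
    # recurse into nested calls only (this also skips empty arguments as in x[, 1])
    for (i in seq_along(call)) {
      if (is.call(call[[i]])) walk(call[[i]])
    }
  }
  for (expr in as.list(parse(text = code, keep.source = FALSE))) {
    if (is.call(expr)) walk(expr)
  }
  unique(found)
}

attached_packages(c(
  "library(dplyr)",
  "if (require('ggplot2')) message('ok')",
  "x <- stats::sd(1:10)"
))
```

Here the result contains "dplyr" and "ggplot2", but not "stats", which is only used through the `::` notation.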

---
class: small20

## Setup: Declare dependencies

- Create a project : `usethis::create_project()`

- Version control : `usethis::use_git()`

- Syntactic code : `refactor::check_files_parse()`

- No absolute paths : `lintr::lint_dir()`

- Create a package : `usethis::create_package()`

- Declare dependencies : `renv::dependencies()`, `usethis::use_package()`

- Extract existing functions 📌

---

## Setup: Extract existing functions

Function definitions :

-   Are not expensive to source
-   Can be called in any order
-   Can be run before the analysis
-   Might clutter the scripts if they are not isolated
-   Are better ultimately stored in packages
-   Let's move them all to scripts under "R/"

```r
# Find scripts that contain both function definitions and other objects
refactor::identify_hybrid_scripts()
```
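
What "hybrid" means can be sketched in base R. This is an assumption about the idea, not the {refactor} implementation, and `is_hybrid_script()` is a hypothetical name.

```r
# Hypothetical helper: TRUE if `code` mixes top-level function definitions
# with other top-level expressions
is_hybrid_script <- function(code) {
  exprs <- as.list(parse(text = code, keep.source = FALSE))
  defines_function <- vapply(exprs, function(expr) {
    is.call(expr) &&
      as.character(expr[[1]])[1] %in% c("<-", "=") &&
      is.call(expr[[3]]) &&
      identical(expr[[3]][[1]], as.name("function"))
  }, logical(1))
  any(defines_function) && !all(defines_function)
}

# A definition followed by an analysis step: hybrid
is_hybrid_script(c("f <- function(x) x + 1", "result <- f(2)"))
# Only definitions: not hybrid, ready to move under "R/"
is_hybrid_script(c("f <- function(x) x + 1", "g <- function(x) x * 2"))
```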

---
class: small20

## Setup

- Create a project : `usethis::create_project()`

- Version control : `usethis::use_git()`

- Syntactic code : `refactor::check_files_parse()`

- No absolute paths : `lintr::lint_dir()`

- Create a package : `usethis::create_package()`

- Declare dependencies : `renv::dependencies()`, `usethis::use_package()`

- Extract existing functions : `refactor::identify_hybrid_scripts()`

---

## Clean up

-   Warm up
-   Setup
-   **Clean up** 📌
-   Step up
-   Fix up
-   Wrap up

---

## Clean up

* We have a working package
* We didn't lose any information
* We can recover an older situation if we observe unintended behavior
* Now we can tighten things up a bit!

---

## Clean up: Unit tests

- Unit tests 📌

- Extract new functions

- lint and style

- Explicit function imports

- Strive for quiet code

---
class: small20

## Clean up: Unit tests

Unit tests :
  * Ensure that your functions work
  * Ensure that what works keeps working
  
-   {testthat} is a common framework for unit tests
-   Snapshot tests are the easiest, they capture the output of a call and test it against subsequent runs
-   They play very well with version control
-   The priority is to write tests for functions you want to change

```r
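usethis::use_testthat() # set up "tests/testthat/" (run once)
usethis::use_test()     # create or open a test file

# in the test file
test_that("my_function() output is stable", {
  expect_snapshot({
    x <- "hello"
    y <- "world"
    my_function(x, y)
  })
})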
```

---
class: small20

## Clean up: Unit tests

coverage

-   Coverage is a measure of how much of the code is tested
-   Also makes testing a bit more fun

```r
covr::report()
```

![](images/coverage1.png)

---
class: small20

## Clean up: Unit tests

coverage

```r
covr::report()
```

![](images/coverage2.png)
---

## Clean up: Extract new functions

- Unit tests

- Extract new functions 📌

- lint and style

- Explicit function imports

- Strive for quiet code

---

## Clean up: Extract new functions

Scripts are easier to start than functions :

-   No need to think about a function name, no need to isolate arguments and output value
-   We can run them line by line, no need for `browser()` or `debug()`/`debugonce()`

BUT :

* They're a slippery slope, the garden grows!
* If a script is confusing, it should probably be refactored into one or more functions
* Too many functions : rarely an issue, what about too many scripts ?

---

## Clean up: Extract new functions

* Build functions from scripts and place them in scripts under "R/"
* Hunt down `source()` calls and convert them to proper function calls
* Use `refactor::detect_similar_code()` to identify duplicated logic
* Go through these resources :
    -   https://design.tidyverse.org
    -   https://style.tidyverse.org
* Write new unit tests!

---

## Clean up: Extract new functions

-   A good script is self-contained, which means
    -   It loads all it needs
    -   It writes its output and stops there: it is not there to populate the global environment with variables for further scripts to pick up

---

## Clean up: Extract new functions

-   A good function works like a sourced script, except it has :
    -   Well defined inputs, aka arguments (with optional defaults)
    -   A well defined return value OR a well defined side effect
    -   A scope
      - less worry about name collisions
      
---

## Clean up: Extract new functions

Pure functions:

- A pure function's only effect is to return an output
- This output depends only on its inputs

Side effect functions:

- Invisibly return their main argument or `NULL`

Avoid hybrid functions!
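
A small base R sketch of the two kinds of functions. The names `add_vat()` and `write_prices()` are invented for the example.

```r
# Pure: the result depends only on the inputs, and nothing else is changed
add_vat <- function(price, rate = 0.2) {
  price * (1 + rate)
}

# Side effect: writes a file, then invisibly returns its main argument
write_prices <- function(prices, path) {
  writeLines(format(prices, nsmall = 2), path)
  invisible(prices)
}

prices <- add_vat(c(10, 20))
path <- tempfile(fileext = ".txt")
write_prices(prices, path)
```

A hybrid version would compute the prices and write the file in the same function, which makes it harder to test and to reuse.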

---

## Clean up: lint and style

- Unit tests

- Extract new functions

- lint and style 📌

- Explicit function imports

- Strive for quiet code

---

## Clean up: lint and style

- We can use the {lintr} package to improve our code step by step, we might use
  `lintr::lint_package()`
- `styler::style_pkg()` will make your code look good and consistent.
-   `refactor:::use_lintr_template_on_dir()` will open a script where we can go through different linters

```r
refactor:::use_lintr_template_on_dir()
```

---

## Clean up: Explicit function imports

- Unit tests

- Extract new functions

- lint and style

- Explicit function imports 📌

- Strive for quiet code

---

## Clean up: Explicit function imports

- `#' @import dplyr` to `#' @importFrom dplyr select`
- `#' @import dplyr` to `dplyr::select()`
    -   to avoid conflicts due to new functions
    -   to be notified quickly if a function disappears
    -   to document what features we need from other packages

```r
refactor::find_pkg_funs("dplyr")
```
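
The underlying idea can be sketched in base R. This is a guess at the principle, not the {refactor} implementation; `used_exports()` is a hypothetical helper, shown here with {stats} so it runs without extra packages. It is naive: a variable that shares its name with an exported function would also be reported.

```r
# Hypothetical helper: exported functions of `pkg` whose names appear in `code`
used_exports <- function(code, pkg) {
  names_used <- unique(all.names(parse(text = code, keep.source = FALSE)))
  sort(intersect(names_used, getNamespaceExports(pkg)))
}

used_exports("y <- sd(x) / median(x)", "stats")
```

Each name returned is a candidate for an `@importFrom` tag or a `pkg::fun()` call.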

---

## Clean up: Strive for quiet code

- Unit tests

- Extract new functions

- lint and style

- Explicit function imports

- Strive for quiet code 📌

---

## Clean up: Strive for quiet code

* Code that talks too much is a sign some things are not robust
* We tend not to read it

- Avoid most warnings as if they were errors
- Avoid messages too when you can
- Give an explicit `by` argument to {dplyr} join functions
- Provide the `col_types` argument to `readr::read_csv()`
- Ungroup your data!!! Use `.groups = "drop"` or `.by`
- ...
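
One way to check how much a piece of code talks is to count the conditions it signals. A base R sketch, where `count_noise()` is a hypothetical helper:

```r
# Hypothetical helper: evaluate `expr` and count the warnings and messages it emits
count_noise <- function(expr) {
  n_warnings <- 0
  n_messages <- 0
  withCallingHandlers(
    expr,
    warning = function(w) {
      n_warnings <<- n_warnings + 1
      invokeRestart("muffleWarning")
    },
    message = function(m) {
      n_messages <<- n_messages + 1
      invokeRestart("muffleMessage")
    }
  )
  c(warnings = n_warnings, messages = n_messages)
}

noisy <- count_noise(as.integer(c("1", "2", "three"))) # coercion warning
quiet <- count_noise(as.integer(c("1", "2")))          # clean input, no noise
```

The goal is to fix the cause of the noise (here, the input), not to hide it with `suppressWarnings()`.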

---

## Clean up: Strive for quiet code

- Unit tests

- Extract new functions

- lint and style

- Explicit function imports

- Strive for quiet code 📌

---

## Clean up: Extra steps

Not always required

* Export functions called directly by the scripts
  * Install your package
  * Move remaining scripts to a different project that calls `library(yourpkg)` 
  * OR Create Rmd report that call `devtools::load_all()` or `library(yourpkg)`
  
---

## Step up

-   Warm up
-   Setup
-   Clean up
-   **Step up** 📌
-   Fix up
-   Wrap up

---

## Step up

* We have a clean package
* We have a set of steps/principles to apply to improve it and keep it clean

Some additional tools might help

* {refactor}
* {flow}

---
class: big40

## Step up : {refactor}

* {refactor} 📌
* {flow}

---

## Step up : {refactor}

* `remotes::install_github("moodymudskipper/refactor")`
* `%refactor_value%`
* `%refactor_chunk%`

- If functions are well tested we can change them fearlessly
- But how much do you trust your tests ?
- How nice would it be to test your refactoring safely on real cases continuously for some time ?

---
class: small20

## Step up : {refactor}

`%refactor_value%`

* Runs each side
  * Fails explicitly when the output is different
  * Often used with functions

```r
library(refactor) # or import in your pkg 
multiply <- function(x, y) {
  purrr::reduce(
    replicate(y, x), 
    .init = x, 
    \(x, y) x + y
  ) - x
}  %refactor_value% {
  x * y
}
```

---
class: small20

## Step up : {refactor}

Then we can use it for some time: both sides will be executed, and we'll be notified if they give a different output.

```r
multiply(2, 3)
```

```
## [1] 6
```

```r
multiply(2, 4.5)
```

```
## Error: The refactored expression returns a different value from the original one.
## 
##   `original`: 8
## `refactored`: 9
```
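
The principle can be sketched with a toy operator in base R. This is NOT how {refactor} is implemented, only an illustration of "run both sides and compare"; `%same_value%` and `repeat_and_sum()` are made-up names.

```r
# Hypothetical operator: evaluate both sides and fail if the values differ
`%same_value%` <- function(original, refactored) {
  old <- original
  new <- refactored
  if (!isTRUE(all.equal(old, new))) {
    stop("The refactored code returns a different value from the original code")
  }
  new
}

repeat_and_sum <- function(x, n) {
  sum(rep(x, n))
} %same_value% {
  x * n
}

repeat_and_sum(2, 3)
# repeat_and_sum(2, 4.5) fails: rep() truncates 4.5 to 4, so 8 differs from 9
```

The function body is the whole `{...} %same_value% {...}` expression, so both versions run on every call.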

---
class: small20

## Step up : {refactor}

`%refactor_chunk%`

* Runs each side
  * Fails explicitly when the environment changes are different
  * Useful in scripts

```r
{
  data1 <- dplyr::filter(cars, speed < 5)
  data2 <- dplyr::mutate(data1, speed = speed * 1.60934, speed2 = speed * 1000/3600)
} %refactor_chunk% {
  data1 <- subset(cars, speed < 5)
  data2 <- transform(data1, speed = speed * 1.60934, speed2 = speed * 1000/3600)
}
```

---
class: small20

## Step up : {refactor}

```
## Error: The variable `data2` is bound to a  different value after the original and refactored code
## original vs refactored
##                     speed2
## - original[1, ]   1.788156
## + refactored[1, ] 1.111111
## - original[2, ]   1.788156
## + refactored[2, ] 1.111111
## 
##   `original$speed2`: 1.8 1.8
## `refactored$speed2`: 1.1 1.1
```
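
Comparing environment changes can be sketched in base R as well. Again this is an assumption about the principle, not the {refactor} code; `diff_chunks()` is a hypothetical helper.

```r
# Hypothetical helper: run two chunks in separate scratch environments and
# return the names of the variables they create or set differently
diff_chunks <- function(original, refactored, env = parent.frame()) {
  run <- function(code) {
    scratch <- new.env(parent = env)
    eval(code, scratch)
    as.list(scratch, all.names = TRUE)
  }
  old <- run(substitute(original))
  new <- run(substitute(refactored))
  vars <- union(names(old), names(new))
  differs <- vapply(vars, function(var) !identical(old[[var]], new[[var]]), logical(1))
  sort(vars[differs])
}

leftovers <- diff_chunks(
  {
    total <- 0
    for (value in c(1, 2, 3)) total <- total + value
  },
  {
    total <- sum(c(1, 2, 3))
  }
)
leftovers
```

Both chunks give the same `total`, but the original also leaves the loop variable `value` behind, so the two are not strictly equivalent in a script.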

---
class: big40

## Step up : {flow}

* {refactor}
* {flow} 📌

---
class: small20

## Step up : {flow}

{flow} helps you visualize:

* The logic of individual functions or scripts
* The dependencies between variables in a given function or script
* The dependencies between functions in a package, or scripts in a folder

```r
library(flow)
```

---
class: small20

## Step up : {flow}

Visualize the **logic of individual functions or scripts**

* `flow_view()` can be used on functions or paths

```r
flow_view(rle)
```
.center[
<img src="images/flow_view_rle.png" style="width: 80%" />
]

```r
rle
```

```
## function (x) 
## {
##     if (!is.vector(x) && !is.list(x)) 
##         stop("'x' must be a vector of an atomic type")
##     n <- length(x)
##     if (n == 0L) 
##         return(structure(list(lengths = integer(), values = x), 
##             class = "rle"))
##     y <- x[-1L] != x[-n]
##     i <- c(which(y | is.na(y)), n)
##     structure(list(lengths = diff(c(0L, i)), values = x[i]), 
##         class = "rle")
## }
## <bytecode: 0x13b2ad840>
## <environment: namespace:base>
```

---
class: small20

## Step up : {flow}

* Easier to follow logic for long functions
* A good format to share with colleagues and management
* Lets you identify branches to refactor
* We can use the `out` arg to export or open temp files

```r
flow_view(data.frame)
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between variables** in a given function or script

* `flow_view_vars()` can be used on functions or paths

```r
flow_view_vars(ave)
```
.center[
<img src="images/flow_view_vars_ave.png" style="width: 50%" />
]

```r
ave
```

```
## function (x, ..., FUN = mean) 
## {
##     if (missing(...)) 
##         x[] <- FUN(x)
##     else {
##         g <- interaction(...)
##         split(x, g) <- lapply(split(x, g), FUN)
##     }
##     x
## }
## <bytecode: 0x11f74ec78>
## <environment: namespace:stats>
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between variables** in a given function or script

* `flow_view_vars()` can be used on functions or paths

```r
flow_view_vars(ave, expand = FALSE)
```
.center[
<img src="images/flow_view_vars_ave2.png" style="width: 50%" />
]

```r
ave
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between functions** in a package, or scripts in a folder

* `flow_view_deps()` shows the objects called recursively by its input

```r
flow_view_deps(dplyr::if_else)
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between functions** in a package, or scripts in a folder

* `flow_view_deps()` shows the objects called recursively by its input

```r
flow_view_deps(dplyr::if_else, show_imports = "packages")
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between functions** in a package, or scripts in a folder

* `flow_view_deps()` shows the objects called recursively by its input

```r
flow_view_deps(dplyr::if_else, show_imports = "none")
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between functions** in a package, or scripts in a folder

* `flow_view_uses()` shows the functions which call the input directly or indirectly

```r
flow_view_uses(dplyr::if_else)
```

---
class: small20

## Step up : {flow}

Visualize the **dependencies between functions** in a package, or scripts in a folder

* `flow_view_shiny()` shows the modular structure of a shiny app

```r
flow_view_shiny(esquisse::esquisser, show_imports = "none")
```

---

## Step up : {flow}

* {refactor} : continuous live testing
* {flow} : Understand the logic, improve the vision

---

## Fix up

-   Warm up
-   Setup
-   Clean up
-   Step up
-   **Fix up** 📌
-   Wrap up

---

## Fix up : Bugs!

Despite our efforts, trouble found us

---

## Fix up

* What's in a bug ? 📌
* Base R toolkit
* {flow}
* {constructive}
* {boomer}

---
class: small20

## Fix up: What's in a bug ?

A bug is a misunderstanding by the dev or user of either

* The Logic
* The State
* The Data

Where state is made of :
* versions
* OS
* system dependencies
* internet access etc.
* Random seed
* environment stack
* ...

---
## Fix up: What's in a bug ?

* The error message is an attempt to summarize those for you
* The documentation is an attempt to give you just the knowledge about the
  logic that you need
* You should really read them!
* But it might not be enough

---
class: small20

## Fix up: What's in a bug ?

Minimal reprex: 
  * Minimize the logic: minimal code, minimal dependencies
  * Minimize the state: Run in new session
  * Minimize the data

In those 3 dimensions we can either:
  * start from nothing and add elements until it breaks
  * start from a complex case and remove elements until it works

---
class: small20

## Fix up: What's in a bug ?

* Use the {reprex} package! 
* Capture those 3 dimensions into something pretty and concise
* You're halfway there
* And likely to get an answer if you ask 
  * http://www.stackoverflow.com
  * RStudio community
  * twitter

---

## Fix up

* What's in a bug ? 
* Base R toolkit 📌
* {flow}
* {constructive}
* {boomer}

---
class: small20

## Fix up: Base R toolkit

Many resources already available, this one is great:
  * Quant Psych: debugging strategy for R
  * https://www.youtube.com/watch?v=9vABzGCQeqU
 
I'll do a quick summary

---
class: small20

## Fix up: Base R toolkit

- `options(warn = 2)` : Turn warnings into errors to better identify the situation at the time of the warning
- `options(error = recover)` : Explore data at different places in the call stack
- `traceback()` : The sequence of calls that got you to an error
- `browser()`, `debug()`, `debugonce()` : Explore the logic step by step from
  a given point, explore the data.
- `typeof()`, `attributes()`, `str()`, `dput()`: often better than `print()`
  to understand objects.
- `message()`, `cat()`: Log information to the console, or to a file, you can use
  `trace()`, `trace(,edit = TRUE)`, `untrace()` to insert logging calls in any function
  temporarily.
- `try()`, `tryCatch()`: Capture errors and, for instance, log or browse when they occur
- `search()`, `sessionInfo()`, `Sys.info()`: Explore the state
- `on.exit()`: Run some code whenever a function exits, including on error
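
A few of these tools combined in a short sketch. The function `parse_amount()` is a made-up example.

```r
# Hypothetical example: convert text to a number and log problems on the way
parse_amount <- function(x) {
  message("parse_amount() called")          # log to the console
  on.exit(message("parse_amount() exited")) # runs on success and on error
  tryCatch(
    as.numeric(x),
    warning = function(w) {
      # a warning was raised: log it and fall back to NA
      message("Problem with input '", x, "': ", conditionMessage(w))
      NA_real_
    }
  )
}

ok <- parse_amount("12.5")
bad <- parse_amount("12,5") # the comma triggers a coercion warning
```

While investigating, replacing the logging handler with `function(w) browser()` lets you explore the state at the time of the warning.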

---

## Fix up: {flow}

* What's in a bug ? 
* Base R toolkit
* {flow} 📌
* {constructive}
* {boomer}

---
class: small20

## Fix up: {flow}

Let's see how `flow_run()` can help us understand a bug better

```r
# this works
df <- data.frame(
  x = "Keep calm and",
  y = "love",
  z = "Ukraine"
)
df
```

```
##               x    y       z
## 1 Keep calm and love Ukraine
```

---
class: small20

## Fix up: {flow}

```r
# this also works
df$y <- emo::ji("heart")
df$z <- emo::ji("ukraine")
df
```

```
##               x y  z
## 1 Keep calm and ❤️ 🇺🇦
```

---
class: small20

## Fix up: {flow}

```r
# but this doesn't, why ?
df <- data.frame(
  x = "Keep calm and",
  y = emo::ji("heart"),
  z = emo::ji("ukraine")
)
```

```
## Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors): cannot coerce class '"emoji"' to a data.frame
```

---
class: small20

## Fix up: {flow}

Let's explore

```r
flow_run(data.frame(
  x = "Keep calm and",
  y = emo::ji("heart"),
  z = emo::ji("ukraine")
), out = "png")
```

---
class: small20

## Fix up: {flow}

The function `flow_compare_runs()` compares two calls to the same
function.

```r
flow_compare_runs(rle(NULL), rle(c(1, 2, 2, 3)))
```

---

## Fix up: {constructive}

* What's in a bug ? 
* Base R toolkit
* {flow}
* {constructive} 📌
* {boomer}

---
class: small20

## Fix up: {constructive}

* {constructive} strives to represent data simply and accurately.
* Accuracy is crucial when debugging
* We tend to trust `print()` but we shouldn't

```r
df1 <- data.frame(date = as.Date("2023-08-24"), country = factor("UA"))
df2 <- data.frame(date = "2023-08-24", country = "UA")
attr(df2$date, "attr_date") <- "national_day"
attr(df2, "attr_df") <- "some_dataset"
df1
```

```
##         date country
## 1 2023-08-24      UA
```

```r
df2
```

```
##         date country
## 1 2023-08-24      UA
```
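
A quick base R check confirms that objects which print alike can still differ. The two data frames are recreated here, without the extra attributes, so the block runs on its own.

```r
df1 <- data.frame(date = as.Date("2023-08-24"), country = factor("UA"))
df2 <- data.frame(date = "2023-08-24", country = "UA", stringsAsFactors = FALSE)

same <- identical(df1, df2)    # are the two objects really the same?
classes1 <- sapply(df1, class) # column classes of df1
classes2 <- sapply(df2, class) # column classes of df2
```

`same` is `FALSE`: `df1` holds a Date and a factor, while `df2` holds two character columns.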

---
class: small20

## Fix up: {constructive}

`dput()` and `str()` are more accurate here, but hard to read:

```r
dput(df1)
```

```
## structure(list(date = structure(19593, class = "Date"), country = structure(1L, levels = "UA", class = "factor")), class = "data.frame", row.names = c(NA, 
## -1L))
```

```r
dput(df2)
```

```
## structure(list(date = structure("2023-08-24", attr_date = "national_day"), 
##     country = "UA"), row.names = c(NA, -1L), class = "data.frame", attr_df = "some_dataset")
```

```r
str(df1)
```

```
## 'data.frame':	1 obs. of  2 variables:
##  $ date   : Date, format: "2023-08-24"
##  $ country: Factor w/ 1 level "UA": 1
```

```r
str(df2)
```

```
## 'data.frame':	1 obs. of  2 variables:
##  $ date   : chr "2023-08-24"
##   ..- attr(*, "attr_date")= chr "national_day"
##  $ country: chr "UA"
##  - attr(*, "attr_df")= chr "some_dataset"
```

---
class: small20

## Fix up: {constructive}

By contrast `constructive::construct()` produces idiomatic code

```r
library(constructive)
construct(df1)
```

```
## data.frame(date = as.Date("2023-08-24"), country = factor("UA"))
```

```r
construct(df2)
```

```
## data.frame(
##   date = "2023-08-24" |>
##     structure(attr_date = "national_day"),
##   country = "UA"
## ) |>
##   structure(attr_df = "some_dataset")
```

---
class: small20

## Fix up: {constructive}

`constructive::construct_diff()` can be used to compare them

```r
construct_diff(df1, df2)
```

---
class: small20

## Fix up: {constructive}

Sometimes `dput()` is inaccurate.

```r
dput(dplyr::select)
```

```
## function (.data, ...) 
## {
##     UseMethod("select")
## }
```

```r
construct(dplyr::select)
```

```
## (function(.data, ...) {
##   UseMethod("select")
## }) |>
##   (`environment<-`)(asNamespace("dplyr"))
```

```r
dput(RSQLite::SQLite())
```

```
## new("SQLiteDriver", )
```

```r
construct(RSQLite::SQLite())
```

```
## new(
##   "SQLiteDriver" |>
##     structure(package = "RSQLite")
## )
```

---
class: small20

## Fix up: {constructive}

Sometimes its output is not even syntactic

```r
dput(environment(dplyr::select))
```

```
## <environment>
```

```r
construct(environment(dplyr::select))
```

```
## asNamespace("dplyr")
```

---
class: small20

## Fix up: {constructive}

Sometimes its output is not even syntactic

```r
dt <- data.table::data.table(a=1)
dput(dt)
```

```
## structure(list(a = 1), row.names = c(NA, -1L), class = c("data.table", 
## "data.frame"), .internal.selfref = <pointer: 0x14c80bee0>)
```

```r
construct(dt)
```

```
## data.table::data.table(a = 1)
```

---
class: small20

## Fix up: {constructive}

* We have a lot of control on how we want to generate the code
* This is done by using the `opts_*` functions implemented for each supported class

```r
construct(dplyr::band_members)
```

```
## tibble::tibble(name = c("Mick", "John", "Paul"), band = c("Stones", "Beatles", "Beatles"))
```

```r
construct(dplyr::band_members, opts_tbl_df("tribble"))
```

```
## tibble::tribble(
##   ~name,  ~band,
##   "Mick", "Stones",
##   "John", "Beatles",
##   "Paul", "Beatles",
## )
```

---
class: small20

## Fix up: {constructive}

Using the constructor "next" we can opt out of the idiomatic constructor
and use the next method, yet still get a faithful object.

```r
construct(dplyr::band_members, opts_tbl_df("next"))
```

```
## data.frame(name = c("Mick", "John", "Paul"), band = c("Stones", "Beatles", "Beatles")) |>
##   structure(class = c("tbl_df", "tbl", "data.frame"))
```

```r
construct(dplyr::band_members, opts_tbl_df("next"), opts_data.frame("next"))
```

```
## list(name = c("Mick", "John", "Paul"), band = c("Stones", "Beatles", "Beatles")) |>
##   structure(class = c("tbl_df", "tbl", "data.frame"), row.names = 1:3)
```

---
class: small20

## Fix up: {constructive}

To reproduce a bug `construct_multi()` is handy:

```r
a <- head(cars, 2)
b <- letters

construct_multi(list(a = a, b = b))
```

```
## a <- data.frame(speed = c(4, 4), dist = c(2, 10))
## 
## b <- c(
##   "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o",
##   "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
## )
```

---

## Fix up: {boomer}

* What's in a bug ? 
* Base R toolkit
* {flow}
* {constructive} 
* {boomer} 📌

---
class: small20

## Fix up: {boomer}

{boomer} makes expressions or functions verbose, so they print the output
of each intermediate step.

`boom()` explodes a call

```r
# remotes::install_github("moodymudskipper/boomer")
library(boomer)
boom(1 + !1 * 2)
boom(subset(head(mtcars, 2), qsec > 17))
```

---
class: small20

## Fix up: {boomer}

* `rig()` sets up a function so when it's called it prints all intermediate steps
* Use `rig_in_namespace()` to permanently rig a package function during development

```r
hello <- function(x) {
  if(!is.character(x) || length(x) != 1) {
    stop("`x` should be a string")
  }
  paste0("Hello ", x, "!")
}

hello2 <- rig(hello)

hello2("world")
```

---
class: small20

## Fix up: {boomer}

* `boom()` and `rig()` have a `print` arg to tweak the way values are printed.
* We can use {constructive} there.

Example from SO. A user wanted to understand what this does

```r
library(dplyr, warn.conflicts = FALSE)
fun <- function(df, Country_name){
  Country_name <- rlang::parse_expr(quo_name(enquo(Country_name)))
  df %>%
    filter(Country == Country_name)
}
df <- data.frame(x = 1:2, Country = c("Belgium", "Ukraine"))
df
```

---
class: small20

## Fix up: {boomer}

```r
fun2 <- boomer::rig(fun, print = constructive::construct)
fun2(df, Ukraine)
```

---

## Fix up: {boomer}

* What's in a bug ? 
* Base R toolkit
* {flow}
* {constructive} 
* {boomer}

---

## Wrap up

-   Warm up : general ideas
-   Setup : make a messy code base into a proper package
-   Clean up : make it better
-   Step up : Useful refactoring tools
-   Fix up : Useful debugging tools
-   **Wrap up** 📌

---

# Questions?

---

# Thank you!