---
title: The package development cycle
layout: default
---
# The package development cycle
The package development cycle describes the sequence of operations that you use when developing a package. You probably already have a sequence of operations that you're comfortable with when developing a single file of R code. It might be:
* Try something out on the command line.
* Modify it until it works, and then copy and paste the command into an R file.
* Every now and then, restart R and source the R file to make sure you
  haven't missed anything.
Or:
* Write all your functions in an R file.
* `source()` the file into your current session.
* Interactively try out the functions and see if they return the correct
  results.
* Repeat the above steps until the functions work the way you expect.
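
As a minimal sketch of this second workflow, the code below writes two functions to a file, sources it, and tries them out. The file name and the functions `add_one()` and `times_two()` are made up for illustration, and a temporary directory is used only so that the example is self-contained.

```r
# Write all your functions in an R file (hypothetical file and function names).
script <- file.path(tempdir(), "functions.R")
writeLines(c(
  "add_one <- function(x) x + 1",
  "times_two <- function(x) x * 2"
), script)

# source() the file into the current session.
source(script)

# Interactively try out the functions and check the results.
result_one <- add_one(1)
result_two <- times_two(add_one(1))
```

If a result is not what you expect, you would edit the file, call `source()` again, and repeat.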
Things get a bit harder when you're working on a package because you have multiple R files. You might also be a little bit more worried about checking that your functions work, not only now, on your computer, but also in the future and on other computers.
The sections below describe useful practices for developing packages, where you might have many R files and are also thinking about documentation and testing. These practices build on the `devtools` package, starting with its `dev_mode()` function, which helps to keep your production and development packages separate.
## Dev mode
When you're developing packages, you are typically developing them because you're using them, and that easily leads to confusion. When you're using a package for a data analysis or other project, you probably want the released version of the package, not the possibly buggy or broken in-development version. But when you're developing and testing a package, you obviously want the development version.
The `dev_mode()` function makes it easier to manage this distinction. When you're doing development, you can run `dev_mode()` to switch to a special library where you can install and test new packages. When you're doing regular work, you keep it off and R won't find the development packages:

    library(devtools)

    # During development
    dev_mode()
    install("mypackage")
    library(mypackage) # uses the development version

    # Fresh R session
    library(mypackage) # uses the released package you've installed previously

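
One way to picture what switching libraries means is to look at `.libPaths()`, the list of directories that R searches for installed packages. The sketch below uses only base R and a hypothetical directory name (`R-dev` inside a temporary directory). It illustrates the idea of a separate development library and should not be read as the `devtools` implementation.

```r
# Hypothetical location for a development library.
dev_lib <- file.path(tempdir(), "R-dev")
dir.create(dev_lib, showWarnings = FALSE, recursive = TRUE)

# Remember the regular library paths so they can be restored later.
old_paths <- .libPaths()

# "Development on": put the development library first, so packages
# installed there are found before the released versions.
.libPaths(c(dev_lib, old_paths))
dev_on <- .libPaths()[1] == normalizePath(dev_lib, winslash = "/")

# "Development off": restore the regular paths, so R no longer
# looks in the development library.
.libPaths(old_paths)
dev_off <- !(normalizePath(dev_lib, winslash = "/") %in% .libPaths())
```

Because the development library sits in its own directory, anything installed there stays invisible to your regular sessions.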
## Key functions
The three functions that you'll use most often are:
* `load_all("pkg")`, which loads code, data and C files. These are loaded into
  a non-global environment to avoid conflicts, and so that all functions can
  easily be removed. By default `load_all` will only load changed files to
  save time: if you want to reload everything from scratch, run
  `load_all("pkg", reset = TRUE)`.
* `test("pkg")`, which runs all tests in `inst/tests/` and reports the results.
* `document("pkg")`, which runs `roxygen` on the package to update all
  documentation.
This makes the development process very easy. After you've modified the code, you run `load_all("pkg")` to load it into R. You can explore it interactively from the command line, or type `test("pkg")` to run all your automated tests.
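
To see why loading into a non-global environment makes code easy to remove, here is a rough base R illustration. It is not how `load_all()` is implemented, and the file name and the function `greet()` are hypothetical.

```r
# Write a hypothetical package source file to a temporary location.
src <- file.path(tempdir(), "greet.R")
writeLines('greet <- function(name) paste("Hello,", name)', src)

# Evaluate the file in its own environment instead of the global one.
pkg_env <- new.env()
sys.source(src, envir = pkg_env)

# The function is available through that environment ...
greeting <- pkg_env$greet("R")

# ... but nothing was added to the global environment, so dropping
# `pkg_env` removes everything that was loaded.
in_global <- exists("greet", envir = globalenv(), inherits = FALSE)
```

Keeping loaded code in one place like this is what lets a reload start from a clean slate.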
Some GUIs have integrated support for these functions. RStudio provides a Build menu that lets you run `load_all` with a single key press (Ctrl + Shift + L at time of writing). ESS?
The following sections describe how you might combine these functions into a process.
## Development cycles
It's useful to distinguish between exploratory programming and confirmatory programming (in the same sense as exploratory and confirmatory data analysis) because the development cycle differs in several important ways.
### Confirmatory programming
Confirmatory programming happens when you know what you need to do and what the results of your changes will be (new feature X appears or known bug Y disappears); you just need to figure out how to do it. Confirmatory programming is also known as [test driven development][tdd] (TDD), a development style that grew out of extreme programming. The basic idea is that before you implement any new feature, or fix a known bug, you should:
1. Write an automated test, and run `test()` to make sure the test fails (so
   you know you've captured the bug correctly).
2. Modify the code to fix the bug or implement the new feature.
3. Run `test("pkg")` to reload the package and re-run the tests.
4. Repeat 2--3 until all tests pass.
5. Update documentation comments, run `document()`, and update `NEWS`.
For this paradigm, you might also want to use `testthat::auto_test()` which will watch your tests and code and will automatically rerun your tests when either changes. This allows you to skip step three: you just modify your code and watch to see if the tests pass or fail.
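
The test-first cycle can be sketched in a few lines. To stay self-contained, this example uses base R's `stopifnot()` in place of a real test file; in a package, the test would live in your test directory and be run with `test()`. The function `trim_ws()` is hypothetical.

```r
# Step 1: write the test before the code exists.
test_trim_ws <- function() {
  stopifnot(identical(trim_ws("  a  "), "a"))
  stopifnot(identical(trim_ws("b"), "b"))
  TRUE
}

# Running the test now fails, because trim_ws() is not defined yet.
first_run <- tryCatch(test_trim_ws(), error = function(e) "failed")

# Step 2: implement the new feature.
trim_ws <- function(x) gsub("^\\s+|\\s+$", "", x)

# Step 3: re-run the test; it should now pass.
second_run <- test_trim_ws()
```

Seeing the test fail first is what tells you the test really exercises the behaviour you are about to add.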
### Exploratory programming
Exploratory programming is the complement of confirmatory programming: you have some idea of what you want to achieve, but you're not sure about the details. You're not sure what the functions should look like, what arguments they should have and what they should return. You may not even be sure how you are going to break down the problem into pieces. In exploratory programming, you're exploring the solution space by writing functions, and you need the freedom to rewrite large chunks of the code as you understand the problem domain better.
The exploratory programming cycle is similar to confirmatory, but it's not usually worth writing the tests before writing the code, because the interface will change so much:
1. Edit code and reload with `load_all()`.
2. Test interactively.
3. Repeat 1--2 until code works.
4. Write automated tests and `test()`.
5. Update documentation comments, run `document()`, and update `NEWS`.
The automated tests are still vitally important because they are what will prevent your code from failing silently in the future.
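
As a small sketch of step 4, the checks you typed at the console while exploring can be written down as automated tests once the interface settles. The function `rescale01()` is a hypothetical example, and the `testthat` calls are guarded so that the code still runs if `testthat` is not installed.

```r
# A hypothetical function that emerged from interactive exploration.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Step 2: a check first tried interactively at the console.
console_check <- rescale01(c(0, 5, 10))

# Step 4: the same expectations recorded as automated tests.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("rescale01 maps the range onto [0, 1]", {
    testthat::expect_equal(rescale01(c(0, 5, 10)), c(0, 0.5, 1))
    testthat::expect_equal(rescale01(c(-1, 1)), c(0, 1))
  })
}
```

In a package, the `test_that()` block would go in a file in your test directory so that `test()` picks it up.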
[devtools-down]:https://github.com/hadley/devtools/tarball/master
[tdd]:http://en.wikipedia.org/wiki/Test-driven_development