diff --git a/_includes/pompstyle.css b/_includes/pompstyle.css index d757bab9..fcaa4e82 100644 --- a/_includes/pompstyle.css +++ b/_includes/pompstyle.css @@ -10,7 +10,7 @@ body { code { font-family: monospace; - font-size: 1.125em; + font-size: 125%; } h1, h2, h3, h4 { @@ -48,10 +48,12 @@ a:link, a:visited { color: #0000ff; text-decoration: none; } + a:hover, a:active { - color: #cc3333; + color: #cc3333; text-decoration: none; } + a.activated { text-decoration: underline; } @@ -81,20 +83,24 @@ dt { } .emph { - color: #cc3333; + color: #cc3333; font-weight: bold; } .emph1 { - color: #3333ff; + color: #3333ff; font-weight: bold; } -.firstcharacter { - float: left; - color: #3333ff; +.firstcharacter { + float: left; + color: #3333ff; font-size: 2.5em; line-height: 0.8em; vertical-align: top; padding-right: 5px; } + +.sourceCode, .language-r, .language-c, .language-sh { + color: #000066; +} diff --git a/blog.html b/blog.html index 3f1b8a0f..b39055de 100644 --- a/blog.html +++ b/blog.html @@ -1,6 +1,6 @@ --- layout: pomp -id: blog +id: news title: pomp news blog ---
Xcode
app installed.
Xcode
is free and can be installed according to [these instructions](https://mac.r-project.org/tools/){:target="_blank"}.
@@ -66,21 +68,21 @@ To use **pomp**'s compilation facility, you need to have the Xcode
In addition, some users report problems installing **pomp** from source due to lack of an appropriate **gfortran** installation, which is not included by default in all versions of **Xcode**.
If you have this problem, see [these instructions](https://mac.r-project.org/tools/){:target="_blank"}.
To test your Xcode
and Fortran installations, run the following in an **R** session:
-```
+```r
source("https://kingaa.github.io/scripts/hello.R",echo=TRUE)
```
On success, you will see two "Hello!" messages.
------------------------
-## Important note for Linux users
+### Important note for Linux users
To install **pomp** from source, you will need the `gfortran` compiler on your machine and will therefore may need to install it.
To do so on a Debian-based system (e.g., Ubuntu), for example, run:
-```
+```sh
sudo apt install gfortran
```
On an RPM-based system, run:
-```
+```sh
sudo yum install gcc-gfortran
```
diff --git a/vignettes/C_API.Rmd b/vignettes/C_API.Rmd
index 7f855865..8a242322 100644
--- a/vignettes/C_API.Rmd
+++ b/vignettes/C_API.Rmd
@@ -3,27 +3,13 @@ title: 'pomp C API'
output:
html_document:
theme: null
- highlight: kate
+ highlight: haddock
toc: yes
toc_depth: 3
params:
prefix: "c_api"
---
-```{css echo=FALSE}
-a:link, a:visited {
- color: #0000ff;
- text-decoration: none;
-}
-a:hover, a:active {
- color: #cc3333;
- text-decoration: none;
-}
-code {
- font-size: 110%;
-}
-```
-
```{r knitr-opts,include=FALSE,purl=FALSE,cache=FALSE}
source("setup.R", local = knitr::knit_global())
```
@@ -189,16 +175,16 @@ void to_log_barycentric(double *xt, const double *x, int n);
void from_log_barycentric(double *xt, const double *x, int n);
```
-The log-barycentric transformation takes a vector $X_i\in\mathbb{R}^n_+$, $i=1,\dots,n$, to a vector $Y_i\in\mathbb{R}^n$, where
+The log-barycentric transformation takes a vector $X\in\mathbb{R}^n_+$ to a vector $Y\in\mathbb{R}^n$, where
$$Y_i = \log\frac{X_i}{\sum_j\!X_j}.$$
-The log-barycentric transformation takes every simplex defined by $\sum_i\!X_i = c$, $c$ constant, to $n$-dimensional Euclidean space $\mathbb{R}^n$.
+For every $c>0$, this transformation maps the simplex $\{X\in\mathbb{R}^n_+\;\vert\;\sum_i\!X_i = c\}$ bijectively onto $\mathbb{R}^n$.
The pseudo-inverse transformation takes $\mathbb{R}^n$ to the unit simplex $S=\{X\in\mathbb{R}^n_+\;\vert\;\sum_i\!X_i=1\}$.
Specifically,
$$X_i = \frac{e^{Y_i}}{\sum_j\!e^{Y_j}}.$$
Note that if $T:\mathbb{R}^n_+\to\mathbb{R}^n$ is the log-barycentric transformation so defined, $U$ is the pseudo-inverse, and $Id$ denotes the identity map, then $T\circ U=Id:\mathbb{R}^n\to\mathbb{R}^n$ but $U\circ T\ne Id$.
-However, if $T$ is restricted to the unit simplex, then $U\circ T=Id:S\to S$.
+However, if $T$ is restricted to the unit simplex S, then $U\circ{T\vert_{S}}=Id:S\to S$.
Input:
diff --git a/vignettes/C_API.html b/vignettes/C_API.html
index 0e8a43c1..6caa4127 100644
--- a/vignettes/C_API.html
+++ b/vignettes/C_API.html
@@ -89,46 +89,38 @@
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
- background-color: #ffffff;
- color: #a0a0a0;
+ color: #aaaaaa;
}
-pre.numberSource { margin-left: 3em; border-left: 1px solid #a0a0a0; padding-left: 4px; }
+pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
- { color: #1f1c1b; background-color: #ffffff; }
+ { }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
-code span { color: #1f1c1b; } /* Normal */
-code span.al { color: #bf0303; background-color: #f7e6e6; font-weight: bold; } /* Alert */
-code span.an { color: #ca60ca; } /* Annotation */
-code span.at { color: #0057ae; } /* Attribute */
-code span.bn { color: #b08000; } /* BaseN */
-code span.bu { color: #644a9b; font-weight: bold; } /* BuiltIn */
-code span.cf { color: #1f1c1b; font-weight: bold; } /* ControlFlow */
-code span.ch { color: #924c9d; } /* Char */
-code span.cn { color: #aa5500; } /* Constant */
-code span.co { color: #898887; } /* Comment */
-code span.cv { color: #0095ff; } /* CommentVar */
-code span.do { color: #607880; } /* Documentation */
-code span.dt { color: #0057ae; } /* DataType */
-code span.dv { color: #b08000; } /* DecVal */
-code span.er { color: #bf0303; text-decoration: underline; } /* Error */
-code span.ex { color: #0095ff; font-weight: bold; } /* Extension */
-code span.fl { color: #b08000; } /* Float */
-code span.fu { color: #644a9b; } /* Function */
-code span.im { color: #ff5500; } /* Import */
-code span.in { color: #b08000; } /* Information */
-code span.kw { color: #1f1c1b; font-weight: bold; } /* Keyword */
-code span.op { color: #1f1c1b; } /* Operator */
-code span.ot { color: #006e28; } /* Other */
-code span.pp { color: #006e28; } /* Preprocessor */
-code span.re { color: #0057ae; background-color: #e0e9f8; } /* RegionMarker */
-code span.sc { color: #3daee9; } /* SpecialChar */
-code span.ss { color: #ff5500; } /* SpecialString */
-code span.st { color: #bf0303; } /* String */
-code span.va { color: #0057ae; } /* Variable */
-code span.vs { color: #bf0303; } /* VerbatimString */
-code span.wa { color: #bf0303; } /* Warning */
+code span.al { color: #ff0000; } /* Alert */
+code span.an { color: #008000; } /* Annotation */
+code span.at { } /* Attribute */
+code span.bu { } /* BuiltIn */
+code span.cf { color: #0000ff; } /* ControlFlow */
+code span.ch { color: #008080; } /* Char */
+code span.cn { } /* Constant */
+code span.co { color: #008000; } /* Comment */
+code span.cv { color: #008000; } /* CommentVar */
+code span.do { color: #008000; } /* Documentation */
+code span.er { color: #ff0000; font-weight: bold; } /* Error */
+code span.ex { } /* Extension */
+code span.im { } /* Import */
+code span.in { color: #008000; } /* Information */
+code span.kw { color: #0000ff; } /* Keyword */
+code span.op { } /* Operator */
+code span.ot { color: #ff4000; } /* Other */
+code span.pp { color: #ff4000; } /* Preprocessor */
+code span.sc { color: #008080; } /* SpecialChar */
+code span.ss { color: #008080; } /* SpecialString */
+code span.st { color: #008080; } /* String */
+code span.va { } /* Variable */
+code span.vs { color: #008080; } /* VerbatimString */
+code span.wa { color: #008000; font-weight: bold; } /* Warning */
-
-
+
-
-
-
+code span.al { color: #ff0000; } /* Alert */
+code span.an { color: #008000; } /* Annotation */
+code span.at { } /* Attribute */
+code span.bu { } /* BuiltIn */
+code span.cf { color: #0000ff; } /* ControlFlow */
+code span.ch { color: #008080; } /* Char */
+code span.cn { } /* Constant */
+code span.co { color: #008000; } /* Comment */
+code span.cv { color: #008000; } /* CommentVar */
+code span.do { color: #008000; } /* Documentation */
+code span.er { color: #ff0000; font-weight: bold; } /* Error */
+code span.ex { } /* Extension */
+code span.im { } /* Import */
+code span.in { color: #008000; } /* Information */
+code span.kw { color: #0000ff; } /* Keyword */
+code span.op { } /* Operator */
+code span.ot { color: #ff4000; } /* Other */
+code span.pp { color: #ff4000; } /* Preprocessor */
+code span.sc { color: #008080; } /* SpecialChar */
+code span.ss { color: #008080; } /* SpecialString */
+code span.st { color: #008080; } /* String */
+code span.va { } /* Variable */
+code span.vs { color: #008080; } /* VerbatimString */
+code span.wa { color: #008000; font-weight: bold; } /* Warning */
+
+
@@ -166,7 +249,7 @@ If you find that your question has not been answered, or you believe you have found a bug, then submit a request for help via the Issues page if possible or to Aaron King (kingaa at umich dot edu) if necessary. Include in your request—at a minimum—the code that you ran that produced the error, together with a transcript of the R session that gave the errors. Better still, construct a minimal example that will reproduce the error: this allows for the most efficient solution of problems.
Make sure to include the output of following, so that I can see what version of pomp you are using, what version of R, what kind of machine you have, any customizations you have that might be affecting performance, etc.
-source("https://kingaa.github.io/scripts/diagnostics.R")
+source("https://kingaa.github.io/scripts/diagnostics.R")
A C snippet can make use of any feature of the C language. In particular, we can use pointers to give access to arrays. For example, consider the following, which implement a simple chain of random variables.
-rSim <- Csnippet("
- double *x = &X1;
- int i, n = (int) N;
- for (i = 0; i < n-1; i++) x[i] = x[i+1];
- x[n-1] = rnorm(0,1);
-")
-
-rInit <- Csnippet("
- double *x = &X1;
- int i, n = (int) N;
- for (i=0; i < n; i++) x[i] = 0.0;
-")
-
-rMeas <- Csnippet("
- Y1 = rnorm(X2,2);
-")
+ Csnippet("
+ rSim <- double *x = &X1;
+ int i, n = (int) N;
+ for (i = 0; i < n-1; i++) x[i] = x[i+1];
+ x[n-1] = rnorm(0,1);
+")
+
+ Csnippet("
+ rInit <- double *x = &X1;
+ int i, n = (int) N;
+ for (i=0; i < n; i++) x[i] = 0.0;
+")
+
+ Csnippet("
+ rMeas <- Y1 = rnorm(X2,2);
+")
The following simulates and plots using the above.
-library(ggplot2)
-library(tidyr)
-
-simulate(
- times=1:20,t0=0,
- rprocess=euler(rSim,delta.t=1),
- rmeasure=rMeas,
- rinit=rInit,
- obsnames="Y1",
- statenames=sprintf("X%d",1:5),
- paramnames="N",
- params=c(N=5),
- format="data.frame"
-) |>
- pivot_longer(-c(.id,time)) |>
- ggplot(aes(x=time,y=value,group=name,color=name))+
- geom_line()+
- theme_bw()
+library(ggplot2)
+library(tidyr)
+
+simulate(
+times=1:20,t0=0,
+ rprocess=euler(rSim,delta.t=1),
+ rmeasure=rMeas,
+ rinit=rInit,
+ obsnames="Y1",
+ statenames=sprintf("X%d",1:5),
+ paramnames="N",
+ params=c(N=5),
+ format="data.frame"
+ |>
+ ) pivot_longer(-c(.id,time)) |>
+ ggplot(aes(x=time,y=value,group=name,color=name))+
+ geom_line()+
+ theme_bw()
pomp
for multivariate data?Yes, this is no problem. The data
you supply to pomp
can contain multiple variables. You simply refer to these variables by name in writing the rmeasure
and dmeasure
C snippets. The following example illustrates this for a multivariate Ornstein-Uhlenbeck process (as in pompExample("ou2")
) with a slightly peculiar measurement model.
simulate(
- times=1:100, t0=0,
- rmeasure=Csnippet("
- y1 = rnorm(x1+x2,sigma1);
- y2 = rnorm(x2-x1,sigma2);
- "),
- dmeasure=Csnippet("
- lik = dnorm(y1,x1+x2,sigma1,1)+dnorm(y2,x2-x1,sigma2,1);
- lik = (give_log) ? lik : exp(lik);
- "),
- rprocess=discrete_time(
- step.fun=Csnippet("
- double tx1, tx2;
- tx1 = rnorm(a11*x1 + a12*x2,nu1);
- tx2 = rnorm(a21*x1 + a22*x2,nu2);
- x1 = tx1; x2 = tx2;
- "),
- delta.t=1),
- rinit=Csnippet("
- x1 = 0;
- x2 = 0;
- "),
- obsnames=c("y1","y2"),
- statenames=c("x1","x2"),
- paramnames=c("a11","a12","a21","a22","sigma1","sigma2","nu1","nu2"),
- params=c(a11=0.5,a12=-0.1,a21=0.2,a22=-1,nu1=0.3,nu2=0.1,sigma1=0.1,sigma2=0.3)
-) -> obj
+simulate(
+times=1:100, t0=0,
+ rmeasure=Csnippet("
+ y1 = rnorm(x1+x2,sigma1);
+ y2 = rnorm(x2-x1,sigma2);
+ "),
+dmeasure=Csnippet("
+ lik = dnorm(y1,x1+x2,sigma1,1)+dnorm(y2,x2-x1,sigma2,1);
+ lik = (give_log) ? lik : exp(lik);
+ "),
+rprocess=discrete_time(
+ step.fun=Csnippet("
+ double tx1, tx2;
+ tx1 = rnorm(a11*x1 + a12*x2,nu1);
+ tx2 = rnorm(a21*x1 + a22*x2,nu2);
+ x1 = tx1; x2 = tx2;
+ "),
+delta.t=1),
+ rinit=Csnippet("
+ x1 = 0;
+ x2 = 0;
+ "),
+obsnames=c("y1","y2"),
+ statenames=c("x1","x2"),
+ paramnames=c("a11","a12","a21","a22","sigma1","sigma2","nu1","nu2"),
+ params=c(a11=0.5,a12=-0.1,a21=0.2,a22=-1,nu1=0.3,nu2=0.1,sigma1=0.1,sigma2=0.3)
+ obj ) ->
Missing data are a common complication. pomp can handle NAs in the data, but if it is needed, the measurement model probability density function, dmeasure
must be written so as to deal with NAs appropriately. For example, look at the following variant of the SIR model describing the influenza outbreak in a boarding school:
library(ggplot2)
-
-read.csv(text="
-B,day
-NA,0
-1,1
-6,2
-26,3
-73,4
-NA,5
-293,6
-258,7
-236,8
-191,9
-124,10
-69,11
-26,12
-11,13
-4,14
-") -> dat
-
-dat |>
- ggplot(aes(x=day, y=B)) +
- geom_line() + geom_point()
+library(ggplot2)
+
+read.csv(text="
+B,day
+NA,0
+1,1
+6,2
+26,3
+73,4
+NA,5
+293,6
+258,7
+236,8
+191,9
+124,10
+69,11
+26,12
+11,13
+4,14
+") -> dat
+
+|>
+ dat ggplot(aes(x=day, y=B)) +
+ geom_line() + geom_point()
Data are missing at days 0 and 5.
We create a pomp
object implementing a simple SIR process model and a binomial measurement model, as in the original example. The only difference is in the dmeasure
:
library(pomp)
-
-sir_step <- Csnippet("
- double dN_SI = rbinom(S,1-exp(-Beta*I/N*dt));
- double dN_IR = rbinom(I,1-exp(-gamma*dt));
- S -= dN_SI;
- I += dN_SI - dN_IR;
- R += dN_IR;
- H += dN_IR;
-")
-
-sir_init <- Csnippet("
- S = N-1;
- I = 1;
- R = 0;
- H = 0;
-")
-
-rmeas <- Csnippet("B = rbinom(H,rho);")
-
-dmeas <- Csnippet("
- if (ISNA(B)) {
- lik = (give_log) ? 0 : 1;
- } else {
- lik = dbinom(B,H,rho,give_log);
- }
-")
-
-dat |>
- pomp(time="day",t0=0,
- rprocess=euler(sir_step,delta.t=1/12),
- rinit=sir_init,
- rmeasure=rmeas,
- dmeasure=dmeas,
- accumvars="H",
- paramnames=c("Beta","gamma","N", "rho"),
- statenames=c("S","I","R","H")
- ) -> sir_na
+library(pomp)
+
+ Csnippet("
+ sir_step <- double dN_SI = rbinom(S,1-exp(-Beta*I/N*dt));
+ double dN_IR = rbinom(I,1-exp(-gamma*dt));
+ S -= dN_SI;
+ I += dN_SI - dN_IR;
+ R += dN_IR;
+ H += dN_IR;
+")
+
+ Csnippet("
+ sir_init <- S = N-1;
+ I = 1;
+ R = 0;
+ H = 0;
+")
+
+ Csnippet("B = rbinom(H,rho);")
+ rmeas <-
+ Csnippet("
+ dmeas <- if (ISNA(B)) {
+ lik = (give_log) ? 0 : 1;
+ } else {
+ lik = dbinom(B,H,rho,give_log);
+ }
+")
+
+|>
+ dat pomp(time="day",t0=0,
+rprocess=euler(sir_step,delta.t=1/12),
+ rinit=sir_init,
+ rmeasure=rmeas,
+ dmeasure=dmeas,
+ accumvars="H",
+ paramnames=c("Beta","gamma","N", "rho"),
+ statenames=c("S","I","R","H")
+ sir_na ) ->
In the above, to check for missing data, we use the ISNA
macro, which is part of R’s C API. Note that the dmeasure
returns a likelihood of 1 (log likelihood 0) for the missing data. [What’s the probability of seeing something if you don’t look?] When there is an observation, it returns a binomial likelihood as usual.
Our simulations now fill in simulated values for the missing data,
-sir_na |>
- simulate(
- params=c(Beta=3,gamma=2,rho=0.9,N=2600),
- nsim=10,
- format="data.frame",
- include.data=TRUE
- ) |>
- ggplot(aes(x=day,y=B,group=.id,color=.id=="data"))+
- geom_line()+
- guides(color="none")+
- theme_bw()
+|>
+ sir_na simulate(
+params=c(Beta=3,gamma=2,rho=0.9,N=2600),
+ nsim=10,
+ format="data.frame",
+ include.data=TRUE
+ |>
+ ) ggplot(aes(x=day,y=B,group=.id,color=.id=="data"))+
+ geom_line()+
+ guides(color="none")+
+ theme_bw()
and the particle filter handles the missing data correctly:
-sir_na |>
- pfilter(Np=1000,params=c(Beta=3,gamma=2,rho=0.9,N=2600)) |>
- as.data.frame() |>
- pivot_longer(-day) |>
- ggplot(aes(x=day,y=value,color=name))+
- guides(color="none")+
- geom_line()+
- facet_wrap(~name,ncol=1,scales='free_y')+
- theme_bw()
+|>
+ sir_na pfilter(Np=1000,params=c(Beta=3,gamma=2,rho=0.9,N=2600)) |>
+ as.data.frame() |>
+ pivot_longer(-day) |>
+ ggplot(aes(x=day,y=value,color=name))+
+ guides(color="none")+
+ geom_line()+
+ facet_wrap(~name,ncol=1,scales='free_y')+
+ theme_bw()
In the above particle filter computation, notice that the effective sample size (ESS) is full, as it should be, when the missing data contribute no information.
As described in the documentation (?pomp
), pomp allows you to define “accumulator variables” that collect events occurring between observations. The example above illustrates this.
In that example, we took t0
equal to the first observation time. Sometimes we want to take t0
to be much earlier. For example, if we wish to posit that the initial state of the unobserved Markov process is distributed at t0
according to its stationary distribution, one way to achieve this is to put t0
so early that simulations will equilibrate before the first observation is made.
This is straightforward, but the presence of accumulator variables leads to a small twist. Let’s look at the boarding-school flu example to illustrate.
-library(pomp)
-
-sir_step <- Csnippet("
- double dN_SI = rbinom(S,1-exp(-Beta*I/N*dt));
- double dN_IR = rbinom(I,1-exp(-gamma*dt));
- S -= dN_SI;
- I += dN_SI - dN_IR;
- R += dN_IR;
- H += dN_IR;
- ")
-
-sir_init <- Csnippet("
- S = N-1;
- I = 1;
- R = 0;
- H = 0;
-")
-
-rmeas <- Csnippet("B = rbinom(H,rho);")
-
-dmeas <- Csnippet("if (ISNA(B)) {
- lik = (give_log) ? 0 : 1;
-} else {
- lik = dbinom(B,H,rho,give_log);
-}")
-
-read.csv(text="
-day,B
-1,3
-2,8
-3,28
-4,76
-5,222
-6,293
-7,257
-8,237
-9,192
-10,126
-11,70
-12,28
-13,12
-14,5
-") -> dat
-
-dat |>
- pomp(time="day",t0=-8,
- rprocess=euler(sir_step,delta.t=1/12),
- rinit=sir_init,
- rmeasure=rmeas,
- dmeasure=dmeas,
- accumvars="H",
- paramnames=c("Beta","gamma","N", "rho"),
- statenames=c("S","I","R","H")
- ) -> bsflu
+library(pomp)
+
+ Csnippet("
+ sir_step <- double dN_SI = rbinom(S,1-exp(-Beta*I/N*dt));
+ double dN_IR = rbinom(I,1-exp(-gamma*dt));
+ S -= dN_SI;
+ I += dN_SI - dN_IR;
+ R += dN_IR;
+ H += dN_IR;
+ ")
+
+ Csnippet("
+ sir_init <- S = N-1;
+ I = 1;
+ R = 0;
+ H = 0;
+")
+
+ Csnippet("B = rbinom(H,rho);")
+ rmeas <-
+ Csnippet("if (ISNA(B)) {
+ dmeas <- lik = (give_log) ? 0 : 1;
+} else {
+ lik = dbinom(B,H,rho,give_log);
+}")
+
+read.csv(text="
+day,B
+1,3
+2,8
+3,28
+4,76
+5,222
+6,293
+7,257
+8,237
+9,192
+10,126
+11,70
+12,28
+13,12
+14,5
+") -> dat
+
+|>
+ dat pomp(time="day",t0=-8,
+rprocess=euler(sir_step,delta.t=1/12),
+ rinit=sir_init,
+ rmeasure=rmeas,
+ dmeasure=dmeas,
+ accumvars="H",
+ paramnames=c("Beta","gamma","N", "rho"),
+ statenames=c("S","I","R","H")
+ bsflu ) ->
Note that, as above, we’ve allowed for the possibility of NAs in the data.
Now let’s look at the data and some simulations from the model.
-library(ggplot2)
-
-bsflu |>
- as.data.frame() |>
- ggplot(aes(x=day,y=B))+
- geom_point()+geom_line()+
- theme_bw()
+library(ggplot2)
+
+|>
+ bsflu as.data.frame() |>
+ ggplot(aes(x=day,y=B))+
+ geom_point()+geom_line()+
+ theme_bw()
bsflu |>
- simulate(params=c(Beta=1.5,gamma=1,rho=0.9,N=2600),
- nsim=10,format="d",include.data=TRUE) |>
- ggplot(aes(x=day,y=B,group=.id,color=.id=="data"))+
- geom_line()+
- guides(color="none")+
- theme_bw()
+|>
+ bsflu simulate(params=c(Beta=1.5,gamma=1,rho=0.9,N=2600),
+nsim=10,format="d",include.data=TRUE) |>
+ ggplot(aes(x=day,y=B,group=.id,color=.id=="data"))+
+ geom_line()+
+ guides(color="none")+
+ theme_bw()
The data plot looks fine, but what’s going on with the simulations? It’s easy to understand: we are assuming that there is one infectious host at t = t0 = -8
days. In most of the simulations, this leads to a number of new infections, which the H
variable accumulates from t=-8
to t=1
, the time of our first observation. But our first observation is not the number of new cases to have occurred in the last 9 days, but the number that occurred (and were reported) in the last 1 day.
We can fix this by introducing a fictitious “observation” at t=0
, with missing data, i.e., by assuming we observed nothing at t=0
:
library(dplyr)
-
-dat |>
- bind_rows(c(day=0,B=NA)) |>
- arrange(day) |>
- pomp(time="day",t0=-8,
- rprocess=euler(sir_step,delta.t=1/12),
- rinit=sir_init,
- rmeasure=rmeas,
- dmeasure=dmeas,
- accumvars="H",
- paramnames=c("Beta","gamma","N", "rho"),
- statenames=c("S","I","R","H")
- ) -> bsflu
-
-bsflu |>
- simulate(params=c(Beta=1.5,gamma=1,rho=0.9,N=2600),
- nsim=10,format="d",include.data=TRUE) |>
- filter(day>0) |>
- ggplot(aes(x=day,y=B,group=.id,color=.id=="data"))+
- geom_line()+
- guides(color="none")+
- theme_bw()
+library(dplyr)
+
+|>
+ dat bind_rows(c(day=0,B=NA)) |>
+ arrange(day) |>
+ pomp(time="day",t0=-8,
+rprocess=euler(sir_step,delta.t=1/12),
+ rinit=sir_init,
+ rmeasure=rmeas,
+ dmeasure=dmeas,
+ accumvars="H",
+ paramnames=c("Beta","gamma","N", "rho"),
+ statenames=c("S","I","R","H")
+ bsflu
+ ) ->
+|>
+ bsflu simulate(params=c(Beta=1.5,gamma=1,rho=0.9,N=2600),
+nsim=10,format="d",include.data=TRUE) |>
+ filter(day>0) |>
+ ggplot(aes(x=day,y=B,group=.id,color=.id=="data"))+
+ geom_line()+
+ guides(color="none")+
+ theme_bw()
To make the Euler multinomial approximation, we approximate the total exit rates as constant over a small interval \([t,t+{\Delta}t)\). Let the random variable \({\Delta}n_k\), \(k=1,\dots,K\), be the number that exit by path \(k\) in this time interval and \({\Delta}n_0\) be the number that remain. Under this assumption, the vector of numbers of exits, \(({\Delta}n_{0},{\Delta}n_{1},\dots,{\Delta}n_{K})\) is multinomially distributed with size \(N_t\) and probabilities \((p_k)_{k=0}^K\), where \[p_0 = \exp\left(-\sum\!\mu_i\,{\Delta}t\right),\] and \[p_k = \frac{\mu_k}{\sum\!\mu_i}\,\left(1-p_0\right),\qquad k=1,\dots,K.\] By way of shorthand, we say that \({\Delta}n=({\Delta}n_k)_{k=1}^K\) is Euler-multinomially distributed with size \(N_t\), rates \(\mu=(\mu_k)_{k=1}^K\), and time-step \({\Delta}t\) and we write \[{\Delta}n \sim \mathrm{Eulermultinom}\left(N_t,\mu,{\Delta}t\right).\]
When constructing a POMP for a compartmental model using the Euler multinomial approximation, to write the rprocess basic component, one uses the euler
plugin. One chooses the time step using the delta.t
argument. In writing the step.fun
C snippet, one needs exactly one call to reulermultinom
for each compartment of the model. Using the notation above, one has to pack the \(K\) rates \(\mu_1,\dots,\mu_K\) into contiguous memory locations and retrieve the results in (a different set of) contiguous memory locations. For example, if rate
is a pointer to \(K\) contiguous memory locations holding the rates and dn
is a pointer to \(K\) contiguous memory locations ready to hold the results, then
reulermultinom(K,N,rate,dt,dn);
+ reulermultinom(K,N,rate,dt,dn);
will result in a random sample from the Euler multinomial distribution (with timestep dt) being stored in dn[0]
, …, dn[K-1]
. In the foregoing, we’ve assumed that the quantities \(N_t\) and \(K\) are stored in the memory locations denoted N
and K
, respectively. Note that the rules for step.fun
C snippets guarantee that the operative time step (\({\Delta}t\)) is stored in dt
and that this will not exceed the user-specified delta.t
. See ?rprocess_spec
and ?Csnippet
for more information.
When the step function is provided as an R function, the corresponding call is
-dn <- reulermultinom(n=1,size=N,rate=mu,dt=delta.t)
+ reulermultinom(n=1,size=N,rate=mu,dt=delta.t) dn <-
where mu
is a vector containing the rates.
The manual page contains more information on the Euler-multinomial distribution.
Yes, this is straightforward. One can make use of the globals
and shlib.args
arguments to do so.
Any valid C code passed to globals
is included at the top level of the C file containing C snippets. For example, any variables or functions defined there will be visible to every C snippet in that file. If one wishes to call functions defined in a precompiled C library, one can include the library’s header file via the usual #include
statement, passed via the globals
argument.
Any text passed to the shlib.args
argument is included in the command-line call to R CMD SHLIB
. To link to a precompiled library, then, one can do something like
shlib.args = "<library>"
+ "<library>" shlib.args =
where <library>
is the path of the library.
Consider the following simple example. The function truncated_normal_a_pdf
is defined in the library libtn.so
, located in directory libs
. It has a header file, located in include/tn.h
.
simulate(
- t0 = 0,
- times = 0:100,
- obsnames = "y",
- rprocess=discrete_time(
- step.fun=Csnippet("
- double e = rnorm(0,sigma);
- N = exp(log(r)+log(N)-N+e);"),
- delta.t=1
- ),
- rmeasure = Csnippet("
- y = rnorm(N,4);
- y = (y > 0) ? y : 0;"
- ),
- dmeasure = Csnippet("
- lik = truncated_normal_a_pdf(y,N,4,0);
- if (give_log) lik = log(lik);"
- ),
- statenames = "N",
- paramnames = c("r", "sigma"),
- params = c(N_0 = 7, r = 45, sigma = 0.3),
- globals = r"{#include "include/tn.h"}",
- shlib.args = "libs/libtn.so"
-) -> rick
-
-rick |>
- pfilter(Np=1000) |>
- logLik()
+simulate(
+t0 = 0,
+ times = 0:100,
+ obsnames = "y",
+ rprocess=discrete_time(
+ step.fun=Csnippet("
+ double e = rnorm(0,sigma);
+ N = exp(log(r)+log(N)-N+e);"),
+delta.t=1
+
+ ),rmeasure = Csnippet("
+ y = rnorm(N,4);
+ y = (y > 0) ? y : 0;"
+
+ ),dmeasure = Csnippet("
+ lik = truncated_normal_a_pdf(y,N,4,0);
+ if (give_log) lik = log(lik);"
+
+ ),statenames = "N",
+ paramnames = c("r", "sigma"),
+ params = c(N_0 = 7, r = 45, sigma = 0.3),
+ globals = r"{#include "include/tn.h"}",
+ shlib.args = "libs/libtn.so"
+ rick
+ ) ->
+|>
+ rick pfilter(Np=1000) |>
+ logLik()
In Discussion #156 there is another simple example.
@@ -550,7 +633,7 @@dmeasure
returning illegal values. What does it mean?A common error message is
-Error : in ‘pfilter’: ‘dmeasure’ returns illegal value
+: in ‘pfilter’: ‘dmeasure’ returns illegal value Error
This error arises because your dmeasure
function is giving a negative, infinite, NA
, or otherwise illegal value of the likelihood. Common causes are:
If your system does not support R CMD SHLIB
, you won’t be able to use C snippets. R compiles native codes (C, C++, and FORTRAN) into shared-object libraries using SHLIB
. To check whether your R installation is configured to use SHLIB
, run the following code snippet in an R session:
cat(
- "#include <R.h>
- void hello (void) {
- Rprintf(\"hello world!\\n\");
- }",
- file="hello.c")
-
-system("R CMD SHLIB hello.c")
-dyn.load(paste0("hello",.Platform$dynlib.ext))
-
-.C("hello",PACKAGE="hello")
+cat(
+"#include <R.h>
+ void hello (void) {
+ Rprintf(\"hello world!\\n\");
+ }",
+file="hello.c")
+
+system("R CMD SHLIB hello.c")
+dyn.load(paste0("hello",.Platform$dynlib.ext))
+
+.C("hello",PACKAGE="hello")
If it works, you should see
-hello world!
-list()
+!
+ hello worldlist()
If it doesn’t work, consult the installation instructions.
c
operator to concatenate certain kinds of pomp structures. How do I fix this?This appears to arise from incompatibility between certain R installations and binary versions of the pomp package from CRAN. Try installing the latest version from the github site:
-install.packages("pomp",repos="https://kingaa.github.io/")
+install.packages("pomp",repos="https://kingaa.github.io/")
If you continue to see the error, try installing from source. The easiest way to do this is to use devtools:
-devtools::install_github("kingaa/pomp")
+::install_github("kingaa/pomp") devtools
For more help with installation, see the installation instructions.
If these steps do not solve the problem, please open an issue. In doing so, please follow the instructions in FAQ 1.1 above.
Error messages containing the following elements often arise.
-...
-Error: error in building shared-object library from C snippets:
-in ‘Cbuilder’: compilation error: cannot compile shared-object library
-...
-compiler messages:
-...
-warning: "beta" redefined
-...
+
+ ...: error in building shared-object library from C snippets:
+ Errorin ‘Cbuilder’: compilation error: cannot compile shared-object library
+
+ ...:
+ compiler messages
+ ...: "beta" redefined
+ warning ...
This is due to a name conflict between a quantity beta
defined as a model parameter, covariate, data variable, or in some other way defined in a C snippet, and the Euler beta function, which is defined as beta
in Rmath.h
and documented in the R extensions manual. To work around this, simply rename the parameter. For example, it could be called Beta
instead.
The error message displays the “prototype” of the function pomp expects. For example, in the message
-Error : in 'sannbox': 'candidate.dist' must be a function of the form 'candidate.dist(par, temp, scale, ...)'.
+: in 'sannbox': 'candidate.dist' must be a function of the form 'candidate.dist(par, temp, scale, ...)'. Error
the prototype expected is candidate.dist(par, temp, scale, ...)
. This means that a function supplied to the candidate.dist
argument must have at least the arguments par
, temp
, scale
, and ...
. Note that the ellipses (...
) are necessary. Similarly, the error message
Error: in ‘rmeasure’: ‘rmeasure’ must be a function of the form ‘rmeasure(...)’
+: in ‘rmeasure’: ‘rmeasure’ must be a function of the form ‘rmeasure(...)’ Error
tells us that we have left the ...
out of the argument list in the function we supplied for rmeasure
.
Error messages like the following
-Error : in 'pmcmc': in 'dprior': variable 'r' not found among the parameters.
+: in 'pmcmc': in 'dprior': variable 'r' not found among the parameters. Error
or
-Error : in 'dmeasure': variable 'y2' not found among the observables.
+: in 'dmeasure': variable 'y2' not found among the observables. Error
or similar messages, wherein a named variable is not found among the parameters, state variables, covariates, or observables, arise when a C snippet-based workhorse function, i.e., one that executes a basic model component, cannot find the named variable among the supplied parameters, state variables, covariates, or observables. Typically this means that either (a) the full vector of model parameters has been supplied neither via the params
argument nor in the pomp object’s coef
slot; (b) a basic model component that computes state variables (e.g., rinit
or rprocess
) is not returning the full vector of state variables; (c) the covariate table does not contain the full set of covariates; or (d) some data variable is missing.
It may seem puzzling when, for example, the rinit
workhorse complains about the lack of a parameter upon which the rinit basic component does not in fact depend. To understand this, it is necessary to know that, whenever one or more C snippets are passed to a pomp function, the names of the parameters, state variables, covariates, and observables that are needed by any of them will be expected by all of them. Thus, when rinit
complains that it cannot find a certain parameter that it actually doesn’t need to know, it is effectively doing so on behalf of another basic model component which will depend on that parameter.
Finally, note that, while one informs pomp functions about the names of parameters using the paramnames
argument and about the names of state variables using statenames
, the names of covariates are by default drawn from the covariate table itself, and the names of the observables are taken from the data. This behavior can be over-ridden using the optional covarnames
and obsnames
arguments.
To see what is happening during the execution of a C snippet, one can embed calls to Rprintf
. Such calls cause information to be printed to the R console. In the following example, we have a dmeasure snippet that implements a measurement model wherein \(D_1\) is negative-binomially distributed with mean \(\rho_1\,N_1\) and variance \(\rho_1\,N_1\,(1+k\,{\rho_1}/{N_1})\) and \(D_2\) is binomially distributed with size \(N_2\) and probability \(\rho_2\). Whenever a non-finite log likelihood results (i.e., -Inf
, Inf
, NA
, or NaN
), the values of some relevant quantities are printed.
dmeas <- Csnippet("
- lik = dnbinom_mu(D1,1/k,rho1*N1,1) + dbinom(D2,N2,rho2,1);
- if (!R_FINITE(lik)) {
- Rprintf(\"t = %lg, D1 = %lg, loglik = %lg\\n\",t,D1,lik);
- }
- lik = (log) ? lik : exp(lik);
-")
+ Csnippet("
+ dmeas <- lik = dnbinom_mu(D1,1/k,rho1*N1,1) + dbinom(D2,N2,rho2,1);
+ if (!R_FINITE(lik)) {
+ Rprintf(\"t = %lg, D1 = %lg, loglik = %lg\\n\",t,D1,lik);
+ }
+ lik = (log) ? lik : exp(lik);
+")
More generally, one can make use of any aspect of R’s C API in writing C snippets.
Note that a call to Rprintf
involves passing a quoted string. To avoid having to escape those quotes, and the linefeed character "\n"
, as in the above code, one can use R’s raw string literals, as in the following.
dmeas <- Csnippet(r"{
- lik = dnbinom_mu(D1,1/k,rho1*N1,1) + dbinom(D2,N2,rho2,1);
- if (!R_FINITE(lik)) {
- Rprintf("t = %lg, D1 = %lg, loglik = %lg\n",t,D1,lik);
- }
- lik = (log) ? lik : exp(lik);
-}")
+ Csnippet(r"{
+ dmeas <- lik = dnbinom_mu(D1,1/k,rho1*N1,1) + dbinom(D2,N2,rho2,1);
+ if (!R_FINITE(lik)) {
+ Rprintf("t = %lg, D1 = %lg, loglik = %lg\n",t,D1,lik);
+ }
+ lik = (log) ? lik : exp(lik);
+}")