-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathAppendix_S8.Rmd
549 lines (433 loc) · 24.4 KB
/
Appendix_S8.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
---
title: |
| Appendix S8:
| Estimating biphasic growth for multiple populations using MCMC
author: |
| Kyle L. Wilson
| The University of Calgary
date: "October 5, 2017"
output:
html_document:
pandoc_args:
- --biblio
- Appendix_S8_references.bib
- --csl
- methods-in-ecology-and-evolution.csl
fig_caption: yes
fig_height: 6
fig_width: 7
fontsize: 11pt
highlight: tango
df_print: kable
pdf_document:
pandoc_args:
- --biblio
- Appendix_S8_references.bib
- --csl
- methods-in-ecology-and-evolution.csl
references:
- DOI: null
URL: null
author:
- family: Wilson
given: K. L.
- family: Honsey
given: A.
- family: Moe
given: B.
- family: Venturelli
given: P.
container-title: Methods in Ecology and Evolution
id: WilsonInReview
issue: null
issued: 2017
language: en-GB
page: null
title: 'Growing the biphasic framework: techniques and recommendations for fitting
emerging growth models'
title-short: Growing the biphasic framework
type: article-journal
volume: In Review
- DOI: null
URL: null
author:
- family: Gelman
given: A.
- family: Carlin
given: J.
- family: Stern
given: H.
- family: Dunson
given: D.
- family: Vehtari
given: A.
- family: Rubin
given: D.
edition: 3
id: Gelman2013
issue: null
issued: 2013
language: en-GB
location: Boca Raton, FL
page: null
publisher: Chapman and Hall/CRC Press
title: Bayesian Data Analysis
title-short: BDA3
type: book
volume: null
---
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo=TRUE)
opts_chunk$set(tidy=TRUE)
opts_chunk$set(fig.show = "hold", collapse=TRUE)
#library(devtools)
#install.packages("knitcitations")
library(knitcitations)
cleanbib()
options("citation_format" = "pandoc")
```
##Summary description of objectives:
This appendix is in support of @WilsonInReview. The citation style language (csl) used herein is the methods-in-ecology-and-evolution.csl file which can be downloaded from https://github.com/citation-style-language/styles/blob/master/methods-in-ecology-and-evolution.csl and placed in the same directory as this .rmd file.
The following is an example application of the Lester biphasic model `r citep("10.1098/rspb.2004.2778")` where the breakpoint $T$ (age-at-maturity) is treated as unknown prior to model estimation. Specifically, we simulate multiple populations, and the life history of each population is related to that of other populations in a hierarchical manner (i.e., each life history parameter is random arising from a global distribution for that life history parameter). Known growth parameters are used to simulate the random, population-specific parameters. We generate size-at-age data given a population-specific, constant coefficient of variation in size-at-age.
We then fit the Lester biphasic model to these simulated data using a hierarchical framework in the Bayesian MCMC software JAGS. This framework allows us to estimate population-specific and 'global' (i.e., average across populations) growth parameters. We build the hierarchical model with vague priors (or hyperpriors), and we run JAGS from the R console using the `runjags` package and `run.jags()` function. JAGS must be installed independently prior to running this code (see 'http://mcmc-jags.sourceforge.net/'). All variables are treated as unknown at both the population and 'global' levels.
Our results summarize variables by their marginal posterior distribution. We then compare the central tendency of each parameter's posterior distribution to determine how well we recover the simulated 'true' parameters at the population and global levels.
## Global functions
First, we will define a few global functions that will be used later.
```{r used_Functions}
Corner_text <- function(text, location="topright") #function to write text to the corner of plots
{
legend(location,legend=text, bty ="n", pch=NA)
}
get_beta <- function(mean,cv) #function that returns the alpha and beta shape parameters of a beta distribution, based on the mean and variation of a given beta distribution
{
sd <- mean*cv
alpha <- -((mean*(mean^2+sd^2-mean))/sd^2)
beta <- alpha/mean-alpha
return(list(alpha=alpha,beta=beta))
}
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r <- abs(cor(x, y))
txt <- format(c(r, 0.123456789), digits = digits)[1]
txt <- paste0(prefix, txt)
if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex.cor * r)
}
rngList <- function(x,y){ #this function creates randomly 'jittered' starting values for each MCMC chain
lis <- lapply(x, lapply, length) #get the lower order dimensions of the list x
names(lis) <- lapply(x, length) #get the names of those dimensions of the list x
l_el <- length(names(lis)) #get the maximum number elements of the highest order dimensions of the list x
for(i in 1:(l_el-2)) #loop through those dimensions which need to be 'jittered'
{
x[[i]] <- x[[i]]*(1+runif(1,-0.15,0.15)) #jitter values of the list by +/- 15%
}
x[[l_el]] <- round(runif(1,1,100000),0) #have a random RNG seed for the MCMC chain
return(x)
}
```
## Define life history parameters
First, we must load the required libraries. We then start the simulation by specifying how many populations we want to model with `npop`. Then, we specify the leading life history parameters of the growth model: $h$, $t_1$, $M$, $g$, and $cv$. Consistent with the hierarchy of the model, we specify the among-population variation of each life history parameter (e.g., `h_cv` describes the coefficient of variation in the parameter $h$ across all populations). The $cv$ parameter for variation in length-at-age is 15%, consistent with typical observations among fishes `r citet("10.1016/j.fishres.2016.01.006")`.
```{r true_parameters}
library(runjags) #load, install, or require the 'runjags' library
library(rjags)
library(coda)
library(stats4) #load the 'stats4' library
library(corrplot)
nPop <- 10 ## how many populations are there?
ages <- 1:30 #create an integer sequence of ages
Tmat <- 8 # age of maturity
Tmat_cv <- 0.1 # what is the variation in maturity across populations?
h <- 50 # somatic growth in millimeters per year
h_cv <- 0.15 # what is the variation in growth rate across populations?
t1 <- -0.2 #age when size=0 for the juvenile phase
t1_cv <- 0.1 # what is the variation in age when size=0 across populations?
M <- 0.15 #Natural mortality for the population
g <- 1.18*(1-exp(-M)) # proportion of energy in adult phase allocated to reproduction per year
g_cv <- 0.01 # what is the variation in g across populations?
cv <- 0.15 # coefficient of variation in size-at-age
sizeCV <- 0.000000001 # what is the variation in the CV parameter across populations?
#i.e., are some populations more variable in size-at-age than others?
linf <- 3*h/g # convert to VBGF L-infinity
vbk <- log(1+g/3) # convert to VBGF kappa
t0 <- Tmat + log(1-g*(Tmat-t1)/3)/log(1+g/3) #convert to VBGF t0
lena_phase1 <- h*(ages-t1) # length-at-age for phase 1
lena_phase2 <- linf*(1-exp(-vbk*(ages-t0))) # length-at-age for phase 2
biphasic <- ifelse(ages<Tmat,lena_phase1,lena_phase2) #if-else statement for which phase a fish is allocating surplus energy
plot(ages,lena_phase1, ylab="Size", xlab="Age")
lines(ages,lena_phase2)
lines(ages,biphasic)
abline(v=Tmat,col='red',lty=2) #plot where maturity occurs
```
Next, we generate the population-specific life history parameters, which arise as a random variable from that parameter's distribution. For instance, for $h$:
$$h_i \sim N(\mu, \sigma)$$
with mean $\mu$ and variance $\sigma$ of this distribution:
$$\mu = h$$
$$\sigma = h*cv_\text{h}$$
We set the random number generator to 100 so that our results are repeatable: `set.seed(100)`.
```{r generate Hierarchy, message=FALSE}
set.seed(100)
h_i <- rnorm(nPop,h,h*h_cv) # growth rate, h, for population i is random arising from a normal distribution with mean h and standard deviation of h*h_cv
Tmat_i <- rnorm(nPop,Tmat,Tmat*Tmat_cv)
t1_i <- rnorm(nPop,t1,abs(t1*t1_cv))
g_i <- rbeta(nPop,get_beta(g,g_cv)$alpha,get_beta(g,g_cv)$beta) # grab the shape parameters of a beta distribution based on its mean and variance
ifelse(all(g_i < 3/(Tmat_i-t1_i)), # there is a life-hisory constraint on the parameter g
g_i <- g_i,
g_i <- rbeta(nPop,get_beta(g,g_cv)$alpha,get_beta(g,g_cv)$beta))
cv_i <- rnorm(nPop,cv,cv*sizeCV)
true.par <- list(h=h,T50=Tmat,t1=t1,g=g,sizeCV=cv,
h_pop=h_i,T50_pop=Tmat_i,t1_pop=t1_i,g_pop=g_i,
h_cv=h_cv,T50_cv=Tmat_cv,t1_cv=t1_cv,g_cv=g_cv)
```
## Simulating the data
Next, we make an empty data object which we later fill with our population-specific, randomly generated length-at-age data. We use $M$ to generate each population's survivorship curve. This allows us to simulate more realistic age-structures (by multipltying $M$ by selectivity). We generate data using a random normal distribution (realized parameter values and model fit quality will change due to randomness) using the `rnorm()` function. Sample sizes expected for each age should be realistic for fisheries data and are determined from a function of gear selectivity and natural mortality (or survivorship) arising from a multinomial process using the `rmultinom()` function.
```{r generate Data, message=FALSE}
dataPop <- NULL # make an empty object that we will fill in later
for(i in 1:nPop)
{
surv <- rep(NA,length(ages)) # create an empty vector
surv[1] <- 1;for(j in 2:max(ages)){surv[j]<-surv[j-1]*exp(log(((g_i[i]/1.18)-1)/-1))} #survivorship from discrete annual survival
gearA50 <- Tmat_i[i] # induce a gear selectivity that inflects at T
gearSlope <- -0.3 # define the slope of selectivity
select <- 1/(1+exp(gearSlope*(ages-gearA50))) # the average selectivity curve
linf <- 3*h_i[i]/g_i[i] # conversion for the VBGF L-infinity
vbk <- log(1+g_i[i]/3) # conversion for the VBGF parameter kappa
t0 <- Tmat_i[i] + log(1-g_i[i]*(Tmat_i[i]-t1_i[i])/3)/log(1+g_i[i]/3) #conversion for the VBGF parameter t0
lena_juv <- h_i[i]*(ages-t1_i[i]) # length-at-age for phase 1
lena_adult <- linf*(1-exp(-vbk*(ages-t0))) # length-at-age for phase 2
mean_size <- ifelse(ages<=Tmat_i[i],lena_juv,lena_adult) #if-else statement for which phase a fish is allocating surplus energy
SampSize <- 10000
maxSamp <- as.vector(rmultinom(1,prob=surv*select,size=SampSize)) # whats the maximum number of observable samples for an age group in a population?
# surv*select is the probability of a fish surviving a certain ageand being sampled
mean.samp <- as.data.frame(cbind(mean_size,maxSamp)) #make matrix of mean lengths-at-age and sample sizes
mlen <- rep(mean.samp[,1],mean.samp[,2]) #repeat each mean "sample size" number of times
ageData <- rep(ages,mean.samp[,2]) #repeat each age "sample size" number of times
lengths <- sapply(mlen,function(x) rnorm(1,mean=x,sd=x*cv)) #generate random normal length data using means & cv error
Data <- data.frame(cbind(ageData,lengths,rep(i,length(ageData))),row.names=NULL) #bind vectors into age and length matrix, covert to data frame
colnames(Data) <- c("Age","Size","Pop_Num") # re-name the columns
Data <- Data[sample(nrow(Data),size=round(runif(1,40,200)),replace=F),] #draw a random sample from the population -- total sample size can be adjusted
rownames(Data) <- c()
dataPop <- rbind(dataPop, Data)
}
rownames(dataPop) <- c()
```
Next, we plot the noisy data, with colors assigned to the data coming from each population. We can also take a quick look at some of the data to visualize the data structure.
```{r PlotData}
plot(dataPop$Age,dataPop$Size,bg=dataPop$Pop_Num,pch=21, xlab="Age (yrs)",ylab="Length (mm)")
dataPop[sample(1:nrow(dataPop),10),]
```
## Write the hierarchical model in JAGS
The following code is written in the JAGS language `r citep("http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.3406")`. JAGS is fed a list associated with the data. Each data point has an population identification index.
The JAGS model loops through each sample, from *i* to *Nfish*, and determines the contribution of the predicted size of fish *i* to the log-posterior density, which is the sum of the log-likelihood and the log-prior. The predicted growth follows the analytical model from the equations in Lester et al. (2004). There is an if-else statement that determines if the predicted size-at-age of fish *i* comes from the juvenile phase or the adult phase. The priors for each population's life history trait e.g., $h$ for population *j*, comes from a global hyper-prior for $h$ representing the average across all populations. Modifications to this code will need to use JAGS syntax and not R syntax.
```{r JAGSmodel}
model <- "model {
#run through all the individual fish
for(i in 1:Nfish) {
size[i] ~ dnorm(pred[i],1/(pred[i]*size.cv)^2)T(0,) # one variability across all populations
#predict growth for each fish
juv[i] <- h_pop[Pop[i]]*(age[i]-t1_pop[Pop[i]]) # predicted growth for fish i for the juvenile phase
# below is the predicted growth for fish i for the adult phase which follows converting Lester et al. (2004) equations into the von Bertalanffy growth function
adult[i] <- (3*h_pop[Pop[i]]/g_pop[Pop[i]])*(1-exp(-(log(1+g_pop[Pop[i]]/3))*(age[i]-(T50_pop[Pop[i]]+log(1-g_pop[Pop[i]]*(T50_pop[Pop[i]]-t1_pop[Pop[i]])/3)/log(1+g_pop[Pop[i]]/3)))))
pred[i] <- ifelse(age[i]<=T50_pop[Pop[i]],juv[i],adult[i]) # does the age of fish i exceed the maturity predicted for its population?
} # end the calculation of the likelihood
# priors by population
for(j in 1:Npop) {
# normal priors for population j
T50_pop[j] ~ dnorm(T50,tau.T50)T(0,)
h_pop[j] ~ dnorm(h,tau.h)T(0,)
t1_pop[j] ~ dnorm(t1, tau.t1)
# Beta distribution for the reproductive investment
#(bounded between 0 and stochastic upper limit)
g_pop[j] ~ dbeta(beta_alpha,beta_beta)T(,3/(T50_pop[j]-t1_pop[j]))
} # end the priors for each trait by population
# hyper priors on life history traits
T50 ~ dnorm(10,1e-7)T(0,)
h ~ dnorm(40,1e-7)T(0,)
t1 ~ dnorm(0,1e-2)
g ~ dgamma(0.001,0.001)T(,3/(T50-t1))
# below are the half-t priors on variance parameters
T50_cv ~ dhalfcauchy(10)
h_cv ~ dhalfcauchy(10)
g_cv ~ dhalfcauchy(10)
t1_cv ~ dhalfcauchy(10)
size.cv ~ dhalfcauchy(10)
# conversion to JAGS precision parameters
tau.T50 <- 1/(T50*T50_cv)^2
tau.h <- 1/(h*h_cv)^2
tau.t1 <- 1/(t1*t1_cv)^2
# re-parameterize the parameters of the beta
g_var <- (g*g_cv)^2
beta_alpha <- -g*(g_var+g^2-g)/g_var
beta_beta <- beta_alpha/g-beta_alpha
}"
```
Let's quickly look at how many samples we have for each of the populations. Then, we will compile the data into a usable list, and pass JAGS some initial starting values for each of the MCMC chains that we will run.
```{r initialValues, message=FALSE}
print(table(dataPop$Pop_Num))
data <- list(Nfish=length(dataPop$Age),
Npop=length(unique(dataPop$Pop_Num)),
age=dataPop$Age,
size=dataPop$Size,
Pop=dataPop$Pop_Num)
# the above list compiles the noisy data from the dataPop dataframe
# into 3 vectors for age, size, and Pop (numerically, which population does each row correspond to?)
inits1 <- list(h=h,
T50=Tmat,
g=g,
t1=t1,
h_pop=rep(h,nPop),
T50_pop=rep(Tmat,nPop),
t1_pop=rep(t1,nPop),
g_pop=rep(g,nPop),
size.cv=cv,
h_cv=h_cv,
t1_cv=t1_cv,
g_cv=g_cv,
T50_cv=Tmat_cv,
.RNG.name="base::Wichmann-Hill", .RNG.seed=735)
# initial estimates of each parameter must be provided. RNG is random number generators for that chain
inits2 <- inits3 <- inits4 <- inits1
inits2 <- rngList(inits2,inits1) # jitter chain 2, based on values of chain 1
inits3 <- rngList(inits3,inits1) # jitter chain 3, based on values of chain 1
inits4 <- rngList(inits4,inits1) # jitter chain 4, based on values of chain 1
inits <- list(inits1,inits2,inits3,inits4) # compile all initial values into one list
mon_names <- c(names(inits3)[-c(length(inits3),length(inits3)-1)]) # create the stochastic nodes to be monitored
```
## Run the MCMC chains in JAGS
Our next step is to set how many posterior samples we want, the length of the burn-in period and adaptation period, and the thinning rate. Our thinning rate is a somewhat high value of 10-30 - although this slows us down, the expected large correlations between parameters and the Markovian sampling process can lead to high autocorrelation among the consecutive posterior samples. To gain more independent samples, we run each chain longer by increasing our thinning rate while still only taking `Nsamp` number of samples.
We then use the `run.jags()` function to call JAGS to run our model `r citep(citation("runjags"))`. We specify the modules we run, and some other parameters internal to `run.jags()`, like the `method` and `modules`. One can use the `rjparallel` method to parallelize the MCMC estimation and speed up JAGS model run times.
The `summary(results)` command reports some quick MCMC diagnostic tests to assess whether the posterior has converged on a stable distribution. The potential scale reduction factor (also called the Gelman-Rubin test) and the effective number of sample sizes are all reported in this summary output. The library `coda` offers more options for MCMC diagnostics.
```{r runJAGS, message=FALSE}
Nsamp <- 1000 # how many posterior samples does each chain need to get, after thinning and burin-in and adaptation?
thin_rt <- 15 # place some sort of thinning rate?
burnins <- 0.75*round(Nsamp*thin_rt,0) # how long is the burnin, this bases it on the number of total posterior draws?
adaptin <- round(0.4*burnins,0)
a <- proc.time();
results <- run.jags(model=model, monitor=mon_names,
data=data, n.chains=4, method="rjags", inits=inits,
plots=F,silent.jag=F, modules=c("bugs","glm","dic"),
sample=Nsamp,adapt=adaptin,burnin=burnins,thin=thin_rt,summarise=F)
b <- (proc.time() - a)
print((b[3]/60)/60) # how long it took in hours
print((b[3]/(4*(Nsamp*thin_rt+burnins+adaptin)))) # how long it took in seconds per iteration
sum_results <- summary(results)
print(sum_results[,c(1,2,3,8,9,10)])
res.corr <- extract.runjags(add.summary(results),"crosscorr") # extract the cross-correlation matrix
```
## Store the model results
Next, we store the results as an `mcmc.list` which could be used in `coda` for MCMC diagnostics `r citep(citation("coda"))` (see Appendix S5), but here we just quickly coerce this into a `matrix` object for our purposes. We then use some `grep` functions to quickly grab the posterior distributions of monitored parameters associated with specific growth model parameters (e.g., $h$, $h_i$, and $cv_\text{h}$.
We can do some further diagnostics of the model by evaluating how biased the estimates of the growth parameters were in comparison to the simulated 'true' parameter values. To do this, we will use percent bias: $$Bias = ((\theta_i - \hat{\theta_i})/\theta_i) * 100$$
where $\theta_i$ is a life history parameter of interest, e.g., juvenile growth rate $h_i$.
```{r DataStorage,message=FALSE}
TheRes <- as.mcmc.list(results, vars=mon_names)
TheRes <- as.matrix(TheRes)
Ntot <- length(dataPop$Pop_Num)
Npop <- table(dataPop$Pop_Num)
h_plot <- cbind(TheRes[,grep("h",colnames(TheRes))])
Tmat_plot <- cbind(TheRes[,grep("T",colnames(TheRes))])
g_plot <- cbind(TheRes[,grep("g",colnames(TheRes))])
t1_plot <- cbind(TheRes[,grep("t1",colnames(TheRes))])
var_plot <- cbind(TheRes[,grep("cv",colnames(TheRes))])
hBias <- Tbias <- gBias <- t1Bias <- matrix(NA,ncol=ncol(h_plot)-1,nrow=nrow(h_plot))
for(i in 1:(ncol(h_plot)-1))
{
hBias[,i] <- (h_plot[,i]-c(h,h_i)[i])/c(h,h_i)[i]*100
Tbias[,i] <- (Tmat_plot[,i]-c(Tmat,Tmat_i)[i])/c(Tmat,Tmat_i)[i]*100
gBias[,i] <- (g_plot[,i]-c(g,g_i)[i])/c(g,g_i)[i]*100
t1Bias[,i] <- (t1_plot[,i]-c(t1,t1_i)[i])/c(t1,t1_i)[i]*100
}
varBias <- matrix(NA,ncol=ncol(var_plot),nrow=nrow(var_plot))
for(i in 1:ncol(var_plot))
{
varBias[,i] <- (var_plot[,i]-c(h_cv,Tmat_cv,g_cv,t1_cv,cv)[i])/c(h_cv,Tmat_cv,g_cv,t1_cv,cv)[i]*100
}
```
## Percent bias and correlation plots
The code below generates a plot showing percent bias for each of the estimated parameters, along with some example correlations between the parameters $T_i$ and $h_i$.
```{r PlotBias}
par(mfrow=c(1,1))
par(mar=c(4,4,1,1))
layout(matrix(c(1,2,3,4,5,6),nrow=3,ncol=2,byrow=TRUE))
boxplot(Tbias,ylab="Percent Bias in T",col="grey80",outline=FALSE,xaxt="n")
axis(1,1:(length(Tmat_i)+1),c("T",paste("T(",1:nPop,")",sep="")),line=0,cex.axis=0.9)
axis(1,1:(length(Tmat_i)+1),paste("N=",c(Ntot,Npop),sep=""),line=0.75,tick=F,cex.axis=0.8)
abline(h=0,lty=2,col="red",lwd=2)
Corner_text("a.","topleft")
boxplot(hBias,ylab="Percent Bias in h",col="grey80",outline=FALSE,xaxt="n")
axis(1,1:(length(h_i)+1),c("h",paste("h(",1:nPop,")",sep="")),cex.axis=0.9)
axis(1,1:(length(Tmat_i)+1),paste("N=",c(Ntot,Npop),sep=""),line=0.75,tick=F,cex.axis=0.8)
abline(h=0,lty=2,col="red",lwd=2)
Corner_text("b.","topleft")
boxplot(gBias,ylab="Percent Bias in g",col="grey80",outline=FALSE,xaxt="n")
axis(1,1:(length(g_i)+1),c("g",paste("g(",1:nPop,")",sep="")),cex.axis=0.9)
axis(1,1:(length(Tmat_i)+1),paste("N=",c(Ntot,Npop),sep=""),line=0.75,tick=F,cex.axis=0.8)
abline(h=0,lty=2,col="red",lwd=2)
Corner_text("c.","topleft")
boxplot(t1Bias,ylab="Percent Bias in t1",col="grey80",outline=FALSE,xaxt="n")
axis(1,1:(length(t1_i)+1),c("t1",paste("t1(",1:nPop,")",sep="")),cex.axis=0.9)
axis(1,1:(length(Tmat_i)+1),paste("N=",c(Ntot,Npop),sep=""),line=0.75,tick=F,cex.axis=0.8)
abline(h=0,lty=2,col="red",lwd=2)
Corner_text("d.","topleft")
boxplot(varBias,ylab="Percent Bias in Variance Terms",col="grey80",outline=FALSE,xaxt="n")
axis(1,1:length(c(h_cv,Tmat_cv,g_cv,t1_cv,cv)),c("cv(h)","cv(T)","cv(g)","cv(t1)","cv(L)"),cex.axis=0.9)
abline(h=0,lty=2,col="red",lwd=2)
Corner_text("e.","topleft")
corr.mat <- res.corr[5:14,15:24]
name1 <- name2 <- c()
for(i in 1:nPop)
{
name1[i] <- paste("h(",i,")",sep="")
name2[i] <- paste("T(",i,")",sep="")
}
rownames(corr.mat) <- name1
colnames(corr.mat) <- name2
corrplot(corr.mat,is.corr=TRUE,method="ellipse",mar = c(2, 4, 1, 0), type="lower",diag=T)
Corner_text("f.","topleft")
```
## Posterior predictive checks
Next, we will run posterior predictive checks, which use the posterior distribution of the parameter estimates to re-generate a randomized posterior distribution of data, assuming the data arose according to the truncated normal distribution (as specified in the model above). We then overlay our observed data atop this distribution to check if there is any systematic bias (@Gelman2013).
```{r PosteriorPredictives, message=FALSE}
age_vec <- c(ages,rev(ages))
post_pred <- array(NA, dim=c(nrow(TheRes),length(ages),nPop))
# Above code creates an empty array to track size-at-age for population i for posterior draw j
par(mfrow=c(1,1))
par(mar=c(5,4,1,1))
layout(matrix(c(1,2,3,4),nrow=2,ncol=2,byrow=TRUE))
for(i in 1:nPop)
{
subbed <- subset(dataPop,dataPop$Pop_Num==i)
for(j in 1:nrow(TheRes))
{
h_j <- TheRes[j,match(paste("h_pop","[",i,"]",sep=""),colnames(TheRes))]
g_j <- TheRes[j,match(paste("g_pop","[",i,"]",sep=""),colnames(TheRes))]
T50_j <- TheRes[j,match(paste("T50_pop","[",i,"]",sep=""),colnames(TheRes))]
t1_j <- TheRes[j,match(paste("t1_pop","[",i,"]",sep=""),colnames(TheRes))]
cv_j <- TheRes[j,match("size.cv",colnames(TheRes))]
linf <- 3*h_j/g_j # conversion for the VBGF L-infinity
vbk <- log(1+g_j/3) # conversion for the VBGF parameter kappa
t0 <- T50_j + log(1-g_j*(T50_j-t1_j)/3)/log(1+g_j/3) #conversion for the VBGF parameter t0
juv_j <- h_j*(ages-t1_j)
adult_j <- linf*(1-exp(-vbk*(ages-t0)))
pred_j <- ifelse(ages<T50_j,juv_j,adult_j)
pred_j[pred_j < 0.001] <- 0.001
post_pred[j,,i] <- rnorm(length(ages),pred_j,pred_j*cv_j)
}
quants <- t(apply(post_pred[,,i],2,FUN=quantile,probs=c(0.025,0.225,0.50,0.775,0.975)))
plot(age_vec,c(quants[,1],rev(quants[,5])),type="l",
lwd=2,col=NA,
ylab="",xlab="",ylim=c(0,1200))
axis(1,at=median(c(0,max(ages))),
paste("Age (yrs) in ","Population ",i,sep=""),tick=FALSE,line=0.90)
axis(2,at=600,"Size (mm)",tick=FALSE,line=1)
polygon(age_vec,c(quants[,1],rev(quants[,5])),col="grey50")
polygon(age_vec,c(quants[,2],rev(quants[,4])),col="grey95")
lines(ages,quants[,3],lwd=2,col="black")
points(subbed$Age,subbed$Size,pch=21,bg=rainbow(nPop)[i])
}
```
## References
```{r references, echo=FALSE, message=FALSE}
write.bibtex(file="Appendix_S8_references.bib")
```