-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
123 lines (97 loc) · 4.67 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
fig.align = "center"
)
```
# forrel <img src="man/figures/logo.png" align="right" height=140/>
<!-- badges: start -->
[![CRAN status](https://www.r-pkg.org/badges/version/forrel)](https://CRAN.R-project.org/package=forrel)
[![](https://cranlogs.r-pkg.org/badges/grand-total/forrel?color=yellow)](https://cran.r-project.org/package=forrel)
[![](https://cranlogs.r-pkg.org/badges/last-month/forrel?color=yellow)](https://cran.r-project.org/package=forrel)
<!-- badges: end -->
## Introduction
The goal of **forrel** is to provide forensic pedigree computations and relatedness inference from genetic marker data. The **forrel** package is part of the **pedsuite**, a collection of R packages for pedigree analysis.
The most important analyses currently supported by **forrel** are:
* Likelihood ratio (LR) computations for kinship testing
* `quickLR()`
* `kinshipLR()`
* Pairwise relatedness inference: Estimation of IBD coefficients (both $\kappa$ and $\Delta$) from marker data
* `ibdEstimate()`
* `ibdBootstrap()`
* Check and visualise relationships in pedigree data
* `checkPairwise()`
* Simulation of marker genotypes, possibly conditional on known genotypes
* `markerSim()`
* `profileSim()`
* `markerSimParametric()`
* `profileSimParametric()`
* Power analysis for relationship testing
* `LRpower()`
* `exclusionPower()`
* Tailor-made functions for power analysis in missing person cases
* `missingPersonPlot()`
* `missingPersonEP()`
* `missingPersonIP()`
* `MPPsims()`
* `powerPlot()`
## Installation
To get the current official version of **forrel**, install from CRAN as follows:
```{r, eval = FALSE}
install.packages("forrel")
```
Alternatively, you can obtain the latest development version from GitHub:
```{r, eval = FALSE}
# install.packages("devtools") # install devtools if needed
devtools::install_github("magnusdv/forrel")
```
## An example
In this short introduction, we first demonstrate simulation of marker data for a pair of siblings. Then - pretending the relationship is unknown to us - we estimate the relatedness between the brothers using the simulated data. If all goes well, the estimate should be close to the expected value for siblings.
```{r}
library(forrel)
```
**Create the pedigree**
We start by creating and plotting a pedigree with two brothers, named `bro1` and `bro2`.
```{r sibs, fig.height=2.5, fig.width=2.5}
x = nuclearPed(children = c("bro1", "bro2"))
plot(x)
```
**Marker simulation**
Now let us simulate the genotypes of 100 independent SNPs for all four family members. Each SNP has alleles 1 and 2, with equal frequencies by default. This is an example of _unconditional_ simulation, since we don't give any genotypes to condition on.
```{r}
x = markerSim(x, N = 100, alleles = 1:2, seed = 1234)
```
Note 1: The `seed` argument is passed onto the random number generator. If you use the same seed, you should get exactly the same results.
Note 2: To suppress the informative messages printed during simulation, add `verbose = FALSE` to the function call.
The pedigree `x` now has 100 markers attached to it. The genotypes of the first few markers are shown when printing `x` to the screen:
```{r}
x
```
**Conditional simulation**
Suppose one of the brothers is homozygous 1/1 and that we want to simulate genotypes for the other brother.
This is achieved with the following code, where after first attaching a marker to the pedigree, specifying the known genotype, we condition on it by referencing it in `markerSim()`.
```{r}
y = nuclearPed(children = c("bro1", "bro2")) |>
addMarker(bro1 = "1/1", alleles = 1:2, name = "snp1") |>
markerSim(N = 100, ids = "bro2", partialmarker = "snp1",
seed = 321, verbose = FALSE)
y
```
Note that the previous code also demonstrates how **pedsuite** is well adapted to the R pipe `|>`.
**Estimation of IBD coefficients**
The `ibdEstimate()` function estimates the coefficients of _identity-by-descent_ (IBD) between pairs of individuals, from the available marker data. Let us try with the simulated genotypes we just generated:
```{r}
k = ibdEstimate(y, ids = c("bro1", "bro2"))
k
```
To get a visual sense of the estimate, it is instructive to plot it in the IBD triangle:
```{r triangle, fig.height=4, fig.width=4}
showInTriangle(k, labels = TRUE)
```
Reassuringly, the estimate is close to the theoretical expectation for non-inbred full siblings, $(\kappa_0, \kappa_1, \kappa_2) = (0.25, 0.5, 0.25)$, corresponding to the point marked `S` in the triangle.