README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  fig.align = "center"
)
```

# forrel <img src="man/figures/logo.png" align="right" height=140/>

<!-- badges: start -->
[![CRAN status](https://www.r-pkg.org/badges/version/forrel)](https://CRAN.R-project.org/package=forrel)
[![](https://cranlogs.r-pkg.org/badges/grand-total/forrel?color=yellow)](https://cran.r-project.org/package=forrel)
[![](https://cranlogs.r-pkg.org/badges/last-month/forrel?color=yellow)](https://cran.r-project.org/package=forrel)
<!-- badges: end -->


## Introduction
The goal of **forrel** is to provide forensic pedigree computations and relatedness inference from genetic marker data. The **forrel** package is part of the **pedsuite**, a collection of R packages for pedigree analysis.

The most important analyses currently supported by **forrel** are:

* Likelihood ratio (LR) computations for kinship testing
    * `quickLR()`
    * `kinshipLR()`
* Pairwise relatedness inference: Estimation of IBD coefficients (both $\kappa$ and $\Delta$) from marker data
    * `ibdEstimate()`
    * `ibdBootstrap()`
* Check and visualise relationships in pedigree data
    * `checkPairwise()`
* Simulation of marker genotypes, possibly conditional on known genotypes
    * `markerSim()`
    * `profileSim()`
    * `markerSimParametric()`
    * `profileSimParametric()`
* Power analysis for relationship testing
    * `LRpower()`
    * `exclusionPower()`
* Tailor-made functions for power analysis in missing person cases
    * `missingPersonPlot()`
    * `missingPersonEP()`
    * `missingPersonIP()`
    * `MPPsims()`
    * `powerPlot()`

## Installation
To get the current official version of **forrel**, install from CRAN as follows:
```{r, eval = FALSE}
install.packages("forrel")
```
Alternatively, you can obtain the latest development version from GitHub:
```{r, eval = FALSE}
# install.packages("devtools") # install devtools if needed
devtools::install_github("magnusdv/forrel")
```

## An example
In this short introduction, we first demonstrate simulation of marker data for a pair of siblings. Then - pretending the relationship is unknown to us - we estimate the relatedness between the brothers using the simulated data. If all goes well, the estimate should be close to the expected value for siblings.

```{r}
library(forrel)
```

**Create the pedigree**  

We start by creating and plotting a pedigree with two brothers, named `bro1` and `bro2`. 

```{r sibs, fig.height=2.5, fig.width=2.5}
x = nuclearPed(children = c("bro1", "bro2"))
plot(x)
```

**Marker simulation**  

Now let us simulate the genotypes of 100 independent SNPs for all four family members. Each SNP has alleles 1 and 2, with equal frequencies by default. This is an example of _unconditional_ simulation, since we don't give any genotypes to condition on. 
```{r}
x = markerSim(x, N = 100, alleles = 1:2, seed = 1234)
```
Note 1: The `seed` argument is passed onto the random number generator. If you use the same seed, you should get exactly the same results.  
Note 2: To suppress the informative messages printed during simulation, add `verbose = FALSE` to the function call.

The pedigree `x` now has 100 markers attached to it. The genotypes of the first few markers are shown when printing `x` to the screen:

```{r}
x
```

**Conditional simulation**

Suppose one of the brothers is homozygous 1/1 and that we want to simulate genotypes for the other brother.
This is achieved with the following code, where after first attaching a marker to the pedigree, specifying the known genotype, we condition on it by referencing it in `markerSim()`. 

```{r}
y = nuclearPed(children = c("bro1", "bro2")) |> 
  addMarker(bro1 = "1/1", alleles = 1:2, name = "snp1") |> 
  markerSim(N = 100, ids = "bro2", partialmarker = "snp1", 
            seed = 321, verbose = FALSE)
y
```

Note that the previous code also demonstrates how **pedsuite** is well adapted to the R pipe `|>`.

**Estimation of IBD coefficients**  

The `ibdEstimate()` function estimates the coefficients of _identity-by-descent_ (IBD) between pairs of individuals, from the available marker data. Let us try with the simulated genotypes we just generated:
```{r}
k = ibdEstimate(y, ids = c("bro1", "bro2"))
k
```

To get a visual sense of the estimate, it is instructive to plot it in the IBD triangle:
```{r triangle, fig.height=4, fig.width=4}
showInTriangle(k, labels = TRUE)
```

Reassuringly, the estimate is close to the theoretical expectation for non-inbred full siblings, $(\kappa_0, \kappa_1, \kappa_2) = (0.25, 0.5, 0.25)$, corresponding to the point marked `S` in the triangle.