demo.Rmd


---
title: "An example RMarkdown file"
subtitle: "Illustrating use of R, bash, and Python code chunks"
author: "Christopher Paciorek"
date: "August 2015"
bibliography: refs.bib
---

```{r chunksetup, include=FALSE} 
# include any code here you don't want to show up in the document,
# e.g., package and dataset loading

require(ggplot2)
set.seed(0)

# also a good place to set global chunk options

library(knitr) # need this for opts_chunk command
opts_chunk$set(fig.width = 5, fig.height = 5)
# if we wanted chunks by default not to be evaluated, uncomment next line
# opts_chunk$set(eval = FALSE)
myControlVar <- TRUE   # used later to control chunk behavior
```

# 1) How to generate a document from this file

From the UNIX command line, you can run this document through the either the *rmarkdown* or *knitr* package for R to generate an html file, or through the *rmarkdown* package to generate PDF or Word (the latter being useful at times but hopefully avoidable).

```{bash, compile-bash, eval=FALSE}
Rscript -e "library(rmarkdown); render('demo.Rmd', 'pdf_document')"  # PDF
Rscript -e "library(rmarkdown); render('demo.Rmd', 'html_document')"  # HTML
Rscript -e "library(rmarkdown); render('demo.Rmd', 'word_document')"  # Word
Rscript -e "library(knitr); knit2html('demo.Rmd')"  # HTML alternative

```

Alternatively, start R and run the desired line from amongst following in R:

```{r, compile-R, eval=FALSE}
library(rmarkdown); render('demo.Rmd', 'pdf_document')
library(rmarkdown); render('demo.Rmd', 'html_document')
library(rmarkdown); render('demo.Rmd', 'word_document')
library(knitr); knit2html('demo.Rmd')  
```

Or in RStudio, click on the 'Knit' pull-down menu and choose to knit to HTML, PDF, or Word.

# 2) Some basic Markdown formatting


Here's an *introduction* to our **critical** discovery. We have some code to display inline but not evaluate: `exp(7)` and we can embed the code in a static code block as follows:
```
a = 7 %% 5
b = exp(a)
```

This document will focus on embedding math and code and not on standard Markdown formatting. There are lots of sources of information on Markdown. [RStudio has good information on R Markdown](http://rmarkdown.rstudio.com) (including Markdown formatting).


# 3) Embedding equations using LaTeX

This can be done with the following syntax. Note that you can't have a space after the initial $ for the inline equations.

Here is an inline equation $f(x) = \int f(y, x) dy$.

Here's a displayed equation
$$
f_\theta(x) = \int f_\theta(y, x) dy.
$$ 


# 4) Embedding R code


Here's an R code chunk
```{r, r-chunk1}
a <- c(7, 3)
mean(a)
b <- a + 3
mean(b)
```

Here's another chunk: 

```{r, r-chunk2}
mean(b)
```

When running R code, output is printed interspersed with the code, as one would generally want. Also, later chunks have access to result from earlier chunks (i.e., state is preserved between chunks). 

Let's make a plot:
```{r, r-plot, fig.height = 3}
hist(rnorm(20))
```

And here's some inline R code: What is 3 plus 5? `r 3+5`.

You have control over whether code in chunks is echoed into the document and evaluated using the `include`, `echo`, and `eval` tags.

```{r, includeChunk, include=FALSE}
cat("this code is evaluated but nothing is printed in the document")
library(fields) 
## fields package should now be loaded for use by later chunks
```

```{r, echoChunk, echo=FALSE}
cat("this code is not printed in the document but results of evaluating the code are printed")
```
```{r, evalChunk, eval=FALSE}
cat("this code is not evaluated but the code itself is printed in the document")
```

Intensive calculations can be saved using the `cache=TRUE` tag so they don't need to be rerun every time you compile the document. 
```{r, slow-step, cache=TRUE}
a <- mean(rnorm(5e7))
a
```

You can use R variables to control the chunk options. Note that the variable `myControlVar` is defined in the first chunk of this document.

```{r, use-var-in-chunk-option, echo=myControlVar, eval=!myControlVar}
print("hi")
```


# 5) Embedding bash and Python code

## 5.1) bash

A bash chunk:
```{bash, bash-chunk1}
ls -l
df -h
cd /tmp
pwd
```

Unfortunately, output from bash chunks occurs after all the code is printed. Also, state is not preserved between chunks.

```{bash, bash-chunk2}
pwd  # result would be /tmp if state were preserved
```

Inline bash code won't work: `bash wc demo.Rmd`.

## 5.2) Embedding Python code

```{python, py-chunk1}
import numpy as np
x = np.array((3, 5, 7))
print(x.sum())
x.min()  # note not printed
```

```{python, py-chunk2}
try:
        print(x[0])
except NameError:
       print('x does not exist')
```

As with bash chunks, all output occurs after the code is printed. Also, state is not preserved between chunks. Finally, you need explicit print statements; you won't see what would normally be printed to the screen.

There is no facility for inline Python code: `python print(3+5)`

# 6) Reading code from an external file

It's sometimes nice to draw code in from a separate file. Before invoking a chunk, we need to read the chunks from the source file. Note that a good place for reading the source file via `read_chunk()` is in an initial setup chunk at the beginning of the document.

```{r, read-chunk, echo=FALSE}
library(knitr)
read_chunk('demo.R')
```

```{r, external_chunk_1}
```

```{r, external_chunk_2}
```

# 7) Formatting of long lines of code and of output

## 7.1) R code

Having long lines be nicely formatted and other aspects of formatting can be a challenge. Also, results can differ depending on your output format (e.g., PDF vs. HTML). In general the code in this section will often overflow the page width in PDF but not in HTML, but even in the HTML the line breaks may be awkwardly positioned.

```{r, r-long}
b <- "Statistics at UC Berkeley: We are a community engaged in research and education in probability and statistics. In addition to developing fundamental theory and methodology, we are actively"
## Statistics at UC Berkeley: We are a community engaged in research and education in probability and statistics. In addition to developing fundamental theory and methodology, we are actively

## this should work to give decent formatting, but often doesn't
cat(b, fill = TRUE)

vecLongName = rnorm(100)
a = length(mean(5 * vecLongName + vecLongName - exp(vecLongName) + vecLongName * vecLongName, na.rm = TRUE))
a = length(mean(5 * vecLongName + vecLongName)) # this is a comment that goes over the line by a good long ways
a = length(mean(5 * vecLongName + vecLongName - exp(vecLongName) + vecLongName, na.rm = TRUE)) # this is a comment that goes over the line by a good long long long long long long long long ways

## long output usually is fine
rnorm(100)
```

Sometimes you can format things manually for better results. You may need to tag the chunk with `tidy=FALSE`, but I have not done that here.

```{r, r-long-tidy}
## breaking up a string
b <- "Statistics at UC Berkeley: We are a community engaged in research
 and education in probability and statistics. In addition to developing 
fundamental theory and methodology, we are actively"

## breaking up a comment
## Statistics at UC Berkeley: We are a community engaged in research and 
## education in probability and statistics. In addition to developing 
## fundamental theory and methodology, we are actively

## breaking up code lines
vecLongName = rnorm(100)
a <- length(mean(5 * vecLongName + vecLongName - exp(vecLongName) + 
    vecLongName * vecLongName, na.rm = TRUE))
a <- length(mean(5 * vecLongName + vecLongName)) # this is a comment that 
    ## goes over the line by a good long ways
a <- length(mean(5 * vecLongName + vecLongName - exp(vecLongName) + 
    vecLongName, na.rm = TRUE)) # this is a comment that goes over the line 
    ## by a good long long long long long long long long ways
```


## 7.2) bash code


In bash, we have similar problems with lines overflowing in PDF output, but bash allows us to use a backslash to break lines of code. However that strategy doesn't help with long lines of output. 

```{bash, bash1}
echo "Statistics at UC Berkeley: We are a community engaged in research and education in probability and statistics. In addition to developing fundamental theory and methodology, we are actively" > tmp.txt
  
echo "Second try: Statistics at UC Berkeley: We are a community engaged \
in research and education in probability and statistics. In addition to \
developing fundamental theory and methodology, we are actively" \
>> tmp.txt

cat tmp.txt
```

We also have problems with long comments, so we would need to manually format them.

```{bash, bash2}
# the following long comment line is not broken in my test:
# asdl lkjsdf jklsdf kladfj jksfd alkfd klasdf klad kla lakjsdf aljdkfad kljafda kaljdf afdlkja lkajdfsa lajdfa adlfjaf jkladf afdl

# instead manually break it:
# asdl lkjsdf jklsdf kladfj jksfd alkfd klasdf klad kla 
# lakjsdf aljdkfad kljafda kaljdf afdlkja lkajdfsa lajdfa 
# adlfjaf jkladf afdl
```

## 7.3) Python code

In Python, there is similar trouble with lines overflowing in PDF output too.

```{python, test-python}
# this overflows the page
b = "asdl lkjsdf jklsdf kladfj jksfd alkfd klasdf klad kla lakjsdf aljdkfad kljafda kaljdf afdlkja lkajdfsa lajdfa adlfjaf jkladf afdl"
print(b)

# this code overflows the page
zoo = {"lion": "Simba", "panda": None, "whale": "Moby", "numAnimals": 3, "bear": "Yogi", "killer whale": "shamu", "bunny:": "bugs"}
print(zoo)

# instead manually break the code
zoo = {"lion": "Simba", "panda": None, "whale": "Moby", 
       "numAnimals": 3, "bear": "Yogi", "killer whale": "shamu", 
       "bunny:": "bugs"}
print(zoo)

# long comments overflow too
# asdl lkjsdf jklsdf kladfj jksfd alkfd klasdf klad kla lakjsdf aljdkfad kljafda kaljdf afdlkja lkajdfsa lajdfa adlfjaf jkladf afdl

# the long output that will appear next in the resulting document (produced from the evaluation of the code above) also overflows:
```

# 8) References

We'll just see how you use bibtex style references.  @Bane:etal:2008 proposed a useful method. This was confirmed [@Cres:Joha:2008].

Note the indication of the `refs.bib` file in the initial lines of this document so that the bibliographic information for these citations can be found.

The list of references is placed at the end of the document. You'd presumably want a section header like this:

# Literature cited