---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
# Please put your title here to include it in the file below.
Title <- "ROF Camera Trap Data Analysis - Preliminary Report"
```
# ROF Camera Trap Data Analysis Research Compendium
<!-- [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/StewartWILDlab/rofcamtrap/main?urlpath=rstudio) -->
*Dependencies:*
- *docker*
- *nvidia-container-toolkit*
## Report(s) and website
The **reports** directory currently contains the [:file\_folder: paper](/analysis/paper) directory with the Quarto source document for the GnC report. It includes the code to reproduce the figures and tables generated by the analysis.
The website is under construction.
## How to run in your browser or download and run locally
This research compendium has been developed using the statistical programming
language R. To work with the compendium, you will need the
[R software](https://cloud.r-project.org/) installed on your computer and,
optionally, [RStudio Desktop](https://rstudio.com/products/rstudio/download/).
You can download the compendium as a zip from this URL:
[main.zip](/archive/main.zip). After unzipping:
- open the `.Rproj` file in RStudio
- run `devtools::install()` to ensure you have the packages this analysis depends on (also listed in the
[DESCRIPTION](/DESCRIPTION) file).
- finally, open `analysis/paper/paper.Rmd` and knit to produce the `paper.docx`, or run `rmarkdown::render("analysis/paper/paper.qmd")` in the R console
### How to cite
Please cite this compendium as:
> Lucet, Valentin; Stewart, Frances et al., (`r format(Sys.Date(), "%Y")`). _Compendium of R code and data for `r Title`_. Accessed `r format(Sys.Date(), "%d %b %Y")`. Online at <https://doi.org/xxx/xxx>
## Main Workflow
1. At the terminal, clone this repository.
```{bash eval=FALSE}
git clone https://github.com/StewartWILDlab/rofcamtrap
```
2. Build the Docker image of the computing environment by running the build script. This also builds the Apptainer image needed on HPC.
```{bash eval=FALSE}
rofcamtrap/dockerfiles/build.sh
```
3. Run the Docker image with the proper volumes. We currently have two separate storage volumes, one for each of the two camera retrievals that took place. We also mount the `rofcamtrap` folder as a volume.
```{bash eval=FALSE}
# -e DISABLE_AUTH=true --shm-size 50G
docker run \
-v "$(pwd):/workspace/rofcamtrap" \
-v "/media/vlucet/TrailCamST/TrailCamStorage:/workspace/storage/TrailCamStorage" \
-v "/media/vlucet/TrailCamST/TrailCamStorage_2:/workspace/storage/TrailCamStorage_2" \
--gpus all \
-it rofcamtrap
```
4. Activate the environment, then run MegaDetector on the images using the bash script.
```{bash eval=FALSE}
mamba activate cameratraps-detector
rofcamtrap/scripts/bash/camtrap.sh \
-b "/workspace/git" \
-s "/workspace/storage/TrailCamStorage_2" \
-m "/workspace/models/md_v5a.0.0.pt" \
-o "/workspace/rofcamtrap/1_MegaDetector/0_outputs/TrailCamStorage_2" \
md
```
5. Run the repeat detector using the bash script, and remove all instances of true positives.
```{bash eval=FALSE}
rofcamtrap/scripts/bash/camtrap.sh \
-b "/workspace/git" \
-s "/workspace/storage/TrailCamStorage_2" \
-i "/workspace/rofcamtrap/1_MegaDetector/0_outputs/TrailCamStorage_2" \
repeat-detect
```
6. Patch MD's output with the repeat detect results.
```{bash eval=FALSE}
rofcamtrap/scripts/bash/camtrap.sh \
-b "/workspace/git" \
-s "/workspace/storage/TrailCamStorage_2" \
-i "/workspace/rofcamtrap/1_MegaDetector/0_outputs/TrailCamStorage_2" \
-o "/workspace/rofcamtrap/1_MegaDetector/1_outputs_no_repeats/TrailCamStorage_2" \
repeat-remove
```
7. Optionally, write out the visualizations of the detections.
```{bash eval=FALSE}
rofcamtrap/scripts/bash/camtrap.sh \
-b "/workspace/git" \
-s "/workspace/storage/TrailCamStorage_2" \
-i "/workspace/rofcamtrap/1_MegaDetector/1_outputs_no_repeats/TrailCamStorage_2" \
-o "/workspace/rofcamtrap/1_MegaDetector/2_visualize/TrailCamStorage_2" \
viz
```
8. Switch environments
```{bash eval=FALSE}
mamba deactivate
cd mdtools
poetry shell
cd ../
```
9. Convert to COCO and Label Studio (ls) formats [and 3rd format?].
```{bash eval=FALSE}
rofcamtrap/scripts/bash/camtrap.sh \
-b "/workspace/git" \
-s "/workspace/storage/TrailCamStorage_2" \
-i "/workspace/rofcamtrap/1_MegaDetector/1_outputs_no_repeats/TrailCamStorage_2" \
-o "/workspace/rofcamtrap/2_LabelStudio/0_inputs/TrailCamStorage_2" \
repeat-convert
```
10. Crop annotations.
```{bash eval=FALSE}
rofcamtrap/scripts/bash/camtrap.sh \
-b "/workspace/git" \
-s "/workspace/storage/TrailCamStorage_2" \
-i "/workspace/rofcamtrap/2_LabelStudio/0_inputs/TrailCamStorage_2" \
crop
```
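Before moving on to Label Studio, it can help to sanity-check what MegaDetector wrote in step 4. The sketch below counts detections per category above a confidence threshold. It assumes the standard MegaDetector batch output layout (an `images` list with per-image `detections`, plus a `detection_categories` mapping); the function name `summarize_detections` and the 0.2 threshold are hypothetical and not part of this repository.

```python
from collections import Counter


def summarize_detections(md_results, conf_threshold=0.2):
    """Count detections per category label at or above a confidence threshold."""
    categories = md_results.get("detection_categories", {})
    counts = Counter()
    for image in md_results.get("images", []):
        # Images that failed to load may have no detections (missing or null).
        for det in image.get("detections") or []:
            if det["conf"] >= conf_threshold:
                label = categories.get(det["category"], det["category"])
                counts[label] += 1
    return dict(counts)


# Tiny in-memory example mimicking the assumed output layout.
example = {
    "detection_categories": {"1": "animal", "2": "person", "3": "vehicle"},
    "images": [
        {"file": "a.jpg", "detections": [
            {"category": "1", "conf": 0.91, "bbox": [0.1, 0.2, 0.3, 0.4]},
            {"category": "2", "conf": 0.05, "bbox": [0.5, 0.5, 0.1, 0.1]},
        ]},
        {"file": "b.jpg", "detections": []},
        {"file": "c.jpg", "failure": "image access failure", "detections": None},
    ],
}

summary = summarize_detections(example)
print(summary)
```

To use it on real results, load the JSON file from the step 4 output directory with `json.load` and pass the resulting dictionary to the function.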
## Label studio outputs
1. Enter the container running the Label Studio app and output the list of projects.
```{bash eval=FALSE}
docker exec -it label-studio-app-1 bash
curl -X GET "http://localhost:8080/api/projects/?page_size=1000" -H 'Authorization: Token INSERT_TOKEN' -o files/outputs/project_counts.json
```
2. On the instance, outside the container, run the Python script to extract the project IDs.
```{python eval=FALSE, python.reticulate = FALSE}
import json

# Load the project listing saved in step 1
with open('data/outputs/project_counts.json', 'r') as file:
    data = json.load(file)

# Collect the id of every project in the listing
ids = [project['id'] for project in data['results']]

# Write one project id per line
with open('data/outputs/project_ids.txt', 'w') as file:
    for the_id in ids:
        file.write(str(the_id) + '\n')
```
3. Back in the container, export each project as JSON.
```{bash eval=FALSE}
arr=(192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 65 64 63 62 61 60 59 58 57 55 54 53 52 51 50 49 48 47 46 44 43 42 41 40 39 37 35 33 32 31 30 28 25 23 21 20 19 18 17 16 15 13 10)
for id in "${arr[@]}"; do
  # url="http://localhost:8080/api/projects/${id}/export?exportType=JSON&download_all_tasks=true"
  url="http://localhost:8080/api/projects/${id}/export?exportType=JSON"
  curl -X GET "$url" -H 'Authorization: Token INSERT_TOKEN' \
    -o "files/outputs/ls/output_file_$id.json"
done
```
4. Back outside the container, check the number of files.
```{bash eval=FALSE}
ls data/outputs/ls # 82 Jan 29 2024
```
5. Copy to local machine
```{bash eval=FALSE}
scp -i ssh_key/arbutus_def_fstewart_prod '[email protected]:~/data/outputs/ls/*' rofcamtrap/2_LabelStudio/1_outputs_downloaded/
```
6. Process those outputs
```{bash eval=FALSE}
rofcamtrap/scripts/bash/camtrap.sh \
-i "/workspace/rofcamtrap/2_LabelStudio/1_outputs_downloaded/" \
-o "/workspace/rofcamtrap/2_LabelStudio/2_outputs_processed" \
post
```
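The file count check in step 4 can be made stricter by comparing the project IDs written in step 2 against the export files produced in step 3. The sketch below is a minimal version of that cross-check. It reuses the `project_ids.txt` and `output_file_<id>.json` naming from the steps above; the function name `find_missing_exports` is hypothetical, and the demonstration runs in a temporary directory and does not touch real data.

```python
from pathlib import Path
import tempfile


def find_missing_exports(ids_file, export_dir):
    """Return the project ids in ids_file that have no output_file_<id>.json in export_dir."""
    lines = Path(ids_file).read_text().splitlines()
    ids = [line.strip() for line in lines if line.strip()]
    export_dir = Path(export_dir)
    return [i for i in ids if not (export_dir / f"output_file_{i}.json").is_file()]


# Self-contained demonstration: three ids listed, only two exports present.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "project_ids.txt").write_text("10\n13\n15\n")
    ls_dir = tmp / "ls"
    ls_dir.mkdir()
    (ls_dir / "output_file_10.json").write_text("[]")
    (ls_dir / "output_file_15.json").write_text("[]")
    missing = find_missing_exports(tmp / "project_ids.txt", ls_dir)
    print(missing)
```

On the instance, the equivalent call would be `find_missing_exports("data/outputs/project_ids.txt", "data/outputs/ls")`. An empty list means every listed project was exported.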
## Processing detections & annotations
See the Quarto documents in the analysis directory.
## Classifier training workflow
On beluga, we use the apptainer image instead.
```{bash eval=FALSE}
. /workspace/conda/etc/profile.d/conda.sh
. /workspace/conda/etc/profile.d/mamba.sh
PATH="$PATH:$HOME/.local/bin"
mamba activate cameratraps-detector
rofcamtrap/scripts/bash/camtrap.sh -b "/workspace/git" -s "/workspace/storage/my_passport_images" -m "/workspace/models/md_v5a.0.0.pt" md
mkdir -p /scratch/$USER/apptainer/{cache,tmp}
export APPTAINER_CACHEDIR="/scratch/$USER/apptainer/cache"
export APPTAINER_TMPDIR="/scratch/$USER/apptainer/tmp"
salloc --time=00:05:00 --mem=4G --ntasks=1 --gpus-per-task=1 --cpus-per-task=1 --account=rrg-fstewart
apptainer shell --nv -C -B "$(pwd):/workspace/rofcamtrap" -B "/media/vlucet/TrailCamST/TrailCamStorage:/workspace/storage/TrailCamStorage" -B "/media/vlucet/My Passport/Images:/workspace/storage/my_passport_images" rofcamtrap.sif
apptainer shell --nv -C -B "$(pwd):/workspace/rofcamtrap" -B "/home/vlucet/projects/rrg-fstewart/vlucet:/workspace/project/" rofcamtrap.sif
```
### Species classifier
1. Crop all annotations using the functions in the R package. TODO: run locally first, then containerize (the code is currently written for local use).
```{r, eval=FALSE}
# Read in the annotations
anns <- readRDS("data/objects/annotations_wide_noempty.rds")
rofcamtrap::crop_from_annotations(anns,
                                  base_dir = "/media/vlucet/TrailCamST1/",
                                  out_dir = "data/images/cropped/")
```
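The internals of `crop_from_annotations` are not shown here, so the following is only an illustration of the arithmetic that cropping usually involves. It assumes bounding boxes in the MegaDetector convention (normalized `[x_min, y_min, width, height]`, origin at the top left); annotations exported from Label Studio may use percentages instead and would need rescaling first. The function name `bbox_to_pixel_box` is hypothetical.

```python
def bbox_to_pixel_box(bbox, image_width, image_height):
    """Convert a normalized [x_min, y_min, width, height] box to pixel (left, top, right, bottom)."""
    x_min, y_min, width, height = bbox
    # Clamp to the image bounds so a box that spills over the edge stays valid.
    left = max(0, round(x_min * image_width))
    top = max(0, round(y_min * image_height))
    right = min(image_width, round((x_min + width) * image_width))
    bottom = min(image_height, round((y_min + height) * image_height))
    return left, top, right, bottom


# A box covering the middle half of the width in the third quarter of the height.
box = bbox_to_pixel_box([0.25, 0.5, 0.5, 0.25], 2000, 1000)
print(box)  # (500, 500, 1500, 750)
```

The resulting tuple has the `(left, top, right, bottom)` order that common image libraries expect for a crop rectangle.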
We need to download the models... TBC.
<!--
### False detections classifier
## LabelStudio instance setup
## Labelme? WildTrax?
### Licenses
TBD
**Text and figures :** [CC-BY-4.0](http://creativecommons.org/licenses/by/4.0/)
**Code :** See the [DESCRIPTION](DESCRIPTION) file
**Data :** [CC-0](http://creativecommons.org/publicdomain/zero/1.0/) attribution requested in reuse
### Contributions
We welcome contributions from everyone. Before you get started, please see our [contributor guidelines](CONTRIBUTING.md). Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
### Notes
```{bash eval=FALSE}
for FILE in project*; do
  mdtools postprocess --write-csv "$FILE"
done
```
This repository contains the data and code for our paper:
> Authors, (YYYY). _`r Title`_. Name of journal/book <https://doi.org/xxx/xxx>
Our pre-print is online here:
> Authors, (YYYY). _`r Title`_. Name of journal/book, Accessed `r format(Sys.Date(), "%d %b %Y")`. Online at <https://doi.org/xxx/xxx>
````
-->