Update documentation for postprocessing
bencardoen committed Oct 29, 2024
1 parent 60a0130 commit 2bf99bf
Showing 4 changed files with 50 additions and 11 deletions.
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -1,6 +1,6 @@
using Documenter
push!(LOAD_PATH,"../src/")
makedocs(sitename="SubPrecisionContactDetection Documentation", pages=[ "Tutorial" => "tutorial.md", "Parameter selection and tuning" => "parameters.md", "Generated output" => "output.md",
"Cluster Usage" => "clustercomputing.md", "Installation" => "installation.md", "Postprocessing" => "Postprocessing.md", "Help and FAQ" => "faq.md"])
"Cluster Usage" => "clustercomputing.md", "Installation" => "installation.md", "Postprocessing" => "postprocessing.md", "Help and FAQ" => "faq.md"])

deploydocs(repo = "github.com/bencardoen/SubPrecisionContactDetection.jl.git")
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -23,4 +23,4 @@ The below 3D rendering shows the software predicting ER-Mitochondria contacts in

![example.png](./assets/example.png)

Mitochondria are in red, ER in green translucent, the contact zones in white.
Mitochondria are in red, ER in green translucent, the contact zones in white.
5 changes: 4 additions & 1 deletion docs/src/installation.md
@@ -27,4 +27,7 @@ See the [build](https://github.com/bencardoen/SubPrecisionContactDetection.jl/bu
```bash
export PYTHON=""
julia --project=. -e 'using Pkg; Pkg.build'
```
```
!!! note "Attention"
For the remainder of this document we assume all commands are run inside the cloned directory, e.g. `SubPrecisionContactDetection.jl`.
52 changes: 44 additions & 8 deletions docs/src/postprocessing.md
@@ -3,17 +3,20 @@
Once the contact maps have been computed, you often need quantification and additional filtering.
For example, coverage, feature descriptors, and so forth.

## Aggregating CSV files
There are three key processing steps disjoint from the actual algorithm output:
- Bleedthrough filter
- CSV curation
- Sampling

In the remainder of this document, we assume `DIR` is the directory where the algorithm saved its output on a full dataset.

## Sampling contacts
In [scripts/run_cube_sampling_on_dataset.jl](https://github.com/bencardoen/SubPrecisionContactDetection.jl/scripts/run_cube_sampling_on_dataset.jl) you'll find a script that samples contacts with a sliding window, so that long-tail statistics do not dominate the conclusions of an analysis. The paper goes into more depth on why this is beneficial.



## Preprocessing and filtering
## Bleedthrough filter
The background filter removes ghost effects (bleedthrough).
If you want to tune this without invoking the full pipeline, you can do so:
It is run as part of the pipeline, but you can invoke it separately.

!!! note "This is Optional"
This is entirely optional, but useful if you want to optimize this filter independently.
```
Suppose we want to filter all tif files ending with "1.tif" or "2.tif", for z=1 to 1.1 in 0.25 steps, and then compute the object properties.
```julia
@@ -43,4 +46,37 @@ For all the files, it will generate a CSV with columns, where each row is an obj

!!! warning "Shape features"
The ``\lambda`` values are disabled by default, given that for very large objects (1e6 voxels) they can stall the pipeline.

## CSV Curation
You can run our Python script to aggregate and curate the processed CSV files.

```bash
python3 scripts/csvcuration.py --inputdirectory <where you saved the output> --outputdirectory <where you want the new CSV files saved>
```
By default this will look for output produced with ``\alpha`` = 0.05; you can override this as needed, for example with `--alpha 0.01`.

This will produce:

```
contacts_aggregated.csv # Contacts aggregated per cell, so 1 row = 1 cell, use this for e.g. mean height, Q95 Volume
contacts_filtered_novesicles.csv # All contacts, without vesicles
contacts_unfiltered.csv # All contacts, no filtering
```
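
To make "1 row = 1 cell" concrete, here is a minimal sketch of collapsing per-contact rows into per-cell summaries such as mean height and Q95 volume. It is not the logic of `scripts/csvcuration.py`: the column names `cell`, `height` and `volume`, the helper names, and the linear-interpolation quantile are all assumptions chosen for illustration, so check the headers of your own CSV files before adapting it.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def aggregate_per_cell(rows):
    """Collapse per-contact rows (dicts) into one summary dict per cell."""
    by_cell = defaultdict(list)
    for row in rows:
        by_cell[row["cell"]].append(row)  # "cell" is a hypothetical column
    summary = {}
    for cell, contacts in by_cell.items():
        heights = [float(c["height"]) for c in contacts]  # hypothetical column
        volumes = [float(c["volume"]) for c in contacts]  # hypothetical column
        summary[cell] = {
            "ncontacts": len(contacts),
            "mean_height": mean(heights),
            "q95_volume": quantile(volumes, 0.95),
        }
    return summary


# Tiny in-memory stand-in for a per-contact CSV (made-up values).
demo = io.StringIO(
    "cell,height,volume\n"
    "a,2,10\n"
    "a,4,30\n"
    "b,1,5\n"
)
per_cell = aggregate_per_cell(csv.DictReader(demo))
```

To try this on real output, replace `demo` with an open file handle to one of the per-contact CSV files and substitute the actual column names.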

## Sampling contacts
In [scripts/run_cube_sampling_on_dataset.jl](https://github.com/bencardoen/SubPrecisionContactDetection.jl/scripts/run_cube_sampling_on_dataset.jl) you'll find a script that samples contacts with a sliding window, so that long-tail statistics do not dominate the conclusions of an analysis. The paper goes into more depth on why this is beneficial.

```bash
julia --project=. scripts/run_cube_sampling_on_dataset.jl --inpath DIR --outpath <where to save your output>
```

A convenience script is provided to further aggregate the output of this stage.

```bash
python3 scripts/coverage.py --inputdirectory DIR --outputdirectory <where to save your output>
```

This will print summary output and save a file `coverage_aggregated.csv`. The columns `Coverage % mito by contacts, mean per cell` and `ncontacts mean` are the ones you'll be most interested in.

They report the coverage of contacts on mitochondria (minus MDVs), and the number of contacts per sliding window of 5x5x5 voxels.
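
As a rough illustration of what these two quantities measure, the sketch below walks cubic windows over a small volume and reports, per window, the percentage of mitochondria voxels overlapped by contacts and the number of distinct contacts. It is not the implementation in `run_cube_sampling_on_dataset.jl`: the function name, the set/dict voxel representation, the `stride` parameter, the coverage definition, and the choice to skip windows without mitochondria are all assumptions.

```python
from itertools import product


def window_stats(shape, mito, contacts, size=5, stride=5):
    """Return (origin, coverage_percent, ncontacts) for each cubic window.

    shape:    (Z, Y, X) extent of the volume in voxels.
    mito:     set of (z, y, x) voxels belonging to mitochondria.
    contacts: dict mapping (z, y, x) -> integer contact id.
    stride:   step between window origins (assumption); 1 gives a fully
              overlapping sliding window, `size` gives disjoint cubes.
    """
    results = []
    starts = [range(0, max(dim - size, 0) + 1, stride) for dim in shape]
    for z0, y0, x0 in product(*starts):
        n_mito = 0
        n_covered = 0
        ids = set()
        for voxel in product(range(z0, z0 + size),
                             range(y0, y0 + size),
                             range(x0, x0 + size)):
            if voxel in contacts:
                ids.add(contacts[voxel])
            if voxel in mito:
                n_mito += 1
                if voxel in contacts:
                    n_covered += 1
        if n_mito == 0:
            continue  # assumption: windows without mitochondria are skipped
        results.append(((z0, y0, x0), 100.0 * n_covered / n_mito, len(ids)))
    return results


# Made-up toy volume: a line of mitochondria voxels and three contacts.
mito = {(0, 0, x) for x in range(10)}
contacts = {(0, 0, 0): 1, (0, 0, 1): 1, (0, 0, 7): 2, (1, 1, 8): 3}
stats = window_stats((5, 5, 10), mito, contacts)
```

Averaging the per-window values over a cell gives numbers comparable in spirit to the aggregated columns; because each window contributes one sample, a few very large contacts cannot dominate the summary.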
