fix(spm): various bugs in new API integration (#118)
* fix(tools): various bugs in new API integration
* docs(tools): notebook fixes to work with updated APIs
* docs(tools): README and CHANGELOG update
nfrasser authored Jan 17, 2025
1 parent 4cb88d8 commit da1c39d
Showing 12 changed files with 248 additions and 138 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,7 @@
## Next

- BREAKING: replaced low-level `CryoSPARC.cli`, `CryoSPARC.rtp` and `CryoSPARC.vis` attributes with single unified `CryoSPARC.api`
- BREAKING: When a `job.start()` or `job.run()` is called for an external job, changing the job connections with `job.add_input`, `job.add_output` or `job.connect` will now trigger an error. Please add all inputs and outputs and connect all inputs before running an external job.
- BREAKING: `CryoSPARC.download_asset(fileid, target)` no longer accepts a directory target. Must specify a filename.
- BREAKING: removed `CryoSPARC.get_job_specs()`. Use `CryoSPARC.job_register` instead
- BREAKING: `CryoSPARC.list_assets()` and `Job.list_assets()` return list of models instead of list of dictionaries, accessible with dot-notation
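The external-job rule above (add and connect everything before running) can be illustrated with a toy sketch; `ToyExternalJob` is a hypothetical stand-in for illustration, not the real class:

```python
class ToyExternalJob:
    """Toy model of the new contract: no connection changes after start."""

    def __init__(self):
        self.started = False
        self.inputs = []

    def add_input(self, name):
        if self.started:
            raise RuntimeError("cannot modify connections after job has started")
        self.inputs.append(name)

    def start(self):
        self.started = True


job = ToyExternalJob()
job.add_input("particles")  # OK: before start()
job.start()
try:
    job.add_input("volumes")  # now triggers an error
except RuntimeError as e:
    print(e)
```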
@@ -15,10 +16,15 @@
- OLD: `cs.get_targets()[0]['hostname']`
- NEW: `cs.get_targets()[0].hostname`
- Some top-level target attributes have also been moved into the `.config` attribute
- BREAKING: `CryoSPARC.print_job_types` `section` argument renamed to `category`
- OLD: `cs.print_job_types(section=["extraction", "refinement"])`
- NEW: `cs.print_job_types(category=["extraction", "refinement"])`
- BREAKING: Restructured schema for Job models, many `Job.doc` properties have been internally rearranged
- Added: `CryoSPARC.job_register` property
- Added: `job.load_input()` and `job.load_output()` now accept `"default"`, `"passthrough"` and `"all"` keywords for their `slots` argument
- Added: `job.alloc_output()` now accepts `dtype_params` argument for fields with dynamic shapes
- Added: `CryoSPARC.print_job_types` now includes a job stability column
- Added: `Job.print_output_spec` now includes a passthrough indicator column for results
- Updated: Improved type definitions
- Deprecated: When adding external inputs and outputs, expanded slot definitions now expect `"name"` key instead of `"prefix"`, support for which will be removed in a future release.
- OLD: `job.add_input("particle", slots=[{"prefix": "component_mode_1", "dtype": "component", "required": True}])`
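Per the deprecation note, expanded slot definitions should switch from `"prefix"` to `"name"`. A small hypothetical migration helper (not part of the library):

```python
def migrate_slot(slot: dict) -> dict:
    """Rename the deprecated "prefix" key to "name", leaving other keys intact."""
    out = dict(slot)
    if "prefix" in out and "name" not in out:
        out["name"] = out.pop("prefix")
    return out


old = {"prefix": "component_mode_1", "dtype": "component", "required": True}
print(migrate_slot(old))  # {'dtype': 'component', 'required': True, 'name': 'component_mode_1'}
```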
14 changes: 5 additions & 9 deletions README.md
@@ -117,27 +117,23 @@ rm -rf cryosparc/*.so build dist *.egg-info
Install dependencies into a new conda environment:

```sh
- conda create -n cryosparc-tools-example -c conda-forge \
-     python=3 numpy==1.18.5 \
-     pyqt=5 libtiff wxPython=4.1.1 adwaita-icon-theme
+ conda create -n cryosparc-tools-example -c conda-forge python=3 numpy=1.18.5 \
+     pyqt=5 libtiff wxPython=4.1.1 adwaita-icon-theme 'setuptools<66' # exclude these dependencies if you don't need cryolo
conda activate cryosparc-tools-example
pip install -U pip
- pip install nvidia-pyindex matplotlib~=3.4.0 pandas==1.1.4 notebook
- pip install "cryolo[c11]"
- pip install -e ".[build]"
+ pip install cryosparc-tools matplotlib~=3.4.0 pandas~=1.1.0 notebook
+ pip install nvidia-pyindex # exclude last two steps if you don't need cryolo
+ pip install 'cryolo[c11]'
```

Run the notebook server with the following environment variables:

- `CRYOSPARC_LICENSE_ID` with a Structura-issued CryoSPARC license
- `CRYOSPARC_EMAIL` with a CryoSPARC user account email
- `CRYOSPARC_PASSWORD` with a CryoSPARC user account password

You may also need to set `LD_LIBRARY_PATH` to include the location of the
CUDA Toolkit and cuDNN runtime libraries (e.g., `~/miniconda3/envs/tools/lib/python3.8/site-packages/nvidia/*/lib`).

```sh
CRYOSPARC_LICENSE_ID="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx" \
CRYOSPARC_EMAIL="[email protected]" \
CRYOSPARC_PASSWORD="password123" \
jupyter notebook
```
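The same credentials can be supplied from Python when constructing the client. A sketch with a hypothetical `credentials_from_env` helper (the helper name is an assumption; the `CryoSPARC` constructor accepts `license`, `email`, and `password` keyword arguments):

```python
import os


def credentials_from_env(env=None):
    """Read CryoSPARC login details from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        "license": env["CRYOSPARC_LICENSE_ID"],
        "email": env["CRYOSPARC_EMAIL"],
        "password": env["CRYOSPARC_PASSWORD"],
    }


# Usage against a live instance (not run here):
# from cryosparc.tools import CryoSPARC
# cs = CryoSPARC(host="localhost", base_port=61000, **credentials_from_env())
```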
33 changes: 18 additions & 15 deletions cryosparc/controllers/job.py
@@ -988,7 +988,7 @@ def print_param_spec(self):
"""
headings = ["Param", "Title", "Type", "Default"]
rows = []
- for key, details in self.full_spec.params:
+ for key, details in self.full_spec.params.items():
if details.get("hidden") is True:
continue
type = (details["anyOf"][0] if "anyOf" in details else details).get("type", "Any")
@@ -1050,25 +1050,28 @@ def print_output_spec(self):
>>> job.doc['type']
'extract_micrographs_multi'
>>> job.print_output_spec()
- Output      | Title       | Type     | Result Slots           | Result Types
- ==========================================================================================
- micrographs | Micrographs | exposure | micrograph_blob        | micrograph_blob
-             |             |          | micrograph_blob_non_dw | micrograph_blob
-             |             |          | background_blob        | stat_blob
-             |             |          | ctf                    | ctf
-             |             |          | ctf_stats              | ctf_stats
-             |             |          | mscope_params          | mscope_params
- particles   | Particles   | particle | blob                   | blob
-             |             |          | ctf                    | ctf
+ Output      | Title       | Type     | Result Slots           | Result Types    | Passthrough?
+ =========================================================================================================
+ micrographs | Micrographs | exposure | micrograph_blob        | micrograph_blob | ✕
+             |             |          | micrograph_blob_non_dw | micrograph_blob | ✓
+             |             |          | background_blob        | stat_blob       | ✓
+             |             |          | ctf                    | ctf             | ✓
+             |             |          | ctf_stats              | ctf_stats       | ✓
+             |             |          | mscope_params          | mscope_params   | ✓
+ particles   | Particles   | particle | blob                   | blob            | ✕
+             |             |          | ctf                    | ctf             | ✕
"""
specs = self.cs.api.jobs.get_output_specs(self.project_uid, self.uid)
- headings = ["Output", "Title", "Type", "Result Slots", "Result Types"]
+ headings = ["Output", "Title", "Type", "Result Slots", "Result Types", "Passthrough?"]
rows = []
for key, spec in specs.root.items():
output = self.model.spec.outputs.root.get(key)
if not output:
warnings.warn(f"No results for output {key}", stacklevel=2)
continue
name, title, type = key, spec.title, spec.type
- for slot in spec.slots:
- slot = as_output_slot(slot)
- rows.append([name, title, type, slot.name, slot.dtype])
+ for result in output.results:
+ rows.append([name, title, type, result.name, result.dtype, "✓" if result.passthrough else "✕"])
name, title, type = "", "", "" # only these print once per group
print_table(headings, rows)
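The row-building loop above blanks the group labels after the first row and maps the `passthrough` flag to ✓/✕. Isolated as a toy function (the result tuples here are made up for illustration):

```python
def output_rows(name, title, type_, results):
    """Build table rows for one output group; group labels print only once."""
    rows = []
    for rname, rdtype, passthrough in results:
        rows.append([name, title, type_, rname, rdtype, "✓" if passthrough else "✕"])
        name, title, type_ = "", "", ""  # only print these once per group
    return rows


rows = output_rows("particles", "Particles", "particle",
                   [("blob", "blob", False), ("ctf", "ctf", False)])
print(rows[1])  # ['', '', '', 'ctf', 'ctf', '✕']
```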

12 changes: 7 additions & 5 deletions cryosparc/tools.py
@@ -26,7 +26,7 @@
from contextlib import contextmanager
from functools import cached_property
from hashlib import sha256
- from io import BytesIO
+ from io import BytesIO, TextIOBase
from pathlib import PurePath, PurePosixPath
from typing import IO, TYPE_CHECKING, Any, Container, Dict, Iterable, List, Optional, Tuple, Union, get_args

@@ -313,7 +313,7 @@ def print_job_types(
"""
allowed_categories = {category} if isinstance(category, str) else category
register = self.job_register
- headings = ["Category", "Job", "Title"]
+ headings = ["Category", "Job", "Title", "Stability"]
rows = []
prev_category = None
for job_spec in register.specs:
@@ -326,7 +326,7 @@

category = job_spec.category
display_category = "" if category == prev_category else category
- rows.append([display_category, job_spec.type, job_spec.title])
+ rows.append([display_category, job_spec.type, job_spec.title, job_spec.stability])
prev_category = category

print_table(headings, rows)
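The `category` argument accepts either a single string or a container of strings; the normalization at the top of the method (`{category} if isinstance(category, str) else category`) reduces to this sketch:

```python
def normalize_categories(category):
    """A bare string becomes a one-element set; containers and None pass through."""
    return {category} if isinstance(category, str) else category


print(normalize_categories("refinement"))                  # {'refinement'}
print(normalize_categories(["extraction", "refinement"]))  # ['extraction', 'refinement']
```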
@@ -782,13 +782,15 @@ def upload(
project_uid (str): Project unique ID, e.g., "P3"
target_path (str | Path): Name or path of file to write in project
directory.
- source (str | bytes | Path | IO | Stream): Local path or file handle to
- upload. May also specified as raw bytes.
+ source (str | bytes | Path | IO | Stream): Local path or file handle
+ to upload. May also be specified as raw bytes.
overwrite (bool, optional): If True, overwrite existing files.
Defaults to False.
"""
if isinstance(source, bytes):
source = BytesIO(source)
if isinstance(source, TextIOBase): # e.g., open(p, "r") or StringIO()
source = Stream.from_iterator(s.encode() for s in source)
if not isinstance(source, Stream):
source = Stream.load(source)
self.api.projects.upload_file(project_uid, source, path=str(target_path), overwrite=overwrite)
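The new `TextIOBase` branch encodes each chunk of a text handle to bytes before streaming. The underlying transformation, in isolation:

```python
from io import StringIO

# Iterating a text handle yields str lines; encode each one to bytes,
# as the upload path does before wrapping them in a Stream
text = StringIO("hello\nworld\n")
chunks = [s.encode() for s in text]
print(chunks)  # [b'hello\n', b'world\n']
```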
2 changes: 1 addition & 1 deletion docs/examples/3dflex-custom-latent-trajectory.ipynb
@@ -608,7 +608,7 @@
"# so we need to divide the components_mode fields by two to get the total number of components\n",
"num_components = int(len([x for x in particles.fields() if \"components_mode\" in x]) / 2)\n",
"\n",
- "slot_spec = [{\"dtype\": \"components\", \"prefix\": f\"components_mode_{k}\", \"required\": True} for k in range(num_components)]\n",
+ "slot_spec = [{\"dtype\": \"components\", \"name\": f\"components_mode_{k}\"} for k in range(num_components)]\n",
"job = project.create_external_job(\"W5\", \"Custom Latents\")\n",
"job.connect(\"particles\", \"J243\", \"particles\", slots=slot_spec)"
]
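The cell above counts components by halving the matched fields, on the assumption that each `components_mode_<k>` mode contributes two dataset fields. With hypothetical field names for illustration:

```python
# Hypothetical particle dataset fields: two fields per component mode
fields = [
    "components_mode_0/component", "components_mode_0/value",
    "components_mode_1/component", "components_mode_1/value",
]

# Each mode appears twice, so divide the match count by two
num_components = int(len([x for x in fields if "components_mode" in x]) / 2)
print(num_components)  # 2
```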
18 changes: 10 additions & 8 deletions docs/examples/cryolo.ipynb
@@ -32,16 +32,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "Connection succeeded to CryoSPARC command_core at http://cryoem0.sbi:40002\n",
- "Connection succeeded to CryoSPARC command_vis at http://cryoem0.sbi:40003\n",
- "Connection succeeded to CryoSPARC command_rtp at http://cryoem0.sbi:40005\n"
+ "Connection succeeded to CryoSPARC API at http://cryoem0.sbi:61002\n"
]
}
],
"source": [
"from cryosparc.tools import CryoSPARC\n",
"\n",
- "cs = CryoSPARC(host=\"cryoem0.sbi\", base_port=40000)\n",
+ "cs = CryoSPARC(host=\"cryoem0.sbi\", base_port=61000)\n",
"assert cs.test_connection()\n",
"\n",
"project = cs.find_project(\"P251\")"
@@ -175,12 +173,12 @@
"\n",
"for mic in all_micrographs.rows():\n",
" source = mic[\"micrograph_blob/path\"]\n",
- " target = job.uid + \"/full_data/\" + source.split(\"/\")[-1]\n",
+ " target = job.uid + \"/full_data/\"\n",
" project.symlink(source, target)\n",
"\n",
"for mic in train_micrographs.rows():\n",
" source = mic[\"micrograph_blob/path\"]\n",
- " target = job.uid + \"/train_image/\" + source.split(\"/\")[-1]\n",
+ " target = job.uid + \"/train_image/\"\n",
" project.symlink(source, target)"
]
},
@@ -293,7 +291,11 @@
"\n",
"cryosparc-tools provides a `job.subprocess` function to run arbitrary processes, including `cryolo_*.py` scripts installed in the active conda environment.\n",
"\n",
- "Use `job.subprocess` to generate a crYOLO configuration file with the `cryolo_gui.py config` command. Specify a box size of 130 for this dataset."
+ "Use `job.subprocess` to generate a crYOLO configuration file with the `cryolo_gui.py config` command. Specify a box size of 130 for this dataset.\n",
+ "\n",
+ "```{note}\n",
+ "When connecting to a remote CryoSPARC instance, note that `job.subprocess` processes will run on the local machine, not remotely.\n",
+ "```"
]
},
{
@@ -368,7 +370,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "This creates a `cryolo_model.h5` trained model file in the job directory.\n",
+ "Open the External job's \"Events\" tab from CryoSPARC's web interface to view crYOLO's output. When the process completes, crYOLO creates a `cryolo_model.h5` trained model file in the job directory.\n",
"\n",
"## Picking\n",
"\n",
