modify training script to update the model it starts with #39

mnaydan · 2024-12-13T14:52:08Z

work towards #39

cmroughan · 2025-01-17T02:49:16Z

I tested a range of train tasks submitted from the GUI. Assuming the Slurm train job completes successfully and produces a _best.mlmodel file, the results are:

Test	Success?	Job	Scratch / refine	Overwrite?	GUI
1	✅	transcription	refining	no	old
2	✅	segmentation	refining	no	old
3	✅	transcription	refining	yes	old
4	✅	segmentation	refining	yes	old
5	⚠️	transcription	refining	no	new
6	⚠️	segmentation	refining	no	new
7	⚠️	transcription	refining	yes	new
8	⚠️	segmentation	refining	yes	new

Tests 5-8 (refining train jobs submitted through the new GUI) expose a major bug: the model file selected for refinement is somehow deleted from eScr's files. The model object still exists in eScr and displays as if the file were present, but the file is gone, so any attempt to use or download that model raises a FileNotFoundError such as:

FileNotFoundError: [Errno 2] No such file or directory: '/mnt/nfs/cdh/htr/media/models/c26deef8/greek_print_math-11.mlmodel'

I have been able to replicate this with both transcription and segmentation train tasks, regardless of whether the "Overwrite" checkbox is checked. It only happens in the new GUI (see the toggle to switch between the two here). The deletion occurs at the end of the script's run, after the Slurm job has completed and the script sends the newly trained model back to eScr; before that point, the file still exists in the filesystem.

The deletions also occur regardless of whether the user owns the model: I saw models disappear that this account had only User permissions for.
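
To pin down exactly when a file disappears, it may help to check the model paths on disk before and after a train run. The following is a minimal sketch, not part of the training script: `find_missing_model_files` is a hypothetical helper, and the list of paths is assumed to be gathered by hand (for example, from the path in the error above).

```python
from pathlib import Path


def find_missing_model_files(model_paths):
    """Return the entries of model_paths that do not exist as files on disk.

    model_paths is an assumed, hand-collected list of model file paths;
    nothing here queries eScr itself.
    """
    return [str(p) for p in map(Path, model_paths) if not p.is_file()]
```

Running this once before submitting the job and again after the new model has been sent back to eScr would show which files vanished in between.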

cmroughan · 2025-01-17T02:50:57Z

Additionally, "Overwrite" behaves incorrectly in the new GUI. Running a train job with "Overwrite" checked in the new GUI creates a new model object and trains that, instead of training on top of the input model. See attached screenshot: "override-seg-newGUI" should not be a new item on this list; the training-in-progress icon should appear on "bnseg_complex2" instead. In the old GUI, this works as it should.

I wouldn't have expected the new GUI to affect the underlying submitted task, so I am not sure why this and the deletion bug above are happening. I still need to find the relevant eScr code.

rlskoeser · 2025-02-05T16:31:21Z

Closing based on testing and refinement from @cmroughan

mnaydan moved this to IceBox in Iteration Planning Board Dec 13, 2024

mnaydan added this to Iteration Planning Board Dec 13, 2024

mnaydan moved this from IceBox to To Do in Iteration Planning Board Jan 13, 2025

mnaydan assigned rlskoeser Jan 13, 2025

rlskoeser moved this from To Do to In Progress in Iteration Planning Board Jan 13, 2025

rlskoeser added a commit that referenced this issue Jan 14, 2025

Preliminary work to support updating eScriptorium model with best result

3efa78f

work towards #39

rlskoeser mentioned this issue Jan 14, 2025

Revise training script to update model #41

Merged

rlskoeser moved this from In Progress to Under Review in Iteration Planning Board Jan 16, 2025

rlskoeser moved this from Under Review to In Progress in Iteration Planning Board Jan 28, 2025

rlskoeser added a commit that referenced this issue Jan 28, 2025

Get best model by accuracy when not found by filename #39

20e8649

rlskoeser added a commit that referenced this issue Jan 28, 2025

Get best model by accuracy when not found by filename #39

bbf7250
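
For readers following along, the fallback named in this commit title could look roughly like the sketch below. This is not the committed code: `pick_best_model` and the `accuracies` mapping (file name to accuracy) are hypothetical, and how accuracy is actually obtained for each model file is not shown in this thread.

```python
from pathlib import Path


def pick_best_model(model_files, accuracies=None):
    """Pick the best model file from a training run.

    Prefer a file whose name ends in "_best.mlmodel". If there is none,
    fall back to the file with the highest value in accuracies, an
    assumed mapping of file name -> accuracy. Return None if neither
    approach yields a candidate.
    """
    files = [Path(f) for f in model_files]
    for f in files:
        if f.name.endswith("_best.mlmodel"):
            return f
    accuracies = accuracies or {}
    scored = [f for f in files if f.name in accuracies]
    if not scored:
        return None
    return max(scored, key=lambda f: accuracies[f.name])
```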

rlskoeser moved this from In Progress to Under Review in Iteration Planning Board Jan 29, 2025

rlskoeser added a commit that referenced this issue Feb 3, 2025

Don't check model parent file without ensuring model has a parent first

f46fdf4

error handling on #39

rlskoeser added a commit that referenced this issue Feb 3, 2025

Add flag and logic for update-if-improved

4156467

improvement for #39

rlskoeser added a commit that referenced this issue Feb 3, 2025

Use --update-if-improved option when overwriting existing model

2c29088

ref #39
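
As an illustration of what an update-if-improved check might do, here is a minimal sketch. The exact semantics of the --update-if-improved option are not spelled out in this thread, so the behavior below (always update when the flag is off, update only on a strictly higher accuracy when it is on) is an assumption, and `should_update_model` is a hypothetical name.

```python
def should_update_model(new_accuracy, current_accuracy, update_if_improved=False):
    """Decide whether the existing model should be replaced.

    Assumed semantics: without the flag, always update. With the flag,
    update only if the new accuracy is strictly higher than the current
    one, or if the current model has no recorded accuracy.
    """
    if not update_if_improved:
        return True
    if current_accuracy is None:
        return True
    return new_accuracy > current_accuracy
```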

rlskoeser mentioned this issue Feb 3, 2025

Revise script/task to refine logic for updating model #46

Merged

rlskoeser closed this as completed Feb 5, 2025

github-project-automation bot moved this from Under Review to Done in Iteration Planning Board Feb 5, 2025


testing and review (round 2)
