Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract common logic for detectStart and detectStop into a helper class #77

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lindapaiste
Copy link
Contributor

The logic for detectStart and detectStop is the same for 4 of the models, so I moved those common parts into a new Helper class ImageDetector. Then the ml5 functions handPose etc. return a new ImageDetector instance which internally uses an instance of the HandPose class.

Alternatively, we could define ImageDetector as a abstract class and use class HandPose extends ImageDetector. That would be less flexible, but probably easier to understand?

I'm marking this as a draft because there's a few things that I may want to change, mainly moving the this.ready into the ImageDetector. And also probably moving the p5 preload logic, but that's its own can of worms.

Comment on lines +11 to +14
* @property {(input: tf.Tensor3D) => Promise<Array>} detect - core detection method.
* the ImageDetector will call the `detect` method of the specific implementation.
* It should accept a TensorFlow tensor?? or an image??
* And return an array of detections??
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially thinking that we would handle the image to tensor conversion here in the ImageDetector and call the specific model with the tensor. But it seems like all of these models accept image or video and convert it themselves?

Comment on lines +8 to +9
* @property {Promise<boolean>} ready - lets the parent detector know that the
* specific implementation is ready.
Copy link
Contributor Author

@lindapaiste lindapaiste Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to change this to be a method (the this.loadModel method) rather than a property because a property doesn't work as nicely when using the @implements {SpecificDetectorImplementation} tag on the model class. Not sure if it's different in VSCode but JetBrains is warning me about the unused property and does not pick up that it implements anything.
image
It works with the detect method.
image

Comment on lines +33 to +37
/**
* @type {Promise<boolean>}
* TODO: do we need to handle onReady callbacks?
*/
this.ready = this.implementation.ready;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking I should handle the callCallback stuff here instead of in each model.

async detectLoop() {
// Make sure that both the model and the media are loaded before beginning.
await Promise.all([
this.ready,
Copy link
Contributor Author

@lindapaiste lindapaiste Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the current code is not waiting for this.loadModel() to finish so I added an additional await here, and used Promise.all to wait for the two things (model and media) in parallel rather than in series. Though the this.ready promise has already started, so series vs. parallel might not even make a difference?

@shiffman
Copy link
Member

Thank you @lindapaiste! I haven't had a chance to dig into this more yet, but @gohai and I were discussing this morning about how while it's important for the ml5.js library to follow best practices and organize its code in a scalable and efficient way, we also want to make sure we are prioritizing the ability for beginner students (who have never worked with an open source codebase) to contribute. This has its downsides of course, but it's ok for the project to have imperfect design patterns if it makes it easier for a new contributor to come in and make a modification without having to move around between many different files (even if there is redundant code). @gohai feel free to add any more thoughts!

One thing we pinponted that we definitely need help with is figuring out how to be compatible with all the quirks of p5.js preload() etc. That could be a great place to focus some of your time and energy if you are interested!!

@gohai
Copy link
Member

gohai commented Mar 1, 2024

Hi @lindapaiste - just adding to what Dan wrote with my own experiences: for most of my students, "programming" is largely putting function calls into some predefined setup and draw until they like what they see on the screen 😉 They learn some basic OOP in JavaScript, but that's roughly where their deepest introduction to programming ends (NYU being a liberal arts institution, and our areas being concerned media arts and design rather than software engineering). Still, we have some very talented and motivated students, and it'd be amazing if - similarly to how ml5 lowers the barrier to entry when it comes to exploring machine learning, we can also keep our barriers to contributing and getting involved with open source as low as it makes sense.

In advising some students who hacked on ml5 last summer, I got the sense that they were often time struggling with:

  • some more recent ECMAscript (ES6+) features they hadn't encountered before (perhaps worthwhile to be slightly more verbose if something can also be expressed in a more traditional way)
  • lots of indirection (they are used to being able to read code top to bottom, and since most of their learning has been with online code editors, they often time also aren't familiar with the features of modern IDEs that make working across many files more doable)
  • inheritance in OOP (more explicit helper functions that return values might be more familiar to them)

I don't think we need to change code to make it conform to this stone-age notion of JavaScript necessarily, but as Dan said, I think it's okay to look past some of the code-duplication and less-than-optimal use of language features 🙂

@lindapaiste
Copy link
Contributor Author

Pros

  • Ensures that behavior is consistent across all detector models. We won't have issues where different models work slightly differently because they were written by different people.
  • It's easier for contributors to edit methods like detectStart etc. because they only need to touch one file instead of 4+.
  • It's easier for us to add additional methods and properties like the proposed isDetecting() because changing one file will add this method to all detector models.
  • It's easier to add additional models in the future because there is much less logic and boilerplate which needs to be written in the model class. We only need a sort of "stub" with the core logic.
  • We can use this helper class for the common object detector (CocoSsd model, ml5.objectDetector() function) when we bring that model over to this repo.

Cons

  • Code for ml5.handPose is spread across two files, and this my be confusing for contributors who expect to see a detectStart method in the HandPose\index.js file.
  • Calling ml5.handPose returns an instance of ImageDetector rather than an instance of HandPose which may be confusing to contributors. (It doesn't matter to end users, IMO)

Neutrals

  • There is no change to the public API. The object returned from ml5.handPose() has the same properties and methods as before.
  • This is a very similar setup to what we have currently in the ImageClassifier, where Darknet and Doodlenet only have methods classify and load.
    • The difference is how we are exporting the public API to the user. We use ml5.imageClassifier('doodlenet') instead of ml5.doodleNet(). But here we are keeping ml5.handPose() instead of ml5.imageDetector('handpose').
  • Other models like ml5.neuralNetwork are already spread among many files (but that one definitely is confusing!).
  • Specific classes like HandPose must use the correct method name detect or else it won't work. I kind of see this as a plus because it enforces consistency.

Comment on lines +24 to +31
/**
* @param {SpecificDetectorImplementation} specificImplementation
*/
constructor(specificImplementation) {
/**
* @type {SpecificDetectorImplementation}
*/
this.implementation = specificImplementation;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gohai @shiffman After reading your comments I think that my terminology here might be over the top. I wanted to be really clear about what is what. But it's not going to translate if the person reading it doesn't understand the terminology and it sounds like your students will not be familiar with concepts and terms like interface vs implementation and concretion vs abstraction. Maybe it is better to keep it simple instead.

  /**
   * @param {DetectorModel} model
   */
  constructor(model) {
    /**
     * @type {DetectorModel}
     */
    this.model = model;

I'm also tempted to add a link to https://en.wikipedia.org/wiki/Dependency_injection in one of those JSDoc blocks! 😝

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would err on the side of keeping things simple for now, maybe we can open an issue for discussion around using "interfaces" in the codebase for later? For now, I would priortize the codebase being as friendly for beginners to contribute to even if it's not making use of the latest and greatest JS features / best practices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants