
minor changes
abhishek-ghose committed Sep 26, 2024
1 parent 3777f61 commit 56036ea
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions inactive_learning.html
Expand Up @@ -245,14 +245,14 @@ <h2 id="here-be-dragons">Here be Dragons</h2>
</li>
<li>Related to the above point: I have heard arguments that performing model selection or calibration is too time-consuming. First, that should not be an excuse to report volatile numbers. Second, we can’t solve problems we are not aware of, and if selection/calibration is a deal-breaker for AL research, let’s let that be widely known so that someone out there might take a crack at it. Maybe that will renew focus on efforts like the <em>Swiss Army Infinitesimal Jackknife</em> <a class="citation" href="#pmlr-v89-giordano19a">(Giordano et al., 2019)</a>.</li>
<li>
<p>Some AL algorithms have fine-tunable hyperparameters. These are impossible to use in practice. We are in a setup where labeled data is non-existent - what are these hyperparams supposed to be fine-tuned against? And remember that at each iteration you’re picking one batch of points, which implies the hyperparams are held fixed to some values at an iteration; so, over how many iterations should this fine-tuning occur, and how do we stabilize this process given that the number of data points differs across iterations? These questions are typically not addressed in the literature.</p>
<p>Some AL algorithms have fine-tunable hyperparameters. These are impossible to use in practice. We are in a setup where labeled data is non-existent - what are these hyperparams supposed to be fine-tuned against? And remember that at each iteration you’re picking one batch of points, which implies the hyperparams are held fixed to some values at the iteration; so, over how many iterations should this fine-tuning occur, and how do we stabilize this process given that the number of labeled data points differs across iterations? These questions are typically not addressed in the literature.</p>

<p>AL hyperparams are like <em>existence proofs</em> in mathematics - “we know for some value of these hyperparams our algorithm knocks it out of the park!” - as opposed to <em>constructive proofs</em> - “Ah! But we don’t know how to get to that value…”.</p>
</li>
<li>Lack of experiment standards: it’s hard to compare AL techniques across papers because there is no standard for setting batch or seed sizes or even the labeling budget (the final number of labeled points). These widely vary across papers, and sadly, they heavily influence performance, especially when you want to pick an algorithm for actual use.</li>
<li>Lack of experiment standards: it’s hard to compare AL techniques across papers because there is no standard for setting batch or seed sizes or even the labeling budget (the final number of labeled points). These vary <strong>wildly</strong> across papers (for an idea, take a look at Table 4 in the paper), and sadly, they heavily influence performance.</li>
</ul>
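To make the experiment knobs mentioned above concrete - seed size, batch size, and labeling budget - here is a minimal pool-based uncertainty-sampling loop. This is an illustrative sketch only, not code from the paper; all names and values are invented, and the point is precisely that papers disagree on what these values should be.

```python
# Minimal pool-based active learning loop (illustrative sketch).
# SEED_SIZE, BATCH_SIZE, and BUDGET are the knobs that vary wildly across papers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

SEED_SIZE, BATCH_SIZE, BUDGET = 20, 10, 100  # arbitrary choices - no standard exists

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
rng = np.random.default_rng(0)

# Seed set: a small random labeled sample; everything else is the unlabeled pool.
labeled = list(rng.choice(len(X), size=SEED_SIZE, replace=False))
pool = [i for i in range(len(X)) if i not in set(labeled)]

while len(labeled) < BUDGET:
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the points whose top-class probability is lowest.
    top_prob = clf.predict_proba(X[pool]).max(axis=1)
    batch = np.argsort(top_prob)[:BATCH_SIZE]
    chosen = [pool[i] for i in batch]
    labeled.extend(chosen)  # in practice, an oracle would label these points
    pool = [i for i in pool if i not in set(chosen)]

print(len(labeled))  # the loop stops exactly at the labeling budget
```

Note how the final model depends on all three knobs at once: with a different seed size or batch size, a different sequence of batches is queried, which is one reason results are hard to compare across papers.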

<p>I hope this post doesn’t convey the impression that I hate AL. But yes, it can be frustrating :-) I still think it’s a worthy problem, and I often read papers from the area. In fact, we have an ICML workshop paper involving AL from earlier <a class="citation" href="#XAI_human_in_the_loop">(Nguyen &amp; Ghose, 2023)</a>. All we are saying is that it is time to scrutinize the various practical aspects of AL. The paper is accompanied by a <a href="https://github.com/ThuongTNguyen/active_learning_comparisons">library we’re open-sourcing</a> (still polishing things up) - which will hopefully make good benchmarking convenient.</p>
<p>I hope this post doesn’t convey the impression that I hate AL. But yes, it can be frustrating :-) I still think it’s a worthy problem, and I often read papers from the area. In fact, we have an ICML workshop paper involving AL from earlier <a class="citation" href="#XAI_human_in_the_loop">(Nguyen &amp; Ghose, 2023)</a>. All we are saying is that it is time to scrutinize the various practical aspects of AL. Our paper is accompanied by a <a href="https://github.com/ThuongTNguyen/active_learning_comparisons">library that we’re releasing</a> (still polishing things up) - which will hopefully make good benchmarking convenient.</p>

<h2 id="acknowledgements">Acknowledgements</h2>

