
idea: Margin of error for coverage thresholds #375

curita opened this issue Dec 22, 2022 · 5 comments

curita (Member) commented Dec 22, 2022

Background

FieldCoverageMonitor is configured via the SPIDERMON_FIELD_COVERAGE_RULES setting. This dictionary defines how much coverage we expect from each field, expressed as a float between 0 and 1, where 1 means 100% coverage.
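For reference, this is roughly how the setting looks in a project's settings.py (the field names and values below are illustrative):

```python
# Illustrative example of the existing setting; field names and values are made up.
# Keys follow the "<item type>/<field name>" pattern (e.g. "dict/..." for dict items).
SPIDERMON_FIELD_COVERAGE_RULES = {
    "dict/title": 1.0,   # title must be present in every item
    "dict/price": 0.95,  # price is expected in at least 95% of items
}
```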

These thresholds are sometimes too strict. For example, if we set a field's coverage threshold to 95% (0.95) and the actual coverage ends up at 94.9%, we'll get an alert from Spidermon. Such near-miss coverage drops aren't usually an issue, and devs tend to ignore them, but they still trigger alerts that get mixed in with more severe errors.

Proposal

We could allow a certain margin of error or tolerance over these thresholds: say, a ±5% allowance, which wouldn't trigger an alert in the case described above.

This could be configured via a general setting applied to all fields, though a field-level tolerance could also be supported.
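As a rough sketch of the idea (SPIDERMON_FIELD_COVERAGE_TOLERANCE is a hypothetical name, not an existing Spidermon setting), the check would essentially become:

```python
# Hypothetical sketch of the proposed check; the tolerance setting does not
# exist in Spidermon today and the values are illustrative.
SPIDERMON_FIELD_COVERAGE_RULES = {"dict/price": 0.95}
SPIDERMON_FIELD_COVERAGE_TOLERANCE = 0.05  # global ±5% allowance

def coverage_is_acceptable(actual: float, expected: float, tolerance: float) -> bool:
    # With a 5-point tolerance, 94.9% actual coverage against a 95% threshold
    # would no longer trigger an alert.
    return actual >= expected - tolerance
```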

Alternatives

We've discussed one alternative: simply lowering the existing thresholds. This should work, but in practice it is sometimes hard to calculate precisely how much each threshold should be reduced, and jobs with high coverage variability require constant updates. A new tolerance setting might also need occasional updates, but it could give more flexibility.

Another solution could be to use past jobs to calculate the expected thresholds dynamically. For example, in one of our projects, we don't have fixed coverage thresholds, but thresholds for how much we allow the coverage to drop between jobs. However, this is an entirely different approach compared to FieldCoverageMonitor (it depends on ScrapyCloud, and FieldCoverageMonitor doesn't), so this could co-exist as another monitor.
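Very roughly, and only as a sketch (this assumes python-scrapinghub, that the previous job's crawl stats are exposed in its metadata under "scrapystats", and an illustrative allowed-drop value), the between-jobs comparison could look like:

```python
# Rough sketch, not production code: compare the current run's field coverage
# against the previous finished job on Scrapy Cloud. The stat key format and
# the ALLOWED_DROP value are illustrative.
from scrapinghub import ScrapinghubClient

ALLOWED_DROP = 0.05  # tolerate up to a 5-point coverage drop between jobs

def previous_job_coverage(apikey, project_id, spider_name, field):
    client = ScrapinghubClient(apikey)
    project = client.get_project(project_id)
    # Take the most recent finished job for this spider.
    last = next(project.jobs.iter(spider=spider_name, state="finished", count=1))
    stats = client.get_job(last["key"]).metadata.get("scrapystats") or {}
    return stats.get(f"spidermon_field_coverage/{field}", 0)

def coverage_dropped_too_much(current, previous):
    return current < previous - ALLOWED_DROP
```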

Let me know what you think! We've been discussing this in the monitoring chapter at Zyte and thought this could be beneficial for a lot of existing projects. /cc @Chandral @mrwbarg

Gallaecio (Member):

If the result is the same, i.e. an alert, I would rather lower 0.95 to 0.9 than add a new setting that effectively does the same.

I would understand if we wanted to have different levels of alerts. If we want to make 0.9-0.94 a warning and 0.89 and lower an error, then a second threshold makes sense to me, although it could be implemented as an absolute value (e.g. 0.9) rather than a relative one (0.05).
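Purely to illustrate that shape (neither setting below exists in Spidermon; the built-in monitor is configured via SPIDERMON_FIELD_COVERAGE_RULES, as described above), the two levels could be expressed as absolute values:

```python
# Hypothetical two-level configuration, with both thresholds as absolute values;
# both setting names are made up for illustration.
SPIDERMON_FIELD_COVERAGE_WARNING_RULES = {"dict/price": 0.95}  # warn below 95%
SPIDERMON_FIELD_COVERAGE_ERROR_RULES = {"dict/price": 0.90}    # error below 90%
```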

VMRuiz (Collaborator) commented Dec 23, 2022

> I would rather lower 0.95 to 0.9 than add a new setting that effectively does the same.

This is what I did on my jobs.

> We've discussed one alternative: simply lowering the existing thresholds. This should work, but in practice it is sometimes hard to calculate precisely how much each threshold should be reduced

You still have to calculate this even if you reduce it to a single variable.

> and jobs with high coverage variability require constant updates. A new tolerance setting might also need occasional updates, but it could give more flexibility.

I don't agree with this approach. If the job has high variability, the lower tolerance value should be calculated before the job goes into production. If the variability starts after it is already in production, then the monitors need to be triggered to alert about the change, so developers can evaluate the new situation and decide whether the previous tolerances are still valid or if they need to figure out new ones.

> Another solution could be to use past jobs to calculate the expected thresholds dynamically. For example, in one of our projects, we don't have fixed coverage thresholds, but thresholds for how much we allow the coverage to drop between jobs

This is a dangerous approach, as it would allow for a slow degradation over time that is not picked up by any monitor.

curita (Member, Author) commented Dec 26, 2022

Thanks for the replies!

Maybe we could have different severity levels for those alerts, and developers could set how each severity level is forwarded to them. But I imagine that would require careful consideration of how to implement it so monitors can support them, as well as some analysis of which other cases could benefit from it. I like that idea, though!

I wonder if we could approach this threshold problem another way? Maybe we could develop some tools to help calculate and set those thresholds more easily, as it seems to be a common problem among internal projects.

> This is a dangerous approach, as it would allow for a slow degradation over time that is not picked up by any monitor.

There are ways to mitigate this, like comparing against averages from different timeframes (last job, last month, last year). It does indeed open the door to missing gradual degradations, but maybe that's a risk some projects are fine taking. I wouldn't use that approach to replace the current coverage one, but it might be a viable alternative offered in parallel to it.

VMRuiz (Collaborator) commented May 7, 2024

I think this can be solved by using a custom monitor that inherits from the default coverage monitor. Different Monitor suites can take care of the different alerting levels.
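A rough sketch of that custom-monitor route, assuming the usual Spidermon monitor attributes (self.data.crawler, self.data.stats behaving like a dict) and the spidermon_field_coverage/... stat keys; the tolerance setting name is made up:

```python
from spidermon import Monitor, monitors


@monitors.name("Field coverage with tolerance")
class TolerantFieldCoverageMonitor(Monitor):
    """Sketch only: re-check the collected coverage stats against the
    configured rules, allowing a project-wide margin. The setting name
    SPIDERMON_FIELD_COVERAGE_TOLERANCE is hypothetical."""

    @monitors.name("Field coverage is within the tolerated margin")
    def test_field_coverage_within_tolerance(self):
        settings = self.data.crawler.settings
        rules = settings.getdict("SPIDERMON_FIELD_COVERAGE_RULES")
        tolerance = settings.getfloat("SPIDERMON_FIELD_COVERAGE_TOLERANCE", 0.0)
        failures = []
        for field, expected in rules.items():
            actual = self.data.stats.get(f"spidermon_field_coverage/{field}", 0)
            if actual < expected - tolerance:
                failures.append(
                    f"{field}: coverage {actual:.2%} is below {expected:.2%} "
                    f"even with a {tolerance:.2%} margin"
                )
        self.assertFalse(failures, msg="; ".join(failures))
```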

I will close this issue, as I don't think we actually need to change Spidermon's core just for this; please reopen if you think otherwise.

VMRuiz closed this as completed May 7, 2024

curita (Member, Author) commented Jan 8, 2025

Reopening this ticket after our discussions with Victor: users might find this setting helpful when they have several coverage thresholds and don't want to update them one by one every time they need to adjust the tolerance margin.

curita reopened this Jan 8, 2025