Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[covr.record_tests] segfault when tracing R6 active bindings to parquet data #573

Open
dgkf opened this issue Jul 10, 2024 · 1 comment
Open

Comments

@dgkf
Copy link
Contributor

dgkf commented Jul 10, 2024

Note

This is in code I introduced and only affects very specific scenarios when options(covr.record_tests) is enabled.
I plan to fix this

Example at dgkf/covr-r6-active-binding-test

git clone https://github.com/dgkf/covr-r6-active-binding-test
cd covr-r6-active-binding-test
R -e 'options(covr.record_tests = TRUE)' -e 'covr::package_coverage()'

Details

From what I can tell, there are a few key features of this package that culminate in this bug:

  • An R6 active binding is included in the test call stack
    • For reasons that are still unclear to me, the sys.calls() captured in update_current_test(), when capturing an active binding, includes the R function, not just the call. Critically, this function brings its calling environment.
  • When the R6 object environment includes a parquet dataset (I'm assuming due to the way arrow works with the in-memory representation of the data), this environment segfaults when serialized to disk.

Proposed solution

  • Will first try to drill into why this is grabbing a function. There might be a more holistic way to avoiding this altogether.
  • If there's no better way to avoid functions in the call stack, then before recording the call stack, if a frame contains a function, set its environment to emptyenv() so it can be serialized.
@dgkf
Copy link
Contributor Author

dgkf commented Jul 11, 2024

Just following up to say that I have a safeguard implemented in dgkf/covr@573-record-tests-arrow-segfault.

I can open a PR whenever you'd like, but I'd prefer to test it in some more real-world applications first to make sure there aren't any severe performance implications and that no necessary data gets thrown out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant