closes #6556 [feature request] diagnostic for merge.data.table when by = key is not present in dt being merged #6691

Closed
wants to merge 2 commits into from

Conversation

@spiddy1204 spiddy1204 commented Dec 25, 2024

Closes #6556

which is not helpful as the user would then have to debug which key is not present in which dt. A more informative error would list all keys not present in all dts. For example:

combined = test |>
  left_join(manual, by = c('iso3c', 'year'))
Error in left_join():
! Join columns in y must be present in the data.
✖ Problem with iso3c.

I made some changes to make the error message more informative.

@spiddy1204
Author

@MichaelChirico, can you please approve the workflow?

@MichaelChirico
Member

MichaelChirico commented Dec 25, 2024

FYI GitHub does not automatically recognize 'Closes #XXX' in issue/PR titles, it has to be linked in the description; edited accordingly

FWIW, you should also describe the PR in the title in a more self-contained way -- imagine coming back to this PR in 10 years' time. The context of the issue/bug should be long-gone -- rather than having to do a deep-dive to understand the PR, it's better to have a quick blurb giving context & referring to the issue for more context as needed

codecov bot commented Dec 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.62%. Comparing base (3b2812b) to head (8ba2483).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6691   +/-   ##
=======================================
  Coverage   98.61%   98.62%           
=======================================
  Files          79       79           
  Lines       14559    14566    +7     
=======================================
+ Hits        14358    14365    +7     
  Misses        201      201           


@spiddy1204 spiddy1204 changed the title closes #6556 closes #6556 [feature request] diagnostic for merge.data.table when by = key is not present in dt being merged Dec 25, 2024
@spiddy1204
Author

spiddy1204 commented Dec 25, 2024

I changed it. Can you guide me a little? lintr reports an error every time I change the code, and it was failing from the start, even before I introduced any changes.
The same goes for "atime performance tests / comment (pull_request)"; I have not been able to figure out either failure.

Can you please suggest what might be the cause here? @MichaelChirico

@venom1204
Contributor

@aitap, can you please guide me on what is going wrong here? All the tests pass when I run them locally, but two checks fail on the pull request. What could be the reason?
Kindly look at PR #6691 by spiddy1204.

Contributor

@aitap aitap left a comment

It looks like the atime check was expecting to find the issue1 branch in the rdatatable/data.table repository, not your own fork, so it is not a problem with the code you are suggesting to change.

# Identify which keys are missing from each data table
missing_x = setdiff(by, nm_x)
missing_y = setdiff(by, nm_y)

Contributor

@venom1204, do you see a "check warning" from the "lint-r" check around this line when you visit the files changed tab of the pull request? It is asking you to remove the spaces on line 58. Since it doesn't contain any text, it shouldn't contain any spaces either.
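
To find such lines locally before pushing, a base R check along these lines may help. This is only a sketch: `code_lines` is a made-up stand-in for the contents of the edited file (in practice it could come from `readLines()` on that file), and it does not replace the project's own lint-r configuration.

```r
# Hypothetical excerpt of the edited file; the second element is a
# whitespace-only line like the one flagged in the review.
code_lines = c(
  "missing_y = setdiff(by, nm_y)",
  "  ",
  "# Construct a more detailed error message"
)

# Indices of lines that end in spaces or tabs (this includes lines that
# consist of nothing but whitespace).
trailing_ws = which(grepl("[ \t]+$", code_lines))
print(trailing_ws)
```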


# Construct a more detailed error message
error_message = "Elements listed in `by` must be valid column names in x and y."

Contributor

Also remove the spaces at the beginning of line 61.

error_message = "Elements listed in `by` must be valid column names in x and y."

if (length(missing_x) > 0) {
error_message = paste(error_message, "\nMissing columns in 'x':", paste(missing_x, collapse = ", "))
Contributor

Here, lint-r is asking you to use toString(missing_x) instead of paste(missing_x, collapse = ", ").
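
For reference, a minimal sketch of why the two spellings are interchangeable here: for a character vector, base R's `toString()` joins the elements with `", "`. The column names below are made-up example values, not taken from the PR.

```r
# Hypothetical missing columns, for illustration only.
missing_x = c("iso3c", "year")

joined_paste = paste(missing_x, collapse = ", ")
joined_tostring = toString(missing_x)

# Both calls build the same comma-separated string.
print(identical(joined_paste, joined_tostring))
```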

error_message = paste(error_message, "\nMissing columns in 'x':", paste(missing_x, collapse = ", "))
}
if (length(missing_y) > 0) {
error_message = paste(error_message, "\nMissing columns in 'y':", paste(missing_y, collapse = ", "))
Contributor

Same as above.

if (length(missing_y) > 0) {
error_message = paste(error_message, "\nMissing columns in 'y':", paste(missing_y, collapse = ", "))
}

Contributor

Again some spaces at the beginning of the line.

}
# UPDATED PART ENDS HERE
Contributor

Since you are using a version control system, there is no need to delimit the updated part using comments. The difference between the original code and your change request can be reliably computed by Git itself.

}
# UPDATED PART ENDS HERE

Contributor

Again spaces on the otherwise empty line.

stopf("Elements listed in `by` must be valid column names in x and y")
by = unname(by)
by.x = by.y = by
# UPDATED PART STARTS HERE
Contributor

This comment will stop being helpful after the suggested change is accepted, and we have more reliable ways of finding out which parts of the code are updated.
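
Taken together, the review comments point towards something like the sketch below. It is not the code from this PR or from data.table: `check_by_cols` is a hypothetical standalone helper, and it calls base `stop()` rather than the `stopf()` seen in the diff, so that it runs without the package internals. It applies `toString()` as requested and drops the "UPDATED PART" marker comments.

```r
# Hypothetical helper for illustration; not part of data.table.
check_by_cols = function(by, nm_x, nm_y) {
  # Identify which keys are missing from each table
  missing_x = setdiff(by, nm_x)
  missing_y = setdiff(by, nm_y)
  if (!length(missing_x) && !length(missing_y)) return(invisible(TRUE))
  msg = "Elements listed in `by` must be valid column names in x and y."
  if (length(missing_x) > 0L) {
    msg = paste0(msg, "\nMissing columns in 'x': ", toString(missing_x))
  }
  if (length(missing_y) > 0L) {
    msg = paste0(msg, "\nMissing columns in 'y': ", toString(missing_y))
  }
  stop(msg, call. = FALSE)
}

# Usage with made-up column names: 'iso3c' is absent from y only.
res = tryCatch(
  check_by_cols(c("iso3c", "year"), c("iso3c", "year", "gdp"), c("year", "pop")),
  error = conditionMessage
)
cat(res, "\n")
```

The message names only the table that actually lacks a column, which is the behaviour the linked feature request asks for.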

@spiddy1204 spiddy1204 closed this Jan 2, 2025
Successfully merging this pull request may close these issues.

[feature request] diagnostic for merge.data.table when by = key is not present in dt being merged
4 participants