Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Florabank query #49

Merged
merged 21 commits into from
Jul 3, 2019
Merged

Florabank query #49

merged 21 commits into from
Jul 3, 2019

Conversation

hansvancalster
Copy link
Contributor

@hansvancalster hansvancalster commented Jun 11, 2019

I propose to add 3 functions to inborutils that facilitate some often required queries to the florabank.
See also this issue in inbo/tutorials

R/florabank_queries.R Outdated Show resolved Hide resolved
@hansvancalster
Copy link
Contributor Author

Anyone know how to get rid of this note (after running Build check)?

> checking R code for possible problems ... NOTE
  florabank_traits: no visible global function definition for '%LIKE%'
  Undefined global functions or variables:
    %LIKE%

It happens when using an infix SQL function within a dbplyr filter statement. See dbplyr vignette section on infix functions

@hansvancalster
Copy link
Contributor Author

The issue about the note can be solved as explained here

@hansvancalster
Copy link
Contributor Author

Question: the functions possibly return lots of records (especially florabank_taxon_ifbl_year() can be 10^6). Should we actively return them (= current implementation), or only return a pointer to the query (lazy data)? This can be done by removing the final collect statement from the function using dbplyr syntax. However, for the functions using dbGetQuery I don't know a simple solution to do this.

@damianooldoni
Copy link
Member

Maybe by using commandos dbBegin and dbExecute avoiding to commit (dBCommit) the query (same as collect() of dbplyr)?
https://github.com/trias-project/occ-processing/blob/master/src/belgium/assign_grid.Rmd#L122

But, the question is: do people really need to work locally on these data? Would it be an option to inform them about number of records they are going to get before to collect the data? A message like:

You are going to import a data.frame with x rows y columns. 
Press Enter to continue or any other key to abort.

@hansvancalster
Copy link
Contributor Author

Thanks! I'll have a look at the documentation of these DBI functions. I'd rather not let the function be interactive with the user (not suitable for use in markdown knitted documents).

@ThierryO
Copy link
Contributor

dbBegin and dbCommit are intended for queries that alter the database. Because you can dbRollback these changes to the status stored by dbBegin untill you dbCommit the changes.

dbplyr automatically replaces SELECT by SELECT TOP 10, unless collect() is specified.

R/florabank_queries.R Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
R/florabank_queries.R Outdated Show resolved Hide resolved
@hansvancalster
Copy link
Contributor Author

@ThierryO thanks for the review! Most of the comments are resolved. For some I need to take a closer look.

Copy link
Member

@florisvdh florisvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to be learning from this code + reviewing @hansvancalster, @ThierryO , cool.

R/florabank_queries.R Outdated Show resolved Hide resolved
@hansvancalster hansvancalster requested a review from ThierryO June 28, 2019 11:19
@hansvancalster
Copy link
Contributor Author

@ThierryO I think all your comments are addressed. Can you approve?

@hansvancalster hansvancalster merged commit c36ca04 into master Jul 3, 2019
@hansvancalster hansvancalster deleted the florabank_query branch July 3, 2019 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants