Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature request] split out functionality of the fetch function #175

Open
ha0ye opened this issue Feb 14, 2019 · 5 comments
Open

[feature request] split out functionality of the fetch function #175

ha0ye opened this issue Feb 14, 2019 · 5 comments

Comments

@ha0ye
Copy link
Collaborator

ha0ye commented Feb 14, 2019

The fetch() function seems to do 2 things:

  • download dataset(s) to a temporary directory
  • import dataset(s) into a name list

There are existing functions to separately download raw and formatted datasets, but no functions to handle data importing. It seems reasonable to also expose the importing functionality to users and then refactor fetch to be a combined 2-step function that goes through the download and import steps using a temporary directory.

@ethanwhite
Copy link
Member

Thanks for the suggestion @ha0ye!

This sounds like a good idea, but I'd like to understand the use case just a little more to help guide development. Is the idea that in some cases you might want to download the data once, but load it repeatedly?

If so do you envision wanting to handle that logic yourself (e.g.,):

if (datafiles %in% list_of_files){
  rdataretriever::load("dataset")
} else {
  rdataretriever::install_csv("dataset")
  rdataretriever::load("dataset")
}

Or would you prefer to have fetch handle this for you, so that fetch looks to see if the csv files have already been created and then loads them if they have and installs and loads if they haven't.

@ha0ye
Copy link
Collaborator Author

ha0ye commented Feb 16, 2019

Yep, that’s the use case I had in mind.

I think I lean towards expecting a user to handle the logic of installing and loading. You could modify fetch to have additional logic to check downloads, but with the name “fetch”, my default expectation is that it would always be downloading again.

@ethanwhite
Copy link
Member

Thanks! One follow up. What do you think about a third alternative: including this behavior in fetch, but as an optional flag:

rdataretriever::fetch("dataset", update_local_data = FALSE)

With the default for update_local_data set to TRUE.

@ha0ye
Copy link
Collaborator Author

ha0ye commented Feb 17, 2019

Sure, I think that could work!

@ethanwhite
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants