Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling NA input values #34

Closed
ChrisMuir opened this issue Sep 23, 2018 · 6 comments
Closed

Handling NA input values #34

ChrisMuir opened this issue Sep 23, 2018 · 6 comments
Milestone

Comments

@ChrisMuir
Copy link
Contributor

I ran into a few instances of encode() and decode() handling NA inputs in interesting ways. Here are some examples:

decode()

googlePolylines::decode(NA_character_)
#> [[1]]
#>      lat   lon
#> 1 -8e-05 1e-05

Should this return NA_real_, or a data frame containing two NA values? Either could be done within the Rcpp function rcpp_decode_polyline() without much trouble or additional code. I'd be happy to submit a PR for this if you'd like.

encode()

Example 1:
NA inputs round-tripped through encode/decode return numeric lat/lon coordinates.

df <- data.frame(lat = c(NA_real_), lon = c(NA_real_))

(res <- googlePolylines::encode(df))
#> [1] ">>"

googlePolylines::decode(res)
#> [[1]]
#>        lat      lon
#> 1 -0.00016 -0.00016

Example 2:
Data frame with four observations round-tripped through encode/decode returns a data frame with only three observations.

df <- data.frame(lat = c(38.5, 40.7, NA_real_, 43.252), 
                 lon = c(-120.2, -120.95, NA_real_, -126.453))

(res <- googlePolylines::encode(df))
#> [1] "_p~iF~ps|U_ulLnnqC_\016\x9e\xd7"

googlePolylines::decode(res)
#> [[1]]
#>       lat      lon
#> 1 38.5000 -120.200
#> 2 40.7000 -120.950
#> 3 40.7024 -120.954
@SymbolixAU
Copy link
Collaborator

Thanks for the examples!

decode

Should this return NA_real_, or a data frame containing two NA values

My instinct tells me a data.frame with two NA values, to keep the structure consistent.
Which do you think is the better option?

I'd be happy to submit a PR for this if you'd like.

yes please :)

encode

Yes these cases need to be handled. My first thoughts are to either

  1. Issue an Error if any NAs are found, to force the issue back to the user
  2. Issue a warning if any NAs are found, but exclude them from the encoding

What do you think to these options?

@ChrisMuir
Copy link
Contributor Author

ChrisMuir commented Sep 23, 2018

So full disclaimer, I actually have never used googlePolylines, so I don't have a strong take what makes the most sense on how to handle these cases. I was just poking around this repo after seeing the cool stuff being done with mapdeck on rstats Twitter, and seeing that you were planning on trying to migrate some of the code to Rcpp.

Having said that, for the decode example, I agree that returning a data frame is probably better. Looking at the list output, I could see wanting to rbind all of the results together via do.call, so a data frame with the same column headers seems wise. I'll submit a PR for that fix tomorrow.

Also, I was playing around with the decode Rcpp functions, and modifyed it so that it's not having to create a new Rcpp::DataFrame for every input string......I currently have a ~15x speed up for decode() with char vector input, and all the tests are passing. Still want to work with it some more, but are you interested in a performance/refactor PR as well?

@SymbolixAU
Copy link
Collaborator

but are you interested in a performance/refactor PR as well?

absolutely! Always keen to see & learn from others.

@SymbolixAU SymbolixAU self-assigned this Nov 22, 2018
@SymbolixAU SymbolixAU added this to the v0.7.2 milestone Nov 22, 2018
@SymbolixAU
Copy link
Collaborator

@ChrisMuir did you look at the encode examples while you were writing your PR?

If not I might move them into a separate issue and make them part of the v0.7.3 milestone

@ChrisMuir
Copy link
Contributor Author

Nope, I addressed the decode() example, and made some minor speed refactors, but did not address the encode() examples.

@SymbolixAU
Copy link
Collaborator

Moved the encode() examples to a separate issue - #39

@SymbolixAU SymbolixAU removed their assignment Jan 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant