check precision of site coordinates #229
Comments
I'll try it out today and get back to you if I have any issues. Thanks very much for the written instructions! |
@Troger4 , it's important to check that the coordinates match what is given in the paper. In the case of Chao_2009_atdq, the paper gave coordinates to the minute, but the values entered were less precise. In cases like this, please correct the values in the spreadsheet (using the deg-min-sec to decimal degree conversion given in the instructions). I've fixed the Chao coordinates.
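For reference, a minimal Python sketch of the standard degrees-minutes-seconds to decimal degrees conversion (degrees + minutes/60 + seconds/3600). The exact wording of the conversion in the instructions is not shown in this thread, so the function name `dms_to_decimal` and its `hemisphere` argument are illustrative assumptions only.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees-minutes-seconds to decimal degrees.

    Hypothetical helper: pass unsigned values and give the hemisphere
    as "N", "S", "E" or "W". South and west are returned as negative.
    """
    if degrees < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("expected unsigned degrees, and minutes/seconds in [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


# A coordinate reported to the minute, e.g. 30 degrees 40 minutes N,
# becomes roughly 30.6666667 in decimal degrees.
print(round(dms_to_decimal(30, 40), 7))
```

A coordinate given only to the minute still converts to a long decimal, which is why the number of decimals alone does not indicate the original precision.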
@ValentineHerr , it would be helpful if you could write a script (quick, I think) to flag low-precision coordinates for review. We want to identify records at the poorest level of precision:
What I envision is a script that looks at |
Here is an example for each case we have (where precision is "NAC" and we have coordinates). The first 2 columns are the coordinates in SITES; the last 2 columns are the number of decimals: To confirm what you are asking, I'll enter "degree" in column |
Well, it's not quite so simple as looking at n decimals, because sometimes a very rough coordinate could have high n decimals. For example, if coordinates are rounded to the nearest 10 minutes, you could get repeating values after the decimal (e.g., 30°40' --> 30.6666667). I guess the rule I'd apply is: min(ndec_lat, ndec_lon) <= 1 OR decimal portion = .25, .75 OR decimal portion = .167 (1/6), .33 (1/3), .67 (2/3), or .83 (5/6), where there could be any number of repeats in the 6 or 3 (e.g., we'd count .33, .333, .3333, .33333, etc.). |
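A rough Python sketch of the rule described above, not the script that was actually written for ForC. It assumes coordinates are read as text (so the number of decimals is exactly what was entered) and that the decimal-portion tests apply if either coordinate matches; the names `frac_digits` and `is_low_precision` are hypothetical.

```python
import re

# Decimal portions left by rounding to the nearest 10 minutes, allowing any
# number of repeats: 1/6 -> 167, 1667...; 1/3 -> 33, 333...;
# 2/3 -> 67, 667...; 5/6 -> 83, 833...
REPEATING = re.compile(r"16+7|3{2,}|6+7|83+")
QUARTERS = {"25", "75"}


def frac_digits(coord: str) -> str:
    """Digits after the decimal point, exactly as entered ('' if none)."""
    return coord.strip().partition(".")[2]


def is_low_precision(lat: str, lon: str) -> bool:
    """Flag a site when min(ndec_lat, ndec_lon) <= 1, or when a decimal
    portion is .25/.75 or one of the repeating 1/6, 1/3, 2/3, 5/6 forms."""
    fracs = [frac_digits(lat), frac_digits(lon)]
    if min(len(f) for f in fracs) <= 1:
        return True
    return any(f in QUARTERS or REPEATING.fullmatch(f) for f in fracs)


print(is_low_precision("30.6666667", "114.1234"))  # 30 deg 40 min in disguise
print(is_low_precision("45.1234", "-73.5678"))
```

Matching on the entered text rather than on floating-point values avoids losing trailing digits when the spreadsheet value is parsed as a number.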
Ok, I think I got it... but does it seem right that this would be 1788 sites? Also, I'll be adding "(decimal)" so that @Troger4 can change it to "decimal" once she has checked, right? see this comment |
Yikes, that's a lot--roughly 1/3! I didn't expect the number to be that high. If that's the case, maybe make two categories:
1. "(degree)": min(ndec_lat, ndec_lon) = 0 OR decimal portion = .25, 0.5, .75
2. "(minutes rounded)": doesn't meet criteria above AND min(ndec_lat, ndec_lon) <= 1 OR decimal portion = .25, .75 OR decimal portion = .167 (1/6), .33 (1/3), .67 (2/3), or .83 (5/6), where there could be any number of repeats in the 6 or 3 (e.g., we'd count .33, .333, .3333, .33333, etc.). |
This doesn't make sense, unless you meant "(degree)". If that's what you meant, the answer is yes. |
Yes, sorry, that is what I meant. And OK for the rest; I'll make the change. |
Okay. Note that I just made a couple small edits above (so use that, not the email). |
did you really mean |
Yeah, I'd like to separate out the ones rounded to the nearest degree or quarter degree from those rounded to the nearest ten minutes or tenth of a degree. The former could easily fall in the wrong |
ok, this part should be done |
@Troger4 , I added a column at the end of the sites spreadsheet flagging sites with coordinates that have no climate data, generally because they fall in the ocean (see issue #233 ). These are high priority for review. Based on the names, I can tell that at least some are coastal, and so the coordinates may be more-or-less correct, just slightly missing what the climate database considers land. But I also noticed that at least some were flagged by Valentine's script as low precision. |
@teixeirak, what should happen to the orange ones (when only one coordinate meets the criteria), then? Leave NAC? |
Correct. My interpretation of those is that a trailing zero on one coordinate was dropped (so, for example, lat on the first one would be 56.60). |
So I need to change the statements above to the following (note that 1. "(degree)"( 2. "(minutes rounded)"doesn't meet criteria above AND decimal portion of lat AND lon = .25, .75 (can't work with first statement) OR decimal portion lat AND lon = .167 (1/6), .33 (1/3), .67 (2/3), or .83 (5/6), where there could be any number of repeats in the 6 or 3 (e.g., we'd count .33, .333, .3333, .33333, etc.). )) |
The one modification to the above is that the two coordinates don't have to meet the same criteria. For example, for degree, So, ( And similarly for minutes rounded. |
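One way to read the rule as it stands after this exchange, sketched in Python. This is an interpretation, not the script that was pushed: both coordinates must meet some criterion of a category (not necessarily the same one), and sites where only one coordinate qualifies stay "NAC". The parts of the comments that are cut off above are not reconstructed, and all function names are hypothetical.

```python
import re

# Repeating decimal portions for 1/6, 1/3, 2/3 and 5/6 of a degree.
REPEATING = re.compile(r"16+7|3{2,}|6+7|83+")


def frac_digits(coord: str) -> str:
    """Digits after the decimal point, exactly as entered ('' if none)."""
    return coord.strip().partition(".")[2]


def meets_degree(coord: str) -> bool:
    """No decimals, or a decimal portion of .25, .5 or .75."""
    return frac_digits(coord) in {"", "25", "5", "75"}


def meets_minutes_rounded(coord: str) -> bool:
    """At most one decimal, .25/.75, or a repeating 1/6, 1/3, 2/3, 5/6 form."""
    f = frac_digits(coord)
    return len(f) <= 1 or f in {"25", "75"} or REPEATING.fullmatch(f) is not None


def classify(lat: str, lon: str) -> str:
    """Return the provisional precision label for one site."""
    if meets_degree(lat) and meets_degree(lon):
        return "(degree)"
    if meets_minutes_rounded(lat) and meets_minutes_rounded(lon):
        return "(minutes rounded)"
    return "NAC"  # includes cases where only one coordinate qualifies


print(classify("30", "114.25"))
print(classify("30.6666667", "114.3"))
print(classify("56.6", "12.345"))
```

Checking "(degree)" first keeps the two categories mutually exclusive, which matches the "doesn't meet criteria above" condition on the second category.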
ok, thanks! |
I pushed a fixed version. |
Thanks @ValentineHerr ! @Troger4 , please be sure to pull the latest version before working on this again. |
@Troger4 , also a couple clarifications:
|
Okay, I'll fix that. And yes exactly, I've been assigning based on whichever is less precise. Also, I am unable to access the original PDFs for many of the records in ForC_sites, but I can access the publications from which the info was loaded. These seem to be studies in which data from hundreds of sites were used, so each is obviously not the original publication. Is it alright to confirm coordinate data using the secondary source if I am not able to access the original, or should I leave it if I cannot access the original? Thanks! |
Thanks, @Troger4 ! For the site info loaded from papers that were not the original study, please enter the precision as reported in the paper you're looking at, AND put a "1" in the field |
Will do, thanks! |
Luyssaert_2007_cbob https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1365-2486.2007.01439.x |
I checked the Luyssaert database. Coordinates are reported in decimal degrees to 2 decimal points, although there seem to be a few with lower precision (presumably because original pub lacks that info). |
I see. How did you access it? Is the database something I can download and add to our references? Would you like me to fill "decimal degrees to 2 decimal points" for all our Luyssaert records? Most of the records from Luyssaert have at least 3 decimal points currently. Apologies for all the questions! |
We have a copy here (file: |
Thanks! I quickly double checked the coordinate data between the two databases so all of our Luyssaert_2007 (degree) records have been resolved now. |
I forgot to add a column label for sites with coordinates flagged as suspect-- fixed now. @Troger4 , the sites flagged "1" in `coordinates.suspect` are suspect because they lack data in a climate database, generally because they fall in the ocean. It would be very helpful if you could review these. You may have already got a few. If the coordinates are reported accurately, please record the precision and add to geography.notes "Coordinates fall in ocean according to Worldclim2 database."
@Troger4 , here are instructions for checking the precision of site coordinates. Please try these out and let me know if they make sense to you. If you have any doubts, please ask before proceeding.
`coordinates.precision` (K in Excel)
`NAC` in this field, then find the study/studies with records at this site in `measurement.refs`. There may be additional info on the site in `site.refs`. It would also be valuable to check the original source of anything with `(minutes)` in this field, as that indicates that current precision is very low (and hopefully the original source has something better).
Open reference (you already know how to link citation.ID to pdf). If there's more than one, it may be helpful to look at more than one, but don't go overboard if there are tons of references.
converting degrees-minutes-seconds to decimal degrees:
`coordinates.precision`, enter the precision reported in the original pub (which should now match what's in ForC). Please enter exactly one of the following (parts in bold):
`measurement.refs` field for the citation.ID you're looking at.