ensemble_stat RMSE weighting application #2623
-
https://github.com/dtcenter/MET/blob/main_v11.1/src/libcode/vx_statistics/ens_stats.cc Looking at this file, it seems that the individual weight for each point in an area is normalized by the sum of the weights in that area. I have not been able to independently replicate the results that MET gives for a single date/time/variable/level group using MET's weighting scheme. Removing the weighting (or, in the case of the following routine, replacing w with 1) lets me replicate MET's output for a run with grid weighting set to NONE in the ensemble_stat config file. However, I can only replicate the RMSE values written to the ECNT output file from ensemble_stat when I normalize the RMSE result by the number of points in the area. Perhaps that is happening somewhere else in the code and I'm missing it. Lines 295 through 314 in the referenced file:
Applying the weighting scheme on my end as written here drops the RMSE significantly and does not give a good match for what is in the ECNT files. Can someone break down exactly what MET's weighting scheme does and help me replicate it, so that my team is sure we're getting the correct results? I'll follow up this initial question with data and configs. These results have been tested on MET versions 9.1, 10.0.1, 11.1.0, and 12.1 (beta), and the numbers are exactly the same in all of them. I just need to replicate them, because our in-house statistical calculations definitely don't match and the difference is large: ~0.5 for the 12-hourly air temperature prediction at 850 mb.
Replies: 3 comments 2 replies
-
I have placed the file DAVIS_DATA.tar in the folder DAVIS_DATA, following the instructions in the "How to send us data" outline. It contains a forecast file and an obs file, an ensemble file list, configs, and output, along with a file reader.
-
Thanks for the data, Wesley. The developers who will have the best shot at answering this question are out right now, but we will answer your question as soon as possible.
-
Hi Wesley:
As Hank said, the developer who could best answer this question is currently out and will not be back until Tuesday at the earliest. I'm not certain that what I provide below will be useful, as I'm guessing you may have figured much of it out already. However, I'll list what I know here in the hope that it sheds some light on the calculation before he returns.
For RMSE, when you say "normalize the rmse results by the number of points in the area," do you mean divide by the number of grid points (i.e. matched pairs)? This is actually how ME, RMSE, and MAE are computed, although the formulas make it look like this is not happening. The division is not explicit in the formula, but is handled through the weights.
Specifically, when the weight for each grid point (pd.wgt_dp) is set to one, the w in the formulas for obar, fbar, etc. is actually pd.wgt_dp/sum(pd.wgt_dp), as seen on line 302. So if all the weights are set to 1, the w in the formula is not actually 1, but rather 1/sum(weights). Similarly, the w calculated using cos(lat) is not the actual value of w in the equations; it is first divided by the sum of all weights. I'm guessing you already have this in your calculations, but I wanted to mention it, since it's easy to miss.
For additional information, the calculation of weights is performed in this code, which uses cosd and xy_to_latlon.
Does this help at all?
Christina
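To make the normalization described above concrete, here is a minimal Python sketch, not MET's actual code. It assumes the weighted RMSE has the form sqrt(sum(w_i * (f_i - o_i)^2)) with w_i = wgt_i / sum(wgt). The function names and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

def normalized_weights(raw_wgt):
    """Divide each raw weight by the sum of all weights, so they sum to 1."""
    raw_wgt = np.asarray(raw_wgt, dtype=float)
    return raw_wgt / raw_wgt.sum()

def weighted_rmse(fcst, obs, raw_wgt):
    """RMSE using normalized weights; no separate division by N is needed."""
    w = normalized_weights(raw_wgt)
    err = np.asarray(fcst, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.sum(w * err ** 2)))

def cos_lat_weights(lat_deg):
    """Raw cos(latitude) weights, before normalization (latitudes in degrees)."""
    return np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))

# Hypothetical matched pairs (not taken from DAVIS_DATA.tar).
fcst = np.array([271.0, 273.5, 280.0, 285.0])
obs = np.array([270.0, 274.0, 282.0, 284.0])
lat = np.array([80.0, 60.0, 30.0, 0.0])

# Raw weights of 1 become 1/N after normalization, which is the plain RMSE.
rmse_none = weighted_rmse(fcst, obs, np.ones_like(fcst))
rmse_plain = float(np.sqrt(np.mean((fcst - obs) ** 2)))

# cos(lat) weights are normalized the same way before entering the sum.
rmse_cos = weighted_rmse(fcst, obs, cos_lat_weights(lat))

print(rmse_none, rmse_plain, rmse_cos)
```

With raw weights of 1, the normalized weight is 1/N, so the result equals the ordinary RMSE. That would explain why dividing by the number of points is needed to match the unweighted output. If an in-house calculation applies normalized weights and then also divides by N, the result will come out too small.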