While updating the UFS-WM to the latest hash (e403bb4), the SRW App's WE2E tests started failing with the following error message:
FATAL from PE 1: compute_qs: saturation vapor pressure table overflow, nbad= 1
Ultimately, we were able to get around this issue by decreasing DT_ATMOS from 180 to 150. This change, however, caused the grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot WE2E test to fail. To make that test pass again, DT_ATMOS was set back to 180 for that test only.
Trying to identify the issue that led to the failures, I attempted to change the cq parameter values in mfpbltq.f, mfscuq.f, samfdeepcnv.f, samfshalcnv.f, and satmedmfvdifq.f from 1.0 back to 1.3. Changing this value in samfdeepcnv.f allowed the RRFS_CONUS_25km tests to run using the original DT_ATMOS value of 180. Further, this change corrected the issue seen in the plotting WE2E test.
Unfortunately, I'm not familiar enough with CCPP to know what the cq parameter is used for. Is there a reason that it was reduced from 1.3 to 1.0 in the noted routines as part of PR #65? Would it be possible to set this value back to 1.3 for samfdeepcnv.f, or maybe add a namelist variable so that the value can be set at the application level?
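For anyone who wants to check what cq is currently set to in the five routines named above, here is a minimal sketch. The function name `list_cq_settings` is hypothetical, and the search pattern assumes cq is assigned a numeric literal somewhere on a line (for example inside a `parameter(...)` statement), which may not match every form used in the sources.

```shell
#!/bin/sh
# Print every line that assigns a numeric value to cq in the five
# routines mentioned in this issue.
# usage: list_cq_settings <directory containing the Fortran sources>
list_cq_settings() {
  dir=$1
  for f in mfpbltq.f mfscuq.f samfdeepcnv.f samfshalcnv.f satmedmfvdifq.f; do
    if [ -f "$dir/$f" ]; then
      # -H: always print the file name, -n: line number, -i: ignore case
      grep -Hni 'cq *= *[0-9]' "$dir/$f" \
        || echo "$f: no cq assignment found" >&2
    else
      echo "$f: not found in $dir" >&2
    fi
  done
}
```

In an SRW App checkout the sources are probably under sorc/ufs-weather-model/FV3/ccpp/physics/physics, but that path is an assumption and should be verified locally before running `list_cq_settings` on it.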
Tagging @grantfirl, @JongilHan66, and @Qingfu-Liu since these individuals are either the PR owner or worked closely with the changes made in PR #65 for HR2.
Steps to Reproduce
Clone the SRW App on Hera: git clone git@github.com:ufs-community/ufs-srweather-app.git
cd ufs-srweather-app
./manage_externals/checkout_externals
./devbuild.sh -p=hera
module use $PWD/modulefiles
module load wflow_hera
conda activate workflow_tools
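vi ush/predef_grid_params.yaml (find RRFS_CONUS_25km and change DT_ATMOS from 150 to 180)
cd tests/WE2E
./run_WE2E_tests.py -t=grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2 -m=hera -a=<insert account here>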
The error message quoted in the description then appears in the log/run_fcst* log file. A copy of the error from the log file has been added to the end of this issue as well.
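The manual DT_ATMOS edit and the log check in the steps above can also be scripted. The following is a minimal, unofficial sketch: `set_dt_atmos` and `find_qs_overflow` are hypothetical helper names, and the awk logic assumes that each grid in ush/predef_grid_params.yaml is a top-level (non-indented) key with an indented `DT_ATMOS:` entry beneath it. Check the file layout before relying on it.

```shell
#!/bin/sh
# Set DT_ATMOS for one grid only, leaving all other grids untouched.
# usage: set_dt_atmos <yaml file> <grid name> <new value>
set_dt_atmos() {
  file=$1; grid=$2; value=$3
  awk -v grid="$grid" -v value="$value" '
    # Assumption: a non-indented, non-comment line starts a new grid block.
    /^[^ \t#]/ {
      name = $0
      sub(/:.*/, "", name)     # drop the colon and anything after it
      gsub(/"/, "", name)      # drop optional quotes around the key
      in_block = (name == grid)
    }
    # Rewrite only the DT_ATMOS entry that belongs to the requested grid.
    in_block && /^[ \t]+DT_ATMOS:/ {
      sub(/DT_ATMOS:.*/, "DT_ATMOS: " value)
    }
    { print }
  ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Report file name and line number of the fatal message in run_fcst logs.
# usage: find_qs_overflow <log directory>
find_qs_overflow() {
  grep -Hn 'saturation vapor pressure table overflow' "$1"/run_fcst*
}

# Example, run from the top of the ufs-srweather-app checkout
# (<experiment_dir> is a placeholder for your own experiment directory):
#   set_dt_atmos ush/predef_grid_params.yaml RRFS_CONUS_25km 180
#   find_qs_overflow <experiment_dir>/log
```

Matching on the exact grid name rather than a substring keeps the edit from touching any other grid whose name merely contains RRFS_CONUS_25km.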
Additional Context
Issues were encountered on UCAR's Cheyenne (with Intel compilers) and Hera (both Intel and GNU).
The compilers used were Intel 2022.1 on Cheyenne, and Intel 2022.1.2 and GNU 9.2.0 on Hera.
The test noted above uses the FV3_GFS_v15p2 SDF, while the noted failure of the grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot test uses the FV3_GFS_v17_p8 SDF.
cq=1.3 implies that the entrainment rates in updrafts for moisture and tracers are about 30% larger than those for temperature. The reason for cq=1.3 was to increase CAPE by reducing mass flux transport for moisture and tracers. I changed it back to cq=1.0 (which is the value in the current operational GFSv16) because of the lack of physical justification for cq=1.3. I don't think the change of the cq value can cause a numerical instability.
Compiled with DEBUG options. It failed immediately in post_fv3.F90, line 4645.
Turned off inline post; it ran successfully for all 6 hours (still in DEBUG mode).
Compiled back with optimization (inline post OFF), and it failed the same way after 60 steps in compute_qs.
I doubt that this is a CFL problem; rather, the longer time step revealed some other underlying bug. We might also have two separate issues here: one in QS, and another that explains why it failed in post.