Explanation of the ioserver code in gcm_setup

Explicit code

@ MODEL_NPES = $NX * $NY

# Calculate OSERVER nodes based on recommended algorithm
if ( $DO_IOS == TRUE ) then

   # In the calculations below, the weird bc-awk command is to round up the floating point calcs

   # First we calculate the number of model nodes
   set NUM_MODEL_NODES=`echo "scale=6;($MODEL_NPES / $NCPUS_PER_NODE)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

   # Next the number of frontend PEs is 10% of the model PEs
   set NUM_FRONTEND_PES=`echo "scale=6;($MODEL_NPES * 0.1)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

   # Now we roughly figure out the number of collections in the HISTORY.rc (this is not perfect, but is close to right)
   set NUM_HIST_COLLECTIONS=`cat $TMPHIST | sed -n '/^COLLECTIONS:/,/^ *::$/{p;/^ *::$/q}' | grep -v '^ *#' | wc -l`

   # And the total number of oserver PEs is frontend PEs plus number of history collections
   @ NUM_OSERVER_PES=$NUM_FRONTEND_PES + $NUM_HIST_COLLECTIONS

   # Now calculate the number of oserver nodes
   set NUM_OSERVER_NODES=`echo "scale=6;($NUM_OSERVER_PES / $NCPUS_PER_NODE)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

   # And then the number of backend PEs is the number of history collections divided by the number of oserver nodes
   set NUM_BACKEND_PES=`echo "scale=6;($NUM_HIST_COLLECTIONS / $NUM_OSERVER_NODES)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

   # multigroup requires at least two backend pes
   if ($NUM_BACKEND_PES < 2) set NUM_BACKEND_PES = 2

   # Calculate the total number of nodes to request from batch
   @ NODES=$NUM_MODEL_NODES + $NUM_OSERVER_NODES

else
   # Calculate the number of model nodes
   set NODES=`echo "scale=6;($MODEL_NPES / $NCPUS_PER_NODE)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

   set NUM_OSERVER_NODES = 0
   set NUM_BACKEND_PES   = 0
endif

What the code is doing if DO_IOS=TRUE

Calculate the number of model nodes

The first step of the code:

@ MODEL_NPES = $NX * $NY
set NUM_MODEL_NODES=`echo "scale=6;($MODEL_NPES / $NCPUS_PER_NODE)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

calculates the number of processes for the model (MODEL_NPES) by multiplying NX and NY. It then divides MODEL_NPES by the number of CPUs per node (NCPUS_PER_NODE) and rounds up to get the number of model nodes (NUM_MODEL_NODES). NCPUS_PER_NODE is set earlier, when the user selects a node type.

C720 Example

Assume:

NX = 24
NY = NX * 6 = 144
NCPUS_PER_NODE = 40

then:

MODEL_NPES = NX * NY = 24 * 144 = 3456
NUM_MODEL_NODES = ceil(MODEL_NPES / NCPUS_PER_NODE) = ceil(3456 / 40) = ceil(86.4) = 87
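
The round-up step is easier to check in isolation. The following is a minimal Python sketch, not code from gcm_setup: `ceil_div` is a hypothetical helper name, and it assumes positive integer inputs, for which it gives the same result as the bc-awk pipeline.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Round numerator / denominator up to the next integer (positive inputs)."""
    # Floor division of the negated numerator, negated again, is a ceiling.
    return -(-numerator // denominator)


nx = 24
ny = nx * 6               # 144, as in the C720 example
ncpus_per_node = 40

model_npes = nx * ny                                     # 3456
num_model_nodes = ceil_div(model_npes, ncpus_per_node)   # 87
print(model_npes, num_model_nodes)
```

Using integer arithmetic avoids any floating-point rounding, so an exact multiple such as 80 / 40 stays at 2 rather than being bumped to 3.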

Calculate the number of frontend PEs

The next step of the code:

set NUM_FRONTEND_PES=`echo "scale=6;($MODEL_NPES * 0.1)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

is to calculate the number of frontend PEs (NUM_FRONTEND_PES) by multiplying the number of model processes (MODEL_NPES) by 0.1 (10%). Again, the result is rounded up.

C720 Example

We have:

MODEL_NPES = 3456

then:

NUM_FRONTEND_PES = ceil(MODEL_NPES * 0.1) = ceil(3456 * 0.1) = ceil(345.6) = 346
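
As a cross-check, here is the same step as a small Python sketch (illustrative only; the function name `frontend_pes` is an assumption, not something defined in gcm_setup). Since 10% of an integer is that integer divided by 10, the sketch uses integer arithmetic and so sidesteps binary floating-point rounding. The csh code does not have that concern because bc works in decimal.

```python
def frontend_pes(model_npes: int) -> int:
    """10% of the model PEs, rounded up, computed as ceil(model_npes / 10)."""
    # Adding 9 before floor division by 10 rounds any remainder up.
    return (model_npes + 9) // 10


print(frontend_pes(3456))  # 346
print(frontend_pes(3450))  # 345 (exact multiple of 10, nothing to round)
```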

Calculate the number of history collections

This bit of code:

set NUM_HIST_COLLECTIONS=`cat $TMPHIST | sed -n '/^COLLECTIONS:/,/^ *::$/{p;/^ *::$/q}' | grep -v '^ *#' | wc -l`

looks at the HISTORY template selected by the user (TMPHIST) and counts the lines from the line beginning with "COLLECTIONS:" to the closing "::" line, excluding lines that begin with "#" (commented-out collections).

For example:

COLLECTIONS: 'geosgcm_prog'
#             'prog.eta'
             'geosgcm_surf'
             'geosgcm_ocn'
             'geosgcm_moist'
             'geosgcm_turb'
             'geosgcm_gwd'
             'geosgcm_tend'
             'geosgcm_budi'
             'geosgcm_buda'
             'geosgcm_landice'
             'geosgcm_meltwtr'
             'geosgcm_snowlayer'
             'geosgcm_tracer'
>>>HIST_GOCART<<<             'tavg2d_aer_x'
>>>HIST_GOCART<<<             'tavg3d_aer_p'
#             'geosgcm_iau'
#             'geosgcm_conv'
#             'goswim_catch'
#             'goswim_land'
#             'goswim_landice'
#             'geosgcm_lidar'
#             'geosgcm_parasol'
#             'geosgcm_modis'
#             'geosgcm_radar'
#             'geosgcm_isccp'
#             'geosgcm_misr'
             ::

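Counting the uncommented entries above gives 15 collections: 13 plain entries plus the two prefixed with >>>HIST_GOCART<<<. The count is not perfect (read literally, the pipeline also counts the closing "::" line), but it is a good estimate.

C720 Example

We have:

NUM_HIST_COLLECTIONS = 13

The remaining C720 examples use 13, the number of entries above without the >>>HIST_GOCART<<< prefix.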
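
To see exactly which lines the sed, grep and wc pipeline counts, the following Python sketch transcribes it line by line. It is an illustration under stated assumptions, not code from gcm_setup: the function name `count_hist_collections` and the shortened sample text are hypothetical.

```python
import re


def count_hist_collections(history_text: str) -> int:
    """Count lines the way the sed | grep -v | wc -l pipeline does."""
    count = 0
    in_block = False
    for line in history_text.splitlines():
        if not in_block:
            # sed range start: /^COLLECTIONS:/
            if not line.startswith("COLLECTIONS:"):
                continue
            in_block = True
        # grep -v '^ *#': drop commented-out entries
        if not re.match(r" *#", line):
            count += 1
        # sed range end and quit: /^ *::$/ (this line was already counted)
        if re.fullmatch(r" *::", line):
            break
    return count


sample = """\
COLLECTIONS: 'geosgcm_prog'
#             'prog.eta'
             'geosgcm_surf'
>>>HIST_GOCART<<<             'tavg2d_aer_x'
#             'geosgcm_iau'
             ::
"""

print(count_hist_collections(sample))  # 4
```

The sample has three uncommented entries, yet the transcription returns 4 because the closing `::` line is printed by sed and is not removed by the grep. Lines carrying a `>>>HIST_GOCART<<<` prefix are counted as well, since they do not start with `#`.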

Calculate the total number of oserver PEs

This bit of code:

@ NUM_OSERVER_PES=$NUM_FRONTEND_PES + $NUM_HIST_COLLECTIONS

adds the number of frontend PEs (NUM_FRONTEND_PES) to the number of history collections (NUM_HIST_COLLECTIONS) to get the total number of oserver PEs (NUM_OSERVER_PES).

C720 Example

We have:

NUM_FRONTEND_PES = 346
NUM_HIST_COLLECTIONS = 13

then:

NUM_OSERVER_PES = NUM_FRONTEND_PES + NUM_HIST_COLLECTIONS = 346 + 13 = 359

Calculate the number of oserver nodes

This bit of code:

set NUM_OSERVER_NODES=`echo "scale=6;($NUM_OSERVER_PES / $NCPUS_PER_NODE)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

divides the number of oserver PEs (NUM_OSERVER_PES) by the number of CPUs per node (NCPUS_PER_NODE) and rounds up to get the number of oserver nodes.

C720 Example

We have:

NUM_OSERVER_PES = 359
NCPUS_PER_NODE = 40

then:

NUM_OSERVER_NODES = ceil(NUM_OSERVER_PES / NCPUS_PER_NODE) = ceil(359 / 40) = ceil(8.975) = 9
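
The last two steps can be sketched together in Python. This is an illustration, not code from gcm_setup, and the function name `oserver_nodes` is an assumption.

```python
def oserver_nodes(num_frontend_pes: int, num_hist_collections: int,
                  ncpus_per_node: int) -> tuple[int, int]:
    """Return (NUM_OSERVER_PES, NUM_OSERVER_NODES) for the given inputs."""
    num_oserver_pes = num_frontend_pes + num_hist_collections
    # Ceiling division: round the node count up to a whole node.
    num_nodes = -(-num_oserver_pes // ncpus_per_node)
    return num_oserver_pes, num_nodes


print(oserver_nodes(346, 13, 40))  # (359, 9)
print(oserver_nodes(346, 15, 40))  # (361, 10)
```

The second call shows why the collection estimate matters in this example: 359 PEs sits just under the 9-node boundary of 360, so counting two more collections pushes the request to 10 oserver nodes.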

Calculate the number of backend PEs

This bit of code:

set NUM_BACKEND_PES=`echo "scale=6;($NUM_HIST_COLLECTIONS / $NUM_OSERVER_NODES)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`
if ($NUM_BACKEND_PES < 2) set NUM_BACKEND_PES = 2

divides the number of history collections (NUM_HIST_COLLECTIONS) by the number of oserver nodes (NUM_OSERVER_NODES) and rounds up to get the number of backend PEs. The code then forces the number of backend PEs to be at least 2, because multigroup requires at least two backend PEs.

C720 Example

We have:

NUM_HIST_COLLECTIONS = 13
NUM_OSERVER_NODES = 9

then:

NUM_BACKEND_PES = ceil(NUM_HIST_COLLECTIONS / NUM_OSERVER_NODES) = ceil(13 / 9) = ceil(1.444) = 2

Since we got 2 from the calculation, we don't need to force it to be at least 2.
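
A short Python sketch of this step, including a case where the minimum does take effect. It is illustrative only; `backend_pes` is a hypothetical name and the inputs other than 13 and 9 are made up for the demonstration.

```python
def backend_pes(num_hist_collections: int, num_oserver_nodes: int) -> int:
    """Collections per oserver node, rounded up, with a floor of 2."""
    per_node = -(-num_hist_collections // num_oserver_nodes)  # ceiling division
    # multigroup requires at least two backend PEs
    return max(per_node, 2)


print(backend_pes(13, 9))  # 2 (ceil(1.444) is already 2)
print(backend_pes(5, 9))   # 2 (ceil(0.556) is 1, raised to 2)
print(backend_pes(30, 9))  # 4 (ceil(3.333))
```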

Calculate the total number of nodes

This bit of code:

@ NODES=$NUM_MODEL_NODES + $NUM_OSERVER_NODES

adds the number of model nodes (NUM_MODEL_NODES) to the number of oserver nodes (NUM_OSERVER_NODES) to get the total number of nodes to request from batch.

C720 Example

We have:

NUM_MODEL_NODES = 87
NUM_OSERVER_NODES = 9

then:

NODES = NUM_MODEL_NODES + NUM_OSERVER_NODES = 87 + 9 = 96

What the code is doing if DO_IOS=FALSE

If DO_IOS is FALSE, then the code just calculates the number of model nodes, stores it directly in NODES, and sets the number of oserver nodes (NUM_OSERVER_NODES) and the number of backend PEs (NUM_BACKEND_PES) to 0.

Calculate the number of model nodes

This bit of code:

set NODES=`echo "scale=6;($MODEL_NPES / $NCPUS_PER_NODE)" | bc | awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print ceil($1)}'`

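divides the number of model processes (MODEL_NPES) by the number of CPUs per node (NCPUS_PER_NODE) and rounds up to get the number of model nodes. With no oserver nodes to add, this is also the total number of nodes to request from batch, so the result is stored in NODES.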

C720 Example

We have:

MODEL_NPES = 3456
NCPUS_PER_NODE = 40

then:

NODES = ceil(MODEL_NPES / NCPUS_PER_NODE) = ceil(3456 / 40) = ceil(86.4) = 87

And then we set:

NUM_OSERVER_NODES = 0
NUM_BACKEND_PES = 0
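
Putting both branches together, the whole calculation can be sketched as one Python function. This is a transcription for checking the arithmetic, not the gcm_setup implementation: the names `Layout`, `ceil_div` and `node_layout` are assumptions, and the number of history collections is passed in directly rather than read from a HISTORY template.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    nodes: int              # NODES
    num_oserver_nodes: int  # NUM_OSERVER_NODES
    num_backend_pes: int    # NUM_BACKEND_PES


def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def node_layout(nx: int, ny: int, ncpus_per_node: int,
                num_hist_collections: int, do_ios: bool) -> Layout:
    model_npes = nx * ny
    num_model_nodes = ceil_div(model_npes, ncpus_per_node)
    if not do_ios:
        # DO_IOS=FALSE: only model nodes are requested
        return Layout(num_model_nodes, 0, 0)

    num_frontend_pes = ceil_div(model_npes, 10)  # 10% of model PEs
    num_oserver_pes = num_frontend_pes + num_hist_collections
    num_oserver_nodes = ceil_div(num_oserver_pes, ncpus_per_node)
    num_backend_pes = max(ceil_div(num_hist_collections, num_oserver_nodes), 2)
    return Layout(num_model_nodes + num_oserver_nodes,
                  num_oserver_nodes, num_backend_pes)


print(node_layout(24, 144, 40, 13, True))
# Layout(nodes=96, num_oserver_nodes=9, num_backend_pes=2)
print(node_layout(24, 144, 40, 13, False))
# Layout(nodes=87, num_oserver_nodes=0, num_backend_pes=0)
```

Both results match the C720 examples above: 96 nodes with the oserver enabled and 87 without.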
