Implement offload instructions via macros #80

awnawab · 2025-01-28T19:02:20Z

A draft PR that represents a proof of concept of the development I proposed in #73.

The offload instructions in the code are replaced with macros, which are defined in Python modules. More of these Python modules can be added to support multiple backends, whilst leaving the code relatively clean. So far I have only implemented the macros for the code in FIELD_RANKSUFF_DATA_MODULE that would run if OpenACC is enabled but CUDA is not.
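As a rough illustration of the idea (not the code in this PR), a backend can be a plain Python class whose methods return directive strings, with a host-only backend returning empty strings so that nothing is emitted. All class, method and registry names below are hypothetical.

```python
class HostOnlySketch:
    """Hypothetical backend: emits nothing, so the generated code stays host-only."""

    @staticmethod
    def update_device(variables):
        # No offload instruction is generated for the host-only case.
        return ""


class OpenACCSketch:
    """Hypothetical backend: emits OpenACC directives as plain strings."""

    @staticmethod
    def update_device(variables):
        # Join the variable names into a single OpenACC update clause.
        return f"!$acc update device({', '.join(variables)})"


# Hypothetical registry; a new backend would be added as one more entry.
BACKENDS = {"host": HostOnlySketch, "openacc": OpenACCSketch}


def get_backend(name):
    """Look up a backend class by its (hypothetical) short name."""
    return BACKENDS[name]
```

Because every backend exposes the same method names, the templated source only needs to know which backend was selected, not which directives it produces.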

The preprocessor definitions still need to be generalised. I envisage the following two: WITH_GPU_OFFLOAD, corresponding to _OPENACC, and WITH_HIC, corresponding to _CUDA.

I am eager to start work (very) soon on an OpenMP offload backend, but before I write more code I would really appreciate your feedback on this.

awnawab · 2025-01-28T19:06:44Z

NB: the hpc-ci tests are failing because of a permissions issue for PRs filed from forks. Once #78 is merged I can rebase on top of that to fix the failing hpc-ci. I have verified offline that the changes work.

reuterbal

This is very neat and I like how clean this looks, despite the powerful features it unlocks for us. While discussing this offline we've noticed two small things that I've marked as comments.

reuterbal · 2025-01-29T08:05:22Z

cmake/field_api_find_fypp.cmake

+     set( Python3_FIND_VIRTUALENV STANDARD )
+     find_package( Python3 COMPONENTS Interpreter )
+
+     execute_process( COMMAND ${Python3_EXECUTABLE} -m pip --disable-pip-version-check install fypp OUTPUT_QUIET )


A small comment based on something we hit very recently: bare-metal Python installations may come without pip, or with only an ancient version of it. They do, however, normally ship a bootstrap mechanism, ensurepip, which ensures that you have at least the pip version bundled with the installation. To use it, add the following:

Suggested change

-     execute_process( COMMAND ${Python3_EXECUTABLE} -m pip --disable-pip-version-check install fypp OUTPUT_QUIET )
+     execute_process( COMMAND ${Python3_EXECUTABLE} -m ensurepip --upgrade OUTPUT_QUIET )
+     execute_process( COMMAND ${Python3_EXECUTABLE} -m pip --disable-pip-version-check install fypp OUTPUT_QUIET )

I didn't even know we could have pip-less python, thanks!
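For anyone wanting to try the same two-step bootstrap outside CMake, here is a minimal Python sketch. It mirrors the commands in the suggestion above; the function names are hypothetical, and tolerating a failing ensurepip step is an assumption on my part, made because some distributions package Python without that module.

```python
import subprocess
import sys


def fypp_install_commands(python_exe=sys.executable):
    """Return the two commands from the suggestion as argument lists."""
    return [
        [python_exe, "-m", "ensurepip", "--upgrade"],
        [python_exe, "-m", "pip", "--disable-pip-version-check", "install", "fypp"],
    ]


def install_fypp(python_exe=sys.executable):
    """Bootstrap pip if possible, then install fypp (hypothetical helper)."""
    bootstrap, install = fypp_install_commands(python_exe)
    # Assumption: a failing ensurepip is tolerated, since pip may already exist.
    subprocess.run(bootstrap, check=False, stdout=subprocess.DEVNULL)
    # The install step itself must succeed, so a failure raises CalledProcessError.
    subprocess.run(install, check=True, stdout=subprocess.DEVNULL)
```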

reuterbal · 2025-01-29T08:09:34Z

python_utils/offload_backends/nvidia/openacc.py

+
+__all__ = ['NvidiaOpenACC']
+
+class NvidiaOpenACC():


The idea of categorising backends by compiler-specific programming model implementations is a really good one: it makes it easy to encode the minor differences in how vendors interpret the standards.

I would suggest naming them according to the compiler toolchain, though, e.g. renaming this one to NVHPCOpenACC. This also avoids any issues associated with putting protected brand names like Nvidia into code.

Yes I agree, thanks!

awnawab · 2025-02-03T08:08:28Z

This is now ready for review, please have a look at your earliest convenience 🙏

pmarguinaud

I agree we need something like this, but the code, which was already difficult to understand, is becoming cryptic.

Would it be possible to translate OpenACC directives such as:

!$acc kernels present (PX)

into:

$:offload_macros.kernels (present=['PX'])

That way it keeps its meaning in the NVIDIA lingo. I know this looks unfair to other vendors, but we have to choose a convention anyway.
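To make the proposed convention concrete: fypp's `$:` directive evaluates a Python expression and inserts its result, so a call like the one above only needs a function that turns keyword arguments into clauses. The following is a minimal sketch under that assumption; `render_directive` and this `kernels` are hypothetical and are not the implementation in this PR.

```python
def render_directive(name, **clauses):
    """Render an OpenACC directive from a name and clause keyword arguments."""
    parts = [f"!$acc {name}"]
    for clause, args in clauses.items():
        # Each keyword becomes one clause, e.g. present=['PX'] -> present(PX).
        parts.append(f"{clause}({', '.join(args)})")
    return " ".join(parts)


def kernels(present=None):
    """Hypothetical macro named after the OpenACC 'kernels' directive."""
    clauses = {}
    if present:
        clauses["present"] = present
    return render_directive("kernels", **clauses)
```

With this sketch, `kernels(present=['PX'])` yields `!$acc kernels present(PX)`. A backend for another programming model could keep the same function name and arguments and return its own directive.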

awnawab · 2025-02-04T09:40:29Z

Hi @pmarguinaud,

Of course, if you think that makes the code more intuitive I am very happy to adapt it accordingly. Thanks for the feedback, I'll implement your suggestion 👍

awnawab requested review from mlange05, wertysas, dareg and pmarguinaud January 28, 2025 19:02

reuterbal reviewed Jan 29, 2025

awnawab force-pushed the naan-offload-macros branch from 8bfd956 to 9a18e00 Compare February 3, 2025 07:59

awnawab added 9 commits February 3, 2025 08:00

Remove unecessary CUDAFOR imports

f379dfe

Remove CUDAFOR from HOST_ALLOC_MODULE

e64e194

FYPP: fypp now only supported if installed as a pip package

96240dc

WIP: implement macro based offload instructions for Nvidia OpenACC in…

b1b714f

… FIELD_RANKSUFF_DATA_MODULE

WIP: python macros adapted for host only functionality

f49858f

FIELD_ASYNC: convert to fypp file

249974e

Implement NVHPCOpenaCC entirely using python macros

3062bd8

Adapt CUDA backend to use python macros

04188f5

Generalise offload related preproc definitions

aac718d

awnawab force-pushed the naan-offload-macros branch from 9a18e00 to aac718d Compare February 3, 2025 08:01

awnawab added the approved-for-ci Approved to run hpc-ci label Feb 3, 2025

awnawab marked this pull request as ready for review February 3, 2025 08:02

awnawab changed the title ~~WIP: implement offload instructions via macros~~ Implement offload instructions via macros Feb 3, 2025

pmarguinaud reviewed Feb 4, 2025

Use nvhpc naming convention for offload macros

f5dc280

github-actions bot removed the approved-for-ci Approved to run hpc-ci label Feb 4, 2025

awnawab added the approved-for-ci Approved to run hpc-ci label Feb 4, 2025
