Direct memory access in fmi3SetX / fmi3GetX functions #515
Comments
It is not completely clear how this is intended to work and how it will provide an improvement. Could you please provide an example in pseudo code? |
I am all for the initiative. However, assuming a memory layout based on the order in which variables appear in modelDescription.xml is not feasible. In addition, there might be internal variables not exposed in the modelDescription that share memory with variables that are exposed. |
Also, if you share memory between the FMU and the master, isn't get/set redundant? You would just need an update/compute function that could replace all the get/set functions. |
I agree with @KarlWernersson: this needs decoupling from the order of variables. However, this could be achieved through a proper setup call, and in this way could also be made optional (if people want this). The importing implementation should call a setup function (fmi3SetupSharedMemory or something of this nature), passing in an array of value references and a pointer to memory, informing the FMU of the memory area used for data exchange and the layout of variables in it. If we allow a mixture of types, this needs an alignment standard; otherwise we might specify memory areas separately by basic type. This way all data allocation is handled by the importer. This would solve e.g. the real-time static allocation issues and allow direct sharing of this memory between FMUs, resulting in zero-copy overhead if done right. Sync for CS is automatic through fmi3DoStep; for ME this might need separate functions. If the setup shared memory function is not called, then the importer must use the classic fmi3Get/SetXXX approach. fmi3SetupSharedMemory( |
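A minimal sketch of the setup-call idea described in this comment, assuming only Float64 variables so that no alignment rules are needed. All names (`toy_fmu_t`, `toy_setup_shared_memory`, `toy_do_step`) are hypothetical stand-ins for illustration; they are not part of FMI and do not reproduce the signature the comment started to give.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these names exist in FMI. */
typedef uint32_t value_ref_t;

enum { MAX_SHARED = 8 };

typedef struct {
    value_ref_t vrs[MAX_SHARED]; /* slot i of the buffer holds variable vrs[i] */
    double     *shared;          /* importer-owned buffer; NULL until set up */
    size_t      n;
} toy_fmu_t;

/* Importer announces the buffer and which value reference lives in which slot. */
static int toy_setup_shared_memory(toy_fmu_t *fmu, const value_ref_t vrs[],
                                   size_t n, double *memory)
{
    if (fmu == NULL || vrs == NULL || memory == NULL || n > MAX_SHARED) return -1;
    for (size_t i = 0; i < n; i++) fmu->vrs[i] = vrs[i];
    fmu->shared = memory;
    fmu->n = n;
    return 0;
}

/* Resolve a value reference to its slot. Done per step here for brevity;
   an FMU could cache the result at setup time. */
static double *toy_slot(const toy_fmu_t *fmu, value_ref_t vr)
{
    assert(fmu != NULL);
    for (size_t i = 0; i < fmu->n; i++)
        if (fmu->vrs[i] == vr) return &fmu->shared[i];
    return NULL;
}

/* Toy model: output (vr 2) = 2 * input (vr 1), computed in place. */
static int toy_do_step(toy_fmu_t *fmu)
{
    if (fmu->shared == NULL) return -1; /* not set up: importer uses classic get/set */
    double *u = toy_slot(fmu, 1);
    double *y = toy_slot(fmu, 2);
    if (u == NULL || y == NULL) return -1;
    *y = 2.0 * *u;
    return 0;
}
```

In this sketch the importer owns the buffer, so two FMUs could in principle be handed overlapping slots of the same allocation; how that interacts with mixed types and alignment is left open, as in the comment.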
The initial memory layout / value references could still be specified in the XML. This way an importer can generate code that directly accesses the variables w/o looking up the value references dynamically. |
Since the importer would be controlling the memory layout, I don't see how this kind of initial memory layout default would be of benefit to the importer? I could see how this makes hard-coding the exported FMU code easier/faster (at the expense of not allowing a changed memory layout, which kind of negates the benefits of shared memory for coupled FMUs), but not how this would be of benefit to the importer... |
Retarget to 3.1 as discussed in WebMeeting on 2019-01-29. |
Core ideas for future (3.1) optional support (triggered by WebMeeting on 2019-01-29, just my current feeling about those):
Should be possible to provide in completely backward compatible fashion for a minor (e.g. 3.1) release. |
FWIW, the dynamic memory model of S-functions has been criticized as a limitation for embedded use-cases in the comparison with FMI on Wikipedia (which also applies to FMI). |
Regular Design Meeting: |
At the coming Berlin Design Meeting due to the focus on layered standards we will not have time to discuss this in detail, but perhaps we could form a working group to further discuss this topic (and perhaps other efficiency related topics such as extended lifetime for binary variables) |
F2F Design meeting Berlin: Intention: avoid copying. Working Group to discuss this: Andreas, Pierre, Torsten B, Timo, Torsten S., Klaus, dSPACE |
@t-sommer Regarding your original suggestion, I would offer a suggestion as follows:
The FMU operates as before, allocating its memory etc. Then, for each requested VR, the FMU can decide to return the address of the value or NULL (supported/not supported). The Importer takes care of the response. There is one edge condition for binary variables where it is more useful to return a pointer to a variable which holds the allocated address of the buffer (i.e. void**). In that case either the Importer or the FMU can
We did a lot of work on this topic in the last days. As a result it seems we would need the following API:
In the case of complex types, the 'additional' parameter would hold an additional reference - e.g. the address of the size variable for an fmi3Binary. In the case that shared access is not supported for a variable, the function should return NULL for that variable. The importer falls back to the preexisting methods. No additional changes are required. Everything will work. FMUs supporting this will behave correctly. There are numerous use cases for doing this. However, many of them may require additional specification, so here one should only consider making the capability available. That is all that is required. |
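The address-or-NULL behaviour with fallback can be sketched as follows. This is an illustration under assumptions, not the API from the comment above: `toy_get_shared_address`, `toy_get_float64` and `importer_read` are hypothetical names, and the `additional` parameter only mimics the extra reference described for complex types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these names exist in FMI. */
typedef uint32_t value_ref_t;

typedef struct {
    double speed;   /* vr 0: the FMU agrees to share its address */
    double torque;  /* vr 1: only reachable through the copying getter */
} toy_fmu_t;

/* Returns the address of the variable, or NULL if shared access is not supported.
   `additional` would carry e.g. the address of a size variable for binary data;
   it is unused for plain doubles. */
static void *toy_get_shared_address(toy_fmu_t *fmu, value_ref_t vr, void **additional)
{
    if (additional != NULL) *additional = NULL;
    return (vr == 0) ? (void *)&fmu->speed : NULL;
}

/* Classic copying getter, standing in for the preexisting get functions. */
static int toy_get_float64(const toy_fmu_t *fmu, value_ref_t vr, double *out)
{
    switch (vr) {
    case 0:  *out = fmu->speed;  return 0;
    case 1:  *out = fmu->torque; return 0;
    default: return -1;
    }
}

/* Importer side: prefer the shared address, fall back to the getter on NULL. */
static double importer_read(toy_fmu_t *fmu, value_ref_t vr, int *used_direct)
{
    double *p = toy_get_shared_address(fmu, vr, NULL);
    double value = 0.0;
    *used_direct = (p != NULL);
    if (p != NULL) return *p;
    int status = toy_get_float64(fmu, vr, &value);
    assert(status == 0);
    (void)status;
    return value;
}
```

An importer would more likely query each address once after instantiation and cache it, rather than asking on every read as this sketch does.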
@timrulebosch, could you implement a prototype based on the Reference FMUs? |
This is about it ... more or less:
The interface is trivial. Someone doing this would likely have a very specific use case, needing support in the Importer and, of course, the FMU (and possibly a layered standard as well). You can certainly achieve specific things, for example, a reduction in memory churn when working with fmi3Binary variables (avoiding unnecessary malloc, memcpy & free calls). And perhaps other interesting things too. |
I've created a prototype of a shared memory API based on the |
Discussion in FMI Design Webmeeting: https://github.com/modelica/fmi-design/tree/master/Meetings/2023/2023-06-20-FMI-Design-Webmeeting#shared-memory |
TODOs from the FMI Design Meeting 6/20/2023:
|
A pattern we use in our simulation system (non-FMI) is to effectively map variable vectors/arrays from the Model into the Importer. Because the model is aware of its variable vector layout, it can directly access variables without overhead. Conversely, the importer learns the variable layout of the Model, and is able to map those variables to the signal exchange mechanism of the simulation. This mechanism removes the need for our models to do any kind of data marshaling (Vref searches, assignment, memcpy). Semantics of operation and access are inferred from the inherent state of the model (so no need for get/set functions). Such a pattern/mechanism could be achieved in FMI where an FMU maps its variable vectors/arrays into the Importer. Code for such an approach is as follows:
typedef enum {
fmi3ValueNone = 0,
fmi3ValueFloat32,
fmi3ValueFloat64,
fmi3ValueInt8,
fmi3ValueUint8,
fmi3ValueInt16,
fmi3ValueUint16,
fmi3ValueInt32,
fmi3ValueUint32,
fmi3ValueInt64,
fmi3ValueUint64,
fmi3ValueString,
fmi3ValueBinary,
fmi3ValueCount,
} fmi3ValueType;
typedef fmi3Status fmi3GetVariableMap(
fmi3Instance instance, fmi3ValueType type,
// Return VR table from internal FMU map.
const fmi3ValueReference valueReferences[], size_t nValueReferences,
// Return pointer to internal FMU array/vector
void** values,
// Return pointer to internal size for binary data, otherwise NULL.
size_t** valueSizes, // Size of content in binary buffer to be consumed
size_t** valueLength // Length of allocated binary buffer (for buffer management i.e. realloc)
); |
FMI Design Meeting Munich:
Torsten: for parallelization one must make sure that the writing / reading of variables is done correctly.
Andreas: we have to define the problem.
Pierre: for vECUs, think of using binary variables with contiguous binary arrays; I would expect an acceleration.
Andreas: all the solutions boil down to direct memory access and describing the memory layout.
Andreas: the importer may know better when to communicate what. Even today the importer could leave out "set" calls.
Pierre: One optimization could be to have the set/doStep/get operations in one call (cache effects). This makes sense for point-to-point connections.
Andreas: Calling the get/set functions with the same "valid" VRs could bring a performance benefit.
Pierre: the FMU could give a natural order and a kind of "get/set-it-all". The intelligence could be in the importer (e.g. just-in-time compilation).
Pierre: We need more realistic code for benchmarking.
Pierre: it might be beneficial to put set-doStep-get more closely together.
Next steps: |
FMI-Design Meeting: |
Currently all requested values are copied twice for every call of fmi3GetX / fmi3SetX, which is inefficient in terms of bandwidth and memory usage. This can be avoided by re-using the memory for subsequent calls of the getters and setters.
A possible implementation is outlined below. (It is the result of a discussion with @pmai)
Instantiation
The FMU allocates the memory for all variables during fmi3Instantiate using the callbacks provided by the environment. For FMUs that don't support variable array sizes this memory may be static. The FMU may even use this memory directly for its calculations.
The memory layout is defined by the model description: it is the same as for a get / set call where all variables of a respective type are set / retrieved in the same order as they appear in the model description.
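To make the layout rule concrete, here is a small sketch of how an importer could derive a variable's position within its per-type block from the declaration order. It assumes scalar variables only (an array variable would occupy several elements), and `var_decl_t`, `var_type_t` and `index_in_type_block` are hypothetical names, not part of FMI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these names exist in FMI. */
typedef uint32_t value_ref_t;
typedef enum { TYPE_FLOAT64 = 0, TYPE_INT32, TYPE_COUNT } var_type_t;

/* One entry per variable, in the order it appears in the model description. */
typedef struct {
    value_ref_t vr;
    var_type_t  type;
} var_decl_t;

/* Returns the element index of `vr` within the block of its own type,
   or -1 if the value reference is not declared. */
static long index_in_type_block(const var_decl_t decls[], size_t n, value_ref_t vr)
{
    size_t seen[TYPE_COUNT] = {0}; /* variables of each type encountered so far */
    assert(decls != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (decls[i].vr == vr) return (long)seen[decls[i].type];
        seen[decls[i].type]++;
    }
    return -1;
}
```

For declarations `{10, Float64}, {20, Int32}, {11, Float64}` this places vr 11 at index 1 of the Float64 block and vr 20 at index 0 of the Int32 block.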
Get variables
vr: all variables the caller is interested in (NULL = all)
nvr: size of vr
values: the memory that contains the variables
nValues: size of values
Set variables
vr: all variables that have changed since the last call to fmi3SetX (NULL = all)
nvr: size of vr
values: memory retrieved from fmi3GetX
nValues: size of values
The value references are provided to allow optimizations if only a small number of variables are set / retrieved.
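The get/set pairing described above might look roughly like this. It is a sketch under assumptions: only Float64 variables, the value reference doubles as the index into the block, and `toy_get` / `toy_set` are hypothetical names that borrow the parameter names listed above rather than any agreed fmi3GetX / fmi3SetX signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these names exist in FMI. */
typedef uint32_t value_ref_t;
#define N_VARS 3u

typedef struct {
    double   vars[N_VARS]; /* FMU-owned block; the value reference is the index */
    unsigned changed;      /* bit i set: variable i was written by the importer */
} toy_fmu_t;

/* Get: hands out the FMU-owned block instead of copying into a caller buffer. */
static int toy_get(toy_fmu_t *fmu, const value_ref_t vr[], size_t nvr,
                   double **values, size_t *nValues)
{
    assert(fmu != NULL && values != NULL && nValues != NULL);
    (void)vr; (void)nvr; /* an FMU could refresh only the requested variables */
    *values = fmu->vars;
    *nValues = N_VARS;
    return 0;
}

/* Set: `values` must be the block obtained from toy_get;
   `vr` lists what changed since the last call (NULL = all). */
static int toy_set(toy_fmu_t *fmu, const value_ref_t vr[], size_t nvr,
                   const double *values, size_t nValues)
{
    if (values != fmu->vars || nValues != N_VARS) return -1;
    if (vr == NULL) {
        fmu->changed = (1u << N_VARS) - 1u;
        return 0;
    }
    for (size_t i = 0; i < nvr; i++) {
        if (vr[i] >= N_VARS) return -1;
        fmu->changed |= 1u << vr[i];
    }
    return 0;
}
```

The importer writes into the block it received from the get call and then only reports which value references it touched, so no values are copied in either direction.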
Structural changes
The FMU re-allocates the memory using the callbacks provided by the environment. The memory layout changes according to the structural parameters.
Termination
In fmi3Terminate() the FMU frees the memory using fmi3Free()
Comments welcome!