Parallelize CAN writes #230

Closed
PeterBowman opened this issue Aug 12, 2019 · 8 comments · Fixed by #229


@PeterBowman
Member

Split from #209 (comment):

I would like to introduce threaded writes, though. Assuming that CAN TX comms are entirely managed by CanBusControlboard, this device would start a periodic thread to process incoming write requests. Messages would be registered by raw subdevices into a FIFO queue and dispatched to the network in periodic batches (thus taking advantage of the TX buffer), say every 1-5 milliseconds. The point of this is to parallelize multi-joint queries, so that one or more requests from one joint do not block the requests of all other joints (also, this enables message buffering, i.e. we'll send more than one message at once). Incidentally, it would help synchronize (more or less) PDOs without a SYNC signal. Considerations (a sketch of the dispatcher follows the list):

  • Even though reads/writes are thread-safe, make sure this new thread does not collide with high-priority RPDOs (e.g. PT/PVT).
  • Raw subdevices will request a free slot from the message buffer, similarly to how prepare() returns YARP bottles in a buffered port context.
  • Most importantly, all of this is meaningless unless multi-joint queries can actually be processed in parallel, e.g. via threaded for-loops. This reminds me of GrPPI; perhaps vanilla C++ (11?) proves sufficient for our needs, though.
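
A minimal sketch of the periodic dispatcher described above, assuming hypothetical CanMessage and writeBatch() primitives (the real device would write through YARP's CAN bus interfaces):

```cpp
#include <chrono>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct CanMessage { unsigned int id; unsigned char len; unsigned char data[8]; }; // hypothetical

class TxDispatcher
{
public:
    explicit TxDispatcher(std::chrono::milliseconds period)
        : period(period), running(true), worker(&TxDispatcher::loop, this)
    {}

    ~TxDispatcher()
    {
        { std::lock_guard<std::mutex> lock(mtx); running = false; }
        worker.join();
    }

    // Raw subdevices register outgoing messages here, in FIFO order.
    void enqueue(const CanMessage & msg)
    {
        std::lock_guard<std::mutex> lock(mtx);
        pending.push(msg);
    }

private:
    void loop()
    {
        while (true)
        {
            std::this_thread::sleep_for(period); // 1-5 ms batching window

            std::vector<CanMessage> batch;

            {
                std::lock_guard<std::mutex> lock(mtx);
                if (!running) break;
                for (; !pending.empty(); pending.pop()) batch.push_back(pending.front());
            }

            // One multi-message write per period, taking advantage of the TX buffer.
            if (!batch.empty()) writeBatch(batch);
        }
    }

    void writeBatch(const std::vector<CanMessage> &) { /* hand over to the CAN driver */ }

    std::chrono::milliseconds period;
    bool running;
    std::queue<CanMessage> pending;
    std::mutex mtx;
    std::thread worker;
};
```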
@PeterBowman
Member Author

PeterBowman commented Oct 10, 2019

Parallelization could be easily achieved with <future>:

auto f = std::async(std::launch::async, observer_timer{MILLIS, [&] { return emptyStateObserver.notify(); }});

I'd fancy an abstract DeviceMapper class and SequentialDeviceMapper + ConcurrentDeviceMapper subclasses, such that the latter manages a pool of threads given an arbitrary number of callbacks (one per joint).
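
A rough sketch of that hierarchy (the dispatch() signature and the per-joint callback type are assumptions, not the final API):

```cpp
#include <functional>
#include <future>
#include <vector>

// Base class: runs one callback per joint and reports overall success.
class DeviceMapper
{
public:
    virtual ~DeviceMapper() = default;
    virtual bool dispatch(const std::vector<std::function<bool()>> & tasks) = 0;
};

// Plain single-threaded traversal.
class SequentialDeviceMapper : public DeviceMapper
{
public:
    bool dispatch(const std::vector<std::function<bool()>> & tasks) override
    {
        bool ok = true;
        for (const auto & task : tasks) ok = task() && ok;
        return ok;
    }
};

// One thread per callback via std::async; a real implementation would cap
// the number of threads (see the thread-limit discussion below).
class ConcurrentDeviceMapper : public DeviceMapper
{
public:
    bool dispatch(const std::vector<std::function<bool()>> & tasks) override
    {
        std::vector<std::future<bool>> futures;
        futures.reserve(tasks.size());

        for (const auto & task : tasks)
            futures.push_back(std::async(std::launch::async, task));

        bool ok = true;
        for (auto & f : futures) ok = f.get() && ok;
        return ok;
    }
};
```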

@PeterBowman PeterBowman self-assigned this Oct 10, 2019
@PeterBowman PeterBowman mentioned this issue Oct 11, 2019
@PeterBowman
Member Author

C++17 introduces execution policies, which allow easy switching between sequential and parallel loops via the std::for_each algorithm: SO answer.

In C++11, I have to resort to playing with std::launch policies in std::async.
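
Both variants in one sketch, assuming a hypothetical Joint type with a sendTarget() member:

```cpp
#include <algorithm>
#include <execution> // C++17
#include <future>
#include <vector>

struct Joint { void sendTarget() { /* per-joint CAN write */ } }; // hypothetical

// C++17: switching std::execution::par to std::execution::seq falls back to a
// sequential loop with no other code changes.
void setTargetsCpp17(std::vector<Joint> & joints)
{
    std::for_each(std::execution::par, joints.begin(), joints.end(),
                  [](Joint & j) { j.sendTarget(); });
}

// C++11: the launch policy plays a similar role; std::launch::deferred tasks
// run lazily in the calling thread on get(), i.e. sequentially.
void setTargetsCpp11(std::vector<Joint> & joints, bool parallel)
{
    auto policy = parallel ? std::launch::async : std::launch::deferred;
    std::vector<std::future<void>> futures;
    futures.reserve(joints.size());

    for (auto & j : joints)
        futures.push_back(std::async(policy, [&j] { j.sendTarget(); }));

    for (auto & f : futures)
        f.get();
}
```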

@PeterBowman
Member Author

PeterBowman commented Oct 14, 2019

The DeviceMapper class was added in de80526. I plan on moving it to its own library (static/shared), writing unit tests and enabling parallelization (currently WIP at 7a7bf3f).

Not sure about enabling thread limits, i.e. fixing the max number of parallel tasks this class may start so that additional tasks are enqueued for later.

SO Q/A regarding static/shared library performance:

  • Q: (...) once the initialization and all has happened, does the function calling and execution take longer in case of dynamic libraries than static libraries?
  • A: It might. However, even if it does, the performance difference is really negligible.

Regarding thread limits: SO, reddit.

Two nice blog entries about C++11 move semantics: ref1, ref2.

@PeterBowman
Member Author

PeterBowman commented Nov 6, 2019

Quite interesting links regarding template specialization/instantiation can be found in 8dcd9ec.

@PeterBowman
Member Author

Ready at 3c6efbb.

> Not sure about enabling thread limits, i.e. fixing the max number of parallel tasks this class may start so that additional tasks are enqueued for later.

Using the sequential policy by default (single-threaded), an arbitrarily large number of threads can be enabled after instantiation via a YARP property.
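
A hypothetical configuration sketch; the actual property key consumed by CanBusControlboard may differ:

```cpp
#include <yarp/dev/PolyDriver.h>
#include <yarp/os/Property.h>

int main()
{
    yarp::os::Property config;
    config.put("device", "CanBusControlboard");
    config.put("parallel-tasks", 4); // hypothetical key; omit it to keep the default sequential policy

    yarp::dev::PolyDriver driver(config);
    return driver.isValid() ? 0 : 1;
}
```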

@PeterBowman
Member Author

> Ready at 3c6efbb.

This is a naive solution: it splits the requested callbacks into two groups, those executed asynchronously in threads and the remaining sequential ones. Ideally, freed threads should pick up new tasks. This is achieved with a nice thread pool library borrowed from https://github.com/vit-vit/CTPL: 403ade0. Results are even superior under equal conditions (same number of tasks and threads): FutureTask.zip.
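
Usage is pleasantly small; a sketch with placeholder task bodies (CTPL passes the worker thread id as the callback's first argument):

```cpp
#include <future>
#include <vector>

#include "ctpl_stl.h" // https://github.com/vit-vit/CTPL

int main()
{
    ctpl::thread_pool pool(4); // fixed worker count; freed threads pick up queued tasks

    std::vector<std::future<bool>> futures;

    for (int joint = 0; joint < 12; joint++)
    {
        futures.push_back(pool.push([joint](int /*threadId*/)
        {
            return true; // placeholder for the per-joint request
        }));
    }

    bool ok = true;
    for (auto & f : futures) ok = f.get() && ok;
    return ok ? 0 : 1;
}
```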

@PeterBowman
Member Author

I realized this parallelization is actually meaningless, since high-frequency commands are further transmitted via the TPDO protocol (#232). It only made sense for SDO multi-joint requests due to the confirmed nature of these transfers (every SDO client request/indication awaits a response/confirmation from the SDO server): sequentially applying this request-and-wait scheme to several joints makes the overall wait grow linearly, i.e. n joints with a per-transfer round trip of t cost roughly n·t, whereas a parallel dispatch stays close to t.

Cons:

  • SDOs are low-priority commands that should not interfere with high-frequency TPDOs.
  • Actually, SDO requests should rarely happen in normal operation.
  • Incidental parallelization of TPDOs could quickly become detrimental to the application's performance due to the thread synchronization involved in registering new outgoing messages in the TX queue.

Therefore, I'm definitely disabling this feature for YARP motor commands. However, I found it actually helpful for synchronizing multiple CAN buses (#226): 4745561.

@PeterBowman
Member Author

Note that YARP's fakeMotionControl device showcases a similar technique with a sequential implementation, split into specific interfaces: yarp::dev::ImplementPositionControl, yarp::dev::ImplementVelocityControl... Therefore, most .cpp files in the CanBusControlboard tree could be removed in favor of having our class inherit from the aforementioned YARP concrete classes, assuming proper joint index configuration on init.
