Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transmit delays in the new ICanBus arquitecture #191

Closed
PeterBowman opened this issue Jul 31, 2018 · 8 comments
Closed

Transmit delays in the new ICanBus arquitecture #191

PeterBowman opened this issue Jul 31, 2018 · 8 comments

Comments

@PeterBowman
Copy link
Member

PeterBowman commented Jul 31, 2018

As a result of #158 (which also affects #190), excessive delays have been observed during the initiallization of CanBusControlboard (e.g. via launchManipulation). We noticed this during usual operation (after init, before close), too, when fewer CAN nodes are opened, e.g. one iPOS and no CUIs attached to it. Regarding the recently reworked CanBusHico device, this instruction might block further operation up to a few seconds:

What is happening behind the scenes: a background thread owned by CanBusControlboard is constantly performing calling canRead once and again. This is a high-rate operation with no delays involved. In the main thread, we (occasionally) want to call canWrite, which is prone to be blocked by canRead as a common semaphore object avoids simultaneous execution of both methods at the device level (::read(), ::write(), ::select()). However, I noticed that after canRead returns (and the semaphore is released), the reading thread enters it again and locks the semaphore even if there was a pending canWrite call already waiting at canBusReady.wait() (our semaphore, see previous snippet). Using a yarp::os::Mutex object renders the same result.

Ideas:

  • Don't block reads and writes simultaneously, use a different semaphore for each one. Per the HiCO.CAN driver manual, sec. 3.5 (Multiprocessing):

    The driver can be opened used by multiple processes/threads simultaneously. When opening CAN nodes for use with multiple processes, extra care has to be taken with ioctl calls such as IOC_RESET_BOARD, IOC_RESET_TIMESTAMP, since these affect both CAN nodes on the board.

    Although the driver is multithread and multiprocessing capable, it is recommended to use only one reader/write per CAN node.

    However, I don't think we can proceed in this way with other CAN devices, e.g. CanBusPeak, which wrap internal system calls in their own API.

  • Adjust thread priority: yarp::os::Thread::setPriority (same for yarp::os::RateThread, see also getPolicy() and https://stackoverflow.com/q/18884510).

@PeterBowman
Copy link
Member Author

PeterBowman commented Aug 6, 2018

Sample app resembling the behavior we want to achieve:

#include <iostream>

#include <yarp/os/Network.h>
#include <yarp/os/RateThread.h>
#include <yarp/os/Semaphore.h>
#include <yarp/os/Thread.h>
#include <yarp/os/Time.h>

struct MyClass
{
    void read() const
    {
        double start = yarp::os::Time::now();
        ioReady.wait();
        // perform some non-CPU-intensive operations
        int c = 0;
        for (int i = 0; i < 1000; i++) { c *= 2; }
        std::cout << "read-wait: " << (yarp::os::Time::now() - start) << std::endl;
        ioReady.post();
    }

    void write() const
    {
        double start = yarp::os::Time::now();
        std::cout << "write-wait" << std::endl;
        ioReady.wait();
        std::cout << "write-wait: " << (yarp::os::Time::now() - start) << std::endl;
        ioReady.post();
    }

    mutable yarp::os::Semaphore ioReady;
};

struct ReadThread : public yarp::os::Thread
{
    ReadThread(const MyClass & _mc) : mc(_mc) {}

    virtual bool start()
    {
        std::cout << "Starting read thread..." << std::endl;
        return yarp::os::Thread::start();
    }

    virtual bool stop()
    {
        std::cout << "Stopping read thread..." << std::endl;
        return yarp::os::Thread::stop();
    }

    virtual void run()
    {
        while (!isStopping())
        {
            mc.read();
        }
    }

    const MyClass & mc;
};

struct WriteThread : public yarp::os::RateThread
{
    WriteThread(const MyClass & _mc) : yarp::os::RateThread(10), mc(_mc) {}

    virtual bool start()
    {
        std::cout << "Starting write thread..." << std::endl;
        return yarp::os::RateThread::start();
    }

    virtual void stop()
    {
        std::cout << "Stopping write thread..." << std::endl;
        yarp::os::RateThread::stop();
    }

    virtual void run()
    {
        mc.write();
    }

    const MyClass & mc;
};

int main()
{
    yarp::os::Network yarp;

    MyClass mc;
    ReadThread rt(mc);
    WriteThread wt(mc);

    std::cout << "Initializing threads..." << std::endl;

    if (!rt.start())
    {
        std::cout << "Unable to start read thread." << std::endl;
        return 1;
    }

    if (!wt.start())
    {
        std::cout << "Unable to start write thread." << std::endl;
        return 1;
    }

    std::cout << "Setting thread priority..." << std::endl;

    // play with these
    if (rt.setPriority(0, 0) == -1 || wt.setPriority(99, 1) == -1)
    {
        std::cout << "The given priority cannot be set." << std::endl;
    }
    else
    {
        std::cout << "Waiting..." << std::endl;
        yarp::os::Time::delay(5);
    }

    std::cout << "Stopping threads..." << std::endl;

    rt.stop();
    wt.stop();

    std::cout << "Exiting..." << std::endl;

    return 0;
}
cmake_minimum_required(VERSION 3.5 FATAL_ERROR)
project(yarp-threads LANGUAGES CXX)
find_package(YARP REQUIRED)
add_executable(main main.cpp)
target_link_libraries(main YARP::YARP_OS YARP::YARP_init)

@PeterBowman
Copy link
Member Author

PeterBowman commented Aug 6, 2018

I'm unable to boost the write thread's priority so that the wait lock is minimized. Info:

It looks like the priority for SCHED_NORMAL (SCHED_OTHER?) cannot be other than 0. Also, changing to SCHED_FIFO requires sudo.

@jgvictores
Copy link
Member

changing to SCHED_FIFO requires sudo

I guess we could play around with groups/permissions but yes, I ran into that sort of issue some years ago.

@PeterBowman
Copy link
Member Author

PeterBowman commented Aug 20, 2018

Alternatively, use yarp::os::Time::yield() (same as yarp::os::Thread::yield()):

I guess we could try to put a yield() call in the body of the Thread::run() method. Moreover, this is what RateThread::run() (aka PeriodicThread) does internally according to the docs:

Note: after each run is completed, the thread will call a yield() in order to facilitate other threads to run.

See implementation.

@PeterBowman
Copy link
Member Author

PeterBowman commented Aug 22, 2018

Since yarp::os::RateThread calls yield internally, this snippet should resemble more closely what CanBusControlboard is actually doing:

#include <atomic>
#include <iostream>
#include <limits>

#include <yarp/os/Mutex.h>
#include <yarp/os/Network.h>
#include <yarp/os/Thread.h>
#include <yarp/os/Time.h>

void doWork()
{
    int c = 0;

    for (int i = 0; i < 1000; i++)
    {
        c *= 2;
    }
}

struct MyClass
{
    MyClass() : concurrentReads(0) {}

    void read() const
    {
        double start = yarp::os::Time::now();
        ioReady.lock();
        concurrentReads++;
        doWork();
        std::cout << "read-wait: " << (yarp::os::Time::now() - start) << std::endl;
        ioReady.unlock();
    }

    void write() const
    {
        double start = yarp::os::Time::now();
        std::cout << "write-wait" << std::endl;
        concurrentReads = 0;
        ioReady.lock();
        std::cout << "write-wait: " << (yarp::os::Time::now() - start) << " (" << concurrentReads << ")" << std::endl;
        ioReady.unlock();
    }

    mutable yarp::os::Mutex ioReady;
    mutable std::atomic<int> concurrentReads;
};

struct ReadThread : public yarp::os::Thread
{
    ReadThread(const MyClass & _mc) : mc(_mc) {}

    virtual bool start()
    {
        std::cout << "Starting read thread..." << std::endl;
        return yarp::os::Thread::start();
    }

    virtual bool stop()
    {
        std::cout << "Stopping read thread..." << std::endl;
        return yarp::os::Thread::stop();
    }

    virtual void run()
    {
        while (!isStopping())
        {
            mc.read();
            yarp::os::Thread::yield();
            //double start = yarp::os::Time::now();
            //yarp::os::Time::delay(std::numeric_limits<double>::min());
            //std::cout << "delay-wait: " << (yarp::os::Time::now() - start) << std::endl;
        }
    }

    const MyClass & mc;
};

int main()
{
    std::cout << std::numeric_limits<double>::min() << std::endl;
    yarp::os::Network yarp;

    MyClass mc;
    ReadThread rt(mc);

    std::cout << "Initializing thread..." << std::endl;

    if (!rt.start())
    {
        std::cout << "Unable to start read thread." << std::endl;
        return 1;
    }

    for (int i = 0; i < 20; i++)
    {
        mc.write();
        yarp::os::Time::delay(0.25);
    }

    std::cout << "Stopping thread..." << std::endl;

    rt.stop();

    std::cout << "Exiting..." << std::endl;

    return 0;
}

I introduced an atomic variable (requires C++11) just to count how many times the read thread predates execution of the already waiting write thread. This should be zero, but in fact reaches even a few thousands (!). However, if I introduce a minimal delay, it behaves as expected.

To sum up, I guess everything boils down to the read thread consuming too many resources on its own. Running the previous snippet leads to a 80+% CPU usage on my PC, but with said delay it drops down to 20%. Of course, more code that we want to execute inside the fast read loop (even the stdout operations can pose a notable difference) translates to a bigger chance for the processor to "lend" the slower write loop some time slices and do its work. I'd like to compare current performance with that prior to #158, then introduce that small delay.

Note that even the tiniest delay must take into account system calls. For instance, it takes around 0.05 milliseconds to process the delay from the previous snippet on my PC. See https://stackoverflow.com/q/18071664.

Also, an interesting read about condition_variable: https://stackoverflow.com/a/10975146.

@PeterBowman
Copy link
Member Author

Another note to self: check whether I can increment the read timeout, and how it affects CPU usage (is it equivalent to a sleep?).

@PeterBowman
Copy link
Member Author

In case I ever come across this again: yarp::os::Time::delay(0); does not solve anything, it is best to add a minimum delay (std::numeric_limits<double>::min()) instead, as stated in the last example.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jan 8, 2020

Also, YARP's PeriodicThread class (former RateThread) steals 100% CPU time on a single read thread with tiny periods (anything more precise than 1 microsecond is not feasible): d2d1da5.

Edit: see #231 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants