Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

launch not killing processes started with shell=True #545

Open
v-lopez opened this issue Oct 27, 2021 · 14 comments · May be fixed by #632
Open

launch not killing processes started with shell=True #545

v-lopez opened this issue Oct 27, 2021 · 14 comments · May be fixed by #632
Assignees
Labels
enhancement New feature or request question Further information is requested

Comments

@v-lopez
Copy link
Contributor

v-lopez commented Oct 27, 2021

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • Foxy 0.10.6-1focal.20210901.033555
  • DDS implementation:
    • Fast

Steps to reproduce issue

import launch_testing
import unittest
import os

from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument, ExecuteProcess
from launch.substitutions import PathJoinSubstitution, LaunchConfiguration
from launch_ros.actions import Node
from launch.actions import IncludeLaunchDescription
from launch.substitutions import PathJoinSubstitution
from launch_ros.substitutions import FindPackageShare
from launch.launch_description_sources import PythonLaunchDescriptionSource



def generate_test_description():
    # This is necessary to get unbuffered output from the process under test
    proc_env = os.environ.copy()
    proc_env['PYTHONUNBUFFERED'] = '1'

    dut_process = ExecuteProcess(cmd=['/usr/bin/sleep', '5']) 
    pkg_dir = FindPackageShare('gazebo_ros')
    full_path = PathJoinSubstitution([pkg_dir] + ['launch', 'gazebo.launch.py'])

    gazebo_launch = IncludeLaunchDescription(
        PythonLaunchDescriptionSource(
            full_path))

    return LaunchDescription([
        gazebo_launch,
        dut_process,

        launch_testing.actions.ReadyToTest(),
    ]), {'dut_process': dut_process}


class TestWait(unittest.TestCase):

    def test_wait_for_end(self, proc_output, proc_info, dut_process):
        # Wait until process ends
        proc_info.assertWaitForShutdown(process=dut_process, timeout=10)


@ launch_testing.post_shutdown_test()
class TestSuccessfulExit(unittest.TestCase):

    def test_exit_code(self, proc_info, dut_process):
        # Check that dut_process finishes with code 0
        launch_testing.asserts.assertExitCodes(proc_info, process=dut_process)

Execute it with launch_test or as part of add_launch_test in CMake.

Expected behavior

When it ends, gazebo should be killed.

Actual behavior

Gazebo is not killed.

Additional information

Happens with other nodes besides gazebo. But gazebo is particularly troublesome because I cannot have 2 tests running sequentially using gazebo, because the first test didn't kill the server, and only one server may be running in a machine.

Might be related to #495 because if instead of launch_testing I use ros2 launch gazebo_ros gazebo.launch.py, I can kill it properly with Ctrl+C, but not with kill -SIGINT. Even if I use ros2 launch gazebo_ros gazebo.launch.py --noninteractive

v-lopez pushed a commit to pal-robotics/pmb2_simulation that referenced this issue Oct 27, 2021
@hidmic
Copy link
Contributor

hidmic commented Oct 27, 2021

Hmm, @v-lopez this looks like Gazebo not forwarding signals properly. Would you mind trying to force launch_testing's LaunchService instance to be non-interactive? If that fixes the issue, we can submit it as a patch.

@hidmic hidmic added the more-information-needed Further information is required label Oct 27, 2021
@v-lopez
Copy link
Contributor Author

v-lopez commented Oct 29, 2021

I tried that (and verified that it was not being overwritten to False), and the behavior was the same.

@benjinne
Copy link

I have the same problem with a standalone unity executable not being killed, but I can kill all the processes correctly if I Ctrl+C before the test is complete

@hidmic hidmic added bug Something isn't working and removed more-information-needed Further information is required labels Dec 7, 2021
@clalancette clalancette added the help wanted Extra attention is needed label Dec 23, 2021
@benjinne
Copy link

benjinne commented Jan 11, 2022

My current workaround is to add a post_shutdown_test to kill the remaining processes

@launch_testing.post_shutdown_test()
class TestShutdown(unittest.TestCase):
    def test_kill_sim(self):
        unity_pid = check_output(["pidof", "-z", "unity_executable.x86_64"])
        print('got unity pid')
        unity_pid = int(unity_pid)
        print('unity pid is: ' + str(unity_pid))
        os.kill(unity_pid, signal.SIGKILL)
        print('Killed unity')

I think the reason this is happening is because the executable is ran from a series of launch files, then the last launch file calls a bash script that uses ros2 run unity_executable.x86_64. I can provide more details if needed

@jacobperron
Copy link
Member

I'm experiencing this issue with gzserver, gzclient, and rqt. Here's a minimal reproducible example with Gazebo:

import sys
import time
import unittest

import launch
from launch.actions import IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch_ros.substitutions import FindPackageShare
from launch_testing.actions import ReadyToTest
import launch_testing.markers
import pytest


@pytest.mark.launch_test
def generate_test_description():
    return launch.LaunchDescription(
        [
            IncludeLaunchDescription(
                PythonLaunchDescriptionSource([FindPackageShare('gazebo_ros'), '/launch/gazebo.launch.py'])
            ),
            ReadyToTest(),
        ])


class MyTestFixture(unittest.TestCase):

    def test_something(self):
        time.sleep(10.0)
        print('END TEST', file=sys.stderr)

@adityapande-1995
Copy link
Contributor

adityapande-1995 commented Apr 28, 2022

We were facing a similar issue in ignition fortress, where launch was not able to kill the gazebo process. As @hidmic mentioned, it was related to Gazebo not forwarding signals. That was fixed using this PR, not sure if that is related : gazebosim/gz-launch#151

@adityapande-1995
Copy link
Contributor

I'm experiencing this issue with gzserver, gzclient, and rqt. Here's a minimal reproducible example with Gazebo:

import sys
import time
import unittest

import launch
from launch.actions import IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch_ros.substitutions import FindPackageShare
from launch_testing.actions import ReadyToTest
import launch_testing.markers
import pytest


@pytest.mark.launch_test
def generate_test_description():
    return launch.LaunchDescription(
        [
            IncludeLaunchDescription(
                PythonLaunchDescriptionSource([FindPackageShare('gazebo_ros'), '/launch/gazebo.launch.py'])
            ),
            ReadyToTest(),
        ])


class MyTestFixture(unittest.TestCase):

    def test_something(self):
        time.sleep(10.0)
        print('END TEST', file=sys.stderr)

Fix here : ros-simulation/gazebo_ros_pkgs#1376

@adityapande-1995 adityapande-1995 self-assigned this Apr 28, 2022
@adityapande-1995
Copy link
Contributor

Another minimal example :

import sys
import time
import unittest

import launch
from launch.actions import IncludeLaunchDescription, ExecuteProcess
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch_ros.substitutions import FindPackageShare
from launch_testing.actions import ReadyToTest
import launch_testing.markers
import pytest


@pytest.mark.launch_test
def generate_test_description():
    return launch.LaunchDescription(
        [
            ExecuteProcess(
                cmd=['python3','test1.py'],
                shell=True,
                output='both'
            ),
            ExecuteProcess(
                cmd=['python3','test2.py'],
                shell=False,
                output='both'
            ),
            ReadyToTest(),
        ])


class MyTestFixture(unittest.TestCase):

    def test_something(self):
        time.sleep(10.0)
        print('END TEST', file=sys.stderr)

Where both test files are just :

import time

while 1 :
    time.sleep(0.5)
    print("Hello 1")

test1.py does not terminate, yet test2.py does.

@ivanpauno
Copy link
Member

This is not a problem IMO.
When you use shell=True, you're running a shell process, which in this case will run python3 test1.py as a subprocess.
Launch is correctly sending a signal to the shell process, but that one is not forwarding it to its subprocess.

You probably can fix that be trapping SIGINT/SIGTERM and killing subprocesses before exiting the shell (e.g. with kill -$SIGNAL $(jobs -p)).

I don't think we can do anything by default in launch that's reasonable, people should be careful when using shell=True.

@ivanpauno ivanpauno changed the title launch_testing not killing properly included launch descriptions launch_testing not killing processes started with shell=True Jul 15, 2022
@ivanpauno ivanpauno added question Further information is requested and removed bug Something isn't working help wanted Extra attention is needed labels Jul 15, 2022
@ivanpauno
Copy link
Member

I have updated the issue title, as I don't think the original one was correct.

@ivanpauno ivanpauno changed the title launch_testing not killing processes started with shell=True launch not killing processes started with shell=True Jul 21, 2022
@ivanpauno
Copy link
Member

This is easy to reproduce from a pure launch file:

ros2 launch ./test.launch.py  &
kill -SIGINT $!
# ./test.launch.py
import launch
from launch.actions import ExecuteProcess

PYTHON_SCRIPT="""\
import time

while 1:
    time.sleep(0.5)
"""

def generate_launch_description():
    return launch.LaunchDescription(
        [
            ExecuteProcess(
                cmd=['python3', '-c', f'"{PYTHON_SCRIPT}"'],
                shell=True,
                output='both'
            ),
            ExecuteProcess(
                cmd=['python3', '-c', f'{PYTHON_SCRIPT}'],
                shell=False,
                output='both'
            ),
        ])

after waiting the launch process to finish (it will take some seconds and log some stuff), it's easy to check that the second process was killed and the first wasn't (e.g. using ps -elf).

Note:

  • The issue isn't noticed if instead of using kill -SIGINT $! the user presses ctrl+c.
    That's because pressing ctrl+c will send a SIGINT signal to all the processes of the group (all the processes launch from the launch file).
  • Currently if SIGTERM is sent, instead of SIGINT, no process is being killed. That seems like a separate issue that should be handled as well.
  • We could maybe try to send a signal to the whole group (in the platforms that support this), or to find all the children of the launch process recursively (e.g. with psutil) to solve the problem.

I will give it a try to the last option, but I'm not sure how multiplatform that's going to be.

cc @jacobperron

@ivanpauno ivanpauno linked a pull request Jul 25, 2022 that will close this issue
@ivanpauno ivanpauno added the enhancement New feature or request label Jul 25, 2022
@ivanpauno
Copy link
Member

I have opened #632, which will fix the issue.

@ros-discourse
Copy link

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/sequentially-starting-launch-files-using-ros-2-python-when-a-specific-log-message-is-detected/30633/3

@abaeyens
Copy link

This bug is unfortunately still present (ROS 2 Jazzy), and problematic when setting up automated integration testing with launch_testing as processes launched as part of a test are not shut down, potentially contaminating following tests. Could be mistaken but #757 has the same root cause.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants