
[Linux/2.0.3] ubridge is extremely greedy when a cloud node with a loopback interface is linked to a qemu node #36

Open · jean-christophe-manciot opened this issue Jun 30, 2017 · 18 comments

jean-christophe-manciot commented Jun 30, 2017

GNS3 gui/server 2.0.3
ubridge 0.9.11

GNS3 is launched, but no project is open:
[screenshot: before opening project]

Opening an existing 2.0.3 project with a single non-running IOS-XRv 6.1.2 qemu node:
[screenshot: after opening an IOS-XRv project]

Checking which process needs so much CPU:
[screenshot: process usage]
[screenshot: project]

ubridge should NOT be running, since no node is currently running.
If I load another project with other types of qemu nodes, such as ASAv or CSR 1000v, ubridge is not running.

@jean-christophe-manciot jean-christophe-manciot changed the title [Linux/2.0.3] ubridge is extremely greedy when opening a project with a qemu IOS-XRv [Linux/2.0.3] ubridge is extremely greedy when opening a project with a IOS-XRv qemu node Jun 30, 2017

grossmj commented Jun 30, 2017

The 2 uBridge processes you see are for the cloud and the Ansible node.

uBridge will use CPU power to receive and send packets. It makes sense that the one for the cloud does, since it must be receiving packets from your network. However, it doesn't make sense for your Ansible node.

Can you isolate the issue to 1 or 2 nodes?

jean-christophe-manciot commented Jun 30, 2017

First off, here is the log for the previous topology:
gns3_server.log.txt
If I remove the cloud & the ASAv, one of the two ubridge processes stops by itself and I see this in the log:

2017-06-30 17:09:17 INFO base_node.py:555 Stopping uBridge hypervisor 127.0.0.1:48915
2017-06-30 17:09:17 INFO hypervisor.py:197 Stopping uBridge process PID=25222

@jean-christophe-manciot

In another topology, with 2 ASAv connected to 2 CSR 1000v, there is no such issue:
[screenshot: another project]
[screenshot: process usage]

Either there is something special with the IOS-XRv, or the cloud-ASAv connection excites ubridge ;)

@jean-christophe-manciot

Yes, the latter is correct: the issue is still there with only the following 2 nodes:
[screenshot: another project]
[screenshot: process usage]

jean-christophe-manciot commented Jun 30, 2017

@grossmj
You're saying that in the last topology, ubridge must be running.
The issue for me is that it uses way too much CPU power; sitting and waiting for packets should use very little CPU.
Also, in that case, the "cloud" is the loopback interface lo.
This reminds me that I saw in Wireshark a lot of unexplained ping packets going from a null MAC address to a null MAC address. Does ubridge send these packets? That could explain a loop and the high CPU usage.
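
For reference, a quick way to check for these null-MAC ICMP frames is a short capture script. This is a diagnostic sketch (assuming the third-party scapy package is installed and root privileges), not part of GNS3:

    # Print ICMP frames with a null source MAC seen on lo.
    from scapy.all import Ether, sniff

    def report(pkt):
        if Ether in pkt and pkt[Ether].src == "00:00:00:00:00:00":
            print(pkt.summary())

    # Capture ICMP traffic on the loopback interface for 10 seconds.
    sniff(iface="lo", filter="icmp", prn=report, timeout=10)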

jean-christophe-manciot commented Jun 30, 2017

If I connect the ASAv to a tap interface on the cloud instead, the issue is gone.
No pings travel on the connection.
FYI, the ubridge process does not run in that case, although it could potentially receive packets from the tap interface, which is connected to a NATed Linux bridge with Internet access:
[screenshot: process usage]

In conclusion, the issue comes from the choice of lo on the cloud node.
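
For anyone hitting the same problem, the tap workaround can be prepared with the usual ip(8) commands before attaching the cloud node to it. A minimal sketch using Python's subprocess (the tap and bridge names here are hypothetical; use whatever matches your setup):

    # Create a tap, bring it up, and attach it to an existing NAT bridge
    # (requires root privileges).
    import subprocess

    TAP = "tap0"        # hypothetical tap device name
    BRIDGE = "virbr0"   # hypothetical NATed Linux bridge

    subprocess.run(["ip", "tuntap", "add", "dev", TAP, "mode", "tap"], check=True)
    subprocess.run(["ip", "link", "set", TAP, "up"], check=True)
    subprocess.run(["ip", "link", "set", TAP, "master", BRIDGE], check=True)

The cloud node can then be pointed at tap0 instead of lo.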

@jean-christophe-manciot jean-christophe-manciot changed the title [Linux/2.0.3] ubridge is extremely greedy when opening a project with a IOS-XRv qemu node [Linux/2.0.3] ubridge is extremely greedy when a cloud node with a loopback interface is linked to a qemu node Jul 1, 2017

grossmj commented Jul 3, 2017

Yes, the cloud is powered by uBridge. There must be a lot of traffic on the loopback interface, which explains why uBridge takes resources to read it (this can be confirmed with a Wireshark capture on the lo interface).

Indeed, the best way is to use a TAP interface instead.

@jean-christophe-manciot

No, the excess traffic is directly produced by ubridge somehow:
Traffic on lo without GNS3 running: 15 kbit/s
[screenshot: bandwidth usage without GNS3]

Traffic on lo with GNS3 running and a stopped qemu node connected to a cloud through lo: 92 Mbit/s
[screenshot: bandwidth usage with GNS3]

In particular, there is a flood of new, unexplained ICMP frames:
[screenshot: pings]

And UDP ports 15000/15001 are used by ubridge:
[screenshot: tunneling ports range]
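
For context, those ports match the way GNS3 links nodes: each endpoint of a link gets a local UDP socket, and raw frames are carried as UDP payloads between the two ports. A toy illustration (not uBridge code; the port numbers are taken from the capture above):

    # Two UDP sockets on 127.0.0.1 emulating the two tunnel endpoints.
    import socket

    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    a.bind(("127.0.0.1", 15000))    # endpoint A (e.g. the Qemu side)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b.bind(("127.0.0.1", 15001))    # endpoint B (e.g. the uBridge side)

    a.sendto(b"raw-ethernet-frame", ("127.0.0.1", 15001))  # A -> B
    print(b.recvfrom(2048))         # B receives the encapsulated frame

Since these datagrams travel over 127.0.0.1, they are themselves visible to anything capturing on lo.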

grossmj commented Jul 3, 2017

I will have a closer look.

@julien-duponchelle

I think it's "normal". On lo we have traffic between Qemu and ubridge, and after that between the two ubridge processes.

Ubridge 2 = cloud node connected to lo
Ubridge 1 = connected to Qemu

  • Qemu will send a ping
  • Ubridge 1 will receive it via UDP on 127.0.0.1
  • Ubridge 2 will also see the ping on lo and will resend it to Ubridge 1; ubridge has no way to know where a packet is coming from (see the sketch below)

We could create a tap for lo, but we also have the issue with other interfaces. We already create a tap if you try to connect to a bridge:
https://github.com/GNS3/gns3-server/blob/master/gns3server/compute/builtin/nodes/cloud.py#L247
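
To make the loop concrete, here is a toy model (plain Python, not uBridge code) of why a bridge that both captures on lo and sends its tunnel packets over 127.0.0.1 ends up feeding on its own output:

    # "lo" is modelled as a queue of frames. Every frame Ubridge 2
    # forwards over the UDP tunnel travels via 127.0.0.1, so it shows up
    # on lo again and gets captured and forwarded once more: a single
    # ping circulates forever.
    from collections import deque

    lo = deque(["ping-from-qemu"])   # the one ping Qemu sent
    forwarded = 0

    while lo and forwarded < 10:     # capped here; the real loop never stops
        frame = lo.popleft()         # Ubridge 2 captures the frame on lo
        lo.append(frame)             # its UDP resend reappears on lo
        forwarded += 1
        print(f"forwarded copy #{forwarded} of {frame!r}")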

grossmj commented Jul 4, 2017

@noplay Yes, I think this is what is going on. I will do a quick test to confirm.

@grossmj grossmj self-assigned this Jul 8, 2017

grossmj commented Jul 10, 2017

I could not reproduce the issue with one ASAv (not running) attached to a "cloud" on the lo interface. I do not see any excess traffic. I would like to understand why you get a loop when using a loopback interface.

grossmj commented Jul 10, 2017

Actually, I can reproduce that issue (except for the high CPU usage). Somehow, when a uBridge process is attached to the loopback interface, ICMP packets are generated. I suspect an issue with libpcap.

jean-christophe-manciot commented Jul 10, 2017

Some more information about my setup:
qemu 2.8.1 (Debian 1:2.8+dfsg-6)
libpcap 1.8.1-3

grossmj commented Jul 10, 2017

Thanks. I have run more tests and found out that you don't even need a cloud to get those strange packets on the lo interface. Running GNS3 with 2 VPCS nodes connected back-to-back is enough to trigger that traffic.

[screenshot: back_to_back]

[screenshot: strange_traffic]

grossmj commented Jul 10, 2017

Actually, just running GNS3 is enough. This is traffic generated by the GNS3 server WebSocket connection (notice port 3080 in the capture).

[screenshot: websocket]

I don't think this is an issue.

@jean-christophe-manciot

@grossmj
The traffic you show has nothing to do with this issue, which is about endless unsolicited ICMP ping packets looping, generating way too much traffic and consuming way too much CPU when a qemu node is connected to a loopback interface, which happens to be the same interface on which GNS3 listens.

However, I do notice a few pings generated automatically and endlessly as soon as GNS3 is launched, before any project is loaded.
These pings are for sure generated by GNS3, from the local host to the local host, towards a hard-coded 3080 port, despite the fact that the port in my ~/.config/GNS3/gns3-server.conf is set to 9000.
No such pings are generated outside of GNS3:
[screenshot: pings]

There is also some non-ICMP traffic on port 9000, but I guess this is expected.
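
As an aside, the port the server config actually sets can be double-checked with a few lines of Python. A sketch; the [Server]/port names here are assumptions based on a standard GNS3 2.x config file:

    # Print the port set in the server config, or the default if unset.
    import configparser, os

    cfg = configparser.ConfigParser()
    cfg.read(os.path.expanduser("~/.config/GNS3/gns3-server.conf"))
    print(cfg.get("Server", "port", fallback="3080 (default)"))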

@julien-duponchelle

GNS3 doesn't send pings. There is no code in the current GNS3/ubridge that crafts an ICMP packet.

Do you see that after starting GNS3, when you have no topology open?
