Support eBPF live filtering on Linux #1234
Comments
For posterity, this originates from the-tcpdump-group/libpcap#1379. From the "it would be cool" point of view, it would make sense to design such a new feature in a way to allow using cbpf-savefile(5) as a pre-compiled filter as well. In this case the pre-compiled bytecode would have to go to libpcap for the usual processing, including passing it to the kernel on certain OSes (what could possibly go wrong...).
From the "how to keep this feature working long-term?" point of view, such a feature would require some amount of work for the initial implementation and documentation, and then some recurring amount of work to support and maintain it. If the feature becomes popular, the maintenance will require notable amounts of skills and time because the intended audience would be network developers that work on advanced use cases and therefore create or run into advanced problems. Or because the VLAN headers will migrate around. Or because year 2038 will come faster than expected, etc.
Ideally there should be a developer that has a sound use case for this (at least remotely) and the time to get it done properly. Until then it would help at least to document a definition of done. For example, what would be the usual diagnostic steps to tell if a problem belongs to the loaded eBPF bytecode or elsewhere? What statistics would an eBPF filter return?
I think you're overdoing it in trying to "bring in" a chunk of responsibility that currently belongs to the Linux kernel. The eBPF system's (or program's) "problems" are not tcpdump's responsibility, which ends at setsockopt(SO_ATTACH_BPF). IMO, agreeing to this Yalta is reasonable, as it adds very little work on the tcpdump side, and none on the kernel side, as the setsockopt() is just being used as documented.
Power and responsibility normally should be in balance. Let's look for ways to keep this potential feature sustainable. If something does not work as expected, it would be nice to have an easy way for the user to tell, without opening a support ticket, that the problem certainly is at the kernel end of the socket and ideally a sense of the next place to check. Perhaps if you look through the existing open BPF-related issues in libpcap, it will make more sense.
Well, there is an easy way: just launch two instances of tcpdump in parallel, one with the filter, the other without.
On modern Linux, eBPF is available to do advanced filtering.
In some circumstances, the extra muscle brought by filtering early and in-kernel in zero-copy mode is critical.
One such use case is filtering a live interface that carries a haystack of traffic, with a needle matching a complex eBPF-expressed criterion.
A simple example that currently has no efficient solution with tcpdump is matching an IP address against a large hashtable (as is done in netfilter with ipset or nftables sets).
It would be cool, for this case, to be able to tell tcpdump to load a separately compiled eBPF object file and attach it to its raw socket with SO_ATTACH_BPF.
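To make the attach step concrete, here is a minimal sketch of what it could look like, not a proposed patch. It assumes the eBPF program has already been loaded by some loader (libbpf, for instance) and that a file descriptor for a socket-filter program is at hand. The helper name attach_ebpf_filter is hypothetical, and the fallback value for SO_ATTACH_BPF is the asm-generic one, which is an assumption for older headers and does not hold on every architecture.

```c
#include <sys/socket.h>

#ifndef SO_ATTACH_BPF
/* Assumption: asm-generic value, only used when the headers lack it. */
#define SO_ATTACH_BPF 50
#endif

/*
 * Hypothetical helper: attach an already-loaded eBPF socket-filter
 * program (prog_fd) to a socket (sock_fd).
 * Returns 0 on success, -1 with errno set on failure.
 */
int attach_ebpf_filter(int sock_fd, int prog_fd)
{
    /* The kernel expects a pointer to the program's file descriptor. */
    return setsockopt(sock_fd, SOL_SOCKET, SO_ATTACH_BPF,
                      &prog_fd, sizeof(prog_fd));
}
```

Everything before this call (opening the ELF object, picking a section, loading the program) would live in the loader, which is where most of the real work is.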
Note I'm not advocating for deeper integration, like eBPF-filtering a pcap file, since that has none of the performance requirements above, and arbitrarily complex logic in full-fledged C in userspace can be used instead.
I'm also not proposing (alas) to replicate the beautifully self-contained inline cBPF; in eBPF the heavier ELF machinery makes it unlikely anybody would want to enter a program as a list of comma-separated integers (though I, for one, would love to do it for "return XDP_DROP").
I guess the implementation is straightforward (for sufficiently recent Linux). The only thing that needs thought is the command-line API. Among the possibilities:
- ebpf /path/to/program.o[+SEC], with the constraint that it must be alone (no mixing like "port 80 and ebpf /path/to/program.o")
- -ebpf /path/to/program.o[+SEC]
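Either spelling needs the same small piece of argument handling: splitting /path/to/program.o[+SEC] into an object path and an optional section name. A rough sketch follows. The function name split_ebpf_arg is hypothetical, and splitting at the last '+' is an assumption on my part; it means a path that itself contains '+' cannot be given without a section.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "PATH[+SEC]" into PATH and SEC.
 * Assumption: the LAST '+' separates the path from the section name.
 * sec is set to "" when no section is given.
 * Returns 0 on success, -1 on an empty or over-long component.
 */
int split_ebpf_arg(const char *arg, char *path, size_t path_len,
                   char *sec, size_t sec_len)
{
    const char *plus = strrchr(arg, '+');
    size_t n = plus ? (size_t)(plus - arg) : strlen(arg);

    if (sec_len == 0 || n == 0 || n >= path_len)
        return -1;                  /* no room, or empty/over-long path */
    memcpy(path, arg, n);
    path[n] = '\0';

    sec[0] = '\0';
    if (plus) {
        size_t m = strlen(plus + 1);
        if (m == 0 || m >= sec_len)
            return -1;              /* "PATH+" or over-long section */
        memcpy(sec, plus + 1, m + 1); /* copy including the terminator */
    }
    return 0;
}
```

With this, "/path/to/program.o+filter" yields the path "/path/to/program.o" and the section "filter", while "/path/to/program.o" alone leaves the section empty so the caller can fall back to whatever default it chooses.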