Merge pull request #419 from ekoops/ekoops/k8s-polykube
pcn-loadbalancer-rp MULTI port mode and pcn-k8sdispatcher
frisso authored Jun 30, 2022
2 parents 577bf22 + 76616a0 commit a143e3c
Showing 70 changed files with 7,480 additions and 217 deletions.
1 change: 1 addition & 0 deletions AUTHORS
@@ -10,6 +10,7 @@ off on commits in the Polycube repository:
Gianluca Scopelliti [email protected]
Giuseppe Ognibebe [email protected]
Jianwen Pi [email protected]
Leonardo Di Giovanna [email protected]
Matteo Bertrone [email protected]
Mauricio Vásquez Bernal [email protected]
Nico Caprioli [email protected]
40 changes: 40 additions & 0 deletions Documentation/services/pcn-k8sdispatcher/k8sdispatcher.md
@@ -0,0 +1,40 @@
# K8sdispatcher

The ``pcn-k8sdispatcher`` service is specifically designed as part of our Kubernetes networking solution (please see [polykube](https://github.com/polycube-network/polykube) to get more information about it). The service provides an eBPF implementation of a custom NAT: it performs different actions depending on the type and on the direction of the traffic.

For Egress Traffic, the following flow chart can be used to explain the functioning of the service:

![K8sdispatcher egress flow chart](egress.png)

The Egress Traffic is the traffic generated by Pods and directed to the external world.
This traffic can be generated by an internal Pod that wants to contact the external world or as a response to an
external world request. For this traffic, the service maintains an egress session table containing information
about the active egress sessions. The first time a Pod wants to contact the external world, no active egress session
will be present in the table: in this scenario, the service performs SNAT, replacing the address of the Pod
with the address of the node, and creates entries in the ingress and egress session tables accordingly.
If the outgoing traffic is generated as a response to an external request, it can only originate as a response to
a request made to a NodePort Service. For traffic related to NodePort Services with a CLUSTER ExternalTrafficPolicy,
if an egress session table hit happens, the destination IP address and port are replaced according to the session data.
The traffic related to NodePort Services with a LOCAL ExternalTrafficPolicy is forwarded as it is to the next cube.
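
The egress decisions described above can be sketched in a few lines of Python. This is only an illustration, not the eBPF datapath: the names `Flow`, `Xlate`, `translate` and `handle_egress` are hypothetical, the `local_nodeport_reply` flag stands in for a check whose implementation is not described here, and keeping the original source port during SNAT is an assumption.

```python
from typing import Dict, NamedTuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


class Xlate(NamedTuple):
    operation: str  # "XLATE_SRC" or "XLATE_DST", as in the session-rule data model
    new_ip: str
    new_port: int


def translate(flow: Flow, x: Xlate) -> Flow:
    """Rewrite either the source or the destination of the flow."""
    if x.operation == "XLATE_SRC":
        return flow._replace(src_ip=x.new_ip, src_port=x.new_port)
    return flow._replace(dst_ip=x.new_ip, dst_port=x.new_port)


def handle_egress(flow: Flow, egress: Dict[Flow, Xlate], ingress: Dict[Flow, Xlate],
                  node_ip: str, local_nodeport_reply: bool = False) -> Flow:
    hit = egress.get(flow)
    if hit is not None:
        # Active session: apply the stored translation.
        return translate(flow, hit)
    if local_nodeport_reply:
        # Reply for a LOCAL NodePort Service: forwarded as it is.
        return flow
    # First packet of a Pod-to-external session: SNAT with the node address
    # (assumption: the source port is left unchanged).
    snat = Xlate("XLATE_SRC", node_ip, flow.src_port)
    egress[flow] = snat
    out = translate(flow, snat)
    # Matching ingress entry, so that replies are translated back to the Pod.
    reply = Flow(out.dst_ip, out.dst_port, out.src_ip, out.src_port, out.proto)
    ingress[reply] = Xlate("XLATE_DST", flow.src_ip, flow.src_port)
    return out
```

In this sketch both tables are plain dictionaries keyed by the 5-tuple, which mirrors the key of the `session-rule` list without the `direction` field.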

For Ingress Traffic, the following flow chart can be used to explain the functioning of the service:

![K8sdispatcher ingress flow chart](ingress.png)

The Ingress Traffic can be divided into traffic directed to the host (either directly or because it needs VxLAN
processing) and traffic directed to Pods. The traffic directed to Pods can be the traffic generated by an external host
trying to contact a NodePort service or the return traffic generated by an external host providing a response to an
internal Pod request. The service uses an ingress session table containing all the active ingress sessions.
If a session table hit happens, the service applies NAT according to the session data. If no session table entry is
associated with the incoming packet, the service tries to determine if a NodePort rule matches the packet
characteristics. In case of no NodePort rule matching, the packet is sent to the Linux stack for further processing.
In case of NodePort rule matching, different actions are applied according to the ExternalTrafficPolicy of the
Kubernetes NodePort Service associated with the rule. If the policy is LOCAL, the traffic is allowed to reach only
backend Pods located on the current node: in this case the packet can proceed towards the Pod without modifications.
In case the policy is CLUSTER, the packet can also reach backend Pods located on other nodes: since later in
the chain the packet will be processed by a load balancer and the return packet will have to transit through
the same load balancer, SNAT is applied by replacing the source IP address with a specific reserved address belonging
to the Pod CIDR of the node on which the k8sdispatcher is deployed. In this way the two nodes (the one that
receives the request and the one running the selected backend Pod) will exchange the packets of the flow over
the VxLAN interconnect. In this latter case, corresponding session entries are stored in the ingress and egress
session tables.
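
The ingress path for traffic directed to Pods can be sketched in the same spirit. Again this is a simplified Python illustration rather than the actual datapath: `handle_ingress`, the `"FORWARD"`/`"TO_STACK"` action strings and the dictionary-based tables are hypothetical names, host-directed traffic is not modelled, and the source port is assumed to be preserved by the SNAT.

```python
from typing import Dict, NamedTuple, Tuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


class Xlate(NamedTuple):
    operation: str  # "XLATE_SRC" or "XLATE_DST"
    new_ip: str
    new_port: int


def translate(flow: Flow, x: Xlate) -> Flow:
    if x.operation == "XLATE_SRC":
        return flow._replace(src_ip=x.new_ip, src_port=x.new_port)
    return flow._replace(dst_ip=x.new_ip, dst_port=x.new_port)


def handle_ingress(flow: Flow, ingress: Dict[Flow, Xlate], egress: Dict[Flow, Xlate],
                   nodeport_rules: Dict[Tuple[int, str], str],
                   internal_src_ip: str) -> Tuple[str, Flow]:
    hit = ingress.get(flow)
    if hit is not None:
        # Session table hit: apply NAT according to the session data.
        return "FORWARD", translate(flow, hit)
    policy = nodeport_rules.get((flow.dst_port, flow.proto))
    if policy is None:
        # No NodePort rule matches: leave the packet to the Linux stack.
        return "TO_STACK", flow
    if policy == "LOCAL":
        # Only local backends can be reached: no modification needed.
        return "FORWARD", flow
    # CLUSTER policy: SNAT with the reserved address from the node Pod CIDR,
    # then record the session in both tables.
    snat = Xlate("XLATE_SRC", internal_src_ip, flow.src_port)
    ingress[flow] = snat
    out = translate(flow, snat)
    reply = Flow(out.dst_ip, out.dst_port, out.src_ip, out.src_port, out.proto)
    egress[reply] = Xlate("XLATE_DST", flow.src_ip, flow.src_port)
    return "FORWARD", out
```

The `nodeport_rules` dictionary is keyed by `(port, protocol)`, mirroring the `nodeport-port proto` key of the `nodeport-rule` list in the data model.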
15 changes: 10 additions & 5 deletions Documentation/services/pcn-loadbalancer-rp/loadbalancer-rp.md
@@ -2,15 +2,17 @@


This service implements a ``Reverse Proxy Load Balancer``.
According to the algorithm, incoming IP packets are delivered to the real servers by replacing their IP destination address with the one of the real server, chosen by the load balancing logic. Hence, IP address rewriting is performed in both directions, for traffic coming from the Internet and the reverse.
Packets are hashed to determine which is the correct backend; the hashing function guarantees that all packets belonging to the same TCP/UDP session will always be terminated to the same backend server.
According to the algorithm, incoming IP packets are delivered to the real servers by replacing their IP destination address with the one of the real server, chosen by the load balancing logic. Hence, IP address rewriting is performed in both directions, for traffic coming from the Internet and the reverse. Packets are hashed to determine which is the correct backend; the hashing function guarantees that all packets belonging to the same TCP/UDP session will always be terminated to the same backend server.

Unknown packets (e.g., ARP; IPv6) are simply forwarded as they are.
This service supports two different port types (FRONTEND and BACKEND) and two different port modes (SINGLE and MULTI). Depending on the port mode, the cube can have one or more FRONTEND ports: in SINGLE port mode (which is the default one), only a single FRONTEND port is supported, whereas multiple FRONTEND ports are supported in MULTI port mode. The MULTI port mode is specifically designed to allow the service to work properly as part of our Kubernetes networking solution (please see [polykube](https://github.com/polycube-network/polykube) to get more information about it). Regardless of the port mode, only a single BACKEND port is supported.

If a packet coming from a FRONTEND port is directed to a service, DNAT is performed on it; the corresponding reverse natting operation is performed for packets coming from backends and on the way back to clients. ARP packets are forwarded as they are (in MULTI port mode, if the packet is an ARP request from the BACKEND port, it is forwarded to the right FRONTEND port). Unknown packets (e.g., IPv6) are simply forwarded as they are (in MULTI port mode, if the packet comes from the BACKEND port, it is flooded to all the FRONTEND ports).
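
The hash-based backend selection mentioned above can be illustrated with a minimal Python sketch. The function `pick_backend` and the use of SHA-256 are assumptions made for the example and are not the hashing scheme of the service; in particular, a plain modulo over the backend list does not preserve affinity when the number of backends changes.

```python
import hashlib
from typing import List


def pick_backend(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 proto: str, backends: List[str]) -> str:
    """Map a session 5-tuple to one backend, deterministically."""
    if not backends:
        raise ValueError("at least one backend is required")
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(key).digest()
    # Every packet of the same session produces the same digest, hence the same index.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Since the result depends only on the 5-tuple and on the backend list, every packet of a session is delivered to the same backend as long as the list is unchanged.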

## Features


- Support for different port modes (SINGLE and MULTI)
- Support for multiple frontend ports (in MULTI port mode)
- Support for multiple virtual services (with multiple ``vip:protocol:port`` tuples)
- Support for ICMP Echo Request, hence enabling to ``ping`` virtual servers
- Session affinity: a TCP session is always terminated to the same backend server even in case the number of backend servers changes at run-time (e.g., a new backend is added)
@@ -21,8 +23,11 @@ Unknown packets (e.g., ARP; IPv6) are simply forwarded as they are.

## Limitations

- In SINGLE port mode, only two ports are supported (a FRONTEND and a BACKEND port)
- In MULTI port mode, multiple FRONTEND ports are supported but only a single BACKEND port can exist
- In MULTI port mode, an IPv4 address must be configured on FRONTEND port creation in order to allow packets to flow back to the frontend clients
- In MULTI port mode, the supported topology is the one leveraged in the [polykube](https://github.com/polycube-network/polykube) Kubernetes networking solution

- Supports only two interfaces

## How to use

@@ -38,7 +43,7 @@ Each backend supports a ``weight`` that determines how incoming sessions are dis

A set of ``virtual services``, which are specified by a Virtual IP address, protocol and a port (``vip:protocol:port``), are mapped to a given set of ``backend services``, actually running on multiple real servers.

Hence, this service exports two network interfaces:
Hence, in SINGLE port mode, this service exports two network interfaces:
- Frontend port: connects the LB to the clients that connect to the virtual service, likely running on the public Internet
- Backend port: connects the LB to the backend servers

6 changes: 4 additions & 2 deletions scripts/install.sh
@@ -125,7 +125,8 @@ if [ "$MODE" == "pcn-iptables" ]; then
-DENABLE_SERVICE_SIMPLEFORWARDER=OFF \
-DENABLE_SERVICE_TRANSPARENTHELLOWORLD=OFF \
-DENABLE_SERVICE_SYNFLOOD=OFF \
-DENABLE_SERVICE_PACKETCAPTURE=OFF
-DENABLE_SERVICE_PACKETCAPTURE=OFF \
-DENABLE_SERVICE_K8SDISPATCHER=OFF
elif [ "$MODE" == "pcn-k8s" ]; then
cmake .. -DENABLE_SERVICE_BRIDGE=OFF \
-DENABLE_SERVICE_DDOSMITIGATOR=ON \
@@ -143,7 +144,8 @@ elif [ "$MODE" == "pcn-k8s" ]; then
-DENABLE_SERVICE_SIMPLEFORWARDER=OFF \
-DENABLE_SERVICE_TRANSPARENTHELLOWORLD=OFF \
-DENABLE_SERVICE_SYNFLOOD=OFF \
-DENABLE_SERVICE_PACKETCAPTURE=ON
-DENABLE_SERVICE_PACKETCAPTURE=ON \
-DENABLE_SERVICE_K8SDISPATCHER=OFF
else
cmake .. -DENABLE_PCN_IPTABLES=ON
fi
1 change: 1 addition & 0 deletions src/services/CMakeLists.txt
@@ -37,6 +37,7 @@ add_service(transparenthelloworld pcn-transparent-helloworld)
add_service(synflood pcn-synflood)
add_service(packetcapture pcn-packetcapture)
add_service(dynmon pcn-dynmon)
add_service(k8sdispatcher pcn-k8sdispatcher)

# save string to create code that load the services
SET_PROPERTY(GLOBAL PROPERTY LOAD_SERVICES_ ${LOAD_SERVICES})
13 changes: 13 additions & 0 deletions src/services/pcn-k8sdispatcher/.swagger-codegen-ignore
@@ -0,0 +1,13 @@
# Swagger Codegen Ignore
# Generated by swagger-codegen https://github.com/swagger-api/swagger-codegen

# Use this file to prevent files from being overwritten by the generator.

.swagger-codegen-ignore

src/*.cpp
src/*.h

!src/*Interface.h
!src/*JsonObject.h
!src/*JsonObject.cpp
5 changes: 5 additions & 0 deletions src/services/pcn-k8sdispatcher/CMakeLists.txt
@@ -0,0 +1,5 @@
cmake_minimum_required (VERSION 3.2)

set (CMAKE_CXX_STANDARD 11)

add_subdirectory(src)
173 changes: 173 additions & 0 deletions src/services/pcn-k8sdispatcher/datamodel/k8sdispatcher.yang
@@ -0,0 +1,173 @@
module k8sdispatcher {
yang-version 1.1;
namespace "http://polycube.network/k8sdispatcher";
prefix "k8sdispatcher";

import polycube-base { prefix "polycube-base"; }
import polycube-standard-base { prefix "polycube-standard-base"; }

import ietf-inet-types { prefix "inet"; }

organization "Polycube open source project";
description "YANG data model for the Polycube K8s Dispatcher";

polycube-base:service-description "K8s Dispatcher Service";
polycube-base:service-version "2.0.0";
polycube-base:service-name "k8sdispatcher";
polycube-base:service-min-kernel-version "4.14.0";

typedef l4-proto {
type enumeration {
enum "TCP" {
value 6;
description "The TCP protocol type";
}
enum "UDP" {
value 17;
description "The UDP protocol type";
}
enum "ICMP" {
value 1;
description "The ICMP protocol type";
}
}
description "L4 protocol";
}

uses "polycube-standard-base:standard-base-yang-module" {
augment ports {
leaf type {
type enumeration {
enum BACKEND { description "Port connected to the internal CNI topology"; }
enum FRONTEND { description "Port connected to the node NIC"; }
}
description "Type of the K8s Dispatcher cube port (e.g. BACKEND or FRONTEND)";
mandatory true;
polycube-base:init-only-config;
}
leaf ip {
type inet:ipv4-address;
description "IP address of the node interface (only for FRONTEND port)";
polycube-base:cli-example "10.10.1.1";
polycube-base:init-only-config;
}
}
}

leaf internal-src-ip {
type inet:ipv4-address;
description "Internal source IP address used for natting incoming packets directed to Kubernetes Services with a CLUSTER external traffic policy";
mandatory true;
polycube-base:cli-example "10.10.1.1";
polycube-base:init-only-config;
}

leaf nodeport-range {
type string;
description "Port range used for NodePort Services";
default "30000-32767";
polycube-base:cli-example "30000-32767";
}

list session-rule {
key "direction src-ip dst-ip src-port dst-port proto";
description "Session entry related to a specific traffic direction";
config false;

leaf direction {
type enumeration {
enum INGRESS {
description "Direction of traffic going from the external world to the internal CNI topology";
}
enum EGRESS {
description "Direction of traffic going from the internal CNI topology to the external world";
}
}
description "Session entry direction (e.g. INGRESS or EGRESS)";
}
leaf src-ip {
type inet:ipv4-address;
description "Session entry source IP address";
}
leaf dst-ip {
type inet:ipv4-address;
description "Session entry destination IP address";
}
leaf src-port {
type inet:port-number;
description "Session entry source L4 port number";
}
leaf dst-port {
type inet:port-number;
description "Session entry destination L4 port number";
}
leaf proto {
type l4-proto;
description "Session entry L4 protocol";
polycube-base:cli-example "TCP, UDP, ICMP";
}

leaf new-ip {
type inet:ipv4-address;
description "Translated IP address";
config false;
}
leaf new-port {
type inet:port-number;
description "Translated L4 port number";
config false;
}
leaf operation {
type enumeration {
enum XLATE_SRC { description "The source IP and port are replaced"; }
enum XLATE_DST { description "The destination IP and port are replaced"; }
}
description "Operation applied on the original packet";
config false;
}
leaf originating-rule {
type enumeration {
enum POD_TO_EXT {
description "Traffic related to communication between a Pod and the external world";
}
enum NODEPORT_CLUSTER {
description "Traffic related to communication involving a NodePort Service with a CLUSTER external traffic policy";
}
}
description "Rule originating the session entry";
config false;
}
}

list nodeport-rule {
key "nodeport-port proto";
description "NodePort rule associated with a Kubernetes NodePort Service";

leaf nodeport-port {
type inet:port-number;
description "NodePort rule nodeport port number";
polycube-base:cli-example "30500";
}
leaf proto {
type l4-proto;
description "NodePort rule L4 protocol";
polycube-base:cli-example "TCP, UDP, ICMP";
}

leaf external-traffic-policy {
type enumeration {
enum LOCAL { description "Incoming traffic is allowed to be served only by local backends"; }
enum CLUSTER { description "Incoming traffic is allowed to be served by any backend of the cluster"; }
}
default CLUSTER;
description "The external traffic policy of the Kubernetes NodePort Service";
}
leaf rule-name {
type string;
description "An optional name for the NodePort rule";
polycube-base:cli-example "my-nodeport-rule";
polycube-base:init-only-config;
}
}
}
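
Since `nodeport-range` is modelled as a plain string, whoever consumes it has to parse and validate it. A minimal Python sketch of such a check follows; the helper name `parse_nodeport_range` and the validation rules are assumptions for illustration and do not come from the service code.

```python
from typing import Tuple


def parse_nodeport_range(value: str) -> Tuple[int, int]:
    """Parse a 'LOW-HIGH' string such as the default '30000-32767'."""
    low_s, sep, high_s = value.partition("-")
    if not sep:
        raise ValueError(f"expected LOW-HIGH, got {value!r}")
    low, high = int(low_s), int(high_s)  # int() raises ValueError on non-numeric parts
    if not (1 <= low <= high <= 65535):
        raise ValueError(f"invalid port range: {value!r}")
    return low, high


def in_nodeport_range(port: int, value: str) -> bool:
    """Tell whether a destination port falls inside the configured range."""
    low, high = parse_nodeport_range(value)
    return low <= port <= high
```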

50 changes: 50 additions & 0 deletions src/services/pcn-k8sdispatcher/src/CMakeLists.txt
@@ -0,0 +1,50 @@
include(${PROJECT_SOURCE_DIR}/cmake/LoadFileAsVariable.cmake)

aux_source_directory(serializer SERIALIZER_SOURCES)
aux_source_directory(api API_SOURCES)
aux_source_directory(base BASE_SOURCES)

include_directories(serializer)

if (NOT DEFINED POLYCUBE_STANDALONE_SERVICE OR POLYCUBE_STANDALONE_SERVICE)
find_package(PkgConfig REQUIRED)
pkg_check_modules(POLYCUBE libpolycube)
include_directories(${POLYCUBE_INCLUDE_DIRS})
endif (NOT DEFINED POLYCUBE_STANDALONE_SERVICE OR POLYCUBE_STANDALONE_SERVICE)

# Needed to load files as variables
include_directories(${CMAKE_CURRENT_BINARY_DIR})

add_library(pcn-k8sdispatcher SHARED
${SERIALIZER_SOURCES}
${API_SOURCES}
${BASE_SOURCES}
K8sdispatcher.cpp
NodeportRule.cpp
Ports.cpp
SessionRule.cpp
K8sdispatcher-lib.cpp
Utils.cpp)

# load ebpf datapath code in a variable
load_file_as_variable(pcn-k8sdispatcher
K8sdispatcher_dp.c
k8sdispatcher_code)

# load datamodel in a variable
load_file_as_variable(pcn-k8sdispatcher
../datamodel/k8sdispatcher.yang
k8sdispatcher_datamodel)

target_link_libraries(pcn-k8sdispatcher ${POLYCUBE_LIBRARIES})

# Specify shared library install directory

set(CMAKE_INSTALL_LIBDIR /usr/lib)

install(
TARGETS
pcn-k8sdispatcher
DESTINATION
"${CMAKE_INSTALL_LIBDIR}"
)