

### **InBand Network Telemetry** With P4 and FPGA at 100 Gbps

Viktor Puš, CESNET (pus@cesnet.cz)

2017-06-07, DXDD, Utrecht

# Why INT?


#### Classical network monitoring: NetFlow/IPFIX

- Collect statistics about packets and bytes per network connection
  - Connection = set of packets sharing the same 5-tuple (SRC IP, DST IP, Proto, SRC Port, DST Port)
- Collected and exported by switches, routers, or dedicated boxes
- We get valuable data about L3 and L4 (sometimes L7)
- But we know very little about the underlying L2 infrastructure!
  - Overloaded lines, packet drops, latencies
- Switches hold this information; how can we retrieve it?
  - SNMP dates from 1988; it is slow, insecure, and lacks detail
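
The flow definition above can be illustrated with a minimal sketch of NetFlow-style aggregation. The packet tuples and the `aggregate_flows` helper are hypothetical; they only show how packets sharing a 5-tuple collapse into one record of packet and byte counts. Real exporters also handle timeouts, TCP flags, and export formats.

```python
from collections import defaultdict

# Hypothetical packet records:
# (src_ip, dst_ip, proto, src_port, dst_port, length_bytes)
PACKETS = [
    ("10.0.0.1", "10.0.0.2", 6, 40000, 443, 1500),
    ("10.0.0.1", "10.0.0.2", 6, 40000, 443, 400),
    ("10.0.0.3", "10.0.0.2", 17, 5353, 53, 80),
]


def aggregate_flows(packets):
    """Count packets and bytes per 5-tuple (the flow key)."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src_ip, dst_ip, proto, src_port, dst_port, length in packets:
        key = (src_ip, dst_ip, proto, src_port, dst_port)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += length
    return dict(flows)


flows = aggregate_flows(PACKETS)
for key, stats in flows.items():
    print(key, stats)
```

Nothing in such a record describes what happened to the packets inside the L2 infrastructure, which is the gap INT addresses.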

#### Inband Network Telemetry!

# What is INT?



- "Inband Network Telemetry is a framework designed to allow the collection and reporting of network state by the data plane, without requiring intervention or work by the control plane."<sup>1</sup>
- Packets carry dedicated INT headers, added by switches
  - Detailed info about each packet's journey
    - Path, per-switch latency, queue occupation, ...
  - New protocol, not standardized
  - Need switch and endpoint support
  - Perfect use case for P4
- Our goal: Ultimate INT endpoint and analytics

# CESNET



- Czech NREN, 100G network, ~400k users
- Liberouter research team
  - Network acceleration since 2003
  - Applications: Monitoring, Security, DDoS Mitigation
  - Technologies: FPGA, 100GE, PCI Express, DPDK
  - Compiler from P4 to VHDL
    - Hardware acceleration made easy for network/security experts



## **Commercial Partners**



- Products for the hardware acceleration of network traffic processing using FPGA
- Contribution: NFB-100G2



Flowmon Networks

- Everything needed to get complete network traffic visibility and analysis
- Contribution: Flow Exporter and Collector SW





# INT@100Gbps

- INT-in-VXLAN traffic is pre-generated in PCAPs and sent via a second FPGA card at 100 Gbps
- FPGA receives INT-tagged traffic
- Removes INT headers and sends "original" packets out at line rate
  - Destination is not aware of INT
- Relevant fields are forwarded to SW via DPDK
  - P4 `generate_digest` action
- Flow Exporter generates NetFlow records and sends them to Collector
- Visualization and query interface

[Figure: packet formats. Non-tagged traffic: Ethernet, IP, UDP, Payload. **INT-in-VXLAN**: Ethernet, IP, UDP, VXLAN, INT, INT Data (Switch 1), INT Data (Switch 2), Payload. Each INT Data entry carries a Switch ID (32 bits) and a Delay [µs] (32 bits).]
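
The per-switch INT data shown in the figure (a 32-bit Switch ID and a 32-bit delay in µs) can be sketched as a fixed 8-byte record. The byte order (network order), the field order, and the helper names below are assumptions for illustration. The slides do not specify the full header layout, so the VXLAN and INT shim headers are left out and the hop count is passed in explicitly.

```python
import struct

# Assumed layout of one INT data entry: 32-bit switch ID followed by a
# 32-bit delay in microseconds, both big-endian (network byte order).
INT_ENTRY = struct.Struct("!II")


def pack_int_stack(entries):
    """Serialize (switch_id, delay_us) pairs into a contiguous byte string."""
    return b"".join(INT_ENTRY.pack(sw, delay) for sw, delay in entries)


def strip_int_stack(data, hop_count):
    """Split a buffer into its INT entries and the remaining original payload.

    Conceptually mirrors the endpoint: extract the telemetry, then forward
    the packet without it.
    """
    size = INT_ENTRY.size * hop_count
    if len(data) < size:
        raise ValueError("buffer shorter than the declared INT stack")
    entries = [
        INT_ENTRY.unpack_from(data, offset)
        for offset in range(0, size, INT_ENTRY.size)
    ]
    return entries, data[size:]


# Two hypothetical hops, followed by the untouched payload.
tagged = pack_int_stack([(1, 40), (2, 80)]) + b"original payload"
entries, payload = strip_int_stack(tagged, hop_count=2)
print(entries, payload)
```

In the actual design this extraction happens in the P4-generated firmware at line rate; the sketch only shows the data layout idea in software.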



## **Console** interface

| Counter           | RX0            | TX0            |
|-------------------|----------------|----------------|
| Packets [-]       | 65097920011    | 65099452205    |
| Discarded [-]     | 0              |                |
| Octets [B]        | 47653324455398 | 42837086559877 |
| Throughput [Gbps] | 96.4081850653  | 86.6098969874  |

Duration: 1 h: 6 m: 13 s

Delay report lines (screenshot excerpt; most `max_delay` values are not legible):

- min_delay=40 us, average_delay = 44.67
- min_delay=80 us, average_delay = 93.33
- Thu May 11 10:20:30 2017
  - min_delay=49 us, average_delay = 51.67
  - min_delay=96 us, max_delay=112 us, average_delay = 105.33
- Thu May 11 10:20:32 2017
  - min_delay=47 us, average_delay = 52.75
  - min_delay=106 us, average_delay = 110.50
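
The console reports minimum, maximum, and average delay in pairs of lines. A minimal sketch of such an aggregation over digests received from the data plane follows, assuming one line per switch. The `(switch_id, delay_us)` sample format, the sample values, and the function name are hypothetical and not taken from the actual console tool.

```python
from collections import defaultdict


def delay_stats(samples):
    """Return {switch_id: (min, max, average)} for (switch_id, delay_us) samples."""
    per_switch = defaultdict(list)
    for switch_id, delay_us in samples:
        per_switch[switch_id].append(delay_us)
    return {
        sw: (min(delays), max(delays), sum(delays) / len(delays))
        for sw, delays in per_switch.items()
    }


# Hypothetical samples collected during one reporting interval.
samples = [(1, 40), (1, 50), (1, 60), (2, 80), (2, 100), (2, 120)]
stats = delay_stats(samples)
for sw, (lo, hi, avg) in sorted(stats.items()):
    print(f"switch {sw}: min_delay={lo} us, max_delay={hi} us, average_delay={avg:.2f}")
```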

## Flowmon Web GUI



#### Graphs, queries



# Conclusion



- P4 shortens development time
  - New application, running at 100 Gbps and integrated into a commercial-grade solution, in about 3 weeks
    - PCAP data (Python-scapy)
    - Firmware core (P4)
    - Flowmon Exporter input plugin (C)
    - Flowmon Collector modification (C+PHP)
- No expert hardware knowledge needed
- Synthesis from P4 has no negative impact on the unique FPGA features
  - High (and guaranteed) throughput
    - Bandwidth = bus width × frequency
  - Constant latency
  - Easy extensions for unanticipated functions
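
As a worked instance of the bandwidth formula above, the following sketch multiplies a bus width by a clock frequency. The 512-bit width and 200 MHz clock are illustrative assumptions, not the parameters of the actual firmware.

```python
def bus_bandwidth_gbps(bus_width_bits, frequency_mhz):
    """Raw bus bandwidth = bus width × frequency, expressed in Gbps."""
    return bus_width_bits * frequency_mhz * 1e6 / 1e9


# Assumed example values: a 512-bit data path clocked at 200 MHz.
bandwidth = bus_bandwidth_gbps(512, 200)
print(f"{bandwidth:.1f} Gbps")  # prints 102.4 Gbps
```

Any width and frequency pair whose product exceeds 100 Gbps leaves headroom over the 100GE line rate.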



## Thank you!

#### Web: www.liberouter.org Twitter: @liberouter



# P4 Workshop 2017-05



