NetPFL

Introduction

The NetBee library implements a compiler for the high-level language used to program the NetVM for the most common traffic filtering and monitoring applications. This document presents that language, with particular emphasis on its syntax.

Goals of this Language

NetVM applications are built around the concept of Processing Element (NetPE). A NetPE is a functional block that performs a simple function (e.g. packet filtering) inside the NetVM. A NetVM application is split into functions assigned to several NetPEs, each of which implements a single functionality; complex structures can be built by connecting different NetPEs together. This modular view derives from the observation that many network applications can be decomposed into simple blocks connected to form a complex structure.

Although the NetPE instruction set allows for arbitrary and complex packet processing, the proposed high-level language targets applications that capture and filter traffic, compute network statistics, and extract fields from packets. It is not a new general-purpose language (like C or C++); it focuses on some of the most common features required by packet processing applications. This allows the language to be simple (a small number of keywords and syntax rules), intuitive, easy to implement, and suitable for command-line tools.

For more complex applications, programmers can still write their own NetPE source code in NetVM assembly language. This approach lets them implement powerful tasks in each NetPE using all the features available within it.

Language syntax

The language is compact enough for an expression to be typed as an argument of a command-line tool. This is possible because the language requires only a minimal number of syntactical rules.

General syntax

The general syntax of this language consists of two blocks: the first is a filtering expression, and the second is the action to be performed when the filtering expression is true:

FilteringExpression Action

The figure below shows, in its top section, the two fundamental blocks of the language syntax and, in its bottom section, the functionality that each block programs into the NetPE.

FULVIO A couple of comments (to be discussed with Gianluca):

- it looks as if "extract fields" returns nothing (which is wrong)

- it looks as if "classify" returns nothing (which is potentially correct) => how do we handle this case graphically?

- "classify" seems possibly misleading to me. Can we say "compute traffic statistics"? After all, I am not sure whether we are simply classifying or computing statistics

FULVIO: is the classification supported by this document exactly the same as the one we defined in the other, theoretical document?

The FilteringExpression Section

The filtering expression is applied to the packet stream coming from the input port of the NetPE; this port is transparent to the programmer. The expression is evaluated for each incoming packet and returns a boolean result (true or false).

Several filtering expressions can be joined with a logical connector (and, or). If we call "filter" a complete filter expression, it can be written as:

Filter               Meaning
expression           a single expression, whose usage is described below
filter or filter     the result is true if at least one of the two operands is true
filter and filter    the result is true if both operands are true
not filter           the result is true if the result of the operand is false
(filter)             brackets enclose a filtering expression so that more complex expressions can be built

In the table above, the symbols or, and, not and the brackets (bold in the original table) are terminal symbols of the language; the other symbols are detailed below.

The "expression" can be written in the following forms:

Expression          Meaning
protoname           the name of a protocol contained in the NetPDL database
term RelOp Value    the "RelOp" and "term" symbols are described below
term RelOp term     used to compare two complex expressions

The "RelOp" symbol is a relational operator used to write comparison expressions; it can be one of the following:

RelOp    Behavior
==       equal
!=       not equal
<        less than
<=       less than or equal
>        greater than
>=       greater than or equal

AMBIGUITY IN THE GRAMMAR: term RelOp protoname.fieldname and term RelOp Value "myProto.myField == I_write_a_value.this_way"

with a top-down parser, after recognizing RelOp, if the input starts with ALLOWED_INIT_CHAR, take everything in ALLOWED_INIT_CHAR{ALLOWED_CHAR + "."} and process it as a value; if it is not a value, then try to recognize it as a Term

odd thing: a syntax of the

The "term" symbol follows the rules described here; it can be written in the following forms:

term                                Meaning
protoname.fieldname                 To specify a single protocol header field
protoname.fieldname MathOP Value    To write an operation between a protocol header field and a value
~protoname.fieldname                To obtain the bitwise not of a protocol header field
MISSING!!!!!! protoname[index]      To access a single byte or a group of bytes of a protocol header

The "MathOP" symbol is used to write arithmetic and bitwise operations; it can be one of the following:

MathOP    Behavior
+         add
-         subtract
*         multiply
&         bitwise and
|         bitwise or
<<        left shift
>>        right shift

A Value can be written as a decimal integer or as a hexadecimal integer with the 0x prefix. In addition, a value can be written in the format that the NetPDL description specifies for the protocol field on the other side of the expression (e.g. for an IPv4 address the value can be written in dotted decimal format, like 10.0.0.1). The result of an arithmetic operation between a formatted field and a value keeps the same format, so it can be combined in further operations with operands expressed in that format. For example, the result of ip.src & 255.255.255.0 has dotted decimal format, so it can be compared with another dotted decimal value such as 10.0.0.0, giving the expression ip.src & 255.255.255.0 == 10.0.0.0. Likewise, the expression ip.src & 0xFFFFFF00 == 10.0.0.0 is correct, because decimal and hexadecimal values can always be used.
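The equivalence between the dotted decimal and hexadecimal masks can be checked with a small Python sketch. This is only an illustration of the arithmetic, not the NetBee compiler's code; the helper names parse_value and masked_equal are hypothetical.

```python
import ipaddress


def parse_value(text: str) -> int:
    """Convert a decimal, 0x-prefixed hexadecimal, or dotted decimal literal to an int."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    if "." in text:
        # Dotted decimal is the format NetPDL is assumed to use for IPv4 addresses.
        return int(ipaddress.IPv4Address(text))
    return int(text)


def masked_equal(field: str, mask: str, expected: str) -> bool:
    """Evaluate `field & mask == expected`, whatever notation each literal uses."""
    return (parse_value(field) & parse_value(mask)) == parse_value(expected)


# Both notations of the mask select the same 24 bits of the source address.
print(masked_equal("10.0.0.57", "255.255.255.0", "10.0.0.0"))
print(masked_equal("10.0.0.57", "0xFFFFFF00", "10.0.0.0"))
```

Both calls evaluate the same comparison because 255.255.255.0 and 0xFFFFFF00 denote the same 32-bit number.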

Indexed access to a protocol header, written with square brackets after a protocol name, is useful to evaluate single bytes or groups of bytes that are not aligned to any field. The index specifies the offset (in bytes) of a single byte from the beginning of the protocol header (the first byte has index 0). It is also possible to specify an interval of bytes with the notation from:to, where from and to are written in decimal (e.g. ip[8:11]) or hexadecimal notation (e.g. ip[0x08:0x0B]). This feature applies only to an entire protocol header and not to protocol header fields (e.g. it is not possible to write ip.src[0] == 10).
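As an illustration of the indexing rule, the following Python sketch selects bytes from a raw header. It assumes that the "to" bound of from:to is inclusive, which this page does not state explicitly; the function name header_bytes is hypothetical.

```python
from typing import Optional


def header_bytes(header: bytes, start: int, end: Optional[int] = None) -> bytes:
    """Return the byte at `start`, or the bytes from `start` to `end`.

    Offsets count from the beginning of the protocol header (first byte is 0).
    ASSUMPTION: `end` is inclusive.
    """
    if end is None:
        end = start
    if not 0 <= start <= end < len(header):
        raise IndexError("byte range outside the protocol header")
    return header[start:end + 1]


# A dummy 20-byte header whose bytes equal their own offsets.
dummy_header = bytes(range(20))

# Under the inclusive assumption, [8:11] and [0x08:0x0B] select the same four bytes.
print(header_bytes(dummy_header, 8, 11))
print(header_bytes(dummy_header, 0x08, 0x0B))
```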

FULVIO:

- shall we give an example?

- is it possible to specify [] only for protocols? Can I not, for example, specify the same thing for a field? (e.g. ip.src[0] == 10)

Examples of Filtering Expression

Filtering Expression                                Meaning
ip                                                  it returns true if the packet contains the ip protocol header
tcp or udp                                          it returns true if the packet contains the tcp or the udp protocol header
tcp.sPort==80 or icmp                               it returns true if the field sPort of the tcp protocol header in the packet is 80 or if the packet contains the icmp header
ip.src & 255.255.255.0 == 10.0.0.0                  it returns true if the first 24 bits of the ip source address are 0x0A0000 (masked with a dotted decimal value)
ip.src & 0xFFFFFF00 == 10.0.0.0                     it returns true if the first 24 bits of the ip source address are 0x0A0000 (masked with a hexadecimal value)
ip.src & 255.255.255.0 == ip.dst & 255.255.255.0    it returns true if the first 24 bits of the ip source address are equal to the first 24 bits of the ip destination address
ip.hlen * 4 == 20                                   it returns true if the ip header length is 20 bytes (the hlen field counts 32-bit words)
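How such filters combine can be sketched in Python with a packet modelled as a dictionary of decoded protocol headers. This is a simplified model, not the code that the NetVM compiler generates; all names are hypothetical, and treating a comparison on an absent protocol as false is an assumption.

```python
from typing import Callable, Dict

# A packet is modelled as {protocol name: {field name: value}}.
Packet = Dict[str, Dict[str, int]]
Filter = Callable[[Packet], bool]


def has_proto(name: str) -> Filter:
    """Filter of the form `protoname`: true if the packet contains that header."""
    return lambda pkt: name in pkt


def field_equals(proto: str, field: str, value: int) -> Filter:
    """Filter of the form `protoname.fieldname == Value`.

    ASSUMPTION: the comparison is false when the protocol is not in the packet.
    """
    return lambda pkt: proto in pkt and pkt[proto].get(field) == value


def any_of(left: Filter, right: Filter) -> Filter:
    """`filter or filter`."""
    return lambda pkt: left(pkt) or right(pkt)


def all_of(left: Filter, right: Filter) -> Filter:
    """`filter and filter`."""
    return lambda pkt: left(pkt) and right(pkt)


def negate(inner: Filter) -> Filter:
    """`not filter`."""
    return lambda pkt: not inner(pkt)


# Equivalent of: tcp.sPort==80 or icmp
web_or_icmp = any_of(field_equals("tcp", "sPort", 80), has_proto("icmp"))
```

For instance, web_or_icmp accepts a packet such as {"ip": {}, "tcp": {"sPort": 80}} and rejects one that carries only a udp header.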

Actions

Actions are a set of operations executed if the result of the filter is true. Actions are separated from one another by the ";" symbol.

The actions allowed by this language are designed for three categories of network traffic analysis:

  • traffic capture: this action forces the delivery of the current packet to a given NetPE output port, which can be connected either to the user application or to another NetPE;
  • traffic statistics: this action updates some counters after a classification step; the classification criteria are the protocol names or protocol fields specified in the action; (FULVIO: this is not very clear)
  • field extraction: this action extracts the values of some fields within the current packet and delivers them on a specific NetPE output port.

Traffic capture action

This feature sends the packets accepted by the filter out through a given NetPE output port. The packet is not modified by the NetPE and is transmitted exactly as it was received from the input port. The syntax for this action is:

ReturnPacket on port PortID

This action sends the packet to the port identified by the 'PortID' label. The label uniquely identifies the port: its first occurrence in the program string declares the port, and every later occurrence of the same label in the same program string refers to that same port. FULVIO: the second part of the sentence cannot be understood. FRANCESCO: I tried every way I could.

Example                                             Description
ReturnPacket on port p1                             it sends accepted packets on port 'p1' of NetPE
ReturnPacket on port p1; ReturnPacket on port p2    it sends accepted packets on port 'p1' and port 'p2' of NetPE

This action sends the packets accepted by the filtering expression out through the specified port. That port is configured as an output push port, so it is the NetPE that initiates the transmission of the accepted packets to the connected entity.

FULVIO: perhaps write (or perhaps not), as if it were an example, that a tool like windump could add the second part of the syntax automatically, i.e. that the action may already be predefined in the program

FRANCESCO: it seems to me that we are putting too many irons in the fire and mixing too many things together.

Traffic statistics actions

This feature is meant for applications that want to retrieve statistical information about traffic. These applications have to specify which aspect of the traffic they want to measure (see NetBee statistical capabilities for a detailed description of the classification and statistics process) and which type of counter they want to use.

The syntax for this action is:

Classify (protocols_and_fields) → ( counters )

where "protocols_and_fields" is a comma separated list of protocol names or protocol header fields on which the statistics are based (e.g. (ip.src,ip.dst,tcp.sPort,tcp.dPort) or (ip, tcp, udp, icmp)), while "counters" is a comma separated list of counters (e.g. CountBytes, CountPackets), each of which counts a single aspect of the traffic.

The "protocols_and_fields" grammar follows these rules:

Single item of protocols_and_fields list    Meaning
protoname                                   To specify a single protocol header
protoname.fieldname                         To specify a single protocol header field
protoname.fieldname MathOP Value            To write an operation between a protocol header field and a value
~protoname.fieldname                        To obtain the bitwise not of a protocol header field

For each item in the "protocols_and_fields" list it is possible to specify a mnemonic string that labels the item in printing operations. For example, Classify(ip.src & 255.255.255.0 "traffic coming from network", tcp.dPort)→(CountBytes) returns information formatted like this: "traffic coming from network 10.0.1.0 = 1293123 bytes".

FULVIO: actually we have not specified the data format yet, so isn't it premature to say how the data is returned? I would just mention that the legend string will be "traffic coming from network 10.0.1.0

FRANCESCO: it is not clear what that string is for unless an example is given

If no other actions have been programmed for the NetPE, the incoming packet is dropped.

Example: Classify(ip.src "New sessions from", tcp.dPort "Requested service")→(CountPackets)
Description: for each accepted packet, this action classifies it by ip source address and tcp destination port, then increments the packets counter

Example: Classify(ip.src & 255.255.255.0 "traffic coming from network", tcp.dPort)→(CountBytes)
Description: for each accepted packet, this action classifies it by ip source address masked with the 255.255.255.0 netmask and by tcp destination port, then increments the bytes counter

This action processes the statistical information and stores it in internal data structures. It also configures a pull output port, so that external entities (such as applications or other programmed NetPEs) can request and obtain the processed statistical information.
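The classification step can be pictured as a table of counters keyed by the classification items. The Python sketch below mimics Classify(ip.src & 255.255.255.0, tcp.dPort)→(CountBytes) on already decoded packets; it does not reflect NetBee's internal data structures, and the packet keys (including "length") and the function name are hypothetical.

```python
from collections import Counter
import ipaddress


def classify_bytes(packets, mask="255.255.255.0"):
    """Count bytes per (masked source network, tcp destination port) pair."""
    mask_bits = int(ipaddress.IPv4Address(mask))
    count_bytes = Counter()
    for pkt in packets:
        src_bits = int(ipaddress.IPv4Address(pkt["ip.src"]))
        network = str(ipaddress.IPv4Address(src_bits & mask_bits))
        # The classification key is the tuple of all listed items.
        count_bytes[(network, pkt["tcp.dPort"])] += pkt["length"]
    return count_bytes


sample_packets = [
    {"ip.src": "10.0.1.7", "tcp.dPort": 80, "length": 1500},
    {"ip.src": "10.0.1.9", "tcp.dPort": 80, "length": 40},
    {"ip.src": "10.0.2.3", "tcp.dPort": 22, "length": 60},
]

for (network, port), total in sorted(classify_bytes(sample_packets).items()):
    print(f"traffic coming from network {network}, port {port} = {total} bytes")
```

The first two sample packets fall into the same class because their source addresses share the first 24 bits and they have the same destination port.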

FULVIO We have not said anywhere that, while Filtering and Field Extraction return something to the user for each accepted packet, Classify returns nothing (but is that actually right?)

FRANCESCO: I have now said these things in the last paragraph of the respective presentation section.

Field extraction

This feature is used to transmit the values of some protocol header fields from the NetPE to other NetPEs or to external entities such as a user application. The syntax is:

ExtractFields (protocol_fields) on port PortID

where "protocol_fields" is a comma separated list of protocol header fields (as specified in the "term" grammar above; FULVIO: what is the TERM GRAMMAR? FRANCESCO: "specified above", I have written it now) to be returned on port PortID (e.g. ExtractFields(ip.src,ip.dst,tcp.sPort,tcp.dPort) on port 2).

This action creates a buffer containing the required fields in the same order in which they have been specified. That buffer is transmitted through the port identified by the 'PortID' label, which uniquely identifies the port. FULVIO: I would avoid saying this. The output format of the data depends on the NetVM and not on the language I use to define the program. At most (indeed, it would seem right and proper to me) I would add a link to the NetVM page that explains the output formats of this data.

If no other actions have been programmed for the NetPE, the incoming packet is discarded.

Example                                               Description
tcp or udp ExtractFields(ip.src, ip.dst) on port 1    for each accepted packet extracts source and destination ip address sending them on port 1
ExtractFields(ip[1:20]) on port 2                     for each accepted packet extracts the first 20 bytes of ip protocol header sending them on port 2

This action sends internally built packets out through the specified port. That port is configured as an output push port, so it is the NetPE that initiates the transmission to the connected entity.

FULVIO: there is no mention of what happens when I decide to perform several actions at the same time (it is stated at the beginning that they are separated by ";" but there is no explicit example). Perhaps an example is enough.

Port auto-declaration (FULVIO: "auto" sounds wrong to me ;-) )

Consider a NetPE: it can have several input and output ports, and its behaviour can differ depending on the port from which a packet stream is received. The language proposed here declares the NetPE ports automatically: a single input port is created for the NetPE, together with as many output ports as are mentioned in the program string.

The language provides no way to declare the input port; the NetPE is simply created with a single input port. Declaring more than one input exchange port is not allowed, because the meaning and the name of each port must be known when the NetPE is connected to the ports of other NetPEs.
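The idea that the first mention of a label declares an output port can be sketched in Python as a scan of the program string. This is a simplified illustration based on a regular expression, not the compiler's actual parser: it covers only the labelled "on port PortID" form, ignores the pull port configured by Classify, and does not skip mnemonic strings. The function name is hypothetical.

```python
import re
from typing import List


def declared_output_ports(program: str) -> List[str]:
    """List the output port labels of a program string in order of first mention."""
    ports: List[str] = []
    for label in re.findall(r"\bon\s+port\s+(\w+)", program):
        # A repeated label refers to the port declared by its first occurrence.
        if label not in ports:
            ports.append(label)
    return ports


program = ("tcp or udp ExtractFields(ip.src, ip.dst) on port 1; "
           "ExtractFields(tcp.sPort, tcp.dPort) on port 2")
print(declared_output_ports(program))
```

Under these assumptions the NetPE for this program string would be created with one input port and two output ports.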

FOR FULVIO: the wording of this section may need some revision

Language Examples

Example: ip.src == 10.0.0.1 and tcp ReturnPacket on port 2
Description: filters all the tcp/ip packets whose source address is 10.0.0.1 and sends them on port 2 of NetPE
FULVIO: isn't it possible to drop the keyword "on" (which seems useless and verbose to me)?

Example: tcp or udp ExtractFields(ip.src, ip.dst) on port 1; ExtractFields(tcp.sPort, tcp.dPort) on port 2
Description: for each tcp or udp packet, extracts the source and destination ip addresses, sending them on port 1, and the tcp source and destination ports, sending them on port 2

Example: tcp.syn==1 Classify(ip.src "New sessions from", tcp.dPort "Requested service")→(CountPackets)
Description: filters all packets that open new tcp sessions and, for each one, increments the proper counter
FULVIO: don't we indicate any port here? How do I access this data once it has been classified?

Suggestion: before reinventing the wheel, read the page http://windump.polito.it/docs/manual.htm, which is the windump manual. Only the section that specifies the filtering language concerns you, but I think it can give you an interesting idea of how filters are handled (and explained) there. For example, I would not mind if all the WinDump examples were rewritten to show how they can be written in NetBee. Uh oh... have we reinvented the wheel? Here we propose a very verbose syntax, while windump proposes a very network-oriented syntax.

Language Specification

The complete NetPFL language specification document can be found at http://nbee.org/download/netpfl-20100315.pdf

Further developments: Developments, Report

 
netpfl/index.txt · Last modified: 2010/08/03 15:56 by admin