NetPDL Expressions

Both the protocol header description and the encapsulation description require support for arithmetic and string expressions. For instance, the length field of an IP packet must be evaluated before determining whether that protocol contains any optional fields.

The NetPDL language provides some elements that describe arithmetic and string expressions.

Expressions Basics

This section includes some tables that show the main concepts of the grammar (operators, operands) and it provides some examples from the real world.

Data types

NetPDL defines three different data types for operands:

Type | Description
number | The value is a number. Please note that a number is always an unsigned integer.
buffer | The value is a string contained in a buffer.
protocol | The value is a reference to a protocol.

Every operation must be performed on operands of the same data type.

Operators

This table lists the allowed operators, their meaning and the number and types of operands accepted.

Symbol | # of Operands | Operand data types | Description
+ | 2 | number | Addition
- | 2 | number | Subtraction
* | 2 | number | Multiplication
div | 2 | number | Division
mod | 2 | number | Modulo
bitwand | 2 | number | Bitwise AND
bitwor | 2 | number | Bitwise OR
bitwnot | 1 | number | Bitwise NOT
== | 2 | number, buffer, protocol | Equal
!= | 2 | number, buffer, protocol | Different
ge | 2 | number | Greater or Equal
gt | 2 | number | Greater Than
le | 2 | number | Less or Equal
lt | 2 | number | Less Than
and | 2 | number | Boolean AND
or | 2 | number | Boolean OR
not | 1 | number | Boolean NOT

In a boolean expression, 'true' is represented by a non-zero number and 'false' by zero.

Arithmetic values are signed integers, at most 32 bits wide.

Warning: please note that the "-" sign is an operator. In order to keep the syntax simple, fields and variables cannot use this character (e.g. a variable named "$time-sec" is not allowed).

Operands

Operand syntax

Each operand has its own syntax. The following table lists the syntax allowed for each kind of operand and the corresponding data type.

Symbol | Data Type | Description
[0-9]+ | number | Decimal Number
0x[0-9a-fA-F]+ | number | Hex Number
0b[0-1]+ | number | Binary Number
'[^']*' | buffer | String. A string is a generic buffer and can also contain unprintable characters. To insert a non-printable character, use the "\xnn" notation, where nn is the hex code of the wanted character. For instance, the string '\xFF\xFF\xFF\xFF\xFF\xFF' is a 6-byte buffer containing the value 0xFFFFFFFFFFFF. Two special cases apply: if the delimiter "'" is part of a string, its hex code (\x27) must be used instead; if the substring "\x" is part of a string, one of its two characters must be replaced by its hex code (i.e. "\x" becomes either "\x5Cx" or "\\x78"). Please note that any other escape sequence (e.g. "\n" in the C language) is left untouched.
$[0-9a-zA-Z_]+\.[0-9a-zA-Z_]+ | (any) | NetPDL variable. A simple variable has a name that does not contain the 'dot' character (e.g. $packetlength); a name that contains a dot refers to a field inside a lookup table (e.g. $tcpsessiontable.protocol). Variables are typed; more details are available in the NetPDL variables section.
[0-9a-zA-Z_]+ | buffer | Reference to a protocol field. Please note that there is a special protocol field named this, which is used when there is no need to indicate which field we are referring to. This keyword is used mostly in the NetPDL Visualization Extension. Being a field name (although with a special meaning), it supports partial access through the "[]" operator. For instance, the expression buf2int(this[0:3]) == 0x0008BD compares the first 3 bytes of the current field with the value 0x0008BD. A reference to a protocol field always returns a string in network byte order; in order to translate this string into a number, the buf2int() function must be used. Please note that a field larger than 32 bits cannot be returned as a number.
#[0-9a-zA-Z_]+ | protocol | Reference to a protocol. This operand is used in the special cases in which we need to refer to a protocol (e.g. when we want to say "jump to protocol ##").

Some additional notes apply to NetPDL variables and to references to protocols. Variables MUST begin with the "$" character and protocol references MUST begin with the "#" character; these characters are not allowed as the first character of a protocol field name.
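The escaping rules for string operands can be illustrated with a short Python sketch. It is only a model of the rules stated in the table, not code taken from a NetPDL engine: the function name decode_netpdl_string is hypothetical, and the sketch assumes that all characters outside "\xnn" escapes fit in one byte (Latin-1).

```python
import re

# Matches one "\xnn" escape: a backslash, the letter x, two hex digits.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_netpdl_string(body: str) -> bytes:
    """Decode the text between the quotes of a string operand into raw bytes.

    Hypothetical helper, not part of any NetPDL engine. Only "\\xnn" escapes
    are translated; every other character is copied as-is.
    """
    out = bytearray()
    pos = 0
    for match in _HEX_ESCAPE.finditer(body):
        out += body[pos:match.start()].encode("latin-1")  # literal text
        out.append(int(match.group(1), 16))               # the escaped byte
        pos = match.end()
    out += body[pos:].encode("latin-1")                   # trailing literal text
    return bytes(out)
```

With this model, both "\x5Cx" and "\\x78" decode to the two characters "\x", while a C-style sequence such as "\n" stays a backslash followed by the letter n.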

Partial data access

Expressions can have access to a portion of a NetPDL field (i.e. <field>) using the following syntax:

fieldname[startingoffset:size]

Both startingoffset and size are expressed in bytes and can be other expressions; therefore their value can depend on other protocol fields, run-time variables and more. In case the size is equal to zero, the expression will return everything from the starting offset to the end of the buffer.

For instance, if we have to compare the first part of the MAC address (the OUI portion) with a string, we can use an expression like the following one:

macsrc[0:3] == '\x00\x08\xBD'

Partial access is permitted only to buffer-based operands, i.e. field references and buffer/refbuffer variables.
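The [startingoffset:size] notation differs from the start:stop slices found in many programming languages, because the second value is a length. The following Python sketch models that behavior under the rules stated above; the function name partial and the sample MAC address are made up for illustration.

```python
def partial(buffer: bytes, offset: int, size: int) -> bytes:
    """Model of fieldname[offset:size].

    Returns `size` bytes starting at `offset`; a size of zero means
    "everything from the offset to the end of the buffer".
    """
    if size == 0:
        return buffer[offset:]
    return buffer[offset:offset + size]


# Illustrative source MAC address whose OUI portion is 00:08:BD.
macsrc = bytes.fromhex("0008bd112233")

oui_matches = partial(macsrc, 0, 3) == b"\x00\x08\xbd"   # macsrc[0:3] == '\x00\x08\xBD'
tail = partial(macsrc, 3, 0)                             # macsrc[3:0], up to the end
```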

Nested Expressions

Expressions can be nested. Precedence is given to the sub-expression contained in the innermost brackets (i.e. "(expression)"); the precedence assigned to the operators is the following:

Precedence | Operator
Highest | NOT, BITWNOT
 | *, DIV, MOD
 | +, -
 | BITWAND, BITWOR
 | GE, GT, LE, LT, ==, !=
Lowest | AND, OR

For instance, the following expression:

buf2int(headerlength) * 4 - 20

is used to calculate the size of the options in an IPv4 packet: the result is obtained by multiplying the headerlength field by four, then subtracting '20' (the standard header size) from the resulting value.
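As a worked example, the expression can be modeled in Python. This is a sketch under two assumptions: buf2int() is modeled as a big-endian (network byte order) conversion limited to 32 bits, as described in the operand table, and the headerlength field is assumed to reach the expression as a one-byte buffer.

```python
def buf2int(buffer: bytes) -> int:
    """Model of buf2int(): interpret a buffer in network byte order as a number."""
    if len(buffer) > 4:
        raise ValueError("a field larger than 32 bits cannot be returned as a number")
    return int.from_bytes(buffer, "big")


def ipv4_options_length(headerlength: bytes) -> int:
    """Model of the expression buf2int(headerlength) * 4 - 20.

    The multiplication binds tighter than the subtraction, so no brackets
    are needed.
    """
    return buf2int(headerlength) * 4 - 20
```

A header length of 5 (the minimum) gives 5 * 4 - 20 = 0 bytes of options, while a header length of 6 gives 4 bytes.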

NetPDL variables

NetPDL supports two types of variables: simple variables and bi-dimensional arrays (lookup tables). Variables store temporary data that can be used while processing packets (e.g. a state that must be maintained in order to process subsequent packets).

Variable basics

Data types associated to variables

Data associated to variables can be of four different types:

Type | Description
number | The variable contains a number.
buffer | The variable contains a string.
protocol | The variable contains a reference to a protocol.
refbuffer | The variable contains a string passed by reference.

The first three data types are the same as those defined for operands. A refbuffer is functionally equivalent to a buffer (e.g. a refbuffer variable can be compared to a string); the difference appears when a value is assigned to the variable. For example, the operation $tempvar = $packet[10:13] (i.e. assigning 13 bytes of variable $packet, starting at offset 10, to variable $tempvar) results in a copy of the data if $tempvar is a buffer, or in a copy of the reference if $tempvar is a refbuffer. Therefore, a modification to $tempvar does not affect the content of the $packet variable in the first case, while it is propagated to $packet in the second case.
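The difference between the two assignment semantics can be sketched in Python, where a bytearray copy plays the role of a buffer and a memoryview plays the role of a refbuffer. The function names are hypothetical, and the slice follows the [startingoffset:size] convention of the partial data access section.

```python
def assign_buffer(source: bytearray, offset: int, size: int) -> bytearray:
    """buffer semantics: the selected bytes are copied."""
    return bytearray(source[offset:offset + size])


def assign_refbuffer(source: bytearray, offset: int, size: int) -> memoryview:
    """refbuffer semantics: the result is a view onto the same memory."""
    return memoryview(source)[offset:offset + size]


packet = bytearray(range(32))            # stands in for $packet

copy = assign_buffer(packet, 10, 13)     # $tempvar declared as buffer
copy[0] = 0xFF                           # packet[10] keeps its old value (10)

ref = assign_refbuffer(packet, 10, 13)   # $tempvar declared as refbuffer
ref[0] = 0xEE                            # packet[10] is now 0xEE as well
```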

Standard variables

A standard variable is a simple memory location containing a value (of the specified data type). Some variables are predefined inside the NetPDL engine (e.g. $linktype contains the link layer type of the packet currently being processed), while others can be declared by the user.

Lookup tables

A lookup table is similar to a bi-dimensional array in which each record is split between the key (i.e. the information required to locate the proper record), and the data (i.e. the information associated to the record itself). For example, a TCP session table can contain 4 keys (the source/destination address and the source/destination port), and a couple of data fields that keep the timestamp of the last packet and the application-layer protocol contained in the TCP session, as shown in the example below, related to the “$tcpsessiontable” lookup table:

1st Key: IP source | 2nd Key: IP dest. | 3rd Key: Port source | 4th Key: Port Destination | 1st Data: Timestamp of last packet | 2nd Data: Application-layer protocol for this session
10.1.1.1 | 20.2.2.2 | 80 | 2234 | 10:11:12.122469 | #http
30.3.3.3 | 40.4.4.4 | 2253 | 25 | 10:11:13.199652 | #smtp

Although lookup tables are conceptually similar to variables, there are some important differences. First, lookup tables must be managed as a set of records (that can be added, deleted, invalidated, etc.); second, values are associated to a single member of a single record (e.g. the timestamp of the second session is 10:11:13.199652, which implies that we need to specify which record we are referring to and, within that record, which data field we are referring to); third, the internal implementation inside a NetPDL engine may be rather different for variables and lookup tables.

Getting access to a standard variable / lookup table from an expression

Variables can be used in any expression: the NetPDL runtime engine will replace the variable name with its actual value.

From the programmer standpoint, getting access to a lookup table is similar to getting access to a simple variable. For instance, the following expressions are valid:

$packetlength - 4 | Returns the value contained in variable $packetlength decreased by 4.
$tcpsessiontable.protocol == #http | Returns 'true' if the current record within the $tcpsessiontable lookup table contains the value #http in its 'protocol' member (i.e. the data field called 'protocol').

However, things are more complicated in the case of lookup tables, because we cannot refer to the entire lookup table: we have to refer to a single record and, within that record, to a single data field. Therefore, we first have to locate a record in the lookup table, and then we can select the wanted data member. The first task can be done through the checklookuptable() function, while the second one can be done using the syntax $tablename.membername. The "Checking the value inside a lookup table" example shows how variables can be used to get the value stored in a lookup table.

Definitions of NetPDL variables

Defining standard NetPDL variables: the <variable> element

Variables are defined through the <variable> element. This element supports the following attributes in addition to the Standard ones:

Attribute | Description
name (required) | A unique name that identifies the variable.
type (required) | The data type of the variable. Allowed values: number (the variable contains a number); buffer (the variable contains a string); protocol (the variable contains a reference to a protocol); refbuffer (the variable contains a string passed by reference).
validity (required) | The duration (i.e. validity) of the variable. Allowed values: thispacket (the variable is valid while processing the current packet; it is initialized again by the NetPDL engine when it starts analyzing the next packet); static (the variable has permanent validity; the NetPDL engine keeps it forever).
size (required for buffer, meaningless otherwise) | The size of the variable. This attribute is required only for buffer variables (in this case the size is in bytes), while it is meaningless for other variable types.
value (optional) | The initialization value. This attribute is allowed only on buffer variables (in this case the initialization value is a string) and number variables (in this case the initialization value must be a number). If this attribute is missing, variables are initialized to a NULL value (e.g. 0 for number variables and the empty string for buffer variables). The value is assigned each time the variable must be initialized, i.e. only once for a variable with static validity, or each time a new packet is processed for a variable with thispacket validity. Warning: the initialization value must be a simple string (without the ' marker at the beginning and at the end) or a number; expressions are not supported at initialization time.

The <variable> element does not have any child nodes.

Example: defining a variable that contains the packet length

The variable that will contain the value of the packet length can be defined as follows:

<variable name="$packetlength" type="number" validity="thispacket"/>

The variable will contain numbers, and its content must be cleared at every packet.

Example: defining a variable that contains the IP source address

The variable that will contain the value of the IP source address can be defined as follows:

<variable name="$ipsrc" type="refbuffer" validity="thispacket"/>

This variable is a reference to a buffer (i.e. it can point to the position, within the packet buffer, related to the IP source address) and its content must be cleared at every packet.

Defining NetPDL lookup tables: the <lookuptable> element

Lookup tables are defined through the <lookuptable> element. This element supports the following attributes in addition to the Standard ones:

Attribute | Description
name (required) | A unique name that identifies the lookup table. Please note that lookup tables are still variables, hence their name must begin with the $ character.
validity (required) | It defines whether the table is made up of static entries only (i.e. an entry stays in the table until explicitly deleted), or whether it allows the insertion of dynamic entries as well. Dynamic entries are entries whose duration is limited by a 'lifetime' parameter, which is checked automatically by the NetPDL engine. In other words, a dynamic entry is deleted automatically by the NetPDL engine when its lifetime expires. Allowed values: static (records inside the lookup table never expire; the NetPDL engine keeps them until an explicit "delete" command is invoked); dynamic (default; records inside the lookup table are deleted automatically when their lifetime expires, unless they are refreshed). More details follow in the Defining time-limited records section.
exactentries (required) | The maximum number of 'exact' records within the lookup table. 'Exact' entries are the ones in which all the keys are completely defined and do not have any masks. NetPDL engines may override this parameter and dynamically resize the lookup table. However, this is application-dependent, and programmers must consider this value as the maximum allowed number of records.
maskentries (required) | The maximum number of 'masked' records within the lookup table. 'Masked' entries are the ones in which some of the keys are not completely defined, e.g. a record that contains an unknown value for a key. In this case, the match is computed according to the values of the known keys, skipping the value of the 'masked' key. NetPDL engines may override this parameter and dynamically resize the lookup table. However, this is application-dependent, and programmers must consider this value as the maximum allowed number of records.

The <lookuptable> element requires the following child nodes:

Element | Description
<key> | It defines a field that is used as lookup key.
<data> | It defines a field that contains some data related to the selected record.

Defining the members of a lookup table: the <key> and <data> elements

Before introducing the details about <key> and <data> elements, it is useful to remember that a lookup table is a table made up of several records. Each record is split in two parts: the lookup key, that is used to identify the record (e.g. in a TCP session table, the lookup key will be the IP source/destination and the TCP port source/destination fields), and the associated data (e.g. the timestamp associated to the current session, the application-layer protocol, and more).

<key> and <data> elements look like a standard variable and define the members of a lookup table. Both <key> and <data> have the same format; they do not support any child elements, and they support the following attributes in addition to the Standard ones:

Attribute | Description
name (required) | A unique name that identifies the key / data member.
type (required) | The data type of the key / data member. Allowed values: number (the field contains a number); buffer (the field contains a string); protocol (the field contains a reference to a protocol). Please note that <key> and <data> cannot assume the refbuffer data type, which is allowed in standard variables instead.
size (required for buffer, meaningless otherwise) | The size of the key / data member. This attribute is required only for buffer members (in this case it contains the size in bytes), while it is meaningless for other types.

Example: defining a lookup table for TCP sessions

The lookup table that will contain a summary of all TCP sessions can be defined as follows:

  <lookuptable name="$tcpsessiontable" exactentries="10000" maskentries="0" validity="dynamic">
    <key name="ipsource" type="buffer" size="16"/>
    <key name="ipdest" type="buffer" size="16"/>
    <key name="tcpsrcport" type="buffer" size="2"/>
    <key name="tcpdstport" type="buffer" size="2"/>
    <data name="nextproto" type="protocol"/>
  </lookuptable>

This definition creates a lookup table with 10000 exact entries (masked entries are not supported) whose records remain valid until they expire. Each entry is made up of 4 keys (the IP source/destination addresses and the TCP source/destination ports) and three data fields. Keys are buffers (the first two are 16 bytes long in order to accommodate IPv6 addresses as well), hence data is copied into them (and not passed by reference). Among the data fields, two are numbers (the timestamp and the lifetime, which are defined automatically because the table allows dynamic entries), and the last one is a protocol, which keeps track of the application-level protocol associated with the session.

Defining time-limited records

Lookup tables are often used to contain TCP/UDP session tables. Records inside session tables are characterized by an "expiration time" that is used to clean up zombie sessions (e.g. TCP sessions that died unexpectedly).

In order to activate this functionality, NetPDL programmers must:

  • define the validity="dynamic" attribute inside the <lookuptable> element
  • add entries with the validity attribute equal to one of the following: keepmaxtime, updateonhit, replaceonhit, addonhit.

When a table contains dynamic entries, two more data fields are automatically defined:

timestamp (number) | This field contains the timestamp (in seconds, according to the UNIX time format) of the last packet that "hit" the current record.
lifetime (number) | This field contains the maximum expiration time (in seconds); if no packets hit the current record for a time larger than this value, the record is deleted automatically. For example, if the timestamp field contains the value 11:07:33am and the lifetime field contains the value 60 (in seconds), the entire record is deleted if no packets hitting that record are seen by 11:08:33am.

Timestamp and lifetime data fields are “special” fields only in case dynamic entries are allowed. In this case the timestamp data field is automatically updated by the NetPDL engine each time there is a new hit for that session (i.e, when the function updatelookuptable() is called). Also the garbage collection is under the responsibility of the NetPDL engine. In other words, the NetPDL engine will automatically detect expired records (according to the values present in the timestamp and lifetime fields) and delete them from the lookup table.

Note. Since the timestamp and lifetime data fields are associated with each record, the user can get access to these values in the usual way (e.g. $tcpsessiontable.timestamp), even if these fields have not been declared in the definition of the lookup table.
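The interplay between the timestamp and lifetime fields can be sketched with a small Python model. It is an illustration only: the names Record, is_expired, refresh and collect_garbage are hypothetical, and the sketch assumes that a record expires when its idle time is strictly larger than its lifetime; an actual NetPDL engine may treat the boundary differently.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Record:
    timestamp: int  # UNIX time (seconds) of the last packet that hit the record
    lifetime: int   # maximum idle time, in seconds


def is_expired(record: Record, now: int) -> bool:
    """A record expires when it has been idle for longer than its lifetime."""
    return now - record.timestamp > record.lifetime


def refresh(record: Record, now: int) -> None:
    """Model of a hit: the engine stores the time of the latest packet."""
    record.timestamp = now


def collect_garbage(table: Dict[Tuple, Record], now: int) -> None:
    """Delete every expired record, as the engine does automatically."""
    for key in [k for k, r in table.items() if is_expired(r, now)]:
        del table[key]
```

In this model a record with lifetime 60 that was last hit at time t survives a check at t + 60 and is removed by a check at t + 61, unless refresh() is called in between.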

Getting access to NetPDL variables

Variables can be used in two different ways: as r-values in expressions (the variable is replaced by its value, which is then used in the expression), or as l-values in assignments (the value of an expression is assigned to the variable).

These two modes of usage correspond to different elements inside the NetPDL language.

Using variables in expressions

Variables can be used in any expression: the NetPDL runtime engine will replace the variable name with its actual value.

Lookup table fields can also be used in an expression, but things are more complicated because we cannot refer to the entire lookup table; we have to refer to a single record in the lookup table. Therefore, we first have to locate a record in the lookup table, and then we can select the wanted data field. The first task can be done through the checklookuptable() function, while the second one can be done using the syntax $tablename.membername. The "Checking the value inside a lookup table" example shows how variables can be used to get the value stored in a lookup table.

Warning. Please remember that in order to get access to a lookup table field, the checklookuptable() function must always be used first to locate the correct record, as shown in the "Checking the value inside a lookup table" example.

Example: Checking the value of the next byte in the packet buffer

In case the protocol processing depends on the next byte in the packet buffer, the following code can be used:

  <switch expr="$packet[$currentoffset:1]">
    <case value="1"> ...</case>
    <case value="2"> ...</case>
    ...
  </switch>

The expression in the <switch> element extracts one byte at the current position in the packet buffer (the $currentoffset variable keeps the offset of the current processing point) and compares it against a set of alternatives (contained in the <case> branches).

This example shows that variables can be used either alone, or with the [] operator for partial data access (this is valid only for buffer and refbuffer variables).

Example: checking the value inside a lookup table

This code fragment (which refers to the $tcpsessiontable lookup table defined in one of the previous examples) shows how to get access to a value stored inside a lookup table.

  <if expr="checklookuptable($tcpsessiontable, $ipsrc, $ipdst, $portsrc, $portdst)">
    <if-true>
      <if expr="$tcpsessiontable.nextproto == #http">
        <if-true>
        ....
        </if-true>
      </if>
    </if-true>
  </if>

The first <if> selects the wanted entry in the $tcpsessiontable lookup table according to the $ipsrc, $ipdst, $portsrc, $portdst parameters (which are variables containing the IP source/destination addresses and the source/destination TCP ports). If the entry exists, the checklookuptable() function returns 'true', hence the <if-true> branch is executed. The second <if> checks the value of the nextproto member of the selected entry: if the value is #http (i.e. the application-layer protocol of that TCP session is HTTP), then the second <if-true> is executed, and so on.

Updating the value associated to a standard variable: the <assign-variable> element

The <assign-variable> element enables assigning a new value to a standard variable element. The allowed attributes (in addition to the Standard ones) are the following:

Attribute | Description
name (required) | It identifies the variable by specifying its name (and, optionally, some of its offsets). The variable name can have two forms: $variablename, for access to the entire variable; $variablename[startingoffset:size], for variables whose data type is buffer or refbuffer when partial access is required (i.e. only part of the variable has to be modified). Warning: both startingoffset and size must be simple numbers; expressions are not allowed here.
value (required) | It contains the expression that has to be evaluated and assigned to the variable.

Example: updating the $packetlength runtime variable in an Ethernet DIX frame

An Ethernet DIX frame does not have the indication of the length of the frame currently being transmitted. This causes some problems during packet parsing if the total length of the packet is less than 64 bytes because a frame will contain both valid data and the Ethernet padding at the end.

To avoid this problem and to detect the Ethernet padding, you can update the $packetlength run-time variable with the proper packet length when decoding a network-level packet (for example the IP header). This can be done with the following code, included in the IP header:

<assign-variable name="$packetlength" value="buf2int(totlength) + 14"/>

This piece of code updates the $packetlength run-time variable with the length of the IP packet (which is stored in the totlength IP field) plus the length of the Ethernet header (14 bytes, excluding the CRC).

Updating the value associated to a field inside a lookup table: the <assign-lookuptable> element

The <assign-lookuptable> element enables assigning a new value to a field contained in a lookuptable record. The allowed attributes (in addition to the Standard ones) are the following:

Attribute | Description
name (required) | It identifies the field by specifying its name (and, optionally, some of its offsets). The name can have two forms: $variablename.membername, for access to the entire lookup table field; $variablename.membername[startingoffset:size], for lookup table fields whose data type is buffer when partial access is required (i.e. only part of the field has to be modified). Warning: both startingoffset and size must be simple numbers; expressions are not allowed here.
value (required) | It contains the expression that has to be evaluated and assigned to the field.

Warning. Please remember that in order to get access to a lookup table field, the checklookuptable() function must always be used first to locate the correct record, as shown in the following example.

Example: updating the nextproto of a record inside the $tcpsessiontable lookup table

This example checks if a given session (identified by the tuple $ipsrc, $ipdst, $portsrc, $portdst) belongs to the $tcpsessiontable lookup table. If so, it updates the next protocol of that record:

  ...
  <if expr="checklookuptable($tcpsessiontable, $ipsrc, $ipdst, $portsrc, $portdst)">
    <if-true>
      <assign-lookuptable name="$tcpsessiontable.nextproto" value="#http"/>
      ...
    </if-true>
  </if>

Updating a record inside a lookup table: the <update-lookuptable> element

The <update-lookuptable> element modifies the list of records contained in the lookup table. Please note that it does not allow modifying the content of a field inside an existing record (the <assign-lookuptable> element has been defined for that); it only allows adding records, deleting them, and marking them as obsolete. The allowed attributes (in addition to the Standard ones) are the following:

Attribute | Description
name (required) | It identifies the lookup table by specifying its name.
action (required) | The action that has to be performed. Allowed values: add (adds a new record to the lookup table); purge (deletes an existing record from the lookup table); obsolete (marks an existing record as obsolete). In the obsolete case the record is not deleted immediately (as it is with purge), but it can be deleted at any time by the NetPDL engine. For instance, the NetBee library keeps the session for an additional 2 minutes and then removes it. The obsolete action is useful for keeping a record for some time during which it may still be needed. For example, a TCP session in which a FIN flag has been sent is usually about to end, but some packets may still be transmitted in order to acknowledge the data sent in the other direction. In this case, if we purge the session as soon as we see a FIN flag, the following packets will not be recognized, while if we mark the session as obsolete, those packets will still find an entry for it. Please note that there is no guarantee about how long a record marked as obsolete survives, since this depends on the NetPDL engine: some engines keep the entry for a few seconds, while others keep it until space is required in the lookup table.

Please note that the additional attribute validity is required in case of an add action. This attribute defines whether the record has permanent validity or has to be managed automatically by the lookup table (e.g. deleted at the end of its lifetime). It can assume the following values:

Value | Description
keepforever | It defines a record that has permanent validity. This record is never updated by the lookup table; any modification must be performed by the user. This is the only type of record allowed in lookup tables that do not support dynamic entries.
keepmaxtime | It defines a record whose expiration time is known in advance. This record is never updated by the lookup table; however, the lookup table deletes it when it becomes older than its expiration time.
updateonhit | It defines a record whose expiration time depends on the last access to the record. The lookup table automatically updates the timestamp of the record each time the function updatelookuptable() is called. When the record is not accessed for more than its lifetime, it is deleted from the lookup table.
replaceonhit | It defines a masked record whose expiration time depends on the last access to the record. This kind of record is used to insert an entry that is not completely specified. For instance, in case of an FTP passive session, the new record has to be inserted in the tcpsessiontable as IPsrc, IPdst, *, PORTdst, since the source port is unknown. With this value, the 'masked' record is replaced with the corresponding exact record as soon as a match occurs. From that point on, the record becomes a standard 'exact' record with its own lifetime, and it is managed in the same way as an updateonhit record.
addonhit | It defines a masked record whose expiration time depends on the last access to the record. This kind of record is used to insert an entry that is not completely specified. For instance, in case of a known server (e.g. the Google web server), the new record has to be inserted in the tcpsessiontable as IPgoogle, *, 80, *, since the data related to the client is unknown. This record can be used for recognizing that any traffic directed to IPgoogle, port 80 is HTTP. However, this record must not be replaced with the exact record at the first hit. Instead, it is kept in the table, and a new 'exact' record is inserted when a hit occurs. From that point on, the new 'exact' record has its own lifetime, and it is managed in the same way as an updateonhit record.

In case of keepmaxtime, updateonhit, replaceonhit and addonhit records, some further additional attributes are defined:

Attribute | Description
keeptime | This attribute must be defined for every record (except for keepforever records) and defines the validity of the record (in seconds) prior to the first hit. In case of updateonhit, replaceonhit or addonhit records, the lifespan is updated with another value (see the following attributes) after the first hit. This makes it possible to differentiate the lifespan of a record before and after a hit (although these values are usually the same). Only decimal numbers are supported (e.g., keeptime="10").
hittime | This attribute must be defined for updateonhit and addonhit records, and defines the validity of the record (in seconds) after the first hit. In case of an addonhit record, this attribute refers to the base record (i.e., the 'masked' one). The lifespan of the record is renewed automatically after any hit (i.e., when updatelookuptable() is used). Only decimal numbers are supported (e.g., hittime="10").
newhittime | This attribute must be defined for replaceonhit and addonhit records, and defines the validity (in seconds) of the new record that is created after a hit on the 'masked' one. The lifespan of the new record is renewed automatically after any hit (i.e., when updatelookuptable() is used). Only decimal numbers are supported (e.g., newhittime="10").
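The behavior of replaceonhit and addonhit records, together with the keeptime, hittime and newhittime attributes, can be illustrated with a simplified Python model. This is a sketch under several assumptions and not the algorithm of any NetPDL engine: the names Entry, add and hit are hypothetical, lookup and refresh are merged into the single hit() function, exact records are searched before masked ones, and a masked key is represented by None.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ANY = None  # marks a masked key, written '*' in the text


@dataclass
class Entry:
    keys: Tuple
    validity: str        # "updateonhit", "replaceonhit" or "addonhit"
    expires: int         # absolute time (seconds) at which the record dies
    hittime: int = 0     # lifespan granted to this record after a hit
    newhittime: int = 0  # lifespan of the exact record created by a masked hit


def add(table, keys, validity, now, keeptime, hittime=0, newhittime=0):
    """Insert a record that lives for keeptime seconds until its first hit."""
    table.append(Entry(tuple(keys), validity, now + keeptime, hittime, newhittime))


def _matches(pattern, keys):
    """True if every non-masked key of the pattern equals the given key."""
    return len(pattern) == len(keys) and all(
        p is ANY or p == k for p, k in zip(pattern, keys))


def hit(table: List[Entry], keys: Tuple, now: int) -> Optional[Entry]:
    """Look up keys at time now and apply the validity rules to the match."""
    table[:] = [e for e in table if e.expires >= now]  # drop expired records
    for entry in table:                                 # exact records first
        if entry.keys == keys:
            if entry.validity == "updateonhit":
                entry.expires = now + entry.hittime
            return entry
    for entry in table:                                 # then masked records
        if ANY in entry.keys and _matches(entry.keys, keys):
            if entry.validity not in ("replaceonhit", "addonhit"):
                return entry
            exact = Entry(keys, "updateonhit", now + entry.newhittime,
                          hittime=entry.newhittime)
            if entry.validity == "replaceonhit":
                table.remove(entry)                     # masked record is replaced
            else:
                entry.expires = now + entry.hittime     # masked record is kept
            table.append(exact)
            return exact
    return None
```

In this model, a replaceonhit record for an FTP passive session (IPsrc, IPdst, *, PORTdst) disappears at the first matching packet and leaves a single exact record behind, while an addonhit record for a known server (IPserver, *, 80, *) stays in the table and gains one additional exact record per new client session.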

The <update-lookuptable> element requires the following child nodes:

Element | Description
<lookupkey> | It defines the values that must be used as lookup keys.
<lookupdata> | It defines the values that must be assigned to data fields.

Defining the values for the members of a lookup table: the <lookupkey> and <lookupdata> elements

The <lookupkey> and <lookupdata> elements look much like the <key> and <data> elements presented before. These elements are used to define the values of the keys and data that are required to identify a record inside the lookup table.

The <lookupkey> element must always be present as a child of the <update-lookuptable> element. Conversely, the <lookupdata> element is required only when a new record has to be inserted (action="add" in the <update-lookuptable> element). In any case, the number of <lookupkey> and <lookupdata> (if needed) elements must be equal to the number of <key> and <data> elements used to define the lookup table. In addition, values are assigned in the same order (i.e. the first <lookupkey> element corresponds to the first <key> defined in the lookup table).

<lookupkey> and <lookupdata> do not have child nodes and they support the following attributes in addition to the Standard ones:

Attribute | Description
value (required) | It contains the expression that has to be evaluated and used as key or data for that field.

Example: deleting a TCP session from the $tcpsessiontable in case the FIN flag is turned on

In this example, a TCP session is deleted from the $tcpsessiontable in case the FIN flag is turned on:

  <if expr="buf2int(fin) == 1">
    <if-true>
      <update-lookuptable name="$tcpsessiontable" action="purge">
        <lookupkey value="$ipsrc"/>
        <lookupkey value="$ipdst"/>
        <lookupkey value="$portsrc"/>
        <lookupkey value="$portdst"/>
      </update-lookuptable>
    </if-true>
  </if>

Please note that the order of the <lookupkey> elements matches the order of the <key> elements used in the definition of the lookup table. That is, the first <lookupkey> contains the IP source address, because the first <key> was defined to contain that same field.

Example: adding a new FTP data session to the $tcpsessiontable in case of a passive FTP

In this example, a new FTP data session is added to the $tcpsessiontable:

  <if expr="ispresent(port) and ispresent(code) and code == '227'">
    <if-true>
      <!-- Passive mode: server always uses port specified in the message; client port is unknown -->
      <assign-variable name="$ftpdataport[0:1]" value="int2buf( ascii2int(port1), 1)" />
      <assign-variable name="$ftpdataport[1:1]" value="int2buf( ascii2int(port2), 1)" />

      <update-lookuptable name="$tcpsessiontable" action="add" validity="replaceonhit" keeptime="60" newhittime="300">
        <lookupkey value="$ipsrc"/>
        <lookupkey value="$ipdst"/>
        <lookupkey value="$ftpdataport"/>
        <lookupkey value="'\x00\x00'" mask="0x0000"/>
        <lookupdata value="#ftpdata"/>
      </update-lookuptable>

      <!-- Please add the other side of the connection here -->
    </if-true>
  </if>

In this case, the client port is unknown (hence a 'masked' entry is used); this entry will be replaced by the corresponding exact entry as soon as a match on this record occurs. The masked record is kept for 60 seconds (keeptime) and it is deleted if no hit occurs before this time expires. Conversely, if a hit occurs, the entry is replaced with a new exact entry, whose lifespan is 300 seconds (newhittime). This lifespan is renewed at each hit, i.e., the new entry is deleted if no hit occurs for more than 300 seconds.
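
The lifetime rules just described can be modelled with a few lines of code. The following is only a sketch written in Python (not NetPDL), under stated assumptions: the names SessionTable, Record, WILDCARD, add_replaceonhit and lookup are hypothetical, time is passed in explicitly as a number of seconds, and only the replaceonhit policy is modelled. It is not how a NetPDL engine is actually implemented.

```python
from dataclasses import dataclass

WILDCARD = None  # hypothetical marker for a masked key member


@dataclass
class Record:
    keys: tuple      # key members; WILDCARD marks a masked member
    data: str        # data associated with the record
    expires: float   # absolute expiration time, in seconds
    hit_ttl: float   # lifespan granted to the exact record after a hit
    masked: bool     # True until the first hit replaces the record


class SessionTable:
    """Toy lookup table that models only the 'replaceonhit' policy."""

    def __init__(self):
        self.records = []

    def add_replaceonhit(self, keys, data, now, keeptime, newhittime):
        # The masked record lives for 'keeptime' seconds unless it is hit.
        self.records.append(
            Record(tuple(keys), data, now + keeptime, newhittime, True))

    def lookup(self, keys, now):
        # Drop the records whose lifespan has elapsed.
        self.records = [r for r in self.records if r.expires > now]
        for i, rec in enumerate(self.records):
            if all(k is WILDCARD or k == v for k, v in zip(rec.keys, keys)):
                # A hit on the masked record replaces it with the exact one;
                # a hit on the exact record renews its lifespan.
                self.records[i] = Record(
                    tuple(keys), rec.data, now + rec.hit_ttl, rec.hit_ttl, False)
                return rec.data
        return None
```

With keeptime=60 and newhittime=300, a masked record added at time 0 disappears at time 60 if nobody hits it; if a hit arrives at time 10, the exact record that replaces it expires at time 310 unless another hit renews it.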

Predefined NetPDL Variables

The NetPDL decoding engine must be able to define (and fill in) the following variables:

Variable | Data type | Description
$packet | buffer | It contains the complete hex dump of the packet under examination. This variable allows arbitrary access to the packet content and it is often used together with the $currentoffset and $currentprotooffset variables in order to access specific locations.
$framelength | number (bytes) | It keeps the total length of the frame as it is provided by the link-layer. For instance, this value cannot be more than 1518 bytes on a standard Ethernet (without VLANs).
$packetlength | number (bytes) | It keeps the total length of the packet as it is provided by the link-layer. For instance, this value cannot be more than 1518 bytes on a standard Ethernet (without VLANs). Usually, this variable is equal to $framelength; it differs in case the packet has been captured on an Ethernet (with DIX format) and it is smaller than the minimum Ethernet size. For instance, an IP packet of 20 bytes will have $packetlength equal to 38 (20 bytes IP plus 18 bytes Ethernet), while $framelength will be 64 (the bytes traveling on the network, since the minimum Ethernet frame is 64 bytes).
$currentoffset | number (bytes) | It keeps the total number of bytes that have already been processed. The NetPDL engine must update this variable as soon as a new field is correctly recognized.
$currentprotooffset | number (bytes) | It keeps the total number of bytes that have already been processed within the current protocol. The decoder must update this variable as soon as a new field is correctly recognized. The variable is initialized to zero each time the dissector begins handling a new protocol.
$linklayer | number | It keeps the type of the link-layer the packet belongs to. This is needed in order to decide which protocol has to be examined first.
$prevproto | protocol | It keeps the protocol that encapsulated the current one. For instance, in case of ethernet-IP-UDP, the value of this variable will be #ethernet when processing IP, #IP when processing UDP, and so on.
$nextproto | protocol | It keeps the protocol that follows the current one. For instance, in case of ethernet-IP-UDP, the value of this variable will be #IP after processing ethernet, #UDP after processing IP, and so on. Please note that this variable is updated only when the protocol processing has ended, since there is no way to know the next protocol until the current one has been completely understood. Conversely, the $prevproto variable is updated before the processing of the selected protocol begins.
$timestamp_sec | number | The timestamp is expressed by two different variables that look very similar to the struct timeval structure defined in BSD sockets. This variable keeps the number of seconds elapsed since midnight (00:00:00), January 1, 1970, coordinated universal time, according to the system clock of the host that captured the packet.
$timestamp_usec | number | The timestamp is expressed by two different variables that look very similar to the struct timeval structure defined in BSD sockets. This variable keeps the microsecond value according to the system clock of the host that captured the packet.
$begintokenlen | number | It is automatically updated only in case of tokenwrapped fields. This variable contains the length of the string that is used to locate the beginning of the field itself. Although this value is obvious in case of fixed tokens, it can be computed only at run-time in case a regular expression is used to locate the field. For instance, for a field whose value is email@domain.com (with "<" as the beginning string and ">" as the ending string), this variable is equal to 1.
$fieldtokenlen | number | It is automatically updated only in case of tokenended and tokenwrapped fields. This variable contains the length of the field, not including the beginning (if present) and ending strings that are used to locate the field itself. For instance, for a field whose value is email@domain.com (with "<" as the beginning string and ">" as the ending string), this variable is equal to 16.
$endtokenlen | number | It is automatically updated only in case of tokenended and tokenwrapped fields. This variable contains the length of the string that is used to locate the ending point of the field itself. Although this value is obvious in case of fixed tokens, it can be computed only at run-time in case a regular expression is used to terminate the field. For instance, for a field whose value is email@domain.com (with "<" as the beginning string and ">" as the ending string), this variable is equal to 1.

These variables are not always meaningful; for example, $timestamp_sec makes no sense if the NetPDL description is used to generate a run-time expression needed to filter packets. For this reason, some of these variables might not be supported in all NetPDL engines.
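
As an illustration of the three token-length variables, the sketch below (Python, not NetPDL) computes them for a field wrapped between a beginning and an ending regular expression. The function name token_lengths is hypothetical, and Python's re module stands in for whatever regular expression dialect the NetPDL engine uses.

```python
import re


def token_lengths(data: bytes, begin: bytes, end: bytes):
    """Return (begintokenlen, fieldtokenlen, endtokenlen) for the first
    field wrapped between the 'begin' and 'end' regular expressions,
    or None if either token cannot be located."""
    b = re.search(begin, data)
    if b is None:
        return None
    rest = data[b.end():]          # bytes that follow the beginning token
    e = re.search(end, rest)
    if e is None:
        return None
    # Field length excludes both tokens; token lengths come from the matches.
    return (b.end() - b.start(), e.start(), e.end() - e.start())


lengths = token_lengths(b"<email@domain.com>", rb"<", rb">")
```

For the wrapped field used in the table above, lengths is (1, 16, 1).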

Functions

The NetPDL language supports some additional functions for complex computations or special purposes. The functions currently defined are described below.

In general, other functions can be added. However, this requires making all the tools that use the NetPDL language aware of the new function.

number buf2int(buffer Buffer)

It converts a string into a number. It is extremely helpful for converting the value of a protocol field (which is stored as a network-byte ordered string) into a number. Please remember that numbers are unsigned integers.

Param Buffer

Buffer that contains the data that has to be transformed into a number. Please note that the string will be handled as being in network-byte order. In particular, this function must be used when you need to get the value of a field (taken as a number), but it can also be used for non-field entities (e.g. variables such as the packet buffer). The buffer to be converted cannot be bigger than 4 bytes: for bigger strings, this function will fail.

Returns

A number representing the value contained in the Buffer parameter converted into an integer.
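
The behaviour described above can be modelled as follows. This is a Python sketch of the semantics only, not the engine's implementation; raising ValueError is an assumption about how the "function will fail" case might be reported.

```python
def buf2int(buffer: bytes) -> int:
    """Interpret a network-byte ordered (big-endian) buffer as an unsigned number."""
    if len(buffer) > 4:
        # The text only says the function fails; the exception type is an assumption.
        raise ValueError("buf2int: buffer cannot be bigger than 4 bytes")
    return int.from_bytes(buffer, byteorder="big", signed=False)
```

For instance, buf2int(b"\x08\x00") yields 0x0800, the value used to recognize IP inside an Ethernet frame in the examples of this document.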

buffer int2buf(number Number, number BufferSize)

It converts a number into a network-byte ordered buffer. It is extremely helpful for converting some numeric values (e.g. a TCP port number) into a buffer that has the same format as the packet dump. Please remember that numbers are unsigned integers. This function is the inverse of buf2int().

Param Number

Numeric value that has to be converted.

Param BufferSize

Size of the buffer that will result from the conversion. For instance, if the number to be converted is "128", the returned buffer will be "80" if BufferSize is 1, "0080" if BufferSize is 2, and "00000080" if BufferSize is 4. The returned buffer cannot be bigger than 4 bytes; otherwise the function will fail.

Returns

A buffer containing the numeric value formatted as it is in the packet dump.
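
A possible model of this conversion is sketched below in Python (not NetPDL). The exception raised for an invalid BufferSize is an assumption; the text does not say what happens when the number does not fit in the requested size, so this sketch simply lets Python raise OverflowError in that case.

```python
def int2buf(number: int, buffer_size: int) -> bytes:
    """Encode an unsigned number as a network-byte ordered buffer of the given size."""
    if not 1 <= buffer_size <= 4:
        # Failure mode assumed; the text only states that the function fails.
        raise ValueError("int2buf: buffer size must be between 1 and 4 bytes")
    return number.to_bytes(buffer_size, byteorder="big", signed=False)


# The three cases mentioned in the description of BufferSize, as hex dumps.
dumps = [int2buf(128, size).hex() for size in (1, 2, 4)]
```

Here dumps is ['80', '0080', '00000080'], matching the values given for BufferSize equal to 1, 2 and 4.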

buffer changebyteorder(buffer Buffer)

It converts a buffer from its byte order to the opposite one (e.g. from little-endian to big-endian, i.e. network-byte order).

Param Buffer

Buffer that contains the data whose byte ordering has to be changed. Please note that the resulting string will have the same size (in bytes) as this parameter. The buffer to be converted cannot be bigger than 4 bytes: for bigger strings, this function will fail.

Returns

This function returns a string containing the Buffer string swapped from the original byte ordering to the opposite one.
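
In terms of bytes, swapping the byte order amounts to reversing the buffer. A minimal Python sketch of this behaviour follows; the ValueError for oversized buffers is an assumption about how the failure is reported.

```python
def changebyteorder(buffer: bytes) -> bytes:
    """Return the buffer with its bytes in the opposite order (same size)."""
    if len(buffer) > 4:
        # Failure mode assumed; the text only states that the function fails.
        raise ValueError("changebyteorder: buffer cannot be bigger than 4 bytes")
    return buffer[::-1]
```

Applying the function twice returns the original buffer, since the operation is its own inverse.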

number ascii2int(buffer Buffer)

It converts a number written as an ASCII string into its numeric counterpart. This function is often used to parse the content of text-based protocols and convert ASCII numbers into their binary counterparts.

Param Buffer

Buffer that contains the data that has to be transformed into a number.

Returns

A number containing the numeric value of the ASCII string.
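
The difference from buf2int() is that the bytes are read as decimal digits rather than as a binary value. The Python sketch below models this; rejecting anything that is not made only of ASCII digits is an assumption, since the text does not describe how malformed input is handled.

```python
def ascii2int(buffer: bytes) -> int:
    """Convert a buffer holding ASCII decimal digits into a number."""
    if not buffer.isdigit():
        # Handling of malformed input is an assumption of this sketch.
        raise ValueError("ascii2int: buffer must contain only ASCII digits")
    return int(buffer.decode("ascii"), 10)
```

For example, the three bytes '2', '2', '7' of an FTP reply code become the number 227, whereas interpreting the same bytes as a binary value would give a completely different result.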

number ispresent(protofield NetPDLField)

It checks whether the given field is present within the current protocol. This function is used mostly within NetPDL Visualization primitives, when the string to be shown depends on whether a given field is present.

Param NetPDLField

It contains the name of a protocol field that has to be checked.

Returns

'true' if this field is present within the packet, 'false' otherwise.

number hasstring(buffer Buffer, buffer RegEx, number CaseSensitive)

It looks for a given pattern (regular expressions can be used for defining the matching) within the Buffer string and it returns a non-zero number if the pattern is present.

Param Buffer

Buffer that contains the data in which we want to dig. It can be, for example, a run-time variable, a portion of the packet buffer, and more.

Param RegEx

String that contains the pattern that we want to locate (e.g. 'abc.*' if we want to look for the pattern 'abc' followed by any other characters). For readers who are not familiar with regular expressions, the NetPDL Regular Expressions document will contain some documentation and some examples.

Param CaseSensitive

This parameter must be 1 in case the pattern matching has to be done in a case-sensitive way, 0 for case-insensitive.

Returns

'true' if the pattern can be located within the Buffer, 'false' otherwise.
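
The following Python sketch models the behaviour of hasstring(). It relies on Python's re module, whose syntax may differ from the regular expression dialect supported by a NetPDL engine; this substitution is an assumption made only for illustration.

```python
import re


def hasstring(buffer: bytes, regex: bytes, case_sensitive: int) -> int:
    """Return a non-zero number if 'regex' matches anywhere inside 'buffer'."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return 1 if re.search(regex, buffer, flags) else 0
```

The result is a number because, as stated earlier, boolean values are represented by a non-zero number ('true') or zero ('false').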

buffer extractstring(buffer Buffer, buffer RegEx, number CaseSensitive, number Match2Return)

It looks for a given pattern (regular expressions can be used for defining the matching) within the Buffer string and it returns one of the matches found (either the complete matching string or one of the partial matches).

Param Buffer

Buffer that contains the data in which we want to dig. It can be, for example, a run-time variable, a portion of the packet buffer, and more.

Param RegEx

String that contains the pattern that we want to locate (e.g. 'abc.*' if we want to look for the pattern 'abc' followed by any other characters). For readers who are not familiar with regular expressions, the NetPDL Regular Expressions document will contain some documentation and some examples.

Param CaseSensitive

This parameter must be 1 in case the pattern matching has to be done in a case-sensitive way, 0 for case-insensitive.

Param Match2Return

This parameter is useful in case of partial matches (i.e. when the regular expression includes grouping through round parentheses), in which more than one string is returned. The function returns the complete match when Match2Return is 0, the first partial match if it is 1, the second if it is 2, and so on.

Returns

The wanted pattern if it was found within the Buffer. The returned buffer can be either the full match or a partial match, depending on the value of the Match2Return parameter.
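
The role of the Match2Return parameter can be seen in the Python sketch below, where it maps directly onto the group index of the match. As before, Python's re module is only a stand-in for the NetPDL regular expression dialect, and returning an empty buffer when nothing matches is an assumption (the text does not define that case).

```python
import re


def extractstring(buffer: bytes, regex: bytes,
                  case_sensitive: int, match2return: int) -> bytes:
    """Return the full match (0) or the N-th partial match (N >= 1)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    match = re.search(regex, buffer, flags)
    if match is None:
        return b""  # behaviour on a failed match is assumed
    return match.group(match2return) or b""


sample = b"user=alice;id=42"
pattern = rb"user=(\w+);id=(\d+)"
full = extractstring(sample, pattern, 1, 0)    # complete matching string
first = extractstring(sample, pattern, 1, 1)   # first partial match
second = extractstring(sample, pattern, 1, 2)  # second partial match
```

In this hypothetical sample, full is the whole string, first is b"alice" and second is b"42".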

number checklookuptable(LookuptableName, FirstKey, SecondKey, ...)

It analyzes a given lookup table and checks if the record identified by the set of keys (the number of parameters is variable) is present.

This function has a side effect: it selects the given record (if present) and it sets that record as the 'default' one. This is required in order to get access to the data associated with a record inside the lookup table, because the syntax for getting that data is lookuptablename.membername. The data returned will be the one associated with the default record, which is the one selected by this function.

Param LookuptableName

Name of the lookup table we are interested in.

Param FirstKey

Value of the first key. This value is not typed (i.e. number, string, etc.) because it depends on the type of key associated with the table itself. From this perspective, the checklookuptable() function is unusual, since neither the number of its parameters (which depends on the number of keys defined for that lookup table) nor their types are known in advance.

Param SecondKey

Value of the second key, in the same way as the FirstKey parameter.

Param OtherKeys

Value of the other keys, in the same way as the FirstKey parameter.

Returns

A non-zero value if the record is present in the lookup table, or zero if the record cannot be found.

number updatelookuptable(LookuptableName, FirstKey, SecondKey, ...)

It analyzes a given lookup table and checks if the record identified by the set of keys (the number of parameters is variable) is present. In addition, it updates the timestamp of the current entry; this is used to extend the lifetime of the current record. For the rest, it acts exactly as the checklookuptable() function.

This function has a side effect: it selects the given record (if present) and it sets that record as the 'default' one. This is required in order to get access to the data associated with a record inside the lookup table, because the syntax for getting that data is lookuptablename.membername. The data returned will be the one associated with the default record, which is the one selected by this function.

Param LookuptableName

Name of the lookup table we are interested in.

Param FirstKey

Value of the first key. This value is not typed (i.e. number, string, etc.) because it depends on the type of key associated with the table itself. From this perspective, the updatelookuptable() function (like checklookuptable()) is unusual, since neither the number of its parameters (which depends on the number of keys defined for that lookup table) nor their types are known in advance.

Param SecondKey

Value of the second key, in the same way as the FirstKey parameter.

Param OtherKeys

Value of the other keys, in the same way as the FirstKey parameter.

Returns

A non-zero value if the record is present in the lookup table, or zero if the record cannot be found.
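
The relationship between checklookuptable(), updatelookuptable() and the 'default' record can be summarized with the toy model below, written in Python. The class LookupTable and its add and member methods are hypothetical helpers; leaving the default record untouched when a lookup fails is an assumption, since the text does not cover that case.

```python
class LookupTable:
    """Toy model of the 'default record' side effect; not a real engine."""

    def __init__(self):
        self.records = {}    # tuple of keys -> record dictionary
        self.default = None  # record selected by the last successful lookup

    def add(self, keys, data, now):
        self.records[tuple(keys)] = {"data": dict(data), "timestamp": now}

    def checklookuptable(self, *keys):
        record = self.records.get(keys)
        if record is None:
            return 0           # default record left untouched (assumption)
        self.default = record  # side effect: select the record
        return 1

    def updatelookuptable(self, *keys, now):
        found = self.checklookuptable(*keys)
        if found:
            self.default["timestamp"] = now  # extend the record lifetime
        return found

    def member(self, name):
        # Models the lookuptablename.membername syntax on the default record.
        return self.default["data"][name]
```

The only difference between the two lookups in this model is the timestamp refresh, which is what renews the lifespan of records whose validity depends on hits.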

(value) debug(Expression)

This is a special function that is used for debugging expressions. This function is transparent, i.e. it accepts an expression as parameter, and it returns the result of that expression unchanged. However, it prints the result of the expression on screen before continuing. The way the result is printed depends on the NetPDL engine (it may be a printf on console, a message pop-up and more). The debug() function does not affect how the expression is processed.

Param Expression

Expression (string, numeric, etc) whose result has to be printed.

Returns

The result of the expression passed as parameter, unchanged.
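
The transparency of debug() means that it can be wrapped around any sub-expression without changing the outcome. A minimal Python sketch of this idea follows; printing on the console is just one of the possible behaviours mentioned above.

```python
def debug(value):
    """Print the value of an expression and return it unchanged."""
    print(value)
    return value


# Wrapping a sub-expression does not alter the result of the whole expression.
plain = (3 + 4) * 2
traced = debug(3 + 4) * 2
```

Both plain and traced hold the same value; the only difference is the line printed while traced is evaluated.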

Function examples

Checking the appropriate HTTP field

HTTP fields are made up of a first part (the header name), a colon, and the value. Fields are terminated by a carriage return. In order to separate the header name from the header value, the following expression can be used:

extractstring($packet[$currentoffset : 0], '[^:]*', 1, 0)

This extracts the string that precedes the first ":" sign. Please note that the Match2Return parameter is equal to zero: this is required in order to return the full match (although in this case there is no partial match).

Getting the value of an HTTP field

Following previous example, we now want to get the value of the HTTP field, i.e. the string that follows the ”:” colon. In this case, the following expression can be used:

extractstring($packet[$currentoffset : 0], ':([[:print:]]*)', 1, 1)

In this case, the regular expression matches everything after the colon. However, we want to return only the part that is between the round brackets, i.e. we want to return the first partial match (not the whole string, which starts with ":"). This is why the last parameter is equal to 1.
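
The two HTTP expressions can be tried out with the Python sketch below. Two things are assumptions: the header line is a made-up sample, and the POSIX class [[:print:]] is replaced by the explicit range [\x20-\x7e], because Python's re module does not support POSIX character classes.

```python
import re

# Hypothetical header line, standing in for $packet[$currentoffset : 0].
line = b"Host: example.org\r\n"

# Header name: full match (Match2Return = 0) of '[^:]*'.
name = re.search(rb"[^:]*", line).group(0)

# Header value: first partial match (Match2Return = 1) of the printable
# characters that follow the colon.
value = re.search(rb":([\x20-\x7e]*)", line).group(1)
```

Here name is b"Host" and value is b" example.org". Note that the value keeps the space that follows the colon, since a space is a printable character, and that the match stops at the carriage return, which is not printable.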

Expression examples

Some examples of this formalism are given below.

Determining the network protocol contained in an Ethernet packet

This example is extracted from the protocol decoding instructions of an Ethernet frame: if the value of the field named protocoltype is equal to 0x0800 (hex), the expression returns true.

buf2int(protocoltype) == 0x0800

Checking the value of an arbitrary number of bytes within the network packet

In these two examples, the next bytes (relative to the current offset in packet processing) are checked against a constant. In particular, the first expression checks (as a number) the value of the two bytes that follow the current offset; the second checks the value of the same two bytes as a string:

buf2int($packet[$currentoffset:2]) == 0x86DD
$packet[$currentoffset:2] == '\x86\xDD'

Swapping the bytes of a short (i.e. 2-bytes) field

This example shows how to derive the value of a field whose bytes are stored in the opposite byte order (i.e. the most significant byte is at the end). It also shows how to access each single byte within a given field.

buf2int(tlen[1:1])*256 + buf2int(tlen[0:1])
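
The arithmetic of this expression can be checked with the Python sketch below. The sample value of tlen is hypothetical. Note that NetPDL slices are written as [offset:size], while Python slices are [start:stop], so tlen[1:1] in NetPDL corresponds to tlen[1:2] in Python.

```python
def buf2int(buffer: bytes) -> int:
    """Model of the NetPDL buf2int(): big-endian bytes to unsigned number."""
    return int.from_bytes(buffer, byteorder="big", signed=False)


def swapped_short(tlen: bytes) -> int:
    """Value of a 2-byte field whose most significant byte is the last one."""
    # NetPDL: buf2int(tlen[1:1])*256 + buf2int(tlen[0:1])
    return buf2int(tlen[1:2]) * 256 + buf2int(tlen[0:1])


tlen = b"\x34\x12"             # hypothetical field content, low byte first
value = swapped_short(tlen)
```

With this sample, value is 0x1234: the second byte (0x12) is weighted by 256 and the first byte (0x34) is added to it, which is the same as reading the field in little-endian order.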

Reading the OUI portion of the MAC address as a string

This example extracts the first 3 bytes (the starting offset is equal to 0, while the size is equal to 3) of the MACsrc field. This can be used to isolate the OUI (Organizationally Unique Identifier) portion of the MAC address, which identifies the manufacturer of the network interface card that has been used to send the packet. Since the buf2int() function is not used, this expression returns a string.

MACsrc[0:3]

Checking if a given MAC address is a multicast/broadcast address

An Ethernet MAC address is broadcast/multicast if the least significant bit of the first byte of the MAC address is equal to '1'. The problem is that a MAC address is made up of 48 bits, hence it cannot be managed as a number (numbers are at most 32 bits wide).

To get the wanted result, we have to extract the first byte of the MAC address, then match the result against the xxxxxxx1 pattern:

(buf2int(MACdst[0:1]) bitwand 0b00000001) == 0b00000001

Strictly speaking, the comparison in the second part of the expression is not needed. Because the result of the "bitwise and" operator is either zero or non-zero, and a non-zero number is 'true', the expression within round brackets already returns the correct result.
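
The same test can be written in Python as follows. The function name is_group_address and the sample addresses are hypothetical; the logic mirrors the NetPDL expression, with the & operator playing the role of bitwand.

```python
def is_group_address(mac: bytes) -> bool:
    """True if the 6-byte MAC address is a multicast or broadcast address."""
    first_byte = mac[0]  # equivalent to buf2int(MACdst[0:1])
    return (first_byte & 0b00000001) == 0b00000001


broadcast = bytes.fromhex("ffffffffffff")
multicast = bytes.fromhex("01005e000001")
unicast = bytes.fromhex("001122334455")
```

Only the first byte is ever converted into a number, so the 48-bit address never has to fit into a 32-bit value.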

Differentiating Ethernet DIX from Ethernet 802.3 frames through a "lookahead" operand

Sometimes there is the need to access some bytes within the packet buffer that have not yet been processed; therefore, no fields can be used to evaluate the expression (because this value is still to be processed, hence it has not been assigned to any field).

An example of this need can be seen when decoding the Ethernet frame:

<protocol name="Ethernet">
  <format>
    <fields>
      <field type="fixed" name="dst" size="6"/>
      <field type="fixed" name="src" size="6"/>

      <if expr="buf2int($packet[$currentoffset:2]) le 1500">

        <if-true>
          <field type="fixed" name="Length" size="2"/>
        </if-true>

        <if-false>
          <field type="fixed" name="EtherType" size="2"/>
        </if-false>

      </if>
    </fields>
  </format>
</protocol>

The NetPDL engine will process the first two fields of the packet, then it will evaluate the expression in order to proceed. However, the expression makes use of a special variable that gets access to the next two bytes in the protocol frame (i.e. starting at byte 13, since the first two fields are both 6 bytes wide). This can be obtained through the $packet variable, which contains the complete packet buffer, at the offset indicated by the $currentoffset variable, which is automatically updated by the NetPDL engine as packet processing proceeds.
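
The lookahead decision taken by the <if> element can be modelled with the Python sketch below. The function name third_field_name and the sample frames are hypothetical; the offset of 12 is the value that $currentoffset would hold after the two 6-byte address fields, assuming the Ethernet header starts at offset 0 of the packet buffer.

```python
def third_field_name(packet: bytes) -> str:
    """Decide how the two bytes after the MAC addresses must be decoded."""
    current_offset = 12  # 'dst' and 'src' (6 bytes each) already processed
    # Equivalent to buf2int($packet[$currentoffset:2]) in the expression.
    lookahead = int.from_bytes(
        packet[current_offset:current_offset + 2], byteorder="big")
    return "Length" if lookahead <= 1500 else "EtherType"


addresses = bytes(12)                    # placeholder dst and src addresses
dix_frame = addresses + b"\x08\x00"      # 0x0800: a protocol type
ieee_frame = addresses + b"\x00\x2e"     # 46: a payload length
```

The bytes are only inspected here, not consumed: the field chosen by the condition is the one that actually advances the current offset.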

 
netpdl/expressions.txt · Last modified: 2009/04/20 11:35 by admin