Draft - Defining a set of Abstract Data Types for complex fields

Network applications often define rather peculiar data structures for data exchange. Although the primitives defined in NetPDL are suitable for describing a large set of fields, it may be useful to define some complex fields once and then reuse these definitions wherever they are needed.

This feature is particularly useful for grammar-based protocols, whose rules are often built incrementally, from simple grammar rules (e.g., the recognition of elementary tokens) up to more complex productions.

NetPDL addresses this need through the following elements:

Element | Description
<adt-list> | It defines a special section of the NetPDL file that contains new ADT definitions whose scope is shared among all protocol descriptions.
<adt> | It defines a new type of field that is used as an Abstract Data Type.

The ADT section: the <adt-list> element

The <adt-list> element can appear in two positions:

  • as a child of the root <netpdl> element: in this case the ADTs it defines are visible throughout the whole NetPDL database;
  • as a child of the <format> element: in this case the new ADTs are visible only within the current protocol. This solution is more appropriate for data types that are specific to a given protocol.
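The two placements listed above can be sketched as follows. This is only an outline under stated assumptions: the ADT names global_type and local_type are hypothetical, the attributes of the root <netpdl> element are omitted, and the nesting of the surrounding elements is abbreviated.

```xml
<netpdl>
  <!-- Global scope: these ADTs are visible to every protocol in the database -->
  <adt-list>
    <adt adtname="global_type" type="eatall">
      <!-- ... -->
    </adt>
  </adt-list>

  <protocol name="sip" longname="Session Initiation Protocol">
    <format>
      <!-- Protocol scope: these ADTs are visible only within this protocol -->
      <adt-list>
        <adt adtname="local_type" type="eatall">
          <!-- ... -->
        </adt>
      </adt-list>

      <fields>
        <!-- ... -->
      </fields>
    </format>
  </protocol>
</netpdl>
```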

This element does not have any attributes, and it can have the following child element:

Element | Description
<adt> | It defines an abstract data type to be used within other <adt> elements or within fields.

Defining a new ADT element: the <adt> element

New Abstract Data Type structures are defined through the <adt> element. An <adt> defines a derived data type that is based on a basic NetPDL data type. For instance, we can define an <adt> that describes e-mail addresses and use that field type in the protocol description.

While an ADT does not add any semantic value to the language, it provides a way to create more compact (and powerful) descriptions.

The <adt> element supports the following attributes in addition to the Standard ones:

Attribute | Description
adtname (required) | It defines the name of the current ADT. This name will be used to include the ADT structure within a field or a subfield.
type (required) | It defines the base type of the ADT. It can assume the same values defined for the type attribute of the <field> elements.
Other attributes | Other attributes might be present according to the value of the type attribute. See later for more details.

In addition, an ADT can define all the attributes that are supported in the base field type. For example, an ADT based on a fixed field can define the size attribute.
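As a minimal sketch of the fixed-field case mentioned above, an ADT could be declared as follows. The ADT name ipv4addr and the size value are hypothetical and chosen only for illustration.

```xml
<adt-list>
  <!-- Hypothetical ADT based on a fixed field of 4 bytes.
       The size attribute comes from the "fixed" base type. -->
  <adt adtname="ipv4addr" type="fixed" size="4"/>
</adt-list>
```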

An <adt> element supports all the child elements that are available for the <fields> element.

Example: ADT of e-mail address within header values

This example defines a new ADT that is able to recognize e-mail addresses. Such an ADT is useful within HTTP, SIP, and other similar application protocols.

  <adt-list>
    <adt adtname="email" type="eatall">
      <cfield type="delimited" endregex="(\r\n)?[ \t]*;(\r\n)?[ \t]*" name="contact" longname="Mail Contact">
        <cfield type="delimited" endregex="[ \t\r\n]*(?=\x3C)" onmissingend="skipfield" name="dspname" longname="Display Name"/>
        <cfield type="delimited" beginregex="\x3C" endregex="\x3E[ \t\r\n]*" name="address" longname="Address"/>
      </cfield>
      <block name="addr_params" longname="Address Parameters">
        <loop type="while" expr="1">
          <cfield type="delimited" endregex="(\r\n)?[ \t]*;(\r\n)?[ \t]*" name="addr_param" longname="Address Parameter">
            <cfield type="delimited" endregex="=" name="addr_param_name" longname="Address Parameter Name"/>
            <field type="eatall" name="addr_param_value" longname="Address Parameter Value"/>
          </cfield>
        </loop>
      </block>
    </adt>
  </adt-list>

Applying this declaration to the hypothetical SIP header value <sip:foo@bar.com>;tag=97180421, the following fields are decoded:

Field name | Value | Notes
(please read the note) | "<sip:foo@bar.com>;tag=97180421" | The name of this field depends on the <adtfield> element (presented in the next section) that recalls the current ADT.
contact | "<sip:foo@bar.com>" |
dspname | (no value) | This field is missing because its endregex failed, and the onmissingend attribute mandates skipping the field when the end marker cannot be found.
address | "sip:foo@bar.com" | This field contains the SIP address, i.e. the value between the '<' and '>' markers.
addr_param | "tag=97180421" | The block addr_params encloses this field and the others listed below.
addr_param_name | "tag" |
addr_param_value | "97180421" |

Using ADTs for defining field formats

<adt> elements can be used to define a field contained in the <fields> section of NetPDL protocols. For instance, a field type can be derived from an ADT instead of a simple NetPDL type (e.g., fixed field).

This can be done through the following elements:

Element | Description
<adtfield> | Within the <fields> section, it recalls an abstract data type previously declared by an <adt> element.
<adtsubfield> | It replaces a <subfield> element within the <fields> section in order to recall, in its place, an abstract data type previously declared by an <adt> element. It requires/allows all the attributes provided for the <subfield> element.

Fabio: I propose here a change consistent with our latest choices: I have not mentioned the <csubfield> element here because it should be removed, given that <cfield> elements no longer exist. To distinguish a <subfield> from a <csubfield> there is already the ctype attribute (which I would rename to subtype at the same time), whose presence acts as a recognition flag.

The <adtfield> element supports one attribute in addition to the Standard ones, which mirrors the type attribute of the <field> element:

Attribute | Description
adttype (required) | It defines the name of the ADT used to define the current field/subfield, i.e. the name specified by the adtname attribute of the <adt> element that defines the ADT being referred to. Allowed names are those of the ADTs visible in the current scope, i.e. global ADTs and the ones defined for the current protocol.

When a field is based on an ADT, i.e. it is defined by <adtfield>, it cannot include any child element, with the exception of the new <replace> element, which is presented later.
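A minimal sketch of a field based on an ADT might look like the following. It assumes that an ADT named email is visible in the current scope; the field name contact_hdr and its longname are hypothetical.

```xml
<fields>
  <!-- ... -->

  <!-- The field takes its internal structure from the "email" ADT;
       no child element is present (only <replace> would be allowed). -->
  <adtfield adttype="email" name="contact_hdr" longname="Contact Header"/>

  <!-- ... -->
</fields>
```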

Example: using an ADT in describing the SIP protocol

This example shows the usage of the previously defined ADT within a SIP fragment:

  <protocol name="sip" longname="Session Initiation Protocol">
    <!-- ... -->
    <fields>
      <!-- ... -->

      <set type="hdrline" sepregex="(\r\n)?[\t ]*:(\r\n)?[\t ]*">
        <exit-when expr="$packet[$currentoffset:2] == '\x0D\x0A'"/>

        <!-- ... -->

        <cfield match="hasstring(this.hname,'^t(o)?$',0)" name="to" longname="To Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="To Header Name"/>
          <adtsubfield portion="hdrline.hvalue" adttype="email" name="hvalue" longname="To Header Value"/>
        </cfield>

        <cfield match="hasstring(this.hname,'^f(rom)?$',0)" name="from" longname="From Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="From Header Name"/>
          <adtsubfield portion="hdrline.hvalue" adttype="email" name="hvalue" longname="From Header Value"/>
        </cfield>

        <!-- ... -->

        <default-item name="sip_option" longname="SIP Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="SIP Header Name"/>
          <subfield portion="hdrline.hvalue" name="hvalue" longname="SIP Header Value"/>
        </default-item>

      </set>

      <!-- ... -->
    </fields>
  </protocol>

The ADT makes the protocol description noticeably more readable.

However, such a replacement is almost static: the inner fields of both to and from come from the same ADT, hence they share the names fixed at ADT declaration time, as happens with the <block> element. As a matter of fact, for both the to.hvalue and from.hvalue fields the NetPDL engine will produce the same structure of inner fields shown in the example of the previous section.

Customizing ADTs in protocol formats

An in-depth analysis of the previous example shows that ADTs are somewhat similar to the <block> element. However, recalling an ADT offers an additional feature that improves the reusability of a defined ADT: every element that includes an ADT can specify how to rename a given ADT-related field through the <replace> element, a simple element (i.e. it has no child elements) that expresses the desired name replacement. It supports the following attributes in addition to the Standard ones:

Attribute | Description
adtref (required) | It holds the name of the ADT-related field to rename, in a namespace-like syntax: a valid value for this attribute is a pair of names, without spaces, separated by a ":" character, where the first name is the name of the ADT the field belongs to, and the second name is the name of the field itself. This solution enables nesting an ADT within another one.
name (optional) | It holds the new unique name that identifies the object within its scope, replacing the one declared within the ADT.
longname (optional) | It holds the new "human" name (it may be used when the object has to be shown), replacing the one declared within the ADT.

Example: recalling the “email” ADT within the SIP protocol and renaming its fields

This example shows how an ADT previously defined within the NetPDL database can be customized when it is recalled.

  <protocol name="sip" longname="Session Initiation Protocol">
    <!-- ... -->
    <fields>
      <!-- ... -->

      <set type="hdrline" sepregex="(\r\n)?[\t ]*:(\r\n)?[\t ]*">
        <exit-when expr="$packet[$currentoffset:2] == '\x0D\x0A'"/>

        <!-- ... -->

        <cfield match="hasstring(this.hname,'^t(o)?$',0)" name="to" longname="To Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="To Header Name"/>
          <adtsubfield portion="hdrline.hvalue" adttype="email" name="hvalue" longname="To Header Value">
            <replace adtref="email:address" name="to_address" longname="To Address"/>
            <replace adtref="email:addr_param" name="to_address_param" longname="To Address Parameters"/>
          </adtsubfield>
        </cfield>

        <cfield match="hasstring(this.hname,'^f(rom)?$',0)" name="from" longname="From Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="From Header Name"/>
          <adtsubfield portion="hdrline.hvalue" adttype="email" name="hvalue" longname="From Header Value">
            <replace adtref="email:address" name="from_address" longname="From Address"/>
            <replace adtref="email:addr_param" name="from_address_param" longname="From Address Parameters"/>
          </adtsubfield>
        </cfield>

        <!-- ... -->

        <default-item name="sip_option" longname="SIP Header Field" showtemplate="FieldAscii">
          <subfield portion="hdrline.hname" name="hname" longname="SIP Header Name" showtemplate="FieldAscii"/>
          <subfield portion="hdrline.hvalue" name="hvalue" longname="SIP Header Value" showtemplate="FieldAscii"/>
        </default-item>
      </set>

      <!-- ... -->
    </fields>
  </protocol>

In this case, the description is only slightly more verbose than the previous example. However, it is now possible to discriminate ADT-related fields based on the context in which the ADT is recalled, because the original ADT-related name is replaced by the new one defined by <replace>. For instance, after decoding the whole fragment starting from the to.hvalue subfield, the overall address data can be retrieved through to.hvalue.to_address (or, more simply, to_address) instead of to.hvalue.address (or, more simply, address), and likewise for the other renamed ADT-related fields.

Composing a more complex ADT: nesting several ADTs defined through the <adt> element

Abstract Data Type structures can also be defined by recalling, within an <adt> element, other ADTs from the same scope.

Example: modularizing the “email” ADT

This example defines the same ADT as the previous examples, but the ADT is now decomposed into subparts and hence defined with a top-down approach. This feature is powerful because it allows the user to define building-block ADTs and then compose them into more complex ADTs.

  <adt-list>

    <adt adtname="contact" type="delimited" endregex="(\r\n)?[ \t]*;(\r\n)?[ \t]*" name="m_contact" longname="Mail Contact">
      <cfield type="delimited" endregex="[ \t\r\n]*(?=\x3C)" onmissingend="skipfield" name="dspname" longname="Display Name"/>
      <cfield type="delimited" beginregex="\x3C" endregex="\x3E[ \t\r\n]*" name="address" longname="Address"/>
    </adt>

    <adt adtname="email" type="eatall">
      <adtfield adttype="contact" name="mail_contact"/>
      <block name="addr_params" longname="Address Parameters">
        <loop type="while" expr="1">
          <cfield type="delimited" endregex="(\r\n)?[ \t]*;(\r\n)?[ \t]*" name="addr_param" longname="Address Parameter">
            <cfield type="delimited" endregex="=" name="addr_param_name" longname="Address Parameter Name"/>
            <field type="eatall" name="addr_param_value" longname="Address Parameter Value"/>
          </cfield>
        </loop>
      </block>
    </adt>

  </adt-list>

Please note a small difference between the two ADTs declared: the first ADT, i.e. the callee, sets default values for the name-related attributes; these values can be modified by the second one, i.e. the caller (as actually happens in the example for the name attribute).

Name replacing in modularized ADT elements

When Abstract Data Type structures are nested, renaming gives priority to the most recently declared <replace> elements, then to the earlier ones, and so on down to the oldest <replace> element encountered.
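One possible reading of this rule is sketched below. It is an assumption made for illustration, not confirmed engine behavior: when the same ADT-related field is renamed both inside a nested ADT and at the point where the outer ADT is recalled, the <replace> declared last (the one in the outermost caller) is assumed to win. The names mail_address and to_mail are hypothetical, and the ADT bodies are simplified.

```xml
<adt-list>
  <adt adtname="contact" type="delimited" endregex=";" name="m_contact">
    <field type="eatall" name="address" longname="Address"/>
  </adt>

  <adt adtname="email" type="eatall">
    <!-- Older rename, declared where "contact" is recalled -->
    <adtfield adttype="contact" name="mail_contact">
      <replace adtref="contact:address" name="mail_address"/>
    </adtfield>
  </adt>
</adt-list>

<!-- Later, within the <fields> section of a protocol: -->
<adtfield adttype="email" name="hvalue">
  <!-- Newer rename: under the assumed rule, the field is exposed as to_mail -->
  <replace adtref="contact:address" name="to_mail"/>
</adtfield>
```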

Example: Modularize “mail_in_hfield” ADT

Let's merge everything we have seen in the previous examples and add some <replace> elements, in order to analyze the results produced by the renaming rule stated above.

  <protocol name="sip" longname="Session Initiation Protocol">
    <!-- ... -->
    <adt-list>

      <adt adtname="contact" type="delimited" endregex="(\r\n)?[ \t]*;(\r\n)?[ \t]*" name="m_contact" longname="Mail Contact">
        <cfield type="delimited" endregex="[ \t\r\n]*(?=\x3C)" onmissingend="skipfield" name="dspname" longname="Display Name"/>
        <cfield type="delimited" beginregex="\x3C" endregex="\x3E[ \t\r\n]*" name="address" longname="Address"/>
      </adt>

      <adt adtname="email" type="eatall">
        <adtfield adttype="contact" name="mail_contact"/>
        <block name="addr_params" longname="Address Parameters">
          <loop type="while" expr="1">
            <cfield type="delimited" endregex="(\r\n)?[ \t]*;(\r\n)?[ \t]*" name="addr_param" longname="Address Parameter">
              <cfield type="delimited" endregex="=" name="addr_param_name" longname="Address Parameter Name"/>
              <field type="eatall" name="addr_param_value" longname="Address Parameter Value"/>
            </cfield>
          </loop>
        </block>
      </adt>

    </adt-list>

    <!-- ... -->

    <fields>
      <!-- ... -->
      <set type="hdrline" sepregex="(\r\n)?[\t ]*:(\r\n)?[\t ]*">
        <exit-when expr="$packet[$currentoffset:2] == '\x0D\x0A'"/>
        <!-- ... -->

        <cfield match="hasstring(this.hname,'^t(o)?$',0)" name="to" longname="To Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="To Header Name"/>
          <adtsubfield portion="hdrline.hvalue" adttype="email" name="hvalue" longname="To Header Value">
            <replace adtref="contact:dspname" name="to_dspname"/>
            <replace adtref="contact:address" name="to_mail"/>
            <replace adtref="email:addr_param" name="to_address_param" longname="To Address Parameters"/>
          </adtsubfield>
        </cfield>

        <cfield match="hasstring(this.hname,'^f(rom)?$',0)" name="from" longname="From Header Field">
          <subfield portion="hdrline.hname" name="hname" longname="From Header Name"/>
          <adtsubfield portion="hdrline.hvalue" adttype="email" name="hvalue" longname="From Header Value">
            <replace adtref="contact:dspname" name="from_dspname"/>
            <replace adtref="contact:address" name="from_mail"/>
            <replace adtref="email:addr_param" name="from_address_param" longname="From Address Parameters"/>
          </adtsubfield>
        </cfield>

        <!-- ... -->
      </set>
      <!-- ... -->
    </fields>
  </protocol>

Replacing rules might cause some confusion at first, but they guarantee the maximum flexibility in dealing with ADTs and only require a little familiarity with them. To complete this example, please consider the following fragment of a SIP header:

To: <sip:callee@foo.com>;tag=ff150769
From: <sip:caller@bar.com>;tag=97180421

The NetPDL engine will decode them as reported here:

Field name | Value | Notes
to | "To: <sip:callee@foo.com>;tag=ff150769" |
hname | "To" |
hvalue | "<sip:callee@foo.com>;tag=ff150769" |
mail_contact | "<sip:callee@foo.com>" |
to_dspname | (no value) | This field is missing because its endregex failed, and the onmissingend attribute mandates skipping the field when the end marker cannot be found.
to_mail | "sip:callee@foo.com" | This field contains the SIP address, i.e. the value between the '<' and '>' markers.
to_address_param | "tag=ff150769" | The block addr_params encloses this field and the others listed below.
addr_param_name | "tag" |
addr_param_value | "ff150769" |
from | "From: <sip:caller@bar.com>;tag=97180421" |
hname | "From" |
hvalue | "<sip:caller@bar.com>;tag=97180421" |
mail_contact | "<sip:caller@bar.com>" |
from_dspname | (no value) | This field is missing because its endregex failed, and the onmissingend attribute mandates skipping the field when the end marker cannot be found.
from_mail | "sip:caller@bar.com" | This field contains the SIP address, i.e. the value between the '<' and '>' markers.
from_address_param | "tag=97180421" | The block addr_params encloses this field and the others listed below.
addr_param_name | "tag" |
addr_param_value | "97180421" |
 
netpdl/netpdlfulvio.txt · Last modified: 2010/03/31 14:56 by fulvio