11 May 2019

ABNF

by mo


I recently began to look at implementing the search part of the SCIM 2.0 Protocol (RFC-7644).

The RFC describes the query language using something called an ABNF. Here’s what the ABNF for the SCIM query language looks like.

   SCIM filters MUST conform to the following ABNF [RFC5234] rules as
   specified below:

     FILTER    = attrExp / logExp / valuePath / *1"not" "(" FILTER ")"

     valuePath = attrPath "[" valFilter "]"
                 ; FILTER uses sub-attributes of a parent attrPath

     valFilter = attrExp / logExp / *1"not" "(" valFilter ")"

     attrExp   = (attrPath SP "pr") /
                 (attrPath SP compareOp SP compValue)

     logExp    = FILTER SP ("and" / "or") SP FILTER

     compValue = false / null / true / number / string
                 ; rules from JSON (RFC 7159)

     compareOp = "eq" / "ne" / "co" /
                        "sw" / "ew" /
                        "gt" / "lt" /
                        "ge" / "le"

     attrPath  = [URI ":"] ATTRNAME *1subAttr
                 ; SCIM attribute name
                 ; URI is SCIM "schema" URI

     ATTRNAME  = ALPHA *(nameChar)

     nameChar  = "-" / "_" / DIGIT / ALPHA

     subAttr   = "." ATTRNAME
                 ; a sub-attribute of a complex attribute

               Figure 1: ABNF Specification of SCIM Filters

From RFC-7644

ABNF stands for Augmented Backus-Naur Form. It is a meta language used to describe the grammer of a language.

Ruby seems to be lacking support for the SCIM query language so I wanted to see if there was a way that I could help fill that void. To accomplish this goal, I needed to learn the ABNF meta-language.

ABNF is described in RFC-5234.

As an aside, drop this shell script in your $PATH to make reading RFCs much easier.

cat ~/.bin/rfc
#!/bin/sh

mkdir -p ~/.rfc
rsync -avz --delete ftp.rfc-editor.org::rfcs-text-only ~/.rfc
less ~/.rfc/rfc"$1".txt

ABNF is broken down into two categories:

  1. Rule Definitions
  2. Operators

Rule Definitions

Let’s take a look at a single rule.

     FILTER    = attrExp / logExp / valuePath / *1"not" "(" FILTER ")"

The rule above has a name of FILTER. This name is case insensitive. It then refers to 3 other rules, then itself.

.i.e.

Match the rule named `attrExp` or the rule named `logExp` or the rule
named `valuePath` or (an optional literal `not` followed by literal `(`
followed by rule named `FILTER` followed by literal `)`.

Expressed as ruby using Parslet rule definitions:

  # FILTER = attrExp / logExp / valuePath / *1"not" "(" FILTER ")"
  rule(:filter) do
    (attribute_expression | logical_expression | value_path) |
      (not_op? >> lparen >> filter >> rparen)
  end

Operators

In the last section I showed a few examples of operators. In this section, let’s go over each operator in more detail.

Concatenation

The following is an example of the Concatenation operator.

 "(" FILTER ")"

It says that a literal ( must be followed by a match for rule FILTER then followed by a literal ).

This can be expressed in ruby via Parslet as:

  (str('not').maybe >> str('(') >> filter >> str(')'))

Ruby’s shovel (») operator is used as the ABNF concatenation operator.

Alternatives

The following is an example of the Alternatives operator.

   attrExp / logExp / valuePath

The alternatives operator allows a rule to be matched by supplying alternative rules that can be matched. In the example above, this rule is matched when either attrExp or logExp or valuePath rules are matched.

Expressed via Parslet as:

    (attribute_expression | logical_expression | value_path)

Ruby logical or operator is overloaded to support the ABNF alternative operator.

Sequence Group

Elements enclosed in parentheses are treated as a single element, whose contents are strictly ordered. - RFC-5234

The following example uses a sequence group.

 logExp = FILTER SP ("and" / "or") SP FILTER

It says that a FILTER rule must be followed by a literal space character ` , then can be followed by a liternal and or a literal or` then followed by a space and another filter.

Expressed as ruby using Parslet:

  rule(:logical_expression) do
    filter >> space >> (and_op | or_op) >> space >> filter
  end

Variable Repetition

The operator “*” preceding an element indicates repetition - RFC-5234

For example:

  *1"not"
  str('not').repeat(0, 1)

The above example states at most 1 literal not.

 ALPHA *(nameChar)
  name_character.repeat(0, nil)

The above example says at 0 or more nameChar can be present.

 1*2(nameChar)
  name_character.repeat(1, 2)

The above example states that at least 1 nameChar but not more than 2 nameChars must be present.

Specific Repetition

element is equivalent to \*element

Optional Sequence

[foo bar] is equivalent to *1(foo bar).

  attrPath  = [URI ":"] ATTRNAME *1subAttr
  (uri >> colon).repeat(0, 1) >> attribute_name >> sub_attribute.repeat(0, 1)

I hope this helps as a brief overview of how to map ABNF to Parslet.

The accompanying ruby source code can be found here.

💎