May 222015
 

A common pattern in OpenText Content Server development is the execution of a database query. This is a low-level operation, which is useful when an API call isn’t available to get the information you require.

Database queries are a quick way to get to the raw data, but are tricky when the construct of the query is not known until runtime (say, based on the values submitted in a request). A common solution is to dynamically concatenate the query together based on conditions to generate the final query. For example:

if RecArray.IsColumn(request, "filterValue")
    whereClause += " AND myColumn=:A0"
    args = {@args, request.filterValue}
end

...

if IsDefined(whereClause) AND whereClause != "" )
    selectStmt += " where " + whereClause
end

if IsDefined(orderbyClause) AND length(orderbyClause)
    selectStmt += " order by " + orderbyClause
end

Record recs = CAPI.Exec(selectStmt, args)

It’s a tedious process to build a query this way, and care must be taken to:

  • ensure the query is always valid (including syntax differences between MSSQL and Oracle);
  • prevent SQL injection; and
  • restrict the number of returned items (e.g., a million row result set will cause all sorts of problems).

Unfortunately, the approach provides almost no reusability and needs to be implemented again each time something similar is required.

It was while working with the Django Web framework that I was exposed to a novel way to construct and run a database query without having to write any SQL. It made me wonder if something similar could be done with OScript, and with this idea I developed the RHTableQuery class. It’s now a standard part of RHCore.

Introducing RHTableQuery

RHTableQuery is an abstraction to filter, sort, and page the contents of any table or view in Content Server without having to write any SQL. Let’s jump in with an example to query the contents of the WebNodes view. We start by constructing an instance of RHTableQuery and passing “WebNodes” into the constructor:

Frame nodes = $RHCore.RHTableQuery.New(prgCtx, "WebNodes")

At this point no database query has been executed and the nodes object is just a representation of all records in the WebNodes view.

To fetch the records we call the items() method, which constructs the query, executes it, and returns the results.

RecArray recs = nodes.items()

The underlying query is generated by the sql() method, which can be called to inspect what’s being executed:

echo( nodes.sql() )
> select WebNodes.* from WebNodes

Let’s look at filtering.

Filtering

Filters are applied with the filter() method and is used to reduce the result set based on a condition. The syntax is as follows (using nodes from our previous example):

nodes.filter(<columnName>, <operator>, <value>)

The parameters are:

  • columnName – the column name on which to apply the filter;
  • operator – the operator to apply (e.g., ==, !=, >, startsWith, contains, in, etc.); and
  • value – the value to query.

The filter() method changes the state of the object to include the condition in the query. For example, the following could be used to limit the nodes query to documents:

nodes.filter('subtype', '==', $TypeDocument)

A subsequent call to items() would now only include documents.

Alternatively, we could use the in operator to limit the results to documents and folders:

nodes.filter('subtype', 'in', {$TypeFolder, $TypeDocument})

The filter() method can be called multiple times to add additional conditions. For example, a second condition could be added to limit the folders and documents to names beginning with “HR”:

nodes \
    .filter('subtype', 'in', {$TypeFolder, $TypeDocument}) \
    .filter('name', 'startswith', 'HR')

Or, a third condition could be added to limit the results to items modified within the last 14 days:

nodes \
    .filter('subtype', 'in', {$TypeFolder, $TypeDocument}) \
    .filter('name', 'startswith', 'HR') \
    .filter('modifydate', '=>', $RHCore.DateUtils.AddDays(Date.Now(), -14))

All filter operations are applied at the database level (i.e., in the “where” clause) for optimal performance.

Sorting

A sort criteria can be applied with the sort() method and is similar to applying a filter. For example, the following sorts the nodes query by the name field:

nodes.sort('name')

The field name can also be prefixed with a negative sign to sort in reverse order:

nodes.sort('-name')

The method also permits sorting over multiple fields by passing in a list:

nodes.sort({'subtype','-name'})

As with filtering, sort is applied at the database level for optimal performance.

Paging

Since RHTableQuery is a subclass of Paginator (see Part V for information on the Paginator class) the results can be paged with the setPageSize() and setPageNumber() methods. For example, to set the page size to 25 and to get the contents of the 5th page is:

nodes.setPageSize(25).setPageNumber(5)

The items() method would now return the 25 items on the 5th page after all filter and sort conditions have been applied.

Paging works by iterating a database cursor, which has shown to work well over large result sets.

Putting it together

You might have noticed the filter, sort, and paging methods each return the query instance. This allows us to chain the methods together and consolidate them into a single expression. For example, we could combine our previous examples as follows:

RecArray recs = $RHCore.RHTableQuery.New(prgCtx, "WebNodes") \
        .filter('subtype', 'in', {$TypeFolder, $TypeDocument}) \
        .filter('name', 'startswith', 'HR') \
        .filter('modifydate', '=>', $RHCore.DateUtils.AddDays(Date.Now(), -14)) \
        .sort('-name') \
        .setPageSize(25) \
        .setPageNumber(5) \
        .items()

Not bad for a few lines of code, and is also highly readable.

The same approach can be applied to any other table or view in the Content Server system. For example, we could use RHTableQuery to find all users with last names beginning with “S”:

RecArray users = $RHCore.RHTableQuery.New(prgCtx, "KUAF") \
        .filter('deleted', '==', 0) \
        .filter('type', '==', UAPI.USER) \
        .filter('lastname', 'startswith', 'S') \
        .items()

The compactness, readability, and flexibility of the class makes it an ideal way to run queries that depend on dynamic conditions. It’s far easier to use than manually concatenating a query string.

Additional Methods

The RHTableQuery class has a number of other useful methods for fetching related information.

count()

The count() method returns the total number of records in the result set. The method takes the filter conditions into account, evaluates the result at the database using an aggregate query, and is cached to prevent multiple calls from repeatedly hitting the database. For example, say we want to know how many documents are owned by a particular user. This is simply:

Integer UserID = ...

Integer documentCount = $RHCore.RHTableQuery.New(prgCtx, "WebNodes") \
        .filter('subtype', '==', $TypeDocument) \
        .filter('userid', '==', UserID) \
        .count()

values_list()

The values_list() method returns a list with the values of a column. The method accepts a column name as an argument, and an optional boolean argument to remove duplicates from the list. For example, the following could be used to get all unique user last names whose first name begins with “A”:

List lastNames = $RHCore.RHTableQuery.New(prgCtx, "KUAF") \
        .filter('deleted', '==', 0) \
        .filter('type', '==', UAPI.USER) \
        .filter('firstname', 'startswith', 'A') \
        .values_list('lastname', true)

This method is useful when populating a <select> list for filtering.

min() & max()

The min() and max() methods return the minimum and maximum value of a column. The methods accept the column name as an argument, takes the filter conditions into account, and evaluates the result at the database using an aggregate query. For example, to get the last modified date of all documents is simply:

Date lastModifiedDate = $RHCore.RHTableQuery.New(prgCtx, "WebNodes") \
        .filter('subtype', '==', $TypeDocument) \
        .max('modifydate')

iterator()

The iterator() method returns the result set wrapped in an Iterator. I won’t get into the advantages of using an Iterator in this post, but you can read about them in Part II of this blog series.

extra()

The filter() method suffices for most query operations, but sometimes a more complex query condition is required. For this is the extra() method, which allows SQL to be inserted directly into the “where” clause of the underlying query.

For example, the following two statements are functionally equivalent to retrieve all nodes containing “RHCore” in the name.

Using the filter() method:

nodes.filter('name', 'contains', 'RHCore')

Or, using the extra() method (with MSSQL):

nodes.extra('LOWER(Name) LIKE LOWER('%'+:A0+'%')', {'RHCore'})

The extra() method is rarely used, but is useful when a complex query statement is required that cannot be expressed with the filter() method.

join()

The join() method is used to create an inner join to another table.

Special Cases: RHNodeQuery

The RHTableQuery class provides a simple and generic way to query a table in Content Server. It works well, but there are some special cases to consider. In particular, the “WebNodes” example misses two important and common requirements:

  • filtering by permissions; and
  • filtering by category attributes.

For this is the RHNodeQuery class, which is a direct subclass of RHTableQuery. It behaves the same as its parent (i.e., all the features mentioned earlier are still applicable), but with some minor differences. First, the constructor doesn’t accept a table or view argument since it’s hardcoded to use the WebNodes view:

Frame nodes = $RHCore.RHNodeQuery.New(prgCtx)

The constructor applies a permission filter by default (See & SeeContents) based on the user defined by the prgCtx context. This is usually fine, but can be disabled by passing false as a second argument to the constructor:

Frame nodesNoPermCheck = $RHCore.RHNodeQuery.New(prgCtx, false)

The RHNodeQuery class also provides a filterAttribute() method to filter on a category attribute. The syntax is as follows:

nodes.filterAttribute(<CatID>, <AttrID>, <operator>, <value>)

As with the filter() method, the filterAttribute() method extends the underlying query (including all necessary joins with the LLAttrData table) to permit filtering on the attribute value.

For example, say you want to find all documents with a boolean category attribute set to true (or 1 in the database):

Integer CatID = ...
Integer AttrID = ...

Frame nodes = $RHCore.RHNodeQuery(prgCtx) \
        .filter('subtype', '==', $TypeDocument) \
        .filterAttribute(CatID, AttrID, '==', 1) \
        .items()

Again, not bad for a few lines of code.

Wrapping Up

The RHTableQuery class has simplified much of my development. It provides a clean API for querying a table or view, and replaces the need to write complex code to generate a SQL statement. The class also works seamlessly with RHTemplate (see Part III), which allows the results to be rendered as HTML (including pagination) with minimal effort. It’s reusability at its best.

I welcome your questions or comments in the section below. If you like these blog posts you can also subscribe to updates in the “Subscribe to my posts” field at the top of the page.

Need help developing for Content Server or interested in using RHCore? Contact me at cmeyer@rhouse.ch.

  4 Responses to “Part XVII – Table Queries in OpenText Content Server”

  1. as ever.. spectacular…
    very similar to hibernate criteria query, the perfect companion for your object persistence mechanism (http://schwiiz.org/?p=2035), do you have any example of the two working together ?

    • Hi Patrick: Thanks! It’s interesting you noticed the connection to RHModel. A few months ago QuerySet and RHTableQuery were still separate, but I realised they were the same thing. After some refactoring I made QuerySet a subclass of RHTableQuery. So to answer your question, the following returns a QuerySet and has an interface identical to RHTableQuery:

      Frame queryset = $MyModel.objects(prgCtx)
      

      I’ll have to take a look at hibernate some time and see what I can learn from that. Thanks for the comment!

  2. Amazing as always you have even implemented a cursor under the covers about hard coding webnodes with multilingual binaries is it now webnodes_en_us or is it still webnodes and some others for other languages?

    • Hi Appu: A good observation. By default it uses the WebNodes view, but you can change this by calling setTableName() on the instance. For example:

      Frame nodes = $RHCore.RHNodeQuery.New(prgCtx) \
          .setTableName('WebNodes_de') \
          .filter( ... )
      

      Even more clever would be a function to automatically determine the appropriate view for the current user. It’s something to consider. Thanks for the comment!

 Leave a Reply

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)