Feb 14, 2014
 

Categories & Attributes are a powerful feature of OpenText Content Server. Categories allow custom attributes to be defined and assigned to a document, folder, or any other node in the system. Other features of categories & attributes include:

  • definitions can be versioned;
  • can be applied to each version of a document;
  • attributes can be multi-valued;
  • simple one-to-many relationships are possible using “sets”;
  • supports multiple datatypes (boolean, string, text, long text, user, date, etc.);
  • required attributes can be enforced;
  • can be extended (e.g., table key lookup attributes);
  • updates are audited; and
  • data is indexed and searchable.

This is all possible from the Content Server user interface without having to write any OScript. It’s quite amazing when you think about it.

Things get tricky when customers want functionality bound to attributes. For example, say a customer wants the following to happen:

When the Publish boolean attribute is true on a document and the Publish Date attribute has passed then move the document to a predetermined publishing folder. Otherwise, if the Publish boolean is false and Publish Date has passed then notify the document owner with an e-mail.
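Stripped of the Content Server plumbing, the rule above is a small decision function. The following is a minimal Python sketch of that logic only, not OScript and not production code. The function name `publish_action` and the action labels it returns are hypothetical, and it assumes "has passed" means the Publish Date is strictly earlier than the current time.

```python
from datetime import datetime
from typing import Optional


def publish_action(publish: bool, publish_date: Optional[datetime], now: datetime) -> str:
    """Return the action the example requirement calls for.

    Hypothetical labels: "move", "notify", or "none".
    """
    # Nothing happens until the Publish Date has passed (assumed: strictly before now).
    if publish_date is None or publish_date >= now:
        return "none"
    # The date has passed: the Publish flag decides between moving and notifying.
    return "move" if publish else "notify"
```

The hard part in Content Server is not this decision but reading the two attribute values reliably, which is what the rest of this post is about.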

I used to cringe when I received requirements like this. How the heck do you programmatically work with categories & attributes? After years of trying and many half-baked attempts, I finally figured it out.

Let’s first discuss what’s available.

OScript API for Categories & Attributes

Most developers who manipulate categories & attributes with OScript do it wrong. Let me explain.

Content Server provides the $LLIAPI.AttrData class for working with categories. It follows an object-like design similar to what I described in Part I of this blog series. A Frame instance of AttrData encapsulates the category and attribute information on a node (or version), and provides methods to work on the data stored in the fData feature. Saving the data commits the fData structure to the LLAttrBlobData and LLAttrData tables. The LLAttrBlobData table contains the raw data, while the LLAttrData table is sparse and is provided for easier (but not easy) querying of attribute values.

So what’s the problem? The difficulty is with the programming interface provided by $LLIAPI.AttrData. The interface seems incomplete and geared towards specific types of web requests. It provides no simple way to programmatically set, get, or query values. I assume this gap is the reason I’ve seen so many modules that read and write attribute values by accessing the fData structure directly. I’m talking about code like this:

attrData.fData.{CatID,VersionID}.Values[1].(SetID).Values[SetIndex].(AttrID).Values[AttrIndex] = newValue

Let’s discuss why this is bad. In this example there are no checks that CatID, VersionID, SetID, SetIndex, AttrID, or AttrIndex are valid and in range, nor whether newValue is of the correct data type. Any wrong assumption could cause a stack trace or corrupt the structure and make it completely inaccessible. I’ve seen this done countless times. It is clearly error-prone, and it is why I think most developers are doing it wrong. I’m guilty of it myself. What to do?
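To make the risk concrete, here is a rough Python model of the same kind of nesting using plain dicts and lists. It is not the real fData layout: the IDs, the key shapes, the `errMsg` key, and the helper `checked_set` are made up for illustration. The point is only that a guarded write reports a bad path instead of blowing up or writing into the wrong place.

```python
# Hypothetical stand-in for a nested category structure (not the real fData layout).
data = {
    (29502, 95): {                          # (category ID, category version)
        "Values": [{
            11: {"Values": [                # set ID 11 with one row
                {12: {"Values": ["old"]}},  # attribute ID 12 with one value
            ]},
        }],
    },
}


def checked_set(data, cat_key, set_id, set_index, attr_id, attr_index, value):
    """Write a value only if every step of the path exists; indexes are 1-based."""
    try:
        rows = data[cat_key]["Values"][0][set_id]["Values"]
        if not 1 <= set_index <= len(rows):
            return {"OK": False, "errMsg": "set index out of range"}
        values = rows[set_index - 1][attr_id]["Values"]
    except KeyError as missing:
        return {"OK": False, "errMsg": f"unknown ID: {missing}"}
    if not 1 <= attr_index <= len(values):
        return {"OK": False, "errMsg": "attribute index out of range"}
    values[attr_index - 1] = value
    return {"OK": True}
```

With this model, `checked_set(data, (29502, 95), 11, 2, 12, 1, "x")` returns a failure result because the set has only one row, whereas the equivalent direct indexing would raise an `IndexError` in the middle of the update.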

To reiterate something I quoted in Part I from Wikipedia on object-oriented programming:

Objects can be thought of as encapsulating their data within a set of functions designed to ensure that the data are used appropriately, and to assist in that use. The object’s methods typically include checks and safeguards specific to the data types the object contains. An object can also offer simple-to-use, standardized methods for performing particular operations on its data, while concealing the specifics of how those tasks are accomplished.

Why should this be any different with AttrData? In my opinion the category API should:

  • provide setter and getter methods that are easy to use regardless of whether the attribute is multi-valued or within a set;
  • do all the required checks and validation to guarantee the integrity of the data;
  • hide the implementation details around fData so you don’t need to think about it; and
  • manage the auditing (for whatever reason AttrData doesn’t automatically audit).

Some of these features are already available on AttrData, but in my opinion are incomplete. For example, the ValueSet() method can be used to set a value to an attribute, and has the following interface:

function Assoc ValueSet(List spec, Dynamic value)

The spec parameter is a List and defines which attribute you’d like to manipulate. That looks something like this: {{29502,95},{1,1,11,2,12,2}}. The numbers refer to the category ID, category version ID, set ID, set index, attribute ID, and the attribute index. The function doesn’t do any auditing, and will fail if any multi-valued attributes or sets haven’t been extended to accommodate the index. It’s not user friendly at all.

Introducing RHAttrData

To fix this problem I created a subclass of $LLIAPI.AttrData called $RHCore.RHAttrData and added methods to implement the requirements I listed above. None of the methods override core, which means $RHCore.RHAttrData can be used in place of $LLIAPI.AttrData and it’ll behave exactly the same. The advantage comes in the added methods. For example, the SetValue() method can be used to set a value to an attribute and has the following features:

  • validates the value is of the correct type (e.g., date, integer, etc.) and attempts to cast it if it’s not;
  • handles auditing;
  • automatically extends the row count of any multi-valued set or attribute (when required); and
  • adds the category to the node if it’s not already applied.

It also has a friendlier programming interface, which looks like this:

function Assoc SetValue(Integer CatID, Integer AttrID, Dynamic value, Integer AttrIndex=1, Integer SetIndex=1)

With this I can set a value to an attribute:

Assoc results = attrdata.SetValue(CatID, AttrID, "New Value")

Or, set the 2nd value of a multi-value attribute on the 3rd row of a multi-valued set:

Assoc results = attrdata.SetValue(CatID, AttrID, "New Value 2", 2, 3)

You’ll notice the interface doesn’t accept a set ID. The RHAttrData extensions determine whether an attribute is contained within a set and handle this for you. You just need to provide the attribute ID.
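The behaviour described above can be modelled in a few lines. This is a Python sketch of the idea, not the RHAttrData implementation, which isn't shown in this post. The class `AttrStore`, its schema dict, its casting rules, and the `errMsg` key are all assumptions, and the category ID is left out to keep it short. The sketch resolves the owning set from the attribute ID, casts the value to the declared type, and extends rows and values on demand.

```python
class AttrStore:
    """Toy model: schema maps attr_id -> (type, set_id or None). Indexes are 1-based."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = {}      # set_id (None for top level) -> list of row dicts

    def set_value(self, attr_id, value, attr_index=1, set_index=1):
        if attr_id not in self.schema:
            return {"OK": False, "errMsg": f"unknown attribute {attr_id}"}
        if attr_index < 1 or set_index < 1:
            return {"OK": False, "errMsg": "indexes start at 1"}
        attr_type, set_id = self.schema[attr_id]    # the caller never passes the set ID
        try:
            value = attr_type(value)                # attempt to cast to the declared type
        except (TypeError, ValueError):
            return {"OK": False, "errMsg": f"cannot cast {value!r}"}
        rows = self.rows.setdefault(set_id, [])
        while len(rows) < set_index:                # extend the set's row count if needed
            rows.append({})
        values = rows[set_index - 1].setdefault(attr_id, [])
        while len(values) < attr_index:             # extend the multi-value list if needed
            values.append(None)
        values[attr_index - 1] = value
        return {"OK": True}

    def get_value(self, attr_id, attr_index=1, set_index=1):
        try:
            _, set_id = self.schema[attr_id]
            if attr_index < 1 or set_index < 1:
                raise IndexError
            value = self.rows[set_id][set_index - 1][attr_id][attr_index - 1]
            return {"OK": True, "Value": value}
        except (KeyError, IndexError):
            return {"OK": False, "errMsg": "no such value"}
```

For example, with `store = AttrStore({12: (int, 11)})`, calling `store.set_value(12, "42", 2, 3)` creates three rows in set 11 and stores the integer 42 as the second value on the third row.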

Getting a value is just as easy:

Assoc results = attrdata.GetValue(CatID, AttrID)

if results.OK
    value = results.Value
end

Or, if you’re feeling lazy:

value = attrdata.GetValue(CatID, AttrID).Value

To save the changes call:

attrdata.commit()

The commit() call makes the appropriate call to llnode.NodeCategoriesUpdate() without having to do any extra DAPINODE or llnode lookups. It also audits the update without having to think about the fAttrChangePrefix or fValueChanges features on attrData, which would otherwise need to be manually manipulated to enforce auditing.
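The split between in-memory changes and a single commit can be sketched as follows. This Python model is an assumption about the general pattern, not the RHCore code: `PendingChanges`, the `writer` callback, and the shape of the audit tuples are hypothetical. Changes are staged in memory, and one commit performs one write and records one audit entry per attribute that actually changed.

```python
class PendingChanges:
    """Stage attribute changes in memory; write and audit them in one commit."""

    def __init__(self, current, writer):
        self.current = dict(current)   # last committed values: attr_id -> value
        self.staged = {}               # uncommitted values: attr_id -> value
        self.writer = writer           # hypothetical callback standing in for the DB update
        self.audit_log = []            # (attr_id, old, new) tuples

    def set_value(self, attr_id, value):
        self.staged[attr_id] = value   # memory only; nothing is written here

    def commit(self):
        # Keep only the values that differ from what was last committed.
        changes = {k: v for k, v in self.staged.items() if self.current.get(k) != v}
        if changes:
            self.writer(changes)       # a single write covers all staged changes
            for attr_id, new in changes.items():
                self.audit_log.append((attr_id, self.current.get(attr_id), new))
            self.current.update(changes)
        self.staged.clear()
        return {"OK": True}
```

Keeping the audit step inside `commit()` is the design point: the caller cannot forget it, which mirrors why the audit handling was moved behind the commit call.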

Other features & wrapping up

One additional feature worth mentioning is $RHCore.AttrQuery, which is an extension to RHQuery (introduced in Part V). The class can be used to query the LLAttrData table while respecting permissions and not getting bogged down in multiple table joins. The syntax is simple:

// Create an instance
Frame attrQuery = $RHCore.AttrQuery.New( prgCtx )

// Add a condition (multiple conditions accepted)
attrQuery.filter( CatID, "Country", "startsWith", "Cana")

The $RHCore.AttrQuery class is also a subclass of Paginator (also introduced in Part V), which means the results can be paged:

attrQuery.setPageNumber(4)
attrQuery.setPageSize(100)

Finally, get the results:

// Get a WebNodes RecArray containing the paged and filtered nodes with 
// attribute "Country" starting with "Cana"
RecArray items = attrQuery.items()

// or, an Iterator that provides an RHNode on each iteration
Frame iterator = attrQuery.iterator()
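As a rough mental model of the filter-then-page flow, here is a Python sketch over an in-memory list. It says nothing about how AttrQuery builds its SQL or checks permissions. `AttrFilterQuery`, the operator table, the default page size, and the record shape are assumptions; only the "startsWith" operator name comes from the example above.

```python
class AttrFilterQuery:
    """Toy query: AND together filters over dict records, then return one page."""

    OPERATORS = {
        "startsWith": lambda actual, expected: str(actual).startswith(expected),
        "equals": lambda actual, expected: actual == expected,   # hypothetical operator
    }

    def __init__(self, records):
        self.records = records
        self.filters = []
        self.page_number = 1     # pages are 1-based
        self.page_size = 25      # arbitrary default for the sketch

    def filter(self, attr_name, operator, expected):
        self.filters.append((attr_name, self.OPERATORS[operator], expected))
        return self

    def items(self):
        # A record matches only if it satisfies every filter added so far.
        matched = [
            r for r in self.records
            if all(attr in r and op(r[attr], expected) for attr, op, expected in self.filters)
        ]
        start = (self.page_number - 1) * self.page_size
        return matched[start:start + self.page_size]
```

A real implementation would push both the filter and the page window down into the database query rather than filtering in memory; the sketch only shows the order of operations.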

The extensions have made some interesting integrations possible. One example is the flattening of the category structure into a simple Assoc structure and caching it with memcached. This permits lightning fast access to category values without having to instantiate a Frame or hit the database each time you want to read the values. I’m using this to show category values in list views with extremely fast performance.
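The flattening idea can be illustrated with a short Python sketch. The nested input shape, the key format, and the dict standing in for memcached are all assumptions for illustration; this post does not show the actual structure or cache keys, and sets are ignored here to keep the example small.

```python
def flatten(categories):
    """Flatten {cat_id: {attr_id: [values...]}} into {"cat_id.attr_id": [values...]}."""
    flat = {}
    for cat_id, attrs in categories.items():
        for attr_id, values in attrs.items():
            flat[f"{cat_id}.{attr_id}"] = list(values)
    return flat


_cache = {}   # stand-in for a memcached client; a real one would be shared across requests


def cached_values(node_id, load_categories):
    """Return the flattened values for a node, loading them at most once."""
    key = f"attrs:{node_id}"
    if key not in _cache:
        _cache[key] = flatten(load_categories(node_id))   # the expensive path
    return _cache[key]
```

Any cache like this needs to be invalidated when the categories on the node are updated, otherwise list views would show stale values.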

The RHAttrData extensions contain many more methods to simplify the programmatic handling of categories. However, most of the time I just use the setters and getters without having to think about the complex details behind them. The extensions have become a cornerstone in all my projects involving categories.

Comments, suggestions, or questions? Leave a comment below. Please “Like” on LinkedIn if you found this post through there.

Need help developing for Content Server or interested in using RHCore? Contact me at cmeyer@rhouse.ch.

  12 Responses to “Part VI – Developing with Categories & Attributes in OpenText Content Server”

  1. Very interesting Chris, I totally agree with you. We adopted a very similar solution in our Content Script module (different perspective but very similar usage pattern)

    • Hi Patrick: It surprises me that a simple programming interface wasn’t part of the original design. This stuff is quite complicated but it doesn’t need to be. Thanks for the comment!

      • Probably wasn’t part of the original design because either a) they were rushing it out the door, and/or b) OT development never intended for oscripters outside of OT Dev to be messing around in that area. Heck they didn’t even like OTGS messing around with their lower level ospaces.

        • Hi Hugh: Hard to say. Categories and attributes are complex but robust. I’m just surprised that last step of a simple API wasn’t provided. I doubt this was done to obscure how it works, but it has been the side-effect of what’s provided. You can’t blame anyone for wanting to avoid the lower-level ospaces: That’s quite dangerous if you think about it. While certain hooks are provided to introduce new code, one can never anticipate customer requirements and what can and cannot be done in a non-intrusive way with Content Server. Thanks for your comment!

  2. How does one then get the categories applied to a node? For example, I have a folder that has a catid and some values. In the examples it looks like we are asking the category template for its values. Am I not understanding this correctly? One other thing I noticed is that you say you are using memcached. John Simon and I were really wondering what one uses it for. Now you are onto something.

    • Hi Appu: Good question. The $LLIAPI.AttrData object represents all the categories applied to a node. You can use the standard $LLIAPI.AttrData.New() constructor, which is inherited by $RHCore.RHAttrData:

      Frame attrdata = $RHCore.RHAttrData.New(prgCtx, CatID, VersionID)
      attrdata.DBGet()
      

      However, I find this pattern silly since it’s always required to call DBGet() on the constructed instance. For this reason I have a second constructor on $RHCore.RHAttrData called NewFrame(), which does this in a single call.

      But I don’t even use that. Most of the time I get the $RHCore.RHAttrData instance from my RHNode instance (RHNode introduced in Part I):

      // Get the RHNode for the Enterprise Workspace
      Frame node = $RHCore.RHNode.New(prgCtx, 2000)
      
      // Get the fully constructed attrdata object
      Frame attrdata = node.categories()
      

      Behind the scenes I cache the object for the life of the request. This means calling node.categories() multiple times will return the same instance and not construct a new instance each time it’s called. This is better for performance and prevents multiple updates from conflicting with each other.

      Memcached is a great addition to Content Server. I use it whenever I have a function that is deterministic (i.e., the same input always returns the same output) and takes a long time to execute. Another place I use it is in RHTemplate for fragment caching, similar to how it’s done in the Django web framework. I’ve been able to get some great performance boosts with this.

      Thanks for your comment!

  3. Excellent article. What sets your approach apart from the rest of us is that you are willing to share your developments. I think that you now have enough material for a crackerjack course. Throw in CSIDE as an aside and you could have a winner for the next conference.

    • Hi Alex: It’s a pity there isn’t more of an open discussion around Content Server development. The forums on KC are okay, but rarely do I see posts on design. Most of us work in a bubble and are driven by deadlines. Like many developers, I used to scoff at OScript as if it were a primitive language. Working on this framework has completely changed my opinion: It’s an extremely powerful language, but lacks the frameworks for web development. This has nothing to do with Builder or the new SDK; it has to do with how Content Server was built on top of OScript and how you have to work with it to build applications. I’m trying to change this with RHCore, which I hope to share in some manner one day. Thanks for your comment!

  4. Question about setting…if I want to set multiple attributes within a category would it be something like this:

    results = attrdata.SetValue(1234, 2, "New Value")
    results = attrdata.SetValue(1234, 3, "New Value")
    results = attrdata.SetValue(1234, 4, "New Value")
    attrdata.commit()

    so that the changes are only written once to the DB? Or are the changes written with each SetValue call? Would I need to wrap it in a transaction, or do you handle that part as well?

    • Hi John: Correct. The attrdata.SetValue() call only modifies the structure in memory; nothing is written to the database here. The commit() function calls llnode.NodeCategoriesUpdate() (which calls attrdata.DBPut()) and is where the database gets updated.

      You probably don’t need to wrap this in a transaction since llnode.NodeCategoriesUpdate() and attrdata.DBPut() already do that. The llnode.NodeCategoriesUpdate() or attrdata.DBPut() call should do the rollback if anything goes wrong.

      Thanks for your comment!
