Feb 14, 2014
 

Updated: 12.2017

Categories & Attributes are a powerful feature of OpenText Content Server. Categories allow custom attributes to be defined, which can then be assigned to documents, folders, or any other node type in the system. Other key features include:

  • definitions can be versioned;
  • categories can be applied to each version of a document;
  • attributes can be multi-valued;
  • simple one-to-many relationships are possible using “sets”;
  • supports various data types (boolean, string, text, long text, user, & date);
  • required attributes can be enforced;
  • custom attribute types can be added (e.g., table key lookup);
  • changes are audited; and
  • category data is indexed and searchable.

This is all possible from the Content Server user interface without having to write any custom code. It’s quite amazing when you think about it.

Things get tricky when functionality or logic needs to be bound to a category or attribute. For example, say a client wants the following to happen:

When the Publish boolean attribute on a document is true and the Publish Date attribute has passed, move the document to a publishing folder. Otherwise, if the Publish boolean is false and the Publish Date has passed, notify the document owner with an e-mail.

I used to cringe when I received requirements like this. How the heck do you programmatically work with categories & attributes? After years of trying and many half-baked attempts, I finally have a solution I’m happy with.

Let’s first discuss what’s available.

The OScript API for Categories & Attributes

I’ve seen a lot of code to manipulate categories & attributes and most do it wrong. Let me explain.

Content Server provides the $LLIAPI.AttrData class for working with categories. Its design is object-like, similar to what I describe in Part I of this blog series. A Frame instance of AttrData encapsulates the category and attribute information of a node, and provides methods to operate on the data. Saving the category data commits the data stored in .fData to the database.

So what’s the problem? The difficulty is with the programming interface provided by $LLIAPI.AttrData. It seems the interface is incomplete and oriented towards specific types of web requests. It provides no simple way for a developer to set, get, or query values, which is why I assume so many developers access the .fData structure directly:

attrData.fData.{CatID,VersionID}.Values[1].(SetID).Values[SetIndex].(AttrID).Values[AttrIndex] = newValue

Let’s discuss why this is bad. In the example there are no checks that CatID, VersionID, SetID, SetIndex, AttrID, or AttrIndex is valid or in range, or that newValue is of the correct data type. Any wrong assumption about the parameters, structure, or values can corrupt the data and cause all sorts of problems. I’ve seen it done countless times. It is highly error-prone, and it is why I think most developers are doing it wrong.
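To make the risk concrete, here is a rough sketch of the checks that the one-line assignment skips. It assumes the structure has exactly the shape implied by the path above (an Assoc keyed by {CatID, VersionID}, with nested Values lists) and only covers an attribute inside a set. IsSafeToAssign is a hypothetical helper name, not part of $LLIAPI.AttrData, and it still says nothing about whether newValue has the right data type.

```oscript
// Sketch only: guards for the direct-access path shown above.
// The shape of .fData is inferred from that one expression and is
// not verified against every Content Server version.
function Boolean IsSafeToAssign( \
        Frame attrData, \
        Integer catID, \
        Integer versionID, \
        Integer setID, \
        Integer setIndex, \
        Integer attrID, \
        Integer attrIndex )

    List catKey = { catID, versionID }

    // Is this category version present on the node at all?
    if !Assoc.IsKey( attrData.fData, catKey )
        return FALSE
    end

    Dynamic root = attrData.fData.( catKey ).Values[ 1 ]

    // Does the set exist, and is the requested row in range?
    if !Assoc.IsKey( root, setID )
        return FALSE
    end

    List rows = root.( setID ).Values

    if setIndex < 1 || setIndex > Length( rows )
        return FALSE
    end

    Dynamic row = rows[ setIndex ]

    // Does the attribute exist in that row, and is the value index in range?
    if !Assoc.IsKey( row, attrID )
        return FALSE
    end

    List values = row.( attrID ).Values

    return attrIndex >= 1 && attrIndex <= Length( values )
end
```

Even this much boilerplate only answers "is the path valid?". Repeating it at every call site is exactly the kind of work that belongs behind a method on the object.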

To reiterate something I quoted in Part I from Wikipedia on object-oriented programming:

Objects can be thought of as encapsulating their data within a set of functions designed to ensure that the data are used appropriately, and to assist in that use. The object’s methods typically include checks and safeguards specific to the data types the object contains. An object can also offer simple-to-use, standardized methods for performing particular operations on its data, while concealing the specifics of how those tasks are accomplished.

Why should this be any different with AttrData? In my opinion the API should:

  • provide setter and getter methods that are easy to use regardless of the data type, if it’s multi-valued, or within a set;
  • abstract away the need to traverse, manipulate, or even look at .fData;
  • do all the required checks and validation to enforce data integrity;
  • manage auditing (AttrData doesn’t audit automatically); and
  • provide an interface to query on attribute values, independent of IDs, and without having to write any SQL queries.

Some of these features are already available on AttrData, but are incomplete. For example, the ValueSet() method can be used to set a value to an attribute and has the following interface:

function Assoc ValueSet(List spec, Dynamic value)

The spec parameter is a List that defines the attribute to manipulate. That looks something like this: {{29502,95},{1,1,11,2,12,2}}. The numbers refer to the category ID, category version ID, set ID, set index, attribute ID, and the attribute index. The function doesn’t handle auditing and fails if a multi-valued attribute or set hasn’t been extended to accommodate the index. It’s not developer friendly at all.
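As an illustration, the example spec can be assembled from named variables, which at least makes the positions readable. The values are copied from the example above and will differ on every system; attrData is assumed to be an existing AttrData instance. The leading 1, 1 pair is kept exactly as it appears in the example (my reading is that it addresses the root of the structure, but treat that as an assumption).

```oscript
// Values copied from the example spec; they are installation-specific.
Integer catID = 29502
Integer catVersion = 95
Integer setID = 11
Integer setIndex = 2
Integer attrID = 12
Integer attrIndex = 2

// First element: category ID and category version.
// Second element: the path to the value. The leading 1, 1 pair is
// reproduced from the example as-is.
List spec = { { catID, catVersion }, { 1, 1, setID, setIndex, attrID, attrIndex } }

// attrData is assumed to be an existing $LLIAPI.AttrData instance.
Assoc result = attrData.ValueSet( spec, "New Value" )
```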

Introducing RHAttrData

To fix this problem I created a subclass of $LLIAPI.AttrData called $RHCore.RHAttrData and added methods to implement the requirements listed above. None of the extensions modify the original code, which means $RHCore.RHAttrData can be used in place of $LLIAPI.AttrData without side-effects. The advantages come with the added methods. For example, the SetValue() method can be used to set an attribute value and does the following:

  • validates that the value is of the correct type (e.g., date, integer, etc.) and attempts to cast it if not;
  • handles auditing;
  • works around common misuses of the API that could otherwise corrupt the category (it’s easy to do if you’re not careful);
  • automatically extends the row count of any multi-valued set or attribute (when required); and
  • adds the category to the node if not already applied.

It also has a friendlier programming interface:

function Assoc SetValue(Dynamic CatID, Dynamic AttrID, Dynamic value, Integer AttrIndex=1, Integer SetIndex=1)

With this a developer can set a value to an attribute:

Assoc results = attrdata.SetValue(CatID, AttrID, "New Value")

Or, set the 2nd value of a multi-valued attribute on the 3rd row of a multi-valued set:

Assoc results = attrdata.SetValue(CatID, AttrID, "New Value 2", 2, 3)

You’ll notice the interface doesn’t accept a set ID. The extension determines if an attribute is contained within a set and handles it for you. You just need to provide the attribute ID.

Getting a value is just as easy:

Assoc results = attrdata.GetValue(CatID, AttrID)

if results.ok
    value = results.Value
end

To save the changes:

results = attrdata.commit()

The commit() method makes the appropriate call to llnode.NodeCategoriesUpdate() without having to do any extra DAPINODE or llnode lookups. It also handles auditing without having to manipulate the fAttrChangePrefix or fValueChanges features on attrData (which is otherwise necessary for auditing to work).
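Since SetValue() only changes the structure held in memory, several changes can be collected and written with a single commit(). The following is a minimal sketch of that pattern. CatID and the attribute IDs are placeholders, attrdata is assumed to be an existing $RHCore.RHAttrData instance, and checking .ok on the SetValue() result is an assumption made by analogy with the GetValue() example above.

```oscript
// Placeholder attribute IDs; replace with the IDs of your category.
List attrIDs = { 2, 3, 4 }
Integer attrID
Assoc results
Boolean allSet = TRUE

for attrID in attrIDs
    results = attrdata.SetValue( CatID, attrID, "New Value" )

    // Stop at the first failure so a partial change is never committed
    if !results.ok
        allSet = FALSE
        break
    end
end

// One database update for all of the changes
if allSet
    results = attrdata.commit()
end
```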

Many other methods are available for performing operations such as cloning of categories, setting an attribute to read-only, profiling the properties of the category (e.g., what type of attributes it contains), rendering for a custom form, validating required attributes are fulfilled, etc.
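Returning to the client requirement from the introduction, the decision logic becomes short once the getter is available. This is only a sketch: catID, publishAttrID, and publishDateAttrID stand in for the IDs on your system, and MoveToPublishingFolder() and NotifyOwner() are hypothetical helpers that are stubbed out here and are not part of RHCore as described in this post.

```oscript
// Sketch only: applies the Publish / Publish Date rule to one node.
function Void ApplyPublishRule( \
        Frame node, \
        Dynamic catID, \
        Dynamic publishAttrID, \
        Dynamic publishDateAttrID )

    Frame attrdata = node.categories()

    Assoc publish = attrdata.GetValue( catID, publishAttrID )
    Assoc publishDate = attrdata.GetValue( catID, publishDateAttrID )

    // Do nothing if either value could not be read
    if !publish.ok || !publishDate.ok
        return
    end

    // Do nothing until a publish date is set and has passed
    if IsUndefined( publishDate.Value ) || publishDate.Value > Date.Now()
        return
    end

    if publish.Value == TRUE
        MoveToPublishingFolder( node )
    elseif publish.Value == FALSE
        NotifyOwner( node )
    end
end

// Hypothetical helper: move the node to the publishing folder.
function Void MoveToPublishingFolder( Frame node )
    // implementation depends on the target system
end

// Hypothetical helper: e-mail the owner of the node.
function Void NotifyOwner( Frame node )
    // implementation depends on the target system
end
```

Comparing against TRUE and FALSE explicitly means a document whose Publish flag was never set is left alone rather than triggering a notification.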

Attribute Identifiers

In the examples I pass CatID and AttrID as parameters to the SetValue() and GetValue() functions. These are just the category DataID and attribute ID.

The difficulty with using IDs is that they are inherently different among Content Server installations. Hardcoding these values will certainly cause problems once the module is installed on a different system.

One workaround is to retrieve the category and attribute IDs from the category and attribute name. However, this can lead to ambiguous cases when the category name is repeated within a system (I’ve seen this happen) or the attribute name is repeated within a category. It’s also unreliable since any user with the appropriate permissions could rename a category or attribute.

To deal with this problem RHCore introduces the concept of identifiers and attribute identifiers. An identifier permits a unique string to be mapped to a DataID. It’s similar to a nickname, but can only be viewed or modified by a user with Administrator rights.

An attribute identifier extends the identifier concept to attribute IDs by allowing each attribute in a category to be mapped to a unique string. With the category identifier it becomes possible to reference a specific category and attribute without having to hardcode any IDs. The only prerequisite is for the Administrator to set up the identifier mappings on each target system (which only needs to be done once).

Querying

Querying on Content Server attributes requires complex joins with LLAttrData. The query also depends on category and attribute IDs, which means a SQL statement must be rewritten if it’s to be used on another system.

RHCore abstracts this away with a simple extension to $RHCore.RHNodeQuery (see Part XVII for a blog post about querying), which permits filtering by categories and attributes without having to write any SQL. For example, to query the system for all documents having a date attribute value in the future:

Frame query = $RHCore.RHNodeQuery.New(prgCtx) \
                .filter('subtype', '==', $TypeDocument) \
                .filterAttribute(CatID, AttrID, '>', Date.Now())

The filterAttribute() method does all the necessary joins to make the query work, and also accepts identifiers as described in the previous section. The convenience, reliability, and amount of time saved with this feature cannot be overstated.
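For illustration, the same query written with identifiers instead of IDs might look like the following. The strings 'contract' and 'contract_expiry' are hypothetical identifiers that an Administrator would have mapped to the category and attribute on each system; they are not names defined by RHCore.

```oscript
// 'contract' and 'contract_expiry' are hypothetical identifiers,
// mapped once per system by an Administrator.
Frame query = $RHCore.RHNodeQuery.New(prgCtx) \
                .filter('subtype', '==', $TypeDocument) \
                .filterAttribute('contract', 'contract_expiry', '>', Date.Now())
```

Written this way, the query contains nothing installation-specific, so the same code can be deployed to another system once the identifier mappings exist there.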

Other features and Wrapping up

The extensions have made some interesting solutions possible. One example is the transformation of the category structure into an Assoc and caching the result with Memcached. This permits lightning fast access to attribute values without having to instantiate a Frame or query the database. I’m using this to display attribute values in table views with extremely fast performance.

Many other convenient methods exist. However, most of the time I just use the setters and getters without having to concern myself with the complex details behind them. The extensions have become a cornerstone in my work and make dealing with categories and attributes almost as easy as any other persisted variable in OScript.

Need help developing for Content Server or interested in using RHCore? Contact me at cmeyer@rhouse.ch.

  12 Responses to “Part VI – Developing with Categories & Attributes in OpenText Content Server”

  1. Very interesting Chris, I totally agree with you. We adopted a very similar solution in our Content Script module (different perspective but very similar usage pattern)

    • Hi Patrick: It surprises me that a simple programming interface wasn’t part of the original design. This stuff is quite complicated but it doesn’t need to be. Thanks for the comment!

      • Probably wasn’t part of the original design because either a) they were rushing it out the door, and/or b) OT development never intended for oscripters outside of OT Dev to be messing around in that area. Heck they didn’t even like OTGS messing around with their lower level ospaces.

        • Hi Hugh: Hard to say. Categories and attributes are complex but robust. I’m just surprised that last step of a simple API wasn’t provided. I doubt this was done to obscure how it works, but it has been the side-effect of what’s provided. You can’t blame anyone for wanting to avoid the lower-level ospaces: That’s quite dangerous if you think about it. While certain hooks are provided to introduce new code, one can never anticipate customer requirements and what can and cannot be done in a non-intrusive way with Content Server. Thanks for your comment!

  2. How does one then get the categories applied to a node? For example, I have a folder that has a catid and some values. In the examples it looks like we are asking the category template for its values. Am I not understanding this correctly? One other thing I noticed is you say you are using memcached. John Simon and I were really wondering what one uses it for. Now you are onto something.

    • Hi Appu: Good question. The $LLIAPI.AttrData object represents all the categories applied to a node. You can use the standard $LLIAPI.AttrData.New() constructor, which is inherited by $RHCore.RHAttrData:

      Frame attrdata = $RHCore.RHAttrData.New(prgCtx, CatID, VersionID)
      attrdata.DBGet()
      

      However, I find this pattern silly since it’s always required to call DBGet() on the constructed instance. For this reason I have a second constructor on $RHCore.RHAttrData called NewFrame(), which does this in a single call.

      But I don’t even use that. Most of the time I get the $RHCore.RHAttrData instance from my RHNode instance (RHNode introduced in Part I):

      // Get the RHNode for the Enterprise Workspace
      Frame node = $RHCore.RHNode.New(prgCtx, 2000)
      
      // Get the fully constructed attrdata object
      Frame attrdata = node.categories()
      

      Behind the scenes I cache the object for the life of the request. This means calling node.categories() multiple times will return the same instance and not construct a new instance each time it’s called. This is better for performance and prevents multiple updates from conflicting with each other.

      Memcached is a great addition to Content Server. I use it whenever I have a function that is deterministic (i.e., the same input always returns the same output) and takes a long time to execute. Another place I use it is in RHTemplate for fragment caching in a similar manner to how it’s done with the Django Web framework. I’ve been able to get some great performance boosts with this.

      Thanks for your comment!

  3. Excellent article. What sets your approach apart from the rest of us is that you are willing to share your developments. I think that you now have enough material for a crackerjack course. Throw in CSIDE as an aside and you could have a winner for the next conference.

    • Hi Alex: It’s a pity there isn’t more of an open discussion around Content Server development. The forums on KC are okay, but rarely do I see posts on design. Most of us work in a bubble and are driven by deadlines. Like many developers, I used to scoff at OScript as if it were a primitive language. Working on this framework has completely changed my opinion: It’s an extremely powerful language, but lacks the frameworks for web development. This has nothing to do with Builder or the new SDK; it has to do with how Content Server was built on top of OScript and how you have to work with it to build applications. I’m trying to change this with RHCore, which I hope to share in some manner one day. Thanks for your comment!

  4. Question about setting…if I want to set multiple attributes within a category would it be something like this:

    results = attrdata.SetValue(1234, 2, "New Value")
    results = attrdata.SetValue(1234, 3, "New Value")
    results = attrdata.SetValue(1234, 4, "New Value")
    attrdata.commit()

    so the changes are only written one time to the DB? Or are the changes written with each SetValue call? Would I need to wrap it in a transaction or do you handle that part as well?

    • Hi John: Correct. The attrdata.SetValue() call only modifies the structure in memory; nothing is written to the database here. The commit() function calls llnode.NodeCategoriesUpdate() (which calls attrdata.DBPut()) and is where the database gets updated.

      You probably don’t need to wrap this in a transaction since llnode.NodeCategoriesUpdate() and attrdata.DBPut() already do that. The llnode.NodeCategoriesUpdate() or attrdata.DBPut() call should do the rollback if anything goes wrong.

      Thanks for your comment!
