Part VI – Developing with Categories & Attributes in OpenText Content Server

Christopher Meyer

Fri Feb 14 2014

Updated: 12.2017

Categories & Attributes are a powerful feature of OpenText Content Server. Categories allow custom attributes to be defined, which can then be assigned to documents, folders, or any other node type in the system. Other key features include:

- definitions can be versioned;
- categories can be applied to each version of a document;
- attributes can be multi-valued;
- simple one-to-many relationships are possible using "sets";
- various data types are supported (boolean, string, text, long text, user, & date);
- required attributes can be enforced;
- custom attribute types can be added (e.g., table key lookup);
- changes are audited; and
- category data is indexed and searchable.

This is all possible from the Content Server user interface without having to write any custom code. It's quite amazing when you think about it.

Things get tricky when functionality or logic needs to be bound to a category or attribute. For example, say a client wants the following to happen:

When the Publish boolean attribute on a document is true and the Publish Date attribute has passed, move the document to a publishing folder. Otherwise, if the Publish boolean is false and the Publish Date has passed, notify the document owner by e-mail.
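
Stated as code, the rule itself is tiny. Here is a minimal sketch in Python, used only as a neutral, runnable notation; `Action` and `decide_action` are hypothetical names, and I'm assuming a date has "passed" once it is no later than the current time and that an unset Publish flag means "do nothing":

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class Action(Enum):
    MOVE_TO_PUBLISHING = "move"
    NOTIFY_OWNER = "notify"
    NONE = "none"


def decide_action(publish: Optional[bool],
                  publish_date: Optional[datetime],
                  now: datetime) -> Action:
    """Return what should happen to a document under the client's rule."""
    # Nothing happens until a publish date is set and has passed.
    if publish_date is None or publish_date > now:
        return Action.NONE
    if publish is True:
        return Action.MOVE_TO_PUBLISHING
    if publish is False:
        return Action.NOTIFY_OWNER
    # Publish flag not set: the requirement is silent (assumption), so do nothing.
    return Action.NONE
```

The logic is not the hard part. The hard part is reliably reading `publish` and `publish_date` from the category data in the first place.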

I used to cringe when I received requirements like this. How the heck do you programmatically work with categories & attributes? After years of trying and many half-baked attempts, I finally have a solution I'm happy with.

Let's first discuss what's available.

# The OScript API for Categories & Attributes

I've seen a lot of code that manipulates categories & attributes, and most of it does it wrong. Let me explain.

Content Server provides the $LLIAPI.AttrData class for working with categories. It follows an object-like design similar to the one I describe in Part I of this blog series. A Frame instance of AttrData encapsulates the category and attribute information of a node, and provides methods to operate on the data. Saving the category data commits the data stored in .fData to the database.

So what's the problem? The difficulty is with the programming interface provided by $LLIAPI.AttrData. It seems the interface is incomplete and oriented towards specific types of web requests. It provides no simple way for a developer to set, get, or query values, which is why I assume so many developers access the .fData structure directly:

attrData.fData.{CatID,VersionID}.Values[1].(SetID).Values[SetIndex].(AttrID).Values[AttrIndex] = newValue

Let's discuss why this is bad. In the example there are no checks that CatID, VersionID, SetID, SetIndex, AttrID, or AttrIndex is valid and in range, or that newValue is of the correct data type. Any wrong assumption about the parameters, structure, or values can corrupt the data and cause all sorts of problems. I've seen it done countless times, it's highly error-prone, and it's why I think most developers are doing it wrong.
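
To make the risk concrete, here is a toy model in Python rather than OScript. The nested dictionary only imitates the shape of .fData (without sets, for brevity), and `make_fdata`, `ATTR_TYPES`, `unchecked_set`, and `checked_set` are all hypothetical names, not Content Server APIs. The unchecked write happily stores a string in a date attribute; the checked one refuses and says why:

```python
from datetime import date


def make_fdata():
    # Toy stand-in for .fData: {(cat_id, version): {"Values": [{attr_id: {"Values": [...]}}]}}
    return {(29502, 95): {"Values": [{12: {"Values": [date(2014, 2, 14)]}}]}}


# Expected type per attribute ID (assumption: a real system reads this from the definition).
ATTR_TYPES = {12: date}


def unchecked_set(fdata, cat_key, attr_id, attr_index, value):
    """Direct traversal with no validation, as in the one-liner above."""
    fdata[cat_key]["Values"][0][attr_id]["Values"][attr_index - 1] = value


def checked_set(fdata, cat_key, attr_id, attr_index, value):
    """Validate every step before writing; report problems instead of corrupting data."""
    category = fdata.get(cat_key)
    if category is None:
        return {"ok": False, "error": f"category {cat_key} is not applied"}
    attr = category["Values"][0].get(attr_id)
    if attr is None:
        return {"ok": False, "error": f"attribute {attr_id} is not in the category"}
    if not 1 <= attr_index <= len(attr["Values"]):
        return {"ok": False, "error": f"index {attr_index} is out of range"}
    expected = ATTR_TYPES.get(attr_id, object)
    if not isinstance(value, expected):
        return {"ok": False,
                "error": f"expected {expected.__name__}, got {type(value).__name__}"}
    attr["Values"][attr_index - 1] = value
    return {"ok": True}


data = make_fdata()
unchecked_set(data, (29502, 95), 12, 1, "tomorrow")         # silently stores a str in a date field
result = checked_set(make_fdata(), (29502, 95), 12, 1, "tomorrow")
print(result["ok"])                                          # False: the bad value is rejected
```

Every check in `checked_set` is one that a caller of the direct assignment would otherwise have to remember to do by hand, every time.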

To reiterate something I quoted in Part I from Wikipedia on object-oriented programming:

Objects can be thought of as encapsulating their data within a set of functions designed to ensure that the data are used appropriately, and to assist in that use. The object's methods typically include checks and safeguards specific to the data types the object contains. An object can also offer simple-to-use, standardized methods for performing particular operations on its data, while concealing the specifics of how those tasks are accomplished.

Why should this be any different with AttrData? In my opinion the API should:

- provide setter and getter methods that are easy to use regardless of the data type, whether the attribute is multi-valued, or whether it sits within a set;
- abstract away the need to traverse, manipulate, or even look at .fData;
- do all the required checks and validation to enforce data integrity;
- manage auditing (AttrData doesn't audit automatically); and
- provide an interface to query on attribute values, independent of IDs, and without having to write any SQL queries.

Some of these features are already available on AttrData, but are incomplete. For example, the ValueSet() method can be used to set a value to an attribute and has the following interface:

function Assoc ValueSet(List spec, Dynamic value)

The spec parameter is a List that defines the attribute to manipulate. That looks something like this: {{29502,95},{1,1,11,2,12,2}}. The numbers refer to the category ID, category version ID, set ID, set index, attribute ID, and the attribute index. The function doesn't handle auditing and fails if a multi-valued attribute or set hasn't been extended to accommodate the index. It's not developer friendly at all.

# Introducing RHAttrData

To fix this problem I created a subclass of $LLIAPI.AttrData called $RHCore.RHAttrData and added methods to implement the requirements listed above. None of the extensions modify the original code, which means $RHCore.RHAttrData can be used in place of $LLIAPI.AttrData without side-effects. The advantages come with the added methods. For example, the SetValue() method can be used to set an attribute value and does the following:

- validates that the value is of the correct type (e.g., date, integer, etc.) and attempts to cast it if not;
- handles auditing;
- works around common misuses of the API that could otherwise corrupt the category (it's easy to do if you're not careful);
- automatically extends the row count of any multi-valued set or attribute (when required); and
- adds the category to the node if not already applied.
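
Two of these behaviours, casting and automatic row extension, are easy to picture in isolation. The following is a rough Python model of the idea, not the RHAttrData implementation; `cast_value`, `set_value`, the `max_rows` limit, and the `%Y-%m-%d` date format are all assumptions made for the sketch:

```python
from datetime import date, datetime


def cast_value(value, expected_type):
    """Return value as expected_type, raising if it cannot be cast."""
    if isinstance(value, expected_type):
        return value
    if expected_type is int:
        return int(value)  # e.g. "42" becomes 42; "abc" raises ValueError
    if expected_type is date:
        return datetime.strptime(value, "%Y-%m-%d").date()
    if expected_type is str:
        return str(value)
    raise ValueError(f"cannot cast {value!r} to {expected_type.__name__}")


def set_value(values, index, value, expected_type, max_rows):
    """Set the 1-based row of a multi-valued attribute, growing the list when needed."""
    if not 1 <= index <= max_rows:
        raise IndexError(f"index {index} exceeds the {max_rows} rows allowed")
    value = cast_value(value, expected_type)
    # Pad with None (an undefined value) so the target row exists before writing.
    while len(values) < index:
        values.append(None)
    values[index - 1] = value
    return values


rows = ["a"]
set_value(rows, 3, 7, str, max_rows=5)
print(rows)  # ['a', None, '7']
```

The caller asks for row 3 and gets row 3; whether the list was long enough beforehand is no longer their problem.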

It also has a friendlier programming interface:

function Assoc SetValue(Dynamic CatID, Dynamic AttrID, Dynamic value, Integer AttrIndex=1, Integer SetIndex=1)

With this a developer can set a value to an attribute:

Assoc results = attrdata.SetValue(CatID, AttrID, "New Value")

Or, set the 2nd value of a multi-valued attribute on the 3rd row of a multi-valued set:

Assoc results = attrdata.SetValue(CatID, AttrID, "New Value 2", 2, 3)

You'll notice the interface doesn't accept a set ID. The extension determines if an attribute is contained within a set and handles it for you. You just need to provide the attribute ID.
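
Conceptually, that lookup is a walk over the category definition. A hedged Python sketch of the idea follows; the `DEFINITION` layout and `find_set_id` are hypothetical, and I'm assuming sets are not nested inside other sets:

```python
# Toy category definition: top-level children are attributes or sets; a set has children.
DEFINITION = {
    "children": [
        {"id": 2, "name": "Publish", "type": "boolean"},
        {"id": 11, "name": "Contacts", "type": "set", "children": [
            {"id": 12, "name": "Email", "type": "string"},
        ]},
    ]
}


def find_set_id(definition, attr_id):
    """Return the ID of the set containing attr_id, or None for a top-level attribute.

    Raises KeyError when the attribute is not defined in the category at all.
    """
    for child in definition["children"]:
        if child["type"] == "set":
            if any(member["id"] == attr_id for member in child["children"]):
                return child["id"]
        elif child["id"] == attr_id:
            return None
    raise KeyError(f"attribute {attr_id} is not defined in this category")


print(find_set_id(DEFINITION, 12))  # 11: attribute 12 lives inside set 11
print(find_set_id(DEFINITION, 2))   # None: attribute 2 is top-level
```

Because the definition already knows where each attribute lives, asking the caller for a set ID would only add another way to get it wrong.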

Getting a value is just as easy:

Assoc results = attrdata.GetValue(CatID, AttrID)

if results.ok
	value = results.Value
end

To save the changes:

results = attrdata.commit()

The commit() method makes the appropriate call to llnode.NodeCategoriesUpdate() without having to do any extra DAPINODE or llnode lookups. It also handles auditing without having to manipulate the fAttrChangePrefix or fValueChanges features on attrData (which is otherwise necessary for auditing to work).

Many other methods are available for performing operations such as cloning of categories, setting an attribute to read-only, profiling the properties of the category (e.g., what type of attributes it contains), rendering for a custom form, validating required attributes are fulfilled, etc.

# Attribute Identifiers

In the examples I pass CatID and AttrID as parameters to the SetValue() and GetValue() functions. These are just the category DataID and attribute ID.

The difficulty with using IDs is that they are inherently different among Content Server installations. Hardcoding these values will certainly cause problems once the module is installed on a different system.

One workaround is to retrieve the category and attribute IDs from the category and attribute name. However, this can lead to ambiguous cases when the category name is repeated within a system, or the attribute name is repeated within a category. It's also unreliable since any user with the appropriate permissions could rename a category or attribute.

To deal with this problem, RHCore introduces the concept of identifiers and attribute identifiers. An identifier permits a unique string to be mapped to a DataID. It's similar to a nickname, but can only be viewed or modified by a user with Administrator rights.

An attribute identifier extends the identifier concept to attribute IDs by allowing each attribute in a category to be mapped to a unique string. With the category identifier, it becomes possible to reference a specific category and attribute without having to hardcode any IDs. The only prerequisite is for the Administrator to set up the identifier mappings on each target system, which only needs to be done once.
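
The mapping idea can be modelled in a few lines. This Python sketch is an illustration only: `IdentifierRegistry` and its methods are hypothetical names, the IDs are made up, and how RHCore actually stores the mappings is not shown here. The point is that the same code resolves to different IDs on different systems:

```python
class IdentifierRegistry:
    """Per-system mapping of stable identifier strings to local IDs."""

    def __init__(self):
        self._categories = {}  # identifier -> category DataID
        self._attributes = {}  # (category DataID, attribute identifier) -> attribute ID

    def map_category(self, identifier, data_id):
        if identifier in self._categories:
            raise ValueError(f"identifier {identifier!r} is already mapped")
        self._categories[identifier] = data_id

    def map_attribute(self, cat_identifier, attr_identifier, attr_id):
        key = (self._categories[cat_identifier], attr_identifier)
        if key in self._attributes:
            raise ValueError(f"attribute identifier {attr_identifier!r} is already mapped")
        self._attributes[key] = attr_id

    def resolve(self, cat, attr):
        """Accept numeric IDs or identifiers and return (category ID, attribute ID)."""
        cat_id = cat if isinstance(cat, int) else self._categories[cat]
        attr_id = attr if isinstance(attr, int) else self._attributes[(cat_id, attr)]
        return cat_id, attr_id


# The Administrator sets up each system once; the IDs differ between systems.
dev = IdentifierRegistry()
dev.map_category("publishing", 29502)
dev.map_attribute("publishing", "publish_date", 12)

prod = IdentifierRegistry()
prod.map_category("publishing", 40117)
prod.map_attribute("publishing", "publish_date", 3)

print(dev.resolve("publishing", "publish_date"))   # (29502, 12)
print(prod.resolve("publishing", "publish_date"))  # (40117, 3)
```

Unlike a name lookup, a duplicate identifier is rejected at mapping time, so resolution can never be ambiguous.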

# Querying

Querying on Content Server attributes requires complex joins with the LLAttrData table. The query also depends on category and attribute IDs, which means a SQL statement must be refactored if it's to be used on another system.

RHCore abstracts this away with a simple extension to $RHCore.RHNodeQuery (see Part XVII for a blog post about querying), which permits filtering by categories and attributes without having to write any SQL. For example, to query the system for all documents having a date attribute value in the future:

Frame query = $RHCore.RHNodeQuery.New(prgCtx) \
	.filter('subtype', '==', $TypeDocument) \
	.filterAttribute(CatID, AttrID, '>', Date.Now())

The filterAttribute() method does all the necessary joins to make the query work, and also accepts identifiers as described in the previous section. The convenience, reliability, and amount of time saved with this feature cannot be overstated.

# Other Features and Wrapping Up

The extensions have made some interesting solutions possible. One example is the transformation of the category structure into an Assoc and caching the result with Memcached. This permits lightning fast access to attribute values without having to instantiate a Frame or query the database. I'm using this to display attribute values in table views with great performance.
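
The pattern is ordinary cache-aside. A rough Python model follows, with a plain dictionary standing in for Memcached; `flatten`, `AttributeCache`, and the nested toy structure are assumptions for the sketch and not the RHCore code:

```python
def flatten(fdata):
    """Collapse the nested toy structure into {(cat_id, attr_id): [values]}."""
    flat = {}
    for (cat_id, _version), category in fdata.items():
        for attr_id, attr in category["Values"][0].items():
            flat[(cat_id, attr_id)] = list(attr["Values"])
    return flat


class AttributeCache:
    def __init__(self, loader):
        self._loader = loader  # callable: node_id -> nested structure (the slow path)
        self._store = {}       # stand-in for Memcached
        self.loads = 0         # counts slow-path hits, for illustration

    def values(self, node_id):
        """Return the flattened values, loading and caching them on first access."""
        if node_id not in self._store:
            self.loads += 1
            self._store[node_id] = flatten(self._loader(node_id))
        return self._store[node_id]

    def invalidate(self, node_id):
        """Drop the cached entry, e.g. after the category data is saved."""
        self._store.pop(node_id, None)


def load_from_database(node_id):
    # Hypothetical slow path; returns the same toy data for every node.
    return {(29502, 95): {"Values": [{12: {"Values": ["a", "b"]}}]}}


cache = AttributeCache(load_from_database)
cache.values(1001)
cache.values(1001)
print(cache.loads)  # 1: the second read never touched the slow path
```

The part that needs care in practice is invalidation: the cached entry has to be dropped whenever the category data changes.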

Many other convenient methods exist. However, most of the time I just use the setters and getters without having to concern myself with the complex details behind them. The extensions have become a cornerstone of my work, and make dealing with categories and attributes almost as easy as any other persisted variable in OScript.