Holes in the template: piping content into a web CMS

When companies have large quantities of content – for example, many products, where each one has several pieces of information – that product information probably doesn’t originate from their web content management system (WCMS). The WCMS acts as a ‘presentation layer’ – in other words, a mechanism to display content.

The content doesn’t have to live in a single system. In fact, there may be multiple systems that feed different types of information, both content and data, into the ‘presentation layer’ of the WCMS. This isn’t a bad thing. Different end systems are optimised to input, store and process particular kinds of content or data.

In this post, we look at how content works within a WCMS, and how a WCMS works with other systems to present content in ways that create richer information so that content consumers can make sense of, and make decisions with, that content.

What constitutes a product?

Before we talk about how content comes together, we need to ask what information is actually needed. This search on the staples.co.uk site for Post-it® Flags returned a typical result. The product consists of a few elements that we as content consumers recognise on the page:

  • images of the product
  • a short description
  • a price
  • quantity available
  • description
  • features
  • accessories
Results of a search on the staples.com site for 'Post-It flags'

A product on the staples.com site

This is what gets displayed on the screen, but there’s a whole other layer of technical information behind the screen that makes it possible for us to search for a product and see all of the bits of information we need to make sense of it.

To make sense of the technical side of what makes up a product, we need to look at the markup language behind the scenes. Content that needs to work between systems – for example, on different sites – needs to use a standard that can be understood across those systems. This has led to using international standards. In this case, we would use the markup language, or ‘schema’, specific to a product.

The set of standards preferred by search engines, at schema.org, defines the superset of elements that make up a product. That superset contains 55 elements, some of which (such as ‘Offer’) are schemas of their own.

This is important to our discussion, specifically because we need to realise that what happens behind the scenes becomes critical to automating the delivery of the on-page content.

Product on schema.org

Some of the elements from the product schema

Product from schema.org
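To make the idea of a product schema a little more concrete, here is a minimal sketch of how a handful of those elements might be expressed as JSON-LD, one of the formats search engines read. The property names (name, description, image, sku, offers, price, priceCurrency, availability) come from the schema.org Product and Offer types. The function name, the sample values and the example.com URL are invented for illustration and are not taken from any real site.

```python
import json


def product_jsonld(name, description, image_url, sku, price, currency, in_stock):
    """Build a small schema.org Product description as a JSON-LD dictionary."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "image": image_url,
        "sku": sku,
        # 'offers' holds an Offer, which is a schema of its own.
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }


# Invented sample values, for illustration only.
example = product_jsonld(
    name="Index flags",
    description="Repositionable flags for marking pages.",
    image_url="https://example.com/images/flags.jpg",
    sku="FLAG-01",
    price="3.49",
    currency="GBP",
    in_stock=True,
)
print(json.dumps(example, indent=2))
```

Only a few of the available elements are used here. A real product record would carry many more, and it would normally be generated by the back-end systems rather than written by hand.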

Presenting products in a WCMS

A typical WCMS is optimised for presenting content in complex ways. It doesn’t really ‘manage’ the content, but it does excel at presenting content and at important post-publication functions, like providing a hook for gathering analytics. The WCMS allows all of the information to be aggregated into rich product descriptions and brought together on a single page. The front-end WCMS is good at presenting all of this information so that consumers can understand the content in context.

For example, my favourite shoe store, Fluevog, can display shoes filtered by:

  • wearer type
  • size
  • style
  • colour
  • heel height
  • shoe family

A consumer can zoom in, check the fit (on a scale of narrow to wide), see the price and see how many are left in stock.

Shoes on the Fluevog site after the search has been filtered

A Fluevog search after refining the search using filters

Shoes are probably on the simpler side of the product spectrum. In B2B situations, products could:

  • come bundled with other products
  • be subject to bulk discounts
  • have geographic restrictions around where they can be sold
  • display company-specific information for buyers who have log-ins

A decent enterprise WCMS can calculate, based on programmed-in business logic, what gets shown where – which products, which currency, which bundling options, inventory levels, and so on.
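As a rough illustration of what ‘programmed-in business logic’ can mean, the sketch below decides what a given visitor sees for a given product. Everything in it is hypothetical: the Product and Visitor classes, the view_for function, the rules and the sample data are assumptions made for this example, not the behaviour of any particular WCMS.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    prices: dict          # currency code -> display price
    sellable_in: set      # country codes where the product may be sold
    stock: int = 0
    bundles: list = field(default_factory=list)


@dataclass
class Visitor:
    country: str
    currency: str
    logged_in: bool = False


def view_for(product, visitor):
    """Return the fields to display, or None if the product must be hidden."""
    if visitor.country not in product.sellable_in:
        return None  # geographic restriction
    if visitor.currency not in product.prices:
        return None  # no price defined for this currency
    view = {
        "name": product.name,
        "price": product.prices[visitor.currency],
        "currency": visitor.currency,
        "in_stock": product.stock > 0,
    }
    if visitor.logged_in:
        # Bundling options are shown only to buyers who have log-ins.
        view["bundles"] = list(product.bundles)
    return view


flags = Product(
    name="Index flags",
    prices={"GBP": "3.49", "EUR": "3.99"},
    sellable_in={"GB", "DE"},
    stock=12,
    bundles=["Flags and highlighter pack"],
)
retail = view_for(flags, Visitor("GB", "GBP"))
trade = view_for(flags, Visitor("DE", "EUR", logged_in=True))
blocked = view_for(flags, Visitor("US", "USD"))
```

In practice the rules would be configured rather than hard-coded, but the shape is the same: product data comes in, the visitor’s context is applied, and only the appropriate view goes out.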

Different systems, different functions

While a WCMS is sophisticated about the way it presents content, storing all of the content in a WCMS doesn’t make sense. The product information, such as attributes, pricing, and so on, needs to be stored in systems that are meant to manipulate content or data in particular ways.

Some systems have specialty functions to manipulate content at a granular level. Others have specialty functions to manipulate data – for example, a pricing tool may convert between currencies, round up or down, calculate volume discounts, and add the appropriate taxes per country. These back-end systems are generally highly configured, and the content in them is highly structured and tagged so that it can automatically be pushed to the display layer in a WCMS.
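The pricing example can be sketched in a few lines. This is a simplified illustration, assuming a single base currency and flat tax rates. The exchange rate, tax rates, discount thresholds and the quote function are placeholders invented for this example, not real business rules.

```python
from decimal import Decimal, ROUND_HALF_UP

# Illustrative figures only; a real pricing tool would load these from configuration.
RATES = {"GBP": Decimal("1"), "EUR": Decimal("1.17")}   # hypothetical exchange rates
TAX = {"GB": Decimal("0.20"), "DE": Decimal("0.19")}    # flat tax rate per country
VOLUME_DISCOUNTS = [(100, Decimal("0.10")), (10, Decimal("0.05"))]  # (min quantity, discount)


def quote(unit_price_gbp, quantity, currency, country):
    """Price an order: volume discount, currency conversion, tax, then rounding."""
    discount = Decimal("0")
    for min_quantity, rate in VOLUME_DISCOUNTS:  # ordered from largest threshold down
        if quantity >= min_quantity:
            discount = rate
            break
    net = unit_price_gbp * quantity * (1 - discount)
    converted = net * RATES[currency]
    gross = converted * (1 + TAX[country])
    # Round to two decimal places, with halves rounded up.
    return gross.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


single = quote(Decimal("2.00"), 1, "GBP", "GB")
bulk = quote(Decimal("2.00"), 10, "GBP", "GB")
```

Decimal is used instead of floating-point numbers so that the rounding behaves predictably, which matters when the result is displayed as a price.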

This kind of content, and the data that goes with it, could be displayed in two ways: a view that looks relatively dry – think of an Excel spreadsheet view that lists sizes and colours – or a view that makes the information clearer and more enticing to whoever is consuming the content. Technologists call this “decorating the data”. Seth Gottlieb explains more about this in a post on his blog.

Seth Gottlieb article: The CMS Decorator Pattern
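A small sketch may help to show the difference between the two views. It is not Seth Gottlieb’s implementation, only an illustration of the general idea: the same record from a back-end system is shown once as it is, and once wrapped in editorial content held elsewhere. The record, the editorial text and both function names are invented for the example.

```python
# A record as it might arrive from a back-end system (invented data).
raw_record = {"sku": "FLAG-01", "colour": "red", "size": "25mm", "price": "3.49"}

# Editorial content held separately and keyed by SKU (also invented).
editorial = {
    "FLAG-01": {
        "title": "Index flags",
        "blurb": "Mark the pages that matter without damaging them.",
    }
}


def dry_view(record):
    """Spreadsheet-style view: one 'field: value' line per field."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


def decorated_view(record, editorial_content):
    """Wrap the same record in editorial content, falling back to the SKU."""
    extra = editorial_content.get(record["sku"], {})
    title = extra.get("title", record["sku"])
    blurb = extra.get("blurb", "")
    details = f"Available in {record['colour']}, {record['size']}, for {record['price']}"
    return "\n".join(part for part in (title, blurb, details) if part)


print(dry_view(raw_record))
print(decorated_view(raw_record, editorial))
```

The data itself is untouched in both cases. Only the layer wrapped around it changes.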

Multiple systems, each fit for purpose

For an organisation with any significant amount of product information, there is a high probability that the images come from one system, the descriptions live elsewhere, the attributes come from yet another system, the price comes from a dedicated pricing tool, the delivery information is calculated and delivered from another system, and the ratings are served up from yet another system.

Search result from Amazon for portable headphones

These elements can be recognised in the description of the headphones shown here, taken from an Amazon search. Again, this is to be expected – sometimes there are over 50 systems working together in complex enterprise solutions. It’s fine, as long as the systems are configured well and work together seamlessly to either push the content into the WCMS or to allow the WCMS to pull the content on demand.

Putting holes in the template

Multiple systems work together by putting ‘holes in the template’ and calling scripts that fetch the right information to populate those holes. It sounds simple, but there’s actually a lot of complexity to the equation.

A typical complement of systems that work together could be:

  • ERP (Enterprise Resource Planning) system, which pushes data (SKUs, prices, etc) into the WCMS
  • PIM (Product Information Management) system, which pushes product content and attributes into the WCMS
  • DAM (Digital Asset Management) system, which pushes binary files (images, video, audio, PDFs etc) into the WCMS
  • TrM (Translation Management) system, which manages language and other market variants behind the scenes
  • TxM (Taxonomy Management) system, which controls the terminology and tags to optimise search
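Stripped down to its simplest form, the ‘holes in the template’ idea might look like the sketch below. Each function stands in for one of the systems above and returns the values for the holes it owns. The function names, the template and the returned data are all hypothetical, and this sketch has the page pull the content on demand, whereas a real integration might equally have the systems push it.

```python
from string import Template

# The template: each $placeholder is a 'hole' to be filled.
PAGE = Template(
    "<h1>$name</h1>\n"
    '<img src="$image">\n'
    "<p>$description</p>\n"
    "<p>$currency $price</p>"
)


# Stand-ins for the back-end systems; real ones would be calls to other services.
def from_erp(sku):
    return {"price": "3.49", "currency": "GBP"}


def from_pim(sku):
    return {"name": "Index flags", "description": "Repositionable page markers."}


def from_dam(sku):
    return {"image": f"/assets/{sku.lower()}.jpg"}


SOURCES = [from_erp, from_pim, from_dam]


def render(sku):
    """Collect values from every source, then fill the holes in the template."""
    holes = {}
    for source in SOURCES:
        holes.update(source(sku))
    # substitute() raises KeyError if any hole is left unfilled.
    return PAGE.substitute(holes)


page = render("FLAG-01")
print(page)
```

Using substitute() rather than safe_substitute() is a deliberate choice here: a missing value stops the page from rendering instead of leaving a visible gap.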

These are parallel processes. And just as you can ‘tag up’ content in many different ways, these systems can deliver that same content according to many different criteria.

For example, the content can be shown according to specific reader profiles. This could mean that a content consumer logs in as a premium-package member and sees something different to a standard-package member. Or a corporate buyer sees something different to a retail shopper. Or that a reader chooses some filters (women’s shoes, red, heeled, size 8) and sees content specific to their criteria.
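The filter case can be sketched as follows, assuming each item carries a set of allowed values per facet. The catalogue, the facet names and the refine function are made up for this illustration.

```python
# Invented catalogue: each item is tagged with the values it has for each facet.
CATALOGUE = [
    {"name": "Shoe A",
     "tags": {"wearer": {"women"}, "colour": {"red"}, "heel": {"heeled"}, "size": {7, 8, 9}}},
    {"name": "Shoe B",
     "tags": {"wearer": {"women"}, "colour": {"black"}, "heel": {"flat"}, "size": {8}}},
    {"name": "Shoe C",
     "tags": {"wearer": {"men"}, "colour": {"red"}, "heel": {"flat"}, "size": {8, 10}}},
]


def matches(item, filters):
    """True if the item carries the chosen value for every chosen facet."""
    return all(
        value in item["tags"].get(facet, set())
        for facet, value in filters.items()
    )


def refine(catalogue, **filters):
    """Return the names of the items that satisfy all of the reader's filters."""
    return [item["name"] for item in catalogue if matches(item, filters)]


result = refine(CATALOGUE, wearer="women", colour="red", heel="heeled", size=8)
print(result)
```

The filtering only works because every item was tagged consistently in the first place, which is where the semantics discussed next come in.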

The role of semantics

When we talk about content filters and ‘tagging up’ content, we’re actually talking about semantics. Creating content that has enough semantics to meet all of the demands on it is hard and complicated to do in a WCMS. The content has to have enough semantics, and the right semantics, for the underlying systems to understand under what conditions to display specific content.

That’s the difference between being shown the expected products and being shown something completely unexpected. Deane Barker, author of the O’Reilly book, Web Content Management, and blogger at Gadgetopia, describes the folly of not paying enough attention to what happens in the holes. And as the saying goes, therein lies the problem.

This is why companies that need to respond to market conditions in a hurry, or that want to output to multiple devices, channels, markets, or audiences, don’t put their content directly into a WCMS. They put their content into fit-for-purpose systems, and then let the WCMS do what it does best – pipe the content into the right holes.

Deane Barker article: Editors Live in the Holes

Making content more intelligent

There are structured authoring environments that push the ability to manipulate content at more granular levels. These haven’t been as popular among digital agencies but have long been staples of organisations that have to control and publish vast amounts of product content, particularly content audited by regulators. These typically replace the tangle of tools (word processing, email, JIRA, and other clunky kludges):

  • a CCMS (Component Content Management System) in which authors use recognised schemas (DITA, DocBook, S1000D) to structure content
  • a HAT (Help Authoring Tool), which uses custom schemas to structure content
  • an XML editor, which works with a CCMS or, on some occasions, a WCMS

In these cases, the authors take control of the content elements and attributes, which at delivery time get processed through a ‘build’ (much like a software build), and which then get pushed into downstream systems such as the WCMS.
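A toy version of such a build is sketched below. The XML is a simplified, DITA-like fragment rather than a valid DITA topic, and the build and to_html functions are invented for the example. The audience attribute is borrowed from DITA’s conditional-processing attributes, but a real build would use a proper publishing toolchain, not a few lines of script.

```python
import xml.etree.ElementTree as ET

# Simplified, DITA-like source. The attribute says who each paragraph is for.
SOURCE = """<topic id="flags">
  <title>Index flags</title>
  <p>Mark pages without damaging them.</p>
  <p audience="trade">Bulk pricing is available on request.</p>
  <p audience="retail">Available in stores.</p>
</topic>"""


def build(xml_text, audience):
    """Parse the source and drop top-level elements aimed at another audience."""
    root = ET.fromstring(xml_text)
    for child in list(root):  # iterate over a copy because we remove as we go
        target = child.get("audience")
        if target is not None and target != audience:
            root.remove(child)
    return root


def to_html(root):
    """Turn the filtered topic into a simple HTML fragment for the WCMS."""
    parts = [f"<h1>{root.findtext('title')}</h1>"]
    parts += [f"<p>{p.text}</p>" for p in root.findall("p")]
    return "\n".join(parts)


trade_html = to_html(build(SOURCE, "trade"))
retail_html = to_html(build(SOURCE, "retail"))
print(trade_html)
```

The author writes the content once and marks who it is for. The build decides what each downstream system receives.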

Someone asked me whether using one of the popular content markup standards, specifically DITA, meant losing out on the ability to easily re-use and re-purpose content for different media and devices. Actually, it’s the other way around. Creating highly semantic content, or ‘intelligent content’, means being able to re-use and re-purpose content with ease and agility.

Ann Rockley article: What is intelligent content?
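The re-use argument is easy to see in miniature. In the sketch below, one structured piece of content is rendered three ways without being rewritten. The content, the field names and the three functions are assumptions made for the example, and the HTML version skips escaping to keep it short.

```python
import json

# One structured piece of content (invented), with no presentation baked in.
content = {
    "title": "Index flags",
    "summary": "Mark pages without damaging them.",
    "features": ["Repositionable", "Writable surface"],
}


def to_html(item):
    """Web page fragment. Values are not escaped in this short sketch."""
    features = "".join(f"<li>{feature}</li>" for feature in item["features"])
    return f"<h1>{item['title']}</h1><p>{item['summary']}</p><ul>{features}</ul>"


def to_text(item):
    """Plain text, for example for an email or a text-only channel."""
    lines = [item["title"].upper(), item["summary"]]
    lines += [f"- {feature}" for feature in item["features"]]
    return "\n".join(lines)


def to_json(item):
    """Machine-readable form, for example for an app or another system."""
    return json.dumps(item, sort_keys=True)


print(to_html(content))
print(to_text(content))
print(to_json(content))
```

Because the structure says what each piece is, not how it should look, adding another channel means adding another rendering function, not another copy of the content.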

Content trade-offs

Intelligent content and schemas such as DITA are not for companies that have a few thousand pages of highly crafted marketing content that rarely changes. For those organisations, it may be enough to enter content into forms where, after clicking ‘Submit’, it eventually gets piped into the holes in the templates.

Intelligent content is for companies with enough content to warrant having content developers who are trained professionals. They need to understand:

  • the theory behind structured content
  • how to write for a structured authoring environment
  • how to apply semantics and metadata
  • how to craft content for a multichannel publishing environment

It’s important to know that both options exist, and when to use the right option. By understanding how content gets moved around by systems until it is presented to end users, we can make better decisions about how and where we should be creating content.