Holes in the template: piping content into a web CMS

When companies have large quantities of content – for example, many products, where each one has several pieces of information – that product information probably doesn’t originate from their web content management system (WCMS). The WCMS acts as a ‘presentation layer’ – in other words, a mechanism to display content.

The content doesn’t have to live in a single system. In fact, there may be multiple systems that feed different types of information, both content and data, into the ‘presentation layer’ of the WCMS. This isn’t a bad thing. Different end systems are optimised to input, store and process particular kinds of content or data.

In this post, we look at how content works within a WCMS, and how a WCMS works with other systems to present content in ways that create richer information so that content consumers can make sense of, and make decisions with, that content.

What constitutes a product?

Before we talk about how content comes together, we need to ask what information is actually needed. This search on the site for Post-it® Flags returned a typical result. The product consists of a few elements that we as content consumers recognise on the page:

  • images of the product
  • a short description
  • a price
  • quantity available
  • description
  • features
  • accessories
Results of a search on the site for 'Post-It flags'

A product on the site

This is what gets displayed on the screen, but there’s a whole other layer of technical information behind the screen that makes it possible for us to search for a product and see all of the bits of information we need to make sense of it.

To make sense of the technical side of what makes up a product, we need to look at the markup language behind the scenes. Content that needs to work between systems – for example, on different sites – needs to use a standard that can be understood across those systems. This has led to using international standards. In this case, we would use the markup language, or ‘schema’, specific to a product.

The set of standards preferred by search engines, at schema.org, defines the superset of elements that make up a product. There are 55 elements in that superset, with some of those elements (such as ‘Offer’) being schemas of their own.

This is important to our discussion, specifically because we need to realise that what happens behind the scenes becomes critical to automating the delivery of the on-page content.
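
To make the behind-the-scenes layer a little more concrete, here is a minimal sketch of a product described with a handful of schema.org Product properties and serialised as JSON-LD. The property names (name, description, image, sku, offers, price, priceCurrency, availability) come from the schema.org vocabulary; the product itself, its values, and the to_json_ld helper are made up for illustration.

```python
import json

# Hypothetical product record; every value here is invented for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example sticky flags",
    "description": "Repositionable flags for marking pages.",
    "image": ["https://example.com/images/flags.jpg"],
    "sku": "EX-0001",
    # 'offers' holds an Offer, which is a schema of its own.
    "offers": {
        "@type": "Offer",
        "price": "3.49",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
    },
}


def to_json_ld(record: dict) -> str:
    """Serialise a record as JSON-LD text that a page template could embed."""
    return json.dumps(record, indent=2)


print(to_json_ld(product))
```

Because the structure is predictable, any system that understands the schema can pick out the price or the availability without knowing anything about the page it will end up on.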

Product on schema.org

Some of the elements from the product schema

Product from

Presenting products in a WCMS

A typical WCMS is optimised for presenting content in complex ways. It doesn’t really ‘manage’ the content, but does excel at presentation of content and important post-publication functions, like providing a hook for gathering analytics. The WCMS allows all of the information to be aggregated into rich product descriptions and converge onto a single page. The front-end WCMS is good at presenting all of this information for consumers to understand the content in context.

For example, my favourite shoe store, Fluevog, can display shoes filtered by:

  • wearer type
  • size
  • style
  • colour
  • heel height
  • shoe family

A consumer can zoom in, check the fit (on a scale of narrow to wide), see the price, and see how many are left in stock.

Shoes on the Fluevog site after the search has been filtered

A Fluevog search after refining the search using filters

Shoes are probably on the simpler side of the product spectrum. In B2B situations, products could:

  • come bundled with other products
  • be subject to bulk discounts
  • have geographic restrictions around where they can be sold
  • display company-specific information for buyers who have log-ins

A decent enterprise WCMS can calculate, based on programmed-in business logic, what gets shown where – which products, which currency, which bundling options, inventory levels, and so on.

Different systems, different functions

While a WCMS is sophisticated about the way it presents content, storing all of the content in a WCMS doesn’t make sense. The product information, such as attributes, pricing, and so on, needs to be stored in systems that are meant to manipulate content or data in particular ways.

Some systems have specialty functions to manipulate content at a granular level. Others have specialty functions to manipulate data – for example, a pricing tool may convert between currencies, round up or down, calculate volume discounts, and add the appropriate taxes per country. These back-end systems are generally highly configured, and the content in them is highly structured and tagged so that it can automatically be pushed to the display layer in a WCMS.
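
As a rough illustration of the kind of work a pricing tool does, the sketch below converts a currency, applies a volume discount, adds tax, and rounds the result. The exchange rate, tax rates, discount tiers, and function names are all assumptions invented for the example; a real pricing system would own and maintain this reference data.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical reference data, invented for this sketch.
EXCHANGE_RATES = {("GBP", "EUR"): Decimal("1.20")}
TAX_RATES = {"DE": Decimal("0.19"), "GB": Decimal("0.20")}
# (minimum quantity, discount), ordered from the largest tier down.
VOLUME_DISCOUNTS = [(100, Decimal("0.10")), (10, Decimal("0.05"))]


def volume_discount(quantity: int) -> Decimal:
    """Return the discount for the largest tier the quantity qualifies for."""
    for min_qty, discount in VOLUME_DISCOUNTS:
        if quantity >= min_qty:
            return discount
    return Decimal("0")


def quote(unit_price: Decimal, quantity: int,
          source: str, target: str, country: str) -> Decimal:
    """Convert currency, apply the volume discount, add tax, round to 2 places."""
    rate = Decimal("1") if source == target else EXCHANGE_RATES[(source, target)]
    net = unit_price * rate * quantity * (Decimal("1") - volume_discount(quantity))
    gross = net * (Decimal("1") + TAX_RATES[country])
    return gross.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(quote(Decimal("10.00"), 10, "GBP", "EUR", "DE"))  # 135.66
```

The WCMS never needs to know how the figure was calculated. It only needs the finished number to display.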

This kind of content, and the data that goes with it, could be displayed in two ways: a view that looks relatively dry – think of an Excel spreadsheet view that lists sizes and colours – or a view that makes the information clearer and more enticing to whoever is consuming the content. Technologists call this “decorating the data”. Seth Gottlieb explains more about this in a post on his blog.

Seth Gottlieb article: The CMS Decorator Pattern

Multiple systems, each fit for purpose

For an organisation with any significant amount of product information, there is a high probability that the images come from one system, the descriptions live elsewhere, the attributes come from yet another system, the price comes from a dedicated pricing tool, the delivery information is calculated and delivered from another system, and the ratings served up from yet another system.

Search result from Amazon for portable headphones

These elements can be recognised in the description of the headphones shown here, taken from an Amazon search. Again, this is to be expected – sometimes there are over 50 systems working together in complex enterprise solutions. It’s fine, as long as the systems are configured well and work together seamlessly to either push the content into the WCMS or to allow the WCMS to pull the content on demand.

Putting holes in the template

Multiple systems work together by putting ‘holes in the template’ and calling scripts that fetch the right information to populate those holes. It sounds simple, but there’s actually a lot of complexity to the equation.

A typical complement of systems that work together could be:

  • ERP (Enterprise Resource Planning) system, which pushes data (SKUs, prices, etc) into the WCMS
  • PIM (Product Information Management) system, which pushes product content and attributes into the WCMS
  • DAM (Digital Asset Management) system, which pushes binary files (images, video, audio, PDFs etc) into the WCMS
  • TrM (Translation Management) system, which manages language and other market variants behind the scenes
  • TxM (Taxonomy Management) system, which controls the terminology and tags to optimise search

These are parallel processes. And just as you can ‘tag up’ content in many different ways, these systems can deliver that same content according to many different criteria.
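
A minimal sketch of the ‘holes in the template’ idea, using Python’s standard string.Template. Each $placeholder is a hole, and each hole is filled from a different system. The three fetch functions are hypothetical stand-ins that return canned values; in a real set-up they would be calls to the PIM, ERP, and DAM, and the content might just as well be pushed into the WCMS rather than pulled as it is here.

```python
from string import Template

# The template has three 'holes': $name, $image_url and $price.
PRODUCT_TEMPLATE = Template(
    '<h1>$name</h1>\n<img src="$image_url">\n<p>Price: $price</p>'
)


# Hypothetical stand-ins for calls to the back-end systems.
def fetch_from_pim(sku: str) -> dict:
    return {"name": "Example headphones"}


def fetch_from_erp(sku: str) -> dict:
    return {"price": "49.99 GBP"}


def fetch_from_dam(sku: str) -> dict:
    return {"image_url": f"https://example.com/assets/{sku}.jpg"}


def render_product(sku: str) -> str:
    """Collect values from each system and pour them into the template."""
    values = {}
    for fetch in (fetch_from_pim, fetch_from_erp, fetch_from_dam):
        values.update(fetch(sku))
    # substitute() raises KeyError if a hole has no value, so gaps surface early.
    return PRODUCT_TEMPLATE.substitute(values)


print(render_product("HP-100"))
```

The complexity hinted at above lives in everything this sketch leaves out: what to do when a system is slow or unavailable, which language variant to request, and which rules decide what a given visitor is allowed to see.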

For example, the content can be shown according to specific reader profiles. This could mean that a content consumer logs in as a premium-package member and sees something different to a standard-package member. Or a corporate buyer sees something different to a retail shopper. Or that a reader chooses some filters (women’s shoes, red, heeled, size 8) and sees content specific to their criteria.
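
The profile and filter examples above can be sketched as a simple tag match. This is an illustration only: the ContentItem structure, the tag names, and the audience labels are assumptions, not the data model of any particular WCMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    title: str
    tags: frozenset       # descriptive tags, e.g. colour or size
    audiences: frozenset  # reader profiles allowed to see this item


# Hypothetical catalogue, invented for this sketch.
CATALOGUE = [
    ContentItem("Red heeled shoe",
                frozenset({"womens", "red", "heeled", "size-8"}),
                frozenset({"retail", "corporate"})),
    ContentItem("Black flat shoe",
                frozenset({"womens", "black", "flat", "size-8"}),
                frozenset({"retail"})),
    ContentItem("Bulk order sheet",
                frozenset({"womens", "red", "heeled", "size-8"}),
                frozenset({"corporate"})),
]


def visible_items(profile: str, filters: set) -> list:
    """Return titles the profile may see that carry every selected filter tag."""
    return [
        item.title
        for item in CATALOGUE
        if profile in item.audiences and filters <= item.tags
    ]


chosen = {"womens", "red", "heeled", "size-8"}
print(visible_items("retail", chosen))     # ['Red heeled shoe']
print(visible_items("corporate", chosen))  # ['Red heeled shoe', 'Bulk order sheet']
```

The same filters produce different results for different profiles, which is only possible because each item carries tags that say what it is and who it is for.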

The role of semantics

When we talk about content filters and ‘tagging up’ content, we’re actually talking about semantics. Creating content that has enough semantics to meet all of the demands on it is hard and complicated to do in a WCMS. The content has to have enough semantics, and the right semantics, for the underlying systems to understand under what conditions to display specific content.

That’s the difference between being shown the expected products and being shown something completely unexpected. Deane Barker, author of the O’Reilly book, Web Content Management, and blogger at Gadgetopia, describes the folly of not paying enough attention to what happens in the holes. And as the saying goes, therein lies the problem.

This is why companies that need to respond to market conditions in a hurry, or that want to output to multiple devices, channels, markets, or audiences, don’t put their content directly into a WCMS. They put their content into fit-for-purpose systems, and then let the WCMS do what it does best – pipe the content into the right holes.

Deane Barker article: Editors Live in the Holes

Making content more intelligent

There are structured authoring environments that push the ability to manipulate content at more granular levels. These haven’t been as popular among digital agencies but have long been staples of organisations that have to control and publish vast amounts of product content, particularly content audited by regulators. These typically replace the tangle of tools (word processing, email, JIRA, and other clunky kludges):

  • a CCMS (Component Content Management System) in which authors use recognised schemas (DITA, DocBook, S1000D) to structure content
  • a HAT (Help Authoring Tool), which uses custom schemas to structure content
  • an XML editor, which works with a CCMS or, on some occasions, a WCMS

In these cases, the authors take control of the content elements and attributes, which at delivery time get processed through a ‘build’ (much like a software build), and which then get pushed into downstream systems such as the WCMS.
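
Here is a small sketch of what such a ‘build’ might do with attributes the authors have applied. The markup is a simplified, made-up topic rather than real DITA or DocBook, and the audience attribute values are assumptions; the point is only that the same source yields different output depending on the build settings.

```python
import xml.etree.ElementTree as ET

# A simplified, invented topic; real DITA or DocBook markup is richer than this.
SOURCE = """
<topic id="battery">
  <title>Replacing the battery</title>
  <p>Switch the device off first.</p>
  <p audience="technician">Discharge the capacitor before opening the case.</p>
  <p audience="consumer">Take the device to a service centre.</p>
</topic>
"""


def build(source: str, audience: str) -> list:
    """Keep untagged paragraphs plus those tagged for the given audience."""
    topic = ET.fromstring(source.strip())
    output = [topic.findtext("title")]
    for para in topic.findall("p"):
        if para.get("audience") in (None, audience):
            output.append(para.text)
    return output


for line in build(SOURCE, "consumer"):
    print(line)
```

Running the same build with "technician" instead of "consumer" swaps the last paragraph, without anyone editing the source.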

Someone asked me whether using one of the popular content markup standards, specifically DITA, meant losing out on the ability to easily re-use and re-purpose content for different media and devices. Actually, it’s the other way around. Creating highly semantic content, or ‘intelligent content’, means being able to re-use and re-purpose content with ease and agility.

Ann Rockley article: What is intelligent content?

Content trade-offs

Intelligent content and schemas such as DITA are not for companies that have a few thousand pages of highly crafted marketing content that rarely changes. For those organisations, it may be enough to enter content into forms where, after clicking ‘Submit’, it eventually gets piped into the holes in the templates.

Intelligent content is for companies with enough content to warrant having content developers who are trained professionals. They need to understand:

  • the theory behind structured content
  • how to write for a structured authoring environment
  • how to apply semantics and metadata
  • how to craft content for a multichannel publishing environment

It’s important to know that both options exist, and when to use the right option. By understanding how content gets moved around by systems until it is presented to end users, we can make better decisions about how and where we should be creating content.

How to cope with the increased demands on content

The complexity of producing and delivering content has grown exponentially over the past couple of decades, as the demands for content have grown. In simpler times, content was produced as a single-channel deliverable. We would write an article for a magazine, or a user guide, or a maintenance manual. There was one piece of content and one deliverable.

Writing content in simpler times

When the web came along, things changed considerably. We made the transition from writing in the book model for print and chunking the copy up for the web, to writing in topics for the web and then stitching the contents together for the print version that got delivered to customers.

For the most part, we still worked alone on a content deliverable. Each person on a team would be assigned an area to cover. For example, a company that produced a product would have:

  • marketing collateral in print done by a marketing team
  • marketing collateral on the website done by a digital marketing team
  • a user guide done by a technical communicator/technical author
  • a maintenance manual done by a different technical communicator
  • PDFs of the product material, uploaded (and forgotten) by a webmaster

Content got more complicated

As time went on, content got more complicated. The inconsistencies between digital and traditional channels became more apparent to customers, and less tolerated by them. There were more demands on content, and more channels demanding content to fill them. There was not only the marketing funnel waiting to be filled, which makes up about 20% of any large website, but also product support material, the other 80%. Traditional product content was needed, such as quick start guides, user guides, training manuals, and service-centre material. New channels also needed content: forums, knowledge bases, social, and so on. And that didn’t account for the additional channels for that content, such as tablets, smartphones, and wearables, or for newer channels such as chat bots.

Multiplicity and the demands on content

Organisations are now in a situation where the volume of content and number of delivery variables means that the complexity of producing and delivering content has reached a tipping point. The demands on the business, the content developers, the technologies, and the content itself have grown exponentially, and it’s harder and harder to keep up.

For a moment, let’s picture 4 unique pieces of content that come together to describe a feature of a product. Now let’s say that that particular feature is used in 4 different product lines; that content is now being used 16 times. Now, imagine that each product line has 4 products that use that feature. Those 4 pieces of content get repeated 64 times. Now, multiply by 4 delivery channels, and that means those original 4 pieces of content are used a whopping 256 times. That’s a lot of copy-and-pasting.
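
The arithmetic is easy to check by listing every combination. In this sketch the piece, line, product, and channel names are placeholders; only the counts come from the example above.

```python
from itertools import product


def enumerate_uses(pieces, product_lines, products_per_line, channels):
    """List every (piece, line, product, channel) combination that needs a copy."""
    return list(product(pieces, product_lines, products_per_line, channels))


pieces = [f"piece-{n}" for n in range(1, 5)]
product_lines = [f"line-{n}" for n in range(1, 5)]
products_per_line = [f"product-{n}" for n in range(1, 5)]
channels = [f"channel-{n}" for n in range(1, 5)]

uses = enumerate_uses(pieces, product_lines, products_per_line, channels)
print(len(uses))  # 4 * 4 * 4 * 4 = 256
```

Add a fifth channel and the total jumps to 320, which is why each new variable hurts more than the last.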

4 pieces of content can be used 256 times.

How 4 pieces of content can balloon to 256 different uses.

This example of multiplicity is not overstated. In fact, the phenomenon is all too common. As organisations develop more products and services, they create more content deliverables to support them, and deliver that content through multiple channels. At best, tracking where content is used and re-used is laborious and time-consuming. At worst, it becomes a maintenance nightmare.

Finding a way to cope

How are organisations coping with this explosion of content? In my experience, not well. Too many clients have finally broken down and sought help because they’ve run out of spreadsheet management capacity – even in environments with a web CMS. Yet the demands on content continue to grow, and a greater level of sophistication is needed to deliver on the value propositions anticipated by the business.

So how can organisations cope? With a CODA (Create Once, Deliver Anywhere) strategy, based on the COPE (Create Once, Publish Everywhere) strategy used by NPR (National Public Radio) in the US. The basic idea is that a piece of content can be created once, and then re-used through automation, instead of using a copy-and-paste approach.

By pulling content into the many places it gets used, content developers experience a marked decrease in maintenance effort. After all, CODA also means Fix Once, Fix Everywhere. This is because when content is re-used by ‘transclusion’, the original piece of content is the only actual instance of the content. All of the other ‘copies’ are actually only references to the original. Fix a typo in the original piece of content, and all of the derivative content is automatically fixed as well.
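
A toy sketch of transclusion, to show why one fix reaches every deliverable. The {{id}} reference syntax, the fragment store, and the render function are all invented for this example; real systems have their own mechanisms for referencing content.

```python
# Single source of truth: each fragment is stored exactly once, keyed by id.
FRAGMENTS = {
    "warranty": "This product comes with a two-year waranty.",  # note the typo
}

# Deliverables hold references to fragments (a hypothetical {{id}} syntax), not copies.
DELIVERABLES = {
    "web_page": "Overview. {{warranty}}",
    "user_guide": "Chapter 1. {{warranty}}",
}


def render(name: str) -> str:
    """Resolve every fragment reference in a deliverable at delivery time."""
    text = DELIVERABLES[name]
    for fragment_id, body in FRAGMENTS.items():
        text = text.replace("{{" + fragment_id + "}}", body)
    return text


# Fix the typo once, in the original fragment...
FRAGMENTS["warranty"] = "This product comes with a two-year warranty."

# ...and every deliverable picks up the correction the next time it is rendered.
print(render("web_page"))
print(render("user_guide"))
```

Nobody had to find the deliverables or open them; the references did the work.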

Multichannel content

How 4 pieces of content can exist in multiple channels, in multiple contexts.


What goes into CODA

Creating CODA content is based on the principles of intelligent content. This means that content is structurally rich and semantically categorised. The definition, created by Ann Rockley, goes on to say that this makes content automatically discoverable, re-usable, reconfigurable, and adaptable. Those may sound like technical benefits, so perhaps they are best rephrased in business terms.

  • Business efficiency. With less maintenance overhead, content developers can focus less on low-level tasks like searching for duplicate content and filling in spreadsheets, and spend more time on value-add activities. On one recent project, a particular task that took several staff several months to complete could have been completed in a matter of minutes, had the content been highly structured and semantically categorised.
  • Accountability. When a CODA framework is implemented well, there is a granular audit trail that would make any auditor swoon with delight.
  • Accuracy. Brand, marketing, legal, and compliance are all concerned with content accuracy. Having a single source of truth to draw from means fewer mistakes, fewer review cycles, and fewer legal checks before content goes live.
  • Personalisation. Whatever personalisation means to your organisation, it is more easily done within a CODA framework. The semantics added to content means the content is adaptive – in other words, it’s easier to change a sentence or two within a message to reach a different audience, to vary an offering, to output specific parts of a block of content to different devices, and so on. This can be done without losing the context, and makes maintenance so much easier.
  • Extension of reach. The idea that content can be produced in a tighter way also means that the company can leverage the content in new ways. Going into new markets, adding new product lines, taking new languages on board – all of these are possibilities that can be supported with content. No more lag between the intent and action.
  • Dynamic publishing. In companies with large quantities of content, the ability to publish content on the fly, collect existing content into new contexts, and create new assets for customers, whether paid or promotional, becomes a competitive advantage.

Adopting CODA

A logical question is, “If CODA is so good, why isn’t everyone doing it?” The content developers who have been doing CODA for decades ask that question a lot. It’s a technique that has been used extensively for large bodies of content (in all fairness, the technique has traditionally been applied to post-sales content such as technical documentation, customer support content, and training material) to cope with demanding production schedules and a high likelihood of post-publication maintenance.

However, as the complexity of content delivery grows and the demand on content grows with it, well-structured, highly semantic content will need to become the norm. This has implications for all areas of business, from how we create content to how we deliver it, and all the steps in between.


© Scroll Ltd.