Topic-based authoring: The undiscovered country

Many software companies, when they start out, provide user documentation as downloadable PDFs or as web pages. As they develop more products and versions, and as they expand into countries that use different English spellings, the number of documents can grow until it becomes hard to keep them all up to date.

It’s at this point that they tend to call a specialist technical writing company (such as Cherryleaf) to see if they can fix the problem for them. We find they’ve usually had a brief look at a Help Authoring tool, such as Flare or RoboHelp, and can see that it would solve a lot of their problems. However, they’re often not really sure how to use these tools in the best way.

Although topic-based authoring has been around for over twenty years, for many people it’s a completely new concept. It is, to quote from either Hamlet or Star Trek VI, an undiscovered country. Our meetings with them often end up focusing on the benefits of topic-based authoring.

Topic-based writing is an approach where you write pieces of text (or topics) that each typically contain a paragraph or two about a single subject. These topics can be combined to create a page in a PDF document, and they can be organised in a sequence to create an online Help system (see the topic-based authoring page in Wikipedia). It’s a modular approach to creating content. The main advantage of this approach is that the topics are often reusable; you can save time by reusing topics across different documents, and you can publish the same content to different media. For example, you could use a topic in training courseware, in a user guide and in marketing information.
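
As a rough illustration of the modular idea, here is a minimal Python sketch. The topic IDs, the topic text and the two publications are all invented for the example; a real Help Authoring tool stores and assembles topics in its own way.

```python
# Hypothetical topic library: each topic is a short, self-contained piece of text.
topics = {
    "install": "Run the installer and follow the prompts.",
    "login": "Enter your user name and password, then select Sign in.",
    "reset-password": "Select Forgot password and follow the emailed link.",
}

# Each publication is simply an ordered list of topic IDs.
publications = {
    "user-guide": ["install", "login", "reset-password"],
    "training-course": ["login", "reset-password"],
}


def build(publication_id):
    """Assemble a publication by looking up its topics in order."""
    return "\n\n".join(topics[t] for t in publications[publication_id])


def reuse_count():
    """Count how many publications use each topic."""
    counts = {t: 0 for t in topics}
    for topic_ids in publications.values():
        for t in topic_ids:
            counts[t] += 1
    return counts
```

Calling build("training-course") returns the login and reset-password topics in sequence, and reuse_count() shows which topics are shared between the two publications.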

As each topic is usually about a specific subject, and has an identifiable purpose, it can also help the writer write more clearly. If you need longer articles, you can build these up from the topics you’ve created.

It’s easy for professional Technical Authors to forget sometimes that many people have never come across this approach to writing before.

Assessing the potential savings from single sourcing

One of the main benefits from single sourcing is the ability to reuse existing content. Different departments can avoid duplicating work, which means they can save time and money.

Unfortunately, it can be difficult to quantify these savings before you move to an authoring or content management system that enables you to single source. Analysing all the existing documents in a business can be overwhelming, which means organisations often quantify the savings only after the single sourcing content management system has been implemented.

There are a few software applications that can help you analyse your existing content and determine how much duplication exists. You get a sense of how much time and effort was wasted in the past, which is a pretty good indication of how much waste you’d avoid in the future.
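
To give a feel for what such an analysis involves, here is a minimal Python sketch that finds paragraphs appearing in more than one document. It is not how any particular commercial tool works: it only catches exact matches (after ignoring case and spacing), and the function names are invented for the example. Real applications are likely to use fuzzier matching.

```python
from collections import defaultdict


def normalise(paragraph):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(paragraph.lower().split())


def find_duplicates(documents):
    """Map each repeated paragraph to the documents that contain it.

    documents: dict of document name -> full text, with paragraphs
    separated by blank lines.
    """
    seen = defaultdict(set)
    for name, text in documents.items():
        for paragraph in text.split("\n\n"):
            cleaned = normalise(paragraph)
            if cleaned:
                seen[cleaned].add(name)
    # Keep only paragraphs found in more than one document.
    return {p: sorted(names) for p, names in seen.items() if len(names) > 1}
```

Passing in a dictionary of document names and their text returns each shared paragraph alongside the list of documents that repeat it.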


Different world views of content and content strategy

There’s a wonderful German word, die Weltanschauung, which roughly translates as a view of the world. It suggests there is a framework of ideas and beliefs behind people’s descriptions of various things in the world. I was reminded of Weltanschauung at this week’s London Agile Content Meetup, where Rahel Bailie neatly summed up some of the different views of content, content strategy and single sourcing.

Baked v Fried content


CMS Wiki described baked content as “pages that have been generated by a Content Management System, but then moved to a static delivery server, which can serve them at high speed and high volume”. The word “baked” is used, because this approach means you cannot separate the content from the format afterwards. They are baked together.

“Fried” content is where the Web pages are built “on the fly” when they are requested by the end user. Rahel used the example of frying eggs: if you put too many eggs into the frying pan, you can always remove one. Fried content may take a little longer to generate than baked content, but this approach enables you to personalise and filter the content. It also means you can present the information in different ways, depending on which device a person is using.
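
The difference can be sketched in a few lines of Python. This is only an illustration: the page template, the content and the request fields (audience, device) are assumptions made up for the example, not part of any real CMS.

```python
from string import Template

PAGE = Template("<h1>$title</h1><p>$body</p>")
content = {"title": "Getting started", "body": "Welcome, $audience."}


def bake(audience):
    """Render once, ahead of time; the result is a fixed string served as-is."""
    body = Template(content["body"]).substitute(audience=audience)
    return PAGE.substitute(title=content["title"], body=body)


def fry(request):
    """Render at request time, so the output can vary by reader and device."""
    body = Template(content["body"]).substitute(audience=request["audience"])
    html = PAGE.substitute(title=content["title"], body=body)
    if request.get("device") == "mobile":
        # Hypothetical device rule: use a smaller heading on mobile.
        html = html.replace("<h1>", "<h2>").replace("</h1>", "</h2>")
    return html


baked_page = bake("everyone")  # generated once, before any request arrives
```

The baked page is the same for every visitor, whereas fry() builds a page for each request and can therefore tailor it.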

COPE through technology v COPE through authoring

COPE (Create Once, Publish Everywhere) is another way of describing single sourcing content.

“COPE through technology” is the view that the content is essentially data that can be managed through software. If you need to create a personalised or filtered view of the content, you get a developer to create that version. If you need to create a mobile-ready version of your site, again you get a developer to do this. Content is often created by completing forms, in order to create structured information.

“COPE through authoring” is the view that the writers can do all of the fine-grain manipulation of content. If you need to create a personalised or filtered view of the content, you get the Technical Author to mark up sections for those different conditions in the content itself. To quote Rahel, “You can then run a transformation script run, which compiles the content into its final form, and uploads the content to the Web CMS, or other publishing platform, for consumption and presentation.” The advantage of this approach is that it stops you from being tied to a technology or application. The disadvantage is that it relies on your writers being able to mark up and structure the text correctly.
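
A transformation script of this kind could look something like the Python sketch below. The square-bracket tag syntax is invented for the example; in practice the markup would be whatever your authoring tool or XML schema provides for conditional text.

```python
import re

# Hypothetical markup: a paragraph may start with a tag such as [admin]
# or [mobile]. Untagged paragraphs apply to every audience.
TAG = re.compile(r"^\[(\w+)\]\s*")

source = [
    "Open the Settings page.",
    "[admin] Select Users to manage accounts.",
    "[mobile] Tap the menu icon to see more options.",
    "Select Save.",
]


def transform(paragraphs, condition):
    """Keep untagged paragraphs plus those tagged for the given condition."""
    output = []
    for paragraph in paragraphs:
        match = TAG.match(paragraph)
        if match is None:
            output.append(paragraph)
        elif match.group(1) == condition:
            # Strip the tag itself before publishing.
            output.append(paragraph[match.end():])
    return output
```

Running transform(source, "admin") produces the administrator version of the procedure; the same source yields a different version for each condition, which is why the approach depends on writers applying the tags accurately.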

It’s important to be aware of these distinctions when you talk about content, content strategy and single sourcing, because your Weltanschauung may not be shared by the person you’re talking to.

See also: Introduction to Content Strategy Training – Classroom and Online Courses

Book review: Every Page is Page One

There’s a joke in education along the lines that students are taught the notes their teachers wrote down at university 20 years earlier…without going through the heads of either.

I mention this because there have been a number of technical communicators who have started to question the technical writing best practices that have been taught to student Technical Authors for the past 30+ years. At Cherryleaf, we show on our advanced technical writing techniques course how some of the largest websites have been breaking the generally accepted rules for writing User Assistance – companies that test and test again to see what works best for their users. Ray Gallon of CultureCom has been developing his cognitive approach to User Assistance, and Mark Baker has been developing and promoting the idea of “Every Page is Page One” (EPPO) Help topics.

Mark has published his ideas in a new book called “Every Page is Page One”. I was asked to review an early draft of the book, and, over Christmas, I was sent a copy of the published version.

In a nutshell, Mark’s argument is that, with Web-based content, you don’t know the context in which people are reading a Help page. You cannot assume that they have read any other pages prior to reading this topic. Therefore, you need to treat every page as Page One, the starting point, and include more introductory, contextual information in your topics. He argues that most Technical Authors have misunderstood minimalism, and the EPPO approach is actually more consistent with how John Carroll (the creator of minimalism) recommended User Assistance should be written.

The book provides recommendations on the level of detail you should include on a page before you need to create a new topic, and when and where to create links to other pages. He also compares EPPO to Information Mapping and DITA, and outlines how EPPO can complement these standards.

Reading the early PDF draft with a reviewer’s eye was a struggle at times, but reading the final version in printed book format was an easy and enjoyable exercise. Perhaps reading some sections for a second time helped, as well.

We agree with many of Mark’s ideas. We agree with the general idea of self-contained topics that provide the context for a task. We agree with the need for mini-Tables of Contents and a bottom-up approach to writing. We agree that tasks should include some contextual information. We agree online content can be atomised too much. We also liked his analysis of why screencasts are so popular, and the secrets to their success.

We have a few minor issues. Mark cautions against duplicating content on more than one Web page, because it’s bad for Search Engine Optimisation. We believe you should write efficiently in a way that’s best for the user, and that it’s up to the Search Engines to improve their algorithms so they can differentiate between “good” duplication and “bad” duplication. Google should be adapting and learning from the way good content is written, not us having to create sub-optimal content in order to satisfy Google.

It’s a book for people involved today in writing online User Assistance. Although the book is very clear and well structured, you probably need to have some experience of creating User Assistance to fully understand everything that’s covered in the book. It’s an important contribution to the discussion over whether technical communicators have focused too much on production efficiencies to the detriment of creating content that’s actually of value to their users. It’s worth getting a copy of this book.

How much content can you actually re-use when you move to single sourcing?

One of the challenges when considering moving to a single sourcing authoring environment, such as DITA, is determining the Return on Investment. This often boils down to a key question: how much content can you actually re-use?

Organisations typically attempt to answer this question in a number of ways:

  • Conducting a semi-manual information audit of the existing content to identify the number of times the same chunks of information are repeated. Unfortunately, this can be a large and lengthy exercise.
  • If the content is translated, getting reports from Translation Memory tools indicating where content might be repeated. Unfortunately, if you’re not translating your content, you won’t have this information.
  • Using benchmark industry measures. Unfortunately, these can vary enormously (from 25% to 75% content re-use), and your situation may be totally different.

In an ideal world, you’d be able to use an application that could look at all your content and give you a report telling you where content is repeated. It could do the “heavy lifting” in the information audit automatically for you. This programmatic analysis of reuse within existing content, at an affordable cost, is now starting to become possible.


Building intelligence into business documents

Often business documents, such as sales proposals and annual reports, are a joint effort between various people and departments. Producing them involves collaborative writing and incorporating existing content. For printable documents, this collaboration can make it really difficult to maintain a consistent level of quality, writing style and “look and feel”.

For Technical Communicators, there’s an opportunity to provide their organisations with systems that produce key business documents in a more efficient way. They have the skills and experience to build systems that can (a) guide a writer through the process of developing a new document, and (b) enforce content and layout standards.

The advantage of such a system is that writers who might not be familiar with writing a particular document are no longer faced with a blank sheet to begin with. It’s possible to create a system that can build the bulk of the document in a matter of minutes, leaving the writer with the task of customising the information to suit the requirements of each particular situation.

The result of this approach is that:

  • A document is pre-structured in the appropriate format
  • Mandatory information is included automatically in the document
  • Actual writing time is greatly reduced
  • The skills required to produce high quality documents are significantly lowered
  • People can contribute easily, and can be guided on how to best write their contributions
  • The organisation creates more consistent documents

To build a system like this, the organisation needs to:

  • Create a global design for the document, including contributions from other departments
  • Define the workflow and sign-off procedures
  • Develop base content and reusable content objects from various departments
  • Create boilerplate documents for various situations
  • Deploy the system to the contributors and editors
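
The core of such a system can be sketched in Python as follows. Everything here is hypothetical: the content objects, the proposal boilerplate and the function names are invented to show the principle, and a real system would add workflow, sign-off and formatting on top.

```python
# Hypothetical reusable content objects supplied by different departments.
content_objects = {
    "company-overview": "Example Ltd provides widgets to businesses.",
    "legal-notice": "All prices exclude VAT.",
}

# A boilerplate defines the section order and which sections are mandatory.
proposal_boilerplate = [
    {"heading": "About us", "object": "company-overview", "mandatory": True},
    {"heading": "Your requirements", "object": None, "mandatory": True},
    {"heading": "Terms", "object": "legal-notice", "mandatory": True},
]


def draft(boilerplate):
    """Build a first draft: prefilled where content exists, prompts elsewhere."""
    sections = []
    for section in boilerplate:
        if section["object"] is not None:
            body = content_objects[section["object"]]
        else:
            body = "[TO DO: write this section]"
        sections.append(section["heading"] + "\n" + body)
    return "\n\n".join(sections)


def missing_sections(boilerplate, document):
    """List the mandatory headings that do not appear in a document."""
    return [
        s["heading"]
        for s in boilerplate
        if s["mandatory"] and s["heading"] not in document
    ]
```

The writer starts from draft(proposal_boilerplate) instead of a blank sheet, and missing_sections() gives editors a simple check that the mandatory structure is still in place before sign-off.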

The good news is, if the organisation has a Technical Author working for it, then it already has someone with the skills and experience to carry out these tasks correctly. If yours doesn’t, then don’t forget Cherryleaf can help.

So what is Single Sourcing and what is DITA?

Single Sourcing reduces the need to create and maintain duplicate content, by enabling you to use existing “chunks” of content. This means you can have the same information in different publications, and you can have a library of existing content to re-use when you’re developing new documents.

Because the content is stored independently of the formatting, the same content can be published to different media. The chunks can be used many times to generate paper manuals, Web pages, online Help and e-learning material.

Single Sourcing can significantly improve the way you create, develop and maintain content.

What is DITA?

DITA is an increasingly popular open source XML-based framework for designing and delivering well-structured technical documentation efficiently and consistently in a single-sourcing environment. Cherryleaf can help you understand when, why, where and how to use DITA.

What an Accountant can teach a Technical Author about single sourcing

Many people struggle to see the difference between cutting and pasting and true single sourcing. It’s difficult to come up with an analogy that people understand.

One way is to compare single sourcing to accountancy, and look at the impact financial software has had on book-keeping and accounting. Before the days of computers, companies would have to enter individual sales into a number of different ledgers. If a mistake was made, you’d have to go and fix the problem in all the different books of accounts that were impacted by it. It was time consuming, lengthy and costly. Financial software applications have transformed that process, as they update all the relevant day books, ledgers and accounts automatically.

Image by takeabreak (Flickr Creative Commons)

Single sourcing works in a similar way to these financial applications, but cutting and pasting doesn’t. At a recent presentation for Author-it, its President, Steve Davis, said cutting and pasting content from one document to another simply creates “graveyard” documents. In other words, it’s fine if you want to leave a document to grow old and die, but it will cause you problems if you ever want to update the document at some stage in the future. You’ll then need to spend time searching for the places that have used that information and then recreate the content in all those different places. You’re like the book-keeper running up and down lists trying to keep them all in harmony.

Single Sourcing manages content so that it can be updated centrally: any changes to important information (such as a legal notice, company overview or terms) will be reflected everywhere that content is used. That’s because documents are not stored as files, but as chunks of information managed by a database of some form. By storing content in this way, content can be easily reused in multiple documents, ensuring greater accuracy and consistency.
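
The contrast with cutting and pasting can be shown in a short Python sketch. The chunk name, the document structure and the publish() function are assumptions for the example; a plain dictionary stands in for the database.

```python
# A dictionary stands in for the database of chunks.
chunks = {"legal-notice": "Version 1 terms apply."}

# Cut and paste: each document holds its own copy of the text.
pasted_docs = {
    "manual": "Welcome.\n" + chunks["legal-notice"],
    "help": "Overview.\n" + chunks["legal-notice"],
}

# Single sourcing: each document holds a reference to the chunk.
sourced_docs = {
    "manual": ["Welcome.", {"ref": "legal-notice"}],
    "help": ["Overview.", {"ref": "legal-notice"}],
}


def publish(parts):
    """Resolve chunk references at publish time."""
    return "\n".join(
        chunks[p["ref"]] if isinstance(p, dict) else p for p in parts
    )


# One central update to the chunk...
chunks["legal-notice"] = "Version 2 terms apply."
```

After the update, publish(sourced_docs["manual"]) and publish(sourced_docs["help"]) both show the version 2 wording, while the pasted documents still contain version 1 and would each need to be found and corrected by hand.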

It’s also likely to be at less cost – something Accountants also know a great deal about.