Cutting and pasting content into Word documents – Is there a better way?

Earlier this week, we were helping a large company finalise a bid document where they were required to use a Word file sent by their client. This involved taking content from the company’s repository of standard documents on SharePoint, and from emails, plus writing down information provided verbally by the Subject Matter Experts. The bid writing team had to cut the relevant content from a Word document (and emails, Excel spreadsheets, Visio files, Microsoft Project files and PowerPoint presentations), and then paste it into the bid document.

Before we started work on the document, this had left it with a large number of different formatting styles. For example, the content pasted from emails was in 10pt Calibri, and the content pasted from Word was in 11pt Arial. This meant the bid writing team had to spend a lot of time fixing the formatting.

This method also meant there was no reliable way to embed content, like there is, for example, in Excel – if you change a cell in Excel, related cells in other places can update themselves automatically to reflect that change. For the bid document, any changes to the source content could trigger a further round of copying and pasting into our master document.

In addition, there wasn’t a simple way of merging any improvements to the content back into the original source, in the way that programmers use Git branching and merging to update source code (illustrated in the diagrams below).

Git branching - image by Vincent Driessen

This prompted a discussion as to whether large organisations needing to produce large bid documents and large reports would ever move away from this often inefficient way of working.

Existing solutions

Of course, there are solutions out there today. For example, an organisation could:

  • Move to authoring software that:
    1. Separates the formatting from the content, and
    2. Can generate Word documents.

For example, Adobe FrameMaker, MadCap Flare, Confluence or RoboHelp. The issue with this approach is weaning staff off Word and onto a new authoring environment.

  • Move to a more complex authoring tool that:
    1. Separates the formatting from the content,
    2. Can generate Word documents, and
    3. Enables you to break documents down into individual, structured components.

For example, standardising on DITA XML. The issue with this is that it would require a great deal of retraining of staff to write in this way.

  • Use a tool that can:
    1. Manage multiple source documents, authored elsewhere, and then
    2. Generate a single Word document.

For example, Doc-to-Help or MadCap Flare. The issue with this approach is you might end up with only one person knowing how to use the tool, and therefore a potential bottleneck.

These are all viable solutions, but the potential disruption to working practices means that many companies stick with muddling through, using their existing tools for creating documents. For a lot of organisations, the problem isn’t big enough to justify the pain of fixing it.

Another approach

So let’s explore another approach – transferring content through a simple format that separates content from the formatting.

Tom Johnson has been blogging about his experience of using Markdown, and perhaps this could be a solution for corporate documents. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, and then convert it to fancier formats such as HTML or Word. Markdown has a syntax for marking text that will be displayed as a heading, a list item, and so on:

Markdown example

The overriding design goal for Markdown’s formatting syntax was to make it as readable as possible, without looking like it’s been marked up with tags or formatting instructions.
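To make the separation of content from formatting concrete, here is a minimal Python sketch that turns a tiny subset of Markdown (headings, bullet items and plain paragraphs) into HTML. It is an illustration only, not how any real Markdown processor is implemented; the sample text and the function name `markdown_to_html` are made up for this example.

```python
import html

# Hypothetical sample content: the author only marks structure, not fonts or sizes.
SAMPLE = """\
# Project approach

Our team will deliver the work in two phases.

- Phase 1: discovery
- Phase 2: delivery
"""


def markdown_to_html(text: str) -> str:
    """Convert a tiny Markdown subset (headings, bullets, paragraphs) to HTML."""
    out = []
    in_list = False
    for raw in text.splitlines():
        line = raw.strip()
        is_item = line.startswith(("- ", "* "))
        # Close an open list as soon as a non-item line appears.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue
        if line.startswith("#"):
            hashes = len(line) - len(line.lstrip("#"))
            level = min(hashes, 6)  # HTML only defines h1 to h6
            title = line[hashes:].strip()
            out.append(f"<h{level}>{html.escape(title)}</h{level}>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(line[2:].strip())}</li>")
        else:
            out.append(f"<p>{html.escape(line)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


if __name__ == "__main__":
    print(markdown_to_html(SAMPLE))
```

The source text carries no fonts or point sizes at all, so the Calibri-versus-Arial problem cannot arise; the look of the output is decided entirely at conversion time.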

There are applications that can toggle between Markdown and other formats. Below is an example of how emails can be converted to Markdown, using the Markdown Here extension for Chrome, Firefox, Safari and Thunderbird:

markdown example

To convert the Markdown document back to Word, you can use Word plugins such as Writage (or standalone tools such as Texts and Pandoc).
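As a rough sketch of the Pandoc route, the Python below builds and runs a Pandoc command that converts a Markdown file to a Word file. It assumes Pandoc 2.0 or later is installed and on the PATH; the file names are hypothetical, and the optional `--reference-doc` argument points at a Word file (for example, the client's template) whose styles Pandoc applies to the output.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_pandoc_command(
    source: Path, target: Path, reference_doc: Optional[Path] = None
) -> List[str]:
    """Return the Pandoc command line for a Markdown-to-Word conversion."""
    cmd = ["pandoc", str(source), "-o", str(target)]
    if reference_doc is not None:
        # Styles (fonts, heading formats) are taken from this Word file.
        cmd.append(f"--reference-doc={reference_doc}")
    return cmd


def convert_to_word(
    source: Path, target: Path, reference_doc: Optional[Path] = None
) -> None:
    """Run Pandoc; raises if Pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(build_pandoc_command(source, target, reference_doc), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(build_pandoc_command(Path("bid.md"), Path("bid.docx"),
                               Path("client-template.docx")))
```

Keeping the command builder separate from the function that runs it makes it easy to check what would be executed before touching any real documents.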

This approach could save the time spent fixing the formatting issues that arise when content comes from multiple sources.

There are also versions of Markdown that enable you to add metadata (giving you a potential migration path to DITA), and create the complex tables and numbering that are sometimes required in bid documents.

What about the cutting and pasting?

At this point, of course, you would still be cutting and pasting content. You would be saving time on managing formatting, but there could still be multiple versions of the same content.

The original version of Markdown lacked a notation for including files, but there are ways to introduce file transclusion and ways to embed files within files.

including files within files Image: Jason Verly
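The idea of transclusion can be sketched in a few lines of Python. The `{{include: path}}` directive used here is a hypothetical convention invented for this example, not the syntax of any particular Markdown variant or tool; real tools each define their own notation. The function replaces every directive line with the contents of the named file, resolved relative to the including file, and follows nested includes.

```python
import re
import tempfile
from pathlib import Path
from typing import Tuple

# Hypothetical directive: a line consisting only of {{include: some/file.md}}
INCLUDE = re.compile(r"^\{\{include:\s*(.+?)\s*\}\}$")


def transclude(path: Path, _seen: Tuple[Path, ...] = ()) -> str:
    """Return the text of `path` with include directives expanded."""
    resolved = path.resolve()
    if resolved in _seen:
        # Guard against a file that includes itself, directly or indirectly.
        raise ValueError(f"circular include: {path}")
    lines = []
    for line in path.read_text(encoding="utf-8").splitlines():
        match = INCLUDE.match(line.strip())
        if match:
            child = path.parent / match.group(1)
            lines.append(transclude(child, _seen + (resolved,)))
        else:
            lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "company-profile.md").write_text(
            "Standard company profile text.", encoding="utf-8")
        (root / "bid.md").write_text(
            "# Bid\n\n{{include: company-profile.md}}\n", encoding="utf-8")
        print(transclude(root / "bid.md"))
```

With this arrangement the standard text lives in one file only, so a change to the source appears in the bid document the next time it is generated, with no further copying and pasting.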

Common approaches include:

  • Use a tool that supports merging multiple Markdown files into a single document. For example, Marked 2 or Scrivener.
  • Use a command line script, such as mdmerge with MultiMarkdown, to concatenate Markdown files into a single file.
  • Use a technical authoring tool such as Doc-to-Help, RoboHelp or Flare to manage multiple source documents and generate a single Word document. You need to create a simple import and conversion template in those applications, as these tools don’t support Markdown directly yet.
  • Use Assemble. Assemble enables you to write document fragments in Markdown, so they can be included within a large document. However, Assemble doesn’t currently support pagination, navigation and indexing, so you’d need to rely on Word to add those. It’s also aimed at developers, rather than end-users.
  • If the staff are familiar with Git, write the content in a dialect such as GitHub Flavored Markdown, and use Git to manage and merge it.
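The simplest of these approaches, concatenating separate Markdown files in a fixed order, can be sketched in Python as follows. This is not how mdmerge or any of the tools above work internally; the function name and the file names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def merge_markdown(parts: Iterable[Path]) -> str:
    """Join Markdown files in the given order, separated by one blank line."""
    sections = [p.read_text(encoding="utf-8").strip() for p in parts]
    # Skip empty files so they do not leave stray blank lines behind.
    return "\n\n".join(s for s in sections if s) + "\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        summary = root / "01-summary.md"
        pricing = root / "02-pricing.md"
        summary.write_text("# Summary\n", encoding="utf-8")
        pricing.write_text("# Pricing\n", encoding="utf-8")
        print(merge_markdown([summary, pricing]))
```

The merged text could then be passed to a converter to produce the single Word document the client expects.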

We welcome your comments!

Our idea is to find a way for a typical, large organisation to make a small step: something that wouldn’t disrupt the way most people work, and would also offer a path to more advanced capabilities at some future date.

Please share your thoughts and experiences.

2 Comments

Julian Maynard-Smith

All markup languages feel like a regression to the eighties and the pre-WYSIWYG days of WordStar, WordCraft and their ilk – and still require a cultural shift and relatively steep learning curve. A wiki environment is superior.

Deep down, the real problem is less which technological solution to go with (important though this is) and more the misguided belief that information needs to be entombed in a Word document or PDF. People are already extremely comfortable with updating information on LinkedIn, Twitter, Facebook and other social media interactively online, even using their mobiles, so the “document” is already dead in the most important areas of most people’s lives. If we can all stick a stake in the vampiric heart of “documents” as the be all and end all, technical writing can finally move wholeheartedly into the 21st century.

Ted Bergeron

I like the concept of separating the content from the formatting. I use the Asciidoctor tool chain because it is like a super-set of Markdown. Asciidoctor allows comments and including other files. Asciidoctor outputs to HTML, PDF, DocBook, Reveal.js slide decks, and EPUB.

I maintain text documents in repositories that track history. Then as needed render and copy/paste the output into Word.
