Towards content lakes

One of the trends in both data and content management is the move away from silos. In data management circles, customer data is increasingly being collected and aggregated into “data lakes”. According to Margaret Rouse, a data lake is:

A storage repository that holds a vast amount of raw data in its native format until it is needed. While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data. Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags. When a business question arises, the data lake can be queried for relevant data, and that smaller set of data can then be analyzed to help answer the question…Like big data, the term data lake is sometimes disparaged as being simply a marketing label for a product that supports Hadoop. Increasingly, however, the term is being accepted as a way to describe any large data pool in which the schema and data requirements are not defined until the data is queried.

(source: what is a data lake?)

“Content lake” isn’t a term that’s used in the content management or technical communication sectors yet. Whilst it seems unlikely that end user content will grow at the same rate as other forms of data, there’s a fair chance this phrase could catch on.

A content lake is likely to have similar attributes to a data lake:

  • Content will be stored in a native format that is then changed into other formats.
  • It will use a flat architecture to store data.
  • Content will be stored in some type of structured format, perhaps XML, JSON or plain text (with AsciiDoc-like attributes assigned to certain sections). However, user documentation does not require the rigorous structure of other forms of content.
  • The content lake can be queried for relevant content, and a smaller set of information can then be extracted to help answer questions. This might not mean publishing content on-the-fly, but generating PDFs, CHM files and web-based content from a single source.
  • Rather than content simply being archived, the lake will deliver the right information in very short timeframes.
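
To make these attributes a little more concrete, here is a minimal Python sketch of what a flat, tagged content store might look like. It is an illustration only: the `ContentItem` and `ContentLake` names, the tag keys and the sample topics are all hypothetical, and a real system would add persistence, versioning and publishing on top.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """One piece of content, kept in its native format (hypothetical model)."""
    body: str
    native_format: str  # e.g. "asciidoc", "xml", "json"
    tags: dict[str, str] = field(default_factory=dict)
    # Each item gets a unique identifier rather than a place in a folder tree.
    item_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class ContentLake:
    """Flat store: no hierarchy, just items keyed by their identifier."""

    def __init__(self) -> None:
        self._items: dict[str, ContentItem] = {}

    def add(self, item: ContentItem) -> str:
        self._items[item.item_id] = item
        return item.item_id

    def query(self, **wanted: str) -> list[ContentItem]:
        """Return the items whose tags match every key/value in `wanted`."""
        return [
            item for item in self._items.values()
            if all(item.tags.get(key) == value for key, value in wanted.items())
        ]


# Sample content and tag names are invented for this example.
lake = ContentLake()
lake.add(ContentItem("== Installing\nRun the installer.", "asciidoc",
                     {"product": "widget", "audience": "admin"}))
lake.add(ContentItem("== Signing in\nEnter your password.", "asciidoc",
                     {"product": "widget", "audience": "end-user"}))
lake.add(ContentItem("<topic>Billing</topic>", "xml",
                     {"product": "gadget", "audience": "admin"}))

# Pull out the smaller set of content needed to answer one question.
admin_topics = lake.query(product="widget", audience="admin")
print([item.body.splitlines()[0] for item in admin_topics])
```

The structure is decided at query time by the tags you ask for, not by where the content was filed. The extracted set could then be passed to whatever generates the PDF, CHM or web output.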

A technical communication user’s hierarchy of needs

At the TCUK 2015 conference, Rachel Johnston mentioned the idea of a content maturity model. We thought we’d take this idea and ask:

Could we develop a model that illustrates a hierarchy of needs for users of technical communication (and in particular, User Assistance)?

A model of what?

We suggest calling this model a technical communication user’s hierarchy of needs. This is because we’re considering the different points where a user interacts with technical communication content, the information they need, and the value it gives to them.

It takes a similar approach to the content maturity model Rachel suggested (shown in the photo below), with the least mature organisations providing just the legal minimum, and most mature content systems contributing to branding and evangelism.

content maturity model diagram

A user’s hierarchy of needs also enables us to compare this model to similar models from content marketing and product design. For example, the categories in our model’s hierarchy roughly correspond to Peter Morville’s “User Experience honeycomb”, as well as the key elements in product design.

Reflections on the TCUK15 conference

I was one of the presenters at last week’s Technical Communication UK 2015 (TCUK) conference. TCUK is the Institute of Scientific and Technical Communicators’ (ISTC’s) annual conference for everyone involved in writing, editing, illustrating, delivering and publishing technical information. It’s an opportunity for Technical Communicators from the UK and mainland Europe to meet up and mingle, learn and present.

Here are my reflections on the event.

The ContentHug interviews

I was asked to take part in the ContentHug series of interviews on technical communication and content strategy.

Going through the questions was both fun and challenging.

ContentHug’s Vinish Garg is interviewing a number of consultants involved in technical communication and content strategy, and asking them essentially the same questions. By reading the interviews, you can see where there is agreement and where opinions vary. In general, there is a fair bit of consensus. They are worth reading.

Ellis will be speaking at MadWorld 2016

Cherryleaf’s Ellis Pratt will be speaking again at MadCap Software’s conference on technical communication and content strategy. MadWorld 2016 will be held from 10th to 12th April 2016 at the Hilton San Diego Resort and Spa, in San Diego, California.

Building Information Modelling (BIM) for content

Building Information Modelling (BIM) is an increasingly popular technique used in the construction industry. It involves creating XML digital models of buildings and tunnels during each stage of a project. However, these are more than just 3D animated models, as they also embed information about physical objects in the building. According to Wikipedia:

“A building owner may find evidence of a leak in his building. Rather than exploring the physical building, he may turn to the model and see that water valve is located in the suspect location. He could also have in the model the specific valve size, manufacturer, part number, and any other information ever researched in the past, pending adequate computing power.”

It means architects and engineers can “see” behind walls and discover if there are any pipes or cables that might be affected by any planned works.

This concept of an intelligent model that can be shared between stakeholders throughout the whole lifecycle is also the future for content. Organisations want the ability to know how different items of content are related, what structural and metadata information lies behind the presentation layer, and how content has developed chronologically. They want the ability to use a model to plan and modify before they start the more costly work of implementation.
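
As a rough illustration of what such a model might hold, the Python sketch below records, for each item of content, its metadata, the components it reuses and its revision history. It then answers the content equivalent of “what is behind this wall?”: which items would be affected if one component changed. Everything here is an assumption made for the example, including the `ContentNode`, `Revision` and `affected_by` names, the sample topics and the date.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Revision:
    when: date
    summary: str


@dataclass
class ContentNode:
    """One item in a hypothetical content model."""
    node_id: str
    metadata: dict[str, str]
    reuses: list[str] = field(default_factory=list)        # ids of components this item pulls in
    history: list[Revision] = field(default_factory=list)  # how the item has developed over time


def affected_by(model: dict[str, ContentNode], changed_id: str) -> set[str]:
    """Find every item that directly or indirectly reuses `changed_id`."""
    affected: set[str] = set()
    frontier = [changed_id]
    while frontier:
        current = frontier.pop()
        for node in model.values():
            if current in node.reuses and node.node_id not in affected:
                affected.add(node.node_id)
                frontier.append(node.node_id)
    return affected


# Invented sample data: a warning reused by a procedure, which is reused by a guide.
model = {
    node.node_id: node
    for node in [
        ContentNode("safety-warning", {"type": "warning"},
                    history=[Revision(date(2015, 3, 1), "First draft")]),
        ContentNode("install-procedure", {"type": "task"}, reuses=["safety-warning"]),
        ContentNode("admin-guide", {"type": "manual"}, reuses=["install-procedure"]),
        ContentNode("release-notes", {"type": "reference"}),
    ]
}

# Check the impact of a planned change before doing the work.
print(sorted(affected_by(model, "safety-warning")))
```

In this sketch, changing the safety warning flags both the installation procedure and the admin guide, while the release notes are untouched. That is the kind of planning question a model lets you ask before implementation begins.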

BIM could perhaps provide a useful analogy for Technical Authors, procedures writers, and others developing text-based content, when they are explaining the purpose and value of structured content, single sourcing and Component Content Management Systems.

Teachers need content management systems, too

The Guardian has an article today called “Teachers and parents criticise ‘robotic’ software-generated school reports”. It explains that teachers are finding report-writing software isn’t meeting their needs:

“It often frustrated as none of the options would quite capture what he wanted to say about a child and the end product was never satisfactory.”

It states that, as an alternative, some teachers keep a comment bank, from which they cut and paste into school reports. One teacher said:

“I’ve got a bank of literary comments, maths comments and general comments. You can pick one that sounds about right, whip it out and plonk it in.”

A better solution might be a content management system that could contain a single-sourced comment bank, templates and some advice on what to write where.
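
Here is a small Python sketch of how that might work, using only the standard library’s `string.Template`. The comment bank entries, the `draft_report` function and the pupil details are invented for illustration. The design choice to note is that reusable comments are written once and stored centrally, while the teacher’s own personal note is always added to the result.

```python
from string import Template

# Hypothetical single-sourced comment bank: each comment is written once,
# keyed by subject and attainment level, with placeholders for the child.
COMMENT_BANK = {
    ("maths", "strong"): Template(
        "$name solves problems confidently and explains $their reasoning."),
    ("maths", "developing"): Template(
        "$name is growing in confidence with number work."),
    ("literacy", "strong"): Template(
        "$name reads widely and writes with imagination."),
}


def draft_report(name: str, their: str,
                 choices: list[tuple[str, str]],
                 personal_note: str) -> str:
    """Fill in the chosen bank comments, then append the teacher's own note."""
    sentences = [
        COMMENT_BANK[key].substitute(name=name, their=their) for key in choices
    ]
    sentences.append(personal_note)
    return " ".join(sentences)


report = draft_report(
    "Sam", "her",
    [("maths", "strong"), ("literacy", "strong")],
    "Sam's poem about the school trip was a highlight of the term.",
)
print(report)
```

If a comment in the bank is reworded, every future report picks up the change. The personal note keeps the report from reading as purely software-generated.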

As the spokesman for the National Association of Head Teachers said:

“Headteachers invest a lot of time and effort into making sure this happens. Technology can help that process but it should never get in the way of a truly personal report for each and every child in the school.”