Archive of Single Source posts

 
 

Content Analysis for re-use

The basic premise of “single source” can be summed up in one word.

Re-use.

Sounds simple enough, but there is a wealth of analysis and work required before that somewhat elegant aim can be met.

Analysing your content for potential re-use opportunities is, by and large, an onerous task. Whether you do it all on paper, printing out reams of documentation and annotating by hand, or electronically, compiling spreadsheets using colour coding or obscure (“they made sense to me at the time”) codes, it takes time to do it properly and there are no shortcuts. Sorry to break it to you so bluntly.

However, it does mean that you are forced to spend some time re-reading your content, content which you might not have visited for some time or, in some cases, may not have written yourself. You’ll likely find inconsistencies in the content itself, styling errors and quite probably a completely different writing style. Whilst it may seem obvious, I urge you to resist the temptation, should it arise, to start editing as you go along.

My basic understanding of single source, and the re-use of information, is that there are times when you’ll need to rewrite content so it can be more easily used in multiple locations. A change of tense perhaps, or a rephrasing or reconstruction of a sentence, may be all that is required and, hell, if you have the document open in front of you, why not just go ahead and make that change? Suffice to say that editing content while you are analysing it has only one potential outcome. Chaos. Regardless of how well organised or how well planned your analysis is, if you start making changes to your content on the fly, you will soon find yourself with a blurred view of the very thing you are trying to analyse.

Yeah, I know. It sounds obvious, and it is when viewed from a distance.

However, what I really wanted to discuss, for I’m by no means certain on this, is at what point content becomes too granular. If I want to re-use a paragraph then, obviously, breaking up content to the paragraph level makes sense, but that immediately seems like overkill in many cases. So I’ve been steering away from that kind of structural thinking, away from paragraphs and sentences and towards semantically discrete blocks. So a short product description, containing a heading and a paragraph, is one block and a long product description, containing a heading and several paragraphs, is another. I’m pretty sure this is the correct approach, but it does mean that, once you’ve made that decision, you are stuck with fairly large chunks of information.

I’m hoping that this is a good balance though, for if we are to break our content into smaller granules, the overhead of maintaining and manipulating them surely increases. Remember, in a single source system we are concerned with more than content. We also have to contend with the metadata associated with that content, and the more pieces of information we have to maintain, the greater the risk that the metadata becomes so complex as to be useless.
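
To put a rough shape on that worry, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Block` class, its field names and the sample metadata are not taken from any particular single source tool. It models a long product description as one block, then shows what happens to the metadata if the same content is split down to paragraph level.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One semantically discrete chunk, e.g. a long product description."""
    block_id: str
    kind: str  # hypothetical type label
    heading: str
    paragraphs: list[str]
    metadata: dict[str, str] = field(default_factory=dict)


def split_to_paragraphs(block: Block) -> list[Block]:
    """Break a block into paragraph-level blocks, copying its metadata.

    Every resulting piece carries its own metadata record, which is the
    maintenance overhead that finer granularity brings.
    """
    return [
        Block(
            block_id=f"{block.block_id}-p{i}",
            kind="paragraph",
            heading=block.heading,
            paragraphs=[text],
            metadata=dict(block.metadata),
        )
        for i, text in enumerate(block.paragraphs, start=1)
    ]


# Sample content, made up for illustration.
long_description = Block(
    block_id="prod-desc-long",
    kind="long_product_description",
    heading="About the product",
    paragraphs=["First paragraph.", "Second paragraph.", "Third paragraph."],
    metadata={"product": "example", "audience": "end-user"},
)
```

Calling `split_to_paragraphs(long_description)` turns one block with one metadata record into three blocks with three records to keep in step, and that is before anyone changes the audience for just one of them.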

I think. Maybe. I’m really not that sure.

Have you conducted any content analysis? If so how did you approach the granularity issue? I get the sense that, for a lot of people, the level of granularity is reached once the content analysis is complete, that it basically decides itself.

As we slowly progress towards a single source solution, I’m intrigued as to what to expect next, so any thoughts or comments are much appreciated. After all, all the articles, conferences and books in the world can’t replace real life experience.

Notes:
This post was, in part, inspired by pondering whether semantic analysis might be a way to tackle this but, for now, I wonder if it is perhaps a step too far for most?

CSS for layout

… and why you should use it.

Separating content from structure and style is a common theory, widely accepted by those of us either using or investigating single source solutions for our documentation. The same theory has been applied to web development and offers similar benefits.

CSS-based web design developed in parallel with the growing movement towards (and promotion of) the use of standards on the web. The web standards movement was a direct response to the increasing problems faced by web designers as they struggled to keep pace with the bespoke features introduced by the browser software of the day. Advocating support for the W3C-maintained standards, initially around HTML, the movement soon found a band of supporters who were challenging themselves, and everyone else, to stop using tables as a mechanism for controlling page layout and to switch to Cascading Style Sheets (CSS) instead.

The origin of table-based layout was, essentially, a clever hack. Early versions of HTML, and the browsers that people used to view web pages, didn’t have any way to control the layout of a page, so tables were used. Nesting tables within tables to provide discrete areas for navigation, content and so on became the norm, and some very complex examples still exist. However, as the web gained popularity and large sites started to emerge, it became apparent that table-based layouts were no longer workable. They were far too hard and too time consuming to maintain, and many web developers recognised this and started searching for a solution.

Separating the content from the layout elements was an obvious step and is easily achieved using CSS. Whilst CSS was primarily created to allow more flexible and powerful styling, it was soon evident that, as each page element can have positioning assigned, it could also be used as the positioning mechanism.

The basic theory of CSS-based layout is pretty simple. If you draw out the sections of your web page you’ll probably end up with several different blocks. One for the banner, one for the navigation, another for secondary navigation, one for the content, and so on. Each of those blocks can be positioned separately, or in relation to one another, and as each block is a uniquely identified division, all you need to do is apply layout rules to each division to position it where you want. OK, maybe it’s a little flippant to say “all you need to do”, as there is a wealth of issues to be aware of when using CSS for layout, but don’t panic. There are plenty of templates to get you started, and I’ve linked to some at the end of this post.

Mind you, this doesn’t really sound much different from using tables though. Right?

Wrong. The real power of using CSS for layout comes when you need to change the position or other layout characteristics of one of those divisions. For example, let’s say you have a set of navigation links in a column down the left of the page. In a table-based layout you’d have a separate table cell holding those links (which may in turn be held in a nested table to help you align them). Simple enough.

Now, you need to switch that list of links to the right of the page. In table-based layout you’d need to cut-n-paste that table cell and move it on EVERY PAGE in your website or across your help system. Do you fancy doing that for every page in a 500 page help system? Because I don’t.

Using CSS for layout, you’d make a change to the stylesheet (the .css file) and all the pages in your website would be updated. For a large website, or for anything more than 20 or so pages, the time savings soon become evident. I’d advocate that you take this approach even for smaller static websites because, whilst table-based layout is still possible, any minor layout change still has to be repeated across every page.
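
To make the navigation example concrete, here is a small sketch. It is written in Python purely so that it can be run and checked; the part that matters is the CSS held in the two strings. The `#nav` and `#content` ids, the widths, the file names and both helper functions are placeholders of my own, and real-world layouts need more rules than this.

```python
# Every page links to the same stylesheet and contains no layout markup.
PAGE_TEMPLATE = """<html>
<head><link rel="stylesheet" href="layout.css"></head>
<body>
<div id="nav">{links}</div>
<div id="content">{body}</div>
</body>
</html>"""

# Hypothetical layout rules: navigation column floated to the left.
NAV_LEFT_CSS = """#nav { float: left; width: 20%; }
#content { margin-left: 22%; }
"""

# The same layout with the navigation column moved to the right.
NAV_RIGHT_CSS = """#nav { float: right; width: 20%; }
#content { margin-right: 22%; }
"""


def build_site(page_bodies: dict[str, str], stylesheet: str) -> dict[str, str]:
    """Return a mapping of file name to file content for the whole site."""
    links = " ".join(f'<a href="{name}">{name}</a>' for name in page_bodies)
    site = {
        name: PAGE_TEMPLATE.format(links=links, body=body)
        for name, body in page_bodies.items()
    }
    site["layout.css"] = stylesheet
    return site


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List the files whose content differs between two builds."""
    return sorted(name for name in before if before[name] != after[name])
```

Build a 500 page site once with `NAV_LEFT_CSS` and once with `NAV_RIGHT_CSS`, pass both builds to `changed_files`, and the only file that differs is `layout.css`. None of the pages are touched.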

Ultimately, using CSS for layout isn’t really about web standards, nor is it just a trend. It’s a justified and valid use of technology to allow you to work smarter, to concentrate on the content you are delivering, and not spend a disproportionate amount of time editing multiple pages of a web-based help system or website. When your boss asks you what you did last week, what would YOU rather say?

Learning CSS-based layout is not without problems. There are still browser compatibility issues to overcome, although most are now well documented and easy to grasp, but I truly believe that it is worthwhile learning the basics. Of course, the internet being what it is, there are a myriad of templates available to get you started; in fact, some may even provide all you need.

Related reading:
Layoutomatic – offers three simple CSS-based layouts. A good way to learn the basics.
Free CSS layouts and templates – compiled by the wonderful Smashing Magazine.
Web Standards Project – keep up to date with the latest news in web standards.
CSS Zen Garden – one structured page of content, hundreds of different CSS layouts and styles. THE example of the power of CSS-based layout.
A List Apart – an excellent online magazine for web design, chock full of good stuff.

Content Audits

The basic premise behind auditing your content is to better understand both the structure and the content itself. Conceptually the idea seems simple enough, but in reality performing a content audit can be fairly boring. However, whether you are conducting the audit as part of a single source conversion project, or if you have recently inherited a large documentation set, I’d suggest that it is an excellent way to gain an understanding of what already exists and, with little guesswork on your part, start to understand what may be missing.

Content Audits are usually one of the early tasks undertaken by a team moving towards a single source publishing model, but they can also provide a clear indicator about whether you need to single source or not. For many teams the primary driver of a move towards single source comes when an additional product platform or customer is introduced, or perhaps through a requirement to translate and localise. However, a thorough audit of your content will show whether what you believe to be true is valid and may indicate that you don’t need to start single sourcing your documentation at all (you might just need to change your working practices).

As I mentioned, the act of auditing, in any form, can be repetitive, onerous and very much a chore, so my first piece of advice is to break it up into short manageable chunks and most certainly don’t try and do it all at once. Perhaps aim to do a couple of chapters a week, thus leaving you time to fulfil other duties, keeping the documentation up-to-date for example.

For me, the aim of a Content Audit is two-fold: on the one hand you will end up with a very detailed breakdown of the structure of your documentation, and on the other you should also be able to extrapolate the types of information that your documentation holds (e.g. procedures, concepts, and so on). A key benefit, which almost comes as a bonus, is that having spent time looking at your content, you will also have a good idea of which parts of the documentation can be reused and which parts may need to be rewritten before reuse is possible.

If you’ve done any research into this area, you probably have a good idea of what is involved and what the aims are. But what is a Content Audit, what does it look like?

Well, it’s fairly simple, and the easiest way to get started is to use your existing Table of Contents. Pull that out into a spreadsheet and you have an excellent starting point, particularly if your documentation has been written in short sections. Then you need to get into the content itself, and analyse the structure in a bit more detail. Again, there are obvious pieces of information that can very easily be pulled out, or broken down, into discrete chunks. Procedures, illustrations, tables of data, anything that is of a similar type and is repeated throughout your documentation is easily identifiable as a distinct unit (you probably have unique paragraph formats for these too, another quick way to check!).
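
If your Table of Contents can be exported as plain text, that first step can be scripted. What follows is only a sketch: it assumes an indented text export with two spaces per heading level, and the column names and helper functions (`toc_to_rows`, `rows_to_csv`) are my own invention, so adjust them to whatever you want to track.

```python
import csv
import io

# Hypothetical audit columns; "info_type" and "reuse_candidate" are filled in
# by hand later, as each section is read.
COLUMNS = ["level", "title", "info_type", "reuse_candidate"]


def toc_to_rows(toc_text: str, indent: int = 2) -> list[dict]:
    """Turn an indented Table of Contents into one audit row per heading."""
    rows = []
    for line in toc_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Heading depth is inferred from the number of leading spaces.
        depth = (len(line) - len(line.lstrip(" "))) // indent
        rows.append({
            "level": depth + 1,
            "title": line.strip(),
            "info_type": "",
            "reuse_candidate": "",
        })
    return rows


def rows_to_csv(rows: list[dict]) -> str:
    """Serialise the audit rows as CSV text that any spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The script only gives you the skeleton. Deciding whether a section is a procedure or a concept, and whether it is a candidate for reuse, still means reading it.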

A simple example for you.

All of our product guides and online help have “Overview” sections. They are, typically, very very similar. The product guide Overview is longer than that in the online help.

With a small amount of re-writing, we can create chunks for “Overview” and an “Overview Extension”, with the former being used in the online help, and the latter appended when used in a product guide.
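
As a rough sketch of how those two chunks might be assembled (the chunk text, the output type names and the `build_overview` helper are all hypothetical, and a real single source tool would handle this through its own conditions or includes):

```python
# Hypothetical chunk store: the names and text are made up for illustration.
CHUNKS = {
    "overview": "Widget Manager lets you configure and monitor your widgets.",
    "overview_extension": (
        "This guide covers installation, configuration and troubleshooting."
    ),
}


def build_overview(output_type: str) -> str:
    """Assemble the Overview section for the given output type.

    Online help gets the short chunk only; the product guide gets the
    short chunk with the extension appended.
    """
    parts = [CHUNKS["overview"]]
    if output_type == "product_guide":
        parts.append(CHUNKS["overview_extension"])
    elif output_type != "online_help":
        raise ValueError(f"Unknown output type: {output_type}")
    return "\n\n".join(parts)
```

The shared “Overview” text now lives in exactly one place, so a correction made there reaches both the online help and the product guide.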

Ultimately, a content audit will involve a lot of time reading, cross-checking and double-checking, and I’d advise you to grab a nice big desk (in the boardroom perhaps?) so you can lay out printed copies of your documentation. I’d also advocate that you don’t try and do the entire process, across all of your documentation, in one fell swoop. Pausing between batches, and discussing the findings with your co-workers, will stop you missing potential re-use opportunities AND stop you trying to re-use (re-write) chunks of information that need to be kept discrete.

Once you understand your own content, then you can start the process of seeing how it stacks up against the content created in the other areas of your company. More on that another time.

X-Pubs Conference

Just about finished at this year’s conference and, as ever, I feel fired up to get back to the office and get things moving. Overall the main theme of the conference was preparation, preparation, preparation, mainly focussed around gathering requirements before kicking off a project. Nothing special there, but if you are considering moving towards a single source environment, there is a LOT of preparatory work you’ll need to consider.

I’ll amend this post tomorrow with some notes and thoughts from some of the sessions, but overall I’d highly recommend you visit X-Pubs next year. What follows is largely compiled from scribbled notes and random thoughts, but hopefully may be of interest. I’m not sure if copies of all the slides will be available on the X-Pubs website at any point, but I certainly hope so.