one man writes

Archive of Single Source posts

 
 

DITA Maturity Model

I mentioned this in passing last week but having had a little time to delve into the model in a little more depth I thought it was worth re-visiting.

The DITA Maturity Model is an organic model that is still being developed. Rather smartly, it’s presented in Wiki format, allowing anyone who is interested to comment on and debate any and all of the content.

The model itself follows a familiar pattern with six levels of maturity against which you can map where you and your organisation sit. However the DITA Maturity Model starts with the presumption that you are already committed to topic-based writing, and I think that’s a gap that needs to be addressed.

For me, the model allows me to explain to my boss (and his boss) why investing in DITA as a document schema is worthwhile, but it doesn’t address why we should change what we are doing at all. Once you have made the leap, the maturity model is all well and good, but MAKING the leap in the first place, well, that can be considerably harder.

Of course I’m not the only person who realises this, and in steps the DITA Wiki which has an entire section on building the business case for DITA.

The DITA Wiki is interesting. Not only is it chock full of useful information but ALL the major players in the single source/content reuse arena contribute to the content and discussions. Again it’s telling that it grew up alongside the growth of DITA usage.

Anyway, the DITA Maturity Model is definitely worth a look if you are considering heading down the DITA road. If nothing else it will give you a better understanding of the road ahead, some of the pitfalls you will encounter and the benefits you will gain.

Only the good die young

One of the reasons DITA has gained so much traction in such a short space of time is that the people behind it are taking advantage of the internet to publicise and drive it forward. With that in mind it’s great to see them open the new DITA Maturity Model out to the community:

This community is designed to bring the DITA Maturity Model to life, applying the “Wisdom of the Crowds” to the evolution and refinement of this approach to DITA adoption. The premise is that none of us is as good as all of us. The DITA MMC is an evolving resource that will grow and change over time with your active participation and contributions.

Definitely a good usage of the social media tools available at the moment.

One thing that struck me, taken from the Content Wrangler coverage, is a simple reason as to why more people are considering a move towards DITA-based content:

Enterprises looking to fast track their content strategy and minimize the risks of a big-bang initiative are choosing DITA–one of the most popular information models to suit today’s content–rich, multi-channel environment.

For some reason I hadn’t quite figured that out, but if you are putting together a business case built around DITA then it’s worth investigating this in more depth. That said, this is definitely one of those “so obvious I hadn’t considered it” moments!

The maturity model also highlights one of the reasons that DITA is proving popular even if it isn’t the best standard to be using for every circumstance. Quite simply, it’s because it’s young, new and (this is the important bit) is being developed in plain view of everyone on the internet. Admittedly I’ve not gone looking for DocBook or S1000D resources but, as they are already fairly mature, they seem to be struggling to keep up with the pace of development around DITA. If DITA is the cool kid on the block, DocBook is definitely the wise old sage, stooped on the corner.

Social media on the internet thrives on participation and with DITA still growing up everyone has a chance to get involved and influence things, and that helps generate buy-in, which drives more improvements, which increases community buy-in… and so on.

So, even if you aren’t interested in DITA but are interested in how social media (online communities, web 2.0, whatever you want to call it) might help you and your company, it might be worthwhile checking out the maturity model and seeing if the same … erm… model.. can be applied to what you do.

Everything is connected

This post has been bubbling for the past year or so, ever since I started this blog. It’s a bit of a ramble but if I don’t publish it now I’ll just keep adding to it and it’s long enough as it is!

I question everything. It’s part of the way my mind works, and it’s something I’ve embraced because I believe it makes me better at my job as a technical communicator. That attitude has also helped me realise that there is a common thread that can be found across several different areas of our industry, which I (and others) am slowly pulling together. Convergence is the word that springs to mind, and as businesses clamber onto the social networking bandwagon, now is an excellent time to grab the reins and take control.

Let’s step back a little.

Late last year, on two separate mailing lists, I followed discussions about what the myriad of people who share my profession have as job titles. I prompted one discussion on the ISTC mailing list, and chipped in some thoughts on the TechWR mailing list before dropping out later on when the noise ratio, as ever, got too high.

I wonder how much useful information I miss when I do that? Ahhh something else to ponder. But not today.

Anyway, discussions around how we as a profession should be referring to ourselves inevitably lead to discussions and thoughts about what we do, where our skills lie, and the benefits we can bring to an organisation. It’s something I’ve toyed with before, but which is wrapped up in many layers of ifs, buts and other such caveats.

Following on from that, I read an article by Virginia Lynch in the CIDM newsletter (and if you aren’t subscribed to their newsletter, you should be) entitled Information Developers – The New Role of Technical Writers in a Flat World which encapsulates a lot of my current thinking on how to take my current team forward, making sure we are matching company strategy whilst allowing the team members to retain a focus on maintaining and developing their core skills. The article title rather neatly alludes to Thomas Friedman’s book The World Is Flat: The Globalized World in the Twenty-first Century which is certainly worth a read.

Virginia mentions that JoAnn Hackos recently referred to these core skills as “Basic Hygiene”, citing the fact that, regardless of how the collation and production, distribution and usage of information may change, as we explore the burgeoning arena of new tools available to us under the banner of “social web applications” our core skills remain. Typically they tend to drop off as we are pushed to create more, faster, with a rise in quantity favoured over a maintenance of quality.

style, grammar, punctuation, spelling, and even clarity seem to have been sacrificed for quantity —JoAnn points out that knowledge of basic writing skills is still critical to our success as writers. Basic Hygiene also comprises an understanding and appreciation of editing, the information development life cycle, fundamental web and computer skills, and of course attention to detail.

However it is important to note the nod towards quantity being a business leader, and those of us tasked with managing a team need to consider how we achieve that business aim, without impacting our integrity as Technical Writ… umm… Information Developers?

So, how do we produce more whilst maintaining quality?

Wait! What’s that coming over the hill? Ahhh yes, the shining white knight of single source, armour gleaming, his trusty DITA (or DocBook) in hand, ready to do battle against the ills of productivity measurements and over-zealous QA departments. What else were you expecting? Ohh more resource? No, not these days when everyone is a “content creator”, not these days when we should be embracing and encouraging our audience to help plug the gaps in our information dykes (I really must stop mixing my metaphors).

Topic-based writing certainly seems to tick the required boxes and every business case and ROI I’ve read (and I’ve written a couple myself) points us towards the promises found over the horizon and the “he’ll be here real soon, honest” arrival of the aforementioned white knight. The trouble is that, whilst it is easy to agree with the theory, I’m not all that sure the white knight is all he seems. Certainly as we climb the hill towards him, auditing our content, deciding on chunking levels, agreeing metadata requirements, we begin to see that that armour seems a little thin and dented in areas, and I’m not entirely sure the knight is filling that armour as much as he should. Aren’t they supposed to be big strapping warriors? He looks a little weedy to me…

Topic-driven content written with a minimalist slant, deferring here to the instructions of Strunk and White* rather than John Carroll, is where we seem to be (need to be?) heading and that’s fine and good from where I’m sitting.

* A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid detail and treat his subject only in outline, but that every word tell.

On the flip side, there is a definite growth in awareness around the use of Web 2.0 technologies and systems, building online communities, integrating Wikis, blogs, RSS feeds into the information flow either as part of end user deliverables or as methods for encouraging information creation by everyone involved with the product, internal or external.

A large part of our job concerns the collation and filtering of information so, as far as I’m concerned, anything we can do to make the creation of source information easier has to be welcomed. Extending these mechanisms beyond internal usage means it should be easier to provide information to the people who really need it, with the added bonus of a greater level of trust in that information. Don’t believe me? Which type of information do you put most weight on, the information passed to you by a trusted colleague who you know uses the product heavily, or the product documentation? (And bear in mind that we technical writers are pre-disposed to favour the work of our peers.) That in itself is another issue which may be alleviated by embracing social content creation, pulling on the goodwill generated by openly inviting contribution and collaboration, whilst giving technical writers a chance to show their worth in full public view.

So where is all this heading? I’m not sure anyone is too sure, but there do seem to be some trends appearing: the use of Wikis to host documentation, the creation of community websites with few restrictions, and more. There are plenty of tools, and with a little work you can get them talking to each other. Technology is not the limiting factor anymore; attitudes are now the only things stopping us trying these wondrous new things. It’s a big step for some companies, and some people, to free their information, to pass their hard-earned knowledge about willy-nilly without a clue as to how it will be used.

Once you’ve gotten past the limitations and have your community or collaboration up and running, the real effort is in the surrounding processes. Do you want to pump content into the website regularly? (yes). Do you want to allow anyone and everyone to contribute to that same store of information? (yes). Do you want to allow others to quietly correct your mistakes? (yes). Do you want to give the people who need it access to information about your product, regardless of where it originates, trusting them to use their judgement? (yes).

The final pieces of the jigsaw are the finer details of implementation. Presuming we want to reuse information as often as possible where do you store information and how do you allow access to it? Who should be involved in verifying new information? Where/how is the level of trust established?

Pulling together the threads of this emerging role is tricky, with so much overlap into multiple areas and so much to consider there is a danger of not seeing the wood for the trees. This post is an attempt to step back and make a little more sense of what I can see, what I know, and the changes starting to drag our profession in interesting new directions. I fear I may have muddied the waters, but hopefully they’ll settle and things will start to make sense.

Regardless of whether I’m right or wrong, one thing is for sure, these are exciting times and we have a great opportunity to finally leverage technical communications into the spotlight. The value of information is finally being properly realised, and we are ideally placed to help any organisation make the most of what information they have and help them understand and create the information they really need.

Back to DITA?

I’ve mentioned DITA a few times on this blog, and my DITA is not the answer post is still attracting attention. As I’ve said, I think the DITA standard is an excellent one for software documentation and the DITA movement is slowly catching up to the hype. I’ve never given up on DITA and had always planned to use it as the basis for the next stage of our content development, and as it happens the switch to a full DITA/CMS based solution may be closer than I had anticipated.

We have been considering how best to publish up to date information in keeping with patches and minor releases, and if we can tidy up and publish useful information from our internal Wikis and support system. The nature of the product we work with means there are a lot of different usage patterns, not all of which we would document as they fall outwith typical (common) usage.

So, how to publish formal product documentation, in line with three versions of the product, in PDF for ‘printed’ manuals, JavaHelp to be added to our product, and HTML to be published to a live website alongside other technical content (ideally maintained in the same system as the product documentation)? Storing the content as XML chunks also allows us to further re-use the content programmatically (which can be tied into our product in a smarter, dynamic, fashion).
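To make the "re-use the content programmatically" idea a little more concrete, here is a minimal Python sketch, not anything we actually run. It assumes a chunk shaped like a DITA concept topic; the sample content and the `summarise_topic` helper are made up purely for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up, DITA-style concept topic used only for illustration.
TOPIC_XML = """
<concept id="backup_overview">
  <title>Backing up your data</title>
  <shortdesc>Schedule regular backups to protect against data loss.</shortdesc>
  <conbody>
    <p>Backups can be run manually or on a schedule.</p>
  </conbody>
</concept>
"""


def summarise_topic(xml_text):
    """Return the id, title and short description of a single topic (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "shortdesc": root.findtext("shortdesc"),
    }


summary = summarise_topic(TOPIC_XML)
print(summary["title"])
```

Something along these lines could, for example, feed a short description into a tooltip inside the product, without anyone copying and pasting text out of the manuals.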

The obvious answer is single source using DITA to structure the content, storing the content as XML to give us the greatest potential avenues for re-use. Nothing particularly startling there I know, but it’s a switch from the direction we had been considering. So I’ve been catching up on what’s new in DITA-land and have to admit I’m a little disappointed.

We already have FrameMaker and Webworks in-house, although both are a couple of versions old, and thinking we might keep using those applications I’ve been hunting about to see if I can find a solution that offers a coherent, end-to-end, story. There are several CMS solutions which require an editor, editing solutions which require a CMS, and a few products that straddle both CMS and editing but then require publishing engines.

I understand that it would take a collaboration between vendors to be able to offer a simple, seamless solution.

In addition to that, there does seem to be a tendency for any DITA-focused solution to remain the remit of the overly technical. Don’t get me wrong, I’m quite happy delving into XML code, hacking elements, or running command line scripts to get things done. But surely I shouldn’t have to resort to such things? Now, I’m sure there are many vendors who will tell me that I don’t need to worry, but I’ve seen several demos and all of them miss a part of the FULL story.

Come on then vendors, stick your necks out. If you are a CMS provider, then recommend an editor. If you sell editing software then talk nice to a CMS vendor and start promoting each other (yeah Adobe, I’m looking at you!).

And yes, I’ll happily admit that maybe I’m just not looking closely enough. If only there was some sort of technical community website that I could join, perhaps with a group or two on DITA? That’d be great.

Ohhh wait. There is! (not the most subtle plug in the world, was it? I think the new Content Wrangler communities could be a big hit, do check them out).

Have I got the wrong end of the stick? Are there really gaps in the market in this area at present, or is it just my imagination? I guess I’ll be running a fair few evaluations over the coming few weeks and, of course, I’ll post my thoughts and findings here.

The tool is not important

The tool is not important. The tool is not important. The tool is not important.

I have been repeating this mantra in my head for the past week or so, over and over, like a broken record. I’m in the middle of pulling together the requirements and scope for a new technical community website for our users, which will become the key focus of our technical information. The more traditional product documentation set will be maintained as we move forward, so there is some thought to be given towards how we manage the information as well as how it is published, or rather where.

I must stop considering the how. The tool is not important.

At present I have a list of requirements, all of which I’m thinking through from the point of view of how the process will work as far as creating and maintaining the information. Who will access the source? Who will be viewing the published information? Who can edit what? How will the information be used by the audience? All the while there is a part of my brain dragging me towards HOW this will work. What tool will be able to handle our requirements?

The tool is not important.

I enjoy a challenge, and this is most certainly a new venture for me, but the basic foundations of this idea are rooted in things I know well, single sourcing content, developing online communities (I run a website for Scottish Bloggers (currently dead after our hosting service disappeared)). As such I’m confident I can get this off the ground, but even so I’m being careful to properly gather requirements, and fully understand the impact of changing our publishing model. Note I said “model”.

The tool is not important.

So with a list of requirements, and a full understanding of the processes that will be involved both to maintain the main documentation set and the development of other supporting information (culled from internal Wikis, mailing lists and anywhere else we stumble across something useful) one change is the way in which we plan, design and write product documentation.

As I’ve said, this is all about the processes that support the way we work. I’m being quite deliberate in how I pull together the requirements, focussing discussions on the audience, the expectations, the information and processes, with no mention of the technology which will need to support the new website.

The tool is not important.

Last year’s X-Pubs conference drilled this message home, and it’s good to be able to draw on the information and knowledge gained there. Get your requirements sorted out and agreed, understand the impact of changing the way people access information, and the impact of changing how people work, figure out how best to handle the reaction to change and agree the expectations and limitations of your system. Decide which models you will follow, how the processes will hang together and outline the various roles that will be required, and make sure they understand what is required of them.

Then and only then should you consider what tools you require and make sure they are serving you.

Why AuthorIT?

As I mentioned before, we are planning to migrate content from FrameMaker to AuthorIT, staging the migration across two different product sets (and no small amount of time!). I’m in the process of evaluating AuthorIT for, despite having used it before, it has recently been overhauled with a spiffy new UI and some new features.

AuthorIT is a single source system, with content stored in a central database, which can publish to most (all?) of the formats that anyone would ever need. It includes an editor, supports multiple users, and has some additional add-ons for localisation and so on. Their website is very good if you want more information on their product.

After downloading and installing the trial version, which limits your import and publishing but otherwise has all the features available for use, I fired it up and was greeted with the new interface. Based on the ribbons used in the latest version of Microsoft Office, it is quite a shift away from the previous version and it took me a while to get to grips with. However it is a huge improvement over the old version and once you are used to it, like anything, it’s very nice to use. Yes I know there are still issues being dealt with, but I didn’t run across that many during my testing, so I’m happy.

During my evaluation I spoke to their Business Development Manager who was very helpful in delving into some of the issues I had around versioning and set my mind at rest. I’ll outline how we are going to handle maintaining multiple versions of documents in another post, once I’ve given it a dry run or two.

One issue that cropped up was the location and format of the supporting database. You can run AuthorIT on a Jet database, either locally or on a network drive (although that isn’t particularly performant), or run it on SQL Server. As we are a small team I did consider the Jet database, but our situation suggests a server database would be better. That introduced another problem: price. SQL Server isn’t the cheapest and we don’t have an installation in-house. Thankfully one of our IT guys suggested SQL Express (a limited, free version of SQL Server) as a possibility, and after a quick check on the AuthorIT Yahoo Group, I’ve found that it will run quite happily on that database.

There is a limit of 4GB on the database size but as long as we keep our images elsewhere there is little chance we’ll hit that limit. Our total content at present, including images, tops out under 500MB for one version of the documentation. So we’ll actually be saving space on a server as we won’t be maintaining multiple versions of entire documents. Must remember to point that out to our IT guys!

Aside from versioning, the only feature I was unfamiliar with was the batch runner, which allows you to run a batch file (.bat) as a scheduled task. Our current system runs at night, using Webworks to create a JavaHelp file which is then included in the software build, and AuthorIT will give us similar functionality.

Why AuthorIT? Well, quite simply it gives us what we need.

I spent some time at the X-Pubs conference last year, and throughout the presentations the underlying message was “get your requirements sorted before hunting for a system”. The premise is obvious enough: if you decide on a system first, you end up shoe-horning your processes around how it works rather than getting a system that works the way YOU work.

I also spent some time considering DITA but ultimately switching to an XML-based system is still too cost-prohibitive. AuthorIT is a compromise, allowing us to work how we want to work, whilst giving us single source benefits. We will use DITA as a framework for how we plan and write the content, but the simple fact is that AuthorIT is a much better value proposition than a bespoke system, both in monetary and resource terms. This makes the business case much easier to sell.

If you are considering single sourcing your content, then I’d strongly suggest you investigate AuthorIT as a possibility. It has limitations, including the oft-cited reliance on Word as a publishing engine, but for me the advantages outweigh those.

And no, I am not being paid to endorse AuthorIT.

Content Analysis for re-use

The basic premise of “single source” can be summed up in one word.

Re-use.

Sounds simple enough, but there is a wealth of analysis and work that is required before that, somewhat elegant, aim can be met.

Analysing your content for potential re-use opportunities is, by and large, an onerous task. Whether you do it all by hand, printing out reams of documentation and annotating as you go, or electronically, compiling spreadsheets using colour coding or obscure (“they made sense to me at the time”) codes, it takes time to do it properly and there are no shortcuts. Sorry to break it to you so bluntly.
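None of which stops you getting a rough first pass from a script before the real reading starts. The following is a minimal Python sketch, assuming your content can be exported as plain text with blank lines between paragraphs; the document names, the sample text and the `find_repeated_paragraphs` helper are all hypothetical. It only flags paragraphs that match exactly once case and spacing are ignored, so it won't spot near-duplicates or anything that needs judgement.

```python
from collections import defaultdict


def normalise(paragraph):
    """Lower-case and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(paragraph.lower().split())


def find_repeated_paragraphs(documents):
    """Map each paragraph found in more than one document to the names of those documents."""
    seen = defaultdict(set)
    for name, text in documents.items():
        for paragraph in text.split("\n\n"):
            key = normalise(paragraph)
            if key:
                seen[key].add(name)
    return {para: sorted(names) for para, names in seen.items() if len(names) > 1}


# Hypothetical sample content standing in for exported documentation.
documents = {
    "install_guide": "Log in as an administrator.\n\nRun the installer.",
    "upgrade_guide": "Log in as an  administrator.\n\nRun the upgrade tool.",
}

repeated = find_repeated_paragraphs(documents)
for paragraph, names in repeated.items():
    print(paragraph, "->", ", ".join(names))
```

At best this gives you a list of places to look first; the analysis itself is still down to you.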

However it does mean that you are forced to spend some time re-reading your content, content which you might not have visited for some time or in some cases, may not have written yourself. You’ll likely find inconsistencies in the content itself, styling errors and quite probably a completely different writing style. Whilst it may seem obvious I urge you, should it arise, to fight the urge to start editing as you go along.

My basic understanding of single source, and the re-use of information, is that there are times when you’ll need to rewrite content so it can be more easily used in multiple locations. A change of tense perhaps, a rephrasing or reconstruction of a sentence may be all that is required, and hell, if you have the document open in front of you, why not just go ahead and make that change? Suffice to say that editing content that you are analysing has only one potential outcome. Chaos. Regardless of how well organised, how well planned your analysis is, if you start making changes to your content on the fly, you will soon find yourself with a blurred view of the very thing you are trying to analyse.

Yeah, I know. It sounds obvious, and it is when viewed from a distance.

However what I really wanted to discuss, for I’m certainly not 100% certain on this, is at what level does content granularity become too granular? If I want to re-use a paragraph then, obviously, breaking up content to the paragraph level makes sense but that immediately seems like overkill in many cases. So I’ve been steering away from that kind of structural thinking, away from paragraphs and sentences into semantically discrete blocks. So a short product description, containing a heading and a paragraph, is one block and a long product description, containing a heading and several paragraphs, is another. I’m pretty sure this is the correct approach but it does mean that, once you’ve made that decision, you are stuck with fairly large chunks of information.

I’m hoping that this is a good balance though, for if we are to break our content into smaller granules, the overhead of maintaining and manipulating them surely increases. Remember, in a single source system we are concerned with more than content; we also have to contend with the metadata associated with that content, and the more pieces of information we have to maintain, the greater the risk that the metadata becomes so complex as to be useless?

I think. Maybe.. I’m really not that sure.
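To put some rough numbers on that worry, here is a small Python sketch. Everything in it is an assumption made for illustration: the `ContentBlock` structure, the three metadata fields and the sample text are not taken from any real system. It compares one semantic block (a heading plus three paragraphs) with the same content split into four separate fragments, each carrying its own copy of the metadata.

```python
from dataclasses import dataclass, field


@dataclass
class ContentBlock:
    """One re-usable unit of content plus the metadata needed to manage it."""
    block_id: str
    kind: str
    text: list
    metadata: dict = field(default_factory=dict)


def metadata_entries(blocks):
    """Count every metadata value that somebody has to keep accurate."""
    return sum(len(block.metadata) for block in blocks)


# Hypothetical metadata attached to every unit of content.
META = {"product": "ExampleApp", "version": "2.0", "audience": "admin"}

# One semantic block: a long product description with a heading and three paragraphs.
coarse = [
    ContentBlock(
        "long_desc",
        "long_product_description",
        ["Heading", "Para one.", "Para two.", "Para three."],
        dict(META),
    ),
]

# The same content broken down so the heading and each paragraph is its own unit.
fine = [
    ContentBlock(f"long_desc_{i}", "fragment", [part], dict(META))
    for i, part in enumerate(coarse[0].text)
]

print(metadata_entries(coarse))  # 3
print(metadata_entries(fine))    # 12
```

Four times the units means four times the metadata to keep straight for exactly the same words, which is the overhead I’m wary of.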

Have you conducted any content analysis? If so how did you approach the granularity issue? I get the sense that, for a lot of people, the level of granularity is reached once the content analysis is complete, that it basically decides itself.

As we slowly progress towards a single source solution, I’m intrigued as to what to expect next, and any thoughts or comments are much appreciated. After all, all the articles, conferences and books in the world can’t replace real life experience.

Notes:
This post was, in part, inspired when pondering if semantic analysis might be a way to tackle this but, for now, I wonder if it is perhaps a step too far for most?