Ten Reasons to Ditch Your Word Documents


My first word processor was WordStar, running on an Osborne computer sporting the CP/M Operating System. Back in 1982, it seemed an amazing feat to be able to type a document, format it, save it to disk, modify it, and print it, all with a computer sitting on my desktop. Considering that the technology being replaced was a typewriter, this was a brilliant leap forward.

Since that time, of course, computing technology has evolved considerably. Computing networks, the Internet, the World Wide Web, Google, Wikipedia, cloud computing, PDAs, smart phones, tablet computers and more have all changed the way we think about computers and communication.

And yet, somehow, especially in business and organizational life, the word processor and its output — a stand-alone textual document, formatted for printing — have survived to this day substantially unchanged, with word processing and collections of words being treated essentially the same way they were thirty years ago.

Is this because such documents and the conventions of traditional word processors are still as useful as ever and continue to make modern businesses more productive?

No. Quite to the contrary, I will argue that such office documents are millstones around the necks of otherwise contemporary organizations, quaint anachronisms whose continuing usage is dragging down productivity and gumming up the works of organizational life.

The Problems

Here are the reasons why traditional word processors should be abandoned.

1. Formatting a document for a single output device no longer makes much sense.

Back in 1982, when printing a document on a piece of paper was about the only useful way to share it with someone else, formatting a document for printing was eminently practical. But in today’s world, when a document might be published as a Web page, a PDF document, a chapter in an e-book, or a smart phone app, it makes little sense for the specific details of margins, typefaces, font sizes, paragraph justification and page breaks to be so visibly part of a document editing interface.

2. The appearance of a document should be separated from its structure.

Directly manipulating the specific appearance of an individual chunk of text within a document is a terrible idea. Authors should be focused on the content of their work, not on the details of how each letter, word, heading or paragraph will be formatted in its final presentation. And while tweaking the formatting details of a document until it looks just the way you want it may provide some satisfaction, such embedded details make a document very difficult to maintain, since all this formatting information needs to be individually re-tweaked in order to make any significant changes in the overall appearance of the document.
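
As a rough illustration of this separation, here is a minimal Python sketch in which a document is stored purely as structure and its appearance is decided only at output time. The element names ("heading", "paragraph") and both rendering functions are my own assumptions for the example, not features of any particular tool.

```python
from html import escape

# A document described only by its structure: no fonts, margins, or sizes.
# The element names used here are illustrative assumptions.
document = [
    ("heading", "Quarterly Report"),
    ("paragraph", "Sales rose in every region."),
    ("paragraph", "Costs held steady."),
]


def render_html(doc):
    """Render the structure as HTML; appearance is left to a style sheet."""
    tags = {"heading": "h1", "paragraph": "p"}
    return "\n".join(
        f"<{tags[kind]}>{escape(text)}</{tags[kind]}>" for kind, text in doc
    )


def render_text(doc):
    """Render the same structure as plain text, e.g. for an e-mail body."""
    lines = []
    for kind, text in doc:
        if kind == "heading":
            lines.append(text.upper())
            lines.append("=" * len(text))
        else:
            lines.append(text)
        lines.append("")  # blank line between blocks
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_html(document))
    print(render_text(document))
```

Changing how every heading looks then means editing one rendering rule, rather than re-tweaking each heading by hand.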

3. WYSIWYG is an idea whose time has come and gone.

One of the main features of modern word processors is an editing interface that presents authors with a visual representation of their document that appears very similar to the final output to be produced. Such WYSIWYG interfaces (“What You See Is What You Get”) first became popular in the 1980s and early 1990s. WYSIWYG seemed like a great idea when it first appeared, because it allowed for a very intuitive, graphical interface that freed the author from having to memorize a number of obscure formatting codes whose effects would only be apparent once the document was printed.

In today’s world, though, WYSIWYG is increasingly meaningless, since what you see will depend on what sort of device the document is ultimately viewed on, and how the reader chooses to configure their document viewing software. Further, WYSIWYG encourages authors to obsess over formatting details as intrinsic elements of the documents they produce, and actually hides the structural details of their documents, since these structural details are typically implicit — but not actually visible — in the final output.

In other words, a WYSIWYG interface runs exactly counter to points 1 and 2 above.

4. Document file formats are needlessly bloated, complex and opaque.

I just opened a new document using a modern word processor, entered the text “Hello World,” and saved it to my hard drive. The resulting document takes up 25k. Since disk space is unbelievably cheap these days, this might seem like no big deal, but all that bloat implies needless complexity that makes it difficult to find other software that might do useful things to my document. And so I am stuck with using the same software I used to create it — or one of a few other pieces of software that seem remarkably similar — when I want to do other things with or to my writings.
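
For a sense of scale, the following Python sketch measures how many bytes the same two words need as plain text and as a complete, minimal Web page. The page markup is my own assumption of what a bare-bones page might look like, and the word processor figure is simply the 25-kilobyte number quoted above, taken here as 25 × 1024 bytes.

```python
# A complete, minimal HTML page containing the same two words.
# (The exact markup is an illustrative assumption.)
html_page = (
    "<!DOCTYPE html>\n"
    "<html>\n"
    "<head><title>Hello</title></head>\n"
    "<body><p>Hello World</p></body>\n"
    "</html>\n"
)

plain_text = "Hello World\n"


def size_in_bytes(text):
    """Size of the text when saved to disk as UTF-8."""
    return len(text.encode("utf-8"))


# The figure quoted in the text, assumed to mean 25 * 1024 bytes.
word_processor_size = 25 * 1024

for label, size in [
    ("plain text", size_in_bytes(plain_text)),
    ("minimal HTML", size_in_bytes(html_page)),
    ("word processor file", word_processor_size),
]:
    print(f"{label:>20}: {size:>6} bytes")
```

Both text-based forms come in well under a kilobyte, and either can be opened in any text editor.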

5. Documents lack meaningful identifiers.

Documents are identified by their operating system file names. Back in my WordStar days, when I was saving a document onto a floppy disk, that was probably sufficient. But in today’s world of ubiquitous networks, superabundant disk storage and e-mail, it is woefully insufficient. Let’s say I save several different versions of my document. And I save them in different folders. And I save them with different names and naming conventions, as the mood strikes me. And I save them on a variety of file shares on a network. And I e-mail some of these versions of my document to different people at different times. And the recipients of my e-mails save my attached documents in various places and with various names. And these same recipients also store my e-mail, with my document attached, in various dark recesses of their mail filing system.

OK, at this point, where is my document? Even more fundamentally, what is my document? Is it a single thing, residing in various locations and at various version levels? Or is each storage name and location a different document? And when someone goes to use the document, for whatever purpose it was originally intended, what will they actually see?

Unfortunately, when using a traditional word processor, these are unanswerable questions.
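
To show what a more meaningful identifier might look like, here is a small Python sketch that fingerprints a document by its content rather than by its file name. This is one possible approach under my own assumptions, not something word processors do; the file locations and contents are invented for the example.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that depends only on the document's bytes."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical copies scattered across folders, file shares, and mailboxes.
copies = {
    "C:/Reports/plan.doc": b"Launch in March.",
    "//share/team/plan-final.doc": b"Launch in March.",
    "inbox/attachment-plan(2).doc": b"Launch in April.",
}

# Group file locations by content, so identical copies collapse together.
versions = {}
for location, content in copies.items():
    versions.setdefault(fingerprint(content), []).append(location)

for digest, locations in versions.items():
    print(digest[:12], locations)
```

Three names and locations turn out to hold only two distinct versions, which is something the file names alone cannot tell me.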

6. A document is an outdated concept.

One of the fundamental attributes of a document is that it is meant to stand alone, to be relatively complete and self-contained. Another fundamental attribute of a document is that it is meant to be consumed in a linear fashion, from front to back.

Both of these attributes make a lot of sense if you consider a document as something intended to be read from a series of printed pages.

But in today’s digitized, hyperlinked, time-sliced world, these attributes seem increasingly dated. I’m going to print something out in order to read it? Rather than read it on my laptop computer, my tablet or my smart phone? And I’m going to read the entire thing through, from front to back, in one sitting? Rather than going to the specific section I want just when I need it? And this document is going to stand alone, rather than being a smaller unit within a larger collection? And I’m going to consider this one document the last word on its particular subject, and not hyperlink to, or Google, other sources while I’m reading this one?

The answer to all these questions is increasingly the same: “Not bloody likely.”

7. Links between documents are fragile and hard to manage.

What if I want to create a link within my document, so that an interested reader can view related material? Of course, using any modern word processor, I can select a run of text in a document, and attach a hyperlink to it. And if I’m linking to a Web page, then this works reasonably well. But what if I want to link to another document I created using the same software? How do I identify the document I want to link to? Only by using the file’s disk location and file name — both subject to all the same variations I described above, which means that, before long, the link will probably stop working or take me to an out-of-date version of the document. And when my links stop working, how do I fix them, since my word processing software gives me no way to manage these links other than viewing and editing them one-by-one? Especially since the document file format is so peculiar and inscrutable that no other software exists that will let me manage my hyperlinks in any more meaningful way.

8. Version Control is missing.

Almost all documents of any value change over time. So it is natural to want to inspect a document’s version history. It is helpful, when doing so, to have some automatically maintained identifier that can be used to distinguish one version from another. It is useful to be able to see when each version was created, and by whom. And, of course, it is sometimes vitally important to be able to determine what changed going from one version to another.

None of this is readily available from typical word processing software. As a result, entire corporations make decent profits selling expensive document management systems to other corporations, partly in order to provide such basic version control functionality.
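
To make "what changed" concrete, here is a minimal Python sketch that compares two versions of a document using the standard library's difflib module. It assumes the document is stored as plain text, one line per list entry; the sample content and version labels are invented for illustration.

```python
import difflib

# Two hypothetical versions of the same short document, as lists of lines.
old_version = ["Launch in March.", "Budget: 10,000."]
new_version = ["Launch in April.", "Budget: 10,000."]


def describe_changes(old, new):
    """Return a unified diff of two versions, each given as a list of lines."""
    return list(
        difflib.unified_diff(
            old, new, fromfile="version 1", tofile="version 2", lineterm=""
        )
    )


for line in describe_changes(old_version, new_version):
    print(line)
```

Lines starting with "-" were removed and lines starting with "+" were added. A comparison this simple depends on the file format being plain text.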

9. Collaborative authoring is awkward at best.

Much of the meaningful work done today is increasingly team-based. No one person may be the authority on all aspects of a particular subject, and a single document may need to include the expertise of several team members, or may benefit from several sets of eyes having a chance to review and revise its content. While some word processing software makes minor concessions to such a reality, no such software fully embraces it, so real document collaboration is still the exception rather than the rule.

10. Documents don’t work well with Web browsers.

Increasingly, users open documents as a result of following links from other documents, or from Web pages. Unfortunately, when a user has to open a document from a Web browser (or, even worse, ends up opening a series of documents), navigation proves tedious and confusing, going something like this:

  • Click on a link.
  • Wait for word processing software to load.
  • Respond to a dialog box asking if you want to open the document or save it to your hard drive.
  • View the document using your word processing software.
  • Possibly click on other links, perhaps opening other types of office software.
  • Now try to find a way back to the place that you started from.
  • At some point, deal with all of these applications and windows that are now still open, even though you are no longer using them.

The Defense

Let me deal with some potential objections to my arguments at this point. Some readers may suspect that I’ve overstated my case. After all, word processing software has continued to grow and evolve over the years, and in many respects has made attempts to address many of the issues listed above. Is the situation really as bad as all that?

Well, first of all, let me admit that word processing software still has one pretty good use case: production of papers for publication in academic journals. This is a relatively isolated backwater of document production that has not changed appreciably over the years, and word processing software is still a good fit for this particular task. (Unfortunately, this may be a significant contributing factor to the software’s broader popularity, since it is, after all, the academics who teach the courses in which students learn to write class papers using this same software.)

Other than that, though, all of my experience bears out the theoretical criticisms listed above. I’ve seen people struggle time and time again with the problems I’ve identified, and recent additions to word processing software and extensions of the document production model seem to be like the proverbial lipstick on the porcine barnyard animal, doing little if anything to address the underlying issues.

As a matter of fact, in many ways I think I’ve understated the problem, since other types of “office” documents — such as presentations, spreadsheets and e-mail messages — suffer from many of these same problems. Even worse, as makers of these traditional software packages have moved into groupware, they have carried many of these same problems with them into what would otherwise be a new space.

The Alternatives

So now what? If the stuff we’ve been using is no good, what do we do now? Throw it all out and start over?

Luckily, that’s not necessary. Because, as organizations have continued using traditional software, a whole new set of tools has emerged and evolved, tools that are perfectly capable of solving all the problems above when taken together. All businesses need to do is use them more extensively.

Here are the tools that solve the problems above.

  1. Hypertext Markup Language (HTML)

    This is the basic language understood by Web browsers. HTML has a nice clean document structure, and a single HTML document can be easily output to multiple device types. Users can typically choose from a number of available browsers, and can configure their browser as it suits them, adjusting things like font sizes and window sizes to their liking. HTML files are small, simple, and relatively easy for mere mortals to understand. Hyperlinks are an innate part of the language, rather than being bolted on as an afterthought. And a whole host of HTML editing tools and assistants are available to choose from.

  2. Cascading Style Sheets (CSS)

    A CSS file encapsulates all of the desired formatting information into a separate, compact, simple file that can be shared across a whole set of documents. Many different CSS editing tools are available.

  3. Uniform Resource Identifiers (URIs) and Uniform Resource Locators (URLs)

    These strings of characters are used to identify Web pages and other resources, with each such resource being assigned a unique identifier.

  4. Web Sites

    A Web site offers a set of related pages as an alternative to a stand-alone document. Such an organizational scheme offers a relatively flat, networked structure that allows users to access what they need when they need it, rather than forcing them to navigate through all the content in a linear fashion.

  5. Lightweight Markup Languages

    HTML is relatively simple, especially when used for document composition, but the recent appearance of languages such as Markdown and Textile has made it even easier and more natural for authors to create documents in HTML. This blog post, for example, was composed using Markdown.

  6. Version Control systems

    Version Control software is widely available, inexpensive and easy to use for plain text files such as those used to store HTML and CSS. And version control is often available as a baked-in function of Web content management systems, including wikis.

  7. Web Content Management Systems (WCMS)

    Many such systems have emerged over the last few years, including several good open source solutions. Such systems make it relatively easy to organize, serve and administer Web sites of any size, typically incorporating all of the items above: HTML, CSS, URLs/URIs, lightweight markup and version control. Such systems also help to separate the duties associated with Web site creation, allowing content contributors to add content easily, while shielding them from the complexities of administration and “look and feel” issues.

  8. Wikis

    A wiki is a particular type of Web content management system, with a particular focus on the ease and speed of content creation and organization. Wikis also make it easy to author content collaboratively, and reduce the risk of errors or vandalism, since any past version of any page can be easily restored. The extraordinary success of Wikipedia has given many people the impression that wikis are used solely for encyclopedias, but wiki software can be used to create any sort of Web or intranet site.
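
To make items 1 and 3 a little more concrete, here is a small Python sketch that lists every hyperlink in an HTML page using only the standard library. It is meant as an illustration of how approachable the format is to ordinary tools, not as a finished link checker. The page content, the example.com addresses, and the names LinkCollector and collect_links are all assumptions of mine.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> in an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def collect_links(page_html, base_url):
    """Return the absolute URLs of all links found in page_html."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    return collector.links


# Hypothetical intranet page and URL, used only for illustration.
page = (
    '<p>See the <a href="/policies/travel">travel policy</a> and the '
    '<a href="https://example.com/faq">FAQ</a>.</p>'
)
links = collect_links(page, "https://intranet.example.com/handbook/index")

for link in links:
    print(link)
```

Once the links are in a list like this, they can be checked or updated in bulk, rather than one by one.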

In Summary

So that’s it. There are still good reasons to use word processing software, when creating legal documents or preparing printed materials. But for the great majority of informal written communication within large organizations, there are much better choices available today that can help improve productivity and decrease the possibilities of errors due to miscommunication.

All it will take to realize these gains is for office workers of the world to unite, throw off the chains of their anachronistic word processing packages, and enter the modern age of Web publishing!

The author, in addition to using a typewriter and WordStar, has prepared text for print using a combination of Microsoft Word and PageMaker, and has published text on the Web and corporate intranets using HTML, CSS, Textile, Markdown, BBEdit, TextMate, Drupal and Confluence.

June 7, 2010
