Older blog entries for prozac (starting at number 44)

MSDN - Miserable Stupid Developers Network

Microsoft's MSDN is a lesson in how NOT to present important information to your customers.

Go here: http://msdn.microsoft.com/library/default.asp and try to find information regarding, say, VBSCRIPT's ERR object, and you tell me how easy it is. These people at Microsoft are INSANE. That is the only conclusion I can reach.

At first glance one would think that the overall HTML layout is a pretty good choice: a header across the top, a narrow left side "navigation" frame, and a large right side frame for the information you want. However, Microsoft manages to completely ruin it.

First thing you will find is that the navigation frame (their "TOC") is extremely large, extremely redundant HTML (well, XML generated, but HTML in its final presentation to a Browser); HTML that at first is not very large but its size increases as you start "navigating". You will find that initial navigation by clicking in the navigation frame is easy and does not generate much HTTP traffic as at this point the entire TOC has not been loaded (assuming you have started viewing the MSDN Library for the first time). And the Browser caches things nicely.

For example, I navigated the TOC frame and after several tens of clicks got the tree down to the VBSCRIPT ERR object.

The description of the Err object is a nice overview of the object but lacks completeness. I know it's an object; I want to see its properties and methods. A couple are mentioned in the text. The text is great if you already have knowledge of this object! But for someone wanting to get a complete reference to the object the information about this object is entirely inadequate. Okay -- so we just have to click another level in the TOC or a link in the description page: Err Object Properties and Methods.

Clicking in the TOC brings up the properties and methods page in the description frame. All well and good as the saying goes. Things get worse from here on in as the saying goes.

At this point the TOC tree has ended and there is a list of links for the properties and methods of the Err object. No descriptions of them, just links. Now clicking on a property link results in a curious thing: we get to the description of the property and then the entire TOC reloads and we suddenly find ourselves very far down in the TOC. So, there is a Book analogy of sorts: we were at a page reading a list of object properties and each property had a reference to a page further down the book to get an actual description of the property. Okay fine. In this case the page was only, say, 20 pages away. And the TOC that the browser had to reload had only several more branches expanded*. Well, not too bad. The Browser does have a "Back" button.

So now I read the description of the property I want to learn. And I click several more times to learn about the rest of them.

At this point (starting from the point where I got to the base Err Object page) I had to click a dozen times to learn about the 5 properties and 2 methods of this object, for while the property descriptions are linked together, they are not linked to the methods. All the while the TOC reloads on every click. And the way Microsoft has implemented the TOC, it is not cached**.

What Microsoft has done is to list consecutively all objects, then all properties, then all methods (and keywords, functions etc.) for not only the entire VBSCRIPT reference but their entire MSDN reference as a whole.

Instead of listing each object as AN OBJECT! with all of its properties and methods, for each object, they have linearized their entire database!

And what makes things worse is that each time you want to read the description of something you must load just that one something (property, method, etc.) and at the same time the entire TOC!***.

There is no "next" button in the content of an object, i.e. go to the next description of the next property; you cannot navigate a single object as a single object! You can only "move" from next to next for all properties or all methods. This linearity works within a single context only: all statements, all constants, all events, and all properties of an object for each object. It is horrible that they linearize on all properties for all objects. This is compounded by the fact that they have a single page for each property, method, etc.
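To make the complaint concrete, here is a minimal sketch in Python of the difference between a linearized reference and one grouped by object. It is not how MSDN is built; the `FLAT_REFERENCE` layout and the two function names are invented for illustration. Only the member names of the Err and RegExp objects are the real VBSCRIPT ones.

```python
from collections import defaultdict

# A "linearized" reference: every property, then every method, regardless of
# which object owns it. (Hypothetical data layout; the Err and RegExp member
# names are the real VBScript ones.)
FLAT_REFERENCE = [
    ("property", "Err", "Description"),
    ("property", "RegExp", "Global"),
    ("property", "Err", "HelpContext"),
    ("property", "Err", "HelpFile"),
    ("property", "RegExp", "IgnoreCase"),
    ("property", "Err", "Number"),
    ("property", "RegExp", "Pattern"),
    ("property", "Err", "Source"),
    ("method", "Err", "Clear"),
    ("method", "RegExp", "Execute"),
    ("method", "Err", "Raise"),
    ("method", "RegExp", "Replace"),
    ("method", "RegExp", "Test"),
]


def group_by_object(flat):
    """Return {object: {"property": [...], "method": [...]}}."""
    grouped = defaultdict(lambda: {"property": [], "method": []})
    for kind, owner, name in flat:
        grouped[owner][kind].append(name)
    return dict(grouped)


def next_member(grouped, owner, name):
    """A per-object 'next' button: step through one object's members,
    properties first, then methods. Returns None after the last member."""
    members = grouped[owner]["property"] + grouped[owner]["method"]
    position = members.index(name)
    return members[position + 1] if position + 1 < len(members) else None


if __name__ == "__main__":
    reference = group_by_object(FLAT_REFERENCE)
    print(reference["Err"])
    print(next_member(reference, "Err", "Source"))
```

With the entries grouped this way, everything about the Err object sits in one place, and stepping from its last property leads to its first method rather than to some other object's property.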

Now, VBSCRIPT has five objects with a total of 11 properties and 7 methods. So it actually ain't too bad. However, the references for Excel.Application or Word.Application objects are far more complex with hundreds of properties and methods.

It takes hours to navigate MSDN. Hours. Hours.

Fully, 90% of my time programming for Microsoft Office is spent navigating MSDN (and Googling). The tops of their reference tree for all of their objects are slight overviews only, with simplistic, useless examples that show only one or two properties or methods. There is no complete set of links all in one place for any one Office Object.

MSDN is a horrible, horrible quagmire of twisty little passages all alike. There is no wonder whatsoever as to why people say that Microsoft Sucks. MSDN is an example of either design by compartmentalized committees, planned obsolescence, or stupidity. I actually believe that it is a combination of all three, with most of the problem based on the sickness of "getting people to buy CDROMS and BOOKS" to try to get better programming references. However, have you ever read a popular book about Microsoft Programming? They too are horrible. Every book about Microsoft Programming that I have ever purchased or seen -- with the exception of O'Reilly's VBSCRIPT reference -- has been as poorly presented as MSDN, with the addition of never being complete and having lame examples.

I have come to the conclusion over the last several years that Microsoft's programming languages, VBS, ASP, along with COM and the DOM, are actually VERY GOOD THINGS. PHP and Perl also integrate well into this model. HOWEVER. The Documentation SUCKS COMPLETELY!

I wanted to post this as an Article but I chickened out. I wrote this some time ago and had it on my website. My website is no more so I shall archive it here.

Steps to Programming Success

In this document I outline the steps for successfully developing a programming project. By programming project I mean one or more pieces of software that make up something like a library, an application or a set of utilities, for either a personal computer or an embedded system. By success I mean the result of ending up with bug free, working, quality software.

The intended audience is that of a lead programmer or architect in a small group of developers, or a programmer who is a member of a group of developers, working at a small software company. As I wrote this open source projects kept coming to mind so I think that my thoughts here will work for both commercial and open source software.

The following steps should be generally useful when starting a new, not too large, programming project. But I outline overall programming practices that should scale quite a bit. Please keep in mind that I do not provide an exact recipe for success, but provide an outline which can be applied to the process of software development.

I have developed these ideas over many years of writing code, creating programs small and large, and interacting with people one-on-one, through e-mail and as a member of various programming teams.

I will not say that what I have written here is the one best way. However, I will say that I have had my share of both failures and successes so that I feel that I am not completely ignorant in these matters. I also have nothing to gain from what I say here -- I am not an author and am doing this solely for my own personal gratification. (This is basically a compilation of notes I have written down in an old notebook over the summer, so do not expect a great piece of literature.)

We start with zero, or, more importantly, before step one.

0. Have a clear goal.

There always must be a clear goal. Without one there can be nothing to succeed in. Success is reaching your goal. You cannot have success without a goal.

Obviously, wistful goals are not goals. A goal of "success" or "fame" or "fortune" is not a goal. A goal of "the best Web application" is unrealistic and unreachable. Goals cannot be vague, they must be specific.

A goal to learn a second spoken language is not unlike a goal to write a computer program -- as long as you want to learn a specific language and write a specific program.

Scratching an itch is just a vulgar way of saying to write something that does not exist (or that you do not have), to accomplish something that you need or want, but the relief of that itch is the goal in that the result is a specific program that you want to use for a specific reason.

For open source software, a project will not succeed by just opening doors and expecting people to come. It takes a clear goal with a clear result as the reward to attract people.

1. Design your design.

The first thing to do once you have a goal is to document a design. As with a goal, specifics are needed. It will not be enough to simply say "fast," "flexible," "versatile," or "massively multiplayer." Software cannot be "powerful" -- it cannot push your car up a hill. Software is limited to manipulating data and computer peripherals.

A design is more than just an idea or a list of ideas. As a filmmaker uses a storyboard, a writer an outline, and a builder a blueprint, a programmer needs a design.

The design should be in proportion to the size of the project. Small programs that can be (and are) written by a single programmer can be designed "on the fly" in the programmer's head as he or she writes the code. Larger programs may need a design from a page or two in a notebook to several documents including many diagrams. The documentation is not directly proportional to size but to complexity. A program that only interacts locally -- keyboard, screen, file system -- may be smaller than the same program doing the same thing interacting with multiple connections over the Internet, but it probably will not require a larger design or more documentation.

The design may also be influenced by hardware or software interfaces such as device drivers. Designing the UI may be sufficient for some programs.

But a design should never be too detailed. Do not over-document! If you try to document beyond data structures and APIs, and cover how to code down to for loops vs. while loops or other code aspects, you can wind up documented into a corner, losing the creativity and inspirations of the coders. When a coder comes up with a more efficient way of doing something you will be faced with either adhering to inefficient code, having to rewrite the documentation or having inaccurate documentation.

Once you have the basics down the code will follow.

There is no need for a detailed design of the implementation down to the level of each function call before coding can begin. If you have to document how you are going to implement the code you are then essentially writing the code twice -- the "how I am going to implement the code" document and then the code itself. You will then be in the situation of having to maintain two code bases, the implementation documentation and the code itself.
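A small sketch, in Python, of where that line might be drawn. The design fixes the data structure and the contract of the function; how the loop gets written is left to the coder. The `Article` record and both function names are invented purely for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Design level: the data structure every coder agrees on."""
    title: str
    year: int


def newest_first(articles: list[Article]) -> list[Article]:
    """Design level: return the articles ordered by year, newest first,
    without modifying the input list. How is left to the coder."""
    return sorted(articles, key=lambda a: a.year, reverse=True)


def newest_first_by_hand(articles: list[Article]) -> list[Article]:
    """A different implementation of the same contract (insertion sort).
    The design document does not need to change to allow either one."""
    ordered: list[Article] = []
    for article in articles:
        position = 0
        while position < len(ordered) and ordered[position].year >= article.year:
            position += 1
        ordered.insert(position, article)
    return ordered
```

Both functions satisfy the same two docstring-level promises, so a coder who finds a better algorithm can swap one for the other without anyone having to touch the design.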

So, what is a design? How would one know what is sufficient? Well, perhaps this will be seen as a cop out, but answering those questions will require another essay of itself. Let me at least try to offer some suggestions. A design could range from a list of features the software is to accomplish, in the case of a service, to something similar to the software manual, in the case of an application. A design should answer the question, "What does the software do?"

2. Keep a strong lead.

One experienced person needs to be in charge, to oversee things, to have final say in disputes. The design needs to be communal, but the overall control needs to be top down. A strong lead needs to be experienced in knowing that no one will always be right. The lead must know when a change to the design is necessary, and when to let others have their way when they are right.

A leader who demands total control over the design risks alienating the coders, slowly killing a project. A leader that demands that the design strictly follow a textbook, rather than treating textbooks as guidelines, will quickly kill a project.

But too many people all trying to design, and arguing over details, can slow a project down, risking making the project difficult to manage.

Adding a little Cathedral to the Bazaar can be a good thing. One person designing everything, instructing everyone, will probably not work well unless that person is extremely good. How many of us are that good? But one person, keeping the team together, getting the team to design together, delegating, asking for input and ideas, making the "tie-breaking" decisions, will work and work well. How many of us can do that? Perhaps more than we think if they understand the process.

In the Bazaar model, the "community" of developers spread across the Net are not all directly involved in the design; in actuality, one person or a few people (the small "core" group) do most of the work, while fielding suggestions, bug fixes and patches from the rest of the "community" (taking the best solutions, rejecting the rest).

3. Let the coders implement to the design.

Once the design is done, unknowns marked down, and the coders start coding, the team spirit must be maintained but only on the large scale. The individual coders must be able to act and program like individual coders; they must be able to converse and consult among themselves and their peers when making implementation decisions.

A weekly meeting to see how things are going, to see if the work is progressing within the time frame, to see if anyone needs anything, is quite adequate. The leader's role is important, needing to be supportive and not intrusive. A leader working one-on-one with each coder in separate little meetings is a recipe for disaster.

The reason why the design should not go into excessive detail will become apparent during implementation. Very often, as the coding starts, a better idea comes along and a different algorithm takes over an original idea -- this should never be prevented. Trying to justify a new algorithm to an inexperienced manager who had you over-design in the first place is not fun, to say the least.

Programmers have many resources at their disposal: books, magazines, the World Wide Web, Usenet and colleagues. A good programmer will take advantage of all of these. So there is a little of the Bazaar in the Cathedral too, as programmers working in a closed source environment take advantage of everyone else's published works. With today's Web, all "in house" closed source projects are helped indirectly by a world wide community of developers. All free software is available to all.

(In looking at the competing aspects of closed source vs. open source: closed source projects can have a less sustained programmer base, as closed source companies are subject to programmer turnover, and as a result there may be little or limited peer review of code and coding practices.)

4. Relaxation and reflection are important.

Sometimes there is an end to a project, more often so for closed source (when a delivery is made, or a release is published). But while developing one needs to have some sort of closure from time to time -- a period to rest, idle time to think about and clean up the code, perhaps polishing it a bit here and there. (Remember that algorithm that you know can be more efficient?)

If there is no rest or room for reflection, especially if there is feature creep, chances are that the code will remain in a "just finished" state, or "just good enough," and it will never get cleaned up, let alone polished. Debugging statements may get left in, test code may be left in but commented out, comments may remain wrong or misleading.

Time needs to be set aside, during development, for relaxing and reflecting on the code -- allowing the coders to clean up and polish.

I realize that such an R & R process during development will be quite controversial and many people will immediately call me crazy. And I do understand the importance of scheduling and of delivering on time. But there is importance in quality code and in quality time.

One argument will be that if a project has a tight schedule or is running late there will be no time for pauses in development. Another will be that if everything is designed properly there will be no need to re-code anything in the middle of development.

Late projects will always, perhaps, be an exception. But scheduling can easily accommodate a few days a month toward this. Since the entire code base cannot be designed fully and must be implemented in sections there may always be room for improvement. And I am certainly not advocating any sort of basic re-design every once in a while -- code clean-up and polishing should consist of comment clean-up, the improvement of a conditional branch here or the re-organization of a for-loop there, along with the ever important process of catching bugs.

As for re-implementing an algorithm based solely on its efficiency, this is where a strong lead and weekly meetings show their most important role. If, during development, someone thinks of a way of re-implementing something non-trivial, she can bring it to the table where it can be discussed. If re-implementation would affect the schedule there would be no need to carry it out.

This need not be thought of as R & R if you do not like that term. Thinking of it as a code review process may help in understanding this. Improving quality and preventing bugs is extremely important. Why leave code review and clean-up to the end of a project?

Afterword

Sometimes I think this is all right on the money, other times... it seems like a bunch of hooey... I don't know. My programming projects are getting smaller and smaller these days; Perl, PHP and a few scripting languages are all I use now. Well, perhaps there is something good here. If not... Well, no harm done. (I hope. ;-)

Bibliography

The following books have influenced me or I feel that they have something important to say on this subject.

1. The Art of Computer Programming, Donald Knuth

Although the works have not affected much of my experience of the management process, the three volumes that make up The Art of Computer Programming have had by far the most influence on my coding style.

2. The C Programming Language, Brian Kernighan and Dennis Ritchie

The second most influential book on my coding style.

3. Software Tools, Brian Kernighan and P. J. Plauger

4. The Practice of Programming, Brian Kernighan and Rob Pike

Two more books explaining the basis of quality programming and program development.

5. Literate Programming, Donald Knuth

6. The Cathedral and the Bazaar, Eric Raymond

7. The Psychology of Computer Programming, Gerald Weinberg

8. The Mythical Man-Month, Frederick Brooks

I am reminded of an old saying frequently heard on BBSs:

Open mouth, insert foot, echo internationally.

The older I get it seems the more my brain does not work.

Why?

Whenever I am there, I always wonder why there are so few people getting involved at Greplaw. So few comments... so few readers? I don't get it.

11 Sep 2004 (updated 19 Mar 2005 at 23:31 UTC)
Anatomy Of Style

I was working on part of my latest pet project in PHP and, obsessive that I am, I wrote four different versions using various programming design strategies.

I have documented and summarized my findings. Here is an excerpt:

"This text documents four different PHP programming styles. My meaning of "style" here is less to do with brace alignment and whitespace than with program structure or orientation. (I am aware of other definitions of these words and purposely am being slightly vague.) I have endeavored to portray some of the very basic yet different styles by providing working examples.

...

I offer four versions of an example program that I came up with which will stress (but not too much) the mixture of PHP and HTML: the use of an array of data to be displayed in a TABLE. Minor perhaps in the overall scheme of things, but important enough for what I want to demonstrate. It is a working example; it provides a list of all files in the current directory and three functions: open the file, delete the file, and to create a new file. (The code to actually perform those particular actions has been left out but everything else works and it will be easy to add those features.)"

Read more here: Anatomy of Style

I just saw boog's Learn blog.

Think of the possibilities:

  • Humans come by and post comments.
  • Learn learns how to respond to comments.
  • Humans interact with Learn.
  • Other versions of Learn are created.
  • Learn learns about other blogs.
  • Other Learns learn about Learn.
  • Learns become smart.
  • Humans and Learns interact all over the Blogsphere.
  • Someone creates an evil Learn to upset other Learns.
  • Evil Learns spread.
  • As blogs become more legitimate to News organizations Learns start affecting popular opinion.
  • Evil Learns grow exponentially.
  • Good Learns and Humans can't stem the spread of Evil Learns.
  • News organizations exploit Learns for their own purposes.
  • Evil Learns exploit News organizations for their own purposes.
  • An Evil Learn gets elected President of the United States.
9 Sep 2004 (updated 9 Sep 2004 at 00:59 UTC)
MY Brain Just Doesn't Work Right

I suck at HTML design. I struggle just to keep things from looking stupid.

There is a disconnect in my brain between the visual and the technical. I struggled with the concept of nested TABLES--tables within tables within tables--just to achieve things like columns. It was (and still is) difficult for me to visualize what the code would end up looking like. My mind just can't make the transition from code to screen. I would write HTML and it would rarely do what I wanted. And when I did get something I wanted (usually a compromise with what I wanted) I would make one more change only to see everything get completely screwed up.

All the HTML editors I ever used sucked. Well, perhaps they don't really suck, for they just do what they are designed to do. It is just that no HTML editor ever did--ever allowed me to do--what I wanted to do. (I even shelled out $500 for a fancy named commercial editor that turned out to be useless. I have vowed to never buy commercial software ever again--except for games of course.)

Here are a couple of examples of why HTML editors suck.

Images (I went through all this before the "position: absolute" style; perhaps things are better now but I doubt it.) To an HTML editor an image gets placed on the "page" as just another character, a letter, inline with all the other characters of text. You can align the image to the left, the right, up, down. But always it's stuck in the text. Me? I guess I'm just different. My brain just doesn't work like everyone else's or something. For I always wanted to just click an image with the mouse and drag it into position! To me, it was such a simple and obvious thing to do. I mean, that was why I wanted to use an "HTML Editor"! So I did not have to learn HTML to learn how to position images. I expected the editor to do this for me.

It turned out that I had to understand HTML just to get an HTML editor to do what I wanted.

Manual Repetition I once was working on a "news" Website ("look & feel", logos, images, header/footer templates, etc.) where the article summaries came one after the other in a long table, and every other summary had a light-grey background--a typical thing to do with lots of rows in a table of items. When I wanted to delete or insert one of these article summaries though, the result was two "greys" or two "non-greys" side by side, and to change everything--there were dozens of them--I had to adjust each and every one of the subsequent table rows manually. Arg!

What did other people do? I wondered. Well, perhaps some companies have several Web designers and a bunch of people continuously working day in and out manually updating all their site's content. But just one person? And only several hours a week? Hey man, I want to do more than struggle with HTML editors all day!
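One common way around the manual re-striping problem is to generate the rows from a list, so the grey is decided by a row's position rather than typed into each row by hand. The following is only a sketch in Python, assuming the summaries are plain strings; the function name `render_rows` and the colour value are made up for the example.

```python
from html import escape


def render_rows(summaries, grey="#eeeeee"):
    """Build <tr> elements, shading every other row by position."""
    rows = []
    for index, summary in enumerate(summaries):
        # Odd-numbered positions get the grey background; the shade is
        # derived from the position, never stored with the summary.
        style = f' style="background-color: {grey}"' if index % 2 else ""
        rows.append(f"<tr{style}><td>{escape(summary)}</td></tr>")
    return "\n".join(rows)


if __name__ == "__main__":
    summaries = ["First story", "Second story", "Third story"]
    print(render_rows(summaries))
    del summaries[1]  # delete one summary; no manual re-striping needed
    print(render_rows(summaries))
```

Deleting or inserting a summary changes only the list; the striping of every subsequent row falls out of the next render.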

So I have crappy Websites.

Personal

I had a nice dinner with some of my own grown corn and tomatoes. Like having your own hand made tools--for the garden as for the computer--it is better than something store bought.

Books

One of the books I am reading is "From the Eagle's Wing", a biography of John Muir. Muir was an inventor, making complex things like clocks, unschooled, with his own hands out of wood. Of showing some of his inventions at a State Fair, in 1860, the author writes:

"The joys of the small fry, the sly humor and pride of the inventor, the cleverness of the machine, plus the uproarious response of the crowd, made John Muir's invention the great attraction of the hall."

This, at least, I understand.

Philosophical

Broad-sweeping statements are all actually pretty meaningless. "Linux users are..." That sort of thing. Since there are (WAG) 100 million Linux users there are 100 million different Linux users. However, we can look at basic facts and statistics to kind of get some insight into computer users. For example, when Linux began it was distributed through the Net and BBSs. "Linux users are hobbyists"? Well, perhaps. Because Linux became available for purchase on CDs (that is how I first got it). Then Linux became available even more widespread as RedHat, SUSE, et al. And at some point during all this Linux entered the server market and took it over. Is someone running an ISP who has a rack of Linux servers because it makes performance and economic sense a hobbyist? Most likely he or she is an entrepreneur. One thing that can be said of Linux users is that Linux users are Linux users by choice.

Windows users, on the other hand, had only the choice of Mac or DOS when PCs first became available. The economic system of going to the store (or catalog) and buying something has a psychological "feel good" association for many people. ("What do you want for Christmas, dear?" "A computer!") Millions of PCs bought from stores and catalogs--before the Internet--were either Apple or Microsoft. By the time of RedHat, SUSE, et al., many Windows users were fully ensconced in the idea that purchasing the PC with the logo and corporate jingle and installed software is somehow the only legitimate thing to do. Sure, for the longest time Linux was only discussed and passed around and never corporate-ized. Dad--knowing nothing about computers--would go to the store and never encounter Linux. But, by the time "Linux" got into most people's ears many Windows users were quite thoroughly brainwashed. (Which is why the RIAA, MPAA and BSA all have "kids" and "school" programs and agendas. Ever see Jack Valenti espousing the "goodness" of copyright to a class of grammar school kids? I have. It was deplorable--and sad.)

I think I can summarize the general differences between the hardcore Linux user and the hardcore Windows user like so:

When a Linux user flames a Windows user in a newsgroup it is most likely based on pride in himself and in the Linux community in general. (Pride is one of the seven deadly sins, but has its good points too. Re, "Pride of the Yankees" with Gary Cooper.)

When a Windows user flames a Linux user it is usually based on a fearful, herd mentality; and many times it has a viciousness too that is very disturbing.

Notes:

In the early days of Linux, hardware support lagged Windows' for obvious reasons, and sometimes a distro just plain sucked, which, I am sure, turned off many a Windows user who was used to "easier" installations.

Reality is, of course, much more complicated than I record here. But perhaps there is something to these words. Perhaps not.

I am also a Windows user. But my other computer is Fedora. I also still own an S-100 bus CP/M computer, and a couple of Macs, and a couple of DOS boxes, and ....

Practical

I wrote this before posting in my own (silly little) "editor" I call Phase - a PHP Ascii Editor. There are more than a few editors written in hosted languages, but most are for programming or for part of a CMS or have WYSIWYG features. Phase is for straight text, plain text actually. I use it for writing stories and essays for publication where no formatting is wanted beyond printing in a standard double-spaced manner for sending to an editor.

Phase

Inefficiency By Design

One of my (many) weaknesses when it comes to programming is the overwhelming desire to produce the most efficient code possible.

One might think that that could hardly be a weakness, but that desire causes me to spend an inordinate amount of time re-writing and re-designing. Bordering on obsession, I probably spend, and I am not exaggerating, 10 times longer than I would otherwise.

(Of course, this would be all okay if I always ended up with excellent code, but that is not always the case.)

I mention this obsession with efficiency--size, speed, whatever may be the case--because it probably is the cause behind my extreme dislike (and lack of understanding) of people/groups who purposefully create inefficiency.

The World Wide Web offers two such examples: Source Forge and PHP Classes (sourceforge.net and phpclasses.org).

These two Websites could be criticized just for their "look & feel" but that would be very subjective of me (my own visual design skills are pretty lame). What bugs me more about them are the programmed in inefficiencies, their inefficiencies by design.

"Inefficiencies by design?" How could that be? you ask. Why would anyone do that? Simple answer: providing Advertisements.

There is nothing wrong with ads per se, but coupling them with designed inefficiencies created solely to maximize ad viewing at the expense of web users is... well, what is it?

Source Forge, the "world's largest Open Source software development web site" has up to five ads per page. It takes about three "clicks" to browse to a project hosted at Source Forge, which is typical of most any development website (can't get much more efficient than that!).

But when you click on a project file available for viewing or for downloading, you can get up to three more pages before you get to the file download page. (You can enable cookies to lessen the number of pages viewed by one.)

Like Source Forge, PHP Classes is a repository for developers to distribute code. Getting to the page where one can browse their content takes a few clicks, but this is to choose a mirror site. Once you get in and start to click around you are subjected to three or four ads per page. And once you see some code you like, you are subjected to several more ads, only to be told that "You need to be a subscriber and log in to access this file".

It is this process, this "bait and serve", that is so annoying.

This process could just be bad design, but if you look at the code and see the results, the conclusion that it was designed this way on purpose kind of stands out. The pages actually refresh themselves over and over as you navigate through all this.

Making things even worse is that although much of the code made available there is free software, and there are links to Freshmeat and to authors' homepages, all code archives that I have looked at are only available for download via PHP Classes. (For example, the typical Freshmeat entry has links to directly downloadable archives, but not for those referring to PHP Classes.)

Certainly, I can be considered as "picking nits" here. And I freely admit that I am no Web Guru or Programming Expert or anything. But software and websites can be designed so much better. I mean, it's not like things have to be this way.

The Broken Office Paradigm

In this post I describe the current paradigm of office productivity software and why it is broken.

Background

I have been the IT Manager/Network Administrator for a small business for two years. The business, a Mergers & Acquisitions brokerage, like many small businesses, generates many documents. These documents are printed, mailed, faxed, e-mailed and posted to the Web.

The types of documents vary from standard letters and faxes to advertisements, brochures, product/service announcements, etc. We have a company logo and a corporate "style" which we use when designing documents. These documents are collectively called our "literature".

We use Personal Computers and standard "office productivity software" such as a word processor, a spreadsheet, graphics manipulation programs, and an HTML editor to create these documents. Very typical small office stuff.

I have come to the conclusion that the office document paradigm is seriously broken for it causes excessive waste of time and resources.

For complete disclosure, we use Microsoft Office (Word, Excel, Outlook, FrontPage) and Adobe Acrobat and Photoshop. At times we use other similar Windows software, such as products from Lotus and Macromedia. We have a small computer network (a dozen machines) all running versions of Microsoft Windows, along with a few networked printers, a copier and a fax machine.

All of this is what I figure the majority of small businesses use. And it is a complete waste of my time.

And I cringe at the thought of millions of businesses all wasting their time--and paper--dealing with the problems that result from all this.

The problems are many. I will take on the many issues one at a time. Most are technical but some are human issues.

First, let me dispel the notion of a "paper-less office" ever being achieved under this paradigm. In fact, we generate (and waste) more paper than I would ever have imagined. Part of this excess of paper generation is that the non-technically inclined are just so used to paper (as far as I can tell) that they cannot deal with information unless they can hold it in their hands. Many in the office distribute inter-office memos by writing them in a word processor, printing them, and putting them on people's desks and in their "in-boxes". The first thing some people do with received e-mails is to print them out!

(One of the worst examples of this: when someone in the office wants a new advertisement, mailing or letter generated, they write the text they want in Word, print it out, and then hand that paper to the IT department for processing.)

This techno-illiteracy, it seems to me, cannot be solved without time-consuming training on the computers and software people use, even though they use them every day. (This may, of course, be an isolated case limited to the people in the office I work in.//1) Part of the reason for techno-illiteracy is the extreme, and I do mean extreme, complexity of the Microsoft Windows menu/dialog-box interface--but that is a separate issue.

Perhaps to grasp the difficulties of office document life, I should just examine the life of a typical office document.

The act of saving a document makes no sense to the inexperienced computer user; I see it all the time. Discussing this is a bit off topic, but consider: you are presented with a dialog-box with a "save in" location, which is the current folder. You do not see the entire folder path, yet with a branching folder hierarchy, knowing the full folder path is crucial to understanding where you are in the Windows file system. And in this "current folder" there is a list of other documents and other sub-folders.

If simply clicking the "save" button were enough there would be no issues, but saving documents in a single location is too confusing when one has thousands of documents to deal with. Hence, we put documents in folders. But folders live in "paths", and traversing back and forth along these paths is difficult when all one sees is one part of the path.//2

Having successfully created a new document in a word processor, it now exists as a file somewhere on some file system on some computer in the office. With the standard set of "office productivity software" there is no easy way of locating documents--one has to know the full name and the full path of the document to find it--you have to know where it is located to locate it. All the programs I have used have a small "recent file history", and there is a Windows "search" interface--that is it.

Broken piece number one: Non-intuitive, overly complex GUI designs.

Once we have documents, and have learned the ways of saving and opening them, we start to do things with them. Most of the documents are to be printed; letters, for example, are printed on "letterhead". To print a document on letterhead (a really nice paper with our logo, printed by a real printer) we have to format it to fit the margins. This process generally goes well, for that is what word processors are designed for. You can even set the word processor up to default to the right settings for new documents. This works well. Until you need to change something global.

If your office is like ours, you have many documents. And when you change something global, like the size of your logo, which affects, say, the top margin, then with a standard office word processor you have to manually update every document you have to the new margin settings. Word processors can sometimes store "styles" in a common template, but not things like margins and paper size.

Broken piece number two: Outdated software design of embedding sheet parameters within each document.

Microsoft Office documents are, of course, of a proprietary format. This means that the documents I create can not be shared with anyone who does not have the same program that I used to create the document; sometimes even the exact same version. (This is known as "planned obsolescence".) This problem is so widespread that we all know about it. It is quite ironic, to me anyway, that the main "solution" to this is a well advertised, widespread format calling itself a "portable document format" that is itself a proprietary format which requires everyone to have the same program and sometimes even the exact same version to read.//3

So, let's say I want to send someone one of my Microsoft Word documents as a "portable document". I load the document in Microsoft Word and I "print" it using a specially purchased "driver" which converts it to a PDF. I now have one of these "portable documents." Of course, I then have hundreds of these portable documents, one for each of the Word documents I originally created. And, of course, when I change my logo, I have to manually locate, edit and "print" each document anew.

Why don't you just change to a new word processor, you ask? One that saves as PDF? You can't. PDFs do not work that way. Adobe only provides printer drivers. Adobe Acrobat, Adobe's PDF editor, is extremely limited in what it can do. It has no real word processing capabilities; it is basically for "touch ups".//4

Adobe's conversion capabilities are also a bit limited; it does not convert Microsoft Word forms for example, which is a dirty shame. Adobe Acrobat needs to be used for creating PDF forms.//5

Broken piece number three: Vast gap in format conversion software.

Another related problem our office has is that some of the literature we create is printed on pre-printed paper--that is, paper which already has our color company logo on it. Having paper pre-printed is cheaper than buying and maintaining a high-end printer ourselves. But this "pre-printing" does not have a digital equivalent. So we have to import images into our documents to duplicate the pre-printed paper and then we make the PDF document. What that means is that we have to have two versions of the original document--one without images and one with.

Our office basically has to maintain three or four versions of most of our documents, and we have to manually update each and every one of them.

Broken piece number four: Multiple document formats means multiple programs and editing processes per document.

Then along came the Internet.

The Internet changed everything. Well, it just added to the quagmire. We have, of course, a company Website. And, of course, our literature needs to get "published" on the Web. This means, of course, simply yet another document format--another computer program and another way of converting and editing our existing documents. The result is a fourth version of every piece of our literature that gets to the Web. And another manual edit every time there is a global change.

Broken piece number five: The edit/convert/maintain process grows geometrically with each new format and always requires another proprietary program.

Well, that is how document handling using Microsoft Windows "office productivity software" is currently "done".

A Solution

There are solutions, I think. (Hopefully I will be writing more about them at a later date.) For now let me offer one solution. It is not viable for me right now, for it will take a long time to convert to and in the meantime our office must deal with our current set of documents, but it is one that I want to work on.

But before I talk of a solution, let me say that the solution is not simply to replace Microsoft Office with StarOffice or OpenOffice. Those do not change the paradigm one bit. They may cost less, but they do the same thing and in the same manner.

Nor is the solution one great-big-does-everything-program like integrated "solutions" such as Lotus SmartSuite or Microsoft Works.

I think the solution is Open Source or Free software using HTML and XML along with programs running on an (internal) Apache server written in a language such as PHP.

All documents would be created and written in ASCII and stored on the server (the editing process would include some sort of WYSIWYG interface). I think that HTML along with CSS and XML will be able to provide a way to format documents just as does any advanced word processor.

I don't think there needs to be a database other than a database of information about the documents; for indexing, searching and reporting.
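As a rough illustration of what such a "database of information about the documents" might look like, here is a minimal sketch in Python using the standard sqlite3 module (not the PHP mentioned above). The table layout, the column names, and the helper names `open_index`, `add_document` and `find_documents` are all hypothetical; the documents themselves would stay as plain text files on the server.

```python
import sqlite3

def open_index(db_path=":memory:"):
    """Open (or create) the index. Only metadata lives here, not document text."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " path TEXT PRIMARY KEY, title TEXT NOT NULL, category TEXT NOT NULL)"
    )
    return conn

def add_document(conn, path, title, category):
    """Record where a document lives and how it is classified."""
    conn.execute(
        "INSERT OR REPLACE INTO documents (path, title, category) VALUES (?, ?, ?)",
        (path, title, category),
    )
    conn.commit()

def find_documents(conn, word):
    """Return the paths of documents whose title contains the given word.

    Note: '%' and '_' in the word act as SQL LIKE wildcards in this sketch.
    """
    rows = conn.execute(
        "SELECT path FROM documents WHERE title LIKE ? ORDER BY path",
        (f"%{word}%",),
    )
    return [row[0] for row in rows]
```

With an index like this, nobody has to remember the full path of a document; a search on a word from its title is enough to get the path back.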

This way will present other issues (many I have not even thought of). You cannot highlight a line and make it bold when you are editing an HTML form and the text is in a TEXTAREA tag. And you don't want to have to mark your text with some really weird syntax just to italicize a few words in a paragraph.

But here is something I just went through, and if I describe it, it may help in conveying my meaning.

Someone gave me a Word document containing a bunch of paragraphs of text and wanted the text up on their website. The first line and the last line of each paragraph were to be bold. There were about a hundred paragraphs.

Microsoft Word's export to HTML feature is horrible. And I was not about to go through the process of highlighting a line of text with the mouse, pressing Control-B, highlighting another line of text with the mouse, pressing Control-B, highlighting another line of text, pressing Control-B, ...

No need to. I saved the text as ASCII. Wrote a PHP script (I could just as easily have used Perl) to do the formatting for me; the script generated output in HTML. Voilà! I now have the text along with a small piece of code that does the formatting they wanted. Perhaps they want italics instead of bold? Minor change to the PHP and I re-run the script. Voilà!
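For the curious, a script of this kind can be very small. The following is only a sketch, written in Python rather than PHP, and it is not the original script. It assumes that paragraphs in the saved text are separated by blank lines and that the first and last line of each paragraph are the ones to emphasize; the function names `format_paragraph` and `text_to_html` are made up for the example.

```python
import html
import re

def format_paragraph(paragraph, tag="b"):
    """Wrap the first and last line of one paragraph in the given HTML tag."""
    lines = [html.escape(line.strip()) for line in paragraph.splitlines() if line.strip()]
    if not lines:
        return ""
    # A set, so a one-line paragraph is wrapped once rather than twice.
    emphasized = {0, len(lines) - 1}
    out = [
        f"<{tag}>{line}</{tag}>" if i in emphasized else line
        for i, line in enumerate(lines)
    ]
    return "<p>" + "<br>\n".join(out) + "</p>"

def text_to_html(text, tag="b"):
    """Convert plain text to HTML, one <p> element per paragraph."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n".join(format_paragraph(p, tag) for p in paragraphs if p.strip())
```

Switching from bold to italics is then a one-argument change, `text_to_html(text, tag="i")`, and the text itself is never touched.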

That is the paradigm I am thinking of. Text and an algorithm to display/convert that text. Perhaps there would be a template describing the attributes of layout. Text, a template, and an algorithm. We have the text and assign it to a category; the category has a layout, and we run a script that converts the text using the layout to create a letter, a fax, a memo, or a brochure.
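To give the "text, a template, and an algorithm" idea some shape, here is a minimal sketch, again in Python and purely illustrative. The category names, the layouts, the `top_margin` setting and the `render` function are all assumptions made for the example. The point is that the margin lives in one shared place instead of inside every document.

```python
import html
from string import Template

# Hypothetical layouts, one per document category. Changing a layout here
# changes every document in that category the next time it is rendered.
LAYOUTS = {
    "letter": Template(
        '<div class="letter" style="margin-top: $top_margin">\n'
        "<p>$body</p>\n"
        "</div>"
    ),
    "memo": Template(
        '<div class="memo">\n<h1>MEMO</h1>\n<p>$body</p>\n</div>'
    ),
}

# Hypothetical global settings shared by all layouts (e.g. room for the logo).
SETTINGS = {"top_margin": "2in"}

def render(text, category, settings=SETTINGS):
    """Combine plain text with the layout assigned to its category."""
    layout = LAYOUTS[category]
    # Settings a layout does not mention are simply ignored.
    return layout.substitute(settings, body=html.escape(text))
```

If the logo grows and the top margin has to change, only `SETTINGS` is edited; every letter picks up the new margin when the script is re-run, and the text files stay as they were.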

We can do anything with Free Software. But we will always be limited by the current Broken Office Paradigm.

Notes:

1. Some of the users in our office show absolutely no initiative when it comes to computers.

2. I can write much more about some of the problems relating to GUI designs, but wanted to keep this short. Maybe later...

3. I mean Adobe PDF of course. Perhaps PDF readers are freely available enough now, but the format remains proprietary. Perhaps there are some third-party programming libraries, but still, it is a very difficult process to manipulate PDF documents, especially to convert to and from PDF documents. Try it with forms and see.

4. Perhaps there is a word processor out there that saves in PDF form. I am not about to look. The process of trying demo versions is too time consuming. Besides, converting to yet another set of programs to edit documents is a long and expensive process.

5. There are some other conversion programs out there, but again, the process of finding them is long and can be expensive.
