Quoderat

Archive for February, 2006

A new Namespaces discussion

Sunday, February 26th, 2006

Eliot Kimber and I were both on the old W3C XML Working Group during the development of the Namespaces in XML specification. Late in the process, pressure from outside the WG forced us to make a major change to the specification, angering many of the members. Eliot, who was already pretty unhappy with the Namespaces spec, left; I decided to stay.

Eliot has recently had the grace and integrity to make a posting in which he admits to being wrong about Namespaces and states that he is now, with only a few caveats, a big fan of the spec. He even goes so far as to write the following:

If you’re not using namespaces you should be–I can’t see any excuse for anyone defining any set of XML elements that is not in a namespace. It should be required and it’s too bad that XML, for compatibility reasons, has to allow no-namespace documents.

The context problem

While I wasn’t originally as strongly opposed to Namespaces as Eliot was, I cannot claim to be as strongly in favour now. For me, the biggest problem with the Namespaces spec is the requirement for a context to interpret prefixed names. That’s no biggie as far as XML element and attribute names go:

<foo:bar xmlns:foo="http://www.example.org/foo/" foo:a="b"/>

Here, there’s no doubt that foo:bar stands for “{http://www.example.org/foo/}bar” (or however you want to notate it), while foo:a stands for “{http://www.example.org/foo/}a.”
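As a rough illustration of that expansion, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which happens to report names in the same {URI}local notation used above. The variable names are made up for the example, and the Namespace URI is written with the trailing slash that appears in the expanded names.

```python
import xml.etree.ElementTree as ET

# Same element as in the text, with the Namespace declared on the element itself.
DOC = '<foo:bar xmlns:foo="http://www.example.org/foo/" foo:a="b"/>'

root = ET.fromstring(DOC)

# The parser has already replaced each prefix with its Namespace URI.
expanded_element_name = root.tag
expanded_attribute_names = sorted(root.attrib)

print(expanded_element_name)     # {http://www.example.org/foo/}bar
print(expanded_attribute_names)  # ['{http://www.example.org/foo/}a']
```

The parser can do this without any outside help because the xmlns:foo declaration travels with the element.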

QNames in content and attribute values

What happens, however, when the prefixed name appears in an attribute value or content?

<foo:bar xmlns:foo="http://www.example.org/foo/" foo:a="foo:b">foo:c</foo:bar>

Simply looking at this XML document in isolation, there’s no way to know whether the attribute value “foo:b” and the content “foo:c” are meant as literal strings or as qualified names. The context (the xmlns declaration) is still there to allow software to expand the prefix, but you need something else (an external schema, hard-coded application logic, or a prompt to a human operator) to decide whether it’s safe to expand the name. Any feature that requires the use of schemas to perform basic XML processing should raise red flags.
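A small sketch of the problem, again in Python with the standard library: the parser expands the element and attribute names, but hands back “foo:b” and “foo:c” as plain strings. The helpers parse_with_prefixes and expand_qname are hypothetical names invented for this example, and the flat prefix map is a simplifying assumption that ignores prefix scoping.

```python
import io
import xml.etree.ElementTree as ET

DOC = '<foo:bar xmlns:foo="http://www.example.org/foo/" foo:a="foo:b">foo:c</foo:bar>'


def parse_with_prefixes(xml_text):
    """Parse xml_text and return (root element, {prefix: Namespace URI}).

    The map is flat: a simplification that ignores prefix scoping and
    redeclaration, which a real application would have to track.
    """
    prefixes = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # the first "start" event is the document element
    return root, prefixes


def expand_qname(value, prefixes):
    """Expand 'prefix:local' to '{uri}local'; return other strings unchanged."""
    prefix, colon, local = value.partition(":")
    if not colon or prefix not in prefixes:
        return value
    return "{%s}%s" % (prefixes[prefix], local)


root, prefixes = parse_with_prefixes(DOC)

# The parser leaves both values untouched: they are just strings to it.
raw_attribute = root.get("{http://www.example.org/foo/}a")
raw_content = root.text

print(raw_attribute, raw_content)             # foo:b foo:c
print(expand_qname(raw_attribute, prefixes))  # {http://www.example.org/foo/}b
```

Nothing in the document tells the code whether calling expand_qname on these values is the right thing to do; that decision has to come from somewhere outside the XML.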

QNames in XPath expressions

The biggest problem, however, comes with referring to parts of an XML document in non-XML syntax. Consider the following XPath expression:

//foo:bar/@foo:a

Unlike the XML document, this expression does not provide any way to expand the foo: prefix. It needs some kind of external context. That means that you can never simply pass this around as a string argument in a programming language, for example, without also passing around a whole set of Namespace declarations. Namespace processors cannot safely discard prefixes, because they might still be important later on. XML transformation filters have to try to preserve original prefixes whenever possible. In short, in non-trivial XML processing, the distinction between the Namespace prefix and the Namespace URI quickly becomes blurred. And this is not simply a problem for tool makers — it’s one that bites developers, script writers, database administrators, and even information authors.
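To make the dependence on external context concrete, here is a hedged sketch using the limited XPath subset in Python's xml.etree.ElementTree. That subset cannot select attributes with a step like /@foo:a, so the example searches only for the element and reads the attribute separately. The wrapper root element and the helper try_find are assumptions added for the illustration.

```python
import xml.etree.ElementTree as ET

FOO_NS = "http://www.example.org/foo/"
# A wrapper element is added so that a descendant search has something to find.
DOC = '<root><foo:bar xmlns:foo="%s" foo:a="b"/></root>' % FOO_NS
root = ET.fromstring(DOC)


def try_find(path, prefix_map=None):
    """Return matching elements, or None if a prefix in path cannot be expanded."""
    try:
        return root.findall(path, prefix_map)
    except SyntaxError:
        # ElementTree raises SyntaxError when a prefix has no mapping.
        return None


# The bare string is not enough: the foo: prefix means nothing on its own.
bare = try_find(".//foo:bar")

# The same string works once a prefix map is passed along with it.
with_map = try_find(".//foo:bar", {"foo": FOO_NS})

# The prefix itself is arbitrary; only the URI it maps to matters.
other_prefix = try_find(".//x:bar", {"x": FOO_NS})

# Spelling out the full URI removes the need for any external context.
full_uri = try_find(".//{%s}bar" % FOO_NS)

print(bare)                                   # None
print(len(with_map), len(other_prefix))       # 1 1
print(full_uri[0].get("{%s}a" % FOO_NS))      # b
```

The last query is self-contained, at the cost of a much longer expression; the prefixed forms are shorter but useless without the map that travels beside them.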

Namespaces if necessary, but not necessarily Namespaces

I don’t know an easy fix for this (perhaps including the full Namespace URI in XPath expressions would have been smarter), but given all of this hassle, I cannot agree with Eliot that Namespaces should always be mandatory. Where Namespaces are not needed for disambiguation — where an XML document isn’t meant to be published to the web for general use — avoiding Namespaces (or at least, using them sparsely) removes a huge amount of complexity from XML development, authoring, and information management. A script kiddie, for example, can easily write PHP code to deal with non-Namespaces qualified XML documents, but may quickly fall out of his or her depth once we stir Namespaces into the mix.

I do still believe that Namespaces are valuable, and in general, I’m not unhappy with the current specification; however, I also believe that simpler XML markup still has its place for a huge range of applications, especially when the XML document will be used in a specific way and not published to the world at large.

Earthquakes and high tech

Saturday, February 25th, 2006

Ottawa had a little earthquake (magnitude 4.5) yesterday evening at 8:39 pm EST. Ottawa is Canada’s biggest high tech centre (or at least was before the dot.bomb, drawing more investment than Toronto). Like the San Francisco Bay area, Ottawa is built on top of a series of geological fault lines; however, ours never result in worse than a minor tremor every 5-10 years. Our tech industry is (relatively) minor as well. Does the severity of fault lines correlate with high tech success?

Maybe a little danger gives people an edge. Tech people in the Bay area live every day wondering if they’re going to fall into the Pacific tomorrow, and bus ads in San Francisco talk about stocking up on food and flashlights (I don’t think anyone’s ever going to count on timely help from FEMA again). What are we worried about in Ottawa? A bad skating season on the Rideau Canal?

Note to Route 128 companies: to find the edge you’ll need to compete seriously with the Bay area, you’ll have to come up with a looming natural disaster. A mega tsunami caused by a volcano in the Canary Islands might fit the bill.

Two Web Services Questions (what actually works?)

Thursday, February 23rd, 2006

My biggest frustration with the current Web Services debate (triggered innocently in a posting by Don Box, with followups by nearly everyone) is the lack of verifiable information. We need a big, independent study to answer two important questions about each part of the WS-* stack:

  1. Does it actually work as specified in each individual implementation?
  2. Does it actually work as specified across many different implementations?

Any WS-* feature that receives a ‘no’ answer to either of these questions is excluded from the debate — WS advocates cannot credibly claim that WS-* is more appropriate for complex, enterprise interfaces unless the complex enterprise features actually work, portably.

On the other hand, any WS-* feature that receives a ‘yes’ answer to both of these questions needs to be taken seriously by the REST advocates. They’ve gotten used to throwing mud at WS-*, assuming that everything is broken; where the WS people have managed to get something working robustly and portably, let’s at least start by giving them the benefit of the doubt that they might have solved a real business problem.

Remembering the Y2K panic

Monday, February 20th, 2006

Steven Levitt (of Freakonomics fame) has started a small controversy by casually mentioning that the Y2K crisis was a false prophecy (his more detailed followup posting is here; he also points to a paper that I didn’t bother reading, but that probably does a better job than my posting of going over the issue).

While I never advertised myself as a Y2K consultant, I made money from the Y2K panic like everyone else in IT: even if I didn’t do Y2K projects directly, systems were being replaced early because of Y2K, IT departments were getting bigger budgets and spending on whatever they wanted, etc. And like many (most?) people reading this weblog, I went out of my way to try to explain to my customers at every opportunity why the Y2K threat was exaggerated.

The logic was simple: the scare stories in the press talked about everything shutting down at midnight on December 31, 1999, but in fact, times and dates in IT systems are much more complicated than that: information and events go through lifecycles that have starts, ends, and often many stages in between. Here are some examples:

  • If you took out a 20-year mortgage in 1980, the expiry date would have been 2000.
  • If you were 55 in 1990, you would have been 65 in 2000.
  • If you received a new credit card with a five-year term in 1995, the expiry date would have been 2000.
  • When your credit card bill arrived on 15 December 1999, payment was probably due in 2000.

So how many of you received notices in 1981 that your mortgages were 81 years overdue? Or how many of you received pension benefits for 156-year-olds in 1991? How many of you found that your credit cards were declined in 1996 because they were 96 years past expiry? Or how many of you were charged 99 years’ interest for an unpaid credit-card bill in 2000?
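The arithmetic behind the mortgage and credit-card figures can be sketched in a few lines of Python. This assumes a system that keeps only the last two digits of each year and subtracts them naively; the function names are hypothetical and not taken from any real system.

```python
def years_overdue_two_digit(current_year, due_year):
    """Naive comparison that keeps only the last two digits of each year."""
    return (current_year % 100) - (due_year % 100)


def years_overdue_four_digit(current_year, due_year):
    """The same comparison using full four-digit years."""
    return current_year - due_year


# A 20-year mortgage taken out in 1980 expires in 2000, stored as "00".
print(years_overdue_two_digit(1981, 2000))   # 81
print(years_overdue_four_digit(1981, 2000))  # -19, i.e. not overdue at all

# A credit card issued in 1995 with a five-year term also expires in "00".
print(years_overdue_two_digit(1996, 2000))   # 96

# A bill issued in December 1999 with payment due in 2000.
print(years_overdue_two_digit(1999, 2000))   # 99
```

A system with this flaw would have started misbehaving as soon as it stored its first date in 2000, which for long-lived records was decades before the rollover itself.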

Of course, some of these things did happen to some people in the decades leading up to Y2K, but only very rarely — rarely enough, in fact, that every case was considered newsworthy. 2000 was going to be the peak of a curve that started decades before and ended decades after, but since the curve was still so close to zero by the 1990s, it was obvious to anyone who cared to spend time thinking (even a statistical numbskull like me) that the Y2K consultants screaming doom and gloom were either not fully competent or not fully honest. It was important, of course, to check the most critical systems, like hospital equipment or nuclear power plants, but Y2K was hardly going to be a real operational problem for most organizations.

Those same consultants defend themselves now, of course, by claiming that they averted a catastrophe, but that is trivially easy to disprove: countries that spent very little on Y2K preparedness, like France, had no more problems than countries that spent a lot, like the U.S. and Canada. Of course, France benefitted from some spill-over from the North American IT work, but there still should have been a significant, measurable difference between the two. There wasn’t. QED.

Hire Bob

Monday, February 13th, 2006

Bob DuCharme, author of many successful books and a long-time XML expert, is leaving Lexis-Nexis. If you’re looking to hire a senior XML person with good name recognition, you might want to make their loss into your gain.

The big recycling pile

Saturday, February 11th, 2006

A couple of times every month, I open all the bills, bank statements, investment statements, and government correspondence for my three (!!) corporations and blast through the paperwork (somehow, I always seem to open cheques from customers a bit more promptly). I’ve read marketing books that suggest businesses should use invoices and other business correspondence as an opportunity to market to customers, and the banks, phone companies, and even the government have taken this to heart — when I open a typical envelope, I’ll pull out a 1-2 page statement, then dump the envelope and several pages of brochures and newsletters into the recycling pile.

Am I an anomaly for not reading those? I discard them just as fast as I discard spam email, except that in the case of spam, I at least have to scan the subject lines first. For the junk inserts, all I have to do is feel the glossy paper under my fingers, or catch a glimpse of a smiling model staring off the page, and my arm reflexively tosses them; even easier, once I’ve pulled out the actual statement or invoice, I know that everything else is junk, and don’t need to examine it at all.

How are marketers ever going to reach people when we’ve developed such good, and even casual, defences against them, both online and in print?

The Curse of the Tin Woodman

Friday, February 10th, 2006

In L. Frank Baum’s book The Wonderful Wizard of Oz (Project Gutenberg), the Tin Woodman was originally a human named Nick Chopper. In an effort to prevent his marrying his sweetheart, the Wicked Witch of the East cursed his axe so that it would cut off part of his body every time he tried to chop wood. Nick lost his limbs one by one, only to have them replaced with metal versions by a friendly tinsmith. Eventually he lost his head and trunk as well, and had them replaced with tin in the same way.

Buy or build?

Nick’s story might sound painfully familiar to anyone who has spent time working with IT in large enterprises or government. Big organizations will buy a huge, off-the-shelf software system in an attempt to save the cost and risk of building their own, only to replace one part after the other because of lack of scalability, bad performance, bugs, or missing features. They end up with a system that they’ve built almost entirely themselves (at perhaps double the cost of a from-scratch system) but still have to pay royalties to an outside vendor to use.

How to avoid building your own Tin Woodman

Why does this happen? In principle, buying instead of building is a great idea — it lets a company share development costs with many others while concentrating its limited IT resources on its core specialties. This approach works, however, only when a product does something that is well understood and widely implemented (i.e. it’s a commodity). When considering an OTS product instead of building a system from scratch, a company should ask itself two questions:

  1. Do we have a choice of more than one comparable product (preferably following the same open standards)?
  2. Is this particular product already in full-scale production use at at least two or three sites that do the same kind of business, at the same volume, as we do?

If the answer to either of these questions is no, then you’re looking at a potential Tin Woodman: you’ll probably end up chopping off one limb at a time and rebuilding the product yourself, piece by piece. That’s not always a bad idea — companies will often choose to outsource R&D by funding the initial development of a product at another (usually smaller) company and serving as the launch customer — but in those cases, both management and IT know what they’re getting into, and there’s no expectation of simply installing the software, doing a bit of configuration and testing, then going into production in six months.

Early retirement is no fun

Wednesday, February 1st, 2006

Philip Greenspun has a posting about the problems with early retirement. It’s hard for people to sympathise with the problems of a guy who has enough money that he can buy fun airplanes and not work 9-5, but I have to say that nearly every word of his posting rang true for me. Like a lot of people, I did well consulting during the dot.com boom, so when the tech market dried up earlier this decade, I was in a good enough financial position that I could basically stop working and take a two-year sabbatical until the market picked up again.

I imagined that I’d come up with a brilliant business idea, invent something important, or at least figure out what I wanted to do with my life. I did learn to fly and buy a (really cheap, old, slow) airplane, but otherwise, those years stand out as probably the least fun of my life. A bit of leisure, like a bit of chocolate, is nice, but retirement (or, in my case, an extended sabbatical) is like an all-chocolate diet. I’ve been busy again for the last couple of years, and I’m much happier this way. I also find that I’m more creative and get more personal stuff done (exercise, reading, etc.) precisely because I have less time to do it. I’m more organized, more motivated, and, I think, nicer to the people around me.

I no longer dream of early retirement and a life of leisure — work, as long as it’s not stupid or excessive, really is the only path to happiness. +1 for the Puritan work ethic (though we could have done without the Maypole-felling and witch hunts).