Quoderat

Archive for the 'General' Category

The Code Factory, Ottawa, Canada

Monday, June 16th, 2008

Ian Graham, who is well known in the Ottawa tech community because of his involvement with Bar Camps, Demo Camps, etc., has a new start-up called The Code Factory.

Location, location, location …

Located in Ottawa’s downtown core a couple of blocks south of Parliament Hill, The Code Factory has offices for rent, drop-in communal working space with WiFi, and a lot of character (old building, bright with lots of windows, hardwood, rickety old elevator that reminds me of office buildings in SoHo).

No three-year leases

The offices go for around $800-$1,000/month (no long-term lease required), and all have windows or skylights. The open (no cubicles) communal working space costs $5/hour, billed to the nearest half hour, and includes unlimited coffee, cappuccino, or espresso, making it a break-even for coders who spend a lot at Starbucks (think one cappuccino/hour).

Unlimited caffeine

For the communal working space, his target market is coders who work from home, but miss the energy and social life of an office and want to drop in a couple of times a week. Despite all the friendly chatter, I find I’m actually getting more done in four hours there than I do in eight hours at home.

Shut-in no more

At a 45-minute walk from my house, The Code Factory is perfect for me: I’ve worked from home for over 10 years, and the Factory gives me an excuse for a little extra exercise, gets me downtown, and gives me the buzz that comes from working around other coders. I think that Ian has done a great service for the Ottawa tech community (much more so than yet another incubator or government-subsidized fund) and that we’ll see at least one or two successful companies tracing their roots back to the Factory, as well as a lot of happy consultants like me.



Religious wars hit close to home

Friday, December 21st, 2007

Update: I read that the school concert went ahead, with Frosty the Snowman replacing the modified Silver Bells as the token non-religious song on the programme (Frosty makes no reference to any religious holidays).

Both of my children attended Elmdale Public School here in Ottawa from junior kindergarten to grade six. Now, my kids’ alma mater has triggered a nation-wide moral panic by changing the line “it’s Christmas time in the city” to “it’s festive time in the city” in the song Silver Bells for a grade-two and -three concert.

I’ve already gone on record saying that it’s OK to wish me Merry Christmas — I’m as proud of my Christian background as some of my friends and neighbours are of their Jewish, Muslim, Sikh, and Hindu backgrounds — but that’s not what this was all about. The primary choir was already singing songs about Christmas and Hanukkah, and the choir leaders decided to add an additional song that was non-religious. I think that the existing non-religious songs Jingle Bells or Winter Wonderland would have been fine, but they decided to take Silver Bells — an otherwise secular pop song about shopping downtown in a city — and replace the word “Christmas”. Silly? Probably. An attack on Christmas or Christianity? Hardly.

The real attack on Christmas and Christianity

Here are some people who might need help understanding the idea of Christmas and Christianity:

  • the school parent(s) who decided to take this to the media
  • the newspaper columnists who made a primary class holiday concert into a national culture battle
  • the talk radio hosts who urged listeners to go after the school and ended up putting the lives of hundreds of small children at risk
  • the hundreds of people who phoned or e-mailed messages of hatred (and a bomb threat) to the nice women working in the school office

According to the Christian New Testament, Jesus didn’t have anything good to say about people like this — he far preferred the company of prostitutes and tax collectors to the religious self-righteous. If you are religious (any religion), pray, meditate, or just hope that their hearts can still be opened this season.

Acer technical support phone number

Thursday, November 15th, 2007

For North America, it’s 1-800-816-2237.

The phone number is not available anywhere on Acer’s support sites (supposedly, it’s hidden somewhere in the Windows XP control panel, but you can’t get that with a broken computer), so I thought a nice, Google-friendly posting might be in order to help anyone else looking at a broken Acer notebook.

Probably because the number’s so well hidden, I got through instantly to a human being, who (a) seemed to be based in North America rather than overseas, and (b) knew what he was talking about, or at least was able to get answers.

Ubuntu gutsy is about to mess up

Monday, September 17th, 2007

Ubuntu — my favorite distro of my favorite OS — is about to mess up. The next official release, Gutsy Gibbon, is scheduled for release in a month.

In an attempt to out-cool Vista and OSX, they’re switching over to compiz as the default window manager on systems with 3D hardware support, to enable all kinds of 3D effects for windows and dialogs. Unfortunately, that leads to two big problems:

  1. Even on a fast machine (I’m running on a 2.2 GHz dual-core), there are long pauses/freezes while doing things like typing into OpenOffice or entering info into dialog boxes (including lost info typed into dialogs) — I actually uninstalled the 3D driver to make my machine usable, before I realized that compiz was the problem.

  2. For machines with Nvidia cards, X windows will crash (using the current Nvidia binary drivers) if you run any other 3D app under compiz.

Ubuntu has a well-deserved reputation as the Linux distro that just works out of the box — on desktop machines, at least, it’s generally easier to install than Windows — and giving all that away in gutsy for a bit of dubious eye candy looks like a bad move. People who want compiz can enable it with a single click in gutsy’s GUIs.

Thinking about structure

Sunday, January 28th, 2007

Douglas Crockford left an excellent comment on my recent posting All markup ends up looking like XML, which he later made into its own blog posting, For the trees. I agree with his reworking of the structure: given the data that I provided, the JSON, LISP, and XML markup all could have been simpler.

If he’s right about the examples, though, he’s wrong about two things. First, my posting doesn’t represent any kind of softening to JSON among its opponents in the XML community, simply because I’ve never been one of those opponents. Second, I spend at least one order of magnitude more time working with SQL and programming languages (not processing XML) than I do with XML, so if anything, my perspective on XML would likely be tainted by them rather than the other way around. Instead, I think the examples were complicated because I built for tomorrow instead of today.

Tomorrow

So what might tomorrow look like for an application dealing with names? Consider, for example, this XML markup, moving gender out of the element/property name as Doug suggests, and eliminating the other attributes (since they don’t add much to the discussion):

<names>
  <name gender="male"><surname>Saddam</surname> Hussein</name>
  <name gender="female">Susan B. <surname>Anthony</surname></name>
  <name gender="male">Al <surname>Unser</surname> Jr.</name>
  <name gender="male">Don Alonso <surname>Quixote</surname>
    de la Mancha</name>
</names>

It’s surprisingly messy breaking each name down into a simple property list. If we tried the approach Doug used for my simpler examples, we’d end up with this (note that this is a list of names, not of people):

{"names": [
    {"gender": "male", "given-name": "Hussein", "surname": "Saddam"},
    {"gender": "female", "given-name": "Susan B.", "surname": "Anthony"},
    {"gender": "male", "given-name": "Al Jr.", "surname": "Unser"},
    {"gender": "male", "given-name": "Don Alonso Quixote de la",
      "surname": "Mancha"}
]}

This list needs a bit of patching. First, if we reconstruct the names as strings, we don’t want to end up with “Hussein Saddam” instead of “Saddam Hussein”, so we’ll have to add a property specifying whether the surname comes first or last:

{"gender": "male", "given-name": "Hussein", "surname": "Saddam",
  "surname-after-given-name": false}

Great — that’s all we need to fix that, and now we know to print “Saddam Hussein”. Now, let’s look at Susan — there’s no problem recreating the string “Susan B. Anthony” from these properties, but we probably should rename the property given-name to given-names, just to avoid confusion:

{"gender": "female", "given-names": "Susan B.", "surname": "Anthony",
  "surname-after-given-names": true}

Al Unser Jr. is a bit trickier, because there was no obvious place to put the “Jr.”. Strictly speaking, it’s neither a given name nor a surname, so for now, let’s just call it a postfix (although that assumes a physical position that might not apply to all languages):

{"gender": "male", "given-names": "Al", "surname": "Unser",
  "surname-after-given-names": true, "postfix": "Jr."}

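Here is a minimal Python sketch of the reconstruction step, assuming the property names used so far (with given-names used for every record, including the first one). The format_name helper, and its fallback to given-names-first order when the flag is missing, are illustrative choices of mine rather than part of any library.

```python
def format_name(props):
    """Rebuild a display string from one name's property list."""
    given = props["given-names"]
    surname = props["surname"]
    # Assumption: fall back to given-names-first order when the flag is absent.
    if props.get("surname-after-given-names", True):
        parts = [given, surname]
    else:
        parts = [surname, given]
    # "postfix" is optional and always goes last in this sketch.
    if "postfix" in props:
        parts.append(props["postfix"])
    return " ".join(parts)


names = [
    {"gender": "male", "given-names": "Hussein", "surname": "Saddam",
     "surname-after-given-names": False},
    {"gender": "female", "given-names": "Susan B.", "surname": "Anthony",
     "surname-after-given-names": True},
    {"gender": "male", "given-names": "Al", "surname": "Unser",
     "surname-after-given-names": True, "postfix": "Jr."},
]

for props in names:
    print(format_name(props))
```

Every new kind of name part needs another branch in the helper, which is the brittleness that starts to show with the next example.
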
Don Quixote, however, forces us to reconsider some of our assumptions, because “Don” is not a given name but an honorific. Assuming, however, that we don’t care whether it’s a name or an honorific, let’s just call it prefix for now, to go with postfix:

{"gender": "male", "prefix": "Don", "given-names": "Alonso",
  "surname": "Quixote", "surname-after-given-names": true,
  "postfix": "de la Mancha"}

Finally, just to throw a wrench into things, let’s assume that our list might contain things other than names, so that we need to add a type property:

{"type": "name", "gender": "male", "prefix": "Don",
  "given-names": "Alonso", "surname": "Quixote",
  "surname-after-given-names": true, "postfix": "de la Mancha"}

Granted, that sort-of works, but it’s really not very nice, and it’s extremely brittle: there are names with extra words in the middle (such as “de”) that are properly not part of the given names or the surname, for example. Then again, why overtag it? Perhaps we don’t need to know what’s a given name or honorific, as long as we can distinguish the surname. One possibility is simply to break it down to four properties:


{"type": "name", "gender": "male", "presurname": "Don Alonso",
  "surname": "Quixote", "postsurname": "de la Mancha"}

While I’m a big fan of Agile development in principle, however, I’ve worked on enough broken legacy systems to leave a little wiggle room for future requirements, like, say, a need to isolate the primary given name for a mail merge or index, even if we’re not going to isolate it right now. Fortunately JSON, like XML, has a natural ability to represent ordered information much more elegantly — let’s make the name into an ordered array:

{"type": "name", "gender": "male",
  "value": ["Don Alonso", {"type": "surname", "value": "Quixote"}, "de la Mancha"]}

This approach provides us with almost limitless flexibility (for example, if we start isolating honorifics, we can deal with a language where the honorific comes at the end of the name with no extra trouble), and is just as simple and easy to read as the much less flexible presurname/postsurname approach. Building for today is great, but if you have a choice between two roughly equivalent approaches where one provides an easy future upgrade path and the other doesn’t, which is the best choice? JSON is new enough that the JSON community hasn’t yet had to deal much with the life cycle of information — once enough people have built apps relying on specific JSON formats, it will be very, very hard to make any changes: v.2 of any popular data format generally results in enormous costs (in money and goodwill), and v.3 rarely happens.
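
To make the upgrade-path argument concrete, here is a small Python sketch that reads the ordered-array form of the name. The helper names (part_text, full_name, parts_of_type) are hypothetical. The point is only that a consumer written today keeps working if more members are typed later.

```python
import json

doc = json.loads("""
{"type": "name", "gender": "male",
  "value": ["Don Alonso", {"type": "surname", "value": "Quixote"}, "de la Mancha"]}
""")


def part_text(part):
    """Return the text of one member: a bare string or a typed object."""
    return part if isinstance(part, str) else part["value"]


def full_name(name):
    """Join the members in document order."""
    return " ".join(part_text(p) for p in name["value"])


def parts_of_type(name, wanted):
    """Collect the text of every typed member with the given type."""
    return [p["value"] for p in name["value"]
            if isinstance(p, dict) and p.get("type") == wanted]


print(full_name(doc))                 # Don Alonso Quixote de la Mancha
print(parts_of_type(doc, "surname"))  # ['Quixote']
```

If a later version wraps “Don” in an object of type honorific, full_name needs no change, and parts_of_type(doc, "honorific") starts returning it.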

Some people might prefer to shorten the above example a bit by following a simple convention: the first member of each array is a label, the second is a map with properties describing the rest of the array, and the remainder is the value, where order may be significant:

["name", {"gender": "male"},
  "Don Alonso", ["surname", {}, "Quixote"], "de la Mancha"]

That is trickier to dump straight into a data structure or database table, but it’s a much more natural way to represent the information, and a lot easier to read on the screen. And just in case it doesn’t look familiar, compare:

<name gender="male">Don Alonso <surname>Quixote</surname>
  de la Mancha</name>
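
The correspondence is mechanical enough to sketch in a few lines of Python using the standard library’s ElementTree. The to_element function is a hypothetical helper of mine, and I am assuming that whitespace between parts is carried inside the strings themselves (so "Don Alonso " ends with a space), since the array convention says nothing about it.

```python
import xml.etree.ElementTree as ET


def to_element(node):
    """Convert [label, attributes, *content] into an Element."""
    label, attrs, *content = node
    elem = ET.Element(label, attrs)
    last_child = None
    for item in content:
        if isinstance(item, str):
            # Text before the first child is .text; later text is the
            # .tail of the preceding child (ElementTree's mixed-content model).
            if last_child is None:
                elem.text = (elem.text or "") + item
            else:
                last_child.tail = (last_child.tail or "") + item
        else:
            last_child = to_element(item)
            elem.append(last_child)
    return elem


name = ["name", {"gender": "male"},
        "Don Alonso ", ["surname", {}, "Quixote"], " de la Mancha"]

print(ET.tostring(to_element(name), encoding="unicode"))
# <name gender="male">Don Alonso <surname>Quixote</surname> de la Mancha</name>
```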

If your information isn’t this complicated, JSON, XML, or LISP can be simple, as Doug pointed out; the XML could just as easily be:


<name gender="male" presurname="Don Alonso" surname="Quixote"
  postsurname="de la Mancha"/>

The reason you don’t see that much is not because XML people never thought of it — read the xml-dev archives from ten years ago to read megabytes of discussion — but because it kept breaking in production systems as soon as the customer (or users) thought of a new requirement. When the information gets complicated, as I pointed out, there’s a bit of a tendency for all markup to end up looking like XML; when the information is simple, of course, XML can just as easily look like JSON or LISP.

All markup ends up looking like XML

Wednesday, January 3rd, 2007

In the current JSON vs. XML debate (see Bray, Winer, Box, Obasanjo, and many others), there are three things that are important to understand:

  1. There is no information that can be represented in an XML document that cannot be represented in a JSON document.
  2. There is no information that can be represented in a JSON document that cannot be represented in an XML document.
  3. There is no information that can be represented in an XML or JSON document that cannot be represented by a LISP S-expression.

They are all capable of modeling recursive, hierarchical data structures with labeled nodes. Do we have a term for that, like Turing completeness for programming languages? It would certainly be convenient in discussions like this.
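
One way to see the equivalence is to hold a tree of labeled nodes in memory and write it out in each syntax. The following Python sketch does that for the simplest case (labels and ordered children only). The Node class and the three serializer functions are hypothetical names, and the JSON and S-expression layouts are just one possible convention.

```python
import json
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class Node:
    """A labeled node whose children are strings or other nodes."""
    label: str
    children: list = field(default_factory=list)


def to_xml(node):
    # Assumes the label is a legal XML element name.
    inner = "".join(escape(c) if isinstance(c, str) else to_xml(c)
                    for c in node.children)
    return f"<{node.label}>{inner}</{node.label}>"


def to_json_value(node):
    # One-key object: the label maps to the ordered list of children.
    return {node.label: [c if isinstance(c, str) else to_json_value(c)
                         for c in node.children]}


def to_sexpr(node):
    # json.dumps is borrowed here only to produce a double-quoted string.
    items = [json.dumps(c) if isinstance(c, str) else to_sexpr(c)
             for c in node.children]
    return "(" + " ".join([node.label] + items) + ")"


tree = Node("names", [Node("name", ["Anna Maria"]),
                      Node("name", ["Fitzwilliam"]),
                      Node("name", ["Maurice"])])

print(to_xml(tree))
print(json.dumps(to_json_value(tree)))
print(to_sexpr(tree))
```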

Syntactic sugar

The only important differences among the three are the size of the user base (and opportunity for network effects), software support, and syntactic convenience or inconvenience. The first two are fickle — where are the Pascal programmers of yesteryear? — so let’s concentrate on syntax. Here’s a simple list of three names in each of the three representations:

<!-- XML -->
<names>
  <name>Anna Maria</name>
  <name>Fitzwilliam</name>
  <name>Maurice</name>
</names>
/* JSON */
{"names": ["Anna Maria", "Fitzwilliam", "Maurice"]}
;; LISP
'(names "Anna Maria" "Fitzwilliam" "Maurice")

Nearly all comparisons between XML and JSON look something like this, and I have to admit, it’s a slam dunk — in an example like this, XML seems to go out of its way to violate Larry Wall’s second slogan: “Easy things should be easy and hard things should be possible.” On the other hand, I rarely see any data structures that are really this simple, outside of toy examples in books or tutorials, so a comparison like this might not have a lot of value; after all, I could have written the XML like this:

<names>Anna Maria, Fitzwilliam, Maurice</names>

Let’s dig a bit deeper and see what we find.

Node labels

In the previous example, I made some important assumptions: I assumed that the node label for the individual names (“name”) didn’t matter and could be omitted from the JSON and LISP, and I assumed that the node label for the entire list (“names”) was a legal XML and LISP identifier. Let’s break both of those assumptions now, and make the label for the list “names!” and the labels for the items “male-name” or “female-name”. Here’s what we can do now to handle this in XML, JSON, and LISP:

<!-- XML -->
<list label="names!">
  <female-name>Anna Maria</female-name>
  <male-name>Fitzwilliam</male-name>
  <male-name>Maurice</male-name>
</list>
/* JSON */
{"names!": [
  {"female-name": "Anna Maria"},
  {"male-name": "Fitzwilliam"},
  {"male-name": "Maurice"}]}
;; LISP
'(names!
  (female-name "Anna Maria")
  (male-name "Fitzwilliam")
  (male-name "Maurice"))

XML is forced to use a secondary syntactic construction (an attribute value) to represent the top-level label, because it no longer matches XML’s syntactic rules for element names. LISP can still use names! as a token, and JSON doesn’t notice, because it has been using a string all along; XML syntax is convenient for trees of labeled nodes only when the labels are heavily restricted. That aside, however, note that as soon as we add any non-trivial complexity to the information (as soon as we assume that node labels matter), all three formats start to look a little more like XML.
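
The label restriction is easy to check with a parser. This Python sketch uses only the standard json and xml.etree.ElementTree modules. The accepted variable is just a flag I introduce for the demonstration.

```python
import json
import xml.etree.ElementTree as ET

# JSON labels are ordinary strings, so "names!" needs no special handling.
data = json.loads('{"names!": ["Anna Maria", "Fitzwilliam", "Maurice"]}')
print(list(data))  # ['names!']

# "!" is not allowed in an XML element name, so the parser rejects it.
try:
    ET.fromstring("<names!>Anna Maria</names!>")
    accepted = True
except ET.ParseError as err:
    accepted = False
    print("rejected:", err)

# The workaround: move the label into an attribute value.
root = ET.fromstring('<list label="names!"><male-name>Maurice</male-name></list>')
print(root.get("label"))  # names!
```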

Additional node attributes

Now, let’s add the next wrinkle, by allowing additional attributes (beside a label) for each node. In this case, we’re going to add a “lang” (language) attribute to each of the nodes:

<!-- XML -->
<list label="names!">
  <female-name xml:lang="it">Anna Maria</female-name>
  <male-name xml:lang="en">Fitzwilliam</male-name>
  <male-name xml:lang="fr">Maurice</male-name>
</list>
/* JSON */
{"names!": [
  {"female-name": [{"lang": "it"}, "Anna Maria"]},
  {"male-name": [{"lang": "en"}, "Fitzwilliam"]},
  {"male-name": [{"lang": "fr"}, "Maurice"]}]}
;; LISP
'(names!
  (female-name (((lang it)) "Anna Maria"))
  (male-name (((lang en)) "Fitzwilliam"))
  (male-name (((lang fr)) "Maurice")))

Now, while XML is still using an ad-hoc convention to represent the “names!” label, JSON and LISP are forced to use ad-hoc conventions to represent attribute lists (a dictionary list for JSON, and an a-list for LISP). It’s also worth noting that JSON and LISP now look so much like XML, both in length and complexity, that it’s hardly possible to distinguish them. Node attributes are not esoteric; they’re the basis of such simple things as hyperlinks.
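
Because the JSON attribute convention is ad hoc, every consumer has to know it. A minimal Python sketch of such a consumer follows, assuming the layout used in the JSON example (a one-key object whose value is an array with the attribute dictionary first). The unpack function is a hypothetical helper.

```python
import json

doc = json.loads("""
{"names!": [
  {"female-name": [{"lang": "it"}, "Anna Maria"]},
  {"male-name": [{"lang": "en"}, "Fitzwilliam"]},
  {"male-name": [{"lang": "fr"}, "Maurice"]}]}
""")


def unpack(node):
    """Split {label: [attributes, *content]} into its three parts."""
    (label, body), = node.items()   # exactly one key per node
    attributes, *content = body
    return label, attributes, content


for node in doc["names!"]:
    label, attributes, content = unpack(node)
    print(label, attributes["lang"], content)
```

An XML parser hands back the label, attributes, and content as separate things without any such agreement between writer and reader.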

Data typing

XML certainly looks better for the attributes, but now let’s jump to data typing. Let’s assume that there is a country where people use real numbers as names, and we need to find a way to distinguish names that are real numbers from names that just happen to look like real numbers (say, a person named “1.7” in a country where names are strings). JSON and LISP can make that distinction naturally using first-class syntax, while XML has to use a different standard that is not part of the core language:

<!-- XML -->
<list label="names!" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <female-name xml:lang="it">Anna Maria</female-name>
  <male-name xml:lang="en">Fitzwilliam</male-name>
  <male-name xml:lang="fr">Maurice</male-name>
  <female-name xsi:type="xsd:float" xml:lang="de">7.9</female-name>
</list>
/* JSON */
{"names!": [
  {"female-name": [{"lang": "it"}, "Anna Maria"]},
  {"male-name": [{"lang": "en"}, "Fitzwilliam"]},
  {"male-name": [{"lang": "fr"}, "Maurice"]},
  {"female-name": [{"lang": "de"}, 7.9]}]}
;; LISP
'(names!
  (female-name (((lang it)) "Anna Maria"))
  (male-name (((lang en)) "Fitzwilliam"))
  (male-name (((lang fr)) "Maurice"))
  (female-name (((lang de)) 7.9)))

XML loses badly on this particular example; however, if the extra data were (say) a date or currency, we would have to make up an ad-hoc way to label its type in JSON and LISP as well, since they have no special syntax to distinguish a date or monetary value from a regular number or string. For anything other than simple numeric data types, this one’s actually a draw.
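
Here is a short Python sketch of what the difference means for code that reads the data. It spells the annotation the way XML Schema defines it (the xsi:type attribute with the value xsd:float). The typed_value helper is hypothetical, and as a simplification it compares the prefix literally instead of resolving it against the namespace declarations.

```python
import json
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# JSON: the parser itself tells the number and the string apart.
numeric, textual = json.loads('[7.9, "7.9"]')
print(type(numeric).__name__, type(textual).__name__)  # float str

# XML: element content is always text, so the reader has to look at
# the xsi:type annotation and convert the value itself.
elem = ET.fromstring(
    '<female-name xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:type="xsd:float" xml:lang="de">7.9</female-name>'
)


def typed_value(element):
    """Convert the text to float only when it is annotated as xsd:float."""
    # Simplification: the "xsd" prefix is matched literally, not resolved.
    if element.get(XSI_TYPE) == "xsd:float":
        return float(element.text)
    return element.text


print(repr(typed_value(elem)))  # 7.9
```

A date or a currency amount would need the same kind of explicit conversion step on the JSON side too.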

Mixed content

And now, finally, for mixed content. I will add surnames to all of the (non-numeric) names in the list, and (here’s the kicker) will put those in their own labeled nodes:

<!-- XML -->
<list label="names!" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <female-name xml:lang="it">Anna Maria <surname>Mozart</surname></female-name>
  <male-name xml:lang="en">Fitzwilliam <surname>Darcy</surname></male-name>
  <male-name xml:lang="fr">Maurice <surname>Chevalier</surname></male-name>
  <female-name xsi:type="xsd:float" xml:lang="de">7.9</female-name>
</list>
/* JSON */
{"names!": [
  {"female-name": [{"lang": "it"}, "Anna Maria", {"surname": "Mozart"}]},
  {"male-name": [{"lang": "en"}, "Fitzwilliam", {"surname": "Darcy"}]},
  {"male-name": [{"lang": "fr"}, "Maurice", {"surname": "Chevalier"}]},
  {"female-name": [{"lang": "de"}, 7.9]}]}
;; LISP
'(names!
  (female-name (((lang it)) "Anna Maria" (surname "Mozart")))
  (male-name (((lang en)) "Fitzwilliam" (surname "Darcy")))
  (male-name (((lang fr)) "Maurice" (surname "Chevalier")))
  (female-name (((lang de)) 7.9)))

Character for character, the JSON and LISP are still shorter, but the difference is not nearly as dramatic as it was in the very first example. In fact, typing all of these examples by hand, I find myself appreciating the redundant end tags on the XML parts, because it’s getting very hard to keep track of all the closing “]”, “}” and “)” for JSON and LISP.
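
Mixed content also shows up on the reading side. As a rough illustration, this Python sketch walks one of the XML names with the standard library’s ElementTree, which stores the text around child elements in .text and .tail. The content function is a hypothetical helper that flattens that back into an ordered list, much like the JSON array.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring(
    '<male-name xml:lang="en">Fitzwilliam <surname>Darcy</surname></male-name>'
)


def content(element):
    """List mixed content in document order: strings and (label, text) pairs."""
    items = []
    if element.text:
        items.append(element.text)
    for child in element:
        items.append((child.tag, child.text))
        # Text that follows a child element lives in that child's .tail.
        if child.tail:
            items.append(child.tail)
    return items


print(content(elem))  # ['Fitzwilliam ', ('surname', 'Darcy')]
```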

No silver bullet

There are a few morals here. First, with markup, as with coding, there’s no silver bullet. JSON (and LISP) have the important advantage that they make the most trivial cases easy to represent, but as soon as we introduce even the slightest complexity, all of the markup starts to look about equally verbose. That means that the real problems we have to solve with structured data are no longer syntactic, and anyone trying to find a syntactic solution to structured data is really missing the point: JSON, XML (and LISP) people would be best making common cause to start dealing with more important problems than whether we use braces, pointy brackets, or parentheses. That’s why I was excited to have JSON inventor Doug Crockford speak at XML 2006, and why I hope that we’ll get more submissions about JSON as well as XML for 2007.

Personally, I like XML because it’s familiar and has a lot of tool support, but I could easily (and happily) build an application based on any of the three — after all, once I stare long enough, they all look the same to me.

XML 2006 proceedings due today

Monday, December 18th, 2006

If you gave a regular presentation at XML 2006, I’d like to remind you that your proceedings, slides and/or text, are due today (PDF or XHTML format, please). If you gave a keynote or participated in a panel or Masters Series and have slides or text to send in, we’d also be happy to put it on the site. Please send your presentation by e-mail to cmills at idealliance dot org.

Some presentations already online

Thanks to the many of you who sent in your presentations before the proceedings deadline. You can look at the conference programme page to see who those people are, since their presentations are highlighted in yellow (and with an asterisk, for text browsers or screen readers).

Fake real-time blog

Finally, for a creative approach to the proceedings, take a look at Rick Jelliffe’s Fake real-time blog from XML 2006: day one.

Newmatica Barcode: privacy policy

Tuesday, October 10th, 2006

[Update: Newmatica is closed.]

Thank you to those of you who have visited and used my new consumer-product-discussion site, Newmatica Barcode, since I announced it on Friday. I had an exciting Canadian Thanksgiving weekend dealing with all the bug reports (especially browser-related) and suggestions, and it is indescribably gratifying seeing real members joining and people entering products and comments.

Following one member’s suggestion, I’ve added a privacy policy to the site. I decided to write something short in reasonably plain (if slightly technical) English, rather than the typical 2,000-word tome of lawyer-ese. I’d be grateful for comments and suggestions about what should or shouldn’t be in it.

MacOS X vs. Ubuntu Linux

Thursday, June 29th, 2006

[Update: Tim O'Reilly writes that not only Apple but Red Hat should be worried about Ubuntu.]

We bought our older daughter a cheap used ThinkPad as a grade 8 graduation present, and I installed Ubuntu Linux on it for her. It was by far the easiest install I’ve ever done for any operating system — everything just worked, including sound, wireless networking, DVD playback (once I downloaded special restricted packages), remote printing, and suspend-to-ram. I did not have to edit a single text config file — I could do all of the setup, even WiFi, entirely through the Ubuntu/Gnome GUIs. [Correction: I did edit /etc/apt/sources.list to add the sources for the restricted packages, but I think I could also have done that through the GUI if I'd wanted to.]

Why not a Mac?

My daughter is familiar with Linux (at least the GUI parts), MacOS X, and Windows XP. We had considered buying her a Mac notebook, but used Macs cost more than new Wintel books of the same capabilities, so that was a non-starter. She doesn’t much like Windows, probably because they make her use it at school, so Ubuntu it was.

I always liked the Mac interface, especially during the early period up to about 1990 when they led the industry in GUI innovation. Contrary to the accepted wisdom, however, before MacOS X I never found Macs to be a particularly stable desktop computing platform, even compared to Windows. For example, earlier versions of MacOS had highly unreliable TCP/IP support and required huge fixed-size memory allocations for applications, problems that other OS’s had long ago fixed — as much as I liked looking at Macs, I always shrank in terror when family members asked me to help fix problems with them. Windows might crash a lot, but at least it crashed in more predictable ways.

Return of the king …

Fortunately, Apple finally addressed these problems a few years ago by admitting that their backend was completely broken, throwing it out, and replacing it with Unix in MacOS X. They’ve even admitted that maybe the one-button mouse wasn’t such a great idea after all. After that the Mac, if still overpriced, was a strong and stable platform: Mac notebooks started to reappear at IT conferences, Mac ports of Open Source software flourished, and everyone in the Mac world was generally happy. Apple even started innovating in GUI design again, after letting Windows take the lead for a decade.

… but the natives are restless

So why are some of the highest-profile Mac users starting to show disaffection? Mark Pilgrim was the first to announce that he was moving to Ubuntu Linux, and he recently posted a very funny list of Ubuntu essentials for ex-Mac users. Next, Tim Bray announced that he was thinking of switching to Ubuntu, though he’s worried about WiFi support and LCD projector support (my experience has been the opposite of Tim’s). Now Cory Doctorow is also planning to move from Mac to Ubuntu. For all of them, I think, one of the big issues is gaining control over their data, which is now locked into Mac proprietary formats that Apple changes at whim.

Cool or cheap?

Do only three defections matter, even if they’re high-profile? Possibly not for Mac usage, but perhaps for Mac price. Historically, Apple has been able to sell its computers for up to a 100% price premium because of the perception that they’re cool — if the mavens suddenly decide that Ubuntu is cooler than MacOS, as seems to be happening, Apple’s price premium could suddenly evaporate.

The end of the Mac-is-cool myth would be as good for Mac users as it would for everyone else (except Apple itself), since cheaper Macs could mean a much larger user base. Would even the hardest-core Mac aficionado complain about a $500 Mac notebook? I didn’t think so.

Giving thanks

Monday, May 15th, 2006

Over on XML.com, David Peterson gives Microsoft some well-deserved thanks for implementing and popularizing the XMLHttpRequest object that’s so useful in modern web development. He also thanks them for not charging for it, but of course, if they had tried to charge, it never would have become popular (from SAX, I know that paradox well).

Omissions

There are a couple of problems with giving thanks to inventors, though. The first is that you inevitably leave people out. David, for example, thanked Microsoft for all of AJAX and modern web development in general. AJAX doesn’t consist solely of XMLHttpRequest, however; it also needs JavaScript and a DOM (both pioneered by Netscape) to manipulate the client display, and something like XML (W3C) or JSON (Douglas Crockford) to encode the messages. Most modern web developers also want CSS (Håkon Wium Lie and Bert Bos). And then, of course, there’s HTML and HTTP (Tim Berners-Lee). To illustrate my point, I’ve certainly left out a lot more that I could have included here, and have likely misassigned at least some credit.

Death of the inventor

The second problem is that it almost never makes sense to assign credit to individual people or companies. Who should get credit for SAX? Me, because I coordinated it? James Clark, because I based many of the ideas on his earlier SGML interfaces (and he suggested many of SAX’s features)? Tim Bray, because he thought up a catchy name? The dozens of other xml-dev members who contributed most of the core ideas? The major software vendors who actually decided to use SAX, giving it credibility outside of the xml-dev community?

The same applies to just about every other technology we use. Not only do they depend on other innovations (the Web without TCP/IP? SAX without XML?), but the successful innovations are almost always simple and obvious, so their main value comes not from any particular technical brilliance but from the brute-force fact that lots of people use them. In other words, community-building is more important than innovation. Microsoft imitated Netscape’s level-0 DOM, and then the W3C standardized it so that it would work across browsers, then browser developers agreed to follow along, then web developers decided it was safe to start using it. Microsoft initially failed to build a community for XMLHttpRequest (which was a proprietary ActiveX component), so it languished mostly unused for years, until other browsers like Mozilla/Firefox, Safari, and Opera decided to support it as well; it was only then that we started to see a real community grow, and high-profile sites like Gmail and Google Maps take off. Tim Berners-Lee’s original HTML would hardly have mattered if the early Mosaic browser hadn’t shown how to make it user-friendly. Etc., etc. While Netscape introduced some good ideas like the DOM and JavaScript, they also introduced some that flopped (does anyone else remember CORBA in the browser?): no community of users, no success.

Thank the users

The moral of the story is that technology success is not something that a person or company gives to the net, but something that comes back from it, as if you threw a stone at a tree without knowing whether an avalanche of silver or of bird dung would shower down from the branches onto your head. A complex, brilliant idea with no users is worthless; a simple, mediocre idea with lots of users is a treasure.

Mobile Web at XTech

Wednesday, May 10th, 2006

Michael Smith has a short post about the Mobile Web Morning at XTech 2006 next week in Amsterdam. I’ve been excited about the mobile web for a long time. Granted, it’s been slow taking off, but with mobile phones as the only form of connectivity (voice or network) for much of the developing world, I think that it’s bound to become hugely important. I’ll be chairing the first session on Friday morning, and hope to see many of you there.

Early retirement is no fun

Wednesday, February 1st, 2006

Philip Greenspun has a posting about the problems with early retirement. It’s hard for people to sympathise with the problems of a guy who has enough money that he can buy fun airplanes and not work 9-5, but I have to say that nearly every word of his posting rang true for me. Like a lot of people, I did well consulting during the dot.com boom, so when the tech market dried up earlier this decade, I was in a good enough financial position that I could basically stop working and take a two-year sabbatical until the market picked up again.

I imagined that I’d come up with a brilliant business idea, invent something important, or at least figure out what I wanted to do with my life. I did learn to fly and buy a (really cheap, old, slow) airplane, but otherwise, those years stand out as probably the least fun of my life. A bit of leisure, like a bit of chocolate, is nice, but retirement — or, in my case, an extended sabbatical — is like an all-chocolate diet. I’ve been busy again for the last couple of years, and I’m much happier this way. I also find that I’m more creative and get more personal stuff done (exercise, reading, etc.) precisely because I have less time to do it. I’m more organized, more motivated, and, I think, nicer to the people around me.

I no longer dream of early retirement and a life of leisure — work, as long as it’s not stupid or excessive, really is the only path to happiness. +1 for the Puritan work ethic (though we could have done without the Maypole-felling and witch hunts).

The v.2 problem

Saturday, December 31st, 2005

Palm Z22 handheld

[Update: in a comment, Mihai Parparita points out that the Graffiti v.2 was changed for legal reasons, not aesthetic, as explained in a Wikipedia article.]

I got a Palm Z22 for Christmas, to replace my old monochrome Palm Vx. I know that most people have moved beyond Palm, but I like a PDA that’s very small and simple. I make heavy use of Laurie Davis’s outstanding free CoPilot app, especially when I’m in the pilot lounge at some distant airport and need to figure out a new route or recalculate time and fuel for a new upper wind forecast.

Don’t want v.2 for Graffiti …

I love a lot about the Z22, but its one huge, ugly wart is Graffiti v.2, which I had never encountered before now (I’m not sure when it first came in). I have known Graffiti v.1 for years, and can enter it as easily as I touch type. Now, a handful of letters have changed rather arbitrarily, and I keep having to bring up the little virtual keyboard. I’m working on learning the new shapes, and I acknowledge that they would have been better choices for v.1 all those years ago, but what real benefit came from fixing them after the fact? A few engineers might have satisfied their perfectionist aesthetic sense, but thousands of existing, experienced Graffiti users were no doubt gratuitously annoyed, just as I am now.

… or for XML

Let’s not do the same thing with XML and other specifications — sure, we made mistakes writing them, but unless the mistakes are huge, why annoy millions of users with tiny, backwards-incompatible changes? We were forced to create SAX 2.* to support the XML Namespaces specification, though we did our best to maintain backwards-compatibility, even where we could have made SAX more elegant with a little tweak here and there — the same thing happened with the DOM people. For XML itself, I agree with Tim Bray that it would be convenient to write a specification that combines XML 1.*, XML Namespaces, and the XML Infoset into a single document (as long as nothing else changed — I wouldn’t follow Tim in removing DOCTYPE, as annoying as it is), but otherwise, my motto for specifications is long live v.1!

Forkability

Thursday, December 29th, 2005

Kurt Cagle has an interesting piece on the term Open Standard and what, if anything, it means. Rather than a definition, I’m more interested in a shibboleth, a single test that can tell us whether source or a standard (or any other intellectual thingy) is open.

How about this: source code or a standard is open only if it can be forked against the objections of the maintainer. At first glance, this looks horrible — forking is usually considered the worst fate for a standard, a loud non-confidence vote in the maintainer — but that’s the point. Just as a true Democracy (in the modern, non-Athenian sense) allows you to throw out your government, a truly open standard or source code base allows you to throw out your maintainer. If the copyright terms, patents, or anything else prevent forking, then a standard or source code base is not open.

Sometimes a fork forces the original maintainer to get in gear. In the world of source code, XEmacs is an excellent example — while the maintainers of Emacs stubbornly refused to add anything but the most minimal support for modern GUIs, the early success of the GUI-fied XEmacs eventually forced them at least partly into the modern world, however reluctantly. Other times, a fork fixes something that is broken. In the world of standards, XML, with a more agile standards process and sharper focus (at least in the early days), forked and then completely replaced SGML. Linux is a stranger kind of fork, stealing all of the utilities that were being designed for Hurd without bothering with Hurd itself.

Like the ballot box for a politician, the fork — or even the threat of it — is what makes maintainers listen.

Of Dilbert and Torture

Friday, December 23rd, 2005

[I normally stick to technical issues on this weblog. This posting is about logic, which is sort-of related to tech; apologies in advance to anyone who came here hoping for a short break from personal pontification about current events.]

Over on The Dilbert Blog, Scott Adams has just declared himself the winner of a debate. He asked the following question:

If you think there’s no moral justification for torture, would you accept the nuclear destruction of NYC (for example) to avoid torturing one known terrorist? (No fair extending my question to more ambiguous hypotheticals.)

Most people who commented objected to the question itself; as a result, today Adams declared himself the winner by a knockout and went on to insult his opponents:

… a scary number of people offered comments that were the logical equivalent of punching themselves unconscious in the first round. I don’t need to point them out because they’re somewhat obvious. The point is that most of those people are eligible to vote.

Let’s put aside the issue of torture, and simply look at the question itself. Adams has structured his question so that whether you answer ‘yes’ or ‘no’, you’re forced first to accept the premise that torture is an effective way to get information — in other words, there’s no way to answer the question directly without agreeing with him. This trick is called the Fallacy of many questions — the classic (somewhat disturbing) example is the question “when did you stop beating your wife” — and in a formal debate, it would result in a severe penalty.

To show how this fallacy distorts an argument, substitute a premise that (I hope) no one reading this posting would agree with, and try to come up with a straight ‘yes’ or ‘no’ answer:

If you think there’s no moral justification for murdering children, would you accept the nuclear destruction of NYC (for example) to avoid pushing one live baby slowly into a wood chipper? (No fair extending my question to more ambiguous hypotheticals.)

I do believe that it’s important to debate all issues openly, even touchy ones such as whether torture is an effective kind of interrogation — I believe that the answer is ‘no’, but in my personal, offline life, I’m not afraid to hear legitimate evidence and reasonable arguments from people who disagree with me. I promise not to introduce any logical fallacies to try to trip those people up.

And I don’t plan an ad hominem attack against Adams either. He seems to be a smart guy, and I enjoy his comics. I’ll look forward to hearing his legitimate arguments on the torture issue.

Mind your colons …

Friday, December 23rd, 2005

… and make friends with a technical writer.

Prescriptive grammarians — the ones who argue that the English language should follow a single standard that is both correct and eternal (at least since Fowler) and attempt to impose that standard on people around them — have generally had, at most, a very limited exposure to serious language study. To put it bluntly, folks, we laugh at you behind your backs. Alexander Pope’s famous quip about dim-witted, self-important critics applies here as well:

A little Learning is a dang’rous Thing;
Drink deep, or taste not the Pierian Spring:
There shallow Draughts intoxicate the Brain,
And drinking largely sobers us again.

Technical writers

To a software engineer, the person who often seems the most drunk on shallow drafts of prescriptive grammar is the technical writer. The engineer sends the tech writer a spec, hoping to have the spelling corrected or the prose tidied a bit, and gets back pages covered in red ink, pointing out apparently minor details like ambiguous pronoun reference, comma splices, and colon usage. Sadly, there do exist dim-witted, self-important technical writers, but in fact, most of them are not closet prescriptive grammarians; instead, they are trying to do two things:

  1. make the phrases, clauses, sentences, and paragraphs consistent and intuitive in the documentation, just as you try to make the class APIs, GUI components, and interfaces consistent and intuitive in the code; and
  2. bridge the gap between engineers, who know a lot about the application, and users, who know little to nothing about it.

Colon usage

As a gift to technical writers, in keeping with the holiday spirit, I’m going to descend a little into the underworld of prescriptive grammar and point out one item that gives tech writers no end of frustration: the use of the colon (:). Take a look at this sentence:

[no]

The three functions are: create, edit, and delete.

Tech writers, copy editors, and English teachers will not accept this use of the colon, any more than a software engineer would accept a method named retrieveAmount beside getDate and getAuthorization. On the other hand, a tech writer would have no objection to this sentence:

[yes]

There are three functions: create, edit, and delete.

Can you spot the difference? If not, here’s another example of colon usage that is unacceptable to most tech writers:

[no]

To enable editing, select:

  • authenticate users,
  • enable backups, and
  • enable page modification.

Without the colon, the example would be perfectly acceptable:

[yes]

To enable editing, select

  • authenticate users,
  • enable backups, and
  • enable page modification.

Here’s an alternative version that is acceptable with a colon:

[yes]

To enable editing, select the following options:

  • authenticate users,
  • enable backups, and
  • enable page modification.

There is a very simple rule of thumb that you can apply: use a colon only if what appears before it could be a sentence on its own. “The three functions are” and “To enable editing, select” cannot stand on their own as sentences; “There are three functions,” “To enable editing, select the following options,” and Pope’s “A little Learning is a dang’rous Thing; Drink deep, or taste not the Pierian Spring” can.

Best practice for punctuation changes fast, and some day (likely soon), this rule of thumb will be completely obsolete. For now, though, why not make a tech writer’s day a little brighter, and mind the colons?

Must-Ignore and Must-Understand

Wednesday, November 16th, 2005

I was listening to Tim Bray’s excellent talk On Language Creation today at the XML 2005 conference in Atlanta. Tim was talking about creating new XML-based markup languages (summary: “please don’t”), and in passing he mentioned the must-ignore/must-understand design pattern. For the first time, it occurred to me that this pattern has a serious flaw.

The pattern

The pattern works this way: you want to let people extend your XML-based language with new elements, and you want to allow forward-compatibility so that systems don’t break if or when you upgrade the language, so it’s usually a good idea to let applications simply ignore what they don’t understand (as is the case with HTML). That’s called must-ignore. For example, if your application sees this XML document

<record>
 <a>xxx</a>
 <b>xxx</b>
 <w>xxx</w>
 <c>xxx</c>
</record>

but it does not understand the w element (maybe you added it to hold extra information for a different application), it will just pretend that the w element wasn’t there, and might process the document as if it read

<record>
 <a>xxx</a>
 <b>xxx</b>
 <c>xxx</c>
</record>

On the other hand, if w contained some kind of crucial information that would change the application’s processing — say, by reversing the outcome or specifying an essential prerequisite (“turn off the oxygen first”) — it would be better to have the application quit and report an error instead of chugging on ahead. That’s called must-understand. Some specifications, like SOAP, actually specify these rules inside the XML instance on an instance-by-instance basis, but most simply frame them in general terms in the specification.
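To make the two behaviours concrete, here is a rough sketch in Python. It is only an illustration: the set of known elements, the MustUnderstandError class, and the process function are all hypothetical names invented for this example, and the must-understand list is passed in as a plain argument rather than read from any real specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the element names this application knows how to handle.
KNOWN = {"a", "b", "c"}

DOC = """<record>
 <a>xxx</a>
 <b>xxx</b>
 <w>xxx</w>
 <c>xxx</c>
</record>"""


class MustUnderstandError(Exception):
    """Raised when a must-understand element is not understood."""


def process(xml_text, must_understand=frozenset()):
    """Collect known child elements, applying must-ignore by default."""
    record = ET.fromstring(xml_text)
    result = {}
    for child in record:
        if child.tag in KNOWN:
            result[child.tag] = child.text
        elif child.tag in must_understand:
            # must-understand: quit and report an error
            raise MustUnderstandError(f"cannot process <{child.tag}>")
        # must-ignore: any other unknown element is silently skipped
    return result
```

With the defaults, process(DOC) behaves as if the w element had never been there; calling process(DOC, must_understand={"w"}) stops with an error instead of chugging on ahead.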

The problem

I realized today, however, that there’s a huge problem with this approach: must-ignore and must-understand are properties of a processing model, not a markup language. Consider an XML language for a business report: if I designate an element as must-understand, what do I really mean?

  1. An application must understand this element to copy this information into a database?
  2. A search engine must understand this element to index it?
  3. A formatting engine must understand this element to generate a PDF?
  4. An XML editing tool must understand this element to open the document?
  5. An XSLT engine must understand this element to do a transformation?
  6. An archiver must understand this element to save the report for auditing purposes (say, Sarbanes-Oxley requirements)?

Each of these represents a different processing model for the same XML document. The must-understand and must-ignore constraints will likely be different for each one, so they’re obviously not properties of the XML-based markup language. Some XML languages, like SOAP and Atom, are specified explicitly as parts of protocols, so the must-understand/must-ignore constraints are part of the protocol specification, but even then, once you have XML, you never know what clever things people will decide to do with it.
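A small Python sketch may help show why the constraint belongs to the processing model. Everything here is assumed for illustration: the report document, the element names, and the three processing models with their "understood" and "must-understand" sets are made up, not taken from any real specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical business report with one extension element.
REPORT = (
    "<report>"
    "<title>Q3</title>"
    "<total>100</total>"
    "<audit-note>restated</audit-note>"
    "</report>"
)

# Hypothetical processing models:
# name -> (elements it understands, elements it must understand)
MODELS = {
    "search indexer": ({"title"}, set()),
    "database loader": ({"title", "total"}, {"audit-note"}),
    "archiver": (set(), set()),
}


def blocking_elements(xml_text, understood, must_understand):
    """Return must-understand elements that this model does not understand."""
    root = ET.fromstring(xml_text)
    return [
        child.tag
        for child in root
        if child.tag in must_understand and child.tag not in understood
    ]


def verdicts(xml_text, models):
    """Decide, per processing model, whether the same document is acceptable."""
    return {
        name: "reject" if blocking_elements(xml_text, understood, required) else "accept"
        for name, (understood, required) in models.items()
    }
```

The same document is accepted by the indexer and the archiver but rejected by the database loader, so a single must-understand flag in the markup could not have given the right answer for all three.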

First mover (dis)advantage

Tuesday, October 25th, 2005

I recently heard from an older computer user who was delighted that his hotel’s free WiFi simply worked with his notebook computer. Internet access on the road didn’t use to be so easy, either for hotels or their guests. Consider these three (hypothetical) hotels:

  1. In 1995, hotel #1 spent a lot of money to redo its digital phone system to make it compatible with computer modems.
  2. In 2000, hotels #1 and 2 spent even more money to run Cat 5 (Ethernet) cable to all of their rooms.
  3. In 2005, hotels #1, 2, and 3 spent much less money to set up a few WiFi hotspots.

A quickie moral would be that hotel #3 came off better, since it ended up in the same place for a fraction of the cost, while the other two suffered from a first mover disadvantage. Reality, of course, is more complicated: hotels #1 and 2 had five years to amortize each of their earlier investments. If those investments allowed them to steal guests from hotel #3, or to charge higher rates, then the investments may well have turned a net profit for the hotels.

The real moral is the one that the extreme programming advocates push: build for today. As long as hotels #1 and 2 were investing in technology that their guests needed right away (rather than at some ill-defined point in the future), they probably came out OK. On the other hand, if a hotel were putting in technology just because, some day, it might be needed, it probably saw that technology superseded before it could bring in any return.

If this moral seems simple and obvious when applied to hotels, then why do architects ignore it sometimes when designing information systems for big enterprise and government? When we sell them on something like WS-* (or a REST-based data architecture), what criteria do we use to figure out whether we’re building for today, or for a tomorrow that may never come?

Sputtering down to XML 2005

Thursday, October 13th, 2005

My creaky little Piper Warrior has been grounded since a lightning strike (while tied-down on the apron) back in July, but the engine’s finally back from overhaul, and I plan to be in the air soon — just in time, in fact, to sputter my way down from Ottawa to Atlanta to speak at the XML 2005 conference. I’m planning a 7-8 hour flight down if weather permits, with stops in Watertown NY (to clear customs) and in either Pittsburgh PA or somewhere in West Virginia to refuel. I flew myself to XML 2003 in Philadelphia as well, but that was a much shorter (and non-stop) flight.

Is anyone else flying to the conference in a small plane? Perhaps we can set up an informal general aviation BOF. I’m looking forward to seeing you all there — even the non-pilots, of course.

Oracle vs MySQL AB

Tuesday, October 11th, 2005

Tim O’Reilly reprinted a note from Andy Oram about Oracle’s recent purchase of InnoDB, the company that produces the best of the MySQL backends.

Assuming that Oracle knows what they’re doing (generally a safe assumption), the purchase is not an attempt to attack MySQL as an Open Source product, and it certainly shows no weakness in the Open Source model or the choice to use Open Source in the enterprise. Oracle’s lawyers are smart enough to understand that since InnoDB was released under the GPL, they cannot prevent others from forking the code and continuing development. Anyone — from private users to LAMP websites to large enterprises — can continue to use MySQL with the InnoDB backend under the GPL no matter what Oracle says or does.

What Oracle can now prevent, however, is dual licensing of InnoDB itself and any future forked version. Unless Oracle gives permission, InnoDB can be licensed only under the GPL, which means that it can be used as the backend only for a GPL-licensed database. MySQL AB, the company that produces MySQL, earns a chunk of its revenue from dual licensing, and when their contract with InnoDB runs out in a year and a half, they will no longer be able to distribute a non-GPL version of MySQL that uses InnoDB. So here are Oracle’s next two likely moves:

  1. Oracle continues to develop and improve InnoDB, releasing its changes under the GPL, winning kudos from the Open Source community, and encouraging even more users to switch to MySQL and InnoDB.
  2. Oracle does not allow MySQL AB to renew its contract for dual-licensing InnoDB, weakening a potential competitor even while it helps an Open Source product.

In other words, MySQL extends its lead as the predominant Open Source database, but MySQL AB loses its dual-licensing revenue source and becomes a less effective commercial competitor to Oracle. Nice move. Looking over the pieces on the board, I can see two ways for MySQL to respond to this threat:

  • create a new backend to replace InnoDB in the 1+ year remaining of MySQL AB’s InnoDB contract; or
  • forget about dual-licensing, make MySQL exclusively Open Source, and concentrate on support revenue.

The first move isn’t as good an idea, since Oracle will be able to use its own code and expertise to keep improving InnoDB to be faster, more conformant, etc., giving it effective control of the game while MySQL AB is constantly reacting. Heck, Oracle can even fork its own version of MySQL, as long as it stays GPL. The second move, on the other hand, will give MySQL AB back the initiative by allowing it to benefit from any work Oracle puts into InnoDB — the better Oracle (or anyone else) makes InnoDB, the more revenue MySQL AB can pull in for supporting a pure-Open Source MySQL.

If there’s any lesson in this, it’s that the dual-licensing business model has some serious flaws. Stick with selling support and professional services.