Quoderat: XML and the Web.

sorry.google.com

October 1st, 2008

See the update below. I was right: Google’s new bot detection is overly naive, and I’m not the only one having problems.

See also John Cowan’s comment below, for a different (personal) interpretation of Google’s terms of service.

Google Maps won’t show me satellite imagery this morning.

Google has recently set up a system to try to autodetect and block bots scraping their system, and it isn’t working very well — people are getting blocked even from Google Search simply because they have too many (human-generated) queries passing through the same proxy.

This morning, I suddenly discovered a different problem: the satellite view in Google Maps has stopped working for me — I get the “don’t have imagery at this zoom level for this region” error everywhere, at every zoom level. I can still see maps and terrain, but not satellite pics, and I noticed the host sorry.google.com setting a lot of cookies.

Is Google’s satellite imagery down for everyone else this morning, or has their software decided that I’m a bot trying to scrape satellite imagery?

Update

I was right — Google’s software had decided that I was a bot. They have a test link directly to a satellite tile to see if you’re being blocked:

http://khm0.google.com/kh?v=31&hl=en&x=0&y=0&z=1&s=

It took me to this page. I was able to re-enable access simply by entering a CAPTCHA.

What happened?

I wrote a couple of months ago about how to detect overzoom in Google Maps. My guess is that the overzoom protection in OurAirports — automatically zooming out every 4 seconds until there were actual satellite tiles available — triggered the bot alert, and I’ve disabled the feature for now.

That’s very bad news for any mashup that uses JavaScript to do more sophisticated things with Google Maps, like, say, panning at regular intervals. Google’s bot detection seems to be extremely naive, and any repeated action at regular intervals will fire it off.

Taking sides

September 23rd, 2008

I don’t believe that anything — especially a political argument — can be self-evidently true: people get together in groups and construct their realities, whatever those may be. In my reality, however, there are some arguments that just don’t go well together, and I have a lot of trouble respecting any commentator, politician, or even dinner-table pundit who supports both statements in any of the following pairs.

Age

  1. A 16-year-old is too young to vote.
  2. A 16-year-old is old enough to be tried for a crime as an adult.

This is a variation of the “no taxation without representation” idea that helped drive the American Revolution. Any person who is considered legally capable of making an informed decision as an adult should have a share in choosing his/her government. If a 16-year-old is capable of forming a plan to steal/murder etc. as an adult, then a 16-year-old is capable of voting as an adult. There is no excuse for the voting age to be different from the age of full criminal responsibility.

There are lots of variations: for example, an 18-year-old is too young to drink in most of the U.S., but plenty old enough to have his finger on the trigger of major ordnance in a war. The age of sexual consent also comes into play here. This is one that right-ish political parties, like the Canadian Conservatives or the U.S. Republicans, usually flunk.

Environment

  1. The government should do something to lower gas prices.
  2. The government should do something to lower carbon emissions.

So far, high energy prices are the only thing that seems to cut carbon emissions. If you don’t believe that carbon emissions are accelerating global warming or that global warming is a serious threat, then go ahead and push for lower gas prices; if you do believe that global warming is a serious threat, then you should be cheering for $20/gallon gas. Most North American politicians — especially those in left-ish parties (like the Canadian NDP) — flunk this test cold.

Military intervention

  1. Rich countries should never send in their armies to invade poor ones.
  2. Rich countries have an obligation to ensure that there’s never another Rwandan massacre.

This is a tough one for me, because I believe that the rich world has botched nearly every military intervention it’s made in the poor and developing worlds over the last 200 years (Bosnia stands as one of the partial exceptions). Isolationism is a perfectly consistent political view, but for the rest of us, if we do ask our governments to protect people in poorer countries from their own governments, we are implicitly asking them to go in shooting if economic sanctions and strong words on the floor of the U.N. Assembly don’t do the trick. The rich world could probably have stopped the Rwandan massacre, for example, but there’s a good chance rich-world troops would still be stuck as unwelcome guests in central Africa today, as they are in Iraq and (to a large extent) Afghanistan.

This is another one that politicians from left-ish parties usually flunk.

Freedom

  1. Freedom is what makes Democracies [sic] better than other forms of government.
  2. When Democracy is under threat, security is more important than people’s rights.

No explanation required. This is another one that politicians from right-ish parties usually flunk.

XML-in-Practice 2008: call for participation

August 26th, 2008

The new name for IDEAlliance’s annual XML conference is XML-in-Practice (December 8–10, Arlington, VA), and it has just released its call for participation, with proposals due by 19 September and selected papers announced by 3 October.

I won’t be chairing the conference this year, but I’m looking forward to reading many of the submissions as a peer reviewer.

Detecting overzoom in Google Maps

July 27th, 2008

[Warning: as of 1 October 2008, Google is using an over-simplistic bot detection algorithm, and something as simple as zooming out at regular intervals can trigger it and temporarily block access to Google resources. I recommend waiting until they fix their algorithms before using this technique.]

Here’s a link to a web page showing how to detect overzoom with the Google Maps API.

Overzoom is a big issue for my site OurAirports, which shows a close-up satellite view on each airport’s page (e.g. the former Meigs Field). Unfortunately, there’s no documented way to use the Google Maps API to check if a satellite view is overzoomed (instead of a satellite picture, it’s showing the “We are sorry, but we don’t have imagery at this zoom level for this region…” message). That can be confusing for someone who isn’t a regular Google Maps user and hasn’t actually touched the zoom controls on the map.

The hack

The page above came up with the clever solution of counting paragraphs in the map container. If there is a “sorry” message, there will be HTML p elements inside the map container. Here’s a simple JavaScript function that checks to see if the map is overzoomed, and zooms out one level if it is:

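// Zoom out one level if the map is showing the "sorry" message.
function check_zoom (map)
{
    var zoom = map.getZoom();
    // The "sorry" message is rendered as p elements inside the map container.
    var count = map.getContainer().getElementsByTagName('p').length;

    // Never zoom out past level 1.
    if (zoom > 1 && count > 0) {
        map.setZoom(zoom - 1);
    }
}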

Note that it doesn’t iterate — it just does one check and exits. The easiest way to use this is just to have it run every two seconds or so. If your map is available in a variable named map, this will do the trick:

setInterval(function () { check_zoom(map); }, 2000);
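If you would rather not leave a timer running for the life of the page, here is a minimal sketch of a variant that stops polling once a check finds nothing to fix. The names check_zoom_once and watch_zoom and the maxChecks parameter are my own illustrations, not part of the Google Maps API; the only map calls used are the same three that check_zoom uses above.

```javascript
// Same test as check_zoom, but it reports whether it zoomed out.
function check_zoom_once(map) {
    var zoom = map.getZoom();
    var count = map.getContainer().getElementsByTagName('p').length;

    if (zoom > 1 && count > 0) {
        map.setZoom(zoom - 1);
        return true;    // "sorry" paragraphs were present, so we zoomed out one level
    }
    return false;       // nothing to fix, or already at zoom level 1
}

// Poll every `delay` milliseconds, and clear the timer as soon as a check
// makes no change or maxChecks checks have run (hypothetical helper).
function watch_zoom(map, delay, maxChecks) {
    var remaining = maxChecks;
    var timer = setInterval(function () {
        remaining -= 1;
        if (!check_zoom_once(map) || remaining <= 0) {
            clearInterval(timer);
        }
    }, delay);
    return timer;
}
```

A call such as watch_zoom(map, 2000, 20) would replace the setInterval line. The trade-off is that it only covers the initial page load, and it assumes the "sorry" message has already appeared by the time the first check runs; if the user zooms in later, nothing corrects it.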

Examples

To see how it works, check out Vuotso Airport in Finland. Google Maps doesn’t have very good satellite coverage for Lapland in the far north, but OurAirports now detects that after a couple of seconds and zooms out one step. For a more extreme example, look at Alert Airport, the world’s northernmost permanent airport, in Nunavut, Canada — the code has to zoom out several times until you can see anything in the satellite view.

Caveats

The more elegant solution would be to detect when the map is finished loading after every event that can affect it, but that sounds like too much work to save a couple of milliseconds here and there.

Note that Google can break this at any time simply by adding or removing p elements — it would be much better to have an official, reliable way to detect when a satellite view is overzoomed.

Widgets vs. Portlets

July 14th, 2008

Widgets are web pages embedded in larger web pages, generally using iFrames — the content comes via a separate HTTP connection and has its own CSS stylesheet, cookies, etc. Final composition takes place in the user’s browser.

Portlets are software modules that produce fragments of HTML markup that are assembled into a single HTML page, sharing common CSS stylesheet, cookies, etc. Final composition takes place on a portal server, and a single page is delivered to the client browser.
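To make the difference in composition concrete, here is a rough sketch. The function names, the portlet objects with a render() method, and the markup are all made up for illustration (real portal servers follow their own APIs), and nothing is escaped.

```javascript
// Widget-style page: the server emits only iframe tags. The browser fetches
// each widget over its own HTTP connection and does the final composition.
function render_widget_page(widgetUrls) {
    var frames = widgetUrls.map(function (url) {
        return '<iframe src="' + url + '"></iframe>';
    });
    return '<html><body>' + frames.join('') + '</body></html>';
}

// Portlet-style page: the server asks each portlet for an HTML fragment and
// assembles a single page that shares one stylesheet.
function render_portal_page(portlets, stylesheetUrl) {
    var fragments = portlets.map(function (portlet) {
        return '<div class="portlet">' + portlet.render() + '</div>';
    });
    return '<html><head><link rel="stylesheet" href="' + stylesheetUrl + '">' +
        '</head><body>' + fragments.join('') + '</body></html>';
}
```

In the first case the client ends up making one request per widget plus one for the page; in the second it makes a single request and receives markup that is already assembled.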

Features

Portlets have a lot of features that iFrames don’t: they require fewer HTTP connections, they allow for common styling (one CSS stylesheet can style all the portlets on a page), and they can communicate with each other and take advantage of common authentication/authorization, etc. (so that a user doesn’t have to sign on to each portlet separately).

Portlets use a window-manager metaphor, allowing the portlet server to resize them, expand them etc. They also have modes, like edit and view, all of which can be accessed through a common interface. All of this happens on the server side.

iFrame-based widgets don’t normally do any of that, but they don’t require special portal servers, they can be embedded in more creative ways, and they offload the processing from the server to the client. They also introduce potential security holes, but only if they’re hosted somewhere that’s not under the original company’s control (the same applies to remote portlets using WSRP).

Users

Portlets are used mainly in intranets, to provide a collection of enterprise apps on a single web page for employees (e.g. a news feed, calendar, expense forms, bug reports, etc.).

Widgets are used everywhere else (e.g. embedding Google Maps, Facebook applications, etc.). While widget authors/consumers don’t tend to know (or care) much about portlets, the portlet people haven’t failed to notice the popularity of widgets — most (if not all) portal servers now have an iFrame portlet that does little more than wrap an iFrame and allow it to be resized, etc.

Future?

Are the extra features of portlets compelling enough to justify the extra cost and hassle of running a portlet server? Now that we have browser tabs, AJAX, etc., do enterprises really need to continue to squish all their apps into a single web page that looks like a 1995 Mac desktop gone bad?

My guess is that the only portlet feature with compelling benefits is common authentication/authorization — once the web community gets behind a solution to that problem (OpenID or something similar), widgets will probably push portlets out completely, even in the enterprise.

Structured community authoring

June 24th, 2008

About 10 months after launching my OurAirports site for air travelers and pilots, I’ve finished the basic infrastructure to allow community authoring. Unlike Wikipedia, OurAirports contains information that is specialized, structured and finite (there are only so many airports in the world), and I’m interested to see the technical and social differences from the Wikipedia world.

More details are available in the announcement on my flying blog. Note, also, that all of the data collected is free for download (public domain).

Set and forget: 335 days and counting …

June 18th, 2008

Late in summer 2007, I set up a dedicated Linux Ubuntu server at a site in San Diego to host OurAirports and my consulting site, megginson.com. The ISP has had some net outages, but the Ubuntu server itself has kept on chugging through. Here’s the uptime:

 11:18:31 up 335 days,  7:12,  1 user,  load average: 0.05, 0.06, 0.01

Since the ISP set the computer up with a minimal Ubuntu install and gave me the access info, it has run continuously — I know I should install an updated kernel some day, but it’s hard to bring myself to do that.

The Code Factory, Ottawa, Canada

June 16th, 2008

Ian Graham, who is well-known in the Ottawa tech community because of his involvement with Bar Camps, Demo Camps, etc., has a new start-up called The Code Factory.

Location, location, location …

Located in Ottawa’s downtown core a couple of blocks south of Parliament Hill, The Code Factory has offices for rent, drop-in communal working space with WiFi, and a lot of character (old building, bright with lots of windows, hardwood, rickety old elevator that reminds me of office buildings in SoHo).

No three-year leases

The offices go for around $800-$1,000/month (no long-term lease required), and all have windows or skylights. The open (no cubicles) communal working space costs $5/hour, billed to the nearest half hour, and includes unlimited coffee, cappuccino, or espresso, making it a break-even for coders who spend a lot at Starbucks (think one cappuccino/hour).

Unlimited caffeine

For the communal working space, his target market is coders who work from home, but miss the energy and social life of an office and want to drop in a couple of times a week. Despite all the friendly chatter, I find I’m actually getting more done in four hours there than I do in eight hours at home.

Shut-in no more

At a 45-minute walk from my house, The Code Factory is perfect for me — I’ve worked from home for over 10 years, and the Factory gives me an excuse for a little extra exercise, gets me downtown, and gives me the buzz that comes from working around other coders. I think that Ian has done a great service for the Ottawa tech community — much more so than yet another incubator or government-subsidized fund — and that we’ll see at least one or two successful companies tracing their roots back to the Factory, as well as a lot of happy consultants like me.


Ready for Prime Time?

May 10th, 2008

I bought a cheap HP C4280 printer-scanner-copier today, since my old HP 1210 finally gave up the ghost.

Installing the printer in Windows Vista

Installing the printer in Windows Vista wasn’t too difficult. I followed the instruction not to plug in the USB cable until asked, then inserted the supplied CD-ROM and authorized Vista to run the setup.exe program. I had to click through a few screens, then I plugged in the USB cable, let it autodetect the printer, and left it running over supper. The whole process took less than 15 minutes. When I came back in, it was finished, and I just had to dodge the ads attached to the end of the installation program. I think my non-computer-literate older relatives could have managed fine without any help from me.

Installing the printer in Ubuntu Linux

I turned on the computer. The HP C4280 appeared in the printer list.

Prime time

So who’s not ready for Prime Time on the desktop? No TV show on Prime Time is without flaws, and no OS is without flaws — Ubuntu still has trouble with some wireless networking cards, and pretty much 100% of the tech support calls we made at XML 2007 were for Mac notebooks (Windows and Linux notebooks just worked, every time) — but Ubuntu makes it hard to argue that somehow Windows and Mac are good enough for the desktop, while Linux isn’t.

Dealing with strangers

April 8th, 2008

From the leader in this week’s Economist:

“Financial progress is about learning to deal with strangers in more complex ways.”

s/Financial/Technical/ and it applies just as well. What else are we doing in tech, if not figuring out ways for strangers to deal with each other? Sometimes we focus on designing safeguards, like firewalls or spam filters, and sometimes we focus on creating opportunities, like social networks or source code repositories.

A political posting

March 29th, 2008

Late in 1963, shortly before he was assassinated, U.S. President John F. Kennedy asked Canadian Prime Minister Lester B. Pearson for his opinion on how the U.S. should cope with escalating unrest in Vietnam.

Pearson: “Get out.”

JFK: “That’s a stupid answer. Everyone knows that. The question is how do we get out?”

How, indeed? As JFK had finally come to understand, military conflicts, justified or not, are like a Chinese finger trap: it’s easy for a political leader to order the troops in, but very tricky to pull them back out (just ask the British about Northern Ireland, the Russians about Chechnya, or even Pearson’s Canadian successors about southern Afghanistan).

Good luck to President Clinton, President McCain, or President Obama (alphabetical order) in January 2009 — they’re all smart and well-intentioned people, but they’re going to find that the trap has already been pulled very tight, and there’s not much room left to wiggle free.

Strange web exploit attempt (?)

February 4th, 2008

In the search logs for OurAirports, I noticed a series of searches for URLs:

http://www.feliciano.de/Webgalerie/bilder/Italy/une/yiwul/
http://www.unduetretoccaate.it/codice/aseje/wocobo/
http://www.altaiseer-eg.com/ar/articles/jed/umut/

At first, I thought they might be a kind of link spam — some sites display recent searches — but when I checked one of the URLs, I found something totally unexpected:

<?php echo md5("just_a_test");?>

They’re all the same. This is almost certainly related to passwords: is there a known flaw in a PHP content-management system like Drupal, or in the PHP API for a search engine like Lucene, where this would do some damage, or is it just a test probing for weaknesses? Is the PHP code supposed to be served up literally like that, or should I be seeing the MD5 instead?
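One possible reading (my assumption, not something the search logs confirm) is that the URL is handed to a site in the hope that the site will fetch and run the remote file; a server that did so would return the MD5 digest instead of the PHP source. Here is a rough sketch of how a response could be told apart under that assumption. The function names and the three labels are hypothetical, and the sketch needs Node.js for its crypto module.

```javascript
var crypto = require('crypto');

// Hex MD5 digest of a string, the counterpart of PHP's md5().
function md5_hex(text) {
    return crypto.createHash('md5').update(text, 'utf8').digest('hex');
}

// Classify a response body:
//   'executed' - the digest of the marker appears, so the PHP was run
//   'literal'  - the PHP source appears unchanged, so it was only served as text
//   'absent'   - neither appears
function classify_probe_response(body, marker, source) {
    if (body.indexOf(md5_hex(marker)) !== -1) {
        return 'executed';
    }
    if (body.indexOf(source) !== -1) {
        return 'literal';
    }
    return 'absent';
}
```

By this test, the page I saw when I checked the URL directly would count as 'literal': the host was serving the PHP as plain text rather than running it.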

Delayed echo in the echo chamber

February 2nd, 2008

Some people compare blogs (and mainstream media) to an echo chamber, constantly repeating and amplifying the same messages, but the echoes usually die out quickly. Not so, today, when I found this story on the planenews.com aviation news feed:

21 Feared Dead in Munich Crash.

About twenty one of the 44 passengers and crew of the British European Airways airliner which crashed yesterday near Munich carrying the Manchester United football team and many journalists are feared dead. About eight others are in hospital, seriously injured. Frank Swift, the former international goalkeeper, who had become a journalist, died in hospital.

I didn’t hear about any crash yesterday, but according to the Wikipedia article on Manchester United, there was a crash near Munich on 6 February 1958 that killed eight of the team’s players. In fact, when you follow the full story link in the posting, there is a story about the crash. The phrase “From the archive” is hidden in the deckline, but the dateline is “Saturday February 2, 2008” (probably automatically updated by the site). There’s nothing else in the online version to indicate that this is an archived story from 7 February 1958, though a Brit would probably know that British European Airways ceased operations in 1974.

This is an easy mistake to make trying to keep up a blog of current events, and I don’t mean to suggest that the maintainer is stupid, or that I couldn’t do the same thing — in fact, next December, watch this spot for postings about an air attack on Perl Harbor.

Is the problem Wikipedia, or David Megginson?

January 23rd, 2008

The Wikipedia article about me was vandalized yesterday (vandalized version) by someone from the IP address 24.225.66.95, which seems to be in or near Raleigh, North Carolina.

What should I do?

  1. Edit the article myself to remove the vandalism? — OK, that’s a really bad idea
  2. Go in anonymously and edit the article? — also a bad idea
  3. Rejoice in the fact that my article is important enough to be vandalized?
  4. Despair in the fact that my article is not important enough for anyone else to have noticed and fixed it?
  5. Reconcile myself to the idea that the edits are not vandalism at all, and I am, in truth, “a freaking looser who knows nothing” and “a noob”

I’m leaning towards #5, though I’m disappointed that kids these days seem to have forgotten how to swear properly: “a freaking loser”???

Google analytics for XML 2007

January 21st, 2008

I forgot that I’d enabled Google analytics for the XML 2007 web site. Even though the conference is long over, I thought it would be interesting to look and see what some of the trends were from September 2007 to January 2008 (keeping in mind that these stats apply to the kind of web users interested in a tech conference, not to the web at large).

MacOS is still #3

Despite the halo effect from the iPod and the widespread use of Mac notebooks among speakers, MacOS still hasn’t managed to make much of a dent in the visitor logs:

  1. Windows: 80.70%
  2. Linux: 9.57%
  3. MacOS: 9.44%

If MacOS can’t beat Linux on the desktop, I don’t know if it has a bright future.

Internet Explorer below 50%

Firefox is still #2 behind MSIE, but for this crowd, the gap is small:

  1. MSIE: 49.61%
  2. Firefox: 41.14%
  3. Safari: 3.50%
  4. Mozilla: 3.22%
  5. Opera: 1.76%

If you’re designing or maintaining a web site with a tech audience, you’d better be testing on Firefox as well as MSIE.

Screen resolution and colour depth

I know that web designers like big layouts, but the sad fact remains that 1024×768 is still the most common resolution (and remember that the browser window may be much smaller than the screen):

  1. 1024×768: 28.32%
  2. 1280×1024: 25.84%
  3. 1280×800: 10.61%

A long tail of resolutions follows, but it’s worth noting that the classic 800×600 has only 1.96%. Better news comes from colour depth, where almost everyone has 16bpp or better:

  1. 32bpp: 80.29%
  2. 24bpp: 11.89%
  3. 16bpp: 7.37%

Traffic

Search engines, referrers, and direct access were all important traffic sources:

  1. Search engines: 36.77%
  2. Referring sites: 34.97%
  3. Direct traffic: 28.22%

Blogs did show up among the referring sites, but the biggest traffic producers were traditional links from partner organizations (other conferences, IDEAlliance itself, etc.) — these were also the stickiest, since most people coming from these links went on to read more than one page.

As far as search engines go, I was surprised to find that nothing really matters but Google (assuming that Google Analytics isn’t biasing the numbers):

  1. Google: 94.16%
  2. Yahoo!: 3.46%
  3. Live: 1.51%
  4. MSN: 0.45%

I knew that Yahoo! and MSN were behind in search, but I had no idea just how bad it was (at least in the tech crowd). More than half of the people who found the site via a search engine went on to read more than one page.

The top search phrases were rather dull and predictable:

  1. “xml 2007”: 28.50%
  2. “xml conference”: 8.22%
  3. “xml conference 2007”: 3.20%
  4. “xml conferences”: 3.04%

And so on through a very long tail. Individual speakers’ names start appearing soon, but none with more than 10 searches. I trolled through the low-frequency search phrases for something funny (and maybe risque), but all I came up with was the number “736”, which resulted in three visits. I gave up trying to find the site in the Google results for that number. Does anyone really search for a single three-digit integer, and if so, how many pages of results will that person scroll through?

LAMP stack stability

January 10th, 2008

I’m using a single dedicated server to host ourairports.com, megginson.com, and a couple of minor domains. OurAirports is a database-heavy application using (currently) a MySQL v.5 database hosted on the same server. I’ll offload the database to a separate server if traffic keeps increasing, but as long as I’m getting compliments from tech people for my fast response times (mainly thanks to MySQL’s built-in query caching), there’s no point paying for extra hardware.

Uptime

My ISP set up the server for me last summer with a bare-bones Ubuntu distro, then I installed the extra packages I needed using aptitude over ssh. Since then, I’ve done many Ubuntu in-place upgrades, rolled out hundreds of changes and upgrades to the web apps and dozens to the database schema (some very significant), and upgraded WordPress n-teen times. Check this out:

$ uptime
 13:08:31 up 175 days, 10:02,  1 user,  load average: 0.23, 0.06, 0.02

That’s right — since my ISP first set up the server with a basic Ubuntu system, I’ve never had to restart it. In fact, if Apache and mod_php (PHP5) had ‘uptime’ commands, they’d show almost the same amount of time, since I restarted them only to make configuration changes in the first few days of setting up the server (unless apt stopped them to install a newer version during one of my upgrades). I’ve restarted MySQL more recently, but again, only to experiment with configuration changes (especially for fulltext).

-1 for being cool, +10 for having a life

Using reliable old technologies like Linux, Apache, MySQL, and PHP doesn’t win any cool points, but it certainly makes maintaining a web server and its applications easy. I can go on vacation, for example, without worrying about being able to get online to fix or restart my server every couple of days. I don’t have to stay up until 3:00 am on Sunday night so that I can take the server offline to roll out new software versions or bug fixes (aptitude installs any security fixes in place). I spend lots of time with my family. I go to my kids’ school concerts. I learned banjo and mandolin (why not, since I have the free time?).

It’s the developer, not the language

And yes, my PHP web app is easy to maintain and extend, because I designed it to be that way (I can often implement, test and roll out new features in a matter of minutes, even when they require database schema changes) — it’s the developer, not the programming language, that determines the quality and maintainability of an app. A lot of newbies use PHP, so there’s a lot of bad PHP out there, but the same can be said for any language, even Ruby.

Social web sites: the new Proprietors?

January 3rd, 2008

Image: Thomas Penn, second proprietor of Pennsylvania, not as nice as his dad William.

Almost a year ago, I wrote that Open data matters more than Open Source — it doesn’t matter (to you, the end user) whether a web site is using Open Source software or not, if they still keep your data locked up.

Here’s a nasty example: Robert Scoble has just had his Facebook account disabled for running a script to try to scrape his personal information off the site (since Facebook doesn’t provide him with any other way to get it).

I understand that Facebook needs to protect against malicious bots — and they might decide to restore his account once they know what Robert was actually trying to do (though for now all traces of him have vanished) — but do we really want to have to hope for the good will of social sites and beg for our own data every time we want it? Are web site owners the new version of the Proprietors in the early American colonies, who can grant rights as favours when they see fit?

Religious wars hit close to home

December 21st, 2007

Update: I read that the school concert went ahead, with Frosty the Snowman replacing the modified Silver Bells as the token non-religious song on the programme (Frosty makes no reference to any religious holidays).

Both of my children attended Elmdale Public School here in Ottawa from junior kindergarten to grade six. Now, my kids’ alma mater has triggered a nation-wide moral panic by changing the line “it’s Christmas time in the city” to “it’s festive time in the city” in the song Silver Bells for a grade-two and -three concert.

I’ve already gone on record saying that it’s OK to wish me Merry Christmas — I’m as proud of my Christian background as some of my friends and neighbours are of their Jewish, Muslim, Sikh, and Hindu backgrounds — but that’s not what this was all about. The primary choir was already singing songs about Christmas and Hanukkah, and the choir leaders decided to add an additional song that was non-religious. I think that the existing non-religious songs Jingle Bells or Winter Wonderland would have been fine, but they decided to take Silver Bells — an otherwise secular pop song about shopping downtown in a city — and replace the word “Christmas”. Silly? Probably. An attack on Christmas or Christianity? Hardly.

The real attack on Christmas and Christianity

Here are some people who might need help understanding the idea of Christmas and Christianity:

  • the school parent(s) who decided to take this to the media
  • the newspaper columnists who made a primary class holiday concert into a national culture battle
  • the talk radio hosts who urged listeners to go after the school and ended up putting the lives of hundreds of small children at risk
  • the hundreds of people who called in or e-mailed messages of hatred (and a bomb threat) to the nice women working in the school office

According to the Christian New Testament, Jesus didn’t have anything good to say about people like this — he far preferred the company of prostitutes and tax collectors to the religious self-righteous. If you are religious (any religion), pray, meditate, or just hope that their hearts can still be opened this season.

E-mail users fight back

December 16th, 2007

A bit over a year ago, I ran into an unusual problem — for several days, I stopped receiving messages from a customer (in the middle of an important project), then I discovered the messages all hidden deep in my (gmail-hosted) spam box. Everything from that domain was suddenly being flagged as spam.

What happened? This customer had a large mailing list that they used for announcements, etc. My guess is that they sent out an announcement, a lot of other gmail-users flagged it as spam, and whatever weighting algorithm gmail uses tipped it over so that the messages were no longer considered legit by default. I was able to train gmail not to treat those messages as spam (for me, specifically), but it took a week or two before I could trust that some of them weren’t being sent to the spam box.
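To be clear about what I’m guessing at, here is a toy model of that kind of collective voting. It is emphatically not gmail’s algorithm, which isn’t public; the threshold, the method names, and the simple flagged-to-delivered ratio are all assumptions made for illustration.

```javascript
// Toy model: a domain is treated as spam once the share of its delivered
// messages that users flagged reaches `threshold`, unless this particular
// user has told the filter otherwise.
function make_spam_filter(threshold) {
    var stats = Object.create(null);     // domain -> { delivered, flagged }
    var trusted = Object.create(null);   // "user|domain" -> true

    function entry(domain) {
        if (!stats[domain]) {
            stats[domain] = { delivered: 0, flagged: 0 };
        }
        return stats[domain];
    }

    return {
        recordDelivery: function (domain) { entry(domain).delivered += 1; },
        recordSpamVote: function (domain) { entry(domain).flagged += 1; },
        // Per-user override, like training the filter "for me, specifically".
        markNotSpam: function (user, domain) { trusted[user + '|' + domain] = true; },
        isSpamFor: function (user, domain) {
            if (trusted[user + '|' + domain]) {
                return false;
            }
            var s = entry(domain);
            if (s.delivered === 0) {
                return false;
            }
            return s.flagged / s.delivered >= threshold;
        }
    };
}
```

In this model one big announcement is enough to push a domain over the line for everyone who hasn’t explicitly overridden it, which matches what I saw: my own correction fixed things for me but did nothing for the domain’s standing with other recipients.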

Hard-core spammers have always had to deal with this kind of thing, and they spend a lot of time trying to figure out a way around it. What’s happening now, though, is that companies with legit (or semi-legit) e-mail lists are also starting to get into trouble, because web-mail makes it possible for hundreds or thousands of people to get together and all vote your e-mail to be undesirable.

The letter of the law isn’t enough

Note that this isn’t a legal thing. It doesn’t matter at all if your e-mail list is opt-in or opt-out, if the “Send me announcements” checkbox was checked by default or not, or if the recipient originally clicked 10 screens of disclaimers before buying your product/signing up for your service. If they don’t like the e-mail you’re sending them, they’ll just click “Spam”, even if you had a legal right to send it; and if enough of them do it, the e-mail value of your domain fast approaches nil.

You’d better make sure that your mass e-mails have stuff that people actually want to read:

  • I don’t care that your company just won five awards — SPAM! (even if I said before that it was OK to send me e-mails)
  • I probably do care that someone wants to connect with me on a social networking site that I actually use.
  • I don’t care that a merchant I did business with 2 years ago has a Christmas special on something I’d never buy — SPAM!
  • I don’t care that your web site has a new look — SPAM!
  • I don’t care that your company has a training session coming up in Tulsa, since I don’t live anywhere near there (and probably wouldn’t go anyway) — SPAM!
  • Yes, I am interested in the tracking info for the books I just ordered. Thanks.
  • I do care that there’s a substantive change to a site that I use a lot.
  • I don’t care about a change on a site I haven’t logged into for a year — SPAM!

And so on.

This new collaboration is an unexpected side-effect of the shift from desktop e-mail clients to web mail, and it would be foolish for companies not to pay attention. If you consider your domain name to be a valuable part of your corporate identity, don’t piss it away by sending out poorly-targeted mass e-mails, because no matter what prior permission you have, people now can … and will … punish you. After all, it takes only a single mouse click.

Amazon SimpleDB (not very Codd-y)

December 14th, 2007

This might be of interest:

Amazon SimpleDB

Amazon’s announcement

Dear AWS Developers,

This is a short note to let a subset of our most active developers know about an upcoming limited beta of our newest web service: Amazon SimpleDB, which is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud.

Traditionally, this type of functionality has been accomplished with a clustered relational database that requires a sizable upfront investment, brings more complexity than is typically needed, and often requires a DBA to maintain and administer. In contrast, Amazon SimpleDB is easy to use and provides the core functionality of a database - real-time lookup and simple querying of structured data - without the operational complexity.

We’re excited about this upcoming service and wanted to let you know about it as soon as possible. We anticipate beginning the limited beta in the next few weeks. In the meantime, you can read more about the service, and sign up to be notified when the limited beta program opens and a spot becomes available for you. To do so, simply click the “Sign Up For This Web Service” button on the web site below and we will record your contact information.

Not much there, though

It’s not SQL, or even SQL-like, though, supporting only the operators “=, !=, <, > <=, >=, STARTS-WITH, AND, OR, NOT, INTERSECTION AND UNION”. I’m no relational expert, but I don’t think Codd would have been impressed. A distributed database is one of the big missing pieces from Amazon’s services, but I’m not sure if this will be it.
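To get a feel for how little that operator list covers, here is a toy evaluator over an in-memory set of items. It is not Amazon’s API or query syntax (I haven’t seen either yet); the item layout and the function names are invented for illustration, and only the operator names come from the list above.

```javascript
// Comparison operators from the list, applied to a single attribute value.
var COMPARATORS = {
    '=':  function (a, b) { return a === b; },
    '!=': function (a, b) { return a !== b; },
    '<':  function (a, b) { return a < b; },
    '>':  function (a, b) { return a > b; },
    '<=': function (a, b) { return a <= b; },
    '>=': function (a, b) { return a >= b; },
    'STARTS-WITH': function (a, b) { return String(a).indexOf(b) === 0; }
};

// Names of the items whose attribute satisfies one comparison.
function select(items, attribute, operator, value) {
    var compare = COMPARATORS[operator];
    return Object.keys(items).filter(function (name) {
        var item = items[name];
        return Object.prototype.hasOwnProperty.call(item, attribute) &&
            compare(item[attribute], value);
    });
}

// Set operations for combining the results of separate comparisons.
function intersection(a, b) {
    return a.filter(function (name) { return b.indexOf(name) !== -1; });
}

function union(a, b) {
    return a.concat(b.filter(function (name) { return a.indexOf(name) === -1; }));
}
```

Everything here filters one flat collection by attribute value and then combines lists of names. Nothing in the operator list would let you express a join or an aggregate, which is a large part of what people use SQL for.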