Like RSS, NewsML is an XML format for syndication (publish-subscribe distribution). Unlike RSS, which is a simple format with strong grass-roots support, NewsML is a complex format with mainly corporate backing. The International Press Telecommunications Council (IPTC) maintains and controls the NewsML specification, and NewsML-format feeds are available from some of the world's major news providers, including Reuters, Agence France-Presse (AFP), United Press International (UPI), and many others.

News providers use NewsML files for three different purposes:

  1. to assemble news resources like pictures, text, video, and audio together into a single package for distribution;

  2. to provide alternative versions of resources, such as the same picture in different resolutions, or the same news story in different languages; and

  3. to provide metadata for both individual resources and collections of resources.

A simple example of NewsML in use is a sports story about Michael Schumacher winning the Montreal Grand Prix. A news provider might distribute a package including three logical resources: the main sports story, a sidebar on Schumacher's past wins, and a photo of Schumacher crossing the finish line. All three of these are meant to appear together as a single unit, either in print or on a news Web site. The logical resources come in different versions: for example, the text stories are available in more than one language, while the photo is available in both high-resolution print and low-resolution online versions. The following table shows how all of the information fits together.

Simple NewsML hierarchy.

Package                              Logical Content                             Variants
-----------------------------------  ------------------------------------------  ------------------------------------------
Schumacher wins Montreal Grand Prix  Main story: race coverage                   English version: schumacher-en.xml
                                                                                 German version: schumacher-de.xml
                                                                                 Japanese version: shumacher-jp.xml
                                     Sidebar: Schumacher's past wins             English version: schumacher-bg-en.xml
                                                                                 German version: schumacher-bg-de.xml
                                     Photo: Schumacher crossing the finish line  High-res print version: schumacher-win.tif
                                                                                 Low-res online version: schumacher-win.jpg

A NewsML document can encode all of the relationships from this table in machine-readable format, so that a news customer's system can automatically extract the correct versions of each resource: a German racing Web site, for example, would use the German versions of the news stories together with the low-resolution JPEG version of the picture.
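
The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Variant` class, the `PACKAGE` dictionary, and the `select_variants` function are hypothetical names invented for this sketch, and the data structure is a simplified stand-in for the package hierarchy, not actual NewsML markup or any real NewsML API. The file names are copied from the table; the language codes and the "print"/"online" labels are assumed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Variant:
    """One concrete version of a logical resource (hypothetical model)."""
    href: str
    language: Optional[str] = None  # set for text variants only
    usage: Optional[str] = None     # "print" or "online", set for photos only


# Simplified stand-in for the package in the table (file names as listed there).
PACKAGE: Dict[str, List[Variant]] = {
    "main-story": [
        Variant("schumacher-en.xml", language="en"),
        Variant("schumacher-de.xml", language="de"),
        Variant("shumacher-jp.xml", language="ja"),
    ],
    "sidebar": [
        Variant("schumacher-bg-en.xml", language="en"),
        Variant("schumacher-bg-de.xml", language="de"),
    ],
    "photo": [
        Variant("schumacher-win.tif", usage="print"),
        Variant("schumacher-win.jpg", usage="online"),
    ],
}


def select_variants(package: Dict[str, List[Variant]],
                    language: str, usage: str) -> Dict[str, str]:
    """Pick the first variant of each logical resource that fits the customer.

    A variant fits when each of its properties is either unset or equal to
    the requested value. Resources with no fitting variant are left out.
    """
    chosen: Dict[str, str] = {}
    for name, variants in package.items():
        for variant in variants:
            if (variant.language in (None, language)
                    and variant.usage in (None, usage)):
                chosen[name] = variant.href
                break
    return chosen


if __name__ == "__main__":
    # A German racing Web site: German text, low-resolution picture.
    print(select_variants(PACKAGE, language="de", usage="online"))
```

For the German site, the sketch picks schumacher-de.xml, schumacher-bg-de.xml, and schumacher-win.jpg. A Japanese site would get no sidebar at all, because the table lists no Japanese version of it; a real system would need a fallback policy for that case.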

Content packaging is useful in itself, but the key value of NewsML lies in its metadata. For metadata to work, people have to share not only markup but also controlled vocabularies: for example, providers have to agree that the code c00475 will represent a particular industry or topic. Once that happens, it will be possible to build intelligent classification, search, and production engines that can handle newsfeeds from several providers seamlessly. The IPTC has made a start at publishing standard codes in several areas.
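
To see why shared codes matter, consider a minimal Python sketch that merges items from several providers into one topic index. Everything here is assumed for illustration: the provider names, headlines, the `topic_codes` field, and the second code `c00999` are invented, and the sketch does not say what c00475 actually stands for. The point is only that items tagged with the same agreed code land in the same bucket, whoever published them.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical feed items from two hypothetical providers. Both providers
# are assumed to use the same controlled vocabulary for topic codes.
FEED_ITEMS = [
    {"provider": "Provider A", "headline": "Story one", "topic_codes": ["c00475"]},
    {"provider": "Provider B", "headline": "Story two", "topic_codes": ["c00475", "c00999"]},
    {"provider": "Provider B", "headline": "Story three", "topic_codes": ["c00999"]},
]


def index_by_topic(items: List[dict]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (provider, headline) pairs under each shared topic code."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for item in items:
        for code in item["topic_codes"]:
            index[code].append((item["provider"], item["headline"]))
    return dict(index)


if __name__ == "__main__":
    for code, stories in index_by_topic(FEED_ITEMS).items():
        print(code, stories)
```

If each provider used its own private codes instead, the same lookup would need a hand-maintained mapping table for every pair of providers, which is the interoperability cost that shared vocabularies avoid.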

NewsML Resources

IPTC newsml.org Web site

The official IPTC Web site for NewsML, with specifications, documentation, and examples. This site will always have the latest news and point to the most up-to-date copy of the NewsML specification.

IPTC TopicSets

Standard, shared controlled vocabularies published by the IPTC. Using these where possible will improve NewsML interoperability.

Reuters NewsML Showcase

Information about NewsML at Reuters and a sample live NewsML demo.

NewsML Toolkit

An open source Java toolkit for NewsML 1.0, sponsored by Reuters and Wavo and originally written by Megginson Technologies.

