Like RSS, NewsML is an XML format for syndication, or publish-subscribe. RSS, however, is a simple format with strong grass-roots support, while NewsML is a complex format with mainly corporate backing. The International Press Telecommunications Council (IPTC) maintains and controls the NewsML specification, and NewsML-format feeds are available from some of the world's major news providers, including Reuters, Agence France-Presse (AFP), United Press International (UPI), and many others.
News providers use NewsML files for three different purposes:
to assemble news resources like pictures, text, video, and audio together into a single package for distribution;
to provide alternative versions of resources, such as the same picture in different resolutions, or the same news story in different languages; and
to provide metadata for both individual resources and collections of resources.
A simple example of NewsML in use is a sports story about Michael Schumacher winning the Montreal Grand Prix. A news provider might distribute a package including three logical resources: the main sports story, a sidebar on Schumacher's past wins, and a photo of Schumacher crossing the finish line. All three are meant to appear together as a single unit, either in print or on a news Web site. The logical resources come in different versions: for example, the text stories are available in more than one language, while the photo is available in both high-resolution print and low-resolution online versions. The following table shows how all of the information fits together.
|Schumacher wins Montreal Grand Prix|Main story: race coverage|English version:|
||Sidebar: Schumacher's past wins|English version:|
||Photo: Schumacher crossing the finish line|High-res print version:|
|||Low-res online version:|
A NewsML document can encode all of the relationships from this table in machine-readable format, so that a news customer's system can automatically extract the correct versions of each resource: a German racing Web site, for example, would use the German versions of the news stories together with the low-resolution JPEG version of the picture.
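The version selection described above can be sketched in a few lines of Python. This is only an illustration: the markup below is a simplified, hypothetical structure and not the real NewsML schema, and the element names, attribute names, and file names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical markup: NOT the real NewsML schema.
# All element names, attribute names, and file names are invented.
PACKAGE_XML = """
<package title="Schumacher wins Montreal Grand Prix">
  <resource id="main-story" kind="text">
    <version lang="en" href="main-en.xml"/>
    <version lang="de" href="main-de.xml"/>
  </resource>
  <resource id="sidebar" kind="text">
    <version lang="en" href="sidebar-en.xml"/>
    <version lang="de" href="sidebar-de.xml"/>
  </resource>
  <resource id="photo" kind="image">
    <version quality="high" href="finish-print.tif"/>
    <version quality="low" href="finish-web.jpg"/>
  </resource>
</package>
"""


def select_versions(xml_text, lang, quality):
    """Return {resource id: href}, picking one version per logical resource."""
    root = ET.fromstring(xml_text)
    chosen = {}
    for resource in root.findall("resource"):
        # Text resources are matched by language, images by quality.
        if resource.get("kind") == "text":
            attr, value = "lang", lang
        else:
            attr, value = "quality", quality
        for version in resource.findall("version"):
            if version.get(attr) == value:
                chosen[resource.get("id")] = version.get("href")
                break  # first matching version wins
    return chosen


# A German racing Web site wants German text and the low-resolution photo.
german_site = select_versions(PACKAGE_XML, lang="de", quality="low")
```

A print customer could call the same function with `lang="en"` and `quality="high"` to receive the English stories and the print-resolution photo from the same package.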
The content packaging is useful in itself, but the key value of NewsML lies in its metadata. For metadata to work, people have to share not only markup but also controlled vocabularies: for example, providers have to agree that a given code represents a particular industry or topic. Once that happens, it will be possible to build intelligent classification, search, and production engines that can handle newsfeeds from several providers seamlessly. The IPTC has made a start at publishing standard codes.
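To illustrate why a shared vocabulary matters, here is a minimal sketch of a search across two providers' feeds. The topic codes, headlines, and data layout are hypothetical placeholders, not real IPTC codes or NewsML structures; the point is only that one lookup works for every feed once providers agree on what each code means.

```python
# Hypothetical shared vocabulary: code -> topic label (not real IPTC codes).
SHARED_TOPICS = {
    "topic:motor-racing": "Motor racing",
    "topic:finance": "Finance",
}

# Hypothetical items from two providers, both tagged with the shared codes.
FEED_A = [
    {"headline": "Schumacher wins in Montreal", "codes": ["topic:motor-racing"]},
    {"headline": "Markets close higher", "codes": ["topic:finance"]},
]
FEED_B = [
    {"headline": "Grand Prix: race report", "codes": ["topic:motor-racing"]},
]


def headlines_for_topic(code, *feeds):
    """Collect headlines tagged with `code` from any number of feeds."""
    if code not in SHARED_TOPICS:
        raise KeyError(f"unknown topic code: {code}")
    return [
        item["headline"]
        for feed in feeds
        for item in feed
        if code in item["codes"]
    ]


racing = headlines_for_topic("topic:motor-racing", FEED_A, FEED_B)
```

If each provider used its own private codes instead, the consumer would need a separate mapping table per provider before a query like this could work.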
The official IPTC Web site for NewsML, with specifications, documentation, and examples. This site will always have the latest news and point to the most up-to-date copy of the NewsML specification.
Standard, shared controlled vocabularies published by the IPTC. Using these where possible will improve NewsML interoperability.
Information about NewsML at Reuters and a sample live NewsML demo.
An open source Java toolkit for NewsML 1.0, sponsored by Reuters and Wavo and originally written by Megginson Technologies.