Michael Macrone

What's Next:
A New Language May Ease Web Use
(The New York Times, June 3, 1999)

By Anne Eisenberg

THE latest browsers understand it, undergraduates are signing up for classes to learn it, and - ultimate cyberspace compliment - start-ups are forming around it.

This Internet phenomenon of the moment is XML, or Extensible Markup Language, a new language that may one day power the second generation of the World Wide Web.

“The Web has two main problems - it's slow and it's hard to find the one piece of information you need when you search,” said Tim Bray, chief technical officer at Textuality, in Vancouver, British Columbia, and one of the architects of the new language. “XML is designed to fix both these problems.”

Right now the lingua franca of the Web is HTML, or Hypertext Markup Language, the most popular publishing language ever but one that has its limitations, Mr. Bray said. “HTML is easy to learn and use, but it was never intended for many of the complicated jobs it is being asked to do.”

The problem lies with the kinds of tags used to mark HTML text. HTML uses tags like <p> to show that a paragraph follows, or <H1> to indicate a headline. But HTML tags do not mark the meaning or semantic content of the text. For example, they can mark text to say, “This is displayed in bold,” but they are not capable of saying, “This is the price of an item” on an E-commerce site or “This is a date.”

“Think about ordering a book on the Web right now,” Mr. Bray said. “If you want to know which of the books on the subject is the most recent, or cheapest, it's a full Internet round trip each time while you ask for a new page and an overburdened server sends it to you.”

XML will make it easier to search a site for detailed information.

XML is designed to remedy this problem by specifying not what the information looks like but what the information is. It uses tags or markers just as HTML does, but while HTML marks the text to say “This is in color” or “This is in boldface,” XML marks the text to say “This is a price” or “This is a date.” An XML-enabled browser can sort and manipulate this data right at the desktop.
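As a rough illustration of that last point, consider a hypothetical book listing marked up with made-up tags such as <book>, <title>, <price> and <date>. These tag names, the titles and the prices are assumptions invented for this sketch, not part of any real standard or site. A short Python program, standing in for an XML-enabled browser, can then answer "which is cheapest?" or "which is most recent?" without asking the server for a new page.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: tag names and contents are illustrative only.
CATALOG = """
<catalog>
  <book><title>Learning Markup</title><price>29.95</price><date>1998-02-10</date></book>
  <book><title>Web Publishing Basics</title><price>19.50</price><date>1997-06-01</date></book>
  <book><title>Structured Documents</title><price>34.00</price><date>1999-03-15</date></book>
</catalog>
"""


def load_books(xml_text):
    """Turn each <book> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": book.findtext("title"),
            "price": float(book.findtext("price")),
            "date": book.findtext("date"),
        }
        for book in root.findall("book")
    ]


def cheapest(books):
    """Return the book with the lowest price."""
    return min(books, key=lambda b: b["price"])


def most_recent(books):
    """Return the newest book; year-month-day dates sort correctly as text."""
    return max(books, key=lambda b: b["date"])


books = load_books(CATALOG)
print("Cheapest:", cheapest(books)["title"])
print("Most recent:", most_recent(books)["title"])
```

Because the price and the date are labeled as such, the program never has to guess which piece of text is which, something it could not do reliably with tags that only say "bold" or "headline."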

XML was intended to replace HTML, but in the near future it is most likely to be used in tandem with it. The new language has many practical advantages. For example, if the user asked a travel site for flight times between San Francisco and Chicago, the browser could receive not only the schedule but a small program that could select the flights by cost, time of day or even seat selection. The browser could do the work that previously had to be done by the server.

“Multiply that by the entire Net and you have servers and networks that are far less loaded and run more quickly,” Mr. Bray said.
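The flight example can be sketched in the same spirit. In the Python below, the schedule is downloaded once and every later question is answered locally. The <flights> format, the flight numbers, the fares and the function names are all hypothetical, chosen only to show the idea of moving the selection work from the server to the desktop.

```python
import xml.etree.ElementTree as ET

# Hypothetical schedule, as it might arrive in a single download.
SCHEDULE = """
<flights from="SFO" to="ORD">
  <flight><number>101</number><departs>06:15</departs><fare>310.00</fare></flight>
  <flight><number>205</number><departs>12:40</departs><fare>245.00</fare></flight>
  <flight><number>318</number><departs>18:05</departs><fare>275.00</fare></flight>
</flights>
"""


def parse_flights(xml_text):
    """Read every <flight> element into a dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "number": f.findtext("number"),
            "departs": f.findtext("departs"),  # 24-hour HH:MM text
            "fare": float(f.findtext("fare")),
        }
        for f in root.findall("flight")
    ]


def select_flights(flights, earliest="00:00", latest="23:59", max_fare=None):
    """Filter by departure window and fare ceiling, cheapest first."""
    chosen = [
        f
        for f in flights
        if earliest <= f["departs"] <= latest
        and (max_fare is None or f["fare"] <= max_fare)
    ]
    return sorted(chosen, key=lambda f: f["fare"])


flights = parse_flights(SCHEDULE)  # the only "trip to the server"

# Each of these questions is answered without another request.
morning = select_flights(flights, latest="11:59")
afternoon_on = select_flights(flights, earliest="12:00")
bargains = select_flights(flights, max_fare=250.0)

print([f["number"] for f in morning])
print([f["number"] for f in afternoon_on])
print([f["number"] for f in bargains])
```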

XML came to life in 1996 when the World Wide Web Consortium commissioned a group of markup language experts, organized and led by Jon Bosak of Sun Microsystems, to deal with the limitations of HTML.

“The problem was not in having tags but in having one set of tags,” said C. M. Sperberg-McQueen, of the University of Illinois at Chicago and a member of the original committee. “Businesses who use the Web needed to say, ‘This is an order number. This is a part number.’ They needed to identify each part of their business documents.” Dr. Sperberg-McQueen and his colleagues developed the new markup language as a simplified version of a difficult precursor, SGML, or Standard Generalized Markup Language, completing the job in 1998.

Many groups have already begun developing their own applications for XML. In medical applications, for example, there is an urgent need for a single, easy-to-handle language capable of saying “This is a patient's name” or “This is a blood-sugar level” on the Web so that many machines can communicate quickly. “There are very active efforts right now in the medical community to hammer out a tag set so the disparate computers can talk with one another more easily,” Mr. Bray said.

Indeed, whole companies are springing up around XML. Erutech, a Seattle start-up, will offer XML-based services for the health insurance industry. Dave Wascha, Erutech's vice president of marketing, said that the industry has to learn new ways to exchange mountains of information over the Internet. “It took a few years of ramping up with HTML to realize the language wouldn't work efficiently to exchange data,” he said.

XML is most likely to spread quickest among businesses and organizations, not individual users, some experts predict. “I don't believe in XML for the average person,” said Michael Macrone, an independent Web developer in San Francisco. “The promise of a cross-platform, universal technology is always a false one for the ordinary person, because too many factors like the version of the browser or the speed of the modem must be controlled.” Many people have old browsers or slow modems, for example, and are not interested in upgrading.

“The public thinks of their computers like their TV's or refrigerators – they don't want to install a new freezer each year,” Mr. Macrone said. “XML will be fine for superusers, but the average person is not going to download the required software or put up with the flawed performance and error messages that always ensue with any new program.”

Mr. Bosak is very much aware of positions like Mr. Macrone's. “Certainly, once you put a product out there, it's difficult to replace it with the next generation,” he said. If old browsers do linger, Mr. Bosak predicts XML will live on the server and be translated to HTML for users. Companies will continue to develop middleware - a layer of servers that knows XML and translates it to HTML for the browsers. “This model works, but severely limits what we intended,” Mr. Bosak said.
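The middleware model Mr. Bosak describes can be pictured with a small Python sketch: a server-side function reads an XML business document and hands an older browser ordinary HTML. The <order> document, its tag names and the order_to_html function are assumptions made up for illustration; real middleware would be far more elaborate.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical business document kept on the server as XML.
ORDER = """
<order>
  <order-number>A-1001</order-number>
  <part-number>X-42</part-number>
  <price>12.50</price>
</order>
"""


def order_to_html(xml_text):
    """Translate an <order> document into a plain HTML table."""
    root = ET.fromstring(xml_text)
    rows = []
    for field in root:
        # "order-number" becomes the human-readable label "Order number".
        label = field.tag.replace("-", " ").capitalize()
        value = field.text or ""
        rows.append(f"<tr><th>{escape(label)}</th><td>{escape(value)}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


html_page = order_to_html(ORDER)
print(html_page)
```

Once the table reaches the browser, the labels are only display text again. The browser can show the order but can no longer tell which cell holds the part number, which is one way to read Mr. Bosak's remark that the model limits what was intended.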

Mr. Bray said he looks forward to a second generation of the Web powered by XML. “We are locking up a high proportion of our intellectual capital in short-lived, inaccessible, proprietary formats,” he said. “I can't open some of my old Microsoft Word files. I can't go through my own writing for the last few years to extract all the information on one topic.” XML deals with this problem. “XML files will be fully documented and stable,” Mr. Bray said. “They can be read and reused in 2028, and we won't have to pay a tax to any software vendors for the right to use them.”

– 30 –

First published in The New York Times (June 3, 1999)
