HP3000-L Archives

February 2000, Week 3

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Gavin Scott <[log in to unmask]>
Reply To:
Gavin Scott <[log in to unmask]>
Date:
Tue, 15 Feb 2000 22:17:40 -0800
Glenn after me:
>> "XML: Just Say No."
> Gavin, can you elaborate, please?

It's getting late, but I'll take a few random stabs at it...

When I want to give something to Stan (whose office is 25 feet from mine) I
don't send it via FEDX, I just walk over and drop it on his chair.  But some
people will argue that I should use FEDX for everything because of all the
wonderful advantages it would provide, such as that I could track my
delivery on the web, all deliveries would come in standardized packaging,
etc., etc.

XML is overkill for most applications just as FEDX is overkill for
intra-office mail.

Most people talking about XML and how great it is don't understand it, and
certainly haven't seen it do anything significant.

First of all, let's look at the example document from the first article that
Glenn pointed out:

<?xml version="1.0"?>
<list>
   <recipe>
      <author>Carol Schmidt</author>
      <recipe_name>Chocolate Chip Bars</recipe_name>
      <meal>Dinner
         <course>Dessert</course>
      </meal>
      <ingredients>
         <item>2/3 C butter</item>
         <item>2 C brown sugar</item>
         [...]
      </ingredients>
      <directions>
         Preheat oven to 350 degrees. Melt butter;
         [...]
      </directions>
   </recipe>
</list>

Here we have a document written in XML, right?  Wrong.  The above is not
written in XML, as XML is not a language for describing data; XML is a
language for describing languages which describe data.  That is, it is a
meta-language like its parent SGML.

The above example is not written in XML, it's written in another language
which might be named "RML", the Recipe Markup Language.  These RML documents
will only be useful if there are other systems in the world which speak RML.
Do you think your cell phone is ever going to understand RML?

RML itself would be described in an XML DTD.  That DTD *is* a document in
the XML language.  The DTD describes the basic structure of a language for
describing recipes.  Unfortunately the DTD (and XML in general) only
describes the elements of a language and very generally how they can be put
together.  The DTD can't be used to determine if a document written in the
language it describes will be meaningful to a program designed to process
that language or not.
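
As a rough illustration of that last point, here is a minimal Python sketch.
It is not a DTD validator.  The shape() helper is hypothetical, and comparing
bare tag structure is only a stand-in for the kind of check a DTD makes.  The
recipe is trimmed from the example above, and the second document is invented
nonsense with exactly the same element layout.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the recipe example from the article.
GOOD = """<list><recipe>
  <author>Carol Schmidt</author>
  <recipe_name>Chocolate Chip Bars</recipe_name>
  <ingredients><item>2/3 C butter</item><item>2 C brown sugar</item></ingredients>
  <directions>Preheat oven to 350 degrees. Melt butter;</directions>
</recipe></list>"""

# Invented nonsense: same elements in the same places, meaningless content.
NONSENSE = """<list><recipe>
  <author>350 degrees</author>
  <recipe_name>butter</recipe_name>
  <ingredients><item>Carol Schmidt</item><item>Preheat</item></ingredients>
  <directions>2 C</directions>
</recipe></list>"""

def shape(elem):
    """Return the nested tag structure, ignoring all text content."""
    return (elem.tag, tuple(shape(child) for child in elem))

# A purely structural check cannot tell the two documents apart.
same_shape = shape(ET.fromstring(GOOD)) == shape(ET.fromstring(NONSENSE))
print(same_shape)  # True
```

Anything that looks only at which elements appear where will accept both
documents; telling them apart takes knowledge of what a recipe is.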

So, ok, I want to communicate with someone.  If we already share a common
language (English, COBOL, HTML, etc.) then I can just write up a document in
that language and send it to them, knowing that they will be able to make
sense of it since the language we agree on (and its interpretation) is well
defined.  If there is no common language, then maybe I can use XML to make
one.  Let's see.

First of all, I'll write a DTD which describes as accurately as possible the
language I'm going to use.

Next I'll create my document in the language described by that DTD, and I'll
have a document which is both "well formed" in terms of the rules common to
all languages describable in XML and "valid" as far as can be determined
using the DTD.
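
To make the "well formed" half concrete, here is a small sketch using
Python's standard xml.etree.ElementTree module.  The is_well_formed() helper
and the <oven_mitt/> element are made up for illustration.  Note that the
parser in Python's standard library is non-validating, so this checks
well-formedness only; checking validity against a DTD would need a validating
parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the text parses as XML at all (tags nest and close properly)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Properly nested tags: well formed.
print(is_well_formed("<recipe><author>Carol Schmidt</author></recipe>"))  # True

# Closing tags in the wrong order: not well formed.
print(is_well_formed("<recipe><author>Carol Schmidt</recipe></author>"))  # False

# Well formed, even though a recipe DTD would presumably not allow
# a (hypothetical) <oven_mitt/> element here.
print(is_well_formed("<recipe><oven_mitt/></recipe>"))  # True
```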

Now I'll send the document and the DTD to the person I wish to communicate
with.  What can they do with it?  Not a heck of a lot.  They can use the DTD
to verify that my document is "valid" as far as the DTD goes, and they can
feed the document and the DTD to an XML parser which will perform lexical
analysis on my document and present it as some kind of tree structure, but
the user will have to write a program to do all the parsing and
interpretation of the data themselves, since XML doesn't bother with
actually making *sense* out of documents.
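
A short Python sketch of what that looks like in practice, using a trimmed
copy of the recipe example.  The parse_item() function and its "amount unit
name" convention are my own assumptions; nothing in the document or in XML
says that is how an <item> should be read.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

DOC = """<recipe>
  <ingredients>
    <item>2/3 C butter</item>
    <item>2 C brown sugar</item>
  </ingredients>
</recipe>"""

def parse_item(text):
    """Application-level convention (an assumption): 'amount unit name...'."""
    amount, unit, name = text.split(maxsplit=2)
    return Fraction(amount), unit, name

# The XML library hands back a tree of elements and strings...
tree = ET.fromstring(DOC)

# ...and everything from here on is code the receiver has to write.
items = [parse_item(item.text) for item in tree.iter("item")]
for amount, unit, name in items:
    print(amount, unit, name)
```

The XML library's job ends at the fromstring() call.  Knowing that "C" means
cups, or that 2/3 is a quantity at all, lives entirely in code the receiver
wrote.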

In reality DTDs are unlikely to ever be passed with data.  If the receiver
already knows what to do with the data then the DTD is not needed.  If the
receiver doesn't know what to do with the data then the DTD isn't going to
help much.

In very special cases I may be able to write a "stylesheet" (in *another*
language) specific to some application that the other person uses which
would describe how the new XML language should be interpreted ("rendered")
in some specific application.  Or perhaps I can use yet another
meta-language to describe the content of my language to the point that it
can be completely parsed for correctness in addition to being lexically
analyzed (XSchema or whatever).

Even with my document, its DTD, and all the other meta-data, it's still up
to the other user to figure out what to do with my data.  Nothing in XML or
any of its many related TLAs is going to ever be able to teach your cell
phone or your web browser to bake cookies.

All of these marvelous examples of XML that I've seen stop short of actually
*doing* anything with the data once it gets to its destination.  Usually
they feed the file and DTD into an XML parser which presents the result as
an expandable tree structure in a window.  Whoopee.  Why don't I just write
out my data in English and send it in a text file since it will be just as
easy to view?

As a programmer who wants to process input in an XML language, even with the
best XML tools you're left to do the true parsing of the document yourself.
The process of breaking the document up into the tree structure (which the
tools will do for you using the DTD) is really just lexical analysis.  This
parsing is non-trivial in many cases.  You end up having to write a sort of
event driven state machine to read the tags and data as they come out of the
XML "parser" to your program.  This isn't the sort of thing that your
average COBOL programmer is familiar with either.
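
For a flavor of what such a state machine looks like, here is a minimal
sketch using Python's standard xml.sax module.  The RecipeHandler class, its
state names, and the shape of the result are all assumptions made for
illustration, and it only handles the simple document shown.

```python
import xml.sax

DOC = b"""<list>
  <recipe>
    <recipe_name>Chocolate Chip Bars</recipe_name>
    <ingredients>
      <item>2/3 C butter</item>
      <item>2 C brown sugar</item>
    </ingredients>
  </recipe>
</list>"""

class RecipeHandler(xml.sax.ContentHandler):
    """Tiny state machine driven by start/end/text events from the parser."""

    def __init__(self):
        super().__init__()
        self.state = "outside"   # name of the element being collected, if any
        self.buffer = []         # text pieces seen inside that element
        self.recipes = []

    def startElement(self, name, attrs):
        if name == "recipe":
            self.recipes.append({"name": None, "items": []})
        elif name in ("recipe_name", "item"):
            self.state = name
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.state != "outside":
            self.buffer.append(content)

    def endElement(self, name):
        if name != self.state:
            return
        text = "".join(self.buffer).strip()
        if name == "recipe_name":
            self.recipes[-1]["name"] = text
        else:
            self.recipes[-1]["items"].append(text)
        self.state = "outside"

handler = RecipeHandler()
xml.sax.parseString(DOC, handler)
print(handler.recipes)
```

Even for two element types, the handler has to track where it is in the
document, buffer text across callbacks, and decide what each tag means.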

The process of communication goes something like:

Produce data -> Encode data -> Transmit data -> Decode data ->
Understand/use data

XML languages can help with the encoding and decoding of communicated
information, but they don't do anything for the actual interpretation and use
of the data when it gets to its destination.  The knowledge of how to
interpret what the data *means* has to get to the destination via some
completely different means.

XML addresses only one part of the communication process, and it does it in
an imprecise and overly complex manner.  The only automated manipulation of
XML data you're going to find will be toy applications like the XML
"viewers" or "browsers" that display the "parsed" XML document.  Anything
that does more than this means that in reality you're not using XML at all,
but some new *real* language (that may happen to have been specified in part
via XML) that both sides agreed on in advance.

In essence, XML comes down to nothing more than a general syntax for
representing data in a tree structure rather than some more limited (and
simple and easy to process) data structure.  Whether this will result in any
great breakthroughs in understanding communicated information is doubtful,
as the underlying problems are hard ones which people have been working on
since the beginning of human history.  Of course this won't stop everyone
getting all excited about it for a few months until something new comes
along.

XML is something that "other people" will use for the most part.  SGML has
only spawned a few actual languages (HTML, DocBook), so the number of people
actually working in SGML is tiny compared to the number of people who use,
say, HTML.  Likewise XML may be used to define a lot of HTML-like little
languages for describing data, but you and I will only ever see these
languages that other people have created, and the fact that they were done
using XML as a tool will only be hinted at by the fact that the syntax
looks xmlish.  So a better slogan for almost all users might be:

   "XML: It's someone else's problem."

If someone suggests using XML as a solution for your problem, I'd look
closely to see if there isn't some nice simple boring way to accomplish the
same thing first.

The Emperor may have a document describing what his clothes look like, and a
document describing the document that describes what his clothes look like,
and a document describing the document that describes the document that
describes what his clothes look like, but when it comes down to it, he's
still not wearing anything.

If the problem is that you're bored after Y2K and want some fun new
technology to chase around then I can think of lots of things more
interesting (and potentially useful) than XML.

G.
