XML - what for?



Posted by dp on 30 March 98 at 01:03:39:

Alex Zavatone wrote on March 28, in part "... the conversation turns to Andy
White's Shockwave chat and then to XML where I express 'I don't have a freakin
clue why I should care about XML.' To which Sir Mark of Shepherd replies 'You
can pass complex data structures back and forth between the front end in your
browser and the back end of a database.' Or something to that effect. The
phrase, 'Oh, I get it' leaked from my lips. So what the hell does this mean?
The capability to create and pass data structures is a very powerful thing.
Less hoops to jump through. More database connectivity. Applications galore.
Hmmmm. Not bad."


(Thanks for the entrypoint so I can rave on a little more, Alex.... ;-)

XML feels a lot like FileIO to me. There's access to the outside world.

Before the FileIO XObject a movie had to be a self-contained thing. Once we
could read in a stream of ASCII we were able to trade information with other
applications.

The difference between XML and FileIO or getNetText is that we now have access
to nested objects, rather than just a stream of characters. An XML-formatted
file describes how bits of data relate to each other... the structure of a
chunk of information.

Put another way, what arrives is no longer just a run of characters that you
must already know how to interpret before you can use it. By following the
rules of Extensible Markup Language, the document itself describes the
information it contains.
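To make "nested objects rather than a stream of characters" concrete, here is a minimal sketch in Python (not Lingo). The record and its tag names are made up for illustration, and `to_nested` is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the tag names are assumptions for illustration.
TEXT = """
<movieRecord>
  <name>Kagemusha</name>
  <director>Kurosawa</director>
  <cast>
    <actor>Leo Gorcey</actor>
    <actor>Huntz Hall</actor>
  </cast>
</movieRecord>
"""

def to_nested(element):
    """Return the text of a leaf element, or a list of (tag, value) pairs."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return [(child.tag, to_nested(child)) for child in children]

# The same characters, now as a structure you can walk by name and depth.
record = to_nested(ET.fromstring(TEXT.strip()))
for tag, value in record:
    print(tag, "->", value)
```

The parser needs no knowledge of movies at all; the nesting comes from the document itself.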


For instance, suppose you're making a movie catalog. You can submit a query to
a database and retrieve an HTML-formatted document about a particular movie.
But you'd have to know how that database formats its pages in order to extract
the movie name. Perhaps they put the movie title in a heading-2 tag. Perhaps
it's after a horizontal rule. You'd have to know the layout of the document
before being able to use its data.

Doesn't matter whether it returns an HTML document, or a text file with tab-
delimited data, or whatever... you'd have to know the layout of the data ahead
of time in order to be able to use it.

The problem is compounded if the host changes its page layout scheme, or if
you're searching across different databases.
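A small Python sketch of that fragility, assuming two invented page layouts (`PAGE_A` and `PAGE_B` are not from any real database). The extraction rule is tied to one layout and silently fails on the other.

```python
import re

# Two hypothetical hosts presenting the same movie with different layouts.
PAGE_A = "<html><body><h2>Kagemusha</h2><p>Kurosawa, 1980</p></body></html>"
PAGE_B = "<html><body><hr><b>Kagemusha</b><p>Kurosawa, 1980</p></body></html>"

def title_from_h2(page):
    """Layout-specific guess: the movie title is the first heading-2."""
    match = re.search(r"<h2>(.*?)</h2>", page)
    return match.group(1) if match else None

title_a = title_from_h2(PAGE_A)  # the layout matches the guess
title_b = title_from_h2(PAGE_B)  # the title is there, but the rule misses it

print(title_a, title_b)
```

Every new host, or every redesign of an old one, means another hand-written rule like `title_from_h2`.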

But an XML-formatted record would identify the different parts of the
information, and different phrasings of the same information can produce
identical objects in memory. For instance, compare the following two records,
from two different hypothetical databases:


1980
Kagemusha
Kurosawa
Warlord's death is hidden by double blah blah
blah...
Leo Gorcey
Huntz Hall
-- et cetera....

Kagemusha
Kurosawa

Aldo Velani
scratchy, dropouts in latter half
April 3 1998


Randy Wiley
-- et cetera.....

If you receive two netText files, it's easy to track nesting levels as you
parse the text. A "movieRecord" object can setaProp for name, director, etc.
Once you have a collection of such movieRecord objects it's easy to step
through looking for those with a certain "director" property, etc.
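Here is a rough Python sketch of that idea (in Lingo you would build a property list with setaProp instead). Only "movieRecord", "name", "director", "year" and "cast" are named in this post; every other element name below (`description`, `actor`, `copy`, `renter`, `condition`, `due`) is a guess made purely for illustration, as is the `movie_record` helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical filmography record (element names partly assumed).
FILMOGRAPHY = """<movieRecord>
  <year>1980</year>
  <name>Kagemusha</name>
  <director>Kurosawa</director>
  <description>Warlord's death is hidden by double blah blah blah...</description>
  <cast>
    <actor>Leo Gorcey</actor>
    <actor>Huntz Hall</actor>
  </cast>
</movieRecord>"""

# Hypothetical videoshop record: same movie, different concerns.
VIDEOSHOP = """<movieRecord>
  <name>Kagemusha</name>
  <director>Kurosawa</director>
  <copy>
    <renter>Aldo Velani</renter>
    <condition>scratchy, dropouts in latter half</condition>
    <due>April 3 1998</due>
  </copy>
  <copy>
    <renter>Randy Wiley</renter>
  </copy>
</movieRecord>"""

def movie_record(xml_text):
    """Build one movieRecord, keeping only the fields this client cares about."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "director": root.findtext("director"),
        "year": root.findtext("year"),  # None when the source omits it
        "cast": [actor.text for actor in root.findall("cast/actor")],
        "copies": len(root.findall("copy")),
    }

records = [movie_record(text) for text in (FILMOGRAPHY, VIDEOSHOP)]

# Step through the collection looking for a certain "director" property.
by_kurosawa = [r for r in records if r["director"] == "Kurosawa"]
print(len(by_kurosawa), "records directed by Kurosawa")
```

Both sources yield the same "name" and "director" values even though each carries extra fields the other lacks.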

Each of the two databases is designed for its own purpose -- the videoshop
has no need for capsule descriptions, and the filmography database does not
track individual copies. Even so, despite their different goals and
formatting, you can take advantage of the shared *structure* of the
information they contain.

Additionally, even with just a single database, it's easier for a clientside
application to manipulate the retrieved data. If your users wish to view
Kurosawa's movies in alphabetical order, you can sort the list of movieRecords
by the "name" property. If they wish to view them in chronological order, you
can sort on the "year" property. If they wish to see only those movies with
Toshiro Mifune, you can filter by the "cast" property.
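Once the records are objects in memory, those three views are one-liners. A minimal Python sketch, assuming the records have already been parsed into dictionaries; the sample list is abbreviated (partial casts) and is there only to show the idea.

```python
# Abbreviated sample data, assumed to be already parsed from XML.
records = [
    {"name": "Yojimbo", "year": 1961, "cast": ["Toshiro Mifune", "Tatsuya Nakadai"]},
    {"name": "Kagemusha", "year": 1980, "cast": ["Tatsuya Nakadai"]},
    {"name": "Rashomon", "year": 1950, "cast": ["Toshiro Mifune"]},
]

# Alphabetical view: sort on the "name" property.
by_name = sorted(records, key=lambda r: r["name"])

# Chronological view: sort on the "year" property.
by_year = sorted(records, key=lambda r: r["year"])

# Filtered view: only movies whose "cast" includes a given actor.
with_mifune = [r for r in records if "Toshiro Mifune" in r["cast"]]

for view in (by_name, by_year, with_mifune):
    print([r["name"] for r in view])
```

None of this touches the server again; the client reorders what it already holds.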

While this clientside manipulation *could* be done with an arbitrary text
file, you'd have to know the formatting of the text file ahead of time, and
relying on a comma-delimited format separated by line-returns or whatever is a
pretty dangerous assumption when you're designing at timeA for use at timeB.
Having the data describe itself gives a lot more power and assurance with even
a single application.
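A quick Python sketch of how that assumption can bite. The comma-delimited line and the XML record are both invented for illustration, with assumed field order and tag names.

```python
import xml.etree.ElementTree as ET

# Assumed layout at timeA: name,director,year
LINE = "The Good, the Bad and the Ugly,Leone,1966"

# At timeB a title containing a comma shifts every field after it.
fields = LINE.split(",")
naive_director = fields[1]

# The same data as a self-describing record (tag names are assumptions).
RECORD = (
    "<movieRecord>"
    "<name>The Good, the Bad and the Ugly</name>"
    "<director>Leone</director>"
    "<year>1966</year>"
    "</movieRecord>"
)
director = ET.fromstring(RECORD).findtext("director")

print(repr(naive_director), repr(director))
```

The delimited version depends on position, so one unexpected comma corrupts it; the tagged version asks for the field by name.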


Why *I* care about XML is that we can not only exchange letters and numbers
with other applications, we can exchange objects now, as well. Objects have
properties and relationships with other objects. Different objects of the same
class can be viewed in cross-sections by their properties. You can search for
particular data objects with particular combinations of properties.

And different applications can share information about the relationships
between objects, without necessarily ever having seen those objects before.

It feels sorta like FileIO cubed, to me... past combinations of letters, and
into combinations of objects.

How does this seem to you, there...?

jd




D. Plänitz