- Ali Kamandi
Ali Kamandi, kamandi@ce.sharif.edu
Spring 2007, Sharif University of Technology
- Part 1: XML and DTD
- SGML (Standard Generalized Markup Language)
ISO standard (1986) for data storage and exchange
Meta-language for defining languages
A famous SGML language: HTML
Separation of content and display
The SGML reference is 600 pages long
XML (eXtensible Markup Language)
W3C (World Wide Web Consortium, http://www.w3.org/XML/) recommendation, 1998
Simple subset (80/20 rule) of SGML
The XML specification is 26 pages long
- eXtensible Markup Language
Metalanguage: used to create other languages
Has become a universal data-exchange format
<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>
- Human-readable
Machine-readable (easy to parse)
Standard format for data interchange
Possible to validate
Extensible: can represent any data; can add new tags for new data formats
Hierarchical structure (nesting)
- Elements
In the bibliography example, every element has an element name (for example paper, authors, author)
Character content: <author>S. Abiteboul</author>
Element content: <authors> contains only <author> elements
Empty element: <fullPaper source="fusion"/>
- Attributes
In <paper ID="object-fusion">, ID is the attribute name and "object-fusion" is the attribute value
'( !
A tag is a name, enclosed by angle brackets,
with optional attributes
<foo id=“123”>
An element is a tree, containing an open tag,
contents, and a close tag
<foo id=“123”>This is an element</foo>
)
- A basic XML document is an XML element
Example:
<books>
  <book isbn="123">
    <title> Second Chance </title>
    <author> Matthew Dunn </author>
  </book>
</books>
<BOOKS>
  <book id="123" loc="library">
    <author>Hull</author>
    <title>California</title>
    <year> 1995 </year>
  </book>
  <article id="555" ref="123">
    <author>Su</author>
    <title> Purdue</title>
  </article>
</BOOKS>
[Figure: the BOOKS document drawn as a tree. The root BOOKS has two children: book (id 123, loc="library") with author Hull, title California, and year 1995; and article (id 555, ref 123) with author Su and title Purdue.]
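The BOOKS document can also be walked as a tree in code. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser; the helper name outline is made up for this illustration and does not come from the slides.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """
<BOOKS>
  <book id="123" loc="library">
    <author>Hull</author>
    <title>California</title>
    <year> 1995 </year>
  </book>
  <article id="555" ref="123">
    <author>Su</author>
    <title> Purdue</title>
  </article>
</BOOKS>
"""


def outline(element, depth=0):
    """Return one indented line per element: tag, attributes, trimmed text.

    Hypothetical helper for illustration only.
    """
    text = (element.text or "").strip()
    attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
    parts = [part for part in (element.tag, attrs, text) if part]
    lines = ["  " * depth + " ".join(parts)]
    for child in element:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(BOOKS_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the document.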
- XML syntax
Tags properly nested
Tag names case-sensitive
All tags must be closed or self-closing
<foo/> is the same as <foo></foo>
Attribute values enclosed in quotes
Document consists of a single (root) element
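These syntax rules can be checked mechanically, because any conforming parser must reject a document that breaks them. A minimal sketch in Python, assuming the standard library's xml.etree.ElementTree; the function name is_well_formed and the sample strings are illustrative, not from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as one well-formed XML document.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


checks = {
    "nested": is_well_formed("<a><b>text</b></a>"),
    "self_closing": is_well_formed("<a><foo/></a>"),
    "overlapping_tags": is_well_formed("<a><b></a></b>"),
    "case_mismatch": is_well_formed("<foo></Foo>"),
    "unclosed_tag": is_well_formed("<a><b></a>"),
    "unquoted_attribute": is_well_formed("<a id=123/>"),
    "two_roots": is_well_formed("<a/><b/>"),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")

# <foo/> and <foo></foo> parse to the same empty element.
same_empty = ET.tostring(ET.fromstring("<foo/>")) == ET.tostring(
    ET.fromstring("<foo></foo>")
)
print("same_empty:", same_empty)
```

Only the first two samples parse; each of the others violates exactly one of the rules listed on the slide.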
- Well-formed vs. valid
Well-formed: structure follows XML syntax rules
Valid: well-formed, and structure conforms to a DTD
<DT>
  <IMG SRC="greenball.gif">
  <A NAME="object-fusion"></A>
  Y.Papakonstantinou, S. Abiteboul, H. Garcia-Molina.
  <A HREF="http://www-cse.ucsd.edu/~yannis/papers/fusion.ps">"Object Fusion in Mediator Systems".</A>
  In <I>VLDB 96.</I>
</DT>
- HTML confuses presentation with content
- No explicit structure or semantics
Nothing in the markup identifies which text is the author, the conference, or the title
- XML vs. HTML
XML                             HTML
Extensible set of tags          Fixed set of tags
Content oriented                Presentation oriented
Standard data infrastructure    No data validation capabilities
Allows multiple output forms    Single presentation
- DTDs and XML Schema
XML Document Type Definitions (DTDs)
XML Schema: defines structure and data types; allows developers to build their own libraries of interchanged data types
- An XML document may have an optional DTD.
A DTD is a grammar for XML documents
It defines which elements can contain which other elements
It defines which attributes are allowed or required on which elements
Both sides of an exchange must agree on the DTD
The DTD can be part of the document or stored separately
Consider an XML document:
<db>
  <person>
    <name>Alan</name>
    <age>42</age>
    <email>agb@usa.net</email>
  </person>
  <person>………</person>
  ……….
</db>
DTD for it might be:
<!DOCTYPE db [
  <!ELEMENT db (person*)>
  <!ELEMENT person (name, age, email)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
- Occurrence indicators
Indicator         Occurrence            Meaning
(no indicator)    One and only one      Required
?                 None or one           Optional
*                 None, one, or more    Optional, repeatable
+                 One or more           Required, repeatable
- DTD example
<!ELEMENT bibliography (paper*)>
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
<!ELEMENT authors (author+)>
<!ELEMENT author (#PCDATA)>
bibliography: sequence of 0 or more paper
paper: authors followed by optional fullPaper, followed by title, followed by booktitle
authors: sequence of 1 or more author
author: character content
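A content model behaves like a regular expression over the names of an element's children, which suggests a simple way to check it. The sketch below is a rough illustration in Python, not a real validator: Python's standard library parser does not validate against a DTD, so the content models are copied by hand into a dictionary, attributes are ignored, and elements without a recorded model are assumed to hold only character content. All function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied by hand from the DTD; #PCDATA-only elements are omitted.
CONTENT_MODELS = {
    "bibliography": "(paper*)",
    "paper": "(authors, fullPaper?, title, booktitle)",
    "authors": "(author+)",
}


def model_to_regex(model):
    """Translate a DTD content model into a regex over 'name ' tokens."""
    pattern = model.replace(" ", "")
    # Wrap each element name in a group that matches the name plus a space,
    # so a following ?, * or + applies to the whole name.
    pattern = re.sub(r"[A-Za-z_][\w.-]*", lambda m: f"(?:{m.group(0)} )", pattern)
    # A DTD comma means "followed by", which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))


def children_match(element, models=CONTENT_MODELS):
    """Check one element's sequence of child names against its content model."""
    model = models.get(element.tag)
    if model is None:  # assumption: no recorded model means character content only
        return len(element) == 0
    tokens = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(tokens) is not None


def conforms(root):
    """Check every element in the tree."""
    return all(children_match(element) for element in root.iter())


GOOD = """<bibliography><paper ID="object-fusion">
  <authors><author>S. Abiteboul</author></authors>
  <fullPaper source="fusion"/>
  <title>Object Fusion in Mediator Systems</title>
  <booktitle>VLDB 96</booktitle>
</paper></bibliography>"""

# Same content, but title comes before authors, which the model forbids.
BAD = """<bibliography><paper>
  <title>Object Fusion in Mediator Systems</title>
  <authors><author>S. Abiteboul</author></authors>
  <booktitle>VLDB 96</booktitle>
</paper></bibliography>"""

good_ok = conforms(ET.fromstring(GOOD))
bad_ok = conforms(ET.fromstring(BAD))
print(good_ok, bad_ok)
```

The occurrence indicators ?, * and + carry over unchanged because they mean the same thing in a DTD content model and in a regular expression.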
- XML Schema example
<type name="Order">
  <element name="name" type="string" />
  <element name="street" type="string" />
  <element name="zip" type="integer" />
  <...>
  <attribute name="orderDate" type="date" />
</type>
- XML Schema example
<type name="personName">
  <element name="title" minOccurs="0"/>
  <element name="forename" minOccurs="0" maxOccurs="*"/>
  <element name="surname"/>
</type>
<type name="extendedName" source="personName" derivedBy="extension">
  <element name="generation" minOccurs="0"/>
</type>
<type name="simpleName" source="personName" derivedBy="restriction">
  <restrictions>
    <element name="title" maxOccurs="0"/>
    <element name="forename" minOccurs="1" maxOccurs="1"/>
  </restrictions>
</type>
- Part 2:
XSL: XML Transformation
- XSL
The eXtensible Stylesheet Language
Transforms XML into HTML
Actually, it parses the XML into a tree, transforms that tree into another tree, then outputs the result tree as XML
- XSL processing
[Figure: an XSL processor reads an XML source and an XSL stylesheet and produces HTML output]
- Example
<?xml version="1.0"?>
<!DOCTYPE menu SYSTEM "menu.dtd">
<menu>
  <meal name="breakfast">
    <food>Scrambled Eggs</food>
    <food>Hash Browns</food>
    <drink>Orange Juice</drink>
  </meal>
  <meal name="snack">
    <food>Chips</food>
  </meal>
</menu>
[Figure: the menu document drawn as a tree. menu contains two meal nodes; the first has a name attribute "breakfast" and the children food "Scrambled Eggs", food "Hash Browns", and drink "Orange Juice".]
- XPath
The stuff inside the quotes in XSL patterns, such as "/person/name/firstname"
A sensible way to locate content in an XML document
- XPath syntax
book/title
title child of book child of current node
/book/title
title child of book child of document root
@language
language attribute of current node
chapter/@language
language attribute of chapter child of current node
- XPath syntax (cont.)
chapter[3]/para
all the para children of the third chapter
book/*/title
all title children of all children of book (but not of their children)
chapter//para
all para descendants of chapter, at any depth
../../title
title child of the parent of the parent of the current node
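Several of these path expressions can be tried out directly. The sketch below uses Python's standard xml.etree.ElementTree, which implements only a subset of XPath: paths are always relative to the element they are called on, and they select elements only, so attribute selection is written as an attribute lookup instead. The sample document and the helper texts are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample document; the book element plays the role of the current node.
BOOK_XML = """
<book>
  <title>XML Basics</title>
  <chapter language="en"><para>c1 p1</para></chapter>
  <chapter language="fr"><para>c2 p1</para></chapter>
  <chapter language="en">
    <title>Third</title>
    <para>c3 p1</para>
    <section><para>c3 nested</para></section>
  </chapter>
</book>
"""
book = ET.fromstring(BOOK_XML)


def texts(elements):
    """Trimmed text of each selected element (hypothetical helper)."""
    return [(e.text or "").strip() for e in elements]


results = {
    # title child of the current node
    "title": texts(book.findall("title")),
    # title children of all children of the current node
    "*/title": texts(book.findall("*/title")),
    # para children of the third chapter only
    "chapter[3]/para": texts(book.findall("chapter[3]/para")),
    # para descendants of any chapter, at any depth
    "chapter//para": texts(book.findall("chapter//para")),
    # ElementTree cannot return attributes from a path, so chapter/@language
    # becomes an element query followed by an attribute lookup.
    "chapter/@language": [c.get("language") for c in book.findall("chapter")],
}
for path, found in results.items():
    print(path, "->", found)
```

Note how chapter[3]/para skips the paragraph nested inside section, while chapter//para includes it.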
- XPath abbreviations
Full form                       Abbreviation
/descendant-or-self::node()/    //
parent::node()                  ..
attribute::                     @
self::node()                    .
- XPath predicates
para[1] or para[position()=1]
the first para child of the current node
para[last()]
the last para child of the current node
para[count(child::note)>0]
all paragraphs with one or more notes
id("abstract")
selects the element whose ID is "abstract" (the attribute must be declared with type ID in the DTD), like
<para id="abstract">
para[@type='secret'] or para[attribute::type='secret']
selects all para children like
<para type="secret">
- XPath predicates (cont.)
para[not(title)]
selects all para children with no title child
para[position() >= 2 and position() < last()]
selects all but the first and last paragraphs
para[lang("en")]
matches <para xml:lang="en-uk">…</para>
note[contains(., "alex")]
. is the string value of the note: its own text plus the text of all its descendants
note[starts-with(., "hello")]
selects note children whose string value begins with "hello"
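The effect of these predicates can be illustrated on a small document. This is a sketch in Python with the standard xml.etree.ElementTree, which supports positional and attribute predicates but none of the XPath functions such as not(), count(), or contains(); for those, the sketch spells out the same selection in plain Python. The sample document and the n attribute used for labelling are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample; the n attribute only labels the paragraphs for printing.
DOC_XML = """
<section>
  <para n="1" id="abstract">first</para>
  <para n="2" type="secret"><title>Hidden</title>second</para>
  <para n="3"><note>hello <b>alex</b></note>third</para>
  <para n="4">fourth</para>
</section>
"""
section = ET.fromstring(DOC_XML)
paras = section.findall("para")


def numbers(elements):
    """Labels of the selected paragraphs (hypothetical helper)."""
    return [e.get("n") for e in elements]


def string_value(element):
    """XPath string value: the element's text plus all descendant text."""
    return "".join(element.itertext())


selected = {
    # Supported directly by ElementTree:
    "para[1]": numbers(section.findall("para[1]")),
    "para[last()]": numbers(section.findall("para[last()]")),
    "para[@type='secret']": numbers(section.findall("para[@type='secret']")),
    # Not supported by ElementTree, so written out in Python:
    "para[count(child::note)>0]": numbers(
        p for p in paras if len(p.findall("note")) > 0
    ),
    "para[not(title)]": numbers(p for p in paras if p.find("title") is None),
    "para[position() >= 2 and position() < last()]": numbers(paras[1:-1]),
}

# note[contains(., "alex")] evaluated with the third paragraph as current node.
matching_notes = [
    string_value(n) for n in paras[2].findall("note") if "alex" in string_value(n)
]

for expression, found in selected.items():
    print(expression, "->", found)
print(matching_notes)
```

The note matches even though "alex" sits inside a nested b element, because the string value includes descendant text.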
- XSL rules
An XSL stylesheet is a series of rules, or templates
Each template matches an element
Templates can contain XSL commands
- XSL processing model
Main rule: apply-templates
looks for a matching template
applies it
Usually the template calls apply-templates recursively on its children
If not, processing stops at that node (but continues for its siblings that match a template)
- Default rules
For a leaf node, output its contents
For a branch node, apply templates recursively (including the default rules)
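The interplay of apply-templates and the default rules can be mimicked in a few lines of ordinary code. The following is a toy model in Python, not an XSLT processor: templates are plain functions keyed by element name, whitespace-only text is dropped (roughly what xsl:strip-space would do), and all function names are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

MENU_XML = """<menu>
  <meal name="breakfast">
    <food>Scrambled Eggs</food>
    <food>Hash Browns</food>
    <drink>Orange Juice</drink>
  </meal>
  <meal name="snack">
    <food>Chips</food>
  </meal>
</menu>"""


def text_of(value):
    """Default rule for text: copy it, dropping whitespace-only runs."""
    return escape(value) if value and value.strip() else ""


def process(element, templates):
    """Apply the matching template, or fall back to the default rule."""
    template = templates.get(element.tag)
    if template is None:
        return apply_templates(element, templates)  # default rule: recurse
    return template(element, templates)


def apply_templates(element, templates):
    """Process the text and element children of `element` in document order."""
    out = [text_of(element.text)]
    for child in element:
        out.append(process(child, templates))
        out.append(text_of(child.tail))
    return "".join(out)


def meal(element, templates):
    # Recursing template: wraps whatever its children produce.
    name = escape(element.get("name", ""))
    return f"<H2>{name}</H2><UL>{apply_templates(element, templates)}</UL>"


def item(element, templates):
    return f"<LI>{apply_templates(element, templates)}</LI>"


def meal_heading_only(element, templates):
    # Non-recursing template: processing stops at this node.
    return f"<H2>{escape(element.get('name', ''))}</H2>"


root = ET.fromstring(MENU_XML)
full = process(root, {"meal": meal, "food": item, "drink": item})
stopped = process(root, {"meal": meal_heading_only})
defaults_only = process(root, {})
print(full)
print(stopped)
print(defaults_only)
```

With no templates at all, the default rules walk the whole tree and emit only the text; a template that does not recurse cuts its subtree out of the output while its siblings are still processed.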
- XSL commands
value-of
grabs the raw value; good for text elements and attributes
if
executes conditionally
number
counts the position of an element in its group; good for ordered list numbering, tables of contents, etc.
- XSL stylesheet
<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet [
  <!ENTITY background "#99FFFF">
]>
<!-- Namespace and version attribute as defined by the XSLT 1.0 Recommendation -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Document element: build the page skeleton, then process the meals -->
  <xsl:template match="menu">
    <HTML>
      <HEAD>
        <TITLE>Menu: <xsl:value-of select="@name"/></TITLE>
      </HEAD>
      <BODY BGCOLOR="&background;">
        <H1>Menu <xsl:value-of select="@name"/></H1>
        <xsl:apply-templates/>
      </BODY>
    </HTML>
  </xsl:template>

  <!-- Each meal becomes a heading followed by a bulleted list -->
  <xsl:template match="meal">
    <H2><xsl:value-of select="@name"/></H2><br/>
    <UL>
      <xsl:apply-templates/>
    </UL>
  </xsl:template>

  <!-- Food and drink items become list entries -->
  <xsl:template match="food">
    <LI><xsl:apply-templates/></LI>
  </xsl:template>
  <xsl:template match="drink">
    <LI><xsl:apply-templates/></LI>
  </xsl:template>
</xsl:stylesheet>
- XSL: if
<xsl:if test="author">
  by <xsl:apply-templates select="author" />
</xsl:if>
Note: xsl:if has no else branch; for if/else logic use xsl:choose with xsl:when and xsl:otherwise
- XSL: for-each
<xsl:for-each select="chapter">
  <h2><xsl:value-of select="@title"/></h2>
</xsl:for-each>
- Resources
XML Spec: http://www.w3.org/TR/REC-xml
XML FAQ: http://www.ucc.ie/xml/
Café con Leche: http://metalab.unc.edu/xml/
http://www.xml.com/
xml.org
ibm.com/xml
Servlet FAQ in XSL: http://www.purpletech.com/servlet-faq/