SLIDE 1

CERN – European Organization for Nuclear Research

IT Department – e–Business Section

Practical Use of XML

Rostislav Titov

IT-AIS-EB (e-Business) Section

CERN – Geneva, Switzerland

SLIDE 2

XML: eXtensible Markup Language

SGML (ISO standard, 1986)

Mainly for technical documentation

XML (W3C recommendation, 1998)

Simplification and enhancement of SGML, wide area of use

SLIDE 3

Why Markup?

Markup adds information about the structure of the data.

<book lang="Hungarian">
  <chapter>
    <section> </section>
    <section> </section>
  </chapter>
  <chapter>
    <section> </section>
    <section> </section>
  </chapter>
</book>

Headings in the example: Introduction, Text, Markup, More document markup, Reserved attributes, Processing instructions

SLIDE 4

XML: Rules

<?xml version="1.0" encoding="UTF-8"?>
<presentation>
  <author>
    <firstname>Rostislav</firstname>
    <lastname>Titov</lastname>
  </author>
  <chapter number="1" title="What is XML">
    XML (Extensible Markup Language) is …
  </chapter>
  <conclusion/>
</presentation>

Parts of the example: header, one root element, tag hierarchy, attributes, text elements, empty elements

Some rules

  • Element names are case-sensitive
  • Every opening tag must have a closing tag
  • Tags cannot overlap (<a><b></a></b> is not allowed)
  • Attribute values go in quotes or apostrophes
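The rules above are exactly what an XML parser enforces. A minimal sketch, assuming the JDK's built-in DOM parser; the class and method names (WellFormedCheck, isWellFormed) are made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b></b></a>")); // properly nested tags
        System.out.println(isWellFormed("<a><b></a></b>")); // overlapping tags
        System.out.println(isWellFormed("<A></a>"));        // names differ in case
        System.out.println(isWellFormed("<a x=1/>"));       // attribute value not quoted
        System.out.println(isWellFormed("<a/><b/>"));       // two root elements
    }
}
```

Only the first document is accepted; each of the others breaks one of the rules listed on the slide.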

SLIDE 5

XML: Data Transfer

  • Platform and language independent
  • Easy to write, easy to process
  • Understandable for humans and computers
  • Open standard
      – Many libraries exist
      – Lots of literature available
      – Specialized XML editors
  • Possibility to check the document structure

SLIDE 6

XML: Data Transfer (2)

External Program → XML → EDH

  • Automatic form generation from external programs
  • XML as the data transfer format
  • Schema validation as a guarantee of data consistency

Example: EDH Transport Request

SLIDE 7

Web Services

  • Data transfer between programs on the Internet
  • Open standard
  • Platform and language independent (Java, .Net, …)

A web service is described in WSDL and called through SOAP; both are XML.

WSDL – Web Services Description Language
SOAP – Simple Object Access Protocol

SLIDE 8

XML: Data Storage

  • Data structure is kept together with the data
  • Object “addendum” to relational databases
  • Structure validation
  • Supported by many modern RDBMS (Microsoft SQL Server 2005, Oracle 9i and later)
      – XML data type
      – XML indexes
      – XML queries (XQuery etc.)
      – Data output in XML format

SLIDE 9

XML: Data Storage (2)

Example: EDH Search System

Our solution:

  • All documents are stored in XML
  • Context-specific XML search (Oracle InterMedia)

Example: «Find documents created by Slava»:

SELECT DOC_ID FROM DOC_XML WHERE CONTAINS(XML, 'Slava WITHIN creator') > 0;

Problem: efficient search on an arbitrary number of criteria is difficult

SLIDE 10

XML: Data Transformations

XML can be transformed into HTML, text, PDF, ...

  – No need for special program solutions
  – Commercial visual editors exist
  – Platform independent

SLIDE 11

XML-based Standards

  • Possibility to formally define the structure
  • Platform and language independent
  • Understandable for humans and computers
  • Possibility to use XML technologies (XSLT transformations, XQuery queries)…

Examples:

  – WSDL (Web Services Description Language)
  – SOAP (Simple Object Access Protocol)
  – XHTML (HTML that complies with XML rules)
  – SVG (Scalable Vector Graphics)
  – ebXML (XML for e-Business)
  – …

SLIDE 12

Formal Structure Definition

There are ways to define XML structure formally:

  • DTD (Document Type Definition) – obsolete, not for new development
  • XML Schema

SLIDE 13

XML Schema: Possibilities

  • Check element presence and their order
  • Sequences and choices
  • Number of repetitions for elements and groups
  • Attributes and their presence
  • Type of elements and attributes
  • Restrictions for elements and attributes
  • Default values
  • Unique constraints
  • ...
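Several of these checks (element order, repetitions, required attributes, attribute types) can be tried out with the JDK's javax.xml.validation package. This is a sketch only: the schema below is a simplified, hypothetical one written for this example, not the schema used in EDH, and the class name SchemaCheck is made up.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SchemaCheck {

    // Hypothetical schema: one <author>, then one or more <chapter> elements
    // with a required positive-integer "number" and a required "title".
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='presentation'><xs:complexType><xs:sequence>"
        + "<xs:element name='author' type='xs:string'/>"
        + "<xs:element name='chapter' maxOccurs='unbounded'>"
        + "<xs:complexType><xs:simpleContent><xs:extension base='xs:string'>"
        + "<xs:attribute name='number' type='xs:positiveInteger' use='required'/>"
        + "<xs:attribute name='title' type='xs:string' use='required'/>"
        + "</xs:extension></xs:simpleContent></xs:complexType>"
        + "</xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
    }

    /** Returns true if the document is valid against the schema above. */
    static boolean isValid(String xml) {
        try {
            // With no error handler set, the validator throws on the first error
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ok = "<presentation><author>R. Titov</author>"
                + "<chapter number='1' title='What is XML'>text</chapter></presentation>";
        String badType = ok.replace("number='1'", "number='one'");
        System.out.println(isValid(ok));      // all constraints satisfied
        System.out.println(isValid(badType)); // "one" is not a positive integer
    }
}
```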

SLIDE 14

XML Schema: when is it needed?

  • Formal structure definition for future reference
  • Programmers may rely on data consistency
  • Authors may check XML validity in advance

SLIDE 15

XML Schema: when is it NOT needed?

  • When we know in advance that the XML is valid
  • When we do not care about document validity
  • When maximum processing speed is required
  • Small “throw-away” projects

SLIDE 16

XPath: XML Navigation

Access to XML elements, similar to a file system path:

C:\presentation\author\firstname    (file path)
/presentation/author/firstname      (XPath)

The result of an XPath expression can be:

  • XML node
  • Node set
  • Boolean
  • String
  • Number
  • Empty set

SLIDE 17

XPath: Examples

<cern>
  <dg><person>R. Aymar</person></dg>
  <department name="PH">
    <dh><person>W-D. Schlatter</person></dh>
  </department>
  <department name="IT">
    <dh><person>W. von Rueden</person></dh>
    <group name="IT-AIS">
      <gl><person>R. Martens</person></gl>
    </group>
    <group name="IT-CO">
      <gl><person>D. Myers</person></gl>
    </group>
    <group name="IT-IS">
      <gl><person>A. Pace</person></gl>
    </group>
  </department>
</cern>

  • Find the DG’s name

/cern/dg/person/text()

  • Find all departments

/cern/department/@name

  • Find all people

//person

  • Find the name of the DH of IT

/cern/department[@name="IT"]/dh/person/text()

  • Count the groups in the department where R. Martens works

count(//gl/person[starts-with(., 'R. Martens')]/../../../group)
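The expressions on this slide can be evaluated as written with the JDK's javax.xml.xpath package. A minimal sketch: the class CernXPath and its helper methods are hypothetical names for this example, and the document is the one from the slide.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CernXPath {

    static final String XML =
          "<cern>"
        + "<dg><person>R. Aymar</person></dg>"
        + "<department name='PH'><dh><person>W-D. Schlatter</person></dh></department>"
        + "<department name='IT'><dh><person>W. von Rueden</person></dh>"
        + "<group name='IT-AIS'><gl><person>R. Martens</person></gl></group>"
        + "<group name='IT-CO'><gl><person>D. Myers</person></gl></group>"
        + "<group name='IT-IS'><gl><person>A. Pace</person></gl></group>"
        + "</department>"
        + "</cern>";

    // Evaluates the expression against the sample document with the requested result type
    private static Object eval(String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expr, e);
        }
    }

    static String string(String expr) { return (String) eval(expr, XPathConstants.STRING); }

    static double number(String expr) { return (Double) eval(expr, XPathConstants.NUMBER); }

    /** Number of nodes selected by a node-set expression. */
    static int nodeCount(String expr) {
        return ((NodeList) eval(expr, XPathConstants.NODESET)).getLength();
    }

    public static void main(String[] args) {
        System.out.println(string("/cern/dg/person/text()"));                        // R. Aymar
        System.out.println(nodeCount("/cern/department/@name"));                     // 2
        System.out.println(nodeCount("//person"));                                   // 6
        System.out.println(string("/cern/department[@name='IT']/dh/person/text()")); // W. von Rueden
        System.out.println(number(
            "count(//gl/person[starts-with(., 'R. Martens')]/../../../group)"));     // 3.0
    }
}
```

The three helpers correspond to three of the result types from the previous slide: string, number and node set.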

SLIDE 18

XPath: Examples (8)

Example: Event Handling System

  • Events arrive as XML
  • Subscriptions are expressed in XPath
  • The handling system checks events against the XPath subscriptions and sends notifications

«I want to see all documents for more than 600 CHF»

/document[amount > 600]
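The matching step can be sketched as follows: every subscription is an XPath expression, and a subscriber is notified when the expression evaluates to true for the event document. This illustrates the idea only and is not the actual handling system; the subscriber names and the fields other than <amount> are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EventMatcher {

    /** Returns the subscribers whose XPath subscription matches the event. */
    static List<String> subscribersToNotify(String eventXml, Map<String, String> subscriptions) {
        try {
            // Parse the event once, then test every subscription against it
            Document event = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(eventXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            List<String> matched = new ArrayList<>();
            for (Map.Entry<String, String> s : subscriptions.entrySet()) {
                // A node-set result converts to true when it is not empty
                Boolean hit = (Boolean) xpath.evaluate(s.getValue(), event, XPathConstants.BOOLEAN);
                if (hit) {
                    matched.add(s.getKey());
                }
            }
            return matched;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> subscriptions = new LinkedHashMap<>();
        subscriptions.put("alice", "/document[amount > 600]");       // from the slide
        subscriptions.put("bob", "/document[creator = 'Slava']");    // hypothetical
        subscriptions.put("carol", "/document[type = 'transport']"); // hypothetical

        String event = "<document><type>purchase</type>"
                + "<amount>750</amount><creator>Slava</creator></document>";
        System.out.println(subscribersToNotify(event, subscriptions)); // [alice, bob]
    }
}
```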

SLIDE 19

XPath: Program Use

DOM model:

Element root = xml.getDocumentElement();
Node child;
// Find the <report name="Slava"> element among the children of the root
for (child = root.getFirstChild(); child != null; child = child.getNextSibling())
    if (child.getNodeName().equals("report")
            && ((Element) child).getAttribute("name").equals("Slava"))
        break;
// child is null here if no such report exists
if (child != null) {
    // Print the text of every <title> child of the report
    for (child = child.getFirstChild(); child != null; child = child.getNextSibling()) {
        if (child.getNodeName().equals("title")) {
            for (Node child2 = child.getFirstChild(); child2 != null; child2 = child2.getNextSibling())
                if (child2 instanceof Text)
                    System.out.println(((Text) child2).getData().trim());
        }
    }
}

XPath:

System.out.println(((XMLDocument) xml).selectSingleNode(
        "/config/report[@name='Slava']/title/text()").getNodeValue());
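The same lookup can also be written against the standard javax.xml.xpath API that ships with the JDK, without a vendor-specific document class. A sketch under that assumption; ReportTitle and titleOf are made-up names, and the report name is passed as an XPath variable rather than pasted into the expression.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class ReportTitle {

    /** Returns the trimmed title of the named report, or "" if there is none. */
    static String titleOf(String configXml, String reportName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $name inside the expression to the requested report name
        xpath.setXPathVariableResolver(
                variable -> "name".equals(variable.getLocalPart()) ? reportName : null);
        try {
            return xpath.evaluate("/config/report[@name=$name]/title/text()",
                    new InputSource(new StringReader(configXml))).trim();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String config = "<config>"
                + "<report name='Slava'><title> Monthly report </title></report>"
                + "<report name='Other'><title>Other report</title></report>"
                + "</config>";
        System.out.println(titleOf(config, "Slava")); // Monthly report
    }
}
```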

SLIDE 20

XQuery – XML Query Language

XQuery is SQL for XML

  – Database independent
  – Easy to use

Supported by popular RDBMS (Microsoft SQL Server 2005, Oracle 9i and 10g)

Based on XPath, supports document sets

SLIDE 21

XSLT: XML Transformations

  • Transforms XML to HTML, text or other XML
  • XSLT 1.0 (Current), XSLT 2.0 (Draft)
  • XSLT is a “Human Interface” to XML
  • Supported by Web Browsers

SLIDE 22

XSLT: Simplified Structure

  • XSLT is an XML file
  • Active usage of XPath expressions

xsl:stylesheet
    xsl:template – apply a template to the given element
        <html> <body> … </body> </html>
        xsl:value-of – evaluate XPath and print the value
        xsl:apply-templates – apply templates to other elements
    xsl:template
        …

SLIDE 23

XSLT: Possibilities

  • Conditions (<xsl:if>)
  • Loops (<xsl:for-each>)
  • Variables (<xsl:variable>)
  • Sorting (<xsl:sort>)
  • Numbering [1., 1.1., 1.1.?, 2.,] (<xsl:number>)
  • Number formatting (format-number())
  • Multiple step processing (mode)
  • String manipulations (via XPath)

XSLT 2.0 (Draft)

  • XPath 2.0
  • Custom functions
  • Regular expressions
  • Date and time formatting
  • Groupings

SLIDE 24

XSLT: Example

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="presentation">
    <html>
      <body bgcolor="#FFCCFF">
        <h1><font color="darkblue"><xsl:value-of select="title"/></font></h1>
        <h4><font color="green"><i>Author: <xsl:value-of select="author"/></i></font></h4>
        <b>Table of Contents</b><br/><br/>
        <xsl:apply-templates select="chapter" mode="contents"/>
        <br/><br/>
        <xsl:apply-templates select="chapter" mode="normal"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="chapter" mode="normal">
    <b>Chapter <xsl:value-of select="@number"/>. <xsl:value-of select="@title"/></b><br/><br/>
    <i><xsl:value-of select="text()"/></i><br/><br/>
  </xsl:template>

  <xsl:template match="chapter" mode="contents">
    <xsl:value-of select="@number"/>. <xsl:value-of select="@title"/><br/>
  </xsl:template>
</xsl:stylesheet>
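A stylesheet like this one can be applied from a program with the JDK's javax.xml.transform API. The sketch below uses a shortened version of the stylesheet (title and table of contents only) and an invented input document; the class name XsltDemo is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltDemo {

    // Shortened version of the stylesheet on this slide
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='presentation'>"
        + "<html><body><h1><xsl:value-of select='title'/></h1>"
        + "<xsl:apply-templates select='chapter' mode='contents'/>"
        + "</body></html>"
        + "</xsl:template>"
        + "<xsl:template match='chapter' mode='contents'>"
        + "<xsl:value-of select='@number'/>. <xsl:value-of select='@title'/><br/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Invented sample input
    static final String SAMPLE =
          "<presentation><title>Practical Use of XML</title>"
        + "<chapter number='1' title='What is XML'>...</chapter>"
        + "<chapter number='2' title='XML Rules'>...</chapter>"
        + "</presentation>";

    /** Applies the stylesheet to the XML text and returns the resulting HTML. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE));
    }
}
```

The program only supplies the input and collects the output; all formatting decisions stay in the stylesheet.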

SLIDE 25

XSLT: Web “Skins”

<aissearchscreen>
  <head><title>Person Search</title></head>
  <body>
    <input type="hidden" name="isAdvanced" value="false"/>
    <input show="always" type="text" label="Keyword" value="titov"/>
    <input type="checkbox" label="Fuzzy search" value="No"/>
    <result>
      <header>
        <tablecell>Full Name</tablecell>
        …
      </header>
      <row>
        <tablecell>Maksym TITOV</tablecell>
        <tablecell>71169</tablecell>
        <tablecell>40-3-C08</tablecell>
        …
      </row>
      <row>
        <tablecell>Oleg TITOV</tablecell>
        <tablecell>EXT</tablecell>
        …
      </row>
      …
      <rowcount>4</rowcount>
    </result>
  </body>
</aissearchscreen>

SLIDE 26

XSLT: Web “Skins” (2)

SLIDE 27

XSLT: User Interfaces

CERN Stores Catalog

  • Data loaded through XML
  • Data stored in XML
  • XSLT for data output
  • 150000 items
  • 10000+ users
  • ~15-20K of XML for each page
  • Custom formatting (through XSLT redefinition)

SLIDE 28

XSLT: XML to Text

Example: Automatic code generation

<document>
  <input type="person" name="A"/>
  <input type="number" name="B"/>
  …
</document>

XML description → Program (interface, business logic, SQL, ...)

Did you know…

that 1 EDH document is:

  • At least 20 source files (code, HTML templates, resources, SQL, …)
  • About 250K of source code
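As an illustration of XML-to-text generation, the sketch below turns a <document> description like the one on this slide into a SQL statement with a text-mode stylesheet. Everything specific here is an assumption made for the example: the table name, the mapping of input types to column types, and the class name CodeGen. It is not how EDH generates its code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class CodeGen {

    // Hypothetical mapping: "number" inputs become NUMERIC, anything else VARCHAR(64)
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/document'>"
        + "<xsl:text>CREATE TABLE doc (</xsl:text>"
        + "<xsl:for-each select='input'>"
        + "<xsl:if test='position() &gt; 1'>, </xsl:if>"
        + "<xsl:value-of select='@name'/>"
        + "<xsl:choose>"
        + "<xsl:when test='@type=\"number\"'> NUMERIC</xsl:when>"
        + "<xsl:otherwise> VARCHAR(64)</xsl:otherwise>"
        + "</xsl:choose>"
        + "</xsl:for-each>"
        + "<xsl:text>);</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Generates a CREATE TABLE statement from a form description. */
    static String generateSql(String descriptionXml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)))
                    .transform(new StreamSource(new StringReader(descriptionXml)),
                               new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String description = "<document>"
                + "<input type='person' name='A'/>"
                + "<input type='number' name='B'/>"
                + "</document>";
        // prints: CREATE TABLE doc (A VARCHAR(64), B NUMERIC);
        System.out.println(generateSql(description));
    }
}
```

Further stylesheets over the same description could produce the other source files in the same way.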

SLIDE 29

XSLT: XML to XML

  • Generate XML from another XML source
  • “Configuration files update”
  • XSL:FO

SLIDE 30

XSL-FO: Formatting Objects

  • FO: XML description of document layout
  • XSL-FO: XSLT transformation of an XML document to an FO document
  • FO Processor: program that converts the FO definition into a printable format (PDF, PS, ...)

XML Document → (XSL:FO Transformation) → FO Document → (FO Processor) → PDF Document

XML document:

<?xml version="1.0"?>
<presentation>
  <title> XXX </title>
</presentation>

FO document:

<fo:root>
  <fo:page-sequence>
    <fo:flow> ... </fo:flow>
  </fo:page-sequence>
</fo:root>

SLIDE 31

XSL-FO: Formatting Objects

FO has all capabilities of modern text editors:

  • Fonts
  • Pagination
  • Headers and footers
  • Page numbering
  • Odd/even page distinction
  • Margins and spacing
  • Keep paragraphs together
  • Widow and orphan lines
  • Tables
  • Graphics
  • …

FO Processor: Apache FOP

SLIDE 32

XSL-FO: Example

Example: e-MAPS

XML → XSLT → Web Interface
XML → XSL:FO → FOP Processor → Printable Version

  • No extra code required
  • RTF to XSL:FO converters are good
  • Can be written by a student
  • Output format independent

SLIDE 33

XML Editors

  • Specially designed for XML editing
  • XML well-formedness and validity check
  • DTD and Schema visual editing
  • XML generation according to DTD/Schema
  • Creation and debugging of XSLT and XSL:FO
  • Visual XSLT editing

Example: Altova XMLSpy 2005 (www.xmlspy.com)

  • Available from NICE
  • License can be obtained from the SDT service

SLIDE 34

XML: Program Handling

DOM (Document Object Model)

  – Tree building

SAX (Simple API for XML)

  – Event handling
  – startElement()
  – endElement()

Java, C++:

  – Apache Xalan
  – Oracle XML Parser
  – ...

PERL, .Net:

  – Built-in support

SAX is much faster, DOM is more versatile
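The difference between the two models shows up even in a trivial task such as counting elements. A sketch using the parsers bundled with the JDK; the class name SaxVsDom is made up. With SAX the program reacts to startElement() events as the parser streams through the text, while with DOM the whole tree is built first and then queried.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxVsDom {

    /** SAX: count elements while streaming, no tree is kept in memory. */
    static int countWithSax(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attributes) {
                            count[0]++; // called once per opening tag
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    /** DOM: build the whole tree, then query it. */
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("*").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<presentation><author><firstname>Rostislav</firstname>"
                + "<lastname>Titov</lastname></author>"
                + "<chapter number='1' title='What is XML'>XML is ...</chapter>"
                + "<conclusion/></presentation>";
        System.out.println(countWithSax(xml)); // 6
        System.out.println(countWithDom(xml)); // 6
    }
}
```

The DOM version allows going back and forth in the tree afterwards, at the cost of holding the complete document in memory.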

SLIDE 35

New Technologies

InfoPath 2003

  – Corporate system for electronic form handling
  – XML-based
  – Business rules defined by XML schema
  – Data validation using XML schemas

Adobe Intelligent Document Platform

  – Similar ideas

SLIDE 36

Conclusion

«XML is one of the biggest inventions in IT in the last few years. There are a lot of XML applications around the world today, and their number will grow every year»

W3C Consortium Web Site: http://www.w3c.org

Questions: Rostislav.Titov@cern.ch