SXML

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Timofonic at 21:36, 22 January 2007. It may differ significantly from the current revision.
SXML
Filename extension
.xml, .sxml
Internet media type
text/sxml
Type code
TEXT
Uniform Type Identifier (UTI)
public.html
Type of format
markup language


SXML is simply a way to write XML as s-expressions. The official specification for SXML can be found at http://okmij.org/ftp/Scheme/SXML.html. A simple XHTML page looks like this:

<html xmlns="http://www.w3.org/1999/xhtml"
        xml:lang="en" lang="en">
   <head>
      <title>An example page</title>
   </head>
   <body>
      <h1 id="greeting">Hi, there!</h1>
      <p>This is just an &gt;&gt;example&lt;&lt; to show XHTML &amp; SXML.</p>
   </body>
</html>
When we translate this to SXML, it looks like this:
(html (@ (xmlns "http://www.w3.org/1999/xhtml")
            (xml:lang "en") (lang "en"))
   (head
      (title "An example page"))
   (body
      (h1 (@ (id "greeting")) "Hi, there!")
      (p "This is just an >>example<< to show XHTML & SXML.")))


Each element's tag pair is replaced by a pair of parentheses. The tag name is not repeated at the end; it is simply the first symbol in the list. The element's contents follow, and are either elements themselves or strings. XML attributes need no special syntax: in SXML they are represented as just another node, which has the special name @. This cannot clash with an actual "@" tag, because @ is not allowed as a tag name in XML. This is a common pattern in SXML: whenever a tag is used to indicate a special status or something that is not possible in XML, a name is used that does not constitute a valid XML identifier.
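The mapping described above can be sketched in a few lines of Python. This is only an illustration, not part of any SXML toolkit: nested Python lists stand in for s-expressions, and the helper names to_sxml and render are hypothetical. The sketch assumes a document without namespaces and ignores whitespace-only text.

```python
import xml.etree.ElementTree as ET

def to_sxml(elem):
    """Convert an ElementTree element into an SXML-like nested list."""
    node = [elem.tag]                      # the tag name comes first
    if elem.attrib:
        # attributes become one child node with the special name "@"
        node.append(["@"] + [[name, value] for name, value in elem.attrib.items()])
    if elem.text and elem.text.strip():
        node.append(elem.text)             # text content is a plain string
    for child in elem:
        node.append(to_sxml(child))        # child elements are nested lists
        if child.tail and child.tail.strip():
            node.append(child.tail)
    return node

def render(node):
    """Write the nested list in s-expression notation."""
    if isinstance(node, str):
        return '"' + node.replace("\\", "\\\\").replace('"', '\\"') + '"'
    # the first item of every list is a symbol, so it is written without quotes
    return "(" + " ".join([node[0]] + [render(child) for child in node[1:]]) + ")"

doc = '<body><h1 id="greeting">Hi, there!</h1><p>XHTML &amp; SXML</p></body>'
tree = to_sxml(ET.fromstring(doc))
print(render(tree))
```

Note that the @ node is an ordinary list like any other; only its name marks it as holding attributes.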

We can also see that there is no need to "escape" otherwise meaningful characters such as & and > as &amp; and &gt; entities. All string content is treated as pure content with no tags or entities in it, so it is escaped automatically when the document is written out as XML. This also makes it much easier to insert autogenerated content, and there is no danger of forgetting to escape user input before displaying it to other users (which could lead to cross-site scripting attacks or other annoyances).
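To illustrate why string content needs no manual escaping, here is a minimal Python sketch of a serializer. It assumes SXML nodes are modelled as nested Python lists; the function name to_xml is hypothetical and this is not the API of any SXML library. Escaping happens in exactly one place, when a string is written out, using the standard library's xml.sax.saxutils helpers.

```python
from xml.sax.saxutils import escape, quoteattr

def to_xml(node):
    """Serialize an SXML-like nested list to XML text."""
    if isinstance(node, str):
        return escape(node)                # every string is escaped on output
    tag, *rest = node
    attrs = ""
    # an optional first child named "@" carries the attributes
    if rest and isinstance(rest[0], list) and rest[0] and rest[0][0] == "@":
        attrs = "".join(f" {name}={quoteattr(value)}" for name, value in rest[0][1:])
        rest = rest[1:]
    return f"<{tag}{attrs}>" + "".join(to_xml(child) for child in rest) + f"</{tag}>"

paragraph = ["p", ["@", ["id", "note"]],
             "This is just an >>example<< to show XHTML & SXML."]
user_input = "<script>alert(1)</script>"   # untrusted text is just another string

print(to_xml(paragraph))
print(to_xml(["p", user_input]))
```

Because the untrusted text never passes through the markup layer as anything other than a string, it cannot introduce tags of its own.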

SXML Tools Tutorial by Dmitry Lizorkin

Main SSAX/SXML page

XML Matters: Investigating SXML and SSAX: Manipulating XML in the Scheme programming language by David Mertz, Ph.D. IBM developerWorks article


Detailed introduction, motivation and real-life case studies of SSAX, SXML, SXPath and SXSLT: the paper and the complementary talk presented at the International Lisp Conference 2002. [1] [2]