<!DOCTYPE html PUBLIC
"-//W3C//DTD HTML 4.0 Transitional//EN">
<link rel=
"STYLESHEET" href=
"lib.css" type='text/css'
/>
<link rel=
"SHORTCUT ICON" href=
"../icons/pyfav.png" type=
"image/png" />
<link rel='start' href='../index.html' title='Python Documentation Index'
/>
<link rel=
"first" href=
"lib.html" title='Python Library Reference'
/>
<link rel='contents' href='contents.html'
title=
"Contents" />
<link rel='index' href='genindex.html' title='Index'
/>
<link rel='last' href='about.html' title='About this document...'
/>
<link rel='help' href='about.html' title='About this document...'
/>
<link rel=
"next" href=
"mmedia.html" />
<link rel=
"prev" href=
"netdata.html" />
<link rel=
"parent" href=
"lib.html" />
<link rel=
"next" href=
"module-HTMLParser.html" />
<meta name='aesop' content='information'
/>
<title>13. Structured Markup Processing Tools
</title>
<div id='top-navigation-panel' xml:id='top-navigation-panel'>
<table align="center" width="100%" cellpadding="0" cellspacing="2">
<tr>
<td class='online-navigation'><a rel="prev" title="12.20.5 Examples"
  href="node636.html"><img src='../icons/previous.png'
  border='0' height='32' alt='Previous Page' width='32' /></A></td>
<td class='online-navigation'><a rel="parent" title="Python Library Reference"
  href="lib.html"><img src='../icons/up.png'
  border='0' height='32' alt='Up One Level' width='32' /></A></td>
<td class='online-navigation'><a rel="next" title="13.1 HTMLParser "
  href="module-HTMLParser.html"><img src='../icons/next.png'
  border='0' height='32' alt='Next Page' width='32' /></A></td>
<td align="center" width="100%">Python Library Reference</td>
<td class='online-navigation'><a rel="contents" title="Table of Contents"
  href="contents.html"><img src='../icons/contents.png'
  border='0' height='32' alt='Contents' width='32' /></A></td>
<td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png'
  border='0' height='32' alt='Module Index' width='32' /></a></td>
<td class='online-navigation'><a rel="index" title="Index"
  href="genindex.html"><img src='../icons/index.png'
  border='0' height='32' alt='Index' width='32' /></A></td>
</tr></table>
<div class='online-navigation'>
<b class="navlabel">Previous:</b>
<a class="sectref" rel="prev" href="node636.html">12.20.5 Examples</A>
<b class="navlabel">Up:</b>
<a class="sectref" rel="parent" href="lib.html">Python Library Reference</A>
<b class="navlabel">Next:</b>
<a class="sectref" rel="next" href="module-HTMLParser.html">13.1 HTMLParser</A>
</div>
</div>
<!--End of Navigation Panel-->
<H1><A NAME=
"SECTION0015000000000000000000"></A><A NAME=
"markup"></A>
13. Structured Markup Processing Tools
</H1>
<p>
Python provides a variety of modules for working with structured
data markup. These include modules for the
Standard Generalized Markup Language (SGML) and the Hypertext Markup
Language (HTML), and several interfaces for the
Extensible Markup Language (XML).
</p>
<p>
Modules in the
<tt class="module">xml</tt> package
require at least one SAX-compliant XML parser to be available.
Starting with Python 2.3, the Expat parser is included with Python, so
the
<tt class="module"><a href="module-xml.parsers.expat.html">xml.parsers.expat</a></tt>
module is always available.
You may still want to be aware of the
<a class="ulink" href="http://pyxml.sourceforge.net/">PyXML package</a>;
that package provides an extended set of XML libraries for Python.
</p>
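<p>
Because Expat ships with Python, a parser can be created without
installing anything else. The following is a minimal sketch of that idea,
not an excerpt from the module documentation: the helper name
<tt>collect_elements</tt> and the sample document are hypothetical, and the
code is written for a current Python interpreter rather than the 2.4 release
described on this page.
</p>

```python
import xml.parsers.expat


def collect_elements(document):
    """Return element names in document order (hypothetical helper)."""
    names = []

    def start_element(name, attrs):
        # Expat calls this once for every opening tag it encounters.
        names.append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    # The second argument marks this chunk as the final piece of input.
    parser.Parse(document, True)
    return names


if __name__ == "__main__":
    print(collect_elements("<doc><item id='1'/><item id='2'/></doc>"))
```

<p>
Expat is non-validating, so a malformed document raises
<tt>xml.parsers.expat.ExpatError</tt>, while a document that merely
violates its DTD is accepted.
</p>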
<p>
The documentation for the
<tt class="module">xml.dom</tt> and
<tt class="module">xml.sax</tt>
packages is the definition of the Python bindings for the DOM and SAX
interfaces.
</p>
<table class='synopsistable' valign='baseline'
>
<tr><td><b><tt class='module'
><a href='module-HTMLParser.html'
>HTMLParser
</a></tt></b></td>
<td class='synopsis'
>A simple parser that can handle HTML and XHTML.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-sgmllib.html'
>sgmllib
</a></tt></b></td>
<td class='synopsis'
>Only as much of an SGML parser as needed to parse HTML.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-htmllib.html'
>htmllib
</a></tt></b></td>
<td class='synopsis'
>A parser for HTML documents.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-htmlentitydefs.html'
>htmlentitydefs
</a></tt></b></td>
<td class='synopsis'
>Definitions of HTML general entities.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.parsers.expat.html'
>xml.parsers.expat
</a></tt></b></td>
<td class='synopsis'
>An interface to the Expat non-validating XML parser.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.dom.html'
>xml.dom
</a></tt></b></td>
<td class='synopsis'
>Document Object Model API for Python.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.dom.minidom.html'
>xml.dom.minidom
</a></tt></b></td>
<td class='synopsis'
>Lightweight Document Object Model (DOM) implementation.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.dom.pulldom.html'
>xml.dom.pulldom
</a></tt></b></td>
<td class='synopsis'
>Support for building partial DOM trees from SAX events.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.sax.html'
>xml.sax
</a></tt></b></td>
<td class='synopsis'
>Package containing SAX2 base classes and convenience functions.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.sax.handler.html'
>xml.sax.handler
</a></tt></b></td>
<td class='synopsis'
>Base classes for SAX event handlers.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.sax.saxutils.html'
>xml.sax.saxutils
</a></tt></b></td>
<td class='synopsis'
>Convenience functions and classes for use with SAX.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xml.sax.xmlreader.html'
>xml.sax.xmlreader
</a></tt></b></td>
<td class='synopsis'
>Interface which SAX-compliant XML parsers must implement.
</td></tr>
<tr><td><b><tt class='module'
><a href='module-xmllib.html'
>xmllib
</a></tt></b></td>
<td class='synopsis'
>A parser for XML documents.
</td></tr>
</table>
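<p>
The <tt class="module">xml.dom.minidom</tt> and
<tt class="module">xml.sax</tt> entries in the table above represent two
different processing styles: a DOM builds the whole tree in memory, while
SAX reports events as the parser reads the input. The sketch below
extracts the same attribute values both ways. The sample document, the
<tt>TitleHandler</tt> class and the two helper functions are hypothetical
names chosen for illustration, and the code assumes a current Python
interpreter.
</p>

```python
import xml.dom.minidom
import xml.sax
import xml.sax.handler

# Hypothetical sample document used only for this illustration.
SAMPLE = "<library><book title='A'/><book title='B'/></library>"


def titles_with_dom(text):
    """Build the full tree, then query it."""
    dom = xml.dom.minidom.parseString(text)
    try:
        books = dom.getElementsByTagName("book")
        return [node.getAttribute("title") for node in books]
    finally:
        dom.unlink()  # break internal reference cycles once finished


class TitleHandler(xml.sax.handler.ContentHandler):
    """Collect the title attribute of each book element as it is seen."""

    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.titles = []

    def startElement(self, name, attrs):
        if name == "book":
            self.titles.append(attrs.getValue("title"))


def titles_with_sax(text):
    """Stream through the document without building a tree."""
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles


if __name__ == "__main__":
    print(titles_with_dom(SAMPLE))
    print(titles_with_sax(SAMPLE))
```

<p>
The DOM version is usually easier to write when a document fits comfortably
in memory; the SAX version is the more natural choice for large inputs,
since it keeps only what the handler chooses to store.
</p>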
<p class=
"heading">See Also:
</p>
<dl compact=
"compact" class=
"seetitle">
<dt><em class=
"citetitle"><a href=
"http://pyxml.sourceforge.net/"
>Python/XML Libraries
</a></em></dt>
<dd>Home page for the PyXML package, containing an extension
of the
<tt class=
"module">xml
</tt> package bundled with Python.
</dd>
</dl>
<span class=
"release-info">Release
2.4.2, documentation updated on
28 September
2005.
</span>
<!--End of Navigation Panel-->
See
<i><a href=
"about.html">About this document...
</a></i> for information on suggesting changes.