| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
| 2 | <html> |
| 3 | <head> |
| 4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> |
| 5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> |
| 6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> |
| 7 | <link rel="first" href="lib.html" title='Python Library Reference' /> |
| 8 | <link rel='contents' href='contents.html' title="Contents" /> |
| 9 | <link rel='index' href='genindex.html' title='Index' /> |
| 10 | <link rel='last' href='about.html' title='About this document...' /> |
| 11 | <link rel='help' href='about.html' title='About this document...' /> |
| 12 | <link rel="next" href="module-xml.sax.handler.html" /> |
| 13 | <link rel="prev" href="module-xml.dom.pulldom.html" /> |
| 14 | <link rel="parent" href="markup.html" /> |
| 15 | <link rel="next" href="sax-exception-objects.html" /> |
| 16 | <meta name='aesop' content='information' /> |
| 17 | <title>13.9 xml.sax -- Support for SAX2 parsers</title> |
| 18 | </head> |
| 19 | <body> |
| 20 | <DIV CLASS="navigation"> |
| 21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> |
| 22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> |
| 23 | <tr> |
| 24 | <td class='online-navigation'><a rel="prev" title="13.8.1 DOMEventStream Objects" |
| 25 | href="domeventstream-objects.html"><img src='../icons/previous.png' |
| 26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> |
| 27 | <td class='online-navigation'><a rel="parent" title="13. Structured Markup Processing" |
| 28 | href="markup.html"><img src='../icons/up.png' |
| 29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> |
| 30 | <td class='online-navigation'><a rel="next" title="13.9.1 SAXException Objects" |
| 31 | href="sax-exception-objects.html"><img src='../icons/next.png' |
| 32 | border='0' height='32' alt='Next Page' width='32' /></A></td> |
| 33 | <td align="center" width="100%">Python Library Reference</td> |
| 34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" |
| 35 | href="contents.html"><img src='../icons/contents.png' |
| 36 | border='0' height='32' alt='Contents' width='32' /></A></td> |
| 37 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' |
| 38 | border='0' height='32' alt='Module Index' width='32' /></a></td> |
| 39 | <td class='online-navigation'><a rel="index" title="Index" |
| 40 | href="genindex.html"><img src='../icons/index.png' |
| 41 | border='0' height='32' alt='Index' width='32' /></A></td> |
| 42 | </tr></table> |
| 43 | <div class='online-navigation'> |
| 44 | <b class="navlabel">Previous:</b> |
| 45 | <a class="sectref" rel="prev" href="domeventstream-objects.html">13.8.1 DOMEventStream Objects</A> |
| 46 | <b class="navlabel">Up:</b> |
| 47 | <a class="sectref" rel="parent" href="markup.html">13. Structured Markup Processing</A> |
| 48 | <b class="navlabel">Next:</b> |
| 49 | <a class="sectref" rel="next" href="sax-exception-objects.html">13.9.1 SAXException Objects</A> |
| 50 | </div> |
| 51 | <hr /></div> |
| 52 | </DIV> |
| 53 | <!--End of Navigation Panel--> |
| 54 | |
| 55 | <H1><A NAME="SECTION0015900000000000000000"> |
| 56 | 13.9 <tt class="module">xml.sax</tt> -- |
| 57 | Support for SAX2 parsers</A> |
| 58 | </H1> |
| 59 | |
| 60 | <P> |
| 61 | <A NAME="module-xml.sax"></A> |
| 62 | |
| 63 | <P> |
| 64 | |
| 65 | <span class="versionnote">New in version 2.0.</span> |
| 66 | |
| 67 | <P> |
| 68 | The <tt class="module">xml.sax</tt> package provides a number of modules which |
| 69 | implement the Simple API for XML (SAX) interface for Python. The |
| 70 | package itself provides the SAX exceptions and the convenience |
| 71 | functions most often needed by users of the SAX API. |
| 72 | |
| 73 | <P> |
| 74 | The convenience functions are: |
| 75 | |
| 76 | <P> |
| 77 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 78 | <td><nobr><b><tt id='l2h-4463' xml:id='l2h-4463' class="function">make_parser</tt></b>(</nobr></td> |
| 79 | <td><var></var><big>[</big><var>parser_list</var><big>]</big><var></var>)</td></tr></table></dt> |
| 80 | <dd> |
| 81 | Create and return a SAX <tt class="class">XMLReader</tt> object. The first parser |
| 82 | found will be used. If <var>parser_list</var> is provided, it must be a |
| 83 | sequence of strings which name modules that have a function named |
| 84 | <tt class="function">create_parser()</tt>. Modules listed in <var>parser_list</var> |
| 85 | will be used before modules in the default list of parsers. |
| 86 | </dl> |
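<P>
A minimal sketch of both call forms follows. It assumes the expat-based
driver module <tt class="module">xml.sax.expatreader</tt> that ships with CPython is
importable; any other module name placed in the list would be an
assumption about the local installation.

```python
import xml.sax
from xml.sax import xmlreader

# With no argument, only the default list of parser modules is searched.
reader = xml.sax.make_parser()

# Module names passed in parser_list are tried before the defaults.
# "xml.sax.expatreader" is the expat driver bundled with CPython.
preferred = xml.sax.make_parser(["xml.sax.expatreader"])

# Both results implement the XMLReader interface.
is_reader = isinstance(reader, xmlreader.XMLReader)
is_preferred_reader = isinstance(preferred, xmlreader.XMLReader)
```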
| 87 | |
| 88 | <P> |
| 89 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 90 | <td><nobr><b><tt id='l2h-4464' xml:id='l2h-4464' class="function">parse</tt></b>(</nobr></td> |
| 91 | <td><var>filename_or_stream, handler</var><big>[</big><var>, error_handler</var><big>]</big><var></var>)</td></tr></table></dt> |
| 92 | <dd> |
| 93 | Create a SAX parser and use it to parse a document. The document, |
| 94 | passed in as <var>filename_or_stream</var>, can be a filename or a file |
| 95 | object. The <var>handler</var> parameter needs to be a SAX |
| 96 | <tt class="class">ContentHandler</tt> instance. If <var>error_handler</var> is given, |
| 97 | it must be a SAX <tt class="class">ErrorHandler</tt> instance; if omitted, |
| 98 | <tt class="exception">SAXParseException</tt> will be raised on all errors. There |
| 99 | is no return value; all work must be done by the <var>handler</var> |
| 100 | passed in. |
| 101 | </dl> |
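<P>
The following sketch shows <tt class="function">parse()</tt> reading from a file
object. The <tt class="class">TitleCounter</tt> handler and the sample document
are hypothetical and exist only for illustration; an in-memory byte
stream stands in for a real file.

```python
import io
import xml.sax


class TitleCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts <title> elements."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        # Called by the parser for every opening tag.
        if name == "title":
            self.count += 1


# An in-memory file object; a filename string would work as well.
document = io.BytesIO(b"<books><title>A</title><title>B</title></books>")

handler = TitleCounter()
result = xml.sax.parse(document, handler)

# parse() returns nothing; the outcome lives on the handler.
title_count = handler.count
```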
| 102 | |
| 103 | <P> |
| 104 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 105 | <td><nobr><b><tt id='l2h-4465' xml:id='l2h-4465' class="function">parseString</tt></b>(</nobr></td> |
| 106 | <td><var>string, handler</var><big>[</big><var>, error_handler</var><big>]</big><var></var>)</td></tr></table></dt> |
| 107 | <dd> |
| 108 | Similar to <tt class="function">parse()</tt>, but parses the document held in the |
| 109 | buffer <var>string</var> instead of reading it from a file. |
| 110 | </dl> |
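<P>
As a brief illustration, the sketch below collects character data from
a document held in memory. The <tt class="class">TextCollector</tt> class is a
hypothetical handler. The chunks are joined at the end because a parser
may deliver text in more than one <tt class="method">characters()</tt> call.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # May be called several times for one run of text.
        self.chunks.append(content)


collector = TextCollector()
xml.sax.parseString(b"<greeting>Hello, SAX</greeting>", collector)
text = "".join(collector.chunks)
```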
| 111 | |
| 112 | <P> |
| 113 | A typical SAX application uses three kinds of objects: readers, |
| 114 | handlers and input sources. "Reader" in this context is another |
| 115 | term for parser, that is, a piece of code that reads the bytes or |
| 116 | characters from the input source and produces a sequence of events. |
| 117 | The events are then distributed to the handler objects: the |
| 118 | reader invokes a method on the handler for each event. A SAX application must |
| 119 | therefore obtain a reader object, create or open the input sources, |
| 120 | create the handlers, and connect these objects together. As the |
| 121 | final step, the reader is called to parse the input. |
| 122 | During parsing, methods on the handler objects are called based on |
| 123 | structural and syntactic events from the input data. |
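<P>
The steps described above can be sketched as follows. The
<tt class="class">EventLog</tt> handler and the tiny document are hypothetical;
the point is only to show the reader, the input source and the handler
being created separately and then connected.

```python
import io
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Hypothetical handler that records element events in order."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the input source and attach a byte stream to it.
source = xml.sax.InputSource()
source.setByteStream(io.BytesIO(b"<a><b/></a>"))

# 3. Create the handler and connect it to the reader.
handler = EventLog()
reader.setContentHandler(handler)

# 4. Ask the reader to parse; it calls back into the handler.
reader.parse(source)
events = handler.events
```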
| 124 | |
| 125 | <P> |
| 126 | For these objects, only the interfaces are relevant; they are normally |
| 127 | not instantiated by the application itself. Since Python does not have |
| 128 | an explicit notion of interface, they are formally introduced as |
| 129 | classes, but applications may use implementations which do not inherit |
| 130 | from the provided classes. The <tt class="class">InputSource</tt>, <tt class="class">Locator</tt>, |
| 131 | <tt class="class">Attributes</tt>, <tt class="class">AttributesNS</tt>, and |
| 132 | <tt class="class">XMLReader</tt> interfaces are defined in the module |
| 133 | <tt class="module"><a href="module-xml.sax.xmlreader.html">xml.sax.xmlreader</a></tt>. The handler interfaces are defined in |
| 134 | <tt class="module"><a href="module-xml.sax.handler.html">xml.sax.handler</a></tt>. For convenience, <tt class="class">InputSource</tt> |
| 135 | (which is often instantiated directly) and the handler classes are |
| 136 | also available from <tt class="module">xml.sax</tt>. These interfaces are described |
| 137 | below. |
| 138 | |
| 139 | <P> |
| 140 | In addition to these classes, <tt class="module">xml.sax</tt> provides the following |
| 141 | exception classes. |
| 142 | |
| 143 | <P> |
| 144 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 145 | <td><nobr><b><span class="typelabel">exception</span> <tt id='l2h-4466' xml:id='l2h-4466' class="exception">SAXException</tt></b>(</nobr></td> |
| 146 | <td><var>msg</var><big>[</big><var>, exception</var><big>]</big><var></var>)</td></tr></table></dt> |
| 147 | <dd> |
| 148 | Encapsulate an XML error or warning. This class can contain basic |
| 149 | error or warning information from either the XML parser or the |
| 150 | application: it can be subclassed to provide additional |
| 151 | functionality or to add localization. Note that although the |
| 152 | handlers defined in the <tt class="class">ErrorHandler</tt> interface receive |
| 153 | instances of this exception, the exception does not have to be raised; |
| 154 | it is also useful purely as a container for information. |
| 155 | |
| 156 | <P> |
| 157 | When instantiated, <var>msg</var> should be a human-readable description |
| 158 | of the error. The optional <var>exception</var> parameter, if given, |
| 159 | should be <code>None</code> or an exception that was caught by the parsing |
| 160 | code and is being passed along as information. |
| 161 | |
| 162 | <P> |
| 163 | This is the base class for the other SAX exception classes. |
| 164 | </dl> |
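<P>
A small sketch of the "container" use: an application catches one of its
own errors and wraps it in a <tt class="exception">SAXException</tt> without
raising it. The message text and the failing conversion are made up for
the example.

```python
from xml.sax import SAXException

try:
    int("not a number")
except ValueError as caught:
    # Wrap the original error; nothing is raised here.
    wrapped = SAXException("could not read the record count", caught)

message = wrapped.getMessage()
original = wrapped.getException()

# Without the second argument there is no embedded exception.
plain = SAXException("plain message")
```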
| 165 | |
| 166 | <P> |
| 167 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 168 | <td><nobr><b><span class="typelabel">exception</span> <tt id='l2h-4467' xml:id='l2h-4467' class="exception">SAXParseException</tt></b>(</nobr></td> |
| 169 | <td><var>msg, exception, locator</var>)</td></tr></table></dt> |
| 170 | <dd> |
| 171 | Subclass of <tt class="exception">SAXException</tt> raised on parse errors. |
| 172 | Instances of this class are passed to the methods of the SAX |
| 173 | <tt class="class">ErrorHandler</tt> interface to provide information about the |
| 174 | parse error. This class supports the SAX <tt class="class">Locator</tt> interface |
| 175 | as well as the <tt class="class">SAXException</tt> interface. |
| 176 | </dl> |
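<P>
The sketch below parses a deliberately malformed document (the
<code>item</code> element is never closed) with no error handler supplied,
so the parse error surfaces as a raised
<tt class="exception">SAXParseException</tt>. The position is then read through
the <tt class="class">Locator</tt>-style accessors; the exact numbers reported
depend on the parser in use.

```python
import xml.sax

# <item> is never closed, so the document is not well-formed.
malformed = b"<root>\n  <item>\n</root>"

error = None
try:
    xml.sax.parseString(malformed, xml.sax.ContentHandler())
except xml.sax.SAXParseException as exc:
    error = exc

if error is not None:
    # Locator interface: where the problem was detected.
    line = error.getLineNumber()
    column = error.getColumnNumber()
    # SAXException interface: what went wrong.
    description = error.getMessage()
```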
| 177 | |
| 178 | <P> |
| 179 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 180 | <td><nobr><b><span class="typelabel">exception</span> <tt id='l2h-4468' xml:id='l2h-4468' class="exception">SAXNotRecognizedException</tt></b>(</nobr></td> |
| 181 | <td><var>msg</var><big>[</big><var>, exception</var><big>]</big><var></var>)</td></tr></table></dt> |
| 182 | <dd> |
| 183 | Subclass of <tt class="exception">SAXException</tt> raised when a SAX |
| 184 | <tt class="class">XMLReader</tt> is confronted with an unrecognized feature or |
| 185 | property. SAX applications and extensions may use this class for |
| 186 | similar purposes. |
| 187 | </dl> |
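<P>
For illustration, the following sketch queries a reader for a feature
name it cannot know. The URI is invented for the example, and the
behaviour shown assumes the default expat-based reader bundled with
CPython.

```python
import xml.sax

reader = xml.sax.make_parser()

# Hypothetical feature name that no reader is expected to know.
unknown = "http://example.org/sax/features/no-such-feature"

try:
    reader.getFeature(unknown)
    recognized = True
except xml.sax.SAXNotRecognizedException:
    recognized = False
```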
| 188 | |
| 189 | <P> |
| 190 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 191 | <td><nobr><b><span class="typelabel">exception</span> <tt id='l2h-4469' xml:id='l2h-4469' class="exception">SAXNotSupportedException</tt></b>(</nobr></td> |
| 192 | <td><var>msg</var><big>[</big><var>, exception</var><big>]</big><var></var>)</td></tr></table></dt> |
| 193 | <dd> |
| 194 | Subclass of <tt class="exception">SAXException</tt> raised when a SAX |
| 195 | <tt class="class">XMLReader</tt> is asked to enable a feature that is not |
| 196 | supported, or to set a property to a value that the implementation |
| 197 | does not support. SAX applications and extensions may use this |
| 198 | class for similar purposes. |
| 199 | </dl> |
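<P>
A sketch of probing for an optional capability: the helper below
(<tt class="function">validation_available()</tt>, a hypothetical name) tries to
switch on DTD validation and reports whether the reader accepted the
request. With the expat-based reader bundled with CPython, which does
not validate, the request is expected to be refused; other readers may
behave differently.

```python
import xml.sax
from xml.sax.handler import feature_validation


def validation_available(reader):
    """Return True if the reader agrees to enable validation."""
    try:
        reader.setFeature(feature_validation, True)
    except xml.sax.SAXNotSupportedException:
        # The feature is known but cannot be enabled.
        return False
    except xml.sax.SAXNotRecognizedException:
        # The reader does not know the feature at all.
        return False
    return True


available = validation_available(xml.sax.make_parser())
```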
| 200 | |
| 201 | <P> |
| 202 | <div class="seealso"> |
| 203 | <p class="heading">See Also:</p> |
| 204 | |
| 205 | <dl compact="compact" class="seetitle"> |
| 206 | <dt><em class="citetitle"><a href="http://www.saxproject.org/" |
| 207 | >SAX: The Simple API for |
| 208 | XML</a></em></dt> |
| 209 | <dd>This site is the focal point for the definition of |
| 210 | the SAX API. It provides a Java implementation and online |
| 211 | documentation. Links to implementations and historical |
| 212 | information are also available.</dd> |
| 213 | </dl> |
| 214 | |
| 215 | <P> |
| 216 | <dl compact="compact" class="seemodule"> |
| 217 | <dt>Module <b><tt class="module"><a href="module-xml.sax.handler.html">xml.sax.handler</a></tt>:</b> |
| 218 | <dd>Definitions of the interfaces for |
| 219 | application-provided objects. |
| 220 | </dl> |
| 221 | |
| 222 | <P> |
| 223 | <dl compact="compact" class="seemodule"> |
| 224 | <dt>Module <b><tt class="module"><a href="module-xml.sax.saxutils.html">xml.sax.saxutils</a></tt>:</b> |
| 225 | <dd>Convenience functions for use in SAX |
| 226 | applications. |
| 227 | </dl> |
| 228 | |
| 229 | <P> |
| 230 | <dl compact="compact" class="seemodule"> |
| 231 | <dt>Module <b><tt class="module"><a href="module-xml.sax.xmlreader.html">xml.sax.xmlreader</a></tt>:</b> |
| 232 | <dd>Definitions of the interfaces for |
| 233 | parser-provided objects. |
| 234 | </dl> |
| 235 | </div> |
| 236 | |
| 237 | <P> |
| 238 | |
| 239 | <p><br /></p><hr class='online-navigation' /> |
| 240 | <div class='online-navigation'> |
| 241 | <!--Table of Child-Links--> |
| 242 | <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></a> |
| 243 | |
| 244 | <UL CLASS="ChildLinks"> |
| 245 | <LI><A href="sax-exception-objects.html">13.9.1 SAXException Objects</a> |
| 246 | </ul> |
| 247 | <!--End of Table of Child-Links--> |
| 248 | </div> |
| 249 | |
| 250 | <DIV CLASS="navigation"> |
| 283 | <hr /> |
| 284 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> |
| 285 | </DIV> |
| 286 | <!--End of Navigation Panel--> |
| 287 | <ADDRESS> |
| 288 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. |
| 289 | </ADDRESS> |
| 290 | </BODY> |
| 291 | </HTML> |