| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
| 2 | <html> |
| 3 | <head> |
| 4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> |
| 5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> |
| 6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> |
| 7 | <link rel="first" href="lib.html" title='Python Library Reference' /> |
| 8 | <link rel='contents' href='contents.html' title="Contents" /> |
| 9 | <link rel='index' href='genindex.html' title='Index' /> |
| 10 | <link rel='last' href='about.html' title='About this document...' /> |
| 11 | <link rel='help' href='about.html' title='About this document...' /> |
| 12 | <link rel="next" href="module-xml.dom.pulldom.html" /> |
| 13 | <link rel="prev" href="module-xml.dom.html" /> |
| 14 | <link rel="parent" href="markup.html" /> |
| 15 | <link rel="next" href="dom-objects.html" /> |
| 16 | <meta name='aesop' content='information' /> |
| 17 | <title>13.7 xml.dom.minidom -- Lightweight DOM implementation</title> |
| 18 | </head> |
| 19 | <body> |
| 20 | <DIV CLASS="navigation"> |
| 21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> |
| 22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> |
| 23 | <tr> |
| 24 | <td class='online-navigation'><a rel="prev" title="13.6.3.2 Accessor Methods" |
| 25 | href="dom-accessor-methods.html"><img src='../icons/previous.png' |
| 26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> |
| 27 | <td class='online-navigation'><a rel="parent" title="13. Structured Markup Processing" |
| 28 | href="markup.html"><img src='../icons/up.png' |
| 29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> |
| 30 | <td class='online-navigation'><a rel="next" title="13.7.1 DOM Objects" |
| 31 | href="dom-objects.html"><img src='../icons/next.png' |
| 32 | border='0' height='32' alt='Next Page' width='32' /></A></td> |
| 33 | <td align="center" width="100%">Python Library Reference</td> |
| 34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" |
| 35 | href="contents.html"><img src='../icons/contents.png' |
| 36 | border='0' height='32' alt='Contents' width='32' /></A></td> |
| 37 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' |
| 38 | border='0' height='32' alt='Module Index' width='32' /></a></td> |
| 39 | <td class='online-navigation'><a rel="index" title="Index" |
| 40 | href="genindex.html"><img src='../icons/index.png' |
| 41 | border='0' height='32' alt='Index' width='32' /></A></td> |
| 42 | </tr></table> |
| 43 | <div class='online-navigation'> |
| 44 | <b class="navlabel">Previous:</b> |
| 45 | <a class="sectref" rel="prev" href="dom-accessor-methods.html">13.6.3.2 Accessor Methods</A> |
| 46 | <b class="navlabel">Up:</b> |
| 47 | <a class="sectref" rel="parent" href="markup.html">13. Structured Markup Processing</A> |
| 48 | <b class="navlabel">Next:</b> |
| 49 | <a class="sectref" rel="next" href="dom-objects.html">13.7.1 DOM Objects</A> |
| 50 | </div> |
| 51 | <hr /></div> |
| 52 | </DIV> |
| 53 | <!--End of Navigation Panel--> |
| 54 | |
| 55 | <H1><A NAME="SECTION0015700000000000000000"> |
| 56 | 13.7 <tt class="module">xml.dom.minidom</tt> -- |
| 57 | Lightweight DOM implementation</A> |
| 58 | </H1> |
| 59 | |
| 60 | <P> |
| 61 | <A NAME="module-xml.dom.minidom"></A> |
| 62 | |
| 63 | <P> |
| 64 | |
| 65 | <span class="versionnote">New in version 2.0.</span> |
| 66 | |
| 67 | <P> |
| 68 | <tt class="module">xml.dom.minidom</tt> is a lightweight implementation of the |
| 69 | Document Object Model interface. It is intended to be |
| 70 | simpler than the full DOM and also significantly smaller. |
| 71 | |
| 72 | <P> |
| 73 | DOM applications typically start by parsing some XML into a DOM. With |
| 74 | <tt class="module">xml.dom.minidom</tt>, this is done through the parse functions: |
| 75 | |
| 76 | <P> |
| 77 | <div class="verbatim"><pre> |
| 78 | from xml.dom.minidom import parse, parseString |
| 79 | |
| 80 | dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name |
| 81 | |
| 82 | datasource = open('c:\\temp\\mydata.xml') |
| 83 | dom2 = parse(datasource) # parse an open file |
| 84 | |
| 85 | dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>') |
| 86 | </pre></div> |
| 87 | |
| 88 | <P> |
| 89 | The <tt class="function">parse()</tt> function can take either a filename or an open |
| 90 | file object. |
| 91 | |
| 92 | <P> |
| 93 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 94 | <td><nobr><b><tt id='l2h-4445' xml:id='l2h-4445' class="function">parse</tt></b>(</nobr></td> |
| 95 | <td><var>filename_or_file</var><big>[</big><var>, parser</var><big>]</big><var></var>)</td></tr></table></dt> |
| 96 | <dd> |
| 97 | Return a <tt class="class">Document</tt> from the given input. <var>filename_or_file</var> |
| 98 | may be either a file name, or a file-like object. <var>parser</var>, if |
| 99 | given, must be a SAX2 parser object. This function will change the |
| 100 | document handler of the parser and activate namespace support; other |
| 101 | parser configuration (like setting an entity resolver) must have been |
| 102 | done in advance. |
| 103 | </dl> |
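<P>
As a minimal sketch of the optional <var>parser</var> argument, the following passes a SAX2 parser created with <tt class="function">xml.sax.make_parser()</tt> to <tt class="function">parse()</tt>. The in-memory <tt class="class">BytesIO</tt> stream and the sample markup are assumptions made for illustration; they stand in for a real file.

```python
import io
from xml.sax import make_parser
from xml.dom.minidom import parse

# Create a SAX2 parser; any extra configuration (for example an entity
# resolver) would have to be applied here, before calling parse().
sax_parser = make_parser()

# Hypothetical sample input held in memory instead of a file on disk.
source = io.BytesIO(b"<inventory><item/><item/></inventory>")

# parse() installs its own document handler on the parser and builds the DOM.
doc = parse(source, sax_parser)
root_name = doc.documentElement.tagName
item_count = len(doc.getElementsByTagName("item"))
doc.unlink()
```

<P>
Configuring the parser before the call matters because <tt class="function">parse()</tt> only replaces the document handler and turns on namespace support; it leaves every other setting as it finds it.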
| 104 | |
| 105 | <P> |
| 106 | If you have XML in a string, you can use the |
| 107 | <tt class="function">parseString()</tt> function instead: |
| 108 | |
| 109 | <P> |
| 110 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 111 | <td><nobr><b><tt id='l2h-4446' xml:id='l2h-4446' class="function">parseString</tt></b>(</nobr></td> |
| 112 | <td><var>string</var><big>[</big><var>, parser</var><big>]</big><var></var>)</td></tr></table></dt> |
| 113 | <dd> |
| 114 | Return a <tt class="class">Document</tt> that represents the <var>string</var>. This |
| 115 | function creates a <tt class="class">StringIO</tt> object for the string and passes |
| 116 | that on to <tt class="function">parse()</tt>. |
| 117 | </dl> |
| 118 | |
| 119 | <P> |
| 120 | Both functions return a <tt class="class">Document</tt> object representing the |
| 121 | content of the document. |
| 122 | |
| 123 | <P> |
| 124 | What the <tt class="function">parse()</tt> and <tt class="function">parseString()</tt> functions do |
| 125 | is connect an XML parser with a ``DOM builder'' that can accept parse |
| 126 | events from any SAX parser and convert them into a DOM tree. The names |
| 127 | of the functions are perhaps misleading, but they are easy to grasp when |
| 128 | learning the interfaces. The parsing of the document will be |
| 129 | completed before these functions return; it's simply that these |
| 130 | functions do not provide a parser implementation themselves. |
| 131 | |
| 132 | <P> |
| 133 | You can also create a <tt class="class">Document</tt> by calling a method on a ``DOM |
| 134 | Implementation'' object. You can get this object by calling |
| 135 | the <tt class="function">getDOMImplementation()</tt> function in either the |
| 136 | <tt class="module"><a href="module-xml.dom.html">xml.dom</a></tt> package or the <tt class="module">xml.dom.minidom</tt> module. |
| 137 | Using the implementation from the <tt class="module">xml.dom.minidom</tt> module will |
| 138 | always return a <tt class="class">Document</tt> instance from the minidom |
| 139 | implementation, while the version from <tt class="module"><a href="module-xml.dom.html">xml.dom</a></tt> may provide |
| 140 | an alternate implementation (this is likely if you have the |
| 141 | <a class="ulink" href="http://pyxml.sourceforge.net/" |
| 142 | >PyXML package</a> installed). Once |
| 143 | you have a <tt class="class">Document</tt>, you can add child nodes to it to populate |
| 144 | the DOM: |
| 145 | |
| 146 | <P> |
| 147 | <div class="verbatim"><pre> |
| 148 | from xml.dom.minidom import getDOMImplementation |
| 149 | |
| 150 | impl = getDOMImplementation() |
| 151 | |
| 152 | newdoc = impl.createDocument(None, "some_tag", None) |
| 153 | top_element = newdoc.documentElement |
| 154 | text = newdoc.createTextNode('Some textual content.') |
| 155 | top_element.appendChild(text) |
| 156 | </pre></div> |
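<P>
The difference between the two <tt class="function">getDOMImplementation()</tt> functions can be sketched as follows. The element name <code>child</code> is a hypothetical choice for illustration, and which implementation the generic <tt class="module">xml.dom</tt> lookup returns depends on the installation and environment, so the sketch builds the document with the minidom one.

```python
import xml.dom
from xml.dom import minidom

# Ask the generic registry for an implementation; which one is returned
# depends on what is installed and on the environment.
generic_impl = xml.dom.getDOMImplementation()

# Ask minidom directly; this always yields minidom's own implementation.
minidom_impl = minidom.getDOMImplementation()

doc = minidom_impl.createDocument(None, "some_tag", None)
top_element = doc.documentElement

# Populate the tree: a child element (hypothetical name) holding a text node.
child = doc.createElement("child")
child.appendChild(doc.createTextNode("Some textual content."))
top_element.appendChild(child)
```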
| 157 | |
| 158 | <P> |
| 159 | Once you have a DOM document object, you can access the parts of your |
| 160 | XML document through its properties and methods. These properties are |
| 161 | defined in the DOM specification. The main property of the document |
| 162 | object is the <tt class="member">documentElement</tt> property. It gives you the |
| 163 | main element in the XML document: the one that holds all others. Here |
| 164 | is an example program: |
| 165 | |
| 166 | <P> |
| 167 | <div class="verbatim"><pre> |
| 168 | dom3 = parseString("<myxml>Some data</myxml>") |
| 169 | assert dom3.documentElement.tagName == "myxml" |
| 170 | </pre></div> |
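<P>
Starting from <tt class="member">documentElement</tt>, the rest of the tree can be reached through standard DOM properties such as <tt class="member">childNodes</tt> and <tt class="member">nodeType</tt>. The following is a small sketch that lists the direct children of the root element; the <code>children</code> list and its tuple layout are illustrative choices, not part of the API.

```python
from xml.dom.minidom import parseString

dom = parseString("<myxml>Some data<empty/> some more data</myxml>")
root = dom.documentElement

# Collect a short description of each direct child of the root element.
children = []
for node in root.childNodes:
    if node.nodeType == node.TEXT_NODE:
        children.append(("text", node.data))
    elif node.nodeType == node.ELEMENT_NODE:
        children.append(("element", node.tagName))
```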
| 171 | |
| 172 | <P> |
| 173 | When you are finished with a DOM, you should clean it up. This is |
| 174 | necessary because some versions of Python do not support garbage |
| 175 | collection of objects that refer to each other in a cycle. Until this |
| 176 | restriction is removed from all versions of Python, it is safest to |
| 177 | write your code as if cycles would not be cleaned up. |
| 178 | |
| 179 | <P> |
| 180 | The way to clean up a DOM is to call its <tt class="method">unlink()</tt> method: |
| 181 | |
| 182 | <P> |
| 183 | <div class="verbatim"><pre> |
| 184 | dom1.unlink() |
| 185 | dom2.unlink() |
| 186 | dom3.unlink() |
| 187 | </pre></div> |
| 188 | |
| 189 | <P> |
| 190 | <tt class="method">unlink()</tt> is an <tt class="module">xml.dom.minidom</tt>-specific extension to |
| 191 | the DOM API. After calling <tt class="method">unlink()</tt> on a node, the node and |
| 192 | its descendants are essentially useless. |
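<P>
Because an unlinked tree can no longer be used, one possible pattern is to copy out whatever is needed first and call <tt class="method">unlink()</tt> in a <code>finally</code> clause, so that cleanup happens even when processing fails. The helper name <code>root_tag_name</code> below is hypothetical.

```python
from xml.dom.minidom import parseString

def root_tag_name(xml_text):
    """Return the tag name of the root element of xml_text."""
    dom = parseString(xml_text)
    try:
        # Copy out what is needed while the tree is still intact.
        return dom.documentElement.tagName
    finally:
        # Break the reference cycles, even if an error occurred above.
        dom.unlink()

name = root_tag_name("<myxml>Some data</myxml>")
```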
| 193 | |
| 194 | <P> |
| 195 | <div class="seealso"> |
| 196 | <p class="heading">See Also:</p> |
| 197 | |
| 198 | <dl compact="compact" class="seetitle"> |
| 199 | <dt><em class="citetitle"><a href="http://www.w3.org/TR/REC-DOM-Level-1/" |
| 200 | >Document Object |
| 201 | Model (DOM) Level 1 Specification</a></em></dt> |
| 202 | <dd>The W3C recommendation for the |
| 203 | DOM supported by <tt class="module">xml.dom.minidom</tt>.</dd> |
| 204 | </dl> |
| 205 | </div> |
| 206 | |
| 207 | <P> |
| 208 | |
| 209 | <p><br /></p><hr class='online-navigation' /> |
| 210 | <div class='online-navigation'> |
| 211 | <!--Table of Child-Links--> |
| 212 | <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></a> |
| 213 | |
| 214 | <UL CLASS="ChildLinks"> |
| 215 | <LI><A href="dom-objects.html">13.7.1 DOM Objects</a> |
| 216 | <LI><A href="dom-example.html">13.7.2 DOM Example</a> |
| 217 | <LI><A href="minidom-and-dom.html">13.7.3 minidom and the DOM standard</a> |
| 218 | </ul> |
| 219 | <!--End of Table of Child-Links--> |
| 220 | </div> |
| 221 | |
| 222 | <DIV CLASS="navigation"> |
| 255 | <hr /> |
| 256 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> |
| 257 | </DIV> |
| 258 | <!--End of Navigation Panel--> |
| 259 | <ADDRESS> |
| 260 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. |
| 261 | </ADDRESS> |
| 262 | </BODY> |
| 263 | </HTML> |