| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
| 2 | <html> |
| 3 | <head> |
| 4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> |
| 5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> |
| 6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> |
| 7 | <link rel="first" href="lib.html" title='Python Library Reference' /> |
| 8 | <link rel='contents' href='contents.html' title="Contents" /> |
| 9 | <link rel='index' href='genindex.html' title='Index' /> |
| 10 | <link rel='last' href='about.html' title='About this document...' /> |
| 11 | <link rel='help' href='about.html' title='About this document...' /> |
| 12 | <link rel="prev" href="match-objects.html" /> |
| 13 | <link rel="parent" href="module-re.html" /> |
| 14 | <link rel="next" href="module-struct.html" /> |
| 15 | <meta name='aesop' content='information' /> |
| 16 | <title>4.2.6 Examples</title> |
| 17 | </head> |
| 18 | <body> |
| 19 | <DIV CLASS="navigation"> |
| 20 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> |
| 21 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> |
| 22 | <tr> |
| 23 | <td class='online-navigation'><a rel="prev" title="4.2.5 Match Objects" |
| 24 | href="match-objects.html"><img src='../icons/previous.png' |
| 25 | border='0' height='32' alt='Previous Page' width='32' /></A></td> |
| 26 | <td class='online-navigation'><a rel="parent" title="4.2 re " |
| 27 | href="module-re.html"><img src='../icons/up.png' |
| 28 | border='0' height='32' alt='Up One Level' width='32' /></A></td> |
| 29 | <td class='online-navigation'><a rel="next" title="4.3 struct " |
| 30 | href="module-struct.html"><img src='../icons/next.png' |
| 31 | border='0' height='32' alt='Next Page' width='32' /></A></td> |
| 32 | <td align="center" width="100%">Python Library Reference</td> |
| 33 | <td class='online-navigation'><a rel="contents" title="Table of Contents" |
| 34 | href="contents.html"><img src='../icons/contents.png' |
| 35 | border='0' height='32' alt='Contents' width='32' /></A></td> |
| 36 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' |
| 37 | border='0' height='32' alt='Module Index' width='32' /></a></td> |
| 38 | <td class='online-navigation'><a rel="index" title="Index" |
| 39 | href="genindex.html"><img src='../icons/index.png' |
| 40 | border='0' height='32' alt='Index' width='32' /></A></td> |
| 41 | </tr></table> |
| 42 | <div class='online-navigation'> |
| 43 | <b class="navlabel">Previous:</b> |
| 44 | <a class="sectref" rel="prev" href="match-objects.html">4.2.5 Match Objects</A> |
| 45 | <b class="navlabel">Up:</b> |
| 46 | <a class="sectref" rel="parent" href="module-re.html">4.2 re </A> |
| 47 | <b class="navlabel">Next:</b> |
| 48 | <a class="sectref" rel="next" href="module-struct.html">4.3 struct </A> |
| 49 | </div> |
| 50 | <hr /></div> |
| 51 | </DIV> |
| 52 | <!--End of Navigation Panel--> |
| 53 | |
| 54 | <H2><A NAME="SECTION006260000000000000000"> |
| 55 | 4.2.6 Examples</A> |
| 56 | </H2> |
| 57 | |
| 58 | <P> |
| 59 | <DIV CLASS="leftline" ID="par86950" ALIGN="LEFT"> |
| 60 | <strong>Simulating <tt class="cfunction">scanf()</tt></strong></DIV> |
| 61 | |
| 62 | <P> |
| 63 | Python does not currently have an equivalent to <tt class="cfunction">scanf()</tt>. |
| 64 | <a id='l2h-914' xml:id='l2h-914'></a> |
| 65 | Regular expressions are generally more powerful, though also more |
| 66 | verbose, than <tt class="cfunction">scanf()</tt> format strings. The table below |
| 67 | offers some more-or-less equivalent mappings between |
| 68 | <tt class="cfunction">scanf()</tt> format tokens and regular expressions. |
| 69 | |
| 70 | <P> |
| 71 | <div class="center"><table class="realtable"> |
| 72 | <thead> |
| 73 | <tr> |
| 74 | <th class="left" ><tt class="cfunction">scanf()</tt> Token</th> |
| 75 | <th class="left" >Regular Expression</th> |
| 76 | </tr> |
| 77 | </thead> |
| 78 | <tbody> |
| 79 | <tr><td class="left" valign="baseline"><code>%c</code></td> |
| 80 | <td class="left" ><tt class="regexp">.</tt></td></tr> |
| 81 | <tr><td class="left" valign="baseline"><code>%5c</code></td> |
| 82 | <td class="left" ><tt class="regexp">.{5}</tt></td></tr> |
| 83 | <tr><td class="left" valign="baseline"><code>%d</code></td> |
| 84 | <td class="left" ><tt class="regexp">[-+]?\d+</tt></td></tr> |
| 85 | <tr><td class="left" valign="baseline"><code>%e</code>, <code>%E</code>, <code>%f</code>, <code>%g</code></td> |
| 86 | <td class="left" ><tt class="regexp">[-+]?(\d+(\.\d*)?|\d*\.\d+)([eE][-+]?\d+)?</tt></td></tr> |
| 87 | <tr><td class="left" valign="baseline"><code>%i</code></td> |
| 88 | <td class="left" ><tt class="regexp">[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)</tt></td></tr> |
| 89 | <tr><td class="left" valign="baseline"><code>%o</code></td> |
| 90 | <td class="left" ><tt class="regexp">[-+]?[0-7]+</tt></td></tr> |
| 91 | <tr><td class="left" valign="baseline"><code>%s</code></td> |
| 92 | <td class="left" ><tt class="regexp">\S+</tt></td></tr> |
| 93 | <tr><td class="left" valign="baseline"><code>%u</code></td> |
| 94 | <td class="left" ><tt class="regexp">\d+</tt></td></tr> |
| 95 | <tr><td class="left" valign="baseline"><code>%x</code>, <code>%X</code></td> |
| 96 | <td class="left" ><tt class="regexp">[-+]?(0[xX])?[\dA-Fa-f]+</tt></td></tr></tbody> |
| 97 | </table></div> |
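<P>
As a rough illustration of how the table can be used, the sketch below copies a few of the expressions into a dictionary and checks whether a whole string matches one token. The names <code>SCANF_PATTERNS</code> and <code>matches_token</code> are hypothetical helpers introduced only for this example; they are not part of the <code>re</code> module.

```python
import re

# Regular expressions copied from the table above, keyed by scanf() token.
# SCANF_PATTERNS and matches_token are illustrative names, not library API.
SCANF_PATTERNS = {
    '%d': r'[-+]?\d+',
    '%e': r'[-+]?(\d+(\.\d*)?|\d*\.\d+)([eE][-+]?\d+)?',
    '%i': r'[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)',
    '%s': r'\S+',
    '%u': r'\d+',
}

def matches_token(token, text):
    """Return True if the whole of text matches the pattern for token."""
    # Wrap the pattern in a non-capturing group and anchor it at the end,
    # so that a partial match such as '4' in '4.2' is rejected.
    pattern = '(?:%s)\\Z' % SCANF_PATTERNS[token]
    return re.match(pattern, text) is not None
```

<P>
Anchoring with <code>\Z</code> is a design choice for this sketch; <tt class="cfunction">scanf()</tt> itself stops at the first character that does not fit the conversion.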
| 98 | |
| 99 | <P> |
| 100 | To extract the filename and numbers from a string like |
| 101 | |
| 102 | <P> |
| 103 | <div class="verbatim"><pre> |
| 104 | /usr/sbin/sendmail - 0 errors, 4 warnings |
| 105 | </pre></div> |
| 106 | |
| 107 | <P> |
| 108 | you would use a <tt class="cfunction">scanf()</tt> format like |
| 109 | |
| 110 | <P> |
| 111 | <div class="verbatim"><pre> |
| 112 | %s - %d errors, %d warnings |
| 113 | </pre></div> |
| 114 | |
| 115 | <P> |
| 116 | The equivalent regular expression would be |
| 117 | |
| 118 | <P> |
| 119 | <div class="verbatim"><pre> |
| 120 | (\S+) - (\d+) errors, (\d+) warnings |
| 121 | </pre></div> |
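<P>
A minimal sketch of applying this expression from Python follows. The helper name <code>parse_status</code> is an assumption made for the example. Because the groups come back as strings, the numeric fields are converted explicitly, which <tt class="cfunction">scanf()</tt> would have done through <code>%d</code>.

```python
import re

# Pattern from the text above: one capture group per scanf() conversion.
LINE_RE = re.compile(r'(\S+) - (\d+) errors, (\d+) warnings')

def parse_status(line):
    """Return (filename, errors, warnings), or None if line does not match."""
    # parse_status is a hypothetical helper written for this example.
    m = LINE_RE.match(line)
    if m is None:
        return None
    filename, errors, warnings = m.groups()
    # The regular expression yields strings; convert the counts to integers.
    return filename, int(errors), int(warnings)

result = parse_status('/usr/sbin/sendmail - 0 errors, 4 warnings')
```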
| 122 | |
| 123 | <P> |
| 124 | <DIV CLASS="leftline" ID="par86951" ALIGN="LEFT"> |
| 125 | <strong>Avoiding recursion</strong></DIV> |
| 126 | |
| 127 | <P> |
| 128 | If you create regular expressions that require the engine to perform a |
| 129 | lot of recursion, you may encounter a <code>RuntimeError</code> exception with |
| 130 | the message <code>maximum recursion limit exceeded</code>. For example, |
| 131 | |
| 132 | <P> |
| 133 | <div class="verbatim"><pre> |
| 134 | >>> import re |
| 135 | >>> s = 'Begin ' + 1000*'a very long string ' + 'end' |
| 136 | >>> re.match(r'Begin (\w| )*? end', s).end() |
| 137 | Traceback (most recent call last): |
| 138 | File "<stdin>", line 1, in ? |
| 139 | File "/usr/local/lib/python2.3/sre.py", line 132, in match |
| 140 | return _compile(pattern, flags).match(string) |
| 141 | RuntimeError: maximum recursion limit exceeded |
| 142 | </pre></div> |
| 143 | |
| 144 | <P> |
| 145 | You can often restructure your regular expression to avoid recursion. |
| 146 | |
| 147 | <P> |
| 148 | Starting with Python 2.3, simple uses of the <tt class="regexp">*?</tt> pattern are |
| 149 | special-cased to avoid recursion. Thus, the above regular expression |
| 150 | can avoid recursion by replacing the group <tt class="regexp">(\w| )</tt> with a character class: |
| 151 | <tt class="regexp">Begin [a-zA-Z0-9_ ]*?end</tt>. As a further benefit, such regular |
| 152 | expressions will run faster than their recursive equivalents. |
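<P>
The recast expression can be tried against the same long string, as sketched below. This only demonstrates that the character-class form matches through to the final <code>end</code>. Whether the original grouped form still raises an exception depends on the interpreter version, so that part is not reproduced here.

```python
import re

# Same test string as in the example above.
s = 'Begin ' + 1000 * 'a very long string ' + 'end'

# Recast pattern from the text: a character class under *? instead of a group.
m = re.match(r'Begin [a-zA-Z0-9_ ]*?end', s)

# The non-greedy repeat stops at the first 'end', which here is the last
# three characters of s, so the match covers the whole string.
end_pos = m.end()
```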
| 153 | |
| 154 | <DIV CLASS="navigation"> |
| 187 | <hr /> |
| 188 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> |
| 189 | </DIV> |
| 190 | <!--End of Navigation Panel--> |
| 191 | <ADDRESS> |
| 192 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. |
| 193 | </ADDRESS> |
| 194 | </BODY> |
| 195 | </HTML> |