Commit | Line | Data |
---|---|---|
920dae64 AT |
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
2 | <html> | |
3 | <head> | |
4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> | |
5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> | |
6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> | |
7 | <link rel="first" href="lib.html" title='Python Library Reference' /> | |
8 | <link rel='contents' href='contents.html' title="Contents" /> | |
9 | <link rel='index' href='genindex.html' title='Index' /> | |
10 | <link rel='last' href='about.html' title='About this document...' /> | |
11 | <link rel='help' href='about.html' title='About this document...' /> | |
12 | <link rel="prev" href="match-objects.html" /> | |
13 | <link rel="parent" href="module-re.html" /> | |
14 | <link rel="next" href="module-struct.html" /> | |
15 | <meta name='aesop' content='information' /> | |
16 | <title>4.2.6 Examples</title> | |
17 | </head> | |
18 | <body> | |
19 | <DIV CLASS="navigation"> | |
20 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> | |
21 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> | |
22 | <tr> | |
23 | <td class='online-navigation'><a rel="prev" title="4.2.5 Match Objects" | |
24 | href="match-objects.html"><img src='../icons/previous.png' | |
25 | border='0' height='32' alt='Previous Page' width='32' /></A></td> | |
26 | <td class='online-navigation'><a rel="parent" title="4.2 re " | |
27 | href="module-re.html"><img src='../icons/up.png' | |
28 | border='0' height='32' alt='Up One Level' width='32' /></A></td> | |
29 | <td class='online-navigation'><a rel="next" title="4.3 struct " | |
30 | href="module-struct.html"><img src='../icons/next.png' | |
31 | border='0' height='32' alt='Next Page' width='32' /></A></td> | |
32 | <td align="center" width="100%">Python Library Reference</td> | |
33 | <td class='online-navigation'><a rel="contents" title="Table of Contents" | |
34 | href="contents.html"><img src='../icons/contents.png' | |
35 | border='0' height='32' alt='Contents' width='32' /></A></td> | |
36 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' | |
37 | border='0' height='32' alt='Module Index' width='32' /></a></td> | |
38 | <td class='online-navigation'><a rel="index" title="Index" | |
39 | href="genindex.html"><img src='../icons/index.png' | |
40 | border='0' height='32' alt='Index' width='32' /></A></td> | |
41 | </tr></table> | |
42 | <div class='online-navigation'> | |
43 | <b class="navlabel">Previous:</b> | |
44 | <a class="sectref" rel="prev" href="match-objects.html">4.2.5 Match Objects</A> | |
45 | <b class="navlabel">Up:</b> | |
46 | <a class="sectref" rel="parent" href="module-re.html">4.2 re </A> | |
47 | <b class="navlabel">Next:</b> | |
48 | <a class="sectref" rel="next" href="module-struct.html">4.3 struct </A> | |
49 | </div> | |
50 | <hr /></div> | |
51 | </DIV> | |
52 | <!--End of Navigation Panel--> | |
53 | ||
54 | <H2><A NAME="SECTION006260000000000000000"> | |
55 | 4.2.6 Examples</A> | |
56 | </H2> | |
57 | ||
58 | <P> | |
59 | <DIV CLASS="leftline" ID="par86950" ALIGN="LEFT"> | |
60 | <strong>Simulating <tt class="cfunction">scanf()</tt></strong></DIV> | |
61 | ||
62 | <P> | |
63 | Python does not currently have an equivalent to <tt class="cfunction">scanf()</tt>. | |
64 | <a id='l2h-914' xml:id='l2h-914'></a> | |
65 | Regular expressions are generally more powerful, though also more | |
66 | verbose, than <tt class="cfunction">scanf()</tt> format strings. The table below | |
67 | offers some more-or-less equivalent mappings between | |
68 | <tt class="cfunction">scanf()</tt> format tokens and regular expressions. | |
69 | ||
70 | <P> | |
71 | <div class="center"><table class="realtable"> | |
72 | <thead> | |
73 | <tr> | |
74 | <th class="left" ><tt class="cfunction">scanf()</tt> Token</th> | |
75 | <th class="left" >Regular Expression</th> | |
76 | </tr> | |
77 | </thead> | |
78 | <tbody> | |
79 | <tr><td class="left" valign="baseline"><code>%c</code></td> | |
80 | <td class="left" ><tt class="regexp">.</tt></td></tr> | |
81 | <tr><td class="left" valign="baseline"><code>%5c</code></td> | |
82 | <td class="left" ><tt class="regexp">.{5}</tt></td></tr> | |
83 | <tr><td class="left" valign="baseline"><code>%d</code></td> | |
84 | <td class="left" ><tt class="regexp">[-+]?\d+</tt></td></tr> | |
85 | <tr><td class="left" valign="baseline"><code>%e</code>, <code>%E</code>, <code>%f</code>, <code>%g</code></td> | |
86 | <td class="left" ><tt class="regexp">[-+]?(\d+(\.\d*)?|\d*\.\d+)([eE][-+]?\d+)?</tt></td></tr> | |
87 | <tr><td class="left" valign="baseline"><code>%i</code></td> | |
88 | <td class="left" ><tt class="regexp">[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)</tt></td></tr> | |
89 | <tr><td class="left" valign="baseline"><code>%o</code></td> | |
90 | <td class="left" ><tt class="regexp">[-+]?[0-7]+</tt></td></tr> | |
91 | <tr><td class="left" valign="baseline"><code>%s</code></td> | |
92 | <td class="left" ><tt class="regexp">\S+</tt></td></tr> | |
93 | <tr><td class="left" valign="baseline"><code>%u</code></td> | |
94 | <td class="left" ><tt class="regexp">\d+</tt></td></tr> | |
95 | <tr><td class="left" valign="baseline"><code>%x</code>, <code>%X</code></td> | |
96 | <td class="left" ><tt class="regexp">[-+]?(0[xX])?[\dA-Fa-f]+</tt></td></tr></tbody> | |
97 | </table></div> | |
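<P>
As a rough illustration of how the table can be applied mechanically, the sketch below translates a <tt class="cfunction">scanf()</tt>-style format into a regular expression. The names <code>SCANF_TOKENS</code> and <code>scanf_to_regex</code> are hypothetical and not part of the <tt class="module">re</tt> module. Only a subset of the tokens is handled, and the floating-point pattern is rewritten with non-capturing groups so that each token yields exactly one group.

```python
import re

# Hypothetical lookup table covering a subset of the tokens listed above.
# The %f entry uses (?:...) so that it contributes no extra groups.
SCANF_TOKENS = {
    "%c": r".",
    "%d": r"[-+]?\d+",
    "%f": r"[-+]?(?:\d+(?:\.\d*)?|\d*\.\d+)(?:[eE][-+]?\d+)?",
    "%s": r"\S+",
    "%u": r"\d+",
}


def scanf_to_regex(fmt):
    """Return a regular expression string for a scanf()-style format.

    Each recognised token becomes one capturing group; every other
    character is escaped and matched literally.
    """
    parts = []
    i = 0
    while i < len(fmt):
        token = fmt[i:i + 2]
        if token in SCANF_TOKENS:
            parts.append("(" + SCANF_TOKENS[token] + ")")
            i += 2
        else:
            parts.append(re.escape(fmt[i]))
            i += 1
    return "".join(parts)


match = re.match(scanf_to_regex("%s = %f"), "pi = 3.14")
name, value = match.groups()
```

<P>
Unlike <tt class="cfunction">scanf()</tt>, this sketch treats whitespace in the format literally and returns every field as a string, so numeric fields still need an explicit conversion such as <code>float(value)</code>.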
98 | ||
99 | <P> | |
100 | To extract the filename and numbers from a string like | |
101 | ||
102 | <P> | |
103 | <div class="verbatim"><pre> | |
104 | /usr/sbin/sendmail - 0 errors, 4 warnings | |
105 | </pre></div> | |
106 | ||
107 | <P> | |
108 | you would use a <tt class="cfunction">scanf()</tt> format like | |
109 | ||
110 | <P> | |
111 | <div class="verbatim"><pre> | |
112 | %s - %d errors, %d warnings | |
113 | </pre></div> | |
114 | ||
115 | <P> | |
116 | The equivalent regular expression would be | |
117 | ||
118 | <P> | |
119 | <div class="verbatim"><pre> | |
120 | (\S+) - (\d+) errors, (\d+) warnings | |
121 | </pre></div> | |
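<P>
A minimal sketch of applying this expression to the sample line follows. The variable names are illustrative only; the groups come back as strings, so the two counts are converted with <code>int()</code>.

```python
import re

line = "/usr/sbin/sendmail - 0 errors, 4 warnings"
pattern = re.compile(r"(\S+) - (\d+) errors, (\d+) warnings")

match = pattern.match(line)
if match is not None:
    filename = match.group(1)           # the %s field
    n_errors = int(match.group(2))      # first %d field
    n_warnings = int(match.group(3))    # second %d field
```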
122 | ||
123 | <P> | |
124 | <DIV CLASS="leftline" ID="par86951" ALIGN="LEFT"> | |
125 | <strong>Avoiding recursion</strong></DIV> | |
126 | ||
127 | <P> | |
128 | If you create regular expressions that require the engine to perform a | |
129 | lot of recursion, you may encounter a RuntimeError exception with | |
130 | the message <code>maximum recursion limit exceeded</code>. For example, | |
131 | ||
132 | <P> | |
133 | <div class="verbatim"><pre> | |
134 | >>> import re | |
135 | >>> s = 'Begin ' + 1000*'a very long string ' + 'end' | |
136 | >>> re.match('Begin (\w| )*? end', s).end() | |
137 | Traceback (most recent call last): | |
138 | File "<stdin>", line 1, in ? | |
139 | File "/usr/local/lib/python2.3/sre.py", line 132, in match | |
140 | return _compile(pattern, flags).match(string) | |
141 | RuntimeError: maximum recursion limit exceeded | |
142 | </pre></div> | |
143 | ||
144 | <P> | |
145 | You can often restructure your regular expression to avoid recursion. | |
146 | ||
147 | <P> | |
148 | Starting with Python 2.3, simple uses of the <tt class="regexp">*?</tt> pattern are | |
149 | special-cased to avoid recursion. The above regular expression can | |
150 | therefore avoid recursion if the group <tt class="regexp">(\w| )</tt> is replaced by a character class, as in | |
151 | <tt class="regexp">Begin [a-zA-Z0-9_ ]*?end</tt>. As a further benefit, such regular | |
152 | expressions will run faster than their recursive equivalents. | |
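<P>
The recast expression can be checked against the same long string used earlier. This is only a sketch: it shows that the character-class form matches through to the final <code>end</code>, and the variable names are illustrative.

```python
import re

s = 'Begin ' + 1000 * 'a very long string ' + 'end'

# A character class in place of the group (\w| ); the class also matches
# the space that precedes 'end'.
recast = re.compile(r'Begin [a-zA-Z0-9_ ]*?end')

m = recast.match(s)
end_pos = m.end()   # offset just past the final 'end'
```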
153 | ||
154 | <DIV CLASS="navigation"> | |
187 | <hr /> | |
188 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> | |
189 | </DIV> | |
190 | <!--End of Navigation Panel--> | |
191 | <ADDRESS> | |
192 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. | |
193 | </ADDRESS> | |
194 | </BODY> | |
195 | </HTML> |