Commit | Line | Data |
---|---|---|
920dae64 AT |
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
2 | <html> | |
3 | <head> | |
4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> | |
5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> | |
6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> | |
7 | <link rel="first" href="lib.html" title='Python Library Reference' /> | |
8 | <link rel='contents' href='contents.html' title="Contents" /> | |
9 | <link rel='index' href='genindex.html' title='Index' /> | |
10 | <link rel='last' href='about.html' title='About this document...' /> | |
11 | <link rel='help' href='about.html' title='About this document...' /> | |
12 | <link rel="next" href="module-mhlib.html" /> | |
13 | <link rel="prev" href="module-mailcap.html" /> | |
14 | <link rel="parent" href="netdata.html" /> | |
15 | <link rel="next" href="mailbox-objects.html" /> | |
16 | <meta name='aesop' content='information' /> | |
17 | <title>12.4 mailbox -- Read various mailbox formats</title> | |
18 | </head> | |
19 | <body> | |
20 | <DIV CLASS="navigation"> | |
21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> | |
22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> | |
23 | <tr> | |
24 | <td class='online-navigation'><a rel="prev" title="12.3 mailcap " | |
25 | href="module-mailcap.html"><img src='../icons/previous.png' | |
26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> | |
27 | <td class='online-navigation'><a rel="parent" title="12. Internet Data Handling" | |
28 | href="netdata.html"><img src='../icons/up.png' | |
29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> | |
30 | <td class='online-navigation'><a rel="next" title="12.4.1 Mailbox Objects" | |
31 | href="mailbox-objects.html"><img src='../icons/next.png' | |
32 | border='0' height='32' alt='Next Page' width='32' /></A></td> | |
33 | <td align="center" width="100%">Python Library Reference</td> | |
34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" | |
35 | href="contents.html"><img src='../icons/contents.png' | |
36 | border='0' height='32' alt='Contents' width='32' /></A></td> | |
37 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' | |
38 | border='0' height='32' alt='Module Index' width='32' /></a></td> | |
39 | <td class='online-navigation'><a rel="index" title="Index" | |
40 | href="genindex.html"><img src='../icons/index.png' | |
41 | border='0' height='32' alt='Index' width='32' /></A></td> | |
42 | </tr></table> | |
43 | <div class='online-navigation'> | |
44 | <b class="navlabel">Previous:</b> | |
45 | <a class="sectref" rel="prev" href="module-mailcap.html">12.3 mailcap </A> | |
46 | <b class="navlabel">Up:</b> | |
47 | <a class="sectref" rel="parent" href="netdata.html">12. Internet Data Handling</A> | |
48 | <b class="navlabel">Next:</b> | |
49 | <a class="sectref" rel="next" href="mailbox-objects.html">12.4.1 Mailbox Objects</A> | |
50 | </div> | |
51 | <hr /></div> | |
52 | </DIV> | |
53 | <!--End of Navigation Panel--> | |
54 | ||
55 | <H1><A NAME="SECTION0014400000000000000000"> | |
56 | 12.4 <tt class="module">mailbox</tt> -- | |
57 | Read various mailbox formats</A> | |
58 | </H1> | |
59 | ||
60 | <P> | |
61 | <A NAME="module-mailbox"></A> | |
62 | ||
63 | <P> | |
64 | This module defines a number of classes that allow easy and uniform | |
65 | access to mail messages in a (<span class="Unix">Unix</span>) mailbox. | |
66 | ||
67 | <P> | |
68 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
69 | <td><nobr><b><span class="typelabel">class</span> <tt id='l2h-3964' xml:id='l2h-3964' class="class">UnixMailbox</tt></b>(</nobr></td> | |
70 | <td><var>fp</var><big>[</big><var>, factory</var><big>]</big><var></var>)</td></tr></table></dt> | |
71 | <dd> | |
72 | Access to a classic <span class="Unix">Unix</span>-style mailbox, where all messages are | |
73 | contained in a single file and separated by "<tt class="samp">From </tt>" (a.k.a. "<tt class="samp">From_</tt>") lines. The file object <var>fp</var> points to the | |
74 | mailbox file. The optional <var>factory</var> parameter is a callable that | |
75 | should create new message objects. <var>factory</var> is called with one | |
76 | argument, <var>fp</var>, by the <tt class="method">next()</tt> method of the mailbox | |
77 | object. The default is the <tt class="class">rfc822.Message</tt> class (see the | |
78 | <tt class="module"><a href="module-rfc822.html">rfc822</a></tt> module - and the note below). | |
79 | ||
80 | <P> | |
81 | <div class="note"><b class="label">Note:</b> | |
82 | For reasons of this module's internal implementation, you will | |
83 | probably want to open the <var>fp</var> object in binary mode. This is | |
84 | especially important on Windows. | |
85 | </div> | |
86 | ||
87 | <P> | |
88 | For maximum portability, messages in a <span class="Unix">Unix</span>-style mailbox are | |
89 | separated by any line that begins exactly with the string <code>'From | |
90 | '</code> (note the trailing space) if preceded by exactly two newlines. | |
91 | Because of the wide range of variations in practice, nothing else on | |
92 | the From_ line should be considered. However, the current | |
93 | implementation doesn't check for the leading two newlines. This is | |
94 | fine for most applications. | |
95 | ||
96 | <P> | |
97 | The <tt class="class">UnixMailbox</tt> class implements a stricter version of | |
98 | From_ line checking, using a regular expression that usually matches | |
99 | From_ delimiters correctly. It treats only lines of the form | |
100 | "<tt class="samp">From <var>name</var> <var>time</var></tt>" as delimiters. For maximum portability, | |
101 | use the <tt class="class">PortableUnixMailbox</tt> class instead. This class is | |
102 | identical to <tt class="class">UnixMailbox</tt> except that individual messages are | |
103 | separated by only "<tt class="samp">From </tt>" lines. | |
104 | ||
105 | <P> | |
106 | For more information, see | |
107 | <em class="citetitle"><a | |
108 | href="http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html" | |
109 | title="Configuring | |
110 | Netscape Mail on Unix: Why the Content-Length Format is Bad" | |
111 | >Configuring | |
112 | Netscape Mail on <span class="Unix">Unix</span>: Why the Content-Length Format is Bad</a></em>. | |
113 | </dl> | |
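<P>
The difference between the strict and the portable check can be sketched
as follows. This is only an illustration: the regular expression is a
simplified stand-in written for this example, not the expression the module
actually uses, and the function names are hypothetical.

```python
import re

# Simplified, assumed pattern for a "From name time" line such as
# "From alice Sat Jan  1 00:00:00 2005". Not the module's own expression.
STRICT_FROM = re.compile(
    r"From \S+ +\w{3} +\w{3} +\d{1,2} +\d{1,2}:\d{2}(:\d{2})? +\d{4}\s*$"
)


def is_strict_delimiter(line):
    """Strict check: the line must look like 'From name time'."""
    return STRICT_FROM.match(line) is not None


def is_portable_delimiter(line):
    """Portable check: only the leading 'From ' (with space) matters."""
    return line.startswith("From ")


for candidate in (
    "From alice Sat Jan  1 00:00:00 2005\n",
    "From bob\n",
    ">From the start\n",
):
    print(repr(candidate),
          is_strict_delimiter(candidate),
          is_portable_delimiter(candidate))
```

A line such as <code>'From bob'</code> is accepted only by the portable
check, which is why unquoted body lines beginning with <code>'From '</code>
are a problem for the portable variant.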
114 | ||
115 | <P> | |
116 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
117 | <td><nobr><b><span class="typelabel">class</span> <tt id='l2h-3965' xml:id='l2h-3965' class="class">PortableUnixMailbox</tt></b>(</nobr></td> | |
118 | <td><var>fp</var><big>[</big><var>, factory</var><big>]</big><var></var>)</td></tr></table></dt> | |
119 | <dd> | |
120 | A less-strict version of <tt class="class">UnixMailbox</tt>, which considers only the | |
121 | "<tt class="samp">From </tt>" at the beginning of the line separating messages. The | |
122 | ``<var>name</var> <var>time</var>'' portion of the From line is ignored, to | |
123 | protect against some variations that are observed in practice. This | |
124 | works since lines in the message which begin with <code>'From '</code> are | |
125 | quoted by mail handling software at delivery time. | |
126 | </dl> | |
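<P>
As a rough sketch of the portable rule, the following helper splits a
mailbox file into messages at every line that begins with
<code>'From '</code>. The helper name and the sample data are made up for
this example, and the code does not use the <tt class="module">mailbox</tt>
module itself.

```python
import io


def split_portable_mbox(fp):
    """Yield each message in fp as a list of lines (From_ line included).

    Hypothetical helper: a new message starts at every line that begins
    with 'From ' (note the trailing space); the rest of that line is
    ignored.
    """
    current = None
    for line in fp:
        if line.startswith("From "):
            if current is not None:
                yield current
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        yield current


sample = io.StringIO(
    "From alice Sat Jan  1 00:00:00 2005\n"
    "Subject: first\n"
    "\n"
    ">From the start, quoting keeps this line in the body.\n"
    "\n"
    "From bob\n"
    "Subject: second\n"
    "\n"
    "body\n"
)
messages = list(split_portable_mbox(sample))
print(len(messages))
```

The quoted <code>'&gt;From '</code> line stays inside the first message
because it no longer begins with <code>'From '</code>.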
127 | ||
128 | <P> | |
129 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
130 | <td><nobr><b><span class="typelabel">class</span> <tt id='l2h-3966' xml:id='l2h-3966' class="class">MmdfMailbox</tt></b>(</nobr></td> | |
131 | <td><var>fp</var><big>[</big><var>, factory</var><big>]</big><var></var>)</td></tr></table></dt> | |
132 | <dd> | |
133 | Access an MMDF-style mailbox, where all messages are contained | |
134 | in a single file and separated by lines consisting of 4 control-A | |
135 | characters. The file object <var>fp</var> points to the mailbox file. | |
136 | Optional <var>factory</var> is as with the <tt class="class">UnixMailbox</tt> class. | |
137 | </dl> | |
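<P>
The delimiter handling can be sketched as below. The sketch assumes that
each message is enclosed by one delimiter line before it and one after it;
the helper name and the sample data are hypothetical, and the code does not
use the <tt class="module">mailbox</tt> module itself.

```python
import io

MMDF_DELIMITER = "\x01\x01\x01\x01"  # four control-A characters


def split_mmdf(fp):
    """Yield the text of each message found between delimiter lines.

    Assumption: every message has an opening and a closing delimiter line.
    """
    current = None
    for line in fp:
        if line.rstrip("\r\n") == MMDF_DELIMITER:
            if current is None:
                current = []            # opening delimiter
            else:
                yield "".join(current)  # closing delimiter
                current = None
        elif current is not None:
            current.append(line)


sample = io.StringIO(
    "\x01\x01\x01\x01\n"
    "Subject: one\n\nhello\n"
    "\x01\x01\x01\x01\n"
    "\x01\x01\x01\x01\n"
    "Subject: two\n\nworld\n"
    "\x01\x01\x01\x01\n"
)
mmdf_messages = list(split_mmdf(sample))
print(len(mmdf_messages))
```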
138 | ||
139 | <P> | |
140 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
141 | <td><nobr><b><span class="typelabel">class</span> <tt id='l2h-3967' xml:id='l2h-3967' class="class">MHMailbox</tt></b>(</nobr></td> | |
142 | <td><var>dirname</var><big>[</big><var>, factory</var><big>]</big><var></var>)</td></tr></table></dt> | |
143 | <dd> | |
144 | Access an MH mailbox, a directory with each message in a separate | |
145 | file with a numeric name. | |
146 | The name of the mailbox directory is passed in <var>dirname</var>. | |
147 | <var>factory</var> is as with the <tt class="class">UnixMailbox</tt> class. | |
148 | </dl> | |
149 | ||
150 | <P> | |
151 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
152 | <td><nobr><b><span class="typelabel">class</span> <tt id='l2h-3968' xml:id='l2h-3968' class="class">Maildir</tt></b>(</nobr></td> | |
153 | <td><var>dirname</var><big>[</big><var>, factory</var><big>]</big><var></var>)</td></tr></table></dt> | |
154 | <dd> | |
155 | Access a Qmail mail directory. All new and current mail for the | |
156 | mailbox specified by <var>dirname</var> is made available. | |
157 | <var>factory</var> is as with the <tt class="class">UnixMailbox</tt> class. | |
158 | </dl> | |
159 | ||
160 | <P> | |
161 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
162 | <td><nobr><b><span class="typelabel">class</span> <tt id='l2h-3969' xml:id='l2h-3969' class="class">BabylMailbox</tt></b>(</nobr></td> | |
163 | <td><var>fp</var><big>[</big><var>, factory</var><big>]</big><var></var>)</td></tr></table></dt> | |
164 | <dd> | |
165 | Access a Babyl mailbox, which is similar to an MMDF mailbox. In | |
166 | Babyl format, each message has two sets of headers, the | |
167 | <em>original</em> headers and the <em>visible</em> headers. The original | |
168 | headers appear before a line containing only <code>'*** EOOH ***'</code> | |
169 | (End-Of-Original-Headers) and the visible headers appear after the | |
170 | <code>EOOH</code> line. Babyl-compliant mail readers will show you only the | |
171 | visible headers, and <tt class="class">BabylMailbox</tt> objects will return messages | |
172 | containing only the visible headers. You'll have to do your own | |
173 | parsing of the mailbox file to get at the original headers. Mail | |
174 | messages start with the EOOH line and end with a line containing only | |
175 | <code>'\037\014'</code>. <var>factory</var> is as with the | |
176 | <tt class="class">UnixMailbox</tt> class. | |
177 | </dl> | |
178 | ||
179 | <P> | |
180 | Note that because the <tt class="module"><a href="module-rfc822.html">rfc822</a></tt> module is deprecated, it is | |
181 | recommended that you use the <tt class="module"><a href="module-email.html">email</a></tt> package to create | |
182 | message objects from a mailbox. (The default can't be changed for | |
183 | backwards compatibility reasons.) The safest way to do this is with | |
184 | a bit of code: | |
185 | ||
186 | <P> | |
187 | <div class="verbatim"><pre> | |
188 | import email | |
189 | import email.Errors | |
190 | import mailbox | |
191 | ||
192 | def msgfactory(fp): | |
193 | try: | |
194 | return email.message_from_file(fp) | |
195 | except email.Errors.MessageParseError: | |
196 | # Don't return None since that will | |
197 | # stop the mailbox iterator | |
198 | return '' | |
199 | ||
200 | mbox = mailbox.UnixMailbox(fp, msgfactory) | |
201 | </pre></div> | |
202 | ||
203 | <P> | |
204 | The above wrapper is defensive against ill-formed MIME messages in the | |
205 | mailbox, but you have to be prepared to receive the empty string from | |
206 | the mailbox's <tt class="function">next()</tt> method. On the other hand, if you | |
207 | know your mailbox contains only well-formed MIME messages, you can | |
208 | simplify this to: | |
209 | ||
210 | <P> | |
211 | <div class="verbatim"><pre> | |
212 | import email | |
213 | import mailbox | |
214 | ||
215 | mbox = mailbox.UnixMailbox(fp, email.message_from_file) | |
216 | </pre></div> | |
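<P>
When the defensive factory is used, the consuming loop has to skip the
empty-string placeholders. A minimal sketch of such a loop follows. It
assumes that <tt class="method">next()</tt> returns <code>None</code> once
the mailbox is exhausted; <code>StubMailbox</code> and
<code>iter_good_messages</code> are hypothetical names that exist only so
the example can run on its own.

```python
import email


def iter_good_messages(mbox):
    """Yield parsed messages from mbox, skipping '' placeholders."""
    while True:
        msg = mbox.next()
        if msg is None:   # assumed end-of-mailbox signal
            break
        if msg == '':     # the factory could not parse this message
            continue
        yield msg


class StubMailbox:
    """Stand-in for a mailbox object: replays prepared next() results."""

    def __init__(self, results):
        self._results = iter(results)

    def next(self):
        return next(self._results, None)


parsed = email.message_from_string("Subject: hello\n\nbody\n")
good = list(iter_good_messages(StubMailbox([parsed, '', parsed])))
print(len(good))
```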
217 | ||
218 | <P> | |
219 | <div class="seealso"> | |
220 | <p class="heading">See Also:</p> | |
221 | ||
222 | <dl compact="compact" class="seetitle"> | |
223 | <dt><em class="citetitle"><a href="http://www.qmail.org/man/man5/mbox.html" | |
224 | >mbox - | |
225 | file containing mail messages</a></em></dt> | |
226 | <dd>Description of the | |
227 | traditional ``mbox'' mailbox format.</dd> | |
228 | </dl> | |
229 | <dl compact="compact" class="seetitle"> | |
230 | <dt><em class="citetitle"><a href="http://www.qmail.org/man/man5/maildir.html" | |
231 | >maildir - | |
232 | directory for incoming mail messages</a></em></dt> | |
233 | <dd>Description of the | |
234 | ``maildir'' mailbox format.</dd> | |
235 | </dl> | |
236 | <dl compact="compact" class="seetitle"> | |
237 | <dt><em class="citetitle"><a href="http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html" | |
238 | >Configuring | |
239 | Netscape Mail on <span class="Unix">Unix</span>: Why the Content-Length Format is | |
240 | Bad</a></em></dt> | |
241 | <dd>A description of problems with relying on the | |
242 | <span class="mailheader">Content-Length:</span> header for messages stored in | |
243 | mailbox files.</dd> | |
244 | </dl> | |
245 | </div> | |
246 | ||
247 | <P> | |
248 | ||
249 | <p><br /></p><hr class='online-navigation' /> | |
250 | <div class='online-navigation'> | |
251 | <!--Table of Child-Links--> | |
252 | <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></a> | |
253 | ||
254 | <UL CLASS="ChildLinks"> | |
255 | <LI><A href="mailbox-objects.html">12.4.1 Mailbox Objects</a> | |
256 | </ul> | |
257 | <!--End of Table of Child-Links--> | |
258 | </div> | |
259 | ||
260 | <DIV CLASS="navigation"> | |
293 | <hr /> | |
294 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> | |
295 | </DIV> | |
296 | <!--End of Navigation Panel--> | |
297 | <ADDRESS> | |
298 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. | |
299 | </ADDRESS> | |
300 | </BODY> | |
301 | </HTML> |