Commit | Line | Data |
---|---|---|
86530b38 AT |
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
2 | <html> | |
3 | <head> | |
4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> | |
5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> | |
6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> | |
7 | <link rel="first" href="lib.html" title='Python Library Reference' /> | |
8 | <link rel='contents' href='contents.html' title="Contents" /> | |
9 | <link rel='index' href='genindex.html' title='Index' /> | |
10 | <link rel='last' href='about.html' title='About this document...' /> | |
11 | <link rel='help' href='about.html' title='About this document...' /> | |
12 | <link rel="next" href="module-difflib.html" /> | |
13 | <link rel="prev" href="module-re.html" /> | |
14 | <link rel="parent" href="strings.html" /> | |
16 | <meta name='aesop' content='information' /> | |
17 | <title>4.3 struct -- Interpret strings as packed binary data</title> | |
18 | </head> | |
19 | <body> | |
20 | <DIV CLASS="navigation"> | |
21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> | |
22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> | |
23 | <tr> | |
24 | <td class='online-navigation'><a rel="prev" title="4.2.6 Examples" | |
25 | href="node118.html"><img src='../icons/previous.png' | |
26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> | |
27 | <td class='online-navigation'><a rel="parent" title="4. String Services" | |
28 | href="strings.html"><img src='../icons/up.png' | |
29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> | |
30 | <td class='online-navigation'><a rel="next" title="4.4 difflib " | |
31 | href="module-difflib.html"><img src='../icons/next.png' | |
32 | border='0' height='32' alt='Next Page' width='32' /></A></td> | |
33 | <td align="center" width="100%">Python Library Reference</td> | |
34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" | |
35 | href="contents.html"><img src='../icons/contents.png' | |
36 | border='0' height='32' alt='Contents' width='32' /></A></td> | |
37 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' | |
38 | border='0' height='32' alt='Module Index' width='32' /></a></td> | |
39 | <td class='online-navigation'><a rel="index" title="Index" | |
40 | href="genindex.html"><img src='../icons/index.png' | |
41 | border='0' height='32' alt='Index' width='32' /></A></td> | |
42 | </tr></table> | |
43 | <div class='online-navigation'> | |
44 | <b class="navlabel">Previous:</b> | |
45 | <a class="sectref" rel="prev" href="node118.html">4.2.6 Examples</A> | |
46 | <b class="navlabel">Up:</b> | |
47 | <a class="sectref" rel="parent" href="strings.html">4. String Services</A> | |
48 | <b class="navlabel">Next:</b> | |
49 | <a class="sectref" rel="next" href="module-difflib.html">4.4 difflib </A> | |
50 | </div> | |
51 | <hr /></div> | |
52 | </DIV> | |
53 | <!--End of Navigation Panel--> | |
54 | ||
55 | <H1><A NAME="SECTION006300000000000000000"> | |
56 | 4.3 <tt class="module">struct</tt> -- | |
57 | Interpret strings as packed binary data</A> | |
58 | </H1> | |
59 | <A NAME="module-struct"></A> | |
60 | <P> | |
61 | ||
62 | <P> | |
63 | <a id='l2h-916' xml:id='l2h-916'></a><a id='l2h-917' xml:id='l2h-917'></a> | |
64 | <P> | |
65 | This module performs conversions between Python values and C | |
66 | structs represented as Python strings. It uses <i class="dfn">format strings</i> | |
67 | (explained below) as compact descriptions of the layout of the C | |
68 | structs and the intended conversion to/from Python values. This can | |
69 | be used in handling binary data stored in files or from network | |
70 | connections, among other sources. | |
71 | ||
72 | <P> | |
73 | The module defines the following exception and functions: | |
74 | ||
75 | <P> | |
76 | <dl><dt><b><span class="typelabel">exception</span> <tt id='l2h-918' xml:id='l2h-918' class="exception">error</tt></b></dt> | |
77 | <dd> | |
78 | Exception raised on various occasions; argument is a string | |
79 | describing what is wrong. | |
80 | </dd></dl> | |
81 | ||
82 | <P> | |
83 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
84 | <td><nobr><b><tt id='l2h-919' xml:id='l2h-919' class="function">pack</tt></b>(</nobr></td> | |
85 | <td><var>fmt, v1, v2, ...</var>)</td></tr></table></dt> | |
86 | <dd> | |
87 | Return a string containing the values | |
88 | <code><var>v1</var>, <var>v2</var>, ...</code> packed according to the given | |
89 | format. The arguments must match the values required by the format | |
90 | exactly. | |
91 | </dl> | |
92 | ||
93 | <P> | |
94 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
95 | <td><nobr><b><tt id='l2h-920' xml:id='l2h-920' class="function">unpack</tt></b>(</nobr></td> | |
96 | <td><var>fmt, string</var>)</td></tr></table></dt> | |
97 | <dd> | |
98 | Unpack the string (presumably packed by <code>pack(<var>fmt</var>, | |
99 | ...)</code>) according to the given format. The result is a | |
100 | tuple even if it contains exactly one item. The string must contain | |
101 | exactly the amount of data required by the format | |
102 | (<code>len(<var>string</var>)</code> must equal <code>calcsize(<var>fmt</var>)</code>). | |
103 | </dl> | |
104 | ||
105 | <P> | |
106 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> | |
107 | <td><nobr><b><tt id='l2h-921' xml:id='l2h-921' class="function">calcsize</tt></b>(</nobr></td> | |
108 | <td><var>fmt</var>)</td></tr></table></dt> | |
109 | <dd> | |
110 | Return the size of the struct (and hence of the string) | |
111 | corresponding to the given format. | |
112 | </dl> | |
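<P>
The three functions above can be sketched as a simple round trip. This is a minimal illustration that assumes a current Python 3 interpreter, where packed data is a <code>bytes</code> object rather than the string described here; the variable names are illustrative only.

```python
import struct

fmt = 'hhl'                      # two shorts and a long, native layout

# pack() needs exactly one argument per format character.
packed = struct.pack(fmt, 1, 2, 3)

# The packed data is always calcsize(fmt) bytes long.
size = struct.calcsize(fmt)

# unpack() returns a tuple, even when it holds a single item.
values = struct.unpack(fmt, packed)
single = struct.unpack('h', struct.pack('h', 7))

# Data of the wrong length is rejected with struct.error.
size_error = None
try:
    struct.unpack(fmt, packed[:-1])
except struct.error as exc:
    size_error = exc
```

<P>
The actual value of <code>size</code> depends on the platform, which is why the sketch compares against <code>calcsize()</code> instead of a fixed number.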
113 | ||
114 | <P> | |
115 | Format characters have the following meaning; the conversion between | |
116 | C and Python values should be obvious given their types: | |
117 | ||
118 | <P> | |
119 | <div class="center"><table class="realtable"> | |
120 | <thead> | |
121 | <tr> | |
122 | <th class="center">Format</th> | |
123 | <th class="left" >C Type</th> | |
124 | <th class="left" >Python</th> | |
125 | <th class="center">Notes</th> | |
126 | </tr> | |
127 | </thead> | |
128 | <tbody> | |
129 | <tr><td class="center" valign="baseline"><samp>x</samp></td> | |
130 | <td class="left" >pad byte</td> | |
131 | <td class="left" >no value</td> | |
132 | <td class="center"></td></tr> | |
133 | <tr><td class="center" valign="baseline"><samp>c</samp></td> | |
134 | <td class="left" ><tt class="ctype">char</tt></td> | |
135 | <td class="left" >string of length 1</td> | |
136 | <td class="center"></td></tr> | |
137 | <tr><td class="center" valign="baseline"><samp>b</samp></td> | |
138 | <td class="left" ><tt class="ctype">signed char</tt></td> | |
139 | <td class="left" >integer</td> | |
140 | <td class="center"></td></tr> | |
141 | <tr><td class="center" valign="baseline"><samp>B</samp></td> | |
142 | <td class="left" ><tt class="ctype">unsigned char</tt></td> | |
143 | <td class="left" >integer</td> | |
144 | <td class="center"></td></tr> | |
145 | <tr><td class="center" valign="baseline"><samp>h</samp></td> | |
146 | <td class="left" ><tt class="ctype">short</tt></td> | |
147 | <td class="left" >integer</td> | |
148 | <td class="center"></td></tr> | |
149 | <tr><td class="center" valign="baseline"><samp>H</samp></td> | |
150 | <td class="left" ><tt class="ctype">unsigned short</tt></td> | |
151 | <td class="left" >integer</td> | |
152 | <td class="center"></td></tr> | |
153 | <tr><td class="center" valign="baseline"><samp>i</samp></td> | |
154 | <td class="left" ><tt class="ctype">int</tt></td> | |
155 | <td class="left" >integer</td> | |
156 | <td class="center"></td></tr> | |
157 | <tr><td class="center" valign="baseline"><samp>I</samp></td> | |
158 | <td class="left" ><tt class="ctype">unsigned int</tt></td> | |
159 | <td class="left" >long</td> | |
160 | <td class="center"></td></tr> | |
161 | <tr><td class="center" valign="baseline"><samp>l</samp></td> | |
162 | <td class="left" ><tt class="ctype">long</tt></td> | |
163 | <td class="left" >integer</td> | |
164 | <td class="center"></td></tr> | |
165 | <tr><td class="center" valign="baseline"><samp>L</samp></td> | |
166 | <td class="left" ><tt class="ctype">unsigned long</tt></td> | |
167 | <td class="left" >long</td> | |
168 | <td class="center"></td></tr> | |
169 | <tr><td class="center" valign="baseline"><samp>q</samp></td> | |
170 | <td class="left" ><tt class="ctype">long long</tt></td> | |
171 | <td class="left" >long</td> | |
172 | <td class="center">(1)</td></tr> | |
173 | <tr><td class="center" valign="baseline"><samp>Q</samp></td> | |
174 | <td class="left" ><tt class="ctype">unsigned long long</tt></td> | |
175 | <td class="left" >long</td> | |
176 | <td class="center">(1)</td></tr> | |
177 | <tr><td class="center" valign="baseline"><samp>f</samp></td> | |
178 | <td class="left" ><tt class="ctype">float</tt></td> | |
179 | <td class="left" >float</td> | |
180 | <td class="center"></td></tr> | |
181 | <tr><td class="center" valign="baseline"><samp>d</samp></td> | |
182 | <td class="left" ><tt class="ctype">double</tt></td> | |
183 | <td class="left" >float</td> | |
184 | <td class="center"></td></tr> | |
185 | <tr><td class="center" valign="baseline"><samp>s</samp></td> | |
186 | <td class="left" ><tt class="ctype">char[]</tt></td> | |
187 | <td class="left" >string</td> | |
188 | <td class="center"></td></tr> | |
189 | <tr><td class="center" valign="baseline"><samp>p</samp></td> | |
190 | <td class="left" ><tt class="ctype">char[]</tt></td> | |
191 | <td class="left" >string</td> | |
192 | <td class="center"></td></tr> | |
193 | <tr><td class="center" valign="baseline"><samp>P</samp></td> | |
194 | <td class="left" ><tt class="ctype">void *</tt></td> | |
195 | <td class="left" >integer</td> | |
196 | <td class="center"></td></tr></tbody> | |
197 | </table></div> | |
198 | ||
199 | <P> | |
200 | Notes: | |
201 | ||
202 | <P> | |
203 | <DL> | |
204 | <DT><STRONG>(1)</STRONG></DT> | |
205 | <DD>The "<tt class="character">q</tt>" and "<tt class="character">Q</tt>" conversion codes are available in | |
206 | native mode only if the platform C compiler supports C <tt class="ctype">long long</tt>, | |
207 | or, on Windows, <tt class="ctype">__int64</tt>. They are always available in standard | |
208 | modes. | |
209 | ||
210 | <span class="versionnote">New in version 2.2.</span> | |
211 | ||
212 | </DD> | |
213 | </DL> | |
214 | ||
215 | <P> | |
216 | A format character may be preceded by an integral repeat count. For | |
217 | example, the format string <code>'4h'</code> means exactly the same as | |
218 | <code>'hhhh'</code>. | |
219 | ||
220 | <P> | |
221 | Whitespace characters between formats are ignored; however, a count and | |
222 | its format character must not be separated by whitespace. | |
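<P>
The repeat-count and whitespace rules can be checked directly. The sketch below assumes a current Python 3 interpreter; all variable names are illustrative.

```python
import struct

# A repeat count is shorthand for writing the character several times.
same_size = struct.calcsize('4h') == struct.calcsize('hhhh')
same_data = struct.pack('4h', 1, 2, 3, 4) == struct.pack('hhhh', 1, 2, 3, 4)

# Whitespace between format items is ignored.
spaced_ok = struct.calcsize('4h 2b') == struct.calcsize('4h2b')

# Whitespace between a count and its format character is an error.
split_count_rejected = False
try:
    struct.calcsize('4 h')
except struct.error:
    split_count_rejected = True
```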
223 | ||
224 | <P> | |
225 | For the "<tt class="character">s</tt>" format character, the count is interpreted as the | |
226 | size of the string, not a repeat count as for the other format | |
227 | characters; for example, <code>'10s'</code> means a single 10-byte string, while | |
228 | <code>'10c'</code> means 10 characters. For packing, the string is | |
229 | truncated or padded with null bytes as appropriate to make it fit. | |
230 | For unpacking, the resulting string always has exactly the specified | |
231 | number of bytes. As a special case, <code>'0s'</code> means a single, empty | |
232 | string (while <code>'0c'</code> means 0 characters). | |
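<P>
A short sketch of how the count behaves for "s" compared with "c". It assumes a current Python 3 interpreter, where the values are <code>bytes</code> objects rather than the strings described here.

```python
import struct

padded = struct.pack('10s', b'spam')      # short input is padded with null bytes
truncated = struct.pack('3s', b'spam')    # long input is cut to fit

(field,) = struct.unpack('10s', padded)   # always exactly 10 bytes, nulls included

chars = struct.unpack('4c', b'spam')      # for 'c' the count repeats: four 1-byte items

empty_string = struct.unpack('0s', b'')   # a single, empty string
no_chars = struct.unpack('0c', b'')       # no items at all
```

<P>
Because unpacking keeps the padding, a caller that wants the original text back has to strip the trailing null bytes itself.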
233 | ||
234 | <P> | |
235 | The "<tt class="character">p</tt>" format character encodes a "Pascal string", meaning | |
236 | a short variable-length string stored in a fixed number of bytes. | |
237 | The count is the total number of bytes stored. The first byte stored is | |
238 | the length of the string, or 255, whichever is smaller. The bytes | |
239 | of the string follow. If the string passed in to <tt class="function">pack()</tt> is too | |
240 | long (longer than the count minus 1), only the leading count-1 bytes of the | |
241 | string are stored. If the string is shorter than count-1, it is padded | |
242 | with null bytes so that exactly count bytes in all are used. Note that | |
243 | for <tt class="function">unpack()</tt>, the "<tt class="character">p</tt>" format character consumes count | |
244 | bytes, but that the string returned can never contain more than 255 | |
245 | characters. | |
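<P>
The "p" rules above can be illustrated as follows. This is a sketch assuming a current Python 3 interpreter (values are <code>bytes</code> objects); the variable names are illustrative.

```python
import struct

# Count 6: one length byte, three data bytes, two null pad bytes.
short_input = struct.pack('6p', b'abc')

# Count 4: only the leading count-1 = 3 bytes of the input are stored.
long_input = struct.pack('4p', b'abcdef')

# Unpacking consumes all 6 bytes but returns only the stored string.
(recovered,) = struct.unpack('6p', short_input)

# The length byte cannot exceed 255, so at most 255 bytes come back.
big = struct.pack('300p', b'x' * 299)
(capped,) = struct.unpack('300p', big)
```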
246 | ||
247 | <P> | |
248 | For the "<tt class="character">I</tt>", "<tt class="character">L</tt>", "<tt class="character">q</tt>" and "<tt class="character">Q</tt>" | |
249 | format characters, the return value is a Python long integer. | |
250 | ||
251 | <P> | |
252 | For the "<tt class="character">P</tt>" format character, the return value is a Python | |
253 | integer or long integer, depending on the size needed to hold a | |
254 | pointer when it has been cast to an integer type. A <tt class="constant">NULL</tt> pointer will | |
255 | always be returned as the Python integer <code>0</code>. When packing pointer-sized | |
256 | values, Python integer or long integer objects may be used. For | |
257 | example, the Alpha and Merced processors use 64-bit pointer values, | |
258 | meaning a Python long integer will be used to hold the pointer; other | |
259 | platforms use 32-bit pointers and will use a Python integer. | |
260 | ||
261 | <P> | |
262 | By default, C numbers are represented in the machine's native format | |
263 | and byte order, and properly aligned by skipping pad bytes if | |
264 | necessary (according to the rules used by the C compiler). | |
265 | ||
266 | <P> | |
267 | Alternatively, the first character of the format string can be used to | |
268 | indicate the byte order, size and alignment of the packed data, | |
269 | according to the following table: | |
270 | ||
271 | <P> | |
272 | <div class="center"><table class="realtable"> | |
273 | <thead> | |
274 | <tr> | |
275 | <th class="center">Character</th> | |
276 | <th class="left" >Byte order</th> | |
277 | <th class="left" >Size and alignment</th> | |
278 | </tr> | |
279 | </thead> | |
280 | <tbody> | |
281 | <tr><td class="center" valign="baseline"><samp>@</samp></td> | |
282 | <td class="left" >native</td> | |
283 | <td class="left" >native</td></tr> | |
284 | <tr><td class="center" valign="baseline"><samp>=</samp></td> | |
285 | <td class="left" >native</td> | |
286 | <td class="left" >standard</td></tr> | |
287 | <tr><td class="center" valign="baseline"><samp>&lt;</samp></td> | |
288 | <td class="left" >little-endian</td> | |
289 | <td class="left" >standard</td></tr> | |
290 | <tr><td class="center" valign="baseline"><samp>&gt;</samp></td> | |
291 | <td class="left" >big-endian</td> | |
292 | <td class="left" >standard</td></tr> | |
293 | <tr><td class="center" valign="baseline"><samp>!</samp></td> | |
294 | <td class="left" >network (= big-endian)</td> | |
295 | <td class="left" >standard</td></tr></tbody> | |
296 | </table></div> | |
297 | ||
298 | <P> | |
299 | If the first character is not one of these, "<tt class="character">@</tt>" is assumed. | |
300 | ||
301 | <P> | |
302 | Native byte order is big-endian or little-endian, depending on the | |
303 | host system. For example, Motorola and Sun processors are big-endian; | |
304 | Intel and DEC processors are little-endian. | |
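<P>
The effect of the byte order characters can be seen by packing the same value several ways. The sketch assumes a current Python 3 interpreter; <code>sys.byteorder</code> is used only to check which order the host uses.

```python
import struct
import sys

little = struct.pack('<h', 1)     # least significant byte first
big = struct.pack('>h', 1)        # most significant byte first
network = struct.pack('!h', 1)    # network order is big-endian

# '=' uses whichever order the host machine uses.
native = struct.pack('=h', 1)
expected_native = little if sys.byteorder == 'little' else big
```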
305 | ||
306 | <P> | |
307 | Native size and alignment are determined using the C compiler's | |
308 | <tt class="keyword">sizeof</tt> expression. This is always combined with native byte | |
309 | order. | |
310 | ||
311 | <P> | |
312 | Standard size and alignment are as follows: no alignment is required | |
313 | for any type (so you have to use pad bytes); | |
314 | <tt class="ctype">short</tt> is 2 bytes; | |
315 | <tt class="ctype">int</tt> and <tt class="ctype">long</tt> are 4 bytes; | |
316 | <tt class="ctype">long long</tt> (<tt class="ctype">__int64</tt> on Windows) is 8 bytes; | |
317 | <tt class="ctype">float</tt> and <tt class="ctype">double</tt> are 32-bit and 64-bit | |
318 | IEEE floating point numbers, respectively. | |
319 | ||
320 | <P> | |
321 | Note the difference between "<tt class="character">@</tt>" and "<tt class="character">=</tt>": both use | |
322 | native byte order, but the size and alignment of the latter are | |
323 | standardized. | |
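<P>
The standard sizes, and the difference between "@" and "=", can be checked with <code>calcsize()</code>. A minimal sketch, assuming a current Python 3 interpreter; the native result is platform dependent, so no value is given for it.

```python
import struct

# Standard sizes are the same on every platform.
standard_sizes = dict((code, struct.calcsize('=' + code)) for code in 'hilqfd')

# Native mode may insert pad bytes and may use a wider long.
native_hl = struct.calcsize('@hl')

# Standard mode never pads: 2 bytes for the short plus 4 for the long.
standard_hl = struct.calcsize('=hl')
```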
324 | ||
325 | <P> | |
326 | The form "<tt class="character">!</tt>" is available for those poor souls who claim they | |
327 | can't remember whether network byte order is big-endian or | |
328 | little-endian. | |
329 | ||
330 | <P> | |
331 | There is no way to indicate non-native byte order (force | |
332 | byte-swapping); use the appropriate choice of "<tt class="character">&lt;</tt>" or | |
333 | "<tt class="character">&gt;</tt>". | |
334 | ||
335 | <P> | |
336 | The "<tt class="character">P</tt>" format character is only available for the native | |
337 | byte ordering (selected as the default or with the "<tt class="character">@</tt>" byte | |
338 | order character). The byte order character "<tt class="character">=</tt>" chooses to | |
339 | use little- or big-endian ordering based on the host system. The | |
340 | struct module does not interpret this as native ordering, so the | |
341 | "<tt class="character">P</tt>" format is not available. | |
342 | ||
343 | <P> | |
344 | Examples (all using native byte order, size and alignment, on a | |
345 | big-endian machine): | |
346 | ||
347 | <P> | |
348 | <div class="verbatim"><pre> | |
349 | >>> from struct import * | |
350 | >>> pack('hhl', 1, 2, 3) | |
351 | '\x00\x01\x00\x02\x00\x00\x00\x03' | |
352 | >>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03') | |
353 | (1, 2, 3) | |
354 | >>> calcsize('hhl') | |
355 | 8 | |
356 | </pre></div> | |
357 | ||
358 | <P> | |
359 | Hint: to align the end of a structure to the alignment requirement of | |
360 | a particular type, end the format with the code for that type with a | |
361 | repeat count of zero. For example, the format <code>'llh0l'</code> | |
362 | specifies two pad bytes at the end, assuming longs are aligned on | |
363 | 4-byte boundaries. This only works when native size and alignment are | |
364 | in effect; standard size and alignment do not enforce any alignment. | |
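<P>
The zero-count trick can be measured with <code>calcsize()</code>. A minimal sketch, assuming a current Python 3 interpreter; the amount of native padding depends on the platform's size and alignment for <tt class="ctype">long</tt>, so it is computed instead of assumed.

```python
import struct

unpadded = struct.calcsize('llh')
padded = struct.calcsize('llh0l')

# 2 where longs are 4 bytes and 4-byte aligned; other platforms may differ.
trailing_pad = padded - unpadded

# With standard size and alignment the trailing '0l' adds nothing.
standard_unpadded = struct.calcsize('=llh')
standard_padded = struct.calcsize('=llh0l')
```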
365 | ||
366 | <P> | |
367 | <div class="seealso"> | |
368 | <p class="heading">See Also:</p> | |
369 | ||
370 | <dl compact="compact" class="seemodule"> | |
371 | <dt>Module <b><tt class="module"><a href="module-array.html">array</a></tt>:</b> | |
372 | <dd>Packed binary storage of homogeneous data. | |
373 | </dl> | |
374 | <dl compact="compact" class="seemodule"> | |
375 | <dt>Module <b><tt class="module"><a href="module-xdrlib.html">xdrlib</a></tt>:</b> | |
376 | <dd>Packing and unpacking of XDR data. | |
377 | </dl> | |
378 | </div> | |
379 | ||
380 | <DIV CLASS="navigation"> | |
381 | <div class='online-navigation'> | |
382 | <p></p><hr /> | |
412 | </div> | |
413 | <hr /> | |
414 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> | |
415 | </DIV> | |
416 | <!--End of Navigation Panel--> | |
417 | <ADDRESS> | |
418 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. | |
419 | </ADDRESS> | |
420 | </BODY> | |
421 | </HTML> |