| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
| 2 | <html> |
| 3 | <head> |
| 4 | <link rel="STYLESHEET" href="lib.css" type='text/css' /> |
| 5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> |
| 6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> |
| 7 | <link rel="first" href="lib.html" title='Python Library Reference' /> |
| 8 | <link rel='contents' href='contents.html' title="Contents" /> |
| 9 | <link rel='index' href='genindex.html' title='Index' /> |
| 10 | <link rel='last' href='about.html' title='About this document...' /> |
| 11 | <link rel='help' href='about.html' title='About this document...' /> |
| 12 | <link rel="next" href="module-difflib.html" /> |
| 13 | <link rel="prev" href="module-re.html" /> |
| 14 | <link rel="parent" href="strings.html" /> |
| 16 | <meta name='aesop' content='information' /> |
| 17 | <title>4.3 struct -- Interpret strings as packed binary data</title> |
| 18 | </head> |
| 19 | <body> |
| 20 | <DIV CLASS="navigation"> |
| 21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> |
| 22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> |
| 23 | <tr> |
| 24 | <td class='online-navigation'><a rel="prev" title="4.2.6 Examples" |
| 25 | href="node118.html"><img src='../icons/previous.png' |
| 26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> |
| 27 | <td class='online-navigation'><a rel="parent" title="4. String Services" |
| 28 | href="strings.html"><img src='../icons/up.png' |
| 29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> |
| 30 | <td class='online-navigation'><a rel="next" title="4.4 difflib " |
| 31 | href="module-difflib.html"><img src='../icons/next.png' |
| 32 | border='0' height='32' alt='Next Page' width='32' /></A></td> |
| 33 | <td align="center" width="100%">Python Library Reference</td> |
| 34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" |
| 35 | href="contents.html"><img src='../icons/contents.png' |
| 36 | border='0' height='32' alt='Contents' width='32' /></A></td> |
| 37 | <td class='online-navigation'><a href="modindex.html" title="Module Index"><img src='../icons/modules.png' |
| 38 | border='0' height='32' alt='Module Index' width='32' /></a></td> |
| 39 | <td class='online-navigation'><a rel="index" title="Index" |
| 40 | href="genindex.html"><img src='../icons/index.png' |
| 41 | border='0' height='32' alt='Index' width='32' /></A></td> |
| 42 | </tr></table> |
| 43 | <div class='online-navigation'> |
| 44 | <b class="navlabel">Previous:</b> |
| 45 | <a class="sectref" rel="prev" href="node118.html">4.2.6 Examples</A> |
| 46 | <b class="navlabel">Up:</b> |
| 47 | <a class="sectref" rel="parent" href="strings.html">4. String Services</A> |
| 48 | <b class="navlabel">Next:</b> |
| 49 | <a class="sectref" rel="next" href="module-difflib.html">4.4 difflib </A> |
| 50 | </div> |
| 51 | <hr /></div> |
| 52 | </DIV> |
| 53 | <!--End of Navigation Panel--> |
| 54 | |
| 55 | <H1><A NAME="SECTION006300000000000000000"> |
| 56 | 4.3 <tt class="module">struct</tt> -- |
| 57 | Interpret strings as packed binary data</A> |
| 58 | </H1> |
| 59 | <A NAME="module-struct"></A> |
| 60 | <P> |
| 61 | |
| 62 | <P> |
| 63 | <a id='l2h-916' xml:id='l2h-916'></a><a id='l2h-917' xml:id='l2h-917'></a> |
| 64 | <P> |
| 65 | This module performs conversions between Python values and C |
| 66 | structs represented as Python strings. It uses <i class="dfn">format strings</i> |
| 67 | (explained below) as compact descriptions of the layout of the C |
| 68 | structs and the intended conversion to/from Python values. This can |
| 69 | be used in handling binary data stored in files or from network |
| 70 | connections, among other sources. |
| 71 | |
| 72 | <P> |
| 73 | The module defines the following exception and functions: |
| 74 | |
| 75 | <P> |
| 76 | <dl><dt><b><span class="typelabel">exception</span> <tt id='l2h-918' xml:id='l2h-918' class="exception">error</tt></b></dt> |
| 77 | <dd> |
| 78 | Exception raised when a format string is invalid or when the values or |
| 79 | data supplied do not match it; the argument is a string describing what is wrong. |
| 80 | </dd></dl> |
| 81 | |
| 82 | <P> |
| 83 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 84 | <td><nobr><b><tt id='l2h-919' xml:id='l2h-919' class="function">pack</tt></b>(</nobr></td> |
| 85 | <td><var>fmt, v1, v2, ...</var>)</td></tr></table></dt> |
| 86 | <dd> |
| 87 | Return a string containing the values |
| 88 | <code><var>v1</var>, <var>v2</var>, ...</code> packed according to the given |
| 89 | format. The arguments must match the values required by the format |
| 90 | exactly. |
| 91 | </dl> |
| 92 | |
| 93 | <P> |
| 94 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 95 | <td><nobr><b><tt id='l2h-920' xml:id='l2h-920' class="function">unpack</tt></b>(</nobr></td> |
| 96 | <td><var>fmt, string</var>)</td></tr></table></dt> |
| 97 | <dd> |
| 98 | Unpack the string (presumably packed by <code>pack(<var>fmt</var>, |
| 99 | ...)</code>) according to the given format. The result is a |
| 100 | tuple even if it contains exactly one item. The string must contain |
| 101 | exactly the amount of data required by the format |
| 102 | (<code>len(<var>string</var>)</code> must equal <code>calcsize(<var>fmt</var>)</code>). |
| 103 | </dl> |
| 104 | |
| 105 | <P> |
| 106 | <dl><dt><table cellpadding="0" cellspacing="0"><tr valign="baseline"> |
| 107 | <td><nobr><b><tt id='l2h-921' xml:id='l2h-921' class="function">calcsize</tt></b>(</nobr></td> |
| 108 | <td><var>fmt</var>)</td></tr></table></dt> |
| 109 | <dd> |
| 110 | Return the size of the struct (and hence of the string) |
| 111 | corresponding to the given format. |
| 112 | </dl> |
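
<P>
As a rough illustration of how the three functions fit together, the
sketch below packs three values, checks the size, and unpacks them again.
It assumes a Python 3 interpreter, where packed data is a <tt class="class">bytes</tt>
object rather than a string, and it uses the "<tt class="character">&lt;</tt>" prefix
(described later in this section) so that the result does not depend on the host.

```python
import struct

# Assumes Python 3: packed data is a bytes object.
# '<' selects little-endian byte order with standard sizes.
fmt = '<hhl'

packed = struct.pack(fmt, 1, 2, 3)
size = struct.calcsize(fmt)            # 2 + 2 + 4 bytes
values = struct.unpack(fmt, packed)    # always a tuple

# unpack() rejects input whose length differs from calcsize(fmt).
try:
    struct.unpack(fmt, packed[:-1])
    short_input_rejected = False
except struct.error:
    short_input_rejected = True
```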
| 113 | |
| 114 | <P> |
| 115 | Format characters have the following meaning; the conversion between |
| 116 | C and Python values should be obvious given their types: |
| 117 | |
| 118 | <P> |
| 119 | <div class="center"><table class="realtable"> |
| 120 | <thead> |
| 121 | <tr> |
| 122 | <th class="center">Format</th> |
| 123 | <th class="left" >C Type</th> |
| 124 | <th class="left" >Python</th> |
| 125 | <th class="center">Notes</th> |
| 126 | </tr> |
| 127 | </thead> |
| 128 | <tbody> |
| 129 | <tr><td class="center" valign="baseline"><samp>x</samp></td> |
| 130 | <td class="left" >pad byte</td> |
| 131 | <td class="left" >no value</td> |
| 132 | <td class="center"></td></tr> |
| 133 | <tr><td class="center" valign="baseline"><samp>c</samp></td> |
| 134 | <td class="left" ><tt class="ctype">char</tt></td> |
| 135 | <td class="left" >string of length 1</td> |
| 136 | <td class="center"></td></tr> |
| 137 | <tr><td class="center" valign="baseline"><samp>b</samp></td> |
| 138 | <td class="left" ><tt class="ctype">signed char</tt></td> |
| 139 | <td class="left" >integer</td> |
| 140 | <td class="center"></td></tr> |
| 141 | <tr><td class="center" valign="baseline"><samp>B</samp></td> |
| 142 | <td class="left" ><tt class="ctype">unsigned char</tt></td> |
| 143 | <td class="left" >integer</td> |
| 144 | <td class="center"></td></tr> |
| 145 | <tr><td class="center" valign="baseline"><samp>h</samp></td> |
| 146 | <td class="left" ><tt class="ctype">short</tt></td> |
| 147 | <td class="left" >integer</td> |
| 148 | <td class="center"></td></tr> |
| 149 | <tr><td class="center" valign="baseline"><samp>H</samp></td> |
| 150 | <td class="left" ><tt class="ctype">unsigned short</tt></td> |
| 151 | <td class="left" >integer</td> |
| 152 | <td class="center"></td></tr> |
| 153 | <tr><td class="center" valign="baseline"><samp>i</samp></td> |
| 154 | <td class="left" ><tt class="ctype">int</tt></td> |
| 155 | <td class="left" >integer</td> |
| 156 | <td class="center"></td></tr> |
| 157 | <tr><td class="center" valign="baseline"><samp>I</samp></td> |
| 158 | <td class="left" ><tt class="ctype">unsigned int</tt></td> |
| 159 | <td class="left" >long</td> |
| 160 | <td class="center"></td></tr> |
| 161 | <tr><td class="center" valign="baseline"><samp>l</samp></td> |
| 162 | <td class="left" ><tt class="ctype">long</tt></td> |
| 163 | <td class="left" >integer</td> |
| 164 | <td class="center"></td></tr> |
| 165 | <tr><td class="center" valign="baseline"><samp>L</samp></td> |
| 166 | <td class="left" ><tt class="ctype">unsigned long</tt></td> |
| 167 | <td class="left" >long</td> |
| 168 | <td class="center"></td></tr> |
| 169 | <tr><td class="center" valign="baseline"><samp>q</samp></td> |
| 170 | <td class="left" ><tt class="ctype">long long</tt></td> |
| 171 | <td class="left" >long</td> |
| 172 | <td class="center">(1)</td></tr> |
| 173 | <tr><td class="center" valign="baseline"><samp>Q</samp></td> |
| 174 | <td class="left" ><tt class="ctype">unsigned long long</tt></td> |
| 175 | <td class="left" >long</td> |
| 176 | <td class="center">(1)</td></tr> |
| 177 | <tr><td class="center" valign="baseline"><samp>f</samp></td> |
| 178 | <td class="left" ><tt class="ctype">float</tt></td> |
| 179 | <td class="left" >float</td> |
| 180 | <td class="center"></td></tr> |
| 181 | <tr><td class="center" valign="baseline"><samp>d</samp></td> |
| 182 | <td class="left" ><tt class="ctype">double</tt></td> |
| 183 | <td class="left" >float</td> |
| 184 | <td class="center"></td></tr> |
| 185 | <tr><td class="center" valign="baseline"><samp>s</samp></td> |
| 186 | <td class="left" ><tt class="ctype">char[]</tt></td> |
| 187 | <td class="left" >string</td> |
| 188 | <td class="center"></td></tr> |
| 189 | <tr><td class="center" valign="baseline"><samp>p</samp></td> |
| 190 | <td class="left" ><tt class="ctype">char[]</tt></td> |
| 191 | <td class="left" >string</td> |
| 192 | <td class="center"></td></tr> |
| 193 | <tr><td class="center" valign="baseline"><samp>P</samp></td> |
| 194 | <td class="left" ><tt class="ctype">void *</tt></td> |
| 195 | <td class="left" >integer</td> |
| 196 | <td class="center"></td></tr></tbody> |
| 197 | </table></div> |
| 198 | |
| 199 | <P> |
| 200 | Notes: |
| 201 | |
| 202 | <P> |
| 203 | <DL> |
| 204 | <DT><STRONG>(1)</STRONG></DT> |
| 205 | <DD>The "<tt class="character">q</tt>" and "<tt class="character">Q</tt>" conversion codes are available in |
| 206 | native mode only if the platform C compiler supports C <tt class="ctype">long long</tt>, |
| 207 | or, on Windows, <tt class="ctype">__int64</tt>. They are always available in standard |
| 208 | modes. |
| 209 | |
| 210 | <span class="versionnote">New in version 2.2.</span> |
| 211 | |
| 212 | </DD> |
| 213 | </DL> |
| 214 | |
| 215 | <P> |
| 216 | A format character may be preceded by an integral repeat count. For |
| 217 | example, the format string <code>'4h'</code> means exactly the same as |
| 218 | <code>'hhhh'</code>. |
| 219 | |
| 220 | <P> |
| 221 | Whitespace characters between formats are ignored; however, no whitespace |
| 222 | is allowed between a count and its format character. |
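
<P>
The repeat-count and whitespace rules can be checked with a short sketch.
It assumes a Python 3 interpreter; the variable names are illustrative only.

```python
import struct

# Assumes Python 3. '<' fixes the sizes so the results are host-independent.

# A repeat count is shorthand for writing the format character several times.
same_bytes = struct.pack('<4h', 1, 2, 3, 4) == struct.pack('<hhhh', 1, 2, 3, 4)

# Whitespace between format items is ignored...
same_size = struct.calcsize('<2h l') == struct.calcsize('<2hl')

# ...but whitespace between a count and its format character is rejected.
try:
    struct.calcsize('<2 h')
    split_count_rejected = False
except struct.error:
    split_count_rejected = True
```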
| 223 | |
| 224 | <P> |
| 225 | For the "<tt class="character">s</tt>" format character, the count is interpreted as the |
| 226 | size of the string, not as a repeat count as it is for the other format |
| 227 | characters; for example, <code>'10s'</code> means a single 10-byte string, while |
| 228 | <code>'10c'</code> means 10 characters. For packing, the string is |
| 229 | truncated or padded with null bytes as appropriate to make it fit. |
| 230 | For unpacking, the resulting string always has exactly the specified |
| 231 | number of bytes. As a special case, <code>'0s'</code> means a single, empty |
| 232 | string (while <code>'0c'</code> means 0 characters). |
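
<P>
A minimal sketch of the "<tt class="character">s</tt>" rules described above, assuming
a Python 3 interpreter (so the strings involved are <tt class="class">bytes</tt> objects):

```python
import struct

# Assumes Python 3: 's' and 'c' work with bytes objects.

# '10s' is ONE 10-byte string; short input is padded with null bytes.
padded = struct.pack('10s', b'spam')
# Over-long input is truncated to the count.
truncated = struct.pack('3s', b'spam')
# Unpacking always returns exactly `count` bytes, padding included.
(field,) = struct.unpack('10s', padded)
# '10c' is TEN separate one-byte values.
chars = struct.unpack('10c', padded)
# '0s' yields a single empty string; '0c' yields no values at all.
empty_s = struct.unpack('0s', b'')
empty_c = struct.unpack('0c', b'')
```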
| 233 | |
| 234 | <P> |
| 235 | The "<tt class="character">p</tt>" format character encodes a "Pascal string", meaning |
| 236 | a short variable-length string stored in a fixed number of bytes. |
| 237 | The count is the total number of bytes stored. The first byte stored is |
| 238 | the length of the string, or 255, whichever is smaller. The bytes |
| 239 | of the string follow. If the string passed in to <tt class="function">pack()</tt> is too |
| 240 | long (longer than the count minus 1), only the leading count-1 bytes of the |
| 241 | string are stored. If the string is shorter than count-1, it is padded |
| 242 | with null bytes so that exactly count bytes in all are used. Note that |
| 243 | for <tt class="function">unpack()</tt>, the "<tt class="character">p</tt>" format character consumes count |
| 244 | bytes, but that the string returned can never contain more than 255 |
| 245 | characters. |
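
<P>
The Pascal-string layout can be illustrated as follows. This is a sketch
assuming a Python 3 interpreter; the sample data is made up.

```python
import struct

# Assumes Python 3: 'p' works with bytes objects.

# 8 bytes in total: 1 length byte followed by up to 7 data bytes.
short = struct.pack('8p', b'abc')
# Input longer than count-1 is cut down to the leading count-1 bytes.
long_ = struct.pack('4p', b'abcdefgh')
# unpack() consumes all 8 bytes but returns only the 3 stored bytes.
(text,) = struct.unpack('8p', short)
```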
| 246 | |
| 247 | <P> |
| 248 | For the "<tt class="character">I</tt>", "<tt class="character">L</tt>", "<tt class="character">q</tt>" and "<tt class="character">Q</tt>" |
| 249 | format characters, the return value is a Python long integer. |
| 250 | |
| 251 | <P> |
| 252 | For the "<tt class="character">P</tt>" format character, the return value is a Python |
| 253 | integer or long integer, depending on the size needed to hold a |
| 254 | pointer when it has been cast to an integer type. A <tt class="constant">NULL</tt> pointer will |
| 255 | always be returned as the Python integer <code>0</code>. When packing pointer-sized |
| 256 | values, Python integer or long integer objects may be used. For |
| 257 | example, the Alpha and Merced processors use 64-bit pointer values, |
| 258 | meaning a Python long integer will be used to hold the pointer; other |
| 259 | platforms use 32-bit pointers and will use a Python integer. |
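
<P>
A small sketch of the "<tt class="character">P</tt>" format, assuming a Python 3
interpreter (where there is a single integer type) and a platform with
4-byte or 8-byte pointers:

```python
import struct

# Assumes Python 3 and native mode (no prefix), which 'P' requires.
pointer_size = struct.calcsize('P')   # typically 4 or 8

# A NULL pointer round-trips as the integer 0.
(null,) = struct.unpack('P', struct.pack('P', 0))

# The largest value that fits in a pointer also round-trips unchanged.
largest = 2 ** (8 * pointer_size) - 1
(round_trip,) = struct.unpack('P', struct.pack('P', largest))
```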
| 260 | |
| 261 | <P> |
| 262 | By default, C numbers are represented in the machine's native format |
| 263 | and byte order, and properly aligned by skipping pad bytes if |
| 264 | necessary (according to the rules used by the C compiler). |
| 265 | |
| 266 | <P> |
| 267 | Alternatively, the first character of the format string can be used to |
| 268 | indicate the byte order, size and alignment of the packed data, |
| 269 | according to the following table: |
| 270 | |
| 271 | <P> |
| 272 | <div class="center"><table class="realtable"> |
| 273 | <thead> |
| 274 | <tr> |
| 275 | <th class="center">Character</th> |
| 276 | <th class="left" >Byte order</th> |
| 277 | <th class="left" >Size and alignment</th> |
| 278 | </tr> |
| 279 | </thead> |
| 280 | <tbody> |
| 281 | <tr><td class="center" valign="baseline"><samp>@</samp></td> |
| 282 | <td class="left" >native</td> |
| 283 | <td class="left" >native</td></tr> |
| 284 | <tr><td class="center" valign="baseline"><samp>=</samp></td> |
| 285 | <td class="left" >native</td> |
| 286 | <td class="left" >standard</td></tr> |
| 287 | <tr><td class="center" valign="baseline"><samp>&lt;</samp></td> |
| 288 | <td class="left" >little-endian</td> |
| 289 | <td class="left" >standard</td></tr> |
| 290 | <tr><td class="center" valign="baseline"><samp>&gt;</samp></td> |
| 291 | <td class="left" >big-endian</td> |
| 292 | <td class="left" >standard</td></tr> |
| 293 | <tr><td class="center" valign="baseline"><samp>!</samp></td> |
| 294 | <td class="left" >network (= big-endian)</td> |
| 295 | <td class="left" >standard</td></tr></tbody> |
| 296 | </table></div> |
| 297 | |
| 298 | <P> |
| 299 | If the first character is not one of these, "<tt class="character">@</tt>" is assumed. |
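
<P>
The effect of the byte order characters in the table can be seen by packing
the same 16-bit value with each one. This is a sketch assuming a Python 3
interpreter and a 2-byte native <tt class="ctype">unsigned short</tt>.

```python
import struct
import sys

# Assumes Python 3: pack() returns bytes.
little = struct.pack('<H', 0x0102)    # b'\x02\x01'
big = struct.pack('>H', 0x0102)       # b'\x01\x02'
network = struct.pack('!H', 0x0102)   # same as '>'

# With no prefix, '@' is assumed, so the host's byte order is used.
native = struct.pack('H', 0x0102)
explicit_native = struct.pack('@H', 0x0102)
expected_native = little if sys.byteorder == 'little' else big
```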
| 300 | |
| 301 | <P> |
| 302 | Native byte order is big-endian or little-endian, depending on the |
| 303 | host system. For example, Motorola and Sun processors are big-endian; |
| 304 | Intel and DEC processors are little-endian. |
| 305 | |
| 306 | <P> |
| 307 | Native size and alignment are determined using the C compiler's |
| 308 | <tt class="keyword">sizeof</tt> expression. This is always combined with native byte |
| 309 | order. |
| 310 | |
| 311 | <P> |
| 312 | Standard size and alignment are as follows: no alignment is required |
| 313 | for any type (so you have to use pad bytes); |
| 314 | <tt class="ctype">short</tt> is 2 bytes; |
| 315 | <tt class="ctype">int</tt> and <tt class="ctype">long</tt> are 4 bytes; |
| 316 | <tt class="ctype">long long</tt> (<tt class="ctype">__int64</tt> on Windows) is 8 bytes; |
| 317 | <tt class="ctype">float</tt> and <tt class="ctype">double</tt> are 32-bit and 64-bit |
| 318 | IEEE floating point numbers, respectively. |
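
<P>
The standard sizes listed above can be confirmed with
<tt class="function">calcsize()</tt>. The sketch assumes a Python 3 interpreter and
also shows that, without alignment, pad bytes must be requested explicitly.

```python
import struct

# Assumes Python 3. Sizes under the '<' prefix; '=', '>' and '!' match them.
standard_sizes = dict((code, struct.calcsize('<' + code)) for code in 'hilqfd')

# No alignment in standard mode: the int directly follows the char...
unpadded = struct.calcsize('<ci')
# ...unless pad bytes are requested explicitly with 'x'.
padded = struct.calcsize('<c3xi')
```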
| 319 | |
| 320 | <P> |
| 321 | Note the difference between "<tt class="character">@</tt>" and "<tt class="character">=</tt>": both use |
| 322 | native byte order, but the size and alignment of the latter is |
| 323 | standardized. |
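
<P>
To make the difference concrete, the following sketch compares the two
prefixes on a format that needs alignment in native mode. It assumes a
Python 3 interpreter and a 4-byte native <tt class="ctype">int</tt>; the native size
shown in the comment is typical, not guaranteed.

```python
import struct

# Assumes Python 3 and a 4-byte native C int.
standard = struct.calcsize('=ci')   # always 1 + 4, no padding
native = struct.calcsize('@ci')     # compiler-dependent; often 8
default = struct.calcsize('ci')     # no prefix means '@'

# Both prefixes use the host's byte order for each individual value.
same_order = struct.pack('=i', 1) == struct.pack('@i', 1)
```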
| 324 | |
| 325 | <P> |
| 326 | The form "<tt class="character">!</tt>" is available for those poor souls who claim they |
| 327 | can't remember whether network byte order is big-endian or |
| 328 | little-endian. |
| 329 | |
| 330 | <P> |
| 331 | There is no way to indicate non-native byte order (force |
| 332 | byte-swapping); use the appropriate choice of "<tt class="character">&lt;</tt>" or |
| 333 | "<tt class="character">&gt;</tt>". |
| 334 | |
| 335 | <P> |
| 336 | The "<tt class="character">P</tt>" format character is only available for the native |
| 337 | byte ordering (selected as the default or with the "<tt class="character">@</tt>" byte |
| 338 | order character). The byte order character "<tt class="character">=</tt>" chooses to |
| 339 | use little- or big-endian ordering based on the host system. The |
| 340 | struct module does not interpret this as native ordering, so the |
| 341 | "<tt class="character">P</tt>" format is not available. |
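
<P>
The restriction can be observed directly. In this sketch, which assumes a
Python 3 interpreter, <code>is_supported</code> is a hypothetical helper and not
part of the module.

```python
import struct

# Assumes Python 3.
def is_supported(fmt):
    """Return True if struct accepts the format string (illustrative helper)."""
    try:
        struct.calcsize(fmt)
        return True
    except struct.error:
        return False

# 'P' is accepted only in native mode ('@', or no prefix at all).
support = dict((prefix, is_supported(prefix + 'P')) for prefix in '@=<>!')
no_prefix_ok = is_supported('P')
```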
| 342 | |
| 343 | <P> |
| 344 | Examples (all using native byte order, size and alignment, on a |
| 345 | big-endian machine where <tt class="ctype">long</tt> is 4 bytes): |
| 346 | |
| 347 | <P> |
| 348 | <div class="verbatim"><pre> |
| 349 | >>> from struct import * |
| 350 | >>> pack('hhl', 1, 2, 3) |
| 351 | '\x00\x01\x00\x02\x00\x00\x00\x03' |
| 352 | >>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03') |
| 353 | (1, 2, 3) |
| 354 | >>> calcsize('hhl') |
| 355 | 8 |
| 356 | </pre></div> |
| 357 | |
| 358 | <P> |
| 359 | Hint: to align the end of a structure to the alignment requirement of |
| 360 | a particular type, end the format with the code for that type with a |
| 361 | repeat count of zero. For example, the format <code>'llh0l'</code> |
| 362 | specifies two pad bytes at the end, assuming longs are aligned on |
| 363 | 4-byte boundaries. This only works when native size and alignment are |
| 364 | in effect; standard size and alignment do not enforce any alignment. |
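
<P>
The hint can be tried out with <tt class="function">calcsize()</tt>. The sketch
assumes a Python 3 interpreter; the amount of native padding depends on
the platform's size and alignment for <tt class="ctype">long</tt>, so no specific
native value is claimed here.

```python
import struct

# Assumes Python 3.
plain = struct.calcsize('llh')
aligned = struct.calcsize('llh0l')   # padded up to a long boundary
tail_padding = aligned - plain       # platform-dependent

# In standard mode nothing is aligned, so '0l' adds nothing.
standard_plain = struct.calcsize('<llh')
standard_aligned = struct.calcsize('<llh0l')
```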
| 365 | |
| 366 | <P> |
| 367 | <div class="seealso"> |
| 368 | <p class="heading">See Also:</p> |
| 369 | |
| 370 | <dl compact="compact" class="seemodule"> |
| 371 | <dt>Module <b><tt class="module"><a href="module-array.html">array</a></tt>:</b> |
| 372 | <dd>Packed binary storage of homogeneous data. |
| 373 | </dl> |
| 374 | <dl compact="compact" class="seemodule"> |
| 375 | <dt>Module <b><tt class="module"><a href="module-xdrlib.html">xdrlib</a></tt>:</b> |
| 376 | <dd>Packing and unpacking of XDR data. |
| 377 | </dl> |
| 378 | </div> |
| 379 | |
| 380 | <DIV CLASS="navigation"> |
| 413 | <hr /> |
| 414 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> |
| 415 | </DIV> |
| 416 | <!--End of Navigation Panel--> |
| 417 | <ADDRESS> |
| 418 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. |
| 419 | </ADDRESS> |
| 420 | </BODY> |
| 421 | </HTML> |