| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
| 2 | <html> |
| 3 | <head> |
| 4 | <link rel="STYLESHEET" href="ref.css" type='text/css' /> |
| 5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> |
| 6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> |
| 7 | <link rel="first" href="ref.html" title='Python Reference Manual' /> |
| 8 | <link rel='contents' href='contents.html' title="Contents" /> |
| 9 | <link rel='index' href='genindex.html' title='Index' /> |
| 10 | <link rel='last' href='about.html' title='About this document...' /> |
| 11 | <link rel='help' href='about.html' title='About this document...' /> |
| 12 | <link rel="next" href="datamodel.html" /> |
| 13 | <link rel="prev" href="introduction.html" /> |
| 14 | <link rel="parent" href="ref.html" /> |
| 15 | <link rel="next" href="line-structure.html" /> |
| 16 | <meta name='aesop' content='information' /> |
| 17 | <title>2. Lexical analysis</title> |
| 18 | </head> |
| 19 | <body> |
| 20 | <DIV CLASS="navigation"> |
| 21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> |
| 22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> |
| 23 | <tr> |
| 24 | <td class='online-navigation'><a rel="prev" title="1.2 Notation" |
| 25 | href="notation.html"><img src='../icons/previous.png' |
| 26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> |
| 27 | <td class='online-navigation'><a rel="parent" title="Python Reference Manual" |
| 28 | href="ref.html"><img src='../icons/up.png' |
| 29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> |
| 30 | <td class='online-navigation'><a rel="next" title="2.1 Line structure" |
| 31 | href="line-structure.html"><img src='../icons/next.png' |
| 32 | border='0' height='32' alt='Next Page' width='32' /></A></td> |
| 33 | <td align="center" width="100%">Python Reference Manual</td> |
| 34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" |
| 35 | href="contents.html"><img src='../icons/contents.png' |
| 36 | border='0' height='32' alt='Contents' width='32' /></A></td> |
| 37 | <td class='online-navigation'><img src='../icons/blank.png' |
| 38 | border='0' height='32' alt='' width='32' /></td> |
| 39 | <td class='online-navigation'><a rel="index" title="Index" |
| 40 | href="genindex.html"><img src='../icons/index.png' |
| 41 | border='0' height='32' alt='Index' width='32' /></A></td> |
| 42 | </tr></table> |
| 43 | <div class='online-navigation'> |
| 44 | <b class="navlabel">Previous:</b> |
| 45 | <a class="sectref" rel="prev" href="notation.html">1.2 Notation</A> |
| 46 | <b class="navlabel">Up:</b> |
| 47 | <a class="sectref" rel="parent" href="ref.html">Python Reference Manual</A> |
| 48 | <b class="navlabel">Next:</b> |
| 49 | <a class="sectref" rel="next" href="line-structure.html">2.1 Line structure</A> |
| 50 | </div> |
| 51 | <hr /></div> |
| 52 | </DIV> |
| 53 | <!--End of Navigation Panel--> |
| 54 | |
| 55 | <H1><A NAME="SECTION004000000000000000000"></A><A NAME="lexical"></A><a id='l2h-2' xml:id='l2h-2'></a> |
| 56 | <BR> |
| 57 | 2. Lexical analysis |
| 58 | </H1> |
| 59 | |
| 60 | <P> |
| 61 | A Python program is read by a <em>parser</em>. Input to the parser is a |
| 62 | stream of <em>tokens</em>, generated by the <em>lexical analyzer</em>. This |
| 63 | chapter describes how the lexical analyzer breaks a file into tokens. |
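<P>
The token stream described above can be inspected from Python itself. The
following is a minimal sketch, assuming a modern Python 3 interpreter and the
standard <code>tokenize</code> module, which is a pure-Python tokenizer and
not the interpreter's own lexical analyzer. The helper name
<code>token_stream</code> is hypothetical and used only for illustration.

```python
import io
import tokenize


def token_stream(source):
    """Return (token type name, token text) pairs for a source string."""
    # generate_tokens() expects a readline-style callable that yields str lines.
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


# One logical line is broken into names, operators, numbers and a NEWLINE token.
tokens = token_stream("x = 1 + 2\n")
for name, text in tokens:
    print(name, repr(text))
```

<P>
Each pair printed is one token handed to the parser; the final
<code>ENDMARKER</code> token signals the end of the input.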
| 64 | |
| 65 | <P> |
| 66 | Python uses the 7-bit ASCII character set for program text. |
| 67 | |
| 68 | <span class="versionnote">New in version 2.3: |
| 69 | An encoding declaration can be used to indicate that |
| 70 | string literals and comments use an encoding different from ASCII.</span> |
| 71 | |
| 72 | For compatibility with older versions, Python only warns if it finds |
| 73 | 8-bit characters; such warnings should be resolved either by declaring |
| 74 | an explicit encoding or, if the bytes are binary data rather than |
| 75 | characters, by using escape sequences. |
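<P>
As an illustration of the two remedies, the sketch below reads an encoding
declaration from the first line of a source file and shows binary data written
with escape sequences. It assumes a modern Python 3 interpreter, where
<code>tokenize.detect_encoding</code> is available and the default source
encoding is UTF-8 rather than ASCII; the sample source text and variable names
are made up for the example.

```python
import io
import tokenize

# A source file, as bytes, whose first line declares its encoding.
# The byte 0xE9 is an 8-bit character: e-acute in Latin-1.
source = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"

# detect_encoding() reads the declaration and returns a normalized codec name.
declared, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
text = source.decode(declared)
print(declared)  # iso-8859-1

# Binary data, by contrast, is spelled with escape sequences, so the
# source file itself stays pure ASCII and needs no declaration.
payload = b"\x89PNG"
```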
| 76 | |
| 77 | <P> |
| 78 | The run-time character set depends on the I/O devices connected to the |
| 79 | program but is generally a superset of ASCII. |
| 80 | |
| 81 | <P> |
| 82 | <strong>Future compatibility note:</strong> It may be tempting to assume that the |
| 83 | character set for 8-bit characters is ISO Latin-1 (an ASCII |
| 84 | superset that covers most western languages that use the Latin |
| 85 | alphabet), but it is possible that in the future Unicode text editors |
| 86 | will become common. These generally use the UTF-8 encoding, which is |
| 87 | also an ASCII superset but assigns very different meanings to the |
| 88 | bytes with ordinals 128-255. While there is no consensus on this |
| 89 | subject yet, it is unwise to assume either Latin-1 or UTF-8, even |
| 90 | though the current implementation appears to favor Latin-1. This |
| 91 | applies both to the source character set and the run-time character |
| 92 | set. |
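<P>
The difference between the two encodings for ordinals 128-255 can be seen by
decoding the same bytes both ways. This is a small sketch, assuming a modern
Python 3 interpreter; the byte strings are illustrative only.

```python
# The same five bytes, interpreted under two ASCII supersets.
raw = b"caf\xc3\xa9"

as_utf8 = raw.decode("utf-8")      # 0xC3 0xA9 together form one character
as_latin1 = raw.decode("latin-1")  # each byte is its own character

print(len(as_utf8), len(as_latin1))  # 4 5

# A lone 0xE9 byte is a valid Latin-1 character but not valid UTF-8.
single = b"\xe9"
latin1_char = single.decode("latin-1")
try:
    single.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

<P>
Because the pure-ASCII prefix decodes identically in both cases, a wrong
assumption about the encoding only shows up once 8-bit characters appear.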
| 93 | |
| 94 | <P> |
| 95 | |
| 96 | <p><br /></p><hr class='online-navigation' /> |
| 97 | <div class='online-navigation'> |
| 98 | <!--Table of Child-Links--> |
| 99 | <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></a> |
| 100 | |
| 101 | <UL CLASS="ChildLinks"> |
| 102 | <LI><A href="line-structure.html">2.1 Line structure</a> |
| 103 | <UL> |
| 104 | <LI><A href="logical.html">2.1.1 Logical lines</a> |
| 105 | <LI><A href="physical.html">2.1.2 Physical lines</a> |
| 106 | <LI><A href="comments.html">2.1.3 Comments</a> |
| 107 | <LI><A href="encodings.html">2.1.4 Encoding declarations</a> |
| 108 | <LI><A href="explicit-joining.html">2.1.5 Explicit line joining</a> |
| 109 | <LI><A href="implicit-joining.html">2.1.6 Implicit line joining</a> |
| 110 | <LI><A href="blank-lines.html">2.1.7 Blank lines</a> |
| 111 | <LI><A href="indentation.html">2.1.8 Indentation</a> |
| 112 | <LI><A href="whitespace.html">2.1.9 Whitespace between tokens</a> |
| 113 | </ul> |
| 114 | <LI><A href="other-tokens.html">2.2 Other tokens</a> |
| 115 | <LI><A href="identifiers.html">2.3 Identifiers and keywords</a> |
| 116 | <UL> |
| 117 | <LI><A href="keywords.html">2.3.1 Keywords</a> |
| 118 | <LI><A href="id-classes.html">2.3.2 Reserved classes of identifiers</a> |
| 119 | </ul> |
| 120 | <LI><A href="literals.html">2.4 Literals</a> |
| 121 | <UL> |
| 122 | <LI><A href="strings.html">2.4.1 String literals</a> |
| 123 | <LI><A href="string-catenation.html">2.4.2 String literal concatenation</a> |
| 124 | <LI><A href="numbers.html">2.4.3 Numeric literals</a> |
| 125 | <LI><A href="integers.html">2.4.4 Integer and long integer literals</a> |
| 126 | <LI><A href="floating.html">2.4.5 Floating point literals</a> |
| 127 | <LI><A href="imaginary.html">2.4.6 Imaginary literals</a> |
| 128 | </ul> |
| 129 | <LI><A href="operators.html">2.5 Operators</a> |
| 130 | <LI><A href="delimiters.html">2.6 Delimiters</a> |
| 131 | </ul> |
| 132 | <!--End of Table of Child-Links--> |
| 133 | </div> |
| 134 | |
| 135 | <DIV CLASS="navigation"> |
| 168 | <hr /> |
| 169 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> |
| 170 | </DIV> |
| 171 | <!--End of Navigation Panel--> |
| 172 | <ADDRESS> |
| 173 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. |
| 174 | </ADDRESS> |
| 175 | </BODY> |
| 176 | </HTML> |