Commit | Line | Data |
---|---|---|
920dae64 AT |
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> |
2 | <html> | |
3 | <head> | |
4 | <link rel="STYLESHEET" href="ref.css" type='text/css' /> | |
5 | <link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" /> | |
6 | <link rel='start' href='../index.html' title='Python Documentation Index' /> | |
7 | <link rel="first" href="ref.html" title='Python Reference Manual' /> | |
8 | <link rel='contents' href='contents.html' title="Contents" /> | |
9 | <link rel='index' href='genindex.html' title='Index' /> | |
10 | <link rel='last' href='about.html' title='About this document...' /> | |
11 | <link rel='help' href='about.html' title='About this document...' /> | |
12 | <link rel="next" href="datamodel.html" /> | |
13 | <link rel="prev" href="introduction.html" /> | |
14 | <link rel="parent" href="ref.html" /> | |
15 | <link rel="next" href="line-structure.html" /> | |
16 | <meta name='aesop' content='information' /> | |
17 | <title>2. Lexical analysis</title> | |
18 | </head> | |
19 | <body> | |
20 | <DIV CLASS="navigation"> | |
21 | <div id='top-navigation-panel' xml:id='top-navigation-panel'> | |
22 | <table align="center" width="100%" cellpadding="0" cellspacing="2"> | |
23 | <tr> | |
24 | <td class='online-navigation'><a rel="prev" title="1.2 Notation" | |
25 | href="notation.html"><img src='../icons/previous.png' | |
26 | border='0' height='32' alt='Previous Page' width='32' /></A></td> | |
27 | <td class='online-navigation'><a rel="parent" title="Python Reference Manual" | |
28 | href="ref.html"><img src='../icons/up.png' | |
29 | border='0' height='32' alt='Up One Level' width='32' /></A></td> | |
30 | <td class='online-navigation'><a rel="next" title="2.1 Line structure" | |
31 | href="line-structure.html"><img src='../icons/next.png' | |
32 | border='0' height='32' alt='Next Page' width='32' /></A></td> | |
33 | <td align="center" width="100%">Python Reference Manual</td> | |
34 | <td class='online-navigation'><a rel="contents" title="Table of Contents" | |
35 | href="contents.html"><img src='../icons/contents.png' | |
36 | border='0' height='32' alt='Contents' width='32' /></A></td> | |
37 | <td class='online-navigation'><img src='../icons/blank.png' | |
38 | border='0' height='32' alt='' width='32' /></td> | |
39 | <td class='online-navigation'><a rel="index" title="Index" | |
40 | href="genindex.html"><img src='../icons/index.png' | |
41 | border='0' height='32' alt='Index' width='32' /></A></td> | |
42 | </tr></table> | |
43 | <div class='online-navigation'> | |
44 | <b class="navlabel">Previous:</b> | |
45 | <a class="sectref" rel="prev" href="notation.html">1.2 Notation</A> | |
46 | <b class="navlabel">Up:</b> | |
47 | <a class="sectref" rel="parent" href="ref.html">Python Reference Manual</A> | |
48 | <b class="navlabel">Next:</b> | |
49 | <a class="sectref" rel="next" href="line-structure.html">2.1 Line structure</A> | |
50 | </div> | |
51 | <hr /></div> | |
52 | </DIV> | |
53 | <!--End of Navigation Panel--> | |
54 | ||
55 | <H1><A NAME="SECTION004000000000000000000"></A><A NAME="lexical"></A><a id='l2h-2' xml:id='l2h-2'></a> | |
56 | <BR> | |
57 | 2. Lexical analysis | |
58 | </H1> | |
59 | ||
60 | <P> | |
61 | A Python program is read by a <em>parser</em>. Input to the parser is a | |
62 | stream of <em>tokens</em>, generated by the <em>lexical analyzer</em>. This | |
63 | chapter describes how the lexical analyzer breaks a file into tokens. | |
64 | ||
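<P>
The token stream described above can be inspected with the standard
<code>tokenize</code> module. The following is a minimal sketch, assuming a
modern Python 3 interpreter rather than the release this chapter documents;
the helper name <code>token_stream</code> is hypothetical and only serves
the illustration.

```python
import io
import tokenize


def token_stream(source):
    """Return (token type name, token text) pairs for a source string."""
    # generate_tokens() expects a readline-style callable that yields str lines.
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


# One logical line is broken into names, operators, and numbers,
# followed by the tokens that mark the end of the line and of the input.
pairs = token_stream("x = 1 + 2\n")
for name, text in pairs:
    print(name, repr(text))
```

<P>
Each pair holds the kind of token and the exact text it covers, which is
the information the parser receives from the lexical analyzer.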
65 | <P> | |
66 | Python uses the 7-bit ASCII character set for program text. | |
67 | ||
68 | <span class="versionnote">New in version 2.3: | |
69 | An encoding declaration can be used to indicate that | |
70 | string literals and comments use an encoding different from ASCII.</span> |
71 | ||
72 | For compatibility with older versions, Python only warns if it finds |
73 | 8-bit characters. Correct these warnings either by declaring an |
74 | explicit encoding or, if the bytes are binary data rather than |
75 | characters, by using escape sequences. |
76 | ||
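<P>
As an illustration of an encoding declaration, the sketch below asks the
standard <code>tokenize</code> module which encoding it detects for a small
source file held in memory. It assumes a modern Python 3 interpreter, whose
default source encoding is UTF-8 rather than the ASCII default described
here; the helper name <code>declared_encoding</code> and the sample source
are hypothetical.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding that tokenize detects for raw source bytes."""
    # detect_encoding() reads at most the first two lines, looking for a
    # byte order mark or a "coding" comment.
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# The first line declares the encoding; the second line contains the
# 8-bit byte 0xE9, which is only meaningful once an encoding is known.
with_declaration = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"
print(declared_encoding(with_declaration))
```

<P>
The module reports a normalized name for the declared encoding, so the
printed value need not match the spelling used in the comment.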
77 | <P> | |
78 | The run-time character set depends on the I/O devices connected to the | |
79 | program but is generally a superset of ASCII. | |
80 | ||
81 | <P> | |
82 | <strong>Future compatibility note:</strong> It may be tempting to assume that the | |
83 | character set for 8-bit characters is ISO Latin-1 (an ASCII | |
84 | superset that covers most western languages that use the Latin | |
85 | alphabet), but it is possible that in the future Unicode text editors | |
86 | will become common. These generally use the UTF-8 encoding, which is | |
87 | also an ASCII superset, but with very different use for the | |
88 | characters with ordinals 128-255. Until there is consensus on this | |
89 | subject, it is unwise to assume either Latin-1 or UTF-8, even | |
90 | though the current implementation appears to favor Latin-1. This | |
91 | applies both to the source character set and the run-time character | |
92 | set. | |
93 | ||
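<P>
The difference between the two ASCII supersets mentioned above can be seen
by encoding the same text both ways. This is a minimal sketch, assuming a
modern Python 3 interpreter; the sample word and variable names are chosen
only for illustration.

```python
# "cafe" with an acute accent on the final letter, written with an escape
# sequence so that this example itself contains only ASCII characters.
text = "caf\u00e9"

latin1_bytes = text.encode("latin-1")  # the accented letter is one byte, 0xE9
utf8_bytes = text.encode("utf-8")      # the same letter is two bytes, 0xC3 0xA9

print(latin1_bytes, utf8_bytes)

# Reading UTF-8 bytes as Latin-1 succeeds but yields different characters.
misread = utf8_bytes.decode("latin-1")

# Reading these Latin-1 bytes as UTF-8 fails outright.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

<P>
Both encodings agree on the ASCII letters and differ only in the bytes with
ordinals 128-255, which is why assuming either one for undeclared 8-bit
data is unsafe.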
94 | <P> | |
95 | ||
96 | <p><br /></p><hr class='online-navigation' /> | |
97 | <div class='online-navigation'> | |
98 | <!--Table of Child-Links--> | |
99 | <A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></a> | |
100 | ||
101 | <UL CLASS="ChildLinks"> | |
102 | <LI><A href="line-structure.html">2.1 Line structure</a> | |
103 | <UL> | |
104 | <LI><A href="logical.html">2.1.1 Logical lines</a> | |
105 | <LI><A href="physical.html">2.1.2 Physical lines</a> | |
106 | <LI><A href="comments.html">2.1.3 Comments</a> | |
107 | <LI><A href="encodings.html">2.1.4 Encoding declarations</a> | |
108 | <LI><A href="explicit-joining.html">2.1.5 Explicit line joining</a> | |
109 | <LI><A href="implicit-joining.html">2.1.6 Implicit line joining</a> | |
110 | <LI><A href="blank-lines.html">2.1.7 Blank lines</a> | |
111 | <LI><A href="indentation.html">2.1.8 Indentation</a> | |
112 | <LI><A href="whitespace.html">2.1.9 Whitespace between tokens</a> | |
113 | </ul> | |
114 | <LI><A href="other-tokens.html">2.2 Other tokens</a> | |
115 | <LI><A href="identifiers.html">2.3 Identifiers and keywords</a> | |
116 | <UL> | |
117 | <LI><A href="keywords.html">2.3.1 Keywords</a> | |
118 | <LI><A href="id-classes.html">2.3.2 Reserved classes of identifiers</a> | |
119 | </ul> | |
120 | <LI><A href="literals.html">2.4 Literals</a> | |
121 | <UL> | |
122 | <LI><A href="strings.html">2.4.1 String literals</a> | |
123 | <LI><A href="string-catenation.html">2.4.2 String literal concatenation</a> | |
124 | <LI><A href="numbers.html">2.4.3 Numeric literals</a> | |
125 | <LI><A href="integers.html">2.4.4 Integer and long integer literals</a> | |
126 | <LI><A href="floating.html">2.4.5 Floating point literals</a> | |
127 | <LI><A href="imaginary.html">2.4.6 Imaginary literals</a> | |
128 | </ul> | |
129 | <LI><A href="operators.html">2.5 Operators</a> | |
130 | <LI><A href="delimiters.html">2.6 Delimiters</a> | |
131 | </ul> | |
132 | <!--End of Table of Child-Links--> | |
133 | </div> | |
134 | ||
135 | <DIV CLASS="navigation"> | |
168 | <hr /> | |
169 | <span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span> | |
170 | </DIV> | |
171 | <!--End of Navigation Panel--> | |
172 | <ADDRESS> | |
173 | See <i><a href="about.html">About this document...</a></i> for information on suggesting changes. | |
174 | </ADDRESS> | |
175 | </BODY> | |
176 | </HTML> |