<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<link rel="STYLESHEET" href="ref.css" type='text/css' />
<link rel="SHORTCUT ICON" href="../icons/pyfav.png" type="image/png" />
<link rel='start' href='../index.html' title='Python Documentation Index' />
<link rel="first" href="ref.html" title='Python Reference Manual' />
<link rel='contents' href='contents.html' title="Contents" />
<link rel='index' href='genindex.html' title='Index' />
<link rel='last' href='about.html' title='About this document...' />
<link rel='help' href='about.html' title='About this document...' />
<link rel="next" href="string-catenation.html" />
<link rel="prev" href="literals.html" />
<link rel="parent" href="literals.html" />
<meta name='aesop' content='information' />
<title>2.4.1 String literals</title>
</head>
<body>
<DIV CLASS="navigation">
<div id='top-navigation-panel' xml:id='top-navigation-panel'>
<table align="center" width="100%" cellpadding="0" cellspacing="2">
<tr>
<td class='online-navigation'><a rel="prev" title="2.4 Literals"
href="literals.html"><img src='../icons/previous.png'
border='0' height='32' alt='Previous Page' width='32' /></A></td>
<td class='online-navigation'><a rel="parent" title="2.4 Literals"
href="literals.html"><img src='../icons/up.png'
border='0' height='32' alt='Up One Level' width='32' /></A></td>
<td class='online-navigation'><a rel="next" title="2.4.2 String literal concatenation"
href="string-catenation.html"><img src='../icons/next.png'
border='0' height='32' alt='Next Page' width='32' /></A></td>
<td align="center" width="100%">Python Reference Manual</td>
<td class='online-navigation'><a rel="contents" title="Table of Contents"
href="contents.html"><img src='../icons/contents.png'
border='0' height='32' alt='Contents' width='32' /></A></td>
<td class='online-navigation'><img src='../icons/blank.png'
border='0' height='32' alt='' width='32' /></td>
<td class='online-navigation'><a rel="index" title="Index"
href="genindex.html"><img src='../icons/index.png'
border='0' height='32' alt='Index' width='32' /></A></td>
</tr></table>
<div class='online-navigation'>
<b class="navlabel">Previous:</b>
<a class="sectref" rel="prev" href="literals.html">2.4 Literals</A>
<b class="navlabel">Up:</b>
<a class="sectref" rel="parent" href="literals.html">2.4 Literals</A>
<b class="navlabel">Next:</b>
<a class="sectref" rel="next" href="string-catenation.html">2.4.2 String literal concatenation</A>
</div>
<hr /></div>
</DIV>
<!--End of Navigation Panel-->
<H2><A NAME="SECTION004410000000000000000"></A><A NAME="strings"></A><a id='l2h-13' xml:id='l2h-13'></a>
<BR>
2.4.1 String literals
</H2>
<P>
String literals are described by the following lexical definitions:
<P>
<dl><dd class="grammar">
<div class="productions">
<table>
<tr>
<td><a id='tok-stringliteral' xml:id='tok-stringliteral'>stringliteral</a></td>
<td>::=</td>
<td>[<a class='grammartoken' href="strings.html#tok-stringprefix">stringprefix</a>](<a class='grammartoken' href="strings.html#tok-shortstring">shortstring</a> | <a class='grammartoken' href="strings.html#tok-longstring">longstring</a>)</td></tr>
<tr>
<td><a id='tok-stringprefix' xml:id='tok-stringprefix'>stringprefix</a></td>
<td>::=</td>
<td>"r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"</td></tr>
<tr>
<td><a id='tok-shortstring' xml:id='tok-shortstring'>shortstring</a></td>
<td>::=</td>
<td>"'" <a class='grammartoken' href="strings.html#tok-shortstringitem">shortstringitem</a>* "'"
| '"' <a class='grammartoken' href="strings.html#tok-shortstringitem">shortstringitem</a>* '"'</td></tr>
<tr>
<td><a id='tok-longstring' xml:id='tok-longstring'>longstring</a></td>
<td>::=</td>
<td>"'''" <a class='grammartoken' href="strings.html#tok-longstringitem">longstringitem</a>* "'''"</td></tr>
<tr>
<td></td>
<td></td>
<td><code>| '"""' <a class='grammartoken' href="strings.html#tok-longstringitem">longstringitem</a>* '"""'</code></td></tr>
<tr>
<td><a id='tok-shortstringitem' xml:id='tok-shortstringitem'>shortstringitem</a></td>
<td>::=</td>
<td><a class='grammartoken' href="strings.html#tok-shortstringchar">shortstringchar</a> | <a class='grammartoken' href="strings.html#tok-escapeseq">escapeseq</a></td></tr>
<tr>
<td><a id='tok-longstringitem' xml:id='tok-longstringitem'>longstringitem</a></td>
<td>::=</td>
<td><a class='grammartoken' href="strings.html#tok-longstringchar">longstringchar</a> | <a class='grammartoken' href="strings.html#tok-escapeseq">escapeseq</a></td></tr>
<tr>
<td><a id='tok-shortstringchar' xml:id='tok-shortstringchar'>shortstringchar</a></td>
<td>::=</td>
<td>&lt;any source character except "&#92;" or newline or the quote&gt;</td></tr>
<tr>
<td><a id='tok-longstringchar' xml:id='tok-longstringchar'>longstringchar</a></td>
<td>::=</td>
<td>&lt;any source character except "&#92;"&gt;</td></tr>
<tr>
<td><a id='tok-escapeseq' xml:id='tok-escapeseq'>escapeseq</a></td>
<td>::=</td>
<td>"&#92;" &lt;any ASCII character&gt;</td></tr>
</table>
</div>
<a class="grammar-footer"
href="grammar.txt" type="text/plain"
>Download entire grammar as text.</a>
</dd></dl>
<P>
One syntactic restriction not indicated by these productions is that
whitespace is not allowed between the <a class='grammartoken' href="strings.html#tok-stringprefix">stringprefix</a> and
the rest of the string literal. The source character set is defined
by the encoding declaration; it is ASCII if no encoding declaration
is given in the source file; see section&nbsp;<A href="encodings.html#encodings">2.1.4</A>.
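<P>
As a rough illustration, the productions above, together with the
whitespace restriction, can be approximated with a regular expression.
This is a minimal sketch and not the tokenizer Python itself uses; the
names <code>STRINGLITERAL</code> and <code>is_string_literal</code> are
hypothetical, and the pattern only checks the lexical shape of a literal.

```python
import re

# Hypothetical names; each pattern transcribes one production from the grammar.
STRINGPREFIX = r"(?:[uU][rR]|[rRuU])?"        # "u" must come before "r"
ESCAPESEQ = r"\\[\x00-\x7f]"                  # backslash plus any ASCII character
SHORT_SINGLE = r"'(?:[^\\\n']|" + ESCAPESEQ + r")*'"
SHORT_DOUBLE = r'"(?:[^\\\n"]|' + ESCAPESEQ + r')*"'
LONG_SINGLE = r"'''(?:[^\\]|" + ESCAPESEQ + r")*?'''"
LONG_DOUBLE = r'"""(?:[^\\]|' + ESCAPESEQ + r')*?"""'

# Long strings are tried first so that ''' is not read as an empty short string.
STRINGLITERAL = re.compile(
    STRINGPREFIX
    + "(?:"
    + "|".join([LONG_SINGLE, LONG_DOUBLE, SHORT_SINGLE, SHORT_DOUBLE])
    + ")"
)


def is_string_literal(source_text):
    """Return True if source_text is exactly one string literal."""
    match = STRINGLITERAL.match(source_text)
    return match is not None and match.end() == len(source_text)
```

<P>
Under this sketch, the text <code>ur"x"</code> is accepted, while
<code>ru"x"</code> (wrong prefix order) and <code>r "x"</code>
(whitespace after the prefix) are rejected.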
<P>
In plain English: String literals can be enclosed in matching single
quotes (<code>'</code>) or double quotes (<code>"</code>). They can also be
enclosed in matching groups of three single or double quotes (these
are generally referred to as <em>triple-quoted strings</em>). The
backslash (<code>&#92;</code>) character is used to escape characters that
otherwise have a special meaning, such as newline, backslash itself,
or the quote character. String literals may optionally be prefixed
with a letter "<tt class="character">r</tt>" or "<tt class="character">R</tt>"; such strings are called
<i class="dfn">raw strings</i><a id='l2h-14' xml:id='l2h-14'></a> and use different rules for interpreting
backslash escape sequences. A prefix of "<tt class="character">u</tt>" or "<tt class="character">U</tt>"
makes the string a Unicode string. Unicode strings use the Unicode character
set as defined by the Unicode Consortium and ISO&nbsp;10646. Some additional
escape sequences, described below, are available in Unicode strings.
The two prefix characters may be combined; in this case, "<tt class="character">u</tt>" must
appear before "<tt class="character">r</tt>".
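<P>
The following short sketch illustrates the quoting styles and prefixes
described above. It is written to run on a current Python&nbsp;3
interpreter as well, which is an assumption on our part; the variable
names are illustrative only.

```python
# The quote style does not change the value of the string.
assert 'spam' == "spam"

# One quote style can hold the other without any escaping.
apostrophe = "it's"
quoted = 'say "hi"'

# A backslash escapes the quote that would otherwise end the literal.
assert 'it\'s' == apostrophe
assert "say \"hi\"" == quoted

# An ordinary string interprets the escape; a raw string keeps the backslash.
assert len("\t") == 1
assert len(r"\t") == 2

# The u prefix marks a Unicode string (also accepted by Python 3.3 or later).
greeting = u"caf\u00e9"
assert len(greeting) == 4
```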
<P>
In triple-quoted strings,
unescaped newlines and quotes are allowed (and are retained), except
that three unescaped quotes in a row terminate the string. (A
"quote" is the character used to open the string, i.e., either
<code>'</code> or <code>"</code>.)
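<P>
As a small illustration of the rule above, the sketch below shows a
triple-quoted string that keeps its newline and both kinds of quote
characters. The variable names are illustrative only.

```python
# Newlines and quotes inside a triple-quoted string are kept as written.
verse = """Roses are "red",
violets are 'blue'."""

assert verse.count("\n") == 1
assert '"red"' in verse

# Two quotes in a row do not end the string; only three in a row do.
two_quotes = '''a''b'''
assert two_quotes == "a''b"
```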
<P>
Unless an "<tt class="character">r</tt>" or "<tt class="character">R</tt>" prefix is present, escape
sequences in strings are interpreted according to rules similar
to those used by Standard C. The recognized escape sequences are:
<P>
<div class="center"><table class="realtable">
<thead>
<tr>
<th class="left" >Escape Sequence</th>
<th class="left" >Meaning</th>
<th class="center">Notes</th>
</tr>
</thead>
<tbody>
<tr><td class="left" valign="baseline"><code>&#92;<var>newline</var></code></td>
<td class="left" >Ignored</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;&#92;</code></td>
<td class="left" >Backslash (<code>&#92;</code>)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;'</code></td>
<td class="left" >Single quote (<code>'</code>)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;"</code></td>
<td class="left" >Double quote (<code>"</code>)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;a</code></td>
<td class="left" >ASCII Bell (BEL)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;b</code></td>
<td class="left" >ASCII Backspace (BS)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;f</code></td>
<td class="left" >ASCII Formfeed (FF)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;n</code></td>
<td class="left" >ASCII Linefeed (LF)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;N{<var>name</var>}</code></td>
<td class="left" >Character named <var>name</var> in the Unicode database (Unicode only)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;r</code></td>
<td class="left" >ASCII Carriage Return (CR)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;t</code></td>
<td class="left" >ASCII Horizontal Tab (TAB)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;u<var>xxxx</var></code></td>
<td class="left" >Character with 16-bit hex value <var>xxxx</var> (Unicode only)</td>
<td class="center">(1)</td></tr>
<tr><td class="left" valign="baseline"><code>&#92;U<var>xxxxxxxx</var></code></td>
<td class="left" >Character with 32-bit hex value <var>xxxxxxxx</var> (Unicode only)</td>
<td class="center">(2)</td></tr>
<tr><td class="left" valign="baseline"><code>&#92;v</code></td>
<td class="left" >ASCII Vertical Tab (VT)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>&#92;<var>ooo</var></code></td>
<td class="left" >Character with octal value <var>ooo</var></td>
<td class="center">(3,5)</td></tr>
<tr><td class="left" valign="baseline"><code>&#92;x<var>hh</var></code></td>
<td class="left" >Character with hex value <var>hh</var></td>
<td class="center">(4,5)</td></tr></tbody>
</table></div>
<P>
Notes:
<P>
<DL COMPACT>
<DT>(1)</DT>
<DD>Individual code units which form parts of a surrogate pair can be
encoded using this escape sequence.
</DD>
<DT>(2)</DT>
<DD>Any Unicode character can be encoded this way, but characters
outside the Basic Multilingual Plane (BMP) will be encoded using a
surrogate pair if Python is compiled to use 16-bit code units (the
default). Individual code units which form parts of a surrogate
pair can be encoded using this escape sequence.
</DD>
<DT>(3)</DT>
<DD>As in Standard C, up to three octal digits are accepted.
</DD>
<DT>(4)</DT>
<DD>Unlike in Standard C, at most two hex digits are accepted.
</DD>
<DT>(5)</DT>
<DD>In a string literal, hexadecimal and octal escapes denote the
byte with the given value; it is not necessary that the byte
encodes a character in the source character set. In a Unicode
literal, these escapes denote a Unicode character with the given
value.
</DD>
</DL>
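<P>
The table and notes above can be checked with a few assertions. This is
a sketch written to run on a current Python&nbsp;3 interpreter (an
assumption; the <code>b</code> prefix used for the byte string is not
part of the grammar on this page). The helper
<code>surrogate_pair</code> is a hypothetical name; it shows the
standard UTF-16 arithmetic behind the surrogate pairs mentioned in
notes (1) and (2).

```python
# Each of these escapes denotes exactly one character.
SINGLE_CHARACTER_ESCAPES = [
    ("\\", 0x5C),  # backslash
    ("\'", 0x27),  # single quote
    ("\"", 0x22),  # double quote
    ("\a", 0x07),  # BEL
    ("\b", 0x08),  # BS
    ("\f", 0x0C),  # FF
    ("\n", 0x0A),  # LF
    ("\r", 0x0D),  # CR
    ("\t", 0x09),  # TAB
    ("\v", 0x0B),  # VT
]
for text, code in SINGLE_CHARACTER_ESCAPES:
    assert len(text) == 1 and ord(text) == code

# Backslash-newline is ignored, so the literal continues on the next line.
joined = "abc\
def"
assert joined == "abcdef"

# Unicode-only escapes, written with the u prefix.
assert u"\N{LATIN SMALL LETTER A}" == u"a"
assert u"\u0041" == u"\U00000041" == u"A"

# Notes (3) and (4): at most three octal and two hex digits are consumed.
assert "\1011" == "A1"
assert "\x411" == "A1"

# Note (5): a byte in a byte string, a character in a Unicode string.
assert len(b"\xe9") == 1
assert u"\xe9" == u"\u00e9"


def surrogate_pair(code_point):
    """Split a code point above U+FFFF into (high, low) UTF-16 code units."""
    offset = code_point - 0x10000
    return 0xD800 + offset // 0x400, 0xDC00 + offset % 0x400
```

<P>
For example, <code>surrogate_pair(0x1D11E)</code> returns the pair
<code>(0xD834, 0xDD1E)</code>, the same two code units that the UTF-16
encoding of that character contains.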
<P>
Unlike Standard <a id='l2h-15' xml:id='l2h-15'></a>C,
all unrecognized escape sequences are left in the string unchanged,
i.e., <em>the backslash is left in the string</em>. (This behavior is
useful when debugging: if an escape sequence is mistyped, the
resulting output is more easily recognized as broken.) It is also
important to note that the escape sequences marked as "(Unicode
only)" in the table above fall into the category of unrecognized
escapes for non-Unicode string literals.
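<P>
The sketch below shows an unrecognized escape keeping its backslash. It
assumes a Python&nbsp;3 interpreter, where recent releases warn about
such escapes; for that reason the literal is evaluated from source text
with warnings silenced. The helper name <code>evaluate_literal</code>
is hypothetical, and the byte string stands in for the non-Unicode
string of this manual.

```python
import ast
import warnings


def evaluate_literal(source_text):
    """Evaluate one literal given as source text, hiding escape warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source_text)


# The unrecognized escape \q stays in the string as two characters.
value = evaluate_literal(r"'\q'")
assert value == "\\q"
assert len(value) == 2

# In a non-Unicode (byte) string, \u is not recognized and is left unchanged.
raw_bytes = evaluate_literal(r"b'\u0041'")
assert len(raw_bytes) == 6
```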
<P>
When an "<tt class="character">r</tt>" or "<tt class="character">R</tt>" prefix is present, a character
following a backslash is included in the string without change, and <em>all
backslashes are left in the string</em>. For example, the string literal
<code>r"&#92;n"</code> consists of two characters: a backslash and a lowercase
"<tt class="character">n</tt>". String quotes can be escaped with a backslash, but the
backslash remains in the string; for example, <code>r"&#92;""</code> is a valid string
literal consisting of two characters: a backslash and a double quote;
<code>r"&#92;"</code> is not a valid string literal (even a raw string cannot
end in an odd number of backslashes). Specifically, <em>a raw
string cannot end in a single backslash</em> (since the backslash would
escape the following quote character). Note also that a single
backslash followed by a newline is interpreted as those two characters
as part of the string, <em>not</em> as a line continuation.
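<P>
The raw-string rules above can be illustrated as follows. This is a
minimal sketch; the invalid literal is passed to the built-in
<code>compile</code> as text so that the rest of the example still runs,
and the variable names are illustrative only.

```python
# A raw string keeps the backslash, so r"\n" holds two characters.
assert len(r"\n") == 2
assert r"\n" == "\\n"

# An escaped quote does not end the raw string, and the backslash is kept.
assert r"\"" == '\\"'
assert len(r"\"") == 2

# A raw string that ends in a single backslash is rejected by the compiler.
try:
    compile('r"\\"', "example", "eval")
    ends_in_backslash_ok = True
except SyntaxError:
    ends_in_backslash_ok = False
assert not ends_in_backslash_ok

# Backslash-newline in a raw string is kept as two characters.
raw = r"a\
b"
assert raw == "a\\\nb"
assert len(raw) == 4
```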
<P>
When an "<tt class="character">r</tt>" or "<tt class="character">R</tt>" prefix is used in conjunction
with a "<tt class="character">u</tt>" or "<tt class="character">U</tt>" prefix, then the <code>&#92;uXXXX</code>
escape sequence is processed while <em>all other backslashes are
left in the string</em>. For example, the string literal
<code>ur"&#92;u0062&#92;n"</code> consists of three Unicode characters: "LATIN
SMALL LETTER B", "REVERSE SOLIDUS", and "LATIN SMALL LETTER N".
Backslashes can be escaped with a preceding backslash; however, both
remain in the string. As a result, <code>&#92;uXXXX</code> escape sequences
are only recognized when the run of backslashes immediately before the
"<tt class="character">u</tt>" has odd length.
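<P>
Python&nbsp;3 does not accept the combined prefix, so the sketch below
uses the standard <code>raw_unicode_escape</code> codec, which applies
comparable rules, to illustrate the behaviour on a current interpreter.
That substitution is an assumption made for this example, and the
helper name <code>decode_raw_unicode</code> is hypothetical.

```python
def decode_raw_unicode(source_text):
    """Process only \\uXXXX escapes, leaving every other backslash in place."""
    return source_text.encode("ascii").decode("raw_unicode_escape")


# One backslash before the u: the escape is processed, the \n is left alone.
assert decode_raw_unicode(r"\u0062\n") == "b\\n"
assert len(decode_raw_unicode(r"\u0062\n")) == 3

# Two backslashes (even): nothing is processed and both backslashes remain.
assert decode_raw_unicode(r"\\u0062") == r"\\u0062"

# Three backslashes (odd): two remain, and the escape after them is processed.
assert decode_raw_unicode(r"\\\u0062") == "\\\\b"
```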
<P>
<DIV CLASS="navigation">
<hr />
<span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span>
</DIV>
<!--End of Navigation Panel-->
<ADDRESS>
See <i><a href="about.html">About this document...</a></i> for information on suggesting changes.
</ADDRESS>
</BODY>
</HTML>