<!DOCTYPE html PUBLIC
"-//W3C//DTD HTML 4.0 Transitional//EN">
<link rel=
"STYLESHEET" href=
"ref.css" type='text/css'
/>
<link rel=
"SHORTCUT ICON" href=
"../icons/pyfav.png" type=
"image/png" />
<link rel='start' href='../index.html' title='Python Documentation Index'
/>
<link rel=
"first" href=
"ref.html" title='Python Reference Manual'
/>
<link rel='contents' href='contents.html'
title=
"Contents" />
<link rel='index' href='genindex.html' title='Index'
/>
<link rel='last' href='about.html' title='About this document...'
/>
<link rel='help' href='about.html' title='About this document...'
/>
<link rel=
"next" href=
"string-catenation.html" />
<link rel=
"prev" href=
"literals.html" />
<link rel=
"parent" href=
"literals.html" />
<meta name='aesop' content='information'
/>
<title>2.4.1 String literals
</title>
<div id='top-navigation-panel' xml:id='top-navigation-panel'>
<table align="center" width="100%" cellpadding="0" cellspacing="2">
<tr>
<td class='online-navigation'><a rel="prev" title="2.4 Literals"
  href="literals.html"><img src='../icons/previous.png'
  border='0' height='32' alt='Previous Page' width='32' /></A></td>
<td class='online-navigation'><a rel="parent" title="2.4 Literals"
  href="literals.html"><img src='../icons/up.png'
  border='0' height='32' alt='Up One Level' width='32' /></A></td>
<td class='online-navigation'><a rel="next" title="2.4.2 String literal concatenation"
  href="string-catenation.html"><img src='../icons/next.png'
  border='0' height='32' alt='Next Page' width='32' /></A></td>
<td align="center" width="100%">Python Reference Manual</td>
<td class='online-navigation'><a rel="contents" title="Table of Contents"
  href="contents.html"><img src='../icons/contents.png'
  border='0' height='32' alt='Contents' width='32' /></A></td>
<td class='online-navigation'><img src='../icons/blank.png'
  border='0' height='32' alt='' width='32' /></td>
<td class='online-navigation'><a rel="index" title="Index"
  href="genindex.html"><img src='../icons/index.png'
  border='0' height='32' alt='Index' width='32' /></A></td>
</tr></table>
<div class='online-navigation'>
<b class="navlabel">Previous:</b>
<a class="sectref" rel="prev" href="literals.html">2.4 Literals</A>
<b class="navlabel">Up:</b>
<a class="sectref" rel="parent" href="literals.html">2.4 Literals</A>
<b class="navlabel">Next:</b>
<a class="sectref" rel="next" href="string-catenation.html">2.4.2 String literal concatenation</A>
</div>
</div>
<!--End of Navigation Panel-->
<H2><A NAME="SECTION004410000000000000000"></A><A NAME="strings"></A><a id='l2h-13' xml:id='l2h-13'></a>
2.4.1 String literals
</H2>
String literals are described by the following lexical definitions:
<div class="productions">
<table>
<tr>
<td><a id='tok-stringliteral' xml:id='tok-stringliteral'>stringliteral</a></td>
<td>::=</td>
<td>[<a class='grammartoken' href="strings.html#tok-stringprefix">stringprefix</a>](<a class='grammartoken' href="strings.html#tok-shortstring">shortstring</a> | <a class='grammartoken' href="strings.html#tok-longstring">longstring</a>)</td></tr>
<tr>
<td><a id='tok-stringprefix' xml:id='tok-stringprefix'>stringprefix</a></td>
<td>::=</td>
<td>"r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"</td></tr>
<tr>
<td><a id='tok-shortstring' xml:id='tok-shortstring'>shortstring</a></td>
<td>::=</td>
<td>"'" <a class='grammartoken' href="strings.html#tok-shortstringitem">shortstringitem</a>* "'"
| '"' <a class='grammartoken' href="strings.html#tok-shortstringitem">shortstringitem</a>* '"'</td></tr>
<tr>
<td><a id='tok-longstring' xml:id='tok-longstring'>longstring</a></td>
<td>::=</td>
<td>"'''" <a class='grammartoken' href="strings.html#tok-longstringitem">longstringitem</a>* "'''"</td></tr>
<tr>
<td></td>
<td></td>
<td><code>| '"""' <a class='grammartoken' href="strings.html#tok-longstringitem">longstringitem</a>* '"""'</code></td></tr>
<tr>
<td><a id='tok-shortstringitem' xml:id='tok-shortstringitem'>shortstringitem</a></td>
<td>::=</td>
<td><a class='grammartoken' href="strings.html#tok-shortstringchar">shortstringchar</a> | <a class='grammartoken' href="strings.html#tok-escapeseq">escapeseq</a></td></tr>
<tr>
<td><a id='tok-longstringitem' xml:id='tok-longstringitem'>longstringitem</a></td>
<td>::=</td>
<td><a class='grammartoken' href="strings.html#tok-longstringchar">longstringchar</a> | <a class='grammartoken' href="strings.html#tok-escapeseq">escapeseq</a></td></tr>
<tr>
<td><a id='tok-shortstringchar' xml:id='tok-shortstringchar'>shortstringchar</a></td>
<td>::=</td>
<td>&lt;any source character except "\" or newline or the quote&gt;</td></tr>
<tr>
<td><a id='tok-longstringchar' xml:id='tok-longstringchar'>longstringchar</a></td>
<td>::=</td>
<td>&lt;any source character except "\"&gt;</td></tr>
<tr>
<td><a id='tok-escapeseq' xml:id='tok-escapeseq'>escapeseq</a></td>
<td>::=</td>
<td>"\" &lt;any ASCII character&gt;</td></tr>
</table>
</div>
<a class=
"grammar-footer"
href=
"grammar.txt" type=
"text/plain"
>Download entire grammar as text.
</a>
One syntactic restriction not indicated by these productions is that
whitespace is not allowed between the
<a class='grammartoken'
href=
"strings.html#tok-stringprefix">stringprefix
</a> and
the rest of the string literal. The source character set is defined
by the encoding declaration; it is ASCII if no encoding declaration
is given in the source file; see section
<A href=
"encodings.html#encodings">2.1.4</A>.
In plain English: String literals can be enclosed in matching single
quotes (<code>'</code>) or double quotes (<code>"</code>). They can also be
enclosed in matching groups of three single or double quotes (these
are generally referred to as <em>triple-quoted strings</em>). The
backslash (<code>\</code>) character is used to escape characters that
otherwise have a special meaning, such as newline, backslash itself,
or the quote character. String literals may optionally be prefixed
with a letter "<tt class="character">r</tt>" or "<tt class="character">R</tt>"; such strings are called
<i class="dfn">raw strings</i><a id='l2h-14' xml:id='l2h-14'></a> and use different rules for interpreting
backslash escape sequences. A prefix of "<tt class="character">u</tt>" or "<tt class="character">U</tt>"
makes the string a Unicode string. Unicode strings use the Unicode character
set as defined by the Unicode Consortium and ISO 10646. Some additional
escape sequences, described below, are available in Unicode strings.
The two prefix characters may be combined; in this case, "<tt class="character">u</tt>" must
appear before "<tt class="character">r</tt>".
In triple-quoted strings,
unescaped newlines and quotes are allowed (and are retained), except
that three unescaped quotes in a row terminate the string. (A
"quote" is the character used to open the string, i.e. either
<code>'</code> or <code>"</code>.)
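The quoting styles described above can be compared directly. The following is a minimal sketch, written so that it runs on Python 3 as well; the variable names are illustrative only and do not come from the manual.

```python
# The same three-character string written with three quoting styles.
single = 'abc'
double = "abc"
triple = '''abc'''

# A triple-quoted literal keeps unescaped newlines and embedded quotes.
block = """He said "hi"
and left"""

print(single == double == triple)   # True
print(block.count("\n"))            # 1
```

Only the delimiters differ between the first three literals; the resulting string values are equal.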
Unless an "<tt class="character">r</tt>" or "<tt class="character">R</tt>" prefix is present, escape
sequences in strings are interpreted according to rules similar
to those used by Standard C. The recognized escape sequences are:
<div class="center"><table class="realtable">
<thead>
<tr>
<th class="left">Escape Sequence</th>
<th class="left">Meaning</th>
<th class="center">Notes</th>
</tr>
</thead>
<tbody>
<tr><td class="left" valign="baseline"><code>\<var>newline</var></code></td>
<td class="left">Ignored</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\\</code></td>
<td class="left">Backslash (<code>\</code>)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\'</code></td>
<td class="left">Single quote (<code>'</code>)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\"</code></td>
<td class="left">Double quote (<code>"</code>)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\a</code></td>
<td class="left">ASCII Bell (BEL)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\b</code></td>
<td class="left">ASCII Backspace (BS)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\f</code></td>
<td class="left">ASCII Formfeed (FF)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\n</code></td>
<td class="left">ASCII Linefeed (LF)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\N{<var>name</var>}</code></td>
<td class="left">Character named <var>name</var> in the Unicode database (Unicode only)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\r</code></td>
<td class="left">ASCII Carriage Return (CR)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\t</code></td>
<td class="left">ASCII Horizontal Tab (TAB)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\u<var>xxxx</var></code></td>
<td class="left">Character with 16-bit hex value <var>xxxx</var> (Unicode only)</td>
<td class="center">(1)</td></tr>
<tr><td class="left" valign="baseline"><code>\U<var>xxxxxxxx</var></code></td>
<td class="left">Character with 32-bit hex value <var>xxxxxxxx</var> (Unicode only)</td>
<td class="center">(2)</td></tr>
<tr><td class="left" valign="baseline"><code>\v</code></td>
<td class="left">ASCII Vertical Tab (VT)</td>
<td class="center"></td></tr>
<tr><td class="left" valign="baseline"><code>\<var>ooo</var></code></td>
<td class="left">Character with octal value <var>ooo</var></td>
<td class="center">(3,5)</td></tr>
<tr><td class="left" valign="baseline"><code>\x<var>hh</var></code></td>
<td class="left">Character with hex value <var>hh</var></td>
<td class="center">(4,5)</td></tr>
</tbody>
</table></div>
<DL>
<DT>(1)
<DD>Individual code units which form parts of a surrogate pair can be
encoded using this escape sequence.
<DT>(2)
<DD>Any Unicode character can be encoded this way, but characters
outside the Basic Multilingual Plane (BMP) will be encoded using a
surrogate pair if Python is compiled to use 16-bit code units (the
default). Individual code units which form parts of a surrogate
pair can be encoded using this escape sequence.
<DT>(3)
<DD>As in Standard C, up to three octal digits are accepted.
<DT>(4)
<DD>Unlike in Standard C, exactly two hex digits are required.
<DT>(5)
<DD>In a string literal, hexadecimal and octal escapes denote the
byte with the given value; it is not necessary that the byte
encodes a character in the source character set. In a Unicode
literal, these escapes denote a Unicode character with the given
value.
</DL>
Unlike Standard
<a id='l2h-15' xml:id='l2h-15'></a>C,
all unrecognized escape sequences are left in the string unchanged,
i.e., <em>the backslash is left in the string</em>. (This behavior is
useful when debugging: if an escape sequence is mistyped, the
resulting output is more easily recognized as broken.) It is also
important to note that the escape sequences marked as "(Unicode
only)" in the table above fall into the category of unrecognized
escapes for non-Unicode string literals.
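A few of the recognized escape sequences from the table can be checked with simple comparisons. This is a small sketch rather than an exhaustive test; it uses only escapes that behave the same way on Python 3, and the variable names are illustrative.

```python
newline = "\n"            # one character: ASCII linefeed
hex_a = "\x41"            # hex escape for "A"
octal_a = "\101"          # octal escape for "A"
hex_then_digit = "\x414"  # only two hex digits are consumed, so this is "A" + "4"
octal_then_digit = "\1010"  # at most three octal digits, so this is "A" + "0"

# A backslash followed by a newline inside a literal is ignored.
joined = "ab\
cd"

print(len(newline), hex_a, octal_a, hex_then_digit, octal_then_digit, joined)
```

The `hex_then_digit` and `octal_then_digit` cases show where each numeric escape stops consuming digits.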
When an "<tt class="character">r</tt>" or "<tt class="character">R</tt>" prefix is present, a character
following a backslash is included in the string without change, and
<em>all backslashes are left in the string</em>. For example, the string literal
<code>r"\n"</code> consists of two characters: a backslash and a lowercase
"<tt class="character">n</tt>". String quotes can be escaped with a backslash, but the
backslash remains in the string; for example, <code>r"\""</code> is a valid string
literal consisting of two characters: a backslash and a double quote;
<code>r"\"</code> is not a valid string literal (even a raw string cannot
end in an odd number of backslashes). Specifically, <em>a raw
string cannot end in a single backslash</em> (since the backslash would
escape the following quote character). Note also that a single
backslash followed by a newline is interpreted as those two characters
as part of the string, <em>not</em> as a line continuation.
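The raw-string rules above can be illustrated with a short sketch. The helper <code>is_valid_literal</code> is a hypothetical name introduced here for illustration only; it simply asks the compiler whether a piece of source text is a valid expression. The sketch is written to run on Python 3.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as an expression (illustrative helper)."""
    try:
        compile(source, "<literal>", "eval")
        return True
    except SyntaxError:
        return False

raw_n = r"\n"        # two characters: backslash and "n"
cooked_n = "\n"      # one character: linefeed
raw_quote = r"\""    # two characters: backslash and double quote

print(len(raw_n), len(cooked_n), len(raw_quote))   # 2 1 2
print(is_valid_literal('r"\\""'))   # source text r"\""  is accepted
print(is_valid_literal('r"\\"'))    # source text r"\"   is rejected
```

The last two calls pass the literal as source text so that the invalid form can be rejected at run time instead of breaking the example itself.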
When an "<tt class="character">r</tt>" or "<tt class="character">R</tt>" prefix is used in conjunction
with a "<tt class="character">u</tt>" or "<tt class="character">U</tt>" prefix, then the <code>\uXXXX</code>
escape sequence is processed while <em>all other backslashes are
left in the string</em>. For example, the string literal
<code>ur"\u0062\n"</code> consists of three Unicode characters: 'LATIN
SMALL LETTER B', 'REVERSE SOLIDUS', and 'LATIN SMALL LETTER N'.
Backslashes can be escaped with a preceding backslash; however, both
remain in the string. As a result, <code>\uXXXX</code> escape sequences
are only recognized when there is an odd number of backslashes.
<span class="release-info">Release 2.4.2, documentation updated on 28 September 2005.</span>
<!--End of Navigation Panel-->
See <i><a href="about.html">About this document...</a></i> for information on suggesting changes.