string literal

From cppreference.com
 
 

Syntax

" (unescaped_character|escaped_character)* " (1)
L " (unescaped_character|escaped_character)* " (2)
u8 " (unescaped_character|escaped_character)* " (3) (since C++11)
u " (unescaped_character|escaped_character)* " (4) (since C++11)
U " (unescaped_character|escaped_character)* " (5) (since C++11)
prefix(optional) R "delimiter( raw_character* )delimiter" (6) (since C++11)

Explanation

unescaped_character - Any valid character except the double quote ", the backslash \, and the newline character
escaped_character - See escape sequences
prefix - One of L, u8, u, U
delimiter - A sequence of at most 16 source characters, excluding parentheses, backslash, and whitespace (can be empty)
raw_character - Any character; the sequence of raw characters must not contain the closing sequence )delimiter"


1) Narrow multibyte string literal. The type of an unprefixed string literal is const char[N], where N is the size of the string in code units, including the terminating null character.
2) Wide string literal. The type of an L"..." string literal is const wchar_t[N].
3) UTF-8 encoded string literal. The type of a u8"..." string literal is const char[N] (const char8_t[N] since C++20).
4) UTF-16 encoded string literal. The type of a u"..." string literal is const char16_t[N].
5) UTF-32 encoded string literal. The type of a U"..." string literal is const char32_t[N].
6) Raw string literal. No escape sequences are processed: everything between the delimiters becomes part of the string. If prefix is present, it has the same meaning as described above.

Notes

  • A null character ('\0', L'\0', char16_t(), etc.) is always appended to the string literal: thus, the string literal "Hello" is a const char[6] holding the characters 'H', 'e', 'l', 'l', 'o', and '\0'.
  • String literals placed side-by-side are concatenated during compilation. That is, "Hello,"  " world!" yields the (single) string "Hello, world!".
    • If the two strings have the same encoding prefix (or neither has one), the resulting string will have the same encoding prefix (or no prefix).
    • If one of the strings has an encoding prefix and the other doesn't, the one that doesn't will be considered to have the same encoding prefix as the other.
    • If a UTF-8 string literal and a wide string literal are side by side, the program is ill-formed.
    • Any other combination of encoding prefixes may or may not be supported by the implementation. The result of such a concatenation is implementation-defined.
  • String literals have static storage duration, and thus exist in memory for the life of the program.
  • String literals can be used to initialize character arrays. If an array is initialized like char str[] = "foo";, str will contain a copy of the string "foo".
  • The compiler is allowed, but not required, to merge identical string literals into a single object. As a result, two identical string literals may or may not compare equal when compared by pointer: whether the expression "foo" == "foo" evaluates to true is implementation-defined.
  • In C, string literals are of type char[], and can be assigned directly to a (non-const) char*. C++03 allowed it as well (but deprecated it, as literals are const in C++). C++11 no longer allows such assignments without a cast.
  • Attempting to modify a string literal results in undefined behavior.

Example

#include <iostream>
 
char array1[] = "Foo" "bar";
// same as
char array2[] = { 'F', 'o', 'o', 'b', 'a', 'r', '\0' };
 
const char* s1 = R"foo(
Hello
World
)foo";
// same as
const char* s2 = "\nHello\nWorld\n";
 
int main()
{
    std::cout << array1 << '\n' << array2 << '\n';
 
    std::cout << s1 << s2;
}

Output:

Foobar
Foobar
 
Hello
World
 
Hello
World