The development, refinement, and deployment of SE-0200 Enhancing String Literals Delimiters to Support Raw Text was a long and surprising journey. It ended with a uniquely Swift take on “raw strings” that focused on adding custom delimiters to string literals and escape sequences.
This post discusses what raw strings are, how Swift designed its take on this technology, and how you can use this new Swift 5 feature in your code.
Table of Contents
Escape Sequences
Escape sequences are backslash-prepended combinations like \\
, \"
, and \u{n}
that incorporate characters that would otherwise be hard to express inside a normal string literal. Swift escape sequences include:
- The special characters
\0
(null character),\\
(backslash),\t
(horizontal tab),\n
(line feed),\r
(carriage return),\"
(double quotation mark) and\'
(single quotation mark) - Arbitrary Unicode scalars, written as
\u{n}
, where n is a 1–8 digit hexadecimal number with a value equal to a valid Unicode code point - Interpolated expressions, introduced by
\(
and terminated by)
. Swift’s interpolation feature offers a powerful and compiler-checked way to add content to strings. It is one of the language’s highlights.
For example, the string literal "hello\n\n\tworld"
consists of three lines, with “hello” on the first and “world” on the third. “world” is indented by a single tab:
A raw string, in contrast, ignores escape sequences and treats all content as literal characters. In a raw string, \n
represents the backslash character followed by the letter n rather than a line feed. This feature is used in applications that produce code output, that work with regular expressions, that use in-app source code (for example, when interactively teaching a language), and for pre-escaped domain-specific content like JSON and XML.
Raw Strings
Raw strings are used in many languages including C#, Perl, Rust, Python, Ruby, and Scala. A raw string does not interpret escape sequences. Its content continues until it reaches the string’s end delimiter, which varies by language, as in the following table:
Syntax | Language(s) |
---|---|
'Hello, world!' |
Bourne shell, Perl, PHP, Ruby, Windows PowerShell |
q(Hello, world!) |
Perl (alternate) |
%q(Hello, world!) |
Ruby (alternate) |
@"Hello, world!" |
C#, F# |
R"(Hello, world!)" |
C++11 |
r"Hello, world!" |
D, Python |
r#"Hello, world!"# |
Rust |
"""hello \' world""" and raw"Hello, world!" |
Scala |
`Hello, world!` |
D, Go, `…` |
``...`` |
Java, any number of ` |
Most languages adopt a prefix (like q
, R
, or r
) to indicate raw content. Rust and Java go beyond this to allow customizable delimiters. This feature allows variations of the delimiter to be included within the string, allowing more expressive raw string content.
Multi-Line Swift Strings
SE-0168 Multi-Line String Literals not only introduced a way to create string literals with more than one line and no new-line escapes, it also provided a hint of the direction the Swift language would take in terms of custom delimiters. Since multi-line strings used three quotes """
to start and end literals, they allowed individual quote marks and new lines without escape sequences. Under the new system, this literal:
"\"Either it brings tears to their eyes, or else -\"\n\n\"Or else what?\" said Alice, for the Knight had made a sudden pause.\n\n\"Or else it doesn't, you know.\""
became this:
"""
"Either it brings tears to their eyes, or else -"
"Or else what?" said Alice, for the Knight had made a sudden pause.
"Or else it doesn't, you know."
"""
Quote and newline backslashes evaporate in the new syntax. The resulting string literal is clear, readable, and inspectable. In introducing the new delimiter and multi-line support, new-line and quote marks can be used without escapes, taking the first steps forward towards better literals.
Multi-line literals did not lose any of Swift’s string power. They support escapes, including interpolation, unicode character insertion, and so forth. At the same time, the feature set the standard for what Swift “raw” strings should look like.
Swift Raw Strings: Take One
SE-0200 first entered review in March 2018. Its initial design added a single r
prefix to single and multi-line strings. The community disliked the design (“The proposed r"..."
syntax didn’t fit well with the rest of the language”) and felt it wasn’t expansive enough to support enough use-cases. The proposal was returned for revision in April 2018. It was time to search for a better design, better use-cases, and a more Swift-aligned expression.
Revisiting design involved an extensive review of raw strings in other languages, eventually focussing on Rust. Rust not only supports raw strings, it uses customizable delimiters. You can create raw strings with r#""#
, r##""##
, r###""###
, and so forth. You choose the number of pound signs to pad each side of the string literal. In the unlikely circumstance you needed to include "#
in a string, which would normally terminate a basic raw string, these custom delimiters ensure you can add a second pound sign, allowing you to adjust the way the string ends.
Yes, it is extremely rare you ever need more than one pound sign but Rust’s design takes that rarity into account. It creates an expansible and customizable system that offers coverage of even the most outlandish edge cases. That strength is impressive and core to Swift’s eventual design. In its revision, SE-0200 dropped the r
(which stands for “raw”) while adopting the adaptable Rust-style pound signs on each side of the literal. As in Rust, each Swift string literal must use the same number of pounds before and after, whether working with single- or multi-line strings.
At that point, inspiration struck as the SE-0200 team realized that custom delimiters offered more power than plain raw strings.
Customizable Delimiters
When using the updated raw strings design, time and again the team regretted the loss of string interpolation. By definition, raw strings do not use escape sequences. Interpolation depends on them. It was SE-0200 co-author Brent Royal-Gordon who had the flash of insight that we could incorporate the Rust-inspired syntax while retaining access to escape sequences.
Instead of creating raw strings, SE-0200 introduced something similar: a blend of the alternate delimiters Swift first encountered in multi-line strings and the customizable delimiters from Rust. By extending that customization to escape sequences, SE-0200’s design inherited all the power of raw strings and the convenience of Swift interpolation.
SE-0200 adds custom delimiters at the start and end of each string literal and, in lockstep, customizes the escape sequence delimiter from a simple backslash to one decorated with pound-signs. This design matches escape sequences to the number of pound-signs for the string literal. For a ""
string, the escape token is \
. For #""#
, it is \#
, and ##""##
it is \##
, and so forth.
By adding escape sequences – this modification supports all of them, not just interpolation – Swift’s #-annotated strings were no longer “raw”. They support the same features you find in raw strings, they mostly act like raw strings, however the design incorporates escaping, which means the literals are not raw. If you feel fanciful, you can call them “medium rare” strings.
Any time you include what would otherwise be recognized as an escape sequence, you can extend the number of delimiter pound-signs until the contents are no longer interpreted. It is rare to need this feature but when used, just one or two pound signs should both support interpolation in some parts of your string and disallow it in others:
"\(thisInterpolates)"
#"\(thisDoesntInterpolate) \#(thisInterpolates)"#
##"\(thisDoesntInterpolate) \#(thisDoesntInterpolate) \##(thisInterpolates)"##
"\n" // new line
#"\n"# // backslash plus n
#"\#n"# // new line
Adopting SE-0200 Strings In Your Code
In Swift 5, each of the following literals declares the string “Hello”, even though they use a variety of single and multi-line styles:
let u = "Hello" // No pounds
let v = #"Hello"# // One pound
let w = ####"Hello"#### // Many pounds
let x = "\("Hello")" // Interpolation
let y = #"\#("Hello")"# // Interpolation with pound
let z = """ // Multiline
Hello
"""
let a = #""" // Multiline with pound
Hello
"""#
The rules are as follows:
- Match the number of pound-signs before and after a string literal, from zero to however many. “Zero” or “one” are almost always the right answer for “however many”.
- When using pound-signs, you change the escape sequence from a single backslash to a backslash infixed with the same number of pound signs. A
##"Hello"##
string uses a\##
escape sequence. - Anything that doesn’t match the closing delimiter is part of the string. To add
"""
to a multiline string without escaping, change the delimiter by adding a pound-sign. - Use the fewest pound signs required for the results you need. Zero is best. One is fine. Two or more should be very, very rare.
With SE-0200, anyone writing code generation apps like PaintCode or Kite Compositor, writing network code with escaped-JSON, or including backslash-heavy ASCII clip art, can paste and go. Add pound-signs as needed, without sacrificing the convenience of string interpolation or escape sequences.
These delimiters ensure your code remains free of escape clutter. The results are cleaner. They’re easier to read and to cut/paste into your codebase. You’ll be able to test, reconfigure, and adapt raw content without the hurdles of escaping and unescaping that otherwise limit your development.
Read more about Swift’s new custom string delimiters in the SE-0200 proposal. It includes further details, many examples, and explores alternate designs that were considered and rejected.
Please feel free to post questions about this post on the associated thread on the Swift forums.
Behind the Proposal — SE-0200 Enhancing String Literals Delimiters to Support Raw Text
Leave a Reply