let mut message = String::new();
message.push_str("Hello ");
message.push_str("World!");
Rust has more than one string type, which is useful in some cases but also causes a lot of confusion.
The String type is the standard owned string. It allocates its contents on the heap
and is therefore used for mutable strings whose size is not known at compile
time.
String literals, on the other hand, are stored in the binary, and you are given an
immutable reference to them: a string slice of type &str. Additionally, you can
always borrow a String, or a subslice of one, as a &str, so it's good practice to
have your function parameters use &str.
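As a rough illustration of the paragraph above, the sketch below passes a literal, an owned String, and a subslice to the same &str parameter. The helper name shout is made up for this example.

```rust
// Hypothetical helper for illustration: it accepts any string slice.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    // A literal lives in the binary; its type is &'static str.
    let literal: &'static str = "hello";

    // A String owns a growable heap buffer.
    let mut owned = String::from("hello");
    owned.push_str(" world");

    println!("{}", shout(literal));
    // &String coerces to &str, so the same function accepts it.
    println!("{}", shout(&owned));
    // A subslice of a String is also a &str (byte range 0..5 here).
    println!("{}", shout(&owned[0..5]));
}
```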
which type to use?
Use String by default.
function parameters
Prefer &str for function parameters:
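struct Person {
    // Owned data stored in a struct uses String.
    name: String,
}

// Taking &str lets callers pass literals and borrowed Strings alike.
fn first_word(words: &str) -> String {
    words
        .split_whitespace()
        .next()
        .expect("words should not be empty")
        .to_string()
}

fn main() {
    let sentence = "Hello, world!";
    println!("{}", first_word(sentence));

    let owned = String::from("A string");
    // Passing &owned only borrows it, so it can be used again afterwards.
    println!("{}", first_word(&owned));
    println!("{}", first_word(&owned));
}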
return type
If the value your function returns is a slice of one of its arguments and the
body does not need to modify it, return &str. If that causes trouble (for example,
the result has to be built or changed inside the function), return String instead:
struct Person {
    name: String,
}

// we're returning a substring of words, so &str is appropriate
fn first_word(words: &str) -> &str {
    words
        .split_whitespace()
        .next()
        .expect("words should not be empty")
}

fn main() {
    let sentence = "Hello, world!";
    println!("{}", first_word(sentence));

    let owned = String::from("A string");
    println!("{}", first_word(&owned));
    println!("{}", first_word(&owned));
}
utf-8
Strings and string literals are UTF-8 encoded. However, this does not mean indexing into them is always straightforward.
let hello = String::from("السلام عليكم");
let hello = String::from("Dobrý den");
let hello = String::from("Hello");
let hello = String::from("שלום");
let hello = String::from("नमस्ते");
let hello = String::from("こんにちは");
let hello = String::from("안녕하세요");
let hello = String::from("你好");
let hello = String::from("Olá");
let hello = String::from("Здравствуйте");
let hello = String::from("Hola");
indexing and length
In many other programming languages, accessing individual characters in a string by referencing them by index is a valid and common operation. However, if you try to access a single character of a String with indexing syntax such as s[0] in Rust, you’ll get a compile-time error.
Unicode is complicated. When asking for the length or
index of a string we could be referring to bytes, to chars (Unicode scalar
values), or to grapheme clusters (visual symbols). For example, the word नमस्ते
is 18 bytes, 6 chars and ultimately just 4 grapheme clusters.
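The byte and char counts above can be checked with the standard library alone. The following is a minimal sketch; the helper name byte_and_char_len is an assumption made for this example. Grapheme clusters are left out because the standard library does not count them.

```rust
// Hypothetical helper: returns (length in bytes, number of chars).
fn byte_and_char_len(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    let word = "नमस्ते";
    let (bytes, chars) = byte_and_char_len(word);
    println!("{bytes} bytes, {chars} chars");

    // Range indexing works on byte offsets. Each char in this word
    // takes 3 bytes, so 0..3 covers exactly the first char.
    println!("{}", &word[0..3]);

    // A range that ends inside a char would panic with [..];
    // get() returns None instead.
    println!("{:?}", word.get(0..1));
}
```

Using get() rather than [..] is a reasonable default whenever the byte offsets come from untrusted input.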
by chars
Rust's standard library supports iterating over individual Unicode scalar values (the char type) with .chars(). However, THESE ARE NOT THE SAME AS DISPLAYED CHARACTERS: a single visible symbol may be made up of several chars.
for c in "Зд".chars() {
    println!("{c}");
}
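To make the difference between chars and displayed characters concrete, here is a small sketch. The helper name char_count is hypothetical. It compares two ways of writing "é" and then shows char_indices(), which pairs each char with its starting byte offset.

```rust
// Hypothetical helper: counts Unicode scalar values (chars),
// not displayed symbols.
fn char_count(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // one displayed symbol, but two chars.
    let decomposed = "e\u{301}";
    // The same symbol as the single precomposed char U+00E9.
    let precomposed = "\u{e9}";

    println!("{} -> {} chars", decomposed, char_count(decomposed));
    println!("{} -> {} chars", precomposed, char_count(precomposed));

    // Each Cyrillic letter here is two bytes long in UTF-8.
    for (offset, c) in "Зд".char_indices() {
        println!("{offset}: {c}");
    }
}
```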
by grapheme cluster
Rust's standard library does not currently include support for grapheme clusters, as the rules change from time to time and can be fairly complex. A popular library for this is https://crates.io/crates/unicode-segmentation
There are various other Unicode libraries for more specific tasks, like figuring out which characters you might wish to accept in an identifier (username, hashtag, etc.): https://crates.io/crates/unicode-ident
by line
Rust has a helpful method to handle line-by-line iteration of strings,
conveniently named lines.
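A short sketch of lines() in use follows. The helper name collect_lines is made up for this example.

```rust
// Hypothetical helper: collects the lines of a text into owned Strings.
fn collect_lines(text: &str) -> Vec<String> {
    text.lines().map(String::from).collect()
}

fn main() {
    // lines() splits on \n and also strips a \r before it, so
    // Windows line endings are handled too. The line terminator
    // itself is not part of the yielded slices.
    let text = "first\nsecond\r\nthird\n";
    for (number, line) in text.lines().enumerate() {
        println!("{}: {line}", number + 1);
    }
    println!("{:?}", collect_lines(text));
}
```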
concatenation
let s1 = String::from("Hello, ");
let s2 = String::from("world!");
let s3 = s1 + &s2; // note s1 has been moved here and can no longer be used
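The + operator uses the add method, whose signature looks something like
this: fn add(self, s: &str) -> String. It takes ownership of the left-hand
String and appends a copy of the right-hand slice, which is why s1 is moved
while s2 only needs to be borrowed.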
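If the left-hand string is still needed afterwards, one option is to clone it or to append in place. The sketch below shows both; the helper name join is an assumption made for this example.

```rust
// Hypothetical helper: joins two strings with + without consuming
// either argument.
fn join(left: &str, right: &str) -> String {
    // to_owned() creates the owned String that + takes by value.
    left.to_owned() + right
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");

    // Cloning s1 leaves the original usable after the concatenation.
    let s3 = s1.clone() + &s2;
    println!("{s1}{s2} / {s3}");

    // push_str appends in place instead of building a new String.
    let mut s4 = String::from("Hello, ");
    s4.push_str(&s2);
    println!("{s4} / {}", join(&s1, &s2));
}
```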
formatting
The format! macro works like println!, but instead of printing the output to
the screen, it returns a String with the contents. It uses references, so this
call won't take ownership of any of its parameters:
let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");
let s = format!("{s1}-{s2}-{s3}");
notable methods
.trim()
Removes leading and trailing whitespace.
.parse()
Parses a string into any other type that implements the FromStr trait.
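These two methods are often used together, for example when reading a number from user input. A minimal sketch, with the helper name parse_number made up for illustration:

```rust
// Hypothetical helper: parses text into an i32, ignoring
// surrounding whitespace. Returns None if parsing fails.
fn parse_number(input: &str) -> Option<i32> {
    input.trim().parse::<i32>().ok()
}

fn main() {
    println!("{:?}", parse_number("  42\n"));
    println!("{:?}", parse_number("forty-two"));

    // Without trim(), the surrounding whitespace makes parsing fail.
    println!("{}", "  42\n".parse::<i32>().is_err());
}
```

The turbofish (::<i32>) tells parse() which FromStr implementation to use; a type annotation on the variable works as well.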
string literal
When declaring a large multi-line string literal you can begin with a \ which
will instruct Rust to skip the first newline character (a \ at the end of a
line skips the newline and any leading whitespace on the next line):
let contents = "\
Rust:
safe, fast, productive.
Pick three.";
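To see the effect of the backslash, the sketch below wraps the literal in a function and inspects it. The function names poem and joined are hypothetical.

```rust
// Hypothetical wrapper around the literal shown above.
fn poem() -> &'static str {
    "\
Rust:
safe, fast, productive.
Pick three."
}

// The same escape also works mid-string: the newline and the
// indentation of the next line are both skipped.
fn joined() -> &'static str {
    "one \
     two"
}

fn main() {
    // The text starts directly with "Rust:", not with a blank line.
    println!("{}", poem());
    println!("{}", poem().lines().count());
    println!("{}", joined());
}
```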