Validating References with Lifetimes
Lifetimes are another kind of generic that we’ve already been using. Rather than ensuring that a type has the behavior we want, lifetimes ensure that references are valid as long as we need them to be.
One detail we didn’t discuss in the “References and Borrowing” section in Chapter 4 is that every reference in Rust has a lifetime, which is the scope for which that reference is valid. Most of the time, lifetimes are implicit and inferred, just like most of the time, types are inferred. We are only required to annotate types when multiple types are possible. In a similar way, we must annotate lifetimes when the lifetimes of references could be related in a few different ways. Rust requires us to annotate the relationships using generic lifetime parameters to ensure that the actual references used at runtime will definitely be valid.
Annotating lifetimes is not even a concept most other programming languages have, so this is going to feel unfamiliar. Although we won’t cover lifetimes in their entirety in this chapter, we’ll discuss common ways you might encounter lifetime syntax so that you can get comfortable with the concept.
Dangling References
The main aim of lifetimes is to prevent dangling references, which, if they were allowed to exist, would cause a program to reference data other than the data it’s intended to reference. Consider the program in Listing 10-16, which has an outer scope and an inner scope.
fn main() {
    let r;

    {
        let x = 5;
        r = &x;
    }

    println!("r: {r}");
}
Note: The examples in Listings 10-16, 10-17, and 10-23 declare variables without giving them an initial value, so the variable name exists in the outer scope. At first glance, this might appear to be in conflict with Rust having no null values. However, if we try to use a variable before giving it a value, we’ll get a compile-time error, which shows that indeed Rust does not allow null values.
The outer scope declares a variable named r with no initial value, and the
inner scope declares a variable named x with the initial value of 5. Inside
the inner scope, we attempt to set the value of r as a reference to x.
Then, the inner scope ends, and we attempt to print the value in r. This code
won’t compile, because the value that r is referring to has gone out of scope
before we try to use it. Here is the error message:
$ cargo run
   Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0597]: `x` does not live long enough
 --> src/main.rs:6:13
  |
5 |         let x = 5;
  |             - binding `x` declared here
6 |         r = &x;
  |             ^^ borrowed value does not live long enough
7 |     }
  |     - `x` dropped here while still borrowed
8 |
9 |     println!("r: {r}");
  |                   - borrow later used here
For more information about this error, try `rustc --explain E0597`.
error: could not compile `chapter10` (bin "chapter10") due to 1 previous error
The error message says that the variable x “does not live long enough.” The
reason is that x will be out of scope when the inner scope ends on line 7.
But r is still valid for the outer scope; because its scope is larger, we say
that it “lives longer.” If Rust allowed this code to work, r would be
referencing memory that was deallocated when x went out of scope, and
anything we tried to do with r wouldn’t work correctly. So, how does Rust
determine that this code is invalid? It uses a borrow checker.
The Borrow Checker
The Rust compiler has a borrow checker that compares scopes to determine whether all borrows are valid. Listing 10-17 shows the same code as Listing 10-16 but with annotations showing the lifetimes of the variables.
fn main() {
    let r;                // ---------+-- 'a
                          //          |
    {                     //          |
        let x = 5;        // -+-- 'b  |
        r = &x;           //  |       |
    }                     // -+       |
                          //          |
    println!("r: {r}");   //          |
}                         // ---------+
Here, we’ve annotated the lifetime of r with 'a and the lifetime of x
with 'b. As you can see, the inner 'b block is much smaller than the outer
'a lifetime block. At compile time, Rust compares the size of the two
lifetimes and sees that r has a lifetime of 'a but that it refers to memory
with a lifetime of 'b. The program is rejected because 'b is shorter than
'a: The subject of the reference doesn’t live as long as the reference.
Listing 10-18 fixes the code so that it doesn’t have a dangling reference and it compiles without any errors.
fn main() {
    let x = 5;            // ----------+-- 'b
                          //           |
    let r = &x;           // --+-- 'a  |
                          //   |       |
    println!("r: {r}");   //   |       |
                          // --+       |
}                         // ----------+
Here, x has the lifetime 'b, which in this case is larger than 'a. This
means r can reference x because Rust knows that the reference in r will
always be valid while x is valid.
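The diagrams above treat lifetimes as whole scopes, which is a useful simplification. As a complementary sketch (the function name `read_inside_scope` is hypothetical and not one of the listings), the variant below keeps `r` declared in the outer scope, as in Listing 10-16, but moves the last use of `r` inside the inner scope. With current Rust compilers this should compile, because the borrow of `x` only needs to be valid up to the last point where `r` is used:

```rust
// Hypothetical helper, not part of the chapter's listings.
fn read_inside_scope() -> i32 {
    let r;
    let value;

    {
        let x = 5;
        r = &x;
        // Last use of `r`: `x` is still alive here, so the reference is valid.
        value = *r;
    } // `x` is dropped here, but `r` is never used again after this point.

    value
}

fn main() {
    println!("r: {}", read_inside_scope());
}
```

The difference from Listing 10-16 is only where `r` is read: reading it after the inner scope ends is what makes the reference dangle.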
Now that you know where the lifetimes of references are and how Rust analyzes lifetimes to ensure that references will always be valid, let’s explore generic lifetimes in function parameters and return values.
Generic Lifetimes in Functions
We’ll write a function that returns the longer of two string slices. This
function will take two string slices and return a single string slice. After
we’ve implemented the longest function, the code in Listing 10-19 should
print The longest string is abcd.
fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {result}");
}
Note that we want the function to take string slices, which are references,
rather than strings, because we don’t want the longest function to take
ownership of its parameters. Refer to “String Slices as
Parameters” in Chapter 4 for more
discussion about why the parameters we use in Listing 10-19 are the ones we
want.
If we try to implement the longest function as shown in Listing 10-20, it
won’t compile.
fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {result}");
}

fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() { x } else { y }
}
Instead, we get the following error that talks about lifetimes:
$ cargo run
   Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0106]: missing lifetime specifier
 --> src/main.rs:9:33
  |
9 | fn longest(x: &str, y: &str) -> &str {
  |               ----     ----     ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
help: consider introducing a named lifetime parameter
  |
9 | fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
  |           ++++     ++          ++          ++
For more information about this error, try `rustc --explain E0106`.
error: could not compile `chapter10` (bin "chapter10") due to 1 previous error
The help text reveals that the return type needs a generic lifetime parameter
on it because Rust can’t tell whether the reference being returned refers to
x or y. Actually, we don’t know either, because the if block in the body
of this function returns a reference to x and the else block returns a
reference to y!
When we’re defining this function, we don’t know the concrete values that will
be passed into this function, so we don’t know whether the if case or the
else case will execute. We also don’t know the concrete lifetimes of the
references that will be passed in, so we can’t look at the scopes as we did in
Listings 10-17 and 10-18 to determine whether the reference we return will
always be valid. The borrow checker can’t determine this either, because it
doesn’t know how the lifetimes of x and y relate to the lifetime of the
return value. To fix this error, we’ll add generic lifetime parameters that
define the relationship between the references so that the borrow checker can
perform its analysis.
Lifetime Annotation Syntax
Lifetime annotations don’t change how long any of the references live. Rather, they describe the relationships of the lifetimes of multiple references to each other without affecting the lifetimes. Just as functions can accept any type when the signature specifies a generic type parameter, functions can accept references with any lifetime by specifying a generic lifetime parameter.
Lifetime annotations have a slightly unusual syntax: The names of lifetime
parameters must start with an apostrophe (') and are usually all lowercase
and very short, like generic types. Most people use the name 'a for the first
lifetime annotation. We place lifetime parameter annotations after the & of a
reference, using a space to separate the annotation from the reference’s type.
Here are some examples—a reference to an i32 without a lifetime parameter, a
reference to an i32 that has a lifetime parameter named 'a, and a mutable
reference to an i32 that also has the lifetime 'a:
&i32        // a reference
&'a i32     // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime
One lifetime annotation by itself doesn’t have much meaning, because the
annotations are meant to tell Rust how generic lifetime parameters of multiple
references relate to each other. Let’s examine how the lifetime annotations
relate to each other in the context of the longest function.
In Function Signatures
To use lifetime annotations in function signatures, we need to declare the generic lifetime parameters inside angle brackets between the function name and the parameter list, just as we did with generic type parameters.
We want the signature to express the following constraint: The returned
reference will be valid as long as both of the parameters are valid. This is
the relationship between lifetimes of the parameters and the return value.
We’ll name the lifetime 'a and then add it to each reference, as shown in
Listing 10-21.
fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {result}");
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
This code should compile and produce the result we want when we use it with the
main function in Listing 10-19.
The function signature now tells Rust that for some lifetime 'a, the function
takes two parameters, both of which are string slices that live at least as
long as lifetime 'a. The function signature also tells Rust that the string
slice returned from the function will live at least as long as lifetime 'a.
In practice, it means that the lifetime of the reference returned by the
longest function is the same as the smaller of the lifetimes of the values
referred to by the function arguments. These relationships are what we want
Rust to use when analyzing this code.
Remember, when we specify the lifetime parameters in this function signature,
we’re not changing the lifetimes of any values passed in or returned. Rather,
we’re specifying that the borrow checker should reject any values that don’t
adhere to these constraints. Note that the longest function doesn’t need to
know exactly how long x and y will live, only that some scope can be
substituted for 'a that will satisfy this signature.
When annotating lifetimes in functions, the annotations go in the function signature, not in the function body. The lifetime annotations become part of the contract of the function, much like the types in the signature. Having function signatures contain the lifetime contract means the analysis the Rust compiler does can be simpler. If there’s a problem with the way a function is annotated or the way it is called, the compiler errors can point to the part of our code and the constraints more precisely. If, instead, the Rust compiler made more inferences about what we intended the relationships of the lifetimes to be, the compiler might only be able to point to a use of our code many steps away from the cause of the problem.
When we pass concrete references to longest, the concrete lifetime that is
substituted for 'a is the part of the scope of x that overlaps with the
scope of y. In other words, the generic lifetime 'a will get the concrete
lifetime that is equal to the smaller of the lifetimes of x and y. Because
we’ve annotated the returned reference with the same lifetime parameter 'a,
the returned reference will also be valid for the length of the smaller of the
lifetimes of x and y.
Let’s look at how the lifetime annotations restrict the longest function by
passing in references that have different concrete lifetimes. Listing 10-22 is
a straightforward example.
fn main() {
    let string1 = String::from("long string is long");

    {
        let string2 = String::from("xyz");
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {result}");
    }
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
In this example, string1 is valid until the end of the outer scope, string2
is valid until the end of the inner scope, and result references something
that is valid until the end of the inner scope. Run this code and you’ll see
that the borrow checker approves; it will compile and print The longest string is long string is long.
Next, let’s try an example that shows that the lifetime of the reference in
result must be the smaller lifetime of the two arguments. We’ll move the
declaration of the result variable outside the inner scope but leave the
assignment of the value to the result variable inside the scope with
string2. Then, we’ll move the println! that uses result to outside the
inner scope, after the inner scope has ended. The code in Listing 10-23 will
not compile.
fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {result}");
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
When we try to compile this code, we get this error:
$ cargo run
   Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0597]: `string2` does not live long enough
 --> src/main.rs:6:44
  |
5 |         let string2 = String::from("xyz");
  |             ------- binding `string2` declared here
6 |         result = longest(string1.as_str(), string2.as_str());
  |                                            ^^^^^^^ borrowed value does not live long enough
7 |     }
  |     - `string2` dropped here while still borrowed
8 |     println!("The longest string is {result}");
  |                                      ------ borrow later used here
For more information about this error, try `rustc --explain E0597`.
error: could not compile `chapter10` (bin "chapter10") due to 1 previous error
The error shows that for result to be valid for the println! statement,
string2 would need to be valid until the end of the outer scope. Rust knows
this because we annotated the lifetimes of the function parameters and return
values using the same lifetime parameter 'a.
As humans, we can look at this code and see that string1 is longer than
string2, and therefore, result will contain a reference to string1.
Because string1 has not gone out of scope yet, a reference to string1 will
still be valid for the println! statement. However, the compiler can’t see
that the reference is valid in this case. We’ve told Rust that the lifetime of
the reference returned by the longest function is the same as the smaller of the lifetimes of the references passed in. Therefore, the borrow checker
disallows the code in Listing 10-23 as possibly having an invalid reference.
Try designing more experiments that vary the values and lifetimes of the
references passed in to the longest function and how the returned reference
is used. Make hypotheses about whether or not your experiments will pass the
borrow checker before you compile; then, check to see if you’re right!
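One possible experiment, offered as a sketch rather than as part of the chapter's listings: keep the shape of Listing 10-23, but make `string2` a string literal instead of a `String`. The hypothesis is that this version passes the borrow checker, because a string literal has type `&'static str` (a detail this section has not introduced), so the data it refers to is not dropped when the inner scope ends. The wrapper function `experiment` is a hypothetical name used only to make the result easy to check.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

// Hypothetical experiment modeled on Listing 10-23.
fn experiment() -> String {
    let string1 = String::from("long string is long");
    let result;
    {
        // Unlike Listing 10-23, `string2` is a string literal (`&'static str`),
        // so the text it points to outlives this inner scope.
        let string2 = "xyz";
        result = longest(string1.as_str(), string2);
    }
    // `result` is used after the inner scope has ended.
    result.to_string()
}

fn main() {
    println!("The longest string is {}", experiment());
}
```

Here the limiting lifetime is the borrow of `string1`, which is still valid when `result` is used, so no constraint is violated.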
Relationships
The way in which you need to specify lifetime parameters depends on what your
function is doing. For example, if we changed the implementation of the
longest function to always return the first parameter rather than the longest
string slice, we wouldn’t need to specify a lifetime on the y parameter. The
following code will compile:
fn main() {
    let string1 = String::from("abcd");
    let string2 = "efghijklmnopqrstuvwxyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {result}");
}

fn longest<'a>(x: &'a str, y: &str) -> &'a str {
    x
}
We’ve specified a lifetime parameter 'a for the parameter x and the return
type, but not for the parameter y, because the lifetime of y does not have
any relationship with the lifetime of x or the return value.
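To see what leaving `y` unannotated buys the caller, here is a small sketch under stated assumptions: the function is renamed `first` (a hypothetical name) to reflect what it actually does, and `demo` is a hypothetical wrapper. Because the return type is tied only to `x`, the result can still be used after the value passed as `y` has been dropped, which is exactly what Listing 10-23 was not allowed to do.

```rust
// Same signature shape as the version of `longest` above; `first` is a hypothetical name.
fn first<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

// Hypothetical wrapper so the result can be inspected.
fn demo() -> String {
    let string1 = String::from("abcd");
    let result;
    {
        let string2 = String::from("efghijklmnopqrstuvwxyz");
        result = first(string1.as_str(), string2.as_str());
    } // `string2` is dropped here; `result` borrows only from `string1`.
    result.to_string()
}

fn main() {
    println!("The first string is {}", demo());
}
```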
When returning a reference from a function, the lifetime parameter for the
return type needs to match the lifetime parameter for one of the parameters. If
the reference returned does not refer to one of the parameters, it must refer
to a value created within this function. However, this would be a dangling
reference because the value will go out of scope at the end of the function.
Consider this attempted implementation of the longest function that won’t
compile:
fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {result}");
}

fn longest<'a>(x: &str, y: &str) -> &'a str {
    let result = String::from("really long string");
    result.as_str()
}
Here, even though we’ve specified a lifetime parameter 'a for the return
type, this implementation will fail to compile because the return value
lifetime is not related to the lifetime of the parameters at all. Here is the
error message we get:
$ cargo run
   Compiling chapter10 v0.1.0 (file:///projects/chapter10)
error[E0515]: cannot return value referencing local variable `result`
  --> src/main.rs:11:5
   |
11 |     result.as_str()
   |     ------^^^^^^^^^
   |     |
   |     returns a value referencing data owned by the current function
   |     `result` is borrowed here
For more information about this error, try `rustc --explain E0515`.
error: could not compile `chapter10` (bin "chapter10") due to 1 previous error
The problem is that result goes out of scope and gets cleaned up at the end
of the longest function. We’re also trying to return a reference to result
from the function. There is no way we can specify lifetime parameters that
would change the dangling reference, and Rust won’t let us create a dangling
reference. In this case, the best fix would be to return an owned data type
rather than a reference so that the calling function is then responsible for
cleaning up the value.
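A minimal sketch of that fix, assuming we are free to change the return type: the hypothetical function `longest_owned` below returns a `String` instead of `&str`, so ownership of the value moves to the caller and no lifetime parameter is needed. The underscore-prefixed parameter names only silence unused-variable warnings.

```rust
// Hypothetical variant of the failing `longest`: returns an owned `String`.
fn longest_owned(_x: &str, _y: &str) -> String {
    let result = String::from("really long string");
    // Returning `result` moves ownership to the caller, so nothing dangles.
    result
}

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest_owned(string1.as_str(), string2);
    println!("The longest string is {result}");
}
```

The caller's `result` now owns the string, and it is cleaned up when `result` goes out of scope in `main`.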
Ultimately, lifetime syntax is about connecting the lifetimes of various parameters and return values of functions. Once they’re connected, Rust has enough information to allow memory-safe operations and disallow operations that would create dangling pointers or otherwise violate memory safety.
In Struct Definitions
So far, the structs we’ve defined all hold owned types. We can define structs
to hold references, but in that case, we would need to add a lifetime
annotation on every reference in the struct’s definition. Listing 10-24 has a
struct named ImportantExcerpt that holds a string slice.
struct ImportantExcerpt<'a> {
    part: &'a str,
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().unwrap();
    let i = ImportantExcerpt {
        part: first_sentence,
    };
}
This struct has the single field part that holds a string slice, which is a
reference. As with generic data types, we declare the name of the generic
lifetime parameter inside angle brackets after the name of the struct so that
we can use the lifetime parameter in the body of the struct definition. This
annotation means an instance of ImportantExcerpt can’t outlive the reference
it holds in its part field.
The main function here creates an instance of the ImportantExcerpt struct
that holds a reference to the first sentence of the String owned by the
variable novel. The data in novel exists before the ImportantExcerpt
instance is created. In addition, novel doesn’t go out of scope until after
the ImportantExcerpt goes out of scope, so the reference in the
ImportantExcerpt instance is valid.
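The struct's lifetime parameter can also appear in function signatures. The following is only a sketch: `first_sentence` is a hypothetical helper, not something defined in the chapter. It shows one way the same `'a` could tie a returned `ImportantExcerpt` to the string slice it was built from.

```rust
struct ImportantExcerpt<'a> {
    part: &'a str,
}

// Hypothetical helper: the returned struct borrows from `text`,
// so it can't outlive the string data that `text` points into.
fn first_sentence<'a>(text: &'a str) -> ImportantExcerpt<'a> {
    // `split` always yields at least one item; `unwrap_or` is a defensive fallback.
    let part = text.split('.').next().unwrap_or(text);
    ImportantExcerpt { part }
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let excerpt = first_sentence(&novel);
    println!("{}", excerpt.part);
}
```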
Lifetime Elision
You’ve learned that every reference has a lifetime and that you need to specify lifetime parameters for functions or structs that use references. However, we had a function in Listing 4-9, shown again in Listing 10-25, that compiled without lifetime annotations.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    let my_string = String::from("hello world");

    // first_word works on slices of `String`s
    let word = first_word(&my_string[..]);

    let my_string_literal = "hello world";

    // first_word works on slices of string literals
    let word = first_word(&my_string_literal[..]);

    // Because string literals *are* string slices already,
    // this works too, without the slice syntax!
    let word = first_word(my_string_literal);
}
The reason this function compiles without lifetime annotations is historical: In early versions (pre-1.0) of Rust, this code wouldn’t have compiled, because every reference needed an explicit lifetime. At that time, the function signature would have been written like this:
fn first_word<'a>(s: &'a str) -> &'a str {
After writing a lot of Rust code, the Rust team found that Rust programmers were entering the same lifetime annotations over and over in particular situations. These situations were predictable and followed a few deterministic patterns. The developers programmed these patterns into the compiler’s code so that the borrow checker could infer the lifetimes in these situations and wouldn’t need explicit annotations.
This piece of Rust history is relevant because it’s possible that more deterministic patterns will emerge and be added to the compiler. In the future, even fewer lifetime annotations might be required.
The patterns programmed into Rust’s analysis of references are called the lifetime elision rules. These aren’t rules for programmers to follow; they’re a set of particular cases that the compiler will consider, and if your code fits these cases, you don’t need to write the lifetimes explicitly.
The elision rules don’t provide full inference. If there is still ambiguity about what lifetimes the references have after Rust applies the rules, the compiler won’t guess what the lifetime of the remaining references should be. Instead of guessing, the compiler will give you an error that you can resolve by adding the lifetime annotations.
Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return values are called output lifetimes.
The compiler uses three rules to figure out the lifetimes of the references
when there aren’t explicit annotations. The first rule applies to input
lifetimes, and the second and third rules apply to output lifetimes. If the
compiler gets to the end of the three rules and there are still references for
which it can’t figure out lifetimes, the compiler will stop with an error.
These rules apply to fn definitions as well as impl blocks.
The first rule is that the compiler assigns a lifetime parameter to each
parameter that’s a reference. In other words, a function with one parameter
gets one lifetime parameter: fn foo<'a>(x: &'a i32); a function with two
parameters gets two separate lifetime parameters: fn foo<'a, 'b>(x: &'a i32, y: &'b i32); and so on.
The second rule is that, if there is exactly one input lifetime parameter, that
lifetime is assigned to all output lifetime parameters: fn foo<'a>(x: &'a i32) -> &'a i32.
The third rule is that, if there are multiple input lifetime parameters, but
one of them is &self or &mut self because this is a method, the lifetime of
self is assigned to all output lifetime parameters. This third rule makes
methods much nicer to read and write because fewer symbols are necessary.
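The three rules can be seen side by side in a small sketch. All names here (`trim_elided`, `trim_explicit`, `Title`, `text_or_self`) are hypothetical and chosen only for illustration. The first two functions should be equivalent as far as the compiler is concerned; the method shows a case where two reference parameters are present but no annotation is needed because one of them is `&self`.

```rust
// Elided form: rule 1 gives `s` its own lifetime, rule 2 assigns it to the output.
fn trim_elided(s: &str) -> &str {
    s.trim()
}

// The same signature with the lifetimes written out explicitly.
fn trim_explicit<'a>(s: &'a str) -> &'a str {
    s.trim()
}

struct Title {
    text: String,
}

impl Title {
    // Two input lifetimes (`&self` and `_fallback`), so rule 2 doesn't apply;
    // rule 3 assigns the lifetime of `&self` to the returned reference.
    fn text_or_self(&self, _fallback: &str) -> &str {
        &self.text
    }
}

fn main() {
    let title = Title { text: String::from("Moby-Dick") };
    println!("{}", trim_elided("  hello  "));
    println!("{}", trim_explicit("  hello  "));
    println!("{}", title.text_or_self("untitled"));
}
```

Note that rule 3 picks the lifetime of `self` regardless of what the body does: a method that wanted to return `_fallback` instead would need explicit annotations.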
Let’s pretend we’re the compiler. We’ll apply these rules to figure out the
lifetimes of the references in the signature of the first_word function in
Listing 10-25. The signature starts without any lifetimes associated with the
references:
fn first_word(s: &str) -> &str {
Then, the compiler applies the first rule, which specifies that each parameter
gets its own lifetime. We’ll call it 'a as usual, so now the signature is
this:
fn first_word<'a>(s: &'a str) -> &str {
The second rule applies because there is exactly one input lifetime. The second rule specifies that the lifetime of the one input parameter gets assigned to the output lifetime, so the signature is now this:
fn first_word<'a>(s: &'a str) -> &'a str {
Now all the references in this function signature have lifetimes, and the compiler can continue its analysis without needing the programmer to annotate the lifetimes in this function signature.
Let’s look at another example, this time using the longest function that had
no lifetime parameters when we started working with it in Listing 10-20:
fn longest(x: &str, y: &str) -> &str {
Let’s apply the first rule: Each parameter gets its own lifetime. This time we have two parameters instead of one, so we have two lifetimes:
fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &str {
You can see that the second rule doesn’t apply, because there is more than one
input lifetime. The third rule doesn’t apply either, because longest is a
function rather than a method, so none of the parameters are self. After
working through all three rules, we still haven’t figured out what the return
type’s lifetime is. This is why we got an error trying to compile the code in
Listing 10-20: The compiler worked through the lifetime elision rules but still
couldn’t figure out all the lifetimes of the references in the signature.
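For comparison, here is a sketch of one way the programmer can supply the missing information: tying both parameters and the return value to a single lifetime 'a. This is only one possible relationship between the references, not something the compiler would choose on its own.

```rust
// Both inputs and the output share the lifetime 'a, so the returned
// reference is valid only as long as both arguments are valid.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";
    let result = longest(string1.as_str(), string2);
    println!("The longest string is {result}");
}
```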
因为第三条规则实际上只适用于方法签名,所以我们接下来将在那个上下文中查看生命周期,看看为什么第三条规则意味着我们不必经常在方法签名中标注生命周期。
Because the third rule really only applies in method signatures, we’ll look at lifetimes in that context next to see why the third rule means we don’t have to annotate lifetimes in method signatures very often.
在方法定义中
In Method Definitions
当我们为带有生命周期的结构体实现方法时,我们使用与泛型类型参数相同的语法,如示例 10-11 所示。我们在哪里声明和使用生命周期参数取决于它们是与结构体字段相关,还是与方法参数和返回值相关。
When we implement methods on a struct with lifetimes, we use the same syntax as that of generic type parameters, as shown in Listing 10-11. Where we declare and use the lifetime parameters depends on whether they’re related to the struct fields or the method parameters and return values.
结构体字段的生命周期名称始终需要在 impl 关键字之后声明,然后在结构体名称之后使用,因为这些生命周期是结构体类型的一部分。
Lifetime names for struct fields always need to be declared after the impl
keyword and then used after the struct’s name because those lifetimes are part
of the struct’s type.
在 impl 块内的方法签名中,引用可能与结构体字段中引用的生命周期相关联,也可能是独立的。此外,生命周期省略规则通常使得在方法签名中不需要生命周期标注。让我们看一些使用我们在示例 10-24 中定义的名为 ImportantExcerpt 的结构体的例子。
In method signatures inside the impl block, references might be tied to the
lifetime of references in the struct’s fields, or they might be independent. In
addition, the lifetime elision rules often make it so that lifetime annotations
aren’t necessary in method signatures. Let’s look at some examples using the
struct named ImportantExcerpt that we defined in Listing 10-24.
首先,我们将使用一个名为 level 的方法,其唯一的参数是对 self 的引用,其返回值是一个 i32,它不引用任何东西:
First, we’ll use a method named level whose only parameter is a reference to
self and whose return value is an i32, which is not a reference to anything:
struct ImportantExcerpt<'a> {
    part: &'a str,
}

impl<'a> ImportantExcerpt<'a> {
    fn level(&self) -> i32 {
        3
    }
}
impl 之后的生命周期参数声明及其在类型名称之后的使用是必需的,但由于第一条省略规则,我们不被要求标注对 self 引用的生命周期。
The lifetime parameter declaration after impl and its use after the type name
are required, but because of the first elision rule, we’re not required to
annotate the lifetime of the reference to self.
这是一个适用第三条生命周期省略规则的例子:
Here is an example where the third lifetime elision rule applies:
impl<'a> ImportantExcerpt<'a> {
    fn announce_and_return_part(&self, announcement: &str) -> &str {
        println!("Attention please: {announcement}");
        self.part
    }
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().unwrap();
    let i = ImportantExcerpt {
        part: first_sentence,
    };
}
有两个输入生命周期,因此 Rust 应用第一条生命周期省略规则并给 &self 和 announcement 各自的生命周期。然后,因为参数之一是 &self,返回类型获得 &self 的生命周期,所有生命周期都已计算完毕。
There are two input lifetimes, so Rust applies the first lifetime elision rule
and gives both &self and announcement their own lifetimes. Then, because
one of the parameters is &self, the return type gets the lifetime of &self,
and all lifetimes have been accounted for.
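To make the result of the third rule concrete, the sketch below writes out by hand roughly what the elided signature stands for. The method name announce_and_return_part_explicit and the lifetime names 's and 'b are hypothetical, chosen here only for illustration.

```rust
struct ImportantExcerpt<'a> {
    part: &'a str,
}

impl<'a> ImportantExcerpt<'a> {
    // Elided form, as written in the text.
    fn announce_and_return_part(&self, announcement: &str) -> &str {
        println!("Attention please: {announcement}");
        self.part
    }

    // Hand-annotated form (hypothetical name): rule one gives &self and
    // announcement their own lifetimes, and rule three assigns the
    // lifetime of &self to the return type.
    fn announce_and_return_part_explicit<'s, 'b>(
        &'s self,
        announcement: &'b str,
    ) -> &'s str {
        println!("Attention please: {announcement}");
        self.part
    }
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().unwrap();
    let excerpt = ImportantExcerpt { part: first_sentence };

    let part;
    {
        // The announcement is dropped at the end of this block, but the
        // returned reference is tied to the borrow of `excerpt`, not to it.
        let announcement = String::from("a short-lived announcement");
        part = excerpt.announce_and_return_part_explicit(&announcement);
    }
    println!("{part}");
    println!("{}", excerpt.announce_and_return_part("another announcement"));
}
```

Because the return type borrows from self rather than from announcement, the returned slice can be used after the announcement string has gone out of scope.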
静态生命周期
The Static Lifetime
我们需要讨论的一个特殊生命周期是 'static,它表示受影响的引用可以在整个程序的持续时间内有效。所有字符串字面量都具有 'static 生命周期,我们可以按如下方式标注:
One special lifetime we need to discuss is 'static, which denotes that the
affected reference can live for the entire duration of the program. All
string literals have the 'static lifetime, which we can annotate as follows:
#![allow(unused)]
fn main() {
    let s: &'static str = "I have a static lifetime.";
}
该字符串的文本直接存储在程序的二进制文件中,该文件始终可用。因此,所有字符串字面量的生命周期都是 'static。
The text of this string is stored directly in the program’s binary, which is
always available. Therefore, the lifetime of all string literals is 'static.
你可能会在错误消息中看到使用 'static 生命周期的建议。但在将 'static 指定为引用的生命周期之前,请考虑你拥有的引用是否真的能在程序的整个生命周期内存在,以及你是否希望它如此。大多数情况下,建议使用 'static 生命周期的错误消息是由于尝试创建悬垂引用或可用生命周期不匹配导致的。在这种情况下,解决方案是修复这些问题,而不是指定 'static 生命周期。
You might see suggestions in error messages to use the 'static lifetime. But
before specifying 'static as the lifetime for a reference, think about
whether or not the reference you have actually lives the entire lifetime of
your program, and whether you want it to. Most of the time, an error message
suggesting the 'static lifetime results from attempting to create a dangling
reference or a mismatch of the available lifetimes. In such cases, the solution
is to fix those problems, not to specify the 'static lifetime.
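As a small sketch of a case where 'static is appropriate, the example below returns a string literal from a function and keeps a literal reference alive past the block that named it. The function name default_name is hypothetical.

```rust
// A string literal is stored in the program's binary, so a reference to
// it can be returned as 'static. `default_name` is a hypothetical helper.
fn default_name() -> &'static str {
    "anonymous"
}

fn main() {
    let kept: &'static str;
    {
        // The variable `literal` goes out of scope at the end of this block,
        // but the text it points to is still available afterward.
        let literal: &'static str = "I have a static lifetime.";
        kept = literal;
    }
    println!("{kept}");
    println!("{}", default_name());
}
```

The same pattern would not compile if `literal` borrowed from a String created inside the block, which is the kind of dangling reference the paragraph above warns about.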
泛型类型参数、Trait Bound 和生命周期
Generic Type Parameters, Trait Bounds, and Lifetimes
让我们简要地看一下在一个函数中同时指定泛型类型参数、Trait bound 和生命周期的语法!
Let’s briefly look at the syntax of specifying generic type parameters, trait bounds, and lifetimes all in one function!
use std::fmt::Display;

fn longest_with_an_announcement<'a, T>(
    x: &'a str,
    y: &'a str,
    ann: T,
) -> &'a str
where
    T: Display,
{
    println!("Announcement! {ann}");
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest_with_an_announcement(
        string1.as_str(),
        string2,
        "Today is someone's birthday!",
    );
    println!("The longest string is {result}");
}
这是示例 10-21 中返回两个字符串切片中较长者的 longest 函数。但现在它多了一个泛型类型为 T 的参数 ann,它可以由满足 where 子句指定的 Display Trait 的任何类型填充。这个额外的参数将使用 {} 打印,这就是为什么 Display Trait bound 是必要的。因为生命周期是泛型的一种,所以生命周期参数 'a 和泛型类型参数 T 的声明都放在函数名后尖括号内的同一个列表中。
This is the longest function from Listing 10-21 that returns the longer of
two string slices. But now it has an extra parameter named ann of the generic
type T, which can be filled in by any type that implements the Display
trait as specified by the where clause. This extra parameter will be printed
using {}, which is why the Display trait bound is necessary. Because
lifetimes are a type of generic, the declarations of the lifetime parameter
'a and the generic type parameter T go in the same list inside the angle
brackets after the function name.
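As a sketch of the same idea, the trait bound can also be written inline in the angle brackets instead of in a where clause, and the function can then be called with announcements of different types. The specific calls in main are illustrative assumptions.

```rust
use std::fmt::Display;

// The lifetime parameter 'a and the type parameter T share one list;
// here the Display bound is written inline rather than in a where clause.
fn longest_with_an_announcement<'a, T: Display>(x: &'a str, y: &'a str, ann: T) -> &'a str {
    println!("Announcement! {ann}");
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let string1 = String::from("abcd");

    // Any type implementing Display works for `ann`: a &str here...
    let first = longest_with_an_announcement(string1.as_str(), "xyz", "Today is someone's birthday!");
    // ...and an integer here.
    let second = longest_with_an_announcement("ab", "xyz", 42);

    println!("The longest strings are {first} and {second}");
}
```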
总结
Summary
本章涵盖了很多内容!既然你已经了解了泛型类型参数、Trait 和 Trait bound 以及泛型生命周期参数,你就准备好编写可在许多不同情况下运行且无重复的代码了。泛型类型参数允许你将代码应用于不同的类型。Trait 和 Trait bound 确保即使类型是泛型的,它们也将具有代码所需的行为。你学习了如何使用生命周期标注来确保这种灵活的代码不会产生任何悬垂引用。而所有这些分析都发生在编译时,不会影响运行时性能!
We covered a lot in this chapter! Now that you know about generic type parameters, traits and trait bounds, and generic lifetime parameters, you’re ready to write code without repetition that works in many different situations. Generic type parameters let you apply the code to different types. Traits and trait bounds ensure that even though the types are generic, they’ll have the behavior the code needs. You learned how to use lifetime annotations to ensure that this flexible code won’t have any dangling references. And all of this analysis happens at compile time, which doesn’t affect runtime performance!
信不信由你,关于我们在本章中讨论的主题还有更多内容需要学习:第 18 章讨论了 Trait 对象,这是使用 Trait 的另一种方式。还有涉及生命周期标注的更复杂场景,你只会在非常高级的场景中需要它们;对于这些内容,你应该阅读 Rust 参考手册。但接下来,你将学习如何在 Rust 中编写测试,以便确保你的代码按预期方式工作。
Believe it or not, there is much more to learn on the topics we discussed in this chapter: Chapter 18 discusses trait objects, which are another way to use traits. There are also more complex uses of lifetime annotations that you will need only in very advanced scenarios; for those, you should read the Rust Reference. But next, you’ll learn how to write tests in Rust so that you can make sure your code is working the way it should.