Using `Box<T>` to Point to Data on the Heap

The most straightforward smart pointer is a box, whose type is written Box<T>. Boxes allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data. Refer to Chapter 4 to review the difference between the stack and the heap.

Boxes don’t have performance overhead, other than storing their data on the heap instead of on the stack. But they don’t have many extra capabilities either. You’ll use them most often in these situations:

When you have a type whose size can’t be known at compile time, and you want to use a value of that type in a context that requires an exact size
When you have a large amount of data, and you want to transfer ownership but ensure that the data won’t be copied when you do so
When you want to own a value, and you care only that it’s a type that implements a particular trait rather than being of a specific type

We’ll demonstrate the first situation in “Enabling Recursive Types with Boxes”. In the second case, transferring ownership of a large amount of data can take a long time because the data is copied around on the stack. To improve performance in this situation, we can store the large amount of data on the heap in a box. Then, only the small amount of pointer data is copied around on the stack, while the data it references stays in one place on the heap. The third case is known as a trait object, and “Using Trait Objects to Abstract over Shared Behavior” in Chapter 18 is devoted to that topic. So, what you learn here you’ll apply again in that section!
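The second situation can be checked with a small experiment. The following is a minimal sketch, not part of the original listings: the constant `LEN` and the helpers `consume` and `addresses_before_and_after_move` are hypothetical names chosen for illustration. It compares the address of the heap data before and after ownership of the box is transferred to another function.

```rust
// Hypothetical payload size, chosen only for illustration.
const LEN: usize = 10_000;

/// Takes ownership of the box and reports where its data lives on the heap.
fn consume(data: Box<[u8; LEN]>) -> usize {
    // Only the pointer was moved into this function; the array itself
    // stays where it was allocated.
    data.as_ptr() as usize
}

/// Returns the heap address observed before and after the ownership transfer.
fn addresses_before_and_after_move() -> (usize, usize) {
    let data: Box<[u8; LEN]> = Box::new([0u8; LEN]);
    let before = data.as_ptr() as usize;
    let after = consume(data); // `data` is moved and can't be used afterward
    (before, after)
}

fn main() {
    let (before, after) = addresses_before_and_after_move();
    println!("same heap address: {}", before == after);
}
```

Because the two addresses are equal, the move handed over the pointer without relocating the array it points to.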

Storing Data on the Heap

Before we discuss the heap storage use case for Box<T>, we’ll cover the syntax and how to interact with values stored within a Box<T>.

Listing 15-1 shows how to use a box to store an i32 value on the heap.

fn main() {
    let b = Box::new(5);
    println!("b = {b}");
}

We define the variable b to have the value of a Box that points to the value 5, which is allocated on the heap. This program will print b = 5; in this case, we can access the data in the box similarly to how we would if this data were on the stack. Just like any owned value, when a box goes out of scope, as b does at the end of main, it will be deallocated. The deallocation happens both for the box (stored on the stack) and the data it points to (stored on the heap).
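As a small extension of Listing 15-1, the sketch below reads and modifies the boxed value through the dereference operator. The helper `add_one_on_heap` is a hypothetical name introduced only for this illustration.

```rust
/// Boxes `start`, adds one to the heap value through the box, and returns it.
/// (`add_one_on_heap` is a hypothetical helper, not part of the listing.)
fn add_one_on_heap(start: i32) -> i32 {
    let mut b = Box::new(start); // the i32 lives on the heap
    *b += 1; // `*b` follows the pointer to the heap value
    *b // copy the i32 out; `b` and its heap data are freed on return
}

fn main() {
    let b = Box::new(5);
    println!("b = {b}");
    println!("5 + 1 = {}", add_one_on_heap(5));
}
```

The box inside `add_one_on_heap` is deallocated when the function returns, in the same way that b is deallocated at the end of main.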

Putting a single value on the heap isn’t very useful, so you won’t use boxes by themselves in this way very often. Having values like a single i32 on the stack, where they’re stored by default, is more appropriate in the majority of situations. Let’s look at a case where boxes allow us to define types that we wouldn’t be allowed to define if we didn’t have boxes.

Enabling Recursive Types with Boxes

A value of a recursive type can have another value of the same type as part of itself. Recursive types pose an issue because Rust needs to know at compile time how much space a type takes up. However, the nesting of values of recursive types could theoretically continue infinitely, so Rust can’t know how much space the value needs. Because boxes have a known size, we can enable recursive types by inserting a box in the recursive type definition.

As an example of a recursive type, let’s explore the cons list. This is a data type commonly found in functional programming languages. The cons list type we’ll define is straightforward except for the recursion; therefore, the concepts in the example we’ll work with will be useful anytime you get into more complex situations involving recursive types.

Understanding the Cons List

A cons list is a data structure that comes from the Lisp programming language and its dialects. It is made up of nested pairs and is the Lisp version of a linked list. Its name comes from the cons function (short for construct function) in Lisp, which constructs a new pair from its two arguments. By calling cons on a pair consisting of a value and another pair, we can construct cons lists made up of recursive pairs.

For example, here’s a pseudocode representation of a cons list containing the list 1, 2, 3 with each pair in parentheses:

(1, (2, (3, Nil)))

Each item in a cons list contains two elements: the value of the current item and the next item. The last item in the list contains only a value called Nil without a next item. A cons list is produced by recursively calling the cons function. The canonical name to denote the base case of the recursion is Nil. Note that this is not the same as the “null” or “nil” concept discussed in Chapter 6, which denotes an invalid or absent value.

The cons list isn’t a commonly used data structure in Rust. Most of the time when you have a list of items in Rust, Vec<T> is a better choice to use. Other, more complex recursive data types are useful in various situations, but by starting with the cons list in this chapter, we can explore how boxes let us define a recursive data type without much distraction.

Listing 15-2 contains an enum definition for a cons list. Note that this code won’t compile yet, because the List type doesn’t have a known size, which we’ll demonstrate.

enum List {
    Cons(i32, List),
    Nil,
}

fn main() {}

Note: We’re implementing a cons list that holds only i32 values for the purposes of this example. We could have implemented it using generics, as we discussed in Chapter 10, to define a cons list type that could store values of any type.

Using the List type to store the list 1, 2, 3 would look like the code in Listing 15-3.

enum List {
    Cons(i32, List),
    Nil,
}

// --snip--

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Cons(2, Cons(3, Nil)));
}

The first Cons value holds 1 and another List value. This List value is another Cons value that holds 2 and another List value. This List value is one more Cons value that holds 3 and a List value, which is finally Nil, the non-recursive variant that signals the end of the list.

If we try to compile the code in Listing 15-3, we get the error shown in Listing 15-4.

$ cargo run
   Compiling cons-list v0.1.0 (file:///projects/cons-list)
error[E0072]: recursive type `List` has infinite size
 --> src/main.rs:1:1
  |
1 | enum List {
  | ^^^^^^^^^
2 |     Cons(i32, List),
  |               ---- recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
  |
2 |     Cons(i32, Box<List>),
  |               ++++    +

error[E0391]: cycle detected when computing when `List` needs drop
 --> src/main.rs:1:1
  |
1 | enum List {
  | ^^^^^^^^^
  |
  = note: ...which immediately requires computing when `List` needs drop again
  = note: cycle used when computing whether `List` needs drop
  = note: see https://rustc-dev-guide.rust-lang.org/overview.html#queries and https://rustc-dev-guide.rust-lang.org/query.html for more information

Some errors have detailed explanations: E0072, E0391.
For more information about an error, try `rustc --explain E0072`.
error: could not compile `cons-list` (bin "cons-list") due to 2 previous errors

The error shows this type “has infinite size.” The reason is that we’ve defined List with a variant that is recursive: It holds another value of itself directly. As a result, Rust can’t figure out how much space it needs to store a List value. Let’s break down why we get this error. First, we’ll look at how Rust decides how much space it needs to store a value of a non-recursive type.

Computing the Size of a Non-Recursive Type

Recall the Message enum we defined in Listing 6-2 when we discussed enum definitions in Chapter 6:

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

fn main() {}

To determine how much space to allocate for a Message value, Rust goes through each of the variants to see which variant needs the most space. Rust sees that Message::Quit doesn’t need any space, Message::Move needs enough space to store two i32 values, and so forth. Because only one variant will be used, the most space a Message value will need is the space it would take to store the largest of its variants.
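One way to observe this rule is to ask the compiler for the sizes it computed. The sketch below uses `std::mem::size_of` from the standard library; the exact numbers depend on the target platform and compiler version, so the example prints them rather than assuming specific values.

```rust
use std::mem::size_of;

// Same definition as in Listing 6-2; the variants are never constructed here.
#[allow(dead_code)]
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

fn main() {
    // Sizes of the payloads carried by the individual variants.
    println!("two i32 values:   {} bytes", size_of::<(i32, i32)>());
    println!("String:           {} bytes", size_of::<String>());
    println!("three i32 values: {} bytes", size_of::<(i32, i32, i32)>());

    // A Message must be able to hold whichever variant is in use,
    // so it is at least as large as its largest variant.
    println!("Message:          {} bytes", size_of::<Message>());
}
```

Whatever the platform, the size reported for Message is never smaller than the size of any single variant’s payload.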

Contrast this with what happens when Rust tries to determine how much space a recursive type like the List enum in Listing 15-2 needs. The compiler starts by looking at the Cons variant, which holds a value of type i32 and a value of type List. Therefore, Cons needs an amount of space equal to the size of an i32 plus the size of a List. To figure out how much memory the List type needs, the compiler looks at the variants, starting with the Cons variant. The Cons variant holds a value of type i32 and a value of type List, and this process continues infinitely, as shown in Figure 15-1.

An infinite Cons list: a rectangle labeled 'Cons' split into two smaller rectangles. The first smaller rectangle holds the label 'i32', and the second smaller rectangle holds the label 'Cons' and a smaller version of the outer 'Cons' rectangle. The 'Cons' rectangles continue to hold smaller and smaller versions of themselves until the smallest comfortably sized rectangle holds an infinity symbol, indicating that this repetition goes on forever.

Figure 15-1: An infinite List consisting of infinite Cons variants

Getting a Recursive Type with a Known Size

Because Rust can’t figure out how much space to allocate for recursively defined types, the compiler gives an error with this helpful suggestion:

help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
  |
2 |     Cons(i32, Box<List>),
  |               ++++    +

In this suggestion, indirection means that instead of storing a value directly, we should change the data structure to store the value indirectly by storing a pointer to the value instead.

Because a Box<T> is a pointer, Rust always knows how much space a Box<T> needs: A pointer’s size doesn’t change based on the amount of data it’s pointing to. This means we can put a Box<T> inside the Cons variant instead of another List value directly. The Box<T> will point to the next List value that will be on the heap rather than inside the Cons variant. Conceptually, we still have a list, created with lists holding other lists, but this implementation is now more like placing the items next to one another rather than inside one another.
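The claim that a box’s size doesn’t depend on what it points to can be checked with `std::mem::size_of`. This is a minimal sketch; the two pointee types, `i32` and `[u8; 4096]`, are arbitrary choices made for illustration.

```rust
use std::mem::size_of;

fn main() {
    // The pointees differ greatly in size...
    println!("i32:             {} bytes", size_of::<i32>());
    println!("[u8; 4096]:      {} bytes", size_of::<[u8; 4096]>());

    // ...but a box around either one is just a pointer.
    println!("Box<i32>:        {} bytes", size_of::<Box<i32>>());
    println!("Box<[u8; 4096]>: {} bytes", size_of::<Box<[u8; 4096]>>());
    println!("usize:           {} bytes", size_of::<usize>());
}
```

For types whose size is known at compile time, as here, both boxes report the same size as a usize, which is the size of a pointer on the target platform.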

We can change the definition of the List enum in Listing 15-2 and the usage of the List in Listing 15-3 to the code in Listing 15-5, which will compile.

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}

The Cons variant needs the size of an i32 plus the space to store the box’s pointer data. The Nil variant stores no values, so it needs less space on the stack than the Cons variant. We now know that any List value will take up the size of an i32 plus the size of a box’s pointer data, along with any padding the compiler inserts for alignment. By using a box, we’ve broken the infinite, recursive chain, so the compiler can figure out the size it needs to store a List value. Figure 15-2 shows what the Cons variant looks like now.
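To show that the boxed definition is usable in practice, the sketch below walks a list built like the one in Listing 15-5 and adds up its values. The functions `sum` and `sample_list` are hypothetical helpers introduced for this illustration and are not part of the original listing.

```rust
use std::mem::size_of;

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

/// Builds the list 1, 2, 3 in the same way as Listing 15-5.
fn sample_list() -> List {
    Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))))
}

/// Recursively adds up every value in the list.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a `&Box<List>`; it coerces to `&List` for the recursive call.
        Cons(value, rest) => *value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = sample_list();
    println!("sum = {}", sum(&list));
    // The type now has a finite size; the exact number is platform dependent.
    println!("List takes {} bytes", size_of::<List>());
}
```

Each recursive call follows one box pointer to the next item on the heap, which matches the picture of items placed next to one another rather than nested inside one another.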

A rectangle labeled 'Cons' split into two smaller rectangles. The first smaller rectangle holds the label 'i32', and the second smaller rectangle holds the label 'Box' with one inner rectangle that contains the label 'usize', representing the finite size of the box's pointer.

Figure 15-2: A List that is not infinitely sized, because Cons holds a Box

Boxes provide only the indirection and heap allocation; they don’t have any other special capabilities, like those we’ll see with the other smart pointer types. They also don’t have the performance overhead that these special capabilities incur, so they can be useful in cases like the cons list where the indirection is the only feature we need. We’ll look at more use cases for boxes in Chapter 18.

The Box<T> type is a smart pointer because it implements the Deref trait, which allows Box<T> values to be treated like references. When a Box<T> value goes out of scope, the heap data that the box is pointing to is cleaned up as well because of the Drop trait implementation. These two traits will be even more important to the functionality provided by the other smart pointer types we’ll discuss in the rest of this chapter. Let’s explore these two traits in more detail.
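Both behaviors can be observed with a small experiment. In the sketch below, `Tracker`, `read_value`, and `demo` are hypothetical names, and `std::cell::Cell` is used only as a simple way to record that the destructor ran; none of this is part of the original chapter text.

```rust
use std::cell::Cell;

/// A value that records when it is dropped.
struct Tracker<'a> {
    value: i32,
    dropped: &'a Cell<bool>,
}

impl Drop for Tracker<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Accepts an ordinary reference, not a box.
fn read_value(t: &Tracker<'_>) -> i32 {
    t.value
}

/// Returns (value read through the box, dropped inside scope?, dropped after scope?).
fn demo() -> (i32, bool, bool) {
    let dropped = Cell::new(false);
    let (value, dropped_inside) = {
        let b = Box::new(Tracker { value: 7, dropped: &dropped });
        // Deref lets `&b` (a `&Box<Tracker>`) be used where `&Tracker` is expected.
        let value = read_value(&b);
        (value, dropped.get())
    }; // `b` goes out of scope here, so the boxed Tracker is dropped and its heap memory freed
    (value, dropped_inside, dropped.get())
}

fn main() {
    let (value, inside, after) = demo();
    println!("value = {value}, dropped inside scope = {inside}, dropped after scope = {after}");
}
```

The flag is still false while the box is alive and becomes true once the box has gone out of scope, which shows the cleanup of the heap data happening at that point.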
