Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Rc<T>:引用计数智能指针

Rc<T>, the Reference-Counted Smart Pointer

在大多数情况下,所有权是明确的:你清楚地知道哪个变量拥有给定的值。然而,在某些情况下,单个值可能具有多个所有者。例如,在图数据结构中,多个边可能指向同一个节点,而该节点在概念上由指向它的所有边共同拥有。除非一个节点没有任何指向它的边,从而没有所有者,否则不应清理该节点。

In the majority of cases, ownership is clear: You know exactly which variable owns a given value. However, there are cases when a single value might have multiple owners. For example, in graph data structures, multiple edges might point to the same node, and that node is conceptually owned by all of the edges that point to it. A node shouldn’t be cleaned up unless it doesn’t have any edges pointing to it and so has no owners.

你必须通过使用 Rust 的 Rc<T> 类型来显式开启多重所有权,它是 引用计数(reference counting)的缩写。Rc<T> 类型会跟踪指向一个值的引用数量,以确定该值是否仍在使用。如果一个值的引用数量为零,则可以清理该值,而不会使任何引用变得无效。

You have to enable multiple ownership explicitly by using the Rust type Rc<T>, which is an abbreviation for reference counting. The Rc<T> type keeps track of the number of references to a value to determine whether or not the value is still in use. If there are zero references to a value, the value can be cleaned up without any references becoming invalid.

Rc<T> 想象成起居室里的电视机。当一个人进来打算看电视时,他会打开它。其他人也可以进到房间里来看电视。当最后一个人离开房间时,他会关掉电视,因为它不再被使用了。如果有人在其他人还在看电视时关掉电视,剩下的观众肯定会抗议!

Imagine Rc<T> as a TV in a family room. When one person enters to watch TV, they turn it on. Others can come into the room and watch the TV. When the last person leaves the room, they turn off the TV because it’s no longer being used. If someone turns off the TV while others are still watching it, there would be an uproar from the remaining TV watchers!

当我们想要在堆上分配一些数据供程序的多个部分读取,并且在编译时无法确定哪个部分最后结束使用该数据时,我们会使用 Rc<T> 类型。如果我们知道哪个部分会最后结束,我们就可以直接让该部分作为数据的所有者,这样在编译时强制执行的常规所有权规则就会生效。

We use the Rc<T> type when we want to allocate some data on the heap for multiple parts of our program to read and we can’t determine at compile time which part will finish using the data last. If we knew which part would finish last, we could just make that part the data’s owner, and the normal ownership rules enforced at compile time would take effect.

请注意,Rc<T> 仅用于单线程场景。当我们第 16 章讨论并发时,我们将介绍如何在多线程程序中进行引用计数。

Note that Rc<T> is only for use in single-threaded scenarios. When we discuss concurrency in Chapter 16, we’ll cover how to do reference counting in multithreaded programs.

共享数据

Sharing Data

让我们回到示例 15-5 中的 cons list 例子。回想一下,我们使用 Box<T> 定义了它。这一次,我们将创建两个列表,它们共同拥有第三个列表的所有权。从概念上讲,这类似于图 15-3。

Let’s return to our cons list example in Listing 15-5. Recall that we defined it using Box<T>. This time, we’ll create two lists that both share ownership of a third list. Conceptually, this looks similar to Figure 15-3.

一个带有标签 'a' 的链表,指向三个元素。第一个元素包含整数 5 并指向第二个元素。第二个元素包含整数 10 并指向第三个元素。第三个元素包含值 'Nil',表示列表结束;它不指向任何地方。一个带有标签 'b' 的链表指向一个包含整数 3 的元素,并指向列表 'a' 的第一个元素。一个带有标签 'c' 的链表指向一个包含整数 4 的元素,也指向列表 'a' 的第一个元素,因此列表 'b' 和 'c' 的尾部都是列表 'a'。 A linked list with the label 'a' pointing to three elements. The first element contains the integer 5 and points to the second element. The second element contains the integer 10 and points to the third element. The third element contains the value 'Nil' that signifies the end of the list; it does not point anywhere. A linked list with the label 'b' points to an element that contains the integer 3 and points to the first element of list 'a'. A linked list with the label 'c' points to an element that contains the integer 4 and also points to the first element of list 'a' so that the tails of lists 'b' and 'c' are both list 'a'.

图 15-3:两个列表 bc,共享第三个列表 a 的所有权 Figure 15-3: Two lists, b and c, sharing ownership of a third list, a

我们将创建一个包含 510 的列表 a。然后,我们将创建另外两个列表:以 3 开始的 b 和以 4 开始的 c。列表 bc 之后都将连接到包含 510 的第一个列表 a。换句话说,两个列表将共享包含 510 的第一个列表。

We’ll create list a that contains 5 and then 10. Then, we’ll make two more lists: b that starts with 3 and c that starts with 4. Both the b and c lists will then continue on to the first a list containing 5 and 10. In other words, both lists will share the first list containing 5 and 10.

尝试使用带 Box<T>List 定义来实现此场景是行不通的,如示例 15-17 所示。

Trying to implement this scenario using our definition of List with Box<T> won’t work, as shown in Listing 15-17.

enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
    let b = Cons(3, Box::new(a));
    let c = Cons(4, Box::new(a));
}

编译这段代码时,我们会得到如下错误:

When we compile this code, we get this error:

$ cargo run
   Compiling cons-list v0.1.0 (file:///projects/cons-list)
error[E0382]: use of moved value: `a`
  --> src/main.rs:11:30
   |
 9 |     let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
   |         - move occurs because `a` has type `List`, which does not implement the `Copy` trait
10 |     let b = Cons(3, Box::new(a));
   |                              - value moved here
11 |     let c = Cons(4, Box::new(a));
   |                              ^ value used here after move
   |
note: if `List` implemented `Clone`, you could clone the value
  --> src/main.rs:1:1
   |
 1 | enum List {
   | ^^^^^^^^^ consider implementing `Clone` for this type
...
10 |     let b = Cons(3, Box::new(a));
   |                              - you could clone this value

For more information about this error, try `rustc --explain E0382`.
error: could not compile `cons-list` (bin "cons-list") due to 1 previous error

Cons 变体拥有它们持有的数据,因此当我们创建列表 b 时,a 被移入 bb 拥有了 a。接着,当我们尝试在创建 c 时再次使用 a 时,这是不允许的,因为 a 已经被移走了。

The Cons variants own the data they hold, so when we create the b list, a is moved into b and b owns a. Then, when we try to use a again when creating c, we’re not allowed to because a has been moved.

我们可以将 Cons 的定义改为持有引用,但那样我们就必须指定生命周期参数。通过指定生命周期参数,我们将指定列表中的每个元素至少与整个列表存活得一样久。示例 15-17 中的元素和列表属于这种情况,但并非在所有场景下都是如此。

We could change the definition of Cons to hold references instead, but then we would have to specify lifetime parameters. By specifying lifetime parameters, we would be specifying that every element in the list will live at least as long as the entire list. This is the case for the elements and lists in Listing 15-17, but not in every scenario.

相反,我们将修改 List 的定义,使用 Rc<T> 代替 Box<T>,如示例 15-18 所示。现在每个 Cons 变体将持有一个值和一个指向 ListRc<T>。在创建 b 时,我们不再获取 a 的所有权,而是克隆 a 所持有的 Rc<List>,从而将引用计数从一增加到二,并让 ab 共享该 Rc<List> 中数据的所有权。在创建 c 时,我们也会克隆 a,将引用计数从二增加到三。每当我们调用 Rc::clone 时,指向 Rc<List> 内部数据的引用计数就会增加,并且除非引用计数为零,否则数据不会被清理。

Instead, we’ll change our definition of List to use Rc<T> in place of Box<T>, as shown in Listing 15-18. Each Cons variant will now hold a value and an Rc<T> pointing to a List. When we create b, instead of taking ownership of a, we’ll clone the Rc<List> that a is holding, thereby increasing the number of references from one to two and letting a and b share ownership of the data in that Rc<List>. We’ll also clone a when creating c, increasing the number of references from two to three. Every time we call Rc::clone, the reference count to the data within the Rc<List> will increase, and the data won’t be cleaned up unless there are zero references to it.

enum List {
    Cons(i32, Rc<List>),
    Nil,
}

use crate::List::{Cons, Nil};
use std::rc::Rc;

fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    let b = Cons(3, Rc::clone(&a));
    let c = Cons(4, Rc::clone(&a));
}

我们需要添加一个 use 语句将 Rc<T> 引入作用域,因为它不在 prelude 中。在 main 中,我们创建持有 510 的列表,并将它存储在 a 的新 Rc<List> 中。接着,在创建 bc 时,我们调用 Rc::clone 函数并将 aRc<List> 的引用作为参数传递。

We need to add a use statement to bring Rc<T> into scope because it’s not in the prelude. In main, we create the list holding 5 and 10 and store it in a new Rc<List> in a. Then, when we create b and c, we call the Rc::clone function and pass a reference to the Rc<List> in a as an argument.

我们本可以调用 a.clone() 而不是 Rc::clone(&a),但在这种情况下 Rust 的惯例是使用 Rc::cloneRc::clone 的实现并不像大多数类型的 clone 实现那样对所有数据进行深拷贝(deep copy)。调用 Rc::clone 只会增加引用计数,这并不耗时。数据的深拷贝可能需要大量时间。通过使用 Rc::clone 进行引用计数,我们可以从视觉上区分深拷贝类的克隆和增加引用计数类的克隆。在查找代码中的性能问题时,我们只需要考虑深拷贝克隆,而可以忽略对 Rc::clone 的调用。

We could have called a.clone() rather than Rc::clone(&a), but Rust’s convention is to use Rc::clone in this case. The implementation of Rc::clone doesn’t make a deep copy of all the data like most types’ implementations of clone do. The call to Rc::clone only increments the reference count, which doesn’t take much time. Deep copies of data can take a lot of time. By using Rc::clone for reference counting, we can visually distinguish between the deep-copy kinds of clones and the kinds of clones that increase the reference count. When looking for performance problems in the code, we only need to consider the deep-copy clones and can disregard calls to Rc::clone.

通过克隆增加引用计数

Cloning to Increase the Reference Count

让我们修改示例 15-18 中的示例,以便观察当我们创建和丢弃对 aRc<List> 的引用时引用计数的变化。

Let’s change our working example in Listing 15-18 so that we can see the reference counts changing as we create and drop references to the Rc<List> in a.

在示例 15-19 中,我们将修改 main,使其围绕列表 c 有一个内部作用域;这样我们就可以看到当 c 超出作用域时引用计数是如何变化的。

In Listing 15-19, we’ll change main so that it has an inner scope around list c; then, we can see how the reference count changes when c goes out of scope.

enum List {
    Cons(i32, Rc<List>),
    Nil,
}

use crate::List::{Cons, Nil};
use std::rc::Rc;

// --snip--

fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    println!("count after creating a = {}", Rc::strong_count(&a));
    let b = Cons(3, Rc::clone(&a));
    println!("count after creating b = {}", Rc::strong_count(&a));
    {
        let c = Cons(4, Rc::clone(&a));
        println!("count after creating c = {}", Rc::strong_count(&a));
    }
    println!("count after c goes out of scope = {}", Rc::strong_count(&a));
}

在程序中引用计数发生变化的每个点,我们都打印引用计数,这是通过调用 Rc::strong_count 函数获得的。该函数被命名为 strong_count 而不是 count,是因为 Rc<T> 类型也有一个 weak_count;我们将在“使用 Weak<T> 防止引用循环”中看到 weak_count 的用途。

At each point in the program where the reference count changes, we print the reference count, which we get by calling the Rc::strong_count function. This function is named strong_count rather than count because the Rc<T> type also has a weak_count; we’ll see what weak_count is used for in “Preventing Reference Cycles Using Weak<T>.

这段代码打印以下内容:

This code prints the following:

$ cargo run
   Compiling cons-list v0.1.0 (file:///projects/cons-list)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.45s
     Running `target/debug/cons-list`
count after creating a = 1
count after creating b = 2
count after creating c = 3
count after c goes out of scope = 2

我们可以看到,a 中的 Rc<List> 初始引用计数为 1;接着,每次我们调用 clone,计数就增加 1。当 c 超出作用域时,计数减少 1。我们不需要像调用 Rc::clone 增加引用计数那样通过调用函数来减少引用计数:当 Rc<T> 值超出作用域时,Drop trait 的实现会自动减少引用计数。

We can see that the Rc<List> in a has an initial reference count of 1; then, each time we call clone, the count goes up by 1. When c goes out of scope, the count goes down by 1. We don’t have to call a function to decrease the reference count like we have to call Rc::clone to increase the reference count: The implementation of the Drop trait decreases the reference count automatically when an Rc<T> value goes out of scope.

在这个例子中我们看不到的是,当 b 之后是 amain 结尾超出作用域时,计数为 0,Rc<List> 被彻底清理。使用 Rc<T> 允许单个值具有多个所有者,而计数确保了只要任何所有者仍然存在,该值就保持有效。

What we can’t see in this example is that when b and then a go out of scope at the end of main, the count is 0, and the Rc<List> is cleaned up completely. Using Rc<T> allows a single value to have multiple owners, and the count ensures that the value remains valid as long as any of the owners still exist.

通过不可变引用,Rc<T> 允许你在程序的多个部分之间共享数据以供读取。如果 Rc<T> 也允许你拥有多个可变引用,你可能会违反第 4 章讨论的借用规则之一:对同一位置的多个可变借用可能导致数据竞争和不一致。但能够修改数据非常有用!在下一节中,我们将讨论内部可变性模式和 RefCell<T> 类型,你可以将它与 Rc<T> 结合使用,以克服这一不可变性限制。

Via immutable references, Rc<T> allows you to share data between multiple parts of your program for reading only. If Rc<T> allowed you to have multiple mutable references too, you might violate one of the borrowing rules discussed in Chapter 4: Multiple mutable borrows to the same place can cause data races and inconsistencies. But being able to mutate data is very useful! In the next section, we’ll discuss the interior mutability pattern and the RefCell<T> type that you can use in conjunction with an Rc<T> to work with this immutability restriction.