What Is Ownership?

Ownership is a set of rules that govern how a Rust program manages memory. All programs have to manage the way they use a computer’s memory while running. Some languages have garbage collection that regularly looks for no-longer-used memory as the program runs; in other languages, the programmer must explicitly allocate and free the memory. Rust uses a third approach: Memory is managed through a system of ownership with a set of rules that the compiler checks. If any of the rules are violated, the program won’t compile. None of the features of ownership will slow down your program while it’s running.

Because ownership is a new concept for many programmers, it does take some time to get used to. The good news is that the more experienced you become with Rust and the rules of the ownership system, the easier you’ll find it to naturally develop code that is safe and efficient. Keep at it!

When you understand ownership, you’ll have a solid foundation for understanding the features that make Rust unique. In this chapter, you’ll learn ownership by working through some examples that focus on a very common data structure: strings.

The Stack and the Heap

Many programming languages don’t require you to think about the stack and the heap very often. But in a systems programming language like Rust, whether a value is on the stack or the heap affects how the language behaves and why you have to make certain decisions. Parts of ownership will be described in relation to the stack and the heap later in this chapter, so here is a brief explanation in preparation.

Both the stack and the heap are parts of memory available to your code to use at runtime, but they are structured in different ways. The stack stores values in the order it gets them and removes the values in the opposite order. This is referred to as last in, first out (LIFO). Think of a stack of plates: When you add more plates, you put them on top of the pile, and when you need a plate, you take one off the top. Adding or removing plates from the middle or bottom wouldn’t work as well! Adding data is called pushing onto the stack, and removing data is called popping off the stack. All data stored on the stack must have a known, fixed size. Data with an unknown size at compile time or a size that might change must be stored on the heap instead.

The heap is less organized: When you put data on the heap, you request a certain amount of space. The memory allocator finds an empty spot in the heap that is big enough, marks it as being in use, and returns a pointer, which is the address of that location. This process is called allocating on the heap and is sometimes abbreviated as just allocating (pushing values onto the stack is not considered allocating). Because the pointer to the heap is a known, fixed size, you can store the pointer on the stack, but when you want the actual data, you must follow the pointer. Think of being seated at a restaurant. When you enter, you state the number of people in your group, and the host finds an empty table that fits everyone and leads you there. If someone in your group comes late, they can ask where you’ve been seated to find you.
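As a rough illustration of the difference, the sketch below keeps one integer directly on the stack and puts another on the heap. `Box<T>` is the standard library's heap-allocating pointer type; it is not introduced in this chapter and is used here only as a simple way to force a heap allocation. The function name `stack_and_heap` is made up for this example.

```rust
// Hypothetical helper: returns both values so the result can be inspected.
fn stack_and_heap() -> (i32, i32) {
    // A fixed-size value stored directly on the stack.
    let on_stack: i32 = 5;

    // Box::new asks the allocator for heap space, stores 5 there, and
    // returns a pointer. The pointer itself lives on the stack.
    let on_heap: Box<i32> = Box::new(5);

    // Reading the heap value means following the pointer with `*`.
    (on_stack, *on_heap)
} // on_heap goes out of scope here, and its heap memory is returned
```

Calling `stack_and_heap()` from a `main` function yields the same number twice; only where each value was stored differs.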

Pushing to the stack is faster than allocating on the heap because the allocator never has to search for a place to store new data; that location is always at the top of the stack. Comparatively, allocating space on the heap requires more work because the allocator must first find a big enough space to hold the data and then perform bookkeeping to prepare for the next allocation.

Accessing data in the heap is generally slower than accessing data on the stack because you have to follow a pointer to get there. Contemporary processors are faster if they jump around less in memory. Continuing the analogy, consider a server at a restaurant taking orders from many tables. It’s most efficient to get all the orders at one table before moving on to the next table. Taking an order from table A, then an order from table B, then one from A again, and then one from B again would be a much slower process. By the same token, a processor can usually do its job better if it works on data that’s close to other data (as it is on the stack) rather than farther away (as it can be on the heap).

When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack.

Keeping track of what parts of code are using what data on the heap, minimizing the amount of duplicate data on the heap, and cleaning up unused data on the heap so that you don’t run out of space are all problems that ownership addresses. Once you understand ownership, you won’t need to think about the stack and the heap very often. But knowing that the main purpose of ownership is to manage heap data can help explain why it works the way it does.

Ownership Rules

First, let’s take a look at the ownership rules. Keep these rules in mind as we work through the examples that illustrate them:

  • Each value in Rust has an owner.

  • There can only be one owner at a time.

  • When the owner goes out of scope, the value will be dropped.

Variable Scope

Now that we’re past basic Rust syntax, we won’t include all the fn main() { code in the examples, so if you’re following along, make sure to put the following examples inside a main function manually. As a result, our examples will be a bit more concise, letting us focus on the actual details rather than boilerplate code.

As a first example of ownership, we’ll look at the scope of some variables. A scope is the range within a program for which an item is valid. Take the following variable:

let s = "hello";

The variable s refers to a string literal, where the value of the string is hardcoded into the text of our program. The variable is valid from the point at which it’s declared until the end of the current scope. Listing 4-1 shows a program with comments annotating where the variable s would be valid.

fn main() {
    {                      // s is not valid here, since it's not yet declared
        let s = "hello";   // s is valid from this point forward

        // do stuff with s
    }                      // this scope is now over, and s is no longer valid
}

In other words, there are two important points in time here:

  • When s comes into scope, it is valid.

  • It remains valid until it goes out of scope.

At this point, the relationship between scopes and when variables are valid is similar to that in other programming languages. Now we’ll build on top of this understanding by introducing the String type.

The String Type

To illustrate the rules of ownership, we need a data type that is more complex than those we covered in the “Data Types” section of Chapter 3. The types covered previously are of a known size, can be stored on the stack and popped off the stack when their scope is over, and can be quickly and trivially copied to make a new, independent instance if another part of code needs to use the same value in a different scope. But we want to look at data that is stored on the heap and explore how Rust knows when to clean up that data, and the String type is a great example.

We’ll concentrate on the parts of String that relate to ownership. These aspects also apply to other complex data types, whether they are provided by the standard library or created by you. We’ll discuss non-ownership aspects of String in Chapter 8.

We’ve already seen string literals, where a string value is hardcoded into our program. String literals are convenient, but they aren’t suitable for every situation in which we may want to use text. One reason is that they’re immutable. Another is that not every string value can be known when we write our code: For example, what if we want to take user input and store it? It is for these situations that Rust has the String type. This type manages data allocated on the heap and as such is able to store an amount of text that is unknown to us at compile time. You can create a String from a string literal using the from function, like so:

let s = String::from("hello");

The double colon :: operator allows us to namespace this particular from function under the String type rather than using some sort of name like string_from. We’ll discuss this syntax more in the “Methods” section of Chapter 5, and when we talk about namespacing with modules in “Paths for Referring to an Item in the Module Tree” in Chapter 7.

This kind of string can be mutated:

fn main() {
    let mut s = String::from("hello");

    s.push_str(", world!"); // push_str() appends a literal to a String

    println!("{s}"); // this will print `hello, world!`
}

So, what’s the difference here? Why can String be mutated but literals cannot? The difference is in how these two types deal with memory.

Memory and Allocation

In the case of a string literal, we know the contents at compile time, so the text is hardcoded directly into the final executable. This is why string literals are fast and efficient. But these properties only come from the string literal’s immutability. Unfortunately, we can’t put a blob of memory into the binary for each piece of text whose size is unknown at compile time and whose size might change while running the program.

With the String type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:

  • The memory must be requested from the memory allocator at runtime.

  • We need a way of returning this memory to the allocator when we’re done with our String.

That first part is done by us: When we call String::from, its implementation requests the memory it needs. This is pretty much universal in programming languages.

However, the second part is different. In languages with a garbage collector (GC), the GC keeps track of and cleans up memory that isn’t being used anymore, and we don’t need to think about it. In most languages without a GC, it’s our responsibility to identify when memory is no longer being used and to call code to explicitly free it, just as we did to request it. Doing this correctly has historically been a difficult programming problem. If we forget, we’ll waste memory. If we do it too early, we’ll have an invalid variable. If we do it twice, that’s a bug too. We need to pair exactly one allocate with exactly one free.

Rust takes a different path: The memory is automatically returned once the variable that owns it goes out of scope. Here’s a version of our scope example from Listing 4-1 using a String instead of a string literal:

fn main() {
    {
        let s = String::from("hello"); // s is valid from this point forward

        // do stuff with s
    }                                  // this scope is now over, and s is no
                                       // longer valid
}

There is a natural point at which we can return the memory our String needs to the allocator: when s goes out of scope. When a variable goes out of scope, Rust calls a special function for us. This function is called drop, and it’s where the author of String can put the code to return the memory. Rust calls drop automatically at the closing curly bracket.
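To make the timing of drop visible, here is a minimal sketch using a type of our own rather than String. The `Tracked` struct, its `drops` counter, and the function `drops_around_scope` are hypothetical names invented for this illustration; the only standard library pieces are `std::cell::Cell` and the `Drop` trait.

```rust
use std::cell::Cell;

// Hypothetical type that records, in a shared counter, when it is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    // Rust calls this automatically when a Tracked value goes out of scope.
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the counter as seen inside the inner scope and after it closes.
fn drops_around_scope() -> (u32, u32) {
    let drops = Cell::new(0);
    let inside;
    {
        let _t = Tracked { drops: &drops }; // _t is valid from this point
        inside = drops.get();               // nothing has been dropped yet
    } // closing curly bracket: drop runs for _t here
    (inside, drops.get())
}
```

The counter reads 0 while `_t` is in scope and 1 once the inner block has ended, which matches the description above: we never call drop ourselves.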

Note: In C++, this pattern of deallocating resources at the end of an item’s lifetime is sometimes called Resource Acquisition Is Initialization (RAII). The drop function in Rust will be familiar to you if you’ve used RAII patterns.

This pattern has a profound impact on the way Rust code is written. It may seem simple right now, but the behavior of code can be unexpected in more complicated situations when we want to have multiple variables use the data we’ve allocated on the heap. Let’s explore some of those situations now.

Variables and Data Interacting with Move

Multiple variables can interact with the same data in different ways in Rust. Listing 4-2 shows an example using an integer.

fn main() {
    let x = 5;
    let y = x;
}

We can probably guess what this is doing: “Bind the value 5 to x; then, make a copy of the value in x and bind it to y.” We now have two variables, x and y, and both equal 5. This is indeed what is happening, because integers are simple values with a known, fixed size, and these two 5 values are pushed onto the stack.

Now let’s look at the String version:

fn main() {
    let s1 = String::from("hello");
    let s2 = s1;
}

This looks very similar, so we might assume that the way it works would be the same: That is, the second line would make a copy of the value in s1 and bind it to s2. But this isn’t quite what happens.

Take a look at Figure 4-1 to see what is happening to String under the covers. A String is made up of three parts, shown on the left: a pointer to the memory that holds the contents of the string, a length, and a capacity. This group of data is stored on the stack. On the right is the memory on the heap that holds the contents.

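Two tables: the first table contains the representation of s1 on the stack, consisting of its length (5), capacity (5), and a pointer to the first value in the second table. The second table contains the representation of the string data on the heap, byte by byte.

Figure 4-1: The representation in memory of a String holding the value "hello" bound to s1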

The length is how much memory, in bytes, the contents of the String are currently using. The capacity is the total amount of memory, in bytes, that the String has received from the allocator. The difference between length and capacity matters, but not in this context, so for now, it’s fine to ignore the capacity.
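Although we can ignore capacity for the rest of this discussion, a quick sketch may help tell the two numbers apart. It uses the standard `String::with_capacity`, `len`, and `capacity` methods; the function name `length_and_capacity` is made up for the example.

```rust
// Hypothetical helper returning (length, capacity) of a small String.
fn length_and_capacity() -> (usize, usize) {
    // Ask the allocator for room for at least 10 bytes up front.
    let mut s = String::with_capacity(10);

    // The contents now use 5 of those bytes.
    s.push_str("hello");

    (s.len(), s.capacity())
}
```

The length is 5, while the capacity is whatever the allocator handed out, which is at least the 10 bytes we requested.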

When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to. In other words, the data representation in memory looks like Figure 4-2.

Three tables: tables s1 and s2 represent those strings on the stack, and both point to the same string data on the heap.

Figure 4-2: The representation in memory of the variable s2 that has a copy of the pointer, length, and capacity of s1
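One way to check this for ourselves is to compare the address of the heap buffer before and after the assignment. The sketch below relies on `as_ptr`, a standard method that returns the buffer's address as a raw pointer; the function name `pointer_survives_assignment` is hypothetical.

```rust
// Hypothetical helper: true if s2 points at the buffer s1 pointed at.
fn pointer_survives_assignment() -> bool {
    let s1 = String::from("hello");

    // Remember where the heap data lives. A raw pointer is just an address,
    // so holding on to it does not keep s1 borrowed.
    let before = s1.as_ptr();

    // Only the pointer, length, and capacity on the stack are copied.
    let s2 = s1;

    // s2 refers to the very same heap buffer; no heap data was duplicated.
    before == s2.as_ptr()
}
```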

The representation does not look like Figure 4-3, which is what memory would look like if Rust instead copied the heap data as well. If Rust did this, the operation s2 = s1 could be very expensive in terms of runtime performance if the data on the heap were large.

Four tables: two tables represent the stack data for s1 and s2, and each points to its own copy of the string data on the heap.

Figure 4-3: Another possibility for what s2 = s1 might do if Rust copied the heap data as well

Earlier, we said that when a variable goes out of scope, Rust automatically calls the drop function and cleans up the heap memory for that variable. But Figure 4-2 shows both data pointers pointing to the same location. This is a problem: When s2 and s1 go out of scope, they will both try to free the same memory. This is known as a double free error and is one of the memory safety bugs we mentioned previously. Freeing memory twice can lead to memory corruption, which can potentially lead to security vulnerabilities.

To ensure memory safety, after the line let s2 = s1;, Rust considers s1 as no longer valid. Therefore, Rust doesn’t need to free anything when s1 goes out of scope. Check out what happens when you try to use s1 after s2 is created; it won’t work:

fn main() {
    let s1 = String::from("hello");
    let s2 = s1;

    println!("{s1}, world!");
}

You’ll get an error like this because Rust prevents you from using the invalidated variable:

$ cargo run
   Compiling ownership v0.1.0 (file:///projects/ownership)
error[E0382]: borrow of moved value: `s1`
 --> src/main.rs:5:16
  |
2 |     let s1 = String::from("hello");
  |         -- move occurs because `s1` has type `String`, which does not implement the `Copy` trait
3 |     let s2 = s1;
  |              -- value moved here
4 |
5 |     println!("{s1}, world!");
  |                ^^ value borrowed here after move
  |
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
  |
3 |     let s2 = s1.clone();
  |                ++++++++

For more information about this error, try `rustc --explain E0382`.
error: could not compile `ownership` (bin "ownership") due to 1 previous error

If you’ve heard the terms shallow copy and deep copy while working with other languages, the concept of copying the pointer, length, and capacity without copying the data probably sounds like making a shallow copy. But because Rust also invalidates the first variable, instead of being called a shallow copy, it’s known as a move. In this example, we would say that s1 was moved into s2. So, what actually happens is shown in Figure 4-4.

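Three tables: tables s1 and s2 represent those strings on the stack, and both point to the same string data on the heap. Table s1 is grayed out because s1 is no longer valid; only s2 can be used to access the heap data.

Figure 4-4: The representation in memory after s1 has been invalidated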

That solves our problem! With only s2 valid, when it goes out of scope it alone will free the memory, and we’re done.

In addition, there’s a design choice that’s implied by this: Rust will never automatically create “deep” copies of your data. Therefore, any automatic copying can be assumed to be inexpensive in terms of runtime performance.

Scope and Assignment

The inverse of this is true for the relationship between scoping, ownership, and memory being freed via the drop function as well. When you assign a completely new value to an existing variable, Rust will call drop and free the original value’s memory immediately. Consider this code, for example:

fn main() {
    let mut s = String::from("hello");
    s = String::from("ahoy");

    println!("{s}, world!");
}

We initially declare a variable s and bind it to a String with the value "hello". Then, we immediately create a new String with the value "ahoy" and assign it to s. At this point, nothing is referring to the original value on the heap at all. Figure 4-5 illustrates the stack and heap data now:

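One table represents the string value on the stack, pointing to the second piece of string data (ahoy) on the heap, with the original string data (hello) grayed out because it cannot be accessed anymore.

Figure 4-5: The representation in memory after the initial value has been replaced in its entirety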

The original string thus immediately goes out of scope. Rust will run the drop function on it and its memory will be freed right away. When we print the value at the end, it will be "ahoy, world!".
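We can observe this ordering with a small sketch that logs each event. The `Logged` type and the function `assignment_drop_order` are hypothetical and exist only for this illustration; a String behaves the same way, it just does not announce when it is dropped.

```rust
use std::cell::RefCell;

// Hypothetical type that appends a line to a shared log when it is dropped.
struct Logged<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Logged<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the events in the order they happened.
fn assignment_drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let mut s = Logged { name: "hello", log: &log };
        log.borrow_mut().push(format!("use {}", s.name));

        // Assigning a completely new value drops the old one right away.
        s = Logged { name: "ahoy", log: &log };
        log.borrow_mut().push(format!("use {}", s.name));
    } // the second value goes out of scope and is dropped here
    log.into_inner()
}
```

The log shows "drop hello" before "use ahoy": the first value is cleaned up at the assignment, not at the end of the scope.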

Variables and Data Interacting with Clone

If we do want to deeply copy the heap data of the String, not just the stack data, we can use a common method called clone. We’ll discuss method syntax in Chapter 5, but because methods are a common feature in many programming languages, you’ve probably seen them before.

Here’s an example of the clone method in action:

fn main() {
    let s1 = String::from("hello");
    let s2 = s1.clone();

    println!("s1 = {s1}, s2 = {s2}");
}

This works just fine and explicitly produces the behavior shown in Figure 4-3, where the heap data does get copied.
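A short sketch can confirm that the clone really owns separate heap data. It compares the contents and then the buffer addresses, using the standard `as_ptr` method; the function name `clone_copies_heap_data` is made up for this example.

```rust
// Hypothetical helper: returns (same contents?, different heap buffers?).
fn clone_copies_heap_data() -> (bool, bool) {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // deep copy: a new heap buffer is allocated

    // Both variables stay valid, so we can inspect them side by side.
    let same_contents = s1 == s2;
    let different_buffers = s1.as_ptr() != s2.as_ptr();

    (same_contents, different_buffers)
}
```

Both checks come out true: the text is equal, but each String points at its own allocation, as in Figure 4-3.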

When you see a call to clone, you know that some arbitrary code is being executed and that code may be expensive. It’s a visual indicator that something different is going on.

Stack-Only Data: Copy

There’s another wrinkle we haven’t talked about yet. This code using integers—part of which was shown in Listing 4-2—works and is valid:

fn main() {
    let x = 5;
    let y = x;

    println!("x = {x}, y = {y}");
}

But this code seems to contradict what we just learned: We don’t have a call to clone, but x is still valid and wasn’t moved into y.

The reason is that types such as integers that have a known size at compile time are stored entirely on the stack, so copies of the actual values are quick to make. That means there’s no reason we would want to prevent x from being valid after we create the variable y. In other words, there’s no difference between deep and shallow copying here, so calling clone wouldn’t do anything different from the usual shallow copying, and we can leave it out.

Rust has a special annotation called the Copy trait that we can place on types that are stored on the stack, as integers are (we’ll talk more about traits in Chapter 10). If a type implements the Copy trait, variables that use it do not move, but rather are trivially copied, making them still valid after assignment to another variable.

Rust won’t let us annotate a type with Copy if the type, or any of its parts, has implemented the Drop trait. If the type needs something special to happen when the value goes out of scope and we add the Copy annotation to that type, we’ll get a compile-time error. To learn about how to add the Copy annotation to your type to implement the trait, see “Derivable Traits” in Appendix C.
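As a preview of what that annotation looks like, here is a minimal sketch with a type of our own. The `Point` struct and the function `copy_point` are hypothetical; `derive` and the traits it names are standard. Clone is listed alongside Copy because every Copy type must also implement Clone.

```rust
// Hypothetical struct made only of Copy fields, so it may derive Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Returns both variables to show that the first one is still usable.
fn copy_point() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // copied, not moved, because Point implements Copy

    (p1, p2) // p1 is still valid here
}
```

If we also wrote an `impl Drop for Point`, the derive would be rejected at compile time, which is the restriction described above.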

So, what types implement the Copy trait? You can check the documentation for the given type to be sure, but as a general rule, any group of simple scalar values can implement Copy, and nothing that requires allocation or is some form of resource can implement Copy. Here are some of the types that implement Copy:

  • All the integer types, such as u32.

  • The Boolean type, bool, with values true and false.

  • All the floating-point types, such as f64.

  • The character type, char.

  • Tuples, if they only contain types that also implement Copy. For example, (i32, i32) implements Copy, but (i32, String) does not.
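The tuple rule in the last item can be sketched as follows. The function names `copy_tuple` and `move_tuple` are made up for the illustration.

```rust
// (i32, i32) implements Copy, so the original stays usable.
fn copy_tuple() -> ((i32, i32), (i32, i32)) {
    let a = (1, 2);
    let b = a; // a is copied
    (a, b)     // both a and b are valid
}

// (i32, String) does not implement Copy, so assignment moves it.
fn move_tuple() -> (i32, String) {
    let a = (1, String::from("hello"));
    let b = a; // a is moved into b; using a after this line would not compile
    b
}
```

A single non-Copy field is enough to make the whole tuple move instead of copy.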

Ownership and Functions

The mechanics of passing a value to a function are similar to those when assigning a value to a variable. Passing a variable to a function will move or copy, just as assignment does. Listing 4-3 has an example with some annotations showing where variables go into and out of scope.

fn main() {
    let s = String::from("hello");  // s comes into scope

    takes_ownership(s);             // s's value moves into the function...
                                    // ... and so is no longer valid here

    let x = 5;                      // x comes into scope

    makes_copy(x);                  // Because i32 implements the Copy trait,
                                    // x does NOT move into the function,
                                    // so it's okay to use x afterward.

} // Here, x goes out of scope, then s. However, because s's value was moved,
  // nothing special happens.

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{some_string}");
} // Here, some_string goes out of scope and `drop` is called. The backing
  // memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{some_integer}");
} // Here, some_integer goes out of scope. Nothing special happens.

If we tried to use s after the call to takes_ownership, the compiler would report an error. These static checks protect us from mistakes. Try adding code to main that uses s and x to see where you can use them and where the ownership rules prevent you from doing so.
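One experiment worth trying is to pass a clone instead of the original, so that the caller keeps a usable value. This is a sketch, not part of Listing 4-3: the functions `takes_ownership_len` and `call_and_keep` are hypothetical variants written for this illustration.

```rust
// Hypothetical variant of takes_ownership that reports the length it saw.
fn takes_ownership_len(some_string: String) -> usize {
    some_string.len()
} // some_string goes out of scope here and its memory is freed

// Returns the length along with the original String to show it survived.
fn call_and_keep() -> (usize, String) {
    let s = String::from("hello");

    // The clone is moved into the function; s itself is untouched.
    let len = takes_ownership_len(s.clone());

    // Passing s directly above would have moved it, and returning s
    // on the next line would then fail to compile.
    (len, s)
}
```

The extra allocation made by clone is the price of keeping `s`; the end of this section points to a cheaper alternative.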

Return Values and Scope

Returning values can also transfer ownership. Listing 4-4 shows an example of a function that returns some value, with similar annotations as those in Listing 4-3.

fn main() {
    let s1 = gives_ownership();        // gives_ownership moves its return
                                       // value into s1

    let s2 = String::from("hello");    // s2 comes into scope

    let s3 = takes_and_gives_back(s2); // s2 is moved into
                                       // takes_and_gives_back, which also
                                       // moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing
  // happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String {       // gives_ownership will move its
                                       // return value into the function
                                       // that calls it

    let some_string = String::from("yours"); // some_string comes into scope

    some_string                        // some_string is returned and
                                       // moves out to the calling
                                       // function
}

// This function takes a String and returns a String.
fn takes_and_gives_back(a_string: String) -> String {
    // a_string comes into
    // scope

    a_string  // a_string is returned and moves out to the calling function
}

The ownership of a variable follows the same pattern every time: Assigning a value to another variable moves it. When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless ownership of the data has been moved to another variable.

While this works, taking ownership and then returning ownership with every function is a bit tedious. What if we want to let a function use a value but not take ownership? It’s quite annoying that anything we pass in also needs to be passed back if we want to use it again, in addition to any data resulting from the body of the function that we might want to return as well.

Rust does let us return multiple values using a tuple, as shown in Listing 4-5.

fn main() {
    let s1 = String::from("hello");

    let (s2, len) = calculate_length(s1);

    println!("The length of '{s2}' is {len}.");
}

fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length of a String

    (s, length)
}

But this is too much ceremony and a lot of work for a concept that should be common. Luckily for us, Rust has a feature for using a value without transferring ownership: references.
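As a brief preview only, here is roughly how Listing 4-5 might look with a reference in place of the tuple. The `&` syntax has not been explained yet, and the wrapper function `length_without_move` is a hypothetical name added so that the result can be returned and inspected.

```rust
// Takes a reference to a String, so ownership stays with the caller.
fn calculate_length(s: &String) -> usize {
    s.len()
}

// Hypothetical wrapper: returns the original String and its length.
fn length_without_move() -> (String, usize) {
    let s1 = String::from("hello");

    // &s1 lets the function read s1 without taking ownership of it.
    let len = calculate_length(&s1);

    (s1, len) // s1 is still valid because it was never moved
}
```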