Advanced Types - The Rust Programming Language (Simplified Chinese Edition)

Advanced Types

The Rust type system has some features that we’ve so far mentioned but haven’t yet discussed. We’ll start by discussing newtypes in general as we examine why they are useful as types. Then, we’ll move on to type aliases, a feature similar to newtypes but with slightly different semantics. We’ll also discuss the ! type and dynamically sized types.

Type Safety and Abstraction with the Newtype Pattern

This section assumes you’ve read the earlier section “Implementing External Traits with the Newtype Pattern”. The newtype pattern is also useful for tasks beyond those we’ve discussed so far, including statically enforcing that values are never confused and indicating the units of a value. You saw an example of using newtypes to indicate units in Listing 20-16: Recall that the Millimeters and Meters structs wrapped u32 values in a newtype. If we wrote a function with a parameter of type Millimeters, we wouldn’t be able to compile a program that accidentally tried to call that function with a value of type Meters or a plain u32.
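To make the unit-safety point concrete, here is a minimal sketch in the spirit of the Millimeters and Meters newtypes. The derive attributes, the From conversion, and the describe_length function are our own illustrative additions, not part of the earlier listing.

```rust
// Newtypes wrapping u32, modeled on Millimeters and Meters.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Millimeters(u32);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(u32);

// An explicit conversion (hypothetical): the only way to turn Meters into Millimeters.
impl From<Meters> for Millimeters {
    fn from(m: Meters) -> Self {
        Millimeters(m.0 * 1000)
    }
}

// Hypothetical function that accepts only Millimeters.
fn describe_length(len: Millimeters) -> String {
    format!("{} mm", len.0)
}

fn main() {
    let height = Meters(2);

    // describe_length(height); // would not compile: expected Millimeters, found Meters
    // describe_length(2000);   // would not compile: expected Millimeters, found integer

    println!("{}", describe_length(Millimeters::from(height)));
}
```

Uncommenting either of the two rejected calls should produce a mismatched-types error at compile time, which is exactly the protection the newtype buys us.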

We can also use the newtype pattern to abstract away some implementation details of a type: The new type can expose a public API that is different from the API of the private inner type.

Newtypes can also hide internal implementation. For example, we could provide a People type to wrap a HashMap<i32, String> that stores a person’s ID associated with their name. Code using People would only interact with the public API we provide, such as a method to add a name string to the People collection; that code wouldn’t need to know that we assign an i32 ID to names internally. The newtype pattern is a lightweight way to achieve encapsulation to hide implementation details, which we discussed in the “Encapsulation that Hides Implementation Details” section in Chapter 18.
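One way the People wrapper described above might look is sketched here. The method names (new, add, contains, len) and the scheme of assigning IDs in insertion order are assumptions made for illustration; the text only specifies that People wraps a HashMap<i32, String>.

```rust
use std::collections::HashMap;

// The inner HashMap is private, so callers never see the i32 IDs.
pub struct People(HashMap<i32, String>);

impl People {
    pub fn new() -> Self {
        People(HashMap::new())
    }

    // Adds a name. The ID scheme (insertion order) is an assumed internal detail.
    pub fn add(&mut self, name: &str) {
        let id = self.0.len() as i32;
        self.0.insert(id, name.to_string());
    }

    // Reports whether a name is present, without exposing IDs.
    pub fn contains(&self, name: &str) -> bool {
        self.0.values().any(|n| n.as_str() == name)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let mut people = People::new();
    people.add("Ferris");
    people.add("Corro");

    println!(
        "{} people; has Ferris: {}",
        people.len(),
        people.contains("Ferris")
    );
}
```

Because the public API never mentions i32, we could later change how IDs are assigned, or replace the HashMap entirely, without touching code that uses People.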

Type Synonyms and Type Aliases

Rust provides the ability to declare a type alias to give an existing type another name. For this we use the type keyword. For example, we can create the alias Kilometers to i32 like so:

type Kilometers = i32;

Now the alias Kilometers is a synonym for i32; unlike the Millimeters and Meters types we created in Listing 20-16, Kilometers is not a separate, new type. Values that have the type Kilometers will be treated the same as values of type i32:

fn main() {
    type Kilometers = i32;

    let x: i32 = 5;
    let y: Kilometers = 5;

    println!("x + y = {}", x + y);
}

Because Kilometers and i32 are the same type, we can add values of both types and can pass Kilometers values to functions that take i32 parameters. However, using this method, we don’t get the type-checking benefits that we get from the newtype pattern discussed earlier. In other words, if we mix up Kilometers and i32 values somewhere, the compiler will not give us an error.
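The following sketch shows the kind of mix-up the compiler will not catch with an alias. The next_birthday function is hypothetical and exists only to give us an i32 parameter with an unrelated meaning.

```rust
type Kilometers = i32;

// Hypothetical function: takes a plain i32 that represents an age.
fn next_birthday(age: i32) -> i32 {
    age + 1
}

fn main() {
    let distance: Kilometers = 42;

    // This compiles without complaint: Kilometers is i32, so passing a
    // distance where an age is expected goes unnoticed.
    let nonsense = next_birthday(distance);

    println!("{nonsense}");
}
```

Had Kilometers been a newtype such as struct Kilometers(i32), the call would be rejected with a mismatched-types error.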

The main use case for type synonyms is to reduce repetition. For example, we might have a lengthy type like this:

Box<dyn Fn() + Send + 'static>

Writing this lengthy type in function signatures and as type annotations all over the code can be tiresome and error-prone. Imagine having a project full of code like that in Listing 20-25.

fn main() {
    let f: Box<dyn Fn() + Send + 'static> = Box::new(|| println!("hi"));

    fn takes_long_type(f: Box<dyn Fn() + Send + 'static>) {
        // --snip--
    }

    fn returns_long_type() -> Box<dyn Fn() + Send + 'static> {
        // --snip--
        Box::new(|| ())
    }
}

A type alias makes this code more manageable by reducing the repetition. In Listing 20-26, we’ve introduced an alias named Thunk for the verbose type and can replace all uses of the type with the shorter alias Thunk.

fn main() {
    type Thunk = Box<dyn Fn() + Send + 'static>;

    let f: Thunk = Box::new(|| println!("hi"));

    fn takes_long_type(f: Thunk) {
        // --snip--
    }

    fn returns_long_type() -> Thunk {
        // --snip--
        Box::new(|| ())
    }
}

This code is much easier to read and write! Choosing a meaningful name for a type alias can help communicate your intent as well (thunk is a word for code to be evaluated at a later time, so it’s an appropriate name for a closure that gets stored).
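As a sketch of what "a closure that gets stored" can look like in practice, the code below keeps several Thunk values in a Vec and evaluates them later. The run_all function and the shared counter are illustrative assumptions; only the Thunk alias comes from the text.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

type Thunk = Box<dyn Fn() + Send + 'static>;

// Hypothetical helper: evaluates every stored thunk and reports how many ran.
fn run_all(thunks: &[Thunk]) -> usize {
    for thunk in thunks {
        thunk();
    }
    thunks.len()
}

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let mut queue: Vec<Thunk> = Vec::new();

    // Store three closures now; nothing runs yet.
    for _ in 0..3 {
        let counter = Arc::clone(&counter);
        queue.push(Box::new(move || {
            counter.fetch_add(1, Ordering::SeqCst);
        }));
    }

    // Evaluate them later.
    run_all(&queue);
    println!("thunks run: {}", counter.load(Ordering::SeqCst));
}
```

Without the alias, both the Vec annotation and the run_all signature would have to spell out Box<dyn Fn() + Send + 'static> in full.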

Type aliases are also commonly used with the Result<T, E> type for reducing repetition. Consider the std::io module in the standard library. I/O operations often return a Result<T, E> to handle situations when operations fail to work. This library has a std::io::Error struct that represents all possible I/O errors. Many of the functions in std::io will be returning Result<T, E> where the E is std::io::Error, such as these functions in the Write trait:

use std::fmt;
use std::io::Error;

pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error>;
    fn flush(&mut self) -> Result<(), Error>;

    fn write_all(&mut self, buf: &[u8]) -> Result<(), Error>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<(), Error>;
}

The Result<..., Error> is repeated a lot. As such, std::io has this type alias declaration:

type Result<T> = std::result::Result<T, std::io::Error>;

Because this declaration is in the std::io module, we can use the fully qualified alias std::io::Result<T>; that is, a Result<T, E> with the E filled in as std::io::Error. The Write trait function signatures end up looking like this:

use std::fmt;

type Result<T> = std::result::Result<T, std::io::Error>;

pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;

    fn write_all(&mut self, buf: &[u8]) -> Result<()>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<()>;
}

The type alias helps in two ways: It makes code easier to write and it gives us a consistent interface across all of std::io. Because it’s an alias, it’s just another Result<T, E>, which means we can use any methods that work on Result<T, E> with it, as well as special syntax like the ? operator.
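Here is a small sketch of using the alias from our own code. The write_greeting function is a hypothetical helper; the calls it makes (write_all, flush) are the Write trait methods shown above.

```rust
use std::io::{self, Write};

// Hypothetical helper: writes a greeting to any writer.
// io::Result<()> is shorthand for Result<(), io::Error>.
fn write_greeting<W: Write>(out: &mut W, name: &str) -> io::Result<()> {
    // The ? operator works because the alias is an ordinary Result.
    out.write_all(b"Hello, ")?;
    out.write_all(name.as_bytes())?;
    out.write_all(b"!\n")?;
    out.flush()
}

fn main() -> io::Result<()> {
    // Vec<u8> implements Write, so it can stand in for a file or socket.
    let mut buffer: Vec<u8> = Vec::new();
    write_greeting(&mut buffer, "Ferris")?;

    // Regular Result methods such as is_ok are available too.
    let ok = write_greeting(&mut buffer, "again").is_ok();
    println!("{} bytes written, second call ok: {ok}", buffer.len());

    Ok(())
}
```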

The Never Type That Never Returns

Rust has a special type named ! that’s known in type theory lingo as the empty type because it has no values. We prefer to call it the never type because it stands in the place of the return type when a function will never return. Here is an example:

fn bar() -> ! {
    // --snip--
    panic!();
}

This code is read as “the function bar returns never.” Functions that return never are called diverging functions. We can’t create values of the type !, so bar can never possibly return.

But what use is a type you can never create values for? Recall the code from Listing 2-5, part of the number-guessing game; we’ve reproduced a bit of it here in Listing 20-27.

use std::cmp::Ordering;
use std::io;

use rand::Rng;

fn main() {
    println!("Guess the number!");

    let secret_number = rand::thread_rng().gen_range(1..=100);

    println!("The secret number is: {secret_number}");

    loop {
        println!("Please input your guess.");

        let mut guess = String::new();

        // --snip--

        io::stdin()
            .read_line(&mut guess)
            .expect("Failed to read line");

        let guess: u32 = match guess.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };

        println!("You guessed: {guess}");

        // --snip--

        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => {
                println!("You win!");
                break;
            }
        }
    }
}

At the time, we skipped over some details in this code. In “The match Control Flow Construct” section in Chapter 6, we discussed that match arms must all return the same type. So, for example, the following code doesn’t work:

fn main() {
    let guess = "3";
    let guess = match guess.trim().parse() {
        Ok(_) => 5,
        Err(_) => "hello",
    };
}

The type of guess in this code would have to be an integer and a string, and Rust requires that guess have only one type. So, what does continue return? How were we allowed to return a u32 from one arm and have another arm that ends with continue in Listing 20-27?

As you might have guessed, continue has the type !. That is, when Rust computes the type of guess, it looks at both match arms, the former with the type u32 and the latter with the type !. Because ! can never have a value, Rust decides that the type of guess is u32.

The formal way of describing this behavior is that expressions of type ! can be coerced into any other type. We’re allowed to end this match arm with continue because continue doesn’t return a value; instead, it moves control back to the top of the loop, so in the Err case, we never assign a value to guess.
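The same coercion applies to other expressions that never produce a value, such as return. The sketch below uses two hypothetical functions, sum_valid and parse_port, to show a match arm of type ! sitting next to an arm of an integer type.

```rust
// Hypothetical helper: sums the inputs that parse as u32 and skips the rest.
fn sum_valid(inputs: &[&str]) -> u32 {
    let mut total = 0;
    for input in inputs {
        let n: u32 = match input.trim().parse() {
            Ok(n) => n,         // this arm has type u32
            Err(_) => continue, // continue has type !, which coerces to u32
        };
        total += n;
    }
    total
}

// Hypothetical helper: parses a port, returning a default early on bad input.
fn parse_port(input: &str) -> u16 {
    let port: u16 = match input.trim().parse() {
        Ok(n) => n,
        Err(_) => return 8080, // return also has type !
    };
    port
}

fn main() {
    println!("{}", sum_valid(&["1", "two", " 3 "]));
    println!("{}", parse_port("not a port"));
}
```

In both functions the Err arm never yields a value for the let binding, so the binding's type is determined by the Ok arm alone.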

The never type is useful with the panic! macro as well. Recall the unwrap function that we call on Option<T> values to produce a value or panic with this definition:

enum Option<T> {
    Some(T),
    None,
}

use crate::Option::*;

impl<T> Option<T> {
    pub fn unwrap(self) -> T {
        match self {
            Some(val) => val,
            None => panic!("called `Option::unwrap()` on a `None` value"),
        }
    }
}

In this code, the same thing happens as in the match in Listing 20-27: Rust sees that val has the type T and panic! has the type !, so the result of the overall match expression is T. This code works because panic! doesn’t produce a value; it ends the program. In the None case, we won’t be returning a value from unwrap, so this code is valid.

One final expression that has the type ! is a loop:

fn main() {
    print!("forever ");

    loop {
        print!("and ever ");
    }
}

Here, the loop never ends, so ! is the type of the expression. However, this wouldn’t be true if we included a break, because the loop would terminate when it got to the break.
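To contrast the two cases, the sketch below pairs a loop without break, which can serve as the body of a function returning !, with a loop that breaks with a value and therefore has an ordinary type. Both function names are hypothetical.

```rust
// A loop with no break has type !, so it satisfies a return type of !.
#[allow(dead_code)]
fn spin_forever() -> ! {
    loop {
        std::thread::yield_now();
    }
}

// Hypothetical helper: smallest power of two that is at least n.
// Assumes n is small enough that doubling does not overflow u32.
fn first_power_of_two_at_least(n: u32) -> u32 {
    let mut candidate = 1;
    // Because of `break candidate`, this loop expression has type u32, not !.
    loop {
        if candidate >= n {
            break candidate;
        }
        candidate *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_at_least(5));
}
```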

Dynamically Sized Types and the Sized Trait

Rust needs to know certain details about its types, such as how much space to allocate for a value of a particular type. This leaves one corner of its type system a little confusing at first: the concept of dynamically sized types. Sometimes referred to as DSTs or unsized types, these types let us write code using values whose size we can know only at runtime.

Let’s dig into the details of a dynamically sized type called str, which we’ve been using throughout the book. That’s right, not &str, but str on its own, is a DST. In many cases, such as when storing text entered by a user, we can’t know how long the string is until runtime. That means we can’t create a variable of type str, nor can we take an argument of type str. Consider the following code, which does not work:

fn main() {
    let s1: str = "Hello there!";
    let s2: str = "How's it going?";
}

Rust needs to know how much memory to allocate for any value of a particular type, and all values of a type must use the same amount of memory. If Rust allowed us to write this code, these two str values would need to take up the same amount of space. But they have different lengths: s1 needs 12 bytes of storage and s2 needs 15. This is why it’s not possible to create a variable holding a dynamically sized type.

So, what do we do? In this case, you already know the answer: We make the type of s1 and s2 a string slice (&str) rather than str. Recall from the “String Slices” section in Chapter 4 that the slice data structure only stores the starting position and the length of the slice. So, although &T is a single value that stores the memory address of where the T is located, a string slice is two values: the address of the str and its length. As such, we can know the size of a string slice value at compile time: It’s twice the size of a usize. That is, we always know the size of a string slice, no matter how long the string it refers to is. In general, this is the way in which dynamically sized types are used in Rust: They have an extra bit of metadata that stores the size of the dynamic information. The golden rule of dynamically sized types is that we must always put values of dynamically sized types behind a pointer of some kind.
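We can check these sizes with std::mem::size_of and std::mem::size_of_val. This is a sketch that reflects how current compilers lay out references; the pointer_sizes function is a hypothetical helper introduced only to make the numbers easy to inspect.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: sizes of a reference to a sized type and of a &str.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&u8>(), size_of::<&str>())
}

fn main() {
    let (thin, wide) = pointer_sizes();

    // A reference to a sized type is a single address.
    assert_eq!(thin, size_of::<usize>());
    // A string slice carries an address plus a length.
    assert_eq!(wide, 2 * size_of::<usize>());

    // The str values themselves differ in size, which is only known at runtime
    // in general; size_of_val reads the length stored in the reference.
    assert_eq!(size_of_val("Hello there!"), 12);
    assert_eq!(size_of_val("How's it going?"), 15);

    println!("&u8: {thin} bytes, &str: {wide} bytes");
}
```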

We can combine str with all kinds of pointers: for example, Box<str> or Rc<str>. In fact, you’ve seen this before but with a different dynamically sized type: traits. Every trait is a dynamically sized type we can refer to by using the name of the trait. In the “Using Trait Objects to Abstract over Shared Behavior” section in Chapter 18, we mentioned that to use traits as trait objects, we must put them behind a pointer, such as &dyn Trait or Box<dyn Trait> (Rc<dyn Trait> would work too).
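A brief sketch of these pointer combinations follows. The join_owned function is hypothetical; Box::from and Rc::from are the standard conversions from &str.

```rust
use std::fmt::Display;
use std::rc::Rc;

// Hypothetical helper: accepts str behind two different kinds of pointers.
fn join_owned(a: Box<str>, b: Rc<str>) -> String {
    format!("{a} {b}")
}

fn main() {
    // str behind an owning pointer and behind a reference-counted pointer.
    let boxed: Box<str> = Box::from("hello");
    let shared: Rc<str> = Rc::from("world");

    // A trait object is also a DST, so it lives behind a pointer as well.
    let printable: Box<dyn Display> = Box::new(42);

    println!("{} / {}", join_owned(boxed, Rc::clone(&shared)), printable);
}
```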

To work with DSTs, Rust provides the Sized trait to determine whether or not a type’s size is known at compile time. This trait is automatically implemented for everything whose size is known at compile time. In addition, Rust implicitly adds a bound on Sized to every generic function. That is, a generic function definition like this:

fn generic<T>(t: T) {
    // --snip--
}

is actually treated as though we had written this:

fn generic<T: Sized>(t: T) {
    // --snip--
}

By default, generic functions will work only on types that have a known size at compile time. However, you can use the following special syntax to relax this restriction:

fn generic<T: ?Sized>(t: &T) {
    // --snip--
}

A trait bound on ?Sized means “T may or may not be Sized,” and this notation overrides the default that generic types must have a known size at compile time. The ?Trait syntax with this meaning is only available for Sized, not any other traits.

Also note that we switched the type of the t parameter from T to &T. Because the type might not be Sized, we need to use it behind some kind of pointer. In this case, we’ve chosen a reference.
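To see the relaxed bound in use, here is a sketch of a function that accepts both sized and dynamically sized types. The describe function and its extra Debug bound are assumptions added so that the example has something to print.

```rust
use std::fmt::Debug;

// With ?Sized, T may be a DST such as str or [u8], so we take it by reference.
fn describe<T: ?Sized + Debug>(t: &T) -> String {
    format!("{:?} occupies {} bytes", t, std::mem::size_of_val(t))
}

fn main() {
    println!("{}", describe("hi"));              // T = str
    println!("{}", describe(&[1u8, 2, 3][..])); // T = [u8]
    println!("{}", describe(&7u32));             // T = u32; sized types still work
}
```

If we removed ?Sized from the bound, the first two calls would be rejected because str and [u8] do not implement Sized.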

Next, we’ll talk about functions and closures!
