Futures and the Async Syntax - The Rust Programming Language (Simplified Chinese Edition)

Futures and the Async Syntax

The key elements of asynchronous programming in Rust are futures and Rust’s async and await keywords.

A future is a value that may not be ready now but will become ready at some point in the future. (This same concept shows up in many languages, sometimes under other names such as task or promise.) Rust provides a Future trait as a building block so that different async operations can be implemented with different data structures but with a common interface. In Rust, futures are types that implement the Future trait. Each future holds its own information about the progress that has been made and what “ready” means.

You can apply the async keyword to blocks and functions to specify that they can be interrupted and resumed. Within an async block or async function, you can use the await keyword to await a future (that is, wait for it to become ready). Any point where you await a future within an async block or function is a potential spot for that block or function to pause and resume. The process of checking with a future to see if its value is available yet is called polling.
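To make polling concrete, here is a minimal sketch that polls a small async block by hand, using only the standard library. `YieldOnce`, `NoopWaker`, and `poll_twice` are hypothetical names invented for this illustration; they are not part of trpl or any runtime. The first poll stops at the await point and reports `Pending`; the second resumes from there and finishes.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: not ready on its first poll, ready on its second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Tell whoever is driving us that polling again is worthwhile.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough when we poll by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a small async block twice and returns what each poll reported.
fn poll_twice() -> (Poll<u32>, Poll<u32>) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = pin!(async {
        YieldOnce { yielded: false }.await; // await point: the block may pause here
        7
    });

    let first = fut.as_mut().poll(&mut cx);
    let second = fut.as_mut().poll(&mut cx);
    (first, second)
}

fn main() {
    let (first, second) = poll_twice();
    println!("first poll: {first:?}, second poll: {second:?}");
}
```

In everyday async code you would not call `poll` yourself; a runtime does this on your behalf.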

Some other languages, such as C# and JavaScript, also use async and await keywords for async programming. If you’re familiar with those languages, you may notice some significant differences in how Rust handles the syntax. That’s for good reason, as we’ll see!

When writing async Rust, we use the async and await keywords most of the time. Rust compiles them into equivalent code using the Future trait, much as it compiles for loops into equivalent code using the Iterator trait. Because Rust provides the Future trait, though, you can also implement it for your own data types when you need to. Many of the functions we’ll see throughout this chapter return types with their own implementations of Future. We’ll return to the definition of the trait at the end of the chapter and dig into more of how it works, but this is enough detail to keep us moving forward.
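As a rough illustration of implementing the trait for your own data type, the sketch below defines a hypothetical `Countdown` future that needs a few polls before it is ready. The `run` helper is a deliberately naive driver assumed for this example (it simply polls in a loop); real runtimes are far more sophisticated.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: reports `Pending` `remaining` times, then becomes ready.
struct Countdown {
    remaining: u32,
    polls: u32,
}

impl Future for Countdown {
    /// The total number of polls it took to finish.
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        self.polls += 1;
        if self.remaining == 0 {
            Poll::Ready(self.polls)
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref(); // request another poll
            Poll::Pending
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Naive driver: polls repeatedly until the future is ready.
fn run<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    let polls = run(Countdown { remaining: 3, polls: 0 });
    println!("ready after {polls} polls");
}
```

Because `Countdown` implements `Future`, it can also be awaited inside any async block or function, just like the futures returned by library functions.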

This may all feel a bit abstract, so let’s write our first async program: a little web scraper. We’ll pass in two URLs from the command line, fetch both of them concurrently, and return the result of whichever one finishes first. This example will have a fair bit of new syntax, but don’t worry—we’ll explain everything you need to know as we go.

Our First Async Program

To keep the focus of this chapter on learning async rather than juggling parts of the ecosystem, we’ve created the trpl crate (trpl is short for “The Rust Programming Language”). It re-exports all the types, traits, and functions you’ll need, primarily from the futures and tokio crates. The futures crate is an official home for Rust experimentation for async code, and it’s actually where the Future trait was originally designed. Tokio is the most widely used async runtime in Rust today, especially for web applications. There are other great runtimes out there, and they may be more suitable for your purposes. We use the tokio crate under the hood for trpl because it’s well tested and widely used.

In some cases, trpl also renames or wraps the original APIs to keep you focused on the details relevant to this chapter. If you want to understand what the crate does, we encourage you to check out its source code. You’ll be able to see what crate each re-export comes from, and we’ve left extensive comments explaining what the crate does.

Create a new binary project named hello-async and add the trpl crate as a dependency:

$ cargo new hello-async
$ cd hello-async
$ cargo add trpl

Now we can use the various pieces provided by trpl to write our first async program. We’ll build a little command line tool that fetches two web pages, pulls the <title> element from each, and prints out the title of whichever page finishes that whole process first.

Defining the page_title Function

Let’s start by writing a function that takes one page URL as a parameter, makes a request to it, and returns the text of the <title> element (see Listing 17-1).

use trpl::Html;

fn main() {
    // TODO: we'll add this next!
}

async fn page_title(url: &str) -> Option<String> {
    let response = trpl::get(url).await;
    let response_text = response.text().await;
    Html::parse(&response_text)
        .select_first("title")
        .map(|title| title.inner_html())
}

First, we define a function named page_title and mark it with the async keyword. Then we use the trpl::get function to fetch whatever URL is passed in and add the await keyword to await the response. To get the text of the response, we call its text method and once again await it with the await keyword. Both of these steps are asynchronous. For the get function, we have to wait for the server to send back the first part of its response, which will include HTTP headers, cookies, and so on and can be delivered separately from the response body. Especially if the body is very large, it can take some time for it all to arrive. Because we have to wait for the entirety of the response to arrive, the text method is also async.

We have to explicitly await both of these futures, because futures in Rust are lazy: they don’t do anything until you ask them to with the await keyword. (In fact, Rust will show a compiler warning if you don’t use a future.) This might remind you of the discussion of iterators in the “Processing a Series of Items with Iterators” section in Chapter 13. Iterators do nothing unless you call their next method—whether directly or by using for loops or methods such as map that use next under the hood. Likewise, futures do nothing unless you explicitly ask them to. This laziness allows Rust to avoid running async code until it’s actually needed.
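The laziness described above can be observed directly. In this sketch, which uses only the standard library, `mark_started`, `laziness_demo`, and the naive `run` driver are hypothetical helpers invented for the illustration. Calling the async function creates a future but runs none of its body; the flag is only set once something polls the future.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Sets the flag as soon as its body actually starts running.
async fn mark_started(flag: &Cell<bool>) -> u32 {
    flag.set(true);
    42
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Naive driver: polls repeatedly until the future is ready.
fn run<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Returns (flag before driving, flag after driving, produced value).
fn laziness_demo() -> (bool, bool, u32) {
    let started = Cell::new(false);

    // Creating the future does not run any of the function body.
    let fut = mark_started(&started);
    let before = started.get();

    // Driving the future to completion is what runs the body.
    let value = run(fut);
    let after = started.get();

    (before, after, value)
}

fn main() {
    let (before, after, value) = laziness_demo();
    println!("started before driving: {before}, after: {after}, value: {value}");
}
```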

Note: This is different from the behavior we saw when using thread::spawn in the “Creating a New Thread with spawn” section in Chapter 16, where the closure we passed to another thread started running immediately. It’s also different from how many other languages approach async. But it’s important for Rust to be able to provide its performance guarantees, just as it is with iterators.

Once we have response_text, we can parse it into an instance of the Html type using Html::parse. Instead of a raw string, we now have a data type we can use to work with the HTML as a richer data structure. In particular, we can use the select_first method to find the first instance of a given CSS selector. By passing the string "title", we’ll get the first <title> element in the document, if there is one. Because there may not be any matching element, select_first returns an Option<ElementRef>. Finally, we use the Option::map method, which lets us work with the item in the Option if it’s present, and do nothing if it isn’t. (We could also use a match expression here, but map is more idiomatic.) In the body of the function we supply to map, we call inner_html on the title to get its content, which is a String. When all is said and done, we have an Option<String>.
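To compare `Option::map` with the equivalent `match`, here is a small standard-library sketch. `first_title` is a hypothetical stand-in for `Html::parse(...).select_first("title")`: it does a plain substring search and is not a real HTML parser, so treat it only as a way to get an `Option` to work with.

```rust
/// Hypothetical stand-in for selecting the first `<title>` element:
/// returns the text between the first `<title>` and the next `</title>`.
fn first_title(html: &str) -> Option<&str> {
    let start = html.find("<title>")? + "<title>".len();
    let end = start + html[start..].find("</title>")?;
    Some(&html[start..end])
}

/// Uses `Option::map` to convert the borrowed text only when it is present.
fn title_with_map(html: &str) -> Option<String> {
    first_title(html).map(|title| title.to_string())
}

/// The same logic spelled out with `match`.
fn title_with_match(html: &str) -> Option<String> {
    match first_title(html) {
        Some(title) => Some(title.to_string()),
        None => None,
    }
}

fn main() {
    let page = "<html><head><title>Rust</title></head></html>";
    println!("{:?}", title_with_map(page));
    println!("{:?}", title_with_match("<p>no title here</p>"));
}
```

Both functions return the same values; `map` simply removes the boilerplate of rebuilding `Some` and `None` by hand.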

Notice that Rust’s await keyword goes after the expression you’re awaiting, not before it. That is, it’s a postfix keyword. This may differ from what you’re used to if you’ve used async in other languages, but in Rust it makes chains of methods much nicer to work with. As a result, we could change the body of page_title to chain the trpl::get and text function calls together with await between them, as shown in Listing 17-2.
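The effect of the postfix position can be tried without any network access. In this sketch, `FakeResponse`, `fake_get`, and the naive `run` driver are hypothetical stand-ins assumed for illustration; they only mimic the shape of `trpl::get` and `text` and do not reflect how trpl is implemented.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical response type with an async `text` method.
struct FakeResponse {
    body: String,
}

impl FakeResponse {
    async fn text(self) -> String {
        self.body
    }
}

/// Hypothetical async "request" that fabricates a body from the URL.
async fn fake_get(url: &str) -> FakeResponse {
    FakeResponse { body: format!("<title>{url}</title>") }
}

/// Two separate statements, one await each.
async fn body_two_steps(url: &str) -> String {
    let response = fake_get(url).await;
    response.text().await
}

/// The same work as one chain, made possible by postfix `.await`.
async fn body_chained(url: &str) -> String {
    fake_get(url).await.text().await
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Naive driver: polls repeatedly until the future is ready.
fn run<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    println!("{}", run(body_two_steps("example")));
    println!("{}", run(body_chained("example")));
}
```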

use trpl::Html;

fn main() {
    // TODO: we'll add this next!
}

async fn page_title(url: &str) -> Option<String> {
    let response_text = trpl::get(url).await.text().await;
    Html::parse(&response_text)
        .select_first("title")
        .map(|title| title.inner_html())
}

With that, we have successfully written our first async function! Before we add some code in main to call it, let’s talk a little more about what we’ve written and what it means.

When Rust sees a block marked with the async keyword, it compiles it into a unique, anonymous data type that implements the Future trait. When Rust sees a function marked with async, it compiles it into a non-async function whose body is an async block. An async function’s return type is the anonymous data type the compiler creates for that async block.

Thus, writing async fn is equivalent to writing a function that returns a future whose output is the declared return type. To the compiler, a function definition such as the async fn page_title in Listing 17-1 is roughly equivalent to a non-async function defined like this:

use std::future::Future;
use trpl::Html;

fn page_title(url: &str) -> impl Future<Output = Option<String>> {
    async move {
        let text = trpl::get(url).await.text().await;
        Html::parse(&text)
            .select_first("title")
            .map(|title| title.inner_html())
    }
}

Let’s walk through each part of the transformed version:

It uses the impl Trait syntax we discussed back in Chapter 10 in the “Traits as Parameters” section.
The returned value implements the Future trait with an associated type of Output. Notice that the Output type is Option<String>, which is the same as the original return type from the async fn version of page_title.
All of the code called in the body of the original function is wrapped in an async move block. Remember that blocks are expressions. This whole block is the expression returned from the function.
This async block produces a value with the type Option<String>, as just described. That value matches the Output type in the return type. This is just like other blocks you have seen.
The new function body is an async move block because of how it uses the url parameter. (We’ll talk much more about async versus async move later in the chapter.)

Now we can call page_title in main.

Executing an Async Function with a Runtime

To start, we’ll get the title for a single page, shown in Listing 17-3. Unfortunately, this code doesn’t compile yet.

use trpl::Html;

async fn main() {
    let args: Vec<String> = std::env::args().collect();
    let url = &args[1];
    match page_title(url).await {
        Some(title) => println!("The title for {url} was {title}"),
        None => println!("{url} had no title"),
    }
}

async fn page_title(url: &str) -> Option<String> {
    let response_text = trpl::get(url).await.text().await;
    Html::parse(&response_text)
        .select_first("title")
        .map(|title| title.inner_html())
}

We follow the same pattern we used to get command line arguments in the “Accepting Command Line Arguments” section in Chapter 12. Then we pass the URL argument to page_title and await the result. Because the value produced by the future is an Option<String>, we use a match expression to print different messages to account for whether the page had a <title>.
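Indexing with `&args[1]` panics when no argument is supplied. As a hedged alternative, the sketch below reads the URL argument without panicking. `url_from_args` and the fallback program name are assumptions made for this example; they are not part of the chapter's listings.

```rust
/// Returns the first real argument, or a usage message if it is missing.
/// Generic over any iterator of `String`s so it can be tried without a real command line.
fn url_from_args<I: IntoIterator<Item = String>>(args: I) -> Result<String, String> {
    let mut args = args.into_iter();
    // The first item is conventionally the program name.
    let program = args.next().unwrap_or_else(|| "hello-async".to_string());
    args.next().ok_or_else(|| format!("usage: {program} <url>"))
}

fn main() {
    match url_from_args(std::env::args()) {
        Ok(url) => println!("would fetch {url}"),
        Err(usage) => eprintln!("{usage}"),
    }
}
```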

The only place we can use the await keyword is in async functions or blocks, and Rust won’t let us mark the special main function as async.

error[E0752]: `main` function is not allowed to be `async`
 --> src/main.rs:6:1
  |
6 | async fn main() {
  | ^^^^^^^^^^^^^^^ `main` function is not allowed to be `async`

The reason main can’t be marked async is that async code needs a runtime: a Rust crate that manages the details of executing asynchronous code. A program’s main function can initialize a runtime, but it’s not a runtime itself. (We’ll see more about why this is the case in a bit.) Every Rust program that executes async code has at least one place where it sets up a runtime that executes the futures.

Most languages that support async bundle a runtime, but Rust does not. Instead, there are many different async runtimes available, each of which makes different tradeoffs suitable to the use case it targets. For example, a high-throughput web server with many CPU cores and a large amount of RAM has very different needs than a microcontroller with a single core, a small amount of RAM, and no heap allocation ability. The crates that provide those runtimes also often supply async versions of common functionality such as file or network I/O.

Here, and throughout the rest of this chapter, we’ll use the block_on function from the trpl crate, which takes a future as an argument and blocks the current thread until this future runs to completion. Behind the scenes, calling block_on sets up a runtime using the tokio crate that’s used to run the future passed in (the trpl crate’s block_on behavior is similar to other runtime crates’ block_on functions). Once the future completes, block_on returns whatever value the future produced.
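To give a feel for what a function like block_on has to do, here is a minimal sketch built only on the standard library. It is not how trpl or tokio implement it; `ThreadWaker`, `Signal`, `Shared`, and `signal_after` are hypothetical names invented for this illustration. The idea: poll the future, and if it is not ready, put the current thread to sleep until the future's waker wakes it.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: poll, sleep until woken, poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// State shared between the `Signal` future and its helper thread.
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

/// Hypothetical future that becomes ready once a helper thread says so.
struct Signal(Arc<Mutex<Shared>>);

impl Future for Signal {
    type Output = &'static str;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut shared = self.0.lock().unwrap();
        if shared.done {
            Poll::Ready("signaled")
        } else {
            // Remember how to wake the executor once the signal arrives.
            shared.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Starts a helper thread that completes the returned future after `delay`.
fn signal_after(delay: Duration) -> Signal {
    let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
    let for_thread = Arc::clone(&shared);
    thread::spawn(move || {
        thread::sleep(delay);
        let mut state = for_thread.lock().unwrap();
        state.done = true;
        if let Some(waker) = state.waker.take() {
            waker.wake();
        }
    });
    Signal(shared)
}

fn main() {
    println!("{}", block_on(async { 1 + 2 }));
    println!("{}", block_on(signal_after(Duration::from_millis(20))));
}
```

Parking the thread instead of polling in a tight loop is the design choice that matters here: the thread uses no CPU while the future is waiting.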

We could pass the future returned by page_title directly to block_on and, once it completed, we could match on the resulting Option<String> as we tried to do in Listing 17-3. However, for most of the examples in the chapter (and most async code in the real world), we’ll be doing more than just one async function call, so instead we’ll pass an async block and explicitly await the result of the page_title call, as in Listing 17-4.

use trpl::Html;

fn main() {
    let args: Vec<String> = std::env::args().collect();

    trpl::block_on(async {
        let url = &args[1];
        match page_title(url).await {
            Some(title) => println!("The title for {url} was {title}"),
            None => println!("{url} had no title"),
        }
    })
}

async fn page_title(url: &str) -> Option<String> {
    let response_text = trpl::get(url).await.text().await;
    Html::parse(&response_text)
        .select_first("title")
        .map(|title| title.inner_html())
}

When we run this code, we get the behavior we expected initially:

$ cargo run -- "https://www.rust-lang.org"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/async_await 'https://www.rust-lang.org'`
The title for https://www.rust-lang.org was
            Rust Programming Language

Phew—we finally have some working async code! But before we add the code to race two sites against each other, let’s briefly turn our attention back to how futures work.

Each await point—that is, every place where the code uses the await keyword—represents a place where control is handed back to the runtime. To make that work, Rust needs to keep track of the state involved in the async block so that the runtime could kick off some other work and then come back when it’s ready to try advancing the first one again. This is an invisible state machine, as if you’d written an enum like this to save the current state at each await point:

enum PageTitleFuture<'a> {
    Initial { url: &'a str },
    GetAwaitPoint { url: &'a str },
    TextAwaitPoint { response: trpl::Response },
}

Writing the code to transition between each state by hand would be tedious and error-prone, however, especially when you need to add more functionality and more states to the code later. Fortunately, the Rust compiler creates and manages the state machine data structures for async code automatically. The normal borrowing and ownership rules around data structures all still apply, and happily, the compiler also handles checking those for us and provides useful error messages. We’ll work through a few of those later in the chapter.
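To show how tedious the manual version gets, the sketch below writes a tiny two-await async function and a hand-rolled state machine that does the same work. This is only an approximation of the idea; the compiler's actual generated type differs. `Step`, `add_steps`, `AddSteps`, and `run` are hypothetical names invented for the illustration.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: pending on the first poll, then yields `value`.
struct Step { value: u32, polled: bool }

impl Step {
    fn new(value: u32) -> Self { Step { value, polled: false } }
}

impl Future for Step {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.polled {
            return Poll::Ready(self.value);
        }
        self.polled = true;
        cx.waker().wake_by_ref(); // ask to be polled again
        Poll::Pending
    }
}

/// The async version, with two await points.
async fn add_steps(a: u32, b: u32) -> u32 {
    let x = Step::new(a).await;
    let y = Step::new(b).await;
    x + y
}

/// A hand-written equivalent: one variant per place the work can rest.
enum AddSteps {
    Initial { a: u32, b: u32 },
    FirstAwait { step: Step, b: u32 },
    SecondAwait { x: u32, step: Step },
    Done,
}

impl Future for AddSteps {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        loop {
            match &mut *self {
                AddSteps::Initial { a, b } => {
                    let next = AddSteps::FirstAwait { step: Step::new(*a), b: *b };
                    *self = next;
                }
                AddSteps::FirstAwait { step, b } => match Pin::new(step).poll(cx) {
                    Poll::Ready(x) => {
                        let next = AddSteps::SecondAwait { x, step: Step::new(*b) };
                        *self = next;
                    }
                    Poll::Pending => return Poll::Pending,
                },
                AddSteps::SecondAwait { x, step } => match Pin::new(step).poll(cx) {
                    Poll::Ready(y) => {
                        let sum = *x + y;
                        *self = AddSteps::Done;
                        return Poll::Ready(sum);
                    }
                    Poll::Pending => return Poll::Pending,
                },
                AddSteps::Done => panic!("polled after completion"),
            }
        }
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls in a loop until ready; fine for a demo, wasteful in real code.
fn run<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    println!("async fn: {}", run(add_steps(2, 3)));
    println!("by hand:  {}", run(AddSteps::Initial { a: 2, b: 3 }));
}
```

The five-line async function and the enum with its `poll` implementation produce the same result, which is a fair measure of how much bookkeeping the compiler takes off your hands.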

Ultimately, something has to execute this state machine, and that something is a runtime. (This is why you may come across mentions of executors when looking into runtimes: an executor is the part of a runtime responsible for executing the async code.)

Now you can see why the compiler stopped us from making main itself an async function back in Listing 17-3. If main were an async function, something else would need to manage the state machine for whatever future main returned, but main is the starting point for the program! Instead, we called the trpl::block_on function in main to set up a runtime and run the future returned by the async block until it’s done.

Note: Some runtimes provide macros so you can write an async main function. Those macros rewrite async fn main() { ... } to be a normal fn main, which does the same thing we did by hand in Listing 17-4: call a function that runs a future to completion the way trpl::block_on does.

Now let’s put these pieces together and see how we can write concurrent code.

Racing Two URLs Against Each Other Concurrently

In Listing 17-5, we call page_title with two different URLs passed in from the command line and race them by selecting whichever future finishes first.

use trpl::{Either, Html};

fn main() {
    let args: Vec<String> = std::env::args().collect();

    trpl::block_on(async {
        let title_fut_1 = page_title(&args[1]);
        let title_fut_2 = page_title(&args[2]);

        let (url, maybe_title) =
            match trpl::select(title_fut_1, title_fut_2).await {
                Either::Left(left) => left,
                Either::Right(right) => right,
            };

        println!("{url} returned first");
        match maybe_title {
            Some(title) => println!("Its page title was: '{title}'"),
            None => println!("It had no title."),
        }
    })
}

async fn page_title(url: &str) -> (&str, Option<String>) {
    let response_text = trpl::get(url).await.text().await;
    let title = Html::parse(&response_text)
        .select_first("title")
        .map(|title| title.inner_html());
    (url, title)
}

We begin by calling page_title for each of the user-supplied URLs. We save the resulting futures as title_fut_1 and title_fut_2. Remember, these don’t do anything yet, because futures are lazy and we haven’t yet awaited them. Then we pass the futures to trpl::select, which returns a value to indicate which of the futures passed to it finishes first.

Note: Under the hood, trpl::select is built on a more general select function defined in the futures crate. The futures crate’s select function can do a lot of things that the trpl::select function can’t, but it also has some additional complexity that we can skip over for now.

Either future can legitimately “win,” so it doesn’t make sense to return a Result. Instead, trpl::select returns a type we haven’t seen before, trpl::Either. The Either type is somewhat similar to a Result in that it has two cases. Unlike Result, though, there is no notion of success or failure baked into Either. Instead, it uses Left and Right to indicate “one or the other”:

enum Either<A, B> {
    Left(A),
    Right(B),
}

If the first argument wins, the select function returns Left with that future’s output; if the second argument wins, it returns Right with the second future’s output. This matches the order in which the arguments appear when calling the function: the first argument is to the left of the second argument.
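The Left and Right behavior can be sketched with the standard library alone. The `select` function below is a simplified illustration written for this example, not the implementation used by trpl or the futures crate; `run` and `NoopWaker` are likewise hypothetical helpers. It polls the first future, then the second, and resolves with whichever is ready first.

```rust
use std::future::{pending, poll_fn, ready, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Debug, PartialEq)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

/// Sketch of a two-way race: resolves with the output of whichever future is ready first.
async fn select<A: Future, B: Future>(a: A, b: B) -> Either<A::Output, B::Output> {
    let mut a = pin!(a);
    let mut b = pin!(b);
    poll_fn(|cx| {
        // The first argument is polled first, so it wins ties.
        if let Poll::Ready(out) = a.as_mut().poll(cx) {
            return Poll::Ready(Either::Left(out));
        }
        if let Poll::Ready(out) = b.as_mut().poll(cx) {
            return Poll::Ready(Either::Right(out));
        }
        Poll::Pending
    })
    .await
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Naive driver: polls repeatedly until the future is ready.
fn run<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    // `pending()` never completes, so the other future always wins.
    let first = run(select(ready("first"), pending::<u8>()));
    let second = run(select(pending::<u8>(), ready("second")));
    println!("{first:?} {second:?}");
}
```

In this sketch the losing future is simply dropped when `select` returns.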

We also update page_title to return the same URL passed in. That way, if the page that returns first does not have a <title> we can resolve, we can still print a meaningful message. With that information available, we wrap up by updating our println! output to indicate both which URL finished first and what, if any, the <title> is for the web page at that URL.

You have now built a small working web scraper! Pick a couple of URLs and run the command line tool. You may discover that some sites are consistently faster than others, while in other cases the faster site varies from run to run. More importantly, you’ve learned the basics of working with futures, so now we can dig deeper into what we can do with async.
