Processing a Series of Items with Iterators

The iterator pattern allows you to perform some task on a sequence of items in turn. An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished. When you use iterators, you don’t have to reimplement that logic yourself.

In Rust, iterators are lazy, meaning they have no effect until you call methods that consume the iterator to use it up. For example, the code in Listing 13-10 creates an iterator over the items in the vector v1 by calling the iter method defined on Vec<T>. This code by itself doesn’t do anything useful.

fn main() {
    let v1 = vec![1, 2, 3];

    let v1_iter = v1.iter();
}

The iterator is stored in the v1_iter variable. Once we’ve created an iterator, we can use it in a variety of ways. In Listing 3-5, we iterated over an array using a for loop to execute some code on each of its items. Under the hood, this implicitly created and then consumed an iterator, but we glossed over how exactly that works until now.

In the example in Listing 13-11, we separate the creation of the iterator from the use of the iterator in the for loop. When the for loop runs over the iterator in v1_iter, each element in the iterator is used in one iteration of the loop, which prints out each value.

fn main() {
    let v1 = vec![1, 2, 3];

    let v1_iter = v1.iter();

    for val in v1_iter {
        println!("Got: {val}");
    }
}

In languages that don’t have iterators provided by their standard libraries, you would likely write this same functionality by starting a variable at index 0, using that variable to index into the vector to get a value, and incrementing the variable value in a loop until it reached the total number of items in the vector.
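For comparison, the index-based approach described above can be sketched in Rust itself. The helper names collect_by_index and collect_by_iterator are hypothetical and exist only for this illustration; both functions visit the same items, but only the first one manages the index by hand.

```rust
// Hypothetical helper: walks the slice by managing an index manually.
fn collect_by_index(v: &[i32]) -> Vec<i32> {
    let mut seen = Vec::new();
    let mut i = 0; // start at index 0
    while i < v.len() {
        seen.push(v[i]); // index into the slice to get a value
        i += 1; // increment until we reach the number of items
    }
    seen
}

// Hypothetical helper: the iterator handles position and termination.
fn collect_by_iterator(v: &[i32]) -> Vec<i32> {
    let mut seen = Vec::new();
    for val in v.iter() {
        seen.push(*val);
    }
    seen
}

fn main() {
    let v1 = vec![1, 2, 3];
    assert_eq!(collect_by_index(&v1), collect_by_iterator(&v1));
}
```

The index version has three places where an off-by-one mistake could slip in (the start value, the loop condition, and the increment); the iterator version has none.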

Iterators handle all of that logic for you, cutting down on repetitive code you could potentially mess up. Iterators give you more flexibility to use the same logic with many different kinds of sequences, not just data structures you can index into, like vectors. Let’s examine how iterators do that.
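As a rough illustration of that flexibility, the sketch below runs one loop over a vector, a BTreeSet (which cannot be indexed), and a range. The function name total is hypothetical, and the generic bound on IntoIterator goes slightly beyond what this section has covered so far.

```rust
use std::collections::BTreeSet;

// Hypothetical helper: the same loop works for anything that can be
// turned into an iterator of i32 values.
fn total<I: IntoIterator<Item = i32>>(items: I) -> i32 {
    let mut sum = 0;
    for item in items {
        sum += item;
    }
    sum
}

fn main() {
    assert_eq!(total(vec![1, 2, 3]), 6); // an indexable vector
    assert_eq!(total(BTreeSet::from([1, 2, 3])), 6); // a set, no indexing
    assert_eq!(total(1..4), 6); // a range of numbers
}
```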

The Iterator Trait and the next Method

All iterators implement a trait named Iterator that is defined in the standard library. The definition of the trait looks like this:

pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    // methods with default implementations elided
}

Notice that this definition uses some new syntax: type Item and Self::Item, which are defining an associated type with this trait. We’ll talk about associated types in depth in Chapter 20. For now, all you need to know is that this code says implementing the Iterator trait requires that you also define an Item type, and this Item type is used in the return type of the next method. In other words, the Item type will be the type returned from the iterator.

The Iterator trait only requires implementors to define one method: the next method, which returns one item of the iterator at a time, wrapped in Some, and, when iteration is over, returns None.
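To make the requirement concrete, here is a minimal sketch of a custom iterator. The Countdown type and the countdown_from helper are hypothetical and not part of the standard library; the only things the trait asks for are the Item type and the next method.

```rust
// Hypothetical type for illustration: counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The associated type: what this iterator yields.
    type Item = u32;

    // The one required method: Some(item) while items remain, then None.
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

// Hypothetical helper: gathers everything the countdown yields.
fn countdown_from(start: u32) -> Vec<u32> {
    let countdown = Countdown { remaining: start };
    let mut values = Vec::new();
    for n in countdown {
        values.push(n);
    }
    values
}

fn main() {
    assert_eq!(countdown_from(3), vec![3, 2, 1]);
}
```

Because Countdown implements Iterator, it works in a for loop without any further code.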

We can call the next method on iterators directly; Listing 13-12 demonstrates what values are returned from repeated calls to next on the iterator created from the vector.

#[cfg(test)]
mod tests {
    #[test]
    fn iterator_demonstration() {
        let v1 = vec![1, 2, 3];

        let mut v1_iter = v1.iter();

        assert_eq!(v1_iter.next(), Some(&1));
        assert_eq!(v1_iter.next(), Some(&2));
        assert_eq!(v1_iter.next(), Some(&3));
        assert_eq!(v1_iter.next(), None);
    }
}

Note that we needed to make v1_iter mutable: Calling the next method on an iterator changes internal state that the iterator uses to keep track of where it is in the sequence. In other words, this code consumes, or uses up, the iterator. Each call to next eats up an item from the iterator. We didn’t need to make v1_iter mutable when we used a for loop, because the loop took ownership of v1_iter and made it mutable behind the scenes.
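The difference can be sketched by writing the loop both ways. The function names below are hypothetical, and the second function is only an approximation of what a for loop does; the actual expansion also goes through IntoIterator::into_iter.

```rust
// Hypothetical helper: the for loop takes ownership of the iterator,
// so the binding itself does not need to be `mut`.
fn collect_with_for(v: &[i32]) -> Vec<i32> {
    let v_iter = v.iter();
    let mut seen = Vec::new();
    for val in v_iter {
        seen.push(*val);
    }
    seen
}

// Hypothetical helper: roughly the same loop written by hand.
// Calling `next` ourselves requires a mutable binding.
fn collect_with_next(v: &[i32]) -> Vec<i32> {
    let mut v_iter = v.iter();
    let mut seen = Vec::new();
    loop {
        match v_iter.next() {
            Some(val) => seen.push(*val),
            None => break, // the sequence has finished
        }
    }
    seen
}

fn main() {
    let v1 = vec![1, 2, 3];
    assert_eq!(collect_with_for(&v1), collect_with_next(&v1));
}
```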

Also note that the values we get from the calls to next are immutable references to the values in the vector. The iter method produces an iterator over immutable references. If we want to create an iterator that takes ownership of v1 and returns owned values, we can call into_iter instead of iter. Similarly, if we want to iterate over mutable references, we can call iter_mut instead of iter.
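A short sketch of the three methods side by side may help. The function name iteration_flavors is hypothetical; the comments note what each method yields.

```rust
// Hypothetical helper: exercises iter, iter_mut, and into_iter in turn.
fn iteration_flavors() -> (Vec<i32>, Vec<i32>) {
    let mut v1 = vec![1, 2, 3];

    // iter: yields immutable references (&i32); v1 remains usable.
    let doubled: Vec<i32> = v1.iter().map(|x| x * 2).collect();

    // iter_mut: yields mutable references (&mut i32), so items can be
    // changed in place.
    for x in v1.iter_mut() {
        *x += 10;
    }

    // into_iter: takes ownership of v1 and yields owned i32 values.
    let owned: Vec<i32> = v1.into_iter().collect();
    // v1 has been moved and cannot be used from here on.

    (doubled, owned)
}

fn main() {
    let (doubled, owned) = iteration_flavors();
    assert_eq!(doubled, vec![2, 4, 6]);
    assert_eq!(owned, vec![11, 12, 13]);
}
```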

Methods That Consume the Iterator

The Iterator trait has a number of different methods with default implementations provided by the standard library; you can find out about these methods by looking in the standard library API documentation for the Iterator trait. Some of these methods call the next method in their definition, which is why you’re required to implement the next method when implementing the Iterator trait.

Methods that call next are called consuming adapters because calling them uses up the iterator. One example is the sum method, which takes ownership of the iterator and iterates through the items by repeatedly calling next, thus consuming the iterator. As it iterates through, it adds each item to a running total and returns the total when iteration is complete. Listing 13-13 has a test illustrating a use of the sum method.

#[cfg(test)]
mod tests {
    #[test]
    fn iterator_sum() {
        let v1 = vec![1, 2, 3];

        let v1_iter = v1.iter();

        let total: i32 = v1_iter.sum();

        assert_eq!(total, 6);
    }
}

We aren’t allowed to use v1_iter after the call to sum, because sum takes ownership of the iterator we call it on.
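The running-total behavior described for sum can be approximated by hand. The function manual_sum below is a hypothetical stand-in limited to iterators over &i32; it is not how the standard library implements sum, but it shows the same two properties: it takes the iterator by value, and it calls next until None comes back.

```rust
// Hypothetical stand-in for `sum`, restricted to iterators over &i32.
// Taking `iter` by value means the caller gives up ownership of it.
fn manual_sum<'a>(mut iter: impl Iterator<Item = &'a i32>) -> i32 {
    let mut total = 0;
    while let Some(item) = iter.next() {
        total += *item; // add each item to the running total
    }
    total
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v1_iter = v1.iter();

    let total = manual_sum(v1_iter);
    assert_eq!(total, 6);

    // Using v1_iter here would not compile: it was moved into manual_sum.
}
```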

Methods That Produce Other Iterators

Iterator adapters are methods defined on the Iterator trait that don’t consume the iterator. Instead, they produce different iterators by changing some aspect of the original iterator.

Listing 13-14 shows an example of calling the iterator adapter method map, which takes a closure to call on each item as the items are iterated through. The map method returns a new iterator that produces the modified items. The closure here adds 1 to its argument, so the new iterator yields each item from the vector incremented by 1.

fn main() {
    let v1: Vec<i32> = vec![1, 2, 3];

    v1.iter().map(|x| x + 1);
}

However, this code produces a warning:

$ cargo run
   Compiling iterators v0.1.0 (file:///projects/iterators)
warning: unused `Map` that must be used
 --> src/main.rs:4:5
  |
4 |     v1.iter().map(|x| x + 1);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: iterators are lazy and do nothing unless consumed
  = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
  |
4 |     let _ = v1.iter().map(|x| x + 1);
  |     +++++++

warning: `iterators` (bin "iterators") generated 1 warning
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.47s
     Running `target/debug/iterators`

The code in Listing 13-14 doesn’t do anything; the closure we’ve specified never gets called. The warning reminds us why: Iterator adapters are lazy, and we need to consume the iterator here.
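One way to observe this laziness is to count how often the closure runs. The sketch below is an illustration only: the function name is hypothetical, and it uses std::cell::Cell (not covered in this section) so the closure can update a counter that is also read outside it.

```rust
use std::cell::Cell;

// Hypothetical helper: returns how many times the closure had run
// before and after the iterator was consumed.
fn closure_calls_before_and_after() -> (u32, u32) {
    let v1: Vec<i32> = vec![1, 2, 3];
    let calls = Cell::new(0);

    // Building the adapter does not call the closure.
    let mapped = v1.iter().map(|x| {
        calls.set(calls.get() + 1);
        x + 1
    });
    let before = calls.get();

    // Consuming the iterator calls the closure once per item.
    let _v2: Vec<i32> = mapped.collect();
    let after = calls.get();

    (before, after)
}

fn main() {
    assert_eq!(closure_calls_before_and_after(), (0, 3));
}
```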

To fix this warning and consume the iterator, we’ll use the collect method, which we used with env::args in Listing 12-1. This method consumes the iterator and collects the resultant values into a collection data type.

In Listing 13-15, we collect the results of iterating over the iterator that’s returned from the call to map into a vector. This vector will end up containing each item from the original vector, incremented by 1.

fn main() {
    let v1: Vec<i32> = vec![1, 2, 3];

    let v2: Vec<_> = v1.iter().map(|x| x + 1).collect();

    assert_eq!(v2, vec![2, 3, 4]);
}
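The collect method is not limited to vectors: the type requested by the caller decides which collection gets built. As a small sketch, assuming the hypothetical helper names below, the same map call can feed either a Vec or a BTreeSet.

```rust
use std::collections::BTreeSet;

// Hypothetical helper: collects the incremented items into a vector.
fn incremented_as_vec(v: &[i32]) -> Vec<i32> {
    v.iter().map(|x| x + 1).collect()
}

// Hypothetical helper: the same adapter chain, collected into a set.
// The return type is what tells collect which collection to build.
fn incremented_as_set(v: &[i32]) -> BTreeSet<i32> {
    v.iter().map(|x| x + 1).collect()
}

fn main() {
    let v1 = vec![1, 1, 2];
    assert_eq!(incremented_as_vec(&v1), vec![2, 2, 3]);
    // The set keeps each value only once.
    assert_eq!(incremented_as_set(&v1), BTreeSet::from([2, 3]));
}
```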

Because map takes a closure, we can specify any operation we want to perform on each item. This is a great example of how closures let you customize some behavior while reusing the iteration behavior that the Iterator trait provides.

You can chain multiple calls to iterator adapters to perform complex actions in a readable way. But because all iterators are lazy, you have to call one of the consuming adapter methods to get results from calls to iterator adapters.
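As a brief sketch of such a chain, the hypothetical function below combines filter and map and then ends with the consuming adapter sum; without that final call, nothing would be computed.

```rust
// Hypothetical helper: keeps the even numbers, squares them, and adds
// the squares together.
fn sum_of_even_squares(v: &[i32]) -> i32 {
    v.iter()
        .filter(|&&x| x % 2 == 0) // adapter: keep even items only
        .map(|x| x * x) // adapter: square each remaining item
        .sum() // consuming adapter: drives the whole chain
}

fn main() {
    let v1 = vec![1, 2, 3, 4];
    // 2 * 2 + 4 * 4 = 20
    assert_eq!(sum_of_even_squares(&v1), 20);
}
```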

Closures That Capture Their Environment

Many iterator adapters take closures as arguments, and the closures we pass to them commonly capture their environment.

For this example, we’ll use the filter method that takes a closure. The closure gets an item from the iterator and returns a bool. If the closure returns true, the value will be included in the iteration produced by filter. If the closure returns false, the value won’t be included.

In Listing 13-16, we use filter with a closure that captures the shoe_size variable from its environment to iterate over a collection of Shoe struct instances. It will return only shoes that are the specified size.

#[derive(PartialEq, Debug)]
struct Shoe {
    size: u32,
    style: String,
}

fn shoes_in_size(shoes: Vec<Shoe>, shoe_size: u32) -> Vec<Shoe> {
    shoes.into_iter().filter(|s| s.size == shoe_size).collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn filters_by_size() {
        let shoes = vec![
            Shoe {
                size: 10,
                style: String::from("sneaker"),
            },
            Shoe {
                size: 13,
                style: String::from("sandal"),
            },
            Shoe {
                size: 10,
                style: String::from("boot"),
            },
        ];

        let in_my_size = shoes_in_size(shoes, 10);

        assert_eq!(
            in_my_size,
            vec![
                Shoe {
                    size: 10,
                    style: String::from("sneaker")
                },
                Shoe {
                    size: 10,
                    style: String::from("boot")
                },
            ]
        );
    }
}

The shoes_in_size function takes ownership of a vector of shoes and a shoe size as parameters. It returns a vector containing only shoes of the specified size.

In the body of shoes_in_size, we call into_iter to create an iterator that takes ownership of the vector. Then, we call filter to adapt that iterator into a new iterator that yields only the elements for which the closure returns true.

The closure captures the shoe_size parameter from the environment and compares the value with each shoe’s size, keeping only shoes of the size specified. Finally, calling collect gathers the values returned by the adapted iterator into a vector that’s returned by the function.

The test shows that when we call shoes_in_size, we get back only shoes that have the same size as the value we specified.
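If the caller needs to keep the original vector, a borrowing variant is one possible alternative. The function shoes_in_size_borrowed below is hypothetical and not part of the listing; it uses iter instead of into_iter, so it returns references and leaves the input untouched. The closure still captures shoe_size from its environment in the same way.

```rust
#[derive(PartialEq, Debug)]
struct Shoe {
    size: u32,
    style: String,
}

// Hypothetical variant: borrows the shoes instead of taking ownership,
// so the result holds references into the caller's collection.
fn shoes_in_size_borrowed(shoes: &[Shoe], shoe_size: u32) -> Vec<&Shoe> {
    shoes.iter().filter(|s| s.size == shoe_size).collect()
}

fn main() {
    let shoes = vec![
        Shoe { size: 10, style: String::from("sneaker") },
        Shoe { size: 13, style: String::from("sandal") },
        Shoe { size: 10, style: String::from("boot") },
    ];

    let in_my_size = shoes_in_size_borrowed(&shoes, 10);
    assert_eq!(in_my_size.len(), 2);
    assert_eq!(in_my_size[0].style, "sneaker");
    assert_eq!(in_my_size[1].style, "boot");

    // The original vector is still available after filtering.
    assert_eq!(shoes.len(), 3);
}
```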